Mimic 3 Preview

Originally published at: Mimic 3 Preview - Mycroft

Today Mycroft is launching a preview of the Mimic 3 Cloud service

Mimic 3 is Mycroft’s performance-optimized, open-source neural Text to Speech (TTS) engine. In human terms, that means it sounds far more natural than Mimic 1 and even Mimic 2. And, unlike Mimic 2, it can run completely offline on low-cost devices.

In addition to supporting Mycroft’s privacy-first principle, Mimic 3 advances our goal of bringing quality voice-based AI to every language. Already Mimic 3 can…

  • speak more than two dozen languages, with over 100 English voices available
  • use SSML (Speech Synthesis Markup Language) to switch voices, control speaking rate and volume, add pauses, and pronounce words phonetically between sentences
  • run completely offline on devices like the Raspberry Pi 4
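
To make the SSML point concrete, here is a rough sketch of what such a document might look like, held in a Python string and checked with the standard library's XML parser. The element names (`speak`, `voice`, `s`, `break`, `prosody`, `phoneme`) come from the W3C SSML standard; which of them Mimic 3 accepts, and with which attribute values, is not confirmed here. The voice names `voice_a` and `voice_b` are placeholders, not real Mimic 3 voice keys, and the required `version` and namespace attributes on `speak` are omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical SSML document. "voice_a" and "voice_b" are placeholder
# names, not actual Mimic 3 voices.
SSML = """\
<speak>
  <voice name="voice_a">
    <s>Hello from the first voice.</s>
    <break time="500ms"/>
    <s><prosody rate="slow" volume="soft">This one is slower and quieter.</prosody></s>
  </voice>
  <voice name="voice_b">
    <s>I say <phoneme alphabet="ipa" ph="təˈmɑːtoʊ">tomato</phoneme> this way.</s>
  </voice>
</speak>
"""


def summarize(ssml: str) -> dict:
    """Parse the SSML and report which voices, sentences, and pauses it uses."""
    root = ET.fromstring(ssml)  # raises ParseError if the markup is malformed
    return {
        "root": root.tag,
        "voices": [v.get("name") for v in root.iter("voice")],
        "sentences": len(list(root.iter("s"))),
        "breaks": [b.get("time") for b in root.iter("break")],
    }


summary = summarize(SSML)
print(summary)
```

This only verifies that the markup is well-formed and shows its structure; it does not call Mimic 3 or produce audio.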

Anyone who makes products that need a voice can use Mimic 3

It’s powerful enough to add a hundred voices to in-game characters at once and small enough to run on handheld and embedded devices. Or if you want the simplest integration, our cloud-service version can run even higher quality models with extremely low latency, enabling any interactive project with a quality voice, or even multiple voices.

We’re launching this preview of Mimic 3 Cloud today to get feedback on the more than 100 languages and voices we’ve trained to date. Currently, Mimic 3 voices are available in 25 languages:

  • Afrikaans
  • Bengali
  • Dutch
  • English
  • Farsi
  • Finnish
  • French
  • German
  • Greek
  • Gujarati
  • Hausa
  • Hungarian
  • Italian
  • Javanese
  • Kiswahili
  • Korean
  • Nepali
  • Polish
  • Russian
  • Setswana
  • Spanish
  • Telugu
  • Ukrainian
  • Vietnamese
  • Yoruba

If you know one of these languages and would like to test-drive Mimic 3…

…through an interactive web page, please register below and we’ll send you instructions and a warm virtual handshake.

You can also register for product updates on your preferred platforms.

https://mycroft.ai/blog/mimic-3-preview/

Hey everyone, Mike from Mycroft here. Happy to answer any questions regarding Mimic 3 :slight_smile:

Also, PM me if you’d like to try it out locally. I have a Python package and Debian (bullseye) packages available for x86_64 as well as 32-bit and 64-bit ARM (Pi 3/4).

NOW you have my interest. I’ve been watching this project from the sidelines, and the TTS was one factor that has kept me using Echos. I’m not at a point where I can migrate or install anything on my Pi, but is there a demo on YouTube or somewhere where I can just hear some examples? I’m super curious about how it sounds. Does it have a quick response?
Thanks!

Hi @Quixote, there are samples for every language/voice available here: Mimic 3 Voice Samples

You’re welcome to join the beta too and try it out for yourself. Hope your journey with Mycroft goes well :slight_smile:

Thanks, I found that page yesterday. Very impressive!
I am now prepared if anyone ever asks me to define what a rainbow is.
I currently don’t have a development machine to work with since I rely on my only RPi for my home automation tasks, but when (if) it becomes easier to get another RPi 4 for a reasonable price, I’ll definitely give it a shot.
Hopefully others will post videos showcasing the full capabilities of Mimic 3. It will be nice to see it used for its intended purposes, beyond a single phrase. I’m curious whether it will sound as natural as Alexa. Tough competition, but the fact that it’s not cloud-based is a HUGE advantage in my opinion. I’ve been going through the recordings that my various Echos have captured, and it’s concerning to realize how much they’re unintentionally recording and trying to process.

:rainbow:

Not a video, but here’s a little tutorial page I put together that has two voices narrating the text: https://mycroftai.github.io/mimic3-presentation/

There’s enough variation to get a sense for how good the voices are as well as their shortcomings. For example, the Alan Pope voice has trouble with the word “allowing” in one of the sentences. But I’d say it’s a good thing if those are the only issues with a local TTS option :slight_smile:

Thanks! That’s pretty awesome. I’d say it’s getting quite close to “Alexa quality”, which is a BIG accomplishment when you consider that Amazon is an industry titan.
The final test will be if it pronounces “kilometers” correctly (not kilahmitters). :wink:

I’m planning to do a full video about Mimic 3 on my YouTube channel when it’s released :slight_smile: . Right now, there’s already a short announcement video.
