That’s what I would have meant if I’d been paying attention
So it seems the Docker image may be the only option unless you want to run Mimic 3 remotely on a different machine. If you do that, you can just use Mycroft’s MaryTTS plugin to connect to it.
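For reference, here is a minimal mycroft.conf sketch for pointing the MaryTTS plugin at a remote Mimic 3 server. The host IP and voice name are placeholders, 59125 is Mimic 3's default web-server port, and the exact key names can vary between Mycroft versions, so treat this as a starting point rather than a definitive config:

```json
{
  "tts": {
    "module": "marytts",
    "marytts": {
      "url": "http://192.168.1.50:59125",
      "voice": "en_UK/apope_low"
    }
  }
}
```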
I have the same error. I set a speaker in the user config as suggested; the only difference is that Mycroft falls back to Mimic 1 instead of staying silent.
Hi, I have set up Mimic 3 using the guide, and it's all configured. The speaker works and the microphone works, but it won't speak to me. I set it up using the plugin on a Raspberry Pi 4 running Raspbian with Mycroft installed. I don't know where to get logs from. Thanks in advance.
Any errors should be in /var/log/mycroft/audio.log
A common problem is using an outdated pip before installing the plugin. There is a dependency-of-a-dependency conflict between the dateparser and regex packages. After upgrading pip, you may need to run pip install --upgrade regex as well.
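As a sketch, the upgrade sequence looks like this (run it inside the same virtualenv Mycroft uses; the plugin package name depends on your install guide, so it is omitted here):

```shell
# Upgrade pip first; older versions mishandle the
# dateparser -> regex dependency chain
pip install --upgrade pip

# Force regex itself up to date before (re)installing the plugin
pip install --upgrade regex
```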
edit:
Listening to the recordings from the TTS experiments on the TTS-Portuguese Corpus page: in my opinion, the Portuguese TTS audio is perfectly fine and understandable in experiments #1 and #3. Experiment #2 was also understandable but had added noise distortion.
For example, this WAV file, the result of the longest phrase from Experiment #3, is very good and highly understandable: "Hoje é fundamental encontrar a razão da existência humana" ("Today it is essential to find the reason for human existence").
So @synesthesiam, I wonder why your results were not understandable when you used this data set? The results above are pretty much the same quality as what you get from the Google Translate page when generating Portuguese TTS audio. It's good!
Experiment 1 uses the DCTTS model, trained on the TTS-Portuguese Corpus, with the RTISI-LA vocoder (good).
Experiment 2 uses the Tacotron 1 model, trained on the TTS-Portuguese Corpus (bad).
Experiment 3 uses the Mozilla TTS model, trained on the TTS-Portuguese Corpus (very good).
I wonder if I need to just train a model directly on characters rather than trying to use a phonemizer. My most recent attempt in Mimic 3 used the pt-br voice from espeak-ng.
Looking at Edresson’s model config, his audio settings are a bit different from mine. For example, his sample rate is 20000 Hz instead of 22050 Hz, and he uses “preemphasis”, which filters the audio before training. So maybe the problem is my naive use of the data without enough preprocessing?
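For anyone curious, preemphasis is usually just a first-order high-pass filter applied to the waveform before feature extraction. A minimal sketch (the 0.97 coefficient is a common default, not necessarily what Edresson used):

```python
import numpy as np

def preemphasis(signal: np.ndarray, coeff: float = 0.97) -> np.ndarray:
    """Apply a first-order pre-emphasis filter:
    y[0] = x[0], y[n] = x[n] - coeff * x[n-1] for n >= 1.

    Boosts high frequencies, which can make spectrogram
    features easier for a TTS model to learn.
    """
    return np.append(signal[0], signal[1:] - coeff * signal[:-1])

# Example: a short ramp with coeff=0.5
y = preemphasis(np.array([1.0, 2.0, 3.0]), coeff=0.5)
print(y)  # [1.  1.5 2. ]
```

The output keeps the same length as the input, so it can be dropped in front of any mel-spectrogram pipeline.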
Hi @rostom132, no bother
I do have a plan, but no definite release date yet. The training software as it stands is the result of a year and a half of experimentation, including a lot of dead ends. I’m cleaning it up now and removing a lot of the unused code. My hope is to make it work closely with Mimic Studio.
In general a volume value between 0 and 100 makes sense, but in case you want your TTS voice “yelling” at you, a higher value could make sense too.