German corpus available for use with mimic recording studio?

Hello.
I’ve read that training my own voice for Mimic TTS takes, besides good microphone equipment, a huge amount of time. But before researching any further: are there any compatible corpus files available for German?

The English version is available on GitHub: https://github.com/MycroftAI/mimic-recording-studio/blob/master/backend/prompts/english_corpus.csv

Thorsten

Maybe you can extract your own corpus from the PAVOQUE Corpus of Expressive Speech or the NEGRA corpus?

I have created my own corpus file: https://github.com/gras64/corpus-file-gen/blob/master/prompts/german_corpus.csv I also have a generator that builds a file from individual sentences.

Both links look great. Technically it should be no problem to merge both collections into one compatible file, provided the licenses allow it.
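
For anyone who wants to try the merge, here is a minimal sketch. It assumes each corpus is plain text with one entry per line and the phrase in the first tab-separated column; that layout is an assumption, so check the actual files and adjust `delimiter` if needed. `merge_corpora` is a hypothetical helper, not part of Mimic Recording Studio, and the licensing question still has to be settled separately.

```python
def merge_corpora(*corpora, delimiter="\t"):
    """Merge corpus texts, keeping the first occurrence of each phrase.

    Assumption: one entry per line, phrase in the first column.
    """
    seen = set()
    merged = []
    for text in corpora:
        for line in text.splitlines():
            phrase = line.split(delimiter)[0].strip()
            key = phrase.casefold()  # compare case-insensitively
            if phrase and key not in seen:
                seen.add(key)
                merged.append(phrase)
    return merged


if __name__ == "__main__":
    first = "Guten Morgen\t12\nWie spät ist es?\t16\n"
    second = "guten morgen\t12\nSchalte das Licht an\t20\n"
    for phrase in merge_corpora(first, second):
        print(phrase)
```

Duplicates are dropped case-insensitively so the same sentence is not recorded twice.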

Have you already created your own tts voice? If so, are you satisfied with the results?

Creating a voice is a lot of effort: more than 60 hours of work with many breaks, which in my case takes months. I had to stop because of a cold after getting about 1/3 of the way through.

I have never used the software, so: must the process be completely finished before you can hear your TTS voice, or can you already hear how it might sound after finishing 1/3?

Well, if you have too little data, some sentences may not be spoken correctly. 20 hours of speech recordings are needed for mimic2. With less data you may get no usable result and will have trained the model for nothing. The quality is also very important. Read the docs on this carefully. You can also find me in the language-de and mimic chat channels.
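
To see how far along you are toward that 20 hours, a rough sketch like the following can add up the length of the recordings. It assumes the clips are uncompressed PCM `.wav` files somewhere below one folder; `recorded_hours` and the folder you pass in are hypothetical, so point it at wherever your installation stores its audio.

```python
import wave
from pathlib import Path


def recorded_hours(audio_dir):
    """Sum the duration of all .wav files below audio_dir, in hours."""
    total_seconds = 0.0
    for wav_path in Path(audio_dir).rglob("*.wav"):
        with wave.open(str(wav_path), "rb") as wav:
            # duration = number of frames / frames per second
            total_seconds += wav.getnframes() / wav.getframerate()
    return total_seconds / 3600
```

Calling `recorded_hours("path/to/your/recordings")` returns a float that you can compare against the 20 hours mentioned above.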

Hello @gras64.

I started with your German corpus file.
I’m at phrase 195 out of 30049. When do I get the option of testing my TTS voice? Does a button appear after a minimum of 20 hours of voice has been recorded?

You can download the data and train a model on it at any time. However, I advise waiting.

I’m just at phrase 312 of 30049, so there’s still some work to do.
Since I haven’t worked with Mimic before: is this the right link to follow once I have reached a good amount of speech samples?

Basically yes, but I made some adjustments in my repo. I have not touched it for a long time.

Would you still use your existing corpus now, or would you generate a new one? Yours seems to be based primarily on phrases used by a (home) assistant.
Maybe we can generate more common, generic phrases from open content?

We’ve received unofficial advice that extracting phrases randomly from Wikipedia is acceptable. Just in case that is helpful.

I started with phrases from Mycroft skills, up to about phrase 2000. Then comes Mozilla voice data, and the rest from 10000 onward comes from Wikipedia. Preparing this data with a Wikipedia downloader and the Mycroft utils filters cost me months. Rest assured, creating a Mimic voice takes a very, very long time.
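
To give an idea of what such filtering can look like, here is a minimal sketch. It is not the actual filter used for the corpus above. `pick_phrases`, the word limits, and the allowed character set are all assumptions chosen for illustration: sentences with digits or unusual symbols are dropped because they are ambiguous to read aloud.

```python
import random
import re

# Assumption: keep only letters (including German umlauts and ß),
# spaces, and basic punctuation.
ALLOWED = re.compile(r"[A-Za-zÄÖÜäöüß ,.!?'-]+")


def pick_phrases(sentences, count, min_words=3, max_words=15, seed=0):
    """Filter candidate sentences and return a random sample of them."""
    candidates = []
    seen = set()
    for sentence in sentences:
        sentence = " ".join(sentence.split())  # normalize whitespace
        n_words = len(sentence.split())
        if not (min_words <= n_words <= max_words):
            continue
        if not ALLOWED.fullmatch(sentence):
            continue
        if sentence in seen:
            continue
        seen.add(sentence)
        candidates.append(sentence)
    rng = random.Random(seed)  # fixed seed makes the sample repeatable
    return rng.sample(candidates, min(count, len(candidates)))
```

The input is any iterable of sentences, for example text you have already extracted from openly licensed articles and split into sentences.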

Wow. It seems you really spent a lot of time creating and mixing the corpus from different sources. After all your work, shouldn’t your corpus be included in the official repo as the “official” German corpus?

Thanks for your work :slight_smile:

btw: I just finished 500 phrases and I’m getting an idea of how long it will take to speak all 30,000 phrases. Wow, this really takes a huge amount of time. Hope it’s worth it :-).
My average pace is 10.4 characters per second, and so far I have recorded 19 min 26 sec.
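
As a back-of-the-envelope check, the figures above (500 phrases in 19 min 26 sec, 30049 phrases in total) can be extrapolated. This is only a sketch: `estimate_total_hours` is a hypothetical helper, and it assumes the remaining phrases are on average as long as the first 500, which may not hold for the later blocks of the corpus.

```python
def estimate_total_hours(phrases_done, seconds_recorded, phrases_total):
    """Extrapolate total audio length from the average time per phrase."""
    seconds_per_phrase = seconds_recorded / phrases_done
    return seconds_per_phrase * phrases_total / 3600


hours = estimate_total_hours(500, 19 * 60 + 26, 30049)
print(f"{hours:.1f} hours of audio")  # prints "19.5 hours of audio"
```

That is pure audio time; breaks, retakes, and clicking through the prompts come on top.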

We’ve also found that consistency in recordings really affects the quality of the final voice model. So whilst it’s tempting to record a lot really quickly, it’s better to take your time, with regular breaks, speak at a consistent pace using the same equipment and environment, etc.

Definitely not a straightforward task, but it can be pretty amazing if you stick with it.

I’ll do my very best.
Is a pace of 10.5 good, or am I rushing through the phrases?

I have now recorded 1,400 phrases in 45 min 28 sec.

This might be a stupid question, but what is the best way to back up the current work when running Mimic Recording Studio as a Docker container?

You need the folder in mimic-recording-studio/data.
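
A minimal sketch of such a backup, assuming the folder named above is reachable from the host (for example as a mounted volume). `backup_folder` and the backup location are hypothetical. Keep the backup directory outside the folder you are archiving.

```python
import shutil
import time
from pathlib import Path


def backup_folder(source_dir, backup_dir):
    """Write a timestamped .zip of source_dir into backup_dir; return its path."""
    source = Path(source_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"not a directory: {source}")
    target_dir = Path(backup_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    base_name = target_dir / f"{source.name}-{stamp}"
    # make_archive appends the ".zip" extension and returns the archive path
    archive = shutil.make_archive(str(base_name), "zip", root_dir=str(source))
    return Path(archive)
```

For example, `backup_folder("mimic-recording-studio/data", "backups")` would create something like `backups/data-<timestamp>.zip`. Running it after each recording session keeps older snapshots around.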

Thanks. Now I have a backup, just in case.
I’m at phrase 2,316, so now I’m starting with your second block (the Mozilla voice data set).

But for today, it’s enough.