Since I am using your generated German corpus CSV file (https://github.com/gras64/corpus-file-gen/blob/master/prompts/german_corpus.csv) for mimic-recording-studio, is it okay with you if I upload the CSV to my own GitHub account along with many more files (including WAV files of my original voice recordings), as long as I reference your GitHub page?
I use the German corpus from @gras64 (2k phrases from Mycroft skills, 20k phrases from Mozilla Common Voice, 10k phrases from Wikipedia) with mimic-recording-studio (MRS) running as a Docker container, which works like a charm. As @gras64 and @gez-mycroft mentioned, you need time, a lot of time, for reading if you want usable results.
Out of curiosity I joined the weekly TTS meetings some weeks ago, and Thomas, one of the leading Mozilla TTS contributors, reported on his progress. To me it sounded like actual work is being done on a German model, although no target date was given…
I’m currently in contact with Mozilla by email about contributing my voice samples from mimic-recording-studio. But they have questions about the CC0 licensing of the corpus used.
Since it’s pre-packaged from Mycroft skills, Common Voice, and Wikipedia phrases, it may be only partially usable for the Mozilla voice project.
I extracted the “Mozilla Common Voice” phrases (CC0) from @gras64’s corpus and published them to my GitHub account, together with an export of the SQLite DB and some sample WAV files.
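In case it helps anyone doing the same, here is a minimal sketch of how such an extraction could look. It is not necessarily how I did it. It assumes the phrase text sits in the first column of the corpus file, that the file is tab-delimited, and that you have a separate list of known CC0 sentences to compare against; the function name `filter_cc0_phrases` and the sample sentences are made up for illustration.

```python
import csv
import io


def filter_cc0_phrases(corpus_rows, cc0_sentences):
    """Keep only corpus rows whose phrase (first column) is in the CC0 sentence set."""
    allowed = {sentence.strip() for sentence in cc0_sentences}
    return [row for row in corpus_rows if row and row[0].strip() in allowed]


# Tiny in-memory stand-in for the corpus file (assumption: one phrase per line,
# tab-delimited, phrase text in the first column).
corpus_file = io.StringIO(
    "Hallo Welt\n"
    "Wie spät ist es\n"
    "Berlin ist die Hauptstadt\n"
)
rows = list(csv.reader(corpus_file, delimiter="\t"))

# Hypothetical list of sentences known to come from a CC0 source.
cc0_sentences = ["Hallo Welt", "Berlin ist die Hauptstadt"]

kept = filter_cc0_phrases(rows, cc0_sentences)
for row in kept:
    print(row[0])
```

For a real run you would replace the `io.StringIO` stand-in with `open(..., encoding="utf-8", newline="")` on the actual files and adjust the delimiter and column index to whatever the corpus really uses.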
I wanted to upload all of my recordings (currently around 1.3 GB zipped), but that’s too much for GitHub.