Voice donation - Italian

savohead · February 5, 2018, 1:07pm

Hi everybody,
I’m new to the forum and I’m going to support Mycroft on Kickstarter, but I wanted to help with the only skill I’ve got (and it’s not even a skill): my voice.
I’m a voice actor and a dubbing director, I support open source in every way possible, and now that I found you, I want to do something for this project. How can I donate my voice for speech synthesis? Where should I look for documentation?
(Sorry, but I couldn’t find an answer elsewhere.)
Thanks for making the world a nicer place to live.
s

nate-mycroft · February 6, 2018, 8:44pm

@KathyReid @derick-mycroft @steve.penrod Mimic 2 maybe??

Savohead,

That would be amazing. I am copying in some of the smarter people here to see how we can help.

KathyReid · February 7, 2018, 11:24am

@savohead - Huge thank you for your offer of assistance!

At this stage we’re receiving a lot of offers of help and assistance from people who speak different languages, which is wonderful. Unfortunately, we don’t yet have the tools to be able to easily synthesize foreign language voices, but we will be working on those tools this year.

There are many components to being able to support an open source voice stack in other languages - covered in link [0] below. Wake Words, Speech to Text, Skills, Text to Speech, user interfaces - all have to be translated into the target language.

This comes with all sorts of complexities. For example, some languages, such as French and German, differentiate between masculine and feminine nouns. In Portuguese and English there are different forms of address based on whether the subject is male or female. In some languages, like Indonesian, there are different language constructs - such as ‘passive voice’ - that have implications for Intent Parsing software such as Adapt and Padatious.

In the CJK group of languages (Chinese, Japanese, Korean), there are complexities to handling the writing systems (hanzi, kanji, hiragana, katakana, hangul) - including double-byte character input.

But all of this is technical detail - it doesn’t actually help you to help us - the wider community!

At this stage, you might be able to contribute by:

providing voice samples at the Mozilla voice project [1]
joining us at our discussion of languages and language tools at Mycroft Chat [2]
and, of course, keeping in touch as our voice journey continues

[0] https://mycroft.ai/documentation/language-support/adding-language/
[1] https://voice.mozilla.org/
[2] https://chat.mycroft.ai/community/channels/languages

savohead · February 7, 2018, 11:42am

Hi Kathy and thanks for your answer.
I already looked at the Mozilla project for voices, but unfortunately it looks English-centered, and even there no other language is being “harvested” yet.
I’ll go through your links and try to help as much as possible with my knowledge of Italian, for translations or localizations.
Thanks again, and keep me posted on progress, 'cause I want to be part of this great project!

savohead · February 7, 2018, 11:43am

Thanks Nate! Hope I’ll be of some use for the community.

HenryMiller1 · February 7, 2018, 4:21pm

I think that the best thing to do is ask Mozilla how you can get a new language started. If they answer, follow their process. If they don’t, create your own clone as best you can to build a large corpus of voice samples in your language.

Having you read a book (where the etext is freely available) is useful. Having 10,000 people each read a single sentence (each a different sentence) is helpful. What you need to do is provide researchers with a large number of sound samples together with their text. Once you have that, it is much easier for someone to do the next steps of speech to text.

savohead · February 7, 2018, 4:58pm

Hi Henry,
thanks for the guidelines, I’ll immediately try and ask Mozilla about what you suggest. For the 10,000 people reading, I’ll see what I can do.
Thanks again, and I’ll get back with news as soon as possible.

Wolfgange · February 7, 2018, 5:03pm

In particular, clean, transcribed recordings split by sentence and spoken by a single speaker are very helpful for TTS. However, a large amount of data is required to feasibly generate a TTS voice from such recordings. For some background on the format and amount of data, take a look at the LJSpeech corpus, which has been used to create a TTS voice before.
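
For anyone preparing recordings along these lines, here is a rough sketch of how a folder could be checked against the LJSpeech layout. As far as I know, LJSpeech ships a wavs/ folder of short clips plus a metadata.csv whose lines are pipe-separated (clip ID, transcription, normalized transcription), and totals roughly 24 hours from one speaker. The function names and the idea of summing durations are my own illustration, not part of any Mycroft or Mimic tooling.

```python
import csv
import wave
from pathlib import Path


def clip_seconds(wav_path):
    """Return the duration of a WAV file in seconds."""
    with wave.open(str(wav_path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def summarize_corpus(root):
    """Scan an LJSpeech-style folder (hypothetical helper).

    Expects root/metadata.csv with lines like `clip_id|text|normalized text`
    and the audio at root/wavs/<clip_id>.wav.
    Returns (clips_found, total_seconds, missing_clip_ids).
    """
    root = Path(root)
    clips, total, missing = 0, 0.0, []
    with open(root / "metadata.csv", encoding="utf-8", newline="") as handle:
        # QUOTE_NONE: transcriptions may contain quote characters verbatim.
        reader = csv.reader(handle, delimiter="|", quoting=csv.QUOTE_NONE)
        for row in reader:
            if not row:
                continue  # skip blank lines
            clip_id = row[0]
            wav_path = root / "wavs" / f"{clip_id}.wav"
            if wav_path.exists():
                clips += 1
                total += clip_seconds(wav_path)
            else:
                missing.append(clip_id)
    return clips, total, missing
```

Calling summarize_corpus on a recording folder gives a quick idea of how many hours have been collected so far and which transcribed sentences still lack audio.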