I am very new to the speech world. I am working on an offline speech recognition project and I have some specific questions. To summarize briefly the work I do:
I work on offline speech recognition with a Raspberry Pi 3, and I use mycroft-precise for the wake word. I created a two-word wake word in Turkish: “Mega yedi”, which is “Mega seven” in English. I followed the steps in the mycroft-precise GitHub repo. I recorded about 60 .wav files for the wake word, and I used reduced versions of the wake word as non-wake-word samples. The wake word works with about 85% accuracy, so sometimes I have to repeat it a few times, and it does not work with a woman's voice (I tested with one woman, and she repeated the wake word four times). I want to solve these problems, so I will collect new voice samples from different people for the wake word and retrain mycroft-precise.
1- How much data should I collect for this? (I can collect recordings of this particular wake word from about 20 people. Is that enough?)
2- How many times should each of these 20 people say the wake word?
3- I searched for datasets, but some of the voice files do not meet the wake-word recording conditions: the word should be spoken 1.5-2 seconds after the recording begins. Do you have any suggestions for Turkish datasets that meet these conditions?
4- Would it be useful to record the words “mega” and “yedi” separately and train mycroft-precise on them?
5- Can the accuracy of the wake word be increased with samples generated by text-to-speech applications such as AWS Polly? I ask because these are not human voices after all.
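For question 3, this is roughly how I imagine screening dataset clips for the 1.5-2 second condition. It is only a sketch using the Python standard library: the function names, the 20 ms frame size, and the RMS threshold of 500 are my own assumptions, not anything from mycroft-precise, and it assumes 16-bit PCM .wav files.

```python
import array
import math
import sys
import wave


def speech_onset_seconds(wav_file, frame_ms=20, rms_threshold=500):
    """Return the start time (seconds) of the first frame whose RMS energy
    exceeds rms_threshold, or None if no frame does.

    Hypothetical helper: assumes 16-bit PCM; only the first channel is used.
    """
    with wave.open(wav_file, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM audio")
        rate = wav.getframerate()
        channels = wav.getnchannels()
        samples = array.array("h", wav.readframes(wav.getnframes()))

    # WAV data is little-endian; array("h") uses the native byte order.
    if sys.byteorder == "big":
        samples.byteswap()

    mono = samples[::channels]  # keep the first channel only
    frame_len = max(1, rate * frame_ms // 1000)
    for start in range(0, len(mono) - frame_len + 1, frame_len):
        frame = mono[start:start + frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / frame_len)
        if rms > rms_threshold:
            return start / rate
    return None


def meets_onset_condition(wav_file, earliest=1.5, latest=2.0):
    """True if the first loud frame starts between earliest and latest seconds."""
    onset = speech_onset_seconds(wav_file)
    return onset is not None and earliest <= onset <= latest
```

Calling `meets_onset_condition("clip.wav")` on each file (the file name here is just a placeholder) would let me keep only the clips where the speech starts inside that window. A simple energy threshold like this would probably be fooled by background noise, so the threshold would need tuning per dataset.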
Thanks for your time.