Precise, personal Wake Word for everyone

Hi guys,

I need your support. I installed Mycroft on my Raspberry Pi 3B and want to change the wake word from “Hey Mycroft” to “Jerry”. I’m using mycroft-precise for this.

Over the last few weeks, I trained the model with 50 recordings of my voice, plus ten for testing. The wake word works.

My goal now is for anyone to be able to trigger the wake word “Jerry” without recording and training on their own voice. Do you think this is possible?

If so, I assume the model first needs to be trained on recordings from many different people and voices. Do you know how many recordings that might take? Has anyone already tried this?

Thank you

You’d want a lot of recordings from as wide a range of users as possible; 50 is a good start. It really depends on how many different users you have and how well their vocal characteristics match the voices your model was trained on.

One approach that the Rhasspy crew are trying out is to use the output of a broad range of TTS voices as training data. I doubt it will be as good as collecting real samples from a diverse group, but it requires much less work to set up.

Jarbas and I used this approach as well. It’s not a bad idea if you can get the pronunciations correct.

Add some background noise and pitch shifting and you can double your samples.
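To make the noise and pitch-shifting idea concrete, here is a minimal sketch in plain Python. It is not what Precise or anyone in this thread actually uses: the function names are made up for illustration, the clip is assumed to be a list of float samples between -1.0 and 1.0, and the "pitch shift" is plain resampling, which also changes the clip length. Reading and writing the WAV files is left out.

```python
import math
import random


def add_noise(samples, snr_db, rng):
    """Mix Gaussian noise into float samples (-1.0..1.0) at roughly snr_db."""
    signal_power = sum(s * s for s in samples) / len(samples)
    noise_std = math.sqrt(signal_power / (10 ** (snr_db / 10)))
    noisy = (s + rng.gauss(0.0, noise_std) for s in samples)
    # Clamp so the result is still a valid normalized signal.
    return [max(-1.0, min(1.0, s)) for s in noisy]


def naive_pitch_shift(samples, factor):
    """Resample by linear interpolation (needs at least two samples).

    factor > 1 raises the pitch and shortens the clip; factor < 1 does the
    opposite. This is plain resampling, not a duration-preserving shift.
    """
    n_out = max(2, round(len(samples) / factor))
    step = (len(samples) - 1) / (n_out - 1)
    out = []
    for i in range(n_out):
        pos = i * step
        lo = min(int(pos), len(samples) - 2)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[lo + 1] * frac)
    return out


def augment(samples, rng):
    """Return three altered copies of one recording."""
    return [
        add_noise(samples, snr_db=20, rng=rng),
        naive_pitch_shift(samples, 1.05),
        naive_pitch_shift(samples, 0.95),
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in for a real recording: 0.1 s of a 220 Hz tone at 16 kHz.
    clip = [0.5 * math.sin(2 * math.pi * 220 * n / 16000) for n in range(1600)]
    for variant in augment(clip, rng):
        print(len(variant))
```

Small shift factors and moderate noise levels are probably the safer choice here, since heavily distorted copies may no longer sound like the wake word at all.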

Hi guys, I’ve done my training. For now I’m only using my own voice for testing, and I followed this guide:

https://github.com/MycroftAI/mycroft-precise/issues/94

After this I tried to put it all together in Picroft, but it didn’t seem to work. I found a solution at this link (I used Precise 0.3.0 to build the model):

https://community.openconversational.ai/t/precise-wakeword-not-working/8140

I need your support again with some trouble I’m having with Precise and a custom skill.

1- Some time ago, I created my dataset on the Raspberry Pi 3B with the custom wake word “hey jerry”, and then I modified the configuration file with the following command:

$ mycroft-config edit user

{
  "max_allowed_core_version": 20.2,
  "lang": "it-it",
  "skills": {
    "auto_update": false,
    "blacklisted_skills": [
      "mycroft-wiki",
      "mycroft-alarm",
      "mycroft-audio-record",
      "mycroft-date-time",
      "mycroft-npr-news",
      "mycroft-singing",
      "mycroft-timer",
      "mycroft-hello-world",
      "mycroft-weather",
      "mycroft-personal"
    ],
    "priority_skills": [
      "jerry-data-ora",
      "jerry-chi-sei"
    ]
  },
  "listener": {
    "wake_word": "ehy_jerry"
  },
  "hotwords": {
    "ehy_jerry": {
      "module": "precise",
      "local_model_file": "/home/pi/mycroft-precise/ehy-jerry.pb",
      "phonemes": "JH EH R IY .",
      "threshold": 1e-18
    }
  }
}
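A quick sanity check on a config like this can rule out the simple mistakes (wake word name not matching any hotwords entry, model file missing). The sketch below is only an illustration: `check_wake_word_config` is a made-up helper, and the path `~/.mycroft/mycroft.conf` is an assumption about where the user config lives on your install.

```python
import json
import os


def check_wake_word_config(config):
    """Return a list of problems found in a Mycroft-style config dict."""
    problems = []
    wake_word = config.get("listener", {}).get("wake_word")
    if wake_word is None:
        problems.append("listener.wake_word is not set")
        return problems

    entry = config.get("hotwords", {}).get(wake_word)
    if entry is None:
        problems.append(f"no hotwords entry named {wake_word!r}")
        return problems

    # Only Precise entries are expected to point at a local model file.
    if entry.get("module") == "precise":
        model = entry.get("local_model_file")
        if not model:
            problems.append("precise entry has no local_model_file")
        elif not os.path.isfile(model):
            problems.append(f"model file not found: {model}")
    return problems


if __name__ == "__main__":
    # Assumed location of the user config; adjust for your setup.
    path = os.path.expanduser("~/.mycroft/mycroft.conf")
    if not os.path.isfile(path):
        print(f"no config at {path}")
    else:
        try:
            with open(path, encoding="utf-8") as handle:
                config = json.load(handle)
        except ValueError as err:
            # The file may contain comments or other non-strict JSON.
            print(f"could not parse {path}: {err}")
        else:
            for problem in check_wake_word_config(config) or ["no problems found"]:
                print(problem)
```

An empty result only means the names and paths line up. It says nothing about whether the model itself is any good.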

2- After this, I restarted Mycroft. The file voice.log shows a lot of “ALSA lib” errors at conf.c 4568 and conf.c 5047, along with the following:

2020-07-09 15:58:17.134 | INFO | 9112 | mycroft.client.speech.listener:create_wake_word_recognizer:323 | Creating wake word engine
2020-07-09 15:58:17.146 | INFO | 9112 | mycroft.client.speech.listener:create_wake_word_recognizer:346 | Using hotword entry for jerry
2020-07-09 15:58:17.152 | INFO | 9112 | mycroft.client.speech.hotword_factory:load_module:403 | Loading “jerry” wake word via precise
2020-07-09 15:58:19.202 | INFO | 9112 | mycroft.client.speech.listener:create_wakeup_recognizer:360 | creating stand up word engine
2020-07-09 15:58:19.207 | INFO | 9112 | mycroft.client.speech.hotword_factory:load_module:403 | Loading “wake up” wake word via pocketsphinx
2020-07-09 15:58:19.375 | INFO | 9112 | mycroft.messagebus.client.client:on_open:114 | Connected

3- Today, when I say “Hey Jerry!”, nothing happens and nothing appears in the log file. If I restart again (mycroft-stop all and mycroft-start all), Mycroft catches my wake word, and I see this in the log file:

2020-07-09 16:03:06.027 | INFO | 9672 | main:handle_wakeword:67 | Wakeword Detected: jerry
Playing WAVE ‘/home/pi/mycroft-core/mycroft/res/snd/start_listening.wav’ : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
2020-07-09 16:03:06.474 | INFO | 9672 | main:handle_record_begin:37 | Begin Recording…
2020-07-09 16:03:09.359 | INFO | 9672 | main:handle_record_end:45 | End Recording…
2020-07-09 16:03:10.716 | INFO | 9672 | main:handle_utterance:72 | Utterance: [‘chi ti ha creato’]

4- So, when I restart Mycroft, the custom wake word works and so does the skill, but at the end of the session the wake word seems to “turn off”, and I have to repeat the restart described above to continue. So sometimes Mycroft and Precise catch the wake word, and other times they don’t.

5- Using mycroft-cli-client, I notice that the volume level rises when I say “Hey Jerry”.

I also tried “alsamixer” to raise and lower my microphone volume (Logitech USB with integrated camera).

6- Here is another example from the log file, from a run where the wake word is caught for a while and then nothing more:

2020-07-09 16:21:52.576 | INFO | 10501 | mycroft.session:get:74 | New Session Start: 5663591a-4c5c-4053-9c89-501354c33f43
2020-07-09 16:21:52.586 | INFO | 10501 | main:handle_wakeword:67 | Wakeword Detected: jerry
Playing WAVE ‘/home/pi/mycroft-core/mycroft/res/snd/start_listening.wav’ : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
2020-07-09 16:21:53.613 | INFO | 10501 | main:handle_record_begin:37 | Begin Recording…
2020-07-09 16:21:55.538 | INFO | 10501 | main:handle_record_end:45 | End Recording…
2020-07-09 16:21:56.896 | INFO | 10501 | main:handle_utterance:72 | Utterance: [“puoi darmi un po’ d’acqua”]
2020-07-09 16:22:29.611 | INFO | 10501 | main:handle_wakeword:67 | Wakeword Detected: jerry
Playing WAVE ‘/home/pi/mycroft-core/mycroft/res/snd/start_listening.wav’ : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
2020-07-09 16:22:30.062 | INFO | 10501 | main:handle_record_begin:37 | Begin Recording…
2020-07-09 16:22:33.281 | INFO | 10501 | main:handle_record_end:45 | End Recording…
2020-07-09 16:22:34.719 | INFO | 10501 | main:handle_utterance:72 | Utterance: [“puoi darmi un po’ d’acqua”]
2020-07-09 16:23:06.927 | INFO | 10501 | main:handle_wakeword:67 | Wakeword Detected: jerry
Playing WAVE ‘/home/pi/mycroft-core/mycroft/res/snd/start_listening.wav’ : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
2020-07-09 16:23:07.383 | INFO | 10501 | main:handle_record_begin:37 | Begin Recording…
2020-07-09 16:23:12.263 | INFO | 10501 | main:handle_record_end:45 | End Recording…
2020-07-09 16:23:13.714 | INFO | 10501 | main:handle_utterance:72 | Utterance: [“com’è più comodo muoversi a roma”]
2020-07-09 16:23:41.141 | INFO | 10501 | main:handle_wakeword:67 | Wakeword Detected: jerry
Playing WAVE ‘/home/pi/mycroft-core/mycroft/res/snd/start_listening.wav’ : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
2020-07-09 16:23:41.602 | INFO | 10501 | main:handle_record_begin:37 | Begin Recording…
2020-07-09 16:23:43.726 | INFO | 10501 | main:handle_record_end:45 | End Recording…
2020-07-09 16:23:45.084 | INFO | 10501 | main:handle_utterance:72 | Utterance: [‘di cosa sei fatto’]
2020-07-09 16:24:10.693 | INFO | 10501 | main:handle_wakeword:67 | Wakeword Detected: jerry
Playing WAVE ‘/home/pi/mycroft-core/mycroft/res/snd/start_listening.wav’ : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
2020-07-09 16:24:11.163 | INFO | 10501 | main:handle_record_begin:37 | Begin Recording…
2020-07-09 16:24:12.707 | INFO | 10501 | main:handle_record_end:45 | End Recording…
2020-07-09 16:24:13.922 | INFO | 10501 | main:handle_utterance:72 | Utterance: [‘chi sei’]
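When the wake word works for a while and then goes silent, it can help to pull the detection times out of the log and see exactly when the last one happened. Here is a rough sketch that relies only on the line format shown above (timestamp before the first `|`, and the text "Wakeword Detected"). The function names are made up, and the path `/var/log/mycroft/voice.log` is an assumption about your install.

```python
import os
from datetime import datetime


def wakeword_times(lines):
    """Yield the timestamp of every 'Wakeword Detected' line."""
    for line in lines:
        if "Wakeword Detected" not in line:
            continue
        # The timestamp is the first pipe-separated field of the line.
        stamp = line.split("|", 1)[0].strip()
        yield datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")


def detection_gaps(lines):
    """Return the seconds elapsed between consecutive detections."""
    times = list(wakeword_times(lines))
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    # Assumed log location; adjust for your setup.
    path = "/var/log/mycroft/voice.log"
    if not os.path.isfile(path):
        print(f"no log file at {path}")
    else:
        with open(path, encoding="utf-8", errors="replace") as handle:
            lines = handle.readlines()
        times = list(wakeword_times(lines))
        print("detections:", len(times))
        if times:
            print("last detection:", times[-1])
        print("gaps in seconds:", detection_gaps(lines))
```

Comparing the last detection time with the time you actually stopped getting responses shows whether the listener went quiet right after a particular utterance or only after a long idle period.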

7- I did another test too. I recorded a sample of my voice that Precise catches and played it back from my computer. Same result: sometimes it recognizes the wake word, other times not.

I can’t understand what is happening or where the error is.

Could you please help me with this? Is there a way to debug mycroft-precise, so I can understand why it doesn’t catch the wake word?

Thank you!

Did you train the model under Precise 0.2.0 or 0.3.0? Have you tried using precise-listen to evaluate your model on its own? How large was your dataset?

If you’re using custom models, you need to turn on wake word saving so you can track false activations and retrain with those to improve the model.
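As a rough illustration of turning on wake word saving: to my recollection, the default mycroft-core config has `record_wake_words` and `save_path` keys under `listener`, but please treat those key names as an assumption and check them against your own mycroft.conf before relying on them. The helper `enable_wake_word_saving` is made up for this sketch. It adds the setting without touching anything else in the config.

```python
import copy
import json


def enable_wake_word_saving(config, save_path=None):
    """Return a copy of config with wake word recording switched on.

    The key names 'record_wake_words' and 'save_path' are assumed from the
    default mycroft-core configuration; verify them for your version.
    """
    updated = copy.deepcopy(config)
    listener = updated.setdefault("listener", {})
    listener["record_wake_words"] = True
    if save_path is not None:
        listener["save_path"] = save_path
    return updated


if __name__ == "__main__":
    current = {"listener": {"wake_word": "ehy_jerry"}}
    print(json.dumps(enable_wake_word_saving(current), indent=2))
```

The printed result can then be merged into the user config by hand with mycroft-config edit user.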

I used Precise 0.3 and put Precise 0.3 under Mycroft. In precise-listen I get perfect recognition, every time.

How large was the dataset you used?