Changing wakeword response

Can the wakeword response be changed from a tone to a phrase?


The “wakeword response tone” is an audio file located at mycroft-core/mycroft/res/snd/start_listening.wav

You can replace that file with your own recording of a phrase…

Thanks for that. So I could use a .wav of “Yes, Milady?” or whatever, but I can’t trigger text-to-speech output which would fit seamlessly with the rest of the dialogue.

Yes, the file should be in WAV format: 16-bit, stereo, 44100 Hz sampling rate.
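If you want to check a recording against those numbers before dropping it in, something like the following might help. It is only a sketch using Python's standard wave module; the helper names describe_wav and format_mismatches are made up for illustration and are not part of Mycroft.

```python
import wave

# Target format quoted in this thread: 16-bit, stereo, 44100 Hz.
EXPECTED = {"channels": 2, "sample_width_bytes": 2, "frame_rate": 44100}


def describe_wav(path):
    """Return channel count, sample width in bytes, and frame rate of a WAV file."""
    with wave.open(str(path), "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "frame_rate": wav.getframerate(),
        }


def format_mismatches(path):
    """Return {field: (actual, expected)} for every field that differs from EXPECTED."""
    actual = describe_wav(path)
    return {
        field: (actual[field], wanted)
        for field, wanted in EXPECTED.items()
        if actual[field] != wanted
    }
```

An empty result from format_mismatches("my_phrase.wav") would mean the file matches the format given above; anything else lists what differs.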

You could trigger TTS output with the utterance “say yes milady”, or with mycroft-speak from the command line, and record the output to a WAV file.

Being able to redirect that output to a file would be quite a useful debugging facility…

Mycroft temporarily saves TTS output (at least when using Google TTS) in the folder /tmp/mycroft/cache/tts/.

Type say yes milady in the mycroft-cli-client, then look for the newest file in that cache folder: the .mp3 file with the latest timestamp should be the one you are looking for. The last entry of /var/log/mycroft/audio.log should give you the file name as well. Convert that file to .wav and you are set…
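For picking out the newest file, a rough sketch along these lines could do it. It only sorts by modification time, so it relies on the timestamps being fresh; newest_files is a hypothetical helper name, not a Mycroft function.

```python
from pathlib import Path


def newest_files(folder, patterns=("*.mp3", "*.wav"), limit=5):
    """Return up to `limit` matching files, most recently modified first."""
    folder = Path(folder)
    if not folder.is_dir():
        return []
    candidates = [
        path
        for pattern in patterns
        for path in folder.glob(pattern)
        if path.is_file()
    ]
    # Newest modification time first.
    candidates.sort(key=lambda path: path.stat().st_mtime, reverse=True)
    return candidates[:limit]
```

Calling newest_files("/tmp/mycroft/cache/tts/") right after the say command should put the fresh file at the top of the list, assuming the cache lives in the folder mentioned above.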

Thanks, I’ll check and report back. I’d expect SoX to be able to handle the conversion.
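A possible way to script that conversion, assuming the sox command is installed and was built with MP3 support (not every package is). The function names are invented for this sketch; the -r, -c and -b options are SoX's output sample rate, channel count and bit depth, set here to the format quoted earlier in the thread.

```python
import shutil
import subprocess


def sox_convert_command(src, dst, rate=44100, channels=2, bits=16):
    """Build a SoX command line that converts src to a WAV file at dst."""
    return [
        "sox", str(src),
        "-r", str(rate),      # output sample rate in Hz
        "-c", str(channels),  # output channel count
        "-b", str(bits),      # output bits per sample
        str(dst),
    ]


def convert_with_sox(src, dst):
    """Run the conversion; raises if sox is missing or exits with an error."""
    if shutil.which("sox") is None:
        raise RuntimeError("sox was not found on PATH")
    subprocess.run(sox_convert_command(src, dst), check=True)
```

The source and destination paths are whatever you found in the cache folder and wherever you want the result; copying it over start_listening.wav is a separate step, and keeping a backup of the original seems wise.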

That works, and with the default (Mimic) TTS it’s already a .wav file. The one thing I’d caution is that the cached files don’t get their timestamps touched, so if it’s the second time in a session you’ve asked it to say something it might not be the most recent file.

One additional thing I’d throw in is that a slightly longer file (similar to the “yes milady” I used as an example) might possibly be messing something up related to echo suppression: asking e.g. “where are you” intermittently applies lots of “reverb” to the answer… I’ve not yet checked whether this is fixed by a restart.

That’s interesting to hear about the ‘reverb’, let us know if it continues…

It happened several times but a restart apparently fixed it. I’ll advise if it happens again.

OK, this is great fun. Wakeword acknowledgement is set to “Yes your excellency” (nod to one of the Niven/Pournelle books).

This is the result of asking it “what’s 22 divided by 7” a couple of times:

yes you like scentsy what’s 22 / 7

22/7 (irreducible)
yes your excellency what’s 22 / 7

22/7 (irreducible)

Several times I’ve got “nc” at the start of a command, suggesting that the timing of the start of command capture is a little erratic.

As in it’s sometimes catching the end of the “excellency” as the start of your utterance?

The precise timing of when to start capturing an utterance is a bit of an art form. You don’t want a big gap between the acknowledgement and when capture starts, but also even if the acknowledgement is just a noise like the default, it can throw out the automated mic leveling given that sound is being output from quite close to the microphone.

I suppose that this folds into what somebody was saying about echo cancellation since commercial devices are often playing music at the same time as they’re expected to listen for a wakeword.