Changing wakeword response

Can the wakeword response be changed from a tone to a phrase?

MarkMLl


The “wakeword response tone” is an audio file located at mycroft-core/mycroft/res/snd/start_listening.wav

You can replace that file with your own recording of a phrase…

Thanks for that. So I could use a .wav of “Yes, Milady?” or whatever, but I can’t trigger text-to-speech output which would fit seamlessly with the rest of the dialogue.

Yes, the file should be in WAV format: 16-bit, stereo, 44100 Hz sampling rate.
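If you want to check a recording against that format before dropping it in, something like the following might help. It is only a sketch using Python's standard wave module; the function names and the file name in the usage note are made up for illustration and are not part of Mycroft.

```python
import wave

# Format stated in this thread: 16-bit, stereo, 44100 Hz.
EXPECTED = {"channels": 2, "sample_width_bytes": 2, "sample_rate_hz": 44100}


def describe_wav(path):
    """Return the basic format parameters of a WAV file."""
    with wave.open(str(path), "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),  # 2 bytes = 16 bit
            "sample_rate_hz": wav.getframerate(),
        }


def format_mismatches(path, expected=EXPECTED):
    """Return {field: (found, wanted)} for every field that differs."""
    found = describe_wav(path)
    return {k: (found[k], v) for k, v in expected.items() if found[k] != v}
```

Calling format_mismatches("yes_milady.wav") (a hypothetical file name) gives an empty dict when the file matches, otherwise it lists what is off.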

You could trigger TTS output with the utterance “say yes milady”, or from the command line with mycroft-speak, and record the output to a .wav file.

Being able to redirect that output to a file would be quite a useful debugging facility…

Mycroft temporarily saves TTS output (at least when using Google TTS) in the folder /tmp/mycroft/cache/tts/.

Type say yes milady in the mycroft-cli-client and look for the latest file in that cache folder; the .mp3 file with the most recent timestamp should be the one you are looking for (the last entry of /var/log/mycroft/audio.log should give you the file name as well). Convert that file to .wav and you are set…


Thanks, I’ll check and report back. I’d expect SoX to be able to handle the conversion.

That works, and with the default (Mimic) TTS it’s already a .wav file. The one thing I’d caution is that the cached files don’t get their timestamps touched, so if it’s the second time in a session you’ve asked it to say something it might not be the most recent file.
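Because of that timestamp caveat, it may be safer to list the few most recently modified files in the cache rather than trusting only the newest one. A rough sketch in Python follows; the folder is the one quoted earlier in the thread and may differ on other installs, and the function name is my own invention.

```python
from pathlib import Path

# Cache folder mentioned in this thread; an assumption for other setups.
TTS_CACHE = Path("/tmp/mycroft/cache/tts")


def recent_audio_files(folder=TTS_CACHE, limit=5, suffixes=(".wav", ".mp3")):
    """Return up to `limit` audio files in `folder`, newest first."""
    folder = Path(folder)
    if not folder.is_dir():
        return []
    files = [p for p in folder.iterdir()
             if p.is_file() and p.suffix.lower() in suffixes]
    # Sort by modification time; a re-used cached phrase keeps its old time.
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:limit]


if __name__ == "__main__":
    for path in recent_audio_files():
        print(path)
```

If the phrase was already spoken earlier in the session, its file will sit further down the list, so it is worth playing a couple of candidates to be sure.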

One additional thing I’d throw in is that a slightly longer file (similar to the “yes milady” I used as an example) might possibly be messing something up related to echo suppression: asking e.g. “where are you” intermittently applies lots of “reverb” to the answer… I’ve not yet checked whether this is fixed by a restart.

That’s interesting to hear about the ‘reverb’, let us know if it continues…

It happened several times but a restart apparently fixed it. I’ll advise if it happens again.


OK, this is great fun. Wakeword acknowledgement is set to “Yes your excellency” (nod to one of the Niven/Pournelle books).

This is the result of asking it “what’s 22 divided by 7” a couple of times:

yes you like scentsy what’s 22 / 7
22/7 (irreducible)

yes your excellency what’s 22 / 7
22/7 (irreducible)

Several times I’ve got “nc” at the start of a command, suggesting that the timing of the start of command capture is a little erratic.

As in it’s sometimes catching the end of the “excellency” as the start of your utterance?

The precise timing of when to start capturing an utterance is a bit of an art form. You don’t want a big gap between the acknowledgement and when capture starts, but also even if the acknowledgement is just a noise like the default, it can throw out the automated mic leveling given that sound is being output from quite close to the microphone.

I suppose that this folds into what somebody was saying about echo cancellation since commercial devices are often playing music at the same time as they’re expected to listen for a wakeword.