I am having just the damnedest time getting my custom wake word model to activate properly. Join me as we go on a journey of frustration.
Modeling machine: MacBook Pro running Ubuntu 18.04 through VirtualBox
Mycroft: Picroft RP4 with a working USB mic
I’ve been working very hard on creating a custom wake word for a while now. It’s “computer”. The model in the user repository doesn’t work very well, so I thought I’d make my own. I’ve recorded ~300 samples of the wake word and added roughly 51,000 not-wake-words to the model. I’ve been retraining from scratch every time I add new data, as per Mr. Eltocino’s suggestions. I’ve also spoken extensively with him about some of these issues, and he’s been incredibly helpful. But this precise issue is just baffling me.
The model, once created, was excellent. It activates perfectly on my modeling machine using precise-listen, doesn’t false activate, etc. Testing showed that it was as close to perfect as I could get, at least without live deployment, where it would almost certainly need a few tweaks. However, once the .pb was generated and uploaded to the Picroft, it wouldn’t activate, ever. Not once. It just sat there, like a bump on a log, stubbornly refusing to listen to the wake word. If manually activated, it would clearly hear and nearly perfectly transcribe my speech, so I know the USB mic is working, and I know it’s capturing correctly.
This issue has been referenced before, without resolution. After a LOT of digging and talking to Mr. Eltocino, along with this support post, I believed I had figured out that I needed to be using precise 0.3.0 on the Pi. So, I downloaded it, erased the existing folder, uploaded the new precise-engine, and rebooted. (I also get the error about .params, but I’ve been told multiple times that it’s an error that doesn’t mean anything, so I’m ignoring it for now.)
So, this definitely solves the issue of not activating. Because now, it activates constantly. In a silent (and by silent I mean, maybe there’s a fridge running in the other room, but that’s it) room, it will activate every five seconds, perhaps less. Again, when it’s recording, it understands my speech perfectly.
I have messed with just about every setting possible. I’ve tried adjusting sensitivity and trigger_level, to no avail. I’ve turned on wake word saving, let it run in a room with the TV going and my toddler babbling, uploaded the 300-some wake-word activations it saved to /tmp, and then re-modeled, in case background noise was the culprit. It’s definitely not the background noise. At this point, I’m like 90% sure this isn’t a model issue. There’s a disconnect between how precise evaluates the data and how Mycroft is calling precise in the background. Something is broken somewhere.
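As a rough mental model of what those two settings do: sensitivity sets how confident a single chunk of audio has to be to count as a hit, and trigger_level sets how many hits have to pile up before an activation fires. The sketch below is a simplified illustration of that idea only. It is not the actual precise_runner code, and the names `TriggerSketch` and `count_triggers` are made up for this example.

```python
class TriggerSketch:
    """Simplified stand-in for a wake word trigger (illustration only)."""

    def __init__(self, sensitivity=0.5, trigger_level=3):
        self.sensitivity = sensitivity
        self.trigger_level = trigger_level
        self.activation = 0  # running count of recent "hit" chunks

    def update(self, prob):
        """Feed one per-chunk probability; return True when an activation fires."""
        # Higher sensitivity lowers the probability a chunk needs to count as a hit.
        if prob > 1.0 - self.sensitivity:
            self.activation += 1
            if self.activation > self.trigger_level:
                self.activation = 0  # reset after firing
                return True
        elif self.activation > 0:
            # Quiet chunks let the counter decay back toward zero.
            self.activation -= 1
        return False


def count_triggers(probs, sensitivity=0.5, trigger_level=3):
    """Count how many activations a sequence of chunk probabilities produces."""
    detector = TriggerSketch(sensitivity, trigger_level)
    return sum(detector.update(p) for p in probs)


if __name__ == "__main__":
    confident_run = [0.9] * 4
    print(count_triggers(confident_run))                    # default settings
    print(count_triggers(confident_run, trigger_level=10))  # stricter trigger
```

Under this model, raising trigger_level or lowering sensitivity should make activations rarer. If changing either value makes no difference at all on the device, one possibility is that the settings are not reaching the engine that is actually running.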
There is one (possibly?) relevant error that I’m seeing in the startup logs in the CLI:
16:41:45.968 | INFO | 2762 | mycroft.client.speech.listener:create_wake_word_recognizer:328 | Creating wake word engine
16:41:45.978 | INFO | 2762 | mycroft.client.speech.listener:create_wake_word_recognizer:351 | Using hotword entry for computer
~~~~0 | WARNING | 2762 | mycroft.client.speech.listener:create_wake_word_recognizer:353 | Phonemes are missing falling back to listeners configuration
~~~~3 | WARNING | 2762 | mycroft.client.speech.listener:create_wake_word_recognizer:357 | Threshold is missing falling back to listeners configuration
16:41:45.988 | INFO | 2762 | mycroft.client.speech.hotword_factory:load_module:467 | Loading "computer" wake word via precise
16:41:47.119 | ERROR | 2762 | mycroft.client.speech.hotword_factory:initialize:492 | Could not create hotword. Falling back to default.
Traceback (most recent call last):
  File "/home/pi/mycroft-core/mycroft/client/speech/hotword_factory.py", line 480, in initialize
    instance = clazz(hotword, config, lang=lang)
  File "/home/pi/mycroft-core/mycroft/client/speech/hotword_factory.py", line 228, in __init__
    self.runner.start()
  File "/home/pi/mycroft-core/.venv/lib/python3.7/site-packages/precise_runner/runner.py", line 159, in start
    self.engine.start()
  File "/home/pi/mycroft-core/.venv/lib/python3.7/site-packages/precise_runner/runner.py", line 53, in start
    self.proc = Popen(self.exe_args, stdin=PIPE, stdout=PIPE)
  File "/usr/lib/python3.7/subprocess.py", line 775, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.7/subprocess.py", line 1522, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
PermissionError: [Errno 13] Permission denied: '/home/pi/.mycroft/precise/precise-engine/precise-engine'
16:41:47.125 | INFO | 2762 | mycroft.client.speech.hotword_factory:load_module:467 | Loading "computer" wake word via pocketsphinx
16:41:47.230 | INFO | 2762 | mycroft.client.speech.listener:create_wakeup_recognizer:365 | creating stand up word engine
16:41:47.232 | INFO | 2762 | mycroft.client.speech.hotword_factory:load_module:467 | Loading "wake up" wake word via pocketsphinx
16:41:47.312 | INFO | 2728 | mycroft.skills.msm_wrapper:create_msm:111 | Releasing MSM instantiation lock.
16:41:47.314 | INFO | 2728 | mycroft.skills.skill_updater:_log_next_download_time:265 | Next scheduled skill update: 2021-05-11 17:40:17.920100
16:41:47.317 | INFO | 2728 | mycroft.skills.skill_loader:load:185 | ATTEMPTING TO LOAD SKILL: mycroft-pairing.mycroftai
16:41:47.339 | INFO | 2762 | __main__:on_ready:179 | Speech client is ready.
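The PermissionError at the end of that traceback points at the precise-engine binary itself, and the next log line shows "computer" being loaded via pocketsphinx instead. One possible reading is that the file lost its executable bit when the folder was replaced. Here is a minimal sketch for checking that, assuming the path from the log. The helper names `has_exec_bit`, `add_exec_bit`, and `describe` are made up for illustration.

```python
import os
import stat
from pathlib import Path

# Path copied from the PermissionError in the log; adjust if yours differs.
ENGINE = Path("/home/pi/.mycroft/precise/precise-engine/precise-engine")


def has_exec_bit(path):
    """Return True if the file exists and its owner execute bit is set."""
    return os.path.isfile(path) and bool(os.stat(path).st_mode & stat.S_IXUSR)


def add_exec_bit(path):
    """Add execute permission for user, group, and others (like chmod a+x)."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


def describe(path):
    """Summarize the state of the file in one line."""
    if not os.path.isfile(path):
        return f"{path}: not found"
    if has_exec_bit(path):
        return f"{path}: executable bit is set"
    return f"{path}: NOT executable"


# Only reports; call add_exec_bit(ENGINE) by hand if the bit turns out to be missing.
print(describe(ENGINE))
```

If the bit is missing, running `add_exec_bit(ENGINE)` and restarting the speech client would show whether the "Could not create hotword" error goes away. This only checks file permissions; it cannot tell whether the binary matches the model version.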
I’m happy to upload any other logs, anything anybody wants. My entire model sample set, whatever. This problem is confounding me, and I think I have enough high-quality data in my model sample set that it’s not that.
Any help would be sincerely and truly appreciated. I just don’t know where to look in the logs or debug information to figure out what’s going on. Once I have this figured out, I will be extremely happy to write a detailed step-by-step instructional thing for anyone else who wants to train their own model.
[edit 1] Spelling, missing words, minor clarifications.
[edit 2] To rule out a microphone problem, I went back and listened to the audio it saved from the wake-word activations. They’re just tiny snippets of room noise. No static, pops, clicks, or any other odd audio artifacts.
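To go beyond listening by ear, the saved clips could also be checked numerically. This is a minimal sketch, assuming the saved activations are 16-bit PCM WAV files (an assumption, not something confirmed above). `wav_levels` is a hypothetical helper name.

```python
import math
import sys
import wave
from array import array


def wav_levels(path):
    """Return (peak, rms) of a 16-bit PCM WAV file, each scaled to 0.0-1.0."""
    with wave.open(str(path), "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        samples = array("h")
        samples.frombytes(wf.readframes(wf.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    if not samples:
        return 0.0, 0.0
    peak = max(abs(s) for s in samples) / 32768
    rms = math.sqrt(sum(s * s for s in samples) / len(samples)) / 32768
    return peak, rms


if __name__ == "__main__":
    # Usage: python wav_levels.py clip1.wav clip2.wav ...
    for name in sys.argv[1:]:
        peak, rms = wav_levels(name)
        print(f"{name}: peak={peak:.3f} rms={rms:.3f}")
```

Clips that really are quiet room noise should show low peak and RMS values. A clip with a peak near 1.0 would suggest clipping or a pop that is easy to miss when listening.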
[edit 3] Logs. I swear, I’m trying to provide anything helpful.
[edit 4] Here are my model files in case anybody wants to give it a whirl.