Custom hot word stopped working

I trained and successfully set up my own hot word. It worked last week, then it suddenly stopped working. Was there a recent change in Precise, or something else? I can see that my own model is loaded and that the audio “bar” is moving nicely, but no hot word is ever detected. When I switch back to “hi Mycroft” it does work. When I check my model with precise-listen it works perfectly. It just no longer works within Mycroft, the way it did more than a week ago.
Thanks in advance for any hint.

What’s in the voice.log (/var/log/mycroft/)?
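Also worth double-checking is the merged configuration that mycroft-core actually ends up with, since user settings in ~/.mycroft/mycroft.conf override the defaults. Below is a minimal sketch, assuming a standard mycroft-core install run from its virtualenv; the “hallo scotty” key is just the custom hot word from this thread, and the local_model_file key name is the one the default mycroft.conf uses for local Precise models:

# Minimal sketch: print the hot word settings mycroft-core actually loads.
# Key names ("listener", "wake_word", "hotwords") come from the default
# mycroft.conf; "hallo scotty" is assumed to be the custom hot word name.
from mycroft.configuration import Configuration

conf = Configuration.get()
print("Active wake word:", conf["listener"]["wake_word"])
print("Hot word settings:", conf["hotwords"].get("hallo scotty"))

If that entry is missing or points at the wrong model file, it would be a configuration problem rather than a model problem.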

Hi, there is nothing special in the voice log. Below is the log. Some strange noise triggered the hot word once again; that does happen sometimes. I am actually glad it happened, because the correct recognition that follows confirms the audio setup is correct. In general, though, my hot word stopped working completely last week:
I removed the initial JACK server error messages.

10:05:00.903 - mycroft.messagebus.load_config:load_message_bus_config:33 - INFO - Loading message bus configs
10:05:02.833 - mycroft.util:find_input_device:222 - INFO - Searching for input device: pulse
ALSA lib pcm_dsnoop.c:638:(snd_pcm_dsnoop_open) unable to open slave
ALSA lib pcm_dmix.c:1108:(snd_pcm_dmix_open) unable to open slave
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
ALSA lib pcm_a52.c:823:(_snd_pcm_a52_open) a52 is only for playback
ALSA lib pcm_usb_stream.c:486:(_snd_pcm_usb_stream_open) Invalid type for card
ALSA lib pcm_usb_stream.c:486:(_snd_pcm_usb_stream_open) Invalid type for card
ALSA lib pcm_dmix.c:1108:(snd_pcm_dmix_open) unable to open slave
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
10:05:02.892 - mycroft.client.speech.listener:create_wake_word_recognizer:305 - INFO - creating wake word engine
10:05:02.892 - mycroft.client.speech.hotword_factory:load_module:279 - INFO - Loading "hallo scotty" wake word via precise
############ Using local model: ~/Projects/mycroft-precise/hallo-scotty2.pb
Here I am: 2 0.5
10:05:04.510 - mycroft.client.speech.listener:create_wakeup_recognizer:329 - INFO - creating stand up word engine
10:05:04.511 - mycroft.client.speech.hotword_factory:load_module:279 - INFO - Loading "wake up" wake word via pocketsphinx

Recognizer created

10:05:04.883 - mycroft.stt:__init__:35 - WARNING - ### Recognizer:
10:05:04.884 - mycroft.messagebus.client.client:on_open:67 - INFO - Connected
10:05:04.885 - mycroft.stt:__init__:36 - WARNING - {'uri': 'https://stt.eml.org/webSocket/rest/batch/de/cbcf9424-09ed-4a0b-bd0a-a3d201629bb8/?encoding=wav&srate=16000&result-type=text'}

Recognizer created

10:05:04.885 - mycroft.stt:__init__:338 - WARNING - YES!
10:05:34.287 - mycroft.session:get:74 - INFO - New Session Start: e9b8408e-f733-4d46-9182-899c9feb4d62
10:05:34.295 - __main__:handle_record_begin:36 - INFO - Begin Recording...
10:05:37.659 - __main__:handle_record_end:41 - INFO - End Recording...
10:05:37.663 - mycroft.client.speech.mic:listen:563 - INFO - Recording utterance
10:05:37.668 - __main__:handle_wakeword:57 - INFO - Wakeword Detected: hallo scotty
10:05:38.715 - mycroft.stt:execute:344 - WARNING - <Response [200]>
10:05:38.716 - __main__:handle_utterance:62 - INFO - Utterance: ['wie spät ist es']

Looks like it’s working?

Yes, 1 out of a few hundred. When I use precise-listen with the same model and settings, then 1 out of a hundred is not recognized. So exactly the opposite case :frowning:

If the model is being loaded successfully, and it’s only hearing you 1% of the time, then the model seems like the weak point here.

How much data did you train the model with? Wake words? Not-wake-words? Steps?

And why does the model work perfectly when I test it with precise-listen? I was also thinking that the model is the weak point. But in that case, some difference in audio processing between Mycroft and precise-listen would have to be involved. I do not see any other explanation so far …

So what’s different between trying it with precise-listen and having Mycroft use it?
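One concrete way to narrow that down would be to run the same model through precise_runner (the package mycroft-core wraps) with an explicit sensitivity and trigger level, and compare it against plain precise-listen. Here is a minimal sketch, assuming the model path from your log and assuming the “Here I am: 2 0.5” line is the trigger level and sensitivity mycroft-core is using:

# Minimal sketch: run the same .pb model the way mycroft-core does, via
# precise_runner, with explicit parameters. The values 2 / 0.5 are
# assumptions taken from the "Here I am: 2 0.5" line in the voice.log;
# 'precise-engine' must point at the precise-engine binary on your system.
from os.path import expanduser
from time import sleep

from precise_runner import PreciseEngine, PreciseRunner

engine = PreciseEngine(
    'precise-engine',
    expanduser('~/Projects/mycroft-precise/hallo-scotty2.pb'))
runner = PreciseRunner(
    engine,
    sensitivity=0.5,    # assumed to match mycroft-core's setting
    trigger_level=2,    # assumed to match mycroft-core's setting
    on_activation=lambda: print('Wake word detected!'))
runner.start()

while True:     # detection runs in a background thread; keep the script alive
    sleep(10)

If this behaves like precise-listen rather than like mycroft-core, the difference is more likely in the listener settings or the audio device mycroft-core opens than in the model itself.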

With precise-listen it reacts to the hot word, and when I run it within Mycroft it does not react … :frowning:

Without more information or the model, it will be difficult to find out more.

Hey Dodo, it does sound like something is different between precise-listen and when Mycroft-core uses it. Would you be willing to share the model to help work out what’s going on?

If you are happy to share it with the broader community, we’ve started a new repo for samples and completed models at:

Hi, thanks for your reply. I am happy to share the model; preparing the model plus the samples will just take me a bit more time.
For now, just the model is here: https://drive.google.com/file/d/1iR8RFRMYrbsNNidOYupFhAFIi8V4u2pG/view?usp=sharing
The hot word is “Hallo Scotty” with German pronunciation, or “Hi Scotty” in English. The German version works better because of the bigger fraction of German data, 80/20.

thanks

Jozef

I would recommend you not use two different phrasings, e.g. use “hallo scotty” only. The cadence of the delivery (4 syllables vs 3) varies between the two, which may make it less adept at correct recognition. Also try to gather not-wake-word samples that are similar but different, “a hot toddy” or something along those lines, to help improve performance.
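If it helps while reviewing the data, here is a small sketch to sanity-check how many samples sit in each folder of the layout precise-train expects; the dataset folder name is just a placeholder:

# Minimal sketch: count the .wav samples in a mycroft-precise training folder.
# The wake-word / not-wake-word / test layout is the one precise-train expects;
# 'hallo-scotty' is a hypothetical dataset folder name.
from pathlib import Path

data = Path('hallo-scotty')
for sub in ('wake-word', 'not-wake-word',
            'test/wake-word', 'test/not-wake-word'):
    folder = data / sub
    count = len(list(folder.glob('*.wav'))) if folder.is_dir() else 0
    print(f'{sub}: {count} samples')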

I tried your model on two different mics (built-in, external lapel) that I frequently use. I was never able to get it to activate with either variation of the wake word you posted. I think your data may need to be reviewed.

Yes, my model is strongly user- and microphone-dependent: a single user, recorded with a close-talk Sennheiser. That’s clear to me. But I think you won’t disagree that if it works with precise-listen without any problem (>95%), then it should work similarly within Mycroft, right?

Important: I do not know how good your German is, so the pronunciation is [H A L O: S K O T I]

It’s mediocre at best. :slight_smile:

How many wake word and not-wake-word samples did you use?

I do not quite understand your question. Do you mean for the training? 73 wake, 413 not-wake. And again: with precise-listen it works perfectly, so we can discuss the model quality, but the question remains: why does it work at 99% with precise-listen and only at 1% with mycroft-core?

And the question then is, what’s different between those two? Something isn’t the same, otherwise you’d be getting the same results.

What are you talking about?