Custom hot word stopped working

I trained and successfully set up my own hot word. It worked last week, then it suddenly stopped working. Was there a recent change in Precise, or something else? I can see that my own model is loaded and that the audio “bar” is moving nicely, but no hot word is ever detected. When I switch back to “hi Mycroft” it does work. When I check my model with precise-listen it works perfectly. It just no longer works within Mycroft, the way it did more than a week ago.
Thanks in advance for any hint.

What’s in the voice.log (/var/log/mycroft/)?
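Also worth double-checking is the merged configuration that mycroft-core actually ends up with, since user settings in ~/.mycroft/mycroft.conf override the defaults. Below is a minimal sketch, assuming a standard mycroft-core install run from its virtualenv; the “hallo scotty” key is just the custom hot word from this thread, and the local_model_file key name is the one the default mycroft.conf uses for local Precise models:

# Minimal sketch: print the hot word settings mycroft-core actually loads.
# Key names ("listener", "wake_word", "hotwords") come from the default
# mycroft.conf; "hallo scotty" is assumed to be the custom hot word name.
from mycroft.configuration import Configuration

conf = Configuration.get()
print("Active wake word:", conf["listener"]["wake_word"])
print("Hot word settings:", conf["hotwords"].get("hallo scotty"))

If that entry is missing or points at the wrong model file, it would be a configuration problem rather than a model problem.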

Hi, there is nothing special in the voice log. Below is the log. Some strange noise triggered the hot word once again; that does happen sometimes. I am actually glad it happened, because the correct recognition that follows confirms the audio setup is correct. In general, though, my hot word stopped working completely last week:
I removed the initial JACK server error messages.

10:05:00.903 - mycroft.messagebus.load_config:load_message_bus_config:33 - INFO - Loading message bus configs
10:05:02.833 - mycroft.util:find_input_device:222 - INFO - Searching for input device: pulse
ALSA lib pcm_dsnoop.c:638:(snd_pcm_dsnoop_open) unable to open slave
ALSA lib pcm_dmix.c:1108:(snd_pcm_dmix_open) unable to open slave
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
ALSA lib pcm_a52.c:823:(_snd_pcm_a52_open) a52 is only for playback
ALSA lib pcm_usb_stream.c:486:(_snd_pcm_usb_stream_open) Invalid type for card
ALSA lib pcm_usb_stream.c:486:(_snd_pcm_usb_stream_open) Invalid type for card
ALSA lib pcm_dmix.c:1108:(snd_pcm_dmix_open) unable to open slave
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
10:05:02.892 - mycroft.client.speech.listener:create_wake_word_recognizer:305 - INFO - creating wake word engine
10:05:02.892 - mycroft.client.speech.hotword_factory:load_module:279 - INFO - Loading "hallo scotty" wake word via precise
############ Using local model: ~/Projects/mycroft-precise/hallo-scotty2.pb
Here I am: 2 0.5
10:05:04.510 - mycroft.client.speech.listener:create_wakeup_recognizer:329 - INFO - creating stand up word engine
10:05:04.511 - mycroft.client.speech.hotword_factory:load_module:279 - INFO - Loading "wake up" wake word via pocketsphinx

Recognizer created

10:05:04.883 - mycroft.stt:__init__:35 - WARNING - ### Recognizer:
10:05:04.884 - mycroft.messagebus.client.client:on_open:67 - INFO - Connected
10:05:04.885 - mycroft.stt:__init__:36 - WARNING - {'uri': 'https://stt.eml.org/webSocket/rest/batch/de/cbcf9424-09ed-4a0b-bd0a-a3d201629bb8/?encoding=wav&srate=16000&result-type=text'}

Recognizer created

10:05:04.885 - mycroft.stt:__init__:338 - WARNING - YES!
10:05:34.287 - mycroft.session:get:74 - INFO - New Session Start: e9b8408e-f733-4d46-9182-899c9feb4d62
10:05:34.295 - __main__:handle_record_begin:36 - INFO - Begin Recording...
10:05:37.659 - __main__:handle_record_end:41 - INFO - End Recording...
10:05:37.663 - mycroft.client.speech.mic:listen:563 - INFO - Recording utterance
10:05:37.668 - __main__:handle_wakeword:57 - INFO - Wakeword Detected: hallo scotty
10:05:38.715 - mycroft.stt:execute:344 - WARNING - <Response [200]>
10:05:38.716 - __main__:handle_utterance:62 - INFO - Utterance: ['wie spät ist es']

Looks like it’s working?

Yes, 1 out of a few hundred. When I use precise-listen with the same model and settings, then 1 out of a hundred is not recognized. So exactly the opposite case :frowning:

If the model is being loaded successfully, and it’s only hearing you 1% of the time, then the model seems like the weak point here.

How much data did you train the model with? Wake words? Not-wake-words? Steps?

And why does the model work perfectly when I test it with precise-listen? I was also thinking that the model is the weak point. But in that case, some difference in audio processing between Mycroft and precise-listen would have to be involved. I do not see any other explanation so far …

So what’s different between trying it with precise-listen and having Mycroft use it?
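One concrete way to narrow that down would be to run the same model through precise_runner (the package mycroft-core wraps) with an explicit sensitivity and trigger level, and compare it against plain precise-listen. Here is a minimal sketch, assuming the model path from your log and assuming the “Here I am: 2 0.5” line is the trigger level and sensitivity mycroft-core is using:

# Minimal sketch: run the same .pb model the way mycroft-core does, via
# precise_runner, with explicit parameters. The values 2 / 0.5 are
# assumptions taken from the "Here I am: 2 0.5" line in the voice.log;
# 'precise-engine' must point at the precise-engine binary on your system.
from os.path import expanduser
from time import sleep

from precise_runner import PreciseEngine, PreciseRunner

engine = PreciseEngine(
    'precise-engine',
    expanduser('~/Projects/mycroft-precise/hallo-scotty2.pb'))
runner = PreciseRunner(
    engine,
    sensitivity=0.5,    # assumed to match mycroft-core's setting
    trigger_level=2,    # assumed to match mycroft-core's setting
    on_activation=lambda: print('Wake word detected!'))
runner.start()

while True:     # detection runs in a background thread; keep the script alive
    sleep(10)

If this behaves like precise-listen rather than like mycroft-core, the difference is more likely in the listener settings or the audio device mycroft-core opens than in the model itself.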

With precise-listen it reacts to the hot word, and when I run it within Mycroft it does not react … :frowning:

Without more information or the model, it will be difficult to find out more.

Hey Dodo, it does sound like something is different between precise-listen and when Mycroft-core uses it. Would you be willing to share the model to help work out what’s going on?

If you are happy to share it with the broader community, we’ve started a new repo for samples and completed models at:

Hi, thanks for your reply. I am happy to share the model; preparing the model plus the samples will just take me a bit more time.
For now, just the model is here: https://drive.google.com/file/d/1iR8RFRMYrbsNNidOYupFhAFIi8V4u2pG/view?usp=sharing
The hot word is “Hallo Scotty” with German pronunciation, or “Hi Scotty” in English. The German version works better because of the bigger fraction of German data, 80/20.

thanks

Jozef

I would recommend you not use two different phrasings, e.g. use “hallo scotty” only. The cadence of the delivery (4 syllables vs 3) varies between the two, which may make it less adept at correct recognition. Also try to gather not-wake-word samples that are similar but different, “a hot toddy” or something along those lines, to help improve performance.
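If it helps while reviewing the data, here is a small sketch to sanity-check how many samples sit in each folder of the layout precise-train expects; the dataset folder name is just a placeholder:

# Minimal sketch: count the .wav samples in a mycroft-precise training folder.
# The wake-word / not-wake-word / test layout is the one precise-train expects;
# 'hallo-scotty' is a hypothetical dataset folder name.
from pathlib import Path

data = Path('hallo-scotty')
for sub in ('wake-word', 'not-wake-word',
            'test/wake-word', 'test/not-wake-word'):
    folder = data / sub
    count = len(list(folder.glob('*.wav'))) if folder.is_dir() else 0
    print(f'{sub}: {count} samples')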

I tried your model on two different mics (built-in, external lapel) that I frequently use. I was never able to get it to activate with either variation of the wake word you posted. I think your data may need to be reviewed.

Yes, my model is strongly user- and microphone-dependent: a single user, recorded with a close-talk Sennheiser. That’s clear to me. But I think you won’t disagree that if it works with precise-listen without any problem (>95%), then it should work similarly within Mycroft, right?

Important: I do not know how good your German is, so the pronunciation is [H A L O: S K O T I]

It’s mediocre at best. :slight_smile:

How many wake word and not-wake-word samples did you use?

I do not quite understand your question. Do you mean for the training? 73 wake, 413 not-wake. And again: with precise-listen it works perfectly, so we can discuss the model quality, but the question remains: why does it work at 99% with precise-listen and only at 1% with mycroft-core?

And the question then is, what’s different between those two? Something isn’t the same, otherwise you’d be getting the same results.

What are you talking about?