Custom hot word stopped working

Yes, 1 out of a few hundred. When I run precise-listen with the same model and settings, only 1 out of a hundred fails. So exactly the opposite case :frowning:

If the model is being loaded successfully, and it’s only hearing you 1% of the time, then the model seems like the weak point here.

How much data did you train the model with? Wake words? Not-wake-words? Steps?

And why does the model work perfectly when I test it with precise-listen? I was also thinking that the model is the weak point. But in that case there would have to be some difference in audio processing between mycroft and precise-listen. I do not see any other explanation so far …

So what’s different from when you try it on precise-listen and when you have mycroft using it?

With precise-listen it reacts to the hot word, and when I start it within mycroft it does not react … :frowning:

Without more information or the model, it will be difficult to find out more.

Hey Dodo, it does sound like something is different between precise-listen and when Mycroft-core uses it. Would you be willing to share the model to help work out what’s going on?

If you are happy to share it with the broader community we’ve started a new repo for samples and completed models at:

Hi, thanks for your reply. I am happy to share the model; preparing the model plus samples will just take me a bit more time.
So just the model is here:
The hot word is “Hallo Scotty” with German pronunciation, or “Hi Scotty” in English. The German version works better because a larger fraction of the data is German (80/20).



I would recommend you not use two different phrasings; e.g., use “hallo scotty” only. The cadence of the delivery (4 syllables vs. 3) differs between the two, which may make the model less adept at correct recognition. Also try to gather not-wake-word samples that are similar but different, such as “a hot toddy”, to help improve performance.

I tried your model on two different mics (built-in, external lapel) that I frequently use. I was never able to get it to activate with either variation of the wake word you posted. I think your data may need to be reviewed.

Yes, my model is strongly user- and microphone-dependent: a single user recorded with a close-talk Sennheiser. That’s clear to me. But I think you will agree that if it works with precise-listen without any problem (>95%), then it should work similarly within Mycroft, right?

Important: I do not know how good your German is, so the pronunciation is [H A L O: S K O T I]

How many wake words and not wake words did you use?

I do not understand your question. Do you mean for the training? 73 wake, 413 not-wake. And again: with precise-listen it works perfectly, so we can discuss the model quality, but the question remains: why does it work 99% of the time with precise-listen and just 1% of the time with mycroft-core?
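
For anyone who wants to report the same numbers for their own model, here is a minimal sketch that counts the clips. It assumes the folder layout described in the Precise training docs (`wake-word/`, `not-wake-word/`, and copies of both under `test/`); the folder name `hallo-scotty-data` and the function names are made up for illustration.

```python
from pathlib import Path

LABELS = ("wake-word", "not-wake-word")


def _count_wavs(folder):
    """Count .wav files under folder (including subfolders); 0 if it is missing."""
    if not folder.is_dir():
        return 0
    return sum(1 for _ in folder.rglob("*.wav"))


def count_samples(data_dir):
    """Return clip counts per class for the training and test splits."""
    root = Path(data_dir)
    counts = {}
    for label in LABELS:
        counts[label] = _count_wavs(root / label)
        counts["test/" + label] = _count_wavs(root / "test" / label)
    return counts


def summarize(counts):
    """Format the training counts and the not-wake to wake ratio."""
    wake = counts["wake-word"]
    not_wake = counts["not-wake-word"]
    if wake == 0:
        return "no wake-word clips found"
    ratio = not_wake / wake
    return f"{wake} wake, {not_wake} not-wake (about {ratio:.1f} not-wake clips per wake clip)"


if __name__ == "__main__":
    # Hypothetical data folder; replace with your own.
    print(summarize(count_samples("hallo-scotty-data")))
```

With the numbers above (73 and 413) this comes to roughly 5.7 not-wake clips per wake clip.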

And the question then is, what’s different between those two? Something isn’t the same, otherwise you’d be getting the same results.

What are you talking about?

Something is different about how your model is getting used in mycroft vs. how it’s getting used when you’re trying precise-listen. I can’t diagnose that remotely without much more info.

If your model that you uploaded works for you on precise-listen, great. All you should have to do is upload it to your precise host and set the config for it.
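
For the config step, something along these lines should be close. This is a sketch from memory of the `hotwords` and `listener` sections of mycroft.conf, so please check the key names against the documentation for your Mycroft version. The model path is a made-up example, and the sensitivity and trigger level values are only placeholders. The script only builds and prints the JSON fragment; it does not touch any config file.

```python
import json


def hotword_config(wake_word, model_path, sensitivity=0.5, trigger_level=3):
    """Build a mycroft.conf fragment that points a hot word at a local Precise model.

    Key names are assumed from the Mycroft docs; verify them for your version.
    """
    return {
        "listener": {"wake_word": wake_word},
        "hotwords": {
            wake_word: {
                "module": "precise",
                "local_model_file": model_path,   # hypothetical path
                "sensitivity": sensitivity,       # placeholder value
                "trigger_level": trigger_level,   # placeholder value
            }
        },
    }


if __name__ == "__main__":
    fragment = hotword_config("hallo scotty", "/home/user/models/hallo-scotty.pb")
    print(json.dumps(fragment, indent=2))
```

If sensitivity or trigger level differ from what you pass to precise-listen, that alone could explain different behaviour, so it is worth making them match before comparing.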

One thing to try is running the precise-engine on the precise host with the model and passing it wav files of your wake word. If that completes as expected, then it seems like the mic would be the issue. For mic issues, you’d have to verify the input, volume, etc., and check any other mycroft config settings that might be in place for it.
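
Before feeding wav files to the engine, it is worth checking that they are in the format Precise is normally trained on, which as far as I know is 16 kHz, mono, 16-bit PCM, and that the recording level is not extremely low. A minimal sketch using only the Python standard library; the function names are my own and the file name in the usage comment is hypothetical.

```python
import array
import sys
import wave


def describe_wav(path):
    """Return format facts and the peak level (0.0 to 1.0) of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        sample_width = wav.getsampwidth()  # bytes per sample
        rate = wav.getframerate()
        n_frames = wav.getnframes()
        frames = wav.readframes(n_frames)

    peak = None  # only computed for 16-bit audio
    if sample_width == 2 and frames:
        samples = array.array("h", frames)  # signed 16-bit values
        if sys.byteorder == "big":
            samples.byteswap()  # WAV sample data is little-endian
        peak = max(abs(s) for s in samples) / 32768.0

    return {
        "channels": channels,
        "sample_width": sample_width,
        "rate": rate,
        "duration": n_frames / rate if rate else 0.0,
        "peak": peak,
    }


def matches_expected_format(info):
    """True if the clip is 16 kHz, mono, 16-bit (assumed Precise input format)."""
    return (
        info["channels"] == 1
        and info["sample_width"] == 2
        and info["rate"] == 16000
    )


# Usage (hypothetical file name):
#   info = describe_wav("hallo-scotty-01.wav")
#   print(info, matches_expected_format(info))
```

A clip recorded through the same mic setup Mycroft uses that shows a very low peak, or a different sample rate or channel count, would point at the input side rather than the model.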

OK, now I understand you. Yes, I fully agree with you, something is different at the mycroft/precise level. All I have found out so far is that my Mycroft is not using the same precise as I use when I run precise-listen. That’s why I started out suspecting differences in some audio processing.
I also have no clue what a precise host is. But I have the keyword now, so I will look into it. Thanks.

The precise host is the computer you’re running precise on.

I am running everything on one computer. So there is just one audio input. The same for both.