I’ve incrementally trained a Keras model on the dev version of mycroft-precise with quite an extensive amount of material, reaching 97% accuracy.
I tested it with precise_listen, with the expected outcome: it dings only on “samira” (the wake word) or “samira”-ish words, with 100% accuracy. I also tested it against the TV and other common noises around here. Everything as expected so far.
But once that model was implemented in the live setup, things turned upside down. Recognition of “samira” is at about 10%, and the church bell triggers it with incredible accuracy.
I triple-checked the config, which is in line with the ones given in the docs. The wake word is set to “samira” (mycroft_cli_client). And from what I can tell, the voice.log does not indicate any major problems:
2020-07-25 17:34:01.048 | INFO | 706 | mycroft.client.speech.listener:create_wake_word_recognizer:323 | Creating wake word engine
2020-07-25 17:34:01.050 | INFO | 706 | mycroft.client.speech.listener:create_wake_word_recognizer:346 | Using hotword entry for samira
2020-07-25 17:34:01.052 | WARNING | 706 | mycroft.client.speech.listener:create_wake_word_recognizer:348 | Phonemes are missing falling back to listeners configuration
2020-07-25 17:34:01.054 | WARNING | 706 | mycroft.client.speech.listener:create_wake_word_recognizer:352 | Threshold is missing falling back to listeners configuration
2020-07-25 17:34:01.060 | INFO | 706 | mycroft.client.speech.hotword_factory:load_module:403 | Loading "samira" wake word via precise
2020-07-25 17:34:03.368 | INFO | 706 | mycroft.client.speech.listener:create_wakeup_recognizer:360 | creating stand up word engine
2020-07-25 17:34:03.371 | INFO | 706 | mycroft.client.speech.hotword_factory:load_module:403 | Loading "wake up" wake word via pocketsphinx
2020-07-25 17:34:03.672 | INFO | 706 | mycroft.messagebus.client.client:on_open:114 | Connected
2020-07-25 17:35:51.805 | INFO | 706 | mycroft.session:get:74 | New Session Start: b49e73cc-c657-4611-8934-7a049a81546c
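In case it helps anyone reproduce this, here is a small sketch for pulling only the warnings out of a longer voice.log. It assumes one entry per line in the pipe-separated layout shown above (timestamp, level, PID, origin, message). The names `parse_entry` and `warning_entries` are made up for this example, and the sample lines are copied from my log.

```python
# Sample entries copied from the voice.log excerpt above.
SAMPLE = """\
2020-07-25 17:34:01.050 | INFO | 706 | mycroft.client.speech.listener:create_wake_word_recognizer:346 | Using hotword entry for samira
2020-07-25 17:34:01.052 | WARNING | 706 | mycroft.client.speech.listener:create_wake_word_recognizer:348 | Phonemes are missing falling back to listeners configuration
2020-07-25 17:34:01.054 | WARNING | 706 | mycroft.client.speech.listener:create_wake_word_recognizer:352 | Threshold is missing falling back to listeners configuration
"""


def parse_entry(line):
    """Split one log line into its five pipe-separated fields.

    Returns None for lines that do not match the assumed layout.
    """
    # maxsplit=4 keeps any " | " inside the message itself intact.
    parts = [part.strip() for part in line.split(" | ", 4)]
    if len(parts) != 5:
        return None
    timestamp, level, pid, origin, message = parts
    return {
        "timestamp": timestamp,
        "level": level,
        "pid": pid,
        "origin": origin,
        "message": message,
    }


def warning_entries(text):
    """Return the parsed entries whose level is WARNING."""
    entries = (parse_entry(line) for line in text.splitlines() if line.strip())
    return [e for e in entries if e is not None and e["level"] == "WARNING"]


for entry in warning_entries(SAMPLE):
    print(entry["timestamp"], entry["message"])
```

On a real log, `SAMPLE` would be replaced by the contents of the voice.log file.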
A couple of questions regarding that log:
Why is a pocketsphinx wake word loaded, since none is set up in the conf? Is it a fallback?
And why are phonemes and thresholds missing? (pocketsphinx stuff?)
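To make the phonemes/threshold question concrete, here is a rough sketch of a check for which keys a hotword entry lacks. The config below is a hypothetical example shaped like the ones in the docs, not my actual file: the model path and the `sensitivity` and `trigger_level` values are placeholders, and the helper `missing_hotword_keys` is invented for this post. I am also assuming the warnings refer to `phonemes` and `threshold` keys in the hotword entry.

```python
import json

# Hypothetical config; path and numeric values are placeholders.
EXAMPLE_CONF = """
{
  "listener": {"wake_word": "samira"},
  "hotwords": {
    "samira": {
      "module": "precise",
      "local_model_file": "/path/to/samira.pb",
      "sensitivity": 0.5,
      "trigger_level": 3
    }
  }
}
"""

# Keys to look for in the hotword entry (an assumption based on the warnings).
CHECKED_KEYS = ("module", "local_model_file", "phonemes", "threshold")


def missing_hotword_keys(conf, keys=CHECKED_KEYS):
    """Return (wake_word, missing_keys) for the configured wake word.

    missing_keys is None when the wake word has no hotword entry at all.
    """
    word = conf.get("listener", {}).get("wake_word")
    entry = conf.get("hotwords", {}).get(word)
    if entry is None:
        return word, None
    return word, [key for key in keys if key not in entry]


conf = json.loads(EXAMPLE_CONF)
word, missing = missing_hotword_keys(conf)
print(word, missing)  # samira ['phonemes', 'threshold']
```

With an entry like this one, the two keys reported as missing match the two warnings in the log, which is why I suspect they are pocketsphinx-specific.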
"threshold_config": [[6, 4]], "threshold_center": 0.2)
What has gone so wrong that the live implementation is that inaccurate? Is there a known problem with
dev? Should I step back to