New wakeword library and models - openWakeWord

Hey all!

I’ve been following the progress of Mycroft and similar open-source digital assistant frameworks for some time, and one area I’ve always found challenging is finding a good wakeword/wake phrase framework with pre-trained models.

Mycroft Precise works fairly well, but it is relatively difficult to collect data for training new models, and the false-accept rate is a bit too high in many cases. Commercial options like Picovoice Porcupine can work very well, but the limitations on custom models in the free tier (while totally understandable) limit many applications.

As such, I’ve been working for some time on a new open-source framework and set of methods to build wakeword/wake phrase models, and the initial version was just released: openWakeWord.

You can also try a real-time demo right in your browser via HuggingFace Spaces.

By leveraging an impressive pre-trained model from Google (more details in the openWakeWord repo) and some of the text-to-speech advances from the last two years, I’ve been able to train models with 100% synthetic audio and still show good performance on real-world examples. For example, here is the false-accept/false-reject curve for Picovoice Porcupine, openWakeWord, and Precise for a “hey mycroft” model based on real-world examples that I collected. None of these models were trained on my voice at any point.
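For anyone who wants to produce a similar curve for their own recordings, here is a minimal sketch of how a false-accept/false-reject curve can be computed from per-clip scores. It is not the evaluation code used for the plot above; the function name, the scores, and the number of negative hours are all made up for illustration.

```python
def fa_fr_curve(positive_scores, negative_scores, negative_hours, thresholds):
    """Return (threshold, false_reject_rate, false_accepts_per_hour) tuples.

    positive_scores: peak model score for each clip that contains the wake phrase.
    negative_scores: peak model score for each clip that does not contain it.
    negative_hours:  total duration of the negative audio, in hours.
    """
    curve = []
    for t in thresholds:
        # A wake-phrase clip is falsely rejected when its score stays below the threshold.
        false_rejects = sum(1 for s in positive_scores if s < t)
        # A negative clip is falsely accepted when its score reaches the threshold.
        false_accepts = sum(1 for s in negative_scores if s >= t)
        curve.append((t, false_rejects / len(positive_scores), false_accepts / negative_hours))
    return curve


# Hypothetical scores, not results from any real model.
positives = [0.9, 0.8, 0.4, 0.95]
negatives = [0.1, 0.6, 0.05, 0.2, 0.3]
curve = fa_fr_curve(positives, negatives, negative_hours=2.0, thresholds=[0.25, 0.5, 0.75])
for threshold, frr, fa_per_hour in curve:
    print(f"threshold={threshold:.2f}  false-reject rate={frr:.2f}  false accepts/hour={fa_per_hour:.2f}")
```

Sweeping the threshold like this shows the trade-off directly: raising it lowers false accepts per hour at the cost of more false rejects.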

If anyone finds this interesting or useful I would greatly appreciate feedback on how well the models work for different voices and environments, as well as general suggestions for new features and improvements.

Thanks!


New OVOS plugin for this coming up, but only in 2023 :smiley:


Very cool!

Will check it out. :+1: :muscle:

Very nice! Is there a plugin to integrate this into Mycroft already, or is that being worked on? I couldn’t find docs on integrating…

as mentioned above I’ll be creating a plugin sometime next week :slight_smile:

@JarbasAl, thanks for offering to work on a plugin! Let me know if you run into any issues, happy to assist.

Hopefully it’s relatively simple to integrate. Looking at how hotwords/wakewords are handled in Mycroft Core (e.g., here), as long as the frame data can be prepared properly, calling the openWakeWord models themselves should be straightforward.
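A rough sketch of what that frame preparation could look like, assuming the audio arrives as raw 16-bit, 16 kHz mono PCM bytes in arbitrary chunk sizes and that the model wants fixed 80 ms frames (1280 samples). The `FrameBuffer` class, `run_detector`, and the stand-in `fake_predict` are hypothetical names for illustration, not part of Mycroft Core or openWakeWord.

```python
from array import array

SAMPLE_RATE = 16000   # Hz; assumed input sample rate
FRAME_SAMPLES = 1280  # 80 ms at 16 kHz; assumed frame size


class FrameBuffer:
    """Accumulate raw 16-bit PCM bytes and emit fixed-size frames."""

    def __init__(self, frame_samples=FRAME_SAMPLES):
        self.frame_bytes = frame_samples * 2  # 2 bytes per int16 sample
        self._pending = bytearray()

    def feed(self, chunk):
        """Add a chunk of bytes; return the complete frames as array('h') objects."""
        self._pending.extend(chunk)
        frames = []
        while len(self._pending) >= self.frame_bytes:
            raw = bytes(self._pending[:self.frame_bytes])
            del self._pending[:self.frame_bytes]
            frame = array("h")     # signed 16-bit samples, native byte order
            frame.frombytes(raw)
            frames.append(frame)
        return frames


def run_detector(chunks, predict, threshold=0.5):
    """Pass each complete frame to `predict`; return indices of frames that triggered."""
    buffer = FrameBuffer()
    hits = []
    index = 0
    for chunk in chunks:
        for frame in buffer.feed(chunk):
            scores = predict(frame)  # expected shape: {model_name: score}
            if any(score >= threshold for score in scores.values()):
                hits.append(index)
            index += 1
    return hits


def fake_predict(frame):
    """Stand-in for a real model: 'detects' any frame containing loud samples."""
    return {"hey_mycroft": 1.0 if max(frame) > 1000 else 0.0}


silence = array("h", [0] * 2000).tobytes()
loud = array("h", [5000] * 2000).tobytes()
hits = run_detector([silence, loud], fake_predict)
print(hits)
```

With the real library, `predict` would presumably wrap the openWakeWord model's prediction call on a NumPy int16 array built from each frame; check the repo for the exact API before relying on this.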


Thanks for such a cool project!

Also shared it in the SecretSauceAI chat, and it’s getting some good feedback over there as well.

Hey, I’m trying to use openWakeWord’s automatic_model_training_simple.ipynb, but have been running into issues on the last cell. The error revolves around TensorFlow Addons and
No such file or directory: ‘my_custom_model/hey_ARCKK.onnx’.

I have a few questions about using this notebook.

  1. What operating system did you use for this Google Colab?
  2. Did you have issues using this notebook if you changed the runtime type?

Bump. I am encountering the same issue when changing the type to GPU.

I opened an issue on their GitHub: Onnx file Not Found · dscripka/openWakeWord · Discussion #82 · GitHub
I found that if you have already attempted training on an account, the ONNX file won’t be found on the next attempt after changing any parameters in the Google Colab notebook.
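In case it helps anyone debugging the same error, here is a small sketch for checking what the training step actually wrote before the last cell tries to load the model. It only lists files and does not fix the underlying problem; `find_onnx_models` is a hypothetical helper, and `my_custom_model` is just the directory name from the error message above.

```python
from pathlib import Path


def find_onnx_models(output_dir):
    """Return sorted relative paths of .onnx files under output_dir ([] if it is missing)."""
    root = Path(output_dir)
    if not root.is_dir():
        return []
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.onnx"))


# Directory name taken from the error message; adjust it to match your notebook settings.
found = find_onnx_models("my_custom_model")
if found:
    print("ONNX files found:", found)
else:
    print("No .onnx files found; the export step may not have run or may have written elsewhere.")
```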