Custom wakewords R Us!


#1

Wanting to try a custom wake word but not sure you can get a precise model made? Having problems getting a model built that works as expected? Just need some more data for your wake word?

Precise-Community-Data may be the place for you! It’s a place to build a community-sourced dataset, oriented specifically towards custom wake words.

For a limited time only, and for the simple price of uploading your wakewords, I’m doing automated custom builds of precise wakewords. Get your wake words and some not-wake-words uploaded, and I’ll build models for them over the following week (probably less). They’ll be automated builds based on your wakewords plus all the not-wake-word data I can use (Google words, public noises archive, PCD nww’s, etc). No guarantee on the quality of the end model, though they usually turn out with recognition percentages in the high 90s.

Feel free to post questions here, or upload data at the repo. We’re happy to start accepting more, and hope to build a much larger dataset for everyone to make better models from.

PAQ:

How many words do you need to upload?
For each wakeword at least 20. The more, the merrier.
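A quick way to check that minimum before uploading is to count the clips in a directory. This is only a minimal sketch: the function names count_clips and enough_clips are made up for illustration, and it assumes the clips are .wav files sitting directly in one directory.

```shell
#!/bin/sh
# Sketch: check that a directory holds enough wake word clips.
# count_clips and enough_clips are hypothetical helper names.

# count_clips DIR: print the number of .wav files directly inside DIR.
count_clips() {
    n=0
    for f in "$1"/*.wav; do
        # With no matches the glob stays literal, so test that the file exists.
        [ -e "$f" ] && n=$((n + 1))
    done
    printf '%s\n' "$n"
}

# enough_clips DIR [MIN]: succeed if DIR holds at least MIN clips (default 20).
enough_clips() {
    [ "$(count_clips "$1")" -ge "${2:-20}" ]
}
```

After sourcing it, something like `enough_clips path/to/clips || echo "record a few more"` gives a yes/no answer; the path is a placeholder.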

Do you need to upload not-wake-words?
Yes…they’re at least as useful as wake words, and if you do targeted nww’s, even more so.

I don’t want to upload my voice since it might get recognized from my account, though.
Then don’t. Or make a secondary git account and upload from it. Or a tertiary one. Lots of ways to obfuscate things if you want. Most people, and so far computers, can’t accurately identify a speaker from a very short sample of speech without additional context.

I don’t have a GitHub account or know how to use git.
You should get one! The general usage isn’t too difficult. On the off chance you have some real issue with this, send me a message over on the chat system.

I don’t want to upload under public domain.
There are a couple of other license types you can explore, but the Creative Commons licenses that allow derivative use would be best.


#2

Hello. If anyone is interested, I’m working on a skill for my own wakewords. The goal is to be able to improve the wakeword over time with use and upload automatically. If you’d like to help, meet me on chat.


#3

That’s a very generous offer, thanks baconator!!

Hope everyone makes the most of this while it’s available!


#4

Great!!! :slight_smile:

I guess this just covers English phonemes and so on, doesn’t it?


#5

If they’re in non-English languages, you can upload them and I’ll give it a whirl.


#6

Great!! :smiley:

Then I will try this with “computer” and/or “house” in Spanish.
As I’m a complete newbie at this, I have some very basic questions…

· I need to record the wake word in clips; the more recordings, the better.
· Do you recommend any audio program to record the words?
· Going by the Athena example, I noticed you recorded the wake word at several distances and with different people. I will try to do the same, without distinguishing them in the naming convention.
· Once I have the set of clips recorded in the specified format, I need to fork your repo and make a PR, sending you the wav files in the structure you need and with the naming convention you want.
· I just need to put the wav files in the wake-word/lang-short/ directory, alongside the README.md and Licenses.
· While it seems obvious what to do with the noises folder (I could record a cough, a siren, etc, and then associate it with a description through the metadata.csv file), I have no idea what to do with the lang-short not-wake-words, as there is no example at the moment. Perhaps saying some random words? Recording the TV?


#7

Don’t submit copyrighted stuff, please. :slight_smile:
For recording, I had a bunch of saved wake words from my picrofts, as well as a few from manually recording. I tended to use arecord -d 3 asdf.wav a lot.

Yes, variations in distance, speaker, speed, and inflection all help improve how robust the model can be.

For Athena, some of my nww’s are words like athlete, christina, or gasoline. They were developed as I got false positives from my picroft over a few weeks (and from background tv noise, but I recorded my own words instead). They’re getting the metadata.csv completed and will be uploaded soon. See the targeted nww link in the first post for more. NWW’s should also NOT be existing wakewords. If anything, those should be submitted to the relevant directory!


#8

Heheh, don’t be afraid, I’m releasing my voice under any public license. I really love WTFPL, so I guess I will release them under that license :wink:

The only copyrighted stuff could be in the noise part if I have the TV turned on; would that be OK? If it’s not OK, well, I guess a siren, a truck or a blender sounds much the same in your country as in mine :stuck_out_tongue:

arecord you say? Fine, so the command line for Spanish would be exactly this:
arecord -f S16_LE -r 16000 -d3 wakeword-es-$(uuid).wav

On Thursday I’ll take a few days off work and will make some recordings.


#9

Looks good. On Debian, the uuid-runtime package provides uuidgen. A more motivated scripting type would hook all that together for ease of use, haha.
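For anyone who wants to try, here is a rough sketch of hooking the arecord flags from the previous post together with uuidgen. Nothing here is an official tool: the function names, the RECORDER override variable, and the fallback to /proc/sys/kernel/random/uuid (Linux only) are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: record several wake word clips in a row with unique file names.
# RECORDER is a hypothetical override hook; it defaults to arecord.
RECORDER="${RECORDER:-arecord}"

# new_id: print a unique ID, using uuidgen if installed,
# otherwise the Linux kernel's UUID generator.
new_id() {
    if command -v uuidgen >/dev/null 2>&1; then
        uuidgen
    else
        cat /proc/sys/kernel/random/uuid
    fi
}

# clip_name LANG: print a file name of the form wakeword-LANG-<uuid>.wav
clip_name() {
    printf 'wakeword-%s-%s.wav\n' "$1" "$(new_id)"
}

# record_clips LANG COUNT [DIR]: record COUNT three-second clips into DIR
# (default: current directory) as 16 kHz, 16-bit little-endian audio.
record_clips() {
    lang="$1"; count="$2"; dir="${3:-.}"
    mkdir -p "$dir" || return 1
    i=1
    while [ "$i" -le "$count" ]; do
        printf 'Clip %s of %s: speak now\n' "$i" "$count" >&2
        "$RECORDER" -f S16_LE -r 16000 -d 3 "$dir/$(clip_name "$lang")" || return 1
        i=$((i + 1))
    done
}
```

After sourcing the file, `record_clips es 20 wake-word/es` would prompt for twenty clips in a row; the output directory there is only an example.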


#10

Just PR’ed the wake word.
For the phonemes in the README.md I wrote
EY OR DE NAA DOR
but I really don’t have a clue how to write it properly. In CMUSphinx it’s written like
O R D E N A D O R
but that’s with the Spanish dict, so… I’ll leave it to people with more knowledge to transcribe it properly…

PS: I’ve just realized I need to create the metadata.csv for the not-wake-words. I’m going to do it right now.
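A minimal sketch of generating a starting file follows. It assumes a two-column filename,description layout, which is a guess; the repo’s real metadata.csv format may differ, so check it before submitting. The function names are hypothetical too.

```shell
#!/bin/sh
# Sketch: write one CSV row per .wav file in a directory.
# ASSUMPTION: the "filename,description" columns are illustrative only,
# not the repo's confirmed metadata.csv format.

# csv_quote TEXT: wrap TEXT in double quotes, doubling any embedded quotes.
csv_quote() {
    printf '"%s"' "$(printf '%s' "$1" | sed 's/"/""/g')"
}

# write_metadata DIR DESCRIPTION: create DIR/metadata.csv with a header row
# and one row per DIR/*.wav, each carrying the same DESCRIPTION.
write_metadata() {
    dir="$1"; desc="$2"
    out="$dir/metadata.csv"
    printf 'filename,description\n' > "$out"
    for f in "$dir"/*.wav; do
        # With no matches the glob stays literal, so skip missing files.
        [ -e "$f" ] || continue
        printf '%s,%s\n' "$(csv_quote "$(basename "$f")")" "$(csv_quote "$desc")" >> "$out"
    done
}
```

Every row gets the same description, so it is only a starting point; the per-file descriptions would still be edited by hand afterwards.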


#11

OK, PR with 2 models pending (precise .2 and .3). I used a partial subset of the nww data I have, since there isn’t really a comparable dataset yet, but still got to high .99s in training. Should be merged by tomorrow at the latest.


#12

So… any luck with the Spanish wake word, @baconator?


#13

There’s a pending PR with the models if you go peruse the repo!