Picroft - Custom wakeword not accurate

I’m trying to make a custom wake word, following this guide: Training your own wake word. These are the steps I ran:

  • precise-collect
  • precise-train
  • precise-train-incremental
  • precise-test
  • precise-listen
  • precise-convert

After completing all those steps, I edited my user configuration (a JSON file) with this command:

mycroft-config edit user

Here is my config:

{
  "max_allowed_core_version": 21.2,
  "listener": {
    "wake_word": "hey-nestore"
  },
  "hotwords": {
    "hey-nestore": {
      "module": "precise",
      "local_model_file": "/home/pi/mycroft-precise/hey-nestore.pb",
      "sensitivity": 0.1,
      "trigger_level": 10
    }
  }
}

Sensitivity is set to 0.1 to prevent false positives, and trigger_level to 10 to keep Mycroft from activating unintentionally.
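Before blaming the model, it can help to rule out a configuration mismatch. The following is a minimal sketch, not part of Mycroft: `check_hotword_config` is a hypothetical helper that only checks that `listener.wake_word` names an entry under `hotwords` and that `local_model_file`, if given, exists on disk. It assumes the user config is plain JSON, as in the example above.

```python
import json
import os


def check_hotword_config(config):
    """Return a list of human-readable problems found in a parsed config dict.

    Hypothetical helper for illustration; Mycroft does not ship this function.
    """
    problems = []
    wake_word = config.get("listener", {}).get("wake_word")
    hotwords = config.get("hotwords", {})
    if wake_word is None:
        problems.append("listener.wake_word is not set")
    elif wake_word not in hotwords:
        problems.append(f"no hotwords entry named {wake_word!r}")
    else:
        model = hotwords[wake_word].get("local_model_file")
        # Only check the path when one is configured.
        if model and not os.path.isfile(model):
            problems.append(f"model file not found: {model}")
    return problems


# The configuration from the post, parsed from its JSON text.
config = json.loads("""
{
  "listener": {"wake_word": "hey-nestore"},
  "hotwords": {
    "hey-nestore": {
      "module": "precise",
      "local_model_file": "/home/pi/mycroft-precise/hey-nestore.pb",
      "sensitivity": 0.1,
      "trigger_level": 10
    }
  }
}
""")
for problem in check_hotword_config(config):
    print(problem)
```

An empty result only means the names and paths line up; it says nothing about the quality of the model itself.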

Then I reload the config with:

mycroft-config reload

And finally launch the client with:

mycroft-cli-client

Even with this model and training (including the “precise-train-incremental” step), detection of my wake word is very inaccurate. When I clap my hands, the client detects my wake word.
Am I the only one who can’t make an accurate model?

I’ve just trained my own wake-word (‘Hey Doc’) and it works quite well.

I can share my experience. I can’t say this is the best way to train a wake-word, but I achieved a good result.

These are my tips:

Wake-word

  • Collect ~ 400 samples
  • Vary speed, inflection, volume, tone, distance from mic
  • Change the mic
  • Record the voice of as many people as possible

Not wake-word

  • Should be ~ 4000 samples
  • Use precise-listen <wake-word>.net -d <wake-word>/not-wake-word to record words similar to the wake-word (rhymes, same prefix, same suffix, etc.)
  • Use the bunch of long audio files here http://downloads.tuxfamily.org/pdsounds/pdsounds_march2009.7z and here http://download.tensorflow.org/data/speech_commands_v0.02.tar.gz
  • Execute precise-listen <wake-word>.net -d <wake-word>/not-wake-word while you are watching a film or a series on Netflix or whatever
  • Execute precise-listen <wake-word>.net -d <wake-word>/not-wake-word and let it run for days, so that all spurious activations caused by normal household noises (doors slamming, appliances, people talking, etc.) are automatically saved to the not-wake-word folder
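To keep an eye on the ~400 / ~4000 targets above while collecting, a small script can count the recordings. This is a minimal sketch: `sample_counts` is a hypothetical helper, and it assumes the data folder contains `wake-word` and `not-wake-word` subfolders holding `.wav` files (possibly in nested folders).

```python
from pathlib import Path


def count_wavs(folder):
    """Count .wav files under folder, including subfolders (0 if it is missing)."""
    folder = Path(folder)
    if not folder.is_dir():
        return 0
    return sum(1 for _ in folder.rglob("*.wav"))


def sample_counts(data_dir):
    """Return (wake, not_wake, ratio); ratio is None when there are no wake samples."""
    data_dir = Path(data_dir)
    wake = count_wavs(data_dir / "wake-word")
    not_wake = count_wavs(data_dir / "not-wake-word")
    ratio = not_wake / wake if wake else None
    return wake, not_wake, ratio


# "hey-doc" is a placeholder folder name; replace it with your own data folder.
wake, not_wake, ratio = sample_counts("hey-doc")
print(f"wake-word: {wake}, not-wake-word: {not_wake}, ratio: {ratio}")
```

With the sample counts suggested above, the ratio would come out at roughly 10 not-wake-word recordings per wake-word recording.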

Training

  • Use 1000 or 2000 epochs, e.g. precise-train -e 1000 <wake-word>.net <wake-word>/
  • Copy all the wav files from the wake-word folder to the test/wake-word folder, and do the same for not-wake-word
  • Training steps
    1. Collect and train with wake-word
    2. Add the recorded not-wake-word samples that sound similar to the wake-word and train again
    3. Train incrementally with the long audio files
    4. Add the not-wake-word samples captured during a film or from household noise and train again
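The copy step from the list above can be scripted. This is a minimal sketch under stated assumptions: `copy_samples_to_test` is a hypothetical helper, and the layout (`wake-word`, `not-wake-word`, and a `test` folder with the same two subfolders inside one data folder) is assumed from the paths mentioned in this thread. It only copies `.wav` files that sit directly in each folder.

```python
import shutil
from pathlib import Path


def copy_samples_to_test(data_dir):
    """Copy .wav files from <data_dir>/<label> into <data_dir>/test/<label>.

    Returns the number of files copied. Existing files with the same name
    in the test folders are overwritten.
    """
    data_dir = Path(data_dir)
    copied = 0
    for label in ("wake-word", "not-wake-word"):
        src = data_dir / label
        dst = data_dir / "test" / label
        dst.mkdir(parents=True, exist_ok=True)
        if not src.is_dir():
            continue  # nothing recorded for this label yet
        for wav in sorted(src.glob("*.wav")):
            shutil.copy2(wav, dst / wav.name)
            copied += 1
    return copied
```

Calling `copy_samples_to_test("hey-doc")` (a placeholder folder name) after each round of recording would refresh the test folders before re-running precise-test.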

Mycroft.conf setup

  • sensitivity: 0.2 or 0.3
  • trigger_level: 7 or 8
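One way to picture how these two settings interact is the simplified model below. It is a sketch based on my understanding, not Precise's actual code: it assumes the network emits one probability per audio chunk, that a chunk counts as "active" when its probability exceeds 1 - sensitivity, and that the wake word fires once the running count of active chunks exceeds trigger_level. The function name and the example probabilities are made up for illustration.

```python
def count_activations(probs, sensitivity=0.5, trigger_level=3):
    """Count wake-word triggers over a stream of per-chunk probabilities (0..1).

    Simplified model: no cooldown after a trigger, unlike a real listener.
    """
    threshold = 1.0 - sensitivity  # lower sensitivity means a higher bar per chunk
    streak = 0
    triggers = 0
    for p in probs:
        if p > threshold:
            streak += 1
            if streak > trigger_level:
                triggers += 1
                streak = 0
        elif streak > 0:
            streak -= 1  # quiet chunks slowly drain the counter
    return triggers


# A short burst (e.g. a clap the model briefly mistakes for the wake word)
# versus a longer run of high-confidence chunks (the spoken wake word).
clap = [0.0] * 5 + [0.95] * 4 + [0.0] * 5
word = [0.0] * 5 + [0.95] * 12 + [0.0] * 5

print(count_activations(clap, sensitivity=0.5, trigger_level=3))   # 1
print(count_activations(clap, sensitivity=0.2, trigger_level=7))   # 0
print(count_activations(word, sensitivity=0.2, trigger_level=7))   # 1
```

In this model, sensitivity decides whether a single chunk counts at all, while trigger_level decides how long the network has to stay confident before Mycroft reacts, so a brief noise can be filtered by either setting.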

There is also this link with lots of tips for training the wake-word: localcroft/Precise.md at master · el-tocino/localcroft · GitHub

And @sparkyvision has a more up-to-date guide as well: GitHub - sparky-vision/mycroft-precise-tips: mycroft-precise-tips

Thanks for sharing your experience. Have you tested how accurate your wake word is? If you run precise-test on your model, what statistics does it report?

Sure, I did it!

Here are my stats:

=== Counts ===
False Positives: 1
True Negatives: 3731
False Negatives: 7
True Positives: 422

=== Summary ===
4153 out of 4161
99.81 %

0.03 % false positives
1.63 % false negatives
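For anyone comparing their own numbers, the summary lines can be reproduced from the four counts. This sketch assumes the false-positive percentage is taken over all negative samples (FP + TN) and the false-negative percentage over all positive samples (FN + TP). Those definitions match the figures above, but I have not checked them against the precise-test source.

```python
def summarize(tp, tn, fp, fn):
    """Recompute summary statistics from the four confusion-matrix counts."""
    total = tp + tn + fp + fn
    correct = tp + tn
    return {
        "correct": correct,
        "total": total,
        "accuracy_pct": round(100 * correct / total, 2),
        "false_positive_pct": round(100 * fp / (fp + tn), 2),
        "false_negative_pct": round(100 * fn / (fn + tp), 2),
    }


# Counts from the precise-test output above.
stats = summarize(tp=422, tn=3731, fp=1, fn=7)
print(f"{stats['correct']} out of {stats['total']}")           # 4153 out of 4161
print(f"{stats['accuracy_pct']} %")                             # 99.81 %
print(f"{stats['false_positive_pct']} % false positives")       # 0.03 % false positives
print(f"{stats['false_negative_pct']} % false negatives")       # 1.63 % false negatives
```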

Sensitivity is set to 0.1 to prevent false positives, and trigger_level to 10 to keep Mycroft from activating unintentionally.

What is the difference between a false positive and an unintentional activation?