I am not really sure, as I am a bit of a TensorFlow geek and Precise is 100% TensorFlow. It's just packaged and branded as Mycroft Precise, but it is like any other TensorFlow model. Don't take my opinion for granted: ask on the TensorFlow forum or check the TensorFlow documentation, because so much here is basically wrong.
wake_words=421 to not_wake_words=52678 is hugely imbalanced to start with.
421 really is far too few wake_words. It likely should have a 0 on the end of it, but for balance it should match the number of not_wake_words.
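To put numbers on that imbalance, here is a rough sketch using only the two counts quoted above. The class-weight part is an assumption on my side: it is one common stopgap (Keras `Model.fit` accepts a `class_weight` dict), not something I know Precise to expose, and it is no substitute for actually collecting more wake_words.

```python
# Counts quoted above; everything else here is illustrative.
wake_words = 421
not_wake_words = 52678

ratio = not_wake_words / wake_words
print(f"not_wake_words per wake_word: {ratio:.1f}")

# Even with "a 0 on the end" of the wake_word count, a large gap remains.
ratio_with_extra_zero = not_wake_words / (wake_words * 10)
print(f"with {wake_words * 10} wake_words: {ratio_with_extra_zero:.1f}")

# Assumed stopgap: weight each class inversely to its frequency so that
# both classes contribute the same total weight to the loss.
total = wake_words + not_wake_words
class_weight = {
    0: total / (2 * not_wake_words),  # not_wake_word
    1: total / (2 * wake_words),      # wake_word
}
print(class_weight)
```

With weights like these each wake_word sample counts for far more than each not_wake_word sample, which also means every flaw in those few recordings is amplified, so a balanced dataset is still the better fix.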
As I was saying about TensorFlow models in general, it doesn't matter whether you use a GRU or one of the latest and greatest models: you are just talking a few % difference in accuracy, and the biggest contributor to accuracy is the dataset you use.
Also, you have stopped after 6 epochs, which means the full dataset has been trained on just 6 times. That is far too short for training any sort of KWS; I regularly train my own TensorFlow models to around the 200 epoch mark.
To create some more KW samples, use sox to augment what you have, or concatenate 2 words so your KW quantity becomes kw1*kw2. “Hey” is part of the Common Voice single-word segment.
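A minimal sketch of the kw1*kw2 idea, assuming you have one folder of recordings per word. The file names and the `concat_commands` / `augment_command` helpers are made up for illustration; the only sox behaviour relied on is that multiple input files are joined end to end by default and that the `pitch` and `tempo` effects exist. The script only builds the command lines, it does not run them.

```python
from itertools import product
from pathlib import Path


def concat_commands(kw1_files, kw2_files, out_dir):
    """Build one sox command per (kw1, kw2) pair; sox joins inputs end to end."""
    commands = []
    for first, second in product(kw1_files, kw2_files):
        out = Path(out_dir) / f"{Path(first).stem}_{Path(second).stem}.wav"
        commands.append(["sox", str(first), str(second), str(out)])
    return commands


def augment_command(src, dst, pitch_cents=0, tempo=1.0):
    """Build a sox command that shifts pitch (in cents) and changes tempo."""
    return ["sox", str(src), str(dst),
            "pitch", str(pitch_cents), "tempo", str(tempo)]


# Hypothetical file names, for illustration only.
hey = [f"hey_{i}.wav" for i in range(3)]
name = [f"name_{i}.wav" for i in range(4)]

pairs = concat_commands(hey, name, "kw_out")
print(len(pairs))  # one command for every kw1 x kw2 combination
print(pairs[0])
print(augment_command("in.wav", "out.wav", pitch_cents=100, tempo=1.1))
```

Once sox is installed, each list can be handed to `subprocess.run(cmd, check=True)`. Keep the pitch and tempo changes small so the result still sounds like a plausible speaker.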
Going back to this far-too-binary setup of kw vs not_kw as the only ‘labels’ or ‘classes’, whatever terminology you wish to use: it is never going to be great. Just pouring extra not_kw samples into a single label turns it into a see-saw of spectra with little cross entropy to the KW, and that will always be the case.
What I have seen written about training and setup is just absolutely bizarre compared to how TensorFlow works.
If you were going to be simplistic, you are setting up 2 spectra graphs and TensorFlow drops your input into each. With a binary setup like this you often get false positives not because the input is particularly close to the KW but because it is just far away from not_kw.
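A toy sketch of that false-positive effect. This is not how Precise or any real KWS model computes its output; the 1-D "feature" positions and the extra "unknown" class are assumptions made up purely to show the idea. Each class gets a centroid, and a softmax over negative squared distances turns "how far away am I" into probabilities.

```python
import math


def softmax_over_distance(x, centroids):
    """Turn squared distances to each class centroid into probabilities."""
    scores = {label: -(x - c) ** 2 for label, c in centroids.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


# Made-up 1-D positions, purely for illustration.
binary = {"kw": 1.0, "not_kw": 0.0}
with_unknown = {"kw": 1.0, "not_kw": 0.0, "unknown": 5.0}

x = 5.0  # far from kw, but even further from not_kw

p_binary = softmax_over_distance(x, binary)
p_multi = softmax_over_distance(x, with_unknown)
print(f"binary  p(kw) = {p_binary['kw']:.4f}")
print(f"3-class p(kw) = {p_multi['kw']:.2e}")
```

In the binary case the input is nowhere near the KW yet still comes out as an almost certain KW, simply because the only alternative is further away. Give it a third place to land and the KW probability collapses.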
Here and with Precise there has been a complete misconception of how TensorFlow works, how TensorFlow says models should be built, and how TensorFlow advises ‘class’ / ‘label’ balance should be treated, and strangely it has been regurgitated for years.
A few members have previously tried to rectify some of the flaws; see mycroft-precise-tips/README.md in the sparky-vision/mycroft-precise-tips repository on GitHub.
It's never going to be great, though, with a binary model where KW is vastly overfitted and not_KW is equally vastly underfitted because it contains so much variance.
But do not take my word for it: read up on TensorFlow, as Precise is 100% TensorFlow. Apologies that you have spent so much time creating that hugely imbalanced dataset; there is not much else that I can say.