Family Acceptance Factor

Hi

I have been using Mycroft for about a month on a Raspberry Pi (Picroft) and I am excited about the possibilities it enables. In the past I have looked briefly at the Echo and at Google AIY, the latter also running on the Raspberry Pi. In a lot of ways I prefer Mycroft over these two; however, there are a number of things about Mycroft that are inferior and significantly impact the Family Acceptance Factor. For instance, Mycroft’s response to its wake word is variable. Most of the time it will work for me, but both my wife and daughter find it extremely difficult (near impossible) to trigger Mycroft. When triggered, Mycroft is fairly slow in delivering responses to queries. On the other hand, neither my wife nor my daughter has any issue getting Google’s AIY to wake up, and it delivers responses almost instantly.

So do people envisage Mycroft’s performance improving to a level closer to AIY’s? If so, what barriers are currently restricting Mycroft’s speed and responsiveness, and what is the time frame for addressing them?

1 Like

Perhaps a custom wake word would work better for your family?

You can adjust the sensitivity a bit in the home.mycroft.ai settings (go to Settings, then Advanced, then Threshold).

1 Like

Hi @aussieW, thanks for raising this.
We’re not as mature as Google Assistant or Amazon Alexa yet, but we are putting significant effort into a number of areas that will improve performance over the next six months.

  • Wake Word training: Internally we’re running software that allows us to record utterances of the Wake Word; this data is then used as a corpus for machine learning to improve the recognition accuracy of the Wake Word. We will release this functionality to the broader user base when it’s a little less “geeks only”.

  • Responsiveness is a big hurdle, and this is largely a function of Mycroft having to “go external” to do lookups. We’re looking at ways of keeping more functions “local” to whatever device Mycroft is running on to get around this.

  • The other piece here is the range of skills that Mycroft currently has - to be comparable we need to build these out extensively, and we’re collaborating closely with our broader developer community on this.

Our long-term strategy is documented here

1 Like

Thanks @kathy-mycroft for the response. I appreciate the feedback.

In terms of improving the wake word hit rate, do you know where I can find information on tuning the wake word response (i.e. THRESHOLD, THRESHOLD MULTIPLIER and DYNAMIC ENERGY RATIO)? I just noticed another parameter: CHANNELS. It is currently set to 2, but I am using a Sony PS3 Eye, which has 4 channels … I wonder if I should change that value.
Maybe I would have more success if I changed the wake word engine to something like Snowboy, which is what I think AIY is using (I might be wrong on that).

In terms of responsiveness, my observation is that TTS is by far the biggest delay in the chain, and I believe it is performed locally. Should I be trying a different engine? Which ones does the platform support out of the box?

Very happy to help, @aussieW

This documentation is super-super-super alpha stage but might be useful:
https://github.com/MycroftAI/docs-rewrite/blob/master/03.your-home.mycroft.ai-account/01.your-home.mycroft-account.md#changing-your-wake-word
(this repo is where I am doing the rewrite of docs.mycroft.ai; it will then be consumed by a presentation layer)

The Channels setting should not have a bearing on Wake Word recognition accuracy, as it simply selects which audio channel the Listener listens on. As a first step I would recommend changing your Threshold value - up or down depending on whether Mycroft is too sensitive or not sensitive enough.
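If you’d rather set the value locally than through home.mycroft.ai, it lives in mycroft.conf. Here is a minimal sketch; the user-level config path (~/.mycroft/mycroft.conf) and the listener.threshold key name are assumptions from my own install, so verify them on yours:

```python
import json
from pathlib import Path

# Sketch: adjust the listener threshold in the user-level mycroft.conf.
# The path and key name are assumptions; verify them on your install.
conf_path = Path.home() / ".mycroft" / "mycroft.conf"

conf = json.loads(conf_path.read_text()) if conf_path.exists() else {}
# Nudge the value up or down, observe whether missed or false triggers
# improve, then restart the mycroft services to pick the change up.
conf.setdefault("listener", {})["threshold"] = 1e-90
conf_path.parent.mkdir(parents=True, exist_ok=True)
conf_path.write_text(json.dumps(conf, indent=4))
print("Wrote", conf_path)
```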

TTS is performed locally by Mimic, but I would be very surprised if it were the biggest contributor to the overall time of the request/response cycle, predominantly because it’s local. Of course this is conjecture, and I’d much rather have some data to base it on, but I’m not sure of the best way to test this.
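One rough way to get some data would be to time Mimic on its own, outside of Mycroft. A minimal sketch, assuming the mimic binary is on your PATH and accepts the flite-style -t/-o flags:

```python
import subprocess
import time

# Sketch: time how long mimic takes to synthesize one sentence to a wav file.
# Assumes the mimic binary is on PATH and accepts flite-style -t/-o flags.
sentence = "The quick brown fox jumps over the lazy dog."

start = time.monotonic()
subprocess.run(["mimic", "-t", sentence, "-o", "/tmp/mimic_test.wav"], check=True)
elapsed = time.monotonic() - start

print(f"Synthesized {len(sentence)} characters in {elapsed:.2f} seconds")
```

If that number is large on a Raspberry Pi, TTS really is the bottleneck; if it’s small, the delay is somewhere else in the request/response cycle.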

Best, Kathy

Thanks for the link to the doco. Most of the info I have already come across at some stage or another by trawling existing docs, forums and chat. But one thing I noticed was the number of audio channels for wake word detection: I was surprised to see that the default is 1 channel. Mine has shown 2 for as long as I can remember (I don’t recall ever changing it, but maybe I did). I will start playing with that and the other parameters to see if I can improve things.

My reasoning for TTS being a major factor in responsiveness is that when I ask a question, the text response appears on the screen fairly quickly, but the audio response comes much later. So it appears Mycroft knows what to say but takes a long time to synthesize it.

You can try using PicoTTS; it answers really fast.

Or use some online option.

Thanks. I tried to use Google TTS today on my Picrofts by following btotharye’s YouTube video instructions. However, it didn’t work on either of them (one is 0.8.2 and the other 0.9.1). I can see that the configuration has been received from home.mycroft.ai, but the Picrofts seem to completely ignore it and continue to use Mimic.
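One thing I’m going to try next is checking which TTS module Mycroft actually resolves after merging the default, remote (home.mycroft.ai) and local config layers. A minimal sketch, assuming mycroft-core is importable (i.e. run it inside the mycroft virtualenv on the Picroft):

```python
# Diagnostic sketch: print the TTS module Mycroft resolves after merging the
# default, remote (home.mycroft.ai) and local config layers. Assumes this runs
# inside the mycroft-core virtualenv where the mycroft package is importable.
from mycroft.configuration import Configuration

conf = Configuration.get()
print("Resolved TTS module:", conf.get("tts", {}).get("module"))
```

If that still prints mimic, the remote setting isn’t being merged at all, which would explain what I’m seeing.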

Are there any instructions for installing PicoTTS?

I recently started with Mycroft on my Raspberry Pi to interface it with some robotics. Overall, things seem to be working fine. I would like to echo @aussieW’s observation about Picroft failing to wake up to my wife’s or daughter’s voice. I have a Sony PlayStation Eye microphone. It picks up my voice very well, even across the room with background noise. It just does not respond to a higher-pitched voice, like a child’s or a woman’s, even in the absence of any background noise.

What I have tried so far is changing the threshold from 1e-50 to 1e-120, but it did not help. I have also changed the wake word, as someone suggested in this thread (I’m not sure what the criteria for a new word should be, so I just picked some other word). Does anyone have any suggestions for what I can try? Or is there a way to debug where the wake word detection is failing?
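One thing I’m planning to try is watching the messagebus for wake word events, to tell whether the detector is firing at all. A minimal sketch, assuming the default bus route of ws://localhost:8181/core and that the listener emits a recognizer_loop:wakeword message on each detection (both worth verifying):

```python
import json

# Debugging sketch: print every wake word detection the listener reports on
# the messagebus. Assumes the default route ws://localhost:8181/core and the
# "recognizer_loop:wakeword" message type; verify both on your install.
# Requires: pip install websocket-client
from websocket import create_connection

ws = create_connection("ws://localhost:8181/core")
print("Connected; say the wake word and watch for events...")
while True:
    msg = json.loads(ws.recv())
    if msg.get("type") == "recognizer_loop:wakeword":
        print("Wake word detected:", msg.get("data"))
```

If events show up for my voice but never for my wife’s or daughter’s, the failure is in the detector itself rather than further down the chain.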

My situation as well. Mycroft picks up my voice every damn time, even better than Google/Alexa do, but almost never my wife’s. So much so that she has actually threatened to throw my Picroft speaker away :stuck_out_tongue:

@KathyReid Um, maybe it’s that the corpus is dominated by male samples, and that’s what’s making it harder for the ladies to work with it?

1 Like

Actually, I have to say that since the release of 18.2.2 (possibly 18.2.1, but certainly after all the recent back-end problems) my wife and daughter have seen a huge improvement in recognition. I had built a button for them to use (https://github.com/aussieW/mycroft-push-to-listen) but they are starting to use it less and less for activation. The weak area, in my opinion, is still activating Mycroft while it is playing audio. We all have to use the button to stop it. Hopefully the recent bounty that has been offered will go a long way toward addressing this.

2 Likes

Edit: Just saw @aussieW’s post above. We recently switched to the Precise wake word engine by default, which should have a huge impact on accuracy. Try it again, and if your wife still finds it difficult to activate, feel free to read what I have written below:

So, as a first step, make sure you are running Precise by asking, “Hey Mycroft, what wake word engine are you using?” (Note that Google has trouble transcribing this phrase, so if it doesn’t work, try saying it again or vary it with something like “what listener are you using?”) Provided the response is that you are using Precise, then indeed our dataset consists of way too many men, which is something we very much want to fix.

If you would like to speed up the process, you can use this precise-collect script (don’t forget to install the requirements.txt at the bottom) to manually record a few samples of your wife, and I can fast-track them into the dataset.

Instructions for recording:

  • Leave a second or two of silence at the start of each recording
  • Leave no silence after the wake word
  • Record at least 12 or so samples per person

Yes, Mycroft replied that it is indeed Precise.

Cool, will do that soon. Where do I email these?
Also, somehow (you guys did something?) Mycroft has gotten better at understanding her since this afternoon. From 1 out of 10, the success rate is now around 5 out of 10.

1 Like

You can email me at matthew.scholefield@mycroft.ai. Also, we haven’t pushed a new model in the past week, but it’s possible it somehow hadn’t grabbed the latest one until now. Either way, I’m glad to hear it’s working better, but, of course, we still have a long way to go.

2 Likes