Microphone freezes after wake word when playing music

I’ve built my first Mycroft/Picroft system as a project with my kids and it’s working pretty well. There were a few teething problems, particularly around PulseAudio, but I’m working through them.

We’ve set up the Spotify skill and that works OK (it doesn’t find anything by AC/DC but I haven’t gotten round to looking at that yet). One specific issue which has come up is that the microphone freezes after hearing the wake word whilst music is playing.

The basic order of commands is:

“Hey Mycroft”
(Accepts wake word and beeps)
“Play Wonderwall”
“Now playing Wonderwall by Oasis”
(Song starts to play)
“Hey Mycroft”
(Accepts wake word but then mic input freezes, cli is still responsive)

Our intention at this point is to pause the music. The cli client shows that the wake word has been accepted but the microphone output freezes and we have to enter the “pause” command on the keyboard. At this point the music stops and the microphone comes back to life.

We’re using a Respeaker 6-mic circular array kit for input and a UGREEN USB to Audio Adapter for output which seems to work really well. Voice detection is good, even from the next room and the output sounds great. I’ve read a few posts on here about echo cancellation and I guess that the Respeaker must be doing a pretty good job of this because it’s able to detect the wake word during playback. So it seems as though Mycroft is able to listen for the wake word whilst music is playing but not record longer commands.

As a side note, I’ve also tried enabling ducking on the Spotify skill but this doesn’t seem to work. When the wake word is heard the mic hangs and the music carries on playing until we enter the pause command.

As I said I don’t think that it’s an insurmountable issue because Mycroft DOES detect the wake word during playback. I wondered if anybody had any suggestions before I take a deep-dive into the code. Thanks in advance.

Hey there, it’s great to hear you’ve had some good results with the ReSpeaker mic array. They do their own echo cancellation, beamforming, etc., so they are really made for this use case.

It would be good to see what occurs in the logs when this mic freezing happens, particularly voice.log, audio.log and skills.log, though there might also be something of interest in the others.
The logs are located at /var/log/mycroft/.
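
If it’s easier than tailing each file by hand, something like the following would grab the last few lines of the three logs in one go. It’s only a rough sketch: the helper names (tail, collect) are made up for illustration and aren’t part of Mycroft, and it assumes the logs are readable by the user running it.

```python
from collections import deque
from pathlib import Path

LOG_DIR = Path("/var/log/mycroft")  # log location mentioned above
LOG_NAMES = ("voice.log", "audio.log", "skills.log")


def tail(path, n=20):
    """Return the last n lines of a text file (empty list if it is missing)."""
    path = Path(path)
    if not path.is_file():
        return []
    with path.open(errors="replace") as handle:
        # deque with maxlen keeps only the final n lines while reading
        return [line.rstrip("\n") for line in deque(handle, maxlen=n)]


def collect(log_dir=LOG_DIR, names=LOG_NAMES, n=20):
    """Map each log file name to its last n lines."""
    return {name: tail(Path(log_dir) / name, n) for name in names}


if __name__ == "__main__":
    for name, lines in collect().items():
        print(f"==> {name} <==")
        print("\n".join(lines))
```

Running it straight after reproducing the freeze should keep the relevant lines near the end of each extract.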

You can also ask Mycroft to “create a support ticket” and it will upload them all to a free online service and email you a link. If you can share that (or just the relevant parts) here, it will help us track down what’s going on. You can also email them to support@mycroft.ai if you’d prefer not to post them publicly.

I know there are a few other community members using the same mic array so someone else might chip in with an answer too.

Thanks for coming back to me.

I’ve found that if you pause Spotify from the CLI, or simply wait for the song to finish, the wake word beep plays and then things are OK. So it seems as if we’re just stuck in a queue waiting for the wake word sound to play. I guess that this may be a limitation of the USB soundcard that I’m using here, in which case I need some way to pause the audio BEFORE the wake word beep plays (as I mentioned in my last message, the Spotify skill’s ducking doesn’t seem to work here).

Here are the relevant extracts from the logs:-

pi@picroft:~ $ tail /var/log/mycroft/audio.log 
16:56:24.615 - mycroft.audio.speech:mute_and_speak:128 - INFO - Speak: Just a second
16:56:29.825 - mycroft.audio.speech:mute_and_speak:128 - INFO - Speak: Listening to Wonderwall by Oasis.

pi@picroft:~ $ tail /var/log/mycroft/voice.log 
16:56:18.850 - __main__:handle_record_begin:36 - INFO - Begin Recording...
16:56:22.025 - __main__:handle_record_end:41 - INFO - End Recording...
16:56:22.029 - __main__:handle_wakeword:57 - INFO - Wakeword Detected: hey mycroft
16:56:23.685 - __main__:handle_utterance:62 - INFO - Utterance: ['play wonderwall by oasis']
16:56:44.054 - __main__:handle_record_begin:36 - INFO - Begin Recording...
16:56:57.434 - __main__:handle_record_end:41 - INFO - End Recording...
16:56:57.447 - __main__:handle_wakeword:57 - INFO - Wakeword Detected: hey mycroft
16:56:59.336 - mycroft.client.speech.listener:transcribe:183 - INFO - no words were transcribed
16:56:59.344 - mycroft.client.speech.listener:transcribe:186 - ERROR - Speech Recognition could not understand audio

pi@picroft:~ $ tail /var/log/mycroft/skills.log 
16:57:02.494 - SpotifySkill - INFO - spotify_play: 3a30638a0b725fee3c66198426ed1814fb52698f
16:57:04.788 - SpotifySkill - INFO - Pausing Spotify...
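
Going back to the voice.log extract, the gap between “Begin Recording” and “End Recording” is much longer the second time round. A quick sketch to measure it, using only the four recording lines pasted above (the function name recording_durations is just something I made up for this):

```python
import re
from datetime import datetime

# The Begin/End Recording lines copied from the voice.log extract above
VOICE_LOG = """\
16:56:18.850 - __main__:handle_record_begin:36 - INFO - Begin Recording...
16:56:22.025 - __main__:handle_record_end:41 - INFO - End Recording...
16:56:44.054 - __main__:handle_record_begin:36 - INFO - Begin Recording...
16:56:57.434 - __main__:handle_record_end:41 - INFO - End Recording...
"""

STAMP = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3}) - .* - (Begin|End) Recording")


def recording_durations(text):
    """Return the seconds between each Begin Recording / End Recording pair."""
    durations, started = [], None
    for line in text.splitlines():
        match = STAMP.match(line)
        if not match:
            continue
        when = datetime.strptime(match.group(1), "%H:%M:%S.%f")
        if match.group(2) == "Begin":
            started = when
        elif started is not None:
            durations.append(round((when - started).total_seconds(), 3))
            started = None
    return durations


print(recording_durations(VOICE_LOG))  # prints [3.175, 13.38]
```

So the first recording took about 3 seconds and the second about 13, which would fit with the listener being held up while music is playing, although the logs alone don’t say why.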

I’ll keep playing - let me know if you want me to try anything else.

Yeah nothing looks out of place there.

Mycroft does mute the mic (or input audio source) whilst it plays the listening sound, then unmutes the mic again. This prevents the end of the beep sound from being included in the audio for transcription. Also, as the beep is generally emitted quite close to the mic, the automatic leveling would otherwise make the start of the utterance very soft.

But potentially it is not unmuting the mic after the beep sound, and is for some reason waiting until the song audio stream stops.

Probably the easiest way to test if this is the case (assuming you’re comfortable editing code) is to comment out the mute and unmute lines in mycroft-core/mycroft/client/speech/mic.py.

Something else I’ve found - when it tries to play the beep this message appears in the system log:-

May 20 18:08:06 picroft pulseaudio[1407]: [pulseaudio] main.c: User-configured server at 127.0.0.1, refusing to start/autospawn.

I’m guessing it’s got something to do with the ALSA/Pulse settings here. I remember configuring a server on the loopback whilst I was trying to get the mic working - but I can’t remember where I did it now!

I’ll keep investigating…

OK. So I sorted the start/autospawn message by adding…

load-module module-native-protocol-tcp auth-ip-acl=127.0.0.1 auth-anonymous=1

…to default.pa.

It doesn’t look like the issue is with muting the mic, as commenting out the mute and unmute lines didn’t make any difference. However, commenting out the play_wav line in the same file did improve things. Mycroft then listened for the “pause playback” command and did as it was asked. Unfortunately, if it failed to understand something (“pools playback” for example), it froze again because it was waiting to tell me that it hadn’t understood. Once I manually paused the playback, it would then proceed to tell me that it hadn’t understood.

So the crux of the issue seems to be that while it’s playing music from Spotify it can’t play any other sounds (including the wake sound), so any sounds that it does try to play get queued up and come out in a big jumble once Spotify is paused.
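
To make that concrete, here is a toy model of what I think is going on. It is purely my mental picture of the symptom, not how Mycroft, PulseAudio or ALSA actually work internally, and the class ExclusiveOutput is hypothetical:

```python
from collections import deque


class ExclusiveOutput:
    """Toy model of an output that only one stream can hold at a time."""

    def __init__(self):
        self.holder = None      # name of the long-running stream, if any
        self.pending = deque()  # sounds waiting for the output to be free
        self.played = []        # order in which sounds actually came out

    def play(self, name, long_running=False):
        """Play immediately if the output is free, otherwise queue the sound."""
        if self.holder is not None:
            self.pending.append(name)
            return False
        self.played.append(name)
        if long_running:
            self.holder = name
        return True

    def stop(self):
        """Release the output and flush everything that was queued."""
        self.holder = None
        while self.pending:
            self.played.append(self.pending.popleft())


out = ExclusiveOutput()
out.play("spotify", long_running=True)
out.play("wake word beep")         # queued behind the music
out.play("sorry, I didn't catch")  # queued as well
out.stop()                         # pausing Spotify lets both out at once
print(out.played)
```

In this picture nothing is lost, it just all arrives late, which matches the jumble I hear once Spotify is paused.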

Well, some partial success. I edited the Spotify skill as follows:-

#if (self.spotify.is_playing() and self.is_player_remote and
#        self.settings.get('use_ducking', False)):
if (self.spotify.is_playing()):

…and it now behaves better. It’s interesting because use_ducking is definitely True, so I assume that is_player_remote is evaluating to False. Unfortunately, unless your next command is to pause Spotify, playback resumes immediately, so asking Mycroft for the weather whilst Spotify is playing will cause the weather response to be queued until the song finishes playing.
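
For anyone following along, here is the logic of the two conditions with the three values pulled out as plain booleans. The function names are mine and don’t exist in the Spotify skill; it’s only meant to show why a False is_player_remote blocks ducking regardless of the use_ducking setting.

```python
def original_check(is_playing, is_player_remote, use_ducking):
    """The condition as written in the commented-out lines above."""
    return is_playing and is_player_remote and use_ducking


def relaxed_check(is_playing):
    """The edited condition: only asks whether Spotify is playing."""
    return is_playing


# Spotify playing, ducking enabled, but the player is not seen as remote
print(original_check(True, False, True))  # prints False
print(relaxed_check(True))                # prints True
```

So with the original line, all three have to be True before the skill pauses for the wake word.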

Is there some sort of central ducking/queuing system which pauses existing audio whilst another skill activates or would this need to be controlled by each skill individually?

After further testing I suspect that the issue boils down to the ALSA/PulseAudio config that comes with the ReSpeaker. I’ve tried various combinations but lose either the output or the mic.

For info the configs are:-