Microphone freezes after wake word when playing music

legsak1mbo · May 18, 2019, 8:06am

I’ve built my first Mycroft/Picroft system as a project with my kids and it’s working pretty well. There were a few teething problems particularly around Pulseaudio but I’m working through them.

We’ve set up the Spotify skill and that works OK (it doesn’t find anything by AC/DC but I haven’t gotten round to looking at that yet). One specific issue which has come up is that the microphone freezes after hearing the wake word whilst playing music.

The basic order of commands is:-

“Hey Mycroft”
(Accepts wake word and beeps)
“Play Wonderwall”
“Now playing Wonderwall by Oasis”
(Song starts to play)
“Hey Mycroft”
(Accepts wake word but then mic input freezes, cli is still responsive)

Our intention at this point is to pause the music. The cli client shows that the wake word has been accepted but the microphone output freezes and we have to enter the “pause” command on the keyboard. At this point the music stops and the microphone comes back to life.

We’re using a Respeaker 6-mic circular array kit for input and a UGREEN USB to Audio Adapter for output which seems to work really well. Voice detection is good, even from the next room and the output sounds great. I’ve read a few posts on here about echo cancellation and I guess that the Respeaker must be doing a pretty good job of this because it’s able to detect the wake word during playback. So it seems as though Mycroft is able to listen for the wake word whilst music is playing but not record longer commands.

As a side note, I’ve also tried enabling ducking on the Spotify skill but this doesn’t seem to work. When the wake word is heard the mic hangs and the music carries on playing until we enter the pause command.

As I said I don’t think that it’s an insurmountable issue because Mycroft DOES detect the wake word during playback. I wondered if anybody had any suggestions before I take a deep-dive into the code. Thanks in advance.

gez-mycroft · May 20, 2019, 1:36pm

Hey there, it’s great to hear you’ve had some good results with the RS mic array. They conduct their own echo cancellation, beam forming etc so are really made for this use case.

It would be good to see what occurs in the logs when this mic freezing happens. Particularly voice.log, audio.log and skills.log. Though there might also be something of interest in the others.
The logs are located at /var/log/mycroft/

You can also ask Mycroft to “create a support ticket” and it will upload them all to a free online service and email you a link. If you can share that (or just the relevant parts) here, it will help us track down what’s going on. You can also email them to support@mycroft.ai if you’d prefer not to post it publicly.

I know there are a few other community members using the same mic array so someone else might chip in with an answer too.

legsak1mbo · May 20, 2019, 4:07pm

Thanks for coming back to me.

I’ve found that if you pause Spotify from the cli - or simply wait for the song to finish - the wake word beep plays and then things are OK. So it seems as if we’re just stuck in a queue waiting for the wake word sound to play. I guess that this may be a limitation of the USB soundcard that I’m using here in which case I need some way to pause the audio BEFORE the wake word beep plays (as I mentioned in my last message, the Spotify Skill’s ducking doesn’t seem to work here).

Here are the relevant extracts from the logs:-

pi@picroft:~ $ tail /var/log/mycroft/audio.log 
16:56:24.615 - mycroft.audio.speech:mute_and_speak:128 - INFO - Speak: Just a second
16:56:29.825 - mycroft.audio.speech:mute_and_speak:128 - INFO - Speak: Listening to Wonderwall by Oasis.

pi@picroft:~ $ tail /var/log/mycroft/voice.log 
16:56:18.850 - __main__:handle_record_begin:36 - INFO - Begin Recording...
16:56:22.025 - __main__:handle_record_end:41 - INFO - End Recording...
16:56:22.029 - __main__:handle_wakeword:57 - INFO - Wakeword Detected: hey mycroft
16:56:23.685 - __main__:handle_utterance:62 - INFO - Utterance: ['play wonderwall by oasis']
16:56:44.054 - __main__:handle_record_begin:36 - INFO - Begin Recording...
16:56:57.434 - __main__:handle_record_end:41 - INFO - End Recording...
16:56:57.447 - __main__:handle_wakeword:57 - INFO - Wakeword Detected: hey mycroft
16:56:59.336 - mycroft.client.speech.listener:transcribe:183 - INFO - no words were transcribed
16:56:59.344 - mycroft.client.speech.listener:transcribe:186 - ERROR - Speech Recognition could not understand audio

pi@picroft:~ $ tail /var/log/mycroft/skills.log 
16:57:02.494 - SpotifySkill - INFO - spotify_play: 3a30638a0b725fee3c66198426ed1814fb52698f
16:57:04.788 - SpotifySkill - INFO - Pausing Spotify...

I’ll keep playing - let me know if you want me to try anything else.
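One thing that can be read off the voice.log extract above is how long each recording stayed open. Here is a minimal sketch in Python, assuming every line starts with an HH:MM:SS.mmm timestamp followed by " - " as in the extract. The helper name recording_durations is made up for illustration and is not part of Mycroft, and it does not handle a recording that spans midnight.

```python
from datetime import datetime

def recording_durations(lines):
    """Pair each 'Begin Recording' with the next 'End Recording' and
    return the elapsed seconds for each pair."""
    durations = []
    begin = None
    for line in lines:
        # The timestamp is everything before the first " - " separator.
        stamp = line.split(" - ", 1)[0].strip()
        if "Begin Recording" in line:
            begin = datetime.strptime(stamp, "%H:%M:%S.%f")
        elif "End Recording" in line and begin is not None:
            end = datetime.strptime(stamp, "%H:%M:%S.%f")
            durations.append(round((end - begin).total_seconds(), 3))
            begin = None
    return durations

# The four begin/end lines copied from the voice.log extract.
voice_log = [
    "16:56:18.850 - __main__:handle_record_begin:36 - INFO - Begin Recording...",
    "16:56:22.025 - __main__:handle_record_end:41 - INFO - End Recording...",
    "16:56:44.054 - __main__:handle_record_begin:36 - INFO - Begin Recording...",
    "16:56:57.434 - __main__:handle_record_end:41 - INFO - End Recording...",
]
print(recording_durations(voice_log))  # prints [3.175, 13.38]
```

The second recording, made while the music was playing, stayed open for about 13 seconds against about 3 seconds for the first one, which fits the feeling of being stuck waiting for something.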

gez-mycroft · May 21, 2019, 3:44am

Yeah nothing looks out of place there.

Mycroft does mute the mic (or input audio source) whilst it plays the listening sound, then unmutes the mic again. This prevents the end of the beep sound from being included in the audio for transcription. As the beep sound is also generally emitted from quite close to the mic, the automatic leveling would make the start of the utterance very soft.

But potentially it is not unmuting the mic after the beep sound, and is for some reason waiting until the song audio stream stops.

Probably the easiest way to test if this is the case (assuming you’re comfortable editing code) is to comment out the mute and unmute lines in mycroft-core/mycroft/client/speech/mic.py.

legsak1mbo · May 21, 2019, 8:37am

Something else I’ve found - when it tries to play the beep this message appears in the system log:-

May 20 18:08:06 picroft pulseaudio[1407]: [pulseaudio] main.c: User-configured server at 127.0.0.1, refusing to start/autospawn.

I’m guessing it’s got something to do with the ALSA/Pulse settings here. I remember configuring a server on the loopback whilst I was trying to get the mic working - but I can’t remember where I did it now!

I’ll keep investigating…

legsak1mbo · May 21, 2019, 9:27am

OK. So I sorted the start/autospawn message by adding…

load-module module-native-protocol-tcp auth-ip-acl=127.0.0.1 auth-anonymous=1

…to default.pa.

It doesn’t look like the issue is with muting the mic as that didn’t make any difference. However commenting out the play_wav line there did improve things. Mycroft then listened for the “pause playback” command and did as it was asked. Unfortunately if it failed to understand something (“pools playback” for example) it was then frozen again because it was waiting to tell me that it hadn’t understood. Once I manually paused the playback it would then proceed to tell me that it hadn’t understood.

So the crux of the issue seems to be that when it’s playing music from Spotify it can’t play any other sounds (including the wake sound) so any sounds that it does try to play get queued up and come out in a big jumble once Spotify is paused.
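A quick way to confirm this outside Mycroft would be to run a player command by hand while Spotify is playing and see whether it returns promptly or blocks. Below is a rough sketch in Python, assuming a command-line player such as aplay or paplay is installed. The helper name finishes_within and the wav path in the comment are placeholders, not anything taken from Mycroft.

```python
import subprocess
import sys

def finishes_within(cmd, timeout_s):
    """Run cmd and report whether it exited before timeout_s seconds.

    Returns False if the command was still running at the timeout
    (subprocess.run kills it in that case).
    """
    try:
        subprocess.run(
            cmd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return True
    except subprocess.TimeoutExpired:
        return False

# Stand-in commands so the sketch runs anywhere; on the Pi you would
# pass something like ["aplay", "/path/to/some/short.wav"] instead
# (placeholder path) while music is playing.
quick = [sys.executable, "-c", "pass"]
slow = [sys.executable, "-c", "import time; time.sleep(5)"]
print(finishes_within(quick, 10))
print(finishes_within(slow, 0.5))
```

If a short wav that normally plays in under a second times out only while Spotify is playing, that would point at the output device being held exclusively rather than at the mic muting.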

legsak1mbo · May 21, 2019, 9:44am

Well, some partial success. I edited the Spotify skill as follows:-

#if (self.spotify.is_playing() and self.is_player_remote and
#        self.settings.get('use_ducking', False)):
if (self.spotify.is_playing()):

…and it now behaves better. It’s interesting because use_ducking is definitely True so I assume that is_player_remote is evaluating to False. Unfortunately, unless your next command is to pause Spotify, the music then resumes immediately, so asking Mycroft the weather whilst Spotify is playing will cause the spoken weather report to be queued until the song finishes playing.
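To see which part of the original three-part condition is blocking the ducking, rather than guessing, each term could be logged on its own. This is a small sketch in Python. The helper falsy_terms is hypothetical, and the values passed in below are only the ones assumed in this thread, not read from a running skill.

```python
def falsy_terms(**terms):
    """Return the names of the keyword arguments whose value is falsy,
    in the order they were passed."""
    return [name for name, value in terms.items() if not value]

# Values as assumed above: music is playing, ducking is enabled in the
# settings, and is_player_remote is the suspected culprit.
blocking = falsy_terms(
    is_playing=True,
    is_player_remote=False,
    use_ducking=True,
)
print(blocking)  # prints ['is_player_remote']
```

Inside the skill, the same call could be fed the real self.spotify.is_playing(), self.is_player_remote and self.settings.get('use_ducking', False) expressions from the commented-out condition and the result written to skills.log.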

Is there some sort of central ducking/queuing system which pauses existing audio whilst another skill activates or would this need to be controlled by each skill individually?

legsak1mbo · May 21, 2019, 6:23pm

After further testing I suspect that the issue boils down to the ALSA/Pulse config that comes with the Respeaker. I’ve tried various combinations but I lose either the output or the mic.

For info the configs are:-

https://github.com/respeaker/seeed-voicecard/blob/master/asound_6mic.conf

# The IPC key of dmix or dsnoop plugin must be unique
# If 555555 or 666666 is used by other processes, use another one

# use samplerate to resample as speexdsp resample is broken
defaults.pcm.rate_converter "samplerate"

pcm.!default {
    type asym
    playback.pcm "dmixer"
    capture.pcm "ac108"
}


pcm.ac108 {
    type plug
    slave {
        rate 48000
        format S32_LE
        pcm "hw:seeed8micvoicec"
    }

This file has been truncated.

https://github.com/respeaker/seeed-voicecard/blob/master/pulseaudio/pulse_config_6mic/default.pa

#!/usr/bin/pulseaudio -nF
#
# This file is part of PulseAudio.
#
# PulseAudio is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# PulseAudio is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public License
# along with PulseAudio; if not, see <http://www.gnu.org/licenses/>.

# This startup script is used only if PulseAudio is started per-user
# (i.e. not in system mode)

This file has been truncated.