Speed of reacting to the wake word

Hi

I use Mycroft on a Pi 3.
When I call the wake word, I can see in the CLI that it is recognized, but it takes 1.3 seconds until I hear the sound over my 3.5mm headphones telling me that Mycroft is listening. In some videos I can see that it can react faster.
Can someone tell me if the Pi 3 is the bottleneck, or what I can improve?

Andy

If you run the command:

pactl list sinks

What latency does it report for your audio device?
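For example, something like this should print just the latency lines from that output:

pactl list sinks | grep -i latency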


0.78 seconds

State: IDLE
        Name: alsa_output.platform-soc_audio.analog-mono
        Description: Built-in Audio Analog Mono
        Driver: module-alsa-card.c
        Sample Specification: s16le 1ch 48000Hz
        Channel Map: mono
        Owner Module: 7
        Mute: no
        Volume: mono: 51773 /  79% / -6.14 dB
                balance 0.00
        Base Volume: 56210 /  86% / -4.00 dB
        Monitor Source: alsa_output.platform-soc_audio.analog-mono.monitor
        Latency: 782805 usec, configured 1365333 usec
        Flags: HARDWARE HW_MUTE_CTRL HW_VOLUME_CTRL DECIBEL_VOLUME LATENCY
        Properties:
                alsa.resolution_bits = "16"
                device.api = "alsa"
                device.class = "sound"
                alsa.class = "generic"
                alsa.subclass = "generic-mix"
                alsa.name = "bcm2835 ALSA"
                alsa.id = "bcm2835 ALSA"
                alsa.subdevice = "0"
                alsa.subdevice_name = "subdevice #0"
                alsa.device = "0"
                alsa.card = "0"
                alsa.card_name = "bcm2835 ALSA"
                alsa.long_card_name = "bcm2835 ALSA"
                alsa.driver_name = "snd_bcm2835"
                device.bus_path = "platform-soc:audio"
                sysfs.path = "/devices/platform/soc/soc:audio/sound/card0"
                device.form_factor = "internal"
                device.string = "hw:0"
                device.buffering.buffer_size = "131072"
                device.buffering.fragment_size = "131072"
                device.access_mode = "mmap+timer"
                device.profile.name = "analog-mono"
                device.profile.description = "Analog Mono"
                device.description = "Built-in Audio Analog Mono"
                alsa.mixer_name = "Broadcom Mixer"
                module-udev-detect.discovered = "1"
                device.icon_name = "audio-card"
        Ports:
                analog-output: Analog Output (priority: 9900)
        Active Port: analog-output
        Formats:
                pcm

Removing the time-based scheduler might bring it down.

Edit your /etc/pulse/default.pa and find:

load-module module-udev-detect

Replace it with this:

load-module module-udev-detect tsched=0

You might need a reboot to get these settings applied easily. That should bring the latency down.
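If you would rather not reboot, restarting PulseAudio for your user should also pick up the change (assuming the usual per-user autospawn setup, not system mode):

pulseaudio -k
pactl list sinks | grep -i latency

The first command kills the running daemon so it respawns with the new config; the second lets you check whether the latency actually went down.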

There is more tweaking you can do, but it is a bit of trial and error.
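As a starting point (just a sketch; the values below are examples, and the right ones depend on your hardware), the fragment settings in /etc/pulse/daemon.conf are worth experimenting with once tsched is off:

; /etc/pulse/daemon.conf -- example values only
default-fragments = 2
default-fragment-size-msec = 10

Smaller fragments generally mean lower latency, but go too small and you get crackling or dropouts, so change one value at a time and re-check with pactl list sinks.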


Woohoo, nice => 0.098s
Latency: 98142 usec, configured 99954 usec

Now it is much snappier. The answers still take some time, but it feels much better now.

The answers need to come from an external server, so there is nothing you can do there.

Yeah, exactly. Maybe in the future Mycroft can already start talking while the info from the internet is being delivered, so there is not that “huge” waiting time.

For the Mark-2 they make use of the streaming TTS of Google. You could do that yourself as well, but it requires an API key.

Haven’t looked into that yet myself, perhaps @forslund or @gez-mycroft can help you out with that.

thx I’m currently installing

mycroft-pip install google-cloud-speech

but this takes ages (still installing)… I already have a working API key since I use it with my OpenHAB system.

By the way, is it possible to send the text recognized by Mycroft to a URL? I'm thinking of this scenario: Mycroft passes the recognized text to the REST URL of OpenHAB, which then decides via a “rule” whether or not it should play the text via TTS (of OpenHAB) to the PulseAudio speakers.
(I know there is an OpenHAB skill, but I would like to have the text recognized by Mycroft available as text in an OpenHAB rule.)
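Something like this little script is what I have in mind (a rough, untested sketch; the item name and the OpenHAB URL are placeholders, and it assumes the mycroft-messagebus-client package is installed):

#!/usr/bin/env python3
# Rough sketch: forward every utterance Mycroft recognizes to an OpenHAB item,
# so an OpenHAB rule can decide what to do with it.
import requests
from mycroft_bus_client import MessageBusClient

# Placeholder: adjust host/port and item name to your OpenHAB setup
OPENHAB_ITEM_URL = "http://openhab.local:8080/rest/items/MycroftUtterance"

def forward_utterance(message):
    # 'utterances' is a list of possible transcriptions; take the first one
    text = message.data.get("utterances", [""])[0]
    if text:
        # Posting plain text to an item's REST endpoint sends it as a command
        requests.post(OPENHAB_ITEM_URL, data=text,
                      headers={"Content-Type": "text/plain"})

client = MessageBusClient()
client.on("recognizer_loop:utterance", forward_utterance)
client.run_forever()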

Euhm, I see I (once again) mixed up the STT and TTS stuff…

Please double check and see if it is still what you want :wink:

This is a great tip on the time-based scheduler. Will add it to the docs.

Yeah, the Mark II prototypes are using the Google Cloud Streaming STT. There is also built-in support for DeepSpeech Streaming STT, and the new IBM Watson service provides Streaming STT, but Mycroft Core hasn't yet been updated to support it.
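If you want to try the Google one on your own device, the mycroft.conf block should look roughly like this (key names from memory, so double-check against stt.py in mycroft-core; the credential is your own service account JSON):

"stt": {
  "module": "google_cloud_streaming",
  "google_cloud_streaming": {
    "credential": {
      "json": { ...your service account JSON... }
    }
  }
}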

Yes, as the RPi does not have a hardware clock, time-based scheduling is really not optimal.

I know nothing about it; is this something we should just set to 0 by default?

Depends a bit; it could be, now that we use PulseAudio by default.

But it perhaps needs some testing from different users. I have been using it for a while now and have not run into problems, but with high-bitrate music playback this might not be beneficial.

Can you help me with how to change that value so it equals 0 on my Raspberry Pi?

In /etc/pulse/default.pa add tsched=0 to the line load-module module-udev-detect so it looks like

load-module module-udev-detect tsched=0

When I did this, it replied faster, but the first word of the response seems to be cropped. Somebody wrote that 0.098s is giving great results, but I don't know how to change its value from 0 to 0.098.

Have you set some parameter in the /etc/pulse/daemon.conf file?

For every answer Mycroft generates a new mp3 that is then played to the PulseAudio server.
Now, for example, mplayer needs around 2 seconds to open the tunnel to the PulseAudio server and then play the mp3. How can we improve this?
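One thing I noticed (key name from memory, please verify against the default mycroft.conf): the command used for mp3 playback seems to be configurable, so a lighter player than mplayer might already help a bit:

"play_mp3_cmdline": "mpg123 %1"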

One way would be to have a stream open the whole time Mycroft is running, with an inaudible audio signal. Then once there is a new answer to play through the speakers, the connection is already up and running and the new “mp3” can be injected into the same stream…
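A related trick at the PulseAudio level that might get you most of the way there (untested on a Picroft image): stop the sink from being suspended when idle, so it does not have to be reopened for every answer. In /etc/pulse/default.pa, comment out the suspend module (or give it a long timeout):

### Keep the sink open instead of suspending it when idle
# load-module module-suspend-on-idle

Then restart PulseAudio as above and see if the 2-second startup delay shrinks.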