precise_runner failure on PinePhone (aarch64)

I hope this saves someone some time, sometime :) When using precise_runner on the PinePhone (aarch64), the process crashes when reading from the microphone because a hard-coded sampling rate is used when initializing the stream.

2020-12-22 17:02:49 :: kalliope-0.7.0 :: Say something!
Exception in thread Thread-3:
Traceback (most recent call last):
  File "/usr/lib/python3.9/threading.py", line 954, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.9/threading.py", line 892, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.8/site-packages/precise_runner/runner.py", line 231, in _handle_predictions
    chunk = self.stream.read(self.chunk_size)
  File "/usr/lib/python3.8/site-packages/precise_runner/runner.py", line 186, in <lambda>
    stream.read = lambda x: pyaudio.Stream.read(stream, x // 2, False)
  File "/usr/lib/python3.9/site-packages/pyaudio.py", line 608, in read
    return pa.read_stream(self._stream, num_frames, exception_on_overflow)
OSError: [Errno -9999] Unanticipated host error

Another Kalliope process returns "Invalid number of frames" from the exact same call. The problem also occurs when using the speech_recognition module directly, and even when not using Python at all: the ALSA utility arecord with default values has the same issue.

In precise_runner/runner.py:

class PreciseRunner:
    def start(self):
        """Start listening from stream"""
        if self.stream is None:
            from pyaudio import PyAudio, paInt16
            self.pa = PyAudio()
            self.stream = self.pa.open(
                16000, 1, paInt16, True, frames_per_buffer=self.chunk_size
            )

Changing the 16000 to 12000 or below, or to 24000 or above, prevents the crash.

HTH
LF

You should file this as an issue on GitHub.

I think the 16000 is hard-coded because the model was trained at that rate. Wouldn't it be better to resample the input to that sample rate within PulseAudio?

Thank you, I will file it on GitHub.

Having tested further: changing the sample rate prevents the crashes, but it also prevents recognition of the wakeword! Is this because the model was trained at 16 kHz, so audio captured at a different sample rate no longer matches what it learned? Are there any models built with a different sample rate?
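A likely explanation, sketched as arithmetic: if the stream is opened at another rate but the samples are still interpreted as 16 kHz audio, every frequency in the recording is scaled by the ratio of the two rates, so the wakeword no longer resembles the training data. The helper name `apparent_frequency` is hypothetical, and the idea that the model always treats its input as 16 kHz is an assumption based on the hard-coded value in runner.py.

```python
def apparent_frequency(true_hz, capture_rate, assumed_rate=16000):
    """Frequency a tone appears to have when samples captured at
    capture_rate are interpreted as if recorded at assumed_rate."""
    return true_hz * assumed_rate / capture_rate


# A 300 Hz component of a voice, captured at 24000 Hz but read as 16000 Hz,
# appears lower in pitch:
print(apparent_frequency(300, 24000))  # 200.0
# Captured at 12000 Hz instead, the same component appears higher:
print(apparent_frequency(300, 12000))  # 400.0
```

Under that assumption, 24000 Hz capture shifts everything down to two thirds of its real frequency and stretches it in time by 1.5x, which would be enough to defeat a model trained on unshifted speech.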

I am using the “Sheila” wakeword with > 95% accuracy on my test machines.

Resampling within PulseAudio, earlier in the process, makes sense. Thank you, j1nx. You wouldn't happen to have a one-liner available, would you?

Off to do the research on how to do it myself.
Thank you both
LF

No, sorry, nothing readily at hand. I can do some googling for you if you get stuck, though.

Can you tell what format the clips are recorded at?

No sound clips. This is straight from the microphone. The system reports the source as 48 kHz.
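If resampling in PulseAudio turns out to be awkward, the same thing could be done in Python between the microphone and the recognizer. What follows is only a rough sketch, assuming mono 16-bit PCM (matching the `1, paInt16` arguments in runner.py) captured at 48 kHz as reported above. The function name `downsample_by_3` is made up for illustration, and averaging three samples is a crude stand-in for a proper anti-aliasing filter, not what Precise or PulseAudio actually do.

```python
from array import array


def downsample_by_3(pcm_bytes):
    """Convert mono 16-bit PCM from 48 kHz to 16 kHz.

    Averages each group of three samples (a crude low-pass filter) and
    emits one sample per group. Trailing samples that do not fill a
    whole group are dropped.
    """
    # Drop a dangling odd byte so the buffer holds whole 16-bit samples.
    usable = len(pcm_bytes) - (len(pcm_bytes) % 2)
    samples = array("h")  # signed 16-bit, native byte order
    samples.frombytes(pcm_bytes[:usable])

    out = array("h")
    for i in range(0, len(samples) - 2, 3):
        total = samples[i] + samples[i + 1] + samples[i + 2]
        out.append(total // 3)  # mean of three int16 values fits in int16
    return out.tobytes()


# Six samples at 48 kHz become two samples at 16 kHz.
chunk_48k = array("h", [3, 6, 9, -3, -6, -9]).tobytes()
print(array("h", downsample_by_3(chunk_48k)).tolist())  # [6, -6]
```

Because 48000 / 16000 is exactly 3, no interpolation is needed. For other source rates a real resampler would be the safer choice.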

Incidentally, if you get it working on the PinePhone without eating all the resources, I’d be very interested in that. I’m working with PocketSphinx because I don’t have the time or the expertise to worry about Precise, and I’m more focused on functionality.

So after OS and Python updates on January 8th and 9th, everything magically works: arecord, Python speech_recognition, and precise_runner.

YMMV
LF

I have it all working now, and precise_runner performance is not an issue at this time. Battery consumption requires that it be suspended after use, so I am currently using a timeout of 40 seconds. My biggest performance problem appears to be recognition of the order after waking up. Now that this speech problem is fixed, I can start looking at that.
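For anyone wanting something similar, one way such an idle timeout could be wired up is sketched below. This is not the code in use here: the class name `IdleTimeout` and the `suspend_listening` callback in the comment are hypothetical, and only the 40 second figure comes from the setup described above.

```python
import threading


class IdleTimeout:
    """Call on_timeout once if touch() is not called again within `seconds`."""

    def __init__(self, seconds, on_timeout):
        self.seconds = seconds
        self.on_timeout = on_timeout
        self._timer = None
        self._lock = threading.Lock()

    def touch(self):
        """(Re)start the countdown, e.g. after each wake word or order."""
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self.seconds, self.on_timeout)
            self._timer.daemon = True  # do not keep the process alive
            self._timer.start()

    def cancel(self):
        """Stop the countdown without firing the callback."""
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
                self._timer = None


# Hypothetical usage: suspend listening after 40 seconds without activity.
# timeout = IdleTimeout(40, suspend_listening)
# timeout.touch()  # call again whenever the user interacts
```

Restarting the timer on every interaction keeps the listener awake during a conversation and lets it go quiet only after a genuine pause.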

We might be duplicating work. See HiveMind (though you might be better off hitting us up in Mycroft’s chat server at ~hivemind)

It’s a wake word and a thin client, connected to any Mycroft device. I’m using a Mycroft instance on my homeserver.