Use a Wi-Fi microphone

Hello, for a project I want to use an Android-based device as a portable speaker and microphone.
Unfortunately, neither the documentation nor the Android app sample found in the GitHub repo helped me. As I'm concerned about privacy, using Google STT is not an option (as the companion app does). I also don't have the computing capacity to set up my own DeepSpeech server locally.

What I'm currently able to do:
connect the phone to a server via a socket and send over the recorded byte stream
connect the phone to the messagebus and interact with Mycroft that way (rough sketches of both are below)
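For reference, this is roughly what the receiving side looks like on my laptop. It's a minimal sketch; the host, port, and chunk size are just my choices, and the phone simply pushes raw PCM over the TCP connection:

```python
import socket

HOST = "0.0.0.0"   # listen on all interfaces
PORT = 5000        # arbitrary port the Android app connects to
CHUNK = 1024       # bytes per read

def receive_audio():
    """Accept a single phone connection and yield raw PCM chunks."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((HOST, PORT))
        srv.listen(1)
        conn, addr = srv.accept()
        with conn:
            while True:
                data = conn.recv(CHUNK)
                if not data:  # phone closed the connection
                    break
                yield data
```

And for the messagebus part, something like this (I'm using the mycroft-messagebus-client package; the client bundled in mycroft-core works the same way):

```python
from mycroft_bus_client import MessageBusClient, Message

bus = MessageBusClient()   # defaults to ws://localhost:8181/core
bus.run_in_thread()

# Inject an utterance as if it had already been heard and transcribed
bus.emit(Message("recognizer_loop:utterance",
                 {"utterances": ["what time is it"], "lang": "en-us"}))
```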

What I'd like to do:
configure mycroft-core so that it uses the received byte stream instead of a directly connected microphone

Can anyone help me or give me any suggestions on how to solve this problem?

Hi there,

Welcome to the Community :slight_smile:

By default Mycroft uses Google STT, as DeepSpeech isn't quite at a comparable level just yet. We're continuing to work with Mozilla to help bring it forward and are very much looking forward to making the switch when it's ready. So even if you successfully receive the byte stream, it will still be transcribed by Google STT or another large-scale provider.

There are a few other STT providers you can choose from, such as Wit.ai or IBM; however, these are likely not much better in terms of your privacy concerns.

We also do what we can within the limitations of what's available: all STT requests are proxied through Mycroft's servers and anonymised to the degree we can. Google therefore doesn't know whether it's 10 people making 30,000 requests each or 30,000 people each making a single request, which prevents them from profiling individual users.

I'm also wondering whether you've had a look at both Android versions? There is a native implementation, and a companion app that connects to your own instance of Mycroft.

Hello,
thank you for your response.

I thought that by editing the .conf file I could adjust which STT engine Mycroft uses and change it to DeepSpeech (or mycroft-deepspeech). That's what I've done.
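For reference, this is roughly the section I changed in mycroft.conf (the uri is an assumption on my part; it just has to point at wherever the DeepSpeech server listens):

```json
{
  "stt": {
    "module": "deepspeech_server",
    "deepspeech_server": {
      "uri": "http://localhost:8080/stt"
    }
  }
}
```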

I have some issues with both of the Android apps.
The companion app uses the integrated Google STT right away, which conflicts with my privacy concerns, and I didn't see a way to redirect those requests to the Mycroft servers.
And the Android core app doesn't seem to work on my phone: I can start it, but regardless of what I do from there, nothing happens.

So far I've managed to get a byte stream to the core application running on my laptop, but I'm now wondering if there is a way to handle that stream as a microphone stream.

I hope I can share my final solution with the community as soon as I've found it :wink:

I don’t believe that the mycroft-deepspeech server is available at the moment, but I could be mistaken on that.

The best place for questions about the Android apps is the ~Android channel on Chat. They were developed by a number of Community members, and unfortunately I'm not too familiar with them.

I haven't seen a way to consume a stream as a PulseAudio input; I'd be very interested to hear if you find anything.

I finally found a solution to the problem of using a byte stream as an input.
It's probably not the best possible solution, but it works. On a Linux device, it's possible to loop audio output back in as microphone input.
The approach is described here: https://unix.stackexchange.com/questions/130774/creating-a-virtual-microphone

To get this to work, just use PyAudio to play the byte stream out through the virtual device; the microphone input as it's currently implemented will then automatically pick it up and process the audio. A rough sketch is below.
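This assumes the null-sink setup from the linked Stack Exchange answer, and that the stream format matches what the phone records (my values below, 16 kHz mono 16-bit, are assumptions):

```python
# One-time PulseAudio setup (from the linked Stack Exchange answer):
#   pactl load-module module-null-sink sink_name=virtmic \
#       sink_properties=device.description=Virtual_Microphone_Sink
#   pactl load-module module-remap-source master=virtmic.monitor \
#       source_name=virtmic source_properties=device.description=Virtual_Microphone
# Then make virtmic the default sink/source so playback lands in it
# and Mycroft reads from it.

import pyaudio

RATE = 16000              # assumed: must match the phone's recording format
CHANNELS = 1
FORMAT = pyaudio.paInt16

pa = pyaudio.PyAudio()
stream = pa.open(format=FORMAT, channels=CHANNELS, rate=RATE, output=True)

# receive_audio() is the socket generator from my first post
for chunk in receive_audio():
    stream.write(chunk)   # playback goes to the null sink, i.e. the "microphone"

stream.stop_stream()
stream.close()
pa.terminate()
```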


The DeepSpeech backends are available, but you do have to set them up yourself and follow the configs from the relevant PRs to get them working. You could run one on a cloud VM if you don't have local resources.
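If it helps, one way to stand a server up is the community deepspeech-server package, something like this (going from memory, so double-check the exact flags and config keys against the PRs):

```bash
pip install deepspeech-server
deepspeech-server --config config.json
```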

Edited to add: you can now run DeepSpeech on pretty low-end hardware; it just adds a bit of latency. A TFLite version for Pis is coming soon as well, though I'd run it separately from Mycroft. A relatively recent (2012 or newer) Core i3/i5/i7 or better CPU could handle it pretty easily.