Easiest way to use Mycroft completely offline

And as seen in chat, https://github.com/MycroftAI/selene-backend is now out.

I will add that the biggest hindrance is still the lack of a strong speech recognition technology that can be run locally. While many tout CMU Sphinx/PocketSphinx or Kaldi as options, the experience of many is that they really aren’t adequate. Mozilla’s DeepSpeech has the brightest future, IMHO, but practically speaking it is still not quite ready for prime time for most users.

It is a bit of a chicken-and-the-egg situation, but I don’t think there is a strong case for creating an easy offline setup before the core components are ready. It is close, but I suspect it will be at least another year before we (the open source community) have all those pieces ready.


Maybe I’m oversimplifying this, with all the DeepSpeech learning talk, but unless this is newly added: I found this topic because I wanted to use Mycroft offline until I can get the security issues resolved, like the one mentioned in the install regarding port 8181. I found the setting below in the README to disable Home, which I will paste here. Of course, I needed to install the ‘local’ speech recognizer Mimic first (does this include the DeepSpeech personal backend mentioned above? I assume it’s only a static model, not dynamically learning, but that works for getting started).

## Using Mycroft Without Home

If you do not wish to use the Mycroft Home service, before starting Mycroft for the first time, create `$HOME/.mycroft/mycroft.conf` with the following contents:

```
{
  "skills": {
    "blacklisted_skills": [
      "mycroft-configuration.mycroftai",
      "mycroft-pairing.mycroftai"
    ]
  }
}
```
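
If a `mycroft.conf` already exists, the same change can be merged in rather than pasted by hand. This is only a minimal sketch, not something from the README: the `disable_home` helper is a hypothetical name, and it assumes the existing file is plain JSON with no `//` comments (Python's `json` module would reject those).

```python
import json
from pathlib import Path

# Skill names taken from the README snippet above.
BLACKLIST = ["mycroft-configuration.mycroftai", "mycroft-pairing.mycroftai"]


def disable_home(conf_path: Path) -> dict:
    """Add the Home-related skills to blacklisted_skills, keeping other settings.

    Hypothetical helper; assumes any existing file is comment-free JSON.
    """
    conf = {}
    if conf_path.exists():
        conf = json.loads(conf_path.read_text())

    skills = conf.setdefault("skills", {})
    blacklisted = skills.setdefault("blacklisted_skills", [])
    for name in BLACKLIST:
        if name not in blacklisted:  # avoid duplicates on repeated runs
            blacklisted.append(name)

    conf_path.parent.mkdir(parents=True, exist_ok=True)
    conf_path.write_text(json.dumps(conf, indent=2) + "\n")
    return conf


if __name__ == "__main__":
    disable_home(Path.home() / ".mycroft" / "mycroft.conf")
```

Running it twice leaves the file unchanged the second time, so it should be safe to re-run after editing other settings.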

Mycroft will then be unable to perform speech-to-text conversion, so you'll need to set that up as well, using one of the [STT engines Mycroft supports](https://mycroft-ai.gitbook.io/docs/using-mycroft-ai/customizations/stt-engine).

You may insert your own API keys into the configuration files listed above in <b>Configuration</b>. For example, to insert the API key for the Weather skill, create a new JSON key in the configuration file like so:

```
{
  // other configuration settings...
  //
  "WeatherSkill": {
    "api_key": "<insert your API key here>"
  }
}
```

Hey there,

Mimic is a Text-To-Speech (TTS) not Speech-To-Text (STT) engine. So you’d want to look at the other options listed in our docs. DeepSpeech is one possibility for STT engines.
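
As a rough illustration of what switching STT engines looks like, the sketch below prints an `stt` section that could be merged into `mycroft.conf`. The `stt_settings` helper is hypothetical, and the `deepspeech_server` module name, its `uri` key, and the local URL are assumptions recalled from the STT docs, so please check them against the current documentation before relying on them.

```python
import json


def stt_settings(module: str, uri: str) -> dict:
    """Build an "stt" config section that selects `module` and points it at `uri`.

    Hypothetical helper; the layout is assumed, not confirmed in this thread.
    """
    return {"stt": {"module": module, module: {"uri": uri}}}


if __name__ == "__main__":
    # Assumed values: a DeepSpeech server listening locally on port 8080.
    section = stt_settings("deepspeech_server", "http://localhost:8080/stt")
    print(json.dumps(section, indent=2))
```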

The personal backend is not related to DeepSpeech and is not an STT engine. It is a replacement for home.mycroft.ai.

Any concerns you have with port 8181 being used would still exist if you run Mycroft without communicating with Mycroft’s backend. It is the websocket that your local Mycroft instance uses to communicate with different parts of itself. Best to lock it down with a firewall as described in the other thread.
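
To see whether the port is actually reachable, a quick check along these lines can be run from a second machine on the network. It is only a sketch: `port_is_open` is a hypothetical helper, and the host passed on the command line is assumed to be the address of the machine running Mycroft.

```python
import socket
import sys


def port_is_open(host: str, port: int = 8181, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, filtered, timed out, or unreachable all count as closed.
        return False


if __name__ == "__main__":
    # Pass the Mycroft machine's address; defaults to this machine.
    host = sys.argv[1] if len(sys.argv) > 1 else "127.0.0.1"
    state = "reachable" if port_is_open(host) else "not reachable"
    print(f"Port 8181 on {host} is {state}")
```

If the firewall is doing its job, the check should report the port as reachable only when run on the Mycroft machine itself against `127.0.0.1`.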

I just found Vosk, which seems like a good tool: an STT service that runs offline with good accuracy. I just do not know how to use it in place of the standard STT engines.

vosk is not supported by mycroft, but you can install my hivemind skill

Then you can run mycroft without speech client and use the voice satellite instead (it supports vosk and deepspeech to run offline)

I'll make a longer post soon about hivemind, and maybe a guide to running offline.

Meanwhile, if you are feeling adventurous, the links above should be helpful. Apologies in advance for the lack of documentation.


I just want to add that the “port 8181” issue is only an issue if:

  • You are not running a firewall on the computer that runs Mycroft, and the attacker is inside your LAN, or
  • You are not running a firewall on your router, or
  • Your router is forwarding 8181 to the computer that runs Mycroft, or
  • The attacker is actually logged into the computer that runs Mycroft

There’s a disclaimer for good measure, but the person who “publicly disclosed” the vulnerability didn’t seem to understand that the unsandboxed git repo is fundamentally the dev build. It has the message bus exposed for roughly the same reason that SDK versions of major hardware tend to come with a serial port, whereas the consumer version has that space on the board unpopulated.

If people who don’t normally have the stomach for /r/selfhosted software are trying to power through just because of that scary notice, all you need to do is secure port 8181 literally any way whatsoever.

  • Or, the firewall is set up incorrectly (which I asked for specifics on in the post below).

The link is to my summary of the correct firewall settings… perhaps you could check them.

Regarding the OP going completely offline: in the second reply from KathyReid, she explains that STT (speech recognition, right?) is the biggest obstacle at the moment. In the last few posts there seems to be progress on that front. Google’s STT is very accurate, but my understanding is that it is currently not downloadable. Mozilla’s DeepSpeech may be, and there are other options directly above from jarbasAI.

The Mimic TTS can be downloaded. And the Mycroft.ai site ‘backend’ can be disabled by the config settings I pasted from the README. Or people have linked several personal backend projects, if you don’t trust the initial setup from Mycroft.ai (I don’t know if it can be set up with the config settings in place from the get-go, I haven’t tested), or if you want to manage your own skills and config repository (though at that point, isn’t it easier to apply skill changes and settings directly?).

For me, I have no problem using bandwidth for Google’s STT, and that seems to be less of a security concern if your voice data is anonymized. But I could see where wanting everything offline could be useful, so that all internal networked IoT and home devices can be controlled without any internet (appliances, gardening, robots, etc.).

Again, for me, I just wanted to secure unknown internet traffic after the port 8181 warning, but in the post I linked I see that all outbound traffic is on regular ports 80 and 443, the latter for secure account data, I would presume. After securing port 8181 on internal networks with the above firewall settings, and disconnecting from Mycroft.ai (at least until I want to exchange skills), I feel more confident that any internet traffic from Mycroft is at the same level of risk as a browser using ports 80 and 443. Correct me if I’m wrong. (Oh, and as was pointed out in the other post, where more discussion about security probably belongs, the threat from exposing port 8181 beyond localhost would be limited to data shared with Mycroft, which could range from trivial to significant.)

go fully offline using


Mozilla’s DeepSpeech is either viable or very close for North American English. You can download that over on their GitHub:
