Self-hosting and architecture questions

I’m very new to the Mycroft project and haven’t actually run a functional install yet - I’m still considering how I want to set things up.

One issue is that I’m trying to figure out the overall architecture of the Mycroft project. At first, I thought that Mycroft Core was the “brains” of the system. Given the name, this made sense. However, it’s just the “listener” system that runs on your local hardware, right? You pair it with your home.mycroft.ai account, and the Mycroft systems do the actual voice recognition? Functionally, it’s more “Mycroft Client” than “Mycroft Core.”

Selene and Selene UI are the actual brains, correct? From what I understand, this is the code behind the systems at home.mycroft.ai. Is that right?

I’m really interested in running the Mycroft Android app because I refuse to rely on the virus that is Google. I see that there are Mycroft Android and Mycroft Core Android. From what I understand, the Mycroft Android app is just a very basic “listener” that relays voice to a Mycroft Core system, Mycroft Core sends that to home.mycroft.ai for recognition, and the results are sent back to the device. Mycroft Core Android is an alpha-stage project for running Mycroft Core directly on Android instead of syncing with another system on your network. Is this correct?

Assuming I understand all of this correctly, here’s what I’d really like to do:

  • Host my own Mycroft Core system in the cloud using something like Google Cloud Run (while I hate Google, this is a great service - serverless Docker)
  • Build the Mycroft Android app and point it to my Mycroft Core instance on the cloud
  • (Optional) host my own Selene backend

However, I think that the Android app requires a Core system on the local network. Also, the configuration uses an IP address, not a hostname. That puts a damper on using a cloud-hosted Core system.
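
If the app really does only accept an IP, I assume I could at least resolve my cloud host’s name to an address myself before plugging it in (the hostname below is a made-up placeholder):

```python
import socket

# Hypothetical cloud host running mycroft-core - replace with your own.
CORE_HOST = "my-core.example.com"

# Resolve to an IPv4 address that an IP-only config field will accept.
core_ip = socket.gethostbyname(CORE_HOST)
print(core_ip)
```

Of course, cloud addresses can change, so that feels fragile at best.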

Do I understand everything correctly? Is there a way to do what I’m trying to accomplish?

Hey summersab, welcome to Mycroft!

It’s a good question, and I’ve been working on some documentation to give a better overview of all the Mycroft technologies.

Briefly, mycroft-core sits in the middle and coordinates all the other components on your device. E.g. Precise, our main wake word listener, is its own repository but gets run from mycroft-core.
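
If you want a feel for how Precise behaves on its own, its README shows roughly this usage (the binary path and model file below are placeholders; mycroft-core normally wires all of this up for you):

```python
from precise_runner import PreciseEngine, PreciseRunner

# Placeholder paths: the precise-engine binary and a trained wake word model.
engine = PreciseEngine('precise-engine/precise-engine', 'hey-mycroft.pb')

# The runner streams mic audio into the engine and fires the callback
# whenever the wake word is detected.
runner = PreciseRunner(engine, on_activation=lambda: print('Wake word!'))
runner.start()
```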

Selene (home.mycroft.ai), at a high level, provides a GUI for settings and access to anonymized cloud services. E.g. by default we use Google STT proxied through our servers for greater anonymity (explanation why here), and TTS is our own Mimic2 service.
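
If you’re curious which modules your install is actually using, you can inspect the merged configuration from a Python shell on the device. A quick sketch (exact keys can vary between releases):

```python
from mycroft.configuration import Configuration

# Merged view of the default, remote (home.mycroft.ai) and local
# mycroft.conf values.
config = Configuration.get()

# The "mycroft" STT module is the anonymized proxy described above;
# the TTS module will typically be mimic or mimic2.
print(config["stt"]["module"])
print(config["tts"]["module"])
```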

The actual Skills and processing all happen on device. So (see the sketch after this list):

  • detecting the wake word (on device)
  • recording utterances, aka voice commands (on device)
  • transcribing those to text (cloud)
  • determining the intent of the utterance (on device)
  • Skills processing and responding to the utterance (on device)
  • synthesizing the response text into audio (cloud by default; on-device backup, or if you choose British Male)
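
You can actually watch most of those stages go by as events on the messagebus. A minimal sketch using the mycroft-messagebus-client package (event names are taken from current mycroft-core and may shift between versions):

```python
from mycroft_bus_client import MessageBusClient

client = MessageBusClient()  # connects to the local bus on port 8181 by default

# Each handler below corresponds to a stage in the list above.
client.on('recognizer_loop:wakeword', lambda m: print('wake word detected'))
client.on('recognizer_loop:record_begin', lambda m: print('recording utterance'))
client.on('recognizer_loop:utterance',
          lambda m: print('transcribed:', m.data['utterances']))
client.on('speak', lambda m: print('response text:', m.data['utterance']))

client.run_forever()
```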

There are a few community Android projects around, but predominantly you are correct: they do the STT on the Android device, send the text to mycroft-core for processing, and send the synthesized audio back to the Android device.
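
In other words, a thin client like that mostly just injects text and waits for the spoken reply. Something like this sketch emulates the text half of that exchange (the host is a placeholder; the audio leg is handled separately):

```python
import time

from mycroft_bus_client import MessageBusClient, Message

client = MessageBusClient(host='192.168.1.50')  # placeholder core address
client.run_in_thread()

# Print whatever the Skill decides to say in response.
client.on('speak', lambda m: print('Mycroft:', m.data['utterance']))

# Inject a typed "utterance" exactly as if STT had produced it.
client.emit(Message('recognizer_loop:utterance',
                    {'utterances': ['what time is it']}))

time.sleep(5)  # give the Skill a moment to answer before exiting
```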

The core doesn’t have to be on a local network, but the messagebus does not have security built in by default. On Mycroft devices we handle this at the operating system layer, so putting an open instance of Mycroft in the cloud that anyone can connect to is a security disaster waiting to happen.
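
If you do run core on a remote box, one common mitigation is to never expose port 8181 at all and instead tunnel the bus over SSH. A sketch using the sshtunnel package (hostname and username are placeholders, not a recommendation of any particular setup):

```python
from sshtunnel import SSHTunnelForwarder
from mycroft_bus_client import MessageBusClient

# Forward localhost:8181 to the messagebus on the (firewalled) cloud host.
with SSHTunnelForwarder(
        'my-core.example.com',            # placeholder hostname
        ssh_username='mycroft',           # placeholder user
        remote_bind_address=('127.0.0.1', 8181),
        local_bind_address=('127.0.0.1', 8181)):
    client = MessageBusClient()  # now talks to the tunneled bus
    client.run_forever()
```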

It’s certainly possible to run Mycroft in the cloud (and it is done), but it’s beyond the support we can provide without a paid service level agreement.

Thanks for the explanation! I look forward to your documentation - a nice diagram made using something like https://www.diagrams.net/ might be helpful. Pictures span all languages, after all.

Cheers!