What is wrong with mycroft-core right now?

Mycroft-core is a large and complex code base that has grown massively since its initial commit in May 2016. We know that it’s not perfect, and want your help to map out what the issues are with the current framework.

Whilst we’ve done some great work improving the user experience on the Mark II using the current system, that work has often come with workarounds or one-off tweaks. The deeper structural changes that are needed for the future become more complex and get put off even further. In a nutshell - technical debt.

Seeing the impact of that technical debt, we want to do three things:

  1. Take a brief step back and map out the issues that the broader community see with the current system;

  2. Define a small set of fundamental principles for what we should be working toward; and

  3. Explore the best way for us to get from where mycroft-core is today, to where it needs to be for tomorrow.

What are the issues we’ve seen and heard about?

Many of you will have seen, heard and been frustrated by these issues. We have been listening, and our own developers have been facing the same challenges. Whilst each has numerous possible solutions, some of these will require fundamental changes. So we wanted to put them all out on one big table together so that we don’t miss anything.

We’d love your help in extending this list with anything you don’t love about Mycroft right now.

So without further ado… we have noticed that Mycroft currently:

  • Includes all possibilities for every possible implementation in a single code base. There are a broad range of services to choose from for STT, TTS, Audioservices, and Wake Word engines. There are also a lot of hard coded references to particular hardware platforms like the Mark 1.

  • Has tightly coupled Services across what could be very distinct components.

  • Implicitly depends on external libraries, tightly coupling itself to their conventions - for example, the Mycroft Skills Manager.

  • Has circular dependencies, such as the utils having configurable parameters while the configuration module also relies on those utils.

  • Uses multiple paradigms for inter-process communication including:

    • Web socket based message bus

    • Files with hard-coded locations (mic level)

    • File system “signals” (button presses, isSpeaking)

    • File system locks (writing config files)

  • Has an inconsistent Message structure and schema. Determining what you should be emitting or listening for requires searching through mountains of code. The APIs for each Service should be better defined, making it clearer what you can expect from each component, how to invoke it, and the structure of the data it returns (see the bus sketch after this list).

  • Re-implements standard library functionality in unique ways, such as our logging module.

  • Has varying levels of testing across the code base that mostly reflect who wrote the code, rather than how important it is for that code to have rigorous tests.

  • Has tight coupling of dependencies. Skills and plugins share the same Python virtual environment with core. This creates dependency conflicts and adds limitations to the system, particularly as it scales. Even the need to use a single version of Python creates an unnecessary dependency between core and its extensions. More broadly, having all systems share the same process space means that a poorly performing component can bring down the entire stack.

  • Almost requires that downstream projects fork the code. For a lot of use cases, modifications to core are required in order to add or modify functionality. This adds the additional burden of ensuring that any changes made maintain backwards compatibility, as well as merging in new changes from upstream.

  • Lacks state awareness and introspection at most levels of the system, requiring meticulous investigation to determine what is currently happening, or has recently happened. A key missing piece is any concept of an interaction session. There is currently no way to determine what speech input caused a particular response, let alone any side effects that might have been triggered.

  • Is in need of some deep re-organization to account for the way each system has evolved over time.
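
To make the Message schema point concrete, here is a rough sketch of typical bus traffic using the standalone mycroft-messagebus-client package. The message types shown are real ones, but nothing in the system documents or enforces what their payloads must contain - which is exactly the problem:

```python
# A rough sketch of mycroft-core bus traffic; the message types are real,
# the payload fields are the kind you currently have to dig out of the source.
from mycroft_bus_client import MessageBusClient, Message

bus = MessageBusClient()  # defaults to ws://0.0.0.0:8181/core
bus.run_in_thread()

# Nothing enforces what `data` must contain for a given message type -
# you find out by reading the emitting service's code.
bus.on("speak", lambda msg: print(msg.data.get("utterance")))
bus.emit(Message("recognizer_loop:utterance",
                 data={"utterances": ["what time is it"], "lang": "en-us"}))
```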

Phew!!! That is quite a list, but unfortunately I’m sure there are things we have missed.

What has frustrated you about the current Mycroft architecture?

… and the good news?

That is a lot of stuff that we want to fix, but it’s not all bad news. Addressing these issues will require some new foundational concepts to be added; however, it will result in a much more reliable and stateful system; faster and more robust future development efforts; and easier onboarding for downstream projects and core contributors.

There is also a huge amount of great work that has gone into Mycroft over many years and some incredible technologies that bring Mycroft to life. Fixing these issues isn’t about replacing all of that work, it’s about giving those existing components better foundational structures so they can interoperate with each other more effectively.

Once we have mapped out the key foundational issues, we want to explore what that means for our development principles and explore the best way forward. But for now, please give us your honest and constructive feedback on mycroft-core as it currently stands.

We don’t want to propose solutions just yet. We want to map out all of the issues so that future solutions are developed with the broader picture in mind. So, what would you change about mycroft-core?

8 Likes

First off, I’m happy to see a planned tech debt repayment now that the Mark II goals are achieved (or at least mostly there). Here’s my initial feedback on items I think I have meaningful contributions to.

  • Includes all possibilities for every possible implementation in a single code base. There are a broad range of services to choose from for STT, TTS, Audioservices, and Wake Word engines. There are also a lot of hard coded references to particular hardware platforms like the Mark 1.

I think the OVOS Plugin Manager exemplifies a good structure for making plugins standalone components for use with any project. The plugin loading logic is already there for most of these in mycroft-core and ovos-core.

  • Has tightly coupled Services across what could be very distinct components.

I found it really helpful to separate the services into standalone modules/packages; neon_core still includes the skills service with some other supporting modules, but speech, audio, messagebus, and gui were pulled into separate modules with unit tests and PyPI packages. This helps me avoid breaking changes, simplifies containerization (each module has automation to build as a Docker container), and helps manage intentional breaking changes by forcing proper versioning/dependencies.

  • Uses multiple paradigms for inter-process communication including:

As I mentioned in this issue, moving signals to the messagebus was a straightforward solution to part of this problem for me. I think file locks are still the most efficient method for managing any writes to disk (config); the combo-lock package handles this well and introduces less latency than something built on the messagebus would.
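
For reference, a minimal sketch of the config-write pattern, assuming the combo-lock package’s ComboLock (the constructor signature is from memory and may differ between versions):

```python
# Guarding config writes with combo-lock, which combines a thread lock
# with an inter-process file lock.
import json
from combo_lock import ComboLock

config_lock = ComboLock("/tmp/mycroft_config.lock")

def update_config(path: str, updates: dict):
    # Threads in this process and other processes writing the same file
    # all serialize on the lock, so partial writes never interleave.
    with config_lock:
        with open(path) as f:
            config = json.load(f)
        config.update(updates)
        with open(path, "w") as f:
            json.dump(config, f, indent=2)
```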

  • Has tight coupling of dependencies. Skills and plugins share the same Python virtual environment with core. This creates dependency conflicts and adds limitations to the system, particularly as it scales. Even the need to use a single version of Python creates an unnecessary dependency between core and its extensions. More broadly, having all systems share the same process space means that a poorly performing component can bring down the entire stack.

Relating to separating the services into separate packages, I can run each Neon service in its own environment (or container) since they just communicate via the Messagebus. This doesn’t address skills which I believe would require some more refactoring in the skill load logic.

  • Almost requires that downstream projects fork the code. For a lot of use cases, modifications to core are required in order to add or modify functionality. This adds the additional burden of ensuring that any changes made maintain backwards compatibility, as well as merging in new changes from upstream.

I fully agree with this. Packaging the core and pushing it to PyPI with semantic versioning (or even better, doing this for each service separately) would make it much easier to maintain derivative projects. This was a big motivator for me to rebase Neon on ovos-core.

  • Lacks state awareness and introspection at most levels of the system, requiring meticulous investigation to determine what is currently happening, or has recently happened. A key missing piece is any concept of an interaction session. There is currently no way to determine what speech input caused a particular response, let alone any side effects that might have been triggered.

I don’t think this is entirely correct: message.context['ident'] should be available in all cases to associate responses with requests. In Neon, I use message context to associate responses with their respective sessions.
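
As an illustration of that correlation pattern (the handler below is hypothetical, but Message.reply() really does carry the original context forward):

```python
# Sketch: tagging a request with an ident and propagating it to the response.
import time
from mycroft_bus_client import Message

def handle_utterance(bus, message):
    # Stamp the incoming request if an upstream service hasn't already.
    message.context.setdefault("ident", f"req-{time.monotonic_ns()}")
    answer = "it is 3 o'clock"  # ...skill logic...
    # reply() propagates message.context, so the response carries the same
    # ident (and, in Neon's case, session data) back to whoever asked.
    bus.emit(message.reply("speak", {"utterance": answer}))
```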

…however it will result in a much more reliable and stateful system…

I consider core to be a stateless system ideally. Specifically, I think any session data, interaction history, etc. should be attached to the Message objects and the core shouldn’t manage any “state data”. This way, the core modules scale to serve multiple users/devices/endpoints and become less intertwined.

So, what would you change about mycroft-core?

The short answer is “everything I did in neon-core” :wink:. Besides everything above:

  • I think separating “user config” from “core config” is high on my list.
  • Breaking mycroft-core into more discrete modules makes testing and maintenance easier, in addition to helping resolve the mentioned issues with circular imports and coupled components.
  • Managing skills as Python packages would alleviate some complications with updates and dependency conflicts by deferring to pip; this is implemented in ovos-core.
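
As a sketch of the skills-as-packages idea, a skill becomes a normal setuptools project; the entry-point group name below is from memory of the ovos-core approach and may differ, but the pattern is standard Python packaging:

```python
# setup.py for a pip-installable skill (illustrative names throughout)
from setuptools import setup

setup(
    name="skill-example-weather",
    version="1.0.2",                      # semver lets pip resolve upgrades
    packages=["skill_example_weather"],
    install_requires=["requests~=2.28"],  # pip now handles skill dependencies
    entry_points={
        # the skill loader discovers installed skills via this entry point
        "ovos.plugin.skill": [
            "skill-example-weather.someauthor=skill_example_weather:WeatherSkill"
        ]
    },
)
```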
3 Likes

Thanks Daniel, some great additions!

It has been great to see the developments in Neon and OVOS too. We’ll certainly want to look at those when it comes to implementation, however we really want to make sure we have the problem well defined before we consider what the solutions might be.

So please keep those problems coming :slight_smile:

As a contributor to the project in certain areas, off the top of my head I identify with the following frustrations and challenges with Mycroft-Core to start with (briefly, without diving into code technicalities):

  • Hardcoded platforms and HAL instead of pluginized platform extensions. The tightly coupled platform integration in the enclosure code leaves no room for other projects to integrate with Mycroft-Core without forking it - for example, Mycroft-Core only recognises the Mark 1 and Mark 2 hardware as supported platforms, leaving all other possible platforms out of the core.

  • Limiting the GUI API to the Mark 2 specifically (as done in the mark-2/qa branch). Artificially tying it to a specific screen in the enclosure codebase inside Mycroft-Core is a sad design, even for downstream and upstream projects wanting to expose the GUI on different platforms and hardware devices - which eventually leads to projects having to fork for integration.

  • Inconsistent GUI implementation between the dev branch and the mark-2/qa branch. mark-2/qa improved some of the GUI consistency issues but seems to have left out a lot of the Mycroft-GUI API implementation and hardcoded certain behaviours. I would rather not see this come into the Mycroft-Core dev branch as it is currently implemented.

  • Lack of XDG standards: hardcoded paths in core and the skills manager determine where skills, configurations, and data live, and there is no multi-user support because everything is pushed into /opt - basically a packaging nightmare for Linux distributions and package managers.

What I would like to see:

  • A pluginized HAL layer and removal of the enclosure system. A very good reference for this is the OpenVoiceOS PHAL plugin system (GitHub - OpenVoiceOS/ovos_PHAL), which completely replaces the concept of a hardcoded HAL and enclosure code with a plugin system for different hardware and platform types. That means no more hardcoding LED lights, volume control, wifi management, or button control in enclosure code - plugins are bundled or enabled based on the environment and available hardware via fingerprinting, live externally to the core, and are pip-installable (list of various PHAL plugins: OpenVoiceOS · GitHub). See the sketch after this list.

  • Moving the GUI codebase out of the enclosure code, and making it modular and extendable. Look into the ovos-core implementation of the GUI protocol, which takes the best of the mark-2/qa branch implementation but adds the missing functionality and removes the hardcoded hacks that branch applies to the GUI protocol. For reference, see how OVOS has made the GUI API universal, standalone, and extendable to any platform via interfaces and extensions, in turn disconnecting and deprecating the enclosure and its hardcoded use cases: ovos-core/mycroft/gui at dev · OpenVoiceOS/ovos-core · GitHub
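
To give a feel for the PHAL approach, here is a hypothetical sketch - the class names, fingerprinting hook, and loader are illustrative, not the real ovos_PHAL API:

```python
# Hypothetical PHAL-style plugin: hardware behaviour lives in a plugin that
# self-selects by fingerprinting, instead of being hardcoded in enclosure code.
import os


class ExampleLedPlugin:
    """Drives one specific board's LEDs in response to bus messages."""

    @staticmethod
    def detect() -> bool:
        # Fingerprint the platform; the loader only activates this plugin
        # if the hardware is actually present.
        return os.path.exists("/sys/class/leds/example-board")

    def __init__(self, bus):
        self.bus = bus
        bus.on("recognizer_loop:wakeword", self.on_wakeword)

    def on_wakeword(self, message):
        pass  # board-specific LED animation goes here


def load_plugins(bus, candidates):
    # The core stays generic: it instantiates whichever plugins match
    # the current hardware, with no platform names baked in.
    return [cls(bus) for cls in candidates if cls.detect()]
```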

6 Likes

I’m trying to build smart boomboxes, so music playing is the #1 priority.

NOTE: (I believe) general question answering and playing music are the two most common uses of voice assistants, both over 60% in user surveys.

There should be at least three ways to play music:

  1. From local or network files (e.g. mp3 files on a USB drive)
  2. From Internet radio stations
  3. From a paid music streaming service that compensates artists fairly

For (3), it seems Pandora and Spotify are not the answer, but hopefully there will be an agreement with another company soon. I wait patiently.

For (2), there is a supported Internet radio skill, but when I add it, it seems to intercept requests for (1), even if I add the suffix “from Emby”.

For (1), it seems Emby is the supported music skill. I can say “Play {artist}” and it works. I cannot get requests like “Play track {track} by artist {artist}” to work. This is really basic - maybe I’m missing something.

These are some huge issues. I hate to say it, but commercial voice assistants are way ahead. For me this is a blocker under the heading of “IoT devices with Mycroft in the name must be solid”. (Not to mention the blocker of Raspberry Pis currently being “unobtainium” :))

Hope this helps.

-Mike Mac
3 Likes

Hey, thanks for all the contributions so far.

I want to flag that the whole reason we have been doing experimental development on a branch of core is because we don’t think that those changes should be incorporated into the main branches. But that’s also what this process is about. We don’t want to have that branch continuing to diverge - experimentation can only go so far. So we need to create a roadmap for how we move forward and solve the challenges we’ve identified in more universal ways.

We want mycroft-core to be a truly “core” package (or packages) that can be utilised by all the downstream projects and partners. What does it need to look like to enable each project to add the custom services and functionality that they need, without baking all of that into the “core”?

To quickly recap some of the issues that have been highlighted already:

  • Having hardware interface components all baked in together.
  • There should be no hardcoded platforms of any kind.
  • The GUI API should also be platform independent.
  • Services are not containerizable, and don’t have clear versioning and dependencies per service.
  • Skills can’t be run in their own containers.
  • Current Skill distribution mechanism is complicated, especially for distro maintainers.
  • Current mycroft-core versioning scheme can be confusing.
  • No simple way to import Mycroft-core into other projects.
  • Different configuration levels are all mixed in together as part of mycroft.conf
  • Lack of XDG standards for Linux installations.
  • Installing everything in /opt reduces the functionality of multi-user systems.
  • Hardcoded file paths are just bad.

and some things that people like about current mycroft-core

  • File locks are an efficient way to manage writes to disk.
  • Like that core can serve multiple users / devices / endpoints.

@NeonDaniel I’m curious to dig a little into the benefits you see from containerizing each individual service, rather than running them all in a single Docker container for example. Was this to solve for a particular use case, or more to prevent accidental inter-service dependencies and make testing easier?

@mike99mac - agree that music is a big priority. I was intentionally leaving the subsystems out of this discussion, as there are many improvements we can make across the services themselves - what I’m trying to map out here is how we improve the coordination of those. But absolutely, without music a voice assistant in your home just isn’t anywhere near as useful or interesting, and I agree with your 3 broad categories, assuming local servers like Emby/Jellyfin/Plex/etc. would sit in the “local or network files” bucket.

Lots of good stuff for us to mull over so far - and I’d encourage anyone reading along to add your thoughts big or small!

Thanks

1 Like

There are a few benefits I saw immediately from this route:

  • The core dependency list was unmaintainable (partly inherited from Mycroft, partly bloat from early Neon dev). Separating the services made the dependency lists easier to maintain for me.
  • The modules were originally structured to be run independently (they have since been refactored in OVOS/Neon to extend Threads, so a single Python module could manage them now). Running them in separate containers means that if there’s a fatal error in one module, Docker can just restart the container, without any dev work to raise exceptions out of a worker thread.
  • To one of your points about logging: running a service in a container allows logging in Docker to “just work” without having to do anything special; I just tail the logs of whichever container/service is of interest.
  • From an extensibility standpoint, making everything a separate service made sense to me since we have some optional services in Neon; i.e. the gui module/container doesn’t run on our Kubernetes cluster, since it currently doesn’t support multiple clients.
  • For maintainability, I personally find it easier to deal with issues/PRs per service. This also makes releasing module updates less of an ordeal, since I only have to deal with one part of the code at a time.
  • As mentioned, unit tests are a huge endeavor for something like mycroft-core; breaking things into components made it easier to pull common utilities into a separate package (neon-utils in my case). Migrating utilities helped me to identify and merge duplicate functions while writing unit tests for all of them.
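
As a sketch of that Thread-based structure (a hypothetical class, not the actual OVOS/Neon code), the same service class can run under one supervisor process or alone in its own container:

```python
# Hypothetical service-as-Thread sketch: run many in one process, or one
# per container where Docker's restart policy handles fatal errors.
import threading


class SpeechService(threading.Thread):
    def __init__(self):
        super().__init__(daemon=True)
        self._stop_event = threading.Event()

    def run(self):
        while not self._stop_event.is_set():
            self._stop_event.wait(0.1)  # ...poll mic / process audio...

    def shutdown(self):
        self._stop_event.set()


if __name__ == "__main__":
    # Standalone entry point for the container case: a fatal error exits
    # the process, and the container restart policy brings it back up.
    service = SpeechService()
    service.start()
    service.join()
```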
3 Likes

Regarding what @NeonDaniel & @mike99mac have said, I see an overlap that might not have been their intention: generally, the ‘Snips goldrush’ has seen a choice of branding and an appropriation of Mycroft-only systems and protocols, whilst ignoring the real upstream sources and existing, widely used protocols for anything that can be labelled.

The killer privacy app is likely a local media service, as Google and others move to free offline ASR because they own the services that will continue to feed their ad-push systems and maintain a profile of your usage, without the added concern of ASR monitoring.
The actual dev work to provide a skill for a niche user base such as Mycroft’s is not worth the effort of a full-blown, polished pro app that might be part of an app store or donation based.
This is where I see value in containerizing each individual service, and in particular generic skills abstracted at the network layer: partitioning out via containers allows interoperability if you contract out of bespoke APIs, as intent is all that is needed. Intent is the API of a partitioned zonal system, and it is merely the routed text output of ASR.
Mycroft core’s default operation should be an intent router: in terms of media, say, the NLU would classify an utterance as Media and route the unprocessed intent to a standalone media skill server for processing (see the sketch below).
This would allow open source standalone skill servers that could be Mycroft branded.
The manner of audio distribution in smart assistants has such strong parallels to modern consumer wireless music that it should be delivered via wireless music protocols, as the reuse of existing audio infrastructure ticks every box in constructing a modern offline digital home.
Audio delivery is the same: it should also be containerised and be part of a common zone/room multichannel audio system where time-synced, latency-compensated audio is a default requirement.
Open source is about choice and interoperability, where a herd or huddle of like minds can often provide for many; without choice and interoperability, it becomes something for a few.
Partitioning via containers to abstract to a network layer at any point is a very good way to do this, and if strong open source already exists then use it and feed the upstream, rather than leeching it into branded-only use - that herd may well adopt the changes you provide.
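
A hypothetical sketch of that intent-router idea - every name, port, and endpoint here is illustrative, not an existing Mycroft API:

```python
# The "core" does nothing but classify an utterance and forward the raw
# text plus its zone metadata to a standalone skill server over HTTP.
import json
import urllib.request

SKILL_SERVERS = {  # illustrative registry of per-category skill servers
    "media": "http://media-skill.local:8000/intent",
    "weather": "http://weather-skill.local:8000/intent",
}

def classify(utterance: str) -> str:
    # Stand-in for a real NLU step: pick a coarse category.
    return "media" if "play" in utterance.lower() else "weather"

def route(utterance: str, zone: str):
    url = SKILL_SERVERS[classify(utterance)]
    payload = json.dumps({"text": utterance, "zone": zone}).encode()
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"})
    return urllib.request.urlopen(req)  # the skill server handles it from here
```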

Also, on the choice of dev tools, there is what has now become a historical hole: on an embedded system heavily reliant on a DSP input audio processing pipeline, there is a total absence of low-level code. The Python devs are great, but this hole exists because “Python where we can, C++ where we must” is ignored.
The mantra doesn’t have to be that specific, but for embedded DSP, high-level languages such as Python are extremely inefficient on an embedded platform. It would seem we have a skill-set gap at the all-important start of the audio pipeline, and that element of Snips was likely their greatest value.

PS as for media don’t forget https://www.musicpd.org/ & https://mopidy.com/

I think this goal could be achieved by the modularization I mentioned above. If you just want the intent parsing and skills, you could run the skills module and interface via the messagebus, or you could just use it as a dependency and wrap it up in your own service.

The concept for media-specific intent handling I think is already sufficiently outlined in the CommonPlay (and OvosCommonPlay) specs. The existence of OvosCommonPlay also exemplifies how anyone could replace a part of the intent system with something better suited to their specific needs.

2 Likes

The concept for media-specific intent handling that you think is already sufficiently outlined in CommonPlay (and OvosCommonPlay) is an absolutely overbloated crock that was designed to create IP rather than meet a need.
It’s an absolute parallel of the useless, bloated, and cumbersome Hermes control that nearly killed community skills in Rhasspy as it became so complex. What was worse, most users questioned why - apart from a small section who hijacked it and ignored user feedback as they created the next system to sell to the next Sonos, without realising the bubble had burst when Snips was sold.

A skill server has no need to know a load of junk protocol from Mycroft or OvosCommonPlay. All that does is enforce a complex, confusing spec/protocol that is not standard or widespread, and is therefore only used by a few - which is why the herd has contracted into remnants rather than growing into a widespread population.

Intent text from the ASR and the group/zone it originated from is all that is needed - it’s that simple. The truth is that the existence of CommonPlay & OvosCommonPlay is all part of the imaginary ‘Snips goldrush’ mentality of creating IP out of thin air.

For some reason CommonPlay (and OvosCommonPlay) reinvent the concept of media intent handling and monopolise it with a complex load of junk specs, all so a skill server can send a media stream to the correct zone/group of a modern wireless audio/media system - a task that needs absolutely zero of that supposed concept or spec to accomplish, and why should it?
The devs who came up with it haven’t provided a modern wireless, latency-synchronised media system of their own. That means you have to use someone else’s, and it’s delusional to believe it needs any of what you state, because you don’t provide any media system - just something that is more conceit than concept, and to many that is pretty blatant.

Look at the size of the community and the manner in which it has shrunk, and maybe have a think about your supposed concepts that are basically delusional in scope, playing IP Disneyland.
The problem with the core is the core itself, and the core group it has attracted - if Mycroft is to be anything but an exercise in Python programming enjoyment by a few.

Thanks for starting this super-important discussion. A lot has already been said so I’ll try to be brief :slight_smile:

To me the big thing about mycroft-core is that it does a bit much. It’s clients, services and library all in one. These could perhaps be separated more.

The client architecture should be consolidated. The speech client handles just input and lets the audioservice handle the TTS. It would be neat to make this symmetric, so we had a client that takes input, passes it to the service for action, and then receives output to handle (this would also include the “audioservice” starting playback in a way relevant to the client). I think OVOS recently moved towards this structure.

“Fallback skills” are still a bit of a special case. AFAIK the fallback skills are the biggest blocker for sandboxing skills in their own processes.

Maybe consider moving “stable” parts of core over to a systems language (C / Rust / Go / Elixir / etc.). My naive port of the messagebus service to Rust improved the speed of individual messages by 50% (using the normal Python client).

I love the way OVOS has made a plugin base class module that can be used without relying on the entire mycroft-core infrastructure.

More love for the intent parsers is needed. Adapt has been hugely improved by Clusterfudge et al., but there are still a couple of rough edges that can be improved. And perhaps Padatious could use the same TLC, or be replaced with something new providing similar features?

Core does some things that should be left to the system, such as the NTP check/update at startup of the skills process. If that is needed for certain platforms, maybe it should be done in platform-specific code, or left to something like systemd.

Maybe the plugin system could be revisited and extended to make core more modular than it is today.

Glad to see the big issues being considered again!
/Åke

5 Likes

Once intent has been routed to a sandboxed skill server, it doesn’t need the Mycroft ‘fallback skills’. The only fallback skills needed are for when an intent has no skill server to route to, and those amount to not much more than “sorry, I didn’t understand that”.
The only thing blocking sandboxing skills in their own processes is the current way Mycroft enforces skill processing; otherwise, all you are doing is routing intent elsewhere.
The fallback skills could themselves be a sandboxed Mycroft ‘fallback skills’ server - paradoxically, a server of bad skills that exist because the primary good skills are missing.

Take a local music streamer purely as an example: far more is needed than just a bit of love to intent parsers to make a really robust and comprehensive local, private ‘Spotify’ with bells & whistles. It’s not worth developing for such a small herd - it would be great if Mycroft did, but I guess it isn’t, because it isn’t already here.
The entrapment and lack of interoperability set limits on what is worthwhile to develop, and that sets the level of what users might appreciate low.

The core skills take a lot of polish and a lot of work; you would be better off developing those as Mycroft-branded, skill-specific servers that could be used by a wider audience because they are sandboxed and interoperable.

I really believe open source can beat commercial systems, because open source can be distributed and interoperable whilst commercial systems aim for all-in-one, off-the-shelf products. It’s not more love that’s needed; it’s the realisation that much of the core took a wrong turn and some big changes need to be made.

Not sure I get what you mean by enforcing skill processing here?

Currently, normal skills (based on the MycroftSkill class) can be run in a separate runtime, since all communication between them and the system is basically either skill-local files or messages on the messagebus (intent, gui, tts, start playback, etc.). So these are trivial to sandbox in separate processes.

Fallbacks, on the other hand, are looped through and their functions are called directly, one by one, by the central skill process, so they can’t currently be separated. They need to be considered: either removed or updated to work differently. They could even become pluggable special intent handlers.
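
For concreteness, a rough sketch (not the current mycroft-core API - the message types and router class are hypothetical) of how fallbacks could become bus-based special intent handlers, which would make them sandboxable like normal skills:

```python
# Hypothetical fallback router: instead of calling registered functions
# directly in-process, ask each fallback service over the bus in priority
# order until one claims the utterance.
from mycroft_bus_client import Message


class FallbackRouter:
    def __init__(self, bus, priorities):
        self.bus = bus
        self.priorities = priorities  # e.g. ["fallback.wolfram", "fallback.unknown"]

    def handle_unmatched(self, utterance: str):
        for service in self.priorities:
            reply = self.bus.wait_for_response(
                Message(f"{service}.request", {"utterance": utterance}))
            if reply and reply.data.get("handled"):
                return reply  # a sandboxed fallback process answered
        return None
```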

You have just answered it in various forms: the MycroftSkill class, and specific Mycroft messagebus messages.

Intent, once routed to a skill server, needs nothing but the group/zone that it belongs to.

Look at your user numbers, how the forum posts and interest have declined, and the retention you have.
Have a think before you try to justify the current processes, and their continuation, as if they work so well.

Rather than binding to what has produced a collection of mediocre skills, partition out into a much more interoperable framework, and maybe concentrate on providing those few key killer skills that may garner user support, use and input.

It’s probably a pointless exercise to try to convince one of the key devs, involved with the project for a number of years, that what is actually available isn’t very good and is overly restrictive.
When so many state-of-the-art open source modules don’t bear a Mycroft name, it’s a real incentive not to tie yourself into a near Mycroft-only framework with Mycroft protocols.

Apologies, Åke - I had been hanging around to see what would happen with Montgomery stepping down and the new CEO, hoping for a change.
It sounds like that’s not happening and things are continuing as before, so it’s pointless me getting into a discussion, as I will only be negative; I just really don’t like what you guys ended up with, or think it’s very good…

I am not going to bother commenting, as you will be glad to know. Enjoy, and best wishes to all.

I can’t really comment on the Python side of mycroft-core as I am not a Python dev myself; however, I would like to mention the use of (open) standards and software for the whole implementation of mycroft-core into the OS.

It has already been mentioned above by others as well, but like to emphasize it.

By tapping into software and standards used by the Linux communities for many years, you make sure you do not generate new technical debt along the way.

A good example to explain myself a bit: the wish to have some sort of supervisor that makes sure the different services keep running healthily. I forget the link to the technical document you guys wrote up about it, but in short, what you wanted has already been there for many years.

By splitting up mycroft-core into separate services, like we did with ovos-core, we can start each and every service separately utilizing systemd. It also allowed us to start them with the notify flags, reporting back the different states of these services.

mycroft-core on a low level is just the message-bus, and all other services rely on it, so that core module is started first. The systemd wrapper reports back to systemd that the service is started.

The skill service relies on the message-bus service, so it is pointless to start it before that service is rocking and rolling. By making the skill service depend on the message-bus, it will be started when that service is ready.

Similar things can be implemented using the watchdog to keep reporting back to the OS whether services are still in a healthy state; if not, systemd will take care of restarting them. This can also be extended using the system’s hardware watchdog so that, in case the whole system is on its knees, the whole device gets rebooted.
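
For illustration, the Python side of that is tiny - a minimal sketch assuming the sdnotify package (python-systemd’s systemd.daemon.notify works the same way); the unit file side would set Type=notify, WatchdogSec=, and After=/Requires= on the message-bus unit:

```python
# Minimal systemd readiness + watchdog reporting from a Python service.
import time
import sdnotify  # pip install sdnotify

notifier = sdnotify.SystemdNotifier()

def run_service():
    # ... connect to the messagebus, load skills, etc. ...
    notifier.notify("READY=1")  # Type=notify: tell systemd we're up
    while True:
        # ... one iteration of the service's main loop ...
        notifier.notify("WATCHDOG=1")  # heartbeat; if it stops arriving
        time.sleep(10)                 # within WatchdogSec, systemd restarts us
```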

Another example of utilizing standards is our MPRIS integration for the audio services. If something plays, other software knows about it and can control it: pause, stop, next, previous, even volume control and showing the metadata.

The other way around is exactly the same. Because we implemented the MPRIS standard, when something plays on the device outside the audio service, Mycroft is able to pick it up and control it as well. Example: I can stream something from my phone to the device via, for instance, AirPlay or Spotify, and Mycroft immediately knows about it, which allows me to control what is playing by voice; and as the metadata is part of the MPRIS protocol, the GUI shows what is playing just as if I had initiated the playback from a skill.
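
For anyone curious what that looks like in practice, a small sketch using dbus-python and the standard MPRIS interface names (only the player’s bus name, here Spotify’s, would be discovered at runtime in a real integration):

```python
# Controlling any MPRIS-compliant player over D-Bus.
import dbus

session = dbus.SessionBus()
player_obj = session.get_object("org.mpris.MediaPlayer2.spotify",
                                "/org/mpris/MediaPlayer2")
player = dbus.Interface(player_obj, "org.mpris.MediaPlayer2.Player")
props = dbus.Interface(player_obj, "org.freedesktop.DBus.Properties")

# The standard controls work for any compliant player:
player.PlayPause()
player.Next()

# Metadata (title, artist, art URL, ...) is there for the GUI to show:
meta = props.Get("org.mpris.MediaPlayer2.Player", "Metadata")
print(meta.get("xesam:title"), meta.get("xesam:artist"))
```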

The above are just some small examples of utilizing things that have been available to Linux users for many years.

I am not saying you should implement what we implemented; I am just suggesting that you guys investigate a bit more of what is out there and use it when you need certain things at the OS level. That way you can rely on the many developers who maintain that piece of software, instead of implementing something on your own which needs to be maintained by yourselves and quickly falls behind (technical debt), because there are only limited hands available and only 24 hours in a day.

If the above makes any sense…

3 Likes

I did not want to in any way invalidate your opinions or concerns @StuartIanNaylor. I just tried to explain why I thought it was a good idea to think about the fallback skills.

Also, I have not been affiliated with Mycroft in any way for more than 2 years, so my opinions are just that: opinions.

Once more, I’m sorry if I was dismissive of your concerns.

It’s not that you were dismissive; it’s more that I have been wondering why things seem to have ground to a halt, with almost no change for a long time.
You have no need to apologise, as it’s good to hear from you after what has been a relatively long time - I didn’t even know you had left. I didn’t think you were dismissive; it just sounded like little change, when I think much of what Mycroft has needs a rethink and drastic change.
I was just getting out of the conversation because of what I have said, and what I will likely say, because I just don’t think much of it is all that good, and was wondering if I should just keep quiet instead.
I have just been curious about the new CEO and the changes at the top, wondered if many other changes might also happen, and have been hovering once more in the forums looking for change.

There are now a lot of state-of-the-art (or near state-of-the-art) ASR/TTS/NLU frameworks - the scene is quite full of them - and for me, Mycroft tends to lag behind in comparison.
Few projects, unlike Mycroft & Rhasspy, try to knit these modules into what can be considered a complete system, and there is a lot of value in a super-easy framework for loosely joining any choice of pipeline (mic -> KWS -> ASR -> NLU -> skills -> TTS -> audio), so that Mycroft is a base system for smart assistants that utilises best of breed, feeds upstream to the source, shares maintenance with a larger herd, and can distribute from a single unit to a many-node zoned system.
I have to question why Mycroft spreads itself so thin and provides certain modules where better alternatives likely already exist, whilst it’s the linking framework of best-of-breed modules that is the real unique point of the likes of Mycroft & Rhasspy.

I did read what @j1nx said and totally agree - GitHub - hifiberry/snapcastmpris (MPRIS interface to Snapcast) all the way :slight_smile:

I am probably more hard-line on using open standards and existing code: against refactoring and rebranding, in favour of simply using them and feeding back upstream.

For me, in terms of a voice-activated smart assistant, there is very little need for a message bus apart from within skill servers on a shared node.
Each module/step/section should be containerised; shared resources should be purely network and basic Linux services such as NFS, and I don’t really see a Mycroft core at all.

KWS is a network mic that is allocated to a zone; input audio is merely a mirror image of a wireless audio system, with a matching zone but no code or protocol connection.
The ASR processes that zone’s audio and drops intent text with the zone metadata for the NLU, and the NLU decides which skill server covering that named node/zone is most appropriate.
If a skill server needs to announce dialogue, the zone info and text are dropped on the TTS, which directs the outgoing sentence audio to a zone/channel of a distributed audio system.

Everything else is an intent-fed, containerised skill server, and really there is no core, as nearly all operation is on open standards. Routed, zoned intent is the API: no hard-coded application commands, merely MPRIS-like abstractions - pause, stop, next, previous, even volume-control-style intents - that a skill server receives and uses to operate the applications/processes it knows.

There is no core: the system is a collection of loosely linked network containers, driven by intent, that can compress into a single unit or scale across a multi-zone system of many users, but it is purely a repetition of the same building blocks of intent-driven skill servers. There is no need for a core, but I still seem to be waiting for a single skill server that is any good at doing a single job or skill.

Probably a massive oversimplification, but nearly everything we consume from an assistant is intent-driven media, and the MPRIS example @j1nx sent was very apt.
There is a double layer to intent parsing: the first is just a general-purpose NLU intent router that routes the intent to a skill server with a far more specific, skill-based parser, without any need to know about a core in the way I think you describe.
That specific skill server may use Adapt/Padatious, or could equally use Rasa or any other. Maybe that should be the discussion: how to remove all system specifics and abstract to a parser of choice, using basic Linux processes and security contexts that are not brand-specific; how to provide that while escaping from a core; and how to concentrate on providing the killer skill servers, irrespective of the components’ IP, rather than intent parsers that - even with a bit of TLC - could in many respects be seen as lacking compared to the alternatives.

So yeah, not you personally, Åke - I did put the effort in to try to explain, and seriously, no offence was taken - but it sounds to me like much of the same, without the sort of change I was looking for.

What @j1nx said seems perfectly reasonable, and things should be that way and more.
I just shouldn’t have opened my mouth, and it’s me who should apologise.
Best wishes all

PS: I kept writing NLU - for those confused, for some reason I was thinking of a “Natural Language Unit”, but I meant a unit for NLP, where there is a host of existing open source from spaCy to PyTorch-NLP, and where the Mycroft modules seem to exist only to hold IP, as there is much better out there.
That is the point of my MPRIS example: certain modules in Mycroft are weak in every way, from function to community, and only exist as branding IP.
The scope of Mycroft is out of whack with reality - why not reuse the “best of” and actually develop some of those polished skill servers, so it garners an active user base?
It seems strange to suggest dev work on modules that will never be best in class compared to others whilst dev resources are scarce, while not producing best-in-class skill servers - which could easily be accomplished, as what exists isn’t that good.

2 Likes

Just as Linux beat out Solaris, AIX, HP-UX, etc., it is inevitable. The trick is to become the go-to source…

-Mike Mac
1 Like