Mycroft Dev Sync 2021-11-29

Welcome to our newest team member Mike! :tada:


Great discussion on Internet access!

A new/modified skill comes to mind: Hey Mycroft - what is your (Internet|pairing) status?

  • (I am paired with mycroft.ai since {time frame}|I am not paired with mycroft.ai - use {6 characters} to pair)
  • I have been connected to the Internet through {the WiFi network {SSID}|an ethernet cable} for {time frame}
  • I lost my Internet connection through {the WiFi network {SSID}|an ethernet cable} {time frame} ago. I am checking for restored Internet access every {time frame}

So maybe a new log file/DB of when Internet access was gained and lost (and also Linux boots/halts/reboots and device pairing timestamps)?
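That connectivity log could be very small. A minimal sketch of what such an event log might look like, using SQLite — the table layout, event labels, and function names here are hypothetical, not part of any Mycroft API:

```python
import sqlite3
import time

def open_event_log(path=":memory:"):
    """Open (or create) a tiny log of connectivity/boot/pairing events."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        " ts REAL NOT NULL,"      # Unix timestamp of the event
        " event TEXT NOT NULL,"   # e.g. 'net_up', 'net_down', 'boot', 'paired'
        " detail TEXT)"           # e.g. the SSID, or 'ethernet'
    )
    return db

def log_event(db, event, detail=None, ts=None):
    db.execute("INSERT INTO events VALUES (?, ?, ?)",
               (ts if ts is not None else time.time(), event, detail))
    db.commit()

def last_event(db, event):
    """Return (ts, detail) of the most recent event of this type, or None."""
    return db.execute(
        "SELECT ts, detail FROM events WHERE event = ? ORDER BY ts DESC LIMIT 1",
        (event,)).fetchone()
```

A status skill could then answer "how long have you been connected?" with something like `time.time() - last_event(db, "net_up")[0]`.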

Curious - how do the Am*zon|G**gle devices behave when Internet is lost?

Hope this helps.

-Mike M

P.S. Congrats to Mike H on the new position! What does this mean for the future of Rhasspy? (if I may ask)

P.P.S. Related: Smart things are as dumb as their makers. Let's fix that • The Register


At least Amazon is sidewalking their way out of it

So glad to see you all are working on streamlining and organizing boot-up! Also very glad to see that an internet connection is considered non-critical for passing startup. Having a short boot time is very important to the development cycle. Every test I did with shortening the wake word required restarting all services, as the config reload didn’t seem to work for that, so each time I had to wait for all the skills to be checked. It slows the development cycle waiting for ‘unnecessary’ things in startup.

Then I tried removing several skills by just moving their directories. But of course they were redownloaded, and disconnecting from the network froze the boot-up. I’d really like to see an offline version, which you seem to be hinting at for after completing the Mark II. I was working on a Docker version to this effect. And/or a portable installation that could be put on a thumb drive, say, and work at some basic level. Or an app that works in airplane mode.

I see that is going to take some of the alternative backend solutions. The suggestions from @JarbasAI seem best/easiest? (Offline mode with Home Assistant? - #4 by JarbasAl) I would also recommend removing the need for pairing (that post mentions flags for it, but I haven’t worked out how to implement them yet)… I understand the intention of the repeating request on startup: as a simple one-step voice install that doesn’t even need a mic, and as feedback that it’s up and running. But maybe just asking if the user “would like to pair” or “would like to connect”, or adding an “if you would like to run locally, say cancel” option, would be useful. Oh, wasn’t there a setting in the config file for that… I’ll have to look.

Anyway, my last comment was that another great improvement to functionality would be twofold… I recently tried Serenade.ai and loved that it needed no wake word; it has pause/listen instead. It also has some desktop commands, which is a major goal of mine. Similarly, a little project called nerd-dictation just types everything, always, using Vosk. For a desktop assistant, a pause/listen feature is the way to go. Perhaps I should add that as a request if it isn’t already.

The second comment is to ask for an adjustable listen length (maybe it exists?). Despite the wonders of Google Assistant, possibly being able to adjust this with Mycroft is a great example of why I’m contributing to this project. A simple (in concept) wake-word–command–end-word scheme would be incredible here. Having to rush figuring out exactly how to say something in three seconds seems very contrary to natural speaking or conversation. The standard radio syntax is callsign–message–over; I think that would work wonders here. Sure, it takes a bit more storage and processing, but I think its value far outweighs those costs for all but very minimally resourced systems. And sure, a timeout or an ‘are you finished?’ would be prudent, as in any regular conversation.
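On the adjustable listen length: if I remember right, mycroft-core’s default mycroft.conf has a `listener` section with recording timeouts that may already cover part of this. A sketch of a user-config override — the key names are from my recollection of the default config and the values are illustrative, so verify both against the mycroft.conf shipped with your installed version:

```json
{
  "listener": {
    "recording_timeout": 10.0,
    "recording_timeout_with_silence": 3.0
  }
}
```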

You may want to blacklist the non-crucial skills in your user config (since the system config gets overwritten at the moment):

{"skills": {"blacklisted_skills": ["<yourblacklistedskill>", ...]}}

This doesn’t circumvent the redownloading, but the skill won’t be loaded.

In that spirit I have another grievance: mycroft-stop often results in a process kill (of the skills process; prolonging the kill timeout won’t help), which requires a reboot to recover.

General unhelpful comment: Blacklisting is a TERRIBLE way to handle skill removal.

It’s not about removal, it’s about not loading. And I’d guess he will find it helpful instead of doing …

Well, it’s a terrible way to handle skills in general. Removing a skill should be sufficient, tbh, and there should be a better tool to manage skills (installed, deleted, pending, search, etc.) vs. having to do whatever that silliness is to get rid of them.

Yeah, we’ve been talking about changing the “default” Skills behaviour. There are some Skills that you really do need like Volume, where if it didn’t exist you would have a bad time. But most of them are just the suggested base Skills.

So yeah, we’re definitely going to change that.

@auwsom something that might be useful if you’re wanting faster iteration is to only restart the service you’re working on. So if you want the Audio Service to be completely rebooted you can run mycroft-start restart audio
The Skills Service is by far the longest to boot, and restarting just audio is really quick.

My prediction is that we’ll do a completely offline version, we’re just very focused on the base experience using the Mark II at the moment. One of our challenges is that there are so many possibilities of things to work on, but we really need the base experience to be solid first. That means that people can pull it out of the box, set it up quickly and easily, use it in real world contexts, interrupt it, ask it multiple things, tell it to shutup, and have all of that work seamlessly.

The same goes for the desktop experience really. I think the user experience of a desktop assistant is quite different to a counter-top assistant like the Mark II. You have different needs for it, and to make that great it’s going to need attention that we just can’t give it right now. We’d rather make the Mark II exceptional, then transfer those learnings to the next platform / environment.

The length of a possible utterance and how we detect when someone might be finished talking is an interesting problem too. I think this will be more interesting as we move to streaming STT as you could get a sense of whether the utterance received so far seems incomplete or not, but continue to transcribe. Imagine something simple like:

“turn on the…” [few seconds pause] “bathroom light”

At the moment we’d likely only get the first half and so Mycroft would probably reply that it doesn’t understand. At best we’d understand that they want to turn something on and would reply with something like “Sorry, what did you want me to turn on?”
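One naive way to picture this: with streaming STT you could extend the listening window whenever the partial transcript looks unfinished. A purely illustrative sketch — real systems would rely on the decoder’s own stability/endpointing signals rather than a hand-made word list, and all names below are made up:

```python
# Words that rarely end an English command; a dangling one suggests
# the speaker isn't done yet ("turn on the…").
DANGLING_WORDS = {"the", "a", "an", "to", "on", "of", "in", "my", "and", "turn"}

def looks_incomplete(partial_transcript: str) -> bool:
    """Crude check: does the partial transcript end mid-phrase?"""
    words = partial_transcript.strip().lower().split()
    return bool(words) and words[-1] in DANGLING_WORDS

def should_keep_listening(partial_transcript: str, silence_s: float,
                          soft_timeout: float = 2.0,
                          hard_timeout: float = 6.0) -> bool:
    """Extend the listening window only if the text looks unfinished."""
    if silence_s >= hard_timeout:
        return False          # give up eventually no matter what
    if silence_s < soft_timeout:
        return True           # ordinary pause, keep streaming
    return looks_incomplete(partial_transcript)
```

With this, “turn on the” followed by a three-second pause keeps the mic open, while “turn on the bathroom light” followed by the same pause ends the utterance.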

Streaming STT also has benefits in terms of response times. The longer we record for, the longer the STT transcription takes, and the slower the system feels overall.

The wake-word > command > end-word format is an interesting idea. I’m sure it’s not everyone’s cup of tea but I think Ken was playing around with something similar for his own purposes.

Thanks for the tip on restarting a single service. (The config reload seems like it should supersede that? Or at least tell you when a service needs reloading?) I’ll have to ask Ken if he has any tips on the radio-syntax version. Does he have a handle on here?

I hear you on keeping focused on MVP!:+1:

Thanks for the tip. I don’t mind a quick and dirty workaround.

@baconator I didn’t see a suggestion on how to do this otherwise? (Especially one that prevents redownloading?)

Hi @mike99mac, sorry for the delay in my response!

I see Rhasspy’s key feature as being fully functional offline, and this is what I’d like to bring to a future version of Mycroft (for English and beyond).

My ultimate hope is to join our communities and combine our efforts. The biggest limiting factor of Rhasspy at this point is me. Mycroft has the infrastructure to do what I could never do without being independently wealthy: run a skill store, manufacture a device, and provide real support.

I’ve already begun work on Mimic 3 (an evolution of larynx). Expect to see offline speech to text options for the Mark II once we get the basic software in working order :slight_smile:


Text-to-speech, that is :crazy_face:

Both, actually :slight_smile:

For skills that have fairly fixed grammars (alarms, timers, change volume, etc), offline speech to text is totally viable.
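The reason fixed grammars help is that a handful of phrase templates can cover a whole skill’s input space, so the recognizer only has to distinguish a small vocabulary. A hypothetical sketch of the grammar-matching side (the intent names and patterns are invented for illustration, not taken from any Mycroft skill):

```python
import re

# A tiny fixed grammar: each entry maps a phrase template to an intent.
GRAMMAR = [
    (re.compile(r"set (?:a |an )?timer for (\d+) (second|minute|hour)s?"),
     "timer.set"),
    (re.compile(r"(?:set|change) (?:the )?volume to (\d+)"),
     "volume.set"),
    (re.compile(r"cancel (?:the )?(?:alarm|timer)"),
     "timer.cancel"),
]

def match_intent(utterance: str):
    """Return (intent, captured slots) for an utterance, or (None, ())."""
    text = utterance.strip().lower()
    for pattern, intent in GRAMMAR:
        m = pattern.fullmatch(text)
        if m:
            return intent, m.groups()
    return None, ()
```

Because every acceptable utterance is enumerable up front, an offline engine (Vosk, for example, accepts a restricted word list) can be constrained to just these words, which keeps accuracy high on modest hardware.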


@synesthesiam - thanks for the reply - sounds good. The future is bright!
-Mike M


I hope they pay you enough😉

While you are at it, let’s have a look at precise as well🤣

No complaints :slight_smile:

And precise is on my radar!