Develop dictation skill

Hello.
I am trying to develop a skill to be used on a PC.
I want to dictate, and Mycroft should write the text in the currently opened application.

Basically, the skill should accept:

  • a “start dictation” command that enters some kind of “dictation mode”
  • while in dictation mode, every spoken sentence is sent directly to a “type” command
  • a “stop dictation” command that exits dictation mode

Now, I cannot figure out how to handle such “dictation mode” with a Mycroft skill.
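To make the idea concrete, here is a minimal sketch of the mode-switching logic as a plain Python class. In a real skill you would subclass `MycroftSkill`, register the “start/stop dictation” intents, and override the skill’s `converse()` method, which lets an active skill intercept every utterance before normal intent parsing. The class name and the `type_text` callback below are hypothetical placeholders, not part of any existing skill:

```python
class DictationMode:
    """Toy model of a dictation-mode skill: start/stop handlers flip a
    flag, and converse() intercepts utterances while the flag is set.
    `type_text` stands in for whatever sends keystrokes (e.g. xdotool).
    """

    def __init__(self, type_text):
        self.active = False
        self.type_text = type_text  # callback: str -> None

    def handle_start_dictation(self):
        # Would be bound to a "start dictation" intent in a real skill.
        self.active = True

    def handle_stop_dictation(self):
        self.active = False

    def converse(self, utterance):
        """Mycroft offers each new utterance to an active skill's
        converse() before intent parsing; returning True consumes it."""
        if not self.active:
            return False  # not dictating: let normal intents run
        if utterance.strip().lower() == "stop dictation":
            self.handle_stop_dictation()
        else:
            self.type_text(utterance)
        return True  # swallow the utterance while dictating
```

The point of routing everything through `converse()` is that while dictation is active, no spoken sentence ever reaches the normal intent parser, so ordinary commands cannot fire by accident.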

(P.S. Maybe such a skill already exists?)

Thank you

I have been pursuing this goal for years now. The problem is that Mycroft stops “hearing” after a significant pause, so we need to find a way to handle that properly. Beyond that difficulty, I don’t really know how Mycroft could entirely replace the keyboard and “be aware” of which application it should type into (a LibreOffice document, a notepad, or a simple text box in any application). I guess that will be very difficult, and you’ll need to dig deep into each desktop environment to find out where the cursor is focused at that moment.

@JarbasAl wrote a dictation skill years ago (deprecated, unmaintained and unsupported, so there is no need to ask him for help; he has stated the skill is not for use), but you can take a look at it if you want some inspiration.

I have tinkered with PyUNO, the OpenOffice/LibreOffice API, and perhaps we can do something that way, at least within the LibreOffice/OpenOffice suite.
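For reference, the classic PyUNO pattern for pushing text into a running Writer document looks roughly like this. It is an untested sketch: it assumes LibreOffice was started listening on a socket (the port and connection string below are conventions from the UNO examples, not anything this thread established):

```python
def uno_connect_url(host="localhost", port=2002):
    # UNO URL for a LibreOffice instance listening on a socket.
    return ("uno:socket,host=%s,port=%d;urp;"
            "StarOffice.ComponentContext" % (host, port))

def insert_sentence(sentence, host="localhost", port=2002):
    """Append a sentence at a cursor in the current Writer document.

    Requires LibreOffice started beforehand with something like:
      soffice --writer --accept="socket,host=localhost,port=2002;urp;"
    """
    import uno  # imported lazily so the URL helper stays importable
    local_ctx = uno.getComponentContext()
    resolver = local_ctx.ServiceManager.createInstanceWithContext(
        "com.sun.star.bridge.UnoUrlResolver", local_ctx)
    # Connect to the remote office instance.
    ctx = resolver.resolve(uno_connect_url(host, port))
    desktop = ctx.ServiceManager.createInstanceWithContext(
        "com.sun.star.frame.Desktop", ctx)
    doc = desktop.getCurrentComponent()  # the document in front
    cursor = doc.Text.createTextCursor()
    doc.Text.insertString(cursor, sentence + " ", False)
```

The obvious limitation is the one discussed above: this only works for Writer documents, not for an arbitrary focused text box.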

I’m very interested in this subject, but I think this is a very big project, and if it is even possible, you’ll need help from the Mycroft devs themselves, for the reasons I stated above.

OK, I think that could be a second step. As a first step, I would be happy to open the right program myself (e.g. LibreOffice) and then dictate to Mycroft, which would just send the text to the active application (the Linux “type” program does exactly this, if I am not mistaken).

Sorry, the command is “xvkbd”, not “type”. (Just to clarify, “type” was the alias used by LiSpeak.)

That could be an approach. Another way to obtain the cursor position could be xdotool, for example `watch -n0.1 xdotool getmouselocation`, or perhaps there is a way with D-Bus; I don’t know.
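For the typing side, a thin wrapper around `xdotool type` is probably the simplest route on X11 (Wayland is a different story): it sends keystrokes to whatever window currently has keyboard focus, so there is no need to locate the cursor yourself. A sketch, with the command built as an argument list to avoid shell-quoting issues; `build_type_command` and `dictate` are just hypothetical helper names:

```python
import subprocess

def build_type_command(text, delay_ms=0):
    # `xdotool type` sends the string to the currently focused window;
    # --delay sets the pause between keystrokes in milliseconds.
    return ["xdotool", "type", "--delay", str(delay_ms), text]

def dictate(text):
    """Type `text` into whatever window has keyboard focus right now."""
    subprocess.run(build_type_command(text), check=True)
```

A dictation skill would simply call `dictate(utterance)` from its `converse()` handler for every sentence heard while dictation mode is active.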

I like this idea for a desktop skill. I would recommend looking at on-screen keyboards and how they are used, and letting Mycroft act as a keyboard.
On-screen keyboards know when the user is in a typing area and pop up. Mycroft could do the same: when the user is in a typing area, it could be ready for dictation.

Fairly easy to do… I might give it a spin if I have some free time.


If you want some starting point, I have created a repo here:


As @malevolent wrote, it is not as easy as it seems.

I don’t think it makes sense to use the CommonPlay framework here…

Check my super old implementation using the converse method: https://github.com/JarbasAl/skill-dictation

CommonPlay was just a trick to handle a problem with the sentence “Start dictating”… it’s probably not needed.

Your TODO list must be really huge xD

I would use this voice helper a lot as I’m becoming lazier with the years. Having a voice assistant for writing would boost Mycroft usage in the desktop/bigscreen.