[Productivity] Document writing

Skill name: skill-document-writting
User story:

As a user, I want this Skill to let me dictate easily into a LibreOffice document, so that it can help impaired users (or lazy ones) write a document of any length.
It could be used directly from the Mark/Picroft, generating an .odt file, or as an assistant to LibreOffice Writer (through the Plasmoid?), where you could launch more advanced options and commands and see their effect visually (e.g. select the last word and put it in italics, insert a heading, justify all text or paragraph X, etc.).

What third party services, data sets or platforms will the Skill interact with?
LibreOffice Writer

Are there similar Mycroft Skills already?
The parrot skill has some useful code, as it listens and repeats everything the user says. Instead of repeating it, the new skill should send it to a document until the user says something like "Hey Mycroft, stop dictation".
There is also a Jarbas skill, skill-dictation, which sends you an email with the text you previously dictated. I think that skill is more of a TODO/memo/reminder skill than a typing assistant, but it is the most similar skill at the moment.
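
The parrot-style behaviour described above (keep everything the user says until a stop phrase is heard) can be sketched without any Mycroft code. This is only an illustration: `collect_dictation` and `STOP_PHRASE` are hypothetical names, and a real skill would receive utterances from Mycroft rather than from a list.

```python
from typing import Iterable, List

# Hypothetical default; the thread suggests making the stop phrase configurable.
STOP_PHRASE = "stop dictation"


def collect_dictation(utterances: Iterable[str],
                      stop_phrase: str = STOP_PHRASE) -> List[str]:
    """Gather utterances in order until one ends with the stop phrase."""
    sentences: List[str] = []
    for utterance in utterances:
        cleaned = utterance.strip()
        # Ignore case and trailing punctuation when looking for the stop phrase.
        if cleaned.lower().rstrip(".!?").endswith(stop_phrase):
            break
        if cleaned:
            sentences.append(cleaned)
    return sentences
```

Anything said after the stop phrase is ignored, so the caller can save the returned sentences to the document straight away.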

What will the user Speak to trigger the Skill?
The two minimum commands would be:
Start dictation {{document_name}}: should open for editing a document
Stop dictation: should save and close the document

It could be enhanced with many cool features:
Edit {{document_name}}: should open for editing an existing document located at {{document_path}}
Select last letter(s)/word(s)/paragraph(s): should select the last letter/word/paragraph and wait for an action (format text, delete text, copy/cut/paste text, change to capital letters, etc.)
Undo: should undo the last action
Send {{document_name}} by email to {{email}}
Print {{document_name}} on [default] printer {{printer}}
and so on…
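
To make the phrases above concrete, here is a rough sketch of turning an utterance into an (action, argument) pair with plain regular expressions. The patterns and the `parse_command` name are assumptions made for illustration; a real Mycroft skill would more likely declare these phrases in its intent files than match them by hand.

```python
import re
from typing import Optional, Tuple

# Hypothetical patterns modelled on the trigger phrases listed above.
PATTERNS = [
    ("start", re.compile(r"^start dictation(?: (?P<arg>.+))?$", re.I)),
    ("stop", re.compile(r"^stop dictation$", re.I)),
    ("edit", re.compile(r"^edit (?P<arg>.+)$", re.I)),
    ("select", re.compile(r"^select last (?P<arg>letter|word|paragraph)s?$", re.I)),
    ("undo", re.compile(r"^undo$", re.I)),
]


def parse_command(utterance: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (action, argument) for a known command, or None for plain text."""
    text = utterance.strip()
    for action, pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            # Patterns without an "arg" group simply yield None here.
            return action, match.groupdict().get("arg")
    return None
```

Anything that returns None would be treated as dictated text rather than a command.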

What phrases will Mycroft Speak?
“OK, tell me sentence by sentence what you want in the text, and say ‘Hey Mycroft, stop dictation’ when you’re done.”
“Document {{document_name}}.odt stored at {{document_path}}”

What Skill Settings will this Skill need to store?

  • Document directory path
  • Any custom words or phrases that are not mainstream, such as medical terminology
  • Custom dictation start and stop phrases
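
As a rough illustration of the settings listed above, they could be grouped in a small container like the one below. The class name, field names and default values are all hypothetical; how Mycroft itself would persist them is not covered here.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import List


@dataclass
class DictationSettings:
    # All defaults below are assumptions for illustration only.
    document_dir: Path = Path.home() / "Documents"
    custom_words: List[str] = field(default_factory=list)
    start_phrase: str = "start dictation"
    stop_phrase: str = "stop dictation"

    def document_path(self, name: str) -> Path:
        """Build the .odt path for a document name inside the document directory."""
        return self.document_dir / f"{name}.odt"
```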

Other comments?

This skill would be the hardest to develop and the most feature-rich, but it could help third-party developers build other, simpler skills such as sending emails and chats, creating TODO and Remember The Milk lists, and so on. It would also help Mycroft become a helpful assistant.

Further reading

https://wiki.documentfoundation.org/Development/Extension_Development/Python_Extensions_Development

https://api.libreoffice.org/

There is already a dictation skill: https://github.com/JarbasAl/skill-dictation

It uses email to send the text to you.

A more flexible idea might be to enable Mycroft to register as a USB keyboard. Then the speech parser could return text to be “typed” into the user’s choice of text processor, on any type of computer that supports USB HID.

I have a small text capture keyboard that operates on this principle. I type notes on it while attending seminars/lectures/etc, and when I get home I plug it in, open LibreOffice, (though I could just as easily use Kwriter, AbiWord, or anything else, even MS-Word if I’m feeling masochistic and want to bother firing up the Windows VM) and send the document to the program.

Formatting would not be manageable that way, but the hard part is getting the text in. Formatting can be done later.
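
The keyboard idea above boils down to turning recognized text into USB HID keyboard reports. Here is a minimal sketch, assuming a US layout and the standard 8-byte boot-keyboard report. The function names are made up for illustration, only letters, digits, space and newline are handled, and whether Mycroft hardware can actually act as a USB device is not established in this thread.

```python
from typing import List

LEFT_SHIFT = 0x02      # modifier bit for Left Shift in the report's first byte
RELEASE = bytes(8)     # all-zero report: no keys pressed


def char_to_report(ch: str) -> bytes:
    """Return the 8-byte key-press report for one character (US layout)."""
    modifier = 0
    if ch.isascii() and ch.isalpha():
        if ch.isupper():
            modifier = LEFT_SHIFT
        usage = 0x04 + ord(ch.lower()) - ord("a")   # a..z map to 0x04..0x1D
    elif ch in "123456789":
        usage = 0x1E + ord(ch) - ord("1")           # 1..9 map to 0x1E..0x26
    elif ch == "0":
        usage = 0x27
    elif ch == " ":
        usage = 0x2C
    elif ch == "\n":
        usage = 0x28                                # Enter
    else:
        raise ValueError(f"character not covered by this sketch: {ch!r}")
    # Layout: modifier, reserved byte, then up to six key codes.
    return bytes([modifier, 0, usage, 0, 0, 0, 0, 0])


def text_to_reports(text: str) -> List[bytes]:
    """Emit a press report followed by a release report for every character."""
    reports: List[bytes] = []
    for ch in text:
        reports.append(char_to_report(ch))
        reports.append(RELEASE)
    return reports
```

On a board with USB gadget support, each report would presumably be written to the HID gadget device, but that part is an assumption and is left out here.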

Yes, I saw it after posting this, because it isn’t listed on the skills repo.

I think the idea of your skill is pretty useful, and properly modified, it could be a perfect skill-send-email :slight_smile:

The idea presented here is more of a “text processor” or typing assistant to help you write long texts, rather than a quick memo sent by email, which is how I see jarbas/skill-dictation.

I like this idea: it would be more “processor-agnostic” and would work in any scenario, but I don’t know if it would even be possible to develop. I mean, if you have a Mycroft device and a separate computer, they can only communicate through the network, so it won’t be easy to do, if it is possible at all.

What about extending Jarbas’ skill-dictation a little to give some additional options that would satisfy your requirements? Maybe something like adding enough vocab to optionally specify final document format to be attached to email (with plain-text in the body as the default)? Maybe a module like this would help: https://pypi.org/project/pypandoc/

The complexity of Mycroft communicating through a network was the major reason I was thinking of HID support to pose as a keyboard. To feed text over the network directly to a word processor or text editor, you would need to install a specialized module tailored to the operating system on the target computer. That makes for a job covering three major platforms, and most of the minor ones would get ignored. Convince Mycroft to be a keyboard, and the whole mess goes away: any system that recognizes it can receive text. This does assume that it is possible to program HID emulation in software on Mycroft; if it is not, then my suggestion is pointless.

On the other hand, for Linux, Mac, and other Unix-derived systems, Mycroft could SSH in and simply cat > filename.txt then send the text and close the connection. Less automated, but faster than typing by hand.
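
The SSH route above could look roughly like this from Python. It is a sketch under assumptions: the helper names are invented, the `ssh` client is assumed to be installed with key-based login already set up, and the host and path in the usage note are placeholders.

```python
import shlex
import subprocess
from typing import List


def build_ssh_command(host: str, remote_path: str) -> List[str]:
    """Build the ssh invocation that writes stdin to a file on the remote host."""
    # shlex.quote protects the path from the remote shell (spaces, quotes, etc.).
    return ["ssh", host, f"cat > {shlex.quote(remote_path)}"]


def send_text(host: str, remote_path: str, text: str) -> None:
    """Pipe the dictated text to the remote 'cat' and fail loudly on errors."""
    subprocess.run(build_ssh_command(host, remote_path),
                   input=text.encode("utf-8"), check=True)
```

Calling `send_text("user@desktop", "filename.txt", "some dictated text\n")` would then replace the manual `cat > filename.txt` session.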

Using Mycroft as a complete replacement for a keyboard, while far cooler, is I think far more complex: you need to bear in mind the OS, the desktop and a handful of other considerations so as not to leave anyone in the lurch. Using it as a typing assistant for a rich-text editor like LibreOffice Writer, which is already multiplatform, should be easier and more specific, unless making Mycroft behave as a USB device over the network (using a USB over IP protocol, perhaps?) is easy to implement, which I don’t know.

Again, I’m thinking of a LibreOffice Writer extension that lets Mycroft and LibreOffice communicate with each other.

I had in mind to use the LibreOffice format, as it’s a multiplatform application that is well known to the general public. After some research, I found that LibreOffice can be scripted with Python through its API.

Perhaps some talented developers in the community understand this better than I do and can see the pros and cons of what I’m saying.

After doing a bit of research, I was able to send text from a Python script through PyUNO and the Open/LibreOffice API (see the new links on the main post). I’m not sure whether we can open the socket to listen on the whole network and not just localhost, but it seems promising.
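
For the localhost question above, the only things that would change are the host in the option passed to soffice and the host in the URL the script resolves. A small sketch with hypothetical helper names; that LibreOffice will accept remote connections when given a non-local host (for example 0.0.0.0) is an assumption here, not something tested in this thread.

```python
def accept_option(host: str = "localhost", port: int = 2002) -> str:
    """Command-line option that makes soffice listen for UNO connections."""
    return f"--accept=socket,host={host},port={port};urp;"


def uno_url(host: str = "localhost", port: int = 2002) -> str:
    """URL to pass to the UnoUrlResolver on the connecting side."""
    return f"uno:socket,host={host},port={port};urp;StarOffice.ComponentContext"
```

The desktop would start Writer with `accept_option("0.0.0.0")`, and the Mycroft side would resolve `uno_url()` with the desktop's address filled in.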

Pretty cool! That does look promising. If that approach becomes too difficult, and you don’t mind the external dependencies and the possible privacy risks, you could try uploading the plain text to Google Docs (https://developers.google.com/drive/v3/web/manage-uploads), then immediately turn around and download it again via the export_media call with mimeType set to “application/vnd.oasis.opendocument.text” (https://developers.google.com/drive/v3/web/manage-downloads).
Keep us posted on your progress.

Awesome work, @malevolent

Hey @malevolent, how’s this coming along? I’ve been given a laptop with dictation software by my uni (I’m dyslexic), but I would prefer to be able to use Mycroft, as the current software only runs on Windows :frowning: If you need any help, I’d be happy to pitch in.

Hello @Tasty213,

I have this completely stalled.

Right now I’m very busy making Mycroft understand Spanish by creating the parsers. Honestly, it’s hell getting it to work, because Spanish has plenty of exceptions and some peculiarities that other languages don’t have…

Perhaps you (or anyone) can pick up the idea and start developing it, because when I finish with the Spanish parser, I would like to give Mycroft some Spanish skill translations… Besides, I’m just a Linux sysadmin, so even if I could build the document writing skill, it would surely take me a lot more time than a proper Python developer (whose code would probably also be neater).

Can you send a link to what you’ve made already, so I have somewhere to start from?

Well, besides the PoC shown here, I haven’t done anything yet. It seems we need to dig into PyUNO (the Python bridge to UNO); unfortunately, the Open/LibreOffice SDK documentation is pretty awful, and the few examples and docs mostly refer to Java and C++.

There are, however, some interesting bloggers out there with some nice posts, and a proper developer (unlike me) should be able to make headway.

The link I posted before, https://onesheep.org/scripting-libreoffice-python/, has the example I followed to make hello_world.py and, at the end, some useful links.

Basically, we need to start LibreOffice Writer listening on a socket. You can do it in one terminal (as in my screenshot) or inside the Python script, and then invoke the uno methods.

$ soffice --writer "--accept=socket,host=localhost,port=2002;urp;"

import uno

# Connect to the running Writer instance through the UNO socket bridge.
localContext = uno.getComponentContext()
resolver = localContext.ServiceManager.createInstanceWithContext("com.sun.star.bridge.UnoUrlResolver", localContext)
context = resolver.resolve("uno:socket,host=localhost,port=2002;urp;StarOffice.ComponentContext")
desktop = context.ServiceManager.createInstanceWithContext("com.sun.star.frame.Desktop", context)

# Grab the document that currently has focus and write into it.
model = desktop.getCurrentComponent()
text = model.Text
cursor = text.createTextCursor()
text.insertString(cursor, 'Hello world!', 0)
text.insertString(cursor, '\n', 0)
text.insertString(cursor, 'Another line!', 0)

Hope you can start from this; I know it is a very poor starting point. Let me know if I can help somehow. Although my developer skills are rusty, I’m learning Python because I want to contribute to Mycroft.

This approach would use the Mycroft Desktop Plasmoid. For the Mark or a Raspberry Pi, which don’t have a proper desktop or GUI, there are also methods to create and fill a document and save it somewhere… although I don’t really see the point of that.

I know that by using

echo 'dictate dictate dictate' >>dictate.txt

you can create a text file from bash. Of course, it would be better if you could do bullets etc., but it’d be quite simple to use whilst developing the LibreOffice bit.
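
The same append-to-a-file idea can be done from Python, which is where a Mycroft skill would be running anyway. A minimal sketch; `append_sentence` and its `bullet` option are hypothetical, and the bullet is just a plain-text "- " prefix.

```python
from pathlib import Path


def append_sentence(path: Path, sentence: str, bullet: bool = False) -> None:
    """Append one dictated sentence as its own line, like `echo ... >> file`."""
    prefix = "- " if bullet else ""
    # Mode "a" creates the file if needed and always writes at the end.
    with path.open("a", encoding="utf-8") as handle:
        handle.write(prefix + sentence.strip() + "\n")
```

For example, `append_sentence(Path("dictate.txt"), "dictate dictate dictate")` does the same job as the echo line above.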

I’ve been playing a little more with PyUNO, and with a simple loop I can concatenate strings:

#!/usr/bin/env python3
import uno
from com.sun.star.text.ControlCharacter import PARAGRAPH_BREAK


def insertTextIntoCell(table, cellName, text, color):
    # Set the colour and the content of a single table cell.
    tableText = table.getCellByName(cellName)
    cursor = tableText.createTextCursor()
    cursor.setPropertyValue("CharColor", color)
    tableText.setString(text)


# Connect to a Writer instance listening on localhost:2002.
localContext = uno.getComponentContext()
resolver = localContext.ServiceManager.createInstanceWithContext("com.sun.star.bridge.UnoUrlResolver", localContext)
context = resolver.resolve("uno:socket,host=localhost,port=2002;urp;StarOffice.ComponentContext")
desktop = context.ServiceManager.createInstanceWithContext("com.sun.star.frame.Desktop", context)
model = desktop.getCurrentComponent()
text = model.Text
cursor = text.createTextCursor()

while True:
    texto = input("Enter text (type 'enter' for a carriage return, 'table' for a table or 'quit' to exit): ")
    if texto == "quit":
        break
    elif texto == "table":
        # input() returns strings, but initialize() expects integers.
        rows = int(input("How many rows: "))
        columns = int(input("How many columns: "))
        table = model.createInstance("com.sun.star.text.TextTable")
        table.initialize(rows, columns)
        # Untested assumption: the table only appears once it is inserted into the text.
        text.insertTextContent(cursor, table, 0)
    elif texto == "enter":
        text.insertControlCharacter(cursor, PARAGRAPH_BREAK, 0)
    else:
        text.insertString(cursor, texto, 0)

In the code above I also included code to insert a table and put text into it (I still haven’t got table insertion working). I still need to work out how to move the cursor to apply styles and so on, and obviously how to make Mycroft listen instead of typing into a terminal, which I don’t yet know how to do, but this is a bit encouraging.
UNO is a component model rather than a scripting language, but the OOo API seems powerful enough to do many (if not all) kinds of things with documents.

There are some python examples https://wiki.openoffice.org/wiki/PyUNO_samples

Oh nice work! You’re thinking of a transcription Skill based on the PyUNO library?

Yes, more or less: a transcription skill with extra features. Over time, I would like to have a full-featured text processor, useful for the impaired or the lazy. The starting point I have in mind is handling paragraphs, lines and words… in other words, knowing where the cursor is at any given time, which I don’t yet know how to tackle. I think once we work out where it is, we could easily do things like:

  • Write new paragraph
  • Split paragraph
  • Align text
  • Capitalize first letter of the paragraph or the phrase
  • Color text
  • Change font of the text
  • Search next and previous match
  • Replace matches
  • Create headings
  • Create a dynamic table of contents from headings
  • Print document
  • Save document
  • Convert document to PDF

At this point, we would have a basic word processor. From there we could add, one by one, all the features the Open|LibreOffice API offers, yet many users will find the simple processor useful enough. And if one skill can call another (I don’t know whether it can), we could call send-email-skill or upload-skill (names invented just now) to send the document as an attachment or upload it to Nextcloud-like services.
