Is there a basic text/doc reader skill for Mycroft?

krywenko · June 25, 2019, 8:56pm

I was wondering if there is a searchable document reader for Mycroft. The reason I ask is that I have a simple parsing program that processes an Allrecipes link and turns it into a simple recipe document, which I use to save recipes I like. I would not mind being able to have Mycroft read them.

E.g.: hey mycroft read recipe german rouladen
or recipe german rouladen ingredients (and only read the ingredients) or recipe german rouladen directions
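
A minimal sketch of how requests like these could be split into a recipe file name and a section. The name parse_request is made up for illustration, and a real Mycroft skill would more likely rely on its intent handling than on plain string splitting:

```python
def parse_request(utterance):
    """Return (file_name, section) for a spoken recipe request, or None."""
    words = utterance.lower().split()
    if "recipe" not in words:
        return None
    # Everything after the keyword "recipe" names the recipe.
    rest = words[words.index("recipe") + 1:]
    section = "all"
    # An optional trailing word selects one part of the file.
    if rest and rest[-1] in ("ingredients", "directions"):
        section = rest.pop()
    if not rest:
        return None
    # The scraper saves files with dashes instead of spaces.
    return "-".join(rest), section


full = parse_request("hey mycroft read recipe german rouladen")
only_ingredients = parse_request("recipe german rouladen ingredients")
```

Here full is a ("german-rouladen", "all") pair, and only_ingredients carries "ingredients" as its section.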

I suck at Python, otherwise I would write my own skill for it. But I could probably hack an existing document reader skill to make it work, since I could then see how the language is used in that situation.

The basic usage of the Allrecipes scraper is: recipes URL

recipes https://www.allrecipes.com/recipe/25194/german-rouladen/

It processes the web page and produces a file named “german-rouladen”, saved in a folder called AllRecipes in the home directory.

The output is a text file in this format:

german rouladen

 Ingredients
    1 1/2 pounds flank steak
    German stone ground mustard, to taste
    1/2 pound thick sliced bacon
    2 large onions, sliced
    1 16 ounce jar dill pickle slices
    2 tablespoons butter
    2 1/2 cups water
    1 cube beef bouillon
Directions

  Preparation Time
   20 minutes
  Cooking Time
   1 hour 10 minutes
  Ready In
   1 hour 30 minutes
 Cut the flank steak into thin filets; about 1/4 inch thick and 3
   inches wide.
 Generously spread one side of each filet with mustard to taste.
   Place bacon, onions and pickle slices on each filet and form into a
   roll. Use string or toothpicks to hold the roll together.
 Heat a skillet over medium heat and melt butter. Place the rolls in
   the butter and saute until browned.
 Pour in 2 1/2 cups of water and add the bouillon cube; stirring to
   dissolve the bouillon cube. Simmer the rolls for about an hour.
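
A rough sketch of how a file in this format could be split into title, ingredients and directions, so that only one part is read out. The function name parse_recipe and the shortened SAMPLE text are illustrative; it assumes the first non-empty line is the title and that "Ingredients" and "Directions" always appear on lines of their own:

```python
SAMPLE = """german rouladen

 Ingredients
    1 1/2 pounds flank steak
    2 tablespoons butter
Directions
  Preparation Time
   20 minutes
 Cut the flank steak into thin filets; about 1/4 inch thick and 3
   inches wide.
"""


def parse_recipe(text):
    """Split a saved recipe into title, ingredients and directions."""
    sections = {"title": "", "ingredients": [], "directions": []}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if not sections["title"]:
            sections["title"] = line  # first non-empty line is the title
        elif line.lower() in ("ingredients", "directions"):
            current = line.lower()  # a header line switches the section
        elif current:
            sections[current].append(line)
    return sections


recipe = parse_recipe(SAMPLE)
```

Wrapped lines of a direction step stay as separate list entries, which should be fine for reading aloud line by line.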

If you are curious, here is the Bash scraper script. Maybe someone else will find it useful.

#!/bin/bash
URL="$( echo -e $1 | sed 's/\?.*//' )"
FILE="$( echo -e  $URL | tr  '/' ' ' |  awk '{print $NF}' )"
echo  Saved to $HOME/AllRecipes/$FILE
TITLE="$( echo $FILE | tr '-' ' ' )"

lynx $URL -dump |
sed -n '/^Ingredients/,/^Get the magazine/p;/^Get the magazine/q' |
sed '/^Ingredients/,/^   Note: Recipe directions are for original size./{//!d;};' |
sed '/^   You might also like/,/^Directions/{//!d;};' |
sed  's/*//g' |
sed  's/\[//g' |
sed  's/\]//g' |
sed  's/   Note: Recipe directions are for original size.//g' |
sed '/        Add all ingredients to list/,/   You might also like/d' |
sed  's/1\.//g' |
sed  's/Get the magazine//g' |
sed  's/\Prep\>/Preparation Time/g' |
sed  's/\Cook\>/Cooking Time/g' |
sed  's/\C\>/Celsius /g' |
sed  's/\F\>/Fahrenheit /g' |
sed  's/\ m\>/ minutes/g'|
sed  's/\ h\>/ hour/g' |
sed  's/(//g' |
sed  's/)//g' |
sed  's/2\.//g' |
sed  's/3\.//g' |
sed  's/4\.//g' |
sed  's/5\.//g' |
sed  's/6\.//g' |
sed  's/7\.//g' |
sed  's/8\.//g' |
sed  's/9\.//g' |
sed -e '/Directions/{n;N;d}' |
sed '/^$/d' | sed  "1s/^/$TITLE\n\n /" > $HOME/AllRecipes/$FILE
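
For anyone who wants to reuse the naming logic from Python, here is a sketch of what the first lines of the script do: drop the query string, take the last path segment as the file name, and turn dashes into spaces for the title. The function name recipe_names is made up, and the query string in the second call is only an example:

```python
from urllib.parse import urlsplit


def recipe_names(url):
    """Return (file_name, title) derived from a recipe URL."""
    path = urlsplit(url).path  # the path never includes the ?query part
    file_name = path.rstrip("/").split("/")[-1]  # last non-empty segment
    title = file_name.replace("-", " ")
    return file_name, title


plain = recipe_names("https://www.allrecipes.com/recipe/25194/german-rouladen/")
with_query = recipe_names("https://www.allrecipes.com/recipe/25194/german-rouladen/?example=1")
```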

krywenko · June 26, 2019, 2:00am

Okay, I could not find any form of basic text reader, but I looked through a few other skills, and I understand how bedtime-stories-skill works for the most part. As I said, my Python skills suck, but it looks like I could edit this skill to work for my needs.

But I do not know what function one would use to read a text file.
The bedtime-stories skill uses this import for MP3 playback:

 from mycroft.util import play_mp3

and these lines to play the MP3 file:

self.process = play_mp3(story_file)
 and 
self.process = play_mp3(score[0])

What would I use to import a text file, and what would I use to read it? I cannot seem to find any mention of a simple text file import or play method.

gez-mycroft · June 26, 2019, 6:24am

Hey, looks like it will be a great Skill

If you have a list of phrases to speak eg
['Ingredients are', '1 1/2 pounds flank steak', ...]

Then you’re probably just looking for the speak method:

for phrase in recipe:
    self.speak(phrase)

Dominik · June 26, 2019, 1:21pm

You may want to have a look at the Cocktail skill, which reads recipes (from a web database).

The bedtime-story skill works more like an audio-book reader (it pulls MP3 audio files from a website with audio books and plays them through the MP3 player).

krywenko · June 26, 2019, 4:41pm

thank you very much for the reply

I figured it out somewhat, using the bedtime-stories skill as a template.

basically change this

  self.process = play_mp3(story_file)

to

filepath = recipe_file
with open(filepath) as fp:
    for cnt, line in enumerate(fp):
        self.speak("{}".format(line))

and this:

  self.process = play_mp3(score[0])

to this

filepath = score[0]
with open(filepath) as fp:
    for cnt, line in enumerate(fp):
        self.speak("{}".format(line))

It reads at a nice pace. Maybe I will place a slight delay between line reads to slow it down a little.
Too bad Mycroft does not understand fractions, so I will probably have to modify my processing to convert fractions, e.g. 1/2 to half. I also want the Allrecipes scraping script to update the Mycroft playlist.

Perhaps I could figure out how to make the skill get new recipes through verbal commands and add them to the library, but that is above my pay grade, as I have no clue how to do that.
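
On the pacing idea in this post, a small sketch of a reading loop with an adjustable pause. The names speak_lines, pause and sleep are made up; the speak argument stands for whatever does the talking (self.speak inside a skill). I have not tested how a sleep between calls interacts with Mycroft's speech queue, so treat the pause as an assumption; as far as I know, mycroft.audio also offers a wait_while_speaking() helper that may be a better fit:

```python
import time


def speak_lines(lines, speak, pause=0.5, sleep=time.sleep):
    """Speak each non-empty line, pausing after each one.

    Returns the number of lines that were spoken.
    """
    spoken = 0
    for line in lines:
        text = line.strip()
        if not text:
            continue  # nothing to say for blank lines
        speak(text)
        sleep(pause)
        spoken += 1
    return spoken


# Stand-in for self.speak so the sketch runs outside Mycroft.
said = []
count = speak_lines(["german rouladen\n", "\n", " Ingredients\n"], said.append, pause=0)
```

Inside the skill this would presumably be called as speak_lines(fp, self.speak) within the with open(...) block.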

gez-mycroft · June 26, 2019, 11:15pm

Great to hear it’s working out

We have a range of methods for formatting things like numbers into human readable forms.

Check out the docs on nice_number(), which converts floats to human readable forms.
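
Since nice_number() works on numbers, text such as "1 1/2" from a recipe file would first need to become a float. A minimal sketch of that step using only the standard library; the helper name mixed_to_float is made up, and it assumes the text contains nothing but whole numbers and simple fractions separated by spaces:

```python
from fractions import Fraction


def mixed_to_float(text):
    """Convert '1 1/2', '1/2' or '3' to a float."""
    total = Fraction(0)
    for part in text.split():
        total += Fraction(part)  # Fraction accepts both '1' and '1/2'
    return float(total)


one_and_a_half = mixed_to_float("1 1/2")
quarter = mixed_to_float("1/4")
```

The result could then be handed to nice_number(), e.g. nice_number(mixed_to_float("1 1/2")), though I have not run that inside Mycroft.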

krywenko · June 26, 2019, 11:41pm

I looked at this. How would I format it to work with the for loop that I am using to process the text file?

gez-mycroft · June 27, 2019, 2:54am

I’d probably look at a regex substitution using re.sub

So would look something like

import re 

for cnt, line in enumerate(fp):
    # match whole float so that "1 1/2" returns "one and a half" not "one half"
    # regex equates to "zero-or-more-digits one-or-more-digits/one-or-more-digits"
    regex = '({\d}* {\d}+\/{\d}+)'
    # assuming there is only one instance per line that we care about
    spoken_fraction = nice_number(re.search(regex, line).group(1))
    formatted_line = re.sub(regex, spoken_fraction, line)
    self.speak("{}".format(formatted_line))

Python isn’t my primary language either so anyone feel free to correct this as I haven’t tested it.

krywenko · June 27, 2019, 11:46am

Thanks for your help, but it does not seem to work. At first nice_number was erroring out as not found, until I changed from mycroft.util import play_mp3 to from mycroft.util import nice_number. But now it errors on group: 'NoneType' object has no attribute 'group'. I am guessing it does not like null values.

gez-mycroft · June 27, 2019, 1:59pm

Ah yeah, that makes sense, there’d be plenty of lines with nothing to match. So need to test re.search() to see that it returns something. Maybe something like this within the for loop:

fraction = re.search(regex, line)
if not fraction:
    continue
formatted_line = re.sub(regex, nice_number(fraction.group(1)), line)
self.speak("{}".format(formatted_line))

krywenko · June 28, 2019, 1:58pm

Oh sorry, I never saw this yesterday.
I tried again, but it is still not quite working:

for cnt, line in enumerate(fp):
    regex = '({\d}* {\d}+\/{\d}+)'
    fraction = re.search(regex, line)
    if not fraction:
        self.speak("{}".format(line))
        continue
    formatted_line = re.sub(regex, nice_number(fraction.group(1)), line)
    self.speak("{}".format(formatted_line))

It does not seem that the regex detects the fraction,

so I tried it as

for cnt, line in enumerate(fp):
    regex = '1/2'
    fraction = re.search(regex, line)
    if not fraction:
        self.speak("{}".format(line))
        continue
    formatted_line = re.sub(regex, nice_number(fraction.group(1)), line)
    self.speak("{}".format(formatted_line))

but it crashes as soon as it reaches the 1/2

    formatted_line = re.sub(regex, nice_number(fraction.group(1)), line)
IndexError: no such group

thank you for your help and effort
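
A note on the two errors above, with a sketch of one possible way around them. In Python's re module the braces in '({\d}* {\d}+\/{\d}+)' are matched as literal characters, so that pattern only matches text such as "{1} {1}/{2}" and never a plain "1 1/2". The pattern '1/2' does match, but it has no capture group, so group(1) raises IndexError (group(0) is the whole match). The sketch below uses a pattern with an optional whole number and passes a function to re.sub, which also means lines without a fraction simply come back unchanged. The names FRACTION_RE, replace_fractions and to_words are illustrative, and demo_words is only a stand-in; in the skill, to_words would presumably be nice_number, which I have not tested here:

```python
import re
from fractions import Fraction

# An optional whole number plus a space, then numerator/denominator.
FRACTION_RE = re.compile(r"(?:(\d+) )?(\d+)/(\d+)")


def replace_fractions(line, to_words):
    """Replace every '1/2' or '1 1/2' in line with to_words(value)."""

    def _substitute(match):
        whole, numerator, denominator = match.groups()
        if int(denominator) == 0:
            return match.group(0)  # leave text such as "1/0" untouched
        value = int(whole or 0) + Fraction(int(numerator), int(denominator))
        return to_words(float(value))

    # Lines without a fraction are returned as they are.
    return FRACTION_RE.sub(_substitute, line)


def demo_words(value):
    """Hypothetical stand-in for mycroft.util.nice_number."""
    return "<{}>".format(value)


converted = [
    replace_fractions(text, demo_words)
    for text in (
        "1 1/2 pounds flank steak",
        "about 1/4 inch thick",
        "2 large onions, sliced",
    )
]
```

In the reading loop this would replace the search/continue logic with a single call such as self.speak(replace_fractions(line, nice_number)), assuming nice_number returns a string for a float input.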