Convenience functions suggestions


#1

I know there are some bigger fish to fry, but as I am getting my feet wet in skill development, I have found a couple of cases where it seems like the “right” solution would be a standard convenience function built into mycroft objects. Here are my suggestions:

First, when mycroft speaks an email address, it would be great to have a single function that modifies the string so that it is pronounced in a "natural" way. There is nuance to this: sometimes the best thing is to spell out certain parts (but obviously rarely the TLD, unless it is a country code), and sometimes the domain or the mailbox name is just a concatenation of two simple dictionary words (which should probably just be pronounced). Also, SMTP actually permits several non-alphanumeric characters such as hyphen, plus, period, underscore, and several others. Note that it is not always practical to put in a pre-modified string. What if the email address is supplied by settings.json? So something like this would be great: self.speak('please send email to: ' + self.pronounce_email(self.settings.get("support_email", "")))
Maybe a whole list of special pronounce_ methods.
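
To make the idea concrete, here is a minimal sketch of the simplest part of such a function: replacing the symbols with spoken words. The name pronounce_email and the symbol table are hypothetical (nothing like this exists in mycroft today, as far as I know), and the harder parts described above (spelling out odd fragments, splitting concatenated words) are left out.

```python
# Hypothetical helper, not an existing mycroft function.
# The spoken form chosen for each symbol is an assumption.
SYMBOL_WORDS = {
    "@": "at",
    ".": "dot",
    "-": "dash",
    "_": "underscore",
    "+": "plus",
}


def pronounce_email(address):
    """Return a speakable version of an email address.

    Only the symbols are replaced; runs of letters and digits are
    passed through unchanged for the TTS engine to pronounce.
    """
    spoken = []
    word = ""
    for char in address.strip():
        if char in SYMBOL_WORDS:
            if word:
                spoken.append(word)
                word = ""
            spoken.append(SYMBOL_WORDS[char])
        else:
            word += char
    if word:
        spoken.append(word)
    return " ".join(spoken)


print(pronounce_email("john.doe+help@example.com"))
# john dot doe plus help at example dot com
```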

Second, while I found the scrolling text display code in the IP address skill and got it to work, this seems like just the sort of thing that should happen in a function named something like self.enclosure.auto_scroll_mouth_text('Important, longish info here'), or in an optional mode of the existing mouth_text method, activated by an additional optional parameter. Just call it and forget it. Let it handle timing, enclosure type awareness, resetting, etc.
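
Roughly what I have in mind, as a sketch only. auto_scroll_mouth_text is a hypothetical name, the enclosure method names are the ones I remember the IP address skill using (please check them against the current enclosure API), and the timing constants are guesses rather than measured values.

```python
import time


def auto_scroll_mouth_text(enclosure, text, sec_per_letter=0.65,
                           letters_per_screen=9.0, sleep=time.sleep):
    """Show text on the mouth display, wait while it scrolls, then reset.

    enclosure is expected to provide deactivate_mouth_events, mouth_text,
    activate_mouth_events and mouth_reset (assumed method names).
    """
    # Keep speech animation from overwriting the text while it scrolls.
    enclosure.deactivate_mouth_events()
    try:
        enclosure.mouth_text(text)
        # Rough estimate of how long the whole string takes to scroll past.
        sleep((letters_per_screen + len(text)) * sec_per_letter)
    finally:
        # Restore normal behaviour even if something above failed.
        enclosure.activate_mouth_events()
        enclosure.mouth_reset()
```

Inside a skill the call would then be a single line, for example auto_scroll_mouth_text(self.enclosure, 'Important, longish info here').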

Last, a single function that, when called, checks for the existence of a valid self.settings value and, if none exists, prompts for a value, confirms it, and then writes it to settings.json. I presume that not every setting would be a reasonable fit for this, but some things probably don't really need a trip to home.mycroft.ai (though that option should be kept available).
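
A sketch of the control flow, assuming the settings object behaves like a dict. ensure_setting, ask and confirm are hypothetical names; in a real skill, ask and confirm would presumably wrap the existing voice prompt methods, and actually persisting the value to settings.json is left to the settings object.

```python
def ensure_setting(settings, key, ask, confirm, max_attempts=3):
    """Return settings[key], prompting for and storing it when missing.

    ask(key) returns the user's answer as a string, or None.
    confirm(key, value) returns True when the user accepts the value.
    Returns None if no confirmed value was obtained.
    """
    value = settings.get(key)
    if value:
        return value
    for _ in range(max_attempts):
        answer = ask(key)
        if answer and confirm(key, answer):
            settings[key] = answer  # dict-like store; saving to disk happens elsewhere
            return answer
    return None


# Example with stand-in callables instead of real voice prompts.
settings = {}
email = ensure_setting(settings, "support_email",
                       ask=lambda key: "help@example.com",
                       confirm=lambda key, value: True)
print(email, settings)
```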


#2

These are great suggestions, @jrwarwick. Thank you!
I’m going to ping @forslund to get his opinion here.


#3

Thanks for the suggestion and I agree that this would be extremely useful.

We have the mycroft.util.format module, which contains this kind of function (or links to a language-specific version). See https://github.com/MycroftAI/mycroft-core/blob/dev/mycroft/util/format.py

Your example would become:

from mycroft.util.format import pronounce_email

...

         self.speak('please send email to: ' + pronounce_email(self.settings.get("support_email", "")))

We’d be happy to accept contributions towards this!


#4

I would love an easy way to temporarily switch to a very limited, small vocab (yes, no, 1, 2, 3, 4) with local TTS. That would help everyone create much better and faster (get_)responses, without converse hacking.


#5

Thank you, Kathy and forslund. I am taking a crack at implementing that pronounce_email method using some pretty clever code found on Stack Overflow. One requirement of this method is a simple plaintext wordlist. The first edition of this list is about 1 MB. Where should it be stored? My first thought is /usr/share/dict, but perhaps you want things kept tighter, within the mycroft home directory?
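
For anyone wondering why a wordlist is needed: the general idea is dictionary-based word segmentation, so that a mailbox name like "bluesky" can be spoken as two words. The sketch below is not the Stack Overflow code, just a small dynamic-programming illustration of that idea; segment_words is a hypothetical name and the tiny inline wordlist stands in for the real file.

```python
def segment_words(text, wordlist):
    """Split text into dictionary words, preferring the fewest words.

    Returns a list of words, or None when the text cannot be fully
    covered by words from the list.
    """
    words = set(w.lower() for w in wordlist)
    text = text.lower()
    # best[i] is the shortest list of words covering text[:i], or None.
    best = [None] * (len(text) + 1)
    best[0] = []
    for end in range(1, len(text) + 1):
        for start in range(end):
            piece = text[start:end]
            if best[start] is None or piece not in words:
                continue
            candidate = best[start] + [piece]
            if best[end] is None or len(candidate) < len(best[end]):
                best[end] = candidate
    return best[len(text)]


print(segment_words("bluesky", ["blue", "sky", "a"]))
# ['blue', 'sky']
```

When the function returns None, the caller could fall back to spelling the fragment out letter by letter.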


#6

@tjoen, do you mean local STT? Is the following your proposed scenario? Mycroft intent is activated successfully, but inside the handler, more information is needed. Perhaps a series of yes/no questions, and with the limited possible responses of yes/no, a less sophisticated, but locally executing Speech-To-Text recognition would still have a high probability of correctly distinguishing between responses.


#7

Precisely. Maybe some way to switch, inside the skill, to a very limited subset of pocketsphinx words? Numbers 1 to 10, yes, no, repeat. It might reduce the load on the voice services too.


#8

That sounds like an interesting idea. Maybe also “stop”, “cancel”, and “never mind”.
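
On the skill side, a small lookup table might be enough to turn whatever the local recognizer returns into a canonical answer. This is only a sketch: LIMITED_VOCAB and match_limited_vocab are hypothetical names, and treating "never mind" as a synonym for "cancel" is my own assumption.

```python
# Hypothetical vocabulary table: spoken form -> canonical answer.
LIMITED_VOCAB = {
    "yes": "yes", "no": "no", "repeat": "repeat",
    "stop": "stop", "cancel": "cancel", "never mind": "cancel",
    "one": "1", "two": "2", "three": "3", "four": "4",
    "1": "1", "2": "2", "3": "3", "4": "4",
}


def match_limited_vocab(utterance, vocab=LIMITED_VOCAB):
    """Map a raw transcription to a canonical answer.

    Returns None when the utterance is empty or not in the vocabulary,
    so the caller can decide whether to ask again.
    """
    if not utterance:
        return None
    # Lower-case and collapse repeated whitespace before the lookup.
    cleaned = " ".join(utterance.lower().split())
    return vocab.get(cleaned)


print(match_limited_vocab(" Never  mind "))  # cancel
print(match_limited_vocab("maybe"))          # None
```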


#9

I thought about it some more, and what would make this new function even better is the ability to also set the recording time of the speech. That would also speed up the skill and avoid extra listening time.

Something like:
r = self.get_response_local_stt('what.is.your.answer', validator=None, on_fail=None, num_retries=-1, duration=2)

Let me give an example:
I've been testing with a new trivia skill that asks multiple-choice trivia questions (so you answer with a number from 1 to 4). I was thinking of implementing multiplayer through a configuration on home.mycroft.ai, but having 10 questions for 2 players would result in 20 remote STT requests, each containing just one word. Also, the default recording time of get_response() takes up a lot of extra time when speaking just one word.

https://github.com/tjoen/skill-trivia

Even when testing with 5 questions and one player, some of the STT replies take a lot of time.
Having a (somewhat less reliable) local function would probably be a lot faster, and would result in a much more playable game.


#10

So, I did a small test on the Mark 1 and found that it is actually pretty accurate and fast:

https://github.com/tjoen/local-stt-test

This is using the default pocketsphinx and a limited dict and lm.

You'll have to disable the speech-client before testing.
Still, the results are pretty good. I will also try it on my AIY-picroft version.


#11

@forslund, my proposed implementation will need a plaintext wordlist. Would /mycroft-core/mycroft/util/lang/wordlist_simple_en.txt be an acceptable path for the addition? This file would be under 1 MB.


#12

We're generally against large files in the mycroft repo. I think the best approach may be to download it in the dev_setup.sh script, and a better place for a file like this would maybe be under …mycroft/res/text/en-us/

It's hard to say without seeing the actual content and exactly what you're trying to achieve, but I'm looking forward to seeing it!


#13

Listen once (capture one utterance):

local = LocalListener()
print(local.listen_once())

Listen continuously:

local = LocalListener()
i = 0
for utterance in local.listen():
    print(utterance)
    i += 1
    if i > 5:
        local.stop_listening()

It might be good to trigger naptime before this,

self.emitter.emit(Message('recognizer_loop:sleep'))

and reactivate afterwards:

self.emitter.emit(Message('recognizer_loop:wake_up'))

But the naptime skill will answer with "I am awake", and I'm not sure how best to handle this. @forslund
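
One way to make sure the wake-up message is always sent, even if local listening fails halfway, would be a small context manager. This is only a sketch: recognizer_paused is a hypothetical name, and emit is assumed to be any callable that sends a bus message of the given type. It does not solve the "I am awake" reply.

```python
from contextlib import contextmanager


@contextmanager
def recognizer_paused(emit):
    """Put the main recognizer loop to sleep for the duration of the block.

    emit(message_type) is expected to send a message of that type on
    the message bus (assumed interface).
    """
    emit('recognizer_loop:sleep')
    try:
        yield
    finally:
        # Always wake the recognizer again, even after an exception.
        emit('recognizer_loop:wake_up')


# Stand-in emitter that just records what would be sent.
sent = []
with recognizer_paused(sent.append):
    sent.append('local listening happens here')
print(sent)
```

In a skill, emit could be something like lambda msg_type: self.emitter.emit(Message(msg_type)).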