Replacing cloud services for STT by local ones

PasabaPorAqui · June 30, 2017, 8:59am

Congratulations on your German Mycroft! Please consider sharing your experience, either by editing the current wiki page or by creating a new one, as you prefer.

As for replacing the remote STT with a local German one, my design decision was to start a new client, lclient.

Modifying the current client had some problems: the Mycroft team is too busy to accept big pull requests, probably because of the stability and regression-testing effort involved; local STT does not seem to be in their business plan; … . I don’t like forks, so I expect to merge back once “lclient” is fully stable and the Mycroft team has time for this improvement.

I see you have a very good level of computer science knowledge, so you should not have big problems trying this client.

Note: in recent months I’ve been working more on the hardware side than on the software. I have not kept up with the latest Mycroft versions, so some issues could appear.

(Why hardware now: I do not imagine Mycroft as an object to buy and place in a room, but as a hidden, embedded AI present throughout the house. So I’m working on distributed microphones/speakers, integration with Z-Wave and Zigbee devices, … ).

tfontoura · June 30, 2017, 4:15pm

I agree with you. I haven’t figured out how to do it yet, though, as I don’t believe the Zigbee protocol can transport voice. Maybe by putting Pis in every room as slaves?

PasabaPorAqui · June 30, 2017, 8:42pm

Z-Wave and Zigbee are protocols for actuator devices.

Microphones could be Bluetooth, with the newest class having a range of 10 meters (the usual low-priced ones have a range of only a few meters), or devices based on Arduino and some XBee card.

This is ongoing; I must admit I have not yet found a valid solution.

The good news is that RTP-based devices work correctly, so Mycroft can be commanded from ordinary smartphones.

newone · July 2, 2017, 8:25pm

I have a similar situation here. I am sticking with English for my initial tests, and I get errors when pointing the remote server (mycroft.ai) to localhost. I thought we did not need the remote server when using pocketsphinx?

2017-07-02 22:29:32,538 - mycroft.client.speech.mic - DEBUG - Thinking...
2017-07-02 22:29:32,538 - SpeechClient - INFO - Wakeword Detected: hey mycroft
2017-07-02 22:29:32,548 - requests.packages.urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): localhost
2017-07-02 22:29:32,549 - requests.packages.urllib3.connectionpool - DEBUG - http://localhost:80 "POST /v1/stt?lang=en-US&limit=1 HTTP/1.1" 404 None
2017-07-02 22:29:33,491 - mycroft.client.speech.mic - DEBUG - Waiting for wake word...

Wake word ‘hey mycroft’ works well by the way.

Do we really need a web server to handle local STT with pocketsphinx, as the logs above suggest?

@PasabaPorAqui: How do you handle this?

PasabaPorAqui · July 3, 2017, 10:35am

One of the critical features of pocketsphinx is that it continuously monitors the ambient noise captured by the microphone. With this information, it reconfigures some filters to improve recognition rates.

With the official Mycroft client, wake word detection (which runs continuously except while a command or speech is being handled) is done completely independently of the command STT translation. This works against the feature described above.

For this and some other reasons, I decided to create a local client (“lclient”) separate from the official “client”. See previous messages for the source link.

Thorsten · July 3, 2017, 6:54pm

If it’s okay with you, @PasabaPorAqui, I would prefer to edit your wiki page. If I created a “German” wiki page it would be 90% copy and paste. So maybe it’s better to have one “international” wiki page that points out the language-dependent differences.

Thorsten · July 3, 2017, 6:55pm

I’m waiting for delivery of my Matrix Voice. This will be my mic, and Mycroft will be hosted on an Ubuntu server.

PasabaPorAqui · July 4, 2017, 8:18am

About the wiki: OK, perfect, modify the current one; that way we will have a more complete page.

About your Matrix Voice: it seems very interesting. Please consider sharing your experience (wiki page, post, or …) once you think you have tested it enough (better not in this thread, I think, because it is too different from the original topic).

Thorsten · July 4, 2017, 8:05pm

@PasabaPorAqui I updated your wiki page. Since I have worked on that wiki page for nearly 2 hours now, please feel free to review it ;-). Comments and updates are welcome.
I did not change its title, so as not to break existing hyperlinks to the wiki.

newone · July 5, 2017, 4:11am

Great, maybe we can merge the French and Italian pages too, as they only differ in the download section?

The crucial part of this thread is the local STT server and having a working wiki example for it.

@PasabaPorAqui: Should we clone your complete mycroft-core repository for this purpose, since it is already adapted (i.e. mycroft.conf points to the local server and uses your lclient), or should we clone the official repository and copy over the files from your lclient folder?

PasabaPorAqui · July 5, 2017, 1:37pm

A simple solution could be:

download only “lclient” directory # local client
mv client rclient # remote client
ln -s lclient client

This way, switching from one client to the other is as easy as pointing the symlink at “rclient” or “lclient”.
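For anyone who prefers to script the switch, here is a minimal Python 3 sketch of the same idea. It is only an illustration: the function name `switch_client` and the `parent_dir` argument are my own, and it assumes “lclient” and “rclient” already exist as sibling directories next to the “client” symlink.

```python
import os


def switch_client(parent_dir, target):
    """Point the 'client' symlink in parent_dir at 'lclient' or 'rclient'.

    Hypothetical helper, equivalent to: rm client && ln -s <target> client
    """
    if target not in ("lclient", "rclient"):
        raise ValueError("target must be 'lclient' or 'rclient'")
    if not os.path.isdir(os.path.join(parent_dir, target)):
        raise FileNotFoundError("missing directory: " + target)

    link = os.path.join(parent_dir, "client")
    if os.path.lexists(link):
        # Refuse to delete a real directory; it must be renamed to rclient first.
        if not os.path.islink(link):
            raise RuntimeError("'client' is not a symlink; rename it to 'rclient' first")
        os.remove(link)

    # Relative link, same as `ln -s lclient client` run inside parent_dir.
    os.symlink(target, link)
    return os.readlink(link)
```

Calling `switch_client(path, "rclient")` would bring back the official client, and `switch_client(path, "lclient")` the local one.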

newone · July 6, 2017, 6:52am

Hello @PasabaPorAqui,

What about the subfolder lspeech within your client folder (lclient)? Is it used automatically, or do I need to use symbolic links there too?

I have used PR184 from the mycroft-core repository, which may also modify client files. Maybe this is not a good idea.

newone · July 7, 2017, 4:02am

@PasabaPorAqui I guess you are starting lspeech/main.py via the ‘start.sh voice’ command?

There have been so many changes to mycroft-core in the last two months that I would advise anyone who wants to try your code to clone your complete repository. Maybe you can update your repository sometime in the future to match those changes?

First of all, we need to confirm that this works well for others.

PasabaPorAqui · July 8, 2017, 10:33am

I have updated my branch with the latest official source; the first regression test passed.

The differences between my branch and the official one are:

- addition of the “mycroft/client/lspeech” directory with the local client
- in start.sh, changed the line:

"voice") SCRIPT=${TOP}/mycroft/client/speech/main.py ;;

to:

"voice") SCRIPT=${TOP}/mycroft/client/lspeech/main.py ;;

Changes not related to local STT:

- mycroft/skills/multi_thread_skill.py: optional base class for skills, with some improvements.
- msm is removed (to enforce locality)

I do not recommend cloning my own branch; I do not guarantee any stability or continuity. Clone the official one and manually merge the changes listed above.

Changes in configuration (in bold, mandatory ones for local STT)

"lang": "es",
…
"url": "",
"update": false
…
"wake_word": "vivienda",
"threshold": 1e-20,
"standup_word": "vivienda",
"standup_phonemes": "b i b i e n d a",
"standup_threshold": 1e-30,
"producer": "pocketsphinx",
"grammar": "jsgf",
"wake_word_ack_cmnd": "aplay /home/pma/actual/tools/R2D2a.wav",
"msg_not_catch": false,
"debug": true
…
"module": "espeak",
"espeak": {
    "lang": "es",
    "voice": "m1"
}

As you can see, in order to increase the recognition success rate, I initially use a restricted grammar rather than free speech, stored in the file “es.jsgf”. Skills can switch this grammar to any other one while they run. The current content is:

#JSGF V1.0;

grammar prueba;

public <prueba> = <cmnd1> | <cmnd2> | <cmnd3> | <cmnd4> | <cmnd5> ;
<cmnd1> = apaga la música ;
<cmnd2> = pon música ;
<cmnd3> = avisa <when> | avisa ;
<cmnd4> = graba un mensaje ;
<cmnd5> = televisión pon canal <n_0_100> ;
<when> = en <n_0_100> ( minuto | minutos ) ;

<n_0_9> = cero | un | dos | tres | cuatro | cinco | seis | siete | ocho | nueve ;
<n_10_29> = diez | once | doce | trece | catorce | quince | dieciséis | diecisiete | dieciocho | diecinueve | veinte | veintiuno | veintidós | veintitrés | veinticuatro | veinticinco | veintiséis | veintisiete | veintiocho | veintinueve ;
<n_10n> = treinta | cuarenta | cincuenta | sesenta | setenta | ochenta | noventa ;
<n_0_100> = <n_0_9> | <n_10_29> | <n_10n> [y <n_0_9>] ;

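To make the number rules easier to follow, here is a small Python 3 sketch that mirrors `<n_0_9>`, `<n_10_29>`, `<n_10n>` and `<n_0_100>` and turns an accepted phrase into an integer. It is not part of lclient; `parse_number` is a hypothetical helper written only to illustrate what the grammar accepts. Reading the rules as written, they seem to cover 0 to 99 (there is no “cien”), and forms such as “treinta y cero” are also accepted.

```python
# Word lists copied from the JSGF rules above.
N_0_9 = ["cero", "un", "dos", "tres", "cuatro", "cinco",
         "seis", "siete", "ocho", "nueve"]
N_10_29 = ["diez", "once", "doce", "trece", "catorce", "quince",
           "dieciséis", "diecisiete", "dieciocho", "diecinueve",
           "veinte", "veintiuno", "veintidós", "veintitrés",
           "veinticuatro", "veinticinco", "veintiséis", "veintisiete",
           "veintiocho", "veintinueve"]
N_10N = ["treinta", "cuarenta", "cincuenta", "sesenta",
         "setenta", "ochenta", "noventa"]


def parse_number(phrase):
    """Return the integer for a phrase matching <n_0_100>, or None."""
    words = phrase.split()
    if len(words) == 1:
        word = words[0]
        if word in N_0_9:                      # <n_0_9>
            return N_0_9.index(word)
        if word in N_10_29:                    # <n_10_29>
            return 10 + N_10_29.index(word)
        if word in N_10N:                      # <n_10n> without the optional part
            return 30 + 10 * N_10N.index(word)
        return None
    # <n_10n> y <n_0_9>, e.g. "treinta y cinco"
    if (len(words) == 3 and words[0] in N_10N
            and words[1] == "y" and words[2] in N_0_9):
        return 30 + 10 * N_10N.index(words[0]) + N_0_9.index(words[2])
    return None
```

For example, `parse_number("treinta y cinco")` gives 35, while `parse_number("cien")` gives `None` because the grammar has no rule for it.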
newone · July 9, 2017, 8:05am

Thank you for your help, which I appreciate, but this is turning out to be a big mess for me :-)

It starts with installing pocketsphinx using the mycroft-core git repository:
- One has to uncomment the pocketsphinx part in dev_setup.sh so that it installs correctly into the mycroft virtualenv, but you will get further error messages when installing and running:
  - you have to uncomment the TOP directory in scripts/install-pocketsphinx.sh, otherwise it will crash
  - and the speech recognizer is now located in a subfolder called ‘recognizer’, which you need to modify there too
- With the default settings (linked to client/speech), hot word recognition works but everything else fails, as there is no working local pocketsphinx configuration.
- Using your lspeech and configuration modifications 1) - 3): this does not work for English either:
  - one has to link the *.dict and *.lm files to those found in the pocketsphinx model folder, but I am not sure, as the lm is called *.lm.bin in this case. Do we need to name them en-us.{dict,lm,bin} or en.{dict,lm,bin}?
  - it looks like your pocketsphinx audio consumer has Spanish hardcoded?

Independently of that, I am getting error messages like

Traceback (most recent call last):
  File "/home/user/mycroft-core/mycroft/client/lspeech/main.py", line 196, in <module>
    main()
  File "/home/user/mycroft-core/mycroft/client/lspeech/main.py", line 188, in main
    loop.run()
  File "/home/user/mycroft-core/mycroft/client/lspeech/listener.py", line 250, in run 
    self.start_async()
  File "/home/user/mycroft-core/mycroft/client/lspeech/listener.py", line 209, in start_async
    self.config, self.lang, self.state, self)
  File "/home/user/mycroft-core/mycroft/client/lspeech/pocketsphinx_audio_consumer.py", line 76, in __init__
    self.wake_word_ack_cmnd = s.split(' ')
AttributeError: 'NoneType' object has no attribute 'split'

I am still sorting things out. It would be cool if they at least made local recognition work with English in their own git code.

PasabaPorAqui · July 9, 2017, 8:23am

Yes, adding new features to any software can be a bit messy; it is only for programmers who decide to spend enough time on it. The same goes for using and testing “beta” features like this one.

About your points:

- The first one is about installing the official Mycroft; better to handle that as it is. As pocketsphinx is always used by the wake word feature, it should be installed by default.
- I’ve developed and tested local STT in Spanish. English must follow the same steps that were originally done for Spanish, and have now been done for German: install the pocketsphinx package for English, etc.
- The last point is caused by the missing “wake_word_ack_cmnd” entry in the config. I fixed the error so that this entry is now fully optional. The entry tells Mycroft how to respond to the wake word. In my case, after the wake word is recognized, Mycroft makes the famous R2D2 sound.
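As an illustration of how such an entry can be made optional, here is a minimal Python sketch. It is not the actual change in lclient: the function name `get_ack_command` and the `listener_config` dict are assumptions; only the key name “wake_word_ack_cmnd” and the `split(' ')` call come from the traceback and config above.

```python
def get_ack_command(listener_config):
    """Return the wake word acknowledgement command as an argument list.

    Returns None when the entry is missing or empty, so the caller can
    simply skip the acknowledgement instead of crashing on None.split().
    """
    cmd = listener_config.get("wake_word_ack_cmnd")
    if not cmd:
        return None
    return cmd.split(" ")
```

With the entry present, for example `"aplay /home/pma/actual/tools/R2D2a.wav"`, this returns `["aplay", "/home/pma/actual/tools/R2D2a.wav"]`; without it, the result is `None` and no sound would be played.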

PasabaPorAqui · July 9, 2017, 8:35am

@Thorsten: Really good additions to the wiki page. It is a pity we cannot change the title.

Once we have tested it a bit more, we can think about adding a new chapter on the local client or giving it its own wiki page.

newone · July 9, 2017, 10:10am

@PasabaPorAqui Thank you for this easy fix in your code.

I need to figure out the issue below and probably need more debug information. It concerns the English language model and could be related to the dump format used (*.lm.bin), as German and Spanish do not use the dumped format.

ERROR: "pocketsphinx.c", line 233: Cannot redirect log output
Traceback (most recent call last):
  File "/home/user/mycroft-core/mycroft/client/lspeech/main.py", line 196, in <module>
    main()
  File "/home/user/mycroft-core/mycroft/client/lspeech/main.py", line 188, in main
    loop.run()
  File "/home/user/mycroft-core/mycroft/client/lspeech/listener.py", line 250, in run
    self.start_async()
  File "/home/user/mycroft-core/mycroft/client/lspeech/listener.py", line 209, in start_async
    self.config, self.lang, self.state, self)
  File "/home/user/mycroft-core/mycroft/client/lspeech/pocketsphinx_audio_consumer.py", line 82, in __init__
    self.decoder = Decoder(self.create_decoder_config(model_lang_dir))
  File "/home/user/.virtualenvs/mycroft/local/lib/python2.7/site-packages/pocketsphinx/pocketsphinx.py", line 271, in __init__
    this = _pocketsphinx.new_Decoder(*args)
RuntimeError: new_Decoder returned -1

On the other hand, the wiki seems to refer to standard Linux installation paths for pocketsphinx like /usr/share/pocketsphinx. Why not use mycroft-core/pocketsphinx-python/pocketsphinx/ instead, as all installations seem to be available locally?

P.S. The error message above is discussed in the thread “Mycroft on raspberry pi gives me this error and stops working”.

PasabaPorAqui · July 9, 2017, 11:49am

The previous error message “new_Decoder returned -1” and its call stack appear for any error while initializing pocketsphinx.

In this case, the error is due to the “-logfn” line in the file “pocketsphinx_audio_consumer.py”, which points to a path that does not exist in your environment (“scripts/logs/decoder.log”). Fixed: it now uses “/tmp/pocketsphinx.log”, which should exist on any Linux machine.
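A more defensive variant of the same idea, as a Python 3 sketch: keep the preferred log path when its directory is usable, otherwise fall back to the system temporary directory. The helper name `resolve_log_path` is hypothetical and this is not the code in lclient; the returned path would then be the value given to the “-logfn” option.

```python
import os
import tempfile


def resolve_log_path(preferred):
    """Return preferred if its directory exists and is writable.

    Otherwise return a file with the same name inside the system
    temporary directory (usually /tmp on Linux).
    """
    directory = os.path.dirname(os.path.abspath(preferred))
    if os.path.isdir(directory) and os.access(directory, os.W_OK):
        return preferred
    return os.path.join(tempfile.gettempdir(), os.path.basename(preferred))
```

This avoids hardcoding either path, at the cost of the log file possibly landing in a different place on different machines.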

About “/usr/share/…”: some official pocketsphinx packages install into this directory. However, the files are not really used from there; the wiki explains that they must be copied or linked to some Mycroft paths.

If you think the wiki explanation should be improved, feel free to edit it.

newone · July 9, 2017, 5:32pm

@PasabaPorAqui Thanks, the log is useful for pocketsphinx, but grammar stuff is not logged there.

I will certainly contribute to the wiki once things work. I think all the language wikis should go into the one @Thorsten did, with a single table for the pocketsphinx download stuff.

Now, on to new issues:

Traceback (most recent call last):
  File "/home/user/mycroft-core/mycroft/client/lspeech/main.py", line 196, in <module>
    main()
  File "/home/user/mycroft-core/mycroft/client/lspeech/main.py", line 188, in main
    loop.run()
  File "/home/user/mycroft-core/mycroft/client/lspeech/listener.py", line 250, in run
    self.start_async()
  File "/home/user/mycroft-core/mycroft/client/lspeech/listener.py", line 209, in start_async
    self.config, self.lang, self.state, self)
  File "/home/user/mycroft-core/mycroft/client/lspeech/pocketsphinx_audio_consumer.py", line 84, in __init__
    self.decoder.set_keyphrase('wake_word', self.wake_word)
  File "/home/user/.virtualenvs/mycroft/local/lib/python2.7/site-packages/pocketsphinx/pocketsphinx.py", line 403, in set_keyphrase
    return _pocketsphinx.Decoder_set_keyphrase(self, name, keyphrase)
RuntimeError: Decoder_set_keyphrase returned -1

2017-07-09 19:24:13,215 - mycroft.messagebus.client.ws - INFO - Connected
2017-07-09 19:24:13,215 - Skills - DEBUG - {"type": "enclosure.reset", "data": {}, "context": null}
...

  File "/home/user/.virtualenvs/mycroft/local/lib/python2.7/site-packages/requests/models.py", line 382, in prepare_url
    raise MissingSchema(error)
MissingSchema: Invalid URL '/v1/device//setting': No schema supplied. Perhaps you meant http:///v1/device//setting?

The second one seems to be related to setting “url” to “”. Currently, I am switching between English and German. English uses a binary language model (*.lm.bin), which might lead to issues. For German, there are no language files for the skills (yet).
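To show why an empty “url” ends in MissingSchema, and one way a caller could guard against it, here is a small Python 3 sketch. The names `remote_enabled` and `build_url` are hypothetical and this is not how mycroft-core handles it; the point is only that an empty base URL leaves a bare path such as '/v1/device//setting', which has no scheme.

```python
from urllib.parse import urlparse


def remote_enabled(base_url):
    """True only if base_url has an http(s) scheme and a host."""
    parts = urlparse(base_url or "")
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def build_url(base_url, path):
    """Join base_url and path, or return None when remote access is off.

    A caller would skip the HTTP request entirely when None is returned,
    instead of passing a scheme-less URL to the requests library.
    """
    if not remote_enabled(base_url):
        return None
    return base_url.rstrip("/") + path
```

With `"url": ""`, `build_url("", "/v1/device//setting")` returns `None`, so no request would be attempted.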