Replacing cloud services for STT by local ones

PasabaPorAqui · June 17, 2017, 8:19am

Hi,

A pocketsphinx package for a language is composed of three elements:

one dictionary (.dict)
the acoustic model (several files such as “mdef” and “means”, usually in a directory called “model”)
one language model (.lm file), used as the grammar.

The packages at this address usually follow this format.

To use it with Mycroft, download the package, extract/move/copy the files and end up with the file structure described in the wiki after the phrase “at the end, you must have the following directory, files and softlink:” (the softlink is not mandatory; it can be the real file instead).

When pocketsphinx is used only to detect the wake word, a dictionary and grammar are not necessary, only the model. If pocketsphinx is to be used as STT, the dictionary and one or more grammars (lm or jsgf) are needed. See the alternative “lspeech” client here.

This final structure of directories and files is imposed by Mycroft’s speech client. It is the same as for the English language, as can be seen here.
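The layout can be checked before starting Mycroft. The following is only a minimal Python 3 sketch: the function name `missing_model_files` is made up for illustration, and the list of expected files is taken from the German “hmm” directory listed later in this thread, so other acoustic models may need a different list.

```python
import os

# File names taken from the German "hmm" directory listed later in this thread.
# Assumption: other acoustic models may ship a different set of files.
EXPECTED_HMM_FILES = ["feat.params", "mdef", "means", "noisedict",
                      "transition_matrices", "variances"]


def missing_model_files(lang_dir, expected=EXPECTED_HMM_FILES):
    """Return the expected acoustic-model files that are absent from <lang_dir>/hmm."""
    hmm_dir = os.path.join(lang_dir, "hmm")
    # os.path.isfile follows softlinks, so a link to the real file also counts.
    return [name for name in expected
            if not os.path.isfile(os.path.join(hmm_dir, name))]
```

Calling it with the language directory (for example the “de-de” directory under “model”) should return an empty list when everything is in place.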

Thorsten · June 18, 2017, 9:07am

Hi.
Thanks for your helpful reply.

At first I just want to change the “wake word” to a German word. The next step would be switching the complete speech recognition to a local pocketsphinx installation.

What I have done so far:

Downloaded cmusphinx-de-ptm-voxforge-5.2.tar.gz, cmusphinx-voxforge-de.lm.gz and cmusphinx-voxforge-de.dic from https://sourceforge.net/projects/cmusphinx. I used the “ptm” version because it contains a file named “sendump”, which I also found in the original “en-us/hmm” folder.
Extracted and copied the files according to the wiki document.
Created the following directory/file structure:

“/home/thorsten/mycroft-core/mycroft/client/speech/model/de-de” containing the files “cmusphinx-voxforge-de.lm” and “cmusphinx-voxforge-de.dic” (these files should not be necessary for just changing the wake-up word).
“/home/thorsten/mycroft-core/mycroft/client/speech/model/de-de/hmm” containing the files “feat.params”, “mdef”, “means”, “mixture_weights”, “noisedict”, “README.md”, “sendump”, “transition_matrices”, “variances”.

Set “lang”: “de-de” in mycroft.conf and set a German wake-up word.
When starting with “start.sh voice” I receive the following error:

2017-06-18 10:57:59,233 - mycroft.configuration - DEBUG - Configuration '/home/thorsten/.mycroft/mycroft.conf' not found
Carnegie Mellon University, Copyright © 1999-2011, all rights reserved
mimic developers, Copyright © 2016, all rights reserved
version: mimic-1.2.0.2 ()
Traceback (most recent call last):
  File "/home/thorsten/mycroft-core/mycroft/client/speech/main.py", line 221, in <module>
    main()
  File "/home/thorsten/mycroft-core/mycroft/client/speech/main.py", line 190, in main
    loop = RecognizerLoop()
  File "/home/thorsten/mycroft-core/mycroft/client/speech/listener.py", line 193, in __init__
    self.load_config()
  File "/home/thorsten/mycroft-core/mycroft/client/speech/listener.py", line 209, in load_config
    self.mycroft_recognizer = self.create_mycroft_recognizer(rate, lang)
  File "/home/thorsten/mycroft-core/mycroft/client/speech/listener.py", line 220, in create_mycroft_recognizer
    return LocalRecognizer(wake_word, phonemes, threshold, rate, lang)
  File "/home/thorsten/mycroft-core/mycroft/client/speech/local_recognizer.py", line 40, in __init__
    self.decoder = Decoder(self.create_config(dict_name))
  File "/home/thorsten/.virtualenvs/mycroft/local/lib/python2.7/site-packages/pocketsphinx/pocketsphinx.py", line 271, in __init__
    this = pocketsphinx.new_Decoder(*args)
RuntimeError: new_Decoder returned -1

After some Google research I found that this error may occur if the “model” path is not set when using Python. But since the path is configured like the “en-us” path, I think this is not the problem (https://github.com/cmusphinx/pocketsphinx/issues/32).

I checked the git changes you made and described in the wiki, and as far as I understand the first two commits have been merged and the other two are closed at the moment. So I have not changed anything in the code so far. Is this necessary for just changing the wake-up word in the first step?

Additional info:

Just for testing I renamed the original folder “en-us” to “en-us.old” and the German folder “de-de” to “en-us”, to check whether it’s a problem with the path or with the content of the directory. Since the error is still there, it seems to be a problem with the content of the “de-de” folder.
(I think) I’m using the dev branch (whatever you get by default when executing this command: git clone https://github.com/MycroftAI/mycroft-core.git)
I’m using Ubuntu Server 16.04
So far I haven’t taken a look at “lspeech”, because I think it’s only needed for the second step.

Would be great if you can give me another hint.

Thanks so far :-).

Thorsten

newone · June 18, 2017, 11:27am

I failed with above approach yesterday too. My biggest issue is related to the mic i use which is not well supported pythonwise by the vendor.

I also went ahead and tried to install pocketsphinx 5prealpha, German language. Maybe we can get in sync to get this task done together as we seem to have similar goals.

Thorsten · June 18, 2017, 2:42pm

As far as I can see, our common goal is to use mycroft-core with “offline” German voice recognition, using pocketsphinx both for the wake-up word and for the whole recognition. So it would be great if we can support each other in solving our problems.

I’m using a cheap headset which works without problems using the default configuration (arecord and aplay work without problems).
I read that installing an additional pocketsphinx is not necessary, because Mycroft has a built-in pocketsphinx installation used for recognition of the wake-up word.

I just checked that I am using the dev branch and applied all code changes mentioned in the wiki. But modifying core.py led to an error during “start.sh skills”, so I rolled this modification back.

I’m still getting nearly the same error as described here.

So is there a way i can assist to make german recognition functional?

PasabaPorAqui · June 21, 2017, 7:24pm

Hi,

In this Mycroft source file you can see the statement:

config.set_string('-logfn', '/dev/null')

which disables the pocketsphinx log. You can modify it to point to some file, for example:

config.set_string('-logfn', '/var/log/pocket.log')

That way we can see why pocketsphinx is not starting.
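Since /var/log is often not writable for a normal user, it can help to pick the log path defensively. This is just a hedged Python 3 sketch: `writable_log_path` is a hypothetical helper, not part of Mycroft or pocketsphinx, and it falls back to the system temp directory when the preferred file cannot be opened.

```python
import os
import tempfile


def writable_log_path(preferred="/var/log/pocket.log"):
    """Return `preferred` if it can be opened for appending, else a temp-dir path."""
    try:
        # Opening in append mode creates the file if needed and keeps old content.
        with open(preferred, "a"):
            pass
        return preferred
    except (IOError, OSError):
        # Missing directory or missing permissions: fall back to the temp directory.
        return os.path.join(tempfile.gettempdir(), os.path.basename(preferred))
```

The returned path would then be used as the second argument of the config.set_string('-logfn', …) statement shown above.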

My first hypothesis is that the English wake word is still configured, but it is better to go step by step, guided by the log files.

Thorsten · June 27, 2017, 8:30pm

Hi.

Sorry for the delayed answer.
Setting the logfile parameter (config.set_string('-logfn', '/var/log/pocket.log')) was really helpful. After creating the logfile and adjusting the permissions I got hints in it that confirm your “first hypothesis”.

[snip]

INFO: acmod.c(117): Attempting to use PTM computation module
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /home/thorsten/mycroft-core/mycroft/client/speech/model/de-de/hmm/means
INFO: ms_gauden.c(242): 66 codebook, 3 feature, size:
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /home/thorsten/mycroft-core/mycroft/client/speech/model/de-de/hmm/variances
INFO: ms_gauden.c(242): 66 codebook, 3 feature, size:
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(304): 3955 variance values floored
INFO: ptm_mgau.c(476): Loading senones from dump file /home/thorsten/mycroft-core/mycroft/client/speech/model/de-de/hmm/sendump
INFO: ptm_mgau.c(500): BEGIN FILE FORMAT DESCRIPTION
INFO: ptm_mgau.c(563): Rows: 128, Columns: 5198
INFO: ptm_mgau.c(595): Using memory-mapped I/O for senones
INFO: ptm_mgau.c(838): Maximum top-N: 4
INFO: phone_loop_search.c(114): State beam -225 Phone exit beam -225 Insertion penalty 0
INFO: dict.c(320): Allocating 4101 * 32 bytes (128 KiB) for word entries
INFO: dict.c(333): Reading main dictionary: /tmp/tmpr8TV5V
> ERROR: “dict.c”, line 195: Line 1: Phone ‘EY’ is mising in the acoustic model; word ‘hey’ ignored
INFO: dict.c(213): Dictionary size 1, allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(336): 1 words read
INFO: dict.c(358): Reading filler dictionary: /home/thorsten/mycroft-core/mycroft/client/speech/model/de-de/hmm/noisedict
INFO: dict.c(213): Dictionary size 4, allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(361): 3 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(406): Allocating 66^3 * 2 bytes (561 KiB) for word-initial triphones
INFO: dict2pid.c(132): Allocated 105072 bytes (102 KiB) for word-final triphones
INFO: dict2pid.c(196): Allocated 105072 bytes (102 KiB) for single-phone word triphones
INFO: kws_search.c(420): KWS(beam: -1080, plp: -23, default threshold -2024, delay 10)
> ERROR: “kws_search.c”, line 171: The word ‘hey’ is missing in the dictionary
INFO: kws_search.c(467): TOTAL kws 0.00 CPU -nan xRT
INFO: kws_search.c(470): TOTAL kws 0.00 wall -nan xRT
[/snip]
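The two ERROR lines are easy to miss among the INFO output. As a rough Python 3 sketch (the function name and regular expression are my own, written only against the message format visible in this log), the missing-phone errors could be pulled out like this:

```python
import re

# Matches the dict.c error shown above; "mising" is how the log spells it,
# and "missing" is accepted too in case other versions spell it correctly.
# Straight and curly quotes are both accepted because forum software may convert them.
MISSING_PHONE = re.compile(
    r"Phone\s+['‘’](?P<phone>[^'‘’]+)['‘’]\s+is\s+miss?ing in the acoustic model;"
    r"\s+word\s+['‘’](?P<word>[^'‘’]+)['‘’]\s+ignored")


def missing_phones(log_lines):
    """Return (phone, word) pairs for every missing-phone error in the log."""
    found = []
    for line in log_lines:
        match = MISSING_PHONE.search(line)
        if match:
            found.append((match.group("phone"), match.group("word")))
    return found
```

Applied to the excerpt above, this would report the phone EY for the word hey, which points to an English wake word being fed to the German model.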

I double-checked in my mycroft.conf file that the original wake-up word “hey mycroft” isn’t there.

// Settings used by the wake-up-word listener
// Override: REMOTE
"listener": {
  "producer": "pocketsphinx",
  "grammar": "lm",
  "sample_rate": 16000,
  "channels": 1,
  "record_wake_words": false,
  "wake_word": "charlotte",
  "phonemes": "SH AH EX L OO T AX",
  "phoneme_duration": 120,
  "standup_word": "charlotte",
  "standup_phonemes": "SH AH EX L OO T AX",
  "standup_threshold": 1e-90,
  "threshold": 1e-90,
  "multiplier": 1.0,
  "energy_ratio": 1.5
},
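The “phonemes” value has to use phones the acoustic model knows, which is exactly what the “Phone ‘EY’ is mising” error in the log is about. A quick sanity check might look like the Python 3 sketch below. It assumes the phones used in the language’s .dic file are a reasonable stand-in for the model’s phone set (an assumption on my side, not something pocketsphinx guarantees), and both function names are made up.

```python
def dictionary_phones(dict_lines):
    """Collect the phone symbols used in the lines of a pocketsphinx .dic file."""
    phones = set()
    for line in dict_lines:
        parts = line.split()
        if len(parts) > 1:  # first token is the word, the rest are its phones
            phones.update(parts[1:])
    return phones


def unknown_phonemes(phoneme_string, known_phones):
    """Return the phonemes of a config value that are not in `known_phones`."""
    # "." separates words in the config value, as in "HH EY . M AY K R AO F T".
    return [p for p in phoneme_string.split()
            if p != "." and p not in known_phones]
```

With the lines of the downloaded .dic file passed to `dictionary_phones`, an empty result from `unknown_phonemes` for the configured “phonemes” string would suggest the wake word at least uses valid symbols.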

There are no other Mycroft config files (such as in /home/thorsten/.mycroft or /etc/mycroft.conf), and I found nothing about a cached version of the original config file.

I also checked the code patches you described in your wiki.

i18n: create_wakeup_recognizer must be configurable by SoloVeniaASaludar · Pull Request #575 · MycroftAI/mycroft-core (already merged and up to date in my setup)
i18n: language codes are not always two word by SoloVeniaASaludar · Pull Request #611 · MycroftAI/mycroft-core (merged in May 2017 and up to date in my setup). On this point the wiki is a little outdated, because “PENDING MERGE” should be “MERGE DONE”.
#655 - i18n: spanish stt based on local pocketsphinx by SoloVeniaASaludar · Pull Request #656 · MycroftAI/mycroft-core (I did not check this one because I’m unsure whether it is required).

I really would appreciate your further support.

Thorsten

Thorsten · June 28, 2017, 7:52pm

I searched the files for the string “hey” but found it mainly in “test” files, so that should not be the cause. I used “strace start.sh voice” to find out which file containing “hey” is read, without success.

Update 1:
Maybe I should have read the docs in more detail. According to the thread “Changing the wake word”, I should disable remote updates or configure the wake-up word in home.mycroft.ai. I will give it a try soon.

PasabaPorAqui · June 29, 2017, 8:14am

Yes, remote configuration is another feature to be disabled. I will edit the wiki to add this point.
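The “Override: REMOTE” comment in the listener section hints at why the local wake word was ignored: settings fetched from the server win over the local file. The Python 3 sketch below only illustrates that precedence with a plain dictionary merge. It is not Mycroft’s actual code, and `apply_remote` is a made-up name.

```python
import copy


def apply_remote(local, remote):
    """Return a copy of `local` in which values from `remote` take precedence."""
    merged = copy.deepcopy(local)
    for key, value in remote.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections key by key instead of replacing them whole.
            merged[key] = apply_remote(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged
```

If the local file sets “wake_word” to “charlotte” and the remote settings still carry “hey mycroft”, the merged result contains “hey mycroft”, which matches the behaviour seen in the log.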

Thorsten · June 29, 2017, 9:31pm

Thank you @PasabaPorAqui for updating the wiki.
Now my german wakeup word works like a charm.

According to the wiki, I added the following two lines to the listener section for offline pocketsphinx STT.

"producer": "pocketsphinx",
"grammar": "lm",

This is the whole section.

// Settings used by the wake-up-word listener
// Override: REMOTE
"listener": {
  "producer": "pocketsphinx",
  "grammar": "lm",
  //"grammar": "jsgf",
  "sample_rate": 16000,
  "channels": 1,
  "record_wake_words": false,
  //"wake_word": "hey mycroft",
  //"phonemes": "HH EY . M AY K R AO F T",
  "wake_word": "charlotte",
  "phonemes": "SH AH EX L OO T AX",
  "standup_word": "charlotte",
  "standup_phonemes": "SH AH EX L OO T AX",
  "standup_threshold": 1e-90,
  "phoneme_duration": 120,
  "threshold": 1e-90,
  "multiplier": 1.0,
  "energy_ratio": 1.5
},

With this configuration German STT works, but only via Mycroft’s cloud service. After I disabled my network interface I get no STT apart from the wake-up word; I just receive API errors because of the connection failure.

Commenting out the complete “server” block produces a syntax error on start.sh voice.
Writing an empty url does not work either.
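A syntax error after commenting out a block can be reproduced outside Mycroft by parsing the file by hand. One plausible cause (an assumption, since the error message itself is not shown here) is a trailing comma left in front of the closing brace once the last block is commented out. The Python 3 sketch below strips whole-line // comments and hands the rest to the standard json module. `parse_conf_text` is a hypothetical helper, not Mycroft’s loader.

```python
import json


def parse_conf_text(text):
    """Parse JSON text after dropping lines that contain only a // comment."""
    kept = [line for line in text.splitlines()
            if not line.lstrip().startswith("//")]
    # json.loads raises ValueError (JSONDecodeError) on e.g. a trailing comma.
    return json.loads("\n".join(kept))
```

Running it on the text of mycroft.conf before start.sh voice would show the line and column where the parser gives up.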

The files (.dict, .lm) are in the “de” directory.

Did I miss something in the wiki (e.g. the git changes)?

Thanks for your help

PasabaPorAqui · June 30, 2017, 8:59am

Congratulations on your German Mycroft! Please consider sharing your experience, either by editing the current wiki or by creating a new one, as you prefer.

As for replacing the remote STT with a local German one, my design decision was to start a new client, lclient.

Modifying the current one had some problems: the Mycroft team is too busy to accept big pull requests, probably because of the stability and regression-testing effort; local STT does not seem to be in their business plan; … . I don’t like forks; a merge back is expected once “lclient” is fully stable and the Mycroft team has time for this improvement.

I see you have a very good level in computer science, so you should not have big problems trying this client.

Note: in recent months I’ve been working more on the hardware aspects than on the software. I have not kept up with the latest Mycroft versions, so some issues could appear.

(Why hardware now: I do not imagine Mycroft as an object to buy and place in a room, but as a hidden, embedded AI present in the house. Thus, I’m working on distributed microphones/speakers, integration with Z-wave and zigbee devices, … ).

tfontoura · June 30, 2017, 4:15pm

I agree with you. I couldn’t figure out how to do it yet, though, as I don’t believe the zigbee protocol can transport voice. Putting Pis in every room as slaves, maybe?

PasabaPorAqui · June 30, 2017, 8:42pm

Z-wave and zigbee are protocols for actuator devices.

Microphones could be Bluetooth, with the newest class having a range of 10 meters (the usual low-priced ones have a range of only a few meters), or devices based on Arduino and some XBee card.

This is ongoing; I must admit I have not yet found a valid solution.

The good news is that RTP-based devices work correctly: Mycroft can be commanded from ordinary smartphones.

newone · July 2, 2017, 8:25pm

I have a similar situation here when sticking with English for my initial tests: I get errors when pointing the remote server mycroft.ai to localhost. I thought we do not need the remote server when using pocketsphinx?

2017-07-02 22:29:32,538 - mycroft.client.speech.mic - DEBUG - Thinking...
2017-07-02 22:29:32,538 - SpeechClient - INFO - Wakeword Detected: hey mycroft
2017-07-02 22:29:32,548 - requests.packages.urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): localhost
2017-07-02 22:29:32,549 - requests.packages.urllib3.connectionpool - DEBUG - http://localhost:80 "POST /v1/stt?lang=en-US&limit=1 HTTP/1.1" 404 None
2017-07-02 22:29:33,491 - mycroft.client.speech.mic - DEBUG - Waiting for wake word...

Wake word ‘hey mycroft’ works well by the way.

Do we really need a webserver for handling local STT using pocketsphinx as indicated by above logs?

@PasabaPorAqui: How do you handle this?

PasabaPorAqui · July 3, 2017, 10:35am

One of the critical features of pocketsphinx is that it continuously monitors the ambient noise captured by the microphone. With this information, it reconfigures some filters to improve recognition rates.

With the official Mycroft client, wake word detection (which runs continuously except while a command or speech is being handled) is done completely independently of the command STT. This works against the feature described above.

For this and some other reasons, I decided to create a local client (“lclient”), separate from the official “client”. See previous messages for the source link.

Thorsten · July 3, 2017, 6:54pm

If it’s okay with you @PasabaPorAqui, I would prefer editing your wiki page. If I created a “German” wiki page it would be 90% copy and paste. So maybe it’s the better choice to have one “international” wiki page which points out the language-dependent differences.

Thorsten · July 3, 2017, 6:55pm

I’m waiting for delivery of my matrix voice. This will be my mic, and Mycroft will be hosted on an Ubuntu server.

PasabaPorAqui · July 4, 2017, 8:18am

About the wiki: OK, perfect, modify the current one; that way we will have a more complete page.

About your matrix voice: it seems very interesting. Please consider sharing your experience (wiki page, post or …) once you think you have tested it enough (better not in this thread, I think, because it is too different from the original issue).

Thorsten · July 4, 2017, 8:05pm

@PasabaPorAqui I updated your wiki page. Since I have now worked on it for nearly 2 hours, please feel free to review it ;-). Comments/updates are very welcome.
I did not change its title, so as not to break existing hyperlinks to the wiki.

newone · July 5, 2017, 4:11am

Great, maybe we can merge French and Italian too, as they only differ in the download section?

The crucial part of this thread is the local STT server and to have a working wiki example for that.

@PasabaPorAqui: Should we clone your complete mycroft-core repository for this purpose, and is it already adapted (i.e. mycroft.conf points to the local server and uses your lclient), or should we clone the official repository and take over the files from your lclient folder?

PasabaPorAqui · July 5, 2017, 1:37pm

A simple solution could be:

download only “lclient” directory # local client
mv client rclient # remote client
ln -s lclient client

In this way, switching from one client to the other is as easy as pointing the softlink at “rclient” or “lclient”.
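The softlink switch can also be scripted. This is a small Python 3 sketch under the assumption that “lclient” and “rclient” sit next to each other in the same parent directory, as in the commands above. `switch_client` is a made-up helper name.

```python
import os


def switch_client(parent_dir, target):
    """Point the "client" softlink in `parent_dir` at "lclient" or "rclient"."""
    if target not in ("lclient", "rclient"):
        raise ValueError("target must be 'lclient' or 'rclient'")
    link = os.path.join(parent_dir, "client")
    if os.path.islink(link):
        os.remove(link)  # drop the old softlink only, never a real directory
    elif os.path.exists(link):
        raise RuntimeError("'client' is a real directory; rename it to 'rclient' first")
    os.symlink(target, link)  # relative link, like `ln -s lclient client`
    return os.readlink(link)
```

Calling `switch_client(parent_dir, "rclient")` goes back to the official client, and `switch_client(parent_dir, "lclient")` selects the local one again.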