A pocketsphinx package for a language is composed of three elements:
one dictionary (.dict)
the acoustic model (several files such as “mdef” and “means”, usually in a directory called “model”)
one grammar (a .lm language-model file).
The packages at this address usually follow this format.
To use it with Mycroft, you need to download the package, extract/move/copy the files and end up with the file structure described in the wiki after the phrase “at the end, you must have the following directory, files and softlink:” (the softlink is not mandatory; it can be the real file instead).
When pocketsphinx is used only to detect the wake word, a dictionary and grammar are not necessary, only the model. If pocketsphinx is to be used for STT, then the dictionary and one or more grammars (lm or jsgf) are needed. See the alternative “lspeech” client here.
This final structure of directories and files is enforced by Mycroft's speech client. It is the same as the one for the English language, as can be seen here.
At first I just want to change the “wake word” to a German word. The next step would be switching the complete speech recognition to a local pocketsphinx installation.
What I have done so far:
Downloaded cmusphinx-de-ptm-voxforge-5.2.tar.gz, cmusphinx-voxforge-de.lm.gz and cmusphinx-voxforge-de.dic from https://sourceforge.net/projects/cmusphinx. I used the “ptm” version because it contains a file named “sendump”, which I also found in the original “en-us/hmm” folder.
Extracted and copied the files according to the wiki document.
Created the following directory/file structure:
“/home/thorsten/mycroft-core/mycroft/client/speech/model/de-de” containing the files “cmusphinx-voxforge-de.lm” and “cmusphinx-voxforge-de.dic” (these files should not be necessary for just changing the wake-up word).
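A quick way to check such a directory before starting Mycroft is sketched below. It is only a sketch: the list of expected files is an assumption pieced together from the file names that come up in this thread (“mdef”, “means”, “variances”, “sendump”, “noisedict” inside an “hmm” subfolder), and missing_model_files is a hypothetical helper, not part of Mycroft or pocketsphinx.

```python
import os

# Assumption: acoustic-model files named in this thread, expected under <lang>/hmm.
EXPECTED_HMM_FILES = ("mdef", "means", "variances", "sendump", "noisedict")


def missing_model_files(lang_dir, expected=EXPECTED_HMM_FILES):
    """Return the expected acoustic-model files that are absent from lang_dir/hmm."""
    hmm_dir = os.path.join(lang_dir, "hmm")
    return [name for name in expected
            if not os.path.isfile(os.path.join(hmm_dir, name))]


if __name__ == "__main__":
    lang_dir = "/home/thorsten/mycroft-core/mycroft/client/speech/model/de-de"
    missing = missing_model_files(lang_dir)
    print("missing:", ", ".join(missing) if missing else "nothing")
```

If the wiki asks for a different set of files, pass it in through the expected argument instead of relying on the default.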
After some Google research I found that this error may occur if the “model” path is not set using Python. But since the path is configured similarly to the “en-us” path, I think that this is not the problem (https://github.com/cmusphinx/pocketsphinx/issues/32).
I checked the git changes you made and described in the wiki, and as far as I understand the first two commits have been merged and the other two are closed at the moment. So I have not changed anything in the code so far. Is this necessary for just changing the wake-up word in the first step?
Additional info:
Just for testing, I renamed the original folder “en-us” to “en-us.old” and the German folder “de-de” to “en-us”, to check whether it’s a problem with the path or with the content of the directory. Since the error is still there, it seems to be a problem with the content of the “de-de” folder.
I failed with the above approach yesterday too. My biggest issue is related to the mic I use, which is not well supported Python-wise by the vendor.
I also went ahead and tried to install pocketsphinx 5prealpha with the German language. Maybe we can get in sync to get this task done together, as we seem to have similar goals.
As far as I can see, our common goal is to use mycroft-core with “offline” German voice recognition, using pocketsphinx for the wake-up word and for the whole recognition. So it would be great if we can support each other in solving our problems.
I’m using a cheap headset which works without problems using the default configuration (arecord and aplay work without problems).
I read that installing an additional pocketsphinx is not necessary, because Mycroft has a built-in pocketsphinx installation used for recognition of the wake-up word.
I just checked that I am using the dev branch and applied all code changes mentioned in the wiki. But modifying core.py led to an error during “start.sh skills”, so I rolled this modification back.
I’m still getting nearly the same error as described here.
So is there a way I can assist in making German recognition functional?
Sorry for the delayed answer.
Setting up the logfile parameter (config.set_string('-logfn', '/var/log/pocket.log')) was really helpful. After creating the logfile and adjusting the permissions, I got hints in it that confirm your “first hypothesis”.
[snip]
INFO: acmod.c(117): Attempting to use PTM computation module
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /home/thorsten/mycroft-core/mycroft/client/speech/model/de-de/hmm/means
INFO: ms_gauden.c(242): 66 codebook, 3 feature, size:
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /home/thorsten/mycroft-core/mycroft/client/speech/model/de-de/hmm/variances
INFO: ms_gauden.c(242): 66 codebook, 3 feature, size:
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(304): 3955 variance values floored
INFO: ptm_mgau.c(476): Loading senones from dump file /home/thorsten/mycroft-core/mycroft/client/speech/model/de-de/hmm/sendump
INFO: ptm_mgau.c(500): BEGIN FILE FORMAT DESCRIPTION
INFO: ptm_mgau.c(563): Rows: 128, Columns: 5198
INFO: ptm_mgau.c(595): Using memory-mapped I/O for senones
INFO: ptm_mgau.c(838): Maximum top-N: 4
INFO: phone_loop_search.c(114): State beam -225 Phone exit beam -225 Insertion penalty 0
INFO: dict.c(320): Allocating 4101 * 32 bytes (128 KiB) for word entries
INFO: dict.c(333): Reading main dictionary: /tmp/tmpr8TV5V
ERROR: "dict.c", line 195: Line 1: Phone 'EY' is mising in the acoustic model; word 'hey' ignored
INFO: dict.c(213): Dictionary size 1, allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(336): 1 words read
INFO: dict.c(358): Reading filler dictionary: /home/thorsten/mycroft-core/mycroft/client/speech/model/de-de/hmm/noisedict
INFO: dict.c(213): Dictionary size 4, allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(361): 3 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(406): Allocating 66^3 * 2 bytes (561 KiB) for word-initial triphones
INFO: dict2pid.c(132): Allocated 105072 bytes (102 KiB) for word-final triphones
INFO: dict2pid.c(196): Allocated 105072 bytes (102 KiB) for single-phone word triphones
INFO: kws_search.c(420): KWS(beam: -1080, plp: -23, default threshold -2024, delay 10)
ERROR: "kws_search.c", line 171: The word 'hey' is missing in the dictionary
INFO: kws_search.c(467): TOTAL kws 0.00 CPU -nan xRT
INFO: kws_search.c(470): TOTAL kws 0.00 wall -nan xRT
[/snip]
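For anyone who wants to reproduce such a log outside Mycroft, here is a minimal sketch of a keyphrase decoder set up with “-logfn”. It assumes the pocketsphinx-python bindings of the 5prealpha generation (Decoder.default_config(), set_string, set_float); newer pocketsphinx releases changed the Python API. The helper names and the dictionary path are hypothetical placeholders, and the decoder is only built where the package is actually installed.

```python
def keyphrase_options(hmm_dir, dict_path, keyphrase, threshold, log_path):
    """Collect pocketsphinx command-line style options for keyword spotting."""
    return {
        "-hmm": hmm_dir,            # acoustic model directory (mdef, means, ...)
        "-dict": dict_path,         # dictionary holding the wake word's phonemes
        "-keyphrase": keyphrase,    # the wake word itself
        "-kws_threshold": float(threshold),
        "-logfn": log_path,         # decoder messages go to this file
    }


def build_decoder(options):
    """Create a decoder; requires the pocketsphinx package to be installed."""
    from pocketsphinx.pocketsphinx import Decoder  # imported lazily on purpose

    config = Decoder.default_config()
    for name, value in options.items():
        if isinstance(value, float):
            config.set_float(name, value)
        else:
            config.set_string(name, value)
    return Decoder(config)


options = keyphrase_options(
    hmm_dir="/home/thorsten/mycroft-core/mycroft/client/speech/model/de-de/hmm",
    dict_path="/tmp/wake_word.dict",   # placeholder path
    keyphrase="charlotte",
    threshold=1e-90,
    log_path="/var/log/pocket.log",
)
# decoder = build_decoder(options)  # uncomment where pocketsphinx is available
```

The log file has to exist and be writable by the user running the script, which matches the permission adjustment mentioned above.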
I double-checked my mycroft.conf file to make sure that the original wake-up word “hey mycroft” isn’t there.
// Settings used by the wake-up-word listener
// Override: REMOTE
"listener": {
  "producer": "pocketsphinx",
  "grammar": "lm",
  "sample_rate": 16000,
  "channels": 1,
  "record_wake_words": false,
  "wake_word": "charlotte",
  "phonemes": "SH AH EX L OO T AX",
  "phoneme_duration": 120,
  "standup_word": "charlotte",
  "standup_phonemes": "SH AH EX L OO T AX",
  "standup_threshold": 1e-90,
  "threshold": 1e-90,
  "multiplier": 1.0,
  "energy_ratio": 1.5
},
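The dict.c error about phone ‘EY’ in the log suggests checking the configured phonemes against the phone set of the model in use. The sketch below does this indirectly: it collects every phone that occurs in a pronunciation dictionary (lines of the form “word PH1 PH2 …”) and reports configured phonemes that never occur there. Using the dictionary as a stand-in for the acoustic model's phone set is an assumption, both function names are hypothetical, and the sample dictionary lines are invented for illustration rather than taken from cmusphinx-voxforge-de.dic.

```python
def phones_in_dictionary(lines):
    """Collect all phones used in 'word PH1 PH2 ...' dictionary lines."""
    phones = set()
    for line in lines:
        parts = line.split()
        if len(parts) > 1:          # skip blank or malformed lines
            phones.update(parts[1:])
    return phones


def unknown_phonemes(phonemes, known_phones):
    """Return configured phonemes (space separated) that are not known."""
    # The English phoneme string quoted in this thread has a '.' between
    # words; it is treated as a separator here, not as a phone.
    return [p for p in phonemes.split() if p != "." and p not in known_phones]


# Invented sample entries, only to show the mechanics.
sample_dict = [
    "schal SH AH L",
    "lotte L OO T AX",
    "exakt EX K S AH K T",
]
known = phones_in_dictionary(sample_dict)
print(unknown_phonemes("SH AH EX L OO T AX", known))      # []
print(unknown_phonemes("HH EY . M AY K R AO F T", known))
```

With a real dictionary, an open file object can be passed straight to phones_in_dictionary, since it iterates line by line.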
There are no other Mycroft config files (such as in /home/thorsten/.mycroft or /etc/mycroft.conf) and I found nothing about a cached version of the original config file.
I also checked the code patches you described in your wiki.
I did a file search for the string “hey”, but it turned up primarily in “test” files, so that should not be the cause. I used “strace start.sh voice” to find which file containing “hey” is read, without success.
Update 1:
Maybe I should have read the docs in more detail. According to the thread “Changing the wake word”, I should disable remote updates or configure the wake-up word at home.mycroft.ai. I will give it a try soon.
Thank you @PasabaPorAqui for updating the wiki.
Now my German wake-up word works like a charm.
According to the wiki, I added the following two lines to the listener section for offline pocketsphinx STT.
"producer": "pocketsphinx",
"grammar": "lm",
This is the whole section.
// Settings used by the wake-up-word listener
// Override: REMOTE
"listener": {
  "producer": "pocketsphinx",
  "grammar": "lm",
  //"grammar": "jsgf",
  "sample_rate": 16000,
  "channels": 1,
  "record_wake_words": false,
  //"wake_word": "hey mycroft",
  //"phonemes": "HH EY . M AY K R AO F T",
  "wake_word": "charlotte",
  "phonemes": "SH AH EX L OO T AX",
  "standup_word": "charlotte",
  "standup_phonemes": "SH AH EX L OO T AX",
  "standup_threshold": 1e-90,
  "phoneme_duration": 120,
  "threshold": 1e-90,
  "multiplier": 1.0,
  "energy_ratio": 1.5
},
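To see which wake word a given mycroft.conf actually yields, the file can be loaded outside Mycroft. The sketch below assumes the file is JSON plus whole-line “//” comments, as in the excerpt above; load_commented_json is a hypothetical helper and not Mycroft's own loader, and it does not handle “//” comments that follow a value on the same line.

```python
import json


def load_commented_json(text):
    """Parse JSON after dropping lines that start with '//'."""
    kept = [line for line in text.splitlines()
            if not line.lstrip().startswith("//")]
    return json.loads("\n".join(kept))


sample = """
{
  // Settings used by the wake-up-word listener
  "listener": {
    //"wake_word": "hey mycroft",
    "wake_word": "charlotte",
    "phonemes": "SH AH EX L OO T AX",
    "threshold": 1e-90
  }
}
"""
listener = load_commented_json(sample)["listener"]
print(listener["wake_word"], "/", listener["phonemes"])
```

This only shows what the local file says. Going by the “Override: REMOTE” comment, settings from home.mycroft.ai may still replace these values at runtime.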
With this configuration German STT works, but only using the Mycroft cloud service. After I disabled my network interface I get no STT except for the wake-up word. I just receive API errors because of the connection failure.
Commenting out the complete “server” block produces a syntax error on start.sh voice.
Writing an empty url does not work either.
The files (.dict, .lm) are in the “de” directory.
Did I miss something in the wiki (e.g. the git changes)?
Congratulations on your German Mycroft! Please consider sharing your experience by editing the current wiki or creating a new one, as you prefer.
About replacing the remote STT with a local German one: my design decision has been to start a new client, lclient.
Modifying the current one had some problems: the Mycroft team is too busy to accept big pull requests, probably due to stability and regression-testing efforts; local STT does not seem to be in its business plan; … . I don’t like forks; a rejoin is expected once “lclient” is fully stable and the Mycroft team has time for this improvement.
I see you have a very good level in computer science; you should not have big problems trying this client.
Note: in the last months I’ve been working more on the hardware aspects than on the software. I have not kept up with the latest Mycroft versions, so some issues could appear.
(Why hardware now: I do not imagine Mycroft as an object to buy and place in a room, but as a hidden, embedded AI present in the house. Thus, I’m working on distributed microphones/speakers, integration with Z-wave and zigbee devices, … ).
I agree with you. I couldn’t figure out how to do it yet, though, as I don’t believe the zigbee protocol can transport voice. Putting PIs in every room as slaves, maybe?
Z-wave and zigbee are protocols for actuator devices.
Microphones could be Bluetooth ones, whose newest class has a range of 10 meters (the usual low-priced ones have a range of only a few meters), or devices based on Arduino and some XBee card.
This is ongoing; I must admit I have not yet found a valid solution.
The good news is that RTP-based devices work correctly: Mycroft can be commanded using ordinary smartphones.
I have a similar situation here: sticking with English for my initial tests, I get errors when pointing the remote server mycroft.ai to localhost. I thought we do not need the remote server when using pocketsphinx?
One of the critical features of pocketsphinx is that it continuously monitors the ambient noise captured by the microphone. With this information, it reconfigures some filters to improve recognition ratios.
With the official Mycroft client, the wake word detection (which runs continually except when handling a command or speech) is done totally independently of the command STT translation. This works against the feature described above.
For this and some other reasons, I decided to create a local client (“lclient”) different from the official “client”. See previous messages for the source link.
If it’s okay with you @PasabaPorAqui, I would prefer editing your wiki page. If I created a “German” wiki page it would be 90% copy and paste. So maybe it’s the better choice to have one “international” wiki page which points out the language-dependent differences.
About the wiki: OK, perfect, modify the current one; this way we will have a more complete page.
About your matrix voice: it seems very interesting. Please consider sharing your experience (wiki page or post or …) when you think you have tested it enough (I think better not in this thread, because it is too different an issue from the original one).
@PasabaPorAqui I updated your wiki page. Since I worked on that wiki page for nearly 2 hours now, please feel free to give it a review ;-). Comments/updates are very welcome.
I did not change its title, so as not to break existing hyperlinks to the wiki.
Great, maybe we can merge French and Italian too, as they only differ in the download section?
The crucial part of this thread is the local STT server and to have a working wiki example for that.
@PasabaPorAqui: Should we clone your complete mycroft-core repository for this purpose, since it is already adapted (i.e. mycroft.conf points to the local server and uses your lclient), or should we clone the official repository and take over the files from your lclient folder?