Mycroft was having issues talking again, but last night her voice came back. As a test, I generally ask her what day it is. Today I also asked her to spell ‘Monday’. When saying ‘M’, it sounds like she is saying ‘misure’. I am using the Google voice, if that makes a difference.
I don’t think this will actually help, because you can’t do letters, just words, but check out https://mimic.mycroft.ai/pronounce. This is tech that @LearnedVector created to help with the poor pronunciations that came out of the last Mimic 2 training. Check out the new voice, and see if you can pop in some words that trip Mycroft up and get them updated to a better pronunciation.
if you are using google voice that’s a problem with their model, nothing you can do to fix it
even the giants hit this kind of problem, moar data isn’t always the solution
in the case of mimic2 the link above helps it work around this kind of problem, but this is specific to each voice model and will not (usually) help with other voices
I am using the google voice, but not sure I agree with that.
If it can say the letter ‘M’ correctly at the end of a word, it should be able to say it correctly anywhere in the word. I haven’t had a chance to pull the code to look at it yet, but I would venture there is something added to the letters when parsed that is trimmed / missing on the last letter in the array.
google uses either tacotron or wavenet, these are neural networks that generate sound directly from text
these are machine learning techniques used to create a voice. When a child is taught to read, they learn pronunciation rules by example. But even with lots of examples the pronunciation of a specific word often needs to be taught explicitly. For example, most children (and many adults) will mispronounce “epitome”
they learned to read from lots of examples basically, there is no parsing anywhere, the model just isn’t sure how to pronounce some words. you found some example with M, eventually you will come across other weird cases
just try giving it garbage and see the output… lots of fun to be had
I wouldn’t say it isn’t sure how to pronounce ‘some’ words. It can’t pronounce ‘mom’, well it misses the first ‘m’ and gets the last one correct. So far all ‘normal’ words with an ‘m’ prior to the end it misses and gets the ‘m’ correct if it is the last letter.
Also note this only started after the last update. Prior to this, it pronounced the ‘m’ correctly in any location.
I think there is a misunderstanding of the issue. I have updated the subject for clarity.
It is not an issue with Mycroft ‘saying’ a word. It is an issue with Mycroft SPELLING letters of a word.
Examples for clarification / testing…
Phrase | Result
Hey Mycroft say the letter M | M
Hey Mycroft say mom | mom
Hey Mycroft spell M | M
Hey Mycroft spell tom | T O M
Hey Mycroft spell mom | Misurre O M
NOTE - The last letter being ‘spoken’ does not have a ‘period’ after it. Below are the spellings for ‘Monday’ and ‘Mom’. The only ‘m’ said correctly is the last one in ‘Mom’ (end of word).
mycroft.audio.speech:mute_and_speak:119 - INFO - Speaks: M.
mycroft.audio.speech:mute_and_speak:119 - INFO - Speaks: O.
mycroft.audio.speech:mute_and_speak:119 - INFO - Speaks: N.
mycroft.audio.speech:mute_and_speak:119 - INFO - Speaks: D.
mycroft.audio.speech:mute_and_speak:119 - INFO - Speaks: A.
mycroft.audio.speech:mute_and_speak:119 - INFO - Speaks: Y
mycroft.audio.speech:mute_and_speak:119 - INFO - Speaks: M.
mycroft.audio.speech:mute_and_speak:119 - INFO - Speaks: O.
mycroft.audio.speech:mute_and_speak:119 - INFO - Speaks: M
I really don’t know why this is so hard for people to grasp;)
It has been doing it for AGES now…
When spelling, Mycroft pronounces the letter “M” as “Misurre” IF the “M” is not the last letter.
This seems to be because it “sees” “M.” ( so M with a full-stop/period after it) as a single thing instead of an “Emmm period” like it does for all other letters.
Thus if it is at the end of a word that it is spelling, there is no full-stop and it is pronounced as “Emmm”
It doesn’t matter what voice you are using, it always does it
def handle_spell(self, message):
    word = message.data.get("Word")
    # '. ' is placed between letters only, so the last letter gets no period
    spelled_word = '. '.join(word).upper()
Going to try setting up a dev testing environment and remove / update this and see if it resolves the issue. If anyone already has one set up, it would be great to have this tested by them as well.
NOTE - editing this will require updating the tests as well, as they are looking for the period on all letters other than the last one.
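For anyone who wants to see the string without setting up a dev environment, here is a minimal standalone sketch of just the join step from the snippet above. The function name spell_with_periods is made up for illustration; it is not part of the skill, and nothing here touches the TTS engine.

```python
def spell_with_periods(word: str) -> str:
    """Mimic the join from handle_spell: '. ' goes between letters only."""
    return '. '.join(word).upper()


if __name__ == "__main__":
    # Every letter except the last ends up followed by a period.
    for word in ("tom", "mom", "monday"):
        print(word, "->", spell_with_periods(word))
```

For ‘mom’ this gives ‘M. O. M’, which lines up with the logs: the first ‘M’ has a period after it and the last one does not.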
Hey can I clarify, this is only when using the Google TTS engine right?
I would assume they have added the periods to help space the audio out, so that it’s “M pause O pause M” rather than “M O M”, but it could be worth trying with a comma instead of a period?
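A rough sketch of what the comma idea could look like, in case someone wants to try it. spell_word and its separator parameter are hypothetical names, not the skill's actual code, and whether a given TTS engine pauses on a comma the way it does on a period is untested here.

```python
def spell_word(word: str, separator: str = ", ") -> str:
    """Join the upper-cased letters of `word` with a configurable separator."""
    return separator.join(word.upper())


if __name__ == "__main__":
    print(spell_word("mom"))                   # comma-separated variant
    print(spell_word("mom", separator=". "))   # same string as the current join
```

Keeping the separator as a parameter would make it easy to compare both versions by ear, and the tests that look for the period would need to change to match whichever separator is used.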
I will check with the other voices and let you know, but the logs look like each letter is sent individually to be spoken…
mycroft.audio.speech:mute_and_speak:119 - INFO - Speaks: M.
mycroft.audio.speech:mute_and_speak:119 - INFO - Speaks: O.
mycroft.audio.speech:mute_and_speak:119 - INFO - Speaks: N.
mycroft.audio.speech:mute_and_speak:119 - INFO - Speaks: D.
mycroft.audio.speech:mute_and_speak:119 - INFO - Speaks: A.
mycroft.audio.speech:mute_and_speak:119 - INFO - Speaks: Y
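On the logs showing each letter on its own line: one possible explanation (an assumption on my part, not confirmed in this thread or checked against the Mycroft source) is that the spelled string gets split into sentence-like chunks at each period before being handed to the TTS engine. A minimal sketch of that idea, where split_into_chunks is a hypothetical helper and not a Mycroft function:

```python
import re


def split_into_chunks(utterance: str) -> list[str]:
    """Split on whitespace that directly follows a period (hypothetical helper)."""
    return re.split(r'(?<=\.)\s+', utterance)


if __name__ == "__main__":
    for chunk in split_into_chunks("M. O. N. D. A. Y"):
        print("Speaks:", chunk)
```

If something like this is happening, each chunk other than the last would reach the voice as a letter plus a period (“M.”), which would fit the theory that “M.” is being read as a single token rather than as a letter followed by a pause.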