Choosing a different speaking voice when using Google text-to-speech

sil · March 22, 2018, 2:55pm

I’m using the Google TTS backend for my Mark 1, and I wanted to change the voice; I’ve now done so, so I thought I’d write down how for future reference. This is pretty heavily technical, so if you’re not comfortable at that level, don’t do this; wait for the Mycroft team to make it easier.

First, this only works for the Google backend (Mimic has its own ways of changing voices). So, to use the Google backend, on your Mycroft web config screen, choose Settings > Advanced and set “Google” rather than “Mimic” as the Text-to-Speech Engine. (Note: don’t do this if you’re not comfortable with Mycroft’s responses to you going via a Google service; Mycroft uses Mimic by default for a reason.)

Once that’s done, you’ll get responses from Mycroft using the default Google voice, which appears to be American. You can change this by adding a manual config setting. SSH into your Mycroft: you will need to enable this, and the documentation explains how to enable SSH on the Mark 1. If you’re using an Ubuntu machine (and possibly others) then your Mark 1 should be available as mark1.local, meaning that you can just do ssh pi@mark1.local and log in with password mycroft; if not, then the docs above explain how to find the IP address and connect. Once connected:

Run sudo su to become root, then su mycroft to become the mycroft user. Edit the config file with nano ~/.mycroft/mycroft.conf; it’s a JSON config file (the docs explain Mycroft config files).

If there’s nothing in it, add the following, save, and you’re done:

{"tts": {"google": {"lang": "en-uk"}}}

If you’ve already customised this file, you’ll need to merge that key into what’s there. Fortunately, this config file only overrides the keys it contains, so Mycroft picks up the other settings from the web configuration and then applies your local changes on top. You can choose a voice other than en-uk; the voices available are listed in the gTTS repository at https://github.com/pndurette/gTTS/blob/master/gtts/tts.py#L18.
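If the file already has content, merging the key by hand is easy to get wrong, so here is a minimal Python sketch of that step. It assumes the local config is plain JSON with no comments; the helper names deep_merge and set_google_voice are made up for illustration and are not part of Mycroft.

```python
import json
from pathlib import Path


def deep_merge(base, override):
    """Return a new dict with override merged into base, recursing into nested dicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def set_google_voice(conf_path, lang):
    """Set tts.google.lang in the JSON file at conf_path, keeping every other key.

    Hypothetical helper: assumes the file is plain JSON (no comments).
    """
    path = Path(conf_path).expanduser()
    existing = {}
    # An absent or empty file is treated as an empty config.
    if path.exists() and path.read_text().strip():
        existing = json.loads(path.read_text())
    updated = deep_merge(existing, {"tts": {"google": {"lang": lang}}})
    path.write_text(json.dumps(updated, indent=2) + "\n")
    return updated
```

Run as the mycroft user, set_google_voice("~/.mycroft/mycroft.conf", "en-uk") should leave any existing keys alone and only add or replace the lang entry under tts.google. The ~/.mycroft directory needs to exist already.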

Once you’ve made that change, you need Mycroft to notice that you’ve done so. You can just restart Mycroft, which works fine. To be more efficient you could just restart the mycroft-speech-client process.

aussieW · March 23, 2018, 12:55am

Thanks, I didn’t know you could do that. Going to try changing mine to en-au. I wonder if you can do a similar thing for the STT; it might improve accuracy on some words. Start is often interpreted as Stop, which is very frustrating!

BTW, was a big fan of LugRadio.

sil · March 23, 2018, 12:58am

I believe it’s also possible to use the Google (or other) speech-to-text engines, but I haven’t looked at that yet.

Excellent! You might like Bad Voltage as well, then, which is going on now…

Fellhahn · February 14, 2020, 1:20am

Adding this for anyone like me who stumbles on this:

The gTTS-cli tool lists 13 different variations (accents) of English available for use. However, in truth there are only four distinct ones:

en-au (Australia)
en-gb (United Kingdom)
en-in (India)
en-us (United States)

The remainder of the 13 options will simply return one of these four. For example, specifying en-nz (New Zealand) returns a speech file using the Australian accent, and en-ca (Canada) uses the US accent. The African accents (Ghana, Nigeria, Tanzania, South Africa) as well as Ireland return the en-gb accent. (Something something colonialism :S )
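For anyone who wants to check a code before putting it in their config, the fallbacks described above can be written down as a small Python lookup. This is only a sketch of what this post reports: the codes for Ghana, Nigeria, Tanzania, South Africa and Ireland (en-gh, en-ng, en-tz, en-za, en-ie) are assumed from the usual country abbreviations, and effective_accent is a made-up helper, not something gTTS provides.

```python
# The four accents that actually sound different, per the list above.
DISTINCT_ACCENTS = {"en-au", "en-gb", "en-in", "en-us"}

# Requested code -> accent actually heard, as reported in this post.
# The African and Irish codes are assumed, not confirmed against gTTS.
FALLBACKS = {
    "en-nz": "en-au",
    "en-ca": "en-us",
    "en-gh": "en-gb",
    "en-ng": "en-gb",
    "en-tz": "en-gb",
    "en-za": "en-gb",
    "en-ie": "en-gb",
}


def effective_accent(lang):
    """Return the accent a code is reported to produce, or None if not covered here."""
    code = lang.lower()
    if code in DISTINCT_ACCENTS:
        return code
    return FALLBACKS.get(code)
```

So effective_accent("en-nz") gives "en-au", and any code not mentioned in this post gives None rather than a guess.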

I’m hoping this post here:

Will lead to something really cool allowing us to use WaveNet based voices (way more natural sounding) in our TTS responses.