SSML Support in Mycroft

What is the status of SSML support? What I could find gives a mixed picture:
The Mycroft core documentation does not seem to mention it. The 2016 roadmap lists it as planned, but the more recent 2018 roadmap (the sub-roadmap for TTS) does not mention it. Recent mimic release notes mention additional features, and at least one Jarbas pull request to core containing SSML features seems to have been included some time ago.

Lastly, I passed a message containing some simple SSML tags to self.speak(). Mycroft (with the classic voice) did not sound noticeably different, but it also didn’t do anything weird like pronounce the tag names.

If there is some level of support, I think at least a minimal starter entry in the docs would be really helpful.

In a second experiment with mycroft-cli-client, I think (maybe, just maybe) I did hear some difference after I put the whole utterance inside a “root” tag pair, which I had not done the first time. The difference I perceived was not very strong, though.
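To make the second experiment easier to repeat, here is a minimal sketch of a helper that wraps an utterance in a root tag pair. It assumes the “root” tag is SSML's standard `<speak>` element. The name `wrap_ssml` and its parameters are hypothetical and not part of Mycroft. Whether a given engine honours the `<prosody>` attributes is a separate question.

```python
from xml.sax.saxutils import escape, quoteattr


def wrap_ssml(text, rate=None, pitch=None):
    """Wrap plain text in a <speak> root, optionally inside <prosody>.

    Hypothetical helper for experimenting; not a Mycroft API.
    """
    # Escape &, < and > so the text cannot break the markup.
    body = escape(text)
    attrs = []
    if rate is not None:
        attrs.append("rate={}".format(quoteattr(rate)))
    if pitch is not None:
        attrs.append("pitch={}".format(quoteattr(pitch)))
    if attrs:
        body = "<prosody {}>{}</prosody>".format(" ".join(attrs), body)
    return "<speak>{}</speak>".format(body)


if __name__ == "__main__":
    # Inside a skill this string would be passed to self.speak().
    print(wrap_ssml("Hello there", rate="slow"))
```

Building the string in one place makes it less likely that the root tag is forgotten, which seems to be what differed between the two experiments.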

Also, it seemed to me that rendering to audio took much, much longer. This may explain the situation: the functionality is mostly there, but we are reluctant to declare it official and supported, because people might then complain about it or get a false bad impression of Mycroft performance.

SSML has been supported since April 2018, but it depends on your configured TTS engine. I think mimic2 does not support SSML.

I tested with Amazon Polly only; the PR is pending (I will get it in working order again soon).

Related: there is a helper tool, an SSML builder, in jarbas_utils >= 0.3.0.

Jarbas is correct: mimic1 supports SSML and is invoked with the ssml flag from Mycroft, so utterances with SSML tags will be rendered using them (as long as the tags are supported). Mimic1 1.3 adds support for the pitch prosody tag. Mycroft still pulls in 1.2 by default; that will change as soon as I get time to repackage mimic1 for the Mark-1.
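Since rendering depends on which tags the engine supports, a skill could drop unsupported tags before speaking. The following is only a sketch of that idea. The `SUPPORTED_TAGS` set is a made-up placeholder, not the actual list for mimic1 or any other engine, and `strip_unsupported` is a hypothetical helper rather than a description of what Mycroft does internally.

```python
import re

# Placeholder only: replace with the tags your TTS engine really supports.
SUPPORTED_TAGS = {"speak", "prosody"}

# Matches opening, closing and self-closing tags; captures the tag name.
TAG_RE = re.compile(r"</?\s*([A-Za-z][\w:-]*)[^>]*>")


def strip_unsupported(utterance, supported=SUPPORTED_TAGS):
    """Remove tags whose name is not in `supported`, keeping their text."""
    def keep_or_drop(match):
        name = match.group(1).lower()
        return match.group(0) if name in supported else ""
    return TAG_RE.sub(keep_or_drop, utterance)


if __name__ == "__main__":
    ssml = ('<speak>Hi <emphasis>there</emphasis> '
            '<prosody pitch="high">you</prosody></speak>')
    print(strip_unsupported(ssml))
```

A regular expression is enough for quick experiments like this; a real XML parser would be more robust for malformed or nested markup.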

Will make a note about the docs…

Thanks, that all clarifies things. It would certainly be helpful for the (hopefully) forthcoming docs to list the tags and properties that are supported so far.

I run Mycroft using Mimic1 with SSML tags in my skill, but this seems to make no difference to the audio output: the tags are simply not pronounced. Is this a known issue? If so, could anyone suggest where to start to resolve it?

In his reply, forslund mentions invocation with the “ssml flag”, which I think is this:

You should be able to test that on your own if you have ssh terminal access. Then the question becomes: how do we correctly configure mycroft to make mimic invocations with that flag (or general custom flags)?