Mycroft Community Forum

Can Amazon's STT engine (Transcribe) be added to Mycroft?

Hello,

According to the documentation, a number of STT engines are supported by Mycroft, but Transcribe (Amazon’s STT) is not listed. Does that mean Mycroft won’t be able to work with it?

Regards.

I had a brief look at the Amazon Transcribe service, including the terms and conditions. You will need an AWS account, and with your initial subscription you get 60 minutes per month for free (for the first 12 months). Each request is billed as 15 seconds minimum. That works out to 240 free requests per month, or about 8 per day, which should be enough for development and testing, but for daily use you will probably be charged (0.006 USD per 15-second transcription).
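
To make that arithmetic easy to re-check, here is a small sketch. It uses only the figures quoted in this post (current AWS pricing may differ) and assumes a 30-day month and that no request runs longer than the 15-second minimum; the names are made up for illustration.

```python
# Figures quoted in the post above; actual AWS pricing may differ.
FREE_SECONDS_PER_MONTH = 60 * 60   # 60 free minutes per month
MIN_BILLED_SECONDS = 15            # each request is billed as at least 15 s
PRICE_PER_REQUEST_USD = 0.006      # quoted price for one 15 s transcription
DAYS_PER_MONTH = 30                # assumption: a 30-day month

free_requests_per_month = FREE_SECONDS_PER_MONTH // MIN_BILLED_SECONDS
free_requests_per_day = free_requests_per_month / DAYS_PER_MONTH


def monthly_cost_usd(requests_per_day: int) -> float:
    """Estimated monthly charge once the free requests are used up."""
    total_requests = requests_per_day * DAYS_PER_MONTH
    billable = max(0, total_requests - free_requests_per_month)
    return round(billable * PRICE_PER_REQUEST_USD, 2)


print(free_requests_per_month)   # 240
print(free_requests_per_day)     # 8.0
print(monthly_cost_usd(50))      # 7.56
```

So at 50 short requests a day the estimate is still only a few dollars a month; the free tier mainly matters for testing.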

Looking at the Transcribe Service API, it doesn’t look too complicated as long as you don’t want to use features such as custom vocabulary.

Thank you. Yes, the API does not look complicated, but I’m having trouble understanding how I can integrate Transcribe into Mycroft. In the documentation I see which modules are supported by Mycroft (e.g. bing, govivace, etc.). My question is: does anyone know where these modules are defined? That way I can get an idea of what my Transcribe module should look like :slight_smile: Also, how does Mycroft know where to find such modules?

Look here: https://github.com/MycroftAI/mycroft-core/blob/dev/mycroft/stt/__init__.py
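
For orientation, a custom engine would look roughly like the sketch below. This is not the actual mycroft-core code: the `STT` class here is a simplified stand-in for the base class in that file, and `TranscribeSTT`, `STT_CLASSES`, `create_stt`, and `transcribe_fn` are hypothetical names. It also assumes the audio arrives as raw bytes, which may not match what Mycroft really passes in. The idea is simply that each engine subclasses a common base, implements `execute`, and is looked up by the module name from the configuration.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Optional, Type


class STT(ABC):
    """Stand-in for the base class in mycroft/stt/__init__.py (assumed shape)."""

    def __init__(self, lang: str = "en-US"):
        self.lang = lang

    @abstractmethod
    def execute(self, audio: bytes, language: Optional[str] = None) -> str:
        """Return the transcription of the given audio."""


class TranscribeSTT(STT):
    """Hypothetical engine that delegates to a transcription callable."""

    def __init__(self, transcribe_fn: Callable[[bytes, str], str],
                 lang: str = "en-US"):
        super().__init__(lang)
        self.transcribe_fn = transcribe_fn

    def execute(self, audio: bytes, language: Optional[str] = None) -> str:
        # Fall back to the engine's default language if none is given.
        return self.transcribe_fn(audio, language or self.lang)


# Hypothetical registry: configured module name -> engine class.
STT_CLASSES: Dict[str, Type[STT]] = {"transcribe": TranscribeSTT}


def create_stt(module: str, **kwargs) -> STT:
    """Look up the engine class by its configured name and instantiate it."""
    try:
        engine_class = STT_CLASSES[module]
    except KeyError:
        raise ValueError(f"Unknown STT module: {module}") from None
    return engine_class(**kwargs)
```

With something like this, adding an engine comes down to writing the subclass and registering its name, so the rest of the system never needs to know which service is behind `execute`.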

I did some tests with AWS Transcribe. The API itself is very easy to use, but there are some roadblocks:

  • The audio file for transcription must be uploaded to an S3 bucket. As far as I can tell, there is no way to pass it as an HTTP/JSON payload as other services do, so additional code is required to upload the audio file to your S3 bucket (and delete it afterwards).
  • For each transcription you must create a job. In my tests the round trip of job creation, transcription, and processing the result took 40 seconds at minimum and 112 seconds at maximum (file upload to the S3 bucket not included). This is not feasible for a Mycroft application.
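
The upload, job, poll, and clean-up round trip described in these two points can be sketched as follows. This is a minimal sketch, not tested against a live account: `s3` and `transcribe` are expected to be `boto3.client("s3")` and `boto3.client("transcribe")`, the bucket is assumed to exist already, and the function name, key prefix, and `fetch` parameter are my own inventions. The clients are passed in so the flow can be tried with stand-ins.

```python
import json
import time
import uuid
from urllib.request import urlopen


def _fetch_json(url):
    """Download and parse the transcript JSON that the job result points to."""
    with urlopen(url) as resp:
        return json.load(resp)


def transcribe_file(s3, transcribe, path, bucket, lang="en-US",
                    poll_seconds=2.0, timeout=180.0, fetch=_fetch_json):
    """Upload a WAV file, run one transcription job, and return the text."""
    key = f"mycroft-stt/{uuid.uuid4()}.wav"   # hypothetical key prefix
    job_name = f"mycroft-{uuid.uuid4()}"
    s3.upload_file(path, bucket, key)
    try:
        transcribe.start_transcription_job(
            TranscriptionJobName=job_name,
            Media={"MediaFileUri": f"s3://{bucket}/{key}"},
            MediaFormat="wav",
            LanguageCode=lang,
        )
        deadline = time.monotonic() + timeout
        while True:
            job = transcribe.get_transcription_job(
                TranscriptionJobName=job_name)["TranscriptionJob"]
            status = job["TranscriptionJobStatus"]
            if status == "COMPLETED":
                result = fetch(job["Transcript"]["TranscriptFileUri"])
                return result["results"]["transcripts"][0]["transcript"]
            if status == "FAILED":
                raise RuntimeError(job.get("FailureReason", "transcription failed"))
            if time.monotonic() >= deadline:
                raise TimeoutError(f"job {job_name} not finished after {timeout}s")
            time.sleep(poll_seconds)
    finally:
        # Remove the uploaded audio whether or not the job succeeded.
        s3.delete_object(Bucket=bucket, Key=key)
```

The polling loop is where the long round trip shows up: nothing comes back until the job reports `COMPLETED`, however short the audio is.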

There is a StreamingTranscription API as well, but this looks more complicated and is currently only available for English and French.

Thank you. I ran into the same issues. Using Transcribe takes too long (around 3 minutes to get the text), so it is not an option anymore. I’m gonna test Kaldi, although setting it up is much harder than I thought it would be. Thank you all for the comments and the help!

Community member @JarbasAl already did some work:


This is based on https://github.com/gooofy/zamia-speech, which is worth checking out as well.

You should also search chat.mycroft.ai for “kaldi”…

Quick question for the broader community: is there real demand for adding this as a supported STT service (at least the streaming API in English and French)?

The AWS Transcribe docs do not seem to be up to date in all places. In total, the following languages are supported by the Transcribe Streaming API:

  • Australian English (en-AU)
  • British English (en-GB)
  • US English (en-US)
  • French (fr-FR)
  • Canadian French (fr-CA)
  • US Spanish (es-US)

Personally, I think the streaming transcription doesn’t add much to the existing STT options.

The latency seems to be a big issue, and combined with the need for separate account management it would make for a number of headaches.

Please focus on private alternatives. Devote resources to fine-tuning DeepSpeech per user or something; don’t spend a second supporting extra spy services for no gain.

Crikey, they’ve got Australian English support?!?! Pretty tempting to yarn with that galah.

I can’t understand some of the Aussies around here :stuck_out_tongue:
