Mimic TTS audio sample rate

YuvrajDodia · January 8, 2019, 8:54am

Hi,

I have been able to test Mimic TTS on a Pi 3 device.
I am writing the output to a WAV file. On the Pi 3 it produces a 16000 Hz WAV file.

Now I am trying it on my Debian Linux device.
But on Debian it outputs a 44100 Hz WAV file.
Because of the higher sampling rate, the TTS audio is breaking up.

How can I change the sampling rate of the TTS output?
I want to set the output WAV file's sampling rate to 16000 Hz.

Thanks,
Yuvi

KathyReid · January 8, 2019, 11:58am

Hi @YuvrajDodia,

Firstly welcome, it’s great to have you here.

Could I confirm a few details with you?

Are you using the command $ ./mimic -f TEXTFILE -o WAVEFILE to record the TTS output to a WAVE file?
Is pulseaudio installed on your Debian system? You can find out by running the command which pulseaudio. If pulseaudio is installed, please have a look in the file /etc/pulse/daemon.conf and see if these two lines are present:

; default-sample-format = s16le
; default-sample-rate = 44100

These define the default sample format and rate.
The default-sample-rate is measured in Hertz, and the default-sample-format s16le means signed 16-bit little-endian samples.

For reference, here is the same information from a Picroft:

; default-sample-format = s16le
; default-sample-rate = 44100
; alternate-sample-rate = 48000
; default-sample-channels = 2
; default-channel-map = front-left,front-right
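
In case it helps, here is a rough Python sketch for checking which of these settings are actually active. In daemon.conf, lines starting with ; or # are comments, so a commented-out line only documents the built-in default. The helper name parse_daemon_conf and the sample text are made up for illustration; on a real system you would pass in the contents of /etc/pulse/daemon.conf instead.

```python
def parse_daemon_conf(text):
    """Split daemon.conf-style text into (active, commented) key/value dicts."""
    active, commented = {}, {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or "=" not in line:
            continue  # skip blank lines and prose comments
        is_comment = line[0] in ";#"
        key, _, value = line.lstrip(";# ").partition("=")
        target = commented if is_comment else active
        target[key.strip()] = value.strip()
    return active, commented


# Illustrative sample only; read the real file to check an actual system.
sample = """\
; default-sample-format = s16le
; default-sample-rate = 44100
default-sample-rate = 48000
"""

active, commented = parse_daemon_conf(sample)
print(active.get("default-sample-rate", "not set (built-in default applies)"))
```

If a key only shows up in the commented dict, the daemon is using its built-in default for that setting.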

When you say the TTS audio is breaking, what are you observing? How is it breaking?

YuvrajDodia · January 9, 2019, 12:16pm

Hi KathyReid,

Thank you for the reply.

I am using ./mimic -t 'my message text' -o out.wav
Yes, I have pulseaudio installed for echo cancellation on the Debian system.
I have changed default-sample-rate to 48000 in /etc/pulse/daemon.conf, as my audio codec supports only 16000/32000/48000 Hz.

mimic always creates the WAV file with a 44100 Hz sample rate.
So the mimic WAV file has a 44100 Hz sample rate, while my pulseaudio plays it with a 48000 Hz clock.
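
One way to confirm the rate stored in the file header is a short Python check with the standard-library wave module. This is only a minimal sketch: wav_info is a hypothetical helper name, and the demo below generates its own small file so that it runs anywhere. For the real case you would call wav_info("out.wav") on the file mimic wrote.

```python
import os
import tempfile
import wave


def wav_info(path):
    """Return (sample_rate_hz, channels, bits_per_sample) from a WAV header."""
    with wave.open(path, "rb") as w:
        return w.getframerate(), w.getnchannels(), w.getsampwidth() * 8


# Demo on a generated file; replace demo_path with the mimic output in practice.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(demo_path, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)                  # 16-bit samples
    w.setframerate(44100)
    w.writeframes(b"\x00\x00" * 441)   # 10 ms of silence

print(wav_info(demo_path))
```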

Is there a command-line option to tell mimic to create the WAV file with a 16000 Hz sample rate?
I am not sure where it is getting the default sample rate of 44100 Hz from on my system.

Thanks,
Yuvi

KathyReid · January 9, 2019, 1:16pm

Hi @YuvrajDodia, unfortunately I don’t think there is an option to set it from the command line. Here are all the options for the command:

Carnegie Mellon University, Copyright (c) 1999-2011, all rights reserved
  mimic developers, Copyright (c) 2016, all rights reserved
  version: mimic-1.2.0.2 ()
usage: mimic TEXT/FILE [WAVEFILE]
  Converts text in TEXTFILE to a waveform in WAVEFILE
  If text contains a space the it is treated as a literal
  textstring and spoken, and not as a file name
  if WAVEFILE is unspecified or "play" the result is
  played on the current systems audio device.  If WAVEFILE
  is "none" the waveform is discarded (good for benchmarking)
  Other options must appear before these options
  --version   Output mimic version number
  --help      Output usage string
  -o WAVEFILE Explicitly set output filename
  -f TEXTFILE Explicitly set input filename
  -t TEXT     Explicitly set input textstring
  -p PHONES   Explicitly set input textstring and synthesize as phones
  --set F=V   Set feature (guesses type)
  -s F=V      Set feature (guesses type)
  --seti F=V  Set int feature
  --setf F=V  Set float feature
  --sets F=V  Set string feature
  -ssml       Read input text/file in ssml mode
  -b          Benchmark mode
  -l          Loop endlessly
  -voice NAME Use voice NAME (NAME can be filename or url too)
  -voicedir NAME Directory contain voice data
  -lv         List voices available
  -add_lex FILENAME add lex addenda from FILENAME
  -pw         Print words
  -ps         Print segments
  -psdur      Print segments and their durations (end-time)
  -pr RelName Print relation RelName
  -voicedump FILENAME Dump selected (cg) voice to FILENAME
  -v          Verbose mode

Do you think it could be this bug with pulseaudio?
https://bugs.freedesktop.org/show_bug.cgi?id=66424
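
Since the option list above has no sample-rate flag, one possible workaround is to resample the WAV file after mimic has written it. Below is a rough sketch in plain Python, not anything mimic itself provides. The helper name resample_wav is hypothetical, it assumes 16-bit PCM input, and it uses simple linear interpolation with no low-pass filter, so a dedicated resampling tool will likely sound better. The demo generates its own one-second 44100 Hz file; in practice the source would be the out.wav from mimic.

```python
import array
import os
import sys
import tempfile
import wave


def resample_wav(src_path, dst_path, target_rate=16000):
    """Resample a 16-bit PCM WAV to target_rate by linear interpolation."""
    with wave.open(src_path, "rb") as src:
        channels = src.getnchannels()
        src_rate = src.getframerate()
        if src.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        samples = array.array("h")
        samples.frombytes(src.readframes(src.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    n_in = len(samples) // channels
    n_out = n_in * target_rate // src_rate
    out = array.array("h")
    for i in range(n_out):
        pos = i * src_rate / target_rate  # fractional position in source frames
        j = int(pos)
        frac = pos - j
        k = min(j + 1, n_in - 1)          # clamp at the last frame
        for c in range(channels):
            a = samples[j * channels + c]
            b = samples[k * channels + c]
            out.append(int(round(a + (b - a) * frac)))

    if sys.byteorder == "big":
        out.byteswap()
    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(channels)
        dst.setsampwidth(2)
        dst.setframerate(target_rate)
        dst.writeframes(out.tobytes())
    return n_out


# Demo: one second of silence at 44100 Hz, resampled to 16000 Hz.
tmp_dir = tempfile.mkdtemp()
src_file = os.path.join(tmp_dir, "out.wav")
dst_file = os.path.join(tmp_dir, "out_16k.wav")
with wave.open(src_file, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(44100)
    w.writeframes(b"\x00\x00" * 44100)

frames_written = resample_wav(src_file, dst_file)
with wave.open(dst_file, "rb") as w:
    result = (w.getframerate(), w.getnframes())
print(result)
```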

YuvrajDodia · January 10, 2019, 7:35am

Hi KathyReid,

Yes, mimic doesn’t have any option for the audio sample rate.
You are right, the issue is with my pulseaudio.
If I disable pulseaudio, mimic generates a WAV file with a 44100 Hz sample rate. When I play it with the ALSA aplay utility, the audio is good (no breaking or jumps in playback).
For now I will continue without pulseaudio and try to resolve this by fixing the pulseaudio issue.

Thanks for your support.

Regards,
Yuvi