Creating my own voice from existing .wav files?

Hello all,

This question has probably come up here before, but after researching the topic online I am more confused than when I started.

My goal is to create my own voice (in German), since I want a slightly “robotic” voice for Mycroft.

To get started, I watched a tutorial and came across a Google Colab notebook that lets you train a Tacotron 2 model and synthesize speech with it. To better understand the topic, I tried to reproduce it. Unfortunately, the notebook used TensorFlow 1 and was therefore outdated.

Mimic 2 and Mimic 3 also confuse me a bit, and I can’t really find an entry point. Can I use Mimic with my own models?

To keep the first attempt simple, I decided to use an existing voice: that of GLaDOS, since it is already available as .wav files. If I manage to create a model that works with Mycroft, I will invest more time.

I downsampled the .wav files to 22050 Hz and created a transcript file.
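To make that step concrete, here is a minimal sketch of how such a dataset could be sanity-checked before training. It only uses the Python standard library. The names `find_problems` and `write_metadata` are made up for illustration, and the pipe-delimited `id|text` layout is an assumption loosely modelled on the LJSpeech convention; the layout a given training tool actually expects may differ.

```python
import wave
from pathlib import Path

TARGET_RATE = 22050  # Hz, the sample rate mentioned above


def wav_sample_rate(path):
    """Return the sample rate stored in the header of a PCM .wav file.

    Note: the stdlib wave module only reads PCM files; other encodings
    (for example 32-bit float) raise wave.Error.
    """
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getframerate()


def find_problems(wav_dir, transcripts, target_rate=TARGET_RATE):
    """Return a list of readable problems; empty if everything looks consistent.

    transcripts: dict mapping clip id (file name without .wav) to its text.
    """
    problems = []
    wav_dir = Path(wav_dir)
    for clip_id in sorted(transcripts):
        wav_path = wav_dir / f"{clip_id}.wav"
        if not wav_path.exists():
            problems.append(f"{clip_id}: missing .wav file")
            continue
        rate = wav_sample_rate(wav_path)
        if rate != target_rate:
            problems.append(f"{clip_id}: {rate} Hz instead of {target_rate} Hz")
    return problems


def write_metadata(transcripts, out_path):
    """Write one 'id|text' line per clip (assumed layout, see note above)."""
    lines = [f"{clip_id}|{text}" for clip_id, text in sorted(transcripts.items())]
    Path(out_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
```

Calling `find_problems("wavs", transcripts)` with a hypothetical `wavs` folder returns a list of clips that are missing or not at 22050 Hz, so mismatches show up before any training time is spent.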

And now I’m stuck because all the tutorials I’ve found are outdated or don’t get me anywhere.

Could someone explain the necessary steps to create a voice for Mycroft, or give me a hint about which topics I should look into?

Edit: I saw the guide in the MycroftAI/mimic2 GitHub repository (a text-to-speech engine based on the Tacotron architecture, initially implemented by Keith Ito). Is this still the way to go?

Thank you very much,