CorentinJ/Real-Time-Voice-Cloning


#1

Thoughts?


#2

It’s exciting to see these adapted voices getting better, particularly for users of alternative communication technologies, who previously either had to pay thousands of dollars for a custom voice or use the same three voices as everyone else with the same device.

Speaker identification would certainly be useful for Mycroft too…

Have you tried it out on your own voice?


#3

It currently requires a CUDA-enabled GPU. I’m sure there is a workaround (e.g. use the torch-rocm Docker image, run it from in there, and do some code massaging); I just haven’t done that yet.
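For the CPU side of that massaging, the usual PyTorch pattern is to pick a device at startup and fall back when CUDA isn’t available. This is just the generic idiom, not a tested fix for this repo, which hard-codes `.cuda()` calls in places that would also need changing:

```python
import torch

# Select CUDA when a GPU is present, otherwise fall back to CPU.
# Tensors and models then go through .to(device) instead of .cuda().
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"running on: {device.type}")
```

Inference on CPU will be far slower than real time, but it at least lets you try the samples without dedicated hardware.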

I would be really interested in anyone else’s experiences though!


#4

If you want to try it out without dropping a lot of cash on hardware, Fast.ai has some suggestions and guides on how to get started on different cloud platforms.


#5

How do you save a model and run it as a server? That’s what I want to know.
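The general shape is: load the checkpoints once at startup, then wrap inference in a small HTTP handler. Here is a minimal sketch using only the standard library; the stub `load_model` stands in for actually loading the encoder/synthesizer/vocoder weights (e.g. via `torch.load`), and the JSON request/response format is my own invention:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def load_model():
    # Hypothetical stand-in: real code would load the three checkpoints
    # here (torch.load / the repo's own loaders) exactly once, at startup.
    return lambda text: f"audio-for:{text}"


MODEL = load_model()


class CloneHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read a JSON body like {"text": "..."} and run inference on it.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        result = MODEL(payload["text"])
        body = json.dumps({"audio": result}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Silence the default per-request stderr logging.
        pass


def serve(port=0):
    # port=0 lets the OS pick a free port; serve in a background thread.
    server = ThreadingHTTPServer(("127.0.0.1", port), CloneHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A real deployment would return WAV bytes rather than JSON and would want a proper framework (Flask, FastAPI), but the key point is the same: load weights once, keep the model resident, and only run inference per request.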

Quality is so-so based on the samples, though. I wonder whether a version conditioned on a longer sample (say, a minute instead of a few seconds) would improve things.