Funny thought. Has anyone tinkered with librosa's melspectrogram / PyTorch spectrogram, along with other tools, to change the characteristics of a model? Given a rock-solid multispeaker model, you should be able to create a respectable range of flavours.
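To make the idea a bit more concrete, here is a rough sketch of one possible kind of tinkering: stretching or squeezing a mel spectrogram along its mel-bin axis before it goes to a vocoder. It assumes the spectrogram is a NumPy array of shape (n_mels, n_frames), which is the layout librosa's melspectrogram returns. The name `warp_mel_bins` and the `factor` parameter are made up for illustration, and whether a given model answers this with a convincing change of voice is untested.

```python
import numpy as np


def warp_mel_bins(mel, factor):
    """Stretch (factor > 1) or squeeze (factor < 1) a mel spectrogram
    along the mel-bin axis using linear interpolation.

    mel    : array of shape (n_mels, n_frames)
    factor : hypothetical warp factor, 1.0 leaves the input unchanged
    """
    n_mels = mel.shape[0]
    bins = np.arange(n_mels, dtype=float)
    # Output bin i takes its value from input position i / factor.
    src = bins / factor
    # Bins that would read past the top of the input get the floor value.
    floor = float(mel.min())
    warped = np.empty_like(mel, dtype=float)
    for t in range(mel.shape[1]):
        warped[:, t] = np.interp(src, bins, mel[:, t], right=floor)
    return warped


# Toy input: energy concentrated in mel bin 10 for three frames.
mel = np.zeros((40, 3))
mel[10, :] = 1.0
shifted = warp_mel_bins(mel, 2.0)
print(int(shifted[:, 0].argmax()))
```

The warped array keeps the input shape, so it could in principle be dropped in wherever the original spectrogram was used.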
Stumbled over Voice Cloning lately and its spin-off Resembler, around which a whole business model has evolved.