I have recently been running DeepSpeech locally, with the ‘pretrained’ model, on a separate computer in my house.
It was fairly easy to set up and to point Mycroft at it. The server does not have a GPU, so it’s not as fast as it could be, but I think staying on the local network makes up for that, so it ends up not that different from the cloud service, which is kind of slow too, in my opinion. I will probably get a GPU-based server at some point, but I don’t expect a huge improvement in speed, because CPU-only is already usable for the short commands I use.
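For anyone wondering what "pointing Mycroft at it" looks like, here is a rough sketch, not necessarily the exact setup described above. It assumes Mycroft's `deepspeech_server` STT module and a server that accepts audio at an `/stt` endpoint on port 8080. The address `192.168.1.50` and the helper names `build_stt_override` and `merge_config` are made up for illustration. The script only builds the settings you would merge into your user `mycroft.conf`; it does not touch any files.

```python
import json


def build_stt_override(host: str, port: int = 8080) -> dict:
    """Build the STT section that selects a local DeepSpeech server.

    Assumption: the server listens on http://<host>:<port>/stt.
    Adjust host, port, and path to match your own server.
    """
    return {
        "stt": {
            "module": "deepspeech_server",
            "deepspeech_server": {"uri": f"http://{host}:{port}/stt"},
        }
    }


def merge_config(base: dict, override: dict) -> dict:
    """Recursively merge override into a copy of base, keeping other keys."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


if __name__ == "__main__":
    # Stand-in for whatever is already in your user config.
    existing = {"lang": "en-us", "stt": {"module": "mycroft"}}
    # 192.168.1.50 is a placeholder LAN address, not a real default.
    updated = merge_config(existing, build_stt_override("192.168.1.50"))
    print(json.dumps(updated, indent=2))
```

The printed JSON is what would go into the user-level `mycroft.conf`. Mycroft needs a restart before it picks up the new STT module.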
The big hit I’m taking is with accuracy. I have to speak slowly, right in front of the Mycroft, and leave gaps of silence between words.
I’m just starting to research ways to better train the local service. I have not gotten very far yet.
My pipe dream would be for the Mycroft community to be able to share and assimilate incremental training gains without sharing any audio. That’s way over my head at this point, though.