Self-hosted Selene backend and GUI

JarbasAl · March 21, 2024, 2:58am

Mark1 support is a goal! In fact, we acquired 3 Mark1s for the team over the past year for this purpose.

That said, support is a work in progress. The raspOVOS image should detect the Mark1 and work out of the box, but the mouth movements only really work well with the original Mimic1 TTS; other voices don’t look as good. In classic Mycroft, though, other voices don’t show mouth movements at all, so it’s already an improvement!

By default OVOS uses FasterWhisper for STT, via public servers hosted by the community. This is not the fastest STT, and the Mark1 will feel much faster if you use something like the Google Chromium plugin (equivalent to the Google proxy on the Selene servers).
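
For anyone wanting to try that switch, here is a rough sketch of the config change, written as Python that prints the JSON fragment. The module name `ovos-stt-plugin-chromium` and the idea that the active plugin is selected through an `stt.module` key in the user's `mycroft.conf` are assumptions, not something stated in this thread, so check the plugin's README before copying it.

```python
import json

# Assumed plugin name; verify it against the plugin's own documentation.
CHROMIUM_PLUGIN = "ovos-stt-plugin-chromium"


def with_stt_module(config: dict, module: str) -> dict:
    """Return a copy of `config` with the STT plugin set to `module`.

    Other keys, including any existing options under "stt", are preserved.
    """
    updated = dict(config)
    stt = dict(updated.get("stt", {}))
    stt["module"] = module
    updated["stt"] = stt
    return updated


if __name__ == "__main__":
    existing = {"lang": "en-us"}  # stand-in for whatever is already configured
    print(json.dumps(with_stt_module(existing, CHROMIUM_PLUGIN), indent=2))
```

The printed fragment would be merged into the user config by hand; the function copies rather than mutates so the original settings stay available for comparison.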

markIV · March 22, 2024, 2:24am

This is all great news! I will be writing OVOS to a new SD card soon. I’ve been using Mimic3 via the MaryTTS plugin and Whisper via a primitive web service I run in Flask. They are both GPU accelerated. Local, fast, and only mostly arcane incantations needed to restart them… I’ll let you all know how I fare. Thank you very much.
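
To give a feel for how small such a speech-to-text web service can be, here is a minimal sketch. It is not the service described above: it uses the standard library's `wsgiref` in place of Flask so it runs without extra installs, and the `/stt` route, the port, and the `transcribe` placeholder are all hypothetical. A real version would call a Whisper model inside `transcribe`.

```python
import json
from wsgiref.simple_server import make_server


def transcribe(audio_bytes: bytes) -> str:
    """Hypothetical placeholder: swap in a real Whisper call here."""
    return f"<{len(audio_bytes)} bytes of audio received>"


def app(environ, start_response):
    """WSGI app: accept POST /stt with raw audio in the body, reply with JSON."""
    if environ.get("REQUEST_METHOD") != "POST" or environ.get("PATH_INFO") != "/stt":
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]
    length = int(environ.get("CONTENT_LENGTH") or 0)
    audio = environ["wsgi.input"].read(length)
    body = json.dumps({"text": transcribe(audio)}).encode("utf-8")
    start_response(
        "200 OK",
        [("Content-Type", "application/json"), ("Content-Length", str(len(body)))],
    )
    return [body]


def serve(host: str = "127.0.0.1", port: int = 8080) -> None:
    """Run the app until interrupted (host and port are arbitrary choices)."""
    with make_server(host, port, app) as server:
        server.serve_forever()
```

Calling `serve()` starts the listener; a client would then POST audio bytes to `/stt` and read the `text` field from the JSON reply.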

builderjer · March 22, 2024, 11:39am

https://ovosimages.ziggyai.online/raspbian/newest

The headless image here is working rather well

mikejgray · March 22, 2024, 4:03pm

Great to hear! Btw, OVOS has web services for both STT and TTS that work with any supported plugin. That includes Whisper. MaryTTS may be deprecated, though; if it is, you may want to try Piper, since it has all the same voices as Mimic3 but is maintained.

https://github.com/OpenVoiceOS/ovos-tts-plugin-server

I run FasterWhisper and Coqui locally, GPU backed, and all my assistants point to those local servers.
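
Pointing an assistant at local servers like that might look roughly like the sketch below, which builds the config override as a Python dict and prints it as JSON. Treat the details as assumptions: `ovos-tts-plugin-server` is the plugin linked above, but `ovos-stt-plugin-server`, the `url` and `host` option names, and the example addresses are illustrative guesses that should be checked against each plugin's README.

```python
import json


def local_server_overrides(stt_url: str, tts_url: str) -> dict:
    """Build a config fragment that routes STT and TTS to self-hosted servers.

    Plugin names and option keys are assumptions for illustration only.
    """
    return {
        "stt": {
            "module": "ovos-stt-plugin-server",
            "ovos-stt-plugin-server": {"url": stt_url},
        },
        "tts": {
            "module": "ovos-tts-plugin-server",
            "ovos-tts-plugin-server": {"host": tts_url},
        },
    }


if __name__ == "__main__":
    # Hypothetical LAN addresses for the two local servers.
    fragment = local_server_overrides(
        "http://192.168.1.50:8080/stt", "http://192.168.1.50:9666"
    )
    print(json.dumps(fragment, indent=2))
```

Keeping both blocks in one fragment makes it easy to apply the same override to every assistant on the network.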

This is the second time in 24 hours that FasterWhisper in the STT wrapper has come up, so I guess I need to do a blog post on it!