TensorFlow Lite

Hi.
Has anyone tested using TensorFlow Lite?

What disadvantages are there to using Lite?

Tested it on/for/with what?

Anything: wake word, Mycroft, other, etc. It is an open question.

I have a repo of the Google KWS code that is just an easy-access copy, as for some reason google-research keeps everything in a single huge repo.

TF-Lite is exceptionally fast, and I have a 20ms streaming KWS running from the examples Google provides.
It uses flex delegates, so the delegated full-TF layer is prefiltered by standard TFLite CNNs, and the speed increase from the quantised model is dramatic.
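For reference, enabling flex delegates is just a flag on the converter. A minimal sketch, using a toy stand-in model (the real streaming KWS graphs live in the repo mentioned above):

```python
import numpy as np
import tensorflow as tf

# Toy stand-in model; the real streaming KWS graph comes from the KWS repo.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(8,)),
    tf.keras.layers.Dense(4, activation="relu"),
    tf.keras.layers.Dense(2),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Allow ops with no TFLite builtin to fall back to the flex (TF Select)
# delegate - this is what lets the recurrent layers of the streaming
# models convert at all.
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,
    tf.lite.OpsSet.SELECT_TF_OPS,
]
tflite_model = converter.convert()

# Run the converted model with the TFLite interpreter.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]
interpreter.set_tensor(inp["index"], np.zeros((1, 8), dtype=np.float32))
interpreter.invoke()
logits = interpreter.get_tensor(out["index"])
print(logits.shape)  # (1, 2)
```

The toy Dense model converts entirely to builtins; the flags only matter once the graph contains ops (like the streaming GRU) without a TFLite builtin.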

Playing with that Google KWS repo, it gives accuracy results for TF vs TFLite, and the speed increase of a TFLite quantised model is also really big:

I0329 19:49:03.096672 139843685021504 test.py:495] tf test accuracy, stream model state external = 100.00% 200 out of 609
I0329 19:49:40.055655 139843685021504 test.py:495] tf test accuracy, stream model state external = 100.00% 400 out of 609
I0329 19:50:17.611867 139843685021504 test.py:495] tf test accuracy, stream model state external = 100.00% 600 out of 609
I0329 19:50:19.069626 139843685021504 test.py:500] TF Final test accuracy of stream model state external = 100.00% (N=609)

INFO: TfLiteFlexDelegate delegate: 2 nodes delegated out of 34 nodes with 1 partitions.

I0329 19:52:51.021229 139843685021504 test.py:619] tflite test accuracy, stream model state external = 100.000000 200 out of 609
I0329 19:52:55.191242 139843685021504 test.py:619] tflite test accuracy, stream model state external = 100.000000 400 out of 609
I0329 19:52:59.372943 139843685021504 test.py:619] tflite test accuracy, stream model state external = 100.000000 600 out of 609
I0329 19:52:59.534713 139843685021504 test.py:624] tflite Final test accuracy, stream model state external = 100.00% (N=609)

It's not all running on TFLite, as 2 nodes delegate out to full TF, but the speed increase is still huge: as you can see from the timestamps, the exact same model run as TFLite rather than full TF gets roughly a 10x speed boost.
Running on Aarch64 gives a further 2-3x increase, since the code is optimised for 64-bit and any tensor-based math benefits from the wider data path.
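You can back that ~10x figure out of the log timestamps above: both runs log at the 200- and 600-sample marks, so the delta gives seconds per inference for each backend. A quick check:

```python
from datetime import datetime

def span_seconds(start, end):
    """Seconds between two HH:MM:SS.ffffff log timestamps."""
    fmt = "%H:%M:%S.%f"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()

# Timestamps of the 200-sample and 600-sample progress lines from the logs above.
tf_secs = span_seconds("19:49:03.096672", "19:50:17.611867")      # full TF, 400 samples
tflite_secs = span_seconds("19:52:51.021229", "19:52:59.372943")  # TFLite, 400 samples

per_sample_tf = tf_secs / 400
per_sample_tflite = tflite_secs / 400
speedup = per_sample_tf / per_sample_tflite
print(f"TF: {per_sample_tf*1000:.1f} ms/sample, "
      f"TFLite: {per_sample_tflite*1000:.1f} ms/sample, "
      f"speedup: {speedup:.1f}x")
# TF: 186.3 ms/sample, TFLite: 20.9 ms/sample, speedup: 8.9x
```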

I have been playing with the streaming external-state CRNN, as in terms of state of the art it has the best latency-to-ops ratio and accuracy. It uses a GRU just like Precise, but unlike Precise the TFLite CNN layers have prefiltered the parameters to the extent that running inference 50x a second (every 20ms) takes less than 20% load of a single core on a Pi3A+.
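The "external state" part just means the model's recurrent state is an extra input/output pair that you shuttle between calls yourself, 20ms of audio at a time. A sketch of that loop, with a dummy step function and a made-up state width standing in for the real TFLite interpreter's set_tensor/invoke/get_tensor calls:

```python
import numpy as np

SAMPLE_RATE = 16000
CHUNK = SAMPLE_RATE // 50          # 20ms = 320 samples at 16kHz
STATE_SIZE = 128                   # hypothetical GRU state width

def kws_step(chunk, state):
    """Dummy stand-in for one TFLite invoke: in the real loop you would
    set_tensor() the chunk and the previous state, invoke(), then
    get_tensor() the logits and the updated state."""
    new_state = np.tanh(state * 0.9 + chunk.mean())
    logits = np.array([new_state.sum(), -new_state.sum()])
    return logits, new_state

audio = np.zeros(SAMPLE_RATE, dtype=np.float32)   # 1 second of (silent) audio
state = np.zeros(STATE_SIZE, dtype=np.float32)    # externally held state

scores = []
for i in range(0, len(audio), CHUNK):
    logits, state = kws_step(audio[i:i + CHUNK], state)
    scores.append(logits[0])

print(len(scores))  # 50 inferences for 1 second of audio
```

At 50 inferences/second, staying under 20% of one core means each invoke has to finish in under 4ms of its 20ms window, which is where the quantised TFLite speed matters.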

The only gotcha was working out how to install TF-Addons, but it's all figured out and documented in GitHub - StuartIanNaylor/g-kws: Adaption of the Googleresearch kws repo.

The real brains and state-of-the-art KWS is the google-research/kws_streaming at master · google-research/google-research · GitHub repo.

There is a recipe at google-research/kws_experiments_paper_12_labels.md at master · google-research/google-research · GitHub which is very much what Precise uses. I haven't tested it chunked at the same rate as Precise, but since the CRNN has lower ops and better accuracy, I would expect huge performance increases: perhaps 30x compared to Precise on Armv7, if TFLite is running the G-KWS GRU on Aarch64.

For TTS I suggest having a look at GitHub - TensorSpeech/TensorFlowTTS: TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported including English, Korean, Chinese, German and Easy to adapt for other languages). On a Pi4 it's only 1.3x realtime, but the quality is outstanding, and further tweaks to n-mels and sample rate with a streaming output could see that rise substantially.

The static model format of TF and the reduced op subset of TFLite are really rigid in what you can do with layers, but when you do find working models the highly optimised static models are really fast.

The TensorSpeech guys also do an ASR ('Almost State-of-the-art Automatic Speech Recognition'), but I haven't looked at it; I presume the TFLite version is also extremely rapid.

I have been mainly concentrating on KWS datasets, as the dataset recipe can massively affect robustness, and I have been extremely impressed by the accuracy you can achieve with custom datasets in the presence of high levels of noise.
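For anyone building such a dataset: the usual recipe is to mix background noise into the keyword samples at controlled SNRs. A minimal numpy sketch of mixing at a target SNR (synthetic tone and noise here, just to show the scaling):

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so it sits `snr_db` decibels below `speech`, then mix."""
    rms_s = np.sqrt(np.mean(speech ** 2))
    rms_n = np.sqrt(np.mean(noise ** 2))
    scale = rms_s / (rms_n * 10 ** (snr_db / 20))
    return speech + scale * noise

rng = np.random.default_rng(0)
# Stand-ins for a keyword clip and a background-noise clip (1s at 16kHz).
speech = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000).astype(np.float32)
noise = rng.standard_normal(16000).astype(np.float32)

mixed = mix_at_snr(speech, noise, snr_db=10)  # keyword 10dB above the noise
```

Training across a range of SNRs (e.g. clean down to 0dB) is what buys the noise robustness mentioned above.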

But I did test the TFLite TTS model just for quality and was pretty impressed.

https://drive.google.com/file/d/1hAK2NNRUYulYNYK5MeWHmsfKWL93D0Zp/view