Benchmarks for various digital assistant components

Are there any benchmarks for testing throughput of the various components of Mycroft? I've seen a post that says something like DeepSpeech on an average GPU runs at about 140% of realtime, which is OK, but it seems like a standard audio file used by everyone would let us compare hardware configurations. Maybe another file that is an obstacle course for STT, to compare accuracy.

I'll try installing DeepSpeech and report back my results either way, but it would save some work if there is a standard.
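For anyone else trying this, a minimal sketch of how the "percent of realtime" figure could be measured: time an STT call against the duration of the audio file. The `transcribe` callable here is a placeholder for whatever engine you're benchmarking (e.g. a DeepSpeech model's transcription method), not an actual Mycroft API.

```python
import time
import wave

def realtime_factor(transcribe, wav_path):
    """Return STT processing time as a fraction of audio duration.

    `transcribe` is any callable taking a WAV file path; swap in the
    real STT engine call you want to benchmark. A return value of 1.4
    corresponds to the "140% of realtime" figure quoted above.
    """
    with wave.open(wav_path, "rb") as w:
        audio_seconds = w.getnframes() / w.getframerate()
    start = time.perf_counter()
    transcribe(wav_path)
    elapsed = time.perf_counter() - start
    return elapsed / audio_seconds
```

Run against the same standard audio file on different hardware, this gives directly comparable numbers.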

DeepSpeech (for me) runs at about 200% of realtime on an Nvidia 1030 vs. about 45% on a 1070, so it depends on the GPU a great deal. Recompiling everything for your platform probably helps a bit too. Once a Mycroft dataset can be compiled, it will probably also help to generate a smaller, more focused model.

For the whole unit, a timed transaction breakdown, with stuff like…

  • STT time
  • parsing time
  • evaluation/skill matching time
  • response/skill run time
  • TTS time

and have that run across multiple normalized queries, e.g., what's the local weather, who won Super Bowl 18, how many meters in a mile, etc.
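The breakdown above could be sketched as a simple harness that threads a query through the pipeline and times each stage. The stage names mirror the list; the stage functions here are dummies standing in for the real Mycroft components, which would need to be wired in.

```python
import time

def timed_transaction(query, stages):
    """Run `query` through a sequence of (name, fn) pipeline stages,
    returning per-stage wall-clock times in seconds.

    Each stage receives the previous stage's output, mimicking the
    STT -> parse -> match -> skill -> TTS flow described above.
    """
    timings = {}
    result = query
    for name, fn in stages:
        start = time.perf_counter()
        result = fn(result)
        timings[name] = time.perf_counter() - start
    return timings

# Dummy stages for illustration; replace each lambda with the real
# component call to get an actual transaction breakdown.
stages = [
    ("stt", lambda audio: "what's the local weather"),
    ("parse", lambda text: {"intent": "weather", "text": text}),
    ("match", lambda parsed: "WeatherSkill"),
    ("skill", lambda skill: "It's 20 degrees and sunny."),
    ("tts", lambda reply: b"audio-bytes"),
]
timings = timed_transaction(b"raw-audio", stages)
```

Summing the per-stage times gives the total transaction latency, which is ultimately the number users feel.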

Really, though, if the transaction is sufficiently snappy, then it’s probably moot.

I’m going to tag my colleague @Mn0491 on this one - he may have some more benchmarks available.

Sometimes I take the long way around the tree to shorten up what I am thinking:

I see several people in the forums asking about standalone setups and asking whether this or that hardware is fast enough. The 3D software Blender has a spreadsheet that ranks GPUs (and CPUs) by how long they take to render a standard scene. I'm just thinking about how to create something like that for Mycroft, and I suppose for more than just STT.

There are currently no official benchmarks, but we have DeepSpeech running on a GTX 980 Ti and it runs at about 20% of realtime, which is pretty quick. We also have a couple of GTX 1080 Tis that we haven't tried it on.
