DeepSpeech 0.6+

Does Mycroft use DeepSpeech 0.6 or later?
If not, when can we expect it to?
DeepSpeech 0.6 and later introduce TensorFlow Lite support, which makes it very viable on ARM, mobile, and embedded devices.


Accuracy is not good; it's not yet ready for prime time at all.

But it does work basically in real time, so it's only a matter of time until Mozilla releases a better model :slight_smile: The issue now is the data available for training; DeepSpeech itself is getting really good!


What he said ^^

We are keen to switch over to DeepSpeech by default, we just haven’t got the accuracy high enough yet.

If you are trying out DeepSpeech yourself at home, you can point Mycroft to your local server for STT. All the details are here:
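For reference, pointing Mycroft at a self-hosted server comes down to the STT section of your `mycroft.conf`. The module name and URI below follow a typical deepspeech-server setup; treat the port and path as placeholders for your own install rather than exact values:

```json
{
  "stt": {
    "module": "deepspeech_server",
    "deepspeech_server": {
      "uri": "http://localhost:8080/stt"
    }
  }
}
```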



Thanks for the info, this is good to know. Would the accuracy of DeepSpeech be good enough for keyword/keyphrase detection? What I would like to investigate is whether a subset of frequently used commands (e.g. volume up, volume down, play music, stop playback) could be trained and detected locally, with more complex recognition being sent to the cloud for number crunching.

Is it worth my time going down the DeepSpeech route, or would I be better off waiting until the accuracy of DeepSpeech improves overall?



Disclaimer: I am not a machine learning engineer. Others should feel free to chip in if you disagree.

My understanding is that your best bet would be to tune the publicly available model with your own data. This gets it to a usable accuracy for you as an individual, particularly if you are focusing on a smaller set of phrases.
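As a rough sketch of what that tuning looks like: the training script in the mozilla/DeepSpeech repository can continue from a released checkpoint using your own recordings. The file paths and hyperparameters below are placeholders, and the exact flags vary between releases, so check the training docs for your version:

```shell
# Hedged sketch: fine-tune the released model on your own command
# recordings. CSV files list wav paths and transcripts; all paths and
# hyperparameters here are placeholders.
python3 DeepSpeech.py \
  --checkpoint_dir ./deepspeech-0.6.0-checkpoint \
  --train_files my_commands_train.csv \
  --dev_files my_commands_dev.csv \
  --test_files my_commands_test.csv \
  --epochs 3 \
  --learning_rate 0.0001
```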

The challenge with music playback will be more unique terms like song / artist / album names.

In terms of sending more complex recognition to the cloud, the question becomes how do you decide what goes to DeepSpeech and what is sent to the cloud?
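One simple answer to that routing question is to always run local recognition first, accept the result only when it closely matches a known command phrase with decent confidence, and defer everything else to the cloud. The sketch below is hypothetical (the `confidence` value and the command list are assumptions, not anything Mycroft ships):

```python
# Hypothetical sketch: route an utterance between local and cloud STT.
# Accept the local transcript only if it closely matches a small set of
# known commands; otherwise defer to a cloud service for the heavy lifting.
import difflib

LOCAL_COMMANDS = ["volume up", "volume down", "play music", "stop playback"]

def route(local_transcript, confidence, threshold=0.8):
    """Return ('local', command) when the local result is a close match
    to a known command, else ('cloud', None) to defer to cloud STT."""
    matches = difflib.get_close_matches(
        local_transcript.lower(), LOCAL_COMMANDS, n=1, cutoff=threshold
    )
    if matches and confidence >= threshold:
        return ("local", matches[0])
    return ("cloud", None)
```

A fuzzy match (rather than exact string equality) tolerates small recognition errors like "volume upp", while anything long or unfamiliar, such as an artist or album name, falls through to the cloud path.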