Dictation skill?

Is there a general-purpose dictation skill (rms was asking), like being able to speak to write a document or an email?

Having one would probably add a lot of value to Mycroft.

Thanks.

-Mike Mac

There is this skill, made by @JarbasAl a long time ago.


I may update that for OVOS later, but don’t expect it to be functional as-is, since it’s so old.


Isn’t whisper perfect for this?


Got an email from rms@gnu.org, Richard Stallman himself. He writes:

having free software for dictation would be monumental! 

Having rms as a proponent would be a Good Thing, methinks.

-Mike Mac

That’s what I was thinking. Whisper is an open source, offline ASR/STT tool, and it works really well, even with non-English languages, which is a killer feature. It is almost perfect, and it understands speech even from people who don’t enunciate well.
The main issue with Whisper is that it is not real time: if you pass it a wav file 1h long, transcription will take about 1h. In all the tests I did, it took a bit longer to transcribe than the duration of the audio: 1’20’’ of audio took 1’24’’ to transcribe on an 11th gen i7 with 48GB RAM and an nvidia RTX 3070 Ti card (if that matters). I don’t know how much longer it would take on an RPi.
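
To put a number on that, here is a minimal sketch of the usual "real-time factor" (processing time divided by audio duration), using only the two timings above. The function names are my own for illustration and are not part of Whisper; the 1h estimate simply assumes the ratio stays constant, which I have not verified.

```python
def real_time_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Return processing time divided by audio duration (above 1.0 means slower than real time)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def estimated_processing_seconds(audio_seconds: float, rtf: float) -> float:
    """Estimate processing time, assuming the real-time factor stays constant (an assumption)."""
    return audio_seconds * rtf


# Timings quoted above: 1'20'' of audio took 1'24'' to transcribe.
audio = 1 * 60 + 20        # 80 seconds of audio
processing = 1 * 60 + 24   # 84 seconds of transcription time

rtf = real_time_factor(audio, processing)
one_hour_estimate = estimated_processing_seconds(3600, rtf)

print(f"real-time factor: {rtf:.2f}")
print(f"estimated time for 1h of audio: {one_hour_estimate / 60:.0f} min")
```

A factor above 1.0 means the transcription can never catch up with live speech when run this way, which is why it feels unusable for dictation on that setup.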


Coqui STT does have this example: STT-examples/README.rst at r1.0 · coqui-ai/STT-examples · GitHub
It would take a bit of tweaking to make it play nicely as a skill, though.

SpeechBrain might be able to do this too; there is probably some demand for it, and it has a lot of corporate backing. Someone out there is probably doing it now.


I forgot to say: the other day I found someone who created a real-time demo for Whisper.

It works incredibly well: you can speak at your own pace, and it transcribes almost perfectly (I read three paragraphs of a fantasy novel aloud, and it did quite well even with names).
