Vary the recording timeout length

MSJarre · January 20, 2020, 9:38am

Hi,

New to Mycroft, this is my first post on this forum. Really like this project.
I’ve seen in the mic class that there’s a RECORDING_TIMEOUT variable that defines the number of seconds Mycroft will listen to you. It is possible to change this variable, but I couldn’t find a way to vary its value depending on where it is used in my skill.

What I would like to do is have Mycroft listen for 10 seconds max in general (so RECORDING_TIMEOUT = 10), but also have a get_response function where it only listens for 3 seconds while waiting for an answer from me.
Maybe passing a record_time variable to my get_response function, or overriding it, could do the job?

Thanks in advance

pcwii · January 20, 2020, 12:03pm

Welcome to the forum. This is something I would be interested in as well. I have not been able to find any documents on this. I am hoping @gez-mycroft can point us in the right direction.

Dominik · January 20, 2020, 12:31pm

As far as I understand the RECORDING_TIMEOUT value is hardcoded: https://github.com/MycroftAI/mycroft-core/blob/0b0f31d1ccb49b2d8bcf2ecb426c8fd7120caa9c/mycroft/client/speech/mic.py#L202

Exposing the RECORDING_TIMEOUT to mycroft.conf should be an easy task. Making it a dynamic value that can be passed for a conversational context skill will be a bit more work…

Maybe you can open an issue in the mycroft-core repository on GitHub (or check if a similar issue already exists).

MSJarre · January 20, 2020, 12:39pm

OK, I’ll try that. Thank you for the directions.

gez-mycroft · January 20, 2020, 2:02pm

Hi MSJarre, welcome to the Community

The hardcoded length is the maximum length of a recording, and I don’t believe there is a simple way to modify it other than editing mycroft-core. However, it’s likely that you don’t need to change it.

In the ResponsiveRecognizer class that Dominik linked to, you will also see some other values:

10s is the longest allowable utterance
0.5s is the shortest allowable utterance (e.g. “no”)
0.25s is the shortest period of silence before it will consider an utterance complete
3.0s is the longest it will wait for a response

So we don’t upload 10 seconds of audio for every utterance. A short utterance like “no” might be less than 1 second of actual speech, then the system waits for at least 0.25s after the noise has stopped to consider the utterance complete. The resulting audio file might be 0.5s of silence waiting for the user to respond, 0.6s of the user speaking, and 0.3s of silence again. Now we only have to handle 1.4s of audio which makes for faster upload, Speech-To-Text transcription, and overall response times.
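To make the interplay of those four values concrete, here is a rough sketch of the stopping logic as described above. It is a simplified model, not the actual mycroft-core code: the names `Timing` and `recorded_ms` are hypothetical, audio is reduced to a list of loud/quiet chunks, and times are in milliseconds to keep the arithmetic exact.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timing:
    """Hypothetical container for the four values listed above (in ms)."""
    recording_timeout_ms: int = 10_000     # longest allowable utterance
    min_loud_ms: int = 500                 # shortest allowable utterance
    min_silence_at_end_ms: int = 250       # silence that closes an utterance
    timeout_with_silence_ms: int = 3_000   # longest wait for a response


def recorded_ms(chunks, chunk_ms=50, timing=Timing()):
    """Return how many ms of audio get recorded for a stream of chunks.

    `chunks` is an iterable of booleans: True means the chunk was loud
    (speech or background noise), False means it was quiet.
    """
    elapsed = loud = trailing_silence = 0
    for is_loud in chunks:
        elapsed += chunk_ms
        if is_loud:
            loud += chunk_ms
            trailing_silence = 0
        else:
            trailing_silence += chunk_ms

        # Enough speech heard, followed by enough silence: utterance done.
        if (loud >= timing.min_loud_ms
                and trailing_silence >= timing.min_silence_at_end_ms):
            break
        # Nobody said anything within the response window: give up.
        if loud == 0 and elapsed >= timing.timeout_with_silence_ms:
            break
        # Hard cap, which is what a constantly noisy room runs into.
        if elapsed >= timing.recording_timeout_ms:
            break
    return elapsed


short_reply = [False] * 10 + [True] * 12 + [False] * 6
print(recorded_ms(short_reply))      # 0.5s wait + 0.6s speech + 0.25s silence
print(recorded_ms([False] * 1000))   # silence only: stops at the 3s window
print(recorded_ms([True] * 1000))    # constant noise: runs to the 10s cap
```

In this model the short reply stops as soon as 0.25s of silence has accumulated, so it comes out slightly under the 1.4s figure above, while a room that never goes quiet always runs to the full 10s cap.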

Does that help, or is there a specific use case you have that requires a 3 second response?

MSJarre · January 20, 2020, 2:48pm

Hi,
Indeed, I thought that the whole wav file was uploaded. Even though it’s not, my biggest problem was that in noisy environments Mycroft would record for a long time when I only wanted a very short confirmation or cancellation (something close to the validator / cancel function in get_response).

I ended up adding a Recording_Timeout setting to mycroft.conf, as @Dominik suggested, so I can now change it whenever I want. The new value is loaded by a config update that I added to mycroft-core, inside the ResponsiveRecognizer class.
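For anyone wanting to do the same, the idea looks roughly like the sketch below. It is only an illustration and not the actual patch: the `listener` / `recording_timeout` key and the `Recognizer` and `update_config` names are hypothetical stand-ins, and a plain dict parsed from JSON stands in for the loaded mycroft.conf.

```python
import json

DEFAULT_RECORDING_TIMEOUT = 10.0  # the value that was hardcoded before


class Recognizer:
    """Hypothetical stand-in for the class that owns the timeout."""

    def __init__(self, config):
        self.update_config(config)

    def update_config(self, config):
        # Re-read the timeout each time the config is (re)loaded, falling
        # back to the old hardcoded value when the key is missing.
        listener = config.get("listener", {})
        self.recording_timeout = float(
            listener.get("recording_timeout", DEFAULT_RECORDING_TIMEOUT)
        )


# Stand-in for the relevant fragment of mycroft.conf (the key is an assumption).
conf = json.loads('{"listener": {"recording_timeout": 3}}')

recognizer = Recognizer(conf)
print(recognizer.recording_timeout)   # value taken from the config

recognizer.update_config({})          # key removed: back to the default
print(recognizer.recording_timeout)
```

Falling back to the old default means an unmodified mycroft.conf behaves exactly as before.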

Problem solved, thank you.