Picroft Performance

Are there any tricks to speeding up Picroft? I already swapped out the TTS engine which helped a lot, but it seems like certain skills just take forever to process.

For instance, the Playback skill. I’ll say “Play news” or “Play spotify”, and the actual “play” handler seems to take forever to call up the right skill (news/spotify). It’ll say “Resolving the player for ‘news’” for 4-5 seconds, much longer than it takes to do the STT or execute the actual skill itself.

I’m going to be messing with the Playback skill to see if I can improve it any, but are there any tips in general to make the Picroft perform better? This is running on a RPI 3 B+.

Is this mainly a messagebus speed limitation? Or just iterating through all the skills and waiting to see if they can handle the event? I might even nix the Playback skill and just modify the actual news/spotify skills and see if that helps at all.

If all else fails I suppose I can play some kind of feedback sound so that at least I know Mycroft heard me and is just taking a long time to process it. Like a “busy” sound.
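That “busy” feedback idea can be sketched generically with a timer that only fires if processing runs long. This is an illustrative sketch, not Mycroft’s API: the `on_busy` callback stands in for whatever would actually play the sound (e.g. `mycroft.util.play_wav()` inside a skill).

```python
import threading

class BusyIndicator:
    """Fire a callback (e.g. play a 'busy' beep) if an operation
    runs longer than a threshold. Illustrative sketch only -- the
    callback would be replaced by real audio playback in a skill."""

    def __init__(self, delay_seconds, on_busy):
        self.delay_seconds = delay_seconds
        self.on_busy = on_busy
        self._timer = None

    def start(self):
        # Schedule the busy signal; it only fires if done() is not
        # called within delay_seconds.
        self._timer = threading.Timer(self.delay_seconds, self.on_busy)
        self._timer.daemon = True
        self._timer.start()

    def done(self):
        # Processing finished in time: cancel the pending signal.
        if self._timer is not None:
            self._timer.cancel()
            self._timer = None
```

You would call `start()` just before handing the utterance to the slow handler and `done()` when the skill responds, so the beep only plays when resolution actually drags on.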

There are several factors which contribute to overall responsiveness with Mycroft in general - my colleague @eric-mycroft wrote about them in this blog post, which covers how we implemented a cache for Mimic2 to speed up the TTS layer.

These include:

  • processing of cloud based Speech to Text
  • latency of cloud based transactions
  • CPU intensive operations - RPi has pretty low end CPU and RAM
  • processing of onboard TTS using mimic
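To work out which of these stages dominates on a given Pi, a simple timing wrapper around each stage is often enough. This is a generic sketch (the stage names are hypothetical), not instrumentation that ships with Mycroft:

```python
import time
from contextlib import contextmanager

# Collected stage timings, e.g. {"stt": 1.2, "skill_search": 4.8, "tts": 0.9}
timings = {}

@contextmanager
def timed(stage):
    """Record how long a pipeline stage takes, so you can see
    whether STT, skill matching, or TTS dominates the latency."""
    start = time.monotonic()
    try:
        yield
    finally:
        timings[stage] = time.monotonic() - start

# Example: wrap each (hypothetical) stage of the voice pipeline.
with timed("skill_search"):
    time.sleep(0.01)  # stand-in for the actual skill-matching work
```

Printing `timings` after a few utterances makes it obvious whether the bottleneck is cloud STT, local skill matching, or TTS synthesis.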

Some of these things we can tune or tweak, but at the end of the day the Pi hardware itself is a limitation, one of the reasons that we’re moving to a Xilinx processor in the Mark 2.

If you want a snappier experience, you can always take Mycroft for Linux for a spin on a grunty machine.


PR#1889 should improve this a bit

If a skill tells the common play framework that it is still searching for a match (instead of reporting its confidence right away), the timeout is extended by 5 seconds. If you have a lot of music-playing skills this can add up; there is a TODO to prevent multiple timeout extensions. Relevant code here
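The extension logic, including the guard that the TODO describes (each skill only gets to extend the deadline once), could look roughly like this. The class, numbers, and names are illustrative, not the actual common-play implementation:

```python
class SearchTimeout:
    """Sketch of a query timeout that can be extended when a skill
    reports it is still searching -- with a guard so that each skill
    may only extend the deadline once. Illustrative values only."""

    BASE = 1.0       # initial search window (seconds)
    EXTENSION = 5.0  # extra time granted to a still-searching skill

    def __init__(self):
        self.deadline = self.BASE
        self._extended_by = set()

    def extend(self, skill_id):
        # Only the first "still searching" message from a given skill
        # pushes the deadline out; repeats are ignored.
        if skill_id not in self._extended_by:
            self._extended_by.add(skill_id)
            self.deadline += self.EXTENSION
        return self.deadline
```

Without the `_extended_by` guard, three chatty music skills could stack the wait to 16 seconds; with it, the worst case grows much more slowly.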

I think it is mainly the skill-searching aspect that is slow for me; the STT, TTS, etc. seem fast enough. I’ve looked at the Playback skill a bit, and what Jarbas mentioned would probably help in that regard. I had already manually lowered the search timeout a bit, which seems to have helped slightly, though that can obviously cause its own issues.

Can your Mark 2 device run a more full featured Debian style Linux distro like Raspbian or is it some stripped down flavor of embedded Linux? Do you have any plans on releasing Mark 2 “DIY” kits for those of us who can’t afford multiple $200 devices, and don’t wish to have built-in cameras or screens?

Can the current Mycroft setup on the Pi act as basically a voice proxy out of the box, where it just listens for a wake word, forwards the audio stream to a more powerful local server, and then plays the audio response back?

Hi @UH60, thanks for your feedback.

Running a full distro

We’ve deliberately used Jessie Lite (Mark 1) and Stretch Lite (Picroft) from the Raspbian family in the past. We haven’t yet settled on a distribution for the Mark 2 - my colleague @steve.penrod (our CTO) might have some comments there. We won’t run a distro that has a fully featured GUI because that takes up RAM and CPU cycles that we can better use.

DIY kits

We don’t plan on releasing a DIY kit at this time - Mark 2 is being positioned as more of a consumer than a hobbyist device.

Voice proxy

We haven’t experimented with this, but it’s possible. You would need to redirect the captured audio away from home.mycroft.ai and toward your “hub”, which would then presumably handle the STT transcription itself. You might be interested in our Message bus documentation.
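For the hub-to-Mycroft leg of such a proxy, messages on the Mycroft messagebus (a websocket, `ws://<host>:8181/core` by default) are JSON objects with `type`, `data`, and `context` fields. A minimal sketch of building such a payload, where the hub injects a transcribed utterance back into a Mycroft instance (the transport code is omitted):

```python
import json

def make_bus_message(msg_type, data=None, context=None):
    """Build a Mycroft-style messagebus payload: a JSON object
    with "type", "data", and "context" fields."""
    return json.dumps({
        "type": msg_type,
        "data": data or {},
        "context": context or {},
    })

# A hub forwarding a transcribed utterance into a remote Mycroft
# instance might inject a message like this over the websocket:
payload = make_bus_message(
    "recognizer_loop:utterance",
    {"utterances": ["play the news"]},
)
```

`recognizer_loop:utterance` is the message type the listener itself emits after transcription, so injecting it makes the rest of the stack behave as if the Pi had done the STT locally.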