You will not be beamforming for long anyway, as it has been dropped upstream and will soon be gone.
If you set it up right, the beamforming works sort of OK (by default it points left or right, I forget which), but it has no way to update the co-ords at runtime because they are part of the PulseAudio config.
The number of mics you have dictates how narrow a beam you can create. There is a really good InvenSense application note that I always quote, which goes through the basics of beamforming to a pretty good level: https://invensense.tdk.com/wp-content/uploads/2015/02/Microphone-Array-Beamforming.pdf
The WebRTC modules have their basis in Chrome and Chromebooks, where the beamformer and AEC were focused on a static dual mic with someone in close proximity, with AEC stopping feedback and an NS that also reduced keyboard clicks.
It worked quite well for what it was intended for, but otherwise it is pretty useless, and it is deprecated; the fact that it is being dropped might give a hint as to its worth.
I do have a delay-and-sum repo, GitHub - robin1001/beamforming, that I have been meaning to hack on to change the wavreader to a stdin stream.
My C is still non-existent and delay-and-sum isn't great, but it's the best compromise for load.
It needs to be C: Python is a great language, but for DSP it sucks, because passing small chunks back and forth between the interpreter and its back-end C libs at audio rates is just about the worst thing you can do with Python.
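For anyone who hasn't met delay-and-sum before, here is a minimal sketch of the idea in Python. It is only there to show the principle on whole buffers, not as something to run in real time, and it is not taken from the robin1001 repo. The function name and the integer per-channel delays are my own assumptions for illustration; a real array works the delays out from the mic geometry and the steering direction.

```python
def delay_and_sum(channels, delays):
    """Average the channels after shifting each one by its steering delay.

    channels: list of equal-length sample lists, one per microphone.
    delays:   integer delay in samples applied to each channel
              (hypothetical steering values for this sketch).
    """
    if len(channels) != len(delays):
        raise ValueError("need one delay per channel")
    n = len(channels[0])
    out = [0.0] * n
    for samples, d in zip(channels, delays):
        for i in range(n):
            j = i - d  # output sample i takes the input from d samples earlier
            if 0 <= j < n:
                out[i] += samples[j]
    # Sound from the steered direction adds up in phase; sound from
    # elsewhere is misaligned and partly averages out.
    return [v / len(channels) for v in out]
```

If the same click reaches mic 0 three samples before mic 1, then delays of [3, 0] line the two copies up so they sum to full level, while delays of [0, 0] leave two half-level clicks.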
The PS3 Eye is a great quad mic, but the great algs to use it were sadly in the PS3, and we just don't have anything and haven't had for a long time; I have been banging on about this for what seems like forever.
I have no idea why it's recommended when it's known we are missing any decent algs, such as beamforming with TDOA (Time Difference Of Arrival) to tell it where to direct its beam. If you have a KWS, you can do some clever things like steering the beam on the keyword voice.
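To show what TDOA means in practice, here is a rough sketch in Python that finds the lag between two mic channels by brute-force cross-correlation and turns it into an angle. The function names are made up for this example, the angle formula assumes a far-field source and a simple two-mic pair, and the 343 m/s speed of sound is an assumed room-temperature value. Real implementations normally use an FFT-based method such as GCC-PHAT rather than this loop.

```python
import math


def estimate_tdoa(ref, other, max_lag):
    """Return the lag in samples of `other` relative to `ref`.

    A positive result means the sound reached `other` later than `ref`.
    """
    n = len(ref)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += ref[i] * other[j]
        if score > best_score:  # keep the lag with the strongest correlation
            best_lag, best_score = lag, score
    return best_lag


def lag_to_angle_deg(lag, sample_rate, mic_spacing_m, speed_of_sound=343.0):
    """Convert a lag to an arrival angle from broadside (far-field assumption)."""
    sin_theta = (lag / sample_rate) * speed_of_sound / mic_spacing_m
    sin_theta = max(-1.0, min(1.0, sin_theta))  # clamp measurement noise
    return math.degrees(math.asin(sin_theta))
```

With mics 10 cm apart at 16 kHz the largest possible lag is only about 4 or 5 samples, which is why more mics and wider spacing give a finer beam to steer.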
But the reason you are getting bad results might have nothing to do with the STT, because you are completely blind to what your STT actually received.
Try this in a simple venv: it shows a simple max volume but also captures the KW, so you can play it back in Audacity and see what you're capturing. It's a 'Hey Marvin' KWS, but it is a simple script where it is easy to see what is going on.
I haven't looked at Mycroft for quite a while now, but maybe someone can advise on how to capture the incoming audio stream, as often it isn't what you think it is, or as good.
https://drive.google.com/file/d/1EFT4T0sxyVo9EXWMh-V0BL4QWXAFfVlE/view?usp=sharing
You can run that, as it's a simple script without the whole confusion of a full system, or at least check a simple VU meter: arecord -D plughw:1 -V mono -r 16000 -f S16_LE -c 1 /dev/null
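If you would rather see numbers than the arecord VU bar, a peak meter is only a few lines of Python. This is a sketch and not the script from the link above; the function names and the 100 ms chunk size are my own choices, and it assumes raw signed 16-bit little-endian mono samples like the arecord format above.

```python
import math
import struct


def peak_dbfs(raw):
    """Peak level of a chunk of S16_LE mono PCM bytes, in dBFS."""
    count = len(raw) // 2
    if count == 0:
        return float("-inf")
    samples = struct.unpack("<%dh" % count, raw[:count * 2])
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")  # digital silence
    return 20.0 * math.log10(peak / 32768.0)


def meter(stream, chunk_bytes=3200):
    """Print one peak reading per chunk (3200 bytes is 100 ms at 16 kHz mono)."""
    while True:
        raw = stream.read(chunk_bytes)
        if not raw:
            break
        print("%6.1f dBFS" % peak_dbfs(raw))
```

Saved as, say, vu.py (a hypothetical filename) with a final line calling meter(sys.stdin.buffer) after an import sys, it could be fed with arecord -D plughw:1 -r 16000 -f S16_LE -c 1 -t raw | python3 vu.py. Readings sitting at 0 dBFS mean you are clipping; speech peaking far below that at normal distance means the input is too quiet.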
Also, the PS3 Eye drivers do work, but I don't think it has an input volume control? I always used the cnxsoft guide.
AGC is usually essential on your input: with a room mic, sound attenuates greatly with distance, so you cannot get it right with a static volume.
Also, the input volume is often just too low, so people set it to 100%, which is a really bad idea as it leaves zero headroom and causes clipping, and clipping generates harmonics galore, just like a distortion pedal does.
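To put a rough number on the distance problem, here is a small worked sketch using the free-field inverse distance law (about 6 dB lost per doubling of distance). That law is an assumption on my part for illustration; a real room is reverberant, so the actual drop will differ, but the order of magnitude is the point. The function name is hypothetical.

```python
import math


def level_change_db(ref_distance_m, distance_m):
    """Change in level when moving from ref_distance_m to distance_m (free field)."""
    return -20.0 * math.log10(distance_m / ref_distance_m)


# Speaking from 4 m instead of 0.5 m costs roughly 18 dB under this
# assumption, which a single fixed gain setting cannot cover without
# either clipping up close or being too quiet far away.
print(round(level_change_db(0.5, 4.0), 1))
```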
If you haven't got hardware AGC, use Speex. I can share my /etc/asound.conf:
# pcm default to allow automatic software conversion via plughw
pcm.!default {
    type asym
    playback.pcm "play"
    capture.pcm "cap"
}
ctl.!default {
    type hw
    card 1
}
ctl.equal {
    type equal;
}
pcm.plugequal {
    type equal;
    slave.pcm "plughw:1,0";
}
pcm.equal {
    type plug;
    slave.pcm plugequal;
}
# pcm is plughw so automatic software conversion can take place
# pcm hw: is direct and faster but likely will not support the sampling rate
pcm.play {
    type plug
    slave {
        pcm "plughw:1,0"
    }
}
# pcm is plughw so automatic software conversion can take place
# pcm hw: is direct and faster but likely will not support the sampling rate
pcm.cap {
    type plug
    slave {
        pcm "plugequal"
    }
}
pcm.agc {
    type speex
    slave.pcm "cap"
    agc on
    agc_level 2000
    denoise off
}
# sudo apt-get install libasound2-plugins
# will use lower load but poorer linear resampling otherwise
defaults.pcm.rate_converter "speexrate"
In the above I have hardware AGC, so the capture line is capture.pcm "cap"; to enable the Speex AGC, just change that line to capture.pcm "agc".
PS: yeah, I also have the ALSA EQ acting as a voice bandpass.
For some reason Debian still ships the release candidate of Speex, even though it is years old, while alsa-plugins rightly expects the release version. So it doesn't see it, and the Speex plugins don't get installed on Buster.
It might be fixed on Bullseye, but I haven't checked. It's a very easy and short compile and install, and if you struggle, ask and I will give you a quick howto.
This one was for Buster: GitHub - StuartIanNaylor/Alsa-plugins-speex-update. Make sure you run the right one for 32 or 64 bit.
aplay --version
will show you which version of ALSA to aim at, as you will probably need to update it for Bullseye.
PS: don't use the denoise, as it sucks worse than RNNoise and is artefact city. AGC is great though. The default agc_level of 8000, which relates to max gain, is just crazy; I use 2000, but maybe it should be lower, depending on the variation of distance in use.
You can use the AEC and AGC of PulseAudio, but again the AEC is pretty poor: it does cancel, but it fails completely once the SNR gets to fairly modest levels. Speex AEC attenuates and continues to do so, and you can use it with Pulse as well.
I know ALSA, and for a single-use system I don't see the point in PulseAudio for audio. It is good for its intended uses, but the default AEC and beamforming module is probably its worst part.
I have been trying to get my head round PipeWire, as its AEC just received a fresh rewrite.