I noticed that when you select the PS3 Eye in the Picroft image, it sets up the microphone but nothing more.
I have been playing with PulseAudio to find the best settings and added them to audio_setup.sh.
A single line does it all.
nano $HOME/audio_setup.sh
#!/bin/bash
# Use this script to execute audio setup actions
sudo amixer cset numid=3 "1" > /dev/null 2>&1
amixer set PCM 79% > /dev/null 2>&1
amixer set Master 79% > /dev/null 2>&1
pactl load-module module-echo-cancel use_master_format=1 aec_method='webrtc' aec_args='"analog_gain_control=0 digital_gain_control=1 voice_detection=1 beamforming=1 mic_geometry=-0.03,0,0,-0.01,0,0,0.01,0,0,0.03,0,0"'
audio_setup.sh runs on each boot, or just run ./audio_setup.sh to test, with mycroft-cli-client open to watch how things are recognised.
Or run it directly from the CLI:
pactl load-module module-echo-cancel use_master_format=1 aec_method='webrtc' aec_args='"analog_gain_control=0 digital_gain_control=1 voice_detection=1 beamforming=1 mic_geometry=-0.03,0,0,-0.01,0,0,0.01,0,0,0.03,0,0"'
To remove it again:
pactl unload-module module-echo-cancel
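Since audio_setup.sh runs on every boot and also gets run by hand while testing, it may be worth unloading any existing instance before loading again, so repeated runs do not stack up. A minimal sketch, assuming the same settings as above; reload_echo_cancel and AEC_ARGS are hypothetical names of mine, not part of Picroft or PulseAudio:

```shell
#!/bin/bash
# Sketch: reload module-echo-cancel without leaving an old instance behind.
# AEC_ARGS copies the settings from the post; adjust to taste.
AEC_ARGS='analog_gain_control=0 digital_gain_control=1 voice_detection=1 beamforming=1 mic_geometry=-0.03,0,0,-0.01,0,0,0.01,0,0,0.03,0,0'

reload_echo_cancel() {
    # Unload first if the module already shows up in the loaded-module list.
    if pactl list short modules | grep -q 'module-echo-cancel'; then
        pactl unload-module module-echo-cancel
    fi
    # The inner double quotes must reach pactl so the spaces inside aec_args survive.
    pactl load-module module-echo-cancel use_master_format=1 aec_method=webrtc aec_args="\"$AEC_ARGS\""
}
```

Calling reload_echo_cancel at the end of audio_setup.sh in place of the bare pactl load-module line should then be safe to repeat.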
The digital gain control (digital_gain_control=1) works great, and voice_detection=1 seems to improve things.
The settings can be used with any mic; only the beamforming geometry is specific to the 4-mic linear array of the PS3 Eye.
I grabbed that from the web, as I really struggled to find a single authoritative source for any of the settings.
From memory, the format of mic_geometry is x1,y1,z1,x2,y2,z2,x3,y3,z3,x4,y4,z4.
The linear mic array on the PS3 Eye is 4 mics in the same orientation, equally spaced at approx 20mm, so every position is measured from a virtual centre point. Without actually knowing, the units seem to be metres, as 0.01 seems to be approx 10mm.
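For other linear arrays the same string can be worked out rather than typed by hand. A rough sketch, assuming the values are metres along the x axis measured from the centre of the array (which is only my reading of it, as said above); mic_geometry here is a hypothetical helper function, not a PulseAudio command:

```shell
#!/bin/bash
# mic_geometry N SPACING_M
# Print x,y,z triples for N mics on a straight line along x,
# centred on the middle of the array (y and z left at 0).
mic_geometry() {
    awk -v n="$1" -v d="$2" 'BEGIN {
        for (i = 0; i < n; i++) {
            x = (i - (n - 1) / 2) * d          # offset from the array centre
            printf "%s%g,0,0", (i ? "," : ""), x
        }
        printf "\n"
    }'
}

# PS3 Eye: 4 mics, approx 20mm apart
mic_geometry 4 0.02   # prints -0.03,0,0,-0.01,0,0,0.01,0,0,0.03,0,0
```

The output for 4 mics at 0.02 matches the mic_geometry value used in the pactl line, which at least fits the 20mm spacing.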
The above works great for me, and I am wondering whether it should be part of the setup when the PS3 Eye is chosen, as that single line makes the mic a huge improvement over the vanilla setup.
[edit]
It is probably best to edit /etc/pulse/default.pa instead:
### Enable Echo/Noise-Cancellation
load-module module-echo-cancel use_master_format=1 aec_method=webrtc aec_args="analog_gain_control=0\ digital_gain_control=1\ agc_start_volume=85\ beamforming=1\ mic_geometry=-0.03,0,0,-0.01,0,0,0.01,0,0,0.03,0,0" source_name=echoCancel_source sink_name=echoCancel_sink
set-default-source echoCancel_source
set-default-sink echoCancel_sink
Check the further posts: after more playing, I found the filters, noise suppression and VAD seem to make zero difference to recognition.
If you record and play back you can hear a difference, but I am starting to think the recogniser will pluck out the speech anyway, so it is probably best not to introduce filter noise that purely makes the audio cleaner for humans.
I never did work out how to add an ALSA PCM software volume, as the default is quite quiet.
I will get round to that one day.