Mic Arrays are rubbish

Sure, that will act as bait for some, and I am not sure why microphone arrays have piqued my interest so much since my introduction to Mycroft; they just have.

I am not a sound engineer; I did a bit of audio in my late teens, but that was a long time ago.

Firstly, to pull the bait: array microphones without fancy DSP are just an array of microphones.
With libs on a Pi3/4 you can do echo cancellation, which is handy, but apart from summing the array, which also sums the noise, many of the non-DSP microphone arrays we have are practically pointless.
On a Zero the load is too heavy even for echo cancellation (EC), and all you can do is increase sensitivity.

Increasing sensitivity is OK for far field in a silent room, but apart from that it’s also pointless, as any louder source will just swamp the signal and that sensitivity will count for nothing.

Even many of the DSP mics, often USB, are not that great, though they are getting better; even with inbuilt EC they still suffer from other non-distributed noise sources.
Meaning if your voice is the predominant sound source they work quite well, but against singular competing noise sources not so much, and the problem with a home TV & HiFi is that they are exactly that type of noise source.

As well as being critical of many array options, omni-directional without DSP is also pretty poor, as without DSP you don’t have any form of control.

If you don’t have DSP, then cheap old uni-directional electrets plugged into a cheap USB sound card can actually be much better, especially on a Zero or wherever you’re struggling with the load for EC.
It’s not high tech: they just have holes in the back, so sound waves also hit the back of the diaphragm, which acts as noise cancellation and gives directionality.

The directionality actually gives you some form of control, but again making your own gets a bit geeky, as you have to tune the mic so that the sound waves cancel.

But anyway, here is a uni-directional electret, quite a good one as it goes to 16K, since you do lose frequency range with these.

https://uk.rs-online.com/web/p/condenser-microphone-components/1710881/

Sensitivity is less than MEMS, and the Kingstate is a bit pricey just for a capsule, but -37dB isn’t bad.

https://uk.rs-online.com/web/p/condenser-microphone-components/7542104/

You will see they just have holes in the back. Primo do really excellent ones, but the prices are a bit crazy.

Uni-directional microphones are noise cancelling and have an advantage over software EC, which does DSP on the audio being played and subtracts it from the mic input.
The problem is when audio comes from another source, as you don’t have that PCM on your sound card to cancel it out of your mic input.
A directional mic does cancel it if it’s facing the right way, and often how you place a unit can very much give you a 3rd-party noise cancelling solution.
If you stick it on top of your voice AI then it becomes a problem, as that noise can arrive at both back and front at the same instant.

If you do have a Pi3 or Pi4, though, you can use software EC to cancel the unit’s own audio and the natural noise cancellation of a uni-directional microphone in conjunction to get the best of both worlds.

You might only have a single mic on a cheap sound card but in many situations it can be superior to a relatively pointless array that you have no control of.
Sensitivity becomes much less of an issue when you can cancel noise and have a ceiling where you can turn up the gain.

Basically they are uni-directional lapel mics that start really cheap.

You can find one for a couple of $ that will go with a couple of $ sound card but even what looks like it might be quite reasonable isn’t actually earth shattering.

But you can go more pro and you will notice more design on the rear cancelation part of the mic.

Or go all out with a shotgun mic, but your Mycroft is going to get very Steampunk.
https://www.ebay.co.uk/itm/Unidirectional-Condenser-Microphone-Shotgun-Interview-Mic-for-DV-Camcorder/401655533631?

I cannot say how well those work on a cheap USB soundcard, as sometimes the gain can be pretty poor, which is why I have been taking a more DIY route with preamps: rather than passive, I can make an active circuit with things like AGC and controllable hardware gain.

My fave of the moment is the MAX9814, grabbing cheap clones and replacing the omni-directional capsule.

The AGC timing cap Adafruit chose is rather low, unfortunately, but tacking another cap on top is much easier with SMD than I thought; fiddly little things still drive me mad though.

There is a whole range of quite interesting low cost mic hardware.

Less than $3 and it has a noise gate & compressor built in…

I also cannot tell you which cheap USB sound cards have decent gain, as it’s a really mixed bag: some are dire and others are great, and, even more confusing, you can buy what seem to be identical cards that somehow aren’t.
Also, with the mono USB cards I can’t tell you, because I have been concentrating on the more expensive but much rarer stereo ones.

So far
Edimax Dreambass which is a VIA VT1620A
Syba SD-AUD20101 which is a Cmedia CM6533
Also the likes of AXAGON ADA-17 or https://www.aliexpress.com/item/4001184939273.html

Hence I cannot say much about sensitivity: with preamps and active mics the problem is too much gain, and in alsamixer I am often set to about 0dB to 9dB gain depending on the module I am using.

Yeah, I know, stereo, and what I said about arrays, but it’s the arrays we generally have available at the moment that seem to make little sense.
I have been playing with the idea of running one channel and then having an input for another mic on a wired extension for extra coverage, and playing with VAD to control channel mixing.

So a long-winded post, but for some, rather than a $90 USB speakerphone, a cheap sound card (if you can find one with decent gain) and a lavalier lapel mic in the right situation will definitely give it a run for its money and in certain circumstances even outperform it.

If we get DSP, especially advances on current DSP, then planar MEMS arrays are another story, but without it they are a bit rubbish and sort of pointless.

Haven’t read it all yet (only the first clickbait stuff), but will do as all your lengthy posts have all been well researched before.

However, the entrepreneur in me immediately started shouting: CHANCE!

We all know the personal assistant / smart speaker / A.I. technology is THE new iPhone type of device for the upcoming years. As they all rely on this, there is a huge (hardware) business opportunity if you can figure it all out with relatively cheap components.

Will do some reading in a bit…

I think you hit the nail on the head, as they do all rely on this, but in terms of open source not one item of audio processing is employed.
They have embedded DSP, and providing something that looks similar doesn’t mean that in terms of audio processing it is in any way actually effective.
We don’t have any DSP; it’s not even explored as a repo. We can employ EC, but for some reason that is also not part of the project.

What you can do is purchase in hardware with closed source DSP at prices for audio that are multiples of the price of leading brand complete units for relatively poor performance in comparison of audio processing.

Because of a lack of array DSP options I have posted a few alternatives, because yes, it is exactly like an iPhone, where shiny and new seems to forget that all it is is a phone and much of it is purely high-pressure sales pitch.
We have had audio processing since the demise of silent movies, and much now works as it did then.

A Mic array without any DSP is rubbish and little more than a more complex omni-directional microphone.

I agree with you that when it comes to the DSP, nothing is open sourced and available to build upon. That is because it is so important and where they can make a difference: the difference between success and failure. Also business-wise, it is the reason nobody up to now has released their source, because it is their business model and the only reason they make money.

Mycroft A.I. is already doing things differently. Just look at the very early release of “their” mic-board. Over at the MatterMost Mark-2 channel, knowledgeable people are already chiming in, helping them out with routing and such.

Perhaps you can join and so be part of a community-based “task force” to get this hardware “properly” done, the open source way.

Mycroft A.I. is NOT a hardware company and, as I see it, their business model is built upon being a software company that any business can use to implement voice assistants. As you mentioned above, the microphone hardware is of utmost importance, if not the most important thing, for this to succeed.

Good software and good hardware go hand in hand. One does not exist without the other, and they need to go hand in hand during the development cycle. Steve Jobs understood this more than anybody else. If you want to know what I mean, you should watch back the iPhone release conference he did back in the day. It is funny, now years later, how much he already knew at that point in time about what was needed and what would happen in the future. If I remember that day correctly, I think he even mentioned the marriage that is needed between software and hardware to be successful.

I don’t think they can do it, as it’s just economies of scale.
The likes of Google and Amazon are creating custom M-core Arm chips with embedded DSP so it’s all on one die.
It’s not like Raspberry are likely to do anything that way.

I don’t think they would like what I would say as my opinion when you change a paradigm of the big data model to a domain model it makes little sense to copy the rest as we do.
Opensource shouldn’t be in business to push unnecessary consumer goods it should be about the reuse and best fit of what we have whilst returning a better system than what is currently sold by the likes of Google & Amazon.

I really liked your iPhone analogy but actually see huge flaws with the model, as it is designed very much like a mobile product, whilst, like hi-fi and TV, in the home domain or industry it’s an incredibly static model.

In fact I think the current Mycroft model is totally fubar as trying to copy the big guys is impossible and if you think about what you need from a home automation system the Google & Amazon model is probably wrong and open source could be taking the lead and providing a common interoperable standards but it would seem all are still playing VoiceAI klondike whilst the gold has come to an end.

It’s why I wrote the above and started this thread. I haven’t finished testing and playing with mic technologies, but when you cannot compete with the big guys, look for a paradigm shift where you can provide the same or better by using technologies that are in your scope.

We need to look at what we have and partition and reuse the model so it garners more value.

A Mycroft base model should not have a speaker or a screen; it’s purely a streaming KWS microphone.
It should use standard audio RTP from Snapcast, Airplay, Pulse, Spotify… and be focussed strongly on being an extremely simple, cost-effective device; rather than competing with the impossible hi-tech array microphones, we go back to conference technology, which is far more accurate, and create a distributed room mic.

It’s a Pi3, a cheap sound card, a preamp and a choice of uni- or omni-directional mics; it has an LED ring and probably some touch buttons, and that is it.
Only the first unit, if a user wishes, should have a speaker and amp, because built-in ones are small, terrible and ineffective, as once more we have no access to the design and manufacturing capabilities of the big guys.
As with my speakers, where I am building a stereo pair, rather than trying to cover a whole room with a single unit I simply have 2.
I am also using stereo ADC sound cards because I might run another mic from each on a cable, so I have 4 pickup points in my room with 2 on each side wall.
The speakers are the usual affair, on the far wall casting down the room, which is perfect for passive noise reduction, but I am also trying to stream to the mic unit, playing on a silent loopback and concurrently doing software EC.
The audio processing runs in 2 containers and hopefully it will squeeze the load into a $25 Pi3A+.
I know the load will cope with a single channel, and from Raspberry we only have a year to wait until the approx $25 Pi4A+ arrives.
I am also going to feed the TV output to a line-in of an X86 server running ASR and play it through my wireless speakers, as hopefully I will be able to use the Pulse WebRTCAudioProcessing EC, which is supposedly more capable with clock drift, to blat the TV audio from my soundscape; because it is also my HiFi system, it will do the same for that.
If it works I have already beaten the capabilities of the Google & Amazon by lateral design and paradigm shift.
Some of us will get thrifty and geeky and pickup or reuse existing X86 machines and others will go out and buy the latest and greatest in fanless design.
The HDMI will connect to my TV and I have my display and it could very well be my Mycroft Plasma or Kodi machine.
Because I already have a decent pair of old but really good vintage Japanese-made Kenwood bookshelf speakers and an HP6300 Pro i5 I picked up for $60 last year, I am looking at a rough guess, for 2 Pi Zero, 2 Pi3, 4 sound cards, amps, 24V DC supplies and mic bits and bobs, of somewhere about $100, for which I have wireless audio, full-room voice AI, a media recorder, AI display and a complete 100% browser on my TV rather than that embedded crap you usually get.
The single bluetooth aptx adapter also allows me to cast music from my phone and that is also the remote and secondary display of my overall system.
Purely by multiple uses and reuse and some simple technology adoption, but the choices and options are open to all.
My bedroom is getting the speaker and Mic system but going to share the server and have one extra Pi3 Line in from my TV here… Each room is a group that contains X number of KWS mics, Stereo 2.1 will probably be a thing when I get some decent subwoofers. Each room can have a pretty effective system starting at about $100.

A Raspberry Zero and, again, a cheap sound card: reuse what we have, repurpose your room speakers and mount the Pi on the back with an amp, so that you share the value of a wireless entertainment system with voice AI, which is why you use standard audio RTP formats.
Because you separate speaker from mic unit, passive noise reduction beats anything the big guys do, because we are reusing decades of previous design using simple physics.

We partition cost, by partitioning functional design so when it comes to multiples we can actually compete and create better more functional items that many users can reuse what they have, purchase 2nd hand or have the choice of new and for purpose.

Each mic unit has KWS, as it only streams on keyword. Each mic in a distributed room environment can give a VAD value, but only triggered units connect to the ASR server, and the mic with the highest VAD becomes the mic of choice for that ASR sentence session.
It’s actually very simple to create something far more effective, as rather than one unit doing everything, each mic unit is processing its own scope in parallel.
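As a rough sketch of that selection step: assume each unit reports one line of the form "unit triggered vad", which is a made-up format purely for illustration, as is the pick_mic name. The server side then only has to keep the triggered unit with the highest VAD value.

```shell
# pick_mic: choose the unit to stream from for this ASR session.
# Reads one line per mic unit on stdin: "<unit> <triggered 0|1> <vad>"
# (a hypothetical report format, not taken from any existing protocol).
pick_mic() {
    awk '$2 == 1 && (best == "" || $3 + 0 > max) { max = $3 + 0; best = $1 }
         END { if (best != "") print best }'
}

# Usage:
#   printf 'kitchen 1 0.42\nsofa 1 0.81\nhall 0 0.95\n' | pick_mic
# The hall unit has the highest VAD but did not hear the keyword, so it is ignored.
```

Untriggered units never compete, so a loud TV next to a mic that missed the keyword cannot steal the session.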

You could use a Pi4 for a server, but there are so many problems; for example, WebRtcAudioProcessing for some reason doesn’t work well on Arm.
But because of the diversification of use, a domain likely only needs a single server, and there is a huge array of old X86 equipment that has the huge advantage of being able to use second-hand GPUs or the newest and greatest RTX models.
But again higher spec purchase or add more is always an option.

You give choice, you give options, you create/reuse common interoperable standards and you allow product reuse and leave consumerism as a choice to buy new but do not enforce it.
Mycroft shouldn’t even be trying to sell a single product but aiming at kits so users can create a plethora of products based on some common protocol standards.

What you don’t do is focus on a single product that will never be able to compete.

A mic array won’t work if the environment does not fit. It does not make sense to test the performance with the bare breakout board when you plan to put it into a housing. And the housing needs proper acoustic design, otherwise it won’t work either.
This is, for example, the reason why EC will never properly work for the Mark-I: the mic is an omni-directional type and the speaker sits in the same “acoustic chamber”, blasting into the mic.

As written here I have built my Respeaker CoreV2 into the Mark-IIR prototype housing - and although the build is not finished yet the mic reception works far better than on my Mark-I.

Regarding open sourced software for EC: there is beamforming and EC support (via webrtc) available since Pulseaudio v9 (with some improvements in PA v12), where you can configure your mic array’s geometry.

We have already gone through all this before.
EC support via pulse audio does not work on Raspberry.
Beamforming, as per an email which I have already reported in a thread you responded to, was never completed and doesn’t work, as confirmed by the pulseaudio developer of webrtc for pulse audio.

"From: Arun Raghavan mail@arunraghavan.net
Sent: Tuesday, March 17, 2020 6:46:34 PM
To: Stuart Naylor stuartiannaylor@outlook.com
Subject: Re: pulseaudio-webrtc-audio-processing

Hey Stuart,
The webrtc library doesn’t implement DOA. I’m not sure how much doing steering (changing target_direction) dynamically works either.
Unfortunately, the team has dropped beamforming upstream altogether, so when we next update the library, this support will be lost. :frowning:

Best regards,
Arun"

It works so well, Dominik, that it’s been dropped upstream already and will be dropped from the next version of pulse also.
It’s a beamformer with no method to tell it where to beam! Or any internal method to work out where to beam to.

Yes you can configure mic-arrays but it does absolutely nothing.
That has already been thoroughly discussed, shall I post the details once more to refresh your memory?

https://github.com/freedesktop/pulseaudio-webrtc-audio-processing if you want to check who mail@arunraghavan.net is.

Also, that is exactly why with EC I am separating speakers and mic unit, but I have run various mic units next to or on top of a speaker and it gives an excellent level of echo attenuation, which is a more accurate name for EC; it allows barge-in, as that is its purpose.

With minimal isolation requirements EC can be accomplished in the same housing, but to maximise passive noise cancellation I am running in separate locations, as it is more natural for a stereo setup than just a toy speaker on a desk; it is also essential for a uni-directional noise cancelling mic to be in the noise path, with clear separation of backwards noise from forwards pickup.

If you check the Google source then its gone.
https://webrtc.googlesource.com/src/+/refs/heads/master/modules/audio_processing/

Again will report those audio trials I did with a speaker in open air about 9" from the open mic.

It’s another reason why ‘mic arrays’ are rubbish, as on-board mics in enclosures can be almost impossible to isolate unless on separate cutout boards.
If you have a mic capsule or small board not in an array, they are much easier to contain and isolate.

It’s far easier to take a design paradigm and separate audio playback and mic, but without doubt, with the right DSP and accurate FFT sync, you can cleanly extract one from the other with quite a large signal removal, even with just a Pi lib.

Run it up on a Pi3/4; you might want to compile the latest speex and speexdsp, which I tacked on the end, which is probably the wrong way round.

Install is pretty easy, but you need to make sure the sink and source come from the same clock source, as clock drift will likely render it useless.
I have tried the pulseaudio webrtc EC several times, which is supposed to cope with clock drift, but on Arm it doesn’t seem to work at all.
Also, strangely, the echo option of the speex alsa plugin doesn’t seem to work, yet the following repo uses the same speexdsp libs and works extremely well? Dunno…
But make sure your soundcard both plays and records, such as a USB soundcard or some of the mic hats with a soundcard chip (it gets a bit confusing, as some hats have separate on-board clocks for DAC and ADC).
So a cheap USB soundcard or something like the respeaker 2mic.

We are going to run it from where we build it; this will just build in $HOME, but install where you wish.

git clone https://github.com/voice-engine/ec.git
cd ec
make
./ec -h

This should give you the help info of ec and show it’s installed.

Ec uses fifo files in /tmp for audio rather than a source, so again just install the fifo plugin.

Unlike ec, this utility installs system-wide rather than just sitting in the repo directory.

cd ..
git clone https://github.com/voice-engine/alsa_plugin_fifo.git
cd alsa_plugin_fifo
make && sudo make install

Copy https://raw.githubusercontent.com/voice-engine/ec/master/asound.conf to /etc/asound.conf:

pcm.!default {
    type asym
    playback.pcm "eci"
    capture.pcm "eco"
}


pcm.eci {
    type plug
    slave {
        format S16_LE
        rate 16000
        channels 1
        pcm {
            type file
            slave.pcm null
            file "/tmp/ec.input"
            format "raw"
        }
    }
}

pcm.eco {
    type plug
    slave.pcm {
        type fifo
        infile "/tmp/ec.output"
        rate 16000
        format S16_LE
        channels 2
    }
}

So this sets the fifo files /tmp/ec.input and /tmp/ec.output as your playback and capture defaults.
Go back to ec:

cd ..
cd ec
./ec -i plughw:2 -o plughw:2 -d 20

The -d 20 is to try and compensate for your latency, which I really should have mentioned earlier, as you can use alsabat with --roundtriplatency to measure latency from output to mic capture.
The repo states it’s 200 msec, but to be honest it seems to be much less; the ALSA redirection of ec may also add to this, but that seems impossible to measure with EC running, so test your hardware first.

You need to open another terminal and aplay a wav.

wget https://file-examples-com.github.io/uploads/2017/11/file_example_WAV_10MG.wav
aplay file_example_WAV_10MG.wav

EC will start when media plays and stop at the end of the file.
So aplay an example wav to the default playback device and “Enable AEC” will show.
When it ends you will get something like

playback filled 256 bytes zero
No playback, bypass AEC

So what is happening is EC is listening on the mic and trying to subtract the playing wav from that fifo file eci and places the result in eco which we need to record from.

So just to test we will open up another terminal and start a recording before we play.
arecord -r16000 -fS16_LE -c1 ec-rec.wav
EC is already running so go back to the aplay console and play the example wav again.
Then ctrl+c to stop the recording in the arecord terminal.

Now you prob have found the one quirk that EC has that on the end of recording EC ends with
-bash: playback: command not found

So we need to fix that with an ALSA loopback device, which we need to modprobe.
So that it gets the same ALSA index on each boot, we are going to add it to the modules to load on boot.

Run sudo nano /etc/modules, add snd-aloop to that file, save and reboot.
You should see something like

pi@raspberrypi:~ $ aplay -l
**** List of PLAYBACK Hardware Devices ****
card 0: Loopback [Loopback], device 0: Loopback PCM [Loopback PCM]
  Subdevices: 8/8
  Subdevice #0: subdevice #0
  Subdevice #1: subdevice #1
  Subdevice #2: subdevice #2
  Subdevice #3: subdevice #3
  Subdevice #4: subdevice #4
  Subdevice #5: subdevice #5
  Subdevice #6: subdevice #6
  Subdevice #7: subdevice #7
card 0: Loopback [Loopback], device 1: Loopback PCM [Loopback PCM]
  Subdevices: 8/8
  Subdevice #0: subdevice #0
  Subdevice #1: subdevice #1
  Subdevice #2: subdevice #2
  Subdevice #3: subdevice #3
  Subdevice #4: subdevice #4
  Subdevice #5: subdevice #5
  Subdevice #6: subdevice #6
  Subdevice #7: subdevice #7
card 1: b1 [bcm2835 HDMI 1], device 0: bcm2835 HDMI 1 [bcm2835 HDMI 1]
  Subdevices: 4/4
  Subdevice #0: subdevice #0
  Subdevice #1: subdevice #1
  Subdevice #2: subdevice #2
  Subdevice #3: subdevice #3
card 2: Headphones [bcm2835 Headphones], device 0: bcm2835 Headphones [bcm2835 Headphones]
  Subdevices: 4/4
  Subdevice #0: subdevice #0
  Subdevice #1: subdevice #1
  Subdevice #2: subdevice #2
  Subdevice #3: subdevice #3
card 3: Dongle [VIA USB Dongle], device 0: USB Audio [USB Audio]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
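If you would rather not open an editor, the /etc/modules step can be scripted. This is only a sketch; add_module is a hypothetical helper name, and it needs to run as root to write to /etc/modules.

```shell
# add_module MODULE [FILE]: append MODULE to FILE (default /etc/modules)
# unless that exact line is already there, so running it twice is harmless.
# add_module is a hypothetical helper, not part of any package.
add_module() {
    module=$1
    file=${2:-/etc/modules}
    grep -qxF -- "$module" "$file" 2>/dev/null || echo "$module" >> "$file"
}

# Usage (as root):
#   add_module snd-aloop
```

To load the module straight away without a reboot, sudo modprobe snd-aloop does the job for the current session.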

So what we will do is pipe the arecord into one side of the loopback as a sink, and the other side will be our source.

So in a terminal run ec again; you may find the card index has changed:
./ec -i plughw:3 -o plughw:3 -d 20
Pipe arecord into the loopback so it’s constantly recording:
arecord -Deco -r16000 -fS16_LE -c1 | aplay -Dplughw:0,0,0
But we will now use the other side as a source to record from:
arecord -Dplughw:0,1,0 -r16000 -fS16_LE ec-rec.wav
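Because the card index can move around once the loopback is loaded, it can help to look the index up by the card's short name instead of hard-coding it. A small sketch, where card_index is a hypothetical helper and "Dongle" is simply the short name from the listing above:

```shell
# card_index NAME: read `aplay -l` style output on stdin and print the index
# of the first card whose short name (the word after "card N:") is NAME.
# card_index is a hypothetical helper for illustration.
card_index() {
    awk -v name="$1" '/^card [0-9]+:/ {
        idx = $2
        sub(/:$/, "", idx)
        if ($3 == name) { print idx; exit }
    }'
}

# Usage:
#   idx=$(aplay -l | card_index Dongle)
#   ./ec -i "plughw:$idx" -o "plughw:$idx" -d 20
```

ALSA device strings also accept the short name directly in place of the number (for example plughw:Dongle), which should sidestep the lookup altogether, though I have not tried that with ec.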

Now you probably want to edit /etc/asound.conf and make the loopback on plughw:0,1,0 the default capture device.
Or you may want to use that as a pcm slave for speex AGC, but I will let you decide how you edit and change /etc/asound.conf.
You should now find that when Rhasspy plays or records on the default devices, EC will continue.
You will have to create a boot script or service to run EC and pipe through the loopback, but at least you know how to get EC running.
There is also ec_hw in the repo, which would have been compiled by make; I never did work it out, maybe you can.
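For the boot script, something along these lines might be a starting point. It is only a sketch: start_ec and the EC_DIR, EC_CARD, EC_DELAY and EC_DRY_RUN variables are names invented here, and the defaults simply mirror the commands used above.

```shell
# start_ec: run ec in the background, then keep piping its output into the
# loopback. All EC_* variables are hypothetical knobs for this sketch:
#   EC_DIR   where the ec repo was built   (default $HOME/ec)
#   EC_CARD  ALSA device for mic + speaker (default plughw:3)
#   EC_DELAY value passed to ec -d         (default 20)
#   EC_DRY_RUN, if set, only prints the commands instead of running them
start_ec() {
    set -- "${EC_DIR:-$HOME/ec}/ec" -i "${EC_CARD:-plughw:3}" \
           -o "${EC_CARD:-plughw:3}" -d "${EC_DELAY:-20}"
    if [ -n "${EC_DRY_RUN:-}" ]; then
        echo "$*"
        echo "arecord -Deco -r16000 -fS16_LE -c1 | aplay -Dplughw:0,0,0"
        return 0
    fi
    "$@" &       # echo canceller in the background
    sleep 2      # give ec a moment to start before recording from eco
    arecord -Deco -r16000 -fS16_LE -c1 | aplay -Dplughw:0,0,0
}

# Usage:
#   EC_DRY_RUN=1 start_ec     # check the commands first
#   start_ec                  # run for real, e.g. from a boot script
```

The dry run is there so you can confirm the card and delay before anything touches the audio devices; the function blocks on the arecord pipe, which suits being wrapped in a service.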

It should be called Echo Attenuation but you should find its more than capable of allowing barge in that was extremely problematic before.

You might want to run through this as the raspberry repo is actually out of sync with release for SpeexDSP.

libasound2-plugins (alsa-plugins) doesn’t install the speex plugins because in Debian speex-dsp is an old rc-1 version or something.

So if you have an asound.conf such as

pcm.!default {
    type asym
    playback.pcm "plughw:2"
    capture.pcm  "agc"
}

pcm.array {
 type hw
 card 2
}

pcm.cap {
 type plug
 slave {
   pcm "array"
   channels 2
   }
 route_policy sum
}

pcm.agc {
 type speex
 slave.pcm "cap"
 agc 1
 agc_level 8000
 denoise yes
 dereverb no
}

Don’t be surprised if it doesn’t work.

First do the usual install

sudo apt install libasound2 libasound2-dev libasound2-plugins
sudo apt install libfftw3-3 libfftw3-dev
sudo apt install libspeex1 libspeexdsp1 libspeex-dev libspeexdsp-dev speex speex-doc
sudo apt install git autotools-dev autoconf libtool pkg-config

Clone the 2 Speex repos

git clone https://gitlab.xiph.org/xiph/speexdsp.git
cd speexdsp
./autogen.sh
./configure --libdir=/usr/lib/arm-linux-gnueabihf
make
sudo make install

Same procedure with speex.
git clone https://gitlab.xiph.org/xiph/speex

blah blah…

You now have the actual releases of Speex and SpeexDSP installed.

So now we have to recompile alsa-plugins (current version) so that it doesn’t omit the speex plugins.

aplay --version
1.1.8 most likely

https://www.alsa-project.org/wiki/Download
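If your version is not 1.1.8, the tarball name has to change to match. A small sketch that derives it from the aplay --version output, assuming that output starts with "aplay: version X.Y.Z"; plugins_tarball is a hypothetical helper name.

```shell
# plugins_tarball: read `aplay --version` output on stdin and print the
# name of the alsa-plugins tarball with the same version number.
# plugins_tarball is a hypothetical helper for illustration.
plugins_tarball() {
    awk '$2 == "version" { print "alsa-plugins-" $3 ".tar.bz2"; exit }'
}

# Usage:
#   aplay --version | plugins_tarball
```

Whether a tarball with exactly that version exists on the download site is worth checking on the wiki page above before running wget.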

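wget ftp://ftp.alsa-project.org/pub/plugins/alsa-plugins-1.1.8.tar.bz2
# .tar.bz2 is bzip2, so extract with -j (not -z, which is gzip)
tar -xvjf alsa-plugins-1.1.8.tar.bz2
# configure and build from inside the unpacked source directory
cd alsa-plugins-1.1.8
./configure --libdir=/usr/lib/arm-linux-gnueabihf
make
sudo make install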
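A quick way to see whether the rebuild actually produced the speex plugin is to look for its shared object. This sketch assumes the plugins land in the alsa-lib directory under the --libdir used above and that the file is called libasound_module_pcm_speex.so; has_speex_plugin is a hypothetical helper name.

```shell
# has_speex_plugin [DIR]: succeed if the speex PCM plugin is present in DIR.
# The default DIR assumes the --libdir used in the configure step above.
# has_speex_plugin is a hypothetical helper for illustration.
has_speex_plugin() {
    dir=${1:-/usr/lib/arm-linux-gnueabihf/alsa-lib}
    [ -e "$dir/libasound_module_pcm_speex.so" ]
}

# Usage:
#   has_speex_plugin && echo "speex plugin installed" || echo "speex plugin missing"
```

If it reports missing, the configure output is the place to look, as the speex plugins are skipped when a suitable speexdsp is not found.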

Create an /etc/asound.conf like the one shown at the start and your far field will work much better.
Don’t worry when AGC ramps up the gain and noise becomes a thing, as recognition will just ignore it, and when a voice of the required volume speaks the AGC will ramp down so it doesn’t clip.
It will sound like a terrible setup for studio recording, but for recognition it will work well.

Total pain having old versions of Speex in the repo that alsa will not compile as it references RC2 but we have RC1 but hey the release has been out for ages so at least we are up to date.

Echo seems to be totally useless dunno why as the working EC on github uses exactly the same libs.
Reverb never got completed so just have it off.

asound.conf software AGC can be pretty handy, a pain to install due to a repo mismatch but there you go.
As I just did a install myself and thought I would post before I forget again.

PS I had forgotten about the custom board you guys are making as its of little interest.

I have not been really impressed with any product utilising the XMOS XVF3510.
The Anker Speakerphone I have on my desk is that chip I think, forgot but think its that.
Its OK.

My opinion is they add too much cost and the boards are inflexible, often with custom firmware or drivers that so far for me have been flaky at best.
I had forgotten about that board but have been constantly bemused at this fixation on a singular product whilst there is a lack of kit to make a whole range of Mycroft products.

Enclosures are really problematic for a Mycroft builder, as what’s available for the Pi doesn’t really fit a voice AI.
Again, though, I am bemused, as a singular product seems to have been chosen to the detriment of many and of choice.
I have often moaned in the forums that all we need is a ‘puck stack’ that is not much more than stacking tubes and hex mounts, a ‘lego’ approach that can encompass various form factors.
A Raspberry Pi-B format fits in a 4" tube and that material is very easy to source.

This has been my rationale, as I don’t like the Mark II, and don’t worry, I am not going to go into why.
But I will say that is the huge problem, as the for, the nay and the not bothered have a singular option and no other.
It’s just crazy for a small team to invest so much in a niche product and create a singular option and narrow choice even further.

Also, I will request that maybe you could look at a more modular product that provides the enclosure and sundries for a stacking tube, and vinyls to finish.
It would be really small quantities for a small company like Mycroft to supply, but for a builder to source and create singly it’s actually really expensive and time consuming.

All the electronics hardware already exists and it’s much more modular and flexible.
$10 Stereo ADC Usb soundcards exist with zero software or firmware considerations and as I have repeatedly said software EC does work and is opensource hence free.
Being analogue simple mixers can make mic arrays that provide 2 channels and I have been purely collating cost effective and modular options.

The OP I posted has a list of a couple-of-$ modules, and there are many more, even analogue MEMS; I have a couple of these and, like the electrets, they work great, and any number on 2 channels can be had.

I have just found as a builder that round electrets are very easy to isolate (it’s called a rubber grommet) and, unlike the SMD MEMS, can have short fly leads to make things rather easy.
You can isolate those little MEMS boards; it’s just a little more complex than my KISS ideology.

It was rather simple, as sound card gain on average is pretty poor, but because you have 5v & 3v3 you can create near line-in levels, and suddenly you have loads of gain, and simple AGC can provide extremely capable far field due to active powered mics.
You wouldn’t record on it, and yeah, it can have clearly audible noise, but that doesn’t matter, as a common method of MFCC extraction is raising the log-mel-amplitudes to a suitable power (around 2 or 3) before taking the DCT.
Low-energy sound components are not robust for recognition, and that log-mel-amplitude gobbledygook means they are removed; noise suppression happens at the MFCC stage, and it’s pointless to create more load to duplicate the process.

So if you want your wonderful clear tech wonder sound card, that is OK, but the only person who will ever listen to it is Mycroft, and who cares if he listens to a large proportion of hiss 24/7.
I am creating solutions for a voice AI and not a speakerphone, and in terms of engineering there are clear differences in requirements.

But it’s also in keeping with that KISS philosophy of a modern automated home: why else am I employing a voice AI? It’s extremely likely I will already have, or want to provide, RTP wireless audio.
So I don’t need speakers or amps in my Mycroft, and things are becoming even simpler and options are opening rather than closing.
In terms of cost, a Pi3, USB soundcard, mic array, LED ring and touch buttons can be made extremely cheaply; I am still frustrated at the lack of an enclosure and so definitely don’t need what’s on offer.
That cost means that for room coverage, which is currently a large issue, I have the very simple option of JGA (Just Get Another), and some simple algs and simple physics will just work better.

I have sort of posted my crown jewels of low-cost modular hardware, and the only thing it lacks is a modular enclosure, but hey ho, those are obviously far too complex to provide?!

I am currently the proud owner of far too many USB soundcards and various cheap Chinese audio modules, so the only reason I posted is so you don't end up the same.
They do work well, hence why they were posted, and apols if I don't have much interest in your newfangled 2-mic card; I promise to say no more on that one :slight_smile:

My next bit of research is compact shotgun mic formats for voice AI, as often in engineering all that is needed is to be fit for purpose, and it is looking like they will be, so I gave you guys a heads up.
The partitioning of RTP audio out and in has made things really easy for me, and alongside software EC, passive noise reduction of 3rd-source noise is entirely possible, cost-effective and viable.
No such thing as a uni-directional MEMS as far as I know, but maybe I should Google it.

Because EC often involves mixing down to mono, those individual mic channels are really pointless, so I will hardware sum onto one channel and have a 3.5mm socket on the back of the unit for an array extension.
Whether it is uni- or omni-directional on either channel is likely to be a matter of room size and mic position, and I will use both where needed.
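
To put a rough number on what summing onto one channel buys, here is a small numpy simulation. It is only a sketch under assumptions of mine: the 'voice' is a made-up sine that reaches both mics identically (no delay between them), the noise is synthetic, and `snr_db` and `sum_gain_db` are hypothetical helper names. It compares mics that each have their own independent hiss against mics that both hear the same room noise.

```python
import numpy as np

def snr_db(signal, noise):
    """Signal-to-noise ratio in dB from separate signal and noise arrays."""
    return 10 * np.log10(np.mean(signal ** 2) / np.mean(noise ** 2))

def sum_gain_db(shared_noise, n=16000, seed=0):
    """SNR change (dB) from averaging two mics that hear the same voice."""
    rng = np.random.default_rng(seed)
    voice = np.sin(2 * np.pi * 200 * np.arange(n) / 16000)
    noise_a = rng.standard_normal(n)
    # Either both mics pick up the same noise, or each has its own hiss.
    noise_b = noise_a if shared_noise else rng.standard_normal(n)
    single = snr_db(voice, noise_a)
    summed = snr_db((voice + voice) / 2, (noise_a + noise_b) / 2)
    return summed - single

if __name__ == "__main__":
    print("independent hiss:", sum_gain_db(shared_noise=False), "dB")
    print("shared room noise:", sum_gain_db(shared_noise=True), "dB")
```

Independent hiss averages down by roughly 3 dB for a pair, while noise common to both mics is not reduced at all, so a plain sum does nothing against the TV in the corner.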

Again it's just KISS & cost, but that extension is another option that vastly increases coverage at very little cost, even if JGA is also relatively cheap.
I am not really bothered about losing an array, but I guess at the low cost of mic modules I could just have a switch that chooses between array or single channel. I don't care at all about DOA indication, as it's a bit like a lift that tells you 'Going up', a statement of the bleeding obvious; it just needs to give an indication of receipt.

Almost forgot, the other thing I have been researching is the RK3308 aka RockPiS.
The analogue inputs on it are noisy; somewhere there is ripple on an internal reg, or at least that is what I have told Radxa.
PDM x8 mics and the built-in DAC work, as do I2S 8-channel in and out.
The Debian 64 image is robust, but I am having various problems with Ubuntu 20.04 as they have changed things again, and I am getting loads of compile problems that are the same on x64.
It also makes a really cost-effective Pi3 replacement at Pi0 cost, with built-in audio and embedded DSP VAD.

Haven't done that much, as the analogue mics were a pain, but I think there is going to be a hardware revision and they should be fixed.
Dunno yet.

Been a bit lazy, as my engineering of interference tubes hasn't happened yet.
Plastic tubes with holes in them are not the hardest thing to make, but I just haven't got round to it.

I did place a uni-directional electret in free air and faced it in both directions, at the same levels, towards the supposed near and far sources.
1st facing near (Stephen Fry)
2nd facing far (Example wav)

https://drive.google.com/file/d/1ErN28qRi30ILYJCwCpOY6ji6MNmzrFse/view?usp=sharing

https://drive.google.com/file/d/1PqrMqf8TB0Tj2lOlW9juVPufRS9GIRDR/view?usp=sharing

These are just cheap, plain old telephone-style electrets, and the noise reduction is substantial; in many situations it is possible to place your mic to get considerable reduction of all noise, not just echo.

It is possible to create simple 'shotgun mic' interference tubes, which I still mean to play with, to get supercardioid and narrower patterns.
It did occur to me that I could do something I haven't seen before and build a reverse 'shotgun mic' where the interference tube is on the rear, or maybe on both rear and front, to test different field patterns.

If I find the elusive chuck key to my pillar drill then I will prob post those also.

Still playing, but I also forgot to post the 3rd option, with EC and the uni-directional NS mic.

https://drive.google.com/file/d/1i-yp3tBbH8PfpeAK7vHRQEetIuwEPg8f/view?usp=sharing

Also an omni with no EC as reference.

https://drive.google.com/file/d/1xj3VrmRXRrHRUHZ7TFARMNwiYjLvAbB-/view?usp=sharing

Last reference: omni with EC, which does a fine job via speexdsp but will only attenuate the played audio and not other sources, as the uni-directional will.

https://drive.google.com/file/d/1d3it9i9gtAjK0f9tg1488ZgoPCCGFnuU/view?usp=sharing
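
As a toy illustration of why EC only removes what you play, here is a bare-bones NLMS adaptive filter in numpy. To be clear, this is not speexdsp's algorithm or API, just the textbook technique swapped in for illustration, and the echo path, the signals and the function name are all invented.

```python
import numpy as np

def nlms_echo_cancel(mic, ref, taps=8, mu=0.5, eps=1e-6):
    """Subtract an adaptive estimate of the played signal `ref` from `mic`."""
    w = np.zeros(taps)            # current estimate of the echo path
    buf = np.zeros(taps)          # most recent reference samples, newest first
    out = np.zeros(len(mic))
    for n in range(len(mic)):
        buf = np.roll(buf, 1)
        buf[0] = ref[n]
        e = mic[n] - w @ buf      # mic minus the predicted echo
        w += mu * e * buf / (buf @ buf + eps)   # normalised LMS update
        out[n] = e
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 8000
    ref = rng.standard_normal(n)                        # what the speaker plays
    echo = np.convolve(ref, [0.6, 0.3, -0.1])[:n]       # invented echo path
    other = 0.1 * np.sin(2 * np.pi * 440 * np.arange(n) / 16000)  # third source
    out = nlms_echo_cancel(echo + other, ref)
    tail = slice(n // 2, None)                          # after it has settled
    print("echo power in mic:   ", np.mean(echo[tail] ** 2))
    print("echo left after EC:  ", np.mean((out[tail] - other[tail]) ** 2))
    print("third source power:  ", np.mean(other[tail] ** 2))
    print("output power after EC:", np.mean(out[tail] ** 2))
```

In this toy the played audio is almost entirely cancelled while the 440 Hz third source comes straight through, because the filter only has the playback as its reference. A directional mic does not care where the noise came from.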

In terms of modules and mics I have sort of narrowed it down to a couple now, and these are my faves out of quite a number I have tried connected to a USB soundcard.

I have been playing with Stereo ADC USB sound cards so my cable has been one of these.

To be honest, without DOA & beamforming there is no real advantage to having a stereo pair that I have noticed.

So I have just recorded a snippet on each of my fave setups on a cheap USB sound card.

Max9814: very high output, AGC on board and GPIO-selectable gain. They come with an omnidirectional mic, and with AR=float and gain=gnd we get the following using just 6dB gain on a cheap mono USB soundcard.

https://drive.google.com/file/d/165hhGRVL75uPAEpxnXXD8k_EUJ2LIoq9/view?usp=sharing

I actually prefer a directional mic, but desoldering the omnidirectional to solder a uni-directional electret on is a bit of a pain. The AR (Attack/Release) is also very quick and prob pointless for voice, so I tack a 680uF onto Ct to make 780uF and a better 1.5sec attack.
Dunno if the hassle is worth it, but here is uni-directional with 780uF, AR=float, gain=gnd.

https://drive.google.com/file/d/1dlw3Tc26jzDzTB9vHON6QwQh1cYXSrLh/view?usp=sharing

So dunno, but they are really cheap and extremely sensitive. Prob not great for broadcast, as with the gain we get hiss, but for recognition low-order energy is cast off during MFCC, so it matters not a whole lot.

Then if you want MEMS, and maybe do want a stereo planar, here is an analogue MEMS I like.

https://drive.google.com/file/d/1MA6nf3x4pqsepnVl-LvaIv2nxS1I1g6w/view?usp=sharing

Everything was tested at the same levels with the same USB card, so you will notice the output is much lower, as there is no dedicated preamp as such on the MEMS apart from its built-in one.

There is also another preamp I have been playing with, as it has a noise gate and compressor; I just haven't wired up the 2x potentiometers yet to see what overall effect they can have.
Here it is with the MEMS at just its default settings.

https://drive.google.com/file/d/1huH-MxEQTqeZDCgSRJ0KqoLEw-AtZe6J/view?usp=sharing

I haven't found a cheap and cheerful passive mic that uses the soundcard bias output with enough gain; plenty sound great if you're up close, but forget far-field pickup.

I dunno why I am using stereo with 2x Max9814 with uni-directional, as really x1 omni will do the job; just use the Speex AGC ALSA plugin and I think you will be pretty impressed with the range.

With sound cards, the stereo ADC ones are usually about $10+, but you have to shop around.

The Syba SD-AUD20101 is a fave stereo ADC of mine.

There are some ridiculously cheap and effective single-channel cards, but I struggle to recommend one as, from experience, the internals can change.
It's why I have become a fan of the board-based Sanwu cards: the components are well specced, it's a couple of $, and you can see that it is the same.
Also, because bias is so ineffective in terms of gain, as covered above, you may want to remove it so it can have no effect on your circuits (remove R6 on the CM108).

https://www.youtube.com/watch?v=m0FjQ-X04Jk

This is some very interesting insight into mic array alternatives.

Mind if I ask if you ever considered how to best use such microphones in a car?
I am looking into trying Mycroft as a car assistant, which I find more useful than just at home.

Probably just copying the design of hands-free call systems in cars would yield good results, but I believe you have more knowledge about this than I do.

Thanks for sharing!

Probably, well definitely, electret uni, due to the style of dash mount and anti-vibration they can provide.
You also get a proximity effect with uni-directionals that you could use and prob train in.

In a situation like a car you can do as mobile phones do with their front and back mics, where blind source separation is used to remove noise, only it is not so blind, as the driver's position is known and so is that mic's.
You can sort of guarantee front & back relative to the driver position, but like all mic arrays it's rubbish without the correctly specced algs.

Hi… I bought a ReSpeaker USB microphone for $80. It incorporates a built-in soundcard that can output sound, so the theory was that you could play music through it and the microphone would still be able to pick up your voice in spite of the music playing. It would 'subtract' the music from the signal, as it were. The reality is very different, however. It only supports 22kHz audio out (low quality), and it doesn't have software controls (ALSA scontrols). Basically, you can't adjust the volume of the output as you would with any other soundcard.

Respeaker does have DSP, so it is not just an array, but it's 16kHz I think from memory, and it has both AEC & beamforming. Beamforming suffers as the VAD can be triggered by any voice, but apart from that it works OK.

The input audio for ASR doesn't need to be broadcast quality, and the WM8960 for output is separate from the mic sampling freq, which at 16kHz is quite low: with sound at 343 m/s that is about 21mm of resolution per sample.
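
The arithmetic behind that 21mm figure, as a quick sketch. The 343 m/s speed of sound and the 16kHz rate are from the post above; the function names and the 48kHz comparison are just mine for illustration.

```python
SPEED_OF_SOUND_M_S = 343.0   # figure used in the post

def mm_per_sample(sample_rate_hz, speed_m_s=SPEED_OF_SOUND_M_S):
    """Distance sound travels during one sample period, in millimetres."""
    return speed_m_s / sample_rate_hz * 1000.0

def max_delay_samples(mic_spacing_mm, sample_rate_hz):
    """Largest arrival-time difference between two mics, in samples."""
    return mic_spacing_mm / mm_per_sample(sample_rate_hz)

if __name__ == "__main__":
    print(round(mm_per_sample(16000), 1))   # 21.4
    print(round(mm_per_sample(48000), 1))   # 7.1
```

So at 16kHz two mics less than about 21mm apart never see even a whole-sample difference in arrival time, and anything finer has to come from interpolation.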

You can use the Python control script for the device:
https://wiki.seeedstudio.com/ReSpeaker-USB-Mic-Array/#faq