EC respeaker echo cancellation

StuartIanNaylor · April 10, 2020, 4:45am

PS @j1nx @Dominik
There is also a Python script for that EC. With a bit of hacking, I guess you could add PortAudio streams to the script and do away with the malarkey of the FIFOs and the EC usage.

github.com

voice-engine/voice-engine.github.io/blob/mkdocs/docs/audio_processing/aec.md

# Acoustic Echo Cancellation

In a smart speaker, Acoustic Echo Cancellation (AEC) is used to cancel the music the speaker itself is playing from the audio captured by its microphones, so it can hear your voice clearly while it is playing music.

![](/assets/images/aec.png)

The open source library `speexdsp` has an AEC algorithm. There are two examples of using it, in Python and C.

## using AEC in Python
1.  `pip3 install speexdsp`
2.  create a python script named `ec.py`

    ```
    """Acoustic Echo Cancellation for wav files."""

    import wave
    import sys
    from speexdsp import EchoCanceller


    if len(sys.argv) < 4:
        print('Usage: {} near.wav far.wav out.wav'.format(sys.argv[0]))
        sys.exit(1)

    # Samples per frame handed to the echo canceller on each call
    frame_size = 256

    # near = microphone capture (voice plus echo), far = the audio that was played
    near = wave.open(sys.argv[1], 'rb')
    far = wave.open(sys.argv[2], 'rb')

    if near.getnchannels() > 1 or far.getnchannels() > 1:
        print('Only support mono channel')
        sys.exit(2)

    # The output file keeps the near-end format
    out = wave.open(sys.argv[3], 'wb')
    out.setnchannels(near.getnchannels())
    out.setsampwidth(near.getsampwidth())
    out.setframerate(near.getframerate())


    print('near - rate: {}, channels: {}, length: {}'.format(
            near.getframerate(),
            near.getnchannels(),
            near.getnframes() / near.getframerate()))
    print('far - rate: {}, channels: {}'.format(far.getframerate(), far.getnchannels()))

    # Arguments: frame size, filter (tail) length in samples, sample rate
    echo_canceller = EchoCanceller.create(frame_size, 2048, near.getframerate())

    # Frame counts and byte counts per read; the * 2 assumes 16-bit samples
    in_data_len = frame_size
    in_data_bytes = frame_size * 2
    out_data_len = frame_size
    out_data_bytes = frame_size * 2

    while True:
        in_data = near.readframes(in_data_len)
        out_data = far.readframes(out_data_len)
        # Stop at the first short read; a trailing partial frame is dropped
        if len(in_data) != in_data_bytes or len(out_data) != out_data_bytes:
            break

        # Remove the far-end echo from the near-end frame
        in_data = echo_canceller.process(in_data, out_data)

        out.writeframes(in_data)

    near.close()
    far.close()
    out.close()
    ```
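A minimal sketch of how the loop in that script might be decoupled from wav files, so that PortAudio streams (for example via PyAudio) could be plugged in instead of the FIFOs. `run_aec` and its callables are hypothetical names, not part of `speexdsp`; the only thing assumed about the canceller is the `process(near, far)` call the script already uses.

```python
"""Sketch: the AEC loop with the audio I/O injected as callables."""


def run_aec(read_near, read_far, write_out, process, frame_size=256, sample_width=2):
    """Pump fixed-size frames through `process` until either source runs dry.

    read_near / read_far: callables taking a frame count and returning that
        many mono frames as bytes (hypothetical; e.g. wave readframes).
    write_out: callable taking the processed bytes.
    process: callable(near_bytes, far_bytes) -> bytes, e.g. the `process`
        method of the EchoCanceller created in the script.
    Returns the number of frames processed.
    """
    frame_bytes = frame_size * sample_width
    frames_done = 0
    while True:
        near = read_near(frame_size)
        far = read_far(frame_size)
        # A short read means end of input; the partial frame is dropped
        if len(near) != frame_bytes or len(far) != frame_bytes:
            break
        write_out(process(near, far))
        frames_done += 1
    return frames_done


if __name__ == "__main__":
    import io

    # Stand-in data: three full frames of 16-bit silence plus a partial one
    near_src = io.BytesIO(b"\x00\x00" * (256 * 3 + 10))
    far_src = io.BytesIO(b"\x00\x00" * (256 * 3 + 10))
    out = io.BytesIO()

    def passthrough(near, far):
        # Placeholder for echo_canceller.process; returns the near end unchanged
        return near

    n = run_aec(lambda k: near_src.read(k * 2), lambda k: far_src.read(k * 2),
                out.write, passthrough)
    print(n, len(out.getvalue()))
```

With wav files the readers would simply be `near.readframes` and `far.readframes`. With PyAudio, something like `lambda k: stream.read(k)` on an input stream should fit the same shape, though that is untested here, and getting a far-end reference that is time-aligned with the capture is the hard part.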

With the manual (https://github.com/xiph/speexdsp/blob/master/doc/manual.pdf) and the SpeexDSP source (https://git.xiph.org/?p=speexdsp.git;a=summary) there is an absolute plethora of new toys to play with.
VAD, AGC and all sorts, which will probably wait until I have an all-in-one soundcard to play with.
The tail length of the EC is confusing: there seems to be an optimal small length for EC, but smaller apparently also means more load, so it might take some playing with to get it optimal.

The EC also has a delay, for which the example gives an approximation of 200 ms. When I ran `alsabat -Pplughw:CARD=ALSA,DEV=0 -Cplughw:CARD=Device,DEV=0 --roundtriplatency -B1024` on a Pi 4, that looked a long way out, as I got an average of roughly 50 ms, but I don’t know how good that test is.
Audacity and the other main audio app on Linux whose name I can never remember (http://ardour.org/) have some latency tools that could probably verify the ALSA latency.
If you only have a tail length of 100 ms, then a 200 ms delay is a country mile out.
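To put rough numbers on that, here is a sketch that assumes a 16 kHz sample rate and that the echo has to land inside the filter window for the canceller to model it. `tail_ms` and `headroom_ms` are made-up helper names, and 2048 is the filter length from the Python script.

```python
"""Sketch: does a given playback-to-capture delay fit inside the EC tail?"""


def tail_ms(filter_length, rate):
    """Length of the echo tail in milliseconds for `filter_length` samples."""
    return filter_length * 1000 / rate


def headroom_ms(filter_length, rate, delay_ms):
    """Tail left for room reverberation once the bulk delay is accounted for.

    A negative value means the echo arrives after the filter window has ended.
    """
    return tail_ms(filter_length, rate) - delay_ms


if __name__ == "__main__":
    rate = 16000  # assumed sample rate
    for filt in (1600, 2048):      # a 100 ms tail, and the 2048 used in the script
        for delay in (50, 200):    # alsabat average vs. the 200 ms example
            print(filt, delay, headroom_ms(filt, rate, delay))
```

On these assumptions a 100 ms tail with a 200 ms delay comes out at -100 ms, so the echo would fall entirely outside the window unless the far-end reference is delayed to match, while a 50 ms delay leaves about 50 ms for the room.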

It will be weeks before the 2-mic turns up, so it is pointless me testing on a suboptimal setup of 2 sound cards.
I did notice `int frame_size = config.rate * 10 / 1000; // 10 ms`, whilst Speex suggests 20 ms (stereo?). I also noticed the default filter length is 2048, but if 100 ms is a ‘good choice’ at 16 kHz then shouldn’t that be 1600?
It is also not a whole multiple of the frame size.
And then there is the buffer, which is 128x the filter length?
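The arithmetic behind those questions, as a quick sketch (16 kHz assumed; `ms_to_samples` is a made-up helper name):

```python
"""Sketch: frame size and filter length arithmetic for the figures above."""


def ms_to_samples(ms, rate):
    """Number of samples covering `ms` milliseconds at `rate` Hz."""
    return rate * ms // 1000


if __name__ == "__main__":
    rate = 16000
    frame_10ms = ms_to_samples(10, rate)    # 160 samples
    frame_20ms = ms_to_samples(20, rate)    # 320 samples
    tail_100ms = ms_to_samples(100, rate)   # 1600 samples

    print(frame_10ms, frame_20ms, tail_100ms)
    # 2048 samples expressed in milliseconds at this rate
    print(2048 * 1000 / rate)
    # Is each filter length a whole number of 10 ms frames?
    print(2048 % frame_10ms == 0, 1600 % frame_10ms == 0)
    # 2048 against the 256-sample frame used in the Python script
    print(2048 // 256, 2048 % 256)
```

So 2048 samples is 128 ms at 16 kHz rather than 100 ms, and it works out to 12.8 frames of 160 samples. It is, however, a power of two and exactly 8 frames of the 256 used in `ec.py`.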

Maybe the manual and the latest release are a bit out of sync, or some mistakes have been made.
I have been sat waiting for ages for the 4-mic and might have to wait longer, as I should have got the 2-mic at least for initial setup.