EC respeaker echo cancellation

Yeah, saw that Python stuff before. The voicen wiki has some pages about the different software tools:
https://wiki.voicen.io/

To be honest, why the EC implementation exists at all is a question, but maybe the creator was in the same position as me with alsa-plugins: in Raspbian the speex-dsp plugins are not enabled.

I switched to Arch Linux, as it had been bugging me on Raspbian that I cannot enable them or work out the compile switch for them.

On Arch Linux, after installing alsa-plugins and speex, it works. I love Arch generally and really should do a Mycroft setup, but I think there is one in the AUR that I should test.
The speex plugins for ALSA work, as in they run, so I dunno what EC is about really, or maybe it is a better implementation?
I cannot really say, as I am doing what it says not to do and running on separate cards, but I can share a working asound.conf if anyone wants to play.

pcm.!default {
    type asym
    playback.pcm "plughw:CARD=ALSA,DEV=0"
    capture.pcm  "cap"
}

pcm.array {
 type hw
 card 1
}

pcm.cap {
 type plug
 slave {
   pcm "array"
   channels 4
   }
 route_policy sum
}

pcm.echo {
 type speex
 slave.pcm "cap"
 echo yes
 frames 256
 filter_length 1600
 denoise false
}

pcm.agc {
 type speex
 slave.pcm "echo"
 agc 1
 denoise yes
 dereverb yes
}

arecord -Dplug:agc -r 16000 -f S16_LE test.wav

Apr 11 16:18:42 raspberrypi pulseaudio[294]: E: [alsa-source-USB Audio] module-echo-cancel.c: Doing resync
Apr 11 16:18:42 raspberrypi pulseaudio[294]: E: [alsa-source-USB Audio] module-echo-cancel.c: Playback too far ahead (23694), drop source 6064
Apr 11 16:18:42 raspberrypi pulseaudio[294]: E: [alsa-source-USB Audio] module-echo-cancel.c: Doing resync
Apr 11 16:18:42 raspberrypi pulseaudio[294]: E: [alsa-source-USB Audio] module-echo-cancel.c: Playback too far ahead (36730), drop source 9392
Apr 11 16:18:42 raspberrypi pulseaudio[294]: E: [alsa-source-USB Audio] module-echo-cancel.c: Doing resync
Apr 11 16:18:42 raspberrypi pulseaudio[294]: E: [alsa-source-USB Audio] module-echo-cancel.c: Playback after capture (-1654), drop sink 288
Apr 11 16:20:10 raspberrypi pulseaudio[294]: E: [alsa-source-USB Audio] module-echo-cancel.c: Doing resync
Apr 11 16:20:10 raspberrypi pulseaudio[294]: E: [alsa-source-USB Audio] module-echo-cancel.c: Playback too far ahead (44329), drop source 11344
Apr 11 16:20:10 raspberrypi pulseaudio[294]: E: [alsa-source-USB Audio] module-echo-cancel.c: Doing resync
Apr 11 16:20:10 raspberrypi pulseaudio[294]: E: [alsa-source-USB Audio] module-echo-cancel.c: Playback too far ahead (1753), drop source 448
Apr 11 16:20:18 raspberrypi pulseaudio[294]: E: [alsa-source-USB Audio] module-echo-cancel.c: Doing resync
Apr 11 16:20:18 raspberrypi pulseaudio[294]: E: [alsa-source-USB Audio] module-echo-cancel.c: Playback too far ahead (93066), drop source 23824
Apr 11 16:20:18 raspberrypi pulseaudio[294]: E: [alsa-source-USB Audio] module-echo-cancel.c: Doing resync
Apr 11 16:20:18 raspberrypi pulseaudio[294]: E: [alsa-source-USB Audio] module-echo-cancel.c: Playback too far ahead (42528), drop source 10880
Apr 11 16:20:21 raspberrypi pulseaudio[294]: E: [alsa-source-USB Audio] module-echo-cancel.c: Doing resync
Apr 11 16:20:21 raspberrypi pulseaudio[294]: E: [alsa-source-USB Audio] module-echo-cancel.c: Playback too far ahead (13670), drop source 3488
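
To get a feel for how bad the resyncs are, a log like the one above can be summarised with a few lines of Python. This is only a rough sketch: the regular expression is written against the two message shapes visible in this log, the function names are made up for illustration, and the log does not state units, so the numbers are treated as plain integers.

```python
import re

# Matches the two resync messages that appear in the log above.
RESYNC_RE = re.compile(
    r"Playback (?P<kind>too far ahead|after capture) "
    r"\((?P<offset>-?\d+)\), drop (?P<side>source|sink) (?P<dropped>\d+)"
)

def parse_resyncs(lines):
    """Return one (kind, offset, side, dropped) tuple per resync message."""
    events = []
    for line in lines:
        m = RESYNC_RE.search(line)
        if m:
            events.append(
                (m["kind"], int(m["offset"]), m["side"], int(m["dropped"]))
            )
    return events

def summarize(events):
    """Count the resyncs and report the largest offset and total dropped."""
    return {
        "count": len(events),
        "max_abs_offset": max((abs(e[1]) for e in events), default=0),
        "total_dropped": sum(e[3] for e in events),
    }
```

Feeding it the lines from `journalctl` (or wherever the PulseAudio log ends up) before and after a config change gives a quick way to compare two setups.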

So is this what happens when you try to use 2 separate soundcards?
I cannot say, as I don’t have a single card to try yet. I guess I could throw on my headset, but the mic will not really pick up the headphones.
Will have to wait until the 2 mic turns up.

There is an Arch image here, currently set up so that ALSA is using pulse, so I think everything should go through the AEC of webrtc.
It also has the speex modules, so you could try those as well.
Or if you just want a Pi Arch image, well, there is one here already done, which saves you doing the initial build and setup.
You just have to register your Archcroft. :slight_smile:

This is what happens when you use an RPI, regardless of whether audio in/out is on one soundcard (without hw-loopback) or on two separate soundcards. At least this is exactly what I have seen on my Mark-I while experimenting with ec/speex.

Yeah, that’s the webrtc AEC, which is designed to run with different sound cards and resync.
Speex doesn’t have that ability, but there might also be too much variance in latency.
I am compiling the RT-Preempt kernel at the moment to see if it makes a difference.
It used to be a bit of a killer on the Pi3 but apparently runs quite well on the Pi4.
/etc/asound.conf

# Use PulseAudio by default
pcm.!default {
  type pulse
  fallback "sysdefault"
  hint {
    show on
    description "Default ALSA Output (currently PulseAudio Sound Server)"
  }
}

ctl.!default {
  type pulse
  fallback "sysdefault"
}

# vim:set ft=alsaconf:

I switched to webrtc, but with pulseaudio-alsa so that everything runs through pulse and its EC.
It’s set up in the above image, and unlike Raspbian it also has the ALSA speex EC modules.

I can’t really draw any conclusions here, as I haven’t really played with adjust_time=<how often to readjust rates in s> and adjust_threshold=<how much drift to readjust after in ms>.
I also want to see how much benefit the RT kernel provides for drift on a Pi4.
But that is with the Pi on-board audio and a PS3 Eye cam, which is likely the worst pairing possible.

Oooh! It’s all hard work. The kernel compile on Arch didn’t work and takes ages, so I thought sod that, as a nice chap has already done it for Raspbian.

That kernel is actually a bit old, as the Pi4 got some rapid updates, but it will do.

I’m sort of stuck: on Raspbian there is no compile option for alsa-speex, and on Arch the kernel in the AUR is x86, which I will revisit as I should probably update this.

There is Raspbian, which as you say seems to go to hell, though I haven’t tested it with a single sound card, but there is an RT-Preempt image here.

It’s actually really s-pipe, but that doesn’t matter, as it picks up voice and cancels what’s playing, so it looks like a corking method could be created, which would be a big plus for Forrest-Mycroft.

Still waiting for cards, as I still think trying this with the PS3 Eye is probably a bad idea.

PS is it just me or does Mimic sound better on RT?

I doubt that EC or webrtc is properly coded to make use of RT latency.:thinking:

? RT just tries to give a constant latency, there is no coding to make use of RT latency.
RT latency is low and constant?

True to some extent, but for real real-time the program has to be written properly for it. It all has to do with the threads within the program. There is a reason that patch set still hasn’t made it to mainline.

https://rt.wiki.kernel.org/index.php/Frequently_Asked_Questions#Do_I_need_root_privileges_to_start_a_realtime_application.3F

https://rt.wiki.kernel.org/index.php/HOWTO:_Build_an_RT-application

I am not saying it will not bring you some benefits, but think those benefits could also be gained by increasing the prio of the running task.

EDIT: On MycroftOS I am also running PulseAudio under a higher priority than the rest.

Yeah, dunno @j1nx. I was wondering about that, but I was also wondering about more constant latency with the speex EC.
But also maybe more priority.

To be honest, the EC works quite well with webrtc, but from what I can tell the AGC attenuates over time.
But to be honest, I am just lining myself up with images to test when the gear finally gets delivered.
Thought I would share the images, just quick images in case anybody else wanted to try them rather than do a complete install.

To be honest, from starting here on the forum to now I have picked up loads of Linux audio info, but it would seem the more you get into it the more confusing it gets.
If you turn AGC off then it does remove echo, but it is also too vocodey, if that’s a word, but you probably know what I mean.
It might be much better on a single card, and I thought the extra consistency of RT may help.
I am not expecting any miracle, but I wondered whether with all these little additions it could be just good enough.

Strange, and I dunno if it’s just my ears or that I now have a different setup to before with pulse, but Mimic seems to sound better.

Also, can you stop and start Precise with different options depending on whether media is playing or not?

// Precise options:
// "sensitivity": 0.5, // Higher = more sensitive
// "trigger_level": 3 // Higher = more delay & less sensitive
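
A rough sketch of what that switching could look like, in Python. Everything here is hypothetical: the profile names and both helper functions are made up for illustration, only the idle values (0.5 and 3) come from the options quoted above, and the media values are placeholders that would need tuning. How the listener is actually restarted with the new options is not shown.

```python
# Hypothetical profiles: only the idle values (0.5 and 3) come from the
# Precise options quoted above; the media values are placeholders to tune.
IDLE_PROFILE = {"sensitivity": 0.5, "trigger_level": 3}
MEDIA_PROFILE = {"sensitivity": 0.7, "trigger_level": 3}

def precise_options(media_playing, idle=IDLE_PROFILE, media=MEDIA_PROFILE):
    """Pick the option set the listener should be (re)started with."""
    return dict(media if media_playing else idle)

def needs_restart(current, media_playing):
    """True when the running options differ from the profile wanted now."""
    return current != precise_options(media_playing)
```

The idea would be to call `needs_restart()` whenever playback starts or stops and only restart the wake word engine when the wanted profile actually changes.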

Whahaha, I know exactly what you mean with learning on the spot. The more you know, the more you realize you don’t know shit 😂

But don’t get discouraged. I keep a close eye on your progress. Hopefully somewhere you manage to connect the dots…

Just keep up the good hard work. “We” will help you where we can, and sometimes that means we are critical or even sceptical. But sometimes the best solutions come from: Hah, proved you wrong. :stuck_out_tongue_winking_eye:

About Mimic. Two things could be the case:

  1. You were on the old version first and have now pulled the newer 0.1.3.0 version.
  2. I saw you also push everything to pulse via the alsa-utils pulse addon.

Both can be reverted easily to inspect if you are right about it sounding better.

:slight_smile:
I am not trying to prove anyone wrong, but I just refuse to believe that something adequate to do the job cannot be found at a sensible price.
The Forrest_Gump of a runaway deaf Mycroft is just a project showstopper to me and starts to require some gear that is overkill.
The Google Home, I think, has an ARMADA 1500 Plus in it, which has a VeriSilicon ZSP800 in it, and they are certainly not paying $70 for them, or even for just the audio processing part.

I think I have already connected the dots, as the EC via webrtc works OK. It is a bit vocodey, but that may work well enough if Precise can run with different sensitivities when playing media and when not.
It will be a bit hacky but could probably work.

I am merely hoping that with a single sound card I get slightly better results than I am getting now, and Forrest may hear and stop.
There is some weird s-pipe happening with webrtc-audio-processing and, like beamforming in many respects, some items just don’t work, so you have to hack around them.

The AGC seems to be inverted: it tries to increase the volume but progressively mutes itself.
The EC is a bit s-pipe, but it’s actually working as I have it now, though so much of this is just plain weird.
If you have AGC on without playback, it’s all loud and crisp.
Turn off AGC and you can have the volume at max, do the same, and never reach that volume.
Analogue gain: you play with it and think WTF! They warn that it could cause distortion if used, but who cares, as it just doesn’t work.
Digital AGC goes nuts with EC playback and mutes rather than gives gain.

I am even more sure you can get reasonable EC on a Pi3/4, but boy, the software tools we have are pretty weak.
PS wow, the Precise load ramps up with the RT kernel, boy does it ramp up, but the PA-EC seems to produce less load on it than on the standard kernel.

If webrtc worked in its entirety and was like the speex ALSA module, then I would be saying yes, definitely, as you could just build up the PCM structure via some nested EC options in an asound.conf file.

Regarding the price of Google and Alexa devices you should consider that a) parts are much cheaper due to high production volume and b) these devices are subsidized by services and goods that are sold through them

That is of no consideration to a consumer and purchaser, which is why I have been on a mission to reduce cost.
The margins Google & Amazon make on product are low, in fact they are almost dumping product, but unless reasonable alternatives can be found then there are no reasonable alternatives.
But actually the silicon is really cheap, a couple of dollars cheap, WM8281 and the like; they have access and we don’t.
So it’s either some lateral thought, or silly-priced product, or the great-priced commercial alternatives.

@j1nx Mimic does sound better, as I had noticed; I wondered if it was just my ears and memory.
The pulseaudio daemon does have config for RT:

 high-priority = yes
 nice-level = -11

 realtime-scheduling = yes
 realtime-priority = 5

But haven’t read up on it.

With the above raspbian image and

load-module module-echo-cancel aec_method=webrtc aec_args="analog_gain_control=0 digital_gain_control=1 agc_start_volume=85 drift_compensation=1" source_name=echoCancel_source sink_name=echoCancel_sink

set-default-source echoCancel_source
set-default-sink echoCancel_sink

The voice stays very much in sync, and for some reason specifying drift_compensation=1 does that, but it also introduces an amount of bleed, or doesn’t subtract the echo completely.
It’s massively better than how I started with speex, and I am sure it could actually be used.
But we will have to see, as an SPI sound card that has a synchronised input and output might be even better, and I could run without drift compensation and get no bleed.
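
For comparing runs with and without drift compensation, the `load-module` line can be generated rather than hand-edited. This is just a small sketch: the helper name `echo_cancel_line` is made up, and it only reassembles the parameters already shown in the line above.

```python
def echo_cancel_line(aec_args,
                     source_name="echoCancel_source",
                     sink_name="echoCancel_sink"):
    """Build a module-echo-cancel line for default.pa from a dict of aec_args."""
    args = " ".join(f"{key}={value}" for key, value in aec_args.items())
    return (
        "load-module module-echo-cancel aec_method=webrtc "
        f'aec_args="{args}" '
        f"source_name={source_name} sink_name={sink_name}"
    )

# The settings used above, in the same order.
base_args = {
    "analog_gain_control": 0,
    "digital_gain_control": 1,
    "agc_start_volume": 85,
    "drift_compensation": 1,
}

with_drift = echo_cancel_line(base_args)
# Same settings, but with drift compensation switched off for comparison.
without_drift = echo_cancel_line({**base_args, "drift_compensation": 0})
```

`with_drift` reproduces the line used above, and `without_drift` is the variant to try once input and output share one clock.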

Also take a look at the “tsched” parameter, which you can enable/disable for the alsa/udev sink. You can reduce the latency with that as well. There are also some timing parameters related to it, but that is very hardware dependent.

I have “tsched=0” for Mycroft, which helps a lot to make sound way snappier, especially for the wake sound not being so delayed because of the time-based scheduler. (On an RPi, that is.)

@j1nx Next one is to work out module-role-ducking, as I think voice_detection=1 hopefully adds a media.role property? Otherwise I am totally bemused as to how to implement it.

The documentation is just disastrous, and I have had enough, so it will have to be another day.
Waiting for the 2 mic to turn up and see if Forrest Mycroft can be stopped.

I will reread the posts one time and try tsched with the USB.

Ducking is already merged within mycroft and is needed to duck played music when TTS kicks in.

Is that just pulse ducking or internal? I presume it’s picked up via --stream-name=mycroft-voice.

Yeah, I want to try the same with the mic and see how it goes, as it’s VAD, so hopefully it’s added over a threshold.
To be honest, how you can utilise VAD, or what function it has, is a documentation mystery.

I discussed it here;