Mycroft Community Forum

PS3 Eye Best Settings

I noticed with the Picroft image that when you select the PS3 Eye it sets up the microphone, but that is it.
I have been playing with PulseAudio to get the best settings and added them to audio_setup.sh.

A single line does it all.

nano $HOME/audio_setup.sh

#!/bin/bash
# Use this script to execute audio setup actions
sudo amixer cset numid=3 "1" > /dev/null 2>&1
amixer set PCM 79% > /dev/null 2>&1
amixer set Master 79% > /dev/null 2>&1
pactl load-module module-echo-cancel use_master_format=1 aec_method='webrtc' aec_args='"analog_gain_control=0 digital_gain_control=1 voice_detection=1 beamforming=1 mic_geometry=-0.03,0,0,-0.01,0,0,0.01,0,0,0.03,0,0"'

audio_setup.sh runs on each boot, or just run ./audio_setup.sh to test and use mycroft-cli-client to watch how things are recognised.

Or run pactl load-module module-echo-cancel use_master_format=1 aec_method='webrtc' aec_args='"analog_gain_control=0 digital_gain_control=1 voice_detection=1 beamforming=1 mic_geometry=-0.03,0,0,-0.01,0,0,0.01,0,0,0.03,0,0"' from the CLI.
Run pactl unload-module module-echo-cancel to remove it.
The digital gain control works great, and voice_detection=1 seems to provide an improvement.
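
If you want to check whether the module is actually loaded before unloading it, a rough sketch like this might do. The helper name echo_cancel_indexes is made up for illustration; it only assumes that pactl list short modules prints tab-separated index, name and arguments.

```bash
#!/bin/bash
# Print the index of every loaded module-echo-cancel instance.
# Reads the output of `pactl list short modules` on stdin, which is
# tab-separated: index, module name, arguments.
# (echo_cancel_indexes is a hypothetical helper name, not a PulseAudio tool.)
echo_cancel_indexes() {
    awk -F '\t' '$2 == "module-echo-cancel" { print $1 }'
}

# Usage (needs a running PulseAudio daemon):
#   pactl list short modules | echo_cancel_indexes
```

No output means the module is not loaded, so there is nothing to unload.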

The settings can be used with any mic; only the beamforming geometry is specific to the 4x linear mic array of the PS3 Eye.
I grabbed that from the web, as I really struggled to find a single authoritative source for any of the settings.
The format of the beamforming geometry, from memory, is x1,y1,z1,x2,y2,z2,x3,y3,z3,x4,y4,z4
The linear mic array on the PS3 Eye is 4 mics, equally spaced in the same orientation, approx 20mm apart, so everything is measured from a virtual centre point and, without actually knowing, 0.01 seems to be approx 10mm.
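
For other evenly spaced linear arrays, the same string can be worked out rather than typed by hand. This is only a sketch: the function name mic_geometry_for is invented here, and it assumes the mics sit on the x axis, centred on the origin, with y and z left at 0 and the spacing given in the same units the geometry line uses (0.02 for the roughly 20mm PS3 Eye spacing).

```bash
#!/bin/bash
# Build a mic_geometry string for COUNT mics in a straight line,
# SPACING apart, centred on the origin. Only x varies; y and z stay 0.
# (mic_geometry_for is a hypothetical helper, not part of PulseAudio.)
mic_geometry_for() {
    awk -v n="$1" -v d="$2" 'BEGIN {
        for (i = 0; i < n; i++) {
            x = (i - (n - 1) / 2) * d      # offset from the array centre
            printf "%s%g,0,0", (i ? "," : ""), x
        }
        printf "\n"
    }'
}

mic_geometry_for 4 0.02    # -0.03,0,0,-0.01,0,0,0.01,0,0,0.03,0,0
```

With 4 mics at 0.02 this reproduces the PS3 Eye geometry used in the one-liner above.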

The above seems to work great for me, and I am wondering whether it should be part of the setup if the PS3 Eye is chosen, as that single line makes the mic a huge improvement over the vanilla setup.

[edit]

Prob best to edit /etc/pulse/default.pa

    ### Enable Echo/Noise-Cancellation
    load-module module-echo-cancel use_master_format=1 aec_method=webrtc aec_args="analog_gain_control=0\ digital_gain_control=1\ agc_start_volume=85\ beamforming=1\ mic_geometry=-0.03,0,0,-0.01,0,0,0.01,0,0,0.03,0,0"  source_name=echoCancel_source sink_name=echoCancel_sink

    set-default-source echoCancel_source
    set-default-sink echoCancel_sink

Check the further posts: with more playing I found the filters, noise suppression & VAD seem to make zero difference to recognition.
If you record and play back you can hear a difference, but I am starting to think the recogniser will pluck out speech anyway, and it is prob best not to introduce filter noise that purely makes it cleaner for humans.

I never did work out how to add an ALSA PCM software volume, as the default is quite quiet.
I will get round to that one day :slight_smile:


Hey Stuart this looks very interesting, thanks for sharing your findings!

Looks like you were spot on with the mic_geometry, there’s a comment in the PulseAudio code that all values are in metres.

Would you like to contribute this directly to the Picroft repo with a pull request?

It would require you to sign the Contributor License Agreement. This is a one-time signing that protects yourself, the project, and users of Mycroft technologies. The agreement makes it crystal clear that along with your code you are offering a license to use it within the confines of this project. You retain ownership of the code, this is just a license. You will also be added to our list of excellent human beings!

I wonder if it’s better to run the load module command on startup, or to add it to the default PulseAudio configuration? Do you think it’s something that people will want to turn on and off or is it just a better configuration all round?


Well, it was strange: I tried to add it to default.pa, but I dunno if I did something wrong, as it created an error in the logs and I could tell from the volumes that it wasn’t working.
I just tacked it onto audio_setup.sh and it worked.

So yeah, I first thought it should be in the conf, but thinking about it, audio_setup.sh might actually be clearer.
If it’s set I think users will keep it; after using it I will not be going without.

Glad you found the comment about metres, as that does make sense. Wow, I found getting info on PulseAudio quite hard, but I think that is the optimal line for most arrays; just the geometry needs to be set.

You guys are the excellent human beings; thanks for the project, as it’s excellent and actually really interesting with the current buzz around home assistant AI.

PS: Motion is just sudo apt-get install motion, and at most, on movement detection, it creates about the same load as Precise.
The Pi 4 ticks away with a load of about 1.4 with both running and cam movement all the time.
So not a problem.

The MSC Mycroft Skill Creator is brilliant but will take me some time to get up to speed.
If anyone wants to create a Motion skill please do, as it’s just a matter of disabling it on start and then just service motion start / service motion stop.
Also it doesn’t make a lot of sense to have a security cam and store locally, but https://medium.com/@artur.klauser/mounting-google-drive-on-raspberry-pi-f5002c7095c2
is real easy to do: just mount at /var/lib/motion, then if someone nicks your Mycroft you have them in the act :slight_smile:
I quite like the idea, especially as you have a cam anyway with the mic array the PS3 Eye brings.
I would really like it to trigger openHAB, which might have an occupancy enter / exit routine.

But it just seems a good fit, and it feels weird having a cam that has no purpose other than its mic…
Awesome mic array for £5, which is what my eBay purchase was.

Arch Linux, as always, was the best source of info; I just love that distro.
https://wiki.archlinux.org/index.php/PulseAudio/Troubleshooting#Audio_quality

Possible ‘aec_args’ for ‘aec_method=webrtc’

Here is a list of possible ‘aec_args’ for ‘aec_method=webrtc’ with their default values [4][5]:

  • analog_gain_control=1 - Analog AGC - ‘Automatic Gain Control’ done over changing the volume directly - Will most likely lead to distortions.
  • digital_gain_control=0 - Digital AGC - ‘Automatic Gain Control’ done in post processing (higher CPU load).
  • experimental_agc=0 - Allow enabling of the webrtc experimental AGC mechanism.
  • agc_start_volume=85 - Initial volume when using AGC - Possible values 0-255 - A too low initial volume may prevent the AGC algorithm from ever raising the volume high enough [6].
  • high_pass_filter=1 - ?
  • noise_suppression=1 - Noise suppression.
  • voice_detection=1 - VAD - Voice activity detection.
  • extended_filter=0 - The extended filter is more complex and less sensitive to incorrect delay reporting from the hardware than the regular filter. The extended filter mode is disabled by default, because it seemed to produce worse results during double-talk [7].
  • intelligibility_enhancer=0 - Some bits for webrtc intelligibility enhancer.
  • drift_compensation=0 - Drift compensation to allow echo cancellation between different devices (such as speakers on your laptop and the microphone on your USB webcam). - only possible with “mobile=0”.
  • beamforming=0 - This can significantly reduce background noise. See [8][9]
    • mic_geometry=x1,y1,z1,x2,y2,z2 - Only with “beamforming=1”.
    • target_direction=a,e,r - Only with “beamforming=1”. Note: If the module does not want to load with this argument, set azimuth (a) to the desired value, but set elevation (e) and radius (r) to 0.
  • mobile=0 - ?
    • routing_mode=speakerphone - Possible Values “quiet-earpiece-or-headset,earpiece,loud-earpiece,speakerphone,loud-speakerphone” - only valid with “mobile=1”.
    • comfort_noise=1 - ? - only valid with “mobile=1”.

There are quite a few functions I didn’t try, to see what improvement, if any, they provide.
I got that one-liner working and the results were such an improvement I didn’t really play more.
voice_detection=1, to be honest, I haven’t a clue.
Prob any of the above might actually be advantageous, but I was rather slow on the beamforming and found the documentation for AEC very fragmented for examples.

PS also wondered why you guys are on Raspbian as Arch is just ace :slight_smile:

https://arunraghavan.net/2016/05/improvements-to-pulseaudios-echo-cancellation/
Like yeah they are improved but what do they actually do :slight_smile:
https://www.freedesktop.org/wiki/Software/PulseAudio/Notes/9.0/

@gez-mycroft To be honest, gez, I dunno if all is OK, but it’s confusing me, as PulseAudio seems to be using the default source but the AGC is obviously in action.
Confused.com about PulseAudio and ALSA, as I only use them desktop-wise and they are always just there.

I think where I went wrong before in default.pa was that I was adding the extra double quotes that pactl needs.
But pactl list sources shows it as suspended even if the AGC seems to be working, so I presumed beamforming was also.
Strange though, as I just reflashed Mycroft and this time the low ALSA volume defaults of 33% have come up as 100%.

So I am going back to the start, as I am slightly confused; I guess more googling is needed.
I have it set in default.pa, as you’re right: if PulseAudio is ever killed then audio_setup.sh will not get called again, so it prob isn’t the best place for it.

/etc/pulse/default.pa
### Enable Echo/Noise-Cancellation
load-module module-echo-cancel use_master_format=1 aec_method=webrtc aec_args="analog_gain_control=0 digital_gain_control=1 agc_start_volume=85 high_pass_filter=1 noise_suppression=1 voice_detection=1 beamforming=1 mic_geometry=-0.03,0,0,-0.01,0,0,0.01,0,0,0.03,0,0" source_name=echoCancel_source sink_name=echoCancel_sink

set-default-source echoCancel_source
set-default-sink echoCancel_sink

The script from the mighty Arch Linux wiki is handy; I must have deleted the source & sink.

Script for reloading module-echo-cancel

Since the module-echo-cancel is not always needed, or must be reloaded if the source_master or sink_master has changed, it is nice to have an easy way to load or reload the module-echo-cancel.

Create the following script and make it executable:

echoCancelEnable.sh

#!/bin/bash
aecArgs="$*"
# If no "aec_args" are passed on to the script, use this "aec_args" as default:
[ -z "$aecArgs" ] && aecArgs="analog_gain_control=0 digital_gain_control=1"
newSourceName="echoCancelSource"
newSinkName="echoCancelSink"

# "module-switch-on-connect" with "ignore_virtual=no" (needs PulseAudio 12 or higher) is needed to automatically move existing streams to a new (virtual) default source and sink.
if ! pactl list modules short | grep "module-switch-on-connect.*ignore_virtual=no" >/dev/null 2>&1; then
	echo Load module \"module-switch-on-connect\" with \"ignore_virtual=no\"
	pactl unload-module module-switch-on-connect 2>/dev/null
	pactl load-module module-switch-on-connect ignore_virtual=no
fi

# Reload "module-echo-cancel"
echo Reload \"module-echo-cancel\" with \"aec_args=$aecArgs\"
pactl unload-module module-echo-cancel 2>/dev/null
if pactl load-module module-echo-cancel use_master_format=1 aec_method=webrtc aec_args=\"$aecArgs\" source_name=$newSourceName sink_name=$newSinkName; then
	# Set a new default source and sink, if module-echo-cancel has loaded successfully.
	pacmd set-default-source $newSourceName
	pacmd set-default-sink $newSinkName
fi

PS: I found it much easier just to flash a desktop version of Raspbian and install pavucontrol, as it gives you a really simple mixer that shows the inputs side by side, so you can see the original multichannel input and the echo-cancelled one next to each other.

After each change just run pulseaudio -k and the daemon will auto-restart with the new settings.
Prob best to get a feel for it and set up there, or at least it was for me, being a dummy.

Using Audacity and pavucontrol made it much easier to get a feel for things.

It’s real easy to set up the inputs via the config panel
https://imagebin.ca/v/5Dakzgv5wbAD

Then just view the inputs

https://imagebin.ca/v/5DamOAx7f5bS

A further update on the beamforming part, as I have been confused that you seem to be able to set up beamforming without a target direction.

webrtc:

  • (most arguments are yet to be documented)
  • beamforming
    • A boolean, set to true to enable beamforming. When enabling this, the “mic_geometry” argument has to be given too. The “target_direction” argument can be used to configure the beamforming target direction.
  • mic_geometry
    • The microphone positions for beamforming. The value is a list of numbers (coordinates). For example, a microphone array containing two mics would require six numbers: “x1,y1,z1,x2,y2,z2”. In that example, the first three numbers specify the coordinates of the first mic relative to the center of the microphone array, and the last three numbers specify the coordinates of the second mic. All distances are given in meters. ‘x’ is the horizontal coordinate, with positive values being to the right from the mic array’s perspective. ‘y’ is the depth coordinate, with positive values being in front of the array. ‘z’ is the vertical coordinate, with positive values being above the array. As an example, if you have a webcam with 2 microphones 8cm apart, and you want to point it forwards, you could use pactl load-module module-echo-cancel use_master_format=1 aec_method='webrtc' aec_args='"beamforming=1 mic_geometry=-0.04,0,0,0.04,0,0"'
  • target_direction
    • The target position relative to the centre of the mic array, for beamforming. The value is a list of three numbers (a spherical point): “a,e,r”. ‘a’ is the azimuth of the target in radians. Zero radians azimuth points to the right of the mic array, and positive angles move in a counter-clockwise direction. ‘e’ is the elevation of the target in radians. Zero radians elevation means that the target is on the same level horizontally as the center of the array, and positive angles go upwards. ‘r’ is the radius, i.e. the distance from the center of the array (in meters).

It only works with linear arrays, in that only the azimuth can be set, but I don’t think this is anything to do with the geometry.
It doesn’t matter anyway, as the intended use is only with stereo mics and arrays like the PS3 Eye.
What’s also been confusing me is that you don’t have to declare the target direction, that there seems to be no DoA (Direction of Arrival) routine, and that if it is left unset then 0 is the default azimuth.
https://chromium.googlesource.com/external/webrtc/+/d82f55d2a73da925d4160e3d1430f79f76170992/webrtc/modules/audio_processing/beamformer/beamformer.h

Now that is weird, as 0, as far as I can tell, points to the right of the mic array, so is it beamforming at 90 degrees from straight ahead on start?!
Am I being dumb?
It’s in radians, so to beamform straight ahead it needs to be 1.571 rad, as it’s counter-clockwise?
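
To sanity-check the arithmetic, here is a small sketch that turns an azimuth in degrees into a target_direction value. The function name target_direction_for is made up, and leaving elevation and radius at 0 is just the workaround suggested in the Arch notes quoted earlier, not something I have verified against the beamformer.

```bash
#!/bin/bash
# Convert an azimuth in degrees to a target_direction value "a,e,r",
# with the azimuth in radians and elevation and radius left at 0.
# (target_direction_for is a hypothetical helper, not part of PulseAudio.)
target_direction_for() {
    awk -v deg="$1" 'BEGIN {
        pi = atan2(0, -1)
        printf "%.4f,0,0\n", deg * pi / 180
    }'
}

target_direction_for 90    # 1.5708,0,0
```

So 90 degrees counter-clockwise from the right-hand side comes out as 1.5708 rad, which by the coordinate description quoted above should be straight ahead of the array.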

It’s not very elegant, but on wake word detection you can use a DoA routine and then unload and reload the AEC module with the updated target_direction.
I have been searching for some time now and need to spend some more time with the https://github.com/voice-engine code and https://hackaday.io/project/164221-smart-speaker-from-scratch
Brilliant reference stuff, but I am planning a straight steal of the DoA with a crude unload/load of the WebRTC module.
For all those with posh array speakers you don’t need this stuff, but some have to make do with cheap and cheerful.
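
The crude unload/load idea could look something like the sketch below. The names GEOMETRY, aec_args_for and steer_beam are invented for illustration, the azimuth is assumed to come from whatever DoA routine you end up using, and whether re-steering like this actually improves anything is untested.

```bash
#!/bin/bash
# Mic positions for the PS3 Eye, as used earlier in this thread.
GEOMETRY="-0.03,0,0,-0.01,0,0,0.01,0,0,0.03,0,0"

# Build the aec_args value for a given azimuth in radians.
# (aec_args_for and steer_beam are hypothetical helper names.)
aec_args_for() {
    echo "analog_gain_control=0 digital_gain_control=1 beamforming=1 mic_geometry=$GEOMETRY target_direction=$1,0,0"
}

# Unload module-echo-cancel and load it again pointing at the new azimuth.
# The escaped inner double quotes are the extra quotes pactl needs so that
# the whole aec_args value reaches PulseAudio as one argument.
steer_beam() {
    pactl unload-module module-echo-cancel 2>/dev/null
    pactl load-module module-echo-cancel use_master_format=1 \
        aec_method=webrtc aec_args="\"$(aec_args_for "$1")\""
}

# Usage, e.g. after a wake word (needs a running PulseAudio daemon):
#   steer_beam 1.5708
```

Reloading the module drops and recreates the echo-cancel source and sink, so anything recording from it at that moment may get interrupted.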

To be honest, at the start I was just thinking yep, WebRTC does it all, but I think it’s missing DoA and strangely starts pointing 90 degrees to the right of the centre point of the linear array.
I also need to apply an ALSA softvol to the PS3 Eye: maybe because it’s multichannel, the min/max gets set to zero for control 2, and I am wondering if that means I am missing out on amplified gain.
I need to stop the googling frenzy and do some testing now. Without DoA you can just set target_direction to 1.5708,0,0 on start, unless I have got it wrong and the mics are then pointing left.
We will see :slight_smile:

Well, this has put a spanner in the works, but it was great that Arun replied and good of him to do so.

Hey Stuart,
The webrtc library doesn’t implement DOA. I’m not sure how much doing steering (changing target_direction) dynamically works either. Unfortunately, the team has dropped beamforming upstream altogether, so when we next update the library, this support will be lost. :frowning:

Best regards,
Arun