YouTube Audio Skill - testing and feedback

OK, just confirmed it.

“Hey mycroft, play rise against hero of war from youtube”

“>> Just a second”
“>> Now playing Rise Against - Hero of war ( Official Video ) from youtube”

“Hey mycroft, stream radionl”

“>> Playing streaming station RADIONL”

After the “Hey Mycroft” while the YouTube stream was playing, the listener beep played, the stream was muted, and it then continued at a low volume during the TTS output of “Playing streaming station RADIONL”. Then the YouTube volume was restored and the radio station started playing as well: two songs at the same time.

Giving the stop command stops them both.

Here are the default configs for my system:

asound.conf

# Use PulseAudio by default
pcm.!default {
  type pulse
  fallback "sysdefault"
  hint {
    show on
    description "Default ALSA Output (currently PulseAudio Sound Server)"
  }
}

ctl.!default {
  type pulse
  fallback "sysdefault"
}

daemon.conf

# This file is part of PulseAudio.
#
# PulseAudio is free software; you can redistribute it and/or modify
# it under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# PulseAudio is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public License
# along with PulseAudio; if not, see <http://www.gnu.org/licenses/>.

## Configuration file for the PulseAudio daemon. See pulse-daemon.conf(5) for
## more information. Default values are commented out. Use either ; or # for
## commenting.

; daemonize = no
; fail = yes
; allow-module-loading = yes
; allow-exit = yes
; use-pid-file = yes
; system-instance = no
; local-server-type = user
; enable-shm = yes
; enable-memfd = yes
; shm-size-bytes = 0 # setting this 0 will use the system-default, usually 64 MiB
; lock-memory = no
; cpu-limit = no

; high-priority = yes
; nice-level = -11

; realtime-scheduling = yes
; realtime-priority = 5

; exit-idle-time = 20
; scache-idle-time = 20

; dl-search-path = (depends on architecture)

; load-default-script-file = yes
; default-script-file = /etc/pulse/default.pa

; log-target = auto
; log-level = notice
; log-meta = no
; log-time = no
; log-backtrace = 0

; resample-method = speex-float-1
; enable-remixing = yes
; enable-lfe-remixing = no
; lfe-crossover-freq = 0

; flat-volumes = yes

; rlimit-fsize = -1
; rlimit-data = -1
; rlimit-stack = -1
; rlimit-core = -1
; rlimit-as = -1
; rlimit-rss = -1
; rlimit-nproc = -1
; rlimit-nofile = 256
; rlimit-memlock = -1
; rlimit-locks = -1
; rlimit-sigpending = -1
; rlimit-msgqueue = -1
; rlimit-nice = 31
; rlimit-rtprio = 9
; rlimit-rttime = 200000

; default-sample-format = s16le
; default-sample-rate = 96000
; alternate-sample-rate = 48000
; default-sample-channels = 4
; default-channel-map = front-left,front-right

; default-fragments = 4
; default-fragment-size-msec = 25

; enable-deferred-volume = yes
; deferred-volume-safety-margin-usec = 8000
; deferred-volume-extra-delay-usec = 0

# MycroftOS Audio Settings
resample-method = ffmpeg
default-sample-format = s24le
default-sample-rate = 48000
alternate-sample-rate = 44100
default-sample-channels = 4

system.pa (I am running PulseAudio system-wide)

#!/usr/bin/pulseaudio -nF
#
# This file is part of PulseAudio.
#
# PulseAudio is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# PulseAudio is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public License
# along with PulseAudio; if not, see <http://www.gnu.org/licenses/>.

# This startup script is used only if PulseAudio is started per-user
# (i.e. not in system mode)

.fail

### Automatically restore the volume of streams and devices
load-module module-device-restore
load-module module-stream-restore
load-module module-card-restore

### Automatically augment property information from .desktop files
### stored in /usr/share/application
load-module module-augment-properties

### Should be after module-*-restore but before module-*-detect
load-module module-switch-on-port-available

### Load audio drivers statically
### (it's probably better to not load these drivers manually, but instead
### use module-udev-detect -- see below -- for doing this automatically)
#load-module module-alsa-sink device="hw:1,0" channels=8 rate=48000 format=s32le
#load-module module-alsa-source device="hw:1,0" channels=8 rate=48000 format=s32le
#load-module module-oss device="/dev/dsp" sink_name=output source_name=input
#load-module module-oss-mmap device="/dev/dsp" sink_name=output source_name=input
#load-module module-null-sink
#load-module module-pipe-sink

### Automatically load driver modules depending on the hardware available
.ifexists module-udev-detect.so
load-module module-udev-detect
#channels=8 rate=48000 format=s32le
.else
### Use the static hardware detection module (for systems that lack udev support)
load-module module-detect
.endif

### Automatically connect sink and source if JACK server is present
.ifexists module-jackdbus-detect.so
.nofail
load-module module-jackdbus-detect channels=2
.fail
.endif

### Automatically load driver modules for Bluetooth hardware
.ifexists module-bluetooth-policy.so
load-module module-bluetooth-policy
.endif

.ifexists module-bluetooth-discover.so
load-module module-bluetooth-discover
.endif

### Load several protocols
.ifexists module-esound-protocol-unix.so
load-module module-esound-protocol-unix
.endif
load-module module-native-protocol-unix auth-anonymous=1

### Network access (may be configured with paprefs, so leave this commented
### here if you plan to use paprefs)
#load-module module-esound-protocol-tcp
load-module module-native-protocol-tcp auth-ip-acl=127.0.0.1;192.168.0.0/16;172.16.0.0/12;10.0.0.0/8 auth-anonymous=1
load-module module-zeroconf-publish

### Load the RTP receiver module (also configured via paprefs, see above)
#load-module module-rtp-recv

### Load the RTP sender module (also configured via paprefs, see above)
#load-module module-null-sink sink_name=rtp format=s16be channels=2 rate=44100 sink_properties="device.description='RTP Multicast Sink'"
#load-module module-rtp-send source=rtp.monitor

### Load additional modules from GConf settings. This can be configured with the paprefs tool.
### Please keep in mind that the modules configured by paprefs might conflict with manually
### loaded modules.
.ifexists module-gconf.so
.nofail
load-module module-gconf
.fail
.endif

### Automatically restore the default sink/source when changed by the user
### during runtime
### NOTE: This should be loaded as early as possible so that subsequent modules
### that look up the default sink/source get the right value
load-module module-default-device-restore

### Automatically move streams to the default sink if the sink they are
### connected to dies, similar for sources
load-module module-rescue-streams

### Make sure we always have a sink around, even if it is a null sink.
load-module module-always-sink

### Honour intended role device property
load-module module-intended-roles

### Automatically suspend sinks/sources that become idle for too long
load-module module-suspend-on-idle

### If autoexit on idle is enabled we want to make sure we only quit
### when no local session needs us anymore.
.ifexists module-console-kit.so
load-module module-console-kit
.endif
.ifexists module-systemd-login.so
load-module module-systemd-login
.endif

### Enable positioned event sounds
load-module module-position-event-sounds

### Cork music/video streams when a phone stream is active
load-module module-role-cork

### Modules to allow autoloading of filters (such as echo cancellation)
### on demand. module-filter-heuristics tries to determine what filters
### make sense, and module-filter-apply does the heavy-lifting of
### loading modules and rerouting streams.
load-module module-filter-heuristics
load-module module-filter-apply

### Make some devices default
#set-default-sink output
#set-default-source input
#set-default-source alsa_input.platform-soc_sound.seeed-source
#set-default-sink alsa_output.platform-soc_sound.seeed-sink

### MycroftOS Audio Settings
unload-module module-suspend-on-idle
unload-module module-role-cork
load-module module-role-ducking

### Enable Echo/Noise-Cancellation
load-module module-echo-cancel aec_method=webrtc source_name=echoCancel_source sink_name=echoCancel_sink
set-default-source echoCancel_source
set-default-sink echoCancel_sink

/etc/mycroft/mycroft.conf

{
  "play_wav_cmdline": "paplay %1",
  "play_mp3_cmdline": "mpg123 %1",
  "ipc_path": "/ramdisk/mycroft/ipc/",
  "enclosure": {
    "platform": "MycroftOS",
    "platform_build": 1
  },
  "listener": {
    "mute_during_output": false
  },
  "tts": {
    "module": "mimic2",
    "mimic2": {
      "lang": "en-us",
      "url": "https://mimic-api.mycroft.ai/synthesize?text=",
      "preloaded_cache": "/opt/mycroft/preloaded_cache/Mimic2"
    },
    "pulse_duck": true
  },
  "skills": {
    "priority_skills": ["mycroft-pairing", "mycroft-volume"]
  },
  "log_level": "INFO"
}

/home/mycroft/.mycroft.conf

{
  "max_allowed_core_version": 19.8
}

I forgot to mention that, for now, I have patched the volume skill to add my “MycroftOS” enclosure tag to ALSA_PLATFORMS, as I have not yet started coding that part. This should not matter, though: removing the enclosure tag from my config makes it an unknown tag, which behaves the same.

OK, that I can reproduce.

It’s the “stream X” command in the TuneIn radio skill that doesn’t emit a mycroft.stop message the way a normal Common Play skill would (e.g. “play jack radio”).
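To illustrate why a missing mycroft.stop leads to two streams playing at once, here is a toy sketch. It does not use Mycroft's real message bus: ToyBus, ToyPlayer and start_playback are hypothetical names invented for this example, and only the "mycroft.stop" message name comes from the thread.

```python
# Toy illustration only: a tiny in-process "bus", not Mycroft's real message bus.
class ToyBus:
    def __init__(self):
        self.handlers = {}

    def on(self, msg_type, handler):
        self.handlers.setdefault(msg_type, []).append(handler)

    def emit(self, msg_type):
        for handler in self.handlers.get(msg_type, []):
            handler()


class ToyPlayer:
    """Stands in for a skill that plays audio and stops on mycroft.stop."""

    def __init__(self, name, bus):
        self.name = name
        self.playing = False
        bus.on("mycroft.stop", self.stop)

    def play(self):
        self.playing = True

    def stop(self):
        self.playing = False


def start_playback(bus, player, announce_stop):
    # A well-behaved handler asks everything else to stop before it starts.
    if announce_stop:
        bus.emit("mycroft.stop")
    player.play()


bus = ToyBus()
youtube = ToyPlayer("youtube", bus)
radio = ToyPlayer("radio", bus)

youtube.play()
start_playback(bus, radio, announce_stop=False)   # the "stream X" behaviour
both_playing = youtube.playing and radio.playing  # two songs at once

bus.emit("mycroft.stop")                          # the stop command reaches both
youtube.play()
start_playback(bus, radio, announce_stop=True)    # the "play X" behaviour
only_radio = radio.playing and not youtube.playing
```

In the first case nothing tells the YouTube player to stop, so both flags end up set; in the second case the stop message goes out before the radio starts.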

Ok great! That’s why I said, not sure who to blame.

Then I will move that feedback over to that skill’s thread. :wink:

Thanks, and sorry for bothering you 😊

No worries, glad we could get to the bottom of this :slight_smile:

I do have some todos on the audioservice (especially VLC backend) so it’s good I got a kick in the right direction to get me going.

Thanks for the feedback, and for looking into the clashing playback behaviour.

I would quite like to add to the functionality so that - say - it’s easy to choose between the top few results rather than just always playing the top match.

Patches / PRs welcome!

Hey, thx for this skill.

Just want to ask you or others whether you also get a fullscreen video (over the CLI) when you say:
“play die drei fragezeichen folge 122 original”.

I haven’t actually checked the HDMI output, but it looks like you have VLC with framebuffer support installed and it plays the actual video stream.

Do you get that with only that stream, or with all streams?

(I will do some checking for you a bit later too, to see if I have a framebuffer player as well.)

It is this config:
- Raspberry Pi 4 with 4 GB
- ReSpeaker Mic Array v2.0 (https://www.seeedstudio.com/ReSpeaker-Mic-Array-v2-0.html)
- Picroft Unstable 2019-11-01 Buster image
- Mycroft-Core 19.8.3

I just installed the skill and let it install all the necessary requirements; VLC was one of them.
Is there a reason why the skill uses VLC instead of another player? I think NewPipe (the app) uses another technique, doesn’t it?
https://newpipe.schabi.org/ or https://github.com/TeamNewPipe/NewPipe/

BR, suisat

Is it possible to have this skill play the audio on some other audio rendering device, i.e. push the URL to a different audio device (“Play Clapton Crossroads upstairs”)? End devices might be an MPD client, Volumio, a VLC client on a remote computer, a Chromecast device, or others.

@suisat I think VLC is used because of the HTTPS connection and possibly the Opus codec used by YouTube for most “best” audio streams.

Or at least, when I configured my system for it, I needed to tweak my VLC install to support OpenSSL for the HTTPS connection and add the Opus codec for it to work. I can imagine that the default mpg123 might be a bit too minimal for all of that.

But… Just guessing here.

@pcwii I was wondering the exact same thing. Chromecast is integrated into Mycroft to some extent, as I see it mentioned in audio.log. I wonder if you could say:

“play blabla from youtube on chromecast-name”

Tried it, of course, but nope…


@j1nx, I did some playing around in my spare time and managed to cast some YouTube content to my Chromecast. It looks like the discovery part of pychromecast, pychromecast.get_chromecasts(), does not work on systems with more than one network interface, as it seems to use the wrong interface to find the devices. I’m not sure whether this is a Mycroft issue or not.

I did have success casting a search with this code:
https://github.com/pcwii/skill-chromecast/blob/master/tests.py

Yes, you’re right. I think sometimes one of the audio streams might be something that mpg123 could play codec-wise, but from my initial experiments it seemed like vlc was consistently able to cope with the stream URLs pafy / youtube-dl return.

I’d be happy to include an option to use something other than the “best” audio stream, and/or use mpg123 if an appropriate stream is returned… but it looked to me like this might sometimes mean none of the streams returned would be playable for a given search result.

@mcdruid I discovered that if you ask to play something on YouTube and there are no audio-only streams available, it actually plays a video stream.

This should not be a problem, but it is probably not intentional, so perhaps you need to catch the best-stream == 0 case and report back to the user.

(My VLC is configured without screen support, so playing one of those streams gives A LOT of errors on the mycroft-cli-client screen.)
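A minimal sketch of the kind of check suggested above, assuming the skill gets a pafy-style video object whose getbestaudio() returns None when no audio-only stream exists (the thread mentions pafy and a null bestaudio, but I have not checked the skill's actual code). NoAudioOnlyStream, pick_audio_url and describe are hypothetical names, and the Fake* classes only stand in for pafy so the example runs without network access.

```python
class NoAudioOnlyStream(Exception):
    """Raised when a search result has no audio-only stream."""


def pick_audio_url(video):
    # `video` is expected to behave like a pafy video object; getbestaudio()
    # is assumed to return None when there is no audio-only stream.
    best = video.getbestaudio()
    if best is None:
        raise NoAudioOnlyStream("no audio-only stream available")
    return best.url


# Stand-in objects so the sketch runs without pafy or network access.
class FakeStream:
    url = "https://example.invalid/audio"


class FakeVideo:
    def __init__(self, stream):
        self._stream = stream

    def getbestaudio(self):
        return self._stream


def describe(video):
    # Report back to the user instead of silently falling back to video.
    try:
        return "playing " + pick_audio_url(video)
    except NoAudioOnlyStream:
        return "sorry, no audio-only stream found"
```

In a real skill the fallback branch would presumably speak a dialog to the user rather than return a string.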

Interesting, thanks @j1nx

I mostly use picroft with no video/screen at all and haven’t noticed this.

Do you have any examples of streams / searches which come back with no audio-only streams?

I did file an issue on GitLab the other day when I stumbled across a result which seemed to return something like null for bestaudio; perhaps that’s relevant.

I haven’t got round to looking at the code yet, but a working test case would be good, as I can’t reproduce the issue myself now.

I encountered it once and thought I would save the song. However, YouTube probably logs those things and puts the encoding in a queue of some kind, because when I double-checked the song the next day it played without video.

So it is hard to debug just by asking for songs. I believe the best way is to ask for old and/or not-so-popular songs, perhaps in your native language if that is not English.

Hi, first of all: thanks for sharing this great stuff!
In the default settings everything is working fine.

But is there a way to change the language? At the moment I am using mycroft with german language support, following this:

https://community.openconversational.ai/t/how-to-change-mycroft-language-to-german/6845

When I invoke a stream by typing “spiele Hendrix” (= “play Hendrix”), Mycroft says “eine sekunde bitte” (= “a second please”), but then an error is thrown, saying “Bei der Bearbeitung ist ein Fehler im Skill Youtube Skill aufgetreten” (roughly “there was an error in the skill Youtube Skill”). The skill’s log says:

2019-12-17 09:52:54.439 | INFO | 8167 | mycroft.skills.skill_loader:load:112 | ATTEMPTING TO LOAD SKILL: mycroft-youtube-audio
2019-12-17 09:52:54.675 | INFO | 8167 | mycroft.skills.settings:get_local_settings:78 | /opt/mycroft/skills/mycroft-youtube-audio/settings.json
2019-12-17 09:52:54.710 | INFO | 8167 | mycroft.skills.skill_loader:_communicate_load_status:270 | Skill mycroft-youtube-audio loaded successfully
2019-12-17 09:54:19.921 | INFO | 8167 | mycroft.skills.settings:save_settings:109 | Skill settings successfully saved to /opt/mycroft/skills/mycroft-youtube-audio/settings.json
2019-12-17 09:54:24.072 | INFO | 8167 | Playback Control Skill | Playing with: mycroft-youtube-audio
  File "/opt/mycroft/skills/mycroft-youtube-audio/__init__.py", line 67, in CPS_start
    self.search_youtube(data)
  File "/opt/mycroft/skills/mycroft-youtube-audio/__init__.py", line 91, in search_youtube

There was also a regex error, so I added a de-de folder in locale, and that message is gone.

As I am totally new to Mycroft, my understanding of how things work is very limited. It would be great if someone could point me to where this error message comes from. Is it from the skill or from Mycroft itself? Does anyone know where I could try some translations?

Hi there @t_ho, and welcome!

Skills need to be translated into each language before they can be used. You can see the skills translated into German here: https://translate.mycroft.ai/de/mycroft-skills/

As you can see, this skill is not even on the list because it is still so new. So the best option would be to wait for a proper translation, but you can also contribute one in your own language directly to the developer by translating the vocabulary files yourself, like these: https://gitlab.com/mcdruid/mycroft-youtube-audio/tree/master/locale

Thank you, malevolent, for your reply.
I set up a “de-de” locale folder locally and would be glad to share it. But as I don’t yet know how translating a skill actually works, I will keep checking and trying to understand it for a while first… If someone could tell me which files need translating, I could do that…
Cheers

Well, translation is in principle quite straightforward.

You need to create all the files inside the dialog and vocab folders, or the locale folder, just as “en-us” has them. That’s the easy part. Beyond that, depending on the skill, the author may have used more or fewer regular expressions. Let’s take an example:

You will create a de-de directory in the locale folder, and inside it you need to create the file on_youtube.regex, whose content is
\s*(on|with|using) (youtube|you tube)\s*

I’m not a regex expert, but here you should just translate (on|with|using) into German and you will be OK. If you know nothing about regex: the character | (vertical bar, pipe, or “or”) means it should match one of these terms, so you can use any of “on”, “with” or “using” together with “youtube” or “you tube”, but both elements are needed (the two parenthesized groups).
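To see what that pattern does, here is a small Python check. The English pattern is the one quoted above; the German pattern is only a hypothetical translation of the first group, and strip_source is a helper invented for this example (I don't know exactly how the skill applies the regex internally).

```python
import re

ENGLISH = r"\s*(on|with|using) (youtube|you tube)\s*"
# Hypothetical German variant: only the first group is translated.
GERMAN = r"\s*(auf|mit|bei) (youtube|you tube)\s*"


def strip_source(phrase, pattern):
    """Return the phrase without the 'on youtube' part, or None if absent."""
    match = re.search(pattern, phrase, re.IGNORECASE)
    if match is None:
        return None
    return (phrase[:match.start()] + " " + phrase[match.end():]).strip()


english_result = strip_source("play hendrix on youtube", ENGLISH)
german_result = strip_source("spiele hendrix auf youtube", GERMAN)
untranslated = strip_source("spiele hendrix auf youtube", ENGLISH)
```

The last call shows why the translation matters: the English pattern finds nothing in the German phrase.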

My advice: copy all the English files into the German directories and translate just the English words into your language, leaving all the symbols untouched.

Here are two useful links that can help you understand regex a little better.
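The copy step could be scripted roughly like this. It is only a sketch: seed_locale is a hypothetical helper, the locale/en-us layout is assumed from the skill's repository link above, and the demonstration runs in a throwaway directory rather than a real skill folder.

```python
import shutil
import tempfile
from pathlib import Path


def seed_locale(skill_dir, source_lang="en-us", target_lang="de-de"):
    """Copy locale/<source_lang> to locale/<target_lang> if it does not exist yet."""
    source = Path(skill_dir) / "locale" / source_lang
    target = Path(skill_dir) / "locale" / target_lang
    if target.exists():
        return target  # do not overwrite existing translations
    shutil.copytree(source, target)
    return target


# Demonstration in a throwaway directory so no real skill is touched.
demo_skill = Path(tempfile.mkdtemp())
(demo_skill / "locale" / "en-us").mkdir(parents=True)
(demo_skill / "locale" / "en-us" / "on_youtube.regex").write_text(
    r"\s*(on|with|using) (youtube|you tube)\s*" + "\n"
)
german_dir = seed_locale(demo_skill)
copied = sorted(p.name for p in german_dir.iterdir())
```

After copying, each file in the de-de folder still contains the English words and needs to be translated by hand.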
https://pythex.org/