Some problems with most recent update


#1

I turned on my mycroft Mark 1 for the first time in a while and all seemed well. I have a custom skill for home automation that I tested and it worked fine. After a few minutes, mycroft went into update mode, and since it finished updating I’ve had two problems:

  1. When I ssh into the mycroft, things seem normal, but some operations, like listing the files in /var/log, lock up my terminal session. I can tail -f a file, for instance /var/log/mycroft-skills.log, but if I try to read the whole thing by opening it with ‘less /var/log/mycroft-skills.log’ my terminal locks up. If I try to scp the file from another box, the transfer stalls, e.g.:

me@otherbox # scp pi@mycroft:/var/log/mycroft-skills.log .
mycroft-skills.log 0% 0 0.0KB/s - stalled -

and I never get the file. Note - the mycroft doesn’t lock up, just my terminal session. I can ssh in again to mycroft, but if I try to read these files or directories, that session also locks up. I thought maybe there were some bad sectors on the SD card, so I opened up the mycroft, took out the SD card and mounted it on another system. Both the boot and the main file system test out fine with fsck -f, and I’m able to easily read the files that were giving me trouble.

Anyway, when I was able to actually read mycroft-skills.log on the other system, I noticed the second problem:

  2. a Python module that I was using for a local skill was gone - seems like it got wiped out by the update.

So I put the mycroft back together and ssh’ed in and tried to ‘pip install [that module]’ but then noticed pip wasn’t there. So I installed pip with apt-get, then was able to install the module, so that was good.

But my skill is still not working, so there must be some secondary issue that comes up after that missing module. But because of problem #1, I can’t check the mycroft-skills.log without actually taking the unit apart and pulling the SD card - that’s a really painful way to debug of course.

If I can solve problem #1, I am sure I will be able to get through other issues. But I can’t imagine why I would be having that problem, given that the file system integrity appears to be fine. Anyone else notice this issue?

edit - the terminal lockup is spotty. I was just able to list the /var/log files and open mycroft-skills.log in less, and thought all was great, but then I tried to go to the bottom of the mycroft-skills.log file and it locked up. I experience this lockup both when shelling in from a Mac and from another raspberry pi; both of them worked very well with no lockups before. It’s true that I hadn’t turned on the mycroft for a couple of weeks, and that although I did test my custom skill today prior to the update, I did not try to ssh in today until after the update.

edit 2 - I opened the skills log and gingerly scrolled down until I saw some Git errors for my local skills - looks like they fail to load if there’s no git project for them?? Is that really true??


#2

Hi there @jqh1234 thanks for reporting these issues, and apologies that you’re experiencing them. Firstly, excellent troubleshooting and great detail. Thank you.

ssh session unresponsive

The first thing I was going to suggest here until you mentioned it was checking the integrity of the Micro SD card; thank you for ruling that out. The other possibilities here might be:

  • network settings or a firewall - are both the Mark 1 and your localhost on the same network / subnet? Is there a firewall that could be interfering with port 22 at all?

  • how stable is your wireless SSID? Could the wireless session be dropping and re-activating?

  • do you frequently get messages like “I can't connect to the internet” from the Mark 1? How far away from the AP is it, i.e. is it dropping in and out of range?
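For the firewall question, a minimal sketch of a reachability probe, assuming the Mark 1 answers on the default ssh port 22. The function name `port_is_open` is made up for illustration, and the hostname in the usage comment is the one that appears in the scp example earlier in the thread.

```python
import socket


def port_is_open(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Example usage (hostname is an assumption; substitute your own):
# print(port_is_open("mycroft", 22))
```

A successful connect only rules out a blanket block on port 22. It would not by itself explain a session that connects fine and then stalls partway through a large read or transfer.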

Python module

In the 18.2.6b release a few weeks ago, we made the big jump from Python 2.7 to Python 3.4+. Read more about the Python jump and the key changes you’ll see here. We also implemented Skill Branching as part of this release, but if it’s a private Skill then Skill Branching won’t impact you at all.

So, it’s likely that the Python module in question has changed between Python 2.7 and Python 3.4+.
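One way to check this is to ask the Python 3 interpreter itself which modules it can find. This is a minimal sketch, assuming you run it with the same interpreter that runs your Skill; the helper name `missing_modules` and the module names in the usage comment are placeholders, not anything from Mycroft.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names this interpreter cannot locate."""
    # find_spec returns None when a top-level module is not importable.
    return [name for name in names if importlib.util.find_spec(name) is None]


# Example usage (module names are placeholders for whatever your Skill imports):
# print(sys.version_info[:2], missing_modules(["json", "requests"]))
```

Packages installed for Python 2.7 are not visible to a Python 3 interpreter, so a module that shows up as missing here would need to be reinstalled with the pip that belongs to that interpreter.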

git errors

Are you able to post the errors so I can take a closer look?

Best, Kathy


#3

thanks for parsing through that wall of text - I was about to edit it for brevity.

first - the git related error – for my local skill(s) (actually two of them - I forgot about one), I get:

13:24:18.782 - msm.mycroft_skills_manager - ERROR - Error running install_or_update on skill-switch: GitException(Git command failed: GitCommandError(['git', 'rev-parse', 'HEAD'], 128, b'fatal: Not a git repository (or any of the parent directories): .git', b''))
13:24:18.809 - msm.mycroft_skills_manager - ERROR - Error running install_or_update on skill-feed-reader: GitException(Git command failed: GitCommandError(['git', 'rev-parse', 'HEAD'], 128, b'fatal: Not a git repository (or any of the parent directories): .git', b''))

and for a stock skill that I had modified locally to make some small changes (I think to add bbc), I get:

13:24:18.930 - msm.mycroft_skills_manager - ERROR - Error running install_or_update on mycroft-npr-news: SkillModified(Uncommitted changes:
M __init__.py
M dialog/en-us/npr.news.dialog
M vocab/en-us/NPRNewsKeyword.voc

So it looks like the msm.mycroft_skills_manager is trying to reconcile the local code with git repositories, and errors out when they don’t match - doesn’t it?
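To reproduce the two failure modes by hand, here is a minimal sketch that runs the same `git rev-parse HEAD` command that appears in the log, plus `git status --porcelain` to list uncommitted changes. This is my guess at the kind of check involved, not msm's actual code; the function names are made up, and it assumes git is on the PATH.

```python
import subprocess


def is_git_repo(path):
    """True if `git rev-parse HEAD` succeeds when run inside path."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=path,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def modified_files(porcelain_output):
    """Extract paths from `git status --porcelain` text, skipping untracked files."""
    files = []
    for line in porcelain_output.splitlines():
        # Porcelain v1 lines are two status characters, a space, then the path.
        if len(line) > 3 and not line.startswith("??"):
            files.append(line[3:])
    return files


def uncommitted_changes(path):
    """List tracked files with uncommitted changes in the repository at path."""
    out = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=path,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    ).stdout
    return modified_files(out)
```

Running `is_git_repo` on the directory of a purely local skill should return False if the log is accurate, and `uncommitted_changes` on a locally edited stock skill should list the files marked M.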

I doubt the terminal lockup is a wifi issue - I’m using an Ubiquiti mesh network with a bunch of nodes, and I had never had the problem before, despite repeatedly ssh-ing in. But when I get a chance, I’ll switch to ethernet and see if that helps.

The python version switch sounds like it explains the module issue - that turned out probably not to be a real problem anyway after I re-installed - I probably shouldn’t have mentioned it.


#4

All good, we’re here to help.

What I think is happening here is that on boot, Mark 1 is attempting to update the Skills that are installed. I don’t know what that b repository is supposed to be - does it use some sort of Unicode character or a character with an umlaut or similar?

The second error is pretty much what it says on the box - msm is trying to update the Skills, but modifications have been made to the Skill it’s trying to update. Can you git commit the changes?


#5

There has never been a repository for the skills in the first two errors. The log says exactly what you see, and I don’t think there is a unicode issue.

I am not the repository owner for the mycroft-npr-news skill, so I doubt I could commit. My local changes are pretty sloppy as well - I doubt anyone would want me to commit :)


#6

OK, thanks for letting me know, I’ll follow up internally to see what the issue could be here.


#7

more on the terminal locking up problem. I plugged in an ethernet cable (such that mycroft is now using both wifi and ethernet btw) and ssh’ed in over that wired interface.

It seemed like I could do more, but the lockup still happened – it happens when I try to read a bunch of data in the terminal session, like scrolling through a big log file (and maybe I could scroll further with ethernet than with wifi before the lockup, but that could just be my impression).

I was able to look in the syslog file over ethernet before lockup, and saw a ton of log entries like these:

Jun 8 13:54:36 mycroft kernel: [ 69.573932] [UFW BLOCK] IN=eth0 OUT= MAC=[long colon delimited string that includes the MAC address of the computer I’m using to ssh in] SRC=[IPv4 address of the computer I’m using to ssh in] DST=224.0.0.1 LEN=44 TOS=0x00 PREC=0x00 TTL=1 ID=305 PROTO=UDP SPT=63847 DPT=8612 LEN=24

Jun 8 13:54:43 mycroft kernel: [ 76.650381] [UFW BLOCK] IN=eth0 OUT= MAC=[long colon delimited string that includes the MAC address of the computer I’m using to ssh in] SRC=[IPv4 address of the computer I’m using to ssh in] DST=224.0.0.1 LEN=44 TOS=0x00 PREC=0x00 TTL=1 ID=755 PROTO=UDP SPT=55363 DPT=8612 LEN=24

I can’t tell you this wasn’t happening before, because I’m not sure I ever looked closely at the syslog file, but it is definitely something I am not used to seeing anywhere else.
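To see what is actually being blocked, here is a minimal sketch that tallies [UFW BLOCK] entries by destination address, protocol and destination port. The function name is made up, and the field names are simply the KEY=value pairs visible in the entries above. Note that both entries above are UDP packets addressed to the multicast address 224.0.0.1 on port 8612, not the TCP port 22 traffic that ssh uses by default, so a tally like this would show whether any blocked entries involve ssh at all.

```python
import re
from collections import Counter

# Matches the KEY=value pairs of interest in a kernel firewall log line.
FIELD = re.compile(r"\b(SRC|DST|PROTO|SPT|DPT)=(\S+)")


def summarize_ufw_blocks(lines):
    """Count [UFW BLOCK] entries per (destination, protocol, destination port)."""
    counts = Counter()
    for line in lines:
        if "[UFW BLOCK]" not in line:
            continue
        fields = dict(FIELD.findall(line))
        key = (fields.get("DST"), fields.get("PROTO"), fields.get("DPT"))
        counts[key] += 1
    return counts


# Example usage (log path is an assumption; reading it may require sudo):
# with open("/var/log/syslog", errors="replace") as fh:
#     for key, n in summarize_ufw_blocks(fh).most_common():
#         print(n, key)
```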

Pure speculation, but it almost looks like there’s some defensive mechanism in the new mycroft that (incorrectly) identifies me as a threat when I’m ssh’ing in and shuts down the communication.

oh - and one big difference between the wifi-only lockup and the new ethernet lockup is that previously, I could ssh back into mycroft after the lockup and do some more stuff (prior to the next lockup) – the mycroft didn’t really seem to be affected by the lockup.

But with ethernet, the lockup appears to take mycroft off the network - he complains that he can’t access the mycroft service, and neither the wifi IP address nor the ethernet IP address is pingable after an ssh-over-ethernet-session lockup, and I have to reboot mycroft.

edit: – I will keep researching the local situation - this does look weird. But if anyone knows of a change in the most recent update that could explain the issue, it would be really helpful to know