Teaching Mycroft To Do Stuff

Is there any plan to implement some kind of learning functionality in Mycroft? If so, will there be a way of taking advantage of what other users teach their Mycroft (i.e. so we don’t have to hard-code everything)? I see this as the only way of competing with more established companies that have no qualms about mining their users’ data to privacy-violating levels.

This should be opt-in - the Mycroft should ask if it’s okay to share it - but at least this way provides a filter (that simply mining the data could not manage)

“Robobrain*, hoover the house”

“I’m sorry Darren, what do you mean by ‘hoover’?”

“Vacuum clean the house with my Roomba**”

“Okay, vacuum cleaning the house. Do you want me to share this entity*** with my central server so other Mycroft units will understand what ‘hoover’ means?”

“Yes please”

Obviously these will only be shared with other Mycroft units if a certain number of users teach their Mycroft the same thing.

* I will be renaming my Mycroft

** Let’s just assume a) that Roomba has an API that allows this, and b) that I have a Roomba

*** As in the Adapt thingamabob

6 Likes

Or, to accelerate the learning process, someone could sift through all the items Mycroft has potentially learned from one person and decide whether each command is logical (this could take a mere 15 seconds per entry). And obviously, if the number of entries gets too large, your method would work well.

1 Like

True, but without the user’s permission, that’s a big invasion of privacy (unless of course the reviewer is the person who said the commands originally).

Teaching your own Mycroft new things like "Charmed is a television show" is a reasonable expectation for v1.

Sharing that information on a per-fact basis would likely negatively impact the user experience, but I could imagine a management portal where power users could review the facts they’ve taught their Mycroft and proactively push them up to a central service for review.

Quality control will be an issue in a centralized fact store, as there might be conflicting facts coming from multiple users. The owner of that centralized service will have to have final authority. There’s also a concept of facts that are only relevant to an application (for example, your home automation app cares nothing about new tv shows), so skills developers will likely be in the loop on some of this.

As with everything, it’s not going to be straightforward :).

2 Likes

Yeah, that’s understandable, but that’s why I was suggesting a threshold - a certain number of users must have made the same connection. So if 5% of all Mycroft users have associated the word “hoover” with “vacuum cleaner” and no one else has associated the word “hoover” with anything else, then it will stand well above the noise.
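A rough sketch of that threshold filter, in Python. Everything here is hypothetical: the function name, the shape of the reports and the 5% default are just taken from the example above, not from any existing Mycroft or Adapt API. An alias is only shared if enough distinct users taught it and nobody taught a competing meaning.

```python
from collections import defaultdict

# Hypothetical default, taken from the 5% figure in the post above.
SHARE_THRESHOLD = 0.05


def aliases_to_share(reports, total_users, threshold=SHARE_THRESHOLD):
    """Pick the aliases that stand above the noise.

    reports: iterable of (user_id, alias, meaning) tuples from opted-in users.
    total_users: size of the whole user base the threshold is measured against.
    """
    # Track the distinct users behind each (alias, meaning) pair, so one user
    # repeating the same lesson is only counted once.
    users_by_pair = defaultdict(set)
    for user_id, alias, meaning in reports:
        users_by_pair[(alias, meaning)].add(user_id)

    # Group the competing meanings under each alias.
    meanings_by_alias = defaultdict(dict)
    for (alias, meaning), users in users_by_pair.items():
        meanings_by_alias[alias][meaning] = len(users)

    shared = {}
    for alias, counts in meanings_by_alias.items():
        if len(counts) != 1:
            continue  # conflicting meanings: leave these for human review
        (meaning, count), = counts.items()
        if count / total_users >= threshold:
            shared[alias] = meaning
    return shared
```

With 30 users, two of whom taught “hoover” and two of whom disagree about “torch”, only “hoover” would be passed on.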

This is what I was referring to before when I mentioned it acting as a filter. While the more established players will be able to mine the data from their army of users, Mycroft will not (initially) have that luxury. What it will have, however, is a more technically-aware group of volunteers. If we can be used to teach Mycroft what it needs to know, it will leap-frog the first stage of the race. Wisdom of crowds and all that.

I’m finding it difficult to express this in words, so I’m going to stream-of-consciousness-style bullet-point it:

  • Amazon, Google, Apple etc (which will henceforth be referred to as Amazoople) have many users, but they are all consumers.
  • Mycroft appeals more to hackers, makers, tinkerers, etc in its initial roll-out, so its user-base will be less averse to having to do some of the work. This is typical of an open-source project, and is the driving force behind free software (but I’m sure I don’t need to tell you that).
  • Either way, the knowledge-graph (is that the correct term?) for all AI’s will need to be populated.
  • Amazoople have lots of money and large computing infrastructure to throw at this problem, but they cannot inconvenience their user-base in any way, and their user-base will be far less forgiving.
  • Mycroft has neither of those things, but will have an army of volunteers willing to help if this is done right.
  • While there are many things a computer can do better than a human-being, teaching another computer what it doesn’t know ain’t one of them. Humans are good at this - we already know the answers!
  • It is true that we do not wish to provide a bad user experience either, so maybe there should be a ‘training mode’ that can be switched on and off, so the end-user isn’t subjected to Mycroft questioning their every command?
  • When activated, this training mode will allow any user - regardless of their programming ability - to contribute to the project by teaching their Mycroft (in the way described above). I think that will go down well with the community.
  • This will allow you to throw brain-power at the problem, while Amazoople throw money at it.
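To make the ‘training mode’ idea a bit more concrete, here is a minimal Python sketch. The class and method names are made up for illustration and nothing here is an actual Mycroft interface: with training mode off, an unknown word is simply not understood, and with it on, Mycroft asks and remembers the answer.

```python
class AliasTrainer:
    """Hypothetical helper: learns word meanings only while training mode is on."""

    def __init__(self, training_mode=False):
        self.training_mode = training_mode
        self.aliases = {}  # word -> meaning the user taught

    def handle(self, word, ask):
        """Return the meaning of `word`, or None if it is unknown.

        `ask` is a callback that asks the user what the word means; it is
        only called when training mode is switched on.
        """
        if word in self.aliases:
            return self.aliases[word]
        if not self.training_mode:
            return None  # stay quiet: no questions for ordinary end-users
        meaning = ask(word)
        if meaning:
            self.aliases[word] = meaning
        return meaning
```

Anything learned while the mode was on keeps working after it is switched off again.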

That could work…

Maybe the “facts” could be bundled together so a user could download a set (via that management portal you mentioned, maybe?). Say, a ‘TV Shows’ bundle to use your example?

Yeah, well, neither is setting up your own flexible and extensible intent definition and determination framework, so I get the impression you quite like a challenge :slight_smile:

3 Likes

(On reflection, I may have overused the “army of” idiom. I’d just like to state for the record that I am not trying to militarise Mycroft in any way.)

4 Likes

That’s a great idea!

2 Likes

Understanding is not just a matter of language, but also geographical location.
A flashlight in the US is a torch in UK.

For a learning procedure to work, it has to be regionally enclosed. If not, too many “false positives” might discourage usage. Regions with similar language structure should be enclosed together, so that when new entities are learned and shared, only the devices located in the same “meaning region” learn them.

Also, when a new entity has very low usage, Mycroft should ask whether “that” is the intended action, until the ratio of yes to no answers (its confidence) is high.

  • Mycroft, do «this»
  • Hmmm, I don’t know how to do «this». Let me see if I can learn it. … OK. I think you want me to do «this». Is that OK?
  • YES!

+1 on that entity’s score, and don’t ask again on that device.
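The confirmation loop above could be sketched like this in Python. The class, its field names and the cut-off values (20 votes, 90% yes) are all assumptions for illustration, not part of Adapt or Mycroft.

```python
from dataclasses import dataclass, field


@dataclass
class SharedEntity:
    """Hypothetical record for an entity learned from other users."""
    phrase: str
    action: str
    yes: int = 0
    no: int = 0
    confirmed_devices: set = field(default_factory=set)

    def confidence(self):
        """Share of 'yes' answers among all answers so far."""
        total = self.yes + self.no
        return self.yes / total if total else 0.0

    def needs_confirmation(self, device_id, min_votes=20, min_confidence=0.9):
        """Ask the user unless this device already said yes or the entity is well established."""
        if device_id in self.confirmed_devices:
            return False
        too_new = (self.yes + self.no) < min_votes
        return too_new or self.confidence() < min_confidence

    def record_answer(self, device_id, accepted):
        """+1 on the score for a yes, and never ask that device again."""
        if accepted:
            self.yes += 1
            self.confirmed_devices.add(device_id)
        else:
            self.no += 1
```

A device that has answered yes once is never asked again, while other devices keep being asked until the entity has collected enough positive answers.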

3 Likes

There is a project from MIT called ConceptNet, which has built a “common sense” database. It teaches an AI things like “A dog is an animal”, “Water is wet”, “Liquid is wet”, “Water is a liquid”, giving the AI a conceptual frame of reference about the natural world. The thing is, though, their flat database export is something like 5 to 8 gigs or bigger, so something like that would probably need to run on the cloud side for the AI. Or, if it was done through separate Adapt BDI scripts, they could be uploaded to a repository where people can choose which ones they want to add to their own system. I would recommend that a basic “best of” selection be included automatically with the base system.

Submissions from the wild will also need to be heavily monitored. As we have seen with Microsoft’s Tay Twitter bot, the system could be susceptible to “AI poisoning”: intentionally training the AI for misbehaviour or unacceptable conditioning. An example would be someone training the AI to have sexually explicit conversations in a manner that would be inappropriate for a younger audience or a family, or training it to pull pranks, like teaching it to dial long-distance calls to China on a person’s cellphone while in silent mode.

2 Likes

Also, if this has not been implemented yet, what would be the best format to store the data in, to make it easier for Mycroft to read?

For example, would I be able to just put all the cities from a state on one line along with their zip codes? Or would I need a separate line for each city within the state?

I like the idea of normal end-users being able to add task aliases easily, but there definitely needs to be a way for privacy-conscious users to prevent sharing those aliases with the global pool. There will most likely be things that I want a short command for which will be specific to me, my family, or my house, and while I would be thrilled to be able to give a short command instead of a long one, I don’t want to share that command or the intended results with the rest of the world/userbase.

So maybe when creating a new command, have it ask if it’s a local network command or one that can be made public, similar to how IFTTT has personal recipes that only become public if you choose to share them, on a per-recipe basis.
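As a sketch of that per-command choice (Python, with made-up names; this is not how Mycroft or IFTTT actually store anything), each command could carry a scope that defaults to local, and only commands explicitly marked public would ever be eligible for upload:

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    LOCAL = "local"    # stays on this device / home network
    PUBLIC = "public"  # the user agreed to share it


@dataclass(frozen=True)
class Command:
    """Hypothetical user-defined command alias."""
    phrase: str
    action: str
    scope: Scope = Scope.LOCAL  # private unless the user says otherwise


def uploadable(commands):
    """Return only the commands the user explicitly chose to make public."""
    return [c for c in commands if c.scope is Scope.PUBLIC]
```

Defaulting to local means a command is never shared just because the user skipped the question.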

2 Likes

I agree with this, it will likely take a bit of work - but I’d love to see it happen.

Mycroft is nothing more than a child.
A child needs to learn as well, word by word. I’d rather Mycroft asked me whether he can share the new thing he learned than have him add it to a database silently and share it whenever he updates his core…

Don’t forget that sharing is part of the Free & Open Source ideology.

1 Like

I’ve already built a skill, https://github.com/gras64/learning-skill.git, that anyone can use to teach Mycroft things by spoken request.

  • I have also provided an upload function, but so far there is no interface for it.
  • You will also be asked for a category, and
  • you can decide with each request whether you want to release it.

1 Like

I don’t think geo-fencing is the best way to do things. The way people talk is more alike than different. To me it is far better to just add weights based on geographic location, age, etc (just like targeted ads do), but with the assumption that we willingly give that data and only for this purpose (unlike targeted ads).
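A small Python sketch of what weighting instead of fencing might look like. The function, the attribute names and the weight values are all invented for illustration: every user’s vote counts, but votes from people who resemble the listener count for more.

```python
def weighted_meaning(votes, listener, weights=None):
    """Pick the most likely meaning of a word for one listener.

    votes: list of dicts with a 'meaning' key plus optional attributes
           (e.g. 'region', 'age_group') that the voter chose to share.
    listener: dict with the same optional attributes for the current user.
    weights: hypothetical bonus added per attribute that matches.
    """
    weights = weights or {"region": 2.0, "age_group": 0.5}
    scores = {}
    for vote in votes:
        score = 1.0  # every vote counts, wherever it came from
        for attr, bonus in weights.items():
            if attr in listener and vote.get(attr) == listener[attr]:
                score += bonus  # similar users count for more
        scores[vote["meaning"]] = scores.get(vote["meaning"], 0.0) + score
    return max(scores, key=scores.get) if scores else None
```

Unlike a hard geo-fence, a listener who shares no data still gets an answer: it simply falls back to the plain majority.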