How to set up a private knowledge base?


#1

Hi all, it's been a while since I've been here. I haven't had time to play with Mycroft due to work. My question: is there a way to download something like the Wikipedia data set and set up Mycroft to search it as an information source? Also, what services and knowledge bases does the Mycroft company run on its back end that Mycroft depends on in order to run? Put another way, if my internet dies, what would I need to install on a home server for it to run without the internet?
Thanks :slight_smile:


#2

Hello!

Currently there are a few components that need to be switched over for 100% offline operation.

  1. Account management through home.mycroft.ai.
    Our back end does some configuration management for devices, and will be doing much more in the future. It can be disabled.
    The second thing the back end does is proxy Wolfram Alpha and Open Weather Map for those base skills.
    The third piece is a proxy to Google for Speech to Text. We are planning to replace this part, which leads us to number two:

  2. Speech to Text:
    Recordings are sent to our servers, which forward them to the Google STT API to be processed. Then we send the text back to your Mycroft instance. At that point the Adapt intent parser analyzes the text and triggers skills, all locally. This can be changed as well. The first PR for local STT can be found here:
    https://github.com/MycroftAI/mycroft-core/pull/440

Other than those few bits that reach out, you can run 100% locally with a few tweaks. This obviously excludes any other skills that need to reach out to an external API.

Arron


#3

Oh OK, good to know… Now, is there a way to download the whole Wikipedia dataset (40+ gigs) to the local server and set it up as a local knowledge base? Where would I go inside the Mycroft file structure to add that? I would want it to fall back to this only if the internet went down.


#4

Well, I’m sure you can do something like that. The process would theoretically go like this:

  1. Download the data
  2. Write (or find) some kind of web API you can run locally. Then you can query it with a skill, or modify the base Wikipedia skill to hit your local server instead.

You might check out the mycroft-skills repository to see some other skills and how they work.
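
As a rough illustration of step 2, here is a minimal sketch of how a skill might query a local server. It assumes the local server is a MediaWiki install exposing the standard `api.php` endpoint with the TextExtracts extension enabled; the address `http://localhost:8080/w/api.php` and the function names are made up for the example, and none of this is taken from the actual Wikipedia skill.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical address of a locally hosted MediaWiki instance (assumption).
LOCAL_API_URL = "http://localhost:8080/w/api.php"


def build_summary_url(title, api_url=LOCAL_API_URL):
    """Build a MediaWiki query URL asking for the plain-text intro of a page."""
    params = {
        "action": "query",
        "prop": "extracts",  # provided by the TextExtracts extension
        "exintro": 1,        # only the section before the first heading
        "explaintext": 1,    # plain text instead of HTML
        "redirects": 1,      # follow redirects to the target page
        "format": "json",
        "titles": title,
    }
    return api_url + "?" + urlencode(params)


def extract_summary(response):
    """Return the first non-empty extract in a decoded API response, or None."""
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        text = page.get("extract")
        if text:
            return text
    return None


def fetch_summary(title, api_url=LOCAL_API_URL, timeout=5):
    """Query the server and return the page intro, or None if nothing was found."""
    with urlopen(build_summary_url(title, api_url), timeout=timeout) as resp:
        return extract_summary(json.loads(resp.read().decode("utf-8")))
```

A skill handler could then call something like `fetch_summary("Alan Turing")` and speak the result, provided the local server is up and has the page imported.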


Arron


#5

The most straightforward approach to me would be to build your own Wikipedia using its software (including the wiki API, https://en.wikipedia.org/w/api.php), download the Python wrapper (https://github.com/goldsmith/Wikipedia/), then modify the ‘API_URL’ variable in https://github.com/goldsmith/Wikipedia/blob/master/wikipedia/wikipedia.py to point to your new server.

Fork the wiki skill and change ‘import wikipedia as wiki’ to ‘import mypedia as wiki’ (or whatever you want to name your modified copy).
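
For the "only if the internet went down" part of the question, one option is to pick the API URL at runtime instead of hard-coding it. The following is only a sketch under assumptions: the local address is hypothetical, the helper names are invented for illustration, and the reachability check is a plain TCP connection attempt, which is a crude stand-in for "the internet is up".

```python
import socket

PUBLIC_API_URL = "https://en.wikipedia.org/w/api.php"
# Hypothetical address of the self-hosted wiki (assumption).
LOCAL_API_URL = "http://localhost:8080/w/api.php"


def internet_available(host="en.wikipedia.org", port=443, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def choose_api_url(online_check=internet_available):
    """Prefer the public API when reachable, otherwise fall back to the local one."""
    return PUBLIC_API_URL if online_check() else LOCAL_API_URL
```

The forked skill could call `choose_api_url()` before each lookup and hand the result to the modified wrapper as its ‘API_URL’. Passing the check in as a parameter keeps the selection logic easy to test without a network.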

I would be really interested to hear if you do get this going!