Mycroft to query web pages?


#1

There is English-language information on a large number of web pages. I am looking for a simple way to develop a Siri-like capability to query that information. Is Mycroft the right tool to use, or are there other, better tools for that?


#2

Welcome to the Community @jrl124c :slight_smile:

I’m wondering if you could be a bit more specific about the use cases you are looking at?

Mycroft already pulls information from broad information sources such as DuckDuckGo and WolframAlpha, so it can answer factual questions just like Siri. Often the answers will even have the same wording, because both assistants are retrieving information from the same sources.

If you wanted to create your own knowledge graph, e.g. to make your own general service like WolframAlpha or to provide deep knowledge in a specific domain, that is not functionality that Mycroft currently provides. Mycroft handles the speech-to-text, natural language processing, fetching and manipulating data, then text-to-speech back to the user.


#3

Thanks for the help. Yes, I want to restrict the service to a specific set of web pages containing information. For example, suppose there was a specific web site that had extensive, detailed information about car repair. The service would be set up to query only the information contained in that web site and nothing else. What software tools would be helpful to do that?


#4

I guess the simple method, if you just want search results, would be to make an API call to a search engine with the site’s domain as a limiting field. This would get you a list of pages that relate to your search, and you could then read out the blurb of the top result. All of this could exist in a Mycroft Skill.
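
As a rough sketch of that idea in Python (the language Mycroft Skills are written in): the helper below builds a query limited to one domain and returns the blurb of the top hit. It assumes the search engine understands the common `site:` operator, and that the search call returns a list of dicts with a `snippet` key. The `search` callable, the `snippet` key, and both function names are placeholders I made up for illustration, not part of any particular search API or of Mycroft.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical shape of a search client: takes a query string and returns
# a list of result dicts. Swap in whatever search API you actually use.
SearchFn = Callable[[str], List[Dict[str, str]]]


def site_restricted_query(question: str, domain: str) -> str:
    """Prefix the question with a site: filter so only one domain is searched."""
    return f"site:{domain} {question.strip()}"


def top_blurb(question: str, domain: str, search: SearchFn) -> Optional[str]:
    """Return the snippet of the first result, or None if nothing was found."""
    results = search(site_restricted_query(question, domain))
    if not results:
        return None
    # "snippet" is an assumed key name; real APIs name this field differently.
    return results[0].get("snippet")
```

Inside a Skill, the intent handler would pass the user's utterance to something like `top_blurb` and speak whatever string comes back, falling back to a "nothing found" reply on `None`.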

If you want to extract information from those pages to provide answers to questions, not just search results, you would need to look at tools to produce a knowledge base like MindMeld. MindMeld doesn’t do any voice interaction, so you would need to run this as an independent service, and have a Mycroft Skill query that database.
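
To illustrate the "independent service" arrangement, here is a minimal sketch of the Skill-side client using only the Python standard library. The endpoint path `/answer`, the JSON fields `question` and `answer`, and the function names are all assumptions for the sake of the example; they are not MindMeld's actual API, so you would adapt them to whatever your knowledge-base service exposes.

```python
import json
import urllib.request
from typing import Optional


def build_question_request(question: str, base_url: str) -> urllib.request.Request:
    """Build a POST request carrying the question as JSON.

    The "/answer" path and the {"question": ...} payload are hypothetical.
    """
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/answer",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_answer(body: bytes) -> Optional[str]:
    """Pull the answer text out of the service's JSON reply, if present."""
    data = json.loads(body.decode("utf-8"))
    # "answer" is an assumed field name in the reply.
    return data.get("answer")


def ask_service(question: str, base_url: str, timeout: float = 5.0) -> Optional[str]:
    """Send the question to the knowledge-base service and return its answer."""
    request = build_question_request(question, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_answer(response.read())
```

The Skill itself stays thin this way: it forwards the utterance to `ask_service` and speaks the result, while the knowledge base can be rebuilt or replaced without touching the voice side.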

If this is for a commercial purpose, we can talk about what support Mycroft could provide to make it happen.