Tips for better context identification?



I guess a good preface is that I’m a rookie Python programmer. Learning as I go.

I’ve run with the idea of an IRC bot and I have it working reasonably well. Humans, being what they are, pretty regularly come up with new ways to ask for information. Don’t get me started on typos and the variation in word usage between regions, even among English-speaking users. All of which leads to the problem of tuning for intents.

I’ve tried to come up with as many keywords as I can, and I add more as people ask using new ones. The problem is that the more keywords I add, the sharper the difference between ordinary messages and those containing the context, and the less fuzzy the context tolerance becomes. That is bad when people come up with new words to express their request, because I’d still like to adapt and, hopefully, come up with a response.

I was thinking of maybe setting multiple tolerance ranges, e.g. 1-5: don’t respond; 5-10: ask if the user wants x and then act on a yes response; 10-15: just respond with the suitable information?
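To make the idea concrete, here is a rough sketch of the tiered logic in Python. The names `IGNORE_BELOW`, `CONFIRM_BELOW` and `choose_action` are made up for illustration, the 0-15 score scale just follows the ranges above, and I’m assuming a score that lands exactly on a boundary (5 or 10) goes to the higher tier.

```python
# Hypothetical thresholds on a 0-15 match score (not from any library).
IGNORE_BELOW = 5     # below this: stay quiet
CONFIRM_BELOW = 10   # from IGNORE_BELOW up to this: ask the user first


def choose_action(score):
    """Map a match score to one of three actions.

    Assumption: boundary scores (exactly 5 or 10) fall into the higher tier.
    """
    if score < IGNORE_BELOW:
        return "ignore"
    if score < CONFIRM_BELOW:
        return "confirm"
    return "respond"
```

With something like this, the bot would stay silent on a score of 3, ask "did you want x?" on a 7, and answer directly on a 12.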

So, having laid that out: I’d prefer to avoid writing my own logic like the above if I can. Are there any pro tips for getting the right balance of keywords (required vs. optional), or any other methods that might help keep things suitably fuzzy without just building really big, complicated keyword lists?
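For the typo side of things, one option I could imagine is approximate matching against a short keyword list using `difflib` from the Python standard library, rather than listing every misspelling. This is only a sketch: the keyword list, the `fuzzy_keywords` helper and the 0.8 cutoff are all placeholders I picked for illustration, and it does nothing for genuinely new synonyms.

```python
import difflib

# Hypothetical keyword list, for illustration only.
KEYWORDS = ["weather", "forecast", "temperature"]


def fuzzy_keywords(message, keywords=KEYWORDS, cutoff=0.8):
    """Return known keywords that approximately match a word in the message.

    cutoff is the minimum similarity ratio (0.0-1.0) passed to
    difflib.get_close_matches; 0.8 is an arbitrary starting point.
    """
    found = []
    for word in message.lower().split():
        word = word.strip(".,!?")  # drop trailing punctuation
        matches = difflib.get_close_matches(word, keywords, n=1, cutoff=cutoff)
        if matches and matches[0] not in found:
            found.append(matches[0])
    return found
```

Here a message like "whats the wether today?" would still map to the `weather` keyword, while "hello there" would match nothing. Lowering the cutoff makes matching fuzzier at the cost of more false positives.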