Padatious Intent Parser: How can I save/store/persist trained models to disk?


#1

Hi,

I’m using the Padatious Intent Parser for NLU tasks and would like to persist the trained models to disk for later reuse. This does not seem to be possible out of the box?

I tried to pickle the IntentContainer object, but since it wraps SWIGPyObjects from the external fann2 library (the FANN neural network bindings), that does not work!

I also tried to use the cache of the IntentContainer and the files it contains, but the cache does not seem to be loaded automatically after pointing a new container at the directory. So I guess the problem again comes down to the SWIGPyObjects. Any clues or ideas on how to persist/save/store a trained model to disk for later reuse?
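For reference, here is roughly what I tried (a simplified sketch; 'intent_cache' and the file names are just placeholders):

```python
import pickle
from padatious import IntentContainer

# Train a container that writes its cache to a directory.
container = IntentContainer('intent_cache')
container.add_intent('greet', ['hello', 'hi there', 'good morning'])
container.train()

# Attempt 1: pickle the whole container. This fails because the trained
# FANN networks are SWIG-wrapped C objects that pickle cannot serialise.
try:
    with open('container.pkl', 'wb') as f:
        pickle.dump(container, f)
except TypeError as err:
    print('pickling failed:', err)   # cannot pickle SwigPyObject

# Attempt 2: in a fresh process, point a new container at the same cache
# directory and hope the trained models are picked up automatically.
restored = IntentContainer('intent_cache')
# ...but nothing is loaded here: the container has no intents until they
# are re-added and train() is called again, which is what I want to avoid.
```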

Unfortunately it is not an option to keep everything in memory all the time, or to always retrain with all the training data just to be able to process intents!

Thank you in advance for your help!
Best, Mycrofter


#2

Hi Mycrofter,

I flagged this possibility a few months ago, but was warned by the dev team that it’s really not something Padatious was designed for, unfortunately.

I’ll let them know this thread exists in case they have ideas that could help; however, it would likely require some significant re-architecting.


#3

Hi [gez-mycroft] - thank you very much for getting back to my request!

I have good news: true to the “O” in Open Source, I managed to adapt the code to do something like that. It gave me headaches, since little to nothing is documented here :frowning: but thanks to it being Python code it was possible to pull through.

I guess I should make an official pull request? But I would like to ask the dev team for assistance, since I am not really familiar with those things yet (it would be my first PR ;-). I would also like to document and probably unit-test it before committing it to the official code base.

Another thing: occasionally (seemingly at random) I get a malloc error during training. Simply repeating the training helps, and in almost all cases the respective model then works, but I guess that is not exactly “production-grade”…

Regarding the technical background: I made almost no changes to Padatious itself. I leveraged the “cache to file” functionality and adjusted some of the internal interfaces to “fake” a cache access when actually loading a saved model (a snapshot of the cached model, so to speak). It is probably not the cleanest way, but since I wanted a quick and practical solution, it serves my purpose for now. By the way, I have combined it with the Rasa NLU stack and created a new NLU component there for both entity extraction and intent detection.
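To give a rough idea of the snapshot approach (a heavily simplified sketch; the paths are placeholders, and my actual tweaks to the Padatious internals, which avoid having to re-add all the training data, are not shown):

```python
import shutil
from padatious import IntentContainer

CACHE_DIR = 'intent_cache'        # placeholder paths
SNAPSHOT_DIR = 'intent_snapshot'

# Training process: train once, then snapshot the cache directory, which
# holds the serialised FANN networks plus the hash files for each intent.
container = IntentContainer(CACHE_DIR)
container.add_intent('greet', ['hello', 'hi there', 'good morning'])
container.add_intent('weather', ['what is the weather in {city}'])
container.train()
shutil.copytree(CACHE_DIR, SNAPSHOT_DIR, dirs_exist_ok=True)

# Later / separate process: restore the snapshot and point a fresh
# container at it. In this plain-Padatious sketch the intents still have
# to be re-added so that train() finds the cached networks and skips the
# actual retraining; my adjusted interfaces "fake" that cache access.
shutil.copytree(SNAPSHOT_DIR, CACHE_DIR, dirs_exist_ok=True)
restored = IntentContainer(CACHE_DIR)
restored.add_intent('greet', ['hello', 'hi there', 'good morning'])
restored.add_intent('weather', ['what is the weather in {city}'])
restored.train()                  # cache hit, loads the saved nets

print(restored.calc_intent('hi there').name)
```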

It works almost perfectly! Better than the Rasa NLU components shipped with the framework, I have to admit.
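For anyone curious about the Rasa side, it looks roughly like this (a very stripped-down skeleton against the Rasa 1.x Component API; the class name, config key and wiring are simplified placeholders rather than my actual component):

```python
from rasa.nlu.components import Component
from padatious import IntentContainer


class PadatiousClassifier(Component):
    """Custom Rasa NLU component that delegates intent detection and
    (via {entity} markers in the training lines) entity extraction
    to a Padatious IntentContainer."""

    provides = ["intent", "entities"]
    defaults = {"cache_dir": "padatious_cache"}   # placeholder

    def __init__(self, component_config=None):
        super().__init__(component_config)
        self.container = IntentContainer(self.component_config["cache_dir"])

    def train(self, training_data, cfg=None, **kwargs):
        # Group Rasa's training examples by intent and feed them to Padatious.
        by_intent = {}
        for ex in training_data.intent_examples:
            by_intent.setdefault(ex.get("intent"), []).append(ex.text)
        for intent, lines in by_intent.items():
            self.container.add_intent(intent, lines)
        self.container.train()

    def process(self, message, **kwargs):
        # Score the incoming utterance with Padatious and attach the result.
        match = self.container.calc_intent(message.text)
        message.set("intent",
                    {"name": match.name, "confidence": match.conf},
                    add_to_output=True)
        # Simplified entity output (Rasa normally also expects start/end offsets).
        message.set("entities",
                    [{"entity": k, "value": v, "extractor": "padatious"}
                     for k, v in match.matches.items()],
                    add_to_output=True)
```

In a real pipeline one would also implement persist() and load() so the Padatious cache directory travels with the Rasa model archive.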

So, if anyone from the dev team (or anyone else) is interested in having a look at incorporating my “fork”, I would be happy to contribute it to the code base in one way or another!

Best, MyCrofter


#4

Wow, that sounds like an incredible amount of work, I’d love to take a look.

Do you have the code in a GitHub repository, or just locally?

A fork of the existing repo would be the easiest option. If you haven’t done much with Git before, GitHub provides some good guides to help you get started:
https://guides.github.com/activities/forking/


#5

Alright, you’re welcome to. Right now it is only local code and, as already mentioned, not well documented or unit-tested yet. I will look into this (a fork or putting it on GitHub) and get back as soon as possible.