Padatious Intent Parser: How can I save/store/persist trained models to disk?


I’m using the Padatious Intent Parser for NLU tasks and would like to persist trained models to disk for later reuse. This does not seem to be possible out of the box.

I tried to pickle the IntentContainer object, but since it uses SWIGPyObjects from the external FANN2 library (the neural-net backend), pickling fails!

I also tried to use the IntentContainer cache and the files it contains, but the cache does not seem to be loaded automatically after pointing the container at the directory. So I guess the issue lies with the SWIGPyObject handling. Any clues or ideas on how to persist/save/store a trained model to disk for later reuse?
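To illustrate the pickle problem: any object wrapping a C-level handle without pickle support behaves the same way. Here is a small stand-in (not the actual FANN binding — a `threading.Lock` plays the role of the SWIGPyObject):

```python
import pickle
import threading

class FakeIntentModel:
    """Stand-in for an IntentContainer holding a SWIG-wrapped FANN network;
    a lock plays the role of the unpicklable SWIGPyObject."""
    def __init__(self):
        self._net = threading.Lock()  # SWIGPyObjects likewise lack pickle support

def try_pickle(obj):
    """Return the pickling error message, or None if pickling succeeded."""
    try:
        pickle.dumps(obj)
        return None
    except TypeError as e:
        return str(e)

print(try_pickle(FakeIntentModel()))  # e.g. cannot pickle '_thread.lock' object
```

This is why naive pickling of the whole container can never work: the serialized state of the FANN network lives on the C side, out of reach of Python's pickle protocol.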

Unfortunately, it is not an option to keep everything in memory all the time, or to always retrain with the full training data, just to be able to process intents!

Thank you in advance for your help!
Best, Mycrofter

Hi Mycrofter,

I flagged this possibility a few months ago but was warned by the dev team that it’s really not something Padatious was designed for unfortunately.

I’ll let them know this thread exists in case they have ideas that could help, however it would likely require some significant re-architecting.

Hi [gez-mycroft] - thank you very much for finally replying to my request!

I have good news: true to the O in Open Source, I managed to adapt the code to do just that. It gave me some headaches, since little to nothing of it is documented :frowning: But since it is Python code, I was able to pull through.

I guess I should make an official pull request here? But I would like to ask the dev team for assistance, since I am not yet familiar with these things (this would be my first PR ;-). I would also like to document and probably unit-test it before committing it to the official code base.

Another thing is: It occassionally (by chance) happens that I get a malloc error during training. It simply helps to repeat it and in almost all cases it worked then with the respective model. But I guess it is not “production-grade” then…

Regarding the technical background: I made almost no changes to Padatious itself. I leveraged the existing “cache to file” functionality and adjusted some of the internal interfaces to “fake” a cache access when actually loading a saved model (a snapshot of the cached model, so to speak). It is probably not the cleanest way, but since I wanted a quick and practical solution, it serves my purpose for NOW. By the way, I have combined it with the RASA NLU stack and created a new NLU component there for both entity extraction and intent detection.
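The core of the approach can be sketched like this: the Padatious calls stay unchanged, and only the cache directory is snapshotted and restored around them (the cache file name below is purely illustrative):

```python
import shutil
import tempfile
from pathlib import Path

def snapshot_cache(cache_dir: str, snapshot_dir: str) -> None:
    """Copy the files Padatious wrote to its cache into a persistent snapshot."""
    shutil.copytree(cache_dir, snapshot_dir, dirs_exist_ok=True)

def restore_cache(snapshot_dir: str, cache_dir: str) -> None:
    """Put a snapshot back so a fresh container sees it as a warm cache."""
    shutil.copytree(snapshot_dir, cache_dir, dirs_exist_ok=True)

# Self-contained demo with a fake cache file (the real cache holds the
# serialized FANN networks plus training hashes):
tmp = Path(tempfile.mkdtemp())
cache = tmp / "cache"
cache.mkdir()
(cache / "hello.net").write_text("fake network weights")

snapshot_cache(str(cache), str(tmp / "snapshot"))
shutil.rmtree(cache)                      # simulate a fresh process, cold cache
restore_cache(str(tmp / "snapshot"), str(cache))

print((cache / "hello.net").read_text())  # -> fake network weights
```

A new IntentContainer pointed at the restored directory then finds files matching its training hashes and skips retraining, which is exactly the “fake cache access” described above.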

It works almost perfectly! Better than the RASA NLU components shipped with the framework, I have to admit.

So - if anyone from the dev team (or anyone else) is interested in looking at incorporating my “fork”, I would be happy to contribute it to the code base in one way or another!

Best, MyCrofter

Wow, that sounds like an incredible amount of work, I’d love to take a look.

Do you have the code in a Github repository or just locally?

A fork of the existing repo would be the easiest option. If you haven’t done much with Git before, GitHub provides some good guides to help you get started.
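For a first PR, the basic flow looks roughly like this (branch name and file are illustrative; in practice you would start by cloning your fork of the MycroftAI/padatious repo):

```shell
# Local illustration of the branch-and-commit part of a fork workflow;
# the clone/push steps against GitHub are noted in comments.
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git config user.email "you@example.com"
git config user.name "Your Name"
git checkout -q -b persist-models           # feature branch for the change
echo "snapshot support" > persist.py        # stand-in for the real adaptation
git add persist.py
git commit -q -m "Add model persistence via cache snapshots"
git log --oneline -1
# afterwards: git push -u origin persist-models, then open the PR on GitHub
```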

Alright, you’re welcome to! Right now it’s only local code and, as already mentioned, not well documented or unit-tested yet. I will look into this (fork or GitHub post) and get back as soon as possible.

Hi again everyone,

I have now forked the “padatious” repo and pushed my code adaptations, including a unit test, to it.

Looking forward to hearing your comments on this, @gez-mycroft :slight_smile:


Hey… is no one interested in having a look at it? I was expecting a pull request, or comments on the fragility of the approach, or the like :smiley: @gez-mycroft??

Hey, sorry - I somehow missed the last post. I’ll ask one of the team members more familiar with Padatious to take a look.

Are you able to open a PR back to the main repo? That way the right people get pinged, and it will be easier to comment on the code.

Alright, no worries. I just did that:
Looking forward to your comments.

Best, MyCrofter


Thanks MyCrofter, I’ve shot it over to the core team to take a look at.

Anything we can do to reduce load times is definitely a good thing :smile:

Just wanted to mention your approach looks good and I’d be happy to pull it in. Could you take a look at the comment I put on it about fixing the formatting? Thanks!

Otherwise, if I do get a chance I can do the formatting fixes myself.
