I'll try to hit all questions above:
The skill framework is very open and allows connecting to basically any resource using standard Python calls and modules. This means a skill can be integrated with a LAN resource without much trouble.
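To make that concrete, here is a minimal sketch of the kind of plain-stdlib HTTP call a skill's intent handler could make against a device on your LAN. The `FakeDevice` server, the `/status` endpoint, and the `temperature` field are all invented stand-ins for a real network device; inside an actual Mycroft skill you would make the same `urllib` call from a handler method instead of at module level.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class FakeDevice(BaseHTTPRequestHandler):
    """Stand-in for a LAN device (e.g. a thermostat) with a tiny HTTP API.
    On a real network this server would already exist on some device."""

    def do_GET(self):
        body = json.dumps({"temperature": 21.5}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging

def query_device(url):
    """What a skill's intent handler might do: a plain stdlib HTTP call."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.loads(resp.read())

# Spin the fake device up in-process so the example is self-contained.
server = HTTPServer(("127.0.0.1", 0), FakeDevice)
threading.Thread(target=server.serve_forever, daemon=True).start()

data = query_device(f"http://127.0.0.1:{server.server_port}/status")
print(data["temperature"])
server.shutdown()
```

Because the call is just ordinary Python, anything reachable from the device (REST, MQTT via a client library, raw sockets) can be driven from a skill the same way.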
Intents are processed locally on the device as are skills. The process is:
1 Mic records sound
2 Audio data is sent to STT in the cloud
3 STT replies with a text string
4 The text string is passed to the intent parser systems (there are two on the Mycroft device)
5 The intent parser triggers the appropriate skill
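The steps above can be sketched as a toy pipeline. `fake_stt` and the keyword matcher below are invented stand-ins: on a real device step 2 goes to a cloud STT service, and step 4 is handled by Mycroft's actual intent parsers rather than a string check.

```python
def fake_stt(audio: bytes) -> str:
    # Steps 2-3: in reality the audio is sent to a cloud STT service,
    # which replies with a transcript.
    return "what time is it"

# Step 5 prerequisite: skills register handlers keyed on intent names.
skills = {}

def register(intent_name, handler):
    skills[intent_name] = handler

def parse_intent(utterance: str) -> str:
    # Step 4: a trivial keyword matcher standing in for the real parsers.
    if "time" in utterance:
        return "query.time"
    return "unknown"

register("query.time", lambda: "It is 12 o'clock")

utterance = fake_stt(b"\x00\x01")   # steps 1-3
intent = parse_intent(utterance)    # step 4
print(skills[intent]())             # step 5
```

The point of the sketch is the data flow: only step 2-3 leaves the device; recording, intent parsing, and skill execution all stay local.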
There is the option to use a local Kaldi server, but it is really messy to set up and not as accurate as, for example, Google STT, so it's not something we recommend.
To isolate the device from the Mycroft servers there are some things that should be turned off (configuration updates and pairing, for example), and you'd need an account with some STT service.
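As a rough illustration, the relevant settings live in `mycroft.conf`. The snippet below shows the general shape; treat the exact key names as assumptions and verify them against the configuration reference for your Mycroft release before relying on them.

```json
{
  "server": {
    "update": false,
    "metrics": false
  },
  "stt": {
    "module": "google_cloud"
  }
}
```

A user-level override in `~/.mycroft/mycroft.conf` is merged over the system defaults, so only the keys you change need to appear there.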
The example you describe should be achievable with the current state of Mycroft. If you're interested in this kind of setup, I'm sure we can help you out.
Also worth mentioning: nothing is sent to an external STT service until the "Hey Mycroft" wake-word is triggered, so there is no passive recording of everything.
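The wake-word gate can be sketched as a simple state machine: audio chunks are examined locally and discarded until the wake-word is detected, and only the utterance after it is forwarded. `detect_wakeword` here is an invented stand-in for the on-device hotword engine.

```python
def detect_wakeword(chunk: str) -> bool:
    # Stand-in for the local hotword engine; runs entirely on-device.
    return chunk == "hey mycroft"

def run(chunks):
    sent_to_cloud = []
    listening = False
    for chunk in chunks:
        if not listening:
            # Passive phase: chunk is checked locally, then discarded.
            listening = detect_wakeword(chunk)
        else:
            # Active phase: only now is audio forwarded to STT.
            sent_to_cloud.append(chunk)
            listening = False  # one utterance per wake-word
    return sent_to_cloud

print(run(["chatter", "hey mycroft", "what time is it", "more chatter"]))
```

Everything before the wake-word ("chatter") never leaves the device; only the utterance following it is sent out.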
If I missed something or you have further questions, I'd be happy to help.