There are APIs out there for translating natural language to actions that a machine can take. An example from wit.ai is the IoT thermostat use case.
But why not use GPT-3 instead? It ought to be well suited to this, and as I suspected, the results were quite good! The green highlighted text is AI-generated (so were the closing braces, but for some reason those weren't highlighted).
I think there's a lot here that can be expanded! E.g. you could define a schema beforehand rather than just giving it some examples as I have, but I quite like this test-driven approach of defining what I actually want.
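To make the example-driven approach concrete, here is a minimal sketch of how such a prompt might be assembled from input/output pairs. The intent names, fields, and the "Input:"/"Output:" labels are assumptions for illustration, not the exact prompt from my experiment, and the actual call to the model is left out.

```python
import json

# Hypothetical example pairs; intent names and fields are illustrative only.
EXAMPLES = [
    ("Turn the heating up to 21 degrees",
     {"intent": "set_temperature", "value": 21, "unit": "celsius"}),
    ("Switch the thermostat off",
     {"intent": "set_power", "state": "off"}),
]


def build_prompt(examples, utterance):
    """Build a few-shot prompt: worked examples first, then the new input.

    The prompt ends with an open "Output:" so the model completes the JSON.
    """
    parts = []
    for text, action in examples:
        parts.append(f"Input: {text}\nOutput: {json.dumps(action)}")
    parts.append(f"Input: {utterance}\nOutput:")
    return "\n\n".join(parts)


prompt = build_prompt(EXAMPLES, "Make it a bit warmer")
print(prompt)
```

The examples double as test cases: if the model gets a new phrasing wrong, adding the corrected pair to the list is the "fix".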
I made some tweaks to teach it that I want it to put words in my mouth, as it were. It invented a new intent that I hadn't defined, so it would probably be useful to define an array of valid intents at the top. It did, however, manage to sweet-talk my "wife"!
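Besides listing the valid intents in the prompt, the same array could be used to check the model's output afterwards. This is a rough sketch under the assumption that the completion is a single JSON object with an "intent" key; the intent names are made up.

```python
import json

# Hypothetical allowlist; in practice this would mirror the array in the prompt.
VALID_INTENTS = ["set_temperature", "send_message"]


def parse_action(completion, valid_intents=VALID_INTENTS):
    """Parse the model's completion and reject anything unexpected.

    Returns the action dict, or None if the text is not valid JSON,
    is not a JSON object, or uses an intent outside the allowlist.
    """
    try:
        action = json.loads(completion)
    except json.JSONDecodeError:
        return None
    if not isinstance(action, dict):
        return None
    if action.get("intent") not in valid_intents:
        return None
    return action
```

An invented intent then comes back as None instead of being passed on to whatever executes the action.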
I think this could work quite well in conjunction with other "modules": for example, a prompt that takes a recipient and a list of people I know (and what their relationship is to me), and outputs a phone number.
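As a sketch of what such a second module's prompt might look like, the snippet below lists the known people and leaves the phone number for the model to fill in. The contact data, field names, and prompt wording are all hypothetical, and the model call itself is not shown.

```python
# Hypothetical contact list; names and numbers are placeholders.
CONTACTS = [
    {"name": "Alice", "relationship": "wife", "phone": "555-0100"},
    {"name": "Bob", "relationship": "brother", "phone": "555-0101"},
]


def build_contact_prompt(contacts, recipient):
    """Build a prompt asking the model to resolve a recipient to a number.

    `recipient` is the free-text recipient taken from the first module's
    output, e.g. "my wife".
    """
    lines = ["People I know:"]
    for c in contacts:
        lines.append(f"- {c['name']} ({c['relationship']}): {c['phone']}")
    lines.append("")
    lines.append(f"Recipient: {recipient}")
    lines.append("Phone number:")
    return "\n".join(lines)


print(build_contact_prompt(CONTACTS, "my wife"))
```

Keeping each module to one small prompt means each can be tested with its own examples before chaining them.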