OpenAI dropped a ton of cool stuff in their Dev Day presentations, including some updates to function calling. There are a few function-call-like things that currently exist within the OpenAI ecosystem, so let's take a moment to disambiguate:
- Plugins: introduced in March 2023, allowed GPT to understand and call your HTTP APIs
- Actions: an evolution of Plugins, makes it easier but still calls your HTTP APIs
- Function Calling: GPT understands your functions, tells you how to call them, but does not actually call them
It seems like Plugins are likely to be superseded by Actions, so we end up with 2 ways to have GPT call your functions - Actions for automatically calling HTTP APIs, Function Calling for indirectly calling anything else. Despite its name, Function Calling doesn't actually call the function, it just tells you how to, so we could call it Guided Invocation.
That second category of calls is going to include anything that isn't an HTTP endpoint, so it gives you a lot of flexibility to call internal APIs that never learned how to speak HTTP. Think legacy systems, private APIs that you don't want to expose to the internet, and other places where this can act as a highly adaptable glue.
I've put all the source code for this article up at https://github.com/edspencer/gpt-functions-example, so check that out if you want to follow along. It should just be a matter of following the steps in the README, but YMMV. We are, of course, going to use a task management app as a playground.
Creating Function definitions
In order for OpenAI Assistants to be able to call your code, you need to provide them with signatures for all of your functions, in the format that it wants, which look like this:
{
"type": "function",
"function": {
"name": "addTask",
"description": "Adds a new task to the database.",
"parameters": {
"type": "object",
"properties": {
"name": {
"type": "string",
"description": "The name of the task."
},
"priority": {
"type": "number",
"description": "The priority of the task, lower numbers indicating higher priority."
},
"completed": {
"type": "boolean",
"description": "Whether the task is marked as completed."
}
},
"required": ["name"]
}
}
}
That's pretty self-explanatory. It's also a pain in the ass to keep tweaking and updating as you evolve your app, so let's use the OpenAI Chat Completions API with the json_object setting enabled and see if we can have this done for us.
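For reference, the shape of a function definition can be written down as a TypeScript type. This is only a sketch: the type names ToolDefinition and JsonSchemaProperty are my own labels rather than anything from the OpenAI SDK, and the types cover just the fields used in the example above.

```typescript
// Hypothetical type names; they describe only the fields shown in the example above.
type JsonSchemaProperty = {
  type: "string" | "number" | "boolean" | "object" | "array";
  description?: string;
  // Nested objects (e.g. an "updates" argument) can carry their own properties.
  properties?: Record<string, JsonSchemaProperty>;
};

type ToolDefinition = {
  type: "function";
  function: {
    name: string;
    description: string;
    parameters: {
      type: "object";
      properties: Record<string, JsonSchemaProperty>;
      required?: string[];
    };
  };
};

// The addTask definition from above, now checked by the compiler.
const addTaskTool: ToolDefinition = {
  type: "function",
  function: {
    name: "addTask",
    description: "Adds a new task to the database.",
    parameters: {
      type: "object",
      properties: {
        name: { type: "string", description: "The name of the task." },
        priority: {
          type: "number",
          description: "The priority of the task, lower numbers indicating higher priority.",
        },
        completed: { type: "boolean", description: "Whether the task is marked as completed." },
      },
      required: ["name"],
    },
  },
};
```

Typing the definitions like this means a typo such as "fucntion" fails at compile time instead of at request time.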
Our Internal API
Let's build a basic Task management app. We'll just use a super-naive implementation of Todos written in TypeScript. My little API.ts has functions like addTask, updateTask, removeTask, getTasks, etc. All the stuff you'd expect. Some of them take a bunch of different inputs.
Here's a snippet of our API.ts file. It's very basic but functional, using a sqlite database driven by Prisma:
interface TaskInput {
name: string;
priority?: number;
completed?: boolean;
deleted?: boolean;
}
/**
* Adds a new task to the database.
* @param taskInput - An object containing the details of the task to be added.
* @param taskInput.name - The name of the task.
* @param taskInput.priority - The priority of the task.
* @returns A Promise that resolves when the task has been added to the database.
*/
async function addTask(taskInput: TaskInput): Promise<Task | void> {
try {
const task = await prisma.task.create({
data: taskInput
})
console.log(`Task ${task.id} created with name ${task.name} and priority ${task.priority}.`)
return task;
} catch (e) {
console.error(e)
}
}
/**
* Updates a task in the database.
* @param id - The ID of the task to update.
* @param updates - An object containing the updates to apply to the task.
* @param updates.name - The updated name of the task.
* @param updates.priority - The updated priority of the task.
* @param updates.completed - The updated completed status of the task.
* @returns A Promise that resolves when the task has been updated in the database.
*/
async function updateTask(id: string, updates: Partial<TaskInput>): Promise<void> {
try {
const task = await prisma.task.update({
where: { id },
data: updates,
})
console.log(`Task ${task.id} updated with name ${task.name} and priority ${task.priority}.`)
} catch (e) {
console.error(e)
}
}
It goes on from there. You get the picture. No, it's not production-grade code - don't use this as a launchpad for your Todo list manager app. GitHub Copilot actually wrote most of that code (and most of the documentation) for me.
Side note on documentation: it took me more years than I care to admit to figure out that the primary consumer of source code is humans, not machines. The machine doesn't care about your language, formatting, awfulness of your algorithms, weird variable names, etc; algorithmic complexity aside it'll do exactly the same thing regardless of how you craft your code. Humans are a different matter though, and benefit enormously from a little context written in a human language.
Ironically, that same documentation that benefitted human code consumers all this time is now what enables these new machine consumers to grok and invoke your code, saving you the work of coming up with a translation layer to integrate with AI agents. So writing documentation really does help you after all. Also, write tests and eat your vegetables.
Generating the OpenAI translation layer
The code to translate our internal API into something OpenAI can use is fairly simple and reusable. All we do is read in a file as text, stuff the contents of that file into a GPT prompt, send that off to OpenAI, stream the results back to the terminal and save it to a file when done:
/**
* This file uses the OpenAI Chat Completions API to automatically generate OpenAI Function Call
* JSON objects for an arbitrary code file. It takes a source file, reads it and passes it into
* OpenAI with a simple prompt, then writes the output to another file. Extend as needed.
*/
import OpenAI from 'openai';
import fs from 'fs';
import path from 'path';
import { OptionValues, program } from 'commander';
//takes an input file, and generates a new tools.json file based on the input file
program.option('-s, --sourceFile <sourceFile>', 'The source file to use for the prompt', './API.ts');
program.option('--outputFile <outputFile>', 'The output file to write the tools.json to (defaults to your input + .tools.json)');
const openai = new OpenAI();
/**
* Takes an input file, and generates a new tools.json file based on the input file.
* @param sourceFile - The source file to use for the prompt.
* @param outputFile - The output file to write the tools.json to. Defaults to the source file path with .tools.json appended.
* @returns Promise<void>
*/
async function build({ sourceFile, outputFile = `${sourceFile}.tools.json` }: OptionValues) {
console.log(`Reading ${sourceFile}...`);
const sourceFileText = fs.readFileSync(path.join(__dirname, sourceFile), 'utf-8');
const prompt = `
This is the implementation of my ${sourceFile} file:
${sourceFileText}
Please give me a JSON object that contains a single key called "tools", which is an array of the functions in this file.
This is an example of what I expect (one element of the array):
{
"type": "function",
"function": {
"name": "addTask",
"description": "Adds a new task to the database.",
"parameters": {
"type": "object",
"properties": {
"name": {
"type": "string",
"description": "The name of the task."
},
"priority": {
"type": "number",
"description": "The priority of the task, with lower numbers indicating higher priority."
},
"completed": {
"type": "boolean",
"description": "Whether the task is marked as completed."
}
},
"required": ["name"]
}
}
},
`
//Call the OpenAI API to generate the function definition, and stream the results back
const stream = await openai.chat.completions.create({
model: 'gpt-4-1106-preview',
response_format: { type: 'json_object' },
messages: [{ role: 'user', content: prompt }],
stream: true,
});
//Keep the new tools.json in memory until we have it all
let newToolsJson = "";
for await (const chunk of stream) {
const content = chunk.choices[0]?.delta?.content || ''
process.stdout.write(content);
newToolsJson += content;
}
console.log(`Updating ${outputFile}...`);
// Write the tools JSON to the output file
fs.writeFileSync(path.join(__dirname, outputFile), newToolsJson);
}
build(program.parse(process.argv).opts());
I've made a simple little repo with this file, the API.ts file, and a little demo that shows it all integrated. Run it like this:
ts-node rebuildTools.ts -s API.ts
Which will give you some output like this, and then update your API.ts.tools.json file:
ts-node rebuildTools.ts -s API.ts
Reading API.ts...
{
"tools": [
{
"type": "function",
"function": {
"name": "addTask",
"description": "Adds a new task to the database.",
"parameters": {
"type": "object",
"properties": {
"name": {
..........truncated...
full output at https://github.com/edspencer/gpt-functions-example/blob/main/API.ts.tools.json
.............................
"returns": {
"type": "Promise<void>",
"description": "A Promise that resolves when all tasks have been deleted from the database."
}
}
}
]
}
Updating ./API.ts.tools.json...
Done
Creating an OpenAI Assistant and talking to it
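We've had OpenAI generate our Tools JSON file, now let's see if it can use it with a simple demo.ts, which creates an Assistant and a Thread, loads any existing tasks from the database, sends them along with the user's message, and then runs whatever function calls come back.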
The code is all up on GitHub, and I won't do a blow-by-blow here but let's have a look at the output when we run it:
ts-node ./demo.ts -m "I need to go buy bread from the store, then go to \
the gym. I also need to do my taxes, which is a P1."
And the output:
Creating assistant...
Created assistant asst_hkT3BFQsNf3HSmJpE8KytiX9 with name Task Planner.
Created thread thread_AigYi0oFrytu3aO5k0mRacIV
Retrieved 0 tasks from the database.
Created message
msg_uLpR3UpQB3pX62wVIA7TcqIl
Polling thread
Current status: queued
Trying again in 2 seconds...
Polling thread
Current status: in_progress
Trying again in 2 seconds...
Polling thread
Current status: in_progress
Trying again in 2 seconds...
Polling thread
Current status: requires_action
Actions:
[
{
id: 'call_8JX5ffKFpxIhYmJeZYYilpv3',
type: 'function',
function: {
name: 'addTask',
arguments: '{"name": "Buy bread from the store", "priority": 2}'
}
},
{
id: 'call_GC4axxSB6Oso0tiolDLr900X',
type: 'function',
function: {
name: 'addTask',
arguments: '{"name": "Go to the gym", "priority": 2}'
}
},
{
id: 'call_7c5mWt1I5Ff3h5Lvb0Hfw2L7',
type: 'function',
function: {
name: 'addTask',
arguments: '{"name": "Do taxes", "priority": 1}'
}
}
]
Adding task
Task cloyl2gxs0000c3a7hxe6hupc created with name Buy bread from the store and priority 2.
Adding task
Task cloyl2gxv0001c3a7zi4hqt8z created with name Go to the gym and priority 2.
Adding task
Task cloyl2gxx0002c3a7l0gv7f07 created with name Do taxes and priority 1.
You can see all of the steps it takes in the console output. First came the creation of the Assistant and the Thread. Then we checked whether our sqlite database had any existing Tasks, which get sent along with the user's message as extra context. Back came OpenAI's function invocations (3 in this case). Finally, we iterated over them all and called our internal addTask function, and at the bottom of the output we see that our tasks were created successfully.
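That last step, iterating over the requested calls and invoking local functions, could look roughly like the sketch below. It is not the code from demo.ts: the handlers registry and dispatchToolCalls are hypothetical names, and an in-memory array stands in for the Prisma-backed database so the snippet runs on its own.

```typescript
// Shape of each requested call, as printed under "Actions:" in the output above.
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string }; // arguments arrive as a JSON string
}

interface Task {
  id: string;
  name: string;
  priority?: number;
  completed?: boolean;
}

// Assumption: an in-memory array stands in for the Prisma-backed database.
const tasks: Task[] = [];

// Hypothetical registry mapping function names to local implementations.
const handlers: Record<string, (args: any) => Promise<unknown>> = {
  addTask: async (args: { name: string; priority?: number }) => {
    const task: Task = { id: `task_${tasks.length + 1}`, ...args };
    tasks.push(task);
    return task;
  },
};

// Runs each requested call locally and pairs the result with the call's id.
async function dispatchToolCalls(calls: ToolCall[]) {
  const results: { id: string; result: unknown }[] = [];
  for (const call of calls) {
    const name = call.function.name;
    // Own-property check so names like "constructor" can't reach Object.prototype.
    const handler = Object.prototype.hasOwnProperty.call(handlers, name)
      ? handlers[name]
      : undefined;
    if (!handler) {
      // Never execute a name we don't recognize; report it back instead.
      results.push({ id: call.id, result: { error: `Unknown function: ${name}` } });
      continue;
    }
    const args = JSON.parse(call.function.arguments);
    results.push({ id: call.id, result: await handler(args) });
  }
  return results;
}
```

This sketch runs the calls one at a time for simplicity; nothing stops you from running independent calls concurrently if your functions are safe to interleave.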
Let's go call it again, updating the tasks that we just made:
ts-node demo.ts -m "I finished the laundry, please mark it complete. Also the gym is a P1"
Output:
Creating assistant...
Created assistant asst_WbTXKoXWL1yTWs4zvcVkDIDT with name Task Planner.
Created thread thread_mLvr7acahXbnmoe217f0gMRF
Retrieved 3 tasks from the database.
Created message
msg_iYYkAeuxRPNmJZ5vAKwiI8S7
Polling thread
Current status: queued
Trying again in 2 seconds...
Polling thread
Current status: in_progress
Trying again in 2 seconds...
Polling thread
Current status: in_progress
Trying again in 2 seconds...
Polling thread
Current status: requires_action
Actions:
[
{
id: 'call_W4UKGadROhaJJFZym7vQocP7',
type: 'function',
function: {
name: 'completeTask',
arguments: '{"id": "cloyl2gxs0000c3a7hxe6hupc"}'
}
},
{
id: 'call_KzaYk1x4sIRFWeKlvgOk37qf',
type: 'function',
function: {
name: 'updateTask',
arguments: '{"id": "cloyl2gxv0001c3a7zi4hqt8z", "updates": {"priority": 1}}'
}
}
]
Completing task
Task cloyl2gxs0000c3a7hxe6hupc marked as completed.
Updating task
Task cloyl2gxv0001c3a7zi4hqt8z updated with name Go to the gym and priority 1.
That's kinda amazing. All that any of this really does is assemble blobs of text and send them to the OpenAI API, which is able to figure it all out, even with the context of the data, and correctly call both create and update APIs that exist only internally within your system, without exposing anything to the internet at large.
Here it correctly figured out the IDs of the Tasks to update (because I passed that data in with the prompt - it's tiny) and which functions to call, and it returned those calls as a single batch. That means your user can speak/type as much as they like, making a lot of demands in a single submission, and the Assistant will batch it all up into a set of functions that, from its perspective at least, it wants you to run in parallel.
After executing the functions you can send another request to tell the Assistant the outcome - this article is long enough already but you can see how to close that loop in the OpenAI Function Calling docs.
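To give a rough idea of what closing the loop involves: each requested call carries an id, and the results go back to the Assistant as a list of tool_outputs entries, each pairing a tool_call_id with a string output. The helper below only builds that payload. Its name, toToolOutputs, and the CallResult shape are hypothetical, and the actual submission call is left out because it depends on which version of the SDK you're using.

```typescript
// One entry per function call that the Assistant asked for.
interface ToolOutput {
  tool_call_id: string;
  output: string;
}

// Hypothetical shape: the id of a requested call plus whatever our local function returned.
interface CallResult {
  id: string;
  result: unknown;
}

// Pairs each call id with the serialized result of running it locally.
function toToolOutputs(results: CallResult[]): ToolOutput[] {
  return results.map(({ id, result }) => ({
    tool_call_id: id,
    // Outputs are sent as strings; functions that return nothing become "null".
    output: JSON.stringify(result ?? null),
  }));
}

// Example using one of the call ids from the output above.
const payload = toToolOutputs([
  { id: "call_W4UKGadROhaJJFZym7vQocP7", result: { completed: true } },
]);
console.log(payload);
```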
Closing Thoughts
This stuff is all very new, and there are some pros and cons here. While all looks rosy in the end, it did take a few iterations to get GPT to reliably and consistently output the JSON format expected in the translation stage - occasionally it would innovate and restructure things a little, which causes things to break. That's probably just something that time will take care of as this stuff gets polished up, both on OpenAI's end and on everyone else's, but it's something to be aware of.
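One way to catch that kind of drift early is to validate the generated JSON before writing it to disk. Here's a minimal sketch, assuming the { "tools": [...] } shape requested in the prompt. validateToolsJson is a hypothetical helper that isn't in the repo, and it checks only the top-level structure of each entry.

```typescript
// Returns a list of problems found in the generated tools JSON; empty means it looks OK.
function validateToolsJson(raw: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return ["Output is not valid JSON"];
  }

  const tools = (parsed as { tools?: unknown } | null)?.tools;
  if (!Array.isArray(tools)) {
    return ['Expected a top-level "tools" array'];
  }

  const errors: string[] = [];
  tools.forEach((tool, i) => {
    if (tool?.type !== "function") {
      errors.push(`tools[${i}].type should be "function"`);
    }
    if (typeof tool?.function?.name !== "string") {
      errors.push(`tools[${i}].function.name is missing`);
    }
    if (tool?.function?.parameters?.type !== "object") {
      errors.push(`tools[${i}].function.parameters.type should be "object"`);
    }
  });
  return errors;
}
```

If the list comes back non-empty you could retry the generation or fail the build, rather than letting a restructured file break things later.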
This technology requires a considered approach to testing too: GPT is a big old black box floating off in the internet somewhere, it's semi-magical, and it doesn't always give the right answer. Bit rot seems a serious risk here - both due to the newness of the tech and the fact that most of us don't really understand it very well. It seems sensible to mock/stub out expected responses from OpenAI's APIs to do unit testing, but when it comes to integration testing, you probably need your tests to do something like what our demo.ts does, and then verify the database was updated correctly at the end.
You can make no changes to your code or environment and still get different outcomes, due to the non-determinism of GPT. Temperature control and fine tuning may help, but you're probably going to need to be less than 100% trustful that your Assistant is doing what you think it is.
Finally, there's obviously a huge security consideration here. Fundamentally, we're taking user input (text, speech, images, whatever), and calling code on our own systems as a result. This always involves peril, and one can imagine all kinds of SQL injection-style attacks against Agent systems that inadvertently run malicious actions the developer didn't intend. For example - my API.ts contains a deleteAllTasks function that does what you think it does. Because it's part of API.ts, the Assistant knows about it, and could inadvertently call it, whether the user was trying to do that or not.
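One possible mitigation is an explicit allowlist, applied to the generated tools before they are handed to the Assistant. The sketch below is an assumption about how you might do that: filterTools and the allowlist contents are hypothetical, and you'd likely want the same check again at the point where you actually execute a requested call.

```typescript
// Minimal view of a tool definition: only the name matters for filtering.
interface NamedTool {
  type: "function";
  function: { name: string };
}

// Keeps only the tools whose function name appears on the allowlist.
function filterTools<T extends NamedTool>(tools: T[], allowed: string[]): T[] {
  return tools.filter((tool) => allowed.includes(tool.function.name));
}

// Hypothetical allowlist: deleteAllTasks is deliberately left off.
const allowedFunctions = ["addTask", "updateTask", "completeTask", "getTasks"];

const generatedTools: NamedTool[] = [
  { type: "function", function: { name: "addTask" } },
  { type: "function", function: { name: "deleteAllTasks" } },
];

const safeTools = filterTools(generatedTools, allowedFunctions);
console.log(safeTools.map((tool) => tool.function.name));
```

An allowlist fails closed: a function newly added to API.ts stays invisible to the Assistant until someone opts it in.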
It would be extremely easy to mix up public and private code in this way and accidentally expose it to the Assistant, so in reality you probably want a sanity-check to run each time the tools JSON has been rebuilt, telling you what changed. Seems a good thing to have in your CI/CD.
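Such a sanity check could be as simple as comparing function names between the previous and the newly generated tools file. Here's a minimal sketch, assuming both files have the { tools: [...] } shape shown earlier. diffToolNames is a hypothetical helper, and because it only compares names, a changed description or parameter list would go unnoticed.

```typescript
// Minimal view of the tools file: we only need each function's name.
interface ToolsFile {
  tools: { function: { name: string } }[];
}

// Reports which function names appeared or disappeared between two tools files.
function diffToolNames(before: ToolsFile, after: ToolsFile) {
  const beforeNames = before.tools.map((tool) => tool.function.name);
  const afterNames = after.tools.map((tool) => tool.function.name);
  return {
    added: afterNames.filter((name) => !beforeNames.includes(name)),
    removed: beforeNames.filter((name) => !afterNames.includes(name)),
  };
}

// Example: a rebuild that newly exposes deleteAllTasks.
const previous: ToolsFile = { tools: [{ function: { name: "addTask" } }] };
const rebuilt: ToolsFile = {
  tools: [{ function: { name: "addTask" } }, { function: { name: "deleteAllTasks" } }],
};

const changes = diffToolNames(previous, rebuilt);
console.log(changes);
```

In CI you could fail the build, or require a manual approval, whenever added is non-empty.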