GAME SDK
The GAME SDK is available in Python and JavaScript/TypeScript.
This simple diagram is the basis of how most agents are currently designed and operated. It is referred to as an agentic loop.
An agent takes an action in the environment, gets feedback, and then takes another action based on that feedback, either correcting the previous action or doing something else.
Without this loop, it would essentially be a bot: something that gets no feedback, so it doesn't learn or adjust what it does. It is just an input-output machine.
In the agentic loop, feedback determines the next action.
An agent takes an action. The action is executed in the environment.
Something in the environment changes (or doesn't), and the state, which is what the agent sees, is updated accordingly.
What the agent sees is the state of the world around it. When the state changes, the input to the agent also changes.
That's essentially what the whole agentic framework is, and we are going to build our framework around this concept.
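The loop described above can be sketched in a few lines of plain Python. This is only an illustration of the concept, not GAME SDK code: `run_agentic_loop`, `decide`, `execute`, and the thermostat environment are all hypothetical names chosen for the example, and a hard-coded rule stands in for the model that would normally choose the action.

```python
from typing import Callable


def run_agentic_loop(
    initial_state: dict,
    decide: Callable[[dict], str],
    execute: Callable[[str, dict], dict],
    max_steps: int = 5,
) -> list[dict]:
    """Run a fixed number of act -> feedback -> act iterations."""
    state = dict(initial_state)
    history = [dict(state)]
    for _ in range(max_steps):
        action = decide(state)          # the agent picks an action from what it sees
        state = execute(action, state)  # the environment applies it and returns the new state
        history.append(dict(state))
    return history


# Hypothetical toy environment: a thermostat trying to reach 21 degrees.
def decide(state: dict) -> str:
    if state["temperature"] < 21:
        return "heat"
    if state["temperature"] > 21:
        return "cool"
    return "wait"


def execute(action: str, state: dict) -> dict:
    delta = {"heat": 1, "cool": -1, "wait": 0}[action]
    return {"temperature": state["temperature"] + delta}


history = run_agentic_loop({"temperature": 18}, decide, execute)
print([s["temperature"] for s in history])  # [18, 19, 20, 21, 21, 21]
```

Each iteration feeds the new state back into `decide`, which is the feedback that separates an agent from a plain input-output machine.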
The most basic building block of our SDK is a worker. The worker/executable loop is similar to the agent/environment loop. The worker takes in a state and decides which task to perform (i.e., which function to execute). Executing a function outputs a result. A "get state" function then uses that result to update the state, and the updated state is what the worker sees next.
Let's start with the worker definition. You need to specify the action space, which is the set of actions or functions you give the worker. Then you specify the description of the worker, which is the character card equivalent, and the "get state" function, which determines what the agent sees.
When you initialize a worker, these are the things you specify in order to interact with it.
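As a rough sketch of the three pieces a worker is initialized with (description, action space, and "get state" function), consider the following. The `Worker` class here is a hypothetical stand-in written for this example, not the SDK's own class, and the caller picks the function by name where the SDK would let the LLM decide.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Worker:
    description: str                                  # character-card equivalent
    action_space: dict[str, Callable[..., Any]]       # function name -> executable
    get_state_fn: Callable[[Optional[Any], Optional[dict]], dict]
    state: Optional[dict] = None

    def __post_init__(self) -> None:
        # Ask the "get state" function for the initial state.
        self.state = self.get_state_fn(None, None)

    def step(self, function_name: str, **kwargs: Any) -> Any:
        # Run the chosen executable, then let the "get state" function update the state.
        result = self.action_space[function_name](**kwargs)
        self.state = self.get_state_fn(result, self.state)
        return result


def get_state_fn(function_result: Optional[Any], current_state: Optional[dict]) -> dict:
    if current_state is None:
        return {"log": []}          # initial state
    if function_result is None:
        return current_state
    return {"log": current_state["log"] + [function_result]}


worker = Worker(
    description="A tired traveller who likes to rest",
    action_space={"sit": lambda object_to_sit_on: f"Sat on the {object_to_sit_on}"},
    get_state_fn=get_state_fn,
)
worker.step("sit", object_to_sit_on="bench")
print(worker.state)  # {'log': ['Sat on the bench']}
```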
Executables are literally defined functions. The action space is a list of functions, and for each function you define the function name, function description, function arguments, and the executable. For example, a function called "sit" has the function description "take a seat", an argument like "object to sit on", and the executable that will run when the agent decides to call this function.
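The "sit" example could be written down roughly as follows. `FunctionDef` and `Argument` are hypothetical dataclasses used only to show the four parts of a function definition; the SDK's own class names and fields may differ.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Argument:
    name: str
    description: str


@dataclass
class FunctionDef:
    name: str                        # function name
    description: str                 # function description
    args: list[Argument]             # function arguments
    executable: Callable[..., Any]   # code that runs when the function is called


def sit(object_to_sit_on: str) -> str:
    return f"Sat on the {object_to_sit_on}"


sit_fn = FunctionDef(
    name="sit",
    description="take a seat",
    args=[Argument(name="object_to_sit_on", description="object to sit on")],
    executable=sit,
)

# The action space is simply a list of such definitions.
action_space = [sit_fn]

result = action_space[0].executable(object_to_sit_on="chair")
print(result)  # Sat on the chair
```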
What can you configure? You can configure both the functions you pass to the agent and the state you pass to the agent. As a developer, you have the creative freedom to decide what the agent can see, which is the state.
The worker decides which function to use based on the function name, function description, and function arguments. These elements serve as prompts - proper description of the function and its arguments is crucial for the agent to select actions appropriately. The executable itself isn’t passed to the agent; the agent only needs to understand what the function does through its description. This worker is essentially an agent. It takes in information, a state, and will decide an action to take. Then we execute that action.
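To make the point about prompts concrete, here is a minimal sketch of how only the name, description, and arguments might be serialized for the model while the executable stays on the developer's side. The dictionary layout and the `describe_action_space` helper are assumptions made for this example; the SDK's actual prompt format is not shown in this document.

```python
import json


def describe_action_space(action_space: list[dict]) -> str:
    """Serialize only the prompt-facing fields; the executable is left out."""
    visible = [
        {"name": fn["name"], "description": fn["description"], "args": fn["args"]}
        for fn in action_space
    ]
    return json.dumps(visible, indent=2)


action_space = [
    {
        "name": "sit",
        "description": "take a seat",
        "args": [{"name": "object_to_sit_on", "description": "object to sit on"}],
        # Never sent to the model; it only runs after the model picks "sit".
        "executable": lambda object_to_sit_on: f"Sat on the {object_to_sit_on}",
    }
]

prompt_fragment = describe_action_space(action_space)
print(prompt_fragment)
```

Because the model sees nothing but this text, a vague description or a poorly named argument directly reduces the quality of the action it selects.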
An action is always executable code, which we call a function. A worker takes a function and runs it, and running that function produces an output. You can do whatever you want with that output. One of the good things about our framework is its flexibility: you can write a "get state" function that takes the output of the function and uses it to change your state, and the updated state is then what the worker sees.
The "get state" function takes the result of the function and the current state. The current state is kept within the agent itself.
You have the function result, which is the output of the executable, and the current state, and from those you output what the new state will be.
The initial state is defined at the start. The "get state" function is called behind the scenes all the time, so you don't call it yourself. It outputs the state (first the initial state, then each new state), and the new state is what your agent sees.
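A "get state" function along these lines might look like the sketch below. The parameter names `function_result` and `current_state` follow the description above; the shape of the state dictionary and the convention that `current_state` is `None` on the very first call are assumptions made for this example.

```python
from typing import Optional

# Hypothetical initial state for an agent that can sit or stand.
INITIAL_STATE = {"position": "standing", "last_result": None}


def get_state_fn(function_result: Optional[dict], current_state: Optional[dict]) -> dict:
    """Return the state the agent will see next."""
    if current_state is None:
        # First call: nothing has run yet, so hand back the initial state.
        return dict(INITIAL_STATE)
    new_state = dict(current_state)
    if function_result is not None:
        # Fold the executable's output into the state.
        new_state["position"] = function_result.get("position", new_state["position"])
        new_state["last_result"] = function_result.get("message")
    return new_state


state = get_state_fn(None, None)
state = get_state_fn({"position": "sitting", "message": "Sat on the chair"}, state)
print(state)  # {'position': 'sitting', 'last_result': 'Sat on the chair'}
```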
In a locally running project, the state persists in local memory. In the case of a restart, the whole state will be wiped. For cloud-hosted projects, you would need to handle state persistence in an external database. If you want to maintain state completely separately, the "get state" function could read from a database and inject the result into the agent.
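One way to keep state outside the process is to have the "get state" function read from and write to a database. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in for an external database; the table name, the `agent_id` key, and the helper functions are all hypothetical.

```python
import json
import sqlite3
from typing import Callable, Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS agent_state (agent_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
    )
    return conn


def save_state(conn: sqlite3.Connection, agent_id: str, state: dict) -> None:
    conn.execute(
        "INSERT OR REPLACE INTO agent_state (agent_id, state) VALUES (?, ?)",
        (agent_id, json.dumps(state)),
    )
    conn.commit()


def load_state(conn: sqlite3.Connection, agent_id: str, default: dict) -> dict:
    row = conn.execute(
        "SELECT state FROM agent_state WHERE agent_id = ?", (agent_id,)
    ).fetchone()
    return json.loads(row[0]) if row else dict(default)


def make_get_state_fn(conn: sqlite3.Connection, agent_id: str, initial_state: dict) -> Callable:
    def get_state_fn(function_result: Optional[dict], current_state: Optional[dict]) -> dict:
        # The database, not process memory, is the source of truth.
        state = load_state(conn, agent_id, initial_state)
        if function_result is not None:
            state.update(function_result)
            save_state(conn, agent_id, state)
        return state

    return get_state_fn


conn = open_store()
get_state_fn = make_get_state_fn(conn, "agent-1", {"energy": 10})
get_state_fn({"energy": 9}, None)

# Simulate a restart: a fresh function object still sees the stored state.
restored = make_get_state_fn(conn, "agent-1", {"energy": 10})(None, None)
print(restored)  # {'energy': 9}
```

With a file path or a hosted database in place of `":memory:"`, the stored state would also survive a real process restart.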
For long-running agents where state becomes substantial over time, there are various approaches. With 1-2 years of state data, this information could be used to fine-tune a model. That requires a separate workflow, as this SDK doesn't directly support model fine-tuning. Larger models like GPT-4 can be difficult to use here, and some implementations might prefer smaller, more cost-effective models for specific use cases.
When creating an agent, you choose what functions it has, and therefore what actions can be taken in the environment. The state determines what can be seen in the environment. This is very important and is what makes this SDK flexible: we can configure what the agent sees and what the agent does. We provide the list of available functions or actions that the agent can take, and we provide what the agent can see. Together, these let the Large Language Model (LLM) decide what action to take. Take an action, execute it, and the loop continues.
With the new SDK, the developer has complete freedom. The developer can change the state, which is what the agent sees, and can change the getState function to control how the state is updated.
The developer decides what the functions are. Functions are no longer constrained to just API calls: a function can be any Python (or TypeScript) function. If your Python function is called "add_two_numbers" and takes two arguments, that can be a function. You can wrap API calls in it if you want, and then do something else in the same function. Under the hood, it's just an executable Python function.
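As an illustration of "any Python function can be a function", the sketch below pairs `add_two_numbers` with a definition and calls it through a small dispatcher. The dictionary layout and the `call_function` helper are assumptions made for this example rather than the SDK's own registration mechanism.

```python
def add_two_numbers(a: float, b: float) -> float:
    # A plain Python function; it could just as well call an API before returning.
    return a + b


add_fn = {
    "name": "add_two_numbers",
    "description": "Add two numbers and return the sum.",
    "args": [
        {"name": "a", "description": "first number"},
        {"name": "b", "description": "second number"},
    ],
    "executable": add_two_numbers,
}


def call_function(fn: dict, **kwargs):
    """Check the supplied arguments against the declared ones, then run the executable."""
    declared = {arg["name"] for arg in fn["args"]}
    unknown = set(kwargs) - declared
    if unknown:
        raise ValueError(f"Unknown arguments for {fn['name']}: {sorted(unknown)}")
    return fn["executable"](**kwargs)


print(call_function(add_fn, a=2, b=3))  # 5
```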
The HLP, or task generator, continuously provides workers with tasks. Instead of always interacting with the agent manually, we want the agent to operate autonomously, and the task generator makes that possible. The task generator is where you specify the goals of the agent (i.e., lifetime goals), and it continuously gives tasks to the workers. Instead of you giving a task manually, the HLP, which is another LLM call, gives the tasks and keeps updating them.
Due to the flexibility of our framework, you're not constrained to one worker. Having multiple workers is mainly a way to add a level of abstraction.
Imagine your agent has one worker, and that one worker already has 10 functions. That's a lot of things it can do already. The more functions there are, the more difficult it becomes for LLMs to understand all of them and execute them appropriately. In this case, you can split the functions up across different workers and give each worker a different set of actions.
For example, at the beach, you can only do certain things: swim, build a sandcastle, surf, drink water. In the library, you can climb up the ladder or read a book. There might be overlapping actions between workers, which is completely fine.
The point is to segment them so it's easier for these workers to execute the tasks they are given. The HLP not only decides what tasks to do but also picks which worker to give each task to.
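The beach and library split can be sketched as two workers with partly overlapping action sets. In this sketch a simple keyword match stands in for the HLP's LLM-based choice, so `pick_worker` and the `Worker` dataclass are hypothetical and only illustrate the routing idea.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    worker_id: str
    description: str
    actions: list[str]


beach_worker = Worker(
    "beach",
    "Handles activities at the beach",
    ["swim", "build_sandcastle", "surf", "drink_water"],
)
library_worker = Worker(
    "library",
    "Handles activities in the library",
    ["climb_ladder", "read_book", "drink_water"],  # overlapping actions are fine
)


def pick_worker(task: str, workers: list[Worker]) -> Worker:
    """Stand-in for the HLP's choice: first worker with an action named in the task."""
    normalized = task.lower().replace(" ", "_")
    for worker in workers:
        if any(action in normalized for action in worker.actions):
            return worker
    raise LookupError(f"No worker can handle task: {task!r}")


workers = [beach_worker, library_worker]
print(pick_worker("read book about tides", workers).worker_id)  # library
print(pick_worker("drink water", workers).worker_id)            # beach
```

Each worker only ever sees its own short action list, which is the reason for segmenting in the first place.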
For plugin development that can be reused across projects, the framework enables this through custom functions. A function consists of a definition (the function name, function description, and arguments) plus the executable. These components together form a shareable unit that can be uploaded to GitHub. The organization will be similar to the current SDK, with a dedicated functions folder.
In the next section, let's set up a simple demo project.