feat: Enable basic sandboxed tool run functionality #1938

Closed
wants to merge 7 commits into from
43 changes: 43 additions & 0 deletions letta/agent.py
@@ -5,6 +5,7 @@
from abc import ABC, abstractmethod
from typing import List, Literal, Optional, Tuple, Union

from e2b_code_interpreter import Sandbox
Collaborator Author
Do you know what file I'd need to modify if I wanted to ensure pip install e2b-code-interpreter gets run during the poetry install step?

from tqdm import tqdm

from letta.agent_store.storage import StorageConnector
@@ -651,6 +652,48 @@ def _handle_ai_response(

function_args["self"] = self  # need to attach self to arg since it's dynamically linked

# Execute tool in sandbox
if function_name in ["print_hello_world", "print_message"]:  # remove hardcode
    sbx = Sandbox()  # API key is read from the E2B_API_KEY environment variable
    code = ""

    # 1. Import dependencies
    # package = ""
    # sbx.commands.run(f"pip3 install {package}")

    # 2. Set Params (values are emitted as string literals, so only str arguments are supported)
    for param in function_args:
        if param != "self":
            code += param + ' = "' + function_args[param] + '"\n'

    # 3. Add function source code
    tool = [tool for tool in self.tools if tool.name == function_name][0]
    code += tool.source_code + "\n"

    # 4. Add function call
    code += function_name + "("

    # 5. Populate params for function call
    for param in function_args:
        if param != "self":
            code += param + ","

    code += ")"

    # 6. Execute code and extract result or throw
    execution = sbx.run_code(code)
    if execution.error is not None:
        raise execution.error
    elif len(execution.results) == 0:
        function_response = ""
    else:
        function_response = execution.results[0].text
Collaborator
is there any issue with typing here - e.g. if the function returns a list? does this assume the function is always returning a string?

Collaborator Author
e2b returns a list as a string and stores it in the text field, e.g.:

python3
>>> from e2b_code_interpreter import Sandbox
>>> sbx = Sandbox(api_key=<API_KEY>)
>>> e = sbx.run_code("""
... def func(a):
...     return [1] * a
... 
... func(5)
... """)
>>> e
Execution(Results: [Result([1, 1, 1, 1, 1])], Logs: Logs(stdout: [], stderr: []), Error: None)
>>> e.results[0].text
'[1, 1, 1, 1, 1]'

It seems like the primitive types will show up here, but there are more complex types that have their own field. Should be straightforward to extend as needed in the future.
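
If a caller ever needs the original Python value rather than its string form, one option is to parse the `text` field with `ast.literal_eval` and fall back to the raw string. This is only a sketch: `parse_result_text` is a hypothetical helper that is not part of this PR or of e2b, and it only handles values that round-trip through Python literal syntax.

```python
import ast


def parse_result_text(text: str):
    """Try to turn a result's text back into a Python value (hypothetical helper)."""
    try:
        # literal_eval only accepts literals (numbers, strings, lists, dicts, ...),
        # so it does not execute arbitrary code.
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Not a Python literal: keep the text as-is.
        return text


print(parse_result_text("[1, 1, 1, 1, 1]"))  # [1, 1, 1, 1, 1]
print(parse_result_text("hello world"))      # hello world
```

Falling back to the raw string keeps the current behaviour for anything that is not a plain literal.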


    # 7. Kill sandbox
    sbx.kill()
else:
    function_response = function_to_call(**function_args)

# NOTE: this pre-existing call still runs unconditionally, so it overwrites the sandbox result above
function_response = function_to_call(**function_args)
if function_name in ["conversation_search", "conversation_search_date", "archival_memory_search"]:
    # with certain functions we rely on the paging mechanism to handle overflow
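
Steps 2 to 5 in the hunk above build the script by string concatenation, which assumes every argument is a string with no embedded double quotes. A possible alternative, sketched here under assumptions, is to render each argument with `repr()` and pass it as a keyword argument. `build_sandbox_script` is a hypothetical helper, not code from this PR, and it only works for values whose `repr()` is a valid Python literal.

```python
def build_sandbox_script(function_name: str, source_code: str, function_args: dict) -> str:
    """Assemble a script that defines the tool and then calls it (hypothetical helper)."""
    # Drop the agent reference; it cannot be serialized into the sandbox.
    call_args = {k: v for k, v in function_args.items() if k != "self"}
    # repr() yields valid literals for str, int, float, bool, None,
    # and for lists/dicts/tuples made of those.
    rendered = ", ".join(f"{name}={value!r}" for name, value in call_args.items())
    return f"{source_code}\n\n{function_name}({rendered})\n"


source = "def print_message(message: str):\n    print(message)\n    return message\n"
script = build_sandbox_script(
    "print_message", source, {"self": object(), "message": 'say "hi"'}
)
print(script)
```

Because the call is the last expression in the script, a notebook-style executor can still report its value as the result.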
65 changes: 65 additions & 0 deletions tests/test_tools.py
@@ -162,3 +162,68 @@ def core_memory_clear(self: Agent):

def test_custom_import_tool(client):
    pass


def test_run_basic_tool_in_sandbox(client: Union[LocalClient, RESTClient]):
    """Test creation of a simple tool with no input params"""

    def print_hello_world():
        """
        Returns:
            str: A static string "hello world".

        """
        print("hello world")
        return "hello world"

    tools = client.list_tools()
    print(f"Original tools {[t.name for t in tools]}")

    tool = client.create_tool(print_hello_world, name="print_hello_world", tags=["extras"])

    tools = client.list_tools()
    assert tool in tools, f"Expected {tool.name} in {[t.name for t in tools]}"
    print(f"Updated tools {[t.name for t in tools]}")

    # check tool id
    fetched_tool = client.get_tool(tool.id)
    assert fetched_tool is not None, "Expected tool to be created"
    assert fetched_tool.id == tool.id, f"Expected {fetched_tool.id} to be {tool.id}"

    # create agent with tool
    agent_state = client.create_agent(tools=[tool.name])
    response = client.user_message(agent_id=agent_state.id, message="hi please use the tool called print_hello_world")


def test_run_tool_with_str_params_in_sandbox(client: Union[LocalClient, RESTClient]):
    """Test creation of a simple tool that relies on a provided string param"""

    def print_message(message: str):
        """
        Args:
            message (str): The message to print.

        Returns:
            str: The message that was printed.

        """
        print(message)
        return message

    tools = client.list_tools()
    print(f"Original tools {[t.name for t in tools]}")

    tool = client.create_tool(print_message, name="print_message", tags=["extras"])

    tools = client.list_tools()
    assert tool in tools, f"Expected {tool.name} in {[t.name for t in tools]}"
    print(f"Updated tools {[t.name for t in tools]}")

    # check tool id
    fetched_tool = client.get_tool(tool.id)
    assert fetched_tool is not None, "Expected tool to be created"
    assert fetched_tool.id == tool.id, f"Expected {fetched_tool.id} to be {tool.id}"

    # create agent with tool
    agent_state = client.create_agent(tools=[tool.name])
    response = client.user_message(agent_id=agent_state.id, message="hi please use the tool called print_message")
Collaborator Author
Right now, on an individual test run I can manually verify that the function is running in the sandboxed environment using the e2b debug logs:

DEBUG:httpcore.connection:close.started
DEBUG:httpcore.connection:close.complete
DEBUG:httpx:load_ssl_context verify=True cert=None trust_env=True http2=False
DEBUG:httpx:load_verify_locations cafile='/Users/carenthomas/Library/Caches/pypoetry/virtualenvs/letta-2EtcMsTd-py3.12/lib/python3.12/site-packages/certifi/cacert.pem'
DEBUG:e2b_code_interpreter.code_interpreter_sync:Executing code def print_hello_world():
    """
    Returns:
        str: A static string "Hello world".

    """
    print("hello world")
    return "hello world"

print_hello_world()
INFO:e2b.sandbox_sync.main:Request: POST https://49999-iq6tl0803m03zmbsfyu5j-dc35dfcb.e2b.dev/execute

This is not ideal: if the function doesn't run in the sandbox for whatever reason, the test suite will still consider the test passed. One option is to leave the sandbox running with a timeout so that I can still interact with it during the test after user_message returns, but I'd prefer that we consistently kill the server after execution to prevent future bugs, so I'm looking into other options that e2b suggests!
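
One way a test could check the sandbox path without keeping a live sandbox around is to substitute a mock for the sandbox and assert on the calls it receives. The sketch below is an assumption-laden illustration, not this PR's code: `run_code_in_sandbox` and `sandbox_factory` are hypothetical names, and the mock only mimics the `run_code`, `kill`, `error`, `results`, and `text` members used in the diff above. It also wraps the call in `try`/`finally` so the sandbox is killed even when execution fails.

```python
from unittest.mock import MagicMock


def run_code_in_sandbox(sandbox_factory, code: str) -> str:
    """Run code in a fresh sandbox and always kill it afterwards (hypothetical helper)."""
    sbx = sandbox_factory()
    try:
        execution = sbx.run_code(code)
        if execution.error is not None:
            raise RuntimeError(str(execution.error))
        if len(execution.results) == 0:
            return ""
        return execution.results[0].text
    finally:
        # Runs on success and on failure, so no sandbox is left behind.
        sbx.kill()


# Stand-ins for the real sandbox objects.
fake_result = MagicMock()
fake_result.text = "hello world"
fake_execution = MagicMock(error=None, results=[fake_result])
fake_sandbox = MagicMock()
fake_sandbox.run_code.return_value = fake_execution
factory = MagicMock(return_value=fake_sandbox)

output = run_code_in_sandbox(factory, "print_hello_world()")

# The test now fails if the sandbox path was skipped or the sandbox was leaked.
fake_sandbox.run_code.assert_called_once_with("print_hello_world()")
fake_sandbox.kill.assert_called_once()
print(output)  # hello world
```

With the code as written in this PR, the equivalent would presumably be patching `letta.agent.Sandbox` via `unittest.mock.patch`. Note that this only verifies that the sandbox path was taken, not that remote execution works, so it complements rather than replaces a live check.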
