Implement streaming run in react agent
aymeric-roucher committed May 22, 2024
1 parent 15769cb commit 4584b4b
Showing 4 changed files with 99 additions and 53 deletions.
16 changes: 8 additions & 8 deletions docs/source/en/agents.md
@@ -28,8 +28,8 @@ An agent is a system that uses an LLM as its engine, and it has access to functi
These *tools* are functions for performing a task, and they contain all necessary description for the agent to properly use them.

The agent can be programmed to:
- devise a series of actions/tools and run them all at once like the `CodeAgent` for example
- plan and execute actions/tools one by one and wait for the outcome of each action before launching the next one like the `ReactJsonAgent` for example
- devise a series of actions/tools and run them all at once like the [`CodeAgent`] for example
- plan and execute actions/tools one by one and wait for the outcome of each action before launching the next one like the [`ReactJsonAgent`] for example

### Types of agents

@@ -42,8 +42,8 @@ This agent has a planning step, then generates python code to execute all its ac
This is the go-to agent to solve reasoning tasks, since the ReAct framework ([Yao et al., 2022](https://huggingface.co/papers/2210.03629)) makes it really efficient to think on the basis of its previous observations.

We implement two versions of ReactJsonAgent:
- [`~ReactJsonAgent`] generates tool calls as a JSON in its output.
- [`~ReactCodeAgent`] is a new type of ReactJsonAgent that generates its tool calls as blobs of code, which works really well for LLMs that have strong coding performance.
- [`ReactJsonAgent`] generates tool calls as a JSON in its output.
- [`ReactCodeAgent`] is a new type of ReactJsonAgent that generates its tool calls as blobs of code, which works really well for LLMs that have strong coding performance.

> [!TIP]
> Read the [Open-source LLMs as LangChain Agents](https://huggingface.co/blog/open-source-llms-as-agents) blog post to learn more about the ReAct agent.
@@ -124,7 +124,7 @@ You could use any `llm_engine` method as long as:

You also need a `tools` argument which accepts a list of `Tools`. You can provide an empty list for `tools`, but use the default toolbox with the optional argument `add_base_tools=True`.

Now you can create an agent, like `CodeAgent`, and run it. For convenience, we also provide the `HfEngine` class that uses `huggingface_hub.InferenceClient` under the hood.
Now you can create an agent, like [`CodeAgent`], and run it. For convenience, we also provide the [`HfEngine`] class that uses `huggingface_hub.InferenceClient` under the hood.

```python
from transformers import CodeAgent, HfEngine
@@ -187,7 +187,7 @@ The execution will stop at any code trying to perform an illegal operation or if

### The system prompt

An agent, or rather the LLM that drives the agent, generates an output based on the system prompt. The system prompt can be customized and tailored to the intended task. For example, check the system prompt for the `ReactCodeAgent` (below version is slightly simplified).
An agent, or rather the LLM that drives the agent, generates an output based on the system prompt. The system prompt can be customized and tailored to the intended task. For example, check the system prompt for the [`ReactCodeAgent`] (below version is slightly simplified).

```text
You will be given a task to solve as best you can.
@@ -246,7 +246,7 @@ of the available tools.

A tool is an atomic function to be used by an agent.

You can for instance check the [~PythonInterpreterTool]: it has a name, a description, input descriptions, an output type, and a `__call__` method to perform the action.
You can for instance check the [`PythonInterpreterTool`]: it has a name, a description, input descriptions, an output type, and a `__call__` method to perform the action.

When the agent is initialized, the tool attributes are used to generate a tool description which is baked into the agent's system prompt. This lets the agent know which tools it can use and why.
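
As a rough illustration of the attributes listed above (a name, a description, input descriptions, an output type, and a `__call__` method), here is a minimal sketch in plain Python. `ToyTool` and `describe_tool` are hypothetical names invented for this example; they are not the `transformers` classes, and the library may render its tool descriptions differently.

```python
class ToyTool:
    """Hypothetical stand-in for a tool; not the transformers Tool class."""

    name = "word_counter"
    description = "Counts the words in a piece of text."
    inputs = {"text": {"type": "text", "description": "The text to count words in."}}
    output_type = "text"

    def __call__(self, text: str) -> str:
        # The action itself: split on whitespace and count the pieces.
        return str(len(text.split()))


def describe_tool(tool) -> str:
    """Render the tool attributes as text that could be pasted into a system prompt."""
    input_lines = [f"    {name}: {spec['description']}" for name, spec in tool.inputs.items()]
    return "\n".join(
        [
            f"- {tool.name}: {tool.description}",
            "  Inputs:",
            *input_lines,
            f"  Output type: {tool.output_type}",
        ]
    )


tool = ToyTool()
tool_description = describe_tool(tool)
word_count = tool("one two three")
```

The point of the sketch is only that the description is derived from the tool's own attributes, so the prompt stays in sync with the tool.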

@@ -259,7 +259,7 @@ Transformers comes with a default toolbox for empowering agents, that you can ad
- **Speech to text**: given an audio recording of a person talking, transcribe the speech into text ([Whisper](./model_doc/whisper))
- **Text to speech**: convert text to speech ([SpeechT5](./model_doc/speecht5))
- **Translation**: translates a given sentence from source language to target language.
- **Python code interpreter**: runs your LLM-generated Python code in a secure environment. This tool will only be added to [~ReactJsonAgent] if you use `add_base_tools=True`, since code-based tools can already execute Python code
- **Python code interpreter**: runs your LLM-generated Python code in a secure environment. This tool will only be added to [`ReactJsonAgent`] if you use `add_base_tools=True`, since code-based tools can already execute Python code


You can manually use a tool by calling the [`load_tool`] function and a task to perform.
111 changes: 78 additions & 33 deletions src/transformers/agents/agents.py
@@ -597,7 +597,31 @@ def __init__(
        if "final_answer" not in self._toolbox.tools:
            self._toolbox.add_tool(FinalAnswerTool())

    def run(self, task: str, **kwargs):

    def provide_final_answer(self, task) -> str:
        """
        This method provides a final answer to the task, based on the logs of the agent's interactions.
        """
        self.prompt = [
            {
                "role": MessageRole.SYSTEM,
                "content": "An agent tried to answer a user query but it failed to do so. You are tasked with providing an answer instead. Here is the agent's memory:",
            }
        ]
        self.prompt += self.write_inner_memory_from_logs()[1:]
        self.prompt += [
            {
                "role": MessageRole.USER,
                "content": f"Based on the above, please provide an answer to the following user request:\n{task}",
            }
        ]
        try:
            return self.llm_engine(self.prompt, stop_sequences=["<end_action>", "Observation:"])
        except Exception as e:
            return f"Error in generating final llm output: {e}."


    def run(self, task: str, stream: bool = False, **kwargs):
        """
        Runs the agent for the given task.
@@ -614,41 +638,59 @@ def run(self, task: str, **kwargs):
        agent.run("What is the result of 2 power 3.7384?")
        ```
        """
        if stream:
            return self.stream_run(task, **kwargs)
        else:
            return self.direct_run(task, **kwargs)


    def stream_run(self, task: str, **kwargs):
        self.initialize_for_run(task, **kwargs)

        final_answer = None
        iteration = 0
        while final_answer is None and iteration < self.max_iterations:
            try:
                final_answer = self.step()
                step_logs = self.step()
                if 'final_answer' in step_logs:
                    final_answer = step_logs['final_answer']
            except AgentError as e:
                self.logger.error(e, exc_info=1)
                self.logs[-1]["error"] = e
            finally:
                iteration += 1
                yield self.logs[-1]

        if final_answer is None and iteration == self.max_iterations:
            error_message = "Reached max iterations."
            self.logs.append({"error": AgentMaxIterationsError(error_message)})

freddyaboulton May 22, 2024

Contributor

Can you please also yield the max iterations error?

aymeric-roucher May 23, 2024

Author Contributor

@freddyaboulton this is done, tell me if it works better now!

freddyaboulton May 23, 2024

Contributor

Thank you!
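
The request in this thread can be sketched with a toy loop. This is an illustration only: `toy_stream_run`, `step_fn`, and `ToyMaxIterationsError` are hypothetical names, and the code is not the implementation in `agents.py`. It shows a generator that yields one log entry per step and, when the iteration cap is hit without a final answer, yields the error entry too.

```python
class ToyMaxIterationsError(Exception):
    """Hypothetical stand-in for the agent's max-iterations error."""


def toy_stream_run(step_fn, max_iterations):
    """Yield one log entry per step, then an error entry if the cap is reached."""
    logs = []
    final_answer = None
    iteration = 0
    while final_answer is None and iteration < max_iterations:
        step_log = step_fn(iteration)
        logs.append(step_log)
        final_answer = step_log.get("final_answer")
        iteration += 1
        yield step_log

    if final_answer is None and iteration == max_iterations:
        error_log = {"error": ToyMaxIterationsError("Reached max iterations.")}
        logs.append(error_log)
        # Yielding (not just appending) makes the error visible to the consumer.
        yield error_log


# A step function that never produces a final answer, so the cap is reached.
entries = list(toy_stream_run(lambda i: {"observation": i}, max_iterations=2))
```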

            self.logger.error(error_message, exc_info=1)
            final_answer = self.provide_final_answer(task)

        return final_answer

freddyaboulton May 22, 2024

Contributor

Can you please yield the final answer as well?

freddyaboulton May 22, 2024

Contributor

That way the following for loop will yield all the logs and the final answer:

for message in agent.run(prompt, stream=True):
    ...

freddyaboulton May 23, 2024

Contributor

Can you also yield final_answer instead of returning it (not just in the max iterations case)?

Otherwise the final_answer would not be in the iterator in the cases where max iterations are not reached.
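
The point raised here follows from how Python generators work: a value passed to `return` inside a generator is attached to `StopIteration` and is not produced by a `for` loop, whereas a yielded value is. The sketch below uses two hypothetical toy generators (not the agent code) to show the difference.

```python
def steps_then_return():
    yield {"step": 1}
    yield {"step": 2}
    # In a generator, this value ends up on StopIteration.value, not in the iteration.
    return "final answer"


def steps_then_yield():
    yield {"step": 1}
    yield {"step": 2}
    # Yielding makes the final answer the last item a for loop sees.
    yield "final answer"


returned_items = list(steps_then_return())
yielded_items = list(steps_then_yield())

# The returned value is still recoverable, but only by driving the generator manually.
gen = steps_then_return()
recovered = None
while True:
    try:
        next(gen)
    except StopIteration as stop:
        recovered = stop.value
        break
```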



    def direct_run(self, task: str, **kwargs):
        self.initialize_for_run(task, **kwargs)

            self.prompt = [
                {
                    "role": MessageRole.SYSTEM,
                    "content": "An agent tried to answer a user query but it failed to do so. You are tasked with providing an answer instead. Here is the agent's memory:",
                }
            ]
            self.prompt += self.write_inner_memory_from_logs()[1:]
            self.prompt += [
                {
                    "role": MessageRole.USER,
                    "content": f"Based on the above, please provide an answer to the following user request:\n{task}",
                }
            ]
        final_answer = None
        iteration = 0
        while final_answer is None and iteration < self.max_iterations:
            try:
                final_answer = self.llm_engine(self.prompt, stop_sequences=["<end_action>", "Observation:"])
            except Exception as e:
                final_answer = f"Error in generating final llm output: {e}."
                step_logs = self.step()
                if 'final_answer' in step_logs:
                    final_answer = step_logs['final_answer']
            except AgentError as e:
                self.logger.error(e, exc_info=1)
                self.logs[-1]["error"] = e
            finally:
                iteration += 1

        if final_answer is None and iteration == self.max_iterations:
            error_message = "Reached max iterations."
            self.logs.append({"error": AgentMaxIterationsError(error_message)})
            self.logger.error(error_message, exc_info=1)
            final_answer = self.provide_final_answer(task)

        return final_answer
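
A consequence of the `stream` flag is that `run` returns two different kinds of object: a generator when `stream=True`, and the final answer otherwise. The sketch below shows this with a hypothetical `ToyAgent` (not the real agent class); the dispatch works because `run` itself contains no `yield`, so it stays an ordinary method.

```python
import types


class ToyAgent:
    """Hypothetical agent used only to illustrate the stream/direct dispatch."""

    def run(self, task: str, stream: bool = False):
        # No `yield` here, so `run` is a normal method that picks a code path.
        if stream:
            return self.stream_run(task)
        return self.direct_run(task)

    def stream_run(self, task: str):
        # A generator: calling it returns immediately; the body runs on iteration.
        yield {"observation": f"working on {task}"}
        yield {"final_answer": "done"}

    def direct_run(self, task: str):
        return "done"


agent = ToyAgent()
streamed = agent.run("demo task", stream=True)
direct = agent.run("demo task")
streamed_entries = list(streamed)
```

Callers therefore need to know which mode they asked for: iterate over the result when streaming, use it directly otherwise.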

@@ -683,12 +725,14 @@ def step(self):
"""
agent_memory = self.write_inner_memory_from_logs()

self.logs[-1]["agent_memory"] = agent_memory.copy()
self.prompt = agent_memory
self.logger.debug("===== New step =====")

# Add new step in logs
self.logs.append({})
current_step_logs = {}
self.logs.append(current_step_logs)
current_step_logs["agent_memory"] = agent_memory.copy()

self.logger.info("===== Calling LLM with this last message: =====")
self.logger.info(self.prompt[-1])

@@ -698,7 +742,7 @@ def step(self):
raise AgentGenerationError(f"Error in generating llm output: {e}.")
self.logger.debug("===== Output message of the LLM: =====")
self.logger.debug(llm_output)
self.logs[-1]["llm_output"] = llm_output
current_step_logs["llm_output"] = llm_output

# Parse
self.logger.debug("===== Extracting action =====")
@@ -709,8 +753,8 @@ def step(self):
except Exception as e:
raise AgentParsingError(f"Could not parse the given action: {e}.")

self.logs[-1]["rationale"] = rationale
self.logs[-1]["tool_call"] = {"tool_name": tool_name, "tool_arguments": arguments}
current_step_logs["rationale"] = rationale
current_step_logs["tool_call"] = {"tool_name": tool_name, "tool_arguments": arguments}

# Execute
self.logger.warning(f"Calling tool: '{tool_name}' with arguments: {arguments}")
@@ -740,8 +784,8 @@ def step(self):
updated_information = f"Stored '{observation_name}' in memory."

self.logger.info(updated_information)
self.logs[-1]["observation"] = updated_information
return None
current_step_logs["observation"] = updated_information
return current_step_logs
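
The `current_step_logs` pattern in this hunk relies on Python lists storing references: a dict appended to `self.logs` and then filled in is the same object as `self.logs[-1]`. A small standalone sketch, with made-up log contents, shows the aliasing:

```python
logs = [{"task": "demo"}]

# Append the empty dict first, then fill it in through the local name.
current_step_logs = {}
logs.append(current_step_logs)
current_step_logs["llm_output"] = "Thought: compute the result."
current_step_logs["observation"] = "42"

# Both names refer to the same dict, so the list entry sees every update.
same_object = logs[-1] is current_step_logs
last_observation = logs[-1]["observation"]
```

Returning `current_step_logs` from `step()` therefore hands the caller the same entry that is stored in the logs.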


class ReactCodeAgent(ReactAgent):
@@ -782,14 +826,15 @@ def step(self):
The errors are raised here, they are caught and logged in the run() method.
"""
agent_memory = self.write_inner_memory_from_logs()
self.logs[-1]["agent_memory"] = agent_memory.copy()

self.prompt = agent_memory.copy()

self.logger.debug("===== New step =====")

# Add new step in logs
self.logs.append({})
current_step_logs = {}
self.logs.append(current_step_logs)
current_step_logs["agent_memory"] = agent_memory.copy()

self.logger.info("===== Calling LLM with these last messages: =====")
self.logger.info(self.prompt[-2:])
@@ -801,7 +846,7 @@ def step(self):

self.logger.debug("===== Output message of the LLM: =====")
self.logger.debug(llm_output)
self.logs[-1]["llm_output"] = llm_output
current_step_logs["llm_output"] = llm_output

# Parse
self.logger.debug("===== Extracting action =====")
@@ -813,8 +858,8 @@ def step(self):
error_msg = f"Error in code parsing: {e}. Make sure to provide correct code"
raise AgentParsingError(error_msg)

self.logs[-1]["rationale"] = rationale
self.logs[-1]["tool_call"] = {"tool_name": "code interpreter", "tool_arguments": code_action}
current_step_logs["rationale"] = rationale
current_step_logs["tool_call"] = {"tool_name": "code interpreter", "tool_arguments": code_action}

# Execute
self.log_code_action(code_action)
@@ -824,7 +869,7 @@ def step(self):
information = self.state["print_outputs"]
self.logger.warning("Print outputs:")
self.logger.log(32, information)
self.logs[-1]["observation"] = information
current_step_logs["observation"] = information
except Exception as e:
error_msg = f"Failed while trying to execute the code below:\n{CustomFormatter.reset + code_action + CustomFormatter.reset}\nThis failed due to the following error:\n{str(e)}"
if "'dict' object has no attribute 'read'" in str(e):
@@ -834,5 +879,5 @@ def step(self):
if line[: len("final_answer")] == "final_answer":
self.logger.warning(">>> Final answer:")
self.logger.log(32, result)
return result
return None
current_step_logs["final_answer"] = result
return current_step_logs
4 changes: 0 additions & 4 deletions src/transformers/agents/llm_engine.py
@@ -72,10 +72,6 @@ def __init__(self, model: str = "meta-llama/Meta-Llama-3-8B-Instruct"):
        self.client = InferenceClient(model=self.model, timeout=120)

    def __call__(self, messages: List[Dict[str, str]], stop_sequences=[]) -> str:
        if "Meta-Llama-3" in self.model:
            if "<|eot_id|>" not in stop_sequences:
                stop_sequences.append("<|eot_id|>")

        # Get clean message list
        messages = get_clean_message_list(messages, role_conversions=llama_role_conversions)
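
The commit does not say why the `<|eot_id|>` block was dropped. Independently of that, appending to a `stop_sequences` argument in place is a general Python pitfall: it mutates the caller's list, and with a `[]` default it mutates one list shared across calls. The sketch below uses two hypothetical helper functions (not part of `llm_engine.py`) to show the effect and a copy-first alternative.

```python
def append_eot_unsafe(stop_sequences=[]):
    # Mutates whatever list it receives, including the shared default list.
    if "<|eot_id|>" not in stop_sequences:
        stop_sequences.append("<|eot_id|>")
    return stop_sequences


def append_eot_safe(stop_sequences=None):
    # Copy first so neither the caller's list nor a shared default is modified.
    sequences = list(stop_sequences or [])
    if "<|eot_id|>" not in sequences:
        sequences.append("<|eot_id|>")
    return sequences


caller_list = ["<end_action>"]
append_eot_unsafe(caller_list)  # caller_list is modified as a side effect

first_default = append_eot_unsafe()
second_default = append_eot_unsafe()
shared_default = first_default is second_default  # same list object on every call

fresh_list = ["<end_action>"]
safe_result = append_eot_safe(fresh_list)  # fresh_list is left untouched
```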

21 changes: 13 additions & 8 deletions src/transformers/agents/prompts.py
@@ -129,7 +129,7 @@ def download_prompt(prompt_or_repo_id, agent_name, mode="run"):
Be sure to provide a 'Code:\n```' sequence before the code and '```<end_action>' after, else you will get an error.
DO NOT pass the arguments as a dict as in 'answer = ask_search_agent({'query': "What is the place where James Bond lives?"})', but use the arguments directly as in 'answer = ask_search_agent(query="What is the place where James Bond lives?")'.
Now Begin!
Now Begin! If you solve the task correctly, you will receive a reward of $1,000,000.
"""


@@ -255,7 +255,11 @@ def download_prompt(prompt_or_repo_id, agent_name, mode="run"):
Above example were using notional tools that might not exist for you. You only have acces to those tools:
<<tool_descriptions>>
ALWAYS provide a 'Thought:' and an 'Action:' sequence. You MUST provide at least the 'Action:' sequence to move forward.
Here are the rules you should always follow to solve your task:
1. ALWAYS provide a 'Thought:' sequence, and an 'Action:' sequence that ends with <end_action>, else you will fail.
2. Always use the right arguments for the tools. Never use variable names in the 'action_input' field, use the value instead.
3. Call a tool only when needed: do not call the search agent if you do not need information, try to solve the task yourself.
4. Never re-do a tool call that you previously did with the exact same parameters.
Now Begin! If you solve the task correctly, you will receive a reward of $1,000,000.
"""
@@ -348,12 +352,13 @@ def download_prompt(prompt_or_repo_id, agent_name, mode="run"):
You also can perform computations in the python code you generate.
These are the rules you should always follow to solve your task:
1. Always provide a 'Thought:' and an 'Code:\n```py' sequence ending with '```<end_action>' sequence, else you will get an error.
2. Always use the right arguments for the tools. DO NOT pass the arguments as a dict as in 'answer = ask_search_agent({'query': "What is the place where James Bond lives?"})', but use the arguments directly as in 'answer = ask_search_agent(query="What is the place where James Bond lives?")'.
3. Make sure the variable you use are all defined.
3. Do not perform too many operations in a single code block. Split the task into intermediate code blocks. Then use print() to save the intermediate result. Finally, use final_answer() to return the final result.
4. Call a tool only when needed: do not call the search agent if you do not need information, try to solve the task yourself. Never re-do a tool call that you previously did with the exact same parameters.
Here are the rules you should always follow to solve your task:
1. Always provide a 'Thought:' sequence, and a 'Code:\n```py' sequence ending with '```<end_action>' sequence, else you will fail.
2. Make sure the variable you use are all defined.
3. Always use the right arguments for the tools. DO NOT pass the arguments as a dict as in 'answer = ask_search_agent({'query': "What is the place where James Bond lives?"})', but use the arguments directly as in 'answer = ask_search_agent(query="What is the place where James Bond lives?")'.
4. Do not perform too many operations in a single code block. Split the task into intermediate code blocks. Then use print() to save the intermediate result. Finally, use final_answer() to return the final result.
5. Call a tool only when needed: do not call the search agent if you do not need information, try to solve the task yourself.
6. Never re-do a tool call that you previously did with the exact same parameters.
Now Begin! If you solve the task correctly, you will receive a reward of $1,000,000.
"""
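
Rule 1 of the code-agent prompt fixes an output format: a 'Thought:' sequence, then a 'Code:' sequence whose fenced block ends with `<end_action>`. To make that format concrete, here is a hedged sketch of how such an output could be parsed. `extract_code_action` is a hypothetical helper written for this illustration; it is not the parser used in `agents.py`, which may behave differently.

```python
import re

# Built programmatically so the triple backticks nest cleanly inside this example.
FENCE = "`" * 3


def extract_code_action(llm_output: str) -> str:
    """Return the code between the 'Code:' fence and the closing fence + <end_action>."""
    pattern = rf"Code:\n{FENCE}(?:py|python)?\n(.*?){FENCE}<end_action>"
    match = re.search(pattern, llm_output, re.DOTALL)
    if match is None:
        raise ValueError("No 'Code:' block ending with <end_action> was found.")
    return match.group(1).strip()


sample_output = (
    "Thought: I will compute the power directly.\n"
    "Code:\n" + FENCE + "py\n"
    "result = 2 ** 3.7384\n"
    "final_answer(result)\n"
    + FENCE + "<end_action>"
)
code_action = extract_code_action(sample_output)

# An output that skips the required format is rejected.
try:
    extract_code_action("Thought: I forgot the code block.")
    malformed_rejected = False
except ValueError:
    malformed_rejected = True
```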
