
Add streaming, various fixes #30838

Merged 11 commits on May 31, 2024
16 changes: 8 additions & 8 deletions docs/source/en/agents.md
@@ -28,8 +28,8 @@ An agent is a system that uses an LLM as its engine, and it has access to functions called *tools*.
These *tools* are functions for performing a task, and they contain all necessary description for the agent to properly use them.

The agent can be programmed to:
- devise a series of actions/tools and run them all at once, like the `CodeAgent`
- plan and execute actions/tools one by one, waiting for the outcome of each action before launching the next one, like the `ReactJsonAgent`
- devise a series of actions/tools and run them all at once, like the [`CodeAgent`]
- plan and execute actions/tools one by one, waiting for the outcome of each action before launching the next one, like the [`ReactJsonAgent`]
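The two strategies can be contrasted with a toy sketch in plain Python (not the transformers API; all names here are illustrative, with bare functions standing in for tools):

```python
# Toy contrast of the two execution strategies described above.
def run_all_at_once(actions):
    # CodeAgent-style: the whole plan is produced first, then executed in one go.
    return [action() for action in actions]

def run_step_by_step(choose_next_action, max_steps=10):
    # ReAct-style: pick one action, observe its outcome, then decide the next.
    observations = []
    for _ in range(max_steps):
        action = choose_next_action(observations)
        if action is None:  # the agent decides it is done
            break
        observations.append(action())
    return observations
```

The step-by-step loop trades extra LLM calls for the ability to condition each action on the previous observation, which is what makes the ReAct framework effective on reasoning tasks.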

### Types of agents

@@ -42,8 +42,8 @@ This agent has a planning step, then generates python code to execute all its actions at once.
This is the go-to agent to solve reasoning tasks, since the ReAct framework ([Yao et al., 2022](https://huggingface.co/papers/2210.03629)) makes it really efficient to think on the basis of its previous observations.

We implement two versions of ReactJsonAgent:
- [`~ReactJsonAgent`] generates tool calls as a JSON in its output.
- [`~ReactCodeAgent`] is a new type of ReactJsonAgent that generates its tool calls as blobs of code, which works really well for LLMs that have strong coding performance.
- [`ReactJsonAgent`] generates tool calls as a JSON in its output.
- [`ReactCodeAgent`] is a new type of ReactJsonAgent that generates its tool calls as blobs of code, which works really well for LLMs that have strong coding performance.

> [!TIP]
> Read the [Open-source LLMs as LangChain Agents](https://huggingface.co/blog/open-source-llms-as-agents) blog post to learn more about the ReAct agent.
@@ -124,7 +124,7 @@ You could use any `llm_engine` method as long as:

You also need a `tools` argument which accepts a list of `Tools`. You can provide an empty list for `tools`, or use the default toolbox by passing the optional argument `add_base_tools=True`.
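For illustration, a minimal custom engine satisfying this interface might look like the following sketch (the `custom_llm_engine` name and echo behavior are made up for the example; a real engine would call an actual model):

```python
# Sketch of a custom engine: a callable that takes a list of
# {"role": ..., "content": ...} message dicts (plus optional stop
# sequences) and returns a string.
def custom_llm_engine(messages, stop_sequences=None):
    # A real engine would query an LLM here; we just echo the last message.
    reply = "You said: " + messages[-1]["content"]
    # Honor stop sequences by truncating the output, as the agent expects.
    for stop in stop_sequences or []:
        if stop in reply:
            reply = reply.split(stop)[0]
    return reply
```

Any callable with this shape can be passed as `llm_engine` when constructing an agent.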

Now you can create an agent, like `CodeAgent`, and run it. For convenience, we also provide the `HfEngine` class that uses `huggingface_hub.InferenceClient` under the hood.
Now you can create an agent, like [`CodeAgent`], and run it. For convenience, we also provide the [`HfEngine`] class that uses `huggingface_hub.InferenceClient` under the hood.

```python
from transformers import CodeAgent, HfEngine
@@ -187,7 +187,7 @@ The execution will stop at any code trying to perform an illegal operation or if

### The system prompt

An agent, or rather the LLM that drives the agent, generates an output based on the system prompt. The system prompt can be customized and tailored to the intended task. For example, check the system prompt for the `ReactCodeAgent` (below version is slightly simplified).
An agent, or rather the LLM that drives the agent, generates an output based on the system prompt. The system prompt can be customized and tailored to the intended task. For example, check the system prompt for the [`ReactCodeAgent`] (below version is slightly simplified).

```text
You will be given a task to solve as best you can.
@@ -246,7 +246,7 @@ of the available tools.

A tool is an atomic function to be used by an agent.

You can for instance check the [~PythonInterpreterTool]: it has a name, a description, input descriptions, an output type, and a `__call__` method to perform the action.
You can for instance check the [`PythonInterpreterTool`]: it has a name, a description, input descriptions, an output type, and a `__call__` method to perform the action.

When the agent is initialized, the tool attributes are used to generate a tool description which is baked into the agent's system prompt. This lets the agent know which tools it can use and why.
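As a sketch of that idea (a hypothetical class, not the actual transformers `Tool` base class), a tool's attributes can be rendered into a system-prompt line like so:

```python
# Hypothetical minimal tool mirroring the attributes described above.
class AdditionTool:
    name = "adder"
    description = "Adds two numbers and returns their sum."
    inputs = {"a": "first number", "b": "second number"}
    output_type = "number"

    def __call__(self, a, b):
        return a + b

def describe_tool(tool):
    # Sketch of baking tool attributes into a system-prompt description.
    args = ", ".join(f"{k} ({v})" for k, v in tool.inputs.items())
    return f"- {tool.name}: {tool.description} Inputs: {args}. Returns: {tool.output_type}."
```

The agent never sees the tool's code, only the generated description, which is why accurate names and descriptions matter so much.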

@@ -259,7 +259,7 @@ Transformers comes with a default toolbox for empowering agents that you can add
- **Speech to text**: given an audio recording of a person talking, transcribe the speech into text ([Whisper](./model_doc/whisper))
- **Text to speech**: convert text to speech ([SpeechT5](./model_doc/speecht5))
- **Translation**: translates a given sentence from source language to target language.
- **Python code interpreter**: runs the LLM-generated Python code in a secure environment. This tool will only be added to [~ReactJsonAgent] if you use `add_base_tools=True`, since code-based tools can already execute Python code.
- **Python code interpreter**: runs the LLM-generated Python code in a secure environment. This tool will only be added to [`ReactJsonAgent`] if you use `add_base_tools=True`, since code-based tools can already execute Python code.


You can manually use a tool by calling the [`load_tool`] function with a task to perform.
124 changes: 86 additions & 38 deletions src/transformers/agents/agents.py
@@ -347,6 +347,7 @@ def toolbox(self) -> Toolbox:
return self._toolbox

def initialize_for_run(self, task: str, **kwargs):
self.token_count = 0
self.task = task
if len(kwargs) > 0:
self.task += f"\nYou have been provided with these initial arguments: {str(kwargs)}."
@@ -544,7 +545,7 @@ def run(self, task: str, return_generated_code: bool = False, **kwargs):
self.prompt = [prompt_message, task_message]
self.logger.info("====Executing with this prompt====")
self.logger.info(self.prompt)
llm_output = self.llm_engine(self.prompt, stop_sequences=["<end_code>"])
llm_output = self.llm_engine(self.prompt, stop_sequences=["<end_action>"])

if return_generated_code:
return llm_output
@@ -597,7 +598,29 @@ def __init__(
if "final_answer" not in self._toolbox.tools:
self._toolbox.add_tool(FinalAnswerTool())

def run(self, task: str, **kwargs):
def provide_final_answer(self, task) -> str:
"""
This method provides a final answer to the task, based on the logs of the agent's interactions.
"""
self.prompt = [
{
"role": MessageRole.SYSTEM,
"content": "An agent tried to answer a user query but it got stuck and failed to do so. You are tasked with providing an answer instead. Here is the agent's memory:",
}
]
self.prompt += self.write_inner_memory_from_logs()[1:]
self.prompt += [
{
"role": MessageRole.USER,
"content": f"Based on the above, please provide an answer to the following user request:\n{task}",
}
]
try:
return self.llm_engine(self.prompt)
except Exception as e:
return f"Error in generating final llm output: {e}."

def run(self, task: str, stream: bool = False, **kwargs):
"""
Runs the agent for the given task.

@@ -614,41 +637,62 @@ def run(self, task: str, **kwargs):
agent.run("What is the result of 2 power 3.7384?")
```
"""
if stream:
return self.stream_run(task, **kwargs)
else:
return self.direct_run(task, **kwargs)

def stream_run(self, task: str, **kwargs):
self.initialize_for_run(task, **kwargs)

final_answer = None
iteration = 0
while final_answer is None and iteration < self.max_iterations:
try:
final_answer = self.step()
step_logs = self.step()
if "final_answer" in step_logs:
final_answer = step_logs["final_answer"]
except AgentError as e:
self.logger.error(e, exc_info=1)
self.logs[-1]["error"] = e
finally:
iteration += 1
yield self.logs[-1]

if final_answer is None and iteration == self.max_iterations:
error_message = "Reached max iterations."
self.logs.append({"error": AgentMaxIterationsError(error_message)})
final_step_log = {"error": AgentMaxIterationsError(error_message)}
self.logs.append(final_step_log)
self.logger.error(error_message, exc_info=1)
final_answer = self.provide_final_answer(task)
final_step_log["final_answer"] = final_answer
yield final_step_log

yield final_answer

def direct_run(self, task: str, **kwargs):
self.initialize_for_run(task, **kwargs)

self.prompt = [
{
"role": MessageRole.SYSTEM,
"content": "An agent tried to answer a user query but it failed to do so. You are tasked with providing an answer instead. Here is the agent's memory:",
}
]
self.prompt += self.write_inner_memory_from_logs()[1:]
self.prompt += [
{
"role": MessageRole.USER,
"content": f"Based on the above, please provide an answer to the following user request:\n{task}",
}
]
final_answer = None
iteration = 0
while final_answer is None and iteration < self.max_iterations:
try:
final_answer = self.llm_engine(self.prompt, stop_sequences=["Observation:"])
except Exception as e:
final_answer = f"Error in generating final llm output: {e}."
step_logs = self.step()
if "final_answer" in step_logs:
final_answer = step_logs["final_answer"]
except AgentError as e:
self.logger.error(e, exc_info=1)
self.logs[-1]["error"] = e
finally:
iteration += 1

if final_answer is None and iteration == self.max_iterations:
error_message = "Reached max iterations."
final_step_log = {"error": AgentMaxIterationsError(error_message)}
self.logs.append(final_step_log)
self.logger.error(error_message, exc_info=1)
final_answer = self.provide_final_answer(task)
final_step_log["final_answer"] = final_answer

return final_answer
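The `run`/`stream_run`/`direct_run` split above can be illustrated with a stripped-down sketch (toy step logic and three hard-coded steps; not the actual agent implementation):

```python
# Toy version of the streaming vs. direct dispatch pattern: stream_run is a
# generator yielding one log dict per step, direct_run consumes the same
# steps internally and returns only the final answer.
def stream_run(task):
    for i in range(3):
        entry = {"step": i}
        if i == 2:  # last step carries the answer
            entry["final_answer"] = f"answer to {task!r}"
        yield entry  # the caller sees each step's log as it is produced

def direct_run(task):
    for entry in stream_run(task):
        if "final_answer" in entry:
            return entry["final_answer"]

def run(task, stream=False):
    # Same dispatch as the PR: stream=True returns a generator of step logs.
    return stream_run(task) if stream else direct_run(task)
```

This is why `step()` now returns the per-step log dict instead of the final answer alone: the streaming path can yield it verbatim, while the direct path just inspects it for a `"final_answer"` key.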

@@ -683,22 +727,24 @@ def step(self):
"""
agent_memory = self.write_inner_memory_from_logs()

self.logs[-1]["agent_memory"] = agent_memory.copy()
self.prompt = agent_memory
self.logger.debug("===== New step =====")

# Add new step in logs
self.logs.append({})
current_step_logs = {}
self.logs.append(current_step_logs)
current_step_logs["agent_memory"] = agent_memory.copy()

self.logger.info("===== Calling LLM with this last message: =====")
self.logger.info(self.prompt[-1])

try:
llm_output = self.llm_engine(self.prompt, stop_sequences=["Observation:"])
llm_output = self.llm_engine(self.prompt, stop_sequences=["<end_action>", "Observation:"])
except Exception as e:
raise AgentGenerationError(f"Error in generating llm output: {e}.")
self.logger.debug("===== Output message of the LLM: =====")
self.logger.debug(llm_output)
self.logs[-1]["llm_output"] = llm_output
current_step_logs["llm_output"] = llm_output

# Parse
self.logger.debug("===== Extracting action =====")
@@ -709,8 +755,8 @@ def step(self):
except Exception as e:
raise AgentParsingError(f"Could not parse the given action: {e}.")

self.logs[-1]["rationale"] = rationale
self.logs[-1]["tool_call"] = {"tool_name": tool_name, "tool_arguments": arguments}
current_step_logs["rationale"] = rationale
current_step_logs["tool_call"] = {"tool_name": tool_name, "tool_arguments": arguments}

# Execute
self.logger.warning(f"Calling tool: '{tool_name}' with arguments: {arguments}")
@@ -721,7 +767,8 @@ def step(self):
answer = arguments
if answer in self.state: # if the answer is a state variable, return the value
answer = self.state[answer]
return answer
current_step_logs["final_answer"] = answer
return current_step_logs
else:
observation = self.execute_tool_call(tool_name, arguments)
observation_type = type(observation)
@@ -740,8 +787,8 @@ def step(self):
updated_information = f"Stored '{observation_name}' in memory."

self.logger.info(updated_information)
self.logs[-1]["observation"] = updated_information
return None
current_step_logs["observation"] = updated_information
return current_step_logs


class ReactCodeAgent(ReactAgent):
@@ -782,26 +829,27 @@ def step(self):
The errors are raised here, they are caught and logged in the run() method.
"""
agent_memory = self.write_inner_memory_from_logs()
self.logs[-1]["agent_memory"] = agent_memory.copy()

self.prompt = agent_memory.copy()

self.logger.debug("===== New step =====")

# Add new step in logs
self.logs.append({})
current_step_logs = {}
self.logs.append(current_step_logs)
current_step_logs["agent_memory"] = agent_memory.copy()

self.logger.info("===== Calling LLM with these last messages: =====")
self.logger.info(self.prompt[-2:])

try:
llm_output = self.llm_engine(self.prompt, stop_sequences=["<end_code>", "Observation:"])
llm_output = self.llm_engine(self.prompt, stop_sequences=["<end_action>", "Observation:"])
except Exception as e:
raise AgentGenerationError(f"Error in generating llm output: {e}.")

self.logger.debug("===== Output message of the LLM: =====")
self.logger.debug(llm_output)
self.logs[-1]["llm_output"] = llm_output
current_step_logs["llm_output"] = llm_output

# Parse
self.logger.debug("===== Extracting action =====")
@@ -813,8 +861,8 @@ def step(self):
error_msg = f"Error in code parsing: {e}. Make sure to provide correct code"
raise AgentParsingError(error_msg)

self.logs[-1]["rationale"] = rationale
self.logs[-1]["tool_call"] = {"tool_name": "code interpreter", "tool_arguments": code_action}
current_step_logs["rationale"] = rationale
current_step_logs["tool_call"] = {"tool_name": "code interpreter", "tool_arguments": code_action}

# Execute
self.log_code_action(code_action)
@@ -824,7 +872,7 @@ def step(self):
information = self.state["print_outputs"]
self.logger.warning("Print outputs:")
self.logger.log(32, information)
self.logs[-1]["observation"] = information
current_step_logs["observation"] = information
except Exception as e:
error_msg = f"Failed while trying to execute the code below:\n{CustomFormatter.reset + code_action + CustomFormatter.reset}\nThis failed due to the following error:\n{str(e)}"
if "'dict' object has no attribute 'read'" in str(e):
@@ -834,5 +882,5 @@ def step(self):
if line[: len("final_answer")] == "final_answer":
self.logger.warning(">>> Final answer:")
self.logger.log(32, result)
return result
return None
current_step_logs["final_answer"] = result
return current_step_logs
6 changes: 0 additions & 6 deletions src/transformers/agents/llm_engine.py
@@ -72,12 +72,6 @@ def __init__(self, model: str = "meta-llama/Meta-Llama-3-8B-Instruct"):
self.client = InferenceClient(model=self.model, timeout=120)

def __call__(self, messages: List[Dict[str, str]], stop_sequences=[]) -> str:
if "Meta-Llama-3" in self.model:
if "<|eot_id|>" not in stop_sequences:
stop_sequences.append("<|eot_id|>")
if "!!!!!" not in stop_sequences:
stop_sequences.append("!!!!!")

# Get clean message list
messages = get_clean_message_list(messages, role_conversions=llama_role_conversions)
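A side note on why dropping that Llama-specific branch is healthy: the removed code appended to the mutable default `stop_sequences=[]`, a classic Python pitfall where the default list is created once and shared across all calls. A standalone demonstration (our example, not code from the PR):

```python
# The pitfall: in-place appends to a mutable default leak between calls.
def call_model(prompt, stop_sequences=[]):
    stop_sequences.append("<|eot_id|>")  # mutates the shared default list
    return len(stop_sequences)

# Safer pattern: default to None and build a fresh list on every call.
def call_model_safe(prompt, stop_sequences=None):
    stop_sequences = list(stop_sequences or []) + ["<|eot_id|>"]
    return len(stop_sequences)
```

With the default-list version, every call without an explicit `stop_sequences` grows the same list, so stop tokens silently accumulate across invocations.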
