fix typos and others (#195)

mrmer1 and trentfowlercohere authored Oct 16, 2024
Co-authored-by: trentfowlercohere <[email protected]>
1 parent fe1c400, commit 8583c3b

Showing 3 changed files with 33 additions and 36 deletions.

---

Let's ask the agent a few questions, starting with this one about the Chat endpoint.

Firstly, the agent rightly chooses the `search_developer_docs` tool to retrieve the information it needs.

Additionally, because the question asks about two different things, retrieving information using the user's query as-is may not be the most effective approach. Instead, the query needs to be expanded or split into multiple parts, each retrieving its own set of documents.

Thus, the agent expands the original query into two queries.
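
To make this concrete, here is a minimal sketch of how the expanded queries surface as tool calls. It assumes a Cohere client `co` and a `tools` list of tool schemas from the tutorial's setup; the question and model name are illustrative.

```python PYTHON
response = co.chat(
    model="command-r-plus-08-2024",  # placeholder model name
    messages=[
        {
            "role": "user",
            "content": "What's the difference between the Chat endpoint's messages and documents parameters?",
        }
    ],
    tools=tools,
)

# Each tool call carries its own expanded query, e.g. one per sub-question
for tc in response.message.tool_calls or []:
    print(tc.function.name, tc.function.arguments)
```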

This is especially important in multi-turn conversations, where the user's intent may need to be inferred from the previous turns of the conversation.

For example, in the first turn, a user might ask "What is A" and in the second turn, they might ask "Compare that with B and C". So, the agent needs to be able to infer that the user's intent is to compare A with B and C.

Let's see an example of this. First, note that the `run_agent` function is already set up to handle multi-turn conversations. It can take messages from the previous conversation turns and append them to the `messages` list.

In the first turn, the user asks about the Chat endpoint, to which the agent duly responds.

```python PYTHON
messages = run_agent("What is the Chat endpoint?")
```


In the second turn, the user asks a question that has two parts: how it's different from RAG, and a request for code examples.

We pass the messages from the previous conversation turn to the `run_agent` function.

Because of this, the agent is able to infer that the question is referring to the Chat endpoint even though the user didn't explicitly mention it.

The agent then expands the query into two separate queries, one for the `search_code_examples` tool and one for the `search_developer_docs` tool.


```python PYTHON
# Pass the previous turn's messages so the agent has the conversation context
# (the question wording here is illustrative)
messages = run_agent(
    "How is it different from RAG? Also, do you have code examples?", messages
)
```


In this tutorial, we learned about:
- How query expansion works over multiple data sources
- How query expansion works in multi-turn conversations

Having said that, we may encounter even more complex queries than what we've seen so far. In particular, some queries require sequential reasoning, where the retrieval needs to happen over multiple steps.

In Part 3, we'll learn how the agentic RAG system can perform sequential reasoning.

---

We'll learn these by building an agent that answers questions about using Cohere.

## Setup

First, we need to install the `cohere` library and create a Cohere client.

```python PYTHON
import json
import cohere

co = cohere.ClientV2("COHERE_API_KEY")  # placeholder; use your Cohere API key
```

## Multi-step tool calling

Let's ask the agent a few questions, starting with this one about a specific feature. The user is asking about two things: a feature to reorder search results and code examples for that feature.

In this case, the agent first needs to identify what that feature is before it can answer the second part of the question.

This is reflected in the agent's tool plan, which describes the steps it will take to answer the question.

So, it first calls the `search_developer_docs` tool to find the feature.

It then discovers that the feature is Rerank. Using this information, it calls the `search_code_examples` tool to find code examples for that feature.

Finally, it uses the retrieved information to answer both parts of the user's question.
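
Here is a sketch of this interaction; the question wording and the tool calls in the comments are illustrative.

```python PYTHON
messages = run_agent(
    "What's the Cohere feature to reorder search results? "
    "Do you have any code examples for it?"
)
# Illustrative tool call sequence:
#   Step 1: search_developer_docs {"query": "feature to reorder search results"}  -> Rerank
#   Step 2: search_code_examples  {"query": "Rerank code examples"}
```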


## Multi-step, parallel tool calling

In Part 2, we saw how the Cohere API supports parallel tool calling, and now we have seen tool calling happen in a sequence. That also means that both scenarios can happen at the same time.

Here's an example. Suppose we ask the agent to find the leaders of the top 3 countries with the largest oil reserves.

In the first step, it searches the Internet for information about the 3 countries with the largest oil reserves.

And in the second step, it performs parallel searches for the leaders of the 3 identified countries.
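
A sketch of this interaction, with the resulting tool calls shown as comments for illustration:

```python PYTHON
messages = run_agent(
    "Who are the leaders of the top 3 countries with the largest oil reserves?"
)
# Illustrative tool call sequence:
#   Step 1 (single call):
#     search_internet {"query": "top 3 countries with the largest oil reserves"}
#   Step 2 (three calls generated in parallel, in one response):
#     search_internet {"query": "leader of <country identified in step 1>"}  # one call per country
```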


## Self-correction

The concept of sequential reasoning is useful in a broader sense, particularly where the agent needs to adapt and change its plan midway in a task.

In other words, it allows the agent to self-correct.

To illustrate this, let's look at an example. Here, the user is asking about the Cohere safety mode feature.

Given the nature of the question, the agent correctly identifies that it needs to find the required information via the `search_developer_docs` tool.

However, we know that the tool doesn't contain this information because we have only added a small sample of documents.

As a result, the agent, having received the documents back without any relevant information, decides to search the internet instead. This is also helped by the fact that we have added specific instructions in the `search_internet` tool to search the internet for information not found in the developer documentation.

It finally has the information it needs, and uses it to answer the user's question.
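
A sketch of this interaction, again with an illustrative question and tool call sequence:

```python PYTHON
messages = run_agent("How does Cohere's safety mode feature work?")
# Illustrative tool call sequence:
#   Step 1: search_developer_docs {"query": "Cohere safety mode"}  -> no relevant documents
#   Step 2: search_internet {"query": "Cohere safety mode feature"}  -> relevant results found
```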

This highlights another important aspect of agentic RAG: it makes a RAG system flexible. This is achieved by powering the retrieval component with an LLM.

On the other hand, a standard RAG system would typically require this behavior to be hand-engineered, and is hence more rigid.

In this tutorial, we learned about:
- How multi-step, parallel tool calling works
- How multi-step tool calling enables an agent to self-correct, and hence, be more flexible

However, up until now, we have only worked with purely unstructured data, the type of data we typically encounter in a standard RAG system.

In the coming chapters, we'll add another layer of complexity to the agentic RAG system: working with semi-structured and structured data. This adds another dimension to the agent's flexibility, namely dealing with a more diverse set of data sources.

In Part 4, we'll learn how to build an agent that can perform faceted queries over semi-structured data.

---

keywords: "Cohere, RAG, agents, function calling, tool use"

<a target="_blank" href="https://colab.research.google.com/github/cohere-ai/notebooks/blob/main/notebooks/guides/agentic-rag/agentic_rag_pt1_routing.ipynb">Open in Colab</a>

Imagine a RAG system that can search over diverse sources, such as a website, a database, and a set of documents.

In a standard RAG setting, the application would aggregate retrieved documents from all the different sources it is connected to. This may introduce noise from less relevant documents.

Additionally, this approach doesn't take into account that, given its nature, a data source might be more or less relevant to a query than the other data sources.

An agentic RAG system can solve this problem by routing queries to the most relevant tools based on the query's nature. This is done by leveraging the tool use capabilities of the Chat endpoint.

In this tutorial, we'll cover:
- Setting up the tools
- Running an agentic RAG workflow
- Routing queries to tools
Note: the source code for tool definitions can be [found here](https://colab.res

## Setting up the tools

In an agentic RAG system, each data source is represented as a tool. A tool is broadly any function or service that can receive and send objects to the LLM. In the case of RAG, however, a tool is more specific: it takes a query as input and returns a set of documents.

Here, we are defining a Python function for each tool, but more broadly, a tool can be any function or service that can receive and send objects.
- `search_developer_docs`: Searches the Cohere developer documentation. Here, we are creating a small list of sample documents for simplicity, and we will return the same list for every query. In practice, you will want to implement a proper search function, such as one that uses semantic search.
- `search_internet`: Performs an internet search using Tavily search, which we take from LangChain's ready-made implementation.
- `search_code_examples`: Searches for Cohere code examples and tutorials. Here, we are also creating a small list of sample documents for simplicity.

These functions are mapped to a dictionary called `functions_map` for easy access.


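For illustration, a minimal version of one of these tool functions might look like the following; the sample documents are placeholders, and the full definitions are in the linked notebook.

```python PYTHON
def search_developer_docs(query: str) -> list[dict]:
    # For simplicity, return the same small set of sample documents for every query.
    # In practice, replace this with a real search (e.g. semantic search) over the docs.
    sample_docs = [
        {"text": "The Rerank endpoint sorts a list of documents by relevance to a query."},
        {"text": "The Chat endpoint supports tool use and retrieval-augmented generation."},
    ]
    return sample_docs
```
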
```python PYTHON
# Map each tool name to the Python function that implements it
functions_map = {
    "search_developer_docs": search_developer_docs,
    "search_internet": search_internet,
    "search_code_examples": search_code_examples,
}
```

The second and final setup step is to define the tool schemas in a format that can be passed to the Chat endpoint. A tool schema must contain the following fields: `name`, `description`, and `parameters` in the format shown below.

This schema informs the LLM about what the tool does, so the LLM can decide whether to use a particular tool. Therefore, the more descriptive and specific the schema, the more likely the LLM is to make the right tool call decisions.

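For example, the schema for the developer docs tool might look like the following sketch; the description wording is illustrative. The schemas for the other two tools follow the same pattern.

```python PYTHON
search_developer_docs_tool = {
    "type": "function",
    "function": {
        "name": "search_developer_docs",
        "description": "Searches the Cohere developer documentation for information about the Cohere API, SDKs, and features.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "The search query.",
                }
            },
            "required": ["query"],
        },
    },
}
```
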
## Running an agentic RAG workflow

We can now run an agentic RAG workflow using a tool use approach. We can think of a tool use system as consisting of four components: the user, the application, the LLM, and the tools.

At its most basic, these four components interact in a workflow through four steps:
- **Step 1: Get user message** – The LLM gets the user message (via the application)
- **Step 2: Tool planning and calling** – The LLM decides which tools to call (if any) and generates the tool calls
- **Step 3: Tool execution** – The application executes the tools and sends the results to the LLM
- **Step 4: Response and citation generation** – The LLM generates the response and citations and sends them back to the user

We wrap all these steps in a function called `run_agent`.
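
Below is a minimal sketch of what `run_agent` can look like. It assumes the Cohere client `co` from the setup and the `functions_map` dictionary and tool schemas defined above; the model name is a placeholder, and `search_internet_tool` and `search_code_examples_tool` are assumed to be defined in the same way as `search_developer_docs_tool`.

```python PYTHON
import json

model = "command-r-plus-08-2024"  # placeholder model name
tools = [
    search_developer_docs_tool,
    search_internet_tool,
    search_code_examples_tool,
]


def run_agent(query, messages=None):
    if messages is None:
        messages = []

    # Step 1: Get user message
    messages.append({"role": "user", "content": query})

    # Step 2: Tool planning and calling
    response = co.chat(model=model, messages=messages, tools=tools)

    while response.message.tool_calls:
        print("Tool plan:")
        print(response.message.tool_plan, "\n")

        messages.append(
            {
                "role": "assistant",
                "tool_plan": response.message.tool_plan,
                "tool_calls": response.message.tool_calls,
            }
        )

        # Step 3: Tool execution
        for tc in response.message.tool_calls:
            tool_result = functions_map[tc.function.name](
                **json.loads(tc.function.arguments)
            )
            messages.append(
                {
                    "role": "tool",
                    "tool_call_id": tc.id,
                    "content": [
                        {"type": "document", "document": {"data": json.dumps(doc)}}
                        for doc in tool_result
                    ],
                }
            )

        # Step 4: Response and citation generation
        response = co.chat(model=model, messages=messages, tools=tools)

    print("Response:")
    print(response.message.content[0].text)

    if response.message.citations:
        print("\nCitations:")
        for citation in response.message.citations:
            print(citation.start, citation.end, citation.text)

    messages.append(
        {"role": "assistant", "content": response.message.content[0].text}
    )
    return messages
```
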
## Routing queries to tools

Let's ask the agent a few questions, starting with this one about the Embed endpoint.

Because the question asks about a specific feature, the agent decides to use the `search_developer_docs` tool (instead of retrieving from all the data sources it's connected to).

It first generates a tool plan that describes how it will handle the query. Then, it generates tool calls to the `search_developer_docs` tool with the associated `query` parameter.

The tool does indeed contain the information the user asked for, which the agent then uses to generate its response.


```python PYTHON
# Illustrative query about the Embed endpoint; the exact example is in the linked notebook
messages = run_agent("How many languages does the Embed endpoint support?")
```


Let's now ask the agent a question about the authors of the Sentence-BERT paper. This information is not likely to be found in the developer documentation or code examples because it is not Cohere-specific, so we can expect the agent to use the internet search tool.

And this is exactly what the agent does. This time, it decides to use the `search_internet` tool, triggers the search through Tavily search, and uses the results to generate its response.
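
A sketch of this turn, with an illustrative question:

```python PYTHON
messages = run_agent("Who are the authors of the Sentence-BERT paper?")
```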


Let's ask the agent a final question, this time about tutorials that are relevant for enterprises.

Again, the agent uses the context of the query to decide on the most relevant tool. In this case, it selects the `search_code_examples` tool and provides a response based on the information found.


```python PYTHON
# Illustrative query; the exact example is in the linked notebook
messages = run_agent(
    "Do you have any tutorials that are relevant for enterprise use cases?"
)
```
