diff --git a/fern/assets/images/04315e6-Screenshot_2024-07-10_at_9.29.25_AM.png b/fern/assets/images/04315e6-Screenshot_2024-07-10_at_9.29.25_AM.png new file mode 100644 index 00000000..bba66357 Binary files /dev/null and b/fern/assets/images/04315e6-Screenshot_2024-07-10_at_9.29.25_AM.png differ diff --git a/fern/assets/images/1cf1e77-cohere_meta_image.jpg b/fern/assets/images/1cf1e77-cohere_meta_image.jpg new file mode 100644 index 00000000..1f3083fe Binary files /dev/null and b/fern/assets/images/1cf1e77-cohere_meta_image.jpg differ diff --git a/fern/assets/images/3b75f4e-image.png b/fern/assets/images/3b75f4e-image.png new file mode 100644 index 00000000..dfd1d6b6 Binary files /dev/null and b/fern/assets/images/3b75f4e-image.png differ diff --git a/fern/assets/images/837d25c-image.png b/fern/assets/images/837d25c-image.png new file mode 100644 index 00000000..106b76be Binary files /dev/null and b/fern/assets/images/837d25c-image.png differ diff --git a/fern/assets/images/9272011-cohere_meta_image.jpg b/fern/assets/images/9272011-cohere_meta_image.jpg new file mode 100644 index 00000000..1f3083fe Binary files /dev/null and b/fern/assets/images/9272011-cohere_meta_image.jpg differ diff --git a/fern/assets/images/baaa93f-cohere_meta_image.jpg b/fern/assets/images/baaa93f-cohere_meta_image.jpg new file mode 100644 index 00000000..1f3083fe Binary files /dev/null and b/fern/assets/images/baaa93f-cohere_meta_image.jpg differ diff --git a/fern/assets/images/ebb82f9-Screenshot_2024-07-10_at_9.27.11_AM.png b/fern/assets/images/ebb82f9-Screenshot_2024-07-10_at_9.27.11_AM.png new file mode 100644 index 00000000..0777dde1 Binary files /dev/null and b/fern/assets/images/ebb82f9-Screenshot_2024-07-10_at_9.27.11_AM.png differ diff --git a/fern/assets/input.css b/fern/assets/input.css index 41b06ca8..10f45414 100644 --- a/fern/assets/input.css +++ b/fern/assets/input.css @@ -453,3 +453,9 @@ button[class^="Sidebar-link-buttonWrapper"] { padding: 9px 0 32px 0; } } + +.side { + width: 40% !important; + float: right !important; + margin-left: .75rem !important; +} diff --git a/fern/docs.yml b/fern/docs.yml index 31cc7263..a64e8b19 100644 --- a/fern/docs.yml +++ b/fern/docs.yml @@ -55,7 +55,7 @@ navbar-links: url: https://coral.cohere.com/ - type: secondary text: DASHBOARD - url: https://os.cohere.ai/ + url: https://dashboard.cohere.com/ - type: secondary text: PLAYGROUND url: https://dashboard.cohere.com/playground/generate diff --git a/fern/fern.config.json b/fern/fern.config.json index c85a4002..afdd207c 100644 --- a/fern/fern.config.json +++ b/fern/fern.config.json @@ -1,4 +1,4 @@ { "organization": "cohere", - "version": "0.37.15" + "version": "0.37.16" } \ No newline at end of file diff --git a/fern/pages/deployment-options/cohere-on-aws/amazon-bedrock.mdx b/fern/pages/deployment-options/cohere-on-aws/amazon-bedrock.mdx index e816a647..1d89e36b 100644 --- a/fern/pages/deployment-options/cohere-on-aws/amazon-bedrock.mdx +++ b/fern/pages/deployment-options/cohere-on-aws/amazon-bedrock.mdx @@ -33,7 +33,7 @@ Here are the steps you'll need to get set up in advance of running Cohere models ## Embeddings -You can use this code to invoke Cohere's embed model on Amazon Bedrock: +You can use this code to invoke Cohere's Embed English v3 model (`cohere.embed-english-v3`) or Embed Multilingual v3 model (`cohere.embed-multilingual-v3`) on Amazon Bedrock: ```python PYTHON import cohere @@ -70,7 +70,7 @@ print(result) ## Text Generation -You can use this code to invoke Cohere's Command models on Amazon Bedrock: +You 
can use this code to invoke either Command R (`cohere.command-r-v1:0`), Command R+ (`cohere.command-r-plus-v1:0`), Command (`cohere.command-text-v14`), or Command Light (`cohere.command-light-text-v14`) on Amazon Bedrock:

```python PYTHON
import cohere

diff --git a/fern/pages/deployment-options/cohere-on-microsoft-azure.mdx b/fern/pages/deployment-options/cohere-on-microsoft-azure.mdx
index 4b6a2327..152f070b 100644
--- a/fern/pages/deployment-options/cohere-on-microsoft-azure.mdx
+++ b/fern/pages/deployment-options/cohere-on-microsoft-azure.mdx
@@ -14,12 +14,14 @@ In an effort to make our language-model capabilities more widely available, we'v

In this article, you learn how to use [Azure AI Studio](https://ai.azure.com/) to deploy both the Cohere Command models and the Cohere Embed models on Microsoft's Azure cloud computing platform.

-The following four models are available through Azure AI Studio with pay-as-you-go, token-based billing:
+The following six models are available through Azure AI Studio with pay-as-you-go, token-based billing:

- Command R
- Command R+
- Embed v3 - English
- Embed v3 - Multilingual
+- Cohere Rerank V3 (English)
+- Cohere Rerank V3 (Multilingual)

## Prerequisites

@@ -30,10 +32,11 @@ Whether you're using Command or Embed, the initial set up is the same. You'll ne

- An [Azure AI project](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/create-projects) in Azure AI Studio.
- Azure role-based access controls (Azure RBAC) are used to grant access to operations in Azure AI Studio. To perform the required steps, your user account must be assigned the Azure AI Developer role on the resource group. For more information on permissions, see [Role-based access control in Azure AI Studio](https://learn.microsoft.com/en-us/azure/ai-studio/concepts/rbac-ai-studio).

-For Command- or Embed-based workflows, you'll also need to create a deployment and consume the model. Here are links for more information:
+For workflows based around Command, Embed, or Rerank, you'll also need to create a deployment and consume the model. Here are links for more information:

- **Command:** [create a Command deployment](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/deploy-models-cohere-command#create-a-new-deployment) and then [consume the Command model](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/deploy-models-cohere-command#create-a-new-deployment).
- **Embed:** [create an Embed deployment](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/deploy-models-cohere-embed#create-a-new-deployment) and [consume the Embed model](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/deploy-models-cohere-embed#consume-the-cohere-embed-models-as-a-service).
+- **Rerank:** [create a Rerank deployment](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/deploy-models-cohere-rerank) and [consume the Rerank model](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/deploy-models-cohere-rerank#consume-the-cohere-rerank-models-as-a-service).

## Text Generation

@@ -133,6 +136,65 @@ except urllib.error.HTTPError as error:
print(error.read().decode("utf8", "ignore"))
```

+## Rerank

+We currently expose the `v1/rerank` endpoint for inference with both Rerank 3 - English and Rerank 3 - Multilingual. For more information on using the APIs, see the [reference](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/deploy-models-cohere-rerank#rerank-api-reference-for-cohere-rerank-models-deployed-as-a-service) section.
+ +```python PYTHON +import cohere + +co = cohere.Client( + base_url="https://..inference.ai.azure.com/v1", + api_key="" +) + +documents = [ + { + "Title": "Incorrect Password", + "Content": "Hello, I have been trying to access my account for the past hour and it keeps saying my password is incorrect. Can you please help me?", + }, + { + "Title": "Confirmation Email Missed", + "Content": "Hi, I recently purchased a product from your website but I never received a confirmation email. Can you please look into this for me?", + }, + { + "Title": "Questions about Return Policy", + "Content": "Hello, I have a question about the return policy for this product. I purchased it a few weeks ago and it is defective.", + }, + { + "Title": "Customer Support is Busy", + "Content": "Good morning, I have been trying to reach your customer support team for the past week but I keep getting a busy signal. Can you please help me?", + }, + { + "Title": "Received Wrong Item", + "Content": "Hi, I have a question about my recent order. I received the wrong item and I need to return it.", + }, + { + "Title": "Customer Service is Unavailable", + "Content": "Hello, I have been trying to reach your customer support team for the past hour but I keep getting a busy signal. Can you please help me?", + }, + { + "Title": "Return Policy for Defective Product", + "Content": "Hi, I have a question about the return policy for this product. I purchased it a few weeks ago and it is defective.", + }, + { + "Title": "Wrong Item Received", + "Content": "Good morning, I have a question about my recent order. I received the wrong item and I need to return it.", + }, + { + "Title": "Return Defective Product", + "Content": "Hello, I have a question about the return policy for this product. I purchased it a few weeks ago and it is defective.", + }, +] + +response = co.rerank( + documents=documents, + query="What emails have been about returning items?", + rank_fields=["Title", "Content"], + top_n=5, +) +``` + ## A Note on SDKs You should be aware that it's possible to use the cohere SDK client to consume Azure AI deployments. Here are example notes for [Command](https://github.com/Azure/azureml-examples/blob/main/sdk/python/foundation-models/cohere/cohere-cmdR.ipynb) and [Embed](https://github.com/Azure/azureml-examples/blob/main/sdk/python/foundation-models/cohere/cohere-embed.ipynb). diff --git a/fern/pages/get-started/the-cohere-platform.mdx b/fern/pages/get-started/the-cohere-platform.mdx index b98f3427..e091d214 100644 --- a/fern/pages/get-started/the-cohere-platform.mdx +++ b/fern/pages/get-started/the-cohere-platform.mdx @@ -9,46 +9,43 @@ keywords: "natural language processing, generative AI, fine-tuning models" createdAt: "Thu Oct 13 2022 21:30:34 GMT+0000 (Coordinated Universal Time)" updatedAt: "Mon Jun 24 2024 09:16:55 GMT+0000 (Coordinated Universal Time)" --- -Cohere allows developers and enterprises to build LLM-powered applications. We do that by creating world-class models, and the supporting platform to deploy them securely and privately. +Cohere allows developers and enterprises to build LLM-powered applications. We do that by creating world-class models, along with the supporting platform required to deploy them securely and privately. ## Cohere Large Language Models (LLMs). -The Command family of models includes [Command](https://cohere.com/models/command), [Command R](/docs/command-r), and [Command R+](/docs/command-r-plus). 
Together, they are the text-generation LLMs powering conversational agents, summarization, copywriting, and similar use cases. They work through the [Chat](/reference/chat) endpoint, which can be used with or without [retrieval augmented generation](/docs/retrieval-augmented-generation-rag) (RAG).
+The Command family of models includes [Command](https://cohere.com/models/command), [Command R](/docs/command-r), and [Command R+](/docs/command-r-plus). Together, they are the text-generation LLMs powering conversational agents, summarization, copywriting, and similar use cases. They work through the [Chat](/reference/chat) endpoint, which can be used with or without [retrieval augmented generation](/docs/retrieval-augmented-generation-rag) (RAG).

[Rerank](https://txt.cohere.com/rerank/) is the fastest way to inject the intelligence of a language model into an existing search system. It can be accessed via the [Rerank](/reference/rerank-1) endpoint.

[Embed](https://cohere.com/models/embed) improves the accuracy of search, classification, clustering, and RAG results. It also powers the [Embed](/reference/embed) and [Classify](/reference/classify) endpoints.

- +

[Click here](/docs/foundation-models) to learn more about Cohere foundation models.

## These LLMs Make it Easy to Build Conversational Agents (and Other LLM-powered Apps)

-Try [Coral](http://coral.cohere.com) to see what an LLM-powered conversational agent can look like. It is able to converse, summarize text, and write emails and articles.
+Try [the Chat UI](https://coral.cohere.com) to see what an LLM-powered conversational agent can look like. It is able to converse, summarize text, and write emails and articles.

- +
+Our goal, however, is to enable you to build your own LLM-powered applications. The [Chat endpoint](/docs/chat-api), for example, can be used to build a conversational agent powered by the Command family of models.

-Our goal, however, is to enable you to build your own LLM-powered applications. The [Chat endpoint](/docs/cochat-beta), for example, can be used to build a conversational agent powered by the Command family of models.

- - +A diagram of a conversational agent.

### Retrieval-Augmented Generation (RAG)

“Grounding” refers to the practice of allowing an LLM to access external data sources – like the internet or a company’s internal technical documentation – which leads to better, more factual generations.

-Coral is being used with grounding enabled in the screenshot below, and you can see how accurate and information-dense its reply is.

- - +Chat is being used with grounding enabled in the screenshot below, and you can see how accurate and information-dense its reply is.
+

-What’s more, Coral’s advanced RAG capabilities allow you to see what underlying query the model generates when completing its tasks, and its output includes [citations](/docs/documents-and-citations) pointing you to where it found the information it uses. Both the query and the citations can be leveraged alongside the Cohere Embed and Rerank models to build a remarkably powerful RAG system, such as the one found in [this guide](https://txt.cohere.com/rag-chatbot/).
- +What’s more, advanced RAG capabilities allow you to see what underlying query the model generates when completing its tasks, and its output includes [citations](/docs/documents-and-citations) pointing you to where it found the information it uses.
Both the query and the citations can be leveraged alongside the Cohere Embed and Rerank models to build a remarkably powerful RAG system, such as the one found in [this guide](https://cohere.com/llmu/rag-chatbot).
+

[Click here](/docs/serving-platform) to learn more about the Cohere serving platform.

@@ -56,8 +53,7 @@ What’s more, Coral’s advanced RAG capabilities allow you to see what underly

Embeddings enable you to search based on what a phrase _means_ rather than simply what keywords it _contains_, leading to search systems that incorporate context and user intent better than anything that has come before.

- - +How a query returns results.

Learn more about semantic search [here](/docs/intro-semantic-search).

@@ -65,18 +61,17 @@ Learn more about semantic search [here](/docs/intro-semantic-search).

To [create a fine-tuned model](/docs/fine-tuning), simply upload a dataset and hold on while we train a custom model and then deploy it for you. Fine-tuning can be done with [generative models](/docs/generate-fine-tuning), [multi-label classification models](/docs/classify-fine-tuning), [rerank models](/docs/rerank-fine-tuning), and [chat models](/docs/chat-fine-tuning).

- - +A diagram of fine-tuning.

## Where you can access Cohere Models

Depending on your privacy/security requirements, there are a number of ways to access Cohere:

- [Cohere API](/reference/about): this is the easiest option; simply grab an API key from [the dashboard](https://dashboard.cohere.com/) and start using the models hosted by Cohere.
-- Cloud AI platforms: this option offers a balance of ease-of-use and security. you can access Cohere on various cloud AI platforms such as [Oracle's GenAI Service](https://www.oracle.com/uk/artificial-intelligence/generative-ai/large-language-models/), AWS' [Bedrock](https://aws.amazon.com/bedrock/cohere-command-embed/) and [Sagemaker](https://aws.amazon.com/blogs/machine-learning/cohere-brings-language-ai-to-amazon-sagemaker/) platforms, [Google Cloud](https://console.cloud.google.com/marketplace/product/cohere-id-public/cohere-public?ref=txt.cohere.com), and [Azure's AML service](https://txt.cohere.com/coheres-enterprise-ai-models-coming-soon-to-microsoft-azure-ai-as-a-managed-service/).
+- Cloud AI platforms: this option offers a balance of ease-of-use and security. You can access Cohere on various cloud AI platforms such as [Oracle's GenAI Service](https://www.oracle.com/uk/artificial-intelligence/generative-ai/large-language-models/), AWS' [Bedrock](https://aws.amazon.com/bedrock/cohere-command-embed/) and [Sagemaker](https://aws.amazon.com/blogs/machine-learning/cohere-brings-language-ai-to-amazon-sagemaker/) platforms, [Google Cloud](https://console.cloud.google.com/marketplace/product/cohere-id-public/cohere-public?ref=txt.cohere.com), and [Azure's AML service](https://txt.cohere.com/coheres-enterprise-ai-models-coming-soon-to-microsoft-azure-ai-as-a-managed-service/).
- Private cloud deployments: Cohere's models can be deployed privately in most virtual private cloud (VPC) environments, offering enhanced security and the highest degree of customization. Please [contact sales](mailto:team@cohere.com) for information.

- +The major cloud providers.

### On-Premise and Air Gapped Solutions

diff --git a/fern/pages/llm-university/llmu-2.mdx b/fern/pages/llm-university/llmu-2.mdx
index 24430dfe..77e022bb 100644
--- a/fern/pages/llm-university/llmu-2.mdx
+++ b/fern/pages/llm-university/llmu-2.mdx
@@ -3,11 +3,14 @@ title: "Welcome to LLM University!"
slug: "docs/llmu-2" description: "LLM University (LLMU) offers in-depth, practical NLP and LLM training. Ideal for all skill levels. Learn, build, and deploy Language AI with Cohere." image: "../../assets/images/1cc9fac-Cohere_LLM_University.png" +no-image-zoom: true createdAt: "Wed Apr 26 2023 16:41:18 GMT+0000 (Coordinated Universal Time)" updatedAt: "Wed Apr 24 2024 03:04:28 GMT+0000 (Coordinated Universal Time)" --- -![](../../assets/images/60c937f-small-LLMUni_Docs_Banner.png) + + + #### Welcome to LLM University by Cohere! diff --git a/fern/pages/responsible-use/responsible-use.mdx b/fern/pages/responsible-use/responsible-use.mdx index 0a5a2e2b..5886baab 100644 --- a/fern/pages/responsible-use/responsible-use.mdx +++ b/fern/pages/responsible-use/responsible-use.mdx @@ -10,12 +10,12 @@ keywords: "AI safety, AI risk, responsible AI" createdAt: "Thu Sep 01 2022 19:22:12 GMT+0000 (Coordinated Universal Time)" updatedAt: "Fri Mar 15 2024 04:47:51 GMT+0000 (Coordinated Universal Time)" --- -The Responsible Use documentation aims to guide developers in using language models constructively and ethically. Toward this end, we've published [guidelines](/usage-guidelines) for using our API safely, as well as our processes around [harm prevention](/harm-prevention). We provide model cards to communicate the strengths and weaknesses of our models and to encourage responsible use (motivated by [Mitchell, 2019](https://arxiv.org/pdf/1810.03993.pdf)). We also provide a [data statement](/data-statement) describing our pre-training datasets (motivated by [Bender and Friedman, 2018](https://www.aclweb.org/anthology/Q18-1041/)). +The Responsible Use documentation aims to guide developers in using language models constructively and ethically. Toward this end, we've published [guidelines](/docs/usage-guidelines) for using our API safely, as well as our processes around [harm prevention](#harm-prevention). We provide model cards to communicate the strengths and weaknesses of our models and to encourage responsible use (motivated by [Mitchell, 2019](https://arxiv.org/pdf/1810.03993.pdf)). We also provide a [data statement](/data-statement) describing our pre-training datasets (motivated by [Bender and Friedman, 2018](https://www.aclweb.org/anthology/Q18-1041/)). **Model Cards:** -- [Generation](generation-card) -- [Representation](representation-card) +- [Generation](/docs/generation-benchmarks) +- [Representation](/docs/representation-benchmarks) If you have feedback or questions, please feel free to [let us know](mailto:responsibility@cohere.ai) — we are here to help. diff --git a/fern/pages/text-embeddings/text-classification-with-cohere.mdx b/fern/pages/text-embeddings/text-classification-with-cohere.mdx new file mode 100644 index 00000000..21b26b9a --- /dev/null +++ b/fern/pages/text-embeddings/text-classification-with-cohere.mdx @@ -0,0 +1,143 @@ +--- +title: Text Classification +description: "The document explains how to perform text classification using Cohere's classify endpoint, including setting up the SDK, preparing data, generating predictions, and fine-tuning the model for tasks like sentiment analysis." +keywords: "text classification, Cohere, large language models, word embeddings" +image: "../../assets/images/1cf1e77-cohere_meta_image.jpg" +slug: /docs/text-classification-with-cohere +--- + +Among the most popular use cases for language embeddings is 'text classification,' in which different pieces of text -- blog posts, lyrics, poems, headlines, etc. 
-- are grouped based on their similarity, their sentiment, or some other property. + +Here, we'll discuss how to perform simple text classification tasks with Cohere's `classify` endpoint, and provide links to more information on how to fine-tune this endpoint for more specialized work. + +## Few-Shot Classification with Cohere's `classify` Endpoint + +Generally, training a text classifier requires a tremendous amount of data. But with large language models, it's now possible to create so-called 'few shot' classification models able to perform well after seeing a far smaller number of samples. + +In the next few sections, we'll create a sentiment analysis classifier to sort text into "positive," "negative," and "neutral" categories. + +### Setting up the SDK + +First, let's import the required tools and set up a Cohere client. + +```python PYTHON +import cohere +from cohere import ClassifyExample +``` + +```python PYTHON +co = cohere.Client("COHERE_API_KEY") # Your Cohere API key +``` + +### Preparing the Data and Inputs + +With the `classify` endpoint, you can create a text classifier with as few as two examples per class, and each example **must** contain the text itself and the corresponding label (i.e. class). So, if you have two classes you need a minimum of four examples, if you have three classes you need a minimum of six examples, and so on. + +Here are examples, created as `ClassifyExample` objects: + +```python PYTHON +examples = [ClassifyExample(text="I’m so proud of you", label="positive"), + ClassifyExample(text="What a great time to be alive", label="positive"), + ClassifyExample(text="That’s awesome work", label="positive"), + ClassifyExample(text="The service was amazing", label="positive"), + ClassifyExample(text="I love my family", label="positive"), + ClassifyExample(text="They don't care about me", label="negative"), + ClassifyExample(text="I hate this place", label="negative"), + ClassifyExample(text="The most ridiculous thing I've ever heard", label="negative"), + ClassifyExample(text="I am really frustrated", label="negative"), + ClassifyExample(text="This is so unfair", label="negative"), + ClassifyExample(text="This made me think", label="neutral"), + ClassifyExample(text="The good old days", label="neutral"), + ClassifyExample(text="What's the difference", label="neutral"), + ClassifyExample(text="You can't ignore this", label="neutral"), + ClassifyExample(text="That's how I see it", label="neutral")] + +``` + +Besides the examples, you'll also need the 'inputs,' which are the strings of text you want the classifier to sort. Here are the ones we'll be using: + +```python PYTHON +inputs = ["Hello, world! What a beautiful day", + "It was a great time with great people", + "Great place to work", + "That was a wonderful evening", + "Maybe this is why", + "Let's start again", + "That's how I see it", + "These are all facts", + "This is the worst thing", + "I cannot stand this any longer", + "This is really annoying", + "I am just plain fed up"] +``` + +### Generate Predictions + +Setting up the model is quite straightforward with the `classify` endpoint. 
We'll use Cohere's `embed-english-v3.0` model; here's what that looks like:

```python PYTHON
def classify_text(inputs, examples):

    """
    Classifies a list of input texts given the examples
    Arguments:
        inputs (list[str]): a list of input texts to be classified
        examples (list[Example]): a list of example texts and class labels
    Returns:
        classifications (list): each result contains the text, labels, and conf values
    """

    # Classify text by calling the Classify endpoint
    response = co.classify(
        model='embed-english-v3.0',
        inputs=inputs,
        examples=examples)

    classifications = response.classifications

    return classifications

# Classify the inputs
predictions = classify_text(inputs, examples)

print(predictions)
```

Here’s a sample output returned (note that this output has been truncated to make it easier to read; you'll get much more in return if you run the code yourself):

```
[ClassifyResponseClassificationsItem(id='9df6628d-57b2-414c-837e-c8a22f00d3db',
    input='hello, world! what a beautiful day',
    prediction='positive',
    predictions=['positive'],
    confidence=0.40137812,
    confidences=[0.40137812],
    labels={'negative': ClassifyResponseClassificationsItemLabelsValue(confidence=0.23582731),
        'neutral': ClassifyResponseClassificationsItemLabelsValue(confidence=0.36279458),
        'positive': ClassifyResponseClassificationsItemLabelsValue(confidence=0.40137812)},
    classification_type='single-label'),
 ClassifyResponseClassificationsItem(id='ce2c3b0b-ce98-4905-9ef5-fc83c6848fc5',
    input='it was a great time with great people',
    prediction='positive',
    predictions=['positive'],
    confidence=0.49054274,
    confidences=[0.49054274],
    labels={'negative': ClassifyResponseClassificationsItemLabelsValue(confidence=0.19989403),
        'neutral': ClassifyResponseClassificationsItemLabelsValue(confidence=0.30956325),
        'positive': ClassifyResponseClassificationsItemLabelsValue(confidence=0.49054274)},
    classification_type='single-label')
 ....]
```

Most of this is pretty easy to understand, but there are a few things worth drawing attention to.

Besides returning the predicted class in the `prediction` field, the endpoint also returns the `confidence` value of the prediction, which varies between 0 (unconfident) and 1 (completely confident).

Also, these confidence values are split among the classes; since we're using three, the confidence values for the "positive," "negative," and "neutral" classes must add up to a total of 1.

Under the hood, the classifier selects the class with the highest confidence value as the “predicted class.” A high confidence value for the predicted class therefore indicates that the model is very confident of its prediction, and vice versa.

#### What If I Need to Fine-Tune the `classify` endpoint?

Cohere has [dedicated documentation](/docs/classify-fine-tuning) on fine-tuning the `classify` endpoint for bespoke tasks. You can also read this [blog post](https://cohere.com/blog/fine-tuning-for-classification), which works out a detailed example.
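Finally, to make the response fields discussed above concrete, here's a short illustrative snippet (a sketch that assumes the `predictions` list returned by `classify_text` earlier) showing how to pull out each input's predicted class and the model's confidence in that prediction:

```python PYTHON
# Print each input alongside its predicted class and confidence score
for item in predictions:
    print(f"{item.input} -> {item.prediction} ({item.confidence:.2f})")
```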
diff --git a/fern/pages/text-generation/introduction-to-text-generation-at-cohere.mdx b/fern/pages/text-generation/introduction-to-text-generation-at-cohere.mdx
new file mode 100644
index 00000000..76619762
--- /dev/null
+++ b/fern/pages/text-generation/introduction-to-text-generation-at-cohere.mdx
@@ -0,0 +1,35 @@
+---
+title: Introduction to Text Generation at Cohere
+slug: /docs/introduction-to-text-generation-at-cohere
+---
+
+Large language models are impressive for many reasons, but among the most prominent is their ability to quickly generate text. With just a little bit of prompting, they can crank out conceptual explanations, blog posts, web copy, poetry, and almost anything else. Their style can be tweaked to be suitable for children and adults, technical people and laymen, and they can be asked to work in dozens of different natural languages.

In this article, we'll cover some of the basics of what makes this functionality possible. If you'd like to skip straight to a more hands-on treatment, check out "[Using the Chat API](/docs/chat-api)."

## How are Large Language Models Trained?

Eliding a great deal of technical complexity, a large language model is just a neural network trained to predict the [next token](/docs/tokens-and-tokenizers#what-is-a-token), given the tokens that have come before. Take a sentence like "Hang on, I need to go inside and grab my \_\_\_." As a human being with a great deal of experience using natural language, you can make some reasonable guesses about which token will complete this sentence even with no additional context:

- "Hang on, I need to go inside and grab my **bag**."
- "Hang on, I need to go inside and grab my **keys**."
- Etc.

Of course, there are other possibilities that are plausible, but less likely:

- "Hang on, I need to go inside and grab my **friend**."
- "Hang on, I need to go inside and grab my **book**."

And there's a long tail of possibilities that are technically grammatically correct but which effectively never occur in a real exchange:

- "Hang on, I need to go inside and grab my **giraffe**."

_You_ have an intuitive sense of how a sentence like this will end because you've been using language all your life. A model like Command R+ must learn how to perform the same feat by seeing billions of token sequences and figuring out a statistical distribution over them that allows it to predict what comes next.

Once it's done so, it can take a prompt like "Help me generate some titles for a blog post about quantum computing," and use the distribution it has learned to generate the series of tokens it _thinks_ would follow such a request. Since it's an _AI_ system _generating_ tokens, it's known as "generative AI," and with models as powerful as Cohere's, the results are often surprisingly good.

## Learn More

The rest of the "Text Generation" section of our documentation walks you through how to work with Cohere's models. Check out ["Using the Chat API"](/docs/chat-api) to get set up and understand what a response looks like, or read the [streaming guide](/docs/streaming) to figure out how to integrate generative AI into streaming applications.

You might also benefit from reading the [retrieval-augmented generation](/docs/retrieval-augmented-generation-rag), [tool-use](/docs/tool-use), and [agent-building](/docs/multi-step-tool-use) guides.
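To see these ideas in action before diving into those guides, here's a minimal sketch of generating text with the Python SDK (it assumes you have an API key, and it uses the same `co.chat` call covered in ["Using the Chat API"](/docs/chat-api)):

```python PYTHON
import cohere

co = cohere.Client("COHERE_API_KEY")  # Your Cohere API key

# Ask a Command model to generate text from a simple prompt
response = co.chat(
    message="Help me generate some titles for a blog post about quantum computing.",
    model="command-r-plus",
)

print(response.text)
```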
diff --git a/fern/pages/text-generation/prompt-engineering/preambles.mdx b/fern/pages/text-generation/prompt-engineering/preambles.mdx index 7306f142..d4557f47 100644 --- a/fern/pages/text-generation/prompt-engineering/preambles.mdx +++ b/fern/pages/text-generation/prompt-engineering/preambles.mdx @@ -7,9 +7,11 @@ createdAt: "Tue Mar 12 2024 19:19:02 GMT+0000 (Coordinated Universal Time)" updatedAt: "Thu Jun 13 2024 16:10:09 GMT+0000 (Coordinated Universal Time)" --- -A preamble is a system message that is provided to a model at the beginning of a conversation which dictates how the model should behave throughout. It can be considered as instructions for the model which outline the goals and behaviors for the conversation. - + + +A preamble is a system message that is provided to a model at the beginning of a conversation which dictates how the model should behave throughout. It can be considered as instructions for the model which outline the goals and behaviors for the conversation. ## Writing a custom preamble diff --git a/fern/pages/text-generation/prompt-engineering/prompt-tuner.mdx b/fern/pages/text-generation/prompt-engineering/prompt-tuner.mdx new file mode 100644 index 00000000..6226e0de --- /dev/null +++ b/fern/pages/text-generation/prompt-engineering/prompt-tuner.mdx @@ -0,0 +1,133 @@ +--- +title: Prompt Tuner (beta) +image: "../../../assets/images/baaa93f-cohere_meta_image.jpg" +slug: /docs/prompt-tuner +--- + + +This feature is in beta, so it may experience changes and updates in the future. + + +# Introduction + +[Prompt Tuner](https://dashboard.cohere.com/prompt-tuner) is an intuitive tool developed by Cohere to streamline the process of defining a robust prompt for user-specific needs. A model's effectiveness can significantly depend on how well the input prompt is formulated. The Prompt Tuner addresses this challenge by automating the trial-and-error process traditionally associated with prompt optimization. + +With the Prompt Tuner, you: + +- provide the initial prompt you wish to optimize and +- define criteria important to your goals, such as word count, output format, or hallucination checks. + +The tool then iterates through various prompt modifications, evaluating each against the selected criteria to determine the most effective prompt configuration. + +**Optimize a prompt without writing a single line of code.** + +# Starting the optimization + +Cohere models are utilized in various enterprise scenarios. For instance, a model could be prompted to write a job description for a specific position with a word limit of 200 words. An initial prompt might look like this: + +``` +Create a job description for a Data Scientist position with the following requirements: proficiency in Python, experience with machine learning algorithms, knowledge of data visualisation tools, and familiarity with big data technologies. + +List at least 4 requirements. +``` + +However, this prompt could be improved by being more specific. This can be done using the [Prompt Tuner](https://dashboard.cohere.com/prompt-tuner) in the Cohere Dashboard. + +## 1. Input the initial prompt + +The left-hand side of the [Prompt Tuner](https://dashboard.cohere.com/prompt-tuner) provides a window to paste the initial prompt. + +## 2. Specify criteria + +The right-hand side is reserved for optimization parameters. For now, we will focus on `CRITERIA`. The remaining parameters will be discussed in the next section of this document. 
+

`CRITERIA` allows you to **specify the requirements for optimizing the prompts**, either through a set of predefined criteria or using natural language. In the example above, since we aim for the job description to be no more than 200 words, set the word count between 150 and 200.

### Define custom criteria

One of the most compelling features of the [Prompt Tuner](https://dashboard.cohere.com/prompt-tuner?tab=tuner) is its **ability to support custom criteria defined in natural language**. You can select the `Descriptive` box and provide a text description of how the completion should meet this criterion.

Example:

```
There are at least 4 requirements.
```

## 3. Run the optimization

Once done, press the `OPTIMIZE PROMPT` button.

![](../../../assets/images/3b75f4e-image.png)

# Understanding the results

After the optimization is complete, you will see the **best** prompt and its completions. However, you can also access all the prompts generated by the tuner by clicking the drop-down button in the top right corner of the prompt window.

The tuner iteratively generates new prompts, focusing on criteria that still need improvement. Consequently, a table displaying the scores for each requirement at each iteration is also presented.

# Improving the results

The [Prompt Tuner](https://dashboard.cohere.com/prompt-tuner) offers a rich set of parameters that can be adjusted, giving you full control over prompt optimization. Understanding how to set these parameters is crucial for achieving good results.

### CRITERIA

The optimized prompt is a direct product of the input prompt and the criteria it is meant to optimize. More criteria can be added to guide the optimization process and achieve better results.

There are two types of criteria:

- **Rule-based**: These are the foundational criteria for each query:
  - Word Count: Checks whether the number of words is within a specified range.
  - Is JSON: Checks if the completion is a valid JSON object. Optionally, allows checking the generated schema against a specific JSON Schema.
  - Grounding: Measures whether the information in the completion is derived from the prompt and provided documents.
  - Accuracy: Measures how well the completion follows the instructions defined in the prompt.
- **Custom**: Custom criteria allow users to define their own descriptions to create evaluation prompts and check the generated completions.

### MODEL

`MODEL` lets you choose the model from the Cohere suite for which the prompt should be optimized.

### VARIABLES

`VARIABLES` allows you to test how the prompt generalizes to multiple scenarios. Suppose the job-description task should be extended to multiple positions with different requirements.

For example:

- **Job posting 1:**
  - **Position:** Data Scientist
  - **Requirements:** proficiency in Python, experience with machine learning algorithms, knowledge of data visualisation tools, and familiarity with big data technologies.
- **Job posting 2:**
  - **Position:** Product Manager
  - **Requirements:** Strong understanding of product lifecycle management, experience with market research and user feedback analysis, excellent communication and leadership skills, and familiarity with Agile methodologies.
- **Job posting 3:**
  - **Position:** Software Engineer
  - **Requirements:** Proficiency in Java or C++, experience with the software development lifecycle, strong problem-solving skills, and familiarity with version control systems like Git.
+ +To account for this, the initial prompt can be modified to include placeholders: + +``` +Create a job description for a ${position} position with the following requirements: ${requirements}. +``` + + + +After adjusting the prompt, the variable names will appear in the `VARIABLES` section, where the appropriate values can be entered. + +
+ +
+

### DOCUMENTS

Cohere models have strong Retrieval Augmented Generation (RAG) capabilities. Therefore, the [Prompt Tuner](https://dashboard.cohere.com/prompt-tuner) also allows you to optimize prompts for these use cases. If you want to ground your task in the context of a document, you can upload the document, and the optimizer will handle the rest.

Note: Currently, we only support raw text documents.

# More examples

For more examples, please see the example section, where we provide templates for more real-life scenarios:

- Performance Review
- Word Definition
- Social Media Content Creation

diff --git a/fern/pages/text-generation/summarizing-text.mdx b/fern/pages/text-generation/summarizing-text.mdx
new file mode 100644
index 00000000..4273bbf2
--- /dev/null
+++ b/fern/pages/text-generation/summarizing-text.mdx
@@ -0,0 +1,246 @@
+---
+title: Summarizing Text
+description: "The document explains how to perform text summarization using Cohere's Chat endpoint, highlighting features like length and format control, and the use of retrieval-augmented generation for grounded summaries. It also provides guidance on migrating from the Generate and Summarize endpoints to the Chat endpoint."
+image: "../../assets/images/9272011-cohere_meta_image.jpg"
+keywords: "Cohere, large language models, generative AI"
+slug: /docs/summarizing-text
+---

Text summarization distills essential information and generates concise snippets from dense documents. With Cohere, you can do text summarization via the Chat endpoint.

The Command R family of models (R and R+) supports 128k context length, so you can pass long documents to be summarized.

## Basic summarization

You can perform text summarization with a simple prompt asking the model to summarize a piece of text.

```python PYTHON
import cohere

co = cohere.Client("COHERE_API_KEY") # Your Cohere API key

document = """Equipment rental in North America is predicted to “normalize” going into 2024,
according to Josh Nickell, vice president of equipment rental for the American Rental
Association (ARA).
“Rental is going back to ‘normal,’ but normal means that strategy matters again -
geography matters, fleet mix matters, customer type matters,” Nickell said. “In
late 2020 to 2022, you just showed up with equipment and you made money.
“Everybody was breaking records, from the national rental chains to the smallest
rental companies; everybody was having record years, and everybody was raising
prices. The conversation was, ‘How much are you up?’ And now, the conversation
is changing to ‘What’s my market like?’”
Nickell stressed this shouldn’t be taken as a pessimistic viewpoint. It’s simply
coming back down to Earth from unprecedented circumstances during the time of Covid.
Rental companies are still seeing growth, but at a more moderate level."""

response = co.chat(message=f"Generate a concise summary of this text\n{document}").text

print(response)
```

(NOTE: Here, we are passing the document as a variable, but you can also just copy the document directly into the prompt and ask Chat to summarize it.)

Here's a sample output:

```
The equipment rental market in North America is expected to normalize by 2024,
according to Josh Nickell of the American Rental Association. This means a shift
from the unprecedented growth of 2020-2022, where demand and prices were high,
to a more strategic approach focusing on geography, fleet mix, and customer type.
Rental companies are still experiencing growth, but at a more moderate and sustainable level.
+``` + +### Length control + +You can further control the output by defining the length of the summary in your prompt. For example, you can specify the number of sentences to be generated. + +```python PYTHON +response = co.chat(message= f"Summarize this text in one sentence\n{document}").text + +print(response) +``` + +And here's what a sample of the output might look like: + +``` +The equipment rental market in North America is expected to stabilize in 2024, +with a focus on strategic considerations such as geography, fleet mix, and +customer type, according to Josh Nickell of the American Rental Association (ARA). +``` + +You can also specify the length in terms of word count. + +```python PYTHON +response = co.chat(message= f"Summarize this text in less than 10 words\n{document}").text + +print(response) +``` + +``` +Rental equipment supply and demand to balance. +``` + +(Note: While the model is generally good at adhering to length instructions, due to the nature of LLMs, we do not guarantee that the exact word, sentence, or paragraph numbers will be generated.) + +### Format control + +Instead of generating summaries as paragraphs, you can also prompt the model to generate the summary as bullet points. + +```python PYTHON +response = co.chat(message= f"Generate a concise summary of this text as bullet points\n{document}").text + +print(response) +``` + +``` +- Equipment rental in North America is expected to "normalize" by 2024, according to Josh Nickell + of the American Rental Association (ARA). +- This "normalization" means a return to strategic focus on factors like geography, fleet mix, + and customer type. +- In the past two years, rental companies easily made money and saw record growth due to the + unique circumstances of the Covid pandemic. +- Now, the focus is shifting from universal success to varying market conditions and performance. +- Nickell's outlook is not pessimistic; rental companies are still growing, but at a more + sustainable and moderate pace. + +``` + +## Grounded summarization + +Another approach to summarization is using [retrieval-augmented generation](/docs/retrieval-augmented-generation-rag) (RAG). Here, you can instead pass the document as a chunk of documents to the Chat endpoint call. + +This approach allows you to take advantage of the citations generated by the endpoint, which means you can get a grounded summary of the document. Each grounded summary includes fine-grained citations linking to the source documents, making the response easily verifiable and building trust with the user. + +Here is a chunked version of the document. (we don’t cover the chunking process here, but if you’d like to learn more, see this cookbook on [chunking strategies](https://github.com/cohere-ai/notebooks/blob/main/notebooks/guides/Chunking_strategies.ipynb).) + +```python PYTHON +document_chunked = [{"text": "Equipment rental in North America is predicted to “normalize” going into 2024, according to Josh Nickell, vice president of equipment rental for the American Rental Association (ARA)."}, +{"text": "“Rental is going back to ‘normal,’ but normal means that strategy matters again - geography matters, fleet mix matters, customer type matters,” Nickell said. “In late 2020 to 2022, you just showed up with equipment and you made money."}, +{"text": "“Everybody was breaking records, from the national rental chains to the smallest rental companies; everybody was having record years, and everybody was raising prices. 
The conversation was, ‘How much are you up?’ And now, the conversation is changing to ‘What’s my market like?’”"}]
```

It also helps to create a custom preamble to prime the model about the task—that it will receive a series of text fragments from a document presented in chronological order.

```python PYTHON
preamble = """## Task & Context
You will receive a series of text fragments from a document that are presented in chronological order. \
As the assistant, you must generate responses to the user's requests based on the information given in the fragments. \
Ensure that your responses are accurate and truthful, and that you reference your sources where appropriate to answer \
the queries, regardless of their complexity."""
```

Other than the custom preamble, the only change to the Chat endpoint call is passing the `documents` parameter containing the list of document chunks.

Aside from displaying the actual summary (`response.text`), we can also display the citations (`response.citations`). The citations are a list of specific passages in the response that cite from the documents that the model receives.

```python PYTHON
response = co.chat(message="Summarize this text in two sentences.", preamble=preamble, documents=document_chunked)
print(response.text)

# Print citations (if any)
if response.citations:
    print("\nCitations:")
    for citation in response.citations:
        print(citation)
    print("\nCited Documents:")
    for document in response.documents:
        print(document)
```

```
Josh Nickell, vice president of the American Rental Association, predicts that equipment rental in North America will "normalize" by 2024. This means that factors like geography, fleet mix, and customer type will influence success in the market.

Citations:
start=0 end=4 text='Josh' document_ids=['doc_0']
start=5 end=12 text='Nickell' document_ids=['doc_0', 'doc_1']
start=14 end=63 text='vice president of the American Rental Association' document_ids=['doc_0']
start=79 end=112 text='equipment rental in North America' document_ids=['doc_0']
start=118 end=129 text='"normalize"' document_ids=['doc_0', 'doc_1']
start=133 end=138 text='2024.' document_ids=['doc_0']
start=168 end=245 text='geography, fleet mix, and customer type will influence success in the market.' document_ids=['doc_1']

Cited Documents:
{'id': 'doc_0', 'text': 'Equipment rental in North America is predicted to “normalize” going into 2024, according to Josh Nickell, vice president of equipment rental for the American Rental Association (ARA).'}
{'id': 'doc_1', 'text': '“Rental is going back to ‘normal,’ but normal means that strategy matters again - geography matters, fleet mix matters, customer type matters,” Nickell said. “In late 2020 to 2022, you just showed up with equipment and you made money.'}
```

## Migrating from Generate to Chat Endpoint

This guide outlines how to migrate from Generate to Chat; the biggest difference is simply the need to replace the `prompt` argument with `message`, but there's also no model default, so you'll have to specify a model.

```python PYTHON
# Before

co.generate(
    prompt="""Write a short summary from the following text in bullet point format, in different
    words.

    Equipment rental in North America is predicted to “normalize” going into 2024, according to Josh Nickell, vice president of equipment rental for the American Rental Association (ARA).
+ “Rental is going back to ‘normal,’ but normal means that strategy matters again - geography matters, fleet mix matters, customer type matters,” Nickell said. “In late 2020 to 2022, you just showed up with equipment and you made money. + “Everybody was breaking records, from the national rental chains to the smallest rental companies; everybody was having record years, and everybody was raising prices. The conversation was, ‘How much are you up?’ And now, the conversation is changing to ‘What’s my market like?’” + Nickell stressed this shouldn’t be taken as a pessimistic viewpoint. It’s simply coming back down to Earth from unprecedented circumstances during the time of Covid. Rental companies are still seeing growth, but at a more moderate level. + """ +) + +# After +co.chat( + message="""Write a short summary from the following text in bullet point format, + in different words. + + Equipment rental in North America is predicted to “normalize” going into 2024, according to Josh Nickell, vice president of equipment rental for the American Rental Association (ARA). + “Rental is going back to ‘normal,’ but normal means that strategy matters again - geography matters, fleet mix matters, customer type matters,” Nickell said. “In late 2020 to 2022, you just showed up with equipment and you made money. + “Everybody was breaking records, from the national rental chains to the smallest rental companies; everybody was having record years, and everybody was raising prices. The conversation was, ‘How much are you up?’ And now, the conversation is changing to ‘What’s my market like?’” + Nickell stressed this shouldn’t be taken as a pessimistic viewpoint. It’s simply coming back down to Earth from unprecedented circumstances during the time of Covid. Rental companies are still seeing growth, but at a more moderate level. + """, + model="command-r-plus" +) + +``` + +## Migration from Summarize to Chat Endpoint + +To use the Command R/R+ models for summarization, we recommend using the Chat endpoint. This guide outlines how to migrate from the Summarize endpoint to the Chat endpoint. + +```python PYTHON +# Before + +co.summarize( + format="bullets", + length="short", + extractiveness="low", + text="""Equipment rental in North America is predicted to “normalize” going into 2024, according + to Josh Nickell, vice president of equipment rental for the American Rental Association (ARA). + “Rental is going back to ‘normal,’ but normal means that strategy matters again - geography + matters, fleet mix matters, customer type matters,” Nickell said. “In late 2020 to 2022, you + just showed up with equipment and you made money. + “Everybody was breaking records, from the national rental chains to the smallest rental companies; + everybody was having record years, and everybody was raising prices. The conversation was, ‘How + much are you up?’ And now, the conversation is changing to ‘What’s my market like?’” + Nickell stressed this shouldn’t be taken as a pessimistic viewpoint. It’s simply coming back + down to Earth from unprecedented circumstances during the time of Covid. Rental companies are + still seeing growth, but at a more moderate level. + """ +) + +# After +co.chat( + message="""Write a short summary from the following text in bullet point format, in different words. + + Equipment rental in North America is predicted to “normalize” going into 2024, according to Josh + Nickell, vice president of equipment rental for the American Rental Association (ARA). 
“Rental is going back to ‘normal,’ but normal means that strategy matters again - geography
    matters, fleet mix matters, customer type matters,” Nickell said. “In late 2020 to 2022, you
    just showed up with equipment and you made money.
    “Everybody was breaking records, from the national rental chains to the smallest rental companies;
    everybody was having record years, and everybody was raising prices. The conversation was, ‘How
    much are you up?’ And now, the conversation is changing to ‘What’s my market like?’”
    Nickell stressed this shouldn’t be taken as a pessimistic viewpoint. It’s simply coming back
    down to Earth from unprecedented circumstances during the time of Covid. Rental companies are
    still seeing growth, but at a more moderate level.
    """
)

# After
co.chat(
    message="""Write a short summary from the following text in bullet point format, in different words.

    Equipment rental in North America is predicted to “normalize” going into 2024, according to Josh
    Nickell, vice president of equipment rental for the American Rental Association (ARA).
    “Rental is going back to ‘normal,’ but normal means that strategy matters again - geography
    matters, fleet mix matters, customer type matters,” Nickell said. “In late 2020 to 2022, you just
    showed up with equipment and you made money.
    “Everybody was breaking records, from the national rental chains to the smallest rental companies;
    everybody was having record years, and everybody was raising prices. The conversation was,
    ‘How much are you up?’ And now, the conversation is changing to ‘What’s my market like?’”
    Nickell stressed this shouldn’t be taken as a pessimistic viewpoint. It’s simply coming back
    down to Earth from unprecedented circumstances during the time of Covid. Rental companies are
    still seeing growth, but at a more moderate level.
    """,
    model="command-r-plus"
)

```

diff --git a/fern/pages/tutorials/build-things-with-cohere.mdx b/fern/pages/tutorials/build-things-with-cohere.mdx
new file mode 100644
index 00000000..e9391a3a
--- /dev/null
+++ b/fern/pages/tutorials/build-things-with-cohere.mdx
@@ -0,0 +1,97 @@
+---
+title: Build Things with Cohere!
+slug: /docs/build-things-with-cohere
+---
+
+Welcome to our hands-on introduction to Cohere! This section is split over seven different tutorials, each focusing on one use case leveraging our Chat, Embed, and Rerank endpoints:
+
+- Part 1: Installation and Setup (the document you're reading now)
+- [Part 2: Text Generation](/docs/text-generation-tutorial)
+- [Part 3: Chatbots](/docs/building-a-chatbot-with-cohere)
+- [Part 4: Semantic Search](/docs/semantic-search-with-cohere)
+- [Part 5: Reranking](/docs/reranking-with-cohere)
+- [Part 6: Retrieval-Augmented Generation (RAG)](/docs/rag-with-cohere)
+- [Part 7: Agents with Tool Use](/docs/building-an-agent-with-cohere)
+
+Your learning is structured around building an onboarding assistant that helps new hires at Co1t, a fictitious company. The assistant can help write introductions, answer user questions about the company, search for information from e-mails, and create meeting appointments.
+
+We recommend that you follow the parts sequentially. However, feel free to skip to specific parts if you want (apart from Part 1, which is a prerequisite) because each part also works as a standalone tutorial.

## Installation and Setup

The Cohere platform lets developers access large language model (LLM) capabilities with a few lines of code. These LLMs can solve a broad spectrum of natural language use cases, including classification, semantic search, paraphrasing, summarization, and content generation.

Cohere's models can be accessed through the [playground](https://dashboard.cohere.ai/playground/generate?model=xlarge), SDK, and CLI tool. We support SDKs in four different languages: Python, TypeScript, Java, and Go. For these tutorials, we'll use the Python SDK and access the models through the Cohere platform with an API key.

To get started, first install the Cohere Python SDK.

```python PYTHON
! pip install -U cohere
```

Next, we'll import the `cohere` library and create a client to be used throughout the examples. We create a client by passing the Cohere API key as an argument. To get an API key, [sign up with Cohere](https://dashboard.cohere.com/welcome/register) and get the API key [from the dashboard](https://dashboard.cohere.com/api-keys).
+

```python PYTHON
import cohere

co = cohere.Client(api_key="YOUR_COHERE_API_KEY") # Get your API key here: https://dashboard.cohere.com/api-keys
```

# Accessing Cohere from Other Platforms

The Cohere platform is the fastest way to access Cohere's models and get started.

However, if you prefer other options, you can access Cohere's models through other platforms such as Amazon Bedrock, Amazon SageMaker, Azure AI Studio, and Oracle Cloud Infrastructure (OCI) Generative AI Service.

Read this documentation on [Cohere SDK cloud platform compatibility](/docs/cohere-works-everywhere). In the sections below, we sketch what it looks like to access Cohere models through other means, but we link out to more extensive treatments if you'd like additional detail.

## Amazon Bedrock

The following is how you can create a Cohere client on Amazon Bedrock.

For further information, read this documentation on [Cohere on Bedrock](/docs/cohere-on-aws#amazon-bedrock).

```python PYTHON
import cohere

co = cohere.BedrockClient(
    aws_region="...",
    aws_access_key="...",
    aws_secret_key="...",
    aws_session_token="...",
)
```

## Amazon SageMaker

The following is how you can create a Cohere client on Amazon SageMaker.

For further information, read this documentation on [Cohere on SageMaker](/docs/cohere-on-aws#amazon-sagemaker).

```python PYTHON
import cohere

co = cohere.SagemakerClient(
    aws_region="us-east-1",
    aws_access_key="...",
    aws_secret_key="...",
    aws_session_token="...",
)
```

## Microsoft Azure

The following is how you can create a Cohere client on Microsoft Azure.

For further information, read this documentation on [Cohere on Azure](/docs/cohere-on-microsoft-azure).

```python PYTHON
import cohere

co = cohere.Client(
    api_key="...",
    base_url="...",
)
```

In Part 2, we'll get started with the first use case - [text generation](/docs/text-generation-tutorial).

diff --git a/fern/pages/tutorials/build-things-with-cohere/building-a-chatbot-with-cohere.mdx b/fern/pages/tutorials/build-things-with-cohere/building-a-chatbot-with-cohere.mdx
new file mode 100644
index 00000000..695be363
--- /dev/null
+++ b/fern/pages/tutorials/build-things-with-cohere/building-a-chatbot-with-cohere.mdx
@@ -0,0 +1,215 @@
+---
+title: Building a Chatbot with Cohere
+slug: /docs/building-a-chatbot-with-cohere
+---
+
+Open in Colab
+
+As its name implies, the Chat endpoint enables developers to build chatbots that can handle conversations. At the core of a conversation is a multi-turn dialog between the user and the chatbot. This requires the chatbot to keep the state (or “memory”) of all the previous turns in order to maintain the conversation.

In this tutorial, you'll learn about:

- Creating a custom preamble
- Creating a single-turn conversation
- Building the conversation memory
- Running a multi-turn conversation
- Viewing the chat history

You'll learn these by building an onboarding assistant for new hires.

## Setup

To get started, first we need to install the `cohere` library and create a Cohere client.

```python PYTHON
# pip install cohere

import cohere

co = cohere.Client("COHERE_API_KEY") # Get your API key: https://dashboard.cohere.com/api-keys
```

## Creating a custom preamble

A conversation starts with a system message, or a [preamble](/docs/preambles), to help steer a chatbot’s response toward certain characteristics.
+
+For example, if we want the chatbot to adopt a formal style, the preamble can be used to encourage the generation of more business-like and professional responses.
+
+The recommended approach is to use two H2 Markdown headers: "Task and Context" and "Style Guide" in that exact order.
+
+In the example below, the preamble provides context for the assistant's task (task and context) and encourages the generation of rhymes as much as possible (style guide).
+
+```python PYTHON
+# Add the user message
+message = "I'm joining a new startup called Co1t today. Could you help me write a short introduction message to my teammates."
+
+# Create a custom preamble
+preamble="""## Task and Context
+You are an assistant who assists new employees of Co1t with their first week.
+
+## Style Guide
+Try to speak in rhymes as much as possible. Be professional."""
+
+# Generate the response
+response = co.chat(message=message,
+                   preamble=preamble)
+
+print(response.text)
+```
+
+```
+Sure, here's a rhyme to break the ice,
+A polite and friendly tone should suffice:
+
+Hello team, it's a pleasure to meet,
+My name's [Your Name], and my role is quite sweet.
+
+I'm thrilled to join Co1t, a startup so bright,
+Where innovation and talent ignite.
+
+My role here is [Your Role], a position brand new,
+Where I'll contribute and learn from you.
+
+I look forward to working together in harmony,
+Exchanging ideas and creating synergy.
+
+Feel free to connect, and let's start anew,
+I'm excited to be part of this team, me and you!
+
+Cheers to a great first week,
+And many successes, unique and sleek!
+
+Let's collaborate and soar,
+Co1t's future is bright, that's for sure!
+
+Regards,
+[Your Name]
+
+(P.S. I'm a poet and didn't know it!)
+```
+
+Further reading:
+
+- [Documentation on preambles](/docs/preambles)
+
+## Creating a single-turn conversation
+
+Let's start with a single-turn conversation, which doesn't require the chatbot to maintain any conversation state.
+
+Here, we are also adding a custom preamble for generating concise responses, just to keep the outputs brief for this tutorial.
+
+```python PYTHON
+# Add the user message
+message = "I'm joining a new startup called Co1t today. Could you help me write a short introduction message to my teammates."
+
+# Create a custom preamble
+preamble="""## Task & Context
+Generate concise responses, with a maximum of one sentence."""
+
+# Generate the response
+response = co.chat(message=message,
+                   preamble=preamble)
+
+print(response.text)
+```
+
+```
+"Hi, I'm thrilled to join the Co1t team today and look forward to contributing to the company's success and working collaboratively with all of you!"
+```
+
+## Building the conversation memory
+
+Now, we want the model to refine the earlier response. This requires the next generation to have access to the state, or memory, of the conversation.
+
+To do this, we add the `chat_history` argument, which takes the current chat history as the value.
+
+You can get the current chat history by taking the `response.chat_history` object from the previous response.
+
+Looking at the response, we see that the model is able to get the context from the chat history. The model is able to capture that "it" in the user message refers to the introduction message it had generated earlier.
+
+```python PYTHON
+# Add the user message
+message = "Make it more upbeat and conversational."
+ +# Generate the response with the current chat history as the context +response = co.chat(message=message, + preamble=preamble, + chat_history=response.chat_history) + +print(response.text) +``` + +``` +"Hey, I'm stoked to be a part of the Co1t crew! Can't wait to dive in and work together to make our startup vision a reality!" +``` + +Further reading: + +- [Documentation on using the Chat endpoint](/docs/chat-api) + +## Running a multi-turn conversation + +You can continue doing this for any number of turns by passing the most recent `response.chat_history` value, which contains the conversation history from the beginning. + +```python PYTHON +# Add the user message +message = "Thanks. Could you create another one for my DM to my manager." + +# Generate the response with the current chat history as the context +response = co.chat(message=message, + preamble=preamble, + chat_history=response.chat_history) + +print(response.text) +``` + +``` +"Super excited to be a part of the Co1t family! Looking forward to learning from your expertise and guidance and contributing my best to the team's success under your management." +``` + +## Viewing the chat history + +To look at the current chat history, you can print the `response.chat_history` object, which contains a list of `USER` and `CHATBOT` turns in the same sequence as they were created. + +```python PYTHON +# View the chat history +for turn in response.chat_history: + print("Role:",turn.role) + print("Message:",turn.message,"\n") +``` + +``` +Role: USER +Message: I'm joining a new startup called Co1t today. Could you help me write a short introduction message to my teammates. + +Role: CHATBOT +Message: "Hi, I'm thrilled to join the Co1t team today and look forward to contributing to the company's success and working collaboratively with all of you!" + +Role: USER +Message: Make it more upbeat and conversational. + +Role: CHATBOT +Message: "Hey, I'm stoked to be a part of the Co1t crew! Can't wait to dive in and work together to make our startup vision a reality!" + +Role: USER +Message: Thanks. Could you create another one for my DM to my manager. + +Role: CHATBOT +Message: "Super excited to be a part of the Co1t family! Looking forward to learning from your expertise and guidance and contributing my best to the team's success under your management." +``` + +## Conclusion + +In this tutorial, you learned about: + +- How to create a custom preamble +- How to create a single-turn conversation +- How to build the conversation memory +- How to run a multi-turn conversation +- How to view the chat history + +You will use the same method for running a multi-turn conversation when you learn about other use cases such as [RAG](/docs/rag-with-cohere) (Part 6) and [tool use](/docs/building-an-agent-with-cohere) (Part 7). + +But to fully leverage these other capabilities, you will need another type of language model that generates text representations, or embeddings. + +In Part 4, you will learn how text embeddings can power an important use case for RAG, which is [semantic search](/docs/semantic-search-with-cohere). 
diff --git a/fern/pages/tutorials/build-things-with-cohere/building-an-agent-with-cohere.mdx b/fern/pages/tutorials/build-things-with-cohere/building-an-agent-with-cohere.mdx
new file mode 100644
index 00000000..01d26639
--- /dev/null
+++ b/fern/pages/tutorials/build-things-with-cohere/building-an-agent-with-cohere.mdx
@@ -0,0 +1,373 @@
+---
+title: Building an Agent with Cohere
+slug: /docs/building-an-agent-with-cohere
+---
+
+Open in Colab
+
+Tool use extends the ideas from [RAG](/docs/rag-with-cohere), where external systems are used to guide the response of an LLM, by leveraging a much bigger set of tools than what’s possible with RAG. Tool use takes advantage of LLMs' ability to act as a reasoning and decision-making engine.
+
+While RAG enables applications that can _answer questions_, tool use enables those that can _automate tasks_.
+
+Tool use also enables developers to build agentic applications that can take actions, that is, perform both read and write operations on an external system.
+
+In this tutorial, you'll learn about:
+
+- Creating tools
+- Tool planning and calling
+- Tool execution
+- Response and citation generation
+- Multi-step tool use
+
+You'll learn these by building an onboarding assistant for new hires.
+
+## Setup
+
+To get started, first we need to install the `cohere` library and create a Cohere client.
+
+```python PYTHON
+# pip install cohere numpy
+
+import numpy as np
+import cohere
+
+co = cohere.Client("COHERE_API_KEY") # Get your API key: https://dashboard.cohere.com/api-keys
+```
+
+## Creating tools
+
+The prerequisite, before we can run a [tool use workflow](/docs/tools), is to set up the tools. Let's create three tools:
+
+- `search_faqs`: A tool for searching the FAQs. For simplicity, we'll not implement any retrieval logic, but we'll simply pass a list of pre-defined documents, which are the FAQ documents we used in the text embeddings section.
+- `search_emails`: A tool for searching the emails. Same as above, we'll simply pass a list of pre-defined emails from the Reranking section.
+- `create_calendar_event`: A tool for creating new calendar events. Again, for simplicity, we'll not implement actual event bookings, but will return a mock success event. In practice, we can connect to a calendar service API and implement all the necessary logic here.
+
+Here, we are defining a Python function for each tool, but more broadly, the tool can be any function or service that can receive and send objects.
+
+```python PYTHON
+# Create the tools
+def search_faqs(query):
+    faqs = [
+        {"text": "Reimbursing Travel Expenses: Easily manage your travel expenses by submitting them through our finance tool. Approvals are prompt and straightforward."},
+        {"text": "Working from Abroad: Working remotely from another country is possible. Simply coordinate with your manager and ensure your availability during core hours."}
+    ]
+    return {"faqs" : faqs}
+
+def search_emails(query):
+    emails = [
+        {"from": "it@co1t.com", "to": "david@co1t.com", "date": "2024-06-24", "subject": "Setting Up Your IT Needs", "text": "Greetings! To ensure a seamless start, please refer to the attached comprehensive guide, which will assist you in setting up all your work accounts."},
+        {"from": "john@co1t.com", "to": "david@co1t.com", "date": "2024-06-24", "subject": "First Week Check-In", "text": "Hello! I hope you're settling in well. Let's connect briefly tomorrow to discuss how your first week has been going. 
Also, make sure to join us for a welcoming lunch this Thursday at noon—it's a great opportunity to get to know your colleagues!"}
+    ]
+    return {"emails" : emails}
+
+def create_calendar_event(date: str, time: str, duration: int):
+    # You can implement any logic here
+    return {"is_success": True,
+            "message": f"Created a {duration} hour long event at {time} on {date}"}
+
+functions_map = {
+    "search_faqs": search_faqs,
+    "search_emails": search_emails,
+    "create_calendar_event": create_calendar_event
+}
+```
+
+The second and final setup step is to define the tool schemas in a format that can be passed to the Chat endpoint. The schema must contain the following fields: `name`, `description`, and `parameter_definitions` in the format shown below.
+
+This schema informs the LLM about what the tool does, and the LLM decides whether to use a particular tool based on it. Therefore, the more descriptive and specific the schema, the more likely the LLM will make the right tool call decisions.
+
+```python PYTHON
+# Define the tools
+tools = [
+    {
+        "name": "search_faqs",
+        "description": "Given a user query, searches a company's frequently asked questions (FAQs) list and returns the most relevant matches to the query.",
+        "parameter_definitions": {
+            "query": {
+                "description": "The query from the user",
+                "type": "str",
+                "required": True
+            }
+        }
+    },
+    {
+        "name": "search_emails",
+        "description": "Given a user query, searches a person's emails and returns the most relevant matches to the query.",
+        "parameter_definitions": {
+            "query": {
+                "description": "The query from the user",
+                "type": "str",
+                "required": True
+            }
+        }
+    },
+    {
+        "name": "create_calendar_event",
+        "description": "Creates a new calendar event of the specified duration at the specified time and date. A new event cannot be created at the same time as an existing event.",
+        "parameter_definitions": {
+            "date": {
+                "description": "the date on which the event starts, formatted as mm/dd/yy",
+                "type": "str",
+                "required": True
+            },
+            "time": {
+                "description": "the time of the event, formatted using 24h military time formatting",
+                "type": "str",
+                "required": True
+            },
+            "duration": {
+                "description": "the number of hours the event lasts for",
+                "type": "float",
+                "required": True
+            }
+        }
+    }
+]
+```
+
+## Tool planning and calling
+
+We can now run the tool use workflow. We can think of a tool use system as consisting of four components:
+
+- The user
+- The application
+- The LLM
+- The tools
+
+At its most basic, these four components interact in a workflow through four steps:
+
+- **Step 1: Get user message** – The LLM gets the user message (via the application)
+- **Step 2: Tool planning and calling** – The LLM makes a decision on the tools to call (if any) and generates the tool calls
+- **Step 3: Tool execution** – The application executes the tools and the results are sent to the LLM
+- **Step 4: Response and citation generation** – The LLM generates the response and citations to send back to the user
+
+```python PYTHON
+# Step 1: Get user message
+message = "Any messages about getting set up with IT?"
+
+preamble="""## Task & Context
+You are an assistant who assists new employees of Co1t with their first week. You respond to their questions and assist them with their needs. 
Today is Monday, June 24, 2024""" + +# Step 2: Tool planning and calling +response = co.chat( + message=message, + preamble=preamble, + tools=tools) + +if response.tool_calls: + print("Tool plan:") + print(response.text,"\n") + + print("Tool calls:") + for call in response.tool_calls: + print(f"Tool name: {call.name} | Parameters: {call.parameters}") +``` + +``` +Tool plan: +I will search the user's emails for any messages about getting set up with IT. + +Tool calls: +Tool name: search_emails | Parameters: {'query': 'IT setup'} +``` + +Given three tools to choose from, the model is able to pick the right tool (in this case, `search_emails`) based on what the user is asking for. + +Also, notice that the model first generates a plan about what it should do ("I will do ...") before actually generating the tool call(s). + +## Tool execution + +```python PYTHON +# Step 3: Tool execution +tool_results = [] +for tc in response.tool_calls: + tool_call = {"name": tc.name, "parameters": tc.parameters} + tool_output = functions_map[tc.name](**tc.parameters) + tool_results.append({"call": tool_call, "outputs": [tool_output]}) + +print("Tool results:") +for result in tool_results: + print(result) +``` + +``` +Tool results: +{'call': {'name': 'search_emails', 'parameters': {'query': 'IT setup'}}, 'outputs': [{'emails': [{'from': 'it@co1t.com', 'to': 'david@co1t.com', 'date': '2024-06-24', 'subject': 'Setting Up Your IT Needs', 'text': 'Greetings! To ensure a seamless start, please refer to the attached comprehensive guide, which will assist you in setting up all your work accounts.'}, {'from': 'john@co1t.com', 'to': 'david@co1t.com', 'date': '2024-06-24', 'subject': 'First Week Check-In', 'text': "Hello! I hope you're settling in well. Let's connect briefly tomorrow to discuss how your first week has been going. Also, make sure to join us for a welcoming lunch this Thursday at noon—it's a great opportunity to get to know your colleagues!"}]}]} +``` + +## Response and citation generation + +```python PYTHON +# Step 4: Response and citation generation +response = co.chat( + message="", # In response generation, we set the message as empty + preamble=preamble, + tools=tools, + tool_results=tool_results, + chat_history=response.chat_history +) + +# Print final response +print("Final response:") +print(response.text) +print("="*50) + +# Print citations (if any) +if response.citations: + print("\nCITATIONS:") + for citation in response.citations: + print(citation) + + print("\nCITED REFERENCES:") + for document in response.documents: + print(document) +``` + +``` +Final response: +You have an email from IT with a comprehensive guide attached to help you set up your work accounts. +================================================== + +CITATIONS: +start=12 end=25 text='email from IT' document_ids=['search_emails:0:2:0'] +start=33 end=61 text='comprehensive guide attached' document_ids=['search_emails:0:2:0'] +start=74 end=99 text='set up your work accounts' document_ids=['search_emails:0:2:0'] + +CITED REFERENCES: +{'emails': '[{"date":"2024-06-24","from":"it@co1t.com","subject":"Setting Up Your IT Needs","text":"Greetings! To ensure a seamless start, please refer to the attached comprehensive guide, which will assist you in setting up all your work accounts.","to":"david@co1t.com"},{"date":"2024-06-24","from":"john@co1t.com","subject":"First Week Check-In","text":"Hello! I hope you\'re settling in well. Let\'s connect briefly tomorrow to discuss how your first week has been going. 
Also, make sure to join us for a welcoming lunch this Thursday at noon—it\'s a great opportunity to get to know your colleagues!","to":"david@co1t.com"}]', 'id': 'search_emails:0:2:0', 'tool_name': 'search_emails'}
+```
+
+## Multi-step tool use
+
+The model can execute more complex tasks in tool use – tasks that require tool calls to happen in a sequence. This is referred to as "multi-step" tool use.
+
+Let's create a function called `run_assistant` to implement these steps, and along the way, print out the key events and messages. Optionally, this function also accepts the chat history as an argument to keep the state in a multi-turn conversation.
+
+```python PYTHON
+model = "command-r-plus"
+
+preamble="""## Task & Context
+You are an assistant who assists new employees of Co1t with their first week. You respond to their questions and assist them with their needs. Today is Monday, June 24, 2024"""
+
+# A function that runs multi-step tool use
+def run_assistant(message, chat_history=[]):
+    # Step 1: get user message
+    print(f"Question:\n{message}")
+    print("="*50)
+
+    # Step 2: Generate tool calls (if any)
+    response = co.chat(
+        message=message,
+        model=model,
+        preamble=preamble,
+        tools=tools,
+        chat_history=chat_history
+    )
+
+    # Tool execution loop
+    while response.tool_calls:
+        tool_calls = response.tool_calls
+
+        if response.text:
+            print("Intermediate response:")
+            print(response.text,"\n")
+        print("Tool calls:")
+        for call in tool_calls:
+            print(f"Tool name: {call.name} | Parameters: {call.parameters}")
+        print("="*50)
+
+        # Step 3: Get tool results
+        tool_results = []
+        for tc in tool_calls:
+            tool_call = {"name": tc.name, "parameters": tc.parameters}
+            tool_output = functions_map[tc.name](**tc.parameters)
+            tool_results.append({"call": tool_call, "outputs": [tool_output]})
+
+        # Step 4: Generate response and citations
+        response = co.chat(
+            message="",
+            model=model,
+            preamble=preamble,
+            tools=tools,
+            tool_results=tool_results,
+            chat_history=response.chat_history
+        )
+
+    chat_history = response.chat_history
+
+    # Print final response
+    print("Final response:")
+    print(response.text)
+    print("="*50)
+
+    # Print citations (if any)
+    if response.citations:
+        print("\nCITATIONS:")
+        for citation in response.citations:
+            print(citation)
+
+        print("\nCITED REFERENCES:")
+        for document in response.documents:
+            print(document)
+
+    return chat_history
+```
+
+To illustrate the concept of multi-step tool use, let's ask the assistant to block time for any lunch invites received in the email.
+
+This requires tasks to happen over multiple steps in a sequence. Here, we see the assistant running these steps:
+
+- First, it calls the `search_emails` tool to find any lunch invites, and finds one.
+- Next, it calls the `create_calendar_event` tool to create an event to block the person's calendar on the day mentioned in the email.
+
+This is also an example of tool use enabling a write operation, in addition to the read operations we saw with RAG.
+
+```python PYTHON
+chat_history = run_assistant("Can you check if there are any lunch invites, and for those days, block an hour on my calendar from 12-1PM.")
+```
+
+```
+Question:
+Can you check if there are any lunch invites, and for those days, block an hour on my calendar from 12-1PM.
+==================================================
+Intermediate response:
+I will search the user's emails for lunch invites, and then create calendar events for the dates and times of those invites.
+
+Tool calls:
+Tool name: search_emails | Parameters: {'query': 'lunch invite'}
+==================================================
+Intermediate response:
+I have found one lunch invite for Thursday 27 June at noon. I will now create a calendar event for this.
+
+Tool calls:
+Tool name: create_calendar_event | Parameters: {'date': '06/27/24', 'duration': 1, 'time': '12:00'}
+==================================================
+Final response:
+I found one lunch invite for Thursday 27 June at noon. I have created a calendar event for this.
+==================================================
+
+CITATIONS:
+start=29 end=53 text='Thursday 27 June at noon' document_ids=['search_emails:0:2:0']
+start=62 end=95 text='created a calendar event for this' document_ids=['create_calendar_event:0:4:0']
+
+CITED REFERENCES:
+{'emails': '[{"date":"2024-06-24","from":"it@co1t.com","subject":"Setting Up Your IT Needs","text":"Greetings! To ensure a seamless start, please refer to the attached comprehensive guide, which will assist you in setting up all your work accounts.","to":"david@co1t.com"},{"date":"2024-06-24","from":"john@co1t.com","subject":"First Week Check-In","text":"Hello! I hope you\'re settling in well. Let\'s connect briefly tomorrow to discuss how your first week has been going. Also, make sure to join us for a welcoming lunch this Thursday at noon—it\'s a great opportunity to get to know your colleagues!","to":"david@co1t.com"}]', 'id': 'search_emails:0:2:0', 'tool_name': 'search_emails'}
+{'id': 'create_calendar_event:0:4:0', 'is_success': 'true', 'message': 'Created a 1 hour long event at 12:00 on 06/27/24', 'tool_name': 'create_calendar_event'}
+```
+
+In this tutorial, you learned about:
+
+- How to create tools
+- How tool planning and calling happens
+- How tool execution happens
+- How to generate the response and citations
+- How to run tool use in a multi-step scenario
+
+And that concludes our 7-part Cohere tutorial. We hope these tutorials have provided you with a foundational understanding of the Cohere API, the available models and endpoints, and the types of use cases that you can build with them.
+
+To continue your learning, check out:
+
+- [LLM University - A range of courses and step-by-step guides to help you start building](https://cohere.com/llmu)
+- [Cookbooks - A collection of basic to advanced example applications](/page/cookbooks)
+- [Cohere's documentation](/docs/the-cohere-platform)
+- [The Cohere API reference](/reference/about)
diff --git a/fern/pages/tutorials/build-things-with-cohere/rag-with-cohere.mdx b/fern/pages/tutorials/build-things-with-cohere/rag-with-cohere.mdx
new file mode 100644
index 00000000..9ebc2530
--- /dev/null
+++ b/fern/pages/tutorials/build-things-with-cohere/rag-with-cohere.mdx
@@ -0,0 +1,373 @@
+---
+title: RAG with Cohere
+slug: /docs/rag-with-cohere
+---
+
+Open in Colab
+
+The Chat endpoint provides comprehensive support for various text generation use cases, including retrieval-augmented generation (RAG).
+
+While LLMs are good at maintaining the context of the conversation and generating responses, they are prone to hallucination and can include factually incorrect or incomplete information in their responses.
+
+RAG enables a model to access and utilize supplementary information from external documents, thereby improving the accuracy of its responses.
+
+When using RAG with the Chat endpoint, these responses are backed by fine-grained citations linking to the source documents. This makes the responses easily verifiable. 
+
+In this tutorial, you'll learn about:
+
+- Basic RAG
+- Search query generation
+- Retrieval with Embed
+- Reranking with Rerank
+- Response and citation generation
+
+You'll learn these by building an onboarding assistant for new hires.
+
+## Setup
+
+To get started, first we need to install the `cohere` library and create a Cohere client.
+
+```python PYTHON
+# pip install cohere numpy
+
+import numpy as np
+import cohere
+
+co = cohere.Client("COHERE_API_KEY") # Get your API key: https://dashboard.cohere.com/api-keys
+```
+
+## Basic RAG
+
+To see how RAG works, let's define the documents that the application has access to. We'll use a short list of documents consisting of internal FAQs about the fictitious company Co1t (in production, such document collections can be massive).
+
+In this example, each document is a dictionary with one field, `text`. But we can define any number of fields we want, depending on the nature of the documents. For example, emails could contain `title` and `text` fields.
+
+```python PYTHON
+# Define the documents
+faqs_short = [
+    {"text": "Reimbursing Travel Expenses: Easily manage your travel expenses by submitting them through our finance tool. Approvals are prompt and straightforward."},
+    {"text": "Working from Abroad: Working remotely from another country is possible. Simply coordinate with your manager and ensure your availability during core hours."},
+    {"text": "Health and Wellness Benefits: We care about your well-being and offer gym memberships, on-site yoga classes, and comprehensive health insurance."},
+    {"text": "Performance Reviews Frequency: We conduct informal check-ins every quarter and formal performance reviews twice a year."}
+]
+```
+
+To use these documents, we pass them to the `documents` parameter in the Chat endpoint call. This tells the model to run in RAG mode and use these documents in its response.
+
+Let's create a query asking about the company's support for personal well-being, which is not going to be available to the model based on the data it's trained on. It will need to use external documents.
+
+RAG introduces additional objects in the Chat response. Here we display two:
+
+- `citations`: indicate the specific text spans from the retrieved documents on which the response is grounded.
+- `documents`: the IDs of the documents referenced in the citations.
+
+```python PYTHON
+# Add the user query
+query = "Are there fitness-related perks?"
+
+# Generate the response
+response = co.chat(
+    message=query,
+    model="command-r-plus",
+    documents=faqs_short)
+
+# Display the response
+print(response.text)
+
+# Display the citations and source documents
+if response.citations:
+    print("\nCITATIONS:")
+    for citation in response.citations:
+        print(citation)
+
+    print("\nDOCUMENTS:")
+    for document in response.documents:
+        print(document)
+```
+
+```
+Yes, we offer health and wellness benefits, including gym memberships, on-site yoga classes, and comprehensive health insurance.
+
+CITATIONS:
+start=14 end=42 text='health and wellness benefits' document_ids=['doc_2']
+start=54 end=69 text='gym memberships' document_ids=['doc_2']
+start=71 end=91 text='on-site yoga classes' document_ids=['doc_2']
+start=97 end=128 text='comprehensive health insurance.' 
document_ids=['doc_2']
+
+DOCUMENTS:
+{'id': 'doc_2', 'text': 'Health and Wellness Benefits: We care about your well-being and offer gym memberships, on-site yoga classes, and comprehensive health insurance.'}
+```
+
+Further reading:
+
+- [Chat endpoint API reference](/reference/chat)
+- [Documentation on RAG](/docs/retrieval-augmented-generation-rag)
+- [LLM University module on RAG](https://cohere.com/llmu#rag)
+
+## Search query generation
+
+The previous example showed how to get started with RAG, and in particular, the augmented generation portion of RAG. But as its name implies, RAG consists of other steps, such as retrieval.
+
+In a basic RAG application, the steps involved are:
+
+- Transforming the user message into search queries
+- Retrieving relevant documents for a given search query
+- Generating the response and citations
+
+Let's now look at the first step—search query generation. The chatbot needs to generate an optimal set of search queries to use for retrieval.
+
+The Chat endpoint has a feature that handles this for us automatically. This is done by adding the `search_queries_only=True` parameter to the Chat endpoint call.
+
+It will generate a list of search queries based on a user message. Depending on the message, it can be one or more queries.
+
+In the example below, the resulting queries break down the user message into two separate queries.
+
+```python PYTHON
+# Add the user query
+query = "How to stay connected with the company and do you organize team events?"
+
+# Generate the search queries
+response = co.chat(message=query,
+                   search_queries_only=True)
+
+queries = []
+for r in response.search_queries:
+    queries.append(r.text)
+
+print(queries)
+```
+
+```
+['staying connected with the company', 'team events']
+```
+
+And in the example below, the model decides that one query is sufficient.
+
+```python PYTHON
+# Add the user query
+query = "How flexible are the working hours"
+
+# Generate the search queries
+response = co.chat(message=query,
+                   search_queries_only=True)
+
+queries = []
+for r in response.search_queries:
+    queries.append(r.text)
+
+print(queries)
+```
+
+```
+['working hours flexibility']
+```
+
+## Retrieval with Embed
+
+Given the search query, we need a way to retrieve the most relevant documents from a large collection of documents.
+
+This is where we can leverage text embeddings through the Embed endpoint. It enables semantic search, which lets us compare the semantic meaning of the documents and the query. It solves the problem faced by the more traditional approach of lexical search, which is great at finding keyword matches, but struggles to capture the context or meaning of a piece of text.
+
+The Embed endpoint takes in texts as input and returns embeddings as output.
+
+First, we need to embed the documents to search from. We call the Embed endpoint using `co.embed()` and pass the following arguments:
+
+- `model`: Here we choose `embed-english-v3.0`, which generates embeddings of size 1024
+- `input_type`: We choose `search_document` to ensure the model treats these as the documents (instead of the query) for search
+- `texts`: The list of texts (the FAQs)
+
+```python PYTHON
+# Define the documents
+faqs_long = [
+    {"text": "Joining Slack Channels: You will receive an invite via email. 
Be sure to join relevant channels to stay informed and engaged."},
+    {"text": "Finding Coffee Spots: For your caffeine fix, head to the break room's coffee machine or cross the street to the café for artisan coffee."},
+    {"text": "Team-Building Activities: We foster team spirit with monthly outings and weekly game nights. Feel free to suggest new activity ideas anytime!"},
+    {"text": "Working Hours Flexibility: We prioritize work-life balance. While our core hours are 9 AM to 5 PM, we offer flexibility to adjust as needed."},
+    {"text": "Side Projects Policy: We encourage you to pursue your passions. Just be mindful of any potential conflicts of interest with our business."},
+    {"text": "Reimbursing Travel Expenses: Easily manage your travel expenses by submitting them through our finance tool. Approvals are prompt and straightforward."},
+    {"text": "Working from Abroad: Working remotely from another country is possible. Simply coordinate with your manager and ensure your availability during core hours."},
+    {"text": "Health and Wellness Benefits: We care about your well-being and offer gym memberships, on-site yoga classes, and comprehensive health insurance."},
+    {"text": "Performance Reviews Frequency: We conduct informal check-ins every quarter and formal performance reviews twice a year."},
+    {"text": "Proposing New Ideas: Innovation is welcomed! Share your brilliant ideas at our weekly team meetings or directly with your team lead."},
+]
+
+# Embed the documents
+doc_emb = co.embed(
+    model="embed-english-v3.0",
+    input_type="search_document",
+    texts=[doc['text'] for doc in faqs_long]).embeddings
+```
+
+Next, we add a query, which asks about how to get to know the team.
+
+We choose `search_query` as the `input_type` to ensure the model treats this as the query (instead of the documents) for search.
+
+```python PYTHON
+# Add the user query
+query = "How to get to know my teammates"
+
+# Generate the search query
+response = co.chat(message=query,
+                   search_queries_only=True)
+query_optimized = response.search_queries[0].text
+
+# Embed the search query
+query_emb = co.embed(
+    model="embed-english-v3.0",
+    input_type="search_query",
+    texts=[query_optimized]).embeddings
+```
+
+Now, we want to search for the most relevant documents to the query. For this, we make use of the `numpy` library to compute the similarity between each query-document pair using the dot product approach.
+
+Each query-document pair returns a score, which represents how similar the pair is. We then sort these scores in descending order and select the most similar pairs; here, we choose the top 5 (an arbitrary choice, you can choose any number).
+
+Here, we show the most relevant documents with their similarity scores.
+
+```python PYTHON
+# Compute dot product similarity and display results
+n = 5
+scores = np.dot(query_emb, np.transpose(doc_emb))[0]
+scores_sorted = sorted(enumerate(scores), key=lambda x: x[1], reverse=True)[:n]
+
+retrieved_documents = [faqs_long[item[0]] for item in scores_sorted]
+
+for idx, item in enumerate(scores_sorted):
+    print(f"Rank: {idx+1}")
+    print(f"Score: {item[1]}")
+    print(f"Document: {faqs_long[item[0]]}\n")
+```
+
+```
+Rank: 1
+Score: 0.32675385963873044
+Document: {'text': 'Team-Building Activities: We foster team spirit with monthly outings and weekly game nights. Feel free to suggest new activity ideas anytime!'}
+
+Rank: 2
+Score: 0.2683516879250747
+Document: {'text': 'Proposing New Ideas: Innovation is welcomed! 
Share your brilliant ideas at our weekly team meetings or directly with your team lead.'}
+
+Rank: 3
+Score: 0.25784017142593213
+Document: {'text': 'Joining Slack Channels: You will receive an invite via email. Be sure to join relevant channels to stay informed and engaged.'}
+
+Rank: 4
+Score: 0.18610347850687634
+Document: {'text': "Finding Coffee Spots: For your caffeine fix, head to the break room's coffee machine or cross the street to the café for artisan coffee."}
+
+Rank: 5
+Score: 0.12958686394309055
+Document: {'text': 'Health and Wellness Benefits: We care about your well-being and offer gym memberships, on-site yoga classes, and comprehensive health insurance.'}
+```
+
+Further reading:
+
+- [Embed endpoint API reference](/reference/embed)
+- [Documentation on the Embed endpoint](/docs/embeddings)
+- [Documentation on the models available on the Embed endpoint](/docs/cohere-embed)
+
+## Reranking with Rerank
+
+Reranking can further boost the results from semantic or lexical search. The Rerank endpoint takes a list of search results and reranks them according to the most relevant documents to a query. This requires just a single line of code to implement.
+
+We call the endpoint using `co.rerank()` and pass the following arguments:
+
+- `query`: The user query
+- `documents`: The list of documents we get from the semantic search results
+- `top_n`: The top reranked documents to select
+- `model`: We choose Rerank English 3
+
+Looking at the results, we see that, given a query about getting to know the team, the document that talks about joining Slack channels is now ranked higher (1st) compared to earlier (3rd).
+
+Here we select `top_n` to be 2; these will be the documents we pass next for response generation.
+
+```python PYTHON
+# Rerank the documents
+results = co.rerank(query=query_optimized,
+                    documents=retrieved_documents,
+                    top_n=2,
+                    model='rerank-english-v3.0')
+
+# Display the reranking results
+for idx, result in enumerate(results.results):
+    print(f"Rank: {idx+1}")
+    print(f"Score: {result.relevance_score}")
+    print(f"Document: {retrieved_documents[result.index]}\n")
+
+reranked_documents = [retrieved_documents[result.index] for result in results.results]
+```
+
+```
+Rank: 1
+Score: 0.0040072887
+Document: {'text': 'Joining Slack Channels: You will receive an invite via email. Be sure to join relevant channels to stay informed and engaged.'}
+
+Rank: 2
+Score: 0.0020829707
+Document: {'text': 'Team-Building Activities: We foster team spirit with monthly outings and weekly game nights. Feel free to suggest new activity ideas anytime!'}
+```
+
+Further reading:
+
+- [Rerank endpoint API reference](/reference/rerank)
+- [Documentation on Rerank](/docs/overview)
+- [Documentation on Rerank fine-tuning](/docs/rerank-fine-tuning)
+- [Documentation on Rerank best practices](/docs/reranking-best-practices)
+
+## Response and citation generation
+
+Finally, we reach the step that we saw in the earlier `Basic RAG` section. Here, the response is generated based on the query and the documents retrieved.
+
+RAG introduces additional objects in the Chat response. Here we display two:
+
+- `citations`: indicate the specific spans of text from the retrieved documents on which the response is grounded.
+- `documents`: the IDs of the documents being referenced in the citations. 
+
+```python PYTHON
+# Generate the response
+response = co.chat(
+    message=query_optimized,
+    model="command-r-plus",
+    documents=reranked_documents)
+
+# Display the response
+print(response.text)
+
+# Display the citations and source documents
+if response.citations:
+    print("\nCITATIONS:")
+    for citation in response.citations:
+        print(citation)
+
+    print("\nDOCUMENTS:")
+    for document in response.documents:
+        print(document)
+```
+
+```
+There are a few ways to get to know your teammates. You could join your company's Slack channels to stay informed and connected. You could also take part in team-building activities, such as outings and game nights.
+
+CITATIONS:
+start=62 end=96 text="join your company's Slack channels" document_ids=['doc_0']
+start=100 end=128 text='stay informed and connected.' document_ids=['doc_0']
+start=157 end=181 text='team-building activities' document_ids=['doc_1']
+start=191 end=215 text='outings and game nights.' document_ids=['doc_1']
+
+DOCUMENTS:
+{'id': 'doc_0', 'text': 'Joining Slack Channels: You will receive an invite via email. Be sure to join relevant channels to stay informed and engaged.'}
+{'id': 'doc_1', 'text': 'Team-Building Activities: We foster team spirit with monthly outings and weekly game nights. Feel free to suggest new activity ideas anytime!'}
+```
+
+## Conclusion
+
+In this tutorial, you learned about:
+
+- How to get started with RAG
+- How to generate search queries
+- How to perform retrieval with Embed
+- How to perform reranking with Rerank
+- How to generate the response and citations
+
+RAG is great for building applications that can _answer questions_ by grounding the response in external documents. But you can unlock the ability to not just answer questions, but also _automate tasks_. This can be done using a technique called tool use.
+
+In Part 7, you will learn how to leverage [tool use](/docs/building-an-agent-with-cohere) to automate tasks and workflows.
diff --git a/fern/pages/tutorials/build-things-with-cohere/reranking-with-cohere.mdx b/fern/pages/tutorials/build-things-with-cohere/reranking-with-cohere.mdx
new file mode 100644
index 00000000..51453db4
--- /dev/null
+++ b/fern/pages/tutorials/build-things-with-cohere/reranking-with-cohere.mdx
@@ -0,0 +1,245 @@
+---
+title: Reranking with Cohere
+slug: /docs/reranking-with-cohere
+---
+
+Open in Colab
+
+Reranking is a technique that leverages [embeddings](/docs/embeddings) as the last stage of a retrieval process, and is especially useful in [RAG systems](/docs/retrieval-augmented-generation-rag).
+
+We can rerank results from semantic search as well as from any other search system, such as lexical search. This means that companies can retain an existing keyword-based (also called “lexical”) or semantic search system for the first-stage retrieval and integrate the [Rerank endpoint](/docs/rerank-2) in the second-stage reranking.
+
+In this tutorial, you'll learn about:
+
+- Reranking lexical/semantic search results
+- Reranking semi-structured data
+- Reranking tabular data
+- Multilingual reranking
+
+You'll learn these by building an onboarding assistant for new hires.
+
+## Setup
+
+To get started, first we need to install the `cohere` library and create a Cohere client.
+
+```python PYTHON
+# pip install cohere numpy
+
+import numpy as np
+import cohere
+
+co = cohere.Client("COHERE_API_KEY") # Get your API key: https://dashboard.cohere.com/api-keys
+```
+
+## Reranking lexical/semantic search results
+
+Rerank requires just a single line of code to implement. 
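+
+For instance, the entire reranking step can be a single call like the following (a minimal sketch; it assumes `query` and `documents` are already defined, as in the examples below):
+
+```python PYTHON
+# Rerank a list of candidate documents against a query in one call
+results = co.rerank(query=query,
+                    documents=documents,
+                    top_n=3,
+                    model='rerank-english-v3.0')
+```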
+
+Suppose we have a list of search results from an FAQ list, which can come from semantic, lexical, or any other type of search system. But this list may not be optimally ranked for relevance to the user query.
+
+This is where Rerank can help. We call the endpoint using `co.rerank()` and pass the following arguments:
+
+- `query`: The user query
+- `documents`: The list of documents
+- `top_n`: The top reranked documents to select
+- `model`: We choose Rerank English 3
+
+```python PYTHON
+# Define the documents
+faqs_short = [
+    {"text": "Reimbursing Travel Expenses: Easily manage your travel expenses by submitting them through our finance tool. Approvals are prompt and straightforward."},
+    {"text": "Working from Abroad: Working remotely from another country is possible. Simply coordinate with your manager and ensure your availability during core hours."},
+    {"text": "Health and Wellness Benefits: We care about your well-being and offer gym memberships, on-site yoga classes, and comprehensive health insurance."},
+    {"text": "Performance Reviews Frequency: We conduct informal check-ins every quarter and formal performance reviews twice a year."}
+]
+```
+
+```python PYTHON
+# Add the user query
+query = "Are there fitness-related perks?"
+
+# Rerank the documents
+results = co.rerank(query=query,
+                    documents=faqs_short,
+                    top_n=2,
+                    model='rerank-english-v3.0')
+
+print(results)
+```
+
+```
+id='9633b278-93ff-4664-a142-7d9dcf0ec0e5' results=[RerankResponseResultsItem(document=None, index=2, relevance_score=0.01798621), RerankResponseResultsItem(document=None, index=3, relevance_score=8.463939e-06)] meta=ApiMeta(api_version=ApiMetaApiVersion(version='1', is_deprecated=None, is_experimental=None), billed_units=ApiMetaBilledUnits(input_tokens=None, output_tokens=None, search_units=1, classifications=None), tokens=None, warnings=None)
+```
+
+```python PYTHON
+# Display the reranking results
+def return_results(results, documents):
+    for idx, result in enumerate(results.results):
+        print(f"Rank: {idx+1}")
+        print(f"Score: {result.relevance_score}")
+        print(f"Document: {documents[result.index]}\n")
+
+return_results(results, faqs_short)
+```
+
+```
+Rank: 1
+Score: 0.01798621
+Document: {'text': 'Health and Wellness Benefits: We care about your well-being and offer gym memberships, on-site yoga classes, and comprehensive health insurance.'}
+
+Rank: 2
+Score: 8.463939e-06
+Document: {'text': 'Performance Reviews Frequency: We conduct informal check-ins every quarter and formal performance reviews twice a year.'}
+```
+
+Further reading:
+
+- [Rerank endpoint API reference](/reference/rerank)
+- [Documentation on Rerank](/docs/overview)
+- [Documentation on Rerank fine-tuning](/docs/rerank-fine-tuning)
+- [Documentation on Rerank best practices](/docs/reranking-best-practices)
+- [LLM University module on Text Representation](https://cohere.com/llmu#text-representation)
+
+## Reranking semi-structured data
+
+The Rerank 3 model supports multi-aspect and semi-structured data like emails, invoices, JSON documents, code, and tables. By setting the rank fields, you can select which fields the model should consider for reranking.
+
+In the following example, we'll use an email data example. It is semi-structured data that contains a number of fields – `from`, `to`, `date`, `subject`, and `text`.
+
+Suppose the new hire now wants to search for any emails about check-in sessions. Let's pretend we have a list of emails retrieved from the email provider's API. 
+
+To perform reranking over semi-structured data, we add an additional parameter, `rank_fields`, which contains the list of available fields.
+
+The model will rerank based on the order of the fields passed in. For example, given `rank_fields=['title','author','text']`, the model will rerank using the values in title, author, and text sequentially.
+
+```python PYTHON
+# Define the documents
+emails = [
+    {"from": "hr@co1t.com", "to": "david@co1t.com", "date": "2024-06-24", "subject": "A Warm Welcome to Co1t!", "text": "We are delighted to welcome you to the team! As you embark on your journey with us, you'll find attached an agenda to guide you through your first week."},
+    {"from": "it@co1t.com", "to": "david@co1t.com", "date": "2024-06-24", "subject": "Setting Up Your IT Needs", "text": "Greetings! To ensure a seamless start, please refer to the attached comprehensive guide, which will assist you in setting up all your work accounts."},
+    {"from": "john@co1t.com", "to": "david@co1t.com", "date": "2024-06-24", "subject": "First Week Check-In", "text": "Hello! I hope you're settling in well. Let's connect briefly tomorrow to discuss how your first week has been going. Also, make sure to join us for a welcoming lunch this Thursday at noon—it's a great opportunity to get to know your colleagues!"}
+]
+```
+
+```python PYTHON
+# Add the user query
+query = "Any email about check ins?"
+
+# Rerank the documents
+results = co.rerank(query=query,
+                    documents=emails,
+                    top_n=2,
+                    model='rerank-english-v3.0',
+                    rank_fields=["from", "to", "date", "subject", "text"])
+
+return_results(results, emails)
+```
+
+```
+Rank: 1
+Score: 0.1979091
+Document: {'from': 'john@co1t.com', 'to': 'david@co1t.com', 'date': '2024-06-24', 'subject': 'First Week Check-In', 'text': "Hello! I hope you're settling in well. Let's connect briefly tomorrow to discuss how your first week has been going. Also, make sure to join us for a welcoming lunch this Thursday at noon—it's a great opportunity to get to know your colleagues!"}
+
+Rank: 2
+Score: 9.535461e-05
+Document: {'from': 'hr@co1t.com', 'to': 'david@co1t.com', 'date': '2024-06-24', 'subject': 'A Warm Welcome to Co1t!', 'text': "We are delighted to welcome you to the team! As you embark on your journey with us, you'll find attached an agenda to guide you through your first week."}
+```
+
+## Reranking tabular data
+
+Many enterprises rely on tabular data, such as relational databases, CSVs, and Excel. To perform reranking, you can transform a dataframe into a list of JSON records and use Rerank 3's JSON capabilities to rank them.
+
+Here's an example of reranking a CSV file that contains employee information. 
+
+```python PYTHON
+import pandas as pd
+from io import StringIO
+
+# Create a demo CSV file
+data = """name,role,join_date,email,status
+Rebecca Lee,Senior Software Engineer,2024-07-01,rebecca@co1t.com,Full-time
+Emma Williams,Product Designer,2024-06-15,emma@co1t.com,Full-time
+Michael Jones,Marketing Manager,2024-05-20,michael@co1t.com,Full-time
+Amelia Thompson,Sales Representative,2024-05-20,amelia@co1t.com,Part-time
+Ethan Davis,Product Designer,2024-05-25,ethan@co1t.com,Contractor"""
+data_csv = StringIO(data)
+
+# Load the CSV file
+df = pd.read_csv(data_csv)
+df.head(1)
+```
+
+Here's what the table looks like:
+
+| name        | role                     | join_date  | email                                       | status    |
+| :---------- | :----------------------- | :--------- | :------------------------------------------ | :-------- |
+| Rebecca Lee | Senior Software Engineer | 2024-07-01 | [rebecca@co1t.com](mailto:rebecca@co1t.com) | Full-time |
+
+Below, we'll get results from the Rerank endpoint:
+
+```python PYTHON
+# Define the documents and rank fields
+employees = df.to_dict('records')
+rank_fields = df.columns.tolist()
+
+# Add the user query
+query = "Any full-time product designers who joined recently?"
+
+# Rerank the documents
+results = co.rerank(query=query,
+                    documents=employees,
+                    top_n=1,
+                    model='rerank-english-v3.0',
+                    rank_fields=rank_fields)
+
+return_results(results, employees)
+
+```
+
+```
+Rank: 1
+Score: 0.986828
+Document: {'name': 'Emma Williams', 'role': 'Product Designer', 'join_date': '2024-06-15', 'email': 'emma@co1t.com', 'status': 'Full-time'}
+```
+
+## Multilingual reranking
+
+The Rerank endpoint also supports multilingual semantic search via the `rerank-multilingual-...` models. This means you can perform semantic search on texts in different languages.
+
+In the example below, we repeat the steps of performing reranking with one difference – changing the model type to a multilingual one. Here, we use the `rerank-multilingual-v3.0` model to rerank the FAQ list using an Arabic query.
+
+```python PYTHON
+# Define the query
+query = "هل هناك مزايا تتعلق باللياقة البدنية؟" # Are there fitness benefits?
+
+# Rerank the documents
+results = co.rerank(query=query,
+                    documents=faqs_short,
+                    top_n=2,
+                    model='rerank-multilingual-v3.0')
+
+return_results(results, faqs_short)
+```
+
+```
+Rank: 1
+Score: 0.42232594
+Document: {'text': 'Health and Wellness Benefits: We care about your well-being and offer gym memberships, on-site yoga classes, and comprehensive health insurance.'}
+
+Rank: 2
+Score: 0.00025118678
+Document: {'text': 'Performance Reviews Frequency: We conduct informal check-ins every quarter and formal performance reviews twice a year.'}
+```
+
+## Conclusion
+
+In this tutorial, you learned about:
+
+- How to rerank lexical/semantic search results
+- How to rerank semi-structured data
+- How to rerank tabular data
+- How to perform multilingual reranking
+
+We have now seen two critical components of a powerful search system: [semantic search](/docs/semantic-search-with-cohere), or dense retrieval (Part 4), and reranking (Part 5). These building blocks are essential for implementing RAG solutions.
+
+In Part 6, you will learn how to [implement RAG](/docs/rag-with-cohere). 
diff --git a/fern/pages/tutorials/build-things-with-cohere/semantic-search-with-cohere.mdx b/fern/pages/tutorials/build-things-with-cohere/semantic-search-with-cohere.mdx
new file mode 100644
index 00000000..bfc2a660
--- /dev/null
+++ b/fern/pages/tutorials/build-things-with-cohere/semantic-search-with-cohere.mdx
@@ -0,0 +1,277 @@
+---
+title: Semantic Search with Cohere
+slug: /docs/semantic-search-with-cohere
+---
+
+Open in Colab
+
+[Text embeddings](/docs/embeddings) are lists of numbers that represent the context or meaning inside a piece of text. This is particularly useful in search or information retrieval applications. Search that compares text embeddings in this way is called semantic search.
+
+Semantic search solves the problem faced by the more traditional approach of lexical search, which is great at finding keyword matches, but struggles to capture the context or meaning of a piece of text.
+
+With Cohere, you can generate text embeddings through the [Embed endpoint](/docs/cohere-embed) (Embed v3 being the latest model), which supports over 100 languages.
+
+In this tutorial, you'll learn about:
+
+- Embedding the documents
+- Embedding the query
+- Performing semantic search
+- Multilingual semantic search
+- Changing embedding compression types
+
+You'll learn these by building an onboarding assistant for new hires.
+
+## Setup
+
+To get started, first we need to install the `cohere` library and create a Cohere client.
+
+```python PYTHON
+# pip install cohere numpy
+
+import numpy as np
+import cohere
+
+co = cohere.Client("COHERE_API_KEY") # Get your API key: https://dashboard.cohere.com/api-keys
+```
+
+## Embedding the documents
+
+The Embed endpoint takes in texts as input and returns embeddings as output.
+
+For semantic search, there are two types of text we need to turn into embeddings.
+
+- The list of documents that we want to search from.
+- The query that will be used to search the documents.
+
+Right now, we are doing the former. We call the Embed endpoint using `co.embed()` and pass the following arguments:
+
+- `model`: Here we choose `embed-english-v3.0`, which generates embeddings of size 1024
+- `input_type`: We choose `search_document` to ensure the model treats these as the documents for search
+- `texts`: The list of texts (the FAQs)
+
+```python PYTHON
+# Define the documents
+faqs_long = [
+    {"text": "Joining Slack Channels: You will receive an invite via email. Be sure to join relevant channels to stay informed and engaged."},
+    {"text": "Finding Coffee Spots: For your caffeine fix, head to the break room's coffee machine or cross the street to the café for artisan coffee."},
+    {"text": "Team-Building Activities: We foster team spirit with monthly outings and weekly game nights. Feel free to suggest new activity ideas anytime!"},
+    {"text": "Working Hours Flexibility: We prioritize work-life balance. While our core hours are 9 AM to 5 PM, we offer flexibility to adjust as needed."},
+    {"text": "Side Projects Policy: We encourage you to pursue your passions. Just be mindful of any potential conflicts of interest with our business."},
+    {"text": "Reimbursing Travel Expenses: Easily manage your travel expenses by submitting them through our finance tool. Approvals are prompt and straightforward."},
+    {"text": "Working from Abroad: Working remotely from another country is possible. 
Simply coordinate with your manager and ensure your availability during core hours."},
+    {"text": "Health and Wellness Benefits: We care about your well-being and offer gym memberships, on-site yoga classes, and comprehensive health insurance."},
+    {"text": "Performance Reviews Frequency: We conduct informal check-ins every quarter and formal performance reviews twice a year."},
+    {"text": "Proposing New Ideas: Innovation is welcomed! Share your brilliant ideas at our weekly team meetings or directly with your team lead."},
+]
+
+documents = faqs_long
+
+# Embed the documents
+doc_emb = co.embed(
+    model="embed-english-v3.0",
+    input_type="search_document",
+    texts=[doc['text'] for doc in documents]).embeddings
+```
+
+Further reading:
+
+- [Embed endpoint API reference](/reference/embed)
+- [Documentation on the Embed endpoint](/docs/embeddings)
+- [Documentation on the models available on the Embed endpoint](/docs/cohere-embed)
+- [LLM University module on Text Representation](https://cohere.com/llmu#text-representation)
+
+## Embedding the query
+
+Next, we add a query, which asks about how to stay connected to company updates.
+
+We choose `search_query` as the `input_type` to ensure the model treats this as the query (instead of documents) for search.
+
+```python PYTHON
+# Add the user query
+query = "How do I stay connected to what's happening at the company?"
+
+# Embed the query
+query_emb = co.embed(
+    model="embed-english-v3.0",
+    input_type="search_query",
+    texts=[query]).embeddings
+```
+
+## Performing semantic search
+
+Now, we want to search for the most relevant documents to the query. We do this by computing the similarity between the embeddings of the query and each of the documents.
+
+There are various approaches to compute similarity between embeddings, and we'll choose the dot product approach. For this, we use the `numpy` library, which provides the implementation.
+
+Each query-document pair returns a score, which represents how similar the pair is. We then sort these scores in descending order and select the most similar pairs; here, we choose the top 2 (an arbitrary choice, you can choose any number).
+
+Here, we show the most relevant documents with their similarity scores.
+
+```python PYTHON
+# Compute dot product similarity and display results
+def return_results(query_emb, doc_emb, documents):
+    n = 2
+    scores = np.dot(query_emb, np.transpose(doc_emb))[0]
+    scores_sorted = sorted(enumerate(scores),
+                           key=lambda x: x[1],
+                           reverse=True)[:n]
+
+    for idx, item in enumerate(scores_sorted):
+        print(f"Rank: {idx+1}")
+        print(f"Score: {item[1]}")
+        print(f"Document: {documents[item[0]]}\n")
+
+return_results(query_emb, doc_emb, documents)
+```
+
+```
+Rank: 1
+Score: 0.352135965228231
+Document: {'text': 'Joining Slack Channels: You will receive an invite via email. Be sure to join relevant channels to stay informed and engaged.'}
+
+Rank: 2
+Score: 0.31995661889273097
+Document: {'text': 'Working from Abroad: Working remotely from another country is possible. Simply coordinate with your manager and ensure your availability during core hours.'}
+```
+
+## Multilingual semantic search
+
+The Embed endpoint also supports multilingual semantic search via the `embed-multilingual-...` models. This means you can perform semantic search on texts in different languages.
+
+Specifically, you can do both multilingual and cross-lingual searches using one single model.
+
+Multilingual search happens when the query and the result are of the same language. 
For example, an English query of “places to eat” returning an English result of “Bob's Burgers.” You can replace English with other languages and use the same model to perform the search.
+
+Cross-lingual search happens when the query and the result are in different languages. For example, a Hindi query of “खाने की जगह” (places to eat) returning an English result of “Bob's Burgers.”
+
+In the example below, we repeat the steps of performing semantic search with one difference: we switch to the multilingual model, `embed-multilingual-v3.0`, and search a French version of the FAQ list using an English query.
+
+```python PYTHON
+# Define the documents
+faqs_short_fr = [
+    {"text" : "Remboursement des frais de voyage : Gérez facilement vos frais de voyage en les soumettant via notre outil financier. Les approbations sont rapides et simples."},
+    {"text" : "Travailler de l'étranger : Il est possible de travailler à distance depuis un autre pays. Il suffit de coordonner avec votre responsable et de vous assurer d'être disponible pendant les heures de travail."},
+    {"text" : "Avantages pour la santé et le bien-être : Nous nous soucions de votre bien-être et proposons des adhésions à des salles de sport, des cours de yoga sur site et une assurance santé complète."},
+    {"text" : "Fréquence des évaluations de performance : Nous organisons des bilans informels tous les trimestres et des évaluations formelles deux fois par an."}
+]
+
+documents = faqs_short_fr
+
+# Embed the documents
+doc_emb = co.embed(
+    model="embed-multilingual-v3.0",
+    input_type="search_document",
+    texts=[doc['text'] for doc in documents]).embeddings
+
+# Add the user query
+query = "What's your remote-working policy?"
+
+# Embed the query
+query_emb = co.embed(
+    model="embed-multilingual-v3.0",
+    input_type="search_query",
+    texts=[query]).embeddings
+
+# Compute dot product similarity and display results
+return_results(query_emb, doc_emb, documents)
+```
+
+```
+Rank: 1
+Score: 0.442758615743984
+Document: {'text': "Travailler de l'étranger : Il est possible de travailler à distance depuis un autre pays. Il suffit de coordonner avec votre responsable et de vous assurer d'être disponible pendant les heures de travail."}
+
+Rank: 2
+Score: 0.32783563708365726
+Document: {'text': 'Avantages pour la santé et le bien-être : Nous nous soucions de votre bien-être et proposons des adhésions à des salles de sport, des cours de yoga sur site et une assurance santé complète.'}
+```
+
+Further reading:
+
+- [The list of supported languages for multilingual Embed](/docs/cohere-embed#list-of-supported-languages)
+
+## Changing embedding compression types
+
+Semantic search over large datasets can require a lot of memory, which is expensive to host in a vector database. Changing the embedding compression type can help reduce the memory footprint.
+
+A typical embedding model generates embeddings in float32 format (4 bytes per value). By compressing the embeddings to int8 format (1 byte per value), we can reduce the memory 4x while keeping 99.99% of the original search quality.
+
+We can go even further and use the binary format (1 bit per value), which reduces the needed memory 32x while keeping 90-98% of the original search quality.
+
+The Embed endpoint supports the following formats: `float`, `int8`, `uint8`, `binary`, and `ubinary`. You can get these different compression levels by passing the `embedding_types` parameter. 
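+
+To make these savings concrete, here is a rough back-of-the-envelope sketch (not an Embed API call; the corpus of 10 million documents is a hypothetical figure) that multiplies out the bytes-per-value numbers above for the 1024-dimensional embeddings used in this tutorial.
+
+```python PYTHON
+# Illustrative sizing sketch: estimated index memory for a
+# hypothetical corpus of 10 million documents
+num_docs = 10_000_000  # hypothetical corpus size
+dim = 1024  # size of embed-english-v3.0 embeddings
+
+# Bytes consumed per embedding value at each compression level
+bytes_per_value = {"float32": 4, "int8": 1, "binary": 1 / 8}
+
+for fmt, nbytes in bytes_per_value.items():
+    total_gib = num_docs * dim * nbytes / 1024**3
+    print(f"{fmt}: {total_gib:.1f} GiB")
+
+# float32: 38.1 GiB
+# int8: 9.5 GiB
+# binary: 1.2 GiB
+```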
+
+In the example below, we embed the documents in two formats: `float` and `int8`.
+
+```python PYTHON
+# Define the documents
+documents = faqs_long
+
+# Embed the documents with the given embedding types
+doc_emb = co.embed(
+    model="embed-english-v3.0",
+    embedding_types=["float","int8"],
+    input_type="search_document",
+    texts=[doc['text'] for doc in documents]).embeddings
+
+# Add the user query
+query = "How do I stay connected to what's happening at the company?"
+
+# Embed the query
+query_emb = co.embed(
+    model="embed-english-v3.0",
+    embedding_types=["float","int8"],
+    input_type="search_query",
+    texts=[query]).embeddings
+```
+
+Here are the search results from using the `float` embeddings.
+
+```python PYTHON
+# Compute dot product similarity and display results
+return_results(query_emb.float_, doc_emb.float_, documents)
+```
+
+```
+Rank: 1
+Score: 0.352135965228231
+Document: {'text': 'Joining Slack Channels: You will receive an invite via email. Be sure to join relevant channels to stay informed and engaged.'}
+
+Rank: 2
+Score: 0.31995661889273097
+Document: {'text': 'Working from Abroad: Working remotely from another country is possible. Simply coordinate with your manager and ensure your availability during core hours.'}
+```
+
+And here are the search results from using the `int8` embeddings.
+
+```python PYTHON
+# Compute dot product similarity and display results
+return_results(query_emb.int8, doc_emb.int8, documents)
+```
+
+```
+Rank: 1
+Score: 563583
+Document: {'text': 'Joining Slack Channels: You will receive an invite via email. Be sure to join relevant channels to stay informed and engaged.'}
+
+Rank: 2
+Score: 508692
+Document: {'text': 'Working from Abroad: Working remotely from another country is possible. Simply coordinate with your manager and ensure your availability during core hours.'}
+```
+
+Further reading:
+
+- [Documentation on embeddings compression levels](/docs/embeddings#compression-levels)
+
+## Conclusion
+
+In this tutorial, you learned about:
+
+- How to embed documents for search
+- How to embed queries
+- How to perform semantic search
+- How to perform multilingual semantic search
+- How to change the embedding compression types
+
+A modern, high-performance search system typically includes a reranking stage, which further boosts the search results.
+
+In Part 5, you will learn how to [add reranking](/docs/reranking-with-cohere) to a search system. diff --git a/fern/pages/tutorials/build-things-with-cohere/text-generation-tutorial.mdx b/fern/pages/tutorials/build-things-with-cohere/text-generation-tutorial.mdx new file mode 100644 index 00000000..8fce39e4 --- /dev/null +++ b/fern/pages/tutorials/build-things-with-cohere/text-generation-tutorial.mdx @@ -0,0 +1,297 @@ +---
+title: Cohere Text Generation Tutorial
+slug: /docs/text-generation-tutorial
+---
+
+Open in Colab
+
+Command is Cohere’s flagship LLM, able to generate a response based on a user message or prompt. It is trained to follow user commands and to be instantly useful in practical business applications, like summarization, copywriting, extraction, and question-answering.
+
+Command R and Command R+ are the most recent models in the [Command family](/docs/command-r-plus). They strike the kind of balance between efficiency and high levels of accuracy that enables enterprises to move from proof of concept to production-grade AI applications. 
+
+This tutorial leans on the Chat endpoint to build an onboarding assistant for new hires at Co1t, a fictional company, and covers:
+
+- Basic text generation
+- Prompt engineering
+- Parameters for controlling output
+- Structured output generation
+- Streaming output
+
+## Setup
+
+To get started, first we need to install the `cohere` library and create a Cohere client.
+
+```python PYTHON
+# pip install cohere
+
+import cohere
+
+co = cohere.Client("COHERE_API_KEY") # Get your API key: https://dashboard.cohere.com/api-keys
+```
+
+## Basic text generation
+
+To get started, we just need to pass a single `message` parameter that represents (you guessed it) the user message, after which we use the client we just created to call the Chat endpoint.
+
+```python PYTHON
+# Add the user message
+message = "I'm joining a new startup called Co1t today. Could you help me write a short introduction message to my teammates."
+
+# Generate the response
+response = co.chat(message=message)
+
+print(response.text)
+```
+
+The response we get back contains several objects, but for the sake of simplicity we'll focus for the moment on the `text` object:
+
+```
+Sure! Here is a short introduction message:
+
+"Hi everyone! My name is [Your Name] and I am excited to join the Co1t team today. I am passionate about [relevant experience or skills] and look forward to contributing my skills and ideas to the team. In my free time, I enjoy [hobbies or interests]. Feel free to reach out to me directly if you want to chat or collaborate. Let's work together to make Co1t a success!"
+```
+
+Here are some additional resources if you'd like to read further:
+
+- [Chat endpoint API reference](/reference/chat)
+- [Documentation on Chat fine-tuning](/docs/chat-fine-tuning)
+- [Documentation on Command R+](/docs/command-r-plus)
+- [LLM University module on text generation](https://cohere.com/llmu#text-generation)
+
+## Prompt engineering
+
+Prompting is at the heart of working with LLMs, as it provides the context for the text that we want the model to generate. Prompts can be anything from simple instructions to more complex pieces of text, and they are used to steer the model toward producing a specific type of output.
+
+This section examines a couple of prompting techniques, the first of which is adding more specific instructions to the prompt (the more instructions you provide, the closer you can get to the response you need).
+
+The maximum length of a prompt depends on the maximum context length that a model can support (in the case of Command R and Command R+, it's 128k tokens).
+
+Below, we'll add one additional instruction to the earlier prompt: the length we need the response to be.
+
+```python PYTHON
+# Add the user message
+message = "I'm joining a new startup called Co1t today. Could you help me write a one-sentence introduction message to my teammates."
+
+# Generate the response
+response = co.chat(message=message)
+
+print(response.text)
+```
+
+```
+Here's a potential introduction message:
+
+"Hi everyone, my name is [Your Name] and I'm thrilled to join Co1t today as part of the team, and I look forward to contributing my skills and ideas to drive innovation and success!"
+
+This message expresses your excitement about joining the company and highlights your commitment to contributing to the team's success.
+```
+
+All our prompts so far use what is called zero-shot prompting, which means that we provide the instruction without any examples. 
But in many cases, it is extremely helpful to provide examples to the model to guide its response. This is called few-shot prompting.
+
+Few-shot prompting is especially useful when we want the model response to follow a particular style or format. Also, it is sometimes hard to explain what you want in an instruction, and easier to show examples.
+
+Below, we want the response to follow the style and length conventions set by the examples we provide.
+
+```python PYTHON
+# Add the user message
+user_input = "Why can't I access the server? Is it a permissions issue?"
+
+# Create a prompt containing example outputs
+message = f"""Write a ticket title for the following user request:
+
+User request: Where are the usual storage places for project files?
+Ticket title: Project File Storage Location
+
+User request: Emails won't send. What could be the issue?
+Ticket title: Email Sending Issues
+
+User request: How can I set up a connection to the office printer?
+Ticket title: Printer Connection Setup
+
+User request: {user_input}
+Ticket title:"""
+
+# Generate the response
+response = co.chat(message=message)
+
+print(response.text)
+```
+
+```
+Server Access Issues
+```
+
+Further reading:
+
+- [Documentation on prompt engineering](/docs/crafting-effective-prompts)
+- [LLM University module on prompt engineering](https://cohere.com/llmu#prompt-engineering)
+
+## Parameters for controlling output
+
+The Chat endpoint provides developers with an array of options and parameters.
+
+For example, you can choose from several variations of the Command model. Different models have different profiles of output quality and latency.
+
+```python PYTHON
+# Add the user message
+message = "I'm joining a new startup called Co1t today. Could you help me write a one-sentence introduction message to my teammates."
+
+# Generate the response by specifying a model
+response = co.chat(message=message, model="command-r")
+
+print(response.text)
+```
+
+```
+Hello, my name is [Your Name] and I'm thrilled to join the Co1t team today as the new kid in town!
+```
+
+Often, you’ll need to control the level of randomness of the output. A few parameters let you do this.
+
+The most commonly used parameter is `temperature`, which is a number used to tune the degree of randomness. You can enter values between 0.0 and 1.0.
+
+A lower temperature gives more predictable outputs, and a higher temperature gives more "creative" outputs.
+
+Here's an example of setting `temperature` to 0.
+
+```python PYTHON
+# Add the user message
+message = "I like learning about the industrial revolution and how it shapes the modern world. How can I introduce myself in two words."
+
+# Generate the response multiple times by specifying a low temperature value
+for idx in range(3):
+    response = co.chat(message=message, temperature=0)
+    print(f"{idx+1}: {response.text}\n")
+```
+
+```
+1: Curious Historian.
+
+2: Curious Historian.
+
+3: Curious Historian.
+```
+
+And here's an example of setting `temperature` to 1.
+
+```python PYTHON
+# Add the user message
+message = "I like learning about the industrial revolution and how it shapes the modern world. How can I introduce myself in two words."
+
+# Generate the response multiple times by specifying a high temperature value
+for idx in range(3):
+    response = co.chat(message=message, temperature=1)
+    print(f"{idx+1}: {response.text}\n")
+```
+
+```
+1: Sure! Here are two words that can describe you:
+
+1. Industry Enthusiast
+2. 
Revolution Aficionado
+
+These words combine your passion for learning about the Industrial Revolution with a modern twist, showcasing your enthusiasm and knowledge in a concise manner.
+
+2: "Revolution Fan"
+
+3: History Enthusiast! 
+```
+
+Further reading:
+
+- [Available models for the Chat endpoint](/docs/models#command)
+- [Documentation on predictable outputs](/docs/predictable-outputs)
+- [Documentation on advanced generation parameters](/docs/advanced-generation-hyperparameters)
+
+## Structured output generation
+
+By adding the `response_format` parameter, you can get the model to generate the output as a JSON object, which lets you structure and organize the model's responses in a way that downstream applications can consume.
+
+The `response_format` parameter allows you to specify the schema the JSON object must follow. In the call below, we pass:
+
+- `message`: The user message
+- `response_format`: The schema of the JSON object
+
+```python PYTHON
+import json
+
+# Add the user message
+user_input = "Why can't I access the server? Is it a permissions issue?"
+
+# Generate the response by supplying a JSON schema via response_format
+response = co.chat(
+  model="command-r-plus",
+  message=f"""Create an IT ticket for the following user request. Generate a JSON object.
+  {user_input}""",
+  response_format={
+    "type": "json_object",
+    "schema": {
+      "type": "object",
+      "required": ["title", "category", "status"],
+      "properties": {
+        "title": { "type": "string"},
+        "category": { "type" : "string", "enum" : ["access", "software"]},
+        "status": { "type" : "string" , "enum" : ["open", "closed"]}
+      }
+    }
+  },
+)
+
+json_object = json.loads(response.text)
+
+print(json_object)
+```
+
+```
+{'title': 'User Unable to Access Server', 'category': 'access', 'status': 'open'}
+```
+
+Further reading:
+
+- [Documentation on Structured Generations (JSON)](/docs/structured-outputs-json)
+
+## Streaming responses
+
+All the previous examples generate responses in a non-streamed manner. This means that the endpoint returns a response object only after the model has generated the text in full.
+
+The Chat endpoint also provides streaming support. In a streamed response, the endpoint returns a response object for each token as it is generated. This means you can display the text incrementally without having to wait for the full completion.
+
+To activate it, use `co.chat_stream()` instead of `co.chat()`.
+
+In streaming mode, the endpoint generates a series of objects. To get the actual text contents, we take objects whose `event_type` is `text-generation`.
+
+```python PYTHON
+# Add the user message
+message = "I'm joining a new startup called Co1t today. Could you help me write a one-sentence introduction message to my teammates."
+
+# Generate the response by streaming it
+response = co.chat_stream(
+    message=message)
+
+for event in response:
+    if event.event_type == "text-generation":
+        print(event.text, end="")
+```
+
+```
+Here's a potential introduction message:
+
+"Hi everyone, my name is [Your Name] and I'm thrilled to join Co1t today as the newest [Your Role], and I look forward to contributing my skills and expertise to the team and driving innovative solutions for our customers." 
+```
+
+Further reading:
+
+- [Documentation on streaming responses](/docs/streaming)
+
+## Conclusion
+
+In this tutorial, you learned about:
+
+- How to get started with basic text generation
+- How to improve outputs with prompt engineering
+- How to control outputs using parameter changes
+- How to generate structured outputs
+- How to stream text generation outputs
+
+So far, however, we have only used the endpoint for one-off text generation. As its name implies, the Chat endpoint can also support building chatbots, which requires supporting multi-turn conversations and maintaining the conversation state.
+
+In Part 3, you'll learn how to build chatbots with the Chat endpoint. diff --git a/fern/v1.yml b/fern/v1.yml index 776767a5..ff938fdb 100644 --- a/fern/v1.yml +++ b/fern/v1.yml @@ -53,6 +53,8 @@ navigation: path: pages/models/rerank-2.mdx - section: Text Generation contents: + - page: Introduction to Text Generation at Cohere + path: pages/text-generation/introduction-to-text-generation-at-cohere.mdx - page: Using the Chat API path: pages/text-generation/chat-api.mdx - page: Streaming Responses @@ -101,6 +103,8 @@ navigation: path: pages/text-generation/prompt-engineering/prompt-truncation.mdx - page: Preambles path: pages/text-generation/prompt-engineering/preambles.mdx + - page: Prompt Tuner (beta) + path: pages/text-generation/prompt-engineering/prompt-tuner.mdx - section: Prompt Library contents: - page: Create CSV data from JSON data @@ -121,6 +125,8 @@ navigation: path: pages/text-generation/prompt-engineering/prompt-library/multilingual-interpreter.mdx - page: Migrating from the Generate API to the Chat API path: pages/text-generation/migrating-from-cogenerate-to-cochat.mdx + - page: Summarizing Text + path: pages/text-generation/summarizing-text.mdx - section: Text Embeddings (Vectors, Search, Retrieval) contents: - page: Introduction to Embeddings at Cohere @@ -134,7 +140,7 @@ navigation: - page: Rerank Best Practices path: pages/text-embeddings/reranking/reranking-best-practices.mdx - page: Text Classification - path: pages/text-embeddings/text-classification-1.mdx + path: pages/text-embeddings/text-classification-with-cohere.mdx - section: Fine-Tuning contents: - page: Introduction @@ -249,6 +255,21 @@ navigation: path: pages/tutorials/cookbooks.mdx - page: LLM University path: pages/llm-university/llmu-2.mdx + - section: Build Things with Cohere! 
+ path: pages/tutorials/build-things-with-cohere.mdx + contents: + - page: Cohere Text Generation Tutorial + path: pages/tutorials/build-things-with-cohere/text-generation-tutorial.mdx + - page: Building a Chatbot with Cohere + path: pages/tutorials/build-things-with-cohere/building-a-chatbot-with-cohere.mdx + - page: Semantic Search with Cohere + path: pages/tutorials/build-things-with-cohere/semantic-search-with-cohere.mdx + - page: Reranking with Cohere + path: pages/tutorials/build-things-with-cohere/reranking-with-cohere.mdx + - page: RAG with Cohere + path: pages/tutorials/build-things-with-cohere/rag-with-cohere.mdx + - page: Building an Agent with Cohere + path: pages/tutorials/build-things-with-cohere/building-an-agent-with-cohere.mdx - section: Responsible Use contents: - section: Overview diff --git a/fern/v2.yml b/fern/v2.yml index 06c09ef2..a62f8284 100644 --- a/fern/v2.yml +++ b/fern/v2.yml @@ -53,6 +53,8 @@ navigation: path: pages/models/rerank-2.mdx - section: Text Generation contents: + - page: Introduction to Text Generation at Cohere + path: pages/text-generation/introduction-to-text-generation-at-cohere.mdx - page: Using the Chat API path: pages/text-generation/chat-api.mdx - page: Streaming Responses @@ -101,6 +103,8 @@ navigation: path: pages/text-generation/prompt-engineering/prompt-truncation.mdx - page: Preambles path: pages/text-generation/prompt-engineering/preambles.mdx + - page: Prompt Tuner (beta) + path: pages/text-generation/prompt-engineering/prompt-tuner.mdx - section: Prompt Library contents: - page: Create CSV data from JSON data @@ -121,6 +125,8 @@ navigation: path: pages/text-generation/prompt-engineering/prompt-library/multilingual-interpreter.mdx - page: Migrating from the Generate API to the Chat API path: pages/text-generation/migrating-from-cogenerate-to-cochat.mdx + - page: Summarizing Text + path: pages/text-generation/summarizing-text.mdx - section: Text Embeddings (Vectors, Search, Retrieval) contents: - page: Introduction to Embeddings at Cohere @@ -134,7 +140,7 @@ navigation: - page: Rerank Best Practices path: pages/text-embeddings/reranking/reranking-best-practices.mdx - page: Text Classification - path: pages/text-embeddings/text-classification-1.mdx + path: pages/text-embeddings/text-classification-with-cohere.mdx - section: Fine-Tuning contents: - page: Introduction @@ -249,6 +255,21 @@ navigation: path: pages/tutorials/cookbooks.mdx - page: LLM University path: pages/llm-university/llmu-2.mdx + - section: Build Things with Cohere! + path: pages/tutorials/build-things-with-cohere.mdx + contents: + - page: Cohere Text Generation Tutorial + path: pages/tutorials/build-things-with-cohere/text-generation-tutorial.mdx + - page: Building a Chatbot with Cohere + path: pages/tutorials/build-things-with-cohere/building-a-chatbot-with-cohere.mdx + - page: Semantic Search with Cohere + path: pages/tutorials/build-things-with-cohere/semantic-search-with-cohere.mdx + - page: Reranking with Cohere + path: pages/tutorials/build-things-with-cohere/reranking-with-cohere.mdx + - page: RAG with Cohere + path: pages/tutorials/build-things-with-cohere/rag-with-cohere.mdx + - page: Building an Agent with Cohere + path: pages/tutorials/build-things-with-cohere/building-an-agent-with-cohere.mdx - section: Responsible Use contents: - section: Overview