diff --git a/cohere-openapi.yaml b/cohere-openapi.yaml
index a657002e..8ac18332 100644
--- a/cohere-openapi.yaml
+++ b/cohere-openapi.yaml
@@ -5588,7 +5588,7 @@ paths:
 
             With `prompt_truncation` set to "OFF", no elements will be dropped. If the sum of the inputs exceeds the model's context length limit, a `TooManyTokens` error will be returned.
 
-            Compatible Deployments: 
+            Compatible Deployments:
              - AUTO: Cohere Platform Only
              - AUTO_PRESERVE_ORDER: Azure, AWS Sagemaker/Bedrock, Private Deployments
          connectors:
@@ -5832,6 +5832,8 @@ paths:
 
             **Note**: This parameter is only compatible with models [Command R 08-2024](https://docs.cohere.com/docs/command-r#august-2024-release), [Command R+ 08-2024](https://docs.cohere.com/docs/command-r-plus#august-2024-release) and newer.
 
+            **Note**: `command-r7b-12-2024` only supports `"CONTEXTUAL"` and `"STRICT"` modes.
+
             Compatible Deployments: Cohere Platform, Azure, AWS Sagemaker/Bedrock, Private Deployments
       responses:
         "200":
@@ -6021,6 +6023,8 @@ paths:
             Safety modes are not yet configurable in combination with `tools`, `tool_results` and `documents` parameters.
 
             **Note**: This parameter is only compatible with models [Command R 08-2024](https://docs.cohere.com/v2/docs/command-r#august-2024-release), [Command R+ 08-2024](https://docs.cohere.com/v2/docs/command-r-plus#august-2024-release) and newer.
+
+            **Note**: `command-r7b-12-2024` only supports `"CONTEXTUAL"` and `"STRICT"` modes.
          max_tokens:
            x-fern-audiences:
              - public
@@ -22652,6 +22656,8 @@ components:
      description: |
        Defaults to `"accurate"`.
        Dictates the approach taken to generating citations as part of the RAG flow by allowing the user to specify whether they want `"accurate"` results, `"fast"` results or no results.
+
+       **Note**: `command-r7b-12-2024` only supports `"fast"` and `"off"` modes. Its default is `"fast"`.
    ResponseFormatTypeV2:
      x-fern-audiences:
        - public
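For reference, the `safety_mode` values called out in the new notes are passed to the Chat API as a request parameter. The sketch below uses the v2 Python SDK; the model name, prompt, and API key handling are illustrative assumptions rather than part of the spec change.

```python PYTHON
# Minimal sketch: exercising the safety modes noted above with command-r7b-12-2024.
# The prompt and API key are placeholders.
import cohere

co = cohere.ClientV2(api_key="YOUR_API_KEY")

response = co.chat(
    model="command-r7b-12-2024",
    messages=[{"role": "user", "content": "Explain how tides work."}],
    safety_mode="STRICT",  # command-r7b-12-2024 accepts "CONTEXTUAL" or "STRICT"
)

print(response.message.content[0].text)
```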
diff --git a/fern/pages/v2/deployment-options/cohere-on-aws/amazon-bedrock.mdx b/fern/pages/v2/deployment-options/cohere-on-aws/amazon-bedrock.mdx
index 204f688a..d582ba24 100644
--- a/fern/pages/v2/deployment-options/cohere-on-aws/amazon-bedrock.mdx
+++ b/fern/pages/v2/deployment-options/cohere-on-aws/amazon-bedrock.mdx
@@ -18,10 +18,9 @@ Here, you'll learn how to use Amazon Bedrock to deploy both the Cohere Command a
 
 - Command R
 - Command R+
-- Command Light
-- Command
 - Embed - English
 - Embed - Multilingual
+- Rerank v3.5
 
 ## Prerequisites
 
@@ -62,17 +61,17 @@ model_id = "cohere.embed-english-v3" # or "cohere.embed-multilingual-v3"
 
 # Invoke the model and print the response
 result = co.embed(
-        model=model_id,
-        input_type=input_type,
-        texts=texts,
-        truncate=truncate) # aws_client.invoke_model(**params)
+    model=model_id,
+    input_type=input_type,
+    texts=texts,
+    truncate=truncate) # aws_client.invoke_model(**params)
 
 print(result)
 ```
 
 ## Text Generation
 
-You can use this code to invoke either Command R (`cohere.command-r-v1:0`), Command R+ (`cohere.command-r-plus-v1:0`), Command (`cohere.command-text-v14`), or Command light (`cohere.command-light-text-v14`) on Amazon Bedrock:
+You can use this code to invoke either Command R (`cohere.command-r-v1:0`) or Command R+ (`cohere.command-r-plus-v1:0`) on Amazon Bedrock:
 
 ```python PYTHON
 import cohere
@@ -90,3 +89,35 @@ result = co.chat(message="Write a LinkedIn post about starting a career in tech:
 
 print(result)
 ```
+
+## Rerank
+
+You can use this code to invoke our latest Rerank models on Bedrock:
+
+```python PYTHON
+import cohere
+
+co = cohere.BedrockClientV2(
+    aws_region="us-west-2",  # pick a region where the model is available
+    aws_access_key="...",
+    aws_secret_key="...",
+    aws_session_token="...",
+)
+
+docs = [
+    "Carson City is the capital city of the American state of Nevada.",
+    "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan.",
+    "Capitalization or capitalisation in English grammar is the use of a capital letter at the start of a word. English usage varies from capitalization in other languages.",
+    "Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district.",
+    "Capital punishment has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states.",
+]
+
+response = co.rerank(
+    model="cohere.rerank-v3-5:0",
+    query="What is the capital of the United States?",
+    documents=docs,
+    top_n=3,
+)
+
+print(response)
+```
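The re-indented `co.embed(...)` call in the hunk above relies on variables defined earlier on that page. A self-contained sketch is shown below; the client construction, region, credentials, and sample texts are assumptions for illustration, not part of the page's unchanged code.

```python PYTHON
# Self-contained sketch of the embed call shown in the hunk above.
# Region, credentials, and sample texts are placeholders.
import cohere

co = cohere.BedrockClient(
    aws_region="us-east-1",
    aws_access_key="...",
    aws_secret_key="...",
    aws_session_token="...",
)

model_id = "cohere.embed-english-v3"  # or "cohere.embed-multilingual-v3"
input_type = "search_document"
texts = ["Hello from Cohere!", "Embeddings on Amazon Bedrock"]
truncate = "NONE"  # "NONE" | "START" | "END"

result = co.embed(
    model=model_id,
    input_type=input_type,
    texts=texts,
    truncate=truncate,
)

print(result)
```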
diff --git a/fern/pages/v2/text-embeddings/reranking/reranking-best-practices.mdx b/fern/pages/v2/text-embeddings/reranking/reranking-best-practices.mdx
index b0d05a63..2cb48e15 100644
--- a/fern/pages/v2/text-embeddings/reranking/reranking-best-practices.mdx
+++ b/fern/pages/v2/text-embeddings/reranking/reranking-best-practices.mdx
@@ -38,7 +38,7 @@ If you would like more control over how chunking is done, we recommend that you
 
 ## Queries
 
-Our Rerank models (`rerank-v3.0` and `rerank-v3.5`) are trained with a context length of 4096 tokens. The model takes into account both the input from the query and documents. If your query is larger than 2048 tokens, it will be truncated to the first 2048 tokens.
+Our `rerank-v3.5` and `rerank-v3.0` models are trained with a context length of 4096 tokens. The model takes both the _query_ and the _document_ into account when calculating against this limit, and the query can account for up to half of the full context length. In other words, if your query is larger than 2048 tokens, it will be truncated to the first 2048 tokens (leaving the other 2048 for the document(s)).
 
 ## Structured Data Support
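Because queries beyond 2048 tokens are silently truncated, it can help to sanity-check the query budget before calling Rerank. The sketch below illustrates the 2048/2048 split using a crude words-to-tokens heuristic; the ratio is an assumption, not the model's actual tokenizer.

```python PYTHON
# Rough pre-flight check for the query/document token budget described above.
# The 1.3 tokens-per-word ratio is a heuristic, not the rerank tokenizer.
CONTEXT_LENGTH = 4096  # rerank-v3.0 / rerank-v3.5 context window
QUERY_BUDGET = CONTEXT_LENGTH // 2  # queries are truncated past 2048 tokens


def estimated_tokens(text: str) -> int:
    return int(len(text.split()) * 1.3)


query = "What is the capital of the United States?"
doc_budget = CONTEXT_LENGTH - min(estimated_tokens(query), QUERY_BUDGET)

if estimated_tokens(query) > QUERY_BUDGET:
    print("Warning: query will likely be truncated to its first 2048 tokens.")
print(f"Roughly {doc_budget} tokens remain for each document.")
```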