First round of fixes based on user feedback #290

Merged: 6 commits, Dec 12, 2024
fern/pages/models/cohere-embed.mdx — 2 changes: 1 addition & 1 deletion

```diff
@@ -1,5 +1,5 @@
 ---
-title: Embed Model
+title: Cohere's Embed Models (Details and Application)
 slug: docs/cohere-embed
 hidden: false
 description: >-
```
fern/pages/models/models.mdx — 2 changes: 1 addition & 1 deletion

```diff
@@ -1,5 +1,5 @@
 ---
-title: "Models Overview"
+title: An Overview of Cohere's Models
 slug: "docs/models"
 
 hidden: false
```
fern/pages/models/rerank-2.mdx — 2 changes: 1 addition & 1 deletion

```diff
@@ -1,5 +1,5 @@
 ---
-title: "Rerank Model"
+title: Cohere's Rerank Model (Details and Application)
 slug: "docs/rerank-2"
 
 hidden: false
```
```diff
@@ -41,7 +41,7 @@ If `Number of documents * max_chunks_per_doc` exceeds `10,000`, the endpoint will
 
 ## Queries
 
-Our `rerank-v3.5` and `rerankv-3.0` models are trained with a context length of 4096 tokens. The model takes into account both the input from the query and document. If your query is larger than 2048 tokens, it will be truncated to the first 2048 tokens.
+Our `rerank-v3.5` and `rerank-v3.0` models are trained with a context length of 4096 tokens. The model takes both the _query_ and the _document_ into account when calculating against this limit, and the query can account for up to half of the full context length. If your query is larger than 2048 tokens, in other words, it will be truncated to the first 2048 tokens (leaving the other 2048 for the document(s)).
 
 ## Semi-Structured Data Support
```
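For context on the behavior this hunk documents, here is a minimal sketch of a rerank call using the Cohere Python SDK's v1 client. The API key, query, and documents are placeholders, and the comments restate the limits described in the docs above rather than anything the client enforces locally:

```python
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder key

# Per the docs above: Number of documents * max_chunks_per_doc must stay
# at or under 10,000, or the endpoint returns an error.
documents = [
    "Shipping takes 5-7 business days.",
    "Returns are accepted within 30 days of delivery.",
    "Our support team is available 24/7 via chat.",
]

# The query and documents share rerank-v3.5's 4096-token context window;
# a query longer than 2048 tokens is truncated to its first 2048 tokens,
# leaving the remaining 2048 tokens for the document being scored.
response = co.rerank(
    model="rerank-v3.5",
    query="How long does shipping take?",
    documents=documents,
    top_n=2,
)

for result in response.results:
    print(result.index, result.relevance_score)
```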
```diff
@@ -10,8 +10,8 @@ keywords: "prompt engineering, generative AI prompts"
 createdAt: "Thu Feb 29 2024 18:14:26 GMT+0000 (Coordinated Universal Time)"
 updatedAt: "Thu May 23 2024 20:21:50 GMT+0000 (Coordinated Universal Time)"
 ---
-LLMs come with limitations; specifically, they can only handle so much text as input. This means that you will often need to figure out which document sections and chat history elements to keep, and which ones to omit.
+LLMs come with limitations; specifically, they can only handle so much text as input. This means that you will often need to figure out which part of a document or chat history to keep, and which ones to omit.
 
-To make this easier, the Chat API comes with a helpful `prompt_truncation` parameter. When `prompt_truncation` is set to `AUTO`, the API will automatically break up the documents into smaller chunks, rerank the chunks and drop the minimum required number of the least relevant documents in order to stay within the model's context length limit.
+To make this easier, the Chat API comes with a helpful `prompt_truncation` parameter. When `prompt_truncation` is set to `AUTO`, the API will automatically break up the documents into smaller chunks, rerank those chunks according to how relevant they are, and then start dropping the least relevant documents until the text fits within the model's context length limit.
 
 **Note:** The last few messages in the chat history will never be truncated or dropped. The RAG API will throw a 400 `Too Many Tokens` error if it can't fit those messages along with a single document under the context limit.
```
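Similarly, a minimal sketch of the `prompt_truncation` parameter this hunk describes, again assuming the Cohere Python SDK's v1 Chat API; the key and documents are placeholders:

```python
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder key

# With prompt_truncation="AUTO", the API chunks these documents, reranks
# the chunks by relevance, and drops the least relevant ones until the
# prompt fits within the model's context length limit.
response = co.chat(
    message="What is our return policy?",
    documents=[
        {"title": "Shipping", "snippet": "Shipping takes 5-7 business days."},
        {"title": "Returns", "snippet": "Returns are accepted within 30 days of delivery."},
    ],
    prompt_truncation="AUTO",
)

print(response.text)
```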
fern/v1.yml — 6 changes: 3 additions & 3 deletions

```diff
@@ -41,7 +41,7 @@ navigation:
         path: pages/get-started/contribute.mdx
   - section: Models
     contents:
-      - page: Models Overview
+      - page: An Overview of Cohere's Models
         path: pages/models/models.mdx
       - section: Command
         contents:
@@ -51,9 +51,9 @@
             path: pages/models/the-command-family-of-models/command-r.mdx
           - page: Command and Command Light
             path: pages/models/the-command-family-of-models/command-beta.mdx
-      - page: Embed
+      - page: Cohere's Embed Models (Details and Application)
         path: pages/models/cohere-embed.mdx
-      - page: Rerank
+      - page: Cohere's Rerank Model (Details and Application)
         path: pages/models/rerank-2.mdx
       - page: Aya
         path: pages/models/aya.mdx
```
fern/v2.yml — 6 changes: 3 additions & 3 deletions

```diff
@@ -41,7 +41,7 @@ navigation:
         path: pages/get-started/contribute.mdx
   - section: Models
     contents:
-      - page: Models Overview
+      - page: An Overview of Cohere's Models
         path: pages/models/models.mdx
       - section: Command
         contents:
@@ -51,9 +51,9 @@
             path: pages/v2/models/the-command-family-of-models/command-r.mdx
           - page: Command and Command Light
             path: pages/v2/models/the-command-family-of-models/command-beta.mdx
-      - page: Embed
+      - page: Cohere's Embed Models (Details and Application)
         path: pages/models/cohere-embed.mdx
-      - page: Rerank
+      - page: Cohere's Rerank Model (Details and Application)
         path: pages/models/rerank-2.mdx
       - page: Aya
         path: pages/models/aya.mdx
```