cohere-ai · trentfowlercohere · Oct 22, 2024 · Sep 2, 2024 · Sep 2, 2024 · Sep 2, 2024
@@ -0,0 +1,36 @@
+---
+title: "Embed v3.0 Models are now Multimodal"
+slug: "changelog/embed-v3-is-multimodal"
+createdAt: "Tues Oct 22 2024 05:30:00 (MST)"
+hidden: false
+description: >-
+  Launch of multimodal embeddings for our Embed models, plus some code to help get started.
+---
+
+Today we’re announcing updates to our `embed-v3.0` family of models. These models can now process images into embeddings. Existing text capabilities are unchanged, so there is no need to re-embed texts you have already processed with our `embed-v3.0` models.
+
+In the rest of these release notes, we’ll provide more details about technical enhancements, new features, and new pricing.
+
+## Technical Details
+### API Changes:
+The Embed API has two major changes: 
+- Introduced a new `input_type` called `image`
+- Introduced a new parameter called `images`
+
+Example request showing how to embed an image:
+
+```Text cURL
+POST https://api.cohere.ai/v1/embed
+{
+    "model": "embed-multilingual-v3.0",
+    "input_type": "image",
+    "embedding_types": ["float"],
+    "images": [enc_img]
+}
+```
+### Restrictions: 
+- The API only accepts images in the following formats: `png`, `jpeg`, `webp`, and `gif`
+- Image embedding does not currently support batching, so each request can include at most 1 image
+- The maximum image size is `5MB`
+- The `images` parameter only accepts a base64-encoded image formatted as a Data URL
+
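The restrictions above can be sketched as a small client-side helper that turns raw image bytes into a Data URL and builds the request body from the example. This is a minimal sketch, not part of the Cohere SDK: `to_data_url` and `build_embed_request` are hypothetical names, the `data:image/<format>;base64,...` layout is the generic Data URL form, and treating `5mb` as 5 * 1024 * 1024 bytes is an assumption since the exact byte limit is not stated.

```python
import base64

# Formats and size limit taken from the restrictions listed above.
ALLOWED_FORMATS = {"png", "jpeg", "webp", "gif"}
# Assumption: "5mb" is interpreted as 5 MiB; the exact limit is not specified.
MAX_IMAGE_BYTES = 5 * 1024 * 1024


def to_data_url(image_bytes: bytes, image_format: str) -> str:
    """Encode raw image bytes as a base64 Data URL (hypothetical helper)."""
    fmt = image_format.lower()
    if fmt == "jpg":  # common alias for jpeg
        fmt = "jpeg"
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported image format: {image_format!r}")
    if len(image_bytes) > MAX_IMAGE_BYTES:
        raise ValueError("image exceeds the maximum allowed size")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:image/{fmt};base64,{encoded}"


def build_embed_request(
    image_bytes: bytes,
    image_format: str,
    model: str = "embed-multilingual-v3.0",
) -> dict:
    """Build a request body shaped like the example above (hypothetical helper)."""
    return {
        "model": model,
        "input_type": "image",
        "embedding_types": ["float"],
        # Batching is not supported, so the list always holds exactly one image.
        "images": [to_data_url(image_bytes, image_format)],
    }
```

For instance, `build_embed_request(open("photo.png", "rb").read(), "png")` would return a dictionary that can be serialized to JSON and sent as the body of the `POST` request shown earlier (the file name here is illustrative).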
@@ -77,6 +77,8 @@ result = co.embed(
 print(result)
 ```
 
+Note that we've released multimodal embeddings models that are able to handle images in addition to text. Find [more information here](https://docs.cohere.com/docs/multimodal-embeddings).
+
 ## Text Generation
 
 You can use this code to invoke Cohere's Command models on Amazon SageMaker:

@@ -99,6 +99,7 @@ Though this section is called "Text Generation", it's worth pointing out that th
 We expose two routes for Embed v3 - English and Embed v3 - Multilingual inference:
 
 - `v1/embeddings` adheres to the Azure AI Generative Messages API schema; 
+    - Use `v1/images/embeddings` if you want to use one of our [multimodal embeddings models](/docs/multimodal-embeddings).
 - ` v1/embed` supports Cohere's native API schema.
 
 You can find more information about Azure's API [here](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/deploy-models-cohere-embed#embed-api-reference-for-cohere-embed-models-deployed-as-a-service).

@@ -14,22 +14,22 @@ Embed models can be used to generate embeddings from text or classify it based o
 
 ## English Models
 
-| Latest Model                | Description                                                                                          | Dimensions | Max Tokens (Context Length) | Similarity Metric | Endpoints                                                                                  |
-|-----------------------------|------------------------------------------------------------------------------------------------------|------------|-----------------------------|-------------------|-------------------------------------------------------------------------------------------|
-| `embed-english-v3.0`        | A model that allows for text to be classified or turned into embeddings. English only.               | 1024       | 512                         | Cosine Similarity, Dot Product Similarity, Euclidean Distance | [Embed](/reference/embed),  <br/>[Embed Jobs](/reference/embed-jobs)               |
-| `embed-english-light-v3.0`  | A smaller, faster version of `embed-english-v3.0`. Almost as capable, but a lot faster. English only.| 384        | 512                         | Cosine Similarity, Dot Product Similarity, Euclidean Distance | [Embed](/reference/embed),  <br/>[Embed Jobs](/reference/embed-jobs)               |
-| `embed-english-v2.0`        | Our older embeddings model that allows for text to be classified or turned into embeddings. English only | 4096       | 512                         | Cosine Similarity | [Classify](/reference/classify), [Embed](/reference/embed)                         |
-| `embed-english-light-v2.0`  | A smaller, faster version of embed-english-v2.0. Almost as capable, but a lot faster. English only.   | 1024       | 512                         | Cosine Similarity | [Classify](/reference/classify), [Embed](/reference/embed)                         |
+| Latest Model                | Description                                                                                              | Modality     | Dimensions | Max Tokens (Context Length) | Similarity Metric                                             | Endpoints                                                                          |
+|-----------------------------|----------------------------------------------------------------------------------------------------------|--------------|------------|-----------------------------|---------------------------------------------------------------|------------------------------------------------------------------------------------|
+| `embed-english-v3.0`        | A model that allows for text to be classified or turned into embeddings. English only.                   | Text, Images | 1024       | 512                         | Cosine Similarity, Dot Product Similarity, Euclidean Distance | [Embed](/reference/embed),  <br/>[Embed Jobs](/reference/embed-jobs)               |
+| `embed-english-light-v3.0`  | A smaller, faster version of `embed-english-v3.0`. Almost as capable, but a lot faster. English only.    | Text, Images | 384        | 512                         | Cosine Similarity, Dot Product Similarity, Euclidean Distance | [Embed](/reference/embed),  <br/>[Embed Jobs](/reference/embed-jobs)               |
+| `embed-english-v2.0`        | Our older embeddings model that allows for text to be classified or turned into embeddings. English only.| Text         | 4096       | 512                         | Cosine Similarity                                             | [Classify](/reference/classify), [Embed](/reference/embed)                         |
+| `embed-english-light-v2.0`  | A smaller, faster version of embed-english-v2.0. Almost as capable, but a lot faster. English only.      | Text         | 1024       | 512                         | Cosine Similarity                                             | [Classify](/reference/classify), [Embed](/reference/embed)                         |
 
 
 
 ## Multi-Lingual Models
 
-| Latest Model                     | Description                                                                                                                                            | Dimensions | Max Tokens (Context Length) | Similarity Metric       | Endpoints                                                                                          |
-|----------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------|------------|-----------------------------|-------------------------|---------------------------------------------------------------------------------------------------|
-| `embed-multilingual-v3.0`        | Provides multilingual classification and embedding support. [See supported languages here.](/docs/supported-languages)            | 1024       | 512                         | Cosine Similarity, Dot Product Similarity, Euclidean Distance       | [Embed](/reference/embed), [Embed Jobs](/reference/embed-jobs)                             |
-| `embed-multilingual-light-v3.0`  | A smaller, faster version of `embed-multilingual-v3.0`. Almost as capable, but a lot faster. Supports multiple languages.                                | 384        | 512                         | Cosine Similarity, Dot Product Similarity, Euclidean Distance       | [Embed](/reference/embed),  <br/>[Embed Jobs](/reference/embed-jobs)                         |
-| `embed-multilingual-v2.0`        | Provides multilingual classification and embedding support. [See supported languages here.](/docs/supported-languages)            | 768        | 256                         | Dot Product Similarity  | [Classify](/reference/classify), [Embed](/reference/embed)                                   |
+| Latest Model                     | Description                                                                                                                                  | Modality          | Dimensions | Max Tokens (Context Length) | Similarity Metric                                                   | Endpoints                                                              |
+|----------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------|-------------------|------------|-----------------------------|---------------------------------------------------------------------|------------------------------------------------------------------------|
+| `embed-multilingual-v3.0`        | Provides multilingual classification and embedding support. [See supported languages here.](/docs/supported-languages)                       | Text, Images      | 1024       | 512                         | Cosine Similarity, Dot Product Similarity, Euclidean Distance       | [Embed](/reference/embed), [Embed Jobs](/reference/embed-jobs)         |
+| `embed-multilingual-light-v3.0`  | A smaller, faster version of `embed-multilingual-v3.0`. Almost as capable, but a lot faster. Supports multiple languages.                    | Text, Images      | 384        | 512                         | Cosine Similarity, Dot Product Similarity, Euclidean Distance       | [Embed](/reference/embed),  <br/>[Embed Jobs](/reference/embed-jobs)   |
+| `embed-multilingual-v2.0`        | Provides multilingual classification and embedding support. [See supported languages here.](/docs/supported-languages)                       | Text              | 768        | 256                         | Dot Product Similarity                                              | [Classify](/reference/classify), [Embed](/reference/embed)             |