Update embeddings docs (#217)
* update embeddings docs

* fix path image
mrmer1 authored Oct 28, 2024
1 parent ecc1faf commit 737741c
Showing 4 changed files with 169 additions and 22 deletions.
14 changes: 9 additions & 5 deletions fern/pages/text-embeddings/embeddings.mdx
@@ -50,7 +50,7 @@ Cohere embeddings are optimized for different types of inputs.

- When using embeddings for [semantic search](/docs/semantic-search), the search query should be embedded by setting `input_type="search_query"`
- When using embeddings for semantic search, the text passages that are being searched over should be embedded with `input_type="search_document"`.
- When using embeddings for [`classification`](/docs/text-classification-with-embed) and `clustering` tasks, you can set `input_type` to either 'classification' or 'clustering' to optimize the embeddings appropriately.
- When using embeddings for `classification` and `clustering` tasks, you can set `input_type` to either 'classification' or 'clustering' to optimize the embeddings appropriately.
- When `input_type='image'`, the expected input to be embedded is an image instead of text.
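The accepted `input_type` values listed above can be captured in a small lookup helper. This is a hypothetical illustration, not part of the Cohere SDK — only the string values themselves come from the docs:

```python
# Hypothetical helper mapping a use case to the input_type string
# accepted by the Embed API (values taken from the list above).
INPUT_TYPES = {
    "query": "search_query",
    "document": "search_document",
    "classification": "classification",
    "clustering": "clustering",
    "image": "image",
}


def input_type_for(use_case):
    # Raises KeyError for use cases the API does not support
    return INPUT_TYPES[use_case]


print(input_type_for("query"))  # search_query
```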

## Multilingual Support
@@ -93,7 +93,11 @@ Be aware that image embedding has the following restrictions:
- Our API currently does not support batch image embeddings.

```python PYTHON
import cohere
import cohere
from PIL import Image
from io import BytesIO
import base64

co = cohere.Client(api_key="<YOUR API KEY>")

# The model accepts input in base64 as a Data URL
@@ -125,9 +129,9 @@ ret.embeddings.float

## Compression Levels

The Cohere embeddings platform now supports compression. The Embed API features an `embedding_types` parameter which allows the user to specify various ways of compressing the output.
The Cohere embeddings platform supports compression. The Embed API features an `embedding_types` parameter which allows the user to specify various ways of compressing the output.

The following embedding types are now supported:
The following embedding types are supported:

- `float`
- `int8`
@@ -145,7 +149,7 @@ ret = co.embed(texts=phrases,
ret.embeddings # This contains the float embeddings
```

However, we recommend being explicit about the `embedding type(s)`. To specify an `embedding type`, pass one of the types from the list above as a list containing a string:
However, we recommend being explicit about the embedding type(s). To specify an embedding type, pass one of the types from the list above as a list containing a string:

```python PYTHON
ret = co.embed(texts=phrases,
81 changes: 66 additions & 15 deletions fern/pages/v2/text-embeddings/embeddings.mdx
@@ -5,7 +5,7 @@ slug: "v2/docs/embeddings"
hidden: false
description: >-
Embeddings transform text into numerical data, enabling language-agnostic
similarity searches and efficient storage with compression.
similarity searches and efficient storage with compression (API v2).
image: "../../../assets/images/fa074c3-cohere_docs_preview_image_1200x630_copy.jpg"
keywords: "vector embeddings, embeddings, natural language processing"

@@ -48,11 +48,16 @@ calculate_similarity(soup1, london) # 0.16 - not similar!
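The `calculate_similarity` helper referenced in the hunk context above is not defined in this excerpt; a plausible minimal sketch, assuming it computes cosine similarity between two embedding vectors, would be:

```python
import math


def calculate_similarity(a, b):
    # Cosine similarity between two equal-length embedding vectors:
    # dot(a, b) / (||a|| * ||b||)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


print(calculate_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
```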

## The `input_type` parameter

Cohere embeddings are optimized for different types of inputs. For example, when using embeddings for semantic search, the search query should be embedded by setting `input_type="search_query"` whereas the text passages that are being searched over should be embedded with `input_type="search_document"`. You can find more details and a code snippet in the [Semantic Search guide](/v2/docs/semantic-search). Similarly, the input type can be set to `classification` ([example](/v2/docs/text-classification-with-cohere)) and `clustering` to optimize the embeddings for those use cases.
Cohere embeddings are optimized for different types of inputs.

- When using embeddings for [semantic search](/docs/semantic-search), the search query should be embedded by setting `input_type="search_query"`
- When using embeddings for semantic search, the text passages that are being searched over should be embedded with `input_type="search_document"`.
- When using embeddings for `classification` and `clustering` tasks, you can set `input_type` to either 'classification' or 'clustering' to optimize the embeddings appropriately.
- When `input_type='image'`, the expected input to be embedded is an image instead of text.

## Multilingual Support

In addition to `embed-english-v3.0`, we offer a best-in-class multilingual model [embed-multilingual-v3.0](/v2/docs/embed-2#multi-lingual-models) with support for over 100 languages, including Chinese, Spanish, and French. This model can be used with the Embed API, just like its English counterpart:
In addition to `embed-english-v3.0`, we offer a best-in-class multilingual model [embed-multilingual-v3.0](/docs/embed-2#multi-lingual-models) with support for over 100 languages, including Chinese, Spanish, and French. This model can be used with the Embed API, just like its English counterpart:

```python PYTHON
import cohere
@@ -75,44 +80,90 @@ print(embeddings[0][:5]) # Print embeddings for the first text

```

## Image Embeddings

The Cohere embedding platform supports image embeddings for the entire `embed-v3.0` family. This functionality can be utilized with the following steps:

- Pass `image` to the `input_type` parameter (as discussed above).
- Pass your image Data URL to the new `images` parameter.

Be aware that image embedding has the following restrictions:

- If `input_type='image'`, the `texts` field must be empty.
- The original image file type must be `png` or `jpeg`.
- The image must be base64 encoded and sent as a Data URL to the `images` parameter.
- Our API currently does not support batch image embeddings.

```python PYTHON
import cohere
from PIL import Image
from io import BytesIO
import base64

co = cohere.ClientV2(api_key="<YOUR API KEY>")

# The model accepts input in base64 as a Data URL

def image_to_base64_data_url(image_path):
    # Open the image file
    with Image.open(image_path) as img:
        # Create a BytesIO object to hold the image data in memory
        buffered = BytesIO()
        # Save the image as PNG to the BytesIO object
        img.save(buffered, format="PNG")
        # Encode the image data in base64
        img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")

    # Create the Data URL; this assumes the original image file type was png
    data_url = f"data:image/png;base64,{img_base64}"
    return data_url

processed_image = image_to_base64_data_url("<PATH_TO_IMAGE>")

ret = co.embed(images=[processed_image],
               model='embed-english-v3.0',
               embedding_types=["float"],
               input_type='image')

ret.embeddings.float
```

## Compression Levels

The Cohere embeddings platform supports compression. The Embed API features a required parameter, `embedding_types`, which allows the user to specify various ways of compressing the output.
The Cohere embeddings platform supports compression. The Embed API features an `embedding_types` parameter which allows the user to specify various ways of compressing the output.

The following embedding types are now supported:
In the v2 API, this is a required parameter for calling the Embed endpoint.

The following embedding types are supported:

- `float`
- `int8`
- `uint8`
- `binary`
- `ubinary`

To specify an `embedding type`, pass one of the types from the list above as a list containing a string:
To specify an embedding type, pass one of the types from the list above as a list containing a string:

```python PYTHON
ret = co.embed(texts=phrases,
               model=model,
               input_type=input_type,
               embedding_types=['int8'])
               embedding_types=["float"])

ret.embeddings.int8 # This contains your int8 embeddings
ret.embeddings.float # This will be empty
ret.embeddings.uint8 # This will be empty
ret.embeddings.ubinary # This will be empty
ret.embeddings.binary # This will be empty
ret.embeddings.float # This contains the float embeddings
```

Finally, you can also pass several `embedding_types` in as a list, in which case the endpoint will return a dictionary with both types available:
You can specify multiple embedding types in a single call. For example, the following call will return both `int8` and `float` embeddings:

```python PYTHON
ret = co.embed(texts=phrases,
               model=model,
               input_type=input_type,
               embedding_types=['int8', 'float'])
               embedding_types=["int8", "float"])

ret.embeddings.int8 # This contains your int8 embeddings
ret.embeddings.float # This contains your float embeddings
ret.embeddings.uint8 # This will be empty
ret.embeddings.ubinary # This will be empty
ret.embeddings.binary # This will be empty
```
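The storage savings behind these compressed types can be illustrated with a local sketch. This is purely hypothetical — Cohere performs its quantization server-side when you request a compressed embedding type; the code below only demonstrates why `int8` is 4x smaller and `ubinary` 32x smaller than `float32` per dimension:

```python
def quantize_int8(vec):
    # Hypothetical scalar quantization: map floats in [-1, 1] to signed
    # 8-bit integers (one byte per dimension instead of four).
    return [max(-128, min(127, round(x * 127))) for x in vec]


def pack_ubinary(vec):
    # Hypothetical sign-bit compression: pack each group of 8 dimensions
    # into one unsigned byte (one bit per dimension).
    bits = [1 if x > 0 else 0 for x in vec]
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for b in bits[i : i + 8]:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)


vec = [0.5, -0.25, 1.0, -1.0, 0.1, -0.1, 0.0, 0.9]
print(len(quantize_int8(vec)), len(pack_ubinary(vec)))  # 8 1
```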
92 changes: 92 additions & 0 deletions fern/pages/v2/text-embeddings/multimodal-embeddings.mdx
@@ -0,0 +1,92 @@
---
title: "Multimodal Embeddings"
slug: "v2/docs/multimodal-embeddings"

hidden: false
description: "Multimodal embeddings convert text and images into embeddings for search and classification (API v2)."
image: "../../../assets/images/fa074c3-cohere_docs_preview_image_1200x630_copy.jpg"
keywords: "vector embeddings, image embeddings, images, multimodal, multimodal embeddings, embeddings, natural language processing"

createdAt: "Tue Sep 17 2024 00:00:00 GMT+0000 (Coordinated Universal Time)"
updatedAt: "Tue Sep 17 2024 00:00:00 GMT+0000 (Coordinated Universal Time)"
---
<img src='../../../assets/images/multi-modal-guide-header.png' alt='embeddings.' />

<Note title="This Guide Uses the Embed API.">
You can find the API reference for the endpoint [here](/reference/embed).

Image capabilities are only compatible with our Embed v3.0 models.
</Note>

In this guide, we show you how to use the embed endpoint to embed a series of images. This guide uses a simple dataset of graphs to illustrate how semantic search can be done over images with Cohere. To see an end-to-end example of retrieval, check out this [notebook](https://github.com/cohere-ai/notebooks/blob/main/notebooks/Multimodal_Semantic_Search.ipynb).

### Introduction to Multimodal Embeddings

Information is often represented in multiple modalities. A document, for instance, may contain text, images, and graphs, while a product can be described through images, its title, and a written description. This combination of elements often leads to a comprehensive semantic understanding of the subject matter. Traditional embedding models have been limited to a single modality, and even multimodal embedding models often suffer from degradation in `text-to-text` or `text-to-image` retrieval tasks. The `embed-v3.0` series of models, however, is fully multimodal, enabling it to embed both images and text effectively. We have achieved state-of-the-art performance without compromising text-to-text retrieval capabilities.

### How to use Multimodal Embeddings

#### 1\. Prepare your Image for Embeddings

The Embed API takes in images with the following file formats: `png`, `jpeg`, `webp`, and `gif`. The images must then be formatted as a Data URL.

```python PYTHON
# Import the necessary packages
import os
import base64

# Defining the function to convert an image to a base64 Data URL
def image_to_base64_data_url(image_path):
    _, file_extension = os.path.splitext(image_path)
    file_type = file_extension[1:]

    with open(image_path, "rb") as f:
        enc_img = base64.b64encode(f.read()).decode('utf-8')
        enc_img = f"data:image/{file_type};base64,{enc_img}"
    return enc_img


image_path = '<YOUR IMAGE PATH>'
processed_image = image_to_base64_data_url(image_path)
```
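As a quick sanity check, the helper above can be exercised locally without calling the API. The file name and dummy bytes below are hypothetical test fixtures — the helper only inspects the file extension, not the contents:

```python
import base64
import os


def image_to_base64_data_url(image_path):
    # Same logic as the helper above: infer the file type from the
    # extension and build a base64 Data URL from the raw bytes.
    _, file_extension = os.path.splitext(image_path)
    file_type = file_extension[1:]
    with open(image_path, "rb") as f:
        enc_img = base64.b64encode(f.read()).decode("utf-8")
    return f"data:image/{file_type};base64,{enc_img}"


# Dummy bytes standing in for real PNG data (hypothetical fixture)
with open("tiny.png", "wb") as f:
    f.write(b"\x89PNG\r\n\x1a\nexample")

data_url = image_to_base64_data_url("tiny.png")
print(data_url.split(",")[0])  # data:image/png;base64
```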
#### 2\. Call the Embed Endpoint
```python PYTHON
# Import the necessary packages
import cohere
co = cohere.ClientV2(api_key="<YOUR API KEY>")

co.embed(
    model='embed-english-v3.0',
    images=[processed_image],
    input_type='image',
    embedding_types=['float']
)
```
## Sample Output
Below is a sample of what the output looks like when you pass in a `jpeg` with original dimensions of `1080x1350` and a standard bit depth of 24.
```json JSON
{
  "id": "d8f2b461-79a4-44ee-82e4-be601bbb07be",
  "embeddings": {
    "float_": [[-0.025604248, 0.0154418945, ...]],
    "int8": null,
    "uint8": null,
    "binary": null,
    "ubinary": null
  },
  "texts": [],
  "meta": {
    "api_version": {"version": "2", "is_deprecated": null, "is_experimental": null},
    "billed_units": {
      "input_tokens": null,
      "output_tokens": null,
      "search_units": null,
      "classifications": null,
      "images": 1
    },
    "tokens": null,
    "warnings": null
  },
  "images": [{"width": 1080, "height": 1080, "format": "jpeg", "bit_depth": 24}],
  "response_type": "embeddings_by_type"
}
```
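A response of this shape (`embeddings_by_type`) can be consumed by indexing into the type-keyed embeddings dictionary. The sketch below uses a truncated stand-in for the sample above, not real model output:

```python
# Truncated stand-in for the sample response above (hypothetical values)
response = {
    "embeddings": {
        "float_": [[-0.025604248, 0.0154418945]],
        "int8": None,
    },
    "images": [{"width": 1080, "height": 1080, "format": "jpeg", "bit_depth": 24}],
    "response_type": "embeddings_by_type",
}

# Only the requested embedding_types are populated; the rest are None
vectors = response["embeddings"]["float_"]
print(len(vectors), response["images"][0]["format"])  # 1 jpeg
```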
4 changes: 2 additions & 2 deletions fern/v2.yml
@@ -117,9 +117,9 @@ navigation:
- section: Text Embeddings (Vectors, Search, Retrieval)
contents:
- page: Introduction to Embeddings at Cohere
path: pages/text-embeddings/embeddings.mdx
path: pages/v2/text-embeddings/embeddings.mdx
- page: Multimodal Embeddings
path: pages/text-embeddings/multimodal-embeddings.mdx
path: pages/v2/text-embeddings/multimodal-embeddings.mdx
- page: Batch Embedding Jobs
path: pages/v2/text-embeddings/embed-jobs-api.mdx
- section: Reranking
