diff --git a/fern/pages/integrations/integrations/elasticsearch-and-cohere.mdx b/fern/pages/integrations/integrations/elasticsearch-and-cohere.mdx
index 9250a0d8..2e5f41e1 100644
--- a/fern/pages/integrations/integrations/elasticsearch-and-cohere.mdx
+++ b/fern/pages/integrations/integrations/elasticsearch-and-cohere.mdx
@@ -51,6 +51,10 @@ This tutorial assumes you have the following:
 Note: While this tutorial integrates Cohere with an Elastic Cloud [serverless](https://docs.elastic.co/serverless/elasticsearch/get-started) project, you can also integrate with your self-managed Elasticsearch deployment or Elastic Cloud deployment by simply switching from the [serverless](https://docs.elastic.co/serverless/elasticsearch/clients) to the general [language client](https://www.elastic.co/guide/en/elasticsearch/client/index.html).
 
+## Create an Elastic Serverless deployment
+
+If you don't have an Elastic Cloud deployment, sign up [here](https://cloud.elastic.co/registration?utm_source=github&utm_content=elasticsearch-labs-notebook) for a free trial and request access to Elastic Serverless.
+
 ## Install the required packages
 
 Install and import the required Python Packages:
@@ -65,10 +69,15 @@ To install the packages, use the following code
 !pip install cohere==5.2.5
 ```
 
+After the installation has finished, find your endpoint URL and create your API key in the Serverless dashboard.
+
 ## Import the required packages
 
+Next, we import the modules we need. 🔐 NOTE: `getpass` lets us prompt for credentials securely, without echoing them to the terminal or hard-coding them in the notebook.
+
 ```python PYTHON
 from elasticsearch_serverless import Elasticsearch, helpers
+from getpass import getpass
 import cohere
 import json
 import requests
@@ -76,17 +85,15 @@ import requests
 
 ## Create an Elasticsearch client
 
-In order to create an Elasticsearch client you will need:
+Now we can instantiate the Python Elasticsearch client.
 
-- An endpoint for your cluster, found in the Elastic Serverless dashboard
-- An encoded API key
+First we prompt the user for the endpoint and encoded API key, then use them to create an instance of the Elasticsearch class.
 
-When creating your API key in the Serverless dashboard make sure to turn on Control security privileges, and edit cluster privileges to specify `"cluster": ["all"].`
-Note - you can also create a client using a local or Elastic Cloud cluster. For simplicity we use Elastic Serverless.
+When creating your Elastic Serverless API key, make sure to turn on Control security privileges and edit the cluster privileges to specify `"cluster": ["all"]`.
 
 ```python PYTHON
-ELASTICSEARCH_ENDPOINT = "elastic_endpoint"
-ELASTIC_API_KEY = "encoded_api_key"
+ELASTICSEARCH_ENDPOINT = getpass("Elastic Endpoint: ")
+ELASTIC_API_KEY = getpass("Elastic encoded API key: ")
 
 # Use the encoded API key
 client = Elasticsearch(
     ELASTICSEARCH_ENDPOINT,
@@ -108,9 +115,11 @@ To set up an inference pipeline for ingestion we first must create an inference
 
 We will create an inference endpoint that uses `embed-english-v3.0` and `int8` or `byte` compression to save on storage.
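+
+Before doing that, you can optionally check that the client can reach your project. This is a minimal sketch; it assumes your client version exposes the standard `info()` helper.
+
+```python PYTHON
+# Optional sanity check: confirm the connection and credentials work before continuing.
+# info() returns basic metadata about the deployment when authentication succeeds.
+print(client.info())
+```
+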
 ```python PYTHON
-COHERE_API_KEY = "cohere_api_key"
+COHERE_API_KEY = getpass("Enter Cohere API key: ")
+
+# Delete the inference model if it already exists
+client.options(ignore_status=[404]).inference.delete(inference_id="cohere_embeddings")
 
-client.inference.put_model(
+client.inference.put(
     task_type="text_embedding",
     inference_id="cohere_embeddings",
     body={
@@ -126,67 +135,57 @@ client.inference.put_model(
 )
 ```
 
-## Create an inference pipeline
-
-Now that we have an inference endpoint we can create an inference pipeline and processor to use when we ingest documents into our index.
+Here's what you might see:
 
-```python PYTHON
-client.ingest.put_pipeline(
-    id="cohere_embeddings",
-    description="Ingest pipeline for Cohere inference.",
-    processors=[
-        {
-            "inference": {
-                "model_id": "cohere_embeddings",
-                "input_output": {
-                    "input_field": "text",
-                    "output_field": "text_embedding",
-                },
-            }
-        }
-    ],
-)
+```
+Enter Cohere API key: ··········
+ObjectApiResponse({'model_id': 'cohere_embeddings', 'inference_id': 'cohere_embeddings', 'task_type': 'text_embedding', 'service': 'cohere', 'service_settings': {'similarity': 'cosine', 'dimensions': 1024, 'model_id': 'embed-english-v3.0', 'rate_limit': {'requests_per_minute': 10000}, 'embedding_type': 'byte'}, 'task_settings': {}})
 ```
 
-Let's note a few important parameters from that API call:
-
-- `inference`: A processor that performs inference using a machine learning model or service such as Cohere.
-- `model_id`: Specifies the ID of the inference endpoint to be used. In this example, the model ID is set to `cohere_embeddings` to match the inference endpoint we created.
-- `input_output`: Specifies input and output fields.
-- `input_field`: Field name from which the `dense_vector` representation is created. This needs to match the data we are passing to the processor.
-- `output_field`: Field name which contains inference results.
+## Create the Index
 
-## Create index
+The mapping of the destination index – the index that contains the embeddings that the model will generate based on your input text – must be created. The destination index must have a field with the [`semantic_text`](https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-text.html) field type to index the output of the Cohere model.
 
-We will now create an empty index that will be the destination of our documents and embeddings.
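+Before creating the index, you can optionally sanity-check the new endpoint by embedding a sample string directly. This is a minimal sketch; the exact response shape may vary by Elasticsearch version.
+
+```python PYTHON
+# Optional: run a sample string through the endpoint we just created.
+# The response should contain one embedding per input string.
+sample = client.inference.inference(
+    inference_id="cohere_embeddings",
+    body={"input": ["Tell me about the 2022 FIFA World Cup"]},
+)
+print(sample)
+```
+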
+Let's create an index named `cohere-wiki-embeddings` with the mappings we need.
 
 ```python PYTHON
+client.indices.delete(index="cohere-wiki-embeddings", ignore_unavailable=True)
 client.indices.create(
     index="cohere-wiki-embeddings",
-    settings={"index": {"default_pipeline": "cohere_embeddings"}},
     mappings={
         "properties": {
-            "text_embedding": {
-                "type": "dense_vector",
-                "dims": 1024,
-                "element_type": "byte",
+            "text_semantic": {
+                "type": "semantic_text",
+                "inference_id": "cohere_embeddings"
             },
-            "text": {"type": "text"},
+            "text": {"type": "text", "copy_to": "text_semantic"},
             "wiki_id": {"type": "integer"},
             "url": {"type": "text"},
             "views": {"type": "float"},
             "langs": {"type": "integer"},
             "title": {"type": "text"},
             "paragraph_id": {"type": "integer"},
-            "id": {"type": "integer"},
+            "id": {"type": "integer"}
         }
     },
 )
 ```
 
-## Insert documents
+You might see something like this:
 
+```
+ObjectApiResponse({'acknowledged': True, 'shards_acknowledged': True, 'index': 'cohere-wiki-embeddings'})
+```
+
+Let's note a few important parameters from that API call:
+
+- `semantic_text`: A field type that automatically generates embeddings for text content using an inference endpoint.
+- `inference_id`: Specifies the ID of the inference endpoint to be used. In this example, it is set to `cohere_embeddings` to match the endpoint we created above.
+- `copy_to`: Copies the contents of the `text` field into `text_semantic`, the field where the inference results are generated.
+
+## Insert Documents
 
-Let's now index our wiki dataset.
+Let's insert our example wiki dataset. You need a production Cohere account to complete this step, otherwise the document ingestion will time out due to the API request rate limits.
 
 ```python PYTHON
 url = "https://raw.githubusercontent.com/cohere-ai/notebooks/main/notebooks/data/embed_jobs_sample_data.jsonl"
@@ -211,39 +210,105 @@ helpers.bulk(client, documents)
 print("Done indexing documents into `cohere-wiki-embeddings` index!")
 ```
 
-Our index should now be populated with our wiki data and text embeddings for the `text` field. Ingesting large datasets and creating vector or hybrid search indices is seamless with Elastic.
+You should see this:
 
-# Hybrid Search with Elasticsearch and Cohere
+```
+Done indexing documents into `cohere-wiki-embeddings` index!
+```
 
-Now let's start querying our index. We will perform a hybrid search query, which means we will compute the relevance of search results based on the vector similarity to our query, as well as the keyword similarity. Hybrid search tends to lead to state-of-the-art search results and Elastic is well-suited to offer this.
-Here we build a query that will search over the `title` and `text` fields using keyword matching, and will search over our text embeddings using vector similarity.
+## Semantic Search
+
+After the dataset has been enriched with the embeddings, you can query the data using the semantic query provided by Elasticsearch. `semantic_text` in Elasticsearch simplifies semantic search significantly. Learn more about how [semantic text](https://www.elastic.co/search-labs/blog/semantic-search-simplified-semantic-text) in Elasticsearch allows you to focus on your model and results instead of on the technical details.
 
-```python PYTHON
+```python PYTHON
+query = "When were the semi-finals of the 2022 FIFA world cup played?"
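+
+# The `semantic` query below embeds the query text with the same
+# `cohere_embeddings` inference endpoint used at index time and compares it
+# against the `text_semantic` field from the index mapping.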
+
+response = client.search(
+    index="cohere-wiki-embeddings",
+    size=100,
+    query={
+        "semantic": {
+            "query": query,
+            "field": "text_semantic"
+        }
+    }
+)
+
+raw_documents = response["hits"]["hits"]
+
+# Display the first 10 results
+for document in raw_documents[0:10]:
+    print(f'Title: {document["_source"]["title"]}\nText: {document["_source"]["text"]}\n')
+
+# Format the documents for ranking
+documents = []
+for hit in response["hits"]["hits"]:
+    documents.append(hit["_source"]["text"])
+```
+
+Here's what that might look like:
+
+```
+Title: 2022 FIFA World Cup
+Text: The 2022 FIFA World Cup was an international football tournament contested by the men's national teams of FIFA's member associations and 22nd edition of the FIFA World Cup. It took place in Qatar from 20 November to 18 December 2022, making it the first World Cup held in the Arab world and Muslim world, and the second held entirely in Asia after the 2002 tournament in South Korea and Japan. France were the defending champions, having defeated Croatia 4–2 in the 2018 final. At an estimated cost of over $220 billion, it is the most expensive World Cup ever held to date; this figure is disputed by Qatari officials, including organising CEO Nasser Al Khater, who said the true cost was $8 billion, and other figures related to overall infrastructure development since the World Cup was awarded to Qatar in 2010.
+
+Title: 2022 FIFA World Cup
+Text: The semi-finals were played on 13 and 14 December. Messi scored a penalty kick before Julián Álvarez scored twice to give Argentina a 3–0 victory over Croatia. Théo Hernandez scored after five minutes as France led Morocco for most of the game and later Randal Kolo Muani scored on 78 minutes to complete a 2–0 victory for France over Morocco as they reached a second consecutive final.
+
+Title: 2022 FIFA World Cup
+Text: The quarter-finals were played on 9 and 10 December. Croatia and Brazil ended 0–0 after 90 minutes and went to extra time. Neymar scored for Brazil in the 15th minute of extra time. Croatia, however, equalised through Bruno Petković in the second period of extra time. With the match tied, a penalty shootout decided the contest, with Croatia winning the shoot-out 4–2. In the second quarter-final match, Nahuel Molina and Messi scored for Argentina before Wout Weghorst equalised with two goals shortly before the end of the game. The match went to extra time and then penalties, where Argentina would go on to win 4–3. Morocco defeated Portugal 1–0, with Youssef En-Nesyri scoring at the end of the first half. Morocco became the first African and the first Arab nation to advance as far as the semi-finals of the competition. Despite Harry Kane scoring a penalty for England, it was not enough to beat France, who won 2–1 by virtue of goals from Aurélien Tchouaméni and Olivier Giroud, sending them to their second consecutive World Cup semi-final and becoming the first defending champions to reach this stage since Brazil in 1998.
+
+Title: 2022 FIFA World Cup
+Text: Unlike previous FIFA World Cups, which are typically played in June and July, because of Qatar's intense summer heat and often fairly high humidity, the 2022 World Cup was played in November and December.
As a result, the World Cup was unusually staged in the middle of the seasons of domestic association football leagues, which started in late July or August, including all of the major European leagues, which had been obliged to incorporate extended breaks into their domestic schedules to accommodate the World Cup. Major European competitions had scheduled their respective competitions group matches to be played before the World Cup, to avoid playing group matches the following year. + +Title: 2022 FIFA World Cup +Text: The match schedule was confirmed by FIFA in July 2020. The group stage was set to begin on 21 November, with four matches every day. Later, the schedule was tweaked by moving the Qatar vs Ecuador game to 20 November, after Qatar lobbied FIFA to allow their team to open the tournament. The final was played on 18 December 2022, National Day, at Lusail Stadium. + +Title: 2022 FIFA World Cup +Text: Owing to the climate in Qatar, concerns were expressed over holding the World Cup in its traditional time frame of June and July. In October 2013, a task force was commissioned to consider alternative dates and report after the 2014 FIFA World Cup in Brazil. On 24 February 2015, the FIFA Task Force proposed that the tournament be played from late November to late December 2022, to avoid the summer heat between May and September and also avoid clashing with the 2022 Winter Olympics in February, the 2022 Winter Paralympics in March and Ramadan in April. + +Title: 2022 FIFA World Cup +Text: Of the 32 nations qualified to play at the 2022 FIFA World Cup, 24 countries competed at the previous tournament in 2018. Qatar were the only team making their debut in the FIFA World Cup, becoming the first hosts to make their tournament debut since Italy in 1934. As a result, the 2022 tournament was the first World Cup in which none of the teams that earned a spot through qualification were making their debut. The Netherlands, Ecuador, Ghana, Cameroon, and the United States returned to the tournament after missing the 2018 tournament. Canada returned after 36 years, their only prior appearance being in 1986. Wales made their first appearance in 64 years – the longest ever gap for any team, their only previous participation having been in 1958. + +Title: 2022 FIFA World Cup +Text: After UEFA were guaranteed to host the 2018 event, members of UEFA were no longer in contention to host in 2022. There were five bids remaining for the 2022 FIFA World Cup: Australia, Japan, Qatar, South Korea, and the United States. + +Title: Cristiano Ronaldo +Text: Ronaldo was named in Portugal's squad for the 2022 FIFA World Cup in Qatar, making it his fifth World Cup. On 24 November, in Portugal's opening match against Ghana, Ronaldo scored a penalty kick and became the first male player to score in five different World Cups. In the last group game against South Korea, Ronaldo received criticism from his own coach for his reaction at being substituted. He was dropped from the starting line-up for Portugal's last 16 match against Switzerland, marking the first time since Euro 2008 that he had not started a game for Portugal in a major international tournament, and the first time Portugal had started a knockout game without Ronaldo in the starting line-up at an international tournament since Euro 2000. He came off the bench late on as Portugal won 6–1, their highest tally in a World Cup knockout game since the 1966 World Cup, with Ronaldo's replacement Gonçalo Ramos scoring a hat-trick. 
+Portugal employed the same strategy in the quarter-finals against Morocco, with Ronaldo once again coming off the bench; in the process, he equalled Bader Al-Mutawa's international appearance record, becoming the joint–most capped male footballer of all time, with 196 caps. Portugal lost 1–0, however, with Morocco becoming the first CAF nation ever to reach the World Cup semi-finals.
+
+Title: 2022 FIFA World Cup
+Text: The final draw was held at the Doha Exhibition and Convention Center in Doha, Qatar, on 1 April 2022, 19:00 AST, prior to the completion of qualification. The two winners of the inter-confederation play-offs and the winner of the Path A of the UEFA play-offs were not known at the time of the draw. The draw was attended by 2,000 guests and was led by Carli Lloyd, Jermaine Jenas and sports broadcaster Samantha Johnson, assisted by the likes of Cafu (Brazil), Lothar Matthäus (Germany), Adel Ahmed Malalla (Qatar), Ali Daei (Iran), Bora Milutinović (Serbia/Mexico), Jay-Jay Okocha (Nigeria), Rabah Madjer (Algeria), and Tim Cahill (Australia).
+```
+
+## Hybrid Search
+
+After the dataset has been enriched with the embeddings, you can query the data using hybrid search.
+
+Pass a `bool` query that combines a lexical `multi_match` query over the `text` and `title` fields with a `semantic` query against the `text_semantic` field, so results are scored on both keyword and semantic relevance.
+
+```python PYTHON
 query = "When were the semi-finals of the 2022 FIFA world cup played?"
 
 response = client.search(
     index="cohere-wiki-embeddings",
     size=100,
-    knn={
-        "field": "text_embedding",
-        "query_vector_builder": {
-            "text_embedding": {
-                "model_id": "cohere_embeddings",
-                "model_text": query,
-            }
-        },
-        "k": 10,
-        "num_candidates": 50,
-    },
     query={
-        "multi_match": {
-            "query": query,
-            "fields": ["text", "title"]
+        "bool": {
+            "must": {
+                "multi_match": {
+                    "query": query,
+                    "fields": ["text", "title"]
+                }
+            },
+            "should": {
+                "semantic": {
+                    "query": query,
+                    "field": "text_semantic"
+                }
+            },
         }
     }
 )
 
-raw_documents = response[“hits”][“hits”]
+raw_documents = response["hits"]["hits"]
 
 # Display the first 10 results
 for document in raw_documents[0:10]:
@@ -255,16 +320,17 @@ for hit in response["hits"]["hits"]:
     documents.append(hit["_source"]["text"])
 ```
 
-These are looking pretty good, but we can better consolidate our results using Cohere’s new Rerank v3 model available through Elastic’s inference API.
+## Ranking
 
-## Rerank search results with Cohere and Elasticsearch
+In order to effectively combine the results from our vector and BM25 retrieval, we can use Cohere's Rerank 3 model through the inference API to provide a final, more precise, semantic reranking of our results.
 
-In order to effectively combine the results from our vector and BM25 retrieval, we can use Cohere's Rerank v3 model through the inference API to provide a final, more precise, semantic reranking of our results.
+First, create an inference endpoint with your Cohere API key. Make sure to specify a name for your endpoint, and the `model_id` of one of the rerank models. In this example we will use Rerank 3.
 
-First, create an inference endpoint with your Cohere API key. Make sure to specify a name for your endpoint, and the model_id of one of the rerank models. In this example we will use Rerank v3.
+```python PYTHON
+# Delete the inference model if it already exists
+client.options(ignore_status=[404]).inference.delete(inference_id="cohere_rerank")
+
-```python PYTHON
-client.inference.put_model(
+client.inference.put(
     task_type="rerank",
     inference_id="cohere_rerank",
     body={
@@ -282,9 +348,11 @@ client.inference.put_model(
 
 You can now rerank your results using that inference endpoint. Here we will pass in the query we used for retrieval, along with the documents we just retrieved using hybrid search.
 
-The inference service will respond with a list of documents in descending order of relevance. Each document has a corresponding index (reflecting the order the documents were in when sent to the inference endpoint), and if the “return_documents” task setting is True, then the document texts will be included as well.
+The inference service will respond with a list of documents in descending order of relevance. Each document has a corresponding index (reflecting the order the documents were in when sent to the inference endpoint), and if the `return_documents` task setting is True, then the document texts will be included as well.
 
-```python PYTHON
+In this case we will set `return_documents` to False and reconstruct the input documents from the index returned in the response.
+
+```python PYTHON
 response = client.inference.inference(
     inference_id="cohere_rerank",
     body={
@@ -309,36 +377,37 @@ for document in ranked_documents[0:10]:
     print(f"Title: {document['title']}\nText: {document['text']}\n")
 ```
 
-# RAG with Cohere and Elasticsearch
+## Retrieval Augmented Generation
 
 Now that we have ranked our results, we can easily turn this into a RAG system with Cohere's Chat API. Pass in the retrieved documents, along with the query and see the grounded response using Cohere's newest generative model Command R+.
 
-Next, we can easily get a grounded generation with citations from the Cohere Chat API. We simply pass in the user query and documents retrieved from Elasticsearch to the API, and print out our grounded response.
+First, we will create the Cohere client.
+
+```python PYTHON
+co = cohere.Client(COHERE_API_KEY)
+```
+
+Next, we can easily get a grounded generation with citations from the Cohere Chat API. We simply pass in the user query and documents retrieved from Elastic to the API, and print out our grounded response.
 
 ```python PYTHON
-response = co.chat(message=query, documents=ranked_documents, model='command-r-plus')
+response = co.chat(
+    message=query,
+    documents=ranked_documents,
+    model='command-r-plus'
+)
 
 source_documents = []
 for citation in response.citations:
-  for document_id in citation.document_ids:
-    if document_id not in source_documents:
-      source_documents.append(document_id)
+    for document_id in citation.document_ids:
+        if document_id not in source_documents:
+            source_documents.append(document_id)
 
 print(f"Query: {query}")
 print(f"Response: {response.text}")
 print("Sources:")
 for document in response.documents:
-  if document['id'] in source_documents:
-    print(f"{document['title']}: {document['text']}")
-```
-
-And our response should look something like this.
-
-```
-Query: When were the semi-finals of the 2022 FIFA world cup played?
-Response: The semi-finals of the 2022 FIFA World Cup were played on 13 and 14 December.
-Sources:
-2022 FIFA World Cup: The semi-finals were played on 13 and 14 December. Messi scored a penalty...
+    if document['id'] in source_documents:
+        print(f"{document['title']}: {document['text']}")
 ```
 
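+The grounded response should look something like this (the exact wording will vary from run to run):
+
+```
+Query: When were the semi-finals of the 2022 FIFA world cup played?
+Response: The semi-finals of the 2022 FIFA World Cup were played on 13 and 14 December.
+Sources:
+2022 FIFA World Cup: The semi-finals were played on 13 and 14 December. Messi scored a penalty...
+```
+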
-And there you have it! A quick and easy implementation of hybrid search and RAG with Cohere and Elastic.
+And there you have it! A quick and easy implementation of hybrid search and RAG with Cohere and Elastic.
\ No newline at end of file