Commit

Merge branch 'main' into struct-op-tools
Signed-off-by: Meor Amer <[email protected]>
mrmer1 authored Nov 26, 2024
2 parents ed37745 + 68652b6 commit fcccbde
Showing 77 changed files with 9,070 additions and 817 deletions.
File renamed without changes.
2 changes: 1 addition & 1 deletion .github/workflows/check-mdx-frontmatter.yml
@@ -26,4 +26,4 @@ jobs:
run: pnpm install

- name: Run MDX frontmatter check
run: node .github/scripts/check-mdx-frontmatter.js
run: node .github/scripts/check-mdx-frontmatter.cjs
40 changes: 40 additions & 0 deletions .github/workflows/snippet-ci.yml
@@ -0,0 +1,40 @@
name: snippet-ci

on:
pull_request: {}
push:
branches:
- main

jobs:
run:
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v4

- name: Setup pnpm
uses: pnpm/action-setup@v2
with:
version: 8

- name: Install Dependencies
shell: bash
run: pnpm install

- name: Set up python
uses: actions/setup-python@v2
with:
python-version: '3.x'

- name: poetry install
run: |
python -m pip install --upgrade pip
python -m pip install poetry
poetry install
- name: Run snippet tests
continue-on-error: true
env:
CO_API_KEY: ${{ secrets.COHERE_TOKEN }}
run: pnpm run --filter snippet-tester test
172 changes: 56 additions & 116 deletions cohere-openapi.yaml

Large diffs are not rendered by default.

3 changes: 3 additions & 0 deletions fern/docs.yml
@@ -571,6 +571,9 @@ redirects:
- source: /docs/data-statement
destination: /docs/usage-guidelines
permanent: true
- source: /docs/usage-guidelines
destination: /docs/usage-policy
permanent: true
- source: /v2/v2/:slug*
destination: /v2/:slug*
permanent: true
4 changes: 2 additions & 2 deletions fern/pages/cookbooks/convfinqa-finetuning-wandb.mdx
@@ -55,10 +55,10 @@ from cohere.finetuning import (
)

# fill in your Cohere API key here
os.environ['COHERE_API_KEY'] = "<COHERE_API_KEY>"
os.environ["COHERE_API_KEY"] = "<COHERE_API_KEY>"

# instantiate the Cohere client
co = cohere.Client(os.environ['COHERE_API_KEY'])
co = cohere.Client(os.environ["COHERE_API_KEY"])
```

## Dataset
11 changes: 5 additions & 6 deletions fern/pages/cookbooks/deploy-finetuned-model-aws-marketplace.mdx
@@ -71,8 +71,8 @@ To subscribe to the algorithm:
Install the Python packages you will use below and import them. For example, you can run the command below to install `cohere` if you haven't done so.


```python
!pip install "cohere>=5.11.0"
```sh
pip install "cohere>=5.11.0"
```


@@ -200,9 +200,10 @@ save_hf_model(merged_weights_dir, merged_model)


```python
%%time
sess = sage.Session()
merged_weights = S3Uploader.upload(merged_weights_dir, s3_checkpoint_dir, sagemaker_session=sess)
merged_weights = S3Uploader.upload(
merged_weights_dir, s3_checkpoint_dir, sagemaker_session=sess
)
print("merged_weights", merged_weights)
```

@@ -213,7 +214,6 @@ Create Cohere client and use it to export the merged weights to the TensorRT-LLM


```python
%%time
co = cohere.SagemakerClient(aws_region=region)
co.sagemaker_finetuning.export_finetune(
arn=arn,
@@ -232,7 +232,6 @@ The Cohere client provides a built-in method to create an endpoint for inference


```python
%%time
co.sagemaker_finetuning.create_endpoint(
arn=arn,
endpoint_name=endpoint_name,
7 changes: 4 additions & 3 deletions fern/pages/cookbooks/finetune-on-sagemaker.mdx
@@ -58,10 +58,11 @@ To subscribe to the model algorithm:
2. On the AWS Marketplace listing, click on the **Continue to Subscribe** button.
3. On the **Subscribe to this software** page, review and click on **"Accept Offer"** if you and your organization agree with the EULA, pricing, and support terms. On the "Configure and launch" page, make sure the ARN displayed in your region matches the ARN in the following cell.

```sh
pip install "cohere>=5.11.0"
```

```python
!pip install "cohere>=5.11.0"

import cohere
import boto3
import sagemaker as sage
@@ -297,7 +298,7 @@ from tqdm import tqdm
total = 0
correct = 0
for line in tqdm(
open('./sample_finetune_scienceQA_eval.jsonl').readlines()
open("./sample_finetune_scienceQA_eval.jsonl").readlines()
):
total += 1
question_answer_json = json.loads(line)
86 changes: 43 additions & 43 deletions fern/pages/cookbooks/rag-cohere-mongodb.mdx
@@ -52,8 +52,8 @@ Libraries:



```python
!pip install --quiet datasets tqdm cohere pymongo
```sh
pip install --quiet datasets tqdm cohere pymongo
```


@@ -183,11 +183,11 @@ def combine_attributes(row):
combined = f"{row['company']} {row['sector']} "

# Add reports information
for report in row['reports']:
for report in row["reports"]:
combined += f"{report['year']} {report['title']} {report['author']} {report['content']} "

# Add recent news information
for news in row['recent_news']:
for news in row["recent_news"]:
combined += f"{news['headline']} {news['summary']} "

return combined.strip()
@@ -196,15 +196,15 @@ def combine_attributes(row):

```python
# Add the new column 'combined_attributes'
dataset_df['combined_attributes'] = dataset_df.apply(
dataset_df["combined_attributes"] = dataset_df.apply(
combine_attributes, axis=1
)
```


```python
# Display the first few rows of the updated dataframe
dataset_df[['company', 'ticker', 'combined_attributes']].head()
dataset_df[["company", "ticker", "combined_attributes"]].head()
```

<div>
@@ -270,7 +270,7 @@ def get_embedding(
texts=[text],
model=model,
input_type=input_type, # Used for embeddings of search queries run against a vector DB to find relevant documents
embedding_types=['float'],
embedding_types=["float"],
)

return response.embeddings.float[0]
@@ -279,7 +279,7 @@
# Apply the embedding function with a progress bar
tqdm.pandas(desc="Generating embeddings")
dataset_df["embedding"] = dataset_df[
'combined_attributes'
"combined_attributes"
].progress_apply(get_embedding)

print(f"We just computed {len(dataset_df['embedding'])} embeddings.")
@@ -421,8 +421,8 @@ def get_mongo_client(mongo_uri):
)

# Validate the connection
ping_result = client.admin.command('ping')
if ping_result.get('ok') == 1.0:
ping_result = client.admin.command("ping")
if ping_result.get("ok") == 1.0:
# Connection successful
print("Connection to MongoDB successful")
return client
@@ -478,7 +478,7 @@ MongoDB's Document model and its compatibility with Python dictionaries offer se
![](../../assets/images/rag-cohere-mongodb-4.png)

```python
documents = dataset_df.to_dict('records')
documents = dataset_df.to_dict("records")
collection.insert_many(documents)

print("Data ingestion into MongoDB completed")
@@ -592,13 +592,13 @@ def rerank_documents(query: str, documents, top_n: int = 3):
original_doc = documents[result.index]
top_documents_after_rerank.append(
{
'company': original_doc['company'],
'combined_attributes': original_doc[
'combined_attributes'
"company": original_doc["company"],
"combined_attributes": original_doc[
"combined_attributes"
],
'reports': original_doc['reports'],
'vector_search_score': original_doc['score'],
'relevance_score': result.relevance_score,
"reports": original_doc["reports"],
"vector_search_score": original_doc["score"],
"relevance_score": result.relevance_score,
}
)

@@ -724,9 +724,9 @@ pd.DataFrame(reranked_documents).head()
def format_documents_for_chat(documents):
return [
{
"company": doc['company'],
"company": doc["company"],
# "reports": doc['reports'],
"combined_attributes": doc['combined_attributes'],
"combined_attributes": doc["combined_attributes"],
}
for doc in documents
]
@@ -825,7 +825,7 @@ class CohereChat:
# Use the connection string from history_params
self.client = pymongo.MongoClient(
self.history_params.get(
'connection_string', 'mongodb://localhost:27017/'
"connection_string", "mongodb://localhost:27017/"
)
)

@@ -838,34 +838,34 @@ class CohereChat:
# Use the history_collection from history_params, or default to "chat_history"
self.history_collection = self.db[
self.history_params.get(
'history_collection', 'chat_history'
"history_collection", "chat_history"
)
]

# Use the session_id from history_params, or default to "default_session"
self.session_id = self.history_params.get(
'session_id', 'default_session'
"session_id", "default_session"
)

def add_to_history(self, message: str, prefix: str = ""):
self.history_collection.insert_one(
{
'session_id': self.session_id,
'message': message,
'prefix': prefix,
"session_id": self.session_id,
"message": message,
"prefix": prefix,
}
)

def get_chat_history(self) -> List[Dict[str, str]]:
history = self.history_collection.find(
{'session_id': self.session_id}
).sort('_id', 1)
{"session_id": self.session_id}
).sort("_id", 1)
return [
{
"role": (
"user" if item['prefix'] == "USER" else "chatbot"
"user" if item["prefix"] == "USER" else "chatbot"
),
"message": item['message'],
"message": item["message"],
}
for item in history
]
@@ -875,11 +875,11 @@
) -> List[Dict]:
rerank_docs = [
{
'company': doc['company'],
'combined_attributes': doc['combined_attributes'],
"company": doc["company"],
"combined_attributes": doc["combined_attributes"],
}
for doc in documents
if doc['combined_attributes'].strip()
if doc["combined_attributes"].strip()
]

if not rerank_docs:
@@ -897,11 +897,11 @@

top_documents_after_rerank = [
{
'company': rerank_docs[result.index]['company'],
'combined_attributes': rerank_docs[result.index][
'combined_attributes'
"company": rerank_docs[result.index]["company"],
"combined_attributes": rerank_docs[result.index][
"combined_attributes"
],
'relevance_score': result.relevance_score,
"relevance_score": result.relevance_score,
}
for result in response.results
]
@@ -925,8 +925,8 @@
) -> List[Dict]:
return [
{
"company": doc['company'],
"combined_attributes": doc['combined_attributes'],
"company": doc["company"],
"combined_attributes": doc["combined_attributes"],
}
for doc in documents
]
@@ -972,8 +972,8 @@ class CohereChat:

def show_history(self):
history = self.history_collection.find(
{'session_id': self.session_id}
).sort('_id', 1)
{"session_id": self.session_id}
).sort("_id", 1)
for item in history:
print(f"{item['prefix']}: {item['message']}")
print("-------------------------")
@@ -988,9 +988,9 @@ chat = CohereChat(
database=DB_NAME,
main_collection=COLLECTION_NAME,
history_params={
'connection_string': MONGO_URI,
'history_collection': "chat_history",
'session_id': 2,
"connection_string": MONGO_URI,
"history_collection": "chat_history",
"session_id": 2,
},
)

@@ -115,6 +115,12 @@ To access Cohere's models on SageMaker Jumpstart, follow these steps:

If you have any questions about this process, reach out to [email protected].

## Optimize your Inference Latencies

By default, SageMaker endpoints use a random routing strategy: requests arriving at a model endpoint are forwarded to the machine learning instances behind it at random, which can cause latency issues in generative AI applications. In 2023, the SageMaker platform introduced a `RoutingStrategy` parameter that lets you use the ‘least outstanding requests’ (LOR) approach instead. With LOR, SageMaker monitors the load on the instances behind your endpoint, as well as the models or inference components deployed on each instance, and routes each request to the instance best suited to serve it.

LOR has shown an improvement in latency under various conditions, and you can find more details [here](https://aws.amazon.com/blogs/machine-learning/minimize-real-time-inference-latency-by-using-amazon-sagemaker-routing-strategies/).
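
For illustration only, here is a minimal sketch of how LOR routing might be enabled when creating an endpoint configuration with `boto3`; the endpoint, configuration, and model names below, along with the instance type and count, are hypothetical placeholders rather than values from this guide.

```python
import boto3

sm = boto3.client("sagemaker")

# Hypothetical endpoint configuration -- substitute your own model name,
# instance type, and instance count.
sm.create_endpoint_config(
    EndpointConfigName="cohere-endpoint-config",
    ProductionVariants=[
        {
            "VariantName": "AllTraffic",
            "ModelName": "cohere-model",
            "InstanceType": "ml.g5.xlarge",
            "InitialInstanceCount": 2,
            # Route each request to the least-loaded instance
            # instead of picking one at random (the default).
            "RoutingConfig": {"RoutingStrategy": "LEAST_OUTSTANDING_REQUESTS"},
        }
    ],
)

sm.create_endpoint(
    EndpointName="cohere-endpoint",
    EndpointConfigName="cohere-endpoint-config",
)
```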

## Next Steps

With your selected configuration and Product ARN available, you now have everything you need to integrate with Cohere’s model offerings on SageMaker.
