-
Notifications
You must be signed in to change notification settings - Fork 14
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge branch 'main' into struct-op-tools
Signed-off-by: Meor Amer <[email protected]>
- Loading branch information
Showing
77 changed files
with
9,070 additions
and
817 deletions.
There are no files selected for viewing
File renamed without changes.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
name: snippet-ci | ||
|
||
on: | ||
pull_request: {} | ||
push: | ||
branches: | ||
- main | ||
|
||
jobs: | ||
run: | ||
runs-on: ubuntu-latest | ||
steps: | ||
- name: Checkout repository | ||
uses: actions/checkout@v4 | ||
|
||
- name: Setup pnpm | ||
uses: pnpm/action-setup@v2 | ||
with: | ||
version: 8 | ||
|
||
- name: Install Dependencies | ||
shell: bash | ||
run: pnpm install | ||
|
||
- name: Set up python | ||
uses: actions/setup-python@v2 | ||
with: | ||
python-version: '3.x' | ||
|
||
- name: poetry install | ||
run: | | ||
python -m pip install --upgrade pip | ||
python -m pip install poetry | ||
poetry install | ||
- name: Run snippet tests | ||
continue-on-error: true | ||
env: | ||
CO_API_KEY: ${{ secrets.COHERE_TOKEN }} | ||
run: pnpm run --filter snippet-tester test |
Large diffs are not rendered by default.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -115,6 +115,12 @@ To access Cohere's models on SageMaker Jumpstart, follow these steps: | |
|
||
If you have any questions about this process, reach out to [email protected]. | ||
|
||
## Optimize your Inference Latencies | ||
|
||
By default, SageMaker endpoints have a random routing strategy. This means that requests coming to the model endpoints are forwarded to the machine learning instances randomly, which can cause latency issues in applications focused on generative AI. In 2023, the SageMaker platform introduced a `RoutingStrategy` parameter allowing you to use the ‘least outstanding requests’ (LOR) approach to routing. With LOR, SageMaker monitors the load of the instances behind your endpoint as well as the models or inference components that are deployed on each instance, then optimally routes requests to the instance that is best suited to serve it. | ||
|
||
LOR has shown an improvement in latency under various conditions, and you can find more details [here](https://aws.amazon.com/blogs/machine-learning/minimize-real-time-inference-latency-by-using-amazon-sagemaker-routing-strategies/). | ||
|
||
## Next Steps | ||
|
||
With your selected configuration and Product ARN available, you now have everything you need to integrate with Cohere’s model offerings on SageMaker. | ||
|
Oops, something went wrong.