Merge branch 'master' into andrew/ts_common
andrewtruong committed Oct 10, 2024
2 parents e15272b + 5579795 commit 9115bde
Showing 41 changed files with 4,558 additions and 78 deletions.
9 changes: 6 additions & 3 deletions .github/workflows/test.yaml
@@ -181,7 +181,7 @@ jobs:
- name: Checkout
uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
uses: actions/setup-python@v5
with:
python-version: 3.9
- name: Install dependencies
@@ -203,6 +203,7 @@ jobs:
'10',
'11',
'12',
'13',
#
]
nox-shard:
@@ -220,6 +221,7 @@ jobs:
'llamaindex',
'mistral0',
'mistral1',
'notdiamond',
'openai',
]
fail-fast: false
@@ -246,7 +248,7 @@ jobs:
- name: Checkout
uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version-major }}.${{ matrix.python-version-minor }}
uses: actions/setup-python@v4
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version-major }}.${{ matrix.python-version-minor }}
- name: Install dependencies
@@ -272,7 +274,8 @@ jobs:
WF_CLICKHOUSE_HOST: weave_clickhouse
WEAVE_SERVER_DISABLE_ECOSYSTEM: 1
run: |
nox -e "tests-${{ matrix.python-version-major }}.${{ matrix.python-version-minor }}(shard='${{ matrix.nox-shard }}')" -- -n4
nox -e "tests-${{ matrix.python-version-major }}.${{ matrix.python-version-minor }}(shard='${{ matrix.nox-shard }}')" -- \
-n4
trace-tests-matrix-check: # This job does nothing and is only used for the branch protection
if: always()
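
The matrix in this workflow expands into one nox invocation per Python version and shard. The following is a minimal sketch of that expansion, assuming the session naming seen in the `run` step (`tests-<major>.<minor>(shard='<name>')`). The helper name `nox_commands`, the major version `"3"`, and the shortened lists are illustrative assumptions, not part of the workflow.

```python
from itertools import product


def nox_commands(major, minors, shards, workers=4):
    """Build one nox command line per (Python minor version, shard) pair."""
    commands = []
    for minor, shard in product(minors, shards):
        # Mirrors the session name used in the workflow's run step.
        session = f"tests-{major}.{minor}(shard='{shard}')"
        commands.append(f'nox -e "{session}" -- -n{workers}')
    return commands


# Hypothetical, shortened matrix: two minor versions and two shards.
commands = nox_commands("3", ["12", "13"], ["notdiamond", "openai"])
for command in commands:
    print(command)
```

Adding one entry to either list, as this commit does with `'13'` and `'notdiamond'`, adds a full row or column of jobs to the matrix.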

26 changes: 26 additions & 0 deletions docs/docs/guides/core-types/evaluations.md
@@ -106,6 +106,32 @@ asyncio.run(evaluation.evaluate(model))

This will run `predict` on each example and score the output with each scoring function.

#### Custom Naming

You can change the name of the Evaluation itself by passing a `name` parameter to the `Evaluation` class.

```python
evaluation = Evaluation(
    dataset=examples, scorers=[match_score1], name="My Evaluation"
)
```

You can also change the name of an individual evaluation run by setting the `display_name` key of the `__weave` dictionary when calling `evaluate`.

:::note

Using the `__weave` dictionary sets the call display name, which is distinct from the Evaluation object name. In the
UI, you will see the display name if it is set; otherwise, the Evaluation object name is used.

:::

```python
evaluation = Evaluation(
    dataset=examples, scorers=[match_score1]
)
# evaluate() is a coroutine, so run it in an event loop as in the earlier example
asyncio.run(evaluation.evaluate(model, __weave={"display_name": "My Evaluation Run"}))
```
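
The fallback described in the note can be summarized as a small lookup. This is only a sketch of the rule as stated in the note, not Weave's implementation; the function name `ui_name` and its parameters are hypothetical.

```python
from typing import Optional


def ui_name(evaluation_name: str, call_display_name: Optional[str] = None) -> str:
    """Return the name shown for a run: the call display name if set,
    otherwise the Evaluation object name."""
    if call_display_name:
        return call_display_name
    return evaluation_name


print(ui_name("My Evaluation"))
print(ui_name("My Evaluation", "My Evaluation Run"))
```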

### Define a function to evaluate

Alternatively, you can also evaluate a function that is wrapped in a `@weave.op()`.
114 changes: 114 additions & 0 deletions docs/docs/guides/integrations/notdiamond.md
@@ -0,0 +1,114 @@
# Not Diamond ¬◇

When building complex LLM workflows, users may need to prompt different models according to accuracy,
cost, or call latency. Users can use [Not Diamond][nd] to route prompts in these workflows to the
right model for their needs, helping maximize accuracy while saving on model costs.

## Getting started

Make sure you have [created an account][account] and [generated an API key][keys], then add your API
key to your env as `NOTDIAMOND_API_KEY`.

![Create an API key](imgs/notdiamond/api-keys.png)

From here, you can

- try the [quickstart guide],
- [build a custom router][custom router] with W&B Weave and Not Diamond, or
- [chat with Not Diamond][chat] to see routing in action.

## Tracing

Weave integrates with [Not Diamond's Python library][python] to [automatically log API calls][ops].
You only need to run `weave.init()` at the start of your workflow, then continue using the routed
provider as usual:

```python
from notdiamond import NotDiamond

import weave
weave.init('notdiamond-quickstart')

client = NotDiamond()
session_id, provider = client.chat.completions.model_select(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Concisely explain merge sort."}
    ],
    model=['openai/gpt-4o', 'anthropic/claude-3-5-sonnet-20240620']
)

print("LLM called: ", provider.provider)  # openai, anthropic, etc
print("Provider model: ", provider.model)  # gpt-4o, claude-3-5-sonnet-20240620, etc
```
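
Once `model_select` has returned a provider, the prompt still has to be sent to that provider. One possible way to organize this is a dispatch table keyed by provider name, sketched here. The handler functions are stand-ins that only format a string; real handlers would call the respective provider clients. All names in this block (`dispatch`, `HANDLERS`, `call_openai`, `call_anthropic`) are hypothetical.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]
Handler = Callable[[str, List[Message]], str]


def call_openai(model: str, messages: List[Message]) -> str:
    # Stand-in: a real handler would call the OpenAI client here.
    return f"openai:{model} received {len(messages)} messages"


def call_anthropic(model: str, messages: List[Message]) -> str:
    # Stand-in: a real handler would call the Anthropic client here.
    return f"anthropic:{model} received {len(messages)} messages"


HANDLERS: Dict[str, Handler] = {
    "openai": call_openai,
    "anthropic": call_anthropic,
}


def dispatch(provider_name: str, model: str, messages: List[Message]) -> str:
    """Forward the messages to the handler registered for the routed provider."""
    handler = HANDLERS.get(provider_name)
    if handler is None:
        raise ValueError(f"No handler registered for provider {provider_name!r}")
    return handler(model, messages)


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Concisely explain merge sort."},
]
# In the example above these values would come from provider.provider and provider.model.
reply = dispatch("openai", "gpt-4o", messages)
print(reply)
```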

## Custom routing

You can also train your own [custom router] on [Evaluations][evals], allowing Not Diamond to route prompts
according to eval performance for specialized use cases.

Start by training a custom router:

```python
import os

import weave
from weave.flow.eval import EvaluationResults
from weave.integrations.notdiamond.custom_router import train_router

# Read the Not Diamond API key configured in "Getting started"
api_key = os.environ["NOTDIAMOND_API_KEY"]

# Build an Evaluation on gpt-4o and Claude 3.5 Sonnet
evaluation = weave.Evaluation(...)
gpt_4o = weave.Model(...)
sonnet = weave.Model(...)

model_evals = {
    'openai/gpt-4o': evaluation.get_eval_results(gpt_4o),
    'anthropic/claude-3-5-sonnet-20240620': evaluation.get_eval_results(sonnet),
}
preference_id = train_router(
    model_evals=model_evals,
    prompt_column="prompt",
    response_column="actual",
    language="en",
    maximize=True,
    api_key=api_key,
)
```
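
To make the idea of routing "according to eval performance" concrete, here is a toy sketch that picks the best-scoring model for each prompt from per-model evaluation rows. It is not how Not Diamond trains a router; it only illustrates the kind of signal such data carries. The `score` column, the row layout, and the sample values are assumptions made up for this example.

```python
from typing import Dict, List


def best_model_per_prompt(
    model_evals: Dict[str, List[dict]],
    prompt_column: str = "prompt",
    score_column: str = "score",
    maximize: bool = True,
) -> Dict[str, str]:
    """Map each prompt to the model with the best score on that prompt."""
    best: Dict[str, tuple] = {}
    for model, rows in model_evals.items():
        for row in rows:
            prompt, score = row[prompt_column], row[score_column]
            current = best.get(prompt)
            if current is None:
                is_better = True
            elif maximize:
                is_better = score > current[1]
            else:
                is_better = score < current[1]
            if is_better:
                best[prompt] = (model, score)
    return {prompt: model for prompt, (model, _) in best.items()}


# Made-up scores for illustration only.
toy_evals = {
    "openai/gpt-4o": [
        {"prompt": "Explain merge sort.", "actual": "...", "score": 0.9},
        {"prompt": "Write a haiku.", "actual": "...", "score": 0.4},
    ],
    "anthropic/claude-3-5-sonnet-20240620": [
        {"prompt": "Explain merge sort.", "actual": "...", "score": 0.7},
        {"prompt": "Write a haiku.", "actual": "...", "score": 0.8},
    ],
}
routing = best_model_per_prompt(toy_evals)
print(routing)
```

A trained router generalizes beyond such a lookup to prompts it has not seen, which is why the evaluation data is sent to `train_router` rather than used directly.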

By reusing this preference ID in any `model_select` request, you can route your prompts
to maximize performance and minimize cost on your evaluation data:

```python
from notdiamond import NotDiamond
client = NotDiamond()

import weave
weave.init('notdiamond-quickstart')

session_id, provider = client.chat.completions.model_select(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Concisely explain merge sort."}
    ],
    model=['openai/gpt-4o', 'anthropic/claude-3-5-sonnet-20240620'],

    # passing this preference ID reuses your custom router
    preference_id=preference_id
)

print("LLM called: ", provider.provider)  # openai, anthropic, etc
print("Provider model: ", provider.model)  # gpt-4o, claude-3-5-sonnet-20240620, etc
```

## Additional support

Visit the [docs] or [send us a message][support] for further support.

[account]: https://app.notdiamond.ai
[chat]: https://chat.notdiamond.ai
[custom router]: https://docs.notdiamond.ai/docs/router-training-quickstart
[docs]: https://docs.notdiamond.ai
[evals]: ../../guides/core-types/evaluations.md
[keys]: https://app.notdiamond.ai/keys
[nd]: https://www.notdiamond.ai/
[ops]: ../../guides/tracking/ops.md
[python]: https://github.com/Not-Diamond/notdiamond-python
[quickstart guide]: https://docs.notdiamond.ai/docs/quickstart
[support]: mailto:[email protected]
14 changes: 13 additions & 1 deletion docs/docs/guides/tracking/tracing.mdx
@@ -104,7 +104,19 @@ instance.my_method.call(instance, "World")

#### Call Display Name

Sometimes you may want to override the display name of a call. You can achieve this in one of three ways:
Sometimes you may want to override the display name of a call. You can achieve this in one of four ways:

0. Change the display name at the time of calling the op:

```python showLineNumbers
result = my_function("World", __weave={"display_name": "My Custom Display Name"})
```

:::note

Using the `__weave` dictionary sets the call display name, which takes precedence over the Op display name.

:::

1. Change the display name on a per-call basis. This uses the [`Op.call`](../../reference/python-sdk/weave/trace/weave.trace.op.md#function-call) method to return a `Call` object, which you can then use to set the display name using [`Call.set_display_name`](../../reference/python-sdk/weave/trace/weave.trace.weave_client.md#method-set_display_name).
```python showLineNumbers