Merge branch 'master' into feature/modules

wandb · Dec 4, 2024 · 40f341c
2 parents b96107a + c340fc8
commit 40f341c
Showing 68 changed files with 802 additions and 530 deletions.
diff --git a/docs/docs/guides/tools/comparison.md b/docs/docs/guides/tools/comparison.md
@@ -0,0 +1,125 @@
+# Comparison
+
+The Weave Comparison feature allows you to visually compare and diff code, traces, prompts, models, and model configurations. You can compare two objects side-by-side or analyze a larger set of objects to identify differences, patterns, and trends.
+
+This guide covers the steps to start a comparison and the available actions to tailor your comparison view, including baseline comparisons, numeric diff formatting, and more. 
+
+## Access the Comparison view
+
+1. In the sidebar, select the type of object you'd like to compare (e.g. **Traces**, **Models**, etc.).
+2. Select the objects that you want to compare. The selection method varies depending on the type of object you are comparing:
+   - For **Traces**, select traces to compare by checking the checkboxes in the appropriate rows in the Traces column.
+   - For objects such as **Models**, navigate to the model Versions page and check the checkboxes next to the versions that you want to compare.
+3. Click **Compare** to open the Comparison view. Now, you can refine your view using the [available actions](#available-actions).
+
+## Available actions
+
+In the Comparison view, you have multiple actions available, depending on how many objects are being compared. Make sure to look at the [usage notes](#usage-notes).
+
+- [Change the diff display](#change-the-diff-display)
+- [Display side-by-side](#display-side-by-side)
+- [Display in a unified view](#display-in-a-unified-view)
+- [Set a baseline](#set-a-baseline)
+- [Remove a baseline](#remove-a-baseline)
+- [Change the comparison order](#change-the-comparison-order)
+- [Change numeric diff display format](#change-numeric-diff-display-format)
+- [Compare with baseline or previous](#compare-with-baseline-or-previous)
+- [Compare a pair from a multi-object comparison](#compare-a-pair-from-a-multi-object-comparison)
+- [Remove an object from comparison](#remove-an-object-from-comparison)
+
+### Change the diff display
+
+By default, **Diff only** is set to off. To filter the table rows so that only changed rows are displayed, toggle **Diff only** on. 
+
+### Display side-by-side 
+
+> This option is only available when comparing two objects, or a [pair from a multi-object comparison](#compare-a-pair-from-a-multi-object-comparison).
+
+To compare each object side-by-side in separate columns, select **Side-by-side**. 
+
+![Side-by-side Comparison view of two objects](imgs/comparison-2objs-sidebyside.png)
+
+### Display in a unified view
+
+> This option is only available when comparing two objects, or a [pair from a multi-object comparison](#compare-a-pair-from-a-multi-object-comparison).
+
+To compare each object in a unified view, select **Unified**. 
+
+![Unified Comparison view of two objects](imgs/comparison-2objs-unified.png)
+
+### Set a baseline
+
+By default, each object in the Comparison view is compared to the object to its left. However, you can set an object as the _baseline_. The baseline object moves to the leftmost position in the view, and every other object is compared to it.
+To set an object as baseline, do the following:
+
+1. In the Comparison view topbar, mouse over the object that you want to set as the baseline.
+2. Click the three dots to the right of the ID.
+   ![Make baseline option displayed.](imgs/comparison-2objs-baseline.png)
+3. In the dropdown, select **Make baseline**. The UI refreshes so that the baseline object is furthest left in the topbar, and `Baseline` displays next to the ID.
+    ![Baseline set.](imgs/comparison-2objs-baseline-set.png)
+
+### Remove a baseline
+
+To remove an object as baseline, do the following:
+
+1. In the Comparison view topbar, mouse over the baseline object.
+2. Click the three dots to the right of the ID.
+3. In the dropdown, select **Remove baseline**. Now, `Baseline` no longer displays next to the ID.
+
+### Change the comparison order
+
+To change the comparison order, do the following:
+
+1. In the Comparison view topbar, mouse over the ID that you want to reorder. 
+2. Click the six dots to the left of the ID.
+   ![Setting the order.](imgs/comparison-2objs-reorder.png)
+3. Drag the ID to the left or the right, toward the position you want it to occupy.
+4. Drop the ID in the desired position. The UI refreshes with the updated comparison order.
+
+### Change numeric diff display format 
+
+For numeric values such as `completion_tokens` and `total_tokens`, you can view the diff as either an integer or a percentage. Additionally, positive numeric values can be viewed as a multiplier. To change a numeric diff's display format, do the following:
+
+1. In the Comparison table, find the numeric value that you want to update the diff display format for.
+    ![A numeric value displayed as an integer.](imgs/comparison-2objs-numericdiffformat.png)
+2. Click the diff value. The format automatically updates to either an integer or a percentage.
+    ![A numeric value updated to a percentage.](imgs/comparison-2objs-numericdiffformat-updated.png)
+
+### Compare with baseline or previous
+
+> This option is only available when comparing 3 or more objects.
+> You can also [set](#set-a-baseline) or [remove](#remove-a-baseline) a baseline by clicking the three dots to the right of the ID.
+
+To perform a baseline comparison with 3 or more objects, do the following:
+
+1. In the right-hand corner of the Comparison view, click the dropdown. Depending on your current view configuration, the dropdown is either titled **Compare with previous** or **Compare with baseline**.
+2. Depending on your current view configuration, select either **Compare with previous** or **Compare with baseline**.
+   - **Compare with baseline**: Sets the leftmost object as the baseline. The table updates so that the leftmost column is the baseline.
+   - **Compare with previous**: No object is set as baseline. Each object is compared to the object to its left.
+
+### Compare a pair from a multi-object comparison
+
+> This option is only available when comparing 3 or more objects.
+
+When comparing 3 or more objects, you can compare a single object to a previous object or baseline. This changes the Comparison table view so that the view is identical to a two-object comparison. To compare a pair of objects from a multi-object comparison, do the following:
+
+1. In the Comparison view topbar, find the ID that you want to compare to previous or baseline. 
+2. To select the item, click the ID. The UI refreshes with a two-way comparison table.
+    ![Comparing a pair from a multi-object comparison.](imgs/comparsion-7objs-diffonly-subset.png)
+
+To reset the view so that the first 6 objects selected for comparison are displayed in the table, click the ID again.
+
+### Remove an object from comparison
+
+> This option is only available when comparing 3 or more objects.
+
+To remove an object from comparison, do the following:
+
+1. In the Comparison view topbar, find the object that you want to remove from comparison.
+2. Click the three dots to the right of the ID.
+3. In the dropdown, select **Remove object from comparison**. The UI refreshes with an updated table that no longer includes the removed object.
+
+## Usage notes
+
+ - The Comparison feature is only available in the UI.
+ - You can compare as many objects as you'd like. However, the UI only displays a maximum of 6. To view an object in the comparison table that is not visible when comparing more than 6 objects, either [change the comparison order](#change-the-comparison-order) so that the object is one of the first 6 objects from left to right, or [compare a pair from a multi-object comparison](#compare-a-pair-from-a-multi-object-comparison) for easy viewing.
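
Note on the comparison.md guide above: the two comparison modes (previous vs. baseline) and the three numeric diff formats (integer, percentage, multiplier) can be illustrated with a small sketch. This is not Weave's implementation, and the Comparison feature itself is UI-only; the function names, the `style` strings, and the rounding below are assumptions made purely for illustration.

```python
from typing import List, Tuple


def comparison_pairs(ids: List[str], use_baseline: bool) -> List[Tuple[str, str]]:
    """Return (reference, compared) pairs for a left-to-right list of object IDs."""
    if len(ids) < 2:
        return []
    if use_baseline:
        # "Compare with baseline": every object is compared to the leftmost one.
        return [(ids[0], other) for other in ids[1:]]
    # "Compare with previous": each object is compared to its left neighbor.
    return list(zip(ids, ids[1:]))


def format_numeric_diff(reference: float, value: float, style: str) -> str:
    """Format the change from `reference` to `value` (hypothetical style names)."""
    delta = value - reference
    if style == "integer":
        return f"{delta:+g}"
    if style == "percentage":
        if reference == 0:
            return "n/a"  # a percentage change from zero is undefined
        return f"{delta / abs(reference) * 100:+.1f}%"
    if style == "multiplier":
        # The guide only mentions the multiplier for positive values.
        if reference <= 0 or value <= 0:
            return "n/a"
        return f"{value / reference:.2f}x"
    raise ValueError(f"unknown style: {style}")
```

For example, with `total_tokens` going from 100 to 150, the three styles would read as an absolute change of 50, a change of 50 percent, or a multiplier of 1.5.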
diff --git a/docs/docs/guides/tools/imgs/comparison-2objs-baseline-set.png b/docs/docs/guides/tools/imgs/comparison-2objs-baseline-set.png
diff --git a/docs/docs/guides/tools/imgs/comparison-2objs-baseline.png b/docs/docs/guides/tools/imgs/comparison-2objs-baseline.png
diff --git a/docs/docs/guides/tools/imgs/comparison-2objs-numericdiffformat-updated.png b/docs/docs/guides/tools/imgs/comparison-2objs-numericdiffformat-updated.png
diff --git a/docs/docs/guides/tools/imgs/comparison-2objs-numericdiffformat.png b/docs/docs/guides/tools/imgs/comparison-2objs-numericdiffformat.png
diff --git a/docs/docs/guides/tools/imgs/comparison-2objs-reorder.png b/docs/docs/guides/tools/imgs/comparison-2objs-reorder.png
diff --git a/docs/docs/guides/tools/imgs/comparison-2objs-sidebyside.png b/docs/docs/guides/tools/imgs/comparison-2objs-sidebyside.png
diff --git a/docs/docs/guides/tools/imgs/comparison-2objs-unified.png b/docs/docs/guides/tools/imgs/comparison-2objs-unified.png
diff --git a/docs/docs/guides/tools/imgs/comparsion-7objs-diffonly-allobjs.png b/docs/docs/guides/tools/imgs/comparsion-7objs-diffonly-allobjs.png
diff --git a/docs/docs/guides/tools/imgs/comparsion-7objs-diffonly-remove.png b/docs/docs/guides/tools/imgs/comparsion-7objs-diffonly-remove.png
diff --git a/docs/docs/guides/tools/imgs/comparsion-7objs-diffonly-subset.png b/docs/docs/guides/tools/imgs/comparsion-7objs-diffonly-subset.png
diff --git a/docs/docs/guides/tools/playground.md b/docs/docs/guides/tools/playground.md
@@ -16,7 +16,7 @@ Evaluating LLM prompts and responses is challenging. The Weave Playground is des
 Get started with the Playground to optimize your LLM interactions and streamline your prompt engineering process and LLM application development.
 
 - [Prerequisites](#prerequisites)
-   - [Add a provider API key](#add-a-provider-api-key)
+   - [Add provider credentials and information](#add-provider-credentials-and-information)
    - [Access the Playground](#access-the-playground)
 - [Select an LLM](#select-an-llm)
 - [Adjust LLM parameters](#adjust-llm-parameters)
@@ -27,17 +27,21 @@ Get started with the Playground to optimize your LLM interactions and streamline
 
 ## Prerequisites
 
-Before you can use Playground, you must [add an API key](#add-a-provider-api-key) for your preferred LLM provider(s), and [open the Playground UI](#access-the-playground). 
+Before you can use Playground, you must [add provider credentials](#add-provider-credentials-and-information), and [open the Playground UI](#access-the-playground). 
 
-### Add a provider API key 
+### Add provider credentials and information 
 
-Playground currently supports OpenAI, Anthropic, Gemini, and Groq models.
-To use one of the available LLMs, your W&B admin must add the appropriate API key to your team secrets in W&B settings.
+Playground currently supports OpenAI, Anthropic, Gemini, Groq, and Amazon Bedrock models.
+To use one of the available models, add the appropriate information to your team secrets in W&B settings.
 
 - OpenAI: `OPENAI_API_KEY`
 - Anthropic: `ANTHROPIC_API_KEY`
 - Gemini: `GOOGLE_API_KEY`
 - Groq: `GEMMA_API_KEY`
+- Amazon Bedrock:
+   - `AWS_ACCESS_KEY_ID`
+   - `AWS_SECRET_ACCESS_KEY`
+   - `AWS_REGION_NAME`
 
 ### Access the Playground
 
@@ -105,6 +109,32 @@ You can switch the LLM using the dropdown menu in the top left. Currently, the a
 - o1-mini
 - o1-preview-2024-09-12
 - o1-preview
+- ai21.j2-mid-v1
+- ai21.j2-ultra-v1
+- amazon.titan-text-lite-v1
+- amazon.titan-text-express-v1
+- mistral.mistral-7b-instruct-v0:2
+- mistral.mixtral-8x7b-instruct-v0:1
+- mistral.mistral-large-2402-v1:0
+- mistral.mistral-large-2407-v1:0
+- anthropic.claude-3-sonnet-20240229-v1:0
+- anthropic.claude-3-5-sonnet-20240620-v1:0
+- anthropic.claude-3-haiku-20240307-v1:0
+- anthropic.claude-3-opus-20240229-v1:0
+- anthropic.claude-v2
+- anthropic.claude-v2:1
+- anthropic.claude-instant-v1
+- cohere.command-text-v14
+- cohere.command-light-text-v14
+- cohere.command-r-plus-v1:0
+- cohere.command-r-v1:0
+- meta.llama2-13b-chat-v1
+- meta.llama2-70b-chat-v1
+- meta.llama3-8b-instruct-v1:0
+- meta.llama3-70b-instruct-v1:0
+- meta.llama3-1-8b-instruct-v1:0
+- meta.llama3-1-70b-instruct-v1:0
+- meta.llama3-1-405b-instruct-v1:0
 
 ## Adjust LLM parameters
 

diff --git a/docs/docs/guides/tracking/tracing.mdx b/docs/docs/guides/tracking/tracing.mdx
@@ -361,16 +361,16 @@ You can also manually create Calls using the API directly.
         import weave
 
         # Initialize Weave Tracing
-        weave.init('intro-example')
+        client = weave.init('intro-example')
 
         def my_function(name: str):
             # Start a call
-            call = weave.create_call(op="my_function", inputs={"name": name})
+            call = client.create_call(op="my_function", inputs={"name": name})
 
             # ... your function code ...
 
             # End a call
-            weave.finish_call(call, output="Hello, World!")
+            client.finish_call(call, output="Hello, World!")
 
         # Call your function
         print(my_function("World"))
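
Note on the tracing.mdx hunk above: the change moves `create_call` and `finish_call` from the `weave` module onto the client returned by `weave.init()`. The sketch below shows the same start/finish pattern against a hypothetical in-memory stand-in (`RecordingClient` is not part of Weave) so that it runs without a W&B account. It also returns the greeting so that a caller has a value to print. The keyword arguments mirror the ones used in the hunk; anything beyond that is an assumption.

```python
from typing import Any, Dict, List, Tuple


class RecordingClient:
    """Hypothetical stand-in for the client object returned by weave.init()."""

    def __init__(self) -> None:
        self.finished: List[Tuple[Dict[str, Any], Any]] = []

    def create_call(self, op: str, inputs: Dict[str, Any]) -> Dict[str, Any]:
        # A real client would register the call with the trace server.
        return {"op": op, "inputs": inputs}

    def finish_call(self, call: Dict[str, Any], output: Any = None) -> None:
        # A real client would record the output and the end time.
        self.finished.append((call, output))


def my_function(client: RecordingClient, name: str) -> str:
    # Start a call on the client, not on the weave module.
    call = client.create_call(op="my_function", inputs={"name": name})

    greeting = f"Hello, {name}!"

    # End the call with the value that is also returned to the caller.
    client.finish_call(call, output=greeting)
    return greeting
```

With a real project, `client` would instead be the return value of `weave.init('intro-example')`, as in the updated snippet.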

diff --git a/docs/sidebars.ts b/docs/sidebars.ts
@@ -67,6 +67,7 @@ const sidebars: SidebarsConfig = {
         "guides/core-types/datasets",
         "guides/tracking/feedback",
         "guides/tracking/costs",
+        "guides/tools/comparison",
         "guides/tools/playground",
         "guides/core-types/media",
         {

diff --git a/pyproject.toml b/pyproject.toml
@@ -53,7 +53,7 @@ cerebras = ["cerebras-cloud-sdk"]
 cohere = ["cohere>=5.9.1,<5.9.3"]
 dspy = ["dspy>=0.1.5", "litellm<=1.49.1"]
 google_ai_studio = ["google-generativeai>=0.8.3"]
-groq = ["groq>=0.9.0"]
+groq = ["groq>=0.13.0"]
 instructor = [
   "instructor>=1.4.3,<1.7.0; python_version <= '3.9'",
   "instructor>=1.4.3; python_version > '3.9'",
@@ -226,7 +226,7 @@ module = "weave_query.*"
 ignore_errors = true
 
 [tool.bumpversion]
-current_version = "0.51.23-dev0"
+current_version = "0.51.24-dev0"
 parse = """(?x)
     (?P<major>0|[1-9]\\d*)\\.
     (?P<minor>0|[1-9]\\d*)\\.

diff --git a/sdks/node/package-lock.json b/sdks/node/package-lock.json
diff --git a/sdks/node/package.json b/sdks/node/package.json
@@ -1,11 +1,19 @@
 {
   "name": "weave",
-  "version": "0.7.0",
+  "version": "0.7.3",
   "description": "AI development toolkit",
-  "types": "dist/src/index.d.ts",
-  "main": "dist/src/index.js",
+  "types": "dist/index.d.ts",
+  "main": "dist/index.js",
   "type": "commonjs",
+  "exports": {
+    ".": {
+      "types": "./dist/index.d.ts",
+      "default": "./dist/index.js"
+    }
+  },
   "scripts": {
+    "build": "tsc --outDir dist",
+    "prepare": "npm run build",
     "test": "jest --silent",
     "test:coverage": "jest --coverage",
     "test:watch": "jest --watch",
@@ -14,6 +22,9 @@
     "generate-api": "swagger-typescript-api -p ./weave.openapi.json -o ./src/generated -n traceServerApi.ts",
     "dev": "nodemon"
   },
+  "files": [
+    "dist"
+  ],
   "repository": {
     "type": "git",
     "url": "https://github.com/wandb/weave/js"

diff --git a/sdks/node/tsconfig.json b/sdks/node/tsconfig.json
@@ -7,19 +7,28 @@
     "sourceMap": true,
     "strict": true,
     "esModuleInterop": true,
-    "outDir": "./dist",
+    "outDir": "dist",
     "paths": {
       "weave": ["./src/index.ts"]
-    }
+    },
+    "declaration": true,
+    "declarationMap": true,
+    "rootDir": "src",
+    "tsBuildInfoFile": "dist/.tsbuildinfo"
   },
   "include": ["src/**/*"],
-  "exclude": ["src", "examples", "dist", "node_modules"],
-  "references": [
-    {
-      "path": "./src/tsconfig.src.json"
-    },
-    {
-      "path": "./examples/tsconfig.examples.json"
-    }
+  "exclude": [
+    "examples",
+    "dist",
+    "node_modules",
+    "src/integrations/checkOpenai.ts"
   ]
+  // "references": [
+  //   {
+  //     "path": "./src/tsconfig.src.json"
+  //   },
+  //   {
+  //     "path": "./examples/tsconfig.examples.json"
+  //   }
+  // ]
 }
diff --git a/tests/integrations/vertexai/vertexai_test.py b/tests/integrations/vertexai/vertexai_test.py
@@ -123,3 +123,62 @@ async def get_response():
     output = call.output
     assert "paris" in output["candidates"][0]["content"]["parts"][0]["text"].lower()
     assert output["candidates"][0]["content"]["role"] == "model"
+
+
+@pytest.mark.skip(
+    reason="This test depends on a non-deterministic external service provider"
+)
+@pytest.mark.flaky(reruns=5, reruns_delay=2)
+@pytest.mark.skip_clickhouse_client
+def test_chat_session(client):
+    import vertexai
+    from vertexai.generative_models import GenerativeModel
+
+    vertexai.init(project="wandb-growth", location="us-central1")
+    model = GenerativeModel("gemini-1.5-flash")
+    chat = model.start_chat()
+    chat.send_message("What is the capital of France?")
+
+    calls = list(client.calls())
+    assert len(calls) == 1
+
+    call = calls[0]
+    assert call.started_at < call.ended_at
+
+    trace_name = op_name_from_ref(call.op_name)
+    assert trace_name == "vertexai.GenerativeModel.generate_content"
+    output = call.output
+    assert "paris" in output["candidates"][0]["content"]["parts"][0]["text"].lower()
+    assert output["candidates"][0]["content"]["role"] == "model"
+    assert output["candidates"][0]["finish_reason"] == "STOP"
+    assert "gemini-1.5-flash" in output["model_version"]
+
+
+@pytest.mark.skip(
+    reason="This test depends on a non-deterministic external service provider"
+)
+@pytest.mark.flaky(reruns=5, reruns_delay=2)
+@pytest.mark.asyncio
+@pytest.mark.skip_clickhouse_client
+async def test_chat_session_async(client):
+    import vertexai
+    from vertexai.generative_models import GenerativeModel
+
+    vertexai.init(project="wandb-growth", location="us-central1")
+    model = GenerativeModel("gemini-1.5-flash")
+    chat = model.start_chat()
+    await chat.send_message_async("What is the capital of France?")
+
+    calls = list(client.calls())
+    assert len(calls) == 1
+
+    call = calls[0]
+    assert call.started_at < call.ended_at
+
+    trace_name = op_name_from_ref(call.op_name)
+    assert trace_name == "vertexai.GenerativeModel.generate_content"
+    output = call.output
+    assert "paris" in output["candidates"][0]["content"]["parts"][0]["text"].lower()
+    assert output["candidates"][0]["content"]["role"] == "model"
+    assert output["candidates"][0]["finish_reason"] == "STOP"
+    assert "gemini-1.5-flash" in output["model_version"]
diff --git a/tests/trace/test_exec.py b/tests/trace/test_exec.py
@@ -101,7 +101,7 @@ def test_publish_works_for_code_with_no_source_file(
 
     ref = captured["ref"]
     op = ref.get()
-    actual_captured_code = op.art.path_contents["obj.py"].decode()
+    actual_captured_code = op.get_captured_code()
     expected_captured_code = expected_captured_code[1:]  # ignore first newline
 
     assert actual_captured_code == expected_captured_code