
feat: Studio integration of Trace Submission PHS-616 #975

Merged 16 commits on Aug 6, 2024
4 changes: 1 addition & 3 deletions .github/workflows/daily.yml
@@ -3,9 +3,7 @@ name: "os-support-tests"
on:
workflow_dispatch:
# Scheduled workflows will only run on the default branch.
schedule:
- cron: '0 0 * * *' # runs once a day at midnight in the timezone of your GitHub repository


defaults:
run:
shell: bash
44 changes: 44 additions & 0 deletions .github/workflows/sdk-tests.yml
@@ -129,6 +129,27 @@ jobs:
password: ${{ secrets.GH_PAT }}
ports:
- "3000:3000"
postgres:
image: postgres:15
ports:
- "5433:5432"
env:
POSTGRES_DB: "il_sdk"
POSTGRES_USER: "il_sdk"
POSTGRES_PASSWORD: "test"
studio-backend:
image: registry.gitlab.aleph-alpha.de/product/studio/backend:latest
ports:
- "8000:8000"
env:
DATABASE_URL: "postgres:5432"
POSTGRES_DB: "il_sdk"
POSTGRES_USER: "il_sdk"
POSTGRES_PASSWORD: "test"
AUTHORIZATION_SERVICE_URL: "none"
credentials:
username: "unused"
password: ${{ secrets.GL_STUDIO_CONTAINER_TOKEN }}
steps:
- name: Checkout repository
uses: actions/checkout@v4
@@ -164,6 +185,7 @@ jobs:
ARGILLA_API_URL: "http://localhost:6900/"
ARGILLA_API_KEY: "argilla.apikey"
CLIENT_URL: "https://api.aleph-alpha.com"
STUDIO_URL: "http://localhost:8000/"
run: |
./scripts/test.sh
run-notebooks:
@@ -186,6 +208,27 @@ jobs:
env:
ARGILLA_ELASTICSEARCH: "http://argilla-elastic-search:9200"
ARGILLA_ENABLE_TELEMETRY: 0
postgres:
image: postgres:15
ports:
- "5433:5432"
env:
POSTGRES_DB: "il_sdk"
POSTGRES_USER: "il_sdk"
POSTGRES_PASSWORD: "test"
studio-backend:
image: registry.gitlab.aleph-alpha.de/product/studio/backend:latest
ports:
- "8000:8000"
env:
DATABASE_URL: "postgres:5432"
POSTGRES_DB: "il_sdk"
POSTGRES_USER: "il_sdk"
POSTGRES_PASSWORD: "test"
AUTHORIZATION_SERVICE_URL: "none"
credentials:
username: "unused"
password: ${{ secrets.GL_STUDIO_CONTAINER_TOKEN }}
steps:
- name: Checkout repository
uses: actions/checkout@v4
@@ -217,5 +260,6 @@ jobs:
ARGILLA_API_URL: "http://localhost:6900/"
ARGILLA_API_KEY: "argilla.apikey"
CLIENT_URL: "https://api.aleph-alpha.com"
STUDIO_URL: "http://localhost:8000"
run: |
./scripts/notebook_runner.sh
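One detail worth noting in the workflow above: `STUDIO_URL` is set with a trailing slash in one job (`http://localhost:8000/`) and without one in the other (`http://localhost:8000`). A client that joins endpoint paths onto this base should tolerate both spellings. The sketch below shows one way to do that; `normalize_base_url` is a hypothetical helper, not part of the SDK.

```python
def normalize_base_url(url: str) -> str:
    """Strip trailing slashes so endpoint paths can be joined uniformly.

    Both STUDIO_URL spellings used in the workflow resolve to the
    same base, e.g. "http://localhost:8000/" -> "http://localhost:8000".
    """
    return url.rstrip("/")
```

With this in place, `f"{normalize_base_url(base)}/api/traces"` produces the same endpoint regardless of which spelling the environment provides.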
4 changes: 2 additions & 2 deletions CHANGELOG.md
@@ -6,13 +6,13 @@
...

### Features
...
- Add `StudioClient` as connector to PhariaStudio for submitting traces.

### Fixes
...

### Deprecations
...
- Deprecate old Trace Viewer as the new `StudioClient` replaces it. This affects `Tracer.submit_to_trace_viewer`.

## 5.0.3

2 changes: 1 addition & 1 deletion README.md
@@ -149,7 +149,7 @@ The how-tos are quick lookups about how to do things. Compared to the tutorials,
| [...define a task](./src/documentation/how_tos/how_to_define_a_task.ipynb) | How to come up with a new task and formulate it |
| [...implement a task](./src/documentation/how_tos/how_to_implement_a_task.ipynb) | Implement a formulated task and make it run with the Intelligence Layer |
| [...debug and log a task](./src/documentation/how_tos/how_to_log_and_debug_a_task.ipynb) | Tools for logging and debugging in tasks |
| [...run the trace viewer](./src/documentation/how_tos/how_to_run_the_trace_viewer.ipynb) | Downloading and running the trace viewer for debugging traces |
| [...use PhariaStudio with traces](./src/documentation/how_tos/how_to_use_pharia_studio_with_traces.ipynb) | Submitting Traces to PhariaStudio for debugging |
| **Analysis Pipeline** | |
| [...implement a simple evaluation and aggregation logic](./src/documentation/how_tos/how_to_implement_a_simple_evaluation_and_aggregation_logic.ipynb) | Basic examples of evaluation and aggregation logic |
| [...create a dataset](./src/documentation/how_tos/how_to_create_a_dataset.ipynb) | Create a dataset used for running a task |
20 changes: 20 additions & 0 deletions docker-compose.yaml
@@ -38,6 +38,26 @@ services:
image: ghcr.io/aleph-alpha/trace-viewer-trace-viewer:main
ports:
- 3000:3000

# export GITLAB_TOKEN=...
# (optional) export GITLAB_TOKEN=$(op item get YOUR_TOKEN --format json --fields password | jq .value | tr -d '"')
# echo $GITLAB_TOKEN | docker login registry.gitlab.aleph-alpha.de -u your_email@for_gitlab --password-stdin
# docker compose pull to update containers
studio-backend:
image: registry.gitlab.aleph-alpha.de/product/studio/backend:latest
ports:
- 8000:8000
depends_on:
postgres:
condition: service_started
restart: true
environment:
DATABASE_URL: postgres:5432
POSTGRES_DB: il_sdk
POSTGRES_USER: il_sdk
POSTGRES_PASSWORD: test

AUTHORIZATION_SERVICE_URL: "none"
postgres:
image: postgres:15
ports:
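The compose file makes `studio-backend` depend on `postgres` with `condition: service_started`, which only waits for the container to start, not for the service to accept requests. Callers such as the notebooks may still need to wait for the backend to become reachable before submitting traces. A minimal, generic retry sketch (the `probe` callable is an assumption; in practice it could be an HTTP GET against `http://localhost:8000` using `urllib.request`):

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], bool], attempts: int = 10, delay: float = 1.0
) -> bool:
    """Call `probe` up to `attempts` times, sleeping `delay` seconds between tries.

    Returns True as soon as the probe succeeds, False if all attempts fail.
    """
    for _ in range(attempts):
        if probe():
            return True
        time.sleep(delay)
    return False
```

Injecting the probe keeps the wait logic independent of any particular health endpoint, which is useful here since the backend's health-check path is not specified in this diff.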
1 change: 1 addition & 0 deletions env.sample
@@ -5,3 +5,4 @@ ARGILLA_API_KEY="argilla.apikey"
HUGGING_FACE_TOKEN=token
# local dev builds run on 5173
TRACE_VIEWER_URL="http://localhost:3000"
STUDIO_URL=http://localhost:8000
17 changes: 10 additions & 7 deletions src/documentation/how_tos/how_to_log_and_debug_a_task.ipynb
@@ -7,10 +7,12 @@
"outputs": [],
"source": [
"import random\n",
"from uuid import uuid4\n",
"\n",
"from aleph_alpha_client import Prompt\n",
"from dotenv import load_dotenv\n",
"\n",
"from intelligence_layer.connectors import StudioClient\n",
"from intelligence_layer.core import (\n",
" CompleteInput,\n",
" InMemoryTracer,\n",
@@ -37,10 +39,7 @@
" - To create custom logging messages in a trace use `task_span.log()`.\n",
" - To map a complex execution flow of a task into a single trace, pass the `task_span` of the `do_run` to other execution methods (e.g. `Task.run()` or `model.complete()`). \n",
" - If the execution method is not provided by the intelligence layer, the tracing of input and output has to happen manually. See the implementation of `Task.run()` for an example.\n",
" - Use the [trace viewer](./how_to_run_the_trace_viewer.ipynb) to view and inspect a trace\n",
" - Use and display an `InMemoryTracer` in a notebook to automatically send the trace data to the trace viewer.\n",
" - Note: This also works for traces of the `Runner` and the `Evaluator`.\n",
" - To create persistent traces, use the `FileTracer` instead. This creates files which can manually be uploaded in the trace viewer UI."
" - Use the [submit trace functionality of the `StudioClient`](./how_to_use_pharia_studio_with_traces.ipynb) to view and inspect a trace in PhariaStudio"
]
},
{
@@ -77,9 +76,13 @@
"\n",
"tracer = InMemoryTracer()\n",
"DummyTask().run(\"\", tracer)\n",
"# ! make sure to run the trace viewer docker container to get the improved display !\n",
"# display an InMemoryTracer in a notebook and send the data to the trace viewer\n",
"display(tracer)\n",
"\n",
"project_name = str(uuid4())\n",
"studio_client = StudioClient(project=project_name)\n",
"my_project = studio_client.create_project(project=project_name)\n",
"\n",
"submitted_trace_id = studio_client.submit_from_tracer(tracer)\n",
"\n",
"\n",
"pass"
]
35 changes: 0 additions & 35 deletions src/documentation/how_tos/how_to_run_the_trace_viewer.ipynb

This file was deleted.

@@ -0,0 +1,96 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from uuid import uuid4\n",
"\n",
"from intelligence_layer.connectors import StudioClient\n",
"from intelligence_layer.core import InMemoryTracer, Task, TaskSpan"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# How to use PhariaStudio for Debugging in a SaaS Configuration\n",
"<div class=\"alert alert-info\"> \n",
"\n",
"Make sure your account has permissions to use the PhariaStudio application.\n",
"\n",
"For an on-prem or local installation, please contact the PhariaStudio team.\n",
"</div>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"0. Generate a trace of your `Task` of interest.\n",
"1. Initialize a `StudioClient` with a project.\n",
" - Use an existing project or create a new one with the `StudioClient.create_project` function.\n",
"2. Submit your traces with the client\n",
" 1. Submit a single trace via `Tracer.export_for_viewing` and `StudioClient.submit_trace`\n",
" 2. [Recommended] submit multiple traces via `StudioClient.submit_from_tracer`. \n",
"\n",
"### Example"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Step 0\n",
"class DummyTask(Task[str, str]):\n",
" def do_run(self, input: str, task_span: TaskSpan) -> str:\n",
" return f\"{input} -> output\"\n",
"\n",
"\n",
"tracer = InMemoryTracer()\n",
"DummyTask().run(\"My Dummy Run\", tracer=tracer)\n",
"\n",
"# Step 1\n",
"project_name = str(uuid4())\n",
"studio_client = StudioClient(project=project_name)\n",
"my_project = studio_client.create_project(project=project_name)\n",
"\n",
"# Step 2.1\n",
"trace_to_submit = tracer.export_for_viewing()\n",
"trace_id = studio_client.submit_trace(trace_to_submit) # only works for single traces\n",
"\n",
"# Step 2.2\n",
"tracer2 = InMemoryTracer()\n",
"DummyTask().run(\"My Dummy Run2\", tracer=tracer2)\n",
"DummyTask().run(\"My Dummy Run3\", tracer=tracer2)\n",
"ids_of_submitted_traces = studio_client.submit_from_tracer(tracer2)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "intelligence-layer-aL2cXmJM-py3.11",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.8"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
1 change: 1 addition & 0 deletions src/intelligence_layer/connectors/__init__.py
Original file line number Diff line number Diff line change
@@ -44,5 +44,6 @@
QdrantInMemoryRetriever as QdrantInMemoryRetriever,
)
from .retrievers.qdrant_in_memory_retriever import RetrieverType as RetrieverType
from .studio.studio import StudioClient as StudioClient

__all__ = [symbol for symbol in dir() if symbol and symbol[0].isupper()]
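The `__all__` comprehension above exports every name in the module namespace that starts with an uppercase letter, which is why the new `StudioClient as StudioClient` re-export is automatically picked up while private helpers and lowercase imports are not. The filter can be isolated for illustration (`public_exports` is a hypothetical stand-in for the inline comprehension):

```python
def public_exports(names: list[str]) -> list[str]:
    """Replicate the __all__ filter: keep only names starting with an uppercase letter."""
    return [n for n in names if n and n[0].isupper()]
```

Applied to `["StudioClient", "RetrieverType", "_private", "json"]`, only the two class names survive, matching the module's export behavior.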
@@ -295,7 +295,7 @@ class DocumentIndexClient:
Document Index is a tool for managing collections of documents, enabling operations such as creation, deletion, listing, and searching.
Documents can be stored either in the cloud or in a local deployment.

Args:
Attributes:
token: A valid token for the document index API.
base_document_index_url: The URL of the Document Index API.

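The one-word change from `Args:` to `Attributes:` follows the Google docstring convention: `Args:` documents the parameters of a function or `__init__`, while `Attributes:` on the class docstring documents instance state, which is what `token` and `base_document_index_url` are here. A minimal sketch of the convention, using a hypothetical class rather than the real `DocumentIndexClient`:

```python
class ExampleClient:
    """Minimal client illustrating the Google docstring convention.

    Attributes:
        token: A valid API token.
        base_url: The URL of the service API.
    """

    def __init__(self, token: str, base_url: str) -> None:
        self.token = token
        self.base_url = base_url
```

Documentation generators such as Sphinx with the napoleon extension render the two sections differently, so using the right header matters for the generated API docs.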