fix: make environment variables more configurable (#1175)
- migrate tests to use the p-prod instance and expose the needed env variables
- DI tests no longer assume existing state; they now create and clean up their own resources (see the sketch below)
- filter indexes are now also cleaned up
- most tests now use luminous-base-control instead of luminous-supreme-control
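
As an illustration of the second bullet, here is a minimal sketch of the create/clean-up pattern such DI tests can follow. It is not the test code from this PR: the fixture wiring, the `"my-namespace"` placeholder, and the `create_collection`/`delete_collection` method names are assumptions about the SDK's `DocumentIndexClient`.

```python
import os
from collections.abc import Iterator
from uuid import uuid4

import pytest
from dotenv import load_dotenv

from intelligence_layer.connectors import CollectionPath, DocumentIndexClient

load_dotenv()


@pytest.fixture
def document_index() -> DocumentIndexClient:
    # Both values come from the environment instead of hard-coded defaults.
    return DocumentIndexClient(
        token=os.environ["AA_TOKEN"],
        base_document_index_url=os.environ["DOCUMENT_INDEX_URL"],
    )


@pytest.fixture
def collection_path(document_index: DocumentIndexClient) -> Iterator[CollectionPath]:
    # Create a uniquely named collection instead of assuming one already exists ...
    path = CollectionPath(namespace="my-namespace", collection=f"test-{uuid4()}")
    document_index.create_collection(path)
    yield path
    # ... and clean it up again after the test has run.
    document_index.delete_collection(path)
```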
---------

Co-authored-by: Michael Barlow <[email protected]>
NiklasKoehneckeAA and Michael-JB authored Dec 16, 2024
1 parent eab6ed1 commit 047c4d5
Showing 34 changed files with 750 additions and 704 deletions.
12 changes: 9 additions & 3 deletions env.sample → .env.example
@@ -1,4 +1,3 @@
CLIENT_URL="https://api.aleph-alpha.com"
ARGILLA_API_URL="http://localhost:6900/"
ARGILLA_API_KEY="argilla.apikey"

@@ -13,7 +12,14 @@ POSTGRES_DB=il_sdk
POSTGRES_USER=il_sdk
POSTGRES_PASSWORD=test

# things to adapt
# ---- Things to adapt ----
CLIENT_URL=...
AA_TOKEN=token
DOCUMENT_INDEX_URL=...

# needed for studio integration
DATA_SERVICE_URL=...
AUTHORIZATION_SERVICE_URL=...

# needed for hugging face integration
HUGGING_FACE_TOKEN=token
AA_TOKEN=token
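
A hedged sketch of how application code can consume the reworked `.env.example`. The `load_dotenv`/`getenv` pattern mirrors the notebooks in this repository; the explicit fail-fast check is an illustrative addition, not part of the SDK.

```python
from os import getenv

from dotenv import load_dotenv

# Load the variables from a local .env file (copied from .env.example).
load_dotenv()

# These no longer fall back to built-in defaults, so check them up front.
required = ("CLIENT_URL", "AA_TOKEN", "DOCUMENT_INDEX_URL")
missing = [name for name in required if getenv(name) is None]
if missing:
    raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")

client_url = getenv("CLIENT_URL")
document_index_url = getenv("DOCUMENT_INDEX_URL")
```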
10 changes: 6 additions & 4 deletions .github/workflows/sdk-tests.yml
@@ -147,9 +147,9 @@ jobs:
POSTGRES_DB: "il_sdk"
POSTGRES_USER: "il_sdk"
POSTGRES_PASSWORD: "test"
AUTHORIZATION_SERVICE_URL: "none"
AUTHORIZATION_SERVICE_URL: ${{ secrets.AUTHORIZATION_SERVICE_URL }}
AA_TOKEN: ${{ secrets.AA_TOKEN }}
API_SCHEDULER_URL: "https://api.aleph-alpha.com"
API_SCHEDULER_URL: ${{ secrets.CLIENT_URL }}
DATA_SERVICE_URL: ${{secrets.DATA_SERVICE_URL}}
credentials:
username: "unused"
@@ -190,6 +190,7 @@ jobs:
ARGILLA_API_KEY: "argilla.apikey"
CLIENT_URL: ${{ secrets.CLIENT_URL }}
STUDIO_URL: "http://localhost:8000/"
DOCUMENT_INDEX_URL: ${{secrets.DOCUMENT_INDEX_URL}}
POSTGRES_HOST: "localhost"
POSTGRES_PORT: "5433"
POSTGRES_DB: "il_sdk"
@@ -235,9 +236,9 @@ jobs:
POSTGRES_DB: "il_sdk"
POSTGRES_USER: "il_sdk"
POSTGRES_PASSWORD: "test"
AUTHORIZATION_SERVICE_URL: "none"
AUTHORIZATION_SERVICE_URL: ${{ secrets.AUTHORIZATION_SERVICE_URL }}
AA_TOKEN: ${{ secrets.AA_TOKEN }}
API_SCHEDULER_URL: "https://api.aleph-alpha.com"
API_SCHEDULER_URL: ${{ secrets.CLIENT_URL }}
DATA_SERVICE_URL: ${{secrets.DATA_SERVICE_URL}}
credentials:
username: "unused"
@@ -274,5 +275,6 @@ jobs:
ARGILLA_API_KEY: "argilla.apikey"
CLIENT_URL: ${{ secrets.CLIENT_URL }}
STUDIO_URL: "http://localhost:8001"
DOCUMENT_INDEX_URL: ${{secrets.DOCUMENT_INDEX_URL}}
run: |
./scripts/notebook_runner.sh
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -19,6 +19,9 @@

### Breaking Changes
- The env variable `POSTGRES_HOST` is split into `POSTGRES_HOST` and `POSTGRES_PORT` (see the sketch after this list). This affects all classes interacting with Studio and the `InstructionFinetuningDataRepository`.
- The following env variables now need to be set explicitly (they previously fell back to defaults):
- `CLIENT_URL` - URL of your inference stack
- `DOCUMENT_INDEX_URL` - URL of the document index
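
A minimal sketch of the first breaking change: reading the now-separate host and port and combining them into a connection URL. The libpq-style URL format is an assumption for illustration, not code from this release.

```python
import os

# POSTGRES_HOST used to carry host and port together; they are now separate variables.
postgres_host = os.environ["POSTGRES_HOST"]  # e.g. "localhost"
postgres_port = os.environ["POSTGRES_PORT"]  # e.g. "5433"

database_url = (
    f"postgresql://{os.environ['POSTGRES_USER']}:{os.environ['POSTGRES_PASSWORD']}"
    f"@{postgres_host}:{postgres_port}/{os.environ['POSTGRES_DB']}"
)
```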

## 8.0.0

2 changes: 1 addition & 1 deletion README.md
@@ -116,7 +116,7 @@ The tutorials aim to guide you through implementing several common use-cases wit

### Setup LLM access

The tutorials require access to an LLM endpoint. You can choose between using the Aleph Alpha API (`https://api.aleph-alpha.com`) or an on-premise setup by configuring the appropriate environment variables. To configure the environment variables, create a `.env` file in the root directory of the project and copy the contents of the `.env.sample` file into it.
The tutorials require access to an LLM endpoint. You can choose between using the Aleph Alpha API (`https://api.aleph-alpha.com`) or an on-premise setup by configuring the appropriate environment variables. To configure the environment variables, create a `.env` file in the root directory of the project and copy the contents of the `.env.example` file into it.

To use the **Aleph Alpha API**, which is set as the default host URL, set the `AA_TOKEN` variable to your [Aleph Alpha access token](https://docs.aleph-alpha.com/docs/account/#create-a-new-token), and you are good to go.
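
A short sketch of the client setup the tutorials rely on after this change. That `LimitedConcurrencyClient.from_env()` picks up `AA_TOKEN` and `CLIENT_URL` from the environment is an assumption based on the variables this commit makes mandatory.

```python
from dotenv import load_dotenv

from intelligence_layer.connectors import LimitedConcurrencyClient

# Pick up AA_TOKEN and CLIENT_URL from the .env file created from .env.example.
load_dotenv()

# Rate-limited client against the configured inference endpoint.
client = LimitedConcurrencyClient.from_env()
```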

3 changes: 1 addition & 2 deletions docker-compose.yaml
@@ -71,8 +71,7 @@ services:
env_file: ".env" # mainly for AA-TOKEN, DB User/PW
environment:
POSTGRES_HOST: postgres
AUTHORIZATION_SERVICE_URL: "none"
API_SCHEDULER_URL: "https://api.aleph-alpha.com"
API_SCHEDULER_URL: ${CLIENT_URL}
postgres:
image: postgres:15
ports:
27 changes: 10 additions & 17 deletions src/documentation/document_index.ipynb
@@ -25,7 +25,7 @@
" LimitedConcurrencyClient,\n",
" SemanticEmbed,\n",
")\n",
"from intelligence_layer.core import InMemoryTracer\n",
"from intelligence_layer.core import InMemoryTracer, LuminousControlModel\n",
"from intelligence_layer.examples import MultipleChunkRetrieverQa, RetrieverBasedQaInput\n",
"\n",
"load_dotenv()"
@@ -61,9 +61,7 @@
"source": [
"## Upload documents to the Document Index\n",
"\n",
"To search through the DI, you'll first need to upload the documents to it.\n",
"For now, we'll use the [DI instance hosted by Aleph Alpha](https://app.document-index.aleph-alpha.com).\n",
"We assume you have an assigned namespace and possess a token to access it."
"To search through the DI, you'll first need to upload the documents to it. We assume that the URL of your DI instance is available under the `DOCUMENT_INDEX_URL` environment variable, and that you already have a namespace and a token to access it."
]
},
{
@@ -72,8 +70,8 @@
"metadata": {},
"outputs": [],
"source": [
"# specify this for your own namespace\n",
"NAMESPACE = \"aleph-alpha\""
"# change this to your namespace\n",
"NAMESPACE = \"Search\""
]
},
{
@@ -84,7 +82,7 @@
"source": [
"document_index = DocumentIndexClient(\n",
" token=getenv(\"AA_TOKEN\"),\n",
" base_document_index_url=\"https://document-index.aleph-alpha.com\",\n",
" base_document_index_url=getenv(\"DOCUMENT_INDEX_URL\"),\n",
")"
]
},
@@ -630,7 +628,7 @@
"outputs": [],
"source": [
"client = LimitedConcurrencyClient.from_env()\n",
"retriever_qa = MultipleChunkRetrieverQa(document_index_retriever, insert_chunk_number=3)\n",
"retriever_qa = MultipleChunkRetrieverQa(\n",
" document_index_retriever, insert_chunk_number=3, model=LuminousControlModel()\n",
")\n",
"\n",
"\n",
"input = RetrieverBasedQaInput(\n",
@@ -659,18 +659,11 @@
"source": [
"tracer"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"display_name": "intelligence-layer-LP3DLT23-py3.12",
"language": "python",
"name": "python3"
},
@@ -684,7 +677,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.9"
"version": "3.12.2"
}
},
"nbformat": 4,
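
Condensing the notebook changes above into one env-driven setup sketch (the collection and retriever wiring from the rest of the notebook is omitted, and `NAMESPACE` is a placeholder you would replace):

```python
from os import getenv

from dotenv import load_dotenv

from intelligence_layer.connectors import DocumentIndexClient, LimitedConcurrencyClient
from intelligence_layer.core import LuminousControlModel

load_dotenv()

NAMESPACE = "Search"  # change this to your namespace

# The DI endpoint is no longer hard-coded; it comes from DOCUMENT_INDEX_URL.
document_index = DocumentIndexClient(
    token=getenv("AA_TOKEN"),
    base_document_index_url=getenv("DOCUMENT_INDEX_URL"),
)

# The QA task now gets an explicit model rather than relying on a default.
client = LimitedConcurrencyClient.from_env()
model = LuminousControlModel(name="luminous-base-control", client=client)
```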
13 changes: 5 additions & 8 deletions src/documentation/elo_qa_eval.ipynb
@@ -27,9 +27,6 @@
"metadata": {},
"outputs": [],
"source": [
"from os import getenv\n",
"\n",
"from aleph_alpha_client import Client\n",
"from dotenv import load_dotenv\n",
"\n",
"from intelligence_layer.connectors import LimitedConcurrencyClient\n",
@@ -56,8 +53,7 @@
"\n",
"load_dotenv()\n",
"\n",
"aa_client = Client(getenv(\"AA_TOKEN\"))\n",
"limited_concurrency_client = LimitedConcurrencyClient(aa_client, max_retry_time=60)"
"aa_client = limited_concurrency_client = LimitedConcurrencyClient.from_env()"
]
},
{
@@ -205,7 +201,7 @@
"source": [
"models = [\n",
" LuminousControlModel(name=\"luminous-base-control\", client=aa_client),\n",
" LuminousControlModel(name=\"luminous-supreme-control\", client=aa_client),\n",
" Llama3InstructModel(name=\"llama-3.1-8b-instruct\", client=aa_client),\n",
"]\n",
"\n",
"for model in models:\n",
@@ -292,6 +288,8 @@
"metadata": {},
"outputs": [],
"source": [
"# Here we evaluate with the same model as we want to evaluate for the evaluation.\n",
"# This includes a significant bias and is generally less recommended.\n",
"elo_qa_evaluation_logic = EloQaEvaluationLogic(\n",
" model=Llama3InstructModel(name=\"llama-3.1-8b-instruct\")\n",
")\n",
@@ -450,8 +448,7 @@
"outputs": [],
"source": [
"newly_added_models = [\n",
" LuminousControlModel(name=\"luminous-base-control-20230501\", client=aa_client),\n",
" LuminousControlModel(name=\"luminous-supreme-control-20230501\", client=aa_client),\n",
" Llama3InstructModel(name=\"llama-3.1-70b-instruct\", client=aa_client),\n",
"]\n",
"\n",
"for model in newly_added_models:\n",
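
To reduce the self-evaluation bias the notebook comment points out, the grading model can simply be one that is not among the candidates. A hedged sketch follows; the import paths (in particular `intelligence_layer.evaluation` for `EloQaEvaluationLogic`) are assumptions.

```python
from intelligence_layer.core import Llama3InstructModel, LuminousControlModel
from intelligence_layer.evaluation import EloQaEvaluationLogic  # import path assumed

# Candidate models under comparison ...
models = [
    LuminousControlModel(name="luminous-base-control"),
    Llama3InstructModel(name="llama-3.1-8b-instruct"),
]

# ... graded by a model that is not itself a candidate, to avoid self-preference bias.
elo_qa_evaluation_logic = EloQaEvaluationLogic(
    model=Llama3InstructModel(name="llama-3.1-70b-instruct")
)
```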
18 changes: 7 additions & 11 deletions src/documentation/evaluate_with_studio.ipynb
@@ -84,13 +84,6 @@
"Therefore, let's check out what it looks like."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
@@ -126,14 +119,17 @@
"metadata": {},
"outputs": [],
"source": [
"all_labels = list(set(item[\"label\"] for item in data))\n",
"# we grab only a subset of the data here to speed up the evaluation. Remove the index to run on all example datapoints.\n",
"subset_of_data = data[:5]\n",
"\n",
"all_labels = list(set(item[\"label\"] for item in subset_of_data))\n",
"dataset = studio_dataset_repository.create_dataset(\n",
" examples=[\n",
" Example(\n",
" input=ClassifyInput(chunk=TextChunk(item[\"message\"]), labels=all_labels),\n",
" expected_output=item[\"label\"],\n",
" )\n",
" for item in data\n",
" for item in subset_of_data\n",
" ],\n",
" dataset_name=\"Single Label Classify Dataset\",\n",
")\n",
@@ -281,7 +277,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "intelligence-layer-ZqHLMTHE-py3.12",
"display_name": "intelligence-layer-LP3DLT23-py3.12",
"language": "python",
"name": "python3"
},
@@ -295,7 +291,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.3"
"version": "3.12.2"
}
},
"nbformat": 4,
4 changes: 2 additions & 2 deletions src/documentation/fastapi_example.py
@@ -65,7 +65,7 @@ def __call__(
def client() -> Client:
return Client(
token=os.environ["AA_TOKEN"],
host=os.getenv("AA_CLIENT_BASE_URL", "https://api.aleph-alpha.com"),
host=os.environ["CLIENT_URL"],
)


@@ -78,7 +78,7 @@ def default_model(
def summary_task(
model: Annotated[LuminousControlModel, Depends(default_model)],
) -> SteerableSingleChunkSummarize:
return SteerableSingleChunkSummarize(model)
return SteerableSingleChunkSummarize(model=model)


@app.post(
2 changes: 2 additions & 0 deletions src/documentation/how_tos/example_data.py
@@ -112,6 +112,7 @@ class ExampleData:
run_overview_2: RunOverview
evaluation_overview_1: EvaluationOverview
evaluation_overview_2: EvaluationOverview
studio_project_name: str


def example_data() -> ExampleData:
@@ -159,6 +160,7 @@ def example_data() -> ExampleData:
example_data.run_overview_2 = run_overview_2
example_data.evaluation_overview_1 = evaluation_overview_1
example_data.evaluation_overview_2 = evaluation_overview_2
example_data.studio_project_name = "My Example Project"

return example_data

4 changes: 2 additions & 2 deletions src/documentation/how_tos/how_to_aggregate_evaluations.ipynb
@@ -70,7 +70,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "intelligence-layer-aL2cXmJM-py3.11",
"display_name": "intelligence-layer-LP3DLT23-py3.12",
"language": "python",
"name": "python3"
},
@@ -84,7 +84,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.8"
"version": "3.12.2"
}
},
"nbformat": 4,