Commit

WIP: trying out different models
NickyHavoc authored and JohannesWesch committed Apr 4, 2024
1 parent 88efa0a commit 95033f3
Showing 1 changed file with 102 additions and 25 deletions.
127 changes: 102 additions & 25 deletions src/examples/user_journey.ipynb
@@ -47,7 +47,10 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"To start off, we are only given a few anecdotal examples. Let's see how far we can get with these.\n"
"To start off, we are only given a few anecdotal examples.\n",
"Firstly, there are two e-mails, and secondly a number of potential departments to which they should be sent.\n",
"\n",
"Let's have a look.\n"
]
},
{
@@ -102,15 +105,23 @@
"# instantiating the default task\n",
"prompt_based_classify = PromptBasedClassify()\n",
"\n",
"# building the input object for each example\n",
"classify_inputs = [\n",
" ClassifyInput(chunk=TextChunk(example), labels=labels) for example in examples\n",
"]\n",
"\n",
"\n",
"# running the tasks concurrently\n",
"outputs = prompt_based_classify.run_concurrently(classify_inputs, InMemoryTracer())\n",
"outputs"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Hmm, we have some results, but they aren't really legible (yet)."
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -128,6 +139,17 @@
"[sorted(list(o.scores.items()), key=lambda i: i[1], reverse=True)[0] for o in outputs]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It appears that the Finance Department can fix my laptop and the Comms people can award free credits...\n",
"We probably have to do some fine-tuning of our classification approach.\n",
"\n",
"However, let's first make sure that this evidence is not anecdotal.\n",
"For this, we need to run an evaluation. Luckily, we now have access to a few more examples...\n"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -137,29 +159,21 @@
"It appears that the Finance Department can fix my laptop and the Comms people can award free credits...\n",
"We probably have to do some fine-tuning of our classification approach.\n",
"\n",
" },\n",
" {\n",
" \"label\": \"Sales\",\n",
" \"message\": \"Jonas, we have met each other at the event in Nürnberg, can we meet for a follow up in your Office in Heidelberg?\"\n",
"\n",
" },\n",
" {\n",
" \"label\": \"Security\",\n",
" \"message\": \"Your hTTPs Certificate is not valid on your www.aleph-alpha.de\"\n",
" },\n",
" {\n",
" \"label\": \"HR\",\n",
" \"message\": \"I want to take a week off immediatly\"\n",
" },\n",
" {\n",
" \"label\": \"HR\",\n",
" \"message\": \"I want to take a sabbatical\"\n",
" },\n",
" {\n",
" \"label\": \"HR\",\n",
" \"message\": \"How can I work more, I want to work weekends, can I get paid overtime?\"\n",
" }\n",
"]"
"with open(\"data/classify_examples.json\", \"r\") as file:\n",
" labeled_examples: list[dict[str, str]] = json.load(file)\n",
"\n",
"labeled_examples"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The Intelligence Layer supports running task evaluations.\n",
"\n",
"First, we have to create a dataset inside a repository.\n",
"There are different repositories (that persist datasets in different ways), but an `InMemoryDatasetRepository` will do for now.\n"
]
},
{
@@ -211,6 +225,13 @@
"When a dataset is created, we generate a unique ID. We'll need it later."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When a dataset is created, we generate a unique ID. We'll need it later."
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -220,6 +241,13 @@
"dataset_id"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we have a dataset, let's actually run an evaluation on it!\n"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -264,7 +292,15 @@
"metadata": {},
"outputs": [],
"source": [
"eval_overview = evaluator.evaluate_runs(run_overview.id)"
"run_overview = runner.run_dataset(dataset_id)\n",
"run_overview"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, let's evaluate this run."
]
},
{
@@ -429,6 +465,47 @@
"Let's run the cleaned dataset using this task..."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The prompt used for the `PromptBasedClassify` task looks as follows:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(prompt_based_classify.instruction)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can probably improve this task by making the prompt more specific, like so:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"adjusted_prompt = \"\"\"Identify the department that would be responsible for handling the given request.\n",
"Reply with only the department name.\"\"\"\n",
"prompt_adjusted_classify = PromptBasedClassify(instruction=adjusted_prompt)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's run the cleaned dataset using this task..."
]
},
{
"cell_type": "code",
"execution_count": null,

0 comments on commit 95033f3

Please sign in to comment.
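Note on the top-label cell in this diff (`sorted(list(o.scores.items()), key=lambda i: i[1], reverse=True)[0]`): the same selection can be sketched with plain dictionaries and `max`, which avoids sorting every label just to read the first entry. This is a minimal sketch, not the notebook's code. It assumes each output's `scores` behaves like a mapping from label to float, as the cell above suggests. The helper name `top_label` and the score values are hypothetical and made up for illustration.

```python
from typing import Mapping


def top_label(scores: Mapping[str, float]) -> tuple[str, float]:
    """Return the (label, score) pair with the highest score."""
    if not scores:
        raise ValueError("scores must not be empty")
    # max() returns the first maximal item, matching sorted(..., reverse=True)[0] on ties
    return max(scores.items(), key=lambda item: item[1])


# Hypothetical scores for two e-mails; NOT actual output from the notebook.
example_scores = [
    {"Finance": 0.7, "IT Support": 0.2, "Sales": 0.1},
    {"Communications": 0.5, "Customer": 0.4, "HR": 0.1},
]

top_labels = [top_label(scores) for scores in example_scores]
print(top_labels)
```

In the notebook this would presumably be applied as `[top_label(o.scores) for o in outputs]`, assuming `o.scores` is a non-empty mapping.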