[AIC][RFC] promptfoo integration prototype

Make a custom provider using AIConfig and configure it with a promptfoo YAML file.
lastmile-ai · Dec 1, 2023 · 4516415
1 parent 802ce1d
commit 4516415

Showing 13 changed files with 776 additions and 0 deletions.
diff --git a/package-lock.json b/package-lock.json
diff --git a/package.json b/package.json
@@ -0,0 +1,10 @@
+{
+  "devDependencies": {
+    "@types/node": "^20.10.0"
+  },
+  "dependencies": {
+    "dotenv": "^16.3.1",
+    "fs": "^0.0.1-security",
+    "openai": "^4.20.1"
+  }
+}
diff --git a/python/src/aiconfig/eval/promptfoo/README.md b/python/src/aiconfig/eval/promptfoo/README.md
@@ -0,0 +1,23 @@
+# Promptfoo integration
+
+Use case: I'm a SWE who wants to run my AIConfig against a set of test cases specified in a config file. Each test case has the input and a success condition of my choosing.
+
+## Philosophy / design
+
+Promptfoo has a clean interface (both inputs and outputs) for this use case. Tests are specified in a YAML file, and the test suite can be run with a single command. The same config file makes it easy to connect your test suite to an AIConfig with a small amount of code.
+
+## How-to guide
+
+1. Write your test cases in a Promptfoo config file. See examples/travel/travel_promtfooconfig.yaml for an example.
+2. Define an AIConfig test suite settings file. It should contain the prompt name and the path to your aiconfig. See examples/travel/travel_aiconfig_test_suite_settings.json for an example.
+3. Set your provider to run run_aiconfig.py with your settings file as the argument. For example (see examples/travel/travel_promtfooconfig.yaml):
+
+```yaml
+providers:
+  - exec:python ../../run_aiconfig.py ./travel_aiconfig_test_suite_settings.json
+```
+
+4. Export your provider API key, if needed, so that it is available to subprocess environments:
+   `export OPENAI_API_KEY=...`
+
+5. Run `npx promptfoo@latest eval -c path/to/promptfooconfig.yaml`
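
Review note on step 3: the `exec:` provider runs the given command and reads the model output from stdout. The sketch below shows the argument handling a script such as run_aiconfig.py would plausibly need. It assumes promptfoo appends the rendered prompt, then a JSON options string, then a JSON context string after the arguments written in the provider line; that ordering is an assumption to check against the promptfoo docs. `ProviderArgs` and `parse_provider_args` are hypothetical names, not taken from this commit.

```python
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ProviderArgs:
    """Arguments a provider script receives (hypothetical container)."""

    settings_path: str
    prompt: str
    options: dict[str, Any] = field(default_factory=dict)
    context: dict[str, Any] = field(default_factory=dict)


def _parse_json_object(raw: str) -> dict[str, Any]:
    """Parse a JSON object; fall back to {} for invalid or non-object input."""
    try:
        value = json.loads(raw)
    except json.JSONDecodeError:
        return {}
    return value if isinstance(value, dict) else {}


def parse_provider_args(argv: list[str]) -> ProviderArgs:
    """Split argv into the settings path, the prompt, and optional JSON extras.

    Assumed layout: [script, settings_path, prompt, options_json, context_json].
    """
    if len(argv) < 3:
        raise SystemExit("usage: run_aiconfig.py SETTINGS_PATH PROMPT [OPTIONS] [CONTEXT]")
    options = _parse_json_object(argv[3]) if len(argv) > 3 else {}
    context = _parse_json_object(argv[4]) if len(argv) > 4 else {}
    return ProviderArgs(argv[1], argv[2], options, context)
```

A real script would call `parse_provider_args(sys.argv)`, run the named prompt, and print only the model output, since anything written to stdout is treated as the result.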
diff --git a/python/src/aiconfig/eval/promptfoo/examples/travel/travel_aiconfig_test_suite_settings.json b/python/src/aiconfig/eval/promptfoo/examples/travel/travel_aiconfig_test_suite_settings.json
@@ -0,0 +1,4 @@
+{
+  "prompt_name": "get_activities",
+  "aiconfig_path": "travel_parametrized_for_testing.aiconfig.json"
+}
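
Review note: the settings file above carries just two keys, `prompt_name` and `aiconfig_path`. A minimal sketch of loading and validating it follows. `SuiteSettings` and `load_settings` are hypothetical names, and resolving a relative `aiconfig_path` against the settings file's directory is an assumption; the commit may resolve it against the working directory instead.

```python
import json
from pathlib import Path
from typing import NamedTuple


class SuiteSettings(NamedTuple):
    """The two fields found in the test suite settings file."""

    prompt_name: str
    aiconfig_path: Path


def load_settings(settings_path: str) -> SuiteSettings:
    """Read the settings JSON and check that both required keys are present."""
    path = Path(settings_path)
    raw = json.loads(path.read_text(encoding="utf-8"))

    missing = [key for key in ("prompt_name", "aiconfig_path") if key not in raw]
    if missing:
        raise ValueError(f"{path}: missing required keys: {', '.join(missing)}")

    # Assumption: a relative aiconfig_path is relative to the settings file.
    aiconfig_path = Path(raw["aiconfig_path"])
    if not aiconfig_path.is_absolute():
        aiconfig_path = path.parent / aiconfig_path

    return SuiteSettings(raw["prompt_name"], aiconfig_path)
```

Failing fast on a missing key gives a clearer error than a `KeyError` raised later, in the middle of an eval run.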
diff --git a/...src/aiconfig/eval/promptfoo/examples/travel/travel_parametrized_for_testing.aiconfig.json b/...src/aiconfig/eval/promptfoo/examples/travel/travel_parametrized_for_testing.aiconfig.json
@@ -0,0 +1,36 @@
+{
+  "name": "NYC Trip Planner",
+  "description": "Intrepid explorer with ChatGPT and AIConfig",
+  "schema_version": "latest",
+  "metadata": {
+    "models": {
+      "gpt-3.5-turbo": {
+        "model": "gpt-3.5-turbo",
+        "top_p": 1,
+        "temperature": 1
+      },
+      "gpt-4": {
+        "model": "gpt-4",
+        "max_tokens": 3000,
+        "system_prompt": "You are an expert travel coordinator with exquisite taste."
+      }
+    },
+    "default_model": "gpt-3.5-turbo"
+  },
+  "prompts": [
+    {
+      "name": "get_activities",
+      "input": "{{the_query}}"
+    },
+    {
+      "name": "gen_itinerary",
+      "input": "Generate an itinerary ordered by {{order_by}} for these activities: {{get_activities.output}}.",
+      "metadata": {
+        "model": "gpt-4",
+        "parameters": {
+          "order_by": "geographic location"
+        }
+      }
+    }
+  ]
+}
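
Review note: `get_activities` is parametrized entirely by `{{the_query}}`, so each promptfoo test case effectively supplies the whole prompt. The sketch below illustrates that substitution with a plain regex. It is not AIConfig's own resolver, which is not shown in this commit and may behave differently (for example when resolving `{{get_activities.output}}` from an earlier prompt run); `render_prompt` is a hypothetical helper.

```python
import re

# Matches {{name}} or {{name.attr}}, allowing spaces inside the braces.
_PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def render_prompt(template: str, params: dict[str, str]) -> str:
    """Replace each {{name}} placeholder with params[name].

    Raises KeyError when a placeholder has no matching parameter, so a
    misnamed test variable fails loudly instead of reaching the model.
    """

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in params:
            raise KeyError(f"missing parameter: {name}")
        return params[name]

    return _PLACEHOLDER.sub(substitute, template)


# With the get_activities prompt, the test variable becomes the full input.
rendered = render_prompt("{{the_query}}", {"the_query": "Fun things to do in NYC"})
```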