reWOO style planner

jondurbin · Jul 22, 2023 · 58a0b8a
1 parent eb93532
commit 58a0b8a

Showing 7 changed files with 106 additions and 5 deletions.
diff --git a/airoboros/instructors/inline_qa.py b/airoboros/instructors/inline_qa.py
@@ -28,7 +28,11 @@ async def generate(instructor, category, start_key="QUESTION", end_key="ANSWER")
     language = config.get("language") or instructor.language
     while count < target_count:
         # Get a batch of instructions.
-        prompt = template.format(batch_size=batch_size, language=language)
+        prompt = (
+            template.format(batch_size=batch_size, language=language)
+            if "{batch_size}" in template
+            else template.format(language=language)
+        )
         response = await instructor.generate_response(prompt, **api_params)
         if not response:
             continue
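
Note on the hunk above: `str.format` ignores keyword arguments that the template never references, so the conditional and the original unconditional call should produce the same string. The sketch below checks this; `build_prompt` and the two sample templates are hypothetical and do not come from the repository.

```python
def build_prompt(template: str, batch_size: int, language: str) -> str:
    """Hypothetical mirror of the conditional introduced in the hunk above."""
    if "{batch_size}" in template:
        return template.format(batch_size=batch_size, language=language)
    return template.format(language=language)


# Illustrative templates only; the real ones live under prompts/.
with_batch = "Generate {batch_size} items in {language}."
without_batch = "Generate one item in {language}."

results = {}
for template in (with_batch, without_batch):
    conditional = build_prompt(template, 3, "English")
    # Passing batch_size unconditionally: unused keyword arguments are ignored.
    unconditional = template.format(batch_size=3, language="English")
    results[template] = (conditional, unconditional)
```

If this holds for the real templates, the conditional mainly documents intent rather than changing behavior.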

diff --git a/airoboros/instructors/plan.py b/airoboros/instructors/plan.py
@@ -0,0 +1,7 @@
+from airoboros.instructors.inline_qa import generate as generate_inline
+
+
+async def generate(instructor):
+    """Generator for rewoo style planning."""
+    async for item in generate_inline(instructor, "plan"):
+        yield item
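
Note on `plan.py`: async generators cannot use `yield from`, which is why the wrapper re-yields each item in an explicit `async for` loop. A minimal, self-contained sketch of the same delegation pattern follows; `generate_inline` here is a hypothetical stand-in that yields dummy dicts, not the real `inline_qa.generate`.

```python
import asyncio
from typing import AsyncIterator


async def generate_inline(category: str) -> AsyncIterator[dict]:
    """Hypothetical stand-in for inline_qa.generate; yields dummy items."""
    for index in range(2):
        await asyncio.sleep(0)  # simulate awaiting an API response
        yield {"category": category, "index": index}


async def generate() -> AsyncIterator[dict]:
    """Delegate to the inline generator with a fixed category."""
    # "yield from" is a SyntaxError inside an async function,
    # so each item is forwarded explicitly.
    async for item in generate_inline("plan"):
        yield item


async def collect() -> list:
    """Drain the async generator into a list."""
    return [item async for item in generate()]


items = asyncio.run(collect())
```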
diff --git a/airoboros/instructors/prompts/agent.txt b/airoboros/instructors/prompts/agent.txt
@@ -1,7 +1,7 @@
-Below are example prompt/response pairs that showing examples of an "agent"/"router" prompt for an artificial intelligence assistant.
+Below is an example prompt/response pair showing an "agent"/"router" prompt for an artificial intelligence assistant.
 
 PROMPT:
-Please select an appropriate function and parameters to use from the list of available functions below, based on the provided user input.
+Please select an appropriate function and parameters to use from the list of available functions below, based on the provided user input.  Provide your response in YAML format.
 
 Input: From the provided CSV, generate an aggregate table containing a count per email address.
 
@@ -38,7 +38,7 @@ Be sure to include an input that could make use of one of the functions.
 
 Be sure to format the list of available functions as proper YAML, with appropriate spacing for nested objects.
 
-Be sure to format the answers as proper YAML, with appropriate spacing for nested objects.
+The new prompts should ask for output to be in either YAML or JSON format; be sure to format each answer according to the format requested by its corresponding prompt.
 
 Be sure to randomize the ordering of the available functions so the selected function is not always the first function.  The selected function should sometimes be the first function, other times be the second function, and so on for all N in number of functions.
 

diff --git a/airoboros/instructors/prompts/plan.txt b/airoboros/instructors/prompts/plan.txt
@@ -0,0 +1,80 @@
+Below is an example prompt/response pair showing how an artificial intelligence assistant can generate an execution plan given the input question. The assistant doesn't attempt to actually solve the input instruction, only provides a step-by-step execution plan.
+
+Example 1.
+QUESTION: For the following tasks, make plans that can solve the problem step-by-step. For each plan, indicate which external tool together with tool input to retrieve evidence. You can store the evidence into a variable #E[index] that can be called by later tools.
+
+Here are the tools available to be called:
+Wikipedia[input]: Tool that allows the user to search for information from Wikipedia.  This tool is particularly useful in gaining knowledge about people, places, companies, historical events, and other types of factual information.  The input to this function should be a search string that would help find the appropriate page.  The output may be quite verbose and noisy, but often contains the correct piece of information related to the input query.
+QA[input]: Tool that can answer questions directly from common sense and general world knowledge, as well as questions given input context that may contain the answer.
+
+Each plan should be followed by exactly one evidence (#E[index]) value.
+
+The output should be in format:
+Plan: [first action to take based on input question]
+#E1 = [function to call with input parameter]
+Plan: [next action to take, based on result of #E1]
+#E2 = [next function to call and input parameter, which may include reference to previous evidence, e.g. "Given context #E1"]
+...
+Final answer: #E[n]
+
+Question: What is the elevation range for the area that the eastern sector of the Colorado orogeny extends into?
+
+ANSWER:
+Plan: Search for more information about Colorado orogeny.
+#E1 = Wikipedia[Colorado orogeny]
+Plan: Find out the area that eastern sector of the Colorado orogeny extends into.
+#E2 = QA[What is the name of the area that eastern sector of Colorado extends into? Given context: #E1]
+Plan: Search for more information about the area.
+#E3 = Wikipedia[#E2]
+Plan: Find out the elevation range for the area.
+#E4 = QA[What is elevation range for the area #E2? Given context: #E3]
+Final answer: #E4
+
+Example 2.
+QUESTION:
+Please create a step-by-step plan to generate an ideal response to the user instruction, making use of a set of available tools.  Each plan will have a corresponding evidence value, which will be the output of one of the available functions given an input string that can be the user question, one or more previous evidence values, or a mixture of both.
+
+Here are the tools available to be called:
+Google[input]: Tool that allows the user to search for information using the Google search engine.  This tool is useful in finding an appropriate list of sites that may or may not include the answer to the user's question.  The function doesn't directly answer the question; it finds a list of sites that may have the answer.
+Scraper[input]: Loads one or more websites from the input string containing newline delimited links, where input is one or more links, and produces plain text output containing the content of the links.
+LinkExtractor[input]: Extracts links from plain text and produces a plain text, newline delimited response of links.
+LLM[input]: Question answering language model, particularly useful in answering questions based on an input passage of text.  The input must be a text question that references an :evidence[n]: variable, e.g. What color is the cat, given :evidence1:?
+
+The input to each function should just be a plain string, without quotes or "+" to concatenate a string with an evidence variable, e.g. LLM[What is the capital of Michigan, given :evidence3:?]
+
+Be sure to only include one evidence output per plan step.
+
+The output should be in format:
+Plan: [first action to take based on input question]
+:evidence0: = [function to call with input parameter]
+Plan: [next action to take, based on result of :evidence0:]
+:evidence1: = [next function to call and input parameter, which may include reference to previous evidence, e.g. "Given context :evidence0:"]
+...
+Answer: [:evidence[n]: containing the final answer.]
+
+
+Question: Who is the current CEO of Tesla and what are some of the key patents they hold?
+
+ANSWER:
+Plan: Begin by conducting a web search to find out who the current CEO of Tesla is.
+:evidence0: = Google[Current CEO of Tesla]
+Plan: Utilize a language model to interpret the search results and find the name of the CEO.
+:evidence1: = LLM[Who is the current CEO of Tesla, given :evidence0:?]
+Plan: Conduct another web search to find the key patents held by the identified CEO of Tesla.
+:evidence2: = Google[Key patents held by :evidence1:]
+Plan: Extract the relevant links from the Google search results for a more focused search.
+:evidence3: = LinkExtractor[:evidence2:]
+Plan: Use a web scraper tool to extract information from the relevant links.
+:evidence4: = Scraper[:evidence3:]
+Plan: Finally, utilize the language model to identify and summarize the key patents held by the CEO of Tesla from the extracted information.
+:evidence5: = LLM[What are the key patents held by :evidence1:, given :evidence4:?]
+Answer: :evidence5:
+
+I would like you to generate {batch_size} new, unique instruction/response pair of a similar format.  Be sure to include between 2 and 8 unique, random available tools.  One of the tools must be a question answering tool similar to "QA" in the example, but it can have a different name.  Don't make any reference to the examples in the output.
+
+Response format:
+QUESTION:
+[question in a format similar to the example, with instructions, descriptions of functions, output format, etc.]
+ANSWER:
+[answer to the question, in the format asked for in the QUESTION, with plan and execute/evidence vars, one plan per var]
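
Note on the plan format: both examples in `plan.txt` pair each `Plan:` line with exactly one evidence assignment of the form `variable = Tool[input]`. As a rough sketch of how a consumer might read such output, the code below parses those pairs with a regular expression. `Step`, `parse_plan`, and `referenced_evidence` are hypothetical names, not part of airoboros, and the pattern only covers the two variable styles shown in the examples (`#E1` and `:evidence0:`).

```python
import re
from typing import List, NamedTuple, Optional


class Step(NamedTuple):
    plan: str        # text after "Plan:"
    variable: str    # e.g. "#E1" or ":evidence0:"
    tool: str        # e.g. "Wikipedia"
    tool_input: str  # text inside the square brackets


# Matches lines such as "#E1 = Wikipedia[Colorado orogeny]" or
# ":evidence0: = Google[Current CEO of Tesla]".
EVIDENCE_RE = re.compile(r"^(#E\d+|:evidence\d+:)\s*=\s*(\w+)\[(.*)\]\s*$")


def parse_plan(text: str) -> List[Step]:
    """Pair each 'Plan:' line with the evidence assignment that follows it."""
    steps: List[Step] = []
    pending_plan: Optional[str] = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Plan:"):
            pending_plan = line[len("Plan:"):].strip()
            continue
        match = EVIDENCE_RE.match(line)
        if match and pending_plan is not None:
            steps.append(Step(pending_plan, *match.groups()))
            pending_plan = None
    return steps


def referenced_evidence(step: Step) -> List[str]:
    """List earlier evidence variables mentioned in a step's tool input."""
    return re.findall(r"#E\d+|:evidence\d+:", step.tool_input)


example = """Plan: Search for more information about Colorado orogeny.
#E1 = Wikipedia[Colorado orogeny]
Plan: Find out the area that eastern sector of the Colorado orogeny extends into.
#E2 = QA[What is the name of the area that eastern sector of Colorado extends into? Given context: #E1]
Final answer: #E2"""

steps = parse_plan(example)
```

Generated tools may use other variable naming schemes, so a real consumer would likely need a more permissive pattern.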
diff --git a/airoboros/self_instruct.py b/airoboros/self_instruct.py
@@ -409,6 +409,7 @@ async def run(self):
         from airoboros.instructors.experiences import generate as experience_generator
         from airoboros.instructors.general import generate as general_generator
         from airoboros.instructors.orca import generate as orca_generator
+        from airoboros.instructors.plan import generate as plan_generator
         from airoboros.instructors.riddles import generate as riddle_generator
         from airoboros.instructors.roleplay import generate as roleplay_generator
         from airoboros.instructors.trivia import generate as trivia_generator
@@ -422,6 +423,7 @@ async def run(self):
             "counterfactual_contextual": counterfactual_contextual_generator,
             "experience": experience_generator,
             "general": general_generator,
+            "plan": plan_generator,
             "orca": orca_generator,
             "riddle": riddle_generator,
             "roleplay": roleplay_generator,

diff --git a/example-config.yaml b/example-config.yaml
@@ -273,3 +273,11 @@ instructors:
     batch_size: 5
     min_docsearch_score: 0.1
     prompt_path: agent.txt
+
+  ##################################################################################
+  # reWOO style planner
+  plan:
+    count: 100
+    batch_size: 1
+    min_docsearch_score: 0.05
+    prompt_path: plan.txt
diff --git a/setup.py b/setup.py
@@ -6,7 +6,7 @@
 
 setup(
     name="airoboros",
-    version="2.0.2",
+    version="2.0.3",
     description="Updated and improved implementation of the self-instruct system.",
     long_description=long_description,
     long_description_content_type="text/markdown",