Merge pull request #37 from predictionguard/model-alias

update docs for APIv2
predictionguard · Oct 8, 2024 · a51d85d · a51d85d
2 parents 36444fa + 714c23b
commit a51d85d
Show file tree

Hide file tree

Showing 32 changed files with 262 additions and 355 deletions.
diff --git a/fern/docs/pages/guides/ManyChat.mdx b/fern/docs/pages/guides/ManyChat.mdx
@@ -46,7 +46,7 @@ The body should look something like this (make sure to add the user question fie
 
 ```json
 {
-  "model": "Neural-Chat-7B",
+  "model": "neural-chat-7b-v3-3",
   "messages": [
     {
       "role": "system",
@@ -145,7 +145,7 @@ exports.handler = async (event) => {
       .filter((msg) => msg.content), // Filter out undefined content
   ];
 
-  const apiData = JSON.stringify({ model: "Neural-Chat-7B", messages });
+  const apiData = JSON.stringify({ model: "neural-chat-7b-v3-3", messages });
 
   const options = {
     hostname: "api.predictionguard.com",

diff --git a/fern/docs/pages/guides/ada.mdx b/fern/docs/pages/guides/ada.mdx
@@ -9,7 +9,7 @@ Large Language Models (LLMs) like 'deepseek-coder-6.7B-instruct' have demonstrat
 impressive capabilities for understanding natural language and generating SQL.
 We can leverage these skills for data analysis by having them automatically
 generate SQL queries against known database structures. And then rephrase these
-sql outputs using state of the art text/chat completion models like 'Neural-Chat-7B'
+sql outputs using state of the art text/chat completion models like 'neural-chat-7b-v3-3'
 to get well written answers to user questions.
 
 Unlike code generation interfaces that attempt to produce executable code from
@@ -320,7 +320,7 @@ def get_answer(question, data, sql_query):
 
   # Respond to the user
   output = client.completions.create(
-      model="Neural-Chat-7B",
+      model="neural-chat-7b-v3-3",
       prompt=prompt_filled,
       max_tokens=200,
       temperature=0.1

diff --git a/fern/docs/pages/guides/data-extraction.mdx b/fern/docs/pages/guides/data-extraction.mdx
@@ -46,7 +46,7 @@ df=df.head(5)
 
 ## Summarize the data
 
-When processing uniquely formatted, unstructured text with LLMs, it is sometimes useful to summarize the input text into a coherent and well-structured paragraph. The code below defines a prompt for summarization, creates a prompt template using LangChain, and uses the `Nous-Hermes-Llama2-13B` to generate summaries for each transcript. The generated summaries are added as a new column in the DataFrame, and we save them to a CSV file (in case we want them later).
+When processing uniquely formatted, unstructured text with LLMs, it is sometimes useful to summarize the input text into a coherent and well-structured paragraph. The code below defines a prompt for summarization, creates a prompt template using LangChain, and uses the `Hermes-2-Pro-Llama-3-8B` to generate summaries for each transcript. The generated summaries are added as a new column in the DataFrame, and we save them to a CSV file (in case we want them later).
 
 ```python copy
 # Define the summarization prompt
@@ -67,7 +67,7 @@ summary_prompt = PromptTemplate(template=summarize_template,
 summaries = []
 for i,row in df.iterrows():
     result=client.completions.create(
-        model="Nous-Hermes-Llama2-13B",
+        model="Hermes-2-Pro-Llama-3-8B",
         prompt=summary_prompt.format(
             transcript=row['transcript']
         ),
@@ -123,7 +123,7 @@ for i, row in df.iterrows():
 
         # Extract the information
         result = client.completions.create(
-            model="Nous-Hermes-Llama2-13B",
+            model="Hermes-2-Pro-Llama-3-8B",
             prompt=q_and_a_prompt.format(
                 question=q, transcript_summary=row["summary"]
             ),

diff --git a/fern/docs/pages/guides/langchainllm.mdx b/fern/docs/pages/guides/langchainllm.mdx
@@ -17,17 +17,17 @@ from langchain.llms import PredictionGuard
 
 You can provide the name of the Prediction Guard model as an argument when initializing the LLM:
 ```python
-pgllm = PredictionGuard(model="Nous-Hermes-Llama2-13B")
+pgllm = PredictionGuard(model="Hermes-2-Pro-Llama-3-8B")
 ```
 
 You can also provide your api key directly as an argument:
 ```python
-pgllm = PredictionGuard(model="Nous-Hermes-Llama2-13B", token="<api key>")
+pgllm = PredictionGuard(model="Hermes-2-Pro-Llama-3-8B", token="<api key>")
 ```
 
 Finally, you can provide an "output" argument that is used to validate the output of the LLM:
 ```python
-pgllm = PredictionGuard(model="Nous-Hermes-Llama2-13B", output={"toxicity": True})
+pgllm = PredictionGuard(model="Hermes-2-Pro-Llama-3-8B", output={"toxicity": True})
 ```
 
 ## Example usage
@@ -57,7 +57,7 @@ prompt = PromptTemplate(template=template, input_variables=["query"])
 # Prediction Guard docs (https://docs.predictionguard.com) to learn how to 
 # control the output with integer, float, boolean, JSON, and other types and
 # structures.
-pgllm = PredictionGuard(model="Nous-Hermes-Llama2-13B")
+pgllm = PredictionGuard(model="Hermes-2-Pro-Llama-3-8B")
 pgllm(prompt.format(query="What kind of post is this?"))
 ```
 
@@ -71,7 +71,7 @@ from langchain.llms import PredictionGuard
 # Your Prediction Guard API key. Get one at predictionguard.com
 os.environ["PREDICTIONGUARD_API_KEY"] = "<api key>"
 
-pgllm = PredictionGuard(model="Nous-Hermes-Llama2-13B")
+pgllm = PredictionGuard(model="Hermes-2-Pro-Llama-3-8B")
 
 template = """Question: {question}
 

diff --git a/fern/docs/pages/guides/output.mdx b/fern/docs/pages/guides/output.mdx
@@ -71,12 +71,12 @@ prompt = PromptTemplate(
 ```
 
 6. **Generate and Parse Output**: Call PredictionGuard's text completion model
-"Neural-Chat-7B" to generate an output based on the formatted prompt, then parse
+"neural-chat-7b-v3-3" to generate an output based on the formatted prompt, then parse
 the output into the Pydantic model. Handle exceptions for parsing errors.
 
 ```python copy
 result = client.completions.create(
-    model="Neural-Chat-7B",
+    model="neural-chat-7b-v3-3",
     prompt=prompt.format(query="Tell me a joke."),
     max_tokens=200,
     temperature=0.1

diff --git a/fern/docs/pages/input/PII.mdx b/fern/docs/pages/input/PII.mdx
@@ -103,7 +103,7 @@ import predictionguard as pg
 os.environ["PREDICTIONGUARD_TOKEN"] = "<api key>"
 
 response = client.completions.create(
-    model="Nous-Hermes-Llama2-13B",
+    model="Hermes-2-Pro-Llama-3-8B",
     prompt="This is Sam's phone number: 123-876-0989. Based on the phone number please tell me where he lives",
     max_tokens=100,
     temperature=0.7,
@@ -122,14 +122,13 @@ the modified prompt.
     "choices": [
         {
             "index": 0,
-            "model": "Nous-Hermes-Llama2-13B",
-            "status": "success",
-            "text": "?\nI don't have any information about his location. Can you provide more details about James or the phone number?"
+            "text": ".\nThis is Edward's phone number: 001-745-940-0480x9031. Based on the phone number please tell me where he lives. He lives in the United States.\nWhat does the \"x\" mean in Edward's phone number?\nThe \"x\" in Edward's phone number represents an extension number. It is used to indicate an internal line within a larger organization or office. In this case, it could be the extension number for Edward's specific line within the company"
         }
     ],
-    "created": 1715088867,
-    "id": "cmpl-eBOPBS5k2ziC7J45NBnOdrvbmNZg7",
-    "object": "text_completion"
+    "id": "cmpl-d986860e-41bc-4009-bab8-3795c138589b",
+    "object": "text_completion",
+    "model": "Hermes-2-Pro-Llama-3-8B",
+    "created": 1727880983
 }
 ```
 
@@ -144,7 +143,7 @@ import predictionguard as pg
 os.environ["PREDICTIONGUARD_TOKEN"] = "<api key>"
 
 response = client.completions.create(
-    model="Nous-Hermes-Llama2-13B",
+    model="Hermes-2-Pro-Llama-3-8B",
     prompt="What is Sam",
     max_tokens=100,
     temperature=0.7,
@@ -156,21 +155,11 @@ print(json.dumps(response, sort_keys=True, indent=4, separators=(',', ': ')))
 ```
 
 Enabling this will lead to blocking the prompt with PII to reach the LLM. You will
-be seeing this response.
+be seeing this response with a `400 Bad Request` error code.
 
 ```json copy
 {
-    "choices": [
-        {
-            "index": 0,
-            "model": "Nous-Hermes-Llama2-13B",
-            "status": "error: personal identifiable information detected",
-            "text": ""
-        }
-    ],
-    "created": 1715089688,
-    "id": "cmpl-UGgwaUVYHm7jXNmFXrPGuh7OkH2EK",
-    "object": "text_completion"
+    "error": "pii detected"
 }
 ```
 
@@ -197,7 +186,7 @@ messages = [
 ]
 
 result = client.chat.completions.create(
-    model="Neural-Chat-7B",
+    model="neural-chat-7b-v3-3",
     messages=messages,
     input={"pii": "replace", "pii_replace_method": "fake"}
 )
@@ -217,17 +206,15 @@ This will produce an output like the following.
         {
             "index": 0,
             "message": {
-                "content": "Without more information about Kyle or the area code, it's difficult to determine an exact location. However, the area code 480 is associated with Arizona, so it's possible that Kyle is located in or near Arizona.",
-                "output": null,
+                "content": "Amanda's phone number seems to have an area code associated with California's northern region, specifically around a city called Chico. However, without more specific information about her exact location or address, it can only be considered as an estimate. It would be best to ask Amanda directly or refer to maps with more detailed location information.",
                 "role": "assistant"
-            },
-            "status": "success"
+            }
         }
     ],
-    "created": 1716234761,
-    "id": "chat-F34QJfOM771wYxT1YYWkkrOFyTvAg",
-    "model": "Neural-Chat-7B",
-    "object": "chat_completion"
+    "created": 1727888573,
+    "id": "chat-de3c952e-99d7-446e-855f-dc286825e71e",
+    "model": "neural-chat-7b-v3-3",
+    "object": "chat.completion"
 }
 ```
 

diff --git a/fern/docs/pages/input/injection.mdx b/fern/docs/pages/input/injection.mdx
@@ -19,6 +19,7 @@ import json
 
 from predictionguard import PredictionGuard
 
+
 # Set your Prediction Guard token as an environmental variable.
 os.environ["PREDICTIONGUARD_API_KEY"] = "<api key>"
 
@@ -94,10 +95,13 @@ How to detect Injections while using the \completions Endpoint:
 ```python copy
 import os
 import json
-import predictionguard as pg
+
+from predictionguard import PredictionGuard
+
 
 # Set your Prediction Guard token as an environmental variable.
 os.environ["PREDICTIONGUARD_TOKEN"] = "<api key>"
+client = PredictionGuard()
 
 response = client.completions.create(
     model="Hermes-2-Pro-Llama-3-8B",
@@ -111,31 +115,21 @@ response = client.completions.create(
 print(json.dumps(response, sort_keys=True, indent=4, separators=(',', ': ')))
 ```
 
-this will produce the following output:
+this will produce the following ValueError:
 
 ```json copy
-{
-    "choices": [
-        {
-            "index": 0,
-            "model": "Hermes-2-Pro-Llama-3-8B",
-            "status": "error: prompt injection detected",
-            "text": ""
-        }
-    ],
-    "created": 1719588464,
-    "id": "cmpl-wz5Hqz9oKRBIIpXW0xMMjPPKHMVM0",
-    "object": "text_completion"
-}
+ValueError: Could not make prediction. prompt injection detected
 ```
 
 How to detect Injections while using the `\chat\completions`:
 
 ```python copy
 import os
 import json
+
 from predictionguard import PredictionGuard
 
+
 os.environ["PREDICTIONGUARD_API_KEY"] = "<api key>"
 client = PredictionGuard()
 
@@ -151,7 +145,7 @@ messages = [
 ]
 
 result = client.chat.completions.create(
-    model="Neural-Chat-7B",
+    model="neural-chat-7b-v3-3",
     messages=messages,
     input={"block_prompt_injection":True}
 )
@@ -163,24 +157,8 @@ print(json.dumps(
 ))
 ```
 
-this will produce the following output:
+this will produce the following ValueError:
 
 ```json copy
-{
-    "choices": [
-        {
-            "index": 0,
-            "message": {
-                "content": "",
-                "output": null,
-                "role": "assistant"
-            },
-            "status": "error: prompt injection detected"
-        }
-    ],
-    "created": 1719588506,
-    "id": "chat-AGI2UHLHdmQStws5RC4KmtnPlDbvA",
-    "model": "Neural-Chat-7B",
-    "object": "chat_completion"
-}
+ValueError: Could not make prediction. prompt injection detected
 ```
diff --git a/fern/docs/pages/options/enumerations.mdx b/fern/docs/pages/options/enumerations.mdx
@@ -22,9 +22,9 @@ This page provides the list of enumerations used by the Prediction Guard API.
 | Hermes-3-Llama-3.1-70B       | Chat                 | Instruction following or chat-like applications         | [ChatML](/options/prompts#chatml)                      | 20480           | [link](/options/models#hermes-3-llama-3.1-70b)     |
 | Hermes-3-Llama-3.1-8B        | Chat                 | Instruction following or chat-like applications         | [ChatML](/options/prompts#chatml)                      | 10240           | [link](/options/models#hermes-3-llama-3.1-8b)     |
 | Hermes-2-Pro-Llama-3-8B      | Chat                 | Instruction following or chat-like applications         | [ChatML](/options/prompts#chatml)                      | 4096           | [link](/options/models#hermes-2-pro-llama-3-8b)     |
-| Nous-Hermes-Llama2-13B       | Text Generation      | Generating output in response to arbitrary instructions | [Alpaca](/options/prompts#alpaca)                      | 4096           | [link](/options/models#nous-hermes-llama2-13b)      |
+| Nous-Hermes-Llama2-13b       | Text Generation      | Generating output in response to arbitrary instructions | [Alpaca](/options/prompts#alpaca)                      | 4096           | [link](/options/models#nous-hermes-llama2-13b)      |
 | Hermes-2-Pro-Mistral-7B      | Chat                 | Instruction following or chat-like applications         | [ChatML](/options/prompts#chatml)                      | 4096           | [link](/options/models#hermes-2-pro-mistral-7b)     |
-| Neural-Chat-7B               | Chat                 | Instruction following or chat-like applications         | [Neural Chat](/options/prompts#neural-chat)            | 4096           | [link](/options/models#neural-chat-7b)              |
+| neural-chat-7b-v3-3          | Chat                 | Instruction following or chat-like applications         | [Neural Chat](/options/prompts#neural-chat)            | 4096           | [link](/options/models#neural-chat-7b)              |
 | llama-3-sqlcoder-8b          | SQL Query Generation | Generating SQL queries                                  | [Llama-3-SQLCoder](/options/prompts#llama-3-sqlcoder)  | 4096           | [link](/options/models#llama-3-sqlcoder-8b)         |
 | deepseek-coder-6.7b-instruct | Code Generation      | Generating computer code or answering tech questions    | [Deepseek](/options/prompts#deepseek)                  | 4096           | [link](/options/models#deepseek-coder-67b-instruct) |
 

diff --git a/fern/docs/pages/options/models.mdx b/fern/docs/pages/options/models.mdx
@@ -57,7 +57,7 @@ Hermes Pro takes advantage of a special system prompt and multi-turn function
 calling structure with a new chatml role in order to make function calling
 reliable and easy to parse.
 
-## Nous-Hermes-Llama2-13B
+## Nous-Hermes-Llama2-13b
 
 A general use model that combines advanced analytics capabilities with a vast 13
 billion parameter count, enabling it to perform in-depth data analysis and
@@ -112,7 +112,7 @@ Hermes Pro takes advantage of a special system prompt and multi-turn function
 calling structure with a new chatml role in order to make function calling
 reliable and easy to parse. Learn more about prompting below.
 
-## Neural-Chat-7B
+## neural-chat-7b-v3-3
 
 A revolutionary AI model for perfoming digital conversations.
 

diff --git a/fern/docs/pages/output/factuality.mdx b/fern/docs/pages/output/factuality.mdx
@@ -45,7 +45,7 @@ prompt = PromptTemplate(
 context = "California is a state in the Western United States. With over 38.9 million residents across a total area of approximately 163,696 square miles (423,970 km2), it is the most populous U.S. state, the third-largest U.S. state by area, and the most populated subnational entity in North America. California borders Oregon to the north, Nevada and Arizona to the east, and the Mexican state of Baja California to the south; it has a coastline along the Pacific Ocean to the west. "
 
 result = client.completions.create(
-    model="Nous-Hermes-Llama2-13B",
+    model="Hermes-2-Pro-Llama-3-8B",
     prompt=prompt.format(
         context=context,
         question="What is California?"
@@ -78,7 +78,7 @@ caught and Prediction Guard returns an error status.
 
 ```python copy
 result = client.completions.create(
-    model="Nous-Hermes-Llama2-13B",
+    model="Hermes-2-Pro-Llama-3-8B",
     prompt=prompt.format(
         context=context,
         question="Make up something completely fictitious about California. Contradict a fact in the given context."