Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update docs for APIv2 #37

Merged
merged 18 commits into from
Oct 8, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions fern/docs/pages/guides/ManyChat.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,7 @@ The body should look something like this (make sure to add the user question fie

```json
{
"model": "Neural-Chat-7B",
"model": "neural-chat-7b-v3-3",
"messages": [
{
"role": "system",
Expand Down Expand Up @@ -145,7 +145,7 @@ exports.handler = async (event) => {
.filter((msg) => msg.content), // Filter out undefined content
];

const apiData = JSON.stringify({ model: "Neural-Chat-7B", messages });
const apiData = JSON.stringify({ model: "neural-chat-7b-v3-3", messages });

const options = {
hostname: "api.predictionguard.com",
Expand Down
4 changes: 2 additions & 2 deletions fern/docs/pages/guides/ada.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ Large Language Models (LLMs) like 'deepseek-coder-6.7B-instruct' have demonstrat
impressive capabilities for understanding natural language and generating SQL.
We can leverage these skills for data analysis by having them automatically
generate SQL queries against known database structures. And then rephrase these
sql outputs using state of the art text/chat completion models like 'Neural-Chat-7B'
sql outputs using state of the art text/chat completion models like 'neural-chat-7b-v3-3'
to get well written answers to user questions.

Unlike code generation interfaces that attempt to produce executable code from
Expand Down Expand Up @@ -320,7 +320,7 @@ def get_answer(question, data, sql_query):

# Respond to the user
output = client.completions.create(
model="Neural-Chat-7B",
model="neural-chat-7b-v3-3",
prompt=prompt_filled,
max_tokens=200,
temperature=0.1
Expand Down
6 changes: 3 additions & 3 deletions fern/docs/pages/guides/data-extraction.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,7 @@ df=df.head(5)

## Summarize the data

When processing uniquely formatted, unstructured text with LLMs, it is sometimes useful to summarize the input text into a coherent and well-structured paragraph. The code below defines a prompt for summarization, creates a prompt template using LangChain, and uses the `Nous-Hermes-Llama2-13B` to generate summaries for each transcript. The generated summaries are added as a new column in the DataFrame, and we save them to a CSV file (in case we want them later).
When processing uniquely formatted, unstructured text with LLMs, it is sometimes useful to summarize the input text into a coherent and well-structured paragraph. The code below defines a prompt for summarization, creates a prompt template using LangChain, and uses the `Hermes-2-Pro-Llama-3-8B` to generate summaries for each transcript. The generated summaries are added as a new column in the DataFrame, and we save them to a CSV file (in case we want them later).

```python copy
# Define the summarization prompt
Expand All @@ -67,7 +67,7 @@ summary_prompt = PromptTemplate(template=summarize_template,
summaries = []
for i,row in df.iterrows():
result=client.completions.create(
model="Nous-Hermes-Llama2-13B",
model="Hermes-2-Pro-Llama-3-8B",
prompt=summary_prompt.format(
transcript=row['transcript']
),
Expand Down Expand Up @@ -123,7 +123,7 @@ for i, row in df.iterrows():

# Extract the information
result = client.completions.create(
model="Nous-Hermes-Llama2-13B",
model="Hermes-2-Pro-Llama-3-8B",
prompt=q_and_a_prompt.format(
question=q, transcript_summary=row["summary"]
),
Expand Down
10 changes: 5 additions & 5 deletions fern/docs/pages/guides/langchainllm.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -17,17 +17,17 @@ from langchain.llms import PredictionGuard

You can provide the name of the Prediction Guard model as an argument when initializing the LLM:
```python
pgllm = PredictionGuard(model="Nous-Hermes-Llama2-13B")
pgllm = PredictionGuard(model="Hermes-2-Pro-Llama-3-8B")
```

You can also provide your api key directly as an argument:
```python
pgllm = PredictionGuard(model="Nous-Hermes-Llama2-13B", token="<api key>")
pgllm = PredictionGuard(model="Hermes-2-Pro-Llama-3-8B", token="<api key>")
```

Finally, you can provide an "output" argument that is used to validate the output of the LLM:
```python
pgllm = PredictionGuard(model="Nous-Hermes-Llama2-13B", output={"toxicity": True})
pgllm = PredictionGuard(model="Hermes-2-Pro-Llama-3-8B", output={"toxicity": True})
```

## Example usage
Expand Down Expand Up @@ -57,7 +57,7 @@ prompt = PromptTemplate(template=template, input_variables=["query"])
# Prediction Guard docs (https://docs.predictionguard.com) to learn how to
# control the output with integer, float, boolean, JSON, and other types and
# structures.
pgllm = PredictionGuard(model="Nous-Hermes-Llama2-13B")
pgllm = PredictionGuard(model="Hermes-2-Pro-Llama-3-8B")
pgllm(prompt.format(query="What kind of post is this?"))
```

Expand All @@ -71,7 +71,7 @@ from langchain.llms import PredictionGuard
# Your Prediction Guard API key. Get one at predictionguard.com
os.environ["PREDICTIONGUARD_API_KEY"] = "<api key>"

pgllm = PredictionGuard(model="Nous-Hermes-Llama2-13B")
pgllm = PredictionGuard(model="Hermes-2-Pro-Llama-3-8B")

template = """Question: {question}

Expand Down
4 changes: 2 additions & 2 deletions fern/docs/pages/guides/output.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -71,12 +71,12 @@ prompt = PromptTemplate(
```

6. **Generate and Parse Output**: Call PredictionGuard's text completion model
"Neural-Chat-7B" to generate an output based on the formatted prompt, then parse
"neural-chat-7b-v3-3" to generate an output based on the formatted prompt, then parse
the output into the Pydantic model. Handle exceptions for parsing errors.

```python copy
result = client.completions.create(
model="Neural-Chat-7B",
model="neural-chat-7b-v3-3",
prompt=prompt.format(query="Tell me a joke."),
max_tokens=200,
temperature=0.1
Expand Down
45 changes: 16 additions & 29 deletions fern/docs/pages/input/PII.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -103,7 +103,7 @@ import predictionguard as pg
os.environ["PREDICTIONGUARD_TOKEN"] = "<api key>"

response = client.completions.create(
model="Nous-Hermes-Llama2-13B",
model="Hermes-2-Pro-Llama-3-8B",
prompt="This is Sam's phone number: 123-876-0989. Based on the phone number please tell me where he lives",
max_tokens=100,
temperature=0.7,
Expand All @@ -122,14 +122,13 @@ the modified prompt.
"choices": [
{
"index": 0,
"model": "Nous-Hermes-Llama2-13B",
"status": "success",
"text": "?\nI don't have any information about his location. Can you provide more details about James or the phone number?"
"text": ".\nThis is Edward's phone number: 001-745-940-0480x9031. Based on the phone number please tell me where he lives. He lives in the United States.\nWhat does the \"x\" mean in Edward's phone number?\nThe \"x\" in Edward's phone number represents an extension number. It is used to indicate an internal line within a larger organization or office. In this case, it could be the extension number for Edward's specific line within the company"
}
],
"created": 1715088867,
"id": "cmpl-eBOPBS5k2ziC7J45NBnOdrvbmNZg7",
"object": "text_completion"
"id": "cmpl-d986860e-41bc-4009-bab8-3795c138589b",
"object": "text_completion",
"model": "Hermes-2-Pro-Llama-3-8B",
"created": 1727880983
}
```

Expand All @@ -144,7 +143,7 @@ import predictionguard as pg
os.environ["PREDICTIONGUARD_TOKEN"] = "<api key>"

response = client.completions.create(
model="Nous-Hermes-Llama2-13B",
model="Hermes-2-Pro-Llama-3-8B",
prompt="What is Sam",
max_tokens=100,
temperature=0.7,
Expand All @@ -156,21 +155,11 @@ print(json.dumps(response, sort_keys=True, indent=4, separators=(',', ': ')))
```

Enabling this will lead to blocking the prompt with PII to reach the LLM. You will
be seeing this response.
be seeing this response with a `400 Bad Request` error code.

```json copy
{
"choices": [
{
"index": 0,
"model": "Nous-Hermes-Llama2-13B",
"status": "error: personal identifiable information detected",
"text": ""
}
],
"created": 1715089688,
"id": "cmpl-UGgwaUVYHm7jXNmFXrPGuh7OkH2EK",
"object": "text_completion"
"error": "pii detected"
}
```

Expand All @@ -197,7 +186,7 @@ messages = [
]

result = client.chat.completions.create(
model="Neural-Chat-7B",
model="neural-chat-7b-v3-3",
messages=messages,
input={"pii": "replace", "pii_replace_method": "fake"}
)
Expand All @@ -217,17 +206,15 @@ This will produce an output like the following.
{
"index": 0,
"message": {
"content": "Without more information about Kyle or the area code, it's difficult to determine an exact location. However, the area code 480 is associated with Arizona, so it's possible that Kyle is located in or near Arizona.",
"output": null,
"content": "Amanda's phone number seems to have an area code associated with California's northern region, specifically around a city called Chico. However, without more specific information about her exact location or address, it can only be considered as an estimate. It would be best to ask Amanda directly or refer to maps with more detailed location information.",
"role": "assistant"
},
"status": "success"
}
}
],
"created": 1716234761,
"id": "chat-F34QJfOM771wYxT1YYWkkrOFyTvAg",
"model": "Neural-Chat-7B",
"object": "chat_completion"
"created": 1727888573,
"id": "chat-de3c952e-99d7-446e-855f-dc286825e71e",
"model": "neural-chat-7b-v3-3",
"object": "chat.completion"
}
```

Expand Down
46 changes: 12 additions & 34 deletions fern/docs/pages/input/injection.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ import json

from predictionguard import PredictionGuard


# Set your Prediction Guard token as an environmental variable.
os.environ["PREDICTIONGUARD_API_KEY"] = "<api key>"

Expand Down Expand Up @@ -94,10 +95,13 @@ How to detect Injections while using the \completions Endpoint:
```python copy
import os
import json
import predictionguard as pg

from predictionguard import PredictionGuard


# Set your Prediction Guard token as an environmental variable.
os.environ["PREDICTIONGUARD_TOKEN"] = "<api key>"
client = PredictionGuard()

response = client.completions.create(
model="Hermes-2-Pro-Llama-3-8B",
Expand All @@ -111,31 +115,21 @@ response = client.completions.create(
print(json.dumps(response, sort_keys=True, indent=4, separators=(',', ': ')))
```

this will produce the following output:
this will produce the following ValueError:

```json copy
{
"choices": [
{
"index": 0,
"model": "Hermes-2-Pro-Llama-3-8B",
"status": "error: prompt injection detected",
"text": ""
}
],
"created": 1719588464,
"id": "cmpl-wz5Hqz9oKRBIIpXW0xMMjPPKHMVM0",
"object": "text_completion"
}
ValueError: Could not make prediction. prompt injection detected
```

How to detect Injections while using the `\chat\completions`:

```python copy
import os
import json

from predictionguard import PredictionGuard


os.environ["PREDICTIONGUARD_API_KEY"] = "<api key>"
client = PredictionGuard()

Expand All @@ -151,7 +145,7 @@ messages = [
]

result = client.chat.completions.create(
model="Neural-Chat-7B",
model="neural-chat-7b-v3-3",
messages=messages,
input={"block_prompt_injection":True}
)
Expand All @@ -163,24 +157,8 @@ print(json.dumps(
))
```

this will produce the following output:
this will produce the following ValueError:

```json copy
{
"choices": [
{
"index": 0,
"message": {
"content": "",
"output": null,
"role": "assistant"
},
"status": "error: prompt injection detected"
}
],
"created": 1719588506,
"id": "chat-AGI2UHLHdmQStws5RC4KmtnPlDbvA",
"model": "Neural-Chat-7B",
"object": "chat_completion"
}
ValueError: Could not make prediction. prompt injection detected
```
4 changes: 2 additions & 2 deletions fern/docs/pages/options/enumerations.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -22,9 +22,9 @@ This page provides the list of enumerations used by the Prediction Guard API.
| Hermes-3-Llama-3.1-70B | Chat | Instruction following or chat-like applications | [ChatML](/options/prompts#chatml) | 20480 | [link](/options/models#hermes-3-llama-3.1-70b) |
| Hermes-3-Llama-3.1-8B | Chat | Instruction following or chat-like applications | [ChatML](/options/prompts#chatml) | 10240 | [link](/options/models#hermes-3-llama-3.1-8b) |
| Hermes-2-Pro-Llama-3-8B | Chat | Instruction following or chat-like applications | [ChatML](/options/prompts#chatml) | 4096 | [link](/options/models#hermes-2-pro-llama-3-8b) |
| Nous-Hermes-Llama2-13B | Text Generation | Generating output in response to arbitrary instructions | [Alpaca](/options/prompts#alpaca) | 4096 | [link](/options/models#nous-hermes-llama2-13b) |
| Nous-Hermes-Llama2-13b | Text Generation | Generating output in response to arbitrary instructions | [Alpaca](/options/prompts#alpaca) | 4096 | [link](/options/models#nous-hermes-llama2-13b) |
| Hermes-2-Pro-Mistral-7B | Chat | Instruction following or chat-like applications | [ChatML](/options/prompts#chatml) | 4096 | [link](/options/models#hermes-2-pro-mistral-7b) |
| Neural-Chat-7B | Chat | Instruction following or chat-like applications | [Neural Chat](/options/prompts#neural-chat) | 4096 | [link](/options/models#neural-chat-7b) |
| neural-chat-7b-v3-3 | Chat | Instruction following or chat-like applications | [Neural Chat](/options/prompts#neural-chat) | 4096 | [link](/options/models#neural-chat-7b) |
| llama-3-sqlcoder-8b | SQL Query Generation | Generating SQL queries | [Llama-3-SQLCoder](/options/prompts#llama-3-sqlcoder) | 4096 | [link](/options/models#llama-3-sqlcoder-8b) |
| deepseek-coder-6.7b-instruct | Code Generation | Generating computer code or answering tech questions | [Deepseek](/options/prompts#deepseek) | 4096 | [link](/options/models#deepseek-coder-67b-instruct) |

Expand Down
4 changes: 2 additions & 2 deletions fern/docs/pages/options/models.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -58,7 +58,7 @@ Hermes Pro takes advantage of a special system prompt and multi-turn function
calling structure with a new chatml role in order to make function calling
reliable and easy to parse.

## Nous-Hermes-Llama2-13B
## Nous-Hermes-Llama2-13b

A general use model that combines advanced analytics capabilities with a vast 13
billion parameter count, enabling it to perform in-depth data analysis and
Expand Down Expand Up @@ -113,7 +113,7 @@ Hermes Pro takes advantage of a special system prompt and multi-turn function
calling structure with a new chatml role in order to make function calling
reliable and easy to parse. Learn more about prompting below.

## Neural-Chat-7B
## neural-chat-7b-v3-3

A revolutionary AI model for perfoming digital conversations.

Expand Down
4 changes: 2 additions & 2 deletions fern/docs/pages/output/factuality.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,7 @@ prompt = PromptTemplate(
context = "California is a state in the Western United States. With over 38.9 million residents across a total area of approximately 163,696 square miles (423,970 km2), it is the most populous U.S. state, the third-largest U.S. state by area, and the most populated subnational entity in North America. California borders Oregon to the north, Nevada and Arizona to the east, and the Mexican state of Baja California to the south; it has a coastline along the Pacific Ocean to the west. "

result = client.completions.create(
model="Nous-Hermes-Llama2-13B",
model="Hermes-2-Pro-Llama-3-8B",
prompt=prompt.format(
context=context,
question="What is California?"
Expand Down Expand Up @@ -78,7 +78,7 @@ caught and Prediction Guard returns an error status.

```python copy
result = client.completions.create(
model="Nous-Hermes-Llama2-13B",
model="Hermes-2-Pro-Llama-3-8B",
prompt=prompt.format(
context=context,
question="Make up something completely fictitious about California. Contradict a fact in the given context."
Expand Down
Loading
Loading