updating docs to remove sqlcoder and add new embeddings features #40

Merged
merged 4 commits into from
Oct 17, 2024
4 changes: 4 additions & 0 deletions fern/docs.yml
@@ -94,6 +94,10 @@ navigation:
    layout:
      - api: API Reference
        display-errors: true
      - section: Valid Inputs
        contents:
          - page: Enumerations for API
            path: ./docs/pages/options/enumerations.mdx
      - section: SDK Reference
        contents:
          - page: Chat
2 changes: 1 addition & 1 deletion fern/docs/pages/options/enumerations.mdx
@@ -25,13 +25,13 @@ This page provides the list of enumerations used by the Prediction Guard API.
| Nous-Hermes-Llama2-13b | Text Generation | Generating output in response to arbitrary instructions | [Alpaca](/options/prompts#alpaca) | 4096 | [link](/options/models#nous-hermes-llama2-13b) |
| Hermes-2-Pro-Mistral-7B | Chat | Instruction following or chat-like applications | [ChatML](/options/prompts#chatml) | 4096 | [link](/options/models#hermes-2-pro-mistral-7b) |
| neural-chat-7b-v3-3 | Chat | Instruction following or chat-like applications | [Neural Chat](/options/prompts#neural-chat) | 4096 | [link](/options/models#neural-chat-7b) |
| llama-3-sqlcoder-8b | SQL Query Generation | Generating SQL queries | [Llama-3-SQLCoder](/options/prompts#llama-3-sqlcoder) | 4096 | [link](/options/models#llama-3-sqlcoder-8b) |
| deepseek-coder-6.7b-instruct | Code Generation | Generating computer code or answering tech questions | [Deepseek](/options/prompts#deepseek) | 4096 | [link](/options/models#deepseek-coder-67b-instruct) |

### This Model is required in the `/embeddings` endpoint:

| Model Name | Type | Use Case | Context Length | More Info |
| --------------------------------- | --------------------- | ----------------------------------------------- | -------------- | ------------------------------------------------------|
| multilingual-e5-large-instruct | Embedding Generation | Used for generating text embeddings | 512 | [link](/options/models#multilingual-e5-large-instruct) |
| bridgetower-large-itm-mlm-itc | Embedding Generation | Used for generating text and image embedding | 100 | [link](/options/models#bridgetower-large-itm-mlm-itc) |

### This Model is required in the `/chat/completions` vision endpoint:
95 changes: 53 additions & 42 deletions fern/docs/pages/options/models.mdx
@@ -14,9 +14,9 @@ with an improved focus on longer context lengths. This allows for more accuracy
in areas that require a longer context window, along with being an improved version of the previous
Hermes and Llama line of models.

**Type**: Chat
**Use Case**: Instruction Following or Chat-Like Applications
**Prompt Format**: [ChatML](/options/prompts#chatml)
**Type:** Chat\
**Use Case:** Instruction Following or Chat-Like Applications\
**Prompt Format:** [ChatML](/options/prompts#chatml)\

https://huggingface.co/NousResearch/Hermes-3-Llama-3.1-8B

@@ -37,9 +37,9 @@ A general use model that maintains excellent general task and conversation
capabilities while excelling at JSON Structured Outputs and improving on several
other metrics.

**Type**: Chat
**Use Case**: Instruction Following or Chat-Like Applications
**Prompt Format**: [ChatML](/options/prompts#chatml)
**Type:** Chat\
**Use Case:** Instruction Following or Chat-Like Applications\
**Prompt Format:** [ChatML](/options/prompts#chatml)\

https://huggingface.co/NousResearch/Hermes-2-Pro-Llama-3-8B

@@ -64,9 +64,9 @@ billion parameter count, enabling it to perform in-depth data analysis and
support complex decision-making processes. This model is designed to process
large volumes of data, uncover hidden patterns, and provide actionable insights.

**Type**: Text Generation
**Use Case**: Generating Output in Response to Arbitrary Instructions
**Prompt Format**: [Alpaca](/options/prompts#alpaca)
**Type:** Text Generation\
**Use Case:** Generating Output in Response to Arbitrary Instructions\
**Prompt Format:** [Alpaca](/options/prompts#alpaca)\

https://huggingface.co/NousResearch/Nous-Hermes-Llama2-13b

@@ -92,9 +92,9 @@ excels in delivering accurate and contextually relevant responses, making it ide
for a wide range of applications, including chatbots, language translation,
content creation, and more.

**Type**: Chat
**Use Case**: Instruction Following or Chat-Like Applications
**Prompt Format**: [ChatML](/options/prompts#chatml)
**Type:** Chat\
**Use Case:** Instruction Following or Chat-Like Applications\
**Prompt Format:** [ChatML](/options/prompts#chatml)\

https://huggingface.co/NousResearch/Hermes-2-Pro-Mistral-7B

@@ -114,11 +114,11 @@ reliable and easy to parse. Learn more about prompting below.

## neural-chat-7b-v3-3

A revolutionary AI model for perfoming digital conversations.
A revolutionary AI model for performing digital conversations.

**Type**: Chat
**Use Case**: Instruction Following or Chat-Like Applications
**Prompt Format**: [Neural Chat](/options/prompts#neural-chat)
**Type:** Chat\
**Use Case:** Instruction Following or Chat-Like Applications\
**Prompt Format:** [Neural Chat](/options/prompts#neural-chat)\

https://huggingface.co/Intel/neural-chat-7b-v3-3

@@ -130,27 +130,14 @@ from mistralai/Mistral-7B-v-0.1. For more information, refer to the blog

[The Practice of Supervised Fine-tuning and Direct Preference Optimization on Intel Gaudi2](https://medium.com/@NeuralCompressor/the-practice-of-supervised-finetuning-and-direct-preference-optimization-on-habana-gaudi2-a1197d8a3cd3)

## llama-3-sqlcoder-8b

A state of the art AI model for generating SQL queries from natural language.

**Type**: SQL Query Generation
**Use Case**: Generating SQL Queries
**Prompt Format**: [Llama-3-SQLCoder](/options/prompts#llama-3-sqlcoder)

https://huggingface.co/defog/llama-3-sqlcoder-8b

A capable language model for text to SQL generation for Postgres, Redshift and
Snowflake that is on-par with the most capable generalist frontier models.

## deepseek-coder-6.7b-instruct

DeepSeek Coder is a capable coding model trained on two trillion code and natural
language tokens.

**Type**: Code Generation
**Use Case**: Generating Computer Code or Answering Tech Questions
**Prompt Format**: [Deepseek](/options/prompts#deepseek)
**Type:** Code Generation\
**Use Case:** Generating Computer Code or Answering Tech Questions\
**Prompt Format:** [Deepseek](/options/prompts#deepseek)\

https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct

@@ -163,16 +150,40 @@ support project-level code completion and infilling. For coding capabilities,
Deepseek Coder achieves state-of-the-art performance among open-source code models
on multiple programming languages and various benchmarks.

## multilingual-e5-large-instruct

Multilingual-e5 is a multilingual model for creating text embeddings in multiple languages.

**Type:** Embedding Generation\
**Use Case:** Used for Generating Text Embeddings\

https://huggingface.co/intfloat/multilingual-e5-large-instruct

multilingual-e5-large-instruct is a robust, multilingual embedding model with
560 million parameters and a dimensionality of 1024, capable of processing
inputs with up to 512 tokens. This model builds on the xlm-roberta-large
architecture and is designed to excel in multilingual text embedding tasks
across 100 languages. Trained through a two-stage process, it first undergoes
contrastive pre-training on one billion weakly supervised text pairs, followed
by fine-tuning on diverse multilingual datasets from the E5-mistral paper.

With state-of-the-art performance in text retrieval and semantic similarity,
this model demonstrates impressive results on the BEIR and MTEB benchmarks.
Users should note that task instructions are crucial for optimal performance,
as the model leverages these to customize embeddings for various scenarios.
Although the model generally supports 100 languages, performance may vary
for low-resource languages.
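
Since task instructions matter for this model, here is a minimal sketch of how an instructed query could be built before it is embedded. The `Instruct: ...\nQuery: ...` template follows the convention published on the model's Hugging Face card; the helper name `format_instructed_query` and the sample task and query strings are illustrative assumptions. Whether the Prediction Guard API applies any instruction on your behalf is not stated here, so treat this as something to verify rather than required usage.

```python
def format_instructed_query(task_description: str, query: str) -> str:
    """Prefix a query with a one-sentence task instruction.

    The template mirrors the one on the model's Hugging Face card.
    The function name is a hypothetical helper, not part of any SDK.
    """
    return f"Instruct: {task_description}\nQuery: {query}"


# Illustrative task and query (assumed values, not taken from the docs).
task = "Given a web search query, retrieve relevant passages that answer the query"
instructed_query = format_instructed_query(task, "What is a text embedding?")

# Per the model card, documents being searched are embedded as-is,
# without an instruction prefix.
document = "A text embedding is a vector that represents the meaning of a passage."

print(instructed_query)
```

Only the query side carries the instruction in this sketch, which keeps the document embeddings reusable across different tasks.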

With a training approach that mirrors the English E5 model recipe, it achieves
comparable quality to leading English-only models while offering a multilingual edge.

## bridgetower-large-itm-mlm-itc

BridgeTower is a multimodal model for creating joint embeddings between images
and text.

_**Note: This Model is required to be used with the `/embeddings` endpoint. Most of the
SDKs will not ask you to provide model because it's using this one.**_

**Type**: Embedding Generation
**Use Case**: Used for Generating Text and Image Embedding
**Type:** Embedding Generation\
**Use Case:** Used for Generating Text and Image Embedding\

https://huggingface.co/BridgeTower/bridgetower-large-itm-mlm-itc

@@ -196,8 +207,8 @@ LLaVa is a multimodal model that supports vision and language models combined.
_**This Model is required to be used with the `/chat/completions` vision endpoint.
Most of the SDKs will not ask you to provide model because it's using this one.**_

**Type**: Vision Text Generation
**Use Case**: Used for Generating Text from Text and Image Inputs
**Type:** Vision Text Generation\
**Use Case:** Used for Generating Text from Text and Image Inputs\

https://huggingface.co/llava-hf/llava-1.5-7b-hf

@@ -214,9 +225,9 @@ with an improved focus on longer context lengths. This allows for more accuracy
in areas that require a longer context window, along with being an improved version of the previous
Hermes and Llama line of models.

**Type**: Chat
**Use Case**: Instruction Following or Chat-Like Applications
**Prompt Format**: [ChatML](/options/prompts#chatml)
**Type:** Chat\
**Use Case:** Instruction Following or Chat-Like Applications\
**Prompt Format:** [ChatML](/options/prompts#chatml)\

https://huggingface.co/NousResearch/Hermes-3-Llama-3.1-8B

17 changes: 0 additions & 17 deletions fern/docs/pages/options/prompts.mdx
@@ -90,20 +90,3 @@ appropriate information, and do not keep the curly braces)
{context or user message}
### Response:
```

## Llama-3-SQLCoder

(Replace the portions of the prompt below in curly braces `{...}` with the
appropriate information, and do not keep the curly braces)

```
<|begin_of_text|><|start_header_id|>user<|end_header_id|>

Generate a SQL query to answer this question: {user_question}
{instructions}

DDL statements:
{create_table_statements}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

The following SQL query best answers the question {user_question}:
```
24 changes: 12 additions & 12 deletions fern/docs/pages/usingllms/embeddings.mdx
@@ -4,9 +4,9 @@ subtitle: Embeddings Endpoint

At Prediction Guard, we offer an embeddings endpoint capable of generating embeddings for both text and images. This is particularly useful when you want to load embeddings into a vector database to run semantic similarity searches.
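
To make the similarity-search use case concrete, here is a minimal sketch of ranking stored vectors against a query vector with cosine similarity. It assumes the embeddings have already been retrieved as plain lists of floats; the toy 3-dimensional vectors and the helper names `cosine_similarity` and `rank_by_similarity` are illustrative only. A real vector database would typically do this step for you.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real embeddings (assumed values).
query = [1.0, 0.0, 0.0]
docs = [
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [0.9, 0.1, 0.0],   # nearly the same direction as the query
    [-1.0, 0.0, 0.0],  # opposite direction
]

ranking = rank_by_similarity(query, docs)
print(ranking)
```

Cosine similarity ignores vector length and compares direction only, which is why it is a common default for comparing embeddings.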

The Bridgetower model is a cross-modal encoder that handles both images and text. Here is a simple illustration of how to make a call to the embeddings endpoint with both image and text inputs. This endpoint accepts image URL, local image files, data URIs, and base64 encoded image strings as input.
## Text

## Embeddings for text and image
The multilingual-e5-large-instruct model is a lightweight embeddings model capable of embedding text. It supports 100 languages and a context length of 512. Here is a simple example of how to make a call to the embeddings endpoint using this model.

```Python
import os
@@ -20,13 +20,8 @@ os.environ["PREDICTIONGUARD_API_KEY"] = "<api key>"
client = PredictionGuard()

response = client.embeddings.create(
    model="bridgetower-large-itm-mlm-itc",
    input=[
        {
            "text": "Cool skateboarding tricks you can try this summer",
            "image": "https://farm4.staticflickr.com/3300/3497460990_11dfb95dd1_z.jpg"
        }
    ]
    model="multilingual-e5-large-instruct",
    input="I love to learn and use LLMs."
)

print(json.dumps(
@@ -39,7 +34,9 @@ print(json.dumps(

This will yield a JSON object with the embedding.

## Embeddings for text only
## Multimodal

The Bridgetower model is a cross-modal encoder that handles both images and text. Here is a simple illustration of how to make a call to the embeddings endpoint with both image and text inputs. This endpoint accepts image URL, local image files, data URIs, and base64 encoded image strings as input.

```Python
import os
@@ -56,7 +53,8 @@ response = client.embeddings.create(
model="bridgetower-large-itm-mlm-itc",
    input=[
        {
            "text": "Tell me a joke.",
            "text": "Cool skateboarding tricks you can try this summer",
            "image": "https://farm4.staticflickr.com/3300/3497460990_11dfb95dd1_z.jpg"
        }
    ]
)
@@ -69,7 +67,9 @@ print(json.dumps(
))
```

## Embeddings for Image only
This will yield a JSON object with the embedding.
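
The multimodal section notes that the endpoint accepts base64 encoded image strings and data URIs as well as URLs. As a rough sketch of how those two forms could be prepared with the Python standard library, the helpers below encode raw image bytes. The names `to_base64` and `to_data_uri`, the file name `example.png`, and the stand-in bytes are all assumptions for illustration; in practice you would read the bytes from a real image file.

```python
import base64
import mimetypes


def to_base64(image_bytes: bytes) -> str:
    """Encode raw image bytes as a base64 string."""
    return base64.b64encode(image_bytes).decode("ascii")


def to_data_uri(image_bytes: bytes, filename: str) -> str:
    """Build a data URI, guessing the MIME type from the file name."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None:
        mime_type = "application/octet-stream"  # fallback for unknown extensions
    return f"data:{mime_type};base64,{to_base64(image_bytes)}"


# Stand-in bytes (the PNG signature only); replace with the contents of a
# real file, e.g. open("example.png", "rb").read().
sample_bytes = b"\x89PNG\r\n\x1a\n"

encoded = to_base64(sample_bytes)
data_uri = to_data_uri(sample_bytes, "example.png")

print(encoded)
print(data_uri)
```

Either string could then be supplied as the `"image"` value in the `input` list, in place of the URL used in the example above.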

### Embeddings for Image only

```Python
import os