Merge pull request #5 from predictionguard/remove-models
removing old models
jmansdorfer authored May 3, 2024
2 parents 1badf56 + 83f480b commit f0e5332
Showing 1 changed file with 2 additions and 51 deletions.
53 changes: 2 additions & 51 deletions fern/docs/pages/models/details.mdx
@@ -16,7 +16,7 @@ LLMs are hosted by Prediction Guard in a secure, privacy conserving environment

Open access models are amazing these days! Each of these models was trained by a talented team and released publicly under a permissive license. The data used to train each model and the prompt formatting for each model varies. We've tried to give you some of the relevant details here, but shoot us a message [in Slack](support) with any questions.

### The best models (start here)
### Models available in `/completions` and `/chat/completions` endpoints

| Model Name | Type | Use Case | Prompt Format | Context Length | More Info |
| ---------------------------- | --------------- | ------------------------------------------------------- | ---------------------------------- | -------------- | ----------------------------------------------------------------------- |
@@ -25,53 +25,4 @@ Open access models are amazing these days! Each of these models was trained by a
| Hermes-2-Pro-Mistral-7B | Chat | Instruction following or chat-like applications | [ChatML](prompts#chatml) | 4096 | [link](https://huggingface.co/NousResearch/Hermes-2-Pro-Mistral-7B) |
| Neural-Chat-7B | Chat | Instruction following or chat-like applications | [Neural Chat](prompts#neural-chat) | 4096 | [link](https://huggingface.co/Intel/neural-chat-7b-v3-1) |
| Yi-34B-Chat | Chat | Instruction following in English or Chinese | [ChatML](prompts#chatml) | 2048 | [link](https://huggingface.co/01-ai/Yi-34B-Chat) |
| deepseek-coder-6.7b-instruct | Code Generation | Generating computer code or answering tech questions | [Deepseek](prompts#deepseek) | 4096 | [link](https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct) |

### Other models available

The models below are available in our API. However, these models scale to zero (i.e., they might not be ready for you to interact with). These models are less frequently accessed by our users, so we suggest you start with the models above. If your company requires one of these models to be up and running 24/7, [reach out to us](support), and we will help make that happen!

| Model Name | Model Card | Parameters | Context Length |
| ---------------------------- | --------------------------------------------------------------------------------- | ---------- | -------------- |
| Llama-2-13B | [link](https://huggingface.co/meta-llama/Llama-2-13b-hf) | 13B | 4096 |
| Llama-2-7B | [link](https://huggingface.co/meta-llama/Llama-2-7b-hf) | 7B | 4096 |
| Nous-Hermes-Llama2-7B | [link](https://huggingface.co/NousResearch/Nous-Hermes-Llama2-7b) | 7B | 4096 |
| Camel-5B | [link](https://huggingface.co/Writer/camel-5b-hf) | 5B | 2048 |
| Dolly-3B | [link](https://huggingface.co/databricks/dolly-v2-3b) | 3B | 2048 |
| Dolly-7B | [link](https://huggingface.co/databricks/dolly-v2-7b) | 7B | 2048 |
| Falcon-7B-Instruct | [link](https://huggingface.co/tiiuae/falcon-7b-instruct) | 7B | 2048 |
| h2oGPT-6_9B | [link](https://huggingface.co/h2oai/h2ogpt-oig-oasst1-512-6_9b) | 6.9B | 2048 |
| MPT-7B-Instruct | [link](https://huggingface.co/mosaicml/mpt-7b-instruct) | 7B | 4096 |
| Pythia-6_9-Deduped | [link](https://huggingface.co/EleutherAI/pythia-6.9b-deduped) | 6.9B | 2048 |
| RedPajama-INCITE-Instruct-7B | [link](https://huggingface.co/togethercomputer/RedPajama-INCITE-Instruct-7B-v0.1) | 7B | 2048 |
| WizardCoder | [link](https://huggingface.co/WizardLM/WizardCoder-15B-V1.0) | 15.5B | 8192 |
| StarCoder | [link](https://huggingface.co/bigcode/starcoder) | 15.5B | 8192 |

import { Callout } from "nextra-theme-docs";

<Callout type="info" emoji="ℹ️">
Note if you aren't actively using these models, they are scaled down. As such,
your first call to a model might need to "wake up" that model inference
server. You will get a message "Waking up model. Try again in a few minutes."
in such cases. Typically it takes around 5-15 minutes to wake up the model
server depending on the size of the model. We are actively working on reducing
these cold start times.
</Callout>

## Closed LLMs (if you t̶r̶u̶s̶t̶ need them)

These models are integrated into our API, but they are not hosted by Prediction Guard in the same manner as the models above.

**Note - You will need your own OpenAI API key to use the models below. Customers worried about data privacy, IP/PII leakage, HIPAA compliance, etc. should look into the above "Open Access LLMs" and/or our enterprise deploy. [Contact support](support) with any questions.**

| Model Name | Generation | Context Length |
| ----------------------------- | ---------- | -------------- |
| OpenAI-gpt-3.5-turbo-instruct | GPT-3.5 | 4097 |
| OpenAI-davinci-002 | GPT-3.5 | 4097 |
| OpenAI-babbage-002 | GPT-3 | 2049 |

<Callout type="info" emoji="ℹ️">
To use the OpenAI models above, make sure you either: (1) define the
environment variable `OPENAI_API_KEY` if you are using the Python client; or
(2) set the header parameter `OpenAI-ApiKey` if you are using the REST API.
</Callout>
| deepseek-coder-6.7b-instruct | Code Generation | Generating computer code or answering tech questions | [Deepseek](prompts#deepseek) | 4096 | [link](https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct) |
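
The callout removed in this commit describes a cold-start behaviour: a scaled-down model answers with "Waking up model. Try again in a few minutes." and typically needs around 5-15 minutes to become ready. A minimal sketch of how a caller might have retried around that message is shown here. It assumes the message arrives as ordinary response text, which the diff does not confirm. `call_model` is a hypothetical zero-argument function standing in for whatever client call you use; it is not part of any Prediction Guard client.

```python
import time
from typing import Callable

# Prefix of the cold-start message quoted in the removed callout.
WAKING_MESSAGE = "Waking up model"


def call_with_wakeup_retry(
    call_model: Callable[[], str],      # hypothetical: performs one request, returns the reply text
    max_wait_seconds: float = 15 * 60,  # upper end of the 5-15 minute range in the callout
    poll_seconds: float = 60.0,         # assumed polling interval, not taken from the docs
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call `call_model` until the reply is no longer the cold-start message.

    Raises TimeoutError once the total time slept reaches `max_wait_seconds`
    and the model still reports that it is waking up.
    """
    waited = 0.0
    while True:
        reply = call_model()
        if WAKING_MESSAGE not in reply:
            return reply
        if waited >= max_wait_seconds:
            raise TimeoutError(f"model still waking up after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real delays. If the service reports the cold start as an error status rather than as reply text, the check inside the loop would need to change accordingly.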
