diff --git a/fern/docs/pages/models/details.mdx b/fern/docs/pages/models/details.mdx
index d12c8d7..660e343 100644
--- a/fern/docs/pages/models/details.mdx
+++ b/fern/docs/pages/models/details.mdx
@@ -16,7 +16,7 @@ LLMs are hosted by Prediction Guard in a secure, privacy conserving environment
 
 Open access models are amazing these days! Each of these models was trained by a talented team and released publicly under a permissive license. The data used to train each model and the prompt formatting for each model varies. We've tried to give you some of the relevant details here, but shoot us a message [in Slack](support) with any questions.
 
-### The best models (start here)
+### Models available in `/completions` and `/chat/completions` endpoints
 
 | Model Name | Type | Use Case | Prompt Format | Context Length | More Info |
 | ---------------------------- | --------------- | ------------------------------------------------------- | ---------------------------------- | -------------- | ----------------------------------------------------------------------- |
@@ -25,53 +25,4 @@ Open access models are amazing these days! Each of these models was trained by a
 | Hermes-2-Pro-Mistral-7B | Chat | Instruction following or chat-like applications | [ChatML](prompts#chatml) | 4096 | [link](https://huggingface.co/NousResearch/Hermes-2-Pro-Mistral-7B) |
 | Neural-Chat-7B | Chat | Instruction following or chat-like applications | [Neural Chat](prompts#neural-chat) | 4096 | [link](https://huggingface.co/Intel/neural-chat-7b-v3-1) |
 | Yi-34B-Chat | Chat | Instruction following in English or Chinese | [ChatML](prompts#chatml) | 2048 | [link](https://huggingface.co/01-ai/Yi-34B-Chat) |
-| deepseek-coder-6.7b-instruct | Code Generation | Generating computer code or answering tech questions | [Deepseek](prompts#deepseek) | 4096 | [link](https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct) |
-
-### Other models available
-
-The models below are available in our API. However, these models scale to zero (i.e., they might not be ready for you to interact with). These models are less frequently accessed by our users, so we suggest you start with the models above. If your company requires one of these models to be up-and-running 24/7. [Reach out to us](support), and we will help make that happen!
-
-| Model Name | Model Card | Parameters | Context Length |
-| ---------------------------- | --------------------------------------------------------------------------------- | ---------- | -------------- |
-| Llama-2-13B | [link](https://huggingface.co/meta-llama/Llama-2-13b-hf) | 13B | 4096 |
-| Llama-2-7B | [link](https://huggingface.co/meta-llama/Llama-2-7b-hf) | 7B | 4096 |
-| Nous-Hermes-Llama2-7B | [link](https://huggingface.co/NousResearch/Nous-Hermes-Llama2-7b) | 7B | 4096 |
-| Camel-5B | [link](https://huggingface.co/Writer/camel-5b-hf) | 5B | 2048 |
-| Dolly-3B | [link](https://huggingface.co/databricks/dolly-v2-3b) | 3B | 2048 |
-| Dolly-7B | [link](https://huggingface.co/databricks/dolly-v2-7b) | 7B | 2048 |
-| Falcon-7B-Instruct | [link](https://huggingface.co/tiiuae/falcon-7b-instruct) | 7B | 2048 |
-| h2oGPT-6_9B | [link](https://huggingface.co/h2oai/h2ogpt-oig-oasst1-512-6_9b) | 6.9B | 2048 |
-| MPT-7B-Instruct | [link](https://huggingface.co/mosaicml/mpt-7b-instruct) | 7B | 4096 |
-| Pythia-6_9-Deduped | [link](https://huggingface.co/EleutherAI/pythia-6.9b-deduped) | 6.9B | 2048 |
-| RedPajama-INCITE-Instruct-7B | [link](https://huggingface.co/togethercomputer/RedPajama-INCITE-Instruct-7B-v0.1) | 7B | 2048 |
-| WizardCoder | [link](https://huggingface.co/WizardLM/WizardCoder-15B-V1.0) | 15.5B | 8192 |
-| StarCoder | [link](https://huggingface.co/bigcode/starcoder) | 15.5B | 8192 |
-
-import { Callout } from "nextra-theme-docs";
-
-
-  Note if you aren't actively using these models, they are scaled down. As such,
-  your first call to a model might need to "wake up" that model inference
-  server. You will get a message "Waking up model. Try again in a few minutes."
-  in such cases. Typically it takes around 5-15 minutes to wake up the model
-  server depending on the size of the model. We are actively working on reducing
-  these cold start times.
-
-
-## Closed LLMs (if you t̶r̶u̶s̶t̶ need them)
-
-These models are integrated into our API, but they are not hosted by Prediction Guard in the same manner as the models above.
-
-**Note - You will need your own OpenAI API key to use the models below. Customers worried about data privacy, IP/PII leakage, HIPAA compliance, etc. should look into the above "Open Access LLMs" and/or our enterprise deploy. [Contact support](support) with any questions.**
-
-| Model Name | Generation | Context Length |
-| ----------------------------- | ---------- | -------------- |
-| OpenAI-gpt-3.5-turbo-instruct | GPT-3.5 | 4097 |
-| OpenAI-davinci-002 | GPT-3.5 | 4097 |
-| OpenAI-babbage-002 | GPT-3 | 2049 |
-
-
-  To use the OpenAI models above, make sure you either: (1) define the
-  environment variable `OPENAI_API_KEY` if you are using the Python client; or
-  (2) set the header parameter `OpenAI-ApiKey` if you are using the REST API.
-
+| deepseek-coder-6.7b-instruct | Code Generation | Generating computer code or answering tech questions | [Deepseek](prompts#deepseek) | 4096 | [link](https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct) |
\ No newline at end of file