This repository has been archived by the owner on Feb 12, 2024. It is now read-only.

Add API docs (#116)
emrgnt-cmplxty authored Oct 31, 2023
1 parent 8bbc919 commit 2706bd4
Showing 18 changed files with 256 additions and 148 deletions.
11 changes: 8 additions & 3 deletions .gitignore
@@ -1,4 +1,4 @@
# Local cruft and other
# Crufty EXTs
poetry.lock
.env
.vscode
@@ -8,12 +8,17 @@ poetry.lock
*.sh
**/__pycache__/**
**/.DS_Store
storage/
textbooks/


# Local sandbox environments
playground/
outputs/
dump/
sciphi/_version.py
textbooks/

# Scraped data
sciphi/library_of_phi/raw_data/

# Built Docs
docs/build
49 changes: 39 additions & 10 deletions README.md
@@ -72,7 +72,7 @@ After entering your settings, ensure you save and exit the file.
SciPhi supports multiple LLM providers (e.g. OpenAI, Anthropic, HuggingFace, and vLLM) and RAG providers (e.g. SciPhi). The framework supports seamless integration of these providers. To run an example completion with SciPhi, execute:

```bash
python -m sciphi.scripts.sciphi_gen_completion -llm_provider_name=sciphi --llm_api_key=YOUR_SCIPHI_API_KEY --llm_api_base=https://api.sciphi.ai/v1 --rag_api_base=https://api.sciphi.ai --llm_model_name=SciPhi/SciPhi-Self-RAG-Mistral-7B-32k --query="Write a few paragraphs on general relativity. Include the mathematical definition of Einsteins field equation in your writeup."
python -m sciphi.scripts.sciphi_chat --llm_model_name=SciPhi/SciPhi-Self-RAG-Mistral-7B-32k --query="Write a few paragraphs on general relativity. Include the mathematical definition of Einstein's field equation in your writeup."
```

### Configurable Data Generation
@@ -83,13 +83,7 @@ Use SciPhi to generate datasets tailored to your specifications. By running the
python -m sciphi.scripts.data_augmenter --config-path=$PWD/sciphi/config/prompts/question_and_answer.yaml --config_name=None --n_samples=1
```

Inspect the output of this command:

```bash
tail augmented_output/config_name__question_and_answer_dataset_name__ContextualAI_tiny-wiki100-chunks.jsonl
```

Sample Output:
Inspecting the output of this command yields:

```bash
{"question": "What is the reaction called when alcohol and carboxylic acids react?", "answer": "Fischer esterification"}
@@ -134,7 +128,7 @@ This is an effort to democratize access to top-tier textbooks. This can readily
4. **Custom Settings & RAG Functionality**:

Simply switch `rag-enabled` to `True`. Ensure you have the right `.env` variables set up, or provide CLI values for `rag_api_base` and `rag_api_key`.

Alternatively, you may provide your own custom settings in a YAML file. See the [default settings configuration here](sciphi/config/generation_settings/textbook_generation_settings.yaml).

_Important:_ To make the most out of grounding your data with Wikipedia, ensure your system matches our detailed specifications. An example RAG provider can be seen [here](https://github.com/SciPhi-AI/sciphi/blob/main/sciphi/interface/rag/sciphi_wiki.py). More high-quality textbooks are available [here](https://github.com/SciPhi-AI/library-of-phi).
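
You can also supply your own retrieval backend. Below is a minimal, hypothetical sketch of a custom RAG provider: only the `get_contexts` call is taken from the documented examples, and the class and constructor here are illustrative rather than SciPhi's actual base interface (see the linked `sciphi_wiki.py` for the real one):

```python
from typing import List


# Hypothetical, minimal RAG provider sketch. Only `get_contexts` is taken
# from the documented examples; everything else here is illustrative.
class KeywordRAGInterface:
    def __init__(self, documents: List[str]):
        self.documents = documents

    def get_contexts(self, query: str) -> List[str]:
        # Naive keyword overlap; a real provider would use embedding search.
        words = set(query.lower().split())
        return [doc for doc in self.documents if words & set(doc.lower().split())]
```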
@@ -153,7 +147,42 @@ This example evaluates your RAG over 100 science multiple-choice questions and r

## Development

### Example - Instantiate your own LLM and RAG provider
### Basic Example - Generate a chat completion with SciPhi

Here's how you can use SciPhi to quickly set up and retrieve chat completions, without diving deep into intricate configurations:
```python
from sciphi.interface import (
    SciPhiFormatter,
    SciPhiLLMInterface,
    SciPhiWikiRAGInterface,
)
from sciphi.llm import GenerationConfig

# SciPhi RAG interface
# Supports calls like `contexts = rag_interface.get_contexts(query)`
rag_interface = SciPhiWikiRAGInterface()

# SciPhi LLM interface
llm_interface = SciPhiLLMInterface(rag_interface)

# Build the conversation for a given query
query: str = "Who is the president of the United States?"
conversation = [{"role": "user", "content": query}]

# Configure generation; the model name mirrors the examples above
generation_config = GenerationConfig(
    model_name="SciPhi/SciPhi-Self-RAG-Mistral-7B-32k",
    stop_token=SciPhiFormatter.INIT_PARAGRAPH_TOKEN,
    # Pass in any other generation settings here
)

# Get the chat completion for the conversation
completion = llm_interface.get_chat_completion(
    conversation, generation_config
)
print(completion)
# The current President of the United States is Joe Biden.
```
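
Note how the RAG interface is injected into the LLM interface: each chat completion can then be grounded in contexts retrieved for the query, rather than generated from model parameters alone.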
### Advanced Example - Instantiate your own LLM and RAG provider
Here's an example of how you can instantiate your own LLM and RAG provider using SciPhi:

106 changes: 51 additions & 55 deletions docs/source/api/main.rst
@@ -1,14 +1,12 @@

SciPhi API Documentation
========================

Welcome to the SciPhi API documentation. Here, you'll find a detailed guide on how to use the different endpoints provided by the SciPhi service. This API allows you to interact with the powerful functionalities of the SciPhi codebase, bringing the power of large language models directly to your applications.
Welcome to the SciPhi API documentation. Here, you'll find a detailed guide on how to use the different endpoints provided by the SciPhi service. This API allows you to interact with the functionalities of the SciPhi codebase and associated AI. SciPhi aims to become a powerful tool for exploring the world's knowledge, and we hope you enjoy using it!

Endpoint Overview
-----------------

1. **Search**: This endpoint allows you to use the Retriever to fetch related documents from a given set of queries. Meta's `Contreiver` embeddings are used in this process. Currently just Wikipedia is embedded, but the goal is to scale this to a comprehensive database embedded via recent SOTA methods.
1. **Search**: This endpoint allows you to fetch related documents based on a set of queries. The documents are retrieved by re-ranked similarity search over embeddings produced by the `facebook/contriever <https://huggingface.co/facebook/contriever>`_ model. As of now, only Wikipedia is embedded, but there are plans to expand this to a more comprehensive corpus using state-of-the-art embedding methods.
2. **OpenAI Formatted LLM Request (v1)**: SciPhi models are served via an API that is compatible with the OpenAI API.

Detailed Endpoint Descriptions
@@ -22,75 +20,73 @@ Search Endpoint
- **Description**: This endpoint interacts with the Retriever module of the SciPhi codebase, allowing you to search for related documents based on the provided queries.

**Request Body**:
- ``queries``: List of query strings for which related documents are to be retrieved.
- ``queries``: A list of query strings for which related documents should be retrieved.
- ``top_k``: (Optional) The number of top related documents you wish to retrieve for each query.

**Response**:
A list of lists containing Document objects, where each list corresponds to the related documents for a specific query.

**Example**:

.. code-block:: bash

   curl -X POST https://api.sciphi.ai/search \
     -H "Authorization: Bearer $SCIPHI_API_KEY" \
     -H "Content-Type: application/json" \
     -d '{"queries": ["What is general relativity?", "Who is Albert Einstein?"], "top_k": 5}'

This request queries the SciPhi World Database. The expected response is:

.. code-block:: none

   [[{"id":14678539,"title":"General Relativity and Gravitation","text":"General Relativity and Gravitation General Re ...
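
The same search request can be issued from Python. Below is a minimal sketch, assuming the third-party ``requests`` package is installed and ``SCIPHI_API_KEY`` is exported in your environment; the payload mirrors the curl example above:

.. code-block:: python

   import os

   import requests

   # Mirrors the curl example above; assumes SCIPHI_API_KEY is exported.
   response = requests.post(
       "https://api.sciphi.ai/search",
       headers={
           "Authorization": f"Bearer {os.environ['SCIPHI_API_KEY']}",
           "Content-Type": "application/json",
       },
       json={
           "queries": ["What is general relativity?", "Who is Albert Einstein?"],
           "top_k": 5,
       },
   )
   # Each inner list holds the documents related to one query.
   documents_per_query = response.json()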
Alternatively, with the SciPhi framework, you may execute a generation as shown:

.. code-block:: python

   from sciphi.interface import (
       LLMInterfaceManager,
       LLMProviderName,
       RAGInterfaceManager,
       SciPhiFormatter,
   )
   from sciphi.llm import GenerationConfig

   # LLM provider settings -- the `llm_*` variables and `rag_interface`
   # below are assumed to come from your CLI arguments or configuration.
   llm_interface = LLMInterfaceManager.get_interface_from_args(
       LLMProviderName(llm_provider_name),
       api_key=llm_api_key,
       api_base=llm_api_base,
       rag_interface=rag_interface,
       model_name=llm_model_name,
   )

   # Set up typical LLM generation settings
   generation_config = GenerationConfig(
       temperature=llm_temperature,
       top_k=llm_top_k,
       max_tokens_to_sample=llm_max_tokens_to_sample,
       model_name=llm_model_name,
       skip_special_tokens=llm_skip_special_tokens,
       stop_token=SciPhiFormatter.INIT_PARAGRAPH_TOKEN,
   )

   # Get the completion for a prompt string you define
   completion = llm_interface.get_completion(prompt, generation_config)

SciPhi v1 Endpoints
~~~~~~~~~~~~~~~~~~~

- **URL**: ``/v1/{path:path}``
- **Method**: ``GET``, ``POST``, ``PUT``, ``DELETE``
- **Description**: This endpoint forwards requests to an OpenAI-compatible backend, such as vLLM, acting as middleware that lets you use other services while managing access through the SciPhi API.

SciPhi adheres to the API specification of OpenAI's API, allowing compatibility with any application designed for the OpenAI API. Below is an example curl command:
**Example**:

.. code-block:: bash

   curl https://api.sciphi.ai/v1/completions \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer $SCIPHI_API_KEY" \
     -d '{
        "model": "SciPhi/SciPhi-Self-RAG-Mistral-7B-32k",
        "prompt": "Say this is a test.",
        "temperature": 0.7
     }'


After executing the above request with the SciPhi/SciPhi-Self-RAG-Mistral-7B-32k model, the expected response is:

.. code-block:: json

   {
     "id": "cmpl-f03f53c15a174ffe89bdfc83507de7a9",
     "object": "text_completion",
     "created": 1698730137,
     "model": "SciPhi/SciPhi-Self-RAG-Mistral-7B-32k",
     "choices": [
       {
         "index": 0,
         "text": "This is a test.",
         "logprobs": null,
         "finish_reason": "length"
       }
     ],
     "usage": {
       "prompt_tokens": 7,
       "total_tokens": 23,
       "completion_tokens": 16
     }
   }
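
Because the endpoint follows the OpenAI API specification, any OpenAI-style client should work against it. As a dependency-light sketch (assuming only the third-party ``requests`` package), the same completion can be requested from Python:

.. code-block:: python

   import os

   import requests

   # Mirrors the curl example above; assumes SCIPHI_API_KEY is exported.
   response = requests.post(
       "https://api.sciphi.ai/v1/completions",
       headers={
           "Authorization": f"Bearer {os.environ['SCIPHI_API_KEY']}",
           "Content-Type": "application/json",
       },
       json={
           "model": "SciPhi/SciPhi-Self-RAG-Mistral-7B-32k",
           "prompt": "Say this is a test.",
           "temperature": 0.7,
       },
   )
   print(response.json()["choices"][0]["text"])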

API Key and Signup
------------------

To access the SciPhi API, you will need an API key. If you don't have one, you can sign up `here <https://www.sciphi.ai/signup>`_. Ensure you include the API key in your request headers as shown in the examples.
To access the SciPhi API, you need an API key. If you don't possess one, you can sign up `here <https://www.sciphi.ai/signup>`_. Ensure you include the API key in your request headers as shown in the examples.
Binary file modified docs/source/assets/logos/sciphi.png
29 changes: 14 additions & 15 deletions docs/source/index.rst
@@ -1,4 +1,4 @@
Welcome to SciPHi!
SciPhi [ΨΦ]: AI's Knowledge Engine 💡
=====================================

.. image:: https://github.com/emrgnt-cmplxty/sciphi/assets/68796651/195367d8-54fd-4281-ace0-87ea8523f982
@@ -21,18 +21,16 @@ Welcome to SciPHi!
</p>


SciPhi [ΨΦ]: AI's Knowledge Engine 💡
-----------------------------------------------------------------

SciPhi is a powerful knowledge engine tailored for LLM-based data generation and management.
SciPhi is a powerful knowledge engine tailored for LLM-based inference, data generation and management.

With SciPhi, you can:

* Generate datasets using various LLMs, supporting **Anthropic**, **OpenAI**, **vLLM**, and **SciPhi**.
* Tap into the **Retriever-Augmented Generation (RAG)** for data anchoring to real-world sources.
- Features like end-to-end cloud and local RAG knowledge engine APIs are underway!
* Generate truthful datasets using various LLMs, supporting **Anthropic**, **OpenAI**, **vLLM**, and **SciPhi**.
* Custom tailor your data creation for applications such as LLM training, RAG, and beyond.
- For instance, the in-built textbook module can generate RAG-enhanced textbooks from a given table of contents.
- For example, generate RAG-grounded textbooks from a given table of contents.

Quick and easy setup:

@@ -46,31 +44,32 @@ Diverse Features:
* Evaluate your RAG systems effectively with the SciPhi evaluation harness.
* Engage with the community on platforms like `Discord <https://discord.gg/j9GxfbxqAe>`_.

Developers can also instantiate their own LLM and RAG providers using the SciPhi framework. The supported LLM providers include popular choices like OpenAI, Anthropic, HuggingFace, and vLLM. For specialized RAG capabilities, SciPhi offers the **World Databasef API** for comprehensive database access.
Developers can also instantiate their own LLM and RAG providers using the SciPhi framework. The supported LLM providers include popular choices like OpenAI, Anthropic, HuggingFace, and vLLM. For specialized RAG capabilities, SciPhi offers the `World Database API <https://sciphi.readthedocs.io/en/latest/api/main.html>`_ for comprehensive knowledge access.

For a detailed setup guide, deeper feature exploration, and developer insights, refer to:

* `SciPhi GitHub Repository <https://github.com/emrgnt-cmplxty/sciphi>`_
* `Example Textbook Generated with SciPhi <https://github.com/SciPhi-AI/sciphi/blob/main/sciphi/data/sample/textbooks/Aerodynamics_of_Viscous_Fluids.md>`_
* `ToC Used for Sample Textbook Generation <https://github.com/SciPhi-AI/sciphi/blob/main/sciphi/data/sample/table_of_contents/Aerodynamics_of_Viscous_Fluids.yaml>`_
* `Default Settings for Textbook Generation <https://github.com/SciPhi-AI/sciphi/blob/main/sciphi/config/generation_settings/textbook_generation_settings.yaml>`_
* `Library of SciPhi Books <https://github.com/SciPhi-AI/library-of-phi/>`_
* `Library of Phi <https://github.com/SciPhi-AI/library-of-phi/>`_


Citing Our Work
---------------

If you're using SciPhi in your research or project, please cite our work:

.. code-block:: plaintext
.. code-block:: none
   @software{SciPhi,
     author = {Colegrove, Owen},
     doi = {Pending},
     month = {09},
     title = {{SciPhi: A Framework for LLM Powered Data}},
     url = {https://github.com/sciphi-ai/sciphi},
     year = {2023}
   }
Documentation
8 changes: 5 additions & 3 deletions docs/source/setup/installation.rst
@@ -42,12 +42,14 @@ After installation, set up your environment to link with supported LLM providers
Here is an example of the configuration in the `.env` file:

.. code-block:: bash

   # To use SciPhi as a provider:
   SCIPHI_API_KEY=your_sciphi_api_key

   # Other providers in SciPhi:
   OPENAI_API_KEY=your_openai_api_key
   ANTHROPIC_API_KEY=your_anthropic_api_key
   HF_TOKEN=your_huggingface_token
   VLLM_API_KEY=your_vllm_api_key
   RAG_API_KEY=your_rag_server_api_key
   RAG_API_BASE=your_rag_api_base_url
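
If you keep these values in a ``.env`` file, one common way to load them into the process environment (an assumption, not a SciPhi requirement) is the third-party ``python-dotenv`` package:

.. code-block:: python

   # Assumes `pip install python-dotenv`; reads .env into os.environ.
   import os

   from dotenv import load_dotenv

   load_dotenv()
   assert os.getenv("SCIPHI_API_KEY") is not None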
@@ -61,10 +63,10 @@ To set up SciPhi for development:

.. code-block:: console
$ git clone https://github.com/emrgnt-cmplxty/sciphi.git
$ git clone https://github.com/SciPhi-AI/sciphi.git
$ cd sciphi
$ pip3 install poetry # If you do not have Poetry installed.
$ poetry install
$ poetry install # Can use `pip install -e .` instead.
$ poetry install -E all_with_extras
Licensing and Acknowledgment
43 changes: 41 additions & 2 deletions docs/source/setup/quickstart.rst
@@ -22,12 +22,51 @@ Before you start, ensure you've installed SciPhi:
pip install sciphi
For additional details, refer to the `installation guide <https://sciphi.readthedocs.io/en/latest/installation.html>`_.
For additional details, refer to the `installation guide <https://sciphi.readthedocs.io/en/latest/setup/installation.html>`_.

Instantiate Your LLM and RAG Provider
-------------------------------------

Here's a simple example of how you can utilize SciPhi to work with your own LLM and RAG provider:
Here's how you can use SciPhi to quickly set up and retrieve chat completions, without diving deep into intricate configurations:

.. code-block:: python

   from sciphi.interface import (
       SciPhiFormatter,
       SciPhiLLMInterface,
       SciPhiWikiRAGInterface,
   )
   from sciphi.llm import GenerationConfig

   # SciPhi RAG interface
   # Supports calls like `contexts = rag_interface.get_contexts(query)`
   rag_interface = SciPhiWikiRAGInterface()

   # SciPhi LLM interface
   llm_interface = SciPhiLLMInterface(rag_interface)

   # Initialize the conversation
   query: str = "Who is the president of the United States?"
   conversation = [{"role": "user", "content": query}]

   # Define the generation configuration
   generation_config = GenerationConfig(
       model_name="SciPhi/SciPhi-Self-RAG-Mistral-7B-32k",
       stop_token=SciPhiFormatter.INIT_PARAGRAPH_TOKEN,
       # Pass in any other desired generation settings here
   )

   # Get the chat completion
   completion = llm_interface.get_chat_completion(
       conversation, generation_config
   )
   print(completion)
   # Expected output: The current President of the United States is Joe Biden.
----

Here's a more advanced example of how you can work configurably with the available LLM and RAG providers:

.. code-block:: python
