This repository has been archived by the owner on Feb 12, 2024. It is now read-only.

Add API docs (#116)
emrgnt-cmplxty authored Oct 31, 2023
1 parent 8bbc919 commit 2706bd4
Showing 18 changed files with 256 additions and 148 deletions.
11 changes: 8 additions & 3 deletions .gitignore
@@ -1,4 +1,4 @@
# Local cruft and other
# Crufty EXTs
poetry.lock
.env
.vscode
@@ -8,12 +8,17 @@ poetry.lock
*.sh
**/__pycache__/**
**/.DS_Store
storage/
textbooks/


# Local sandbox environments
playground/
outputs/
dump/
sciphi/_version.py
textbooks/

# Scraped data
sciphi/library_of_phi/raw_data/

# Built Docs
docs/build
49 changes: 39 additions & 10 deletions README.md
@@ -72,7 +72,7 @@ After entering your settings, ensure you save and exit the file.
SciPhi supports multiple LLM providers (e.g. OpenAI, Anthropic, HuggingFace, and vLLM) and RAG providers (e.g. SciPhi). The framework supports seamless integration of these providers. To run an example completion with SciPhi, execute:

```bash
python -m sciphi.scripts.sciphi_gen_completion -llm_provider_name=sciphi --llm_api_key=YOUR_SCIPHI_API_KEY --llm_api_base=https://api.sciphi.ai/v1 --rag_api_base=https://api.sciphi.ai --llm_model_name=SciPhi/SciPhi-Self-RAG-Mistral-7B-32k --query="Write a few paragraphs on general relativity. Include the mathematical definition of Einsteins field equation in your writeup."
python -m sciphi.scripts.sciphi_chat --llm_model_name=SciPhi/SciPhi-Self-RAG-Mistral-7B-32k --query="Write a few paragraphs on general relativity. Include the mathematical definition of Einstein's field equation in your writeup."
```

### Configurable Data Generation
@@ -83,13 +83,7 @@ Use SciPhi to generate datasets tailored to your specifications. By running the
python -m sciphi.scripts.data_augmenter --config-path=$PWD/sciphi/config/prompts/question_and_answer.yaml --config_name=None --n_samples=1
```

Inspect the output of this command:

```bash
tail augmented_output/config_name__question_and_answer_dataset_name__ContextualAI_tiny-wiki100-chunks.jsonl
```

Sample Output:
Inspecting the output of this command yields:

```bash
{"question": "What is the reaction called when alcohol and carboxylic acids react?", "answer": "Fischer esterification"}
@@ -134,7 +128,7 @@ This is an effort to democratize access to top-tier textbooks. This can readily
4. **Custom Settings & RAG Functionality**:

Simply switch `rag-enabled` to `True`. Ensure you have the right `.env` variables set up, or provide CLI values for `rag_api_base` and `rag_api_key`.

Alternatively, you may provide your own custom settings in a YAML file. See the [default settings configuration here](sciphi/config/generation_settings/textbook_generation_settings.yaml).

_Important:_ To make the most out of grounding your data with Wikipedia, ensure your system matches our detailed specifications. An example RAG provider can be seen [here](https://github.com/SciPhi-AI/sciphi/blob/main/sciphi/interface/rag/sciphi_wiki.py). More high-quality textbooks are available [here](https://github.com/SciPhi-AI/library-of-phi).
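
You can also supply your own retrieval backend. Below is a minimal, hypothetical sketch of a custom RAG provider: only the `get_contexts` call is taken from the documented examples, and the class and constructor here are illustrative rather than SciPhi's actual base interface (see the linked `sciphi_wiki.py` for the real one):

```python
from typing import List


# Hypothetical, minimal RAG provider sketch. Only `get_contexts` is taken
# from the documented examples; everything else here is illustrative.
class KeywordRAGInterface:
    def __init__(self, documents: List[str]):
        self.documents = documents

    def get_contexts(self, query: str) -> List[str]:
        # Naive keyword overlap; a real provider would use embedding search.
        words = set(query.lower().split())
        return [doc for doc in self.documents if words & set(doc.lower().split())]
```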
@@ -153,7 +147,42 @@ This example evaluates your RAG over 100 science multiple-choice questions and r

## Development

### Example - Instantiate your own LLM and RAG provider
### Basic Example - Generate a chat completion with SciPhi

Here's how you can use SciPhi to quickly set up and retrieve chat completions, without diving deep into intricate configurations:
```python
from sciphi.interface import (
    SciPhiFormatter,
    SciPhiLLMInterface,
    SciPhiWikiRAGInterface,
)
from sciphi.llm import GenerationConfig

# SciPhi RAG interface
# Supports calls like `contexts = rag_interface.get_contexts(query)`
rag_interface = SciPhiWikiRAGInterface()

# SciPhi LLM interface
llm_interface = SciPhiLLMInterface(rag_interface)

# Build the conversation for a given query
query: str = "Who is the president of the United States?"
conversation = [{"role": "user", "content": query}]

# Configure generation; the model name mirrors the examples above
generation_config = GenerationConfig(
    model_name="SciPhi/SciPhi-Self-RAG-Mistral-7B-32k",
    stop_token=SciPhiFormatter.INIT_PARAGRAPH_TOKEN,
    # Pass in any other generation settings here
)

# Get the chat completion for the conversation
completion = llm_interface.get_chat_completion(
    conversation, generation_config
)
print(completion)
# The current President of the United States is Joe Biden.
```
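
Note how the RAG interface is injected into the LLM interface: each chat completion can then be grounded in contexts retrieved for the query, rather than generated from model parameters alone.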
### Advanced Example - Instantiate your own LLM and RAG provider
Here's an example of how you can instantiate your own LLM and RAG provider using SciPhi:

106 changes: 51 additions & 55 deletions docs/source/api/main.rst
@@ -1,14 +1,12 @@

SciPhi API Documentation
========================

Welcome to the SciPhi API documentation. Here, you'll find a detailed guide on how to use the different endpoints provided by the SciPhi service. This API allows you to interact with the powerful functionalities of the SciPhi codebase, bringing the power of large language models directly to your applications.
Welcome to the SciPhi API documentation. Here, you'll find a detailed guide on how to use the different endpoints provided by the SciPhi service. This API allows you to interact with the functionalities of the SciPhi codebase and associated AI. SciPhi aims to become a powerful tool for exploring the world's knowledge, and we hope you enjoy using it!

Endpoint Overview
-----------------

1. **Search**: This endpoint allows you to use the Retriever to fetch related documents from a given set of queries. Meta's `Contreiver` embeddings are used in this process. Currently just Wikipedia is embedded, but the goal is to scale this to a comprehensive database embedded via recent SOTA methods.
1. **Search**: This endpoint allows you to fetch related documents based on a set of queries. The documents are retrieved by re-ranked similarity search over embeddings produced by the `facebook/contriever <https://huggingface.co/facebook/contriever>`_ model. As of now, only Wikipedia is embedded, but there are plans to expand this to a more comprehensive corpus using state-of-the-art embedding methods.
2. **OpenAI Formatted LLM Request (v1)**: SciPhi models are served via an API that is compatible with the OpenAI API.

Detailed Endpoint Descriptions
@@ -22,75 +20,73 @@ Search Endpoint
- **Description**: This endpoint interacts with the Retriever module of the SciPhi codebase, allowing you to search for related documents based on the provided queries.

**Request Body**:
- ``queries``: List of query strings for which related documents are to be retrieved.
- ``queries``: A list of query strings for which related documents should be retrieved.
- ``top_k``: (Optional) The number of top related documents you wish to retrieve for each query.

**Response**:
A list of lists containing Document objects, where each list corresponds to the related documents for a specific query.

**Example**:

.. code-block:: bash

   curl -X POST https://api.sciphi.ai/search \
     -H "Authorization: Bearer $SCIPHI_API_KEY" \
     -H "Content-Type: application/json" \
     -d '{"queries": ["What is general relativity?", "Who is Albert Einstein?"], "top_k": 5}'

This request queries the SciPhi World Database. The expected response is:

.. code-block:: none

   [[{"id":14678539,"title":"General Relativity and Gravitation","text":"General Relativity and Gravitation General Re ...
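
The same search request can be issued from Python. Below is a minimal sketch, assuming the third-party ``requests`` package is installed and ``SCIPHI_API_KEY`` is exported in your environment; the payload mirrors the curl example above:

.. code-block:: python

   import os

   import requests

   # Mirrors the curl example above; assumes SCIPHI_API_KEY is exported.
   response = requests.post(
       "https://api.sciphi.ai/search",
       headers={
           "Authorization": f"Bearer {os.environ['SCIPHI_API_KEY']}",
           "Content-Type": "application/json",
       },
       json={
           "queries": ["What is general relativity?", "Who is Albert Einstein?"],
           "top_k": 5,
       },
   )
   # Each inner list holds the documents related to one query.
   documents_per_query = response.json()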
Alternatively, with the SciPhi framework, you may execute a generation as shown:

.. code-block:: python

   from sciphi.interface import (
       LLMInterfaceManager,
       LLMProviderName,
       RAGInterfaceManager,
       SciPhiFormatter,
   )
   from sciphi.llm import GenerationConfig

   # LLM provider settings -- the `llm_*` variables and `rag_interface`
   # below are assumed to come from your CLI arguments or configuration.
   llm_interface = LLMInterfaceManager.get_interface_from_args(
       LLMProviderName(llm_provider_name),
       api_key=llm_api_key,
       api_base=llm_api_base,
       rag_interface=rag_interface,
       model_name=llm_model_name,
   )

   # Set up typical LLM generation settings
   generation_config = GenerationConfig(
       temperature=llm_temperature,
       top_k=llm_top_k,
       max_tokens_to_sample=llm_max_tokens_to_sample,
       model_name=llm_model_name,
       skip_special_tokens=llm_skip_special_tokens,
       stop_token=SciPhiFormatter.INIT_PARAGRAPH_TOKEN,
   )

   # Get the completion for a prompt string you define
   completion = llm_interface.get_completion(prompt, generation_config)

SciPhi v1 Endpoints
~~~~~~~~~~~~~~~~~~~

- **URL**: ``/v1/{path:path}``
- **Method**: ``GET``, ``POST``, ``PUT``, ``DELETE``
- **Description**: This endpoint forwards requests to an OpenAI-compatible backend, such as vLLM, acting as middleware that lets you use other services while managing access through the SciPhi API.

SciPhi adheres to the API specification of OpenAI's API, allowing compatibility with any application designed for the OpenAI API. Below is an example curl command:
**Example**:

.. code-block:: bash

   curl https://api.sciphi.ai/v1/completions \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer $SCIPHI_API_KEY" \
     -d '{
        "model": "SciPhi/SciPhi-Self-RAG-Mistral-7B-32k",
        "prompt": "Say this is a test.",
        "temperature": 0.7
     }'


After executing the above request with the SciPhi/SciPhi-Self-RAG-Mistral-7B-32k model, the expected response is:

.. code-block:: json

   {
     "id": "cmpl-f03f53c15a174ffe89bdfc83507de7a9",
     "object": "text_completion",
     "created": 1698730137,
     "model": "SciPhi/SciPhi-Self-RAG-Mistral-7B-32k",
     "choices": [
       {
         "index": 0,
         "text": "This is a test.",
         "logprobs": null,
         "finish_reason": "length"
       }
     ],
     "usage": {
       "prompt_tokens": 7,
       "total_tokens": 23,
       "completion_tokens": 16
     }
   }
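
Because the endpoint follows the OpenAI API specification, any OpenAI-style client should work against it. As a dependency-light sketch (assuming only the third-party ``requests`` package), the same completion can be requested from Python:

.. code-block:: python

   import os

   import requests

   # Mirrors the curl example above; assumes SCIPHI_API_KEY is exported.
   response = requests.post(
       "https://api.sciphi.ai/v1/completions",
       headers={
           "Authorization": f"Bearer {os.environ['SCIPHI_API_KEY']}",
           "Content-Type": "application/json",
       },
       json={
           "model": "SciPhi/SciPhi-Self-RAG-Mistral-7B-32k",
           "prompt": "Say this is a test.",
           "temperature": 0.7,
       },
   )
   print(response.json()["choices"][0]["text"])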

API Key and Signup
------------------

To access the SciPhi API, you will need an API key. If you don't have one, you can sign up `here <https://www.sciphi.ai/signup>`_. Ensure you include the API key in your request headers as shown in the examples.
To access the SciPhi API, you need an API key. If you don't possess one, you can sign up `here <https://www.sciphi.ai/signup>`_. Ensure you include the API key in your request headers as shown in the examples.
Binary file modified docs/source/assets/logos/sciphi.png
29 changes: 14 additions & 15 deletions docs/source/index.rst
@@ -1,4 +1,4 @@
Welcome to SciPHi!
SciPhi [ΨΦ]: AI's Knowledge Engine 💡
=====================================

.. image:: https://github.com/emrgnt-cmplxty/sciphi/assets/68796651/195367d8-54fd-4281-ace0-87ea8523f982
@@ -21,18 +21,16 @@ Welcome to SciPHi!
</p>


SciPhi [ΨΦ]: AI's Knowledge Engine 💡
-----------------------------------------------------------------

SciPhi is a powerful knowledge engine tailored for LLM-based data generation and management.
SciPhi is a powerful knowledge engine tailored for LLM-based inference, data generation and management.

With SciPhi, you can:

* Generate datasets using various LLMs, supporting **Anthropic**, **OpenAI**, **vLLM**, and **SciPhi**.
* Tap into the **Retriever-Augmented Generation (RAG)** for data anchoring to real-world sources.
- Features like end-to-end cloud and local RAG knowledge engine APIs are underway!
* Generate truthful datasets using various LLMs, supporting **Anthropic**, **OpenAI**, **vLLM**, and **SciPhi**.
* Custom tailor your data creation for applications such as LLM training, RAG, and beyond.
- For instance, the in-built textbook module can generate RAG-enhanced textbooks from a given table of contents.
- For example, generate RAG-grounded textbooks from a given table of contents.

Quick and easy setup:

@@ -46,31 +44,32 @@ Diverse Features:
* Evaluate your RAG systems effectively with the SciPhi evaluation harness.
* Engage with the community on platforms like `Discord <https://discord.gg/j9GxfbxqAe>`_.

Developers can also instantiate their own LLM and RAG providers using the SciPhi framework. The supported LLM providers include popular choices like OpenAI, Anthropic, HuggingFace, and vLLM. For specialized RAG capabilities, SciPhi offers the **World Databasef API** for comprehensive database access.
Developers can also instantiate their own LLM and RAG providers using the SciPhi framework. The supported LLM providers include popular choices like OpenAI, Anthropic, HuggingFace, and vLLM. For specialized RAG capabilities, SciPhi offers the `World Database API <https://sciphi.readthedocs.io/en/latest/api/main.html>`_ for comprehensive knowledge access.

For a detailed setup guide, deeper feature exploration, and developer insights, refer to:

* `SciPhi GitHub Repository <https://github.com/emrgnt-cmplxty/sciphi>`_
* `Example Textbook Generated with SciPhi <https://github.com/SciPhi-AI/sciphi/blob/main/sciphi/data/sample/textbooks/Aerodynamics_of_Viscous_Fluids.md>`_
* `ToC Used for Sample Textbook Generation <https://github.com/SciPhi-AI/sciphi/blob/main/sciphi/data/sample/table_of_contents/Aerodynamics_of_Viscous_Fluids.yaml>`_
* `Default Settings for Textbook Generation <https://github.com/SciPhi-AI/sciphi/blob/main/sciphi/config/generation_settings/textbook_generation_settings.yaml>`_
* `Library of SciPhi Books <https://github.com/SciPhi-AI/library-of-phi/>`_
* `Library of Phi <https://github.com/SciPhi-AI/library-of-phi/>`_


Citing Our Work
---------------

If you're using SciPhi in your research or project, please cite our work:

.. code-block:: plaintext
.. code-block:: none
   @software{SciPhi,
     author = {Colegrove, Owen},
     doi = {Pending},
     month = {09},
     title = {{SciPhi: A Framework for LLM Powered Data}},
     url = {https://github.com/sciphi-ai/sciphi},
     year = {2023}
   }
Documentation
8 changes: 5 additions & 3 deletions docs/source/setup/installation.rst
@@ -42,12 +42,14 @@ After installation, set up your environment to link with supported LLM providers
Here is an example of the configuration in the `.env` file:

.. code-block:: bash

   # To use SciPhi as a provider:
   SCIPHI_API_KEY=your_sciphi_api_key

   # Other providers in SciPhi:
   OPENAI_API_KEY=your_openai_api_key
   ANTHROPIC_API_KEY=your_anthropic_api_key
   HF_TOKEN=your_huggingface_token
   VLLM_API_KEY=your_vllm_api_key
   RAG_API_KEY=your_rag_server_api_key
   RAG_API_BASE=your_rag_api_base_url
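
If you keep these values in a ``.env`` file, one common way to load them into the process environment (an assumption, not a SciPhi requirement) is the third-party ``python-dotenv`` package:

.. code-block:: python

   # Assumes `pip install python-dotenv`; reads .env into os.environ.
   import os

   from dotenv import load_dotenv

   load_dotenv()
   assert os.getenv("SCIPHI_API_KEY") is not None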
@@ -61,10 +63,10 @@ To set up SciPhi for development:

.. code-block:: console
$ git clone https://github.com/emrgnt-cmplxty/sciphi.git
$ git clone https://github.com/SciPhi-AI/sciphi.git
$ cd sciphi
$ pip3 install poetry # If you do not have Poetry installed.
$ poetry install
$ poetry install # Can use `pip install -e .` instead.
$ poetry install -E all_with_extras
Licensing and Acknowledgment
43 changes: 41 additions & 2 deletions docs/source/setup/quickstart.rst
@@ -22,12 +22,51 @@ Before you start, ensure you've installed SciPhi:
pip install sciphi
For additional details, refer to the `installation guide <https://sciphi.readthedocs.io/en/latest/installation.html>`_.
For additional details, refer to the `installation guide <https://sciphi.readthedocs.io/en/latest/setup/installation.html>`_.

Instantiate Your LLM and RAG Provider
-------------------------------------

Here's a simple example of how you can utilize SciPhi to work with your own LLM and RAG provider:
Here's how you can use SciPhi to quickly set up and retrieve chat completions, without diving deep into intricate configurations:

.. code-block:: python

   from sciphi.interface import (
       SciPhiFormatter,
       SciPhiLLMInterface,
       SciPhiWikiRAGInterface,
   )
   from sciphi.llm import GenerationConfig

   # SciPhi RAG interface
   # Supports calls like `contexts = rag_interface.get_contexts(query)`
   rag_interface = SciPhiWikiRAGInterface()

   # SciPhi LLM interface
   llm_interface = SciPhiLLMInterface(rag_interface)

   # Initialize the conversation
   query: str = "Who is the president of the United States?"
   conversation = [{"role": "user", "content": query}]

   # Define the generation configuration
   generation_config = GenerationConfig(
       model_name="SciPhi/SciPhi-Self-RAG-Mistral-7B-32k",
       stop_token=SciPhiFormatter.INIT_PARAGRAPH_TOKEN,
       # Pass in any other desired generation settings here
   )

   # Get the chat completion
   completion = llm_interface.get_chat_completion(
       conversation, generation_config
   )
   print(completion)
   # Expected output: The current President of the United States is Joe Biden.
----

Here's a more advanced example of how you can work configurably with the available LLM and RAG providers:

.. code-block:: python
