This is the code used to run the RAG demo using GPU Droplets in Paddy Srinivasan's Deploy 2024 keynote.
This project demonstrates a Retrieval-Augmented Generation (RAG) system using Python. It retrieves relevant document chunks based on query embeddings and generates detailed responses using a large language model (LLM).
- DigitalOcean (DO) Account
- Python 3.8+
- Log in to your DigitalOcean account and navigate to the control panel.
- Click on "Spaces" and create a new Space (e.g.,
wade-rag
). - Note your
AWS_ACCESS_KEY_ID
andAWS_SECRET_ACCESS_KEY
from the API section in your DO account.
In your space, create a folder and add the files as PDFs or documents that you want to include in your RAG. Note the name of the folder to use below.
- Go to the "Databases" section and create a new PostgreSQL database cluster.
- Configure the database and note the connection details (host, port, user, password, database name).
- Connect to your PostgreSQL database and install the
pgvector
extension.
CREATE EXTENSION IF NOT EXISTS vector;
- Go to the "Droplets" section and create a new Droplet.
- Note the IP address for use later.
- Setup an SSH key so you can remote into the environment.
First, you'll want to SSH into the Droplet. Note the IP address of your droplet, and from a terminal type ssh root@YOUR-IP-ADDRESS
.
git clone https://github.com/wadewegner/deploy24-rag-demo.git
cd deploy24-rag-demo
Create a .env
file in the project directory with the following content:
DB_NAME=
DB_USER=
DB_PASSWORD=
DB_HOST=
DB_PORT=25060
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
SPACE_NAME=
FOLDER_NAME=
Be sure to add all your info!
Run the setup script to create and set up the virtual environment:
chmod +x setup_venv.sh
sudo ./setup_venv.sh # Installs all the required dependencies
source rag_env/bin/activate # Activate the virtual environment
Run the prepare_documents.py
script to process and store document embeddings in the database:
python prepare_documents.py
Run the retrieval_llm.py
script to query the system and generate responses:
python retrieval_llm.py
To reset the environment, run the reset script:
chmod +x reset_demo.sh
./reset_demo.sh