Dataprep updated readme and parametrized prepare_doc_arango.py #1036

Closed
4 changes: 4 additions & 0 deletions .github/workflows/docker/compose/chathistory-compose.yaml
@@ -7,3 +7,7 @@ services:
     build:
       dockerfile: comps/chathistory/mongo/Dockerfile
     image: ${REGISTRY:-opea}/chathistory-mongo-server:${TAG:-latest}
+  chathistory-arango-server:
+    build:
+      dockerfile: comps/chathistory/arango/Dockerfile
+    image: ${REGISTRY:-opea}/chathistory-arango-server:${TAG:-latest}
@@ -3,7 +3,11 @@
 
 # this file should be run in the root of the repo
 services:
-  feedbackmanagement:
+  feedbackmanagement-mongo-server:
     build:
       dockerfile: comps/feedback_management/mongo/Dockerfile
-    image: ${REGISTRY:-opea}/feedbackmanagement:${TAG:-latest}
+    image: ${REGISTRY:-opea}/feedbackmanagement-mongo-server:${TAG:-latest}
+  feedbackmanagement-arango-server:
+    build:
+      dockerfile: comps/feedback_management/arango/Dockerfile
+    image: ${REGISTRY:-opea}/feedbackmanagement-arango-server:${TAG:-latest}
@@ -7,3 +7,7 @@ services:
     build:
       dockerfile: comps/prompt_registry/mongo/Dockerfile
     image: ${REGISTRY:-opea}/promptregistry-mongo-server:${TAG:-latest}
+  promptregistry-arango-server:
+    build:
+      dockerfile: comps/prompt_registry/arango/Dockerfile
+    image: ${REGISTRY:-opea}/promptregistry-arango-server:${TAG:-latest}
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,3 +1,5 @@
 __pycache__
 *.egg-info/
 .DS_Store
+.venv
+venv/
33 changes: 33 additions & 0 deletions ARANGODB_README.md
@@ -0,0 +1,33 @@
Instructions

0. Create a virtual environment:

```bash
python -m venv .venv

source .venv/bin/activate
```

1. Install the required packages:

```bash
pip install python-arango
pip install langchain_openai
pip install git+https://github.com/arangoml/langchain.git@arangodb#subdirectory=libs/community
```

2. Provision the ArangoDB with Vector Index image:

```bash
docker create --name arango-vector -p 8529:8529 -e ARANGO_ROOT_PASSWORD=test jbajic/arangodb-arm:vector-index-preview

docker start arango-vector
```

3. Set your `OPENAI_API_KEY` environment variable (contact Anthony for access)

4. Run the test script to confirm LangChain is working:

```bash
python langchain_test.py
```
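
Before running the test script, it can help to confirm that the container from step 2 is accepting connections on the mapped port. The following is a minimal sketch using only the Python standard library; the function name `port_is_open` and the default host are illustrative assumptions, and the port `8529` is taken from the `docker create` command above.

```python
import socket


def port_is_open(host="localhost", port=8529, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("ArangoDB port reachable:", port_is_open())
```

A reachable port only shows that something is listening; it does not verify credentials or that the vector index preview is enabled.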
4 changes: 4 additions & 0 deletions comps/chathistory/README.md
@@ -24,3 +24,7 @@ The Chat History microservice able to support various database backends for stor
 ### Chat History with MongoDB
 
 For more detail, please refer to this [README](./mongo/README.md)
+
+### Chat History with ArangoDB
+
+For more detail, please refer to this [README](./arango/README.md)
30 changes: 30 additions & 0 deletions comps/chathistory/arango/Dockerfile
@@ -0,0 +1,30 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

FROM python:3.11-slim

ENV LANG=C.UTF-8

RUN apt-get update -y && apt-get install -y --no-install-recommends --fix-missing \
    build-essential \
    libjemalloc-dev \
    libgl1-mesa-glx

RUN useradd -m -s /bin/bash user && \
    mkdir -p /home/user && \
    chown -R user /home/user/

USER user

COPY comps /home/user/comps
COPY requirements.txt /home/user/

RUN pip install --no-cache-dir --upgrade pip setuptools && \
    pip install --no-cache-dir -r /home/user/comps/chathistory/arango/requirements.txt && \
    pip install --no-cache-dir -r /home/user/requirements.txt

ENV PYTHONPATH=/home/user

WORKDIR /home/user/comps/chathistory/arango

ENTRYPOINT ["python", "chat.py"]
123 changes: 123 additions & 0 deletions comps/chathistory/arango/README.md
@@ -0,0 +1,123 @@
# 📝 Chat History Microservice with ArangoDB

This README explains how to set up, run, and call the Chat History microservice with an ArangoDB backend.

---

## Setup Environment Variables

See `config.py` for default values.

```bash
export ARANGO_HOST=${ARANGO_HOST}
export ARANGO_PORT=${ARANGO_PORT}
export ARANGO_PROTOCOL=${ARANGO_PROTOCOL}
export ARANGO_USERNAME=${ARANGO_USERNAME}
export ARANGO_PASSWORD=${ARANGO_PASSWORD}
export DB_NAME=${DB_NAME}
export COLLECTION_NAME=${COLLECTION_NAME}
```
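
The variables above are read by the service at startup. As a rough illustration of how such settings can be resolved, the sketch below prefers an environment value and falls back to a default otherwise. The default values and the helper names `load_settings` and `connection_url` are assumptions made for this example; the real defaults are defined in `config.py`, which is not shown here.

```python
import os

# Hypothetical defaults for illustration only; check config.py for the real ones.
ASSUMED_DEFAULTS = {
    "ARANGO_HOST": "localhost",
    "ARANGO_PORT": "8529",
    "ARANGO_PROTOCOL": "http",
    "ARANGO_USERNAME": "root",
    "ARANGO_PASSWORD": "test",
    "DB_NAME": "OPEA",
    "COLLECTION_NAME": "ChatHistory",
}


def load_settings(environ=None):
    """Return all seven settings, preferring environment values over defaults."""
    env = os.environ if environ is None else environ
    return {name: env.get(name, default) for name, default in ASSUMED_DEFAULTS.items()}


def connection_url(settings):
    """Build the connection URL in the protocol://host:port/ form."""
    return f"{settings['ARANGO_PROTOCOL']}://{settings['ARANGO_HOST']}:{settings['ARANGO_PORT']}/"


if __name__ == "__main__":
    print(connection_url(load_settings()))
```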

---

## 🚀 Start Microservice with Docker

### Build Docker Image

```bash
# Move from comps/chathistory/arango to the repository root
cd ../../../
docker build -t opea/chathistory-arango-server:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/chathistory/arango/Dockerfile .
```

### Run Docker with CLI

- Run ArangoDB image container

```bash
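# The ArangoDB image needs an authentication setting (for example a root password) to start
docker run -d -p 8529:8529 -e ARANGO_ROOT_PASSWORD=${ARANGO_PASSWORD} --name=arango arangodb/arangodb:latest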
```

- Run the Chat History microservice

```bash
docker run -p 6012:6012 \
-e http_proxy=$http_proxy \
-e https_proxy=$https_proxy \
-e no_proxy=$no_proxy \
-e ARANGO_HOST=${ARANGO_HOST} \
-e ARANGO_PORT=${ARANGO_PORT} \
-e ARANGO_PROTOCOL=${ARANGO_PROTOCOL} \
-e ARANGO_USERNAME=${ARANGO_USERNAME} \
-e ARANGO_PASSWORD=${ARANGO_PASSWORD} \
-e DB_NAME=${DB_NAME} \
-e COLLECTION_NAME=${COLLECTION_NAME} \
opea/chathistory-arango-server:latest
```

---

## ✅ Invoke Microservice

The Chat History microservice exposes the following API endpoints:

- Create a new chat conversation

```bash
curl -X 'POST' \
http://${host_ip}:6012/v1/chathistory/create \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"data": {
"messages": "test Messages", "user": "test"
}
}'
```

- Get all conversations for a user

```bash
curl -X 'POST' \
http://${host_ip}:6012/v1/chathistory/get \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"user": "test"}'
```

- Get a specific conversation by id

```bash
curl -X 'POST' \
http://${host_ip}:6012/v1/chathistory/get \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"user": "test", "id":"48918"}'
```

- Update a conversation by id (send the `id` to the same `create` endpoint)

```bash
curl -X 'POST' \
http://${host_ip}:6012/v1/chathistory/create \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"data": {
"messages": "test Messages Update", "user": "test"
},
"id":"48918"
}'
```

- Delete a stored conversation

```bash
curl -X 'POST' \
http://${host_ip}:6012/v1/chathistory/delete \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"user": "test", "id":"48918"}'
```
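
The same calls can be made from Python. The sketch below builds the POST requests with the standard library, mirroring the paths, port, headers, and payload shapes of the curl commands above. The helper names (`build_request`, `create_payload`, `send`) are hypothetical, and `send` assumes the service replies with a JSON body, which the curl examples do not confirm.

```python
import json
import urllib.request

BASE_PATH = "/v1/chathistory"


def build_request(host, action, payload, port=6012):
    """Build (but do not send) a POST request for the create, get, or delete endpoint."""
    if action not in ("create", "get", "delete"):
        raise ValueError(f"unknown action: {action}")
    url = f"http://{host}:{port}{BASE_PATH}/{action}"
    body = json.dumps(payload).encode("utf-8")
    headers = {"accept": "application/json", "Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def create_payload(user, messages, conversation_id=None):
    """Payload for create; passing conversation_id turns the call into an update."""
    payload = {"data": {"messages": messages, "user": user}}
    if conversation_id is not None:
        payload["id"] = conversation_id
    return payload


def send(request, timeout=10):
    """Send the request and decode the response, assuming the body is JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `send(build_request(host_ip, "get", {"user": "test"}))` corresponds to the "get all conversations" call, and adding `"id"` to that payload fetches a single conversation.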
32 changes: 32 additions & 0 deletions comps/chathistory/arango/arango_conn.py
@@ -0,0 +1,32 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

from arango import ArangoClient as PythonArangoClient
from arango.database import StandardDatabase
from config import ARANGO_HOST, ARANGO_PASSWORD, ARANGO_PORT, ARANGO_PROTOCOL, ARANGO_USERNAME, DB_NAME


class ArangoClient:
    conn_url = f"{ARANGO_PROTOCOL}://{ARANGO_HOST}:{ARANGO_PORT}/"

    @staticmethod
    def get_db_client() -> StandardDatabase:
        try:
            # Create client
            client = PythonArangoClient(hosts=ArangoClient.conn_url)

            # First connect to _system database
            sys_db = client.db("_system", username=ARANGO_USERNAME, password=ARANGO_PASSWORD, verify=True)

            # Create target database if it doesn't exist
            if not sys_db.has_database(DB_NAME):
                sys_db.create_database(DB_NAME)

            # Now connect to the target database
            db = client.db(DB_NAME, username=ARANGO_USERNAME, password=ARANGO_PASSWORD, verify=True)

            return db

        except Exception as e:
            print(e)
            raise e
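
The database handle returned by `get_db_client()` can be used with the same "create if missing" pattern for collections. The sketch below is an illustration, not code from this PR: `get_or_create_collection` is a hypothetical helper that relies on the python-arango methods `has_collection`, `create_collection`, and `collection`, and `InMemoryDatabase` is a stand-in defined here only so the example runs without a server.

```python
def get_or_create_collection(db, name):
    """Return the named collection, creating it first if it does not exist."""
    if not db.has_collection(name):
        db.create_collection(name)
    return db.collection(name)


class InMemoryDatabase:
    """Hypothetical stand-in exposing the three methods used above."""

    def __init__(self):
        self._collections = {}
        self.create_calls = 0

    def has_collection(self, name):
        return name in self._collections

    def create_collection(self, name):
        self.create_calls += 1
        self._collections[name] = []
        return self._collections[name]

    def collection(self, name):
        return self._collections[name]


if __name__ == "__main__":
    fake_db = InMemoryDatabase()
    get_or_create_collection(fake_db, "ChatHistory")
    get_or_create_collection(fake_db, "ChatHistory")  # second call reuses the collection
    print("create_collection calls:", fake_db.create_calls)
```

With a real connection, the call would look like `get_or_create_collection(ArangoClient.get_db_client(), COLLECTION_NAME)`.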