The Embedding Microservice converts text strings into vector embeddings for use in machine learning and data processing workflows. It uses modern embedding models to produce high-quality vectors that capture the semantic meaning of the input text, making it well suited to natural language processing, information retrieval, and similar applications.
Key Features:
High Performance: Optimized for quick and reliable conversion of textual data into vector embeddings.
Scalability: Built to handle high volumes of requests simultaneously, ensuring robust performance even under heavy loads.
Ease of Integration: Provides a simple and intuitive API, allowing for straightforward integration into existing systems and workflows.
Customizable: Supports configuration and customization to meet specific use case requirements, including different embedding models and preprocessing techniques.
You can configure and build the embedding service to match your needs. Currently, three implementations are provided:
- Build the embedding model locally on the server: faster, but takes up memory on the local server.
- Build on top of a TEI endpoint: more flexible, but may introduce some network latency.
- Build on top of the Prediction Guard endpoint: provides performant, hosted embedding models running on Gaudi, but requires an API key.
Regardless of the implementation, you need to install the requirements first:
# run with langchain
pip install -r langchain/requirements.txt
# run with llama_index
pip install -r llama_index/requirements.txt
# run with predictionguard
pip install -r predictionguard/requirements.txt
You can select one of the following ways to start the embedding service:
First, you need to start a PredictionGuard service.
docker run -d --name="embedding-predictionguard" \
-e http_proxy=$http_proxy -e https_proxy=$https_proxy \
-p 6000:6000 --ipc=host \
-e PREDICTIONGUARD_API_KEY=<your_api_key> \
opea/embedding-predictionguard:latest
Then test your PredictionGuard service with the following command:
curl http://localhost:6000/v1/embeddings \
-X POST \
-d '{"input":"Hello, world!"}' \
-H 'Content-Type: application/json'
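The same test can be driven from Python using only the standard library. A minimal sketch, assuming the endpoint, payload key, and port from the curl example above (the shape of the response is whatever the service returns):

```python
import json
import urllib.request

def build_request(text, endpoint="http://localhost:6000/v1/embeddings"):
    # Build the same POST request the curl example above sends.
    return urllib.request.Request(
        endpoint,
        data=json.dumps({"input": text}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

def get_embedding(text):
    # Requires the embedding-predictionguard container to be running.
    with urllib.request.urlopen(build_request(text)) as resp:
        return json.loads(resp.read())
```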
Start the embedding service:
# run with predictionguard
cd predictionguard
export PREDICTIONGUARD_API_KEY=${your_api_key}
python embedding_pg.py
First, you need to start a TEI service.
your_port=8090
model="BAAI/bge-large-en-v1.5"
docker run -p $your_port:80 -v ./data:/data --name tei_server \
  -e http_proxy=$http_proxy -e https_proxy=$https_proxy \
  --pull always ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 \
  --model-id $model
Then test your TEI service with the following command:
curl localhost:$your_port/embed \
-X POST \
-d '{"inputs":"What is Deep Learning?"}' \
-H 'Content-Type: application/json'
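For programmatic access, the same TEI request can be built in Python; TEI's /embed route also accepts a list of strings under "inputs", returning one embedding per string. A minimal sketch, assuming the port value (your_port) from the snippet above:

```python
import json
import urllib.request

def tei_embed_request(inputs, port=8090):
    # Build a POST request for TEI's /embed route; `inputs` may be a
    # single string or a list of strings (one embedding per string).
    return urllib.request.Request(
        f"http://localhost:{port}/embed",
        data=json.dumps({"inputs": inputs}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

def tei_embed(inputs, port=8090):
    # Requires the tei_server container to be running.
    with urllib.request.urlopen(tei_embed_request(inputs, port)) as resp:
        return json.loads(resp.read())
```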
Start the embedding service with the TEI_EMBEDDING_ENDPOINT.
# run with langchain
cd langchain
# run with llama_index
cd llama_index
export TEI_EMBEDDING_ENDPOINT="http://localhost:$your_port"
export TEI_EMBEDDING_MODEL_NAME="BAAI/bge-large-en-v1.5"
python embedding_tei.py
# run with langchain
cd langchain
# run with llama_index
cd llama_index
python local_embedding.py
First, you need to start a TEI service.
your_port=8090
model="BAAI/bge-large-en-v1.5"
docker run -p $your_port:80 -v ./data:/data --name tei_server \
  -e http_proxy=$http_proxy -e https_proxy=$https_proxy \
  --pull always ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 \
  --model-id $model
Then test your TEI service with the following command:
curl localhost:$your_port/embed \
-X POST \
-d '{"inputs":"What is Deep Learning?"}' \
-H 'Content-Type: application/json'
Export the TEI_EMBEDDING_ENDPOINT for later use:
export TEI_EMBEDDING_ENDPOINT="http://localhost:$your_port"
export TEI_EMBEDDING_MODEL_NAME="BAAI/bge-large-en-v1.5"
First, build the Docker image for the PredictionGuard embedding microservice:
docker build -t opea/embedding-predictionguard:latest --build-arg http_proxy=$http_proxy --build-arg https_proxy=$https_proxy -f comps/embeddings/predictionguard/docker/Dockerfile .
Start the Docker container for the PredictionGuard embedding microservice, with the PREDICTIONGUARD_API_KEY environment variable set to your PredictionGuard API key.
docker run -d --name="embedding-predictionguard" \
-e http_proxy=$http_proxy -e https_proxy=$https_proxy \
-p 6000:6000 --ipc=host \
-e PREDICTIONGUARD_API_KEY=$PREDICTIONGUARD_API_KEY \
opea/embedding-predictionguard:latest
Then test your PredictionGuard service with the following command:
curl http://localhost:6000/v1/embeddings \
-X POST \
-d '{"input":"Hello, world!"}' \
-H 'Content-Type: application/json'
# build with langchain
cd ../../
docker build -t opea/embedding-tei:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/embeddings/langchain/docker/Dockerfile .
# build with llama_index
cd ../../
docker build -t opea/embedding-tei-llama-index:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/embeddings/llama_index/docker/Dockerfile .
# build with predictionguard
docker build -t opea/embedding-predictionguard:latest --build-arg http_proxy=$http_proxy --build-arg https_proxy=$https_proxy -f comps/embeddings/predictionguard/docker/Dockerfile .
# run with langchain docker
docker run -d --name="embedding-tei-server" -p 6000:6000 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e TEI_EMBEDDING_ENDPOINT=$TEI_EMBEDDING_ENDPOINT -e TEI_EMBEDDING_MODEL_NAME=$TEI_EMBEDDING_MODEL_NAME opea/embedding-tei:latest
# run with llama-index docker
docker run -d --name="embedding-tei-llama-index-server" -p 6000:6000 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e TEI_EMBEDDING_ENDPOINT=$TEI_EMBEDDING_ENDPOINT -e TEI_EMBEDDING_MODEL_NAME=$TEI_EMBEDDING_MODEL_NAME opea/embedding-tei-llama-index:latest
docker run -d --name="embedding-predictionguard" -e http_proxy=$http_proxy -e https_proxy=$https_proxy -p 6000:6000 --ipc=host -e PREDICTIONGUARD_API_KEY=$PREDICTIONGUARD_API_KEY opea/embedding-predictionguard:latest
cd docker
docker compose -f docker_compose_embedding.yaml up -d
curl http://localhost:6000/v1/health_check \
-X GET \
-H 'Content-Type: application/json'
curl http://localhost:6000/v1/embeddings \
-X POST \
-d '{"text":"Hello, world!"}' \
-H 'Content-Type: application/json'
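In a client application, the health check and the embedding call can be wrapped together. A minimal sketch using only the standard library (URLs and the payload key mirror the curl examples above; the JSON layout of the response is not assumed here):

```python
import json
import urllib.request

BASE_URL = "http://localhost:6000"

def embed_request(text):
    # Build the POST request the curl example above sends.
    return urllib.request.Request(
        f"{BASE_URL}/v1/embeddings",
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

def service_is_healthy():
    # GET /v1/health_check; any 2xx response counts as healthy,
    # and a connection failure counts as unhealthy.
    try:
        with urllib.request.urlopen(f"{BASE_URL}/v1/health_check") as resp:
            return 200 <= resp.status < 300
    except OSError:
        return False

def embed(text):
    # POST the text and return the parsed JSON response.
    with urllib.request.urlopen(embed_request(text)) as resp:
        return json.loads(resp.read())
```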