`bge` is short for `BAAI general embedding`.
Model | Language | query instruction for retrieval* |
---|---|---|
BAAI/bge-large-en-v1.5 | English | Represent this sentence for searching relevant passages: |
BAAI/bge-base-en-v1.5 | English | Represent this sentence for searching relevant passages: |
BAAI/bge-small-en-v1.5 | English | Represent this sentence for searching relevant passages: |
BAAI/bge-large-zh-v1.5 | Chinese | 为这个句子生成表示以用于检索相关文章: |
BAAI/bge-base-zh-v1.5 | Chinese | 为这个句子生成表示以用于检索相关文章: |
BAAI/bge-small-zh-v1.5 | Chinese | 为这个句子生成表示以用于检索相关文章: |
*: If you need to retrieve long relevant passages for a short query (the s2p retrieval task), add the instruction to the query; in all other cases, no instruction is needed and the original query can be used directly. In no case does an instruction need to be added to passages.
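The note above can be sketched as a small helper that prepends the instruction to queries only and leaves passages untouched. This is a minimal illustration, not part of the BGE tooling: the names `instruction_for`, `prepare_query`, and `short_to_passage` are hypothetical, and the trailing space after the English instruction is an assumption about how the instruction and query are joined.

```python
# Instructions copied from the table above.
# The trailing space on the English string is an assumption about the separator.
EN_INSTRUCTION = "Represent this sentence for searching relevant passages: "
ZH_INSTRUCTION = "为这个句子生成表示以用于检索相关文章:"


def instruction_for(model_name: str) -> str:
    """Pick the query instruction from the model name (hypothetical helper)."""
    if "-en-" in model_name:
        return EN_INSTRUCTION
    if "-zh-" in model_name:
        return ZH_INSTRUCTION
    raise ValueError(f"No known query instruction for {model_name!r}")


def prepare_query(model_name: str, query: str, short_to_passage: bool = True) -> str:
    """Prepend the instruction only for short-query-to-long-passage (s2p) retrieval."""
    if not short_to_passage:
        return query  # other tasks: use the original query directly
    return instruction_for(model_name) + query


# Passages are never given an instruction; only the query is modified.
query_text = prepare_query("BAAI/bge-small-en-v1.5", "what is bge?")
passage_text = "bge is short for BAAI general embedding."
print(query_text)
print(passage_text)
```

The prepared query string and the unmodified passages would then be passed to whichever embedding call you use for the model.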
This folder contains the following examples for BGE models:
File | Description | GPU Minimum Requirement |
---|---|---|
01_load_inference | Environment setup and suggested configurations when inferencing BGE models on Databricks. | 1xT4 |
02_mlflow_logging_inference | Save, register, and load BGE models with MLFlow, and create a Databricks model serving endpoint. | 1xT4 |
03_build_document_index | Build a vector store with faiss using BGE models. | 1xT4 |
04_fine_tune_embedding | Fine-tune BGE models. | 1xT4 |