Commit
Signed-off-by: Xinyu Ye <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: lkk <[email protected]> Co-authored-by: test <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: Letong Han <[email protected]>
1 parent 40f1463, commit ad0bb7c
Showing 22 changed files with 1,812 additions and 2 deletions.
@@ -0,0 +1,121 @@
# LLM Fine-tuning Microservice

The LLM fine-tuning microservice adapts a base model to a specific task or dataset to improve its performance on that task.

# 🚀1. Start Microservice with Python (Option 1)

## 1.1 Install Requirements

```bash
python -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
python -m pip install intel-extension-for-pytorch
python -m pip install oneccl_bind_pt --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/cpu/us/
python -m pip install -r requirements.txt
```

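After installing, it can help to confirm that the packages are importable before starting Ray. The following is a minimal sketch, not part of the repository: `missing_modules` and `REQUIRED_MODULES` are hypothetical names, and the module list assumes the usual import names of the packages installed above (`oneccl_bindings_for_pytorch` is the import name used later in this README).

```python
import importlib.util

# Assumed import names for the packages installed above. The oneCCL binding is
# imported as `oneccl_bindings_for_pytorch`, not under its pip name `oneccl_bind_pt`.
REQUIRED_MODULES = [
    "torch",
    "intel_extension_for_pytorch",
    "oneccl_bindings_for_pytorch",
]


def missing_modules(names):
    """Return the top-level module names in `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

The check only looks modules up on the import path; it does not import them, so it will not catch a broken native library.
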
## 1.2 Start Finetuning Service with Python Script

### 1.2.1 Start Ray Cluster

The oneCCL and Intel MPI libraries should be dynamically linked on every node before Ray starts:

```bash
source $(python -c "import oneccl_bindings_for_pytorch as torch_ccl; print(torch_ccl.cwd)")/env/setvars.sh
```

Start Ray locally using the following command:

```bash
ray start --head
```

For a multi-node cluster, start additional Ray worker nodes with the command below:

```bash
ray start --address="${head_node_ip}:6379"
```

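Before starting workers on other nodes, you may want to check that the head node is reachable on port 6379, the port used in the command above. This is a small sketch using only the standard library; `tcp_reachable` is a hypothetical helper, and the IP address in the example is a placeholder you would replace with your head node's address.

```python
import socket


def tcp_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    head_node_ip = "127.0.0.1"  # placeholder: replace with your head node's IP
    print("Ray head reachable:", tcp_reachable(head_node_ip, 6379))
```

A successful connection only shows that something is listening on that port; it does not confirm that the listener is a healthy Ray head.
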
### 1.2.2 Start Finetuning Service

```bash
export HF_TOKEN=${your_huggingface_token}
python finetuning_service.py
```

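If you launch the service from another Python script, a guard like the one below can fail early when `HF_TOKEN` is missing instead of failing later inside the service. This is a sketch under assumptions: `service_env` and `start_service` are hypothetical helpers, and the script name is the one used in the command above.

```python
import os
import subprocess
import sys


def service_env(environ):
    """Return a copy of `environ`, or raise if HF_TOKEN is unset or empty."""
    if not environ.get("HF_TOKEN"):
        raise RuntimeError("HF_TOKEN is not set; export your Hugging Face token first")
    return dict(environ)


def start_service(script="finetuning_service.py"):
    """Run the service script with the current interpreter; blocks until it exits."""
    env = service_env(os.environ)
    return subprocess.run([sys.executable, script], env=env, check=True)
```

Calling `start_service()` from the directory that contains `finetuning_service.py` is then equivalent to the two shell commands above, with the token check made explicit.
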
# 🚀2. Start Microservice with Docker (Option 2)

## 2.1 Setup on CPU

### 2.1.1 Build Docker Image

Build the Docker image with the command below:

```bash
export HF_TOKEN=${your_huggingface_token}
cd ../../
docker build -t opea/finetuning:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy --build-arg HF_TOKEN=$HF_TOKEN -f comps/finetuning/docker/Dockerfile_cpu .
```

### 2.1.2 Run Docker with CLI

Start the Docker container with the command below:

```bash
docker run -d --name="finetuning-server" -p 8005:8005 --runtime=runc --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy opea/finetuning:latest
```

## 2.2 Setup on Gaudi2

### 2.2.1 Build Docker Image

Build the Docker image with the command below:

```bash
cd ../../
docker build -t opea/finetuning-gaudi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/finetuning/docker/Dockerfile_hpu .
```

### 2.2.2 Run Docker with CLI

Start the Docker container with the command below:

```bash
export HF_TOKEN=${your_huggingface_token}
docker run --runtime=habana -e HABANA_VISIBLE_DEVICES=all -p 8005:8005 -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --net=host --ipc=host -e https_proxy=$https_proxy -e http_proxy=$http_proxy -e no_proxy=$no_proxy -e HF_TOKEN=$HF_TOKEN opea/finetuning-gaudi:latest
```

# 🚀3. Consume Finetuning Service

## 3.1 Create fine-tuning job

Assuming a training file `alpaca_data.json` has been uploaded (it can be downloaded [here](https://github.com/tatsu-lab/stanford_alpaca/blob/main/alpaca_data.json)), the following script launches a fine-tuning job using `meta-llama/Llama-2-7b-chat-hf` as the base model:

```bash
# upload a training file
curl http://${your_ip}:8005/v1/finetune/upload_training_files -X POST -H "Content-Type: multipart/form-data" -F "files=@./alpaca_data.json"

# create a finetuning job
curl http://${your_ip}:8005/v1/fine_tuning/jobs \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "training_file": "alpaca_data.json",
    "model": "meta-llama/Llama-2-7b-chat-hf"
  }'

# list finetuning jobs
curl http://${your_ip}:8005/v1/fine_tuning/jobs -X GET

# retrieve one finetuning job
# (double quotes let the shell expand ${fine_tuning_job_id}; the ID is assumed to be a JSON string)
curl http://${your_ip}:8005/v1/fine_tuning/jobs/retrieve -X POST -H "Content-Type: application/json" \
  -d "{\"fine_tuning_job_id\": \"${fine_tuning_job_id}\"}"

# cancel one finetuning job
curl http://${your_ip}:8005/v1/fine_tuning/jobs/cancel -X POST -H "Content-Type: application/json" \
  -d "{\"fine_tuning_job_id\": \"${fine_tuning_job_id}\"}"

# list checkpoints of a finetuning job
curl http://${your_ip}:8005/v1/finetune/list_checkpoints -X POST -H "Content-Type: application/json" \
  -d "{\"fine_tuning_job_id\": \"${fine_tuning_job_id}\"}"
```
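
The job-creation call can also be issued from Python. The sketch below mirrors the "create a finetuning job" `curl` command using only the standard library; `build_create_job_request` and `send` are hypothetical helpers, and `send` assumes the service replies with a JSON body, which this README does not confirm.

```python
import json
import urllib.request


def build_create_job_request(host, training_file, model, port=8005):
    """Build (without sending) the POST request that creates a fine-tuning job."""
    payload = {"training_file": training_file, "model": model}
    return urllib.request.Request(
        url=f"http://{host}:{port}/v1/fine_tuning/jobs",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and decode the response, assuming it is JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `send(build_create_job_request("localhost", "alpaca_data.json", "meta-llama/Llama-2-7b-chat-hf"))` would submit the same job as the `curl` command once the service is running. Keeping request construction separate from sending makes the payload easy to inspect without a live server.
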