[Examples] PyTorch (train and inference), TGI and TEI (inference) #40

alvarobartt · 2024-06-10T06:52:25Z

As part of our collaboration with Google Cloud, and following up on #2 to create the Deep Learning Containers (DLCs) for both Google Kubernetes Engine (GKE) and Vertex AI, we want to create a dedicated example for each option offered.

The examples to be created and included in this repository are listed below, divided into two categories:

Edit: updated as of Philipp's comment below!

Training
- Vertex AI
  - Custom Container Training Job to fine-tune (SFT) an LLM via TRL's CLI using QLoRA on a single GPU (tracked in #44, "Add examples/vertex-ai/... for LLM fine-tuning with TRL")
  - Custom Container Training Job to run a full fine-tune (SFT) of an LLM via TRL's CLI on N GPUs (N > 1) via accelerate (tracked in #44, "Add examples/vertex-ai/... for LLM fine-tuning with TRL")
  - Kubeflow Pipeline to fine-tune an LLM with a custom script on TPU (we already have https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/tpu-examples/causal-language-modeling)
- GKE
  - fine-tune (SFT) an LLM via TRL's CLI using QLoRA on a single GPU (tracked in #53, "Add examples/gke/... for LLM fine-tuning with TRL")
  - fine-tune (SFT) an LLM via TRL's CLI on N GPUs (N > 1) via accelerate (tracked in #53, "Add examples/gke/... for LLM fine-tuning with TRL")
Inference
- Vertex AI (several examples already exist under examples/vertex-ai/notebooks and only need a review / rewrite; tracked in #55, "Review examples/vertex-ai/notebooks/*.ipynb")
  - PyTorch via transformers using a Custom Prediction Routine (CPR) in Vertex AI with any relevant model for any supported HF_TASK other than text-generation (could be either CPU, GPU or both)
  - TGI via a pre-built DLC (on GPU) from the Hugging Face Hub
  - TGI via a pre-built DLC (on GPU) from a GCS Bucket
  - TEI via a pre-built DLC (could be either CPU, GPU or both)
  - TEI via a pre-built DLC (could be either CPU, GPU or both) from a GCS Bucket
- GKE
  - TGI via a pre-built DLC with a custom Kubernetes configuration for GKE Autopilot mode (on GPU) from the Hugging Face Hub (tracked in #41, "Add examples/gke/tgi-deployment")
  - TGI via a pre-built DLC with a custom Kubernetes configuration for GKE Autopilot mode (on GPU) from a GCS Bucket (tracked in #42, "Add examples/gke/{tei,tgi}-from-gcs-deployment")
  - TEI via a pre-built DLC with a custom Kubernetes configuration for GKE Autopilot mode (could be either CPU, GPU or both) (tracked in #43, "Add examples/gke/tei-deployment")
  - TEI via a pre-built DLC with a custom Kubernetes configuration for GKE Autopilot mode (could be either CPU, GPU or both) from a GCS Bucket (tracked in #42, "Add examples/gke/{tei,tgi}-from-gcs-deployment")
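
For reference, once one of the TGI or TEI deployments above is running, a quick smoke test could look roughly like the sketch below. It only builds the HTTP requests for TGI's `/generate` route and TEI's `/embed` route without sending them. The helper names and the `http://localhost:8080` base URL are assumptions for illustration (e.g. a port-forwarded GKE service), not something taken from the examples themselves. Vertex AI endpoints are called through the Vertex AI prediction API instead, which this sketch does not cover.

```python
import json
import urllib.request


def _post_json(url: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request."""
    return urllib.request.Request(
        url=url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_tgi_request(base_url: str, prompt: str, max_new_tokens: int = 64) -> urllib.request.Request:
    """Hypothetical helper: request for TGI's /generate route."""
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    return _post_json(f"{base_url.rstrip('/')}/generate", payload)


def build_tei_request(base_url: str, texts: list[str]) -> urllib.request.Request:
    """Hypothetical helper: request for TEI's /embed route."""
    return _post_json(f"{base_url.rstrip('/')}/embed", {"inputs": texts})


if __name__ == "__main__":
    # Assumed base URL; replace with wherever the container is reachable.
    base_url = "http://localhost:8080"
    tgi_req = build_tgi_request(base_url, "What is Deep Learning?")
    tei_req = build_tei_request(base_url, ["What is Deep Learning?"])
    print(tgi_req.full_url, tgi_req.data.decode("utf-8"))
    print(tei_req.full_url, tei_req.data.decode("utf-8"))
    # To actually send one of them: urllib.request.urlopen(tgi_req)
```

Keeping the request construction separate from sending makes it easy to check the payload shape before pointing it at a live deployment.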

Note

This issue assumes that the DLCs are already created and can be used as containers for the examples described above.

philschmid · 2024-06-10T09:23:47Z

Additionally, we should create examples for

Training (GKE)
- fine-tune (SFT) an LLM via TRL's CLI using QLoRA on a single GPU
- fine-tune (SFT) an LLM via TRL's CLI using QLoRA on N GPUs (N > 1) via accelerate

Inference
- Vertex AI
  - TGI via a pre-built DLC (on GPU) loading fine-tuned model from GCS
- GKE
  - TGI via a pre-built DLC (on GPU) loading fine-tuned model from GCS

Note

We already have a few examples covering some of those topics. Let's try to reuse most of them and update them rather than creating new versions.

alvarobartt added the examples label Jun 10, 2024

alvarobartt self-assigned this Jun 10, 2024

alvarobartt closed this as completed Jul 30, 2024
