Deployment error on GKE #126

piksida · 2024-12-01T14:09:29Z

Hello!
I deployed Gemma 2 2b it on GKE in Autopilot mode following these instructions: https://cloud.google.com/kubernetes-engine/docs/tutorials/serve-gemma-gpu-tgi#autopilot. The pod shows this error: "Node scale up in zones us-central1-c associated with this pod failed: GCE quota exceeded. Pod is at risk of not being scheduled." I checked the quota and there is enough GPU quota, but the pod stays in the Pending state.

alvarobartt · 2024-12-02T10:18:03Z

Hi again @piksida, as mentioned via LinkedIn, the issue is related to the quota rather than the TGI deployment. To fix it, visit https://cloud.google.com/kubernetes-engine/docs/how-to/gpus#gpu_quota and make sure that the quota for the GPU you want to use, e.g. NVIDIA L4, is greater than or equal to 1 (which, from what you shared via LinkedIn, already seems to be the case). In any case, try increasing it to at least 3 so that we can rule that out. Otherwise, you can try creating the GKE cluster in Standard mode with a node pool that has a single NVIDIA L4, which should work with your current quota.
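If you prefer to check the quota from a terminal instead of the console, here is a minimal sketch. It assumes an authenticated `gcloud` CLI and reads the `quotas` list returned by `gcloud compute regions describe`. The helper names `remaining_quota` and `describe_region` are hypothetical, and the metric name `NVIDIA_L4_GPUS` is an assumption you should confirm against the names shown on your own quota page.

```python
import json
import subprocess


def remaining_quota(region_info, metric):
    """Return (limit, usage, remaining) for a quota metric, or None if absent.

    `region_info` is the parsed JSON of `gcloud compute regions describe`,
    which carries a `quotas` list of {"metric", "limit", "usage"} entries.
    """
    for quota in region_info.get("quotas", []):
        if quota.get("metric") == metric:
            limit = float(quota.get("limit", 0))
            usage = float(quota.get("usage", 0))
            return limit, usage, limit - usage
    return None


def describe_region(region):
    """Fetch the region description as a dict via the gcloud CLI."""
    result = subprocess.run(
        ["gcloud", "compute", "regions", "describe", region, "--format=json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

For example, `remaining_quota(describe_region("us-central1"), "NVIDIA_L4_GPUS")` returns the limit, the current usage, and what is left. Since the error only says "GCE quota exceeded", it may be worth running the same check for other regional metrics such as `CPUS`, as the exhausted quota is not necessarily the GPU one.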

See the screenshot below of the filters that you need to apply in order to see the quota that you need to upgrade:

alvarobartt · 2024-12-02T10:39:28Z

Here's another reference from the Google Cloud Forum that may be relevant to you: https://www.googlecloudcommunity.com/gc/Google-Kubernetes-Engine-GKE/GKE-Autopilot-cluster-and-Wanted-up-a-GPU-Nvidia-l4-or-Nvidia/m-p/785551

alvarobartt self-assigned this Dec 2, 2024

alvarobartt added the question label Dec 2, 2024
