Deployment error on GKE #126

Open
piksida opened this issue Dec 1, 2024 · 2 comments

piksida commented Dec 1, 2024

Hello!
I deployed Gemma 2 2B IT on GKE in Autopilot mode following these instructions: https://cloud.google.com/kubernetes-engine/docs/tutorials/serve-gemma-gpu-tgi#autopilot. I get this error: "Node scale up in zones us-central1-c associated with this pod failed: GCE quota exceeded. Pod is at risk of not being scheduled." I checked the quota and there is enough GPU quota available, but the pod stays in the Pending state.

@alvarobartt alvarobartt self-assigned this Dec 2, 2024
Member

alvarobartt commented Dec 2, 2024

Hi again @piksida, as mentioned via LinkedIn, the issue is related to the quota rather than the TGI deployment. To fix it, visit https://cloud.google.com/kubernetes-engine/docs/how-to/gpus#gpu_quota and make sure the quota for the GPU you want to use, e.g. NVIDIA L4, is greater than or equal to 1 (which, from what you shared via LinkedIn, already seems to be the case). In any case, try increasing it to at least 3 so that we can rule that out. Otherwise, you can try creating the GKE cluster in Standard mode with a node pool that has a single NVIDIA L4, which should work with your current quota.

See the screenshot below of the filters that you need to apply in order to see the quota that you need to upgrade:

image
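
For a quick check outside the console, here is a minimal sketch that computes the remaining headroom for a GPU quota from a regional quota listing. It assumes the JSON shape produced by `gcloud compute regions describe us-central1 --format=json` (a `quotas` list whose entries carry `metric`, `limit` and `usage`), and it assumes `NVIDIA_L4_GPUS` is the metric name for the L4 quota. The function name and the sample data are made up for illustration and are not taken from this thread.

```python
import json


def gpu_quota_headroom(describe_output: str, metric: str = "NVIDIA_L4_GPUS") -> float:
    """Return limit minus usage for `metric` in a regional quota listing.

    `describe_output` is expected to be the JSON text of a region description.
    The default metric name is an assumption; adjust it to the GPU you use.
    """
    data = json.loads(describe_output)
    for quota in data.get("quotas", []):
        if quota.get("metric") == metric:
            return float(quota["limit"]) - float(quota["usage"])
    raise KeyError(f"metric {metric!r} not found in quota listing")


# Hypothetical sample data, NOT real output from any project.
SAMPLE = json.dumps(
    {
        "name": "us-central1",
        "quotas": [
            {"metric": "CPUS", "limit": 24.0, "usage": 4.0},
            {"metric": "NVIDIA_L4_GPUS", "limit": 1.0, "usage": 0.0},
        ],
    }
)

if __name__ == "__main__":
    headroom = gpu_quota_headroom(SAMPLE)
    print(f"NVIDIA_L4_GPUS headroom: {headroom}")
```

To use it on a real project, you would pass in the saved output of the `gcloud` command above instead of `SAMPLE`. Note that a separate project-wide GPU quota may also apply, so a positive regional headroom on its own may not rule out the "GCE quota exceeded" error.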

@alvarobartt
Member

Here's another reference from the Google Cloud Forum that may be relevant: https://www.googlecloudcommunity.com/gc/Google-Kubernetes-Engine-GKE/GKE-Autopilot-cluster-and-Wanted-up-a-GPU-Nvidia-l4-or-Nvidia/m-p/785551
