Fix model-downloader and tgi in multi shard case #642

Open · wants to merge 2 commits into main
Conversation

@lianhao (Collaborator) commented Dec 12, 2024

Description

Upgrade huggingface-hub to version 0.26.5 when downloading models, because the existing huggingface/downloader:0.17.3 image does not honor the HF_TOKEN correctly.

Loosen the tgi securityContext to allow running with multiple shards.
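
For illustration, a minimal sketch of what this kind of downloader workaround can look like as an init container in a Helm deployment template; the container name, image, secret name, model, and paths are illustrative assumptions, not the exact change in this PR:

```yaml
# Illustrative sketch only: container name, image, secret, model, and paths
# are assumptions, not the exact change made in this PR.
initContainers:
  - name: model-downloader
    image: python:3.11-slim
    command: ["/bin/sh", "-c"]
    args:
      - |
        # Work around HF_TOKEN not being honored by huggingface/downloader:0.17.3
        # by installing a known-good huggingface-hub before downloading.
        pip install --no-cache-dir huggingface-hub==0.26.5 && \
        huggingface-cli download "${MODEL_ID}" --local-dir "/data/${MODEL_ID}"
    env:
      - name: HF_TOKEN
        valueFrom:
          secretKeyRef:
            name: hf-token           # illustrative secret name
            key: token
      - name: MODEL_ID
        value: "Intel/neural-chat-7b-v3-3"   # illustrative model
    volumeMounts:
      - name: model-volume
        mountPath: /data
```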

Issues

Fixes #641
Fixes #639

Type of change

List the type of change like below. Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)

Dependencies

List any newly introduced 3rd-party dependencies, if they exist.

Tests

Describe the tests that you ran to verify your changes.

@lianhao lianhao requested a review from yongfengdu as a code owner December 12, 2024 08:51
@lianhao lianhao requested a review from Ruoyu-y December 12, 2024 08:52
@lianhao lianhao changed the title from "Use huggingface-hub 0.26.5 to download model" to "Fix model-downloader and tgi in multi shard case" on Dec 12, 2024
```diff
@@ -36,7 +36,6 @@ spec:
   name: {{ include "speecht5.fullname" . }}-config
   securityContext:
     allowPrivilegeEscalation: false
-    readOnlyRootFilesystem: true
```
Contributor commented:
Doing "pip install" on every container start is not a fix, at best it's a (rather horrible) temporary band-aid...

Isn't there any HF image which would have a working huggingface-hub version?

If not, this issue should be reported to upstream. And "TODO: fix this for 1.2 release" comment here with a link to the ticket would be good.

=> If upstream does not provide fixed image before OPEA 1.2 release, I think OPEA needs to add such image to DockerHub before next release...

@lianhao (Collaborator, Author) commented Dec 13, 2024
Yes, it's a temporary workaround to unblock the CI. I created an upstream issue: huggingface/huggingface_hub#2708

helm-charts/common/tgi/values.yaml (review comment outdated, resolved)
@eero-t (Contributor) left a comment:
Approved.

The cache being in an emptyDir, i.e. going away when the pod instance goes away (instead of being shared like the model data), is more secure, but it can be a significant pod startup performance issue, especially with HPA and in other setups where pods come and go. That can be looked at in another PR, though.
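
For illustration, a minimal sketch of the two cache options being compared; the volume and claim names are illustrative assumptions, not values from these charts:

```yaml
# Illustrative sketch only: volume and claim names are assumptions.
volumes:
  # Behavior described above: the cache is lost when the pod goes away.
  - name: hf-cache
    emptyDir: {}
  # Persistent alternative: the cache survives restarts and can be shared,
  # at the cost of the weaker isolation noted in the review.
  # - name: hf-cache
  #   persistentVolumeClaim:
  #     claimName: model-pvc
```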

I think in the long term it would be better to separate model downloading from running the application services. That would allow model downloading to be centralized in a single service/container, instead of being split across multiple pods, and even to be done as a separate step before starting any of the application pods.
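
For illustration, a minimal sketch of that suggestion as a one-shot pre-download Job writing to a shared PVC before the application pods start; the Job name, image, model, and claim name are illustrative assumptions:

```yaml
# Illustrative sketch only: names, image, and model are assumptions.
apiVersion: batch/v1
kind: Job
metadata:
  name: model-predownload
spec:
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: downloader
          image: python:3.11-slim
          command: ["/bin/sh", "-c"]
          args:
            - >
              pip install --no-cache-dir huggingface-hub==0.26.5 &&
              huggingface-cli download "${MODEL_ID}" --local-dir "/data/${MODEL_ID}"
          env:
            - name: MODEL_ID
              value: "Intel/neural-chat-7b-v3-3"   # illustrative model
          volumeMounts:
            - name: model-volume
              mountPath: /data
      volumes:
        - name: model-volume
          persistentVolumeClaim:
            claimName: model-pvc   # illustrative shared PVC
```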
