-
Notifications
You must be signed in to change notification settings - Fork 61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix model-downloader and tgi in multi shard case #642
base: main
Are you sure you want to change the base?
Conversation
@@ -36,7 +36,6 @@ spec: | |||
name: {{ include "speecht5.fullname" . }}-config | |||
securityContext: | |||
allowPrivilegeEscalation: false | |||
readOnlyRootFilesystem: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing "pip install" on every container start is not a fix, at best it's a (rather horrible) temporary band-aid...
Isn't there any HF image which would have a working huggingface-hub
version?
If not, this issue should be reported to upstream. And "TODO: fix this for 1.2 release" comment here with a link to the ticket would be good.
=> If upstream does not provide fixed image before OPEA 1.2 release, I think OPEA needs to add such image to DockerHub before next release...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, it's a temporary workaround to unblock the CI. Created an issue in upstream huggingface/huggingface_hub#2708
Signed-off-by: Lianhao Lu <[email protected]>
Fix issue opea-project#639 Signed-off-by: Lianhao Lu <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved.
Cache being in emptyDir
i.e. going away when the pod instance goes away (instead of being shared like model data), is more secure, but it can be significant pod startup performance issue. Especially with HPA and in other setups were pods come and go. That can be looked in another PR though.
I think in long term it would be better to separate model downloading and running of the application services. That would allow model downloading to be centralized to single service / container, instead of split over multiple pods, and even to do it as separate step before starting any of the application pods.
Description
Upgrade huggingface-hub to version 0.26.5 when downloading models, due to the existing
huggingface/downloader:0.17.3
image doesn't acknowledge the HF_TOKEN correctly.Loose the tgi securityContext to allow running with multi shard.
Issues
Fixes #641
Fixes #639
Type of change
List the type of change like below. Please delete options that are not relevant.
Dependencies
List the newly introduced 3rd party dependency if exists.
Tests
Describe the tests that you ran to verify your changes.