Fix model-downloader and tgi in multi shard case #642

Open · wants to merge 2 commits into main
Conversation

@lianhao (Collaborator) commented Dec 12, 2024

Description

Upgrade huggingface-hub to version 0.26.5 when downloading models, because the existing huggingface/downloader:0.17.3 image does not honor the HF_TOKEN correctly.

Loosen the tgi securityContext to allow running with multiple shards.
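
For illustration, a minimal sketch of what this kind of downloader workaround can look like as an init container in a Helm deployment template; the container name, image, secret name, model, and paths are illustrative assumptions, not the exact change in this PR:

```yaml
# Illustrative sketch only: container name, image, secret, model, and paths
# are assumptions, not the exact change made in this PR.
initContainers:
  - name: model-downloader
    image: python:3.11-slim
    command: ["/bin/sh", "-c"]
    args:
      - |
        # Work around HF_TOKEN not being honored by huggingface/downloader:0.17.3
        # by installing a known-good huggingface-hub before downloading.
        pip install --no-cache-dir huggingface-hub==0.26.5 && \
        huggingface-cli download "${MODEL_ID}" --local-dir "/data/${MODEL_ID}"
    env:
      - name: HF_TOKEN
        valueFrom:
          secretKeyRef:
            name: hf-token           # illustrative secret name
            key: token
      - name: MODEL_ID
        value: "Intel/neural-chat-7b-v3-3"   # illustrative model
    volumeMounts:
      - name: model-volume
        mountPath: /data
```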

Issues

Fixes #641
Fixes #639

Type of change

List the type of change like below. Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)

Dependencies

List any newly introduced 3rd-party dependencies, if they exist.

Tests

Describe the tests that you ran to verify your changes.

@lianhao lianhao requested a review from yongfengdu as a code owner December 12, 2024 08:51
@lianhao lianhao requested a review from Ruoyu-y December 12, 2024 08:52
@lianhao lianhao changed the title from "Use huggingface-hub 0.26.5 to download model" to "Fix model-downloader and tgi in multi shard case" on Dec 12, 2024
```diff
@@ -36,7 +36,6 @@ spec:
   name: {{ include "speecht5.fullname" . }}-config
   securityContext:
     allowPrivilegeEscalation: false
-    readOnlyRootFilesystem: true
```
Contributor commented:
Doing "pip install" on every container start is not a fix, at best it's a (rather horrible) temporary band-aid...

Isn't there any HF image which would have a working huggingface-hub version?

If not, this issue should be reported to upstream. And "TODO: fix this for 1.2 release" comment here with a link to the ticket would be good.

=> If upstream does not provide fixed image before OPEA 1.2 release, I think OPEA needs to add such image to DockerHub before next release...

@lianhao (Collaborator, Author) commented Dec 13, 2024
Yes, it's a temporary workaround to unblock the CI. I created an upstream issue: huggingface/huggingface_hub#2708

helm-charts/common/tgi/values.yaml (review comment outdated, resolved)
@eero-t (Contributor) left a comment:
Approved.

The cache being in an emptyDir, i.e. going away when the pod instance goes away (instead of being shared like the model data), is more secure, but it can be a significant pod startup performance issue, especially with HPA and in other setups where pods come and go. That can be looked at in another PR, though.
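
For illustration, a minimal sketch of the two cache options being compared; the volume and claim names are illustrative assumptions, not values from these charts:

```yaml
# Illustrative sketch only: volume and claim names are assumptions.
volumes:
  # Behavior described above: the cache is lost when the pod goes away.
  - name: hf-cache
    emptyDir: {}
  # Persistent alternative: the cache survives restarts and can be shared,
  # at the cost of the weaker isolation noted in the review.
  # - name: hf-cache
  #   persistentVolumeClaim:
  #     claimName: model-pvc
```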

I think in the long term it would be better to separate model downloading from running the application services. That would allow model downloading to be centralized in a single service/container, instead of being split across multiple pods, and even to be done as a separate step before starting any of the application pods.
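
For illustration, a minimal sketch of that suggestion as a one-shot pre-download Job writing to a shared PVC before the application pods start; the Job name, image, model, and claim name are illustrative assumptions:

```yaml
# Illustrative sketch only: names, image, and model are assumptions.
apiVersion: batch/v1
kind: Job
metadata:
  name: model-predownload
spec:
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: downloader
          image: python:3.11-slim
          command: ["/bin/sh", "-c"]
          args:
            - >
              pip install --no-cache-dir huggingface-hub==0.26.5 &&
              huggingface-cli download "${MODEL_ID}" --local-dir "/data/${MODEL_ID}"
          env:
            - name: MODEL_ID
              value: "Intel/neural-chat-7b-v3-3"   # illustrative model
          volumeMounts:
            - name: model-volume
              mountPath: /data
      volumes:
        - name: model-volume
          persistentVolumeClaim:
            claimName: model-pvc   # illustrative shared PVC
```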
