get kserve up to date (#66)
* add tags to rest server timing logs to differentiate cpu and wall time (kserve#3954)

Signed-off-by: Gregory Keith <[email protected]>

* Implement Huggingface model download in storage initializer (kserve#3584)

* initial commit for hugging face model download and load

Signed-off-by: Andrews Arokiam <[email protected]>

* bug fix on storage initializer

Signed-off-by: Andrews Arokiam <[email protected]>

* added hf_token and unittests

Signed-off-by: Andrews Arokiam <[email protected]>

* separate hf-storage-initializer image to reduce image size

Signed-off-by: Andrews Arokiam <[email protected]>

* review comment changes

Signed-off-by: Andrews Arokiam <[email protected]>

* snapshot download

Signed-off-by: Andrews Arokiam <[email protected]>

* use existing image for storage initializer

Signed-off-by: Andrews Arokiam <[email protected]>

* resolved merge conflicts

Signed-off-by: Andrews Arokiam <[email protected]>

* added hf storage uri validation

Signed-off-by: Andrews Arokiam <[email protected]>

* resolved merge conflicts

Signed-off-by: Andrews Arokiam <[email protected]>

---------

Signed-off-by: Andrews Arokiam <[email protected]>

* Update OWNERS file (kserve#3966)

Signed-off-by: Dan Sun <[email protected]>

* Cluster local model controller (kserve#3860)

* Consolidate into one commit

Signed-off-by: Jin Dong <[email protected]>

* Fix configmap format

Signed-off-by: Jin Dong <[email protected]>

* Fix configmap

Signed-off-by: Jin Dong <[email protected]>

* Log configmap read error

Signed-off-by: Jin Dong <[email protected]>

* fix naming

Signed-off-by: Dan Sun <[email protected]>

* Update comments

Signed-off-by: Jin Dong <[email protected]>

* Add enabled flag to configmap and avoid cluster resource check in isvc defaulter

Signed-off-by: Jin Dong <[email protected]>

* move client into the local model block

Signed-off-by: Dan Sun <[email protected]>

* Fix lint

Signed-off-by: Jin Dong <[email protected]>

---------

Signed-off-by: Jin Dong <[email protected]>
Signed-off-by: Dan Sun <[email protected]>
Co-authored-by: Dan Sun <[email protected]>

* Prepare for 0.14.0-rc1 release and automate sync process (kserve#3970)

* Sync helm chart with kustomize

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Update manifest generation script to sync helm charts

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Make kserve-addressable-resolver role optional

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Prepare for 0.14.0-rc1 release

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Update release process

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Comment out crd sync script in make

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Fix helm template syntax

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

---------

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* add a new API for multi-node/multi-gpu (kserve#3871)

* add a new API for multi-node/multi-gpu

Signed-off-by: jooho lee <[email protected]>

* fix gitaction

Signed-off-by: jooho lee <[email protected]>

* fix merging conflict

Signed-off-by: jooho lee <[email protected]>

* fix gitaction fail

Signed-off-by: jooho lee <[email protected]>

* regenerate codegen/manifests

Signed-off-by: jooho lee <[email protected]>

* Apply suggestions from code review

Co-authored-by: Dan Sun <[email protected]>
Signed-off-by: Jooho Lee <[email protected]>

* remove unnecessary comment

Signed-off-by: jooho lee <[email protected]>

* change the type of workerSpec in isvc to PodSpec

Signed-off-by: jooho lee <[email protected]>

* update controller-gen version

Signed-off-by: jooho lee <[email protected]>

* remove replicas from workerSpec

Signed-off-by: jooho lee <[email protected]>

* fix conflict merging

Signed-off-by: jooho lee <[email protected]>

* added size(replicas) for workerSpec again

Signed-off-by: jooho lee <[email protected]>

* add WorkerSpec to inferenceService

Signed-off-by: jooho lee <[email protected]>

* fix go linter

Signed-off-by: jooho lee <[email protected]>

---------

Signed-off-by: jooho lee <[email protected]>
Signed-off-by: Jooho Lee <[email protected]>
Signed-off-by: Jooho Lee <[email protected]>
Co-authored-by: Dan Sun <[email protected]>

* Fix update-openapigen.sh that can be executed from kserve dir (kserve#3924)

* fix openapigen.sh that can be executed from kserve dir

Signed-off-by: jooho lee <[email protected]>

* regenerate codegen/manifests

Signed-off-by: jooho lee <[email protected]>

* Update go.sum

Signed-off-by: Dan Sun <[email protected]>

---------

Signed-off-by: jooho lee <[email protected]>
Signed-off-by: Dan Sun <[email protected]>
Co-authored-by: Dan Sun <[email protected]>

* Add python 3.12 support and remove python 3.8 support (kserve#3645)

* Support python 3.12

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Update dependencies

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Update deps to support 3.12

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Remove python 3.8 support

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Remove skip for infer client test

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Fix port forward

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Fix sklearn pandas dep

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* skip pydantic v1 test for py 3.12

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Add setuptools dep for pmml

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Fix lgb

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Include setuptools for paddle

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Include setuptools for huggingface

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Rebase

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Rebase

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

---------

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Fix openssl vulnerability CWE-1395 (kserve#3975)

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Fix Kubernetes Doc Links (kserve#3670)

* Bump version to 0.13.0-rc0 (kserve#3665)

Signed-off-by: Curtis Maddalozzo <[email protected]>
Signed-off-by: jordanyono <[email protected]>

* fixing docs

Signed-off-by: jordanyono <[email protected]>

* fix spelling mistake

Signed-off-by: jordanyono <[email protected]>

---------

Signed-off-by: Curtis Maddalozzo <[email protected]>
Signed-off-by: jordanyono <[email protected]>
Co-authored-by: Curtis Maddalozzo <[email protected]>

* Fix kserve local testing env (kserve#3981)

* Fix local testing

Signed-off-by: Dan Sun <[email protected]>

* Fix codegen

Signed-off-by: Dan Sun <[email protected]>

---------

Signed-off-by: Dan Sun <[email protected]>

* Fix streaming response not working properly with logger (kserve#3847)

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Add a flag for automount serviceaccount token (kserve#3979)

* Add a flag for automount serviceaccount

Signed-off-by: Jin Dong <[email protected]>

* Set default to false

Signed-off-by: Jin Dong <[email protected]>

* Default to true

Signed-off-by: Jin Dong <[email protected]>

* Fix test error

Signed-off-by: Jin Dong <[email protected]>

* Update openapi generated.go

Signed-off-by: Jin Dong <[email protected]>

* Fix python lint

Signed-off-by: Jin Dong <[email protected]>

* Fix config loading

Signed-off-by: Jin Dong <[email protected]>

---------

Signed-off-by: Jin Dong <[email protected]>

* Do not set security context on the storage initializer from user container (kserve#3985)

* Do not set security context on the storage initializer from user container

Signed-off-by: Jin Dong <[email protected]>

* Add securityContext to the default storage container in the helm chart

Signed-off-by: Jin Dong <[email protected]>

---------

Signed-off-by: Jin Dong <[email protected]>

* Modelcar race condition mitigation with an init container (kserve#3932)

This adds the model container as an init-container to mitigate a race
condition that would happen if the model container is not present on the
cluster-node. The race condition happens if the cluster is able to fetch
and start the runtime container before the modelcar is fetched. This
would cause the runtime to terminate with an error.

By configuring the model container as an init-container the runtime
won't start until the modelcar is fetched. Although there is still the
risk of a race condition when the cluster schedules the runtime
container first, the pod should stabilize after a few restarts of the
runtime container and should either prevent a CrashLoopBackOff event on
the pod, or the crash event would finish quickly.

This improves compatibility with the runtimes which can now stay
agnostic to the modelcar implementation, until better techniques (like
native sidecars, and oci volume mounts) become mature.
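
A minimal sketch of the idea in Python, not KServe's actual Go implementation: the modelcar image is listed a second time as an init container, so it has to be pulled before any regular container starts. The function name, the dict-based pod spec, the `modelcar` container name, and the no-op `sh -c` command are all assumptions made for illustration.

```python
import copy


def add_modelcar_init_container(pod_spec: dict, modelcar_name: str = "modelcar") -> dict:
    """Return a copy of pod_spec with the modelcar image also run as an init container."""
    spec = copy.deepcopy(pod_spec)
    # Find the existing modelcar sidecar (container name is a hypothetical convention).
    modelcar = next(c for c in spec["containers"] if c["name"] == modelcar_name)
    init_container = {
        "name": f"{modelcar_name}-init",
        "image": modelcar["image"],
        # Assumed no-op command: the container only has to start and exit,
        # which forces the image to be present on the node first.
        "command": ["sh", "-c", "exit 0"],
    }
    spec.setdefault("initContainers", []).append(init_container)
    return spec
```

Init containers run to completion before the regular containers start, which is why reusing the image there delays the runtime until the image is available.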

Signed-off-by: Edgar Hernández <[email protected]>

* Fix: Headers passing for v1/v2 endpoints (kserve#3669)

* Initial commit for headers passing issue

Signed-off-by: Andrews Arokiam <[email protected]>

* modifying the e2e test for rebase conflict

Signed-off-by: Andrews Arokiam <[email protected]>

* bug fix on unittest

Signed-off-by: Andrews Arokiam <[email protected]>

* review changes

Signed-off-by: Andrews Arokiam <[email protected]>

* fix for test failure

Signed-off-by: Andrews Arokiam <[email protected]>

* bug fix on e2e test

Signed-off-by: Andrews Arokiam <[email protected]>

* overriding the entrypoint of custom model images

Signed-off-by: Andrews Arokiam <[email protected]>

* custom response header

Signed-off-by: Andrews Arokiam <[email protected]>

* fix for unittest failure

Signed-off-by: Andrews Arokiam <[email protected]>

* added custom response headers in post process

Signed-off-by: Andrews Arokiam <[email protected]>

* added predict time latency in example response header

Signed-off-by: Andrews Arokiam <[email protected]>

* fix OOM

---------

Signed-off-by: Andrews Arokiam <[email protected]>
Co-authored-by: Dan Sun <[email protected]>

* Torchserve security update (kserve#3774)

* security update

Signed-off-by: udai <[email protected]>

* adding sign off

Signed-off-by: udai <[email protected]>

---------

Signed-off-by: udai <[email protected]>

* Pin ubuntu 22.04 for minikube setup action (kserve#3994)

Signed-off-by: Jin Dong <[email protected]>

* KServe 0.14 Release (kserve#3988)

* temp commit

Signed-off-by: Jin Dong <[email protected]>

* python-release.sh

Signed-off-by: Jin Dong <[email protected]>

---------

Signed-off-by: Jin Dong <[email protected]>

* bump to vllm0.6.2 add explicit chat template (kserve#3964)

* explicitly give a chat template

Signed-off-by: yxia216 <[email protected]>

* fix dummy model issue, fix python version smaller than 3.10, and formatting

Signed-off-by: yxia216 <[email protected]>

* fix vLLMModel

Signed-off-by: yxia216 <[email protected]>

* change the interface of CreateChatCompletionRequest

Signed-off-by: yxia216 <[email protected]>

* update dummy model's para

Signed-off-by: yxia216 <[email protected]>

* consistent with OpenAIGPTTokenizer and OpenAIGPTModel

Signed-off-by: yxia216 <[email protected]>

* give a chat template if there is none

Signed-off-by: yxia216 <[email protected]>

* update the response and update the readme

Signed-off-by: yxia216 <[email protected]>

* update the chat_template

Signed-off-by: yxia216 <[email protected]>

* update data

Signed-off-by: yxia216 <[email protected]>

* add test of chat template for tokenizer

Signed-off-by: yxia216 <[email protected]>

* jinja2 template format

Signed-off-by: yxia216 <[email protected]>

* use a simpler chat template

---------

Signed-off-by: yxia216 <[email protected]>

* bump to vllm0.6.3 (kserve#4001)

Signed-off-by: yxia216 <[email protected]>

* Feature: Add hf transfer (kserve#4000)

* Add hf transfer

Signed-off-by: tjandy98 <[email protected]>

* Add hf transfer env

Signed-off-by: tjandy98 <[email protected]>

---------

Signed-off-by: tjandy98 <[email protected]>

* Fix snyk scan null error (kserve#3974)

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Update quick install script (kserve#4005)

Signed-off-by: Johnu George <[email protected]>

* Local Model Node CR (kserve#3978)

* init CR

Signed-off-by: Gavin Li <[email protected]>

* make generate

Signed-off-by: Gavin Li <[email protected]>

* make manifests

Signed-off-by: Gavin Li <[email protected]>

* black format

Signed-off-by: Gavin Li <[email protected]>

* fix generated python code

Signed-off-by: Gavin Li <[email protected]>

* feedback

Signed-off-by: Gavin Li <[email protected]>

* more feedback

Signed-off-by: Gavin Li <[email protected]>

* black format

Signed-off-by: Gavin Li <[email protected]>

* make manifests

Signed-off-by: Gavin Li <[email protected]>

---------

Signed-off-by: Gavin Li <[email protected]>

* Reduce E2Es dependency on CI environment (2) (kserve#4008)

Reduce E2Es dependency on CI environment

Some E2E code assumes the environment is GitHub, because it refers to GitHub-specific variables. This PR focuses on the `kserve/custom-model-grpc` container image, so that no E2E Python code using this image references the `github_sha` variable.

Also, the `get_isvc_endpoint` utility function now uses the URL scheme of the endpoint specified in the status of the InferenceService, rather than hard-coding plain-text HTTP. This adds compatibility for CI environments where the KServe ConfigMap has been configured with `urlScheme: https` for the Ingress.
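
As a rough illustration of that second change (a sketch only, not the actual `get_isvc_endpoint` code), the scheme can be read from the status URL instead of being assumed. The helper name `isvc_base_url` and the example hostnames are hypothetical.

```python
from urllib.parse import urlparse


def isvc_base_url(status_url: str) -> str:
    """Build a base URL that keeps the scheme reported in the InferenceService status."""
    parsed = urlparse(status_url)
    # Fall back to plain HTTP only when the status URL carries no scheme.
    scheme = parsed.scheme or "http"
    return f"{scheme}://{parsed.netloc}"


print(isvc_base_url("https://sklearn.default.example.com/v1/models/sklearn"))
```

With `urlScheme: https` configured, the status URL starts with `https://` and the test client follows it rather than downgrading to HTTP.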

Signed-off-by: Edgar Hernández <[email protected]>

* Allow GCS to download single file (kserve#4015)

allow gcs to download single file

fixes kserve#4013

Signed-off-by: Spolti <[email protected]>

* bump to vllm0.6.3.post1 (kserve#4023)

Signed-off-by: yxia216 <[email protected]>

* Set default for SamplingParams.max_tokens in OpenAI requests if unset (kserve#4020)

* Set default for SamplingParams.max_tokens in OpenAI requests if unset

Signed-off-by: Kevin Mingtarja <[email protected]>

* Fix lint

Signed-off-by: Kevin Mingtarja <[email protected]>

* Fix formatting

Signed-off-by: Kevin Mingtarja <[email protected]>

---------

Signed-off-by: Kevin Mingtarja <[email protected]>

* Add tools functionality to vLLM (kserve#4033)

* Add tools to chat template

Signed-off-by: Arjun Bhalla <[email protected]>

Linting

Signed-off-by: Arjun Bhalla <[email protected]>

add test

Signed-off-by: Arjun Bhalla <[email protected]>

Fix linting manually

Signed-off-by: Arjun Bhalla <[email protected]>

* Fix linting

Signed-off-by: Arjun Bhalla <[email protected]>

---------

Signed-off-by: Arjun Bhalla <[email protected]>
Signed-off-by: Arjun Bhalla <[email protected]>
Co-authored-by: Arjun Bhalla <[email protected]>

* Use apt-get upgrade for CVE fixes

Signed-off-by: Dan Sun <[email protected]>

* For vllm users, our parser should be able to support both - and _ (kserve#3933)

Signed-off-by: yxia216 <[email protected]>

* Add tools unpacking for vLLM (kserve#4035)

* Add tools to chat template

Signed-off-by: Arjun Bhalla <[email protected]>

Linting

Signed-off-by: Arjun Bhalla <[email protected]>

add test

Signed-off-by: Arjun Bhalla <[email protected]>

Fix linting manually

Signed-off-by: Arjun Bhalla <[email protected]>

* Fix linting

Signed-off-by: Arjun Bhalla <[email protected]>

* Add tools unpacking for vllm

Signed-off-by: Arjun Bhalla <[email protected]>

* Add sanity check test

Signed-off-by: Arjun Bhalla <[email protected]>

---------

Signed-off-by: Arjun Bhalla <[email protected]>
Signed-off-by: Arjun Bhalla <[email protected]>
Co-authored-by: Arjun Bhalla <[email protected]>

* Multi-Node Inference Implementation (kserve#3972)

Signed-off-by: jooho lee <[email protected]>

* Enhance InjectAgent to Handle Only HTTPGet, TCP Readiness Probes (kserve#4012)

* Fix readiness probe logic and update test scenarios for HTTPGet, TCPSocket, and Exec handling

Signed-off-by: Snehomoy <[email protected]>

* Update: Refactor logic for readiness probe handling

Signed-off-by: Snehomoy <[email protected]>

* Apply gofmt formatting to agent_injector.go

Signed-off-by: Snehomoy <[email protected]>

* Added logger to replace fmt.Printf for better consistency and observability

Signed-off-by: Snehomoy <[email protected]>

* Formatted file using goimports with -local

Signed-off-by: Snehomoy <[email protected]>

---------

Signed-off-by: Snehomoy <[email protected]>

* Feat: Fix memory issue by replacing io.ReadAll with io.Copy (kserve#4017) (kserve#4018)

* Feat: Fix memory issue by replacing io.ReadAll with io.Copy (kserve#4017)

Previously, io.ReadAll was causing out-of-memory problems when downloading large files from GCS.
This change replaces io.ReadAll() with io.Copy() to stream data and prevent excessive memory usage.
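
The original fix is in Go. The same contrast can be sketched in Python as an analogue, under the assumption that the source and destination are file-like objects: reading everything at once holds the whole object in memory, while a chunked copy keeps memory use bounded by the chunk size. The function names and the chunk size are illustrative.

```python
import io
import shutil


def download_buffered(src, dst) -> None:
    """Read the whole object into memory first (analogous to io.ReadAll)."""
    data = src.read()
    dst.write(data)


def download_streamed(src, dst, chunk_size: int = 64 * 1024) -> None:
    """Copy in fixed-size chunks (analogous to io.Copy); memory stays near chunk_size."""
    shutil.copyfileobj(src, dst, chunk_size)


payload = b"x" * 200_000
out = io.BytesIO()
download_streamed(io.BytesIO(payload), out)
```

Both functions produce the same bytes; only the peak memory differs, which matters for multi-gigabyte model files.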

Signed-off-by: ops-jaeha <[email protected]>

* Feat: Fix add newline at end of file to satisfy golang lint

Signed-off-by: ops-jaeha <[email protected]>

* Feat: Refactor log Info for golang lint (kserve#4017)

Signed-off-by: ops-jaeha <[email protected]>

---------

Signed-off-by: ops-jaeha <[email protected]>

* Update alibiexplainer example (kserve#4004)

chore: Fix CVE-2024-26130 - NULL Pointer Dereference
  - Upgrade cryptography to version 42.0.4 or higher.
Update Python version to match KServe 0.14.0
Update tensorflow, tensorflow-io-gcs-filesystem and dill libraries

Signed-off-by: Spolti <[email protected]>

* Fix huggingface build runs out of storage in CI (kserve#4044)

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Update snyk scan to include new images (kserve#4042)

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Introducing KServe Guru on Gurubase.io (kserve#4038)

Signed-off-by: Kursat Aktas <[email protected]>

* Fix Hugging Face server EncoderModel not returning probabilities (kserve#4024)

* Fix huggingface server not working with return_probabilities

Signed-off-by: oplushappy <[email protected]>

* Fix pytest huggingface server assertion error

Signed-off-by: oplushappy <[email protected]>

* Fix the lint error and add approx for assertion

Signed-off-by: oplushappy <[email protected]>

* Parse string output to dictionary for accurate assertion

Signed-off-by: oplushappy <[email protected]>

* Fix linting error

Signed-off-by: oplushappy <[email protected]>

---------

Signed-off-by: oplushappy <[email protected]>

* Add deeper readiness check for transformer (kserve#3348)

* Add deeper readiness and liveness check for transformer

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Add unit tests

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* put the feature behind flag

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Update tests

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* resolve comments

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Make use of inference client

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Add e2e test

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Make inference client singleton and lazy initialize

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Raise 503 If server is not ready / live

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Add test for custom transformer with rest protocol

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Fix CI running out of space

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Increase memory limit

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Check for model ready

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Webhook debug

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Address reviews

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Check for retry count in grpc client

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Update python/kserve/kserve/model_server.py

Co-authored-by: Dan Sun <[email protected]>
Signed-off-by: Sivanantham <[email protected]>

---------

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>
Signed-off-by: Sivanantham <[email protected]>
Co-authored-by: Dan Sun <[email protected]>

* Fix Starlette Denial of service (DoS) via multipart/form-data (kserve#4006)

chore: Fix CVE-2024-47874

Signed-off-by: Spolti <[email protected]>

* remove duplicated import "github.com/onsi/gomega" (kserve#4051)

remove duplicated import

Signed-off-by: carlory <[email protected]>

* Fix localmodel controller name in snyk scan workflow (kserve#4054)

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Fix azure blob storage access key env not mounted (kserve#4064)

* add storageaccesskey to azure env builder

Signed-off-by: bentohset <[email protected]>

* update integration and unit test for azure storage access key

Signed-off-by: bentohset <[email protected]>

* fix formatting

Signed-off-by: bentohset <[email protected]>

---------

Signed-off-by: bentohset <[email protected]>

* Storage Initializer support single digit azure DNS zone ID (kserve#4070)

* support single digit azure zone id

Signed-off-by: bentohset <[email protected]>

* add single digit azure dns zone id tests

Signed-off-by: bentohset <[email protected]>

* fix formatting

Signed-off-by: bentohset <[email protected]>

---------

Signed-off-by: bentohset <[email protected]>

* support text embedding task in huggingfaceserver

Signed-off-by: Kevin Mingtarja <[email protected]>

* fix lint errors

Signed-off-by: Kevin Mingtarja <[email protected]>

* format code

Signed-off-by: Kevin Mingtarja <[email protected]>

* bring back enhancements after getting kserve up-to-date (#42)

* improve dockerfile, makefile, readme

* support custom classification labels, refactor postprocess

* support text embedding task

* improve support for token classification (named entity recognition)

* use self.model_config.id2label by default (#45)

* minor cleanup and fixes after rebase

* use approx in test_input_padding

* revert token_classification changes

* fix test

---------

Signed-off-by: Gregory Keith <[email protected]>
Signed-off-by: Andrews Arokiam <[email protected]>
Signed-off-by: Dan Sun <[email protected]>
Signed-off-by: Jin Dong <[email protected]>
Signed-off-by: Sivanantham Chinnaiyan <[email protected]>
Signed-off-by: jooho lee <[email protected]>
Signed-off-by: Jooho Lee <[email protected]>
Signed-off-by: Jooho Lee <[email protected]>
Signed-off-by: Curtis Maddalozzo <[email protected]>
Signed-off-by: jordanyono <[email protected]>
Signed-off-by: Edgar Hernández <[email protected]>
Signed-off-by: udai <[email protected]>
Signed-off-by: yxia216 <[email protected]>
Signed-off-by: tjandy98 <[email protected]>
Signed-off-by: Johnu George <[email protected]>
Signed-off-by: Gavin Li <[email protected]>
Signed-off-by: Spolti <[email protected]>
Signed-off-by: Kevin Mingtarja <[email protected]>
Signed-off-by: Arjun Bhalla <[email protected]>
Signed-off-by: Arjun Bhalla <[email protected]>
Signed-off-by: Snehomoy <[email protected]>
Signed-off-by: ops-jaeha <[email protected]>
Signed-off-by: Kursat Aktas <[email protected]>
Signed-off-by: oplushappy <[email protected]>
Signed-off-by: Sivanantham <[email protected]>
Signed-off-by: carlory <[email protected]>
Signed-off-by: bentohset <[email protected]>
Signed-off-by: Kevin Mingtarja <[email protected]>
Signed-off-by: Kevin Mingtarja <[email protected]>
Co-authored-by: gfkeith <[email protected]>
Co-authored-by: Andrews Arokiam <[email protected]>
Co-authored-by: Dan Sun <[email protected]>
Co-authored-by: Jin Dong <[email protected]>
Co-authored-by: Sivanantham <[email protected]>
Co-authored-by: Jooho Lee <[email protected]>
Co-authored-by: jordanyono <[email protected]>
Co-authored-by: Curtis Maddalozzo <[email protected]>
Co-authored-by: Edgar Hernández <[email protected]>
Co-authored-by: udaij12 <[email protected]>
Co-authored-by: hustxiayang <[email protected]>
Co-authored-by: tjandy98 <[email protected]>
Co-authored-by: Johnu George <[email protected]>
Co-authored-by: Gavin Li <[email protected]>
Co-authored-by: Filippe Spolti <[email protected]>
Co-authored-by: Arjun Bhalla <[email protected]>
Co-authored-by: Arjun Bhalla <[email protected]>
Co-authored-by: Snehomoy.M <[email protected]>
Co-authored-by: 이재하 <[email protected]>
Co-authored-by: Kursat Aktas <[email protected]>
Co-authored-by: oplushappy <[email protected]>
Co-authored-by: 杨朱 · Kiki <[email protected]>
Co-authored-by: Benjamin Toh <[email protected]>
1 parent 9789948 commit 100b9b2
Showing 272 changed files with 185,589 additions and 28,229 deletions.
31 changes: 22 additions & 9 deletions .github/workflows/e2e-test.yml
@@ -23,6 +23,7 @@ env:
BASE_ARTIFACT_PREFIX: "base"
# Controller images
CONTROLLER_IMG: "kserve-controller"
+ LOCALMODEL_CONTROLLER_IMG: "kserve-localmodel-controller"
STORAGE_INIT_IMG: "storage-initializer"
AGENT_IMG: "agent"
ROUTER_IMG: "router"
@@ -33,6 +34,7 @@ env:
PMML_IMG: "pmmlserver"
PADDLE_IMG: "paddleserver"
CUSTOM_MODEL_GRPC_IMG: "custom-model-grpc"
+ CUSTOM_MODEL_GRPC_IMG_TAG: "kserve/custom-model-grpc:${{ github.sha }}"
HUGGINGFACE_IMG: "huggingfaceserver"
# Explainer images
ART_IMG: "art-explainer"
@@ -55,6 +57,9 @@ jobs:
- name: Checkout source
uses: actions/checkout@v4

+ - name: Free-up disk space
+ uses: ./.github/actions/free-up-disk-space

- name: Setup Docker Buildx
uses: docker/setup-buildx-action@v3

@@ -73,6 +78,14 @@ jobs:
path: ${{ env.DOCKER_IMAGES_PATH }}/${{ env.CONTROLLER_IMG }}-${{ github.sha }}
compression-level: 0
if-no-files-found: error

+ - name: Upload localmodel controller image
+ uses: actions/upload-artifact@v4
+ with:
+ name: ${{ env.BASE_ARTIFACT_PREFIX }}-${{ env.LOCALMODEL_CONTROLLER_IMG }}-${{ github.sha }}
+ path: ${{ env.DOCKER_IMAGES_PATH }}/${{ env.LOCALMODEL_CONTROLLER_IMG }}-${{ github.sha }}
+ compression-level: 0
+ if-no-files-found: error

- name: Upload agent image
uses: actions/upload-artifact@v4
@@ -272,7 +285,7 @@ jobs:
if-no-files-found: error

test-predictor:
- runs-on: ubuntu-latest
+ runs-on: ubuntu-22.04
needs:
[
kserve-image-build,
@@ -337,7 +350,7 @@ jobs:
./test/scripts/gh-actions/status-check.sh
test-transformer-explainer-mms:
- runs-on: ubuntu-latest
+ runs-on: ubuntu-22.04
needs:
[kserve-image-build, predictor-runtime-build, explainer-runtime-build]
steps:
@@ -417,7 +430,7 @@ jobs:
./test/scripts/gh-actions/status-check.sh
test-graph:
- runs-on: ubuntu-latest
+ runs-on: ubuntu-22.04
needs:
[
kserve-image-build,
@@ -496,7 +509,7 @@ jobs:
test-path-based-routing:
- runs-on: ubuntu-latest
+ runs-on: ubuntu-22.04
needs:
[
kserve-image-build,
@@ -589,7 +602,7 @@ jobs:
./test/scripts/gh-actions/status-check.sh
test-qpext:
- runs-on: ubuntu-latest
+ runs-on: ubuntu-22.04
needs: [kserve-image-build, predictor-runtime-build]
steps:
- name: Checkout source
@@ -654,7 +667,7 @@ jobs:
./test/scripts/gh-actions/status-check.sh
test-with-helm:
- runs-on: ubuntu-latest
+ runs-on: ubuntu-22.04
needs:
[kserve-image-build]
steps:
@@ -705,7 +718,7 @@ jobs:
./test/scripts/gh-actions/status-check.sh
test-raw:
- runs-on: ubuntu-latest
+ runs-on: ubuntu-22.04
needs:
[kserve-image-build, predictor-runtime-build]
steps:
@@ -785,7 +798,7 @@ jobs:
./test/scripts/gh-actions/status-check.sh
test-kourier:
- runs-on: ubuntu-latest
+ runs-on: ubuntu-22.04
needs:
[kserve-image-build, predictor-runtime-build, graph-tests-images-build]
steps:
@@ -870,7 +883,7 @@ jobs:
./test/scripts/gh-actions/status-check.sh "kourier"
test-llm:
- runs-on: ubuntu-latest
+ runs-on: ubuntu-22.04
needs:
[ kserve-image-build, predictor-runtime-build]
steps:
95 changes: 95 additions & 0 deletions .github/workflows/kserve-localmodel-controller-docker-publish.yml
@@ -0,0 +1,95 @@
name: Kserve localmodel controller Docker Publisher

on:
push:
# Publish `master` as Docker `latest` image.
branches:
- master

# Publish `v1.2.3` tags as releases.
tags:
- v*

# Run tests for any PRs.
pull_request:

env:
IMAGE_NAME: kserve-localmodel-controller

concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true

jobs:
# Run tests.
# See also https://docs.docker.com/docker-hub/builds/automated-testing/
test:
runs-on: ubuntu-latest

steps:
- name: Checkout source
uses: actions/checkout@v4

- name: Run tests
run: |
if [ -f docker-compose.test.yml ]; then
docker-compose --file docker-compose.test.yml build
docker-compose --file docker-compose.test.yml run sut
else
docker buildx build . --file localmodel.Dockerfile
fi
# Push image to GitHub Packages.
# See also https://docs.docker.com/docker-hub/builds/
push:
# Ensure test job passes before pushing image.
needs: test

runs-on: ubuntu-latest
if: github.event_name == 'push'

steps:
- name: Checkout source
uses: actions/checkout@v4

- name: Setup QEMU
uses: docker/setup-qemu-action@v3

- name: Setup Docker Buildx
uses: docker/setup-buildx-action@v3

- name: Login to DockerHub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKER_USER }}
password: ${{ secrets.DOCKER_PASSWORD }}

- name: export version variable
run: |
IMAGE_ID=kserve/$IMAGE_NAME
# Change all uppercase to lowercase
IMAGE_ID=$(echo $IMAGE_ID | tr '[A-Z]' '[a-z]')
# Strip git ref prefix from version
VERSION=$(echo "${{ github.ref }}" | sed -e 's,.*/\(.*\),\1,')
# Strip "v" prefix from tag name
# [[ "${{ github.ref }}" == "refs/tags/"* ]] && VERSION=$(echo $VERSION | sed -e 's/^v//')
# Use Docker `latest` tag convention
[ "$VERSION" == "master" ] && VERSION=latest
echo VERSION=$VERSION >> $GITHUB_ENV
echo IMAGE_ID=$IMAGE_ID >> $GITHUB_ENV
- name: Build and push
uses: docker/build-push-action@v5
with:
platforms: linux/amd64,linux/arm/v7,linux/arm64/v8,linux/ppc64le,linux/s390x
context: .
file: localmodel.Dockerfile
push: true
tags: ${{ env.IMAGE_ID }}:${{ env.VERSION }}
# https://github.com/docker/buildx/issues/1533
provenance: false
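
The tag derivation in the `export version variable` step above can be sketched in Python for readers who want to check what a given ref maps to. This is an illustrative re-expression of the shell logic, not part of the workflow; the function names are hypothetical. As in the workflow, the leading `v` of a tag is kept because the stripping line is commented out.

```python
def docker_tag_from_ref(ref: str) -> str:
    """Mirror the sed expression: keep only the text after the last '/'."""
    version = ref.rsplit("/", 1)[-1]
    # Docker `latest` convention for the master branch.
    return "latest" if version == "master" else version


def image_id(image_name: str, org: str = "kserve") -> str:
    """Mirror the `tr` step: image IDs are lowercased."""
    return f"{org}/{image_name}".lower()


print(image_id("kserve-localmodel-controller") + ":" + docker_tag_from_ref("refs/heads/master"))
```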
15 changes: 7 additions & 8 deletions .github/workflows/python-test.yml
@@ -40,7 +40,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
- python-version: ["3.8", "3.9", "3.10", "3.11"]
+ python-version: ["3.9", "3.10", "3.11", "3.12"]
steps:
- name: Checkout source
uses: actions/checkout@v4
@@ -98,29 +98,33 @@ jobs:
# ----------------------------------------Kserve Pydantic V1 Unit Tests--------------------------------------------
- name: Setup kserve pydantic v1 directory
+ if: ${{ !startsWith(steps.setup-python.outputs.python-version, '3.12') }}
run: |
mkdir -p python/kserve-pydantic-v1
cp -r python/kserve/* python/kserve-pydantic-v1
cd python/kserve-pydantic-v1
# update the lock file without installing dependencies
poetry update "pydantic<2.0" --lock
- name: Load cached kserve pydantic v1 venv
+ if: ${{ !startsWith(steps.setup-python.outputs.python-version, '3.12') }}
id: cached-kserve-pydantic-v1-dependencies
uses: actions/cache@v3
with:
path: python/kserve-pydantic-v1/.venv
key: kserve-pydantic-v1-venv-${{ steps.setup-python.outputs.python-version }}-${{ hashFiles('**/kserve-pydantic-v1/poetry.lock') }}
# install kserve pydantic v1 dependencies if cache does not exist
- name: Install kserve pydantic v1 dependencies
- if: steps.cached-kserve-pydantic-v1-dependencies.outputs.cache-hit != 'true'
+ if: ${{ !startsWith(steps.setup-python.outputs.python-version, '3.12') && steps.cached-kserve-pydantic-v1-dependencies.outputs.cache-hit != 'true' }}
run: |
cd python/kserve-pydantic-v1
make install_dependencies
- name: Install kserve pydantic v1
if: ${{ !startsWith(steps.setup-python.outputs.python-version, '3.12') }}
run: |
cd python/kserve-pydantic-v1
make dev_install
- name: Test kserve pydantic v1
if: ${{ !startsWith(steps.setup-python.outputs.python-version, '3.12') }}
run: |
cd python
source kserve-pydantic-v1/.venv/bin/activate
@@ -250,37 +254,32 @@ jobs:
# ----------------------------------------Huggingface Server Unit Tests------------------------------------------------
# load cached huggingface venv if cache exists
- name: Load cached huggingface venv
if: ${{ !startsWith(steps.setup-python.outputs.python-version, '3.8') }}
id: huggingface-dependencies
uses: actions/cache@v4
with:
path: /mnt/python/huggingfaceserver-venv
key: huggingface-venv-${{ steps.setup-python.outputs.python-version }}-${{ hashFiles('**/kserve/poetry.lock', '**/huggingfaceserver/poetry.lock') }}
# install huggingface server dependencies if cache does not exist
- name: Configure poetry for huggingface server
if: ${{ !startsWith(steps.setup-python.outputs.python-version, '3.8') }}
run: |
poetry config virtualenvs.path /mnt/python/huggingfaceserver-venv
poetry config virtualenvs.in-project false
- name: Install huggingface dependencies
if: ${{ steps.cached-huggingface-dependencies.outputs.cache-hit != 'true' && !startsWith(steps.setup-python.outputs.python-version, '3.8') }}
run: |
sudo mkdir -p /mnt/python/huggingfaceserver-venv
# change permission so that poetry can install without sudo
sudo chown -R $USER /mnt/python/huggingfaceserver-venv
cd python/huggingfaceserver
make install_dependencies
- name: Install huggingface server
if: ${{ !startsWith(steps.setup-python.outputs.python-version, '3.8') }}
run: |
cd python/huggingfaceserver
make dev_install
- name: Test huggingfaceserver
if: ${{ !startsWith(steps.setup-python.outputs.python-version, '3.8') }}
run: |
cd python/huggingfaceserver
poetry run -- pytest --cov=huggingfaceserver -vv
- name: Free space after tests
run: |
df -hT
df -hT
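
The `if:` guards added to the pydantic v1 steps skip them when the resolved interpreter version begins with `3.12`. A rough shell equivalent of that prefix test, useful for reasoning about which matrix entries run the steps; `runs_pydantic_v1` is a hypothetical name, and the full version strings below are examples rather than values taken from a workflow run:

```shell
# Hypothetical helper: succeeds when the pydantic v1 steps would run.
runs_pydantic_v1() {
  case "$1" in
    3.12*) return 1 ;;  # prefix match, like startsWith(version, '3.12')
    *)     return 0 ;;
  esac
}

# Example version strings only; real values come from the setup-python step.
for v in 3.9.19 3.10.14 3.11.9 3.12.4; do
  if runs_pydantic_v1 "$v"; then
    echo "$v: run pydantic v1 steps"
  else
    echo "$v: skip pydantic v1 steps"
  fi
done
```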
12 changes: 9 additions & 3 deletions .github/workflows/scheduled-image-scan.yml
@@ -24,6 +24,7 @@ jobs:
file: python/storage-initializer.Dockerfile,
},
{ name: router, file: router.Dockerfile },
{ name: kserve-localmodel-controller, file: localmodel.Dockerfile },
]

steps:
@@ -43,13 +44,14 @@ jobs:
--sarif-file-output=./application/${{ matrix.image.name }}/docker.snyk.sarif
sarif: false

# Replace any "undefined" security severity values with 0. The undefined value is used in the case
# Replace any "undefined" or "null" security severity values with 0. The undefined value is used in the case
# of license-related findings, which do not indicate a security vulnerability.
# See https://github.com/github/codeql-action/issues/2187 for more context.
# This can be removed once https://github.com/snyk/cli/pull/5409 is merged.
- name: Replace security-severity undefined for license-related findings
run: |
sudo sed -i 's/"security-severity": "undefined"/"security-severity": "0"/g' ./application/${{ matrix.image.name }}/docker.snyk.sarif
sudo sed -i 's/"security-severity": "null"/"security-severity": "0"/g' ./application/${{ matrix.image.name }}/docker.snyk.sarif
- name: Upload sarif file to Github Code Scanning
if: always()
@@ -69,6 +71,8 @@ jobs:
{ name: xgbserver, file: python/xgb.Dockerfile },
{ name: pmmlserver, file: python/pmml.Dockerfile },
{ name: paddleserver, file: python/paddle.Dockerfile },
{ name: lgbserver, file: python/lgb.Dockerfile },
{ name: huggingfaceserver, file: python/huggingface_server.Dockerfile },
]

steps:
@@ -88,13 +92,14 @@ jobs:
--sarif-file-output=./application/${{ matrix.image.name }}/docker.snyk.sarif
sarif: false

# Replace any "undefined" security severity values with 0. The undefined value is used in the case
# Replace any "undefined" or "null" security severity values with 0. The undefined value is used in the case
# of license-related findings, which do not indicate a security vulnerability.
# See https://github.com/github/codeql-action/issues/2187 for more context.
# This can be removed once https://github.com/snyk/cli/pull/5409 is merged.
- name: Replace security-severity undefined for license-related findings
run: |
sudo sed -i 's/"security-severity": "undefined"/"security-severity": "0"/g' ./application/${{ matrix.image.name }}/docker.snyk.sarif
sudo sed -i 's/"security-severity": "null"/"security-severity": "0"/g' ./application/${{ matrix.image.name }}/docker.snyk.sarif
- name: Upload sarif file to Github Code Scanning
if: always()
@@ -129,13 +134,14 @@ jobs:
--sarif-file-output=./application/${{ matrix.image.name }}/docker.snyk.sarif
sarif: false

# Replace any "undefined" security severity values with 0. The undefined value is used in the case
# Replace any "undefined" or "null" security severity values with 0. The undefined value is used in the case
# of license-related findings, which do not indicate a security vulnerability.
# See https://github.com/github/codeql-action/issues/2187 for more context.
# This can be removed once https://github.com/snyk/cli/pull/5409 is merged.
- name: Replace security-severity undefined for license-related findings
run: |
sudo sed -i 's/"security-severity": "undefined"/"security-severity": "0"/g' ./application/${{ matrix.image.name }}/docker.snyk.sarif
sudo sed -i 's/"security-severity": "null"/"security-severity": "0"/g' ./application/${{ matrix.image.name }}/docker.snyk.sarif
- name: Upload sarif file to Github Code Scanning
if: always()
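
The two `sed` substitutions used in each scan job can be checked locally against sample input. A minimal sketch, assuming the text is piped through a filter rather than edited in place with `sudo sed -i` as the workflow does; the function name `normalize_severity` and the sample lines are hypothetical and are not real Snyk output:

```shell
# Hypothetical filter applying the same two substitutions as the workflow step.
normalize_severity() {
  sed \
    -e 's/"security-severity": "undefined"/"security-severity": "0"/g' \
    -e 's/"security-severity": "null"/"security-severity": "0"/g'
}

# Made-up fragment for illustration only.
printf '%s\n' \
  '{"security-severity": "undefined"}' \
  '{"security-severity": "null"}' \
  '{"security-severity": "7.5"}' | normalize_severity
```

The patterns match only the exact spelling with a single space after the colon, so a differently formatted SARIF file would pass through unchanged.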
1 change: 1 addition & 0 deletions .gitignore
@@ -46,6 +46,7 @@ venv/
config/overlays/development/manager_image_patch.yaml
config/overlays/development/configmap/inferenceservice_patch.yaml
config/overlays/development/manager_auth_proxy_patch.yaml
config/overlays/development/localmodel_image_patch.yaml
config/overlays/dev-image-config/inferenceservice_patch.yaml

.ko.yaml

0 comments on commit 100b9b2
