feat: update mlflow version to support newer version of protobuf (#465)

**What this PR does / why we need it**:
* MLflow 1.26.1 or above is required for compatibility with newer versions
of protobuf.

* This PR also drops support for Python 3.7 in the real-time and batch
predictor servers (continuing from #464). Docs and examples are updated
accordingly.
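
As a rough illustration of the version ranges this change settles on, the sketch below checks a version string against `mlflow>=1.26.1,<2.0.0` and `protobuf>=3.0.0,<5.0.0`, the constraints that appear in the diffs further down. The helper names (`parse_version`, `in_range`, `mlflow_supported`, `protobuf_supported`) are hypothetical and not part of the Merlin code base. The parser is deliberately simplified: it assumes purely numeric dotted versions.

```python
def parse_version(text):
    """Turn '1.26.1' into (1, 26, 1), padding short versions with zeros.

    Assumption: only numeric dotted versions; suffixes like 'rc1' are not handled.
    """
    parts = [int(part) for part in text.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def in_range(version, lower, upper):
    """Return True when lower <= version < upper."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


def mlflow_supported(version):
    # Mirrors "mlflow>=1.26.1,<2.0.0" from the updated requirements.
    return in_range(version, "1.26.1", "2.0.0")


def protobuf_supported(version):
    # Mirrors "protobuf>=3.0.0,<5.0.0" from the updated SDK setup.py.
    return in_range(version, "3.0.0", "5.0.0")
```

Under these assumptions, the previously pinned `1.23.0` falls outside the MLflow range, while a 4.x protobuf release falls inside the widened protobuf range.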

**Which issue(s) this PR fixes**:

Fixes #

**Does this PR introduce a user-facing change?**:

```release-note
Upgrade MLflow client version; drop Python 3.7 support
```

**Checklist**

- [ ] Added unit test, integration, and/or e2e tests
- [ ] Tested locally
- [x] Updated documentation
- [ ] Update Swagger spec if the PR introduces API changes
- [ ] Regenerated Golang and Python clients if the PR introduces API
changes

---------

Co-authored-by: Krithika Sundararajan <[email protected]>
khorshuheng and Krithika Sundararajan authored Nov 2, 2023
1 parent 36993fc commit 32ba4d6
Showing 31 changed files with 52 additions and 77 deletions.
8 changes: 4 additions & 4 deletions .github/workflows/merlin.yml
@@ -42,7 +42,7 @@ jobs:
strategy:
fail-fast: false
matrix:
python-version: ["3.7", "3.8", "3.9", "3.10"] #TODO: Remove Python 3.7 support
python-version: ["3.8", "3.9", "3.10"]
env:
PIPENV_DEFAULT_PYTHON_VERSION: ${{ matrix.python-version }}
steps:
@@ -74,7 +74,7 @@ jobs:
strategy:
fail-fast: false
matrix:
python-version: ["3.7", "3.8", "3.9", "3.10"] #TODO: Remove Python 3.7 support
python-version: ["3.8", "3.9", "3.10"]
env:
PIPENV_DEFAULT_PYTHON_VERSION: ${{ matrix.python-version }}
steps:
@@ -263,7 +263,7 @@ jobs:
strategy:
fail-fast: false
matrix:
python-version: ["37", "38", "39", "310"]
python-version: ["38", "39", "310"]
needs:
- create-version
steps:
@@ -284,7 +284,7 @@ jobs:
strategy:
fail-fast: false
matrix:
python-version: ["37", "38", "39", "310"]
python-version: ["38", "39", "310"]
needs:
- create-version
steps:
7 changes: 3 additions & 4 deletions .github/workflows/release.yml
@@ -63,8 +63,7 @@ jobs:
working-directory: ./python/sdk
run: |
IMAGE_TAG="${{ env.CONTAINER_REGISTRY }}/merlin-sdk:py${{ matrix.python-version }}-${{ inputs.version }}"
SDK_VERSION="$(sed 's/^v\(.*\)-/\1/' <<< "${{ inputs.version }}")"
docker build -t ${IMAGE_TAG} --build-arg PYTHON_VERSION=${{ matrix.python-version }} --build-arg VERSION=${SDK_VERSION} .
docker build -t ${IMAGE_TAG} --build-arg PYTHON_VERSION=${{ matrix.python-version }} .
docker push ${IMAGE_TAG}
publish-api:
@@ -106,7 +105,7 @@ jobs:
strategy:
fail-fast: false
matrix:
python-version: ["37", "38", "39", "310"]
python-version: ["38", "39", "310"]
steps:
- name: Log in to the Container registry
uses: docker/login-action@v1
@@ -132,7 +131,7 @@ jobs:
strategy:
fail-fast: false
matrix:
python-version: ["37", "38", "39", "310"]
python-version: ["38", "39", "310"]
steps:
- name: Log in to the Container registry
uses: docker/login-action@v1
2 changes: 1 addition & 1 deletion Makefile
@@ -12,7 +12,7 @@ TEST_TAGS?=
GOLANGCI_LINT_VERSION="v1.51.2"
PROTOC_GEN_GO_JSON_VERSION="v1.1.0"
PROTOC_GEN_GO_VERSION="v1.26"
PYTHON_VERSION ?= "37" #set as 37 38 39 310 for 3.7-3.10 respectively
PYTHON_VERSION ?= "39" #set as 38 39 310 for 3.8-3.10 respectively

all: setup init-dep lint test clean build run

4 changes: 2 additions & 2 deletions examples/batch/env.yaml
@@ -2,6 +2,6 @@ dependencies:
- pip:
- joblib>=0.13.0,<1.2.0 # >=1.2.0 upon upgrade of kserve's version
- numpy>=1.19.5
- scikit-learn==1.0.2 #TODO: >=1.1.2 upon python 3.7 deprecation
- scikit-learn>=1.1.2
- xgboost==1.6.2
- mlflow==1.23.0
- mlflow==1.26.1
4 changes: 2 additions & 2 deletions examples/batch/requirements.txt
@@ -1,8 +1,8 @@
joblib>=0.13.0,<1.2.0 # >=1.2.0 upon upgrade of kserve's version
numpy>=1.19.5
scikit-learn==1.0.2 #TODO: >=1.1.2 upon python 3.7 deprecation
scikit-learn>=1.1.2
xgboost==1.6.2
mlflow==1.23.0
mlflow==1.26.1
merlin-sdk
cloudpickle==2.0.0
google-cloud
2 changes: 1 addition & 1 deletion examples/custom-model/http_json/requirements.txt
@@ -1,3 +1,3 @@
scikit-learn==1.0.2 #TODO: >=1.1.2 upon python 3.7 deprecation
scikit-learn>=1.1.2
xgboost==1.6.2
merlin-sdk
2 changes: 1 addition & 1 deletion examples/model-endpoint/requirements.txt
@@ -1,4 +1,4 @@
scikit-learn==1.0.2 #TODO: >=1.1.2 upon python 3.7 deprecation
scikit-learn>=1.1.2
merlin-sdk
joblib>=0.13.0,<1.2.0 # >=1.2.0 upon upgrade of kserve's version
cloudpickle==2.0.0
2 changes: 1 addition & 1 deletion examples/pyfunc/env.yaml
@@ -2,5 +2,5 @@ dependencies:
- python=3.8
- pip:
- joblib>=0.13.0,<1.2.0 # >=1.2.0 upon upgrade of kserve's version
- scikit-learn==1.1.2 #TODO: >=1.1.2 upon python 3.7 deprecation
- scikit-learn>=1.1.2
- xgboost==1.6.2
2 changes: 1 addition & 1 deletion examples/pyfunc/requirements.txt
@@ -1,4 +1,4 @@
scikit-learn==1.0.2 #TODO: >=1.1.2 upon python 3.7 deprecation
scikit-learn>=1.1.2
joblib>=0.13.0,<1.2.0 # >=1.2.0 upon upgrade of kserve's version
xgboost==1.6.2
merlin-sdk
2 changes: 1 addition & 1 deletion examples/resource-request-gpu/requirements.txt
@@ -1,4 +1,4 @@
scikit-learn==1.0.2 #TODO: >=1.1.2 upon python 3.7 deprecation
scikit-learn>=1.1.2
merlin-sdk
joblib>=0.13.0,<1.2.0 # >=1.2.0 upon upgrade of kserve's version
cloudpickle==2.0.0
2 changes: 1 addition & 1 deletion examples/resource-request/requirements.txt
@@ -1,4 +1,4 @@
scikit-learn==1.0.2 #TODO: >=1.1.2 upon python 3.7 deprecation
scikit-learn>=1.1.2
merlin-sdk
joblib>=0.13.0,<1.2.0 # >=1.2.0 upon upgrade of kserve's version
cloudpickle==2.0.0
2 changes: 1 addition & 1 deletion examples/sklearn/requirements.txt
@@ -1,4 +1,4 @@
scikit-learn==1.0.2 #TODO: >=1.1.2 upon python 3.7 deprecation
scikit-learn>=1.1.2
merlin-sdk
joblib>=0.13.0,<1.2.0 # >=1.2.0 upon upgrade of kserve's version
cloudpickle==2.0.0
2 changes: 1 addition & 1 deletion examples/xgboost/requirements.txt
@@ -1,4 +1,4 @@
scikit-learn==1.0.2 #TODO: >=1.1.2 upon python 3.7 deprecation
scikit-learn>=1.1.2
xgboost==1.6.2
merlin-sdk
cloudpickle==2.0.0
2 changes: 1 addition & 1 deletion python/batch-predictor/README.md
@@ -69,7 +69,7 @@ OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES python main.py --job-name iris-predictio

### Requirements

- python 3.7.0
- python >= 3.8.0
- pipenv (install using `pip install pipenv`)
- protoc (see [installation instruction](http://google.github.io/proto-lens/installing-protoc.html))
- gcloud (see [installation instruction](https://cloud.google.com/sdk/install))
2 changes: 1 addition & 1 deletion python/batch-predictor/docker/app.Dockerfile
@@ -24,6 +24,6 @@ ARG GOOGLE_APPLICATION_CREDENTIALS
RUN if [[ ! -z "$GOOGLE_APPLICATION_CREDENTIALS" ]]; then gcloud auth activate-service-account --key-file=${GOOGLE_APPLICATION_CREDENTIALS}; fi
RUN gsutil -m cp -r ${MODEL_URL} .
RUN /bin/bash -c ". activate ${CONDA_ENVIRONMENT} && \
sed -i 's/mlflow$/mlflow==1.23.0/' ${HOME}/model/conda.yaml && \
sed -i 's/mlflow\(\s*==\s*[^ ]*\)\{0,1\}/mlflow==1.26.1/g' ${HOME}/model/conda.yaml && \
conda env update --name ${CONDA_ENVIRONMENT} --file ${HOME}/model/conda.yaml && \
python ${HOME}/merlin-spark-app/main.py --dry-run-model ${HOME}/model"
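
The new `sed` expression above pins MLflow whether the model's `conda.yaml` lists it bare or with an existing `==` pin. The sketch below is one way to mimic that substitution in Python. The function name `pin_mlflow` is hypothetical and not part of the repository, and the regular expression is a Python spelling of the `sed` pattern, applied line by line because `sed` works on one line at a time.

```python
import re

PINNED = "mlflow==1.26.1"
# Python spelling of the sed pattern: "mlflow" plus an optional "== <version>".
MLFLOW_PIN = re.compile(r"mlflow(\s*==\s*[^ ]*)?")


def pin_mlflow(conda_yaml_text):
    """Apply the substitution to each line separately, as sed does."""
    lines = conda_yaml_text.split("\n")
    return "\n".join(MLFLOW_PIN.sub(PINNED, line) for line in lines)


example = "dependencies:\n- pip:\n  - mlflow == 1.2.0\n  - numpy>=1.19.5"
pinned = pin_mlflow(example)
```

In this sketch only an `==` specifier is consumed by the optional group. Any other specifier, such as `>=`, is left in place after the inserted pin.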
2 changes: 1 addition & 1 deletion python/batch-predictor/docker/base.Dockerfile
@@ -12,7 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.

FROM gcr.io/spark-operator/spark-py:v3.0.0
FROM apache/spark-py:v3.1.3

# Switch to user root so we can add additional jars and configuration files.
USER root
8 changes: 0 additions & 8 deletions python/batch-predictor/docker/env37.yaml

This file was deleted.

2 changes: 1 addition & 1 deletion python/batch-predictor/merlinpyspark/model.py
@@ -67,7 +67,7 @@ def spark_udf(spark, model_uri, features, result_type="double"):
archive_path = SparkModelCache.add_local_model(spark, local_model_path)

def predict(*args):
model = SparkModelCache.get_or_load(archive_path)
model, _ = SparkModelCache.get_or_load(archive_path)
schema = {features[i]: arg for i, arg in enumerate(args)}
pdf = None
for x in args:
7 changes: 3 additions & 4 deletions python/batch-predictor/requirements.txt
@@ -1,8 +1,7 @@
findspark
pyspark==3.0.1
mlflow>=1.2.0,<=1.23.0
mlflow>=1.26.1,<2.0.0
cloudpickle==2.0.0
pyarrow>=0.14.1,<=9.0.0
protobuf>=3.0,<4.0.0
#TODO: Update merlin-sdk dep to: file:${SDK_PATH}#egg=merlin-sdk
merlin-sdk==0.33.0 # Pin to the version that supports Python 3.7.
protobuf>=3.0,<5.0.0
file:${SDK_PATH}#egg=merlin-sdk
6 changes: 2 additions & 4 deletions python/batch-predictor/requirements_test.txt
@@ -2,9 +2,7 @@ pytest
pytest-cov
mypy
google-cloud-bigquery
scikit-learn==1.0.2 #TODO: >=1.1.2 upon python 3.7 deprecation
scikit-learn>=1.1.2
joblib>=0.13.0,<1.2.0 # >=1.2.0 upon upgrade of kserve's version
mypy-protobuf>=1.19
types-PyYAML
protobuf<4.0.0
grpcio<1.49.0
types-PyYAML
13 changes: 6 additions & 7 deletions python/batch-predictor/setup.py
@@ -21,20 +21,19 @@
with open('requirements.txt') as f:
REQUIRE = f.read().splitlines()

# TODO: Uncomment below lines after Pyfunc server stops supporting Python 3.7
# merlin_path = os.path.join(os.getcwd(), "../sdk")
# merlin_sdk_package = "merlin-sdk"
# for index, item in enumerate(REQUIRE):
# if merlin_sdk_package in item:
# REQUIRE[index] = f"{merlin_sdk_package} @ file://localhost/{merlin_path}#egg={merlin_sdk_package}"
merlin_path = os.path.join(os.getcwd(), "../sdk")
merlin_sdk_package = "merlin-sdk"
for index, item in enumerate(REQUIRE):
if merlin_sdk_package in item:
REQUIRE[index] = f"{merlin_sdk_package} @ file://localhost/{merlin_path}#egg={merlin_sdk_package}"

setup(
name='merlin-pyspark-app',
version='0.2.0',
author_email='[email protected]',
description='Base pyspark application for running merlin prediction job',
long_description=open('README.md').read(),
python_requires='>=3.7,<3.11',
python_requires='>=3.8,<3.11',
packages=find_packages("merlinpyspark"),
install_requires=REQUIRE,
tests_require=TEST_REQUIRE,
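
The re-enabled loop in this `setup.py` replaces the `merlin-sdk` entry read from `requirements.txt` with a direct reference to the local SDK directory. A minimal standalone sketch of that rewrite follows. The function name `rewrite_sdk_requirement` and the sample requirement list are illustrative assumptions, while the reference string follows the format shown in the diff.

```python
import os


def rewrite_sdk_requirement(requirements, sdk_dir="../sdk", package="merlin-sdk"):
    """Return a copy of requirements with the SDK entry pointing at sdk_dir."""
    # Same path construction as the diff: current working directory + "../sdk".
    sdk_path = os.path.join(os.getcwd(), sdk_dir)
    direct_ref = f"{package} @ file://localhost/{sdk_path}#egg={package}"
    # Any entry mentioning the package name is replaced; others pass through.
    return [direct_ref if package in item else item for item in requirements]


# Sample input resembling the updated requirements.txt (illustrative only).
requirements = ["findspark", "file:${SDK_PATH}#egg=merlin-sdk"]
rewritten = rewrite_sdk_requirement(requirements)
```

Unlike the loop in the diff, this sketch builds a new list rather than mutating `REQUIRE` in place. The resulting entries are intended to be the same.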
2 changes: 1 addition & 1 deletion python/batch-predictor/test-model/conda.yaml
@@ -1,4 +1,4 @@
dependencies:
- pip:
- scikit-learn==1.0.2 #TODO: >=1.1.2 upon python 3.7 deprecation
- scikit-learn>=1.1.2
- joblib>=0.13.0,<1.2.0 # >=1.2.0 upon upgrade of kserve's version
@@ -7,4 +7,4 @@ verify_ssl = true
merlin-sdk = "=={{ cookiecutter.merlin_sdk_version }}"

[requires]
python_version = "3.7"
python_version = "3.8"
2 changes: 1 addition & 1 deletion python/pyfunc-server/docker/Dockerfile
@@ -24,7 +24,7 @@ WORKDIR /pyfunc-server
RUN if [ ! -z "$GOOGLE_APPLICATION_CREDENTIALS" ]; then gcloud auth activate-service-account --key-file=${GOOGLE_APPLICATION_CREDENTIALS}; fi
RUN gsutil cp -r ${MODEL_URL} .
RUN /bin/bash -c ". activate merlin-model && \
sed -i 's/mlflow$/mlflow==1.23.0/' model/conda.yaml && \
sed -i 's/mlflow\(\s*==\s*[^ ]*\)\{0,1\}/mlflow==1.26.1/g' model/conda.yaml && \
conda env update --name merlin-model --file model/conda.yaml && \
python -m pyfuncserver --model_dir model --dry_run"

8 changes: 0 additions & 8 deletions python/pyfunc-server/docker/env37.yaml

This file was deleted.

3 changes: 1 addition & 2 deletions python/pyfunc-server/requirements.txt
@@ -12,5 +12,4 @@ grpcio<1.49.0
grpcio-reflection<1.49.0
grpcio-tools<1.49.0
grpcio-health-checking<1.49.0
#TODO: Update merlin-sdk dep to: file:${SDK_PATH}#egg=merlin-sdk
merlin-sdk==0.33.0 # Pin to the version that supports Python 3.7.
file:${SDK_PATH}#egg=merlin-sdk
13 changes: 6 additions & 7 deletions python/pyfunc-server/setup.py
@@ -31,12 +31,11 @@
# replace merlin relative path in requirements.txt into absolute path
# setuptools could not install relative path requirements

# TODO: Uncomment below lines after Pyfunc server stops supporting Python 3.7
# merlin_path = os.path.join(os.getcwd(), "../sdk")
# merlin_sdk_package = "merlin-sdk"
# for index, item in enumerate(REQUIRE):
# if merlin_sdk_package in item:
# REQUIRE[index] = f"{merlin_sdk_package} @ file://localhost/{merlin_path}#egg={merlin_sdk_package}"
merlin_path = os.path.join(os.getcwd(), "../sdk")
merlin_sdk_package = "merlin-sdk"
for index, item in enumerate(REQUIRE):
if merlin_sdk_package in item:
REQUIRE[index] = f"{merlin_sdk_package} @ file://localhost/{merlin_path}#egg={merlin_sdk_package}"

setup(
name='pyfuncserver',
@@ -45,7 +44,7 @@
description='Model Server implementation for mlflow pyfunc model',
long_description=open('README.md').read(),
long_description_content_type='text/markdown',
python_requires='>=3.7,<3.11',
python_requires='>=3.8,<3.11',
packages=find_packages(exclude=["test"]),
install_requires=REQUIRE,
tests_require=tests_require,
6 changes: 2 additions & 4 deletions python/sdk/Dockerfile
@@ -10,7 +10,5 @@ RUN apt-get update && apt-get install build-essential curl vim wget -y

COPY . .${WORKDIR}

ARG VERSION

RUN pip install merlin-sdk==${VERSION}
RUN pip install merlin-sdk[test]==${VERSION}
RUN pip install .
RUN pip install ".[test]"
6 changes: 3 additions & 3 deletions python/sdk/setup.py
@@ -30,8 +30,8 @@
"cookiecutter>=1.7.2",
"docker>=4.2.1",
"google-cloud-storage>=1.19.0",
"mlflow>=1.2.0,<=1.23.0", # for py3.11 due to proto -> "mlflow>=1.26.1",
"protobuf>=3.0.0,<4.0.0", # for py3.11 due to proto -> "protobuf>=4.0.0,<5.0dev",
"protobuf>=3.0.0,<5.0.0",
"mlflow>=1.26.1,<2.0.0",
"PyPrind>=2.11.2",
"python_dateutil>=2.5.3",
"PyYAML>=5.4",
@@ -53,7 +53,7 @@
"pytest",
"recursive-diff>=1.0.0",
"requests",
"scikit-learn==1.0.2", #TODO: >=1.1.2 upon python 3.7 deprecation
"scikit-learn>=1.1.2",
"types-python-dateutil",
"types-PyYAML",
"types-six",
2 changes: 1 addition & 1 deletion python/sdk/test/batch/model/env.yaml
@@ -1,7 +1,7 @@
dependencies:
- pip:
- mlflow
- scikit-learn==1.0.2 #TODO: >=1.1.2 upon python 3.7 deprecation
- scikit-learn>=1.1.2
- joblib>=0.13.0,<1.2.0 # >=1.2.0 upon upgrade of kserve's version
- numpy>=1.19.5,<1.23.4 # https://github.com/Azure/MachineLearningNotebooks/issues/1314
- pandas==1.3.5
2 changes: 1 addition & 1 deletion python/sdk/test/pyfunc/env.yaml
@@ -3,7 +3,7 @@ dependencies:
- pip:
- joblib>=0.13.0,<1.2.0 # >=1.2.0 upon upgrade of kserve's version
- numpy<=1.23.5 # Temporary pin numpy due to https://numpy.org/doc/stable/release/1.20.0-notes.html#numpy-1-20-0-release-notes
- scikit-learn==1.0.2 #TODO: >=1.1.2 upon python 3.7 deprecation
- scikit-learn>=1.1.2
- xgboost==1.6.2
- pytest
- pytest-xdist==1.34.0
