Add ability to mount self-signed certs to dsp v2 #17
Conversation
Force-pushed from 97a0426 to 736da8c.
A set of new images has been built to help with testing out this PR:
An OCP cluster where you are logged in as cluster admin is required. The Data Science Pipelines team recommends testing this using the Data Science Pipelines Operator. Check here for more information on using the DSPO. To use and deploy a DSP stack with these images (assuming the DSPO is deployed), first save the following YAML to a file named dspa.pr-17.yaml:

apiVersion: datasciencepipelinesapplications.opendatahub.io/v1alpha1
kind: DataSciencePipelinesApplication
metadata:
name: pr-17
spec:
dspVersion: v2
apiServer:
image: "quay.io/opendatahub/ds-pipelines-api-server:pr-17"
argoDriverImage: "quay.io/opendatahub/ds-pipelines-driver:pr-17"
argoLauncherImage: "quay.io/opendatahub/ds-pipelines-launcher:pr-17"
persistenceAgent:
image: "quay.io/opendatahub/ds-pipelines-persistenceagent:pr-17"
scheduledWorkflow:
image: "quay.io/opendatahub/ds-pipelines-scheduledworkflow:pr-17"
mlmd:
deploy: true # Optional component
grpc:
image: "quay.io/opendatahub/ds-pipelines-metadata-grpc:pr-17"
envoy:
image: "quay.io/opendatahub/ds-pipelines-metadata-envoy:pr-17"
mlpipelineUI:
deploy: true # Optional component
image: "quay.io/opendatahub/ds-pipelines-frontend:pr-17"
objectStorage:
minio:
deploy: true
      image: 'quay.io/opendatahub/minio:RELEASE.2019-08-14T20-37-41Z-license-compliance'

Then run the following:

cd $(mktemp -d)
git clone [email protected]:opendatahub-io/data-science-pipelines.git
cd data-science-pipelines/
git fetch origin pull/17/head
git checkout -b pullrequest 97a042615bbbdc3e64846d05957fc41174cda01a
oc apply -f dspa.pr-17.yaml

More instructions here on how to deploy and test a Data Science Pipelines Application.
Change to PR detected. A new PR build was completed.
@VaniHaripriya this is going to be a carried patch, can you rename your commit message, changing:

to:
Force-pushed from 736da8c to 16b63f8.
Change to PR detected. A new PR build was completed.
Force-pushed from 16b63f8 to db71bb4.
Change to PR detected. A new PR build was completed.
Pipelinespec ubuntu yaml file: iris-pipeline-ubuntu.yaml

# PIPELINE DEFINITION
# Name: iris-training-pipeline
# Inputs:
# min_max_scaler: bool
# neighbors: int
# standard_scaler: bool
components:
comp-create-dataset:
executorLabel: exec-create-dataset
outputDefinitions:
artifacts:
iris_dataset:
artifactType:
schemaTitle: system.Dataset
schemaVersion: 0.0.1
comp-normalize-dataset:
executorLabel: exec-normalize-dataset
inputDefinitions:
artifacts:
input_iris_dataset:
artifactType:
schemaTitle: system.Dataset
schemaVersion: 0.0.1
parameters:
min_max_scaler:
parameterType: BOOLEAN
standard_scaler:
parameterType: BOOLEAN
outputDefinitions:
artifacts:
normalized_iris_dataset:
artifactType:
schemaTitle: system.Dataset
schemaVersion: 0.0.1
comp-train-model:
executorLabel: exec-train-model
inputDefinitions:
artifacts:
normalized_iris_dataset:
artifactType:
schemaTitle: system.Dataset
schemaVersion: 0.0.1
parameters:
n_neighbors:
parameterType: NUMBER_INTEGER
outputDefinitions:
artifacts:
model:
artifactType:
schemaTitle: system.Model
schemaVersion: 0.0.1
deploymentSpec:
executors:
exec-create-dataset:
container:
args:
- --executor_input
- '{{$}}'
- --function_to_execute
- create_dataset
command:
- sh
- -c
- "\nif ! [ -x \"$(command -v pip)\" ]; then\n python3 -m ensurepip ||\
\ python3 -m ensurepip --user || apt-get install python3-pip\nfi\n\nPIP_DISABLE_PIP_VERSION_CHECK=1\
\ python3 -m pip install --quiet --no-warn-script-location 'kfp==2.7.0'\
\ '--no-deps' 'typing-extensions>=3.7.4,<5; python_version<\"3.9\"' &&\
\ python3 -m pip install --quiet --no-warn-script-location 'pandas==2.2.0'\
\ && \"$0\" \"$@\"\n"
- sh
- -ec
- 'program_path=$(mktemp -d)
printf "%s" "$0" > "$program_path/ephemeral_component.py"
_KFP_RUNTIME=true python3 -m kfp.dsl.executor_main --component_module_path "$program_path/ephemeral_component.py" "$@"
'
- "\nimport kfp\nfrom kfp import dsl\nfrom kfp.dsl import *\nfrom typing import\
\ *\n\ndef create_dataset(iris_dataset: Output[Dataset]):\n import pandas\
\ as pd\n\n csv_url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'\n\
\ col_names = [\n 'Sepal_Length', 'Sepal_Width', 'Petal_Length',\
\ 'Petal_Width', 'Labels'\n ]\n df = pd.read_csv(csv_url, names=col_names)\n\
\n with open(iris_dataset.path, 'w') as f:\n df.to_csv(f)\n\n"
image: quay.io/vmudadla/irisbaseubuntu:latest
exec-normalize-dataset:
container:
args:
- --executor_input
- '{{$}}'
- --function_to_execute
- normalize_dataset
command:
- sh
- -c
- "\nif ! [ -x \"$(command -v pip)\" ]; then\n python3 -m ensurepip ||\
\ python3 -m ensurepip --user || apt-get install python3-pip\nfi\n\nPIP_DISABLE_PIP_VERSION_CHECK=1\
\ python3 -m pip install --quiet --no-warn-script-location 'kfp==2.7.0'\
\ '--no-deps' 'typing-extensions>=3.7.4,<5; python_version<\"3.9\"' &&\
\ python3 -m pip install --quiet --no-warn-script-location 'pandas==2.2.0'\
\ 'scikit-learn==1.4.0' && \"$0\" \"$@\"\n"
- sh
- -ec
- 'program_path=$(mktemp -d)
printf "%s" "$0" > "$program_path/ephemeral_component.py"
_KFP_RUNTIME=true python3 -m kfp.dsl.executor_main --component_module_path "$program_path/ephemeral_component.py" "$@"
'
- "\nimport kfp\nfrom kfp import dsl\nfrom kfp.dsl import *\nfrom typing import\
\ *\n\ndef normalize_dataset(\n input_iris_dataset: Input[Dataset],\n\
\ normalized_iris_dataset: Output[Dataset],\n standard_scaler: bool,\n\
\ min_max_scaler: bool,\n):\n if standard_scaler is min_max_scaler:\n\
\ raise ValueError(\n 'Exactly one of standard_scaler\
\ or min_max_scaler must be True.')\n\n import pandas as pd\n from\
\ sklearn.preprocessing import MinMaxScaler\n from sklearn.preprocessing\
\ import StandardScaler\n\n with open(input_iris_dataset.path) as f:\n\
\ df = pd.read_csv(f)\n labels = df.pop('Labels')\n\n if standard_scaler:\n\
\ scaler = StandardScaler()\n if min_max_scaler:\n scaler\
\ = MinMaxScaler()\n\n df = pd.DataFrame(scaler.fit_transform(df))\n\
\ df['Labels'] = labels\n normalized_iris_dataset.metadata['state']\
\ = \"Normalized\"\n with open(normalized_iris_dataset.path, 'w') as\
\ f:\n df.to_csv(f)\n\n"
image: quay.io/vmudadla/irisbaseubuntu:latest
exec-train-model:
container:
args:
- --executor_input
- '{{$}}'
- --function_to_execute
- train_model
command:
- sh
- -c
- "\nif ! [ -x \"$(command -v pip)\" ]; then\n python3 -m ensurepip ||\
\ python3 -m ensurepip --user || apt-get install python3-pip\nfi\n\nPIP_DISABLE_PIP_VERSION_CHECK=1\
\ python3 -m pip install --quiet --no-warn-script-location 'kfp==2.7.0'\
\ '--no-deps' 'typing-extensions>=3.7.4,<5; python_version<\"3.9\"' &&\
\ python3 -m pip install --quiet --no-warn-script-location 'pandas==2.2.0'\
\ 'scikit-learn==1.4.0' && \"$0\" \"$@\"\n"
- sh
- -ec
- 'program_path=$(mktemp -d)
printf "%s" "$0" > "$program_path/ephemeral_component.py"
_KFP_RUNTIME=true python3 -m kfp.dsl.executor_main --component_module_path "$program_path/ephemeral_component.py" "$@"
'
- "\nimport kfp\nfrom kfp import dsl\nfrom kfp.dsl import *\nfrom typing import\
\ *\n\ndef train_model(\n normalized_iris_dataset: Input[Dataset],\n\
\ model: Output[Model],\n n_neighbors: int,\n):\n import pickle\n\
\n import pandas as pd\n from sklearn.model_selection import train_test_split\n\
\ from sklearn.neighbors import KNeighborsClassifier\n\n with open(normalized_iris_dataset.path)\
\ as f:\n df = pd.read_csv(f)\n\n y = df.pop('Labels')\n X\
\ = df\n\n X_train, X_test, y_train, y_test = train_test_split(X, y,\
\ random_state=0)\n\n clf = KNeighborsClassifier(n_neighbors=n_neighbors)\n\
\ clf.fit(X_train, y_train)\n\n model.metadata['framework'] = 'scikit-learn'\n\
\ with open(model.path, 'wb') as f:\n pickle.dump(clf, f)\n\n"
image: quay.io/vmudadla/irisbaseubuntu:latest
pipelineInfo:
name: iris-training-pipeline
root:
dag:
tasks:
create-dataset:
cachingOptions:
enableCache: true
componentRef:
name: comp-create-dataset
taskInfo:
name: create-dataset
normalize-dataset:
cachingOptions:
enableCache: true
componentRef:
name: comp-normalize-dataset
dependentTasks:
- create-dataset
inputs:
artifacts:
input_iris_dataset:
taskOutputArtifact:
outputArtifactKey: iris_dataset
producerTask: create-dataset
parameters:
min_max_scaler:
runtimeValue:
constant: false
standard_scaler:
runtimeValue:
constant: true
taskInfo:
name: normalize-dataset
train-model:
cachingOptions:
enableCache: true
componentRef:
name: comp-train-model
dependentTasks:
- normalize-dataset
inputs:
artifacts:
normalized_iris_dataset:
taskOutputArtifact:
outputArtifactKey: normalized_iris_dataset
producerTask: normalize-dataset
parameters:
n_neighbors:
componentInputParameter: neighbors
taskInfo:
name: train-model
inputDefinitions:
parameters:
min_max_scaler:
parameterType: BOOLEAN
neighbors:
parameterType: NUMBER_INTEGER
standard_scaler:
parameterType: BOOLEAN
schemaVersion: 2.1.0
sdkVersion: kfp-2.7.0
@VaniHaripriya can you squash the commits?
Force-pushed from db71bb4 to e418fd8.
Change to PR detected. A new PR build was completed.
commonEnvs = append(commonEnvs, k8score.EnvVar{
	Name:  "SSL_CERT_DIR",
	Value: sslCertDir,
})
executor.Container.Env = append(executor.Container.Env, k8score.EnvVar{
	Name:  "SSL_CERT_DIR",
	Value: sslCertDir,
})
have you noticed this behavior as well?
my guess is that you are appending the env field multiple times:
commonEnvs = append(commonEnvs, k8score.EnvVar{
	Name:  "SSL_CERT_DIR",
	Value: sslCertDir,
})
I'm not sure this is useful, because commonEnvs is used here, which is before you update env. But since it's a reference type, it does get included in the env field; however, you then add it again here:
executor.Container.Env = append(executor.Container.Env, k8score.EnvVar{
	Name:  "SSL_CERT_DIR",
	Value: sslCertDir,
})
maybe try removing the update to commonEnvs and keeping only the last change.
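For illustration, here is a minimal standalone Go sketch of the double-append behavior described above (the sslCertDir value and the way commonEnvs seeds the container env are simplified assumptions, not the actual driver code):

package main

import (
	"fmt"

	k8score "k8s.io/api/core/v1"
)

func main() {
	sslCertDir := "/etc/ssl/certs" // placeholder value for illustration

	// First append: SSL_CERT_DIR goes into the shared slice.
	var commonEnvs []k8score.EnvVar
	commonEnvs = append(commonEnvs, k8score.EnvVar{
		Name:  "SSL_CERT_DIR",
		Value: sslCertDir,
	})

	// The shared slice seeds the executor container's env,
	// mirroring how commonEnvs is consumed before env is updated.
	container := k8score.Container{Env: commonEnvs}

	// Second append: the same variable is added directly to the
	// container, so it now appears twice.
	container.Env = append(container.Env, k8score.EnvVar{
		Name:  "SSL_CERT_DIR",
		Value: sslCertDir,
	})

	for _, e := range container.Env {
		fmt.Printf("%s=%s\n", e.Name, e.Value)
	}
	// Prints SSL_CERT_DIR twice; dropping the commonEnvs append,
	// as suggested above, leaves a single entry.
}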
have you noticed this behavior as well?
@HumairAK I will take a look at it. I didn't notice this behavior when I deployed it in my cluster.
@HumairAK I am not seeing it when I deploy it in my cluster. But it's working even after removing the commonEnvs part, so I removed it. Please check.
Add ability to mount self-signed certs to dsp v2

Added SSL_CERT_DIR env variable
Deleted duplicate env vars
Force-pushed from 6141b33 to 70fe585.
Change to PR detected. A new PR build was completed.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: amadhusu, HumairAK, VaniHaripriya

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Resolves RHOAIENG-1810
Description of your changes:
Testing changes:
Deploy a DSPA using the YAML file below. Create pipeline runs by uploading the pipelinespec YAML files provided below, which use RHEL and Ubuntu images. Verify that the pipelines run without errors and that the volumeMounts in the kfp-launcher pod contain the ca-bundle mount path. The same can also be verified in the workflow YAML.
dspa.yaml
Pipelinespec yaml files:
iris-pipeline-rhel.yaml
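As a rough complement to the manual checks above, a client-go sketch along these lines could dump the env vars and volume mounts of the pods in the DSPA namespace; the namespace name here is a placeholder, and the kubeconfig path is assumed to be the default:

package main

import (
	"context"
	"fmt"
	"os"
	"path/filepath"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load the kubeconfig of the cluster where the DSPA is deployed.
	kubeconfig := filepath.Join(os.Getenv("HOME"), ".kube", "config")
	config, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// "dspa-test" is a placeholder; use the namespace the DSPA runs in.
	pods, err := clientset.CoreV1().Pods("dspa-test").List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}

	for _, pod := range pods.Items {
		for _, c := range pod.Spec.Containers {
			// Report the SSL_CERT_DIR env var added by this PR.
			for _, env := range c.Env {
				if env.Name == "SSL_CERT_DIR" {
					fmt.Printf("%s/%s: SSL_CERT_DIR=%s\n", pod.Name, c.Name, env.Value)
				}
			}
			// List all volume mounts so the ca-bundle path can be spotted.
			for _, vm := range c.VolumeMounts {
				fmt.Printf("%s/%s: mount %s at %s\n", pod.Name, c.Name, vm.Name, vm.MountPath)
			}
		}
	}
}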