Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create a .yaml file with kubeflow version #172

Closed

Conversation

dibryant
Copy link

Fixes for https://issues.redhat.com/browse/RHOAIENG-12524

Collaborated with @andyatmiami

Created a .yaml file with kubeflow version with automated doc

@openshift-ci openshift-ci bot requested review from harshad16 and paulovmr November 25, 2024 13:51
releases:
- name: Kubeflow Notebook
version: 1.9.0
repoUrl: https://github.com/kubeflow/kubeflow
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing newline at end of file

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add a comment how I need to run the script to get this file here? without any parameters is enough?

Copy link
Member

@jiridanek jiridanek Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andyatmiami just give me a comment on top how do I run the script to produce the output from Jira that's in the PR; that's all that I expect from the script, to be able to produce the output that's given in jira.

Best way to move this forward is IMO to merge it then, and see what the Platform team thinks, and also what other teams end up doing so we can regroup and unify. There's lots of time before 2.17 goes out (early-ish next year).

Copy link

@andyatmiami andyatmiami Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can run: generate-metadata-yaml.sh -h to spit out a simple usage line.

essentially:

  • -o required
  • -n <name> optional override to use the provided name
  • -v <name> optional override to use the provided version
  • -r <name> optional override to use the provided repoUrl

bare minimum:

./generate-metadata-yaml.sh -o /Users/astonebe/Development/Code/GitHub/odhio-kubeflow/components/notebook-controller/config/component_metadata.yaml

to exactly match what is in the Jira (but see other comments I made on this PR w.r.t. why these overrides are currently necessary)

./generate-metadata-yaml.sh -o /Users/astonebe/Development/Code/GitHub/odhio-kubeflow/components/notebook-controller/config/component_metadata.yaml -n "Kubeflow Notebook" -v "1.9.0"

Copy link
Member

@harshad16 harshad16 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one change needed

@@ -0,0 +1,4 @@
releases:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link

openshift-ci bot commented Nov 25, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from harshad16. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

exit 1
fi

_handle_metadata_file
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing newline at end of file

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a weird place where to put the script; not sure where to put it, though; /utils?

Copy link

@andyatmiami andyatmiami Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, probably my fault for not including that information when I passed this script off to @dibryant ...

My intention was to place this file at the root of the notebook-controller component directory.. in fact there is logic that requires it to be there (relative path'ing to find a PROJECT file - if it exists)

So, script should go here:

I made the decision to store the file here based on precedent from other components. Example:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/precedence/precedent/? ;p

I agree with the proposed location

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

precedence: the precedent based on presence

fixed in comment (words are hard - thanks for pointing out the error)

😎

# inspired from https://stackoverflow.com/a/29835459
current_dir=$(CDPATH= cd -- "$(dirname -- "$0")" && pwd)

kf_project_file="${current_dir}/PROJECT"
Copy link

@andyatmiami andyatmiami Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Controversial" Decision #1

I noticed most kubeflow components have this PROJECT file in the component root directory that contains (loosely) the information required by the component_metadata.yaml specification.

I decided to fully embrace auto-generating this file if the present of a PROJECT file was found (while still supporting command line flag overrides to explicitly provide the value.

Good idea? Bad Idea? You tell me!

metadata_repo_url=$(printf "https://%s/%s/%s" "${github_host}" "${github_owner}" "${github_repo}")
fi

if [ -z "${metadata_name}" ]; then
Copy link

@andyatmiami andyatmiami Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Controversial" Decision #2

While the specification for component_metadata.yaml states the name attribute should be a human-readable name.. it doesn't really provide any template/structure/consistency w.r.t how the name should be formulated.

With my intent to allow full automation w/out user input... I then opted to have the script generate the name based on information present in the PROJECT file (as outlined above)

This would result in the following:

  • name: kubeflow.org notebook-controller

I can change this default naming scheme based on preferences of team... but to automate generating the name (as presented in the Jira issue) of Kubeflow Notebook - that was going to require a lot more shell scripting that I felt was necessary/reasonable when the "core ask" here is from this ill-defined concept of "human readable".. Again happy to pivot!

Good idea? Bad Idea? You tell me!

Comment on lines 99 to 102
# NOTE: Does not handle multiple entries!!
yq_env_arg="${metadata_name}" yq -i '.releases[0].name = strenv(yq_env_arg)' "${output_file}"
yq_env_arg="${metadata_version}" yq -i '.releases[0].version = strenv(yq_env_arg)' "${output_file}"
yq_env_arg="${metadata_repo_url}" yq -i '.releases[0].repoUrl = strenv(yq_env_arg)' "${output_file}"
Copy link

@andyatmiami andyatmiami Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Controversial" Decision #3

While the specification for component_metadata.yaml supports multiple elements in the releases attribute - w/in the kubeflow components directory of red-hat-data-services - it will always map 1:1... so I simply hardcoded the script to always interact with the 0th element.

This simplifies logic greatly and I don't think in practice, for rhds kubeflow, we'd ever need anything more complicated.

Good idea? Bad Idea? You tell me!

Comment on lines 153 to 156
if ! yq --version &> /dev/null; then
printf "%s" "yq not installed... aborting script."
exit 1
fi
Copy link

@andyatmiami andyatmiami Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ I am not too familiar with the environment in which this script would be invoked (when we talk about full end-to-end automation)

This script leverages yq to cleanly interact with reading/writing to the component_metadata.yaml file... is it safe to assume yq would be available in some subsequent pipeline? and/or developers can reasonably be assumed to have yq installed on their machine should they want to run this script?

exit 1
fi

_handle_metadata_file
Copy link

@andyatmiami andyatmiami Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ While the file @dibryant checked in matches the Jira issue and looks like this:

releases: 
  - name: Kubeflow Notebook
    version: 1.9.0 
    repoUrl: https://github.com/kubeflow/kubeflow

I want to very explicitly point out that running this script with ONLY the required -o <output file> would result in the following contents getting generated:

releases: 
  - name: kubeflow.org notebook-controller
    version: v1.9.0 
    repoUrl: https://github.com/kubeflow/kubeflow

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this one is nice too; isn't kubeflow a trademark, though? maybe we need to be careful with it

I still think that its best to just merge this PR the moment it "works" and then evaluate when more teams finish their work.

@@ -0,0 +1,4 @@
releases:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Jira issue mentions the file name should be:

  • odh_metadata.yaml

(notice the underscore)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As shared in JIRA and Slack: https://redhat-internal.slack.com/archives/C060A5FJEAD/p1732213663369219
seems like the file name should be component_metadata.yaml
can we please make the change

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The JIRA we have open for our team lists: odh_metadata.yaml fwiw - I'll change that (as that was the source of truth I was going off of in my comment above)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @andyatmiami After a discussion in slack channel, the filename is decided to be component_metadata.yaml. I did add a comment in Epic here

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the internal tracker for our team to now reflect this proper name so there is not any confusion. Thanks for clarification/confirmation 💯

Copy link
Member

@harshad16 harshad16 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it intentional for opening PR in this repo and not in https://github.com/opendatahub-io/kubeflow?
can you please share the reasoning

@dibryant
Copy link
Author

Is it intentional for opening PR in this repo and not in https://github.com/opendatahub-io/kubeflow? can you please share the reasoning

No I think I opened it to this Repo because its the one I had forked. I can reopen it on the other repo

@harshad16
Copy link
Member

No I think I opened it to this Repo because its the one I had forked. I can reopen it on the other repo

yes please, can you open in opendatahub repo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants