[RHOAIENG-11850] Updated etcd manifest #303

Merged: 1 commit, Sep 25, 2024

@mholder6 mholder6 commented Sep 20, 2024

Updated the manifest to include requests and limits for the etcd deployment.

Tested by applying a ResourceQuota (RQ) to the redhat-ods-applications project and checking which resources did not automatically roll out after the RQ was applied.
The etcd deployment was the only pod that was not automatically redeployed. I viewed the metrics for the unmodified etcd pod to find the request boundaries, and then added and deleted multiple ISVCs to find the limit boundaries.

Instructions to apply ResourceQuota:

Here is a copy of the RQ I used:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-resources
spec:
  hard:
    pods: "4" 
    requests.cpu: 50m 
    requests.memory: 96Mi 
    limits.cpu: "1" 
    limits.memory: 512Mi 

There are 2 ways to apply it: CLI or UI.

  1. Using the CLI, ensure you are logged into the cluster and have selected the project you want to apply the RQ to (in this case, the redhat-ods-applications project/namespace).
    a. In the directory where you saved the above RQ yaml, run oc apply -f <nameOfResourceQuota.yaml>

  2. Using the UI, there are 2 ways: creating a Pod, or creating a ResourceQuota directly.

Creating a Pod to create a ResourceQuota:
a. In the sidebar, navigate to Workloads, and then Pods.
b. Ensure you are in the project you want the RQ applied to, using the drop-down list at the top left of the screen. In this case we are using the redhat-ods-applications project.
c. Click the blue "Create Pod" button at the top right of the page, and paste the RQ yaml defined above.
d. Click the blue "Create" button at the bottom. To view the ResourceQuota, navigate to the sidebar again and click Administration, and then ResourceQuotas.

Creating a ResourceQuota directly:
a. In the sidebar, navigate to Administration, and then ResourceQuotas.
b. Ensure you are in the project you want the RQ applied to, using the drop-down list at the top left of the screen. In this case we are using the redhat-ods-applications project.
c. Click the blue "Create ResourceQuota" button at the top right of the page, and paste the RQ yaml defined above, or manually edit the values you want for the RQ.
d. Click the blue "Create" button at the bottom.

Once the RQ is applied, modify the request and limit values to satisfy the resources in your project. The deployments that are not automatically redeployed are the deployments that do not have resource values defined.
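
As a rough illustration of how per-container values count against the RQ above, here is a minimal sketch of a Deployment whose single container declares both requests and limits. The name example-app, the image, and every resource value are hypothetical and are not taken from this PR. They are chosen only so that four such pods would fit the quota: 4 x 10m = 40m requests.cpu, 4 x 24Mi = 96Mi requests.memory, 4 x 250m = 1 limits.cpu, 4 x 128Mi = 512Mi limits.memory.

```yaml
# Hypothetical example: names, image, and values are placeholders for illustration.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
        - name: example-app
          image: registry.example.com/example-app:latest  # placeholder image
          resources:
            requests:
              cpu: 10m       # counted against requests.cpu in the RQ
              memory: 24Mi   # counted against requests.memory
            limits:
              cpu: 250m      # counted against limits.cpu
              memory: 128Mi  # counted against limits.memory
```

When a quota constrains requests and limits like this, a pod that omits any of those four values is rejected at creation time, which is why deployments without resource values fail to redeploy.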

Motivation

Modifications

Result

PR checklist

Checklist items below are applicable for development targeted to both fast and stable branches/tags

  • Unit tests pass locally
  • FVT tests pass locally
  • If the PR adds a new container image or updates the tag of an existing image (not built within cpaas), is the corresponding change made in live-builder and cpaas-midstream to add/update the image tag in the operator CSV? Link the PRs if applicable

Checklist items below are applicable for development targeted to both fast and stable branches/tags

  • Tested modelmesh serving deployment with odh-manifests and ran odh-manifests-e2e tests locally

@mholder6 mholder6 requested review from spolti and Jooho September 20, 2024 18:48
@openshift-ci openshift-ci bot requested a review from israel-hdez September 20, 2024 18:48
@hdefazio hdefazio self-requested a review September 23, 2024 13:43

@hdefazio hdefazio left a comment

Do you have any screenshots to support the memory values?

israel-hdez commented Sep 24, 2024

So, I have tried this. I had to tune the ResourceQuota to let pods be created successfully.
Despite applying the fix, I still saw the problem reported in the ticket:

Error creating: pods "etcd-68d5dbd5f7-ssffm" is forbidden: failed quota: compute-resources: must specify limits.cpu for: etcd-secret-creator; limits.memory for: etcd-secret-creator; requests.cpu for: etcd-secret-creator; requests.memory for: etcd-secret-creator

Looks like the existing resources aren't the problem. Rather, the initContainer (the one named etcd-secret-creator, as noted in the error) has no resources set. Note the missing resources field here: https://github.com/opendatahub-io/modelmesh-serving/blob/main/config/overlays/odh/quickstart.yaml#L50-L78

As it stands, the PR lowers the memory limits of the etcd container. IMO, if the current requests/limits are working (given the right allocation of quotas), we should keep them untouched and only fix the missing resources spec of the etcd-secret-creator initContainer.
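
For reference, a minimal sketch of what that might look like. Only the initContainer name comes from the error message above. The resource values are placeholder assumptions and not the ones merged in this PR, and the rest of the initContainer definition (image, command, and so on) is omitted here and should be left as it is in the manifest.

```yaml
# Illustrative fragment of the etcd Deployment; values are placeholders, not the merged ones.
spec:
  template:
    spec:
      initContainers:
        - name: etcd-secret-creator
          # image, command, etc. unchanged and omitted here
          resources:
            requests:
              cpu: 10m
              memory: 32Mi
            limits:
              cpu: 100m
              memory: 64Mi
```

Since the quota check applies to every container in the pod, init containers included, setting these four values should be enough to clear the "must specify" error without touching the etcd container's own requests/limits.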

…et container initializer.

Signed-off-by: mholder6 <[email protected]>

@israel-hdez israel-hdez left a comment

Now it is working correctly on my trials.

/lgtm

openshift-ci bot commented Sep 25, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: israel-hdez, mholder6

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
