Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Calculate maximum batch size in batch patching as per time remaining in maintenance window #250

Merged
merged 19 commits into from
May 3, 2024

Conversation

GAURAVRAMRAKHYANI
Copy link
Contributor

@GAURAVRAMRAKHYANI GAURAVRAMRAKHYANI commented Mar 13, 2024

There are four changes at high level:
(a) Calculate maximum batch size in batch patching as per time remaining in maintenance window. The upper limit of max batch size is 300.

(b) If there are some remaining packages after batch patching due to failures in batch patching or maintenance window cutoff exceeded, then do batch patching again with lower batch size (1/10th of original batch size) instead of directly falling back to sequential patching. In case there is again remaining packages after batch patching with lower batch size then do sequential patching.
The code changes to calculate batch size in phase1 and phase2 will be done in later iterations once the code structure is approved.

(c) Use different value for average time to install packages for different package managers as per the telemetry data. Using current average time to install packages, it should improve after this change, and we can update the values later as per the improvement.

(d) In the maintenance window cutoff calculation, assumption of the number of packages in the batch which could take max time to install was 3, changing it to 1. Rest of the packages should take average time to install.

(e) In sequential patching, for the packages which failed to install in first attempt are retried 3 times. So, at max, there are total 4 attempts in sequential patching for a package. Now, with multi-phase batch patching implementation, we have two attempts in batch patching and 4 attempts in sequential patching. So, at max, there are total 6 attempts for a package. These are too many attempts. Each attempt to install package requires running two commands, one to install package and the other to check if package is installed successfully. These commands take up to total 1 second per package. These commands generate some logs which will add in telemetry throttling and slow down. To reduce attempts, removing 3 retries in sequential patching. So, there are total 2 attempts in batch patching and 1 in sequential patching.

Only adding function signatures and their caller from the code in the first iteration. Will implement the functions in later iterations.

Testing:
Manual testing is not done as function implementation is pending.
Test changes will be done once the product changes are approved.

Update:
Function implementation for get_max_batch_size is done.
Completed the manual testings. Fixed some issues in the code changes which were found during unit testing.
Unit tests are not done. Will do it once product code changes are reviewed.

Manual testing done:
(a) Done testing in all the 3 package managers: apt, yum and zypper.

(b) Tested the scenario in which multi-phase batch patching is done in only 1 phase and in 2 phases due to different conditions like remaining packages to install greater than 0 after phase 1 and maintenance window batch cutoff reached with phase 1 so need to do phase2. Also, tested that the packages remaining after phase 2 are attempted in sequential patching.

(c) Done testing for the case when number of packages to install is 0 in the beginning of the patch installation job.

Copy link

codecov bot commented Mar 13, 2024

Codecov Report

Attention: Patch coverage is 30.18868% with 37 lines in your changes are missing coverage. Please review.

Project coverage is 83.54%. Comparing base (8734572) to head (01ebd9d).

Files Patch % Lines
src/core/src/core_logic/PatchInstaller.py 11.42% 31 Missing ⚠️
src/core/src/core_logic/MaintenanceWindow.py 60.00% 2 Missing ⚠️
...ore/src/package_managers/AptitudePackageManager.py 66.66% 1 Missing ⚠️
src/core/src/package_managers/PackageManager.py 66.66% 1 Missing ⚠️
src/core/src/package_managers/YumPackageManager.py 66.66% 1 Missing ⚠️
.../core/src/package_managers/ZypperPackageManager.py 66.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #250      +/-   ##
==========================================
- Coverage   90.95%   83.54%   -7.41%     
==========================================
  Files          91       91              
  Lines       15723    15749      +26     
==========================================
- Hits        14301    13158    -1143     
- Misses       1422     2591    +1169     
Flag Coverage Δ
python27 ?
python39 83.54% <30.18%> (-7.41%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

SathishMSFT
SathishMSFT previously approved these changes Mar 18, 2024
@codecov-commenter
Copy link

codecov-commenter commented Mar 20, 2024

Codecov Report

Attention: Patch coverage is 98.16514% with 2 lines in your changes are missing coverage. Please review.

Project coverage is 90.97%. Comparing base (b8d7a16) to head (4de68b4).

Files Patch % Lines
src/core/src/core_logic/PatchInstaller.py 98.24% 1 Missing ⚠️
src/core/src/package_managers/PackageManager.py 66.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #250      +/-   ##
==========================================
- Coverage   90.98%   90.97%   -0.01%     
==========================================
  Files          91       91              
  Lines       15737    15799      +62     
==========================================
+ Hits        14318    14373      +55     
- Misses       1419     1426       +7     
Flag Coverage Δ
python27 90.97% <98.16%> (-0.01%) ⬇️
python39 90.97% <98.16%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

fixed some issues in code after manual testing.
Updated the average time to install package in different package managers based on update in kusto query to get average time.
feng-j678

This comment was marked as resolved.

Copy link
Contributor

@rane-rajasi rane-rajasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_max_batch_size() looks good.

An observation on the changes in install_updates() in PatchInstaller.py, the function is getting longer, please extract them out to another function in your revised code changes.

Also since get_max_batch_size() is an independent function, we can have UTs to validate it. For any future revisions, please try to add UTs to test out the smaller units. Overall UT to validate E2E scenario can be added in the iteration that has all code changes

(a) There should be two new lines at the end of file
(b) Extract some code of install_updates to another function as install_updates was getting longer.
@GAURAVRAMRAKHYANI
Copy link
Contributor Author

GAURAVRAMRAKHYANI commented Apr 1, 2024

get_max_batch_size() looks good.

An observation on the changes in install_updates() in PatchInstaller.py, the function is getting longer, please extract them out to another function in your revised code changes.

Also since get_max_batch_size() is an independent function, we can have UTs to validate it. For any future revisions, please try to add UTs to test out the smaller units. Overall UT to validate E2E scenario can be added in the iteration that has all code changes

(a) I have addressed the comment that as install_updates is getting longer, extracted some part of it to another function. Did some manual testings and no issues found. Continuing doing more testing around this change.

(b) I have not yet implemented UT for get_max_batch_size(), will implement it tomorrow IST. Meanwhile, can you review the product changes?

Copy link
Contributor

@rane-rajasi rane-rajasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Responded to asks on previous feedback. Waiting on UTs

src/core/src/core_logic/PatchInstaller.py Outdated Show resolved Hide resolved
src/core/src/core_logic/PatchInstaller.py Outdated Show resolved Hide resolved
src/core/src/package_managers/AptitudePackageManager.py Outdated Show resolved Hide resolved
Copy link
Contributor

@feng-j678 feng-j678 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the cicd pipeline is failing, the code coverage does not meet the threshold, and there are failing tests

image
image

per Gaurav:
It is expected, the unit tests are not updated, As batch size has changed, the unit tests will fail as they expect older batch size.
I will update unit tests once product changes are reviewed otherwise will have to update the unit tests again if there are changes in product

@kjohn-msft @rane-rajasi

The fix is in the function get_max_batch_size, there is difference in behaviour in python2 and python3. In python2, int/int is int but in python3 int/int is float

Addressed PR comment by adding the class PackageBatchConfig for batch patching constants
Copy link
Collaborator

@kjohn-msft kjohn-msft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please have the automated checks pass (repeat). Code functionality is okay.

@kjohn-msft
Copy link
Collaborator

Code cov is having a partial service outage. https://status.codecov.com/
Retry after the status shows green. @feng-j678 for review on code state as now.

kjohn-msft
kjohn-msft previously approved these changes May 2, 2024
feng-j678
feng-j678 previously approved these changes May 2, 2024
@kjohn-msft kjohn-msft dismissed rane-rajasi’s stale review May 2, 2024 22:20

rane-rajasi is out of office

@GAURAVRAMRAKHYANI GAURAVRAMRAKHYANI dismissed stale reviews from feng-j678 and kjohn-msft via a8ad80a May 3, 2024 11:42
feng-j678
feng-j678 previously approved these changes May 3, 2024
@kjohn-msft kjohn-msft merged commit 1641706 into master May 3, 2024
5 checks passed
@kjohn-msft kjohn-msft deleted the garamrak-multiPhase branch May 3, 2024 18:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants