Implement file-level parallelization #100

drdavella · 2023-10-25T20:56:08Z

Overview

Implement (optional) per-codemod file-level parallelization

Description

This PR enables files to be processed in parallel within each codemod.
There is no parallelism between codemods, since we need to preserve strict ordering.
The default for now is a single thread, so behavior is unchanged from before.
All per-file codemod state is now managed by FileContext and aggregated at the end of each codemod pass.
This will let us test in more resource-constrained environments where more threads might be useful. In particular, we suspect that environments that are highly limited by I/O bandwidth will benefit from parallelization.
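The scheme described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: it assumes a standard-library thread pool, and `process_file`, `run_codemods`, and the fields on this stand-in `FileContext` are hypothetical names chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from functools import partial


@dataclass
class FileContext:
    """Hypothetical stand-in: state produced by one codemod for one file."""
    path: str
    changes: list = field(default_factory=list)
    failures: list = field(default_factory=list)


def process_file(codemod, path):
    """Run one codemod on one file, recording results in a fresh FileContext."""
    ctx = FileContext(path)
    try:
        ctx.changes.extend(codemod(path))
    except Exception as exc:  # keep one bad file from aborting the whole pass
        ctx.failures.append(f"{path}: {exc}")
    return ctx


def run_codemods(codemods, paths, max_workers=1):
    """Apply codemods strictly in order; parallelize only across files."""
    results = {}
    for codemod in codemods:  # sequential: preserves codemod ordering
        with ThreadPoolExecutor(max_workers=max_workers) as executor:
            # map() yields results in the same order as `paths`
            contexts = list(executor.map(partial(process_file, codemod), paths))
        # Aggregate per-file state once the pass for this codemod is complete
        results[codemod.__name__] = {
            "changes": [c for ctx in contexts for c in ctx.changes],
            "failures": [f for ctx in contexts for f in ctx.failures],
        }
    return results
```

Because each worker writes only to its own `FileContext`, no locking is needed during a pass; shared results are built in a single thread after the pool has finished. With `max_workers=1` the files are still handled one at a time, which matches the stated default.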

codecov · 2023-10-25T21:04:37Z

Codecov Report

Merging #100 (15716b7) into main (73f5b4a) will decrease coverage by 0.10%.
The diff coverage is 100.00%.

❗ Current head 15716b7 differs from pull request most recent head 43b88ab. Consider uploading reports for the commit 43b88ab to get more accurate results

@@            Coverage Diff             @@
##             main     #100      +/-   ##
==========================================
- Coverage   95.63%   95.54%   -0.10%     
==========================================
  Files          60       60              
  Lines        2451     2446       -5     
==========================================
- Hits         2344     2337       -7     
- Misses        107      109       +2

Files	Coverage Δ
src/codemodder/cli.py	`100.00% <100.00%> (ø)`
src/codemodder/codemodder.py	`98.14% <100.00%> (+0.08%)`	⬆️
src/codemodder/codemods/api/__init__.py	`95.31% <100.00%> (-0.08%)`	⬇️
src/codemodder/codemods/base_codemod.py	`100.00% <100.00%> (ø)`
src/codemodder/context.py	`97.53% <100.00%> (+0.23%)`	⬆️
src/codemodder/file_context.py	`100.00% <100.00%> (ø)`
...codemodder/project_analysis/python_repo_manager.py	`100.00% <100.00%> (ø)`
src/core_codemods/order_imports.py	`92.85% <ø> (ø)`
src/core_codemods/sql_parameterization.py	`91.32% <100.00%> (-0.04%)`	⬇️
src/core_codemods/upgrade_sslcontext_tls.py	`100.00% <100.00%> (ø)`

... and 1 file with indirect coverage changes

clavedeluna

How much of a PITA would it be to add an integration test with --max-workers used?

src/codemodder/cli.py

src/codemodder/codemodder.py

drdavella · 2023-10-26T14:37:30Z

@clavedeluna since the default is to use one worker, and since this is purely to enable performance testing, I'm going to skip the integration test for now. Once we decide we want to use --max-workers > 1 in production environments, we can revisit with an integration test.
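For readers unfamiliar with the flag, an option like `--max-workers` with a default of one worker could be declared roughly as below. This is a sketch using `argparse`; the parser setup, help text, and the lower-bound check are assumptions made for illustration and are not taken from src/codemodder/cli.py.

```python
import argparse


def build_parser():
    """Hypothetical parser exposing only the option discussed here."""
    parser = argparse.ArgumentParser(prog="codemodder")
    parser.add_argument(
        "--max-workers",
        type=int,
        default=1,  # one worker keeps the previous single-threaded behavior
        help="maximum number of workers used to process files in parallel",
    )
    return parser


def parse_max_workers(argv):
    """Parse argv and return the worker count, rejecting values below 1."""
    args = build_parser().parse_args(argv)
    if args.max_workers < 1:  # assumed validation, not confirmed by the PR
        raise ValueError("--max-workers must be at least 1")
    return args.max_workers
```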

drdavella marked this pull request as ready for review October 26, 2023 13:23

drdavella requested review from clavedeluna and andrecsilva as code owners October 26, 2023 13:23

clavedeluna reviewed Oct 26, 2023

src/codemodder/cli.py (outdated, resolved)

src/codemodder/codemodder.py (outdated, resolved)

src/codemodder/codemodder.py (resolved)

drdavella force-pushed the parallel-file-processing branch from e49878a to 15716b7 Compare October 26, 2023 13:55

drdavella requested a review from clavedeluna October 26, 2023 14:14

andrecsilva approved these changes Oct 26, 2023

clavedeluna approved these changes Oct 26, 2023

Implement file-level parallelization

43b88ab

drdavella force-pushed the parallel-file-processing branch from 15716b7 to 43b88ab Compare October 26, 2023 14:37

drdavella merged commit b2b91af into main Oct 26, 2023
9 checks passed

drdavella deleted the parallel-file-processing branch October 26, 2023 14:50
