Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CHORE] Remove non-MicroPartition and non-ScanOperator paths #1946

Merged
merged 4 commits into from
Feb 24, 2024

Conversation

clarkzinzow
Copy link
Contributor

This PR removes the legacy non-MicroPartition and non-ScanOperator code paths, refactoring where required. As a drive-by, this PR also refactors the query optimization tests to use structural equality assertions instead of repr-based ones, and adds additional test coverage around scan pushdowns.

Closes #1863
Closes #1939
Closes #1945

Copy link

codecov bot commented Feb 23, 2024

Codecov Report

Attention: Patch coverage is 92.59259% with 2 lines in your changes are missing coverage. Please review.

Project coverage is 83.93%. Comparing base (99db777) to head (3ab8ff3).

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1946      +/-   ##
==========================================
- Coverage   85.63%   83.93%   -1.71%     
==========================================
  Files          55       55              
  Lines        6250     6111     -139     
==========================================
- Hits         5352     5129     -223     
- Misses        898      982      +84     
Files Coverage Δ
daft/execution/execution_step.py 93.23% <100.00%> (-0.44%) ⬇️
daft/execution/physical_plan.py 95.17% <100.00%> (+0.12%) ⬆️
daft/execution/rust_physical_plan_shim.py 93.10% <ø> (-0.72%) ⬇️
daft/io/_csv.py 95.23% <100.00%> (ø)
daft/io/_json.py 100.00% <100.00%> (ø)
daft/io/_parquet.py 100.00% <100.00%> (ø)
daft/logical/builder.py 89.83% <100.00%> (-0.34%) ⬇️
daft/logical/schema.py 91.96% <ø> (+0.58%) ⬆️
daft/plan_scheduler/physical_plan_scheduler.py 75.00% <100.00%> (ø)
daft/runners/pyrunner.py 96.62% <100.00%> (+0.45%) ⬆️
... and 5 more

... and 2 files with indirect coverage changes

@samster25
Copy link
Member

samster25 commented Feb 23, 2024

@clarkzinzow btw: it still looks like github actions is still running the matrix over MICROPARTITIONS

@clarkzinzow
Copy link
Contributor Author

@samster25 Hmm I thought that I removed that from the actions config, let me double-check.

Copy link
Contributor

@jaychia jaychia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me!!! I feel like a new man...

Amazing work going through all our optimizer tests 😰 I think the dummy_scan_operator is fine, and makes the tests nice and explicit

image

@@ -303,128 +290,6 @@ def num_outputs(self) -> int:
return 1


@dataclass(frozen=True)
class ReadFile(SingleOutputInstruction):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Goodbye and thank you for your service 🫡

src/daft-scan/src/lib.rs Outdated Show resolved Hide resolved
@jaychia
Copy link
Contributor

jaychia commented Feb 24, 2024

We'll need github repo admin permissions to change our "Required" CI steps before we can actually merge!

@clarkzinzow clarkzinzow force-pushed the clark/remove-old-table-path branch from 5d52fc6 to 3ab8ff3 Compare February 24, 2024 01:37
@clarkzinzow
Copy link
Contributor Author

clarkzinzow commented Feb 24, 2024

We'll need github repo admin permissions to change our "Required" CI steps before we can actually merge!

@jaychia Not sure if I understand how the required workflows work, but I just rebased this PR on latest main, so I think we can have someone with admin repo privileges force-merge the PR since all pre-merge checks are green, without waiting for the post-merge checks to run. I assume that our required workflows are reusing our normal pre-merge workflows as a "reusable" workflow, so once the reusable workflows are updated in main via merging this PR, main commits and PR commits should be post-merge tested using the updated workflows as expected.

Am I understanding our required workflows setup correctly? I don't think I have permissions to see it, let alone modify it, unfortunately. ☹️

@jaychia
Copy link
Contributor

jaychia commented Feb 24, 2024 via email

@clarkzinzow
Copy link
Contributor Author

@jaychia should be good to merge now!

@jaychia jaychia merged commit 0a51db1 into main Feb 24, 2024
28 of 29 checks passed
@jaychia jaychia deleted the clark/remove-old-table-path branch February 24, 2024 03:41
@jaychia
Copy link
Contributor

jaychia commented Feb 24, 2024

Edited main branch protection rules and merged PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants