Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEAT] Enable concat for swordfish #2976

Merged
merged 15 commits into from
Oct 22, 2024
Merged

[FEAT] Enable concat for swordfish #2976

merged 15 commits into from
Oct 22, 2024

Conversation

colin-ho
Copy link
Contributor

@colin-ho colin-ho commented Sep 30, 2024

Implement concat as a streaming sink in swordfish.

Concatenating two dataframes in a streaming executor is pretty simple. You just stream the left side, then stream the right side. This PR implements just that.

@github-actions github-actions bot added the enhancement New feature or request label Sep 30, 2024
Copy link

codspeed-hq bot commented Sep 30, 2024

CodSpeed Performance Report

Merging #2976 will not alter performance

Comparing colin/swordfish-concat (102970e) with main (31d5412)

Summary

✅ 17 untouched benchmarks

Copy link

codecov bot commented Sep 30, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 78.63%. Comparing base (31d5412) to head (102970e).
Report is 1 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2976      +/-   ##
==========================================
+ Coverage   77.35%   78.63%   +1.28%     
==========================================
  Files         615      616       +1     
  Lines       74412    73147    -1265     
==========================================
- Hits        57561    57520      -41     
+ Misses      16851    15627    -1224     
Files with missing lines Coverage Δ
src/daft-local-execution/src/pipeline.rs 94.30% <100.00%> (+0.44%) ⬆️
src/daft-local-execution/src/sinks/concat.rs 100.00% <100.00%> (ø)
...c/daft-local-execution/src/sinks/streaming_sink.rs 80.15% <ø> (ø)

... and 17 files with indirect coverage changes

@colin-ho colin-ho force-pushed the colin/swordfish-concat branch from a2ca316 to fe2c0b1 Compare October 22, 2024 01:44
@colin-ho colin-ho changed the title [FEAT] Concat for native executor [FEAT] Enable concat for swordfish Oct 22, 2024
@colin-ho colin-ho marked this pull request as ready for review October 22, 2024 04:45
@colin-ho colin-ho requested a review from kevinzwang October 22, 2024 16:41
Copy link
Member

@kevinzwang kevinzwang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. One todo for the future: loosen index ordering requirement when downstream ops do not need data to be ordered. That way the right side can start computation before the left side is done.

@colin-ho colin-ho enabled auto-merge (squash) October 22, 2024 20:37
@colin-ho colin-ho merged commit 7d600c2 into main Oct 22, 2024
40 checks passed
@colin-ho colin-ho deleted the colin/swordfish-concat branch October 22, 2024 20:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants