[FEAT] [Join Optimizations] Add broadcast join. #3280
Annotations
2 errors
Run release-drafter/release-drafter@v5
Validation Failed: {"resource":"Release","code":"invalid","field":"target_commitish"}
{
name: 'HttpError',
id: '7133863019',
status: 422,
response: {
url: 'https://api.github.com/repos/Eventual-Inc/Daft/releases/132770221',
status: 422,
headers: {
'access-control-allow-origin': '*',
'access-control-expose-headers': 'ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Used, X-RateLimit-Resource, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type, X-GitHub-SSO, X-GitHub-Request-Id, Deprecation, Sunset',
connection: 'close',
'content-length': '195',
'content-security-policy': "default-src 'none'",
'content-type': 'application/json; charset=utf-8',
date: 'Thu, 07 Dec 2023 20:51:38 GMT',
'referrer-policy': 'origin-when-cross-origin, strict-origin-when-cross-origin',
server: 'GitHub.com',
'strict-transport-security': 'max-age=31536000; includeSubdomains; preload',
vary: 'Accept-Encoding, Accept, X-Requested-With',
'x-accepted-github-permissions': 'contents=write',
'x-content-type-options': 'nosniff',
'x-frame-options': 'deny',
'x-github-api-version-selected': '2022-11-28',
'x-github-media-type': 'github.v3; format=json',
'x-github-request-id': '8DC7:7361:8AB3C1:11E26EF:65723059',
'x-ratelimit-limit': '1000',
'x-ratelimit-remaining': '970',
'x-ratelimit-reset': '1701985799',
'x-ratelimit-resource': 'core',
'x-ratelimit-used': '30',
'x-xss-protection': '0'
},
data: {
message: 'Validation Failed',
errors: [
{
resource: 'Release',
code: 'invalid',
field: 'target_commitish'
}
],
documentation_url: 'https://docs.github.com/rest/releases/releases#update-a-release'
}
},
request: {
method: 'PATCH',
url: 'https://api.github.com/repos/Eventual-Inc/Daft/releases/132770221',
headers: {
accept: 'application/vnd.github.v3+json',
'user-agent': 'probot/12.2.5 octokit-core.js/3.5.1 Node.js/16.20.2 (linux; x64)',
authorization: 'token [REDACTED]',
'content-type': 'application/json; charset=utf-8'
},
body: '{"body":"## Changes\\n\\n## ✨ New Features\\n\\n- [FEAT] [JSON Reader] Add native streaming + parallel JSON reader. @clarkzinzow (#1679)\\n\\n## 🚀 Performance Improvements\\n\\n- [PERF] Enable Predicates in Parquet Reader @samster25 (#1702)\\n\\n## 📖 Documentation\\n\\n- [DOCS] Add notebooks used for pydata global 2023 presentation @jaychia (#1703)\\n","draft":true,"prerelease":false,"make_latest":"true","name":"v0.2.7","tag_name":"v0.2.7","target_commitish":"refs/pull/1706/merge"}',
request: {}
},
event: {
id: '7133863019',
name: 'pull_request',
payload: {
action: 'edited',
changes: {
body: {
from: 'This PR adds a broadcast join implementation as a new join strategy, where all partitions of a small table are broadcasted to each partition in the larger table, such that we do a local (hash) join of the entire small table with each individual partition of the larger table.\r\n' +
'\r\n' +
'The query planner chooses the broadcast join as its join strategy if one of the sides of the join is smaller than a preconfigured broadcasting threshold (set to 10 MiB by default, but is user-configurable).\r\n' +
'\r\n' +
'If the smaller side of the join is the right side, we invert the join for planning and scheduling simplicity so we can always broadcast the left side; we then swap back to the correct join ordering when performing the local joins. This means that we always form the probe table on the left side of the join; a future optimization (applicable to both the broadcast join and the hash join) would be to have local joins build the probe table on the smaller side while preserving the expected column ordering. We would still need to always build the probe table on the left s
|
Run release-drafter/release-drafter@v5
HttpError: Validation Failed: {"resource":"Release","code":"invalid","field":"target_commitish"}
at /home/runner/work/_actions/release-drafter/release-drafter/v5/dist/index.js:8462:21
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at async Job.doExecute (/home/runner/work/_actions/release-drafter/release-drafter/v5/dist/index.js:30793:18)
{
name: 'AggregateError',
event: {
id: '7133863019',
name: 'pull_request',
payload: {
action: 'edited',
changes: {
body: {
from: 'This PR adds a broadcast join implementation as a new join strategy, where all partitions of a small table are broadcasted to each partition in the larger table, such that we do a local (hash) join of the entire small table with each individual partition of the larger table.\r\n' +
'\r\n' +
'The query planner chooses the broadcast join as its join strategy if one of the sides of the join is smaller than a preconfigured broadcasting threshold (set to 10 MiB by default, but is user-configurable).\r\n' +
'\r\n' +
'If the smaller side of the join is the right side, we invert the join for planning and scheduling simplicity so we can always broadcast the left side; we then swap back to the correct join ordering when performing the local joins. This means that we always form the probe table on the left side of the join; a future optimization (applicable to both the broadcast join and the hash join) would be to have local joins build the probe table on the smaller side while preserving the expected column ordering. We would still need to always build the probe table on the left side of the join if we need to preserve the row-ordering of the right side of the join, e.g. if the right side of the join is range-partitioned.\r\n' +
'\r\n' +
'## TODOs\r\n' +
'\r\n' +
'- [x] Test coverage.\r\n' +
'- [ ] (Follow-up?) TPC-H benchmarking demonstrating speedup due to use of broadcast join.\r\n' +
'- [ ] (Follow-up) In local joins, build the probe table on the smaller side of the join.\r\n' +
'- [ ] (Follow-up) Add table size approximations for operators that affect cardinality.'
}
},
number: 1706,
organization: {
avatar_url: 'https://avatars.githubusercontent.com/u/98941975?v=4',
description: 'Eventual Computing',
events_url: 'https://api.github.com/orgs/Eventual-Inc/events',
hooks_url: 'https://api.github.com/orgs/Eventual-Inc/hooks',
id: 98941975,
issues_url: 'https://api.github.com/orgs/Eventual-Inc/issues',
login: 'Eventual-Inc',
members_url: 'https://api.github.com/orgs/Eventual-Inc/members{/member}',
node_id: 'O_kgDOBeW8Fw',
public_members_url: 'https://api.github.com/orgs/Eventual-Inc/public_members{/member}',
repos_url: 'https://api.github.com/orgs/Eventual-Inc/repos',
url: 'https://api.github.com/orgs/Eventual-Inc'
},
pull_request: {
_links: {
comments: {
href: 'https://api.github.com/repos/Eventual-Inc/Daft/issues/1706/comments'
},
commits: {
href: 'https://api.github.com/repos/Eventual-Inc/Daft/pulls/1706/commits'
},
html: { href: 'https://github.com/Eventual-Inc/Daft/pull/1706' },
issue: {
href: 'https://api.github.com/repos/Eventual-Inc/Daft/issues/1706'
},
review_comment: {
href: 'https://api.github.com/repos/Eventual-Inc/Daft/pulls/comments{/number}'
},
review_comments: {
href: 'https://api.github.com/repos/Eventual-Inc/Daft/pulls/1706/comments'
},
self: {
href: 'https://api.github.com/repos/Eventual-Inc/Daft/pulls/1706'
},
statuses: {
href: 'https://api.github.com/repos/Eventual-Inc/Daft/statuses/4fb0e12e84fb572291fa2f323a0b351843068604'
}
},
active_lock_re
|
The logs for this run have expired and are no longer available.
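The pull request body quoted in the event payload above describes how the planner picks a broadcast join and how the small side is joined against each partition of the larger side. A minimal Python sketch of that description follows. It is not Daft's implementation: `JoinPlan`, `choose_join_strategy`, and `broadcast_join` are hypothetical names, and tables are modelled as lists of partitions holding dict rows. Only the 10 MiB default threshold and the "invert when the right side is smaller" rule come from the PR text. The local join is a plain inner hash join that ignores column ordering.

```python
from dataclasses import dataclass
from typing import Optional

MIB = 1024 * 1024
# Default taken from the PR description (10 MiB, user-configurable).
DEFAULT_BROADCAST_THRESHOLD_BYTES = 10 * MIB


@dataclass(frozen=True)
class JoinPlan:
    """Hypothetical planner output, for illustration only."""
    strategy: str                  # "broadcast" or "hash"
    broadcast_side: Optional[str]  # "left", "right", or None for a hash join
    inverted: bool                 # True if the sides are swapped for planning


def choose_join_strategy(left_bytes, right_bytes,
                         threshold=DEFAULT_BROADCAST_THRESHOLD_BYTES):
    """Pick a broadcast join when the smaller side is under the threshold."""
    if min(left_bytes, right_bytes) >= threshold:
        return JoinPlan("hash", None, False)
    if right_bytes < left_bytes:
        # Smaller side is on the right: invert so the broadcast side is
        # always treated as the left side during planning and scheduling.
        return JoinPlan("broadcast", "right", True)
    return JoinPlan("broadcast", "left", False)


def broadcast_join(small_partitions, large_partitions, key):
    """Join the entire small table against each large partition locally."""
    # Gather every partition of the small table into one hash table.
    table = {}
    for partition in small_partitions:
        for row in partition:
            table.setdefault(row[key], []).append(row)

    # Each large partition is joined independently, so the output keeps
    # the partitioning of the larger table.
    result = []
    for partition in large_partitions:
        joined = []
        for row in partition:
            for match in table.get(row[key], []):
                joined.append({**match, **row})
        result.append(joined)
    return result
```

In this sketch the larger table is never repartitioned; only the small table is copied to every partition, which is the property the PR description relies on.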