[BUG] Use schema_hints as hints instead of definitive schema #3205

Triggered via pull request December 1, 2023 23:33
Status Success
Total duration 25s
Artifacts

release-drafter.yml

on: pull_request
update_release_draft
6s

Annotations

3 errors
update_release_draft
Resource not accessible by integration { name: 'HttpError', id: '7066345326', status: 403, response: { url: 'https://api.github.com/repos/Eventual-Inc/Daft/issues/1636/labels', status: 403, headers: { 'access-control-allow-origin': '*', 'access-control-expose-headers': 'ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Used, X-RateLimit-Resource, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type, X-GitHub-SSO, X-GitHub-Request-Id, Deprecation, Sunset', connection: 'close', 'content-encoding': 'gzip', 'content-security-policy': "default-src 'none'", 'content-type': 'application/json; charset=utf-8', date: 'Fri, 01 Dec 2023 23:33:42 GMT', 'referrer-policy': 'origin-when-cross-origin, strict-origin-when-cross-origin', server: 'GitHub.com', 'strict-transport-security': 'max-age=31536000; includeSubdomains; preload', 'transfer-encoding': 'chunked', vary: 'Accept-Encoding, Accept, X-Requested-With', 'x-accepted-github-permissions': 'issues=write; pull_requests=write', 'x-content-type-options': 'nosniff', 'x-frame-options': 'deny', 'x-github-api-version-selected': '2022-11-28', 'x-github-media-type': 'github.v3; format=json', 'x-github-request-id': 'DF85:69E3:5B98834:5EA4CAD:656A6D56', 'x-ratelimit-limit': '1000', 'x-ratelimit-remaining': '963', 'x-ratelimit-reset': '1701475150', 'x-ratelimit-resource': 'core', 'x-ratelimit-used': '37', 'x-xss-protection': '0' }, data: { message: 'Resource not accessible by integration', documentation_url: 'https://docs.github.com/rest/issues/labels#add-labels-to-an-issue' } }, request: { method: 'POST', url: 'https://api.github.com/repos/Eventual-Inc/Daft/issues/1636/labels', headers: { accept: 'application/vnd.github.v3+json', 'user-agent': 'probot/12.2.5 octokit-core.js/3.5.1 Node.js/16.20.2 (linux; x64)', authorization: 'token [REDACTED]', 'content-type': 'application/json; charset=utf-8' }, body: '{"labels":["bug"]}', request: {} }, 
event: { id: '7066345326', name: 'pull_request', payload: { action: 'edited', changes: { body: { from: 'Addresses #1599 \r\n' + '\r\n' + "Instead of using `schema_hints` as a definitive schema, use them as 'hints' as to the intended datatype of each column. \r\n" + "This is implemented via running schema inference first, then applying the 'hints' onto the inferred schema.\r\n" + '\r\n' + 'Changes:\r\n' + '- Added `apply_hints` method for the PySchema class.\r\n' + '- Apply hints after schema inference in `_get_tabular_files_scan` (when DAFT_MICROPARTITIONS=0)\r\n' + '- Apply hints after schema inference in `GlobScanOperator.try_new` (when DAFT_MICROPARTITIONS=1)\r\n' + '- Remove passing hints into `from_tabular_scan_with_scan_operator` because schema hints are already applied in ScanOperator.\r\n' + '- Added option to pass in schema into `read_parquet_into_micropartition` (this was necessary because the schema created from scan operator was not passed in)\r\n' + '\r\n' + 'Tests:\r\n' + '- Added tests for for read_csv, read_json, read_parquet\r\n' + '- For read_csv, I added a test case to ensure that if `has_headers=false`, then the schema_hints **should** be used as definitive schema.\r\n' + '\r\n' + 'Feedback greatly appreciated! Let me know if this is the correctly intended behaviour, and also if the code can be optimized/refactored since this is my first time writing Rust!' } }, number: 1636, organization: { avatar_url: 'https://avatars.githubusercontent.com/u/98941975?v=4', description: 'Eventual Computing', events_url: 'https://api.github.c
update_release_draft
Resource not accessible by integration { name: 'HttpError', id: '7066345326', status: 403, response: { url: 'https://api.github.com/repos/Eventual-Inc/Daft/releases', status: 403, headers: { 'access-control-allow-origin': '*', 'access-control-expose-headers': 'ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Used, X-RateLimit-Resource, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type, X-GitHub-SSO, X-GitHub-Request-Id, Deprecation, Sunset', connection: 'close', 'content-encoding': 'gzip', 'content-security-policy': "default-src 'none'", 'content-type': 'application/json; charset=utf-8', date: 'Fri, 01 Dec 2023 23:33:44 GMT', 'referrer-policy': 'origin-when-cross-origin, strict-origin-when-cross-origin', server: 'GitHub.com', 'strict-transport-security': 'max-age=31536000; includeSubdomains; preload', 'transfer-encoding': 'chunked', vary: 'Accept-Encoding, Accept, X-Requested-With', 'x-accepted-github-permissions': 'contents=write; contents=write,workflows=write', 'x-content-type-options': 'nosniff', 'x-frame-options': 'deny', 'x-github-api-version-selected': '2022-11-28', 'x-github-media-type': 'github.v3; format=json', 'x-github-request-id': 'DF87:26E7:4661AA1:48EA3CA:656A6D58', 'x-ratelimit-limit': '1000', 'x-ratelimit-remaining': '962', 'x-ratelimit-reset': '1701475150', 'x-ratelimit-resource': 'core', 'x-ratelimit-used': '38', 'x-xss-protection': '0' }, data: { message: 'Resource not accessible by integration', documentation_url: 'https://docs.github.com/rest/releases/releases#create-a-release' } }, request: { method: 'POST', url: 'https://api.github.com/repos/Eventual-Inc/Daft/releases', headers: { accept: 'application/vnd.github.v3+json', 'user-agent': 'probot/12.2.5 octokit-core.js/3.5.1 Node.js/16.20.2 (linux; x64)', authorization: 'token [REDACTED]', 'content-type': 'application/json; charset=utf-8' }, body: 
'{"target_commitish":"refs/pull/1636/merge","name":"v0.2.6","tag_name":"v0.2.6","body":"## Changes\\n\\n## ✨ New Features\\n\\n- [FEAT] Enable Comparison between timestamp / dates @samster25 (#1689)\\n- [FEAT] Enable MicroPartitions by default @jaychia (#1684)\\n- [FEAT] Temporal Literals for Date and Timestamp @samster25 (#1683)\\n- [FEAT] Partitioning exprs for Iceberg @samster25 (#1680)\\n\\n## 👾 Bug Fixes\\n\\n- [BUG] fix off by 1 for retries for cred provider @samster25 (#1681)\\n\\n## 🧰 Maintenance\\n\\n- [CHORE] drop s3 compat mode for gcs for anonymous mode @samster25 (#1682)\\n- [CHORE] Remove usage of credentials in workflows @jaychia (#1686)\\n- [CHORE] Iceberg Image Caching @samster25 (#1687)\\n- [CHORE] Bump Iceberg Version and V1 of caching @samster25 (#1685)\\n","draft":true,"prerelease":false,"make_latest":"true"}', request: { retryCount: 1 } }, event: { id: '7066345326', name: 'pull_request', payload: { action: 'edited', changes: { body: { from: 'Addresses #1599 \r\n' + '\r\n' + "Instead of using `schema_hints` as a definitive schema, use them as 'hints' as to the intended datatype of each column. \r\n" + "This is implemented via running schema inference first, then applying the 'hints' onto the inferred schema.\r\n" + '\r\n' + 'Changes:\r\n' + '- Added `apply_hints` method for the PySchema class.\r\n' + '- Apply hints after schema inference in `_get_tabular_files_scan` (when DAFT_MICROPARTITIONS=0)\r\n' + '- Apply hints after schema inference in `GlobScanOperator.try_new` (when DAFT_MICROPARTITIONS=1)\r\n' + '- Remove passing hints into `from_tabular_scan_with_scan_operator` because schema hints are already applied in ScanOperator.\r\n' + '- Added option to pass in schema into `read_parque
update_release_draft
HttpError: Resource not accessible by integration at /home/runner/work/_actions/release-drafter/release-drafter/v5/dist/index.js:8462:21 at processTicksAndRejections (node:internal/process/task_queues:96:5) at async Job.doExecute (/home/runner/work/_actions/release-drafter/release-drafter/v5/dist/index.js:30793:18) HttpError: Resource not accessible by integration at /home/runner/work/_actions/release-drafter/release-drafter/v5/dist/index.js:8462:21 at processTicksAndRejections (node:internal/process/task_queues:96:5) at async Job.doExecute (/home/runner/work/_actions/release-drafter/release-drafter/v5/dist/index.js:30793:18) { name: 'AggregateError', event: { id: '7066345326', name: 'pull_request', payload: { action: 'edited', changes: { body: { from: 'Addresses #1599 \r\n' + '\r\n' + "Instead of using `schema_hints` as a definitive schema, use them as 'hints' as to the intended datatype of each column. \r\n" + "This is implemented via running schema inference first, then applying the 'hints' onto the inferred schema.\r\n" + '\r\n' + 'Changes:\r\n' + '- Added `apply_hints` method for the PySchema class.\r\n' + '- Apply hints after schema inference in `_get_tabular_files_scan` (when DAFT_MICROPARTITIONS=0)\r\n' + '- Apply hints after schema inference in `GlobScanOperator.try_new` (when DAFT_MICROPARTITIONS=1)\r\n' + '- Remove passing hints into `from_tabular_scan_with_scan_operator` because schema hints are already applied in ScanOperator.\r\n' + '- Added option to pass in schema into `read_parquet_into_micropartition` (this was necessary because the schema created from scan operator was not passed in)\r\n' + '\r\n' + 'Tests:\r\n' + '- Added tests for for read_csv, read_json, read_parquet\r\n' + '- For read_csv, I added a test case to ensure that if `has_headers=false`, then the schema_hints **should** be used as definitive schema.\r\n' + '\r\n' + 'Feedback greatly appreciated! 
Let me know if this is the correctly intended behaviour, and also if the code can be optimized/refactored since this is my first time writing Rust!' } }, number: 1636, organization: { avatar_url: 'https://avatars.githubusercontent.com/u/98941975?v=4', description: 'Eventual Computing', events_url: 'https://api.github.com/orgs/Eventual-Inc/events', hooks_url: 'https://api.github.com/orgs/Eventual-Inc/hooks', id: 98941975, issues_url: 'https://api.github.com/orgs/Eventual-Inc/issues', login: 'Eventual-Inc', members_url: 'https://api.github.com/orgs/Eventual-Inc/members{/member}', node_id: 'O_kgDOBeW8Fw', public_members_url: 'https://api.github.com/orgs/Eventual-Inc/public_members{/member}', repos_url: 'https://api.github.com/orgs/Eventual-Inc/repos', url: 'https://api.github.com/orgs/Eventual-Inc' }, pull_request: { _links: { comments: { href: 'https://api.github.com/repos/Eventual-Inc/Daft/issues/1636/comments' }, commits: { href: 'https://api.github.com/repos/Eventual-Inc/Daft/pulls/1636/commits' }, html: { href: 'https://github.com/Eventual-Inc/Daft/pull/1636' }, issue: { href: 'https://api.github.com/repos/Eventual-Inc/Daft/issues/1636' }, review_comment: { href: 'https://api.github.com/repos/Eventual-Inc/Daft/pulls/comments{/number}' }, review_comments: { href: 'https://api.github.com/repos/Eventual-Inc/Daft/pulls/1636/comments' }, self: { href: 'https://api.github.com/repos/Eventual-Inc/Daft/pulls/1636' }, statuses: { href: 'https://api.github.com/repos/Eventual-Inc/Daft/statuses/ff67fa7e2f20204b0aca
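The pull request body quoted in the annotations above describes running schema inference first and then applying `schema_hints` on top of the inferred schema, except when `has_headers=false`, where the hints act as the definitive schema. A minimal sketch of that behaviour follows. It uses plain dictionaries rather than Daft's `PySchema`; the names `apply_hints` and `resolve_schema` as written here, the dict representation, and the choice to ignore hints for columns that inference did not find are all assumptions made for illustration, not the code from the pull request.

```python
from typing import Dict, Optional

# Hypothetical schema representation: column name -> dtype name.
Schema = Dict[str, str]


def apply_hints(inferred: Schema, hints: Optional[Schema]) -> Schema:
    """Override inferred dtypes with hinted dtypes for matching column names."""
    merged = dict(inferred)  # copy so the inferred schema is left untouched
    for name, dtype in (hints or {}).items():
        # Assumption: a hint for a column that inference did not find is ignored.
        if name in merged:
            merged[name] = dtype
    return merged


def resolve_schema(inferred: Schema, hints: Optional[Schema], has_headers: bool = True) -> Schema:
    """Pick the final schema for a tabular read."""
    if not has_headers and hints:
        # Without a header row there are no real column names to match against,
        # so the hints are taken as the definitive schema.
        return dict(hints)
    return apply_hints(inferred, hints)


inferred = {"id": "int64", "price": "int64", "name": "string"}
hints = {"price": "float64"}

with_headers = resolve_schema(inferred, hints)
without_headers = resolve_schema({"column_1": "int64"}, {"id": "int64"}, has_headers=False)
```

With headers present, only `price` changes type and the other inferred columns are kept; without headers, the result is exactly the hinted schema.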