Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Support metadata table "Manifests" #861

Merged
merged 17 commits into from
Jan 2, 2025

Conversation

flaneur2020
Copy link
Contributor

@flaneur2020 flaneur2020 commented Dec 29, 2024

Re #823. Extends @xxchan's #822 to add support for the Manifests metadata table. (I'll rebase and update this PR once #822 merges.)

This PR also made two small changes to make it possible to pass the test about manifests table:

  • add io to MetadataScan
  • make the MetadataTable an async trait to allow us load manifests in the impl.

there're related comments in #823, we can rebase this pr after #823 updated.

Reference implementations:

Java
PyIceberg

@Xuanwo
Copy link
Member

Xuanwo commented Dec 30, 2024

#822 has been merged, let's move on!

@flaneur2020 flaneur2020 marked this pull request as ready for review December 31, 2024 10:31
@flaneur2020
Copy link
Contributor Author

@Xuanwo merged the main branch, PTAL 🫡

/// Get the manifests table.
pub fn manifests(&self) -> ManifestsTable {
ManifestsTable {
metadata_table: self,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, I think we can simply use Table here, which suggests that MetadataTable is merely a wrapper and doesn't implement any additional API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in 9fe6bd0, PTAL

)
.await?;
for manifest in manifest_list.entries() {
content.append_value(manifest.content.clone() as i8);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit unusual to see something that can use as u8 but still requires clone.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may derive Copy for ManifestContentType

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in 4c6e338

]
}

fn schema(&self) -> Schema {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to make this pub, so engines can get the schema first without fetching the data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in 83e8811

@flaneur2020 flaneur2020 requested review from xxchan and Xuanwo January 1, 2025 05:24
Copy link
Member

@Xuanwo Xuanwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @flaneur2020 for this, and thank you @xxchan's review, let's move!

@Xuanwo Xuanwo merged commit 2fb9808 into apache:main Jan 2, 2025
16 checks passed
}

/// Returns the schema of the manifests table.
pub fn schema(&self) -> Schema {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should return iceberg schema here, and user could easily convert it to arrow schema.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should return iceberg schema here, and user could easily convert it to arrow schema.

Would you like to create an issue for this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: #868

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants