feat: implement schema selection and projection methods #207
Merged
Xuanwo merged 10 commits into apache:main on Sep 23, 2025
Contributor
nullccxsy
commented
Sep 3, 2025
- Added select and project methods to the Schema class for creating projection schemas based on specified field names or IDs.
- Introduced PruneColumnVisitor to handle the logic for selecting and projecting fields, including support for nested structures.
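To make the description concrete, here is a minimal sketch of pruning a schema by field ID, including nested structs. It is not the code from this PR: `Field`, `Prune`, `PruneSchema`, and `ExampleSchema` are hypothetical names on a toy model, and the `select_full_types` flag is assumed to behave as the visitor's doc comment describes (keep a selected struct whole when true, keep only its selected sub-fields when false).

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical toy model for illustration; these are not the iceberg-cpp types.
struct Field {
  int id;
  std::string name;
  std::vector<Field> children;  // sub-fields of a struct; empty for primitives
};

// Returns a pruned copy of `field`, or std::nullopt when neither the field
// nor any of its descendants is selected.
std::optional<Field> Prune(const Field& field,
                           const std::unordered_set<int>& ids,
                           bool select_full_types) {
  const bool selected = ids.count(field.id) > 0;
  if (selected && (select_full_types || field.children.empty())) {
    return field;  // keep the field together with everything beneath it
  }
  // Otherwise rebuild the field from whichever children survive pruning.
  Field pruned{field.id, field.name, {}};
  for (const Field& child : field.children) {
    if (auto kept = Prune(child, ids, select_full_types)) {
      pruned.children.push_back(std::move(*kept));
    }
  }
  if (selected || !pruned.children.empty()) {
    return pruned;
  }
  return std::nullopt;
}

// Applies Prune to every top-level column and keeps the survivors in order.
std::vector<Field> PruneSchema(const std::vector<Field>& columns,
                               const std::unordered_set<int>& ids,
                               bool select_full_types) {
  std::vector<Field> result;
  for (const Field& column : columns) {
    if (auto kept = Prune(column, ids, select_full_types)) {
      result.push_back(std::move(*kept));
    }
  }
  return result;
}

// Example schema: id(1), location(2){lat(3), lon(4)}, name(5).
std::vector<Field> ExampleSchema() {
  return {Field{1, "id", {}},
          Field{2, "location", {Field{3, "lat", {}}, Field{4, "lon", {}}}},
          Field{5, "name", {}}};
}
```

In this sketch, selecting IDs 1 and 3 keeps `id` and a `location` struct that contains only `lat`. Selecting ID 2 keeps the whole `location` struct when `select_full_types` is true, and an empty `location` struct when it is false.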
351edfe to
fe87277
Compare
wgtmac
reviewed
Sep 5, 2025
wgtmac
requested changes
Sep 7, 2025
src/iceberg/schema.cc
Outdated
Comment on lines 261 to 272
/// \brief Visitor class for pruning schema columns based on selected field IDs.
///
/// This visitor traverses a schema and creates a projected version containing only
/// the specified fields. It handles different projection modes:
/// - select_full_types=true: Include entire fields when their ID is selected
/// - select_full_types=false: Recursively project nested fields within selected structs
///
/// \warning Error conditions that will cause projection to fail:
/// - Attempting to explicitly project List or Map types (returns InvalidArgument)
/// - Projecting a List when element result is null (returns InvalidArgument)
/// - Projecting a Map without a defined map value type (returns InvalidArgument)
/// - Projecting a struct when result is not StructType (returns InvalidArgument)
Member
I think it is easy and valid to support projections in nested map and list types, and I don't know why the Java impl does not support this. The code will be much simpler (shorter) if we support them.
@Fokko Do you have any context on the Java impl?
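As a rough illustration of the suggestion above, the following sketch lets projection descend into list elements and map values instead of rejecting them. It reflects neither this PR's final code nor the Java implementation. `Node`, `Kind`, `Project`, `ExamplePoints`, and `ExampleAttrs` are hypothetical, and keeping the map key whole whenever the value survives is an assumption made for this example.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical toy model for illustration; these are not the iceberg-cpp types.
enum class Kind { kPrimitive, kStruct, kList, kMap };

struct Node {
  int id;
  std::string name;
  Kind kind;
  // struct: its fields; list: {element}; map: {key, value}; primitive: empty
  std::vector<Node> children;
};

// Returns the projected copy of `node`, or std::nullopt if nothing in the
// subtree is selected. A selected node is kept with its full subtree.
std::optional<Node> Project(const Node& node,
                            const std::unordered_set<int>& ids) {
  if (ids.count(node.id) > 0) {
    return node;
  }
  Node out{node.id, node.name, node.kind, {}};
  switch (node.kind) {
    case Kind::kPrimitive:
      return std::nullopt;
    case Kind::kStruct:
      for (const Node& child : node.children) {
        if (auto kept = Project(child, ids)) {
          out.children.push_back(std::move(*kept));
        }
      }
      break;
    case Kind::kList:
      // Descend into the element rather than treating this as an error.
      if (auto element = Project(node.children[0], ids)) {
        out.children.push_back(std::move(*element));
      }
      break;
    case Kind::kMap:
      // Assumption: the key is kept whole whenever the value survives.
      if (auto value = Project(node.children[1], ids)) {
        out.children.push_back(node.children[0]);
        out.children.push_back(std::move(*value));
      }
      break;
  }
  if (out.children.empty()) {
    return std::nullopt;
  }
  return out;
}

// points(10): list<element(11): struct{x(12), y(13)}>
Node ExamplePoints() {
  Node element{11, "element", Kind::kStruct,
               {Node{12, "x", Kind::kPrimitive, {}},
                Node{13, "y", Kind::kPrimitive, {}}}};
  return Node{10, "points", Kind::kList, {element}};
}

// attrs(20): map<key(21), value(22): struct{a(23), b(24)}>
Node ExampleAttrs() {
  Node key{21, "key", Kind::kPrimitive, {}};
  Node value{22, "value", Kind::kStruct,
             {Node{23, "a", Kind::kPrimitive, {}},
              Node{24, "b", Kind::kPrimitive, {}}}};
  return Node{20, "attrs", Kind::kMap, {key, value}};
}
```

With this shape, lists and maps need only a few lines each, since they reuse the same recursion as structs. Selecting ID 12 yields a `points` list whose element struct contains only `x`, and selecting ID 24 yields an `attrs` map whose value struct contains only `b`.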
wgtmac
reviewed
Sep 7, 2025
wgtmac
requested changes
Sep 10, 2025
wgtmac
requested changes
Sep 11, 2025
wgtmac
requested changes
Sep 12, 2025
1dd1fda to
11126f9
Compare
wgtmac
reviewed
Sep 15, 2025
added 10 commits
September 23, 2025 10:30
- Added select and project methods to the Schema class for creating projection schemas based on specified field names or IDs. - Introduced PruneColumnVisitor to handle the logic for selecting and projecting fields, including support for nested structures.
… handling - Modified the PruneColumnVisitor class to pass results as shared pointers, improving memory management and clarity. - Updated Visit methods for ListType, MapType, and StructType to accommodate the new result handling approach.
…rror reporting - Updated the PruneColumnVisitor class to utilize shared pointers for type results, enhancing memory management. - Refined Visit methods for StructType, ListType, and MapType to improve clarity and error handling, particularly for cases involving invalid projections.
dbb3e92 to
7e7e2a5
Compare
gty404
approved these changes
Sep 23, 2025
HeartLinked
approved these changes
Sep 23, 2025
lishuxu
approved these changes
Sep 23, 2025
dongxiao1198
approved these changes
Sep 23, 2025
Xuanwo
approved these changes
Sep 23, 2025
Member
Xuanwo
left a comment
Thank you for working on this!