Conversation
6f0bee0 to
a70edb2
Compare
a70edb2 to
384e229
Compare
|
I plan to move the set/remove statistics methods from the Transaction class to another class, such as ManageSnapshot. In the meantime, I’d like to confirm with everyone if I’m heading in the right direction with the current implementation. |
kevinjqliu
left a comment
There was a problem hiding this comment.
Thanks for the PR! Added a few comments. I think it would also be helpful to include integration tests
9b15c86 to
d16ef47
Compare
|
@kevinjqliu could you please review it once more? |
kevinjqliu
left a comment
There was a problem hiding this comment.
Added a few comments.
Do you know which engine currently can generate puffin files? would be great to add an integration with a spark generated puffin file
@kevinjqliu As far as I know, only Trino can generate them. What kind of test would you like to have? I believe we are covering all relevant cases for this PR. If PyIceberg could generate or read puffin files, then I agree it would be useful to add tests to check compatibility between engines. However, I think it only makes sense to test puffin files during reading, as testing generation would mean verifying the implementation of something that isn’t our responsibility. In this case, it’s just a metadata update. What do you think? |
d16ef47 to
11120bf
Compare
kevinjqliu
left a comment
There was a problem hiding this comment.
LGTM! Thanks for working on this!
Regarding the integration tests, since we're manipulating table metadata to add/remove table stats, it would be great to verify that another source can interact with these stats. Not a hard blocker
|
@ndrluis do you mind resolving the merge conflict here? |
11120bf to
adfc971
Compare
|
@kevinjqliu Done! |
| if update.snapshot_id != update.statistics.snapshot_id: | ||
| raise ValueError("Snapshot id in statistics does not match the snapshot id in the update") |
There was a problem hiding this comment.
It's a bit of an awkward check, but something that we have to live with I guess.
2034836 to
ba64764
Compare
8fe9992 to
217bb95
Compare
Co-authored-by: Fokko Driesprong <fokko@apache.org>
Co-authored-by: Fokko Driesprong <fokko@apache.org>
Co-authored-by: Fokko Driesprong <fokko@apache.org>
217bb95 to
45c60cb
Compare
45c60cb to
fb9b2a2
Compare
The Java expire snapshot process expires table statistics and partition statistics. I am implementing a statistics table to make our expire snapshot compatible with the Java implementation.