fix(table): prevent index out of range error in buildManifestEvaluator#692
fix(table): prevent index out of range error in buildManifestEvaluator#692zeroshade merged 5 commits intoapache:mainfrom
Conversation
zeroshade
left a comment
There was a problem hiding this comment.
Would it be relatively easy to construct a contrived test that would have triggered the out of range error? Just to ensure we don't introduce this back later
0737398 to
50e72f9
Compare
|
@zeroshade I've added a short unit test. |
4f3e817 to
6fb1100
Compare
|
Also apache/iceberg#15108 (comment) suggests the partition specs should be tracked in a Line 92 in ad086d2 |
We are still pre-version 1. Thus I'm okay with a breaking change here. These sort of situations are precisely why we haven't hit version 1.0 and called it stable yet. Just make sure we're very clear in the commit notice about the breaking change (so the release notes make it clear) |
|
Alternately, we could store it as a map internally and just construct the slice in the function on the fly ( |
|
Just had a closer look into it and it seems the field Line 1210 in 3ce3b4b because it is defined as such in the Iceberg docs:
Changing its type in the Lines 1316 to 1324 in 3ce3b4b and Lines 1430 to 1439 in 3ce3b4b could use the map for better performance. If we'd also add the new internal lazy-loading method to the Metadata interface, users wouldn't be affected but all these methods could use the map instead: Lines 244 to 249 in 8ff2e91 Lines 251 to 256 in 8ff2e91 Lines 258 to 272 in 8ff2e91 iceberg-go/table/update_spec.go Lines 407 to 411 in 8ff2e91 Buuuut... I'm not sure if the minor performance boost justifies the additional lazy-loading complexity 😅 there's the risk of future changes to the code leading to diverging states between the partition-spec list and map. Should I give it a try or do we keep the lists and just prevent the out of range errors for now? |
1a14f1f to
b649746
Compare
|
@zeroshade the methods buildPartitionProjection and buildPartitionEvaluator had the same issue. Added a fix and unit tests for those two as well. |
zeroshade
left a comment
There was a problem hiding this comment.
Thanks for this, looks good!
Buuuut... I'm not sure if the minor performance boost justifies the additional lazy-loading complexity 😅 there's the risk of future changes to the code leading to diverging states between the partition-spec list and map. Should I give it a try or do we keep the lists and just prevent the out of range errors for now?
I think we should keep the lists and prevent the out of range errors for now, and we can look into switching to the proper map approach/lazy-loading/etc. as a separate PR.
I've got just one nitpick question for you, otherwise this is good to go!
…ion, buildManifestEvaluator, buildPartitionEvaluator fixes apache#691 Previously, these builders directly accessed the PartitionSpecs array by index without validating that the requested specID actually exists in the partition specs. This could cause an index out of bounds error. This change adds a proper lookup using slices.IndexFunc to find the partition spec by its ID and returns an ErrPartitionSpecNotFound error if the spec is not found, improving error handling and robustness. The lookup logic is implemented once as a helper method and is reused.
b649746 to
50b0260
Compare
Replace generic error message with ErrPartitionSpecNotFound sentinel error in GetSpecByID() for improved error handling and type safety.
Introduce PartitionSpecByID method for cleaner partition spec lookups, replacing ad-hoc filtering patterns throughout the codebase with a standardized interface method.
Move evaluator creation and error handling outside the closure to ensure errors are caught during builder construction time.
9799d2d to
de8797b
Compare
|
Thanks for reviewing @ferhatelmas, any further comments/issues on this? |
|
My pleasure. Overall looks good to me. #692 (comment) is the only concern but not a blocker to merge this PR I think |
|
I agree with your statement there, it'd be better to address all at once and for now let's keep the consistency |
fixes #691
Previously, the buildManifestEvaluator method directly accessed the PartitionSpecs array by index without validating that the requested specID actually exists in the partition specs. This could cause an index out of bounds error.
This change adds a proper lookup using slices.IndexFunc to find the partition spec by its ID and returns an error if the spec is not found, improving error handling and robustness.