Skip to content

Core: Use partition filtering for dangling DV detection in ManifestFilterManager#15671

Open
hemanthboyina wants to merge 3 commits intoapache:mainfrom
hemanthboyina:partition_filter_manifestmanager
Open

Core: Use partition filtering for dangling DV detection in ManifestFilterManager#15671
hemanthboyina wants to merge 3 commits intoapache:mainfrom
hemanthboyina:partition_filter_manifestmanager

Conversation

@hemanthboyina
Copy link
Copy Markdown
Contributor

In ManifestFilterManager#removeDanglingDeletesFor receives DataFiles which have partition info but only stores their paths as strings. This means ManifestFilterManager#canContainDroppedFiles returns true for every delete manifest since there's no partition info to filter against. The existing ManifestFilterManager#delete(F File) does this correctly, it stores both the file and a PartitionSet. then uses this information in ManifestFileUtil#canContainAny to skip manifests whose partition range doesn't overlap.

@github-actions github-actions Bot added the core label Mar 18, 2026
@hemanthboyina
Copy link
Copy Markdown
Contributor Author

I think the existing test cases should be good to validate this.
please review @anoopj , thanks

Copy link
Copy Markdown
Contributor

@anoopj anoopj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean implementation, thanks @hemanthboyina.

Just one optional suggestion: could we add a test that exercises the optimization? e.g., a partitioned table where data files are removed from one partition, verifying that delete manifests for non-overlapping partitions are correctly skipped. Existing tests validate correctness, but a targeted test would protect against future regressions in the filtering logic.

Otherwise LGTM.

@hemanthboyina
Copy link
Copy Markdown
Contributor Author

added the tests too @anoopj , please review

@hemanthboyina
Copy link
Copy Markdown
Contributor Author

@singhpk234 @nastra @amogh-jahagirdar can you please review

Copy link
Copy Markdown
Contributor

@anoopj anoopj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please get this reviewed by a committer.

Comment thread core/src/test/java/org/apache/iceberg/TestRewriteFiles.java Outdated
@hemanthboyina
Copy link
Copy Markdown
Contributor Author

can you please help in review this @amogh-jahagirdar @danielcweeks @manuzhang , thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants