@gabeiglio gabeiglio commented Feb 9, 2026

Rationale for this change

While running performance tests for partition overwrites, we noticed that PyIceberg took about twice as long as the Java-based implementation. We found that _existing_manifests does not take advantage of manifest pruning before reading all manifest entries.

In this PR I:

  • Moved methods from _DeleteFiles to the _SnapshotProducer parent class so they can be shared with other classes (_OverwriteFiles)
  • Implemented manifest pruning based on the partitions of all deleted files, so that manifests that cannot match those partitions are not read
  • Refactored the method to iterate over all files only once (instead of multiple times)
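
To illustrate the pruning idea, here is a minimal, self-contained sketch. It is not the code in this PR and does not use PyIceberg classes: `ManifestSummary`, `may_contain`, and `prune_manifests` are hypothetical names, and partition values are simplified to integers with inclusive lower/upper bounds per field. The point is only that a manifest whose partition bounds exclude every deleted file's partition can be carried over without opening it, and that the split happens in a single pass.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

# Hypothetical stand-ins for illustration only; these are NOT PyIceberg types.
Partition = Dict[str, int]


@dataclass
class ManifestSummary:
    path: str
    # Per partition field: inclusive (lower, upper) bounds of the values in the manifest.
    bounds: Dict[str, Tuple[int, int]]


def may_contain(manifest: ManifestSummary, partition: Partition) -> bool:
    """Return False only when the bounds prove the partition cannot be in the manifest."""
    for field, value in partition.items():
        if field not in manifest.bounds:
            continue  # no stats for this field, so the manifest cannot be ruled out
        lower, upper = manifest.bounds[field]
        if value < lower or value > upper:
            return False
    return True


def prune_manifests(
    manifests: Iterable[ManifestSummary],
    deleted_partitions: Iterable[Partition],
) -> Tuple[List[ManifestSummary], List[ManifestSummary]]:
    """Split manifests in one pass into (must be read, can be kept unread)."""
    partitions = list(deleted_partitions)
    to_read: List[ManifestSummary] = []
    untouched: List[ManifestSummary] = []
    for manifest in manifests:
        if any(may_contain(manifest, p) for p in partitions):
            to_read.append(manifest)
        else:
            untouched.append(manifest)
    return to_read, untouched


manifests = [
    ManifestSummary("m1.avro", {"day": (1, 10)}),
    ManifestSummary("m2.avro", {"day": (11, 20)}),
    ManifestSummary("m3.avro", {"day": (21, 30)}),
]
deleted = [{"day": 12}, {"day": 25}]

to_read, untouched = prune_manifests(manifests, deleted)
print([m.path for m in to_read])    # ['m2.avro', 'm3.avro']
print([m.path for m in untouched])  # ['m1.avro']
```

In this toy case only two of the three manifests would have their entries read; the third is kept as-is.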

Are these changes tested?

I believe the current tests in tests/integration/test_writes.py cover all cases.

Are there any user-facing changes?

Nope

@gabeiglio gabeiglio marked this pull request as ready for review February 9, 2026 12:47