refactor(parquet-datasource): split opener.rs into an opener/ module#22346
Merged
Conversation
Pure code motion, no behavior change and no public API change (`opener` was already a private module). Splits the ~2,700 LOC `opener.rs` into a directory module: - `opener/early_stop.rs` — `EarlyStoppingStream`, the dynamic-filter early-termination wrapper applied at the end of `build_stream`. - `opener/encryption.rs` — `EncryptionContext` and the `ParquetMorselizer::get_encryption_context` helpers, isolating the `#[cfg(feature = "parquet_encryption")]` gating that previously bled through the main file. `opener.rs` becomes `opener/mod.rs`. Split out of apache#22156, which originally also extracted an `opener/push_decoder_stream.rs`; that move is now obsolete since apache#22289 already extracted `PushDecoderStreamState` into `push_decoder.rs`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
Author
|
@xudong963 wonder if you could review this refactor? |
xudong963
approved these changes
May 19, 2026
| use std::task::{Context, Poll}; | ||
|
|
||
| use arrow::datatypes::{SchemaRef, TimeUnit}; | ||
| use datafusion_common::encryption::FileDecryptionProperties; |
Member
There was a problem hiding this comment.
After this refactor, with default features, this import becomes unused
Contributor
Author
There was a problem hiding this comment.
Good catch — both uses in this file are #[cfg(feature = "parquet_encryption")], so I've gated the import the same way in 5bb94e0. Verified both default-features and --features parquet_encryption build cleanly.
Both uses in opener/mod.rs are cfg-gated on the feature; with default features the import was unused after the module split. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
Relates to the discussion in #22024 about the Parquet datasource crate becoming hard to navigate. Split out of #22156, which bundled several code-motion moves into one PR — this is one of three smaller, independently-reviewable PRs that replace it.
Rationale for this change
opener.rshad grown to ~2,700 LOC, bundling several distinct responsibilities into one file. That makes it hard to read and hard to review changes in isolation. This PR is pure code motion: no behavior change and no public API change.What changes are included in this PR?
Splits
opener.rsinto anopener/directory module:opener/early_stop.rs—EarlyStoppingStream, the dynamic-filter early-termination wrapper applied at the end ofbuild_stream.opener/encryption.rs—EncryptionContextand theParquetMorselizer::get_encryption_contexthelpers, isolating the#[cfg(feature = "parquet_encryption")]gating that previously bled through the main file.opener.rsbecomesopener/mod.rs.Note: #22156 originally also extracted an
opener/push_decoder_stream.rs. That move is now obsolete — #22289 has since extractedPushDecoderStreamStateintopush_decoder.rs— so it is dropped here.Are these changes tested?
Yes, covered by existing tests.
cargo test -p datafusion-datasource-parquet --all-features(122 passing) andcargo clippy -p datafusion-datasource-parquet --all-targets --all-features -- -D warningsboth pass.Are there any user-facing changes?
No.
openerwas already a private module; this only reorganizes files inside the crate.🤖 Generated with Claude Code