[fix](filecache) pass tablet_id through FileReaderOptions instead of parsing from path#61683
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
TPC-H: Total hot run time: 26700 ms |
TPC-DS: Total hot run time: 169861 ms |
…parsing from path
CachedRemoteFileReader::_execute_remote_read previously parsed tablet_id
from file paths at runtime via extract_tablet_id(). This breaks when
enable_packed_file (small file merging) is enabled because packed file
paths don't follow the expected data/{tablet_id}/... format.
Fix: store tablet_id from FileReaderOptions at construction time and use
it directly, eliminating runtime path parsing. Propagate tablet_id through
all code paths: Segment, InvertedIndexFileReader, FSIndexInput::open,
DownloadFileMeta (warmup/preheating), and beta_rowset consistency checks.
b387dc0 to
7c91d8c
Compare
|
run buildall |
7c91d8c to
ca2ccb5
Compare
|
run buildall |
TPC-H: Total hot run time: 26994 ms |
TPC-DS: Total hot run time: 169654 ms |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
|
run buildall |
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
43d21c6 to
a6c6a36
Compare
|
run buildall |
TPC-H: Total hot run time: 26719 ms |
TPC-DS: Total hot run time: 169228 ms |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
|
run p0 |
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
|
run nonConcurrent |
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
|
PR approved by anyone and no changes requested. |
|
PR approved by at least one committer and no changes requested. |
airborne12
left a comment
There was a problem hiding this comment.
LGTM. Clean and thorough fix that eliminates fragile runtime path parsing for tablet_id.
Key strengths:
- Correctly replaces TOCTOU-style
extract_tablet_id()with construction-time data propagation throughFileReaderOptions - All 26 files covered — every
is_doris_table=truecallsite now setstablet_id - DCHECK safety net catches missed propagation paths in debug builds
- Test updates are comprehensive (14 CachedRemoteFileReader tests updated)
Minor suggestions (non-blocking):
- Consider removing default
tablet_id = -1from function signatures once all callsites are confirmed migrated, to get compile-time enforcement Segment::openredundantly overridestablet_idfrom the parameter even though all callers already set it in reader_options — defensive but could be simplified
CachedRemoteFileReader::_execute_remote_read previously parsed tablet_id from file paths at runtime via extract_tablet_id(). This breaks when enable_packed_file (small file merging) is enabled because packed file paths don't follow the expected data/{tablet_id}/... format.
Fix: store tablet_id from FileReaderOptions at construction time and use it directly, eliminating runtime path parsing. Propagate tablet_id through all code paths: Segment, InvertedIndexFileReader, FSIndexInput::open, DownloadFileMeta (warmup/preheating), and beta_rowset consistency checks.
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)