perf(metadata table) Metadata table archival should proceed until earliest commit to retain on data table#18215
perf(metadata table) Metadata table archival should proceed until earliest commit to retain on data table#18215kbuci wants to merge 3 commits intoapache:masterfrom
Conversation
|
hey @kbuci : but what does this buy us though. Its a minor optimization right. don't you think we can live w/ it and keep the code simple. |
| TimelineUtils.getEarliestInstantForMetadataArchival( | ||
| dataMetaClient.getActiveTimeline(), config.shouldArchiveBeyondSavepoint()); | ||
| dataMetaClient.getActiveTimeline(), config.shouldArchiveBeyondSavepoint(), | ||
| TimelineUtils.getEarliestRetainedCommitFromLastClean(dataMetaClient)); |
There was a problem hiding this comment.
The dataMetaClient.getActiveTimeline() is passed as a param here, we can inline the fetch of TimelineUtils.getEarliestRetainedCommitFromLastClean(dataMetaClient) inside the method.
| if (earliestCommit.get().compareTo(earliestInflight.get()) < 0) { | ||
| return earliestCommit; | ||
| if (shouldArchiveBeyondSavepoint) { | ||
| return findSmallestInstant(earliestInflight, earliestPossibleRestoreCommit); |
There was a problem hiding this comment.
in the original logic, when shouldArchiveBeyondSavepoint is true, #getFirstNonSavepointCommit is invoked to filter out the savepointed instant(because for data table these instants are skipped during archiving, there is case that these savepoints are very old, the skip is intentional).
In the new logic, earliestPossibleRestoreCommit does not skip the savepointed instants when the last clean retained instant is not there, should we still skip by checking dataTableActiveTimeline.findFirstSavepointedWrite() in this case.
There was a problem hiding this comment.
🤖 Good catch. If shouldArchiveBeyondSavepoint is enabled, we need to ensure that old savepoints don't inadvertently become a hard floor for the archival watermark, as that would defeat the purpose of the config and cause the timeline to grow.
| /** | ||
| * Returns the earliest commit to retain from the latest completed clean on the given table, if present. | ||
| */ | ||
| public static Option<String> getEarliestRetainedCommitFromLastClean(HoodieTableMetaClient metaClient) { |
There was a problem hiding this comment.
we can reuse the logic in SavepointActionExecutor#getLastCommitRetained.
| return earliestUncleanedDataTableInstant; | ||
| } | ||
| } | ||
| return dataTableActiveTimeline.getCommitsTimeline().firstInstant(); |
There was a problem hiding this comment.
when shouldArchiveBeyondSavepoint is true, should we skip the very old savepoints just like before here?
| // - First save-pointed write | ||
| // - Earliest inflight | ||
| // - Earliest commit that dataset can potentially be restored to | ||
| return findSmallestInstant(dataTableActiveTimeline.findFirstSavepointedWrite(), |
There was a problem hiding this comment.
if rolling back a savepoint is not a real case, maybe the comparison of dataTableActiveTimeline.findFirstSavepointedWrite() is not necessary.
yihua
left a comment
There was a problem hiding this comment.
🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.
Style & Readability Review — The code changes look good overall, with a minor suggestion for a test helper's object creation.
| return addClean(instantTime, Option.empty(), cleanerPlan, metadata, false, false); | ||
| } | ||
|
|
||
| public HoodieTestTable addIncrementalClean(String instantTime, String earliestCommitToRetain) throws IOException { |
There was a problem hiding this comment.
🤖 nit: For these complex objects like HoodieCleanerPlan and HoodieCleanStat with many parameters, consider using a builder pattern if they are created frequently with varying attributes. It can improve readability by explicitly naming the parameters.
yihua
left a comment
There was a problem hiding this comment.
🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.
Thanks for contributing! The logic to utilize ECTR for MDT archival is a great optimization to keep the MDT timeline lean and performant. The implementation is clean, but I left one inline comment regarding a potential edge case with missing write commits for savepoints.
| } | ||
| HoodieInstant firstSavePoint = firstSavePointInstantOption.get(); | ||
| return getWriteTimeline() | ||
| .filter(instant -> instant.requestedTime().equals(firstSavePoint.requestedTime())) |
There was a problem hiding this comment.
🤖 If the write commit corresponding to the savepoint is missing (e.g., due to a corrupted timeline or manual deletion), this will return Option.empty(). Consequently, TimelineUtils.getEarliestInstantForMetadataArchival will silently ignore the savepoint and might incorrectly allow archiving past it. Consider returning the savepoint instant itself directly (or as a fallback).
| if (earliestCommit.get().compareTo(earliestInflight.get()) < 0) { | ||
| return earliestCommit; | ||
| if (shouldArchiveBeyondSavepoint) { | ||
| return findSmallestInstant(earliestInflight, earliestPossibleRestoreCommit); |
There was a problem hiding this comment.
🤖 Good catch. If shouldArchiveBeyondSavepoint is enabled, we need to ensure that old savepoints don't inadvertently become a hard floor for the archival watermark, as that would defeat the purpose of the config and cause the timeline to grow.
Summary and Changelog
Summary: Metadata table (MDT) archival should be able to archive instants that are before the earliest commit to retain (in latest clean) on the data table, provided that there are no savepoints. This is since
Changelog:
getEarliestInstantForMetadataArchivalnow takes a third parameter:Option<String> earliestUncleanedDataTableInstantTimeOption(ECTR from the latest clean).getEarliestRetainedCommitFromLastClean(HoodieTableMetaClient)to read ECTR from the latest completed clean; throwsHoodieIOExceptionon read failure.findSmallestInstant(Option<HoodieInstant>, Option<HoodieInstant>)andgetEarliestPossibleRestoreCommit(HoodieActiveTimeline, Option<String>).findFirstSavepointedWrite(): returns the first write commit that has been savepointed (used when blocking MDT archival on savepoints).TimelineUtils.getEarliestRetainedCommitFromLastClean(dataMetaClient)is supplied as the third argument togetEarliestInstantForMetadataArchival.getEarliestInstantForMetadataArchivalcall sites updated to the 3-arg signature; added cases for ECTR present, ECTR not on timeline, and savepoints + ECTR (earlier than savepoints and between savepoints).addIncrementalClean(String instantTime, String earliestCommitToRetain)for tests that need a clean with a given ECTR.testArchivalInMetadataTableCanProceedUntilECTRto assert MDT archival proceeds up to the data table ECTR.Impact
Risk Level
Low. It should be safe if instants on the data table before the ECTR don't have a corresponding deltacommit on MDT. This is since those instants must already have been committed, and cannot anyway be rolled back.
Documentation Update
None.
Contributor's checklist