Add support for handling scenarios where end time is invalid during RetentionManager run#18148
Conversation
…etentionManager run
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #18148 +/- ##
============================================
+ Coverage 63.44% 63.47% +0.03%
Complexity 1627 1627
============================================
Files 3244 3244
Lines 197250 197320 +70
Branches 30514 30531 +17
============================================
+ Hits 125136 125254 +118
+ Misses 62082 62035 -47
+ Partials 10032 10031 -1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
xiangfu0
left a comment
There was a problem hiding this comment.
Found a few high-signal issues; see inline comments.
| boolean oldValue = _useCreationTimeFallbackForRetention; | ||
|
|
||
| // Validate that the value is a proper boolean string | ||
| if (!"true".equalsIgnoreCase(newValue) && !"false".equalsIgnoreCase(newValue)) { |
There was a problem hiding this comment.
When this cluster config key is deleted, changedConfigs will still contain it but clusterConfigs.get(...) will be null (DefaultClusterConfigChangeHandler explicitly reports deleted keys that way). This branch treats null as invalid and keeps the old value, so removing the override never reverts to the default false until restart. Because this flag gates destructive retention deletion, the current leader can keep purging segments after an operator thinks they disabled the feature. Please handle null explicitly and reset _useCreationTimeFallbackForRetention to the default.
There was a problem hiding this comment.
Fixed. Made changes for other config for this class as well.
Thanks for pointing this out.
There was a problem hiding this comment.
^^ @xiangfu0, is this okay. If so, I will merge.
| // Validate that the value is a proper boolean string | ||
| if (!"true".equalsIgnoreCase(newValue) && !"false".equalsIgnoreCase(newValue)) { |
There was a problem hiding this comment.
nit: lets extract this into a util method to be used where this is duplicated here.
…issing_start_end_time
Summary
segmentZKMetadata.getCreationTime()instead, so segments with missing/invalid end times can still be cleaned up.controller.retentionManager.enableCreationTimeFallback(default false) — no behavior change unless explicitly opted in.Test plan
TimeRetentionStrategyTest#testCreationTimeFallback— unit tests covering: fallback disabled (existing behavior preserved), fallback enabled with valid/recent/invalid/zero creation time, valid end time takes priority over fallbackRetentionManagerTest#testCreationTimeFallbackOnChange— verifies dynamic config toggle viaonChange()RetentionManagerTest#testRetentionWithInvalidEndTimeAndCreationTimeFallback— end-to-end: segment with invalid end time is deleted when fallback is enabled and creation time exceeds retention