[fix](azure) Fix incorrect modification timestamps in AzureObjStorage#61790
[fix](azure) Fix incorrect modification timestamps in AzureObjStorage#61790dataroaring wants to merge 1 commit intoapache:masterfrom
Conversation
The modification time for Azure blobs was computed incorrectly in three places: - globList LIST path (line 428): used getLastModified().getSecond() which returns the second-of-minute (0-59), not an epoch timestamp. - globListByGetProperties HEAD path (line 503): used toEpochSecond() which returns epoch seconds, off by 1000x from S3's toEpochMilli(). - listFiles path (line 564): same getSecond() bug as the glob path. All three are now fixed to use .toInstant().toEpochMilli(), consistent with S3ObjStorage which uses Instant.toEpochMilli(). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
There was a problem hiding this comment.
Pull request overview
Fixes incorrect RemoteFile.modificationTime values produced by the Azure object storage backend by normalizing Azure “last modified” timestamps to epoch milliseconds, aligning behavior with S3 and other file systems.
Changes:
- Fix
globListLIST-path timestamp bug (previously usedgetSecond()which is second-of-minute). - Fix deterministic-path
globListByGetProperties(HEAD/getProperties) timestamp unit from epoch seconds to epoch milliseconds. - Fix
listFilestimestamp bug (samegetSecond()issue).
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| isPrefix ? 0 : blobItem.getProperties().getLastModified() | ||
| .toInstant().toEpochMilli()); |
There was a problem hiding this comment.
Add/extend unit tests (e.g., AzureObjStorageTest’s mocked paged listing) to assert RemoteFile.modificationTime is an epoch-millisecond value derived from BlobItemProperties.getLastModified().toInstant().toEpochMilli(), so regressions back to second-of-minute/epoch-seconds are caught.
| props.getLastModified() != null | ||
| ? props.getLastModified().toEpochSecond() : 0 | ||
| ? props.getLastModified().toInstant().toEpochMilli() : 0 | ||
| ); |
There was a problem hiding this comment.
Add test coverage for the deterministic-path getProperties/HEAD flow to validate the modification time unit is epoch milliseconds (not epoch seconds). Using a fixed OffsetDateTime in mocks will make the assertion deterministic.
| props.getContentLength(), | ||
| props.getContentLength(), | ||
| props.getLastModified().getSecond(), | ||
| props.getLastModified().toInstant().toEpochMilli(), | ||
| null); |
There was a problem hiding this comment.
Consider adding a unit test around listFiles() (or a small isolated helper) to assert BlobItemProperties.getLastModified is converted to epoch milliseconds, since this path previously returned an incorrect value and is easy to regress.
|
run buildall |
TPC-H: Total hot run time: 28215 ms |
TPC-DS: Total hot run time: 168707 ms |
FE UT Coverage ReportIncrement line coverage |
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
Summary
Fix incorrect modification timestamps in
AzureObjStorageacross three code paths:globListLIST path: UsedgetLastModified().getSecond()which returns the second-of-minute (0–59), not an epoch timestampglobListByGetPropertiesHEAD path: UsedtoEpochSecond()which returns epoch seconds — off by 1000× from S3'stoEpochMilli()listFilespath: SamegetSecond()bug as the glob pathAll three are fixed to use
.toInstant().toEpochMilli(), consistent withS3ObjStoragewhich usesInstant.toEpochMilli().Found via review of #61775 (cherry-pick of #60414).
Test plan
RemoteFile.modifiedTime🤖 Generated with Claude Code