[improve][broker] Improve ManagedLedger search position by offset #25099
gaozhangmin wants to merge 1 commit into apache:master
Conversation
@gaozhangmin Please add the following content to your PR description and select a checkbox:
This feature should be completed in the downstream project. I added PIP-404 to introduce per-ledger properties, so the downstream project could add the first entry index to the ledger info.
No, the firstEntryIndex is recorded when a new ledger is created.
This feature is mostly for KoP. Since KoP is a protocol plugin, I don't think adding the logic to Pulsar core is a good idea.
I see your point, but since asyncFindPosition exists in Pulsar, perhaps we should focus on how to optimize it. |
Motivation
The current asyncFindPosition method has performance issues: it performs a binary search by offset that reads entries from the beginning to find the position. This implementation causes several problems when dealing with large amounts of data.
These problems become particularly severe in Pulsar topics with large amounts of historical data, significantly impacting consumer startup speed and overall performance.
Modifications
This PR optimizes the asyncFindPosition method through the following approaches:
- Record index information during ledger creation: ManagedLedgerInterceptor records the index into LedgerInfo as firstEntryIndex.
- Optimize search logic: use firstEntryIndex in asyncFindPosition.
- Reduce unnecessary data reads.
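The idea of using a per-ledger firstEntryIndex to narrow the search can be illustrated with a minimal sketch. This is not the code from this PR and does not use Pulsar's classes: LedgerMeta, findLedger, and the UNKNOWN marker are hypothetical names chosen for illustration, and the sketch assumes that firstEntryIndex never decreases from one ledger to the next. It picks the last ledger whose firstEntryIndex is at or below the target index, using metadata only, and reports "no answer" when it meets a ledger without a recorded index so that a caller could fall back to the existing entry-reading search.

```java
import java.util.List;
import java.util.OptionalLong;

class FirstEntryIndexLookup {
    // Hypothetical marker for ledgers created before firstEntryIndex was recorded.
    static final long UNKNOWN = -1L;

    // Hypothetical stand-in for per-ledger metadata; not Pulsar's LedgerInfo.
    static final class LedgerMeta {
        final long ledgerId;
        final long firstEntryIndex;

        LedgerMeta(long ledgerId, long firstEntryIndex) {
            this.ledgerId = ledgerId;
            this.firstEntryIndex = firstEntryIndex;
        }
    }

    // Binary search over ledger metadata only; no entries are read.
    // Returns the id of the last ledger whose firstEntryIndex <= targetIndex,
    // or empty if the target precedes every ledger or a probed ledger has no
    // recorded index (the caller would then use the slower search).
    static OptionalLong findLedger(List<LedgerMeta> ledgers, long targetIndex) {
        int lo = 0;
        int hi = ledgers.size() - 1;
        OptionalLong candidate = OptionalLong.empty();
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            LedgerMeta meta = ledgers.get(mid);
            if (meta.firstEntryIndex == UNKNOWN) {
                return OptionalLong.empty();
            }
            if (meta.firstEntryIndex <= targetIndex) {
                candidate = OptionalLong.of(meta.ledgerId);
                lo = mid + 1; // a later ledger may still start at or before the target
            } else {
                hi = mid - 1;
            }
        }
        return candidate;
    }

    public static void main(String[] args) {
        List<LedgerMeta> ledgers = List.of(
                new LedgerMeta(10L, 0L),
                new LedgerMeta(11L, 100L),
                new LedgerMeta(12L, 250L));
        System.out.println(findLedger(ledgers, 120L)); // prints OptionalLong[11]
    }
}
```

Once the ledger is known, the search for the exact entry only needs to read entries inside that one ledger, which is where the reduction in reads would come from under these assumptions.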
Verifying this change
This change added tests and can be verified as follows:
ManagedLedgerInterceptorImplTest, including:
- testSetFirstEntryIndex: Verifies proper setting of firstEntryIndex during ledger creation
- testFindPositionByOffsetWithMissingFirstEntryIndex: Tests backward compatibility when firstEntryIndex is not available
- testFindPositionByOffset: Tests optimized position finding with various offset scenarios
Does this pull request potentially affect one of the following parts:
If the box was checked, please highlight the changes
Adds the firstEntryIndex field to the LedgerInfo metadata structure
Documentation
- doc
- doc-required
- doc-not-needed
- doc-complete
Matching PR in forked repository
gaozhangmin#13