[Cosmos] Port hub region caching per partition level (#48788) #48789
Draft
jeet1995 wants to merge 1 commit into Azure:main from
Port hub region caching from .NET SDK (PR Azure#5648) to Java SDK.

Feature summary:
- After 2 consecutive 404/1002 (ReadSessionNotAvailable) on single-master accounts, SDK sets x-ms-cosmos-hub-region-processing-only header
- Non-hub regions return 403/3 (WriteForbidden); SDK retries to next region
- Hub region responds with 200 OK; SDK caches hub URI for that partition
- Future requests route directly to cached hub (warm path)
- Works for both PPAF and non-PPAF accounts

Implementation details:
- Feature flag: COSMOS.HUB_REGION_PROCESSING_ENABLED (default: false)
- New class: GlobalPartitionEndpointManagerForHubRegionRouting
- Per-partition ConcurrentHashMap cache for hub region URIs
- Warm/cold path routing, cache invalidation, thread-safe
- ClientRetryPolicy: 403/3 handling on read path for hub discovery
- ClientRetryPolicy: Hub header gated behind feature flag
- ClientRetryPolicy.onBeforeSendRequest: Warm path cache check
- RxDocumentClientImpl: Cache hub on successful response
- 13 unit tests covering all cache operations and eligibility

Files changed:
- Configs.java: Add feature flag constants and getter
- ClientRetryPolicy.java: Hub header gating, 403/3 read path, warm path
- RetryPolicy.java: Wire hub region manager
- RxDocumentClientImpl.java: Instantiate manager, cache on success
- GlobalPartitionEndpointManagerForHubRegionRouting.java (new)
- 5 test files updated for new constructor parameter

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
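The discovery steps in the feature summary can be read as a small decision table. The sketch below is illustrative only and is not the code in ClientRetryPolicy: the class `HubDiscoverySketch`, the `Action` enum, and the `decide` signature are all hypothetical names, and it assumes that the consecutive 404/1002 count includes the current response and that a 200 OK is cached only when the hub header was sent.

```java
/** Hypothetical sketch of the hub-discovery decisions; not the SDK's ClientRetryPolicy. */
final class HubDiscoverySketch {

    /** Illustrative outcomes; these names do not come from the SDK. */
    enum Action { DEFAULT_HANDLING, RETRY_WITH_HUB_HEADER, RETRY_NEXT_REGION, CACHE_HUB }

    /** The commit message states the header is set after 2 consecutive 404/1002 responses. */
    static final int CONSECUTIVE_1002_THRESHOLD = 2;

    /**
     * @param consecutive1002 number of consecutive 404/1002 responses, including this one (assumption)
     * @param hubHeaderSent   whether x-ms-cosmos-hub-region-processing-only was on the request
     * @param featureEnabled  value of the feature flag
     */
    static Action decide(int status, int subStatus, int consecutive1002,
                         boolean hubHeaderSent, boolean featureEnabled) {
        if (!featureEnabled) {
            return Action.DEFAULT_HANDLING; // flag off: existing behavior is untouched
        }
        if (status == 404 && subStatus == 1002) {
            // ReadSessionNotAvailable: switch to hub discovery once the threshold is reached
            return consecutive1002 >= CONSECUTIVE_1002_THRESHOLD
                    ? Action.RETRY_WITH_HUB_HEADER
                    : Action.DEFAULT_HANDLING;
        }
        if (status == 403 && subStatus == 3 && hubHeaderSent) {
            // WriteForbidden from a non-hub region during discovery: try the next region
            return Action.RETRY_NEXT_REGION;
        }
        if (status == 200 && hubHeaderSent) {
            // The hub answered: remember its URI for this partition
            return Action.CACHE_HUB;
        }
        return Action.DEFAULT_HANDLING;
    }
}
```

Keeping the decision a pure function of the response and a few flags makes each branch easy to unit test in isolation; the real policy also has to track per-request state, which this sketch leaves out.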
Issue
Fixes #48788
Description
Ports the per-partition hub region caching feature from the .NET SDK (PR Azure/azure-cosmos-dotnet-v3#5648) to the Java SDK.
Problem
On single-master accounts, repeated 404/1002 (ReadSessionNotAvailable) errors cause unnecessary retries without discovering the hub region. No caching exists, so every request repeats the full discovery chain.
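To make the cost concrete, here is a minimal sketch of a per-partition cache and the difference between the cold and warm paths. It is not the GlobalPartitionEndpointManagerForHubRegionRouting class from this PR: `HubRegionCacheSketch`, its method names, the use of a partition key range id string as the key, and the simulated `route` loop are all assumptions made for illustration.

```java
import java.net.URI;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical per-partition hub cache; names and shape are illustrative only. */
final class HubRegionCacheSketch {

    // Keyed by a partition identifier (assumed here to be a partition key range id).
    private final ConcurrentMap<String, URI> hubByPartition = new ConcurrentHashMap<>();

    Optional<URI> cachedHub(String partitionId) {
        return Optional.ofNullable(hubByPartition.get(partitionId));
    }

    void cacheHub(String partitionId, URI hub) {
        hubByPartition.put(partitionId, hub);
    }

    void invalidate(String partitionId) {
        hubByPartition.remove(partitionId);
    }

    /**
     * Simulates routing one request and returns how many regions were contacted.
     * actualHub stands in for "the region that answers 200 OK"; every other
     * region is treated as answering 403/3.
     */
    int route(String partitionId, List<URI> regionsInPreferenceOrder, URI actualHub) {
        if (cachedHub(partitionId).isPresent()) {
            return 1; // warm path: go straight to the cached hub
        }
        int contacted = 0;
        for (URI region : regionsInPreferenceOrder) { // cold path: walk the regions
            contacted++;
            if (region.equals(actualHub)) {
                cacheHub(partitionId, region);
                break;
            }
        }
        return contacted;
    }
}
```

With three regions and the hub last in preference order, the first request for a partition contacts three regions and later requests for the same partition contact one; other partitions still pay the cold-path cost once because the cache is per partition.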
Solution
Feature Flag
Gated behind COSMOS.HUB_REGION_PROCESSING_ENABLED (env var: COSMOS_HUB_REGION_PROCESSING_ENABLED). Disabled by default per Debdatta Kunda's guidance.
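A minimal sketch of how such a flag could be resolved, assuming the dotted name is a JVM system property and that it takes precedence over the environment variable. The actual getter lives in Configs.java and may differ; `HubRegionFlagSketch`, `resolve`, and `isEnabled` are hypothetical names.

```java
/** Hypothetical flag lookup; the real getter in Configs.java may behave differently. */
final class HubRegionFlagSketch {

    static final String PROPERTY_NAME = "COSMOS.HUB_REGION_PROCESSING_ENABLED";
    static final String ENV_VAR_NAME = "COSMOS_HUB_REGION_PROCESSING_ENABLED";

    /**
     * Pure resolution step, separated out so it can be tested without touching
     * process state. The system property wins over the environment variable
     * (assumption); anything other than "true" (case-insensitive) is false.
     */
    static boolean resolve(String propertyValue, String envValue) {
        String raw = propertyValue != null ? propertyValue : envValue;
        return raw != null && Boolean.parseBoolean(raw.trim());
    }

    /** Reads the live values; false when neither is set, matching the stated default. */
    static boolean isEnabled() {
        return resolve(System.getProperty(PROPERTY_NAME), System.getenv(ENV_VAR_NAME));
    }
}
```

Under these assumptions the feature could be switched on for a single run with `-DCOSMOS.HUB_REGION_PROCESSING_ENABLED=true` on the JVM command line, or by exporting the environment variable before starting the process.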
Key Changes
Testing