Skip to content

Fixes UnsupportedOperationException when using readManyByPartitionKey for empty pages#49311

Open
FabianMeiswinkel wants to merge 4 commits into
Azure:mainfrom
FabianMeiswinkel:users/fabianm/ReadManyByPKFix
Open

Fixes UnsupportedOperationException when using readManyByPartitionKey for empty pages#49311
FabianMeiswinkel wants to merge 4 commits into
Azure:mainfrom
FabianMeiswinkel:users/fabianm/ReadManyByPKFix

Conversation

@FabianMeiswinkel
Copy link
Copy Markdown
Member

Description

The driver stack trace when it fails is the following
26/05/29 08:30:24 WARN TaskSetManager: Lost task 3.0 in stage 42.0 (TID 108773) (10.181.218.61 executor 44): java.lang.UnsupportedOperationException
      at java.util.Collections$UnmodifiableMap.put(Collections.java:1459)
      at azure_cosmos_spark.com.azure.cosmos.models.FeedResponse.setContinuationTokenInternal(FeedResponse.java:453)
      at azure_cosmos_spark.com.azure.cosmos.models.FeedResponse.setContinuationToken(FeedResponse.java:448)
      at azure_cosmos_spark.com.azure.cosmos.models.ModelBridgeInternal.setFeedResponseContinuationToken(ModelBridgeInternal.java:605)
      at azure_cosmos_spark.com.azure.cosmos.implementation.RxDocumentClientImpl.lambda$buildSequentialBatchFlux$132(RxDocumentClientImpl.java:4940)
      at azure_cosmos_spark.reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:106)
      at azure_cosmos_spark.reactor.core.publisher.FluxFlatMap$FlatMapMain.tryEmit(FluxFlatMap.java:547)
      at azure_cosmos_spark.reactor.core.publisher.FluxFlatMap$FlatMapInner.onNext(FluxFlatMap.java:988)
      at azure_cosmos_spark.reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onNext(FluxOnErrorResume.java:79)

This can happen when for a single physical partition the requested tuples don't have any matches - the empty page returned would have immutable headers.

All SDK Contribution checklist:

  • The pull request does not introduce [breaking changes]
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which have an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

Copilot AI review requested due to automatic review settings May 29, 2026 14:41
@FabianMeiswinkel FabianMeiswinkel requested review from a team and kirankumarkolli as code owners May 29, 2026 14:41
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a Cosmos DB readManyByPartitionKey(s) failure where empty pages with immutable response headers could throw UnsupportedOperationException when stamping continuation tokens.

Changes:

  • Detects known immutable header map shapes and defensively copies before mutating continuation headers.
  • Adds unit tests for immutable-map detection and the customer-reported continuation-token stamping path.
  • Preserves the no-op fast path for empty headers when no continuation header needs to be changed.
Show a summary per file
File Description
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/models/FeedResponse.java Copies known immutable header maps before setting/removing continuation tokens.
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/Utils.java Adds isImmutableMap helper for known immutable map implementations used by Cosmos internals.
sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/implementation/UtilsImmutableMapTests.java Adds unit coverage for immutable-map detection.
sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/implementation/ReadManyByPartitionKeyContinuationTokenTest.java Adds regression coverage for immutable non-empty headers in continuation-token stamping.

Copilot's findings

  • Files reviewed: 4/4 changed files
  • Comments generated: 1

@FabianMeiswinkel
Copy link
Copy Markdown
Member Author

/azp run java - cosmos - spark

@FabianMeiswinkel
Copy link
Copy Markdown
Member Author

/azp run java - cosmos - tests

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

1 similar comment
@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

}

if (Utils.isImmutableMap(this.header)) {
this.header = new HashMap<>(this.header);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getResponseHeaders() contract changed — silently.? It's a corner concern but the contract fo Before the fix, callers got whatever was passed in (often Collections.unmodifiableMap, i.e. read-only). After a swap, they get a mutable HashMap. External callers could now response.getResponseHeaders().put(...) and coulde pollute the FeedResponse.?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is fine - in vast majority of cases it is mutable today anyway - which is a problem but one overlooked when intriducing the public API at V4 day 0. And we can't relaly breka it anymore.

I am mroe concerned with this not being final anymore - that could cause some interesting edge cases - let me look at that mode.

private void setContinuationTokenInternal(String headerName, String continuationToken) {
if (!Strings.isNullOrWhiteSpace(continuationToken)) {
boolean setting = !Strings.isNullOrWhiteSpace(continuationToken);
boolean clearing = !setting && !this.header.isEmpty() && this.header.containsKey(headerName);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this.header.containsKey(headerName) already checks for this.header.isEmpty()

Is this a deliberate choice?

Copy link
Copy Markdown
Member

@ananth7592 ananth7592 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, Except for 2 minor comments

Copy link
Copy Markdown
Member

@tvaron3 tvaron3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@xinlian12
Copy link
Copy Markdown
Member

/azp run java - cosmos - spark

@xinlian12
Copy link
Copy Markdown
Member

/azp run java - cosmos - tests

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

1 similar comment
@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@xinlian12
Copy link
Copy Markdown
Member

@sdkReviewAgent

return;
}

if (Utils.isImmutableMap(this.header)) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Recommendation — Design: Stale contract comment contradicts this fix

The comment at lines 59–69 (just above, not in the diff) explicitly states:

"If a future code path introduces an immutable non-null header map, the setContinuationTokenInternal method will fail fast with UnsupportedOperationException, and the fix should be to make the upstream pipeline emit a mutable map rather than adding defensive cloning here."

This PR does exactly what that comment says not to do — and doesn't update the comment. The comment also asserts "Non-null maps are always mutable", which is factually incorrect given that five headerResponse() methods produce immutable maps via Utils.immutableMapOf(...).

Why this matters: A future maintainer reading the comment will believe immutable maps should cause a fail-fast crash and that the fix belongs upstream. If they add a new immutable producer that isImmutableMap() doesn't recognize (e.g., Collections.singletonMap() or Map.of()), they'll get the original UnsupportedOperationException and be confused about which strategy to follow.

Suggestion — consider the simpler upstream fix the comment prescribes: The only confirmed crash source is ParallelDocumentQueryExecutionContext.headerResponse() (line 285). Changing it to return a mutable HashMap is a ~3-line change:

private static Map<String, String> headerResponse(double requestCharge) {
    Map<String, String> map = new HashMap<>();
    map.put(HttpConstants.HttpHeaders.REQUEST_CHARGE, String.valueOf(requestCharge));
    return map;
}

This follows the pattern already used by addCompositeContinuationToken() at line 267 (new HashMap<>(page.getResponseHeaders())), keeps header as final (addressing your own concern about thread safety), and avoids the need for the new isImmutableMap() infrastructure entirely.

At minimum, if the downstream-defense approach ships, please update the comment at lines 59–69 to reflect the new strategy — otherwise it's a maintenance trap.

⚠️ AI-generated review — may be incorrect. Agree? → resolve the conversation. Disagree? → reply with your reasoning.

@xinlian12
Copy link
Copy Markdown
Member

Review complete (54:17)

Posted 1 inline comment(s).

Steps: ✓ context, correctness, cross-sdk, design, history, past-prs, synthesis, test-coverage

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants