Skip to content

Revert "fix(auth): Address ClientSideCredentialAccessBoundary RefreshTask race condition"#12770

Closed
lqiu96 wants to merge 1 commit intomainfrom
revert-12681-fix-cab-race-condition
Closed

Revert "fix(auth): Address ClientSideCredentialAccessBoundary RefreshTask race condition"#12770
lqiu96 wants to merge 1 commit intomainfrom
revert-12681-fix-cab-race-condition

Conversation

@lqiu96
Copy link
Copy Markdown
Member

@lqiu96 lqiu96 commented Apr 13, 2026

Reverts #12681

@lqiu96 lqiu96 requested a review from vverman April 13, 2026 16:22
@lqiu96 lqiu96 requested review from a team as code owners April 13, 2026 16:22
@lqiu96 lqiu96 enabled auto-merge (squash) April 13, 2026 16:22
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request modifies the task synchronization logic in ClientSideCredentialAccessBoundaryFactory and removes a concurrency regression test. The review feedback indicates that reverting to currentRefreshTask.task.get() and splitting the completion logic into separate listeners re-introduces race conditions, as there is no guarantee that the factory state updates before waiters are unblocked. Furthermore, a potential NullPointerException was identified in the new exception handling code, and the removal of the regression test is discouraged.

try {
// Wait for the refresh task to complete.
currentRefreshTask.get();
currentRefreshTask.task.get();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Reverting this to wait on currentRefreshTask.task.get() instead of currentRefreshTask.get() re-introduces a race condition. The internal task completes as soon as the network request is done, but before the factory state has been updated via finishRefreshTask. This allows concurrent calls to proceed with potentially stale or incomplete state.

Suggested change
currentRefreshTask.task.get();
currentRefreshTask.get();

Comment on lines +498 to 525
// Add listener to update factory's credentials when the task completes.
task.addListener(
() -> {
try {
finishRefreshTask(task);
RefreshTask.this.set(Futures.getDone(task));
} catch (ExecutionException e) {
Throwable cause = e.getCause();
RefreshTask.this.setException(cause != null ? cause : e);
} catch (Throwable t) {
RefreshTask.this.setException(t);
RefreshTask.this.setException(cause);
}
},
MoreExecutors.directExecutor());

// Add callback to set the result or exception based on the outcome.
Futures.addCallback(
task,
new FutureCallback<IntermediateCredentials>() {
@Override
public void onSuccess(IntermediateCredentials result) {
RefreshTask.this.set(result);
}

@Override
public void onFailure(@Nullable Throwable t) {
RefreshTask.this.setException(
t != null ? t : new IOException("Refresh failed with null Throwable."));
}
},
MoreExecutors.directExecutor());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

This implementation re-introduces a race condition by using two separate listeners. There is no guarantee that finishRefreshTask (which updates the factory state) will execute before RefreshTask.this.set(result) (which unblocks waiters). Additionally, line 505 is not null-safe and will throw a NullPointerException if cause is null. It is highly recommended to use a single listener to ensure correct sequencing.

      // Single listener to guarantee that finishRefreshTask updates the internal state BEFORE
      // the outer future completes and unblocks waiters.
      task.addListener(
          () -> {
            try {
              finishRefreshTask(task);
              RefreshTask.this.set(Futures.getDone(task));
            } catch (ExecutionException e) {
              Throwable cause = e.getCause();
              RefreshTask.this.setException(cause != null ? cause : e);
            } catch (Throwable t) {
              RefreshTask.this.setException(t);
            }
          },
          MoreExecutors.directExecutor());

"Expected zero NullPointerExceptions due to the race condition, but some were thrown.");
}
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The removal of the generateToken_freshInstance_concurrent_noNpe regression test is discouraged. This test was specifically designed to catch the race condition that is being re-introduced by this revert. Keeping the test ensures that the concurrency issue is properly tracked and eventually fixed without regression.

@lqiu96
Copy link
Copy Markdown
Member Author

lqiu96 commented Apr 13, 2026

Not proceed with the revert. Original PR is fine.

@lqiu96 lqiu96 closed this Apr 13, 2026
auto-merge was automatically disabled April 13, 2026 20:40

Pull request was closed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant