Fix task state leak due to interrupt race #19102

Closed
jtuglu1 wants to merge 1 commit into apache:master from jtuglu1:fix-interrupted-race

jtuglu1 commented Mar 6, 2026

Description

We saw this issue in our cluster: a task queue callback thread was interrupted mid-persist. Because our DB connector propagates interruptions, the callback failed to persist the task status, leaving what's in memory (FAILED) inconsistent with what's in the DB (RUNNING). Since the in-memory task is complete, it is cleaned up. On the next sync, the stale RUNNING entry is pulled back in from the DB, which keeps the task in the active tasks list as WAITING.
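The sequence above can be reproduced in a minimal sketch. This is not the Druid code and does not show the fix in this PR; `InterruptRaceSketch`, `persist`, `onTaskComplete`, `syncFromDb`, and the two maps are hypothetical stand-ins for the DB connector, the task queue callback, the sync, and the DB and in-memory state. It assumes the callback cleans up in-memory state even when the persist fails, and it interrupts the callback thread up front so the outcome is deterministic.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class InterruptRaceSketch {
    enum Status { RUNNING, FAILED }

    // Hypothetical stand-ins for the metadata DB and the in-memory active task list.
    static final Map<String, Status> db = new ConcurrentHashMap<>();
    static final Map<String, Status> activeTasks = new ConcurrentHashMap<>();

    // Hypothetical connector call that propagates interruptions instead of swallowing them.
    static void persist(String taskId, Status status) throws InterruptedException {
        if (Thread.interrupted()) {
            throw new InterruptedException("interrupted mid-persist");
        }
        db.put(taskId, status);
    }

    // Completion callback: the persist can fail, but in-memory cleanup still happens.
    static void onTaskComplete(String taskId, Status status) {
        try {
            persist(taskId, status);
        } catch (InterruptedException e) {
            // Restore the flag; the DB row is left at its old value.
            Thread.currentThread().interrupt();
        }
        activeTasks.remove(taskId);
    }

    // Periodic sync: anything the DB still reports as RUNNING is treated as active.
    static void syncFromDb() {
        db.forEach((id, status) -> {
            if (status == Status.RUNNING) {
                activeTasks.putIfAbsent(id, status);
            }
        });
    }

    static String runScenario() {
        db.clear();
        activeTasks.clear();
        db.put("task-1", Status.RUNNING);
        activeTasks.put("task-1", Status.RUNNING);

        Thread callback = new Thread(() -> {
            // Simulate the interrupt arriving before the persist completes.
            Thread.currentThread().interrupt();
            onTaskComplete("task-1", Status.FAILED);
        });
        callback.start();
        try {
            callback.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        syncFromDb();
        // DB status, then whether the finished task is back in the active list.
        return db.get("task-1") + "/" + activeTasks.containsKey("task-1");
    }

    public static void main(String[] args) {
        System.out.println(runScenario());
    }
}
```

In this sketch the DB row stays RUNNING and the task reappears in the active list after the sync, which mirrors the leak described above.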

Release note

Fix WAITING task state leak due to interrupt race


This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.
