InMemoryTaskStore.wait_for_update has lost-wakeup races (concurrent waiters, notify-before-wait) #2535

@blackwell-systems

Description

wait_for_update overwrites _update_events[task_id] with a fresh event on every call. This causes two races:

Race 1: Concurrent waiters

If two callers poll the same task simultaneously, the second overwrites the first's event. notify_update only sets the latest event, so the first waiter hangs forever.
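
The overwrite pattern can be modelled in isolation. The following is a minimal sketch using only the standard library's asyncio; `ToyStore` is a hypothetical stand-in written for illustration, not the actual InMemoryTaskStore code, and it only mimics the behaviour described above (a fresh event per call, notify sets the latest event).

```python
import asyncio


class ToyStore:
    """Hypothetical stand-in that mimics the overwrite pattern described above."""

    def __init__(self):
        self._update_events = {}

    async def wait_for_update(self, task_id):
        # A fresh event replaces whatever an earlier waiter registered.
        event = asyncio.Event()
        self._update_events[task_id] = event
        await event.wait()

    def notify_update(self, task_id):
        # Only the most recently stored event is set.
        event = self._update_events.get(task_id)
        if event is not None:
            event.set()


async def demo():
    store = ToyStore()
    woke = {"a": False, "b": False}

    async def waiter(name):
        await store.wait_for_update("t1")
        woke[name] = True

    a = asyncio.create_task(waiter("a"))
    await asyncio.sleep(0)  # let "a" register its event
    b = asyncio.create_task(waiter("b"))
    await asyncio.sleep(0)  # let "b" overwrite that event
    store.notify_update("t1")

    # Give both waiters a short window to finish, then clean up the stuck one.
    await asyncio.wait([a, b], timeout=0.2)
    a.cancel()
    await asyncio.gather(a, b, return_exceptions=True)
    return woke


result = asyncio.run(demo())
print(result)
```

In this toy model waiter "a" is still blocked when the window expires, because the event it is waiting on is no longer referenced by the dictionary.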

Race 2: Notify before wait

If update_task completes (calling notify_update) before wait_for_update is called, the signal is lost because no event exists yet. The waiter blocks until the next update, which never comes for a terminal task.

Both are reachable via task_result_handler.py:126, which calls _wait_for_task_update in a polling loop.
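
One possible way to avoid both races, sketched here under assumptions and not taken from the library, is to replace the per-call event with a per-task version counter guarded by a single condition variable. The class `VersionedToyStore`, the `version` helper, and the `seen` parameter are all hypothetical; adopting this shape would change the signature of wait_for_update, so it is only meant to illustrate the idea.

```python
import asyncio


class VersionedToyStore:
    """Hypothetical sketch: one Condition plus a per-task version counter."""

    def __init__(self):
        self._versions = {}
        self._cond = asyncio.Condition()

    def version(self, task_id):
        return self._versions.get(task_id, 0)

    async def notify_update(self, task_id):
        async with self._cond:
            self._versions[task_id] = self.version(task_id) + 1
            # Wake every waiter; each one rechecks its own predicate.
            self._cond.notify_all()

    async def wait_for_update(self, task_id, seen=0):
        # Returns immediately if an update newer than `seen` already happened,
        # so a notification that arrives before the wait is not lost.
        async with self._cond:
            await self._cond.wait_for(lambda: self.version(task_id) > seen)
            return self.version(task_id)


async def demo():
    store = VersionedToyStore()

    # Notify before wait: the waiter sees version 1 > 0 and returns.
    await store.notify_update("t1")
    early = await asyncio.wait_for(store.wait_for_update("t1", seen=0), timeout=0.2)

    # Concurrent waiters: both share the same condition, so both wake.
    a = asyncio.create_task(store.wait_for_update("t1", seen=early))
    b = asyncio.create_task(store.wait_for_update("t1", seen=early))
    await asyncio.sleep(0)
    await store.notify_update("t1")
    late = await asyncio.wait_for(asyncio.gather(a, b), timeout=0.2)
    return early, late


early, late = asyncio.run(demo())
print(early, late)
```

Because waiters compare against a version they have already seen instead of waiting on a freshly created event, the order of notify and wait no longer matters, and any number of waiters can share the same condition.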

Reproducer

import anyio
from mcp.shared.experimental.tasks.in_memory_task_store import InMemoryTaskStore
from mcp.types import TaskMetadata

async def main():
    store = InMemoryTaskStore()
    task = await store.create_task(TaskMetadata())

    # Race 1: concurrent waiters
    woke = {"a": False, "b": False}
    async def waiter(name):
        await store.wait_for_update(task.task_id)
        woke[name] = True
    async def updater():
        await anyio.sleep(0.05)
        await store.update_task(task.task_id, status="completed")
    try:
        with anyio.fail_after(2):
            async with anyio.create_task_group() as tg:
                tg.start_soon(waiter, "a")
                await anyio.sleep(0.01)
                tg.start_soon(waiter, "b")
                tg.start_soon(updater)
    except TimeoutError:
        pass
    print(f"a: {'woke' if woke['a'] else 'HUNG'}, b: {'woke' if woke['b'] else 'HUNG'}")

    # Race 2: notify before wait
    store2 = InMemoryTaskStore()
    task2 = await store2.create_task(TaskMetadata())
    await store2.update_task(task2.task_id, status="completed")
    try:
        with anyio.fail_after(1):
            await store2.wait_for_update(task2.task_id)
            print("wait returned")
    except TimeoutError:
        print("HUNG: signal lost")

anyio.run(main)

Output:

a: HUNG, b: woke
HUNG: signal lost
