Describe the bug
CoalescePartitionsExec tasks hold on to an Arc ref of the input plan (ref)
- It doesn't actually need it after
input.execute(..) except for printing the plan in debug logs
RepartitionExec has some state in the plan itself
- One such state holds on to Arc refs of
Arc<Vec<SpawnedTask<()>>> (ref)
- Which means that the tasks are only cancelled once all Arc refs to the plan are dropped
- So each layer of
CoalescePartitionsExec-RepartitionExec delays cancellation of the query
To Reproduce
I have a reproducer here: Samyak2#1 (warning: mostly LLM-generated, but I have verified that it actually checks the correct thing)
Relevant parts of the output:
repartition_task_group=0 input_partition=0 kind=pull_from_input drop_elapsed_ms=68
repartition_task_group=1 input_partition=0 kind=pull_from_input drop_elapsed_ms=80
repartition_task_group=1 input_partition=1 kind=pull_from_input drop_elapsed_ms=85
output_partitions=32 input_rows_per_partition=1024000 all_repartition_operator_drop_elapsed_ms=80
all_repartition_task_drop_elapsed_ms=85
all_observed_drop_elapsed_ms=85
The cancellation is delayed by ~80ms due to CoalescePartitionsExec
Expected behavior
CoalescePartitionsExec should drop child plan early
Additional context
I have a fix for this. Will raise a PR soon.
Describe the bug
CoalescePartitionsExectasks hold on to an Arc ref of the input plan (ref)input.execute(..)except for printing the plan in debug logsRepartitionExechas some state in the plan itselfArc<Vec<SpawnedTask<()>>>(ref)CoalescePartitionsExec-RepartitionExecdelays cancellation of the queryTo Reproduce
I have a reproducer here: Samyak2#1 (warning: mostly LLM-generated, but I have verified that it actually checks the correct thing)
Relevant parts of the output:
The cancellation is delayed by ~80ms due to
CoalescePartitionsExecExpected behavior
CoalescePartitionsExecshould drop child plan earlyAdditional context
I have a fix for this. Will raise a PR soon.