Tool-call telemetry misclassifies Bash cancellation outcomes #583

@chen893

This was generated by AI during triage.

Summary

tool_call telemetry currently derives the outcome field from tool output text. As a result, some cancellation paths are recorded as error, and real errors can be recorded as cancelled when their output happens to contain cancellation-related words.

I can help implement the fix, but I would like to discuss the preferred shape with maintainers first because the robust fix may involve structured metadata on internal tool.result handling, and possibly SDK-visible event contracts.

Affected code

  • packages/agent-core/src/agent/turn/index.ts: telemetryToolOutcome classifies by string matching against tool output.
  • packages/agent-core/src/tools/builtin/shell/bash.ts: Bash abort and timeout paths return user-visible text such as Interrupted by user and Command killed by timeout (...).
  • packages/agent-core/src/loop/tool-call.ts: isUserCancellation(signal.reason) is available while settling aborted tool calls, but this structured signal is not preserved for telemetry.

What I observed

Current classification logic treats an error result as cancelled only when output contains one of:

  • aborted
  • cancelled
  • manually interrupted

That means these cases are currently misclassified or fragile:

  • Bash user abort returns Interrupted by user, so telemetry records outcome: error instead of cancelled.
  • Bash timeout returns Command killed by timeout (...), so it is also recorded as error; whether timeout should remain error or become a separate category seems worth deciding explicitly.
  • A real hook/tool error whose message or stack contains aborted, cancelled, or manually interrupted can be recorded as cancelled, which also suppresses the error_type field because error_type is only attached when outcome === "error".
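
The cases above can be reproduced with a small sketch. This is a reconstruction from the behavior described in this issue, not the actual telemetryToolOutcome source: the function names, the "success" value, the case-insensitive match, and the sample error_type values are all assumptions made for illustration.

```typescript
// Hypothetical reconstruction of the described behavior; not the real
// telemetryToolOutcome implementation. Names and "success" are assumptions.
type Outcome = "success" | "error" | "cancelled";

interface ToolCallTelemetry {
  outcome: Outcome;
  error_type?: string;
}

// The three substrings listed in this issue.
const CANCEL_MARKERS = ["aborted", "cancelled", "manually interrupted"];

function classifyByOutputText(isError: boolean, output: string): Outcome {
  if (!isError) return "success";
  const text = output.toLowerCase(); // assumption: matching ignores case
  const matched = CANCEL_MARKERS.some((marker) => text.indexOf(marker) !== -1);
  return matched ? "cancelled" : "error";
}

function buildTelemetry(isError: boolean, output: string, errorType: string): ToolCallTelemetry {
  const outcome = classifyByOutputText(isError, output);
  // error_type is attached only when outcome === "error", as described above.
  return outcome === "error" ? { outcome, error_type: errorType } : { outcome };
}

// Bash user abort: no marker matches, so it is recorded as an error.
const userAbort = buildTelemetry(true, "Interrupted by user", "ToolError");

// A genuine hook failure that mentions "aborted" is recorded as cancelled,
// and its error_type is dropped.
const hookFailure = buildTelemetry(true, "HookError: upstream request aborted", "HookError");
```

In this sketch userAbort ends up with outcome "error", while hookFailure ends up with outcome "cancelled" and no error_type, which matches the first and third bullets.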

Expected behavior

Telemetry should not depend only on user-visible output strings. User-initiated cancellation should be classified consistently as cancelled; real tool or hook failures should remain error; timeout behavior should have an explicit agreed meaning, either a separate outcome or an error subtype.

Possible fix directions

  1. Minimal fix: add another string such as interrupted to telemetryToolOutcome. This covers the Bash user-abort text but keeps the false-positive problem and remains coupled to copy changes.
  2. Preferred direction: preserve structured outcome or reason metadata when creating tool.result events or internal telemetry inputs. The loop already has access to isUserCancellation(signal.reason) in cancellation paths, so telemetry can consume structured state instead of guessing from output.
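
As a rough illustration of direction 2, telemetry could read a structured reason carried alongside the tool result instead of inspecting the output text. Everything in this sketch is hypothetical: the ToolEndReason values, the ToolResultMeta shape, the stand-in for isUserCancellation (whose real signature and logic are not shown in this issue), and the choice to report timeout as an error subtype rather than a separate outcome.

```typescript
// Hypothetical shapes; none of these names come from the repository.
type ToolEndReason = "completed" | "failed" | "user_cancelled" | "timeout";

interface ToolResultMeta {
  output: string; // user-visible text, no longer used for classification
  endReason: ToolEndReason;
}

interface TelemetryFields {
  outcome: "success" | "error" | "cancelled";
  error_type?: string;
}

// Stand-in for isUserCancellation(signal.reason). The tagged-object check is
// an assumption made so the example is self-contained.
function isUserCancellationStandIn(reason: unknown): boolean {
  return (
    typeof reason === "object" &&
    reason !== null &&
    (reason as { kind?: unknown }).kind === "user_cancel"
  );
}

// Decide the structured reason once, where the abort is settled.
function endReasonForAbort(reason: unknown): ToolEndReason {
  return isUserCancellationStandIn(reason) ? "user_cancelled" : "failed";
}

// Telemetry maps the structured reason and never reads the output text.
function outcomeFromMeta(meta: ToolResultMeta): TelemetryFields {
  switch (meta.endReason) {
    case "completed":
      return { outcome: "success" };
    case "user_cancelled":
      return { outcome: "cancelled" };
    case "timeout":
      // One of the two options raised in this issue: timeout as an error subtype.
      return { outcome: "error", error_type: "timeout" };
    case "failed":
      return { outcome: "error", error_type: "tool_error" };
  }
}

const cancelled = outcomeFromMeta({
  output: "Interrupted by user",
  endReason: endReasonForAbort({ kind: "user_cancel" }),
});
const realFailure = outcomeFromMeta({
  output: "HookError: upstream request aborted",
  endReason: "failed",
});
```

With this shape, changing the user-visible copy in bash.ts would not affect classification, and a failure message containing "aborted" would stay an error. Whether the reason belongs on the tool.result event or only on an internal telemetry input is the open question for maintainers.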

I am happy to help with the implementation after maintainers confirm which telemetry schema and event-surface tradeoff is preferred.

Additional notes

Impact appears limited to telemetry and the distribution of error_type; I did not find in-process retry, UI, or control-flow logic depending on this telemetry outcome field.
