Skip to content

bug: duplicate message_complete events in NDJSON output (--format json) #45

@byapparov

Description

@byapparov

Summary

aictrl run --format json emits duplicate message_complete events for the same LLM turn. In CLI 0.2.0 the duplicates had identical timestamps so consumers could dedup by timestamp. In 0.3.1 the timestamps drift by ~9ms, breaking timestamp-based dedup and inflating token/cost metrics for any downstream consumer.

Reproduction

bun run ./src/index.ts run --format json \
  --model "zai-coding-plan/glm-4.7-flash" \
  "list the files in the current directory using ls, then tell me how many there are"

Observed output (0.3.1)

Turn 1 emits two message_complete with identical timestamps (dedup works):

{"type":"message_complete","timestamp":1775768815171,"tokens":{"input":12486,"output":72},"finish":"tool-calls"}
{"type":"message_complete","timestamp":1775768815171,"tokens":{"input":12486,"output":72},"finish":"tool-calls"}

Turn 2 emits two message_complete with different timestamps (dedup breaks):

{"type":"message_complete","timestamp":1775768836033,"tokens":{"input":137,"output":193},"finish":"stop"}
{"type":"message_complete","timestamp":1775768836042,"tokens":{"input":137,"output":193},"finish":"stop"}

Expected

One message_complete per LLM turn.

Root cause

packages/cli/src/cli/cmd/run.ts lines 455–466 — the message.updated event handler emits message_complete every time the SDK pushes a message.updated with info.finish set. The SDK sends the same finished message multiple times (likely once for content completion, once for step lifecycle), and there's no guard against re-emitting for an already-finished messageID.

Suggested fix

Track emitted messageIDs in a Set<string> and skip if already seen:

const emittedMessages = new Set<string>()

// Inside the message.updated handler:
if (info.finish) {
  const msgId = event.properties.info.id // or another stable identifier
  if (emittedMessages.has(msgId)) continue
  emittedMessages.add(msgId)
  emit("message_complete", { ... })
}

Impact

Any consumer of the NDJSON stream that aggregates tokens or costs across message_complete events will over-count. The aictrl-dev/aictrl task executor's aggregateNdjsonMetrics function has a workaround (timestamp:input:output dedup key) that catches same-timestamp duplicates but misses the drifted-timestamp ones introduced in 0.3.1.

Ref: aictrl-dev/aictrl#1197 (test fixture update documenting this behaviour)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions