feat(runtime): expose per-turn usage and cost in the after_llm_call hook payload by kimizuka · Pull Request #2994 · docker/docker-agent

kimizuka · 2026-06-03T14:38:42Z

Summary

Adds per-turn token usage and computed USD cost to the after_llm_call hook payload (hooks.Input), so a sidecar cost ledger can record per-call spend from the hook payload alone — without subscribing to the runtime event channel. This is the primitive-first half of #2948.

model_id was already populated by #2911, so the remaining scope here is just usage and cost.

This implements what was discussed in #2948. The one design decision worth a second look is the cost JSON encoding, covered under Wire contract below.

Closes #2948

Wire contract

hooks.Input is the struct shared by every hook event and serialized to JSON for handlers, so the additions are deliberately conservative:

Usage *chat.Usage `json:"usage,omitempty"`
Cost  *float64    `json:"cost,omitempty"`

For a native model call, cost has three meaningful states:

| Go value | JSON | Meaning |
| --- | --- | --- |
| `nil` | key absent | unpriced: no pricing table, or no usage reported |
| `&0` | `"cost": 0` | a priced call that was genuinely free |
| `&N` | `"cost": N` | the priced USD cost of the response |

omitempty on a pointer drops only nil, never a pointer to 0, so a free call still emits an explicit "cost": 0. A present cost is therefore authoritative and an absent one means "unpriced", with no need to cross-check usage. Both fields are populated only for after_llm_call; they are nil (and thus absent) on every other event, so no other event's payload changes.
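To make the three states concrete, here is a minimal sketch of how encoding/json treats a *float64 with omitempty. The payload type and the encode and ptr helpers are hypothetical stand-ins for illustration; only the Cost field and its tag come from this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// payload is a hypothetical stand-in for the cost-bearing part of hooks.Input.
type payload struct {
	Cost *float64 `json:"cost,omitempty"`
}

// encode marshals a payload carrying the given cost and returns the JSON text.
func encode(cost *float64) string {
	b, err := json.Marshal(payload{Cost: cost})
	if err != nil {
		panic(err)
	}
	return string(b)
}

// ptr returns a pointer to v, since Go cannot take the address of a literal.
func ptr(v float64) *float64 { return &v }

func main() {
	fmt.Println(encode(nil))         // {}
	fmt.Println(encode(ptr(0)))      // {"cost":0}
	fmt.Println(encode(ptr(0.0125))) // {"cost":0.0125}
}
```

A nil pointer drops the key, while a pointer to 0 still serializes, which is what lets a handler treat a present cost as authoritative.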

Note on the schema (deviation from my #2948 comment)

In #2948 I suggested json:"cost" (no omitempty, explicit cost: null for unpriced). I switched to omitempty here because hooks.Input is shared by all events: without omitempty, every non-after_llm_call event (before_llm_call, session_end, …) would start emitting "cost": null, which both pollutes unrelated payloads and breaks the struct's all-omitempty convention. The pointer + omitempty form keeps the same three-way distinction within after_llm_call (absent / 0 / N) while leaving other events untouched. An explicit null instead of omitempty is a one-line change if the team prefers it.
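The trade-off can be sketched by marshaling a non-after_llm_call event under both tag choices. The struct names and the event field below are assumptions made for the illustration, not the real hooks.Input layout.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// withOmit mirrors the encoding chosen in this PR (hypothetical struct).
type withOmit struct {
	Event string   `json:"event"`
	Cost  *float64 `json:"cost,omitempty"`
}

// withNull mirrors the alternative suggested in the #2948 comment (hypothetical struct).
type withNull struct {
	Event string   `json:"event"`
	Cost  *float64 `json:"cost"`
}

// marshal returns the JSON text for v.
func marshal(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// An event that never carries a cost, e.g. session_end.
	fmt.Println(marshal(withOmit{Event: "session_end"})) // {"event":"session_end"}
	fmt.Println(marshal(withNull{Event: "session_end"})) // {"event":"session_end","cost":null}
}
```

With the plain tag, every event that shares the struct gains a "cost": null key; with omitempty, only payloads that actually carry a cost change.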

Cost is computed once and equals the session's recorded cost

The per-turn cost is computed once in runTurn via a new computeMessageCost(usage, m) *float64 helper and threaded into both the hook payload and recordAssistantMessage. The previous inline arithmetic in recordAssistantMessage is replaced by this single source (the m *modelsdev.Model param becomes the precomputed cost *float64), so the cost a handler sees is exactly the cost the session bills for the turn. The persisted message cost is unchanged (nil records as 0, matching prior behavior).
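The compute-once flow might look roughly like the sketch below. It is not the actual computeMessageCost: the usage and pricing types, their field names, the per-million-token rates, and the computeCost and recordedCost helpers are all assumptions chosen to show the nil branches and the shared pointer.

```go
package main

import "fmt"

// usage is a hypothetical stand-in for chat.Usage; field names are assumptions.
type usage struct {
	InputTokens  int64
	OutputTokens int64
}

// pricing is a hypothetical stand-in for a model's pricing table,
// expressed in USD per million tokens (an assumed unit).
type pricing struct {
	InputPerMillion  float64
	OutputPerMillion float64
}

// computeCost returns nil when the call cannot be priced (no usage or no
// pricing table) and a pointer, possibly to 0, otherwise.
func computeCost(u *usage, p *pricing) *float64 {
	if u == nil || p == nil {
		return nil
	}
	c := (float64(u.InputTokens)*p.InputPerMillion +
		float64(u.OutputTokens)*p.OutputPerMillion) / 1e6
	return &c
}

// recordedCost mirrors "nil records as 0" for the persisted message cost.
func recordedCost(c *float64) float64 {
	if c == nil {
		return 0
	}
	return *c
}

func main() {
	u := &usage{InputTokens: 1000, OutputTokens: 500}
	p := &pricing{InputPerMillion: 3, OutputPerMillion: 15} // made-up rates

	cost := computeCost(u, p) // computed once per turn
	hookCost := cost          // the same pointer feeds the hook payload...
	fmt.Printf("hook=%.4f recorded=%.4f\n", *hookCost, recordedCost(cost)) // ...and the record

	// An unpriced model yields nil for the hook and 0 for the record.
	fmt.Println(computeCost(u, nil) == nil, recordedCost(nil))
}
```

Because both consumers read the same value, the hook cost and the recorded cost cannot drift apart.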

Harness agents

For harness agents, cost is the harness's own reported total rather than a computed price. The harness library defaults the total to 0 when the harness output omits a cost (e.g. the codex harness reports token counts but no cost), which is indistinguishable from a genuinely free call. To avoid telling a ledger that a billed turn was free, cost is surfaced only when the harness reported a non-zero value; otherwise it is nil (unpriced).
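The harness rule reduces to a single guard. The helper below is a hypothetical sketch of that rule, not the function used in the PR.

```go
package main

import "fmt"

// harnessCost converts a harness-reported total into the hook's cost pointer.
// A total of 0 is ambiguous (free call or omitted cost), so it is treated as
// unpriced and returned as nil. The function name is an assumption.
func harnessCost(reportedTotal float64) *float64 {
	if reportedTotal == 0 {
		return nil
	}
	return &reportedTotal
}

func main() {
	fmt.Println(harnessCost(0) == nil) // true
	fmt.Println(*harnessCost(0.42))    // 0.42
}
```

The cost of this choice is that a genuinely free harness turn is reported as unpriced rather than as 0.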

Sub-sessions

after_llm_call fires for every model call, including those inside sub-sessions (transferred tasks, background agents, skills), each with the sub-session's own session_id. Summing cost across after_llm_call events therefore captures all spend — including sub-sessions whose cost may never reach the session store — which is the motivating case in #2948. The change does not touch sub-session persistence in any way.
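A consumer that aggregates spend across root and sub-sessions could be sketched as follows. The event struct keeps only the session_id and cost keys; the sumCosts helper and the sample session IDs are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// event holds the two payload keys needed for aggregation.
type event struct {
	SessionID string   `json:"session_id"`
	Cost      *float64 `json:"cost,omitempty"`
}

// sumCosts totals priced spend overall and per session, and counts calls
// whose payload carried no cost (unpriced).
func sumCosts(payloads []string) (total float64, perSession map[string]float64, unpriced int, err error) {
	perSession = make(map[string]float64)
	for _, p := range payloads {
		var e event
		if err = json.Unmarshal([]byte(p), &e); err != nil {
			return 0, nil, 0, err
		}
		if e.Cost == nil {
			unpriced++ // key absent: not the same as a free call
			continue
		}
		total += *e.Cost
		perSession[e.SessionID] += *e.Cost
	}
	return total, perSession, unpriced, nil
}

func main() {
	payloads := []string{
		`{"session_id":"root","cost":0.25}`,
		`{"session_id":"sub-1","cost":0.5}`,
		`{"session_id":"sub-1"}`,
	}
	total, perSession, unpriced, err := sumCosts(payloads)
	if err != nil {
		panic(err)
	}
	fmt.Println(total, perSession["sub-1"], unpriced) // 0.75 0.5 1
}
```

Tracking the unpriced count separately keeps the total honest: it is a lower bound whenever that count is non-zero.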

Example: a cost-ledger sidecar

A command hook can append one CSV row per model call straight from the payload — no event-channel subscription needed. has("cost") distinguishes an unpriced call (key absent) from a priced free one (cost: 0):

after_llm_call:
  - type: command
    command: |
      cat | jq -r '[
        (now | todateiso8601), .session_id, .model_id,
        (.usage.input_tokens  // 0), (.usage.output_tokens // 0),
        (if has("cost") then (.cost | tostring) else "unpriced" end)
      ] | @csv' >> /tmp/cost-ledger.csv

A runnable version is wired into the canonical examples/hooks.yaml. Because after_llm_call fires for sub-session turns too (each with its own session_id), summing the cost column (skipping unpriced rows) gives the full priced spend for the run.
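For handlers that outgrow jq, the same row could be produced by a small Go program. This is a sketch under assumptions: the hookPayload struct lists only the keys the jq example reads, and the ledgerRow and appendRow helpers are hypothetical. The demo feeds a sample payload through a strings.Reader; a real command hook would pass os.Stdin instead.

```go
package main

import (
	"encoding/csv"
	"encoding/json"
	"fmt"
	"io"
	"os"
	"strconv"
	"strings"
	"time"
)

// hookPayload lists only the keys read by the jq example.
type hookPayload struct {
	SessionID string `json:"session_id"`
	ModelID   string `json:"model_id"`
	Usage     *struct {
		InputTokens  int64 `json:"input_tokens"`
		OutputTokens int64 `json:"output_tokens"`
	} `json:"usage,omitempty"`
	Cost *float64 `json:"cost,omitempty"`
}

// ledgerRow turns one after_llm_call payload into CSV fields.
func ledgerRow(payload []byte, timestamp string) ([]string, error) {
	var p hookPayload
	if err := json.Unmarshal(payload, &p); err != nil {
		return nil, err
	}
	var in, out int64 // default to 0 when usage is absent
	if p.Usage != nil {
		in, out = p.Usage.InputTokens, p.Usage.OutputTokens
	}
	cost := "unpriced" // key absent
	if p.Cost != nil {
		cost = strconv.FormatFloat(*p.Cost, 'f', -1, 64) // includes a priced 0
	}
	return []string{timestamp, p.SessionID, p.ModelID,
		strconv.FormatInt(in, 10), strconv.FormatInt(out, 10), cost}, nil
}

// appendRow reads one payload from r and writes one CSV row to w.
func appendRow(r io.Reader, w io.Writer) error {
	payload, err := io.ReadAll(r)
	if err != nil {
		return err
	}
	row, err := ledgerRow(payload, time.Now().UTC().Format(time.RFC3339))
	if err != nil {
		return err
	}
	cw := csv.NewWriter(w)
	if err := cw.Write(row); err != nil {
		return err
	}
	cw.Flush()
	return cw.Error()
}

func main() {
	// Sample payload with made-up values; swap in os.Stdin when run as a hook.
	sample := `{"session_id":"s1","model_id":"example/model","usage":{"input_tokens":1200,"output_tokens":300},"cost":0}`
	if err := appendRow(strings.NewReader(sample), os.Stdout); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Checking the pointer for nil plays the role of has("cost") in the jq version, so a priced free call is written as 0 and a missing key as unpriced.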

Out of scope (follow-ups)

Fallback-model pricing: on a turn that fell back to a secondary model, cost/model_id reflect the primary model. This is pre-existing behavior in recordAssistantMessage and is preserved here (hook cost == recorded cost); attributing cost to the model that actually ran is a separate change.
Coverage of compaction sub-runtimes and the chatserver / a2a / acp paths.

Testing

pkg/runtime/after_llm_call_test.go:

priced call → usage + non-nil cost, and *cost == sess.OwnCost()
unpriced model → usage present, cost nil
JSON wire contract (absent / explicit 0 / N)
harness path (codex) → usage present, cost nil
computeMessageCost unit tests (every nil branch + all token classes)

docs/configuration/hooks/index.md updated for the new fields and the sub-session / harness caveats; examples/hooks.yaml demonstrates a cost-ledger sidecar consuming the payload.

Signed off per DCO.

Forward the per-call token usage and computed USD cost to the after_llm_call hook payload so sidecar cost ledgers can record per-call spend from the payload alone, without subscribing to the runtime event channel. Cost is a *float64 so the wire contract can distinguish an unpriced model (nil, key absent) from a priced free call (pointer to 0). The per-turn cost is computed once in computeMessageCost and threaded into both the hook payload and the recorded assistant message, so the two can never disagree. For harness agents the cost is surfaced only when the harness reported a non-zero value, avoiding reporting a billed turn as free when a harness omits its cost (e.g. codex). Signed-off-by: kimizuka <f.kimizuka@gmail.com>

Verify that after_llm_call populates usage and cost, that cost is nil when the model is unpriced, the nil-vs-zero JSON contract, harness usage with no cost surfacing as unpriced, and computeMessageCost. Signed-off-by: kimizuka <f.kimizuka@gmail.com>

Describe the new usage and cost fields, the priced/unpriced/free semantics and the harness caveat, and add a per-call cost-ledger example to examples/hooks.yaml. Signed-off-by: kimizuka <f.kimizuka@gmail.com>

Add the empty line embeddedstructfieldcheck wants between the embedded ModelStore and the cost field, and switch the float equality assertions to assert.InDelta to satisfy testifylint's float-compare rule.

kimizuka added 3 commits June 3, 2026 23:45

docs(hooks): document after_llm_call usage and cost fields

8a4eb0b

kimizuka force-pushed the feat/after-llm-call-usage-cost branch from 0d587c8 to 8a4eb0b June 3, 2026 14:45

kimizuka mentioned this pull request Jun 3, 2026

Expose per-turn token usage and cost in the after_llm_call hook payload #2948

Open

aheritier added the area/agent and kind/feat labels Jun 3, 2026

dgageot marked this pull request as ready for review June 3, 2026 15:42

dgageot requested a review from a team as a code owner June 3, 2026 15:42

dgageot marked this pull request as draft June 3, 2026 15:42

test(runtime): satisfy golangci-lint in after_llm_call test

dea13db

kimizuka wants to merge 4 commits into docker:main from kimizuka:feat/after-llm-call-usage-cost
