Agent heartbeat can consume x402-buyer pre-signed auths from shared paid/* namespace #306

@bussyjd

Summary

When the obol-agent buys remote inference via buy.py buy, the purchased model is exposed as paid/<model> through LiteLLM's static wildcard route. Because the agent itself routes through the same LiteLLM instance, its heartbeat or internal inference calls can consume pre-signed authorizations from the buyer pool, which reduces the number left for explicit user requests.

Observed behavior

Bought 10 pre-signed ERC-3009 authorizations for paid/qwen3.5:9b. Fired 12 sequential requests:

| Request | Result |
|---------|--------|
| 1–9 | 200 OK (paid inference returned) |
| 10–12 | 502 Bad Gateway (auth pool exhausted) |

Expected: 10 successes, 2 failures.
Actual: 9 successes, 3 failures — 1 auth was consumed by something else between buy and test.
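
The depletion test above can be sketched as a small probe that fires sequential requests and tallies the status codes. This is a minimal sketch, not the script used for the report: `probe_depletion` and `fake_send_factory` are hypothetical names, and the sender is injected so that a real HTTP call to the LiteLLM endpoint (not shown here) can be swapped in for the stub.

```python
from collections import Counter
from typing import Callable, Dict


def probe_depletion(send: Callable[[int], int], total: int) -> Dict[int, int]:
    """Fire `total` sequential requests and tally the returned status codes.

    `send` is any callable that takes the 1-based request index and returns
    an HTTP status code. It is injected so the probe can run against a stub.
    """
    tally: Counter = Counter()
    for index in range(1, total + 1):
        tally[send(index)] += 1
    return dict(tally)


def fake_send_factory(auths_left: int) -> Callable[[int], int]:
    """Hypothetical stand-in for the paid endpoint.

    Returns 200 while auths remain and 502 once the pool is empty,
    mirroring the status codes reported in this issue.
    """
    state = {"left": auths_left}

    def send(_index: int) -> int:
        if state["left"] > 0:
            state["left"] -= 1
            return 200
        return 502

    return send
```

With a stub that has only 9 auths left, `probe_depletion(fake_send_factory(9), 12)` yields 9 successes and 3 failures, matching the observed result; with 10 auths it yields the expected 10 and 2.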

Isolated confirmation

Created a second ServiceOffer (depletion-test) with the same upstream but fresh auths. Bought exactly 5 auths and fired 7 requests:

| Request | Result |
|---------|--------|
| 1–5 | 200 OK |
| 6–7 | 502 Bad Gateway |

Exactly 5 of 5 succeeded, so the sidecar's auth accounting is correct, with no off-by-one. The missing auth in the first test was therefore most likely consumed by the agent's own LiteLLM traffic (a heartbeat or internal inference call).

Root cause

The paid/* wildcard in LiteLLM's model_list routes ALL requests with that prefix through the x402-buyer sidecar. The agent's own inference calls (heartbeat, chat completions routed through LiteLLM) share the same paid/* namespace and consume auths from the same pool.
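
The collision can be illustrated with a toy model of the routing. This is only a sketch under assumptions: `AuthPool` and `route` are hypothetical stand-ins for the sidecar and the wildcard route, not the actual LiteLLM or x402-buyer code, and the caller labels are invented for illustration.

```python
class AuthPool:
    """Toy pool of pre-signed authorizations shared by every paid/* request."""

    def __init__(self, size: int) -> None:
        self.remaining = size
        self.consumers: list[str] = []  # who spent each auth, in order

    def consume(self, caller: str) -> bool:
        """Spend one auth if any remain; record who spent it."""
        if self.remaining == 0:
            return False
        self.remaining -= 1
        self.consumers.append(caller)
        return True


def route(model: str, caller: str, pool: AuthPool) -> int:
    """Toy wildcard route: any model under paid/ draws from the same pool."""
    if model.startswith("paid/"):
        return 200 if pool.consume(caller) else 502
    return 200  # non-paid models never touch the pool


# Reproduce the first test: 10 auths bought, one agent call, then 12 user requests.
pool = AuthPool(10)
route("paid/qwen3.5:9b", "agent-heartbeat", pool)
results = [route("paid/qwen3.5:9b", "user", pool) for _ in range(12)]
```

Here `results` contains nine 200s followed by three 502s, because the single agent call spent one auth before the user requests started.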

Impact

  • Auth pool depletes faster than expected (the agent silently consumes an unknown number of auths)
  • User-facing requests may fail earlier than the budget suggests
  • No visibility into which requests consumed auths (sidecar doesn't log successful requests)

Possible fixes

  1. Sidecar request logging: Log each auth consumption with timestamp + request metadata so consumption is auditable
  2. Namespace isolation: Use a per-purchase model prefix (e.g. paid/<purchase-name>/<model>) so agent traffic doesn't collide with user purchases
  3. Agent model routing: Configure the agent to NOT use paid/* aliases for its own inference — route agent chat through local Ollama only
  4. Auth reservation: Reserve a fraction of the pool for user requests, warn when agent traffic encroaches
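
Fixes 1 and 2 could be combined roughly as follows. This is a hedged sketch rather than a proposed implementation: `purchase_alias`, `PurchasePools`, and the log format are all hypothetical, and the purchase name `user-purchase` is invented for the example. The only detail taken from the issue is the paid/<purchase-name>/<model> alias shape.

```python
import logging
from dataclasses import dataclass, field
from typing import Dict

log = logging.getLogger("x402_buyer_sketch")  # hypothetical logger name


def purchase_alias(purchase_name: str, model: str) -> str:
    """Build a per-purchase alias of the form paid/<purchase-name>/<model>."""
    return f"paid/{purchase_name}/{model}"


@dataclass
class PurchasePools:
    """One auth counter per purchase alias, with an audit line per consumption."""

    pools: Dict[str, int] = field(default_factory=dict)

    def buy(self, purchase_name: str, model: str, count: int) -> str:
        """Register `count` auths under a fresh per-purchase alias."""
        alias = purchase_alias(purchase_name, model)
        self.pools[alias] = count
        return alias

    def consume(self, alias: str, caller: str) -> bool:
        """Spend one auth from the pool behind `alias`, logging the outcome."""
        remaining = self.pools.get(alias, 0)
        if remaining <= 0:
            log.warning("auth pool empty alias=%s caller=%s", alias, caller)
            return False
        self.pools[alias] = remaining - 1
        log.info("auth consumed alias=%s caller=%s remaining=%d",
                 alias, caller, remaining - 1)
        return True


# Two purchases of the same upstream model no longer share a counter.
pools = PurchasePools()
user_alias = pools.buy("user-purchase", "qwen3.5:9b", 10)
test_alias = pools.buy("depletion-test", "qwen3.5:9b", 5)
pools.consume(test_alias, caller="agent-heartbeat")
```

After the last line, the `depletion-test` pool holds 4 auths and the `user-purchase` pool still holds all 10, and the log records which caller spent the auth. Fix 3 would still be needed to stop the agent from selecting a paid alias in the first place.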

Environment

Labels: bug
