Agent heartbeat can consume x402-buyer pre-signed auths from shared paid/* namespace #306
Summary
When the obol-agent buys remote inference via buy.py buy, the purchased model is exposed as paid/<model> through LiteLLM's static wildcard route. Because the agent itself routes through the same LiteLLM instance, agent heartbeat or internal inference calls can consume pre-signed authorizations from the buyer pool — reducing the count available for explicit user requests.
Observed behavior
Bought 10 pre-signed ERC-3009 authorizations for paid/qwen3.5:9b. Fired 12 sequential requests:
| Request | Result |
|---|---|
| 1–9 | 200 OK (paid inference returned) |
| 10–12 | 502 Bad Gateway (auth pool exhausted) |
Expected: 10 successes, 2 failures.
Actual: 9 successes, 3 failures — 1 auth was consumed by something else between buy and test.
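The depletion check above can be sketched as a small harness. This is an illustration only: `send_request` is a hypothetical callable that returns the HTTP status of one paid request, and `make_fake_sidecar` is a stand-in that simulates an auth pool, not the real x402-buyer sidecar.

```python
from collections import Counter
from typing import Callable


def run_depletion_test(send_request: Callable[[], int], total: int) -> Counter:
    """Fire `total` sequential requests and tally the HTTP status codes."""
    statuses: Counter = Counter()
    for _ in range(total):
        statuses[send_request()] += 1
    return statuses


def missing_auths(statuses: Counter, purchased: int) -> int:
    """Number of purchased auths that this test run did not get to use."""
    return purchased - statuses[200]


def make_fake_sidecar(remaining: int) -> Callable[[], int]:
    """Hypothetical stand-in: answers 200 while auths remain, then 502."""
    state = {"remaining": remaining}

    def send() -> int:
        if state["remaining"] > 0:
            state["remaining"] -= 1
            return 200
        return 502

    return send
```

With 10 auths purchased and one already consumed elsewhere, `run_depletion_test(make_fake_sidecar(9), 12)` tallies 9 successes and 3 failures, and `missing_auths(..., purchased=10)` reports the single unaccounted auth.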
Isolated confirmation
Created a second ServiceOffer (depletion-test) with the same upstream but fresh auths. Bought exactly 5 auths and fired 7 requests:
| Request | Result |
|---|---|
| 1–5 | 200 OK |
| 6–7 | 502 Bad Gateway |
Exact 5/5 — the sidecar auth accounting is correct, no off-by-one. The missing auth in the first test was consumed by the agent's own LiteLLM traffic (likely a heartbeat or internal inference call).
Root cause
The paid/* wildcard in LiteLLM's model_list routes ALL requests with that prefix through the x402-buyer sidecar. The agent's own inference calls (heartbeat, chat completions routed through LiteLLM) share the same paid/* namespace and consume auths from the same pool.
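A simplified model of the collision, assuming the wildcard behaves like a plain glob match on the model name. This is not LiteLLM's routing code; `AuthPool` and `route` are hypothetical names used only to show that any caller requesting a `paid/*` model draws from the same pool.

```python
from fnmatch import fnmatchcase

PAID_WILDCARD = "paid/*"  # wildcard pattern described in this issue


class AuthPool:
    """Hypothetical stand-in for the sidecar's pre-signed auth pool."""

    def __init__(self, count: int) -> None:
        self.remaining = count

    def consume(self) -> bool:
        """Spend one auth; return False when the pool is exhausted."""
        if self.remaining == 0:
            return False
        self.remaining -= 1
        return True


def route(model: str, pool: AuthPool) -> int:
    """Return the HTTP status a request for `model` would see."""
    if not fnmatchcase(model, PAID_WILDCARD):
        return 200  # non-paid model: served without touching the pool
    # Paid model: every caller, agent or user, spends from the same pool.
    return 200 if pool.consume() else 502
```

In this model, a single agent call to `route("paid/qwen3.5:9b", pool)` before the test leaves only 9 of 10 auths for the 12 user requests, matching the 9/3 split observed.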
Impact
- Auth pool depletes faster than expected (agent consumes N auths silently)
- User-facing requests may fail earlier than the budget suggests
- No visibility into which requests consumed auths (sidecar doesn't log successful requests)
Possible fixes
- Sidecar request logging: Log each auth consumption with timestamp + request metadata so consumption is auditable
- Namespace isolation: Use a per-purchase model prefix (e.g. paid/<purchase-name>/<model>) so agent traffic doesn't collide with user purchases
- Agent model routing: Configure the agent to NOT use paid/* aliases for its own inference; route agent chat through local Ollama only
- Auth reservation: Reserve a fraction of the pool for user requests, warn when agent traffic encroaches
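A minimal sketch of how the logging and reservation ideas could combine. Everything here is hypothetical: `AuditedAuthPool`, the `caller` label, and the log record fields are illustrative names, and the sketch assumes the sidecar has some way to tell agent traffic from user traffic, which this issue does not establish.

```python
import time
from dataclasses import dataclass, field


@dataclass
class AuditedAuthPool:
    """Hypothetical auth pool with an audit log and a user-only reserve."""

    total: int
    user_reserve: int  # auths that only "user" callers may consume
    log: list = field(default_factory=list)
    remaining: int = field(init=False, default=0)

    def __post_init__(self) -> None:
        self.remaining = self.total

    def consume(self, caller: str, model: str) -> bool:
        """Spend one auth for `caller`; record it so usage is auditable."""
        if self.remaining <= 0:
            return False
        # Non-user traffic may not dip into the reserved portion.
        if caller != "user" and self.remaining <= self.user_reserve:
            return False
        self.remaining -= 1
        self.log.append(
            {
                "ts": time.time(),
                "caller": caller,
                "model": model,
                "remaining": self.remaining,
            }
        )
        return True
```

For example, with `AuditedAuthPool(total=3, user_reserve=2)` the agent can spend one auth, is refused on its second attempt, and the remaining two stay available to user requests, with each successful consumption recorded in `log`.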
Environment
- ARM64 (NVIDIA DGX Spark)
- Branch: feat/monetize-path + codex/serviceoffer-controller (PR #299: Refactor monetize reconciliation into the serviceoffer controller)
- x402-buyer sidecar: locally built from PR #299 (Refactor monetize reconciliation into the serviceoffer controller)
- Seller: ServiceOffer with x402 ForwardAuth on Base Sepolia
- Facilitator: https://facilitator.x402.rs