Agent heartbeat can consume x402-buyer pre-signed auths from shared paid/* namespace

## Summary

When the obol-agent buys remote inference via `buy.py buy`, the purchased model is exposed as `paid/<model>` through LiteLLM's static wildcard route. Because the agent itself routes through the same LiteLLM instance, its heartbeat or internal inference calls can consume pre-signed authorizations from the buyer pool, reducing the count available for explicit user requests.

## Observed behavior

Bought 10 pre-signed ERC-3009 authorizations for `paid/qwen3.5:9b`. Fired 12 sequential requests:

| Request | Result |
|---------|--------|
| 1–9 | 200 OK (paid inference returned) |
| 10–12 | 502 Bad Gateway (auth pool exhausted) |

**Expected**: 10 successes, 2 failures.
**Actual**: 9 successes, 3 failures. One auth was consumed by something else between the buy and the test.
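
The accounting above can be reproduced with a small probe. This is a minimal sketch, not the script used for the test: `run_depletion_probe`, `missing_auths`, and `fake_sender` are hypothetical names, and `send` stands in for whatever issues one request against `paid/qwen3.5:9b` and returns the HTTP status code. The fake sender only simulates a pool so that the example runs without a cluster.

```python
from collections import Counter
from typing import Callable, Optional


def run_depletion_probe(send: Callable[[int], int], total_requests: int) -> dict:
    """Fire requests sequentially and tally the results.

    `send` is a hypothetical callable: it issues request number i and
    returns the HTTP status code (200 on success, 502 once auths run out).
    """
    statuses = [send(i) for i in range(1, total_requests + 1)]
    ok = Counter(statuses)[200]
    first_failure: Optional[int] = next(
        (i for i, status in enumerate(statuses, start=1) if status != 200), None
    )
    return {"ok": ok, "failed": total_requests - ok, "first_failure": first_failure}


def missing_auths(purchased: int, report: dict) -> int:
    """Auths spent by something other than the probe.

    Only meaningful when the probe sent more requests than were purchased,
    so that the pool was driven to exhaustion.
    """
    return purchased - report["ok"]


def fake_sender(available_auths: int) -> Callable[[int], int]:
    """Simulated endpoint (assumption): 200 while auths remain, then 502."""
    remaining = available_auths

    def send(_request_number: int) -> int:
        nonlocal remaining
        if remaining > 0:
            remaining -= 1
            return 200
        return 502

    return send


if __name__ == "__main__":
    # 10 auths purchased, but only 9 still available when the probe starts.
    report = run_depletion_probe(fake_sender(9), total_requests=12)
    print(report, "missing:", missing_auths(10, report))
```

Against a live deployment, `send` would be replaced by a real HTTP call through LiteLLM; the tallying logic stays the same.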

## Isolated confirmation

Created a second ServiceOffer (`depletion-test`) with the same upstream but fresh auths. Bought exactly 5 auths and fired 7 requests:

| Request | Result |
|---------|--------|
| 1–5 | 200 OK |
| 6–7 | 502 Bad Gateway |

**Exactly 5 of 5**: the sidecar's auth accounting is correct, with no off-by-one. The auth missing from the first test was therefore most likely consumed by the agent's own LiteLLM traffic (a heartbeat or internal inference call).

## Root cause

The `paid/*` wildcard in LiteLLM's `model_list` routes ALL requests with that prefix through the x402-buyer sidecar. The agent's own inference calls (heartbeat, chat completions routed through LiteLLM) share the same `paid/*` namespace and consume auths from the same pool.
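
The collision can be illustrated with a toy model. This is a sketch under stated assumptions, not the LiteLLM or sidecar implementation: `BuyerPool` and `route` are hypothetical names, and the model assumes the agent's heartbeat resolves to a `paid/*` alias, which is what the tests above suggest but do not prove.

```python
from fnmatch import fnmatchcase


class BuyerPool:
    """Toy stand-in for the sidecar's pool of pre-signed auths."""

    def __init__(self, auths: int) -> None:
        self.remaining = auths

    def spend(self) -> int:
        """Return 200 and consume one auth, or 502 if the pool is empty."""
        if self.remaining == 0:
            return 502
        self.remaining -= 1
        return 200


def route(model: str, pool: BuyerPool) -> int:
    """Route by model name only: the caller's identity is never consulted."""
    if fnmatchcase(model, "paid/*"):
        return pool.spend()
    return 200  # non-paid models are assumed to be served locally


if __name__ == "__main__":
    pool = BuyerPool(10)
    route("paid/qwen3.5:9b", pool)  # one agent heartbeat (assumed)
    user_results = [route("paid/qwen3.5:9b", pool) for _ in range(12)]
    print(user_results.count(200), user_results.count(502))
```

Because routing depends only on the model prefix, a single agent call before the user's 12 requests yields the observed 9 successes and 3 failures.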

## Impact

- Auth pool depletes faster than expected (the agent silently consumes an unknown number of auths)
- User-facing requests may fail earlier than the budget suggests
- No visibility into which requests consumed auths (sidecar doesn't log successful requests)

## Possible fixes

1. **Sidecar request logging**: Log each auth consumption with timestamp + request metadata so consumption is auditable
2. **Namespace isolation**: Use a per-purchase model prefix (e.g. `paid/<purchase-name>/<model>`) so agent traffic doesn't collide with user purchases
3. **Agent model routing**: Configure the agent to NOT use `paid/*` aliases for its own inference — route agent chat through local Ollama only
4. **Auth reservation**: Reserve a fraction of the pool for user requests and warn when agent traffic encroaches on it
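
Fixes 1 and 4 could be combined roughly as follows. This is a sketch only: `ReservedAuthPool`, its fields, and the `caller` tag are hypothetical, and it assumes the sidecar has some way to tell agent traffic from user traffic (for example a request header), which is not the case today.

```python
import logging
import time
from dataclasses import dataclass, field

log = logging.getLogger("x402_buyer_sketch")


@dataclass
class ReservedAuthPool:
    """Auth pool with an audit trail and a user-only reserve (hypothetical)."""

    total: int          # auths purchased
    user_reserved: int  # auths that only callers tagged "user" may spend
    spent: int = 0
    audit: list = field(default_factory=list)

    def spend(self, caller: str, request_id: str) -> bool:
        """Consume one auth if allowed; record every consumption."""
        remaining = self.total - self.spent
        if remaining == 0:
            return False
        if caller != "user" and remaining <= self.user_reserved:
            # Fix 4: refuse agent traffic once only the reserve is left.
            log.warning("refusing %s request %s: only the user reserve (%d) remains",
                        caller, request_id, remaining)
            return False
        self.spent += 1
        # Fix 1: make each consumption auditable.
        self.audit.append((time.time(), caller, request_id))
        log.info("auth consumed by %s request %s, %d left",
                 caller, request_id, remaining - 1)
        return True


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    pool = ReservedAuthPool(total=3, user_reserved=2)
    pool.spend("agent", "hb-1")   # allowed: 3 remaining, above the reserve
    pool.spend("agent", "hb-2")   # refused: would dip into the user reserve
    pool.spend("user", "req-1")   # allowed
```

With an audit trail like this, the missing auth in the first test could have been attributed directly instead of inferred from a second experiment.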

## Environment

- ARM64 (NVIDIA DGX Spark)
- Branch: `feat/monetize-path` + `codex/serviceoffer-controller` (PR #299)
- x402-buyer sidecar: locally built from PR #299
- Seller: ServiceOffer with x402 ForwardAuth on Base Sepolia
- Facilitator: `https://facilitator.x402.rs`

