docs: clarify BYOK + Custom Inference request path and data flow by hongyi-chen · Pull Request #138 · warpdotdev/docs

hongyi-chen · 2026-05-26T18:07:48Z

Summary

Clarifies the request path and data-flow framing for both BYOK and Custom Inference endpoints in our docs, in response to warpdotdev/warp#11681 and the follow-up triage thread.

The previous wording on both pages combined two claims that aren't equivalent:

- Storage: API keys are stored locally (true and unchanged).
- Transit: requests are routed "directly" to the model provider. That part was misleading: the Warp Agent harness is server-hosted, so requests do transit Warp's backend; the key is passed in-flight per request and used to authenticate the call from warp-server, not from the client.

Both the recent r/warpdotdev complaint and issue #11681 (Custom Inference) traced back to this framing.

Per Petra's review, the storage claim on each page now focuses on what the key is for instead of where it isn't: the key is stored only on your device and used to authenticate requests to the model provider / configured endpoint. The 3-step flow and "Why does the request route through Warp's backend?" callout further down each page explain the actual transit path.

Confirmed against warp-server:

- `logic/ai/llm/anthropic/util/util.go:1032-1034`: server overrides the Anthropic SDK API key with the user-provided one per request.
- `logic/ai/llm/user_api_keys/util.go:7`: keys arrive in the request payload as `Request_Settings_ApiKeys`.
- `logic/ai/llm/llm_role.go:723`: server-side model routing applies BYOK preferences via `WithApiKeyConfigApplied`.
- `logic/ai/llm/custom_endpoint/client.go:14-21`: the OpenAI-compatible client is constructed server-side with `option.WithAPIKey(hostConfig.CustomEndpointAPIKey())` and `option.WithBaseURL(...CustomEndpointBaseURL())`; both come in on the request.
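
The references above describe one pattern: credentials arrive in the request payload, and the provider client is configured per request rather than from persistent server config. The following is a minimal Python sketch of that pattern only. It is not the warp-server code (which is Go), and every name in it (`RequestSettings`, `ProviderClient`, `build_client`, `handle_request`, `DEFAULT_BASE_URL`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical default, used only when the request carries no custom endpoint.
DEFAULT_BASE_URL = "https://provider.example/v1"


@dataclass(frozen=True)
class RequestSettings:
    """Per-request settings sent by the client (hypothetical shape)."""
    api_key: str
    base_url: Optional[str] = None


@dataclass(frozen=True)
class ProviderClient:
    """Stand-in for an SDK client; it holds credentials only in memory."""
    api_key: str
    base_url: str

    def auth_headers(self) -> dict:
        return {"Authorization": f"Bearer {self.api_key}"}


def build_client(settings: RequestSettings) -> ProviderClient:
    """Build a client from the request payload, not from stored server config."""
    if not settings.api_key:
        raise ValueError("request did not include an API key")
    return ProviderClient(
        api_key=settings.api_key,
        base_url=settings.base_url or DEFAULT_BASE_URL,
    )


def handle_request(settings: RequestSettings) -> dict:
    """Handle one request; the client goes out of scope when this returns."""
    client = build_client(settings)
    return {"base_url": client.base_url, "headers": client.auth_headers()}
```

Because `build_client` takes everything from `RequestSettings`, nothing in this sketch needs a key to exist on the server between requests, which is the property the docs change is trying to describe.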

Changes

`src/content/docs/agent-platform/inference/bring-your-own-api-key.mdx`

- Intro paragraph: dropped "access models directly" wording.
- "How BYOK works" section: replaced the single "directly route your agent requests" line with an explicit 3-step data flow (harness assembles request → key authenticates the call in-flight → response streams back), and clarified that keys live in-memory only for the duration of each request.
- Headline storage claim now uses Petra's framing: keys are stored only on your device and used to authenticate requests to your chosen model provider, without an absolute "never leaves your machine" implication.
- Added a "Why does the request route through Warp's backend?" note explaining the server-side harness (same runtime as Agent Mode with Warp-billed models).
- ZDR section: added a sentence noting BYOK request bodies transit Warp's backend but are not retained, used for training, or logged for analytics, the same posture as Warp-billed traffic. Scoped the existing "data retention policies depend on..." bullet to make explicit that it is about the provider side.
- Tightened the diagram alt text from "directly through your provider API key" → "authenticates BYOK agent requests with your provider API key".
- Fixed a stale anchor: the ZDR section linked to #how-does-byok-work, but the heading on main is now "How BYOK works" (slug #how-byok-works).
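
The stale-anchor fix in the last item comes down to how heading text maps to a slug. As a rough illustration, here is a small Python sketch that derives a slug and flags anchors with no matching heading. It assumes a GitHub-style slug rule (lowercase, punctuation dropped, spaces to hyphens); the docs site's actual slug algorithm and the behavior of the repository's link checker are not shown in this PR, and `slugify` and `find_stale_anchors` are hypothetical helpers.

```python
import re


def slugify(heading: str) -> str:
    """Approximate a GitHub-style heading slug (ASCII headings only)."""
    slug = heading.strip().lower()
    slug = re.sub(r"[^a-z0-9 _-]", "", slug)  # drop punctuation such as "?"
    return slug.replace(" ", "-")


def find_stale_anchors(headings: list[str], anchors: list[str]) -> list[str]:
    """Return anchors (with or without a leading '#') that match no heading slug."""
    known = {slugify(h) for h in headings}
    return [a for a in anchors if a.lstrip("#") not in known]
```

Under that assumed rule, the old heading "How does BYOK work?" and the new heading "How BYOK works" produce different slugs, so a link written against the old heading stops resolving once the heading is renamed.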

`src/content/docs/agent-platform/inference/custom-inference-endpoint.mdx`

- Key features: rewrote the "Local configuration" bullet to say the endpoint API key is stored only on the device and used to authenticate requests to your endpoint.
- How it works: replaced the blanket "never synced to Warp's servers" wording with an explicit 3-step request flow mirroring the BYOK rewrite.
- Added a "Why does the request route through Warp's backend?" callout matching the BYOK page, explicitly cross-linking to BYOK so readers see the consistent posture.
- ZDR section: added the same "request bodies transit Warp's backend but aren't used for training" framing as BYOK, and scoped the existing retention bullets to the provider side.

Additional context

- See the implementation plan for the research summary and the breakdown that informed this update.
- Conversation thread.
- No structural changes (no new top-level sections, no terminology drift), no sidebar config edits, no redirects required.
- Internal link checker: 0 broken links (`python3 .agents/skills/check_for_broken_links/check_links.py --internal-only` → 2702 internal links checked, 0 broken).

Conversation: https://staging.warp.dev/conversation/c3a085dc-4658-47c2-9908-e7f56672872f
Run: https://oz.staging.warp.dev/runs/019e665f-0f94-780b-9764-04bdbd28a24b
Plans:

Expand PR #138 to address Custom Inference Endpoint privacy claims

This PR was generated with Oz.

The BYOK doc previously said keys are "stored locally" (true) and that Warp "directly routes" requests to the provider (misleading: the Warp Agent harness is server-hosted, so traffic does transit Warp's backend while the key is used in-flight per request).

This commit:

- Replaces "directly route" language with an explicit 3-step data flow.
- Adds a "Why does the request route through Warp's backend?" note explaining the server-side harness.
- Adds a sentence to the ZDR section noting BYOK request bodies are not retained, used for training, or logged for analytics.
- Tightens the diagram alt text and intro paragraph to remove the same "directly" ambiguity.

Co-Authored-By: Oz <oz-agent@warp.dev>

vercel · 2026-05-26T18:07:56Z

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| docs | Ready | Preview, Comment | May 27, 2026 6:38pm |

oz-for-oss · 2026-05-26T18:08:02Z

@hongyi-chen

I'm starting a first review of this pull request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

oz-for-oss

Overview

This PR clarifies the BYOK documentation by explaining that BYOK agent requests transit Warp's backend and that the user-provided provider key is used in flight. The added data-flow description is useful, but one new privacy/retention statement is broader than the existing privacy documentation supports.

Concerns

The new ZDR section says Warp does not retain the BYOK request body or log it for analytics, but the privacy documentation describes account-level telemetry and plan settings that can affect AI interaction collection. This should be scoped before merge.

Verdict

Found: 0 critical, 1 important, 0 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Replace specific feature list (Codebase Context, Rules, Secret Redaction, multi-step tool orchestration) with a more general 'Warp's agent harness' reference. Keeps the explanation accurate without enumerating internals that may evolve.

Co-Authored-By: Oz <oz-agent@warp.dev>

Issue #11681 reported that the privacy framing on the Custom Inference endpoint page was misleading: requests are server-hosted through warp-server, so traffic does transit Warp's backend even though the endpoint URL and API key are stored locally on the client.

This commit narrows and corrects the privacy claim on the Custom Inference endpoint page, mirroring the BYOK rewrite already in this PR:

- Replace the blanket 'never synced to the cloud' wording for endpoint URLs with a narrower, accurate claim: API keys are never synced or stored on Warp's servers; endpoint URLs and model identifiers may appear in Warp's usage telemetry, but API keys never do.
- Add an explicit 3-step request flow (harness assembles -> in-flight key authenticates the call -> response streams back) so the server-side path is no longer surprising.
- Add a 'Why does the request route through Warp's backend?' callout matching the BYOK page.
- Tighten the ZDR section to note that prompts/responses transit Warp's backend without being used for training, and scope the existing retention bullets to the provider side.

Also align the BYOK headline claim with the same wording ('never synced or stored on Warp's servers') so both pages converge on a single phrasing.

Confirmed against warp-server:

- logic/ai/llm/custom_endpoint/client.go:14-21 - the OpenAI-compatible client is constructed server-side using hostConfig.CustomEndpointAPIKey() and hostConfig.CustomEndpointBaseURL() from the request, not from persistent server config.
- logic/ai/llm/user_api_keys/util.go:7 - keys arrive per-request via Request_Settings_ApiKeys.

Co-Authored-By: Oz <oz-agent@warp.dev>

Per review feedback, simplify the Custom Inference endpoint privacy framing to a single durable claim (API keys are never synced or stored on Warp's servers) without adding a separate caveat about endpoint URL or model identifier telemetry.

Co-Authored-By: Oz <oz-agent@warp.dev>

petradonka · 2026-05-27T10:43:34Z

 ## How BYOK works

-When you add your own model API keys in Warp, those keys are stored **locally on your device** and are **never synced to the cloud**.
+When you add your own model API keys in Warp, those keys are stored **locally on your device** (in your OS keychain or equivalent secure storage) and are **never synced or stored on Warp's servers**.


Suggested change

-When you add your own model API keys in Warp, those keys are stored **locally on your device** (in your OS keychain or equivalent secure storage) and are **never synced or stored on Warp's servers**.
+When you add your own model API keys in Warp, those keys are stored **locally on your device** (in your OS keychain or equivalent secure storage) and are **never retained on Warp's servers**.

I think people don't understand or care about the sync part. What we want to get across is that your key may pass through our servers but is not stored there.

petradonka · 2026-05-27T10:44:16Z

-Warp uses these API keys when routing your agent requests to the model provider you've configured.
+When you send a prompt using a model with the **key icon**:
+
+1. Warp's agent harness on Warp's backend assembles the request from your prompt and conversation context.


This sounds like it happens on our servers, is that right? I'd assume it happens locally.

If this is true, I'd perhaps reshuffle this to:

1. Locally, we put the request together, including your API key.
2. We send it to our servers, which assemble the whole prompt and make the request to the model provider.
3. The response streams back from the provider through our servers to you.

petradonka · 2026-05-27T10:45:59Z

+2. Your API key is sent up alongside that request and used in-flight to authenticate the call to your chosen model provider (Anthropic, OpenAI, or Google).
+3. The provider's response streams back through Warp's backend to your client.
+
+Your API key is held in memory only for the duration of each request — Warp never writes it to disk or to any database.


I'd perhaps rather say your API key passes through our servers but it is not stored there.

Per Petra's review feedback: the previous phrasing 'stored locally on your device and never synced or stored on Warp's servers' technically holds but implies too strongly that the API key never leaves the user's machine. The key does transit Warp's backend in-flight per request (see the 3-step flow further down each page).

Reframe the headline storage claim on both pages to focus on what the key is for instead of where it isn't: it is stored only on the user's device and used to authenticate requests to the model provider / configured endpoint. The downstream 3-step flow and 'Why does the request route through Warp's backend?' callout remain unchanged and continue to explain the actual transit path.

Co-Authored-By: Oz <oz-agent@warp.dev>

Per the remaining review feedback on PR #138:

- Replace 'stored only on your device' headline claim with explicit language that the key passes through Warp's servers but is not stored there, mirroring Petra's preferred phrasing.
- Reshuffle the 3-step flow so step 1 is local (client pulls the key from secure storage and sends it up) and step 2 explicitly states that the agent harness runs on Warp's backend, answering Petra's question about where assembly happens.
- Reword the 'held in memory' sentence to use the same 'passes through but is not stored' framing.

Same changes applied in parallel to the Custom Inference Endpoint page.

Co-Authored-By: Oz <oz-agent@warp.dev>
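
For readers following the reshuffled flow, the ordering can be sketched end to end in a few lines of Python. This is an illustration under stated assumptions, not Warp's client or server code: `client_build_request`, `backend_handle`, `fake_provider`, the `keychain` dictionary and the key value `sk-demo` are all hypothetical, and the real harness assembles far more context than a single string.

```python
from typing import Callable, Iterator

# Hypothetical provider call: takes (api_key, prompt) and yields response chunks.
Provider = Callable[[str, str], Iterator[str]]


def client_build_request(prompt: str, keychain: dict) -> dict:
    """Step 1 (local): read the key from secure storage and attach it."""
    return {"prompt": prompt, "api_key": keychain["provider_api_key"]}


def backend_handle(request: dict, context: str, provider: Provider) -> Iterator[str]:
    """Steps 2 and 3 (backend): assemble the prompt, call the provider, relay chunks."""
    full_prompt = f"{context}\n{request['prompt']}"
    # The key is used for this call only; nothing here writes it to storage.
    for chunk in provider(request["api_key"], full_prompt):
        yield chunk


def fake_provider(api_key: str, prompt: str) -> Iterator[str]:
    """Test double that rejects a wrong key and echoes the prompt word by word."""
    if api_key != "sk-demo":
        raise PermissionError("invalid API key")
    for word in prompt.split():
        yield word
```

A caller would build the request with `client_build_request(...)` and iterate over `backend_handle(...)` to receive chunks as they arrive. The key appears only as a field on the in-flight request and as an argument to the provider call, which matches the "passes through but is not stored" wording.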

hongyi-chen and others added 2 commits May 26, 2026 11:05

Merge remote-tracking branch 'origin/main' into hongyichen/clarify-by…

a2f14e0

…ok-data-flow # Conflicts: # src/content/docs/agent-platform/inference/bring-your-own-api-key.mdx

cla-bot Bot added the cla-signed label May 26, 2026

vercel Bot deployed to Preview May 26, 2026 18:10 View deployment

oz-for-oss Bot reviewed May 26, 2026

Comment thread src/content/docs/agent-platform/inference/bring-your-own-api-key.mdx Outdated

Update src/content/docs/agent-platform/inference/bring-your-own-api-k…

7e785b2

…ey.mdx Co-authored-by: oz-for-oss[bot] <277970191+oz-for-oss[bot]@users.noreply.github.com>

vercel Bot deployed to Preview May 26, 2026 18:19 View deployment

vercel Bot deployed to Preview May 26, 2026 18:22 View deployment

Merge branch 'main' into hongyichen/clarify-byok-data-flow

67b37db

hongyi-chen requested a review from petradonka May 26, 2026 18:53

vercel Bot deployed to Preview May 26, 2026 18:55 View deployment

hongyi-chen changed the title ~~docs: clarify BYOK request path and data flow~~ docs: clarify BYOK + Custom Inference request path and data flow May 26, 2026

vercel Bot deployed to Preview May 26, 2026 23:04 View deployment

vercel Bot deployed to Preview May 26, 2026 23:08 View deployment

petradonka reviewed May 27, 2026

vercel Bot deployed to Preview May 27, 2026 16:03 View deployment

Merge branch 'main' into hongyichen/clarify-byok-data-flow

d68e04a

vercel Bot deployed to Preview May 27, 2026 17:18 View deployment

vercel Bot deployed to Preview May 27, 2026 18:38 View deployment

hongyi-chen wants to merge 10 commits into main from hongyichen/clarify-byok-data-flow

3 participants
