Prism Gateway docs rewrite - Phase 1 & 2 by NVJKKartik · Pull Request #553 · future-agi/docs

NVJKKartik · 2026-04-02T12:29:15Z

Summary

First batch of the Prism Gateway docs rewrite (6 pages out of 41 planned). Covers foundation pages and core API.

Pages rewritten/added

How it works (rewrite) - full plugin pipeline with exact priority numbers, cache short-circuit mechanics, multi-tenancy, config hierarchy, hot-reload
Supported providers (update) - 19 cloud + 5 self-hosted providers, 4-tab strategy (Prism SDK | OpenAI SDK | LiteLLM | cURL)
Endpoints overview (new) - all 97 API endpoints across 20+ categories
Chat completions (new) - primary endpoint with streaming, function calling (full 2-turn), vision, structured outputs. Absorbs streaming.mdx content.
Quickstart (rewrite) - OpenAI SDK first ("change 2 lines"), all 4 tabs, tested against live gateway
Virtual keys & access control (new) - key properties, BYOK vs managed, admin API, RBAC, IP ACL (3 layers), access groups

Structural changes

Sidebar restructured: Concepts, Providers, API Reference, Routing, Safety & Policy, Performance, Cost & Observability, Agentic, Deployment
All code examples tested against live gateway (gateway.futureagi.com with Gemini models)
LiteLLM requires /v1 in base_url (discovered during testing, fixed everywhere)
Every page passed docs-architect review + humanizer review

What's next

35 more pages across Phases 3-6. Tracked in Notion.

Test plan

Verify all 6 pages render correctly
Verify new sidebar structure shows correct groups
Click all Card links to confirm no 404s
Spot-check code examples (basic completion, streaming, function calling)

- Complete pipeline with exact priority numbers (10-999) - Pre-request vs post-response split with sequential/parallel distinction - Cache short-circuit mechanics (exact vs semantic hit behavior) - Virtual key properties (RBAC, BYOK, credits, IP restrictions, etc.) - Multi-tenancy isolation model - Config hierarchy: request headers > key > org > global - Hot-reload with SHA-256 change detection and Redis pub/sub - Streaming behavior (pre-plugins → stream → post-plugins after final chunk)

- Add Hugging Face, Anyscale, Replicate to cloud providers table - Add LocalAI to self-hosted providers table - New tab strategy: Prism SDK | OpenAI SDK | LiteLLM | cURL for inference - Dashboard | Python | TypeScript for config/management - Provider health section with circuit breaker flow - Fix model name to real Anthropic model ID

- New page: /docs/prism/api/endpoints with all 97 endpoints across 20+ categories - Restructure sidebar: Concepts, Providers, API Reference, Routing, Safety & Policy, Performance, Cost & Observability, Agentic, Deployment - Add Quickstart to top-level nav - Rename "Core Concepts" to "How it works", "Manage Providers" to "Supported providers" - Nav entries only added for pages that exist (incremental approach)

- New page: /docs/prism/api/chat with 4-tab examples (Prism SDK, OpenAI SDK, LiteLLM, cURL) for basic, streaming, function calling (full 2-turn), and vision - Request/response body schemas, SSE streaming format, response headers table - Fix LiteLLM base_url: needs /v1 suffix (tested against live gateway) - All code examples tested against gateway.futureagi.com with Gemini models - Add Chat completions to nav under API Reference

- Lead with OpenAI SDK base_url swap (2-line change) - Add LiteLLM tab to all examples - Response headers example using with_raw_response (tested against live gateway) - Remove error responses section (moves to error handling guide) - Remove Prism SDK install as step 1 - framework note at bottom instead - Provider switching example with OpenAI/Anthropic/Gemini

- Key properties table with all fields from actual APIKey struct - BYOK vs managed key types with credit balance - Admin API examples (create, list, revoke, add credits) matching registered routes - Per-key guardrail overrides with YAML config example - RBAC: roles, teams, wildcard permissions, resolution order with concrete metadata example - IP ACL: 3 layers (global, per-org, per-key) with config/API examples - Access groups for logical model grouping - All architect review fixes applied

…pattern Move topic-domain groups (Providers, API Reference, Routing, Safety, Performance, Cost & Observability, Agentic) into sub-groups under Features. Keeps standard Overview → Quickstart → Concepts → Features → Deployment structure matching all other product sections in the docs.

…aching Routing: - Add complexity-based routing (8 scoring signals, tier mapping) - Add provider lock (sticky routing via header) - Add adaptive strategy details (learning phase, weight smoothing) - Add race/fastest strategy config (max_concurrent, cancel_delay, billing warning) - Fix model names to latest (claude-sonnet-4-6) - Update tabs to Dashboard | Python (Prism SDK) | TypeScript (Prism SDK) Caching: - Fix duplicate About section and duplicate cache modes section - Clean structure: About, When to use, Config, Namespaces, Per-request control, Backends - Add namespace header example and per-request tabs (Prism SDK, OpenAI SDK, cURL) - Clarify exact vs semantic hit cost behavior

…credits - Fix wrong claim "per-key not supported" - per-key RPM/TPM is fully supported - Add 3-level rate limiting (global, per-org, per-key) - Add budgets section (daily/weekly/monthly/total, hard/soft limits) - Add managed key credits (USD balance, auto-deduction, add credits API) - 4-tab examples for retry logic - Update nav title to "Rate limiting & budgets"

…erride values

…rategy - Routing: remove em dash, drop "maximize", deduplicate circuit breaker prose - Caching: replace "cross-contamination" with plainer phrasing - Rate limiting: replace config.yaml tab with TypeScript (Prism SDK) per tab strategy, move YAML to separate block below tabs

…ross-links - Update all tab labels to match strategy (Dashboard | Python (Prism SDK) | TypeScript (Prism SDK)) - Add fail-open vs fail-closed explanation after enforcement modes - Update Next steps cards with relevant cross-links - Remove screenshot references to non-existent dashboard images

…rism SDK)

New page: api/headers.mdx - complete reference for all x-prism-* request and response headers, response.prism SDK accessors, create_headers() usage. Updated: concepts/configuration.mdx - added config hierarchy explanation, model mapping section, GatewayConfig reference table, standard tabs (Dashboard/Python/TypeScript), self-hosted YAML examples. Nav: added headers page under API Reference.

New page: api/embeddings.mdx - embeddings endpoint with 4-tab examples (Prism SDK, OpenAI SDK, LiteLLM, cURL), batch embeddings, reduced dimensions, encoding format. Reranking endpoint with Prism SDK and cURL examples, parameters table, response format. RAG pipeline example showing embed → search → rerank → generate flow. Caching section. Nav: added under API Reference.

New page: api/media.mdx - TTS, speech-to-text, audio translation, and image generation. All sections have 3-tab examples (Prism SDK, OpenAI SDK, cURL). Parameter tables, supported models, response formats. LiteLLM tabs intentionally omitted - audio/image support is inconsistent.

New page: api/assistants.mdx - full OpenAI Assistants API proxy docs. Covers assistants, threads, messages, runs with endpoint tables. Examples: quick start flow, tool use with submit_tool_outputs, file search with vector stores, streaming runs. Notes on what Prism adds (cost tracking, rate limiting, logging) and limitations (no routing/ failover since threads are stored on OpenAI).

New page: api/files.mdx - file upload/list/delete, vector store CRUD, batch file uploads, vector store search, file type reference. All examples use OpenAI SDK since files are stored on OpenAI's servers.

New page: api/async-batch.mdx - async inference with polling, scheduled completions, OpenAI Batch API with JSONL input/output. Decision table for sync vs async vs scheduled vs batch.

Fixed metadata format (JSON, not key=value). Standardized tabs to Prism SDK | OpenAI SDK | cURL for inference, Dashboard | Python | TS for config. Added response.prism.cost accessor, client.current_cost, SDK analytics methods. Fixed heading casing. Cross-linked to rate limiting page for budget enforcement.

Fixed model names (claude-sonnet-4-6), lowercase headings, removed card icons, tightened limitations section, restructured config sections.

Removed card icons, updated cross-links to include virtual keys and endpoints pages. Content was already comprehensive.

New: guides/errors.mdx - error format, HTTP status codes, common errors with fixes, retry strategies (Prism SDK, OpenAI SDK, manual), SDK exception hierarchy, retry decision table. New: guides/troubleshooting.mdx - debug checklist using x-prism-* headers, step-by-step fixes for model not found, provider 404, slow responses, cache misses, guardrail blocks, rate limits, cost issues, failover problems. Nav: added Guides section with both pages.

New: features/observability.mdx - request logging, distributed tracing, Prometheus metrics table, OpenTelemetry config, session tracking. New: features/self-hosted-models.mdx - Ollama, vLLM, LM Studio config, hybrid routing patterns (cost-based, failover, complexity-based). New: admin/organizations.mdx - org settings, member roles, API key management, multi-tenancy patterns. Updated: deployment/self-hosted.mdx - fixed title case, model names, api_key required note for self-hosted providers, private repo note. Nav: added Self-hosted models, Observability, Admin section.

…e_url observability.mdx: Added Prism SDK | OpenAI SDK | cURL tabs for tracing, added user_id and response.prism accessors, concrete session example. assistants.mdx: Moved routing/failover caveat to top About section, added variable context comment in streaming snippet. organizations.mdx: Added control_plane_url explanation comment.

NVJKKartik added 7 commits April 2, 2026 16:09

humanizer fixes: drop promotional phrasing and filler adverbs

eab97c3

NVJKKartik requested a review from hadarishav April 2, 2026 12:29

NVJKKartik added 19 commits April 3, 2026 16:40

fix routing review: clarify adaptive signal weights and complexity ov…

a3f12f9

…erride values

fix rate limiting budget tabs: replace config.yaml with TypeScript (P…

e50369d

…rism SDK)

add files & vector stores page

803298d

New page: api/files.mdx - file upload/list/delete, vector store CRUD, batch file uploads, vector store search, file type reference. All examples use OpenAI SDK since files are stored on OpenAI's servers.

add async & batch processing page

b03fdcf

New page: api/async-batch.mdx - async inference with polling, scheduled completions, OpenAI Batch API with JSONL input/output. Decision table for sync vs async vs scheduled vs batch.

clean up shadow experiments page

6f31402

Fixed model names (claude-sonnet-4-6), lowercase headings, removed card icons, tightened limitations section, restructured config sections.

clean up MCP & A2A page

855bc84

Removed card icons, updated cross-links to include virtual keys and endpoints pages. Content was already comprehensive.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Prism Gateway docs rewrite - Phase 1 & 2#553

Prism Gateway docs rewrite - Phase 1 & 2#553
NVJKKartik wants to merge 26 commits intoastrofrom
gateway-docs

NVJKKartik commented Apr 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

NVJKKartik commented Apr 2, 2026

Summary

Pages rewritten/added

Structural changes

What's next

Test plan

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant