Open
- Complete pipeline with exact priority numbers (10-999)
- Pre-request vs post-response split with sequential/parallel distinction
- Cache short-circuit mechanics (exact vs semantic hit behavior)
- Virtual key properties (RBAC, BYOK, credits, IP restrictions, etc.)
- Multi-tenancy isolation model
- Config hierarchy: request headers > key > org > global
- Hot-reload with SHA-256 change detection and Redis pub/sub
- Streaming behavior (pre-plugins → stream → post-plugins after final chunk)
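The hot-reload item above mentions SHA-256 change detection. A minimal sketch of that idea follows, assuming the gateway compares a digest of the raw config bytes against the last applied one. `ConfigWatcher` and `should_reload` are hypothetical names for illustration, not Prism's actual API, and the Redis pub/sub fan-out is not shown.

```python
import hashlib
from typing import Optional


def config_digest(raw: bytes) -> str:
    """Return the SHA-256 hex digest of the raw config bytes."""
    return hashlib.sha256(raw).hexdigest()


class ConfigWatcher:
    """Hypothetical helper: remembers the digest of the last applied config."""

    def __init__(self) -> None:
        self._last_digest: Optional[str] = None

    def should_reload(self, raw: bytes) -> bool:
        """Return True only when the config content actually changed."""
        digest = config_digest(raw)
        if digest == self._last_digest:
            return False  # identical bytes: skip the reload
        self._last_digest = digest
        return True


watcher = ConfigWatcher()
print(watcher.should_reload(b"rate_limit: 100"))  # first load
print(watcher.should_reload(b"rate_limit: 100"))  # unchanged content
print(watcher.should_reload(b"rate_limit: 200"))  # edited content
```

Hashing the content rather than checking a modification time means a file that is rewritten with identical bytes does not trigger a reload.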

- Add Hugging Face, Anyscale, Replicate to cloud providers table
- Add LocalAI to self-hosted providers table
- New tab strategy: Prism SDK | OpenAI SDK | LiteLLM | cURL for inference; Dashboard | Python | TypeScript for config/management
- Provider health section with circuit breaker flow
- Fix model name to real Anthropic model ID
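For readers unfamiliar with the circuit breaker flow mentioned above, here is a generic sketch of the closed → open → half-open cycle. The thresholds, state names, and the `CircuitBreaker` class are assumptions for illustration. The commit does not state how Prism implements provider health, so treat this as the textbook pattern rather than the gateway's behavior.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Generic circuit breaker (hypothetical; not Prism's implementation)."""

    def __init__(
        self,
        failure_threshold: int = 3,
        cooldown_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means the circuit is closed

    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.cooldown_seconds:
            return "half-open"  # cooldown elapsed: allow a probe request
        return "open"

    def allow_request(self) -> bool:
        return self.state() != "open"

    def record_success(self) -> None:
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        if self.state() == "half-open":
            # The probe failed: re-open and restart the cooldown.
            self._opened_at = self._clock()
            return
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()
```

A router would call `allow_request()` before sending traffic to a provider and report the outcome with `record_success()` or `record_failure()`.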

- New page: /docs/prism/api/endpoints with all 97 endpoints across 20+ categories
- Restructure sidebar: Concepts, Providers, API Reference, Routing, Safety & Policy, Performance, Cost & Observability, Agentic, Deployment
- Add Quickstart to top-level nav
- Rename "Core Concepts" to "How it works", "Manage Providers" to "Supported providers"
- Nav entries only added for pages that exist (incremental approach)

- New page: /docs/prism/api/chat with 4-tab examples (Prism SDK, OpenAI SDK, LiteLLM, cURL) for basic, streaming, function calling (full 2-turn), and vision
- Request/response body schemas, SSE streaming format, response headers table
- Fix LiteLLM base_url: needs /v1 suffix (tested against live gateway)
- All code examples tested against gateway.futureagi.com with Gemini models
- Add Chat completions to nav under API Reference
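The LiteLLM fix above comes down to making sure the base URL ends in `/v1`. A small sketch of a guard for that is shown below. `with_v1_suffix` is a hypothetical helper, and the `https://` scheme is an assumption; only the host name comes from the commit message.

```python
def with_v1_suffix(base_url: str) -> str:
    """Return base_url with exactly one trailing /v1 path segment."""
    trimmed = base_url.rstrip("/")
    if trimmed.endswith("/v1"):
        return trimmed
    return trimmed + "/v1"


# Scheme is assumed; the host is the gateway named in the commit message.
print(with_v1_suffix("https://gateway.futureagi.com"))
print(with_v1_suffix("https://gateway.futureagi.com/v1/"))
```

The resulting string is what would be passed as `base_url` to a client that expects an OpenAI-compatible root.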

- Lead with OpenAI SDK base_url swap (2-line change)
- Add LiteLLM tab to all examples
- Response headers example using with_raw_response (tested against live gateway)
- Remove error responses section (moves to error handling guide)
- Remove Prism SDK install as step 1; framework note at bottom instead
- Provider switching example with OpenAI/Anthropic/Gemini

- Key properties table with all fields from actual APIKey struct
- BYOK vs managed key types with credit balance
- Admin API examples (create, list, revoke, add credits) matching registered routes
- Per-key guardrail overrides with YAML config example
- RBAC: roles, teams, wildcard permissions, resolution order with concrete metadata example
- IP ACL: 3 layers (global, per-org, per-key) with config/API examples
- Access groups for logical model grouping
- All architect review fixes applied

…pattern

Move topic-domain groups (Providers, API Reference, Routing, Safety, Performance, Cost & Observability, Agentic) into sub-groups under Features. Keeps the standard Overview → Quickstart → Concepts → Features → Deployment structure matching all other product sections in the docs.

…aching

Routing:
- Add complexity-based routing (8 scoring signals, tier mapping)
- Add provider lock (sticky routing via header)
- Add adaptive strategy details (learning phase, weight smoothing)
- Add race/fastest strategy config (max_concurrent, cancel_delay, billing warning)
- Fix model names to latest (claude-sonnet-4-6)
- Update tabs to Dashboard | Python (Prism SDK) | TypeScript (Prism SDK)

Caching:
- Fix duplicate About section and duplicate cache modes section
- Clean structure: About, When to use, Config, Namespaces, Per-request control, Backends
- Add namespace header example and per-request tabs (Prism SDK, OpenAI SDK, cURL)
- Clarify exact vs semantic hit cost behavior

…credits

- Fix wrong claim "per-key not supported": per-key RPM/TPM is fully supported
- Add 3-level rate limiting (global, per-org, per-key)
- Add budgets section (daily/weekly/monthly/total, hard/soft limits)
- Add managed key credits (USD balance, auto-deduction, add credits API)
- 4-tab examples for retry logic
- Update nav title to "Rate limiting & budgets"

…rategy

- Routing: remove em dash, drop "maximize", deduplicate circuit breaker prose
- Caching: replace "cross-contamination" with plainer phrasing
- Rate limiting: replace config.yaml tab with TypeScript (Prism SDK) per tab strategy, move YAML to separate block below tabs

…ross-links

- Update all tab labels to match strategy (Dashboard | Python (Prism SDK) | TypeScript (Prism SDK))
- Add fail-open vs fail-closed explanation after enforcement modes
- Update Next steps cards with relevant cross-links
- Remove screenshot references to non-existent dashboard images

New page: api/headers.mdx - complete reference for all x-prism-* request and response headers, response.prism SDK accessors, create_headers() usage.
Updated: concepts/configuration.mdx - added config hierarchy explanation, model mapping section, GatewayConfig reference table, standard tabs (Dashboard/Python/TypeScript), self-hosted YAML examples.
Nav: added headers page under API Reference.
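An earlier commit in this PR gives the config hierarchy as request headers > key > org > global. A rough sketch of resolving settings in that order is shown below. The setting names (`cache_ttl`, `timeout`, `retries`, `model`) and the `resolve_config` function are made up for illustration and are not taken from GatewayConfig.

```python
from collections import ChainMap
from typing import Any, Mapping


def resolve_config(
    request_headers: Mapping[str, Any],
    key: Mapping[str, Any],
    org: Mapping[str, Any],
    global_: Mapping[str, Any],
) -> dict:
    """Merge the four layers; earlier arguments win over later ones."""
    # ChainMap looks keys up left to right, so the first layer that
    # defines a setting supplies its value.
    return dict(ChainMap(request_headers, key, org, global_))


effective = resolve_config(
    request_headers={"cache_ttl": 0},
    key={"cache_ttl": 60, "timeout": 10},
    org={"timeout": 30, "retries": 2},
    global_={"retries": 0, "model": "default"},
)
print(effective)
```

Each setting is taken from the most specific layer that defines it, so a per-request override never needs to restate the rest of the configuration.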
New page: api/embeddings.mdx - embeddings endpoint with 4-tab examples (Prism SDK, OpenAI SDK, LiteLLM, cURL), batch embeddings, reduced dimensions, encoding format. Reranking endpoint with Prism SDK and cURL examples, parameters table, response format. RAG pipeline example showing embed → search → rerank → generate flow. Caching section. Nav: added under API Reference.
New page: api/media.mdx - TTS, speech-to-text, audio translation, and image generation. All sections have 3-tab examples (Prism SDK, OpenAI SDK, cURL). Parameter tables, supported models, response formats. LiteLLM tabs intentionally omitted - audio/image support is inconsistent.
New page: api/assistants.mdx - full OpenAI Assistants API proxy docs. Covers assistants, threads, messages, runs with endpoint tables. Examples: quick start flow, tool use with submit_tool_outputs, file search with vector stores, streaming runs. Notes on what Prism adds (cost tracking, rate limiting, logging) and limitations (no routing/failover since threads are stored on OpenAI).
New page: api/files.mdx - file upload/list/delete, vector store CRUD, batch file uploads, vector store search, file type reference. All examples use OpenAI SDK since files are stored on OpenAI's servers.
New page: api/async-batch.mdx - async inference with polling, scheduled completions, OpenAI Batch API with JSONL input/output. Decision table for sync vs async vs scheduled vs batch.
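Async inference with polling generally follows a submit-then-poll loop. The sketch below shows only the polling half, under stated assumptions: `fetch_status` is a hypothetical callable standing in for whatever request returns the job state, and the status strings (`completed`, `failed`) are placeholders rather than values confirmed by the Prism docs.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[], dict],
    interval: float = 1.0,
    timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call fetch_status until the job reaches a terminal state or times out."""
    deadline = clock() + timeout
    while True:
        job = fetch_status()
        # Assumed terminal states; substitute the real ones from the API.
        if job.get("status") in ("completed", "failed"):
            return job
        if clock() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(interval)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting.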
Fixed metadata format (JSON, not key=value). Standardized tabs to Prism SDK | OpenAI SDK | cURL for inference, Dashboard | Python | TS for config. Added response.prism.cost accessor, client.current_cost, SDK analytics methods. Fixed heading casing. Cross-linked to rate limiting page for budget enforcement.
Fixed model names (claude-sonnet-4-6), lowercase headings, removed card icons, tightened limitations section, restructured config sections.
Removed card icons, updated cross-links to include virtual keys and endpoints pages. Content was already comprehensive.

New: guides/errors.mdx - error format, HTTP status codes, common errors with fixes, retry strategies (Prism SDK, OpenAI SDK, manual), SDK exception hierarchy, retry decision table.
New: guides/troubleshooting.mdx - debug checklist using x-prism-* headers, step-by-step fixes for model not found, provider 404, slow responses, cache misses, guardrail blocks, rate limits, cost issues, failover problems.
Nav: added Guides section with both pages.
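As a point of reference for the manual retry strategy mentioned above, here is a sketch of exponential backoff with jitter. The retryable status set is a common convention (429 plus 5xx), not the retry decision table from guides/errors.mdx, and `send` is a hypothetical callable returning a `(status, body)` tuple.

```python
import random
import time
from typing import Callable, Tuple

# Assumed retryable statuses; check the error guide for the real table.
RETRYABLE_STATUS = {429, 500, 502, 503, 504}


def is_retryable(status: int) -> bool:
    return status in RETRYABLE_STATUS


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 8.0) -> float:
    """Exponential delay for a zero-based attempt index, capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))


def call_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 4,
    sleep: Callable[[float], None] = time.sleep,
    jitter: Callable[[], float] = random.random,
) -> Tuple[int, str]:
    """Call send() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if not is_retryable(status) or attempt == max_attempts - 1:
            return status, body
        # Scale the delay by a factor in [0.5, 1.0] to spread out retries.
        sleep(backoff_delay(attempt) * (0.5 + jitter() / 2))
    raise AssertionError("unreachable")
```

Client errors such as 400 or 401 return immediately, since repeating the same request would fail the same way.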

New: features/observability.mdx - request logging, distributed tracing, Prometheus metrics table, OpenTelemetry config, session tracking.
New: features/self-hosted-models.mdx - Ollama, vLLM, LM Studio config, hybrid routing patterns (cost-based, failover, complexity-based).
New: admin/organizations.mdx - org settings, member roles, API key management, multi-tenancy patterns.
Updated: deployment/self-hosted.mdx - fixed title case, model names, api_key required note for self-hosted providers, private repo note.
Nav: added Self-hosted models, Observability, Admin section.

…e_url

observability.mdx: Added Prism SDK | OpenAI SDK | cURL tabs for tracing, added user_id and response.prism accessors, concrete session example.
assistants.mdx: Moved routing/failover caveat to top About section, added variable context comment in streaming snippet.
organizations.mdx: Added control_plane_url explanation comment.
Summary
First batch of the Prism Gateway docs rewrite (6 pages out of 41 planned). Covers foundation pages and core API.
Pages rewritten/added
Structural changes
/v1 in base_url (discovered during testing, fixed everywhere)
What's next
35 more pages across Phases 3-6. Tracked in Notion.
Test plan