From f4065d8e2171cfef96f2ad971cdba61dc98f6114 Mon Sep 17 00:00:00 2001 From: sk-portkey Date: Wed, 1 Apr 2026 02:41:33 +0530 Subject: [PATCH 1/2] chore: docs version 2.5.0 --- changelog/enterprise.mdx | 81 ++++++++++++++++- integrations/guardrails/lasso.mdx | 16 +++- integrations/llms/mistral-ai.mdx | 86 +++++++++++++++++++ .../enterprise-offering/budget-policies.mdx | 7 +- .../org-management/jwt.mdx | 15 ++++ .../guardrails/list-of-guardrail-checks.mdx | 1 + product/observability/metadata.mdx | 14 +++ 7 files changed, 216 insertions(+), 4 deletions(-) diff --git a/changelog/enterprise.mdx b/changelog/enterprise.mdx index 1d907672..daa39685 100644 --- a/changelog/enterprise.mdx +++ b/changelog/enterprise.mdx @@ -1,6 +1,6 @@ --- title: "Enterprise Gateway" -sidebarTitle: "Enterprise Gateway [2.4.4]" +sidebarTitle: "Enterprise Gateway [2.5.0]" rss: true --- @@ -8,6 +8,85 @@ rss: true Discuss how Portkey's AI Gateway can enhance your organization's AI infrastructure + + +## v2.5.0 +--- + +### Gateway-Local JWT Authentication + +The gateway can now validate JWT tokens locally without calling the control plane, reducing authentication latency for self-hosted deployments. When `JWT_ENABLED=ON` is set, JWTs are decoded, verified against the configured JWKS, and cached locally until expiry. The gateway resolves organization, workspace, and deployment details from its local data sync, making JWT authentication fully independent of the control plane at request time. 
+ +- Supports all existing JWT claims: `portkey_oid`, `portkey_workspace`, `scope`, `defaults`, `usage_limits`, `rate_limits` +- Organisation ID can be resolved from the token (`portkey_oid` / `organisation_id`) or from `ORGANISATIONS_TO_SYNC` when a single org is configured +- JWT-authenticated requests work with rate limit and usage limit policies + +[JWT Authentication Documentation](/product/enterprise-offering/org-management/jwt) + +### Headers-to-Metadata Injection + +New `HEADERS_TO_METADATA` environment variable allows automatically injecting request header values into metadata. This enables enriching observability data with upstream context (e.g., caller identity, trace IDs, or environment tags) without requiring clients to set `x-portkey-metadata`. + +Configure with a comma-separated list of header names: +``` +HEADERS_TO_METADATA=x-request-id,x-caller-service,x-environment +``` + +Header values are matched case-insensitively and injected into the request metadata alongside any existing metadata from `x-portkey-metadata`. + +[Metadata Documentation](/product/observability/metadata) + +### Rate Limit Policies: Weekly Window and Endpoint Type Conditions + +- **Requests per week (`rpw`)**: Rate limit policies now support a weekly window in addition to per-minute, per-hour, and per-day +- **`endpoint_type` condition**: Rate limit policies can now target specific endpoint types (e.g., `chatComplete`, `embed`, `complete`) using the `endpoint_type` condition key + +[Usage & Rate Limit Policies Documentation](/product/enterprise-offering/budget-policies) + +### Mistral AI: `response_format` Support + +Added `response_format` parameter support for Mistral AI, enabling structured JSON output and strict JSON schema enforcement. Previously, the gateway silently dropped this parameter for Mistral requests. 
+ +[Mistral AI Documentation](/integrations/llms/mistral-ai) + +### New Guardrail: Required Metadata Key-Value Pairs + +Added a new native guardrail that validates specific metadata key-value pairs before processing a request. Supports `all`, `any`, and `none` operators for flexible matching. + +| Check Name | Parameters | Supported Hooks | +|---|---|---| +| Required Metadata Key-Value Pairs | `metadataPairs` (object), `operator` (all/any/none) | `beforeRequestHook` | + +[Guardrail Checks Documentation](/product/guardrails/list-of-guardrail-checks) + +### Lasso Security Guardrail: v3 API Upgrade + +Upgraded the Lasso Security integration from v2 to v3 Classify API with the following improvements: +- **Findings-based detection**: Responses now include structured findings with name, category, action (BLOCK/AUTO_MASKING/WARN), and severity +- **Session and user tracking**: Supports `sessionId` and `userId` for conversation-level analysis +- **Multi-format support**: Handles `chatComplete`, `messages`, `complete`, and `embed` request types +- **Custom API endpoint**: Supports configurable `apiEndpoint` for self-hosted Lasso deployments + +[Lasso Security Documentation](/integrations/guardrails/lasso) + +### Image Format Support: HEIC and HEIF + +Added `image/heic` and `image/heif` to the list of supported image MIME types for multimodal requests and the Inline Image URLs guardrail. + +### Google Embeddings: `dimensions` Parameter Fix + +Fixed the Google provider's embedding endpoint to correctly map the OpenAI-compatible `dimensions` parameter to Google's `output_dimensionality` parameter. 
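As a rough illustration of the `dimensions` fix noted above, the mapping amounts to renaming one request key before the call reaches Google. This is a minimal sketch, not the gateway's actual code: the helper name `to_google_embedding_params` is hypothetical, and leaving every other key untouched is an assumption made for the example.

```python
def to_google_embedding_params(openai_params: dict) -> dict:
    """Rename the OpenAI-style `dimensions` key to `output_dimensionality`.

    Hypothetical helper for illustration only. Keys other than
    `dimensions` are copied through unchanged (an assumption).
    """
    mapped = dict(openai_params)  # copy so the caller's dict is not mutated
    if "dimensions" in mapped:
        mapped["output_dimensionality"] = mapped.pop("dimensions")
    return mapped


if __name__ == "__main__":
    print(to_google_embedding_params({"dimensions": 256}))
```

A request that omits `dimensions` passes through this sketch unchanged, so callers that never set the parameter are unaffected.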
+ +### Fixes and Improvements +- **Bedrock**: Improved tool calling support — Anthropic-format tool calls in Bedrock now correctly handle tool result content, cache control blocks, and strict role alternation +- **Bedrock**: Tool descriptions are now optional in Bedrock Converse API requests +- **Vertex AI**: Fixed model allowlist checking for proxy context-cache create routes +- **Forward Headers**: Prevented `x-portkey-forward-headers` from being forwarded to prevent infinite request loops +- **Google Vertex**: Fixed schema conversion to correctly move type-specific properties (items, properties, etc.) inside `anyOf` branches +- Updated model pricing: Bedrock (`zai.glm-5`, `minimax.minimax-m2.5`), DeepInfra (`Qwen/Qwen3.5-122B-A10B`), xAI (`grok-4.20-reasoning-latest`) + + + ## v2.4.4 diff --git a/integrations/guardrails/lasso.mdx b/integrations/guardrails/lasso.mdx index d23b6766..bcc0130a 100644 --- a/integrations/guardrails/lasso.mdx +++ b/integrations/guardrails/lasso.mdx @@ -29,7 +29,7 @@ To get started with Lasso Security, visit their documentation: | Check Name | Description | Parameters | Supported Hooks | |------------|-------------|------------|-----------------| -| Scan Content | Lasso Security's Deputies analyze content for various security risks including jailbreak attempts, custom policy violations, sexual content, hate speech, illegal content, and more. | `Timeout` (number) | `beforeRequestHook` | +| Scan Content | Lasso Security's Deputies analyze content for various security risks including jailbreak attempts, custom policy violations, sexual content, hate speech, illegal content, and more. Returns structured findings with action levels (BLOCK, AUTO_MASKING, WARN). | `Timeout` (number) | `beforeRequestHook` | @@ -127,6 +127,20 @@ Lasso Security's Deputies analyze content for various security risks across mult 4. **Custom Policy Violations**: Enforces your organization's specific security policies 5. 
**Harmful Content Detection**: Flags sexual content, hate speech, illegal content, and more +The integration returns structured **findings** for each deputy, including the finding name, category, action level (`BLOCK`, `AUTO_MASKING`, or `WARN`), and severity. Requests are blocked when any finding has a `BLOCK` action. + +### Supported Request Types + +Lasso works across multiple request types: +- **Chat Completions** (`/v1/chat/completions`) +- **Messages** (Anthropic-style) +- **Completions** (`/v1/completions`) +- **Embeddings** (`/v1/embeddings`) + +### Self-Hosted Lasso Endpoint + +If you are running a self-hosted Lasso deployment, you can configure a custom `apiEndpoint` in your Lasso credentials to point to your own instance instead of the default Lasso cloud. + Learn more about Lasso Security's features [here](https://www.lasso.security). ## Get Support diff --git a/integrations/llms/mistral-ai.mdx b/integrations/llms/mistral-ai.mdx index 47719763..cbfa6667 100644 --- a/integrations/llms/mistral-ai.mdx +++ b/integrations/llms/mistral-ai.mdx @@ -167,6 +167,92 @@ Your Codestral requests will show up on Portkey logs with code snippets rendered +## Structured Output (response_format) + +Mistral AI supports structured output via the `response_format` parameter. 
You can enforce JSON schema validation for deterministic, structured responses: + + +```python Python +from portkey_ai import Portkey + +portkey = Portkey( + api_key="PORTKEY_API_KEY", + provider="@mistral-ai" +) + +response = portkey.chat.completions.create( + model="mistral-large-latest", + messages=[{"role": "user", "content": "List 3 capitals with their countries"}], + response_format={ + "type": "json_schema", + "json_schema": { + "name": "capitals", + "strict": True, + "schema": { + "type": "object", + "properties": { + "capitals": { + "type": "array", + "items": { + "type": "object", + "properties": { + "city": {"type": "string"}, + "country": {"type": "string"} + }, + "required": ["city", "country"] + } + } + }, + "required": ["capitals"] + } + } + } +) + +print(response.choices[0].message.content) +``` + +```js Javascript +import Portkey from 'portkey-ai' + +const portkey = new Portkey({ + apiKey: "PORTKEY_API_KEY", + provider: "@mistral-ai" +}) + +const response = await portkey.chat.completions.create({ + model: "mistral-large-latest", + messages: [{ role: "user", content: "List 3 capitals with their countries" }], + response_format: { + type: "json_schema", + json_schema: { + name: "capitals", + strict: true, + schema: { + type: "object", + properties: { + capitals: { + type: "array", + items: { + type: "object", + properties: { + city: { type: "string" }, + country: { type: "string" } + }, + required: ["city", "country"] + } + } + }, + required: ["capitals"] + } + } + } +}) + +console.log(response.choices[0].message.content) +``` + + ## Mistral Tool Calling Tool calling lets models trigger external tools based on conversation context. You define available functions, the model chooses when to use them, and your application executes them and returns results. 
diff --git a/product/enterprise-offering/budget-policies.mdx b/product/enterprise-offering/budget-policies.mdx index 0f2c89ac..b13baee4 100644 --- a/product/enterprise-offering/budget-policies.mdx +++ b/product/enterprise-offering/budget-policies.mdx @@ -44,6 +44,7 @@ Conditions determine **which requests** a policy applies to. All conditions must | `config` | Match by config slug | `"config slug"` | 2.0.0+ | | `prompt` | Match by prompt slug | `"prompt slug"` | 2.0.0+ | | `model` | Match by model (with wildcard support) | `"@openai/gpt-4o"`, `"@anthropic/*"` | 2.0.0+ | +| `endpoint_type` | Match by endpoint type | `"chatComplete"`, `"embed"`, `"complete"` | 2.5.0+ | ### Group By @@ -186,6 +187,7 @@ When a rate limit is exceeded, Portkey returns a **429 Too Many Requests** HTTP - **`rpm`**: Requests/Tokens per minute - **`rph`**: Requests/Tokens per hour - **`rpd`**: Requests/Tokens per day +- **`rpw`**: Requests/Tokens per week ### Parameters @@ -222,6 +224,7 @@ The time interval unit for the rate limit. - `"rpm"` - Requests/Tokens per minute - `"rph"` - Requests/Tokens per hour - `"rpd"` - Requests/Tokens per day + - `"rpw"` - Requests/Tokens per week - **Behaviour**: - Defines the time window over which the rate limit is calculated - Limits reset automatically at the start of each time period @@ -241,7 +244,7 @@ The maximum number of requests or tokens allowed within the specified time unit. 1. **Conditions**: Each condition must have `key` and `value` fields. 2. **Group By**: Each group must have a `key` field. -3. **Valid Keys**: For both `conditions` and `group_by`, valid keys include `api_key`, `virtual_key`, `provider`, `config`, `prompt`, `model`, or any key starting with `metadata.` +3. **Valid Keys**: For both `conditions` and `group_by`, valid keys include `api_key`, `virtual_key`, `provider`, `config`, `prompt`, `model`, `endpoint_type`, or any key starting with `metadata.` 4. **Value**: Must be a numeric value. 5. 
**Workspace**: Workspace ID is required (can be provided via API key or request body). @@ -641,7 +644,7 @@ This resets the entity's usage counter to zero, allowing it to consume the full 4. **Periodic Reset Options**: `"weekly"`, `"monthly"`, or `null` for no reset. -5. **Rate Limit Units**: `"rpm"` (per minute), `"rph"` (per hour), `"rpd"` (per day). +5. **Rate Limit Units**: `"rpm"` (per minute), `"rph"` (per hour), `"rpd"` (per day), `"rpw"` (per week). 6. **Usage Limit Types**: `"cost"` (in dollars) or `"tokens"`. diff --git a/product/enterprise-offering/org-management/jwt.mdx b/product/enterprise-offering/org-management/jwt.mdx index 698ec98a..728a73f2 100644 --- a/product/enterprise-offering/org-management/jwt.mdx +++ b/product/enterprise-offering/org-management/jwt.mdx @@ -526,6 +526,21 @@ main(); ``` +## Gateway-Local JWT Validation (Self-Hosted) + + +Requires gateway **2.5.0** or higher with `JWT_ENABLED=ON` environment variable. + + +Self-hosted enterprise deployments can enable gateway-local JWT validation, which eliminates the need to call the control plane for authentication. When enabled, the gateway: + +1. Decodes the JWT and extracts the organisation ID from `portkey_oid` / `organisation_id` claims (or falls back to `ORGANISATIONS_TO_SYNC` for single-org deployments) +2. Verifies the token signature against the JWKS configured for the organisation +3. Resolves workspace and deployment details from the gateway's local data sync +4. Caches validated tokens locally until expiry + +This reduces authentication latency and makes JWT validation independent of control plane availability at request time. + ## Caching & Token Revocation - JWTs are cached until they expire to reduce validation overhead. 
diff --git a/product/guardrails/list-of-guardrail-checks.mdx b/product/guardrails/list-of-guardrail-checks.mdx index 7e40288b..4be5104f 100644 --- a/product/guardrails/list-of-guardrail-checks.mdx +++ b/product/guardrails/list-of-guardrail-checks.mdx @@ -215,6 +215,7 @@ Basic deterministic guardrails are ideal for quick, hard-coded validations that | **Model Rules** | Allow requests based on metadata-driven rules mapping to allowed models. | `rules`: object, `not`: boolean | Input only | | **Allowed Request Types** | Control which request types (endpoints) can be processed using an allowlist or blocklist. | `allowedTypes`: array, `blockedTypes`: array | Input only | | **Required Metadata Keys** | Checks if the metadata contains all the required keys. | `metadataKeys`: array, `operator`: string | Input only | +| **Required Metadata Key-Value Pairs** | Checks if the metadata contains specified key-value pairs. | `metadataPairs`: object, `operator`: string (all/any/none) | Input only | #### Media Processing | Guardrail Check | Description | Parameters | Supported On | diff --git a/product/observability/metadata.mdx b/product/observability/metadata.mdx index 945144eb..99213f2b 100644 --- a/product/observability/metadata.mdx +++ b/product/observability/metadata.mdx @@ -168,6 +168,20 @@ You can also apply any metadata filters to the logs or analytics and filter data +## Automatic Metadata from Headers (Self-Hosted) + + +Available on **Enterprise** Self Hosting plan. Requires gateway **2.5.0** or higher. + + +For self-hosted deployments, you can automatically inject request header values into metadata using the `HEADERS_TO_METADATA` environment variable. This is useful for enriching observability data with upstream context without requiring clients to set `x-portkey-metadata`. 
+ +``` +HEADERS_TO_METADATA=x-request-id,x-caller-service,x-environment +``` + +When configured, the gateway extracts the specified header values from each incoming request and merges them into the request metadata. Header names are matched case-insensitively. + ## Enterprise Features For enterprise users, Portkey offers advanced metadata governance and lets you define metadata at multiple levels: From 04e9183dbe2988d6a3ae70c3fec631bc1bcd2918 Mon Sep 17 00:00:00 2001 From: sk-portkey Date: Wed, 1 Apr 2026 19:08:19 +0530 Subject: [PATCH 2/2] chore: cleanup --- changelog/enterprise.mdx | 39 +++++++++++++++++++++------------------ 1 file changed, 21 insertions(+), 18 deletions(-) diff --git a/changelog/enterprise.mdx b/changelog/enterprise.mdx index daa39685..53fb51b1 100644 --- a/changelog/enterprise.mdx +++ b/changelog/enterprise.mdx @@ -15,7 +15,13 @@ Discuss how Portkey's AI Gateway can enhance your organization's AI infrastructu ### Gateway-Local JWT Authentication -The gateway can now validate JWT tokens locally without calling the control plane, reducing authentication latency for self-hosted deployments. When `JWT_ENABLED=ON` is set, JWTs are decoded, verified against the configured JWKS, and cached locally until expiry. The gateway resolves organization, workspace, and deployment details from its local data sync, making JWT authentication fully independent of the control plane at request time. + +Requires Backend v1.13.0 or higher for Air Gapped deployments + + +The gateway can now validate JWT tokens locally without calling the control plane, reducing authentication latency for self-hosted deployments. + +Set `JWT_ENABLED=ON` to enable gateway-local JWT authentication. 
- Supports all existing JWT claims: `portkey_oid`, `portkey_workspace`, `scope`, `defaults`, `usage_limits`, `rate_limits` - Organisation ID can be resolved from the token (`portkey_oid` / `organisation_id`) or from `ORGANISATIONS_TO_SYNC` when a single org is configured @@ -32,7 +38,7 @@ Configure with a comma-separated list of header names: HEADERS_TO_METADATA=x-request-id,x-caller-service,x-environment ``` -Header values are matched case-insensitively and injected into the request metadata alongside any existing metadata from `x-portkey-metadata`. +Header values are matched case-insensitively and injected into the request metadata (lower case keys) alongside any existing metadata from `x-portkey-metadata`. [Metadata Documentation](/product/observability/metadata) @@ -43,12 +49,6 @@ Header values are matched case-insensitively and injected into the request metad [Usage & Rate Limit Policies Documentation](/product/enterprise-offering/budget-policies) -### Mistral AI: `response_format` Support - -Added `response_format` parameter support for Mistral AI, enabling structured JSON output and strict JSON schema enforcement. Previously, the gateway silently dropped this parameter for Mistral requests. - -[Mistral AI Documentation](/integrations/llms/mistral-ai) - ### New Guardrail: Required Metadata Key-Value Pairs Added a new native guardrail that validates specific metadata key-value pairs before processing a request. Supports `all`, `any`, and `none` operators for flexible matching. @@ -73,17 +73,20 @@ Upgraded the Lasso Security integration from v2 to v3 Classify API with the foll Added `image/heic` and `image/heif` to the list of supported image MIME types for multimodal requests and the Inline Image URLs guardrail. -### Google Embeddings: `dimensions` Parameter Fix - -Fixed the Google provider's embedding endpoint to correctly map the OpenAI-compatible `dimensions` parameter to Google's `output_dimensionality` parameter. 
- ### Fixes and Improvements -- **Bedrock**: Improved tool calling support — Anthropic-format tool calls in Bedrock now correctly handle tool result content, cache control blocks, and strict role alternation -- **Bedrock**: Tool descriptions are now optional in Bedrock Converse API requests -- **Vertex AI**: Fixed model allowlist checking for proxy context-cache create routes -- **Forward Headers**: Prevented `x-portkey-forward-headers` from being forwarded to prevent infinite request loops -- **Google Vertex**: Fixed schema conversion to correctly move type-specific properties (items, properties, etc.) inside `anyOf` branches -- Updated model pricing: Bedrock (`zai.glm-5`, `minimax.minimax-m2.5`), DeepInfra (`Qwen/Qwen3.5-122B-A10B`), xAI (`grok-4.20-reasoning-latest`) +- **Bedrock**: + - Improved tool calling support — Anthropic-format tool calls in Bedrock now correctly handle tool result content, cache control blocks, and strict role alternation + - Tool descriptions are now optional in Bedrock Converse API requests +- **Vertex AI**: + - Fixed model allowlist checking for proxy context-cache create routes + - Fixed schema conversion for tools +- **Google**: + - Fixed Google provider's embedding endpoint to correctly map the OpenAI-compatible `dimensions` parameter to Google's `output_dimensionality` parameter + - Fixed schema conversion for tools +- **Mistral AI**: + - Added `response_format` parameter support for Mistral AI, enabling structured JSON output and strict JSON schema enforcement. Previously, the gateway silently dropped this parameter for Mistral requests. + - [Mistral AI Documentation](/integrations/llms/mistral-ai) +- Security dependency updates
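The Headers-to-Metadata behavior described in this changelog (case-insensitive header matching, lower-case metadata keys, merging alongside `x-portkey-metadata`) can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the gateway's implementation: the function name `inject_headers_into_metadata` is hypothetical, `x-portkey-metadata` is assumed to carry a JSON object, and the rule that client-supplied metadata wins on a key collision is an assumption the source does not confirm.

```python
import json


def inject_headers_into_metadata(headers: dict, headers_to_metadata: str) -> dict:
    """Merge selected request headers into the request metadata.

    headers: incoming header names mapped to their values.
    headers_to_metadata: raw HEADERS_TO_METADATA value (comma-separated names).
    """
    # Lower-case both the configured names and the incoming names so that
    # matching is case-insensitive and the resulting keys are lower case.
    wanted = [h.strip().lower() for h in headers_to_metadata.split(",") if h.strip()]
    lowered = {name.lower(): value for name, value in headers.items()}

    # Start from metadata the client already sent (assumed to be JSON).
    metadata = json.loads(lowered.get("x-portkey-metadata", "{}"))

    for name in wanted:
        # Assumption: explicit client metadata wins on a key collision.
        if name in lowered and name not in metadata:
            metadata[name] = lowered[name]
    return metadata


if __name__ == "__main__":
    request_headers = {
        "X-Request-Id": "req-123",
        "X-Caller-Service": "billing",
        "x-portkey-metadata": '{"_user": "alice"}',
    }
    config = "x-request-id,x-caller-service,x-environment"
    print(inject_headers_into_metadata(request_headers, config))
```

Headers listed in `HEADERS_TO_METADATA` but absent from the request (here `x-environment`) are simply skipped in this sketch rather than injected as empty values.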