diff --git a/changelog/enterprise.mdx b/changelog/enterprise.mdx
index 1d907672..53fb51b1 100644
--- a/changelog/enterprise.mdx
+++ b/changelog/enterprise.mdx
@@ -1,6 +1,6 @@
 ---
 title: "Enterprise Gateway"
-sidebarTitle: "Enterprise Gateway [2.4.4]"
+sidebarTitle: "Enterprise Gateway [2.5.0]"
 rss: true
 ---
@@ -8,6 +8,88 @@ rss: true
 Discuss how Portkey's AI Gateway can enhance your organization's AI infrastructure
+
+
+## v2.5.0
+---
+
+### Gateway-Local JWT Authentication
+
+
+Requires Backend v1.13.0 or higher for Air Gapped deployments
+
+
+The gateway can now validate JWT tokens locally without calling the control plane, reducing authentication latency for self-hosted deployments.
+
+Set `JWT_ENABLED=ON` to enable gateway-local JWT authentication.
+
+- Supports all existing JWT claims: `portkey_oid`, `portkey_workspace`, `scope`, `defaults`, `usage_limits`, `rate_limits`
+- Organisation ID can be resolved from the token (`portkey_oid` / `organisation_id`) or from `ORGANISATIONS_TO_SYNC` when a single org is configured
+- JWT-authenticated requests work with rate limit and usage limit policies
+
+[JWT Authentication Documentation](/product/enterprise-offering/org-management/jwt)
+
+### Headers-to-Metadata Injection
+
+The new `HEADERS_TO_METADATA` environment variable automatically injects request header values into metadata. This enriches observability data with upstream context (e.g., caller identity, trace IDs, or environment tags) without requiring clients to set `x-portkey-metadata`.
+
+Configure with a comma-separated list of header names:
+```
+HEADERS_TO_METADATA=x-request-id,x-caller-service,x-environment
+```
+
+Header names are matched case-insensitively, and their values are injected into the request metadata under lowercase keys, alongside any existing metadata from `x-portkey-metadata`.
+
+[Metadata Documentation](/product/observability/metadata)
+
+### Rate Limit Policies: Weekly Window and Endpoint Type Conditions
+
+- **Requests per week (`rpw`)**: Rate limit policies now support a weekly window in addition to the per-minute, per-hour, and per-day windows
+- **`endpoint_type` condition**: Rate limit policies can now target specific endpoint types (e.g., `chatComplete`, `embed`, `complete`) using the `endpoint_type` condition key
+
+[Usage & Rate Limit Policies Documentation](/product/enterprise-offering/budget-policies)
+
+### New Guardrail: Required Metadata Key-Value Pairs
+
+Added a new native guardrail that validates specific metadata key-value pairs before a request is processed. It supports the `all`, `any`, and `none` operators for flexible matching.
+
+| Check Name | Parameters | Supported Hooks |
+|---|---|---|
+| Required Metadata Key-Value Pairs | `metadataPairs` (object), `operator` (all/any/none) | `beforeRequestHook` |
+
+[Guardrail Checks Documentation](/product/guardrails/list-of-guardrail-checks)
+
+### Lasso Security Guardrail: v3 API Upgrade
+
+Upgraded the Lasso Security integration from the v2 to the v3 Classify API with the following improvements:
+- **Findings-based detection**: Responses now include structured findings with name, category, action (BLOCK/AUTO_MASKING/WARN), and severity
+- **Session and user tracking**: Supports `sessionId` and `userId` for conversation-level analysis
+- **Multi-format support**: Handles `chatComplete`, `messages`, `complete`, and `embed` request types
+- **Custom API endpoint**: Supports configurable `apiEndpoint` for self-hosted Lasso deployments
+
+[Lasso Security Documentation](/integrations/guardrails/lasso)
+
+### Image Format Support: HEIC and HEIF
+
+Added `image/heic` and `image/heif` to the list of supported image MIME types for multimodal requests and the Inline Image URLs guardrail.
+
+### Fixes and Improvements
+- **Bedrock**:
+  - Improved tool calling support: Anthropic-format tool calls in Bedrock now correctly handle tool result content, cache control blocks, and strict role alternation
+  - Tool descriptions are now optional in Bedrock Converse API requests
+- **Vertex AI**:
+  - Fixed model allowlist checking for proxy context-cache create routes
+  - Fixed schema conversion for tools
+- **Google**:
+  - Fixed the Google provider's embedding endpoint to correctly map the OpenAI-compatible `dimensions` parameter to Google's `output_dimensionality` parameter
+  - Fixed schema conversion for tools
+- **Mistral AI**:
+  - Added `response_format` parameter support for Mistral AI, enabling structured JSON output and strict JSON schema enforcement. Previously, the gateway silently dropped this parameter for Mistral requests.
+  - [Mistral AI Documentation](/integrations/llms/mistral-ai)
+- Security dependency updates
+
+
+
 ## v2.4.4
diff --git a/integrations/guardrails/lasso.mdx b/integrations/guardrails/lasso.mdx
index d23b6766..bcc0130a 100644
--- a/integrations/guardrails/lasso.mdx
+++ b/integrations/guardrails/lasso.mdx
@@ -29,7 +29,7 @@ To get started with Lasso Security, visit their documentation:
 | Check Name | Description | Parameters | Supported Hooks |
 |------------|-------------|------------|-----------------|
-| Scan Content | Lasso Security's Deputies analyze content for various security risks including jailbreak attempts, custom policy violations, sexual content, hate speech, illegal content, and more. | `Timeout` (number) | `beforeRequestHook` |
+| Scan Content | Lasso Security's Deputies analyze content for various security risks including jailbreak attempts, custom policy violations, sexual content, hate speech, illegal content, and more. Returns structured findings with action levels (BLOCK, AUTO_MASKING, WARN). | `Timeout` (number) | `beforeRequestHook` |
@@ -127,6 +127,20 @@ Lasso Security's Deputies analyze content for various security risks across mult
 4. **Custom Policy Violations**: Enforces your organization's specific security policies
 5. **Harmful Content Detection**: Flags sexual content, hate speech, illegal content, and more
+
+The integration returns structured **findings** for each deputy, including the finding name, category, action level (`BLOCK`, `AUTO_MASKING`, or `WARN`), and severity. Requests are blocked when any finding has a `BLOCK` action.
+
+### Supported Request Types
+
+Lasso works across multiple request types:
+- **Chat Completions** (`/v1/chat/completions`)
+- **Messages** (Anthropic-style)
+- **Completions** (`/v1/completions`)
+- **Embeddings** (`/v1/embeddings`)
+
+### Self-Hosted Lasso Endpoint
+
+If you are running a self-hosted Lasso deployment, you can configure a custom `apiEndpoint` in your Lasso credentials to point to your own instance instead of the default Lasso cloud.
+
 Learn more about Lasso Security's features [here](https://www.lasso.security).
 ## Get Support
diff --git a/integrations/llms/mistral-ai.mdx b/integrations/llms/mistral-ai.mdx
index 47719763..cbfa6667 100644
--- a/integrations/llms/mistral-ai.mdx
+++ b/integrations/llms/mistral-ai.mdx
@@ -167,6 +167,92 @@ Your Codestral requests will show up on Portkey logs with code snippets rendered
+## Structured Output (response_format)
+
+Mistral AI supports structured output via the `response_format` parameter. You can pass a JSON schema to constrain the response to a consistent structure:
+
+
+```python Python
+from portkey_ai import Portkey
+
+portkey = Portkey(
+    api_key="PORTKEY_API_KEY",
+    provider="@mistral-ai"
+)
+
+response = portkey.chat.completions.create(
+    model="mistral-large-latest",
+    messages=[{"role": "user", "content": "List 3 capitals with their countries"}],
+    # Request output that follows the "capitals" JSON schema below
+    response_format={
+        "type": "json_schema",
+        "json_schema": {
+            "name": "capitals",
+            "strict": True,
+            "schema": {
+                "type": "object",
+                "properties": {
+                    "capitals": {
+                        "type": "array",
+                        "items": {
+                            "type": "object",
+                            "properties": {
+                                "city": {"type": "string"},
+                                "country": {"type": "string"}
+                            },
+                            "required": ["city", "country"]
+                        }
+                    }
+                },
+                "required": ["capitals"]
+            }
+        }
+    }
+)
+
+# The message content is a JSON string
+print(response.choices[0].message.content)
+```
+
+```js JavaScript
+import Portkey from 'portkey-ai'
+
+const portkey = new Portkey({
+  apiKey: "PORTKEY_API_KEY",
+  provider: "@mistral-ai"
+})
+
+const response = await portkey.chat.completions.create({
+  model: "mistral-large-latest",
+  messages: [{ role: "user", content: "List 3 capitals with their countries" }],
+  // Request output that follows the "capitals" JSON schema below
+  response_format: {
+    type: "json_schema",
+    json_schema: {
+      name: "capitals",
+      strict: true,
+      schema: {
+        type: "object",
+        properties: {
+          capitals: {
+            type: "array",
+            items: {
+              type: "object",
+              properties: {
+                city: { type: "string" },
+                country: { type: "string" }
+              },
+              required: ["city", "country"]
+            }
+          }
+        },
+        required: ["capitals"]
+      }
+    }
+  }
+})
+
+// The message content is a JSON string
+console.log(response.choices[0].message.content)
+```
+
+
 ## Mistral Tool Calling
 Tool calling lets models trigger external tools based on conversation context. You define available functions, the model chooses when to use them, and your application executes them and returns results.
diff --git a/product/enterprise-offering/budget-policies.mdx b/product/enterprise-offering/budget-policies.mdx
index bd34246a..796db55c 100644
--- a/product/enterprise-offering/budget-policies.mdx
+++ b/product/enterprise-offering/budget-policies.mdx
@@ -644,7 +644,7 @@ This resets the entity's usage counter to zero, allowing it to consume the full
 4. **Periodic Reset Options**: `"weekly"`, `"monthly"`, or `null` for no reset.
-5. **Rate Limit Units**: `"rpm"` (per minute), `"rph"` (per hour), `"rpd"` (per day).
+5. **Rate Limit Units**: `"rpm"` (per minute), `"rph"` (per hour), `"rpd"` (per day), `"rpw"` (per week).
 6. **Usage Limit Types**: `"cost"` (in dollars) or `"tokens"`.
diff --git a/product/enterprise-offering/org-management/jwt.mdx b/product/enterprise-offering/org-management/jwt.mdx
index 698ec98a..728a73f2 100644
--- a/product/enterprise-offering/org-management/jwt.mdx
+++ b/product/enterprise-offering/org-management/jwt.mdx
@@ -526,6 +526,21 @@ main();
 ```
+## Gateway-Local JWT Validation (Self-Hosted)
+
+
+Requires gateway **2.5.0** or higher with the `JWT_ENABLED=ON` environment variable set.
+
+
+Self-hosted enterprise deployments can enable gateway-local JWT validation, which eliminates the need to call the control plane for authentication. When enabled, the gateway:
+
+1. Decodes the JWT and extracts the organisation ID from the `portkey_oid` / `organisation_id` claims (or falls back to `ORGANISATIONS_TO_SYNC` for single-org deployments)
+2. Verifies the token signature against the JWKS configured for the organisation
+3. Resolves workspace and deployment details from the gateway's local data sync
+4. Caches validated tokens locally until expiry
+
+This reduces authentication latency and makes JWT validation independent of control plane availability at request time.
+
 ## Caching & Token Revocation
 - JWTs are cached until they expire to reduce validation overhead.
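
Step 1 of the gateway-local flow added in the `jwt.mdx` hunk above (organisation ID resolution) can be sketched in Python as follows. This is an illustration under stated assumptions, not the gateway's implementation: the helper names `decode_claims_unverified` and `resolve_org_id` are hypothetical, checking `portkey_oid` before `organisation_id` is an assumption, and so is treating `ORGANISATIONS_TO_SYNC` as a comma-separated list. Signature verification against the JWKS (step 2) is deliberately left out.

```python
import base64
import json


def decode_claims_unverified(token: str) -> dict:
    """Decode the JWT payload WITHOUT verifying the signature (illustration only)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    # base64url segments omit padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def resolve_org_id(claims: dict, organisations_to_sync: str = "") -> str:
    """Hypothetical helper: pick the organisation ID for a token."""
    # Assumption: portkey_oid is checked before organisation_id.
    org_id = claims.get("portkey_oid") or claims.get("organisation_id")
    if org_id:
        return org_id
    # Fall back only when exactly one organisation is configured.
    configured = [o.strip() for o in organisations_to_sync.split(",") if o.strip()]
    if len(configured) == 1:
        return configured[0]
    raise ValueError("organisation ID could not be resolved")


def _b64url(data: dict) -> str:
    """Encode a dict as an unpadded base64url JSON segment."""
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Build an unsigned demo token (header.payload.signature) to exercise the helpers.
demo_token = ".".join(
    [_b64url({"alg": "RS256"}), _b64url({"portkey_oid": "org-123"}), "sig"]
)
print(resolve_org_id(decode_claims_unverified(demo_token)))  # prints org-123
```

A real deployment would verify the signature before trusting any claim; the unverified decode here only shows where the organisation ID comes from.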
diff --git a/product/guardrails/list-of-guardrail-checks.mdx b/product/guardrails/list-of-guardrail-checks.mdx
index 7e40288b..4be5104f 100644
--- a/product/guardrails/list-of-guardrail-checks.mdx
+++ b/product/guardrails/list-of-guardrail-checks.mdx
@@ -215,6 +215,7 @@ Basic deterministic guardrails are ideal for quick, hard-coded validations that
 | **Model Rules** | Allow requests based on metadata-driven rules mapping to allowed models. | `rules`: object, `not`: boolean | Input only |
 | **Allowed Request Types** | Control which request types (endpoints) can be processed using an allowlist or blocklist. | `allowedTypes`: array, `blockedTypes`: array | Input only |
 | **Required Metadata Keys** | Checks if the metadata contains all the required keys. | `metadataKeys`: array, `operator`: string | Input only |
+| **Required Metadata Key-Value Pairs** | Checks if the metadata contains the specified key-value pairs. | `metadataPairs`: object, `operator`: string (all/any/none) | Input only |
 #### Media Processing
 | Guardrail Check | Description | Parameters | Supported On |
diff --git a/product/observability/metadata.mdx b/product/observability/metadata.mdx
index 945144eb..99213f2b 100644
--- a/product/observability/metadata.mdx
+++ b/product/observability/metadata.mdx
@@ -168,6 +168,20 @@ You can also apply any metadata filters to the logs or analytics and filter data
+## Automatic Metadata from Headers (Self-Hosted)
+
+
+Available on the **Enterprise** Self Hosting plan. Requires gateway **2.5.0** or higher.
+
+
+For self-hosted deployments, you can automatically inject request header values into metadata using the `HEADERS_TO_METADATA` environment variable. This is useful for enriching observability data with upstream context without requiring clients to set `x-portkey-metadata`.
+
+```
+HEADERS_TO_METADATA=x-request-id,x-caller-service,x-environment
+```
+
+When configured, the gateway extracts the specified header values from each incoming request and merges them into the request metadata. Header names are matched case-insensitively.
+
 ## Enterprise Features
 For enterprise users, Portkey offers advanced metadata governance and lets you define metadata at multiple levels:
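
The `HEADERS_TO_METADATA` merge described in the `metadata.mdx` hunk above can be sketched in Python as follows. This is a hedged illustration, not the gateway's code: `parse_header_list` and `merge_headers_into_metadata` are hypothetical names, `x-portkey-metadata` is assumed to carry a JSON object, and letting explicit `x-portkey-metadata` keys win on a conflict is an assumption (the diff only says the values are merged).

```python
import json


def parse_header_list(raw: str) -> list[str]:
    """Split the comma-separated env value into lowercase header names."""
    return [name.strip().lower() for name in raw.split(",") if name.strip()]


def merge_headers_into_metadata(
    headers: dict[str, str], configured: list[str]
) -> dict[str, str]:
    """Hypothetical helper: merge configured header values into request metadata."""
    # Lowercase incoming header names so matching is case-insensitive.
    lowered = {name.lower(): value for name, value in headers.items()}

    metadata: dict[str, str] = {}
    raw = lowered.get("x-portkey-metadata")
    if raw:
        # Assumption: the header carries a JSON object of metadata.
        metadata.update(json.loads(raw))

    for name in configured:
        if name in lowered:
            # Assumption: explicit metadata wins if the same key is already set.
            metadata.setdefault(name, lowered[name])
    return metadata


configured = parse_header_list("x-request-id,x-caller-service,x-environment")
request_headers = {
    "X-Request-ID": "req-42",
    "X-Environment": "staging",
    "x-portkey-metadata": json.dumps({"_user": "alice"}),
}
print(merge_headers_into_metadata(request_headers, configured))
```

Headers that are configured but absent from the request (`x-caller-service` here) are simply skipped, so the resulting metadata only contains keys that were actually sent.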