Skip to content

feat(cache): respect per-policy backend; unavailable redis fails visible, not silent-memory#587

Merged
jarvis9443 merged 3 commits into
mainfrom
feat/per-policy-cache-backend-519
Jun 11, 2026
Merged

feat(cache): respect per-policy backend; unavailable redis fails visible, not silent-memory#587
jarvis9443 merged 3 commits into
mainfrom
feat/per-policy-cache-backend-519

Conversation

@jarvis9443

@jarvis9443 jarvis9443 commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

What

CachePolicy.backend parsed and persisted but was never read by the proxy — every policy was served by the single global cache the boot config selected. An operator picking backend: redis on a policy silently got node-local memory caching, which lies about the sharing semantics across DP replicas.

This PR makes the matched policy's backend select the cache instance per request:

  • boot builds BOTH instances: the in-process MemoryCache (always) and a RedisCache iff cfg.cache.redis is configured
  • at the cache-policy gate, backend: memory → memory instance; backend: redis → redis instance if available, otherwise caching is INACTIVE for that request: cache_status = disabled, no x-aisix-cache header, one warning per policy in the log. No silent memory fallback.
  • the legacy cfg.cache.backend knob keeps parsing; its only remaining effect is fail-fast boot validation (backend: redis without a cache.redis block still errors at startup)
  • multiple-policy matching semantics unchanged — only the backend lookup changed

Before / after

Scenario Before After
policy backend: memory memory cache memory cache (unchanged)
policy backend: redis, cache.redis configured memory cache (policy field ignored) redis cache, shared across DP replicas
policy backend: redis, no cache.redis memory cache (silent, node-local) caching disabled for matching requests: every call pays the upstream, cache_status = disabled, warn-once-per-policy log
cache.backend: redis without cache.redis boot error boot error (unchanged)

Tests

  • unit (crates/aisix-proxy/src/lib.rs):
    • redis_backend_policy_without_redis_disables_caching — both identical calls pay the upstream (wiremock .expect(2)), no x-aisix-cache header, telemetry cache_status = "disabled"
    • redis_backend_policy_dispatches_to_redis_instance — with a redis instance present, the entry is written to it and ONLY it (pinned by fingerprint; the memory instance stays empty, so a wrong-instance dispatch can't hide behind a generic "hit")
    • existing memory-policy cache tests unchanged and passing
  • e2e (tests/e2e/src/cases/cache-policy-backend-e2e.test.ts):
    • memory-only DP + backend: redis policy → two identical chats BOTH reach the upstream, header absent. Fails against an origin/main binary (the first call already gets x-aisix-cache: miss from the silent memory fallback, the repeat is served as hit), passes with this branch. A canary policy created after the redis policy guards against vacuous passes during config propagation.
    • real-redis positive path (CI already provisions redis:7-alpine): two DP instances share one redis — miss on DP A, hit on DP B without re-hitting the upstream, proving the entry lives in redis rather than node memory. Skips honestly when no redis is reachable.
  • docs synced (configuration/caching.md backend boundary, configuration/bootstrap-config.md cache section, operations/production-deployment.md, response-caching tutorial, roadmap status) and schemas/resources/cache_policy.schema.json regenerated via dump-schema

Endpoint-family sweep: the cache-policy gate exists only on /v1/chat/completions (chat.rs::dispatch); messages/responses/embeddings/rerank/audio/images have no cache gate today, so there are no sibling sites sharing this bug.

LiteLLM has no per-policy baseline to compare against — its cache backend is one global Cache(type=...) selection — so no parity divergence arises; the per-policy instance selection aligns with the pgvector semantic-cache roadmap (#86). CP/dashboard already exposes the backend field, so no CP change is needed.

Part of api7/AISIX-Cloud#519 (finding B.8).

Summary by CodeRabbit

  • New Features

    • Cache backend selection is now determined per-policy, enabling flexible memory vs. Redis caching strategies for different request types.
  • Bug Fixes

    • Requests matching a Redis-backed policy without configured Redis now explicitly disable caching instead of silently falling back to memory.
  • Documentation

    • Enhanced configuration and troubleshooting documentation clarifying cache backend availability and runtime behavior.
  • Tests

    • Added integration and E2E tests validating per-policy cache backend dispatch and Redis backend sharing across instances.

Closes api7/AISIX-Cloud#519

…ble, not silent-memory

Complete the per-policy cache backend dispatch (api7/AISIX-Cloud#519
B.8) on top of the salvaged WIP:

- rustfmt the WIP (import order in state.rs, two chat.rs hunks)
- unit tests: redis policy w/o redis disables caching (both calls pay
  the upstream, no header, cache_status=disabled); redis policy with a
  redis instance dispatches to it and ONLY it (entry pinned on the
  instance by fingerprint, memory instance stays empty)
- e2e: new cache-policy-backend spec — memory-only DP + backend=redis
  policy means every identical chat reaches the upstream (fails on
  origin/main, which silently served the repeat from node-local
  memory); real-redis positive path with two DPs sharing one redis
  (miss on A, hit on B without re-hitting the upstream)
- docs synced: caching.md backend boundary rewritten, bootstrap-config
  cache section, production-deployment, response-caching tutorial,
  roadmap status
- schemas/resources/cache_policy.schema.json regenerated (dump-schema)
@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown

Review Change Stack

Warning

Review limit reached

@jarvis9443, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 25 minutes and 30 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more credits in the billing tab to continue.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 797e74fa-83db-4953-9fbf-76d08722d0dc

📥 Commits

Reviewing files that changed from the base of the PR and between 7199678 and bc10a68.

📒 Files selected for processing (1)
  • crates/aisix-server/src/main.rs
📝 Walkthrough

Walkthrough

This PR implements per-policy cache backend selection: instead of a single global cache choice at boot, requests now use cache backends determined by their matched policy's backend field. A new CacheBackends struct manages both in-process and optional Redis caches; the handler selects one per-request and disables caching when the requested backend is unavailable.

Changes

Per-Policy Cache Backend Selection

Layer / File(s) Summary
Documentation and Configuration
config.example.yaml, crates/aisix-cache/src/lib.rs, crates/aisix-core/src/config.rs, crates/aisix-core/src/models/cache_policy.rs, docs/configuration/*, docs/operations/production-deployment.md, docs/roadmap.md, docs/tutorials/enable-response-caching.md, schemas/resources/cache_policy.schema.json
Configuration examples and all user-facing documentation clarified: memory cache always available; Redis only when cache.redis configured; per-request selection via policy backend field; no silent fallback to memory when Redis unavailable.
CacheBackends Infrastructure
crates/aisix-proxy/src/state.rs
New CacheBackends struct holds memory and optional Redis caches with for_policy_backend() selector. Emits warning once per policy when Redis requested but unconfigured. ProxyState field and constructor signatures updated to use CacheBackends.
Server Boot Wiring
crates/aisix-server/src/main.rs
Server initialization always builds MemoryCache and conditionally builds RedisCache when cache.redis present, then wraps both in CacheBackends. Validates cache.backend=redis without cache.redis at startup.
Handler Per-Request Selection
crates/aisix-proxy/src/chat.rs
Chat handler captures matched policy, selects cache backend via for_policy_backend(), and sets cache_status=disabled when backend unavailable. Cache writes execute only when both backend and TTL available.
Public API
crates/aisix-proxy/src/lib.rs
CacheBackends added to public re-exports alongside ProxyState.
Unit Tests
crates/aisix-proxy/src/lib.rs
Added seed_cache_policy_with_backend test helper and two Tokio tests validating redis-backend policies disable caching when Redis unconfigured and dispatch to Redis when configured.
E2E Tests
tests/e2e/src/cases/cache-policy-backend-e2e.test.ts
Comprehensive E2E suites validate redis-backed policies disable caching without Redis, and route to Redis with entry sharing across spawned data-plane instances.

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the core change: implementing per-policy cache backend selection and ensuring unavailable Redis results in disabled caching rather than silent fallback.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/aisix-server/src/main.rs`:
- Around line 392-401: The error log exposes credentials by including
redis_cfg.url in the RedisCache::connect map_err message; update the match arm
that creates redis_cache so it omits or redacts sensitive parts of redis_cfg.url
(e.g., replace password with "<redacted>" or log only host/port/scheme) and log
the underlying error returned by RedisCache::connect without the full URL,
ensuring the Arc<dyn Cache> creation and error propagation behavior for
RedisCache::connect remains unchanged.

In `@tests/e2e/src/cases/cache-policy-backend-e2e.test.ts`:
- Around line 242-252: The test currently casts responses to the expected shape
(firstBody/secondBody) without checking that choices[0]?.message.content exists,
which can lead to uncaught TypeError; modify the assertions after parsing
firstBody and secondBody to assert that Array.isArray(choices), choices.length >
0, and that choices[0].message and choices[0].message.content are defined and of
type string (use expect(...) toFail with a clear message) before accessing
choices[0].message.content in the rest of the test (locate validation right
after the firstBody and secondBody assignments).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 589a6c8b-dade-4e88-97ea-8a9cb1348c70

📥 Commits

Reviewing files that changed from the base of the PR and between 20cd4a4 and 7199678.

📒 Files selected for processing (15)
  • config.example.yaml
  • crates/aisix-cache/src/lib.rs
  • crates/aisix-core/src/config.rs
  • crates/aisix-core/src/models/cache_policy.rs
  • crates/aisix-proxy/src/chat.rs
  • crates/aisix-proxy/src/lib.rs
  • crates/aisix-proxy/src/state.rs
  • crates/aisix-server/src/main.rs
  • docs/configuration/bootstrap-config.md
  • docs/configuration/caching.md
  • docs/operations/production-deployment.md
  • docs/roadmap.md
  • docs/tutorials/enable-response-caching.md
  • schemas/resources/cache_policy.schema.json
  • tests/e2e/src/cases/cache-policy-backend-e2e.test.ts

Comment thread crates/aisix-server/src/main.rs
Comment thread tests/e2e/src/cases/cache-policy-backend-e2e.test.ts
@jarvis9443 jarvis9443 merged commit d26926e into main Jun 11, 2026
10 checks passed
@jarvis9443 jarvis9443 deleted the feat/per-policy-cache-backend-519 branch June 11, 2026 08:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant