feat(cache): respect per-policy backend; unavailable redis fails visible, not silent-memory#587
Conversation
…ble, not silent-memory Complete the per-policy cache backend dispatch (api7/AISIX-Cloud#519 B.8) on top of the salvaged WIP: - rustfmt the WIP (import order in state.rs, two chat.rs hunks) - unit tests: redis policy w/o redis disables caching (both calls pay the upstream, no header, cache_status=disabled); redis policy with a redis instance dispatches to it and ONLY it (entry pinned on the instance by fingerprint, memory instance stays empty) - e2e: new cache-policy-backend spec — memory-only DP + backend=redis policy means every identical chat reaches the upstream (fails on origin/main, which silently served the repeat from node-local memory); real-redis positive path with two DPs sharing one redis (miss on A, hit on B without re-hitting the upstream) - docs synced: caching.md backend boundary rewritten, bootstrap-config cache section, production-deployment, response-caching tutorial, roadmap status - schemas/resources/cache_policy.schema.json regenerated (dump-schema)
|
Warning Review limit reached
More reviews will be available in 25 minutes and 30 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more credits in the billing tab to continue. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughThis PR implements per-policy cache backend selection: instead of a single global cache choice at boot, requests now use cache backends determined by their matched policy's ChangesPer-Policy Cache Backend Selection
🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/aisix-server/src/main.rs`:
- Around line 392-401: The error log exposes credentials by including
redis_cfg.url in the RedisCache::connect map_err message; update the match arm
that creates redis_cache so it omits or redacts sensitive parts of redis_cfg.url
(e.g., replace password with "<redacted>" or log only host/port/scheme) and log
the underlying error returned by RedisCache::connect without the full URL,
ensuring the Arc<dyn Cache> creation and error propagation behavior for
RedisCache::connect remains unchanged.
In `@tests/e2e/src/cases/cache-policy-backend-e2e.test.ts`:
- Around line 242-252: The test currently casts responses to the expected shape
(firstBody/secondBody) without checking that choices[0]?.message.content exists,
which can lead to uncaught TypeError; modify the assertions after parsing
firstBody and secondBody to assert that Array.isArray(choices), choices.length >
0, and that choices[0].message and choices[0].message.content are defined and of
type string (use expect(...) toFail with a clear message) before accessing
choices[0].message.content in the rest of the test (locate validation right
after the firstBody and secondBody assignments).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 589a6c8b-dade-4e88-97ea-8a9cb1348c70
📒 Files selected for processing (15)
config.example.yamlcrates/aisix-cache/src/lib.rscrates/aisix-core/src/config.rscrates/aisix-core/src/models/cache_policy.rscrates/aisix-proxy/src/chat.rscrates/aisix-proxy/src/lib.rscrates/aisix-proxy/src/state.rscrates/aisix-server/src/main.rsdocs/configuration/bootstrap-config.mddocs/configuration/caching.mddocs/operations/production-deployment.mddocs/roadmap.mddocs/tutorials/enable-response-caching.mdschemas/resources/cache_policy.schema.jsontests/e2e/src/cases/cache-policy-backend-e2e.test.ts
What
CachePolicy.backendparsed and persisted but was never read by the proxy — every policy was served by the single global cache the boot config selected. An operator pickingbackend: redison a policy silently got node-local memory caching, which lies about the sharing semantics across DP replicas.This PR makes the matched policy's
backendselect the cache instance per request:MemoryCache(always) and aRedisCacheiffcfg.cache.redisis configuredbackend: memory→ memory instance;backend: redis→ redis instance if available, otherwise caching is INACTIVE for that request:cache_status = disabled, nox-aisix-cacheheader, one warning per policy in the log. No silent memory fallback.cfg.cache.backendknob keeps parsing; its only remaining effect is fail-fast boot validation (backend: rediswithout acache.redisblock still errors at startup)Before / after
backend: memorybackend: redis,cache.redisconfiguredbackend: redis, nocache.rediscache_status = disabled, warn-once-per-policy logcache.backend: rediswithoutcache.redisTests
crates/aisix-proxy/src/lib.rs):redis_backend_policy_without_redis_disables_caching— both identical calls pay the upstream (wiremock.expect(2)), nox-aisix-cacheheader, telemetrycache_status = "disabled"redis_backend_policy_dispatches_to_redis_instance— with a redis instance present, the entry is written to it and ONLY it (pinned by fingerprint; the memory instance stays empty, so a wrong-instance dispatch can't hide behind a generic "hit")tests/e2e/src/cases/cache-policy-backend-e2e.test.ts):backend: redispolicy → two identical chats BOTH reach the upstream, header absent. Fails against anorigin/mainbinary (the first call already getsx-aisix-cache: missfrom the silent memory fallback, the repeat is served ashit), passes with this branch. A canary policy created after the redis policy guards against vacuous passes during config propagation.redis:7-alpine): two DP instances share one redis —misson DP A,hiton DP B without re-hitting the upstream, proving the entry lives in redis rather than node memory. Skips honestly when no redis is reachable.configuration/caching.mdbackend boundary,configuration/bootstrap-config.mdcache section,operations/production-deployment.md, response-caching tutorial, roadmap status) andschemas/resources/cache_policy.schema.jsonregenerated via dump-schemaEndpoint-family sweep: the cache-policy gate exists only on
/v1/chat/completions(chat.rs::dispatch); messages/responses/embeddings/rerank/audio/images have no cache gate today, so there are no sibling sites sharing this bug.LiteLLM has no per-policy baseline to compare against — its cache backend is one global
Cache(type=...)selection — so no parity divergence arises; the per-policy instance selection aligns with the pgvector semantic-cache roadmap (#86). CP/dashboard already exposes thebackendfield, so no CP change is needed.Part of api7/AISIX-Cloud#519 (finding B.8).
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests
Closes api7/AISIX-Cloud#519