feat: routing overhaul by vishalveerareddy123 · Pull Request #70 · Fast-Editor/Lynkr

vishalveerareddy123 · 2026-05-27T23:51:03Z

Summary

Complete routing overhaul spanning 6 phases plus two NadirClaw-inspired safety features:

Phase 1-2: Complexity analysis, tier mapping, cost optimization, context validation
Phase 3-6: kNN router, LinUCB bandit, cascade mode, deadline routing, tenant policy, budget enforcement, shadow A/B testing
NadirClaw safety:

kNN ambiguous escalation: confidence 0.4-0.7 → tier bump (quality over cost)
Vision routing guard: auto-upgrade to vision models when payload has images

Key Commits

3ba5c2f Full routing overhaul (phases 1-4 + cross-cutting)
d5149ad Wire phases 3-6 into live request path
64da51b kNN ambiguous escalation + vision guard

Technical Details

kNN Ambiguous Escalation

When kNN neighbors are split (confidence 0.4-0.7), bump tier one step up:

SIMPLE → MEDIUM → COMPLEX → REASONING
REASONING never escalates (ceiling)
Method tag: +knn_ambiguous_escalate

Vision Routing Guard (Phase 1.4)

Slots after context validation, before kNN routing:

_payloadHasImages() checks for type: 'image' or 'image_url' blocks
If selected model lacks vision: true in registry, calls selector.findVisionCapable(tier)
Upgrades to cheapest vision model at or above current tier
Method tag: +vision_guard

Integration Points

routing/index.js: both features in determineProviderSmart
model-tiers.js: findVisionCapable(preferredTier) walks tier order upward
Test coverage: 16 new tests (8 kNN, 8 vision), all passing

Testing

node --test test/knn-ambiguous-escalate.test.js   # 8/8 pass
node --test test/vision-routing-guard.test.js      # 8/8 pass
node --test test/*.test.js                         # 740/758 pass (18 pre-existing failures)

Deployment Notes

Feature flags: LYNKR_KNN_ENABLED, LYNKR_CASCADE_ENABLED, LYNKR_SHADOW_POLICY
Tier config: TIER_SIMPLE, TIER_MEDIUM, TIER_COMPLEX, TIER_REASONING
Requires model registry with vision: bool field (already populated)

🤖 Generated with Claude Code

Implements docs/routing-improvement-plan.md across one branch. Phase 1 — Plug the open loops (default-on): 1.1 src/routing/tokenizer.js — js-tiktoken w/ chars/4 fallback; replaces estimator in complexity-analyzer.js and api/router.js 1.2 cost-optimizer wired into routing/index.js; picks cheaper qualifying model when ≥25% cheaper and risk!=high (LYNKR_COST_OPTIMIZE=false to disable) 1.3 src/routing/context-validator.js — escalates to context-capable model when estimated tokens exceed 85% of selected model's window 1.4 scripts/calibrate-thresholds.js — nightly job; ranges read from data/calibrated-thresholds.json by model-tiers.js 1.5 latency-tracker keyed by provider:model with backward-compat wildcard; databricks.js call sites pass model Phase 2 — Pre-router primitives: 2.1 cache/semantic.js bumped to 10K entries; short-TTL keyword override for time-sensitive queries 2.2 scripts/refresh-pricing.js — cron-friendly refresh + diff with >5% threshold alerting 2.3 scripts/learn-output-ratios.js + routing/output-ratios.js — per-task ratio table; cost-optimizer.estimateCost reads via ratioFor(taskType) Phase 3 — Learned scoring: 3.1 routing/knn-router.js + embedding-cache.js (hnswlib-node backed); scripts/build-knn-index.js with optional RouterBench bootstrap; empty/sparse → null; caller falls back to heuristic 3.3 routing/cascade.js + confidence-scorer.js — small-first cascade; off by default for streaming/tools, LYNKR_CASCADE_ENABLED=true 3.4 routing/risk-classifier.js + scripts/train-risk-classifier.js — LR over TF features; never downgrades regex-flagged high risk Phase 4 — Online adaptation: 4.1 routing/bandit.js (LinUCB) + reward-pipeline.js; state in data/bandit-state.json 4.2 routing/regret-estimator.js + scripts/sample-regret.js; opt-in via LYNKR_REGRET_ESTIMATOR=true (costs $ for Opus re-runs) 4.3 routing/drift-monitor.js — PSI over input/output distributions; alerts to data/drift-alerts.json 4.4 routing/shadow-mode.js + scripts/compare-policies.js — A/B without serving shadow decisions; LYNKR_SHADOW_POLICY=<name> to activate Cross-cutting (Phase 6): 6.1 routing/tenant-policy.js + api/middleware/tenant.js; per-tenant configs in data/tenants/<id>.json; LYNKR-Tenant-Id header 6.2 budget/hierarchical-budget.js + api/middleware/budget-enforcer.js; virtual_key/team/customer/org levels via in-process Map (Redis stub) 6.3 routing/deadline.js — P95-aware filtering keyed off LYNKR-Deadline-Ms Stubs (deferred per plan): 6.4 scripts/run-routerarena.js — entrypoint; CI integration not wired Deps added: js-tiktoken ^1.0.20, hnswlib-node ^3.0.0 Tests: 8 new test files covering tokenizer, bandit, drift, budget, cascade, tenant policy, deadline routing, and output ratios. Full unit suite passes (756/756). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

- routing/index.js: replace risk-analyzer with risk-classifier; add kNN query override (confidence > 0.7), LinUCB bandit intra-tier selection, deadline-aware chooseFastest filter, per-tenant applyTenantOverrides, and shadow-mode fire-and-forget compareAndLog - databricks.js invokeModel: add small-first cascade (LYNKR_CASCADE_ENABLED) with _cascadeInner guard to prevent recursion - orchestrator/index.js runAgentLoop: thread _deadlineMs from lynkr-deadline-ms header and _tenantPolicy from options onto cleanPayload - databricks.js invokeModel: pass tenantPolicy from body._tenantPolicy into determineProviderSmart options - router.js: pass res.locals.tenantPolicy to processMessage options (both streaming and buffered paths) - server.js: mount tenantMiddleware and budgetEnforcer on /v1/messages Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Two NadirClaw-inspired safety features: 1. kNN ambiguous escalation (confidence 0.4-0.7): When kNN neighbors are split and no model has clear majority, bump tier one step up (SIMPLE→MEDIUM→COMPLEX→REASONING) to err on the side of quality over cost. REASONING tier is never escalated further. 2. Vision routing guard (Phase 1.4): When payload contains image content blocks and selected model lacks vision support, automatically upgrade to cheapest vision-capable model at or above current tier. Prevents silent upstream failures. Changes: - src/routing/index.js: add _payloadHasImages() helper, kNN ambiguous escalation block, and Phase 1.4 vision guard (slots after context validation, before kNN routing) - src/routing/model-tiers.js: add findVisionCapable() method (walks tier order from preferred upward, checks registry.getCost(model).vision) - test/knn-ambiguous-escalate.test.js: 8 tests covering boundary conditions, REASONING ceiling, missing config fallback - test/vision-routing-guard.test.js: 8 tests covering image/image_url detection, same-tier upgrade, cross-tier escalation, no-model warning All tests pass (16/16 new, 740/758 total). Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Added detailed documentation for MCP Code Mode across three key files: 1. documentation/token-optimization.md: - New Phase 0: MCP Code Mode (96% reduction for MCP tools) - Full workflow example (discover → inspect → execute) - Token savings calculation (17,500 → 700 tokens) - Trade-offs and configuration - Updated phase numbering (6 → 7 optimization phases) - Headroom becomes Phase 8 2. documentation/tools.md: - New section: "MCP Code Mode (Token Optimization)" - When to use vs skip Code Mode - Integration with Smart Tool Selection - Integration with Headroom compression pipeline - Full workflow examples with JSON 3. README.md: - Updated "Token Optimization (8 Phases)" section - Enhanced "MCP Integration + Code Mode" section with: * Token reduction details (17,500 → 700 tokens) * Lazy tool discovery workflow * Use cases and trade-offs * Links to detailed documentation All docs now explain: - 96% token reduction for MCP-heavy setups - 4 meta-tools (mcp_list_tools, mcp_tool_info, mcp_tool_docs, mcp_execute) - Pipeline position: Code Mode → Smart Tool Selection → Headroom - Trade-off: 3 sequential calls vs 1 direct (adds ~2-3s latency) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

…ures Added "Routing Safety Features" section to routing.md documenting: 1. Vision Capability Guard: - Automatic upgrade when images detected + model lacks vision - Tier escalation if no vision model at current tier - Example: ollama:llama3.2 → anthropic:claude-sonnet-4-6 - Method tag: +vision_guard 2. kNN Ambiguous Confidence Escalation: - When kNN confidence 0.4-0.7 (split neighbors) → escalate tier - Confidence >0.7 → use kNN model directly - Confidence ≤0.4 → ignore kNN - Example: MEDIUM → COMPLEX when neighbors split - Method tag: +knn_ambiguous_escalate Updated routing decision flow (12 → 19 steps) to include: - Step 13: Vision capability guard - Step 14: kNN routing with ambiguous escalation - Risk analysis, context escalation, LinUCB, deadline, tenant policy No external references, pure technical documentation. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

vishal veerareddy and others added 3 commits May 20, 2026 17:25

veerareddyvishal144 changed the title ~~feat: routing overhaul (phases 1-6 + NadirClaw-inspired safety)~~ feat: routing overhaul May 28, 2026

vishal veerareddy and others added 2 commits May 27, 2026 17:41

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: routing overhaul #70

feat: routing overhaul #70
vishalveerareddy123 wants to merge 5 commits into
mainfrom
feat/routing-overhaul-v1

vishalveerareddy123 commented May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

vishalveerareddy123 commented May 27, 2026

Summary

Key Commits

Technical Details

kNN Ambiguous Escalation

Vision Routing Guard (Phase 1.4)

Integration Points

Testing

Deployment Notes

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant