feat(llm-gateway): complete Cloudflare Worker implementation (Phases 1–7) by iscekic · Pull Request #752 · Kilo-Org/cloud

iscekic · 2026-03-03T00:20:20Z

Summary

Implements the llm-gateway Cloudflare Worker — a transparent drop-in replacement for the Next.js /api/openrouter/[...path]/route.ts proxy
Full Hono middleware chain matching the existing request flow: body parsing → auth → anonymous gate → rate limiting → provider resolution → balance/org checks → request transform → upstream proxy → background tasks
168 unit tests passing, typecheck clean

Phases

Phase	Commit	Description
1	`5a6b119`	Scaffolding: package.json, wrangler.jsonc, tsconfig, Hono stub
2	`775f840`	Request parsing, JWT auth, anonymous gate
3	`7ce0844`	Rate limiting (KV sliding window), provider resolution, BYOK
4	`faeacf1`	Balance/org checks, request validation, request transform
5	`84202a8`	Upstream proxy, response handling, custom LLM (Vercel AI SDK), abuse service
6	`6121e81`	Background tasks: usage accounting, API metrics, request logging, abuse cost
7	`b31334a`	168 unit tests covering every middleware and library module

Follow-up improvements (also in this PR)

b07e629 — Replace all setTimeout with Workers-native scheduler.wait()
b5eaf47 — Convert O11Y service binding from .fetch() HTTP to RPC (ingestApiMetrics method on WorkerEntrypoint), eliminating HTTP routing overhead and admin token auth for internal calls

Architecture

Hono middleware chain — each concern is a named middleware using c.set()/c.get() on Hono context
Cloudflare bindings: Hyperdrive (Postgres), KV (rate limiting), Service binding (O11Y via RPC), Secrets Store (all API keys)
Platform mappings: after() → ctx.waitUntil(), response.clone() → ReadableStream.tee(), setTimeout → scheduler.wait(), Sentry → structured logging
Smart Placement — colocated with DB, not at edge

Key files

llm-gateway/
  src/
    index.ts              # Hono app with full middleware chain
    middleware/            # 14 middleware files
    handler/proxy.ts      # Upstream fetch + ReadableStream.tee() + background tasks
    background/           # usage-accounting, api-metrics, request-logging, abuse-cost
    lib/                  # Pure functions: models, providers, BYOK, custom-llm, etc.
  test/unit/              # 24 test files, 168 tests
cloudflare-o11y/
  src/index.ts            # Added ingestApiMetrics RPC method

Add llm-gateway Cloudflare Worker package with: - pnpm workspace registration - package.json, tsconfig.json, wrangler.jsonc, eslint.config.mjs - vitest.config.ts / vitest.workers.config.ts (unit + integration) - src/index.ts: Hono app stub returning 501 on POST /chat/completions - src/env.ts: Cloudflare.Env type alias - src/types.ts: HonoContext, Variables, OpenRouterChatCompletionRequest - src/logger.ts: workers-tagged-logger setup - worker-configuration.d.ts: type stub (replace with wrangler types output) - Smoke test confirming worker module loads cleanly pnpm typecheck passes, pnpm test passes.

Remove all bindings (Hyperdrive, KV, services, secrets store, vars) — these will be added incrementally as each phase needs them. Simplify worker-configuration.d.ts stub and env.ts to match.

- Use shared Hyperdrive id (624ec80...dd10) with localConnectionString inline, no dev env override - Create and use real KV namespace id for USER_CACHE_KV (c92d83fa...) - Secret binding name matches secret name: NEXTAUTH_SECRET_PROD (mirrors session-ingest pattern) - Drop env.dev block entirely

- Middleware chain: request-timing, parse-body, extract-ip, resolve-auto-model, auth, anonymous-gate - Routes /api/gateway/chat/completions and /api/openrouter/chat/completions - lib/jwt.ts: verifyGatewayJwt + isPepperValid (jose ERR_JWT_EXPIRED fix) - worker-utils: add userExistsWithCache (single source of truth for KV-backed user existence check); consumed by both session-ingest and llm-gateway - llm-gateway uses shared USER_EXISTS_CACHE KV (ab836697) from session-ingest - Unit tests: jwt, parse-body, models, kilo-auto-model (39 tests)

- lib/rate-limit.ts: KV sliding window (1h/200 free models; 24h/10k promotions) - middleware/free-model-rate-limit.ts: 429 for Kilo free models over limit - middleware/promotion-limit.ts: 401 for anonymous users over promotion limit - middleware/log-free-model-usage.ts: waitUntil DB insert + KV increment - lib/byok.ts: BYOK lookup with Web Crypto AES-256-GCM decryption - lib/providers.ts: Provider type, PROVIDERS map, getProvider() with BYOK/custom LLM/free model routing - lib/provider-specific.ts: applyProviderSpecificLogic() porting all provider sub-modules - lib/tool-calling.ts: normalizeToolCallIds/repairTools/hasAttemptCompletionTool (async Web Crypto SHA-256) - middleware/provider-resolution.ts: pre-fetches secrets in parallel, wires provider result to context - types/hono.ts: added provider, userByok, customLlm, secrets variables - wrangler.jsonc + worker-configuration.d.ts: RATE_LIMIT_KV + provider secrets bindings - 54 tests passing (7 new rate-limit + 8 providers)

- lib/provider-hash.ts: async Web Crypto SHA-256 (generateProviderSpecificHash) - lib/promotions.ts: isActiveReviewPromo, isActiveCloudAgentPromo - lib/extract-headers.ts: getFraudDetectionHeaders, extractProjectHeaders, normalizeProjectId - lib/prompt-info.ts: extractPromptInfo, estimateChatTokens - lib/org-restrictions.ts: checkOrganizationModelRestrictions, getBalanceAndOrgSettings (Drizzle query, no credit expiration) - middleware/request-validation.ts: max_tokens cap, dead model 404, rate-limited-to-death 404 - middleware/balance-and-org.ts: balance check, org model/provider restrictions, data collection gate - middleware/request-transform.ts: safety_identifier, prompt_cache_key, tool repair, provider-specific logic, extraHeaders on context - lib/tool-calling.ts: export ENABLE_TOOL_REPAIR - types/hono.ts: add fraudHeaders, projectId, taskId, editorName, machineId, xKiloCodeVersion, numericKiloCodeVersion, extraHeaders - src/index.ts: wire requestValidationMiddleware, balanceAndOrgCheckMiddleware, requestTransformMiddleware - test/unit/provider-hash.test.ts: 6 tests - test/unit/org-restrictions.test.ts: 11 tests - 71 tests passing, typecheck clean

- handler/proxy.ts: upstream fetch (openRouterRequest + customLlmRequest), abuse classification (non-blocking, 2s timeout), 402→503 conversion, error logging, makeErrorReadable, free model SSE rewrite, pass-through - lib/response-helpers.ts: getOutputHeaders, wrapResponse, makeErrorReadable - lib/rewrite-free-model-response.ts: SSE stream transformer (eventsource-parser), model name rewrite, reasoning_details normalisation, cost field stripping - lib/abuse-service.ts: classifyAbuse + reportAbuseCost via CF Access headers - lib/custom-llm/: full port of customLlmRequest (Vercel AI SDK, Anthropic + OpenAI-compatible, streaming + non-streaming, reasoning, tool calling, temp_phase DB tracking with Web Crypto SHA-256 instead of Node.js crypto.hash) - index.ts: replace notImplemented stub with proxyHandler - package.json: add ai, @ai-sdk/anthropic, @ai-sdk/openai, eventsource-parser - wrangler.jsonc + worker-configuration.d.ts: ABUSE_SERVICE_URL, ABUSE_CF_ACCESS_CLIENT_ID, ABUSE_CF_ACCESS_CLIENT_SECRET bindings - All 71 existing unit tests pass, typecheck clean

…metrics, request logging, abuse cost) - background/usage-accounting.ts: parse SSE/non-streaming response, compute cost, insert microdollar_usage + metadata via CTE, update kilocode_users.microdollars_used, ingest org token usage. Fixes duplicate parseMicrodollarUsageFromString that was left in the stub. - background/api-metrics.ts: drain background stream to extract inferenceProvider, emit ApiMetricsParams to O11Y service binding via POST /ingest/api-metrics - background/request-logging.ts: insert api_request_log for Kilo employees (@kilo.ai / @kilocode.ai emails or KILO_ORGANIZATION_ID) - background/abuse-cost.ts: report upstream cost to abuse service after usage is computed - handler/proxy.ts: rewritten to tee response.body into 4 streams (client + accounting + metrics + logging) using ReadableStream.tee() chains; all background tasks scheduled via ctx.waitUntil() with 25s internal timeout; free model path only tees for metrics - wrangler.jsonc + worker-configuration.d.ts: add O11Y_KILO_GATEWAY_CLIENT_SECRET Secrets Store binding needed for O11Y service auth

… passing) Add comprehensive unit tests covering every middleware and library module: - middleware-chain: full e2e flow with mocked DB/KV/Secrets, health check, 404, body validation, anonymous gate, route parity, missing IP - anonymous-gate: free model anonymous access, paid model 401, authUser passthrough - free-model-rate-limit: KV sliding window 200/hr, per-IP isolation, non-Kilo skip - request-validation: max_tokens overflow, dead free models, rate-limited-to-death - response-helpers: header whitelisting, wrapResponse, BYOK error messages - rewrite-free-model-response: SSE cost stripping, model replacement, reasoning_content→reasoning conversion, JSON non-streaming path - tool-calling: repairTools deduplication/insertion/orphan removal, dropToolStrictProperties, normalizeToolCallIds, hasAttemptCompletionTool - extract-headers: fraud detection headers, project ID normalization (HTTPS/SSH git URLs) - prompt-info: extractPromptInfo with multipart content, estimateChatTokens - feature-detection: validateFeatureHeader with valid/invalid/null inputs - anonymous: createAnonymousContext, isAnonymousContext type guards - promotions: isActiveReviewPromo/isActiveCloudAgentPromo with time mocking - abuse-service: classifyRequest/classifyAbuse/reportCost/reportAbuseCost with mocked fetch, error handling, CF Access headers - request-logging: isKiloEmployee guard (@kilo.ai, @kilocode.ai, org ID), DB insert, error handling - shared test helpers (helpers.ts): JWT signing, mock Env, mock KV/Secrets, SSE stream builders

Use the Workers-native scheduler.wait() API instead of setTimeout for all timer-based patterns. scheduler.wait() is an awaitable alternative that integrates properly with the Workers I/O scheduler. Replaced in: - handler/proxy.ts: withTimeout helper and abuse classification 2s timeout - background/api-metrics.ts: stream read timeout during inference provider extraction - background/usage-accounting.ts: retry backoff delay for DB concurrency failures - worker-configuration.d.ts: added scheduler global type declaration

…etch Add ingestApiMetrics RPC method to the O11Y worker's WorkerEntrypoint, matching the existing ingestSessionMetrics pattern used by session-ingest. This eliminates HTTP routing overhead, JSON serialization, and the X-O11Y-ADMIN-TOKEN auth header for internal service-to-service calls. Changes: - cloudflare-o11y/src/index.ts: add ingestApiMetrics RPC method that validates with ApiMetricsParamsSchema and writes to Analytics Engine + Pipeline - llm-gateway/src/o11y-binding.d.ts: declare O11YBinding type extending Fetcher with the ingestApiMetrics RPC method signature - llm-gateway/src/env.ts: override Cloudflare.Env O11Y type with O11YBinding - llm-gateway/src/background/api-metrics.ts: replace o11y.fetch() HTTP call with o11y.ingestApiMetrics() RPC call, remove clientSecret parameter - llm-gateway/src/handler/proxy.ts: remove o11yClientSecretPromise pre-fetch and clientSecret plumbing, simplify BackgroundTaskParams o11y type

llm-gateway/src/index.ts

llm-gateway/src/lib/rewrite-free-model-response.ts

llm-gateway/src/middleware/auth.ts

llm-gateway/src/lib/rate-limit.ts

llm-gateway/src/lib/provider-specific.ts

llm-gateway/src/background/usage-accounting.ts

llm-gateway/src/lib/promotions.ts

kilo-code-bot · 2026-03-03T00:29:25Z

Code Review Summary

Status: 2 New Issues Found (101 total) | Recommendation: Address before merge

Fix these issues in Kilo Cloud

Overview

Severity	Count (new)	Count (total)
CRITICAL	0	4
WARNING	2	82
SUGGESTION	0	15

New Issues in This Review Pass

Issue Details (click to expand)

WARNING

File	Line	Issue
`llm-gateway/src/background/kilo-pass.ts`	510	Triple `as unknown as Tx` cast violates project coding rules — appears 3 times (lines 510, 512, 516)
`llm-gateway/src/background/kilo-pass.ts`	312	`new Date(nextYearlyIssueAtIso)` bypasses `parseIso` validation — could produce `Invalid Date` and corrupt issue month to `"NaN-NaN-01"`

Prior Review Issues (99 comments)

Many issues from prior review passes have been fixed by the author, including:

✅ err.message exposed to client → now returns generic error
✅ Redundant userExistsWithCache DB query → single lookup
✅ null as unknown as string cast → removed
✅ Missing await on applyProviderSpecificLogic → fixed
✅ Dead src/types.ts file → deleted
✅ as ReadableStream<Uint8Array> cast → removed
✅ Null-forgiving ! operator → extracted local const
✅ Unsafe as cast on HTTP status → narrowed at source
✅ Race condition in KV rate limiter → replaced with Durable Object
✅ zai provider double config → restructured as if/else chain
✅ as cast in anonymous.ts → removed, uses flow-sensitive typing
✅ Uncaught JSON.parse in parseAwsCredentials → wrapped in try/catch
✅ Missing max_completion_tokens validation → now validated
✅ Missing org membership check for custom LLMs → added
✅ .tee() on free model path → replaced with buffer+replay
✅ Non-constant-time string comparison for pepper → uses timingSafeEqual
✅ NaN from parseFloat → guarded with Number.isNaN()
✅ Reader not cancelled in catch block → added reader.cancel()
✅ Error logging not in waitUntil → wrapped
✅ Dead abuse-cost.ts file → deleted
✅ Unused wrangler secrets → removed
✅ 402→503 early return without background tasks → now schedules tasks first

Several issues were acknowledged as matching the reference implementation (not regressions):

JSON.parse without try/catch in SSE parser (matches reference)
JSON.parse on tool call arguments (matches reference)
new URL() without try/catch (matches reference)
Internal error message exposed via e.message (matches reference)
atob() without try/catch (matches reference, data from DB)
toLowerCase() on model ID (matches reference)

Outstanding issues from prior passes that remain open:

as ReadableStreamDefaultReader<Uint8Array> casts (3 locations)
as casts in abuse-service.ts (response.json)
as cast in custom-llm/index.ts line 162
as cast in vercel-routing.ts line 48
as cast in tool-calling.ts line 15
as cast in index.ts line 25 (useWorkersLogger)
Stale file header comments in usage-accounting.ts
abuse_classification hardcoded to 0
hasGenerationEndpoint: true for custom LLMs
Free model list duplication between models.ts and providers.ts
context_length mismatch for minimax model
Anonymous user ID embeds raw IP address
fetchWithBackoff retries on all 4xx (including auth failures)
FIRST_TOPUP_BONUS hardcoded constant
Inconsistent error response formats across middlewares
select() with no columns fetches all columns
Timestamp array scalability in RateLimitDO

Files Reviewed (110+ files)

New llm-gateway worker (core):

llm-gateway/src/index.ts - Entry point, middleware chain
llm-gateway/src/handler/proxy.ts - Core proxy handler
llm-gateway/src/handler/background-tasks.ts - Background task orchestration

Middleware:

llm-gateway/src/middleware/auth.ts
llm-gateway/src/middleware/anonymous-gate.ts
llm-gateway/src/middleware/balance-and-org.ts
llm-gateway/src/middleware/extract-ip.ts
llm-gateway/src/middleware/free-model-rate-limit.ts
llm-gateway/src/middleware/log-free-model-usage.ts
llm-gateway/src/middleware/parse-body.ts
llm-gateway/src/middleware/promotion-limit.ts
llm-gateway/src/middleware/provider-resolution.ts
llm-gateway/src/middleware/request-timing.ts
llm-gateway/src/middleware/request-transform.ts
llm-gateway/src/middleware/request-validation.ts
llm-gateway/src/middleware/resolve-auto-model.ts

Background tasks:

llm-gateway/src/background/api-metrics.ts
llm-gateway/src/background/kilo-pass.ts - 2 new issues
llm-gateway/src/background/request-logging.ts
llm-gateway/src/background/usage-accounting.ts

Libraries:

llm-gateway/src/lib/abuse-service.ts
llm-gateway/src/lib/anonymous.ts
llm-gateway/src/lib/byok.ts
llm-gateway/src/lib/custom-llm/index.ts
llm-gateway/src/lib/extract-headers.ts
llm-gateway/src/lib/feature-detection.ts
llm-gateway/src/lib/jwt.ts
llm-gateway/src/lib/models.ts
llm-gateway/src/lib/org-restrictions.ts
llm-gateway/src/lib/promotions.ts
llm-gateway/src/lib/prompt-info.ts
llm-gateway/src/lib/provider-hash.ts
llm-gateway/src/lib/provider-specific.ts
llm-gateway/src/lib/providers.ts
llm-gateway/src/lib/rate-limit.ts
llm-gateway/src/lib/response-helpers.ts
llm-gateway/src/lib/rewrite-free-model-response.ts
llm-gateway/src/lib/sentry.ts
llm-gateway/src/lib/tool-calling.ts
llm-gateway/src/lib/vercel-routing.ts

Durable Objects:

llm-gateway/src/dos/RateLimitDO.ts

Shared packages:

packages/worker-utils/src/o11y-schemas.ts
packages/worker-utils/src/user-exists-cache.ts
packages/worker-utils/src/index.ts

Other workers (refactored):

cloudflare-o11y/src/index.ts
cloudflare-o11y/src/api-metrics-routes.ts
cloudflare-o11y/src/o11y-analytics.ts
cloudflare-o11y/src/session-metrics-analytics.ts
cloudflare-session-ingest/src/env.ts
cloudflare-session-ingest/src/middleware/kilo-jwt-auth.ts
cloudflare-session-ingest/src/o11y-binding.d.ts
cloudflare-ai-attribution/src/ai-attribution.worker.ts
cloudflare-webhook-agent-ingest/src/index.ts

Config & tests: 40+ test files, config files (eslint, vitest, wrangler, tsconfig, package.json)

GIGAPOTATO_API_URL, OPENROUTER_ORG_ID, and ABUSE_SERVICE_URL were plain env vars but should be secrets. Move them to secrets_store_secrets and call .get() at the callsites so the Secrets Store Fetcher objects are resolved to strings before use.

llm-gateway/src/middleware/request-transform.ts

llm-gateway/src/lib/anonymous.ts

llm-gateway/src/types.ts

Replace .tee()-based response body splitting with a buffer-and-replay approach for background tasks. With .tee(), all stream branches share the same source — if any consumer (usage accounting DB writes, request logging) stalls, backpressure propagates to the client stream, causing periodic 5s freezes during streaming responses. Now the pass-through path pipes upstream chunks through a TransformStream directly to the client while accumulating a copy. Background tasks (accounting, metrics, logging) replay the buffered data after the stream completes, fully decoupled from client delivery. Also: restore O11Y service binding (service name: o11y), cancel unconsumed tee branches in free-model and fallback paths.

llm-gateway/src/lib/custom-llm/index.ts

llm-gateway/src/handler/proxy.ts

llm-gateway/src/lib/provider-specific.ts

llm-gateway/src/background/usage-accounting.ts

llm-gateway/src/middleware/balance-and-org.ts

- Type-safe ReadableStream readers (cast to ReadableStream<Uint8Array>) - Remove unnecessary type assertions (as string, as number, as unknown[]) - Replace non-null assertions with local variables or typeof narrowing - Use import type for type-only imports - Await async applyProviderSpecificLogic (was a floating promise) - Remove redundant union type ('vercel' | string → string)

llm-gateway/src/lib/jwt.ts

llm-gateway/src/lib/provider-specific.ts

Separate RATE_LIMIT_KV from USER_EXISTS_CACHE into its own KV namespace (llm-gateway-rate-limit, b22ee150a8fb4f63970bd3ff69f23e4d).

llm-gateway/src/lib/org-restrictions.ts

llm-gateway/src/lib/byok.ts

llm-gateway/src/middleware/parse-body.ts

llm-gateway/src/background/request-logging.ts

Move ApiMetricsParamsSchema and SessionMetricsParamsSchema from local definitions in cloudflare-o11y and hand-maintained .d.ts files into @kilocode/worker-utils/src/o11y-schemas.ts as the single source of truth. Consumers updated: - cloudflare-o11y: imports schema + inferred types from worker-utils - llm-gateway: o11y-binding.d.ts uses ApiMetricsParams from worker-utils - cloudflare-session-ingest: o11y-binding.d.ts uses SessionMetricsParams - Deleted cloudflare-o11y/src/session-metrics-schema.ts (moved)

llm-gateway/src/lib/extract-headers.ts

llm-gateway/src/middleware/request-validation.ts

llm-gateway/src/handler/proxy.ts

…mers Export ApiMetricsParams/SessionMetricsParams as z.input (accepts undefined for fields with .default()) so callers don't need ?? '' workarounds. Add ApiMetricsParamsParsed/SessionMetricsParamsParsed (z.infer) for post-.parse() consumers in o11y analytics.

llm-gateway/src/handler/proxy.ts

llm-gateway/src/background/abuse-cost.ts

llm-gateway/src/lib/custom-llm/index.ts

Reference calls usageLimitExceededResponse() which queries credit_transactions to distinguish first-time users from returning payers: - first-time: "Paid Model - Credits Required" + credits-required message - returning: "Low Credit Warning!" + add-credits message Worker was always returning the first-time variant, misidentifying returning users. Add hasUserMadePaidTopup() (mirrors summarizeUserPayments) and branch on the result.

Reference uses Sentry for: invalid JSON parse errors, upstream 402 (converted to 503) and upstream 5xx responses. Worker was using console.error/console.warn only, creating a blind spot in production. - Add @sentry/cloudflare dependency - Create src/lib/sentry.ts with captureException helper - Wrap Hono app with withSentry() in index.ts (uses same DSN as reference) - Capture invalid JSON parse exceptions in parse-body middleware - Capture 402-to-503 conversion and upstream 5xx errors in proxy handler - Capture unhandled errors in app.onError

kilo-code-bot · 2026-03-03T16:23:27Z

llm-gateway/src/background/usage-accounting.ts

+// Port of src/lib/processUsage.ts — simplified:
+//   - No Sentry spans/captures (use console.error/warn)
+//   - No PostHog first-usage events
+//   - No KiloPass threshold check


[WARNING]: Stale comments — the file header claims "No PostHog first-usage events" and "No KiloPass threshold check", but the code below actually implements both features (PostHog events around line 729+, KiloPass threshold check around line 545+). These misleading comments should be removed or updated to reflect the current implementation.

The reference route.ts calls modelDoesNotExistResponse() which returns HTTP 404 with { error: 'Model not found', message: 'The requested model could not be found.' }. The worker was returning HTTP 400 with { error: 'model is required' } — wrong status code, wrong error text, and missing message field.

- Add @sentry/cloudflare stub (no-op withSentry + captureException) to vitest alias config, matching the existing cloudflare:workers stub pattern. Fixes 4 suites that couldn't load due to transitive import. - Fix free-model-rate-limit test: expected body.error.code but the middleware (correctly) returns { error: string, message: string }, not a nested object. - Update middleware-chain test for the 404 model-not-found change.

…ateway and /api/openrouter The reference validates that the [...path] portion is exactly /chat/completions and returns invalidPathResponse() (HTTP 400) for anything else. The worker was falling through to the generic notFound handler returning HTTP 404 { error: 'Not found' }. Now requests to e.g. /api/gateway/other or /api/openrouter/v1/models return HTTP 400 with the exact reference error shape. Truly unknown paths (outside /api/gateway/ and /api/openrouter/) still return 404.

…ponse The reference modelNotAllowedResponse() returns: error: 'Model not allowed for your team.' message: 'The requested model is not allowed for your team.' The worker was using restrictionError.message for both fields, producing identical values. Now the message field matches the reference.

… new users The reference usageLimitExceededResponse() appends 'Get $20 free on your first topup!' for users with no payment history. The worker was omitting this, showing only the generic 'you need to add credits' text.

…ilo free models The reference makeErrorReadable() checks whether the estimated token count (JSON.stringify(request).length/4 + max output tokens) exceeds the model's context_length for Kilo free models and returns a clear user-facing message. The worker had a placeholder comment instead. - Add context_length to KiloFreeModel type and all model entries - Add getKiloFreeModelContextLength() lookup - Add estimateTokenCount() matching the reference algorithm - Implement the context-length check in makeErrorReadable() - Destructure requestedModel (was in type but not used)

The reference returns stealthModelError() for Kilo stealth models (inference_providers includes 'stealth'), producing { error: 'Stealth model unable to process request', message: same } with the upstream status code preserved. The worker had no stealth model handling at all. - Add isKiloStealthModel() to models.ts - Add stealth check as the final fallback in makeErrorReadable(), matching the reference ordering (BYOK → context-length → stealth)

Port the reference shouldRouteToVercel() logic that routes ~10% of eligible non-BYOK requests to Vercel AI Gateway, increasing to ~90% when OpenRouter error rate exceeds 50% (automatic failover). - Add vercel-routing.ts with shouldRouteToVercel(), getGatewayErrorRate(), and deterministic SHA-256 hash-based routing using taskId or userId - Add preferredModels list to models.ts (tested/recommended models) - Wire into getProvider() step 3 (after BYOK + custom LLM, before free model gateway routing) - Pass randomSeed (x-kilocode-taskid header or user.id) from provider-resolution middleware Exclusions match the reference: Anthropic models, data_collection=deny, non-preferred models, openrouter/* prefix, Kilo free models with non-openrouter gateways.

kilo-code-bot · 2026-03-03T17:46:00Z

llm-gateway/src/lib/providers.ts

+            id: 'custom',
+            apiUrl: customLlmRow.base_url,
+            apiKey: customLlmRow.api_key,
+            hasGenerationEndpoint: true,


[WARNING]: hasGenerationEndpoint: true is incorrect for custom LLMs.

Custom LLMs don't expose an OpenRouter-style /generation?id= endpoint. Setting this to true causes fetchGeneration in usage-accounting.ts (line 815) to make a spurious HTTP request to customLlmRow.base_url/generation?id=<messageId> after every successful custom LLM request.

While the call is wrapped in a try/catch (line 835) so it won't crash, it:

Adds unnecessary latency to the background accounting task

Sends unexpected requests to the customer's custom LLM endpoint

Logs a warning on every custom LLM request, creating noise

Suggested change

hasGenerationEndpoint: true,

apiKey: customLlmRow.api_key,

hasGenerationEndpoint: false,

kilo-code-bot · 2026-03-03T17:46:00Z

cloudflare-o11y/src/index.ts

+	/** RPC method called by llm-gateway via service binding. */
+	async ingestApiMetrics(params: ApiMetricsParams): Promise<void> {
+		const parsed = ApiMetricsParamsSchema.parse(params);
+		writeApiMetricsDataPoint(parsed, 'kilo-gateway', this.env, (p) => this.ctx.waitUntil(p));


[WARNING]: writeApiMetricsDataPoint is not awaited, unlike the sibling ingestSessionMetrics method (line 30) which awaits writeSessionMetricsDataPoint.

writeApiMetricsDataPoint is synchronous for the Analytics Engine write but calls waitUntil for the Stream send. Since this is an RPC method called via service binding, the caller (sendApiMetrics in api-metrics.ts) awaits the result. The method will return before the Stream send completes, which is likely fine since waitUntil extends the execution context. However, the inconsistency with ingestSessionMetrics is worth noting — if writeApiMetricsDataPoint ever becomes async (e.g., for error handling), the missing await would silently swallow errors.

kilo-code-bot · 2026-03-03T17:46:00Z

llm-gateway/src/index.ts

+
+const app = new Hono<HonoContext>();
+
+app.use('*', useWorkersLogger('llm-gateway') as Parameters<typeof app.use>[1]);


[WARNING]: as Parameters<typeof app.use>[1] cast violates the project coding rule to "STRONGLY AVOID typescript's as operator".

Consider using a type-safe wrapper or satisfies instead. If the type mismatch is due to workers-tagged-logger returning a handler typed against an older Hono version, a thin adapter function would be safer:

const logger = useWorkersLogger('llm-gateway'); app.use('*', (c, next) => logger(c, next));

Move the 402→503 conversion after scheduleBackgroundTasks() so that API metrics, usage accounting, and request logging are emitted even when the upstream provider returns 402 Payment Required. Previously the 402 check returned early before any background tasks were scheduled, causing metrics gaps. The reference implementation (route.ts) always calls emitApiMetricsForResponse() before the 402 check — this change matches that behavior. Add proxy-402.test.ts with tests confirming: - Background tasks are scheduled before the 503 response is returned - BYOK 402 responses are NOT converted to 503 (pass through) - Non-402 errors also schedule background tasks

The free model response path in proxy.ts was explicitly setting accountingStream: null and loggingStream: null, skipping both usage accounting and request logging. The reference implementation (route.ts) calls accountForMicrodollarUsage() and handleRequestLogging() for ALL non-error responses including free models. Now the free model path provides accounting and logging streams gated by !isAnon, matching the paid path behavior. Anonymous free model requests still correctly skip accounting and logging (only metrics). Add tests confirming: - Authenticated free model requests get all three streams - Anonymous free model requests get only metricsStream

…uto-models When the client requests kilo/auto*, resolveAutoModelMiddleware mutates requestBody.model to the resolved model (e.g., anthropic/claude-sonnet-4.6). The API metrics code was using requestBody.model as requestedModel, which lost the original kilo/auto* identifier. The reference (route.ts:399) captures requestedModelLowerCased BEFORE auto-model resolution and passes that to emitApiMetricsForResponse. Fix: use autoModel ?? resolvedModel instead of requestBody.model. When autoModel is set (kilo/auto*), it preserves the original identifier; when null (non-auto requests), resolvedModel IS the requested model. Add background-tasks.test.ts verifying both paths.

…o suffixes The reference (route.ts:400) passes normalizeModelId(originalModelIdLowerCased) which strips :free, :exacto etc. suffixes from the resolvedModel in API metrics. The worker was passing resolvedModel as-is, preserving the :free suffix. - Export normalizeModelId() from models.ts (port of src/lib/model-utils.ts) - Replace local copy in org-restrictions.ts with the shared export - Apply normalizeModelId() to resolvedModel in background-tasks.ts metrics Add tests verifying :free and :exacto suffixes are stripped, and models without colon suffixes are unchanged.

The reference (route.ts:220) awaits logFreeModelRequest() synchronously before the upstream request, ensuring the rate-limit entry is counted even if the upstream fails. The worker was firing the DB insert via waitUntil() non-blocking. Now the DB insert is awaited before next(), matching the reference. DO increments remain non-blocking (worker-specific optimization). DB insert failure is caught and logged but does not block the request. Add log-free-model-usage.test.ts verifying: - DB insert happens before the handler (next()) - Handler still runs if DB insert fails - Non-free models skip the insert entirely

The PostHog API key was fetched via fire-and-forget .then() without being awaited. Whether posthogApiKey was populated when background tasks captured it into bgCommon depended on microtask ordering — first-usage PostHog events could silently drop if the Secrets Store round-trip hadn't completed. Now POSTHOG_API_KEY.get() runs in parallel with ABUSE_SERVICE_URL.get() inside a single Promise.all(), guaranteeing both are resolved before the upstream request and bgCommon construction. Still fail-open on error.

Reference (route.ts:276) defaults to false when transforms array is absent. Worker was defaulting to null, which could violate a NOT NULL constraint on the DB column.

The reference (creditTransactions.ts:58) uses notExists(kilo_pass_issuance_items) to exclude KiloPass bonus credits from the payment count. Without this, a user who received only KiloPass bonus credits would be classified as a 'returning user' (no first-topup bonus shown) in the worker, but as a 'new user' (first-topup bonus shown) in the reference. Add the notExists subquery to hasUserMadePaidTopup, matching the reference's summarizeUserPayments behavior.

Use the same pattern as vercel-routing.test.ts: cast globalThis to Record<string, unknown> once and check via the alias, avoiding the TS2339 'scheduler does not exist on typeof globalThis' error.

kilo-code-bot · 2026-03-03T18:23:07Z

llm-gateway/src/middleware/auth.ts

+  const db = getWorkerDb(c.env.HYPERDRIVE.connectionString);
+
+  const rows = await db
+    .select()


[WARNING]: select() with no column arguments fetches ALL columns from kilocode_users. This includes potentially large/sensitive fields that are never used by the gateway (e.g., profile data). Consider selecting only the columns actually needed downstream:

.select({ id: kilocode_users.id, api_token_pepper: kilocode_users.api_token_pepper, google_user_email: kilocode_users.google_user_email, microdollars_used: kilocode_users.microdollars_used, total_microdollars_acquired: kilocode_users.total_microdollars_acquired, is_admin: kilocode_users.is_admin, })

This reduces data transfer from Postgres and avoids leaking unnecessary PII into the request context.

kilo-code-bot · 2026-03-03T18:23:07Z

llm-gateway/src/handler/proxy.ts

+  let response: Response;
+  if (customLlm) {
+    const db = getWorkerDb(c.env.HYPERDRIVE.connectionString);
+    const isLegacyExtension = !!fraudHeaders.http_user_agent?.startsWith('Kilo-Code/');


[WARNING]: fraudHeaders is read from context here (line 76), but it was set by requestTransformMiddleware which runs before proxyHandler. However, isLegacyExtension checks fraudHeaders.http_user_agent — if requestTransformMiddleware ever fails to set fraudHeaders, this would throw a runtime error accessing .http_user_agent on undefined. The middleware chain ordering is correct today, but this implicit dependency is fragile. Consider adding a null check:

const isLegacyExtension = !!fraudHeaders?.http_user_agent?.startsWith('Kilo-Code/');

kilo-code-bot · 2026-03-03T18:23:07Z

llm-gateway/src/handler/proxy.ts

+      requestedModel: resolvedModel,
+      request: requestBody,
+      response: new Response(errorBodyBytes, response),
+      isUserByok: !!userByok,


[WARNING]: new Response(errorBodyBytes, response) passes the original response as the ResponseInit. The Response constructor accepts { status, statusText, headers } from the second argument. Since response is a full Response object (not a plain ResponseInit), this relies on the runtime extracting only the relevant fields. While this works in practice on Cloudflare Workers, it's an implicit behavior. Consider being explicit:

new Response(errorBodyBytes, { status: response.status, statusText: response.statusText, headers: response.headers })

kilo-code-bot · 2026-03-03T18:23:07Z

llm-gateway/src/lib/custom-llm/index.ts

+function extractMessageTextParts(content: unknown): string[] {
+  if (typeof content === 'string') return [content];
+  if (!Array.isArray(content)) return [];
+  return (content as Array<Record<string, unknown>>)


[WARNING]: as Array<Record<string, unknown>> cast violates the project coding rule to "STRONGLY AVOID typescript's as operator". Since content is already narrowed to Array by the Array.isArray(content) check, you can use a type guard in the filter predicate without the cast:

return content .filter( (part): part is { type: string; text: string } => part !== null && typeof part === 'object' && ('type' in part) && (part.type === 'input_text' || part.type === 'output_text') && ('text' in part) && typeof part.text === 'string' ) .map(p => p.text);

kilo-code-bot · 2026-03-03T18:23:07Z

llm-gateway/src/lib/vercel-routing.ts

+      console.debug('[getGatewayErrorRate] query timeout');
+      return fallback;
+    }
+    const rows = result.rows as unknown as Array<{ gateway: string; errorRate: number }>;


[WARNING]: as unknown as Array<{ gateway: string; errorRate: number }> is a double as cast that violates the project coding rules. The db.execute generic parameter already types the rows, so this cast is unnecessary. Consider:

const rows = result.rows; return { openrouter: rows.find(r => r.gateway === 'openrouter')?.errorRate ?? 0, vercel: rows.find(r => r.gateway === 'vercel')?.errorRate ?? 0, };

If the types don't align, use satisfies or a runtime check instead of as unknown as.

kilo-code-bot · 2026-03-03T18:23:08Z

llm-gateway/src/lib/org-restrictions.ts

+  const providerConfig: OpenRouterProviderConfig = {};
+
+  if (params.organizationPlan === 'enterprise' && providerAllowList.length > 0) {
+    const requiredProviders = extraRequiredProviders(normalizedModelId);


[WARNING]: extraRequiredProviders is called with normalizedModelId (which strips the colon suffix via normalizeModelId), but the kiloFreeModelProviders map keys include colons (e.g., corethink:free, x-ai/grok-code-fast-1:optimized:free). This means the lookup will NEVER match for models like corethink:free (normalized to corethink) or x-ai/grok-code-fast-1:optimized:free (normalized to x-ai/grok-code-fast-1). The provider restriction check for enterprise orgs will silently skip these models.

While the previous reviewer noted this is "safe because the default behavior is permissive", this means enterprise orgs with a provider_allow_list that excludes corethink or stealth will NOT be blocked from using those free models — the restriction is silently bypassed. This could be a compliance issue for enterprise customers who expect their allow list to be enforced.

kilo-code-bot · 2026-03-03T18:23:08Z

llm-gateway/src/background/usage-accounting.ts

+): Promise<Response> {
+  const maxElapsedMs = 20_000;
+  const startedAt = Date.now();
+  let nextDelayMs = 200 * (1 + (Math.random() - 0.5) / 10);


[WARNING]: fetchWithBackoff retries on ALL 4xx responses (r.status >= 400), including 401 (auth failure) and 403 (forbidden). Auth failures will never succeed on retry. Consider only retrying on 404 and 5xx:

r => r.status === 404 || r.status >= 500

kilo-code-bot · 2026-03-03T18:23:08Z

llm-gateway/src/background/usage-accounting.ts

+    requested_model: usageContext.requested_model,
+    cache_discount: usageStats.cacheDiscount_mUsd ?? null,
+    has_error: usageStats.hasError,
+    abuse_classification: 0,


[WARNING]: abuse_classification is hardcoded to 0 here, even though usageContext.abuse_request_id is available. The reference implementation stores the abuse classification request ID in this field. This means the abuse_classification column in microdollar_usage will always be 0, making it impossible to correlate usage records with abuse classifications in the database.

Suggested change

abuse_classification: 0,

abuse_classification: usageContext.abuse_request_id ?? 0,

kilo-code-bot · 2026-03-03T18:23:08Z

llm-gateway/src/lib/providers.ts

+  }
+
+  // 3. Vercel AI Gateway A/B routing (non-BYOK, non-custom-LLM)
+  if (await shouldRouteToVercel(db, requestedModel, request, randomSeed)) {


[WARNING]: shouldRouteToVercel is called with requestedModel (already lowercased by parseBodyMiddleware), but preferredModels in models.ts contains mixed-case entries like anthropic/claude-opus-4.6. The preferredModels.includes(requestedModel) check in vercel-routing.ts:109 will fail for any model whose preferredModels entry uses different casing than the lowercased requestedModel. Currently all entries happen to be lowercase, but this is fragile — adding a model with uppercase characters to preferredModels would silently break Vercel routing for it.

kilo-code-bot · 2026-03-03T18:23:08Z

llm-gateway/src/lib/tool-calling.ts

+}
+
+function isToolMessage(msg: ChatMessage): msg is ToolMessage {
+  return msg.role === 'tool' && typeof (msg as Record<string, unknown>).tool_call_id === 'string';


[WARNING]: as Record<string, unknown> cast violates the project coding rules. This can be replaced with an in check:

Suggested change

return msg.role === 'tool' && typeof (msg as Record<string, unknown>).tool_call_id === 'string';

return msg.role === 'tool' && 'tool_call_id' in msg && typeof msg.tool_call_id === 'string';

Switch from the shared Next.js Sentry DSN to a dedicated llm-gateway project. Enable sendDefaultPii for automatic IP collection on events.

kilo-code-bot · 2026-03-03T18:32:47Z

llm-gateway/src/lib/models.ts

+  },
+  {
+    public_id: 'minimax/minimax-m2.5:free',
+    context_length: 204_800,


[WARNING]: context_length mismatch — minimax/minimax-m2.5:free is 204_800 here but 1_000_000 in providers.ts. The value here feeds getKiloFreeModelContextLength() → makeErrorReadable(), which will incorrectly report context-length errors at ~205K tokens when the model actually supports 1M tokens. This will cause false "context length exceeded" error messages for users with large prompts between 205K and 1M tokens.

Suggested change

context_length: 204_800,

context_length: 1_000_000,

kilo-code-bot · 2026-03-03T18:32:47Z

llm-gateway/src/lib/jwt.ts

+  } catch (err) {
+    if (err instanceof Error) {
+      // jose uses error.code for JWT-specific errors
+      if ((err as { code?: string }).code === 'ERR_JWT_EXPIRED') {


[SUGGESTION]: as cast violates the project coding rule to "STRONGLY AVOID typescript's as operator". Since err is already narrowed to Error by the instanceof check, you can use an in guard instead:

Suggested change

if ((err as { code?: string }).code === 'ERR_JWT_EXPIRED') {

if ('code' in err && err.code === 'ERR_JWT_EXPIRED') {

kilo-code-bot · 2026-03-03T18:32:47Z

llm-gateway/src/middleware/extract-ip.ts

+export const extractIpMiddleware = createMiddleware<HonoContext>(async (c, next) => {
+  // CF-Connecting-IP is the authoritative source on Cloudflare Workers
+  const cfIp = c.req.header('CF-Connecting-IP');
+  const xffIp = c.req.header('x-forwarded-for')?.split(',')[0]?.trim();


[WARNING]: x-forwarded-for is trivially spoofable by clients. On Cloudflare Workers, CF-Connecting-IP is the only trustworthy source — it's set by Cloudflare's edge and cannot be forged. If CF-Connecting-IP is absent, the request likely didn't traverse Cloudflare's edge (e.g., direct-to-origin), so trusting x-forwarded-for means an attacker could spoof their IP to bypass rate limiting (freeModelRateLimitMiddleware, promotionLimitMiddleware) and abuse detection.

Consider either:

Rejecting requests without CF-Connecting-IP (they shouldn't reach a Workers deployment normally)

Or at minimum, logging a warning when falling back to x-forwarded-for so it's visible in monitoring

- Include test/integration/**/*.test.ts in Vitest config and remove the previous exclusion to enable running integration tests - Add POSTHOG_API_KEY to test makeEnv helper for analytics-dependent tests

kilo-code-bot · 2026-03-03T19:07:54Z

llm-gateway/src/background/kilo-pass.ts

+    const effectiveThreshold = getEffectiveKiloPassThreshold(user.kiloPassThreshold ?? null);
+    if (effectiveThreshold === null || user.microdollarsUsed < effectiveThreshold) return;
+
+    const subscriptionState = await getKiloPassStateForUser(tx as unknown as Tx, kiloUserId);


[WARNING]: Triple as unknown as Tx cast violates the project coding rule to "STRONGLY AVOID typescript's as operator".

This pattern appears three times (lines 510, 512, 516). The Tx type is extracted via a conditional type from WorkerDb['transaction'], but the Drizzle transaction callback parameter should already be compatible. Consider defining Tx directly from the transaction callback's parameter type, or using a wrapper function that accepts the transaction and calls the inner functions, avoiding the cast entirely:

type Tx = Parameters<Parameters<WorkerDb['transaction']>[0]>[0];

If the types genuinely don't align, a single cast at the db.transaction call site (wrapping the callback) would be cleaner than casting on every usage.

kilo-code-bot · 2026-03-03T19:07:54Z

llm-gateway/src/background/kilo-pass.ts

+  if (!anchor) return null;
+  // currentPeriodStart = nextYearlyIssueAt - 1 month (or startedAt)
+  const currentPeriodStart = nextYearlyIssueAtIso
+    ? addMonths(new Date(nextYearlyIssueAtIso), -1)


[WARNING]: new Date(nextYearlyIssueAtIso) bypasses the parseIso validation used on line 308.

If nextYearlyIssueAtIso is a truthy but malformed string (e.g., "not-a-date"), parseIso on line 308 returns null and anchor falls through to startedAtIso. But on line 311, the truthy check passes and new Date(nextYearlyIssueAtIso) produces Invalid Date, which propagates through addMonths → computeIssueMonth producing "NaN-NaN-01" as the issue month.

Use the already-validated result instead:

Suggested change

? addMonths(new Date(nextYearlyIssueAtIso), -1)

? addMonths(parseIso(nextYearlyIssueAtIso) ?? anchor, -1)

Or reuse the parseIso result from line 308 by extracting it into a local variable.

iscekic added 11 commits March 2, 2026 20:22

chore(llm-gateway): strip wrangler.jsonc to bare minimum

d11d11e

Remove all bindings (Hyperdrive, KV, services, secrets store, vars) — these will be added incrementally as each phase needs them. Simplify worker-configuration.d.ts stub and env.ts to match.

iscekic self-assigned this Mar 3, 2026

chore(llm-gateway): configure custom domain and remove dev settings

c10a8a0

kilo-code-bot bot reviewed Mar 3, 2026

View reviewed changes

llm-gateway/src/middleware/request-transform.ts Outdated Show resolved Hide resolved

llm-gateway/src/lib/anonymous.ts Outdated Show resolved Hide resolved

llm-gateway/src/types.ts Outdated Show resolved Hide resolved

kilo-code-bot bot reviewed Mar 3, 2026

View reviewed changes

llm-gateway/src/lib/jwt.ts Outdated Show resolved Hide resolved

llm-gateway/src/lib/provider-specific.ts Outdated Show resolved Hide resolved

chore(llm-gateway): use dedicated KV namespace for RATE_LIMIT_KV

fec8eb9

Separate RATE_LIMIT_KV from USER_EXISTS_CACHE into its own KV namespace (llm-gateway-rate-limit, b22ee150a8fb4f63970bd3ff69f23e4d).

kilo-code-bot bot reviewed Mar 3, 2026

View reviewed changes

llm-gateway/src/lib/org-restrictions.ts Show resolved Hide resolved

llm-gateway/src/lib/byok.ts Show resolved Hide resolved

llm-gateway/src/middleware/parse-body.ts Show resolved Hide resolved

llm-gateway/src/background/request-logging.ts Outdated Show resolved Hide resolved

kilo-code-bot bot reviewed Mar 3, 2026

View reviewed changes

llm-gateway/src/lib/extract-headers.ts Outdated Show resolved Hide resolved

llm-gateway/src/middleware/request-validation.ts Outdated Show resolved Hide resolved

llm-gateway/src/handler/proxy.ts Show resolved Hide resolved

iscekic added 2 commits March 3, 2026 03:07

Merge branch 'main' into feat/llm-gateway

0af8fac

kilo-code-bot bot reviewed Mar 3, 2026

View reviewed changes

llm-gateway/src/handler/proxy.ts Outdated Show resolved Hide resolved

llm-gateway/src/background/abuse-cost.ts Outdated Show resolved Hide resolved

llm-gateway/src/lib/custom-llm/index.ts Show resolved Hide resolved

iscekic added 2 commits March 3, 2026 11:36

Revert unnecessary changes to SessionIngestDO.ts

8108f0a

Regenerate llm-gateway/worker-configuration.d.ts via wrangler types

f197802

iscekic added 2 commits March 3, 2026 16:45

iscekic force-pushed the feat/llm-gateway branch from d6dc4f1 to 3ef5600 Compare March 3, 2026 16:18

kilo-code-bot bot reviewed Mar 3, 2026

View reviewed changes

iscekic added 8 commits March 3, 2026 18:16

kilo-code-bot bot reviewed Mar 3, 2026

View reviewed changes

iscekic added 9 commits March 3, 2026 18:53

fix(B9): default has_middle_out_transform to false instead of null

6b54ee4

Reference (route.ts:276) defaults to false when transforms array is absent. Worker was defaulting to null, which could violate a NOT NULL constraint on the DB column.

fix: resolve typecheck errors for scheduler stub in test files

8592163

Use the same pattern as vercel-routing.test.ts: cast globalThis to Record<string, unknown> once and check via the alias, avoiding the TS2339 'scheduler does not exist on typeof globalThis' error.

kilo-code-bot bot reviewed Mar 3, 2026

View reviewed changes

iscekic added 3 commits March 3, 2026 19:26

chore: use dedicated Sentry project for llm-gateway worker

cddfbae

Switch from the shared Next.js Sentry DSN to a dedicated llm-gateway project. Enable sendDefaultPii for automatic IP collection on events.

chore: remove tracesSampleRate from Sentry config

614af83

chore: simplify deploy script to single env-less command

4086045

kilo-code-bot bot reviewed Mar 3, 2026

View reviewed changes

iscekic added 2 commits March 3, 2026 19:51

test(vitest): run integration tests and add PostHog test key

faf9a50

- Include test/integration/**/*.test.ts in Vitest config and remove the previous exclusion to enable running integration tests - Add POSTHOG_API_KEY to test makeEnv helper for analytics-dependent tests

fix: resolve eslint errors in llm-gateway

406ce37

kilo-code-bot bot reviewed Mar 3, 2026

View reviewed changes

	hasGenerationEndpoint: true,
	apiKey: customLlmRow.api_key,
	hasGenerationEndpoint: false,


		const app = new Hono<HonoContext>();

		app.use('*', useWorkersLogger('llm-gateway') as Parameters<typeof app.use>[1]);

	abuse_classification: 0,
	abuse_classification: usageContext.abuse_request_id ?? 0,

	return msg.role === 'tool' && typeof (msg as Record<string, unknown>).tool_call_id === 'string';
	return msg.role === 'tool' && 'tool_call_id' in msg && typeof msg.tool_call_id === 'string';

	if ((err as { code?: string }).code === 'ERR_JWT_EXPIRED') {
	if ('code' in err && err.code === 'ERR_JWT_EXPIRED') {

	? addMonths(new Date(nextYearlyIssueAtIso), -1)
	? addMonths(parseIso(nextYearlyIssueAtIso) ?? anchor, -1)

Conversation

iscekic commented Mar 3, 2026

Summary

Phases

Follow-up improvements (also in this PR)

Architecture

Key files

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

kilo-code-bot bot commented Mar 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review Summary

Overview

New Issues in This Review Pass

WARNING

Prior Review Issues (99 comments)

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

kilo-code-bot bot Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

kilo-code-bot bot Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

kilo-code-bot bot Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

kilo-code-bot bot Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

kilo-code-bot bot Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

kilo-code-bot bot Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

kilo-code-bot bot Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

kilo-code-bot bot Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

kilo-code-bot bot Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

kilo-code-bot bot Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

kilo-code-bot bot Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

kilo-code-bot bot Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

kilo-code-bot bot Mar 3, 2026

kilo-code-bot bot commented Mar 3, 2026 •

edited

Loading