
feat(kilo-chat): attachments backend (R2 + presigned URLs + plugin) #3304

Draft
iscekic wants to merge 47 commits into main from worktree-attachments-backend

iscekic (Contributor) commented May 18, 2026

Summary

Adds user/bot attachments to kilo-chat conversations. Bytes live in Cloudflare R2; upload via presigned PUT, download via presigned GET; metadata in ConversationDO SQLite.

Implemented per plan docs/superpowers/plans/2026-05-18-attachments-backend.md (26 tasks, 9 phases).

Architecture

  • Worker mints short-lived S3-style presigned URLs via aws4fetch for direct client/plugin ↔ R2 transfer.
  • ConversationDO owns the attachment row lifecycle: pending → linked, sync-delete on message edit/delete and conversation destroy, 24h orphan-sweep alarm.
  • Plugin uploads/downloads through identical routes via the controller relay.
  • Bot status carries a new optional capabilities: Capability[] field so clients can gate the attachment picker.
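
The attachment row lifecycle described above can be pictured with a small in-memory model. This is only a sketch under assumptions, not the ConversationDO code: the `AttachmentStore` class, its method and field names, and the plain object standing in for SQLite and R2 are all hypothetical. Only the `pending`/`linked` states, the "linked rows only" read rule, and the 24h sweep window come from this description.

```typescript
type AttachmentStatus = "pending" | "linked";

interface AttachmentRow {
  id: string;
  status: AttachmentStatus;
  messageId: string | null;
  createdAt: number; // epoch milliseconds
}

const ORPHAN_TTL_MS = 24 * 60 * 60 * 1000; // 24h window from the PR description

// Hypothetical in-memory stand-in for the attachments table.
class AttachmentStore {
  private rows: Record<string, AttachmentRow> = {};

  // init: a row starts out pending, before any message references it.
  init(id: string, now: number): AttachmentRow {
    const row: AttachmentRow = { id, status: "pending", messageId: null, createdAt: now };
    this.rows[id] = row;
    return row;
  }

  // createMessage: pending -> linked. Returns false if the row is missing or already linked.
  link(id: string, messageId: string): boolean {
    const row = this.rows[id];
    if (!row || row.status !== "pending") return false;
    row.status = "linked";
    row.messageId = messageId;
    return true;
  }

  // Reads only ever see linked rows.
  getForRead(id: string): AttachmentRow | undefined {
    const row = this.rows[id];
    return row && row.status === "linked" ? row : undefined;
  }

  // Message delete: drop every row linked to the message, return the ids to purge from storage.
  deleteForMessage(messageId: string): string[] {
    const removed = Object.keys(this.rows).filter((id) => this.rows[id].messageId === messageId);
    removed.forEach((id) => delete this.rows[id]);
    return removed;
  }

  // Orphan sweep: drop rows that stayed pending for the full TTL.
  sweepOrphans(now: number): string[] {
    const orphans = Object.keys(this.rows).filter((id) => {
      const row = this.rows[id];
      return row.status === "pending" && now - row.createdAt >= ORPHAN_TTL_MS;
    });
    orphans.forEach((id) => delete this.rows[id]);
    return orphans;
  }
}
```

In this model an attachment that is initialized but never attached to a message stays invisible to readers and disappears at the next sweep after 24h.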

What's included

Shared schemas (@kilocode/kilo-chat):

  • capabilitySchema ('attachments').
  • attachmentBlockSchema joined into contentBlockSchema / inputContentBlockSchema.
  • messageCreatedWebhookSchema.attachments[] (max 10), text relaxed to allow empty when attachments present.
  • botStatusRequestSchema / botStatusRecordSchema / botStatusEventSchema accept optional capabilities.
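
The constraints these schemas express can be illustrated with a dependency-free validator. This is a stand-in sketch, not the shared schema code: `validateMessage`, `MessageInput`, and the `size` field name are hypothetical. The rules themselves (at most 10 attachments, empty text allowed only when attachments are present, non-empty filename, strictly positive size) are taken from this PR's description and commit notes.

```typescript
const MAX_ATTACHMENTS = 10; // cap stated for messageCreatedWebhookSchema.attachments[]

interface AttachmentInput {
  filename: string;
  size: number; // bytes; field name is an assumption
}

interface MessageInput {
  text: string;
  attachments: AttachmentInput[];
}

// Returns a list of human-readable problems; an empty list means the message is acceptable.
function validateMessage(msg: MessageInput): string[] {
  const problems: string[] = [];

  if (msg.attachments.length > MAX_ATTACHMENTS) {
    problems.push(`at most ${MAX_ATTACHMENTS} attachments are allowed`);
  }
  // Empty text is tolerated only when at least one attachment is present.
  if (msg.text.length === 0 && msg.attachments.length === 0) {
    problems.push("text may be empty only when attachments are present");
  }
  msg.attachments.forEach((attachment, index) => {
    if (attachment.filename.length === 0) {
      problems.push(`attachments[${index}].filename must be non-empty`);
    }
    if (!(attachment.size > 0)) {
      problems.push(`attachments[${index}].size must be positive`);
    }
  });

  return problems;
}
```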

services/kilo-chat worker:

  • MEDIA_BUCKET R2 binding + aws4fetch dependency.
  • New secrets store refs: R2_ACCESS_KEY_ID_KILOCHAT_MEDIA, R2_SECRET_ACCESS_KEY_KILOCHAT_MEDIA.
  • New R2_ACCOUNT_ID / R2_BUCKET_NAME / KEY_PREFIX vars. (Dev override: wrangler dev --var KEY_PREFIX:dev/.)
  • src/util/presigner.ts (mintPutUrl/mintGetUrl) and src/util/attachment-key.ts.
  • New attachments table on ConversationDO + migration; new capabilities column on bot_status + migration.
  • ConversationDO: initAttachment (30s idempotency), getAttachmentForRead, createMessage links rows, editMessage allows attachment subset, deleteMessage/destroy purge from R2, 24h orphan-sweep alarm.
  • Routes: POST /v1/attachments/init and GET /v1/attachments/:id/url?conversationId=…, mounted on both user (/v1/...) and bot (/bot/v1/sandboxes/:sandboxId/...) scopes.
  • Webhook buildPayload emits attachments; tolerates empty text when attachments are present.
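
For orientation, the URL that a presigner signs can be built as below. This is a sketch under assumptions and not the contents of src/util/presigner.ts: `buildPresignTarget` and `PresignTarget` are hypothetical names, the object key is treated as an opaque string, and the path-style R2 S3 endpoint is assumed. The actual signing step is left in a comment because it needs aws4fetch and real credentials.

```typescript
interface PresignTarget {
  accountId: string;
  bucket: string;
  key: string; // opaque object key; its layout is not defined here
  expiresSeconds: number;
}

const MAX_PRESIGN_SECONDS = 7 * 24 * 60 * 60; // SigV4 presigned URLs allow at most 7 days

// Builds the unsigned URL (path-style R2 S3 endpoint) with the requested lifetime.
function buildPresignTarget(target: PresignTarget): string {
  if (target.expiresSeconds < 1 || target.expiresSeconds > MAX_PRESIGN_SECONDS) {
    throw new Error("expiresSeconds must be between 1 and 604800");
  }
  const url = new URL(`https://${target.accountId}.r2.cloudflarestorage.com`);
  url.pathname = `/${target.bucket}/${target.key}`;
  url.searchParams.set("X-Amz-Expires", String(target.expiresSeconds));
  return url.toString();
}

// With aws4fetch (not imported in this sketch), signing would look roughly like:
//   const client = new AwsClient({ accessKeyId, secretAccessKey, service: "s3", region: "auto" });
//   const signed = await client.sign(new Request(unsignedUrl, { method: "PUT" }), {
//     aws: { signQuery: true }, // put the signature in the query string instead of headers
//   });
//   return signed.url; // hand this to the client or plugin for a direct PUT to R2
```

The same target with `method: "GET"` would yield a download URL, so one builder can serve both directions.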

services/kiloclaw/controller:

  • Relay routes for /attachments/init and /attachments/:id/url (preserves query string).

Plugin (services/kiloclaw/plugins/kilo-chat):

  • KiloChatClient.initAttachment + getAttachmentUrl.
  • PLUGIN_CAPABILITIES = ['attachments'] propagated through sendBotStatus calls (sendPresence, handleBotStatusRequest).
  • outbound.attachedResults.sendMedia (init → PUT to R2 → createMessage).
  • Inbound dispatchInbound downloads attachments and populates MediaPaths/MediaUrls/MediaTypes.
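
The outbound order of operations (init → PUT to R2 → createMessage) can be sketched with injected dependencies. Everything here is illustrative: the `MediaDeps` interface, the `OutboundFile` shape, and the argument lists are assumptions and do not reflect the real `KiloChatClient` signatures. Only the three-step ordering comes from the description above.

```typescript
interface InitResult {
  attachmentId: string;
  putUrl: string; // presigned PUT URL returned by the init route
}

interface OutboundFile {
  filename: string;
  mimeType: string;
  bytes: Uint8Array;
}

// Hypothetical seams so the flow can be exercised without a network.
interface MediaDeps {
  initAttachment(conversationId: string, file: OutboundFile): Promise<InitResult>;
  putBytes(url: string, file: OutboundFile): Promise<void>;
  createMessage(conversationId: string, attachmentIds: string[]): Promise<string>;
}

// Uploads one file and posts a message that references it. Returns the new message id.
async function sendMedia(deps: MediaDeps, conversationId: string, file: OutboundFile): Promise<string> {
  // 1. Reserve a pending attachment row and obtain the presigned PUT URL.
  const { attachmentId, putUrl } = await deps.initAttachment(conversationId, file);
  // 2. Send the bytes straight to object storage; the worker never sees them.
  await deps.putBytes(putUrl, file);
  // 3. Create the message, which links the pending row to it.
  return deps.createMessage(conversationId, [attachmentId]);
}
```

If step 2 or 3 fails, the row from step 1 stays pending and is left for the orphan sweep to remove.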

Mobile app:

  • apps/mobile renders attachment content blocks (paperclip icon + filename). Full UX (thumbnails, tap-to-download) deferred.

Tests

  • packages/kilo-chat: 84 tests passing (+18 new).
  • services/kilo-chat: 368 tests passing across 38 files (+~50 new). Clean exit, no leaked unhandled rejections.
  • services/kiloclaw/plugins/kilo-chat: 186 tests passing across 17 files (+16 new).
  • Workspace pnpm run typecheck: clean.
  • Workspace pnpm run lint: clean.
  • Web jest suite fails on PG connection in this worktree (pre-existing infra dependency on a running Postgres, unrelated to this change).

Operator action required before deploy

  1. wrangler r2 bucket create kilo-chat-media
  2. Create an R2 API token scoped to kilo-chat-media with Object Read + Write.
  3. Put R2_ACCESS_KEY_ID_KILOCHAT_MEDIA and R2_SECRET_ACCESS_KEY_KILOCHAT_MEDIA into the Secrets Store under store_id 342a86d9e3a94da698e82d0c6e2a36f0.
  4. (R2_ACCOUNT_ID and store_id are already set in wrangler.jsonc to match the existing kilo-chat values — no placeholders remain.)

Test plan

  • Bot upload: bot sends an image; R2 object lands at dev/attachments/<convId>/<botId>/...; human client renders attachment block.
  • Human upload: curl POST /v1/attachments/init → PUT to returned URL → POST /v1/messages with attachment block. Plugin webhook delivers attachments[]; files land under sandbox media/inbound/; agent responds based on attachment content.
  • Edit drops attachment → R2 object is gone (wrangler r2 object get → 404).
  • Delete message → R2 objects gone.
  • Orphan sweep: init without create → row gone after 24h alarm fires.
  • Bot status capabilities: bot.status event includes capabilities: ['attachments'] in client DevTools.
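
On the client side, gating the attachment picker on the capabilities field could look like the sketch below. `canAttach` and the trimmed-down `BotStatus` type are hypothetical; the optional `capabilities` array and the `'attachments'` value are from this PR.

```typescript
type Capability = "attachments";

interface BotStatus {
  capabilities?: Capability[]; // optional, so older bots simply omit it
}

// True only when the bot has explicitly advertised attachment support.
function canAttach(botStatus: BotStatus | undefined): boolean {
  if (!botStatus || !botStatus.capabilities) return false;
  return botStatus.capabilities.indexOf("attachments") !== -1;
}
```

Treating a missing field as "not supported" keeps the picker hidden for bots that predate this change.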

Notes

  • The GET /attachments/:id/url route takes ?conversationId=<ULID> to avoid a separate global attachment-id index DO.
  • editMessage may only drop existing attachments, never add new ones (matches spec).
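
The drop-only rule for edits amounts to a subset check on attachment ids. A minimal sketch, assuming attachments are compared by id; `checkEditedAttachments` and the `EditCheck` type are hypothetical names, not the DO's API.

```typescript
type EditCheck =
  | { ok: true; removed: string[] } // ids to unlink and purge from storage
  | { ok: false; added: string[] }; // ids the edit tried to introduce

// An edit is valid only if its attachment ids are a subset of the existing ones.
function checkEditedAttachments(existingIds: string[], editedIds: string[]): EditCheck {
  const added = editedIds.filter((id) => existingIds.indexOf(id) === -1);
  if (added.length > 0) {
    return { ok: false, added };
  }
  const removed = existingIds.filter((id) => editedIds.indexOf(id) === -1);
  return { ok: true, removed };
}
```

The `removed` list is what a caller would then delete from the bucket, which matches the "edit drops attachment → R2 object is gone" item in the test plan.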

iscekic added 30 commits May 18, 2026 14:48
…rets

Operator action required pre-deploy:
- create R2 bucket kilo-chat-media
- create bucket-scoped R2 API token
- put R2_ACCESS_KEY_ID_KILOCHAT_MEDIA + R2_SECRET_ACCESS_KEY_KILOCHAT_MEDIA into Secrets Store

Spec note: route takes ?conversationId=<ULID> query param to avoid a global
attachment-id index DO.
iscekic self-assigned this May 18, 2026
iscekic added 17 commits May 18, 2026 18:06
Annotate the two secrets-store bindings (R2_ACCESS_KEY_ID_KILOCHAT_MEDIA,
R2_SECRET_ACCESS_KEY_KILOCHAT_MEDIA) with @from hints so dev/local/env-sync
auto-populates the local secrets store from .env.local on first run.

Without atomicity, a crash between the message INSERT and the attachment
UPDATE loop leaves messages referencing attachments that remain 'pending'.
getAttachmentForRead only returns 'linked' rows, so such attachments
become permanently orphaned but still referenced by the committed message.

Wrapping both operations in a single db.transaction() ensures either the
message is created and all attachments are linked, or neither happens.

Regression test for the mid-flight crash path is omitted: reproducing it
would require monkey-patching the Drizzle tx object mid-iteration, which
is invasive and fragile. The structural fix is the important part.

Previously scheduleOrphanSweepIfNeeded and the re-arm in
sweepOrphanAttachments only set the alarm when none existed or the existing
one was already past. A stale orphan-sweep alarm (further out than 24h)
would never be corrected. Add existing > target to both conditions so a
far-future orphan-sweep alarm is pulled in to the ~24h window.
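
The corrected re-arm condition can be written as a pure function. This is a sketch of the rule as described in this commit message, not the DO code: `nextOrphanSweepAlarm` is a hypothetical name, and times are assumed to be epoch milliseconds as returned by the alarm API.

```typescript
const SWEEP_DELAY_MS = 24 * 60 * 60 * 1000; // ~24h window

// Returns the timestamp to set the alarm to, or null to leave the existing alarm alone.
function nextOrphanSweepAlarm(existing: number | null, now: number): number | null {
  const target = now + SWEEP_DELAY_MS;
  const shouldSet =
    existing === null || // no alarm scheduled
    existing <= now || // alarm already in the past
    existing > target; // stale alarm further out than the 24h window (the new clause)
  return shouldSet ? target : null;
}
```

An alarm that already falls inside the window is kept, so repeated uploads do not keep pushing the sweep further out.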
Add foreign keys for message_id → messages.id and uploader_id → members.id
on the attachments table, matching the style of botMessageNotifications.
Regenerate migration 0002 to include the FK clauses in the CREATE TABLE DDL.

…bhook

Adds .min(1) to filename in attachmentBlockSchema and the inline
attachment object in messageCreatedWebhookSchema, mirrored in the
kiloclaw plugin synced copies, closing the contract gap with
attachmentInitRequestSchema which already enforced non-empty.

Add tests asserting that a bot authenticated for sandbox-B is rejected
with 403 when it references a conversationId created under sandbox-A
membership, for both the init and geturl attachment routes.

Use R2_ACCESS_KEY_ID and R2_SECRET_ACCESS_KEY (the worker binding names)
instead of the Secrets Store secret_name keys, with comments naming the
upstream secret and store_id for reference.

The attachmentInitRequestSchema requires `.positive()` since a zero-byte
upload makes no sense, but the downstream `attachmentBlockSchema` and the
message.created webhook accepted `.nonnegative()`. Align both with the
init schema so a stored attachment can never claim zero bytes.

… ctx

The presigned R2 GET URLs we packed into `MediaUrl`/`MediaUrls` on the
inbound context payload expire after 1 hour and no downstream code in
kiloclaw consumes them — the agent runner reads `MediaPath` for the
local file copy. Removing the fields stops persisting stale URLs into
session ctx for no benefit.

Was duplicated across the init request schema, ConversationDO, the
plugin channel, and the plugin webhook dispatcher. Export it from
@kilocode/kilo-chat (and the plugin's synced mirror) so the cap stays
in lockstep if it ever changes.

… test helper

The DO exposed a public 'bootstrapConversation' RPC method that was
documented as test-only but was reachable from any caller with a stub. Move
the equivalent logic into a 'bootstrapConversationForTest' helper in the test
support file so production code only ships 'initialize'.

…ttachmentForRead

Both DO methods now return discriminated-union results matching the
createMessage pattern:

  { ok: true, ... } | { ok: false, code: 'forbidden' | 'invalid', error }

The route handlers map the codes to 4xx responses directly instead of
regex-matching error messages, which was brittle and would break the
moment an error string ever changed.

Test consumers use a new unwrap() helper that asserts the success branch
or throws.
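
The result pattern and the test helper can be sketched as follows. This is illustrative only: the real results spread their success fields next to `ok`, whereas this sketch nests them under a hypothetical `value` key to keep the generic simple. The mapping of `'invalid'` to 400 is an assumption (the commit only says "4xx"); `'forbidden'` to 403 matches the sandbox test described earlier.

```typescript
type ErrorCode = "forbidden" | "invalid";

type DoResult<T> =
  | { ok: true; value: T } // simplified; the real shape is { ok: true, ... }
  | { ok: false; code: ErrorCode; error: string };

// Route-side mapping from error code to HTTP status, with no string matching involved.
function statusFor(code: ErrorCode): number {
  return code === "forbidden" ? 403 : 400; // 400 for 'invalid' is an assumption
}

// Test-side helper: returns the success payload or throws with the code and message.
function unwrap<T>(result: DoResult<T>): T {
  if (result.ok === false) {
    throw new Error(`${result.code}: ${result.error}`);
  }
  return result.value;
}
```

Because the code is a closed union, adding a new failure kind forces every `statusFor`-style mapping to be revisited at compile time rather than silently falling through a regex.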