Feat/2470b agent attachments #69

Merged
Skobeltsyn merged 2 commits into main from feat/2470b-agent-attachments on May 30, 2026
No description provided.

Skobeltsyn and others added 2 commits May 30, 2026 13:39
Slice b of #2470 — typed `Content.Image` attachments at the agent
invoke surface, with the runtime dereferencing refs against an
injected BlobStore. Slice a (commit cea17c0) wired the per-provider
wire format on `LlmMessage.images`; this commit puts a clean user-
facing API on top so authors don't base64-encode by hand.

```kotlin
val store = FileBlobStore(Path.of("snapshots/blobs"))
val agent = agent<String, String>("vision") {
    model { ollama("qwen3-vl:8b") }
    blobStore(store)
    skills { skill<String, String>("describe", "") { tools() } }
}

val ref = store.put(pngBytes, ImageMime.Png.wireMime)
agent.invokeWithAttachments(
    "What is in this image?",
    attachments = listOf(Content.Image(ref, ImageMime.Png)),
)
```

Implementation:

- `Agent.blobStore: BlobStore?` + `fun blobStore(store)` DSL setter
  mirroring the existing memoryBank pattern. Optional injection;
  null when the agent doesn't accept image attachments.
- `Agent.invokeSuspendWithAttachments(input, attachments)` + blocking
  shim `Agent.invokeWithAttachments(input, attachments)`. Threads
  through `invokeSuspendForSession(attachments = ...)` →
  `executeAgentic(attachments = ...)`.
- `executeAgentic` gains `attachments: List<Content>? = null`. On the
  FIRST user-message build, when attachments are non-null:
  * Requires `agent.blobStore` to be configured — else fail-fast with
    `Agent '<name>' has attachments but no blobStore` (caller
    misconfiguration, surfaced before any provider HTTP).
  * For each `Content.Image(ref, mime)`: read bytes from the store
    (fails fast with the ref's hash prefix when the blob is missing,
    which helps forensics when a snapshot resumes against a
    partially-purged store), base64-encode once, map `ImageMime` →
    `ImagePart.WireMime` via the closed sealed types (no String
    conversion).
  * Non-image content variants (Text/Document/Audio/Video) are
    silently skipped in v1. Slice c covers Document/Audio/Video
    via provider doc/audio/video adapters.
  * Empty list / all-non-image list → `images = null` (not empty
    list) so no provider sees an empty array.
- Ignored on resume — the snapshot's restored conversation already
  carries the original attachments on the saved user turn.
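
The dereference rules above can be sketched in isolation. This is a minimal
illustration, not the code in this commit: every type below is a hypothetical
stand-in, the `Jpeg`/`Gif`/`Webp` variants and the wire strings are assumed
(the commit only names `Png`), the missing-blob message wording is invented,
and checking the blobStore before the empty-list short-circuit is an assumed
ordering.

```kotlin
import java.security.MessageDigest
import java.util.Base64

// Hypothetical stand-ins for the real types; names and shapes are assumptions.
data class BlobRef(val sha256: String)

enum class ImageMime { Png, Jpeg, Gif, Webp }

enum class WireMime(val value: String) {
    PNG("image/png"), JPEG("image/jpeg"), GIF("image/gif"), WEBP("image/webp")
}

sealed interface Content {
    data class Text(val text: String) : Content
    data class Image(val ref: BlobRef, val mime: ImageMime) : Content
    data class Document(val ref: BlobRef) : Content
}

data class ImagePart(val base64: String, val mime: WireMime)

// Minimal content-addressed store: the ref is the SHA-256 hex of the bytes.
class InMemoryBlobStore {
    private val blobs = mutableMapOf<String, ByteArray>()

    fun put(bytes: ByteArray): BlobRef {
        val digest = MessageDigest.getInstance("SHA-256").digest(bytes)
        val hex = digest.joinToString("") { "%02x".format(it) }
        blobs[hex] = bytes
        return BlobRef(hex)
    }

    fun get(ref: BlobRef): ByteArray? = blobs[ref.sha256]
}

// Exhaustive `when` over a closed type: a new variant fails to compile here,
// and no mime String is parsed or compared along the way.
fun ImageMime.toWire(): WireMime = when (this) {
    ImageMime.Png -> WireMime.PNG
    ImageMime.Jpeg -> WireMime.JPEG
    ImageMime.Gif -> WireMime.GIF
    ImageMime.Webp -> WireMime.WEBP
}

fun resolveImages(
    agentName: String,
    store: InMemoryBlobStore?,
    attachments: List<Content>?,
): List<ImagePart>? {
    if (attachments == null) return null
    // Caller misconfiguration: fail before any provider call is made.
    val blobStore = store ?: error("Agent '$agentName' has attachments but no blobStore")
    val parts = attachments
        .filterIsInstance<Content.Image>() // non-image variants are skipped
        .map { image ->
            val bytes = blobStore.get(image.ref)
                ?: error("Blob ${image.ref.sha256.take(12)} not found in blobStore")
            ImagePart(Base64.getEncoder().encodeToString(bytes), image.mime.toWire())
        }
    // Empty or all-non-image input yields null, never an empty list.
    return parts.takeIf { it.isNotEmpty() }
}
```

Returning `null` rather than an empty list keeps the "no images" case identical
to the legacy `invokeSuspend` path from the provider adapters' point of view.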

Composition with existing surfaces:
- Slice a (the wire format) is unchanged — this commit just routes
  the dereferenced ImagePart into `LlmMessage.images` on the first
  user message instead of requiring the caller to do it.
- Snapshot/resume (#2386/#2754): refs live in the snapshot, so the
  agent's blobStore must be able to dereference the same refs on
  resume. Nothing is re-dereferenced from `attachments`, which is
  ignored on resume.
- Audit (#1914, #2469) — refs already in `outputParts` for tool
  returns; this commit's attachment path is "into the model", not
  "out via a tool" — no audit-row shape change.

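The resume rule can be pictured with a small sketch. It is an assumption-laden
illustration, not the runtime's code: `LlmMessage` here is a cut-down stand-in
with only the fields needed, `initialConversation` is a hypothetical helper,
and how the real runtime treats `input` on resume is not described in this
commit (the sketch simply returns the restored turns untouched).

```kotlin
// Hypothetical stand-in for the real message type; only the fields needed here.
data class LlmMessage(
    val role: String,
    val text: String,
    val images: List<String>? = null,
)

// Fresh run: build the first user message and attach the resolved images.
// Resume: return the restored turns as saved; `images` is deliberately unused
// because the saved user turn already carries the original attachments.
fun initialConversation(
    restored: List<LlmMessage>?,
    input: String,
    images: List<String>?,
): List<LlmMessage> =
    restored ?: listOf(LlmMessage(role = "user", text = input, images = images))
```
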
Tests (AgentAttachmentsTest.kt — 8 unit cases):
- Content.Image dereferences to ImagePart on first user message
- Closed ImageMime → ImagePart.WireMime mapping for all four variants
- Multiple images compose in order; non-image variants skipped
- Attachments without blobStore fail fast with a clear message
- Missing-blob ref fails fast with hash context
- Legacy invokeSuspend (no attachments) wire shape unchanged
- Empty attachments list short-circuits to null images
- All-non-image attachments → null images (not empty array)

Live tests (AgentVisionLiveTest.kt — 6 cases):
- Same fixtures (VisionFixtures.threeSquaresPng + housePng) and same
  cost discipline as VisionLiveTest (slice a), but going through
  the agent surface rather than the raw ModelClient. Verifies the
  full deref + base64 + LlmMessage.images + provider-wire path
  end-to-end on Ollama qwen3-vl:8b (live-llm), Claude Haiku 4.5
  (live-cloud-api), OpenAI gpt-4o-mini (live-cloud-api). Model
  names overridable via env. `assumeTrue` skips per provider when
  no key / no Ollama.
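
A rough sketch of the env override and per-provider skip, assuming plain
helper functions over an env map. Both helpers and the env var names are
hypothetical and chosen only for illustration; the live tests may wire this
differently.

```kotlin
// Hypothetical helpers; the env var names used with them are illustrative only.

// Returns the override when the variable is set and non-blank, else the default.
fun modelName(env: Map<String, String>, overrideVar: String, default: String): String =
    env[overrideVar]?.takeIf { it.isNotBlank() } ?: default

// True only when the key variable is present and non-blank.
fun hasKey(env: Map<String, String>, keyVar: String): Boolean =
    !env[keyVar].isNullOrBlank()

// In a JUnit 5 test the result would feed the skip, for example:
//   assumeTrue(hasKey(System.getenv(), "OPENAI_API_KEY"))
//   val model = modelName(System.getenv(), "VISION_OPENAI_MODEL", "gpt-4o-mini")
```

Taking the env as a `Map` keeps the helpers testable without touching the
process environment.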

Full suite: 1818 tests, 0 failures.

To run live: `./gradlew integrationTest --tests "*AgentVisionLiveTest*"`
(Ollama path) and `./gradlew test --tests "*AgentVisionLiveTest*"`
(Claude + OpenAI paths).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ELOG

- docs/multimodal.md — new "Agent attachments — typed Content.Image at
  the invoke surface (#2470 slice b)" section between the slice-a
  wire-format section and the "What's still coming" list. Walks
  through the BlobStore injection, the invokeWithAttachments /
  invokeSuspendWithAttachments split, the runtime guarantees (first user
  message only, closed mime mapping, fail-fast on misconfig, skip
  non-image variants, snapshot/resume composition), and the live test
  layout (companion to slice a's VisionLiveTest). "What's still
  coming" list updated — slice b removed, slice c added (provider
  doc/audio/video paths).
- README.md — new "Typed Content.Image at the agent surface" bullet
  right after the slice-a "Vision input to models" bullet. Names the
  composition guarantees and the live-test coverage.
- CHANGELOG.md `## [Unreleased]` — new "Typed agent attachments
  (#2470 slice b)" section ABOVE the existing slice-a section. Covers
  the invokeWithAttachments API, BlobStore DSL, closed mime mapping,
  fail-fast errors, the non-image skip behaviour, the empty-list edge
  case, and resume composition.

No source changes. Full suite stays at 1818 / 0 failures from the
prior commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Skobeltsyn merged commit a3e7037 into main on May 30, 2026
3 checks passed