feat: image support across wiki, MCP, HTTP, web UI, and sync #53

Merged
aniongithub merged 10 commits into main from feat/image-support
May 25, 2026

@aniongithub
Owner

Summary

Adds first-class image support to mind-map. Agents and humans can now upload images to the wiki, embed them with standard markdown ![](path) syntax, and have them render in the web UI. Images travel through git sync alongside markdown pages. Vision-capable agents can opt into receiving image bytes inline via MCP.

The design was discussed in projects/mind-map/design/image-support (in the mind-map wiki). This PR implements that design across 6 ordered slices plus an end-to-end visual harness.

What's new

For agents (MCP)

Three new tools and two new flags on an existing tool:

  • upload_image(page, name, content_base64) — writes binary content into the page's sidecar directory (<page>.assets/<name>), returns the markdown-ready path. Auto-suffixes on filename collision, magic-byte sniffs against the browser-renderable image set (PNG, JPEG, GIF, WebP, AVIF, SVG, BMP, ICO), and caps size at a configurable per-deployment limit (default 10 MB).
  • download_image(path) — returns the asset as an MCP ImageContent block so vision-capable agents see it inline.
  • delete_image(path) — removes an asset and its index rows. Useful for tooling that wants canonical filenames across re-runs.
  • get_page (and search_pages) gain include_images and include_image_metadata flags. Default off — image work is opt-in to keep token cost predictable. Operators can force both off via a server-level kill switch for token-constrained deployments.
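
As an illustration of the magic-byte check described for upload_image, here is a minimal Go sketch built on the standard library's `http.DetectContentType`. The names `allowed` and `sniffImage` are hypothetical, and this is not necessarily how the PR implements it. The standard sniffer does not recognize AVIF or SVG, so those two formats from the accepted set would need additional checks that are not shown.

```go
package main

import (
	"fmt"
	"net/http"
)

// allowed lists the MIME types that http.DetectContentType can identify from
// leading bytes. AVIF and SVG are deliberately absent: the standard sniffer
// does not detect them, so a real implementation needs extra checks.
var allowed = map[string]bool{
	"image/png":    true,
	"image/jpeg":   true,
	"image/gif":    true,
	"image/webp":   true,
	"image/bmp":    true,
	"image/x-icon": true,
}

// sniffImage returns the detected MIME type and whether it is an accepted image.
func sniffImage(data []byte) (string, bool) {
	mime := http.DetectContentType(data)
	return mime, allowed[mime]
}

func main() {
	png := []byte("\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR")
	mime, ok := sniffImage(png)
	fmt.Println(mime, ok)

	mime, ok = sniffImage([]byte("just some text"))
	fmt.Println(mime, ok)
}
```

Sniffing the bytes rather than trusting the file extension means a renamed text file is rejected even if it is called `photo.png`.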

For humans (Web UI + HTTP)

  • POST /api/assets (JSON or multipart) and DELETE /api/assets/<path> for upload/delete from the browser or curl.
  • GET /assets/<path> static handler serves the bytes via http.ServeContent (conditional GET, byte ranges, the works).
  • SVG responses get a strict Content-Security-Policy (default-src 'none'; sandbox) to neutralize script injection from hand-crafted SVG payloads.
  • The web UI's markdown renderer rewrites wiki-local image references to /assets/... before marked parses them, so <img src> Just Works. External URLs and anchor-only refs are left alone.

For sync

  • *.assets/* files now travel between wiki and shadow clone alongside *.md. Delete-mirroring works for both kinds.
  • register_sync gains an lfs flag and an lfs_patterns list. When enabled, the synced clone is configured for git-lfs and a managed .gitattributes is written. Useful for plain repos hosting large binary assets; should be left off for GitHub wikis (LFS unsupported there).

Storage model

  • Each page's images live in a sidecar directory: page foo/bar → assets at foo/bar.assets/. The relative path that appears in markdown is also the filesystem path on disk; the same string is the target in the link index.
  • A new kind column on the links table distinguishes 'link' (wikilinks) from 'image'. Lifecycle (delete, move, GC) is driven by index queries — no parallel bookkeeping.
  • DeletePage cascades: any asset in the page's sidecar with no remaining kind='image' referencer is removed. Shared assets (referenced from another page) stay in place.
  • MovePage splits assets on the index: exclusive ones travel with the page (in-body paths are rewritten); shared ones stay in the original sidecar and the moved page's body keeps its original references (which still resolve).

Migrations

  • A first-class migration runner backs the schema change. State tracked under wiki_state.schema_version. Migration 1 adds the links.kind column via PRAGMA table_info-guarded ALTER TABLE, idempotent and back-compat with pre-existing databases.
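
A rough sketch of a guarded migration of this kind, using only `database/sql`. Several things here are assumptions: the names `migration`, `pending`, `hasColumn`, and `addKindColumn`, and the column definition `TEXT NOT NULL DEFAULT 'link'`. The PR only states that a kind column is added via a PRAGMA table_info-guarded ALTER TABLE. A SQLite driver would have to be registered before these functions could run against a real database.

```go
package main

import (
	"database/sql"
	"fmt"
)

// migration pairs a schema version with the step that reaches it.
type migration struct {
	version int
	apply   func(tx *sql.Tx) error
}

// pending returns the migrations whose version is above the stored one.
func pending(current int, all []migration) []migration {
	var out []migration
	for _, m := range all {
		if m.version > current {
			out = append(out, m)
		}
	}
	return out
}

// hasColumn probes SQLite's PRAGMA table_info for a column name.
// The table name is interpolated, so it must come from trusted code.
func hasColumn(tx *sql.Tx, table, column string) (bool, error) {
	rows, err := tx.Query("PRAGMA table_info(" + table + ")")
	if err != nil {
		return false, err
	}
	defer rows.Close()
	for rows.Next() {
		var cid, notNull, pk int
		var name, typ string
		var dflt sql.NullString
		if err := rows.Scan(&cid, &name, &typ, &notNull, &dflt, &pk); err != nil {
			return false, err
		}
		if name == column {
			return true, nil
		}
	}
	return false, rows.Err()
}

// addKindColumn is idempotent: it does nothing when the column already exists.
// The column definition is an assumption made for this sketch.
func addKindColumn(tx *sql.Tx) error {
	ok, err := hasColumn(tx, "links", "kind")
	if err != nil || ok {
		return err
	}
	_, err = tx.Exec("ALTER TABLE links ADD COLUMN kind TEXT NOT NULL DEFAULT 'link'")
	return err
}

func main() {
	all := []migration{{version: 1, apply: addKindColumn}}
	fmt.Println("fresh database, steps to run:", len(pending(0, all)))
	fmt.Println("already at version 1, steps to run:", len(pending(1, all)))
}
```

Guarding on the column's presence as well as on the stored version keeps the step safe for databases that gained the column by some other route.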

Commits, in review order

| # | Commit | What it does |
|---|--------|--------------|
| 1 | f05f5a1 feat(wiki): index image references with kind='image' | Migration runner + parser walks goldmark AST for ast.Image + indexer writes kind='image' rows. Existing wikilink queries filtered to kind='link'. No user-visible behavior change yet. |
| 2 | 6cce167 feat(wiki): asset CRUD with sidecar storage and lifecycle cascades | UploadAsset / ReadAsset / StatAsset / DeleteAsset. Magic-byte sniff, collision auto-suffix (case-insensitive), traversal guards. DeletePage cascades. MovePage splits exclusive vs shared assets via index queries. |
| 3 | 043dca2 feat(mcp): upload_image, download_image, and image read flags | MCP tools + the include_images / include_image_metadata flags on get_page + operator kill switch. |
| 4 | 95c120d feat(httpapi): asset upload + static serving with SVG CSP | POST /api/assets, GET /assets/<path>, SVG-specific strict CSP. |
| 5 | 9e039a7 feat(webui): rewrite wiki-local image refs to /assets/ and style them | renderMarkdown prefixes local image refs with /assets/; external URLs untouched. .markdown img CSS for max-width + framed look. |
| 6 | c2f29b1 feat(sync): carry sidecar assets and add optional git-lfs | syncableRel predicate, register_sync LFS flag, ensureLFSConfig writes managed .gitattributes. |
| 7 | d10000f test(image-support): add Playwright-based end-to-end visual harness | First version of tools/screenshot/. |
| 8 | 6c6ab4f chore(devcontainer): pick host port at initialize time, not hardcoded | .devcontainer/initializeCommand.sh picks a free host port, runArgs: --network host --env-file ports.env. No more port collision when multiple worktrees run side by side. Wiki page preferences/devcontainer-ports documents the pattern. |
| 9 | cc34fd2 feat(images): delete API + demo capture harness with theme/sort/search shots | DELETE /api/assets/<path> + delete_image MCP tool. Rewritten tools/screenshot/capture.mjs with per-capture compose functions, idempotent re-runs, 9 demo captures covering home graph (light + dark), page detail, three sidebar sort modes, sidebar + in-page search, and the settings modal. Devcontainer switched to the schlich/playwright feature. |

End-to-end verification

The demo harness was run against the in-container server. All 9 captures uploaded, byte-verified via the static handler, and embedded into 5 architecture pages under managed sentinel blocks. verify.mjs (Playwright + DOM inspection) confirmed <img> rendering at naturalWidth > 0, complete=true with 200 image/png responses from /assets/....

The captures have been pushed to the public wiki at https://github.com/aniongithub/mind-map/wiki — see the architecture/ pages for the new screenshots.

Test coverage

New tests in this PR, by package (go test ./... passes on every commit):

  • internal/wiki: image indexing, dedup, reindex cleanup, migration idempotence, migration from legacy schema, upload + collision + sanitize + size cap + SVG accept + non-image reject, read/stat round-trip, traversal rejection, delete cascade for exclusive vs shared assets, move relocate vs shared-leave, delete-asset happy path / not-found / traversal / empty-sidecar sweep
  • internal/mcp: upload + download + include_images + include_image_metadata + force-off + reject-non-image
  • internal/httpapi: JSON upload + multipart upload + reject-non-image + serve round-trip + SVG CSP header + 404 + traversal rejection
  • internal/sync: syncableRel predicate happy paths and substring traps, end-to-end sync of page + sidecar PNG to a bare git remote, LFS settings persist + reflect in syncTarget, back-compat RegisterMapping leaves LFS off

Known limitations / follow-ups

  • Sync push hardcodes the main branch, but GitHub wikis use master. Filed as #52 ("sync push hardcodes 'main' branch; GitHub wiki remotes use 'master'"). Worked around manually for this PR's demo push; doesn't block merge.
  • Devcontainer credential sharing doesn't work in a git worktree — VS Code's credential injection logic doesn't kick in when the workspace's .git file points at a path outside the container's mounts. Push-from-host is the workaround. Worth a separate issue if anyone cares; mostly relevant for image-support contributors and the harness.
  • In-page search highlight on initial load: ?q=foo in the URL hash doesn't always trigger the SPA's highlight on first paint; user-initiated fillSearch always works. Minor UX bug, captured implicitly by the harness's fillSearch workaround.
  • tools/screenshot/capture.mjs accumulates captured/ files locally, gitignored; nothing to clean up between runs.

Branch diff at a glance

~3,500 LOC added across 19 files (image-support work only, excluding pre-existing commits):

  • internal/wiki/ — assets.go, migrate.go, parse.go, pages.go, index.go, plus tests
  • internal/mcp/ — images.go, server.go, plus tests
  • internal/httpapi/ — images.go, server.go, plus tests
  • internal/sync/ — sync.go, plus tests
  • internal/config/: SyncMapping.LFS + LFSPatterns fields, helpers
  • webui/src/App.tsx URL rewrite, styles.css .markdown img
  • .devcontainer/ — initializeCommand.sh, devcontainer.json (port pattern + playwright feature), Dockerfile (Chromium deps now via feature)
  • .vscode/ — launch.json, tasks.json read host port via ports.env
  • tools/screenshot/ — Playwright harness with 9 demo captures

🤖 Generated with opencode

Co-Authored-By: opencode noreply@opencode.ai

feat(wiki): index image references with kind='image'

Lay the groundwork for image support by tracking ![](path) references
in the existing links table, distinguished from wikilinks by a new
kind column. Every lifecycle question for an asset can now be answered
with an index query, no parallel bookkeeping.

- Add a migration runner keyed on wiki_state.schema_version; first
  migration adds the kind column to links via ALTER TABLE. Probe with
  PRAGMA table_info so the run is idempotent.
- Switch parsePage to walk the goldmark AST for ast.Image nodes;
  wikilink extraction (still string-scan, since [[..]] is non-standard)
  is unchanged. External URLs and anchor-only refs are skipped.
- Insert image refs alongside wikilinks in indexPage and Reindex with
  kind='image'.
- Constrain getLinks, getBacklinks, and AllLinks to kind='link' so
  existing callers see the same page-edge surface as before.
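
To make the "external URLs and anchor-only refs are skipped" rule concrete, here is a small Go sketch of such a predicate. The name `isLocalImageRef` is hypothetical, and treating protocol-relative `//host/...` destinations as external is an assumption of this sketch; the commit does not say how those are handled.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// isLocalImageRef reports whether an image destination points into the wiki
// rather than at an external resource or an in-page anchor.
func isLocalImageRef(dest string) bool {
	dest = strings.TrimSpace(dest)
	if dest == "" || strings.HasPrefix(dest, "#") {
		return false // empty or anchor-only
	}
	u, err := url.Parse(dest)
	if err != nil {
		return false
	}
	// Any scheme (http, https, data, mailto, ...) marks the ref as external.
	// Rejecting a non-empty host covers "//cdn.example/x.png" (assumption).
	return u.Scheme == "" && u.Host == ""
}

func main() {
	for _, d := range []string{
		"foo/bar.assets/diagram.png",
		"https://example.com/logo.png",
		"data:image/png;base64,AAAA",
		"#section",
	} {
		fmt.Printf("%-32s local=%v\n", d, isLocalImageRef(d))
	}
}
```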

Tests cover: image refs indexed and distinguished by kind, dedup on
repeat references, reindex drops stale rows, migration is idempotent
across reopens, migration from a legacy (pre-kind) schema.

feat(wiki): asset CRUD with sidecar storage and lifecycle cascades

UploadAsset writes binary content into a per-page sidecar directory
(<page>.assets/<name>), sniffs magic bytes against the browser-renderable
image set (PNG/JPEG/GIF/WebP/AVIF/SVG/BMP/ICO), auto-suffixes on name
collision (case-insensitive), and caps size at Wiki.MaxAssetBytes
(default 10MB).
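
The case-insensitive collision auto-suffix might look roughly like the following. `uniqueName` is a hypothetical helper; the `-1`, `-2` suffix placed before the extension follows the filenames mentioned elsewhere in this PR (for example page-detail-1.png), but the exact algorithm is an assumption.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// uniqueName returns name unchanged when it is free; otherwise it inserts
// -1, -2, ... before the extension until no existing entry matches.
// Comparison ignores case, so "Home.PNG" collides with "home.png".
func uniqueName(existing []string, name string) string {
	taken := make(map[string]bool, len(existing))
	for _, e := range existing {
		taken[strings.ToLower(e)] = true
	}
	if !taken[strings.ToLower(name)] {
		return name
	}
	ext := filepath.Ext(name)
	stem := strings.TrimSuffix(name, ext)
	for i := 1; ; i++ {
		candidate := fmt.Sprintf("%s-%d%s", stem, i, ext)
		if !taken[strings.ToLower(candidate)] {
			return candidate
		}
	}
}

func main() {
	fmt.Println(uniqueName(nil, "home.png"))
	fmt.Println(uniqueName([]string{"Home.PNG"}, "home.png"))
	fmt.Println(uniqueName([]string{"home.png", "home-1.png"}, "home.png"))
}
```

Comparing case-insensitively avoids two assets that differ only in case, which would clash on case-insensitive filesystems.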

ReadAsset and StatAsset round-trip bytes + MIME, validating paths
against the wiki root to prevent traversal.

DeletePage now sweeps the sidecar against the link index after dropping
the page's own rows: any asset with no remaining kind='image'
referencer is deleted, and the sidecar dir is removed if empty. Shared
assets (referenced from other pages) are kept in place.

MovePage uses splitSidecarOnMove to decide per-asset what travels:
exclusive assets are renamed alongside the page and in-body references
are rewritten to the new sidecar path; shared assets stay in the
original sidecar and the moved page's body keeps pointing at the old
path (which still resolves). gcSidecarAssets cleans up afterward.

Tests cover: upload + collision suffix (incl. case-insensitive), SVG
acceptance, non-image rejection, size cap, filename sanitization,
read/stat round-trip, traversal rejection, delete cascade for
exclusive assets, delete preserving shared assets, move relocating
exclusive assets with body rewrite, move leaving shared assets behind
with body unchanged.

feat(mcp): upload_image, download_image, and image read flags

Three additions to the MCP tool surface:

- upload_image: agent uploads base64 image bytes to a page's sidecar,
  receives the markdown-ready path + URL + size + mime. Embedding the
  ![]() reference is the agent's job (via update_page / edit_page);
  the design defers the convenience insert_image tool until v2 to
  keep tool responsibilities clean.

- download_image: returns mcp.ImageContent so vision-capable agents
  see the asset inline.

- get_page gains include_images and include_image_metadata flags.
  Default off — image work is opt-in to keep token cost predictable.
  include_images attaches the actual bytes as MCP ImageContent blocks
  after the text payload; include_image_metadata embeds {path,size,
  mime} entries in the JSON body without the bytes. Both can be
  globally overridden by Server.SetForceImagesOff for token-
  constrained deployments; when forced, the response includes
  images_forced_off=true so callers don't reason as if they got what
  they asked for.
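
The interaction between the per-request flags and the operator override can be sketched as below. The type `imageFlags` and the function `effectiveFlags` are hypothetical. This sketch reports the forced-off condition only when the caller actually requested something that was suppressed, which is one possible reading of the behavior described above.

```go
package main

import "fmt"

// imageFlags mirrors the two opt-in flags on get_page.
type imageFlags struct {
	IncludeImages        bool
	IncludeImageMetadata bool
}

// effectiveFlags applies the operator kill switch. forcedOff is true when the
// switch suppressed something the caller asked for, so the response can carry
// images_forced_off=true.
func effectiveFlags(req imageFlags, forceOff bool) (eff imageFlags, forcedOff bool) {
	if !forceOff {
		return req, false
	}
	return imageFlags{}, req.IncludeImages || req.IncludeImageMetadata
}

func main() {
	req := imageFlags{IncludeImages: true}
	eff, forced := effectiveFlags(req, true)
	fmt.Printf("effective=%+v images_forced_off=%v\n", eff, forced)
}
```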

Wiki.ImageRefsForPage is exposed (it was already used internally by
MovePage as imageRefsFor) so the MCP layer can enumerate a page's
image references without re-parsing the body.

Tests cover: upload happy path, download returning ImageContent,
include_image_metadata embedding entries, include_images returning
both text + image blocks, non-image rejection (via result.IsError),
force-off override.

feat(httpapi): asset upload + static serving with SVG CSP

POST /api/assets accepts either JSON (page, name, content_base64) or
multipart/form-data (page, name, file) and writes the bytes through
the wiki's UploadAsset. Maps wiki errors to HTTP status codes:
ErrAssetTooLarge → 413, ErrUnsupportedAssetType → 415, other errors → 400.

GET /assets/<path...> serves uploaded asset bytes via http.ServeContent
(gets conditional GET and byte-range support for free). Content-Type
is the MIME detected at read time. Cache-Control: public, max-age=300
because assets are stable-by-path (collision auto-suffix means the
same path always returns the same bytes).

SVG responses get a strict Content-Security-Policy that disables
scripts and external loads (default-src 'none'; style-src
'unsafe-inline'; sandbox) to neutralize script-injection from
hand-crafted SVG payloads. Same-origin only.

Tests cover: JSON upload happy path returning correct path/URL/size,
multipart upload, non-image rejection (415), serving bytes round-
trip, SVG CSP header presence, not-found (404), and traversal
rejection (the wiki layer guard surfaces a 4xx for ../ escapes).

feat(webui): rewrite wiki-local image refs to /assets/ and style them

renderMarkdown now prefixes wiki-local image destinations with
/assets/ before handing the markdown to marked. The static asset
handler then serves the bytes from disk, so <img src> Just Works in
rendered pages. External URLs (http/https/data:/mailto:/etc.) and
anchor-only refs are left alone — isWikiLocalImageRef mirrors the
Go-side parser logic so the front-end and indexer agree on what
counts as 'local'.

styles.css gets a .markdown img rule: max-width 100% (no horizontal
overflow), block layout with vertical margin, and a subtle
code-bg-tinted frame that matches the Metro look of the rest of the
page. Click-to-zoom and modal previews can ride on top of this
later.

feat(sync): carry sidecar assets and add optional git-lfs

Sync now ferries the contents of *.assets/ sidecar directories alongside
markdown pages, so images uploaded via the image-support tools survive
the round-trip through git. The new syncableRel predicate centralizes
the file-set decision (any *.md plus any file under a *.assets/
segment); future file kinds get added there. copyToWiki and copyFromWiki
share it, and the delete-mirroring scan uses it too so assets are
removed from the clone when removed from the wiki.
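
A predicate of this shape could be written as follows. This is a sketch of the stated rule (any *.md, plus any file under a directory segment ending in .assets), not the actual syncableRel; the function name `syncable` is hypothetical. Matching on whole path segments rather than substrings is what avoids traps such as `bar.assetsx/` or `notes.md.bak`.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// syncable reports whether a wiki-relative path belongs to the synced file
// set: markdown pages, or files below a "<page>.assets" directory.
func syncable(rel string) bool {
	rel = path.Clean(strings.ReplaceAll(rel, "\\", "/"))
	if path.Ext(rel) == ".md" {
		return true
	}
	segs := strings.Split(rel, "/")
	// Only directory segments count; the final segment is the file itself.
	for _, dir := range segs[:len(segs)-1] {
		if strings.HasSuffix(dir, ".assets") {
			return true
		}
	}
	return false
}

func main() {
	for _, p := range []string{
		"foo/bar.md",
		"foo/bar.assets/diagram.png",
		"foo/bar.assetsx/diagram.png",
		"foo/notes.md.bak",
	} {
		fmt.Printf("%-30s %v\n", p, syncable(p))
	}
}
```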

LFS support is opt-in per mapping:

  - config.SyncMapping gains LFS bool + LFSPatterns []string. Persists
    in config.json. DefaultLFSPatterns() returns the browser-image set
    as the sensible default.
  - sync.MappingOptions + Manager.RegisterMappingWithOptions accept the
    new fields. Manager.RegisterMappingWithLFS is a flat-argument
    variant satisfying mcp.SyncRegistrarWithLFS, kept flat so the mcp
    and sync packages don't share named-struct types.
  - syncTarget gets lfs/lfsPatterns fields populated from the mapping.
    When lfs is true, syncTarget calls ensureLFSConfig before staging
    each cycle: runs 'git lfs install --local', writes a managed
    .gitattributes routing the patterns through LFS, and stages it.
    Failures (e.g. git-lfs not installed) surface to setError so the
    operator sees them in Status without crashing the loop.

MCP register_sync gains lfs + lfs_patterns inputs. Dispatch picks the
LFS-aware registrar method when available; otherwise logs a warning and
falls back to the no-LFS path so older mocks still satisfy the contract.

Tests cover: syncableRel happy paths and substring traps, end-to-end
sync of a page + sidecar PNG to a bare remote, LFS settings persisted
to config + reflected in syncTarget, back-compat RegisterMapping leaving
LFS off.

test(image-support): add Playwright-based end-to-end visual harness

A real-browser integration test for the image-support pipeline. The
unit tests under internal/wiki, internal/mcp, and internal/httpapi
prove each layer in isolation; this harness drives the whole flow:

  POST /api/assets         (upload)
   -> sidecar storage
   -> indexer kind='image' row
  PUT  /api/pages/...      (embed the ![]() reference)
   -> reindex with new link row
  GET  /#/<page>           (open in real Chromium)
   -> marked rewrites src to /assets/<path>
  GET  /assets/<path>      (static handler serves bytes)
   -> <img naturalWidth=..., complete=true> in the DOM

Two scripts under tools/screenshot:

  capture.mjs  Captures five representative views (home/graph, page
               detail, search, MCP page, settings modal), uploads each
               via POST /api/assets, and embeds the reference under a
               managed sentinel block (<!-- mind-map screenshots ... -->)
               so re-runs replace the prior block instead of appending.
               Stable filenames + collision auto-suffix mean repeat
               runs accumulate cleanly. Byte-verifies each upload by
               GETting /assets/<path> right after writing.

  verify.mjs   Opens one page in the SPA, inspects the rendered DOM
               for <img> elements with .complete=true and non-zero
               naturalWidth, records all /assets/* HTTP responses,
               and fails loudly if any check fails.

Dockerfile gains Chromium runtime deps (libnss3, libnspr4, libatk-*,
libgbm1, libpango, libcairo, libasound2, libatspi, fonts-liberation,
fonts-noto-color-emoji) so 'npx playwright install chromium' produces
a usable browser without --no-sandbox surprises beyond the one we
already pass.

tools/screenshot/.gitignore excludes node_modules/, captured/, and
package-lock.json — the install + capture are deterministic enough
that re-running rebuilds them.

Validated end-to-end against the worktree's mind-map instance: all
five captures uploaded, all five embed references landed, verify.mjs
confirmed <img naturalWidth=2560, complete=true> with 200 image/png
from /assets/architecture/wiki-engine.assets/page-detail-1.png. The
rendered page even shows a recursive screenshot — the wiki-engine
page's screenshot was taken after the screenshot was embedded, so
the captured PNG itself contains a working <img> render. Three layers
of the pipeline visible in one image.

chore(devcontainer): pick host port at initialize time, not hardcoded

Replace the hardcoded "127.0.0.1:51888:4242" appPort with the
initializeCommand pattern documented in the wiki page
preferences/devcontainer-ports.

The implementation that actually works:

  1. initializeCommand.sh runs on the host before docker run. Picks
     the first free port from a preferred range (51888..51893, falls
     back to a kernel-assigned port). Writes
     MIND_MAP_HOST_PORT=NNNN to .devcontainer/ports.env.

  2. runArgs uses --network host (instead of appPort). This is the
     crucial bit: it sidesteps the appPort substitution timing
     problem entirely. ${localEnv:...} in appPort is evaluated
     BEFORE initializeCommand runs, so a dynamically-chosen port
     can't be plumbed through it; --network host means there's no
     host:container mapping in the first place.

  3. runArgs --env-file .devcontainer/ports.env propagates the
     chosen port into the container so the mind-map binary's
     `serve --addr` reads it.

  4. launch.json: in-container launches (mind-map Server, the
     stdio variant, etc.) read the port via ${env:MIND_MAP_HOST_PORT}
     since VS Code-launched processes inherit container env. The
     host-side Chrome launch uses a shellCommand.execute input that
     reads .devcontainer/ports.env directly (requires the
     augustocdias.tasks-shell-input extension, now declared in
     devcontainer.json customizations).

  5. tasks.json waitForServer sources ports.env at the top of its
     shell command. No extension needed since tasks of type 'shell'
     give us a real shell.

  6. .gitignore excludes .devcontainer/ports.env (host-specific,
     per-run value).

Verified end-to-end: rebuilt the container, confirmed --network host
mode in docker inspect, confirmed $MIND_MAP_HOST_PORT propagated to
container env, started mind-map binding to $MIND_MAP_HOST_PORT from
inside the container, hit http://127.0.0.1:51888 from the host and
got HTTP 200. All existing go test ./... pass.

The first version of this change (committed and then immediately
fixed within this same commit) tried ${localEnv:MIND_MAP_HOST_PORT}
in appPort. Don't do that. The wiki page records the trap.
…h shots

Three coordinated changes that finish the image-support feature for the
demo run:

1. DeleteAsset across the stack. wiki.DeleteAsset removes a file by
   path, clears its kind='image' rows from the link index, and sweeps
   the parent sidecar if it ends up empty. Surfaces:

     - DELETE /api/assets/<path...>  (HTTP)
     - delete_image MCP tool          (MCP)

   This completes the asset CRUD surface (alongside UploadAsset and
   ReadAsset/StatAsset) and makes idempotent re-runs of the capture
   harness possible: drop yesterday's home.png before uploading
   today's, so filenames stay canonical (no -1/-2 suffix
   accumulation).

   Tests: round-trip happy path, missing-path ErrAssetNotFound,
   traversal rejection, empty-sidecar cleanup.

2. tools/screenshot/capture.mjs rewritten as a composable demo
   harness:

     - Per-capture async compose(page) function that puts the SPA
       into the desired state (set theme via localStorage, sort
       mode, click fit-all on the graph, fill search, etc.).
       Maximum flexibility for adding new shots.

     - Pre-DELETE per capture so re-runs replace cleanly instead of
       collision-suffixing.

     - Fresh browser context per capture so localStorage / theme
       changes don't bleed across shots.

     - 9 captures covering: home graph fitted (light + dark theme
       pair), page detail with mermaid, three sidebar sort modes
       (recent / path-tree / title), sidebar-search with highlight,
       in-page-search with body highlights, settings modal. Each
       lands on a distinct architecture/* page so navigation through
       the wiki naturally exposes them.

3. Dockerfile + devcontainer.json switch to the
   ghcr.io/schlich/devcontainer-features/playwright feature, which
   runs 'npx playwright install --with-deps' as the remote user
   during build. Drops 25 lines of manually-listed Chromium runtime
   deps from the Dockerfile and the per-clone 'npx playwright
   install' step from the capture harness.

   The README is updated to reflect the new (much simpler) setup.

The capture harness was run end-to-end against the running container:
all 9 shots uploaded, byte-verified through the static handler, and
embedded into 5 architecture pages. The verify.mjs spot-check
confirmed rendering with all images at 200 OK, naturalWidth > 0,
complete=true.
Resolves a single conflict in internal/mcp/server.go where main's
digest PR (#51) and this branch both added new MCP tools.

Resolution:

- get_wiki_context: take main's revised description that mentions the
  new digest fields (auto-merged cleanly outside the conflict region).
- get_wiki_digest: keep main's new tool registration AND handler.
- get_page handler: drop the old main-side getPage that takes
  pagePathInput. This branch's slice 3 already replaced it with
  getPageWithFlags (in images.go) which accepts the new
  IncludeImages / IncludeImageMetadata flags via getPageInput.
  Keeping both would mean two handlers for the same tool name.
- Placeholder comment in server.go points readers at images.go for
  the new get_page handler.

Verified: go vet ./... clean, go test ./... passes (8 packages, including
the new internal/digest package from main).
@aniongithub aniongithub merged commit 44f7f52 into main May 25, 2026
1 check passed
@aniongithub aniongithub deleted the feat/image-support branch May 25, 2026 07:19