feat: import existing PPTX into the agent + edit flow #149

Open: naohito2000 wants to merge 18 commits into main from feat/pptx-import-edit

@naohito2000
Contributor

Summary

Adds an end-to-end import-pptx flow so users can drop an existing PPTX onto
the agent and continue authoring / editing from that deck instead of starting
from an sdpm template.

  • Converter: pptx_to_json now emits the deck-structure layout
    (deck.json + slides/slide-NN.json + images/) and copies the source
    PPTX as the deck-local placeholder template (template.pptx).
  • Upload pipeline: the Local and Cloud upload paths run the converter on
    attached PPTX files, generate slide-structure and theme hints, and
    forward guide / guideInstruction / themeHints to the agent via the
    [Attached: …] marker so it routes into the import-pptx guide.
  • Translate: adds translate_extract / translate_apply over the
    deck structure so re-translation does not re-render the whole deck.
  • Prompts / guides: the new import-pptx.md guide drives the 6-step
    flow (init → import → brief+outline → build → art-direction → present);
    edit-existing is removed (its responsibilities are now covered by
    the import-pptx guide).
  • art-direction: Step 5 uses three lenses (visual inspection via
    get_preview, theme XML via analyze_template, and PIL on rendered
    previews) to ground the style in the source deck. Style authoring
    itself is delegated to the existing create-style workflow.
  • Cloud: analyze_template(template="template.pptx", deck_id=…)
    short-circuits to the deck-local placeholder template downloaded from
    the workspace bucket; ApiLambda bundles sdpm; Decimal handling on
    DDB writebacks is fixed.
  • Dark-theme detection: _extract_theme_hints walks all
    slideMasters, picks the dominant one by slide usage, and resolves the
    bg slot through its clrMap. Corporate decks that flip
    bg1=dk1 on a secondary master now report the correct dark
    background.
  • Web-UI: UploadedFile carries the new fields end-to-end; the
    Reconnect useEffect re-adds the isLoadingRef guard so tool cards
    do not flicker on stream reconnect; mcp-local user-local style /
    asset support is restored after a checkout regression.

17 commits, rebased clean onto main. No conflicts.

Test plan

  • pytest — 209 passed, 2 skipped
  • Local (ACP, kiro-cli) — import-pptx flow on multiple decks
    (light / dark / corporate)
  • Cloud (ap-northeast-1, Cognito user) — same import-pptx flow,
    including analyze_template against the deck-local placeholder
    template
  • Dark-theme deck verified to render with dark background
    tokens after the dominant-slideMaster clrMap fix
  • Reviewer to spot-check the import-pptx guide wording and the
    new analyze_template deck-local short-circuit

Known follow-up

  • art-direction Lens C currently samples only the first 6 preview
    pages — for "all-page same background" decks this produces correct
    output, but hybrid-style decks (cover dark / body light) can
    still trip the agent's prior. Tracking as a follow-up to
    (a) widen the sample to span the deck and (b) make
    themeHints.backgroundLuminance the primary signal in the
    --color-bg token table.

🤖 Generated with Claude Code

Comment thread web-ui/src/lib/attachmentMarker.ts Fixed
naohito2000 pushed a commit that referenced this pull request May 14, 2026
CodeQL js/incomplete-sanitization (high) flagged
attachmentMarker.ts:17 — the previous escaping (replace `"` → `\"`
only) does not handle backslash characters, so an input containing
`\"` would leave a stray `"` in the marker.

Replace the manual escape + surrounding quotes with
JSON.stringify(), which handles backslash, quote, and control
characters in one step. Behaviour for the current call sites is
unchanged (their inputs contain no backslashes), but the code is
robust against future inputs and the CodeQL alert clears.
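
The failure mode is easy to reproduce outside the UI code. The sketch below is a Python analogue, with json.dumps standing in for JavaScript's JSON.stringify. `naive_quote` is a hypothetical reconstruction of a quote-only escape, not the code from attachmentMarker.ts.

```python
import json


def naive_quote(value: str) -> str:
    # Hypothetical reconstruction: escape only double quotes, then wrap the
    # result in quotes. Backslashes are left untouched.
    return '"' + value.replace('"', '\\"') + '"'


def safe_quote(value: str) -> str:
    # json.dumps escapes backslashes, quotes, and control characters together.
    return json.dumps(value)


def round_trips(quoted: str, original: str) -> bool:
    # A quoted string is sound if parsing it yields the original value.
    try:
        return json.loads(quoted) == original
    except json.JSONDecodeError:
        return False


tricky = 'deck\\"name'  # the characters: d e c k \ " n a m e
naive_ok = round_trips(naive_quote(tricky), tricky)
safe_ok = round_trips(safe_quote(tricky), tricky)
```

With the naive escape, the existing backslash ends up escaping the backslash that was just added, so the quote closes the string early and the remainder is left dangling.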

Refs: PR #149 CodeQL alert

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@naohito2000 marked this pull request as ready for review May 14, 2026 05:43
@ShotaroKataoka added the blog:pending (to be written up as a blog post) label May 14, 2026
Naohito Yoshikawa and others added 18 commits May 14, 2026 14:54
Change pptx_to_json on-disk output from a single slides.json to the
deck format used by the rest of the pipeline:

    {output_dir}/
      deck.json              {fonts, defaultTextColor}
      slides/slide-NN.json   per slide (1-based, zero-padded)
      images/                unchanged
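
As a rough illustration of the layout above, the following sketch writes deck.json and one file per slide. It assumes the in-memory result is a dict with `slides`, `fonts`, and `defaultTextColor` keys, as the commit describes, and that NN means two-digit zero padding. `write_deck_structure` is a hypothetical name, not the function in pipeline.py.

```python
import json
from pathlib import Path


def write_deck_structure(output_dir: Path, result: dict) -> list[Path]:
    """Write deck.json plus one slides/slide-NN.json file per slide."""
    slides_dir = output_dir / "slides"
    slides_dir.mkdir(parents=True, exist_ok=True)

    # deck.json carries deck-wide settings only.
    deck = {
        "fonts": result["fonts"],
        "defaultTextColor": result["defaultTextColor"],
    }
    (output_dir / "deck.json").write_text(json.dumps(deck, indent=2), encoding="utf-8")

    written = []
    for index, slide in enumerate(result["slides"], start=1):  # 1-based
        path = slides_dir / f"slide-{index:02d}.json"  # zero-padded (assumed width 2)
        path.write_text(json.dumps(slide, indent=2), encoding="utf-8")
        written.append(path)
    return written
```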

Engine + Layer 2/3 changes:
- skill/sdpm/converter/pipeline.py: emit deck.json + slides/ on disk;
  in-memory return still includes {slides, fonts, defaultTextColor}.
- skill/sdpm/diff/__init__.py: load deck.json + slides/*.json via
  _load_deck_as_roundtrip helper instead of slides.json.
- mcp-local/tools.py::pptx_to_json: docstring + return dict now
  includes deck_dir for downstream tooling.
- shared/ingest.py: extend ConversionResult with deck_structure,
  slide_count, theme_hints, suggested_name. Theme hints unify Local
  / Cloud / DynamoDB keys; background luminance uses median across
  per-slide backgrounds with template lt1 / inverted dk1 fallback.

Tests:
- tests/test_pptx_import.py: scaffolds deck-structure, theme_hints,
  upload guide, read_uploaded_file, import_attachment, CLI
  acceptance, and DOCX non-regression. Phase 1 covers
  TestPptxToJsonDeckStructure / TestConvertPptxThemeHints /
  TestNonRegression; later classes go red until Phase 2/5.

Refs: .kiro/specs/pptx-import-edit/tasks.md T1, T2, T8

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Treat PPTX uploads as deck-structure decks across both modes so the
agent can branch into the import-pptx guide instead of the generic
briefing flow.

Local (mcp-local/upload_tools.py):
- _GUIDE_INSTRUCTION shared with Cloud — agents see identical
  intent-branching text whether they hit Local or Cloud.
- upload_file: when convert_file produces deck structure, surface
  guide / guideInstruction / suggestedName / slideCount / themeHints
  on the response. Skip the legacy filePath fallback so agents don't
  open just deck.json and miss slides/.
- read_uploaded_file (and helper _format_deck_text_summary): emit a
  Markdown-style "--- Slide N: title ---" summary across slides + image
  previews. Legacy slides.json branch retained for back-compat.
- import_attachment: return shortId so agents avoid regex extraction;
  recursively copy any directory (notably slides/) to
  attachments/{shortId}/{dirname}/. Drop legacy slides.json special case.
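
The recursive directory copy in import_attachment could look roughly like this. It is a sketch under assumptions: `copy_upload_directories` and its parameters are hypothetical names, and only the attachments/{shortId}/{dirname}/ target layout comes from the description above.

```python
import shutil
from pathlib import Path


def copy_upload_directories(upload_dir: Path, deck_dir: Path, short_id: str) -> list[Path]:
    """Copy every subdirectory of the upload into attachments/{short_id}/."""
    target_root = deck_dir / "attachments" / short_id
    copied = []
    for entry in sorted(upload_dir.iterdir()):
        if entry.is_dir():
            destination = target_root / entry.name
            # copytree creates missing parents; dirs_exist_ok allows re-imports.
            shutil.copytree(entry, destination, dirs_exist_ok=True)
            copied.append(destination)
    return copied
```

Copying whole directories rather than special-casing one file means slides/ and images/ travel together, which is the point of dropping the slides.json special case.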

Cloud (api/index.py + mcp-server/tools/{attachment,upload}.py):
- api/index.py: route .pptx through convert_file (was skipped). Persist
  deck-structure metadata to DynamoDB as themeHints / slideCount /
  suggestedName. process_upload + get_upload_status share
  _pptx_guide_fields helper for parity with Local. Drop dead
  _extract_pptx_text helper (replaced by convert_file + _read_converted).
- mcp-server/tools/attachment.py::_import_converted: replace legacy
  slides.json case with slides/ recursive copy +
  attachments/{shortId}_deck.json + deckJson key. Always include shortId.
- mcp-server/tools/upload.py::_read_converted: detect deck structure
  via _format_deck_summary_from_s3 (streams slides from S3, same
  Markdown summary as Local).
- mcp-server/server.py + tools/sandbox.py: rewrite run_python docstring
  to foreground sandbox helpers; expose helper-based examples while
  keeping raw open()/json.load supported.

Refs: .kiro/specs/pptx-import-edit/tasks.md T3, T3b, T3c, T4 (a-d),
      T4e, T4f, T4g, T4h

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add the import-pptx guide and the Guide-driven flow contract that
routes PPTX uploads through it instead of the new-deck Phase 1 flow.

skill/references/guides/import-pptx.md (new):
- Step 1-5 walkthrough for converting an uploaded PPTX into an
  editable deck. Hearings cover edit scope; specs (brief / outline /
  art-direction) are auto-generated from the PPTX content. Step 5
  rebuilds the deck against the PPTX-derived placeholder template
  copied as deck-local template.pptx by import_attachment.

agent/prompts/role/spec_agent.md (Cloud / compose_slides) and
mcp-local/acp-agent-prompts/spec-agent.md (Local / use_subagent):
- "Flow selection (evaluate this FIRST on every turn)" — classify the
  turn into guide flow vs new-deck flow before applying anything else.
  Prevents Phase 1 Briefing drift after a guideInstruction lands.
- "Phase 1 Flow" / "Slide Group Assignment" sections explicitly noted
  as new-deck only; guide flow skips them.
- Expanded "Guide-driven flows" section: tracks `uploadId`,
  `suggestedName`, `slideCount`, `themeHints` across hearing turns so
  the agent never re-asks the user to upload, and routes returning
  edit requests to compose_slides (Cloud) or sdpm-composer subagents
  (Local) after Step 5.

prompts/spec-agent.md, prompts/composer-agent.md (deleted):
- Leftovers from the c3073be prompt-composition refactor; current
  branch has zero references (verified via git grep). Removing
  prevents future edits to dead files.

mcp-local/upload_tools.py::_deck_root and
mcp-local/server_acp.py::init_presentation:
- Guard against the `Path("")` truthy bug when SDPM_DECK_ROOT is unset,
  empty, or whitespace-only: `Path("")` normalizes to `Path(".")` and
  every Path object is truthy, so the previous `or` fallback never ran
  and uploads landed under the process cwd. Strip whitespace and fall
  back to the home directory only when the resulting env string is
  empty.
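
A minimal sketch of the guard, assuming the fallback is the plain home directory as stated above. `deck_root` is an illustrative stand-in for the real `_deck_root`, and the `env` parameter exists only to make the sketch easy to exercise.

```python
import os
from pathlib import Path


def deck_root(env=None) -> Path:
    """Return the deck root, falling back to the home directory."""
    env = os.environ if env is None else env
    raw = env.get("SDPM_DECK_ROOT", "").strip()
    # Test the string, not the Path. The buggy form was equivalent to
    #   Path(env.get("SDPM_DECK_ROOT", "")) or Path.home()
    # where the left side is Path(".") and therefore always truthy.
    return Path(raw) if raw else Path.home()
```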

tests/test_pptx_import.py:
- TestDeckRootEnvHandling regression tests covering empty, set, and
  whitespace-only SDPM_DECK_ROOT.

Refs: .kiro/specs/pptx-import-edit/tasks.md T5, T6, T6a, T7

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add a translate workflow that creates a derived deck (sibling
directory ``{deck_dir}-{lang}/``) instead of mutating the original.

scripts/translate_extract.py (new):
- Walk slides/*.json and extract translatable text into
  derived_deck/translations/{slug}.txt with stable ordering. Skip short
  strings (<3 chars by default, configurable via --skip-short) and
  preserve styled-text tags so per-run diffs stay reviewable.

scripts/translate_apply.py (new):
- Apply edited derived_deck/translations/{slug}.txt back into
  slides/*.json, with --dry-run for verification. Vertical-tab
  characters (\x0b) and styled-text tags survive a round-trip;
  structural keys (slug, layout, etc.) are excluded.
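
The extract side can be sketched as a recursive walk over one slide's JSON. This is an illustration only: it treats every non-structural string as translatable, which the real script may not do, and `STRUCTURAL_KEYS` lists just the two keys named above.

```python
STRUCTURAL_KEYS = {"slug", "layout"}  # assumed subset; the source says "etc."


def extract_translatable(node, skip_short: int = 3) -> list[str]:
    """Collect translatable strings from a slide JSON tree in document order."""
    if isinstance(node, str):
        # Strings shorter than skip_short characters are not worth translating.
        return [node] if len(node.strip()) >= skip_short else []
    if isinstance(node, dict):
        found = []
        for key, value in node.items():  # dicts keep insertion order
            if key in STRUCTURAL_KEYS:
                continue  # never offer structural values for translation
            found.extend(extract_translatable(value, skip_short))
        return found
    if isinstance(node, list):
        found = []
        for item in node:
            found.extend(extract_translatable(item, skip_short))
        return found
    return []  # numbers, booleans, None
```

Strings are returned untouched, so embedded vertical tabs and styled-text tags reach the translator exactly as they appear in the slide.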

skill/references/workflows/translate-pptx.md (rewrite):
- Replace the legacy reverse-convert flow with the deck-structure
  flow: derived deck creation, translatable-text extract, hand
  translate, apply, build + measure + preview loop. Documents the
  styled-text tag rules and the partial-gradient caveat.

tests/test_translate.py (new):
- Cover derived-deck creation, empty-map template, --skip-short,
  in-place apply, --dry-run, \x0b preservation, styled-text tag
  preservation, structural-key exclusion, specs untouched. Lint-clean
  variant from a496297e is used so pytest + ruff pass.

a496297e's version bump (sdpm 0.1.0 → 0.2.0) is dropped — main has
already moved to 0.3.0 — and its pipeline.py f-string lint fix was
folded into Phase 1.

Refs: .kiro/specs/pptx-import-edit/tasks.md T13, T14, T15
      .kiro/steering/versioning.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Eliminate the Step 1 template selection hearing. Each imported PPTX
now ships a deck-local placeholder template, so the rebuilt deck
honors the source's masters/layouts/theme without the user picking an
sdpm template.

skill/sdpm/converter/template.py (new):
- extract_placeholder_template(pptx, output): copy the source PPTX
  with all slides removed and slide1.xml→placeholder, producing a
  master-only template suitable for python-pptx consumption.

shared/ingest.py + mcp-local/upload_tools.py + mcp-server/tools/attachment.py:
- Run extract_placeholder_template alongside pptx_to_json so the
  upload directory contains template.pptx; import_attachment copies
  it into the deck's root as the deck-local template.
- Local upload_file's response and the read_uploaded_file deck
  summary surface the template path so the agent can wire it through
  Step 4.

skill/sdpm/builder/__init__.py:
- Resolve "template.pptx" relative to the deck directory so build
  picks up the deck-local template ahead of bundled sdpm templates.
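
The lookup order can be sketched as follows. `resolve_template` and `bundled_dir` are hypothetical names; the only behaviour taken from the description is that the deck directory is checked before the bundled templates.

```python
from pathlib import Path


def resolve_template(name: str, deck_dir: Path, bundled_dir: Path) -> Path:
    """Prefer a template inside the deck directory over a bundled one."""
    local = deck_dir / name
    if local.is_file():
        return local
    bundled = bundled_dir / name
    if bundled.is_file():
        return bundled
    raise FileNotFoundError(f"template not found: {name}")
```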

skill/references/guides/import-pptx.md:
- Drop the old Step 1 (template selection hearing) entirely; renumber
  to Steps 1-5. Step 4 sets deck["template"] = "template.pptx" and
  loops measure → preview → fix until visuals match the source.
- Step 4 art-direction.html customization (c857fac1) and Step 4 image
  src rewrite via image_mapping (844ca607) keep the rebuilt visuals
  aligned with the source PPTX. Step 4 explicitly calls
  generate_pptx after the build so the user gets a downloadable file
  (14927b47).

tests/test_pptx_import.py:
- TestPlaceholderTemplate, TestPptxBuilderResolvesDeckLocalTemplate,
  and import-pptx-flow extensions cover template extraction, copy,
  and end-to-end build.

Refs: .kiro/specs/pptx-import-edit/tasks.md FR-9

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two cloud-side fixes that the PPTX upload path needs in production.

infra/lib/web-ui-stack.ts + api/requirements.txt:
- ApiLambda was bundling only sdpm/analyzer; PPTX uploads require the
  full sdpm Engine (sdpm.converter, sdpm.utils, sdpm.builder) plus
  python-pptx, lxml, Pillow, qrcode, pygments, defusedxml — mirror
  skill/pyproject.toml so upload-time conversion via shared.ingest
  works without pip-installing sdpm.

api/index.py:
- Wrap themeHints in DynamoDB Decimal where Lambda code persists the
  PPTX-derived background luminance (a float). Without this DDB
  rejected the PutItem with TypeError: Float types are not supported.
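
One common way to satisfy this constraint is to convert floats recursively before the write. The helper below is a sketch, not the code in api/index.py; `to_dynamodb` is a hypothetical name, and every key in the usage example except `backgroundLuminance` is made up for illustration.

```python
from decimal import Decimal


def to_dynamodb(value):
    """Recursively replace floats with Decimal so the item can be stored."""
    if isinstance(value, float):
        # Go through str() so 0.1 becomes Decimal("0.1") rather than the
        # long binary expansion that Decimal(0.1) would produce.
        return Decimal(str(value))
    if isinstance(value, dict):
        return {key: to_dynamodb(item) for key, item in value.items()}
    if isinstance(value, list):
        return [to_dynamodb(item) for item in value]
    return value  # str, int, bool, None pass through unchanged


theme_hints = {"backgroundLuminance": 0.12, "accents": ["#FF9900"], "slideCount": 12}
item = to_dynamodb(theme_hints)
```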

mcp-server/tools/generate.py:
- Resolve "template.pptx" relative to the deck-local directory before
  falling back to bundled sdpm templates so the PPTX-derived
  placeholder template (FR-9) is picked up on Cloud builds. Mirrors
  the skill/sdpm/builder change in Phase 5.

Refs: .kiro/specs/pptx-import-edit/tasks.md FR-9 (cloud)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four bugs that block multi-turn import-pptx flows in the Cloud and
Local web UIs.

decks/page.tsx + ChatPanelShell.tsx (Panel A deckId after creation):
- After init_presentation creates a real deck, Panel A continued
  passing deckId="new" to ChatPanel, so each subsequent
  /api/agent/invoke call spawned a fresh kiro-cli ACP process and
  lost conversation context (agent forgets the deck path on the
  next turn).
- ChatPanelShell now passes deckId={panelADeckId ?? "new"} (and a
  matching deckName).
- decks/page.tsx falls back to ws.createdDeckId while ws.isNew so
  the deck path reaches Panel A via props instead of URL-hash only.

ChatPanel.tsx (chatSessionId mid-stream swap):
- The chatSessionId-watcher useEffect called setSessionId
  unconditionally. setSessionId triggers loadHistory, which calls
  setMessages with the latest .chat.json — overwriting the in-flight
  React state and dropping the ToolUse cards the user is watching.
  Before this branch, Panel A always passed deckId="new" so
  chatSessionId never changed mid-mount and the bug stayed dormant;
  the deckId-fix above lets Panel A adopt a real deck mid-stream and
  exposes it.
- Defer the sessionId swap while isLoadingRef is true; flush the
  deferred value once isLoading drops back to false.

useChatStream.ts (duplicate parallel chats from rapid double-trigger):
- The React-state isLoading guard does not flip until the next
  render, so two near-simultaneous Enter keys / clicks both passed
  the guard and produced two parallel /api/agent/invoke streams.
- Add a synchronous isLoadingRef check + eager set, and clear in
  finally so the next turn can proceed.

acp-process.ts (running flag never flipped → process not reused):
- handleLine returned early on the JSON-RPC response branch, making
  the end_turn detection block (~line 88) unreachable. Each turn
  spawned a fresh process, dropping prior context (read_guides,
  upload, hearing answers) and making the agent appear to drift back
  into Phase 1 briefing.
- Drop the early return so end_turn detection runs after the
  pending-promise resolve, restoring process reuse across turns.

f6669360 (ModelSelector setModels/setCurrentModel ref repair) from
the original branch is dropped — main's ChatPanelShell already
extracted ModelSelector into its own file with a self-contained
poll, so the fix is no longer applicable.

Refs: .kiro/specs/pptx-import-edit/notes.md (verification log)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ls.py

Phase 1 of this branch checked out mcp-local/tools.py from fe0067f7
to pick up the pptx_to_json deck-structure docstring update. That
checkout silently reverted three unrelated improvements that landed
on main via PR #110 / #146 (user-style-management):

- list_styles signature: dropped the include_all parameter and the
  user-local + bundled merge via list_styles_filtered. As a result
  list_styles called from server.py / server_acp.py only returned
  bundled styles and missed everything in
  ~/.config/sdpm/styles/, breaking the import-pptx Step 3-3
  scaffold-read step for users who keep custom styles there.
- generate_pptx / search_assets / list_asset_sources lost their
  invalidate_manifest_cache() calls, so the long-lived MCP Local
  process did not pick up user-local asset / config changes between
  invocations.

Restore the main version of tools.py and re-apply only the Phase 1
delta (pptx_to_json docstring + deck_dir return field). This keeps
Phase 1 functionality intact while bringing user-local style + asset
search back to parity with main.

Refs: .kiro/specs/pptx-import-edit/notes.md (Phase 8b verification)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…direction

Phase 8b verification on Cloud showed the rewritten art-direction.html
captured the source PPTX only loosely — backgrounds, accent colors,
and fonts often drifted from the original deck. Root cause: Step 3-3
relied entirely on themeHints, which is a coarse summary (1
luminance, 3 accent hex values, 2 fonts) that omits the real theme
XML and the rendered pixel distribution.

Add a new sub-step 3-3b "Extract the source PPTX's actual design
tokens" that pulls two higher-fidelity signals before token writing:

1. **`analyze_template(template="template.pptx")`** — MCP tool that
   reads the theme XML directly and returns the full 12-slot color
   map (dk1 / lt1 / dk2 / lt2 / accent1-6 / hlink / folHlink), latin / eastAsian
   / complex font pairs, and per-layout placeholder positions.
   These are the original PPTX's authoring values, not approximations.

2. **PIL pixel-frequency sampling** — process slide image previews
   from the upload's images/ folder, count dominant RGB values, and
   convert to hex swatches. Cross-reference with theme_colors to:
   - confirm the *actual* background (may differ from
     themeHints.backgroundLuminance for non-default masters)
   - identify which accent is the deck's hero color (most-used
     accent that is not bg/text)
   - capture brand colors not declared in the theme as their own
     tokens (e.g. --color-brand-orange).
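
The counting step of the pixel-frequency recipe can be sketched with the standard library alone. This is an illustration, not the recipe from the guide: `dominant_swatches` is a hypothetical helper that takes RGB tuples directly, so the image loading (done with PIL in the guide) is left out.

```python
from collections import Counter


def dominant_swatches(pixels, top: int = 5) -> list[tuple[str, float]]:
    """Return the most frequent colors as (hex, share of pixels) pairs."""
    counts = Counter(tuple(rgb) for rgb in pixels)
    total = sum(counts.values())
    return [
        (f"#{r:02X}{g:02X}{b:02X}", count / total)
        for (r, g, b), count in counts.most_common(top)
    ]


# Ten sample pixels: mostly black background, some orange, one white.
sample = [(0, 0, 0)] * 6 + [(255, 153, 0)] * 3 + [(255, 255, 255)]
swatches = dominant_swatches(sample)
```

The largest share is the candidate background, and a frequent color that is neither background nor text is the candidate hero accent to compare against theme_colors.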

Step 3-3c (renamed from 3-3b) now lists token sources in
authority order: analyze_template > PIL swatches > themeHints >
slide JSON > slide thumbnails. themeHints drops to a sanity check
instead of a primary source.

The PIL sampling recipe mirrors the create-style workflow (used by
sdpm-style-creator agent) so the same color-extraction technique
applies whether the user is creating a fresh style or importing a
PPTX. Both paths now converge on theme XML + pixel evidence as the
authoritative inputs.

Refs: .kiro/specs/pptx-import-edit/notes.md (Phase 8b verification —
      Cloud style fidelity gap)
      skill/references/workflows/create-style.md (PIL recipe parity)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… rendered previews

Phase 8b verification on Cloud showed two related issues:
- agent skipped Step 3-3 entirely after Step 2, stopping at "what
  do you want to edit?" instead of proceeding through specs / build.
- the previous PIL recipe pointed at the upload's images/ folder
  (PPTX-embedded photos and logos), not at slide-level renderings,
  so even when art-direction was authored the color signal was
  weak.

Re-order the guide so build precedes art-direction:

  Step 1 init
  Step 2 import
  Step 3 brief + outline (art-direction removed)
  Step 4 build (as-is reproduction against deck-local template.pptx)
  Step 5 art-direction.html  ← new position
  Step 6 present

Rationale: art-direction.html is consumed by the **composer** when
the user later asks to edit slides. The initial reproduction in
Step 4 does not need it — building first against the source's own
placeholder template gives a faithful as-is render that the user
can review immediately, and the rendered previews under previews/
become the most reliable input for authoring art-direction.html.

Step 5-2 PIL recipe now reads from previews/ instead of the upload
images/ directory. Theme XML (analyze_template) plus pixel sampling
on the actual rendered slides produce far more accurate token
values than upload-time approximations.

Step 5-3 spells out that no re-build is needed after writing
art-direction.html — the file is composer-side input for future
edits, not a precondition for the initial deck.

Refs: .kiro/specs/pptx-import-edit/notes.md (Phase 8b ACP run —
      art-direction skip + previews-vs-upload-images gap)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…(Lens A/B/C)

Phase 8b feedback: PIL pixel statistics quantify color frequency but
cannot tell the agent what the colors *mean* — that the orange bar
is a section divider, that the rounded box is a card with shadow,
that the cover has a particular layout vs section headers. Decoration
motifs, layout grid, and typography hierarchy require the agent to
actually look at the rendered slides.

Step 5-2 now uses three complementary lenses:

- **Lens A — Visual inspection** via `get_preview(deck_id, slugs=[
  cover, section header, content, chart slide, closing], quality="high")`.
  The agent reads the actual rendered images for *meaning*: background
  type, title vs body color split, decoration motifs (bars, shadows,
  radii, dividers, cards, bullets), layout grid, typography
  hierarchy. These qualitative tokens are unreachable from theme XML
  or PIL statistics alone.

- **Lens B — Theme XML** via `analyze_template`. Authoritative for
  --color-* (12 theme slots), --font-* (latin / eastAsian / complex
  pairs), and --size-* / position tokens (layouts[] placeholder
  positions).

- **Lens C — PIL pixel-frequency** on `previews/`. Quantifies which
  theme entries actually appear on screen and surfaces brand colors
  not declared in the theme. Cross-checks Lens A's visual impression
  with concrete numbers.

Step 5-3 token-source rules now route by token kind:
- color → B + C, A flags brand-only colors
- font → B verbatim
- layout/size → B + A confirmation
- decoration / motif → A primary (B and C cannot give these)
- slide JSON / themeHints → final sanity checks
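
The routing rules above amount to a small lookup table. The sketch below encodes them as data; the kind names and the `sources_for` helper are illustrative and do not appear in the guide.

```python
# Lens identifiers as used in the guide: A = visual inspection,
# B = theme XML (analyze_template), C = PIL pixel frequency.
TOKEN_SOURCES = {
    "color": ("B", "C"),    # A additionally flags brand-only colors
    "font": ("B",),         # taken verbatim
    "layout": ("B", "A"),   # B first, A as confirmation
    "decoration": ("A",),   # B and C cannot supply these
}
SANITY_CHECKS = ("slide JSON", "themeHints")  # consulted last for every kind


def sources_for(kind: str) -> tuple[str, ...]:
    """Return the lenses for a token kind in the order to consult them."""
    try:
        return TOKEN_SOURCES[kind]
    except KeyError:
        raise ValueError(f"unknown token kind: {kind!r}") from None
```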

Refs: .kiro/specs/pptx-import-edit/notes.md (Phase 8b — PIL alone
      missed decoration meaning; agent needed to see images directly)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…yle workflow

Phase 8b feedback: art-direction.html ended up with the source deck's
content reproduced inside it (headlines, bullets, charts) and tokens
that disagreed with the actual rendered slides (--color-bg light when
the deck was dark, etc.). Root cause: the guide was describing how to
author a style from scratch in parallel with the canonical
create-style workflow, leaving the style-vs-reproduction distinction
implicit and missing the conventions the workflow already encodes.

Stop re-explaining style authoring in this guide. Step 5 now:

- Loads `read_workflows(["create-style"])` in 5-1 and treats it as
  authoritative for HTML skeleton, :root token conventions, .t-*
  text classes, demonstration-slide pattern (5-6 slides), absolute
  positioning rules, and the violation examples. Removes ~140 lines
  of duplicate skeleton / guideline text from this guide.

- Adds a critical reframe up front: art-direction.html is a *style
  guide*, not a reproduction. Demonstration slides contain
  placeholder text ("Cover Title", "Body sample"), not source-deck
  headlines / bullet lists / charts.

- Replaces the long inline HTML example with an import-pptx-specific
  token mapping table that routes each token kind to Lens A/B/C:
  --color-bg → theme_colors.lt1/dk1 confirmed by PIL (no guesswork);
  decoration motifs → Lens A only; layout sizes → theme XML +
  Lens A confirmation. The table is the only thing this guide
  contributes that create-style cannot.

- Adds a quality bar at the end: no hardcoded hex/pt outside :root,
  demonstration slides use placeholder copy, --color-bg matches the
  actual deck background, total 5-6 demonstration slides.

Refs: .kiro/specs/pptx-import-edit/notes.md (Phase 8b — art-direction
      reproduced source content; --color-bg disagreed with deck;
      style conventions duplicated across guides instead of delegated)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
edit-existing.md was the legacy CLI-driven flow for editing an
existing PPTX:

  pptx_to_json.py → manual JSON edit → pptx_builder.py generate

That flow has been fully superseded by the import-pptx guide
introduced in this branch:

- Web UI / API entrypoint via upload_file (vs local file path)
- deck-structure (deck.json + slides/*.json) instead of legacy
  single-file slides.json
- PPTX-derived placeholder template — keeps the source's masters /
  layouts / theme (FR-9), instead of edit-existing's "MUST NOT carry
  over colors or styles" rule which forced a target-theme rewrite
- Brief / outline / art-direction auto-generated in the guide
- Composer subagent (compose_slides) for slide JSON edits, instead
  of the agent hand-editing JSON

mcp-server/server.py (Cloud) had already commented out its
"Workflow B" pointer, indicating Cloud was already on the
upload_file → guideInstruction path. This commit aligns the rest of
the codebase:

- Delete skill/references/workflows/edit-existing.md.
- Replace mcp-local/server.py "Workflow B" instructions with a
  pointer to the upload_file → guideInstruction flow (Web UI / API)
  with read_guides(["import-pptx"]) as the CLI fallback.
- Replace skill/SKILL.md "Workflow B" CLI hint with
  `pptx_builder.py guides import-pptx`.
- Update mcp-server/server.py comment block to note the cloud-side
  upload_file flow.

Side benefit: removes the agent-confusion path observed during
Phase 8b ACP verification — `list_workflows` no longer surfaces a
plausible-looking edit-existing entry that pulled the spec agent
away from the guideInstruction-driven import-pptx flow.

Refs: .kiro/specs/pptx-import-edit/notes.md (Phase 8b ACP — agent
      read edit-existing instead of import-pptx via list_workflows)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 8b ACP verification: agent ignored the import-pptx flow,
called list_workflows on its own, and treated the PPTX as generic
reference material. Root cause: upload_file's deck-structure response
fields (guide / guideInstruction / suggestedName / slideCount /
themeHints) were dropped by the web-ui upload pipeline, so the
spec agent never saw the import-pptx pointer it was supposed to
follow.

uploadService.ts:
- UploadedFile interface gains guide, guideInstruction,
  suggestedName, slideCount, themeHints (matches the response shape
  from mcp-local/upload_tools.upload_file and the Cloud upload
  Lambda). Local synchronous path, Cloud process-call path, and
  pollUploadStatus all now forward these fields onto the
  UploadedFile object.

attachmentMarker.ts::buildAttachedMarker:
- New top branch: when guide + guideInstruction are present, emit a
  marker that surfaces them (plus uploadId / suggestedName /
  slideCount / themeHints) so the agent reads them as part of the
  user message. Without this branch the agent only sees
  "[Attached: file.pptx (path: ..., images: ...)]" — no signal that
  it should follow a specific guide.

After this fix, the [Attached:...] marker for a PPTX upload contains
guideInstruction text that the spec_agent's "Guide-driven flows"
section is already keyed on, closing the loop:

  upload_file response
    → uploadService stores guide+guideInstruction
    → buildAttachedMarker emits them
    → user message contains them
    → spec_agent recognizes guideInstruction and runs read_guides

Refs: .kiro/specs/pptx-import-edit/notes.md (Phase 8b — agent did
      not see guideInstruction; called list_workflows / treated PPTX
      as reference)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 7 commit 47cb6c23 ported the dup-chat ref guard
(2fc350b0) into useChatStream.ts but missed the second half of the
original fix: the Reconnect useEffect in ChatPanel.tsx also needs
the guard. Without it, when deckId prop swaps mid-stream
("new" → real deck after init_presentation), a second EventSource
consumer attaches to the same SSE alongside the in-flight fetch.
Both sides write to the same setMessages, racing on toolUses /
blocks, and the tool cards either flicker, get clobbered, or
duplicate.

Re-add the guard at the top of the Reconnect useEffect so it
short-circuits while streaming is already in progress. The
isLoadingRef state is shared with useChatStream via a sibling
useEffect (already syncing stream.isLoading), so the guard sees an
up-to-date value.

Refs: .kiro/specs/pptx-import-edit/notes.md (Phase 8b ACP — tool
      cards disappearing mid-stream / on subsequent turns)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rMap

Phase 8b verification with a dark corporate deck (theme XML
lt1=#FFFFFF / dk1=#000000) produced themeHints.backgroundLuminance =
1.0 (white) even though every rendered slide was dark. The agent
then authored an art-direction.html with --color-bg set to white,
which contradicted the actual deck.

Root cause: PowerPoint can ship multiple slideMasters, each with its
own clrMap that remaps the bg1/tx1 slots. Corporate decks routinely
flip bg1=dk1 on a secondary master to declare a dark theme without
editing the theme XML itself. _extract_theme_hints was always
extracting master 0's colors and never consulting the clrMap, so
the bg1 slot was assumed to map to lt1 (#FFFFFF) regardless of how
the deck actually rendered.

Fix:

1. Walk all `ppt/slideMasters/*.xml` and trace which slideMaster
   each slide ends up on (slide → slideLayout → slideMaster). The
   master used by the most slides is the "dominant" master.

2. Re-extract theme_colors and color_mapping from the dominant
   master rather than always master 0.

3. Resolve the bg1 slot through color_mapping before falling back
   to lt1: when clrMap declares bg1=dk1 (dark deck), default_bg now
   correctly resolves to #000000 instead of #FFFFFF.
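
The three steps can be sketched on already-parsed data. This is a simplification under assumptions: the real `_extract_theme_hints` reads the XML parts itself, whereas the hypothetical helpers below take a slide-to-master mapping and plain dicts for the theme colors and clrMap.

```python
from collections import Counter


def dominant_master(slide_to_master: dict[str, str]) -> str:
    """Return the slideMaster referenced by the most slides."""
    return Counter(slide_to_master.values()).most_common(1)[0][0]


def resolve_background(theme_colors: dict[str, str], clr_map: dict[str, str]) -> str:
    """Resolve the bg1 slot through the master's clrMap, defaulting to lt1."""
    slot = clr_map.get("bg1", "lt1")
    return theme_colors[slot]


# Two of three slides sit on the secondary master, which flips bg1 to dk1.
slides = {"slide1": "slideMaster1", "slide2": "slideMaster2", "slide3": "slideMaster2"}
clr_maps = {
    "slideMaster1": {"bg1": "lt1", "tx1": "dk1"},
    "slideMaster2": {"bg1": "dk1", "tx1": "lt1"},
}
theme = {"lt1": "#FFFFFF", "dk1": "#000000"}

master = dominant_master(slides)
default_bg = resolve_background(theme, clr_maps[master])
```

Reading master 0 without its clrMap would have returned #FFFFFF here, which is the mis-classification described above.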

Verified on the test deck: backgroundLuminance now reports 0.0 for
this dark deck (was 1.0). 209 tests pass.

Refs: .kiro/specs/pptx-import-edit/notes.md (Phase 8b — dark deck
      mis-classified as white because clrMap was never consulted)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…eck_id

Cloud verification: import-pptx Step 5-2 calls
analyze_template(template="template.pptx") to extract theme XML from
the deck-local placeholder template that import_attachment copied
into the deck workspace. The Cloud MCP tool routed every name
through storage.list_templates() (the registered builtin/user
templates DDB), so "template.pptx" — which only exists at
decks/{deck_id}/template.pptx — was reported as not found. Step 5
failed.

Add a deck-local short-circuit. When template == "template.pptx",
require a deck_id arg, download decks/{deck_id}/template.pptx from
the workspace bucket, run sdpm.analyzer.analyze_template on the
downloaded file, and return the same shape (theme_colors / fonts /
layouts) that registered templates produce. The shared
shared.ingest fix from 2281351c (multi-master clrMap detection)
applies here automatically because analyze_template uses the same
sdpm.analyzer code path.
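
The branching part of the short-circuit can be sketched without any storage calls. `template_object_key` is a hypothetical helper; only the decks/{deck_id}/template.pptx key layout and the deck_id requirement come from the description above.

```python
DECK_LOCAL_TEMPLATE = "template.pptx"


def template_object_key(template: str, deck_id=None):
    """Return the workspace-bucket key for a deck-local template, else None.

    None means "not deck-local": the caller should fall through to the
    registered-template lookup.
    """
    if template != DECK_LOCAL_TEMPLATE:
        return None
    if not deck_id:
        raise ValueError("deck_id is required when template is 'template.pptx'")
    return f"decks/{deck_id}/{DECK_LOCAL_TEMPLATE}"
```

Failing loudly when deck_id is missing keeps the deck-local name from silently falling through to the registered-template lookup, where it would again be reported as not found.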

Update import-pptx guide Step 5-2 to spell the Cloud signature:

    analyze_template(template="template.pptx", deck_id=<deck_id>)

Local mode is unchanged: mcp-local's analyze_template already
accepts a path / template name and never went through DDB lookup.

Refs: .kiro/specs/pptx-import-edit/notes.md (Phase 8b Cloud — Step
      5 "Analyzing template" failed because deck-local template.pptx
      was not registered)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@naohito2000 force-pushed the feat/pptx-import-edit branch from a25e1fd to eb511ee May 14, 2026 05:55