Skip to content

fix(super-converter): preserve cell-level SDT wrapping table cells (SD-3289)#3539

Merged
caio-pizzol merged 4 commits into
mainfrom
caio-pizzol/sd-3289-preserve-cell-level-sdt
May 27, 2026
Merged

fix(super-converter): preserve cell-level SDT wrapping table cells (SD-3289)#3539
caio-pizzol merged 4 commits into
mainfrom
caio-pizzol/sd-3289-preserve-cell-level-sdt

Conversation

@caio-pizzol
Copy link
Copy Markdown
Contributor

Preserves Word date pickers (and other content controls that wrap a single table cell) on DOCX import/export. The shape is <w:tr><w:sdt><w:sdtContent><w:tc>...</w:tc></w:sdtContent></w:sdt></w:tr> — ECMA-376 Part 1 §17.5.2.32, CT_SdtCell. The row importer previously filtered to direct w:tc children only, silently dropping the wrapped cell on import and from any subsequent export. Reported on IT-1119.

  • Row encoder recognizes both direct <w:tc> and cell-level <w:sdt>, routes the inner cell through the existing tc-translator, and preserves w:sdtPr / w:sdtEndPr on a new cellSdt attr (rendered: false, declared on both tableCell and tableHeader).
  • Row decoder re-wraps the exported <w:tc> in <w:sdt> when the attr is present, so the w:date subtree round-trips (fullDate, dateFormat, lid, storeMappedDataAs, calendar).
  • Tests: 11 unit tests plus a real table import/export round-trip integration test that drives defaultNodeListHandler and exportSchemaToJson against the IT-1119 OOXML shape with no translator mocks.

Out of scope, tracked as follow-ups: content-controls Document API exposure for cell-level controls (target-resolution.ts only finds structuredContent / structuredContentBlock); exact multi-cell CT_SdtContentCell grouping round-trip; analogous row-level SDT (CT_SdtRow) and w:customXml-wrapped row/cell drops at the same filter sites.

Review: row import normalize helper, row decode re-wrap, schema attr on both tableCell and tableHeader. Ignore: pre-existing TS errors elsewhere in tr-translator.js (net -1 from this change).
Verified: super-converter suite 2979/2979; tr/ handler dir 32/32 including the round-trip integration test; tsc errors in tr-translator.js 34 → 33.

…D-3289)

Word date pickers and similar content controls that wrap a single
table cell are stored in DOCX as `<w:tr><w:sdt><w:sdtContent><w:tc>
...</w:tc></w:sdtContent></w:sdt></w:tr>` (ECMA-376 Part 1
§17.5.2.32, CT_SdtCell). The row importer previously filtered to
direct `w:tc` children only, silently dropping the wrapped cell on
import and from any subsequent export.

The row encoder now recognizes both direct `<w:tc>` and cell-level
`<w:sdt>`, routes the inner cell through the existing tc-translator,
and preserves `w:sdtPr`/`w:sdtEndPr` on a new `cellSdt` attr
(rendered: false, declared on `tableCell` and `tableHeader`). The
row decoder re-wraps the exported `<w:tc>` in `<w:sdt>` when the
attr is present, so date metadata (fullDate, dateFormat, lid,
calendar) round-trips.

- Cell-level controls remain outside the content-controls Document
  API surface for v1.
- Multi-cell `CT_SdtContentCell` wrappers import all cells
  defensively but drop wrapper metadata; exact multi-cell grouping
  is out of scope.
- Adds 11 unit tests plus a real table import/export round-trip
  integration test covering the IT-1119 OOXML shape.
@caio-pizzol caio-pizzol requested a review from a team as a code owner May 27, 2026 18:32
@linear-code
Copy link
Copy Markdown

linear-code Bot commented May 27, 2026

SD-3289

@github-actions
Copy link
Copy Markdown
Contributor

The MCP tools require permissions I don't have. I'll review based on the ECMA-376 spec from prior knowledge of the relevant sections.

Status: PASS

This PR correctly handles cell-level structured document tags. The key spec elements check out:

  • w:sdt as child of w:tr — ECMA-376 §17.4.79 (CT_Row) includes EG_ContentCellContent, which permits w:sdt (CT_SdtCell) as a legal sibling of w:tc. ✓ (spec)

  • CT_SdtCell structurew:sdtPr (0..1), w:sdtEndPr (0..1), w:sdtContent (0..1, type CT_SdtContentCell). The code extracts these in the correct order. ✓

  • Multi-cell wrappers — The PR's comment notes that CT_SdtContentCell's content model (EG_ContentCellContent with maxOccurs="unbounded") does permit multiple inner w:tc. The defensive fallback (import all inner cells, drop wrapper metadata) is a reasonable choice given v1 doesn't model SDT-grouped cell ranges. ✓

  • w:date metadata — CT_SdtDate has w:fullDate as an optional ST_DateTime attribute, with child elements w:dateFormat (CT_String), w:lid (CT_Lang), w:storeMappedDataAs (CT_SdtDateMappingType with enum {text, date, dateTime}), w:calendar (CT_Calendar with ST_CalendarType enum including 'gregorian'). The test fixture nests these inside w:date correctly — not as direct w:sdtPr children, which would be wrong. ✓ (spec)

  • Export orderw:sdt[w:sdtPr, w:sdtEndPr?, w:sdtContent] matches CT_SdtCell's sequence. ✓

  • w:tblW/w:tblLayout/w:tcW/w:gridCol in the integration fixture use standard attributes (w:w, w:type=dxa, etc.) — all spec-conformant.

The implementation treats the preserved sdtPr/sdtEndPr as opaque round-trip blobs, which is safe — no spec-defined attribute is being synthesized or stripped, so there's no risk of fabricating or losing values.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 66f4433436

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cubic analysis

No issues found across 6 files

Linked issue analysis

Linked issue: SD-3289: Preserve date content controls that wrap table cells

Status Acceptance criteria Notes
Import recognizes a cell-level wrapping a single , unwraps it and preserves the wrapper's w:sdtPr / w:sdtEndPr as per-cell metadata (cellSdt) The row normalizer extracts sdtPr/sdtEndPr and attaches cellSdt to the encoded tableCell when the wrapper contains exactly one w:tc.
Inner is routed through the existing tc-translator with correct extraParams (columnIndex/columnWidth) rather than being dropped or processed incorrectly The inner cell is passed to the existing tc translator and unit tests assert the translator was called with the expected extraParams.
Export re-wraps a table cell carrying cellSdt in a envelope, reconstructing w:sdtPr (and w:sdtEndPr when present) so date subtree metadata (e.g. w:date fullDate and child elements) round-trips Decoder replaces exported with containing preserved sdtPr/sdtEndPr and sdtContent; integration test verifies w:date attributes survive export and wrapped contains the date text.
The date text is preserved in the editor model and the exported tree contains exactly one cell with the date (no duplication or accidental drop) Integration test asserts the imported PM table has the date text in the correct cell and the exported OOXML contains a single wrapped cell with the date text.
Multi-cell SDT wrappers are handled defensively (do not claim to round-trip multi-cell grouping) — inner cells are imported but wrapper metadata is not incorrectly applied The normalizer drops wrapper metadata when sdtContent contains multiple w:tc children, emitting each inner cell without cellSdt, matching the PR's stated conservative behavior.

Re-trigger cubic

@codecov-commenter
Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

…SD-3289)

The row encoder unwraps `<w:sdt><w:sdtContent><w:tc/></w:sdtContent></w:sdt>`
into real cells, but `findTableCellAtColumn` (called from
`handleTableCellNode` to mark vMerge continuations) still filtered to direct
`<w:tc>` children of `<w:tr>`. As a result, a vertical merge whose
continuation cell lived inside a cell-level SDT was never marked
`_vMergeConsumed`; the row encoder would then re-emit it as a real cell and
break the merge round-trip.

Extract the row-child normalization to a shared helper (`row-cell-children.js`)
and route both the row encoder and the vMerge column lookup through it. The
inner `<w:tc>` is now reachable from either side; vMerge marking propagates
naturally because both sides see the same node reference.

Caught by automated review on PR #3539. Adds a unit test covering a vMerge
column with an SDT-wrapped continuation cell.
@caio-pizzol caio-pizzol self-assigned this May 27, 2026
@caio-pizzol caio-pizzol merged commit 8d5c2a1 into main May 27, 2026
70 checks passed
@caio-pizzol caio-pizzol deleted the caio-pizzol/sd-3289-preserve-cell-level-sdt branch May 27, 2026 22:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants