wasm: nested constructor patterns in match arm args (#68) #71

Merged: hyperpolymath merged 1 commit into main from feat/wasm-nested-patterns on May 15, 2026
Closes #68.

What changed

compile_match in ephapax-wasm no longer uses br_table. Each arm is a Block($arm_i_fail) containing an optional tag check (for top-level Constructor patterns) plus a recursive compile_sub_pattern walk over the destructure tree. Refutable sub-patterns (Literal, nested Constructor) emit br_if $arm_i_fail on mismatch, falling through to the next arm.

```text
compute payload + flat_tag (only for n>=2 data scrutinees)
block $done : i32
  block $arm_0_fail
    (tag check if ctor; br_if $arm_0_fail on mismatch)
    (destructure args via compile_sub_pattern;
     refutable sub-pat → br_if $arm_0_fail on mismatch)
    (arm body)
    br $done
  end
  ...
  unreachable    ; typechecker proves dead for well-typed programs
end
```
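The block nesting above can be sketched as a small emitter over symbolic instructions. This is an illustration only: `Instr`, `Arm`, and `emit_match_skeleton` are hypothetical names invented for this sketch, not the `ephapax-wasm` API, and labels are kept as strings instead of relative branch depths.

```rust
/// Symbolic instructions (hypothetical stand-in for a real Wasm encoder).
#[derive(Debug, Clone, PartialEq)]
enum Instr {
    Block(String), // block $label
    BrIf(String),  // br_if $label
    Br(String),    // br $label
    Body(String),  // placeholder for destructuring and the arm body
    End,
    Unreachable,
}

/// Hypothetical arm summary: does the top-level pattern need a tag check?
struct Arm {
    has_ctor_tag_check: bool,
}

/// Emit one `$arm_i_fail` block per arm inside a single `$done` block.
fn emit_match_skeleton(arms: &[Arm]) -> Vec<Instr> {
    let mut out = vec![Instr::Block("$done".to_string())];
    for (i, arm) in arms.iter().enumerate() {
        let fail = format!("$arm_{}_fail", i);
        out.push(Instr::Block(fail.clone()));
        if arm.has_ctor_tag_check {
            // Tag mismatch leaves this arm's block and lands on the next arm.
            out.push(Instr::BrIf(fail.clone()));
        }
        out.push(Instr::Body(format!("arm {}", i)));
        // A successful arm skips every remaining arm.
        out.push(Instr::Br("$done".to_string()));
        out.push(Instr::End);
    }
    // Reached only if no arm matched.
    out.push(Instr::Unreachable);
    out.push(Instr::End);
    out
}
```

Calling `emit_match_skeleton` with two arms yields three `Block`/`End` pairs: the outer `$done` block plus one fail block per arm, with `Unreachable` as the last instruction before the final `End`.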

Why drop br_table

br_table commits one outer-tag-value to one arm. When two arms target the same outer ctor with different inner patterns (Some(0) then Some(_)), the second arm is unreachable in the br_table form. With Maranget exhaustiveness from #66, the typechecker now accepts Some(Some(_)) + Some(None) + None as well-typed, but the br_table form silently miscompiles Some(Some(42)) for that layout. Linear dispatch costs O(N) tag checks instead of O(1) — negligible for typical N≤4 and required for correctness.
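The difference between the two dispatch strategies can be modelled at the semantic level. The sketch below is a simplified model, not the codegen from #65 or this PR: it assumes `None` has tag 0 and `Some` has tag 1, restricts the scrutinee to `Option<i32>`, and models a committed arm whose inner pattern fails as "no arm selected". All names are hypothetical.

```rust
/// Inner pattern inside `Some(..)`.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Inner {
    Wild,
    Lit(i32),
}

/// Top-level pattern over an `Option<i32>` scrutinee.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Pat {
    None,
    Some(Inner),
}

// Assumed tag numbering for this model: None = 0, Some = 1.
fn tag_of_pat(p: Pat) -> u32 {
    match p {
        Pat::None => 0,
        Pat::Some(_) => 1,
    }
}

fn tag_of_val(v: Option<i32>) -> u32 {
    match v {
        None => 0,
        Some(_) => 1,
    }
}

fn arm_matches(p: Pat, v: Option<i32>) -> bool {
    match (p, v) {
        (Pat::None, None) => true,
        (Pat::Some(Inner::Wild), Some(_)) => true,
        (Pat::Some(Inner::Lit(n)), Some(x)) => n == x,
        _ => false,
    }
}

/// Linear dispatch: try arms in order; an inner mismatch falls through.
fn linear_dispatch(arms: &[Pat], v: Option<i32>) -> Option<usize> {
    arms.iter().position(|&p| arm_matches(p, v))
}

/// Table-style dispatch: the outer tag commits to the first arm with that
/// tag; if its inner pattern fails there is no way back to a later arm.
fn table_dispatch(arms: &[Pat], v: Option<i32>) -> Option<usize> {
    let tag = tag_of_val(v);
    let idx = arms.iter().position(|&p| tag_of_pat(p) == tag)?;
    if arm_matches(arms[idx], v) {
        Some(idx)
    } else {
        None
    }
}
```

For the arms `Some(0)`, `Some(_)`, `None` and the scrutinee `Some(7)`, `linear_dispatch` selects arm 1, while `table_dispatch` commits to arm 0 and never reaches arm 1.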

Sub-pattern coverage

compile_sub_pattern handles every Pattern variant:

| Variant | Behaviour |
| --- | --- |
| Wildcard | no-op |
| Var | local.set to fresh local |
| Unit | no-op (single-inhabitant) |
| Literal(Unit) | no-op |
| Literal(Bool) / Literal(I32) | i32.ne + br_if fail_depth |
| Pair | load offset 0 / offset 4; recurse |
| Tuple | right-nested pair walk; recurse |
| Constructor | emit_tag_walk + tag check + recurse |

destructure_payload walks the right-nested pair chain (offset 0 / offset 4) and recurses compile_sub_pattern for each constructor arg, sharing the same fail_depth.
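The offset 0 / offset 4 walk can be illustrated against a simulated linear memory. The exact layout is an assumption made for this sketch: a single-argument payload is taken to be the value itself, and for two or more arguments each cell is taken to hold an element at offset 0 and either a pointer to the next cell or the final element at offset 4. `load_i32`, `store_i32`, and `walk_payload` are hypothetical helpers, not functions from `ephapax-wasm`.

```rust
/// Read a little-endian i32 at byte address `addr`.
fn load_i32(mem: &[u8], addr: usize) -> i32 {
    i32::from_le_bytes([mem[addr], mem[addr + 1], mem[addr + 2], mem[addr + 3]])
}

/// Write a little-endian i32 at byte address `addr`.
fn store_i32(mem: &mut [u8], addr: usize, value: i32) {
    mem[addr..addr + 4].copy_from_slice(&value.to_le_bytes());
}

/// Collect the `n` elements of a right-nested pair chain rooted at `payload`.
/// Assumed layout: (a, (b, c)) is stored as cell1 = [a, &cell2], cell2 = [b, c].
fn walk_payload(mem: &[u8], payload: i32, n: usize) -> Vec<i32> {
    let mut out = Vec::with_capacity(n);
    if n == 0 {
        return out;
    }
    let mut cur = payload;
    // Every element except the last sits at offset 0 of its own cell.
    for _ in 0..n - 1 {
        let cell = cur as usize;
        out.push(load_i32(mem, cell));
        // Offset 4 holds the next cell pointer, or the last element itself.
        cur = load_i32(mem, cell + 4);
    }
    out.push(cur);
    out
}
```

With cells written at addresses 8 and 16 so that they encode `(10, (20, 30))`, `walk_payload(&mem, 8, 3)` returns the three elements in order. In the real codegen each element would be handed to `compile_sub_pattern` instead of being collected into a vector.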

Out of scope

  • I64/F32/F64/String literal sub-patterns — currently route to immediate br_if 1 (fail). The typechecker accepts them but codegen needs i64.ne / fNN.ne / string compare. Follow-up issue (no number yet — happy to file if you want).
  • br_table fast path for the fully-irrefutable case. Linear is correct; a fast path can be reintroduced once perf matters.

Test plan

  • cargo test -p ephapax-wasm --lib → 73 pass (was 70), +3 new:
    • compile_module_match_nested_some_some — Option(Option(I32)) covered by None/Some(None)/Some(Some(v))
    • compile_module_match_nested_with_literal — refutable literal: Some(0) | Some(v) | None
    • compile_module_match_nested_tuple_ctor — data B = Box(Option(I32)) matched by Box(Some(v)) | Box(None)
  • Each new test validates emitted bytes via wasmparser.
  • No regression in the 4 prior compile_module_match_* tests from #65.
  • Workspace lib tests: full sweep passes after AppLocker cache clears.
  • CI green on all checks.

🤖 Generated with Claude Code

Replaces the br_table arm dispatch from #65 with a linear per-arm
chain so refutable sub-patterns can fall through to the next arm.
br_table forwards exactly one tag → one arm and can't recover when
an inner pattern fails, which made arms like `Some(0) -> a |
Some(_) -> b` lose `b` at runtime once #66 admitted them as well-
typed.

## Codegen shape

```text
compute payload + flat_tag (only for n>=2 data scrutinees)
block $done : i32
  block $arm_0_fail
    (tag check if ctor; br_if $arm_0_fail on mismatch)
    (destructure args via recursive compile_sub_pattern;
     refutable sub-pat → br_if $arm_0_fail on mismatch)
    (arm body)
    br $done
  end
  block $arm_1_fail
    ; same shape
  end
  ...
  unreachable    ; typechecker proves dead for well-typed programs
end
```

## New helpers in `ephapax-wasm`

* `compile_top_pattern` — top-level binding/check for one arm.
  Wildcard / Var bind whole scrutinee; Constructor recurses into
  args via `destructure_payload`; Literal/Pair/Tuple delegate to
  `compile_sub_pattern` on the raw scrutinee value.
* `compile_sub_pattern(pat, value_local, fail_depth)` — recursive
  for every `Pattern` variant. `Var`/`Wildcard`/`Unit` are
  irrefutable; `Literal` (Bool/I32) emits `i32.ne + br_if
  fail_depth`; `Pair`/`Tuple` destructure right-nested pair cells;
  nested `Constructor` does its own `emit_tag_walk` + tag check
  before recursing into args.
* `destructure_payload` — replaces `bind_constructor_payload` /
  `bind_single_arg`. Walks the right-nested pair chain
  (offset 0 / offset 4) and recurses `compile_sub_pattern` into
  each element with the shared `fail_depth`.
* `collect_first_ctor` — top-level helper used by `compile_match`
  to detect a data-type scrutinee from the first constructor
  pattern.
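A minimal sketch of the recursive walk described for `compile_sub_pattern(pat, value_local, fail_depth)`, covering only a subset of variants. The `Pattern` and `Instr` enums and the `Emitter` struct here are simplified, hypothetical stand-ins; the real crate's types, local allocation, and instruction encoding may differ.

```rust
/// Simplified pattern subset (hypothetical; the real `Pattern` has more variants).
#[derive(Debug, Clone, PartialEq)]
enum Pattern {
    Wildcard,
    Var(String),
    LitI32(i32),
    Pair(Box<Pattern>, Box<Pattern>),
}

/// Symbolic instructions standing in for encoded Wasm.
#[derive(Debug, Clone, PartialEq)]
enum Instr {
    LocalGet(u32),
    LocalSet(u32),
    I32Const(i32),
    I32Ne,
    I32Load { offset: u32 },
    BrIf(u32),
}

struct Emitter {
    code: Vec<Instr>,
    next_local: u32,
}

impl Emitter {
    fn fresh_local(&mut self) -> u32 {
        let local = self.next_local;
        self.next_local += 1;
        local
    }

    /// Emit checks and bindings for `pat` against the value in `value_local`.
    /// Every refutable check branches to the same `fail_depth`.
    fn compile_sub_pattern(&mut self, pat: &Pattern, value_local: u32, fail_depth: u32) {
        match pat {
            // Irrefutable and binds nothing.
            Pattern::Wildcard => {}
            // Irrefutable: copy the value into a fresh local.
            Pattern::Var(_) => {
                let dst = self.fresh_local();
                self.code.push(Instr::LocalGet(value_local));
                self.code.push(Instr::LocalSet(dst));
            }
            // Refutable: leave the arm when the value differs from the literal.
            Pattern::LitI32(n) => {
                self.code.push(Instr::LocalGet(value_local));
                self.code.push(Instr::I32Const(*n));
                self.code.push(Instr::I32Ne);
                self.code.push(Instr::BrIf(fail_depth));
            }
            // Load both halves of the pair cell and recurse into each.
            Pattern::Pair(first, second) => {
                for (offset, sub) in [(0u32, first), (4u32, second)] {
                    let tmp = self.fresh_local();
                    self.code.push(Instr::LocalGet(value_local));
                    self.code.push(Instr::I32Load { offset });
                    self.code.push(Instr::LocalSet(tmp));
                    self.compile_sub_pattern(sub, tmp, fail_depth);
                }
            }
        }
    }
}
```

Compiling the pair pattern `(0, v)` in this model emits two loads (offsets 0 and 4), one `i32.ne` + `br_if` for the literal, and one `local.set` binding for `v`. Threading a single `fail_depth` through the recursion is what lets any nested mismatch abandon the whole arm.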

## Why drop br_table

* br_table has one target per tag. When two arms target the same
  outer ctor with different inner patterns (e.g. `Some(0)` then
  `Some(_)`), br_table commits to the first arm; inner refutation
  can't recover.
* With Maranget exhaustiveness (#66), `Some(Some(_))` + `Some(None)`
  + `None` is accepted as exhaustive. The br_table form silently
  miscompiles `Some(Some(42))` for this layout because both `Some`
  arms compete for tag 1.
* Linear dispatch is O(N) tag checks instead of O(1); negligible
  for typical N (≤ 4) and required for correctness once Maranget
  is on. A future PR could add a br_table fast path for the
  fully-irrefutable case.

## Behaviour preserved

* Single-ctor data type fast path: no tag walk; scrutinee IS
  payload (same as #65).
* Right-nested pair payload walk: identical layout (offset 0 /
  offset 4 chain).
* `emit_tag_walk` unchanged; reused both at top level and inside
  `compile_sub_pattern` for nested constructors.

## Tests

* `cargo test -p ephapax-wasm --lib` → 73 pass (was 70), +3 new:
  - `compile_module_match_nested_some_some` — `None | Some(None) |
    Some(Some(v))` over Option(Option(I32))
  - `compile_module_match_nested_with_literal` — `Some(0) |
    Some(v) | None` exercising refutable literal fall-through
  - `compile_module_match_nested_tuple_ctor` — `data B = Box(Option(I32))`
    matched as `Box(Some(v)) | Box(None)`
* Each new test validates the emitted bytes through `wasmparser`.
* No regressions in the 4 prior `compile_module_match_*` tests
  from #65.

## Out of scope

* I64/F32/F64/String literal sub-patterns — `compile_sub_pattern`
  currently routes them to immediate br_if-fail so the arm never
  succeeds. Real lowering is a follow-up (needs i64.ne / fNN.ne /
  string compare).
* br_table fast path for fully-irrefutable matches — keeping the
  diff small. Re-add when the perf cost matters.

Closes #68.
hyperpolymath force-pushed the feat/wasm-nested-patterns branch from eaa1ed8 to 482ed8b on May 15, 2026 07:22
hyperpolymath merged commit de436bd into main on May 15, 2026
11 checks passed
hyperpolymath deleted the feat/wasm-nested-patterns branch on May 15, 2026 07:24
Linked issue (closed by this pull request): wasm: nested constructor patterns inside ExprKind::Match arm args