Skip to content

feat(res-to-affine): land tree-sitter grammar build pipeline (Phase 2a, Refs #57)#321

Merged
hyperpolymath merged 2 commits into
mainfrom
phase-2a/res-to-affine-grammar-foundation
May 21, 2026
Merged

feat(res-to-affine): land tree-sitter grammar build pipeline (Phase 2a, Refs #57)#321
hyperpolymath merged 2 commits into
mainfrom
phase-2a/res-to-affine-grammar-foundation

Conversation

@hyperpolymath
Copy link
Copy Markdown
Owner

Summary

Phase 2 of affinescript#57 (.res → .affine migration assistant) replaces the Phase-1 line-regex scanner with a tree-sitter AST walker. The grammar is manifest-vendored at editors/tree-sitter-rescript/ (since #314), but the build pipeline that turns the manifest into a loadable parser had not been wired up. This PR is that wiring — Phase 2a foundation only, no walker yet.

What changed

  • justfile — new install-grammar recipe wrapping editors/tree-sitter-rescript/scripts/install.sh.
  • install.sh — error message when tree-sitter is missing now points at both cargo install tree-sitter-cli (Rust-native, repo-preferred per CLAUDE.md) and npm install -g tree-sitter-cli. Script behaviour is unchanged when the CLI is present.
  • .github/workflows/ci.yml — new migration-assistant job that installs tree-sitter-cli from npm (the fast CI path — pre-built binary, ~5 s vs. ~5 min cargo install from source), runs just install-grammar, verifies tools/vendor/tree-sitter-rescript/src/parser.c was produced, and smoke-parses the existing fixture at tools/res-to-affine/test/fixtures/sample.res. Sits alongside vscode-smoke as a Node-using carve-out under the same reasoning (manifest dep on an npm-distributed CLI; no new TypeScript source in this repo).
  • editors/tree-sitter-rescript/README.md — documents the dual install path, the new justfile recipe, and the CI gate.

Why split 2a from 2b

The original Phase-2 estimate ('Phase 2 = tree-sitter walker') silently assumed the build pipeline was ready. It wasn't — tools/vendor/tree-sitter-rescript/ does not exist locally for any fresh clone because install.sh had never been run. Splitting this out:

  1. Surfaces and gates the environmental prerequisite explicitly in CI (the migration-assistant job is the first signal if the pinned commit stops building).
  2. Lets Phase 2b land as a port-over-stable-infra change instead of infra+port mixed together.
  3. Mirrors how Phase 1 itself was structured (feat(res-to-affine): Phase-1 migration assistant skeleton (Refs #57) #314 manifest, then docs(res-to-affine): corpus run + regex precision fixes (Refs #57) #319 corpus-run precision pass).

What's NOT in this PR

  • No walker.ml, no AST-based detection logic, no --engine CLI flag.
  • No change to the Phase-1 regex scanner — it remains the default and only engine.
  • No tree-sitter generated artefacts checked in. tools/vendor/ is gitignored.

Phase 2b plan (next PR, stacks on this)

  • walker.ml invokes tree-sitter parse on a .res file (subprocess approach — sidesteps OCaml↔C FFI, ample for an offline migration assistant).
  • s-expression output parsed via sexplib0 (already in opam deps).
  • Single anti-pattern ported: side-effect-import (the one docs(res-to-affine): corpus run + regex precision fixes (Refs #57) #319 had to band-aid heaviest with regex anchoring).
  • New --engine={scanner,walker} CLI flag, default scanner to preserve Phase-1 behaviour.
  • Snapshot tests compare scanner vs walker output on the corpus.
  • README updated to document the new flag.

Test plan

  • CI green — in particular the new migration-assistant job, which proves the install path end-to-end.
  • Local: just install-grammar produces tools/vendor/tree-sitter-rescript/src/parser.c on a fresh clone (verified-by-design — install.sh is unchanged in code behaviour).

🤖 Generated with Claude Code

hyperpolymath and others added 2 commits May 21, 2026 10:31
…a, Refs #57)

Phase 2 of the `.res → .affine` migration assistant (#57) replaces the
Phase-1 line-regex scanner with a tree-sitter AST walker. The grammar
itself is manifest-vendored in `editors/tree-sitter-rescript/` (since
#314), but the actual build pipeline that turns the manifest into a
loadable parser had not been wired up. This commit is that wiring.

- `justfile` gains an `install-grammar` recipe that wraps the existing
  `editors/tree-sitter-rescript/scripts/install.sh` so the bootstrap
  step is a discoverable `just` recipe alongside `build`/`test`/etc.
- `editors/tree-sitter-rescript/scripts/install.sh` updates its error
  message when the `tree-sitter` CLI is missing to point at both the
  Rust-native install (`cargo install tree-sitter-cli`, the repo's
  preferred path for CLI tooling per CLAUDE.md) and the existing
  Node-based one (`npm install -g tree-sitter-cli`). The script
  behaviour is unchanged when the CLI is present.
- `.github/workflows/ci.yml` gains a `migration-assistant` job that
  installs the tree-sitter CLI via npm (the fast CI path — a pre-built
  binary, ~5 s vs. ~5 min for `cargo install` from source), runs
  `just install-grammar`, verifies `tools/vendor/tree-sitter-rescript/
  src/parser.c` was produced, and smoke-parses the existing
  `tools/res-to-affine/test/fixtures/sample.res` fixture. The job sits
  alongside `vscode-smoke` as a Node-using carve-out under the same
  reasoning (manifest dep on an npm-distributed tool whose binary
  output is what we actually consume; no new TS source in this repo).
- `editors/tree-sitter-rescript/README.md` documents the dual install
  path, the new justfile recipe, and the CI gate.

Phase 2b (the actual walker — `walker.ml`, AST-based detection of
`side-effect-import`, `--engine` CLI flag, snapshot tests vs. the
regex scanner) stacks on top of this branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The 2a `migration-assistant` job ran `just install-grammar`, but
GitHub Actions ubuntu-latest runners do not ship the `just` task
runner preinstalled. Direct script invocation
`./editors/tree-sitter-rescript/scripts/install.sh` is equivalent
and removes the implicit dependency on a tool nothing else in this
workflow uses. The justfile recipe stays for local developer
ergonomics.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@hyperpolymath hyperpolymath merged commit 2110a58 into main May 21, 2026
0 of 15 checks passed
@hyperpolymath hyperpolymath deleted the phase-2a/res-to-affine-grammar-foundation branch May 21, 2026 13:10
hyperpolymath added a commit that referenced this pull request May 21, 2026
…hase 2b, Refs #57) (#322)

* feat(res-to-affine): land tree-sitter grammar build pipeline (Phase 2a, Refs #57)

Phase 2 of the `.res → .affine` migration assistant (#57) replaces the
Phase-1 line-regex scanner with a tree-sitter AST walker. The grammar
itself is manifest-vendored in `editors/tree-sitter-rescript/` (since
#314), but the actual build pipeline that turns the manifest into a
loadable parser had not been wired up. This commit is that wiring.

- `justfile` gains an `install-grammar` recipe that wraps the existing
  `editors/tree-sitter-rescript/scripts/install.sh` so the bootstrap
  step is a discoverable `just` recipe alongside `build`/`test`/etc.
- `editors/tree-sitter-rescript/scripts/install.sh` updates its error
  message when the `tree-sitter` CLI is missing to point at both the
  Rust-native install (`cargo install tree-sitter-cli`, the repo's
  preferred path for CLI tooling per CLAUDE.md) and the existing
  Node-based one (`npm install -g tree-sitter-cli`). The script
  behaviour is unchanged when the CLI is present.
- `.github/workflows/ci.yml` gains a `migration-assistant` job that
  installs the tree-sitter CLI via npm (the fast CI path — a pre-built
  binary, ~5 s vs. ~5 min for `cargo install` from source), runs
  `just install-grammar`, verifies `tools/vendor/tree-sitter-rescript/
  src/parser.c` was produced, and smoke-parses the existing
  `tools/res-to-affine/test/fixtures/sample.res` fixture. The job sits
  alongside `vscode-smoke` as a Node-using carve-out under the same
  reasoning (manifest dep on an npm-distributed tool whose binary
  output is what we actually consume; no new TS source in this repo).
- `editors/tree-sitter-rescript/README.md` documents the dual install
  path, the new justfile recipe, and the CI gate.

Phase 2b (the actual walker — `walker.ml`, AST-based detection of
`side-effect-import`, `--engine` CLI flag, snapshot tests vs. the
regex scanner) stacks on top of this branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(res-to-affine): tree-sitter AST walker for side-effect-import (Phase 2b, Refs #57)

Lands the first AST-based detection pass on top of the Phase-2a
grammar build pipeline (#321). New module `tools/res-to-affine/
walker.ml` shells out to the vendored `tree-sitter` CLI, parses its
default s-expression output (with embedded `[row, col]` byte ranges),
and walks the AST to find the `side-effect-import` anti-pattern
structurally rather than via the column-0-anchored regex the Phase-1
scanner uses.

Detection upgrade vs. Phase 1
-----------------------------

The walker flags `let _ = Mod.value` only when the binding sits at
module top level — direct child of `source_file`, or direct child of
a `block` that is the body of a `module_declaration`. The Phase-1
regex matches any column-0 occurrence of the same shape. On the
existing `sample.res` fixture the walker reports 1 finding (the
intended top-level `let _ = Pixi.Sound.register` on line 8); the
scanner reports the same 1 plus would also have reported the same
pattern nested inside a function body. The regex pattern that #319
band-aided with column-0 anchoring is eliminated structurally by
the walker.

CLI
---

New `--engine={scanner,walker}` flag on the CLI (default: scanner,
preserving Phase-1 behaviour); `--grammar-dir DIR` overrides the
walker's parser location (default: `tools/vendor/tree-sitter-rescript`,
matching `install.sh` output). When the walker hits any error
(grammar missing, tree-sitter not on PATH, s-exp parse failure), it
prints a one-line reason to stderr and falls back to the scanner —
the CLI never bails because of walker problems.

Tests
-----

`test/test_walker.ml` adds two end-to-end tests under a new
`res-to-affine-walker` alcotest run. Both auto-skip if the
`tree-sitter` CLI is missing or the vendored grammar has not been
built, so `dune runtest` on a fresh clone (no bootstrap) still
passes. The repo-root discovery walks up from cwd looking for
`dune-project` to find the source-tree grammar path; the dune sandbox
otherwise hides it.

CI
--

The existing `build` job now installs `tree-sitter-cli` (npm, fast)
and runs `install.sh` before `dune runtest`, so the walker tests
actually execute rather than auto-skip. The migration-assistant
gate added in #321 stays — it remains the dedicated first-signal
job for grammar-pin drift.

Scope discipline
----------------

`raw-js`, `untyped-exception`, `mutable-global` remain
scanner-only; their AST counterparts land in Phase 2c. The walker
exposes a stable surface (`Walker.scan : grammar_dir:string ->
path:string -> source:string -> Scanner.finding list`) that 2c
extends rather than re-architects.

Stack: #321 (Phase 2a) → this PR (Phase 2b) → Phase 2c.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant