diff --git a/docs/MIGRATION-ASSISTANT.adoc b/docs/MIGRATION-ASSISTANT.adoc new file mode 100644 index 00000000..3cc6d484 --- /dev/null +++ b/docs/MIGRATION-ASSISTANT.adoc @@ -0,0 +1,127 @@ +// SPDX-License-Identifier: MPL-2.0 +// SPDX-FileCopyrightText: 2026 Jonathan D.A. Jewell += ReScript → AffineScript migration assistant — architectural decision +:toc: macro +:toclevels: 2 + +Tracks issue #57 (parser + metaparser). Companion to +link:RESCRIPT-ELIMINATION.adoc[RESCRIPT-ELIMINATION.adoc], which is the +authoritative ledger for the broader estate ReScript-surface retirement. + +toc::[] + +== Context + +Estate language policy retires ReScript in favour of AffineScript → +typed-wasm. Per the inventories captured in +link:RESCRIPT-ELIMINATION.adoc[RESCRIPT-ELIMINATION.adoc] and the +upstream tracker `hyperpolymath/gitbot-fleet#148`, ~5k LOC of ReScript +remains in `gitbot-fleet/bots/sustainabot/bot-integration/src/` alone; +the idaptik tail is ~542 `.res` files plus ~80 `.ts`. By-hand +translation is impractical, *and* a literal transliterator misses the +point — the AffineScript answer to ReScript's anti-patterns is +*re-decomposition*, not a token-level rewrite. + +Issue #57 proposes a *migration assistant*: a tool that reads `.res`, +recognises the anti-patterns surfaced by the idaptik Wave 3 pilot, and +emits a `.affine` *skeleton* that surfaces the work the human migrator +still owes. + +== Decision + +* The migration assistant lives at `tools/res-to-affine/` as an OCaml + CLI built by the repo's existing `dune` toolchain. +* The canonical source-of-truth grammar for `.res` parsing is + https://github.com/rescript-lang/tree-sitter-rescript[`rescript-lang/tree-sitter-rescript`], + vendored manifest-only at `editors/tree-sitter-rescript/` (pinned to + commit `990214a83f25801dfe0226bd7e92bb71bba1970f`, version 6.0.0, + MIT-licensed and compatible with this repo's MPL-2.0). 
+* The tool ships in three phases: ++ +[cols="1,3,2"] +|=== +| Phase | What it does | Status + +| 1 +| Text-scan emitter detecting 4 of the 6 anti-patterns, emitting a + `.affine` skeleton with migration markers and the quoted original + for reference. +| this PR + +| 2 +| Replaces the text scanner with a tree-sitter AST walker reading the + vendored grammar. Adds the two deferred patterns. Same emitter + interface. +| follow-up + +| 3 +| Partial *translation* of pure-structural forms (type aliases, sum + decls, simple `let` bindings, `switch` → `match`). Effect-laden / + exception-bearing / globally-mutating regions remain TODO islands. +| follow-up +|=== +* The Phase-1 deliverable is **deliberately small and useful in + isolation**. It gates the architectural commitment to tree-sitter + behind something that already pays its way against real estate + `.res` files. + +== Alternatives considered + +=== Use the ReScript compiler's own AST (`bs-tools` / `rescript ast`) + +The richest signal is in the ReScript compiler's typed AST. Rejected +because: + +* Adding `rescript` as a build-time dependency contradicts the estate + language policy (which bans new ReScript code and treats ReScript as + the artefact to be retired). +* The ReScript compiler's AST changes across versions in + non-backwards-compatible ways; pinning would create an ongoing + compatibility burden in the wrong direction. + +=== Write a hand-rolled `.res` lexer/parser in OCaml + +We already have `lib/rescript_codegen.ml` going *affinescript → .res*, +so the grammar is partly understood. Rejected because: + +* ReScript's surface syntax is large; recreating it for a one-way + migration tool is days-to-weeks of work that the canonical + tree-sitter grammar has already done and maintains. +* The community grammar is MIT-licensed and version-pinned; the cost + of consuming it is a one-line manifest plus an install script. 
+ +=== Pattern-detector only (no AST in any phase) + +Phase 1 *is* this — but committing to it permanently would leave the +two structural anti-patterns (callback records, oversized functions) +undetected forever, and would block Phase 3 (partial translation), +which is what makes the tool earn its keep on idaptik's 542 files. + +== Consequences + +* `editors/tree-sitter-rescript/` exists for the migration pipeline, + not as an editor binding. The editor binding for AffineScript itself + remains `editors/tree-sitter-affinescript/`. +* `tools/res-to-affine/` is the first OCaml tool under `tools/` + (existing tools are shell scripts or Rust). The `dune` integration + is local to the tool's own `dune` file; no workspace changes. +* Phase 2 introduces `tree-sitter` CLI as a runtime dependency for the + migration assistant. It is *not* a build-time dependency for the + AffineScript compiler itself. CI for the migration tool's Phase-2 + tests will need to install `tree-sitter-cli`. +* The Phase plan is recorded in + link:../tools/res-to-affine/README.md[`tools/res-to-affine/README.md`]; + this document is the architectural decision, the README is the + user/contributor surface. + +== References + +* `tools/res-to-affine/README.md` — tool usage, Phase plan, design rationale. +* `editors/tree-sitter-rescript/README.md` — vendoring manifest details. +* `affinescript#57` — parser + metaparser proposal. +* `gitbot-fleet#148` — downstream tracker for the consumed ReScript subtree. +* link:RESCRIPT-ELIMINATION.adoc[`RESCRIPT-ELIMINATION.adoc`] — estate-wide ledger. +* https://github.com/hyperpolymath/idaptik/blob/main/migration/main/LESSONS.md[idaptik LESSONS.md] + — six anti-patterns the assistant targets. +* https://github.com/hyperpolymath/idaptik/blob/main/migration/main/PILOT.md[idaptik PILOT.md] + — original Wave-3 pilot that surfaced the six patterns. 
diff --git a/editors/tree-sitter-rescript/README.md b/editors/tree-sitter-rescript/README.md new file mode 100644 index 00000000..f5148b0f --- /dev/null +++ b/editors/tree-sitter-rescript/README.md @@ -0,0 +1,44 @@ + + + +# tree-sitter-rescript (vendoring manifest) + +This directory is a **manifest-only vendoring** of the canonical +[`rescript-lang/tree-sitter-rescript`][upstream] grammar. The grammar +itself is not copied into this repository — `package.json` declares it +as a dependency, and `scripts/install.sh` fetches and builds it at the +pinned commit. + +The grammar is consumed by `tools/res-to-affine/`, the `.res → .affine` +migration assistant (`affinescript#57`). It is **not** an editor binding +for AffineScript; for that, see `editors/tree-sitter-affinescript/`. + +## Pinned upstream + +- **Repository:** +- **Commit:** `990214a83f25801dfe0226bd7e92bb71bba1970f` +- **Version:** 6.0.0 +- **License:** MIT (preserved upstream; compatible with this repo's MPL-2.0) + +When updating the pin, regenerate `tools/res-to-affine/test/expected/` +snapshots, since AST shapes may shift. + +## Install + +```sh +./scripts/install.sh +``` + +This writes a `tree-sitter-rescript` directory under `tools/vendor/` +(gitignored — same convention as the WASI adapter pinning), containing +the generated parser. Requires `git` and `tree-sitter` CLI on PATH. + +## Why manifest, not copy + +The upstream grammar is ~10k lines of JS plus generated C. Copying it +into this MPL-2.0 repo would (a) bloat the tree, (b) create an ongoing +sync burden, and (c) duplicate MIT-licensed code we have no business +modifying. The manifest+install approach keeps the dependency explicit +and pinned without absorbing the source. 
+ +[upstream]: https://github.com/rescript-lang/tree-sitter-rescript diff --git a/editors/tree-sitter-rescript/package.json b/editors/tree-sitter-rescript/package.json new file mode 100644 index 00000000..cb0b17fd --- /dev/null +++ b/editors/tree-sitter-rescript/package.json @@ -0,0 +1,16 @@ +{ + "name": "@affinescript/tree-sitter-rescript-vendoring", + "version": "0.1.0", + "private": true, + "description": "Manifest-only vendoring of rescript-lang/tree-sitter-rescript for the .res -> .affine migration assistant (affinescript#57).", + "license": "MPL-2.0", + "dependencies": { + "tree-sitter-rescript": "github:rescript-lang/tree-sitter-rescript#990214a83f25801dfe0226bd7e92bb71bba1970f" + }, + "devDependencies": { + "tree-sitter-cli": "^0.25.0" + }, + "scripts": { + "install-grammar": "./scripts/install.sh" + } +} diff --git a/editors/tree-sitter-rescript/scripts/install.sh b/editors/tree-sitter-rescript/scripts/install.sh new file mode 100755 index 00000000..e4fa37dc --- /dev/null +++ b/editors/tree-sitter-rescript/scripts/install.sh @@ -0,0 +1,41 @@ +#!/usr/bin/env bash +# SPDX-License-Identifier: MPL-2.0 +# SPDX-FileCopyrightText: 2026 Jonathan D.A. Jewell +# +# Fetch and build the pinned tree-sitter-rescript grammar. +# Output goes under tools/vendor/tree-sitter-rescript/ at the repo root (gitignored). + +set -euo pipefail + +UPSTREAM_COMMIT="990214a83f25801dfe0226bd7e92bb71bba1970f" +REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/../../.." && pwd)" +# tools/vendor/ is the repo's convention for fetched-not-committed deps +# (see .gitignore line 103, mirrors the WASI adapter provisioning). +BUILD_DIR="${REPO_ROOT}/tools/vendor/tree-sitter-rescript" + +if ! command -v tree-sitter >/dev/null 2>&1; then + echo "error: tree-sitter CLI not found on PATH" >&2 + echo " install via: npm install -g tree-sitter-cli" >&2 + exit 2 +fi + +if ! 
command -v git >/dev/null 2>&1; then + echo "error: git not found on PATH" >&2 + exit 2 +fi + +mkdir -p "$(dirname "$BUILD_DIR")" + +if [ -d "$BUILD_DIR/.git" ]; then + git -C "$BUILD_DIR" fetch --quiet origin "$UPSTREAM_COMMIT" || true + git -C "$BUILD_DIR" checkout --quiet "$UPSTREAM_COMMIT" +else + rm -rf "$BUILD_DIR" + git clone --quiet https://github.com/rescript-lang/tree-sitter-rescript.git "$BUILD_DIR" + git -C "$BUILD_DIR" checkout --quiet "$UPSTREAM_COMMIT" +fi + +cd "$BUILD_DIR" +tree-sitter generate + +echo "tree-sitter-rescript built at ${BUILD_DIR} (commit ${UPSTREAM_COMMIT})" diff --git a/tools/res-to-affine/README.md b/tools/res-to-affine/README.md new file mode 100644 index 00000000..49377e1b --- /dev/null +++ b/tools/res-to-affine/README.md @@ -0,0 +1,138 @@ + + + +# `res-to-affine` — ReScript-to-AffineScript migration assistant + +A small OCaml CLI that reads a `.res` file and emits a `.affine` skeleton +with **migration markers** — comments that name each anti-pattern the +scanner found, point at the source line, and propose the AffineScript +answer the human migrator should consider before porting. + +Tracks: [`affinescript#57`](https://github.com/hyperpolymath/affinescript/issues/57) +(parser + metaparser). +Consumed by: [`hyperpolymath/gitbot-fleet#148`](https://github.com/hyperpolymath/gitbot-fleet/issues/148) +and the broader `idaptik` migration. + +## Usage + +```sh +# print skeleton to stdout +dune exec tools/res-to-affine/main.exe -- path/to/Foo.res + +# or write to a file +dune exec tools/res-to-affine/main.exe -- path/to/Foo.res -o Foo.affine +``` + +The output is **not compilable**. It is a starting point for the human: +a quoted copy of the original sits at the bottom; the top carries a +migration-considerations block; the middle is a `module` stub with +`TODO`s. The human picks the decomposition; the tool surfaces what +needs re-decomposing. 
+ +## What gets flagged (Phase 1) + +The six anti-patterns surfaced in the +[idaptik Wave 3 pilot](https://github.com/hyperpolymath/idaptik/blob/main/migration/main/LESSONS.md), +of which the line-based scanner reliably detects four: + +| Tag | Detection | AffineScript answer | +|---|---|---| +| `side-effect-import` | `let _ = Mod.foo` at top level | Explicit registration call | +| `raw-js` | `%raw(...)` or `[%bs.raw ...]` | Typed extern (`ABI-FFI-README.md`) | +| `untyped-exception` | `Promise.catch`, `Js.Exn`, `raise`, `try` | `Result[E, A]` / `Validation[E, A]` | +| `mutable-global` | `:=` operator | Affine record threaded through | + +Deferred to Phase 2 (need real AST): + +- **inline lambda callback record** — N ≥ 3 `~handler: (...) =>` lambdas + inside one record literal (collapse to a row-polymorphic record). +- **oversized function** — function body > ~50 LOC (decompose). + +## Why a skeleton and not a transliteration + +The Frontier Programming Guides' standing rule is **re-decompose, not +transliterate**. A line-for-line port preserves the source's anti-patterns +into the target language and produces `.affine` files that are technically +parseable but architecturally still ReScript. The migration assistant's +job is to *make the re-decomposition tractable*, not to skip it. So: + +- The skeleton is **honest about being incomplete** — it does not + compile, on purpose. +- The original source is **quoted at the bottom** so the migrator + doesn't tab between files while writing the port. +- Each marker links a source line to the AffineScript pattern that + replaces it, so the migrator's next action is clear. + +## Phase plan + +### Phase 1 — text-scan emitter (this PR) + +- OCaml binary builds with the repo's existing `dune` toolchain. +- `Scanner` walks lines with `str` regexes; cheap and dependency-free. +- `Emitter` writes the migration-considerations block, a `module` stub, + and the quoted source. +- Snapshot tests under `test/` ensure stable output. 
+ +This phase is **deliberately small**. It is useful immediately — runs +against any `.res` file, surfaces 4 of 6 anti-patterns, gives the +migrator a starting document — and it gates the architectural commitment +to tree-sitter in Phase 2 behind something that already pays its way. + +### Phase 2 — tree-sitter AST walker + +- Install the pinned grammar from + `editors/tree-sitter-rescript/` (manifest-only vendoring of + `rescript-lang/tree-sitter-rescript@990214a`). +- Replace `Scanner` with a walker over the s-expression output of + `tree-sitter parse --quiet`, parsed by the existing `sexplib0` + dependency. +- Adds the two deferred patterns (callback records, oversized + functions) and unlocks **structural** translation of trivial forms + (e.g. `option` → `Option[X]`, `result` → `Result[Y, X]`, + `switch x { | A => ... }` → `match x { A => ... }`). +- The `Emitter` interface does not change: same skeleton shape, same + marker schema, richer body. + +### Phase 3 — partial translation + +Once the AST walker exists, the emitter can do more than mark — it can +**translate** the pure-structural parts (type aliases, sum decls, +simple `let` bindings, switch-to-match) and leave only effect-laden, +exception-bearing, or globally-mutating regions as TODO. The skeleton +becomes a working port of ~60–80% of the input, with TODO islands +where re-decomposition is genuinely required. + +Phase 3 is when the tool earns its keep on idaptik's 542 files. + +## Testing + +```sh +dune test tools/res-to-affine/ +``` + +To regenerate snapshots after an intentional emitter change: + +```sh +cd tools/res-to-affine/test +../../../_build/default/tools/res-to-affine/main.exe \ + fixtures/sample.res > expected/sample.affine +``` + +The fixture under `test/fixtures/sample.res` is synthetic and exercises +every Phase-1 anti-pattern. Real `.res` files from the estate (e.g. 
+`gitbot-fleet/bots/sustainabot/bot-integration/src/*.res`) can be run +ad hoc through the CLI without changes to the test suite. + +## Non-goals + +- **Not a ReScript compiler.** The scanner does not parse ReScript; + even Phase 2 only walks the tree-sitter CST, not the ReScript + type-checker's AST. If a `.res` file is syntactically invalid the + tool may still emit a (less useful) skeleton. +- **Not a build-time dependency on ReScript.** The pinned grammar is a + parser, not the ReScript compiler. The estate's language policy + (CLAUDE.md) bans new ReScript code; this tool exists to **help retire + the existing ReScript surface**, not to bring more in. +- **Not for editor integration.** Editor tree-sitter bindings for + AffineScript live at `editors/tree-sitter-affinescript/`; this tool's + vendored grammar is for the migration pipeline only. diff --git a/tools/res-to-affine/dune b/tools/res-to-affine/dune new file mode 100644 index 00000000..abc26f7e --- /dev/null +++ b/tools/res-to-affine/dune @@ -0,0 +1,13 @@ +; SPDX-License-Identifier: MPL-2.0 + +(executable + (name main) + (modules main) + (public_name res-to-affine) + (package affinescript) + (libraries res_to_affine cmdliner fmt fmt.tty)) + +(library + (name res_to_affine) + (modules scanner emitter) + (libraries str)) diff --git a/tools/res-to-affine/emitter.ml b/tools/res-to-affine/emitter.ml new file mode 100644 index 00000000..1a563a24 --- /dev/null +++ b/tools/res-to-affine/emitter.ml @@ -0,0 +1,81 @@ +(* SPDX-License-Identifier: MPL-2.0 *) +(* SPDX-FileCopyrightText: 2026 Jonathan D.A. 
Jewell *) + +let module_name_of_path path = + let base = Filename.basename path in + let stem = + try Filename.chop_extension base + with Invalid_argument _ -> base + in + if String.length stem = 0 then "Module" + else + String.make 1 (Char.uppercase_ascii stem.[0]) + ^ String.sub stem 1 (String.length stem - 1) + +let bullet_for_finding (f : Scanner.finding) = + Printf.sprintf + "// - [%s] line %d: %s\n// %s" + (Scanner.kind_to_label f.kind) + f.line + (Scanner.kind_to_guidance f.kind) + f.excerpt + +let summarise_findings findings = + let buf = Buffer.create 256 in + if findings = [] then begin + Buffer.add_string buf + "// MIGRATE: scanner found no Phase-1 anti-patterns. A clean .res\n"; + Buffer.add_string buf + "// surface does not mean the port is mechanical —\n"; + Buffer.add_string buf + "// re-decomposition still applies (see PILOT.md upstream)."; + Buffer.contents buf + end else begin + Buffer.add_string buf + (Printf.sprintf + "// MIGRATE: %d migration consideration%s detected. 
Each entry below\n" + (List.length findings) + (if List.length findings = 1 then "" else "s")); + Buffer.add_string buf + "// names the pattern, source line, and the AffineScript\n"; + Buffer.add_string buf + "// answer to consider before porting."; + List.iter + (fun f -> + Buffer.add_char buf '\n'; + Buffer.add_string buf (bullet_for_finding f)) + findings; + Buffer.contents buf + end + +let quote_block source = + let lines = String.split_on_char '\n' source in + let quoted = List.map (fun l -> " " ^ l) lines in + let buf = Buffer.create (String.length source + 64) in + Buffer.add_string buf + "/* ORIGINAL RESCRIPT — retained for reference; delete once port lands.\n"; + Buffer.add_string buf (String.concat "\n" quoted); + Buffer.add_string buf "\n*/"; + Buffer.contents buf + +let emit ~module_name ~source_path ~source ~findings = + let buf = Buffer.create 4096 in + let add s = Buffer.add_string buf s in + add "// SPDX-License-Identifier: MPL-2.0\n"; + add "// SPDX-FileCopyrightText: 2026 Jonathan D.A. Jewell\n"; + add "//\n"; + add (Printf.sprintf + "// Generated by tools/res-to-affine from %s\n" source_path); + add "// This is a Phase-1 SKELETON. It does NOT compile. Bodies are TODO.\n"; + add "// See tools/res-to-affine/README.md for the migration workflow.\n"; + add "//\n"; + add (summarise_findings findings); + add "\n\n"; + add (Printf.sprintf "module %s\n\n" module_name); + add "// TODO: re-decompose the original into focused AffineScript modules.\n"; + add "// Effect-track each signature, replace mutable state with affine\n"; + add "// records, and lift Result/Validation to fail-paths.\n"; + add "\n"; + add (quote_block source); + add "\n"; + Buffer.contents buf diff --git a/tools/res-to-affine/emitter.mli b/tools/res-to-affine/emitter.mli new file mode 100644 index 00000000..db9ebe5f --- /dev/null +++ b/tools/res-to-affine/emitter.mli @@ -0,0 +1,23 @@ +(* SPDX-License-Identifier: MPL-2.0 *) +(* SPDX-FileCopyrightText: 2026 Jonathan D.A. 
Jewell *) + +(** Skeleton emitter: given a source file and its scan findings, write + a [.affine] stub with migration markers and the original source + quoted at the bottom for human reference. + + The output is intentionally a {i skeleton}, not a transliteration. + The human picks the decomposition; the tool surfaces what needs + re-decomposing. See [tools/res-to-affine/README.md] for the + rationale and the Phase 1 / 2 / 3 plan. *) + +val module_name_of_path : string -> string +(** Derive an AffineScript module name from a path. [.../Config.res] + yields ["Config"]; non-PascalCase basenames are capitalised. *) + +val emit : + module_name:string -> + source_path:string -> + source:string -> + findings:Scanner.finding list -> + string +(** Render the skeleton. The result is the complete file contents as a string. *) diff --git a/tools/res-to-affine/main.ml b/tools/res-to-affine/main.ml new file mode 100644 index 00000000..d35313e1 --- /dev/null +++ b/tools/res-to-affine/main.ml @@ -0,0 +1,78 @@ +(* SPDX-License-Identifier: MPL-2.0 *) +(* SPDX-FileCopyrightText: 2026 Jonathan D.A. Jewell *) + +(** [res-to-affine] CLI — ReScript-to-AffineScript migration assistant. + + Reads a [.res] file, scans it for four of the six anti-patterns + surfaced in the idaptik Wave 3 pilot, and emits an [.affine] skeleton with + migration markers. The original source is quoted at the bottom of + the output so the human migrating the file has it side-by-side. + + Phase 1 (this binary) uses a text scanner. Phase 2 swaps the + [Scanner] implementation for a tree-sitter AST walker reading the + vendored grammar at [editors/tree-sitter-rescript/]. See the + tool README for the full plan. 
*) + +open Res_to_affine + +let read_file path = + let ic = open_in_bin path in + let n = in_channel_length ic in + let s = really_input_string ic n in + close_in ic; + s + +let write_file path contents = + let oc = open_out_bin path in + output_string oc contents; + close_out oc + +let run input output_opt = + if not (Sys.file_exists input) then begin + Format.eprintf "res-to-affine: input not found: %s@." input; + exit 2 + end; + let source = read_file input in + let findings = Scanner.scan source in + let module_name = Emitter.module_name_of_path input in + let out = + Emitter.emit + ~module_name + ~source_path:input + ~source + ~findings + in + match output_opt with + | None -> + print_string out + | Some path -> + write_file path out; + Format.printf + "res-to-affine: %d finding%s → %s@." + (List.length findings) + (if List.length findings = 1 then "" else "s") + path + +(* ---- cmdliner wiring ---- *) + +let input_arg = + let doc = "ReScript source file to migrate." in + Cmdliner.Arg.( + required & pos 0 (some non_dir_file) None & + info [] ~docv:"INPUT.res" ~doc) + +let output_arg = + let doc = "Write the skeleton to FILE instead of stdout." in + Cmdliner.Arg.( + value & opt (some string) None & + info ["o"; "output"] ~docv:"FILE" ~doc) + +let cmd = + let doc = "Emit an AffineScript skeleton from a ReScript source file." in + let info = Cmdliner.Cmd.info "res-to-affine" ~version:"0.1.0" ~doc in + let term = + Cmdliner.Term.(const run $ input_arg $ output_arg) + in + Cmdliner.Cmd.v info term + +let () = exit (Cmdliner.Cmd.eval cmd) diff --git a/tools/res-to-affine/scanner.ml b/tools/res-to-affine/scanner.ml new file mode 100644 index 00000000..1ceac1f2 --- /dev/null +++ b/tools/res-to-affine/scanner.ml @@ -0,0 +1,92 @@ +(* SPDX-License-Identifier: MPL-2.0 *) +(* SPDX-FileCopyrightText: 2026 Jonathan D.A. 
Jewell *) + +type kind = + | Side_effect_import + | Raw_js + | Untyped_exception + | Mutable_global + +let kind_to_label = function + | Side_effect_import -> "side-effect-import" + | Raw_js -> "raw-js" + | Untyped_exception -> "untyped-exception" + | Mutable_global -> "mutable-global" + +let kind_to_guidance = function + | Side_effect_import -> + "ReScript module-load side effect. AffineScript modules do not run \ + code at load time — rewrite as an explicit registration call." + | Raw_js -> + "%raw JS block. AffineScript has no untyped FFI — replace with \ + a typed extern (see ABI-FFI-README.md) or wait for the matching \ + binding." + | Untyped_exception -> + "Untyped exception / Promise.catch. AffineScript prefers \ + Result[E, A] for fail-fast paths and Validation[E, A] for \ + accumulating errors." + | Mutable_global -> + "Top-level mutable global. AffineScript does not encourage \ + module-scoped mutation; pass state as an affine record or \ + through an effect handler." + +type finding = { + kind : kind; + line : int; + excerpt : string; +} + +(* ---- regexes, compiled once ---- *) + +let re_side_effect_import = + Str.regexp "^[ \t]*let[ \t]+_[ \t]*=[ \t]*[A-Z][a-zA-Z0-9_]*\\." + +let re_raw_js = Str.regexp_case_fold "%raw\\|\\[%bs\\.raw" + +let re_untyped_exn = + Str.regexp + "Promise\\.catch\\|Js\\.Exn\\|\\(^\\|[^a-zA-Z_]\\)raise[ (]\\|\\(^\\|[^a-zA-Z_]\\)try[ {]" + +let re_mutable_global = Str.regexp ":=" + +let trim s = + let n = String.length s in + let i = ref 0 in + while !i < n && (s.[!i] = ' ' || s.[!i] = '\t') do incr i done; + let j = ref (n - 1) in + while !j >= !i && (s.[!j] = ' ' || s.[!j] = '\t' || s.[!j] = '\r') do decr j done; + String.sub s !i (!j - !i + 1) + +let truncate s = + if String.length s <= 80 then s + else String.sub s 0 77 ^ "..." + +(* A line is "code" when it isn't a comment-only line and isn't blank. + We don't strip in-line comments; the patterns we match are unlikely + to live inside a // comment. 
*) +let is_codeish line = + let t = trim line in + if t = "" then false + else not (String.length t >= 2 && t.[0] = '/' && t.[1] = '/') + +let try_match re line = + try + let _ = Str.search_forward re line 0 in true + with Not_found -> false + +let scan (source : string) : finding list = + let lines = String.split_on_char '\n' source in + let acc = ref [] in + List.iteri + (fun i raw -> + if is_codeish raw then begin + let excerpt = truncate (trim raw) in + let lineno = i + 1 in + let push k = acc := { kind = k; line = lineno; excerpt } :: !acc in + if try_match re_side_effect_import raw then push Side_effect_import; + if try_match re_raw_js raw then push Raw_js; + if try_match re_untyped_exn raw then push Untyped_exception; + if try_match re_mutable_global raw then push Mutable_global + end) + lines; + List.rev !acc diff --git a/tools/res-to-affine/scanner.mli b/tools/res-to-affine/scanner.mli new file mode 100644 index 00000000..f0e4889f --- /dev/null +++ b/tools/res-to-affine/scanner.mli @@ -0,0 +1,43 @@ +(* SPDX-License-Identifier: MPL-2.0 *) +(* SPDX-FileCopyrightText: 2026 Jonathan D.A. Jewell *) + +(** Phase-1 text scanner for ReScript anti-patterns. + + Detects four of the six anti-patterns surfaced in the idaptik Wave 3 pilot + (see migration/main/LESSONS.md and PILOT.md upstream). This scanner + is line/regex-based; it is intentionally cheap and gives best-effort + location markers. Phase 2 replaces it with a tree-sitter AST walker + using the vendored [editors/tree-sitter-rescript] grammar. The + [Emitter] interface is stable across the two implementations. + + Patterns detected today: + - {b side-effect import} : [let _ = Mod.foo] (ReScript module-load hack) + - {b raw JS} : any line containing [%raw] (typed FFI required) + - {b untyped exception} : [Promise.catch], [Js.Exn], [raise], [try] + - {b mutable global} : any [:=] assignment ([ref] declarations alone are not matched) + + Deferred to Phase 2 (need real AST): + - {b inline lambda callback record} : N>=3 [~handler: (...) 
=>] in a record + - {b oversized function} : function body >50 LOC *) + +type kind = + | Side_effect_import + | Raw_js + | Untyped_exception + | Mutable_global + +val kind_to_label : kind -> string +(** Short tag used in emitted comment markers (e.g. ["side-effect-import"]). *) + +val kind_to_guidance : kind -> string +(** One-line human guidance for the emitter to print alongside the marker. *) + +type finding = { + kind : kind; + line : int; + excerpt : string; +} + +val scan : string -> finding list +(** [scan source] returns findings in source order. [source] is the + full .res file contents. *) diff --git a/tools/res-to-affine/test/dune b/tools/res-to-affine/test/dune new file mode 100644 index 00000000..a8f96b10 --- /dev/null +++ b/tools/res-to-affine/test/dune @@ -0,0 +1,8 @@ +; SPDX-License-Identifier: MPL-2.0 + +(test + (name test_emit) + (libraries res_to_affine alcotest) + (deps + (glob_files fixtures/*.res) + (glob_files expected/*.affine))) diff --git a/tools/res-to-affine/test/expected/sample.affine b/tools/res-to-affine/test/expected/sample.affine new file mode 100644 index 00000000..f3281384 --- /dev/null +++ b/tools/res-to-affine/test/expected/sample.affine @@ -0,0 +1,60 @@ +// SPDX-License-Identifier: MPL-2.0 +// SPDX-FileCopyrightText: 2026 Jonathan D.A. Jewell +// +// Generated by tools/res-to-affine from fixtures/sample.res +// This is a Phase-1 SKELETON. It does NOT compile. Bodies are TODO. +// See tools/res-to-affine/README.md for the migration workflow. +// +// MIGRATE: 6 migration considerations detected. Each entry below +// names the pattern, source line, and the AffineScript +// answer to consider before porting. +// - [side-effect-import] line 8: ReScript module-load side effect. AffineScript modules do not run code at load time — rewrite as an explicit registration call. +// let _ = Pixi.Sound.register +// - [raw-js] line 11: %raw JS block. 
AffineScript has no untyped FFI — replace with a typed extern (see ABI-FFI-README.md) or wait for the matching binding. +// let host = %raw(`globalThis.location.host`) +// - [mutable-global] line 15: Top-level mutable global. AffineScript does not encourage module-scoped mutation; pass state as an affine record or through an effect handler. +// currentUser := Some("alice") +// - [untyped-exception] line 19: Untyped exception / Promise.catch. AffineScript prefers Result[E, A] for fail-fast paths and Validation[E, A] for accumulating errors. +// try { +// - [untyped-exception] line 22: Untyped exception / Promise.catch. AffineScript prefers Result[E, A] for fail-fast paths and Validation[E, A] for accumulating errors. +// | Js.Exn.Error(_) => None +// - [untyped-exception] line 28: Untyped exception / Promise.catch. AffineScript prefers Result[E, A] for fail-fast paths and Validation[E, A] for accumulating errors. +// api->Promise.catch(e => Js.log(e)) + +module Sample + +// TODO: re-decompose the original into focused AffineScript modules. +// Effect-track each signature, replace mutable state with affine +// records, and lift Result/Validation to fail-paths. + +/* ORIGINAL RESCRIPT — retained for reference; delete once port lands. + // SPDX-License-Identifier: MIT + // Synthetic fixture exercising every Phase-1 anti-pattern. Not a real + // ReScript program; the scanner is line-based so it doesn't care. + + open Types + + // 1. side-effect import (Pixi sound modules) + let _ = Pixi.Sound.register + + // 2. raw JS escape hatch + let host = %raw(`globalThis.location.host`) + + // 3. mutable global ref + := assignment + let currentUser = ref(None) + currentUser := Some("alice") + + // 4. untyped exception path + let fetchUser = id => { + try { + Some(GitHub.Users.get(id)) + } catch { + | Js.Exn.Error(_) => None + } + } + + // 5. 
Promise.catch — different shape of the same anti-pattern + let load = () => + api->Promise.catch(e => Js.log(e)) + +*/ diff --git a/tools/res-to-affine/test/fixtures/sample.res b/tools/res-to-affine/test/fixtures/sample.res new file mode 100644 index 00000000..3409dd24 --- /dev/null +++ b/tools/res-to-affine/test/fixtures/sample.res @@ -0,0 +1,28 @@ +// SPDX-License-Identifier: MIT +// Synthetic fixture exercising every Phase-1 anti-pattern. Not a real +// ReScript program; the scanner is line-based so it doesn't care. + +open Types + +// 1. side-effect import (Pixi sound modules) +let _ = Pixi.Sound.register + +// 2. raw JS escape hatch +let host = %raw(`globalThis.location.host`) + +// 3. mutable global ref + := assignment +let currentUser = ref(None) +currentUser := Some("alice") + +// 4. untyped exception path +let fetchUser = id => { + try { + Some(GitHub.Users.get(id)) + } catch { + | Js.Exn.Error(_) => None + } +} + +// 5. Promise.catch — different shape of the same anti-pattern +let load = () => + api->Promise.catch(e => Js.log(e)) diff --git a/tools/res-to-affine/test/test_emit.ml b/tools/res-to-affine/test/test_emit.ml new file mode 100644 index 00000000..148fb718 --- /dev/null +++ b/tools/res-to-affine/test/test_emit.ml @@ -0,0 +1,77 @@ +(* SPDX-License-Identifier: MPL-2.0 *) +(* SPDX-FileCopyrightText: 2026 Jonathan D.A. Jewell *) + +(** Snapshot tests for the res-to-affine emitter. + + Each fixture pair lives under [test/fixtures/.res] and + [test/expected/.affine]. To regenerate after intentional + changes: + + {[ + dune exec tools/res-to-affine/main.exe -- \ + tools/res-to-affine/test/fixtures/.res \ + > tools/res-to-affine/test/expected/.affine + ]} + + Tree-sitter is NOT required by these tests; the Phase-1 scanner is + pure-OCaml. Phase 2 will add a separate suite gated on the + [editors/tree-sitter-rescript] install. 
*) + +open Res_to_affine + +let read_file path = + let ic = open_in_bin path in + let n = in_channel_length ic in + let s = really_input_string ic n in + close_in ic; + s + +let render_fixture fixture_path = + let source = read_file fixture_path in + let findings = Scanner.scan source in + Emitter.emit + ~module_name:(Emitter.module_name_of_path fixture_path) + ~source_path:fixture_path + ~source + ~findings + +let check_snapshot name = + let fixture = Printf.sprintf "fixtures/%s.res" name in + let expected = Printf.sprintf "expected/%s.affine" name in + let got = render_fixture fixture in + let want = read_file expected in + Alcotest.(check string) + (Printf.sprintf "%s snapshot" name) + want got + +let test_sample () = + check_snapshot "sample" + +let test_finding_kinds () = + let source = read_file "fixtures/sample.res" in + let kinds = + Scanner.scan source + |> List.map (fun (f : Scanner.finding) -> Scanner.kind_to_label f.kind) + |> List.sort_uniq compare + in + Alcotest.(check (list string)) + "all four Phase-1 kinds detected" + [ "mutable-global"; "raw-js"; "side-effect-import"; "untyped-exception" ] + kinds + +let test_module_name () = + Alcotest.(check string) "PascalCase basename" "Config" + (Emitter.module_name_of_path "/path/to/Config.res"); + Alcotest.(check string) "lowercase basename is capitalised" "Webhook" + (Emitter.module_name_of_path "webhook.res") + +let () = + Alcotest.run "res-to-affine" + [ + ( "snapshot", + [ Alcotest.test_case "sample.res → sample.affine" `Quick test_sample ] ); + ( "scanner", + [ Alcotest.test_case "all kinds detected" `Quick test_finding_kinds ] ); + ( "emitter", + [ Alcotest.test_case "module name derivation" `Quick test_module_name ] ); + ]
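For readers who want the scanning approach in isolation: the Phase-1 `Scanner` in this change walks the source line by line, skips blank and `//`-only lines, and records 1-based line numbers for matches. The following is a minimal stdlib-only sketch of that idea for the `:=` check alone. The names `contains`, `is_codeish` (re-implemented here with `String.trim`), and `scan_mutation` are illustrative and not part of the tool; the real scanner uses `Str` regexes rather than the plain substring test shown.

```ocaml
(* Stdlib-only sketch of a line-oriented scan. All names here are
   illustrative assumptions, not the res-to-affine API. *)

(* [contains ~sub s] is true when [sub] occurs somewhere in [s]. *)
let contains ~sub s =
  let n = String.length s and m = String.length sub in
  let rec go i = i + m <= n && (String.sub s i m = sub || go (i + 1)) in
  go 0

(* Skip blank lines and lines that consist only of a // comment. *)
let is_codeish line =
  let t = String.trim line in
  t <> "" && not (String.length t >= 2 && String.sub t 0 2 = "//")

(* Return the 1-based numbers of code lines that contain [:=]. *)
let scan_mutation source =
  String.split_on_char '\n' source
  |> List.mapi (fun i line -> (i + 1, line))
  |> List.filter (fun (_, line) -> is_codeish line && contains ~sub:":=" line)
  |> List.map fst

let () =
  let source = "// x := 1 in a comment\nlet r = ref(0)\nr := 1\n" in
  List.iter (Printf.printf "mutable-global at line %d\n") (scan_mutation source)
```

As in the fixture, the `ref(...)` declaration is not reported; only the line carrying `:=` is. A substring test keeps the sketch dependency-free, at the cost of the anchoring and alternation that regexes provide.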