Commit 870cccd

[compiler] Summaries of the compiler passes to assist agents in development (facebook#35595)
Autogenerated summaries of each of the compiler passes which allow agents to get the key ideas of a compiler pass, including key input/output invariants, without having to reprocess the file each time. In the subsequent diff this seemed to help.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/facebook/react/pull/35595).
* facebook#35607
* facebook#35298
* facebook#35596
* facebook#35573
* __->__ facebook#35595
* facebook#35539
1 parent c3b95b0 commit 870cccd

57 files changed

+10072
-0
lines changed

compiler/CLAUDE.md

Lines changed: 2 additions & 0 deletions
@@ -4,6 +4,8 @@ This document contains knowledge about the React Compiler gathered during develo
44

55
## Project Structure
66

7+
When modifying the compiler, you MUST read the documentation about that pass in `compiler/packages/babel-plugin-react-compiler/docs/passes/` to learn more about the role of that pass within the compiler.
8+
79
- `packages/babel-plugin-react-compiler/` - Main compiler package
810
- `src/HIR/` - High-level Intermediate Representation types and utilities
911
- `src/Inference/` - Effect inference passes (aliasing, mutation, etc.)
Lines changed: 157 additions & 0 deletions
@@ -0,0 +1,157 @@
1+
# lower (BuildHIR)
2+
3+
## File
4+
`src/HIR/BuildHIR.ts`
5+
6+
## Purpose
7+
Converts a Babel AST function node into a High-level Intermediate Representation (HIR), which represents code as a control-flow graph (CFG) with basic blocks, instructions, and terminals. This is the first major transformation pass in the React Compiler pipeline, enabling precise expression-level memoization analysis.
8+
9+
## Input Invariants
10+
- Input must be a valid Babel `NodePath<t.Function>` (FunctionDeclaration, FunctionExpression, or ArrowFunctionExpression)
11+
- The function must be a component or hook (determined by the environment)
12+
- Babel scope analysis must be available for binding resolution
13+
- An `Environment` instance must be provided with compiler configuration
14+
- Optional `bindings` map for nested function lowering (recursive calls)
15+
- Optional `capturedRefs` map for context variables captured from outer scope
16+
17+
## Output Guarantees
18+
- Returns `Result<HIRFunction, CompilerError>` - either a successfully lowered function or compilation errors
19+
- The HIR function contains:
20+
- A complete CFG with basic blocks (`body.blocks: Map<BlockId, BasicBlock>`)
21+
- Each block has an array of instructions and exactly one terminal
22+
- All control flow is explicit (if/else, loops, switch, logical operators, ternary)
23+
- Parameters are converted to `Place` or `SpreadPattern`
24+
- Context captures are tracked in `context` array
25+
- Function metadata (id, async, generator, directives)
26+
- All identifiers get unique `IdentifierId` values
27+
- Instructions have placeholder instruction IDs (set to 0, assigned later)
28+
- Effects are null (populated by later inference passes)
29+
30+
## Algorithm
31+
The lowering algorithm uses a recursive descent pattern with a `HIRBuilder` helper class:
32+
33+
1. **Initialization**: Create an `HIRBuilder` with environment and optional bindings. Process captured context variables.
34+
35+
2. **Parameter Processing**: For each function parameter:
36+
- Simple identifiers: resolve binding and create Place
37+
- Patterns (object/array): create temporary Place, then emit destructuring assignments
38+
- Rest elements: wrap in SpreadPattern
39+
- Unsupported: emit Todo error
40+
41+
3. **Body Processing**:
42+
- Arrow function expressions: lower body expression to temporary, emit implicit return
43+
- Block statements: recursively lower each statement
44+
45+
4. **Statement Lowering** (`lowerStatement`): Handle each statement type:
46+
- **Control flow**: Create separate basic blocks for each branch; loops connect back to their conditional (test) blocks
47+
- **Variable declarations**: Create `DeclareLocal`/`DeclareContext` or `StoreLocal`/`StoreContext` instructions
48+
- **Expressions**: Lower to temporary and discard result
49+
- **Hoisting**: Detect forward references and emit `DeclareContext` for hoisted identifiers
50+
51+
5. **Expression Lowering** (`lowerExpression`): Convert expressions to `InstructionValue`:
52+
- **Identifiers**: Create `LoadLocal`, `LoadContext`, or `LoadGlobal` based on binding
53+
- **Literals**: Create `Primitive` values
54+
- **Operators**: Create `BinaryExpression`, `UnaryExpression` etc.
55+
- **Calls**: Distinguish `CallExpression` vs `MethodCall` (member expression callee)
56+
- **Control flow expressions**: Create separate value blocks for branches (ternary, logical, optional chaining)
57+
- **JSX**: Lower to `JsxExpression` with lowered tag, props, and children
58+
59+
6. **Block Management**: The builder maintains:
60+
- A current work-in-progress block accumulating instructions
61+
- Completed blocks map
62+
- Scope stack for break/continue resolution
63+
- Exception handler stack for try/catch
64+
65+
7. **Termination**: Add implicit void return at end if no explicit return
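
The block-management and termination steps above can be sketched with a toy builder. This is only an illustration under stated assumptions: `ToyBuilder`, `ToyTerminal`, and `lowerIf` are hypothetical names, instructions are plain strings, and the method signatures are simplified guesses rather than the real `HIRBuilder` API.

```typescript
// Toy model of block management during lowering. ToyBuilder, ToyTerminal and
// lowerIf are hypothetical simplifications, not the real HIRBuilder API.
type BlockId = number;

type ToyTerminal =
  | { kind: "if"; test: string; consequent: BlockId; alternate: BlockId; fallthrough: BlockId }
  | { kind: "goto"; block: BlockId }
  | { kind: "return"; value: string | null };

interface ToyBlock {
  id: BlockId;
  instructions: string[];
  terminal: ToyTerminal;
}

class ToyBuilder {
  private nextId = 0;
  private current: { id: BlockId; instructions: string[] };
  readonly completed = new Map<BlockId, ToyBlock>();

  constructor() {
    this.current = { id: this.nextId++, instructions: [] };
  }

  // Allocate a block id up front so a terminal can refer to it before it is filled.
  reserve(): BlockId {
    return this.nextId++;
  }

  // Append an instruction to the work-in-progress block.
  push(instruction: string): void {
    this.current.instructions.push(instruction);
  }

  // Close the work-in-progress block with a terminal and continue in `next`.
  terminate(terminal: ToyTerminal, next: BlockId): void {
    const { id, instructions } = this.current;
    this.completed.set(id, { id, instructions, terminal });
    this.current = { id: next, instructions: [] };
  }
}

// Lower `if (test) { thenBody }` followed by `rest`, ending in an implicit void return.
function lowerIf(test: string, thenBody: string[], rest: string[]): Map<BlockId, ToyBlock> {
  const builder = new ToyBuilder();
  const consequent = builder.reserve();
  const fallthrough = builder.reserve();

  builder.push(`t0 = ${test}`);
  // There is no else branch here, so the alternate is the fallthrough block.
  builder.terminate(
    { kind: "if", test: "t0", consequent, alternate: fallthrough, fallthrough },
    consequent,
  );

  thenBody.forEach((instruction) => builder.push(instruction));
  builder.terminate({ kind: "goto", block: fallthrough }, fallthrough);

  rest.forEach((instruction) => builder.push(instruction));
  // Termination step: no explicit return was seen, so add a void return.
  // The freshly reserved block stays empty and is never completed.
  builder.terminate({ kind: "return", value: null }, builder.reserve());
  return builder.completed;
}

const blocks = lowerIf("LoadLocal x", ["t1 = Call f()"], ["t2 = Call g()"]);
```

In this sketch `blocks` holds three completed blocks: the entry block ending in an `if` terminal, the consequent ending in a `goto`, and the fallthrough ending in the implicit `return`.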
66+
67+
## Key Data Structures
68+
69+
### HIRBuilder (from HIRBuilder.ts)
70+
- `#current: WipBlock` - Work-in-progress block being populated
71+
- `#completed: Map<BlockId, BasicBlock>` - Finished blocks
72+
- `#scopes: Array<Scope>` - Stack for break/continue target resolution (LoopScope, LabelScope, SwitchScope)
73+
- `#exceptionHandlerStack: Array<BlockId>` - Stack of catch handlers for try/catch
74+
- `#bindings: Bindings` - Map of variable names to their identifiers
75+
- `#context: Map<t.Identifier, SourceLocation>` - Captured context variables
76+
- Methods: `push()`, `reserve()`, `enter()`, `terminate()`, `terminateWithContinuation()`
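
As an illustration of how a scope stack can resolve `break`/`continue` targets, here is a minimal sketch. The scope shapes and the `lookupBreak`/`lookupContinue` helpers are hypothetical; they are loosely modelled on the `LoopScope`, `LabelScope`, and `SwitchScope` names listed above and are not the real `HIRBuilder` implementation.

```typescript
// Hypothetical sketch of break/continue resolution against a scope stack.
type BlockId = number;

type Scope =
  | { kind: "loop"; label: string | null; continueBlock: BlockId; breakBlock: BlockId }
  | { kind: "switch"; label: string | null; breakBlock: BlockId }
  | { kind: "label"; label: string; breakBlock: BlockId };

// Walk the stack from the innermost scope outwards.
function lookupBreak(scopes: Scope[], label: string | null): BlockId {
  for (const scope of scopes.slice().reverse()) {
    // An unlabeled break targets the nearest loop or switch; a labeled break
    // targets the nearest scope carrying that label.
    const matches =
      label === null ? scope.kind === "loop" || scope.kind === "switch" : scope.label === label;
    if (matches) return scope.breakBlock;
  }
  throw new Error("break outside of a breakable scope");
}

function lookupContinue(scopes: Scope[], label: string | null): BlockId {
  for (const scope of scopes.slice().reverse()) {
    // Only loops are valid continue targets.
    if (scope.kind === "loop" && (label === null || scope.label === label)) {
      return scope.continueBlock;
    }
  }
  throw new Error("continue outside of a loop");
}

// Models: outer: for (...) { switch (...) { /* break; continue; break outer; */ } }
const scopes: Scope[] = [
  { kind: "loop", label: "outer", continueBlock: 1, breakBlock: 2 },
  { kind: "switch", label: null, breakBlock: 3 },
];
const plainBreak = lookupBreak(scopes, null); // resolves to the switch
const plainContinue = lookupContinue(scopes, null); // skips the switch, finds the loop
const labeledBreak = lookupBreak(scopes, "outer"); // resolves to the labeled loop
```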
77+
78+
### Core HIR Types
79+
- **BasicBlock**: Contains `instructions: Array<Instruction>`, `terminal: Terminal`, `preds: Set<BlockId>`, `phis: Set<Phi>`, `kind: BlockKind`
80+
- **Instruction**: Contains `id`, `lvalue` (Place), `value` (InstructionValue), `effects` (null initially), `loc`
81+
- **Terminal**: Block terminator - `if`, `branch`, `goto`, `return`, `throw`, `for`, `while`, `switch`, `ternary`, `logical`, etc.
82+
- **Place**: Reference to a value - `{kind: 'Identifier', identifier, effect, reactive, loc}`
83+
- **InstructionValue**: The operation - `LoadLocal`, `StoreLocal`, `CallExpression`, `BinaryExpression`, `FunctionExpression`, etc.
84+
85+
### Block Kinds
86+
- `block` - Regular sequential block
87+
- `loop` - Loop header/test block
88+
- `value` - Block that produces a value (ternary/logical branches)
89+
- `sequence` - Sequence expression block
90+
- `catch` - Exception handler block
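
To make the shape of these types concrete, the following sketch models a reduced CFG and derives `preds` from the terminals. The type definitions are simplified, hypothetical stand-ins (instructions are strings, only three terminal kinds exist), and `successors`, `computePreds`, and `makeBlock` are illustrative helpers, not functions taken from the compiler.

```typescript
// Simplified, hypothetical stand-ins for the HIR types described above; the
// real definitions carry many more fields and terminal kinds.
type BlockId = number;
type BlockKind = "block" | "loop" | "value" | "sequence" | "catch";

type Terminal =
  | { kind: "if"; consequent: BlockId; alternate: BlockId; fallthrough: BlockId }
  | { kind: "goto"; block: BlockId }
  | { kind: "return" };

interface BasicBlock {
  id: BlockId;
  kind: BlockKind;
  instructions: string[]; // printed form only, for illustration
  terminal: Terminal;
  preds: Set<BlockId>;
}

// Successors are read off the block's single terminal.
function successors(terminal: Terminal): BlockId[] {
  switch (terminal.kind) {
    case "if":
      return [terminal.consequent, terminal.alternate];
    case "goto":
      return [terminal.block];
    case "return":
      return [];
  }
}

// Rebuild predecessor sets from the terminals of all blocks.
function computePreds(blocks: Map<BlockId, BasicBlock>): void {
  blocks.forEach((block) => block.preds.clear());
  blocks.forEach((block) => {
    for (const succ of successors(block.terminal)) {
      const target = blocks.get(succ);
      if (target !== undefined) target.preds.add(block.id);
    }
  });
}

function makeBlock(id: BlockId, instructions: string[], terminal: Terminal): BasicBlock {
  return { id, kind: "block", instructions, terminal, preds: new Set<BlockId>() };
}

// A three-block CFG: bb0 branches to bb2 or bb1, and both of those return.
const cfg = new Map<BlockId, BasicBlock>();
cfg.set(0, makeBlock(0, ["$6 = LoadLocal x$0"], { kind: "if", consequent: 2, alternate: 1, fallthrough: 1 }));
cfg.set(2, makeBlock(2, ["$5 = Call $2($3, $4)"], { kind: "return" }));
cfg.set(1, makeBlock(1, ["$10 = Array [$9]"], { kind: "return" }));
computePreds(cfg);
```

After `computePreds`, both `bb1` and `bb2` list `bb0` as their only predecessor, and the entry block has none.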
91+
92+
## Edge Cases
93+
94+
1. **Hoisting**: Forward references to `let`/`const`/`function` declarations emit `DeclareContext` before the reference, enabling correct temporal dead zone handling
95+
96+
2. **Context Variables**: Variables captured by nested functions use `LoadContext`/`StoreContext` instead of `LoadLocal`/`StoreLocal`
97+
98+
3. **For-of/For-in Loops**: Synthesize iterator instructions (`GetIterator`, `IteratorNext`, `NextPropertyOf`)
99+
100+
4. **Optional Chaining**: Creates nested `OptionalTerminal` structures with short-circuit branches
101+
102+
5. **Logical Expressions**: Create branching structures in which the left side is stored to a temporary and the right side is evaluated only if needed
103+
104+
6. **Try/Catch**: Adds `MaybeThrowTerminal` after each instruction in try block, modeling potential control flow to handler
105+
106+
7. **JSX in fbt**: Tracks `fbtDepth` counter to handle whitespace differently in fbt/fbs tags
107+
108+
8. **Unsupported Syntax**: `var` declarations, `with` statements, inline `class` declarations, `eval` - emit appropriate errors
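
The try/catch edge case can be pictured with a small sketch. It assumes, for illustration only, that each instruction in the try body ends its own block with a maybe-throw terminal pointing at both the next block and the handler; the `lowerTryBody` helper, the `continuation`/`handler` field names, and the block numbering are hypothetical and may not match how the real pass lays out blocks.

```typescript
// Hypothetical sketch: one maybe-throw terminal after each try-body instruction.
type BlockId = number;

type Terminal =
  | { kind: "maybe-throw"; continuation: BlockId; handler: BlockId }
  | { kind: "goto"; block: BlockId };

interface Block {
  id: BlockId;
  instructions: string[];
  terminal: Terminal;
}

function lowerTryBody(
  instructions: string[],
  firstId: BlockId,
  handler: BlockId,
  fallthrough: BlockId,
): Block[] {
  // Each instruction may throw, so control can leave for the handler after it.
  const blocks = instructions.map((instruction, index): Block => ({
    id: firstId + index,
    instructions: [instruction],
    terminal: { kind: "maybe-throw", continuation: firstId + index + 1, handler },
  }));
  // If nothing threw, the try body falls through to the code after try/catch.
  blocks.push({
    id: firstId + instructions.length,
    instructions: [],
    terminal: { kind: "goto", block: fallthrough },
  });
  return blocks;
}

// Two instructions in the try body, handler block 9, fallthrough block 10.
const tryBlocks = lowerTryBody(["t0 = Call a()", "t1 = Call b()"], 1, 9, 10);
```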
109+
110+
## TODOs
111+
- `returnTypeAnnotation: null, // TODO: extract the actual return type node if present`
112+
- `TODO(gsn): In the future, we could only pass in the context identifiers that are actually used by this function and its nested functions`
113+
- Multiple `// TODO remove type cast` in destructuring pattern handling
114+
- `// TODO: should JSX namespaced names be handled here as well?`
115+
116+
## Example
117+
Input JavaScript:
118+
```javascript
119+
export default function foo(x, y) {
120+
if (x) {
121+
return foo(false, y);
122+
}
123+
return [y * 10];
124+
}
125+
```
126+
127+
Output HIR (simplified):
128+
```
129+
foo(<unknown> x$0, <unknown> y$1): <unknown> $12
130+
bb0 (block):
131+
[1] <unknown> $6 = LoadLocal <unknown> x$0
132+
[2] If (<unknown> $6) then:bb2 else:bb1 fallthrough=bb1
133+
134+
bb2 (block):
135+
predecessor blocks: bb0
136+
[3] <unknown> $2 = LoadGlobal(module) foo
137+
[4] <unknown> $3 = false
138+
[5] <unknown> $4 = LoadLocal <unknown> y$1
139+
[6] <unknown> $5 = Call <unknown> $2(<unknown> $3, <unknown> $4)
140+
[7] Return Explicit <unknown> $5
141+
142+
bb1 (block):
143+
predecessor blocks: bb0
144+
[8] <unknown> $7 = LoadLocal <unknown> y$1
145+
[9] <unknown> $8 = 10
146+
[10] <unknown> $9 = Binary <unknown> $7 * <unknown> $8
147+
[11] <unknown> $10 = Array [<unknown> $9]
148+
[12] Return Explicit <unknown> $10
149+
```
150+
151+
Key observations:
152+
- The function has 3 basic blocks: entry (bb0), consequent (bb2), alternate/fallthrough (bb1)
153+
- The if statement creates an `IfTerminal` at the end of bb0
154+
- Each branch ends with its own `ReturnTerminal`
155+
- All values are stored in temporaries (`$N`) or named identifiers (`x$0`, `y$1`)
156+
- In this printed output, instruction IDs run sequentially across the whole function (`[1]` through `[12]`) rather than restarting in each block
157+
- Types and effects are `<unknown>` at this stage (populated by later passes)
Lines changed: 182 additions & 0 deletions
@@ -0,0 +1,182 @@
1+
# enterSSA
2+
3+
## File
4+
`src/SSA/EnterSSA.ts`
5+
6+
## Purpose
7+
Converts the HIR from a non-SSA form (where variables can be reassigned) into Static Single Assignment (SSA) form, where each variable is defined exactly once and phi nodes are inserted at control flow join points to merge values from different paths.
8+
9+
## Input Invariants
10+
- The HIR must have blocks in reverse postorder (predecessors visited before successors, except for back-edges)
11+
- Block predecessor information (`block.preds`) must be populated correctly
12+
- The function's `context` array must be empty for the root function (outer function declarations)
13+
- Identifiers may be reused across multiple definitions/assignments (non-SSA form)
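
For readers unfamiliar with the ordering requirement, here is a generic sketch of computing a reverse postorder with a depth-first search. The `reversePostorder` helper and the successor-list representation are assumptions made for this example; the compiler establishes the block order in earlier passes using its own utilities.

```typescript
// Generic reverse-postorder computation over a CFG given as successor lists.
type BlockId = number;

function reversePostorder(entry: BlockId, successors: Map<BlockId, BlockId[]>): BlockId[] {
  const visited = new Set<BlockId>();
  const postorder: BlockId[] = [];
  const visit = (block: BlockId): void => {
    if (visited.has(block)) return;
    visited.add(block);
    for (const succ of successors.get(block) ?? []) visit(succ);
    // A block is emitted only after everything reachable from it.
    postorder.push(block);
  };
  visit(entry);
  return postorder.reverse();
}

// A while-loop shaped CFG: bb0 -> bb1 (test), bb1 -> bb2 (exit) or bb3 (body), bb3 -> bb1.
const loopCfg = new Map<BlockId, BlockId[]>();
loopCfg.set(0, [1]);
loopCfg.set(1, [2, 3]);
loopCfg.set(3, [1]);
loopCfg.set(2, []);
const order = reversePostorder(0, loopCfg); // [0, 1, 3, 2]
```

Every forward edge goes from an earlier block to a later one in `order`; only the back-edge `bb3 -> bb1` points backwards, which is the situation that produces incomplete phis.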
14+
15+
## Output Guarantees
16+
- Each identifier has a unique `IdentifierId` - no identifier is defined more than once
17+
- All operand references use the SSA-renamed identifiers
18+
- Phi nodes are inserted at join points where values from different control flow paths converge
19+
- Function parameters are SSA-renamed
20+
- Nested functions (FunctionExpression, ObjectMethod) are recursively converted to SSA form
21+
- Context variables (captured from outer scopes) are handled specially and not redefined
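
The single-definition guarantee can be expressed as a simple check. The `Definition` shape and `findRedefinitions` helper below are hypothetical and only illustrate the property; they are not part of the compiler.

```typescript
// Hypothetical checker for the SSA property: each identifier id is defined at most once.
interface Definition {
  id: number; // stands in for IdentifierId
  name: string;
}

// Returns the ids that are defined more than once (each reported once).
function findRedefinitions(definitions: Definition[]): number[] {
  const seen = new Set<number>();
  const duplicates: number[] = [];
  for (const def of definitions) {
    if (!seen.has(def.id)) {
      seen.add(def.id);
    } else if (duplicates.indexOf(def.id) === -1) {
      duplicates.push(def.id);
    }
  }
  return duplicates;
}

// Non-SSA input: the same identifier (id 1) is stored to three times.
const beforeSSA: Definition[] = [
  { id: 1, name: "y" },
  { id: 1, name: "y" },
  { id: 1, name: "y" },
];
// SSA output: every store and the phi get their own id.
const afterSSA: Definition[] = [
  { id: 16, name: "y" },
  { id: 22, name: "y" },
  { id: 25, name: "y" },
  { id: 27, name: "y" },
];
const violationsBefore = findRedefinitions(beforeSSA); // [1]
const violationsAfter = findRedefinitions(afterSSA); // []
```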
22+
23+
## Algorithm
24+
The pass uses the Braun et al. algorithm ("Simple and Efficient Construction of Static Single Assignment Form") with adaptations for handling loops and nested functions.
25+
26+
### Key Steps:
27+
1. **Block Traversal**: Iterate through blocks in order (assumed reverse postorder from previous passes)
28+
2. **Definition Tracking**: Maintain a per-block `defs` map from original identifiers to their SSA-renamed versions
29+
3. **Renaming**:
30+
- When a value is **defined** (lvalue), create a new SSA identifier with fresh `IdentifierId`
31+
- When a value is **used** (operand), look up the current SSA identifier via `getIdAt`
32+
4. **Phi Node Insertion**: When looking up an identifier at a block with multiple predecessors:
33+
- If all predecessors have been visited, create a phi node collecting values from each predecessor
34+
- If some predecessors are unvisited (back-edge/loop), create an "incomplete phi" that will be fixed later
35+
5. **Incomplete Phi Resolution**: When all predecessors of a block are finally visited, fix any incomplete phi nodes by populating their operands
36+
6. **Nested Function Handling**: Recursively apply SSA transformation to nested functions, temporarily adding a fake predecessor edge to enable identifier lookup from the enclosing scope
37+
38+
### Phi Node Placement Logic (`getIdAt`):
39+
- If the identifier is defined locally in the current block, return it
40+
- If at entry block with no predecessors and not found, mark as unknown (global)
41+
- If some predecessors are unvisited (loop), create incomplete phi
42+
- If exactly one predecessor, recursively look up in that predecessor
43+
- If multiple predecessors, create phi node with operands from all predecessors
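
The lookup rules above can be sketched as a toy Braun-style renamer over string variable names. Everything here is a simplified assumption for illustration: `ToySSA`, its `define`/`getIdAt`/`finish` methods, and the `name_N` naming scheme are hypothetical and do not mirror the real `SSABuilder`.

```typescript
// Toy Braun-style renaming over string variables. ToySSA and its members are
// hypothetical simplifications; they do not mirror the real SSABuilder API.
type BlockId = number;

interface Phi {
  block: BlockId;
  result: string;
  operands: Map<BlockId, string>;
}

class ToySSA {
  private counter = 0;
  private preds: Map<BlockId, BlockId[]>;
  private defs = new Map<BlockId, Map<string, string>>();
  private incomplete = new Map<BlockId, Array<{ name: string; phi: Phi }>>();
  private visited = new Set<BlockId>();
  readonly phis: Phi[] = [];

  constructor(preds: Map<BlockId, BlockId[]>) {
    this.preds = preds;
  }

  private defsOf(block: BlockId): Map<string, string> {
    let map = this.defs.get(block);
    if (map === undefined) {
      map = new Map<string, string>();
      this.defs.set(block, map);
    }
    return map;
  }

  private predsOf(block: BlockId): BlockId[] {
    return this.preds.get(block) ?? [];
  }

  // Definition: mint a fresh SSA name and record it for this block.
  define(block: BlockId, name: string): string {
    const fresh = `${name}_${this.counter++}`;
    this.defsOf(block).set(name, fresh);
    return fresh;
  }

  // Use: resolve the SSA name visible at `block`.
  getIdAt(block: BlockId, name: string): string {
    const local = this.defsOf(block).get(name);
    if (local !== undefined) return local;
    const preds = this.predsOf(block);
    if (preds.length === 0) return name; // entry block: treat as a global, keep the name
    if (preds.some((pred) => !this.visited.has(pred))) {
      // An unvisited predecessor (back-edge): create an incomplete phi for later.
      const phi = this.newPhi(block, name);
      const pending = this.incomplete.get(block) ?? [];
      pending.push({ name, phi });
      this.incomplete.set(block, pending);
      return phi.result;
    }
    const first = preds[0];
    if (preds.length === 1 && first !== undefined) {
      const id = this.getIdAt(first, name);
      this.defsOf(block).set(name, id);
      return id;
    }
    const phi = this.newPhi(block, name);
    this.fill(phi, name);
    return phi.result;
  }

  private newPhi(block: BlockId, name: string): Phi {
    const phi: Phi = { block, result: `${name}_${this.counter++}`, operands: new Map<BlockId, string>() };
    this.defsOf(block).set(name, phi.result); // record first so cyclic lookups terminate
    this.phis.push(phi);
    return phi;
  }

  private fill(phi: Phi, name: string): void {
    for (const pred of this.predsOf(phi.block)) phi.operands.set(pred, this.getIdAt(pred, name));
  }

  // Mark a block as processed and fix phis whose predecessors are now all visited.
  finish(block: BlockId): void {
    this.visited.add(block);
    this.incomplete.forEach((pending, target) => {
      if (this.predsOf(target).every((pred) => this.visited.has(pred))) {
        pending.forEach(({ name, phi }) => this.fill(phi, name));
        this.incomplete.delete(target);
      }
    });
  }
}

// While-loop shape: bb0 -> bb1 (header), bb1 -> bb3 (body) -> bb1, bb1 -> bb2 (exit).
const preds = new Map<BlockId, BlockId[]>();
preds.set(0, []);
preds.set(1, [0, 3]);
preds.set(3, [1]);
preds.set(2, [1]);
const ssa = new ToySSA(preds);
ssa.define(0, "x"); // x_0 in bb0
ssa.finish(0);
const headerUse = ssa.getIdAt(1, "x"); // incomplete phi x_1, since bb3 is unvisited
ssa.finish(1);
ssa.getIdAt(3, "x"); // resolves through bb1 to x_1
const bodyDef = ssa.define(3, "x"); // x_2
ssa.finish(3); // fixes the phi: bb0 -> x_0, bb3 -> x_2
const exitUse = ssa.getIdAt(2, "x"); // x_1
```

The driver visits blocks in the order `bb0`, `bb1`, `bb3`, `bb2`, so the header's phi stays incomplete until the body block has been finished.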
44+
45+
## Key Data Structures
46+
- **SSABuilder**: Main class managing the transformation
47+
- `#states: Map<BasicBlock, State>` - Per-block state (defs map and incomplete phis)
48+
- `unsealedPreds: Map<BasicBlock, number>` - Count of unvisited predecessors per block
49+
- `#unknown: Set<Identifier>` - Identifiers assumed to be globals
50+
- `#context: Set<Identifier>` - Context variables that should not be redefined
51+
- **State**: Per-block state containing:
52+
- `defs: Map<Identifier, Identifier>` - Maps original identifiers to SSA-renamed versions
53+
- `incompletePhis: Array<IncompletePhi>` - Phi nodes waiting for predecessor values
54+
- **IncompletePhi**: Tracks a phi node created before all predecessors were visited
55+
- `oldPlace: Place` - Original place being phi'd
56+
- `newPlace: Place` - SSA-renamed phi result place
57+
- **Phi**: The actual phi node in the HIR
58+
- `place: Place` - The result of the phi
59+
- `operands: Map<BlockId, Place>` - Maps predecessor block to the place providing the value
60+
61+
## Edge Cases
62+
- **Loops (back-edges)**: When a variable is used in a loop header before the loop body assigns it, an incomplete phi is created and later fixed when the loop body block is visited
63+
- **Globals**: If an identifier is used but never defined (reaching the entry block without a definition), it's assumed to be a global and not renamed
64+
- **Context variables**: Variables captured from an outer function scope are tracked specially and not redefined when reassigned
65+
- **Nested functions**: Function expressions and object methods are processed recursively with a temporary predecessor edge linking them to the enclosing block
66+
67+
## TODOs
68+
- `[hoisting] EnterSSA: Expected identifier to be defined before being used` - Handles cases where hoisting causes an identifier to be used before definition (throws a Todo error for graceful bailout)
69+
70+
## Example
71+
72+
### Input (simple reassignment with control flow):
73+
```javascript
74+
function foo() {
75+
let y = 2;
76+
if (y > 1) {
77+
y = 1;
78+
} else {
79+
y = 2;
80+
}
81+
let x = y;
82+
}
83+
```
84+
85+
### Before SSA (HIR):
86+
```
87+
bb0 (block):
88+
[1] $0 = 2
89+
[2] $2 = StoreLocal Let y$1 = $0
90+
[3] $7 = LoadLocal y$1
91+
[4] $8 = 1
92+
[5] $9 = Binary $7 > $8
93+
[6] If ($9) then:bb2 else:bb3 fallthrough=bb1
94+
95+
bb2 (block):
96+
predecessor blocks: bb0
97+
[7] $3 = 1
98+
[8] $4 = StoreLocal Reassign y$1 = $3 // Same y$1 reassigned
99+
[9] Goto bb1
100+
101+
bb3 (block):
102+
predecessor blocks: bb0
103+
[10] $5 = 2
104+
[11] $6 = StoreLocal Reassign y$1 = $5 // Same y$1 reassigned
105+
[12] Goto bb1
106+
107+
bb1 (block):
108+
predecessor blocks: bb2 bb3
109+
[13] $10 = LoadLocal y$1 // Which y$1?
110+
[14] $12 = StoreLocal Let x$11 = $10
111+
```
112+
113+
### After SSA:
114+
```
115+
bb0 (block):
116+
[1] $15 = 2
117+
[2] $17 = StoreLocal Let y$16 = $15 // y$16: initial definition
118+
[3] $18 = LoadLocal y$16
119+
[4] $19 = 1
120+
[5] $20 = Binary $18 > $19
121+
[6] If ($20) then:bb2 else:bb3 fallthrough=bb1
122+
123+
bb2 (block):
124+
predecessor blocks: bb0
125+
[7] $21 = 1
126+
[8] $23 = StoreLocal Reassign y$22 = $21 // y$22: new SSA name
127+
[9] Goto bb1
128+
129+
bb3 (block):
130+
predecessor blocks: bb0
131+
[10] $24 = 2
132+
[11] $26 = StoreLocal Reassign y$25 = $24 // y$25: new SSA name
133+
[12] Goto bb1
134+
135+
bb1 (block):
136+
predecessor blocks: bb2 bb3
137+
y$27: phi(bb2: y$22, bb3: y$25) // PHI NODE: merges y$22 and y$25
138+
[13] $28 = LoadLocal y$27 // Uses phi result
139+
[14] $30 = StoreLocal Let x$29 = $28
140+
```
141+
142+
### Loop Example (while loop with back-edge):
143+
```javascript
144+
function foo() {
145+
let x = 1;
146+
while (x < 10) {
147+
x = x + 1;
148+
}
149+
return x;
150+
}
151+
```
152+
153+
### After SSA:
154+
```
155+
bb0 (block):
156+
[1] $13 = 1
157+
[2] $15 = StoreLocal Let x$14 = $13 // x$14: initial definition
158+
[3] While test=bb1 loop=bb3 fallthrough=bb2
159+
160+
bb1 (loop):
161+
predecessor blocks: bb0 bb3
162+
x$16: phi(bb0: x$14, bb3: x$23) // PHI merges initial and loop-updated values
163+
[4] $17 = LoadLocal x$16
164+
[5] $18 = 10
165+
[6] $19 = Binary $17 < $18
166+
[7] Branch ($19) then:bb3 else:bb2
167+
168+
bb3 (block):
169+
predecessor blocks: bb1
170+
[8] $20 = LoadLocal x$16 // Uses phi result
171+
[9] $21 = 1
172+
[10] $22 = Binary $20 + $21
173+
[11] $24 = StoreLocal Reassign x$23 = $22 // x$23: new SSA name in loop body
174+
[12] Goto(Continue) bb1
175+
176+
bb2 (block):
177+
predecessor blocks: bb1
178+
[13] $25 = LoadLocal x$16 // Uses phi result
179+
[14] Return Explicit $25
180+
```
181+
182+
The phi node at `bb1` (the loop header) is initially created as an "incomplete phi" when first visited because `bb3` (the loop body) hasn't been visited yet. Once `bb3` is processed and its terminal is handled, the incomplete phi is fixed by calling `fixIncompletePhis` to populate the operand from `bb3`.
