22 changes: 14 additions & 8 deletions docs/COMPILER_SELF_HOSTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -27,16 +27,16 @@ EigenScript has achieved two different types of self-hosting:
- See [docs/meta_circular_evaluator.md](./meta_circular_evaluator.md)
- Located in [examples/eval.eigs](../examples/eval.eigs)

2. **Compiler Self-Hosting** (⚠️ Partial - this document)
2. **Compiler Self-Hosting** (✅ Fixpoint Achieved - this document)
- An EigenScript compiler written in EigenScript
- Compiles EigenScript to LLVM IR
- Located in `src/eigenscript/compiler/selfhost/`

## Current Status (v0.4.1)

### 🎉 Full Bootstrap Achieved!
### 🎉 Fixpoint Bootstrap Achieved!

As of v0.4.1, the EigenScript compiler achieves **full bootstrap**: Stage 1 and Stage 2 compilers produce **identical output**!
As of v0.4.1, the EigenScript compiler achieves **fixpoint bootstrap**: the Stage 2 and Stage 3 compilers produce identical output (Stage 2 = Stage 3), which shows the bootstrap is stable.

### ✅ What Works

Expand All @@ -48,15 +48,21 @@ As of v0.4.1, the EigenScript compiler achieves **full bootstrap**: Stage 1 and
- **Compile itself to create Stage 2**
- **Stage 2 Compiler**: The self-compiled compiler:
- Produces identical output to Stage 1
- Verifies the bootstrap is complete
- Can compile itself to create Stage 3
- **Stage 3 Compiler**: The third-generation compiler:
- Produces identical output to Stage 2
- Verifies the fixpoint is achieved
- **Module System**: All five compiler modules (lexer, parser, semantic, codegen, main) compile, link, and run correctly

### ✅ Bootstrap Verification
### ✅ Fixpoint Bootstrap Verification

```
Stage 1 (eigensc) ──compiles──> Stage 2 (eigensc2)
│ │
└──────── IDENTICAL OUTPUT ──────┘
Stage 1 (eigensc) ──compiles──> Stage 2 (eigensc2) ──compiles──> Stage 3 (eigensc3)
│ │ │
└──────── IDENTICAL ─────────────┴──────────── IDENTICAL ───────────┘

FIXPOINT ACHIEVED
(Stage N = Stage N+1 for all N ≥ 2)
```
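
The chain in the diagram can be modeled with a toy sketch. This is not the project's build logic: `toy_compile`, `bootstrap_stages`, and `first_fixpoint` are hypothetical names, and each "stage" is represented by a plain string rather than a real binary. The only assumption is that each stage is produced by the previous stage compiling the same source.

```python
from typing import Callable, List, Optional


def bootstrap_stages(
    compile_with: Callable[[str, str], str],
    stage0: str,
    source: str,
    count: int,
) -> List[str]:
    """Return [stage 1, ..., stage `count`], each built by the previous stage."""
    stages = []
    current = stage0
    for _ in range(count):
        current = compile_with(current, source)
        stages.append(current)
    return stages


def first_fixpoint(stages: List[str]) -> Optional[int]:
    """Return the 1-based stage number N where stage N == stage N+1, else None."""
    for i in range(len(stages) - 1):
        if stages[i] == stages[i + 1]:
            return i + 1
    return None


def toy_compile(compiler: str, source: str) -> str:
    """Toy stand-in for a compiler (hypothetical).

    The artifact depends on the source and on whether the compiling stage
    was the reference implementation or a self-built compiler.
    """
    flavor = "ref-built" if compiler == "python-reference" else "self-built"
    return f"{flavor}:{source}"


stages = bootstrap_stages(toy_compile, "python-reference", "compiler.eigs", 4)
print(first_fixpoint(stages))
```

In this toy model the artifacts stop changing from Stage 2 onward, which mirrors the `Stage N = Stage N+1` line in the diagram.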

### 🎯 Future Goals
Expand Down
77 changes: 54 additions & 23 deletions docs/SELF_HOSTING_QUICKSTART.md
Expand Up @@ -26,23 +26,28 @@ bash scripts/bootstrap_test.sh

### What This Does

The script performs a 4-stage test:
The script performs a complete bootstrap verification through 5 stages:

1. **Stage 0 → Stage 1**:
1. **Stage 0 → Stage 1**: Python reference compiler → `eigensc`
- Uses the Python reference compiler to compile the self-hosted compiler
- Creates `eigensc` executable (Stage 1 compiler)
- Creates the Stage 1 compiler executable

2. **Stage 1 Test**:
2. **Stage 1 Test**: Verify Stage 1 works
- Uses Stage 1 to compile a simple program
- Verifies it produces valid LLVM IR

3. **Stage 1 → Stage 2 (Bootstrap)**:
- Stage 1 compiles itself to create Stage 2 compiler (`eigensc2`)
3. **Stage 1 → Stage 2**: `eigensc` → `eigensc2`
- Stage 1 compiles itself to create the Stage 2 compiler
- Validates the generated LLVM IR

4. **Bootstrap Verification**:
- Compares Stage 1 and Stage 2 output on a test program
- Confirms they produce identical results
4. **Stage 2 Verification**: Compare Stage 1 and Stage 2
- Both compilers compile the same test program
- Confirms they produce identical output

5. **Stage 2 → Stage 3 (Fixpoint)**: `eigensc2` → `eigensc3`
- Stage 2 compiles the self-hosted compiler's `main.eigs` to create Stage 3
- Verifies Stage 2 and Stage 3 produce identical output on the same test program
- **Fixpoint achieved**: The compiler reproduces itself exactly
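
The comparison in stages 4 and 5 amounts to checking that several output files are byte-for-byte identical. A minimal Python sketch of that check follows. The helper names are hypothetical and the IR text is placeholder data. The actual script uses `diff -q`.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def outputs_identical(*paths: Path) -> bool:
    """True when every given file has exactly the same contents."""
    return len({file_digest(p) for p in paths}) == 1


with tempfile.TemporaryDirectory() as tmp:
    # Placeholder IR text; real files would come from eigensc/eigensc2/eigensc3.
    ir = "; EigenScript Compiled Module\ndefine i32 @main() {\n  ret i32 0\n}\n"
    paths = [Path(tmp) / name for name in ("output1.ll", "output2.ll", "output3.ll")]
    for p in paths:
        p.write_text(ir)
    all_match = outputs_identical(*paths)

    # Simulate a Stage 3 regression: one extra line breaks the fixpoint check.
    paths[2].write_text(ir + "; extra line\n")
    mismatch_detected = not outputs_identical(*paths)

print(all_match, mismatch_detected)
```

Hashing rather than diffing makes it easy to compare more than two stages at once, at the cost of not showing where the files differ.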

### Expected Output

Expand Down Expand Up @@ -73,7 +78,17 @@ Step 3: Stage 1 compiler compiles itself (bootstrap)

Step 4: Compare stage 1 and stage 2 output
-----------------------------------------
BOOTSTRAP SUCCESS: Stage 1 and Stage 2 produce identical output!
SUCCESS: Stage 1 and Stage 2 produce identical output!

Step 5: Stage 2 compiles itself (Stage 3 fixpoint)
-------------------------------------------------
SUCCESS: Stage 3 LLVM IR is valid!
SUCCESS: Stage 3 compiler created (eigensc3)!

╔═══════════════════════════════════════════════════╗
║ FIXPOINT ACHIEVED: Stage 2 = Stage 3 ║
║ The compiler reproduces itself exactly! ║
╚═══════════════════════════════════════════════════╝

========================================
Bootstrap Test Complete
Expand Down Expand Up @@ -152,12 +167,13 @@ target triple = "x86_64-pc-linux-gnu"

## What You've Accomplished

🎉 Congratulations! You've just witnessed **full bootstrap** - one of the most significant milestones in programming language development:
🎉 Congratulations! You've just witnessed **fixpoint bootstrap** - the strongest form of compiler self-hosting:

1. **Compiled a compiler** written in EigenScript using the Python reference compiler
2. **Used that compiler (Stage 1)** to compile itself, creating Stage 2
3. **Verified identical output** - Stage 1 and Stage 2 produce the same results
4. **Achieved true self-hosting** - the language can fully compile its own compiler
1. **Stage 0 → Stage 1**: Python compiled the EigenScript compiler
2. **Stage 1 → Stage 2**: The compiler compiled itself
3. **Stage 2 → Stage 3**: The self-compiled compiler compiled itself again
4. **Fixpoint verified**: Stage 2 and Stage 3 produce **identical output**
5. **True self-hosting**: The compiler can reproduce itself indefinitely

## Current Status

Expand Down Expand Up @@ -263,7 +279,7 @@ For complex programs, compare with examples in `examples/` directory.
```
┌─────────────────────────────────────────────┐
│ Stage 0: Python Reference Compiler │
│ (Production, always works)
│ (Production, always works) │
└──────────────┬──────────────────────────────┘
│ compiles
Expand All @@ -277,15 +293,25 @@ For complex programs, compare with examples in `examples/` directory.
│ compiles itself
┌─────────────────────────────────────────────┐
│ Stage 2: Second-Generation Compiler │
│ (eigensc2 - Full Bootstrap Achieved!) │
│ Stage 2: Second-Generation (eigensc2) │
│ ✅ Created by Stage 1 compiling itself │
│ ✅ Produces identical output to Stage 1 │
│ ✅ Bootstrap verification passed │
│ ✅ Can compile itself │
└──────────────┬──────────────────────────────┘
│ compiles itself
┌─────────────────────────────────────────────┐
│ Stage 3: Third-Generation (eigensc3) │
│ ✅ Created by Stage 2 compiling itself │
│ ✅ Produces identical output to Stage 2 │
│ ══════════════════════════════════════════ │
│ ║ FIXPOINT: Stage 2 = Stage 3 ║ │
│ ║ The compiler reproduces itself exactly ║ │
│ ══════════════════════════════════════════ │
└─────────────────────────────────────────────┘
```

**Current Status**: Full bootstrap achieved! Stage 1 successfully compiles itself to create Stage 2, and both produce identical output.
**Current Status**: Fixpoint bootstrap achieved! Stage 2 and Stage 3 produce identical output - the compiler can reproduce itself indefinitely.

## Key Files

Expand Down Expand Up @@ -319,7 +345,7 @@ No! EigenScript has **two types** of self-hosting:
2. **Self-Hosted Compiler** (src/eigenscript/compiler/selfhost/)
- A compiler written in EigenScript
- Generates LLVM IR
- ⚠️ Partially working (this guide)
- ✅ Fixpoint bootstrap achieved (this guide)

### Why is the compiler in EigenScript when the reference is in Python?

Expand Down Expand Up @@ -351,9 +377,14 @@ For comparison:

## Success!

If you got this far and saw the bootstrap succeed, you've witnessed something remarkable: **a programming language that can fully compile its own compiler, with Stage 1 and Stage 2 producing identical output**.
If you got this far and saw the bootstrap succeed, you've witnessed something remarkable: **a programming language that has achieved fixpoint bootstrap, where the compiler can reproduce itself exactly**.

This is the strongest form of compiler self-hosting:
- Stage 1 = Stage 2 (bootstrap works)
- Stage 2 = Stage 3 (fixpoint achieved)
- Stage N = Stage N+1 for all N ≥ 2 (stable indefinitely)
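
The last bullet follows from the second by induction, under the assumption (not stated in the source) that compilation is deterministic. Here $C(S, P)$ is notation introduced only for this sketch: the artifact produced when compiler $S$ compiles source $P$, with $P_c$ the compiler's own source.

```math
S_{N+1} = C(S_N, P_c), \qquad S_2 = S_3
\;\Longrightarrow\;
S_4 = C(S_3, P_c) = C(S_2, P_c) = S_3
\;\Longrightarrow\;
S_{N+1} = S_N \quad \text{for all } N \ge 2 .
```

Note that the test script checks equality of the stages' output on a test program, which is evidence for $S_2 = S_3$ rather than a direct comparison of the compiler binaries.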

This is one of the most significant milestones in programming language development. Many languages never achieve full bootstrap - EigenScript has!
Many languages never achieve full bootstrap; EigenScript reached a fixpoint in under a month!

🚀 **Next**: Try reading through the [complete guide](./COMPILER_SELF_HOSTING.md) to understand how it all works internally.

Expand Down
69 changes: 61 additions & 8 deletions scripts/bootstrap_test.sh
Expand Up @@ -3,10 +3,11 @@
# Tests the self-hosted compiler's ability to compile itself
#
# Bootstrap stages:
# 1. Reference compiler compiles self-hosted compiler -> eigensc (stage 1)
# 1. Reference compiler compiles self-hosted compiler -> eigensc (Stage 1)
# 2. Stage 1 compiler compiles a test program -> verify output works
# 3. Stage 1 compiler compiles itself -> eigensc2 (stage 2)
# 4. Compare stage 1 and stage 2 outputs (should be identical)
# 3. Stage 1 compiler compiles itself -> eigensc2 (Stage 2)
# 4. Compare Stage 1 and Stage 2 outputs (should be identical)
# 5. Stage 2 compiler compiles itself -> eigensc3 (Stage 3 - fixpoint verification)
#
# Current Status (v0.4.1):
# - lexer.eigs: COMPILES ✓
Expand All @@ -15,8 +16,9 @@
# - codegen.eigs: COMPILES ✓
# - main.eigs: COMPILES ✓
# - LINKING: SUCCESS ✓ -> eigensc binary created
# - RUNTIME: WORKING ✓ - stage 1 compiler generates valid, executable LLVM IR
# - RUNTIME: WORKING ✓ - Stage 1 compiler generates valid, executable LLVM IR
# - BOOTSTRAP: COMPLETE ✓ - Stage 1 and Stage 2 produce IDENTICAL output!
# - FIXPOINT: VERIFIED ✓ - Stage 2 and Stage 3 produce IDENTICAL output!
#
# Key fixes for bootstrap (v0.4.1):
# - External variable assignment in codegen (parser_token_count, etc.)
Expand Down Expand Up @@ -49,7 +51,7 @@ mkdir -p "$BUILD_DIR"
cd "$BUILD_DIR"

# Clean previous builds
rm -f *.ll *.o *.exe eigensc eigensc2
rm -f *.ll *.o *.exe eigensc eigensc2 eigensc3

echo "Step 1: Compile self-hosted compiler with reference compiler"
echo "------------------------------------------------------------"
Expand Down Expand Up @@ -86,7 +88,7 @@ if [ ! -f codegen.ll ]; then
fi

echo " Compiling main.eigs..."
eigenscript-compile "$SELFHOST_DIR/main.eigs" -o main.ll -O0
eigenscript-compile "$SELFHOST_DIR/main.eigs" -o main.ll -O0 --no-runtime
if [ ! -f main.ll ]; then
echo "ERROR: Failed to compile main.eigs"
exit 1
Expand All @@ -108,7 +110,7 @@ done

# Link all modules together
echo " Linking stage 1 compiler..."
gcc lexer.o parser.o semantic.o codegen.o main.o eigenvalue.o -o eigensc -lm
gcc -no-pie lexer.o parser.o semantic.o codegen.o main.o eigenvalue.o -o eigensc -lm

if [ ! -f eigensc ]; then
echo "ERROR: Failed to create stage 1 compiler"
Expand Down Expand Up @@ -204,7 +206,58 @@ if [ -s main_stage2.ll ]; then
./eigensc2 test_simple.eigs > output2.ll 2>&1 || true

if diff -q output1.ll output2.ll > /dev/null 2>&1; then
echo " BOOTSTRAP SUCCESS: Stage 1 and Stage 2 produce identical output!"
echo " SUCCESS: Stage 1 and Stage 2 produce identical output!"
echo ""
echo "Step 5: Stage 2 compiles itself (Stage 3 fixpoint)"
echo "-------------------------------------------------"

# Stage 2 compiles main.eigs to create Stage 3
echo " Running: ./eigensc2 $SELFHOST_DIR/main.eigs"
./eigensc2 "$SELFHOST_DIR/main.eigs" > main_stage3_raw.ll 2>&1 || true

# Filter debug output
if [ -s main_stage3_raw.ll ]; then
ir_start=$(grep -n "; EigenScript Compiled Module" main_stage3_raw.ll | head -1 | cut -d: -f1)
if [ -n "$ir_start" ]; then
tail -n +"$ir_start" main_stage3_raw.ll > main_stage3.ll
else
cp main_stage3_raw.ll main_stage3.ll
fi
fi

if [ -s main_stage3.ll ]; then
# Validate Stage 3 LLVM IR
if llvm-as main_stage3.ll -o main_stage3.bc 2>/dev/null; then
echo " SUCCESS: Stage 3 LLVM IR is valid!"

# Build Stage 3 compiler
llc main_stage3.bc -o main_stage3.s -O2 2>/dev/null || true
if [ -f main_stage3.s ]; then
gcc -c main_stage3.s -o main_stage3.o 2>/dev/null || true
if [ -f main_stage3.o ]; then
gcc -no-pie lexer.o parser.o semantic.o codegen.o main_stage3.o eigenvalue.o -o eigensc3 -lm 2>/dev/null || true
if [ -f eigensc3 ]; then
echo " SUCCESS: Stage 3 compiler created (eigensc3)!"

# Compare Stage 2 and Stage 3 output
./eigensc3 test_simple.eigs > output3.ll 2>&1 || true

if diff -q output2.ll output3.ll > /dev/null 2>&1; then
echo ""
echo " ╔═══════════════════════════════════════════════════╗"
echo " ║ FIXPOINT ACHIEVED: Stage 2 = Stage 3 ║"
echo " ║ The compiler reproduces itself exactly! ║"
echo " ╚═══════════════════════════════════════════════════╝"
else
echo " Stage 2 and Stage 3 outputs differ"
fi
fi
fi
fi
else
echo " Stage 3 LLVM IR has errors"
fi
fi
else
echo " Outputs differ (may be cosmetic differences)"
echo " Stage 1 lines: $(wc -l < output1.ll)"
Expand Down
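
The "Filter debug output" step in Step 5 keeps everything from the first line containing the `; EigenScript Compiled Module` marker and falls back to the whole file when the marker is absent. A rough Python equivalent of that `grep`/`tail` logic is sketched below. The function name and the `DEBUG:` sample lines are hypothetical.

```python
IR_MARKER = "; EigenScript Compiled Module"


def strip_debug_preamble(raw: str, marker: str = IR_MARKER) -> str:
    """Drop any lines printed before the first line containing `marker`.

    If the marker never appears, the input is returned unchanged, which
    matches the script's `cp` fallback.
    """
    lines = raw.splitlines(keepends=True)
    for index, line in enumerate(lines):
        if marker in line:
            return "".join(lines[index:])
    return raw


# Hypothetical compiler output: two debug lines followed by the IR module.
raw_output = (
    "DEBUG: lexing\n"
    "DEBUG: parsing\n"
    "; EigenScript Compiled Module\n"
    'target triple = "x86_64-pc-linux-gnu"\n'
)
cleaned = strip_debug_preamble(raw_output)
print(cleaned)
```

Because stderr is redirected into the same file as the IR, this filtering is what lets `llvm-as` see a clean module.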
13 changes: 11 additions & 2 deletions src/eigenscript/compiler/cli/compile.py
Expand Up @@ -266,6 +266,7 @@ def compile_file(
opt_level: int = 0,
target_triple: str = None,
is_lib: bool = False,
no_runtime: bool = False,
):
"""Compile an EigenScript file to LLVM IR, object code, or executable."""

Expand Down Expand Up @@ -444,10 +445,12 @@ def compile_file(
print(f" ✗ Verification failed: {verify_error}")
return 1

# Link runtime bitcode only for main programs (not libraries)
if not is_lib:
# Link runtime bitcode only for main programs (skipped for libraries or when --no-runtime is set)
if not is_lib and not no_runtime:
llvm_module = codegen.link_runtime_bitcode(llvm_module, target_triple)
print(f" ✓ Linked runtime bitcode (LTO enabled)")
elif no_runtime:
print(f" ✓ Skipping runtime linking (--no-runtime)")
else:
print(f" ✓ Library mode: runtime will be linked at final link stage")

Expand Down Expand Up @@ -555,6 +558,11 @@ def main():
parser.add_argument(
"--no-verify", action="store_true", help="Skip LLVM IR verification"
)
parser.add_argument(
"--no-runtime",
action="store_true",
help="Skip runtime bitcode linking (for separate linking with eigenvalue.o)",
)

args = parser.parse_args()

Expand All @@ -571,6 +579,7 @@ def main():
opt_level=args.optimize,
target_triple=args.target,
is_lib=args.lib,
no_runtime=args.no_runtime,
)

sys.exit(result)
Expand Down
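
The effect of the new flag can be summarized in a small standalone sketch. This is not the code in `compile.py`: `build_parser` and `runtime_action` are hypothetical helpers, the `--lib` spelling and both help strings are assumptions (the diff only shows `args.lib`), and the return values are labels standing in for the three branches of the `if`/`elif`/`else` above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a reduced parser with only the two flags relevant here."""
    parser = argparse.ArgumentParser(prog="eigenscript-compile")
    # Assumed spelling: the diff only shows that `args.lib` exists.
    parser.add_argument("--lib", action="store_true", help="Compile as a library")
    parser.add_argument(
        "--no-runtime",
        action="store_true",
        help="Skip runtime bitcode linking",
    )
    return parser


def runtime_action(is_lib: bool, no_runtime: bool) -> str:
    """Mirror the branch order of the runtime-linking decision."""
    if not is_lib and not no_runtime:
        return "link"   # main program: link runtime bitcode now
    elif no_runtime:
        return "skip"   # --no-runtime: caller links eigenvalue.o separately
    else:
        return "defer"  # library: runtime is linked at the final link stage


args = build_parser().parse_args(["--no-runtime"])
print(runtime_action(args.lib, args.no_runtime))
```

Note that with this branch order, passing both flags reports the `--no-runtime` case rather than library mode.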