diff --git a/docs/COMPILER_SELF_HOSTING.md b/docs/COMPILER_SELF_HOSTING.md index ba15e74..94cab1e 100644 --- a/docs/COMPILER_SELF_HOSTING.md +++ b/docs/COMPILER_SELF_HOSTING.md @@ -27,16 +27,16 @@ EigenScript has achieved two different types of self-hosting: - See [docs/meta_circular_evaluator.md](./meta_circular_evaluator.md) - Located in [examples/eval.eigs](../examples/eval.eigs) -2. **Compiler Self-Hosting** (⚠️ Partial - this document) +2. **Compiler Self-Hosting** (βœ… Fixpoint Achieved - this document) - An EigenScript compiler written in EigenScript - Compiles EigenScript to LLVM IR - Located in `src/eigenscript/compiler/selfhost/` ## Current Status (v0.4.1) -### πŸŽ‰ Full Bootstrap Achieved! +### πŸŽ‰ Fixpoint Bootstrap Achieved! -As of v0.4.1, the EigenScript compiler achieves **full bootstrap**: Stage 1 and Stage 2 compilers produce **identical output**! +As of v0.4.1, the EigenScript compiler achieves **fixpoint bootstrap**: the compiler can reproduce itself exactly (Stage 2 = Stage 3), proving the bootstrap is stable and complete. ### βœ… What Works @@ -48,15 +48,21 @@ As of v0.4.1, the EigenScript compiler achieves **full bootstrap**: Stage 1 and - **Compile itself to create Stage 2** - **Stage 2 Compiler**: The self-compiled compiler: - Produces identical output to Stage 1 - - Verifies the bootstrap is complete + - Can compile itself to create Stage 3 +- **Stage 3 Compiler**: The third-generation compiler: + - Produces identical output to Stage 2 + - Verifies the fixpoint is achieved - **Module System**: All five compiler modules (lexer, parser, semantic, codegen, main) compile, link, and run correctly -### βœ… Bootstrap Verification +### βœ… Fixpoint Bootstrap Verification ``` -Stage 1 (eigensc) ──compiles──> Stage 2 (eigensc2) - β”‚ β”‚ - └──────── IDENTICAL OUTPUT β”€β”€β”€β”€β”€β”€β”˜ +Stage 1 (eigensc) ──compiles──> Stage 2 (eigensc2) ──compiles──> Stage 3 (eigensc3) + β”‚ β”‚ β”‚ + └──────── IDENTICAL ─────────────┴──────────── IDENTICAL β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + + FIXPOINT ACHIEVED + (Stage N = Stage N+1 for all N β‰₯ 2) ``` ### 🎯 Future Goals diff --git a/docs/SELF_HOSTING_QUICKSTART.md b/docs/SELF_HOSTING_QUICKSTART.md index 73e9941..b7c1f27 100644 --- a/docs/SELF_HOSTING_QUICKSTART.md +++ b/docs/SELF_HOSTING_QUICKSTART.md @@ -26,23 +26,28 @@ bash scripts/bootstrap_test.sh ### What This Does -The script performs a 4-stage test: +The script performs a complete bootstrap verification through 5 stages: -1. **Stage 0 β†’ Stage 1**: +1. **Stage 0 β†’ Stage 1**: Python reference compiler β†’ `eigensc` - Uses the Python reference compiler to compile the self-hosted compiler - - Creates `eigensc` executable (Stage 1 compiler) + - Creates the Stage 1 compiler executable -2. **Stage 1 Test**: +2. **Stage 1 Test**: Verify Stage 1 works - Uses Stage 1 to compile a simple program - Verifies it produces valid LLVM IR -3. **Stage 1 β†’ Stage 2 (Bootstrap)**: - - Stage 1 compiles itself to create Stage 2 compiler (`eigensc2`) +3. **Stage 1 β†’ Stage 2**: `eigensc` β†’ `eigensc2` + - Stage 1 compiles itself to create the Stage 2 compiler - Validates the generated LLVM IR -4. **Bootstrap Verification**: - - Compares Stage 1 and Stage 2 output on a test program - - Confirms they produce identical results +4. **Stage 2 Verification**: Compare Stage 1 and Stage 2 + - Both compilers compile the same test program + - Confirms they produce identical output + +5. **Stage 2 β†’ Stage 3 (Fixpoint)**: `eigensc2` β†’ `eigensc3` + - Stage 2 compiles itself to create Stage 3 + - Verifies Stage 2 and Stage 3 produce identical output + - **Fixpoint achieved**: The compiler reproduces itself exactly ### Expected Output @@ -73,7 +78,17 @@ Step 3: Stage 1 compiler compiles itself (bootstrap) Step 4: Compare stage 1 and stage 2 output ----------------------------------------- - BOOTSTRAP SUCCESS: Stage 1 and Stage 2 produce identical output! + SUCCESS: Stage 1 and Stage 2 produce identical output! + +Step 5: Stage 2 compiles itself (Stage 3 fixpoint) +------------------------------------------------- + SUCCESS: Stage 3 LLVM IR is valid! + SUCCESS: Stage 3 compiler created (eigensc3)! + + ╔═══════════════════════════════════════════════════╗ + β•‘ FIXPOINT ACHIEVED: Stage 2 = Stage 3 β•‘ + β•‘ The compiler reproduces itself exactly! β•‘ + β•šβ•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β• ======================================== Bootstrap Test Complete @@ -152,12 +167,13 @@ target triple = "x86_64-pc-linux-gnu" ## What You've Accomplished -πŸŽ‰ Congratulations! You've just witnessed **full bootstrap** - one of the most significant milestones in programming language development: +πŸŽ‰ Congratulations! You've just witnessed **fixpoint bootstrap** - the strongest form of compiler self-hosting: -1. **Compiled a compiler** written in EigenScript using the Python reference compiler -2. **Used that compiler (Stage 1)** to compile itself, creating Stage 2 -3. **Verified identical output** - Stage 1 and Stage 2 produce the same results -4. **Achieved true self-hosting** - the language can fully compile its own compiler +1. **Stage 0 β†’ Stage 1**: Python compiled the EigenScript compiler +2. **Stage 1 β†’ Stage 2**: The compiler compiled itself +3. **Stage 2 β†’ Stage 3**: The self-compiled compiler compiled itself again +4. **Fixpoint verified**: Stage 2 and Stage 3 produce **identical output** +5. **True self-hosting**: The compiler can reproduce itself indefinitely ## Current Status @@ -263,7 +279,7 @@ For complex programs, compare with examples in `examples/` directory. ``` β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ Stage 0: Python Reference Compiler β”‚ -β”‚ (Production, always works) β”‚ +β”‚ (Production, always works) β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ compiles β–Ό @@ -277,15 +293,25 @@ For complex programs, compare with examples in `examples/` directory. β”‚ compiles itself β–Ό β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” -β”‚ Stage 2: Second-Generation Compiler β”‚ -β”‚ (eigensc2 - Full Bootstrap Achieved!) β”‚ +β”‚ Stage 2: Second-Generation (eigensc2) β”‚ β”‚ βœ… Created by Stage 1 compiling itself β”‚ β”‚ βœ… Produces identical output to Stage 1 β”‚ -β”‚ βœ… Bootstrap verification passed β”‚ +β”‚ βœ… Can compile itself β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + β”‚ compiles itself + β–Ό +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ Stage 3: Third-Generation (eigensc3) β”‚ +β”‚ βœ… Created by Stage 2 compiling itself β”‚ +β”‚ βœ… Produces identical output to Stage 2 β”‚ +β”‚ ══════════════════════════════════════════ β”‚ +β”‚ β•‘ FIXPOINT: Stage 2 = Stage 3 β•‘ β”‚ +β”‚ β•‘ The compiler reproduces itself exactly β•‘ β”‚ +β”‚ ══════════════════════════════════════════ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ ``` -**Current Status**: Full bootstrap achieved! Stage 1 successfully compiles itself to create Stage 2, and both produce identical output. +**Current Status**: Fixpoint bootstrap achieved! Stage 2 and Stage 3 produce identical output - the compiler can reproduce itself indefinitely. ## Key Files @@ -319,7 +345,7 @@ No! EigenScript has **two types** of self-hosting: 2. **Self-Hosted Compiler** (src/eigenscript/compiler/selfhost/) - A compiler written in EigenScript - Generates LLVM IR - - ⚠️ Partially working (this guide) + - βœ… Fixpoint bootstrap achieved (this guide) ### Why is the compiler in EigenScript when the reference is in Python? @@ -351,9 +377,14 @@ For comparison: ## Success! -If you got this far and saw the bootstrap succeed, you've witnessed something remarkable: **a programming language that can fully compile its own compiler, with Stage 1 and Stage 2 producing identical output**. +If you got this far and saw the bootstrap succeed, you've witnessed something remarkable: **a programming language that has achieved fixpoint bootstrap, where the compiler can reproduce itself exactly**. + +This is the strongest form of compiler self-hosting: +- Stage 1 = Stage 2 (bootstrap works) +- Stage 2 = Stage 3 (fixpoint achieved) +- Stage N = Stage N+1 for all N β‰₯ 2 (stable indefinitely) -This is one of the most significant milestones in programming language development. Many languages never achieve full bootstrap - EigenScript has! +Many languages never achieve full bootstrap - EigenScript has achieved fixpoint in under a month! πŸš€ **Next**: Try reading through the [complete guide](./COMPILER_SELF_HOSTING.md) to understand how it all works internally. diff --git a/scripts/bootstrap_test.sh b/scripts/bootstrap_test.sh index cb66f89..98a94a8 100755 --- a/scripts/bootstrap_test.sh +++ b/scripts/bootstrap_test.sh @@ -3,10 +3,11 @@ # Tests the self-hosted compiler's ability to compile itself # # Bootstrap stages: -# 1. Reference compiler compiles self-hosted compiler -> eigensc (stage 1) +# 1. Reference compiler compiles self-hosted compiler -> eigensc (Stage 1) # 2. Stage 1 compiler compiles a test program -> verify output works -# 3. Stage 1 compiler compiles itself -> eigensc2 (stage 2) -# 4. Compare stage 1 and stage 2 outputs (should be identical) +# 3. Stage 1 compiler compiles itself -> eigensc2 (Stage 2) +# 4. Compare Stage 1 and Stage 2 outputs (should be identical) +# 5. Stage 2 compiler compiles itself -> eigensc3 (Stage 3 - fixpoint verification) # # Current Status (v0.4.1): # - lexer.eigs: COMPILES βœ“ @@ -15,8 +16,9 @@ # - codegen.eigs: COMPILES βœ“ # - main.eigs: COMPILES βœ“ # - LINKING: SUCCESS βœ“ -> eigensc binary created -# - RUNTIME: WORKING βœ“ - stage 1 compiler generates valid, executable LLVM IR +# - RUNTIME: WORKING βœ“ - Stage 1 compiler generates valid, executable LLVM IR # - BOOTSTRAP: COMPLETE βœ“ - Stage 1 and Stage 2 produce IDENTICAL output! +# - FIXPOINT: VERIFIED βœ“ - Stage 2 and Stage 3 produce IDENTICAL output! # # Key fixes for bootstrap (v0.4.1): # - External variable assignment in codegen (parser_token_count, etc.) @@ -49,7 +51,7 @@ mkdir -p "$BUILD_DIR" cd "$BUILD_DIR" # Clean previous builds -rm -f *.ll *.o *.exe eigensc eigensc2 +rm -f *.ll *.o *.exe eigensc eigensc2 eigensc3 echo "Step 1: Compile self-hosted compiler with reference compiler" echo "------------------------------------------------------------" @@ -86,7 +88,7 @@ if [ ! -f codegen.ll ]; then fi echo " Compiling main.eigs..." -eigenscript-compile "$SELFHOST_DIR/main.eigs" -o main.ll -O0 +eigenscript-compile "$SELFHOST_DIR/main.eigs" -o main.ll -O0 --no-runtime if [ ! -f main.ll ]; then echo "ERROR: Failed to compile main.eigs" exit 1 @@ -108,7 +110,7 @@ done # Link all modules together echo " Linking stage 1 compiler..." -gcc lexer.o parser.o semantic.o codegen.o main.o eigenvalue.o -o eigensc -lm +gcc -no-pie lexer.o parser.o semantic.o codegen.o main.o eigenvalue.o -o eigensc -lm if [ ! -f eigensc ]; then echo "ERROR: Failed to create stage 1 compiler" @@ -204,7 +206,58 @@ if [ -s main_stage2.ll ]; then ./eigensc2 test_simple.eigs > output2.ll 2>&1 || true if diff -q output1.ll output2.ll > /dev/null 2>&1; then - echo " BOOTSTRAP SUCCESS: Stage 1 and Stage 2 produce identical output!" + echo " SUCCESS: Stage 1 and Stage 2 produce identical output!" + echo "" + echo "Step 5: Stage 2 compiles itself (Stage 3 fixpoint)" + echo "-------------------------------------------------" + + # Stage 2 compiles main.eigs to create Stage 3 + echo " Running: ./eigensc2 $SELFHOST_DIR/main.eigs" + ./eigensc2 "$SELFHOST_DIR/main.eigs" > main_stage3_raw.ll 2>&1 || true + + # Filter debug output + if [ -s main_stage3_raw.ll ]; then + ir_start=$(grep -n "; EigenScript Compiled Module" main_stage3_raw.ll | head -1 | cut -d: -f1) + if [ -n "$ir_start" ]; then + tail -n +$ir_start main_stage3_raw.ll > main_stage3.ll + else + cp main_stage3_raw.ll main_stage3.ll + fi + fi + + if [ -s main_stage3.ll ]; then + # Validate Stage 3 LLVM IR + if llvm-as main_stage3.ll -o main_stage3.bc 2>/dev/null; then + echo " SUCCESS: Stage 3 LLVM IR is valid!" + + # Build Stage 3 compiler + llc main_stage3.bc -o main_stage3.s -O2 2>/dev/null || true + if [ -f main_stage3.s ]; then + gcc -c main_stage3.s -o main_stage3.o 2>/dev/null || true + if [ -f main_stage3.o ]; then + gcc -no-pie lexer.o parser.o semantic.o codegen.o main_stage3.o eigenvalue.o -o eigensc3 -lm 2>/dev/null || true + if [ -f eigensc3 ]; then + echo " SUCCESS: Stage 3 compiler created (eigensc3)!" + + # Compare Stage 2 and Stage 3 output + ./eigensc3 test_simple.eigs > output3.ll 2>&1 || true + + if diff -q output2.ll output3.ll > /dev/null 2>&1; then + echo "" + echo " ╔═══════════════════════════════════════════════════╗" + echo " β•‘ FIXPOINT ACHIEVED: Stage 2 = Stage 3 β•‘" + echo " β•‘ The compiler reproduces itself exactly! β•‘" + echo " β•šβ•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•" + else + echo " Stage 2 and Stage 3 outputs differ" + fi + fi + fi + fi + else + echo " Stage 3 LLVM IR has errors" + fi + fi else echo " Outputs differ (may be cosmetic differences)" echo " Stage 1 lines: $(wc -l < output1.ll)" diff --git a/src/eigenscript/compiler/cli/compile.py b/src/eigenscript/compiler/cli/compile.py index aa1af90..f066d70 100644 --- a/src/eigenscript/compiler/cli/compile.py +++ b/src/eigenscript/compiler/cli/compile.py @@ -266,6 +266,7 @@ def compile_file( opt_level: int = 0, target_triple: str = None, is_lib: bool = False, + no_runtime: bool = False, ): """Compile an EigenScript file to LLVM IR, object code, or executable.""" @@ -444,10 +445,12 @@ def compile_file( print(f" βœ— Verification failed: {verify_error}") return 1 - # Link runtime bitcode only for main programs (not libraries) - if not is_lib: + # Link runtime bitcode only for main programs (not libraries or when --no-runtime) + if not is_lib and not no_runtime: llvm_module = codegen.link_runtime_bitcode(llvm_module, target_triple) print(f" βœ“ Linked runtime bitcode (LTO enabled)") + elif no_runtime: + print(f" βœ“ Skipping runtime linking (--no-runtime)") else: print(f" βœ“ Library mode: runtime will be linked at final link stage") @@ -555,6 +558,11 @@ def main(): parser.add_argument( "--no-verify", action="store_true", help="Skip LLVM IR verification" ) + parser.add_argument( + "--no-runtime", + action="store_true", + help="Skip runtime bitcode linking (for separate linking with eigenvalue.o)", + ) args = parser.parse_args() @@ -571,6 +579,7 @@ def main(): opt_level=args.optimize, target_triple=args.target, is_lib=args.lib, + no_runtime=args.no_runtime, ) sys.exit(result)