Phase 1: Implement Eq. 24-style Newton-Schulz inner momentum variant (muon_ns) #5

@georgepullen

Purpose

Implement a Newton-Schulz nonlinearity path for inner momentum updates so that they align with the Eq. 24-style formulation.

Mandatory Reading (blocking)

The first comment on this issue must summarize:

  • reports/NL_IMPLEMENTATION_ORACLE.md section 6.1.1 and optimizer gap notes
  • reports/paper/NL-print.extracted.clean.txt Eq. (24)
  • src/nested_learning/optim/m3.py Newton-Schulz implementation

Required Code Anchors

  • src/nested_learning/optim/deep.py
  • src/nested_learning/optim/m3.py
  • src/nested_learning/optim/factory.py

Scope

  • Add an inner variant, muon_ns, that uses a Newton-Schulz output transform.
  • Clarify in the docs the difference between the outer Muon optimizer and the inner muon_ns memory rule.
  • Keep backward compatibility with current muon configs.
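
The inner variant in the scope above can be sketched as follows. This is a minimal, dependency-free illustration under stated assumptions, not the repository's implementation: it uses the classical cubic Newton-Schulz iteration on plain Python lists, and the names `newton_schulz`, `inner_momentum_step`, `beta`, and `use_ns` are hypothetical. The routine in src/nested_learning/optim/m3.py may use different coefficients, step counts, and tensor types, and the exact form of Eq. (24) should be taken from the paper.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def newton_schulz(g, steps=20, eps=1e-7):
    """Push the singular values of g toward 1.

    Assumption: classical cubic iteration X <- 1.5 X - 0.5 (X X^T) X,
    applied after Frobenius normalization so all singular values start <= 1.
    """
    fro = math.sqrt(sum(v * v for row in g for v in row))
    x = [[v / (fro + eps) for v in row] for row in g]
    for _ in range(steps):
        xxtx = matmul(matmul(x, transpose(x)), x)
        x = [[1.5 * p - 0.5 * q for p, q in zip(rx, rq)] for rx, rq in zip(x, xxtx)]
    return x


def inner_momentum_step(momentum, grad, beta=0.9, use_ns=True):
    """Hypothetical inner memory rule: accumulate momentum, then transform the output.

    Returns (new_momentum, update). With use_ns=False the raw momentum is
    returned as the update, standing in for a non-NS baseline path.
    """
    new_m = [[beta * m + g for m, g in zip(rm, rg)] for rm, rg in zip(momentum, grad)]
    update = newton_schulz(new_m) if use_ns else new_m
    return new_m, update
```

For a symmetric positive-definite input such as [[3, 1], [1, 2]], the iteration approaches the identity matrix, which makes a convenient deterministic toy case. Keeping the transform behind a flag leaves the untransformed path untouched, which is one way to preserve existing configs.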

Test Requirements

  • Unit tests for Newton-Schulz (NS) path output shape and numerical stability.
  • Deterministic toy-case checks.
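
One possible shape for these checks is sketched below. The helper `check_ns_path` and its `tol` parameter are hypothetical; it takes the transform as an argument so that it makes no assumption about the signature of the Newton-Schulz routine in m3.py, and it works on plain lists of rows rather than tensors.

```python
import math


def check_ns_path(transform, matrix, tol=1e-5):
    """Check shape, finiteness, determinism, and row orthonormality.

    `transform` is any callable mapping a list-of-rows matrix to a matrix.
    Raises AssertionError on failure and returns the largest deviation of
    X X^T from the identity (meaningful when rows <= cols and full rank).
    """
    out = transform(matrix)

    # Shape: the output must have the same dimensions as the input.
    assert len(out) == len(matrix), "row count changed"
    assert all(len(r) == len(m) for r, m in zip(out, matrix)), "column count changed"

    # Stability: no NaN or Inf entries.
    assert all(math.isfinite(v) for row in out for v in row), "non-finite output"

    # Determinism: the same input must give the same output.
    assert transform(matrix) == out, "transform is not deterministic"

    # Orthonormality: compare X X^T against the identity matrix.
    residual = 0.0
    for i, row_i in enumerate(out):
        for j, row_j in enumerate(out):
            dot = sum(a * b for a, b in zip(row_i, row_j))
            target = 1.0 if i == j else 0.0
            residual = max(residual, abs(dot - target))
    assert residual < tol, f"orthogonality residual {residual} exceeds {tol}"
    return residual
```

A toy case with a known answer, such as a permutation matrix that the transform should leave unchanged, gives a fixed expected output and so needs no random seed.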

Deliverables

  • Variant implementation + docs + ablation config.

Acceptance Criteria

  • No regression in outer optimizer behavior.
  • The 5k run stays finite (no NaN/Inf values) and reports the expected telemetry keys.
  • First issue comment contains mandatory reading summary.
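
The finiteness and telemetry criterion could be checked with something like the sketch below. The function name `validate_telemetry`, the record format (one dict per logged step), and the example keys are assumptions for illustration; the real telemetry schema is whatever the training loop emits.

```python
import math


def validate_telemetry(records, expected_keys):
    """Return a list of problems found in a run's telemetry records.

    `records` is assumed to be an iterable of dicts, one per logged step.
    An empty result means every record has all expected keys and every
    numeric value is finite.
    """
    problems = []
    for step, record in enumerate(records):
        missing = set(expected_keys) - set(record)
        if missing:
            problems.append(f"step {step}: missing keys {sorted(missing)}")
        for key, value in record.items():
            # Only numeric values can be NaN or Inf; skip strings and the like.
            if isinstance(value, (int, float)) and not math.isfinite(value):
                problems.append(f"step {step}: {key} is not finite ({value})")
    return problems


# Hypothetical usage with made-up key names:
# problems = validate_telemetry(run_log, expected_keys=["loss", "grad_norm"])
# assert not problems, "\n".join(problems)
```

Collecting all problems before failing, rather than stopping at the first one, makes a failed long run easier to diagnose.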

Metadata

Assignees

No one assigned

Labels

  • enhancement: New feature or request
  • execution-board: Execution board ticket set for paper alignment
  • phase-1: Phase 1: optimizer equation fidelity (Eq. 21-24)
  • quality-gate: Has explicit acceptance criteria and test gates
