feat: add optional llm narrative summaries for repo onboarding

baboonzero · baboonzero · commit 97d051d20aea · 2026-02-28T17:01:33.000+08:00
diff --git a/README.md b/README.md
@@ -145,6 +145,7 @@ python scripts/analyze.py analyze \
   --mode standard \
   --audience nontech \
   --overview-length medium \
+  --enable-llm-descriptions true \
   --enable-web-enrichment true
 ```
 
@@ -153,6 +154,11 @@ Useful optional controls:
 - `--include-glob "<pattern>"` (repeatable) to scope analysis to specific paths
 - `--exclude-glob "<pattern>"` (repeatable) to remove generated/irrelevant files
 
+For LLM-based narrative summaries:
+
+- Set `CODE_EXPLAINER_LLM_API_KEY` (or `OPENAI_API_KEY`)
+- Optional: `CODE_EXPLAINER_LLM_BASE_URL`, `CODE_EXPLAINER_LLM_MODEL`
+
 ## Install From GitHub (For Other Developers)
 
 Using Skills CLI:
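
The environment variables named in the README hunk above could be resolved roughly as follows. This is a minimal sketch, not the code in `llm_describe.py` (whose body is not shown in this diff): the helper name `resolve_llm_config`, the `LlmConfig` type, and the precedence of `CODE_EXPLAINER_LLM_API_KEY` over `OPENAI_API_KEY` are assumptions made for illustration.

```python
import os
from typing import Mapping, Optional, TypedDict


class LlmConfig(TypedDict):
    """Hypothetical container for the three documented settings."""
    api_key: Optional[str]
    base_url: Optional[str]
    model: Optional[str]


def resolve_llm_config(env: Mapping[str, str] = os.environ) -> LlmConfig:
    """Hypothetical helper: read LLM settings from environment variables."""

    def _get(name: str) -> Optional[str]:
        # Treat unset and blank values the same way.
        value = env.get(name, "").strip()
        return value or None

    return {
        # Assumption: the tool-specific key wins over the generic one.
        "api_key": _get("CODE_EXPLAINER_LLM_API_KEY") or _get("OPENAI_API_KEY"),
        # Optional settings stay None when unset; no default is invented here.
        "base_url": _get("CODE_EXPLAINER_LLM_BASE_URL"),
        "model": _get("CODE_EXPLAINER_LLM_MODEL"),
    }
```

A caller would likely skip the LLM step when `api_key` is `None` and fall back to the deterministic summaries.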
diff --git a/code-explainer/SKILL.md b/code-explainer/SKILL.md
@@ -37,6 +37,7 @@ python scripts/analyze.py analyze \
   --overview-length <short|medium|long> \
   --include-glob <pattern> \
   --exclude-glob <pattern> \
+  --enable-llm-descriptions <true|false> \
   --enable-web-enrichment <true|false>
 ```
 
@@ -45,6 +46,7 @@ Defaults:
 - `mode=standard`
 - `audience=nontech`
 - `overview-length=medium`
+- `enable-llm-descriptions=true`
 - `enable-web-enrichment=true`
 
 ## Dependencies
@@ -77,18 +79,21 @@ bash ./scripts/install_runtime.sh
 2. Local index build (files/modules/symbol candidates).
 3. Stack/entrypoint/dependency/flow extraction.
 4. Documentation ingestion (`coverage_report.json`).
-5. Optional DeepWiki + web enrichment with attribution.
-6. Mermaid generation (Context + Container + flow set).
-7. Mermaid validation.
-8. SVG then PNG rendering.
-9. Overview + deep markdown generation.
-10. Quality gates and confidence report generation.
+5. Optional LLM narrative generation (`llm_summary.json`).
+6. Optional DeepWiki + web enrichment with attribution.
+7. Mermaid generation (Context + Container + flow set).
+8. Mermaid validation.
+9. SVG then PNG rendering.
+10. Overview + deep markdown generation.
+11. Quality gates and confidence report generation.
 
 ## Notes
 
 - For GitHub URLs, `git` must be available on PATH.
 - For high-fidelity diagram rendering, `mmdc` should be installed.
 - Without `mmdc`, fallback rendering is used and flagged in reports.
+- For LLM narrative summaries, set `CODE_EXPLAINER_LLM_API_KEY` (or `OPENAI_API_KEY`).
+- Optional: set `CODE_EXPLAINER_LLM_BASE_URL` and `CODE_EXPLAINER_LLM_MODEL`.
 - This skill does not mutate the analyzed target repository.
 
 ## Dependency Troubleshooting
diff --git a/code-explainer/assets/templates/deep_architecture.md.j2 b/code-explainer/assets/templates/deep_architecture.md.j2
@@ -28,6 +28,10 @@ Architecture Pattern: **{{architecture_pattern}}**
 
 {{docs_coverage}}
 
+## Suggested Deep-Dive Starters
+
+{{llm_deep_dive_starters}}
+
 ## Where To Modify for Common Changes
 
 {{where_to_modify}}
diff --git a/code-explainer/assets/templates/overview.md.j2 b/code-explainer/assets/templates/overview.md.j2
@@ -25,6 +25,10 @@ It appears to follow a **{{architecture_pattern}}** architecture with a stack ce
 
 {{building_blocks}}
 
+## Directory Map (Plain Language)
+
+{{directory_plain_summaries}}
+
 ## Documentation Coverage
 
 {{docs_coverage}}
@@ -48,6 +52,10 @@ Start with these docs:
 
 {{external_context_summary}}
 
+## Narrative Confidence Notes
+
+{{llm_confidence_notes}}
+
 ## Deep Dive Links
 
 - [Architecture Deep Explainer](../deep/ARCHITECTURE_DEEP.md)
diff --git a/code-explainer/references/mode-behavior.md b/code-explainer/references/mode-behavior.md
@@ -32,9 +32,18 @@ Goal: Maximum fidelity and audit-ready onboarding.
 - `nontech`: plain-language phrasing first, minimal jargon.
 - `mixed`: business-and-technical balance.
 - `engineering`: technical detail and traceability emphasis.
+- If LLM narrative is enabled and available, wording is further adapted per audience.
 
 ## Overview Length
 
 - `short`: executive skim.
 - `medium`: balanced default.
 - `long`: expanded onboarding context and references.
+
+## LLM Narrative
+
+- Controlled with `--enable-llm-descriptions <true|false>`.
+- Reads API config from env vars:
+  - `CODE_EXPLAINER_LLM_API_KEY` (or `OPENAI_API_KEY`)
+  - `CODE_EXPLAINER_LLM_BASE_URL` (optional)
+  - `CODE_EXPLAINER_LLM_MODEL` (optional)
diff --git a/code-explainer/references/output-contract.md b/code-explainer/references/output-contract.md
@@ -24,8 +24,9 @@
 20. `meta/render_report.json`
 21. `meta/enrichment.json`
 22. `meta/coverage_report.json`
-23. `meta/docs_generation.json`
-24. `meta/quality_report.json`
+23. `meta/llm_summary.json`
+24. `meta/docs_generation.json`
+25. `meta/quality_report.json`
 
 ## Manifest Schema
 
@@ -47,6 +48,9 @@
 - `exclude_globs[]`
 - `docs_discovered`
 - `docs_parsed`
+- `llm_descriptions_enabled`
+- `llm_descriptions_used`
+- `llm_model`
 
 ## Coverage Schema
 
@@ -61,6 +65,21 @@
 - `parsed_docs[]` with `path`, `title`, `summary`, `headings[]`, `line_count`, `size_bytes`, `keywords[]`
 - `skipped_docs[]` with `path`, `reason`
 
+## LLM Narrative Schema
+
+`llm_summary.json` contains:
+
+- `generated_at`
+- `enabled`
+- `used`
+- `provider`
+- `model`
+- `repo_summary_paragraph`
+- `directory_summaries[]` with `name`, `summary`
+- `deep_dive_starters[]`
+- `confidence_notes[]`
+- `error`
+
 ## Confidence Schema
 
 `confidence_report.json` contains:
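
To make the `llm_summary.json` field list above concrete, here is a sketch of a payload with every documented key present. The field names come from the schema in this diff; the value types, the ISO timestamp format, and the helper name `empty_llm_summary` are assumptions.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict


def empty_llm_summary(enabled: bool, error: str = "") -> Dict[str, Any]:
    """Hypothetical helper: a payload with all documented keys and no narrative."""
    return {
        "generated_at": datetime.now(timezone.utc).isoformat(),  # assumed format
        "enabled": enabled,
        "used": False,  # no model output was incorporated
        "provider": "",
        "model": "",
        "repo_summary_paragraph": "",
        "directory_summaries": [],  # items carry `name` and `summary`
        "deep_dive_starters": [],
        "confidence_notes": [],
        "error": error,
    }


if __name__ == "__main__":
    print(json.dumps(empty_llm_summary(True, "API key not set"), indent=2))
```

Emitting every key even when the LLM step is skipped would keep downstream `.get(...)` lookups, such as those in `generate_docs.py`, on a predictable shape.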
diff --git a/code-explainer/scripts/analyze.py b/code-explainer/scripts/analyze.py
@@ -19,6 +19,7 @@
 import map_dependencies
 import map_flows
 import ingest_docs
+import llm_describe
 import build_diagrams
 import validate_mermaid
 import render_diagrams
@@ -85,6 +86,7 @@ def _write_manifest(
     stack_payload: Dict[str, Any],
     entry_payload: Dict[str, Any],
     docs_payload: Dict[str, Any],
+    llm_payload: Dict[str, Any],
     module_count: int,
     diagram_count: int,
     include_globs: List[str],
@@ -103,6 +105,9 @@ def _write_manifest(
         "entrypoints": entry_payload.get("entrypoints", []),
         "docs_discovered": docs_payload.get("discovered_count", 0),
         "docs_parsed": docs_payload.get("parsed_count", 0),
+        "llm_descriptions_enabled": llm_payload.get("enabled", False),
+        "llm_descriptions_used": llm_payload.get("used", False),
+        "llm_model": llm_payload.get("model", ""),
         "module_count": module_count,
         "diagram_count": diagram_count,
         "include_globs": include_globs,
@@ -118,6 +123,7 @@ def run_pipeline(
     audience: str,
     overview_length: str,
     enable_web_enrichment: bool,
+    enable_llm_descriptions: bool,
     include_globs: List[str] | None = None,
     exclude_globs: List[str] | None = None,
 ) -> Dict[str, Any]:
@@ -145,6 +151,20 @@ def run_pipeline(
         flow_payload = map_flows.map_flows(stack_payload, entry_payload, dep_payload, meta_dir, mode)
         coverage_payload = ingest_docs.ingest_docs(repo_root, index_payload, meta_dir, mode)
         enrichment_payload = enrich_external.enrich_external(source, meta_dir, enable_web_enrichment)
+        llm_payload = llm_describe.generate_llm_descriptions(
+            repo_root=repo_root,
+            source=source,
+            mode=mode,
+            audience=audience,
+            index_payload=index_payload,
+            stack_payload=stack_payload,
+            entry_payload=entry_payload,
+            dep_payload=dep_payload,
+            flow_payload=flow_payload,
+            docs_payload=coverage_payload,
+            out_dir=meta_dir,
+            enabled=enable_llm_descriptions,
+        )
 
         diagram_manifest = build_diagrams.build_diagrams(
             stack=stack_payload,
@@ -172,6 +192,7 @@ def run_pipeline(
             flow_payload=flow_payload,
             diagram_manifest=diagram_manifest,
             docs_payload=coverage_payload,
+            llm_payload=llm_payload,
             enrichment_payload=enrichment_payload,
         )
         _write_confidence_and_attribution(output_root, docs_gen_payload, enrichment_payload)
@@ -185,6 +206,7 @@ def run_pipeline(
             stack_payload=stack_payload,
             entry_payload=entry_payload,
             docs_payload=coverage_payload,
+            llm_payload=llm_payload,
             module_count=len(index_payload.get("modules", [])),
             diagram_count=diagram_manifest.get("count", 0),
             include_globs=include_globs,
@@ -202,6 +224,7 @@ def run_pipeline(
             "file_count": index_payload.get("file_count", 0),
             "docs_discovered": coverage_payload.get("discovered_count", 0),
             "docs_parsed": coverage_payload.get("parsed_count", 0),
+            "llm_descriptions_used": llm_payload.get("used", False),
             "diagram_count": diagram_manifest.get("count", 0),
             "validation_ok": validation_payload.get("overall_ok", False),
             "renderer": render_payload.get("renderer", ""),
@@ -235,6 +258,7 @@ def _parse_args() -> argparse.Namespace:
         help="Glob(s) to exclude from indexing.",
     )
     parser.add_argument("--enable-web-enrichment", default="true")
+    parser.add_argument("--enable-llm-descriptions", default="true")
     return parser.parse_args()
 
 
@@ -246,13 +270,15 @@ def main() -> int:
 
     mode = common.normalize_mode(args.mode)
     web_enabled = common.bool_from_string(args.enable_web_enrichment)
+    llm_enabled = common.bool_from_string(args.enable_llm_descriptions)
     summary = run_pipeline(
         source=args.source,
         output_root=Path(args.output).resolve(),
         mode=mode,
         audience=args.audience,
         overview_length=args.overview_length,
         enable_web_enrichment=web_enabled,
+        enable_llm_descriptions=llm_enabled,
         include_globs=args.include_glob,
         exclude_globs=args.exclude_glob,
     )
@@ -266,6 +292,7 @@ def main() -> int:
         "file_count",
         "docs_discovered",
         "docs_parsed",
+        "llm_descriptions_used",
         "diagram_count",
         "validation_ok",
         "renderer",
diff --git a/code-explainer/scripts/generate_docs.py b/code-explainer/scripts/generate_docs.py
@@ -77,6 +77,46 @@ def _where_to_modify(modules: List[Dict[str, Any]], limit: int) -> str:
     return "\n".join(suggestions)
 
 
+def _llm_directory_summaries(llm_payload: Dict[str, Any], fallback_modules: List[Dict[str, Any]], limit: int = 8) -> str:
+    items = llm_payload.get("directory_summaries", [])
+    lines: List[str] = []
+    if isinstance(items, list):
+        for item in items[:limit]:
+            if not isinstance(item, dict):
+                continue
+            name = str(item.get("name", "")).strip()
+            summary = str(item.get("summary", "")).strip()
+            if not name or not summary:
+                continue
+            lines.append(f"- **{name}**: {summary}")
+    if lines:
+        return "\n".join(lines)
+
+    for module in fallback_modules[:limit]:
+        name = module.get("name", "")
+        if not name:
+            continue
+        lines.append(f"- **{name}**: Module with {module.get('file_count', 0)} files.")
+    return "\n".join(lines) if lines else "- No directory-level summary available."
+
+
+def _llm_deep_dive_starters(llm_payload: Dict[str, Any]) -> str:
+    starters = llm_payload.get("deep_dive_starters", [])
+    if not isinstance(starters, list) or not starters:
+        return "- Start from entrypoints, then trace one request through dependencies."
+    return "\n".join([f"- {str(item)}" for item in starters[:6]])
+
+
+def _llm_confidence_notes(llm_payload: Dict[str, Any]) -> str:
+    notes = llm_payload.get("confidence_notes", [])
+    if not isinstance(notes, list) or not notes:
+        if not llm_payload.get("enabled", False):
+            return "- LLM summary disabled; deterministic analysis remains primary."
+        fallback = "No confidence notes returned." if llm_payload.get("used", False) else "LLM summary unavailable for this run."
+        return f"- {llm_payload.get('error') or fallback}"
+    return "\n".join([f"- {str(item)}" for item in notes[:6]])
+
+
 def _glossary_terms(
     stack_payload: Dict[str, Any],
     dep_payload: Dict[str, Any],
@@ -216,8 +256,13 @@ def _plain_system_summary(
     repo_name: str,
     stack_payload: Dict[str, Any],
     doc_payload: Dict[str, Any],
+    llm_payload: Dict[str, Any],
     audience: str,
 ) -> str:
+    llm_summary = str(llm_payload.get("repo_summary_paragraph", "")).strip()
+    if llm_summary:
+        return llm_summary
+
     parsed_docs = doc_payload.get("parsed_docs", [])
     summary_doc = _pick_summary_doc(parsed_docs)
     if summary_doc:
@@ -297,6 +342,7 @@ def generate_docs(
     flow_payload: Dict[str, Any],
     diagram_manifest: Dict[str, Any],
     docs_payload: Dict[str, Any],
+    llm_payload: Dict[str, Any],
     enrichment_payload: Dict[str, Any],
 ) -> Dict[str, Any]:
     overview_dir = common.ensure_dir(output_root / "overview")
@@ -335,11 +381,16 @@ def generate_docs(
         "audience_note": _audience_note(audience),
         "mode_note": _mode_note(mode),
         "overview_length_note": _overview_length_note(overview_length),
-        "plain_summary": _plain_system_summary(repo_name, stack_payload, docs_payload, audience),
+        "plain_summary": _plain_system_summary(repo_name, stack_payload, docs_payload, llm_payload, audience),
         "docs_coverage": _docs_coverage_line(docs_payload),
         "docs_quick_links": _docs_summary(docs_payload, profile["doc_link_limit"]),
         "primary_user_flow_summary": _primary_flow_summary(flow_payload),
         "external_context_summary": _external_context_summary(enrichment_payload),
+        "directory_plain_summaries": _llm_directory_summaries(llm_payload, modules, limit=profile["module_limit"]),
+        "llm_deep_dive_starters": _llm_deep_dive_starters(llm_payload),
+        "llm_confidence_notes": _llm_confidence_notes(llm_payload),
+        "llm_enabled": "true" if llm_payload.get("enabled", False) else "false",
+        "llm_used": "true" if llm_payload.get("used", False) else "false",
     }
 
     overview_template = common.load_template(templates_root / "overview.md.j2")
@@ -422,6 +473,17 @@ def generate_docs(
         ),
     ]
 
+    if llm_payload.get("used", False):
+        claims.append(
+            common.collect_claim(
+                "claim_llm_narrative",
+                "An LLM-generated narrative summary was incorporated for repository and directory explainers.",
+                ["meta/llm_summary.json"],
+                0.7,
+                "Generated from deterministic context payload + model inference.",
+            )
+        )
+
     if enrichment_payload.get("records"):
         claims.append(
             common.collect_claim(
@@ -467,6 +529,7 @@ def main() -> int:
     parser.add_argument("--flows", required=True)
     parser.add_argument("--diagram-manifest", required=True)
     parser.add_argument("--coverage", required=True)
+    parser.add_argument("--llm-summary", required=True)
     parser.add_argument("--enrichment", required=True)
     args = parser.parse_args()
 
@@ -484,6 +547,7 @@ def main() -> int:
         flow_payload=common.read_json(Path(args.flows), default={}),
         diagram_manifest=common.read_json(Path(args.diagram_manifest), default={}),
         docs_payload=common.read_json(Path(args.coverage), default={}),
+        llm_payload=common.read_json(Path(args.llm_summary), default={}),
         enrichment_payload=common.read_json(Path(args.enrichment), default={}),
     )
     print(json.dumps({"overview": payload["overview_file"], "deep_count": len(payload["deep_files"])}, indent=2))
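
The `generate_docs.py` changes above branch on two flags in the LLM payload, `enabled` and `used`. The three resulting states can be summarized with a small classifier. This is an illustrative sketch: `narrative_state` and its return labels are hypothetical and do not appear in the commit.

```python
from typing import Any, Dict


def narrative_state(llm_payload: Dict[str, Any]) -> str:
    """Hypothetical helper: classify a run by the `enabled` and `used` flags."""
    if not llm_payload.get("enabled", False):
        return "disabled"  # deterministic analysis only
    if not llm_payload.get("used", False):
        return "unavailable"  # requested, but no narrative was produced
    return "used"  # LLM narrative incorporated into the docs


if __name__ == "__main__":
    for payload in ({}, {"enabled": True, "error": "no API key"}, {"enabled": True, "used": True}):
        print(payload, "->", narrative_state(payload))
```

Only the `used` state adds the `claim_llm_narrative` claim; in the other two states the deterministic summaries remain the source of the overview text.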
diff --git a/code-explainer/scripts/llm_describe.py b/code-explainer/scripts/llm_describe.py
diff --git a/code-explainer/scripts/quality_gate.py b/code-explainer/scripts/quality_gate.py