Reusable Python library and CLI for narrated demo videos built around Manim, OpenAI TTS, and ffmpeg composition. Aimed at long-form, scripted explainers that walk through how a system works.
Prose and PlantUML sources for how Courseforge repositories fit together live under docs/suite/. Regenerate the PNGs with ./scripts/render-suite-diagrams.sh, which requires Java, Graphviz (dot), and the vendored JAR in third_party/plantuml/ (CI installs Graphviz and runs the same script; see .github/workflows/render-suite-diagrams.yml). The rendered site at courseforge.github.io pulls this tree on each publish from courseforge/infrastructure.
docgen no longer ships any Playwright-driven UI demo path. The previous
demo-function, playwright, discover-tests, vhs, tape-lint, sync-vhs,
per-function-*, and catalog commands — together with their config blocks
(vhs:, playwright:, playwright_test:, discover_tests:, catalog:,
per_function:) and the playwright, playwright_test, and vhs visual_map
types — have been removed.
Why: a UI-test-driven recorder turned out to be a fragile, project-specific concern
that pulled pytest-playwright, Node Playwright, VHS / ttyd, browser binaries,
trace parsing, and a discovery catalog into a generic library. The same goal is now
being prototyped in a consumer project (CourseForge tools/courseforge/demogen/)
with the “LLM emits a validated automation spec, a deterministic runner translates
it to Playwright” pattern. Once that contract stabilises a small portion may be
backported into docgen, but docgen itself stays Playwright-free.
If you still need the legacy behaviour, pin a pre-removal commit
(pip install "docgen @ git+https://github.com/jmjava/documentation-generator.git@<sha>").
- TTS narration: generate MP3 audio from Markdown scripts via OpenAI gpt-4o-mini-tts.
- Whisper-aligned timestamps: extract word-level timing from TTS audio so visual cues can wait on real speech.
- Manim animations: the primary visual surface. Use docgen scene-spec-generate → scene-compile (or hand-maintained animations/specs/*.scene.yaml) for deterministic diagram layout: rows are auto-paginated when they exceed the frame stack budget, specs that overflow the safe width or budget are rejected, and (when timing.json carries Whisper words) each row's first label is mapped to a wait_word index. Hand-maintained custom Manim classes still live in animations/scenes.py, outside the BEGIN/END GENERATED SCENE markers.
- ffmpeg composition: combine narration audio and Manim video into final segments, with a freeze-tail guard.
- Validation: A/V drift, freeze ratio, OCR error scan, layout, narration lint, Manim scene lint.
- GitHub Pages: auto-generate index.html, deploy workflow, LFS rules, .gitignore.
- Wizard: local web GUI to bootstrap narration scripts from existing project docs.
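The row pagination and word mapping mentioned in the Manim bullet can be pictured with a small sketch. It is illustrative only and not docgen's implementation: paginate_rows and first_word_index are hypothetical helpers, the row heights and budget are in arbitrary units, and the timing entries are assumed to be dicts with a "word" key.

```python
from typing import Dict, List, Optional


def paginate_rows(row_heights: List[float], stack_budget: float) -> List[List[int]]:
    """Greedily pack row indices into pages whose summed height fits the budget."""
    pages: List[List[int]] = []
    current: List[int] = []
    used = 0.0
    for index, height in enumerate(row_heights):
        if height > stack_budget:
            # A single row that cannot fit on any page cannot be paginated.
            raise ValueError(f"row {index} exceeds the stack budget")
        if current and used + height > stack_budget:
            # Close the current page and start a new one with this row.
            pages.append(current)
            current, used = [], 0.0
        current.append(index)
        used += height
    if current:
        pages.append(current)
    return pages


def first_word_index(label: str, words: List[Dict]) -> Optional[int]:
    """Return the index of the first spoken word matching the label's first token."""
    tokens = label.lower().split()
    if not tokens:
        return None
    target = tokens[0].strip(".,:;!?")
    for index, entry in enumerate(words):
        # Assumed entry shape: {"word": "...", ...}; punctuation is ignored.
        if entry["word"].lower().strip(".,:;!?") == target:
            return index
    return None
```

With a budget of 5.0, rows of height 2.0, 2.0, 2.0, and 1.0 split into two pages, and a label such as "Scheduler queue" resolves to the position of the first spoken "scheduler" so the row can wait for that word.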
No IDE lock-in: maintenance workflows are docgen CLI + YAML + shell/CI (and
OpenAI where a command calls the API). The wizard is a local Flask app, not a
plugin tied to one editor.
Install from GitHub:

    pip install "docgen @ git+https://github.com/jmjava/documentation-generator.git"

For development, clone the repository and install it in editable mode:

    git clone https://github.com/jmjava/documentation-generator.git
    cd documentation-generator
    python3 -m venv .venv
    source .venv/bin/activate
    pip install -e ".[dev]"
    pytest

CI installs ffmpeg and tesseract via apt; see .github/workflows/ci.yml.

Roadmap: milestones/README.md.
Quick start:

    cd your-project/docs/demos
    docgen wizard          # optional: bootstrap narration from project docs
    docgen generate-all    # TTS → timestamps → Manim → compose → validate → concat
    docgen validate --pre-push

| Command | Description |
|---|---|
| docgen init [TARGET_DIR] [--defaults] [--segments-file FILE] | Scaffold a new project: docgen.yaml, wrapper scripts, directories |
| docgen wizard [--port 8501] | Launch narration setup wizard (local web GUI) |
| docgen tts [--segment 01] [--dry-run] | Generate TTS audio |
| docgen timestamps | Extract Whisper timestamps from TTS audio → timing.json |
| docgen manim [--scene StackDAGScene] | Render Manim animations |
| docgen compose [01 02 03] [--ffmpeg-timeout 900] | Compose segments (audio + video) |
| docgen validate [--max-drift 2.75] [--pre-push] | Run all validation checks |
| docgen lint [--segment 01] | Narration lint only |
| docgen concat [--config full-demo] | Concatenate full demo files |
| docgen pages [--force] | Generate index.html, pages.yml, .gitattributes, .gitignore |
| docgen generate-all [--skip-tts] [--skip-manim] [--retry-manim] | Full pipeline |
| docgen rebuild-after-audio | Recompose + validate + concat (skips TTS) |
| docgen clean-bundle [-y] [--delete-config] [--keep-narration] | Remove regenerable outputs under the bundle |
| docgen narration-generate --segment 01 [--extra-path REL] [--hint TEXT] [--dry-run] [--force] | Generate narration .md from repo sources + owner hints (OpenAI); see narration_from_source in YAML |
| docgen yaml-generate [--merge-defaults] [--llm] [--dry-run] [--list-gaps] | Merge defaults into docgen.yaml; optional OpenAI refresh of tts.instructions / wizard.system_prompt (rewrites the file; review in Git) |
| docgen scene-compile SPEC.scene.yaml [--dry-run] | Compile a declarative scene spec (YAML) into a _TimedScene class and inject it into animations/scenes.py. Deterministic layout (rows of _box); applies auto-pagination + Whisper wait_word |
| docgen scene-spec-generate [--segment 01 \| --all] [--compile] [--print-only] [--output PATH] [--hint …] [--model …] | Call OpenAI to emit YAML only (same schema as scene-compile); rejects specs that exceed the stack budget or safe row width, runs the same auto-paginate + word-alignment, optionally writes animations/specs/<stem>.scene.yaml and, with --compile, compiles it into scenes.py |
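The stage order given for docgen generate-all (TTS → timestamps → Manim → compose → validate → concat) can be sketched as a plain list of subcommands. This is an illustration rather than docgen's actual implementation: plan_pipeline and run_pipeline are hypothetical helpers, and the sketch assumes each skip flag drops only its own stage.

```python
import subprocess
from typing import List

# Stage order as described for generate-all; each name is a docgen subcommand.
STAGES = ["tts", "timestamps", "manim", "compose", "validate", "concat"]


def plan_pipeline(skip_tts: bool = False, skip_manim: bool = False) -> List[List[str]]:
    """Return the docgen commands in pipeline order, honouring the skip flags."""
    skipped = set()
    if skip_tts:
        skipped.add("tts")      # assumption: only the TTS stage is dropped
    if skip_manim:
        skipped.add("manim")    # assumption: only the Manim stage is dropped
    return [["docgen", stage] for stage in STAGES if stage not in skipped]


def run_pipeline(commands: List[List[str]]) -> None:
    """Run each command in order and stop at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

Calling run_pipeline(plan_pipeline(skip_tts=True)) from the demos directory would run the remaining five stages one by one, which is roughly what a shell or CI wrapper around the individual commands would do.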
Create a docgen.yaml in your demos directory. Use docgen init to scaffold
a fresh layout, then docgen yaml-generate to fill in defaults from the files
already on disk. (docgen yaml-generate also keeps
manim_scene_generation.segments in step with visual_map.)
The visual_map key is maintainer-owned per-segment wiring. Supported types
are manim, mixed, still, and image.
If docgen.yaml sets env_file (often .env), variables are loaded with
shell-first semantics: anything already exported in the process (including
your IDE or CI) is not replaced by the file. To make the file win, set
DOCGEN_ENV_OVERRIDES=1 so every key from env_file overwrites the
environment, or DOCGEN_ENV_OVERRIDES=OPENAI_API_KEY,OTHER_KEY for specific
keys only.
When OPENAI_API_KEY is present in both the shell and env_file, docgen prints a
one-line hint to stderr so a silent 401 from the wrong key is easier to diagnose.
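The precedence rules above can be summarised in a short sketch. It is a minimal illustration under stated assumptions, not docgen's loader: merge_env is a hypothetical helper, the env_file is assumed to be already parsed into a mapping, and only the two DOCGEN_ENV_OVERRIDES forms described above (1, or a comma-separated key list) are handled.

```python
from typing import Dict, Mapping


def merge_env(file_vars: Mapping[str, str], environ: Mapping[str, str]) -> Dict[str, str]:
    """Merge env_file values into a copy of the environment, shell-first by default."""
    overrides = environ.get("DOCGEN_ENV_OVERRIDES", "").strip()
    override_all = overrides == "1"
    # Otherwise treat the value as a comma-separated list of keys the file may overwrite.
    override_keys = set() if override_all else {
        key.strip() for key in overrides.split(",") if key.strip()
    }

    merged = dict(environ)
    for key, value in file_vars.items():
        # The file fills gaps; it replaces existing values only when told to.
        if key not in merged or override_all or key in override_keys:
            merged[key] = value
    return merged
```

For example, with OPENAI_API_KEY exported in the shell and also present in the file, the shell value is kept unless DOCGEN_ENV_OVERRIDES is 1 or names OPENAI_API_KEY.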
Under narration_from_source in docgen.yaml, the project owner lists
optional hints (strings) that steer the model (audience, terminology, what to
avoid). OpenAI generates the narration .md from your repo context
(context.paths / context.globs, relative to repo_root) plus those hints; the
result is what docgen tts reads. See docgen.narrate_from_source.
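How context.paths and context.globs might be resolved against repo_root can be sketched as follows. This is a hedged illustration, not the code in docgen.narrate_from_source: gather_context is a hypothetical helper, and it assumes max_context_bytes is a cap on the total bytes collected across all files.

```python
from pathlib import Path
from typing import Iterable, List, Tuple


def gather_context(
    repo_root: Path,
    paths: Iterable[str],
    globs: Iterable[str],
    max_context_bytes: int,
) -> List[Tuple[str, str]]:
    """Collect (relative_path, text) pairs until the byte budget is spent."""
    # Explicit paths come first, then glob matches in sorted order.
    candidates: List[Path] = [repo_root / p for p in paths]
    for pattern in globs:
        candidates.extend(sorted(repo_root.glob(pattern)))

    collected: List[Tuple[str, str]] = []
    seen = set()
    remaining = max_context_bytes
    for path in candidates:
        if remaining <= 0:
            break
        if path in seen or not path.is_file():
            continue
        seen.add(path)
        # Truncate the last file so the total never exceeds the budget.
        data = path.read_bytes()[:remaining]
        text = data.decode("utf-8", errors="replace")
        collected.append((path.relative_to(repo_root).as_posix(), text))
        remaining -= len(data)
    return collected
```

The collected pairs, together with the owner hints, would then form the prompt from which the narration .md is generated.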
Example configuration:

    narration_from_source:
      model: gpt-4o-mini
      temperature: 0.65
      max_context_bytes: 120000
      hints:
        - "Audience: contributors new to this repo."
        - "Do not mention unreleased product codenames."
      context:
        paths:
          - README.md
        globs:
          - "src/**/*.py"
      segments:
        "01":
          hints:
            - "This segment covers the install wizard only."
          context:
            paths:
              - docs/install.md

Validation, Manim, and compose settings:

    validation:
      max_drift_sec: 2.75
      max_freeze_ratio: 0.25   # trailing-frame pad vs narration length (compose freeze guard + validate)
    manim:
      quality: 1080p30         # supports 480p15, 720p30, 1080p30, 1080p60, 1440p30, 1440p60, 2160p60
      manim_path: ""           # optional explicit binary path (relative to docgen.yaml or absolute)
      font: "Liberation Sans"
      min_font_size: 14
    compose:
      ffmpeg_timeout_sec: 300  # can also be overridden with: docgen compose --ffmpeg-timeout N

System dependencies:

- ffmpeg: composition and probing
- tesseract-ocr: OCR validation
- Manim: primary visuals (optional install: pip install docgen[manim])
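The two validation thresholds in the configuration above (max_drift_sec and max_freeze_ratio) can be illustrated with a small check. This is a sketch under assumptions, not docgen's validator: the helper names are hypothetical, drift is taken as the absolute difference between video and narration length, and the freeze ratio as the trailing-frame pad divided by narration length.

```python
from typing import List


def av_drift(audio_sec: float, video_sec: float) -> float:
    """Absolute difference between narration length and video length, in seconds."""
    return abs(video_sec - audio_sec)


def freeze_ratio(audio_sec: float, video_sec: float) -> float:
    """Share of the narration covered by holding the last video frame."""
    if audio_sec <= 0:
        raise ValueError("narration length must be positive")
    pad = max(0.0, audio_sec - video_sec)  # no pad when the video is long enough
    return pad / audio_sec


def check_segment(
    audio_sec: float,
    video_sec: float,
    max_drift_sec: float = 2.75,
    max_freeze_ratio: float = 0.25,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means the segment passes."""
    problems: List[str] = []
    drift = av_drift(audio_sec, video_sec)
    if drift > max_drift_sec:
        problems.append(f"A/V drift {drift:.2f}s exceeds {max_drift_sec}s")
    ratio = freeze_ratio(audio_sec, video_sec)
    if ratio > max_freeze_ratio:
        problems.append(f"freeze ratio {ratio:.2f} exceeds {max_freeze_ratio}")
    return problems
```

Under these definitions, a 60-second narration over a 58-second video passes both checks, while a 40-second narration over a 20-second video fails both.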
See milestone-doc-generator.md for the full design document.