GitHub - Fzkuji/OpenProgram: The Open Source Agent Harness Framework. Any LLM. Any platform. Agentic Programming Paradigm. ♾️

Open Source Agent Harness Framework. Any LLM. Any platform. Agentic Programming Paradigm.

Getting Started · API Reference · Philosophy · 中文

Built on the Agentic Programming paradigm. Current LLM agent frameworks let the LLM control everything — what to do, when, and how. The result? Unpredictable execution, context explosion, and no output guarantees. OpenProgram flips this: Python controls the flow, LLM only reasons when asked. See philosophy for the full rationale.

Quick Start

Requires Python 3.11+.

pip install openprogram                             # install (TUI + web UI included)
openprogram setup                                   # connect a provider (interactive)

Then chat with it — either in the terminal or the browser:

openprogram                                         # full-screen TUI

For the web UI, just open your browser at http://localhost:3000 — openprogram setup starts the worker in the background (frontend on 3000, FastAPI backend on 8109) so the page is already live.

Both surfaces share the same backend — sessions, settings, web-search defaults are persisted in ~/.agentic/ and visible from either entry point.

Setup

openprogram setup is a wizard that imports credentials from any CLI you've already logged into and asks for missing API keys. Skip it by setting one of these env vars yourself:

export ANTHROPIC_API_KEY=sk-ant-...                 # Claude
export OPENAI_API_KEY=sk-...                        # GPT
export GOOGLE_API_KEY=...                           # Gemini

Or use a CLI provider (no API key, uses your existing subscription):

npm i -g @anthropic-ai/claude-code && claude login
npm i -g @openai/codex && codex auth
npm i -g @google/gemini-cli && gemini auth login

Check what's detected with openprogram providers. Auto-detection order: Claude Code → Codex → Gemini CLI → Anthropic API → OpenAI API → Gemini API.

Optional extras

Extra	Adds	Post-install
`[anthropic]` / `[openai]` / `[gemini]`	Provider SDKs	—
`[browser]`	Playwright (~150 MB)	`playwright install chromium`
`[browser-stealth]`	Cloudflare-bypassing browsers	`patchright install chromium && camoufox fetch`
`[gui]`	Vision/control deps for GUI harness (~2 GB)	—
`[channels]`	Discord / Slack / WeChat bots	—
`[all]`	Everything except `[browser-stealth]`	run post-install commands as needed

Troubleshooting

"No provider available"

openprogram providers shows what's detected. Common causes: forgot claude login / codex auth; API key set in a different shell than you're running in; token expired (re-login).

"command not found: openprogram"

pip install dir not on PATH. Use python3 -m openprogram <args> instead, or add $(python3 -m site --user-base)/bin to your PATH.

Web UI port in use

Set OPENPROGRAM_WEB_PORT=8101 (frontend) or OPENPROGRAM_BACKEND_PORT=8102 (FastAPI) before starting the worker. Or store the preference via openprogram config ui.

Local-development install (multi-repo)

For GUI-Agent-Harness / Research-Agent-Harness:

pip install -e "$OPENPROGRAM_DIR"                   # first
pip install -e "$GUI_HARNESS_DIR"                   # depends on openprogram
pip install -e "$RESEARCH_HARNESS_DIR"

openprogram/functions/agentics/{GUI,Research}-Agent-Harness are symlinks — recreate if a repo moves:

cd openprogram/functions/agentics
rm -f GUI-Agent-Harness && ln -s "$GUI_HARNESS_DIR" GUI-Agent-Harness
rm -f Research-Agent-Harness && ln -s "$RESEARCH_HARNESS_DIR" Research-Agent-Harness

pip install -e writes absolute paths — rerun it from the new location if you rename a parent folder.

For platform-builder topics — Runtime retry semantics, the full agentic_function decorator API, the flat-DAG context model — see docs/API.md and the per-topic notes under docs/api/.

Supported Projects

Three agent applications ship with OpenProgram, under openprogram/functions/agentics/ — each a full agent built on the @agentic_function paradigm:

Project	What it does
GUI Agent Harness	Autonomous GUI agent — operates desktop apps (and OSWorld VMs) by vision: observe → plan → act → verify. Python drives the loop, the LLM only reasons when asked.
Research Agent Harness	Autonomous research agent — literature survey → idea → experiments → paper writing → cross-model review. Full pipeline from topic to submission-ready paper.
Wiki Agent Harness	Autonomous wiki builder — ingests notes, documents and conversations into a structured, Obsidian-compatible knowledge vault with `[[wikilinks]]`.

Why Agentic Programming?

Principle	How
Deterministic flow	Python controls `if/else/for/while`. Execution is guaranteed, not suggested.
Minimal LLM calls	Call the LLM only when reasoning is needed. 2 calls, not 10.
Prompt in code	The per-call prompt lives in the function body (`runtime.exec(content=...)`), not in scattered prompt files.
Self-evolving	Agents author, fix and improve functions directly — guided by the `agentic-programming` skill.

The problem with current frameworks

Current LLM agent frameworks place the LLM as the central scheduler. This creates three fundamental problems:

Unpredictable execution — the LLM may skip, repeat, or invent steps regardless of defined workflows
Context explosion — each tool-call round-trip accumulates history
No output guarantees — the LLM interprets instructions rather than executing them

The core issue: the LLM controls the flow, but nothing enforces it. Skills, prompts, and system messages are suggestions, not guarantees.

	Tool-Calling / MCP	Agentic Programming
Who schedules?	LLM decides	Python decides
Functions contain	Code only	Code + LLM reasoning
Context	Flat conversation	Structured tree
Prompt	Hidden in agent config	Docstring = prompt
Self-improvement	Not built-in	`create` → `fix` → evolve

MCP is the transport. Agentic Programming is the execution model. They're orthogonal.

Key Features

Automatic Context

Every @agentic_function call creates a Context node. Nodes form a tree that is automatically injected into LLM calls:

login_flow ✓ 8.8s
├── observe ✓ 3.1s → "found login form at (200, 300)"
├── click ✓ 2.5s → "clicked login button"
└── verify ✓ 3.2s → "dashboard confirmed"

When verify calls the LLM, it automatically sees what observe and click returned. No manual context management.

Deep Work — Autonomous Quality Loop

For complex tasks that demand sustained effort and high standards, deep_work runs an autonomous plan-execute-evaluate loop until the result meets the specified quality level:

from openprogram.functions.agentics.deep_work import deep_work

result = deep_work(
    task="Write a survey on context management in LLM agents.",
    level="phd",        # high_school → bachelor → master → phd → professor
    runtime=runtime,
)

The agent clarifies requirements upfront, then works fully autonomously — executing, self-evaluating, and revising until the output passes quality review. State is persisted to disk, so interrupted work resumes where it left off.

Functions that author functions

Writing, fixing and scaffolding @agentic_functions is itself agent work — done with ordinary file-editing tools, guided by the agentic-programming skill (skills/agentic-programming/SKILL.md). There are no dedicated create() / fix() framework calls: they only ever wrapped one LLM call plus a file write, which an agent does directly.

The skill is the complete spec — where the file goes, the decorator's metadata, the docstring vs content split, a rule-based validation checklist, and a smoke test. An agent reads it, writes the function, validates it, runs it; the write → run → fail → fix cycle still means programs improve through use.

Conversation as a git DAG

Session history is stored like a git repository, not a flat list. Every exchange is a commit, branches are first-class, and the right sidebar exposes the usual git operations:

Branch off any past exchange to explore an alternative without losing the original thread
Attach context from another session (cross-session reuse) as a labelled user message
Merge two threads when their branches converge
Cherry-pick specific commits across branches

Branches that touch files run in isolated git worktrees under the hood, so two concurrent agents on different branches can't fight over the same source tree. Other frameworks fork conversations by copying messages; we fork the underlying repo.

Layered memory

Memory isn't a single bag. Six separate stores under ~/.agentic/memory/ cover different timescales and purposes:

Layer	What goes there
`journal`	Short-term — recent observations, raw notes
`wiki`	Durable — facts the agent decided to keep around
`sleep`	Periodic consolidation (offline daemon merges journal → wiki)
`scheduler`	Cron-driven recalls that surface a memory at a specific time
`recall_counts`	Hit counts that boost frequently-used memories
`store`	Project-scoped key/value

Open /memory to inspect or hand-edit any layer; the agent decides which layer to write to based on what it learned. The split exists because "remember this until I tell you to forget" and "remember this for the next 10 turns" want different storage strategies.

API Reference

Core

Import	What it does
`from openprogram import agentic_function`	Decorator. Records each call as a node in the session DAG
`from openprogram.agentic_programming.runtime import Runtime`	LLM runtime. `exec()` calls the LLM with DAG-derived context
`from openprogram.providers.registry import create_runtime`	Create a Runtime with auto-detection or explicit provider (`create_runtime()` checks API keys and CLIs in priority order)

Authoring functions

There are no create() / fix() meta-functions — writing, editing and validating @agentic_functions is done directly with ordinary file-editing tools, guided by the agentic-programming skill (skills/agentic-programming/SKILL.md). That skill is the complete spec: file layout, the decorator's metadata, the docstring vs content split, and a rule-based validation checklist.

Built-in Functions

Import	What it does
`from openprogram.functions.agentics.deep_work import deep_work`	Autonomous plan-execute-evaluate loop with quality levels
`from openprogram.functions.agentics.ask_user import ask_user`	Ask the user a clarifying question and block until an answer arrives

Providers

Six built-in providers: Anthropic, OpenAI, Gemini (API), Claude Code, Codex, Gemini (CLI). All CLI providers maintain session continuity across calls. See Provider docs for details.

API Docs by Topic

agentic_function — decorator behavior, DAG node recording, the docstring / content split
Runtime — exec(), retries, response formats, provider wiring
Providers — built-in runtimes, detection order, CLI vs API tradeoffs

Integration

Guide	Description
Getting Started	3-minute setup and runnable examples
Claude Code	Use without API key via Claude Code CLI
OpenClaw	Use as OpenClaw skill
API Reference	Full API documentation

Project Structure

openprogram/
├── __init__.py                      # agentic_function re-export
├── cli.py                           # `openprogram` command entry point
├── agentic_programming/             # engine — paradigm-essential primitives
│   ├── function.py                  #   @agentic_function decorator
│   ├── runtime.py                   #   Runtime (exec + retry + DAG context)
│   ├── session.py                   #   session lifecycle
│   └── skills.py                    #   SKILL.md discovery
├── context/                         # flat-DAG context model — nodes, storage, render, compute_reads
├── providers/                       # Anthropic, OpenAI, Gemini, Claude Code, Codex, Gemini CLI
├── functions/
│   ├── _registry.py                 #   unified registry for tools + agentic functions
│   ├── tools/                       #   @function leaves — bash, read, edit, grep, semble_search, web_search, …
│   └── agentics/                    #   @agentic_function modules (each its own dir, code in __init__.py)
│       ├── ask_user/                #     ask the user a clarifying question
│       ├── deep_work/               #     autonomous plan-execute-evaluate loop
│       ├── extract_pdf_figures/     #     PDF figure extraction
│       ├── …                        #     other agentics …
│       ├── GUI-Agent-Harness/       #     GUI agent (separate repo, symlink)
│       ├── Research-Agent-Harness/  #     Research agent (separate repo, symlink)
│       └── Wiki-Agent-Harness/      #     Wiki agent (separate repo, symlink)
└── webui/                           # `openprogram web` — browser UI
skills/                              # SKILL.md files for agent integration
examples/                            # runnable demos
tests/                               # pytest suite

Contributing

This is a paradigm proposal with a reference implementation. We welcome discussions, alternative implementations in other languages, use cases that validate or challenge the approach, and bug reports.

See CONTRIBUTING.md for details.

Acknowledgements

OpenProgram stands on shoulders. The tool framework, provider abstraction, and several tool implementations were ported or adapted from the projects below — each under its own license. Enormous thanks to their authors.

OpenClaw (MIT) — layout of the tool registry (name / description / parameters / execute), provider abstraction with check_fn + requires_env gating, TOOLSETS presets, skill loading via SKILL.md frontmatter + late-bound read. Our full clone lives under references/openclaw/ (gitignored) for browsing.
hermes-agent (MIT) — starting point for execute_code (we trimmed the Docker / Modal layers), mixture_of_agents, and the general shape of the multi-provider web_search / image_generate / image_analyze tools.
pi-coding-agent (MIT) — via OpenClaw's import, the canonical AgentSkill shape (<available_skills> XML formatter, name / description / location).
Claude Code — overall ergonomics of the DEFAULT_TOOLS set (bash + read / write / edit + glob / grep / list
- apply_patch + todo_read / todo_write) and the todo tool's JSON schema.
Anthropic / OpenAI / Google SDKs — provider HTTP contracts; our providers call the raw HTTP APIs to keep SDK dependencies optional.

Individual tool files call out their direct inspirations in file-level docstrings where the lineage is more specific.

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 1,420 Commits
.github/workflows		.github/workflows
cli		cli
docs		docs
examples		examples
openprogram		openprogram
scripts		scripts
skills/agentic-programming		skills/agentic-programming
tests		tests
web		web
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
README_DRAFT.md		README_DRAFT.md
_timing.py		_timing.py
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Quick Start

Setup

Optional extras

Troubleshooting

Supported Projects

Why Agentic Programming?

Key Features

Automatic Context

Deep Work — Autonomous Quality Loop

Functions that author functions

Conversation as a git DAG

Layered memory

API Reference

Core

Authoring functions

Built-in Functions

Providers

API Docs by Topic

Integration

Contributing

Acknowledgements

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Quick Start

Setup

Optional extras

Troubleshooting

Supported Projects

Why Agentic Programming?

Key Features

Automatic Context

Deep Work — Autonomous Quality Loop

Functions that author functions

Conversation as a git DAG

Layered memory

API Reference

Core

Authoring functions

Built-in Functions

Providers

API Docs by Topic

Integration

Contributing

Acknowledgements

License

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages