[codex] Close coding-deepgent MVP local agent harness core#220
Draft
kun1s2 wants to merge 111 commits intoshareAI-lab:mainfrom
Draft
[codex] Close coding-deepgent MVP local agent harness core#220kun1s2 wants to merge 111 commits intoshareAI-lab:mainfrom
kun1s2 wants to merge 111 commits intoshareAI-lab:mainfrom
Conversation
The OMX team runtime writes local state under .omx/, and worker worktrees require the leader workspace to be clean before launch. Committing the ignore rule preserves local orchestration artifacts outside source control while unblocking durable team execution. Constraint: omx team refuses to launch with a dirty leader workspace because it provisions worker worktrees Rejected: Stash .gitignore before launch | would make .omx/ unignored again during team execution Confidence: high Scope-risk: narrow Directive: Keep .omx/ ignored; do not remove unless replacing the OMX state location Tested: git diff showed only .omx/ ignore addition Not-tested: team launch after commit
The first LangChain milestone needs CI evidence that the parallel s01-s06 track exists, compiles without OpenAI credentials, avoids import-time model starts, and preserves visible teaching harness primitives. This adds the guardrail tests and wires CI through requirements.txt so later LangChain dependency additions are installed consistently. Constraint: Test lane owns tests/CI while code lane still owns agents_langchain implementation Confidence: medium Scope-risk: narrow Tested: python -m py_compile tests/test_langchain_agents_smoke.py; python -m pytest tests/test_agents_smoke.py -q Not-tested: tests/test_langchain_agents_smoke.py passes only after agents_langchain s01-s06 code lane lands
The docs lane needs a stable comparison entry point before the code and test lanes are integrated, so this records where the s01-s06 LangChain/OpenAI-interface track lives, how it should be configured, and how reviewers should keep it separate from the original agents/ baseline and web UI. Constraint: First milestone is s01-s06 only and must preserve agents/ plus web/ boundaries Constraint: LangChain docs currently install core langchain plus langchain-openai for OpenAI integration Rejected: Surface the track through web/ now | user explicitly scoped web UI/app out of this milestone Confidence: high Scope-risk: narrow Tested: python -m pytest tests/test_agents_smoke.py -q; python -m compileall agents tests -q; git diff --check; python -m pip install --dry-run -r requirements.txt pytest Not-tested: full pytest suite due pre-existing tests/test_s_full_background.py failure unrelated to docs/deps changes
The docs lane needs a stable comparison entry point before the code and test lanes are integrated, so this records where the s01-s06 LangChain/OpenAI-interface track lives, how it should be configured, and how reviewers should keep it separate from the original agents/ baseline and web UI. Constraint: First milestone is s01-s06 only and must preserve agents/ plus web/ boundaries Constraint: LangChain docs currently install core langchain plus langchain-openai for OpenAI integration Rejected: Surface the track through web/ now | user explicitly scoped web UI/app out of this milestone Confidence: high Scope-risk: narrow Tested: python -m pytest tests/test_agents_smoke.py -q; python -m compileall agents tests -q; git diff --check; python -m pip install --dry-run -r requirements.txt pytest Not-tested: full pytest suite due pre-existing tests/test_s_full_background.py failure unrelated to docs/deps changes
Add a parallel agents_langchain s01-s06 track so learners can compare the existing hand-written Anthropic SDK baseline against LangChain's OpenAI-interface runtime without changing the web UI or original agents. Constraint: First milestone is s01-s06 only and must preserve agents/*.py plus web/ Rejected: Put LangChain files under agents/ | risks confusing the existing web extractor and baseline teaching boundary Confidence: high Scope-risk: moderate Tested: python -m py_compile agents_langchain/*.py; python -m pytest tests/test_agents_smoke.py tests/test_langchain_agents_smoke.py -q; env -u OPENAI_API_KEY import check for agents_langchain modules
The first LangChain milestone needs to sit beside the hand-written Anthropic SDK lessons, not replace them, so this adds a separate agents_langchain package, non-live smoke tests, OpenAI-style setup docs, and CI dependency wiring while leaving the web app and original s01-s06 scripts unchanged. Constraint: Preserve existing agents/*.py as the baseline and avoid web UI/app changes for this milestone Constraint: Automated tests must not require OPENAI_API_KEY or network access Rejected: Put LangChain files under agents/ | would blur the baseline boundary and risk web extractor churn Confidence: high Scope-risk: moderate Tested: python -m py_compile agents_langchain/*.py tests/test_langchain_agents_smoke.py Tested: python -m pytest tests/test_agents_smoke.py tests/test_langchain_agents_smoke.py -q Tested: env -u OPENAI_API_KEY python -m pytest tests/test_langchain_agents_smoke.py -q Not-tested: Full pytest suite is blocked by pre-existing tests/test_s_full_background.py failure in unmodified agents/s_full.py Not-tested: Live LangChain/OpenAI calls intentionally not run
The integrated LangChain milestone passed its targeted checks, but full repository pytest still failed in BackgroundManagerTests because a running background task with result=None rendered as '[running] None'. Normalizing the None case to the existing running placeholder keeps the capstone behavior aligned with the test and avoids a misleading status string. Constraint: Full post-change verification should pass before concluding the milestone Rejected: Leave the unrelated failure unresolved | would keep full pytest red at handoff time Confidence: high Scope-risk: narrow Directive: Preserve the '(running)' placeholder contract for unfinished background tasks unless tests and user-visible output are updated together Tested: python -m py_compile agents/s_full.py agents_langchain/*.py tests/test_langchain_agents_smoke.py; python -m pytest tests -q Not-tested: Interactive manual run of agents/s_full.py background task commands
added 6 commits
April 12, 2026 09:26
The s06 alignment status is the document users are actively reading while deciding what CC behavior the chapter matches or intentionally omits. Translating it keeps the evidence/inference boundary accessible without changing the underlying implementation. Constraint: User asked specifically to convert the s06 cc_alignment progress document to Chinese Constraint: Preserve existing s06 alignment structure and verification claims Rejected: Translate the whole cc_alignment directory now | request only covered the s06 progress document and other files have unrelated in-flight edits Confidence: high Scope-risk: narrow Reversibility: clean Directive: Keep future sNN alignment ledgers explicit about 已对齐、部分对齐/教学等价、未对齐/有意不复制、测试证据、下一步候选 Tested: PYTHON_DOTENV_DISABLED=1 pytest tests/test_s06_context_compact_baseline.py tests/test_deepagents_track_smoke.py tests/test_stage_track_capability_contract.py -q (24 passed) Tested: git diff --check Not-tested: Markdown rendering in generated web UI
The approved runtime-foundation plan called for a professional domain architecture over the existing LangChain product surface. This commit reconciles the six team lanes into one semantic history entry: typed settings, dependency-injector composition, runtime invocation context, TodoWrite/todo domain extraction, filesystem/tool-system policy seams, JSONL sessions, Typer/Rich/structlog local operations, and stage-3 verification gates. Constraint: Scope is limited to coding-deepgent/ per the approved PRD and team staffing lanes. Constraint: LangChain remains the runtime boundary; containers compose providers but domain modules do not import containers or hide business rules. Constraint: Structured tool inputs continue to use Pydantic args_schema / LangChain-native schemas rather than ad-hoc dict parsing or alias fallback. Rejected: Keep runtime-generated omx(team) checkpoint/merge commits | operational scaffolding violates the Lore commit protocol and obscures semantic review boundaries. Rejected: Leave stage metadata at stage 1 | would skip the runtime-foundation contract tests after implementation. Rejected: Accept mypy failures as an initial baseline | final integration typing issues were fixable without broadening scope. Confidence: high Scope-risk: moderate Reversibility: clean via backup branch created before squash finalization. Directive: Do not introduce new cc mirror modules or container imports in domain packages; update project_status.json, README, and runtime-foundation contract tests together when advancing later stages. Tested: cd coding-deepgent && python -m pytest -q -> 59 passed Tested: cd coding-deepgent && ruff check . -> all checks passed Tested: cd coding-deepgent && ruff format --check . -> 68 files already formatted Tested: cd coding-deepgent && python -m mypy src/coding_deepgent tests -> success Tested: CLI smoke for python -m coding_deepgent --help, config show, sessions list, and doctor without credentials Tested: Review grep checks for forbidden imports/modules/dependencies Not-tested: Live model invocation; no credentials required for this foundation stage.
Stage 4 turns the runtime foundation into a safer and more extensible product base by adding deterministic permission decisions, local lifecycle hooks, and structured prompt/context assembly without replacing LangChain's create_agent loop. The implementation keeps cc-haha alignment at the behavior-contract level while preserving LangChain-first boundaries and tight regression coverage. Constraint: Must stay LangChain/LangGraph-first and avoid introducing a custom query loop or speculative runtime wheel. Constraint: Stage 4 ask semantics must remain deterministic and no-UI so verification stays headless and local. Rejected: Copy cc-haha permission UI / HITL flow now | Stage 4 requires deterministic no-execution ask handling, not interactive approval. Rejected: Build a custom tool executor around permissions/hooks | AgentMiddleware and strict tool schemas already express the needed control-plane seams. Confidence: high Scope-risk: moderate Reversibility: clean Directive: Keep future memory, compact, skills, subagents, and tasks layered on top of these control-plane seams; do not bypass PermissionManager or PromptContext with ad-hoc runtime logic. Tested: cd coding-deepgent && python -m pytest -q -> 72 passed Tested: cd coding-deepgent && ruff check . -> all checks passed Tested: cd coding-deepgent && ruff format --check . -> 81 files already formatted Tested: cd coding-deepgent && python -m mypy src/coding_deepgent tests -> success Tested: grep guards for forbidden query loop / container imports / alias-fallback patterns -> passed Tested: Architect verification -> APPROVE Not-tested: Interactive approval UI/HITL; intentionally deferred beyond Stage 4.
Stage 5 absorbs the prompt/context/memory slice of the Claude Code roadmap through LangChain-native seams: a model-visible save_memory tool writes through ToolRuntime.store, recalled memories can be injected by middleware into the model request, and a deterministic tool-result budget helper bounds oversized payloads without adding message-history pruning or a custom query loop. Constraint: LangChain/LangGraph store, ToolRuntime, middleware, and create_agent remain the integration surface. Constraint: This stage is a foundation seam, not a durable cross-process memory guarantee beyond the configured store backend. Rejected: Implement message-history projection/pruning now | would widen compact scope beyond the approved tool-result budget slice. Rejected: Add LLM autocompact/session-memory side-agent behavior now | harder to verify deterministically and risks custom runtime drift. Confidence: high Scope-risk: moderate Reversibility: clean Directive: Keep memory separate from Todo and future durable tasks; future compact/subagent/task work must use the existing memory and prompt-context seams. Tested: cd coding-deepgent && python -m pytest -q -> 80 passed Tested: cd coding-deepgent && ruff check . -> all checks passed Tested: cd coding-deepgent && ruff format --check . -> 93 files already formatted Tested: cd coding-deepgent && python -m mypy src/coding_deepgent tests -> success Tested: Stage 5 grep guards -> STAGE5_FINAL_GATES_OK Tested: Architect verification -> APPROVE Not-tested: Persistent cross-process memory backend; current stage intentionally ships the store-backed foundation seam.
Stage 6 adds the Option B-min slice from the cc product roadmap: local SKILL.md loading, a store-backed durable task graph, and a synchronous stateless run_subagent tool with an exact child-tool allowlist. The implementation keeps richer Claude Code agent runtime concepts deferred while wiring the new surfaces through the existing LangChain create_agent tool system. Constraint: LangChain-first implementation; no custom query loop, background agents, mailbox, worktrees, remote/team runtime, sidechain resume, or forked skill execution. Constraint: TodoWrite remains session-local and separate from durable task records. Rejected: Full AgentTool parity now | would require background/resume/mailbox/worktree behavior outside the approved Option B-min slice. Rejected: Plugin/MCP skill loading now | Stage 6 is local skills only and keeps extension platform work for a later stage. Confidence: high Scope-risk: moderate Reversibility: clean Directive: Future multi-agent work must explicitly expand the subagent contract before adding mailbox/background/worktree/resume semantics. Tested: cd coding-deepgent && python -m pytest -q -> 87 passed Tested: cd coding-deepgent && ruff check . -> all checks passed Tested: cd coding-deepgent && ruff format --check . -> 107 files already formatted Tested: cd coding-deepgent && python -m mypy src/coding_deepgent tests -> success Tested: forbidden runtime-creep grep guards -> passed Tested: Architect verification -> APPROVE Not-tested: real background/remote subagent execution; intentionally deferred.
|
Someone is attempting to deploy a commit to the crazyboym's projects Team on Vercel. A member of the Team first needs to authorize it. |
36897b1 to
d882d01
Compare
added 11 commits
April 15, 2026 04:56
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes the Approach A MVP for
coding-deepgent: a local LangChain-native Agent Harness Core with explicit MVP/non-MVP boundaries and source-backed stage checkpoints through Stage 29.This PR now includes the earlier context/compact/task/verifier work plus the MVP closeout work from Stages 18B-29:
MVP Status
Canonical dashboard result:
Stage 30-36 reserve is not currently required.
Major Additions
Context / compact / session / memory
Durable workflow / subagents
Extension platform
Planning / Trellis
Validation
Full current validation on
coding-deepgent:ruff check coding-deepgent/src coding-deepgent/testsmypy coding-deepgent/src/coding_deepgent coding-deepgent/testspytest -q coding-deepgent/tests212 passedExplicit Deferred / Out Of MVP
Notes
This branch intentionally preserves the Trellis task/checkpoint artifacts that document the source-backed implementation and MVP closeout path.