🤖 feat: interactive harness init approval flow #1807
base: main
446b377 to 43951c5
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 446b377437
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Follow-ups pushed:
Note: Chromatic "UI Review" / "UI Tests" are still pending (require baseline acceptance).
7867a5e to 922f9e5
922f9e5 to 497091a
@codex review
Changes:
Validation:
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 497091ad40
497091a to a7bb8f0
@codex review
Addressed “Preserve harness state when stat fails”:
Validation:
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 98297a50f0
@codex review
Addressed “Guard harness load errors to avoid stuck loop state”:
Validation:
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: dfbe15ec01
dfbe15e to 82d1f5d
82d1f5d to d67aca6
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d459251934
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1a11933565
const prefix = workspaceName.trim().length > 0 ? workspaceName.trim() : "workspace";

return {
  configPath: joinForRuntime(runtimeConfig, workspacePath, HARNESS_DIR, `${prefix}.jsonc`),
  progressPath: joinForRuntime(
Sanitize absolute workspace names before building harness paths
For in-place workspaces, `metadata.name` can be an absolute path (see the comment in `getRuntimeAndWorkspacePath`). Here `prefix` is taken directly from `workspaceName` and then passed to `path.join` via `joinForRuntime`. If `workspaceName` starts with `/`, `path.join` treats it as absolute and discards `workspacePath/.mux/harness`, so the harness config/progress files resolve to something like `/Users/me/project.jsonc` instead of `.mux/harness/...`. This breaks harness detection and can write files outside the repo. Consider normalizing `workspaceName` to a safe relative basename (or stripping leading slashes) before joining.
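One way to act on this suggestion is sketched below. It is only an illustration, not the code that landed in the PR: `harnessPrefix` and `harnessConfigPath` are hypothetical helpers, the `.mux/harness` value for `HARNESS_DIR` is taken from the comment above, and plain string joining stands in for `joinForRuntime`, whose behavior is not shown here. The idea is to keep only the last path segment of the workspace name, so the result stays inside the harness directory regardless of how the join treats an absolute segment.

```typescript
// Assumed value, based on the ".mux/harness" path mentioned in the review comment.
const HARNESS_DIR = ".mux/harness";

// Hypothetical helper: reduce a workspace name (possibly an absolute path)
// to a single safe file-name segment.
function harnessPrefix(workspaceName: string): string {
  // Split on both POSIX and Windows separators and drop empty segments,
  // which also removes any leading or trailing slashes.
  const segments = workspaceName
    .trim()
    .split(/[\\/]+/)
    .filter((segment) => segment.length > 0);
  const last = segments[segments.length - 1] ?? "";
  // Fall back to the default for empty names and for dot segments,
  // which could otherwise point at or above the harness directory.
  if (last === "" || last === "." || last === "..") return "workspace";
  return last;
}

// Hypothetical stand-in for the joinForRuntime(...) call in the diff.
function harnessConfigPath(workspacePath: string, workspaceName: string): string {
  const root = workspacePath.replace(/\/+$/, "");
  return [root, HARNESS_DIR, `${harnessPrefix(workspaceName)}.jsonc`].join("/");
}

console.log(harnessConfigPath("/repo", "/Users/me/project"));
```

With this shape, an in-place workspace named `/Users/me/project` maps to a `project.jsonc` file under the workspace's own harness directory. Two workspaces whose names end in the same segment would collide, so a real fix might need a more distinctive prefix.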
1a11933 to 010bbdf
010bbdf to 6f8a76e
Adds workspace-local harness config (checklist + gates) and an opt-in Ralph loop runner.
- Backend services: WorkspaceHarnessService, GateRunnerService, GitCheckpointService, LoopRunnerService
- ORPC: workspace.harness + workspace.loop endpoints
- UI: RightSidebar Harness tab + command palette actions for gates/checkpoint/loop
Signed-off-by: Thomas Kosiewski <tk@coder.com>
---
_Generated with • Model: openai:gpt-5.2 • Thinking: high • Cost: 0.17_
Change-Id: I99428a620b0bd65e9b9a2bb9023b9dd9e0843bc1
Change-Id: I15d81ab1136b5437df531ba6cb3e23cf84c321a0 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Ide9e2ac1fa93252310350441843ae4d7eaa0ad25 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I0f684cca69decbe2756577ec54c321ea0e13b182 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Iebbcc21aaa8a919be5e1217c0d44b6cee070d782 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Include the workspace plan file path in harness reset/loop bearings summaries.
Signed-off-by: Thomas Kosiewski <tk@coder.com>
---
_Generated with `mux` • Model: `openai:gpt-5.2` • Thinking: `xhigh` • Cost: $47.81_
Change-Id: I89cf61ac2e147042882b58297d0bf9dde49835fd
Change-Id: Icf5963d92a65300117de0c264272f8ca3952c4e0 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Ie569d9a08cf122c8d7dce626003d1620a6e37bf9 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I88bf5879b908141790c6119d99f93983071a6b5e Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Ic9c7e77915dcf5662b2cf767f93202c928d13c91 Signed-off-by: Thomas Kosiewski <tk@coder.com>
- Add lightweight workspace.harness.exists endpoint
- Remove legacy harness filename support
- Conditionally show/remove Harness tab in right sidebar and persisted layout
Signed-off-by: Thomas Kosiewski <tk@coder.com>
---
_Generated with `mux` • Model: `openai:gpt-5.2` • Thinking: `xhigh` • Cost: $74.55_
Change-Id: Icfd5f621eefc533c855a202e8f65739b3194791a
Change-Id: I19674059830f6d2a447a96dfef6cebb64c65143e Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I5d1a99f30bf1c8997053eab48894cac50f4d7317 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I44047b8a001042fec6b21d46baf7502a231540ce Signed-off-by: Thomas Kosiewski <tk@coder.com>
Replace planFileOnly with mode/agentId/allowedEditPaths and update tests + docs. Change-Id: If00ef62e3877084ef650822af645e254371c9d3b Signed-off-by: Thomas Kosiewski <tk@coder.com>
Ensure harness parent dirs exist before writing config/journal and update journal hint + tests. Change-Id: Id05475a1414d0e5b2a26efa3b63c49f619390b62 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Include nested harness glob in harness-init edit allowlist and update docs/tests. Change-Id: I5385802f1711dec7af09cc4526d8abc38c25df09 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I1f40d2ed2df5614ef7513e8a2efc61ac13fc7e73 Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I8c417c433ed340807524e11609d2c4e39cdf07bb Signed-off-by: Thomas Kosiewski <tk@coder.com>
- Require Harness Init to delegate exploration to explore subagents
- Inject <harness_output_path> derived from workspace name
- Remove harness-from-plan agent + startFromPlan RPC
Change-Id: Ie2b72ffc3bd7705c08d1ce861e73093518a14078
Signed-off-by: Thomas Kosiewski <tk@coder.com>
6f8a76e to 9f4d0a2
Adds an interactive Harness-from-Plan workflow so Ralph loop runs with an explicit, user-reviewed harness.
What changed
- `harness-init` agent (repo-aware, interactive harness authoring) with UI styling.
- `harness-init` edits to `.mux/harness/*.jsonc` via tool-layer `allowedEditPaths` enforcement.
- `harness-init` sub-agent spawning to read-only `explore` tasks.
- `harness-init` and request a harness proposal.
- `propose_harness` tool + UI card with Approve & Start to start the loop in Exec mode.
Validation
- `make static-check`
📋 Implementation Plan
Interactive Harness-from-Plan (repo-aware + chat-first approval)
Goals
- `rg`, inspect CI config) before proposing checklist + gates.
- `explore` sub-agents to answer: “what parts of the repo are affected?” and “what gates/commands exist here?”
Recommended approach: true inline “Harness Mode” (hidden `harness-init` agent)
Net LoC (product code): ~400–650
1) Add a hidden harness-init agent (interactive, repo-aware)
- `harness-init`) that is:
  - `ui.hidden: true` (not selectable in the agent picker / command palette)
  - `subagent.runnable: true`; add a denylist check in the `task` tool for defense-in-depth)
- `--color-harness-init-mode` (and optional hover/alpha variants) in `src/browser/styles/globals.css`.
- `src/browser/components/AgentModePicker.tsx` to show Harness Init in that color.
- `src/browser/components/ChatInput/index.tsx` so the Send button uses `bg-harness-init-mode` when the active agent is `harness-init`.
- `src/node/builtinAgents/harness-from-plan.md` and the new `harness-init` prompt to share the same guidance:
  - `Makefile`, `justfile`, `package.json` scripts, `.github/workflows/*`)
2) Constrain edits to harness files (but allow in-place diffs)
- `file_edit_*` tools for `harness-init`, but enforce a tool-layer allowlist so it can only edit `.mux/harness/*.jsonc`.
- `ToolConfiguration` with `allowedEditPaths` and enforce it in `src/node/services/tools/fileCommon.ts`.
- `allowedEditPaths` in `src/node/services/aiService.ts` when the active agent is `harness-init`.
- `.mux/harness/<workspace>.jsonc` when `harness-init` starts so the agent can make small diffs without re-outputting the whole file.
- `propose_harness`, assert the working tree has no changes outside `.mux/harness/*.jsonc` to catch accidental edits (including via `bash`).
3) Enable explore subagents safely
- `task` + `task_await` for `harness-init`.
- `src/node/services/tools/task.ts` so `harness-init` can only spawn `agentId: "explore"`.
4) Wire “Start Ralph Loop” to switch agent + send an initiating message
- `src/browser/components/tools/ProposePlanToolCall.tsx`:
  - `api.workspace.loop.startFromPlan()` call with a Plan-Mode-like transition:
    - `updatePersistedState(getAgentIdKey(workspaceId), "harness-init")`
    - `api.workspace.sendMessage({ message: "Generate a Ralph harness from the current plan and propose it" })`
5) Add `propose_harness` tool + approval UI (mirrors `propose_plan`)
- `propose_harness`:
  - `recordFileState` so the UI can detect out-of-band edits
- `ProposeHarnessToolCall.tsx`):
  - `api.workspace.harness.get` if it already returns enough; otherwise add a dedicated `getHarnessContent` endpoint)
  - `exec`
  - `workspace.loop.start` endpoint if one doesn’t already exist)
6) Tests
- `allowedEditPaths` enforcement (can edit harness; cannot edit other files)
- `propose_harness` validation.
Alternative (less refactor)
Option: Dedicated “Harness Review” child workspace
Net LoC: ~300–500
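Step 2 of the plan describes a tool-layer allowlist that limits `harness-init` edits to `.mux/harness/*.jsonc`. The following is a minimal sketch of how such a check could look, not the implementation in `fileCommon.ts`. Only `allowedEditPaths` and the glob come from the plan; `ToolConfigurationSketch`, `normalizeRelative`, `globToRegExp`, and `isEditAllowed` are hypothetical names. The sketch assumes globs are relative to the workspace root and, as a simplification, rejects absolute paths outright rather than resolving them against the root.

```typescript
// Hypothetical shape: the plan names `allowedEditPaths` but does not show its type.
interface ToolConfigurationSketch {
  allowedEditPaths?: string[]; // globs relative to the workspace root (assumption)
}

// Collapse "." and ".." segments. Returns null when the path is absolute
// or would climb above the workspace root.
function normalizeRelative(filePath: string): string | null {
  const unified = filePath.replace(/\\/g, "/");
  if (unified.startsWith("/") || /^[A-Za-z]:/.test(unified)) return null;
  const out: string[] = [];
  for (const segment of unified.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      if (out.length === 0) return null;
      out.pop();
    } else {
      out.push(segment);
    }
  }
  return out.join("/");
}

// Translate a simple glob into an anchored RegExp:
// "**" may span directories, "*" stays within one path segment.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  const pattern = escaped.replace(/\*\*|\*/g, (match) => (match === "**" ? ".*" : "[^/]*"));
  return new RegExp(`^${pattern}$`);
}

// An edit is allowed when no allowlist is configured, or when the
// normalized path matches at least one allowlisted glob.
function isEditAllowed(filePath: string, config: ToolConfigurationSketch): boolean {
  const allowed = config.allowedEditPaths;
  if (allowed === undefined) return true;
  const relative = normalizeRelative(filePath);
  if (relative === null) return false;
  return allowed.some((glob) => globToRegExp(glob).test(relative));
}

const harnessInitConfig: ToolConfigurationSketch = {
  allowedEditPaths: [".mux/harness/*.jsonc"],
};
console.log(isEditAllowed(".mux/harness/main.jsonc", harnessInitConfig));
console.log(isEditAllowed("src/index.ts", harnessInitConfig));
```

Normalizing before matching matters because a path such as `.mux/harness/../../src/index.ts` would otherwise need to be caught by the glob itself. A check like this only covers the file-edit tools, which is presumably why the plan also asserts a clean working tree outside the harness directory before `propose_harness`.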
Generated with `mux` • Model: `openai:gpt-5.2` • Thinking: `high` • Cost: $60.54