Create prompt templates for agent communication (CS-10477)#4230
💡 Codex Review
boxel/packages/software-factory/scripts/lib/factory-agent.ts
Lines 379 to 380 in 24bf1b0
P1: Thread failed-test context into plan() retries
The new iterate prompt is only selected when both previousActions and iteration are passed, but the public entry point still calls buildMessages(context) with no extra arguments. That means a real follow-up plan(context) after a failed test run will never hit ticket-iterate; it falls back to ticket-implement and loses the prior actions plus failure details that the fix-up loop needs. Compared with the previous implementation, this regresses all test-failure retries from self-contained repair prompts to essentially a fresh first pass.
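To make the failure mode concrete, here is a minimal sketch of the selection rule the review describes. The function name selectTemplate, the trimmed-down types, and the create_file action type are hypothetical; only the template names (ticket-implement, ticket-iterate) and the argument names (previousActions, iteration) come from the PR.

```typescript
// Hypothetical, trimmed-down types; the real AgentContext has more fields.
type AgentAction = { type: string };
type TestResult = { status: "passed" | "failed"; failures: string[] };

interface AgentContext {
  testResults?: TestResult;
}

// Sketch of the rule described in the review: the iterate template is
// chosen only when BOTH extra arguments are supplied. The context itself
// (and its testResults) plays no part in the decision, hence the underscore.
function selectTemplate(
  _context: AgentContext,
  previousActions?: AgentAction[],
  iteration?: number,
): "ticket-implement" | "ticket-iterate" {
  if (previousActions !== undefined && iteration !== undefined) {
    return "ticket-iterate";
  }
  return "ticket-implement";
}

const failedRun: AgentContext = {
  testResults: { status: "failed", failures: ["renders title"] },
};

// What plan(context) effectively did: no extra arguments, so the retry
// falls back to the first-pass template despite the failed test run.
const viaPlan = selectTemplate(failedRun);

// What the fix-up loop needs: prior actions and iteration threaded through.
const viaLoop = selectTemplate(failedRun, [{ type: "create_file" }], 2);
```

Under these assumptions, viaPlan resolves to ticket-implement even though the context carries failed test results, which is the regression the review flags.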
Pull request overview
This PR introduces a template-driven prompt system for the software-factory agent so the orchestrator’s LLM messages can be iterated via standalone markdown files rather than hardcoded strings.
Changes:
- Added markdown prompt templates under packages/software-factory/prompts/ (system + implement/iterate/test stages + action schema + examples).
- Implemented PromptLoader/FilePromptLoader and a minimal mustache-like interpolator with {{var}}, {{#each}}, and {{#if}}/{{else}}.
- Updated OpenRouterFactoryAgent.buildMessages() to assemble one-shot [system, user] messages from templates and added a factory:prompt-smoke CLI plus unit tests.
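For readers unfamiliar with this style of templating, the following is a rough sketch of what a minimal mustache-like interpolator can look like. It is not the code in factory-prompt-loader.ts: the names lookup and interpolate are hypothetical, blocks of the same kind cannot be nested, and exposing primitive array items as {{this}} is an assumption of this sketch.

```typescript
type Data = Record<string, unknown>;

// Resolve "a.b.c" against nested objects; missing segments yield undefined.
function lookup(data: Data, path: string): unknown {
  return path.split(".").reduce<unknown>((value, key) => {
    if (value !== null && typeof value === "object") {
      return (value as Data)[key];
    }
    return undefined;
  }, data);
}

// Simplified sketch: blocks of the same kind cannot be nested in each other.
function interpolate(template: string, data: Data): string {
  // {{#each items}}...{{/each}}: render the body once per array element.
  let out = template.replace(
    /\{\{#each ([\w.]+)\}\}([\s\S]*?)\{\{\/each\}\}/g,
    (_match, path: string, body: string) => {
      const items = lookup(data, path);
      if (!Array.isArray(items)) return "";
      return items
        .map((item) => {
          // Object items expose their fields; primitives are bound to "this".
          const scope: Data =
            item !== null && typeof item === "object"
              ? { ...data, ...(item as Data) }
              : { ...data, this: item };
          return interpolate(body, scope);
        })
        .join("");
    },
  );

  // {{#if flag}}...{{else}}...{{/if}}: empty arrays count as falsy.
  out = out.replace(
    /\{\{#if ([\w.]+)\}\}([\s\S]*?)\{\{\/if\}\}/g,
    (_match, path: string, body: string) => {
      const [truthy = "", falsy = ""] = body.split("{{else}}");
      const value = lookup(data, path);
      const isTruthy = Array.isArray(value) ? value.length > 0 : Boolean(value);
      return interpolate(isTruthy ? truthy : falsy, data);
    },
  );

  // {{variable}} and {{dot.path}}: missing values render as empty strings.
  return out.replace(/\{\{([\w.]+)\}\}/g, (_match, path: string) => {
    const value = lookup(data, path);
    return value === undefined || value === null ? "" : String(value);
  });
}

const rendered = interpolate(
  "Ticket: {{ticket.id}}\n{{#each files}}- {{path}}\n{{/each}}{{#if failed}}FAIL{{else}}PASS{{/if}}",
  { ticket: { id: "T-1" }, files: [{ path: "a.gts" }, { path: "b.gts" }], failed: false },
);
```

One limitation of this regex-based approach is that substituted values are scanned again by later passes, so interpolated text that itself contains {{...}} would be expanded a second time.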
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| packages/software-factory/scripts/lib/factory-prompt-loader.ts | Implements file-backed prompt loading, interpolation, and prompt assembly helpers. |
| packages/software-factory/scripts/lib/factory-agent.ts | Switches buildMessages() from stubbed strings to template-based prompt construction. |
| packages/software-factory/prompts/system.md | Defines the shared system prompt (role/rules/realms/skills/tools + action schema). |
| packages/software-factory/prompts/ticket-implement.md | Template for initial implementation pass user prompt. |
| packages/software-factory/prompts/ticket-iterate.md | Template for iteration/fix pass user prompt (previous actions + failures + tool results). |
| packages/software-factory/prompts/ticket-test.md | Template for test generation pass user prompt. |
| packages/software-factory/prompts/action-schema.md | Provides the canonical action schema text embedded into the system prompt. |
| packages/software-factory/prompts/examples/* | Adds example input/output snippets for reference. |
| packages/software-factory/scripts/factory-prompt-smoke.ts | Adds CLI to print assembled prompts without any network/API key. |
| packages/software-factory/tests/factory-prompt-loader.test.ts | Adds unit/integration tests for interpolation and prompt assembly. |
| packages/software-factory/tests/factory-agent.test.ts | Updates tests to reflect new template-based message assembly and iterate-mode signature. |
| packages/software-factory/tests/index.ts | Registers the new test file in the suite. |
| packages/software-factory/package.json | Adds factory:prompt-smoke script. |
Re: P1 (Thread failed-test context into plan() retries): fixed in 3451842. Added previousActions and iteration as optional fields on AgentContext, and plan() now threads them through to buildMessages():

    export interface AgentContext {
      // ...existing fields...
      /** Actions from the previous plan() call, fed back for iteration prompts. */
      previousActions?: AgentAction[];
      /** Current iteration number (1-based), set by the orchestrator. */
      iteration?: number;
    }

    async plan(context: AgentContext): Promise<AgentAction[]> {
      let messages = this.buildMessages(
        context,
        context.previousActions,
        context.iteration,
      );
      // ...
    }

The orchestrator sets these fields. Added an integration-style test (…).
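As an illustration of how an orchestrator might populate the new fields between attempts, here is a small sketch. The helper nextIterationContext, the trimmed-down types, and the create_file action are hypothetical and not taken from the PR; only the field names previousActions, iteration, and testResults are.

```typescript
// Hypothetical, trimmed-down types; the real AgentContext has more fields.
type AgentAction = { type: string; path?: string };
type TestResult = { status: "passed" | "failed"; failures: string[] };

interface AgentContext {
  testResults?: TestResult;
  previousActions?: AgentAction[];
  iteration?: number;
}

// Hypothetical helper: derive the context for the next plan() call after a
// failed test run, so the follow-up prompt is self-contained.
function nextIterationContext(
  current: AgentContext,
  actions: AgentAction[],
  testResults: TestResult,
): AgentContext {
  return {
    ...current,
    testResults,
    previousActions: actions,
    // Iterations are 1-based; a missing value is treated as the first pass.
    iteration: (current.iteration ?? 1) + 1,
  };
}

const firstPass: AgentContext = { iteration: 1 };
const retry = nextIterationContext(
  firstPass,
  [{ type: "create_file", path: "card.gts" }],
  { status: "failed", failures: ["renders title"] },
);
```

The orchestrator would then call plan(retry), and because the fields travel on the context, plan() needs no extra parameters.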
backspace left a comment
I ran the suggested commands and saw the suggested output. The actual volume of words here is massive though, am I meant to be reading this all? I only skimmed

    # Realms

    - Target realm: {{targetRealmUrl}}

Where does this come from? I ran with BOXEL_ENVIRONMENT=hello so I’d expect to see http://realm-server.hello.localhost/user/personal but I got http://localhost:4201/user/personal/
but I know I’m probably the only one using this at the moment, so when the time comes, I’ll add support for this
the target realm is the realm that you specified to create the project in. i don't think the software factory is aware of the boxel environment work. I can add a ticket for that though
Implement the core FactoryAgent interface that decouples the orchestration
loop from any specific LLM. This is the foundational ticket for the
software factory execution loop.
- Define types: FactoryAgentConfig, AgentContext, AgentAction,
  ResolvedSkill, ToolManifest, TestResult, ToolResult, and placeholder
  card types
- Implement OpenRouterFactoryAgent with dual-path routing:
  - Direct path via OPENROUTER_API_KEY env var (simplest for local dev/CI)
  - Proxy path via realm server _request-forward (production, with billing)
  - Env var takes precedence over config over proxy
- Implement MockFactoryAgent for deterministic testing
- Add resolveFactoryModel() with CLI > env > FACTORY_DEFAULT_MODEL fallback
- Add action validation, response parsing with markdown fence stripping
- Add retry-once with error correction on malformed LLM responses
- Add smoke-test script (pnpm factory:agent-smoke) for manual verification
- 42 tests: unit tests + integration tests with mock HTTP servers
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
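The "response parsing with markdown fence stripping" step can be pictured roughly as follows. This is a sketch under assumptions, not the code in factory-agent.ts: the names stripMarkdownFence and parseActions are hypothetical, and the response is assumed to be a JSON array of actions.

```typescript
// Build the fence marker programmatically so this example does not
// contain a literal fence inside its own code block.
const FENCE = "`".repeat(3);

// Matches a response that is entirely wrapped in one fenced block,
// with an optional language tag after the opening fence.
const FENCED = new RegExp(
  `^${FENCE}[\\w-]*[ \\t]*\\n([\\s\\S]*?)\\n?${FENCE}$`,
);

// Hypothetical helper: return the fenced body, or the trimmed input
// unchanged when the response is not fenced.
function stripMarkdownFence(raw: string): string {
  const trimmed = raw.trim();
  const match = FENCED.exec(trimmed);
  return match?.[1]?.trim() ?? trimmed;
}

// Hypothetical helper: parse the response and insist on a JSON array.
function parseActions(raw: string): unknown[] {
  const parsed: unknown = JSON.parse(stripMarkdownFence(raw));
  if (!Array.isArray(parsed)) {
    throw new Error("Expected a JSON array of actions");
  }
  return parsed;
}

const fencedResponse = `${FENCE}json\n[{"type":"done"}]\n${FENCE}`;
const fromFenced = parseActions(fencedResponse);
const fromBare = parseActions('[{"type":"done"}]');
```

A parse failure here is the kind of malformed response that the retry-once path in the commit message would presumably feed back to the model for correction.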
- Fix all prettier and qunit lint errors
- Trim and treat blank OPENROUTER_API_KEY as missing (avoids bypassing
  proxy with empty env var in CI)
- Pass authorization through as-is to avoid Bearer Bearer double-prefix
- Use new URL() for proxy URL construction (safe without trailing slash)
- Reject non-object toolArgs in validation instead of silently dropping
- Add tests for blank API key handling and invalid toolArgs
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
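The routing precedence (env var over config over proxy) together with the blank-key and URL fixes above might look roughly like this. The function resolveRoute, the Route type, and the config field names apiKey and realmServerUrl are assumptions made for this sketch, as is the proxy endpoint living directly under the realm server URL.

```typescript
// Hypothetical result type: either call OpenRouter directly or go through
// the realm server proxy.
type Route =
  | { kind: "direct"; apiKey: string }
  | { kind: "proxy"; url: string };

// Hypothetical config shape; the real FactoryAgentConfig is not shown here.
interface RouteConfig {
  apiKey?: string;
  realmServerUrl: string;
}

function resolveRoute(
  env: Record<string, string | undefined>,
  config: RouteConfig,
): Route {
  // A blank or whitespace-only env var is treated as missing, so an empty
  // CI variable does not bypass the proxy.
  const envKey = env["OPENROUTER_API_KEY"]?.trim();
  if (envKey) return { kind: "direct", apiKey: envKey };

  const configKey = config.apiKey?.trim();
  if (configKey) return { kind: "direct", apiKey: configKey };

  // new URL() resolves the path against the base whether or not the base
  // ends with a trailing slash.
  const url = new URL("_request-forward", config.realmServerUrl).href;
  return { kind: "proxy", url };
}

const blankKey = resolveRoute(
  { OPENROUTER_API_KEY: "   " },
  { realmServerUrl: "http://localhost:4201" },
);
const envWins = resolveRoute(
  { OPENROUTER_API_KEY: " sk-test " },
  { apiKey: "from-config", realmServerUrl: "http://localhost:4201/" },
);
```

String concatenation would produce a double slash or a missing slash depending on the base, which is the case new URL() avoids.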
Add markdown prompt templates and a PromptLoader that assembles one-shot
LLM messages from templates + runtime context. Each plan() call sends
exactly [system, user] — no multi-turn conversation.
- prompts/: system.md, ticket-implement.md, ticket-test.md,
ticket-iterate.md, action-schema.md, and examples/
- PromptLoader: reads, caches, and interpolates templates with
{{variable}}, {{#each}}, {{#if}}/{{else}} support
- OpenRouterFactoryAgent.buildMessages() now uses template-based assembly
- factory:prompt-smoke script for reviewing assembled prompts
- 37 new unit tests for interpolation, assembly, and integration
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…formatting
- buildMessages() now keys off context.testResults alone (with sensible
defaults for previousActions/iteration) so iterate template is used
whenever test results are present
- assembleImplementPrompt() includes tool results when present, so
invoke_tool output is not silently dropped on re-plan
- Tool results propagate outputFormat from tool manifests — templates
use ```text or ```json fences based on the tool's declared format
- Extract shared buildToolResultsData() helper used by both implement
and iterate assembly functions
- Remove indentation from closing template tags ({{/each}}, {{/if}})
to avoid stray whitespace in rendered prompts
- Add prompts/ to .prettierignore (template syntax conflicts with
prettier's markdown list formatting)
- 5 new tests covering the above changes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
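To illustrate the outputFormat propagation described in this commit, here is a sketch loosely modeled on the buildToolResultsData() helper it mentions. The real helper's signature is not shown on this page, so the function names toolResultsForTemplate and renderToolResult, the type shapes, and the default of "text" are all assumptions.

```typescript
const FENCE = "`".repeat(3);

type OutputFormat = "text" | "json";

// Hypothetical, trimmed-down shapes for manifests and results.
interface ToolManifest {
  name: string;
  outputFormat?: OutputFormat;
}
interface ToolResult {
  tool: string;
  output: unknown;
}
interface ToolResultData {
  tool: string;
  outputFormat: OutputFormat;
  output: string;
}

// Pair each tool result with the format declared in its manifest,
// falling back to "text" when the tool or the format is unknown.
function toolResultsForTemplate(
  results: ToolResult[],
  manifests: ToolManifest[],
): ToolResultData[] {
  const formats = new Map(
    manifests.map((m) => [m.name, m.outputFormat ?? "text"] as const),
  );
  return results.map((result) => ({
    tool: result.tool,
    outputFormat: formats.get(result.tool) ?? "text",
    output:
      typeof result.output === "string"
        ? result.output
        : JSON.stringify(result.output, null, 2),
  }));
}

// Render one result the way a template could: a fence tagged with the format.
function renderToolResult(data: ToolResultData): string {
  return `${FENCE}${data.outputFormat}\n${data.output}\n${FENCE}`;
}

const toolData = toolResultsForTemplate(
  [
    { tool: "search", output: { hits: 1 } },
    { tool: "lint", output: "ok" },
  ],
  [{ name: "search", outputFormat: "json" }, { name: "lint" }],
);
```

Sharing one data-shaping helper between the implement and iterate prompts keeps the two templates from drifting in how they present tool output.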
Add previousActions and iteration as optional fields on AgentContext so the
orchestrator can set them and plan() threads them to buildMessages(). This
ensures the iterate template is used with real data during the fix-up loop,
not just sensible defaults.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Force-pushed from 3451842 to 0a9d061
skimming is fine--the point is that they are there. we will tweak these prompts as we iterate on the factory
…t-templates-for-agent-communication
# Conflicts:
#   packages/software-factory/scripts/factory-agent-smoke.ts
#   packages/software-factory/scripts/lib/factory-agent.ts
#   packages/software-factory/tests/factory-agent.test.ts
…t-templates-for-agent-communication
# Conflicts:
#   packages/software-factory/package.json
Summary
- Markdown prompt templates in packages/software-factory/prompts/ that define how the orchestrator communicates with the LLM: standalone files that can be iterated without code changes
- PromptLoader with {{variable}}, {{#each}}, {{#if}}/{{else}} interpolation: minimal mustache-like expansion, no template engine dependency
- Replaces buildMessages() in OpenRouterFactoryAgent with template-based one-shot prompt assembly
- factory:prompt-smoke CLI script for reviewing assembled prompts without an API key or server
Depends on
Files
Prompt templates (packages/software-factory/prompts/):
- system.md
- ticket-implement.md
- ticket-test.md
- ticket-iterate.md
- action-schema.md
- examples/create-card.md
- examples/create-test.md
- examples/iterate-fix.md
Implementation:
- scripts/lib/factory-prompt-loader.ts: PromptLoader interface, FilePromptLoader, interpolation engine, message assembly functions
- scripts/lib/factory-agent.ts: buildMessages() updated to use templates
- scripts/factory-prompt-smoke.ts
- tests/factory-prompt-loader.test.ts
Try it out
No API keys, servers, or network access required — the smoke test assembles prompts with sample data and prints exactly what the LLM would receive.
Run all three stages (implement → iterate → test):

    cd packages/software-factory
    pnpm factory:prompt-smoke

Run a specific stage:
What you should see:
Each stage prints a [SYSTEM] message and a [USER] message separated by decorated headers. For example, the implement stage output starts with:
followed by the full system prompt (role, rules, action schema, skills, tools), then:
Each stage ends with a char count summary like:
The iterate stage additionally shows previous actions (with code blocks) and test failure output embedded in the user prompt, demonstrating the self-contained one-shot design.
Run unit tests:
All 123 tests should pass (37 new + 86 existing).
Test plan
- PromptLoader interpolation (simple vars, dot paths, {{#each}}, {{#if}}/{{else}})
- ticket-iterate prompt includes previous actions and test results
- factory:prompt-smoke runs cleanly and prints all three stages
- factory-agent tests updated and passing
🤖 Generated with Claude Code