Create prompt templates for agent communication (CS-10477) #4230

Merged
habdelra merged 7 commits into main from cs-10477-create-prompt-templates-for-agent-communication
Mar 24, 2026

@habdelra
Contributor

Summary

  • Add markdown prompt templates in packages/software-factory/prompts/ that define how the orchestrator communicates with the LLM — standalone files that can be iterated without code changes
  • Implement PromptLoader with {{variable}}, {{#each}}, {{#if}}/{{else}} interpolation — minimal mustache-like expansion, no template engine dependency
  • Replace the stub buildMessages() in OpenRouterFactoryAgent with template-based one-shot prompt assembly
  • Add factory:prompt-smoke CLI script for reviewing assembled prompts without an API key or server
  • 37 new unit tests covering interpolation, prompt assembly, and integration
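To make the interpolation bullet concrete, here is a minimal sketch of a mustache-like expander supporting {{variable}} (with dot paths), {{#each}}, and {{#if}}/{{else}}. It is not the PR's actual code: the names `interpolate`, `lookup`, and `isTruthy`, the `{{this}}` item binding, the empty-array-is-falsy rule, and the lack of support for nesting blocks of the same kind are all assumptions made for illustration.

```typescript
// Hypothetical sketch; the real PromptLoader in factory-prompt-loader.ts may differ.
type Vars = Record<string, unknown>;

// Resolve a dot path such as "ticket.id" against a context object.
function lookup(ctx: Vars, path: string): unknown {
  return path
    .split('.')
    .reduce<unknown>(
      (value, key) =>
        value !== null && typeof value === 'object'
          ? (value as Vars)[key]
          : undefined,
      ctx,
    );
}

// Assumption: empty arrays count as false, like mustache sections.
function isTruthy(value: unknown): boolean {
  return Array.isArray(value) ? value.length > 0 : Boolean(value);
}

function interpolate(template: string, ctx: Vars): string {
  // 1. Expand {{#each path}}...{{/each}}, rendering the body once per item.
  let out = template.replace(
    /\{\{#each ([\w.]+)\}\}([\s\S]*?)\{\{\/each\}\}/g,
    (_m, path: string, body: string) => {
      let items = lookup(ctx, path);
      if (!Array.isArray(items)) {
        return '';
      }
      return items
        .map((item) =>
          interpolate(
            body,
            typeof item === 'object' && item !== null
              ? { ...ctx, ...(item as Vars), this: item }
              : { ...ctx, this: item },
          ),
        )
        .join('');
    },
  );
  // 2. Resolve {{#if path}}...{{else}}...{{/if}}; the else branch is optional.
  out = out.replace(
    /\{\{#if ([\w.]+)\}\}([\s\S]*?)(?:\{\{else\}\}([\s\S]*?))?\{\{\/if\}\}/g,
    (_m, path: string, yes: string, no: string = '') =>
      isTruthy(lookup(ctx, path)) ? yes : no,
  );
  // 3. Substitute plain {{variable}} and {{dot.path}} references.
  return out.replace(/\{\{([\w.]+)\}\}/g, (_m, path: string) => {
    let value = lookup(ctx, path);
    return value === undefined || value === null ? '' : String(value);
  });
}

console.log(
  interpolate(
    'ID: {{ticket.id}}\n{{#each criteria}}- {{this}}\n{{/each}}',
    { ticket: { id: 'Ticket/define-sticky-note-core' }, criteria: ['renders'] },
  ),
);
```

Doing the three passes with plain regular expressions keeps the sketch dependency-free, which matches the stated goal of avoiding a template engine, at the cost of not handling nested blocks of the same type.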

Depends on

#4229 (CS-10476: Define FactoryAgent interface and OpenRouter integration) must be reviewed and merged first. This PR is based on that branch.

Files

Prompt templates (packages/software-factory/prompts/):

| File | Purpose |
| --- | --- |
| system.md | Role, rules, output schema, skills, tools |
| ticket-implement.md | First pass: project context + ticket + implementation instructions |
| ticket-test.md | Generate tests for existing implementation |
| ticket-iterate.md | Self-contained: ticket context + previous actions + test failures + fix instructions |
| action-schema.md | Canonical AgentAction[] JSON schema |
| examples/create-card.md | Example: creating a card definition + instance |
| examples/create-test.md | Example: generating a test spec |
| examples/iterate-fix.md | Example: fixing code after test failure |

Implementation:

| File | Purpose |
| --- | --- |
| scripts/lib/factory-prompt-loader.ts | PromptLoader interface, FilePromptLoader, interpolation engine, message assembly functions |
| scripts/lib/factory-agent.ts | Updated buildMessages() to use templates |
| scripts/factory-prompt-smoke.ts | CLI smoke test for reviewing prompts |
| tests/factory-prompt-loader.test.ts | 37 new unit tests |

Try it out

No API keys, servers, or network access required — the smoke test assembles prompts with sample data and prints exactly what the LLM would receive.

Run all three stages (implement → iterate → test):

cd packages/software-factory
pnpm factory:prompt-smoke

Run a specific stage:

pnpm factory:prompt-smoke -- --stage implement   # first pass
pnpm factory:prompt-smoke -- --stage iterate      # fix after test failure
pnpm factory:prompt-smoke -- --stage test         # generate tests

What you should see:

Each stage prints a [SYSTEM] message and a [USER] message separated by decorated headers. For example, the implement stage output starts with:

════════════════════════════════════════════════════════════════════════
  STAGE: implement (first pass)
════════════════════════════════════════════════════════════════════════

── [SYSTEM] ──────────────────────────
# Role

You are a software factory agent. You implement Boxel cards and tests in
target realms based on ticket descriptions and project context.

# Output Format
...

followed by the full system prompt (role, rules, action schema, skills, tools), then:

── [USER] ──────────────────────────
# Project

Build a sticky note card application for Boxel

Success criteria:
- StickyNote card renders with title and body
...

# Current Ticket

ID: Ticket/define-sticky-note-core
Summary: Define the core StickyNote CardDef
...

Each stage ends with a char count summary like:

📊  System: 3174 chars | User: 1482 chars

The iterate stage additionally shows previous actions (with code blocks) and test failure output embedded in the user prompt, demonstrating the self-contained one-shot design.
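For readers who want to reproduce the summary line on their own prompts, the following is a rough sketch of how a stage's messages and char counts could be printed. The `ChatMessage` shape and the `charCount` and `formatStage` helpers are hypothetical and are not taken from factory-prompt-smoke.ts; counts here use JavaScript string length (UTF-16 code units), which may or may not match what the script reports.

```typescript
// Hypothetical sketch of the smoke-test output format described above.
interface ChatMessage {
  role: 'system' | 'user';
  content: string;
}

// Sum the content length of every message with the given role.
function charCount(messages: ChatMessage[], role: ChatMessage['role']): number {
  return messages
    .filter((message) => message.role === role)
    .reduce((total, message) => total + message.content.length, 0);
}

// Render one stage: a header, each message under a role banner, then the summary.
function formatStage(stage: string, messages: ChatMessage[]): string {
  let lines = [`STAGE: ${stage}`];
  for (let message of messages) {
    lines.push(`── [${message.role.toUpperCase()}] ──`, message.content);
  }
  lines.push(
    `📊  System: ${charCount(messages, 'system')} chars | User: ${charCount(messages, 'user')} chars`,
  );
  return lines.join('\n');
}

console.log(
  formatStage('implement (first pass)', [
    { role: 'system', content: '# Role' },
    { role: 'user', content: '# Project' },
  ]),
);
```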

Run unit tests:

pnpm test:node

All 123 tests should pass (37 new + 86 existing).

Test plan

  • Unit tests for PromptLoader interpolation (simple vars, dot paths, {{#each}}, {{#if}}/{{else}})
  • Unit tests for one-shot message assembly at each loop stage
  • Test that ticket-iterate prompt includes previous actions and test results
  • Snapshot test for assembled system prompt with sample skills and tools
  • Verify factory:prompt-smoke runs cleanly and prints all three stages
  • Existing factory-agent tests updated and passing

🤖 Generated with Claude Code

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

async plan(context: AgentContext): Promise<AgentAction[]> {
  let messages = this.buildMessages(context);

P1: Thread failed-test context into plan() retries

The new iterate prompt is only selected when both previousActions and iteration are passed, but the public entry point still calls buildMessages(context) with no extra arguments. That means a real follow-up plan(context) after a failed test run will never hit ticket-iterate; it falls back to ticket-implement and loses the prior actions plus failure details that the fix-up loop needs. Compared with the previous implementation, this regresses all test-failure retries from self-contained repair prompts to essentially a fresh first pass.

Contributor

Copilot AI left a comment

Pull request overview

This PR introduces a template-driven prompt system for the software-factory agent so the orchestrator’s LLM messages can be iterated via standalone markdown files rather than hardcoded strings.

Changes:

  • Added markdown prompt templates under packages/software-factory/prompts/ (system + implement/iterate/test stages + action schema + examples).
  • Implemented PromptLoader/FilePromptLoader and a minimal mustache-like interpolator with {{var}}, {{#each}}, and {{#if}}/{{else}}.
  • Updated OpenRouterFactoryAgent.buildMessages() to assemble one-shot [system,user] messages from templates and added a factory:prompt-smoke CLI plus unit tests.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| packages/software-factory/scripts/lib/factory-prompt-loader.ts | Implements file-backed prompt loading, interpolation, and prompt assembly helpers. |
| packages/software-factory/scripts/lib/factory-agent.ts | Switches buildMessages() from stubbed strings to template-based prompt construction. |
| packages/software-factory/prompts/system.md | Defines the shared system prompt (role/rules/realms/skills/tools + action schema). |
| packages/software-factory/prompts/ticket-implement.md | Template for initial implementation pass user prompt. |
| packages/software-factory/prompts/ticket-iterate.md | Template for iteration/fix pass user prompt (previous actions + failures + tool results). |
| packages/software-factory/prompts/ticket-test.md | Template for test generation pass user prompt. |
| packages/software-factory/prompts/action-schema.md | Provides the canonical action schema text embedded into the system prompt. |
| packages/software-factory/prompts/examples/* | Adds example input/output snippets for reference. |
| packages/software-factory/scripts/factory-prompt-smoke.ts | Adds CLI to print assembled prompts without any network/API key. |
| packages/software-factory/tests/factory-prompt-loader.test.ts | Adds unit/integration tests for interpolation and prompt assembly. |
| packages/software-factory/tests/factory-agent.test.ts | Updates tests to reflect new template-based message assembly and iterate-mode signature. |
| packages/software-factory/tests/index.ts | Registers the new test file in the suite. |
| packages/software-factory/package.json | Adds factory:prompt-smoke script. |

@habdelra
Contributor Author

Re: P1 — Thread failed-test context into plan() retries

Fixed in 3451842. Added previousActions and iteration as optional fields on AgentContext:

export interface AgentContext {
  // ...existing fields...
  /** Actions from the previous plan() call, fed back for iteration prompts. */
  previousActions?: AgentAction[];
  /** Current iteration number (1-based), set by the orchestrator. */
  iteration?: number;
}

plan() now threads these through to buildMessages():

async plan(context: AgentContext): Promise<AgentAction[]> {
  let messages = this.buildMessages(
    context,
    context.previousActions,
    context.iteration,
  );
  // ...
}

The orchestrator sets context.previousActions and context.iteration before calling plan(), and the iterate template receives real data instead of empty defaults.

Added an integration-style test (plan() uses iterate template when context has previousActions and testResults) that stubs fetch, calls plan() with a full iteration context, and verifies the LLM request body contains the iterate prompt with previous actions, iteration number, and test failure details.
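As a rough illustration of the behaviour described in this reply, the sketch below picks the user-prompt template from the context. It assumes, per the later commit note, that the presence of testResults alone selects the iterate template and that missing previousActions/iteration fall back to defaults. The `SketchContext` type, the `selectUserTemplate` helper, and the default values (`[]` and `1`) are hypothetical; how ticket-test is selected is not shown in this thread, so it is left out.

```typescript
// Hypothetical sketch; not the actual buildMessages() implementation.
type TemplateName = 'ticket-implement' | 'ticket-iterate';

interface SketchContext {
  testResults?: unknown; // shape of TestResult is not shown in this PR thread
  previousActions?: unknown[];
  iteration?: number;
}

interface TemplateChoice {
  template: TemplateName;
  previousActions: unknown[];
  iteration: number;
}

function selectUserTemplate(context: SketchContext): TemplateChoice {
  // No test results yet: treat this as the first implementation pass.
  if (context.testResults === undefined) {
    return { template: 'ticket-implement', previousActions: [], iteration: 1 };
  }
  // Test results present: use the self-contained iterate prompt, preferring
  // the orchestrator-supplied values over the assumed defaults.
  return {
    template: 'ticket-iterate',
    previousActions: context.previousActions ?? [],
    iteration: context.iteration ?? 1,
  };
}

console.log(
  selectUserTemplate({
    testResults: { status: 'failed' },
    previousActions: [{ type: 'create_file' }],
    iteration: 2,
  }).template,
);
```

Keeping the selection in one small pure function makes it easy to unit test without stubbing fetch, which complements the integration-style test mentioned above.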

@habdelra habdelra requested a review from a team March 22, 2026 21:40
@habdelra habdelra changed the base branch from cs-10476-define-factoryagent-interface-and-openrouter-integration-v2 to main March 23, 2026 15:17
Contributor

@backspace backspace left a comment

I ran the suggested commands and saw the suggested output. The actual volume of words here is massive though, am I meant to be reading this all? I only skimmed


# Realms

- Target realm: {{targetRealmUrl}}
Contributor

Where does this come from? I ran with BOXEL_ENVIRONMENT=hello so I’d expect to see http://realm-server.hello.localhost/user/personal but I got http://localhost:4201/user/personal/

but I know I’m probably the only one using this at the moment, so when the time comes, I’ll add support for this

Contributor Author

The target realm is the realm that you specified to create the project in. I don't think the software factory is aware of the Boxel environment work. I can add a ticket for that, though.

habdelra and others added 5 commits March 23, 2026 17:31
Implement the core FactoryAgent interface that decouples the orchestration
loop from any specific LLM. This is the foundational ticket for the
software factory execution loop.

- Define types: FactoryAgentConfig, AgentContext, AgentAction, ResolvedSkill,
  ToolManifest, TestResult, ToolResult, and placeholder card types
- Implement OpenRouterFactoryAgent with dual-path routing:
  - Direct path via OPENROUTER_API_KEY env var (simplest for local dev/CI)
  - Proxy path via realm server _request-forward (production, with billing)
  - Env var takes precedence over config over proxy
- Implement MockFactoryAgent for deterministic testing
- Add resolveFactoryModel() with CLI > env > FACTORY_DEFAULT_MODEL fallback
- Add action validation, response parsing with markdown fence stripping
- Add retry-once with error correction on malformed LLM responses
- Add smoke-test script (pnpm factory:agent-smoke) for manual verification
- 42 tests: unit tests + integration tests with mock HTTP servers

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Fix all prettier and qunit lint errors
- Trim and treat blank OPENROUTER_API_KEY as missing (avoids bypassing
  proxy with empty env var in CI)
- Pass authorization through as-is to avoid Bearer Bearer double-prefix
- Use new URL() for proxy URL construction (safe without trailing slash)
- Reject non-object toolArgs in validation instead of silently dropping
- Add tests for blank API key handling and invalid toolArgs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add markdown prompt templates and a PromptLoader that assembles one-shot
LLM messages from templates + runtime context. Each plan() call sends
exactly [system, user] — no multi-turn conversation.

- prompts/: system.md, ticket-implement.md, ticket-test.md,
  ticket-iterate.md, action-schema.md, and examples/
- PromptLoader: reads, caches, and interpolates templates with
  {{variable}}, {{#each}}, {{#if}}/{{else}} support
- OpenRouterFactoryAgent.buildMessages() now uses template-based assembly
- factory:prompt-smoke script for reviewing assembled prompts
- 37 new unit tests for interpolation, assembly, and integration

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…formatting

- buildMessages() now keys off context.testResults alone (with sensible
  defaults for previousActions/iteration) so iterate template is used
  whenever test results are present
- assembleImplementPrompt() includes tool results when present, so
  invoke_tool output is not silently dropped on re-plan
- Tool results propagate outputFormat from tool manifests — templates
  use ```text or ```json fences based on the tool's declared format
- Extract shared buildToolResultsData() helper used by both implement
  and iterate assembly functions
- Remove indentation from closing template tags ({{/each}}, {{/if}})
  to avoid stray whitespace in rendered prompts
- Add prompts/ to .prettierignore (template syntax conflicts with
  prettier's markdown list formatting)
- 5 new tests covering the above changes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add previousActions and iteration as optional fields on AgentContext so
the orchestrator can set them and plan() threads them to buildMessages().
This ensures the iterate template is used with real data during the
fix-up loop, not just sensible defaults.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@habdelra habdelra force-pushed the cs-10477-create-prompt-templates-for-agent-communication branch from 3451842 to 0a9d061 Compare March 23, 2026 21:33
@habdelra
Contributor Author

> I ran the suggested commands and saw the suggested output. The actual volume of words here is massive though, am I meant to be reading this all? I only skimmed

Skimming is fine; the point is that they are there. We will tweak these prompts as we iterate on the factory.

…t-templates-for-agent-communication

# Conflicts:
#	packages/software-factory/scripts/factory-agent-smoke.ts
#	packages/software-factory/scripts/lib/factory-agent.ts
#	packages/software-factory/tests/factory-agent.test.ts
…t-templates-for-agent-communication

# Conflicts:
#	packages/software-factory/package.json
@habdelra habdelra merged commit 23ebd63 into main Mar 24, 2026
18 checks passed