Define FactoryAgent interface and OpenRouter integration (CS-10476) by habdelra · Pull Request #4229 · cardstack/boxel

habdelra · 2026-03-22T19:21:25Z

Note: PR #4222 (CS-10449: Implement project artifact bootstrap from a brief) should be reviewed and merged first — this PR builds on the same software-factory package and the test index will need its imports once both land.

Summary

Define the core FactoryAgent interface and types (AgentContext, AgentAction, ToolManifest, TestResult, etc.) that decouple the orchestration loop from any specific LLM
Implement OpenRouterFactoryAgent with dual-path routing: direct API key (local dev) or realm server _request-forward proxy (production with billing)
Implement MockFactoryAgent for deterministic testing
Add smoke-test script for manual CLI verification
42 new tests (unit + integration with mock HTTP servers)
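
To illustrate the decoupling described above, here is a minimal sketch of an orchestration step that depends only on an agent interface, with a mock returning canned actions. Only the type names and the `plan()` method name come from this PR. The field names, the shape of `AgentContext` (including `brief`), the `runOnce` helper, and the simplified mock are assumptions and may not match the code in `factory-agent.ts`.

```typescript
// Hypothetical shapes for illustration; the real definitions in
// packages/software-factory/scripts/lib/factory-agent.ts may differ.
type FileActionType = "create_file" | "update_file" | "create_test" | "update_test";

type AgentAction =
  | { type: FileActionType; path: string; content: string; realm: "target" | "test" }
  | { type: "done" };

interface AgentContext {
  // Assumed field: a free-form description of the work to plan.
  brief: string;
}

interface FactoryAgent {
  plan(context: AgentContext): Promise<AgentAction[]>;
}

// Simplified deterministic stand-in: returns the queued responses in order.
class MockFactoryAgent implements FactoryAgent {
  private readonly responses: AgentAction[][];
  private calls = 0;

  constructor(responses: AgentAction[][]) {
    this.responses = responses;
  }

  async plan(_context: AgentContext): Promise<AgentAction[]> {
    const next = this.responses[this.calls];
    if (next === undefined) {
      throw new Error(`MockFactoryAgent: no response queued for call ${this.calls + 1}`);
    }
    this.calls += 1;
    return next;
  }
}

// The caller only sees FactoryAgent, so any implementation can be swapped in.
async function runOnce(agent: FactoryAgent, brief: string): Promise<string[]> {
  const actions = await agent.plan({ brief });
  return actions.map((action) => action.type);
}
```

Because `runOnce` takes a `FactoryAgent`, a test can pass the mock while production code passes an LLM-backed implementation.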

Try it out

After checking out this branch, you can verify the full round-trip to an LLM:

cd packages/software-factory

# Set your OpenRouter API key and run the smoke test:
OPENROUTER_API_KEY=sk-or-v1-YOUR_KEY_HERE \
  pnpm factory:agent-smoke \
  --realm-server-url https://realms-staging.stack.cards/

# Optionally override the model (defaults to anthropic/claude-sonnet-4):
OPENROUTER_API_KEY=sk-or-v1-YOUR_KEY_HERE \
  pnpm factory:agent-smoke \
  --realm-server-url https://realms-staging.stack.cards/ \
  --model anthropic/claude-opus-4

Expected output:

Model: anthropic/claude-sonnet-4                  ← resolved from default constant
Realm server: https://realms-staging.stack.cards/ ← from your --realm-server-url flag

Sending plan() request...

Received 3 action(s):                             ← count will vary (real LLM response)
[
  {                                               ┐
    "type": "create_file",                        │
    "path": "HelloWorld/hello-world.gts",         │ ← LLM-generated actions
    "content": "export class HelloWorld ...",      │   (content, paths, and number
    "realm": "target"                             │    of actions will differ on
  },                                              │    every run)
  {                                               │
    "type": "create_test",                        │
    "path": "TestSpec/hello-world.spec.ts",       │
    "content": "test('renders hello', ...",       │
    "realm": "test"                               │
  },                                              │
  {                                               │
    "type": "done"                                │
  }                                               ┘
]

Smoke test passed.                                ← confirms response was valid JSON
                                                    and parsed as AgentAction[]

The lines between [ and ] are the raw AgentAction[] returned by the LLM — the exact actions, file paths, and content will be different every time since it's a real model response. What matters is:

You see a valid JSON array printed
Each action has a valid type (one of: create_file, update_file, create_test, update_test, invoke_tool, done, etc.)
The script exits with "Smoke test passed." (meaning the response was successfully parsed and validated)
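
The checks above can be sketched roughly as follows. This is not the PR's validator: the function name `validateActions`, the error strings, and the exact set of accepted types are assumptions (the list above ends in "etc.", so the real set is larger). The `toolArgs` rule follows the later commit in this PR that rejects non-object `toolArgs`.

```typescript
// Action types named in the PR description; the real set is larger.
const KNOWN_ACTION_TYPES = new Set([
  "create_file",
  "update_file",
  "create_test",
  "update_test",
  "invoke_tool",
  "done",
]);

// Returns a list of problems; an empty list means the payload looks valid.
function validateActions(payload: unknown): string[] {
  if (!Array.isArray(payload)) {
    return ["response is not a JSON array"];
  }
  const errors: string[] = [];
  payload.forEach((item, index) => {
    if (typeof item !== "object" || item === null || Array.isArray(item)) {
      errors.push(`action ${index} is not an object`);
      return;
    }
    const action = item as Record<string, unknown>;
    if (typeof action.type !== "string" || !KNOWN_ACTION_TYPES.has(action.type)) {
      errors.push(`action ${index} has unknown type ${JSON.stringify(action.type)}`);
    }
    // Non-object toolArgs are reported rather than silently dropped.
    if ("toolArgs" in action) {
      const args = action.toolArgs;
      if (typeof args !== "object" || args === null || Array.isArray(args)) {
        errors.push(`action ${index} has non-object toolArgs`);
      }
    }
  });
  return errors;
}
```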

Run the tests:

cd packages/software-factory
pnpm test:node

Test plan

37 unit tests: action validation, response parsing, model resolution, mock agent, message assembly, API path selection
5 integration tests: full round-trip for both proxy and direct paths using local mock HTTP servers, error handling, retry on malformed response
TypeScript compiles cleanly (tsc --noEmit)
Manual smoke test with real OpenRouter API key
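
The "retry on malformed response" behavior exercised by the integration tests could look roughly like the sketch below. The names `planWithRetry`, `SendFn`, and `ChatMessage`, and the wording of the correction prompt, are hypothetical. The sketch only assumes what the PR states: one retry, with the error fed back to the model.

```typescript
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };
type SendFn = (messages: ChatMessage[]) => Promise<string>;

// Sends the conversation; if the reply is not a JSON array, appends the bad
// reply plus a correction request and tries exactly one more time.
async function planWithRetry(send: SendFn, messages: ChatMessage[]): Promise<unknown[]> {
  const parse = (raw: string): unknown[] => {
    const parsed: unknown = JSON.parse(raw);
    if (!Array.isArray(parsed)) {
      throw new Error("response is not a JSON array");
    }
    return parsed;
  };

  const first = await send(messages);
  try {
    return parse(first);
  } catch (error) {
    const reason = error instanceof Error ? error.message : String(error);
    const corrected = await send([
      ...messages,
      { role: "assistant", content: first },
      {
        role: "user",
        content: `Your last reply was invalid (${reason}). Reply with only a JSON array of actions.`,
      },
    ]);
    // A second failure propagates to the caller; there is no further retry.
    return parse(corrected);
  }
}
```

Passing `send` as a parameter keeps the retry logic testable without a network call, which is the same idea as the mock HTTP servers used in the integration tests.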

🤖 Generated with Claude Code

Linear: https://linear.app/cardstack/issue/CS-10476

Implement the core FactoryAgent interface that decouples the orchestration loop from any specific LLM. This is the foundational ticket for the software factory execution loop.

- Define types: FactoryAgentConfig, AgentContext, AgentAction, ResolvedSkill, ToolManifest, TestResult, ToolResult, and placeholder card types
- Implement OpenRouterFactoryAgent with dual-path routing:
  - Direct path via OPENROUTER_API_KEY env var (simplest for local dev/CI)
  - Proxy path via realm server _request-forward (production, with billing)
  - Env var takes precedence over config over proxy
- Implement MockFactoryAgent for deterministic testing
- Add resolveFactoryModel() with CLI > env > FACTORY_DEFAULT_MODEL fallback
- Add action validation, response parsing with markdown fence stripping
- Add retry-once with error correction on malformed LLM responses
- Add smoke-test script (pnpm factory:agent-smoke) for manual verification
- 42 tests: unit tests + integration tests with mock HTTP servers

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
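
Two of the items in this commit message, the model fallback order and the fence stripping, can be sketched as below. This is an illustration under assumptions, not the PR's `resolveFactoryModel()`: the function names, the parameter shapes, and the choice to treat blank strings as unset are hypothetical. The default model value is the one shown in the smoke-test instructions.

```typescript
// Default taken from the smoke-test instructions in this PR.
const FACTORY_DEFAULT_MODEL = "anthropic/claude-sonnet-4";

// CLI flag wins, then the environment value, then the default constant.
// Blank strings are treated as unset (an assumption of this sketch).
function resolveModel(cliModel?: string, envModel?: string): string {
  for (const candidate of [cliModel, envModel]) {
    const trimmed = candidate?.trim();
    if (trimmed) {
      return trimmed;
    }
  }
  return FACTORY_DEFAULT_MODEL;
}

// Three backticks, built indirectly so this example nests safely in a fence.
const FENCE = "`".repeat(3);

// Removes one surrounding Markdown code fence (with optional language tag).
function stripMarkdownFence(text: string): string {
  const trimmed = text.trim();
  const firstNewline = trimmed.indexOf("\n");
  const isFenced =
    trimmed.length >= 2 * FENCE.length &&
    trimmed.startsWith(FENCE) &&
    trimmed.endsWith(FENCE) &&
    firstNewline !== -1;
  if (!isFenced) {
    return trimmed;
  }
  // Drop the opening fence line and the closing fence.
  return trimmed.slice(firstNewline + 1, trimmed.length - FENCE.length).trim();
}

function parseActionsJson(raw: string): unknown[] {
  const parsed: unknown = JSON.parse(stripMarkdownFence(raw));
  if (!Array.isArray(parsed)) {
    throw new Error("expected a JSON array of actions");
  }
  return parsed;
}
```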

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5759f0f53f

packages/software-factory/scripts/lib/factory-agent.ts

Copilot

Pull request overview

Introduces a new “FactoryAgent” abstraction in packages/software-factory and provides an OpenRouter-backed implementation (with direct-key and realm-server-proxy routing), plus a mock agent and new unit/integration tests to validate parsing, validation, and request routing.

Changes:

Added FactoryAgent types/utilities (action validation + response parsing) and OpenRouterFactoryAgent / MockFactoryAgent.
Added a CLI smoke-test script for manual end-to-end verification against OpenRouter.
Added unit + integration tests (including local HTTP stubs) and wired them into the test index.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
packages/software-factory/scripts/lib/factory-agent.ts	Defines core agent types, action parsing/validation, OpenRouter implementation (direct/proxy), and mock agent.
packages/software-factory/scripts/factory-agent-smoke.ts	Adds a manual CLI smoke test for exercising `OpenRouterFactoryAgent.plan()`.
packages/software-factory/tests/factory-agent.test.ts	Adds unit tests for action validation, response parsing, model resolution, and request path selection.
packages/software-factory/tests/factory-agent.integration.test.ts	Adds integration tests using local HTTP servers for both proxy and direct call paths (including retry behavior).
packages/software-factory/tests/index.ts	Registers the new agent test modules.
packages/software-factory/package.json	Adds `factory:agent-smoke` script entry.

packages/software-factory/scripts/lib/factory-agent.ts

- Fix all prettier and qunit lint errors
- Trim and treat blank OPENROUTER_API_KEY as missing (avoids bypassing proxy with empty env var in CI)
- Pass authorization through as-is to avoid Bearer Bearer double-prefix
- Use new URL() for proxy URL construction (safe without trailing slash)
- Reject non-object toolArgs in validation instead of silently dropping
- Add tests for blank API key handling and invalid toolArgs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
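
The key-trimming and URL-construction fixes in this commit can be sketched as follows. The `Route` type and `chooseRoute` function are hypothetical names, and trimming the config key as well as the env var is an assumption. The precedence (env var, then config, then proxy) and the `_request-forward` path come from the PR text.

```typescript
type Route =
  | { kind: "direct"; apiKey: string }
  | { kind: "proxy"; url: string };

// Precedence from the PR: env var, then config, then the realm server proxy.
function chooseRoute(
  envKey: string | undefined,
  configKey: string | undefined,
  realmServerUrl: string,
): Route {
  for (const candidate of [envKey, configKey]) {
    // A blank or whitespace-only key counts as missing, so an empty
    // OPENROUTER_API_KEY in CI does not bypass the proxy.
    const apiKey = candidate?.trim();
    if (apiKey) {
      return { kind: "direct", apiKey };
    }
  }
  // For an origin-only base URL, new URL() gives the same result with or
  // without a trailing slash. A base that includes a path segment resolves
  // differently, so this sketch assumes an origin-only realm server URL.
  return { kind: "proxy", url: new URL("_request-forward", realmServerUrl).href };
}
```

String concatenation would produce `https://host_request-forward` when the trailing slash is missing, which is the failure mode `new URL()` avoids here.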

…ryagent-interface-and-openrouter-integration-v2 # Conflicts: # packages/software-factory/tests/index.ts

backspace · 2026-03-23T15:53:31Z

❯ pnpm factory:agent-smoke -- \
  --realm-server-url https://realms-staging.stack.cards/

> @cardstack/software-factory@1.0.0 factory:agent-smoke /Users/b/Documents/Cardstack/Code/boxel-motion/packages/software-factory
> NODE_NO_WARNINGS=1 ts-node --transpileOnly scripts/factory-agent-smoke.ts -- --realm-server-url https://realms-staging.stack.cards/

Smoke test failed: TypeError: Unexpected argument '--realm-server-url'. This command does not take positional arguments

I confirmed I’m on 6205d52

pnpm passes '--' through to ts-node which forwards it to the script, so process.argv contains ['--', '--realm-server-url', ...]. parseArgs with strict: true rejects this. Strip the leading '--' like the factory-entrypoint already does.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

habdelra · 2026-03-23T22:18:42Z

Fixed in dcea592 -- the -- that pnpm passes through to ts-node was being forwarded as a literal arg to parseArgs, which rejected it under strict: true. Now strips the leading -- the same way factory-entrypoint.ts already does.

Verified locally:

$ OPENROUTER_API_KEY=sk-or-... pnpm factory:agent-smoke -- --realm-server-url http://localhost:4201/

Model: anthropic/claude-sonnet-4
Realm server: http://localhost:4201/

Sending plan() request...

Received 4 action(s):
[
  { "type": "create_file", "path": "hello.py", ... },
  { "type": "create_test", "path": "test_hello.py", ... },
  { "type": "update_ticket", ... },
  { "type": "done" }
]

Smoke test passed.

The -- separator is not needed since pnpm doesn't consume --realm-server-url or --model. The script still strips a leading -- defensively in case users include it out of habit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
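
The fix discussed in this thread can be sketched with Node's `util.parseArgs`. The helper names `stripLeadingSeparator` and `parseSmokeArgs` are hypothetical and the real script may structure this differently. The two flags are the ones shown in the smoke-test instructions.

```typescript
import { parseArgs } from "node:util";

// pnpm may forward a literal "--" to the script. parseArgs treats everything
// after "--" as positional arguments, which strict mode rejects by default.
function stripLeadingSeparator(argv: string[]): string[] {
  return argv[0] === "--" ? argv.slice(1) : argv;
}

function parseSmokeArgs(argv: string[]) {
  const { values } = parseArgs({
    args: stripLeadingSeparator(argv),
    options: {
      "realm-server-url": { type: "string" },
      model: { type: "string" },
    },
    strict: true,
  });
  return values;
}
```

With this in place, both `pnpm factory:agent-smoke -- --realm-server-url ...` and the same command without the separator reach `parseArgs` with identical arguments.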

habdelra requested a review from Copilot March 22, 2026 19:25

Copilot started reviewing on behalf of habdelra March 22, 2026 19:25

chatgpt-codex-connector bot reviewed Mar 22, 2026

Copilot AI reviewed Mar 22, 2026

This was referenced Mar 22, 2026

Create prompt templates for agent communication (CS-10477) #4230

Open

Implement skill loader and resolver for agent context (CS-10478) #4231

Open

Implement tool registry and executor for agent actions (CS-10479) #4232

Open

habdelra requested a review from a team March 22, 2026 22:22

Merge remote-tracking branch 'origin/main' into cs-10476-define-facto…

6205d52

…ryagent-interface-and-openrouter-integration-v2 # Conflicts: # packages/software-factory/tests/index.ts

habdelra wants to merge 5 commits into main from cs-10476-define-factoryagent-interface-and-openrouter-integration-v2


3 participants
