Fix isolated realm server teardown in software-factory test harness (CS-10472) by habdelra · Pull Request #4233 · cardstack/boxel

habdelra · 2026-03-23T00:38:26Z

Summary

Spawn ts-node directly instead of pnpm serve:realm to eliminate the intermediate wrapper process that broke process-group killing during teardown
Add port-free verification after stop() to confirm ports 4205/4201/4232 are released before the next test starts
Add process.on('exit') safety net in serve-realm.ts that force-kills child processes on unclean exit
Consolidate the bifurcated playwright configs (playwright.config.ts, playwright.e2e.config.ts, playwright.shared.ts) back into a single playwright.config.ts — all specs now run together under pnpm test:playwright
Remove the test:playwright-e2e workaround script from package.json
Remove the CS-10472 workaround comment from factory-target-realm.spec.ts
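
The first Summary item depends on process-group signalling. A minimal sketch of that idea, assuming a POSIX host and a child spawned as the leader of its own process group. `spawnInOwnGroup` and `killGroup` are hypothetical names for illustration, not the helpers in `fixtures.ts`:

```typescript
import { spawn, ChildProcess } from 'node:child_process';

// Spawn the command as the leader of a new process group (POSIX only).
export function spawnInOwnGroup(command: string, args: string[]): ChildProcess {
  return spawn(command, args, { detached: true, stdio: 'ignore' });
}

// Signal the whole group by passing the negative PID to process.kill().
// Returns false when there is nothing left to signal.
export function killGroup(
  child: ChildProcess,
  signal: NodeJS.Signals = 'SIGTERM',
): boolean {
  if (child.pid === undefined) {
    return false; // spawn failed, so no group exists
  }
  try {
    process.kill(-child.pid, signal);
    return true;
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === 'ESRCH') {
      return false; // group already exited
    }
    throw err;
  }
}
```

Group signalling only reaches processes that are still in the group led by the PID the harness holds, which is presumably why the harness now starts `ts-node` itself rather than going through a wrapper.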

Dependencies

Note: PR #4222 must be reviewed and merged first — this PR is based on that branch and builds on its test infrastructure.

Test plan

pnpm test:playwright — all 8 tests pass (darkfactory, factory-bootstrap, factory-target-realm specs running together)
Back-to-back runs succeed with no port conflicts
No orphaned processes after test completion (ss -tlnp shows no lingering listeners on 4201/4205/4232)
pnpm lint:js and pnpm lint:format pass

Add `bootstrapProjectArtifacts()` that creates Project, KnowledgeArticle, and Ticket cards in a target realm from a normalized brief. Content is derived deterministically from the brief's sections, tags, and summary. Stable slug-based IDs ensure idempotency: rerunning skips existing cards.

- New module `src/factory-bootstrap.ts` with full card generation logic
- Wired into `runFactoryEntrypoint` after target realm bootstrap
- 21 hermetic QUnit tests for artifact generation and idempotency
- 3 Playwright specs testing live realm card creation and rendering
- E2e subprocess test for full Matrix auth → realm creation → bootstrap flow
- CLI entrypoint now calls `process.exit()` to prevent hanging on open handles

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
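
The idempotency claim in this commit message can be illustrated with a small sketch. This is not the code in `src/factory-bootstrap.ts`; `slugify`, `planArtifacts`, and the `kind/slug` ID layout are assumptions made up for the example:

```typescript
// Derive a stable, URL-safe slug from a title (hypothetical rule set).
export function slugify(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse anything non-alphanumeric
    .replace(/^-+|-+$/g, ''); // trim leading/trailing dashes
}

export interface BootstrapPlan {
  toCreate: string[];
  skipped: string[];
}

// Split the desired cards into those to create and those that already exist.
export function planArtifacts(
  kind: string,
  titles: string[],
  existingIds: ReadonlySet<string>,
): BootstrapPlan {
  const plan: BootstrapPlan = { toCreate: [], skipped: [] };
  for (const title of titles) {
    const id = `${kind}/${slugify(title)}`; // assumed ID layout
    (existingIds.has(id) ? plan.skipped : plan.toCreate).push(id);
  }
  return plan;
}
```

Because the ID depends only on the input text, a second run over the same brief produces the same IDs and every card lands in `skipped`.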

…oject-artifact-bootstrap-from-a-brief

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Fix activeTicket to reflect actual in-progress ticket instead of always using tickets[0]; hasInProgressTicket now returns the active path
- Rename bootstrap summary fields: createdProject → projectId, createdTickets → ticketIds, createdKnowledgeArticles → knowledgeArticleIds
- Use actual activeTicket status instead of hardcoded 'in_progress'
- Wait for stdout flush before process.exit to prevent truncated output
- Add response status checks in matrix-auth test helpers
- Clear timeout timer in run-command when child exits
- Split e2e test into separate playwright config (pnpm test:playwright-e2e) with shared config extracted to playwright.shared.ts
- Remove test.fixme: e2e test runs via dedicated command

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

After _create-realm succeeds, append the new realm URL to the Matrix user's app.boxel.realms account data. This mirrors what the host app does when creating a realm through the UI — without this step, the realm won't appear in the user's realm list. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Set iconURL from first letter of realm name (Letter-{x}.png from boxel CDN)
- Set backgroundURL to a random curated image from boxel CDN
- Mirrors iconURLFor() and getRandomBackgroundURL() from host app
- Format card artifact JSON with 2-space indentation for readability

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The split module pattern (schema in one file, UI side-effects in another) caused custom templates to not render when cards adopted from the public module. The realm server's module loader could resolve the schema without executing the UI side-effects, leaving cards with default edit views. Combining everything into darkfactory.gts ensures the fitted/isolated/embedded templates are always co-located with their card class definitions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Fix flaky "Accept All" code patch test caused by Monaco diff editor race

  During test teardown, Monaco's WorkerBasedDocumentDiffProvider.computeDiff can receive a null result from the editor worker when models are disposed mid-computation. This caused an unhandled "no diff result available" error that surfaced as a flaky QUnit global failure in CI. Extend the existing Monaco patch to return an empty diff result instead of throwing, matching the pattern Monaco already uses for disposed models.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Narrow Monaco patch to only suppress error when models are disposed

  Address review feedback: instead of unconditionally returning an empty diff when the worker returns null, only suppress the error when the models are confirmed disposed (the teardown race). Genuine worker failures with live models still throw.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
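
The narrowed guard described in the second commit can be sketched as a standalone function. The interfaces below are hypothetical minimal shapes rather than Monaco's real types, and the empty-result fields are an assumption about what an "empty diff" looks like. The actual change lives in `patches/monaco-editor@0.52.2.patch`:

```typescript
// Hypothetical minimal shapes; Monaco's real types carry many more members.
interface DisposableModel {
  isDisposed(): boolean;
}

interface DiffResult {
  identical: boolean;
  quitEarly: boolean;
  changes: unknown[];
  moves: unknown[];
}

// Decide what to do with the worker's answer once it arrives.
export function resolveDiffResult(
  result: DiffResult | null,
  original: DisposableModel,
  modified: DisposableModel,
): DiffResult {
  if (result) {
    return result;
  }
  // Null result during teardown: a model was disposed mid-computation.
  if (original.isDisposed() || modified.isDisposed()) {
    return { identical: true, quitEarly: false, changes: [], moves: [] };
  }
  // Null result with live models is a genuine worker failure.
  throw new Error('no diff result available');
}
```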

Separate the "manually visit prerendered url" debug message into a dedicated `prerenderer-reproduce` log category so it can be enabled independently of the noisy `prerenderer` logs.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…racking hot path (#4223)

* Optimize prerender performance: eliminate URL() construction in dependency tracking hot path

  Flame chart profiling revealed that 98% of active CPU per frame during card prerendering was spent in the runtime dependency tracking system, primarily constructing URL objects. The render produced 22 identical ~9-second long tasks (one per card deserialization), totaling ~200 seconds of blocked main thread for a card with 23 linksToMany relationships.

  Three optimizations applied:

  1. trimModuleIdentifier (loader.ts): Replace `new URL(id).href` with string slice operations + a Map cache. Module identifiers are already full URL strings, so extension trimming only needs string ops. This was the single largest CPU consumer at 52.8% of active time (~5s per card).
  2. collectKnownModuleDependencies (loader.ts): Cache the flattened dependency set per module identifier. Once a module is evaluated its consumedModules never change, so repeated graph walks for the same module return the cached result. This turns O(cards × modules) into O(modules).
  3. trackRuntimeRelationshipModuleDependencies (card-api.gts): Track which modules have already had their full dep trees tracked and skip redundant getKnownConsumedModules() calls. This function was called on every linksTo field getter access during rendering, each time walking the full module dependency graph.

  Additionally, normalizeModuleURL/normalizeInstanceURL/canonicalURL in dependency-tracker.ts now use string operations instead of URL construction, eliminating another hot source of URL() calls in the tracking pipeline.

  Closes CS-10473

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Fix cached Set mutation and remove session-scoping issue in dep tracking

  Address review feedback:

  - getKnownConsumedModules: filter instead of delete to avoid mutating the cached Set returned by collectKnownModuleDependencies
  - Remove trackedRelationshipModules skip cache from card-api.gts: it was process-global and not cleared between dependency tracking sessions, which could cause subsequent renders to under-report module deps. The Loader-level caching in collectKnownModuleDependencies already makes getKnownConsumedModules fast enough without a caller-side skip.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
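
Two of the ideas in these commits, string-based trimming behind a Map cache and filtering a cached Set instead of mutating it, can be sketched as follows. The extension list and the `withoutModule` helper are assumptions for illustration, and the real `loader.ts` may differ in both:

```typescript
// Assumed list of executable extensions; the real loader may use another set.
const EXECUTABLE_EXTENSIONS = ['.gts', '.gjs', '.ts', '.js'];

const trimCache = new Map<string, string>();

// Strip a known extension using string operations only (no URL construction).
export function trimModuleIdentifier(id: string): string {
  const cached = trimCache.get(id);
  if (cached !== undefined) {
    return cached;
  }
  let trimmed = id;
  for (const ext of EXECUTABLE_EXTENSIONS) {
    if (id.endsWith(ext)) {
      trimmed = id.slice(0, -ext.length);
      break;
    }
  }
  trimCache.set(id, trimmed);
  return trimmed;
}

// Build a new Set without one entry, leaving the cached Set untouched.
export function withoutModule(
  cachedDeps: ReadonlySet<string>,
  excluded: string,
): Set<string> {
  return new Set(Array.from(cachedDeps).filter((dep) => dep !== excluded));
}
```

Returning a fresh Set matters because the cached one is shared by every later caller. Deleting from it in place would silently shrink the dependency list seen by subsequent lookups.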

* Software factory plan refinements
* updated with note of orchestrator vs agent calling boxel apis
* more refinement
* lint

…CS-10472)

Spawn ts-node directly instead of pnpm to eliminate the intermediate wrapper process that broke process-group killing. Add port-free verification after stop() and a process.on('exit') safety net in serve-realm.ts. Consolidate playwright configs back to a single file now that all specs can run together.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
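
The `process.on('exit')` safety net mentioned here could look roughly like the sketch below. It assumes the parent tracks child PIDs and a clean-exit flag; `forceKillAll` and `installExitSafetyNet` are hypothetical names, not the code in `serve-realm.ts`:

```typescript
type KillFn = (pid: number, signal: NodeJS.Signals) => void;

// SIGKILL every tracked PID, ignoring processes that are already gone.
// Returns the PIDs that were actually signalled.
export function forceKillAll(
  pids: number[],
  kill: KillFn = (pid, signal) => {
    process.kill(pid, signal);
  },
): number[] {
  const killed: number[] = [];
  for (const pid of pids) {
    try {
      kill(pid, 'SIGKILL');
      killed.push(pid);
    } catch {
      // Process already exited (or cannot be signalled); nothing to do.
    }
  }
  return killed;
}

// 'exit' listeners must be synchronous, which process.kill() is.
export function installExitSafetyNet(
  getPids: () => number[],
  isCleanExit: () => boolean,
): void {
  process.on('exit', () => {
    if (!isCleanExit()) {
      forceKillAll(getPids());
    }
  });
}
```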

chatgpt-codex-connector · 2026-03-23T00:38:32Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

Copilot

Pull request overview

This PR primarily hardens the packages/software-factory Playwright test harness teardown for isolated realm servers (CS-10472), aiming to prevent orphaned processes and port conflicts between specs, and it simplifies Playwright configuration so all specs run together.

Changes:

Replace pnpm serve:* wrapper process spawning with direct ts-node CLI invocation to improve process-group teardown reliability.
Add post-stop() verification that key ports are released before subsequent tests start.
Consolidate Playwright configs/scripts so all specs run under pnpm test:playwright.

Reviewed changes

Copilot reviewed 17 out of 18 changed files in this pull request and generated 4 comments.


| File | Description |
| --- | --- |
| pnpm-lock.yaml | Updates patch hash for monaco-editor patch contents. |
| patches/monaco-editor@0.52.2.patch | Suppresses Monaco diff worker null-result error when models are disposed mid-computation. |
| packages/software-factory/tests/fixtures.ts | Spawns `ts-node` directly and adds `waitForPortFree()` checks after teardown. |
| packages/software-factory/tests/factory-target-realm.spec.ts | Removes CS-10472 workaround comment now that teardown is addressed. |
| packages/software-factory/src/harness.ts | Exposes child process PIDs from the isolated stack for cleanup. |
| packages/software-factory/src/cli/serve-realm.ts | Adds `process.on('exit')` safety net to SIGKILL child PIDs on unclean exit. |
| packages/software-factory/playwright.shared.ts | Deleted; config consolidated into a single Playwright config. |
| packages/software-factory/playwright.global-setup.ts | Spawns `ts-node` directly for `serve-support`. |
| packages/software-factory/playwright.e2e.config.ts | Deleted; e2e tests now run with the main config. |
| packages/software-factory/playwright.config.ts | Single unified config that runs all `*.spec.ts` together. |
| packages/software-factory/package.json | Removes `test:playwright-e2e` script. |
| packages/software-factory/docs/software-factory-testing-strategy.md | Updates docs to include a “test realm” and AI-generated test loop. |
| packages/software-factory/docs/one-shot-factory-go-plan.md | Expands docs around test realm + implement→test→iterate loop; adds model flag note. |
| packages/runtime-common/loader.ts | Adds caching for known module dependencies and uses string-based module-id trimming. |
| packages/runtime-common/dependency-tracker.ts | Replaces URL-constructor heavy normalization with string-based normalization. |
| packages/realm-server/prerender/render-runner.ts | Routes reproduce log line to a separate logger name. |
| packages/base/card-api.gts | Updates comment to reflect loader perf improvements. |
| .github/workflows/pr-review-reminder.yml | Adds scheduled workflow to post awaiting-review PRs to Discord. |

Files not reviewed (1)

pnpm-lock.yaml: Language not supported

Comments suppressed due to low confidence (1)

packages/software-factory/src/cli/serve-realm.ts:62

The SIGINT/SIGTERM handlers call the async stop() without handling rejections. If runtime.stop() throws, this becomes an unhandled promise rejection and may leave the process running without exiting (and without triggering the exit cleanup). Consider attaching a .catch() that logs and sets a non-zero exit code, and/or setting cleanExit intent earlier so the exit handler behavior is deterministic.

  let stop = async () => {
    await runtime.stop();
    cleanExit = true;
    process.exit(0);
  };

  process.on('SIGINT', () => void stop());
  process.on('SIGTERM', () => void stop());
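
One way to act on the suggestion above is to wrap the shutdown in a handler that catches rejections and exits non-zero. This is a sketch under assumptions: `makeSignalHandler` is a hypothetical helper, and the `stop`, `exit`, and `log` callbacks are injected so the behaviour can be exercised without killing the process:

```typescript
// Build a signal handler that never leaves a rejected stop() unhandled.
export function makeSignalHandler(
  stop: () => Promise<void>,
  exit: (code: number) => void,
  log: (message: string, err: unknown) => void = console.error,
): () => Promise<void> {
  let stopping = false;
  return async () => {
    if (stopping) {
      return; // ignore repeated signals while shutdown is in flight
    }
    stopping = true;
    try {
      await stop();
      exit(0);
    } catch (err) {
      log('serve-realm: stop() failed during shutdown', err);
      exit(1); // non-zero so callers can tell teardown was unclean
    }
  };
}
```

In the snippet above this would be wired as `const onSignal = makeSignalHandler(() => runtime.stop(), (code) => process.exit(code));` followed by `process.on('SIGINT', () => void onSignal());`, with the `cleanExit` flag set only on the success path.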


packages/software-factory/tests/fixtures.ts

packages/runtime-common/loader.ts

.github/workflows/pr-review-reminder.yml

Only retry on EADDRINUSE; propagate other errors immediately. Always close the server handle on error to prevent handle leaks. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
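
The retry policy in this commit message can be sketched as a probe that tries to bind the port and retries only while the bind fails with `EADDRINUSE`. This is an illustration, not the `waitForPortFree()` in `fixtures.ts`; the option names, defaults, and the `isAddressInUse` helper are assumptions:

```typescript
import { createServer } from 'node:net';

// True only for the "port still taken" case that is worth retrying.
export function isAddressInUse(err: unknown): boolean {
  return (
    typeof err === 'object' &&
    err !== null &&
    (err as NodeJS.ErrnoException).code === 'EADDRINUSE'
  );
}

// Try to bind once; always release the server handle, success or failure.
function tryBind(port: number, host: string): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    const server = createServer();
    server.once('error', (err) => {
      server.close(); // avoid leaking the handle on error
      reject(err);
    });
    server.listen(port, host, () => {
      server.close(() => resolve());
    });
  });
}

export interface WaitOptions {
  host?: string;
  timeoutMs?: number;
  intervalMs?: number;
}

export async function waitForPortFree(
  port: number,
  { host = '127.0.0.1', timeoutMs = 10_000, intervalMs = 100 }: WaitOptions = {},
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    try {
      await tryBind(port, host);
      return; // bind succeeded, so the port is free
    } catch (err) {
      if (!isAddressInUse(err)) {
        throw err; // e.g. EACCES: retrying will not help
      }
      if (Date.now() >= deadline) {
        throw new Error(`port ${port} still in use after ${timeoutMs}ms`);
      }
      await new Promise<void>((r) => setTimeout(r, intervalMs));
    }
  }
}
```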

github-actions · 2026-03-23T01:15:00Z

Host Test Results

1 files, 1 suites, 2h 6m 51s ⏱️
2 030 tests: 2 015 ✅, 15 💤, 0 ❌
2 045 runs: 2 030 ✅, 15 💤, 0 ❌

Results for commit 7be0c51.

♻️ This comment has been updated with latest results.

…rver-teardown

# Conflicts:
# packages/runtime-common/loader.ts
# packages/software-factory/tests/factory-target-realm.spec.ts

github-actions · 2026-03-23T15:21:15Z

Preview deployments

habdelra and others added 13 commits March 20, 2026 12:36

Merge remote-tracking branch 'origin/main' into cs-10449-implement-pr…

1c607a1

Reference CS-10472 in fixme'd e2e test comment

40d6401

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

ci: Add workflow for weekday PR review summaries (#4226)

868bb42

Software factory plan refinements (#4227)

7b1947e

habdelra requested a review from Copilot March 23, 2026 00:39

Copilot started reviewing on behalf of habdelra March 23, 2026 00:39

Copilot AI reviewed Mar 23, 2026

packages/software-factory/tests/fixtures.ts (outdated)

packages/software-factory/tests/fixtures.ts

packages/runtime-common/loader.ts

.github/workflows/pr-review-reminder.yml

Improve waitForPortFree error handling per review feedback

ea1b9e6

habdelra requested a review from a team March 23, 2026 00:53

Merge remote-tracking branch 'origin/main' into cs-10472-fix-realm-se…

7be0c51

habdelra changed the base branch from cs-10449-implement-project-artifact-bootstrap-from-a-brief to main March 23, 2026 15:15

backspace approved these changes Mar 23, 2026

habdelra merged commit 86cb8e6 into main Mar 23, 2026
119 of 120 checks passed

habdelra merged 15 commits into main from cs-10472-fix-realm-server-teardown

3 participants
