Generate boxel-ui component Specs for AI agent discovery #4809

Open
jurgenwerk wants to merge 14 commits into main from cs-10527-component-specs-for-searchable-reusable-ui-components

Conversation

Contributor

@jurgenwerk jurgenwerk commented May 13, 2026

Summary

@cardstack/boxel-ui components are now searchable and reusable by the software factory agent. Every component (Button, Modal, Pill, Accordion, …) gets a generated Spec card in the catalog realm with its API, an example, and a keyword-rich description — so an agent that needs "a primary action button" or "collapsible disclosure" finds the right component instead of hand-rolling raw HTML.

Resolves CS-10527.

How it works

  1. Every @cardstack/boxel-ui component ships a usage.gts file with a <FreestyleUsage> block documenting its API. That is the source of truth.
  2. A generator (packages/boxel-ui/addon/bin/generate-component-specs.mjs) walks those usage.gts files and emits one Spec JSON per component into the catalog tree.
  3. The realm-server runs the generator at deploy time, between pulling boxel-catalog and rsyncing into /persistent/catalog/. The generated specs are build artifacts of boxel-ui — never committed to any repo.
  4. When the factory agent is asked to build a card with UI, it searches the catalog for specType: 'component', reads the matching spec's readMe, and imports the component from @cardstack/boxel-ui/components instead of writing <button> / <input> / <details>.
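Step 2 can be pictured roughly as follows. This is a minimal sketch, not the actual generator: the `ComponentSpec` shape, the `readStringAttr` and `buildSpec` helpers, and the fallback wording are all assumptions made for illustration, and the real script handles far more (args, CSS vars, the example block).

```typescript
// Hypothetical Spec shape; the real catalog schema may differ.
interface ComponentSpec {
  specType: "component";
  title: string;
  cardDescription: string;
  ref: { module: string; name: string };
}

// Read a quoted attribute such as @name="Button" from an opening tag.
function readStringAttr(openTag: string, name: string): string | undefined {
  const match = new RegExp(`@${name}=(?:"([^"]*)"|'([^']*)')`).exec(openTag);
  return match ? (match[1] ?? match[2]) : undefined;
}

// Build one spec from the first <FreestyleUsage> block in a usage.gts source.
function buildSpec(usageSource: string): ComponentSpec | undefined {
  const open = /<FreestyleUsage\b[^>]*>/.exec(usageSource);
  if (!open) return undefined;
  const name = readStringAttr(open[0], "name");
  if (!name) return undefined;
  return {
    specType: "component",
    title: name,
    // Placeholder wording is illustrative, not the generator's actual fallback.
    cardDescription:
      readStringAttr(open[0], "description") ?? `${name}: boxel-ui component`,
    ref: { module: "@cardstack/boxel-ui/components", name },
  };
}

const sample = `<FreestyleUsage @name="Button" @description="Primary action button">`;
console.log(JSON.stringify(buildSpec(sample), null, 2));
```

The regex-based attribute read is only good enough for simple opening tags; a production generator would more likely parse the template properly.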

Opt-in for now

The agent-side behavior is gated on --enable-boxel-ui-discovery. Without the flag, the factory runs exactly as it did before — no skill loaded, no catalog references in the system prompt, the cross-realm prohibition still names catalog explicitly. With the flag, the discovery skill and a system-prompt exception block become active and the catalog drops out of the prohibition list.
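As a rough illustration of the gating, the sketch below shows a flag that only changes behavior when present. `parseArgs`, `resolveSkills`, and the `FactoryOptions` shape are hypothetical stand-ins, not the factory's real `parseFactoryEntrypointArgs` or `DefaultSkillResolver`; only the flag name and the skill name come from this PR.

```typescript
// Hypothetical options shape for illustration only.
interface FactoryOptions {
  enableBoxelUiDiscovery: boolean;
  rest: string[];
}

const DISCOVERY_FLAG = "--enable-boxel-ui-discovery";

// Split the opt-in flag from the remaining arguments.
function parseArgs(argv: string[]): FactoryOptions {
  return {
    enableBoxelUiDiscovery: argv.includes(DISCOVERY_FLAG),
    rest: argv.filter((arg) => arg !== DISCOVERY_FLAG),
  };
}

// With the flag off, the skill list is returned unchanged.
function resolveSkills(base: string[], options: FactoryOptions): string[] {
  return options.enableBoxelUiDiscovery
    ? [...base, "boxel-ui-component-discovery"]
    : [...base];
}

const options = parseArgs([DISCOVERY_FLAG, "brief.json"]);
console.log(resolveSkills(["boxel-development"], options));
```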

One change applies regardless of the flag: MAX_TOOL_USE_TURNS is raised from 50 → 100. The discovery loop adds a few extra turns (search → read spec → implement → self-audit), and the previous cap was tight enough that an opt-in run could bail at the limit. Non-discovery briefs already finish well under 50, so the bump is a no-op for them.

Bonus dev-env fix

Uncovered while testing locally: the catalog file-watcher's reindex path crashes with Cannot determine realm owner for realm http://localhost:42XX/catalog/ when the realm-server boots in dev with stale HTTP-canonical rows in realm_registry left over from the http://→https:// canonical swap (migration 1779100257124). The earlier migration rewrote URL substrings across most tables but left the registry rows in place; realm-server's bootstrap then re-inserted the HTTPS rows alongside, and the HTTP duplicates were orphaned without matching realm_user_permissions.

A small follow-up migration in this PR (1779972468714_remove-stale-http-canonical-realm-registry-rows.js) deletes HTTP-canonical localhost rows in realm_registry that have no matching permission row at that URL. This covers both HTTPS-sibling duplicates and fully retired realms such as legacy-catalog. It is a no-op in staging and production (real hostnames, never localhost). The bug predates this PR; the fix is bundled here so that anyone pulling the branch has a working local realm-server.
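The selection rule can be expressed as a pure function over the two tables. This is a sketch of the rule only, not the migration itself: the row shapes, the `url` and `realm_url` column names, and the port 4201 sample data are assumptions.

```typescript
// Hypothetical row shapes; the real column names may differ.
interface RegistryRow {
  url: string;
}
interface PermissionRow {
  realm_url: string;
}

// Return registry URLs that are HTTP-canonical localhost rows with no
// permission row at exactly that URL.
function staleRegistryUrls(
  registry: RegistryRow[],
  permissions: PermissionRow[],
): string[] {
  const permitted = new Set(permissions.map((row) => row.realm_url));
  return registry
    .map((row) => row.url)
    .filter((url) => url.startsWith("http://localhost:") && !permitted.has(url));
}

const registry: RegistryRow[] = [
  { url: "http://localhost:4201/catalog/" }, // duplicate of the HTTPS row
  { url: "https://localhost:4201/catalog/" },
  { url: "http://localhost:4201/legacy-catalog/" }, // retired realm
  { url: "http://localhost:4201/http-only/" }, // still has HTTP permissions
];
const permissions: PermissionRow[] = [
  { realm_url: "https://localhost:4201/catalog/" },
  { realm_url: "http://localhost:4201/http-only/" },
];
console.log(staleRegistryUrls(registry, permissions));
```

Keying on the permission table rather than on the presence of an HTTPS sibling is what lets one rule cover both the duplicate and the retired-realm cases while leaving genuinely HTTP-only realms alone.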

Follow-ups (out of scope)

  • Same skill for the in-host Matrix code agent (this PR covers the factory only).
  • A short gts-pitfalls reference for known parse traps (@tracked on inline class expressions, generics on class-property initializers) that consume turn budget.

🤖 Generated with Claude Code

Contributor

github-actions Bot commented May 13, 2026

Preview deployments

Host Test Results

1 file, 1 suite, 1h 47m 54s ⏱️
2 841 tests: 2 826 ✅ passed, 15 💤 skipped, 0 ❌ failed
2 860 runs: 2 845 ✅ passed, 15 💤 skipped, 0 ❌ failed

Results for commit db184b2.

Realm Server Test Results

1 file (±0), 1 suite (+1), 11m 41s ⏱️ (+11m 41s)
1 515 tests (+1 515): 1 514 ✅ passed (+1 514), 1 💤 skipped (+1), 0 ❌ failed (±0)
1 606 runs (+1 606): 1 605 ✅ passed (+1 605), 1 💤 skipped (+1), 0 ❌ failed (±0)

Results for commit db184b2. ± Comparison against earlier commit d0944f3.

@jurgenwerk jurgenwerk force-pushed the cs-10527-component-specs-for-searchable-reusable-ui-components branch 8 times, most recently from 61b00ec to 2de436c Compare May 14, 2026 09:54
Publishes one Spec card per @cardstack/boxel-ui component into the
catalog so the software factory agent can discover and reuse UI
primitives by querying the realm instead of hand-rolling HTML.

Generator (packages/boxel-ui/addon/bin/generate-component-specs.mjs):
walks each component's usage.gts, extracts the primary FreestyleUsage
block (args, description, example, CSS vars), and emits one JSON Spec
per component. Wired as `pnpm generate:component-specs`. Writes two
outputs: an in-repo snapshot under test/fixtures/specs/ (52 files) for
the CI drift gate, and the live tree at packages/catalog/contents/Spec/
for local realm-server file-watcher reindex. Normalizes internal Boxel-
prefixed tag names in the example block to the public export name
(`<BoxelInput>` → `<Input>`) and inlines class-field array literals for
`@options` so enum values are listed verbatim.

Resolver: registerCardReferencePrefix('@cardstack/boxel-ui/', …) in
host/app/lib/externals.ts so a Spec ref pointing at
@cardstack/boxel-ui/components resolves via the existing fake-packages
URL scheme.

Agent wiring: new factory skill
`packages/software-factory/.agents/skills/boxel-ui-component-discovery/`
that teaches the discovery recipe and makes the rule mandatory before
any UI is written. Loaded automatically by factory-skill-loader on the
same GTS-keyword trigger as ember-best-practices. The system prompt
gains a corresponding catalog-search exception under "Stay in your
target realm" and surfaces the catalog realm URL (derived from the
target realm origin) so the agent doesn't probe staging/prod hosts.
boxel-development pointers reinforce the rule from the always-loaded
skill.

CI: ci-lint runs `generate:component-specs --check` against the boxel-ui
snapshot to catch missed regenerations. mirror-boxel-ui-specs.yaml
publishes the generated specs to cardstack/boxel-catalog on merge to
main (needs BOXEL_CATALOG_PUSH_TOKEN secret — documented inline).

Test artifacts: two example briefs under packages/software-factory/realm/Wiki/
(delete-confirmable-note, support-ticket-form) exercise the discovery
loop against Modal/Button and Input/Select/Button/Pill respectively.
@jurgenwerk jurgenwerk force-pushed the cs-10527-component-specs-for-searchable-reusable-ui-components branch from 2de436c to a77eb72 Compare May 26, 2026 11:46
Three failures, three targeted fixes:

- mirror-boxel-ui-specs.yaml: the project yamllint rule requires
  double quotes; the two path filters were single-quoted.
- factory-entrypoint.test.ts: the deepEqual expected object missed
  the new enableBoxelUiDiscovery key returned by parseFactoryEntrypointArgs.
- Move deriveCatalogRealmUrl into its own factory-catalog-realm.ts.
  Importing it from factory-target-realm pulled matrix-client and
  the realm-server index-writer (whose `declare private` fields the
  Playwright transpile harness can't parse) into the factory-agent
  module graph, breaking the eval-validation and friends.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
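For context on what the relocated helper does, here is a minimal sketch of deriving a catalog realm URL from a target realm's origin. It assumes the catalog lives at `/catalog/` on the same origin, as the localhost URLs in this PR suggest; the function name and the sample realm path are hypothetical, and the real `deriveCatalogRealmUrl` may behave differently.

```typescript
// Sketch only: assumes the catalog realm is served at <origin>/catalog/.
function deriveCatalogRealmUrlSketch(targetRealmUrl: string): string {
  // URL#origin keeps scheme, host and port, and drops path, query and hash.
  const origin = new URL(targetRealmUrl).origin;
  return `${origin}/catalog/`;
}

console.log(deriveCatalogRealmUrlSketch("https://localhost:4201/user/my-realm/"));
```

Because the helper needs nothing beyond the built-in `URL` class, keeping it in its own module avoids dragging unrelated dependencies into the factory-agent module graph.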
Contributor

Copilot AI left a comment

Pull request overview

Auto-generates one Spec card per @cardstack/boxel-ui component (from each usage.gts) into the catalog so the software-factory agent can discover and reuse UI primitives via realm search. Discovery is opt-in via a new --enable-boxel-ui-discovery CLI flag plumbed end-to-end into the agent's skill set and system prompt. A CI workflow mirrors the regenerated specs into the external cardstack/boxel-catalog repo on merges to main.

Changes:

  • New generator (generate-component-specs.mjs) that parses <FreestyleUsage> blocks and emits Spec JSON with keyword-rich cardDescription and a structured readMe, plus an @cardstack/boxel-ui/ CodeRef prefix mapping in externals.ts.
  • New boxel-ui-component-discovery skill and conditional system-prompt blocks ({{#if enableBoxelUiDiscovery}}), gated through FactoryEntrypointOptions → IssueLoopWiringConfig → DefaultSkillResolver + ContextBuilder → AgentContext.
  • New mirror-boxel-ui-specs.yaml workflow, three example briefs (plus an unrelated running-tracker), MAX_TOOL_USE_TURNS bumped 50 → 100, and updated docs/spec.md Section 3.

Reviewed changes

Copilot reviewed 23 out of 24 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| packages/boxel-ui/addon/bin/generate-component-specs.mjs | New generator emitting one Spec JSON per component from usage.gts. |
| packages/boxel-ui/addon/package.json | Adds generate:component-specs script. |
| packages/boxel-ui/addon/src/components/{button,input,menu}/usage.gts | Adds top-level @description for richer cardDescription. |
| packages/host/app/lib/externals.ts | Registers @cardstack/boxel-ui/ CodeRef prefix mapping. |
| packages/software-factory/src/factory-skill-loader.ts | Adds enableBoxelUiDiscovery option and resolved-skills logging. |
| packages/software-factory/src/factory-context-builder.ts | Propagates flag into AgentContext. |
| packages/software-factory/src/factory-issue-loop-wiring.ts | Wires the flag through to resolver and context builder. |
| packages/software-factory/src/factory-entrypoint.ts | Adds --enable-boxel-ui-discovery CLI parsing/usage. |
| packages/software-factory/src/factory-target-realm.ts | New deriveCatalogRealmUrl() helper. |
| packages/software-factory/src/factory-agent/{types,claude-code,opencode}.ts | Adds enableBoxelUiDiscovery/catalogRealm to system prompt; raises MAX_TOOL_USE_TURNS to 100. |
| packages/software-factory/prompts/system.md | Adds conditional catalog-search exception and catalog realm line. |
| packages/software-factory/.agents/skills/boxel-ui-component-discovery/SKILL.md | New discovery skill (location mismatched against orchestrator loader). |
| packages/software-factory/realm/Wiki/{delete-confirmable-note,support-ticket-form,product-faq,running-tracker}.json | Example briefs (one is unrelated to the discovery loop). |
| .github/workflows/mirror-boxel-ui-specs.yaml | New mirror workflow pushing regenerated specs to cardstack/boxel-catalog. |
| docs/spec.md | Documents component-spec generator and developer workflow. |
| .agents/skills/boxel-development/SKILL.md | Stray blank line, unrelated. |

Comment thread packages/software-factory/src/factory-agent/claude-code.ts
Comment thread .github/workflows/mirror-boxel-ui-specs.yaml Outdated
Comment thread packages/software-factory/src/factory-skill-loader.ts
Comment thread .agents/skills/boxel-development/SKILL.md Outdated
Comment thread packages/software-factory/realm/Wiki/running-tracker.json Outdated
jurgenwerk and others added 6 commits May 28, 2026 12:52
- boxel-development/SKILL.md: add the discovery-skill pointer under
  "Load By Task" that the original commit message advertised but
  never landed. Reinforces the rule from the always-loaded skill
  surface so the auto-loader no longer carries it alone.
- mirror-boxel-ui-specs.yaml: drop the `path: boxel` checkout layout.
  The init action runs `pnpm install --frozen-lockfile` with no
  working-directory, so it has to execute at the workspace root —
  the previous layout would have failed the first time the workflow
  fired on push to main.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Treat the boxel-ui component specs as build artifacts of boxel-ui
rather than content of cardstack/boxel-catalog. Run the generator
from the realm-server's setup:catalog-in-deployment step against the
deployed usage.gts files, after pulling the catalog clone and before
the rsync into /persistent/catalog/. The generated specs are never
committed to either repo.

- Delete .github/workflows/mirror-boxel-ui-specs.yaml — no
  cross-repo push, so no PAT, no BOXEL_CATALOG_PUSH_TOKEN secret.
- packages/realm-server/package.json: add the boxel-ui generator
  invocation to setup:catalog-in-deployment between catalog:update
  and the rsync.
- docs/spec.md: rewrite the Developer Workflow + Gotchas section to
  describe the deployment-build approach; remove the mirror
  workflow + token paragraphs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The discovery feature stands on its own without shipping example
briefs into the production software-factory realm. Anyone who wants
to exercise the discovery loop can author a brief themselves; the
realm doesn't need to ship our scratch examples.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After migration 1779100257124 flipped the canonical localhost scheme
from http:// to https://, realm-server's bootstrap re-inserted HTTPS
rows on next boot but left the old HTTP rows in realm_registry. The
HTTP rows have no matching realm_user_permissions row (the earlier
migration rewrote those to HTTPS), so when the file-watcher fires on
an HTTP-keyed Realm instance, getRealmOwnerUserId throws "Cannot
determine realm owner for realm http://localhost:42XX/...".

The crash surfaced when generate:component-specs wrote new files into
packages/catalog/contents/Spec/, but the bug is older and affects any
local realm with an HTTP-canonical leftover in realm_registry.

The migration deletes only HTTP rows that have an HTTPS sibling at
the equivalent path — confirmed stale duplicates. HTTP-only rows
(e.g. legacy-catalog) are left alone since deleting them could orphan
content tied to that URL. Production/staging use real hostnames and
are unaffected by the localhost pattern, so the migration is a no-op
there.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Generalize the cleanup rule to "delete HTTP-canonical localhost rows
in realm_registry that have no matching realm_user_permissions row at
that URL". That captures both the duplicate case (HTTPS sibling now
owns the permissions) and the retired-realm case (legacy-catalog had
its HTTPS rows removed entirely by 1779348449320 + 1779720206026,
leaving the HTTP registry row as a dangling orphan).

Realms that genuinely exist only at HTTP would still have HTTP
permission rows and are left untouched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 20 out of 22 changed files in this pull request and generated 1 comment.

Comment thread packages/software-factory/prompts/system.md Outdated
Wrap ", not catalog" in {{#unless enableBoxelUiDiscovery}} so the
flag-off path restores the prior explicit prohibition against
querying the catalog realm. With the flag on, catalog is dropped
from the prohibition and the "Exception — catalog component specs"
block sanctions the boxel-ui spec search; with the flag off, catalog
goes back to being one of the explicitly-named forbidden realms.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 20 out of 22 changed files in this pull request and generated 1 comment.

Comment thread packages/software-factory/prompts/system.md Outdated
jurgenwerk and others added 3 commits May 29, 2026 12:08
The project's prompt loader only implements {{#if}} (with {{else}})
and {{#each}}; {{#unless}} isn't a supported block tag and would be
emitted verbatim into the system prompt. Swap to the supported form:
{{#if enableBoxelUiDiscovery}}{{else}}, not catalog{{/if}}.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
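To make the workaround concrete, the sketch below implements a toy renderer that understands only non-nested `{{#if}}` with an optional `{{else}}`. It is not the project's prompt loader; `renderIf` and the sample prompt sentence are assumptions. An empty true branch followed by `{{else}}` behaves like the unsupported `{{#unless}}`.

```typescript
// Toy renderer: supports non-nested {{#if key}}...{{else}}...{{/if}} only.
function renderIf(template: string, context: Record<string, boolean>): string {
  return template.replace(
    /\{\{#if (\w+)\}\}([\s\S]*?)\{\{\/if\}\}/g,
    (_whole: string, key: string, body: string) => {
      // A missing {{else}} leaves the false branch empty.
      const [whenTrue = "", whenFalse = ""] = body.split("{{else}}");
      return context[key] ? whenTrue : whenFalse;
    },
  );
}

// Hypothetical prompt wording, used only to exercise the renderer.
const line =
  "Do not query other realms{{#if enableBoxelUiDiscovery}}{{else}}, not catalog{{/if}}.";

console.log(renderIf(line, { enableBoxelUiDiscovery: true }));
console.log(renderIf(line, { enableBoxelUiDiscovery: false }));
// An unsupported block tag passes through untouched, which is the bug being fixed.
console.log(renderIf("{{#unless flag}}, not catalog{{/unless}}", { flag: false }));
```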
Before this commit 16 of 52 generated specs fell back to the generic
"X — boxel-ui component (see readMe for API and example)" placeholder
because their primary <FreestyleUsage> block lacked a @description
attribute and had no usable <:description> prose fallback. Agents
searching the catalog for "collapsible disclosure" or "color sample"
would not match those components.

Add a keyword-rich @description to each: accordion, add-button,
context-button, copy-button, date-range-picker, field-container,
filter-list, icon-button, message, multi-select, picker, realm-icon,
resizable-panel-group, sort-dropdown, swatch, tabbed-header.

Every component now has a useful cardDescription; zero generic
placeholder hits remain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Glimmer template attributes don't honor JS backslash escapes; \' inside
a single-quoted attribute value terminates the attribute and leaves the
remainder as garbage, breaking the HBS parse and cascading into the JS,
HBS-lint, and glint pipelines (TS6133 phantoms for every import in the
file). Reword "the realm's configured icon" to "its configured icon"
so the description carries no apostrophes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jurgenwerk jurgenwerk marked this pull request as ready for review May 29, 2026 11:59
@jurgenwerk jurgenwerk requested a review from a team May 29, 2026 12:00
@jurgenwerk jurgenwerk changed the title from "Generate boxel-ui component Specs for AI agent discovery (CS-10527)" to "Generate boxel-ui component Specs for AI agent discovery" May 29, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: db184b2bba

Comment on lines +1 to +3
---
name: boxel-ui-component-discovery
description: MANDATORY before writing any UI in a `.gts` template. Search the catalog for a boxel-ui component Spec and reuse it. Fall back to raw HTML only when no matching spec exists, and surface the gap when you do.

P1 Badge Put discovery skill where SkillLoader can find it

With --enable-boxel-ui-discovery, DefaultSkillResolver adds boxel-ui-component-discovery, but the factory's default SkillLoader only searches packages/software-factory/.agents/skills-orchestrator, packages/boxel-cli/plugin/skills, and the monorepo root .agents/skills; it does not search this package-local .agents/skills directory. In opt-in factory runs this skill will be logged as unavailable and skipped, while the system prompt still refers the agent to the missing skill for the exact catalog query/procedure, so the new discovery workflow is effectively not loaded. Move/copy it to a searched directory or add this directory as a fallback for the factory loader.
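The lookup failure described here can be reproduced with a small sketch. `findSkill` and the injected `exists` predicate are hypothetical; the directory list and the skill path are taken from this comment and the PR's file list, and the real SkillLoader may resolve skills differently.

```typescript
// Return the first SKILL.md found for a skill name, or undefined if none exists.
function findSkill(
  name: string,
  searchDirs: string[],
  exists: (path: string) => boolean,
): string | undefined {
  for (const dir of searchDirs) {
    const candidate = `${dir}/${name}/SKILL.md`;
    if (exists(candidate)) return candidate;
  }
  return undefined;
}

// Directories the comment says the default loader searches.
const defaultDirs = [
  "packages/software-factory/.agents/skills-orchestrator",
  "packages/boxel-cli/plugin/skills",
  ".agents/skills",
];

// Stand-in for the file system: only the package-local skill file exists.
const onDisk = new Set([
  "packages/software-factory/.agents/skills/boxel-ui-component-discovery/SKILL.md",
]);
const exists = (path: string): boolean => onDisk.has(path);

// Not found with the default directories.
console.log(findSkill("boxel-ui-component-discovery", defaultDirs, exists));
// Found once the package-local directory is added as a fallback.
console.log(
  findSkill(
    "boxel-ui-component-discovery",
    [...defaultDirs, "packages/software-factory/.agents/skills"],
    exists,
  ),
);
```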


Comment on lines +435 to +439
const nameAttr = extractStringAttr(block.openAttrs, 'name');
const componentName =
  nameAttr && /^[A-Z][A-Za-z0-9]*$/.test(nameAttr)
    ? nameAttr
    : toPascalCase(slug);

P1 Badge Use exported component names in generated specs

When a usage.gts @name is a display label or legacy/unprefixed name, this chooses a name that is not exported from @cardstack/boxel-ui/components (for example Input, Select, Field, Dropdown, MultiSelect, Progress, and Tag, while the barrel exports BoxelInput, BoxelSelect, FieldContainer, BoxelDropdown, BoxelMultiSelect, ProgressBar, and BoxelTag/TagList). Those generated specs then advertise invalid ref values and readMe imports, so an agent following the catalog spec will write imports that fail at runtime. Derive the public name from the barrel export or add per-component overrides instead of trusting the Freestyle display name.
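One possible shape for that suggestion is sketched below: check per-component overrides first, then the barrel's export list, then a `Boxel`-prefixed variant. `resolveExportName`, the override map, and the sample export set are hypothetical; the name pairs are the ones listed in this comment, and `Button` is included only as assumed sample data.

```typescript
// Resolve a Freestyle display name to a name the barrel actually exports.
function resolveExportName(
  displayName: string,
  barrelExports: ReadonlySet<string>,
  overrides: ReadonlyMap<string, string>,
): string | undefined {
  const override = overrides.get(displayName);
  if (override !== undefined && barrelExports.has(override)) return override;
  if (barrelExports.has(displayName)) return displayName;
  const prefixed = `Boxel${displayName}`;
  if (barrelExports.has(prefixed)) return prefixed;
  // Returning undefined lets the generator fail loudly instead of emitting a bad ref.
  return undefined;
}

// Sample data for illustration; not read from the real barrel file.
const barrelExports = new Set([
  "Button",
  "BoxelInput",
  "BoxelSelect",
  "FieldContainer",
  "ProgressBar",
]);
const overrides = new Map([
  ["Field", "FieldContainer"],
  ["Progress", "ProgressBar"],
]);

for (const name of ["Button", "Input", "Field", "Progress", "Tooltip"]) {
  console.log(name, "->", resolveExportName(name, barrelExports, overrides));
}
```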

3 participants