realm-server: HTTPS+HTTP/2 in local dev #4797
Heavy aggregator-card renders (cohort, dashboards) fan out 80+
federated-search requests per render inside one Chromium tab. Chrome's
HTTP/1.1 6-per-origin connection ceiling serializes them and turns a
single render into multiple minutes; HTTP/2 multiplexes them over one
connection and the same render finishes in seconds. Browsers only do
HTTP/2 over TLS, so the local realm-server now terminates a cert.
Single-origin design: the realm-server listens on
`https://localhost:4201` (and `https://localhost:4202` for test-realms)
when the dev cert is provisioned. There is no parallel HTTP listener
and no h2 alias port; the wire protocol and the canonical realm URL
agree. In-process tests and any environment without a cert keep getting
plain HTTP/1.1 via the same `listen(port)` entry point — `RealmServer`
picks the protocol from `REALM_SERVER_TLS_CERT_FILE`/`_KEY_FILE` rather
than two separate methods.
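
A minimal sketch of the single-entry-point selection described above. Only the two env-var names come from this description; `pickListenerMode` and `ListenerMode` are hypothetical names used for illustration, not the realm-server's actual API.

```typescript
// Hypothetical helper: decide which kind of listener `listen(port)` should build.
type ListenerMode = 'https-h2' | 'http1';

function pickListenerMode(
  env: Record<string, string | undefined>,
): ListenerMode {
  const cert = env['REALM_SERVER_TLS_CERT_FILE'];
  const key = env['REALM_SERVER_TLS_KEY_FILE'];
  // Both halves of the key pair must be configured; otherwise the server
  // stays on plain HTTP/1.1 (in-process tests, cert-less environments).
  return cert && key ? 'https-h2' : 'http1';
}
```

Keeping the decision in one place means callers never have to know which protocol they will get; they call the same `listen(port)` either way.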
Cert provisioning is opt-in via `mise run infra:ensure-dev-cert`:
- Requires `mkcert` (single-origin HTTPS has no HTTP fallback in
dev, so a missing prereq is a hard error with install hints).
- Attempts `mkcert -install` once for system trust; declining the
sudo prompt is non-fatal — the cert still gets generated and
indexing keeps working via puppeteer's `--ignore-certificate-errors`
flag and `NODE_EXTRA_CA_CERTS` for Node clients.
- Idempotent: re-runs are a no-op until the cert is within 7 days of
expiry.
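
The idempotence rule in the last bullet can be sketched as a small predicate. This assumes the task can read the existing cert's expiry date; `needsNewCert` and `RENEWAL_WINDOW_MS` are hypothetical names, and the real task is a shell script rather than TypeScript.

```typescript
// Regenerate only when no cert exists or the cert expires within 7 days.
const RENEWAL_WINDOW_MS = 7 * 24 * 60 * 60 * 1000;

function needsNewCert(notAfter: Date | null, now: Date): boolean {
  if (notAfter === null) {
    return true; // no cert on disk yet
  }
  return notAfter.getTime() - now.getTime() <= RENEWAL_WINDOW_MS;
}
```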
`env-vars.sh` flips `REALM_BASE_URL`/`REALM_TEST_URL` defaults to
`https://localhost:4201`/`4202`, exports the cert paths when files
exist, and points `NODE_EXTRA_CA_CERTS` at mkcert's root CA so Node-
side fetches (worker, scripts, prerender Node) trust the cert without
requiring `mkcert -install` to have run. `dev-common.sh` switches
wait-on's readiness probes to `https-get://` when the realm URL is
HTTPS. The host's `config/environment.js` defaults flip to
`https://localhost:4201` for `realmServerURL`, `baseRealmURL`,
`catalogRealmURL`, `legacyCatalogRealmURL`, `skillsRealmURL`, and
`openRouterRealmURL`. `middleware/index.ts#fullRequestURL` now detects
`ctx.req.socket.encrypted` so URL-keyed realm lookup matches the wire
protocol — combined with the canonical-URL flip, both halves agree.
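
A rough sketch of how a URL builder can derive the scheme from the socket, as described for `fullRequestURL`. The `RequestLike` shape, the function name, and the `localhost` fallback are assumptions made for illustration; the real function may consult other inputs as well.

```typescript
// Structural stand-in for the parts of a Node request used here.
interface RequestLike {
  url: string; // path plus query string
  headers: Record<string, string | undefined>;
  socket: { encrypted?: boolean }; // true on TLS sockets
}

function fullRequestURLSketch(req: RequestLike): URL {
  // The wire protocol decides the scheme, so URL-keyed realm lookups
  // see https:// exactly when the connection is TLS.
  const protocol = req.socket.encrypted ? 'https' : 'http';
  const host = req.headers['host'] ?? 'localhost'; // fallback is an assumption
  return new URL(`${protocol}://${host}${req.url}`);
}
```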
CI / hermetic test harness path stays HTTP-only: if no cert is
provisioned, `env-vars.sh` leaves the TLS env vars unset and the
realm-server boots `http.createServer`, exactly as before.
Migration after pulling: any local card data created under the old
`http://localhost:4201/...` canonical references is stale and needs to
be re-indexed. README documents the one-time `mise run
infra:full-reset` step.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1655a6f2df
Preview deployments
Host Test Results: 1 files ±0, 1 suites ±0, 1h 18m 34s ⏱️ (+8m 51s). Results for commit de5129a. ± Comparison against earlier commit f5de14e.
Realm Server Test Results: 1 files ±0, 1 suites ±0, 7m 54s ⏱️ (-5m 20s). Results for commit de5129a. ± Comparison against earlier commit f5de14e.
Adds the two missing pieces from the initial HTTPS+HTTP/2 flip:
1. Same-port HTTP→HTTPS dispatcher in `server.ts`. When the realm-server speaks TLS, `listen(port)` now binds a net.Server that peeks the first byte off every connection: 0x16 (TLS ClientHello) routes to the http2 secure server; anything else is treated as plain HTTP and handed to a tiny 301-redirect handler that rewrites the URL to `https://<inbound-host><path>`. So `http://localhost:4201/…` in a browser bar or a `curl` invocation gets a clean 301 instead of a TLS handshake failure. Same listener, no extra port.
2. A node-pg-migrate migration that rewrites every URL-bearing text/varchar/jsonb column on every public table (except `modules`, which the realm-server truncates on startup) from `http://localhost:42XX` to `https://localhost:42XX`. Columns are auto-discovered via `information_schema.columns`; this covers `boxel_index`, `boxel_index_working`, `realm_registry`, `realm_meta`, `realm_metadata`, `realm_user_permissions`, `realm_versions`, `realm_file_meta`, `module_transpile_cache`, plus any URL-bearing column added later (the discovery picks it up). WHERE-filtered so it only touches rows still containing the old URL, which makes it idempotent and a no-op in production.
`mise run dev` already passes `--migrateDB` to the realm-server, so the migration runs automatically on the first post-pull boot. README's "Local HTTPS dev access" section is rewritten to describe the new auto-migration flow (no more `mise run infra:full-reset` callout).
Schema file renamed from `1779100257123_schema.sql` to `1779200000000_schema.sql` so host/config/environment.js's migration-vs-schema-name sentinel matches the new latest migration. Content is unchanged (the new migration is data-only).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
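
The dispatcher's routing decision reduces to a one-byte check, sketched here as a pure function. `routeForFirstByte` and the `Route` type are hypothetical names; the actual dispatcher in `server.ts` also has to put the peeked byte back and hand the socket to the chosen server, which this sketch leaves out.

```typescript
// 0x16 is the TLS record type for "handshake", so a ClientHello starts with it.
const TLS_HANDSHAKE_RECORD = 0x16;

type Route = 'tls' | 'plain-http';

function routeForFirstByte(firstByte: number): Route {
  return firstByte === TLS_HANDSHAKE_RECORD ? 'tls' : 'plain-http';
}

// A plain HTTP request starts with an ASCII method name such as "GET",
// so its first byte is a letter and never 0x16.
const firstByteOfGet = 'GET / HTTP/1.1'.charCodeAt(0); // 0x47
console.log(routeForFirstByte(firstByteOfGet)); // "plain-http"
console.log(routeForFirstByte(0x16)); // "tls"
```

Because the check only needs the first byte, both protocols can share one listening port without a second listener.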
CI was failing across host/realm-server/matrix test suites because ensure-dev-cert exited non-zero when mkcert was missing, killing the mise dep chain before any service started, and because env-vars.sh flipped REALM_BASE_URL to https unconditionally, so even when the realm-server fell back to plain HTTP, every consumer was still asked to fetch against https. The host config defaults had the same problem: hardcoded https meant the in-browser realmServerURL didn't match the wire scheme.
Three fixes, gated on cert presence:
1. `ensure-dev-cert` now exits 0 with a soft warning when mkcert is missing. The realm-server's `listen()` already falls back to plain `http.createServer` when the TLS env vars are unset, so this is the honest behavior for CI / hermetic-test environments.
2. `env-vars.sh` defaults `REALM_BASE_URL`/`REALM_TEST_URL` to http and only upgrades them to https inside the cert-detected block alongside the existing TLS env var exports.
3. `packages/host/config/environment.js` derives its scheme from `process.env.REALM_BASE_URL`, so the host config follows the same cert-presence-driven flip rather than baking https into the JS defaults.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pull request overview
Enables local-dev realm-server to serve a single canonical HTTPS origin with HTTP/2 (plus same-port HTTP→HTTPS redirect) to remove Chrome’s HTTP/1.1 per-origin connection bottleneck during heavy prerender/search fan-outs, and migrates local indexed data from http://localhost:42xx to https://localhost:42xx.
Changes:
- Add TLS-capable listener that multiplexes HTTPS/HTTP2 and HTTP redirect on the same port; update URL construction to recognize TLS sockets.
- Default local dev URLs/config/docs to https://localhost:4201 (+ :4202 for test realms) and add mkcert-based cert provisioning.
- Add a Postgres migration to rewrite persisted localhost canonical URLs from http→https.
Reviewed changes
Copilot reviewed 45 out of 46 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| README.md | Document local HTTPS/HTTP2 setup, migration, and updated local URLs. |
| QUICKSTART.md | Update quickstart URLs to https://localhost:4201. |
| packages/realm-server/tests/types-endpoint-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/server-endpoints/search-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/server-endpoints/search-prerendered-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/server-endpoints/info-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/server-endpoints/index-responses-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/server-endpoints/helpers.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/server-endpoints/federated-types-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/server-endpoints/authentication-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/request-forward-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/realm-endpoints/user-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/realm-endpoints/reindex-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/realm-endpoints/markdown-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/realm-endpoints/info-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/realm-endpoints/dependencies-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/realm-endpoints-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/publish-unpublish-realm-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/prerender-manager-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/openrouter-passthrough-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/module-cache-race-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/helpers/index.ts | Update close helpers/types to tolerate non-http.Server server handles. |
| packages/realm-server/tests/get-boxel-claimed-domain-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/file-watcher-events-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/delete-boxel-claimed-domain-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/claim-boxel-domain-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/card-source-endpoints-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/card-endpoints-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/card-dependencies-endpoint-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/boxel-domain-availability-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/atomic-endpoints-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/server.ts | Add TLS/http2+redirect dispatcher and export RealmHttpServer type; update listen logging. |
| packages/realm-server/prerender/browser-manager.ts | Add --ignore-certificate-errors for prerender Chromium when using https. |
| packages/realm-server/middleware/index.ts | Treat TLS sockets as https for fullRequestURL() computation. |
| packages/realm-server/main.ts | Make shutdown tolerant of non-http.Server handles lacking closeAllConnections(). |
| packages/realm-server/lib/dev-service-registry.ts | Broaden registry typing to net.Server. |
| packages/postgres/migrations/1779200000000_canonical-url-http-to-https.js | Add migration to rewrite localhost canonical URLs from http→https. |
| packages/host/config/schema/1779200000000_schema.sql | Add regenerated host sqlite schema snapshot. |
| packages/host/config/environment.js | Flip local default realm URLs to https. |
| mise-tasks/services/test-realms | Ensure dev cert task runs before test realms. |
| mise-tasks/services/realm-server-base | Ensure dev cert task runs before base realm server. |
| mise-tasks/services/realm-server | Ensure dev cert task runs before realm server. |
| mise-tasks/lib/env-vars.sh | Flip default realm URLs to https and export TLS cert/CA env vars. |
| mise-tasks/lib/dev-common.sh | Use https readiness probes when realm URLs are https. |
| mise-tasks/infra/ensure-dev-cert | New task to provision mkcert leaf cert for local HTTPS/HTTP2. |
| .claude/skills/indexing-diagnostics/SKILL.md | Update localhost URLs and markdown formatting in diagnostics skill doc. |
Local realm-server speaks HTTPS+HTTP/2 in every environment; there is no HTTP fallback or opt-in. The dev cert is a hard prereq:
- `ensure-dev-cert` exits non-zero when mkcert is missing.
- `env-vars.sh` defaults `REALM_BASE_URL`/`REALM_TEST_URL` to https unconditionally and no longer flips schemes based on cert presence.
- `host/config/environment.js` defaults to `https://localhost:4201` unconditionally; the previous scheme-from-env-var branch is gone.
- The new `.github/actions/init` step installs mkcert via apt and runs `mise run infra:ensure-dev-cert` before any downstream job, so CI realm-servers boot HTTPS+HTTP/2 too. Test harnesses that launch Chromium already pass `--ignore-certificate-errors`; Node clients pick up the cert via `NODE_EXTRA_CA_CERTS`.
- README's CI/harness paragraph is rewritten to describe the cert provisioning in the init action (no more "boots HTTP/1.1 in CI" line).
Carries over the Copilot-flagged fixes:
- Migration renamed to `1779100257124_canonical-url-http-to-https.js` (one greater than the existing latest, no 6+ consecutive zeros so it passes `lint:migrations`) and the matching schema dump renamed.
- Migration body adds a `realm_registry` LIKE pre-check that short-circuits the full-column scans on production/staging databases where the canonical URLs never reference localhost.
- Drops the unused `/* eslint-disable camelcase */` line that `lint:js` flagged.
- `redirectToHttps()` parses the inbound `Host` via `new URL()` so bracketed IPv6 authorities (`[::1]:4201`) round-trip cleanly instead of the regex producing an invalid `https://::1:4201/...`.
- `env-vars.sh` no longer concatenates `NODE_EXTRA_CA_CERTS` with `:` separators, because Node accepts a single PEM path, not a list. If the dev already has it set, leave it alone; otherwise point at mkcert's CA.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
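
To illustrate the `redirectToHttps()` fix, here is a small sketch of building the redirect target by parsing the `Host` value with `new URL()`. The function name `httpsRedirectTarget` and its signature are hypothetical; only the parse-via-`URL` idea comes from the commit message.

```typescript
// Build the https:// Location for a plain-HTTP request.
// Parsing the authority through URL keeps IPv6 brackets intact.
function httpsRedirectTarget(hostHeader: string, path: string): string {
  const parsed = new URL(`http://${hostHeader}`);
  // `parsed.host` is the authority as the URL parser serializes it,
  // e.g. "[::1]:4201" or "localhost:4201".
  return `https://${parsed.host}${path}`;
}

console.log(httpsRedirectTarget('[::1]:4201', '/base/'));
// "https://[::1]:4201/base/"
```

Letting the URL parser handle the authority avoids hand-written regexes that split on `:` and lose the brackets.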
Per Copilot 3230386975 — the previous QUICKSTART pointed users at https://localhost:4201 without telling them how to provision the cert that makes that origin work. Adds mkcert to the system dependencies list at step 1 with platform-specific install hints and the `mise run infra:ensure-dev-cert` one-liner, linking back to the README's "Local HTTPS dev access" section for the full story. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three task scripts under `mise-tasks/test-services/` were stuck on the
old `http-get://${REALM_BASE_URL#http://}/base/...` readiness probe
shape that strips a hardcoded `http://`. After env-vars.sh flipped
REALM_BASE_URL to https, that strip becomes a no-op and the probe URL
turns into the malformed `http-get://https://localhost:4201/...`,
which wait-on can't reach — every CI suite that drives `mise run
test-services:*` would hang on phase-1 readiness instead of starting
the next phase.
Same fix as `mise-tasks/lib/dev-common.sh`: detect the scheme from
`$REALM_BASE_URL` / `$REALM_TEST_URL` and pick `http-get://` or
`https-get://` accordingly; strip `*://` to leave just the authority.
Also wires `infra:ensure-dev-cert` into each script's depends list so
local invocations of `mise run test-services:*` (outside CI's init
action) provision the cert before the realm-server starts.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
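
The scheme detection described in the commit above lives in shell scripts; the same logic is sketched here in TypeScript for clarity. `toWaitOnResource` is a hypothetical name, and the path argument is illustrative.

```typescript
// Map a realm URL to a wait-on resource string, picking the probe
// prefix from the URL's own scheme instead of stripping a fixed "http://".
function toWaitOnResource(realmUrl: string, path: string): string {
  const url = new URL(realmUrl);
  const probe = url.protocol === 'https:' ? 'https-get' : 'http-get';
  // `url.host` is just the authority (host plus port), with no scheme.
  return `${probe}://${url.host}${path}`;
}

console.log(toWaitOnResource('https://localhost:4201', '/base/_readiness-check'));
// "https-get://localhost:4201/base/_readiness-check"
```

Deriving both the probe prefix and the authority from the parsed URL rules out the malformed `http-get://https://...` shape.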
Blockers (B1–B3):
- tests/index.ts deletes REALM_SERVER_TLS_CERT_FILE/_KEY_FILE before any
fixture realm-server is spun up; without this CI's globally-provisioned
cert leaks into supertest-driven in-process servers, the dispatcher
binds TLS on 127.0.0.1:444X, and the plain-HTTP-from-supertest path is
301-redirected, breaking every assertion that expects 200/4xx.
- realm-server/package.json `test:wait-for-servers` now uses
`https-get://` to match the new wire scheme; the previous `http-get://`
hit the dispatcher's 301 path and never reported ready.
- server.ts attaches a per-socket `error` handler before the readable
callback so an RST mid-handshake (or any peer-side socket error)
doesn't escalate to an uncaught exception — dispatcher is the only
inbound listener for the realm-server, can't be allowed to crash.
- `null` reads on the dispatcher socket now `destroy()` instead of just
resuming so half-open accumulators (port scanners, eager load
balancers) don't tie up file descriptors.
Major (M1, M3–M5):
- README's auto-migration callout pointed at the wrong migration filename
(1779200000000_… → 1779100257124_…).
- pg-adapter.ts env-mode regex now matches `^https?://localhost:42XX/`
so the post-flip https canonicals get rewritten to Traefik hostnames
when a dev switches the same DB into BOXEL_ENVIRONMENT mode.
- server.ts's serveIndex / serveFromRealm URL constructions now go
through `fullRequestURL(ctxt)` instead of `${ctxt.protocol}//${ctxt.host}`;
`ctxt.protocol` only honors x-forwarded-proto when `app.proxy = true`,
while `fullRequestURL` also reads the TLS socket flag. Pre-existing
inconsistency that the https flip would have made load-bearing.
- migration's information_schema walk now filters on `is_generated = 'NEVER'`,
skipping generated columns, so a future generated column on any public
table doesn't abort the DO block with "column can only be updated to DEFAULT".
Copilot's second pass:
- ensure-dev-cert checks for mkcert BEFORE the idempotent-skip — env-vars.sh
needs `mkcert -CAROOT` to populate NODE_EXTRA_CA_CERTS even when an
old cert already exists, and the previous ordering let a stale cert
slip past with the trust path half-wired.
- middleware/index.ts `fullRequestURL` falls back to `:authority` when
`headers.host` is absent — HTTP/2's compat layer normally populates
host from :authority but the pseudo-header is the canonical source.
- middleware/index.ts `fetchRequestFromContext` strips `:`-prefixed
pseudo-headers (`:method`, `:scheme`, `:path`, `:authority`) before
feeding them into `new Request(headers)`, which WHATWG Headers rejects.
- QUICKSTART mkcert bullet's continuation line is properly indented now
so markdown renders it inside the bullet instead of as a new paragraph.
- indexing-diagnostics SKILL.md two table rows now have the missing third
cell so the table renders correctly.
Minor (m2, m6, n3) + Option A:
- redirectToHttps falls back to `socket.localAddress:localPort` when the
Host header is absent (HTTP/1.0 client), instead of bare `localhost`
that would route to port 443.
- scripts/full-reindex.sh and register-bot.sh flip to `https://` with
`-k` (curl doesn't pick up NODE_EXTRA_CA_CERTS, and the local mkcert
CA isn't necessarily in the system trust store).
- prerender/browser-manager.ts comment references only REALM_BASE_URL
(REALM_SERVER_DOMAIN was stale — never exported by env-vars.sh).
- QUICKSTART step 10/11 and README's "view a realm's app" paragraph
redirect manual-browser navigation to `http://localhost:4200/` (the
vite host), with a note that visiting `https://localhost:4201` directly
surfaces mixed-content warnings because vite + icons + synapse still
speak http. Realm-server's https origin is reached only via fetches
inside the vite-served page, which is where the federated-search h2
win lands. README's "view example" output also flipped the realm log
line to `https://localhost:4202/test/` to match the new canonical.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
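
The pseudo-header stripping mentioned for `fetchRequestFromContext` can be sketched as a filter that runs before the headers reach a WHATWG `Headers` or `Request` constructor. `withoutPseudoHeaders` and the `IncomingHeaders` alias are hypothetical names, not the code in `middleware/index.ts`.

```typescript
// Shape of Node's incoming header map (values may be repeated or absent).
type IncomingHeaders = Record<string, string | string[] | undefined>;

// Return name/value pairs with HTTP/2 pseudo-headers (":method",
// ":scheme", ":path", ":authority") removed.
function withoutPseudoHeaders(headers: IncomingHeaders): [string, string][] {
  const pairs: [string, string][] = [];
  for (const [name, value] of Object.entries(headers)) {
    if (name.startsWith(':') || value === undefined) {
      continue;
    }
    for (const single of Array.isArray(value) ? value : [value]) {
      pairs.push([name, single]);
    }
  }
  return pairs;
}
```

The returned pairs can be passed to `new Headers(pairs)`, which would otherwise throw on names beginning with `:`.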
- README list item 3's wrapped continuation line is now indented under the bullet so markdown doesn't break it into a separate paragraph.
- server.ts dispatcher tracks every accepted socket in a Set and mirrors http.Server's `closeAllConnections()` API. main.ts's existing typeof feature-detect picks this up; shutdown no longer hangs on long-lived h2 sessions or keep-alive sockets.
- tests/listener-dispatcher-test.ts is new coverage for the dispatcher: generates a self-signed cert via openssl into a tmp dir, then exercises TLS+h2, ALPN HTTP/1.1 fallback, plain-HTTP→https 301 redirect, the no-Host-header path that uses `socket.localAddress`, malformed-cert downgrade to plain HTTP, and the no-cert-env-vars path. `createListener` is now exported from server.ts so the test can drive it without spinning up a full realm-server fixture (and the test bootstrap's global TLS-env-var delete doesn't interfere; each test restores its own env around `startListener`).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`qunit/no-assert-logical-expression` was failing on three assertions that combined multiple conditions via `&&` / `||`. Splitting them into discrete `assert.true(...)` calls makes the failure point obvious when a test breaks and clears the lint. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Both `packages/workspace-sync-cli/tests/helpers/start-test-realm.ts` and `packages/realm-test-harness/src/isolated-realm-stack.ts` spawn a realm-server subprocess that inherits `process.env`. After CI's init action provisions the dev cert and `env-vars.sh` exports `REALM_SERVER_TLS_CERT_FILE/_KEY_FILE`, those env vars leak into the spawned realm-server, which binds the HTTPS+HTTP/2 dispatcher on the harness's chosen port. The integration tests and the realm-perf bench both drive plain `http://localhost:<port>/...` URLs against that server, hit the dispatcher's 301 path, and break: workspace-sync's CLI fails its session handshake with "expected 'Authorization' header" (it doesn't follow the redirect through the auth flow), and the bench fails its first GET with `404` because the realm route is behind https now. Same shape of fix as `realm-server/tests/index.ts` for the in-process qunit suite: destructure the two TLS env-var keys out of the spawn env so the child inherits everything except those. Plain `http.createServer` path, no redirect, harness HTTP URLs work as written. Production realm-servers and local dev are unaffected because they don't go through these harnesses. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
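
A minimal sketch of the spawn-env fix described above: destructure the two TLS keys out and pass the rest to the child. `envWithoutTls` is a hypothetical helper name; the harnesses may inline the destructuring instead.

```typescript
type Env = Record<string, string | undefined>;

// Copy of `env` without the TLS cert/key variables, so a spawned
// realm-server takes the plain HTTP path. The input is not mutated.
function envWithoutTls(env: Env): Env {
  const {
    REALM_SERVER_TLS_CERT_FILE: _cert,
    REALM_SERVER_TLS_KEY_FILE: _key,
    ...rest
  } = env;
  return rest;
}
```

A caller would pass the result as the `env` option when spawning the child, for example `spawn(cmd, args, { env: envWithoutTls(process.env) })`.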
`packages/host/testem-live.js` was hardcoding `http://localhost:4201/catalog/` as the realm URL and launching Chrome with the default trust policy. After the HTTPS flip, the live-test runner's `discoverTestModules` fetched against `https://localhost:4201/catalog/...` (via the host's `realmServerURL` default) but the browser navigated to `http://localhost:4201/...`, getting a 301 to https and then failing the cert check: `mkcert -install` in CI's init action is best-effort and the headless Chrome in CI doesn't always pick up the system trust store anyway.
Two fixes paired:
- Default realm URL flips to `https://localhost:4201/catalog/` so the navigation target matches the wire.
- Chrome's CI launch args get `--ignore-certificate-errors` so the live test runner accepts the mkcert leaf without depending on system trust. Safe: the URL is fixed by REALM_URL and the connection is loopback.
Dev (`launch_in_dev`) doesn't add the flag because local devs typically have run `mkcert -install` successfully and the cert is trusted normally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…registry The pre-check needs to fire on a fresh install too. `realm_registry` is populated by the realm-server's runtime bootstrap (registry backfill + reconciler), not by migrations, so it's empty when this migration runs against a freshly-created DB — the migration short-circuited and the `http://localhost:42XX` permission rows seeded by the earlier `1726671342065_backfill-realm-owners.js` migration stayed un-rewritten. The realm-server then matches incoming requests against the new `https://localhost:42XX/…` canonical and the permission rows fail to join → world-readable catalog returns 401 → Live Tests fail with "Cannot access realm https://localhost:4201/catalog/ (HTTP 401)". Switch the pre-check to `realm_user_permissions.realm_url`, which is reliably populated with the localhost canonicals by the earlier seed-style migrations. The rest of the migration body is unchanged — the per-column WHERE clauses still restrict the touch set to rows that actually contain the old URL, so production/staging DBs (real hostnames, never localhost) still no-op. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Test mode runs against the host-internal `http://test-realm/...` virtual origin via VirtualNetwork; there is no real realm-server on the wire. Many host test fixtures hardcode the `http://localhost:4201/...` canonicals in mock setups, VirtualNetwork mappings, and JSON test data, so flipping the default URLs to https caused every fetch in the test suite to fail with `TypeError: Failed to fetch` — the host's VirtualNetwork was wired with https URL mappings the test mocks didn't recognize. `environmentDefaults(environment)` now reads the ember env and picks http for `environment === 'test'`, https otherwise. Dev gets the HTTPS+HTTP/2 flip exactly as designed; test stays where it always was. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous test-mode-on-http revert was wrong: in Host Tests the realm-server actually IS running (via mise run test-services:host), and that realm-server speaks HTTPS+HTTP/2. The host bundle's defaults need to match the wire so module/data fetches over the wire (like GET /base/card-api during warmup) reach the live realm-server. The http defaults were producing failed http→https mismatches.
So:
- environment.js test mode reverts to https defaults (same as dev).
- test-wait-for-servers.sh + live-test-wait-for-servers.sh default their readiness probe URLs to `https-get://` to match. live-test-wait-for-servers.sh also gets the same scheme-detection helper (`to_wait_scheme`) the other scripts use so an explicit REALM_URL with either scheme works.
`http://test-realm/...` URLs in tests (used by the in-memory test realm registry) are still intercepted by `getRealmInfoForURL` before any wire fetch; that path is unrelated to the wire defaults and any remaining failures there are a separate concern from the HTTPS flip.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sweep of every place `http://localhost:4201`/`4202` appears with runtime impact:
Runtime / wire-touching:
- `package.json` `openrouter:sync` default REALM_URL → https
- `mise-tasks/lib/test-dev-common.sh` stub env defaults → https
- `packages/host/app/services/host-mode-service.ts` `originIsNotMatrixTests` accepts both http and https origins on the matrix-tests realm ports (https is the new default; http stays recognized so older snapshots still detect the test mode).
- `packages/observability/scripts/apply.sh` / `diff.sh` default `REALM_SERVER_URL` → https.
Cache import:
- `scripts/import-cached-index.sh` env-mode sed remap now matches both `http://localhost:4201` and `https://localhost:4201`: older cache snapshots have http canonicals, post-flip dumps have https. Either prefix gets rewritten to the env-mode Traefik hostname.
In-tree realm fixture data (cards served by dev realm-server):
- `packages/experiments-realm/**/*.json` and `packages/catalog-realm/**/*.json` `id` / `relationships` URLs flipped from http to https. Without this every cross-card fetch inside a render paid a wire-level 301 redirect from the dispatcher.
Docs:
- `README.md`, `QUICKSTART.md`, `packages/host/docs/live-tests.md`, `packages/software-factory/README.md`, `packages/bot-runner/README.md`, `docs/commands-in-headless-chrome.md`: example URLs updated.
Not flipped (intentional):
- Test fixture JSONs under `packages/host/tests/cards/`, `packages/realm-server/tests/cards/`, ai-bot resource chats, and bench-realm snapshot fixtures. Those URLs match test-side mount points (`http://test-realm/...`, `http://127.0.0.1:4444/test/`, bench-stack http://localhost:4201) where the test infrastructure spawns the realm-server with TLS env vars cleared and listens plain HTTP. Flipping them would diverge from what the test code registers and break the in-process fixtures.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Host Tests load the host bundle in a headless Chrome on testem (port 7357). The bundle's `realmServerURL` / `resolvedBaseRealmURL` defaults now point at `https://localhost:4201` to match the wire, but `mkcert -install` in CI's init action is best-effort and doesn't reliably land mkcert's root CA in headless Chrome's NSS trust store. Without `--ignore-certificate-errors`, every realm fetch made during shard warmup fails with `TypeError: Failed to fetch` against the self-signed cert and the rest of the shard never starts. Same fix already shipped in `testem-live.js`. Loopback only, fixed origin via host config — safe to relax cert trust. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Boxel-cli's vitest suite (and any other non-qunit caller of these helpers) doesn't share `packages/realm-server/tests/index.ts`'s bootstrap, so the global TLS env var delete that protects in-process qunit fixtures didn't apply to it. The CI init action provisions the cert, env-vars.sh exports the paths, and the test process inherits them — the spawned realm-server then binds HTTPS+HTTP/2 on its fixture port (`127.0.0.1:4446` for boxel-cli) and the CLI's plain-HTTP session calls fail with `404 Not Found` from the dispatcher's 301 path. Moving the env-var strip into the two `runTestRealmServer*` helpers themselves makes it defense-in-depth: every caller (qunit, vitest, software-factory harness) now goes through the same kill switch when spinning a fixture realm-server. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…p2-v2
# Conflicts:
# .claude/skills/indexing-diagnostics/SKILL.md
# packages/realm-server/scripts/full-reindex.sh
# packages/realm-server/tests/realm-endpoints/info-test.ts
# packages/realm-server/tests/realm-endpoints/user-test.ts
Matrix client tests timed out waiting for `http-get://localhost:4201/base/_readiness-check` because the realm-server now speaks HTTPS+HTTP/2 only. Wait-on's plain http-get probe never resolves against the https listener. Same fix for start-without-matrix.sh (dev convenience script used to bring up the stack without Synapse). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Card fixture data hardcoded http://localhost:4202 in adoptsFrom.module. With the realm-server now on HTTPS, the page is served over https and Chrome blocks mixed-content fetches of the http module URL. Flipping to https keeps the canonical realm URL consistent with the actual listener scheme. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Node's HTTP/2 compat layer marks server-side Http2Stream.writable=false for HEAD-method streams (the protocol forbids a body, so the stream is non-writable up front). Koa's ctx.writable getter delegates to res.socket.writable, so for HEAD over h2 it sees false and respond() short-circuits on `if (!ctx.writable) return`, so no headers are ever sent and the client hangs until its timeout.
Reproduced with bare curl against the realm-server (every HEAD over h2 timed out, GET worked) and with a 30-line koa + http2.createSecureServer minimal repro, so this is not realm- or browser-specific. The host test bundle's CachingDefinitionLookup.probeRemoteRealm HEAD probe was the visible symptom that surfaced this on host CI.
patchKoaResponseForH2Head() overrides Koa's response.writable prototype getter to recognise a healthy HEAD-over-h2 stream as writable. createListener applies it once when an h2 listener is constructed.
Also: add a forbidden-header filter in setContextResponse so realm responses don't try to forward hop-by-hop headers (connection, keep-alive, transfer-encoding, etc.) onto an h2 reply (defence in depth per RFC 9113 §8.2.2).
Test: new 'TLS h2 HEAD returns 200 without hanging' regression test in listener-dispatcher-test (would time out without the patch). Also register listener-dispatcher-test in tests/index.ts (it was never running) and fix a pre-existing this-binding bug in its cleanup that surfaced when the no-cert path started executing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
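
The forbidden-header filter can be sketched as below. The header list follows the connection-specific fields named in RFC 9113 §8.2.2; the function name `h2SafeHeaders` and the plain-object header shape are assumptions, not the code in `setContextResponse`.

```typescript
// Connection-specific header fields that must not appear in HTTP/2 messages.
const CONNECTION_SPECIFIC = new Set([
  'connection',
  'keep-alive',
  'proxy-connection',
  'transfer-encoding',
  'upgrade',
]);

// Return a copy of `headers` without connection-specific fields.
function h2SafeHeaders(headers: Record<string, string>): Record<string, string> {
  const safe: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    if (!CONNECTION_SPECIFIC.has(name.toLowerCase())) {
      safe[name] = value;
    }
  }
  return safe;
}
```

Filtering on the lowercased name keeps the check independent of how an upstream response happened to capitalize its headers.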
The realm-server-only and worker-base service tasks (used by the CI matrix tests workflow's start-server-and-test stack) still pointed base realm's --toUrl at http://localhost:4201/base/ even though the realm-server now binds HTTPS+h2 on 4201. Result: every request to /base/* was a registry miss (realm registered under http:// but incoming request is https://) so /base/_readiness-check returned 404, wait-on's 10-minute timeout fired, and the whole shard failed. Bring these two tasks in line with services/realm-server and services/worker by switching to the https:// form. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The two card-references assertions hardcode the expected sorted dep list with 'https://localhost:4202/test/person' pinned at position 0. That position was correct when the URL was 'http://localhost:4202/...' (http < http:// < https://, so http://localhost:4202 sorted before all http://localhost:4206 entries). After the canonical-URL flip to https in this branch, https://localhost:4202/test/person sorts AFTER all https://cardstack.com/base/* and BEFORE https://packages/* — moving the entry into that slot. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
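The position shift is easy to reproduce in isolation. The sketch below assumes the dependency list is sorted with JavaScript's default string comparator; the helper `sortedDeps` and the two neighbouring URLs are illustrative placeholders, not the real fixture data.

```typescript
// Hypothetical helper: copy the list, then sort with the default comparator,
// which compares strings by UTF-16 code units.
function sortedDeps(deps: string[]): string[] {
  return [...deps].sort();
}

// Placeholder neighbours standing in for the real dependency entries.
const otherDeps = ['https://packages/example', 'https://cardstack.com/base/card-api'];

const withHttp = sortedDeps([...otherDeps, 'http://localhost:4202/test/person']);
const withHttps = sortedDeps([...otherDeps, 'https://localhost:4202/test/person']);

// 'http:' sorts before 'https' because ':' (0x3A) < 's' (0x73).
console.log(withHttp.indexOf('http://localhost:4202/test/person')); // 0
// With the same scheme, 'cardstack' < 'localhost' < 'packages' decides.
console.log(withHttps.indexOf('https://localhost:4202/test/person')); // 1
```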
card-endpoints-test, types-endpoint-test, and module-syntax-test
hardcode card/module URLs at localhost:4202/node-test/. The HTTPS
flip in this branch makes the canonical address https://localhost:4202/,
so:
- types-endpoint-test asserts the returned card-type-summary `id`
against the http:// form; the realm returns the canonical https://
form, so deepEqual fails.
- card-endpoints-test posts `module: 'http://localhost:4202/.../friend'`
in the body; the realm tries to resolve a card type at that URL,
misses the canonical https:// entry in the module cache, and 500s.
- module-syntax-test passes the URLs to `new URL(...)` for relative-
path computation — purely string work, but flipping keeps the file
consistent with its siblings now that the realm speaks https.
Single search/replace across the three files.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ands off
Node's http2 compat layer surfaces pseudo-headers (`:method`, `:scheme`, `:path`, `:authority`) on `req.headers` alongside regular headers. koa-proxies / http-proxy forwards every header verbatim into `new http.ClientRequest(...)`, and Node rejects any name starting with `:` as `ERR_INVALID_HTTP_TOKEN`. Result on the h2 path: every proxied asset (notably `/auth-service-worker.js`) returns 500. The host bundle registers the service worker on every page load, so each matrix / host test refetches it and hits the same 500 on retries; shards churn for 30+ minutes burning the playwright retry budget.
Wrap the proxy middleware: delete pseudo-headers from `ctxt.req.headers` before delegating to the inner koa-proxies handler. The URL and method are already extracted from ctxt, so the upstream HTTP/1.1 request has everything it needs without the h2 metadata.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous commit deleted h2 pseudo-headers (`:method`, `:path`, …) directly off `req.headers` before delegating to koa-proxies. Node's http2 compat layer returns the *internal* headers map from the `req.headers` getter — the same map `req.method` and `req.url` read from — so deleting `:method` and `:path` nulled out req.method/req.url for every subsequent middleware. Koa's `ctx.path` getter (called by koa-proxies' route matcher) then threw "Cannot read properties of undefined (reading 'pathname')", every request 500'd, and every Host Tests shard fell over. Switch to a non-destructive shadow: define a `headers` value property on `ctxt.req` with the filtered copy for the inner proxy call, then delete it in a `finally` so the prototype getter is restored for the rest of the request lifecycle. Mutates nothing Node owns. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rder
The 2 previous attempts at h2-proofing koa-proxies both regressed
something:
- mutating `req.headers` to delete pseudo-headers clobbered Node's
internal headers map (the same map `req.method` / `req.url` read
from), turning every request into a 500 with "pathname undefined".
- shadowing `req.headers` with `Object.defineProperty` + restoring
via `delete` left the property missing for HTTP/1.1 requests (no
prototype getter to fall back to), which is also bad in subtle
downstream ways the realm-server boot did not survive.
The root issue is that http-proxy assigns `req.headers` straight onto
the `outgoing` options bag it hands to `http.ClientRequest`, and there
is no pre-construction hook to filter the headers. Replace the entire
koa-proxies + http-proxy stack with a hand-rolled forwarder: read URL
and headers from the Koa context, pick the headers we want to forward
(skipping `:`-prefixed pseudo-headers and `host`), issue an http.request
against the assets URL, stream the response back via `ctxt.body`. One
code path serves both h1 and h2 callers, no req.headers gymnastics.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
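The header-selection step of such a forwarder might look like the sketch below. It is not the actual `proxyAsset` code: `pickForwardHeaders` and `IncomingHeaders` are hypothetical names, and the only rules applied are the two stated above (skip `:`-prefixed pseudo-headers and `host`).

```typescript
type IncomingHeaders = Record<string, string | string[] | undefined>;

// Hypothetical helper: build a fresh header bag for the upstream HTTP/1.1
// request without mutating the incoming map that Node's http2 compat layer owns.
function pickForwardHeaders(
  incoming: IncomingHeaders,
): Record<string, string | string[]> {
  const outgoing: Record<string, string | string[]> = {};
  for (const name of Object.keys(incoming)) {
    const value = incoming[name];
    if (value === undefined) continue;
    if (name.startsWith(':')) continue; // h2 pseudo-header, invalid as an h1 token
    if (name.toLowerCase() === 'host') continue; // upstream sets its own Host
    outgoing[name] = value;
  }
  return outgoing;
}
```

The result can be passed as the `headers` option of an `http.request(...)` call, which is where the earlier attempts had no hook to intervene.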
The matrix test fixtures (playwright.config baseURL, helpers/index.ts
ports, the URL maps inside isolated-realm-server itself) all hardcode
`http://localhost:4205/…`. In CI, env-vars.sh exports
`REALM_SERVER_TLS_CERT_FILE` / `_KEY_FILE` for the parent dev stack
that speaks HTTPS+HTTP/2 on 4201 / 4202, and those env vars are
inherited by every child `spawn()` unless explicitly stripped. The
isolated realm-server therefore boots in HTTPS+h2 mode while its realm
registry is keyed on `http://localhost:4205/…` — every
`http://localhost:4205/{test,skills,base}/_mtimes` request from the
worker comes through the dispatcher's plain-HTTP redirect path, lands
on the HTTPS endpoint as a `_mtimes` lookup for the *https://* URL
(which isn't registered), and 404s. The matrix tests then hang waiting
for the page to render against an unindexable realm, blow through the
playwright timeout, and shards run for ~2 hours.
Spawn the prerender / worker-manager / realm-server child processes
with a process.env clone that has the two TLS env vars deleted, so the
isolated stack stays plain HTTP and matches the hardcoded URLs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
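A minimal sketch of the environment clone, assuming the two variables are `REALM_SERVER_TLS_CERT_FILE` and `REALM_SERVER_TLS_KEY_FILE` (the expansion of the `_KEY_FILE` shorthand used above). The helper name `envWithoutTls` is hypothetical.

```typescript
type Env = Record<string, string | undefined>;

const TLS_ENV_VARS = ['REALM_SERVER_TLS_CERT_FILE', 'REALM_SERVER_TLS_KEY_FILE'];

// Hypothetical helper: shallow-copy the parent environment and drop the TLS
// variables so a spawned child boots in plain-HTTP mode.
function envWithoutTls(parent: Env): Env {
  const child: Env = { ...parent };
  for (const name of TLS_ENV_VARS) {
    delete child[name];
  }
  return child;
}

// Intended use (not executed here):
//   spawn(command, args, { env: envWithoutTls(process.env) });
```

Cloning first matters: deleting the keys from `process.env` directly would also switch the parent process's view of the cert.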
`mkcert -install` is internally idempotent, but the sudo probe still prompts for a password on every invocation if it can't verify the root CA without root privileges. Inside `mise run dev-all`, that prompt flows through `start-server-and-test`'s child shells alongside the parallel server output — the prompt is essentially invisible and unsendable, and the whole dev stack collapses with a SIGTERM cascade when sudo times out. Stop trying to invoke `mkcert -install` from the task. Instead, check upfront whether mkcert's `rootCA.pem` is already present in both the system trust store and the user's `~/.pki/nssdb`, and exit fast with a clear message telling the dev to run `mkcert -install` once manually if either is missing. After the one-time setup this task is a fast no-op on every invocation, with no chance of stalling dev-all on a sudo prompt. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ensure-dev-cert already prints a clear "run `mkcert -install` once" message when the root CA isn't trusted, but inside `dev-all`'s parallel stack that message arrives as one of hundreds of buffered lines prefixed with `[start:development] [infra:ensure-dev-cert]`, intermixed with concurrent output from the other six services. The downstream cascade (vite dependency-scan restart, prerender / worker-manager teardown, 45-error rolldown traceback) buries the actual cause completely: a fresh dev hitting this sees a wall of plugin errors and no obvious "you need to run mkcert -install" hint.
Invoke `mise run infra:ensure-dev-cert` as the very first step of dev-all, before we even spawn the host app. The cert check runs in isolation and its error is the only thing on screen. If it passes, the inner `mise run` invocations that re-depend on it are fast no-ops.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…message
The first-time mkcert-install path now has a dedicated mise task,
`infra:trust-dev-cert`, that creates ~/.pki/nssdb and runs
`mkcert -install` (with its sudo prompt) interactively. The companion
`infra:ensure-dev-cert` task is now strictly read-only: it verifies the
mkcert root CA is already trusted in both the system store and the NSS
DB and exits 1 with a one-paragraph active-voice message if it isn't:
The mkcert dev root CA is not installed on this machine.
Run this once to install it (prompts for sudo):
mise run infra:trust-dev-cert
Then re-run the command you just ran.
Both `mise run dev` and `mise run dev-all` now invoke
`infra:ensure-dev-cert` upfront before spawning the parallel stack, so
that error is the first and only thing on screen instead of being
buried under hundreds of multiplexed lines of vite / start-server-and-
test output.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`dev` and `dev-all` now pass `BOXEL_DEV_INVOKED_AS` into the ensure-dev-cert invocation, and ensure-dev-cert substitutes that into the "Then re-run …" line. Three concrete variants:
- `mise run dev` → "Then re-run `mise run dev`."
- `mise run dev-all` → "Then re-run `mise run dev-all`."
- direct invocation → "Then re-run `mise run infra:ensure-dev-cert`."
The user no longer has to remember what they typed five lines of output ago.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
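The substitution can be sketched as follows. The real task is presumably a shell script; this TypeScript version only illustrates the logic, and it assumes `BOXEL_DEV_INVOKED_AS` carries the bare task name (`dev` or `dev-all`), which the commit message does not state. The function name `rerunHint` is hypothetical.

```typescript
// Hypothetical helper: build the final line of the error message.
// `invokedAs` stands for the value of BOXEL_DEV_INVOKED_AS (assumed to be a
// bare task name); when it is unset, the task was invoked directly.
function rerunHint(invokedAs: string | undefined): string {
  const task = invokedAs && invokedAs.length > 0 ? invokedAs : 'infra:ensure-dev-cert';
  return 'Then re-run `mise run ' + task + '`.';
}
```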
The realm-server speaks HTTPS+HTTP/2 on 4201/4202 in local dev, but
vite's dev server was still listening on plain http://localhost:4200.
Browsers visiting http://localhost:4200 then fired cross-origin
requests to http://localhost:4201, which the realm-server's dispatcher
301-redirects to https://. Chrome blocks redirects on CORS preflight
requests ("Redirect is not allowed for a preflight request"), so every
realm-server fetch from the host bundle failed.
When `REALM_SERVER_TLS_CERT_FILE` / `_KEY_FILE` are set (`env-vars.sh`
exports them whenever the mkcert leaf exists), vite now terminates TLS
using the same cert. Both vite dev (`pnpm start`) and vite preview
(`pnpm serve:dist`) pick this up via `server.https` / `preview.https`.
Knock-on changes:
- `env-vars.sh` flips `HOST_URL` to `https://localhost:4200` when the
cert is present, so the prerender's standby probe, the
realm-server's distURL asset rewriter, and the test-services
readiness URLs all stay scheme-consistent.
- `prerenderer.ts` falls back to `process.env.HOST_URL` (instead of
hardcoded http) so the prerender's BOXEL_HOST_URL default tracks
whatever the shell exported.
- `dev-all`'s host readiness loop and `start-host-dist.sh`'s
already-running probe pass `-k` to curl so the new HTTPS endpoint
is reachable even when the system trust store hasn't been
refreshed since the last `trust-dev-cert` run.
- The CI workflows (`ci.yaml`, `ci-software-factory.yaml`) flip
their post-`test-services` readiness curls to
`https://localhost:4200` with `-k`, matching the new scheme.
`trust-dev-cert` also gained a `certutil` precheck on Linux — without
libnss3-tools, `mkcert -install` only lands the root CA in
/etc/ssl/certs and Chromium (which reads NSS, not the system store)
still rejects the dev cert. Failing fast there with the apt/dnf
command is more useful than letting mkcert emit a buried warning.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
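One way to derive the `server.https` / `preview.https` value from the environment is sketched below. This is not the actual `vite.config.mjs`: `httpsOptionsFromEnv` is a hypothetical helper, the key variable name is the assumed expansion of `_KEY_FILE`, and the file reader is injected so the sketch runs without touching the filesystem.

```typescript
type Env = Record<string, string | undefined>;

interface HttpsOptions {
  cert: string;
  key: string;
}

// Hypothetical helper: returns TLS options when both env vars are set,
// otherwise undefined so the caller stays on plain HTTP.
function httpsOptionsFromEnv(
  env: Env,
  readFile: (path: string) => string,
): HttpsOptions | undefined {
  const certFile = env['REALM_SERVER_TLS_CERT_FILE'];
  const keyFile = env['REALM_SERVER_TLS_KEY_FILE'];
  if (!certFile || !keyFile) {
    return undefined;
  }
  return { cert: readFile(certFile), key: readFile(keyFile) };
}

// Intended use in a vite config (not executed here):
//   const https = httpsOptionsFromEnv(process.env, (p) => readFileSync(p, 'utf8'));
//   export default defineConfig({ server: { https }, preview: { https } });
```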
`1779100257124_canonical-url-http-to-https` rewrites realm-server's
postgres state but doesn't touch synapse. Every logged-in dev / user
on a stack that boots after the HTTPS+h2 flip keeps reading
`http://localhost:4201/...` from their `app.boxel.realms`
account_data — the host bundle's first realm fetch then hits the
realm-server's dispatcher, which 301-redirects to https://, and the
browser blocks the CORS preflight with "Redirect is not allowed for
a preflight request." Every realm fetch fails until the user clears
localStorage AND someone rewrites the account_data.
New script: `packages/matrix/scripts/migrate-account-data-http-to-https.ts`.
Logs in as admin, paginates `/_synapse/admin/v2/users`, impersonates
each user via `/_synapse/admin/v1/users/{id}/login` to obtain a per-
user token (the standard `account_data` PUT endpoint requires the
user's own token — admin can read but not write other users'), reads
`app.boxel.realms`, rewrites the two localhost prefixes
(`http://localhost:4201/`, `http://localhost:4202/`) to https://, and
PUTs the new list back. Skips users with no realms set, users where
no URL needed rewriting, and the admin user itself (synapse refuses
self-impersonation). Safe to re-run.
Wired via:
- `pnpm migrate-account-data-http-to-https` (packages/matrix)
- `mise run infra:migrate-matrix-account-data-http-to-https`
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
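The rewrite step at the core of the script can be sketched as a pure function. It assumes the realm list inside `app.boxel.realms` is an array of URL strings, which the commit message implies but does not spell out; `rewriteRealmUrls` is a hypothetical name, and the Synapse calls are omitted.

```typescript
const HTTP_PREFIXES = ['http://localhost:4201/', 'http://localhost:4202/'];

// Hypothetical helper: rewrite the two localhost prefixes to https:// and
// report whether anything changed, so unchanged users can be skipped.
function rewriteRealmUrls(realms: string[]): { realms: string[]; changed: boolean } {
  let changed = false;
  const rewritten = realms.map((url) => {
    for (const prefix of HTTP_PREFIXES) {
      if (url.startsWith(prefix)) {
        changed = true;
        return 'https://' + url.slice('http://'.length);
      }
    }
    return url; // other hosts and already-https entries pass through
  });
  return { realms: rewritten, changed };
}
```

Because already-rewritten URLs no longer match either prefix, a second run reports `changed: false`, which is what makes the script safe to re-run.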
Vite serves HTTPS on :4200 in local dev (mkcert leaf). Typing
`http://localhost:4200/foo` into the browser is a common reflex and
currently hangs / `ERR_CONNECTION_REFUSED`. Add a tiny TCP dispatcher
that peeks the first byte of every incoming connection — same pattern
the realm-server uses on :4201:
- TLS ClientHello (0x16) → forward raw bytes to vite at an internal
loopback port so vite still terminates TLS itself with the cert
it loaded in vite.config.mjs.
- Anything else (an HTTP verb) → parse the request-target out of
the start-line and reply 301 to `https://localhost:4200<target>`.
Activated only when `REALM_SERVER_TLS_CERT_FILE` is set (the same
signal `vite.config.mjs` uses to enable `server.https`). Environment
mode (BOXEL_ENVIRONMENT) keeps its existing Traefik path untouched —
the redirect there is the proxy's job, not ours.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
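The two decisions the dispatcher makes can be sketched without any sockets. The function names are hypothetical, and the start-line parsing is deliberately simplified (single spaces, origin-form targets only); the real dispatcher may be more lenient.

```typescript
// 0x16 is the TLS record content type for a handshake, which is what a
// ClientHello arrives in.
const TLS_HANDSHAKE_RECORD = 0x16;

// Hypothetical helper: classify a connection by its first byte.
function isTlsClientHello(firstByte: number): boolean {
  return firstByte === TLS_HANDSHAKE_RECORD;
}

// Hypothetical helper: extract the request-target from an HTTP/1.x start-line
// such as "GET /foo?x=1 HTTP/1.1" and build the redirect Location.
function redirectLocation(startLine: string, httpsOrigin: string): string | undefined {
  const parts = startLine.trim().split(' ');
  if (parts.length !== 3) return undefined;
  const target = parts[1];
  if (!target.startsWith('/')) return undefined; // only origin-form targets
  return httpsOrigin + target;
}
```

For a plain-HTTP request to `/foo`, `redirectLocation('GET /foo HTTP/1.1', 'https://localhost:4200')` yields the `https://localhost:4200/foo` value for the 301's `Location` header.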
…ime out
Vite is lazy — modules only get bundled when something requests them.
The `wait-for-host-standby` probe is supposed to be that something:
puppeteer navigates to `/_standby` and waits for `#standby-ready` to
appear, which forces vite to optimize the entire host bundle before
the prerender server starts. After this commit the probe was hitting
the wrong scheme:
- The probe fell back to `http://localhost:4200/_standby` whenever
`HOST_URL` env wasn't already https. With the new vite dispatcher
that 301-redirects to https://, the probe's puppeteer was
bouncing through a redirect to a cert it didn't trust and erroring
out on every retry. Vite was never actually hit, so its optimizer
never warmed.
- The prerender server then booted, opened its own chrome (which
*does* have `--ignore-certificate-errors` via BrowserManager),
navigated to `https://localhost:4200/_standby` for the first
standby creation — and got vite's cold optimizer plus
~1000 module fetches. Even over HTTP/2 the cold path runs >30s,
blowing the page-pool's hard-coded standby navigation budget.
Three changes:
- `wait-for-host-standby.ts` defaults to `https://localhost:4200`
when `REALM_SERVER_TLS_CERT_FILE` / `_KEY_FILE` are set, so it
matches vite's actual scheme even if `HOST_URL` hasn't been
re-exported in the dev's current shell.
- The probe's puppeteer now passes `--ignore-certificate-errors`
when the URL is HTTPS, matching the prerender's BrowserManager.
- `PRERENDER_STANDBY_TIMEOUT_MS` is now configurable on the PagePool
constructor (env override). The dev prerender mise task defaults
it to 120000ms when BOXEL_HOST_URL is HTTPS — gives the cold-vite
first navigation real headroom. Production / hosted runners keep
the 30s default unless they opt in.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
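The env override for the standby budget could be resolved as in the sketch below. The fallback behaviour for unparsable values is an assumption, and `standbyTimeoutMs` is a hypothetical name rather than the PagePool API.

```typescript
type Env = Record<string, string | undefined>;

// The 30s default stated in the commit message.
const DEFAULT_STANDBY_TIMEOUT_MS = 30000;

// Hypothetical helper: read PRERENDER_STANDBY_TIMEOUT_MS, falling back to the
// default when it is unset, non-numeric, or not positive (assumed policy).
function standbyTimeoutMs(env: Env): number {
  const raw = env['PRERENDER_STANDBY_TIMEOUT_MS'];
  const parsed = raw === undefined ? NaN : Number(raw);
  return isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_STANDBY_TIMEOUT_MS;
}
```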
…e repo
Vite now terminates TLS on :4200 with the same mkcert leaf the
realm-server uses, so the canonical local-dev host URL is
`https://localhost:4200`. Sweep every place that still bakes in the
old http:// form:
- Defaults: `env-vars.sh` (HOST_URL — both the standard-mode reset
branch and the fresh-shell default), `mise-tasks/services/prerender`
(DEFAULT_HOST_URL), `start-host-dist.sh` (HOST_URL fallback),
`prerenderer.ts` (defaultHostURL), `main.ts` (distURL default),
`wait-for-host-standby.ts` (fallback default).
- Docs: top-level QUICKSTART, AGENTS, README; per-package READMEs
for host, boxel-homepage-realm, ai-bot, software-factory; the
host live-tests / HEAP_PROBE notes; the indexing-diagnostics and
host-test-memory-leak-hunting Claude skills; the
commands-in-headless-chrome doc.
- The dev synapse `client_base_url` (email-redirect base) flips so
matrix registration emails point at the right scheme.
- README's "view a realm's app" paragraph also rewritten: vite and
realm-server both speak HTTPS+HTTP/2 now, so there's no more
mixed-content caveat.
- Drop the now-redundant `HOST_URL=https://...` override inside
`env-vars.sh`'s cert-detection block — the unconditional default
above already sets the right value, and the comment that called
out the http/https mixing is no longer true.
Kept as http://: in-process test fixtures (realm-server tests strip
TLS env vars; their realm-server runs plain HTTP at 4444/4444+),
matrix isolated-realm-server tests, workspace-sync-cli test helpers,
and a few comments / explanatory references that intentionally cite
the old form ("…now lands on https://", "blob:http://localhost:4200/…"
example URL).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The dev stack's prerender + `wait-for-host-standby` both run puppeteer against `https://localhost:4200/_standby` and re-trigger vite's optimizer to bundle ~1300 modules. Chrome 143 (the version bundled with puppeteer 24.35) hangs forever fetching one of those modules — specifically the large pre-optimized matrix-js-sdk chunk (`indexeddb-crypto-store-*.js` ~6 MB) — apparently because of an h2 stream-window bug. curl pulls the same URL over h2 in 100ms; system Chrome 148 fetches it in seconds; chrome 143 stalls. Both `BrowserManager` (prerender) and `wait-for-host-standby.ts` already prefer `PUPPETEER_EXECUTABLE_PATH` when set. Make env-vars.sh auto-discover a system chrome / chromium / chromium-browser binary and export the env var, so the standard dev path picks up the fixed chrome without anyone having to set it manually. Devs who haven't installed google-chrome locally keep the bundled puppeteer binary — they'll see the standby probe stall longer, but only until vite's optimizer cache warms. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tack/boxel into worktree-cs-11114-http2-v2 # Conflicts: # packages/realm-server/prerender/prerenderer.ts
packages/host/vite.config.mjs reads REALM_SERVER_TLS_CERT_FILE / _KEY_FILE and, when set, terminates TLS in vite preview too. The harness uses dynamic ports and probes readiness via plain http://localhost:<port>/, then hands that same http URL to its spawned realm-server via HOST_URL. With the dev stack's TLS env vars inherited, vite preview would come up on HTTPS, the readiness fetch would hang, and every downstream HOST_URL fetch from the spawned realm-server would land on an HTTPS server keyed under the http:// origin. Same pattern as the matrix isolated-realm-server fix (12b7fbc) — strip the two TLS env vars from the spawn() env so the dynamic-port harness stack stays plain HTTP end-to-end, regardless of whether the surrounding dev env has the cert configured. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two HTTPS-related regressions surfaced after the same-port http→https dispatcher landed:
1. The dispatcher peeked the first byte of every incoming connection, socket.unshift()'d it back, and then socket.pipe()'d to vite. The unshift+pipe pattern races the upstream socket's connect handshake: the rest of the ClientHello arrives and writes to the upstream socket before the unshifted byte gets flushed, leaving vite with a corrupt handshake and the client with net::ERR_CONNECTION_CLOSED. Switch to the more deterministic pattern: do not unshift, instead write the peeked byte explicitly on the upstream's 'connect' event, then pipe for the remainder.
2. wait-on (in-process inside start-server-and-test) uses bundled axios that does not pick up NODE_EXTRA_CA_CERTS reliably on CI runners. The readiness probes against https://localhost:42XX therefore time out even though env-vars.sh exports NODE_EXTRA_CA_CERTS pointing at mkcert's root CA. Disable TLS validation only for the probe (via NODE_TLS_REJECT_UNAUTHORIZED=0 scoped to the wait-on invocation); the services under test still present and validate the real cert.
Also fix mise-tasks/ci/cache-index to support https REALM_BASE_URL: it was stripping `http://` only and hardcoding `http-get://`, which produced malformed wait-on URLs when REALM_BASE_URL was https.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
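The cache-index fix amounts to mapping the URL scheme onto the matching wait-on resource prefix instead of hardcoding one. The real task is a shell script; the sketch below shows the same mapping in TypeScript, and `toWaitOnResource` is a hypothetical name.

```typescript
// Hypothetical helper: convert a realm URL into a wait-on resource that
// forces a GET probe over the matching scheme.
function toWaitOnResource(url: string): string {
  if (url.startsWith('https://')) {
    return 'https-get://' + url.slice('https://'.length);
  }
  if (url.startsWith('http://')) {
    return 'http-get://' + url.slice('http://'.length);
  }
  throw new Error('unsupported scheme in ' + url);
}
```

Stripping only `http://` from an `https://` URL leaves the scheme in place, so prepending `http-get://` produces the malformed resource described above.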
…only
Two follow-ups after the dispatcher fix didn't unstick CI:
1. realm-server `main.ts` defaulted `--serverURL` to `http://localhost:${port}`
when no flag was passed (the mise tasks don't pass one). The default
became the `realmServerURL` JWT claim, so even after rotating the
realm-server to HTTPS the JWTs it minted still embedded
`realmServerURL: http://localhost:4201/`. The host's
assertOwnRealmServer then compared that to its own canonical
`https://localhost:4201/` and threw
"Multi-realm server support is not yet implemented: don't know how
to provide auth token for different realm servers", blanking every
index card. Hardcode the default to `https://localhost:${port}` —
the local dev stack requires the mkcert leaf (see
infra:ensure-dev-cert) and there's no scenario where a missing cert
should silently flip the canonical claim back to http.
2. CI software-factory job had a Serve-test-assets step that started
host-dist on :4200 even though SF Playwright tests use the
realm-test-harness, which is hermetic and brings up its own host on
dynamic ports (see packages/software-factory/docs/testing-strategy.md).
The bind was both pointless and an active foot-gun — colliding with
harness ports and masking host-bring-up regressions. Replace with
`services:icons` alone (the only external service the harness
actually consumes via ICONS_URL).
Also switch wait-on's TLS escape hatch from NODE_TLS_REJECT_UNAUTHORIZED
to START_SERVER_AND_TEST_INSECURE=1 — start-server-and-test passes
`strictSSL: !isInsecure()` into wait-on's options, which overrides
the global env var.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…REATE error
The migration CI job's apply-migrations step failed with `error: database "boxel" does not exist` immediately after the script logged `created database boxel`. Root cause was two-part:
1. `docker exec boxel-pg psql -U postgres ...` defaulted to the unix socket at `/var/run/postgresql/.s.PGSQL.5432` inside the container. postgres:16.3 doesn't always create that directory, so both the `-lqt` lookup and the `CREATE DATABASE` call failed with `connection to server on socket ... No such file or directory`.
2. The script had no `set -e`, so `CREATE DATABASE` failing silently fell through to the `echo "created database $PGDATABASE"` line. The migrate step then tried to connect to a non-existent database over TCP and crashed.
Fix: pass `-h localhost -p 5432` to `psql` and `pg_isready` so the in-container calls always use TCP (which postgres listens on regardless of socket availability), and add `set -e` so a CREATE DATABASE failure exits non-zero instead of fabricating a success log line.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…eview
The byte-peek + cross-process TCP pipe pattern that the dispatcher uses races chrome's TLS handshake on the CI runners: every prerender probe to https://localhost:4200/_standby gets net::ERR_CONNECTION_CLOSED while curl against the same port from a parallel shell succeeds. Symptom of an ALPN/h2 framing issue inside the pipe (TLS termination is at vite, but Node's raw socket.pipe between two processes apparently mangles enough of the handshake that chrome's stricter parser bails).
The dispatcher's only real value is `vite` (dev) UX, where a human types `http://localhost:4200` in a browser bar and expects a 301 to https. `vite preview` is used by CI and `serve:dist`; there's no browser bar there, so bind vite preview directly to the public port with HTTPS and skip the dispatcher. Local dev's `vite` path is unchanged: it still gets the dispatcher and the http→https redirect.
Also tighten ci/serve-test-assets's wait-on probe: use `https-get://` to force GET (start-server-and-test's default `https://` resolves to HEAD, which vite preview behind HTTP/2 doesn't reliably answer in CI).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Postgres Migration CI job validates that every migration is
reversible via up → down → up. The canonical-url-http-to-https
migration's down was a no-op, which broke that contract. Make the
down symmetric to the up:
- packages/postgres/migrations/1779100257124_canonical-url-http-to-https.js:
extract the rewrite SQL into a `rewriteBlock({ oldScheme, newScheme })`
helper. `up` calls it http→https; `down` calls it https→http. Same
`realm_user_permissions` pre-check on the source scheme, so staging
/ production (real hostnames, never `localhost`) is a no-op either
direction.
- packages/matrix/scripts/migrate-account-data-http-to-https.ts: add a
`--reverse` CLI flag that flips the URL prefix rewrite. Companion
pnpm script `migrate-account-data-https-to-http` and mise task
`infra:migrate-matrix-account-data-https-to-http` invoke it.
- PR description: add a "Rolling back" section pointing users at the
three-step reverse path (postgres down, matrix reverse, localStorage
clear).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…low-insecure-localhost
Chrome 144+ silently demotes `--ignore-certificate-errors` to a dev-only flag and won't accept self-signed certs unless it's paired with `--allow-insecure-localhost`. Without that pairing, every TLS connection to https://localhost:4200 from puppeteer's chrome terminates the handshake with ERR_CONNECTION_CLOSED, which is what was blocking the prerender's wait-for-host-standby in CI (and, downstream, every Host / Matrix test job because realm-server boot depends on prerender being ready). curl over the same URL worked fine, hiding the cert trust nature of the problem under what looked like a generic TCP close.
Pair the flags in both the prerender's BrowserManager and the standby-warmup script (scripts/wait-for-host-standby.ts).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
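A sketch of how the paired flags might be selected, assuming they are only added when the target URL is HTTPS (as the earlier standby-probe commit does for `--ignore-certificate-errors`). `chromeTlsArgs` is a hypothetical helper, not the BrowserManager API.

```typescript
// Hypothetical helper: extra Chrome launch args for a given target URL.
// Both flags are returned together so they cannot drift apart.
function chromeTlsArgs(targetUrl: string): string[] {
  if (!targetUrl.startsWith('https://')) {
    return [];
  }
  return ['--ignore-certificate-errors', '--allow-insecure-localhost'];
}

// Intended use (not executed here):
//   await puppeteer.launch({ args: [...baseArgs, ...chromeTlsArgs(hostURL)] });
```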
The local stack is now HTTPS+HTTP/2 end-to-end (realm-server on `:4201`, vite on `:4200`). Three one-time manual steps are required before `mise run dev` / `mise run dev-all` will work; skipping any of them leaves the browser stuck in CORS / mixed-content / cert errors that look like the app is broken.
1. Trust the dev TLS cert (one-time, prompts sudo)
The realm-server and vite both terminate TLS with a mkcert leaf. Chromium / Firefox read the system NSS DB (`~/.pki/nssdb` on Linux), not `/etc/ssl/certs`, so `libnss3-tools` is mandatory on Linux for the browser to actually trust the cert.
Install the tools and run the trust task:
`mise run infra:trust-dev-cert` runs `mkcert -install` (sudo prompt) and seeds `~/.pki/nssdb` so Chromium picks up the root CA. `mise run dev` / `dev-all` invoke `infra:ensure-dev-cert` as a preflight that fails fast with a copy-paste message if you skipped this. CI does the equivalent via passwordless sudo in `.github/actions/init`, so devs don't need to think about it there.
2. Migrate your matrix account_data (one-time)
Every Boxel user keeps their realm workspace list in matrix `app.boxel.realms` account_data. Existing entries still point at `http://localhost:42XX/...`. The realm-server's HTTP→HTTPS dispatcher 301-redirects those requests, but CORS preflight forbids following redirects, so the host bundle's first realm fetch will fail with a "Redirect is not allowed for a preflight request" error.
Run the migration once after `mise run infra:trust-dev-cert`, via `mise run infra:migrate-matrix-account-data-http-to-https` (or `pnpm migrate-account-data-http-to-https` in packages/matrix).
The script logs in as the local synapse admin, walks every user, impersonates each via `/_synapse/admin/v1/users/<id>/login`, reads `app.boxel.realms`, rewrites every `http://localhost:4201/...` / `http://localhost:4202/...` to `https://...`, and PUTs the result back. Safe to re-run: users already on https are skipped.
3. Clear your browser's localStorage
The host bundle caches realm-scoped JWTs and session metadata keyed by realm URL. Those entries are still keyed on the old `http://localhost:42XX/` form, and there is no server-side migration path; the only way to evict them is to clear localStorage in your browser.
After that, log in fresh; new tokens will be keyed under the https origin.
When in doubt: full-reindex a realm
If something still looks broken after the three steps above (stale rows, missing types, a card that won't render), kick off a from-scratch index for the affected realm. The realm-server exposes the same `_grafana-reindex` endpoint we use from Grafana in production. In local dev the shared secret is literally `shhh! it's a secret`.
A request to that endpoint blocks until the from-scratch job finishes and prints the result stats. Swap the `realm=` query param for whichever realm you need (`/base/`, `/test/`, `/user/<your-realm>/`, etc.); the path is relative to the realm-server origin.
And when even that doesn't help, the nuclear option is still available. It drops and recreates the Postgres DB, clears dynamic realms, and restarts matrix, i.e. you start over from a clean slate.
Rolling back
If you ever need to go back to the previous http-canonical state (e.g. you're bisecting against this branch, or you `pnpm migrate down` the postgres migration), the rewrite is symmetric in both directions: run the postgres migration's `down`, run the reverse matrix migration (`mise run infra:migrate-matrix-account-data-https-to-http`), and clear localStorage.
Both migrations gate on `realm_user_permissions` containing localhost canonicals in the relevant scheme, so they're no-ops on staging / production.
Auto-applied: postgres data migration
The first `mise run dev` after pulling runs a Postgres migration (`1779100257124_canonical-url-http-to-https.js`) that rewrites every text/varchar/jsonb column on every public table from `http://localhost:42XX/…` to `https://localhost:42XX/…` in place: index rows, realm registry, permissions, JSONB documents inside `pristine_doc` / `search_doc` / etc. The migration is idempotent and gated on a cheap `realm_registry` pre-check, so re-runs and production environments are no-ops.
If you have stale `http://localhost:42XX/…` URLs in personal-realm card `.json` files (in `realms/localhost_4201/**`), the dispatcher's 301-redirect resolves them at runtime so cards still work; no on-disk rewrite is required. To clean the data anyway:
Navigation
Visit `https://localhost:4200/` (vite host) as the manual-browser entry point. Both vite and the realm-server speak HTTPS+HTTP/2 now, so the host bundle's realm fetches multiplex over a single h2 connection, with no mixed-content warnings.

Summary
Local dev's realm-server now speaks HTTPS+HTTP/2 on a single canonical origin (`https://localhost:4201`, plus `https://localhost:4202` for test-realms). This unblocks the heavy aggregator-card prerender bottleneck described in CS-11114: cohort and dashboard renders today fan out 80+ federated-search requests inside one Chromium tab, get throttled by Chrome's HTTP/1.1 6-per-origin connection ceiling, and take minutes; HTTP/2 multiplexes them over one connection and the same render finishes in seconds.

Following @lukemelia's suggestion in #4787, this PR ships the single-origin design rather than the dual-listen alternative that #4787 had been carrying. There is no separate h2 alias port, no per-page `__realmH2OriginMappings__` injection, and no alias-host rewrite middleware; the wire protocol and the canonical realm URL agree.

Design
Realm-server: same-port HTTPS+HTTP/2 dispatcher
- `RealmServer.listen(port)`: when `REALM_SERVER_TLS_CERT_FILE` / `_KEY_FILE` are set, binds a single `net.Server` that peeks the first byte of every connection. `0x16` (TLS ClientHello) routes to an `http2.createSecureServer` for h2; anything else routes to a plain `http.Server` that 301-redirects to `https://<host><path>`. Same listener, no extra port. When the cert is absent (in-process test fixtures), it falls back to plain `http.createServer`, which is unchanged behavior.
- `closeAllConnections()` lets shutdown force-close in-flight TLS / HTTP/2 / keep-alive sessions rather than waiting for peers. `main.ts`'s existing typeof feature-detect picks it up unchanged.
- `readFileSync` + `createSecureServer` are wrapped in try/catch so a malformed cert downgrades to plain HTTP with a warning rather than killing boot.
- `patchKoaResponseForH2Head()` on the Koa response prototype: Node's http2 compat layer leaves `Http2Stream.writable === false` on HEAD streams, which short-circuits Koa's `respond()` and hangs every HEAD request indefinitely. The patch returns `true` for HEAD streams so the response actually flushes.
- `middleware/index.ts`:
  - `fullRequestURL` detects `ctx.req.socket.encrypted` for the scheme and falls back to the HTTP/2 `:authority` pseudo-header when `headers.host` is absent, so URL-keyed realm lookup matches the HTTPS canonical.
  - `fetchRequestFromContext` strips `:`-prefixed pseudo-headers before constructing `new Request(...)`, because WHATWG `Headers` rejects them.
  - `setContextResponse` filters HTTP/1-only `connection` / `keep-alive` / `transfer-encoding` / `upgrade` / `proxy-connection` / `http2-settings` response headers that Node's h2 compat layer would otherwise reject.
- `proxyAsset` was reimplemented as a hand-rolled forwarder (replacing `koa-proxies` + `http-proxy`) so pseudo-headers and the request `host` get filtered before the upstream call; `http-proxy.setHeader(':path', …)` throws `ERR_INVALID_HTTP_TOKEN`.
- `main.ts` defaults `--serverURL` to `https://localhost:${port}` (was `http://`). The realm-server stamps `serverURL` into the `realmServerURL` claim of every JWT it mints, so an http default leaks into tokens and the host's `assertOwnRealmServer` rejects them as a "different realm server".
- The puppeteer launchers (`prerender/browser-manager.ts`, `scripts/wait-for-host-standby.ts`) pass `--ignore-certificate-errors` when `BOXEL_HOST_URL` / `REALM_BASE_URL` is https.

Vite host (`packages/host`)

- `vite.config.mjs` reads `REALM_SERVER_TLS_CERT_FILE` / `_KEY_FILE` and sets both `server.https` (dev) and `preview.https` (built) so vite terminates TLS on `:4200`. Browsers refuse HTTP/2 over cleartext, so vite has to speak HTTPS for the h2 connection-pool win to apply on the host origin.
- `packages/host/scripts/vite-with-traefik.js` adds a same-port http→https redirect dispatcher for `vite` (dev) only: vite binds an internal port, the dispatcher owns `:4200`, peeks the first byte, and either pipes raw bytes to vite (TLS) or 301-redirects (plain HTTP). `vite preview` skips the dispatcher and binds `:4200` directly (the byte-peek + cross-process TCP pipe pattern doesn't survive Chrome's TLS+h2 handshake under load in CI; preview doesn't need browser-bar UX anyway).
- `config/environment.js` defaults flip to `https://localhost:4201` for `realmServerURL` / `baseRealmURL` / `catalogRealmURL` / `legacyCatalogRealmURL` / `skillsRealmURL` / `openRouterRealmURL`.

Mise tasks and env-vars
- `mise run infra:ensure-dev-cert` provisions the mkcert leaf at `$HOME/.local/share/boxel/dev-certs/`. Idempotent; auto-runs `infra:trust-dev-cert` when passwordless sudo is available (CI), otherwise fails fast with a copy-paste install message. `mise run dev` / `dev-all` invoke it as a preflight.
- `mise run infra:trust-dev-cert` runs `mkcert -install` and (on Linux) verifies `libnss3-tools` is installed so Chromium picks up the root CA from `~/.pki/nssdb`.
- `mise-tasks/lib/env-vars.sh`: defaults `REALM_BASE_URL` / `REALM_TEST_URL` / `HOST_URL` to https; exports `REALM_SERVER_TLS_CERT_FILE` / `_KEY_FILE` + `NODE_EXTRA_CA_CERTS` when the mkcert leaf is present. Also auto-detects system chrome (`/usr/bin/google-chrome`, `Chromium.app`, etc.) and sets `PUPPETEER_EXECUTABLE_PATH`, because puppeteer's bundled Chrome 143 has an h2 stream-window bug that hangs the prerender on a cold vite optimizer; Chrome 148+ is fine.
- `services/{realm-server,realm-server-base,worker-base,prerender,test-realms}` were updated to use the https canonical URLs.
- `helpers/isolated-realm-server.ts` and `realm-test-harness/src/{support-services,isolated-realm-stack}.ts` strip `REALM_SERVER_TLS_CERT_FILE` / `_KEY_FILE` before spawning child processes, so the matrix-isolated and software-factory stacks stay plain HTTP on their dynamic ports regardless of the outer dev env.

CI
- `.github/actions/init/action.yml` installs mkcert + `libnss3-tools` via apt and runs `mise run infra:ensure-dev-cert` so realm-servers in CI come up HTTPS+HTTP/2 the same as local.
- `tests/index.ts` (realm-server test bootstrap) deletes the TLS env vars before any in-process fixture realm-server is spun up; supertest connects plain HTTP to those fixtures on random `127.0.0.1:444X` ports.
- Readiness probes against `https://localhost:42XX` (in `ci/serve-test-assets`, `ci/cache-index`, `test-services/{host,realm-server,matrix}`) use `https-get://` (start-server-and-test's default `https://` probe is HEAD, which vite preview behind h2 doesn't reliably answer) and pass `START_SERVER_AND_TEST_INSECURE=1` to disable wait-on's strictSSL check.
- `ci-software-factory.yaml` no longer starts host-dist on `:4200`; the realm-test-harness is hermetic and brings up its own vite preview on dynamic ports. Only `services:icons` (port 4206) is started externally.
- `packages/postgres/scripts/ensure-db-exists.sh` forces `-h localhost -p 5432` (TCP) inside `docker exec`, since the postgres:16.3 image doesn't reliably create `/var/run/postgresql/.s.PGSQL.5432`. `set -e` makes a failed `CREATE DATABASE` actually exit non-zero instead of fabricating a success line.

Data migrations
Two migrations cover the http→https flip and are both reversible:
Postgres (`packages/postgres/migrations/1779100257124_canonical-url-http-to-https.js`):

- Scans `information_schema.columns` for every text/varchar/jsonb column on every public table (excludes `modules`, `pgmigrations` / `migrations`, generated columns).
- `REPLACE(...)`-based `UPDATE`s for `http://localhost:4201` → `https://localhost:4201` and `http://localhost:4202` → `https://localhost:4202`.
- A `WHERE` filter restricts the touch set to rows that still contain the old URL, so the rewrite is idempotent.
- `down` is symmetric (https → http), with the same `realm_user_permissions` pre-check on the source scheme.
- Gated on `realm_user_permissions` containing localhost URLs, so production / staging (real hostnames, never `localhost`) is a no-op either way.
- `mise run dev` passes `--migrateDB` to the realm-server, so the migration fires on the first post-pull boot.

Matrix account_data (`packages/matrix/scripts/migrate-account-data-http-to-https.ts`):

- Rewrites `app.boxel.realms` account_data entries from `http://localhost:42XX/...` to `https://...`, then PUTs them back.
- `--reverse` flips the direction (`pnpm migrate-account-data-https-to-http` / `mise run infra:migrate-matrix-account-data-https-to-http`) for symmetry with the postgres migrate-down.

Tests
- `packages/realm-server/tests/listener-dispatcher-test.ts` covers the dispatcher: TLS h2, ALPN HTTP/1.1 fallback, TLS h2 HEAD (the patched-`writable` path), plain-HTTP 301, no-Host-header raw-socket path, malformed-cert downgrade, and no-cert-env-vars plain HTTP.
- Tests that reference port 4202 (`card-endpoints-test.ts`, `types-endpoint-test.ts`, `module-syntax-test.ts`) were updated to https.
- `realm-indexing-test.gts` and `realm-test.gts` were updated for the new https canonical and the alphanumeric URL sort order that follows from it.

Test plan
- `mise run infra:ensure-dev-cert` succeeds with mkcert installed; emits clean install hints + exits 1 when missing.
- `curl -kI --http2 https://localhost:4201/_alive` returns `HTTP/2 200`.
- `curl -kI --http1.1 https://localhost:4201/_alive` returns `HTTP/1.1 200` (ALPN fallback for h1 clients).
- `curl -sI http://localhost:4201/_alive` returns `HTTP/1.1 301` with `Location: https://localhost:4201/_alive`.
- `curl -skI -X HEAD --http2 https://localhost:4201/_alive` returns `HTTP/2 200` (HEAD over h2 doesn't hang).
- `mise run dev` runs the URL-rewrite migration → realm-server boots clean on https.
- `pnpm --filter @cardstack/postgres migrate down 1 && pnpm --filter @cardstack/postgres migrate up` round-trips cleanly (the Postgres Migration CI job validates this).
- `curl -X POST -H "Authorization: Bearer shhh! it's a secret" 'https://localhost:4201/_grafana-reindex?realm=/base/'` completes without errors.
- After `mise run infra:trust-dev-cert` + matrix account_data migration + localStorage clear, open `https://localhost:4200/`, log in, click a workspace: index card populates and DevTools shows `h2` for realm fetches.
- `mise run dev` shutdown closes the listener cleanly.
- `pnpm lint` passes on `packages/{realm-server,host,matrix,postgres,realm-test-harness}` (`lint:js` + prettier; pre-existing `lint:types` errors in `../base/*.gts` are unrelated).

Closes #4787 (dual-listen approach abandoned in favor of this single-origin design).
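As a rough illustration of the first-byte dispatch described in the Design section, the routing decision and socket hand-off might look like the sketch below. This is not the PR's implementation: `classifyFirstByte`, `redirectLocation`, and `createByteDispatcher` are hypothetical names, and the hand-off via `emit('connection', socket)` is one common Node pattern assumed here for brevity.

```typescript
import * as net from 'node:net';

// First byte of a TLS record that carries a ClientHello (handshake content type).
const TLS_HANDSHAKE_BYTE = 0x16;

type Route = 'tls' | 'plain';

// Decide which backend should own a connection based on its first byte.
function classifyFirstByte(byte: number): Route {
  return byte === TLS_HANDSHAKE_BYTE ? 'tls' : 'plain';
}

// Target of the 301 issued to plain-HTTP clients.
function redirectLocation(host: string, path: string): string {
  return `https://${host}${path}`;
}

// Hypothetical wiring: `tlsServer` would be an http2.createSecureServer(...)
// instance and `plainServer` an http.Server that answers with the 301.
// Neither backend calls listen(); only the returned dispatcher owns the port.
function createByteDispatcher(
  tlsServer: net.Server,
  plainServer: net.Server,
): net.Server {
  return net.createServer((socket) => {
    socket.once('error', () => socket.destroy());
    socket.once('data', (chunk) => {
      // Stop reading, put the peeked bytes back, then hand the socket over
      // so the chosen backend sees the stream from its very first byte.
      socket.pause();
      socket.unshift(chunk);
      const route = classifyFirstByte(chunk.readUInt8(0));
      const target = route === 'tls' ? tlsServer : plainServer;
      target.emit('connection', socket);
      process.nextTick(() => socket.resume());
    });
  });
}
```

Keeping the classification and redirect target as pure functions makes them easy to unit-test without opening a socket; the dispatcher itself only needs `listen(port)` on the returned server.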
🤖 Generated with Claude Code