
realm-server: HTTPS+HTTP/2 in local dev#4797

Draft
habdelra wants to merge 59 commits into main from worktree-cs-11114-http2-v2

Conversation


@habdelra habdelra commented May 12, 2026

⚠️ Required dev setup after pulling this branch

The local stack is now HTTPS+HTTP/2 end-to-end (realm-server on :4201, vite on :4200). Three one-time manual steps are required before mise run dev / mise run dev-all will work; skipping any of them leaves the browser stuck in CORS / mixed-content / cert errors that look like the app is broken.

1. Trust the dev TLS cert (one-time, prompts sudo)

The realm-server and vite both terminate TLS with a mkcert leaf. On Linux, Chromium reads the per-user NSS DB (~/.pki/nssdb) and Firefox reads its per-profile NSS DB, not /etc/ssl/certs, so libnss3-tools (which provides certutil) is mandatory for the browser to actually trust the cert.

Install the tools and run the trust task:

# Linux (Debian/Ubuntu)
sudo apt install -y mkcert libnss3-tools
# Linux (Fedora/RHEL)
sudo dnf install -y mkcert nss-tools
# macOS
brew install mkcert nss

mise run infra:trust-dev-cert

mise run infra:trust-dev-cert runs mkcert -install (sudo prompt) and seeds ~/.pki/nssdb so Chromium picks up the root CA. mise run dev / dev-all invoke infra:ensure-dev-cert as a preflight that fails fast with a copy-paste message if you skipped this. CI does the equivalent via passwordless sudo in .github/actions/init — devs don't need to think about it there.

2. Migrate your matrix account_data (one-time)

Every Boxel user keeps their realm workspace list in matrix app.boxel.realms account_data. Existing entries still point at http://localhost:42XX/.... The realm-server's HTTP→HTTPS dispatcher 301-redirects those requests, but CORS preflight forbids following redirects, so the host bundle's first realm fetch will fail with:

Access to fetch at http://localhost:4201/... from origin https://localhost:4200 has been blocked by CORS policy: Response to preflight request doesn't pass access control check: Redirect is not allowed for a preflight request.

Run the migration once after mise run infra:trust-dev-cert:

mise run infra:migrate-matrix-account-data-http-to-https

The script logs in as the local synapse admin, walks every user, impersonates each via /_synapse/admin/v1/users/<id>/login, reads app.boxel.realms, rewrites every http://localhost:4201/... / http://localhost:4202/... to https://..., and PUTs the result back. Safe to re-run — users already on https are skipped.
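
The rewrite step at the heart of that script can be sketched as a pure function. This is a minimal illustration, not the script's actual code: the `RealmsAccountData` shape (a `realms` array of URL strings) and the `rewriteRealms` name are assumptions made for the example.

```typescript
// Hypothetical shape of the `app.boxel.realms` account_data content.
// The real payload may carry more fields; this is an assumption.
interface RealmsAccountData {
  realms: string[];
}

// Local dev ports named in the PR description (realm-server, test-realms).
const LOCAL_PORTS = [4201, 4202];

// Rewrite http://localhost:<port>/... entries to https://localhost:<port>/...
// `changed` stays false when nothing matched, so a caller can skip the PUT
// for users who are already on https (which is what keeps re-runs safe).
function rewriteRealms(data: RealmsAccountData): {
  data: RealmsAccountData;
  changed: boolean;
} {
  let changed = false;
  const realms = data.realms.map((url) => {
    for (const port of LOCAL_PORTS) {
      const from = `http://localhost:${port}/`;
      if (url.indexOf(from) === 0) {
        changed = true;
        return `https://localhost:${port}/` + url.slice(from.length);
      }
    }
    return url; // non-local or already-https entries pass through untouched
  });
  return { data: { ...data, realms }, changed };
}
```

Running the function twice on the same data reports `changed: false` the second time, which mirrors the "users already on https are skipped" behavior described above.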

3. Clear your browser's localStorage

The host bundle caches realm-scoped JWTs and session metadata keyed by realm URL. Those entries are still keyed on the old http://localhost:42XX/ form, and there is no server-side migration path — the only way to evict them is in your browser:

DevTools → Application → Local Storage → right-click https://localhost:4200 (and the old http://localhost:4200 entry if present) → Clear

After that, log in fresh — new tokens will be keyed under the https origin.

When in doubt: full-reindex a realm

If something still looks broken after the three steps above — stale rows, missing types, a card that won't render — kick off a from-scratch index for the affected realm. The realm-server exposes the same _grafana-reindex endpoint we use from Grafana in production. In local dev the shared secret is literally shhh! it's a secret:

curl -X POST \
  -H "Authorization: Bearer shhh! it's a secret" \
  'https://localhost:4201/_grafana-reindex?realm=/user/prudent-octopus/'

The request blocks until the from-scratch job finishes and prints the result stats. Swap the realm= query param for whichever realm you need (/base/, /test/, /user/<your-realm>/, etc.); the path is relative to the realm-server origin.

And when even that doesn't help, the nuclear option is still available:

mise run infra:full-reset

That drops and recreates the Postgres DB, clears dynamic realms, and restarts matrix — i.e. you start over from a clean slate.

Rolling back

If you ever need to go back to the previous http-canonical state (e.g. you're bisecting against this branch, or you've rolled back the postgres migration with pnpm migrate down), the rewrite is symmetric in both directions:

# 1. Roll back the postgres canonical-url migration
pnpm --filter @cardstack/postgres migrate down 1
# 2. Flip account_data back so the host bundle matches
mise run infra:migrate-matrix-account-data-https-to-http
# 3. Clear localStorage again so JWTs get re-minted under the http origin

Both migrations gate on realm_user_permissions containing localhost canonicals in the relevant scheme, so they're no-ops on staging / production.


Auto-applied: postgres data migration

The first mise run dev after pulling runs a Postgres migration (1779100257124_canonical-url-http-to-https.js) that rewrites every text/varchar/jsonb column on every public table from http://localhost:42XX/… to https://localhost:42XX/… in place — index rows, realm registry, permissions, JSONB documents inside pristine_doc/search_doc/etc. The migration is idempotent and gated on a cheap realm_registry pre-check, so re-runs and production environments are no-ops.

If you have stale http://localhost:42XX/… URLs in personal-realm card .json files (in realms/localhost_4201/**), the dispatcher's 301-redirect resolves them at runtime, so cards still work and no on-disk rewrite is required. To clean the data anyway (GNU sed syntax; BSD/macOS sed needs -i '' in place of -i):

find realms/localhost_4201 -name '*.json' -exec sed -i 's|http://localhost:4201|https://localhost:4201|g' {} +

Navigation

Visit https://localhost:4200/ (vite host) as the manual-browser entry point. Both vite and the realm-server speak HTTPS+HTTP/2 now, so the host bundle's realm fetches multiplex over a single h2 connection — no mixed-content warnings.

Summary

Local dev's realm-server now speaks HTTPS+HTTP/2 on a single canonical origin (https://localhost:4201, plus https://localhost:4202 for test-realms). This unblocks the heavy aggregator-card prerender bottleneck described in CS-11114 — cohort and dashboard renders today fan out 80+ federated-search requests inside one Chromium tab, get throttled by Chrome's HTTP/1.1 6-per-origin connection ceiling, and take minutes; HTTP/2 multiplexes them over one connection and the same render finishes in seconds.
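
As a rough back-of-envelope model of the connection ceiling: if at most six requests can be in flight per origin, 80 equal-cost requests complete in waves. The sketch below assumes every request takes the same time and ignores server-side queueing, so it only illustrates the serialization effect, not the measured render times.

```typescript
// Simplified model: with at most `maxInFlight` requests in flight at once,
// N equal-cost requests complete in ceil(N / maxInFlight) sequential waves.
function requestWaves(requests: number, maxInFlight: number): number {
  return Math.ceil(requests / maxInFlight);
}

const http1Waves = requestWaves(80, 6); // HTTP/1.1: 6 connections per origin
const h2Waves = requestWaves(80, 80); // HTTP/2: all 80 streams on one connection
console.log({ http1Waves, h2Waves }); // { http1Waves: 14, h2Waves: 1 }
```

The HTTP/2 figure assumes the server's concurrent-stream limit is at least 80; a lower limit would add waves on that side too.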

Following @lukemelia's suggestion in #4787, this PR ships the single-origin design rather than the dual-listen alternative that #4787 had been carrying. There is no separate h2 alias port, no per-page __realmH2OriginMappings__ injection, no alias-host rewrite middleware — the wire protocol and the canonical realm URL agree.

Design

Realm-server: same-port HTTPS+HTTP/2 dispatcher

  • RealmServer.listen(port): when REALM_SERVER_TLS_CERT_FILE/_KEY_FILE are set, binds a single net.Server that peeks the first byte of every connection. 0x16 (TLS ClientHello) routes to an http2.createSecureServer for h2; anything else routes to a plain http.Server that 301-redirects to https://<host><path>. Same listener, no extra port. When the cert is absent (in-process test fixtures), falls back to plain http.createServer — unchanged behavior.
  • The dispatcher tracks every accepted socket and exposes closeAllConnections() so shutdown can force-close in-flight TLS / HTTP/2 / keep-alive sessions rather than waiting for peers. main.ts's existing typeof feature-detect picks it up unchanged.
  • Defensive cert load: readFileSync + createSecureServer wrapped in try/catch so a malformed cert downgrades to plain HTTP with a warning rather than killing boot.
  • patchKoaResponseForH2Head() on the Koa response prototype: Node's http2 compat layer leaves Http2Stream.writable === false on HEAD streams, which short-circuits Koa's respond() and hangs every HEAD request indefinitely. The patch returns true for HEAD streams so the response actually flushes.
  • middleware/index.ts:
    • fullRequestURL detects ctx.req.socket.encrypted for the scheme and falls back to the HTTP/2 :authority pseudo-header when headers.host is absent, so URL-keyed realm lookup matches the HTTPS canonical.
    • fetchRequestFromContext strips :-prefixed pseudo-headers before constructing new Request(...) — WHATWG Headers rejects them.
    • setContextResponse filters HTTP/1-only connection / keep-alive / transfer-encoding / upgrade / proxy-connection / http2-settings response headers that Node's h2 compat layer would otherwise reject.
    • proxyAsset was reimplemented as a hand-rolled forwarder (replacing koa-proxies + http-proxy) so pseudo-headers and the request host get filtered before the upstream call — http-proxy.setHeader(':path', …) throws ERR_INVALID_HTTP_TOKEN.
  • main.ts defaults --serverURL to https://localhost:${port} (was http://). The realm-server stamps serverURL into the realmServerURL claim of every JWT it mints, so an http default leaks into tokens and the host's assertOwnRealmServer rejects them as a "different realm server".
  • Prerender (prerender/browser-manager.ts, scripts/wait-for-host-standby.ts) launches puppeteer with --ignore-certificate-errors when BOXEL_HOST_URL / REALM_BASE_URL is https.
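
Two of the mechanisms in the list above, the first-byte routing decision and the header filtering, reduce to small pure functions. The following is a sketch under assumptions: the function and type names are hypothetical, headers are modeled as a plain string map, and the real code in server.ts and middleware/index.ts operates on sockets and Koa contexts rather than on these simplified inputs.

```typescript
// 0x16 is the TLS record content type for "handshake"; a ClientHello is the
// first handshake record, so a TLS connection starts with this byte.
const TLS_HANDSHAKE = 0x16;

type Route = "tls" | "plain-http";

// Pick the inner server for a new connection from its first byte alone.
function routeForFirstByte(firstByte: number): Route {
  return firstByte === TLS_HANDSHAKE ? "tls" : "plain-http";
}

type HeaderMap = Record<string, string>;

// Copy `headers`, keeping only the names for which `keep` returns true.
function filterHeaders(
  headers: HeaderMap,
  keep: (name: string) => boolean,
): HeaderMap {
  const out: HeaderMap = {};
  for (const name of Object.keys(headers)) {
    if (keep(name)) out[name] = headers[name];
  }
  return out;
}

// HTTP/2 pseudo-headers (":method", ":path", ":authority", ...) start with ":".
function stripPseudoHeaders(headers: HeaderMap): HeaderMap {
  return filterHeaders(headers, (name) => name.charAt(0) !== ":");
}

// Response headers the PR description lists as HTTP/1-only.
const H1_ONLY = [
  "connection",
  "keep-alive",
  "transfer-encoding",
  "upgrade",
  "proxy-connection",
  "http2-settings",
];

// Drop the HTTP/1-only headers, matching names case-insensitively.
function stripH1OnlyHeaders(headers: HeaderMap): HeaderMap {
  return filterHeaders(
    headers,
    (name) => H1_ONLY.indexOf(name.toLowerCase()) === -1,
  );
}
```

A plain HTTP request starts with an ASCII method name (for example `G` in `GET`, byte 0x47), so it never collides with 0x16 and falls through to the redirect path.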

Vite host (packages/host)

  • vite.config.mjs reads REALM_SERVER_TLS_CERT_FILE/_KEY_FILE and sets both server.https (dev) and preview.https (built) so vite terminates TLS on :4200. Browsers refuse HTTP/2 over cleartext, so vite has to speak HTTPS for the h2 connection-pool win to apply on the host origin.
  • packages/host/scripts/vite-with-traefik.js adds a same-port http→https redirect dispatcher for vite (dev) only — vite binds an internal port, the dispatcher owns :4200, peeks the first byte, and either pipes raw bytes to vite (TLS) or 301-redirects (plain HTTP). vite preview skips the dispatcher and binds :4200 directly (the byte-peek + cross-process TCP pipe pattern doesn't survive chrome's TLS+h2 handshake under load in CI; preview doesn't need browser-bar UX anyway).
  • Host config/environment.js defaults flip to https://localhost:4201 for realmServerURL / baseRealmURL / catalogRealmURL / legacyCatalogRealmURL / skillsRealmURL / openRouterRealmURL.

Mise tasks and env-vars

  • mise run infra:ensure-dev-cert provisions the mkcert leaf at $HOME/.local/share/boxel/dev-certs/. Idempotent; auto-runs infra:trust-dev-cert when passwordless sudo is available (CI), otherwise fails fast with a copy-paste install message. mise run dev / dev-all invoke it as a preflight.
  • mise run infra:trust-dev-cert runs mkcert -install and (on Linux) verifies libnss3-tools is installed so Chromium picks up the root CA from ~/.pki/nssdb.
  • mise-tasks/lib/env-vars.sh: defaults REALM_BASE_URL/REALM_TEST_URL/HOST_URL to https; exports REALM_SERVER_TLS_CERT_FILE/_KEY_FILE + NODE_EXTRA_CA_CERTS when the mkcert leaf is present. Also auto-detects system chrome (/usr/bin/google-chrome, Chromium.app, etc.) and sets PUPPETEER_EXECUTABLE_PATH — puppeteer's bundled Chrome 143 has an h2 stream-window bug that hangs the prerender on cold vite optimizer; Chrome 148+ is fine.
  • services/{realm-server,realm-server-base,worker-base,prerender,test-realms} were updated to use the https canonical URLs.
  • Matrix helpers/isolated-realm-server.ts and realm-test-harness/src/{support-services,isolated-realm-stack}.ts strip REALM_SERVER_TLS_CERT_FILE/_KEY_FILE before spawning child processes, so the matrix-isolated and software-factory stacks stay plain HTTP on their dynamic ports regardless of the outer dev env.
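
The env-stripping step in the last bullet can be illustrated with a small helper. This is a sketch, assuming the environment is handled as a plain string map; `withoutTlsEnv` is a hypothetical name, not a function from the harness.

```typescript
// Env vars named in the PR description that switch the realm-server to TLS.
const TLS_ENV_VARS = ["REALM_SERVER_TLS_CERT_FILE", "REALM_SERVER_TLS_KEY_FILE"];

type Env = Record<string, string | undefined>;

// Return a copy of `env` without the TLS variables. The input is left intact,
// so the parent process keeps its own HTTPS configuration.
function withoutTlsEnv(env: Env): Env {
  const copy: Env = {};
  for (const key of Object.keys(env)) {
    if (TLS_ENV_VARS.indexOf(key) === -1) {
      copy[key] = env[key];
    }
  }
  return copy;
}
```

A result like this could be passed as the `env` option of Node's `child_process.spawn`, so the child falls back to plain HTTP while the outer dev environment stays on HTTPS.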

CI

  • .github/actions/init/action.yml installs mkcert + libnss3-tools via apt and runs mise run infra:ensure-dev-cert so realm-servers in CI come up HTTPS+HTTP/2 the same as local.
  • tests/index.ts (realm-server test bootstrap) deletes the TLS env vars before any in-process fixture realm-server is spun up — supertest connects plain HTTP to those fixtures on random 127.0.0.1:444X ports.
  • Wait-on probes against https://localhost:42XX (in ci/serve-test-assets, ci/cache-index, test-services/{host,realm-server,matrix}) use https-get:// (start-server-and-test's default https:// is HEAD, which vite preview behind h2 doesn't reliably answer) and pass START_SERVER_AND_TEST_INSECURE=1 to disable wait-on's strictSSL check.
  • ci-software-factory.yaml no longer starts host-dist on :4200 — the realm-test-harness is hermetic and brings up its own vite preview on dynamic ports. Only services:icons (port 4206) is started externally.
  • packages/postgres/scripts/ensure-db-exists.sh forces -h localhost -p 5432 (TCP) inside docker exec, since the postgres:16.3 image doesn't reliably create /var/run/postgresql/.s.PGSQL.5432. set -e makes a failed CREATE DATABASE actually exit non-zero instead of fabricating a success line.
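
The scheme-aware wait-on probe mentioned in the list above amounts to swapping the URL scheme for the matching `http-get://` or `https-get://` prefix. The task scripts do this in shell; the sketch below expresses the same transformation in TypeScript, with `waitOnResource` as a hypothetical helper name.

```typescript
// Build a wait-on resource string such as "https-get://localhost:4201/base/".
// `http-get://` and `https-get://` are wait-on's prefixes for GET probes.
function waitOnResource(baseUrl: string, path: string): string {
  const match = /^(https?):\/\/(.*)$/.exec(baseUrl);
  if (!match) {
    throw new Error(`unsupported URL: ${baseUrl}`);
  }
  const scheme = match[1]; // "http" or "https"
  const authorityAndPath = match[2]; // everything after "://"
  // Join with exactly one slash regardless of how the inputs are written.
  const joined =
    authorityAndPath.replace(/\/+$/, "") + "/" + path.replace(/^\/+/, "");
  return `${scheme}-get://${joined}`;
}
```

Deriving the prefix from the URL itself avoids the malformed double-scheme probe that results when a hardcoded `http://` strip meets an https base URL.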

Data migrations

Two migrations cover the http→https flip and are both reversible:

Postgres — packages/postgres/migrations/1779100257124_canonical-url-http-to-https.js:

  • Walks information_schema.columns for every text/varchar/jsonb column on every public table (excludes modules, pgmigrations/migrations, generated columns).
  • For each column, runs in-place REPLACE(...)-based UPDATEs for http://localhost:4201 → https://localhost:4201 and http://localhost:4202 → https://localhost:4202. The WHERE filter restricts the touch set to rows that still contain the old URL (idempotent).
  • down is symmetric (https → http) — same realm_user_permissions pre-check on the source scheme.
  • Both directions gate on realm_user_permissions containing localhost URLs, so production / staging (real hostnames, never localhost) is a no-op either way.
  • Runs automatically: mise run dev passes --migrateDB to the realm-server, so the migration fires on the first post-pull boot.
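
To make the per-column rewrite concrete, here is a sketch of how one such UPDATE statement could be generated. It is not the migration's actual code: the `buildRewriteSql` helper, the `url` column name used in the usage note, and the choice to round-trip jsonb columns through text are all assumptions for illustration.

```typescript
type ColumnKind = "text" | "jsonb";

// Double-quote an identifier for Postgres, escaping embedded quotes.
function quoteIdent(name: string): string {
  return '"' + name.replace(/"/g, '""') + '"';
}

// Single-quote a string literal for Postgres, escaping embedded quotes.
function quoteLiteral(value: string): string {
  return "'" + value.replace(/'/g, "''") + "'";
}

// Build one in-place UPDATE for a single column. The WHERE clause limits the
// touch set to rows that still contain `from`, so a second run matches
// nothing. jsonb columns are cast to text and back (an assumption about how
// a real migration might handle them).
function buildRewriteSql(
  table: string,
  column: string,
  kind: ColumnKind,
  from: string,
  to: string,
): string {
  const col = quoteIdent(column);
  const asText = kind === "jsonb" ? `${col}::text` : col;
  const replaced = `REPLACE(${asText}, ${quoteLiteral(from)}, ${quoteLiteral(to)})`;
  const value = kind === "jsonb" ? `${replaced}::jsonb` : replaced;
  return (
    `UPDATE ${quoteIdent(table)} SET ${col} = ${value} ` +
    `WHERE POSITION(${quoteLiteral(from)} IN ${asText}) > 0`
  );
}
```

For example, `buildRewriteSql("realm_registry", "url", "text", "http://localhost:4201", "https://localhost:4201")` yields an UPDATE whose WHERE clause no longer matches once the rows read `https://`, because `http://localhost:4201` is not a substring of `https://localhost:4201`. Swapping `from` and `to` gives the down direction.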

Matrix account_data — packages/matrix/scripts/migrate-account-data-http-to-https.ts:

  • Logs in as the local synapse admin, admin-impersonates every user, rewrites app.boxel.realms account_data entries from http://localhost:42XX/... to https://..., PUTs back.
  • --reverse flips the direction (pnpm migrate-account-data-https-to-http / mise run infra:migrate-matrix-account-data-https-to-http) for symmetry with the postgres migrate-down.
  • Safe to re-run; users already on the target scheme are skipped.

Tests

  • packages/realm-server/tests/listener-dispatcher-test.ts covers the dispatcher: TLS h2, ALPN HTTP/1.1 fallback, TLS h2 HEAD (the patched-writable path), plain-HTTP 301, no-Host-header raw-socket path, malformed-cert downgrade, and no-cert-env-vars plain HTTP.
  • The rest of the realm-server qunit/mocha suite continues to run plain HTTP via the test-bootstrap env-var delete; the per-test URL fixtures (card-endpoints-test.ts, types-endpoint-test.ts, module-syntax-test.ts) were updated to https where they reference port 4202.
  • Host integration test fixtures (realm-indexing-test.gts, realm-test.gts) updated for the new https canonical and the alphanumeric URL sort order that follows from it.

Test plan

  • mise run infra:ensure-dev-cert succeeds with mkcert installed; emits clean install hints + exits 1 when missing.
  • curl -kI --http2 https://localhost:4201/_alive returns HTTP/2 200.
  • curl -kI --http1.1 https://localhost:4201/_alive returns HTTP/1.1 200 (ALPN fallback for h1 clients).
  • curl -sI http://localhost:4201/_alive returns HTTP/1.1 301 with Location: https://localhost:4201/_alive.
  • curl -skI -X HEAD --http2 https://localhost:4201/_alive returns HTTP/2 200 (HEAD over h2 doesn't hang).
  • Pull with existing local realm data → first mise run dev runs the URL-rewrite migration → realm-server boots clean on https.
  • pnpm --filter @cardstack/postgres migrate down 1 && pnpm --filter @cardstack/postgres migrate up round-trips cleanly (the Postgres Migration CI job validates this).
  • Trigger a base-realm reindex via curl -X POST -H "Authorization: Bearer shhh! it's a secret" 'https://localhost:4201/_grafana-reindex?realm=/base/' — completes without errors.
  • After mise run infra:trust-dev-cert + matrix account_data migration + localStorage clear, open https://localhost:4200/, log in, click a workspace — index card populates and DevTools shows h2 for realm fetches.
  • mise run dev shutdown closes the listener cleanly.
  • pnpm lint passes on packages/{realm-server,host,matrix,postgres,realm-test-harness} (lint:js + prettier; pre-existing lint:types errors in ../base/*.gts are unrelated).

Closes #4787 (dual-listen approach abandoned in favor of this single-origin design).

🤖 Generated with Claude Code

Heavy aggregator-card renders (cohort, dashboards) fan out 80+
federated-search requests per render inside one Chromium tab. Chrome's
HTTP/1.1 6-per-origin connection ceiling serializes them and turns a
single render into multiple minutes; HTTP/2 multiplexes them over one
connection and the same render finishes in seconds. Browsers only do
HTTP/2 over TLS, so the local realm-server now terminates a cert.

Single-origin design: the realm-server listens on
`https://localhost:4201` (and `https://localhost:4202` for test-realms)
when the dev cert is provisioned. There is no parallel HTTP listener
and no h2 alias port; the wire protocol and the canonical realm URL
agree. In-process tests and any environment without a cert keep getting
plain HTTP/1.1 via the same `listen(port)` entry point — `RealmServer`
picks the protocol from `REALM_SERVER_TLS_CERT_FILE`/`_KEY_FILE` rather
than two separate methods.

Cert provisioning is opt-in via `mise run infra:ensure-dev-cert`:

  - Requires `mkcert` (single-origin HTTPS has no HTTP fallback in
    dev, so a missing prereq is a hard error with install hints).
  - Attempts `mkcert -install` once for system trust; declining the
    sudo prompt is non-fatal — the cert still gets generated and
    indexing keeps working via puppeteer's `--ignore-certificate-errors`
    flag and `NODE_EXTRA_CA_CERTS` for Node clients.
  - Idempotent: re-runs are a no-op until the cert is within 7 days of
    expiry.
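
The idempotency rule in the last bullet comes down to a date comparison. The sketch below is an illustration only; the task itself is a shell script, and `needsRenewal` plus the inclusive boundary are assumptions. In Node, the expiry could be read from the `validTo` property of `crypto.X509Certificate`, though how the task actually reads it is not shown here.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// True when the certificate expires within `thresholdDays` of `now` (or has
// already expired), meaning a fresh leaf should be issued. Otherwise the
// existing cert is kept and the run is a no-op.
function needsRenewal(notAfter: Date, now: Date, thresholdDays = 7): boolean {
  return notAfter.getTime() - now.getTime() <= thresholdDays * DAY_MS;
}
```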

`env-vars.sh` flips `REALM_BASE_URL`/`REALM_TEST_URL` defaults to
`https://localhost:4201`/`4202`, exports the cert paths when files
exist, and points `NODE_EXTRA_CA_CERTS` at mkcert's root CA so Node-
side fetches (worker, scripts, prerender Node) trust the cert without
requiring `mkcert -install` to have run. `dev-common.sh` switches
wait-on's readiness probes to `https-get://` when the realm URL is
HTTPS. The host's `config/environment.js` defaults flip to
`https://localhost:4201` for `realmServerURL`, `baseRealmURL`,
`catalogRealmURL`, `legacyCatalogRealmURL`, `skillsRealmURL`, and
`openRouterRealmURL`. `middleware/index.ts#fullRequestURL` now detects
`ctx.req.socket.encrypted` so URL-keyed realm lookup matches the wire
protocol — combined with the canonical-URL flip, both halves agree.

CI / hermetic test harness path stays HTTP-only: if no cert is
provisioned, `env-vars.sh` leaves the TLS env vars unset and the
realm-server boots `http.createServer`, exactly as before.

Migration after pulling: any local card data created under the old
`http://localhost:4201/...` canonical references is stale and needs to
be re-indexed. README documents the one-time `mise run
infra:full-reset` step.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1655a6f2df


Comment thread mise-tasks/lib/env-vars.sh
Comment thread packages/host/config/environment.js

github-actions Bot commented May 12, 2026

Preview deployments

Host Test Results

    1 files  ± 0      1 suites  ±0   1h 18m 34s ⏱️ + 8m 51s
2 137 tests  - 40  2 123 ✅  - 39  14 💤  - 1  0 ❌ ±0 
2 153 runs   - 40  2 139 ✅  - 39  14 💤  - 1  0 ❌ ±0 

Results for commit de5129a. ± Comparison against earlier commit f5de14e.

Realm Server Test Results

    1 files  ±0      1 suites  ±0   7m 54s ⏱️ - 5m 20s
1 372 tests ±0  1 372 ✅ ±0  0 💤 ±0  0 ❌ ±0 
1 451 runs  ±0  1 451 ✅ ±0  0 💤 ±0  0 ❌ ±0 

Results for commit de5129a. ± Comparison against earlier commit f5de14e.

Adds the two missing pieces from the initial HTTPS+HTTP/2 flip:

1. Same-port HTTP→HTTPS dispatcher in `server.ts`. When the realm-server
   speaks TLS, `listen(port)` now binds a net.Server that peeks the
   first byte off every connection: 0x16 (TLS ClientHello) routes to
   the http2 secure server; anything else is treated as plain HTTP and
   handed to a tiny 301-redirect handler that rewrites the URL to
   `https://<inbound-host><path>`. So `http://localhost:4201/…` in a
   browser bar or a `curl` invocation gets a clean 301 instead of a
   TLS handshake failure. Same listener, no extra port.

2. A node-pg-migrate that rewrites every URL-bearing text/varchar/jsonb
   column on every public table (except `modules`, which the
   realm-server truncates on startup) from `http://localhost:42XX` to
   `https://localhost:42XX`. Auto-discovered via
   `information_schema.columns` — covers `boxel_index`,
   `boxel_index_working`, `realm_registry`, `realm_meta`,
   `realm_metadata`, `realm_user_permissions`, `realm_versions`,
   `realm_file_meta`, `module_transpile_cache`, plus any future
   URL-bearing column that's added later (the discovery picks it up).
   WHERE-filtered so it only touches rows still containing the old URL
   — idempotent, no-op in production.

`mise run dev` already passes `--migrateDB` to the realm-server, so the
migration runs automatically on the first post-pull boot. README's
"Local HTTPS dev access" section is rewritten to describe the new
auto-migration flow (no more `mise run infra:full-reset` callout).

Schema file renamed from `1779100257123_schema.sql` to
`1779200000000_schema.sql` so host/config/environment.js's
migration-vs-schema-name sentinel matches the new latest migration.
Content is unchanged (the new migration is data-only).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI was failing across host/realm-server/matrix test suites because
ensure-dev-cert exited non-zero when mkcert was missing, killing the
mise dep chain before any service started, and because env-vars.sh
flipped REALM_BASE_URL to https unconditionally — so even when the
realm-server fell back to plain HTTP, every consumer was still asked
to fetch against https. The host config defaults had the same
problem: hardcoded https meant the in-browser realmServerURL didn't
match the wire scheme.

Three fixes, gated on cert presence:

1. `ensure-dev-cert` now exits 0 with a soft warning when mkcert is
   missing. The realm-server's `listen()` already falls back to plain
   `http.createServer` when the TLS env vars are unset, so this is
   the honest behavior for CI / hermetic-test environments.
2. `env-vars.sh` defaults `REALM_BASE_URL`/`REALM_TEST_URL` to http
   and only upgrades them to https inside the cert-detected block
   alongside the existing TLS env var exports.
3. `packages/host/config/environment.js` derives its scheme from
   `process.env.REALM_BASE_URL`, so the host config follows the same
   cert-presence-driven flip rather than baking https into the JS
   defaults.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot AI left a comment


Pull request overview

Enables local-dev realm-server to serve a single canonical HTTPS origin with HTTP/2 (plus same-port HTTP→HTTPS redirect) to remove Chrome’s HTTP/1.1 per-origin connection bottleneck during heavy prerender/search fan-outs, and migrates local indexed data from http://localhost:42xx to https://localhost:42xx.

Changes:

  • Add TLS-capable listener that multiplexes HTTPS/HTTP2 and HTTP redirect on the same port; update URL construction to recognize TLS sockets.
  • Default local dev URLs/config/docs to https://localhost:4201 (+ :4202 for test realms) and add mkcert-based cert provisioning.
  • Add a Postgres migration to rewrite persisted localhost canonical URLs from http→https.

Reviewed changes

Copilot reviewed 45 out of 46 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| README.md | Document local HTTPS/HTTP2 setup, migration, and updated local URLs. |
| QUICKSTART.md | Update quickstart URLs to https://localhost:4201. |
| packages/realm-server/tests/types-endpoint-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/server-endpoints/search-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/server-endpoints/search-prerendered-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/server-endpoints/info-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/server-endpoints/index-responses-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/server-endpoints/helpers.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/server-endpoints/federated-types-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/server-endpoints/authentication-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/request-forward-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/realm-endpoints/user-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/realm-endpoints/reindex-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/realm-endpoints/markdown-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/realm-endpoints/info-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/realm-endpoints/dependencies-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/realm-endpoints-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/publish-unpublish-realm-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/prerender-manager-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/openrouter-passthrough-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/module-cache-race-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/helpers/index.ts | Update close helpers/types to tolerate non-http.Server server handles. |
| packages/realm-server/tests/get-boxel-claimed-domain-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/file-watcher-events-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/delete-boxel-claimed-domain-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/claim-boxel-domain-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/card-source-endpoints-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/card-endpoints-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/card-dependencies-endpoint-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/boxel-domain-availability-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/atomic-endpoints-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/server.ts | Add TLS/http2+redirect dispatcher and export RealmHttpServer type; update listen logging. |
| packages/realm-server/prerender/browser-manager.ts | Add --ignore-certificate-errors for prerender Chromium when using https. |
| packages/realm-server/middleware/index.ts | Treat TLS sockets as https for fullRequestURL() computation. |
| packages/realm-server/main.ts | Make shutdown tolerant of non-http.Server handles lacking closeAllConnections(). |
| packages/realm-server/lib/dev-service-registry.ts | Broaden registry typing to net.Server. |
| packages/postgres/migrations/1779200000000_canonical-url-http-to-https.js | Add migration to rewrite localhost canonical URLs from http→https. |
| packages/host/config/schema/1779200000000_schema.sql | Add regenerated host sqlite schema snapshot. |
| packages/host/config/environment.js | Flip local default realm URLs to https. |
| mise-tasks/services/test-realms | Ensure dev cert task runs before test realms. |
| mise-tasks/services/realm-server-base | Ensure dev cert task runs before base realm server. |
| mise-tasks/services/realm-server | Ensure dev cert task runs before realm server. |
| mise-tasks/lib/env-vars.sh | Flip default realm URLs to https and export TLS cert/CA env vars. |
| mise-tasks/lib/dev-common.sh | Use https readiness probes when realm URLs are https. |
| mise-tasks/infra/ensure-dev-cert | New task to provision mkcert leaf cert for local HTTPS/HTTP2. |
| .claude/skills/indexing-diagnostics/SKILL.md | Update localhost URLs and markdown formatting in diagnostics skill doc. |


Comment thread packages/realm-server/server.ts Outdated
Comment thread mise-tasks/lib/env-vars.sh Outdated
Comment thread mise-tasks/lib/env-vars.sh Outdated
Comment thread README.md
Comment thread README.md Outdated
Comment thread QUICKSTART.md
habdelra and others added 2 commits May 12, 2026 19:13
Local realm-server speaks HTTPS+HTTP/2 in every environment — there is
no HTTP fallback or opt-in. The dev cert is a hard prereq:

- `ensure-dev-cert` exits non-zero when mkcert is missing.
- `env-vars.sh` defaults `REALM_BASE_URL`/`REALM_TEST_URL` to https
  unconditionally and no longer flips schemes based on cert presence.
- `host/config/environment.js` defaults to `https://localhost:4201`
  unconditionally; the previous scheme-from-env-var branch is gone.
- The new `.github/actions/init` step installs mkcert via apt and runs
  `mise run infra:ensure-dev-cert` before any downstream job, so CI
  realm-servers boot HTTPS+HTTP/2 too. Test harnesses that launch
  Chromium already pass `--ignore-certificate-errors`; Node clients
  pick up the cert via `NODE_EXTRA_CA_CERTS`.
- README's CI/harness paragraph is rewritten to describe the cert
  provisioning in the init action (no more "boots HTTP/1.1 in CI" line).

Carries over the Copilot-flagged fixes:

- Migration renamed to `1779100257124_canonical-url-http-to-https.js`
  (one greater than the existing latest, no 6+ consecutive zeros so it
  passes `lint:migrations`) and the matching schema dump renamed.
- Migration body adds a `realm_registry` LIKE pre-check that short-
  circuits the full-column scans on production/staging databases where
  the canonical URLs never reference localhost.
- Drops the unused `/* eslint-disable camelcase */` line that
  `lint:js` flagged.
- `redirectToHttps()` parses the inbound `Host` via `new URL()` so
  bracketed IPv6 authorities (`[::1]:4201`) round-trip cleanly instead
  of the regex producing an invalid `https://::1:4201/...`.
- `env-vars.sh` no longer concatenates `NODE_EXTRA_CA_CERTS` with `:`
  separators — Node accepts a single PEM path, not a list. If the dev
  already has it set, leave it alone; otherwise point at mkcert's CA.
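
The `Host` parsing fix in the list above can be illustrated as follows. This is a sketch, not the body of `redirectToHttps()`: the `httpsRedirectLocation` name is hypothetical, and the missing-Host case that the dispatcher tests cover is left out.

```typescript
// Build the Location value for the plain-HTTP 301. Parsing the Host header
// with the WHATWG URL parser keeps bracketed IPv6 literals intact, where a
// naive split on ":" would mangle them.
function httpsRedirectLocation(
  hostHeader: string,
  pathAndQuery: string,
): string {
  const parsed = new URL(`http://${hostHeader}`);
  // `parsed.host` is "<hostname>:<port>", with brackets kept for IPv6.
  return `https://${parsed.host}${pathAndQuery}`;
}

console.log(httpsRedirectLocation("[::1]:4201", "/_alive"));
// https://[::1]:4201/_alive
```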

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per Copilot 3230386975 — the previous QUICKSTART pointed users at
https://localhost:4201 without telling them how to provision the cert
that makes that origin work. Adds mkcert to the system dependencies
list at step 1 with platform-specific install hints and the
`mise run infra:ensure-dev-cert` one-liner, linking back to the
README's "Local HTTPS dev access" section for the full story.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three task scripts under `mise-tasks/test-services/` were stuck on the
old `http-get://${REALM_BASE_URL#http://}/base/...` readiness probe
shape that strips a hardcoded `http://`. After env-vars.sh flipped
REALM_BASE_URL to https, that strip becomes a no-op and the probe URL
turns into the malformed `http-get://https://localhost:4201/...`,
which wait-on can't reach — every CI suite that drives `mise run
test-services:*` would hang on phase-1 readiness instead of starting
the next phase.

Same fix as `mise-tasks/lib/dev-common.sh`: detect the scheme from
`$REALM_BASE_URL` / `$REALM_TEST_URL` and pick `http-get://` or
`https-get://` accordingly; strip `*://` to leave just the authority.
Also wires `infra:ensure-dev-cert` into each script's depends list so
local invocations of `mise run test-services:*` (outside CI's init
action) provision the cert before the realm-server starts.
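
The actual fix lives in shell, but the before/after behaviour can be sketched in TypeScript. Both function names below are hypothetical; `legacyProbe` imitates the old `${REALM_BASE_URL#http://}` expansion and `waitOnResource` imitates the scheme-aware replacement.

```typescript
// Imitates the old shell expansion: only a leading "http://" is stripped,
// so an https URL passes through untouched and the result is malformed.
function legacyProbe(realmBaseUrl: string): string {
  const prefix = 'http://';
  const stripped =
    realmBaseUrl.indexOf(prefix) === 0
      ? realmBaseUrl.slice(prefix.length)
      : realmBaseUrl;
  return `http-get://${stripped}`;
}

// Scheme-aware variant: pick http-get:// or https-get:// from the URL's own
// scheme and keep everything after "://".
function waitOnResource(realmBaseUrl: string): string {
  const match = /^(https?):\/\/(.+)$/.exec(realmBaseUrl);
  if (!match) {
    throw new Error(`Unsupported realm URL: ${realmBaseUrl}`);
  }
  return `${match[1]}-get://${match[2]}`;
}

console.log(legacyProbe('https://localhost:4201/base/')); // http-get://https://localhost:4201/base/
console.log(waitOnResource('https://localhost:4201/base/')); // https-get://localhost:4201/base/
```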

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 49 out of 50 changed files in this pull request and generated 11 comments.

Comment thread mise-tasks/lib/env-vars.sh
Comment thread mise-tasks/infra/ensure-dev-cert Outdated
Comment thread packages/realm-server/tests/helpers/index.ts
Comment thread QUICKSTART.md Outdated
Comment thread packages/realm-server/middleware/index.ts Outdated
Comment thread .claude/skills/indexing-diagnostics/SKILL.md Outdated
Comment thread .claude/skills/indexing-diagnostics/SKILL.md Outdated
Comment thread README.md
Comment thread packages/realm-server/server.ts
Comment thread packages/realm-server/main.ts
habdelra and others added 14 commits May 12, 2026 19:37
Blockers (B1–B3):
- tests/index.ts deletes REALM_SERVER_TLS_CERT_FILE/_KEY_FILE before any
  fixture realm-server is spun up; without this CI's globally-provisioned
  cert leaks into supertest-driven in-process servers, the dispatcher
  binds TLS on 127.0.0.1:444X, and the plain-HTTP-from-supertest path is
  301-redirected, breaking every assertion that expects 200/4xx.
- realm-server/package.json `test:wait-for-servers` now uses
  `https-get://` to match the new wire scheme; the previous `http-get://`
  hit the dispatcher's 301 path and never reported ready.
- server.ts attaches a per-socket `error` handler before the readable
  callback so an RST mid-handshake (or any peer-side socket error)
  doesn't escalate to an uncaught exception — dispatcher is the only
  inbound listener for the realm-server, can't be allowed to crash.
- `null` reads on the dispatcher socket now `destroy()` instead of just
  resuming so half-open accumulators (port scanners, eager load
  balancers) don't tie up file descriptors.

Major (M1, M3–M5):
- README's auto-migration callout pointed at the wrong migration filename
  (1779200000000_… → 1779100257124_…).
- pg-adapter.ts env-mode regex now matches `^https?://localhost:42XX/`
  so the post-flip https canonicals get rewritten to Traefik hostnames
  when a dev switches the same DB into BOXEL_ENVIRONMENT mode.
- server.ts's serveIndex / serveFromRealm URL constructions now go
  through `fullRequestURL(ctxt)` instead of `${ctxt.protocol}//${ctxt.host}`;
  `ctxt.protocol` only honors x-forwarded-proto when `app.proxy = true`,
  while `fullRequestURL` also reads the TLS socket flag. Pre-existing
  inconsistency that the https flip would have made load-bearing.
- migration's information_schema walk now keeps only columns where
  `is_generated = 'NEVER'`, so a future generated column on any public
  table doesn't abort the DO block with "column can only be updated to
  DEFAULT".
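
A minimal sketch of the scheme-agnostic match from the pg-adapter item, assuming `42XX` means any port from 4200 to 4299. The constant and function names are hypothetical and the real regex in `pg-adapter.ts` may differ.

```typescript
// Assumption: "42XX" stands for any four-digit port starting with 42.
const LOCAL_REALM_ORIGIN = /^https?:\/\/localhost:42\d{2}\//;

// True for both the pre-flip http canonicals and the post-flip https ones.
function isLocalRealmUrl(url: string): boolean {
  return LOCAL_REALM_ORIGIN.test(url);
}

console.log(isLocalRealmUrl('http://localhost:4201/base/')); // true
console.log(isLocalRealmUrl('https://localhost:4202/test/')); // true
console.log(isLocalRealmUrl('https://cardstack.com/base/')); // false
```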

Copilot's second pass:
- ensure-dev-cert checks for mkcert BEFORE the idempotent-skip — env-vars.sh
  needs `mkcert -CAROOT` to populate NODE_EXTRA_CA_CERTS even when an
  old cert already exists, and the previous ordering let a stale cert
  slip past with the trust path half-wired.
- middleware/index.ts `fullRequestURL` falls back to `:authority` when
  `headers.host` is absent — HTTP/2's compat layer normally populates
  host from :authority but the pseudo-header is the canonical source.
- middleware/index.ts `fetchRequestFromContext` strips `:`-prefixed
  pseudo-headers (`:method`, `:scheme`, `:path`, `:authority`) before
  feeding them into `new Request(headers)`, which WHATWG Headers rejects.
- QUICKSTART mkcert bullet's continuation line is properly indented now
  so markdown renders it inside the bullet instead of as a new paragraph.
- indexing-diagnostics SKILL.md two table rows now have the missing third
  cell so the table renders correctly.
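
The `fullRequestURL` behaviour described in this commit (TLS socket flag, `x-forwarded-proto`, and the `:authority` fallback) can be approximated like this. `RequestLike`, `requestOrigin`, and `fullUrl` are hypothetical names, and the precedence between the TLS flag and the forwarded header is an assumption of this sketch rather than something the commit states.

```typescript
// Simplified stand-in for the parts of a Koa context that matter here.
interface RequestLike {
  encrypted: boolean; // true when the underlying socket is a TLS socket
  url: string; // path plus query, e.g. "/base/card-api"
  headers: Record<string, string | undefined>;
}

function requestOrigin(req: RequestLike): string {
  // Prefer Host, then fall back to the HTTP/2 :authority pseudo-header.
  const authority = req.headers['host'] ?? req.headers[':authority'];
  if (!authority) {
    throw new Error('request has neither Host nor :authority');
  }
  // Assumption: either signal is enough to treat the request as https.
  const forwarded = req.headers['x-forwarded-proto'];
  const scheme = req.encrypted || forwarded === 'https' ? 'https' : 'http';
  return `${scheme}://${authority}`;
}

function fullUrl(req: RequestLike): string {
  return new URL(req.url, requestOrigin(req)).href;
}
```

With an h2 request that carries only `:authority`, `fullUrl` still resolves to the `https://` origin instead of failing on a missing `host`.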

Minor (m2, m6, n3) + Option A:
- redirectToHttps falls back to `socket.localAddress:localPort` when the
  Host header is absent (HTTP/1.0 client), instead of bare `localhost`
  that would route to port 443.
- scripts/full-reindex.sh and register-bot.sh flip to `https://` with
  `-k` (curl doesn't pick up NODE_EXTRA_CA_CERTS, and the local mkcert
  CA isn't necessarily in the system trust store).
- prerender/browser-manager.ts comment references only REALM_BASE_URL
  (REALM_SERVER_DOMAIN was stale — never exported by env-vars.sh).
- QUICKSTART step 10/11 and README's "view a realm's app" paragraph
  redirect manual-browser navigation to `http://localhost:4200/` (the
  vite host), with a note that visiting `https://localhost:4201` directly
  surfaces mixed-content warnings because vite + icons + synapse still
  speak http. Realm-server's https origin is reached only via fetches
  inside the vite-served page, which is where the federated-search h2
  win lands. README's "view example" output also flipped the realm log
  line to `https://localhost:4202/test/` to match the new canonical.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- README list item 3's wrapped continuation line is now indented under
  the bullet so markdown doesn't break it into a separate paragraph.
- server.ts dispatcher tracks every accepted socket in a Set and mirrors
  http.Server's `closeAllConnections()` API. main.ts's existing typeof
  feature-detect picks this up; shutdown no longer hangs on long-lived
  h2 sessions or keep-alive sockets.
- tests/listener-dispatcher-test.ts is new coverage for the dispatcher:
  generates a self-signed cert via openssl into a tmp dir, then exercises
  TLS+h2, ALPN HTTP/1.1 fallback, plain-HTTP→https 301 redirect, the
  no-Host-header path that uses `socket.localAddress`, malformed-cert
  downgrade to plain HTTP, and the no-cert-env-vars path. `createListener`
  is now exported from server.ts so the test can drive it without
  spinning up a full realm-server fixture (and the test bootstrap's
  global TLS-env-var delete doesn't interfere — each test restores its
  own env around `startListener`).
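
The socket tracking behind `closeAllConnections()` can be sketched roughly as below. `SocketLike` and `ConnectionTracker` are hypothetical names, and the structural type stands in for Node's socket so the sketch stays self-contained; the real dispatcher code in `server.ts` may be organised differently.

```typescript
// Minimal structural type so the sketch does not depend on Node's net module.
interface SocketLike {
  destroy(): void;
  once(event: 'close', listener: () => void): void;
}

class ConnectionTracker {
  private readonly sockets = new Set<SocketLike>();

  // Call this for every accepted socket.
  track(socket: SocketLike): void {
    this.sockets.add(socket);
    // Forget the socket once it closes on its own.
    socket.once('close', () => {
      this.sockets.delete(socket);
    });
  }

  // Same name as the http.Server method so a typeof feature-detect finds it.
  closeAllConnections(): void {
    // Copy first: destroy() may fire the 'close' listener, which mutates the set.
    Array.from(this.sockets).forEach((socket) => socket.destroy());
    this.sockets.clear();
  }

  count(): number {
    return this.sockets.size;
  }
}
```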

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`qunit/no-assert-logical-expression` was failing on three assertions
that combined multiple conditions via `&&` / `||`. Splitting them into
discrete `assert.true(...)` calls makes the failure point obvious when
a test breaks and clears the lint.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Both `packages/workspace-sync-cli/tests/helpers/start-test-realm.ts`
and `packages/realm-test-harness/src/isolated-realm-stack.ts` spawn a
realm-server subprocess that inherits `process.env`. After CI's init
action provisions the dev cert and `env-vars.sh` exports
`REALM_SERVER_TLS_CERT_FILE/_KEY_FILE`, those env vars leak into the
spawned realm-server, which binds the HTTPS+HTTP/2 dispatcher on the
harness's chosen port. The integration tests and the realm-perf bench
both drive plain `http://localhost:<port>/...` URLs against that
server, hit the dispatcher's 301 path, and break: workspace-sync's
CLI fails its session handshake with "expected 'Authorization'
header" (it doesn't follow the redirect through the auth flow), and
the bench fails its first GET with `404` because the realm route is
behind https now.

Same shape of fix as `realm-server/tests/index.ts` for the in-process
qunit suite: destructure the two TLS env-var keys out of the spawn
env so the child inherits everything except those. Plain
`http.createServer` path, no redirect, harness HTTP URLs work as
written. Production realm-servers and local dev are unaffected
because they don't go through these harnesses.
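
A minimal sketch of the destructuring approach, assuming the second variable's full name is `REALM_SERVER_TLS_KEY_FILE` (inferred from the `_KEY_FILE` shorthand above). The helper name `withoutTlsEnv` is hypothetical.

```typescript
type Env = Record<string, string | undefined>;

// Returns a copy of `env` without the two TLS variables, leaving the
// original object untouched.
function withoutTlsEnv(env: Env): Env {
  const {
    REALM_SERVER_TLS_CERT_FILE: _cert,
    REALM_SERVER_TLS_KEY_FILE: _key,
    ...rest
  } = env;
  return rest;
}
```

A harness would then pass something like `{ env: withoutTlsEnv(process.env) }` in the spawn options, so the child realm-server takes the plain `http.createServer` path.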

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`packages/host/testem-live.js` was hardcoding `http://localhost:4201/catalog/`
as the realm URL and launching Chrome with the default trust policy. After
the HTTPS flip, the live-test runner's `discoverTestModules` fetched against
`https://localhost:4201/catalog/...` (via the host's `realmServerURL`
default) but the browser navigated to `http://localhost:4201/...`, getting
a 301 to https and then failing the cert check — `mkcert -install` in CI's
init action is best-effort and the headless Chrome in CI doesn't always
pick up the system trust store anyway.

Two fixes paired:
- Default realm URL flips to `https://localhost:4201/catalog/` so the
  navigation target matches the wire.
- Chrome's CI launch args get `--ignore-certificate-errors` so the live
  test runner accepts the mkcert leaf without depending on system trust.
  Safe — the URL is fixed by REALM_URL and the connection is loopback.

Dev (`launch_in_dev`) doesn't add the flag because local devs typically
have run `mkcert -install` successfully and the cert is trusted normally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…registry

The pre-check needs to fire on a fresh install too. `realm_registry` is
populated by the realm-server's runtime bootstrap (registry backfill +
reconciler), not by migrations, so it's empty when this migration runs
against a freshly-created DB — the migration short-circuited and the
`http://localhost:42XX` permission rows seeded by the earlier
`1726671342065_backfill-realm-owners.js` migration stayed un-rewritten.

The realm-server then matches incoming requests against the new
`https://localhost:42XX/…` canonical and the permission rows fail to
join → world-readable catalog returns 401 → Live Tests fail with
"Cannot access realm https://localhost:4201/catalog/ (HTTP 401)".

Switch the pre-check to `realm_user_permissions.realm_url`, which is
reliably populated with the localhost canonicals by the earlier
seed-style migrations. The rest of the migration body is unchanged —
the per-column WHERE clauses still restrict the touch set to rows that
actually contain the old URL, so production/staging DBs (real
hostnames, never localhost) still no-op.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Test mode runs against the host-internal `http://test-realm/...`
virtual origin via VirtualNetwork; there is no real realm-server on
the wire. Many host test fixtures hardcode the
`http://localhost:4201/...` canonicals in mock setups, VirtualNetwork
mappings, and JSON test data, so flipping the default URLs to https
caused every fetch in the test suite to fail with
`TypeError: Failed to fetch` — the host's VirtualNetwork was wired
with https URL mappings the test mocks didn't recognize.

`environmentDefaults(environment)` now reads the ember env and picks
http for `environment === 'test'`, https otherwise. Dev gets the
HTTPS+HTTP/2 flip exactly as designed; test stays where it always was.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous test-mode-on-http revert was wrong: in Host Tests the
realm-server actually IS running (via mise run test-services:host),
and that realm-server speaks HTTPS+HTTP/2. The host bundle's defaults
need to match the wire so module/data fetches over the wire (like
GET /base/card-api during warmup) reach the live realm-server. With
http defaults, those fetches failed on the http→https scheme mismatch.

So:
- environment.js test mode reverts to https defaults (same as dev).
- test-wait-for-servers.sh + live-test-wait-for-servers.sh default
  their readiness probe URLs to `https-get://` to match.
  live-test-wait-for-servers.sh also gets the same scheme-detection
  helper (`to_wait_scheme`) the other scripts use so an explicit
  REALM_URL with either scheme works.

`http://test-realm/...` URLs in tests (used by the in-memory test
realm registry) are still intercepted by `getRealmInfoForURL` before
any wire fetch — that path is unrelated to the wire defaults and any
remaining failures there are a separate concern from the HTTPS flip.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sweep of every place `http://localhost:4201`/`4202` appears with
runtime impact:

Runtime / wire-touching:
- `package.json` `openrouter:sync` default REALM_URL → https
- `mise-tasks/lib/test-dev-common.sh` stub env defaults → https
- `packages/host/app/services/host-mode-service.ts`
  `originIsNotMatrixTests` accepts both http and https origins on the
  matrix-tests realm ports (https is the new default; http stays
  recognized so older snapshots still detect the test mode).
- `packages/observability/scripts/apply.sh` / `diff.sh` default
  `REALM_SERVER_URL` → https.

Cache import:
- `scripts/import-cached-index.sh` env-mode sed remap now matches both
  `http://localhost:4201` and `https://localhost:4201` — older cache
  snapshots have http canonicals, post-flip dumps have https. Either
  prefix gets rewritten to the env-mode Traefik hostname.

In-tree realm fixture data (cards served by dev realm-server):
- `packages/experiments-realm/**/*.json` and
  `packages/catalog-realm/**/*.json` `id` / `relationships` URLs
  flipped from http to https. Without this every cross-card fetch
  inside a render paid a wire-level 301 redirect from the dispatcher.

Docs:
- `README.md`, `QUICKSTART.md`, `packages/host/docs/live-tests.md`,
  `packages/software-factory/README.md`, `packages/bot-runner/README.md`,
  `docs/commands-in-headless-chrome.md` — example URLs updated.

Not flipped (intentional):
- Test fixture JSONs under `packages/host/tests/cards/`,
  `packages/realm-server/tests/cards/`, ai-bot resource chats, and
  bench-realm snapshot fixtures. Those URLs match test-side mount
  points (`http://test-realm/...`, `http://127.0.0.1:4444/test/`,
  bench-stack http://localhost:4201) where the test infrastructure
  spawns the realm-server with TLS env vars cleared and listens
  plain HTTP. Flipping them would diverge from what the test code
  registers and break the in-process fixtures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Host Tests load the host bundle in a headless Chrome on testem (port
7357). The bundle's `realmServerURL` / `resolvedBaseRealmURL` defaults
now point at `https://localhost:4201` to match the wire, but
`mkcert -install` in CI's init action is best-effort and doesn't
reliably land mkcert's root CA in headless Chrome's NSS trust store.
Without `--ignore-certificate-errors`, every realm fetch made during
shard warmup fails with `TypeError: Failed to fetch` against the
self-signed cert and the rest of the shard never starts.

Same fix already shipped in `testem-live.js`. Loopback only, fixed
origin via host config — safe to relax cert trust.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Boxel-cli's vitest suite (and any other non-qunit caller of these
helpers) doesn't share `packages/realm-server/tests/index.ts`'s
bootstrap, so the global TLS env var delete that protects in-process
qunit fixtures didn't apply to it. The CI init action provisions the
cert, env-vars.sh exports the paths, and the test process inherits
them — the spawned realm-server then binds HTTPS+HTTP/2 on its
fixture port (`127.0.0.1:4446` for boxel-cli) and the CLI's plain-HTTP
session calls fail with `404 Not Found` from the dispatcher's 301
path.

Moving the env-var strip into the two `runTestRealmServer*` helpers
themselves makes it defense-in-depth: every caller (qunit, vitest,
software-factory harness) now goes through the same kill switch when
spinning a fixture realm-server.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…p2-v2

# Conflicts:
#	.claude/skills/indexing-diagnostics/SKILL.md
#	packages/realm-server/scripts/full-reindex.sh
#	packages/realm-server/tests/realm-endpoints/info-test.ts
#	packages/realm-server/tests/realm-endpoints/user-test.ts
Matrix client tests timed out waiting for
`http-get://localhost:4201/base/_readiness-check` because the realm-server
now speaks HTTPS+HTTP/2 only. Wait-on's plain http-get probe never
resolves against the https listener. Same fix for
start-without-matrix.sh (dev convenience script used to bring up the
stack without Synapse).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Card fixture data hardcoded http://localhost:4202 in adoptsFrom.module.
With the realm-server now on HTTPS, the page is served over https and
Chrome blocks mixed-content fetches of the http module URL. Flipping
to https keeps the canonical realm URL consistent with the actual
listener scheme.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
habdelra and others added 30 commits May 13, 2026 10:55
Node's HTTP/2 compat layer marks server-side Http2Stream.writable=false
for HEAD-method streams (the protocol forbids a body, so the stream is
non-writable up front). Koa's ctx.writable getter delegates to
res.socket.writable, so for HEAD over h2 it sees false and respond()
short-circuits on `if (!ctx.writable) return` — no headers are ever
sent and the client hangs until its timeout. Reproduced with bare curl
against the realm-server (every HEAD over h2 timed out, GET worked)
and with a 30-line koa + http2.createSecureServer minimal repro, so
this is not realm- or browser-specific. The host test bundle's
CachingDefinitionLookup.probeRemoteRealm HEAD probe was the visible
symptom that surfaced this on host CI.

patchKoaResponseForH2Head() overrides Koa's response.writable
prototype getter to recognise a healthy HEAD-over-h2 stream as
writable. createListener applies it once when an h2 listener is
constructed. Also: add a forbidden-header filter in setContextResponse
so realm responses don't try to forward hop-by-hop headers (connection,
keep-alive, transfer-encoding, etc.) onto an h2 reply — defence in
depth per RFC 9113 §8.2.2.
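
As a rough sketch of the forbidden-header filter, the function below drops the connection-specific fields that RFC 9113 §8.2.2 names. The function name and the exact list are assumptions of this sketch; the list in `setContextResponse` may be longer or shorter.

```typescript
// Connection-specific fields that must not appear in an HTTP/2 message.
const CONNECTION_SPECIFIC = [
  'connection',
  'keep-alive',
  'proxy-connection',
  'transfer-encoding',
  'upgrade',
];

// Returns a copy of `headers` that is safe to set on an h2 response.
function h2SafeHeaders(headers: Record<string, string>): Record<string, string> {
  const result: Record<string, string> = {};
  Object.keys(headers).forEach((name) => {
    if (CONNECTION_SPECIFIC.indexOf(name.toLowerCase()) === -1) {
      result[name] = headers[name];
    }
  });
  return result;
}
```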

Test: new 'TLS h2 HEAD returns 200 without hanging' regression test in
listener-dispatcher-test (would time out without the patch). Also
register listener-dispatcher-test in tests/index.ts (it was never
running) and fix a pre-existing this-binding bug in its cleanup that
surfaced when the no-cert path started executing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The realm-server-only and worker-base service tasks (used by the CI
matrix tests workflow's start-server-and-test stack) still pointed
base realm's --toUrl at http://localhost:4201/base/ even though the
realm-server now binds HTTPS+h2 on 4201. Result: every request to
/base/* was a registry miss (realm registered under http:// but
incoming request is https://) so /base/_readiness-check returned 404,
wait-on's 10-minute timeout fired, and the whole shard failed.

Bring these two tasks in line with services/realm-server and
services/worker by switching to the https:// form.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The two card-references assertions hardcode the expected sorted dep
list with 'https://localhost:4202/test/person' pinned at position 0.
That position was correct when the URL was 'http://localhost:4202/...'
(http < http:// < https://, so http://localhost:4202 sorted before all
http://localhost:4206 entries). After the canonical-URL flip to https
in this branch, https://localhost:4202/test/person sorts AFTER all
https://cardstack.com/base/* and BEFORE https://packages/* — moving
the entry into that slot.
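
The ordering shift can be reproduced with a plain string sort. This sketch assumes the dep list is sorted by default code-unit order, and the `cardstack.com` and `packages` entries are made-up placeholders rather than the real test data.

```typescript
// Default Array.prototype.sort compares strings by UTF-16 code units.
function sortedDeps(urls: string[]): string[] {
  return urls.slice().sort();
}

// Placeholder entries standing in for the real dependency list.
const others = [
  'https://packages/example-module',
  'https://cardstack.com/base/card-api',
];

// Before the flip: ":" sorts before "s", so the http:// URL comes first.
console.log(sortedDeps(others.concat('http://localhost:4202/test/person'))[0]);
// http://localhost:4202/test/person

// After the flip: "c" < "l" < "p", so the URL lands in the middle.
console.log(sortedDeps(others.concat('https://localhost:4202/test/person'))[1]);
// https://localhost:4202/test/person
```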

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
card-endpoints-test, types-endpoint-test, and module-syntax-test
hardcode card/module URLs at localhost:4202/node-test/. The HTTPS
flip in this branch makes the canonical address https://localhost:4202/,
so:

  - types-endpoint-test asserts the returned card-type-summary `id`
    against the http:// form; the realm returns the canonical https://
    form, so deepEqual fails.
  - card-endpoints-test posts `module: 'http://localhost:4202/.../friend'`
    in the body; the realm tries to resolve a card type at that URL,
    misses the canonical https:// entry in the module cache, and 500s.
  - module-syntax-test passes the URLs to `new URL(...)` for relative-
    path computation — purely string work, but flipping keeps the file
    consistent with its siblings now that the realm speaks https.

Single search/replace across the three files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ands off

Node's http2 compat layer surfaces pseudo-headers (`:method`, `:scheme`,
`:path`, `:authority`) on `req.headers` alongside regular headers.
koa-proxies / http-proxy forwards every header verbatim into
`new http.ClientRequest(...)`, and Node rejects any name starting with
`:` as `ERR_INVALID_HTTP_TOKEN`. Result on the h2 path: every proxied
asset (notably `/auth-service-worker.js`) returns 500. The host bundle
registers the service worker on every page load, so each matrix /
host test refetches it and hits the same 500 on retries — shards
churn for 30+ minutes burning the playwright retry budget.

Wrap the proxy middleware: delete pseudo-headers from `ctxt.req.headers`
before delegating to the inner koa-proxies handler. The URL and method
are already extracted from ctxt, so the upstream HTTP/1.1 request has
everything it needs without the h2 metadata.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous commit deleted h2 pseudo-headers (`:method`, `:path`, …)
directly off `req.headers` before delegating to koa-proxies. Node's
http2 compat layer returns the *internal* headers map from the
`req.headers` getter — the same map `req.method` and `req.url` read
from — so deleting `:method` and `:path` nulled out req.method/req.url
for every subsequent middleware. Koa's `ctx.path` getter (called by
koa-proxies' route matcher) then threw "Cannot read properties of
undefined (reading 'pathname')", every request 500'd, and every Host
Tests shard fell over.

Switch to a non-destructive shadow: define a `headers` value property
on `ctxt.req` with the filtered copy for the inner proxy call, then
delete it in a `finally` so the prototype getter is restored for the
rest of the request lifecycle. Mutates nothing Node owns.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rder

The two previous attempts at h2-proofing koa-proxies both regressed
something:

  - mutating `req.headers` to delete pseudo-headers clobbered Node's
    internal headers map (the same map `req.method` / `req.url` read
    from), turning every request into a 500 with "pathname undefined".
  - shadowing `req.headers` with `Object.defineProperty` + restoring
    via `delete` left the property missing for HTTP/1.1 requests (no
    prototype getter to fall back to), which broke downstream code in
    subtle ways that the realm-server boot did not survive.

The root issue is that http-proxy assigns `req.headers` straight onto
the `outgoing` options bag it hands to `http.ClientRequest`, and there
is no pre-construction hook to filter the headers. Replace the entire
koa-proxies + http-proxy stack with a hand-rolled forwarder: read URL
and headers from the Koa context, pick the headers we want to forward
(skipping `:`-prefixed pseudo-headers and `host`), issue an http.request
against the assets URL, stream the response back via `ctxt.body`. One
code path serves both h1 and h2 callers, no req.headers gymnastics.
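
The header-selection step of such a forwarder might look like the sketch below. `forwardableHeaders` is a hypothetical name and the real forwarder may filter more than this; the point is that it builds a fresh object and never mutates the inbound headers.

```typescript
type IncomingHeaders = Record<string, string | string[] | undefined>;

// Copies the headers worth forwarding to the upstream HTTP/1.1 request.
function forwardableHeaders(
  incoming: IncomingHeaders,
): Record<string, string | string[]> {
  const outgoing: Record<string, string | string[]> = {};
  Object.keys(incoming).forEach((name) => {
    const value = incoming[name];
    if (value === undefined) return;
    if (name.charAt(0) === ':') return; // HTTP/2 pseudo-header
    if (name.toLowerCase() === 'host') return; // upstream request sets its own Host
    outgoing[name] = value;
  });
  return outgoing;
}
```

The result could then be passed as the `headers` option of an `http.request` call against the assets URL.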

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The matrix test fixtures (playwright.config baseURL, helpers/index.ts
ports, the URL maps inside isolated-realm-server itself) all hardcode
`http://localhost:4205/…`. In CI, env-vars.sh exports
`REALM_SERVER_TLS_CERT_FILE` / `_KEY_FILE` for the parent dev stack
that speaks HTTPS+HTTP/2 on 4201 / 4202, and those env vars are
inherited by every child `spawn()` unless explicitly stripped. The
isolated realm-server therefore boots in HTTPS+h2 mode while its realm
registry is keyed on `http://localhost:4205/…` — every
`http://localhost:4205/{test,skills,base}/_mtimes` request from the
worker comes through the dispatcher's plain-HTTP redirect path, lands
on the HTTPS endpoint as a `_mtimes` lookup for the *https://* URL
(which isn't registered), and 404s. The matrix tests then hang waiting
for the page to render against an unindexable realm, blow through the
playwright timeout, and shards run for ~2 hours.

Spawn the prerender / worker-manager / realm-server child processes
with a process.env clone that has the two TLS env vars deleted, so the
isolated stack stays plain HTTP and matches the hardcoded URLs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`mkcert -install` is internally idempotent, but the sudo probe still
prompts for a password on every invocation if it can't verify the root
CA without root privileges. Inside `mise run dev-all`, that prompt
flows through `start-server-and-test`'s child shells alongside the
parallel server output, so the prompt is effectively invisible and
cannot be answered, and the whole dev stack collapses with a SIGTERM cascade
when sudo times out.

Stop trying to invoke `mkcert -install` from the task. Instead, check
upfront whether mkcert's `rootCA.pem` is already present in both the
system trust store and the user's `~/.pki/nssdb`, and exit fast with a
clear message telling the dev to run `mkcert -install` once manually
if either is missing. After the one-time setup this task is a fast
no-op on every invocation, with no chance of stalling dev-all on a
sudo prompt.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ensure-dev-cert already prints a clear "run \`mkcert -install\` once"
message when the root CA isn't trusted, but inside `dev-all`'s
parallel stack that message arrives as one of hundreds of buffered
lines prefixed with `[start:development]   [infra:ensure-dev-cert]`,
intermixed with concurrent output from the other six services. The
downstream cascade (vite dependency-scan restart, prerender /
worker-manager teardown, 45-error rolldown traceback) buries the
actual cause completely — a fresh dev hitting this sees a wall of
plugin errors and no obvious "you need to run mkcert -install" hint.

Invoke `mise run infra:ensure-dev-cert` as the very first step of
dev-all, before we even spawn the host app. The cert check runs in
isolation and its error is the only thing on screen. If it passes,
the inner `mise run` invocations that re-depend on it are fast no-ops.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…message

The first-time mkcert-install path now has a dedicated mise task,
`infra:trust-dev-cert`, that creates ~/.pki/nssdb and runs
`mkcert -install` (with its sudo prompt) interactively. The companion
`infra:ensure-dev-cert` task is now strictly read-only: it verifies the
mkcert root CA is already trusted in both the system store and the NSS
DB and exits 1 with a one-paragraph active-voice message if it isn't:

  The mkcert dev root CA is not installed on this machine.

  Run this once to install it (prompts for sudo):

    mise run infra:trust-dev-cert

  Then re-run the command you just ran.

Both `mise run dev` and `mise run dev-all` now invoke
`infra:ensure-dev-cert` upfront before spawning the parallel stack, so
that error is the first and only thing on screen instead of being
buried under hundreds of multiplexed lines of vite / start-server-and-
test output.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`dev` and `dev-all` now pass `BOXEL_DEV_INVOKED_AS` into the
ensure-dev-cert invocation, and ensure-dev-cert substitutes that into
the "Then re-run …" line. Three concrete variants:

  - `mise run dev`            → "Then re-run `mise run dev`."
  - `mise run dev-all`        → "Then re-run `mise run dev-all`."
  - direct invocation         → "Then re-run `mise run infra:ensure-dev-cert`."

The user no longer has to remember what they typed five lines of
output ago.
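
The substitution itself is done in shell; a TypeScript rendering of the same idea follows. It assumes `BOXEL_DEV_INVOKED_AS` carries the bare task name (`dev`, `dev-all`), which the commit does not spell out, and `untrustedCertMessage` is a hypothetical name.

```typescript
// Builds the error text, substituting the invoking task into the last line.
// Falls back to the task's own name when nothing was passed in.
function untrustedCertMessage(invokedAs?: string): string {
  const task = invokedAs || 'infra:ensure-dev-cert';
  return [
    'The mkcert dev root CA is not installed on this machine.',
    '',
    'Run this once to install it (prompts for sudo):',
    '',
    '  mise run infra:trust-dev-cert',
    '',
    'Then re-run `mise run ' + task + '`.',
  ].join('\n');
}
```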

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The realm-server speaks HTTPS+HTTP/2 on 4201/4202 in local dev, but
vite's dev server was still listening on plain http://localhost:4200.
Browsers visiting http://localhost:4200 then fired cross-origin
requests to http://localhost:4201, which the realm-server's dispatcher
301-redirects to https://. Chrome blocks redirects on CORS preflight
requests ("Redirect is not allowed for a preflight request"), so every
realm-server fetch from the host bundle failed.

When `REALM_SERVER_TLS_CERT_FILE` / `_KEY_FILE` are set (`env-vars.sh`
exports them whenever the mkcert leaf exists), vite now terminates TLS
using the same cert. Both vite dev (`pnpm start`) and vite preview
(`pnpm serve:dist`) pick this up via `server.https` / `preview.https`.
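
A minimal sketch of how the env vars could be turned into TLS options, assuming the key variable's full name is `REALM_SERVER_TLS_KEY_FILE`. The helper name is hypothetical, and the file reader is injected so the sketch stays free of Node imports; the actual vite config may read the files differently.

```typescript
interface HttpsOptions {
  cert: string;
  key: string;
}
type Env = Record<string, string | undefined>;

// Returns TLS options when both env vars are set, otherwise undefined so the
// server stays on plain HTTP.
function httpsOptionsFromEnv(
  env: Env,
  readFile: (path: string) => string,
): HttpsOptions | undefined {
  const certFile = env.REALM_SERVER_TLS_CERT_FILE;
  const keyFile = env.REALM_SERVER_TLS_KEY_FILE;
  if (!certFile || !keyFile) {
    return undefined;
  }
  return { cert: readFile(certFile), key: readFile(keyFile) };
}
```

In a vite config the same result would be assigned to both `server.https` and `preview.https`, with `readFile` backed by something like `fs.readFileSync(path, 'utf8')`.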

Knock-on changes:
  - `env-vars.sh` flips `HOST_URL` to `https://localhost:4200` when the
    cert is present, so the prerender's standby probe, the
    realm-server's distURL asset rewriter, and the test-services
    readiness URLs all stay scheme-consistent.
  - `prerenderer.ts` falls back to `process.env.HOST_URL` (instead of
    hardcoded http) so the prerender's BOXEL_HOST_URL default tracks
    whatever the shell exported.
  - `dev-all`'s host readiness loop and `start-host-dist.sh`'s
    already-running probe pass `-k` to curl so the new HTTPS endpoint
    is reachable even when the system trust store hasn't been
    refreshed since the last `trust-dev-cert` run.
  - The CI workflows (`ci.yaml`, `ci-software-factory.yaml`) flip
    their post-`test-services` readiness curls to
    `https://localhost:4200` with `-k`, matching the new scheme.

`trust-dev-cert` also gained a `certutil` precheck on Linux — without
libnss3-tools, `mkcert -install` only lands the root CA in
/etc/ssl/certs and Chromium (which reads NSS, not the system store)
still rejects the dev cert. Failing fast there with the apt/dnf
command is more useful than letting mkcert emit a buried warning.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`1779100257124_canonical-url-http-to-https` rewrites realm-server's
postgres state but doesn't touch synapse. Every logged-in dev / user
on a stack that boots after the HTTPS+h2 flip keeps reading
`http://localhost:4201/...` from their `app.boxel.realms`
account_data — the host bundle's first realm fetch then hits the
realm-server's dispatcher, which 301-redirects to https://, and the
browser blocks the CORS preflight with "Redirect is not allowed for
a preflight request." Every realm fetch fails until the user clears
localStorage AND someone rewrites the account_data.

New script: `packages/matrix/scripts/migrate-account-data-http-to-https.ts`.
Logs in as admin, paginates `/_synapse/admin/v2/users`, impersonates
each user via `/_synapse/admin/v1/users/{id}/login` to obtain a per-
user token (the standard `account_data` PUT endpoint requires the
user's own token — admin can read but not write other users'), reads
`app.boxel.realms`, rewrites the two localhost prefixes
(`http://localhost:4201/`, `http://localhost:4202/`) to https://, and
PUTs the new list back. Skips users with no realms set, users where
no URL needed rewriting, and the admin user itself (synapse refuses
self-impersonation). Safe to re-run.
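
The prefix rewrite at the core of the script can be sketched as a pure function. This is a minimal illustration rather than the script's actual code: the name `rewriteRealmUrls` and the input shape (a plain array of realm URL strings) are assumptions.

```typescript
// Hypothetical helper, not the script's real code: rewrites the two
// local-dev realm-server prefixes named above from http:// to https://.
const HTTP_PREFIXES = ["http://localhost:4201/", "http://localhost:4202/"];

function rewriteRealmUrls(realms: string[]): {
  realms: string[];
  changed: boolean;
} {
  let changed = false;
  const rewritten = realms.map((url) => {
    const prefix = HTTP_PREFIXES.find((p) => url.startsWith(p));
    if (!prefix) {
      return url; // already https, or not a local-dev realm-server URL
    }
    changed = true;
    return "https://" + url.slice("http://".length);
  });
  return { realms: rewritten, changed };
}
```

A `changed: false` result corresponds to the "no URL needed rewriting" skip, and feeding the function its own output is a no-op, which is the property that makes a re-run safe.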

Wired via:
  - `pnpm migrate-account-data-http-to-https` (packages/matrix)
  - `mise run infra:migrate-matrix-account-data-http-to-https`

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Vite serves HTTPS on :4200 in local dev (mkcert leaf). Typing
`http://localhost:4200/foo` into the browser is a common reflex, and it
currently hangs or fails with `ERR_CONNECTION_REFUSED`. Add a tiny TCP
dispatcher that peeks the first byte of every incoming connection, the
same pattern the realm-server uses on :4201:

  - TLS ClientHello (0x16) → forward raw bytes to vite at an internal
    loopback port so vite still terminates TLS itself with the cert
    it loaded in vite.config.mjs.
  - Anything else (an HTTP verb) → parse the request-target out of
    the start-line and reply 301 to `https://localhost:4200<target>`.

Activated only when `REALM_SERVER_TLS_CERT_FILE` is set (the same
signal `vite.config.mjs` uses to enable `server.https`). Environment
mode (BOXEL_ENVIRONMENT) keeps its existing Traefik path untouched —
the redirect there is the proxy's job, not ours.
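
The dispatch decision described above can be sketched as two pure helpers. This is a minimal sketch, not the dispatcher's actual code: the names `routeFirstChunk` and `redirectResponse` are hypothetical, and it assumes the request start-line arrives whole in the first chunk.

```typescript
// Hypothetical helpers illustrating the dispatch decision; names are not
// taken from the repo.
const TLS_HANDSHAKE_RECORD = 0x16; // first byte of a TLS ClientHello record

type Route =
  | { kind: "forward-tls" }
  | { kind: "redirect"; location: string };

function routeFirstChunk(chunk: Uint8Array, publicOrigin: string): Route {
  if (chunk.length > 0 && chunk[0] === TLS_HANDSHAKE_RECORD) {
    return { kind: "forward-tls" }; // caller pipes the raw bytes to vite
  }
  // Plain HTTP: decode the bytes one-to-one and read the start-line,
  // which looks like "GET /foo?x=1 HTTP/1.1".
  let text = "";
  for (let i = 0; i < chunk.length; i++) {
    text += String.fromCharCode(chunk[i]);
  }
  const startLine = text.split("\r\n", 1)[0] ?? "";
  const target = startLine.split(" ")[1] ?? "/";
  // Only origin-form targets are echoed back; anything else goes to "/".
  const path = target.startsWith("/") ? target : "/";
  return { kind: "redirect", location: publicOrigin + path };
}

function redirectResponse(location: string): string {
  return (
    "HTTP/1.1 301 Moved Permanently\r\n" +
    `Location: ${location}\r\n` +
    "Content-Length: 0\r\n" +
    "Connection: close\r\n\r\n"
  );
}
```

Keeping the decision separate from the socket plumbing makes it testable without opening a port.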

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ime out

Vite is lazy — modules only get bundled when something requests them.
The `wait-for-host-standby` probe is supposed to be that something:
puppeteer navigates to `/_standby` and waits for `#standby-ready` to
appear, which forces vite to optimize the entire host bundle before
the prerender server starts. After this commit the probe was hitting
the wrong scheme:

  - The probe fell back to `http://localhost:4200/_standby` whenever
    `HOST_URL` env wasn't already https. With the new vite dispatcher
    that 301-redirects to https://, the probe's puppeteer was
    bouncing through a redirect to a cert it didn't trust and erroring
    out on every retry. Vite was never actually hit, so its optimizer
    never warmed.
  - The prerender server then booted, opened its own chrome (which
    *does* have `--ignore-certificate-errors` via BrowserManager),
    navigated to `https://localhost:4200/_standby` for the first
    standby creation — and got vite's cold optimizer plus
    ~1000 module fetches. Even over HTTP/2 the cold path runs >30s,
    blowing the page-pool's hard-coded standby navigation budget.

Three changes:

  - `wait-for-host-standby.ts` defaults to `https://localhost:4200`
    when `REALM_SERVER_TLS_CERT_FILE` / `_KEY_FILE` are set, so it
    matches vite's actual scheme even if `HOST_URL` hasn't been
    re-exported in the dev's current shell.
  - The probe's puppeteer now passes `--ignore-certificate-errors`
    when the URL is HTTPS, matching the prerender's BrowserManager.
  - `PRERENDER_STANDBY_TIMEOUT_MS` is now configurable on the PagePool
    constructor (env override). The dev prerender mise task defaults
    it to 120000ms when BOXEL_HOST_URL is HTTPS — gives the cold-vite
    first navigation real headroom. Production / hosted runners keep
    the 30s default unless they opt in.
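
The timeout precedence can be sketched as follows. This is a simplification for illustration: in the change described above the 120000ms default lives in the dev prerender mise task and the PagePool only accepts the override, whereas this hypothetical `resolveStandbyTimeoutMs` folds both into one function.

```typescript
// Hypothetical resolver; the real logic is split between the mise task
// and the PagePool constructor.
const DEFAULT_STANDBY_TIMEOUT_MS = 30_000; // default kept for production
const DEV_HTTPS_STANDBY_TIMEOUT_MS = 120_000; // dev default for an HTTPS host

function resolveStandbyTimeoutMs(
  env: Record<string, string | undefined>,
): number {
  const raw = env["PRERENDER_STANDBY_TIMEOUT_MS"];
  const explicit = Number(raw);
  if (raw && Number.isFinite(explicit) && explicit > 0) {
    return explicit; // an explicit override always wins
  }
  const hostURL = env["BOXEL_HOST_URL"] ?? "";
  if (hostURL.startsWith("https://")) {
    return DEV_HTTPS_STANDBY_TIMEOUT_MS; // headroom for cold vite
  }
  return DEFAULT_STANDBY_TIMEOUT_MS;
}
```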

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…e repo

Vite now terminates TLS on :4200 with the same mkcert leaf the
realm-server uses, so the canonical local-dev host URL is
`https://localhost:4200`. Sweep every place that still bakes in the
old http:// form:

  - Defaults: `env-vars.sh` (HOST_URL — both the standard-mode reset
    branch and the fresh-shell default), `mise-tasks/services/prerender`
    (DEFAULT_HOST_URL), `start-host-dist.sh` (HOST_URL fallback),
    `prerenderer.ts` (defaultHostURL), `main.ts` (distURL default),
    `wait-for-host-standby.ts` (fallback default).
  - Docs: top-level QUICKSTART, AGENTS, README; per-package READMEs
    for host, boxel-homepage-realm, ai-bot, software-factory; the
    host live-tests / HEAP_PROBE notes; the indexing-diagnostics and
    host-test-memory-leak-hunting Claude skills; the
    commands-in-headless-chrome doc.
  - The dev synapse `client_base_url` (email-redirect base) flips so
    matrix registration emails point at the right scheme.
  - README's "view a realm's app" paragraph also rewritten: vite and
    realm-server both speak HTTPS+HTTP/2 now, so there's no more
    mixed-content caveat.
  - Drop the now-redundant `HOST_URL=https://...` override inside
    `env-vars.sh`'s cert-detection block — the unconditional default
    above already sets the right value, and the comment that called
    out the http/https mixing is no longer true.

Kept as http://: in-process test fixtures (realm-server tests strip
TLS env vars; their realm-server runs plain HTTP at 4444/4444+),
matrix isolated-realm-server tests, workspace-sync-cli test helpers,
and a few comments / explanatory references that intentionally cite
the old form ("…now lands on https://", "blob:http://localhost:4200/…"
example URL).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The dev stack's prerender + `wait-for-host-standby` both run puppeteer
against `https://localhost:4200/_standby` and re-trigger vite's
optimizer to bundle ~1300 modules. Chrome 143 (the version bundled
with puppeteer 24.35) hangs forever fetching one of those modules —
specifically the large pre-optimized matrix-js-sdk chunk
(`indexeddb-crypto-store-*.js` ~6 MB) — apparently because of an h2
stream-window bug. curl pulls the same URL over h2 in 100ms; system
Chrome 148 fetches it in seconds; chrome 143 stalls.

Both `BrowserManager` (prerender) and `wait-for-host-standby.ts`
already prefer `PUPPETEER_EXECUTABLE_PATH` when set. Make env-vars.sh
auto-discover a system chrome / chromium / chromium-browser binary
and export the env var, so the standard dev path picks up the fixed
chrome without anyone having to set it manually. Devs who haven't
installed google-chrome locally keep the bundled puppeteer binary —
they'll see the standby probe stall longer, but only until vite's
optimizer cache warms.
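
The discovery order can be sketched in TypeScript, although the real logic lives in `env-vars.sh`. The function name, the injected `resolveOnPath` lookup, and the exact candidate binary names are assumptions made for illustration.

```typescript
// Hypothetical sketch of the lookup order; env-vars.sh does this in shell.
function pickChromeExecutable(
  env: Record<string, string | undefined>,
  candidates: string[],
  resolveOnPath: (name: string) => string | undefined,
): string | undefined {
  const explicit = env["PUPPETEER_EXECUTABLE_PATH"];
  if (explicit) {
    return explicit; // a value the dev set by hand is never overridden
  }
  for (const name of candidates) {
    const found = resolveOnPath(name);
    if (found) {
      return found; // first system browser found wins
    }
  }
  return undefined; // caller falls back to puppeteer's bundled binary
}
```

Injecting the PATH lookup keeps the ordering logic independent of the machine it runs on.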

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tack/boxel into worktree-cs-11114-http2-v2

# Conflicts:
#	packages/realm-server/prerender/prerenderer.ts
packages/host/vite.config.mjs reads REALM_SERVER_TLS_CERT_FILE /
_KEY_FILE and, when set, terminates TLS in vite preview too. The
harness uses dynamic ports and probes readiness via plain
http://localhost:<port>/, then hands that same http URL to its
spawned realm-server via HOST_URL. With the dev stack's TLS env
vars inherited, vite preview would come up on HTTPS, the readiness
fetch would hang, and every downstream HOST_URL fetch from the
spawned realm-server would land on an HTTPS server keyed under the
http:// origin.

Same pattern as the matrix isolated-realm-server fix (12b7fbc) —
strip the two TLS env vars from the spawn() env so the dynamic-port
harness stack stays plain HTTP end-to-end, regardless of whether
the surrounding dev env has the cert configured.
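
A minimal sketch of the env stripping, assuming `_KEY_FILE` abbreviates `REALM_SERVER_TLS_KEY_FILE`; the helper name `withoutTlsEnv` is hypothetical and not taken from the harness.

```typescript
// Hypothetical helper: returns a copy of the environment without the two
// TLS variables, leaving the caller's object untouched.
const TLS_ENV_VARS = [
  "REALM_SERVER_TLS_CERT_FILE",
  "REALM_SERVER_TLS_KEY_FILE",
];

function withoutTlsEnv(
  env: Record<string, string | undefined>,
): Record<string, string | undefined> {
  const copy = { ...env };
  for (const name of TLS_ENV_VARS) {
    delete copy[name];
  }
  return copy;
}
```

The result would be passed as the `env` option of the child-process `spawn()` call, so the child never sees the cert paths even when the parent shell exports them.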

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two HTTPS-related regressions surfaced after the same-port http→https
dispatcher landed:

1. The dispatcher peeked the first byte of every incoming connection,
   socket.unshift()'d it back, and then socket.pipe()'d to vite. The
   unshift+pipe pattern races the upstream socket's connect handshake:
   the rest of the ClientHello arrives and writes to the upstream socket
   before the unshifted byte gets flushed, leaving vite with a corrupt
   handshake and the client with net::ERR_CONNECTION_CLOSED. Switch to
   the more deterministic pattern: do not unshift, instead write the
   peeked byte explicitly on the upstream's 'connect' event, then pipe
   for the remainder.

2. wait-on (in-process inside start-server-and-test) uses bundled axios
   that does not pick up NODE_EXTRA_CA_CERTS reliably on CI runners.
   The readiness probes against https://localhost:42XX therefore time
   out even though env-vars.sh exports NODE_EXTRA_CA_CERTS pointing at
   mkcert's root CA. Disable TLS validation only for the probe (via
   NODE_TLS_REJECT_UNAUTHORIZED=0 scoped to the wait-on invocation) —
   the services under test still present and validate the real cert.

   Also fix mise-tasks/ci/cache-index to support https REALM_BASE_URL:
   it was stripping `http://` only and hardcoding `http-get://`, which
   produced malformed wait-on URLs when REALM_BASE_URL was https.
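
The scheme-preserving conversion for wait-on can be sketched as below. The real task is a mise script, so this TypeScript version and the name `toWaitOnResource` are illustrative assumptions; `http-get://` and `https-get://` are wait-on's resource prefixes for forcing a GET.

```typescript
// Hypothetical helper: maps an http(s) base URL to the matching wait-on
// "-get" resource instead of assuming http.
function toWaitOnResource(baseUrl: string): string {
  const match = /^(https?):\/\/(.+)$/.exec(baseUrl);
  if (!match) {
    throw new Error(`expected an http(s) URL, got: ${baseUrl}`);
  }
  const scheme = match[1];
  const rest = match[2];
  return `${scheme}-get://${rest}`;
}
```

The bug described above corresponds to always emitting `http-get://` while only stripping an `http://` prefix, which leaves `https://` embedded in the result.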

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…only

Two follow-ups after the dispatcher fix didn't unstick CI:

1. realm-server `main.ts` defaulted `--serverURL` to `http://localhost:${port}`
   when no flag was passed (the mise tasks don't pass one). The default
   became the `realmServerURL` JWT claim, so even after rotating the
   realm-server to HTTPS the JWTs it minted still embedded
   `realmServerURL: http://localhost:4201/`. The host's
   assertOwnRealmServer then compared that to its own canonical
   `https://localhost:4201/` and threw
   "Multi-realm server support is not yet implemented: don't know how
   to provide auth token for different realm servers", blanking every
   index card. Hardcode the default to `https://localhost:${port}` —
   the local dev stack requires the mkcert leaf (see
   infra:ensure-dev-cert) and there's no scenario where a missing cert
   should silently flip the canonical claim back to http.

2. CI software-factory job had a Serve-test-assets step that started
   host-dist on :4200 even though SF Playwright tests use the
   realm-test-harness, which is hermetic and brings up its own host on
   dynamic ports (see packages/software-factory/docs/testing-strategy.md).
   The bind was both pointless and an active foot-gun — colliding with
   harness ports and masking host-bring-up regressions. Replace with
   `services:icons` alone (the only external service the harness
   actually consumes via ICONS_URL).

   Also switch wait-on's TLS escape hatch from NODE_TLS_REJECT_UNAUTHORIZED
   to START_SERVER_AND_TEST_INSECURE=1 — start-server-and-test passes
   `strictSSL: !isInsecure()` into wait-on's options, which overrides
   the global env var.
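
The scheme mismatch behind the first item can be reproduced with a small comparison. This is a sketch only: `isOwnRealmServer` is a hypothetical stand-in, and the host's real `assertOwnRealmServer` may compare differently.

```typescript
// Hypothetical stand-ins for the default and the claim check.
function defaultServerURL(port: number): string {
  return `https://localhost:${port}`;
}

function isOwnRealmServer(claimedURL: string, canonicalURL: string): boolean {
  // URL normalizes a missing trailing slash but never the scheme, so
  // http:// and https:// forms of the same host do not compare equal.
  return new URL(claimedURL).href === new URL(canonicalURL).href;
}
```

With the old default, a claim of `http://localhost:4201/` fails this check against `https://localhost:4201/`, while `defaultServerURL(4201)` passes it.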

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…REATE error

The migration CI job's apply-migrations step failed with
`error: database "boxel" does not exist` immediately after the script
logged `created database boxel`. Root cause was two-part:

1. `docker exec boxel-pg psql -U postgres ...` defaulted to the
   unix socket at `/var/run/postgresql/.s.PGSQL.5432` inside the
   container. postgres:16.3 doesn't always create that directory, so
   both the `-lqt` lookup and the `CREATE DATABASE` call failed
   with `connection to server on socket ... No such file or directory`.
2. The script had no `set -e`, so `CREATE DATABASE` failing silently
   fell through to the `echo "created database $PGDATABASE"` line.
   The migrate step then tried to connect to a non-existent database
   over TCP and crashed.

Fix: pass `-h localhost -p 5432` to `psql` and `pg_isready` so the
in-container calls always use TCP (which postgres listens on regardless
of socket availability), and add `set -e` so a CREATE DATABASE failure
exits non-zero instead of fabricating a success log line.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…eview

The byte-peek + cross-process TCP pipe pattern that the dispatcher uses
races chrome's TLS handshake on the CI runners — every prerender probe
to https://localhost:4200/_standby gets net::ERR_CONNECTION_CLOSED
while curl against the same port from a parallel shell succeeds.
This looks like a symptom of an ALPN/h2 framing issue inside the pipe
(TLS termination is at vite, but Node's raw socket.pipe between two
processes apparently mangles enough of the handshake that chrome's
stricter parser bails).

The dispatcher's only real value is `vite` (dev) UX, where a human
types `http://localhost:4200` in a browser bar and expects a 301 to
https. `vite preview` is used by CI and `serve:dist` — there's no
browser bar there, so bind vite preview directly to the public port
with HTTPS and skip the dispatcher. Local dev's `vite` path is
unchanged: it still gets the dispatcher and the http→https redirect.

Also tighten ci/serve-test-assets's wait-on probe: use `https-get://`
to force GET (start-server-and-test's default `https://` resolves to
HEAD, which vite preview behind HTTP/2 doesn't reliably answer in CI).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Postgres Migration CI job validates that every migration is
reversible via up → down → up. The canonical-url-http-to-https
migration's down was a no-op, which broke that contract. Make the
down symmetric to the up:

- packages/postgres/migrations/1779100257124_canonical-url-http-to-https.js:
  extract the rewrite SQL into a `rewriteBlock({ oldScheme, newScheme })`
  helper. `up` calls it http→https; `down` calls it https→http. Same
  `realm_user_permissions` pre-check on the source scheme, so staging
  / production (real hostnames, never `localhost`) is a no-op either
  direction.

- packages/matrix/scripts/migrate-account-data-http-to-https.ts: add a
  `--reverse` CLI flag that flips the URL prefix rewrite. Companion
  pnpm script `migrate-account-data-https-to-http` and mise task
  `infra:migrate-matrix-account-data-https-to-http` invoke it.

- PR description: add a "Rolling back" section pointing users at the
  three-step reverse path (postgres down, matrix reverse, localStorage
  clear).
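
The up/down symmetry can be illustrated with a string-level analogue. The real migration performs the rewrite in SQL; `rewriteScheme`, `up`, and `down` here are hypothetical names that only mirror the `{ oldScheme, newScheme }` parameter shape.

```typescript
// Hypothetical string-level analogue of the SQL rewrite helper.
type Scheme = "http" | "https";

function rewriteScheme(
  url: string,
  opts: { oldScheme: Scheme; newScheme: Scheme },
): string {
  const oldPrefix = `${opts.oldScheme}://localhost:`;
  if (!url.startsWith(oldPrefix)) {
    return url; // real hostnames (staging / production) are left alone
  }
  return `${opts.newScheme}://localhost:` + url.slice(oldPrefix.length);
}

const up = (url: string): string =>
  rewriteScheme(url, { oldScheme: "http", newScheme: "https" });
const down = (url: string): string =>
  rewriteScheme(url, { oldScheme: "https", newScheme: "http" });
```

Because both directions share one helper, `down(up(url))` returns the original localhost URL, which is the reversibility the up → down → up check exercises.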

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…low-insecure-localhost

Chrome 144+ silently demotes `--ignore-certificate-errors` to a
dev-only flag and won't accept self-signed certs unless it's paired
with `--allow-insecure-localhost`. Without that pairing, every TLS
connection to https://localhost:4200 from puppeteer's chrome terminates
the handshake with ERR_CONNECTION_CLOSED — which is what was blocking
the prerender's wait-for-host-standby in CI (and, downstream, every
Host / Matrix test job because realm-server boot depends on prerender
being ready). curl over the same URL worked fine, hiding the cert
trust nature of the problem under what looked like a generic TCP
close.

Pair the flags in both the prerender's BrowserManager and the
standby-warmup script (scripts/wait-for-host-standby.ts).
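
A minimal sketch of the flag pairing, assuming the flags are only added for HTTPS targets as in the earlier standby-probe change; `tlsLaunchArgs` is a hypothetical helper, not the code in BrowserManager.

```typescript
// Hypothetical helper: both flags travel together, so neither call site
// can add one without the other.
function tlsLaunchArgs(hostURL: string): string[] {
  if (!hostURL.startsWith("https://")) {
    return []; // plain-HTTP stacks need neither flag
  }
  return ["--ignore-certificate-errors", "--allow-insecure-localhost"];
}
```

The returned array would be spread into the `args` option of the puppeteer launch call at both call sites.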

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>