Skip to content

Add a volume key signing script, simplify env vars#2926

Open
djeebus wants to merge 5 commits into
mainfrom
add-volume-key-signing-script
Open

Add a volume key signing script, simplify env vars#2926
djeebus wants to merge 5 commits into
mainfrom
add-volume-key-signing-script

Conversation

@djeebus
Copy link
Copy Markdown
Contributor

@djeebus djeebus commented Jun 4, 2026

No description provided.

@cursor
Copy link
Copy Markdown

cursor Bot commented Jun 4, 2026

PR Summary

High Risk
Removes automated provisioning of volume-content JWT signing and verification material; misconfigured or missing env vars will break API signing or belt verification.

Overview
Volume-content JWT signing is no longer created or wired through GCP Terraform: the ed25519 key resources, related variables, Nomad inputs, and verification outputs are removed, and the API env block only sets VOLUME_TOKEN_ISSUER from domain_name (signing method, key, kid, and duration are no longer injected by IaC). Operators are expected to run scripts/generate-volume-content-signing-key.sh and configure VOLUME_TOKEN_* on the API and JWT_VERIFICATION_KEYS on belt volume-content manually.

Reviewed by Cursor Bugbot for commit b7e03c4. Bugbot is set up for automated code reviews on this repo. Configure here.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The use of base64 -w0 in scripts/generate-volume-content-signing-key.sh is not portable and will fail on macOS because the -w option is specific to GNU base64. To ensure the script runs successfully across both Linux and macOS, standard base64 should be used and piped to tr -d '\n' to strip newlines.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +24 to +25
private_b64="$(printf '%s\n' "$private_pem" | base64 -w0)"
public_b64="$(printf '%s\n' "$public_pem" | base64 -w0)"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The use of base64 -w0 is not portable and will fail on macOS because the -w option is specific to GNU base64. To ensure the script runs successfully across both Linux and macOS, use standard base64 and pipe the output to tr -d '\n' to strip newlines.

Suggested change
private_b64="$(printf '%s\n' "$private_pem" | base64 -w0)"
public_b64="$(printf '%s\n' "$public_pem" | base64 -w0)"
private_b64="$(printf '%s\n' "$private_pem" | base64 | tr -d '\n')"
public_b64="$(printf '%s\n' "$public_pem" | base64 | tr -d '\n')"

Comment thread iac/provider-gcp/main.tf
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 97092c45e0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread iac/provider-gcp/main.tf
VOLUME_TOKEN_SIGNING_KEY_NAME = local.volume_token_signature_name
VOLUME_TOKEN_DURATION = var.volume_token_valid_for
VOLUME_TOKEN_SIGNING_METHOD = local.volume_token_signature_method
VOLUME_TOKEN_ISSUER = var.domain_name
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Restore required volume token signing env vars

With the default api_env_vars still empty, this map now renders the GCP API jobs with only VOLUME_TOKEN_ISSUER; packages/api/internal/cfg/model.go marks VOLUME_TOKEN_SIGNING_METHOD, VOLUME_TOKEN_SIGNING_KEY, and VOLUME_TOKEN_SIGNING_KEY_NAME as required, so a fresh/default GCP deployment will fail during config parsing before the API starts unless operators manually inject all three override variables.

Useful? React with 👍 / 👎.

echo
echo "VOLUME_TOKEN_SIGNING_METHOD=EdDSA"
echo "VOLUME_TOKEN_SIGNING_KEY_NAME=${unix_now}"
echo "VOLUME_TOKEN_SIGNING_KEY=ED25519:${private_b64}"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔒 Agentic Security Review
Severity: MEDIUM
The script prints VOLUME_TOKEN_SIGNING_KEY (the private signing key material) directly to stdout. This is a secret-bearing value and can be captured by CI logs, shell history tooling, terminal recording, or shared log sinks if the script is run outside a tightly controlled local session.
Impact: Leakage of this key would allow an attacker to mint valid volume-content JWTs for services trusting the corresponding verification key.

Fix in Cursor Fix in Web

Reviewed by Cursor Security Reviewer for commit 97092c4. Configure here.

Comment thread iac/provider-gcp/variables.tf Outdated
Comment on lines 793 to 795
default = ""
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 The variable volume_token_issuer at variables.tf:793 is now dead code — its only reader was the deleted coalesce(var.volume_token_issuer, var.domain_name) in volume_content_signing_key.tf. The sibling variables volume_token_valid_for and volume_token_signature were removed in this PR but this one was left behind, and main.tf:130 now hardcodes VOLUME_TOKEN_ISSUER = var.domain_name, silently dropping any TF_VAR_volume_token_issuer override an operator has set. Either delete the variable for consistency, or restore coalesce(var.volume_token_issuer, var.domain_name) in main.tf to preserve the override capability.

Extended reasoning...

What the bug is

This PR simplifies the volume-token env-var wiring by deleting iac/provider-gcp/volume_content_signing_key.tf, removing the related locals (volume_token_issuer, volume_token_signing_key, etc.), and hardcoding VOLUME_TOKEN_ISSUER = var.domain_name in iac/provider-gcp/main.tf:130. As part of this cleanup, two sibling input variables — volume_token_valid_for and volume_token_signature — were correctly removed from iac/provider-gcp/variables.tf. But variable "volume_token_issuer" (lines 793–795) was left behind.

How it manifests

A grep for volume_token_issuer across iac/ after this PR returns exactly one hit: the declaration itself at iac/provider-gcp/variables.tf:793. Nothing reads it. Pre-PR, the deleted file contained:

local.volume_token_issuer = coalesce(var.volume_token_issuer, var.domain_name)

That local was wired into local.api_env_vars as VOLUME_TOKEN_ISSUER. The new code at main.tf:130 replaces the whole chain with:

VOLUME_TOKEN_ISSUER = var.domain_name

The coalesce — and therefore the override capability — is gone.

Step-by-step proof of the silent regression

Consider an existing deployment that today sets TF_VAR_volume_token_issuer="https://issuer.example.com" while var.domain_name="api.example.com":

  1. Pre-PR apply: the deleted local computes coalesce("https://issuer.example.com", "api.example.com") = "https://issuer.example.com". The API service receives VOLUME_TOKEN_ISSUER=https://issuer.example.com and signs JWTs with iss="https://issuer.example.com".
  2. Post-PR apply: main.tf:130 ignores var.volume_token_issuer entirely. The API receives VOLUME_TOKEN_ISSUER=api.example.com and starts signing JWTs with iss="api.example.com".
  3. Terraform produces no warning because var.volume_token_issuer is still a valid declared variable — operators have no signal that their override stopped taking effect.
  4. Downstream verifiers (the belt volume-content service, per scripts/generate-volume-content-signing-key.sh) that pin to the previous iss value will reject the newly-issued tokens.

Why existing code does not prevent this

Terraform does not flag unused input variables; only unused locals would raise an error. So the dangling declaration is fully syntactically valid, and the default = "" ensures the variable can still be set without complaint — it just goes nowhere.

How to fix

Two options, pick one:

  • (a) Delete the variable at iac/provider-gcp/variables.tf:793–795. This matches the cleanup intent of the PR (which already removed volume_token_valid_for and volume_token_signature) and is the simplest fix if the override was never meant to survive.
  • (b) Preserve the override by restoring the coalesce wiring in main.tf:130:
VOLUME_TOKEN_ISSUER = coalesce(var.volume_token_issuer, var.domain_name)

If (a) is chosen and any deployment is known to currently set this override, a brief note in the PR description / release notes pointing operators at the behavioral change is worth adding.

Comment on lines +24 to +25
private_b64="$(printf '%s\n' "$private_pem" | base64 -w0)"
public_b64="$(printf '%s\n' "$public_pem" | base64 -w0)"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Nit: base64 -w0 at lines 24-25 is a GNU coreutils extension — BSD base64 (macOS default) doesn't accept -w and exits with base64: invalid option -- w. Combined with set -euo pipefail (line 19), the script aborts before printing any env vars on a stock macOS shell. Swapping to | base64 | tr -d '\n' (or openssl base64 -A) works on both BSD and GNU.

Extended reasoning...

What the bug is. Lines 24-25 use GNU coreutils' -w0 flag to suppress base64 line wrapping:

private_b64="$(printf '%s\n' "$private_pem" | base64 -w0)"
public_b64="$(printf '%s\n' "$public_pem" | base64 -w0)"

BSD base64 (the default on macOS) does not implement -w. Its options per the manpage are -bDdhio, with -b count used for line breaks. Running base64 -w0 on macOS prints base64: invalid option -- w to stderr and exits non-zero.

Why the script aborts. Line 19 sets set -euo pipefail. The non-zero exit propagates through the command substitution and -e immediately aborts the script before any of the echo lines below run. The operator sees the base64 error and nothing else — no env vars are printed.

Step-by-step on macOS. (1) Operator runs ./scripts/generate-volume-content-signing-key.sh. (2) openssl genpkey succeeds, private_pem is populated. (3) openssl pkey -pubout succeeds, public_pem is populated. (4) The first base64 -w0 command substitution fires; BSD base64 errors out with invalid option -- w and exits non-zero. (5) set -e triggers, script aborts. (6) Nothing is printed to stdout — the operator has no env vars to copy into infra.

Impact. This is the documented mechanism for generating volume-content signing keys (the script header explicitly says it exists to emit env vars for the API and belt services). Any e2b engineer running this on a Mac without GNU coreutils installed will hit this. The failure is loud and immediate, not silent corruption — so no risk of bad keys reaching production — but it's a usability defect for the operator workflow.

Fix. The portable replacement that works on both BSD and GNU is one line per call:

private_b64="$(printf '%s\n' "$private_pem" | base64 | tr -d '\n')"
public_b64="$(printf '%s\n' "$public_pem" | base64 | tr -d '\n')"

Or equivalently openssl base64 -A, which is already a dependency of this script.

Addressing the duplicate refutation. A refuter flagged this as a duplicate of bug_003 (which was itself refuted) and argued the rule "duplicates should be refuted" applies. The convention is the opposite: when two reports describe the same issue, the lower-ID/original report is kept and the later one is refuted as the duplicate — not both. bug_002 is the original. The other arguments in that refutation (operator-only script, loud failure, trivial workaround) are valid severity reductions, not reasons to drop the report — they're why this is filed as a nit rather than a normal-severity blocker. Note also that the same base64 -w0 pattern is already used in packages/api/Makefile (lines 97, 111), so this PR is following repo convention; the inconsistency would be fixing only this script. Worth noting in case the team wants to standardize on the portable form across the repo, but not blocking.

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit f3b0952. Configure here.

Comment thread iac/provider-gcp/main.tf

VOLUME_TOKEN_ISSUER = local.volume_token_issuer
VOLUME_TOKEN_SIGNING_KEY = local.volume_token_signing_key
VOLUME_TOKEN_SIGNING_KEY_NAME = local.volume_token_signature_name
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GCP API missing signing env

High Severity

local.api_env_vars now only sets VOLUME_TOKEN_ISSUER, but the API still requires VOLUME_TOKEN_SIGNING_METHOD, VOLUME_TOKEN_SIGNING_KEY, and VOLUME_TOKEN_SIGNING_KEY_NAME at startup. With default empty var.api_env_vars, the API job will not get those variables and config parsing should fail on boot.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit f3b0952. Configure here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is true and intentional. Better to fail obviously and centralize required env var rules than to have two different sources of truth for how things need to be defined and which are necessary.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f3b09525b9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread iac/provider-gcp/main.tf
VOLUME_TOKEN_SIGNING_KEY_NAME = local.volume_token_signature_name
VOLUME_TOKEN_DURATION = var.volume_token_valid_for
VOLUME_TOKEN_SIGNING_METHOD = local.volume_token_signature_method
VOLUME_TOKEN_ISSUER = var.domain_name
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve the volume token issuer override

When a GCP deployment sets volume_token_issuer to a non-domain issuer, this now ignores that still-declared variable and always renders VOLUME_TOKEN_ISSUER as var.domain_name; generateVolumeContentToken copies this config into the JWT iss claim, so volume-content verifiers configured for the overridden issuer will reject newly issued tokens after this change. Restore the previous fallback behavior (coalesce(var.volume_token_issuer, var.domain_name)) or remove the variable with an explicit migration path.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant