UX improvements: TPM reseal (HOTP/TOTP/DUK) adds integrity report; detects disk/tpm swap and guide user into action, add terminal colors and guidance! Reduced quiet noise. #2068

Open
tlaurion wants to merge 25 commits into linuxboot:master from tlaurion:tpm_reseal_ux-integrity_report-detect_disk_and_tpm_swap

@tlaurion (Collaborator) commented Mar 6, 2026

Improve TPM/TOTP/HOTP recovery and reseal behavior by adding integrity-first
gating, clearer failure handling, and stronger rollback preflight checks.

  • add integrity report + investigation flows in GUI, with explicit actions
    before reseal/reset paths
  • introduce TPM reset-required markers and rollback preflight validation to
    fail early on inconsistent TPM state
  • make unseal/seal paths safer and more recoverable (nonfatal unseal mode,
    clearer reset/reseal guidance, better TPM1/TPM2 handling)
  • improve kexec signing reliability with explicit signing key selection and
    actionable GPG error diagnostics
  • avoid hiding interactive password/PIN prompts by removing inappropriate
    debug wrappers around sensitive interactive commands
  • add run_lvm wrapper and switch runtime scripts to reduce harmless LVM noise
  • refresh TPM2 primary-handle hash in update/signing flows to keep trust
    metadata in sync
  • add new qemu fbwhiptail prod_quiet board configs for TPM1 and TPM2
  • fix board-name values for existing qemu hotp prod_quiet variants
  • document QEMU canokey state reuse and TPM2 pcap capture debugging
  • ignore exported public key artifacts (*.asc) in .gitignore
  • add TRACE_FUNC example under doc/logging.md
  • add coloring to console. See doc/logging.md for the changed assumptions: logging to /tmp/debug.log was initially meant to be limited by the informational mode (debug/quiet/info); now everything that can be logged is logged there regardless of mode, without secrets in the output
    • Add STATUS, STATUS_OK and INPUT to /etc/functions to make coloring uniform
    • All wrappers are now colorized with ANSI escape characters
  • And many other fixes found along the way.

Tested (03/11/2026): simulated or real firmware upgrade from master to the ROM artifacts built by this PR's CI

  • qemu-fbwhiptail-tpm2
  • qemu-fbwhiptail-tpm1
  • qemu-fbwhiptail-tpm2-hotp
  • qemu-fbwhiptail-tpm1-hotp
  • qemu-fbwhiptail-tpm2-hotp-prod_quiet
  • qemu-fbwhiptail-tpm1-hotp-prod_quiet
  • v540tu (real hardware, TPM2 + HOTP)
    • Default Debian 13 DVD-based install (LUKS+ext4): factory reset through TPM DUK setup, then kexec into the dev environment (where I do KVM-based development testing, including root hash creation and verification, to extend testing of "Root hash generalize" #2067)
  • x230-hotp-maximized (TPM1.2 + HOTP)
    • Tested root hashes on QubesOS 4.3 (LUKS+ThinLVM+ext4 dom0): creation and verification after updates (continuation of "Root hash generalize" #2067 confirmed working), plus factory reset through TPM DUK setup and kexec into QubesOS 4.3
    • 'o' early at boot still generates a single random diceware passphrase shared for all security components.

Workflow change
CC @wessel-novacustom comments?
There were reports that Heads did not provide integrity checks before resealing TOTP/HOTP. The new checks let the user be confident about the state of /boot before resealing TOTP/HOTP/DUK, which re-signs /boot content.

Normal workflow after upgrading firmware while /boot unchanged

Screenshot_20260311_112447 Screenshot_20260311_183226 Screenshot_20260311_112658 Screenshot_20260311_112706 Screenshot_20260311_183338 Screenshot_20260311_183457

Normal non-hotp boot workflow requesting TPM DUK

Screenshot_20260311_182809

Other corner cases

TPM reset from OS?

Similar to above, but pushes for TPM Reset since TPM reseal won't work
Screenshot_20260311_183714
Screenshot_20260311_183813

Replaced GPG key, mismatch with USB Security dongle, etc.

This is where testing of corner cases is lacking (too much time involved here already)

Copilot AI review requested due to automatic review settings March 6, 2026 16:09
Copilot AI left a comment

Pull request overview

This PR improves Heads’ TPM reseal UX by adding an integrity “gate” (TOTP/HOTP + /boot verification) and better detection/handling of TPM/disk swap or rollback-counter inconsistencies, plus some QEMU-focused debugging/documentation updates.

Changes:

  • Add measured integrity reporting + discrepancy investigation flows, and integrate them into reseal/reset paths in the GUI.
  • Improve TPM rollback-counter handling (preflight validation, clearer error guidance, better prompt visibility).
  • Replace fdisk-based disk display with a sysfs-based helper and add QEMU troubleshooting/debug tips (including TPM2 pcap capture).

Reviewed changes

Copilot reviewed 8 out of 20 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| targets/qemu.md | Adds QEMU troubleshooting notes (Canokey state reuse, TPM2 pcap capture). |
| initrd/etc/gui_functions | Adds integrity report + investigation UI helpers; system info now uses disk_info_sysfs. |
| initrd/etc/functions | Adds trace stack, rollback-counter preflight helpers, sysfs disk info helper, and multiple TPM/boot-device related adjustments. |
| initrd/bin/unseal-totp | Improves TPM2 primary-handle error handling and adds nonfatal mode support. |
| initrd/bin/unseal-hotp | Improves TPM2 primary-handle + rollback-state-aware error handling and adds nonfatal mode support. |
| initrd/bin/tpmr | Improves TPM2 counter increment auth handling, counter-create UX, and TPM2 seal/unseal messaging. |
| initrd/bin/seal-totp | Adds TPM2 primary-handle precheck + clearer sealing failure guidance. |
| initrd/bin/root-hashes-gui.sh | Improves tracing/debugging and adds more flexible LVM LV selection/cleanup. |
| initrd/bin/oem-system-info-xx30 | Switches disk listing to disk_info_sysfs to avoid fdisk/busybox limitations. |
| initrd/bin/oem-factory-reset | Adjusts TPM counter increment handling and removes duplicated integrity report implementation. |
| initrd/bin/kexec-sign-config | Changes TPM counter increment handling and adds a pre-check for empty GPG keyring; modifies signing pipeline. |
| initrd/bin/kexec-select-boot | Hard-fails on TPM2 primary handle hash mismatch with a stronger warning. |
| initrd/bin/kexec-seal-key | Tweaks passphrase prompts/formatting for improved UX. |
| initrd/bin/gui-init | Adds integrity gate + rollback-counter preflight UX and integrates investigation/report flows. |
| boards/qemu-coreboot-fbwhiptail-tpm2/qemu-coreboot-fbwhiptail-tpm2.config | Documents TPM2 pcap capture option in board config. |
| boards/qemu-coreboot-fbwhiptail-tpm2-prod_quiet/qemu-coreboot-fbwhiptail-tpm2-prod_quiet.config | Adds a new “prod_quiet” QEMU TPM2 board config. |
| boards/qemu-coreboot-fbwhiptail-tpm2-hotp-prod_quiet/qemu-coreboot-fbwhiptail-tpm2-hotp-prod_quiet.config | Adjusts board name and minor formatting. |
| boards/qemu-coreboot-fbwhiptail-tpm1-prod_quiet/qemu-coreboot-fbwhiptail-tpm1-prod_quiet.config | Adds a new “prod_quiet” QEMU TPM1 board config. |
| boards/qemu-coreboot-fbwhiptail-tpm1-hotp-prod_quiet/qemu-coreboot-fbwhiptail-tpm1-hotp-prod_quiet.config | Adjusts board name. |
| .gitignore | Ignores *.asc files. |

@tlaurion tlaurion force-pushed the tpm_reseal_ux-integrity_report-detect_disk_and_tpm_swap branch from 3f855b8 to 3f2fe25 Compare March 6, 2026 16:36
@tlaurion tlaurion requested a review from Copilot March 6, 2026 16:36
Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 20 changed files in this pull request and generated 1 comment.


Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 19 changed files in this pull request and generated 3 comments.


@tlaurion tlaurion marked this pull request as draft March 7, 2026 03:41
@tlaurion tlaurion force-pushed the tpm_reseal_ux-integrity_report-detect_disk_and_tpm_swap branch 6 times, most recently from b905930 to 8be0849 Compare March 8, 2026 14:36
@tlaurion tlaurion changed the title from "Tpm reseal ux integrity report detect disk and tpm swap" to "UX improvements: TPM reseal (HOTP/TOTP/DUK) adds integrity report; detects disk/tpm swap and guide user into action" Mar 8, 2026
@tlaurion tlaurion force-pushed the tpm_reseal_ux-integrity_report-detect_disk_and_tpm_swap branch from 8be0849 to 5b6ab4f Compare March 8, 2026 15:13
@tlaurion tlaurion requested a review from Copilot March 8, 2026 15:14
@tlaurion tlaurion force-pushed the tpm_reseal_ux-integrity_report-detect_disk_and_tpm_swap branch 3 times, most recently from 934b5c8 to 8b41926 Compare March 17, 2026 18:28
@wessel-novacustom
Looks good to me.

tlaurion added 25 commits March 24, 2026 21:58
…logging.md

Introduce uppercase logging functions as the canonical API for all initrd
scripts, replacing the old lowercase die()/warn() and bare echo patterns:

  DIE     - fatal error, bold red, always visible, waits for Enter
  WARN    - likely problem that can continue, bold yellow, always visible
  STATUS  - action announcement (precedes STATUS_OK/WARN/DIE), bold blue
  STATUS_OK - confirmed positive outcome, bold green
  INFO    - contextual info for end users, suppressed in quiet mode
  NOTE    - security reminders and hand-off to uncontrolled output (GPG etc.)
  INPUT   - interactive prompt wrapper (echo after read so single-char
            keypresses do not bleed onto the next output line)
  DEBUG   - developer-facing state/decision tracing, debug.log only
  LOG     - unconditional debug.log write (no console), for audit trail

All functions write plain text to debug.log regardless of output mode so
the full trace is always available for post-mortem analysis.
Console output uses ANSI escape codes for color; plain-text prefix strings
(e.g. "!!! ERROR:", "*** WARNING:") carry the same meaning for users who
cannot distinguish colors.
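
The split described above (plain text always written to debug.log, colored text on the console) might be sketched as follows. This is a minimal illustration, not the code in etc/functions: the `DEBUG_LOG` and `QUIET` variables and the `_log_plain` helper are hypothetical names, the "INFO:" log prefix is an assumption, and only the "*** WARNING:" prefix is taken from the text above.

```shell
# Illustrative sketch only; variable names and the INFO prefix are assumptions.
DEBUG_LOG="${DEBUG_LOG:-/tmp/debug.log}"   # hypothetical variable for the log path
QUIET="${QUIET:-n}"                        # hypothetical stand-in for quiet mode

# Append an uncolored line to the log regardless of output mode.
_log_plain() {
    printf '%s\n' "$*" >>"$DEBUG_LOG"
}

# WARN: bold yellow on the console, always visible; plain text in the log.
WARN() {
    _log_plain "*** WARNING: $*"
    printf '\033[1;33m*** WARNING: %s\033[0m\n' "$*"
}

# INFO: suppressed on the console in quiet mode, but still logged.
INFO() {
    _log_plain "INFO: $*"
    [ "$QUIET" = "y" ] || printf '%s\n' "$*"
}
```

The text prefix is kept inside the colored string so the meaning survives on terminals or for readers where color is not available.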

Add pin_color() helper: returns green/yellow/red escape code based on
remaining PIN retry count (>=3 green, 2 yellow, <=1 red).
Add release_scdaemon(): kills gpg-agent + scdaemon together (killing only
scdaemon is insufficient - gpg-agent immediately restarts it), waits for
NK3 CCID teardown delay before returning.
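
As a rough sketch of the threshold mapping described for pin_color(), assuming bold ANSI colors (the exact escape sequences used in Heads are not stated here and are an assumption):

```shell
# Sketch of the retry-count to color mapping; escape codes are assumptions.
pin_color() {
    retries="$1"
    if [ "$retries" -ge 3 ]; then
        printf '\033[1;32m'   # green: 3 or more attempts left
    elif [ "$retries" -eq 2 ]; then
        printf '\033[1;33m'   # yellow: 2 attempts left
    else
        printf '\033[1;31m'   # red: 1 or 0 attempts left
    fi
}
```

A caller could then wrap a message with the returned code and a reset, for example `printf '%sPIN retries remaining: %s\033[0m\n' "$(pin_color 2)" 2`.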

Rewrite doc/logging.md to document:
- All log levels with their console visibility per output mode
- ANSI color rationale (color as enhancement, not sole signal)
- STATUS vs INFO boundary: STATUS precedes an outcome; INFO is config-path
  description. PIN retry counters are STATUS (state before PIN-consuming op).
- NOTE patterns: security reminders vs hand-off to uncontrolled output
- WARN vs DEBUG: unexpected situations visible to users vs dev tracing

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
…ction

Strengthen TPM integrity and reseal paths to detect and gate on disk swap,
TPM swap, or unexpected TPM reset before allowing resealing.

tpmr:
- Add HEADS_NONFATAL_UNSEAL support: callers can request non-fatal unseal
  and inspect the exit code rather than dying on first failure
- Improve TPM1/TPM2 error handling and retry logic for password entry
- Add TPM reset-required marker tracking helpers (tpm_reset_required,
  clear_tpm_reset_required) so scripts can signal early-abort conditions
- Refresh TPM2 primary-handle hash in update/signing flows to keep trust
  metadata in sync with actual TPM state

kexec-unseal-key / kexec-seal-key:
- Use new logging functions (STATUS/STATUS_OK/WARN/DIE/INPUT)
- kexec-unseal-key: support nonfatal unseal mode so callers can handle
  failures gracefully rather than hard-exiting
- kexec-seal-key: add STATUS progress messages around TPM operations

gui_functions (new functions):
- report_integrity_measurements(): full integrity report with TOTP/HOTP
  verification and /boot detached signature check; adds STATUS/STATUS_OK
  around slow hotp_verification info and gpgv calls to inform users during
  silent delays
- investigate_integrity_discrepancies(): guided investigation flow when
  hashes do not match — offers options to view diffs, show changed files,
  reset TPM, or reseal
- detached_kexec_signature_valid(): verifies /boot kexec detached signature
- _whiptail_preprocess_args(): shared argument preprocessing for whiptail

gui-init:
- Add gate_reseal_with_integrity_report(): integrity check must pass before
  any TPM reseal path is allowed; sets INTEGRITY_GATE_REQUIRED on mismatch
- Add prompt_missing_gpg_key_action(): per-state signing key guidance
  (AVAILABLE / CARD UNPROVISIONED / CARD KEY DOES NOT MATCH / NO CARD)
  replacing a generic catch-all message
- Add rollback preflight validation: fail early on inconsistent TPM state
  (counter unreadable = disk swap / TPM swap / TPM reset) before presenting
  TOTP/HOTP prompts
- Suppress redundant integrity report when navigating to OEM Factory Reset
  from within the report (INTEGRITY_REPORT_ALREADY_SHOWN)
- Call wait_for_gpg_card silently first; only prompt to insert card if not
  already detected
- Call enable_usb unconditionally at startup (was gated on HOTP config)
- update_hotp(): PIN retry counter shown only on retry attempts, not on
  normal first-attempt success, to eliminate spurious PIN counter display
  during normal boot

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
kexec-sign-config:
- Add pre-signing validation: check at least one public GPG key is present
  before attempting to sign, with actionable error guidance
- Add explicit signing key selection logic with key ID handling
- Add targeted error detection for common GPG failures:
    dirmngr unavailability, missing secret key, bad/wrong PIN,
    blocked smartcard PIN — each with an actionable user message
- Separate stdout (signature) from stderr (status) in gpg invocation
- Use DIE/STATUS/DEBUG logging functions throughout

seal-hotpkey:
- Add show_pin_retries() helper: re-queries device and displays remaining
  PIN retry count with color coding via pin_color() (green/yellow/red)
- Show PIN retry counter before each manual PIN entry prompt so users know
  how many attempts remain before lockout
- Use INPUT for all prompts so single-char keypresses work without Enter
- Replace echo/warn/die with STATUS/STATUS_OK/WARN/NOTE/DIE
- NOTE: "Nitrokey 3 requires physical presence" (hand-off to hardware)

key-init / gpg-gui.sh:
- Replace bare echo/die/warn with STATUS/STATUS_OK/WARN/DIE/NOTE
- gpg-gui.sh: use NOTE for GPG card interaction prompts (hand-off pattern)

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
…e boot TPM DUK

Replace bare echo/die/warn patterns with STATUS/STATUS_OK/INFO/NOTE/WARN/DIE
and INPUT across all kexec and boot selection scripts.

kexec-select-boot:
- Use DIE/WARN/STATUS/INFO/INPUT throughout
- Consolidate redundant multi-WARN blocks into single actionable messages
- Fix: insecure boot path (force_boot=y) must never request TPM Disk Unlock
  Key — add && [ "$force_boot" != "y" ] guard to DUK extraction condition
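
The effect of the force_boot guard can be illustrated with a small predicate. This is a sketch under assumptions: `should_request_duk` is a hypothetical helper (the real condition is inline in kexec-select-boot), and the first test, a non-empty key-devices file, is assumed for illustration; only the `[ "$force_boot" != "y" ]` guard comes from the commit message.

```shell
# Hypothetical helper: succeeds only when a TPM Disk Unlock Key may be requested.
should_request_duk() {
    key_devices_file="$1"   # assumed: file listing devices that hold a DUK
    force_boot="$2"
    # The second test is the new guard: the insecure boot path never asks for a DUK.
    [ -s "$key_devices_file" ] && [ "$force_boot" != "y" ]
}
```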

kexec-save-default:
- STATUS_OK for completed operations (e.g. /secret.key added to entries)
- INFO for config-path descriptions (no crypttab found)

kexec-insert-key / kexec-save-key / kexec-boot:
- Replace echo/die/warn with logging functions
- Use INPUT for interactive prompts

kexec-parse-boot / kexec-parse-bls:
- Minor logging consistency fixes

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
…leanup

Replace all bare echo/die/warn patterns with STATUS/STATUS_OK/INFO/NOTE/WARN/
DIE/INPUT throughout oem-factory-reset.

Key changes:
- INFO for config-path descriptions (key generation location, subkey copy
  policy) — these describe what will happen, not outcomes
- NOTE for security reminders (backup thumb drive handling, subkey warnings)
  and hand-off to uncontrolled GPG output
- INPUT for all interactive prompts (PIN entry, confirmations, pauses)
- STATUS/STATUS_OK around operations with measurable outcomes
- WARN for genuinely unexpected situations (unknown launch mode)
- INFO (not WARN) for intentional user actions like aborting a wipe

Remove standalone report_integrity_measurements() — now in gui_functions.
Suppress redundant integrity report display when already shown in the
current session (INTEGRITY_REPORT_ALREADY_SHOWN guard).
Use INPUT function for all read prompts so single-char entry works.

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
initrd/init:
- Capture coreboot CBMEM console log before Heads extends PCRs, storing
  it to /tmp/measuring_trace.log for boot measurement audit trail
- Update quiet mode messaging to reference /tmp/measuring_trace.log
- Replace bare echo/warn patterns with NOTE/WARN/STATUS/INFO
- NOTE for BASIC mode tamper detection disabled (security reminder)
- Fix serial console recovery shell: use RECOVERY_TTY env var pattern
  so the shell respawns correctly when exited rather than dying silently
- WARN for missing CONFIG_SYSCTL/CONFIG_PROC_SYSCTL (unexpected kernel
  config, users need to see it)

reboot:
- On qemu-* boards, call poweroff instead of reboot (qemu has no reset);
  pause for recovery shell before exiting so output is readable
- Use STATUS/DIE logging functions

gui-init-basic / generic-init:
- Replace echo/die/warn with STATUS/STATUS_OK/WARN/DIE

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
Replace bare echo/die/warn patterns with STATUS/STATUS_OK/INFO/NOTE/WARN/
DIE/INPUT in all remaining initrd scripts for consistency with the logging
infrastructure introduced in the companion commit.

Scripts updated: luks-functions, mount-boot, cbfs-init, media-scan,
mount-usb, kexec-iso-init, config-gui.sh, root-hashes-gui.sh,
network-init-recovery, qubes-measure-luks, lock_chip, flash.sh,
flashprog-kgpe-d16-openbmc.sh, inject_firmware.sh, change-time.sh,
uefi-init, unpack_initramfs.sh, usb-autoboot.sh, usb-init, wget-measure.sh,
seal-totp, unseal-totp, unseal-hotp, wipe-totp, tpm-reset,
sbin/config-dhcp.sh, sbin/insmod.

Notable functional changes:
- unseal-totp/unseal-hotp: use INPUT for prompts; STATUS/STATUS_OK around
  TPM unseal operations
- seal-totp: STATUS around QR code generation
- luks-functions: demote "No encrypted LVMs/devices found" to DEBUG
  (expected state on non-LUKS systems, not an actionable warning)
- network-init-recovery: use INPUT for all interactive prompts
- add run_lvm wrapper in luks-functions to suppress harmless LVM noise

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
Fix board-name values for existing QEMU HOTP prod_quiet variants (were
using wrong BOARD_NAME strings). Unify QEMU configs so they serve as
comprehensive examples covering all board configuration options.

Add new prod_quiet board configs:
- qemu-coreboot-fbwhiptail-tpm1-prod_quiet
- qemu-coreboot-fbwhiptail-tpm2-prod_quiet

These configs set CONFIG_QUIET_MODE=y for testing the quiet boot path
with TPM1 and TPM2 respectively, without HOTP.

Add *.asc to .gitignore to exclude exported GPG public key artifacts
(armored public key files generated during OEM re-ownership testing).

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
… symlinks

Move content from top-level and scattered locations into doc/ as the
single authoritative location for all documentation:

New doc/ files:
- doc/architecture.md    - system architecture overview
- doc/boot-process.md    - boot flow walkthrough
- doc/docker.md          - Docker build environment guide
- doc/faq.md             - FAQ (was FAQ.md at root)
- doc/qemu.md            - QEMU testing guide (was targets/qemu.md);
                           documents canokey state reuse and TPM2 pcap
                           capture debugging
- doc/security-model.md  - threat model and trust assumptions
- doc/tpm.md             - TPM usage and PCR layout
- doc/ux-patterns.md     - UX patterns and whiptail conventions
- doc/variation-to-defconfig.md  - variation-to-defconfig guide
- doc/wp-notes.md        - write-protect notes (was WP_NOTES.md)
- doc/BOARDS_AND_TESTERS.md      - boards and testers list

Convert top-level files to symlinks pointing to doc/ equivalents so
existing URLs and bookmarks remain valid:
  BOARDS_AND_TESTERS.md        -> doc/BOARDS_AND_TESTERS.md
  FAQ.md                       -> doc/faq.md
  WP_NOTES.md                  -> doc/wp-notes.md
  config/variation_to_defconfig.md -> ../doc/variation-to-defconfig.md
  targets/qemu.md              -> ../doc/qemu.md

Trim README.md to an index pointing to doc/ rather than duplicating
documentation content inline.

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
…e indicator

hotp_verification info output was previously printed raw to the user.
After the logging refactor it was captured but the firmware version fields
were silently discarded.  Restore visibility so users know when to upgrade.

Add hotpkey_fw_display() helper in etc/functions:
- Parses "hotp_verification info" output for Firmware fields
- NK3:          "Firmware Nitrokey 3: vX.Y.Z" (min: v1.8.3)
                also shows "Firmware Secrets App: vX.Y"
- Nitrokey Pro: "Firmware: vX.Y"               (min: v0.15)
- Librem Key:   shown as-is (minimum version unknown)
- Green  if version >= minimum recommended
- Yellow if version is older (upgrade recommended)

Call hotpkey_fw_display() after successful dongle detection in:
- seal-hotpkey:                  shown alongside PIN retry counter
- gui-init update_hotp():        shown on every HOTP boot check
- gui_functions report_integrity_measurements(): shown in integrity report

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
The PIN retry count was shown twice when the default PIN was not tried
(either past the 1-month window or retries < 3): once at the top of the
script from the initial hotp_token_info parse, and again immediately via
show_pin_retries() before the manual PIN prompt.

Remove the initial STATUS display. show_pin_retries() is already called
before every manual PIN entry and re-queries the device each time, so the
count shown is always accurate (reflects any attempt that consumed a retry).

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
…m STATUS flow

The "Resealing TPM Disk Unlock Key alongside TOTP/HOTP secret" message is
a side-effect annotation, not an action with its own STATUS_OK outcome.
Using NOTE makes it visually distinct from the surrounding STATUS/STATUS_OK
flow so users can tell at a glance that the DUK reseal is a secondary
operation piggybacking on the TOTP/HOTP reseal, not a standalone action.

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
…uiet mode)

INFO is suppressed in quiet mode (CONFIG_QUIET_MODE=y), so the firmware
version of the USB security dongle was never shown to users running quiet
builds. Firmware version is security-relevant: users need to see it to
know when their dongle needs an upgrade. Switch hotpkey_fw_display() to
NOTE, which is always visible in all output modes.

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
Add STATUS/STATUS_OK around the extraction loop so the user always sees
when cbfs-init starts and finishes.  Demote per-file output to DEBUG and
update the STATUS text to describe what is being extracted.

When seal-hotpkey fails mid-way (connection error, dongle removed),
the HOTP slot on the dongle is left unconfigured. On the next boot,
hotp_verification check returns exit code 6 (EXIT_SLOT_NOT_PROGRAMMED)
which was unhandled, falling into the generic transient-error retry loop
and leaving the user with no actionable guidance.

- Add exit code 6 case in update_hotp() retry loop: break immediately
  (retrying cannot configure an unconfigured slot), set HOTP status to
  "HOTP slot not configured" and BG_COLOR_MAIN_MENU="warning".
- Add a whiptail dialog for the slot-not-configured case that explains
  the likely cause and offers "Generate new TOTP/HOTP secret" or
  recovery shell as next steps.
- Export HOTPKEY_BRANDING after it is set in gui-init and seal-hotpkey
  so all child processes inherit the value without re-reading
  /boot/kexec_hotp_key. Re-export after the VID-based override in
  seal-hotpkey so the correct branding propagates.
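
The retry behavior for exit code 6 might look roughly like the sketch below. The function name, the retry budget of 3, and the printed strings are assumptions for illustration; only the exit code value and the "break immediately" rule come from the description above. The command to run is passed in as arguments so the logic can be exercised without a dongle.

```shell
EXIT_SLOT_NOT_PROGRAMMED=6   # value stated in the commit message

# Hypothetical sketch: run the given check command, retrying transient errors only.
hotp_check_with_retry() {
    max_tries=3   # assumed retry budget
    try=1
    while [ "$try" -le "$max_tries" ]; do
        check_rc=0
        "$@" || check_rc=$?
        case "$check_rc" in
            0)
                echo "success"
                return 0 ;;
            "$EXIT_SLOT_NOT_PROGRAMMED")
                # Retrying cannot configure an unconfigured slot: stop now.
                echo "HOTP slot not configured"
                return "$check_rc" ;;
            *)
                try=$((try + 1)) ;;   # treat as transient and retry
        esac
    done
    echo "failed after $max_tries attempts"
    return 1
}
```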

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
…nly on first setup

/boot/kexec_hotp_key is written at the end of a successful seal and
already holds the correct branding string. The VID-based detection block
was unconditionally overwriting it on every run, discarding the stored
value and always falling back to the generic "Nitrokey" label.

Only run VID detection when the file does not yet exist (first-time
OEM setup). On all subsequent seals the stored content is used as-is,
which preserves any more specific branding set by the previous run.

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
…loop

hotp_verification check does not consume a PIN retry - it verifies an
HOTP code, not a PIN. Showing "PIN retries remaining" in the transient
error retry path was misleading (implied a PIN was consumed) and caused
the counter to be displayed twice when two consecutive transient failures
occurred (USB glitch, NK3 connection error).

Remove the re-query of hotp_verification info and the PIN retries STATUS
from the retry handler; the WARN about the failed attempt is sufficient.
The unused hotp_pin_retries and prompt_label locals are also removed.

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
Two related bugs caused the GPG User PIN retry counter to appear stuck:

1. gpg_auth() (functions): confirm_gpg_card (which shows the current PIN
   counter) ran only in a pre-loop before the signing loop. The 3-attempt
   signing loop never re-queried the counter, so after a bad PIN the user
   saw "GPG authentication failed, please try again" with no updated count.
   Fix: move confirm_gpg_card inside the signing loop so it runs before
   each attempt, showing the decremented count after each wrong PIN.
   Use "until (confirm_gpg_card); do true; done" to preserve the existing
   card-presence retry behaviour within each signing attempt.

2. kexec-sign-config: confirm_gpg_card is correctly at the top of the
   for-tries loop, but bad PIN immediately called DIE, preventing tries 2
   and 3 (with their updated count display) from being reached.
   Fix: on bad PIN with tries < 3, WARN and continue so the next loop
   iteration calls confirm_gpg_card again and shows the decremented count.
   On tries == 3, DIE with the full remediation message as before.
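
The kexec-sign-config fix in item 2 can be pictured as the loop below. It is a simplified sketch: `sign_with_retries` is a hypothetical name, the two arguments stand in for the real gpg invocation and confirm_gpg_card, and the final `return 1` replaces the DIE call of the actual script.

```shell
# Hypothetical sketch of the retry policy: counter shown before every attempt,
# WARN-and-continue on attempts 1 and 2, fatal only on attempt 3.
sign_with_retries() {
    sign_once="$1"          # stand-in for the gpg signing step
    show_pin_counter="$2"   # stand-in for confirm_gpg_card
    tries=1
    while [ "$tries" -le 3 ]; do
        "$show_pin_counter"   # re-query so the decremented count is visible
        if "$sign_once"; then
            return 0
        fi
        if [ "$tries" -lt 3 ]; then
            echo "*** WARNING: bad PIN, attempt $tries/3 failed" >&2
        else
            echo "!!! ERROR: signing failed after 3 attempts" >&2
            return 1   # the real script calls DIE here
        fi
        tries=$((tries + 1))
    done
}
```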

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
Show "Attempt N/3" on the first prompt as well, not only on retries.
Move minimum firmware version constants out of functions into a dedicated
etc/dongle-versions file.  Add a warning when the dongle firmware predates
NK3 and requires external reprogramming rather than an in-system upgrade.
…ove atomically on success

Both scripts write to a staging directory under /tmp rather than directly
to the destination, then move files into place atomically on success.
kexec-save-default also includes the staging path in DEBUG messages.
GPG signature verification (check_config, detached_kexec_signature_valid,
root-hashes-gui.sh, oem-factory-reset) was broken after the mktemp/atomic
staging changes: sha256sum embedded absolute staging-dir paths
(/tmp/kexec-sign-XXXXXX/kexec_hashes.txt) into the signed data while
verification re-ran sha256sum with /boot/kexec_hashes.txt paths, producing
a guaranteed BAD signature on every boot after TPM reset or re-sign.

Fix: all signing and verification now cd into the target directory and use
relative filenames, so the sha256sum output is path-independent and matches
across sign→move→verify.  Same pattern applied uniformly to all five call
sites.
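
A small demonstration of why relative filenames make the hashed data path-independent. The temporary directories here only stand in for the staging directory and /boot, and the flow is simplified (no GPG signing step); it is an illustration of the pattern, not the Heads code.

```shell
# Stand-ins for /tmp/kexec-sign-XXXXXX and /boot.
sign_dir="$(mktemp -d)"
boot_dir="$(mktemp -d)"
printf 'example hash list\n' >"$sign_dir/kexec_hashes.txt"

# Hash from inside the staging directory using a relative filename.
staged_line="$(cd "$sign_dir" && sha256sum kexec_hashes.txt)"

# Move into place, then hash again from inside the destination directory.
mv "$sign_dir/kexec_hashes.txt" "$boot_dir/"
final_line="$(cd "$boot_dir" && sha256sum kexec_hashes.txt)"

# For contrast: an absolute path embeds the directory in the output line.
abs_line="$(sha256sum "$boot_dir/kexec_hashes.txt")"

echo "relative (staged): $staged_line"
echo "relative (final):  $final_line"
echo "absolute (final):  $abs_line"

rm -rf "$sign_dir" "$boot_dir"
```

The two relative lines are identical, so data signed in staging still verifies after the move; the absolute line differs because it carries the directory name.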

TPM DUK sealing (kexec-seal-key) hardened:
- DRK passphrase is now tested against ALL selected devices before accepting;
  partial success (some devices unlockable) is reported to the user with an
  explicit confirmation prompt; only the unlockable subset proceeds.
- kexec_key_devices.txt is rewritten to the unlockable subset so boot-time
  unlock is not attempted against devices that never received a DUK.
- Hard guard at luksKillSlot: DIE if the slot to wipe equals drk_key_slot,
  regardless of how wipe_desired was set — prevents DRK destruction.
- find_drk_key_slot() now takes dev and keyslots as explicit arguments
  (was implicitly inheriting outer-scope variables).
- mapfile used instead of word-splitting subshell for luks_used_keyslots.
- All unquoted variables and [ p -o q ] patterns fixed throughout.

LUKS device/LVM selection (kexec-save-default):
- 'all' keyword accepted in device/LVM selection prompts; expands to all
  discovered devices.  Empty input no longer silently accepted as valid.
- Prompt text updated to make 'all' discoverable.

TPM rollback preflight warning (gui-init):
- When the TPM counter cannot be read, the dialog now explicitly warns that
  /boot must be treated as UNTRUSTED if the condition was not intentional,
  matching the severity language of the integrity report.

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
All $paramsdir/$paramsdev references now quoted to prevent word-splitting.
Added comment explaining that kexec-seal-key may rewrite kexec_key_devices.txt
to the unlockable DRK subset before kexec-sign-config runs, so the signed
config always reflects only devices that actually received a DUK.

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
…ssphrase handling

kexec-seal-key: defer all paramsdir writes into one rw mount window at the
end of the script.  Previously the kexec_key_devices.txt cp happened early
with no remount guard, so when reseal_tpm_disk_decryption_key called
kexec-seal-key directly (not via kexec-save-key) the write failed with
EROFS because /boot was still mounted ro.  kexec_lukshdr_hash.txt was
already guarded; now both writes share one mount -o rw,remount / cp -f /
mount -o ro,remount block.  Add cp -f to both writes for consistency.

luks-functions:
- luks_reencrypt: remove redundant passphrase re-read block (dead code
  since test_luks_current_disk_recovery_key_passphrase already sets and
  exports the variable); replace seq 0 31 brute-force keyslot scan with
  luksDump-based enumeration of only the enabled slots (matches the
  approach in kexec-seal-key).
- luks_change_passphrase: move new passphrase prompt before the
  per-container loop (was inside the elif on first iteration, confusing);
  write temp files once before the loop instead of per-container.

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
… visible feedback

Boot log timing showed multi-second gaps where the user had no output.
Add STATUS/STATUS_OK around the HOTP token presence check in
gate_reseal_with_integrity_report (~3s gap), before wait_for_gpg_card in
report_integrity_measurements (~1s gap), and before the TPM rollback
counter read in reseal_tpm_disk_decryption_key.

Standardize on "boot hashes" in update_checksums and related messages,
consistent with kexec-select-boot's existing wording.
@tlaurion tlaurion force-pushed the tpm_reseal_ux-integrity_report-detect_disk_and_tpm_swap branch from 8e56ac2 to 43e6054 Compare March 25, 2026 02:43

4 participants