Skip to content

feat(auth): prompt for recovery on startup#77

Open
ndycode wants to merge 28 commits intogit-plan/02-backup-restore-managerfrom
git-plan/03-startup-recovery-prompt
Open

feat(auth): prompt for recovery on startup#77
ndycode wants to merge 28 commits intogit-plan/02-backup-restore-managerfrom
git-plan/03-startup-recovery-prompt

Conversation

@ndycode
Copy link
Owner

@ndycode ndycode commented Mar 12, 2026

Summary

  • prompt for named-backup recovery before OAuth when interactive codex auth login starts with zero saved accounts and recoverable backups exist
  • keep the startup recovery path available after backing out of the restore browser, cancelling confirmation, or hitting restore-time errors, while still suppressing the prompt after same-session fresh and reset actions
  • preserve the full pre-scanned assessment set in the startup-launched restore manager, harden backup discovery against transient Windows filesystem locks, and merge the latest base-branch follow-up so the stacked PR is conflict-free again

Validation

  • npm run typecheck
  • npm run lint
  • npm run build
  • npx vitest run test/storage.test.ts
  • cmd /c "set NODE_OPTIONS=--max-old-space-size=16384&& npx vitest run test/codex-manager-cli.test.ts -t startup"
  • npx vitest run test/codex-manager-cli.test.ts -t "returns to the login menu when backup reassessment fails before confirmation"

Notes

  • This PR now includes the latest base-branch follow-up from git-plan/02-backup-restore-manager to clear the merge conflict that GitHub was still reporting.
  • The full test/codex-manager-cli.test.ts suite still has the known Vitest heap ceiling in this worktree, so validation stays focused on the startup-recovery and restore-manager slices touched here.

note: greptile review for oc-chatgpt-multi-auth. cite files like lib/foo.ts:123. confirm regression tests + windows concurrency/token redaction coverage.

Greptile Summary

this PR wires a named-backup recovery prompt into runAuthLogin, passes the pre-scanned allAssessments set into the restore manager to avoid a double-read, adds getActionableNamedBackupRestores to storage with per-entry EBUSY guards and correct totalBackups tracking, and hardens listNamedBackupsWithoutLoading with retryTransientFilesystemOperation. the core architecture and windows concurrency mitigations are sound.

key concerns:

  • unbounded re-prompt loop (unaddressed from prior review): when runBackupRestoreManager returns "dismissed", the code resets recoveryPromptAttempted = false rather than setting suppressRecoveryPrompt = true. a user who accepts the outer confirm, backs out of the restore browser, and accepts again will loop indefinitely — no test covers two consecutive back-outs
  • test mock ordering in "display settings load fails before confirm": loadDashboardDisplaySettingsMock is chained mockResolvedValueOnce(success).mockImplementationOnce(throw ENOENT); with an empty account store the outer menu is skipped, so the only loadDashboardDisplaySettings call in the path is inside the recovery prompt — it hits the success mock, confirm is reached, and the throw is never triggered; the test may be silently non-functional
  • duplicate error-code extraction helper: getBackupRestoreAssessmentErrorLabel (storage.ts) and getRedactedFilesystemErrorLabel (codex-manager.ts) are identical — should live in a shared utility to keep the redaction strategy in sync

Confidence Score: 2/5

  • not safe to merge until the unbounded re-prompt loop is resolved and the "display settings load fails" test is verified to actually trigger the intended throw
  • storage.ts windows hardening and the allAssessments passthrough are solid, but the unbounded loop bug was flagged in a prior review round and remains unaddressed — a user who accepts the outer confirm and backs out repeatedly is stuck in an infinite login loop with no escape; additionally one of the new tests may silently not exercise the path it is named for, reducing confidence in the regression coverage for the recovery prompt failure path
  • lib/codex-manager.ts (re-prompt loop), test/codex-manager-cli.test.ts (mock ordering in display-settings-fail test)

Important Files Changed

Filename Overview
lib/codex-manager.ts adds startup recovery prompt, resolveStartupRecoveryAction, loadBackupRestoreManagerAssessments, and getRedactedFilesystemErrorLabel; correctly passes allAssessments to the manager and handles scan/prompt failures — but the re-prompt-after-dismiss path still resets recoveryPromptAttempted = false, creating an unbounded loop when the user repeatedly accepts the outer confirm and backs out of the inner manager
lib/storage.ts adds getActionableNamedBackupRestores, scanNamedBackups, listNamedBackupsWithoutLoading, and assessNamedBackupRestoreCandidate; wraps per-entry scan steps in try/catch, tracks totalBackups from raw readdir counts before per-entry failures, and wraps listNamedBackupsWithoutLoading directory reads in retryTransientFilesystemOperation — the duplicate error-extraction helper is a minor style concern but all critical paths appear safe
lib/cli.ts adds isInteractiveLoginMenuAvailable() combining !isNonInteractiveMode() and isTTY(), and uses it in promptLoginMode; straightforward and consistent with existing TTY guards
test/codex-manager-cli.test.ts adds 12 new startup-recovery CLI tests and promotes OAuth mocks to top-level constants; the suite-level afterEach(() => vi.restoreAllMocks()) is present (addressed from prior review), but the "display settings load fails before confirm" test has a suspected mock-ordering bug where a leading mockResolvedValueOnce(success) may consume the only recovery-prompt loadDashboardDisplaySettings call, silently preventing the intended ENOENT throw from being reached
test/recovery.test.ts adds getActionableNamedBackupRestores override tests and a removeWithRetry helper for Windows-safe temp dir cleanup; reformats existing tests to tabs, adds afterEach(() => vi.restoreAllMocks()) suite-level guard — changes look solid

Sequence Diagram

sequenceDiagram
    participant U as User
    participant C as runAuthLogin
    participant S as storage (getActionableNamedBackupRestores)
    participant M as runBackupRestoreManager
    participant O as OAuth flow

    C->>C: loadAccounts() → existingCount=0
    C->>C: canPromptForRecovery? check suppressRecoveryPrompt + recoveryPromptAttempted + TTY
    alt pendingRecoveryState is null
        C->>S: getActionableNamedBackupRestores()
        S-->>C: {assessments, allAssessments, totalBackups}
        alt assessments empty + scan not failed
            C->>C: allowEmptyStorageMenu=true → continue loginFlow (open empty menu)
        else scan failed
            C-->>O: fall through to OAuth
        end
    else reuse pendingRecoveryState
        C->>C: recoveryState = pendingRecoveryState (no re-scan)
    end
    alt assessments.length > 0
        C->>C: loadDashboardDisplaySettings()
        C->>U: confirm("Found N recoverable backups … Restore now?")
        alt user confirms
            C->>M: runBackupRestoreManager(displaySettings, allAssessments)
            M->>U: show backup list, user selects
            alt user restores successfully
                M-->>C: "restored"
                C->>C: accounts populated → outer menu opens
            else user backs out or restore fails
                M-->>C: "dismissed"
                C->>C: pendingRecoveryState=recoveryState, recoveryPromptAttempted=false
                Note over C: ⚠️ unbounded loop: confirm fires again on next iteration
                C->>C: continue loginFlow (re-prompt)
            end
        else user declines
            C-->>O: fall through to OAuth
        end
    else no actionable backups
        C-->>O: fall through to OAuth
    end
    O->>U: OAuth login flow
Loading

Comments Outside Diff (2)

  1. lib/codex-manager.ts, line 4091-4098 (link)

    Unbounded re-prompt loop still present

    when runBackupRestoreManager returns "dismissed", recoveryPromptAttempted is reset to false and pendingRecoveryState is preserved. on the next iteration of loginFlow, existingCount is still 0, suppressRecoveryPrompt is still false, and recoveryPromptAttempted is now false — so canPromptForRecovery is true again and the outer confirm fires once more. a user who repeatedly accepts the outer prompt but backs out of the inner manager will loop indefinitely, with no escape short of answering "no" to the confirm.

    the tests only exercise one back-out (second confirm returns false). a second consecutive back-out reproduces the loop. consider flipping suppressRecoveryPrompt = true instead of recoveryPromptAttempted = false after a dismissal to permanently suppress the prompt within the session:

    if (restoreResult !== "restored") {
        pendingRecoveryState = recoveryState;
        // suppress rather than reset so repeated back-out does not loop
        suppressRecoveryPrompt = true;
    }
    continue;

    this also removes the need for pendingRecoveryState entirely (the assessment set would no longer be replayed on a suppressed re-prompt).

  2. test/codex-manager-cli.test.ts, line 1840-1858 (link)

    Extra mock consumes the intended throw — test may not cover the stated scenario

    loadDashboardDisplaySettingsMock is chained with .mockResolvedValueOnce(settings).mockImplementationOnce(throw ENOENT). with an empty account store, the outer menu loop is never entered, so the only call to loadDashboardDisplaySettings before confirm is the one inside the recovery prompt. that call consumes the first mock (success), getNamedBackupsDirectoryPath() is called, promptWasShown = true, and confirm() is reached — contradicting the assertions expect(getNamedBackupsDirectoryPathMock).not.toHaveBeenCalled() and expect(confirmMock).not.toHaveBeenCalled().

    if there is no preliminary call to loadDashboardDisplaySettings before the recovery prompt in this path, the throw mock is never triggered and the test silently passes without exercising the intended failure. the fix is to drop the leading mockResolvedValueOnce so the first (and only) relevant call immediately throws:

    loadDashboardDisplaySettingsMock.mockRejectedValueOnce(
        makeErrnoError(
            "no such file or directory, open '/mock/dashboard-settings.json'",
            "ENOENT",
        ),
    );

    if there is a legitimate preliminary call from an outer init path, add a comment explaining it so future readers understand the two-call sequence.

Fix All in Codex

Prompt To Fix All With AI
This is a comment left during a code review.
Path: lib/codex-manager.ts
Line: 4091-4098

Comment:
**Unbounded re-prompt loop still present**

when `runBackupRestoreManager` returns `"dismissed"`, `recoveryPromptAttempted` is reset to `false` and `pendingRecoveryState` is preserved. on the next iteration of `loginFlow`, `existingCount` is still 0, `suppressRecoveryPrompt` is still `false`, and `recoveryPromptAttempted` is now `false` — so `canPromptForRecovery` is `true` again and the outer confirm fires once more. a user who repeatedly accepts the outer prompt but backs out of the inner manager will loop indefinitely, with no escape short of answering "no" to the confirm.

the tests only exercise **one** back-out (second confirm returns false). a second consecutive back-out reproduces the loop. consider flipping `suppressRecoveryPrompt = true` instead of `recoveryPromptAttempted = false` after a dismissal to permanently suppress the prompt within the session:

```typescript
if (restoreResult !== "restored") {
    pendingRecoveryState = recoveryState;
    // suppress rather than reset so repeated back-out does not loop
    suppressRecoveryPrompt = true;
}
continue;
```

this also removes the need for `pendingRecoveryState` entirely (the assessment set would no longer be replayed on a suppressed re-prompt).

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: test/codex-manager-cli.test.ts
Line: 1840-1858

Comment:
**Extra mock consumes the intended throw — test may not cover the stated scenario**

`loadDashboardDisplaySettingsMock` is chained with `.mockResolvedValueOnce(settings).mockImplementationOnce(throw ENOENT)`. with an empty account store, the outer menu loop is never entered, so the **only** call to `loadDashboardDisplaySettings` before `confirm` is the one inside the recovery prompt. that call consumes the first mock (success), `getNamedBackupsDirectoryPath()` is called, `promptWasShown = true`, and `confirm()` is reached — contradicting the assertions `expect(getNamedBackupsDirectoryPathMock).not.toHaveBeenCalled()` and `expect(confirmMock).not.toHaveBeenCalled()`.

if there is no preliminary call to `loadDashboardDisplaySettings` before the recovery prompt in this path, the throw mock is never triggered and the test silently passes without exercising the intended failure. the fix is to drop the leading `mockResolvedValueOnce` so the first (and only) relevant call immediately throws:

```typescript
loadDashboardDisplaySettingsMock.mockRejectedValueOnce(
    makeErrnoError(
        "no such file or directory, open '/mock/dashboard-settings.json'",
        "ENOENT",
    ),
);
```

if there is a legitimate preliminary call from an outer init path, add a comment explaining it so future readers understand the two-call sequence.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: lib/storage.ts
Line: 491-500

Comment:
**Duplicated error-code extraction helper**

`getBackupRestoreAssessmentErrorLabel` here and `getRedactedFilesystemErrorLabel` in `lib/codex-manager.ts` have identical logic (check `ErrnoException.code`, fall back to `error.name`, then `"UNKNOWN"`). any future change to one copy must be mirrored in the other. consider extracting into a shared utility (e.g., `lib/errors.ts`) and importing it in both files to keep the redaction strategy consistent.

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: 192e4e3

Context used:

  • Rule used - What: Every code change must explain how it defend... (source)

ndycode added 2 commits March 12, 2026 15:08
…artup-recovery-prompt-devtools

# Conflicts:
#	test/codex-manager-cli.test.ts
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 12, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

adds an interactive startup recovery prompt before oauth when no accounts exist and recoverable named backups are present; centralizes interactivity checks via isInteractiveLoginMenuAvailable() and suppresses prompts after same-session destructive actions. see lib/codex-manager.ts:line, lib/cli.ts:line, lib/storage.ts:line.

Changes

Cohort / File(s) Summary
documentation
docs/getting-started.md, docs/troubleshooting.md, docs/upgrade.md
added user-facing notes describing the startup recovery prompt, when it appears, and when it is suppressed.
interactivity helper
lib/cli.ts
added isInteractiveLoginMenuAvailable() and replaced direct tty checks with the helper in login gating (lib/cli.ts:line).
login & restore flow
lib/codex-manager.ts
wired getActionableNamedBackupRestores() and isInteractiveLoginMenuAvailable() into runAuthLogin; added suppressRecoveryPrompt and recoveryPromptAttempted; changed runBackupRestoreManager to return `"restored"
backup assessment & scanning
lib/storage.ts
added getActionableNamedBackupRestores(), scanNamedBackups(), assessNamedBackupRestoreCandidate(), and types (LoadedBackupCandidate, NamedBackupScanEntry, ActionableNamedBackupRecoveries); refactored listing/assessment plumbing and load paths (lib/storage.ts:line).
cli & recovery tests
test/codex-manager-cli.test.ts, test/recovery.test.ts
large test additions and rewrites: new mocks for oauth and restore flows, new tests for startup-restore prompts, suppression paths, error cases, and extensive test scaffolding changes (test/codex-manager-cli.test.ts:line, test/recovery.test.ts:line).
test host-prompt mocks
test/fetch-helpers.test.ts, test/public-api-contract.test.ts, test/request-transformer.test.ts
added test-time mocks for host codex prompt to stabilize tests.

Sequence Diagram(s)

sequenceDiagram
    participant user as user
    participant login as login flow\n(lib/codex-manager.ts)
    participant storage as backup storage\n(lib/storage.ts)
    participant ui as restore manager\n(ui)
    participant oauth as oauth server

    user->>login: start login (no accounts)
    login->>login: check isInteractiveLoginMenuAvailable()\n(lib/cli.ts)
    alt interactive
        login->>storage: getActionableNamedBackupRestores()
        storage->>storage: scan & assess named backups\n(lib/storage.ts:line)
        storage-->>login: actionable backups found
        login->>user: prompt "restore before oauth?"
        alt user confirms
            user-->>login: confirm
            login->>ui: launch restore manager
            ui->>ui: perform restore
            ui-->>login: restored
        else user skips
            user-->>login: skip
        end
    end
    login->>oauth: proceed to oauth flow
    oauth-->>user: oauth complete
Loading
sequenceDiagram
    participant user as user
    participant login as login flow\n(lib/codex-manager.ts)

    user->>login: trigger fresh/deleteAll or reset
    login->>login: set suppressRecoveryPrompt = true\n(lib/codex-manager.ts:line)
    login->>login: perform destructive action
    login->>login: skip recovery prompt due to suppressRecoveryPrompt
    login-->>user: continue to fresh login
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

review notes

  • missing regression tests: add an explicit end-to-end test that exercises the suppressRecoveryPrompt lifecycle across a full login flow to ensure the flag is scoped correctly and does not leak between sequential attempts (lib/codex-manager.ts:line, test/codex-manager-cli.test.ts:line).
  • windows edge cases: verify tty detection on windows CI runners. lib/cli.ts uses process.stdin.isTTY() and may behave differently on windows; add coverage or fallback checks (lib/cli.ts:line).
  • concurrency risks: recoveryPromptAttempted and suppressRecoveryPrompt are tracked without per-flow isolation. concurrent login attempts may suppress prompts incorrectly; add tests or locking around assessment -> restore sequences (lib/codex-manager.ts:line, lib/storage.ts:line).
🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 3.85% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check ⚠️ Warning PR description is comprehensive and well-structured, covering objectives, validation steps, and known issues with greptile review findings. Address the three flagged issues before merge: (1) unbounded re-prompt loop in lib/codex-manager.ts:4091–4098 by setting suppressRecoveryPrompt=true after dismissal; (2) mock ordering bug in test/codex-manager-cli.test.ts:1840–1858 by removing leading mockResolvedValueOnce; (3) deduplicate error-label helper (getBackupRestoreAssessmentErrorLabel vs getRedactedFilesystemErrorLabel) into lib/errors.ts.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed title follows conventional commits format with type feat, scope auth, and lowercase imperative summary under 72 chars; accurately describes the main startup recovery prompt addition.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch git-plan/03-startup-recovery-prompt
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
lib/codex-manager.ts (1)

4308-4334: ⚠️ Potential issue | 🟠 Major

set suppressRecoveryPrompt only after the destructive action succeeds.

lib/codex-manager.ts:4308-4334 flips the flag before deleteSavedAccounts() / resetLocalState() run. if either path fails — for example a windows ebusy / eperm during storage cleanup — the same session will still skip the restore prompt even though the reset never completed.

proposed fix
 if (menuResult.mode === "fresh" && menuResult.deleteAll) {
-	suppressRecoveryPrompt = true;
 	await runActionPanel(
 		DESTRUCTIVE_ACTION_COPY.deleteSavedAccounts.label,
 		DESTRUCTIVE_ACTION_COPY.deleteSavedAccounts.stage,
@@
 		},
 		displaySettings,
 	);
+	suppressRecoveryPrompt = true;
 	continue;
 }
 if (menuResult.mode === "reset") {
-	suppressRecoveryPrompt = true;
 	await runActionPanel(
 		DESTRUCTIVE_ACTION_COPY.resetLocalState.label,
 		DESTRUCTIVE_ACTION_COPY.resetLocalState.stage,
@@
 		},
 		displaySettings,
 	);
+	suppressRecoveryPrompt = true;
 	continue;
 }

as per coding guidelines, lib/**: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/codex-manager.ts` around lines 4308 - 4334, The code sets
suppressRecoveryPrompt before running destructive actions; change the flow so
suppressRecoveryPrompt is set only after the destructive action completes
successfully: in the branches using runActionPanel for
DESTRUCTIVE_ACTION_COPY.deleteSavedAccounts and
DESTRUCTIVE_ACTION_COPY.resetLocalState, remove the pre-action assignment to
suppressRecoveryPrompt, run the async action (await
deleteSavedAccounts()/resetLocalState() within runActionPanel), and only on
successful completion set suppressRecoveryPrompt = true (and then continue);
ensure exceptions from deleteSavedAccounts/resetLocalState are propagated or
logged (do not set the flag on error), update or add vitest coverage for these
code paths to assert the flag is only toggled on success and that
EBUSY/EPERM-like errors leave the flag false, and ensure any logging inside
these flows does not leak tokens/emails.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/getting-started.md`:
- Line 52: Update the docs sentence to reflect the actual gate: state that the
restore prompt appears only when there are zero saved accounts (not “no active
accounts”) and that it is interactive-only and suppressed after in-session
destructive resets (i.e., align wording with the startup prompt logic in the CLI
entry and the CodexManager restore prompt logic). Also add an upgrade-note entry
describing the new startup prompt behavior and mention the new npm scripts, and
ensure README and SECURITY are updated to remain consistent with the current CLI
flags and workflows.

In `@lib/storage.ts`:
- Around line 1327-1339: The current code reopens every backup by calling
assessNamedBackupRestore for each item in Promise.all, doubling I/O and risking
EMFILE/EBUSY; change this to reuse the results from listNamedBackups() (have
listNamedBackups return validated backup objects/metadata and pass those into
assessNamedBackupRestore) or, if reuse is not possible, replace the unbounded
Promise.all over backups.map(...) with a bounded concurrency queue (e.g.,
p-limit or an internal semaphore) when invoking assess(backup.name, {
currentStorage }), and ensure assessNamedBackupRestore and assess accept the
cached backup object to avoid revalidation; update tests in
test/recovery.test.ts to exercise the reused-path and add cases simulating
EBUSY/429 and verify no token/email leaks in logs.

In `@test/codex-manager-cli.test.ts`:
- Around line 495-514: Add coverage for the true non-interactive startup path by
creating (or extending) a test that sets interactive TTY off: call
setInteractiveTTY(false) before importing and running
runCodexMultiAuthCli(["auth","login"]) (instead of setInteractiveTTY(true)),
keep isInteractiveLoginMenuAvailableMock.mockReturnValue(false) and the other
mocks, and then assert exitCode is 0 and that
getActionableNamedBackupRestoresMock and confirmMock were not called and
createAuthorizationFlowMock was invoked once and restoreNamedBackupMock was not
called; reference the existing test flow around runCodexMultiAuthCli,
isInteractiveLoginMenuAvailableMock, getActionableNamedBackupRestoresMock,
confirmMock, createAuthorizationFlowMock, and restoreNamedBackupMock to locate
where to add the tty-off case.
- Around line 427-493: Add a regression test for the startup-recovery decline
path by duplicating the existing "offers startup restore before OAuth when
interactive login starts empty" case but have confirmMock.resolve to false for
the recovery prompt (and selectMock still returning the restore option), then
assert that restoreNamedBackupMock is NOT called and createAuthorizationFlowMock
(and the OAuth flow) IS invoked; locate the test around runCodexMultiAuthCli in
test/codex-manager-cli.test.ts and use the same mocks (loadAccountsMock,
getActionableNamedBackupRestoresMock, listNamedBackupsMock,
assessNamedBackupRestoreMock, selectMock, confirmMock) to simulate decline and
verify fallback to createAuthorizationFlowMock rather than early exit.
- Around line 448-454: The test currently flips loadAccountsMock after three
calls by counting reads, which makes the assertion depend on call order;
instead, make the storage transition driven by restoreNamedBackupMock: add a
flag (e.g., restored = false) that restoreNamedBackupMock sets to true when
invoked, and change loadAccountsMock.mockImplementation to return
structuredClone(restoredStorage) when that flag is true (otherwise
structuredClone(emptyStorage)); reference loadAccountsMock,
restoreNamedBackupMock, emptyStorage and restoredStorage when making this change
so the test validates actual restore behavior rather than call count.

In `@test/recovery.test.ts`:
- Around line 1290-1357: The test block using describe.skip around the
getActionableNamedBackupRestores coverage must be enabled or removed: change
describe.skip to describe so the test runs (or delete this duplicate block if
it's intentionally duplicate), ensure the test calls
storage.getActionableNamedBackupRestores and retains the vi.fn assess mock and
assertions (expect(result.totalBackups)...,
expect(result.assessments.map(...))..., expect(assess).toHaveBeenCalledTimes(2))
so it becomes a real deterministic Vitest test exercising
getActionableNamedBackupRestores with the mockBackups and assess
implementations.

---

Outside diff comments:
In `@lib/codex-manager.ts`:
- Around line 4308-4334: The code sets suppressRecoveryPrompt before running
destructive actions; change the flow so suppressRecoveryPrompt is set only after
the destructive action completes successfully: in the branches using
runActionPanel for DESTRUCTIVE_ACTION_COPY.deleteSavedAccounts and
DESTRUCTIVE_ACTION_COPY.resetLocalState, remove the pre-action assignment to
suppressRecoveryPrompt, run the async action (await
deleteSavedAccounts()/resetLocalState() within runActionPanel), and only on
successful completion set suppressRecoveryPrompt = true (and then continue);
ensure exceptions from deleteSavedAccounts/resetLocalState are propagated or
logged (do not set the flag on error), update or add vitest coverage for these
code paths to assert the flag is only toggled on success and that
EBUSY/EPERM-like errors leave the flag false, and ensure any logging inside
these flows does not leak tokens/emails.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f95895b7-9f08-4e7b-ba36-8baa8d962b68

📥 Commits

Reviewing files that changed from the base of the PR and between c10e66b and d1a763a.

📒 Files selected for processing (7)
  • docs/getting-started.md
  • docs/troubleshooting.md
  • lib/cli.ts
  • lib/codex-manager.ts
  • lib/storage.ts
  • test/codex-manager-cli.test.ts
  • test/recovery.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: cubic · AI code reviewer
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (3)
lib/**

⚙️ CodeRabbit configuration file

focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.

Files:

  • lib/cli.ts
  • lib/codex-manager.ts
  • lib/storage.ts
docs/**

⚙️ CodeRabbit configuration file

keep README, SECURITY, and docs consistent with actual CLI flags and workflows. whenever behavior changes, require updated upgrade notes and mention new npm scripts.

Files:

  • docs/getting-started.md
  • docs/troubleshooting.md
test/**

⚙️ CodeRabbit configuration file

tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

Files:

  • test/recovery.test.ts
  • test/codex-manager-cli.test.ts

Copy link

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4 issues found across 7 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="test/recovery.test.ts">

<violation number="1" location="test/recovery.test.ts:1290">
P3: Avoid committing this suite as `describe.skip`; it disables coverage for backup-restore behavior and can hide regressions.</violation>
</file>

<file name="lib/storage.ts">

<violation number="1" location="lib/storage.ts:1337">
P2: Avoid failing the entire backup scan when one backup assessment throws. A single unreadable backup should be skipped so other recoverable backups can still be offered.</violation>

<violation number="2" location="lib/storage.ts:1338">
P2: Passing `currentStorage: null` into `assessNamedBackupRestore` causes it to reload accounts for every backup. Normalize `null` before calling `assess` so assessments share one in-memory storage value.</violation>
</file>

<file name="lib/codex-manager.ts">

<violation number="1" location="lib/codex-manager.ts:4376">
P1: Wrap startup recovery scan in a try/catch so backup-read failures degrade to the normal login path instead of aborting `auth login`.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/codex-manager.ts`:
- Around line 4387-4389: The startup recovery scan catch block currently
swallows all errors and quietly sets recoveryState, which hides Windows
EBUSY/EPERM failures; modify the catch to capture the error (e.g., catch (err))
and emit a warning or error log including error.message/stack via the existing
logger (mentioning the startup recovery scan and recoveryState fallback), then
keep the fallback assignment; update/add vitest cases to simulate Windows
EBUSY/EPERM and HTTP 429 to ensure the new logging and queue retry behavior
handles those scenarios and update any affected queue logic to retry/backoff on
EBUSY/429 as required.
- Around line 4111-4115: The loop that awaits assessNamedBackupRestore for each
backup can be aborted by a single thrown error; change the logic in the restore
manager where backups are iterated (the for (const backup of backups) block
calling assessNamedBackupRestore) to wrap each await in a per-item try/catch so
a failing backup (e.g., EBUSY/EPERM or malformed data) is caught, logged, and
the loop continues; collect failed backup names into a failures array or map and
surface them after the loop, and add/adjust vitest cases to assert that a single
assessNamedBackupRestore rejection does not stop processing of remaining backups
and that the new flow retries or queues transient errors (EBUSY/429)
appropriately (update any queues used for restore attempts to handle EBUSY/429
backoff/retry).

In `@test/codex-manager-cli.test.ts`:
- Around line 744-771: Add a regression test in test/codex-manager-cli.test.ts
that mirrors the existing "falls back to OAuth when startup recovery scan
throws" case but makes getActionableNamedBackupRestoresMock.reject with a
windows-style errno error (use makeErrnoError("resource busy", "EBUSY") and also
include a variant for "EPERM" if desired), keep setInteractiveTTY(true),
promptLoginModeMock.mockResolvedValueOnce({ mode: "cancel" }) and await
configureSuccessfulOAuthFlow(now), then import and call
runCodexMultiAuthCli(["auth","login"]) and assert exitCode === 0,
getActionableNamedBackupRestoresMock called once, confirmMock and
restoreNamedBackupMock not called, and createAuthorizationFlowMock called once;
this ensures assessNamedBackupRestore concurrency errors fall back to OAuth
without crashing.

In `@test/recovery.test.ts`:
- Around line 139-302: Add a deterministic vitest that exercises
getActionableNamedBackupRestores without an injected assess (to hit the default
scan/assess path) and simulates a Windows-style EBUSY failure for one backup
while leaving another valid; specifically, import the storage module, stub the
underlying file-read/load used by the default assessment (e.g., spyOn
fs.promises.readFile or the internal loader function used by
getActionableNamedBackupRestores) to throw an Error with code "EBUSY" for the
failing backup path and return valid data for the other backup, call
getActionableNamedBackupRestores({backups: [...]}) and assert totalBackups
equals the mock length and that the returned assessments include the valid
backup only, verify the loader was called for both backups, and restore mocks at
test end to keep the test deterministic.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 4eaf19de-2526-4a5b-81e1-1d98d882fb59

📥 Commits

Reviewing files that changed from the base of the PR and between d1a763a and d756d0a.

📒 Files selected for processing (9)
  • docs/getting-started.md
  • docs/upgrade.md
  • lib/codex-manager.ts
  • lib/storage.ts
  • test/codex-manager-cli.test.ts
  • test/fetch-helpers.test.ts
  • test/public-api-contract.test.ts
  • test/recovery.test.ts
  • test/request-transformer.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: cubic · AI code reviewer
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (3)
docs/**

⚙️ CodeRabbit configuration file

keep README, SECURITY, and docs consistent with actual CLI flags and workflows. whenever behavior changes, require updated upgrade notes and mention new npm scripts.

Files:

  • docs/upgrade.md
  • docs/getting-started.md
test/**

⚙️ CodeRabbit configuration file

tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

Files:

  • test/request-transformer.test.ts
  • test/fetch-helpers.test.ts
  • test/codex-manager-cli.test.ts
  • test/public-api-contract.test.ts
  • test/recovery.test.ts
lib/**

⚙️ CodeRabbit configuration file

focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.

Files:

  • lib/codex-manager.ts
  • lib/storage.ts
🔇 Additional comments (14)
docs/getting-started.md (1)

52-53: docs wording now matches runtime behavior.

this is aligned with the startup gates in lib/codex-manager.ts:4373-4378 and lib/cli.ts:26-29.

docs/upgrade.md (1)

65-70: good upgrade-note coverage for the new login flow.

this section is consistent with the recovery gating implemented in lib/codex-manager.ts:4373-4414.

lib/storage.ts (2)

1295-1313: nice change: backup scan now carries loaded candidates through.

lib/storage.ts:1295-1313 reduces redundant reopen/parse work and keeps the scan path safer for windows file-lock pressure.


1365-1383: good resilience in actionable filtering path.

lib/storage.ts:1365-1383 keeps processing when a single backup assessment fails, which is the right behavior for startup recovery.

test/request-transformer.test.ts (1)

1-5: lgtm on the host-codex-prompt mock.

the mock is hoisted correctly before imports and returns a deterministic cached prompt. this matches the pattern in test/fetch-helpers.test.ts:2-5 and test/public-api-contract.test.ts:2-5.

test/fetch-helpers.test.ts (1)

2-5: lgtm on the host-codex-prompt mock.

deterministic mock setup, consistent with other test files.

test/public-api-contract.test.ts (1)

2-5: lgtm on the host-codex-prompt mock.

consistent with the other test files.

test/codex-manager-cli.test.ts (7)

530-580: good addition of the decline-recovery regression.

this test at test/codex-manager-cli.test.ts:530 correctly verifies that when the user declines startup recovery (confirmMock.mockResolvedValue(false)), the flow continues into oauth (createAuthorizationFlowMock is called) and restore is skipped. this addresses the previous review feedback about missing decline path coverage.


718-742: good coverage for the true non-interactive path.

this test at test/codex-manager-cli.test.ts:718 sets setInteractiveTTY(false) and isInteractiveLoginMenuAvailableMock.mockReturnValue(false), then asserts that getActionableNamedBackupRestoresMock and confirmMock are never called while oauth proceeds. this addresses the previous review feedback about needing a tty-off case.


484-518: restore side effect now drives storage state correctly.

the test at test/codex-manager-cli.test.ts:515-518 wires restoreNamedBackupMock.mockImplementation to flip storageState to restoredStorage, so the test validates actual restore behavior rather than call ordering. this addresses the previous review feedback about making the test a real regression check.


305-308: default mock for getActionableNamedBackupRestores looks correct.

setting assessments: [] and totalBackups: 0 as the default ensures tests that don't care about recovery prompts won't be affected.


236-270: configureSuccessfulOAuthFlow helper is clean.

centralizes the oauth mock setup to avoid duplication across tests. the mock values are deterministic and the helper correctly awaits the dynamic imports.


773-820: parameterized suppression test uses loadCount counter - acceptable but slightly fragile.

the test at test/codex-manager-cli.test.ts:801-807 uses loadCount <= 2 to flip from populated to empty storage. this is a call-ordering assertion rather than action-driven state, but it's acceptable here since the test is verifying that getActionableNamedBackupRestoresMock is never called (i.e., the suppression guard fires before the scan). the storage flip simulates the post-wipe state correctly.


463-528: startup restore happy path test is solid.

the test at test/codex-manager-cli.test.ts:463 validates the full flow: actionable backups exist → user confirms → select restore → restore executes → storage updated → oauth skipped. good coverage for the main pr objective.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (1)
test/recovery.test.ts (1)

375-444: ⚠️ Potential issue | 🟠 Major

this ebusy case still bypasses the startup fast path.

test/recovery.test.ts, Lines 431-434 pass backups, so this test drives lib/storage.ts, Lines 1393-1405 instead of the no-args fast path at lib/storage.ts, Lines 1373-1390 that lib/codex-manager.ts, Lines 4401-4403 uses on startup. we still need a storage-backed ebusy regression that faults during the scan path itself and proves one locked backup does not hide another actionable backup.

As per coding guidelines, test/**: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/recovery.test.ts` around lines 375 - 444, The test currently passes the
backups array into getActionableNamedBackupRestores which exercises the
non-startup code path; change the test to exercise the storage-backed scan path
by calling storage.getActionableNamedBackupRestores() with no args (so it calls
storage.listNamedBackups internally) and keep the fs.readFile spy that throws
NodeJS.ErrnoException with code "EBUSY" for the locked backup path; this ensures
the EBUSY fault occurs during the scan path in
storage.getActionableNamedBackupRestores and verifies one locked backup does not
hide the other actionable backup (refer to getActionableNamedBackupRestores and
listNamedBackups in lib/storage.ts).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/codex-manager.ts`:
- Around line 4390-4429: The pre-OAuth recovery prompt can throw UI errors and
must not abort login; wrap the confirm(...) and subsequent
runBackupRestoreManager(...) handling in a local try/catch so any thrown errors
(from confirm, runBackupRestoreManager, or UI libs) are caught, logged/warned,
and the code continues down the OAuth path instead of failing; locate the block
guarded by isInteractiveLoginMenuAvailable() and
getActionableNamedBackupRestores(...) and modify the branch that inspects
recoveryState.assessments to catch UI/CLI errors and treat them as a benign
fallback to OAuth, and update test/recovery.test.ts to add/adjust a case
exercising the forced-interactive/non-tty CLI error path.
- Around line 4193-4197: The call to restoreNamedBackup(assessment.backup.name)
must not be allowed to throw and abort the backup browser/startup flow; wrap the
await in a try/catch inside the surrounding function (the block that currently
logs and returns "restored"), catch filesystem/concurrency errors (EBUSY/EPERM)
and transient rate errors (429), implement a small retry/backoff or queue retry
hook, log the error with assessment.backup.name and error details via
console/processLogger, and return a non-fatal result (e.g., "restored_failed" or
"skipped") so the caller does not teardown the browser or login flow;
update/extend test/recovery.test.ts to cover the restore failure path and
EBUSY/EPERM/429 retry behavior.

In `@test/codex-manager-cli.test.ts`:
- Around line 1702-1707: The test currently only verifies oauth was skipped
(checks like getActionableNamedBackupRestoresMock and confirmMock) but doesn't
assert the actual restore side-effect; update the test to assert the restore
executed by adding an expectation that the restore handler was invoked (e.g.,
expect(restoreBackupMock or performRestoreMock).toHaveBeenCalled() or expect(the
specific restore function used by the cli command) and/or assert the observable
effect (e.g., mocked filesystem/write calls or returned restore status changed)
after invoking the command; locate the assertions near the existing checks
referencing getActionableNamedBackupRestoresMock, confirmMock, selectMock, and
createAuthorizationFlowMock and add the new expectation(s) there to guarantee
the restore ran.

In `@test/recovery.test.ts`:
- Around line 1-4: Add a top-level cleanup to restore vitest spies after each
test so any vi.spyOn(...) created in recovery.test.ts (the spies referenced
around the failing cases) cannot leak into other tests: add or update an
afterEach hook (near the test file's top imports) to call vi.restoreAllMocks()
(or restore specific spies if they are named) and optionally
vi.resetAllMocks()/vi.resetModules() as needed to fully reset mocked fs/storage
behavior between tests; ensure the new afterEach runs for every test path, not
only the happy path.

---

Duplicate comments:
In `@test/recovery.test.ts`:
- Around line 375-444: The test currently passes the backups array into
getActionableNamedBackupRestores which exercises the non-startup code path;
change the test to exercise the storage-backed scan path by calling
storage.getActionableNamedBackupRestores() with no args (so it calls
storage.listNamedBackups internally) and keep the fs.readFile spy that throws
NodeJS.ErrnoException with code "EBUSY" for the locked backup path; this ensures
the EBUSY fault occurs during the scan path in
storage.getActionableNamedBackupRestores and verifies one locked backup does not
hide the other actionable backup (refer to getActionableNamedBackupRestores and
listNamedBackups in lib/storage.ts).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 0bf27991-cefa-4bef-bf1a-400915419c3f

📥 Commits

Reviewing files that changed from the base of the PR and between d756d0a and 4832de2.

📒 Files selected for processing (4)
  • lib/codex-manager.ts
  • lib/storage.ts
  • test/codex-manager-cli.test.ts
  • test/recovery.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: cubic · AI code reviewer
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (2)
lib/**

⚙️ CodeRabbit configuration file

focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.

Files:

  • lib/codex-manager.ts
  • lib/storage.ts
test/**

⚙️ CodeRabbit configuration file

tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

Files:

  • test/recovery.test.ts
  • test/codex-manager-cli.test.ts

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
test/recovery.test.ts (1)

445-514: ⚠️ Potential issue | 🟡 Minor

restore the fs.readfile spy in this test.

test/recovery.test.ts:489-514 creates readFileSpy and never restores it. this can leak mocked fs behavior into later cases and make file order matter.

proposed fix
 		const originalReadFile = fs.readFile.bind(fs);
 		const readFileSpy = vi.spyOn(fs, "readFile").mockImplementation(
 			(async (...args: Parameters<typeof fs.readFile>) => {
 				const [path] = args;
 				if (path === lockedBackup?.path) {
 					const error = new Error("resource busy") as NodeJS.ErrnoException;
 					error.code = "EBUSY";
 					throw error;
 				}
 				return originalReadFile(...args);
 			}) as typeof fs.readFile,
 		);
-
-		const result = await storage.getActionableNamedBackupRestores({
-			backups,
-			currentStorage: emptyStorage,
-		});
-
-		expect(result.totalBackups).toBe(2);
-		expect(result.assessments.map((item) => item.backup.name)).toEqual([
-			"valid-backup",
-		]);
-		expect(readFileSpy).toHaveBeenCalledTimes(2);
-		expect(readFileSpy.mock.calls.map(([path]) => path)).toEqual(
-			expect.arrayContaining([lockedBackup?.path, validBackup?.path]),
-		);
+		try {
+			const result = await storage.getActionableNamedBackupRestores({
+				backups,
+				currentStorage: emptyStorage,
+			});
+
+			expect(result.totalBackups).toBe(2);
+			expect(result.assessments.map((item) => item.backup.name)).toEqual([
+				"valid-backup",
+			]);
+			expect(readFileSpy).toHaveBeenCalledTimes(2);
+			expect(readFileSpy.mock.calls.map(([path]) => path)).toEqual(
+				expect.arrayContaining([lockedBackup?.path, validBackup?.path]),
+			);
+		} finally {
+			readFileSpy.mockRestore();
+		}
 	});

As per coding guidelines, test/**: tests must stay deterministic and use vitest.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/recovery.test.ts` around lines 445 - 514, The test sets a spy
readFileSpy via vi.spyOn(fs, "readFile") but never restores it, leaking the mock
into other tests; update the "keeps actionable backups when default assessment
hits EBUSY" test (or the surrounding setup) to restore the spy after use by
calling readFileSpy.mockRestore() (or using a try/finally around the async test
body to ensure restoration) or move restoration to an afterEach that calls
readFileSpy.mockRestore()/vi.restoreAllMocks(); reference readFileSpy and
fs.readFile in the change so the spy is always cleaned up.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@test/recovery.test.ts`:
- Around line 445-514: The test sets a spy readFileSpy via vi.spyOn(fs,
"readFile") but never restores it, leaking the mock into other tests; update the
"keeps actionable backups when default assessment hits EBUSY" test (or the
surrounding setup) to restore the spy after use by calling
readFileSpy.mockRestore() (or using a try/finally around the async test body to
ensure restoration) or move restoration to an afterEach that calls
readFileSpy.mockRestore()/vi.restoreAllMocks(); reference readFileSpy and
fs.readFile in the change so the spy is always cleaned up.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f2c34c27-c6c1-41a7-a52d-10244dda9115

📥 Commits

Reviewing files that changed from the base of the PR and between 4832de2 and ef42d0d.

📒 Files selected for processing (2)
  • lib/storage.ts
  • test/recovery.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: cubic · AI code reviewer
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (2)
test/**

⚙️ CodeRabbit configuration file

tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

Files:

  • test/recovery.test.ts
lib/**

⚙️ CodeRabbit configuration file

focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.

Files:

  • lib/storage.ts
🔇 Additional comments (3)
test/recovery.test.ts (2)

45-61: good windows cleanup helper.

test/recovery.test.ts:45-61 uses bounded retry with ebusy/eperm/enotempty handling, which is the right shape for deterministic windows teardown behavior.


1320-1362: solid concurrency regression coverage here.

test/recovery.test.ts:1320-1362 explicitly reproduces in-flight duplicate recovery calls and asserts first/second outcomes, which is a good deterministic race guard.

lib/storage.ts (1)

1380-1450: fast/slow path split and per-backup isolation look correct.

lib/storage.ts:1387-1394 avoids prereading backup contents when a custom assessor is injected, and lib/storage.ts:1435-1447 prevents one failing backup from aborting the entire assessment pass. coverage aligns with test/recovery.test.ts:376-443 and test/recovery.test.ts:445-514.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
lib/storage.ts (1)

1935-1946: ⚠️ Potential issue | 🟠 Major

redact filesystem error details before persisting loadError.

in lib/storage.ts:1935-1946, error: String(error) can include local absolute paths. that value is later surfaced via backup metadata hints in lib/codex-manager.ts:4380-4389, so path data can leak in ui output.

proposed fix
+function getRedactedFilesystemErrorLabel(error: unknown): string {
+	const code = (error as NodeJS.ErrnoException).code;
+	if (typeof code === "string" && code.trim().length > 0) return code;
+	if (error instanceof Error && error.name && error.name !== "Error") {
+		return error.name;
+	}
+	return "UNKNOWN";
+}
+
 async function loadBackupCandidate(path: string): Promise<LoadedBackupCandidate> {
 	try {
 		return await retryTransientFilesystemOperation(() =>
 			loadAccountsFromPath(path),
 		);
 	} catch (error) {
 		return {
 			normalized: null,
 			storedVersion: undefined,
 			schemaErrors: [],
-			error: String(error),
+			error: getRedactedFilesystemErrorLabel(error),
 		};
 	}
 }

please also add a regression that asserts backup hint/error text does not expose absolute paths during failed backup reads (e.g., ENOENT/EBUSY) in test/recovery.test.ts.

As per coding guidelines, lib/**: "focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/storage.ts` around lines 1935 - 1946, The loadBackupCandidate function
currently returns error: String(error) which may include absolute filesystem
paths; update loadBackupCandidate to sanitize/redact filesystem-specific details
before persisting loadError by mapping the thrown error from
loadAccountsFromPath into a safe message (e.g., normalize known errno codes like
ENOENT/EBUSY and return a generic, non-path-containing string) and preserve the
original error only in non-persisted debug logs; additionally add a vitest
regression in test/recovery.test.ts that simulates failed backup reads
(ENOENT/EBUSY) and asserts that the backup hint/error text produced by the codex
manager (the place that surfaces loadBackupCandidate errors) does not contain
absolute path fragments or drive letters.
lib/codex-manager.ts (1)

4421-4424: ⚠️ Potential issue | 🟠 Major

guard reassessment with local redacted handling before confirmation.

lib/codex-manager.ts:4421-4424 can still throw on windows races (ebusy/eperm/enoent) after item selection. that escapes this function and is rendered with raw message in lib/codex-manager.ts:3935-3939, which can leak filesystem paths and skip the manager-local redaction path.

proposed fix
-	const latestAssessment = await assessNamedBackupRestore(
-		selection.assessment.backup.name,
-		{ currentStorage: await loadAccounts() },
-	);
+	let latestAssessment: BackupRestoreAssessment;
+	try {
+		latestAssessment = await assessNamedBackupRestore(
+			selection.assessment.backup.name,
+			{ currentStorage: await loadAccounts() },
+		);
+	} catch (error) {
+		const errorLabel = getRedactedFilesystemErrorLabel(error);
+		console.warn(
+			`failed to reassess backup "${selection.assessment.backup.name}" (${errorLabel}).`,
+		);
+		return "dismissed";
+	}

please add a regression in test/recovery.test.ts for “backup removed/locked after menu selection” to ensure the manager returns "dismissed" and emits redacted labels only.

As per coding guidelines, lib/**: "focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/codex-manager.ts` around lines 4421 - 4424, Wrap the reassessment call to
assessNamedBackupRestore (currently invoked with
selection.assessment.backup.name and { currentStorage: await loadAccounts() })
in a try/catch that specifically handles filesystem concurrency errors
(EBUSY/EPERM/ENOENT) and, on those errors, returns the manager-level result
"dismissed" and emits only redacted labels via the existing local redaction path
instead of rethrowing; ensure you call the same redaction helper used for the
user-facing branch rather than letting the raw error bubble to the confirmation
codepath (the code that logs at the confirmation/render site). Also add a vitest
regression in test/recovery.test.ts named "backup removed/locked after menu
selection" that simulates the backup being removed/locked after selection and
asserts the manager returns "dismissed" and only redacted labels are emitted.
♻️ Duplicate comments (2)
test/codex-manager-cli.test.ts (1)

2305-2311: 🧹 Nitpick | 🔵 Trivial

avoid call-count-driven storage transitions in this regression.

in test/codex-manager-cli.test.ts:2305-2311, storage flips from populated to empty based on loadCount. this makes correctness depend on internal read ordering rather than the destructive action side effect.

deterministic refactor
-			let loadCount = 0;
-			loadAccountsMock.mockImplementation(async () => {
-				loadCount += 1;
-				return loadCount <= 2
-					? structuredClone(populatedStorage)
-					: structuredClone(emptyStorage);
-			});
+			let destructiveApplied = false;
+			action.mockImplementationOnce(async (...args) => {
+				destructiveApplied = true;
+				return (await Promise.resolve(action.getMockImplementation()?.(...args))) ?? {
+					accountsCleared: true,
+					flaggedCleared: mode === "reset",
+					quotaCacheCleared: mode === "reset",
+				};
+			});
+			loadAccountsMock.mockImplementation(async () =>
+				structuredClone(destructiveApplied ? emptyStorage : populatedStorage),
+			);

as per coding guidelines, test/**: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/codex-manager-cli.test.ts` around lines 2305 - 2311, The test currently
flips storage based on a call counter (loadCount) inside loadAccountsMock which
makes behavior order-dependent; replace this with deterministic control: either
stub loadAccountsMock using a fixed sequence of returns via
loadAccountsMock.mockResolvedValueOnce(populatedStorage) /
mockResolvedValueOnce(...) for each expected call, or introduce an explicit
storageState flag (e.g., let storageState = 'populated') that loadAccountsMock
returns structuredClone(storageState === 'populated' ? populatedStorage :
emptyStorage) and then toggle storageState in the specific destructive/mock of
the mutating method (e.g., in the mock for the delete/overwrite function) so
transitions occur only when that action is invoked rather than by call count.
Ensure you reference loadAccountsMock, populatedStorage, emptyStorage,
structuredClone and the mutating mock you already have so the test becomes
deterministic.
test/recovery.test.ts (1)

647-656: ⚠️ Potential issue | 🟠 Major

windows ebusy stat matcher targets the wrong backup filename.

in test/recovery.test.ts:647-656, the normalized-path branch checks "/locked-backup.json", but this case creates first-backup and second-backup. if stat path equality does not hit, the windows-style fallback matcher will never trigger.

proposed fix
-				if (
-					path === lockedBackup?.path ||
-					normalizedPath.endsWith("/locked-backup.json")
-				) {
+				if (
+					path === lockedBackup?.path ||
+					normalizedPath.endsWith("/first-backup.json")
+				) {

as per coding guidelines, test/**: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/recovery.test.ts` around lines 647 - 656, The windows-style fallback in
the fs.stat mock is checking for "/locked-backup.json" which doesn't match the
test-created backups; update the normalizedPath.endsWith(...) check inside the
stat mock (the block around originalStat, statSpy, fs.stat and
lockedBackup?.path) to match the actual backup filenames created by the test
(e.g., include endsWith("/first-backup") and endsWith("/second-backup") or the
exact file names like "/first-backup.json" and "/second-backup.json") so the
Windows-path branch triggers when those backup files are stat-ed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/troubleshooting.md`:
- Line 21: Update the troubleshooting line to reflect the real recovery gate:
say the restore prompt on `codex auth login` appears only when a recoverable
backup is detected (not merely “named backups”) as implemented in the
recoverable-backup logic in lib/codex-manager.ts (around the recoverable backups
flow), and that the prompt is suppressed for same-session fresh/reset actions
handled by the reset-suppression logic in lib/codex-manager.ts; also update
docs/** to add an upgrade note and mention any new npm scripts per the docs
guideline.

In `@lib/codex-manager.ts`:
- Around line 4045-4048: Add a focused vitest in test/recovery.test.ts that
simulates a recovery loop where recoveryState.assessments is an empty array and
recoveryScanFailed is false so the code path in codex-manager's loginFlow sets
allowEmptyStorageMenu = true and continues; assert that the next loop iteration
enters the empty-storage menu (e.g., by spying/mocking the function or branch
that renders the storage menu) and assert explicitly that no OAuth flow (oauth
entry point) is invoked on that iteration. When adding the test, reuse the same
setup helpers used by existing recovery tests, ensure mocks/queues used simulate
Windows IO/EBUSY/429 resilience per repo guidelines, and verify logs do not leak
tokens/emails while also referencing recoveryState, allowEmptyStorageMenu,
loginFlow, recoveryScanFailed, and assessments in the test assertions.

In `@test/codex-manager-cli.test.ts`:
- Around line 1735-2327: Add a deterministic test case that covers the branch
where getActionableNamedBackupRestores returns totalBackups > 0 but an empty
assessments array: mock getActionableNamedBackupRestoresMock.mockResolvedValue({
assessments: [], totalBackups: 2 }), setInteractiveTTY(true), have
loadAccountsMock return an empty storage state, promptLoginModeMock.resolve to {
mode: "cancel" }, and then import and run
runCodexMultiAuthCli(["auth","login"]); assert the CLI returns 0, assert
getActionableNamedBackupRestoresMock was called, and assert that no
backup-selection flows ran (confirmMock, selectMock, restoreNamedBackupMock not
called) while createAuthorizationFlowMock was called once to continue into
OAuth; add this test in the same suite (referencing
getActionableNamedBackupRestoresMock, confirmMock, selectMock,
restoreNamedBackupMock, createAuthorizationFlowMock, runCodexMultiAuthCli) to
prevent the regression.

---

Outside diff comments:
In `@lib/codex-manager.ts`:
- Around line 4421-4424: Wrap the reassessment call to assessNamedBackupRestore
(currently invoked with selection.assessment.backup.name and { currentStorage:
await loadAccounts() }) in a try/catch that specifically handles filesystem
concurrency errors (EBUSY/EPERM/ENOENT) and, on those errors, returns the
manager-level result "dismissed" and emits only redacted labels via the existing
local redaction path instead of rethrowing; ensure you call the same redaction
helper used for the user-facing branch rather than letting the raw error bubble
to the confirmation codepath (the code that logs at the confirmation/render
site). Also add a vitest regression in test/recovery.test.ts named "backup
removed/locked after menu selection" that simulates the backup being
removed/locked after selection and asserts the manager returns "dismissed" and
only redacted labels are emitted.

In `@lib/storage.ts`:
- Around line 1935-1946: The loadBackupCandidate function currently returns
error: String(error) which may include absolute filesystem paths; update
loadBackupCandidate to sanitize/redact filesystem-specific details before
persisting loadError by mapping the thrown error from loadAccountsFromPath into
a safe message (e.g., normalize known errno codes like ENOENT/EBUSY and return a
generic, non-path-containing string) and preserve the original error only in
non-persisted debug logs; additionally add a vitest regression in
test/recovery.test.ts that simulates failed backup reads (ENOENT/EBUSY) and
asserts that the backup hint/error text produced by the codex manager (the place
that surfaces loadBackupCandidate errors) does not contain absolute path
fragments or drive letters.

---

Duplicate comments:
In `@test/codex-manager-cli.test.ts`:
- Around line 2305-2311: The test currently flips storage based on a call
counter (loadCount) inside loadAccountsMock which makes behavior
order-dependent; replace this with deterministic control: either stub
loadAccountsMock using a fixed sequence of returns via
loadAccountsMock.mockResolvedValueOnce(populatedStorage) /
mockResolvedValueOnce(...) for each expected call, or introduce an explicit
storageState flag (e.g., let storageState = 'populated') that loadAccountsMock
returns structuredClone(storageState === 'populated' ? populatedStorage :
emptyStorage) and then toggle storageState in the specific destructive/mock of
the mutating method (e.g., in the mock for the delete/overwrite function) so
transitions occur only when that action is invoked rather than by call count.
Ensure you reference loadAccountsMock, populatedStorage, emptyStorage,
structuredClone and the mutating mock you already have so the test becomes
deterministic.

In `@test/recovery.test.ts`:
- Around line 647-656: The windows-style fallback in the fs.stat mock is
checking for "/locked-backup.json" which doesn't match the test-created backups;
update the normalizedPath.endsWith(...) check inside the stat mock (the block
around originalStat, statSpy, fs.stat and lockedBackup?.path) to match the
actual backup filenames created by the test (e.g., include
endsWith("/first-backup") and endsWith("/second-backup") or the exact file names
like "/first-backup.json" and "/second-backup.json") so the Windows-path branch
triggers when those backup files are stat-ed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: e65b778e-a079-4ad4-8ae6-b61cdc5cacbc

📥 Commits

Reviewing files that changed from the base of the PR and between ef42d0d and 8ddfe38.

📒 Files selected for processing (6)
  • docs/troubleshooting.md
  • lib/cli.ts
  • lib/codex-manager.ts
  • lib/storage.ts
  • test/codex-manager-cli.test.ts
  • test/recovery.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (3)
lib/**

⚙️ CodeRabbit configuration file

focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.

Files:

  • lib/cli.ts
  • lib/storage.ts
  • lib/codex-manager.ts
docs/**

⚙️ CodeRabbit configuration file

keep README, SECURITY, and docs consistent with actual CLI flags and workflows. whenever behavior changes, require updated upgrade notes and mention new npm scripts.

Files:

  • docs/troubleshooting.md
test/**

⚙️ CodeRabbit configuration file

tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

Files:

  • test/recovery.test.ts
  • test/codex-manager-cli.test.ts
🔇 Additional comments (1)
lib/cli.ts (1)

29-31: interactive gating refactor looks good.

lib/cli.ts:29-31 and lib/cli.ts:266-268 cleanly centralize login-menu interactivity checks while preserving fallback behavior.

Also applies to: 266-268

@ndycode
Copy link
Owner Author

ndycode commented Mar 14, 2026

Addressed the two remaining startup-recovery issues in 8380de6:

  • the startup-launched restore manager now receives the full pre-scanned assessment set, so its list stays consistent with the prompt counts instead of shrinking to actionable rows only
  • the final assessNamedBackupRestore(...) check before restore is now guarded and redacted, so transient EBUSY/EPERM failures dismiss back to login instead of crashing the session

Validation on this branch:

  • npm run typecheck
  • npx vitest run test/recovery.test.ts
  • cmd /c "set NODE_OPTIONS=--max-old-space-size=16384&& npx vitest run test/codex-manager-cli.test.ts -t startup"
  • npm run lint
  • npm run build

Copy link

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 4 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="lib/codex-manager.ts">

<violation number="1" location="lib/codex-manager.ts:4079">
P3: This null-check is unreachable and introduces dead control flow in the startup recovery path.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant