fix: prevent RN migration from overwriting local channel monitors #495

Closed

jvsena42 wants to merge 3 commits into master from fix/channel-monitor-stale-data
Conversation

@jvsena42 (Member) commented Mar 17, 2026

Description

This PR prevents old RN channel monitors from overwriting newer local ones during orphaned channel recovery, which caused LDK to refuse to start and display stale balances.

Root cause: PR #462 introduced orphaned channel monitor recovery that re-fetches all RN monitors from the remote backup. The ldk-node version at the time (c5698d0, post-rc.32) blindly wrote migration monitors to storage via setChannelDataMigration(), overwriting existing ones — including monitors that had advanced far beyond the old RN version. PR #480 fixed the channel manager overwrite but not the monitor overwrite.

Why it triggered for this user: isChannelRecoveryChecked is a new UserDefaults key introduced by PR #462. On first launch of the new version, it defaults to false, causing fetchOrphanedChannelMonitorsIfNeeded() to run for ALL existing RN-migrated users — not just those with genuinely orphaned channels. Every RN-migrated user with active channels who updated to the version with PR #462 is affected.

Fix: The migration-aware write protection already exists in ldk-node v0.7.0-rc.33 (commit e48948f). apply_channel_data_migration() now compares update_id before writing and skips if the existing monitor is newer or equal. This PR bumps ldk-node to include this fix. No Swift-side filtering is needed.
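The update_id comparison described above can be sketched as follows. This is a minimal illustration with a hypothetical helper name — the real logic lives inside ldk-node's apply_channel_data_migration(), not in a free function:

```rust
// Hypothetical sketch of the migration-aware write check (the actual
// implementation is in ldk-node commit e48948f, v0.7.0-rc.33).
fn should_write_migrated_monitor(
    existing_update_id: Option<u64>,
    migrated_update_id: u64,
) -> bool {
    match existing_update_id {
        // No local monitor for this channel: the migrated copy is all we have.
        None => true,
        // A local monitor exists: only write if the migrated copy is strictly newer.
        Some(existing) => migrated_update_id > existing,
    }
}

fn main() {
    // A local monitor at update_id 120 must not be clobbered by an RN backup at 10.
    assert!(!should_write_migrated_monitor(Some(120), 10));
    // Equal update_ids: skip the redundant write.
    assert!(!should_write_migrated_monitor(Some(10), 10));
    // Genuinely orphaned channel (no local monitor): restore from backup.
    assert!(should_write_migrated_monitor(None, 10));
}
```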

Recovery for already-affected users: Requires a separate ldk-node change — set_accept_stale_channel_monitors(true) to allow the node to start with stale monitors and cooperative-close the affected channels (see PR comments for details).

Balance impact: When the node fails to start, @AppStorage balance values (totalBalanceSats, totalOnchainSats, totalLightningSats) are never updated — the user sees a plausible-looking but completely frozen balance from the last successful session.

Logs from the bug

bitkit_logs_2026-03-17_14-16-30.zip

Linked Issues/Tasks

Related PRs:

Screenshot / Video

N/A - backend logic change

@jvsena42 jvsena42 self-assigned this Mar 17, 2026

@jvsena42 (Member Author) commented Mar 17, 2026

Safest recovery path for already-affected users

The safest approach is to accept stale monitors and cooperative-close the affected channels, rather than wiping data or force-closing with revoked commitments.

Why this is the safest option

| Option | Funds at risk? | Other channels? | Mechanism |
| --- | --- | --- | --- |
| Wipe all channel data | Funds may go unclaimed (no monitor watching for force-close) | All lost | Counterparty eventually force-closes |
| Force-close with stale monitor | ALL funds at risk — broadcasting a revoked commitment (update_id 10 has been superseded 110 times, counterparty holds revocation keys) | Preserved | Penalty transaction |
| Accept stale + cooperative close | No | Preserved | Cooperative close uses agreed-upon final balances |

How it works

  1. Node starts with a stale monitor (warn instead of refusing)
  2. Immediately initiates cooperative close of the stale channel
  3. Both parties agree on correct final balances → closing tx → funds returned safely

The only risk window is between startup and cooperative close: if the counterparty broadcasts a revoked state from between updates 10–120, our stale monitor can't punish them. But the counterparty is Blocktank (our LSP) — no incentive to cheat.

Required changes

rust-lightning (ChannelManager deserialization):

The stale check and DangerousValue error live in channelmanager.rs:~17355. A flag is needed to warn-and-continue instead of refusing to start:

// Instead of returning Err(DecodeError::DangerousValue) when monitor is stale:
if args.accept_stale_channel_monitors {
    log_warn!(logger,
        "Accepting stale monitor for channel {} (monitor update_id {} vs manager update_id {}). \
         Cooperative close recommended.",
        channel_id, monitor_update_id, manager_update_id
    );
    stale_channels.push(channel_id);
    // Continue deserialization instead of returning error
} else {
    return Err(DecodeError::DangerousValue); // existing behavior
}

ldk-node (builder.rs):

Expose the flag via the builder API:

pub fn set_accept_stale_channel_monitors(&mut self, accept: bool) -> &mut Self {
    self.accept_stale_monitors = accept;
    self
}

After the node starts, auto-close stale channels:

for channel_id in stale_channels {
    // counterparty_node_id must be looked up from the stored channel details
    node.close_channel(&channel_id, counterparty_node_id)?;
}

Swift side (after ldk-node change):

On the first startup failure due to stale monitors, retry with the flag enabled:

// In LightningService.setup() or WalletViewModel.start()
// If node fails with stale monitor error, rebuild with:
builder.setAcceptStaleChannelMonitors(true)
// Node starts → stale channels are auto-closed cooperatively

Combine with migration-aware write (prevention)

For defense-in-depth, also add the migration-aware write from the previous comment to setChannelDataMigration() in builder.rs — skip writing monitors that already exist in the KVStore. This prevents the bug at the ldk-node level regardless of what the app passes.
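The "skip monitors that already exist" filter could look like the sketch below. The helper name and the (channel_id, bytes) representation are assumptions for illustration — the real check belongs inside setChannelDataMigration() in ldk-node's builder.rs:

```rust
use std::collections::HashSet;

// Hypothetical helper: drop migrated monitors whose channel already has a
// local monitor in the KVStore, keeping only genuinely orphaned ones.
fn filter_new_monitors(
    migrated: Vec<(String, Vec<u8>)>,
    existing_keys: &HashSet<String>,
) -> Vec<(String, Vec<u8>)> {
    migrated
        .into_iter()
        .filter(|(channel_id, _)| !existing_keys.contains(channel_id))
        .collect()
}

fn main() {
    let existing: HashSet<String> = ["chan_a".to_string()].into_iter().collect();
    let migrated = vec![
        ("chan_a".to_string(), vec![1]), // already present locally: skipped
        ("chan_b".to_string(), vec![2]), // genuinely orphaned: kept
    ];
    let to_write = filter_new_monitors(migrated, &existing);
    assert_eq!(to_write.len(), 1);
    assert_eq!(to_write[0].0, "chan_b");
}
```

A key-existence check alone is coarser than the update_id comparison (it also skips a migrated monitor that happens to be newer), which is why it is framed here as defense-in-depth rather than the primary fix.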

@jvsena42 jvsena42 marked this pull request as ready for review March 17, 2026 17:29
@jvsena42 jvsena42 marked this pull request as draft March 17, 2026 17:29
@jvsena42 (Member Author) commented

Impact on displayed balance

The balance values (totalBalanceSats, totalOnchainSats, totalLightningSats, etc.) are all @AppStorage (UserDefaults-backed) properties in WalletViewModel. They persist across app launches and are never reset to 0 unless explicitly overwritten.

When the node fails to start due to the stale monitor:

  1. updateBalanceState() fails because lightningService.refreshCache() / balanceManager.deriveBalanceState() require a running node
  2. The error is caught and none of the balance properties are updated
  3. The @AppStorage values retain whatever was last written from the previous successful session (before the update to build 182)

Result: The user sees a plausible-looking but completely frozen balance — the last known values from before the app update. It's not zero, so it doesn't immediately look broken. But:

  • No new payments are reflected (incoming or outgoing)
  • Both on-chain and Lightning balances are frozen (even though on-chain could theoretically refresh independently, deriveBalanceState() fails entirely when the node isn't running)
  • Send/receive operations fail because the node isn't started
  • The home screen, savings view, and spending view all display stale data

@jvsena42 (Member Author) commented Mar 18, 2026

Correction: ldk-node migration-aware write already exists

The migration-aware write protection already exists in ldk-node — but was added after the app's pinned version.

Timeline

| Commit | Date | Description |
| --- | --- | --- |
| c5698d0 | Mar 5 | App's pinned ldk-node version (Package.resolved) — does NOT have the protection |
| e48948f | Mar 9 | fix: prevent channel monitor migration from overwriting newer state |
| 2df9b0a | Mar 9 | fix: fail-closed on migration read/deserialize errors |
| cde6037 | Mar 9 | fix: skip redundant monitor write when update_id is equal |
| 4f91c8c | Mar 9 | fix: guard channel manager migration against overwriting newer state |

The current apply_channel_data_migration() at builder.rs:1342-1492 already:

  • Channel manager: Checks if one exists in the KVStore before writing (skips if yes)
  • Channel monitors: Deserializes existing monitors, compares update_id, skips if existing ≥ migrated
  • Fail-closed: Returns ReadFailed on any non-NotFound errors

What this means

  1. Bumping ldk-node to include e48948f (on fix/existing-channel-rewriting branch or main) fixes the root cause at the Rust level
  2. The Swift-side filterMigrationMonitors in this PR is belt-and-suspenders prevention — still valuable but the ldk-node fix is the primary protection
  3. Recovery for already-affected users still needs Option D (accept stale monitors + cooperative close) — the migration-aware write only prevents future overwrites, doesn't help users whose monitors are already stale in VSS

@jvsena42 (Member Author) commented

Update: ldk-node fix confirmed in v0.7.0-rc.33

The migration-aware write protection is included in ldk-node v0.7.0-rc.33 (latest tag). The app is currently pinned at c5698d0 (post-rc.32, pre-rc.33) which does NOT have the protection.

  • App version: c5698d0 (between v0.7.0-rc.32 and v0.7.0-rc.33) — no update_id comparison, blindly overwrites
  • Fix version: v0.7.0-rc.33 — includes e48948f through dc087fb (migration-aware write, fail-closed errors, tests)

Action: Bump ldk-node to v0.7.0-rc.33. The Swift-side filterMigrationMonitors code can be removed — ldk-node handles it at the Rust level.

Previous comments about "required ldk-node change" for migration-aware write are now moot — it's already done. The only remaining ldk-node work is Option D (accept stale monitors + cooperative close) for recovering already-affected users.

@jvsena42 jvsena42 force-pushed the fix/channel-monitor-stale-data branch from d7a9d96 to 8ab0069 on March 18, 2026 10:30
@jvsena42
Copy link
Member Author

@piotr-iohk replaced by #498 for a cleaner commit history
