Avoid duplicate on-chain HTLC claims after replay #4583

Open
joostjager wants to merge 12 commits into lightningdevkit:main from joostjager:onchain-claim-replay-fixes

@joostjager joostjager commented Apr 30, 2026

Fixes #4572

ldk-reviews-bot commented Apr 30, 2026

👋 Thanks for assigning @TheBlueMatt as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

codecov Bot commented Apr 30, 2026

Codecov Report

❌ Patch coverage is 94.66019% with 22 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.22%. Comparing base (0c7e6e7) to head (f73f75c).
⚠️ Report is 28 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| lightning/src/chain/channelmonitor.rs | 79.00% | 20 Missing and 1 partial ⚠️ |
| lightning/src/chain/onchaintx.rs | 99.58% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4583      +/-   ##
==========================================
- Coverage   87.16%   86.22%   -0.95%     
==========================================
  Files         161      156       -5     
  Lines      109251   108965     -286     
  Branches   109251   108965     -286     
==========================================
- Hits        95230    93950    -1280     
- Misses      11547    12402     +855     
- Partials     2474     2613     +139     

| Flag | Coverage Δ |
|---|---|
| fuzzing-fake-hashes | ? |
| fuzzing-real-hashes | ? |
| tests | 86.22% <94.66%> (-0.02%) ⬇️ |

@joostjager joostjager self-assigned this Apr 30, 2026
@joostjager joostjager force-pushed the onchain-claim-replay-fixes branch 2 times, most recently from 531576d to 9e0f886 Compare May 1, 2026 11:06
@joostjager joostjager marked this pull request as ready for review May 1, 2026 11:22
@joostjager joostjager changed the title Onchain claim replay fixes Avoid duplicate on-chain HTLC claims after replay May 1, 2026
@joostjager joostjager removed the request for review from valentinewallace May 1, 2026 11:24

ldk-claude-review-bot commented May 1, 2026

The reorg resurrection path at lines 1208-1222 correctly handles ContentiousOutpoint packages: if the locktime is still in the future, the package goes back to locktimed_packages; otherwise it merges back into the pending claim. The new outpoint_claim_state check will work correctly with this flow.

I've now reviewed every hunk in this diff thoroughly, tracing through the key behavioral changes:

  1. HTLCSpendConfirmation no longer waits for CSV — correctly separates HTLC finality from delayed output maturity
  2. htlc_output_resolution_on_chain at 4 call sites — correctly suppresses duplicate claims for resolved HTLCs
  3. outpoint_claim_state three-level dedup — properly checks ContentiousOutpoint, claimable_outpoints, and locktimed_packages in the right priority order
  4. Ordering swap (check_tx_and_push_spendable_outputs before is_resolving_htlc_output) — necessary for the debug_assert, functionally equivalent
  5. ClaimId::from_htlcs canonicalization — backward compatible since existing ClaimIds are persisted directly
  6. Holder preimage claim logging — correctly filters inbound HTLCs and warns on lost-fund scenarios
  7. All new tests — exercise the intended scenarios with correct assertions

No issues found.
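
For readers following item 3, here is a minimal sketch of a three-level duplicate check in the stated priority order. It is not the code from this PR: the `Handler` struct, the `OutpointClaimState` variants, the `filter_new_requests` helper, and the simplified `OutPoint` alias are all hypothetical, and only the field and helper names mentioned in the review (`claimable_outpoints`, `locktimed_packages`, `outpoint_claim_state`) are borrowed from the discussion.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for a Bitcoin outpoint: (txid bytes, output index).
type OutPoint = ([u8; 32], u32);

/// Why a new claim request for an outpoint would be a duplicate (illustrative variants).
#[derive(Debug, PartialEq, Eq)]
enum OutpointClaimState {
    /// A confirmed spend of this outpoint is waiting for anti-reorg finality.
    PendingSpend,
    /// The outpoint is already part of an active claim.
    ActiveClaim,
    /// The outpoint sits in a package that is waiting for its locktime.
    Timelocked { locktime: u32 },
    /// Nothing tracks the outpoint, so the request is new.
    Untracked,
}

/// Simplified handler state; the types are assumptions, not LDK's real ones.
#[derive(Default)]
struct Handler {
    contentious_outpoints: HashSet<OutPoint>,
    claimable_outpoints: HashSet<OutPoint>,
    locktimed_packages: HashMap<u32, Vec<OutPoint>>,
}

impl Handler {
    /// Classify an outpoint, checking the three sets in priority order.
    fn outpoint_claim_state(&self, outpoint: &OutPoint) -> OutpointClaimState {
        if self.contentious_outpoints.contains(outpoint) {
            return OutpointClaimState::PendingSpend;
        }
        if self.claimable_outpoints.contains(outpoint) {
            return OutpointClaimState::ActiveClaim;
        }
        for (locktime, outpoints) in &self.locktimed_packages {
            if outpoints.contains(outpoint) {
                return OutpointClaimState::Timelocked { locktime: *locktime };
            }
        }
        OutpointClaimState::Untracked
    }

    /// Keep only requests whose outpoint is not tracked anywhere yet.
    fn filter_new_requests(&self, requests: &mut Vec<OutPoint>) {
        requests.retain(|op| self.outpoint_claim_state(op) == OutpointClaimState::Untracked);
    }
}
```

Returning a state instead of a bool lets the call site log a different message per case, for example the locktime of the package that already covers the outpoint.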

Comment thread lightning/src/chain/onchaintx.rs Outdated
requests.retain(|req| {
let outpoint = req.outpoint();
if self.claimable_outpoints.get(outpoint).is_some() {
if self.is_outpoint_spend_waiting_threshold_conf(outpoint) {
Collaborator

Does this break for reorgs where a counterparty's transaction gets unconfirmed? I.e., if we have an event awaiting confirmations for the counterparty claiming an outpoint, but then that gets reorg'd out and not replayed, do we still manage to broadcast our own claim (and RBF it)?

I assume that this case is only reachable if we receive a preimage after a counterparty's timeout claim is confirming?

Contributor Author

I think on a reorg the item in onchain_events_awaiting_threshold_conf is revived again? Basically the claim is passed around to different data structures, but never lost?

Comment thread lightning/src/chain/channelmonitor.rs Outdated

fn is_htlc_output_spent_on_chain(&self, htlc: &HTLCOutputInCommitment) -> bool {
if let Some(transaction_output_index) = htlc.transaction_output_index {
// This is a monitor-level HTLC generation filter. OnchainTxHandler
Collaborator

So why exactly do we need the duplicate?

Contributor Author

It stems from OnchainTxHandler not persisting state

Contributor Author

The duplication covers a restart/replay gap after finality.

OnchainTxHandler persists active claim state, but drops it once the claim reaches anti-reorg finality. It does not keep an ever-growing resolved-outpoint tombstone set.

After that cleanup, a replayed preimage update can look like a fresh claim request to OnchainTxHandler. ChannelMonitor does persist final HTLC resolution in htlcs_resolved_on_chain, so this filter uses that state to suppress only already-resolved HTLC outpoints.
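
To illustrate the filter described above, here is a rough sketch under simplifying assumptions. The `Htlc` and `Monitor` types and the `claim_requests_for_preimage` helper are hypothetical, and the persisted resolution state is reduced to a set of commitment output indices, which is not how ChannelMonitor actually stores it.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified HTLC description.
struct Htlc {
    /// Index of the HTLC output in the commitment transaction, if it has one.
    transaction_output_index: Option<u32>,
}

/// Simplified monitor state (assumption): output indices whose spend has
/// reached anti-reorg finality and was persisted as resolved.
#[derive(Default)]
struct Monitor {
    htlcs_resolved_on_chain: HashSet<u32>,
}

impl Monitor {
    /// True if the monitor has already recorded a final resolution for this output.
    fn is_htlc_output_resolved_on_chain(&self, htlc: &Htlc) -> bool {
        match htlc.transaction_output_index {
            Some(idx) => self.htlcs_resolved_on_chain.contains(&idx),
            None => false,
        }
    }

    /// Output indices that still need a claim request when a preimage update
    /// is (re)played. Resolved outputs and HTLCs without an output are skipped.
    fn claim_requests_for_preimage(&self, htlcs: &[Htlc]) -> Vec<u32> {
        htlcs
            .iter()
            .filter(|h| !self.is_htlc_output_resolved_on_chain(h))
            .filter_map(|h| h.transaction_output_index)
            .collect()
    }
}
```

Because the resolved set is persisted with the monitor, the check keeps working after a restart, even though the handler no longer remembers the finalized claim.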

Comment thread lightning/src/chain/channelmonitor.rs Outdated
// still guards package state for outpoints split out by confirmed
// spends; here we avoid recreating HTLC claim requests once the
// monitor has observed resolution.
self.onchain_events_awaiting_threshold_conf.iter().any(|entry| match entry.event {
Collaborator

Same issue as the previous commit - does this break broadcasting our conflicting claim on reorg?

Contributor Author

Hm, here indeed a claim is never submitted and also cannot be resurrected.

Contributor Author

Now only checking for deeply confirmed outpoints, but need to look at the consequences of that.

Collaborator

Yea, I would imagine any issues we have with deeply-confirmed would apply to less-confirmed as well, only more likely.

Contributor Author

@joostjager joostjager May 4, 2026

I've added a commit that resolves HTLC spends at anti-reorg finality. Previously, there was a gap between anti-reorg finality and CSV maturity where the monitor could resubmit a claim to the onchain handler. The commit closes that gap by adding HTLC spends to htlcs_resolved_on_chain at anti-reorg finality.

That should not affect balance reporting, since any CSV-delayed output remains tracked separately. Before anti-reorg finality, the claim still goes to the onchain handler, so its reorg handling stays intact. After anti-reorg finality, replayed claims are suppressed by the monitor.

Let me know what you think.
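
The gap between the two thresholds can be made concrete with a small height calculation. This is only a sketch: the `Phase` enum and the function names are hypothetical, the anti-reorg delay of 6 blocks is an assumption about the current constant, and the CSV formula follows the code comment quoted later in this thread (confirmable in block input height + CSV, so broadcastable one block earlier).

```rust
/// Assumed anti-reorg delay in blocks.
const ANTI_REORG_DELAY: u32 = 6;

/// Height at which a spend confirmed at `conf_height` has
/// ANTI_REORG_DELAY confirmations.
fn anti_reorg_finality_height(conf_height: u32) -> u32 {
    conf_height + ANTI_REORG_DELAY - 1
}

/// Height at which a spend of a CSV-delayed output becomes broadcastable:
/// it is confirmable in block `conf_height + csv`, so one block earlier.
fn csv_broadcastable_height(conf_height: u32, csv: u16) -> u32 {
    (conf_height + u32::from(csv)).saturating_sub(1)
}

/// Hypothetical phases of an HTLC spend that creates a CSV-delayed output.
#[derive(Debug, PartialEq, Eq)]
enum Phase {
    /// Claims still go to the onchain handler; reorg handling applies.
    AwaitingFinality,
    /// HTLC is resolved, replayed claims are suppressed, output still maturing.
    ResolvedButMaturing,
    /// The delayed output can be surfaced as spendable.
    Spendable,
}

fn phase(conf_height: u32, csv: u16, best_height: u32) -> Phase {
    if best_height < anti_reorg_finality_height(conf_height) {
        Phase::AwaitingFinality
    } else if best_height < csv_broadcastable_height(conf_height, csv) {
        Phase::ResolvedButMaturing
    } else {
        Phase::Spendable
    }
}
```

With a spend confirmed at height 100 and a CSV of 144, the middle phase spans heights 105 through 242 in this model, which is the window where a replayed claim could previously be resubmitted.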

@joostjager joostjager force-pushed the onchain-claim-replay-fixes branch 2 times, most recently from 37d6b77 to 1c3b725 Compare May 1, 2026 18:26
joostjager added 5 commits May 4, 2026 09:36
  - Have ChannelMonitor hand singular ClaimRequests to OnchainTxHandler.

    Convert them to PackageTemplates only after duplicate filtering.

    This makes the single-outpoint invariant explicit at that boundary.

  - Clarify ChannelMonitor comments around on-chain event thresholds.
    Some events only wait for anti-reorg finality, while CSV-delayed
    outputs wait until spendable through the same threshold queue.

  - Move repeated OnchainTxHandler setup into shared test helpers so the
    claim-replay coverage can focus on the behavior under test.

  - Add a monitor test for an inbound HTLC claimed by preimage from a
    holder commitment. Confirm that the claimable balance remains unchanged
    after the HTLC-success spend reaches anti-reorg finality but before the
    CSV-delayed output is spendable.

  - Treat HTLCSpendConfirmation entries as irrevocably resolved once
    the commitment HTLC output spend reaches anti-reorg finality. Do
    not wait for CSV maturity of any delayed output created by that
    spend.

    Delayed outputs remain tracked separately as MaturingOutput entries,
    keeping claimable balances alive until they are CSV-mature and can be
    surfaced as SpendableOutputs.
@joostjager joostjager force-pushed the onchain-claim-replay-fixes branch 2 times, most recently from a3fbe2d to 3196617 Compare May 4, 2026 10:44
@joostjager joostjager requested a review from TheBlueMatt May 5, 2026 12:46
OnchainEvent::HTLCSpendConfirmation { on_to_local_output_csv: Some(csv), .. } => {
// A CSV'd transaction is confirmable in block (input height) + CSV delay, which means
// it's broadcastable when we see the previous block.
OnchainEvent::FundingSpendConfirmation { on_local_output_csv: Some(csv), .. } => {
Collaborator

Shouldn't the same apply to FundingSpendConfirmation? Maybe it's worth asserting this exists when pushing HTLCSpendConfirmations (might have to move self.check_tx_and_push_spendable_outputs above self.is_resolving_htlc_output)?

Contributor Author

I considered making the change for FundingSpendConfirmation too, but it breaks more things. Balance calc needs to be updated, but also the changed behavior permeates up and leads to problems. I don't think it is needed functionally because there's no preimage claim coming in that triggers a sweep?

I did add a fixup for the assert you describe (which indeed also swaps those two calls).

Comment thread lightning/src/chain/channelmonitor.rs Outdated
if htlc.offered && htlc.payment_hash == matching_payment_hash {
if htlc.offered
&& htlc.payment_hash == matching_payment_hash
&& !self.is_htlc_output_resolved_on_chain(htlc)
Collaborator

Should we at least log a huge "we lost money" error here (same elsewhere)?

Contributor Author

Added logging fixup, but not completely certain it's now in the exact right places.

Comment thread lightning/src/chain/onchaintx.rs Outdated
}

fn is_outpoint_spend_waiting_threshold_conf(&self, outpoint: &BitcoinOutPoint) -> bool {
self.onchain_events_awaiting_threshold_conf.iter().any(|entry| {
Collaborator

Should this look at the timelocked set too to simplify the callsite?

Contributor Author

Added fixup that refactors to outpoint_claim_state helper

@ldk-reviews-bot

👋 The first review has been submitted!

Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer.

joostjager added 6 commits May 6, 2026 17:05
  - Check that any HTLCSpendConfirmation carrying a local-output CSV
    has a matching delayed MaturingOutput. Scan spendable outputs before
    recording HTLC spend confirmations so the invariant is present when
    the assertion runs.

  - A replayed holder HTLC claim may arrive as a single-outpoint
    request after earlier requests were merged into a delayed package.
    Check whether an existing delayed package already covers the new
    request instead of requiring exact outpoint-set equality.

    Add focused OnchainTxHandler coverage and a ChannelMonitor regression
    through claim_funds for both current anchor variants.

  - When a transaction spends one outpoint from a delayed package, the
    split outpoint is tracked as a ContentiousOutpoint until the spend
    reaches anti-reorg finality. Reject replayed claim requests for those
    pending-spent outpoints so they are not added back before the spend
    reaches anti-reorg finality or reorgs out.

    Add an OnchainTxHandler regression that replays a holder claim during
    that pending-spent window and verifies reorg resurrection still works.

  - Classify duplicate outpoint state in one helper.

    Preserve existing filter ordering and timelock logging.

  - Filter regenerated HTLC claim requests once ChannelMonitor has persisted
    anti-reorg finality for the commitment HTLC output spend.

    This keeps replayed preimage updates from recreating claims after
    OnchainTxHandler has cleaned up its active retry state, relying on the
    monitor's persisted HTLC resolution state.

  - Log when a replayed preimage claim is skipped because the
    HTLC output reached anti-reorg finality without that preimage.

  - Hash HTLC claim outpoints in canonical order so the same logical HTLC
    set produces the same ClaimId regardless of descriptor order.

    Add a unit test covering reversed descriptor order.
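
The canonical-order idea from the last commit message can be sketched as follows. This is an illustration only: `claim_id_from_outpoints` and the `OutPoint` alias are hypothetical, and the standard library's `DefaultHasher` is swapped in for whatever hash the real ClaimId uses, so the resulting value is not a real ClaimId.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a Bitcoin outpoint: (txid bytes, output index).
type OutPoint = ([u8; 32], u32);

/// Derive an identifier from a set of HTLC outpoints that does not depend
/// on the order in which the descriptors were supplied.
fn claim_id_from_outpoints(outpoints: &[OutPoint]) -> u64 {
    // Sort a copy so the caller's slice is left untouched.
    let mut sorted = outpoints.to_vec();
    sorted.sort_unstable();

    // Feed the outpoints to the hasher in canonical (sorted) order.
    let mut hasher = DefaultHasher::new();
    for (txid, vout) in &sorted {
        txid.hash(&mut hasher);
        vout.hash(&mut hasher);
    }
    hasher.finish()
}
```

Sorting before hashing means a replayed claim that lists the same HTLCs in reverse order maps to the same identifier, so it can be matched against the existing claim instead of being tracked as a new one.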
@joostjager joostjager force-pushed the onchain-claim-replay-fixes branch from 3196617 to f73f75c Compare May 6, 2026 18:09

Development

Successfully merging this pull request may close these issues.

Duplicate Delayed Holder HTLC Claim Replay After Force-Close

4 participants