splice_locked + channel_reestablish bug fixes#4654
Conversation
|
👋 Thanks for assigning @TheBlueMatt as a reviewer! |
|
No new issues found. I re-examined the entire PR against the current code:
The two inline comments from my prior pass remain the record for this PR; no additional defects were found in this re-review. |
|
🔔 1st Reminder Hey @TheBlueMatt! This PR has been waiting for your review. |
|
🔔 2nd Reminder Hey @TheBlueMatt! This PR has been waiting for your review. |
|
🔔 3rd Reminder Hey @TheBlueMatt! This PR has been waiting for your review. |
When a splice confirms after our `channel_reestablish` was generated and sent, but prior to processing the counterparty's, we may promote the splice and clear `pending_splice`. In such cases, we're still required to send an explicit `splice_locked` as the `channel_reestablish` we sent did not consider the splice confirmation.
629f511 to
2f613ce
Compare
| if let Some(msg) = funding_tx_signed.as_mut().and_then(|v| v.tx_signatures.take()) { | ||
| pending_msg_events.push(MessageSendEvent::SendTxSignatures { | ||
| node_id: counterparty_node_id, | ||
| msg, | ||
| }); | ||
| } | ||
| if let Some(msg) = tx_abort { | ||
| pending_msg_events.push(MessageSendEvent::SendTxAbort { | ||
| node_id: counterparty_node_id, | ||
| msg, | ||
| }); | ||
| } | ||
| if let Some(msg) = funding_tx_signed.as_mut().and_then(|v| v.splice_locked.take()) { | ||
| pending_msg_events.push(MessageSendEvent::SendSpliceLocked { | ||
| node_id: counterparty_node_id, | ||
| msg, | ||
| }); | ||
| } |
There was a problem hiding this comment.
This reorder moves tx_signatures (and tx_abort) — not just splice_locked — ahead of the commitment update (handle_cs!). For splice/V2-funding reestablish, both can legitimately be present at once: channel_reestablish (channel.rs ~10637) sets commitment_update from retransmit_funding_commit_sig (the retransmitted funding commitment_signed) while tx_signatures is independently populated from session.holder_tx_signatures().
The interactive-tx retransmission rules (documented in channel.rs ~10603-10616 and BOLT2) require retransmitting commitment_signed before sending tx_signatures. With this reorder, when the peer requested our commitment_signed (because it lost it) and we also have witnesses to send, we now emit tx_signatures first, so the peer receives tx_signatures for a funding tx whose commitment_signed it hasn't seen yet — likely an error/force-close on their side.
Only splice_locked needs to precede the commitment update here. Consider keeping tx_signatures/tx_abort after handle_cs!/handle_raa! and moving just splice_locked up.
There was a problem hiding this comment.
Ouch, this was a real issue that didn't have test coverage, so I went ahead and added it.
If we have pending updates to send to our counterparty on reestablishment, while also pending a `splice_locked` send, then we must send our `splice_locked` first as the pending updates are considering the post-splice-locked state.
A stale ChannelManager can be reloaded after a monitor update has already completed in a prior runtime and released its post-update messages to the counterparty. The latest ChannelMonitor is not stale, but the serialized manager may still contain the old in-flight monitor state and `monitor_pending_*` resend flags from before the completion action ran. This becomes observable when startup monitor-completion background events are interleaved with splice promotion. On reload, the completed monitor update is queued as a background event. If a splice confirmation is processed before that background event fully resumes the channel, splice promotion can create a new `RenegotiatedFundingLocked` monitor update. The old completion is then blocked behind the new in-flight splice update. Once the channel reconnects and the splice update completes, `monitor_updating_restored` may consume the stale `monitor_pending_revoke_and_ack` / `monitor_pending_commitment_signed` flags and release a duplicate `revoke_and_ack` or `commitment_signed`. The peer's `channel_reestablish` commitment numbers are authoritative for this case. If `next_remote_commitment_number` says the peer is not waiting for a `revoke_and_ack`, clear `monitor_pending_revoke_and_ack`. Likewise, if `next_local_commitment_number` says the peer already has our latest `commitment_signed`, clear `monitor_pending_commitment_signed`.
2f613ce to
c58cba3
Compare
This PR features commits to fix three different issues found by the
chanmon_consistencytarget around the interaction ofsplice_lockedandchannel_reestablish. It fixes the following payloads from #4363: