Skip ChannelManager persistence after monitor update completion#4424

Open
joostjager wants to merge 2 commits into lightningdevkit:main from joostjager:skip-cm-persist-monitor-completed

Conversation

@joostjager
Contributor

@joostjager joostjager commented Feb 18, 2026

Since we are pulling ChannelManager persistence into the critical path, we should avoid persisting unnecessarily. When process_pending_monitor_events processes only Completed events, resulting state changes may be recoverable on restart and do not require persistence.
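A hedged sketch of the idea, with simplified illustrative types standing in for LDK's actual `NotifyOption` and monitor-event definitions (all names here are assumptions for the sketch, not LDK's real API):

```rust
// Simplified, illustrative mirror of LDK's NotifyOption and monitor
// events; these are stand-ins for the sketch, not LDK's real types.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum NotifyOption {
    DoPersist,
    SkipPersistHandleEvents,
    SkipPersistNoEvents,
}

#[derive(Clone, Copy)]
pub enum MonitorEvent {
    Completed,
    HtlcEvent,
    HolderForceClosed,
}

// If every pending monitor event is Completed, the resulting state is
// recoverable on restart, so ChannelManager persistence can be skipped
// while still letting the caller handle any queued events.
pub fn persistence_for(events: &[MonitorEvent]) -> NotifyOption {
    if events.is_empty() {
        NotifyOption::SkipPersistNoEvents
    } else if events.iter().all(|e| matches!(e, MonitorEvent::Completed)) {
        NotifyOption::SkipPersistHandleEvents
    } else {
        NotifyOption::DoPersist
    }
}
```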

@ldk-reviews-bot

ldk-reviews-bot commented Feb 18, 2026

👋 I see @TheBlueMatt was un-assigned.
If you'd like another reviewer assignment, please click here.

if result == NotifyOption::SkipPersistNoEvents {
	result = NotifyOption::SkipPersistHandleEvents;
}
self.channel_monitor_updated(
Collaborator

We'll need to carefully review everything this calls to make sure they set the store-required flag. Might be easier to just manually set the flag if there's anything to do in handle_post_monitor_update_chan_resume?

Contributor Author

By "anything to do in handle_post_monitor_update_chan_resume", do you mean specifically a new monitor update being created? Or is there more that isn't recoverable on restart?

Added a commit that sets needs_persist_flag in handle_new_monitor_update_locked_actions_handled_by_caller right after chain_monitor.update_channel to cover that case.

Have to say that certainty level instantly drops when working on this state machine 😬

Contributor Author

I have an alternative approach that is more conservative in stripping out persists: https://github.com/joostjager/rust-lightning/tree/skip-cm-persist-monitor-completed-alt

Rather than skipping persistence for all MonitorEvent::Completed events unconditionally, it inspects the result of monitor_updating_restored and only skips persistence when the effect is just queuing outbound messages (commitment_signed, revoke_and_ack). But it also needs to be carefully reviewed to make sure we don't miss anything.

Collaborator

Rather than trying to carefully thread the needs-persistence flags through, maybe we just explicitly set needs_persist_flag in channel_monitor_updated? But yea I think your second approach there makes way more sense.

Contributor Author

Ok, pushed that to this PR after splitting it into two commits for reviewability. Now it needs a careful eye to verify the persist condition isn't missing anything.

Collaborator

I'm confused, as of b0ec5eb I don't see any references to needs_persist_flag in the diff?

Contributor Author

With 'pushed that', I meant the second, more conservative, approach for determining whether persistence is needed.

I thought threading it through at least keeps an overview of exactly where persistence is requested, and it also connects with process_pending_monitor_events needing to return a value for the persistence guard. It is a bit confusing that two mechanisms are both in use.

I can return SkipPersistHandleEvents from process_pending_monitor_events for Completed events, but I think the needs_persist flag cannot be set in channel_monitor_updated. It needs to be determined two levels deeper in try_resume_channel_post_monitor_update. Is your suggestion to thread back the bool from try_resume_channel_post_monitor_update to channel_monitor_updated and set the flag there? Or set the flag in try_resume_channel_post_monitor_update and consequently also in several other branches of the call tree?

Collaborator

Yea, my thinking was that we'd just self.needs_persist_flag.store() in try_resume_channel_post_monitor_update/wherever we decide we need to persist (after debug_assert'ing that we're holding the global_persistency_lock). Glancing at the other callsites this looks correct/redundant.
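A minimal sketch of that suggestion, with a plain `AtomicBool` standing in for the real field and the lock assertion elided (the struct and method names here are assumptions):

```rust
use std::sync::atomic::{AtomicBool, Ordering};

pub struct Manager {
    pub needs_persist_flag: AtomicBool,
}

impl Manager {
    // Set the flag at the exact site where we decide persistence is
    // needed, instead of threading a bool back through the call chain.
    // LDK would additionally debug_assert that the global persistence
    // lock is held; that lock type is out of scope for this sketch.
    pub fn try_resume_channel_post_monitor_update(&self, has_state_changes: bool) {
        if has_state_changes {
            self.needs_persist_flag.store(true, Ordering::Release);
        }
    }
}
```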

Contributor Author

Updated

@codecov

codecov bot commented Feb 18, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.97%. Comparing base (14e522f) to head (b0ec5eb).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4424      +/-   ##
==========================================
+ Coverage   85.94%   85.97%   +0.03%     
==========================================
  Files         159      159              
  Lines      104644   104659      +15     
==========================================
+ Hits        89934    89979      +45     
+ Misses      12204    12176      -28     
+ Partials     2506     2504       -2     
Flag Coverage Δ
tests 85.97% <100.00%> (+0.03%) ⬆️


@joostjager joostjager self-assigned this Feb 19, 2026
@joostjager joostjager force-pushed the skip-cm-persist-monitor-completed branch 2 times, most recently from 7b1c046 to 48dce43 Compare February 25, 2026 13:18
Refactor process_pending_monitor_events to return a NotifyOption
instead of a bool, allowing callers to distinguish between
DoPersist, SkipPersistHandleEvents, and SkipPersistNoEvents.

Both call sites in process_events_body and
get_and_clear_pending_msg_events are updated accordingly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@joostjager joostjager force-pushed the skip-cm-persist-monitor-completed branch from 48dce43 to b0ec5eb Compare February 25, 2026 13:30
@joostjager joostjager marked this pull request as ready for review February 26, 2026 09:22
@joostjager joostjager requested a review from TheBlueMatt March 2, 2026 10:34
@TheBlueMatt TheBlueMatt removed their request for review March 3, 2026 22:18
@TheBlueMatt TheBlueMatt self-requested a review March 31, 2026 00:32
@ldk-reviews-bot

🔔 1st Reminder

Hey @TheBlueMatt! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@TheBlueMatt TheBlueMatt removed their request for review April 3, 2026 18:04
@TheBlueMatt TheBlueMatt added this to the 0.3 milestone Apr 3, 2026
When `process_pending_monitor_events` processes only `Completed` events
and the resulting state changes are limited to queuing outbound messages
(e.g. `commitment_signed`, `revoke_and_ack`), persistence can be
skipped. These messages are re-derived on restart from
`in_flight_monitor_updates`.

Rather than threading a `needs_persist` return value back through the
call chain, set `needs_persist_flag` directly in
`try_resume_channel_post_monitor_update` and `channel_monitor_updated`
when persistence is actually required.

AI tools were used in preparing this commit.
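A runnable distillation of that condition, with a pared-down `MonitorRestoreUpdates` whose fields are assumptions based on the diff excerpts in this PR, not LDK's exact struct:

```rust
// Pared-down stand-in for LDK's MonitorRestoreUpdates; only fields
// relevant to the persistence decision, with simplified element types.
#[derive(Default)]
pub struct MonitorRestoreUpdates {
    pub has_state_changes: bool,
    pub finalized_claimed_htlcs: Vec<u64>,
    pub failed_htlcs: Vec<u64>,
    pub update_actions: Vec<u64>,
    pub htlc_forwards: Vec<u64>,
}

// Queuing outbound messages alone (commitment_signed, revoke_and_ack) is
// re-derivable on restart and does not require ChannelManager
// persistence; any of the state changes below does.
pub fn needs_persist(u: &MonitorRestoreUpdates) -> bool {
    u.has_state_changes
        || !u.finalized_claimed_htlcs.is_empty()
        || !u.failed_htlcs.is_empty()
        || !u.update_actions.is_empty()
        || !u.htlc_forwards.is_empty()
}
```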
@joostjager joostjager force-pushed the skip-cm-persist-monitor-completed branch from b0ec5eb to ebac69e Compare April 9, 2026 09:51
Comment on lines 10347 to +10362
let unbroadcasted_batch_funding_txid =
	chan.context.unbroadcasted_batch_funding_txid(&chan.funding);

// Queuing outbound messages (commitment_update, raa) alone does
// not require ChannelManager persistence.
let needs_persist = has_state_changes
	|| !updates.finalized_claimed_htlcs.is_empty()
	|| !updates.failed_htlcs.is_empty()
	|| !update_actions.is_empty()
	|| unbroadcasted_batch_funding_txid.is_some()
	|| !htlc_forwards.is_empty()
	|| decode_update_add_htlcs.is_some();

if needs_persist {
	self.needs_persist_flag.store(true, Ordering::Release);
}
Collaborator

The needs_persist check omits committed_outbound_htlc_sources (pruning optimization, replayable) and tx_signatures (resendable message). This is likely intentional, but consider adding a comment explaining their exclusion so future maintainers of MonitorRestoreUpdates know the invariant.

More importantly: handle_channel_resumption (called above) invokes emit_channel_pending_event! and emit_initial_channel_ready_event!, which can push events to self.pending_events and mutate channel flags — neither of which is captured in this needs_persist check. This appears safe today because:

  • emit_channel_pending_event! requires is_funding_broadcastable() && !channel_pending_event_emitted — in practice this aligns with funding_broadcastable.is_some()
  • emit_initial_channel_ready_event! requires is_usable() && !initial_channel_ready_event_emitted — in practice this aligns with channel_ready.is_some()

Both conditions imply has_state_changes = true. However, this coupling is implicit. If those conditions ever diverge, events could land in pending_events without triggering persistence. In process_events_body the !pending_events.is_empty() check provides a safety net, but in get_and_clear_pending_msg_events there is no such fallback.

Consider adding a debug_assert or explicit comment documenting this invariant.
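One way to make that invariant explicit, sketched with plain bools standing in for the real channel-state checks (the function and parameter names are hypothetical):

```rust
// If either channel event would be emitted, state must have changed, so
// the needs_persist computation cannot silently drop a pending event.
// In debug builds this fires if the coupling ever diverges.
pub fn assert_resume_invariant(
    will_emit_channel_pending: bool,
    will_emit_channel_ready: bool,
    has_state_changes: bool,
) {
    if will_emit_channel_pending || will_emit_channel_ready {
        debug_assert!(
            has_state_changes,
            "channel event queued without requesting ChannelManager persistence"
        );
    }
}
```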

Comment on lines 13187 to +13193
	let _ = self.handle_error(err, counterparty_node_id);
}

-	has_pending_monitor_events
+	if needs_persist {
+		NotifyOption::DoPersist
+	} else {
+		NotifyOption::SkipPersistHandleEvents
Collaborator

Minor: the function now uses two distinct mechanisms for signaling persistence need:

  1. Return value (NotifyOption::DoPersist): for HTLCEvent, HolderForceClosed*, CommitmentTxConfirmed
  2. Direct flag (needs_persist_flag.store): for state changes discovered inside channel_monitor_updated/try_resume_channel_post_monitor_update

The SkipPersistHandleEvents return here relies on the caller calling event_persist_notifier.notify() so the background processor wakes up and sees the directly-set flag. This works correctly today but is subtle — consider adding a comment at the return site explaining that persistence may have already been requested via needs_persist_flag and that the caller must still call notify() for SkipPersistHandleEvents.
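A sketch of the caller-side flow being described, with the notifier reduced to a stored bool and all names being assumptions rather than LDK's actual API (returns whether a persist should happen):

```rust
use std::sync::atomic::{AtomicBool, Ordering};

#[derive(PartialEq, Clone, Copy)]
pub enum NotifyOption {
    DoPersist,
    SkipPersistHandleEvents,
    SkipPersistNoEvents,
}

#[derive(Default)]
pub struct Notifier {
    pub notified: AtomicBool,
}

impl Notifier {
    pub fn notify(&self) {
        self.notified.store(true, Ordering::Release);
    }
}

// Even when the return value says "skip persist", the notifier must fire
// so the background processor wakes up and observes a needs_persist_flag
// that may have been set directly deeper in the call tree.
pub fn after_monitor_events(
    result: NotifyOption,
    needs_persist_flag: &AtomicBool,
    notifier: &Notifier,
) -> bool {
    match result {
        NotifyOption::DoPersist => {
            notifier.notify();
            true
        },
        NotifyOption::SkipPersistHandleEvents => {
            notifier.notify();
            needs_persist_flag.load(Ordering::Acquire)
        },
        NotifyOption::SkipPersistNoEvents => needs_persist_flag.load(Ordering::Acquire),
    }
}
```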

@ldk-claude-review-bot
Collaborator

Review Summary

PR #4424: Skip ChannelManager persistence after monitor update completion

Two inline comments posted:

  1. lightning/src/ln/channelmanager.rs:10347-10362 — The needs_persist check in the Unblocked path doesn't document why committed_outbound_htlc_sources and tx_signatures are excluded, and relies on an implicit invariant that emit_channel_pending_event!/emit_initial_channel_ready_event! (called in handle_channel_resumption above) only fire when has_state_changes is already true. This invariant is not enforced and the get_and_clear_pending_msg_events path lacks the !pending_events.is_empty() safety net that process_events_body has.

  2. lightning/src/ln/channelmanager.rs:13187-13193 — The function now uses dual persistence mechanisms (return value for non-Completed events, direct needs_persist_flag for Completed-triggered state changes). The SkipPersistHandleEvents return relies on the caller calling notify() so the background processor wakes up and sees the directly-set flag. This is correct but subtle and should be documented.

Cross-cutting concerns:

  • No tests added. This is a non-trivial change to persistence behavior that could affect data durability. Tests verifying that (a) persistence is skipped for message-only completions and (b) persistence IS triggered for completions with state changes (htlc_forwards, finalized_claims, etc.) would significantly increase confidence.
  • The overall logic appears correct after thorough analysis. The key safety argument is that RAA/commitment_update/tx_signatures are regenerated by monitor_updating_restored on restart from the persisted ChannelMonitor state, so skipping CM persistence for message-only completions is safe. The needs_persist_flag is set directly for all other cases that produce non-recoverable state changes.

