Close network conduit for cluster epoch-specific engines#8363
Close network conduit for cluster epoch-specific engines#8363
Conversation
The network manager keeps a reference to the engine to send it messages, preventing it from being garbage collected even when stopped, until the conduit is closed. All engines *should* close their network conduits on shutdown, not just these, but limiting to these cases for now as only the cluster sync and message hub engines use distinct channel IDs for each epoch.
Dependency Review✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.Scanned FilesNone |
📝 WalkthroughWalkthroughAdds coordinated shutdown sequencing: message hub workers use a sync.WaitGroup and a shutdown worker that waits for all dispatch workers before closing the network conduit; Engine.Done now closes the conduit after request-handler shutdown. Tests updated to expect a single Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant DW as DispatchWorkers
participant SW as ShutdownWorker
participant C as NetworkConduit
participant L as Logger
DW->>DW: spawn workers (wg.Add)
DW->>DW: workers process messages
DW->>SW: workers done (wg.Done until all)
SW->>SW: wg.Wait()
SW->>C: Close()
alt Close returns error
C-->>L: log error
else Close ok
C-->>SW: nil
end
sequenceDiagram
autonumber
participant RH as RequestHandler
participant E as Engine.Done
participant C as NetworkConduit
participant L as Logger
RH->>E: shutdown complete
E->>C: Close()
alt Close returns error
C-->>L: log error
else Close ok
C-->>E: nil
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@engine/collection/message_hub/message_hub.go`:
- Around line 180-186: The shutdown goroutine currently ignores the error
returned by hub.con.Close(); modify it to handle the error by calling the
component's logger (e.g., hub.Logger or hub.logger) and logging a descriptive
message with the error when Close() returns non-nil; locate the goroutine that
waits on hub.ComponentManager.Done() and replace "_ = hub.con.Close()" with code
that captures the error from hub.con.Close(), checks if err != nil, and logs it
(include context like "failed to close network conduit" and the error).
In `@engine/collection/synchronization/engine.go`:
- Around line 185-186: Replace the discarded call to e.con.Close() with explicit
error handling: call e.con.Close(), capture the returned error into a variable,
and if it is non-nil log it using the engine logger
(e.log.Error().Err(err).Msg("failed to close network conduit")) so shutdown
failures are recorded; apply the same fix to the analogous discard in
message_hub.go (replace the "_ = ...Close()" pattern with the
captured-error-and-log pattern).
Co-authored-by: Peter Argue <peter.argue@flowfoundation.org>
When a conduit is created by an engine, the network manager keeps a reference to the engine to send it messages, preventing it from being garbage collected even when stopped, until the conduit is closed.
All engines should close their network conduits on shutdown, not just these, but limiting to these cases for now as only the cluster sync and message hub engines use distinct channel IDs for each epoch.
Summary by CodeRabbit
Bug Fixes
Tests
✏️ Tip: You can customize this high-level summary in your review settings.