Skip to content

goldmane: fix goroutine leaks in flow cache and emitter client#12160

Open
caseydavenport wants to merge 2 commits intoprojectcalico:masterfrom
caseydavenport:casey-goldmane-goroutine-leaks
Open

goldmane: fix goroutine leaks in flow cache and emitter client#12160
caseydavenport wants to merge 2 commits intoprojectcalico:masterfrom
caseydavenport:casey-goldmane-goroutine-leaks

Conversation

@caseydavenport
Copy link
Member

Two sources of goroutine leaks in goldmane:

ExpiringFlowCache.Run() looped forever with no cancellation. Each call to FlowClient.Connect() launched a new cache cleanup goroutine that could never be stopped — so reconnections accumulated goroutines, and FlowClient.Close() left one behind. The server-side deduplicator cache had the same issue. Fix: add a context.Context parameter to Run() and select on ctx.Done() alongside a ticker.

emitterClient cert watchers launched two goroutines (file watcher + reload consumer) against context.Background() in the constructor, with no way to shut them down. Fix: move goroutine startup out of newEmitterClient() into a new watchCerts(ctx) method, called from Emitter.Run() where we have the real lifecycle context. When ctx is cancelled the file watcher exits, closes updChan, and the reload goroutine drains and returns.

ExpiringFlowCache.Run() looped forever with no cancellation, leaking a
goroutine each time FlowClient.Connect() was called or when the server
shut down. Add a context parameter so callers can stop the cleanup loop.

The emitter HTTP client launched two goroutines (cert file watcher and
reload consumer) against context.Background(), making them impossible to
stop. Move goroutine startup out of the constructor into a new
watchCerts(ctx) method called from Emitter.Run(), so they respect the
emitter's lifecycle context.
@caseydavenport caseydavenport requested a review from a team as a code owner March 17, 2026 11:38
@caseydavenport caseydavenport added docs-not-required Docs not required for this change release-note-not-required Change has no user-facing impact labels Mar 17, 2026
Copilot AI review requested due to automatic review settings March 17, 2026 11:38
@caseydavenport caseydavenport added docs-not-required Docs not required for this change release-note-not-required Change has no user-facing impact labels Mar 17, 2026
@marvin-tigera marvin-tigera added this to the Calico v3.32.0 milestone Mar 17, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes two goroutine leak sources in goldmane: the ExpiringFlowCache.Run() loop that had no cancellation mechanism, and the emitter client's certificate watcher goroutines that were started against context.Background() with no shutdown path. Both are now tied to a context.Context for proper lifecycle management.

Changes:

  • ExpiringFlowCache.Run() now accepts a context.Context and uses a ticker with select for cancellation support
  • Certificate watcher goroutines moved from newEmitterClient() constructor to a new watchCerts(ctx) method called from Emitter.Run()
  • All callers updated to pass context through

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

Show a summary per file
File Description
goldmane/pkg/internal/flowcache/timed_cache.go Add ctx parameter to Run(), replace time.After with ticker + select on ctx.Done()
goldmane/pkg/client/client.go Pass ctx to cache.Run()
goldmane/pkg/server/flow_collector_service.go Pass ctx to deduplicator.Run()
goldmane/pkg/daemon/daemon.go Pass ctx to collector.Run()
goldmane/pkg/emitter/http_client.go Extract cert watching into watchCerts(ctx) method
goldmane/pkg/emitter/emitter.go Call watchCerts(ctx) from Emitter.Run()

Select on ctx.Done() alongside time.After in the file watcher loop so
it exits promptly on cancellation instead of blocking up to 30s.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

docs-not-required Docs not required for this change release-note-not-required Change has no user-facing impact

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants