Defer BPF program attachment until after IP sets are updated#12164
Defer BPF program attachment until after IP sets are updated#12164
Conversation
Move BPF program attachment from CompleteDeferredWork() to a new Apply() method on bpfEndpointManager. The Apply() method is called from the main dataplane apply loop after IP sets have been written, ensuring that policy rules referencing IP sets are up-to-date before BPF programs start enforcing them on newly-added workloads. Previously, BPF programs were attached during CompleteDeferredWork() which runs before IP set updates. This meant a newly-added workload could have its BPF program enforcing policy before the workload's own IP was added to the relevant IP sets, causing spurious deny flow logs. The new ManagerWithApply interface allows any manager to opt into post-IP-set application. The bpfEndpointManager implements it by moving program attachment, WEP updates, interface state map writes, and health reporting to Apply(). https://claude.ai/code/session_014RZxxt6Tf6YQY7K2T6CbjM
…erence Instead of a generic interface, store bpfEndpointManager directly on InternalDataplane and call ApplyBPFPrograms() explicitly from apply(). This makes the ordering dependency clear: BPF programs must be attached after IP sets are written to prevent spurious deny flow logs. https://claude.ai/code/session_014RZxxt6Tf6YQY7K2T6CbjM
felix/dataplane/linux/bpf_ep_mgr.go
Outdated
| @@ -1742,6 +1717,37 @@ func (m *bpfEndpointManager) CompleteDeferredWork() error { | |||
| } | |||
| } | |||
|
|
|||
| // Copy data from old map to the new map | |||
| m.copyDeltaOnce.Do(func() { | |||
There was a problem hiding this comment.
I don't think this should be re-ordered; I think it needs to happen after the programs are attached to prevent races. Move it to the apply function to maintain the ordering
There was a problem hiding this comment.
Pull request overview
This PR refactors Felix’s BPF dataplane update sequencing so that BPF program attachment happens after IP sets have been written, reducing transient policy denies for new workloads when policies reference IP sets.
Changes:
- Split BPF program attachment/workload interface programming out of
bpfEndpointManager.CompleteDeferredWork()into a newApplyBPFPrograms()method. - Updated the internal dataplane apply loop to call
ApplyBPFPrograms()after IP set updates complete and before iptables programming. - Updated
bpf_ep_mgr_test.goto callApplyBPFPrograms()afterCompleteDeferredWork().
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| felix/dataplane/linux/int_dataplane.go | Stores a pointer to the BPF endpoint manager and invokes ApplyBPFPrograms() after IP set updates in apply(). |
| felix/dataplane/linux/bpf_ep_mgr.go | Introduces ApplyBPFPrograms() and moves program/interface application and related map updates out of CompleteDeferredWork(). |
| felix/dataplane/linux/bpf_ep_mgr_test.go | Updates unit tests to perform the new two-step apply sequence (CompleteDeferredWork() then ApplyBPFPrograms()). |
You can also share your feedback on Copilot code review. Take the survey.
| if m.dirtyIfaceNames.Len() == 0 { | ||
| if m.removeOldJumps { |
felix/dataplane/linux/bpf_ep_mgr.go
Outdated
| var err error | ||
| if m.v6 != nil { | ||
| err = m.v6.CtMap.CopyDeltaFromOldMap() | ||
| } | ||
| if m.v4 != nil { | ||
| err = m.v4.CtMap.CopyDeltaFromOldMap() | ||
| } | ||
| if err != nil { | ||
| logrus.WithError(err).Debugf("Failed to copy data from old conntrack map %s", err) | ||
| } |
| err = bpfEpMgr.CompleteDeferredWork() | ||
| Expect(err).NotTo(HaveOccurred()) | ||
| err = bpfEpMgr.ApplyBPFPrograms() | ||
| Expect(err).NotTo(HaveOccurred()) |
felix/dataplane/linux/bpf_ep_mgr.go
Outdated
| bpfEndpointsGauge.Set(float64(len(m.nameToIface))) | ||
| bpfDirtyEndpointsGauge.Set(float64(m.dirtyIfaceNames.Len())) | ||
|
|
||
| if m.hostNetworkedNATMode != hostNetworkedNATDisabled { |
copyDeltaOnce copies conntrack entries from the old map to the new map. It must run after BPF programs are replaced so that old programs are no longer writing to the old map during the copy. Previously it was correctly ordered after program attachment in CompleteDeferredWork; the split inadvertently moved it before program attachment. https://claude.ai/code/session_014RZxxt6Tf6YQY7K2T6CbjM
felix/dataplane/linux/bpf_ep_mgr.go
Outdated
| if m.bpfPolicyDebugEnabled { | ||
| m.removeDirtyPolicies() | ||
| } | ||
|
|
||
| bpfEndpointsGauge.Set(float64(len(m.nameToIface))) |
There was a problem hiding this comment.
@claude this gauge should be updated after applying programs too; it's meant to mean "count of programs that we successfully applied"
- Move bpfEndpointsGauge to ApplyBPFPrograms() since it represents the count of programs successfully applied, not just tracked interfaces. - Add completeDeferredAndApply() helper in bpf_ep_mgr_test.go to reduce duplication of the two-step CompleteDeferredWork()+ApplyBPFPrograms() pattern across 74 call sites. - Add ApplyBPFPrograms() calls to felix/bpf/ut/attach_test.go which also calls CompleteDeferredWork() directly and expects programs to be attached. https://claude.ai/code/session_014RZxxt6Tf6YQY7K2T6CbjM
| // Copy data from old conntrack map to the new map. Must happen after | ||
| // programs are applied above so that old programs are no longer writing | ||
| // to the old map. | ||
| m.copyDeltaOnce.Do(func() { |
There was a problem hiding this comment.
@claude let's not re-order this relative to the code below unless there's a reason to do so.
Description
This change separates BPF program attachment from the
CompleteDeferredWork()method by introducing a newApplyBPFPrograms()method that is called explicitly after IP sets have been written to the dataplane.Why This Change
When BPF programs are attached to interfaces before IP sets are updated, policy rules that reference IP sets can produce spurious denies for newly-added workloads whose IPs are freshly added to IP sets. By deferring BPF program attachment until after IP set updates complete, we ensure that:
Changes Made
bpf_ep_mgr.go:
applyProgramsToDirtyDataInterfaces(),updateWEPsInDataplane(),removeDirtyPolicies()) fromCompleteDeferredWork()to a newApplyBPFPrograms()methodCompleteDeferredWork()to ensure it happens before program attachmentCompleteDeferredWork()now focuses on preparing state without attaching programsint_dataplane.go:
bpfEndpointManagerfield toInternalDataplaneto enable explicit control over when BPF programs are appliedapply()method to callApplyBPFPrograms()after IP sets are updated but before iptables rules are appliedbpf_ep_mgr_test.go:
ApplyBPFPrograms()afterCompleteDeferredWork()to maintain test correctnessTesting
All existing unit tests have been updated to call the new
ApplyBPFPrograms()method in the appropriate sequence. The test changes ensure that the separation of concerns is properly validated across all BPF endpoint manager test scenarios.Release note:
https://claude.ai/code/session_014RZxxt6Tf6YQY7K2T6CbjM