
[BPF] Add native netkit BPF attachment support to Felix #12167

Draft
tomastigera wants to merge 10 commits into projectcalico:master from tomastigera:tomas/bpf-netkit-fv

Conversation

@tomastigera
Contributor

Summary

Initial implementation of native netkit BPF attachment in Felix. When Felix detects a workload endpoint using a netkit device (instead of veth), it attaches BPF programs via the netkit attach API rather than TC/TCX. This is the foundation for supporting netkit as an alternative to veth for workload networking.

  • Netkit detection and attachment: Felix auto-detects netkit workload interfaces and uses BPF_NETKIT_PRIMARY/BPF_NETKIT_PEER attachment instead of TC/TCX
  • Separate prog_array maps: Netkit programs have a different expected_attach_type and cannot share prog_array maps with TC/TCX programs. New maps (cali_p_nk_ing2, cali_p_nk_egr2, cali_j_nk_ing2, cali_j_nk_egr2) with pin path overrides redirect the same BPF object files to netkit-specific maps
  • host_ifindex in globals: BPF_NETKIT_PEER programs see the peer's ifindex in skb->ifindex; added host_ifindex to globals for correct map lookups
  • No bpf_redirect_peer for netkit: Netkit programs run in xmit context, where bpf_redirect_peer silently drops packets, so the FIB path uses plain bpf_redirect instead
  • FV test infrastructure: --netkit flag for test-workload, FELIX_FV_NETKIT=Enabled env var, fv-bpf-netkit Makefile target
  • CI: Full BPF FV run with netkit on Ubuntu 25.10 (replaces UT-only 25.10 check)
None

Test plan

  • make -C felix fv-bpf-netkit with connectivity test (ipv4 tcp, no tunnel, no dsr) passes
  • Full BPF FV suite with netkit on Ubuntu 25.10 CI runner

🤖 Generated with Claude Code

tomastigera and others added 8 commits March 16, 2026 13:59
Add support for attaching BPF programs to netkit devices using native
BPF_NETKIT_PRIMARY/BPF_NETKIT_PEER attachment instead of TC/TCX. This
provides better performance by running BPF programs inside
ndo_start_xmit(), bypassing the per-CPU softirq backlog queue.

Felix detects netkit interfaces at runtime via link.Type() and
automatically selects the native attachment mechanism per-device. Veth
and netkit endpoints can coexist on the same node — veth devices
continue using TC/TCX while netkit devices use native attachment.

Key changes:
- libbpf C/Go wrappers for bpf_program__attach_netkit()
- Netkit detection (IfaceTypeNetkit) in bpf_ep_mgr
- attachNetkitProgram/detachNetkitProgram in tc/attach.go
- IsNetkitSupported kernel probe (creates test netkit pair)
- Hook direction mapping: TC ingress -> BPF_NETKIT_PEER,
  TC egress -> BPF_NETKIT_PRIMARY
- Per-device progAttachType threading through program loading
  to ensure netkit and TCX programs are cached separately
- Skip qdisc setup for netkit devices
- Netkit pin directory and cleanup
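
The hook direction mapping listed above can be sketched in Go as follows. This is a minimal illustration, not the code in this PR: the `Hook` and `NetkitAttachType` types and the `netkitAttachTypeFor` function are hypothetical names, and the string values only mirror the kernel attach type names quoted in the commit message.

```go
package main

import "fmt"

// Hook is a hypothetical stand-in for Felix's TC hook direction.
type Hook int

const (
	HookIngress Hook = iota
	HookEgress
)

// NetkitAttachType names a netkit attach point. The strings mirror the
// kernel attach type names used in the commit message.
type NetkitAttachType string

const (
	NetkitPrimary NetkitAttachType = "BPF_NETKIT_PRIMARY"
	NetkitPeer    NetkitAttachType = "BPF_NETKIT_PEER"
)

// netkitAttachTypeFor maps a TC hook direction to the netkit attach point:
// TC ingress -> BPF_NETKIT_PEER, TC egress -> BPF_NETKIT_PRIMARY.
func netkitAttachTypeFor(h Hook) (NetkitAttachType, error) {
	switch h {
	case HookIngress:
		return NetkitPeer, nil
	case HookEgress:
		return NetkitPrimary, nil
	default:
		return "", fmt.Errorf("unknown hook %d", h)
	}
}

func main() {
	for _, h := range []Hook{HookIngress, HookEgress} {
		at, err := netkitAttachTypeFor(h)
		fmt.Println(h, at, err)
	}
}
```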

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… test

- Detect netkit device type for workload interfaces via netlink so Felix
  can auto-select native netkit attachment. Only workload interfaces are
  eligible — other netkit devices (host/data interfaces) are not ours.
- Clean up netkit pins on interface deletion alongside TCX pins.
- Add cleanupNetkitPins helper.
- Restrict netkit attachment override to workload interfaces only.
- Add TestAttachNetkit: creates a netkit device, verifies Felix
  auto-detects it and uses native BPF attachment (pins exist in
  netkit dir, no TC/TCX programs or qdisc). Skips on kernels < 6.7.
- Add createNetkitName helper for test device creation.
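
The eligibility rule described above (netkit attachment only for workload interfaces whose link type is netkit) might look roughly like this. The function name, the return strings, and the assumption that `link.Type()` reports `"netkit"` for such devices are illustrative, not taken from the PR.

```go
package main

import "fmt"

// attachMechanism picks how BPF programs are attached to an interface.
// linkType is assumed to be the string reported by netlink's link.Type();
// isWorkload says whether the interface is a workload endpoint.
// Netkit devices that are not workload interfaces keep the TC/TCX path.
func attachMechanism(linkType string, isWorkload bool) string {
	if isWorkload && linkType == "netkit" {
		return "netkit"
	}
	return "tc/tcx"
}

func main() {
	fmt.Println(attachMechanism("netkit", true))  // workload netkit device
	fmt.Println(attachMechanism("netkit", false)) // host/data netkit device
	fmt.Println(attachMechanism("veth", true))    // workload veth device
}
```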

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ObjectFile() maps AttachType to BPF object filenames. The same object
file is used regardless of the attachment mechanism (TC, TCX, Netkit),
so ProgAttachType must be zeroed before the map lookup. Without this,
netkit devices failed to find their object file because the key
included ProgAttachType:"Netkit" which had no entry in the map.
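
To illustrate why the lookup failed, here is a small sketch of a struct-keyed map where one field must be cleared before the lookup. The `AttachType` fields shown, the object file names, and the `objectFile` helper are simplified, hypothetical stand-ins for the real definitions.

```go
package main

import "fmt"

// AttachType is a simplified, hypothetical version of the map key.
type AttachType struct {
	Hook           string // "ingress" or "egress"
	Family         int    // 4 or 6
	ProgAttachType string // "", "TCX" or "Netkit"
}

// objectFiles is keyed without ProgAttachType because the same object
// file serves every attachment mechanism. File names are placeholders.
var objectFiles = map[AttachType]string{
	{Hook: "ingress", Family: 4}: "ingress_v4.o",
	{Hook: "egress", Family: 4}:  "egress_v4.o",
}

// objectFile zeroes ProgAttachType on its local copy of the key so that
// TC, TCX and netkit attach types all resolve to the same entry.
func objectFile(at AttachType) (string, bool) {
	at.ProgAttachType = ""
	name, ok := objectFiles[at]
	return name, ok
}

func main() {
	key := AttachType{Hook: "ingress", Family: 4, ProgAttachType: "Netkit"}
	_, rawOK := objectFiles[key] // direct lookup misses: the key includes "Netkit"
	name, ok := objectFile(key)  // normalized lookup hits
	fmt.Println(rawOK, name, ok)
}
```

Because Go compares struct keys field by field, any non-empty `ProgAttachType` produces a distinct key, which is the failure mode the commit describes.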

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Enable FV tests to create netkit L2 pairs instead of veth pairs for
workload endpoints, controlled by FELIX_FV_NETKIT=Enabled env var.
This mirrors the FELIX_FV_NFTABLES pattern.

- Add NetkitMode() to infrastructure/modes.go
- Add --netkit flag to test-workload with doNetkitSetUp() that creates
  netkit pairs with the same IP/route/sysctl config as veth
- Pass --netkit to test-workload when NetkitMode() is true
- Add fv-bpf-netkit Makefile target and pass FELIX_FV_NETKIT to run-batches
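
A minimal sketch of the env-var switch described above, assuming the mode is enabled only when `FELIX_FV_NETKIT` is exactly `Enabled`. The `netkitEnabled` and `workloadArgs` helpers are hypothetical; the real `NetkitMode()` may differ.

```go
package main

import (
	"fmt"
	"os"
)

// netkitEnabled reports whether the FV run should use netkit pairs.
// The lookup function is injected so the logic can be exercised
// without touching the process environment.
func netkitEnabled(getenv func(string) string) bool {
	return getenv("FELIX_FV_NETKIT") == "Enabled"
}

// workloadArgs returns a copy of base, with --netkit appended for
// test-workload when netkit mode is on.
func workloadArgs(base []string, netkit bool) []string {
	args := append([]string{}, base...)
	if netkit {
		args = append(args, "--netkit")
	}
	return args
}

func main() {
	mode := netkitEnabled(os.Getenv)
	fmt.Println(workloadArgs([]string{"test-workload"}, mode))
}
```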

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The kernel requires all programs in a prog_array to have the same
expected_attach_type. Netkit programs (BPF_NETKIT_PEER/PRIMARY) are
incompatible with TC/TCX programs in the same prog_array. TC and TCX
can share because the kernel treats their attach types as compatible.

Add netkit-specific ProgramsMaps and JumpMaps pinned under a separate
directory (netkit/ within GlobalPinDir). When loading BPF objects for
netkit interfaces, the prog_array map pin paths are redirected to
these netkit-specific maps via MapPinOverrides.
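
The pin override idea can be sketched as a lookup that prefers an override path and otherwise falls back to the default pin directory. The pin directory value, the `cali_progs_ing` source map name, and the helper names below are assumptions for illustration; only the `netkit/` subdirectory and the `cali_p_nk_ing2` map name come from this PR.

```go
package main

import (
	"fmt"
	"path"
)

// globalPinDir is an assumed value for the global pin directory.
const globalPinDir = "/sys/fs/bpf/tc/globals"

// netkitOverrides redirects a prog_array map declared in the shared
// object file to a netkit-specific pin. The source map name is hypothetical.
func netkitOverrides() map[string]string {
	return map[string]string{
		"cali_progs_ing": path.Join(globalPinDir, "netkit", "cali_p_nk_ing2"),
	}
}

// pinPathFor returns the override for mapName if one exists, otherwise
// the default location under defaultDir. A nil overrides map is allowed.
func pinPathFor(mapName, defaultDir string, overrides map[string]string) string {
	if p, ok := overrides[mapName]; ok {
		return p
	}
	return path.Join(defaultDir, mapName)
}

func main() {
	fmt.Println(pinPathFor("cali_progs_ing", globalPinDir, netkitOverrides()))
	fmt.Println(pinPathFor("cali_progs_ing", globalPinDir, nil))
}
```

With this shape, the same object file can be loaded for TC/TCX (no overrides) and for netkit (overrides set), and only the prog_array pins differ.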

Changes:
- bpf/bpfdefs: Add NetkitGlobalPinDir constant
- bpf/jump: Add NetkitIngressMapParameters, NetkitEgressMapParameters
- bpf/hook: Add NetkitProgramsMap parameters with mapPinOverrides
  mechanism to redirect prog_array pins during object loading
- bpf/bpfmap: Add NetkitProgramsMaps and NetkitJumpMaps to CommonMaps
- bpf/bpf.go: Add LoadObjectWithPinOverrides for map pin path overrides
- bpf/tc: Add MapPinOverrides field to AttachPoint, pass through loadObject
- bpf_ep_mgr: Select netkit maps/allocators for netkit interfaces,
  lazy-load netkit default policy programs, update log filter and
  policy program paths for netkit

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
With netkit, BPF_NETKIT_PEER programs see skb->ifindex as the peer's
ifindex (inside the workload namespace), not the primary (host-side)
ifindex. The counters and ifstate maps are keyed by the primary
ifindex, causing "no counters" drops when the BPF program looks up
the peer ifindex.

Add host_ifindex to cali_tc_global_data. The Go code sets it from
ap.IfIndex (always the primary/host-side ifindex). The C code uses
host_ifindex for the counters map lookup, falling back to skb->ifindex
when host_ifindex is 0 (non-netkit interfaces).
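
The fallback rule is implemented in the C BPF code; the sketch below restates it in Go purely for illustration, with hypothetical type and function names.

```go
package main

import "fmt"

// globals is a hypothetical, reduced view of cali_tc_global_data.
type globals struct {
	HostIfindex uint32 // 0 for non-netkit interfaces
}

// countersIfindex returns the ifindex used as the counters map key:
// the host-side ifindex when set, otherwise the ifindex seen on the skb.
func countersIfindex(g globals, skbIfindex uint32) uint32 {
	if g.HostIfindex != 0 {
		return g.HostIfindex
	}
	return skbIfindex
}

func main() {
	// Netkit peer program: skb reports the peer ifindex, the key must be the host one.
	fmt.Println(countersIfindex(globals{HostIfindex: 12}, 2))
	// Veth / TC program: host_ifindex is unset, so the skb ifindex is used.
	fmt.Println(countersIfindex(globals{}, 7))
}
```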

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
bpf_redirect_peer requires a TC ingress context (skb_at_tc_ingress),
but netkit programs run in xmit (ndo_start_xmit) context. Calling
bpf_redirect_peer from netkit silently drops packets.

Disable RedirectPeer on netkit attach points so the FIB path uses
plain bpf_redirect instead, which works correctly from netkit context.
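
As a rough sketch of the decision described above, the helper below picks which redirect helper the FIB path would use for a given attach point. The `attachPoint` struct and `fibRedirectHelper` function are hypothetical and only model the rule, not the actual Felix types.

```go
package main

import "fmt"

// attachPoint is a hypothetical, reduced view of an attach point.
type attachPoint struct {
	Netkit       bool // attached through the netkit API
	RedirectPeer bool // operator or default preference for bpf_redirect_peer
}

// fibRedirectHelper names the helper the FIB path should use.
// bpf_redirect_peer is never chosen for netkit, because netkit programs
// run in xmit context where that helper drops packets.
func fibRedirectHelper(ap attachPoint) string {
	if ap.RedirectPeer && !ap.Netkit {
		return "bpf_redirect_peer"
	}
	return "bpf_redirect"
}

func main() {
	fmt.Println(fibRedirectHelper(attachPoint{Netkit: true, RedirectPeer: true}))
	fmt.Println(fibRedirectHelper(attachPoint{Netkit: false, RedirectPeer: true}))
}
```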

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the UT-only 25.10 program-loading check with a full BPF FV
run using netkit devices (FELIX_FV_NETKIT=Enabled). This validates
end-to-end netkit BPF attachment with real traffic on a kernel that
supports netkit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@marvin-tigera marvin-tigera added this to the Calico v3.32.0 milestone Mar 17, 2026
@marvin-tigera marvin-tigera added the release-note-required (Change has user-facing impact, no matter how small) and docs-pr-required (Change is not yet documented) labels Mar 17, 2026
@tomastigera tomastigera added the release-note-not-required (Change has no user-facing impact) label and removed the release-note-required (Change has user-facing impact, no matter how small) label Mar 17, 2026
@tomastigera tomastigera force-pushed the tomas/bpf-netkit-fv branch 4 times, most recently from c50a64c to b448cbf Compare March 18, 2026 16:45
Add AttachTypeNetkitPrimary/Peer to libbpf_stub.go for non-amd64
builds where CGO is unavailable.

Update attach_test.go to include ProgAttachType in AttachType key
comparisons, matching the field added to the struct for netkit support.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@tomastigera tomastigera force-pushed the tomas/bpf-netkit-fv branch from b448cbf to 1349f41 Compare March 18, 2026 20:53
Labels

docs-pr-required (Change is not yet documented), release-note-not-required (Change has no user-facing impact)

2 participants