Fix ep-to-host-action FV test flake in BPF mode #12169

Open
tomastigera wants to merge 1 commit into projectcalico:master from tomastigera:tomas/fix-ep-to-host-bpf-flake

@tomastigera
Contributor

Summary

  • Fix flaky _BPF-SAFE_ endpoint-to-host-action FV test that intermittently fails with "no route to host" in BPF mode
  • After TriggerDelayedStart(), the test was checking connectivity without waiting for Felix to finish programming the dataplane
  • The CTLB cgroup hook allowed connect() because the BPF routes were already in the map, but the tc programs were not yet attached to the workload's cali interface, so packets fell through to kernel routing, which had no route yet
  • Add WaitForReady() after TriggerDelayedStart() to ensure Felix completes its first apply cycle before testing connectivity
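
The ordering problem in the bullets above can be modelled with a small, self-contained sketch. It uses a hypothetical `fakeFelix` type, not the real FV infrastructure. The method names `TriggerDelayedStart` and `WaitForReady` mirror the ones named in this PR, but their bodies here, together with `finishApply`, `Connect`, and `ErrNoRoute`, are assumptions made purely for illustration.

```go
package main

import (
	"errors"
	"sync"
)

// ErrNoRoute stands in for the "no route to host" failure seen in the flake.
var ErrNoRoute = errors.New("no route to host")

// fakeFelix is a hypothetical stand-in for the test's Felix handle.
// It models only the ordering described in the PR, not the real dataplane.
type fakeFelix struct {
	mu          sync.Mutex
	routesInMap bool          // BPF routes present, so connect() is allowed
	tcAttached  bool          // tc programs attached to the cali interface
	proceed     chan struct{} // closed to let the simulated apply cycle finish
	ready       chan struct{} // closed once the first apply cycle is done
}

func newFakeFelix() *fakeFelix {
	return &fakeFelix{proceed: make(chan struct{}), ready: make(chan struct{})}
}

// TriggerDelayedStart returns as soon as routes are in the map.
// Attaching the tc programs happens later, in the background.
func (f *fakeFelix) TriggerDelayedStart() {
	f.mu.Lock()
	f.routesInMap = true
	f.mu.Unlock()
	go func() {
		<-f.proceed
		f.mu.Lock()
		f.tcAttached = true
		f.mu.Unlock()
		close(f.ready)
	}()
}

// finishApply is a test-only knob (an assumption of this sketch) that lets
// the simulated first apply cycle complete.
func (f *fakeFelix) finishApply() { close(f.proceed) }

// WaitForReady blocks until the first apply cycle has completed.
func (f *fakeFelix) WaitForReady() { <-f.ready }

// Connect fails with ErrNoRoute until the tc programs are attached,
// even though the routes are already in the map.
func (f *fakeFelix) Connect() error {
	f.mu.Lock()
	defer f.mu.Unlock()
	if !f.routesInMap || !f.tcAttached {
		return ErrNoRoute
	}
	return nil
}
```

In this model, calling `Connect()` right after `TriggerDelayedStart()` returns `ErrNoRoute`, while calling it after `WaitForReady()` returns `nil`. That is the same ordering guarantee the PR adds to the test.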

Test plan

  • Run the failing test in BPF mode: make -C felix fv-bpf GINKGO_FOCUS="endpoint-to-host-action"
  • Verify test passes consistently (previously flaky)

🤖 Generated with Claude Code

@tomastigera tomastigera requested a review from a team as a code owner March 18, 2026 01:40
@tomastigera tomastigera added the docs-not-required (Docs not required for this change) and release-note-not-required (Change has no user-facing impact) labels Mar 18, 2026
Copilot AI review requested due to automatic review settings March 18, 2026 01:40
@marvin-tigera marvin-tigera added this to the Calico v3.32.0 milestone Mar 18, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR updates a Felix FV test to avoid a startup race in BPF mode by explicitly waiting for Felix readiness before performing connectivity assertions.

Changes:

  • Add a WaitForReady() synchronization point after TriggerDelayedStart() in the endpoint-to-host-action FV test.
  • Update the inline comment to document the BPF-mode timing issue being addressed.

@tomastigera tomastigera force-pushed the tomas/fix-ep-to-host-bpf-flake branch from 75bf393 to 3b27a86 Compare March 18, 2026 16:54
After TriggerDelayedStart(), the test immediately checked
connectivity without waiting for Felix to finish programming.
In BPF mode, the CTLB cgroup hook would allow connect() (BPF
routes were in the map) but the tc programs weren't yet attached
to the workload's cali interface, causing "no route to host"
errors that exhausted the 10s connectivity checker timeout.

Add WaitForReady() after TriggerDelayedStart() to ensure Felix
has completed its first apply cycle before testing connectivity.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@tomastigera tomastigera force-pushed the tomas/fix-ep-to-host-bpf-flake branch from 3b27a86 to 0e254d2 Compare March 18, 2026 22:44
Labels

docs-not-required: Docs not required for this change
release-note-not-required: Change has no user-facing impact

3 participants