fix(controller): use noCacheReader in webhook to avoid extProc injection cache race#1981
Open
KaveeshKhattar wants to merge 1 commit intoenvoyproxy:mainfrom
Open
Conversation
…ion cache race Signed-off-by: Kaveesh Khattar <kaveeshkhattar@gmail.com>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1981 +/- ##
==========================================
- Coverage 84.33% 84.33% -0.01%
==========================================
Files 130 130
Lines 17987 18032 +45
==========================================
+ Hits 15170 15207 +37
- Misses 1873 1877 +4
- Partials 944 948 +4 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Contributor
|
Were you able to reproduce this bug? Can you provide the steps? |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Adds a
noCacheReader client.Readerfield togatewayMutatorwired frommgr.GetAPIReader()at construction time.Route lookups are extracted into two helper methods -
listAIGatewayRoutesForGatewayandlistMCPRoutesForGateway,Extract route lookups into two helper methods:
listAIGatewayRoutesForGatewaylistMCPRoutesForGatewayEach following a cache-first, fallback-second pattern, trying the cached client with MatchingFields index lookup first; if the cache returns empty, fall back to noCacheReader.
The no-cache path cannot use
MatchingFieldssince it has no access to in-memory indexes these exist only in the controller's cache, not on the API server.Manual filtering via
parentRefsMatchGatewayreplicates the same namespace resolution logic as the index functions. A comment marks this coupling explicitly so future changes to index logic are not missed.Related Issues/PRs (if applicable)
Fixes #1495
Related PR: #1789
Validation
3 tests added to
gateway_mutator_test.go, each using two separate fake client instances.One empty (simulating stale cache) and one populated (simulating direct API server) to exercise the fallback path:
TestGatewayMutator_mutatePod_UsesNoCacheReader: route exists only in noCacheReader, verifies sidecar is still injectedTestGatewayMutator_listAIGatewayRoutesForGateway_NoCacheReaderFallback: verifies fallback filtering returns only matching routesTestGatewayMutator_listMCPRoutesForGateway_NoCacheReaderFallback: same for MCP routesmake precommit testpassed locally.Additional Context
The Gateway controller has a corrective rollout mechanism where if it detects pods without the sidecar while effective routes exist, it triggers a rolling update to re-invoke the webhook.
However, this mechanism also uses the cached client for route lookups and is subject to the same cache race. This fix addresses the root cause directly at the webhook level, making the corrective rollout unnecessary for this scenario.