fix: stabilize flaky CI tests#426
Merged
Merged
Conversation
1. Add sync.Mutex to ScoutConnCollected to eliminate data race between Add()/MarkWrite() in readLoop and learn() iterating results. Introduce Snapshot() for safe read access. 2. Increase bloom filter test size from 500 to 100000 to prevent false negatives from random eviction in the stable bloom filter. 3. Use Require().NoError() in TestHTTPSRequest to prevent nil-pointer panic on resp.Body.Close() when the request fails. Fixes 9seconds#425
- Move error check before Snapshot() to avoid unnecessary allocation - Update existing tests to use Snapshot() instead of direct field access - Add TestConcurrentAddSnapshot to explicitly exercise the mutex
c63a0d1 to
73c6a3a
Compare
9seconds
reviewed
Mar 30, 2026
Owner
9seconds
left a comment
There was a problem hiding this comment.
Все хорошо, я оставил пару совсем минорных комментариев. Но если что, вмержу через пару часов и так
|
|
||
| resp, err := client.Get(addr) //nolint: noctx | ||
| suite.NoError(err) | ||
| suite.Require().NoError(err) |
Owner
There was a problem hiding this comment.
О, оказывается такое в testify завезли
Collaborator
Author
There was a problem hiding this comment.
Да, удобная штука — сразу останавливает тест вместо продолжения с nil.
|
|
||
| for i := 0; i < 1000; i++ { | ||
| data, _ := collected.Snapshot() | ||
| _ = len(data) |
Collaborator
Author
There was a problem hiding this comment.
Была идея заставить горутину обратиться к данным слайса, чтобы race detector зафиксировал конкурентный доступ. Но по факту сам вызов Snapshot() уже достаточен — убрал, оставил комментарий.
| func (s *ScoutConnCollected) Snapshot() ([]ScoutConnResult, int) { | ||
| s.mu.Lock() | ||
| snapshot := make([]ScoutConnResult, len(s.data)) | ||
| copy(snapshot, s.data) |
Collaborator
Author
There was a problem hiding this comment.
Поправил, спасибо.
- Replace manual make+copy with slices.Clone in Snapshot() - Remove redundant _ = len(data); Snapshot() call alone is sufficient to exercise the lock under -race
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #425
Summary
Data race in
ScoutConnCollected: addsync.Mutexto protectAdd()/MarkWrite()from concurrent access by HTTP transport readLoop. AddSnapshot()method solearn()reads a consistent copy instead of iterating live slices.Bloom filter false negatives: increase test filter size from 500 to 100,000. With 500 bytes the random eviction has ~1.7% chance per assertion of destroying a previously inserted key — matches the observed ~1-in-8 CI failure rate. Larger filter keeps the same test logic without the noise.
nil-pointer panic in
TestHTTPSRequest: replacesuite.NoError(err)withsuite.Require().NoError(err)so the test stops immediately on connection error instead of dereferencing nilresp.Test plan
-race -count=1-count=2