Fix silent error discarding in Badger sampling store#7965
Fix silent error discarding in Badger sampling store#7965Sanchit2662 wants to merge 2 commits intojaegertracing:mainfrom
Conversation
Signed-off-by: Sanchit2662 <sanchit2662@gmail.com>
There was a problem hiding this comment.
this would be cleaner
| return s.store.Update(func(txn *badger.Txn) error { |
There was a problem hiding this comment.
this is currently leaking into outer scope for no reason
| if err := txn.SetEntry(entriesToStore[i]); err != nil { | |
| return err | |
| } |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #7965 +/- ##
==========================================
- Coverage 95.57% 95.54% -0.03%
==========================================
Files 316 316
Lines 16763 16763
==========================================
- Hits 16021 16017 -4
- Misses 579 582 +3
- Partials 163 164 +1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Signed-off-by: Sanchit2662 <sanchit2662@gmail.com>
|
Hi @yurishkuro , |
There was a problem hiding this comment.
Pull request overview
This PR fixes a critical bug in the Badger sampling store where transaction errors were being silently discarded instead of being returned to callers. The methods InsertThroughput() and InsertProbabilitiesAndQPS() were capturing errors from Badger transactions but always returning nil, causing write failures to be invisible to components relying on persisted sampling data.
Changes:
- Modified
InsertThroughput()to return the transaction error instead of discarding it - Modified
InsertProbabilitiesAndQPS()to return the transaction error instead of discarding it
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| return s.store.Update(func(txn *badger.Txn) error { | ||
| for i := range entriesToStore { | ||
| err = txn.SetEntry(entriesToStore[i]) | ||
| if err != nil { | ||
| if err := txn.SetEntry(entriesToStore[i]); err != nil { | ||
| return err | ||
| } | ||
| } | ||
|
|
||
| return nil | ||
| }) |
There was a problem hiding this comment.
The error propagation fix is correct, but there should be test coverage to verify that errors from s.store.Update() are actually returned to the caller. Consider adding a test that simulates a Badger transaction error (e.g., using a closed database or mocked error scenario) to ensure the error is properly propagated.
| return s.store.Update(func(txn *badger.Txn) error { | ||
| for i := range entriesToStore { | ||
| err = txn.SetEntry(entriesToStore[i]) | ||
| if err != nil { | ||
| if err := txn.SetEntry(entriesToStore[i]); err != nil { | ||
| return err | ||
| } | ||
| } | ||
|
|
||
| return nil | ||
| }) |
There was a problem hiding this comment.
The error propagation fix is correct, but there should be test coverage to verify that errors from s.store.Update() are actually returned to the caller. Consider adding a test that simulates a Badger transaction error (e.g., using a closed database or mocked error scenario) to ensure the error is properly propagated.
Summary
This PR fixes silent error handling in the Badger-based sampling store.
In
internal/storage/v1/badger/samplingstore/storage.go, the write methodsInsertThroughput()andInsertProbabilitiesAndQPS()capture errors from Badger transactions but always returnnil. This causes write failures to be silently ignored.As a result, components relying on persisted sampling data (such as adaptive sampling) may assume writes succeeded even when Badger rejected them due to disk pressure, corruption, or transaction failures.
Currently, transaction errors are captured but discarded:
This hides storage failures from callers and reduces visibility into persistence issues.
Fix
This PR ensures transaction errors are returned to callers instead of being dropped.
Changes:
InsertThroughput()now returns the transaction errorInsertProbabilitiesAndQPS()now returns the transaction errorThis aligns write behavior with expected error propagation and with patterns used in other storage backends.
Why This Matters
Prevents silent write failures
Improves reliability of adaptive sampling
Surfaces real storage issues to callers
The change is minimal and does not alter storage logic—only error propagation.
Verification
make fmt— passedmake lint— passedgo test ./internal/storage/v1/badger/...— all tests passChecklist
jaeger:make lint testjaeger-ui:npm run lintandnpm run test