feat: Add exponential backoff and log deduplication to Spotlight #5025

Open

mattico wants to merge 4 commits into getsentry:main from mattico:spotlight-backoff
Conversation

@mattico mattico commented Mar 13, 2026

Closes #3481.

Implements the error handling behavior from the SDK docs Spotlight spec. When the Spotlight server is unreachable, the SDK now logs the error only once (resetting after success) and implements exponential backoff (1s initial, 60s max) before retrying.
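As a rough illustration of the schedule described above (1 s initial, doubling, capped at 60 s), here is a hypothetical Python sketch; the constant names are illustrative and not from the PR's actual C# implementation:

```python
# Hypothetical sketch of the backoff schedule described above; the names
# INITIAL_RETRY_DELAY / MAX_RETRY_DELAY are illustrative, not from the PR.
INITIAL_RETRY_DELAY = 1.0
MAX_RETRY_DELAY = 60.0

def backoff_delay(consecutive_failures: int) -> float:
    """Delay to wait after the given number of consecutive failures."""
    return min(INITIAL_RETRY_DELAY * (2 ** consecutive_failures), MAX_RETRY_DELAY)

print([backoff_delay(n) for n in range(8)])
# [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```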

mattico and others added 2 commits March 13, 2026 11:53
…sport

When the Spotlight server is unreachable, the SDK now logs the error
only once and implements exponential backoff (1s initial, 60s max)
before retrying, per the Spotlight error handling spec.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
mattico (Author) commented Mar 13, 2026

The SDK docs suggest this retry logic:

    def send(self, envelope):
        try:
            # Attempt to send
            self._send_envelope(envelope)
            # Reset retry delay on success
            self.retry_delay = 1.0
            self.error_logged = False
        except ConnectionError as e:
            # Exponential backoff
            if not self.error_logged:
                logger.error(f"Spotlight server unreachable at {self.url}: {e}")
                self.error_logged = True

            # Wait before retry
            time.sleep(self.retry_delay)
            self.retry_delay = min(self.retry_delay * 2, self.max_retry_delay)

            # Retry once, then give up for this envelope
            try:
                self._send_envelope(envelope)
                self.retry_delay = 1.0
            except ConnectionError:
                # Silently drop envelope after retry
                pass

While the actual sentry-python SDK implements this retry logic:

https://github.com/getsentry/sentry-python/blob/6c6705a3d990559a80a48477a873d3171b928b12/sentry_sdk/spotlight.py#L59-L97

    def capture_envelope(self, envelope):
        # type: (Envelope) -> None

        # Check if we're in backoff period - skip sending to avoid blocking
        if self._last_error_time > 0:
            time_since_error = time.time() - self._last_error_time
            if time_since_error < self._retry_delay:
                # Still in backoff period, skip this envelope
                return

        body = io.BytesIO()
        envelope.serialize_into(body)
        try:
            req = self.http.request(
                url=self.url,
                body=body.getvalue(),
                method="POST",
                headers={
                    "Content-Type": "application/x-sentry-envelope",
                },
            )
            req.close()
            # Success - reset backoff state
            self._retry_delay = self.INITIAL_RETRY_DELAY
            self._last_error_time = 0.0
        except Exception as e:
            self._last_error_time = time.time()

            # Increase backoff delay exponentially first, so logged value matches actual wait
            self._retry_delay = min(self._retry_delay * 2, self.MAX_RETRY_DELAY)

            # Log error once per backoff cycle (we skip sends during backoff, so only one failure per cycle)
            sentry_logger.warning(
                "Failed to send envelope to Spotlight at %s: %s. "
                "Will retry after %.1f seconds.",
                self.url,
                e,
                self._retry_delay,
            )

There are some differences:

  • The docs only catch ConnectionError rather than all exceptions.
  • The docs attempt to send every envelope up to twice; the backoff only delays the second attempt. The SDK instead drops envelopes during the backoff period and does not retry failed envelopes.

I suppose both the example code and the SDK do fit the stated requirements, just differently.

I implemented behavior closer to the Python SDK than to the docs suggestion, which I think is better anyhow. The one difference I can see between my implementation and the Python SDK is that the SDK logs one warning per backoff cycle, while mine logs only the first error (resetting after a success).
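For concreteness, the log-once-until-success variant described above could be sketched like this in Python (hypothetical names; the actual change is in C#):

```python
import time
import logging

logger = logging.getLogger("spotlight")

class SpotlightBackoff:
    """Illustrative sketch (not the PR's C# code): drop envelopes during
    the backoff window, double the delay on each failure, and log only
    the first error until a send succeeds again."""

    def __init__(self, initial=1.0, maximum=60.0):
        self.initial = initial
        self.maximum = maximum
        self.delay = initial
        self.last_error_time = 0.0
        self.error_logged = False

    def should_skip(self):
        # Inside the backoff window: drop the envelope without sending.
        if self.last_error_time <= 0:
            return False
        return (time.time() - self.last_error_time) < self.delay

    def record_success(self):
        self.delay = self.initial
        self.last_error_time = 0.0
        self.error_logged = False  # the next failure will be logged again

    def record_failure(self, exc):
        self.last_error_time = time.time()
        self.delay = min(self.delay * 2, self.maximum)
        if not self.error_logged:
            logger.warning("Spotlight unreachable: %s (retrying in %.1fs)",
                           exc, self.delay)
            self.error_logged = True
```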

@mattico mattico marked this pull request as ready for review March 13, 2026 18:34
codecov bot commented Mar 15, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.10%. Comparing base (abf4d67) to head (2df3b7b).
⚠️ Report is 35 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5025      +/-   ##
==========================================
+ Coverage   73.91%   74.10%   +0.19%     
==========================================
  Files         497      499       +2     
  Lines       17974    18079     +105     
  Branches     3517     3514       -3     
==========================================
+ Hits        13285    13397     +112     
+ Misses       3833     3826       -7     
  Partials      856      856              

    using var response = await _httpClient.SendAsync(request, cancellationToken).ConfigureAwait(false);
    await HandleResponseAsync(response, processedEnvelope, cancellationToken).ConfigureAwait(false);

    _backoff.RecordSuccess();
Bug: The backoff mechanism in SpotlightHttpTransport is incorrectly reset on HTTP error responses (4xx/5xx) because _backoff.RecordSuccess() is called regardless of the response status.
Severity: HIGH

Suggested Fix

The HandleResponseAsync method should return a status indicating success. SpotlightHttpTransport should then check this status. Call _backoff.RecordSuccess() only if the response was successful (e.g., 200 OK). Otherwise, call _backoff.RecordFailure() to correctly trigger the exponential backoff logic for HTTP errors.

Location: src/Sentry/Http/SpotlightHttpTransport.cs#L55

Potential issue: In `SpotlightHttpTransport.SendEnvelopeAsync`, the
`_backoff.RecordSuccess()` method is called after `HandleResponseAsync`. However,
`HandleResponseAsync` does not throw an exception for non-200 HTTP status codes. If the
Spotlight server returns an HTTP error (e.g., 500, 503), the backoff state is
incorrectly reset, and the retry delay is not increased. This defeats the exponential
backoff mechanism for HTTP-level failures, causing the SDK to spam a struggling server
instead of waiting. The intended error logging for this failure path is also skipped.


@mattico (Author)
To my knowledge, the Python SDK does not consider an HTTP error response a failure for backoff purposes (urllib3 doesn't throw for those), and the SDK docs don't either (they only catch ConnectionError).

I haven't thought deeply about it otherwise.

@cursor cursor bot left a comment
Cursor Bugbot has reviewed your changes and found 1 potential issue.


@jamescrosswell (Collaborator)

Thanks heaps for the PR @mattico ! It looks pretty good to me. There are some issues with a couple of our CI checks when running against forks of the repo, which I'm asking for advice on. Once we've worked through those though, I think we should be able to get this reviewed and merged in.

Apologies for the delay...



Development

Successfully merging this pull request may close these issues.

Support Spotlight without sending to Sentry
