Skip to content

Fix streaming metadata and DeepSeek reasoning#15

Merged
jhaynie merged 1 commit intomainfrom
fix/streaming-metadata-deepseek-reasoning
May 9, 2026
Merged

Fix streaming metadata and DeepSeek reasoning#15
jhaynie merged 1 commit intomainfrom
fix/streaming-metadata-deepseek-reasoning

Conversation

@jhaynie
Copy link
Copy Markdown
Member

@jhaynie jhaynie commented May 9, 2026

Summary

  • append gateway metadata SSE events after streamed billing is calculated
  • normalize DeepSeek reasoning requests into provider-supported thinking params
  • add coverage for streamed metadata and DeepSeek reasoning on/off forwarding

Tests

  • go test ./...

Summary by CodeRabbit

  • New Features

    • Streaming responses now emit a gateway.metadata SSE event with billing and token totals before the stream terminal event.
    • Improved deepseek request compatibility: top-level reasoning values are translated into a standardized thinking object and effort field.
  • Tests

    • Added coverage validating deepseek normalization for streaming and non-streaming flows and the gateway.metadata streaming event.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 9, 2026

Review Change Stack

Warning

Rate limit exceeded

@jhaynie has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 2 minutes and 22 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 32cb9dbf-e42f-4607-96c8-981e550da3fa

📥 Commits

Reviewing files that changed from the base of the PR and between 65ba022 and 9399a1d.

📒 Files selected for processing (2)
  • autorouter.go
  • autorouter_test.go
📝 Walkthrough

Walkthrough

This PR normalizes Deepseek reasoning request fields into thinking objects (with optional reasoning_effort) for both Forward and ForwardStreaming, and adds SSE terminal buffering plus emission of a event: gateway.metadata containing billing totals before the terminal SSE event.

Changes

Deepseek Request Normalization and Streaming Metadata

Layer / File(s) Summary
Request Normalization Helper
autorouter.go
New normalizeProviderRequest function detects provider type; for Deepseek, removes reasoning field and injects thinking object with type (enabled/disabled) and conditionally reasoning_effort based on the original reasoning value.
Forward Integration
autorouter.go
In AutoRouter.Forward, normalizeProviderRequest is called after parsing the request body and before marshalling; applies provider-specific transformations before sending to upstream.
Streaming Request Normalization
autorouter.go
In AutoRouter.ForwardStreaming, request normalization is applied before stream options and re-marshalling the request body.
SSE Terminal Buffering
autorouter.go
When billing is enabled and upstream is SSE, the response writer is wrapped with an sseTerminalHoldingWriter to buffer terminal bytes for later metadata emission and flush.
Streaming Metadata Helper & Emission
autorouter.go
New writeGatewayMetadataEvent helper serializes billing metadata into SSE event: gateway.metadata and flushes it; after billing calculation the router emits this event before flushing the buffered SSE terminal payload.
Tests
autorouter_test.go
Tests added for Deepseek normalization (non-streaming and streaming), unknown reasoning handling, and a streaming test asserting gateway.metadata SSE event contains usage and cost fields and appears before data: [DONE].
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
autorouter_test.go (1)

348-477: ⚡ Quick win

Add one ForwardStreaming DeepSeek normalization test.

These cases only exercise Forward, but ForwardStreaming has its own request-rewrite path and now applies the same DeepSeek normalization separately. A single streaming on/off case here would keep the two paths from drifting.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@autorouter_test.go` around lines 348 - 477, Add a new unit test that mirrors
either TestAutoRouter_DeepseekReasoningOffDisablesThinking or
TestAutoRouter_DeepseekReasoningHighEnablesThinking but calls
router.ForwardStreaming instead of router.Forward to verify the streaming
request-rewrite path applies the same DeepSeek normalization; reuse the same
test setup (mockProvider, upstream httptest server, NewAutoRouter with
ProviderDetectorFunc returning "deepseek") and assert that the upstreamBody has
no "reasoning", that "thinking.type" is set to "disabled" or "enabled" as
appropriate, and that "reasoning_effort" is preserved for the high case.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@autorouter.go`:
- Around line 409-413: The current placement of writeGatewayMetadataEvent after
the upstream copy causes gateway.metadata to be sent after the terminal SSE
marker; change the SSE forwarding logic so that when
IsSSEStream(upstreamResp.Header.Get("Content-Type")) is true you intercept the
stream copy and detect the terminal event ("data: [DONE]") and call
writeGatewayMetadataEvent(w, rc, billing) immediately before forwarding that
final marker (or buffer/peek the last chunk and inject metadata before writing
it). Update the copy/forwarding code path (the SSE copy loop that reads from
upstreamResp.Body) to perform this detection and injection using the existing
IsSSEStream and writeGatewayMetadataEvent helpers instead of calling
writeGatewayMetadataEvent after the full copy completes.
- Around line 649-670: The current logic deletes raw["reasoning"] before
confirming a valid mapping, which can silently drop unsupported values and leave
contradictory reasoning_effort; update the handler so you first inspect and map
the incoming reasoning value (variable reasoning) without deleting it, only
remove raw["reasoning"] and set raw["thinking"] when you have a recognized
mapping (handle string cases and bool as in the switch on reasoning and values
like "off"/"false"/"none" -> thinking.type="disabled",
"low"/"medium"/"high"/"max"/"xhigh" -> thinking.type="enabled" and set
raw["reasoning_effort"]=value), and when you set thinking.type="disabled"
explicitly clear any existing raw["reasoning_effort"]; leave raw untouched for
unrecognized types/values so callers’ data isn’t lost.

---

Nitpick comments:
In `@autorouter_test.go`:
- Around line 348-477: Add a new unit test that mirrors either
TestAutoRouter_DeepseekReasoningOffDisablesThinking or
TestAutoRouter_DeepseekReasoningHighEnablesThinking but calls
router.ForwardStreaming instead of router.Forward to verify the streaming
request-rewrite path applies the same DeepSeek normalization; reuse the same
test setup (mockProvider, upstream httptest server, NewAutoRouter with
ProviderDetectorFunc returning "deepseek") and assert that the upstreamBody has
no "reasoning", that "thinking.type" is set to "disabled" or "enabled" as
appropriate, and that "reasoning_effort" is preserved for the high case.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3ca413b0-a6fa-45c8-8443-83f6767716d8

📥 Commits

Reviewing files that changed from the base of the PR and between 6a5084d and 6c118b7.

📒 Files selected for processing (2)
  • autorouter.go
  • autorouter_test.go

Comment thread autorouter.go Outdated
Comment thread autorouter.go
@jhaynie jhaynie force-pushed the fix/streaming-metadata-deepseek-reasoning branch from 6c118b7 to 65ba022 Compare May 9, 2026 02:36
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@autorouter_test.go`:
- Around line 1468-1491: The assertions currently scan the whole stream body;
narrow them to only the metadata event slice by extracting the metadata chunk
between the "event: gateway.metadata" occurrence and the next event/terminal
marker and run the field checks against that chunk. Concretely, using the
existing body, find metadataIndex := strings.Index(body, "event:
gateway.metadata"), then find nextEventIndex :=
strings.Index(body[metadataIndex+1:], "\nevent: ") (or fallback to doneIndex :=
strings.Index(body, "data: [DONE]") if no next event) and slice metadataChunk :=
body[metadataIndex:metadataIndex+nextEventIndexOrDone]; then replace the current
strings.Contains(body, ...) assertions to strings.Contains(metadataChunk, ...)
for `"type":"gateway.metadata"`, `"total":0.0002`, `"promptTokens":100`, and
`"completionTokens":50` so you only assert fields present inside the metadata
event.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 60e92cbb-c391-4967-b3cd-23ef288b028c

📥 Commits

Reviewing files that changed from the base of the PR and between 6c118b7 and 65ba022.

📒 Files selected for processing (2)
  • autorouter.go
  • autorouter_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • autorouter.go
📜 Review details
🔇 Additional comments (4)
autorouter_test.go (4)

348-410: DeepSeek “reasoning: off” forwarding coverage is solid.

Good assertion set here: it verifies reasoning is stripped and thinking.type=disabled is forwarded upstream.


412-477: DeepSeek “reasoning: high” normalization is well-covered.

This test correctly validates the enabled thinking mode and expected reasoning_effort forwarding.


479-626: Streaming DeepSeek normalization tests are strong.

Nice coverage for both off/on streaming behavior, including stripping incompatible fields and forwarding provider-supported params.


628-645: Unknown reasoning passthrough behavior is validated cleanly.

This is a useful guardrail test to prevent over-normalizing unsupported values.

Comment thread autorouter_test.go
@jhaynie jhaynie force-pushed the fix/streaming-metadata-deepseek-reasoning branch from 65ba022 to 9399a1d Compare May 9, 2026 02:42
@jhaynie jhaynie merged commit b016666 into main May 9, 2026
2 checks passed
@jhaynie jhaynie deleted the fix/streaming-metadata-deepseek-reasoning branch May 9, 2026 02:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant