Weekly Update — May 19 – June 1, 2026 #2178

missBerg · 2026-06-01T14:54:56Z

missBerg
Jun 1, 2026
Maintainer

Note: This is a two-week update covering 2026-05-19 → 2026-06-01 — no newsletter went out last week.

Two weeks of post-v0.6 polish — streaming/SSE robustness on the Responses and MCP paths, translation-coverage edges (audio/video content, Anthropic beta headers, typeless assistant turns, custom Anthropic prefix), and a couple of stability fixes against ext-proc panics. Thanks to everyone who shipped, reviewed, and weighed in.

✨ What's new

Provider & translation coverage

audio_url and video_url content types in OpenAI schema — #2136 by @cjackal. OpenAI-shape clients can now send multimodal inputs to backends that consume audio/video (phi-4-mm, qwen3.5, and other OpenAI-compatible multimodal servers). Schema-only change, no API churn. Fixes #2035.
anthropic-beta header mapped on the AWSAnthropic backend — #2148 by @CodePrometheus. Bedrock takes Anthropic beta opt-ins in the body, not the header, so the translator now lifts anthropic-beta: context-1m-2025-08-07 into anthropic_beta: ["context-1m-2025-08-07"] (comma-separated values are split and trimmed). Fixes #2147.
prefix field honored on the Anthropic backend — #2108 by @ajac-zero. VersionedAPISchema.prefix had been silently dropped for Anthropic (the request path was hardcoded to /v1/messages); it's now threaded through schemaToFilterAPI, the endpoint spec, and the anthropicToAnthropicTranslator so operators can target Anthropic-compatible backends served under a custom prefix like /gateway/v1/messages. Default behavior is unchanged — prefix defaults to "v1" when unset. AWSAnthropic and GCPAnthropic still ignore the field.
Typeless assistant output messages on /v1/responses — #2172 by @dzibma. Multi-turn Responses inputs from clients like OpenCode (assistant turns serialized without an explicit type: "message") now parse as output messages instead of failing the EasyInputMessageParam unmarshal on output_text.
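
For a feel of the anthropic-beta mapping from #2148, here is a minimal Go sketch of the split-and-trim step. The function names and the map-based request body are illustrative assumptions, not the translator's actual code, and dropping empty entries after a stray comma is a guess at reasonable behavior rather than something the PR is known to do.

```go
package main

import "strings"

// splitBetaHeader turns a raw anthropic-beta header value into a list of
// flags. Hypothetical helper: values are split on commas and trimmed;
// empty entries are dropped (an assumption of this sketch).
func splitBetaHeader(value string) []string {
	var betas []string
	for _, part := range strings.Split(value, ",") {
		if trimmed := strings.TrimSpace(part); trimmed != "" {
			betas = append(betas, trimmed)
		}
	}
	return betas
}

// applyBetaHeader writes the flags into body["anthropic_beta"], the body
// field named in the update. The body is left untouched when the header
// carries no flags.
func applyBetaHeader(body map[string]any, headerValue string) {
	if betas := splitBetaHeader(headerValue); len(betas) > 0 {
		body["anthropic_beta"] = betas
	}
}
```

Calling applyBetaHeader with the header value context-1m-2025-08-07 sets anthropic_beta to a one-element list holding that flag, which matches the example in the item above.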

Routing

Hostname on AIGatewayRoute — #2160 by @xianml. AIGatewayRoute is no longer effectively cluster-scoped — you can now bind a route (and its /v1/models listing) to a specific hostname like xx.api.aieg.com, so different host-fronted model groups stop bleeding into each other's listings. Lands the rebased version of the long-running #1987; discussed in #1646.

Streaming & MCP

SSE event buffering on the Responses passthrough — #2163 by @CodePrometheus. In STREAMED body mode ext_proc can split a single SSE event across ResponseBody calls; the translator now holds incomplete bytes and parses only complete \n\n-terminated events, so a split response.completed no longer silently drops token usage. Fixes #2162.
Optional-space SSE parser — #2155 by @sriyer. The MCP proxy's parseEvent was matching on data: / event: with a mandatory space and silently dropping lines from backends that emit data:{...} per spec (e.g. Spring Boot's SseEmitter) — tools/list / prompts/list aggregations were coming back empty as a result. Both forms now parse the same way. Fixes #2149.
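
The two SSE fixes above (#2163 and #2155) share one idea: only parse complete events, and treat the space after the colon as optional. The Go sketch below shows both under simplifying assumptions. The type and function names are hypothetical, only \n line endings are handled, and only the event and data fields are read; it is not the gateway's implementation.

```go
package main

import (
	"bytes"
	"strings"
)

// sseEvent holds the two fields this sketch cares about.
type sseEvent struct {
	Event string
	Data  string
}

// sseBuffer accumulates body chunks and releases only complete events.
type sseBuffer struct {
	pending []byte
}

// Feed appends a chunk and returns every event terminated by a blank line.
// Bytes after the last "\n\n" stay buffered until a later call completes them.
func (b *sseBuffer) Feed(chunk []byte) []sseEvent {
	b.pending = append(b.pending, chunk...)
	var events []sseEvent
	for {
		end := bytes.Index(b.pending, []byte("\n\n"))
		if end < 0 {
			return events
		}
		events = append(events, parseSSEFields(string(b.pending[:end])))
		b.pending = b.pending[end+2:]
	}
}

// parseSSEFields reads the "event" and "data" fields of one event.
// A single space after the colon is stripped when present, so
// "data: x" and "data:x" yield the same value.
func parseSSEFields(raw string) sseEvent {
	var ev sseEvent
	var data []string
	for _, line := range strings.Split(raw, "\n") {
		name, value, found := strings.Cut(line, ":")
		if !found {
			continue // lines without a colon are ignored in this sketch
		}
		value = strings.TrimPrefix(value, " ")
		switch name {
		case "event":
			ev.Event = value
		case "data":
			data = append(data, value)
		}
	}
	ev.Data = strings.Join(data, "\n")
	return ev
}
```

Feeding a response.completed event in two chunks returns nothing on the first call and the whole event on the second, which is the property that keeps token usage from being dropped.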

Bug fixes & stability

Nil-guard on AWS Bedrock response Output — #2157 by @siddharth1036. A 200 response without output.message (AWS Coral routing errors when the cluster points at bedrock.<region> instead of bedrock-runtime.<region>, guardrail interventions, empty bodies) was dereferencing Output.Message.Role and panicking the ext-proc — now it returns a descriptive error including the Bedrock stopReason instead of taking the pod into a panic loop.
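
The shape of the guard from #2157 can be sketched in a few lines of Go. The struct definitions here are trimmed-down stand-ins invented for illustration (the real Bedrock response types carry far more fields), and the error wording is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical, minimal stand-ins for the Bedrock response types.
type message struct{ Role string }
type output struct{ Message *message }
type converseResponse struct {
	Output     *output
	StopReason *string
}

// responseRole returns the message role, or a descriptive error when the
// response has no output.message. Checking each pointer before use is what
// replaces the nil dereference that used to panic.
func responseRole(resp *converseResponse) (string, error) {
	if resp == nil {
		return "", errors.New("bedrock response is empty")
	}
	if resp.Output == nil || resp.Output.Message == nil {
		stop := "unknown"
		if resp.StopReason != nil {
			stop = *resp.StopReason
		}
		return "", fmt.Errorf("bedrock response has no output.message (stopReason=%s)", stop)
	}
	return resp.Output.Message.Role, nil
}
```

Returning an error keeps the failure scoped to the single request instead of the whole process.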

Developer experience

HTTP/2 client traffic policy docs — #2131 by @taiman724. Documents http2.initialStreamWindowSize and initialConnectionWindowSize on ClientTrafficPolicy — the separate flow-control cap that keeps causing 413 request_payload_too_large even after operators bump bufferLimit. Fixes #2130.

💡 New design proposals & feature requests

A few new threads worth a read if any of these are in your wheelhouse:

Native Gemini /v1beta/models/<m>:generateContent client schema — #2165 by @jaimeluengo. Adds a GoogleGenAI client schema so the gateway can accept Gemini-native requests (path-extracted model, x-goog-api-key auth, SSE :streamGenerateContent) and translate to any backend — the missing counterpart to today's Anthropic and OpenAI client schemas. Unlocks @google/genai, gemini-cli, and langchain-google clients, which otherwise can't talk to the gateway at all. Author has offered to drive the PR in scoped increments.
One shared MCP-proxy Backend per Gateway — #2150 by @kanurag94. The MCPRoute controller currently mints one placeholder Backend (all pointing at the same dummy IP, all rewritten by the extension server to the same in-pod 127.0.0.1:9856) per route — 86 routes mean 86 redundant CDS clusters with their own connection-pool and stats state. Proposes a single Gateway-owned shared Backend instead; author has offered a PR.
Responses API rejects namespace and tool_search tool types — #2164 by @jaimeluengo. The strict ResponseToolUnion unmarshaller bails on two type values that OpenAI Codex emits today (namespace for tool grouping, tool_search for dynamic-tool discovery) — requests are hard-rejected at 400 before the model header is even extracted, breaking retry/spillover paths. Proposes two struct additions to the lenient list; author has a PR ready to send.
InferencePool backend ignores cross-namespace refs — #2173 by @ammarasyad. AIGatewayRoute docs advertise cross-namespace backendRefs with ReferenceGrant, but for an InferencePool backend the generated HTTPRoute rewrites the namespace to the AIGatewayRoute's own — silently, with Accepted: True. Includes a precise root-cause pointer into internal/controller/ai_gateway_route.go.
Anthropic → OpenAI translator surfaces 500 on numeric upstream error.code — #2151 by @fdaforno. vLLM (and other OpenAI-compatible backends) emit code as a JSON number; the translator's error struct expects a string, the unmarshal fails, and the client sees an empty HTTP 500 instead of the actual 4xx diagnostic. Two proposed paths: relax the type to accept both, or fall back to pass-through when the error body can't be parsed.
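
As a rough illustration of the first path proposed in #2151 (relaxing the type so code accepts both forms), the Go sketch below uses a custom UnmarshalJSON. The type names and the struct layout are hypothetical and are not taken from the gateway's translator.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// flexibleCode accepts a JSON string, number, or null and stores it as text.
type flexibleCode string

// UnmarshalJSON normalizes "code": "invalid_request" and "code": 400
// to the same Go type instead of failing on the numeric form.
func (c *flexibleCode) UnmarshalJSON(raw []byte) error {
	raw = bytes.TrimSpace(raw)
	if bytes.Equal(raw, []byte("null")) {
		*c = ""
		return nil
	}
	if len(raw) > 0 && raw[0] == '"' {
		var s string
		if err := json.Unmarshal(raw, &s); err != nil {
			return err
		}
		*c = flexibleCode(s)
		return nil
	}
	var n json.Number
	if err := json.Unmarshal(raw, &n); err != nil {
		return fmt.Errorf("error code is neither string nor number: %w", err)
	}
	*c = flexibleCode(n.String())
	return nil
}

// upstreamError is an assumed, simplified OpenAI-style error body.
type upstreamError struct {
	Error struct {
		Message string       `json:"message"`
		Code    flexibleCode `json:"code"`
	} `json:"error"`
}

// parseUpstreamError decodes an error body, tolerating numeric codes.
func parseUpstreamError(body []byte) (upstreamError, error) {
	var parsed upstreamError
	err := json.Unmarshal(body, &parsed)
	return parsed, err
}
```

The second proposed path, falling back to pass-through when the body cannot be parsed, would hang off the error returned by parseUpstreamError in this sketch.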

👀 PRs looking for review

If you have time to help review, these are open and waiting:

#2134 — Stop embedding plaintext API keys in MCP backend HTTPRoutes by @aishwaryaraimule21 — security fix for #2141; still the leading 0.6 patch candidate.
#2175 — Source MCP session encryption seed from a Secret by @walsm232 — keeps the MCP encryption seed out of Deployment args, Pod specs, and etcd. Fixes #2145.
#2166 — Allow InferencePool as a backend reference on AIServiceBackend by @isztldav — lets the inference-pool backend flow through the same AIServiceBackend surface as other providers.
#2169 — Time-to-first-token timeout with rollover to a new model by @albe2669 — failover trigger for slow-starting upstreams, not just hard errors.
#2052 — Proposal: OAuth 2.0 Token Exchange as upstream auth for MCP backends by @nacx — design doc for the auth path that's been a recurring 👀 entry; reviewer eyes welcome.
#1869 — Backend quota rate-limit filter for QuotaPolicy by @yuzisun — still the 0.7 quota work; unit-transformation comments between the API and the rate-limit service are the blocker.

🙏 Thanks to this week's contributors

@cjackal, @CodePrometheus, @dzibma, @xianml, @sriyer, @siddharth1036, @taiman724, @ajac-zero, @aishwaryaraimule21, @walsm232, @isztldav, @albe2669, @nacx, @yuzisun, @hustxiayang, @nuthalapativarun, @ChrisJBurns, @anurags25, @sc7565, @mtparet, @mturac, @arpitjain099, @immanuwell, @PatilHrushikesh, @sivanantha321, @jaimeluengo, @ammarasyad, @kanurag94, @fdaforno, @missBerg — and everyone who showed up to triage, review, and discuss.

See you next week!
