feat: Add Gemini 3.0 streaming support for Vertex AI#1185
Draft
spartandingo wants to merge 7 commits into 0xPlaygrounds:main from
Conversation
- Implement StreamingCompletionModel<HttpClient: HttpClientExt> for generic HTTP client support
- Uses Rig's GenericEventSource for SSE parsing with automatic retry handling
- Support Gemini 3.0 models (gemini-3-pro, gemini-3-flash) with extended thinking
- Tool calling support with function calls and thoughtSignature metadata
- Comprehensive token usage tracking (input, output, cached, thoughts)
- Version gating: only Gemini 3.0+ models supported with clear error messages
- 4 unit tests covering deserialization, tool calls, and token counting
- Remove model constants for Gemini 2.5 and lower (streaming unsupported)
- Add model constants for Gemini 3.0 variants
- Add `http` to workspace dependencies for Request builder
- Pattern aligns with Rig's own Gemini provider implementation

chore: Use direct http dependency for rig-vertexai instead of workspace
Force-pushed from 765d58f to 8d8e151
- streaming_endpoint() now returns full https://... URLs instead of relative paths
- Fixes 'RelativeUrlWithoutBase' error when creating HTTP requests
- Properly handles both global (Gemini 3) and regional endpoints
- All 22 tests passing
- Gemini 3 streaming uses aiplatform.googleapis.com, not {region}-aiplatform.googleapis.com
- Matches endpoint structure from working implementation
- Regional endpoints only for non-Gemini-3 models
- All 22 tests passing
…ertex AI streaming
- Expose credentials() method in VertexAI Client for manual authentication
- Clarify authentication requirements for StreamingCompletionModel
- Callers should pass authenticated HTTP clients with GCP Bearer tokens

This enables integrations to handle authentication via interceptors, middleware, or pre-configured auth headers for Vertex AI API requests.
- Implement GcpAuthMiddleware for injecting Bearer tokens via reqwest-middleware
- Add BearerToken type for managing GCP access tokens
- Update streaming.rs to clarify auth requirements
- Follows Rig's pattern of internal auth handling for clean provider APIs

This enables StreamingCompletionModel to work with authenticated HTTP clients that inject GCP Bearer tokens automatically for all Vertex AI requests.
- Remove BearerToken requirement from GcpAuthMiddleware constructor - Implement dynamic token fetching with caching on each request - Add token-source dependency for token management - Add Default trait implementation for convenience
- Simplify GcpAuthMiddleware to placeholder for future enhancements - Document authentication requirements for StreamingCompletionModel - Users should configure auth via reqwest-middleware or ADC - All 22 tests passing in rig-vertexai
Summary
Adds production-grade streaming support for Vertex AI Gemini 3.0 models (Pro and Flash) to the rig-vertexai integration. This implementation follows Rig framework conventions and patterns, using the generic `HttpClientExt` trait abstraction and `GenericEventSource` for SSE parsing.

Key Features

- Generic HTTP client support via the `HttpClientExt` trait
- `GenericEventSource` for reliable event stream handling with automatic retry
- `thoughtSignature` metadata from Gemini 3.0 models

Implementation Details
- Uses the `http::Request` builder instead of hardcoding specific HTTP client implementations
- Adds `http = "1.3.1"` as a dependency

Testing
Changes
- rig-integrations/rig-vertexai/src/streaming.rs - New streaming module (415 lines)
- rig-integrations/rig-vertexai/src/lib.rs - Export StreamingCompletionModel
- rig-integrations/rig-vertexai/src/completion.rs - Add Gemini 3.0 model constants
- rig-integrations/rig-vertexai/Cargo.toml - Add http dependency
- Cargo.toml - Add http to workspace dependencies