feat: add OpenAI to AWS Bedrock embeddings translation#1969

Open
Flgado wants to merge 5 commits into envoyproxy:main from Flgado:feat/openai-awsbedrock-titan-embeddings-translator

@Flgado Flgado commented Mar 17, 2026

Description
This PR adds translation support for the OpenAI /v1/embeddings endpoint targeting AWS Bedrock Titan Embed Text models (amazon.titan-embed-text-v1 and amazon.titan-embed-text-v2:0).

It follows the same patterns established by the existing openai_awsbedrock.go translator (JSON decoding strategy, error handling, path header construction via url.PathEscape, span recording).
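
As a rough illustration of the path construction mentioned above, the sketch below builds a Bedrock InvokeModel path from a model ID. The helper name invokePath is hypothetical and is not taken from the PR; only the use of url.PathEscape comes from the description.

```go
package main

import (
	"fmt"
	"net/url"
)

// invokePath builds an InvokeModel request path for the given model ID.
// The name is hypothetical; the translator's real helper may differ.
func invokePath(modelID string) string {
	// PathEscape encodes "/" (as found in ARN-style IDs) so the ID
	// remains a single path segment. A ":" is left as-is.
	return "/model/" + url.PathEscape(modelID) + "/invoke"
}

func main() {
	fmt.Println(invokePath("amazon.titan-embed-text-v2:0"))
	fmt.Println(invokePath("a/b"))
}
```

Escaping matters mostly for identifiers that contain a slash; the plain Titan model IDs pass through unchanged.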

Changes:

  • awsbedrock.go: Added TitanEmbeddingRequest and TitanEmbeddingResponse schema structs covering both Titan v1 and v2 fields (InputText, dimensions, normalize, embeddingTypes).
  • openai_awsbedrock_embeddings.go: New OpenAIEmbeddingTranslator implementation: translates EmbeddingRequest → TitanEmbeddingRequest, maps TitanEmbeddingResponse back to EmbeddingResponse, handles error responses (JSON and non-JSON), records token usage and span.
  • endpointspec.go: Wired NewEmbeddingOpenAIToAWSBedrockTranslator into the APISchemaAWSBedrock embedding path.
  • openai_awsbedrock_embeddings_test.go: 26 test cases across RequestBody, ResponseHeaders, ResponseBody, and ResponseError, covering Titan v1/v2 behavior, model override, batch rejection, dimensions forwarding, and error paths.
  • supported-endpoints.md: Updated embeddings provider list and compatibility table to reflect Titan support.
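
For readers unfamiliar with the two schemas, here is a minimal sketch of the request-side translation. The struct and function names are simplified, hypothetical stand-ins rather than the types added in awsbedrock.go, and the sketch assumes dimensions is forwarded whenever the caller sets it, which may not match the PR's handling of Titan v1.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// openAIEmbeddingRequest is a simplified, hypothetical stand-in for the
// gateway's EmbeddingRequest (single string input only).
type openAIEmbeddingRequest struct {
	Model      string `json:"model"`
	Input      string `json:"input"`
	Dimensions *int   `json:"dimensions,omitempty"`
}

// titanEmbeddingRequest carries the Titan fields named in the description.
// Optional fields are omitted from the JSON when unset.
type titanEmbeddingRequest struct {
	InputText      string   `json:"inputText"`
	Dimensions     *int     `json:"dimensions,omitempty"`
	Normalize      *bool    `json:"normalize,omitempty"`
	EmbeddingTypes []string `json:"embeddingTypes,omitempty"`
}

// toTitanBody maps the OpenAI-style request to a Titan request body.
// Assumption: dimensions is forwarded as-is when present.
func toTitanBody(req openAIEmbeddingRequest) ([]byte, error) {
	return json.Marshal(titanEmbeddingRequest{
		InputText:  req.Input,
		Dimensions: req.Dimensions,
	})
}

func main() {
	dims := 256
	body, err := toTitanBody(openAIEmbeddingRequest{
		Model:      "amazon.titan-embed-text-v2:0",
		Input:      "hello",
		Dimensions: &dims,
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

The model name does not appear in the Titan body because Bedrock takes it from the request path.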

What is NOT supported in this PR:

  • Batch embeddings: Titan InvokeModel accepts only a single inputText per request. The translator returns a 400 for len(input) != 1.
  • embeddingsByType (quantized outputs): Titan v2 can return quantized vectors under embeddingsByType, but the OpenAI response schema has no equivalent. The top-level embedding float array is always present and used.
  • Cohere Embed models (e.g. cohere.embed-english-v3): Cohere's embedding API uses texts (array) instead of inputText and requires a semantic input_type field with no OpenAI equivalent. This needs its own translator, planned as a follow-up PR.
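
The batch restriction above could be enforced with a check along these lines. This is a sketch only: singleInput and errSingleInputOnly are hypothetical names, and how the error is turned into a 400 response is left to the surrounding translator.

```go
package main

import (
	"errors"
	"fmt"
)

// errSingleInputOnly is a hypothetical sentinel that a caller could map
// to an HTTP 400 response.
var errSingleInputOnly = errors.New("titan embeddings accept exactly one input per request")

// singleInput returns the only element of inputs, or an error when the
// request carries zero or several inputs.
func singleInput(inputs []string) (string, error) {
	if len(inputs) != 1 {
		return "", fmt.Errorf("%w: got %d", errSingleInputOnly, len(inputs))
	}
	return inputs[0], nil
}

func main() {
	if _, err := singleInput([]string{"a", "b"}); err != nil {
		fmt.Println("rejected:", err)
	}
	text, _ := singleInput([]string{"hello"})
	fmt.Println("accepted:", text)
}
```

Note that the check rejects an empty input list as well as a batch, matching the len(input) != 1 condition.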

Special notes/questions for reviewers

Two design decisions I'd like maintainer input on before considering follow-up work:

1. Should normalize be user-configurable via vendor fields?

Currently normalize (Titan v2 only) is always left unset, so Titan's default (true) applies. The GCP embeddings translator exposes a similar per-provider option via GCPVertexAIEmbeddingVendorFields.TaskType. Should we follow the same pattern and add an AWSBedrockTitanEmbeddingVendorFields struct with normalize, or is leaving it at the default acceptable? The same question applies to embeddingTypes, which controls whether embeddingsByType is returned.
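
To make the option concrete, one possible shape for such a vendor-fields struct is sketched below. The struct name is the one proposed above, but its fields, the titanBody type, and buildBody are assumptions for illustration and do not exist in this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AWSBedrockTitanEmbeddingVendorFields is the struct proposed above.
// Its shape here is an assumption, not part of this PR.
type AWSBedrockTitanEmbeddingVendorFields struct {
	// Normalize is a pointer so that "unset" and "false" stay distinct.
	Normalize *bool `json:"normalize,omitempty"`
}

// titanBody is a trimmed, hypothetical Titan request body.
type titanBody struct {
	InputText string `json:"inputText"`
	Normalize *bool  `json:"normalize,omitempty"`
}

// buildBody copies normalize only when the caller set it; otherwise the
// field is omitted and Titan applies its own default.
func buildBody(input string, vf *AWSBedrockTitanEmbeddingVendorFields) (string, error) {
	body := titanBody{InputText: input}
	if vf != nil {
		body.Normalize = vf.Normalize
	}
	b, err := json.Marshal(body)
	return string(b), err
}

func main() {
	off := false
	unset, _ := buildBody("hello", nil)
	explicit, _ := buildBody("hello", &AWSBedrockTitanEmbeddingVendorFields{Normalize: &off})
	fmt.Println(unset)
	fmt.Println(explicit)
}
```

Using a pointer keeps the current behavior for callers that do not set the field, so the change would be backward compatible.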

2. Should embeddingsByType be surfaced in the response?

Titan v2 can return quantized binary vectors under embeddingsByType alongside the standard float embedding. This PR always discards them and returns only the float array. Exposing them would require either a vendor extension on EmbeddingResponse or a new field, since there is no OpenAI equivalent. Should this be in scope, and if so, what is the preferred approach?
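
The current behavior can be pictured with the following sketch of the response mapping. All type and function names are hypothetical simplifications, and copying inputTextTokenCount into both prompt_tokens and total_tokens is an assumption about how usage is reported.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// titanEmbeddingResponse declares only the fields this sketch uses.
// embeddingsByType is deliberately not declared: encoding/json ignores
// unknown keys, so any quantized vectors are dropped on decode.
type titanEmbeddingResponse struct {
	Embedding           []float64 `json:"embedding"`
	InputTextTokenCount int       `json:"inputTextTokenCount"`
}

type embeddingItem struct {
	Object    string    `json:"object"`
	Index     int       `json:"index"`
	Embedding []float64 `json:"embedding"`
}

type usage struct {
	PromptTokens int `json:"prompt_tokens"`
	TotalTokens  int `json:"total_tokens"`
}

// openAIEmbeddingResponse is a simplified stand-in for EmbeddingResponse.
type openAIEmbeddingResponse struct {
	Object string          `json:"object"`
	Data   []embeddingItem `json:"data"`
	Model  string          `json:"model"`
	Usage  usage           `json:"usage"`
}

// fromTitan maps a Titan response body to the OpenAI-style shape.
func fromTitan(body []byte, model string) (openAIEmbeddingResponse, error) {
	var t titanEmbeddingResponse
	if err := json.Unmarshal(body, &t); err != nil {
		return openAIEmbeddingResponse{}, fmt.Errorf("decode titan response: %w", err)
	}
	return openAIEmbeddingResponse{
		Object: "list",
		Data:   []embeddingItem{{Object: "embedding", Index: 0, Embedding: t.Embedding}},
		Model:  model,
		// Assumption: embeddings have no completion tokens, so both counts match.
		Usage: usage{PromptTokens: t.InputTextTokenCount, TotalTokens: t.InputTextTokenCount},
	}, nil
}

func main() {
	raw := []byte(`{"embedding":[0.1,0.2],"inputTextTokenCount":3,"embeddingsByType":{"binary":[1,0]}}`)
	resp, err := fromTitan(raw, "amazon.titan-embed-text-v2:0")
	if err != nil {
		panic(err)
	}
	out, _ := json.Marshal(resp)
	fmt.Println(string(out))
}
```

Surfacing embeddingsByType would mean adding a field to the response type here, which is exactly the schema question raised above.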

Amazon Titan Embeddings docs for reference: https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-titan-embed-text.html

Signed-off-by: Joao Folgado <jfolgado94@gmail.com>
@Flgado Flgado requested a review from a team as a code owner March 17, 2026 23:44
@dosubot dosubot bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label Mar 18, 2026
nacx (Member) commented Mar 18, 2026

/gemini review

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces support for translating OpenAI embedding requests to AWS Bedrock's Titan embedding models. The changes are well-structured, following existing patterns for translators and including comprehensive test coverage. The new request/response schemas, translator logic, and endpoint wiring are all correctly implemented.

I have one suggestion regarding the error handling logic for passthrough errors to ensure consistency and prevent potential loss of header information. Otherwise, the implementation is excellent.

codecov-commenter commented Mar 18, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.38%. Comparing base (76bb630) to head (228bacd).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1969      +/-   ##
==========================================
+ Coverage   84.21%   84.38%   +0.16%     
==========================================
  Files         128      129       +1     
  Lines       17832    18030     +198     
==========================================
+ Hits        15018    15215     +197     
  Misses       1871     1871              
- Partials      943      944       +1     

Flgado added 3 commits March 18, 2026 21:49
…span recording test and error path coverage.

Signed-off-by: Joao Folgado <jfolgado94@gmail.com>
Signed-off-by: Joao Folgado <jfolgado94@gmail.com>
Signed-off-by: Joao Folgado <jfolgado94@gmail.com>