feat: openAI Files API Implementation by sivanantha321 · Pull Request #1982 · envoyproxy/ai-gateway

sivanantha321 · 2026-03-23T13:13:04Z

Description

This PR implements the OpenAI Files API in the Envoy AI Gateway, adding support for four file operations. The Files API is a prerequisite for the Batch Processing API (proposal #007), because batch jobs reference files uploaded via this API.

Implemented Endpoints

| Endpoint | Method | Path | Description |
| --- | --- | --- | --- |
| Upload file | `POST` | `/v1/files` | Upload a file via `multipart/form-data` with purpose, expiration, and a `model_name` extra field for routing |
| Retrieve file | `GET` | `/v1/files/{file_id}` | Fetch file metadata by ID |
| Retrieve file content | `GET` | `/v1/files/{file_id}/content` | Download raw file content |
| Delete file | `DELETE` | `/v1/files/{file_id}` | Delete a file by ID |

File ID Encoding & Multi-Backend Routing

A core challenge of the Files API in a gateway context is routing stickiness: once a file is uploaded to a specific backend (e.g., OpenAI, Azure), all subsequent operations on that file (retrieve, retrieve content, delete) must be routed to the same backend. Otherwise, the backend will return a "file not found" error.

To solve this without requiring the client to pass extra routing headers on every request, the gateway encodes the model/backend information directly into the file ID returned to the client. This is inspired by LiteLLM's approach.

Encoding Format

Original ID:  file-abc123
Encoded ID:   file-<base64url(id:file-abc123;model:gpt-4o-mini)>

Concretely:   file-aWQ6ZmlsZS1hYmMxMjM7bW9kZWw6Z3B0LTRvLW1pbmk
              └─┬─┘└──────────────────┬──────────────────────────┘
             prefix   base64url(id:file-abc123;model:gpt-4o-mini)

Key properties:

Preserves OpenAI-compatible prefixes (file-, batch_) so the ID looks valid to clients and SDKs
Transparent to clients — clients use the encoded ID as-is; they don't need to know it contains routing info
Enables automatic routing — the gateway decodes the file ID on subsequent requests to extract the model name and route to the correct backend
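
The encoding format above can be sketched in a few lines of Go. This is a minimal illustration, not the PR's code: `encodeFileID` and `decodeFileID` are hypothetical names, and the sketch assumes unpadded base64url (which is what the concrete example ID above corresponds to) and handles only the `file-` prefix.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

const filePrefix = "file-"

// encodeFileID is a hypothetical helper (not the PR's EncodeIDWithModel).
// It builds the payload "id:<id>;model:<model>", base64url-encodes it
// without padding, and keeps the OpenAI-style "file-" prefix.
func encodeFileID(upstreamID, model string) string {
	payload := fmt.Sprintf("id:%s;model:%s", upstreamID, model)
	return filePrefix + base64.RawURLEncoding.EncodeToString([]byte(payload))
}

// decodeFileID reverses encodeFileID and returns the upstream ID and model.
func decodeFileID(encoded string) (upstreamID, model string, err error) {
	if !strings.HasPrefix(encoded, filePrefix) {
		return "", "", errors.New("missing file- prefix")
	}
	raw, err := base64.RawURLEncoding.DecodeString(encoded[len(filePrefix):])
	if err != nil {
		return "", "", fmt.Errorf("not a gateway-encoded ID: %w", err)
	}
	// A plain upstream ID may happen to be valid base64, so the payload
	// layout is checked as well.
	idPart, modelPart, ok := strings.Cut(string(raw), ";model:")
	if !ok || !strings.HasPrefix(idPart, "id:") {
		return "", "", errors.New("unexpected payload layout")
	}
	return strings.TrimPrefix(idPart, "id:"), modelPart, nil
}

func main() {
	enc := encodeFileID("file-abc123", "gpt-4o-mini")
	fmt.Println(enc)
	id, model, err := decodeFileID(enc)
	fmt.Println(id, model, err)
}
```

Encoding `file-abc123` with model `gpt-4o-mini` this way yields the same string as the concrete ID shown in the format diagram.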

Routing Flow — Example API Usage

Step 1: Upload a file — Client includes model_name as an extra multipart field:

curl -X POST https://gateway.example.com/v1/files \
  -H "Authorization: Bearer $API_KEY" \
  -F "purpose=fine-tune" \
  -F "file=@training_data.jsonl" \
  -F "model_name=gpt-4o-mini"

The gateway routes to the backend mapped to gpt-4o-mini, and on the response, encodes the file ID:

{
  "id": "file-aWQ6ZmlsZS1hYmMxMjM7bW9kZWw6Z3B0LTRvLW1pbmk",
  "object": "file",
  "purpose": "fine-tune",
  "filename": "training_data.jsonl"
}

Step 2: Retrieve the file — Client uses the encoded ID as-is:

curl https://gateway.example.com/v1/files/file-aWQ6ZmlsZS1hYmMxMjM7bW9kZWw6Z3B0LTRvLW1pbmk \
  -H "Authorization: Bearer $API_KEY"

The gateway:

Decodes the file ID → extracts model: gpt-4o-mini and original ID: file-abc123
Routes to the correct backend using the model name
Sends GET /v1/files/file-abc123 to the upstream
Re-encodes the file ID in the response before returning to the client

Step 3: Retrieve file content:

curl https://gateway.example.com/v1/files/file-aWQ6ZmlsZS1hYmMxMjM7bW9kZWw6Z3B0LTRvLW1pbmk/content \
  -H "Authorization: Bearer $API_KEY"

Step 4: Delete the file:

curl -X DELETE https://gateway.example.com/v1/files/file-aWQ6ZmlsZS1hYmMxMjM7bW9kZWw6Z3B0LTRvLW1pbmk \
  -H "Authorization: Bearer $API_KEY"

The same decode → route → re-encode pattern applies for all operations.
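
The request side of that pattern can be sketched as follows. This is an illustration under stated assumptions rather than the gateway's implementation: `resolve`, `upstreamTarget`, and the path regex are hypothetical, and the ID payload layout is taken from the encoding format described earlier.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"regexp"
	"strings"
)

// filePath matches /v1/files/{file_id} with an optional /content suffix.
var filePath = regexp.MustCompile(`^/v1/files/([^/]+)(/content)?$`)

// upstreamTarget is a hypothetical result type: the model used to pick a
// backend, and the path to send to that backend.
type upstreamTarget struct {
	Model string
	Path  string
}

// resolve decodes the gateway-issued ID found in the request path and
// rebuilds the path around the upstream ID.
func resolve(path string) (upstreamTarget, error) {
	m := filePath.FindStringSubmatch(path)
	if m == nil {
		return upstreamTarget{}, fmt.Errorf("not a file path: %q", path)
	}
	raw, err := base64.RawURLEncoding.DecodeString(strings.TrimPrefix(m[1], "file-"))
	if err != nil {
		return upstreamTarget{}, fmt.Errorf("decode file ID: %w", err)
	}
	// Payload layout assumed here: id:<upstream id>;model:<model name>
	idPart, model, ok := strings.Cut(string(raw), ";model:")
	if !ok || !strings.HasPrefix(idPart, "id:") {
		return upstreamTarget{}, fmt.Errorf("unexpected ID payload in %q", m[1])
	}
	return upstreamTarget{
		Model: model,
		// m[2] is "/content" or empty, so the suffix is preserved.
		Path: "/v1/files/" + strings.TrimPrefix(idPart, "id:") + m[2],
	}, nil
}

func main() {
	got, err := resolve("/v1/files/file-aWQ6ZmlsZS1hYmMxMjM7bW9kZWw6Z3B0LTRvLW1pbmk/content")
	fmt.Println(got.Model, got.Path, err)
}
```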

Implementation Details

The encoding/decoding is implemented via two functions in internal/translator/util.go:

EncodeIDWithModel(id, modelName, idType) — used in ResponseBody of CreateFile translator to encode the file ID returned by the upstream
DecodeFileID(encodedID) — used in the processor/server layer to extract the model name and original ID from incoming requests

Internal headers (OriginalFileIDHeaderKey, DecodedFileIDHeaderKey) carry the original and decoded file IDs through the request processing pipeline so that translators for Retrieve, RetrieveContent, and Delete can:

Set the correct upstream path using the decoded (original) file ID
Restore the encoded file ID in the response body for the client
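
The response-side step can be sketched like this. It is a hedged illustration only: `restoreClientID` is a hypothetical function, and it assumes the upstream response is a JSON object with a top-level `id` field, as in the upload response shown earlier. Because the body length changes, the sketch also returns a new Content-Length value.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// restoreClientID replaces the upstream "id" in a JSON file object with the
// ID the client sent, leaving every other field untouched. It returns the new
// body and the matching Content-Length value.
func restoreClientID(upstreamBody []byte, clientID string) ([]byte, string, error) {
	// RawMessage keeps the other fields byte-for-byte instead of re-typing them.
	var obj map[string]json.RawMessage
	if err := json.Unmarshal(upstreamBody, &obj); err != nil {
		return nil, "", fmt.Errorf("parse upstream body: %w", err)
	}
	id, err := json.Marshal(clientID)
	if err != nil {
		return nil, "", fmt.Errorf("encode client ID: %w", err)
	}
	obj["id"] = id
	out, err := json.Marshal(obj)
	if err != nil {
		return nil, "", fmt.Errorf("encode response body: %w", err)
	}
	return out, strconv.Itoa(len(out)), nil
}

func main() {
	upstream := []byte(`{"id":"file-abc123","object":"file","purpose":"fine-tune"}`)
	body, contentLength, err := restoreClientID(upstream, "file-aWQ6ZmlsZS1hYmMxMjM7bW9kZWw6Z3B0LTRvLW1pbmk")
	fmt.Println(string(body), contentLength, err)
}
```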

Tracing: Non-Standard OpenInference Approach

The OpenInference specification does not define semantic conventions for file operations. OpenInference has well-defined span types for LLM/chat completions, embeddings, etc., but file upload/retrieve/delete are not part of the specification.

This PR implements tracing for file operations using a non-standard adaptation of the OpenInference conventions:

| Recorder | Span Name | Request Type | Response Type | Notes |
| --- | --- | --- | --- | --- |
| `CreateFileRecorder` | `CreateFile` | `openai.FileNewParams` | `openai.FileObject` | Records `output.file_id` on success |
| `RetrieveFileRecorder` | `RetrieveFile` | `struct{}` (no body) | `openai.FileObject` | Records `output.file_id` on success |
| `RetrieveFileContentRecorder` | `RetrieveFileContent` | `struct{}` (no body) | `struct{}` (raw bytes) | No output attributes (binary content) |
| `DeleteFileRecorder` | `DeleteFile` | `struct{}` (no body) | `openai.FileDeleted` | Records `output.file_id` on success |

What's non-standard:

All four recorders set openinference.span.kind = "LLM" and llm.system = "openai" — borrowing from existing OpenInference conventions even though file operations are not LLM inference calls. This is because OpenInference has no dedicated span kind for file/storage operations.
Span kind is set to trace.SpanKindInternal (consistent with other OpenInference spans in the codebase).
Request attributes are minimal — only openinference.span.kind and llm.system are set, since file operations don't have model/prompt/token semantics.
Response attributes use custom keys like output.file_id and output.mime_type which are not part of the OpenInference spec.
RetrieveFileContent records no output attributes at all, since the response is raw binary file data.
All recorders embed NoopChunkRecorder since file operations are never streamed.

Key Changes Across 65 Files (+4,501 / -245)

API Schema Types (internal/apischema/openai/openai.go):

New types: FileNewParams, FileObject, FileDeleted, FilePurpose, FileObjectPurpose, FileObjectStatus, FileNewParamsExpiresAfter
UnmarshalMultipart / MarshalMultipart methods on FileNewParams for handling multipart/form-data encoding

Endpoint Specs (internal/endpointspec/endpointspec.go):

New endpoint specs: CreateFileEndpointSpec, RetrieveFileEndpointSpec, RetrieveFileContentEndpointSpec, DeleteFileEndpointSpec
ParseBody signature extended with requestHeaders map[string]string parameter to support Content-Type parsing for multipart requests
CreateFileEndpointSpec.ParseBody validates multipart/form-data Content-Type and boundary, then parses the body via UnmarshalMultipart
Retrieve/Delete specs return empty struct{} since these are body-less operations (GET/DELETE)
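
The multipart validation and parsing described for the create-file spec can be sketched with the standard library. This is an assumption-laden illustration, not the PR's `UnmarshalMultipart`: the `upload` struct and `parseUpload` function are hypothetical, and only the `purpose`, `model_name`, and `file` fields are handled.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"mime"
	"mime/multipart"
)

// upload is a hypothetical, trimmed-down stand-in for FileNewParams.
type upload struct {
	Purpose   string
	ModelName string
	Filename  string
	Content   []byte
}

// parseUpload checks the Content-Type and boundary, then reads the parts.
func parseUpload(contentType string, body []byte) (*upload, error) {
	mediaType, params, err := mime.ParseMediaType(contentType)
	if err != nil || mediaType != "multipart/form-data" {
		return nil, errors.New("request Content-Type must be multipart/form-data")
	}
	boundary := params["boundary"]
	if boundary == "" {
		return nil, errors.New("multipart boundary is missing")
	}
	u := &upload{}
	r := multipart.NewReader(bytes.NewReader(body), boundary)
	for {
		part, err := r.NextPart()
		if errors.Is(err, io.EOF) {
			break
		}
		if err != nil {
			return nil, fmt.Errorf("read part: %w", err)
		}
		data, err := io.ReadAll(part)
		if err != nil {
			return nil, fmt.Errorf("read part body: %w", err)
		}
		switch part.FormName() {
		case "purpose":
			u.Purpose = string(data)
		case "model_name":
			u.ModelName = string(data)
		case "file":
			u.Filename, u.Content = part.FileName(), data
		}
	}
	// model_name is the gateway-specific field used for routing.
	if u.ModelName == "" {
		return nil, errors.New("model_name is required")
	}
	return u, nil
}

func main() {
	var buf bytes.Buffer
	w := multipart.NewWriter(&buf)
	_ = w.WriteField("purpose", "fine-tune")
	_ = w.WriteField("model_name", "gpt-4o-mini")
	fw, _ := w.CreateFormFile("file", "training_data.jsonl")
	_, _ = fw.Write([]byte("{}\n"))
	_ = w.Close()
	u, err := parseUpload(w.FormDataContentType(), buf.Bytes())
	fmt.Println(u, err)
}
```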

Server Routing (internal/extproc/server.go):

Refactored from exact string matching to regex + HTTP method-based routing to support parameterized paths (e.g., /v1/files/{file_id}) and method differentiation (GET vs DELETE on the same path)
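
A minimal sketch of regex plus method matching follows, to show why an exact-match map is not enough for these endpoints. The `route` type, the patterns, and the string handler names are hypothetical stand-ins; the PR's actual route type and processor factories are not reproduced here.

```go
package main

import (
	"fmt"
	"net/http"
	"regexp"
)

// route pairs an HTTP method and a path pattern with a handler name.
// The name stands in for a processor factory in this sketch.
type route struct {
	method  string
	pattern *regexp.Regexp
	name    string
}

// GET and DELETE share the same path pattern and differ only by method,
// which an exact-match map keyed on the path alone cannot express.
var routes = []route{
	{http.MethodPost, regexp.MustCompile(`^/v1/files$`), "CreateFile"},
	{http.MethodGet, regexp.MustCompile(`^/v1/files/[^/]+/content$`), "RetrieveFileContent"},
	{http.MethodGet, regexp.MustCompile(`^/v1/files/[^/]+$`), "RetrieveFile"},
	{http.MethodDelete, regexp.MustCompile(`^/v1/files/[^/]+$`), "DeleteFile"},
}

// match returns the first route whose method and pattern both match.
func match(method, path string) (string, bool) {
	for _, r := range routes {
		if r.method == method && r.pattern.MatchString(path) {
			return r.name, true
		}
	}
	return "", false
}

func main() {
	fmt.Println(match(http.MethodGet, "/v1/files/file-abc123"))
	fmt.Println(match(http.MethodDelete, "/v1/files/file-abc123"))
	fmt.Println(match(http.MethodGet, "/v1/files/file-abc123/content"))
}
```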

Processor (internal/extproc/processor_impl.go):

Added ProcessRequestHeaders support for body-less requests (GET/DELETE/HEAD) so routing can be initialized without waiting for a request body
New initRequest extracted as a shared initialization path for both header-phase and body-phase processing

Translators (internal/translator/openai_file.go):

New OpenAI-to-OpenAI passthrough translators for all four file endpoints
EncodeIDWithModel / DecodeFileID in internal/translator/util.go for multi-backend routing via encoded file IDs

Metrics (internal/metrics/noop_metrics.go):

New NoopMetrics / NoopMetricsFactory for endpoints that don't yet have dedicated metrics

Tests:

End-to-end data-plane tests covering upload, retrieve, retrieve content, and delete
Unit tests for EncodeIDWithModel / DecodeFileID round-trip encoding
Updated existing endpoint spec and server tests to match new ParseBody signature and regex-based routing

Related Issues/PRs (if applicable)

Related: Proposal #007 — Batch Processing API Support

Special notes for reviewers (if applicable)

The ParseBody interface change (added requestHeaders parameter) touches all existing endpoint spec implementations — each gains an unused _ map[string]string parameter to satisfy the interface.
The server routing refactor from an exact-match map[string]ProcessorFactory to a []Route with regex + HTTP method matching is a significant architectural change that affects all existing route registrations.
Metrics for file API endpoints currently use NoopMetricsFactory — a TODO is noted for adding dedicated metrics support.
The OpenInference tracing approach for file operations is non-standard and may need to be revisited if/when the OpenInference spec adds file operation conventions.
The model_name is required as an extra multipart field during file upload — this is a gateway-specific requirement not present in the standard OpenAI Files API.

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

- Refactor server to support method-based routing for processors using regex.
- Introduce new tracing capabilities for OpenAI file operations including retrieval and deletion.
- Implement translators for OpenAI file API endpoints: retrieve, retrieve content and delete files.

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

- Update Translator.RequestBody signature to include request headers in internal/translator/translator.go.
- Update translator implementations to accept the new signature in all OpenAI/Anthropic/Cohere translators.
- Update translator tests to match the new signature across those files.

File upload now requires model_name
- Enforce presence of model_name in file upload multipart parsing in internal/endpointspec/endpointspec.go.
- Encode the model into file IDs on create responses and rewrite Content-Length in internal/translator/openai_file.go.

Decode file/batch IDs from request path (header-only requests)
- Add path-based decoding for file/batch requests and set model + original/decoded ID headers in internal/extproc/processor_impl.go.
- Add new header keys in internal/internalapi/internalapi.go. Update extproc mocks/tests for the new headers and decoding behavior.

Use decoded ID for routing + return original ID on retrieve/delete/content
- Route retrieve/delete/content requests using decoded file IDs and echo original IDs in responses in internal/translator/openai_file.go

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

- Implemented new processor filters for file operations including creation, retrieval, and deletion.
- Added tests for file processing routes and header manipulations.
- Introduced encoding and decoding functions for file IDs with model names.
- Enhanced tracing capabilities for file-related operations.
- Added comprehensive unit tests for the new functionality in the translator package.

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

codecov-commenter · 2026-03-23T13:16:26Z

Codecov Report

❌ Patch coverage is 89.07407% with 59 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.46%. Comparing base (98200a4) to head (157a9bb).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| internal/translator/openai_file.go | 75.70% | 13 Missing and 13 partials ⚠️ |
| internal/apischema/openai/openai.go | 73.68% | 10 Missing and 10 partials ⚠️ |
| internal/tracing/tracing.go | 77.77% | 8 Missing ⚠️ |
| internal/extproc/processor_impl.go | 95.38% | 2 Missing and 1 partial ⚠️ |
| internal/internalapi/user_facing_errors.go | 0.00% | 1 Missing and 1 partial ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1982      +/-   ##
==========================================
+ Coverage   84.33%   84.46%   +0.12%     
==========================================
  Files         130      133       +3     
  Lines       17987    18517     +530     
==========================================
+ Hits        15170    15641     +471     
- Misses       1873     1907      +34     
- Partials      944      969      +25

sivanantha321 added 14 commits March 23, 2026 11:43

Add OpenAI File API types

96f6a23

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

Add multipart file marshal function

9a59a92

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

Implement create file endpoint

0613e3c

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

Add Marshal, Unmarshal and NoOpMetrics

52f87ad

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

Add extra parameters support and ID encoding/decoding functions for O…

3d6616e

…penAI Files API Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

refactor: update regex path handling to use pointers for better memor…

ad55005

…y management Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

feat: add multipart unmarshal tests for FileNewParams and update expi…

3acc8e7

…res_after handling Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

refactor: rename Route to RouteProcessorMapper and update related ref…

ce289e8

…erences Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

feat: implement tracing for file operations and add unit tests for re…

1940879

…corders Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

refactor: rename noopMetrics to noOpMetrics for consistency and clari…

18ddbd2

…ty; add unit tests for noOpMetrics Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

add end-to-end tests for file upload, retrieval, and deletion in test…

b039a95

…upstream Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

sivanantha321 added 2 commits March 24, 2026 12:21

clean up code formatting

0455347

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

feat: add file upload, retrieval, and deletion tests to TestWithTestU…

157a9bb

…pstream Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

sivanantha321 wants to merge 16 commits into envoyproxy:main from sivanantha321:openai-file-api
