[Docs]: feature: envoy ai gateway integration by googs1025 · Pull Request #1733 · vllm-project/aibrix

googs1025 · 2025-11-06T05:36:05Z

Pull Request Description

[Please provide a clear and concise description of your changes here]

Related Issues

Resolves: #1732

gemini-code-assist · 2025-11-06T05:36:27Z

Summary of Changes

Hello @googs1025, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances Aibrix's deployment capabilities by integrating with Envoy AI Gateway. It provides the necessary Helm chart modifications to optionally deploy gateway infrastructure and includes detailed sample configurations for setting up a robust, multi-model AI inference gateway on Kubernetes. This allows for flexible and scalable management of AI model serving, complete with routing based on request attributes and API key validation.

Highlights

Envoy AI Gateway Integration: This pull request introduces a new feature to integrate Aibrix with Envoy AI Gateway, enabling advanced traffic management for AI inference workloads.
Conditional Helm Chart Deployment: Existing Helm chart templates for gateway components (gateway, deployment, envoy extension policy, httproute, rbac, service) are updated to be conditionally enabled via a new gateway.enable flag in values.yaml.
New Sample Deployment Guide and Configurations: A comprehensive deployment guide (README.md) and associated Kubernetes YAML configurations are added for setting up a multi-model AI inference gateway, including AIGatewayRoute, GatewayClass, Gateway, InferencePool, InferenceObjective, and mock model deployments for llama2-7b and mistral-7b.
Multi-Model Routing with API Keys: The new configurations demonstrate how to route requests to different AI models (llama2-7b and mistral-7b) based on custom headers (x-ai-eg-model) and validate API keys using AIGatewayRoute.
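
The header-based routing described in the last highlight can be pictured roughly as follows. This is a minimal sketch, not the manifest from the PR: the route, Gateway, and InferencePool names are hypothetical, the field layout assumes the `aigateway.envoyproxy.io/v1alpha1` AIGatewayRoute API and the `inference.networking.k8s.io` InferencePool group, and both should be checked against the CRD versions actually installed. API key validation is not shown.

```yaml
# Sketch only: names below are hypothetical, not taken from the PR's sample files.
apiVersion: aigateway.envoyproxy.io/v1alpha1
kind: AIGatewayRoute
metadata:
  name: multi-model-route            # hypothetical
  namespace: default
spec:
  parentRefs:
    - name: example-ai-gateway       # hypothetical Gateway name
      kind: Gateway
      group: gateway.networking.k8s.io
  rules:
    # Requests carrying x-ai-eg-model: llama2-7b go to the llama2-7b pool.
    - matches:
        - headers:
            - type: Exact
              name: x-ai-eg-model
              value: llama2-7b
      backendRefs:
        - group: inference.networking.k8s.io   # assumed API group
          kind: InferencePool
          name: llama2-7b                      # hypothetical pool name
    # Requests carrying x-ai-eg-model: mistral-7b go to the mistral-7b pool.
    - matches:
        - headers:
            - type: Exact
              name: x-ai-eg-model
              value: mistral-7b
      backendRefs:
        - group: inference.networking.k8s.io   # assumed API group
          kind: InferencePool
          name: mistral-7b                     # hypothetical pool name
```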

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

googs1025 · 2025-11-06T05:38:05Z

dist/chart/values.yaml

  tolerations: []

 gateway:
+  enable: true


Add this flag to the Helm chart.
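
For context, a flag like this usually takes effect by wrapping each gateway template in a conditional. The following is a minimal sketch under that assumption. The file path, resource name, label, and port are hypothetical and are not taken from the chart; only `.Values.gateway.enable` comes from the change above.

```yaml
# templates/gateway/service.yaml (hypothetical path and resource, illustration only)
# The whole manifest is rendered only when gateway.enable is true in values.yaml.
{{- if .Values.gateway.enable }}
apiVersion: v1
kind: Service
metadata:
  name: example-gateway-plugins     # hypothetical name
  namespace: {{ .Release.Namespace }}
spec:
  selector:
    app: example-gateway-plugins    # hypothetical label
  ports:
    - name: grpc
      port: 50052                   # hypothetical port
      targetPort: 50052
{{- end }}
```

With `enable: true` as the default, a chart gated this way renders the same resources as before. Setting it to `false` skips the guarded templates.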

gemini-code-assist

Code Review

This pull request introduces a feature to enable/disable the Envoy AI Gateway integration via a Helm chart value, which is a solid improvement. It also adds extensive documentation and sample manifests for setting up a multi-model AI gateway. While the core chart changes are good, the new sample files have several issues that should be addressed. These include hardcoded secrets, use of non-specific image tags (e.g., :latest, :nightly), and critical dependencies between YAML files that make the setup brittle. Addressing these points will make the samples more robust, secure, and easier for users to follow.

samples/ai-gateway-integration/mistral-7b-inferencepool.yaml

samples/ai-gateway-integration/mistral-7b.yaml

samples/ai-gateway-integration/aigatewayroute.yaml

samples/ai-gateway-integration/llama-7b.yaml

gemini-code-assist · 2025-11-06T05:38:19Z

samples/ai-gateway-integration/llama-7b.yaml

+      serviceAccountName: mocked-app-sa
+      containers:
+        - name: llm-engine
+          image: aibrix/vllm-mock:nightly


Using the :nightly tag for the image makes the deployment non-reproducible and can introduce breaking changes unexpectedly. It's better to use a specific, stable version tag for sample code to ensure it works reliably for users.
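
A sketch of what a pinned reference could look like. `<release-tag>` is a placeholder, not a real version; it has to be replaced with a tag that is actually published for `aibrix/vllm-mock`.

```yaml
containers:
  - name: llm-engine
    # Assumption: <release-tag> stands in for a published, immutable tag.
    # No specific version is implied here.
    image: aibrix/vllm-mock:<release-tag>
    # With a fixed tag, a cached image can safely be reused.
    imagePullPolicy: IfNotPresent
```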

samples/ai-gateway-integration/README.md

gemini-code-assist · 2025-11-06T05:38:19Z

samples/ai-gateway-integration/mistral-7b-inferencepool.yaml

+  namespace: default
+subjects:
+  - kind: ServiceAccount
+    name: mistral-7b-epp


The ServiceAccount subject in this RoleBinding is missing the namespace field. While it might default correctly to the default namespace, this is inconsistent with the corresponding RoleBinding in llama-7b-inferencepool.yaml and is less explicit. It's a best practice to specify the namespace for clarity and to prevent potential cross-namespace issues.

Suggested change:

    name: mistral-7b-epp
    namespace: default
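
For reference, a complete RoleBinding with the subject namespace spelled out might look like the sketch below. The binding name and the Role name are hypothetical, since the rest of the file is not shown here; only the ServiceAccount name and the `default` namespace come from the snippet above.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: mistral-7b-epp-binding      # hypothetical
  namespace: default
subjects:
  - kind: ServiceAccount
    name: mistral-7b-epp
    namespace: default              # explicit, matching the binding's namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: mistral-7b-epp-role         # hypothetical
```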

samples/ai-gateway-integration/README.md

samples/ai-gateway-integration/mistral-7b.yaml

googs1025 · 2025-11-06T05:42:52Z

https://github.com/googs1025/aibrix/tree/ai-gateway/samples/ai-gateway-integration

Xunzhuo · 2025-11-06T07:10:09Z

@googs1025 nice work, i will review this soon.

googs1025 · 2025-11-06T08:29:01Z

> @googs1025 nice work, i will review this soon.

thanks @Xunzhuo 😄

Jeffwan · 2025-11-14T22:41:11Z

@googs1025 to be consistent with #1735, I think we can add the docs and samples for now but not make this a deployment option? If that works for you, we can drop the Helm-related changes for the moment. Let's have a discussion and then decide what to bring into Helm.

googs1025 · 2025-11-15T00:15:48Z

> @googs1025 to be consistent with #1735, I think we can add the docs and samples for now but not make this a deployment option? If that works for you, we can drop the Helm-related changes for the moment. Let's have a discussion and then decide what to bring into Helm.

I understand the intent to keep things minimal before finalizing the strategy. However, if we remove the Helm changes now, the commands in the README will actually fail — not just because of missing fields, but because the gateway resources defined in the sample YAMLs conflict with any existing gateway installation, and without the gateway.enable flag, there's no way to cleanly skip them in the chart.

Moreover, adding a simple gateway.enable flag doesn’t introduce much overhead:

We can keep it enabled by default (true) so existing behavior is unchanged.
It simply gives users an escape hatch to disable the gateway components if they're using their own Envoy setup or don't need them. It's a common pattern in Helm charts (e.g., prometheus.enabled, serviceMonitor.enabled). 🤔

WDYT
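
As an illustration of the escape hatch, a user who already runs their own gateway could supply an override file like the one below. The file name is hypothetical; it would be passed to Helm with the standard `-f` / `--values` option.

```yaml
# my-values.yaml (hypothetical file name)
# Overrides the chart default so the gateway templates are skipped.
gateway:
  enable: false
```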

googs1025 · 2025-11-15T00:18:29Z

maybe add comment like this:

🤔

 gateway:
  # Enables the built-in Envoy Gateway integration by default.
  # Set to false if you want to use your own gateway (e.g., custom Envoy, APISIX, etc.).
  # This option may evolve based on final architecture decisions (see #1733, #1735).
  enable: true

Jeffwan · 2025-11-16T07:33:56Z

@googs1025 enabled works for me. Then let's address the conflicts and consider merging the PR. @Xunzhuo do you have further feedback?

Signed-off-by: googs1025 <googs1025@gmail.com> Signed-off-by: CYJiang <googs1025@gmail.com>

googs1025 · 2025-11-16T11:23:22Z

> @googs1025 enabled works for me. Then let's address the conflicts and consider merging the PR. @Xunzhuo do you have further feedback?

done

Jeffwan · 2025-11-17T06:47:05Z

I will merge this one. @Xunzhuo if you have further feedback, feel free to leave comments and @googs1025 can address them in a follow-up PR.

googs1025 force-pushed the ai-gateway branch from 13b2080 to 4aa6a21 Compare November 6, 2025 05:47

googs1025 requested review from Jeffwan and varungup90 November 6, 2025 05:59

googs1025 requested a review from Xunzhuo November 6, 2025 08:28

googs1025 force-pushed the ai-gateway branch 6 times, most recently from 144f920 to e436eaa Compare November 12, 2025 05:33

feature: envoy ai gateway integration

fb0c3bd

Signed-off-by: googs1025 <googs1025@gmail.com> Signed-off-by: CYJiang <googs1025@gmail.com>

googs1025 force-pushed the ai-gateway branch from e436eaa to fb0c3bd Compare November 16, 2025 11:14

Jeffwan merged commit 9d6ba81 into vllm-project:main Nov 17, 2025
4 checks passed
