[Docs]: feature: envoy ai gateway integration#1733

Merged
Jeffwan merged 1 commit into vllm-project:main from googs1025:ai-gateway
Nov 17, 2025

@googs1025 (Collaborator) commented Nov 6, 2025

Related Issues

Resolves: #1732


@gemini-code-assist (Contributor)

Summary of Changes

Hello @googs1025, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances Aibrix's deployment capabilities by integrating with Envoy AI Gateway. It provides the necessary Helm chart modifications to optionally deploy gateway infrastructure and includes detailed sample configurations for setting up a robust, multi-model AI inference gateway on Kubernetes. This allows for flexible and scalable management of AI model serving, complete with routing based on request attributes and API key validation.

Highlights

  • Envoy AI Gateway Integration: This pull request introduces a new feature to integrate Aibrix with Envoy AI Gateway, enabling advanced traffic management for AI inference workloads.
  • Conditional Helm Chart Deployment: Existing Helm chart templates for gateway components (gateway, deployment, envoy extension policy, httproute, rbac, service) are updated to be conditionally enabled via a new gateway.enable flag in values.yaml.
  • New Sample Deployment Guide and Configurations: A comprehensive deployment guide (README.md) and associated Kubernetes YAML configurations are added for setting up a multi-model AI inference gateway, including AIGatewayRoute, GatewayClass, Gateway, InferencePool, InferenceObjective, and mock model deployments for llama2-7b and mistral-7b.
  • Multi-Model Routing with API Keys: The new configurations demonstrate how to route requests to different AI models (llama2-7b and mistral-7b) based on custom headers (x-ai-eg-model) and validate API keys using AIGatewayRoute.
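
As a rough illustration of the conditional deployment described in the highlights, a chart template can be wrapped in a guard on the new flag. This is a minimal sketch, not the PR's actual template: the file name, resource name, gateway class, and listener are all assumptions; only the gateway.enable key comes from this PR.

```yaml
# templates/gateway.yaml (hypothetical file name)
# The whole manifest is rendered only when gateway.enable is true in values.yaml.
{{- if .Values.gateway.enable }}
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: {{ .Release.Name }}-gateway          # hypothetical resource name
spec:
  gatewayClassName: example-gateway-class    # placeholder, not taken from the PR
  listeners:
    - name: http
      protocol: HTTP
      port: 80
{{- end }}
```

With a guard like this, setting the flag to false makes Helm render nothing for the file, so the resource is skipped entirely.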

tolerations: []

gateway:
  enable: true
@googs1025 (Collaborator, Author)

Added this flag to the Helm chart.

@gemini-code-assist (bot, Contributor) left a comment

Code Review

This pull request introduces a feature to enable/disable the Envoy AI Gateway integration via a Helm chart value, which is a solid improvement. It also adds extensive documentation and sample manifests for setting up a multi-model AI gateway. While the core chart changes are good, the new sample files have several issues that should be addressed. These include hardcoded secrets, use of non-specific image tags (e.g., :latest, :nightly), and critical dependencies between YAML files that make the setup brittle. Addressing these points will make the samples more robust, secure, and easier for users to follow.

serviceAccountName: mocked-app-sa
containers:
  - name: llm-engine
    image: aibrix/vllm-mock:nightly
@gemini-code-assist (Contributor)

medium

Using the :nightly tag for the image makes the deployment non-reproducible and can introduce breaking changes unexpectedly. It's better to use a specific, stable version tag for sample code to ensure it works reliably for users.
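
One way to act on this suggestion is sketched below. The tag and digest are deliberately left as placeholders, since the thread does not name a stable release of the mock image; they are not real values.

```yaml
# Sketch only: <pinned-version> and <digest> are placeholders, not real tags.
containers:
  - name: llm-engine
    # Pin to an immutable release tag instead of the moving :nightly tag.
    image: aibrix/vllm-mock:<pinned-version>
    # Alternatively, pin by digest for full reproducibility:
    # image: aibrix/vllm-mock@sha256:<digest>
```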

  namespace: default
subjects:
  - kind: ServiceAccount
    name: mistral-7b-epp
@gemini-code-assist (Contributor)

medium

The ServiceAccount subject in this RoleBinding is missing the namespace field. While it might default correctly to the default namespace, this is inconsistent with the corresponding RoleBinding in llama-7b-inferencepool.yaml and is less explicit. It's a best practice to specify the namespace for clarity and to prevent potential cross-namespace issues.

    name: mistral-7b-epp
    namespace: default
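
For context, a complete RoleBinding with the explicit subject namespace might look like the following. This is a sketch under assumptions: the binding name and the referenced Role name are hypothetical, and only the ServiceAccount name and the default namespace come from the snippet under review.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: mistral-7b-epp            # hypothetical binding name
  namespace: default
subjects:
  - kind: ServiceAccount
    name: mistral-7b-epp
    namespace: default            # stated explicitly, as the review suggests
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: mistral-7b-epp            # hypothetical Role name
```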

@googs1025 (Collaborator, Author)

https://github.com/googs1025/aibrix/tree/ai-gateway/samples/ai-gateway-integration

@Xunzhuo (Member) commented Nov 6, 2025

@googs1025 nice work, i will review this soon.

googs1025 requested a review from Xunzhuo on November 6, 2025 08:28
@googs1025 (Collaborator, Author)

> @googs1025 nice work, i will review this soon.

thanks @Xunzhuo 😄

googs1025 force-pushed the ai-gateway branch 6 times, most recently from 144f920 to e436eaa, on November 12, 2025 05:33
@Jeffwan (Collaborator) commented Nov 14, 2025

@googs1025 to be consistent with #1735, i think we can add docs and samples at this moment but not make it as a deployment option? If that works for you, we can get rid of the helm related changes at this moment. Let's have a discussion and then decide what to bring to helm

@googs1025 (Collaborator, Author)

> @googs1025 to be consistent with #1735, i think we can add docs and samples at this moment but not make it as a deployment option? If that works for you, we can get rid of the helm related changes at this moment. Let's have a discussion and then decide what to bring to helm

I understand the intent to keep things minimal before finalizing the strategy. However, if we remove the Helm changes now, the commands in the README will fail: not just because of missing fields, but because the gateway resources defined in the sample YAMLs conflict with any existing gateway installation, and without the gateway.enable flag there is no way to cleanly skip them in the chart.

Moreover, adding a simple gateway.enable flag doesn't introduce much overhead:

  • We can keep it enabled by default (true) so existing behavior is unchanged.
  • It simply gives users an escape hatch to disable gateway components if they're using their own Envoy setup or don't need it. This is a common pattern in Helm charts (e.g., prometheus.enabled, serviceMonitor.enabled). 🤔

WDYT
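
To make the escape hatch concrete, a user-supplied override could look like the sketch below. The file name is hypothetical, and the key is assumed to keep the gateway.enable name proposed in this thread.

```yaml
# custom-values.yaml (hypothetical file name)
gateway:
  # Skip the chart's gateway components, e.g. when bringing your own Envoy setup.
  enable: false
```

Such a file would typically be passed to Helm with the -f flag, or the same override could be given inline as --set gateway.enable=false.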

@googs1025 (Collaborator, Author)

Maybe add a comment like this:

🤔

gateway:
  # Enables the built-in Envoy Gateway integration by default.
  # Set to false if you want to use your own gateway (e.g., custom Envoy, APISIX, etc.).
  # This option may evolve based on final architecture decisions (see #1733, #1735).
  enable: true

@Jeffwan (Collaborator) commented Nov 16, 2025

@googs1025 enabled works for me. Then let's address the conflicts and consider merging the PR. @Xunzhuo do you have further feedback?

Signed-off-by: googs1025 <googs1025@gmail.com>
Signed-off-by: CYJiang <googs1025@gmail.com>
@googs1025 (Collaborator, Author)

> @googs1025 enabled works for me. Then let's address the conflicts and consider merging the PR. @Xunzhuo do you have further feedback?

done

@Jeffwan (Collaborator) commented Nov 17, 2025

I will merge this one. @Xunzhuo if you have further feedback, feel free to leave comments, and @googs1025 can address them in a follow-up PR.

Jeffwan merged commit 9d6ba81 into vllm-project:main on Nov 17, 2025
4 checks passed

Development

Successfully merging this pull request may close these issues.

Provide a example for integrating Aibrix with Envoy AI Gateway (using Gateway API + InferencePool)

3 participants