[Docs]: feature: envoy ai gateway integration#1733
Conversation
Summary of ChangesHello @googs1025, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances Aibrix's deployment capabilities by integrating with Envoy AI Gateway. It provides the necessary Helm chart modifications to optionally deploy gateway infrastructure and includes detailed sample configurations for setting up a robust, multi-model AI inference gateway on Kubernetes. This allows for flexible and scalable management of AI model serving, complete with routing based on request attributes and API key validation. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
| tolerations: [] | ||
|
|
||
| gateway: | ||
| enable: true |
There was a problem hiding this comment.
add this flag in helm chart
There was a problem hiding this comment.
Code Review
This pull request introduces a feature to enable/disable the Envoy AI Gateway integration via a Helm chart value, which is a solid improvement. It also adds extensive documentation and sample manifests for setting up a multi-model AI gateway. While the core chart changes are good, the new sample files have several issues that should be addressed. These include hardcoded secrets, use of non-specific image tags (e.g., :latest, :nightly), and critical dependencies between YAML files that make the setup brittle. Addressing these points will make the samples more robust, secure, and easier for users to follow.
| serviceAccountName: mocked-app-sa | ||
| containers: | ||
| - name: llm-engine | ||
| image: aibrix/vllm-mock:nightly |
| namespace: default | ||
| subjects: | ||
| - kind: ServiceAccount | ||
| name: mistral-7b-epp |
There was a problem hiding this comment.
The ServiceAccount subject in this RoleBinding is missing the namespace field. While it might default correctly to the default namespace, this is inconsistent with the corresponding RoleBinding in llama-7b-inferencepool.yaml and is less explicit. It's a best practice to specify the namespace for clarity and to prevent potential cross-namespace issues.
name: mistral-7b-epp
namespace: default|
@googs1025 nice work, i will review this soon. |
thanks @Xunzhuo 😄 |
144f920 to
e436eaa
Compare
|
@googs1025 to be consistent with #1735, i think we can add docs and samples at this moment but not make it as a deployment option? If that works for you, we can get rid of the helm related changes at this moment. Let's have a discussion and then decide what to bring to helm |
I understand the intent to keep things minimal before finalizing the strategy. However, if we remove the Helm changes now, the commands in the README will actually fail — not just because of missing fields, but because the gateway resources defined in the sample YAMLs conflict with any existing gateway installation, and without the gateway.enable flag, there's no way to cleanly skip them in the chart. Moreover, adding a simple gateway.enable flag doesn’t introduce much overhead:
WDYT |
|
maybe add comment like this: 🤔 gateway:
# Enables the built-in Envoy Gateway integration by default.
# Set to false if you want to use your own gateway (e.g., custom Envoy, APISIX, etc.).
# This option may evolve based on final architecture decisions (see #1733, #1735).
enable: true |
|
@googs1025 |
Signed-off-by: googs1025 <googs1025@gmail.com> Signed-off-by: CYJiang <googs1025@gmail.com>
e436eaa to
fb0c3bd
Compare
done |
|
I will merge this one. @Xunzhuo if you have further feedbacks, feel free to leave the comments and @googs1025 can address in follow up PR |
Pull Request Description
[Please provide a clear and concise description of your changes here]
Related Issues
Resolves: #1732
Important: Before submitting, please complete the description above and review the checklist below.
Contribution Guidelines (Expand for Details)
We appreciate your contribution to aibrix! To ensure a smooth review process and maintain high code quality, please adhere to the following guidelines:
Pull Request Title Format
Your PR title should start with one of these prefixes to indicate the nature of the change:
[Bug]: Corrections to existing functionality[CI]: Changes to build process or CI pipeline[Docs]: Updates or additions to documentation[API]: Modifications to aibrix's API or interface[CLI]: Changes or additions to the Command Line Interface[Misc]: For changes not covered above (use sparingly)Note: For changes spanning multiple categories, use multiple prefixes in order of importance.
Submission Checklist
By submitting this PR, you confirm that you've read these guidelines and your changes align with the project's contribution standards.