Commit 13b2080

feature: envoy ai gateway integration
Signed-off-by: googs1025 <googs1025@gmail.com>
1 parent 941e68f commit 13b2080

14 files changed

Lines changed: 734 additions & 1 deletion

dist/chart/templates/gateway-instance/gateway.yaml

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,4 @@
+{{- if .Values.gateway.enable }}
 apiVersion: gateway.networking.k8s.io/v1
 kind: GatewayClass
 metadata:
@@ -156,3 +157,4 @@ spec:
 connect_timeout: {{ .Values.gateway.envoyPatchPolicy.route.connectTimeout }}
 lb_policy: CLUSTER_PROVIDED
 dns_lookup_family: V4_ONLY
+{{- end }}

dist/chart/templates/gateway-plugin/deployment.yaml

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,4 @@
+{{- if .Values.gateway.enable }}
 apiVersion: apps/v1
 kind: Deployment
 metadata:
@@ -98,3 +99,4 @@ spec:
 readinessProbe:
 {{- toYaml .Values.gatewayPlugin.container.probes.readiness | nindent 12 }}
 serviceAccountName: aibrix-gateway-plugins
+{{- end }}

dist/chart/templates/gateway-plugin/envoy_extension_policy.yaml

Lines changed: 3 additions & 1 deletion
@@ -1,4 +1,5 @@
 # templates/envoy-extension-policy.yaml
+{{- if .Values.gateway.enable }}
 apiVersion: gateway.envoyproxy.io/v1alpha1
 kind: EnvoyExtensionPolicy
 metadata:
@@ -35,4 +36,5 @@ spec:
 targetRef:
 group: gateway.networking.k8s.io
 kind: HTTPRoute
-name: aibrix-reserved-router-metadata-endpoint
+name: aibrix-reserved-router-metadata-endpoint
+{{- end }}

dist/chart/templates/gateway-plugin/httproute.yaml

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,4 @@
+{{- if .Values.gateway.enable }}
 apiVersion: gateway.networking.k8s.io/v1
 kind: HTTPRoute
 metadata:
@@ -55,3 +56,4 @@ spec:
 backendRefs:
 - name: aibrix-metadata-service
 port: 8090
+{{- end }}

dist/chart/templates/gateway-plugin/rbac.yaml

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,4 @@
+{{- if .Values.gateway.enable }}
 ---
 apiVersion: v1
 kind: ServiceAccount
@@ -65,3 +66,4 @@ subjects:
 - kind: ServiceAccount
 name: aibrix-gateway-plugins
 namespace: {{ .Release.Namespace }}
+{{- end }}

dist/chart/templates/gateway-plugin/service.yaml

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,4 @@
+{{- if .Values.gateway.enable }}
 apiVersion: v1
 kind: Service
 metadata:
@@ -24,3 +25,4 @@ spec:
 - name: metrics
 port: 8080
 targetPort: 8080
+{{- end }}

dist/chart/values.yaml

Lines changed: 1 addition & 0 deletions
@@ -102,6 +102,7 @@ gpuOptimizer:
   tolerations: []
 
 gateway:
+  enable: true
   envoyProxy:
     replicas: 1
     imagePullSecrets: []
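
The new `gateway.enable` flag guards the gateway templates above with `{{- if .Values.gateway.enable }}`, so one quick sanity check is to render the chart with the flag off and count the gateway kinds that remain. The following is a minimal sketch, assuming the chart is rendered from the repository root under the release name `aibrix`; the helper name `count_gateway_kinds` is hypothetical, and templates not shown in this diff may still emit gateway-related resources.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the commit): render the chart with the
# gateway disabled and count the gateway-related kinds left in the output.
count_gateway_kinds() {
  # grep -c prints the match count; it exits non-zero on zero matches,
  # so "|| true" keeps the function's exit status at 0 in that case.
  helm template aibrix dist/chart --set gateway.enable=false \
    | grep -cE '^kind: (GatewayClass|Gateway|HTTPRoute|EnvoyExtensionPolicy)$' || true
}
```

Running `count_gateway_kinds` should print `0` if every template that emits one of these kinds honours the flag.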
Lines changed: 218 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,218 @@
1+
# Aibrix Integration with Envoy AI Gateway Deployment Guide
2+
3+
This guide walks you through deploying a multi-model AI inference gateway using **Envoy AI Gateway**, **Gateway API Inference Extension**, and custom Aibrix-branded routing rules.
4+
5+
### Project Structure
6+
7+
```bash
8+
samples/ai-gateway-integration
9+
├── gateway.yaml # GatewayClass + Gateway
10+
├── aigatewayroute.yaml # Multi-model routing rules (llama2-7b, mistral-7b)
11+
├── llama-7b-inferencepool.yaml # InferencePool + EPP for Llama2-7B
12+
├── mistral-7b-inferencepool.yaml # InferencePool + EPP for Mistral-7B
13+
├── llama-7b.yaml # Mock model llama-7b deployments
14+
└── mistral-7b.yaml # Mock model mistral-7b deployments
15+
```
16+
17+
### Prerequisites
18+
- Kubernetes cluster (v1.24+)
19+
- kubectl configured
20+
- helm v3.8+
21+
- Internet access to pull images from docker.io and GitHub
22+
23+
### Installation Steps
24+
25+
1. Install Aibrix Custom Application (Optional)
26+
If you use the in-repo Aibrix [Helm chart](../../dist/chart), install it first:
29+
```bash
30+
helm install aibrix dist/chart -n aibrix-system --create-namespace
31+
```
32+
33+
> **Note**: If you are using an internal Aibrix Helm chart, **you must set `gateway.enable: false`** in `values.yaml`.
34+
> This is critical because **Steps 2–5 below will install the AI Gateway controller and Envoy data plane independently**.
35+
> Enabling the built-in gateway here would cause resource conflicts or duplicate deployments.
36+
37+
```yaml
38+
...
39+
gateway:
40+
enable: false # ← Set this to false to skip internal gateway deployment
41+
...
42+
```
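
As an alternative to editing `values.yaml`, the same value can be overridden on the command line with Helm's `--set` flag. This is a sketch under the same assumptions as the install command above (release name `aibrix`, chart path `dist/chart`, namespace `aibrix-system`); the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: install the Aibrix chart with the built-in gateway
# disabled, without touching values.yaml.
install_aibrix_without_gateway() {
  helm install aibrix dist/chart \
    -n aibrix-system --create-namespace \
    --set gateway.enable=false
}
```

Call `install_aibrix_without_gateway` from the repository root so that `dist/chart` resolves.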
43+
44+
2. Install AI Gateway CRDs
45+
46+
```bash
47+
helm upgrade -i aieg-crd oci://docker.io/envoyproxy/ai-gateway-crds-helm \
48+
--version v0.0.0-latest \
49+
--namespace envoy-ai-gateway-system \
50+
--create-namespace
51+
```
52+
53+
> For more details, see the official [installation guide](https://aigateway.envoyproxy.io/docs/getting-started/installation#step-1-install-ai-gateway-crds) for AI Gateway CRDs.
54+
55+
56+
3. Install AI Gateway Controller
57+
58+
```bash
59+
helm upgrade -i aieg oci://docker.io/envoyproxy/ai-gateway-helm \
60+
--version v0.0.0-latest \
61+
--namespace envoy-ai-gateway-system \
62+
--create-namespace
63+
```
64+
65+
> For more details, see the official [installation guide](https://aigateway.envoyproxy.io/docs/getting-started/installation#step-2-install-ai-gateway-resources) for AI Gateway Resources.
66+
67+
Wait for the controller to be ready:
68+
```bash
69+
kubectl wait --timeout=2m -n envoy-ai-gateway-system deployment/ai-gateway-controller --for=condition=Available
70+
```
71+
72+
4. Install Gateway API Inference Extension (EPP Framework)
73+
74+
```bash
75+
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/download/v1.0.1/manifests.yaml
76+
```
77+
78+
> For more details, see the official [installation guide](https://aigateway.envoyproxy.io/docs/capabilities/inference/httproute-inferencepool#step-1-install-gateway-api-inference-extension) for Gateway API Inference Extension.
79+
80+
81+
This deploys:
82+
- CRDs (`InferencePool`, `InferenceObjective`)
83+
- RBAC, webhooks, and core controllers
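
To confirm that the CRDs listed above were registered, one option is to filter the cluster's CRD list. This is a minimal sketch; the helper name is hypothetical and the filter is a plain substring match on `inference`, so it may also show unrelated CRDs whose names contain that word.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list installed CRDs whose name contains "inference".
list_inference_crds() {
  # "-o name" prints one resource name per line; "|| true" keeps the exit
  # status at 0 when grep finds nothing.
  kubectl get crd -o name | grep -i 'inference' || true
}
```

An empty result from `list_inference_crds` suggests the manifest in step 4 was not applied.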
84+
85+
5. Install Envoy Gateway (Data Plane)
86+
87+
```bash
88+
helm upgrade -i eg oci://docker.io/envoyproxy/gateway-helm \
89+
--version v0.0.0-latest \
90+
--namespace envoy-gateway-system \
91+
--create-namespace \
92+
-f https://raw.githubusercontent.com/envoyproxy/ai-gateway/main/manifests/envoy-gateway-values.yaml \
93+
-f https://raw.githubusercontent.com/envoyproxy/ai-gateway/main/examples/inference-pool/envoy-gateway-values-addon.yaml
94+
```
95+
96+
> For more details, see the official [installation guide](https://aigateway.envoyproxy.io/docs/getting-started/prerequisites#additional-features-rate-limiting-inferencepool-etc) for Envoy Gateway.
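
Step 3 waits for the AI Gateway controller; the same `kubectl wait` pattern can be reused for Envoy Gateway before moving on. In this sketch the Deployment name `envoy-gateway` is an assumption inferred from the pod name `envoy-gateway-6dd8f9b8f-kjngn` in the verification output, and the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: block until a Deployment reports condition Available.
# Arguments: namespace, deployment name, optional timeout (default 2m).
wait_for_deployment() {
  local namespace="$1" name="$2" timeout="${3:-2m}"
  kubectl wait --timeout="$timeout" -n "$namespace" \
    "deployment/$name" --for=condition=Available
}
```

For example, `wait_for_deployment envoy-gateway-system envoy-gateway` waits for the data-plane controller, and `wait_for_deployment envoy-ai-gateway-system ai-gateway-controller` reproduces the command from step 3.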
97+
98+
99+
6. Deploy Aibrix AI Gateway Resources
100+
101+
Apply your custom gateway and routing configuration:
102+
103+
```bash
104+
cd samples/ai-gateway-integration
105+
106+
# Deploy for each model
107+
kubectl apply -f llama-7b.yaml
108+
kubectl apply -f mistral-7b.yaml
109+
110+
# Deploy GatewayClass, Gateway, and AIGatewayRoute
111+
kubectl apply -f gateway.yaml
112+
kubectl apply -f aigatewayroute.yaml
113+
114+
# Deploy backend resources for each model
115+
kubectl apply -f llama-7b-inferencepool.yaml
116+
kubectl apply -f mistral-7b-inferencepool.yaml
117+
```
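
The apply sequence above can also be wrapped so that it stops at the first failure instead of continuing with a half-applied setup. This is a sketch, not part of the samples: the function name and the optional directory argument are hypothetical, and the file order simply mirrors the commands above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: apply the sample manifests in the documented order
# and stop as soon as one of them fails.
apply_samples() {
  local dir="${1:-samples/ai-gateway-integration}" f
  for f in llama-7b.yaml mistral-7b.yaml gateway.yaml aigatewayroute.yaml \
           llama-7b-inferencepool.yaml mistral-7b-inferencepool.yaml; do
    kubectl apply -f "$dir/$f" || return 1
  done
}
```

Run `apply_samples` from the repository root, or pass the directory explicitly, for example `apply_samples .` from inside `samples/ai-gateway-integration`.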
118+
119+
### Verify Deployment Status
120+
121+
After installation, you can verify that all components are running correctly. Below is an example of expected output from a successful deployment:
122+
123+
- Pods in `aibrix-system`
124+
```bash
125+
$ kubectl get pods -n aibrix-system
126+
NAME READY STATUS RESTARTS AGE
127+
aibrix-controller-manager-7dcf4b8d97-9mgw8 1/1 Running 0 3h35m
128+
aibrix-gpu-optimizer-556d946fbb-gzh85 1/1 Running 0 3h35m
129+
aibrix-metadata-service-bdfd4459d-678k5 1/1 Running 0 3h35m
130+
aibrix-redis-master-74945dc65d-sr2sq 1/1 Running 0 3h35m
131+
```
132+
133+
- Pods in `envoy-ai-gateway-system`
134+
```bash
135+
$ kubectl get pods -n envoy-ai-gateway-system
136+
NAME READY STATUS RESTARTS AGE
137+
ai-gateway-controller-5558c7cf7c-bzh65 1/1 Running 0 3h34m
138+
```
139+
140+
- Pods in `envoy-gateway-system`
141+
```bash
142+
$ kubectl get pods -n envoy-gateway-system
143+
NAME READY STATUS RESTARTS AGE
144+
envoy-default-aibrix-ai-gateway-588291e8-54d5f9b6f-2psp6 3/3 Running 0 128m
145+
envoy-gateway-6dd8f9b8f-kjngn 1/1 Running 0 3h33m
146+
```
147+
148+
- Gateway API Inference Extension resources (`InferencePool`, `InferenceObjective`)
149+
```bash
150+
$ kubectl get InferencePool
151+
NAME AGE
152+
llama2-7b 121m
153+
mistral-7b 121m
154+
155+
$ kubectl get InferenceObjective
156+
NAME INFERENCE POOL PRIORITY AGE
157+
llama2-7b llama2-7b 10 121m
158+
mistral-7b mistral-7b 10 121m
159+
```
160+
161+
- Model and EPP Backend Pods (in default namespace)
162+
163+
```bash
164+
$ kubectl get pods
165+
NAME READY STATUS RESTARTS AGE
166+
llama2-7b-epp-6fb99fd7df-7xlxq 1/1 Running 0 121m
167+
mistral-7b-epp-7c7f7fcb66-bw87d 1/1 Running 0 121m
168+
mock-llama2-7b-6444f9b459-7gzmx 1/1 Running 0 131m
169+
mock-llama2-7b-6444f9b459-92bsl 1/1 Running 0 131m
170+
mock-llama2-7b-6444f9b459-krj8c 1/1 Running 0 131m
171+
mock-mistral-7b-5fddcff595-5268f 1/1 Running 0 131m
172+
mock-mistral-7b-5fddcff595-t65cp 1/1 Running 0 131m
173+
```
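
Instead of reading each listing by eye, the STATUS column can be checked with a small filter. This is a minimal sketch with hypothetical helper names; it treats any status other than `Running` as a problem, so pods that finished normally (for example `Completed` jobs) would also be reported.

```shell
#!/usr/bin/env bash
# Reads `kubectl get pods --no-headers` output on stdin and prints the names
# of pods whose STATUS (third column) is not "Running".
not_running_pods() {
  awk '$3 != "Running" { print $1 }'
}

# Hypothetical helper: returns non-zero and lists the offending pods if any
# pod in the given namespace is not Running.
check_namespace() {
  local ns="$1" bad
  bad="$(kubectl get pods -n "$ns" --no-headers | not_running_pods)"
  if [ -n "$bad" ]; then
    printf 'not running in %s:\n%s\n' "$ns" "$bad"
    return 1
  fi
}
```

A loop such as `for ns in aibrix-system envoy-ai-gateway-system envoy-gateway-system default; do check_namespace "$ns"; done` covers the namespaces shown above.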
174+
175+
### Test the Setup
176+
177+
Once all pods are ready, test routing via curl:
178+
179+
- Llama2-7B
180+
181+
```bash
182+
curl -v http://<GATEWAY_IP>/v1/chat/completions \
183+
-H "Content-Type: application/json" \
184+
-H "x-ai-eg-model: llama2-7b" \
185+
-H "Authorization: Bearer test-key-1234567890" \
186+
-d '{
187+
"model": "llama2-7b",
188+
"messages": [{"role": "user", "content": "Say this is a test!"}],
189+
"temperature": 0.7
190+
}'
191+
```
192+
193+
- Mistral-7B
194+
195+
```bash
196+
curl -v http://<GATEWAY_IP>/v1/chat/completions \
197+
-H "Content-Type: application/json" \
198+
-H "x-ai-eg-model: mistral-7b" \
199+
-H "Authorization: Bearer test-key-0987654321" \
200+
-d '{
201+
"model": "mistral-7b",
202+
"messages": [{"role": "user", "content": "Say this is a test!"}],
203+
"temperature": 0.7
204+
}'
205+
```
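
The two requests differ only in the model name and the API key, so they can be parameterized. This sketch uses hypothetical helper names; `chat_status` prints only the HTTP status code, and the address argument stands in for `<GATEWAY_IP>`.

```shell
#!/usr/bin/env bash
# Build the JSON request body for a given model name.
chat_payload() {
  local model="$1"
  printf '{"model": "%s", "messages": [{"role": "user", "content": "Say this is a test!"}], "temperature": 0.7}' "$model"
}

# Hypothetical helper: send a chat completion request and print the HTTP
# status code. Arguments: gateway address, model name, API key.
chat_status() {
  local addr="$1" model="$2" key="$3"
  curl -s -o /dev/null -w '%{http_code}' "http://$addr/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -H "x-ai-eg-model: $model" \
    -H "Authorization: Bearer $key" \
    -d "$(chat_payload "$model")"
}
```

For example, `chat_status localhost:8080 llama2-7b test-key-1234567890` exercises the first rule. Because each rule in `aigatewayroute.yaml` lists both headers in a single match, a request that pairs `llama2-7b` with the Mistral key is expected to match neither rule.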
206+
207+
Replace `<GATEWAY_IP>` with one of the following:
208+
- `localhost:8080`, if you port-forward the gateway Service:
209+
210+
```bash
211+
kubectl port-forward -n envoy-gateway-system svc/eg-envoy 8080:80
212+
```
213+
214+
- The external IP of the `eg-envoy` Service, if it is exposed via a LoadBalancer.
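
The Service name `eg-envoy` may not exist in every cluster: the pod listing above shows a proxy named `envoy-default-aibrix-ai-gateway-588291e8-...`, which suggests the generated Service carries a similar name. A sketch for looking the name up through the owning-gateway labels that Envoy Gateway puts on the resources it generates, assuming the Gateway `aibrix-ai-gateway` in namespace `default` from `gateway.yaml`; the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name of the Service that Envoy Gateway
# generated for the aibrix-ai-gateway Gateway in the default namespace.
gateway_service_name() {
  kubectl get svc -n envoy-gateway-system \
    --selector=gateway.envoyproxy.io/owning-gateway-namespace=default,gateway.envoyproxy.io/owning-gateway-name=aibrix-ai-gateway \
    -o jsonpath='{.items[0].metadata.name}'
}

# Forward local port 8080 to port 80 of that Service.
port_forward_gateway() {
  kubectl port-forward -n envoy-gateway-system "svc/$(gateway_service_name)" 8080:80
}
```

With `port_forward_gateway` running in one terminal, the curl examples can target `localhost:8080`.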
215+
216+
### References
217+
- [Envoy AI Gateway](https://github.com/envoyproxy/ai-gateway)
218+
- [Gateway API Inference Extension](https://github.com/kubernetes-sigs/gateway-api-inference-extension)
Lines changed: 38 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,38 @@
1+
apiVersion: aigateway.envoyproxy.io/v1alpha1
2+
kind: AIGatewayRoute
3+
metadata:
4+
name: multi-model-route
5+
namespace: default
6+
spec:
7+
parentRefs:
8+
# References the Gateway defined
9+
- name: aibrix-ai-gateway
10+
kind: Gateway
11+
group: gateway.networking.k8s.io
12+
rules:
13+
- matches:
14+
- headers:
15+
- type: Exact
16+
name: x-ai-eg-model # Custom header used to specify the target model
17+
value: llama2-7b
18+
- type: Exact
19+
name: Authorization # Validates API key
20+
value: Bearer test-key-1234567890
21+
backendRefs:
22+
# Must match the InferencePool name
23+
- group: inference.networking.k8s.io
24+
kind: InferencePool
25+
name: llama2-7b
26+
- matches:
27+
- headers:
28+
- type: Exact
29+
name: x-ai-eg-model
30+
value: mistral-7b
31+
- type: Exact
32+
name: Authorization
33+
value: Bearer test-key-0987654321 # different key
34+
backendRefs:
35+
- group: inference.networking.k8s.io
36+
kind: InferencePool
37+
name: mistral-7b
38+
Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,19 @@
1+
apiVersion: gateway.networking.k8s.io/v1
2+
kind: GatewayClass
3+
metadata:
4+
name: aibrix-ai-gateway-class # Unique name for this gateway class (branded with Aibrix)
5+
spec:
6+
controllerName: gateway.envoyproxy.io/gatewayclass-controller # Envoy Gateway controller identifier
7+
---
8+
apiVersion: gateway.networking.k8s.io/v1
9+
kind: Gateway
10+
metadata:
11+
name: aibrix-ai-gateway
12+
namespace: default
13+
spec:
14+
# Must match the GatewayClass name above
15+
gatewayClassName: aibrix-ai-gateway-class
16+
listeners:
17+
- name: http
18+
protocol: HTTP
19+
port: 80
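
Before applying the sample manifests for real, they can be checked against the API server with a server-side dry run. This is a sketch with a hypothetical function name; it assumes the CRDs from the installation steps are already present, since a server-side dry run rejects kinds the cluster does not know.

```shell
#!/usr/bin/env bash
# Hypothetical helper: server-side dry run of every YAML file in a directory.
# Prints "invalid: <file>" for each rejected manifest and returns non-zero
# if any file was rejected.
validate_samples() {
  local dir="${1:-samples/ai-gateway-integration}" f rc=0
  for f in "$dir"/*.yaml; do
    kubectl apply --dry-run=server -f "$f" >/dev/null || { echo "invalid: $f"; rc=1; }
  done
  return "$rc"
}
```

`validate_samples` makes no changes to the cluster; it only reports which files the API server would refuse.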
