feat: implement chained routing support for flexible algorithm composition #2098

@paranoidRick

Description

🚀 Feature Description and Motivation

AIBrix needs the ability to combine multiple routing algorithms to balance different optimization goals, such as load balancing, KV cache efficiency, and latency. AIBrix currently supports only a single routing algorithm, which limits our ability to fine-tune routing decisions for complex workloads.

Use Case

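In large-scale clusters, we may need a multi-routing strategy. The current problem is that AIBrix applies only one strategy at a time: using only least-request routing ensures load balancing, but it leaves the other goals, such as KV cache efficiency, unaddressed.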

Proposed Solution

The inspiration comes from multi-level sorting. I think we can implement a chained routing feature that addresses this by allowing users to specify multiple comma-separated routing algorithms (e.g., least-request,least-kv-cache). The algorithms are applied sequentially, each narrowing down the candidate pod list, until one pod remains or all algorithms have been applied.
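
The sequential narrowing described above could be sketched as follows. This is a minimal illustration in Python, not AIBrix's implementation: the `Pod` fields, the two scoring functions, and `chain_route` are hypothetical names, and it assumes each algorithm keeps every pod tied for the best score so that the next algorithm acts as a tie-breaker, as in multi-level sorting.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Pod:
    # Hypothetical per-pod metrics; real metric names may differ.
    name: str
    running_requests: int
    kv_cache_usage: float  # fraction of KV cache in use, 0.0 to 1.0


# Each hypothetical algorithm maps a pod to a score; lower is better.
Scorer = Callable[[Pod], float]

ALGORITHMS: Dict[str, Scorer] = {
    "least-request": lambda pod: pod.running_requests,
    "least-kv-cache": lambda pod: pod.kv_cache_usage,
}


def keep_best(pods: List[Pod], score: Scorer) -> List[Pod]:
    """Keep only the pods tied for the lowest score."""
    best = min(score(pod) for pod in pods)
    return [pod for pod in pods if score(pod) == best]


def chain_route(pods: List[Pod], strategies: List[str]) -> Pod:
    """Apply each strategy in order, stopping early once one pod remains."""
    if not pods:
        raise ValueError("no candidate pods")
    candidates = list(pods)
    for name in strategies:
        if len(candidates) == 1:
            break
        candidates = keep_best(candidates, ALGORITHMS[name])
    # If several pods are still tied after the whole chain, pick the first.
    return candidates[0]


pods = [
    Pod("pod-a", running_requests=2, kv_cache_usage=0.70),
    Pod("pod-b", running_requests=2, kv_cache_usage=0.30),
    Pod("pod-c", running_requests=5, kv_cache_usage=0.10),
]
chosen = chain_route(pods, ["least-request", "least-kv-cache"])
print(chosen.name)  # pod-b
```

Here least-request leaves pod-a and pod-b tied, and least-kv-cache breaks the tie in favor of pod-b. Exact-tie filtering is only one possible narrowing rule; an implementation might instead keep pods within a tolerance or a top-k set.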

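This design maintains full backward compatibility with existing routing configurations: a single strategy name keeps working as it does today, and a chain is requested through the same header:
curl -H "routing-strategy: least-request,least-kv-cache" ...

It also provides greater flexibility to tailor routing strategies to specific use cases.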
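
One way to keep single-strategy requests working unchanged is to treat the header value as a comma-separated list, so that a lone name becomes a chain of length one. The sketch below is an assumption about how such parsing might look, not existing AIBrix code: `parse_routing_strategy` and `KNOWN_STRATEGIES` are hypothetical, and only the two strategy names mentioned in this issue are listed.

```python
from typing import List

# Hypothetical registry; only the strategies named in this issue are listed.
KNOWN_STRATEGIES = {"least-request", "least-kv-cache"}


def parse_routing_strategy(header_value: str) -> List[str]:
    """Split a routing-strategy header value into an ordered chain."""
    names = [part.strip() for part in header_value.split(",")]
    names = [name for name in names if name]  # ignore empty segments
    if not names:
        raise ValueError("routing-strategy header is empty")
    unknown = [name for name in names if name not in KNOWN_STRATEGIES]
    if unknown:
        raise ValueError(f"unknown routing strategies: {unknown}")
    return names


# A single name yields a one-element chain, so existing requests are unaffected.
print(parse_routing_strategy("least-request"))
print(parse_routing_strategy("least-request, least-kv-cache"))
```

Rejecting unknown names up front is a design choice made for this sketch; falling back to a default strategy would be another option.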

Metadata

Labels

area/gateway, priority/critical-urgent
