Commit 83434a9

vercel-ai-sdk[bot] and shaper authored
Backport: feat (provider/gateway): add provider routing sort options (#14508)
This is an automated backport of #14311 to the release-v6.0 branch. FYI @shaper

---------

Co-authored-by: Walter Korman <shaper@vercel.com>
1 parent 3aa3a68 commit 83434a9

5 files changed

Lines changed: 74 additions & 0 deletions

.changeset/bright-crabs-float.md

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+---
+"@ai-sdk/gateway": patch
+---
+
+feat (provider/gateway): add sort options

content/providers/01-ai-sdk-providers/00-ai-gateway.mdx

Lines changed: 14 additions & 0 deletions
@@ -776,6 +776,20 @@ The following gateway provider options are available:
 
   Example: `only: ['anthropic', 'vertex']` will only allow routing to Anthropic or Vertex AI.
 
+- **sort** _'cost' | 'ttft' | 'tps'_
+
+  Sorts available providers by a performance or cost metric before routing. The gateway will try the best-scoring provider first and fall back through the rest in sorted order. If unspecified, providers are ordered using the gateway's default system ranking.
+
+  - `'cost'` — lowest cost first
+  - `'ttft'` — lowest time-to-first-token first
+  - `'tps'` — highest tokens-per-second first
+
+  When combined with `order`, the user-specified providers are promoted to the front while remaining providers follow the sorted order.
+
+  Example: `sort: 'ttft'` will route to the provider with the fastest time-to-first-token.
+
+  When `sort` is active, the response's `providerMetadata.gateway.routing.sort` object contains the sort option used, the resulting execution order, per-provider metric values, and any providers that were deprioritized.
+
 - **models** _string[]_
 
   Specifies fallback models to use when the primary model fails or is unavailable. The gateway will try the primary model first (specified in the `model` parameter), then try each model in this array in order until one succeeds.
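
Review note: the documented interaction between `sort` and `order` can be illustrated with a small dependency-free sketch. This is not the gateway's implementation. The `ProviderMetrics` shape, the `rankProviders` helper, and every metric value are hypothetical and exist only to show the described ordering: sort by the chosen metric, then promote the `order` entries to the front.

```typescript
// Hypothetical types for illustration; not part of @ai-sdk/gateway.
type SortOption = 'cost' | 'ttft' | 'tps';

interface ProviderMetrics {
  provider: string;
  cost: number; // lower is better
  ttft: number; // time-to-first-token, lower is better
  tps: number; // tokens per second, higher is better
}

// Assumed behavior: rank by metric, then move `order` entries to the front.
function rankProviders(
  candidates: ProviderMetrics[],
  sort: SortOption,
  order: string[] = [],
): string[] {
  // 'tps' ranks highest first; 'cost' and 'ttft' rank lowest first.
  const direction = sort === 'tps' ? -1 : 1;
  const sorted = candidates
    .slice()
    .sort((a, b) => direction * (a[sort] - b[sort]))
    .map(c => c.provider);

  // Keep each user-specified provider once, and only if it is available.
  const promoted = order.filter(
    (p, i) => order.indexOf(p) === i && sorted.indexOf(p) !== -1,
  );
  const rest = sorted.filter(p => promoted.indexOf(p) === -1);
  return promoted.concat(rest);
}

// Made-up metric values, purely for demonstration.
const candidates: ProviderMetrics[] = [
  { provider: 'bedrock', cost: 3.0, ttft: 420, tps: 80 },
  { provider: 'anthropic', cost: 3.5, ttft: 310, tps: 95 },
  { provider: 'vertex', cost: 2.8, ttft: 510, tps: 70 },
];

console.log(rankProviders(candidates, 'ttft'));
console.log(rankProviders(candidates, 'tps', ['vertex']));
```

With these sample values, `'ttft'` alone puts `anthropic` first, while `'tps'` combined with `order: ['vertex']` puts `vertex` first and leaves the remaining providers in tokens-per-second order.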
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+import type { GatewayProviderOptions } from '@ai-sdk/gateway';
+import { generateText } from 'ai';
+import { run } from '../../lib/run';
+
+run(async () => {
+  const { providerMetadata, text, usage } = await generateText({
+    model: 'openai/gpt-oss-120b',
+    prompt: 'Invent a new holiday and describe its traditions.',
+    providerOptions: {
+      gateway: {
+        sort: 'ttft',
+      } satisfies GatewayProviderOptions,
+    },
+  });
+
+  console.log(text);
+  console.log();
+  console.log('Usage:', usage);
+  console.log(JSON.stringify(providerMetadata, null, 2));
+});
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+import type { GatewayProviderOptions } from '@ai-sdk/gateway';
+import { streamText } from 'ai';
+import { run } from '../../lib/run';
+
+run(async () => {
+  const result = streamText({
+    model: 'openai/gpt-oss-120b',
+    prompt: 'Invent a new holiday and describe its traditions.',
+    providerOptions: {
+      gateway: {
+        sort: 'ttft',
+      } satisfies GatewayProviderOptions,
+    },
+  });
+
+  for await (const textPart of result.textStream) {
+    process.stdout.write(textPart);
+  }
+
+  console.log();
+  console.log('Token usage:', await result.usage);
+  console.log('Finish reason:', await result.finishReason);
+  console.log(
+    'Provider metadata:',
+    JSON.stringify(await result.providerMetadata, null, 2),
+  );
+});
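
Review note: both examples above print the whole `providerMetadata` object. To inspect only the sort details, a caller could walk the `gateway.routing.sort` path named in the docs. The sketch below assumes nothing about the fields inside that object, so it returns `unknown`. The `getRoutingSort` helper and the sample payload are hypothetical.

```typescript
// Narrow an unknown value to a plain object so its keys can be read safely.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Hypothetical helper: returns providerMetadata.gateway.routing.sort if the
// whole path exists, otherwise undefined. The inner shape is left untyped
// because the commit does not define it.
function getRoutingSort(providerMetadata: unknown): unknown {
  if (!isRecord(providerMetadata)) return undefined;
  const gateway = providerMetadata['gateway'];
  if (!isRecord(gateway)) return undefined;
  const routing = gateway['routing'];
  if (!isRecord(routing)) return undefined;
  return routing['sort'];
}

// Placeholder payload for demonstration only; real responses will differ.
const sample = { gateway: { routing: { sort: { placeholder: true } } } };
console.log(JSON.stringify(getRoutingSort(sample), null, 2));
console.log(getRoutingSort({ gateway: {} })); // path missing
```

Returning `unknown` forces callers to validate whatever fields they need instead of relying on a shape this commit does not specify.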

packages/gateway/src/gateway-provider-options.ts

Lines changed: 8 additions & 0 deletions
@@ -17,6 +17,14 @@ const gatewayProviderOptions = lazySchema(() =>
        * Example: `['bedrock', 'anthropic']` will try Amazon Bedrock first, then Anthropic as fallback.
        */
       order: z.array(z.string()).optional(),
+      /**
+       * Sort providers by a performance or cost metric before routing.
+       *
+       * - `'cost'`: lowest cost first
+       * - `'ttft'`: lowest time-to-first-token first
+       * - `'tps'`: highest tokens-per-second first
+       */
+      sort: z.enum(['cost', 'ttft', 'tps']).optional(),
       /**
        * The unique identifier for the end user on behalf of whom the request was made.
        *
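
Review note: the new schema field accepts exactly three string literals or `undefined`. The dependency-free sketch below mirrors that constraint without zod, as a rough picture of what `z.enum(['cost', 'ttft', 'tps']).optional()` permits. The names `SORT_OPTIONS`, `isGatewaySort`, and `parseSort` are hypothetical and not part of the package.

```typescript
// The three values allowed by the schema in this commit.
const SORT_OPTIONS = ['cost', 'ttft', 'tps'] as const;
type GatewaySort = (typeof SORT_OPTIONS)[number];

// Type guard: true only for one of the three allowed strings.
function isGatewaySort(value: unknown): value is GatewaySort {
  return (
    typeof value === 'string' &&
    (SORT_OPTIONS as readonly string[]).indexOf(value) !== -1
  );
}

// Mirrors the optional enum: undefined passes through, anything else
// must be one of the allowed values or an error is thrown.
function parseSort(value: unknown): GatewaySort | undefined {
  if (value === undefined) return undefined;
  if (isGatewaySort(value)) return value;
  throw new Error(`Invalid sort option: ${String(value)}`);
}

console.log(parseSort('ttft'));
console.log(parseSort(undefined));
```

A value such as `'latency'` would be rejected here, which matches the intent of restricting `sort` to the documented metrics.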
