fix(embeddings): respect token limits in EmbeddingsBuilder batching#1221

Open
godnight10061 wants to merge 1 commit into 0xPlaygrounds:main from godnight10061:fix-embeddingbuilder-token-batching

Conversation

godnight10061 commented Jan 5, 2026

Fixes #462

Summary

EmbeddingsBuilder::build() previously batched only by M::MAX_DOCUMENTS. For providers such as OpenAI embeddings, a request can also fail when the combined input exceeds the provider's per-request token budget.

Changes

  • Add EmbeddingModel::max_tokens_per_request() (default None) so providers can expose a per-request token budget.
  • Batch by both M::MAX_DOCUMENTS and max_tokens_per_request() (uses text.len() as a conservative proxy for tokens to avoid adding a tokenizer dependency).
  • Set OpenAI embedding models to Some(300_000) (based on the provider error in the issue).
  • Add a regression test covering token-budget batching.
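The dual-constraint batching described above can be sketched as follows. This is an illustrative standalone function, not the crate's actual implementation; the name `batch_by_limits` and its signature are hypothetical, and byte length (`text.len()`) stands in for tokens as a conservative proxy, as noted in the changes:

```rust
/// Greedily split `texts` into batches that respect both a maximum
/// document count and an optional per-request byte budget (a conservative
/// proxy for tokens that avoids a tokenizer dependency).
fn batch_by_limits(
    texts: Vec<String>,
    max_docs: usize,
    max_bytes: Option<usize>,
) -> Vec<Vec<String>> {
    let mut batches = Vec::new();
    let mut current: Vec<String> = Vec::new();
    let mut current_bytes = 0usize;

    for text in texts {
        let len = text.len();
        // Start a new batch when adding this text would exceed either limit.
        let over_docs = current.len() >= max_docs;
        let over_bytes = max_bytes
            .map_or(false, |budget| !current.is_empty() && current_bytes + len > budget);
        if over_docs || over_bytes {
            batches.push(std::mem::take(&mut current));
            current_bytes = 0;
        }
        current_bytes += len;
        current.push(text);
    }
    if !current.is_empty() {
        batches.push(current);
    }
    batches
}
```

Note that a single document larger than the byte budget still forms its own one-element batch (the budget check requires a non-empty current batch), so the builder never silently drops input; the oversized request is left to fail at the provider.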

Test

  • cargo test -p rig-core --lib



Development

Successfully merging this pull request may close these issues.

bug: EmbeddingBuilder::build() Exceeds OpenAI Token Limit Due to Lack of Token-Based Chunking
