Background
Follow-up to RIG-1315, which surfaces finish_reason and model_version on the Gemini streaming response. That PR is contained to the Gemini provider and unblocks Ryzome's canvas/thread path (which sees the Gemini-typed StreamedAssistantContent::Final(_response) directly).
This ticket covers the cross-provider piece: surfacing tool_use_prompt_tokens on the normalized crate::completion::Usage struct so it's reachable from the generic/Cortex path that consumes complete.usage.into() without provider-typed access.
Motivation
Gemini's usageMetadata.toolUsePromptTokenCount reports tokens spent re-sending tool/function declarations on each turn of an agentic loop. In multi-step tool-use workloads (Cortex), this can be a non-trivial portion of true input cost. Today it's deserialized into gemini::PartialUsage but not forwarded into the normalized Usage, so any consumer working off Usage understates input cost.
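Ryzome's two consumption paths:
- Canvas/thread path: sees the Gemini-typed StreamedAssistantContent::Final(_response). Can read usage_metadata.tool_use_prompt_token_count directly without this ticket (covered in the notes on #1774, feat(gemini): expose finish_reason and model_version on streaming StreamingCompletionResponse).
- Generic/Cortex path: sees only the normalized Usage via complete.usage.into(). Needs this change.
Cortex is the agentic surface where tool-use prompt tokens matter most, so fixing only the canvas path would instrument the workload where they matter least.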
Changes
crates/rig-core/src/completion/request.rs:395 — Usage struct
- Add pub tool_use_prompt_tokens: u64 with a doc comment noting that 0 means "not reported by this provider" (not "actually zero"). Anthropic, OpenAI, etc. will fall through to 0 because they bundle tool tokens into input_tokens.
- Update Usage::new() (line 413), the Add impl (line 431), and the AddAssign impl (line 447).
- Strongly consider adding #[non_exhaustive] to Usage in the same PR. Adding a field to a struct whose fields are all public is already a semver-breaking change (struct-literal construction and exhaustive destructuring break). Making it non-exhaustive now means future additions are non-breaking. Field access (usage.input_tokens) and complete.usage.into() keep working unchanged.
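The struct changes above can be sketched roughly as follows. This is a simplified stand-in, not the actual rig-core definition: only input_tokens and output_tokens are shown alongside the new field, and the real Usage struct may carry more fields and different derives.

```rust
use std::ops::{Add, AddAssign};

/// Simplified stand-in for `crate::completion::Usage` (assumption: the real
/// struct has additional fields that are omitted here).
#[non_exhaustive]
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct Usage {
    pub input_tokens: u64,
    pub output_tokens: u64,
    /// Tokens spent on tool/function declarations, when the provider reports
    /// them separately. `0` means "not reported by this provider", not
    /// necessarily that no tool tokens were used.
    pub tool_use_prompt_tokens: u64,
}

impl Usage {
    /// Every counter starts at zero, so providers that never set the new
    /// field fall through to `0`.
    pub fn new() -> Self {
        Self {
            input_tokens: 0,
            output_tokens: 0,
            tool_use_prompt_tokens: 0,
        }
    }
}

impl Add for Usage {
    type Output = Self;

    fn add(self, other: Self) -> Self {
        Self {
            input_tokens: self.input_tokens + other.input_tokens,
            output_tokens: self.output_tokens + other.output_tokens,
            tool_use_prompt_tokens: self.tool_use_prompt_tokens
                + other.tool_use_prompt_tokens,
        }
    }
}

impl AddAssign for Usage {
    fn add_assign(&mut self, other: Self) {
        self.input_tokens += other.input_tokens;
        self.output_tokens += other.output_tokens;
        self.tool_use_prompt_tokens += other.tool_use_prompt_tokens;
    }
}
```

Summing the field in Add and AddAssign keeps per-turn usage accumulation in an agentic loop consistent with the other counters.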
crates/rig-core/src/providers/gemini/streaming.rs:46 — GetTokenUsage for PartialUsage
- Populate usage.tool_use_prompt_tokens = self.tool_use_prompt_token_count.unwrap_or_default() as u64;.
- Same change in crates/rig-core/src/providers/gemini/interactions_api/streaming.rs if it has a parallel impl.
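A minimal sketch of the Gemini-side change, under stated assumptions: the trait signature, the Option<i32> field types, and the prompt_token_count field are guesses made for illustration, and both structs are trimmed stand-ins for the real rig-core types.

```rust
/// Trimmed stand-in for the normalized `Usage` struct.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct Usage {
    pub input_tokens: u64,
    pub tool_use_prompt_tokens: u64,
}

/// Assumed shape of the trait; the real signature may differ.
pub trait GetTokenUsage {
    fn token_usage(&self) -> Option<Usage>;
}

/// Trimmed stand-in for `gemini::PartialUsage`. The `Option<i32>` types are
/// an assumption about how the Gemini counts are deserialized.
#[derive(Debug, Default)]
pub struct PartialUsage {
    pub prompt_token_count: Option<i32>,
    pub tool_use_prompt_token_count: Option<i32>,
}

impl GetTokenUsage for PartialUsage {
    fn token_usage(&self) -> Option<Usage> {
        let mut usage = Usage::default();
        usage.input_tokens = self.prompt_token_count.unwrap_or_default() as u64;
        // A missing count becomes 0, matching the "0 = not reported" rule.
        usage.tool_use_prompt_tokens =
            self.tool_use_prompt_token_count.unwrap_or_default() as u64;
        Some(usage)
    }
}
```

If the source field really is a signed integer, note that `as u64` wraps a negative value to a very large number instead of clamping it to zero. Gemini is not expected to send negative counts, but a u64::try_from(...).unwrap_or(0) conversion would be the defensive alternative.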
Other providers
- No changes required. They fall through to 0 via Usage::new(). Document the "0 = not reported" semantics on the field doc comment so consumers don't misinterpret.
Acceptance criteria
- Usage has tool_use_prompt_tokens: u64, threaded through new(), Add, AddAssign.
- Gemini's GetTokenUsage for PartialUsage populates it from tool_use_prompt_token_count.
- #[non_exhaustive] decision made explicitly (recommended: yes).
- Doc comment on the new field clarifies 0-means-unreported semantics.
- Existing tests pass; add a Gemini test asserting the field is populated when toolUsePromptTokenCount is present.
Risk / semver note
Adding a field to a public struct without #[non_exhaustive] is a breaking change for downstream code that constructs Usage { input_tokens, output_tokens, ... } exhaustively or destructures it. Call this out in the PR description and changelog. If #[non_exhaustive] is added in the same PR, the break happens once: downstream crates can no longer build Usage with a struct literal and must add .. to destructuring patterns, but later field additions become non-breaking.
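For the changelog, the downstream patterns that survive both this change and later field additions can be sketched as below. The Usage stand-in is trimmed, and build_usage and input_side_tokens are hypothetical helper names. Because #[non_exhaustive] is only enforced across crate boundaries, this single-file sketch compiles either way; it shows the shape downstream code should take, not the compiler error.

```rust
/// Trimmed stand-in for the upstream struct after this change.
#[non_exhaustive]
#[derive(Debug, Default)]
pub struct Usage {
    pub input_tokens: u64,
    pub output_tokens: u64,
    pub tool_use_prompt_tokens: u64,
}

impl Usage {
    pub fn new() -> Self {
        Self::default()
    }
}

/// Hypothetical downstream helper: construct through `new()` and assign
/// fields, instead of writing a struct literal that names every field.
pub fn build_usage(input: u64, output: u64) -> Usage {
    let mut usage = Usage::new();
    usage.input_tokens = input;
    usage.output_tokens = output;
    usage
}

/// Hypothetical downstream helper: destructure with `..` so fields added
/// later do not break the pattern. Summing assumes tool-use tokens are not
/// already counted in `input_tokens`, as this ticket describes for Gemini.
pub fn input_side_tokens(usage: &Usage) -> u64 {
    let Usage {
        input_tokens,
        tool_use_prompt_tokens,
        ..
    } = usage;
    input_tokens + tool_use_prompt_tokens
}
```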