
feat(inference): propagate STT extra to SpeechData.metadata#1389

Open
toubatbrian wants to merge 3 commits into main from claude/quirky-galileo-hlLhh

@toubatbrian toubatbrian commented May 4, 2026

Summary

Port of livekit/agents#5639 ("feat(inference): propagate STT extra to SpeechData.metadata").

This plumbs the inference gateway's per-transcript extra field onto SpeechData.metadata, unblocking consumers of provider-specific STT signals. The most immediate is Inworld STT 1's voice profile (age, emotion, pitch, vocal style, accent, gender), which the gateway emits under extra["voice_profile"]. xAI STT already emits extra["speech_final"], and that value will also surface on SpeechData.metadata once this lands. The change is generic, not provider-specific.

Ported changes

  1. agents/src/stt/stt.ts — Add an optional metadata?: Record<string, unknown> field to the SpeechData interface, mirroring the new metadata: dict[str, Any] | None field on the Python SpeechData dataclass.

  2. agents/src/inference/stt.ts — In processTranscript, extract data.extra (already part of the sttInterimTranscriptEventSchema / sttFinalTranscriptEventSchema zod schemas as z.unknown().nullable().optional()) and surface it on SpeechData.metadata whenever it is a non-empty object.
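As a rough illustration of item 1, the interface change might look like the sketch below. Only the optional `metadata` field comes from this PR; the other fields (`language`, `text`, `confidence`) are assumed here for context and may not match the real `SpeechData` definition.

```typescript
// Sketch only: `language`, `text`, and `confidence` are assumed fields for
// illustration. `metadata` is the optional field this PR describes.
interface SpeechData {
  language: string;
  text: string;
  confidence: number;
  metadata?: Record<string, unknown>;
}

// Producers that never set `metadata` still type-check, since the field is optional.
const withoutMetadata: SpeechData = { language: 'en', text: 'hello', confidence: 0.9 };

// A producer that forwards a provider-specific payload.
const withMetadata: SpeechData = {
  language: 'en',
  text: 'hello',
  confidence: 0.9,
  metadata: { speech_final: true },
};
```

Because the field is optional, existing STT plugins that build `SpeechData` objects should not need changes.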

Implementation nuances

The Python upstream is a 5-line change:

extra = data.get("extra")
metadata = extra if isinstance(extra, dict) and extra else None
return stt.SpeechData(..., metadata=metadata)

The JS port is a few lines longer because the zod schemas type extra as z.unknown().nullable().optional(), whereas Python's dict.get returns an untyped value. The guard therefore has to narrow the unknown value at runtime before TypeScript will accept it as a Record<string, unknown>:

const extra = data.extra;
const metadata =
  extra &&
  typeof extra === 'object' &&
  !Array.isArray(extra) &&
  Object.keys(extra as Record<string, unknown>).length > 0
    ? (extra as Record<string, unknown>)
    : undefined;

Behavior matches the Python semantics 1:1: null, non-objects, arrays, and empty objects all collapse to undefined (Python's None); only non-empty plain object payloads are forwarded.
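To make the collapse rules concrete, here is a minimal sketch that wraps the same guard in a standalone function. The name `extractMetadata` is hypothetical; the PR inlines this logic in processTranscript rather than exposing a helper.

```typescript
// Hypothetical helper for illustration; the PR performs this check inline.
function extractMetadata(extra: unknown): Record<string, unknown> | undefined {
  if (
    extra && // rejects null, undefined, and other falsy values
    typeof extra === 'object' && // rejects strings, numbers, booleans
    !Array.isArray(extra) && // rejects arrays, which are also typeof 'object'
    Object.keys(extra).length > 0 // rejects empty objects
  ) {
    return extra as Record<string, unknown>;
  }
  return undefined;
}

// Every value here should collapse to undefined.
const dropped: unknown[] = [null, undefined, 'text', 42, [], [1], {}];
const allDropped = dropped.every((value) => extractMetadata(value) === undefined);

// A non-empty plain object is forwarded unchanged (same reference).
const payload = { voice_profile: { accent: 'en-GB' } };
const kept = extractMetadata(payload);
```

Here `allDropped` is `true` and `kept` is the same object as `payload`, which mirrors Python's `isinstance(extra, dict) and extra` check.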

The SpeechData interface field uses Record<string, unknown> rather than Record<string, any> because the repo's strict ESLint rules ban any.

Test plan

  • pnpm build:agents passes
  • pnpm test -- --run agents/src/inference/stt passes (43/43)
  • pnpm lint produces no new warnings on agents/src/inference/stt.ts or agents/src/stt/stt.ts
  • pnpm format:write clean on the touched files
  • Patch changeset added under .changeset/inference-stt-metadata.md
  • Manual: run an end-to-end session with Inworld STT through the inference gateway and confirm that alt.metadata["voice_profile"] is populated on FINAL_TRANSCRIPT events
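For the manual check, a consumer-side read could look roughly like the sketch below. The types `SpeechDataLike` and `SpeechEventLike`, the string event tags, and the function `readVoiceProfile` are all simplified assumptions for illustration, not the library's actual API. Since metadata values are typed `unknown`, the consumer has to narrow them before use.

```typescript
// Assumed, simplified shapes for illustration; not the library's real types.
interface SpeechDataLike {
  text: string;
  metadata?: Record<string, unknown>;
}

interface SpeechEventLike {
  type: 'INTERIM_TRANSCRIPT' | 'FINAL_TRANSCRIPT';
  alternatives: SpeechDataLike[];
}

// Returns the voice profile from the first alternative of a final transcript,
// or undefined when the event is not final or carries no usable profile.
function readVoiceProfile(event: SpeechEventLike): Record<string, unknown> | undefined {
  if (event.type !== 'FINAL_TRANSCRIPT') return undefined;
  const profile = event.alternatives[0]?.metadata?.['voice_profile'];
  // `profile` is unknown, so narrow it to a plain object before returning.
  return profile && typeof profile === 'object' && !Array.isArray(profile)
    ? (profile as Record<string, unknown>)
    : undefined;
}

const finalEvent: SpeechEventLike = {
  type: 'FINAL_TRANSCRIPT',
  alternatives: [{ text: 'hello', metadata: { voice_profile: { emotion: 'calm' } } }],
};
const interimEvent: SpeechEventLike = {
  type: 'INTERIM_TRANSCRIPT',
  alternatives: [{ text: 'hel' }],
};

const finalProfile = readVoiceProfile(finalEvent);
const interimProfile = readVoiceProfile(interimEvent);
```

The keys inside the profile (such as `emotion` above) are placeholders; the actual payload layout is whatever the gateway emits under extra["voice_profile"].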

🤖 This PR was opened by an automated Claude Code routine that ports merged PRs from livekit/agents (Python) to livekit/agents-js. The routine is in experimentation — please flag anything that looks off.

cc @toubatbrian @livekit/agent-devs

🤖 Generated with Claude Code



Port of livekit/agents#5639. Adds an optional `metadata` field on
`SpeechData` and plumbs the inference gateway's per-transcript `extra`
field through to it, exposing provider-specific signals (e.g. Inworld
voice profile, xAI speech_final) to downstream consumers.

CLAassistant commented May 4, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ toubatbrian
❌ claude


changeset-bot Bot commented May 4, 2026

🦋 Changeset detected

Latest commit: 1c4ba58

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 31 packages
Name Type
@livekit/agents Patch
@livekit/agents-plugin-anam Patch
@livekit/agents-plugin-assemblyai Patch
@livekit/agents-plugin-baseten Patch
@livekit/agents-plugin-bey Patch
@livekit/agents-plugin-cartesia Patch
@livekit/agents-plugin-cerebras Patch
@livekit/agents-plugin-deepgram Patch
@livekit/agents-plugin-elevenlabs Patch
@livekit/agents-plugin-fishaudio Patch
@livekit/agents-plugin-google Patch
@livekit/agents-plugin-hedra Patch
@livekit/agents-plugin-hume Patch
@livekit/agents-plugin-inworld Patch
@livekit/agents-plugin-lemonslice Patch
@livekit/agents-plugin-liveavatar Patch
@livekit/agents-plugin-livekit Patch
@livekit/agents-plugin-minimax Patch
@livekit/agents-plugin-mistral Patch
@livekit/agents-plugin-mistralai Patch
@livekit/agents-plugin-neuphonic Patch
@livekit/agents-plugin-openai Patch
@livekit/agents-plugin-phonic Patch
@livekit/agents-plugin-resemble Patch
@livekit/agents-plugin-rime Patch
@livekit/agents-plugin-runway Patch
@livekit/agents-plugin-sarvam Patch
@livekit/agents-plugin-silero Patch
@livekit/agents-plugins-test Patch
@livekit/agents-plugin-trugen Patch
@livekit/agents-plugin-xai Patch

@devin-ai-integration devin-ai-integration Bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 2 additional findings.



4 participants