Stream Rime TTS response chunks #1405

Open
chasef07 wants to merge 1 commit into livekit:main from chasef07:rime-tts-streaming-fix

@chasef07 chasef07 commented May 6, 2026

Summary

  • stream Rime PCM response chunks from response.body instead of waiting for the full response.arrayBuffer()
  • keep the existing one-frame deferral behavior so only the last emitted frame is marked final: true
  • flush a non-empty trailing PCM remainder after the HTTP stream ends
  • add a regression test that proves the Rime plugin emits audio before the mocked HTTP body closes

Context

The Rime JS TTS plugin was buffering the whole HTTP response before writing any audio into the LiveKit audio queue. Rime can flush PCM as synthesis progresses, but await response.arrayBuffer() means the agent sees no audio until the provider response has completely finished. For longer utterances, app-observed TTS TTFB therefore grows with total synthesis duration instead of reflecting the provider's first available audio chunk.

This brings the JS plugin behavior in line with the intended streaming path: consume the response body reader incrementally, pass each received PCM chunk through AudioByteStream, and enqueue complete 100ms AudioFrames as soon as they are available.
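The streaming path described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the plugin's code: `FrameChunker` is a hypothetical stand-in for AudioByteStream, `streamFrames` is an invented name, and the 3200-byte frame size assumes 100 ms of 16 kHz mono 16-bit PCM.

```typescript
// Hypothetical stand-in for AudioByteStream: groups bytes into fixed-size frames.
class FrameChunker {
  private pending = new Uint8Array(0);

  constructor(private readonly bytesPerFrame: number) {}

  // Accepts a chunk of any size and returns every complete frame now available.
  write(chunk: Uint8Array): Uint8Array[] {
    const merged = new Uint8Array(this.pending.length + chunk.length);
    merged.set(this.pending, 0);
    merged.set(chunk, this.pending.length);

    const frames: Uint8Array[] = [];
    let offset = 0;
    while (merged.length - offset >= this.bytesPerFrame) {
      frames.push(merged.slice(offset, offset + this.bytesPerFrame));
      offset += this.bytesPerFrame;
    }
    this.pending = merged.slice(offset);
    return frames;
  }

  // Returns the buffered remainder (possibly empty) and clears it.
  flush(): Uint8Array {
    const rest = this.pending;
    this.pending = new Uint8Array(0);
    return rest;
  }
}

// Yields frames as soon as enough bytes have arrived, instead of waiting for EOF.
async function* streamFrames(
  body: ReadableStream<Uint8Array> | null,
  bytesPerFrame: number,
): AsyncGenerator<Uint8Array> {
  if (!body) throw new Error('response has no body');
  const chunker = new FrameChunker(bytesPerFrame);
  const reader = body.getReader();
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      for (const frame of chunker.write(value)) yield frame;
    }
    // Drain a trailing partial frame; skip it when nothing is buffered.
    const rest = chunker.flush();
    if (rest.length > 0) yield rest;
  } finally {
    reader.releaseLock();
  }
}
```

With `response.arrayBuffer()`, the first frame could only be produced after the last byte arrived; with a loop of this shape, the first frame is available once the first `bytesPerFrame` bytes have been read.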

Implementation notes

  • response.body is checked before reading so a missing body fails explicitly.
  • ReadableStream.getReader() is released in finally, and the queue is closed in the same cleanup path.
  • AudioByteStream.write() already accepts ArrayBufferView and honors byteOffset / byteLength, so the streamed Uint8Array chunks can be passed directly without copying into a sliced ArrayBuffer.
  • AudioByteStream.flush() is still called after EOF to drain a valid partial-frame remainder; empty flush frames are skipped.
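The byteOffset / byteLength point can be illustrated in isolation. The sketch below does not use AudioByteStream; `asInt16Samples` is a hypothetical helper that shows why a consumer has to honor the view window of a streamed Uint8Array rather than read its whole backing buffer. It assumes the offset is 2-byte aligned and the length is even.

```typescript
// Reinterpret a byte view as 16-bit PCM samples without copying.
// Assumption: view.byteOffset is 2-byte aligned and view.byteLength is even.
function asInt16Samples(view: ArrayBufferView): Int16Array {
  return new Int16Array(view.buffer, view.byteOffset, view.byteLength / 2);
}

// A 16-byte backing buffer holding samples 0..7.
const backing = new ArrayBuffer(16);
new Int16Array(backing).set([0, 1, 2, 3, 4, 5, 6, 7]);

// A chunk that only covers bytes 4..11 of the backing buffer.
const chunk = new Uint8Array(backing, 4, 8);

const windowed = asInt16Samples(chunk); // samples 2, 3, 4, 5
const naive = new Int16Array(chunk.buffer); // all 8 samples: ignores the window
```

A consumer that reads `chunk.buffer` directly would see bytes outside the chunk, which is why copying into a sliced ArrayBuffer is only needed when the consumer ignores the view's offset and length.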

Validation

  • pnpm test -- plugins/rime/src/tts.test.ts
  • pnpm --filter @livekit/agents-plugin-rime lint
  • pnpm format:check
  • pnpm turbo run build --filter=@livekit/agents-plugin-rime
  • git diff --check

changeset-bot Bot commented May 6, 2026

🦋 Changeset detected

Latest commit: 785a869

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 29 packages
Name Type
@livekit/agents-plugin-rime Patch
@livekit/agents Patch
@livekit/agents-plugin-anam Patch
@livekit/agents-plugin-assemblyai Patch
@livekit/agents-plugin-baseten Patch
@livekit/agents-plugin-bey Patch
@livekit/agents-plugin-cartesia Patch
@livekit/agents-plugin-cerebras Patch
@livekit/agents-plugin-deepgram Patch
@livekit/agents-plugin-elevenlabs Patch
@livekit/agents-plugin-google Patch
@livekit/agents-plugin-hedra Patch
@livekit/agents-plugin-inworld Patch
@livekit/agents-plugin-lemonslice Patch
@livekit/agents-plugin-liveavatar Patch
@livekit/agents-plugin-livekit Patch
@livekit/agents-plugin-minimax Patch
@livekit/agents-plugin-mistral Patch
@livekit/agents-plugin-mistralai Patch
@livekit/agents-plugin-neuphonic Patch
@livekit/agents-plugin-openai Patch
@livekit/agents-plugin-phonic Patch
@livekit/agents-plugin-resemble Patch
@livekit/agents-plugin-runway Patch
@livekit/agents-plugin-sarvam Patch
@livekit/agents-plugin-silero Patch
@livekit/agents-plugin-trugen Patch
@livekit/agents-plugin-xai Patch
@livekit/agents-plugins-test Patch

@devin-ai-integration devin-ai-integration Bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 2 additional findings.

@adrian-cowham adrian-cowham requested a review from toubatbrian May 6, 2026 16:27

@adrian-cowham adrian-cowham left a comment

PR Review Summary

Reviewed by: code-reviewer, silent-failure-hunter, pr-test-analyzer, code-simplifier

Critical Issues

None.

Important Issues

None.

Suggestions

[pr-test-analyzer] Cross-chunk frame accumulation not exercised (severity 7/10) — plugins/rime/src/tts.test.ts
The test sends two chunks of exactly bytesPerFrame (3200 bytes each), so each independently yields one frame. In production, HTTP chunk boundaries are arbitrary and will rarely align. A test with misaligned chunks (e.g., pcmChunk(1600), pcmChunk(1600), pcmChunk(3200)) would exercise byte accumulation across reads — the real behavioral change this PR introduces.

[pr-test-analyzer] audioByteStream.flush() trailing-frame path not exercised (severity 6/10) — plugins/rime/src/tts.test.ts
With aligned chunks the internal buffer is empty at end-of-stream, so flush() returns a zero-sample frame that gets skipped. Sending pcmChunk(3200 + 1600) followed by bodyController.close() would force the flush path to emit a trailing partial frame — a common scenario when Rime's server sends a final unaligned chunk.
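The chunk-size arithmetic behind the two suggestions above can be checked with a small model. `framesPerRead` is a hypothetical helper, not part of the test file; it only counts bytes and assumes fixed-size frames of `bytesPerFrame` bytes.

```typescript
// Models how many complete frames become available after each read,
// and how many bytes are left for flush() at end of stream.
function framesPerRead(
  chunkSizes: number[],
  bytesPerFrame: number,
): { emitted: number[]; remainder: number } {
  let buffered = 0;
  const emitted: number[] = [];
  for (const size of chunkSizes) {
    buffered += size;
    const frames = Math.floor(buffered / bytesPerFrame);
    emitted.push(frames);
    buffered -= frames * bytesPerFrame;
  }
  return { emitted, remainder: buffered };
}

const aligned = framesPerRead([3200, 3200], 3200);
// → { emitted: [1, 1], remainder: 0 }  (no accumulation, empty flush)

const misaligned = framesPerRead([1600, 1600, 3200], 3200);
// → { emitted: [0, 1, 1], remainder: 0 }  (first frame spans two reads)

const trailing = framesPerRead([3200 + 1600], 3200);
// → { emitted: [1], remainder: 1600 }  (flush must emit a partial frame)
```

Only the second and third inputs reach the cross-read accumulation and non-empty flush paths, which is the gap the suggestions point at.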

[pr-test-analyzer] Reader release on mid-stream error not tested (severity 5/10) — plugins/rime/src/tts.test.ts
A test calling bodyController.error(new Error('network failure')) after enqueuing one chunk would verify that the finally cleanup (reader lock release + queue close) fires correctly on network errors mid-stream.
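A minimal sketch of what such a check might look like, written against plain web streams rather than the plugin. `drain` and `demo` are hypothetical names, and `onCleanup` stands in for the queue close in the real cleanup path. One detail worth noting: erroring a stream discards chunks that are still queued, so the sketch lets the first chunk be read before calling `error()`.

```typescript
// Reads a body to completion; cleanup runs on success and on error alike.
async function drain(
  body: ReadableStream<Uint8Array>,
  onChunk: (chunk: Uint8Array) => void,
  onCleanup: () => void,
): Promise<void> {
  const reader = body.getReader();
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      onChunk(value);
    }
  } finally {
    reader.releaseLock();
    onCleanup(); // stand-in for closing the audio queue
  }
}

async function demo() {
  let controller!: ReadableStreamDefaultController<Uint8Array>;
  const body = new ReadableStream<Uint8Array>({
    start(c) {
      controller = c;
    },
  });

  const sizes: number[] = [];
  let cleaned = false;
  const pending = drain(body, (c) => sizes.push(c.length), () => {
    cleaned = true;
  });

  controller.enqueue(new Uint8Array(3200));
  // Give the loop a turn to consume the chunk before the failure.
  await new Promise((resolve) => setTimeout(resolve, 0));
  controller.error(new Error('network failure'));

  let message = '';
  try {
    await pending;
  } catch (err) {
    message = (err as Error).message;
  }
  return { sizes, cleaned, message, locked: body.locked };
}
```

Because there is no catch block in `drain`, the error still reaches the caller after cleanup has run.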

[code-simplifier] pcmChunk helper fill loop is unnecessary (minor) — plugins/rime/src/tts.test.ts
The even-byte fill produces non-silent PCM but no assertion inspects sample content. return new Uint8Array(byteLength) (silence) exercises the same code paths.

Strengths

  • Correct streaming conversion: the lastFrame deferred-emit pattern preserves final: false / final: true / done: true sequencing identically to the pre-PR behavior
  • Solid resource cleanup: reader.releaseLock() + queue.close() in finally ensures cleanup on both success and error paths; the intentional lack of a catch block lets errors propagate to the base class retry/emit machinery
  • Proper null-body guard: the !response.body early throw is consistent with the baseten, inworld, and minimax plugins
  • Well-designed regression test: uses a real ReadableStream with manual controller (not over-mocked), validates audio arrives before body closes via withTimeout, and runs unconditionally in CI without external credentials
  • Consistent with codebase patterns: the streaming loop mirrors the existing baseten TTS plugin almost exactly
  • flush() zero-sample guard is load-bearing: AudioByteStream.flush() does not short-circuit when the buffer is empty, so the samplesPerChannel === 0 filter is necessary
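For readers unfamiliar with the deferred-emit pattern, here is a minimal sketch of how it can interact with the zero-sample guard. The `Frame` and `QueueItem` shapes and the `deferLast` name are assumptions made for illustration; the plugin's actual types and control flow may differ.

```typescript
// Hypothetical minimal shapes; the real AudioFrame carries more fields.
interface Frame {
  samplesPerChannel: number;
}
interface QueueItem<F> {
  frame: F;
  final: boolean;
}

// Holds each frame back by one step so only the last one is marked final.
function* deferLast<F extends Frame>(frames: Iterable<F>): Generator<QueueItem<F>> {
  let held: F | undefined;
  for (const frame of frames) {
    if (frame.samplesPerChannel === 0) continue; // skip empty flush output
    if (held !== undefined) yield { frame: held, final: false };
    held = frame;
  }
  if (held !== undefined) yield { frame: held, final: true };
}

// Two real frames followed by an empty frame from a flush on an empty buffer.
const items = Array.from(
  deferLast([{ samplesPerChannel: 1600 }, { samplesPerChannel: 1600 }, { samplesPerChannel: 0 }]),
);
const finals = items.map((item) => item.final); // → [false, true]
```

In this model, dropping the zero-sample check would let the empty frame take the `final: true` slot, which is one way to see why the guard matters.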

Looks good overall. The suggestions above are optional hardening for test coverage — none are blockers.

🤖 Reviewed with Claude Code
