fix(voice): prevent scheduling deadlock when SpeechHandle._markDone re-enters by toubatbrian · Pull Request #1422 · livekit/agents-js

toubatbrian · 2026-05-08T01:54:42Z

Summary

Automated port of livekit/agents#5678 (fix(voice): prevent scheduling deadlock when pipeline task crashes) into agents-js.

Note

This is an automated Claude Code Routine created by @toubatbrian. It is currently experimental.

cc @toubatbrian @livekit/agent-devs for review.

What was broken

SpeechHandle._markDone() short-circuits when doneFut is already resolved, but the call to _markGenerationDone() was nested inside the same if (!doneFut.done) guard:

// Pre-fix (agents/src/voice/speech_handle.ts)
_markDone(): void {
  if (!this.doneFut.done) {
    this.doneFut.resolve();
    if (this.generations.length > 0) {
      this._markGenerationDone(); // preemptive generation could be cancelled before being scheduled
    }
  }
}

As a result, a second _markDone() call made after doneFut was already settled did nothing. This happens, for example, when the force-interrupt shutdown path in AgentActivity.interrupt() runs first and the pipeline reply task later tries to clean up. If a generation had been authorized in the interim, its future was never resolved. mainTask then hangs forever on _waitForGeneration(), starving every subsequent speech handle. This is a real production failure mode that already had a workaround comment at agent_activity.ts:1709-1712: "the generation future created by _authorizeGeneration would never resolve since _markDone is a no-op once doneFut is already settled".
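The re-entry sequence can be reproduced with a small standalone model. This is only a sketch: `Future`, `ModelSpeechHandle`, and `pendingAfterReentry` are hypothetical stand-ins written for illustration, not the agents-js classes, and the `fixed` flag simply switches between the pre-fix and post-fix shape of `_markDone()`.

```typescript
// Hypothetical minimal future: tracks a `done` flag and exposes a promise.
class Future {
  done = false;
  readonly promise: Promise<void>;
  private resolveFn: () => void = () => {};

  constructor() {
    this.promise = new Promise<void>((resolve) => {
      this.resolveFn = resolve;
    });
  }

  resolve(): void {
    if (this.done) return;
    this.done = true;
    this.resolveFn();
  }
}

// Hypothetical model of the handle; only the parts relevant to the bug.
class ModelSpeechHandle {
  readonly doneFut = new Future();
  readonly generations: Future[] = [];
  private readonly fixed: boolean;

  constructor(fixed: boolean) {
    this.fixed = fixed;
  }

  authorizeGeneration(): void {
    this.generations.push(new Future());
  }

  markGenerationDone(): void {
    const last = this.generations[this.generations.length - 1];
    if (last !== undefined && !last.done) last.resolve();
  }

  markDone(): void {
    if (this.fixed) {
      // Post-fix shape: generation cleanup runs regardless of doneFut state.
      if (!this.doneFut.done) this.doneFut.resolve();
      if (this.generations.length > 0) this.markGenerationDone();
    } else if (!this.doneFut.done) {
      // Pre-fix shape: generation cleanup is nested inside the guard.
      this.doneFut.resolve();
      if (this.generations.length > 0) this.markGenerationDone();
    }
  }
}

// Returns true if a generation future is still pending after the re-entry.
function pendingAfterReentry(fixed: boolean): boolean {
  const handle = new ModelSpeechHandle(fixed);
  handle.markDone(); // settles doneFut; no generations yet
  handle.authorizeGeneration(); // generation authorized in the interim
  handle.markDone(); // second call: the case under discussion
  return handle.generations.some((g) => !g.done);
}

console.log(pendingAfterReentry(false)); // true: anything awaiting it would hang
console.log(pendingAfterReentry(true)); // false
```

In the model, a pending generation after the second call corresponds to the state in which a waiter on `_waitForGeneration()` would never be released.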

This is the JS twin of the Python bug. In Python, _done_fut.set_result(None) raises asyncio.InvalidStateError when the future is already done; contextlib.suppress(asyncio.InvalidStateError) swallows that exception, but the rest of the with block is still skipped. Here the same skip comes from the explicit if (!doneFut.done) guard. Same root cause, different idiom.
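To make the "same root cause, different idiom" point concrete, here is a TypeScript analogue of both shapes. It is an illustrative sketch only: `InvalidStateError`, `StrictFuture`, `markDoneSuppressed`, `markDoneGuarded`, and `countCleanupCalls` are hypothetical names, and the try/catch merely imitates what a suppressed exception does to the remainder of its block.

```typescript
// Hypothetical error type standing in for Python's asyncio.InvalidStateError.
class InvalidStateError extends Error {}

// Hypothetical future whose setResult throws if a result was already set.
class StrictFuture {
  done = false;

  setResult(): void {
    if (this.done) throw new InvalidStateError('result already set');
    this.done = true;
  }
}

type MarkDone = (doneFut: StrictFuture, markGenerationDone: () => void) => void;

// Exception-suppressing idiom: the throw is swallowed, but everything after
// setResult() inside the try block is skipped along with it.
const markDoneSuppressed: MarkDone = (doneFut, markGenerationDone) => {
  try {
    doneFut.setResult();
    markGenerationDone();
  } catch (err) {
    if (!(err instanceof InvalidStateError)) throw err;
  }
};

// Explicit-guard idiom: the same cleanup is skipped by the if statement.
const markDoneGuarded: MarkDone = (doneFut, markGenerationDone) => {
  if (!doneFut.done) {
    doneFut.setResult();
    markGenerationDone();
  }
};

// Counts how often generation cleanup runs when doneFut was settled earlier.
function countCleanupCalls(markDone: MarkDone): number {
  const doneFut = new StrictFuture();
  doneFut.setResult(); // settled earlier, e.g. by an interrupt
  let calls = 0;
  markDone(doneFut, () => {
    calls += 1;
  });
  return calls;
}

console.log(countCleanupCalls(markDoneSuppressed)); // 0
console.log(countCleanupCalls(markDoneGuarded)); // 0
```

Both variants skip the generation cleanup once the future is already settled, which is why the same structural change applies in either language.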

Fix

Move _markGenerationDone() out of the if (!doneFut.done) guard so it always runs when a generation is pending:

_markDone(): void {
  if (!this.doneFut.done) {
    this.doneFut.resolve();
  }

  // Ref: python livekit-agents/livekit/agents/voice/speech_handle.py - _mark_done
  // Must be outside the doneFut.done guard: if doneFut is already resolved
  // (e.g. interrupted before _markDone is called), the guarded block is
  // skipped and the generation future would be left unresolved, leaving
  // _waitForGeneration stuck and starving subsequent speech handles.
  if (this.generations.length > 0) {
    this._markGenerationDone();
  }
}

_markGenerationDone() is already idempotent (it only resolves the last generation if !lastGeneration.done), so calling it on a re-entry where the generation was already resolved is a no-op.
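The idempotency claim can be checked in isolation with a counting stand-in. This is a sketch under assumptions: `CountingFuture`, the free function `markGenerationDone`, and `resolveCallsAfter` are hypothetical and only mirror the "resolve the last generation if it is not done" rule described above.

```typescript
// Hypothetical future that counts how many times resolve() is invoked.
class CountingFuture {
  done = false;
  resolveCalls = 0;

  resolve(): void {
    this.resolveCalls += 1;
    this.done = true;
  }
}

// Resolves only the last generation, and only if it is still pending.
function markGenerationDone(generations: CountingFuture[]): void {
  const last = generations[generations.length - 1];
  if (last !== undefined && !last.done) last.resolve();
}

// Calls markGenerationDone `times` times and reports the resolve() count.
function resolveCallsAfter(times: number): number {
  const generation = new CountingFuture();
  const generations = [generation];
  for (let i = 0; i < times; i += 1) {
    markGenerationDone(generations);
  }
  return generation.resolveCalls;
}

console.log(resolveCallsAfter(1)); // 1
console.log(resolveCallsAfter(3)); // 1
```

Because repeated calls do not resolve twice, moving the call outside the guard adds no risk on paths where the generation was already resolved.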

Implementation nuances vs. Python

The Python diff (+6 / -3 in speech_handle.py) also re-orders the _interrupt_timeout_handle.cancel() call relative to the suppressed block. The JS SpeechHandle doesn't carry an _interruptTimeoutHandle field at all (no equivalent timeout-driven cancellation logic exists yet in agents-js), so that part of the Python change has no JS counterpart and is intentionally not ported. The behavioral fix, always running _markGenerationDone() when there is a pending generation, is identical in both languages.

The Python PR did not include a regression test. Because the JS bug pattern (re-entering _markDone after doneFut is settled, then authorizing/finishing a generation) is easy to reproduce deterministically with the existing public-internal API, this PR adds a single regression test in agents/src/voice/speech_handle.test.ts:

it('resolves a pending generation when called after doneFut is already done', async () => {
  const handle = SpeechHandle.create();
  handle._markDone();                // resolves doneFut (no generations yet)
  handle._authorizeGeneration();     // generation authorized after handle is "done"
  handle._markDone();                // pre-fix: short-circuit, generation never resolves

  const outcome = await raceTimeout(handle._waitForGeneration(), 500);
  expect(outcome).toBe('resolved');
});

Verified: this test times out on the pre-fix code (Expected: "resolved", Received: "timeout") and passes on the fix.
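The test relies on a raceTimeout helper that is not shown in this description. One plausible shape, inferred only from the 'resolved' and 'timeout' outcomes quoted above, is sketched below; the signature and the `RaceOutcome` type are assumptions, and the helper in the repository may differ.

```typescript
type RaceOutcome = 'resolved' | 'timeout';

// Assumed helper: reports whether `promise` settles successfully within `ms`.
async function raceTimeout(promise: Promise<unknown>, ms: number): Promise<RaceOutcome> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<RaceOutcome>((resolve) => {
    timer = setTimeout(() => resolve('timeout'), ms);
  });
  try {
    return await Promise.race([promise.then((): RaceOutcome => 'resolved'), timeout]);
  } finally {
    // Clear the timer so a fast resolution does not keep the process alive.
    if (timer !== undefined) clearTimeout(timer);
  }
}

// A promise that never settles models the pre-fix generation future.
raceTimeout(new Promise<void>(() => {}), 20).then((outcome) => {
  console.log(outcome); // timeout
});
```

Mapping a hang to a 'timeout' value lets the regression test fail with a readable assertion message rather than exceeding the test runner's own timeout.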

Files changed

agents/src/voice/speech_handle.ts — move _markGenerationDone() out of the if (!doneFut.done) guard.
agents/src/voice/speech_handle.test.ts — add SpeechHandle._markDone - generation resolution after early done regression suite.
.changeset/fix-speech-handle-mark-done-deadlock.md — patch for @livekit/agents.

Test plan

pnpm --filter @livekit/agents build — passes.
pnpm exec prettier --check on changed files — passes.
pnpm --filter @livekit/agents lint — no new errors / warnings on touched files.
pnpm exec vitest run agents/src/voice/speech_handle.test.ts — 10/10 pass (9 existing + 1 new).
pnpm exec vitest run agents/src/voice/ — 169/169 pass (full voice suite green).
Confirmed regression: reverting the speech_handle.ts change makes the new test fail with a 500 ms timeout, then re-applying restores it to green.
Manual smoke against a force-interrupt shutdown (left to reviewer with a real session).

Changeset

patch for @livekit/agents (per the routine's standing instructions).

Generated by Claude Code

Move the `_markGenerationDone()` call in `SpeechHandle._markDone()` outside the `if (!doneFut.done)` guard so a pending generation future is always resolved, even when `doneFut` was already settled. Previously, a second `_markDone()` would short-circuit and leave the generation unresolved, causing `mainTask` to hang on `_waitForGeneration()` and starve subsequent speech handles. Adds a regression test (`SpeechHandle._markDone - generation resolution after early done`) that fails on the pre-fix code (timeout) and passes on the fix. Ports livekit/agents#5678. https://claude.ai/code/session_01NQyfoU5f21EryyRbE98AE5

changeset-bot · 2026-05-08T01:54:46Z

🦋 Changeset detected

Latest commit: fe0d991

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 31 packages

Name	Type
@livekit/agents	Patch
@livekit/agents-plugin-anam	Patch
@livekit/agents-plugin-assemblyai	Patch
@livekit/agents-plugin-baseten	Patch
@livekit/agents-plugin-bey	Patch
@livekit/agents-plugin-cartesia	Patch
@livekit/agents-plugin-cerebras	Patch
@livekit/agents-plugin-deepgram	Patch
@livekit/agents-plugin-elevenlabs	Patch
@livekit/agents-plugin-fishaudio	Patch
@livekit/agents-plugin-google	Patch
@livekit/agents-plugin-hedra	Patch
@livekit/agents-plugin-hume	Patch
@livekit/agents-plugin-inworld	Patch
@livekit/agents-plugin-lemonslice	Patch
@livekit/agents-plugin-liveavatar	Patch
@livekit/agents-plugin-livekit	Patch
@livekit/agents-plugin-minimax	Patch
@livekit/agents-plugin-mistral	Patch
@livekit/agents-plugin-mistralai	Patch
@livekit/agents-plugin-neuphonic	Patch
@livekit/agents-plugin-openai	Patch
@livekit/agents-plugin-phonic	Patch
@livekit/agents-plugin-resemble	Patch
@livekit/agents-plugin-rime	Patch
@livekit/agents-plugin-runway	Patch
@livekit/agents-plugin-sarvam	Patch
@livekit/agents-plugin-silero	Patch
@livekit/agents-plugins-test	Patch
@livekit/agents-plugin-trugen	Patch
@livekit/agents-plugin-xai	Patch


CLAassistant · 2026-05-08T01:54:49Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 2 additional findings.

devin-ai-integration Bot reviewed May 8, 2026


toubatbrian closed this May 8, 2026

toubatbrian wants to merge 1 commit into main from claude/quirky-galileo-viXBX
