feat(amd): feature parity with Python AMD implementation by chenghao-mou · Pull Request #1394 · livekit/agents-js

chenghao-mou · 2026-05-05T13:43:36Z

added SIP code in the example;
added support for separate STT;
added support for participant wait;
added default models
pending: adding AMD remote session event: Version Packages protocol#1523 (review)

Tested with a SIP call.

Ports python livekit/agents#5584 (AMD improvement) into agents-js.

- Expose `humanSpeechThresholdMs`, `humanSilenceThresholdMs`, `machineSilenceThresholdMs`, and `prompt` as `AMDOptions` fields.
- Defer to the LLM (instead of forcing HUMAN) when a transcript is already available after a short greeting.
- Add `postpone_termination` LLM tool (capped at 3 extensions × 10s) alongside `save_prediction`; fall back to JSON-content parsing when the LLM does not emit tool calls.
- Add `participantIdentity` and `suppressCompatibilityWarning` options.
- Warn once when the resolved LLM is not in `EVALUATED_LLM_MODELS`.

Skipped (architectural divergence; see PR description): dedicated AMD STT pipeline, track-subscription wait, and the `start()` / `start_timers()` lifecycle split.
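The cap on `postpone_termination` can be pictured as a small budget object. The following is only a sketch, not the code in this PR: the class name, constant names, and the reading of "3 extensions × 10s" as "at most three grants of at most ten seconds each" are assumptions, and it assumes the requested duration has already been validated as a non-negative number.

```typescript
// Hypothetical constants mirroring the "3 extensions × 10s" cap in the commit message.
const MAX_POSTPONEMENTS = 3;
const MAX_POSTPONE_MS = 10_000;

// Hypothetical helper: tracks how many extensions the LLM has already used.
class PostponeBudget {
  private used = 0;

  /** Returns the granted extension in ms, or 0 once the budget is exhausted. */
  request(requestedMs: number): number {
    if (this.used >= MAX_POSTPONEMENTS) return 0;
    this.used += 1;
    // Never grant more than the per-extension cap.
    return Math.min(requestedMs, MAX_POSTPONE_MS);
  }

  /** Number of extensions that can still be granted. */
  get remaining(): number {
    return MAX_POSTPONEMENTS - this.used;
  }
}
```

With this shape, a fourth request is refused regardless of how short it is, which bounds the total extra wait at 30 seconds.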

- Gate `save_prediction` and `postpone_termination` tool side effects on the current `detectGeneration`. Stale in-flight classifications now no-op instead of mutating timers, budget, or capturing a verdict that belongs to a superseded transcript window.
- Normalize `save_prediction`'s `label` argument through `parseCategory` before storing, so an off-enum value from a misbehaving LLM (or our manual JSON path that bypasses Zod) is treated as UNCERTAIN rather than producing an `AMDResult` with an invalid category string.
- Fix `warnIfNotEvaluated` substring check to also handle date-suffixed model names (e.g. `openai/gpt-4.1-mini-2025-04-14`).
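The label normalization and the date-suffix handling can be illustrated roughly as follows. This is a sketch under assumptions: the category names come from this PR, but their string values, the contents of the allow-list, and the function bodies are hypothetical. The model check here uses a prefix plus `YYYY-MM-DD` suffix match, which is a stricter stand-in for the substring check the commit describes.

```typescript
// Category names appear in the PR; the string values below are assumptions.
const CATEGORIES = [
  'human',
  'machine-ivr',
  'machine-vm',
  'machine-unavailable',
  'uncertain',
] as const;
type Category = (typeof CATEGORIES)[number];

// Sketch of a normalizer: anything that is not a known category becomes 'uncertain'.
function parseCategory(raw: unknown): Category {
  if (typeof raw !== 'string') return 'uncertain';
  const normalized = raw.trim().toLowerCase();
  const match = CATEGORIES.find((c) => c === normalized);
  return match ?? 'uncertain';
}

// Hypothetical allow-list; the real EVALUATED_LLM_MODELS contents are not shown in the PR text.
const EVALUATED_LLM_MODELS = ['openai/gpt-4.1-mini'];

// Accepts an exact match or the same name followed by a -YYYY-MM-DD date suffix.
function isEvaluatedModel(model: string): boolean {
  return EVALUATED_LLM_MODELS.some((known) => {
    if (model === known) return true;
    if (!model.startsWith(known + '-')) return false;
    return /^\d{4}-\d{2}-\d{2}$/.test(model.slice(known.length + 1));
  });
}
```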

Without this, a postpone_termination tool call resolved after aclose() would still see isStale() === false (settled was never flipped) and install a fresh silenceTimer that survives cleanup, eventually firing scheduleLLMClassification + tryEmitResult and potentially triggering session.interrupt on a closed AMD.
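A staleness check that covers both a superseded generation and a closed detector might look like the sketch below. The class and its method names are hypothetical; only the ideas of a generation counter, a `settled` flag, and an `isStale()` guard come from the commit messages above.

```typescript
// Hypothetical state holder; the real AMD class is not reproduced here.
class DetectionState {
  private generation = 0;
  private settled = false;

  /** Called when a new transcript window supersedes in-flight classifications. */
  nextGeneration(): number {
    this.generation += 1;
    return this.generation;
  }

  /** Called from close/cleanup so late tool calls become no-ops. */
  close(): void {
    this.settled = true;
  }

  /** A tool call is stale if the detector is closed or its generation was superseded. */
  isStale(capturedGeneration: number): boolean {
    return this.settled || capturedGeneration !== this.generation;
  }
}

// Records which tool calls were allowed to install a timer (stand-in for real side effects).
const timersInstalled: string[] = [];

function onPostponeTool(state: DetectionState, capturedGeneration: number, label: string): boolean {
  if (state.isStale(capturedGeneration)) return false; // no-op: do not touch timers or budget
  timersInstalled.push(label);
  return true;
}
```

The key point is that the guard runs before any side effect, so a tool call that resolves after cleanup cannot install a timer that outlives the detector.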

Without a lower bound and NaN guard, a misbehaving LLM passing a negative or non-numeric `seconds` argument would compute a clampedMs of NaN or a negative number, which setTimeout treats as 0 and fires immediately. The manual tool-execution path here bypasses the Zod schema, so this defense lives in execute().
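One way to express such a guard is sketched below. The function name, the one-second floor, and the choice to fall back to the floor for non-numeric input are assumptions made for illustration; the commit only states that a lower bound and a NaN guard were added in `execute()`.

```typescript
// Hypothetical bounds: the 10s ceiling matches the cap mentioned in this PR, the 1s floor is assumed.
const POSTPONE_FLOOR_MS = 1_000;
const POSTPONE_CAP_MS = 10_000;

// Converts an untrusted `seconds` argument into a safe timer delay in milliseconds.
function clampPostponeMs(seconds: unknown): number {
  // Reject strings, undefined, NaN and Infinity before doing any arithmetic.
  if (typeof seconds !== 'number' || !Number.isFinite(seconds)) {
    return POSTPONE_FLOOR_MS;
  }
  // Clamp into [floor, cap] so the delay is never negative and never unbounded.
  return Math.min(Math.max(seconds * 1000, POSTPONE_FLOOR_MS), POSTPONE_CAP_MS);
}
```

Because the result is always a finite number within the bounds, it can be passed to `setTimeout` without the timer firing right away on bad input.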

Port of livekit/agents#5637. When a final STT transcript arrives inside the short-speech HUMAN_SILENCE_THRESHOLD window, cancel the pre-baked HUMAN/short_greeting silence timer and replace it with a long_speech timer anchored at speechEndedAt + MACHINE_SILENCE_THRESHOLD_MS so the LLM verdict gets the final word.

https://claude.ai/code/session_017SqU9Zxmo439ZtcdwzKZp9
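The timer swap can be modeled as a pure state transition. This is a sketch only: the `short_greeting` and `long_speech` reason names and the `speechEndedAt + MACHINE_SILENCE_THRESHOLD_MS` anchor come from the commit message, while the threshold value, the `SilenceTimerPlan` type, and both function names are hypothetical.

```typescript
// Hypothetical threshold value; the real default is not stated in the PR text.
const MACHINE_SILENCE_THRESHOLD_MS = 1_500;

interface SilenceTimerPlan {
  reason: 'short_greeting' | 'long_speech';
  fireAt: number; // absolute timestamp in ms
}

// When a final transcript arrives, a pending short_greeting plan is replaced by a
// long_speech plan anchored at the end of speech; other plans are left unchanged.
function onFinalTranscript(
  plan: SilenceTimerPlan,
  speechEndedAt: number,
  thresholdMs: number = MACHINE_SILENCE_THRESHOLD_MS,
): SilenceTimerPlan {
  if (plan.reason !== 'short_greeting') return plan;
  return { reason: 'long_speech', fireAt: speechEndedAt + thresholdMs };
}

// Delay to hand to a timer, never negative even if the anchor is already in the past.
function longSpeechDelayMs(
  speechEndedAt: number,
  now: number,
  thresholdMs: number = MACHINE_SILENCE_THRESHOLD_MS,
): number {
  return Math.max(0, speechEndedAt + thresholdMs - now);
}
```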

changeset-bot · 2026-05-05T13:43:41Z

🦋 Changeset detected

Latest commit: 9a24e2c

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 29 packages

Name	Type
@livekit/agents	Major
@livekit/agents-plugin-anam	Major
@livekit/agents-plugin-assemblyai	Major
@livekit/agents-plugin-baseten	Major
@livekit/agents-plugin-bey	Major
@livekit/agents-plugin-cartesia	Major
@livekit/agents-plugin-cerebras	Major
@livekit/agents-plugin-deepgram	Major
@livekit/agents-plugin-elevenlabs	Major
@livekit/agents-plugin-google	Major
@livekit/agents-plugin-hedra	Major
@livekit/agents-plugin-inworld	Major
@livekit/agents-plugin-lemonslice	Major
@livekit/agents-plugin-liveavatar	Major
@livekit/agents-plugin-livekit	Major
@livekit/agents-plugin-minimax	Major
@livekit/agents-plugin-mistral	Major
@livekit/agents-plugin-mistralai	Major
@livekit/agents-plugin-neuphonic	Major
@livekit/agents-plugin-openai	Major
@livekit/agents-plugin-phonic	Major
@livekit/agents-plugin-resemble	Major
@livekit/agents-plugin-rime	Major
@livekit/agents-plugin-runway	Major
@livekit/agents-plugin-sarvam	Major
@livekit/agents-plugin-silero	Major
@livekit/agents-plugins-test	Major
@livekit/agents-plugin-trugen	Major
@livekit/agents-plugin-xai	Major

devin-ai-integration

Devin Review found 1 new potential issue.

View 14 additional findings in Devin Review.

devin-ai-integration · 2026-05-06T08:59:29Z

+      // Start running AMD before creating the SIP participant to avoid losing
+      // any of the early audio. Same ordering as the python example.
+      if (phoneNumber && outboundTrunkId && participantIdentity) {
+        if (
+          !process.env.LIVEKIT_URL ||
+          !process.env.LIVEKIT_API_KEY ||
+          !process.env.LIVEKIT_API_SECRET
+        ) {
+          throw new Error('outbound dial requires LIVEKIT_URL/API_KEY/API_SECRET');
+        }
+        const roomName = ctx.room.name;
+        if (!roomName) {
+          throw new Error('ctx.room has no name; cannot place outbound call');
+        }

-    if (result.category === voice.AMDCategory.HUMAN) {
-      logger.info({ amd: result }, 'human answered the call, proceeding with normal conversation');
-      return;
-    }
+        const sip = new SipClient(
+          process.env.LIVEKIT_URL,
+          process.env.LIVEKIT_API_KEY,
+          process.env.LIVEKIT_API_SECRET,
+        );

-    if (result.category === voice.AMDCategory.MACHINE_IVR) {
-      logger.info({ amd: result }, 'ivr menu detected, starting navigation');
-      return;
-    }
+        logger.info({ participantIdentity }, 'creating SIP participant');
+        await sip.createSipParticipant(outboundTrunkId, phoneNumber, roomName, {
+          participantIdentity,
+          waitUntilAnswered: true,
+        });

-    if (result.category === voice.AMDCategory.MACHINE_VM) {
-      logger.info({ amd: result }, 'voicemail detected, leaving a message');
-      const speechHandle = session.generateReply({
-        instructions:
-          "You've reached voicemail. Leave a brief message asking the customer to call back.",
-      });
-      await speechHandle.waitForPlayout();
-      session.shutdown({ reason: 'amd:machine-vm' });
-      return;
-    }
+        const participant = await ctx.waitForParticipant(participantIdentity);
+        const subscribedAudioTrackSids: string[] = [];
+        for (const pub of participant.trackPublications.values()) {
+          if (pub.subscribed && pub.kind === TrackKind.KIND_AUDIO && pub.sid) {
+            subscribedAudioTrackSids.push(pub.sid);
+          }
+        }
+        logger.info(
+          {
+            actualIdentity: participant.identity,
+            expectedIdentity: participantIdentity,
+            kind: participant.kind,
+            audioTracksSubscribed: subscribedAudioTrackSids,
+          },
+          'participant joined',
+        );
+      }

-    if (result.category === voice.AMDCategory.MACHINE_UNAVAILABLE) {
-      logger.info({ amd: result }, 'mailbox unavailable, ending call');
-      session.shutdown({ reason: 'amd:machine-unavailable' });
-      return;
+      const result = await detector.execute();


🟡 Example starts AMD detection after SIP participant creation, contradicting the comment and missing early call audio

detector.execute() is called at line 138, AFTER the if block (lines 95-136) that creates the SIP participant and waits for it to join. However, the comment on lines 93-94 explicitly says "Start running AMD before creating the SIP participant to avoid losing any of the early audio. Same ordering as the python example." The AMD constructor on line 88 does not start detection — only execute() registers event handlers and starts the STT pump. For outbound calls, this means the initial call greeting (the exact audio AMD needs to classify) can already be spoken and processed by AudioRecognition before AMD's dedicated STT pump subscribes via subscribeAudioStream(). The correct pattern (matching the Python example) would be to call detector.execute() without await before createSipParticipant, then await the result after.

Prompt for agents

In examples/src/telephony_amd.ts, the AMD detection ordering is wrong for outbound calls. The comment on lines 93-94 says 'Start running AMD before creating the SIP participant to avoid losing any of the early audio' but detector.execute() is called AFTER the SIP participant creation block. The fix: start detector.execute() as a background promise before the SIP participant creation, then await the result afterwards. Something like: const amdPromise = detector.execute(); if (phoneNumber && outboundTrunkId && participantIdentity) { // ... create SIP participant, wait for participant ... } const result = await amdPromise; This matches the Python example pattern where the AMD coroutine is started as a task before the SIP call is placed, ensuring AMD's STT pump and event listeners are active before any call audio arrives.
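The ordering issue can be reproduced with a toy model that has no LiveKit dependency. Everything in this sketch is a stand-in: `FakeRoomAudio`, `FakeDetector`, and `placeCall` are hypothetical, and the only behavior borrowed from the review is that `execute()` (not the constructor) registers the listeners.

```typescript
type FrameListener = (frame: string) => void;

// Stand-in for the room's audio stream: frames are lost if nobody is subscribed.
class FakeRoomAudio {
  private listeners: FrameListener[] = [];
  subscribe(listener: FrameListener): void {
    this.listeners.push(listener);
  }
  emit(frame: string): void {
    for (const listener of this.listeners) listener(frame);
  }
}

// Stand-in for the detector: listeners are registered in execute(), not in the constructor.
class FakeDetector {
  readonly heard: string[] = [];
  private readonly audio: FakeRoomAudio;
  constructor(audio: FakeRoomAudio) {
    this.audio = audio;
  }
  execute(): Promise<string[]> {
    this.audio.subscribe((frame) => this.heard.push(frame));
    return Promise.resolve(this.heard);
  }
}

// Stand-in for placing the call: the callee's greeting arrives as soon as it is answered.
function placeCall(audio: FakeRoomAudio): void {
  audio.emit('hello');
  audio.emit('you have reached');
}

// Ordering flagged by the review: the call is placed first, so the greeting is missed.
function startLate(): FakeDetector {
  const audio = new FakeRoomAudio();
  const detector = new FakeDetector(audio);
  placeCall(audio);
  void detector.execute();
  return detector;
}

// Suggested ordering: start execute() without awaiting, place the call, await afterwards.
function startEarly(): FakeDetector {
  const audio = new FakeRoomAudio();
  const detector = new FakeDetector(audio);
  const amdPromise = detector.execute();
  placeCall(audio);
  void amdPromise; // a real caller would `await amdPromise` at this point
  return detector;
}
```

In the late variant the detector hears nothing, while in the early variant both greeting frames reach it before the result is awaited.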

CLAassistant · 2026-05-06T15:27:16Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ toubatbrian
❌ claude

claude and others added 9 commits May 1, 2026 12:06

Merge branch 'main' into claude/quirky-galileo-51AGi

5bd733f

Update amd.ts

76de2fe

feat(amd): feature parity with Python AMD implementation

d49ffd0

- added SIP code in the example; - added support for separate STT; - added support for participant wait; - added default models

prepare for AMD remote session event

718750e

chenghao-mou requested a review from a team May 5, 2026 13:44

address comments

1b96350

add remote session and protocol bump

18eb974

This was referenced May 6, 2026

fix(amd): avoid negative zero delay #1402

Open

feat(amd): port AMDResult → AMDPredictionEvent + event emission from python #1393

Draft

address comments

f4deb46

devin-ai-integration Bot reviewed May 6, 2026

update branching example

8e1f6ce

chenghao-mou force-pushed the claude/quirky-galileo-B4wih branch from 15c346a to 4027e25 Compare May 6, 2026 15:02

Merge branch 'claude/quirky-galileo-B4wih' into chenghao/feat/amd-sip…

9a24e2c

…-and-stt-support

chenghao-mou merged commit a2c8caa into claude/quirky-galileo-B4wih May 6, 2026
1 check passed

chenghao-mou deleted the chenghao/feat/amd-sip-and-stt-support branch May 6, 2026 15:27

chenghao-mou merged 14 commits into claude/quirky-galileo-B4wih from chenghao/feat/amd-sip-and-stt-support

4 participants
