silero.VAD.load() causes libc++abi mutex abort on process exit (macOS arm64) #1375

@sgzrov

Describe the bug

On macOS arm64 (Apple Silicon), calling silero.VAD.load() and then exiting the Node process aborts with:

libc++abi: terminating due to uncaught exception of type std::__1::system_error: mutex lock failed: Invalid argument

Process exits with status 134 (SIGABRT). RUST_BACKTRACE=full prints nothing — the abort is a pure C++ uncaught exception, no Rust panic frame.
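Status 134 follows the usual shell convention of 128 plus the signal number (SIGABRT is 6 on macOS and Linux). When the repro is launched from a Node parent via `child_process` instead of a shell, Node reports a signal-killed child as `status: null` with `signal: "SIGABRT"` rather than 134. The sketch below maps between the two views; the helper name `shellStatus` is hypothetical and not part of any LiveKit package.

```typescript
import { constants } from "node:os";

// Hypothetical helper (not a LiveKit API): convert the (status, signal) pair
// that Node reports for a finished child into the status a POSIX shell prints.
function shellStatus(status: number | null, signal: string | null): number | null {
  if (status !== null) return status; // normal exit: pass the code through
  if (signal === null) return null; // neither set: the child did not exit
  const signo = (constants.signals as Record<string, number>)[signal];
  if (signo === undefined) return null; // signal unknown on this platform
  return 128 + signo; // shell convention: 128 + signal number
}

// SIGABRT is 6 on macOS and Linux, so this prints 134 there.
console.log(shellStatus(null, "SIGABRT"));
```

This is useful when comparing a shell-observed 134 against what a supervising Node process logs for the same crash.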

The same abort fires in three independent shutdown paths in a real @livekit/agents worker:

  1. Top-level script: silero.VAD.load() + process.exit(0) (no Room, no AudioStream, no rtc-node API used).
  2. Normal session disconnect: worker forks job process for a session, the room is deleted (or participant disconnects), the job process logs "native resources disposed" and "Job process shutdown", then aborts.
  3. Ctrl+C of the worker while a session is active: same trace, abort fires after Job process shutdown.

Ctrl+C of an idle worker (no active session, no VAD.load()) does not abort.

Bisection isolates the trigger to silero.VAD.load(). With all other plugins removed, keeping just vad still crashes; removing the load entirely produces a clean exit; calling load() in prewarm without ever using the result still crashes. A top-level Room.connect() + Room.disconnect() + dispose() + process.exit(0) using only @livekit/rtc-node does not crash on 0.13.27, so the basic rtc-node teardown is fine.

Surprisingly, inline replications that mimic the exact load flow — same imports (@livekit/agents, @livekit/rtc-node, onnxruntime-node), same options, same silero_vad.onnx path, same private fields, same Plugin.registerPlugin side effect — do not crash. The abort only fires when the silero plugin module itself drives the load.

This is the same C++ exception class as livekit/node-sdks#564 (consuming AudioStream without objectMode) and the JS-side dispose-ordering workaround in #1042 (already shipped, doesn't help here), but it is a third, independent trigger.

Reviewed main via source: silero is at 1.3.1 (deps-only bump for @livekit/agents@1.3.1); vad.ts and onnx_model.ts are unchanged from the 1.3.0 dist tested. None of the 1.3.1 patches address shutdown/dispose/mutex. The bug is expected to reproduce on 1.3.1.

Relevant log output

Minimal repro (node --import tsx agent-repro.ts):

libc++abi: terminating due to uncaught exception of type std::__1::system_error: mutex lock failed: Invalid argument

@livekit/agents worker variant, normal-disconnect path (room deleted while a session is active), trimmed around the abort:

[14:25:14.926] INFO  AgentSession closed
    reason: "user_initiated"
    error: null
[14:25:15.294] INFO  Session report uploaded to LiveKit Cloud
[14:25:15.331] DEBUG disconnected from room
    jobID: "AJ_gmwcZZgSA5Lj"
[14:25:15.331] DEBUG native resources disposed
    jobID: "AJ_gmwcZZgSA5Lj"
[14:25:15.331] DEBUG Job process shutdown
    jobID: "AJ_gmwcZZgSA5Lj"
libc++abi: terminating due to uncaught exception of type std::__1::system_error: mutex lock failed: Invalid argument

Ctrl+C-with-active-session path produces an identical final stderr line.

Describe your environment

System:
OS: macOS Darwin Kernel 25.4.0 (arm64, Apple Silicon)
Node: v22.16.0

Packages:
@livekit/rtc-node: 0.13.27 (npm latest at time of test)
@livekit/agents: 1.3.0 (npm latest)
@livekit/agents-plugin-silero: 1.3.0 (npm latest)
onnxruntime-node: 1.21.0 (silero plugin's pinned dep)

Minimal reproducible example

Create an empty directory with these two files:

package.json:

{
  "type": "module",
  "dependencies": {
    "@livekit/agents": "1.3.0",
    "@livekit/agents-plugin-silero": "1.3.0",
    "@livekit/rtc-node": "0.13.27"
  },
  "devDependencies": { "tsx": "^4.19.0" }
}

agent-repro.ts:

import * as silero from "@livekit/agents-plugin-silero";

await silero.VAD.load();
process.exit(0);

Then:

npm install
RUST_BACKTRACE=full node --import tsx agent-repro.ts

Process exits 134 with the libc++abi line on stderr. No Room, no AudioStream, no LiveKit credentials needed. Verified by npm install from a clean empty directory — reproduces 100% of the time on the listed environment.
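To check the reproduction rate on another machine, the repro can be run several times from a small harness. This is a minimal sketch: the helper name `tallyExits` and the `Tally` shape are hypothetical, and the only facts taken from this report are the stderr substring and the SIGABRT / 134 exit.

```typescript
import { spawnSync } from "node:child_process";

// Substring of the libc++abi line quoted in this report.
const ABORT_MARKER = "mutex lock failed";

interface Tally {
  clean: number; // exited with status 0
  aborted: number; // SIGABRT (or 134) with the mutex message on stderr
  other: number; // anything else, including spawn failures
}

// Hypothetical helper: run `cmd args` `runs` times and classify each exit.
function tallyExits(cmd: string, args: string[], runs: number): Tally {
  const tally: Tally = { clean: 0, aborted: 0, other: 0 };
  for (let i = 0; i < runs; i++) {
    const r = spawnSync(cmd, args, { encoding: "utf8" });
    // Node reports a signal-killed child via `signal`; a wrapper shell would report 134.
    const sigabrt = r.signal === "SIGABRT" || r.status === 134;
    if (sigabrt && (r.stderr ?? "").includes(ABORT_MARKER)) tally.aborted++;
    else if (r.status === 0) tally.clean++;
    else tally.other++;
  }
  return tally;
}

// Usage, from the repro directory after `npm install`:
// console.log(tallyExits(process.execPath, ["--import", "tsx", "agent-repro.ts"], 20));
```

On the environment listed above, every run would be expected to land in `aborted`; any count in `clean` would suggest the abort is timing-dependent on that machine.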

Additional information

Bisection ladder (each row run against a real LiveKit Cloud session; "disconnect" = lk dispatch create then lk room delete; "Ctrl+C" = SIGINT to the worker with an active session):

  • Production agent (full plugin stack) — crashes on disconnect, crashes on Ctrl+C
  • defineAgent + AgentSession with VAD + STT + LLM + TTS + EOU — crashes on disconnect, crashes on Ctrl+C
  • …minus turnDetector — crashes on disconnect
  • …minus TTS — crashes on disconnect
  • …minus STT — crashes on disconnect
  • …minus LLM (only vad: vad) — crashes on disconnect
  • AgentSession({}) with no plugins — clean exit
  • AgentSession({}) with no plugins + silero.VAD.load() runs in prewarm — crashes on disconnect
  • import * as silero but never call load() — clean exit
  • defineAgent + entry: ctx.connect() + silero.VAD.load() + disconnect — crashes on disconnect
  • defineAgent + entry: ctx.connect() + ctx.room.disconnect() — clean exit
  • Top-level Room.connect() + disconnect() + dispose() + exit(0) — clean exit

The trigger isolates to silero.VAD.load(): every row that calls it crashes; every row that doesn't is clean.
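The isolation claim can be checked mechanically by encoding the ladder as data and looking for counterexamples. In this sketch the `crashed` column is copied from the rows above, while the `callsLoad` flags for the first six rows are an assumption: they take the VAD passed to the session to come from silero.VAD.load(). The row labels are abbreviated.

```typescript
interface Row {
  label: string;
  callsLoad: boolean; // does the configuration call silero.VAD.load()?
  crashed: boolean; // did the process abort on disconnect?
}

// Rows in the same order as the bisection ladder above.
const ladder: Row[] = [
  { label: "production agent, full plugin stack", callsLoad: true, crashed: true },
  { label: "VAD + STT + LLM + TTS + EOU", callsLoad: true, crashed: true },
  { label: "minus turnDetector", callsLoad: true, crashed: true },
  { label: "minus TTS", callsLoad: true, crashed: true },
  { label: "minus STT", callsLoad: true, crashed: true },
  { label: "minus LLM (only vad)", callsLoad: true, crashed: true },
  { label: "AgentSession({}) with no plugins", callsLoad: false, crashed: false },
  { label: "no plugins + load() in prewarm", callsLoad: true, crashed: true },
  { label: "import silero, never call load()", callsLoad: false, crashed: false },
  { label: "ctx.connect() + load() + disconnect", callsLoad: true, crashed: true },
  { label: "ctx.connect() + ctx.room.disconnect()", callsLoad: false, crashed: false },
  { label: "top-level Room connect/disconnect/dispose", callsLoad: false, crashed: false },
];

// A counterexample is any row where calling load() and crashing disagree.
const counterexamples = ladder.filter((row) => row.callsLoad !== row.crashed);

console.log(`${ladder.length} rows, ${counterexamples.length} counterexamples`);
```

Zero counterexamples means load() is both necessary and sufficient for the abort across the tested configurations, which is what the summary sentence asserts.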

Confirmed not a duplicate of:

Source review of main:

  • silero is at 1.3.1 (deps-only bump). plugins/silero/src/vad.ts and plugins/silero/src/onnx_model.ts are unchanged from the 1.3.0 dist tested (only cosmetic TS-vs-compiled differences).
  • @livekit/agents@1.3.1 patch notes (OTEL trace fix, TTS aligned transcripts, agent-name env var, uuid security update, playbackLatency metric) do not address shutdown/dispose/mutex.
  • No open PR or issue in livekit/agents-js matches this trigger.
  • The AudioResampler.close() fix (fix: explicitly close AudioResampler instances too free up resources #1210) landed in silero 1.2.4, well before 1.3.0.

Bug expected to still reproduce on main once 1.3.1 is published. Happy to test patches.
