@livekit/agents-plugin-deepgram double-URL-encodes streaming params #1379

@ubinatus

Describe the bug

Summary

The Deepgram STT plugin calls encodeURIComponent(v) on each query param value before passing it to URLSearchParams.append(). URLSearchParams already percent-encodes its values when the URL is serialized, so the value is encoded twice on the wire. Any param containing characters that encodeURIComponent rewrites (space, &, %, non-ASCII, etc.) is corrupted by the time it reaches Deepgram.

Object.entries(params).forEach(([k, v]) => {
  if (v !== void 0) {
    if (typeof v === "string" || typeof v === "number" || typeof v === "boolean") {
      streamURL.searchParams.append(k, encodeURIComponent(v));        // ← double-encodes
    } else {
      v.forEach((x) => streamURL.searchParams.append(k, encodeURIComponent(x))); // ← double-encodes
    }
  }
});

Where it bites in practice

The most common trigger is keyterm (and keywords), which accept user-supplied free-text vocabulary like "Joe's Plumbing", "AT&T", or "café". Example:

  • Input: keyterm: ["Joe's Plumbing"]
  • After encodeURIComponent: "Joe's%20Plumbing" (the space becomes %20; encodeURIComponent leaves ' untouched)
  • After URLSearchParams serializes the value: wire param keyterm=Joe%27s%2520Plumbing (the % from the first pass is re-encoded as %25)
  • Deepgram decodes that once to the literal string Joe's%20Plumbing, so the keyterm no longer matches the intended phrase and provides no recognition boost.

The same applies to any string param containing characters that encodeURIComponent escapes. Booleans, numbers, and strings made up only of letters, digits, and characters such as - or _ ("nova-3", "en") happen to round-trip cleanly, which is why the bug went unnoticed for plain configs.
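The two encoding passes can be reproduced outside the plugin with the standard URL API alone. The following is a minimal sketch, not the plugin's code: the endpoint and the helper names buildQuery and serverSees are hypothetical, and serverSees only assumes that the receiving side decodes the query string exactly once.

```typescript
// Hypothetical endpoint, used only to have a URL to attach params to.
const BASE = "wss://example.invalid/v1/listen";

// Build the query string for a single param, optionally pre-encoding the
// value with encodeURIComponent the way the current plugin code does.
function buildQuery(key: string, value: string, preEncode: boolean): string {
  const url = new URL(BASE);
  url.searchParams.append(key, preEncode ? encodeURIComponent(value) : value);
  return url.search;
}

// Decode a query string once, as a server parsing the URL would.
function serverSees(query: string, key: string): string | null {
  return new URLSearchParams(query).get(key);
}

const doubled = buildQuery("keyterm", "AT&T", true);  // "?keyterm=AT%2526T"
const single = buildQuery("keyterm", "AT&T", false);  // "?keyterm=AT%26T"

console.log(serverSees(doubled, "keyterm")); // "AT%26T" (corrupted)
console.log(serverSees(single, "keyterm"));  // "AT&T"   (intended value)

// A value with no escapable characters survives both paths unchanged.
console.log(buildQuery("model", "nova-3", true) === buildQuery("model", "nova-3", false)); // true
```

The last line shows why plain configs never exposed the problem: when encodeURIComponent has nothing to rewrite, the second pass has no stray % to re-encode.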

Expected behavior

Each param should be URL-encoded exactly once. URLSearchParams.append(k, v) already does the right thing — the wrapping encodeURIComponent should be removed.

Proposed fix

       Object.entries(params).forEach(([k, v]) => {
         if (v !== void 0) {
           if (typeof v === "string" || typeof v === "number" || typeof v === "boolean") {
-            streamURL.searchParams.append(k, encodeURIComponent(v));
+            streamURL.searchParams.append(k, String(v));
           } else {
-            v.forEach((x) => streamURL.searchParams.append(k, encodeURIComponent(x)));
+            v.forEach((x) => streamURL.searchParams.append(k, String(x)));
           }
         }
       });

(Booleans/numbers need String(...) because URLSearchParams.append types its second arg as string; current code coincidentally stringifies via encodeURIComponent.)
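For reference, the fixed loop can be exercised in isolation. This is a sketch under assumptions: appendParams and ParamValue are hypothetical names, the endpoint is a placeholder, and the param shape (scalars or arrays of scalars, with undefined skipped) is inferred from the snippet above rather than taken from the plugin's type definitions.

```typescript
// Assumed shape of a streaming param value, inferred from the loop in this issue.
type Scalar = string | number | boolean;
type ParamValue = Scalar | Scalar[] | undefined;

// Append every defined param to the URL, letting URLSearchParams do the
// only round of percent-encoding.
function appendParams(url: URL, params: Record<string, ParamValue>): void {
  for (const [key, value] of Object.entries(params)) {
    if (value === undefined) continue;
    const values = Array.isArray(value) ? value : [value];
    for (const item of values) {
      url.searchParams.append(key, String(item));
    }
  }
}

// Hypothetical endpoint, used only for illustration.
const streamURL = new URL("wss://example.invalid/v1/listen");
appendParams(streamURL, {
  model: "nova-3",
  interim_results: true,
  keyterm: ["Joe's Plumbing", "AT&T", "café"],
  unused: undefined,
});

// Decoding once returns the original values.
console.log(streamURL.searchParams.getAll("keyterm"));
```

Note that URLSearchParams serializes a space as + rather than %20; a standard form-urlencoded parser decodes both to a space.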

Relevant log output

No response

Describe your environment

System:
OS: macOS 26.3
CPU: (16) arm64 Apple M4 Max
Memory: 27.75 GB / 128.00 GB
Shell: 5.9 - /bin/zsh
Binaries:
Node: 24.14.1 - /opt/homebrew/opt/node@24/bin/node
npm: 11.11.0 - /opt/homebrew/opt/node@24/bin/npm
pnpm: 10.33.0 - /opt/homebrew/bin/pnpm
bun: 1.2.4 - /Users/juanandrescastro/.bun/bin/bun

Minimal reproducible example

Repro

import * as deepgram from "@livekit/agents-plugin-deepgram";

const stt = new deepgram.STT({
  model: "nova-3",
  language: "en",
  keyterm: ["Joe's Plumbing", "AT&T", "café"],
});
// Capture the WS URL the plugin opens. The keyterm values arrive as
// keyterm=Joe%27s%2520Plumbing&keyterm=AT%2526T&keyterm=caf%25C3%25A9
// instead of the correctly single-encoded form.

Additional information

Workarounds for downstreams (until fix lands)

None clean. Pre-transforming values in user code can't cancel both encoding layers, sanitizing values to ASCII is lossy for the use case, and switching from keyterm to keywords hits the identical code path. Maintaining an npm/pnpm patch is currently the only viable mitigation.
