Skip to content

fix(security): harden gateway auth defaults and restrict auto-pair#1217

Merged
ericksoa merged 4 commits intoNVIDIA:mainfrom
jyaunches:fix/harden-gateway-auth-defaults
Apr 1, 2026
Merged

fix(security): harden gateway auth defaults and restrict auto-pair#1217
ericksoa merged 4 commits intoNVIDIA:mainfrom
jyaunches:fix/harden-gateway-auth-defaults

Conversation

@jyaunches
Copy link
Copy Markdown
Contributor

@jyaunches jyaunches commented Apr 1, 2026

Summary

Fixes #117 — Insecure authentication configuration in gateway setup.

Supersedes PR #123, which targeted the correct security issues but patched code paths that have since moved. Recommend closing #123 in favor of this PR.


Background

NemoClaw runs an AI assistant inside a locked-down sandbox. The gateway is the front door to that sandbox, and the Control UI is a web dashboard users open in their browser to interact with the assistant.

There were three security problems with that front door:

Problem 1: Device auth was permanently disabled 🔓

The config had dangerouslyDisableDeviceAuth: True hardcoded. This meant any device could connect to the gateway without proving its identity. It was added as a temporary workaround (commit 5fb629f) and never reverted. While the sandbox network isolation mitigates this today, it becomes dangerous if combined with LAN-bind changes or cloudflared tunnels in remote deployments — resulting in an unauthenticated, publicly reachable dashboard.

Fix: Default to false (device auth enabled). Developers who genuinely need it disabled for headless/dev environments can set NEMOCLAW_DISABLE_DEVICE_AUTH=1 as a build arg.

Problem 2: Insecure auth was always allowed, even over HTTPS 🌐

allowInsecureAuth: True was hardcoded, meaning insecure (non-HTTPS) authentication was permitted regardless of deployment context. For local development on http://127.0.0.1 this is fine, but for production/remote deployments over https:// it defeats the purpose of TLS.

Fix: Derive it from the CHAT_UI_URL scheme. If the URL is http:// → allow insecure auth (local dev). If https:// → block it (production).

Problem 3: The auto-pair bouncer let everyone in 🚪

The start_auto_pair watcher automatically approved every pending device pairing request with zero validation. A rogue or unexpected client type could pair with the gateway unchallenged.

Fix: Only approve devices with recognized clientId (openclaw-control-ui) or clientMode (webchat). Unknown clients are rejected and logged.


Why this PR exists (instead of merging #123)

PR #123 (opened Mar 17) correctly identified all three issues and proposed the right fixes. However, since then, commit 0f8eedd ("make openclaw.json immutable at runtime, move config to build time") moved the gateway controlUi configuration from scripts/nemoclaw-start.sh into the Dockerfile. The config is now baked at image build time and verified via SHA-256 hash at startup.

PR #123's changes to the start script's Python config block would patch dead code — that block no longer exists. This PR applies the same security logic where the code actually lives now:

Issue PR #123 patched This PR patches
dangerouslyDisableDeviceAuth scripts/nemoclaw-start.sh (removed) Dockerfile (build-time config)
allowInsecureAuth scripts/nemoclaw-start.sh (removed) Dockerfile (build-time config)
Auto-pair whitelisting scripts/nemoclaw-start.sh scripts/nemoclaw-start.sh

Changes

Dockerfile (+8/-2)

  • dangerouslyDisableDeviceAuth → derived from NEMOCLAW_DISABLE_DEVICE_AUTH env var (default 0 = secure)
  • allowInsecureAuth → derived from CHAT_UI_URL scheme (false when https://)
  • New ARG NEMOCLAW_DISABLE_DEVICE_AUTH=0 with ENV promotion (follows existing C-2 safe pattern — no ARG interpolation into Python source)

scripts/nemoclaw-start.sh (+12/-4)

  • Auto-pair only approves ALLOWED_CLIENTS = {'openclaw-control-ui'} and ALLOWED_MODES = {'webchat'}
  • Validates isinstance(device, dict) before field access
  • Rejects unknown clients with [auto-pair] rejected unknown client=... mode=... log
  • Logs client= on approval for audit trail
  • Documents NEMOCLAW_DISABLE_DEVICE_AUTH in script header

test/security-c2-dockerfile-injection.test.js (+6 tests)

  • No hardcoded True for dangerouslyDisableDeviceAuth
  • No hardcoded True for allowInsecureAuth
  • Env var derivation for both settings
  • ARG defaults to 0
  • ENV promotion before RUN layer

test/nemoclaw-start.test.js (+7 tests)

  • Whitelist definitions (ALLOWED_CLIENTS, ALLOWED_MODES)
  • Rejection logic for unknown clients
  • Dict type validation
  • Client identity in approval logs
  • No unconditional approval pattern
  • NEMOCLAW_DISABLE_DEVICE_AUTH documented

Why no Brev E2E test

These changes are deterministic config values and simple conditional logic, not runtime behavior with environmental ambiguity:

  1. Dockerfile config — Two Python booleans evaluated at build time: os.environ.get(...) == '1' and parsed.scheme != 'https'. If the source is correct, the output is correct.
  2. Auto-pair whitelisting — Python set membership check. No shell interpretation, no SSH pipeline, no quoting edge cases.

This is fundamentally different from something like the telegram bridge injection fix, where you must send real payloads through a real shell on a real sandbox to prove metacharacters don't get interpreted. Our changes have no gap between "looks right in source" and "works right at runtime." The 13 source-pattern tests provide sufficient confidence.

If deeper verification is desired in the future, a build-time unit test that runs the Dockerfile's Python snippet and asserts the JSON output would be the appropriate next step — a regular CI test, not a Brev instance.


Test plan

  • All 752 tests pass (739 existing + 13 new)
  • shellcheck
  • hadolint
  • shfmt / prettier
  • gitleaks
  • Default onboarding: gateway starts with device auth enabled, Control UI pairing works via auto-pair
  • NEMOCLAW_DISABLE_DEVICE_AUTH=1 build: device auth is disabled (backward-compatible escape hatch)
  • CHAT_UI_URL=https://... build: allowInsecureAuth is false
  • CHAT_UI_URL=http://127.0.0.1:18789 (default): allowInsecureAuth is true
  • Unknown device type connecting to gateway: auto-pair rejects and logs
  • openclaw-control-ui / webchat devices: auto-pair approves as before

Summary by CodeRabbit

  • Security
    • Device authentication can be toggled at build time; UI insecure-auth is allowed only when the control UI URL uses http.
  • Bug Fixes
    • Auto-pair processing now validates inputs, rejects unknown client/mode, logs identifiers, and deduplicates requests to avoid repeated approvals/rejections.
  • Tests
    • Added tests for auto-pair validation, logging, de-duplication, and configurable auth behavior.
  • Chores
    • Sandbox/setup flow now enforces the build-time device-auth setting for images.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 1, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Build-time input NEMOCLAW_DISABLE_DEVICE_AUTH now controls gateway auth flags instead of hardcoded insecure defaults. The auto-pair watcher validates pending entries, enforces client/mode allowlists, deduplicates request handling, and logs request/client identifiers. Tests verify Dockerfile config generation and auto-pair defensive behavior.

Changes

Cohort / File(s) Summary
Build / Dockerfile
Dockerfile
Adds ARG NEMOCLAW_DISABLE_DEVICE_AUTH=0 and ENV NEMOCLAW_DISABLE_DEVICE_AUTH=${NEMOCLAW_DISABLE_DEVICE_AUTH}; Python build-time generator uses os.environ.get(...) to derive dangerouslyDisableDeviceAuth from this env and computes allowInsecureAuth from CHAT_UI_URL scheme (True only for http).
Onboard / setup tooling
bin/lib/onboard.js, scripts/setup.sh
Patching logic now forces staged Dockerfile ARG NEMOCLAW_DISABLE_DEVICE_AUTH=1 for sandbox/onboard builds and scripts/setup.sh replaces that ARG to 1 before image build.
Auto-pair watcher (embedded Python)
scripts/nemoclaw-start.sh
Auto-pair loop now skips non-dict pendings, requires requestId, extracts clientId/clientMode, enforces ALLOWED_CLIENTS/ALLOWED_MODES, deduplicates via HANDLED = set() and logs approvals/rejections including request/client identifiers; header docs updated to note NEMOCLAW_DISABLE_DEVICE_AUTH is build-time only.
Tests
test/nemoclaw-start.test.js, test/security-c2-dockerfile-injection.test.js
Adds Vitest assertions for allowlists (openclaw-control-ui, webchat), defensive dict access, rejection logging, approval logging containing both request_id and client_id, HANDLED de-duplication, promotion of NEMOCLAW_DISABLE_DEVICE_AUTH (ARG default 0 then ENV), and that allowInsecureAuth/dangerouslyDisableDeviceAuth are computed rather than hardcoded.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Device as Device / Client
    participant AutoPair as Auto-pair Watcher (Python)
    participant Gateway as Gateway API
    Device->>AutoPair: submit pending approval (device metadata)
    AutoPair->>AutoPair: validate type (dict), extract requestId/clientId/clientMode
    AutoPair->>AutoPair: check ALLOWED_CLIENTS / ALLOWED_MODES and HANDLED set
    alt allowed and not handled
        AutoPair->>Gateway: POST approve requestId
        AutoPair->>AutoPair: HANDLED.add(requestId)
        AutoPair-->>Device: log approval (requestId, clientId)
    else disallowed, invalid, or already handled
        AutoPair->>Gateway: POST reject requestId (if applicable)
        AutoPair-->>Device: log rejection (requestId, clientId/mode)
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I nudge a build arg, whisper "off" or "on",
I skip the pesky repeats until the night is gone.
I peek each pending, check client and mode,
I mark HANDLED, hop off down the road.
Tests applaud — the gateway hums along.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title 'fix(security): harden gateway auth defaults and restrict auto-pair' accurately summarizes the primary security-focused changes across multiple files (Dockerfile, scripts, tests).
Linked Issues check ✅ Passed The PR successfully addresses Issue #117 by removing hardcoded insecure auth defaults, deriving allowInsecureAuth from CHAT_UI_URL scheme, disabling dangerouslyDisableDeviceAuth by default via ARG, and restricting auto-pair to known clients/modes with validation.
Out of Scope Changes check ✅ Passed All changes are directly scoped to Issue #117 requirements: Dockerfile auth config hardening, auto-pair whitelist/validation, and corresponding test coverage; no unrelated modifications detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
test/security-c2-dockerfile-injection.test.js (1)

340-366: Consider extracting ENV-block parsing into a shared helper.

The ENV block parsing logic (lines 344-358) is nearly identical to the pattern used for CHAT_UI_URL (lines 186-210) and NEMOCLAW_MODEL (lines 246-264). Extracting this into a helper function would reduce duplication and make future additions easier:

function isEnvPromotedBeforePythonRun(dockerfile, envVarName) {
  const lines = dockerfile.split("\n");
  let promoted = false;
  let inEnvBlock = false;
  for (const line of lines) {
    if (/^\s*FROM\b/.test(line)) { promoted = false; inEnvBlock = false; }
    if (/^\s*ENV\b/.test(line)) inEnvBlock = true;
    if (inEnvBlock && new RegExp(`${envVarName}[=\\s]`).test(line)) promoted = true;
    if (inEnvBlock && !/\\\s*$/.test(line)) inEnvBlock = false;
    if (/^\s*RUN\b.*python3\s+-c\b/.test(line)) return promoted;
  }
  return promoted;
}

This is optional — the current approach is clear and self-contained.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/security-c2-dockerfile-injection.test.js` around lines 340 - 366,
Extract the repeated ENV-block parsing into a shared helper (e.g.
isEnvPromotedBeforePythonRun) and replace the duplicated loops in the three
tests (the block using promoted/inEnvBlock and the RUN regex check) with calls
to that helper; the helper should accept the Dockerfile string and env var name,
scan lines resetting state on /^\\s*FROM\\b/, toggling inEnvBlock on
/^\\s*ENV\\b/, setting promoted when inEnvBlock and new
RegExp(`${envVarName}[=\\s]`) matches, closing the ENV block when a line does
not end with /\\\s*$/, and returning promoted when it encounters the RUN python3
-c pattern (/^\\s*RUN\\b.*python3\\s+-c\\b/) — then update the three tests (the
current test plus the CHAT_UI_URL and NEMOCLAW_MODEL checks) to call
isEnvPromotedBeforePythonRun(DOCKERFILE_CONTENT, "VAR_NAME") and assert the
returned boolean.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@test/security-c2-dockerfile-injection.test.js`:
- Around line 340-366: Extract the repeated ENV-block parsing into a shared
helper (e.g. isEnvPromotedBeforePythonRun) and replace the duplicated loops in
the three tests (the block using promoted/inEnvBlock and the RUN regex check)
with calls to that helper; the helper should accept the Dockerfile string and
env var name, scan lines resetting state on /^\\s*FROM\\b/, toggling inEnvBlock
on /^\\s*ENV\\b/, setting promoted when inEnvBlock and new
RegExp(`${envVarName}[=\\s]`) matches, closing the ENV block when a line does
not end with /\\\s*$/, and returning promoted when it encounters the RUN python3
-c pattern (/^\\s*RUN\\b.*python3\\s+-c\\b/) — then update the three tests (the
current test plus the CHAT_UI_URL and NEMOCLAW_MODEL checks) to call
isEnvPromotedBeforePythonRun(DOCKERFILE_CONTENT, "VAR_NAME") and assert the
returned boolean.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 85e21d8f-7e88-4559-9b2c-6bc34a42aecd

📥 Commits

Reviewing files that changed from the base of the PR and between c63d37b and 1f89564.

📒 Files selected for processing (4)
  • Dockerfile
  • scripts/nemoclaw-start.sh
  • test/nemoclaw-start.test.js
  • test/security-c2-dockerfile-injection.test.js

@jyaunches jyaunches requested a review from cv April 1, 2026 01:35
@jyaunches
Copy link
Copy Markdown
Contributor Author

Note: Overlap with #1149 and #690

Two open PRs touch the same code paths:

Neither is a blocker for this PR, but flagging for merge coordination.

Copy link
Copy Markdown
Contributor

@ericksoa ericksoa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regression Risk Analysis

The security fix is correct — flipping to secure-by-default is the right call. But the default flip is a breaking change that needs coordination before merge.

High Risk: Default onboarding flow breaks

dangerouslyDisableDeviceAuth goes from always True to False by default (NEMOCLAW_DISABLE_DEVICE_AUTH=0). Every image built without explicitly passing --build-arg NEMOCLAW_DISABLE_DEVICE_AUTH=1 will now require device pairing before the dashboard is accessible.

The current nemoclaw onboard flow assumes the dashboard is immediately accessible via SSH port-forward — that assumption breaks. Impact:

  • All fresh installs that don't set the build arg hit the pairing wall
  • Existing deployments that rebuild their image silently change behavior
  • Brev E2E and any CI/CD that builds images will need updating

Questions:

  • Does bin/lib/onboard.js pass NEMOCLAW_DISABLE_DEVICE_AUTH=1 when building images during onboard? If not, first-run breaks.
  • Does scripts/brev-setup.sh set this build arg? If not, E2E will fail.
  • Should this be announced as a breaking change in release notes?

Medium Risk: Auto-pair client allowlist is narrow

ALLOWED_CLIENTS = {'openclaw-control-ui'} and ALLOWED_MODES = {'webchat'} will reject any client that doesn't send these exact values. The PR correctly notes these are "client-supplied and spoofable" (defense-in-depth, not a trust boundary), but legitimate clients that don't set clientId/clientMode will fail silently. Future OpenClaw clients or custom dashboards would need to be added to the allowlist.

Low Risk: allowInsecureAuth scheme derivation

Default CHAT_UI_URL is http://127.0.0.1:18789 so allow_insecure stays True for default deployments. Anyone with a custom https:// CHAT_UI_URL gets allowInsecureAuth=False, which is correct behavior.

What's solid

Tests are comprehensive, HANDLED set prevents reprocessing, isinstance(device, dict) guard is strictly safer, build arg is documented in the script header.

@jyaunches
Copy link
Copy Markdown
Contributor Author

Findings on @ericksoa's Regression Questions

Investigated both questions — both concerns are confirmed:

1. bin/lib/onboard.js does NOT pass NEMOCLAW_DISABLE_DEVICE_AUTH=1

The patchStagedDockerfile() function (line ~988) patches several ARG values before building — NEMOCLAW_MODEL, NEMOCLAW_PROVIDER_KEY, CHAT_UI_URL, NEMOCLAW_INFERENCE_BASE_URL, etc. — but there is no replacement for ARG NEMOCLAW_DISABLE_DEVICE_AUTH. It stays at the default 0, so fresh onboard builds flip to secure mode and hit the pairing wall. First-run breaks.

2. scripts/brev-setup.sh does NOT set this build arg

Neither brev-setup.sh nor setup.sh reference NEMOCLAW_DISABLE_DEVICE_AUTH at all. brev-setup.sh installs Docker/NVIDIA toolkit and hands off to setup.sh — neither passes the build arg. E2E will fail.

Proposed Fix

Both paths need to explicitly opt into the old behavior during their automated flows:

  • bin/lib/onboard.js: Add a line in patchStagedDockerfile() to set NEMOCLAW_DISABLE_DEVICE_AUTH=1 (since onboard expects immediate dashboard access without pairing)
  • scripts/brev-setup.sh or scripts/setup.sh: Pass --build-arg NEMOCLAW_DISABLE_DEVICE_AUTH=1 for the Brev E2E path

@ericksoa flagging for your review — will push the fix shortly.

jyaunches added a commit that referenced this pull request Apr 1, 2026
…v flows

The secure-by-default flip (NEMOCLAW_DISABLE_DEVICE_AUTH=0) broke two
automated paths that expect immediate dashboard access without device
pairing:

- bin/lib/onboard.js: patchStagedDockerfile() now sets the ARG to 1
- scripts/setup.sh: sed-patches the staged Dockerfile before build

Addresses review feedback from @ericksoa on #1217.
@jyaunches jyaunches force-pushed the fix/harden-gateway-auth-defaults branch from 8409990 to fcb0fae Compare April 1, 2026 19:12
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/setup.sh`:
- Around line 236-238: The sed -i call that rewrites ARG
NEMOCLAW_DISABLE_DEVICE_AUTH in the Dockerfile is not portable to macOS; update
the script to use a portable inplace-edit approach (either choose perl -pi -e or
branch on uname to call sed -i '' on Darwin and sed -i on Linux). Specifically
replace the existing sed invocation that targets "$BUILD_CTX/Dockerfile" (the
line that changes ARG NEMOCLAW_DISABLE_DEVICE_AUTH) with a cross-platform block
like: if [[ "$(uname -s)" == "Darwin" ]]; then sed -i '' 's/^ARG
NEMOCLAW_DISABLE_DEVICE_AUTH=.*/ARG NEMOCLAW_DISABLE_DEVICE_AUTH=1/'
"$BUILD_CTX/Dockerfile"; else sed -i 's/^ARG NEMOCLAW_DISABLE_DEVICE_AUTH=.*/ARG
NEMOCLAW_DISABLE_DEVICE_AUTH=1/' "$BUILD_CTX/Dockerfile"; fi so the change works
on both macOS and Linux.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4abfa8f6-b3e5-4c32-b1d6-8388b33cd405

📥 Commits

Reviewing files that changed from the base of the PR and between 1ff2d0d and fcb0fae.

📒 Files selected for processing (6)
  • Dockerfile
  • bin/lib/onboard.js
  • scripts/nemoclaw-start.sh
  • scripts/setup.sh
  • test/nemoclaw-start.test.js
  • test/security-c2-dockerfile-injection.test.js
🚧 Files skipped from review as they are similar to previous changes (2)
  • test/security-c2-dockerfile-injection.test.js
  • test/nemoclaw-start.test.js

scripts/setup.sh Outdated
Comment on lines +236 to +238
# Brev/setup flow expects immediate dashboard access without device pairing,
# so disable device auth for images built via setup.sh (see #1217).
sed -i 's/^ARG NEMOCLAW_DISABLE_DEVICE_AUTH=.*/ARG NEMOCLAW_DISABLE_DEVICE_AUTH=1/' "$BUILD_CTX/Dockerfile"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check for macOS-specific logic in setup.sh
rg -n "Darwin|macOS|brew|Colima" scripts/setup.sh

Repository: NVIDIA/NemoClaw

Length of output: 669


🏁 Script executed:

# Check for other sed -i usages in the file
rg -n "sed\s+-i" scripts/setup.sh

Repository: NVIDIA/NemoClaw

Length of output: 170


🏁 Script executed:

# Get the exact content around lines 236-238 and surrounding context
sed -n '230,245p' scripts/setup.sh

Repository: NVIDIA/NemoClaw

Length of output: 920


🏁 Script executed:

# Check if .shellcheckrc has any sed-related rules
cat .shellcheckrc 2>/dev/null || echo "No .shellcheckrc found"

Repository: NVIDIA/NemoClaw

Length of output: 478


Potential portability issue: sed -i without suffix may fail on macOS.

The sed -i command at line 238 behaves differently between GNU sed (Linux) and BSD sed (macOS):

  • GNU sed: sed -i 's/...' works
  • BSD sed: sed -i 's/...' fails; requires sed -i '' 's/...'

Since this script explicitly supports macOS (Darwin check at line 166-167, brew installation, and documented Colima/Docker Desktop support), consider using a portable pattern:

🛠️ Portable sed -i replacement
-sed -i 's/^ARG NEMOCLAW_DISABLE_DEVICE_AUTH=.*/ARG NEMOCLAW_DISABLE_DEVICE_AUTH=1/' "$BUILD_CTX/Dockerfile"
+sed 's/^ARG NEMOCLAW_DISABLE_DEVICE_AUTH=.*/ARG NEMOCLAW_DISABLE_DEVICE_AUTH=1/' "$BUILD_CTX/Dockerfile" > "$BUILD_CTX/Dockerfile.tmp" && mv "$BUILD_CTX/Dockerfile.tmp" "$BUILD_CTX/Dockerfile"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Brev/setup flow expects immediate dashboard access without device pairing,
# so disable device auth for images built via setup.sh (see #1217).
sed -i 's/^ARG NEMOCLAW_DISABLE_DEVICE_AUTH=.*/ARG NEMOCLAW_DISABLE_DEVICE_AUTH=1/' "$BUILD_CTX/Dockerfile"
# Brev/setup flow expects immediate dashboard access without device pairing,
# so disable device auth for images built via setup.sh (see `#1217`).
sed 's/^ARG NEMOCLAW_DISABLE_DEVICE_AUTH=.*/ARG NEMOCLAW_DISABLE_DEVICE_AUTH=1/' "$BUILD_CTX/Dockerfile" > "$BUILD_CTX/Dockerfile.tmp" && mv "$BUILD_CTX/Dockerfile.tmp" "$BUILD_CTX/Dockerfile"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/setup.sh` around lines 236 - 238, The sed -i call that rewrites ARG
NEMOCLAW_DISABLE_DEVICE_AUTH in the Dockerfile is not portable to macOS; update
the script to use a portable inplace-edit approach (either choose perl -pi -e or
branch on uname to call sed -i '' on Darwin and sed -i on Linux). Specifically
replace the existing sed invocation that targets "$BUILD_CTX/Dockerfile" (the
line that changes ARG NEMOCLAW_DISABLE_DEVICE_AUTH) with a cross-platform block
like: if [[ "$(uname -s)" == "Darwin" ]]; then sed -i '' 's/^ARG
NEMOCLAW_DISABLE_DEVICE_AUTH=.*/ARG NEMOCLAW_DISABLE_DEVICE_AUTH=1/'
"$BUILD_CTX/Dockerfile"; else sed -i 's/^ARG NEMOCLAW_DISABLE_DEVICE_AUTH=.*/ARG
NEMOCLAW_DISABLE_DEVICE_AUTH=1/' "$BUILD_CTX/Dockerfile"; fi so the change works
on both macOS and Linux.

Copy link
Copy Markdown
Contributor

@ericksoa ericksoa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good — secure-by-default with the onboard/setup escape hatch is the right approach. Approving.

…VIDIA#117)

Fixes NVIDIA#117 — supersedes PR NVIDIA#123 which targeted stale code paths.

Dockerfile: secure-by-default controlUi config
- dangerouslyDisableDeviceAuth now defaults to false (was hardcoded True)
- allowInsecureAuth derived from CHAT_UI_URL scheme (false for https)
- New build-arg NEMOCLAW_DISABLE_DEVICE_AUTH=0 as opt-in escape hatch
- Env var promoted safely via ENV (no ARG interpolation into Python)

nemoclaw-start.sh: auto-pair client whitelisting
- Only approve devices with known clientId (openclaw-control-ui) or
  clientMode (webchat); reject and log unknown clients
- Validate device is a dict before accessing fields
- Log client identity on approval for audit trail

Tests (13 new)
- Dockerfile: no hardcoded True for auth settings, env var derivation,
  ARG default, ENV promotion before RUN layer
- Start script: whitelist definitions, rejection logic, dict validation,
  client identity logging, no unconditional approval pattern
- allowInsecureAuth: use explicit http allowlist (parsed.scheme == 'http')
  instead of https denylist (parsed.scheme != 'https') so malformed or
  unknown schemes default to secure (CodeRabbit finding on NVIDIA#123)
- Add HANDLED set to track rejected/approved requestIds so rejected
  devices are not reprocessed every poll cycle (CodeRabbit finding on NVIDIA#123)
- Document that clientId/clientMode are client-supplied and spoofable;
  the allowlist is defense-in-depth, not a trust boundary (@cv review on NVIDIA#123)
- Reference PR NVIDIA#690 for the more comprehensive auto-pair fix (one-shot
  exit, timeout reduction, token cleanup)
- Move ALLOWED_CLIENTS/ALLOWED_MODES above the while loop — these are
  constants, not per-iteration state. Avoids reconstructing sets every
  poll cycle.
- Document NEMOCLAW_DISABLE_DEVICE_AUTH as build-time only in the script
  header. Setting it at runtime has no effect because openclaw.json is
  baked at image build and verified by hash at startup.
- Add tests: constants defined before loop, header mentions build-time.
…v flows

The secure-by-default flip (NEMOCLAW_DISABLE_DEVICE_AUTH=0) broke two
automated paths that expect immediate dashboard access without device
pairing:

- bin/lib/onboard.js: patchStagedDockerfile() now sets the ARG to 1
- scripts/setup.sh: sed-patches the staged Dockerfile before build

Addresses review feedback from @ericksoa on NVIDIA#1217.
@jyaunches jyaunches force-pushed the fix/harden-gateway-auth-defaults branch from 966847d to d25a13a Compare April 1, 2026 20:47
@ericksoa ericksoa merged commit 2804b74 into NVIDIA:main Apr 1, 2026
4 checks passed
@wscurran wscurran added bug Something isn't working security Something isn't secure priority: high Important issue that should be resolved in the next release NemoClaw CLI Use this label to identify issues with the NemoClaw command-line interface (CLI). labels Apr 1, 2026
laitingsheng pushed a commit that referenced this pull request Apr 2, 2026
…1217)

## Summary

Fixes #117 — Insecure authentication configuration in gateway setup.

> **Supersedes PR #123**, which targeted the correct security issues but
patched code paths that have since moved. Recommend closing #123 in
favor of this PR.

---

## Background 

NemoClaw runs an AI assistant inside a locked-down sandbox. The
**gateway** is the front door to that sandbox, and the **Control UI** is
a web dashboard users open in their browser to interact with the
assistant.

There were three security problems with that front door:

### Problem 1: Device auth was permanently disabled 🔓

The config had `dangerouslyDisableDeviceAuth: True` hardcoded. This
meant any device could connect to the gateway without proving its
identity. It was added as a temporary workaround (commit `5fb629f`) and
never reverted. While the sandbox network isolation mitigates this
today, it becomes dangerous if combined with LAN-bind changes or
cloudflared tunnels in remote deployments — resulting in an
unauthenticated, publicly reachable dashboard.

**Fix:** Default to `false` (device auth enabled). Developers who
genuinely need it disabled for headless/dev environments can set
`NEMOCLAW_DISABLE_DEVICE_AUTH=1` as a build arg.

### Problem 2: Insecure auth was always allowed, even over HTTPS 🌐

`allowInsecureAuth: True` was hardcoded, meaning insecure (non-HTTPS)
authentication was permitted regardless of deployment context. For local
development on `http://127.0.0.1` this is fine, but for
production/remote deployments over `https://` it defeats the purpose of
TLS.

**Fix:** Derive it from the `CHAT_UI_URL` scheme. If the URL is
`http://` → allow insecure auth (local dev). If `https://` → block it
(production).

### Problem 3: The auto-pair bouncer let everyone in 🚪

The `start_auto_pair` watcher automatically approved *every* pending
device pairing request with zero validation. A rogue or unexpected
client type could pair with the gateway unchallenged.

**Fix:** Only approve devices with recognized `clientId`
(`openclaw-control-ui`) or `clientMode` (`webchat`). Unknown clients are
rejected and logged.

---

## Why this PR exists (instead of merging #123)

PR #123 (opened Mar 17) correctly identified all three issues and
proposed the right fixes. However, since then, commit `0f8eedd` ("make
openclaw.json immutable at runtime, move config to build time") moved
the gateway `controlUi` configuration from `scripts/nemoclaw-start.sh`
into the **Dockerfile**. The config is now baked at image build time and
verified via SHA-256 hash at startup.

PR #123's changes to the start script's Python config block would patch
dead code — that block no longer exists. This PR applies the same
security logic where the code actually lives now:

| Issue | PR #123 patched | This PR patches |
|-------|----------------|----------------|
| `dangerouslyDisableDeviceAuth` | `scripts/nemoclaw-start.sh` (removed)
| `Dockerfile` (build-time config) |
| `allowInsecureAuth` | `scripts/nemoclaw-start.sh` (removed) |
`Dockerfile` (build-time config) |
| Auto-pair whitelisting | `scripts/nemoclaw-start.sh` ✅ |
`scripts/nemoclaw-start.sh` ✅ |

---

## Changes

### `Dockerfile` (+8/-2)
- `dangerouslyDisableDeviceAuth` → derived from
`NEMOCLAW_DISABLE_DEVICE_AUTH` env var (default `0` = secure)
- `allowInsecureAuth` → derived from `CHAT_UI_URL` scheme (`false` when
`https://`)
- New `ARG NEMOCLAW_DISABLE_DEVICE_AUTH=0` with `ENV` promotion (follows
existing C-2 safe pattern — no ARG interpolation into Python source)

### `scripts/nemoclaw-start.sh` (+12/-4)
- Auto-pair only approves `ALLOWED_CLIENTS = {'openclaw-control-ui'}`
and `ALLOWED_MODES = {'webchat'}`
- Validates `isinstance(device, dict)` before field access
- Rejects unknown clients with `[auto-pair] rejected unknown client=...
mode=...` log
- Logs `client=` on approval for audit trail
- Documents `NEMOCLAW_DISABLE_DEVICE_AUTH` in script header

### `test/security-c2-dockerfile-injection.test.js` (+6 tests)
- No hardcoded `True` for `dangerouslyDisableDeviceAuth`
- No hardcoded `True` for `allowInsecureAuth`
- Env var derivation for both settings
- `ARG` defaults to `0`
- `ENV` promotion before `RUN` layer

### `test/nemoclaw-start.test.js` (+7 tests)
- Whitelist definitions (`ALLOWED_CLIENTS`, `ALLOWED_MODES`)
- Rejection logic for unknown clients
- Dict type validation
- Client identity in approval logs
- No unconditional approval pattern
- `NEMOCLAW_DISABLE_DEVICE_AUTH` documented

---

## Why no Brev E2E test

These changes are **deterministic config values and simple conditional
logic**, not runtime behavior with environmental ambiguity:

1. **Dockerfile config** — Two Python booleans evaluated at build time:
`os.environ.get(...) == '1'` and `parsed.scheme != 'https'`. If the
source is correct, the output is correct.
2. **Auto-pair whitelisting** — Python set membership check. No shell
interpretation, no SSH pipeline, no quoting edge cases.

This is fundamentally different from something like the telegram bridge
injection fix, where you *must* send real payloads through a real shell
on a real sandbox to prove metacharacters don't get interpreted. Our
changes have no gap between "looks right in source" and "works right at
runtime." The 13 source-pattern tests provide sufficient confidence.

If deeper verification is desired in the future, a **build-time unit
test** that runs the Dockerfile's Python snippet and asserts the JSON
output would be the appropriate next step — a regular CI test, not a
Brev instance.

---

## Test plan

- [x] All 752 tests pass (739 existing + 13 new)
- [x] `shellcheck` ✅
- [x] `hadolint` ✅
- [x] `shfmt` / `prettier` ✅
- [x] `gitleaks` ✅
- [ ] Default onboarding: gateway starts with device auth enabled,
Control UI pairing works via auto-pair
- [ ] `NEMOCLAW_DISABLE_DEVICE_AUTH=1` build: device auth is disabled
(backward-compatible escape hatch)
- [ ] `CHAT_UI_URL=https://...` build: `allowInsecureAuth` is `false`
- [ ] `CHAT_UI_URL=http://127.0.0.1:18789` (default):
`allowInsecureAuth` is `true`
- [ ] Unknown device type connecting to gateway: auto-pair rejects and
logs
- [ ] `openclaw-control-ui` / `webchat` devices: auto-pair approves as
before

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Security**
* Device authentication can be toggled at build time; UI insecure-auth
is allowed only when the control UI URL uses http.
* **Bug Fixes**
* Auto-pair processing now validates inputs, rejects unknown
client/mode, logs identifiers, and deduplicates requests to avoid
repeated approvals/rejections.
* **Tests**
* Added tests for auto-pair validation, logging, de-duplication, and
configurable auth behavior.
* **Chores**
* Sandbox/setup flow now enforces the build-time device-auth setting for
images.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
jyaunches added a commit to jyaunches/NemoClaw that referenced this pull request Apr 2, 2026
Cherry-picked k8s hardening from @13ernkastel's PR NVIDIA#1149, which was
closed after its auth changes were superseded by NVIDIA#1217.

Changes to k8s/nemoclaw-k8s.yaml:
- automountServiceAccountToken: false
- enableServiceLinks: false
- workspace container: allowPrivilegeEscalation false, drop ALL caps,
  RuntimeDefault seccomp
- COMPATIBLE_API_KEY from optional Secret with dummy fallback
- NEMOCLAW_POLICY_MODE default changed from skip to suggested
- Replace curl|bash with download-then-execute pattern

Also adds k8s/README.md updates and regression test.

Co-authored-by: 13ernkastel <LennonCMJ@live.com>
quanticsoul4772 added a commit to quanticsoul4772/NemoClaw that referenced this pull request Apr 4, 2026
…VIDIA#1217)

Backport from upstream NVIDIA/NemoClaw:
- Dockerfile: dangerouslyDisableDeviceAuth defaults to false (secure);
  escape hatch via NEMOCLAW_DISABLE_DEVICE_AUTH=1 build arg
- Dockerfile: allowInsecureAuth derived from CHAT_UI_URL scheme
  (true for http://, false for https://)
- nemoclaw-start.sh: auto-pair only approves openclaw-control-ui/webchat;
  unknown clients rejected with HANDLED dedup set to avoid reprocessing
- onboard.js: patchStagedDockerfile sets NEMOCLAW_DISABLE_DEVICE_AUTH=1
  for onboard images (immediate dashboard access without device pairing)
- 13 new source-pattern tests covering all three security fixes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
lakamsani pushed a commit to lakamsani/NemoClaw that referenced this pull request Apr 4, 2026
…VIDIA#1217)

## Summary

Fixes NVIDIA#117 — Insecure authentication configuration in gateway setup.

> **Supersedes PR NVIDIA#123**, which targeted the correct security issues but
patched code paths that have since moved. Recommend closing NVIDIA#123 in
favor of this PR.

---

## Background 

NemoClaw runs an AI assistant inside a locked-down sandbox. The
**gateway** is the front door to that sandbox, and the **Control UI** is
a web dashboard users open in their browser to interact with the
assistant.

There were three security problems with that front door:

### Problem 1: Device auth was permanently disabled 🔓

The config had `dangerouslyDisableDeviceAuth: True` hardcoded. This
meant any device could connect to the gateway without proving its
identity. It was added as a temporary workaround (commit `5fb629f`) and
never reverted. While the sandbox network isolation mitigates this
today, it becomes dangerous if combined with LAN-bind changes or
cloudflared tunnels in remote deployments — resulting in an
unauthenticated, publicly reachable dashboard.

**Fix:** Default to `false` (device auth enabled). Developers who
genuinely need it disabled for headless/dev environments can set
`NEMOCLAW_DISABLE_DEVICE_AUTH=1` as a build arg.

### Problem 2: Insecure auth was always allowed, even over HTTPS 🌐

`allowInsecureAuth: True` was hardcoded, meaning insecure (non-HTTPS)
authentication was permitted regardless of deployment context. For local
development on `http://127.0.0.1` this is fine, but for
production/remote deployments over `https://` it defeats the purpose of
TLS.

**Fix:** Derive it from the `CHAT_UI_URL` scheme. If the URL is
`http://` → allow insecure auth (local dev). If `https://` → block it
(production).

### Problem 3: The auto-pair bouncer let everyone in 🚪

The `start_auto_pair` watcher automatically approved *every* pending
device pairing request with zero validation. A rogue or unexpected
client type could pair with the gateway unchallenged.

**Fix:** Only approve devices with recognized `clientId`
(`openclaw-control-ui`) or `clientMode` (`webchat`). Unknown clients are
rejected and logged.

---

## Why this PR exists (instead of merging NVIDIA#123)

PR NVIDIA#123 (opened Mar 17) correctly identified all three issues and
proposed the right fixes. However, since then, commit `0f8eedd` ("make
openclaw.json immutable at runtime, move config to build time") moved
the gateway `controlUi` configuration from `scripts/nemoclaw-start.sh`
into the **Dockerfile**. The config is now baked at image build time and
verified via SHA-256 hash at startup.

PR NVIDIA#123's changes to the start script's Python config block would patch
dead code — that block no longer exists. This PR applies the same
security logic where the code actually lives now:

| Issue | PR NVIDIA#123 patched | This PR patches |
|-------|----------------|----------------|
| `dangerouslyDisableDeviceAuth` | `scripts/nemoclaw-start.sh` (removed)
| `Dockerfile` (build-time config) |
| `allowInsecureAuth` | `scripts/nemoclaw-start.sh` (removed) |
`Dockerfile` (build-time config) |
| Auto-pair whitelisting | `scripts/nemoclaw-start.sh` ✅ |
`scripts/nemoclaw-start.sh` ✅ |

---

## Changes

### `Dockerfile` (+8/-2)
- `dangerouslyDisableDeviceAuth` → derived from
`NEMOCLAW_DISABLE_DEVICE_AUTH` env var (default `0` = secure)
- `allowInsecureAuth` → derived from `CHAT_UI_URL` scheme (`false` when
`https://`)
- New `ARG NEMOCLAW_DISABLE_DEVICE_AUTH=0` with `ENV` promotion (follows
existing C-2 safe pattern — no ARG interpolation into Python source)

### `scripts/nemoclaw-start.sh` (+12/-4)
- Auto-pair only approves `ALLOWED_CLIENTS = {'openclaw-control-ui'}`
and `ALLOWED_MODES = {'webchat'}`
- Validates `isinstance(device, dict)` before field access
- Rejects unknown clients with `[auto-pair] rejected unknown client=...
mode=...` log
- Logs `client=` on approval for audit trail
- Documents `NEMOCLAW_DISABLE_DEVICE_AUTH` in script header

### `test/security-c2-dockerfile-injection.test.js` (+6 tests)
- No hardcoded `True` for `dangerouslyDisableDeviceAuth`
- No hardcoded `True` for `allowInsecureAuth`
- Env var derivation for both settings
- `ARG` defaults to `0`
- `ENV` promotion before `RUN` layer

### `test/nemoclaw-start.test.js` (+7 tests)
- Whitelist definitions (`ALLOWED_CLIENTS`, `ALLOWED_MODES`)
- Rejection logic for unknown clients
- Dict type validation
- Client identity in approval logs
- No unconditional approval pattern
- `NEMOCLAW_DISABLE_DEVICE_AUTH` documented

---

## Why no Brev E2E test

These changes are **deterministic config values and simple conditional
logic**, not runtime behavior with environmental ambiguity:

1. **Dockerfile config** — Two Python booleans evaluated at build time:
`os.environ.get(...) == '1'` and `parsed.scheme != 'https'`. If the
source is correct, the output is correct.
2. **Auto-pair whitelisting** — Python set membership check. No shell
interpretation, no SSH pipeline, no quoting edge cases.

This is fundamentally different from something like the telegram bridge
injection fix, where you *must* send real payloads through a real shell
on a real sandbox to prove metacharacters don't get interpreted. Our
changes have no gap between "looks right in source" and "works right at
runtime." The 13 source-pattern tests provide sufficient confidence.

If deeper verification is desired in the future, a **build-time unit
test** that runs the Dockerfile's Python snippet and asserts the JSON
output would be the appropriate next step — a regular CI test, not a
Brev instance.

---

## Test plan

- [x] All 752 tests pass (739 existing + 13 new)
- [x] `shellcheck` ✅
- [x] `hadolint` ✅
- [x] `shfmt` / `prettier` ✅
- [x] `gitleaks` ✅
- [ ] Default onboarding: gateway starts with device auth enabled,
Control UI pairing works via auto-pair
- [ ] `NEMOCLAW_DISABLE_DEVICE_AUTH=1` build: device auth is disabled
(backward-compatible escape hatch)
- [ ] `CHAT_UI_URL=https://...` build: `allowInsecureAuth` is `false`
- [ ] `CHAT_UI_URL=http://127.0.0.1:18789` (default):
`allowInsecureAuth` is `true`
- [ ] Unknown device type connecting to gateway: auto-pair rejects and
logs
- [ ] `openclaw-control-ui` / `webchat` devices: auto-pair approves as
before

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Security**
* Device authentication can be toggled at build time; UI insecure-auth
is allowed only when the control UI URL uses http.
* **Bug Fixes**
* Auto-pair processing now validates inputs, rejects unknown
client/mode, logs identifiers, and deduplicates requests to avoid
repeated approvals/rejections.
* **Tests**
* Added tests for auto-pair validation, logging, de-duplication, and
configurable auth behavior.
* **Chores**
* Sandbox/setup flow now enforces the build-time device-auth setting for
images.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working NemoClaw CLI Use this label to identify issues with the NemoClaw command-line interface (CLI). priority: high Important issue that should be resolved in the next release security Something isn't secure

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Security] Authentication Configuration

4 participants