Skip to content

fix(security): limit auto-pair to one-shot with 180s timeout#690

Closed
ericksoa wants to merge 6 commits intomainfrom
fix/auto-pair-one-shot
Closed

fix(security): limit auto-pair to one-shot with 180s timeout#690
ericksoa wants to merge 6 commits intomainfrom
fix/auto-pair-one-shot

Conversation

@ericksoa
Copy link
Copy Markdown
Contributor

@ericksoa ericksoa commented Mar 23, 2026

Summary

Auto-pair in nemoclaw-start.sh unconditionally approved ALL pending device pairing requests for 10 minutes on every sandbox start. On headless/cloud installs where nobody is at the browser, this is an open window for unauthorized device pairing.

Addresses all three concerns from the PSIRT report:

  1. Open approval window — Reduce timeout from 600s to 180s; after first approval, exit within 5s (one-shot)
  2. No identity validation — Only approve devices with clientId == 'openclaw-control-ui' (the chat UI); reject all other pending requests immediately
  3. Token persistence — Clear all paired devices on sandbox start (openclaw devices clear --yes), revoking stale tokens from previous sessions. Clear remaining pending requests on exit.

Test plan

  • 219/219 tests pass
  • Manual: nemoclaw onboard — verify chat UI auto-pairs within 180s window
  • Manual: verify second device request after first approval is NOT auto-approved
  • Manual: verify non-UI clientId requests are rejected
  • Manual: verify paired devices from previous session are cleared on restart

Summary by CodeRabbit

  • Chores
    • Clears previously persisted device pairings at startup.
    • Polling now validates pending entries and only accepts approvals from the designated UI client; other requests are skipped and explicitly rejected.
    • Reduced approval timeout from 600s to 180s, with a brief extension after an approval is received.
    • On timeout, remaining pending requests are cleaned up and rejected.

@ericksoa ericksoa requested a review from jacobtomlinson March 23, 2026 03:01
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 23, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d591f8a7-5e62-4a19-a336-d27377d257d5

📥 Commits

Reviewing files that changed from the base of the PR and between ccdd7c3 and 37a135c.

📒 Files selected for processing (1)
  • scripts/nemoclaw-start.sh
🚧 Files skipped from review as they are similar to previous changes (1)
  • scripts/nemoclaw-start.sh

📝 Walkthrough

Walkthrough

The embedded Python watcher in scripts/nemoclaw-start.sh shortens the approval timeout (600s→180s), clears persisted device tokens on start, processes pending requests only for openclaw-control-ui, rejects non-UI or malformed entries, and clears remaining pending requests on timeout.

Changes

Cohort / File(s) Summary
Startup watcher script
scripts/nemoclaw-start.sh
Reduced approval polling deadline (600s → 180s); clear persisted device tokens at startup (openclaw devices clear --yes); only process pending entries while APPROVED == 0; ignore non-dict entries and use device.get('requestId'); skip/log and explicitly reject requests with clientIdopenclaw-control-ui via openclaw devices reject <id> --json; extend deadline by 5s after first approval; on timeout reject remaining pendings with openclaw devices clear --pending --yes.

Sequence Diagram(s)

sequenceDiagram
    participant Watcher as Watcher (start_auto_pair)
    participant CLI as openclaw CLI
    participant DeviceSvc as Devices Service

    Watcher->>CLI: clear persisted tokens (`devices clear --yes`)
    loop Poll until APPROVED >= 1 or DEADLINE
        Watcher->>CLI: list pending (`devices list --pending --json`)
        CLI->>DeviceSvc: request pending list
        DeviceSvc-->>CLI: pending list (array)
        CLI-->>Watcher: return pending list
        alt entry is dict and clientId == "openclaw-control-ui"
            Watcher->>CLI: accept (`devices accept <requestId> --json`)
            CLI->>DeviceSvc: accept request
            DeviceSvc-->>CLI: success
            CLI-->>Watcher: success (increment APPROVED)
            Watcher->>Watcher: extend deadline by 5s (once)
        else entry is dict and clientId != "openclaw-control-ui"
            Watcher->>CLI: reject (`devices reject <requestId> --json`)
            CLI->>DeviceSvc: reject request
            DeviceSvc-->>CLI: success
            CLI-->>Watcher: logged, APPROVED unchanged
        else non-dict entry
            Watcher-->>Watcher: ignore entry
        end
    end
    alt timeout reached and APPROVED == 0
        Watcher->>CLI: clear pending (`devices clear --pending --yes`)
        CLI->>DeviceSvc: clear pending requests
    end
Loading

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

"I nibbled old tokens, cleared the floor,
Waited for the UI to knock my door.
Others got bounced with a polite reject,
A tiny deadline stretch, one approval checked.
Hooray — fresh hops and a cleaner core!" — 🐇

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main security-focused changes: reducing the auto-pair timeout to 180s and implementing one-shot behavior after first approval.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/auto-pair-one-shot

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/nemoclaw-start.sh`:
- Around line 109-110: APPROVED being cumulative lets subsequent polls reset
DEADLINE; change the logic so DEADLINE is set only once when approval first
occurs — e.g., detect the transition (previous_APPROVED == 0 and APPROVED >= 1)
or check a sentinel like DEADLINE is None/0 before assigning DEADLINE =
time.time() + 5; update the branch around APPROVED and DEADLINE to use that
one-shot guard so later polls cannot re-extend the deadline.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 479fe10e-eabb-4ce8-a9f1-4d01477eed3b

📥 Commits

Reviewing files that changed from the base of the PR and between 055e334 and 4352f8d.

📒 Files selected for processing (1)
  • scripts/nemoclaw-start.sh

…ken cleanup

Auto-pair unconditionally approved ALL pending device pairing requests
for 10 minutes on every sandbox start. On headless/cloud installs this
is an open window for unauthorized device pairing.

Changes:
- Clear all paired devices on sandbox start (revokes stale tokens from
  previous sessions — addresses token persistence concern)
- Reduce initial timeout from 600s to 180s
- Only approve devices with clientId 'openclaw-control-ui' (the chat UI);
  reject all other pending requests immediately
- After first approval, reset deadline to now+5s (one-shot exit)
- Clear remaining pending requests on exit
- has_browser convergence logic unchanged
@ericksoa ericksoa force-pushed the fix/auto-pair-one-shot branch from 4352f8d to 1ef8f17 Compare March 23, 2026 03:15
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
scripts/nemoclaw-start.sh (1)

120-121: ⚠️ Potential issue | 🔴 Critical

One-shot is still bypassable due cumulative deadline reset.

Line 120 checks cumulative APPROVED, so any later pending poll re-extends Line 121 to now + 5. Also, the current for device in pending loop can approve multiple UI requests in a single poll before the deadline shrinks. This still violates one-shot behavior.

🔧 Proposed fix (set deadline once, approve only first request)
 APPROVED = 0
+ONE_SHOT_DONE = False

 while time.time() < DEADLINE:
@@
     if pending:
         QUIET_POLLS = 0
+        if ONE_SHOT_DONE:
+            time.sleep(1)
+            continue
         for device in pending:
             if not isinstance(device, dict):
                 continue
             request_id = device.get('requestId')
             if not request_id:
                 continue
             client_id = device.get('clientId', '')
             if client_id != 'openclaw-control-ui':
                 print(f'[auto-pair] skipping non-UI device request={request_id} clientId={client_id!r}')
                 run('openclaw', 'devices', 'reject', request_id, '--json')
                 continue
             arc, aout, aerr = run('openclaw', 'devices', 'approve', request_id, '--json')
             if arc == 0:
                 APPROVED += 1
+                ONE_SHOT_DONE = True
+                DEADLINE = time.time() + 5
                 print(f'[auto-pair] approved request={request_id} clientId={client_id}')
+                break
             elif aout or aerr:
                 print(f'[auto-pair] approve failed request={request_id}: {(aerr or aout)[:400]}')
-        if APPROVED >= 1:
-            DEADLINE = time.time() + 5
         time.sleep(1)
         continue
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/nemoclaw-start.sh` around lines 120 - 121, The current logic resets
DEADLINE on every poll when APPROVED >= 1 and may approve multiple devices in
the "for device in pending" loop, breaking one-shot semantics; change it so
DEADLINE is set only once (e.g., only assign DEADLINE = time.time() + 5 when
DEADLINE is unset/zero) and ensure only the first pending request is approved
per poll by approving the first device and breaking out of the "for device in
pending" loop (or otherwise preventing further approvals) so that APPROVED
cannot be incremented again during the same deadline window.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@scripts/nemoclaw-start.sh`:
- Around line 120-121: The current logic resets DEADLINE on every poll when
APPROVED >= 1 and may approve multiple devices in the "for device in pending"
loop, breaking one-shot semantics; change it so DEADLINE is set only once (e.g.,
only assign DEADLINE = time.time() + 5 when DEADLINE is unset/zero) and ensure
only the first pending request is approved per poll by approving the first
device and breaking out of the "for device in pending" loop (or otherwise
preventing further approvals) so that APPROVED cannot be incremented again
during the same deadline window.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 38fc1241-af32-4fa4-8462-4eeab29f2dca

📥 Commits

Reviewing files that changed from the base of the PR and between 4352f8d and 1ef8f17.

📒 Files selected for processing (1)
  • scripts/nemoclaw-start.sh

@ericksoa ericksoa self-assigned this Mar 23, 2026
…ken cleanup

Auto-pair unconditionally approved ALL pending device pairing requests
for 10 minutes on every sandbox start. On headless/cloud installs this
is an open window for unauthorized device pairing.

Changes:
- Clear all paired devices on sandbox start (revokes stale tokens from
  previous sessions — addresses token persistence concern)
- Reduce initial timeout from 600s to 180s
- Only approve devices with clientId 'openclaw-control-ui' (the chat UI);
  reject all other pending requests immediately
- After first approval, reset deadline to now+5s (one-shot exit)
- Clear remaining pending requests on exit
- has_browser convergence logic unchanged
@cv cv force-pushed the fix/auto-pair-one-shot branch from ede8043 to ccdd7c3 Compare March 23, 2026 19:36
@wscurran wscurran added security Something isn't secure priority: high Important issue that should be resolved in the next release labels Mar 23, 2026
Copy link
Copy Markdown
Contributor

@cv cv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clientId is client-supplied and spoofable

The timeout reduction (600s→180s), one-shot exit, and token cleanup on start/exit shrink the attack window. However, the clientId validation — framed as one of the three PSIRT mitigations — does not hold up.

The gateway's device pairing flow:

  1. Client opens a WebSocket and sends a connect frame with client.id set to any value from the protocol's union type
  2. The gateway stores connectParams.client.id verbatim into the pending pairing request
  3. openclaw devices list --json returns that value as-is

There is no server-side validation that the connecting client is the control UI. Any WebSocket client that can reach the gateway can send "openclaw-control-ui" as its client.id and pass this check. The union type in frames.d.ts is a compile-time constraint — it doesn't enforce anything at the protocol level.

The check on line 111 filters out accidental pairing from other legitimate tools (CLI, probes, etc.) but provides no resistance against the intentional attacker scenario described in the PSIRT report.

Recommendations

  1. Merge the timeout, one-shot, and cleanup changes as-is
  2. Reframe the clientId check in the PR description and code comment as a convenience filter / defense-in-depth, not a security boundary. The PR description currently says "Only approve devices with clientId == 'openclaw-control-ui'; reject all other pending requests immediately" — this overstates what the check achieves
  3. If the PSIRT report requires actual identity validation, that needs to happen in the openclaw gateway itself (e.g., origin validation, signed client assertions, or server-side client registration) — it can't be solved in this script
  4. The for device in pending loop can approve multiple UI requests in a single poll before APPROVED is checked — add a break after the first successful approval to tighten one-shot semantics

Resolve conflicts in scripts/nemoclaw-start.sh: preserve the PR's
security fixes (180s timeout, one-shot auto-pair, client ID validation,
device clearing) while incorporating main's OPENCLAW resolved-path
variable for PATH hijacking prevention.
Resolve conflicts: keep OPENCLAW resolved-path variable from main
for PATH hijacking prevention in all openclaw CLI calls.
jyaunches added a commit to jyaunches/NemoClaw that referenced this pull request Apr 1, 2026
- allowInsecureAuth: use explicit http allowlist (parsed.scheme == 'http')
  instead of https denylist (parsed.scheme != 'https') so malformed or
  unknown schemes default to secure (CodeRabbit finding on NVIDIA#123)
- Add HANDLED set to track rejected/approved requestIds so rejected
  devices are not reprocessed every poll cycle (CodeRabbit finding on NVIDIA#123)
- Document that clientId/clientMode are client-supplied and spoofable;
  the allowlist is defense-in-depth, not a trust boundary (@cv review on NVIDIA#123)
- Reference PR NVIDIA#690 for the more comprehensive auto-pair fix (one-shot
  exit, timeout reduction, token cleanup)
jyaunches added a commit that referenced this pull request Apr 1, 2026
- allowInsecureAuth: use explicit http allowlist (parsed.scheme == 'http')
  instead of https denylist (parsed.scheme != 'https') so malformed or
  unknown schemes default to secure (CodeRabbit finding on #123)
- Add HANDLED set to track rejected/approved requestIds so rejected
  devices are not reprocessed every poll cycle (CodeRabbit finding on #123)
- Document that clientId/clientMode are client-supplied and spoofable;
  the allowlist is defense-in-depth, not a trust boundary (@cv review on #123)
- Reference PR #690 for the more comprehensive auto-pair fix (one-shot
  exit, timeout reduction, token cleanup)
jyaunches added a commit to jyaunches/NemoClaw that referenced this pull request Apr 1, 2026
- allowInsecureAuth: use explicit http allowlist (parsed.scheme == 'http')
  instead of https denylist (parsed.scheme != 'https') so malformed or
  unknown schemes default to secure (CodeRabbit finding on NVIDIA#123)
- Add HANDLED set to track rejected/approved requestIds so rejected
  devices are not reprocessed every poll cycle (CodeRabbit finding on NVIDIA#123)
- Document that clientId/clientMode are client-supplied and spoofable;
  the allowlist is defense-in-depth, not a trust boundary (@cv review on NVIDIA#123)
- Reference PR NVIDIA#690 for the more comprehensive auto-pair fix (one-shot
  exit, timeout reduction, token cleanup)
@jyaunches
Copy link
Copy Markdown
Contributor

Hey @ericksoa 👋

We just merged PR #1217 (harden gateway auth defaults and restrict auto-pair) and are doing post-merge cleanup on related PRs/issues. Your PR #690 came up as overlapping.

#1217 picked up one of your five changes — the client allowlisting (though implemented slightly differently with an ALLOWED_CLIENTS set + ALLOWED_MODES). But three of your other changes were not incorporated and are still potentially valuable:

  • Timeout reduction (600s → 180s) — the merged code still has a 10-minute polling window
  • One-shot exit — after first approval, exit within 5s instead of continuing to poll
  • Token lifecycle cleanup — clear persisted tokens on start + clear pending requests on timeout

These all seem like meaningful hardening on top of what landed. Do you still want to pursue some version of #690 as a follow-up, or are you comfortable closing it out? If there's interest, the simplest path would probably be a fresh PR against the current start_auto_pair() since the function has changed a fair bit since your branch.

@ahunnargikar-nvidia ahunnargikar-nvidia assigned cjagwani and unassigned cv and ericksoa Apr 7, 2026
@cjagwani
Copy link
Copy Markdown
Contributor

cjagwani commented Apr 7, 2026

Closing this in favor of a fresh PR against current main, the start_auto_pair() function has changed since this branch and cv's review feedback needs to be addressed against the new code.

The valuable ideas from this PR (timeout reduction, one-shot exit, token cleanup on start/timeout) will be carried forward. Thank you @ericksoa for the work here.

Superseded by a new PR addressing #1310.

@cjagwani cjagwani closed this Apr 7, 2026
cjagwani added a commit that referenced this pull request Apr 7, 2026
…and token cleanup

Reduce the auto-pair polling window from 600s to 180s, exit after the
first successful approval (one-shot with 5s grace period), clear stale
device tokens on startup, explicitly reject non-allowlisted clients,
and clear remaining pending requests on timeout.

Incorporates ideas from PR #690 with cv's review feedback addressed:
the clientId check comment now correctly frames it as defense-in-depth
rather than a security boundary, and the one-shot logic uses a
break-after-first-approval pattern that prevents deadline re-extension.

Supersedes #690
Closes #1310

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

priority: high Important issue that should be resolved in the next release security Something isn't secure

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants