Skip to content

Commit 6fd3e2c

Browse files
committed
docs: update spec to address PR NVIDIA#1092 feedback - E2E tests must exercise real runAgentInSandbox()
1 parent 20a368c commit 6fd3e2c

2 files changed

Lines changed: 15 additions & 1 deletion

File tree

.specs/telegram-bridge-command-injection-fix/spec.md

Lines changed: 11 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -155,7 +155,9 @@ read -r NVIDIA_API_KEY && export NVIDIA_API_KEY && MSG=$(cat) && exec nemoclaw-s
155155

156156
### Phase 4: Test Coverage
157157

158-
**Goal:** Add unit and integration tests for the security fix.
158+
**Goal:** Add unit and integration tests for the security fix, and fix E2E test to exercise real code paths.
159+
160+
**Background:** PR #1092 review feedback from @cv identified that `test/e2e/test-telegram-injection.sh` uses ad-hoc SSH commands (`MSG=$(cat) && echo ...`) instead of exercising the actual `runAgentInSandbox()` function in `telegram-bridge.js`. This makes the test validate the concept but not the production code path.
159161

160162
**Changes:**
161163

@@ -166,17 +168,24 @@ read -r NVIDIA_API_KEY && export NVIDIA_API_KEY && MSG=$(cat) && exec nemoclaw-s
166168
2. Add integration test that verifies injection payloads are treated as literal text
167169
3. Add test that API key is not visible in process list
168170
4. Add test for temp file cleanup
171+
5. **Update `test/e2e/test-telegram-injection.sh`** to exercise real `runAgentInSandbox()`:
172+
- Create a test harness that imports/invokes the actual function from `telegram-bridge.js`
173+
- Or refactor `runAgentInSandbox()` to be exportable and testable
174+
- Verify the actual stdin-based message passing path, not ad-hoc SSH commands
169175

170176
**Files:**
171177

172178
- `test/telegram-bridge.test.js` (new file)
179+
- `test/e2e/test-telegram-injection.sh` (update to use real code paths)
180+
- `scripts/telegram-bridge.js` (may need minor refactor to export `runAgentInSandbox` for testing)
173181

174182
**Acceptance Criteria:**
175183

176184
- [ ] Unit tests pass for validation functions
177185
- [ ] Integration test confirms `$(...)` in message doesn't execute
178186
- [ ] Test confirms API key not in process arguments
179187
- [ ] Test confirms temp files are cleaned up
188+
- [ ] E2E test exercises actual `runAgentInSandbox()` function, not ad-hoc SSH
180189
- [ ] All existing tests still pass
181190

182191
## Security Considerations
@@ -194,6 +203,7 @@ read -r NVIDIA_API_KEY && export NVIDIA_API_KEY && MSG=$(cat) && exec nemoclaw-s
194203
- **PR #617** (upstream): Bridge framework refactor — if merged first, changes apply to `bridge-core.js` instead
195204
- **PR #699** (upstream): `ALLOWED_CHAT_IDS` warning/opt-in behavior — out of scope for this fix, separate concern
196205
- **PR #897** (upstream): Env var propagation fix in `bin/nemoclaw.js` — separate file, no conflict
206+
- **PR #1092** (upstream): Added E2E tests for telegram-injection; @cv's review noted tests don't exercise real `runAgentInSandbox()` — we address this in Phase 4
197207

198208
## Test Plan
199209

.specs/telegram-bridge-command-injection-fix/validation.md

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -9,6 +9,10 @@ Test Spec: `.specs/telegram-bridge-command-injection-fix/tests.md`
99

1010
**Primary Validation**: Run `test/e2e/test-telegram-injection.sh` via brev-e2e test suite
1111

12+
### PR #1092 Feedback Addressed
13+
14+
Per @cv's review on PR #1092, the original `test-telegram-injection.sh` used ad-hoc SSH commands (`MSG=$(cat) && echo ...`) instead of exercising the actual `runAgentInSandbox()` function. As part of Phase 4, we update the E2E test to invoke the real production code path.
15+
1216
## Validation Strategy
1317

1418
The existing E2E test `test/e2e/test-telegram-injection.sh` provides comprehensive validation of the security fix. This test:

0 commit comments

Comments
 (0)