Fix VM instance sharing across tasks by hjc-puro · Pull Request #6 · NousResearch/hermes-agent

hjc-puro · 2025-11-03T22:43:07Z

Isolates each VM to a task ID
Guarantees VMs will live for at most 20 minutes
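
A minimal sketch of the two guarantees above, assuming an in-memory registry keyed by task ID. `VMRecord`, `VMRegistry`, and the injectable `clock` are hypothetical names for illustration; the PR's actual code is not shown on this page.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List

MAX_VM_LIFETIME_SECONDS = 20 * 60  # the 20-minute cap stated in the PR description


@dataclass
class VMRecord:
    """Hypothetical stand-in for a VM handle (the real project wraps a remote instance)."""
    task_id: str
    created_at: float
    stopped: bool = False

    def stop(self) -> None:
        self.stopped = True


class VMRegistry:
    """Gives each task ID its own VM and stops VMs that reach the lifetime cap."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._vms: Dict[str, VMRecord] = {}

    def get_or_create(self, task_id: str) -> VMRecord:
        # Expire first so an over-age VM is never handed back to a caller.
        self.expire_old()
        vm = self._vms.get(task_id)
        if vm is None:
            vm = VMRecord(task_id=task_id, created_at=self._clock())
            self._vms[task_id] = vm
        return vm

    def cleanup(self, task_id: str) -> bool:
        """Stop and forget the VM for one task; returns False if none existed."""
        vm = self._vms.pop(task_id, None)
        if vm is None:
            return False
        vm.stop()
        return True

    def expire_old(self) -> List[str]:
        """Stop every VM whose age has reached the cap; returns the expired task IDs."""
        now = self._clock()
        expired = [
            task_id
            for task_id, vm in self._vms.items()
            if now - vm.created_at >= MAX_VM_LIFETIME_SECONDS
        ]
        for task_id in expired:
            self.cleanup(task_id)
        return expired
```

Because lookups go through the task ID, two tasks can never receive the same record, and a task that asks again after the cap gets a fresh VM.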

hjc-puro · 2025-11-04T08:37:20Z

+
+        # Clean up VM for this task after conversation completes
+        try:
+            cleanup_vm(effective_task_id)


@teknium1 this part is a bit hacky. I can take it out if you're OK with instances running for ~5 minutes after the conversation ends.
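
For reference, one way the eager cleanup in the excerpt could be structured. The diff above is truncated after the `try:`, so the error handling here is an assumption, and `run_with_vm_cleanup` and `run_conversation` are hypothetical names; only `cleanup_vm` appears in the diff.

```python
import logging

logger = logging.getLogger(__name__)


def run_with_vm_cleanup(task_id, run_conversation, cleanup_vm):
    """Run a conversation, then release the task's VM even if the run raised.

    Both callables are passed in to keep the sketch self-contained;
    their signatures are assumptions.
    """
    try:
        return run_conversation(task_id)
    finally:
        try:
            cleanup_vm(task_id)
        except Exception as exc:
            # Assumed policy: a failed cleanup is logged and never masks
            # the conversation result; a lifetime cap can reap the VM later.
            logger.warning("VM cleanup failed for task %s: %s", task_id, exc)
```

Dropping the eager call would leave the VM to whatever idle or lifetime timeout is configured, which is the trade-off raised in the comment.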

Phase4.1 smart suggestions

#3 Cost Analytics Dashboard
- New Analytics tab with summary cards (total tokens, cost, avg/mission, today, week)
- CSS bar charts: cost by agent, cost by model, daily timeline (7d)
- No external chart libraries, pure Tailwind

#4 Export Mission as Markdown
- Download .md file with full mission report (goal, team, transcript, artifacts)
- Copy to clipboard button with visual feedback
- Wired into Mission Detail Overlay

#5 Word-by-word Streaming in Agent Chat
- Replaced polling with SSE EventSource in AgentChatPanel
- Real-time chunk streaming with fallback to polling on error
- Streaming assistant message updates in-place

#6 Remote Agents Panel
- Fetches external sessions from gateway /api/sessions
- Filters out local agent sessions; shows only remote/external
- Auto-polls every 15s, card layout with status, model, tokens, cost
- Open Chat links to ClawSuite chat tab

#7 Real-time Collaboration (Presence)
- BroadcastChannel-based cross-tab presence detection
- Shows colored avatars of other users viewing Agent Hub
- Heartbeat every 5s, stale cleanup at 30s
- Shows which tab each peer is viewing

…t, stop/undo honesty, json_error crash, codex validation, deep-link race

Bug #1: ChatPage loadSession reads res.items (not res.transcript) to match backend
Bug NousResearch#2: Add GET /api/gui/session-search backed by SessionDB.search_messages (FTS5)
Bug NousResearch#3: Stop button now checks res.supported before claiming run was stopped
Bug NousResearch#4: Undo button now checks res.supported before removing messages locally
Bug NousResearch#5: Fix _json_error positional calls in handle_chat_compress (was crashing 500)
Bug NousResearch#6: Codex provider validation now also guards switching TO openai-codex
Bug NousResearch#7: Deep-link hash check runs before health callback to prevent race condition

Aecroo · 2026-04-12T10:23:47Z

The fix has been merged into main and verified. host.get now correctly uses selectParentTemplates to retrieve templates for hosts.

- connection.py: cap header read at 8KB to prevent DoS from malicious handler
- handler.py: use .find() instead of `in` + .index() to eliminate race in patch
- handler.py: add truncated field to execute response when output exceeds 50KB
- server.py: include error data field in formatted error messages
- test: add timeout to test client recv, handle TimeoutExpired in close

Fixes issues NousResearch#1, NousResearch#4, NousResearch#5, NousResearch#6, NousResearch#8, NousResearch#10 from Qwen 3.5 peer review on PR NousResearch#19.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
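
The first two items can be illustrated with a short sketch. It assumes a header block terminated by a blank line (`\r\n\r\n`), which the commit message does not specify, and `read_headers` and `patch_once` are hypothetical helpers rather than the functions in connection.py or handler.py.

```python
import io

MAX_HEADER_BYTES = 8 * 1024  # the 8KB cap named in the commit message


def read_headers(stream, limit=MAX_HEADER_BYTES):
    """Read one header block, refusing to buffer more than `limit` bytes."""
    buf = bytearray()
    while not buf.endswith(b"\r\n\r\n"):
        if len(buf) >= limit:
            raise ValueError("header block exceeds limit")
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("stream closed before headers ended")
        buf += chunk
    return bytes(buf)


def patch_once(text, old, new):
    """Replace the first occurrence of `old` using a single lookup.

    One .find() call replaces the `in` check followed by .index(), so the
    membership test and the position can never disagree.
    """
    idx = text.find(old)
    if idx == -1:
        return None
    return text[:idx] + new + text[idx + len(old):]
```

Without the cap, a peer that never sends the terminator could make the reader buffer without bound.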

AC NousResearch#6 acceptance criteria: verify that a capped task preserves state and resumes/continues without Marco re-prompting.

20 new tests covering:
- Artifact work handle extraction (BIF card + chat_id + user_id)
- Verification state (unfinished, not done/success claim)
- Last completed step and next action fields
- Auto-continue decision logic (depth gate, completed/interrupted guards, platform scoping, emergency mode still triggers, non-standard exit reasons)
- User-facing handoff text ("continuing automatically", no success claim, with/without auto-continue variants)
- Synthetic continuation text (system instruction, task card ref, fresh budget, empty summary handling)
- Gateway integration: artifact write + disk persistence + synthetic internal MessageEvent queuing
- Restart survivability (reload artifact from persisted JSON)
- Regression: turn_exit_reason alias miss, interrupted/failed caps

Match the other sandbox backends' per-init filesystem isolation.

Docker stamps a fresh 'hermes-<uuid>' container name on every _init (docker.py:508), so a destroyed-then-recreated env always sees a brand-new filesystem. Gondolin's sandbox_dir is deterministic from task_id, and _setup_overlay_mounts keeps the scratch dir (overlays/<safe>/{upper,work,merged}) on disk across env lifecycles. The next env that mounts the same guest_path under the same sandbox_dir inherits the prior session's writes via the persisted upper layer, which is a real cross-session contamination bug, not just a disk leak.

Fix: _teardown_overlay_mounts now rmtrees the per-mount scratch dir (merged.parent) after the lazy unmount returns. Lazy unmount + open-fd-keeps-inode-alive means this is safe even if the daemon hasn't fully released handles. Crash recovery still preserves upper/ because the import-time sweep only unmounts and never rmtrees. This also closes design-doc revisit item NousResearch#9 (failed-init cleanup).

Test: tests/integration/test_gondolin_terminal.py::test_overlay_writes_do_not_leak_between_env_lifecycles

A KVM-gated integration test that asserts the behavioural invariant via the public GondolinEnvironment.execute() API: env1 writes a file into an overlay extra_mount, env2 (same sandbox_dir, same mount config) must not see it. Implementation-agnostic (no mention of upper/ or fuse-overlayfs), so a future migration to a custom upstream VFSProvider (the @earendil-works/gondolin package ships vfs/provider) satisfies the same contract trivially and the test passes for free.

Doc updates (DO NOT MERGE revisit list):
- NousResearch#9 marked resolved (this fix)
- NousResearch#6 narrowed: lists the one test we now have and what's still missing
- NousResearch#10 added: task_id='default' is shared across all top-level agents at the hermes/gateway layer; concurrent-tenancy isolation needs a per-session task_id and is out of scope for this branch
- NousResearch#11 added: overlay=true + missing readonly is a silent UX trap (host-side scratch is created, daemon makes guest mount EROFS)

Regression: all 118 gondolin unit + integration tests pass.

DO NOT MERGE: see docs/design/gondolin-terminal-backend.md.
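
The teardown order described in the fix can be sketched as follows. `teardown_overlay_mount` and the injected `unmount` callable are hypothetical; the real `_teardown_overlay_mounts` is not shown on this page, and the lazy unmount itself is left to the caller.

```python
import shutil
from pathlib import Path
from typing import Callable


def teardown_overlay_mount(merged: Path, unmount: Callable[[Path], None]) -> None:
    """Unmount `merged`, then delete the scratch dir holding upper/work/merged.

    `unmount` stands in for the backend's lazy unmount. Removing
    `merged.parent` afterwards means the next environment that reuses the
    same sandbox_dir starts from an empty upper layer.
    """
    unmount(merged)
    scratch_dir = merged.parent  # overlays/<safe>/ in the commit message
    shutil.rmtree(scratch_dir, ignore_errors=True)
```

Only the per-mount scratch directory is removed; sibling mounts under the same overlays/ directory are left alone.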

…ache_stats

AWS Bedrock Converse returns `usage.cacheReadInputTokens` / `cacheWriteInputTokens` (camelCase) when cachePoint markers fire on the request, but `normalize_converse_response` was dropping both fields on the floor, reading only `inputTokens` and `outputTokens`. This made prompt caching on non-Claude Bedrock models (Nova, Llama, DeepSeek) appear to give zero discount in Hermes telemetry, even when AWS was actually charging the cache-read rate.

Fix across three layers:

1. `agent/bedrock_adapter.py` (normalize_converse_response): surface `cacheReadInputTokens` and `cacheWriteInputTokens` on the returned SimpleNamespace. Expose both camelCase (Bedrock-native) and snake_case (Anthropic-convention) aliases so downstream normalizers can use whichever they already read.
2. `agent/transports/types.py` (Usage dataclass): add `cache_creation_tokens` alongside the existing `cached_tokens` field. Updates the docstring to make it clear both are populated when the upstream provider surfaces them.
3. `agent/transports/bedrock.py` (BedrockTransport.normalize_response and new extract_cache_stats): populate the new Usage fields when normalizing and add an extract_cache_stats method that mirrors AnthropicTransport's so telemetry consumers can be transport-agnostic.

Semantics match Bedrock docs: `inputTokens` represents NEW/uncached input tokens billed at full rate; cache-read/write tokens are reported separately and are NOT double-counted inside `inputTokens`. Pricing reconciliation consumers can sum all three for true prompt size.

26 new tests in tests/agent/transports/test_bedrock_cache_telemetry.py covering normalization, transport propagation, extract_cache_stats parity with the Anthropic transport, zero-value handling, and both SimpleNamespace and raw dict input shapes.

Closes gap NousResearch#6 identified in the Phase 2 re-verification (PraxVault/Hermes/Reference/Decisions/bedrock-phase2-audit/04-current-architecture).
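
A rough sketch of the normalization in the first layer, assuming the usage block arrives as a plain dict. `normalize_usage` and `total_prompt_tokens` are hypothetical helpers, and the snake_case alias names are an assumption modelled on Anthropic's usage fields; the commit message does not spell them out.

```python
from types import SimpleNamespace


def normalize_usage(usage: dict) -> SimpleNamespace:
    """Copy Converse-style token counts, keeping the cache fields."""

    def count(key: str) -> int:
        # Missing or null counters are treated as zero.
        return int(usage.get(key) or 0)

    cache_read = count("cacheReadInputTokens")
    cache_write = count("cacheWriteInputTokens")
    return SimpleNamespace(
        inputTokens=count("inputTokens"),
        outputTokens=count("outputTokens"),
        cacheReadInputTokens=cache_read,
        cacheWriteInputTokens=cache_write,
        # snake_case aliases (assumed names, see the note above)
        cache_read_input_tokens=cache_read,
        cache_creation_input_tokens=cache_write,
    )


def total_prompt_tokens(u: SimpleNamespace) -> int:
    """True prompt size: uncached input plus cache reads plus cache writes."""
    return u.inputTokens + u.cacheReadInputTokens + u.cacheWriteInputTokens
```

For example, a usage block with 10 uncached input tokens and 90 cache-read tokens describes a 100-token prompt, of which only 10 tokens are billed at the full input rate.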

hjc-puro added 2 commits November 3, 2025 17:42

fix leakage

a4db3fd

prevent leakage of morph instances between tasks

fbd3a2f

hjc-puro commented Nov 4, 2025

hjc-puro requested a review from teknium1 November 4, 2025 08:37

teknium1 merged commit 9573b2a into main Nov 4, 2025

This was referenced Mar 5, 2026

Feature: Enhanced Extension System with Tool Interception & Lifecycle Events (inspired by Pi) #359

Closed

Feature: Multi-Agent Architecture — Orchestration, Cooperation, Specialized Roles & Resilient Workflows #344

Closed

SHL0MS mentioned this pull request Mar 30, 2026

[UX] Context-exceeded error lacks actionable guidance #4061

Closed

MorAlekss mentioned this pull request Apr 1, 2026

feat(skills): add verify-code-changes skill #4459

Closed

teknium1 mentioned this pull request Apr 2, 2026

feat(memory): pluggable memory provider interface with profile isolation, review fixes, and honcho CLI restoration #4623

Merged

Copilot AI mentioned this pull request Apr 6, 2026

docs: feature Burgess Principle integration across README and repo docs ljbudgie/hermes-agent#2

Merged

ahmedaltewaj mentioned this pull request Apr 7, 2026

Gateway interrupt loop: get_pending_message never clears, causing infinite recursion #21

Closed

Helmi mentioned this pull request Apr 8, 2026

Inactivity timeout fires repeatedly with MiniMax 2.7 highspeed #6260

Closed

aaronlab mentioned this pull request Apr 9, 2026

fix: token estimation accuracy, context length logging, batch integrity #6629

Open

6 tasks

h4x3rotab referenced this pull request in Clawdi-AI/hermes-agent Apr 10, 2026

Merge pull request #6 from outsourc-e/phase4.1-smart-suggestions

30fea9f

Phase4.1 smart suggestions

Vex-Dravex added a commit to Vex-Dravex/hermes-agent that referenced this pull request Apr 10, 2026

ralph: story NousResearch#6 — Add /checkpoint and /restore slash comm…

acc94f0

…ands

Vex-Dravex added a commit to Vex-Dravex/hermes-agent that referenced this pull request Apr 11, 2026

ralph: story NousResearch#6 — Add /checkpoint and /restore slash comm…

a104206

…ands

malaiwah pushed a commit to malaiwah/hermes-agent that referenced this pull request Apr 11, 2026

Merge pull request 'fix(docker): gate --pids-limit on cgroup availabi…

66a5de5

…lity + make configurable' (NousResearch#6) from fix/pids-limit-cgroup-probe into main

kshitijk4poor mentioned this pull request Apr 15, 2026

feat: add hermes-blender skill for 3D modeling and rendering #10191

Closed

falses00 mentioned this pull request Apr 15, 2026

Code Review Refactoring Implementations (P0-P2) #10445

Open

kshitijk4poor mentioned this pull request Apr 16, 2026

feat: add TouchDesigner integration skill (twozero MCP) #10081

Closed

teknium1 mentioned this pull request Apr 16, 2026

fix(approval): heartbeat activity during gateway approval wait #11245

Merged

Julientalbot mentioned this pull request Apr 18, 2026

fix(cron): respect configured timezone for naive timestamps #12241

Closed

This was referenced Jun 1, 2026

Proactive model validation + per-use-case model guidance at gateway start #36278

Open

[Setup]: #22812

Closed

ricardocamiloconsir mentioned this pull request Jun 2, 2026

feat(gateway): session model pool — concurrency-aware auto-assignment with auxiliary slot tracking #37519

Open

liuhao1024 mentioned this pull request Jun 5, 2026

fix(gateway): strip _HERMES_GATEWAY from Windows detached restart helper env #40059

Closed

friendshipisover mentioned this pull request Jun 7, 2026

fix(state): anchor lineage title numbering to the resolved base #41223

Open

13 tasks

jarvis-stark-ops mentioned this pull request Jun 7, 2026

feat(gateway): dispatcher heartbeat — detect silent stalls from outside #41588

Closed

3 tasks

cristianmgm7 mentioned this pull request Jun 10, 2026

feat(platforms): add Carbon Voice as a native messaging platform #43226

Open

AutomalyRo mentioned this pull request Jun 10, 2026

[Bug]: OpenAI Codex usage completely Broken/Being Treated as Custom API #43461

Closed

1 task

ether-btc mentioned this pull request Jun 10, 2026

feat(skills): add model-task-router — automatic task-to-model routing backed by DeepSWE data #43534

Open

annguyenNous mentioned this pull request Jun 11, 2026

fix: add bounds checking in _parse_status git status parsing #44052

Open

3 tasks

f-trycua mentioned this pull request Jun 16, 2026

Track: decouple Hermes' computer_use wrapper from cua-driver internals #47072

Open

8 tasks

robbintops mentioned this pull request Jun 17, 2026

feat(background_review): structured improvement_record callback #47924

Open

3 tasks

This was referenced Jun 17, 2026

fix(egress): maxpetrusenko P1/P2 — fail-closed secrets, NODE_OPTIONS conflict, GPG verify, threat-model scope #48076

Closed

feat(egress): iron-proxy credential-injection firewall for sandboxes #30179

Open

lEWFkRAD mentioned this pull request Jun 25, 2026

fix(windows): comprehensive Windows compatibility overhaul #52685

Open

EiomSirius mentioned this pull request Jun 28, 2026

[Soul]: Give Hermes a Life Between Sessions — Curiosity Engine, Synthetic Dream Cycle, and the Watchman #53871

Open

hbd mentioned this pull request Jun 28, 2026

feat(slack): render markdown natively via Block Kit markdown block #53893

Closed

teknium1 merged 2 commits into main from fix-leakage

3 participants
