Skip to content

CliAgentEnv: add TASK_FINISHED stop-signal hook#1243

Open
rasdani wants to merge 1 commit intomainfrom
feat/task-finished-stop-hook
Open

CliAgentEnv: add TASK_FINISHED stop-signal hook#1243
rasdani wants to merge 1 commit intomainfrom
feat/task-finished-stop-hook

Conversation

@rasdani
Copy link
Copy Markdown
Contributor

@rasdani rasdani commented Apr 24, 2026

Summary

Adds an opt-in task_finished_signal: str | None = None parameter to CliAgentEnv and a matching @vf.stop hook (task_finished_signal_emitted) that terminates the rollout when the configured signal (e.g. "TASK_FINISHED") appears in the trailing tool-role messages of the agent's latest intercepted request.

The default (None) is a no-op, so existing envs are unaffected.

Motivation

rlm-style agents already expose a passive-stop path when the model returns tool_calls=[], but the model rarely takes that exit on its own. Analysis of v4-p1 step-27 rollouts showed a 54.5% rate of "summary-while-still-tool-calling" — the agent narrates that it's done while continuing to issue tool calls — whereas mini_swe_agent_plus's analogous MINI_SWE_AGENT_FINAL_OUTPUT ritual is emitted in 97.4% of its rollouts.

Giving CliAgentEnv the same "I'm done" ritual + automatic stop hook gives the engine a reliable way to terminate when the model commits to a fix.

Companion PR

This hook is paired with PrimeIntellect-ai/rlm-harness#60, which instructs the rlm agent (when bash is an active tool) to emit exactly one bash tool call with the command echo "TASK_FINISHED" when confident the task is complete. Either PR alone is a no-op; both together close the loop.

Design notes

  • task_finished_signal defaults to None so the hook is opt-in; existing CliAgentEnv subclasses (e.g. rlm-swe harness configs) are not surprised.
  • The scan only walks the trailing run of tool messages in the latest trajectory step's prompt; it does not traverse full history.
  • Handles both Pydantic ToolMessage and dict-shaped intercepted messages.
  • Substring match on content (not exact equality) so echo wrappers or output framing don't defeat detection.

Note

Low Risk
Low risk: behavior is unchanged by default (task_finished_signal=None), and the new stop condition only triggers when explicitly configured based on recent tool-message content.

Overview
Adds an opt-in task_finished_signal parameter to CliAgentEnv and stores it on the instance.

Introduces a new @vf.stop check (task_finished_signal_emitted) that terminates the rollout when the configured signal is detected in the latest step’s trailing tool-role messages, allowing agents to end runs via a deliberate tool output marker.

Reviewed by Cursor Bugbot for commit ed421b3. Bugbot is set up for automated code reviews on this repo. Configure here.

Add a `task_finished_signal: str | None = None` arg and a matching
`@vf.stop` hook that terminates the rollout when the configured signal
(e.g. "TASK_FINISHED") appears in the trailing tool-role messages of
the agent's latest intercepted request.

When `task_finished_signal` is None (default), the hook is a no-op so
existing envs are unaffected. Opt-in by passing the signal, typically
paired with a system-prompt ritual instructing the model to emit it
via `echo "TASK_FINISHED"` — analogous to mini-swe-agent-plus's
MINI_SWE_AGENT_FINAL_OUTPUT mechanism.

The scan only walks the trailing run of tool messages in the latest
trajectory step's prompt, so it stays cheap regardless of trajectory
length.
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit ed421b3. Configure here.

sandbox_creations_per_minute: float | None = 128,
timeouts: SandboxTimeouts = SandboxTimeouts(),
keep_sandbox_for_scoring: bool = False,
task_finished_signal: str | None = None,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing documentation for new task_finished_signal parameter

Low Severity

The new task_finished_signal parameter and its corresponding @vf.stop hook task_finished_signal_emitted add user-facing functionality to CliAgentEnv, but docs/environments.md (line 888) — which explicitly enumerates CliAgentEnv constructor parameters — was not updated to mention the new parameter or describe the stop-signal behavior. This violates the documentation update rule for changes to core user-facing functionality described in docs/.

Fix in Cursor Fix in Web

Triggered by project rule: BugBot Instructions

Reviewed by Cursor Bugbot for commit ed421b3. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant