CliAgentEnv: add TASK_FINISHED stop-signal hook#1243
Conversation
Add a `task_finished_signal: str | None = None` arg and a matching `@vf.stop` hook that terminates the rollout when the configured signal (e.g. "TASK_FINISHED") appears in the trailing tool-role messages of the agent's latest intercepted request. When `task_finished_signal` is None (default), the hook is a no-op so existing envs are unaffected. Opt-in by passing the signal, typically paired with a system-prompt ritual instructing the model to emit it via `echo "TASK_FINISHED"` — analogous to mini-swe-agent-plus's MINI_SWE_AGENT_FINAL_OUTPUT mechanism. The scan only walks the trailing run of tool messages in the latest trajectory step's prompt, so it stays cheap regardless of trajectory length.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit ed421b3. Configure here.
| sandbox_creations_per_minute: float | None = 128, | ||
| timeouts: SandboxTimeouts = SandboxTimeouts(), | ||
| keep_sandbox_for_scoring: bool = False, | ||
| task_finished_signal: str | None = None, |
There was a problem hiding this comment.
Missing documentation for new task_finished_signal parameter
Low Severity
The new task_finished_signal parameter and its corresponding @vf.stop hook task_finished_signal_emitted add user-facing functionality to CliAgentEnv, but docs/environments.md (line 888) — which explicitly enumerates CliAgentEnv constructor parameters — was not updated to mention the new parameter or describe the stop-signal behavior. This violates the documentation update rule for changes to core user-facing functionality described in docs/.
Triggered by project rule: BugBot Instructions
Reviewed by Cursor Bugbot for commit ed421b3. Configure here.


Summary
Adds an opt-in
task_finished_signal: str | None = Noneparameter toCliAgentEnvand a matching@vf.stophook (task_finished_signal_emitted) that terminates the rollout when the configured signal (e.g."TASK_FINISHED") appears in the trailing tool-role messages of the agent's latest intercepted request.The default (
None) is a no-op, so existing envs are unaffected.Motivation
rlm-style agents already expose a passive-stop path when the model returns
tool_calls=[], but the model rarely takes that exit on its own. Analysis of v4-p1 step-27 rollouts showed a 54.5% rate of "summary-while-still-tool-calling" — the agent narrates that it's done while continuing to issue tool calls — whereasmini_swe_agent_plus's analogousMINI_SWE_AGENT_FINAL_OUTPUTritual is emitted in 97.4% of its rollouts.Giving
CliAgentEnvthe same "I'm done" ritual + automatic stop hook gives the engine a reliable way to terminate when the model commits to a fix.Companion PR
This hook is paired with PrimeIntellect-ai/rlm-harness#60, which instructs the rlm agent (when bash is an active tool) to emit exactly one
bashtool call with the commandecho "TASK_FINISHED"when confident the task is complete. Either PR alone is a no-op; both together close the loop.Design notes
task_finished_signaldefaults toNoneso the hook is opt-in; existingCliAgentEnvsubclasses (e.g. rlm-swe harness configs) are not surprised.prompt; it does not traverse full history.ToolMessageand dict-shaped intercepted messages.echowrappers or output framing don't defeat detection.Note
Low Risk
Low risk: behavior is unchanged by default (
task_finished_signal=None), and the new stop condition only triggers when explicitly configured based on recent tool-message content.Overview
Adds an opt-in
task_finished_signalparameter toCliAgentEnvand stores it on the instance.Introduces a new
@vf.stopcheck (task_finished_signal_emitted) that terminates the rollout when the configured signal is detected in the latest step’s trailing tool-role messages, allowing agents to end runs via a deliberate tool output marker.Reviewed by Cursor Bugbot for commit ed421b3. Bugbot is set up for automated code reviews on this repo. Configure here.