Skip to content

Add TASK_FINISHED stop ritual to system prompt#60

Open
rasdani wants to merge 1 commit intomainfrom
feat/task-finished-stop-signal
Open

Add TASK_FINISHED stop ritual to system prompt#60
rasdani wants to merge 1 commit intomainfrom
feat/task-finished-stop-signal

Conversation

@rasdani
Copy link
Copy Markdown
Collaborator

@rasdani rasdani commented Apr 24, 2026

Summary

Adds an explicit "I'm done" ritual to the rlm system prompt: when the model is confident the task is complete, it must emit exactly one bash tool call with the command echo "TASK_FINISHED" and emit nothing further. This is analogous to mini-swe-agent-plus's MINI_SWE_AGENT_FINAL_OUTPUT submission signal.

The new section is only appended when bash is an active tool, so non-bash configurations are unaffected.

Motivation

The rlm engine already supports a passive-stop path when the model returns tool_calls=[], but the model rarely takes that exit on its own. Analysis of v4-p1 step-27 rollouts showed a 54.5% rate of "summary-while-still-tool-calling" behavior — the agent narrates that it's done while continuing to issue tool calls — whereas mini-swe-agent-plus's analogous ritual is emitted in 97.4% of its rollouts.

A ritual + matching verifiers stop hook gives the engine a reliable way to terminate when the model commits to a fix, preventing that pathology.

Companion PR

This prompt change is a no-op without the companion verifiers PR that watches for TASK_FINISHED in tool responses and concludes the rollout via @vf.stop. See PrimeIntellect-ai/verifiers (forthcoming companion PR) for the matching stop hook.

Follow-ups

  • Re-emission ablation once training catches up to this prompt change, to measure how often the model takes the new exit.

Note

Medium Risk
Changes system-prompt behavior to introduce a new termination signal, which can affect agent control flow and rollout stopping behavior when bash is enabled. Low code complexity but potentially impacts runtime behavior across tasks.

Overview
Adds a new explicit rollout termination ritual to the system prompt: when bash is an active tool, the agent is instructed to finish by making exactly one bash call with echo "TASK_FINISHED" and then emit nothing further.

The prompt construction now conditionally appends this instruction based on active_tools, leaving non-bash tool configurations unchanged.

Reviewed by Cursor Bugbot for commit 7543f14. Bugbot is set up for automated code reviews on this repo. Configure here.

Instruct the model to emit exactly one `bash` tool call with
`echo "TASK_FINISHED"` when it is confident the fix is complete. This
gives the rollout engine an explicit, easy-to-detect termination
signal, analogous to mini-swe-agent-plus's MINI_SWE_AGENT_FINAL_OUTPUT.

Only appended when `bash` is an active tool, so non-bash configs are
unaffected. This prompt change is a no-op without the companion
verifiers stop-hook PR that watches for the signal.
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 7543f14. Configure here.

Comment thread src/rlm/prompt.py
"Emit no further tool calls or text after this. You cannot continue"
" working on this task after submitting.",
]
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Contradictory "stop calling tools" vs TASK_FINISHED instructions

Medium Severity

The new TASK_FINISHED ritual (issue a bash tool call when done) directly contradicts the existing instruction on line 29: "When you are done, stop calling tools and state your final answer." When bash is active, the system prompt now tells the model both to stop calling tools and to call the bash tool when the task is complete. This conflicting guidance is likely to confuse the model and reduce the adoption rate of the new stop ritual that the PR is specifically trying to achieve.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 7543f14. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant