…sion property

The compaction summary call dropped `tools=` entirely, so the model could only respond with text. vLLM's chat-completions layer renders a `# Tools` block into the system message only when `tools=` is set in the request, so dropping it makes the summary call's system prompt diverge from every regular turn.

Downstream, prime-rl's trajectory walker (`interleave_rollout`) checks whether each step's `prompt_ids` extend any active sample's prefix; the diverging system prompt fails that check at the compaction summary turn, and the post-compaction continuation then fails it a second time. One compaction event therefore opens *two* training-sample splits where structurally one is enough, observable as `samples_per_rollout / rlm_compactions_count == 2.0` across all envs in RL runs.

Fix: forward the active tool list with `tool_choice="none"`. Tools are still advertised, so the system prompt is identical to regular turns, and `tool_choice="none"` preserves the original "text-only summary" behaviour by forbidding tool calls on this turn.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
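A minimal sketch of the fixed request shape, assuming an OpenAI-style chat-completions payload. The function and argument names below are illustrative stand-ins, not the actual prime-rl call site:

```python
# Sketch of the fix: build the compaction-summary request with the active
# tool list still attached, but tool calls forbidden via tool_choice="none".
# `build_summary_request` and its arguments are hypothetical names for
# illustration; they are not the real prime-rl API.
def build_summary_request(system_prompt, summary_instruction, active_tools):
    """Request body for the compaction summary turn.

    Keeping tools= identical to regular turns means vLLM renders the same
    "# Tools" block into the system message, so prompt_ids at this turn
    still extend the pre-compaction token prefix.
    """
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": summary_instruction},
        ],
        "tools": active_tools,   # still advertised on this turn
        "tool_choice": "none",   # text-only summary: no tool calls allowed
    }
```

The key property is that `tools` is byte-identical to the regular turns, while `tool_choice="none"` alone enforces the text-only behaviour.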
eligotts approved these changes on May 7, 2026.
Summary

The compaction summary call drops `tools=` from the request entirely, so the model can only respond with text. Side effect we missed: vLLM's chat-completions layer injects a `# Tools\n<tools>{...}</tools>` block into the rendered system message only when `tools=` is in the request. Dropping `tools=` therefore makes the compaction call's system prompt diverge from every regular turn.

Downstream, prime-rl's RL trajectory walker (`interleave_rollout` in `src/prime_rl/orchestrator/trajectories.py`) opens a new training sample whenever a step's `prompt_ids` doesn't prefix-match any active sample. One compaction event creates two prefix-incompatible boundaries:

1. The compaction summary turn (no `tools=` in the request, hence no `# Tools` block in the rendered system prompt) does not extend the prior pre-compaction sample.
2. The post-compaction continuation (`# Tools` block back) does not extend the summary turn.

So one compaction event splits the rollout into three samples where structurally two would be enough.
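The walker's prefix check can be sketched as follows. This is a simplified stand-in with hypothetical names, not the real `interleave_rollout` in `src/prime_rl/orchestrator/trajectories.py`:

```python
# Simplified stand-in for the walker's prefix check: a step joins the
# active sample only if its prompt_ids extend that sample's token prefix.
# Names and structure are hypothetical; see
# src/prime_rl/orchestrator/trajectories.py for the real implementation.
def extends(prefix: list[int], prompt_ids: list[int]) -> bool:
    """True if prompt_ids starts with the active sample's tokens."""
    return len(prompt_ids) >= len(prefix) and prompt_ids[: len(prefix)] == prefix

def split_into_samples(steps: list[dict]) -> list[list[dict]]:
    samples: list[list[dict]] = []
    active_tokens: list[int] = []
    for step in steps:
        if samples and extends(active_tokens, step["prompt_ids"]):
            samples[-1].append(step)   # step continues the active sample
        else:
            samples.append([step])     # prefix broke: open a new sample
        active_tokens = step["prompt_ids"] + step["completion_ids"]
    return samples
```

With a diverging system prompt at the summary turn and a second divergence when the `# Tools` block returns, three steps land in three separate samples; with identical system prompts throughout, consecutive turns stay in one sample.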
Symptoms in production

RLM-GLM5.1 run (TITO, `use_token_client=true`), all four envs: `samples_per_rollout / rlm_compactions_count` is 2.0 across every env. That is too clean to be tokenization noise; the cause is structural.
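A back-of-envelope accounting of why the ratio pins near 2.0, assuming one base sample per rollout plus one sample per prefix break (the counts below are illustrative, not taken from the run):

```python
# Illustrative accounting: before the fix each compaction event causes
# 2 prefix breaks; after the fix, 1. Assumes one base sample per rollout
# plus one new sample per break. Numbers are hypothetical, not run data.
def expected_samples(compactions: int, breaks_per_compaction: int) -> int:
    return 1 + breaks_per_compaction * compactions

def samples_per_compaction(compactions: int, breaks_per_compaction: int) -> float:
    return expected_samples(compactions, breaks_per_compaction) / compactions
```

Under this model the ratio approaches 2.0 in compaction-heavy rollouts before the fix, and 1.0 after it.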
Fix

Forward the active tool list to the compaction call as `tools=active_tools` with `tool_choice="none"`. The system prompt now renders identically to regular turns (the extension property holds at the summary turn, so there is 1 break per compaction instead of 2), while `tool_choice="none"` preserves the original "text-only summary" behaviour by forbidding tool calls.