Description
When using HandoffBuilder with OpenAIChatClient (Chat Completions API) and a tool that has approval_mode="always_require", the workflow crashes with a 400 error after the user approves a tool call. The OpenAI API rejects the request because the message array contains a duplicate assistant message with tool_calls that has no matching tool response message.
Root cause
HandoffAgentExecutor._run_agent_and_emit() replays _full_conversation as the message cache. The default InMemoryHistoryProvider also independently stores and loads messages via the agent session. When the workflow resumes after tool approval, both sources contribute messages, causing the assistant tool_calls message to appear twice — once from history and once from _full_conversation. The first copy has no matching tool response, so the OpenAI Chat Completions API rejects it:
```
An assistant message with 'tool_calls' must be followed by tool messages
responding to each 'tool_call_id'. The following tool_call_ids did not have
response messages: call_XXXXX
```
Debug output showing the duplicated messages sent to the API:
```
[0] role=system    (instructions)
[1] role=user      (initial message)        ← from InMemoryHistoryProvider
[2] role=assistant, tool_call=call_XXX      ← from InMemoryHistoryProvider (NO matching tool response)
[3] role=user      (initial message)        ← from _full_conversation
[4] role=assistant, tool_call=call_XXX      ← from _full_conversation
[5] role=tool, tool_call_id=call_XXX        ← approval result
```
Message [2] is the orphaned tool_calls that triggers the 400 error.
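The rejection rule can be checked offline. The sketch below (plain dicts, illustrative only — not framework code) applies the Chat Completions constraint that every assistant `tool_calls` entry must be answered by `tool` messages before the next non-tool message, and flags message [2] from the transcript above:

```python
def find_orphaned_tool_calls(messages: list[dict]) -> list[str]:
    """Return tool_call_ids from assistant messages that never receive a
    matching `tool` response before the next non-tool message appears."""
    orphaned: list[str] = []
    pending: set[str] = set()
    for msg in messages:
        if msg.get("role") == "tool":
            pending.discard(msg.get("tool_call_id"))
            continue
        # Any non-tool message closes the previous tool_calls window;
        # whatever is still unanswered at that point is orphaned.
        orphaned.extend(pending)
        pending = {tc["id"] for tc in msg.get("tool_calls", [])}
    orphaned.extend(pending)
    return orphaned

# The duplicated transcript from the debug output above:
messages = [
    {"role": "system", "content": "instructions"},
    {"role": "user", "content": "Refund order 123"},            # from InMemoryHistoryProvider
    {"role": "assistant", "tool_calls": [{"id": "call_XXX"}]},  # orphaned first copy
    {"role": "user", "content": "Refund order 123"},            # from _full_conversation
    {"role": "assistant", "tool_calls": [{"id": "call_XXX"}]},
    {"role": "tool", "tool_call_id": "call_XXX", "content": "ok"},
]
print(find_orphaned_tool_calls(messages))  # → ['call_XXX'] (the first, unanswered copy)
```

The second `call_XXX` at index [4] passes because the `tool` message at [5] answers it; only the replayed copy at [2] trips the validation.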
Relationship to #4376
Issue #4376 describes a related but distinct bug: HandoffBuilder forces store=False, which breaks AzureOpenAIResponsesClient because the FunctionInvocationLayer captures response.conversation_id and references a non-persisted response. That bug is specific to the Responses API path. This bug affects the Chat Completions API path (OpenAIChatClient) where store=False doesn't prevent InMemoryHistoryProvider from duplicating messages.
Steps to reproduce
- Create agents with `OpenAIChatClient` (or any Chat Completions-based client)
- Add a tool with `@tool(approval_mode="always_require")`
- Build a `HandoffBuilder` workflow
- Run the workflow with `stream=True`
- When the tool approval `request_info` event fires, approve it
- Resume with `workflow.run(responses=responses, stream=True)`
Code Sample
```python
import asyncio

from agent_framework import Content, tool
from agent_framework.openai import OpenAIChatClient
from agent_framework.orchestrations import HandoffBuilder

client = OpenAIChatClient(
    base_url="https://models.github.ai/inference",
    api_key="...",
    model_id="openai/gpt-5-mini",
)

@tool(approval_mode="always_require")
def submit_refund(order_id: str, amount: str) -> str:
    """Process a refund."""
    return f"Refund of {amount} for order {order_id}"

triage = client.as_agent(name="triage", instructions="Route to refund agent.")
refund = client.as_agent(name="refund", instructions="Process refunds.", tools=[submit_refund])

workflow = (
    HandoffBuilder(name="demo", participants=[triage, refund])
    .with_start_agent(triage)
    .build()
)

async def main() -> None:
    # Initial run triggers the tool approval request
    events = []
    async for event in workflow.run("Refund order 123", stream=True):
        if event.type == "request_info":
            events.append(event)

    # Approve and resume → crashes with 400
    responses = {}
    for e in events:
        if isinstance(e.data, Content) and e.data.type == "function_approval_request":
            responses[e.request_id] = e.data.to_function_approval_response(approved=True)

    async for event in workflow.run(responses=responses, stream=True):  # 💥 400 error
        pass

asyncio.run(main())
```
I also verified this reproduces when adapting the official handoff_with_tool_approval_checkpoint_resume.py sample to use OpenAIChatClient instead of AzureOpenAIResponsesClient.
Error Messages / Stack Traces
```
openai.BadRequestError: Error code: 400 - {'error': {'message': "An assistant message
with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'.
The following tool_call_ids did not have response messages: call_XXXXX",
'type': 'invalid_request_error', 'param': 'messages.[3].role', 'code': None}}
```
Package Versions
agent-framework-core pinned at commit 11628c3166a1845683c5aef1e0d389eb862bcbaa (post-rc2)
Python Version
Python 3.12
Additional Context
The HandoffAgentExecutor already manages full conversation replay via _full_conversation and sets store=False. The InMemoryHistoryProvider (auto-added by Agent.__init__) should either be suppressed or have load_messages=False in handoff workflows to prevent this duplication. As a workaround, explicitly passing context_providers=[InMemoryHistoryProvider(load_messages=False)] to each agent prevents the error for the first tool approval round, but subsequent rounds still fail due to the same duplication in _full_conversation itself.
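To illustrate the shape of deduplication the executor or history provider would need, here is a hedged sketch (plain message dicts as in the debug output above; `strip_orphaned_tool_calls` is a hypothetical helper, not an existing framework API) that drops an assistant `tool_calls` message that was never answered before replaying the transcript:

```python
def strip_orphaned_tool_calls(messages: list[dict]) -> list[dict]:
    """Return a copy of `messages` with unanswered assistant `tool_calls`
    entries removed. Simplified: assumes each tool_calls message is either
    fully answered or not answered at all."""
    keep: list[dict] = []
    pending_idx: int | None = None   # index in `keep` of the open tool_calls message
    pending_ids: set[str] = set()
    for msg in messages:
        if msg.get("role") == "tool":
            pending_ids.discard(msg.get("tool_call_id"))
            if not pending_ids:
                pending_idx = None   # fully answered; keep it
            keep.append(msg)
            continue
        if pending_idx is not None:
            # Previous assistant tool_calls message was never answered: drop it.
            keep.pop(pending_idx)
        if msg.get("tool_calls"):
            pending_idx = len(keep)
            pending_ids = {tc["id"] for tc in msg["tool_calls"]}
        else:
            pending_idx, pending_ids = None, set()
        keep.append(msg)
    if pending_idx is not None:
        keep.pop(pending_idx)
    return keep

# The duplicated transcript from the debug output:
transcript = [
    {"role": "system", "content": "instructions"},
    {"role": "user", "content": "Refund order 123"},
    {"role": "assistant", "tool_calls": [{"id": "call_XXX"}]},  # orphaned copy
    {"role": "user", "content": "Refund order 123"},
    {"role": "assistant", "tool_calls": [{"id": "call_XXX"}]},
    {"role": "tool", "tool_call_id": "call_XXX", "content": "ok"},
]
cleaned = strip_orphaned_tool_calls(transcript)
print([m["role"] for m in cleaned])
# → ['system', 'user', 'user', 'assistant', 'tool']
```

The orphaned copy at index [2] is removed while the answered `tool_calls`/`tool` pair survives; a real fix would still need to address the duplicated user message and the repeated duplication in `_full_conversation` across approval rounds.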