Python: fix(python/openai): use full-history mode by default to fix tool call streaming with OpenRouter #5111
Conversation
`_get_conversation_id` now returns `None` unless `store=True` is explicitly set, making server-managed conversation mode opt-in. This ensures compatibility with providers that don't support `previous_response_id` (e.g. OpenRouter), where sending only a tool result without the full conversation context caused an HTTP 400 error. Fixes microsoft#5105
@microsoft-github-policy-service agree
chetantoshniwal left a comment

Automated Code Review
Reviewers: 4 | Confidence: 90%
✓ Correctness
This PR fixes a correctness issue where `_get_conversation_id` would return a conversation ID when `store=None` (the default), causing providers that don't support `previous_response_id` (e.g. OpenRouter) to fail. The fix changes the guard from `if store is False` to `if store is not True`, so only an explicit `store=True` opt-in enables server-managed conversation mode. The logic change is correct and the tests are well-structured. One note: the same `_get_conversation_id` method in `_responses_client.py` (line 448) still uses the old `if store is False` guard, but this may be intentional since the Responses API is OpenAI-native and always supports `previous_response_id`.
✓ Security & Reliability
This PR fixes a reliability issue (GitHub #5105) where `_get_conversation_id` would return a conversation ID when `store=None` (the default), causing failures with OpenAI-compatible providers (e.g., OpenRouter) that don't support `previous_response_id`. The fix correctly changes the guard from `if store is False` to `if store is not True`, ensuring only explicit opt-in via `store=True` enables server-managed conversation mode. The code change is minimal and well-tested. No security issues are introduced. Note that the sibling `_responses_client.py` still uses the old `if store is False` check, but this is likely intentional since the Responses API is OpenAI-native and always supports conversation IDs.
✓ Test Coverage
The test coverage for this behavioral change is solid. The new test `test_parse_response_with_store_none_returns_none` directly validates the regression fix (issue #5105) by asserting that both `store=None` and `store=False` return `None` from `_get_conversation_id`. The streaming tests (`test_streaming_response_created_type` and `test_streaming_response_in_progress_type`) were properly updated to verify both the `store=True` (returns conversation_id) and `store=None` (returns `None`) paths. One minor gap: the non-streaming `_get_conversation_id` path lacks a direct `store=True` with-conversation positive test in this diff, though this is likely covered by a pre-existing test. A `store=False` case for streaming is untested but is functionally identical to `store=None` under the new `is not True` guard. Overall the tests are meaningful: they assert specific return values rather than just exercising code paths.
✓ Design Approach
The change correctly inverts the semantics of `_get_conversation_id` from opt-out (return `None` only when `store=False`) to opt-in (return non-`None` only when `store=True`). Previously, `store=None` (the default when a caller omits the field) would fall through the guard and return a `previous_response_id`-based conversation ID, which breaks providers like OpenRouter that don't support that parameter. The fix properly treats `store=None` the same as `store=False`, making server-managed conversation mode strictly opt-in. The change is minimal, addresses the root cause rather than masking it, and the updated tests adequately cover all three `store` states. No blocking design issues found.
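The fall-through failure described in these reviews can be illustrated with simplified request payloads (field names loosely follow the Responses-style API; the shapes are illustrative, not the exact wire format):

```python
# Second request after a tool call, full-history mode (the new default):
full_history_request = {
    "input": [
        {"role": "user", "content": "What's the weather in Paris?"},
        {"type": "function_call", "call_id": "call_1", "name": "get_weather"},
        {"type": "function_call_output", "call_id": "call_1", "output": "18C"},
    ],
}

# Same request in server-managed mode (now opt-in via store=True): the
# history is replaced by a pointer to the previously stored response.
server_managed_request = {
    "previous_response_id": "resp_abc",
    "input": [
        {"type": "function_call_output", "call_id": "call_1", "output": "18C"},
    ],
}

# A provider that ignores previous_response_id (e.g. OpenRouter) sees a bare
# function_call_output at position 0 with no matching function_call, hence
# the HTTP 400 described in this PR.
assert server_managed_request["input"][0]["type"] == "function_call_output"
```

Full-history mode sidesteps the problem because every request is self-contained, at the cost of resending the conversation each turn.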
Automated review by chetantoshniwal's agents
```python
    assert client._get_conversation_id(mock_response, store=False) is None


def test_parse_response_uses_response_id_when_no_conversation() -> None:
```

This test covers the `store=None` and `store=False` negative cases well, but doesn't verify the positive case (`store=True` returning `conv_456`). Adding `assert client._get_conversation_id(mock_response, store=True) == 'conv_456'` would make this a complete tri-state test and guard against future regressions that might accidentally break the `store=True` path.

Suggested change:

```python
    assert client._get_conversation_id(mock_response, store=None) is None
    assert client._get_conversation_id(mock_response, store=False) is None
    assert client._get_conversation_id(mock_response, store=True) == "conv_456"


def test_parse_response_uses_response_id_when_no_conversation() -> None:
```
Summary
Fixes #5105: streaming crashes after tool execution when using OpenRouter or any provider that doesn't support `previous_response_id`.

Root cause: `_get_conversation_id` returned `response.id` by default (when `store` was not explicitly `False`). This caused the executor to enter server-managed continuation mode, clearing the full conversation history and sending only the tool result in the next request. Providers like OpenRouter ignore `previous_response_id` and received a bare `function_call_output` at position 0, resulting in an HTTP 400.

Fix: `_get_conversation_id` now returns `None` unless `store=True` is explicitly set. Server-managed conversation mode is opt-in; the default is full-history mode, which is universally compatible.

Users who want the server-managed optimization (OpenAI native only) should set `store=True` in their chat options.

Changes
- `python/packages/openai/agent_framework_openai/_chat_client.py`: changed `if store is False` → `if store is not True` in `_get_conversation_id`
- `python/packages/openai/tests/openai/test_openai_chat_client.py`: updated streaming event tests to cover both `store=True` and `store=None`; added `test_parse_response_with_store_none_returns_none` as a regression test

Test plan
- `uv run pytest packages/openai/tests/ -m "not integration"`: 270 passed, 0 failures
- `store=True` still enables server-managed conversation mode for OpenAI native