fix(core): handle non-dict elements in merge_lists index lookup #35265

Open
Nilesh Hadalgi (nileshhadalgi016) wants to merge 2 commits into langchain-ai:master from nileshhadalgi016:nileshhadalgi016/fix-merge-lists-type-error

Conversation

@nileshhadalgi016

Description

Fixes #35259

When streaming Mistral responses with inline citations, the content list can contain a mix of str and dict elements. For example:

[
    "start answer...",
    {"type": "reference", "index": 0, "reference_ids": ["iKcb2CAQ7"]},
    "other answer..."
]

The merge_lists function in _merge.py assumed every element of the merged list was a dict when searching for a matching index. For a str element, the expression "index" in e_left performs a substring check (valid Python, but not the intended key lookup). When the string happens to contain "index", that check passes and e_left["index"] then fails with TypeError: string indices must be integers, not 'str'.
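The two steps of the failure can be reproduced in isolation. This is a minimal sketch using plain Python only; the variable names are illustrative and are not taken from _merge.py.

```python
# A text segment that happens to mention the word "index".
segment = "see index 1 for details"

# Step 1: on a str, `in` is a substring test, so this evaluates to True.
contains_index = "index" in segment

# Step 2: subscripting a str with a str key raises TypeError.
try:
    segment["index"]
    error = None
except TypeError as exc:
    error = exc

# A segment without the substring stops at step 1, so step 2 is never reached.
skipped = "index" in "start answer..."
```

The exact wording of the TypeError message depends on the Python version, so the sketch only relies on the exception type.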

Fix

Added an isinstance(e_left, dict) guard in the list comprehension (line 114) that finds merge targets by index. This ensures only dict elements are checked for matching index keys, skipping strings and other non-dict types.
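The guarded lookup can be sketched as a standalone helper. The function name find_merge_targets and its signature are hypothetical and exist only for illustration; in the PR the guard lives inside the list comprehension in merge_lists.

```python
from typing import Any


def find_merge_targets(merged: list[Any], index: Any) -> list[int]:
    """Return positions of dict elements whose "index" equals `index`.

    Hypothetical helper: it mirrors the guarded comprehension described
    above, not the actual langchain-core source.
    """
    return [
        i
        for i, e_left in enumerate(merged)
        # The isinstance check runs first, so str (and other non-dict)
        # elements are skipped before any key lookup is attempted.
        if isinstance(e_left, dict)
        and "index" in e_left
        and e_left["index"] == index
    ]


content = [
    "reindex the data",
    {"type": "reference", "index": 0, "reference_ids": ["iKcb2CAQ7"]},
    "other answer...",
]
targets = find_merge_targets(content, 0)
no_targets = find_merge_targets(content, 5)
```

Because the guard comes first in the `and` chain, short-circuit evaluation ensures the subscript is only ever applied to dicts.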

Tests

Added 2 new parametrized test cases to test_merge_lists:

  1. Mixed str and dict with matching index - verifies dicts with the same index are correctly merged while strings are preserved
  2. Mixed str and dict with non-matching index - verifies new dict elements are appended when no matching index exists
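The two scenarios can be illustrated with a simplified stand-in. The function merge_by_index below is hypothetical: it performs a shallow dict merge for brevity and is not the real merge_lists, whose merging rules are richer. The input values are illustrative rather than the PR's actual test parameters.

```python
from typing import Any


def merge_by_index(left: list[Any], right: list[Any]) -> list[Any]:
    """Simplified, hypothetical stand-in for merge_lists."""
    merged = list(left)
    for e in right:
        if isinstance(e, dict) and isinstance(e.get("index"), int):
            targets = [
                i
                for i, e_left in enumerate(merged)
                if isinstance(e_left, dict)
                and "index" in e_left
                and e_left["index"] == e["index"]
            ]
            if targets:
                # Shallow merge for illustration only.
                merged[targets[0]] = {**merged[targets[0]], **e}
                continue
        # No matching index (or not an indexed dict): append as a new element.
        merged.append(e)
    return merged


# Case 1: matching index. The dicts are combined and the string is preserved.
case_matching = merge_by_index(
    ["see index 1 for details", {"index": 0, "type": "reference"}],
    [{"index": 0, "reference_ids": ["a"]}],
)

# Case 2: non-matching index. The new dict is appended.
case_appended = merge_by_index(
    ["reindex the data", {"index": 0, "type": "reference"}],
    [{"index": 1, "type": "reference"}],
)
```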

Issue

This affects any Mistral model returning inline citations (e.g., mistral-medium-2505) when used with streaming and tools like Tavily search.

Dependencies

None

Copilot AI review requested due to automatic review settings February 16, 2026 19:27
@github-actions bot added the core (`langchain-core` package issues & PRs), fix (For PRs that implement a fix), and external labels on Feb 16, 2026
Contributor

Copilot AI left a comment

Pull request overview

Fixes a runtime TypeError in merge_lists when merging streaming content lists that contain mixed element types (e.g., str text segments alongside dict citation/reference objects).

Changes:

  • Guarded the index-based lookup in merge_lists to only inspect dict elements in the existing merged list.
  • Added parametrized unit tests covering mixed str/dict list merging with both matching and non-matching index values.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
|---|---|
| libs/core/langchain_core/utils/_merge.py | Prevents invalid "index" in e_left / e_left["index"] access when merged contains non-dict elements. |
| libs/core/tests/unit_tests/utils/test_utils.py | Adds regression tests for mixed str + dict content lists (inline citations scenario). |

[{"no_index": "b"}],
[{"no_index": "a"}, {"no_index": "b"}],
),
# Mixed str and dict elements with index (Mistral inline citations)
Collaborator

These tests pass on master.

Good catch! Updated the test strings to contain "index" as a substring (e.g., "see index 1 for details", "reindex the data") so they properly trigger the TypeError on master. The bug requires "index" in some_string to return True (substring match), which then causes some_string["index"] to fail.
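Why the choice of test strings matters can be sketched with an unguarded lookup. The helper unguarded_targets is hypothetical and only approximates the pre-fix comprehension as described in this PR.

```python
def unguarded_targets(merged, index):
    # Approximation of the pre-fix lookup: no isinstance check.
    return [
        i
        for i, e_left in enumerate(merged)
        if "index" in e_left and e_left["index"] == index
    ]


# Strings without the substring "index": the first condition is False,
# so the subscript is never evaluated and no error surfaces.
plain = ["start answer...", {"type": "reference", "index": 0}]
plain_result = unguarded_targets(plain, 0)

# A string containing "index": the substring check passes and the
# subscript on the str raises TypeError.
tricky = ["see index 1 for details", {"type": "reference", "index": 0}]
try:
    unguarded_targets(tricky, 0)
    raised = False
except TypeError:
    raised = True
```

A regression test therefore needs at least one str element that contains "index"; otherwise it passes both with and without the guard.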

Fixes langchain-ai#35259

When streaming Mistral responses with inline citations, the content list
can contain a mix of str and dict elements. The merge_lists function
assumed all elements would be dicts when checking for 'index', causing a
TypeError when iterating over str elements ('string indices must be
integers, not str').

Add isinstance(e_left, dict) check before accessing dict keys in the
list comprehension that finds merge targets by index.

@nileshhadalgi016 Nilesh Hadalgi (nileshhadalgi016) force-pushed the nileshhadalgi016/fix-merge-lists-type-error branch from 2a6d5dc to 0a05585 on February 18, 2026 02:54

@codspeed-hq
codspeed-hq bot commented Feb 18, 2026

Merging this PR will improve performance by 11.37%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 1 improved benchmark
✅ 12 untouched benchmarks
⏩ 22 skipped benchmarks [1]

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| WallTime | test_async_callbacks_in_sync | 23.5 ms | 21.1 ms | +11.37% |

Comparing nileshhadalgi016:nileshhadalgi016/fix-merge-lists-type-error (7575656) with master (b004103)

Footnotes

  1. 22 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Labels

core (`langchain-core` package issues & PRs), external, fix (For PRs that implement a fix)

Development

Successfully merging this pull request may close these issues.

TypeError in merge_lists when streaming Mistral responses with inline citations

3 participants