DEPR: warn on silent dtype changes during setitem-with-expansion #64678

Draft
jbrockmendel wants to merge 4 commits into pandas-dev:main from jbrockmendel:ref-indexing

Conversation

@jbrockmendel
Member

@jbrockmendel jbrockmendel commented Mar 18, 2026

Needs discussion: in the issue there was not 100% consensus on whether to enforce PDEP-6 behavior (which this PR does) or to allow concat behavior (the status quo in some but not all code paths).

Summary

  • Deprecate cases where ser.loc[new_key] = value or df.loc[new_key] = value silently changes the dtype of the Series/DataFrame
  • In a future version, the existing dtype will be retained instead of being silently changed
  • PDEP-6 exception preserved: int/uint → float is still allowed when the value introduces NaN
  • The warning is issued from infer_and_maybe_downcast, which is the function that decides whether values can be held in the original dtype
  • The check runs from _post_expansion_casting, where the final dtype is known after all intermediate dtype changes have been resolved, rather than from the forward-cast calls in _setitem_with_indexer_missing or _append_internal
  • Suppresses the warning for empty DataFrames (placeholder dtypes aren't meaningful to preserve) and for internal pivot_table margin calculations
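
A minimal sketch of the kind of expansion the Summary describes, assuming a pandas release in which this deprecation has not yet been enforced. The variable names are illustrative only. With this PR applied, the first assignment would additionally emit the deprecation warning, while the NaN case would not.

```python
import numpy as np
import pandas as pd

# Setitem-with-expansion: label 3 does not exist yet, so .loc appends it.
ser = pd.Series([1, 2, 3], dtype="int64")
ser.loc[3] = 1.5
expanded_dtype = ser.dtype  # the int64 Series has silently become float64

# PDEP-6 exception: an int Series expanded with NaN may still become float.
ser_nan = pd.Series([1, 2, 3], dtype="int64")
ser_nan.loc[3] = np.nan
nan_dtype = ser_nan.dtype  # float64, which this proposal continues to allow

print(expanded_dtype, nan_dtype)
```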

Motivation

This is the first step toward simplifying the expansion codepath in core.indexing. Currently, expansion uses concat_compat (Series) and _append_internal/concat (DataFrame), both of which perform implicit dtype promotion. Once this deprecation is enforced, the expansion can be refactored to use reindex throughout, which preserves the existing dtype and is simpler.
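
The difference between the two strategies can be illustrated with public API calls. This is only an analogy under stated assumptions: pd.concat and Series.reindex stand in for the internal code paths, and fill_value=0 is an arbitrary placeholder chosen for this sketch, not what the refactor would necessarily use.

```python
import pandas as pd

ser = pd.Series([1, 2, 3], dtype="int64")

# concat-style expansion: the result takes the common dtype of both pieces,
# so appending a float promotes the whole Series.
via_concat = pd.concat([ser, pd.Series([1.5], index=[3])])

# reindex-style expansion: the new slot is created in the existing dtype
# (fill_value=0 is a placeholder assumed for this illustration).
via_reindex = ser.reindex([0, 1, 2, 3], fill_value=0)

print(via_concat.dtype, via_reindex.dtype)
```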

Test plan

  • All existing indexing tests pass (8583 tests)
  • All extension tests pass (131 expansion-specific tests)
  • Pivot table tests pass with internal suppression
  • PDEP-6 int→float from NaN is correctly exempted
  • Warning message includes actionable guidance ("Cast the object to X before this operation")
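
Following the guidance quoted in the warning message, a caller who wants the wider dtype can cast explicitly before expanding. A short sketch, assuming float64 is the intended target dtype:

```python
import pandas as pd

ser = pd.Series([1, 2, 3], dtype="int64")

# Cast up front so the expansion below no longer needs to change the dtype.
ser = ser.astype("float64")
ser.loc[3] = 1.5

print(ser.dtype)
```

Because the Series is already float64 when the new label is added, the result should be the same before and after the deprecation is enforced.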

🤖 Generated with Claude Code

@jbrockmendel jbrockmendel added Needs Discussion Requires discussion from core team before further action Deprecate Functionality to remove in pandas labels Mar 18, 2026
@jbrockmendel jbrockmendel force-pushed the ref-indexing branch 9 times, most recently from 18c931e to 9e4ed27 Compare March 22, 2026 14:51
@jbrockmendel
Member Author

On the dev call, @jorisvandenbossche preferred the concat behavior over the reindex behavior, i.e. always casting rather than casting only in the PDEP-6 case.

Deprecate cases where `ser.loc[new_key] = value` or
`df.loc[new_key] = value` silently changes the dtype of the
Series/DataFrame. In a future version, the existing dtype will be
retained instead of being silently changed.

The PDEP-6 exception is preserved: int/uint -> float is still
allowed when the value introduces NaN.

The warning is issued from `infer_and_maybe_downcast`, which is
the function that decides whether values can be held in the
original dtype. The check runs from `_post_expansion_casting`
(the definitive answer after all intermediate dtype changes have
been resolved), not from the forward-cast calls in
`_setitem_with_indexer_missing` or `_append_internal`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jbrockmendel jbrockmendel added the Mothballed Temporarily-closed PR the author plans to return to label Apr 8, 2026
@jbrockmendel jbrockmendel reopened this Apr 30, 2026
@jbrockmendel jbrockmendel removed the Mothballed Temporarily-closed PR the author plans to return to label May 1, 2026
@jbrockmendel
Member Author

Revisiting this, I'm still not comfortable with the concat behavior. It makes it really easy to end up with object dtype. And code-wise, reindex seems much simpler (e.g. we don't need the cast-back logic we currently have).

Labels

Deprecate: Functionality to remove in pandas
Needs Discussion: Requires discussion from core team before further action

Development

Successfully merging this pull request may close these issues.

API: setitem-with-expansion casting

1 participant