PERF: preserve block memory layout in Block.copy (GH#60469) #65302
Merged
mroeschke merged 3 commits into pandas-dev:main on Apr 21, 2026
Pass the fortran-ordered transpose to DataFrame so per-column .values remain contiguous, matching the layout of the DataFrame that was written (GH#22073, GH#60469).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
d222ebe to f976aa2
mroeschke (Member) approved these changes on Apr 21, 2026

Thanks @jbrockmendel
Sharl0tteIsTaken added a commit to Sharl0tteIsTaken/pandas that referenced this pull request on Apr 22, 2026
…h-origin

* upstream/main: (31 commits)
  DOC:Missing r in your (pandas-dev#65323)
  DOC: fix grammar in the .dt accessor section (pandas-dev#65325)
  REGR: restore rank() for ExtensionArrays with custom values for sorting (pandas-dev#64976)
  BUG: MultiIndex.get_loc returns scalar for unique key in non-unique index (pandas-dev#65234)
  BUG/TST: add test for _cast_pointwise_result robustness + fix some cases (pandas-dev#65318)
  BUG: fix .loc with tuple key on MultiIndex with IntervalIndex level (pandas-dev#65239)
  BUG: permit building from source with mingw (pandas-dev#64849)
  BUG: DataFrame.loc setitem with list-like value on single-column EA DataFrame (pandas-dev#65241)
  PERF: preserve block memory layout in Block.copy (GH#60469) (pandas-dev#65302)
  PERF: short-circuit sort_index(level=...) on monotonic non-MultiIndex (pandas-dev#65279)
  BUG: fix FloatingArray.astype(str) crash with distinguish_nan_and_na=True (pandas-dev#65038)
  BUG: fix to_timedelta ignoring unit for mixed round/non-round floats (pandas-dev#65170)
  BUG: DataFrame.loc preserves original index name when key is an Index (pandas-dev#65229)
  REF: continue moving freq management off DatetimeArray/TimedeltaArray (GH#24566) (pandas-dev#65285)
  REF: remove redundant BaseMaskedArray.map override (pandas-dev#65297)
  Bump github/codeql-action from 4.35.1 to 4.35.2 (pandas-dev#65310)
  Bump actions/setup-node from 6.3.0 to 6.4.0 (pandas-dev#65309)
  BUG: Fix formatters applied to wrong columns in truncated DataFrame.to_string (GH#35410) (pandas-dev#65288)
  PERF: optimize block consolidation (pandas-dev#64574)
  CLN: Replace no_default signature with False for allow_duplicates in insert and reset_index (pandas-dev#65146)
  ...
`Block.copy` calls `values.copy()` without specifying `order`. NumPy's default for `ndarray.copy` is `order="C"`, which flips the memory layout of non-C-contiguous blocks. Since BlockManager stores data transposed relative to the user-facing shape, a user-facing C-contiguous DataFrame has F-contiguous block storage. After `.copy()` the user therefore ends up with F-contiguous `.values`, and e.g. `mean(axis=1)` walks the slow stride. Passing `order="K"` preserves the original layout.

Repro from #60469 on this branch:

- `df_nan.mean(axis=1)`: 7.24 ms → 7.24 ms
- `df_nan_copy.mean(axis=1)`: 18.73 ms → 7.43 ms

ASV results

This change was previously proposed in #44871, which was closed as stale in 2022 after an ASV run showed real wide-frame arithmetic regressions (`FrameWithFrameWide.time_op_different_blocks` was 2.06× slower). A full `asv continuous` run on the current tree shows those regressions are no longer present, presumably resolved by internals refactoring over the last four years.

Current run: ~120 benchmarks improved (0.45×–0.91×), 3 apparent regressions. Re-running the 3 regressions with more repeats showed all were noise from concurrent machine activity (`BENCHMARKS NOT SIGNIFICANTLY CHANGED`).

Notable improvements (sampling):

- `index_cached_properties.IndexCache.time_engine('TimedeltaIndex')`
- `strings.Methods.time_wrap('string[pyarrow]')`
- `indexing.Setitem.time_setitem_list`
- `arithmetic.OpWithFillValue.time_frame_op_with_fill_value_no_nas`
- `sparse.Arithmetic.time_make_union`
- `multiindex_object.Unique.time_unique_dups(('Int64', <NA>))`
- `arithmetic.NumericInferOps.time_add(float64)`
- `series_methods.ToNumpy.time_to_numpy_copy`

Caveat: the ASV run had some concurrent machine activity, so per-benchmark ratios are directional, not quantitative. No 2×-style regression like the 2021 one appears; the targeted re-run of the three flagged regressions cleared them.
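
For readers who want to see the layout flip in isolation, here is a minimal NumPy-only sketch. It is not the pandas code from this PR: `user_values` and `block_values` are hypothetical stand-ins for a DataFrame's user-facing array and its transposed block storage, and the only behavior relied on is that `ndarray.copy` defaults to `order="C"` while `order="K"` keeps the source layout.

```python
import numpy as np

# Hypothetical stand-in for the user-facing (rows, columns) data:
# a C-contiguous array, as produced by a plain reshape.
user_values = np.arange(12, dtype="float64").reshape(4, 3)

# Hypothetical stand-in for block storage: the transpose of the
# user-facing array, which is F-contiguous (no data is moved).
block_values = user_values.T

# ndarray.copy() defaults to order="C", so the copy is C-contiguous
# and its transpose (the user-facing view) becomes F-contiguous.
default_copy = block_values.copy()

# order="K" matches the layout of the source as closely as possible,
# so the copy stays F-contiguous and its transpose stays C-contiguous.
layout_copy = block_values.copy(order="K")

print(block_values.flags.f_contiguous)    # True
print(default_copy.flags.f_contiguous)    # False
print(layout_copy.flags.f_contiguous)     # True
print(default_copy.T.flags.c_contiguous)  # False
print(layout_copy.T.flags.c_contiguous)   # True
```

Both copies hold the same values; only the strides differ, which is why a row-wise reduction on the user-facing view can be slower after a default-order copy.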