interp - Prefer broadcast over reindex when possible #10554
Illviljan merged 17 commits into pydata:main from
for more information, see https://pre-commit.ci
xarray/core/dataset.py
Outdated
to_broadcast = (var.copy().squeeze(),) + tuple(
    dest for index, dest in use_indexers.values()
)
variables[name] = broadcast_variables(*to_broadcast)[0]
This changes the semantics from copies to views. We'll have to manually deepcopy these vars to avoid confusing downstream errors.
broadcast_variables(*to_broadcast)[0].copy(deep=True) should do the trick, I think.
Yes, that would work. This could be a good opportunity to look for optimizations in reindex if you have the bandwidth.
Using copy(deep=True) now. I couldn't see a noticeable difference with the example above.
Last time I followed the reindex path, Dask was the bottleneck, though I'm not very familiar with those functions.
I recall reindex was about the same speed as interpolation a few years ago.
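The copy-versus-view concern raised in this thread can be illustrated with plain NumPy. This is only a sketch of the general behaviour, under the assumption that xarray's broadcasting ends up with NumPy-style broadcast views for in-memory data; the variable names and values are hypothetical and it does not reproduce the code in xarray/core/dataset.py.

```python
import numpy as np

# Stand-in for the data of a scalar-like variable (hypothetical value).
scalar_data = np.array(1.5)

# Broadcasting returns a read-only view that shares the original buffer.
view = np.broadcast_to(scalar_data, (3, 4))
shares_before_copy = np.shares_memory(view, scalar_data)
view_is_writeable = view.flags.writeable

# An explicit copy owns its memory, so later writes cannot leak back
# into the source array or fail because the view is read-only.
independent = view.copy()
shares_after_copy = np.shares_memory(independent, scalar_data)
independent[0, 0] = 99.0
```

Calling copy(deep=True) on the broadcast result, as suggested above, plays the same role as the explicit .copy() here: downstream code gets an independent, writable array.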
Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
When a variable is a scalar, it is faster to broadcast it than to reindex it. Use broadcasting in that case when doing dataset interpolation.
whats-new.rst
Main:
PR: