Skip to content

Fix: coerce unknown types to O dtype#10339

Open
brynpickering wants to merge 17 commits intopydata:mainfrom
brynpickering:fix-obj-dtype-infer
Open

Fix: coerce unknown types to O dtype#10339
brynpickering wants to merge 17 commits intopydata:mainfrom
brynpickering:fix-obj-dtype-infer

Conversation

@brynpickering
Copy link

@brynpickering brynpickering commented May 20, 2025

Added a check for Python objects that neither have numpy dtype dtype attributes nor are one of the builtin types so that type promotion succeeds and defaults to object dtype.

This deals with cases where user-defined instances are in the array which would otherwise trip up numpy.result_type.

I did think about placing this in xarray.core.dtypes.preprocess_types but chose to place it closest to where it causes issues (namely, on calling xp.result_type).

@max-sixty
Copy link
Collaborator

do you think we could add a quick comment in the code describing what it's doing, for the less informed of us?

(does anyone know this better?)

@brynpickering
Copy link
Author

Sure, done.

Another options for this would be to check all the args passed to the compat.array_api_compat.result_type method and if any of them are not a numpy-compatible object or a builtin type then just return np.object_ directly. Passing the args to np.result_type is unnecessary as any object dtype in the list will automatically lead to an object dtype being returned (as it's the lowest common denominator).

@dcherian dcherian requested a review from keewis May 27, 2025 17:51
@brynpickering
Copy link
Author

@keewis is there any timeline for this to be reviewed? We're currently stuck with xarray v2024.2.0 as a dependency until this is resolved which is starting to cause issues when used in some working environments.

Copy link
Collaborator

@keewis keewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, this had completely dropped off my radar.

If this is about scalars, note that you can already wrap your custom object in a 0D array of object dtype to work around this (so you don't have to wait on us)

@shoyer, do you have an opinion on this?

Copy link
Member

@shoyer shoyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get this in after applying @keewis's suggested clean-up?

Co-authored-by: Justus Magin <keewis@users.noreply.github.com>
@brynpickering
Copy link
Author

@keewis @shoyer thanks both. I've committed the suggestion as-is.

@brynpickering
Copy link
Author

brynpickering commented Nov 12, 2025

Aha @keewis I see we're working on fixing this simultaneously!

With 788eedb the MWE in the originating issue now passes all checks

@keewis
Copy link
Collaborator

keewis commented Nov 12, 2025

yeah, I thought I'd help you fix the untested parts of what I suggested

@jsignell
Copy link
Member

@keewis is looks like this one is good to merge. Right? Just needs a note in "what's new"

@brynpickering
Copy link
Author

brynpickering commented Feb 11, 2026

@jsignell this may have been superseded by #10423? It certainly seems to be at odds with #10423 since a bunch of the extension array tests are now failing. Let me see if removing the implementation here introduced here but keeping the tests (and checking against the MWE I raised in the issue) works as expected.

EDIT: the answer to that is no, it doesn't work as expected. The implementation in #10423 means that it won't promote arbitrary python objects to object dtype because it wants to ensure that objects like <NA> can be passed through without incident. With this PR's implementation, the extension arrays and <NA>s are promoted to object dtype inside array_api_compat, which is exactly what we don't want to do.

The simplest option would be to use a try/except in array_api_compat.result_type which returns an object dtype on exception. I.e., if all else fails, return an object.

@keewis
Copy link
Collaborator

keewis commented Feb 11, 2026

the new tests pass with the code on main, so all we need to do here is to decide whether to move the logic from #10423 into result_type (as you proposed here) and what to do with the two new functions and builtin_types.

@brynpickering
Copy link
Author

@keewis I've removed the two methods previously added in this PR, in favour of the TypeError fallback. I've then removed try/except statements where they would no longer reach a TypeError since the method being called handles that itself. I don't think it makes sense to try merging the two object type promotion methods since one is there to override default numpy type promotion and the other is a fallback for types that numpy would otherwise raise an error for on attempting to promote.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Cannot apply sum to custom object arrays with min_count=1 if n_dims <= sum_dims

7 participants