Skip to content

Enable multiplying NDCube and NDData#840

Merged
DanRyanIrish merged 29 commits intosunpy:mainfrom
PCJY:mul
May 29, 2025
Merged

Enable multiplying NDCube and NDData#840
DanRyanIrish merged 29 commits intosunpy:mainfrom
PCJY:mul

Conversation

@PCJY
Copy link
Contributor

@PCJY PCJY commented May 13, 2025

PR Description

Enable NDCube to be multiplied with wcs-less NDData.

To Do

Adapting NDCube.add to support addition and multiplication.

  • Rename NDCube.add to NDCube._operate_arithmetic and add an operation arg. operation should be a string of either "add" or "multiply".
  • Create a new NDCube.add method that simply returns self._operate_arithmetic with operation equal to 'add'.
  • Rerun tests to ensure everything still works as expected.
  • Include if statements in NDCube._operate_arithmetic to check whether operation == 'add'. (The 'multiply' case can be supported later.) There are three places where if statements are needed:
    • Managing the unit
    • Performing addition of the data
    • Calculating the uncertainties
    • Don't forget that these may need to be done in the NDData and Quantity sections of the code.
  • Rerun tests to make sure they all still pass.
  • Add an operation arg to NDCube._combine_uncertainty, i.e. def _combine_uncertainty(self, value, result_data, operation). Replace np.add with operation in the code of this method. Make sure you add np.add where you call self._combine_uncertainty in NDCube._operation_arithmetic.
  • Rerun tests to make sure everything is still working as expected.

Adding support for multiplication.

  • Add elif statements to NDCube._operation_arithmetic to support multiplication. Remember, there are three scenarios where multiplication differs from addition:
    • Handling the unit
    • Calculating the new data (+-> *)
    • Calculating the new uncertainties (np.add -> np.multiply)
    • Again, remember changes might be needed in the Quantity section of the method, as well as the NDData section.
  • Make the NDCube.multiply method return self._operate_arithmetic with operation set to 'multiply'.
  • Make NDCube.__mul__ call NDCube.multiply just like NDCube.__add__ calls NDCube.add.
  • Write a simple test for NDCube.__mul__.
  • Add a changelog file.

@DanRyanIrish DanRyanIrish added this to the 2.4.0 milestone May 13, 2025
ndcube/ndcube.py Outdated
# return the new NDCube instance
return self._new_instance(**kwargs)

def add(self, value, handle_mask = np.logical_and):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method needs a proper docstring. For an example, see the NDCubeBase.reproject_to method's docstring.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do the _operate_arithmetic and the multiply method also need docstring?

PCJY and others added 2 commits May 13, 2025 14:27
Co-authored-by: DanRyanIrish <ryand5@tcd.ie>
Co-authored-by: DanRyanIrish <ryand5@tcd.ie>
@DanRyanIrish
Copy link
Member

DanRyanIrish commented May 13, 2025

Hi @PCJY. On reflection, I think it would be simpler if the _operate_arithmetic only handled the NDData case. This is because there isn't a lot of overlap in the code with the Quantity and other cases. So this part of the code can remain the add() method like this:

def _arithmetic_operate_with_nddata(self, operation, value, handle_mask):
    # Do stuff

def add(self, value, handle_mask=np.logical_and):
    if isinstance(value, NDData):
        kwargs = self._arithmetic_operate_with_nddata("add", value, handle_mask)
    elif hasattr(value, 'unit'):
        # Put non-NDData code from self.add here...

    # return the new NDCube instance
    return self._new_instance(**kwargs)

What do you think?

@PCJY
Copy link
Contributor Author

PCJY commented May 13, 2025

Hi @PCJY. On reflection, I think it would be simpler if the _operate_arithmetic only handled the NDData case. This is because there isn't a lot of overlap in the code with the Quantity and other cases. So this part of the code can remain the add() method like this:

def _arithmetic_operate_with_nddata(self, operation, value, handle_mask):
    # Do stuff

def add(self, value, handle_mask=np.logical_and):
    if isinstance(value, NDData):
        kwargs = self._arithmetic_operate_with_nddata("add", value, handle_mask)
    elif hasattr(value, 'unit'):
        # Put non-NDData code from self.add here...

    # return the new NDCube instance
    return self._new_instance(**kwargs)

What do you think?

Do you mean, _operate_arithmetic() method handles the NDData case, and then, the add/mul method handles the Quantity case?

@DanRyanIrish
Copy link
Member

DanRyanIrish commented May 13, 2025

Another idea for making this simpler, is to create a default handle_mask function. See the code below.

def _arithmetic_handle_mask(self, self_mask, value_mask):
    if self_mask is None and value_mask is None:
        return None
    elif self_mask is None:
        return value_mask
    elif value_mask is None:
        return self_mask
    else:
        return np.logical_and(self_mask, value_mask)

Then in NDCube._arithmetic_operate_with_nddata/NDCube._arithmetic_operate (whatever we call it) we can have the default handle_mask=None and a line of code such as:

if handle_mask is None:
    handle_mask = self._arithmetic_handle_mask
kwargs["mask"] = handle_mask(self.mask, value.mask)

This would mean that if the user provides a handle_mask function, it is always used. Whereas at the moment, it is only sometimes used, which is confusing.

This does subtlely change the behaviour, so it might make the tests break. But that's OK if we agree that the new behaviour is better.

@DanRyanIrish
Copy link
Member

Do you mean, _operate_arithmetic() method handles the NDData case, and then, the add/mul method handles the Quantity case?

Yes. But add/mul calls also _operate_arithmetic if value is an NDData instance. So from the user-perspective it's the same as now. But how we organise the code is simpler and easier to maintain.

@DanRyanIrish
Copy link
Member

Apologies for the delay in getting these comments to you @PCJY. I wrote them but realised I didn't actually post them to the PR!

@DanRyanIrish
Copy link
Member

Also, we need a changelog file this PR too.

@PCJY
Copy link
Contributor Author

PCJY commented May 14, 2025

Hi @DanRyanIrish , here is my summary for all the methods we have now for scaling and addition for NDCube and wcs-less NDData. Please let me know if you have any feedback on this, thanks.

1, The ndcube.fill_masked() method
This method can be used by the users to specify the masked values, and to apply (or not) on the data and uncertainty, etc.

2, The default ndcube._arithmetic_handle_mask() method
This method is used when handle_mask is not provided by users, as the default behavior of handling masks.
The method essentially combines the mask using logical OR when both objects have masks.

3, The ndcube._combine_uncertainty() method

4, The ndcube._arithmetic_operate_with_nddata() method
Raises a TypeError if the wcs attribute of the 'value' object is not None.
Performs addition or multiplication when the 'value' object is an NDData instance, handles masks, units, and uncertainty (using the _combine_uncertainty() method).
Set to private and should only get called by the add method instead of by users (although they can but not encouraged to).

5, The ndcube.add() method
Performs addition for the scenarios:
when the 'value' object is an NDData instance (wcs-less, has mask, unit and uncertainty), (calls the ndcube._arithmetic_operate_with_nddata() method)
when the 'value' object is a Quantity object (has unit and data).

6, The ndcube.__add__() method, the + operator

Previously it was necessary because the handle_mask and operation_ignore_mask arguments must be set by users, in order to deal with masked elements before using the uncertainty propagation method.

Now that we have many methods to deal with masks, we assume when users come to arithmetic operations, they have already got the masked data. We would not specify the masks' impact on the data inside the arithmetic operation methods.

Therefore, this method just calls the ndcube.add() method.

7, The ndcube.multiply() method
Performs multiplication for the scenarios:
when the 'value' object is an NDData instance (wcs-less, has mask, unit and uncertainty), (calls the ndcube._arithmetic_operate_with_nddata() method)
when the 'value' object is a Quantity object (has unit and data).

8, The ndcube.__mul__() method, the * operator
This method calls the ndcube.multiply() method. (for the same reasons for the ndcube.__add__() method)

PCJY and others added 2 commits May 14, 2025 16:29
Co-authored-by: DanRyanIrish <ryand5@tcd.ie>
Co-authored-by: DanRyanIrish <ryand5@tcd.ie>
@DanRyanIrish
Copy link
Member

Hi @PCJY. Thanks for the summary of our progress. I concur that the next steps are to change the __add__ method so that it only calls the add method (as you've already done to __mul__ and multiply), and then write the tests for the multiply method.

Keep up the good work!

@DanRyanIrish
Copy link
Member

@PCJY: Here is a suggestion to discuss that I have reflected on as the project has evolved. The add and multiply methods do not affect how the data or uncertainties are calculated. Instead, their justification is now solely to handle the case where value is an NDData object with a mask and the user does not want the new mask to be calculated via np.logical_or. This is likely to be a small subset of the cases. Because the +/* operators will usually be used, users are just as likely to be far more familiar with them than the add/multiple methods, and may choose to manually change the mask themselves after the operation. This only actually requires one extra line of code:

new_cube = cube1 + nddata1
new_cube.mask = np.logical_and(cube1.mask, nddata1.mask)

Therefore, the maintenance, documentation, and publicisation of the add and multiply methods are a lot of work to save users one line of code in a small subset of cases, and many users may choose not to use them anyway.

Therefore, should we consider removing the add and multiply methods from the API? All the code you've written can instead live under the __add__ and __mul__ methods. This would reduce how much documentation you have to write, and also simplify the tutorials. Furthermore, in that tutorial, we can include an example like above, where the user alters the mask after the operation. What are your thoughts?

@PCJY
Copy link
Contributor Author

PCJY commented May 16, 2025

@PCJY: Here is a suggestion to discuss that I have reflected on as the project has evolved. The add and multiply methods do not affect how the data or uncertainties are calculated. Instead, their justification is now solely to handle the case where value is an NDData object with a mask and the user does not want the new mask to be calculated via np.logical_or. This is likely to be a small subset of the cases. Because the +/* operators will usually be used, users are just as likely to be far more familiar with them than the add/multiple methods, and may choose to manually change the mask themselves after the operation. This only actually requires one extra line of code:

new_cube = cube1 + nddata1
new_cube.mask = np.logical_and(cube1.mask, nddata1.mask)

Therefore, the maintenance, documentation, and publicisation of the add and multiply methods are a lot of work to save users one line of code in a small subset of cases, and many users may choose not to use them anyway.

Therefore, should we consider removing the add and multiply methods from the API? All the code you've written can instead live under the __add__ and __mul__ methods. This would reduce how much documentation you have to write, and also simplify the tutorials. Furthermore, in that tutorial, we can include an example like above, where the user alters the mask after the operation. What are your thoughts?

Hi @DanRyanIrish , thanks for sharing your new ideas. I am thinking, then what about the code components in the add and multiply methods that handle the uncertainty and units? Without the add and multiply methods, how can the users combine these two attributes when it comes to the arithmetic operations for NDCube and wcs-less NDData? Please let me know your thoughts on this and what I might be missing, thank you.

@PCJY
Copy link
Contributor Author

PCJY commented May 16, 2025

@PCJY: Here is a suggestion to discuss that I have reflected on as the project has evolved. The add and multiply methods do not affect how the data or uncertainties are calculated. Instead, their justification is now solely to handle the case where value is an NDData object with a mask and the user does not want the new mask to be calculated via np.logical_or. This is likely to be a small subset of the cases. Because the +/* operators will usually be used, users are just as likely to be far more familiar with them than the add/multiple methods, and may choose to manually change the mask themselves after the operation. This only actually requires one extra line of code:

new_cube = cube1 + nddata1
new_cube.mask = np.logical_and(cube1.mask, nddata1.mask)

Therefore, the maintenance, documentation, and publicisation of the add and multiply methods are a lot of work to save users one line of code in a small subset of cases, and many users may choose not to use them anyway.
Therefore, should we consider removing the add and multiply methods from the API? All the code you've written can instead live under the __add__ and __mul__ methods. This would reduce how much documentation you have to write, and also simplify the tutorials. Furthermore, in that tutorial, we can include an example like above, where the user alters the mask after the operation. What are your thoughts?

Hi @DanRyanIrish , thanks for sharing your new ideas. I am thinking, then what about the code components in the add and multiply methods that handle the uncertainty and units? Without the add and multiply methods, how can the users combine these two attributes when it comes to the arithmetic operations for NDCube and wcs-less NDData? Please let me know your thoughts on this and what I might be missing, thank you.

Or is it because, all the unit handling and stuffs are handled by the _arithmetic_operate_with_nddata() method, and by changing the add and multiply methods to __add__ and __mul__ methods, it is just removing redundant wrappers but not contents?


But when adding NDCube with wcs-less NDData, if the users use the + and * operator, they would automatically use the _arithmetic_operate_with_nddata() method, without needing to add the one extra line new_cube.mask = np.logical_and(cube1.mask, nddata1.mask) ?

@DanRyanIrish
Copy link
Member

Or is it because, all the unit handling and stuff is handled by the _arithmetic_operate_with_nddata() method, and by changing the add and multiply methods to __add__ and __mul__ methods, it is just removing redundant wrappers but not contents?

This is exactly right.

But when adding NDCube with wcs-less NDData, if the users use the + and * operator, they would automatically use the _arithmetic_operate_with_nddata() method, without needing to add the one extra line new_cube.mask = np.logical_and(cube1.mask, nddata1.mask) ?

Yes, that's right. Using + or * operators to combine an NDCube with a WCS-less NDData will handle the masks according to your _arithmetic_handle_mask method. The extra line explicitly setting the mask is only needed if the user wants to over-ride this.

@PCJY
Copy link
Contributor Author

PCJY commented May 21, 2025

Or is it because, all the unit handling and stuff is handled by the _arithmetic_operate_with_nddata() method, and by changing the add and multiply methods to __add__ and __mul__ methods, it is just removing redundant wrappers but not contents?

This is exactly right.

But when adding NDCube with wcs-less NDData, if the users use the + and * operator, they would automatically use the _arithmetic_operate_with_nddata() method, without needing to add the one extra line new_cube.mask = np.logical_and(cube1.mask, nddata1.mask) ?

Yes, that's right. Using + or * operators to combine an NDCube with a WCS-less NDData will handle the masks according to your _arithmetic_handle_mask method. The extra line explicitly setting the mask is only needed if the user wants to over-ride this.

Then what should we do about the handle_mask argument? The operator cannot have an argument, but the argument is present in the add method, which calls the _arithmetic_operate_with_nddata, which calls the _arithmetic_handle_mask method when handle_mask is None.

@DanRyanIrish
Copy link
Member

Then what should we do about the handle_mask argument? The operator cannot have an argument, but the argument is present in the add method, which calls the _arithmetic_operate_with_nddata, which calls the _arithmetic_handle_mask method when handle_mask is None.

We can simply delete the handle_mask kwarg. Then the mask will always be handled by _arithmetic_handle_mask. If the user wants something different, they can overwrite the mask after the operation as we discussed before, e.g.

new_cube.mask = np.logical_and(cube1.mask, nddata1.mask)

# Construct expected cube
expected_data = ndc.data + value.data
expected_uncertainty = ndc.uncertainty.propagate(
expected_uncertainty = ndc.uncertainty.propagate( # is this the correct way to test uncertainty? no need to test propagate
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is OK in this case because, as you say, we aren't testing propagate, and if we are using multiple parameterizations, we might not be able to hard-code the expected result as it may be different each time. In this case, though, we only have one test case, so strictly speaking, it's better practice to explicitly set the expected value. But I don't think it's important to change here.

)
],
indirect=("ndc",))
def test_cube_arithmetic_multiply_ndcube_nddata(ndc, value):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would probably be better to simplify this test by including the expected_cube in the parameterization. Then the test only has to calculate the output_cube and compare it to the expected_cube.

@DanRyanIrish DanRyanIrish mentioned this pull request May 27, 2025
2 tasks
@PCJY
Copy link
Contributor Author

PCJY commented May 28, 2025

Hi @DanRyanIrish , I think I have fixed the test for __multiply__. When you have a moment, could you please review it? I’d appreciate any feedback you might have, thank you.

Copy link
Member

@DanRyanIrish DanRyanIrish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. Just one suggestion regarding where the result cubes should be defined.

Comment on lines +735 to +757
@pytest.fixture
def ndcube_2d_ln_lt_res_ndc_mask_unc_unit(wcs_2d_lt_ln):
data_cube = np.array([[0, 1, 2], [3, 4, 5]])
return NDCube(data_cube, wcs=wcs_2d_lt_ln)


@pytest.fixture
def ndcube_2d_ln_lt_res_nddata_mask_unc_unit(wcs_2d_lt_ln):
unit = u.ct
data_cube = np.array([[0, 1, 2], [3, 4, 5]])
uncertainty = astropy.nddata.StdDevUncertainty(np.ones((2, 3))*0.1)
mask = np.ones((2, 3), dtype=bool)
return NDCube(data_cube, wcs=wcs_2d_lt_ln, uncertainty=uncertainty, mask=mask, unit=unit)


@pytest.fixture
def ndcube_2d_ln_lt_res_both_mask_unc_unit(wcs_2d_lt_ln):
unit = u.ct**2
data_cube = np.array([[0, 1, 2], [3, 4, 5]])
uncertainty = astropy.nddata.StdDevUncertainty(np.array([[0.1118034, 0.1118034, 0.1118034],
[0.1118034, 0.1118034, 0.1118034]]))
mask = np.ones((2, 3), dtype=bool)
return NDCube(data_cube, wcs=wcs_2d_lt_ln, uncertainty=uncertainty, mask=mask, unit=unit)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixtures are typically defined in conftest so they can be used as input for multiple tests. As these are the expected results of specific tests, it would be better to move them to the parameterisation of the relevant tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These still need to be deleted.

@DanRyanIrish DanRyanIrish merged commit 84a66c9 into sunpy:main May 29, 2025
19 of 20 checks passed
@DanRyanIrish
Copy link
Member

Congratulations and thank you, @PCJY , for this PR 🥳

@PCJY
Copy link
Contributor Author

PCJY commented May 30, 2025

Congratulations and thank you, @PCJY , for this PR 🥳

Thank you for all the kind help and support!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants