Add support for DataTree to xarray.merge()#10790
Conversation
mathause
left a comment
There was a problem hiding this comment.
Looking forward to this! Should the path of the node be added to the error message if the merge fails?
tree1 = xr.DataTree.from_dict({"/a/b": 1})
tree2 = xr.DataTree.from_dict({"/a/b": 2})
xr.merge([tree1, tree2])e.g. using
xarray/xarray/core/datatree_mapping.py
Line 141 in 2b947e9
Great suggestion, done! |
mathause
left a comment
There was a problem hiding this comment.
Nice! With this we can replace our custom datatree-merge function (which uses map_over_datasets) - I just checked and our tests still pass.
Thanks for checking! Let me know if you have any suggestions for further test coverage. |
xarray/structure/merge.py
Outdated
| def depth(kv): | ||
| return kv[0].count("/") |
There was a problem hiding this comment.
Maybe you could use
xarray/xarray/core/treenode.py
Line 473 in ad51404
There was a problem hiding this comment.
Tree.level turns out to be the right property. depth refers (somewhat confusing, IMO) to the number of levels below the given node.
| # Merge datasets, including inherited indexes to ensure alignment. | ||
| datasets = [node.dataset for node in nodes] | ||
| with add_path_context_to_errors(key): | ||
| merge_result = merge_core( | ||
| datasets, | ||
| compat=compat, | ||
| join=join, | ||
| combine_attrs=combine_attrs, | ||
| ) | ||
| # Remove inherited coordinates/indexes/dimensions. | ||
| for var_name in list(merge_result.coord_names): | ||
| if not any(var_name in node._coord_variables for node in nodes): | ||
| del merge_result.variables[var_name] | ||
| merge_result.coord_names.remove(var_name) | ||
| for index_name in list(merge_result.indexes): | ||
| if not any(index_name in node._node_indexes for node in nodes): | ||
| del merge_result.indexes[index_name] | ||
| for dim in list(merge_result.dims): | ||
| if not any(dim in node._node_dims for node in nodes): | ||
| del merge_result.dims[dim] |
There was a problem hiding this comment.
Can you explain / add comments explaining why this can't be done by just using node.to_dataset(inherit=False) then merging the resulting datasets?
There was a problem hiding this comment.
I have a comment about this above already: Merge datasets, including inherited indexes to ensure alignment
whats-new.rst