-
-
Notifications
You must be signed in to change notification settings - Fork 477
Split migration guide into Migration Guide + What's New in 1.0 #2528
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Split migration guide into Migration Guide + What's New in 1.0 #2528
Conversation
…-devs#2527) - Add short migration_guide.md (need-to-know checklist) - Add whats_new_1_0.md (full narrative and examples) - Update user guide index toctree to include both docs Co-authored-by: Cursor <[email protected]>
|
@OriolAbril @aloctavodia Here's a first draft. It definitely needs polish, but I'd like to get high level comments at this stage, if possible.
I just want to get a sense if this is going in the right direction before revising. Thanks! |
OriolAbril
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think overall looks good, I will try to look at the longer page tomorrow but it looks basically as the current migration guide page.
Things to look out for. Even though we use markdown based text, sphinx requires headers to decrease 1 by 1, so #### level header can't come after ## level header. We also have some URLs around linking to the migration guide and some of its sections. Some we can update but for others we might want to think a bit about which of the two we want to point to before we merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this shorter one I would order the sections by "expected impact". The first 3 should probably be renaming of plots (probably 1st), return type of plots+kwargs, and InferenceData->DataTree (not sure about the order between those two). Then either waic removal or hdi_prob->ci_prob. Then I/O which stays quite the same as before, the main impact on users will actually be having to install the dependencies.
dim/sample_dims is basically all new functionality. I think the only function that could be configured to reduce something other than chain+draw was hdi (and that was through a kwarg taken into account by xarray.apply_ufunc) and a couple plots using skip_dims. I am not sure it is worth adding to this shorter page focused on breaking changes and migration.
For several of the bullet points I would also merge them all into a general section about rcParams mentioning use of templates, rcParams class at the beginning of the code or using the argument at every function call. I would also add that as the last section (but can easily be convinced otherwise). I am not sure how much this has actually reached the user-base, but the defaults in rcParams are bound to change, and we don't consider changing them a breaking change. Anyone who feels very strongly about the defaults should use a user-level template.
Extra notes on this last point: updating the rcParam from hdi_prob to ci_prob was a breaking change but we did that a while ago in preparation. Existing arvizrc templates will already have ci_prob. We also changed the default method to compute weights quite a while ago to stacking, so this stays the same. 0.x defaults for reference: https://github.com/arviz-devs/arviz/blob/v0.x/arvizrc.template
| jupytext: | ||
| text_representation: | ||
| extension: .md | ||
| format_name: markdown |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should use myst format (quite similar) or ipynb. These should be quick to run so myst format should be perfectly fine. We might need to double check the libraries we install to build the docs if we do a zarr showcase somewhere for example.
The myst format is very similar and is natively processed by myst-nb. This will mean that cell tags and more advanced features will work properly out of the box. Otherwise we'd have to configure myst-nb to use jupytext to parse these notebooks which I don't think is worth it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the original doc was an ipynb, I was planning to convert md back to ipynb, but we could keep it in Myst/md if that's better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it matters too much for these pages. Do note the current format is not myst even though there is now format_name: myst. https://myst-nb.readthedocs.io/en/latest/authoring/text-notebooks.html has an overview, but in general you can just use jupytext to convert from one format to the other properly.
The main issue with the incorrect formatting is we were using a couple advanced rendering options from myst-nb which are no longer being taken into account. For example, the first cell in https://arviz--2528.org.readthedocs.build/en/2528/user_guide/whats_new_1_0.html where we configure xarray display options was removed from the rendered version but it is now visible
…URLs, examples - Reorder migration guide by impact (plots first, I/O last) - Shorten Dimensions section; consolidate rcParams into one section at end - Fix header hierarchy (## for sections); use MyST format for both docs - Update issue.md link to migration_guide.md; document URL decisions in plan - Add before/after examples (I/O, plot kwargs); explicit ic_compare_method in compare example Co-authored-by: Cursor <[email protected]>
|
@OriolAbril I made revisions based on your comments. |
Co-authored-by: Cursor <[email protected]>
| 2. **Compare default (`ic_compare_method`):** The method used to compute model weights when you call {func}`~arviz.compare` (and related behavior) is controlled by `ic_compare_method`. The default is now **stacking** (weights chosen to optimize predictive performance) instead of the previous default (e.g. pseudo-BMA). | ||
|
|
||
| **What's affected:** Code that uses {func}`~arviz.compare` (or depends on the default method for combining/ranking models). Reported weights and rankings can change even if you didn't call a method explicitly. | ||
|
|
||
| **What to do:** If you're fine with stacking, no change needed. To use a different weighting method, call `compare(..., ic_compare_method="pseudo-bma")` (or another supported value) or set the corresponding rcParam. See {ref}`whats_new_1_0` for details. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I already mentioned this is not a 1.0 change before but I double checked and found it in the changelog: https://github.com/arviz-devs/arviz/blob/main/CHANGELOG.md#v0110-2021-dec-17. It is from version 0.11 in late 2021. Anything that has to do with ic_compare_method should go, it makes no sense to include it in 1.0 related things.
|
|
||
| ## Plot names and signatures | ||
|
|
||
| Some plot functions were renamed or split; others kept the same name but have updated arguments and defaults. Use the following as a quick reference; see {ref}`whats_new_1_0` for the full inventory and new/removed functions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add targets at the headers of the whats_new_1_0 page so we can link to the relevant section directly from here when using {ref}`xyz`
|
|
||
| - **Constructing / loading:** Converters and I/O return a DataTree; use it like you used InferenceData. Variable-level access is unchanged: `dt["posterior"]["theta"]` is still a DataArray. | ||
| - **When you need a Dataset from a group:** Use `dt["posterior"].dataset` (view) or `dt["posterior"].to_dataset()` (mutable copy). If you only use `dt["group"]["var"]`, no change. | ||
| - **Method replacements:** Use the table below. For `.extend`, use {meth}`xarray.DataTree.update` (order is reversed for the default left-like merge). For `.map` over some groups, use `.match(...)` or `.filter(...)` with `.map_over_datasets(...)`, then `.update()` to merge back into the full tree. For a list of top-level group names, use `list(dt.children)`; `dt.groups` exists but returns path-style strings. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would probably add another bullet point about going over xarray docs and asking a question on arviz github discussions. I added before/after examples for what I believe are common usage patterns it doesn't aim to be exhaustive
| list(dt.children) | ||
|
|
||
| # Need a Dataset? Use .dataset or .to_dataset() | ||
| dt["posterior"].to_dataset() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| dt["posterior"].to_dataset() | |
| dt["posterior"].dataset |
a lot of people (and LLMs even more) copy things first then ask questions. I think showing .dataset as default is better. If they see it is not enough of need something else they'll see the .to_dataset() option.
| Three things changed for credible intervals and related defaults: | ||
|
|
||
| 1. **Interval type (`ci_kind`):** Default is now equal-tailed interval (ETI) instead of highest-density interval (HDI). Many functions accept `ci_kind` (e.g. {func}`~arviz.summary`, {func}`~arviz_plots.plot_forest`). | ||
| 2. **Parameter name (`hdi_prob` → `ci_prob`):** Renamed for consistency; the old name is removed in 1.0. Replace `hdi_prob` with `ci_prob` in code and in any rcParam keys. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see the commend I added about the ci_prob rcParam ended up at the bottom of the doc. Similarly to the weight computation method, this is isn't as old but it is also over 1 year old and multiple minor versions old. Functions that used to have hdi_prob as an argument will now have ci_prob instead, but no change in rcParams
| 2. **Parameter name (`hdi_prob` → `ci_prob`):** Renamed for consistency; the old name is removed in 1.0. Replace `hdi_prob` with `ci_prob` in code and in any rcParam keys. | ||
| 3. **Default probability (0.94 → 0.89):** The default interval probability is now 0.89. | ||
|
|
||
| To override any of these (or other defaults), use a per-call argument, set `az.rcParams` at the start of your code, or use a user or project config file. See **rcParams and defaults** at the end of this page. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this would also benefit from having the "rcParams and defaults" be a cross-reference to that section
| You can control ArviZ behavior in three ways: | ||
|
|
||
| - **User-level or project templates:** Put options in an `arvizrc` file (e.g. in your home directory or project root). See the template for available keys. | ||
| - **`rcParams` at the start of your code:** Set `az.rcParams["group.key"] = value` once after importing ArviZ; it applies for the rest of the session. | ||
| - **Per-call arguments:** Pass the option (e.g. `ci_kind="hdi"`, `ci_prob=0.94`) to each function that supports it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to add a link to {ref}`arviz_base:arviz_configuration` which will point to https://python.arviz.org/projects/base/en/latest/api/index.html#configuration, probably in multiple places, as extra details about where the arvizrc file is searched for, to compare the old template with the new docs that include currently available keys and defaults.
|
|
||
| We do not consider changing an rcParam *default* to be a breaking change. Defaults may change in minor releases. If you care strongly about particular defaults, use a user-level template or set `rcParams` at the start of your script. | ||
|
|
||
| **Note:** The `hdi_prob` → `ci_prob` *rename* was a breaking change but was done in preparation; existing arvizrc templates already use `ci_prob`. The default method for model-comparison weights was changed to stacking a while ago, so that is unchanged in 1.0. For 0.x default values, see [arvizrc.template (v0.x)](https://github.com/arviz-devs/arviz/blob/v0.x/arvizrc.template). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove this, at most leave the link to the arvizrc template along with the link I mention above to help people with complex rcparam templates port them over.
Closes #2527
migration_guide.md(need-to-know checklist)whats_new_1_0.md(full narrative and examples)Made with Cursor (and a whole lotta revision)