Commit d91b322

Add migration guides and tutorial
1 parent d88f7fe commit d91b322

3 files changed

+324
-0
lines changed

spec/2025.12/index.rst

Lines changed: 7 additions & 0 deletions
@@ -30,6 +30,13 @@ Contents
    verification_test_suite
    benchmark_suite
 
+.. toctree::
+   :caption: Guides and Tutorials
+   :maxdepth: 1
+
+   migration_guide
+   tutorial_basic
+
 .. toctree::
    :caption: Other
    :maxdepth: 1

spec/2025.12/migration_guide.md

Lines changed: 205 additions & 0 deletions
@@ -0,0 +1,205 @@
1+
(migration-guide)=
2+
3+
# Migration Guide
4+
5+
This page is meant to help you migrate your codebase to an Array API compliant
6+
implementation. The guide is divided into two parts and, depending on your
7+
exact use-case, you should look thoroughly into at least one of them.
8+
9+
The first part is dedicated to {ref}`array-producers`. If your library
10+
mimics e.g. NumPy's or Dask's functionality, then there you will find
11+
additional instructions and guidance on how to ensure downstream users can
12+
easily pick your solution as an array provider for their system/algorithm.
13+
14+
The second part delves into details for Array API compatibility for
15+
{ref}`array-consumers`. This pertains to any software that performs
16+
multidimensional array manipulation in Python, such as: scikit-learn, SciPy,
17+
or statsmodels. If your software relies on a certain array producing library,
18+
such as NumPy or JAX, then here you can learn how to make it library agnostic
19+
and swap between them with far less friction.
20+
21+
## Ecosystem
22+
23+
Apart from the documented standard, the Array API ecosystem also provides
24+
a set of tools and packages to help you with the migration process:
25+
26+
27+
(array-api-compat)=
28+
29+
### Array API Compat
30+
31+
GitHub: [array-api-compat](https://github.com/data-apis/array-api-compat)
32+
33+
Although NumPy, Dask, CuPy, and PyTorch support the Array API Standard, there
34+
are still some corner cases where their behavior diverges from the standard.
35+
`array-api-compat` provides a compatibility layer to cover these cases as well.
36+
This is also accompanied by a few utility functions for easier introspection
37+
into array objects.
38+
39+
40+
(array-api-strict)=
41+
42+
### Array API Strict
43+
44+
GitHub: [array-api-strict](https://github.com/data-apis/array-api-strict)
45+
46+
`array-api-strict` is a library that provides a strict and minimal
47+
implementation of the Array API Standard. It is designed to be used as
48+
a reference implementation for testing and development purposes. By comparing
49+
your API calls with their `array-api-strict` counterparts, you can ensure that your
50+
library is fully compliant with the standard and can serve as a reliable
51+
reference for other developers in the ecosystem.
52+
53+
54+
(array-api-tests)=
55+
56+
### Array API Tests
57+
58+
GitHub: [array-api-tests](https://github.com/data-apis/array-api-tests)
59+
60+
`array-api-tests` is a collection of tests that can be used to verify the
61+
compliance of your library with the Array API Standard. It includes tests
62+
for array producers, covering a wide range of functionalities and use cases.
63+
By running these tests, you can ensure that your library adheres to the
64+
standard and can be used with compatible array consumer libraries.
65+
66+
67+
(array-api-extra)=
68+
69+
### Array API Extra
70+
71+
GitHub: [array-api-extra](https://github.com/data-apis/array-api-extra)
72+
73+
`array-api-extra` is a collection of additional utilities and tools that are
74+
missing from the Array API Standard but can be useful for compliant array
75+
consumers. It includes additional array manipulation and statistical functions.
76+
It is already used by SciPy and scikit-learn.
77+
78+
The sections below mention when and how to use them.
79+
80+
81+
(array-producers)=
82+
83+
## Array Producers
84+
85+
For array producers, the central task during the development/migration process
86+
is aligning the user-facing API with the Array API Standard.
87+
88+
The complete API of the standard is documented on the
89+
[API specification](https://data-apis.org/array-api/latest/API_specification/index.html)
90+
page.
91+
92+
There, each function, constant, and object is described with details
93+
on parameters, return values, and special cases.
94+
95+
### Testing against Array API
96+
97+
There are two main ways to test your API for compliance: either using the
98+
`array-api-tests` suite or testing your API manually against the `array-api-strict`
99+
reference implementation.
100+
101+
#### Array API Test suite (Recommended)
102+
103+
{ref}`array-api-tests` is a test suite which verifies that your API
104+
adheres to the standard. For each function or method it confirms
105+
it's importable, verifies the signature, generates multiple test
106+
cases with the `hypothesis` package, and runs assertions on the outputs.
107+
108+
The setup details are described in the GitHub repository, so here we
109+
cover only the minimal workflow:
110+
111+
1. Install your package, for example in editable mode.
112+
2. Clone `array-api-tests`, and set the `ARRAY_API_TESTS_MODULE` environment
113+
variable to your package import name.
114+
3. Inside the `array-api-tests` directory run the `pytest` command. There are
115+
multiple useful options delivered by the test suite, a few worth mentioning:
116+
- `--max-examples=2` - the maximum number of test cases that hypothesis
117+
generates. This allows you to balance between execution time of the test
118+
suite and thoroughness of the testing.
119+
- With `--xfails-file` option you can describe which tests are expected to
120+
fail - it's impossible to get the whole API perfectly implemented on a
121+
first try, so tracking what still fails gives you more control over the
122+
state of your API.
123+
- `-o xfail_strict=<bool>` is often used with the previous one. If a test
124+
expected to fail actually passes (`XPASS`) then you can decide whether
125+
to ignore that fact or raise it as an error.
126+
- `--skips-file` for skipping tests. At times some failing tests might stall
127+
the execution of the test suite - in that case the most convenient
128+
option is to skip these for the time being.
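The workflow above can be sketched as a small Python helper that assembles the environment and the `pytest` command line. This is only an illustration: the helper name `build_test_run`, the package name `my_array_lib`, and the file name `xfails.txt` are hypothetical, and only the options mentioned in the list are used.

```python
import os
import shlex


def build_test_run(module_name, max_examples=None, xfails_file=None,
                   xfail_strict=None, skips_file=None):
    """Return (env, args) for running array-api-tests against `module_name`."""
    # The test suite discovers the library under test through this variable.
    env = dict(os.environ, ARRAY_API_TESTS_MODULE=module_name)
    args = ["pytest"]
    if max_examples is not None:
        args.append(f"--max-examples={max_examples}")
    if xfails_file is not None:
        args += ["--xfails-file", xfails_file]
    if xfail_strict is not None:
        args += ["-o", f"xfail_strict={'true' if xfail_strict else 'false'}"]
    if skips_file is not None:
        args += ["--skips-file", skips_file]
    return env, args


# Hypothetical package and file names, for illustration only.
env, args = build_test_run(
    "my_array_lib", max_examples=2, xfails_file="xfails.txt", xfail_strict=True
)
print(shlex.join(args))
# To actually run the suite from inside the cloned repository, something like
# subprocess.run(args, env=env, cwd="array-api-tests") could be used.
```

Keeping the invocation in one place makes it easier to reuse the same options locally and in CI.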
129+
130+
We strongly advise you to embed this setup in your CI as well. This will allow
131+
you to monitor the coverage live, and make sure new changes don't break existing
132+
API. For reference, here's a [NumPy Array API Tests CI setup](https://github.com/numpy/numpy/blob/581d10f43b539a189a2d37856e5130464de9e5f6/.github/workflows/linux.yml#L296).
133+
134+
135+
#### Array API Strict
136+
137+
A simpler, and more manual, way of testing the Array API coverage is to
138+
run your API calls alongside the {ref}`array-api-strict` Python implementation.
139+
140+
This way you can ensure the outputs coming from your API match the minimal
141+
reference implementation, but bear in mind that you need to write the test cases
141+
yourself, so you also need to take the edge cases into account.
143+
144+
145+
(array-consumers)=
146+
147+
## Array Consumers
148+
149+
For array consumers the main premise is to keep in mind that your **array
149+
manipulation operations should not lock you into a particular array producing
151+
library**. For instance, if you use NumPy for arrays, then your code could
152+
contain:
153+
154+
```python
import numpy as np

# ...
b = np.full(shape, val, dtype=dtype) @ a
c = np.mean(a, axis=0)
return np.dot(c, b)
```
162+
163+
The first step should be as simple as assigning the `np` namespace to a dedicated
164+
namespace variable - the convention in the ecosystem is to name it `xp`. Then
165+
it is vital to make sure that each method and function call is something that the
166+
Array API supports (we will get to that soon):
167+
168+
```python
import numpy as np

xp = np

# ...
b = xp.full(shape, val, dtype=dtype) @ a
c = xp.mean(a, axis=0)
return xp.tensordot(c, b, axes=1)
```
178+
179+
Replacing one backend with another then only requires providing a different
180+
namespace, such as `xp = torch`, selected e.g. via an environment variable. This is
181+
useful if you're writing a script or custom software. Other alternatives are:
182+
183+
- If you are building a library where the backend is determined by input arrays
184+
passed by the end-user, then a recommended way is to ask your input arrays for a
185+
namespace to use: `xp = arr.__array_namespace__()`
186+
- Each function you implement can have a namespace `xp` as a parameter in the
187+
signature. Then converting inputs to the array type of the provided backend can be
188+
achieved with `arg1 = xp.asarray(arg1)` for each input array.
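The three ways of obtaining `xp` described above can be sketched as follows. This assumes NumPy is installed. The environment variable name `ARRAY_BACKEND` and all function names are hypothetical. The fallback to NumPy in `namespace_from_array` is an assumption made so the sketch also handles objects without `__array_namespace__`.

```python
import importlib
import os

import numpy as np


def namespace_from_env(default="numpy"):
    """Script-style selection: import the backend named by an env variable."""
    # ARRAY_BACKEND is a made-up variable name used only for this sketch.
    return importlib.import_module(os.environ.get("ARRAY_BACKEND", default))


def namespace_from_array(arr, fallback=np):
    """Library-style selection: ask the input array for its namespace."""
    get_namespace = getattr(arr, "__array_namespace__", None)
    return get_namespace() if get_namespace is not None else fallback


def example_computation(a, val, xp=None):
    """Namespace as a parameter; inputs are converted with xp.asarray."""
    if xp is None:
        xp = namespace_from_array(a)
    a = xp.asarray(a)
    n_rows, n_cols = a.shape
    # Same operations as in the snippet above, with a concrete shape.
    b = xp.full((n_cols, n_rows), val, dtype=a.dtype) @ a
    c = xp.mean(a, axis=0)
    return xp.tensordot(c, b, axes=1)


a = np.arange(6.0).reshape(3, 2)
print(example_computation(a, 1.0))
```

Passing `xp` explicitly keeps functions easy to test, while the array-based lookup is more convenient for end users who only pass arrays.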
189+
190+
If you're relying on NumPy, CuPy, PyTorch, Dask, or JAX then
191+
{ref}`array-api-compat` can come in handy for the transition. The compat layer
192+
allows you to still rely on your selection of array producing library, while
193+
making sure you're already using a standard-compatible API. Additionally, it
194+
offers a set of useful utility functions, such as:
195+
196+
- [array_namespace()](https://data-apis.org/array-api-compat/helper-functions.html#array_api_compat.array_namespace)
197+
for fetching the namespace based on input arrays.
198+
- [is_array_api_obj()](https://data-apis.org/array-api-compat/helper-functions.html#array_api_compat.is_array_api_obj)
199+
for checking whether a given object is Array API compatible.
200+
- [device()](https://data-apis.org/array-api-compat/helper-functions.html#array_api_compat.device)
201+
to get the device the array resides on.
202+
203+
For now the migration from a specific library (e.g. NumPy) to a standard compatible
204+
setup requires manual intervention for each failing API call, but in the future
205+
we plan to provide some automation tools for it.

spec/2025.12/tutorial_basic.md

Lines changed: 112 additions & 0 deletions
@@ -0,0 +1,112 @@
1+
(tutorial-basic)=
2+
3+
# Array API Tutorial
4+
5+
In this tutorial we're going to show the migration from the array consumer
6+
point of view for a simple graph algorithm.
7+
8+
The example presented here comes from the [`graphblas-algorithms`](https://github.com/python-graphblas/graphblas-algorithms)
9+
library. There we can find [the HITS algorithm](https://github.com/python-graphblas/graphblas-algorithms/blob/35dbc90e808c6bf51b63d51d8a63f59238c02975/graphblas_algorithms/algorithms/link_analysis/hits_alg.py#L9),
10+
used in link analysis to estimate prominence in sparse networks.
11+
12+
The inlined and slightly simplified (without "authority" feature)
13+
implementation looks like this:
14+
15+
```python
def hits(G, max_iter=100, tol=1.0e-8, normalized=True):
    N = len(G)
    h = Vector(float, N, name="h")
    a = Vector(float, N, name="a")
    h << 1.0 / N
    # Power iteration: make up to max_iter iterations
    A = G._A
    hprev = Vector(float, N, name="h_prev")
    for _i in range(max_iter):
        hprev, h = h, hprev
        a << hprev @ A
        h << A @ a
        h *= 1.0 / h.reduce(monoid.max).get(0)
        if is_converged(hprev, h, tol):
            break
    else:
        raise ConvergenceFailure(max_iter)
    if normalized:
        h *= 1.0 / h.reduce().get(0)
        a *= 1.0 / a.reduce().get(0)
    return h, a


def is_converged(xprev, x, tol):
    xprev << binary.minus(xprev | x)
    xprev << unary.abs(xprev)
    return xprev.reduce().get(0) < xprev.size * tol
```
43+
44+
We can see that the API is specific to the GraphBLAS array object.
45+
There is a `Vector` constructor, an overloaded `<<` for assigning new values,
46+
and `reduce`/`get` for reductions. We need to replace them, and, by convention,
47+
we will use the `xp` namespace for calling the respective functions.
48+
49+
First we want to make sure we construct arrays in an agnostic way:
50+
51+
```python
h = xp.full(N, 1.0 / N)
A = xp.asarray(G.A)
```
55+
56+
Then, instead of `reduce` calls we use the appropriate reduction
57+
functions from the Array API:
58+
59+
```python
h = h / xp.max(h)
# ...
h = h / xp.sum(xp.abs(h))
a = a / xp.sum(xp.abs(a))
# ...
err = xp.sum(xp.abs(...))
```
67+
68+
We replace the custom binary operation with its Array API counterpart:
69+
70+
```python
...(x - xprev)
```
73+
74+
And last but not least, let's ensure that the result of the convergence
75+
condition is a scalar coming from our API:
76+
77+
```python
err < xp.asarray(N * tol)
```
80+
81+
The rewrite is now complete, so we can assemble all constituent parts into
82+
a full implementation:
83+
84+
```python
def hits(G, max_iter=100, tol=1.0e-8, normalized=True):
    N = len(G)
    h = xp.full(N, 1.0 / N)
    A = xp.asarray(G.A)
    # Power iteration: make up to max_iter iterations
    for _i in range(max_iter):
        hprev = h
        a = hprev @ A
        h = A @ a
        h = h / xp.max(h)
        if is_converged(hprev, h, N, tol):
            break
    else:
        raise Exception("Didn't converge")
    if normalized:
        h = h / xp.sum(xp.abs(h))
        a = a / xp.sum(xp.abs(a))
    return h, a


def is_converged(xprev, x, N, tol):
    err = xp.sum(xp.abs(x - xprev))
    return err < xp.asarray(N * tol)
```
108+
109+
At this point the actual execution depends only on the `xp` namespace,
110+
and replacing that one variable allows us to switch from e.g. NumPy arrays
111+
to JAX execution on a GPU. This gives us more flexibility: for example,
112+
we can use lazy evaluation or JIT-compile the loop body with JAX.
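To try the rewritten algorithm end to end, here is a self-contained sketch that runs it with NumPy as the backend. It departs from the code above in a few labeled ways: the graph is passed as a plain dense adjacency matrix instead of a graph object, `xp` is passed as a parameter rather than taken from a global, and the three-node graph is made up for illustration.

```python
import numpy as np


def is_converged(xprev, x, N, tol, xp):
    err = xp.sum(xp.abs(x - xprev))
    return bool(err < xp.asarray(N * tol))


def hits(A, xp, max_iter=100, tol=1.0e-8, normalized=True):
    # Assumption: A is a dense adjacency matrix, not a graph object.
    A = xp.asarray(A, dtype=xp.float64)
    N = A.shape[0]
    h = xp.full(N, 1.0 / N, dtype=xp.float64)
    # Power iteration: make up to max_iter iterations
    for _i in range(max_iter):
        hprev = h
        a = hprev @ A
        h = A @ a
        h = h / xp.max(h)
        if is_converged(hprev, h, N, tol, xp):
            break
    else:
        raise RuntimeError("Didn't converge")
    if normalized:
        h = h / xp.sum(xp.abs(h))
        a = a / xp.sum(xp.abs(a))
    return h, a


# Made-up directed graph with edges 0 -> 1, 0 -> 2 and 1 -> 2.
adjacency = [
    [0, 1, 1],
    [0, 0, 1],
    [0, 0, 0],
]
h, a = hits(adjacency, np)
print("hubs:", h)
print("authorities:", a)
```

Node 0 has the most outgoing links and node 2 the most incoming ones, so they should receive the largest hub and authority scores respectively. Passing a different namespace as `xp` is the only change needed to try another backend.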
