From 38d415bc04e827953f0cd91c07e49ca89c024f70 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Mateusz=20Sok=C3=B3=C5=82?= Date: Mon, 23 Mar 2026 13:12:50 +0100 Subject: [PATCH 1/2] Migration guide post-merge review --- spec/draft/migration_guide.md | 118 +++++++++++++++++----------------- 1 file changed, 60 insertions(+), 58 deletions(-) diff --git a/spec/draft/migration_guide.md b/spec/draft/migration_guide.md index 9babf2f35..27779a039 100644 --- a/spec/draft/migration_guide.md +++ b/spec/draft/migration_guide.md @@ -2,74 +2,75 @@ # Migration Guide -This page is meant to help migrate your codebase to an Array API compliant -implementation. The guide is divided into two parts and, depending on your -exact use-case, you should look thoroughly into at least one of them. +This page is meant to help migrate your codebase to an array API standard +compliant implementation. The guide is divided into three parts. + +The first part gives an overview of the {ref}`ecosystem` libraries, that +are helpful in different contexts when working with the array API standard. The first part is dedicated for {ref}`array-producers`. If your library -mimics, for example, NumPy's or Dask's functionality, then you can find in +mimics, for example, NumPy's or PyTorch's functionality, then you can find in the first part additional instructions and guidance on how to ensure downstream users can easily pick your solution as an array provider for their system/algorithm. -The second part delves into details for Array API compatibility for +The second part delves into details for array API standard compatibility for {ref}`array-consumers`. This pertains to any software that performs multidimensional array manipulation in Python, such as may be found in scikit-learn, SciPy, or statsmodels. 
If your software relies on a certain array producing library, such as NumPy or JAX, then you can use the second -part to learn how to make it library agnostic and interchange array -namespaces with significantly less friction. +part to learn how to make it library agnostic and, as a result, interchange +array namespaces with significantly less friction. + + +(ecosystem)= ## Ecosystem -Apart from the documented standard, the Array API ecosystem also provides +Apart from the documented standard, the array API ecosystem also provides a set of tools and packages to help you with the migration process: (array-api-compat)= -### Array API Compat +### array-api-compat GitHub: [array-api-compat](https://github.com/data-apis/array-api-compat) User group: Array Consumers -Although NumPy, Dask, CuPy, and PyTorch support the Array API Standard, there +Although NumPy, Dask, CuPy, and PyTorch support the array API standard, there are still some corner cases where their behavior diverges from the standard. -`array-api-compat` provides a compatibility layer to cover these cases. -This is also accompanied by a few utility functions for easier introspection -into array objects. As an array consumer, you can still rely on the original -API while having access to the standard compatible one. +`array-api-compat` provides a compatibility layer to cover an additional subset +of these corner cases. This is also accompanied by a few utility functions fo +easier introspection into array objects. As an array consumer, you can still +rely on the original API while having access to the standard compatible one. (array-api-strict)= -### Array API Strict +### array-api-strict GitHub: [array-api-strict](https://github.com/data-apis/array-api-strict) -User group: Array Consumers, Array Producers (for testing) +User group: Array Consumers `array-api-strict` is a library that provides a strict and minimal -implementation of the Array API Standard. 
For array producers, it is designed -to be used as a reference implementation for testing and development purposes. -You can compare your API calls with `array-api-strict` counterparts and -ensure that your library is fully compliant with the standard and can -serve as a reliable reference for other developers in the ecosystem. -For consumers, you can use `array-api-strict` during the development as an -array provider to ensure your code uses APIs compliant with the standard. +implementation of the array API standard. As a consumer, you can use +`array-api-strict` for parametrising tests with it as an array namespace +to ensure your code uses APIs compliant with the standard. (array-api-tests)= -### Array API Test +### array-api-tests GitHub: [array-api-tests](https://github.com/data-apis/array-api-tests) User group: Array Producers `array-api-tests` is a collection of tests that can be used to verify the -compliance of your library with the Array API Standard. It includes tests +compliance of your library with the array API standard. It includes tests for array producers, covering a wide range of functionalities and use cases. By running these tests, you can ensure that your library adheres to the standard and can be used with compatible array consumer libraries. @@ -77,18 +78,17 @@ standard and can be used with compatible array consumer libraries. (array-api-extra)= -### Array API Extra +### array-api-extra GitHub: [array-api-extra](https://github.com/data-apis/array-api-extra) User group: Array Consumers `array-api-extra` is a collection of additional utilities and tools that are -missing from the Array API Standard but can be useful for compliant array -consumers. It includes additional array manipulation and statistical functions. -It is already used by SciPy and scikit-learn. - -The sections below mention when and how to use them. +not present in the array API standard but can be useful for compliant array +consumers. 
It includes additional array manipulation and statistical +functions, support for lazy backends, and useful testing utilities. It is +already used by SciPy and scikit-learn. (array-producers)= @@ -96,7 +96,7 @@ The sections below mention when and how to use them. ## Array Producers For array producers, the central task during the development/migration process -is ensuring that the user-facing API adheres to the Array API Standard. +is ensuring that the user-facing API adheres to the array API standard. The complete API of the standard is documented in the [API specification](https://data-apis.org/array-api/latest/API_specification/index.html). @@ -104,13 +104,13 @@ The complete API of the standard is documented in the There, each function, constant, and object is described with details on parameters, return values, and special cases. -### Testing against Array API +### Testing against array API There are two main ways to test your API for compliance: either using `array-api-tests` suite or testing your API manually against the `array-api-strict` reference implementation. -#### Array API Test suite (Recommended) +#### array-api-tests suite (Recommended) {ref}`array-api-tests` is a test suite which verifies that your API adheres to the standard. For each function or method, it confirms @@ -144,13 +144,15 @@ cover only the minimal workflow: option is to skip these for the time being. We strongly advise you to embed this setup in your CI as well. This will allow -you to continuously monitor Array API coverage, and make sure new changes don't break existing -APIs. As a reference, see [NumPy's Array API Tests CI setup](https://github.com/numpy/numpy/blob/581d10f43b539a189a2d37856e5130464de9e5f6/.github/workflows/linux.yml#L296). +you to continuously monitor array API standard coverage, and make sure new +changes don't break existing APIs. 
As a reference, see +[NumPy's array-api-tests CI setup](https://github.com/numpy/numpy/blob/581d10f43b539a189a2d37856e5130464de9e5f6/.github/workflows/linux.yml#L296) +and [a Pixi workspace setup](https://github.com/mdhaber/mparray/blob/0ef47e008fef92c605f73907436d4c6617419161/pixi.toml#L119-L179). -#### Array API Strict +#### array-api-strict -A simpler, and more manual, way of testing Array API coverage is to +A simpler, and more manual, way of testing array API standard coverage is to run your API calls along with the {ref}`array-api-strict` Python implementation. This way, you can ensure that the outputs coming from your API match the minimal @@ -163,10 +165,9 @@ cases. ## Array Consumers -For array consumers, the main premise is to keep in mind that your **array -manipulation operations should not lock in for a particular array producing -library**. For instance, if you use NumPy for arrays, then your code could -contain: +For array consumers, the main premise is that your **array manipulation operations +should not be specific to one particular array producing library**. For instance, +if your code is specific to NumPy, it might contain: ```python import numpy as np @@ -178,12 +179,12 @@ return np.dot(c, b) ``` The first step should be as simple as assigning the `np` namespace to a dedicated -namespace variable. The convention used in the ecosystem is to name it `xp`. Then, -it is vital to ensure that each method and function call is something that the Array API -supports. For example, `dot` is present in the NumPy's API, but the standard -doesn't support it. For the sake of simplicity, let's assume both `c` and `b` -are `ndim=2`; therefore, we select `tensordot` instead, as both NumPy and the -standard define it: +namespace variable. The convention used in the ecosystem is to name it `xp`. +Then, it is vital to ensure that each method and function call is something that +the array API standard supports. 
For example, `dot` is present in the NumPy's +API, but the standard doesn't support it. For the sake of simplicity, let's +assume both `c` and `b` are `ndim=2`; therefore, we select `tensordot` instead, +as both NumPy and the standard define it: ```python import numpy as np @@ -196,18 +197,19 @@ c = xp.mean(a, axis=0) return xp.tensordot(c, b, axes=1) ``` -At this point, replacing one backend with another one should only require providing a different -namespace, such as `xp = torch` (e.g., via an environment variable). This can be useful -if you're writing a script or in your custom software. The other alternatives are: +At this point, replacing one backend with another one should only require +providing a different namespace, such as `xp = torch` (e.g., via an environment +variable). This can be useful if you're writing a script or in your custom +software. The other alternatives are: -- If you are building a library where the backend is determined by input arrays, - and your function accepts array arguments, then a recommended way is to ask - your input arrays for a namespace to use: `xp = arr.__array_namespace__()`. - If the given library doesn't have it, then [`array_api_compat.array_namespace()`](https://data-apis.org/array-api-compat/helper-functions.html#array_api_compat.array_namespace) - should be used instead: +- If you are building a library where the backend is determined by input + arrays, and your function accepts array arguments, then a recommended way to + fetch the namespace is to use [`array_api_compat.array_namespace()`](https://data-apis.org/array-api-compat/helper-functions.html#array_api_compat.array_namespace). 
+ In case you don't want to introduce a new package dependency, you can rely + on a plain `xp = arr.__array_namespace__()`: ```python def func(array1, scalar1, scalar2): - xp = array1.__array_namespace__() # or array_namespace(array1) + xp = array_namespace(array1) # or array1.__array_namespace__() return xp.arange(scalar1, scalar2) @ array1 ``` - For a function that accepts scalars and returns arrays, use namespace `xp` as @@ -227,7 +229,7 @@ offers a set of useful utility functions, such as: - [array_namespace()](https://data-apis.org/array-api-compat/helper-functions.html#array_api_compat.array_namespace) for fetching the namespace based on input arrays. - [is_array_api_obj()](https://data-apis.org/array-api-compat/helper-functions.html#array_api_compat.is_array_api_obj) - for inspecting whether a given object is Array API compatible. + for inspecting whether a given object is array API compatible. - [device()](https://data-apis.org/array-api-compat/helper-functions.html#array_api_compat.device) for retrieving the device on which an array resides. From a3375e7df49e973d161877eb19485381f3fa21dc Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Mateusz=20Sok=C3=B3=C5=82?= Date: Mon, 23 Mar 2026 17:47:41 +0100 Subject: [PATCH 2/2] Review comments --- spec/draft/migration_guide.md | 33 +++++++++++++++++---------------- 1 file changed, 17 insertions(+), 16 deletions(-) diff --git a/spec/draft/migration_guide.md b/spec/draft/migration_guide.md index 27779a039..52d418693 100644 --- a/spec/draft/migration_guide.md +++ b/spec/draft/migration_guide.md @@ -3,24 +3,24 @@ # Migration Guide This page is meant to help migrate your codebase to an array API standard -compliant implementation. The guide is divided into three parts. +compliant implementation or become interoperable with compliant +implementations. The guide is divided into three parts. 
The first part gives an overview of the {ref}`ecosystem` libraries, that are helpful in different contexts when working with the array API standard. -The first part is dedicated for {ref}`array-producers`. If your library +The second part is dedicated to {ref}`array-producers`. If your library mimics, for example, NumPy's or PyTorch's functionality, then you can find in -the first part additional instructions and guidance on how to ensure -downstream users can easily pick your solution as an array provider for -their system/algorithm. +this part additional instructions and guidance on how to ensure downstream users +can easily pick your solution as an array provider for their system/algorithm. -The second part delves into details for array API standard compatibility for +The third part delves into details for array API standard compatibility for {ref}`array-consumers`. This pertains to any software that performs multidimensional array manipulation in Python, such as may be found in scikit-learn, SciPy, or statsmodels. If your software relies on a certain array producing library, such as NumPy or JAX, then you can use the second -part to learn how to make it library agnostic and, as a result, interchange -array namespaces with significantly less friction. +part to learn how to make it library agnostic and, as a result, use array +namespaces interchangeably with significantly less friction. (ecosystem)= @@ -39,12 +39,13 @@ GitHub: [array-api-compat](https://github.com/data-apis/array-api-compat) User group: Array Consumers -Although NumPy, Dask, CuPy, and PyTorch support the array API standard, there -are still some corner cases where their behavior diverges from the standard. +Although NumPy or CuPy support the array API standard, there are still some +corner cases where their behavior diverges from the standard. `array-api-compat` provides a compatibility layer to cover an additional subset -of these corner cases.
This is also accompanied by a few utility functions fo -easier introspection into array objects. As an array consumer, you can still -rely on the original API while having access to the standard compatible one. +of such corner cases for supported libraries. This is also accompanied by a few +utility functions for easier introspection into array objects. As an array +consumer, you can consume standard-compliant namespaces as well as the wrapped +namespaces in `array-api-compat` at the same time. (array-api-strict)= @@ -57,8 +58,8 @@ User group: Array Consumers `array-api-strict` is a library that provides a strict and minimal implementation of the array API standard. As a consumer, you can use -`array-api-strict` for parametrising tests with it as an array namespace -to ensure your code uses APIs compliant with the standard. +`array-api-strict` in parametrising tests over the array namespace +to ensure your code uses only APIs which are in the standard. (array-api-tests)= @@ -181,7 +182,7 @@ return np.dot(c, b) The first step should be as simple as assigning the `np` namespace to a dedicated namespace variable. The convention used in the ecosystem is to name it `xp`. Then, it is vital to ensure that each method and function call is something that -the array API standard supports. For example, `dot` is present in the NumPy's +the array API standard supports. For example, `dot` is present in the NumPy API, but the standard doesn't support it. For the sake of simplicity, let's assume both `c` and `b` are `ndim=2`; therefore, we select `tensordot` instead, as both NumPy and the standard define it:
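
The namespace lookup that the guide recommends for array consumers (`xp = arr.__array_namespace__()`) can be sketched without installing any array library. The snippet below is only an illustration under stated assumptions: `ToyArray`, `toy_namespace`, `get_namespace`, and `total` are hypothetical names invented for this sketch and are not part of the standard or of `array-api-compat`. Only the `__array_namespace__` method name and its `api_version` keyword come from the standard.

```python
import types


def _toy_sum(x):
    # Hypothetical reduction, present only so the namespace has a function to call.
    return sum(x.data)


# Hypothetical stand-in for an array library's namespace (what `xp` refers to).
toy_namespace = types.SimpleNamespace(sum=_toy_sum)


class ToyArray:
    """Hypothetical array type exposing the `__array_namespace__` hook."""

    def __init__(self, data):
        self.data = list(data)

    def __array_namespace__(self, /, *, api_version=None):
        # A real library would return its standard-compliant module here.
        return toy_namespace


def get_namespace(*arrays):
    """Return the one namespace shared by all inputs (illustrative helper)."""
    namespaces = {}
    for array in arrays:
        ns = array.__array_namespace__()
        namespaces[id(ns)] = ns
    if len(namespaces) != 1:
        raise TypeError("expected at least one array, all from the same library")
    return next(iter(namespaces.values()))


def total(array):
    # Library-agnostic consumer code: no concrete library is named here.
    xp = get_namespace(array)
    return xp.sum(array)
```

With a real array library, the hand-written `get_namespace` would normally give way to `array_api_compat.array_namespace()`, which the guide lists as the recommended helper; the toy version is kept here only to make the dispatch step visible.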