feat: add `isin` to the specification by kgryte · Pull Request #959 · data-apis/array-api

kgryte · 2025-06-12T10:03:23Z

This PR:

- resolves RFC: add isin for elementwise set inclusion test #854 by adding isin to the specification.
- of the keyword arguments determined according to array comparison data, this PR chooses to support only the invert kwarg. The assume_unique kwarg was not included for the following reasons:
  1. not all array libraries support this kwarg (e.g., ndonnx and CuPy). CuPy lists the kwarg in its documentation but states that this kwarg is ignored.
  2. when doing a quick search through sklearn, I was only able to find one usage of assume_unique when using isin and that was when searching lists of already known unique values.
  3. assume_unique is something of a performance optimization/implementation detail which we have generally attempted to avoid when standardizing APIs.
- does not place restrictions on the shape of x2. While some libraries may choose to flatten a multi-dimensional x2, that is something of an implementation detail and not strictly necessary. For example, an implementation could defer to an "includes" kernel which performs nested loop iteration without needing to perform explicit reshapes/copies.
- adds support for scalar arguments for either x1 or x2. This follows recent general practice in standardized APIs, with the restriction that at least one of x1 or x2 must be an array.
- specifies that value equality should be used, but not must be used. This follows other set APIs (e.g., unique*). As a consequence of value equality, NaN values can never test as True and there is no distinction between signed zeros.
- ~~allows both x1 and x2 to be of any data type~~ limits portability to integer data types, as floating-point data types are not widely supported across all array libraries (e.g., PyTorch). However, if x1 and x2 have no promotable data type, behavior is left unspecified and thus implementation-defined.
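
As a rough illustration of the value-equality consequences listed above (NaN never matches, signed zeros are not distinguished) and of the invert kwarg, here is a minimal sketch using NumPy's existing `np.isin` rather than the standardized function. It uses floating-point inputs, which fall outside the portable integer-only subset, so what it shows is NumPy's own behavior and is only assumed to match what the specification intends.

```python
import numpy as np

# Needles: a NaN, a negative zero, and an ordinary value.
x1 = np.asarray([np.nan, -0.0, 1.0])
# Haystack: a NaN and a positive zero.
x2 = np.asarray([np.nan, 0.0])

# Value equality: NaN != NaN, while -0.0 == 0.0.
mask = np.isin(x1, x2)
print(mask)      # [False  True False]

# invert=True flips the result elementwise.
inverted = np.isin(x1, x2, invert=True)
print(inverted)  # [ True False  True]
```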

Questions

Update: answers provided based on feedback below and discussions during workgroup meetings.

- Would we be okay with requiring that value equality must be used? Is there a scenario where we want to allow libraries some wiggle room, such as with NaN and signed zero comparison?
  - answer: use must, not should, due to predominant usage patterns.
- Are we okay with leaving out assume_unique?
  - answer: yes, this can be left out.
- Are we okay with not mandating reshape behavior if x2 is multi-dimensional?
  - answer: yes, no reshape behavior is required.

Closes: data-apis#854

rgommers

Thanks @kgryte. Looks pretty good to me. I agree with the design choices in the PR description.

> Would we be okay with requiring that value equality must be used? Is there a scenario where we want to allow libraries some wiggle room, such as with NaN and signed zero comparison?

I am not sure wiggle room is needed here. This function has more to do with equal than with unique I think. I just checked NumPy, PyTorch, JAX and CuPy - all seem to be using value equality for nan.

> Are we okay with leaving out assume_unique?

Yes.

> Are we okay with not mandating reshape behavior if x2 is multi-dimensional?

I think that that part of the np.isin docstring is confusing. Reshaping is meaningless, the only point of that is trying to express that the comparisons are element-wise. It'd be better to have a simple double for-loop with pseudo-code. There is no broadcasting either, any shapes should work and the output has the same shape as x1.
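
The double for-loop described above could be sketched roughly as follows. This is an illustrative reference implementation only, not text from the specification; the name `isin_reference` is hypothetical, and NumPy is used merely as a convenient array container.

```python
import numpy as np

def isin_reference(x1, x2, invert=False):
    """Hypothetical reference semantics: for each element of x1, test
    whether it equals any element of x2. No reshaping or broadcasting."""
    x1 = np.asarray(x1)
    x2 = np.asarray(x2)
    out = np.zeros(x1.shape, dtype=bool)  # output has the same shape as x1
    for i in np.ndindex(x1.shape):        # outer loop: every needle
        found = False
        for j in np.ndindex(x2.shape):    # inner loop: every haystack element
            if x1[i] == x2[j]:            # value equality
                found = True
                break
        out[i] = found != invert          # invert flips the result
    return out

a = np.arange(12).reshape(3, 4)
b = np.asarray([[1, 5], [7, 20]])
res = isin_reference(a, b)
print(res.shape)  # (3, 4)
```

Because both loops simply walk every index of their input, x2 can have any number of dimensions without being flattened first.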

ev-br · 2026-01-11T14:41:08Z

I ran a basic hypothesis test for isin at data-apis/array-api-tests#407.

Immediate observations:

with CuPy, complex arguments fail to compile

E           /tmp/tmp247_208r/47750659c2a7d4ce5c4a4d61e6b4600a33e9f3e0.cubin.cu(119): error: no operator "==" matches these operands
E                       operand types are: const thrust::complex<double> == const S
E           
E           14 errors detected in the compilation of "/tmp/tmp247_208r/47750659c2a7d4ce5c4a4d61e6b4600a33e9f3e0.cubin.cu".
E           
E           
E           ========== FAILING CODE SNIPPET:
E           xp.isin(array(0.+0.j, dtype=complex64), array(0.+0.j), **kw) with kw = {}
E           ====================

with torch, bools and complex isin are not implemented:

In [3]: import torch

In [4]: torch.isin(torch.asarray(0.+0.j, dtype=torch.complex64), torch.asarray(0.+0.j))
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[4], line 1
----> 1 torch.isin(torch.asarray(0.+0.j, dtype=torch.complex64), torch.asarray(0.+0.j))

RuntimeError: Unsupported input type encountered for isin(): ComplexFloat

In [5]: torch.isin(torch.asarray(True), torch.asarray(True))
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[5], line 1
----> 1 torch.isin(torch.asarray(True), torch.asarray(True))

RuntimeError: Unsupported input type encountered for isin(): Bool

I rather strongly suspect that nobody really uses isin(complex, complex). So I'd suggest following the suggestion in #959 (comment) and limiting the specification to integer dtypes only, at least as a first step, to see whether the assumption holds in practice. If users actually need isin for floats/complex, we can update the spec in the next revision.

cbourjau · 2026-01-12T11:25:14Z

src/array_api_stubs/_draft/set_functions.py

+    ----------
+    x1: Union[array, int, float, complex, bool]
+        first input array. **May** have any data type.
+    x2: Union[array, int, float, complex, bool]


Should there maybe be a constraint on the rank of x2?

kgryte · Feb 5, 2026

We intentionally did not impose a constraint, and it is not clear whether there is a conceptual reason to do so: this API is a vectorized API for finding whether a needle (a value) is in a haystack (an array), regardless of the dimensionality of the haystack.

ev-br · Feb 5, 2026

FWIW (at least some) other array libraries happily accept arbitrarily shaped x1 and x2:

In [1]: import numpy as np

In [2]: a = np.arange(3*4*5).reshape(3, 4, 5)

In [3]: b = np.arange(11)

In [4]: np.isin(a, b).shape
Out[4]: (3, 4, 5)

In [5]: np.isin(b, a).shape
Out[5]: (11,)

Here's a hypothesis test, data-apis/array-api-tests#407, which does not restrict the shapes and which seems to pass on numpy, cupy, jax and torch locally.

ev-br

LGTM modulo an optional nit.

A preliminary test does not surface any problems, data-apis/array-api-tests#407. The test does not perform value tests though, so it won't catch a library doing something other than equality testing. As long as only integer dtypes are allowed, there aren't that many options though.
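
For what it's worth, a value check for integer inputs could look something like the sketch below. It is not what data-apis/array-api-tests#407 does; the helper name `check_isin_values` is hypothetical, and NumPy stands in for the array library under test.

```python
import numpy as np

def check_isin_values(x1, x2, invert=False):
    """Hypothetical helper: compare np.isin against plain Python set
    membership. Assumes integer inputs, so equality is unambiguous."""
    result = np.isin(x1, x2, invert=invert)
    haystack = {int(v) for v in x2.ravel()}
    expected = [(int(v) in haystack) != invert for v in x1.ravel()]
    assert result.shape == x1.shape            # output shape follows x1
    assert result.ravel().tolist() == expected  # elementwise membership
    return True

x1 = np.arange(10).reshape(2, 5)
x2 = np.asarray([[1, 3], [5, 100]])
check_isin_values(x1, x2)
check_isin_values(x1, x2, invert=True)
```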

lucascolley · 2026-02-05T15:58:48Z

Just a note that we probably won't deprecate https://data-apis.org/array-api-extra/generated/array_api_extra.isin.html for now, as it also exposes the assume_unique and kind parameters.

lucascolley · 2026-02-05T16:00:42Z

It would be good, however, to open an issue to track updating the documentation of xpx.isin to point to xp.isin.

feat: add isin to the specification

7c09df3

Closes: data-apis#854

kgryte added this to the v2025 milestone Jun 12, 2025

kgryte added the API extension label (Adds new functions or objects to the API.) Jun 12, 2025

kgryte added 2 commits June 12, 2025 03:03

docs: fix typo

7259ac7

fix: import missing type

428be60

kgryte mentioned this pull request Jun 12, 2025

RFC: add isin for elementwise set inclusion test #854

Open

rgommers reviewed Jun 12, 2025

ev-br reviewed Jun 12, 2025

kgryte added 2 commits June 22, 2025 23:02

docs: update copy

11662a4

docs: s/should/must/

1c78599

cbourjau reviewed Oct 9, 2025

cbourjau reviewed Oct 9, 2025

tomwhite mentioned this pull request Nov 18, 2025

Add isin cubed-dev/cubed#832

Merged

cbourjau reviewed Jan 12, 2026

refactor: limit to integer data types

95c098b

kgryte requested a review from ev-br February 5, 2026 11:46

ev-br approved these changes Feb 5, 2026

fix: clarify output array shape

44c83c2

kgryte wants to merge 7 commits into data-apis:main from kgryte:feat/isin
