Okay, this is an obvious win. The largest beneficiary is Dask: we had to limit the number of examples there to just 5 to keep run times manageable.

The last commit bumps the number of examples to 500 for NumPy and PyTorch (up from 200) and to 50 for Dask (up from 5). All runs finish in under 5 minutes, which I think is a sweet spot between the convenience of iterating on a PR and reasonable test coverage.
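
For context, here is a minimal sketch of how per-backend example budgets can be expressed with Hypothesis settings profiles. The profile names and the `ARRAY_API_BACKEND` variable are hypothetical, not what this PR actually does:

```python
# conftest.py -- hypothetical per-backend Hypothesis profiles
import os

from hypothesis import settings

# Fast backends can afford many examples per test; Dask is much slower
# per example, so it gets a smaller budget.
settings.register_profile("numpy", max_examples=500)
settings.register_profile("torch", max_examples=500)
settings.register_profile("dask", max_examples=50)

# Pick the profile from the CI matrix, e.g. ARRAY_API_BACKEND=dask.
settings.load_profile(os.environ.get("ARRAY_API_BACKEND", "numpy"))
```
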
With 4 pytest workers, one potential point of concern is how parallelization affects the failure mode. It's already not always easy to untangle how hypothesis reports errors and to replicate a CI failure locally; I suppose we'll try it and see whether parallelization makes this any more difficult.
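
One possible mitigation, as a sketch (the profile name here is made up): Hypothesis can print a `@reproduce_failure(...)` blob with each failure, which replays the exact failing example locally regardless of which pytest-xdist worker hit it.

```python
# conftest.py -- sketch: make parallel CI failures replayable locally
from hypothesis import settings

# With print_blob=True, Hypothesis prints a @reproduce_failure(...) line
# alongside each failure report; pasting that decorator onto the test
# replays the exact failing example, independent of how many xdist
# workers (pytest -n 4) were running.
settings.register_profile("ci", print_blob=True)
settings.load_profile("ci")
```
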
Okay, I think this is it for the low-hanging fruit. The slowest tests on torch and numpy spend their time inside hypothesis itself, so there is not much we can do about them.

cc @crusaderky: this seems to partially address some of your concerns about testing Dask.

Now that data-apis/array-api-tests#373 has landed, array-api-tests is checked out from the master branch again, and the CI improvements are still there (#321 (comment)). Failure reports are still as readable as before, so this seems to be a win all around. Merging.

The matching array-api-tests PR, data-apis/array-api-tests#373, makes it possible to turn xfails into skips. Apparently, this has a large effect on run time: in fact, the vast majority of CI time was spent in hypothesis working to distill failures for tests which we already know are failing and have explicitly marked as xfail.
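
To illustrate why skipping helps, here is a minimal sketch of the xfail-to-skip idea; the `XFAILS` set and the hook below use assumed names for illustration, not the actual implementation in data-apis/array-api-tests#373:

```python
# conftest.py -- minimal sketch: skip known failures instead of xfailing them
import pytest

XFAILS = {"test_foo", "test_bar"}  # hypothetical known-failing test names

def pytest_collection_modifyitems(config, items):
    for item in items:
        if item.name in XFAILS:
            # A skipped test never runs, so Hypothesis spends no time
            # generating and shrinking examples for a known failure.
            item.add_marker(pytest.mark.skip(reason="known failure"))
```

The trade-off is that a skip, unlike an xfail, will not surface a test that unexpectedly starts passing.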