Add trailing_nans parameter to control NaN placement in sort results#235
Add trailing_nans parameter to control NaN placement in sort results#235r-devulap wants to merge 11 commits into
Conversation
All sorting and selection routines (qsort, qselect, partial_qsort,
argsort, argselect) now accept an optional `bool trailing_nans`
parameter (default: true). When hasnan=true:
- trailing_nans=true (default): NaN values appear at the end of the
result, independent of sort direction.
- trailing_nans=false: NaN values appear at the beginning of the
result, independent of sort direction.
Previously, NaN placement was coupled to the sort direction: ascending
always put NaN at the end, descending always put NaN at the beginning.
This change decouples the two.
Implementation notes:
- qsort path: replace_inf_with_nan now takes both `descending` and
`trailing_nans`; when they differ, the real-value portion is rotated
before NaN slots are written back.
- qselect path: NaN is moved to the desired end before partitioning,
keeping the existing move_nans_to_{end,start}_of_array helpers.
- argsort path: after std_argsort_withnan + optional reverse, a
std::rotate moves NaN indices to the desired end if needed.
- argselect path: comparator lambda respects trailing_nans.
- keyvalue_qsort/select path: descending reverse now only reverses
the real portion [0, index_last_elem], leaving trailing NaN in place.
- Scalar fallback: four NaN-aware comparators (compare_nan_end/begin,
compare_arg_nan_end/begin) added to utils/custom-compare.h.
Tests updated to use compare_nan_end for descending reference sorts
and to skip value-comparison checks when arr[k] is NaN.
|
Nice, I have to take a slightly closer look. But should it be |
|
Thanks, this is nice! Going to take a deeper look soon - but, I was wondering if it's actually easier to just change |
It might easier to just reuse existing helper functions: |
- Add bool descending = false parameter to all argselect APIs and internal implementations (avx512, avx2, scalar, static, dispatch) - In xss_argselect: reverse the index array after partitioning when descending is true; pass descending to std_argselect_withnan for NaN-array fast path - std_argselect_withnan: honour descending flag in the comparator - scalar argselect: pick comparator once based on descending and trailing_nans, then call nth_element once - Tests: replace test_argselect with four typed tests covering ascending, descending, trailing NaNs and leading NaNs - IS_ARG_PARTITIONED: add descending parameter forwarded to IS_ARR_PARTITIONED
argselect_ always partitions in ascending order. For descending, we must select at position arrsize-1-k so the k-th largest lands at that mirror index; reversing then moves it to position k with the correct left/right partition invariant.
|
@seberg @MaanasArora This looks good to me - happy to merge once you sign off. |
NumPy, as of today, does not use a descending flag. I’m not sure which other libraries rely on this flag, and I’m not particularly concerned about that change. The newly introduced trailing_nans flag handles NaN placement independently of sort order—it simply lets users specify where NaNs should appear, regardless of whether the sort is ascending or descending. Personally, I don’t find assigning an order to NaNs meaningful. In that sense, this flag mainly provides users with a clear preference for where they want their NaNs. |
Summary of changes in
nanbranchtrailing_nansparameterAdded
bool trailing_nans = trueto all sort/select routines (qsort,qselect,partial_qsort,argsort,argselect). Whenhasnan=true, this controls where NaN values land in the result:trailing_nans=true(default): NaNs at the end, regardless of sort directiontrailing_nans=false: NaNs at the beginning, regardless of sort directionPreviously the library replaced NaNs with
±infas a workaround; now it partitions them directly to the correct end. Bit-exact NaN payloads are preserved.descendingparameter forargselectAdded
bool descending = falsetoargselect, completing its API to matchargsortandqselect. Whendescending=true:arrsize-1-kthen reverses; scalar path selects astd::greatercomparatorTests
qsort,argsort, andqselecttest_argselectwith four tests: ascending, descending, trailing NaNs, leading NaNsutils/custom-compare.hwith NaN-aware comparators shared across tests