Skip to content

Conversation

@tisonkun
Copy link
Contributor

@tisonkun tisonkun commented Nov 25, 2025

This refers to #149046.

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Nov 25, 2025
@rustbot
Copy link
Collaborator

rustbot commented Nov 25, 2025

r? @scottmcm

rustbot has assigned @scottmcm.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@tisonkun tisonkun force-pushed the slice_partial_sort_unstable branch from 115ac5c to 0e87d5d Compare November 25, 2025 16:42
@tisonkun
Copy link
Contributor Author

cc @orlp

Copy link
Contributor

@orlp orlp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some remarks, some on style, but also some with substance.

Besides comments on the code that's written, I do note a lack of tests?

View changes since this review

@tisonkun
Copy link
Contributor Author

I do note a lack of tests?

Doc tests cover most branches. I don't find a dedicated file to cover its cousin sort_unstable. If you can point me to one, I'm glad to add cases there.

@orlp
Copy link
Contributor

orlp commented Nov 26, 2025

The examples can change at any time. And you didn't test, for example, the post-condition that all elements ..start are less than or equal to the elements start..end and that those are less than or equal to the elements end.., including for the zero-length case.

@tisonkun
Copy link
Contributor Author

The examples can change at any time. And you didn't test, for example, the post-condition that all elements ..start are less than or equal to the elements start..end and that those are less than or equal to the elements end.., including for the zero-length case.

Thanks and yes. Do you know where the unit tests of sort/sort_unstable locate?

@orlp
Copy link
Contributor

orlp commented Nov 26, 2025

I believe the bulk is found in https://github.com/rust-lang/rust/blob/main/library/alloctests/tests/sort/tests.rs.

@orlp
Copy link
Contributor

orlp commented Nov 26, 2025

What I suggested in the ACP was a sketch implementation, I did some more thinking and I think the following handles all corner cases nicely:

pub fn partial_sort<T, F, R>(mut v: &mut [T], range: R, is_less: &mut F)
where
    F: FnMut(&T, &T) -> bool,
    R: RangeBounds<usize>,
{
    let len = v.len();
    let Range { start, end } = slice::range(range, ..len);
    
    if end - start <= 1 {
        // Can be resolved in at most a single partition_at_index call, without
        // further sorting. Do nothing if it is an empty range at start or end.
        if start != len && end != 0 {
            sort::select::partition_at_index(v, start, is_less);
        }
        return;
    }
    
    // Don't bother reducing the slice to sort if it eliminates fewer than 8 elements.
    if end + 8 <= len {
        v = sort::select::partition_at_index(v, end - 1, is_less).0;
    }
    if start >= 8 {
        v = sort::select::partition_at_index(v, start, is_less).2;
    }
    sort::unstable::sort(v, is_less);
}

And to formalize the post-conditions, I think the following should hold after a call to v.partial_sort_unstable(b..e):

for i in 0..b {
    for j in b..n {
        assert!(v[i] <= v[j]);
    }
}
for i in 0..e {
    for j in e..n {
        assert!(v[i] <= v[j]);
    }
}
for i in b..e {
    for j in i..e {
        assert!(v[i] <= v[j]);
    }
}

@quaternic
Copy link
Contributor

quaternic commented Nov 28, 2025

And to formalize the post-conditions, I think the following should hold after a call to v.partial_sort_unstable(b..e):

A lot of those individual comparisons are implied by transitivity of the ordering, so it can be reduced to choosing the maximum of the prefix (if any), the minimum of the suffix (if any), and then asserting that the concatenation is sorted.

Informally, max(v[..b]) <= v[b] <= v[b + 1] <= ... <= v[e-1] <= min(v[e..]), or in code:

let max_before = v[..b].iter().max().into_iter();
let sorted_range = v[b..e].iter();
let min_after = v[e..].iter().min().into_iter();
let seq = max_before.chain(sorted_range).chain(min_after);
assert!(seq.is_sorted());

That's pretty much what you said in rust-lang/libs-team#685 (comment) , just using transitivity of the comparison. Without assuming that, the implementation couldn't guarantee the universally quantified property anyway.

@tisonkun tisonkun force-pushed the slice_partial_sort_unstable branch from f9a09e0 to 372589e Compare December 1, 2025 04:09
@rustbot
Copy link
Collaborator

rustbot commented Dec 1, 2025

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

@tisonkun
Copy link
Contributor Author

tisonkun commented Dec 1, 2025

Pushed a new implementation.

I'm writing tests but perhaps we'd have a new mod under library/alloctests/tests/partial_sort rather than patch the existing library/alloctests/tests/sort/tests.rs. Since the sort tests are heavily depending on macros while partial sorts assertions are quite different.

@tisonkun
Copy link
Contributor Author

tisonkun commented Dec 1, 2025

Pushed a new implementation.

I'm writing tests but perhaps we'd have a new mod under library/alloctests/tests/partial_sort rather than patch the existing library/alloctests/tests/sort/tests.rs. Since the sort tests are heavily depending on macros while partial sorts assertions are quite different.

cc @Amanieu for early review for the direction and advice on where to organize the tests.

@rust-log-analyzer

This comment has been minimized.

@tisonkun tisonkun force-pushed the slice_partial_sort_unstable branch 2 times, most recently from 6ef6ab4 to 10d053f Compare December 1, 2025 10:57
@rust-log-analyzer

This comment has been minimized.

@tisonkun tisonkun force-pushed the slice_partial_sort_unstable branch from 10d053f to 43fc006 Compare December 1, 2025 11:13
Signed-off-by: tison <wander4096@gmail.com>
Co-Authored-By: Orson Peters <orsonpeters@gmail.com>
@tisonkun tisonkun force-pushed the slice_partial_sort_unstable branch from 43fc006 to bbca3c0 Compare December 1, 2025 11:43
@Amanieu
Copy link
Member

Amanieu commented Dec 2, 2025

Regarding the tests I'm happy with either a separate file or as part of the existing tests as you see fit.

Signed-off-by: tison <wander4096@gmail.com>
@tisonkun
Copy link
Contributor Author

tisonkun commented Dec 4, 2025

Pushed some basic test cases at 01bfcc0.

The existing sort tests have many assumptions, like checked_sort always sorts the whole range (for sure), and this helper is tightly coupled in cases. I can hardly insert the range concept in the current sort tests set up.

I added the basic test cases and will try to leverage the pattern functions in the sort module to generate more cases. But this patch itself should be mergeable as long as the implementation looks good, since now we have the test structure, which can be improved continuously.

@scottmcm
Copy link
Member

scottmcm commented Dec 6, 2025

r? libs

@rustbot rustbot assigned tgross35 and unassigned scottmcm Dec 6, 2025
Signed-off-by: tison <wander4096@gmail.com>
@tisonkun
Copy link
Contributor Author

Added some pattern tests - can be extended later.

@tisonkun
Copy link
Contributor Author

cc @tgross35 can you take a look here?

@tisonkun tisonkun requested review from Amanieu and orlp December 22, 2025 12:52
Copy link
Contributor

@tgross35 tgross35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few requests, mostly stylistic. I think the tests could use some improvements but I'm not sure what would be reasonable - @orlp would you mind providing some suggestions here? (In general, I'm happy to defer this review to you)

View changes since this review

mod linked_list;
mod misc_tests;
mod num;
mod partial_sort;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional nit: since sort is a module with children, maybe organize this as sort::partial instead.

use crate::slice::sort::shared::find_existing_run;
#[cfg(not(any(feature = "optimize_for_size", target_pointer_width = "16")))]
use crate::slice::sort::shared::smallsort::insertion_sort_shift_left;
use crate::slice::{self};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, unusual import style

Suggested change
use crate::slice::{self};
use crate::slice;

Comment on lines +84 to +91
if start != len && end != 0 {
partition_at_index(v, start, &mut is_less);
} else {
// Do nothing if it is an empty range at start or end: all guarantees
// are already upheld.
}

return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: the positive condition reads easier than negative, and this avoids the empty else just for a comment

if end == 0 || start == len {
    // Do nothing if it is an empty range at start or end: all guarantees
    // are already upheld.
    return;
}

partition_at_index(v, start, &mut is_less);
return;

Comment on lines +94 to +102
// Avoid partitioning the slice when it eliminates only a few elements to sort.
// The threshold of 8 elements was determined empirically.
let mut v = v;
if end + 8 <= len {
v = partition_at_index(v, end - 1, &mut is_less).0;
}
if start >= 8 {
v = partition_at_index(v, start, &mut is_less).2;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put the "8" in a constant, similar to MAX_LEN_ALWAYS_INSERTION_SORT above. It would also be nice to include a link to further context on how the heuristic was determined (even if it's just the GH comment here) so the next person to update this doesn't need to dig too much.

Comment on lines +66 to +67
#[inline(always)]
pub fn partial_sort<T, F, R>(v: &mut [T], range: R, mut is_less: F)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#[inline(always)] seems a bit strong here, is it actually needed? I think #[inline] would probably be fine so small code size heuristics can pick an outline point here if advantageous.

Copy link
Contributor Author

@tisonkun tisonkun Dec 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This follows how sort_unstable was implemented now -

#[inline(always)]
pub fn sort<T, F: FnMut(&T, &T) -> bool>(v: &mut [T], is_less: &mut F) {

I tend to keep this flavor and review them together later, rather than make a difference here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a close look, the cyclomatic complexity of sort is about 4, while partial_sort is about 7, with the final sort to be inlined always, so perhaps 10.

This is typically small enough, but it's reasonable if you insist on making it #[inline] now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rule of thumb nowadays is that using #[inline(always)] rather than #[inline] should be backed up by benchmarks and come with a comment explaining why, because it tends to hurt the size-optimized case. So yeah, I'd prefer #[inline] unless there is something to back up always making a difference.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note also that inlining is bottom-up, so inline(always) applies after other things might have been inlined into the body. Thus the cyclomatic complexity of this function is actually unknown.

In general, any argument of the form "it's typically small enough" is only enough to say #[inline], because if it really is small enough that's already sufficient to get it inlined -- and means that since "typically" isn't "always", leaving the flexibility to say "well actually no this is the atypical case" in the inliner's heuristics, which is also a good thing. (Making a case for always is easier in leaves, where for example you can argue that things like pointer::add shouldn't have function-call overhead even in opt-level=0, but something like sort is very much not that.)


It's possible that what you're looking for here would be inline(trampoline) if we got rust-lang/rfcs#3778 , but today you just want a normal #[inline].

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually intended to ping you here to double check, but somehow you have a way of appearing whenever inlining is discussed anyway :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@scottmcm Thanks for your information!

One more question: as this partial_sort method is only used within the crate, even without #[inline], the compiler may still heuristically inline the function when desired?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So long as it has the information needed available (the MIR for the MIR inliner, the LLVM-IR for the LLVM inliner) then everything is an inlining candidate (though of course inline(never) is rarely inlined). In practice, that means that in -C codegen-units=1 with -C opt-level=3 there's not much of a difference between generic things (since those always need MIR available) and inline things.

And if you're using LTO, everything is an inlining candidate -- that's a big part of why LTO is useful (and a big part of why rustc doesn't have all that many inline annotations).

Comment on lines +3245 to +3257
/// Partially sorts the slice in ascending order **without** preserving the initial order of equal elements.
///
/// Upon completion, for the specified range `start..end`, it's guaranteed that:
///
/// 1. Every element in `self[..start]` is smaller than or equal to
/// 2. Every element in `self[start..end]`, which is sorted, and smaller than or equal to
/// 3. Every element in `self[end..]`.
///
/// This partial sort is unstable (i.e., may reorder equal elements), in-place (i.e., does not
/// allocate), and *O*(*n* + *k* \* log(*k*)) worst-case, where *n* is the length of the slice and
/// *k* is the length of the specified range.
///
/// See the documentation of [`sort_unstable`] for implementation notes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worth a comment noting the (lack of) guarantees about ..start and end.., given this isn't exactly covered by stability, and that the current implementation may just sort the whole thing. IOW, users shouldn't be surprised if partial_sort_unstable([0, 1, 3, 2, 4, 5], 2..4) returns [1, 0, 2, 3, 5, 4].

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 28, 2025
@rustbot
Copy link
Collaborator

rustbot commented Dec 28, 2025

Reminder, once the PR becomes ready for a review, use @rustbot ready.

Copy link
Contributor

@orlp orlp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the overall structure of tests look fine, just some stuff missing.

View changes since this review


check_is_partial_sorted::<T, _>(&mut v.to_vec(), ..);
check_is_partial_sorted::<T, _>(&mut v.to_vec(), 0..0);
check_is_partial_sorted::<T, _>(&mut v.to_vec(), len - 1..len - 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing test for len..len which is a valid empty range at the end.

check_is_partial_sorted::<T, _>(&mut v.to_vec(), mid..mid);
check_is_partial_sorted::<T, _>(&mut v.to_vec(), mid - 1..mid + 1);
check_is_partial_sorted::<T, _>(&mut v.to_vec(), mid - 1..mid);
check_is_partial_sorted::<T, _>(&mut v.to_vec(), mid..mid + 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing tests for substantial slices somewhere in the middle.

@tisonkun
Copy link
Contributor Author

Thanks for your comments @orlp @tgross35! One comment inline #149318 (comment)

Rest SGTM. I'll integrate them in a few days and re-request a review :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants