Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(rust, python): add maintain_order argument to sort/top_k/bottom_k #9672

Merged
merged 20 commits into from
Jul 12, 2023

Conversation

CloseChoice
Copy link
Contributor

closes #9401

@stinodego stinodego changed the title Feature/sort maintain order fix(rust, python): add maintain_order argument to sort/top_k/bottom_k Jul 2, 2023
@stinodego stinodego changed the title fix(rust, python): add maintain_order argument to sort/top_k/bottom_k feat(rust, python): add maintain_order argument to sort/top_k/bottom_k Jul 2, 2023
@github-actions github-actions bot added fix Bug fix python Related to Python Polars rust Related to Rust Polars enhancement New feature or an improvement of an existing feature labels Jul 2, 2023
@CloseChoice CloseChoice marked this pull request as ready for review July 2, 2023 09:35
Copy link
Member

@ritchie46 ritchie46 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for going through the code churn. This looks good. Can you add the mentioned comment in all docstrings?

You must also ensure we don't convert to streaming if this flag is set as the streaming API will not have a stable sort.

@@ -1236,6 +1249,8 @@ def bottom_k(
per column by passing a sequence of booleans.
nulls_last
Place null values last.
maintain_order
Whether the order should be maintained if elements are equal.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a note that this is more expensive and blocks streaming.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@CloseChoice
Copy link
Contributor Author

CloseChoice commented Jul 2, 2023

Thanks for going through the code churn. This looks good. Can you add the mentioned comment in all docstrings?

You must also ensure we don't convert to streaming if this flag is set as the streaming API will not have a stable sort.

Thanks for going through the code churn. This looks good. Can you add the mentioned comment in all docstrings?

You must also ensure we don't convert to streaming if this flag is set as the streaming API will not have a stable sort.

Done, not quite sure why the python test fails, is this known to be a fluky test?

Edit: Ah, I added streaming to the test_all features, that's probably the reason for the failed test.

@CloseChoice CloseChoice requested a review from ritchie46 July 3, 2023 10:59
) -> PolarsResult<&mut Self> {
println!("In sort in place");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A rogue print

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -3,6 +3,10 @@ use polars_plan::prelude::*;

pub(super) fn is_streamable_sort(args: &SortArguments) -> bool {
// check if slice is positive
if args.maintain_order {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't print here. And can we combine the check with the slice check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -97,7 +97,7 @@ pub fn hist(s: &Series, bins: Option<&Series>, bin_count: Option<usize>) -> Resu

cuts.left_join(&out, [category_str], [category_str])?
.fill_null(FillNullStrategy::Zero)?
.sort(["category"], false)
.sort(["category"], false, false)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I think of this, I think we should make a struct called something like SortOptions that implements Default. That should make calling this from the rust side much more ergonomic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure that this is something for this PR?

We already have SortOptions (s. here) and also the possibility to sort_with_options. The difference between sort and sort_with_options is that we cannot specify multiple descending columns in sort_with_options (e.g. this is not possible with sort_with_options). I guess a generalization of SortOptions might be needed for this, but I don't see this in this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I propose this is because the number of arguments are becoming unwieldy and that can be resolved with an argument struct. Would you be willing to make a follow up PR on that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, already thought about that. I think I might need some guidance on how to best do this design-wise but I can do the implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just created an issue for this: #9820

@CloseChoice CloseChoice requested a review from MarcoGorelli as a code owner July 9, 2023 19:44
@CloseChoice CloseChoice force-pushed the feature/sort-maintain-order branch from 7bd6fdb to 1b0a875 Compare July 9, 2023 19:51
@CloseChoice CloseChoice requested a review from ritchie46 July 11, 2023 21:50
@ritchie46
Copy link
Member

Alright, can you do a rebase @CloseChoice? Then we can do the argument struct in a later PR.

@CloseChoice CloseChoice force-pushed the feature/sort-maintain-order branch from a19277e to e1fdfb1 Compare July 12, 2023 06:36
@ritchie46 ritchie46 merged commit a1d5a22 into pola-rs:main Jul 12, 2023
@ritchie46
Copy link
Member

Thanks @CloseChoice

@CloseChoice CloseChoice deleted the feature/sort-maintain-order branch July 12, 2023 07:16
c-peters pushed a commit to c-peters/polars that referenced this pull request Jul 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or an improvement of an existing feature fix Bug fix python Related to Python Polars rust Related to Rust Polars
Projects
None yet
Development

Successfully merging this pull request may close these issues.

top_k does not preserve original row order
2 participants