Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix!: Fix NaN ordering to make NaNs compare greater than any other float, and equal to themselves #12721

Merged
merged 35 commits into from
Nov 30, 2023

Conversation

orlp
Copy link
Collaborator

@orlp orlp commented Nov 27, 2023

This PR makes floats have a total ordering in their comparison operator, where NaNs compare equal to each other (regardless of sign or payload) and greater than any other floating point value. This matches the order used in e.g. duckdb and allows unambiguous sorting, hashing and equality. Ironically, sorting/hashing is not yet included in this PR, just the explicit comparison operators, one step at a time.

Also included in this PR is the new polars-compute crate that is intended to contain pure computational functions that do not require polars-core (but may use polars-arrow). We can start moving other things (such as the efficient summation routine) here after this PR.

Example

Before:

>>> s = pl.Series([1.0, float("nan"), float("inf")])
>>> s == s
shape: (3,)
Series: '' [bool]
[
        true
        false
        true
]

After:

>>> s == s
shape: (3,)
Series: '' [bool]
[
        true
        true
        true
]

@github-actions github-actions bot added fix Bug fix python Related to Python Polars rust Related to Rust Polars labels Nov 27, 2023
@orlp orlp marked this pull request as ready for review November 29, 2023 14:10
@orlp orlp changed the title fix: nans should compare greater than any other float, and equal to themselves fix!: nans should compare greater than any other float, and equal to themselves Nov 29, 2023
@github-actions github-actions bot added the breaking Change that breaks backwards compatibility label Nov 29, 2023
Copy link
Member

@stinodego stinodego left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor comments on the Python side!

py-polars/polars/utils/various.py Outdated Show resolved Hide resolved
py-polars/tests/unit/operations/test_comparison.py Outdated Show resolved Hide resolved
py-polars/polars/testing/asserts/series.py Outdated Show resolved Hide resolved
py-polars/polars/dataframe/frame.py Outdated Show resolved Hide resolved
@ritchie46 ritchie46 merged commit 8c4b392 into pola-rs:main Nov 30, 2023
26 checks passed
@c-peters c-peters added the accepted Ready for implementation label Dec 1, 2023
@stinodego stinodego changed the title fix!: nans should compare greater than any other float, and equal to themselves fix!: Fix NaN ordering to make NaNs compare greater than any other float, and equal to themselves Dec 15, 2023
@sun-rs
Copy link

sun-rs commented Jan 16, 2024

can I treat NaN as inf when use polars? because when you do rolling_mean, there maybe a lot of NaN

@stinodego
Copy link
Member

can I treat NaN as inf when use polars? because when you do rolling_mean, there maybe a lot of NaN

If you want to treat NaN as inf you could use .fill_nan(float('inf')).

@99StewartL
Copy link

Does this not explicity break the IEEE 754 NaN comparison rules? What's the thought process behind doing this?

@stinodego
Copy link
Member

Does this not explicity break the IEEE 754 NaN comparison rules? What's the thought process behind doing this?

It does. See explanation:
#13413 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted Ready for implementation breaking Change that breaks backwards compatibility fix Bug fix python Related to Python Polars rust Related to Rust Polars
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

6 participants