Polars is a blazingly fast DataFrames library implemented in Rust. Its memory model uses Apache Arrow as its backend.
It currently consists of an eager API similar to pandas and a lazy API that is somewhat similar to Spark. Among other things, Polars supports the following functionality:
Functionality | Eager | Lazy |
---|---|---|
Filters | ✔ | ✔ |
Shifts | ✔ | ✔ |
Joins | ✔ | ✔ |
GroupBys + aggregations | ✔ | ✔ |
Comparisons | ✔ | ✔ |
Arithmetic | ✔ | ✔ |
Sorting | ✔ | ✔ |
Reversing | ✔ | ✔ |
Closure application (User Defined Functions) | ✔ | ✔ |
SIMD | ✔ | ✔ |
Pivots | ✔ | ✗ |
Melts | ✔ | ✗ |
Filling nulls + fill strategies | ✔ | ✗ |
Aggregations | ✔ | ✗ |
Find unique values | ✔ | ✗ |
Rust iterators | ✔ | ✗ |
IO (csv, json, parquet, Arrow IPC) | ✔ | ✗ |
Query optimization: (predicate pushdown) | ✗ | ✔ |
Query optimization: (projection pushdown) | ✗ | ✔ |
Query optimization: (type coercion) | ✗ | ✔ |
Note that almost all eager operations supported on Series/ChunkedArray can also be used in the lazy API via UDFs (user-defined functions).
Want to know about all the features Polars supports? Check the current master docs.
Most features are described on the DataFrame, Series, and ChunkedArray structs, in that order. For ChunkedArray, a lot of functionality is also defined by traits in the ops module.
Polars is written to be performant. Below are some comparisons with the (also very fast) pandas DataFrame library.
Take a look at the 10 minutes to Polars notebook to get started.
Want to run the notebook yourself? Clone the repo and run `$ cargo c && docker-compose up`. This will spin up a Jupyter notebook on http://localhost:8891. The notebooks are in the `/examples` directory.
Oh yeah.. and get a cup of coffee, because compilation will take a while on the first run.
A subset of the Polars functionality is also exposed through Python bindings. You can install them with:
$ pip install py-polars
Next, you can check out the 10 minutes to py-polars notebook or take a look at the reference.
Additional cargo features:

* `pretty` (default) - pretty printing of DataFrames
* `temporal` (default) - conversions between Chrono and Polars for temporal data
* `simd` (default) - SIMD operations
* `parquet` - read Apache Parquet format
* `random` - generate arrays with randomly sampled values
* `ndarray` - convert from DataFrame to ndarray
* `lazy` - lazy API
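For example, a downstream crate that only needs the lazy API and Parquet reading could opt out of the defaults and enable just those features (a sketch; feature names follow the list above, and the version is left unpinned):

```toml
[dependencies]
# Disable default features and pick only what is needed.
polars = { version = "*", default-features = false, features = ["lazy", "parquet"] }
```

Trimming features like this keeps compile times down, which matters given the note above about the first build taking a while.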
Want to contribute? Read our contribution guideline.