Insights: pola-rs/polars
Overview
1 Release published by 1 person
-
py-1.17.1 Python Polars 1.17.1
published
Dec 9, 2024
31 Pull requests merged by 11 people
-
refactor: Don't deconstruct `CsvParseOptions`
#20302 merged
Dec 15, 2024 -
feat: Add 'skip_lines' for CSV
#20301 merged
Dec 15, 2024 -
feat: Allow subtraction of time dtype columns
#20300 merged
Dec 15, 2024 -
feat: Add `bin.reinterpret`
#20263 merged
Dec 15, 2024 -
feat: Allow decoding of non-Polars arrow dictionaries in Arrow and Parquet
#20248 merged
Dec 15, 2024 -
chore(python): Prepare test suite for Python 3.13 support
#20297 merged
Dec 15, 2024 -
fix: Ensure pl.datetime returns empty column when input columns are empty
#20278 merged
Dec 15, 2024 -
refactor: Add `FunctionCastOptions` and conservative IR-level cast type-checking
#20286 merged
Dec 14, 2024 -
refactor: Add more descriptive error message for failure of vstack/extend
#20299 merged
Dec 14, 2024 -
docs(python): Include parquet options in BigQuery I/O write sample
#20292 merged
Dec 14, 2024 -
chore(python): Clean up some remnants of Python 3.8 support
#20293 merged
Dec 14, 2024 -
perf(python): Lower overhead for `BytecodeParser` on introspection of incompatible UDFs
#20280 merged
Dec 13, 2024 -
build(python): Update `pyo3` and `numpy` crates to version `0.23`
#20111 merged
Dec 13, 2024 -
feat(python): Streamline creation of empty frame from `Schema`
#20267 merged
Dec 13, 2024 -
feat: Add new `Int128Type`
#20232 merged
Dec 12, 2024 -
build(python): Build wheels for ARM Windows in Python release workflow
#20247 merged
Dec 12, 2024 -
fix: Ensure output height does not change on lazy projection pushdown with aggregations
#20223 merged
Dec 11, 2024 -
docs: Fix typo in `fork` warning
#20258 merged
Dec 11, 2024 -
fix: Fix error writing on Windows to locations outside of C drive
#20245 merged
Dec 10, 2024 -
feat(rust): IR formatting QoL improvements
#20246 merged
Dec 10, 2024 -
fix: Incorrect comparison in some cases with filtered list/array columns
#20243 merged
Dec 10, 2024 -
fix: Ensure height is maintained in SQL `SELECT 1 FROM`
#20241 merged
Dec 10, 2024 -
feat: Add `cat.len_chars` and `cat.len_bytes`
#20211 merged
Dec 10, 2024 -
test(python): Add test for BytesIO overwritten after scan
#20240 merged
Dec 10, 2024 -
fix: Properly account for updated Categorical in .unique() kernel
#20235 merged
Dec 9, 2024 -
feat: Expose AexprArena
#20230 merged
Dec 9, 2024 -
Python Polars 1.17.1
#20227 merged
Dec 9, 2024 -
Rust Polars 0.45.1
#20226 merged
Dec 9, 2024 -
fix: Fix incorrect lazy `select(len())` with some select orderings
#20222 merged
Dec 9, 2024 -
refactor: Remove debug asserts on scratch space
#20224 merged
Dec 9, 2024 -
fix: Fix assertion panic on LazyFrame `scratch.is_empty()`
#20219 merged
Dec 9, 2024
4 Pull requests opened by 3 people
-
fix: Update the jsonpath_lib dependency
#20225 opened
Dec 9, 2024 -
feat: Add `cat.starts_with`/`cat.ends_with`
#20257 opened
Dec 11, 2024 -
docs(python): Update example for writing to cloud storage
#20265 opened
Dec 12, 2024 -
feat: Serialize DataFrame/Series using IPC in serde
#20266 opened
Dec 12, 2024
39 Issues closed by 16 people
-
`map_elements` has high overhead even with empty dataframe
#20272 closed
Dec 15, 2024 -
`pl.read_csv` with `skip_rows` cannot deal with unbalanced double quotation in header
#20147 closed
Dec 15, 2024 -
Can't subtract two time columns
#19720 closed
Dec 15, 2024 -
Categorical lexical ordering is not kept when roundtripped through arrow or parquet
#20288 closed
Dec 15, 2024 -
`from_arrow` does not respect global StringCache
#20271 closed
Dec 15, 2024 -
Panic When Aggregating Parquet LazyFrame: validity mask length must match the number of values
#20242 closed
Dec 15, 2024 -
Polars cannot read native pyarrow parquet tables that use dictionary encodings (e.g., categorical types)
#17945 closed
Dec 15, 2024 -
Filter and datetime do not work together
#19696 closed
Dec 15, 2024 -
Row encoding currently regards lexical categorical as non-fixed encoding when it is
#20229 closed
Dec 14, 2024 -
Discussion about casting of Arrow dictionary arrays
#20250 closed
Dec 14, 2024 -
Unpickling Error disappeared when use POLARS_VERBOSE=1
#20283 closed
Dec 13, 2024 -
parse typing.Literal as Enum
#20124 closed
Dec 13, 2024 -
Cannot concatenate datetime to null
#20285 closed
Dec 13, 2024 -
Fork issue
#20284 closed
Dec 13, 2024 -
Problem with `concat` and hive partitioned data in combination with in-memory data frames
#16285 closed
Dec 13, 2024 -
Add empty_data_frame function to pl.Schema
#20262 closed
Dec 13, 2024 -
Add arg to `pl.col(...).is_in(...)` - `respect_none_values: bool = False`
#18730 closed
Dec 13, 2024 -
`pl.col('a').is_in(['val1', None])` does not return true for null cells in col
#18728 closed
Dec 13, 2024 -
v1 Upgrade Guide: seems as though `pl.NUMERIC_DTYPES` is no longer actively supported?
#18062 closed
Dec 13, 2024 -
v1 Upgrade: did `df.drop(['column_which_is_not_in_df'])` switch from doing nothing to error-ing?
#18045 closed
Dec 13, 2024 -
TypeError: argument 'cloud_options': 'AioSession' object cannot be converted to 'PyString'
#20275 closed
Dec 12, 2024 -
Support python builtin `round`
#19942 closed
Dec 11, 2024 -
`melt` panic with categories
#10775 closed
Dec 11, 2024 -
Projection pushdown on scalars changes output height
#20221 closed
Dec 11, 2024 -
polars 1.17 can not save file to RamDisk
#20231 closed
Dec 10, 2024 -
Series length 1 doesn't match DataFrame height 3 in `select()`
#18896 closed
Dec 10, 2024 -
Comparison for Array and List with masked out values is wrong when broadcasting
#20165 closed
Dec 10, 2024 -
[SQL] SELECT literal from dataframe return only 1 row instead of the dataframe height
#20058 closed
Dec 10, 2024 -
SELECT 0 from table_name return 1 line instead of expected number_of_lines(table_name)
#18404 closed
Dec 10, 2024 -
Different streaming mode behavior between Python/Rust
#20156 closed
Dec 10, 2024 -
DataFrame.filter freezes when used in torch.utils.data.DataLoader with non-zero number of workers
#17526 closed
Dec 10, 2024 -
Saving parquet to AWS S3 with df.write_parquet() fails with FileNotFound
#19930 closed
Dec 10, 2024 -
Suggestion to update the UDF documentation
#12912 closed
Dec 9, 2024 -
Incorrect unique() results after multiple CSV reads with categorical columns in Polars 1.17.1
#20236 closed
Dec 9, 2024 -
Panic when using StringCache and requesting unique items
#20233 closed
Dec 9, 2024 -
Support for passing additional arguments to map_batches directly
#20113 closed
Dec 9, 2024 -
Incorrect `select(len())` with some select orderings
#20220 closed
Dec 9, 2024 -
Panic crash with assertion failed in Polars 1.17.0 `scratch.is_empty()`
#20216 closed
Dec 9, 2024
36 Issues opened by 33 people
-
Possibly inconsistent behavior of `pl.mean`/`pl.Expr.mean`/`pl.Series.mean` not raising on str
#20304 opened
Dec 15, 2024 -
`Series.__array_ufunc__` does not properly support `datatypes.Array`
#20303 opened
Dec 15, 2024 -
Parquet reading regression from 1.17.0 on
#20298 opened
Dec 14, 2024 -
Running database tests under Python 3.13 leads to ResourceWarning (unclosed database)
#20296 opened
Dec 14, 2024 -
Fill null values uses `.struct.with_fields(pl.field(column).forward_fill())`.
#20295 opened
Dec 14, 2024 -
CSV column names with double-quotes in them are parsed incorrectly
#20294 opened
Dec 13, 2024 -
docs: mention that NaN equals itself in "not-a-number-or-nan-values" page
#20291 opened
Dec 13, 2024 -
qcut panics when applied to a categorical column
#20290 opened
Dec 13, 2024 -
df.sample() should support pl.len()
#20289 opened
Dec 13, 2024 -
LazyFrame Schema not updated
#20287 opened
Dec 13, 2024 -
Unable to write directly to Azure Datalake Gen2 Storage in polars 1.17.1
#20282 opened
Dec 13, 2024 -
Hive partitioning creates incorrect row_index when skipping files
#20281 opened
Dec 13, 2024 -
bug: pl.cum_sum_horizontal(pl.all()).struct.unnest() raises
#20277 opened
Dec 12, 2024 -
Add example of adding null values in `Expr.add`
#20276 opened
Dec 12, 2024 -
"numpy.float64 object is not callable" Panic
#20274 opened
Dec 12, 2024 -
Panic in CSV serializer
#20273 opened
Dec 12, 2024 -
Create Golang binding
#20269 opened
Dec 12, 2024 -
`repeat_by` panics when `n=lit(None)`; should propagate `null` instead
#20268 opened
Dec 12, 2024 -
`pl.from_records` drops timezone information
#20264 opened
Dec 12, 2024 -
Documentation of `DataFrame.__setitem__` is missing
#20261 opened
Dec 11, 2024 -
Unnest causes multiple invocations of map_batches function
#20260 opened
Dec 11, 2024 -
Support sink_parquet/sink_ndjson/sink_csv with GPU engine
#20259 opened
Dec 11, 2024 -
Polars gets progressively slower when doing operations in repeated cycles
#20256 opened
Dec 10, 2024 -
Getting warning message "Using fork() can cause Polars to deadlock in the child process"
#20255 opened
Dec 10, 2024 -
Enhancement Suggestion: Improve Documentation for Cloud Integration in Polars
#20254 opened
Dec 10, 2024 -
Read_delta does not return latest version of delta table when called in streamlit app
#20253 opened
Dec 10, 2024 -
AWS Lambda does not support multiprocessing
#20252 opened
Dec 10, 2024 -
`pl.read_csv()` does not enforce `pl.Enum()` columns in schema argument
#20251 opened
Dec 10, 2024 -
`unique(maintain_order=True)` fails on null series
#20249 opened
Dec 10, 2024 -
`sliced()` function on `polars-arrow` Array is unsound, allows out-of-bounds access in release mode
#20239 opened
Dec 10, 2024 -
scan_delta on partitioned and compacted delta table fails
#20238 opened
Dec 9, 2024 -
Error casting list to array in agg
#20237 opened
Dec 9, 2024 -
Unable to create ScanSources
#20234 opened
Dec 9, 2024 -
Polars dies on working code when POLARS_PANIC_ON_ERR=1 is set
#20228 opened
Dec 9, 2024 -
Excessive Memory Usage after 1.15.0
#20218 opened
Dec 9, 2024
29 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
feat: Add `dt.replace`
#19708 commented on
Dec 15, 2024 • 7 new comments -
refactor: Move hive partitioning outside of readers
#20203 commented on
Dec 14, 2024 • 0 new comments -
feat: Series.index_of()
#19894 commented on
Dec 13, 2024 • 0 new comments -
feat(python): Add show methods to DataFrame and LazyFrame
#19634 commented on
Dec 12, 2024 • 0 new comments -
Sliding window for pl.Expr.over function
#8976 commented on
Dec 15, 2024 • 0 new comments -
Unfinished quotes (also in skipped lines) result in an empty CSV with read_csv()
#12440 commented on
Dec 15, 2024 • 0 new comments -
Errors in scan_delta and write_delta with nested struct schema evolution (aka adding new field)
#19915 commented on
Dec 15, 2024 • 0 new comments -
document rank function's behavior with NaN
#10513 commented on
Dec 15, 2024 • 0 new comments -
Unexpected behaviour when applying `struct.with_fields` to multiple columns
#18461 commented on
Dec 14, 2024 • 0 new comments -
Allow join on different types if upcast is safe
#15338 commented on
Dec 14, 2024 • 0 new comments -
Support sink_parquet for anonymous scan
#8719 commented on
Dec 13, 2024 • 0 new comments -
add `is_in_missing` or missing parameter to `is_in`
#12591 commented on
Dec 13, 2024 • 0 new comments -
`scan_csv()` fails if there is not enough disk space to download the whole file (e.g. AWS lambda, or container)
#17946 commented on
Dec 12, 2024 • 0 new comments -
Panic reading JSON/parquet Files From S3
#20096 commented on
Dec 12, 2024 • 0 new comments -
Serializing float columns with `format="json"` turns inf/nan values into null
#17211 commented on
Dec 12, 2024 • 0 new comments -
Decimal division fails
#19756 commented on
Dec 12, 2024 • 0 new comments -
Incorrect assumptions about Binary being a String representation when parsing to UInt32 (or any other numerical) type
#18991 commented on
Dec 12, 2024 • 0 new comments -
Globbing support for multiple JSON (not ndjson) files?
#12910 commented on
Dec 12, 2024 • 0 new comments -
When working with s3fs (For AWS S3), it still raises "Polars found a filename" warning
#18040 commented on
Dec 11, 2024 • 0 new comments -
Imperfect behavior with `scan_csv()` on `zstd` compressed file with `new_columns` param
#19916 commented on
Dec 11, 2024 • 0 new comments -
`scan_delta` from S3 returns 400 Bad Request when using Polars 1.14.0, works fine in 1.13.0.
#19969 commented on
Dec 10, 2024 • 0 new comments -
Finish switching to PyO3 0.21's native datetime support, now available with abi3, once Python 3.9 is the minimum version
#16199 commented on
Dec 10, 2024 • 0 new comments -
`LazyFrame::scan_parquet` should return `PolarsError::NoData` on empty folder scans
#20012 commented on
Dec 10, 2024 • 0 new comments -
Columns names with a dot (`.`) do not plot correctly with the new Altair plotting backend
#19735 commented on
Dec 9, 2024 • 0 new comments -
ChunkedArray different chunk sizes performance
#20190 commented on
Dec 9, 2024 • 0 new comments -
Add padding method to List datatype
#10283 commented on
Dec 9, 2024 • 0 new comments -
Future direction for Decimal, embracing fixed point?
#19784 commented on
Dec 9, 2024 • 0 new comments -
jsonpath_lib_polars_vendor should not enable the serde_json's preserve_order feature by default
#20208 commented on
Dec 9, 2024 • 0 new comments -
Rolling window on another col quantile returns different values from quantile for Nearest method
#20214 commented on
Dec 9, 2024 • 0 new comments