Add transpose method to dataframe #1176
Comments
df = pl.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
print(df)
print(pl.DataFrame(df.rows()))
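For readers skimming the thread, the workaround above can be sketched in pure Python: `df.rows()` yields the frame row by row, and re-wrapping those rows swaps rows and columns. This is only an illustration of the idea, not polars internals:

```python
# Pure-Python sketch of the rows-based transpose workaround above.
# Columns are equal-length lists; zip(*columns) walks them row by row,
# and the rows, read back as columns, form the transposed table.
def transpose(columns):
    return [list(row) for row in zip(*columns)]

col1 = [1, 2]
col2 = [3, 4]
print(transpose([col1, col2]))  # [[1, 3], [2, 4]]
```

Applying it twice returns the original columns, which is a cheap sanity check for any transpose implementation.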
Thanks! Would it make sense to create an alias for that function called
Yes, will add the alias. 👍
@ritchie46 I'm wondering whether it would make sense to introduce a lazy transpose, skipping the extra allocation (turning it into an iterator) if there is a subsequent operation. Edit: this might fit ndarray more; e.g. it could optimize DF * DF.transpose()
Now that I think of it, I will support this natively, as going through Python rows is super expensive.
@alippai Currently these operations sadly cannot be done in lazy. I need to know the schema of every node in the query plan. An operation like
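The schema constraint mentioned above is worth spelling out for transpose in particular: the transposed frame has one column per input *row*, so its schema depends on the data itself, not just on the input schema — exactly what a lazy planner cannot know upfront. A minimal sketch (the helper and the `column_{i}` naming are illustrative, not the polars API):

```python
# Hypothetical helper showing why a lazy engine cannot derive the output
# schema of transpose from the input schema alone: the number of output
# columns equals the number of input rows, which is data, not metadata.
def transposed_schema(input_schema, n_rows):
    # Output column names can only be produced once n_rows is known,
    # i.e. after the data has been materialized.
    return [f"column_{i}" for i in range(n_rows)]

# Same input schema, different row counts -> different output schemas.
print(transposed_schema(["col1", "col2"], 2))  # ['column_0', 'column_1']
print(transposed_schema(["col1", "col2"], 3))  # ['column_0', 'column_1', 'column_2']
```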
Makes sense, I really appreciate the implementation detail!
FWIW, Spark supports it (lazily), but it is a shotgun shot to the foot, as it performs two queries, one of them to compute the distincts during planning.
Just to gauge the complexity of the task: an n x m sized, single-type (int/float) 2D specialization of the DF type would be needed for this, right?
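The single-dtype n x m specialization asked about above essentially reduces to flipping the index arithmetic over one flat, row-major buffer. A minimal sketch of that idea (not the polars implementation):

```python
# Minimal sketch: transpose a flat, row-major n x m buffer of one dtype.
# Element (r, c) of the input lands at (c, r) in the m x n output,
# so index r * n_cols + c maps to c * n_rows + r.
def transpose_flat(buf, n_rows, n_cols):
    out = [0] * (n_rows * n_cols)
    for r in range(n_rows):
        for c in range(n_cols):
            out[c * n_rows + r] = buf[r * n_cols + c]
    return out

# 2 x 3 matrix [[1, 2, 3], [4, 5, 6]] stored row-major:
print(transpose_flat([1, 2, 3, 4, 5, 6], 2, 3))  # [1, 4, 2, 5, 3, 6]
```

A real columnar implementation would additionally have to handle nulls and non-numeric dtypes, which is where much of the complexity lives.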
Ouch.. that's definitely a shotgun. In such cases, I'd rather have a user do something like this, and document why they'd want it:
temp = long_query().transpose().collect()
(temp.lazy()
    .select(..)  # continue from here.
)
For max performance I guess something like that. I am planning to use the
Added in 2fa53db
Similar to pandas.