Newest 'pandas-apply' Questions

0 votes

0 answers

19 views

DataFrame.apply converts integer data to float even in absence of mixed numerical types

I am using DataFrame.apply() to calculate a new "Metric" column by taking an existing integer categorical column and looking up the integer in a list, i.e., indexing into the list. It works ...

user2153235

1,115

asked Jul 25 at 20:34

2 votes

1 answer

62 views

advanced logic with groupby, apply and transform - compare row value with previous value and create new column

I have the following pandas dataframe: d= {'Time': [0,1,2,0,1,2,2,3,4], 'Price': ['Auction', 'Auction','800','900','By Negotiation','700','250','250','Make Offer'],'Item': ['Picasso', 'Picasso', '...

doctorblaze

45

asked Jul 3 at 8:41

1 vote

2 answers

35 views

DataFrame Creation from DataFrame.apply

I have a function that returns a pd.DataFrame given a row of another dataframe: def func(row): if row['col1']== 1: return pd.DataFrame({'a':[1], 'b':[11]}) else: return pd....

Ottpocket

155

asked Apr 30 at 18:40

0 votes

1 answer

40 views

Why is .apply / .map only running against the first column?

All! I have two [2] pd.DataFrames: df_colors [20,3] and df_master [200,13]. I am attempting to update the values for each item in each row. df_colors.head(5) COLOR CODE WHITE 0 BLACK 1 GREEN 0 ...

OctoCatKnows

1

asked Jan 30 at 15:36

0 votes

0 answers

90 views

create multiple columns with df.apply that uses multi return function

I am creating a new column with df.apply and a function and would like to use the same function to create another column in the dataframe. The function returns several values from which I'm selecting ...

peetman

707

asked Nov 29, 2023 at 23:07

1 vote

3 answers

96 views

How do I sum a column based on separate category types, and preserve the zeros?

Here is my dataframe: Id Category Hours 1 A 1 1 A 3 1 B 4 2 A 2 2 B 6 3 A 3 And here is the output I want: Id Total Hours A_Hours B_Hours 1 5 4 4 2 8 2 6 3 3 3 0 How do I achieve this? I ...

hailthedawn

35

asked Oct 31, 2023 at 13:58
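
One possible approach to the question above, as a minimal sketch rather than a definitive answer: pivot the categories into columns with `pivot_table` and `fill_value=0` so that an Id with no rows for a category still gets a zero. The frame is rebuilt from the sample rows in the excerpt; the variable names `per_category` and `result` are illustrative assumptions.

```python
import pandas as pd

# Sample rows copied from the question excerpt.
df = pd.DataFrame({
    "Id": [1, 1, 1, 2, 2, 3],
    "Category": ["A", "A", "B", "A", "B", "A"],
    "Hours": [1, 3, 4, 2, 6, 3],
})

# One column per category; fill_value=0 keeps missing Id/category pairs as zeros.
per_category = df.pivot_table(
    index="Id", columns="Category", values="Hours", aggfunc="sum", fill_value=0
).add_suffix("_Hours")

# Row-wise total across the category columns, placed first.
per_category.insert(0, "Total Hours", per_category.sum(axis=1))

result = per_category.reset_index()
result.columns.name = None  # drop the leftover "Category" axis label
```

The totals here are computed from the rows as listed, so they follow the sample data rather than any hand-written expected table.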

0 votes

0 answers

25 views

Apply logical comparison but truth value ambiguous

I am trying to map current resources to available resources in GCP. I want to check the current RAM and vCPUs (calculated elsewhere), then create a new column for each machine series that ...

Alex Cahill

13

asked Apr 27, 2023 at 14:50

1 vote

0 answers

50 views

Apply a function with multiple arguments to a large Pandas dataframe efficiently

My dataframe (1,957,046 x 4) is of baby names by year, count and gender, as follows: Year Name Gender Count 1880 A F 1 1880 B M 5 1880 C F 2 ... ... ... ... 2018 X M 7 2018 Y F 4 2018 Z M 2 I ...

PeteyPablo

41

asked Mar 27, 2023 at 17:53

1 vote

1 answer

79 views

Vectorized dataframe filtering with complex logic

I have a very big dataframe with five columns, ID and four numerical. Let's say, integers between 0 and 50. My goal is to calculate cosine similarity matrix for every ID. However, I want to force some ...

Emil Mirzayev

252

asked Jan 26, 2023 at 17:14

2 votes

1 answer

2k views

Get correlation per groupby/apply in Python Polars

I have a pandas DataFrame df: d = {'era': ["a", "a", "b","b","c", "c"], 'feature1': [3, 4, 5, 6, 7, 8], 'feature2': [7, 8, 9, 10, 11, 12], '...

jbssm

7,151

asked Nov 27, 2022 at 20:53

0 votes

1 answer

45 views

Assigning values to columns inside df.apply()

I need to assign multiple values to multiple columns inside a pandas.DataFrame. What I want to do looks like this: df.apply( lambda x: x['card_{}'.format(card)] = score for card, score in zip(...

Любовь Пономарева

365

asked Nov 8, 2022 at 15:27

0 votes

1 answer

145 views

Speed up groupby rolling apply utilising multiple columns

I'm trying to create a Brier Score for a grouped rolling window. As the function that calculates the Brier Score utilises multiple columns in the grouped rolling window I've had to use the answer here ...

Jossy

959

asked Oct 15, 2022 at 21:38

1 vote

1 answer

71 views

index compatibility of dataframe with multiindex result from apply on group

We have to apply an algorithm to columns in a dataframe, the data has to be grouped by a key and the result shall form a new column in the dataframe. Since it is a common use-case we wonder if we have ...

barium

53

asked Oct 14, 2022 at 9:28

1 vote

1 answer

30 views

How do I keep values based on dataframe values?

I have the following dataframe. ID path1 path2 path3 1 12 NaN NaN 1 1 5 NaN 1 2 NaN '' 1 2 4 111 2 123 NaN NaN 3 11 ...

daily update

37

asked Sep 26, 2022 at 6:47

0 votes

1 answer

185 views

Add columns to DataFrame when applying a custom function that returns a dictionary

def tFunc(row): if (random.random()>0.5): info={'A': 'a', 'B': 'b', 'C': 'c', 'D': 'd'} else: info={'A': 'a', 'B': 'b', 'C': 'c'} return info # Workaround # for ...

Dr.PB

1,067

asked Aug 9, 2022 at 4:25

0 votes

1 answer

41 views

How to create a column in a dataframe based on another value in the row (Python)

I have the following data: country code continent plants invertebrates vertebrates total Afghanistan AFG Asia 5 2 33 40 Albania ALB Europe 5 71 61 137 Algeria DZA Africa 24 40 81 145 I want to ...

Jas

25

asked Jul 28, 2022 at 22:23

2 votes

3 answers

1k views

Pandas : Concat rows of a dataframe with same index to form custom string in pairs

Say I have a dataframe df = pd.DataFrame({'colA' : ['ABC', 'JKL', 'STU', '123'], 'colB' : ['DEF', 'MNO', 'VWX', '456'], 'colC' : ['GHI', 'PQR', 'YZ', '789'],}, ...

Himanshu Poddar

7,751

asked Jul 28, 2022 at 17:30

1 vote

1 answer

462 views

Pandas: Prevent groupby-apply from sorting the results according to index

Say I have a dataframe, dict_ = { 'Query' : ['apple', 'banana', 'mango', 'bat', 'cat', 'rat', 'lion', 'potato', 'london', 'new jersey'], 'Category': ['fruits', 'fruits', 'fruits', 'animal', '...

Himanshu Poddar

7,751

asked Jul 27, 2022 at 9:59

1 vote

1 answer

289 views

Apply t-test per group

I have a dataframe like this: features_df = pd.DataFrame({ 'group': np.array([0,0,0,0,1,1,1,1,0,0,0,0,1,1,1,1]), 'variable': ['var1'] * 8 + ['var2'] * 8, 'value': np.array([5.582443, 7....

Arseny Sokolov

65

asked Jul 19, 2022 at 10:45

0 votes

1 answer

378 views

How to improve performance of pandas.apply() to perform text cleaning operations on large sized pandas column?

I'm working on a tweet dataset where one column is the text of the tweet. The following function cleans each tweet, which involves removal of punctuation and stopwords, lower case conversion, ...

Ravindra S

6,392

asked Jul 8, 2022 at 6:53

0 votes

3 answers

51 views

Check within a column if a certain value is contained, if yes set a value

I have a problem. I want to run a loop through the whole series and check if it contains a certain value. If this row contains a certain value, it should be set to true. I get the following error: ...

Test

549

asked Jul 5, 2022 at 14:16

1 vote

0 answers

261 views

Numba - how to return multiple columns ( arrays) - after group by apply

I would like to run groupby and then apply Numba function on top of a pandas. This is the example : @nb.jit(nopython=True) def my_Numba_function(arr1,arr2): arr1[:] =11 arr2[:] =22 ...

Boris

2,075

asked Jun 16, 2022 at 21:47

0 votes

1 answer

58 views

pandas df.apply Including condition with NaN values

I'm trying to replace outliers and NaN values in my pandas.DataFrame with the mode of the series, using the apply method and a lambda function and filtering by a property. I've tried in three ...

Miguel Gómez

1

asked Jun 13, 2022 at 18:10

0 votes

1 answer

889 views

Python Dataframe add two columns containing lists

I have a dataframe of two columns, each containing lists as elements. I want to perform row-wise, element-wise addition of the two lists. My solution below is inspired by this answer: Element-wise addition of 2 ...

Mainland

4,534

asked Apr 30, 2022 at 6:52

0 votes

2 answers

996 views

DataFrame apply/append a function that returns a dict to each row

I'm looking to apply get_sentiment to each row in a dataframe and have the returned dict append to that row. Is there a good way of doing this? def get_sentiment(txt: str) -> dict: response = ...

Connor

423

asked Apr 12, 2022 at 20:22
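
A common pattern for this kind of question, sketched under assumptions: map the function over the text column, build a DataFrame from the resulting dicts, and join it back on the original index. The body of `get_sentiment` is truncated in the excerpt, so the version below is a hypothetical stand-in with the same signature, and the column name `text` is also an assumption.

```python
import pandas as pd


def get_sentiment(txt: str) -> dict:
    # Hypothetical stand-in: the real function body is not shown in the excerpt.
    return {"length": len(txt), "is_upper": txt.isupper()}


df = pd.DataFrame({"text": ["good", "BAD", "fine day"]})

# One dict per row -> one DataFrame row per dict, aligned on the original index.
expanded = pd.DataFrame(df["text"].map(get_sentiment).tolist(), index=df.index)

# Append the dict keys as new columns next to the original data.
result = df.join(expanded)
```

If the dicts have differing keys, rows lacking a key would end up with NaN in that column.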

0 votes

1 answer

301 views

How to apply a function that takes multiple arguments to a pandas DataFrame

I want to create two functions, apply those functions on the DataFrame, and return the result to column interval_ratio import seaborn as sns import pandas as pd import numpy as np max_testing_data = ...

titlefight23

11

asked Mar 27, 2022 at 21:06

0 votes

1 answer

55 views

Pandas locate and apply changes to column

This is something I always struggle with and is very beginner. Essentially, I want to locate and apply changes to a column based on a filter from another column. Example input. import pandas as pd ...

Dodd-learning

118

asked Mar 25, 2022 at 17:37

0 votes

1 answer

318 views

How to return a DataFrame when using pandas.apply()

I'm trying to get a concatenated DataFrame using pandas.apply(); there is a demo below: As the code shows, apply() returns a Series instead of the concatenated DataFrame that I expected. How can I ...

ZJ Z

1

asked Mar 16, 2022 at 14:55

0 votes

1 answer

552 views

Pandas: error when creating a new column using a function that takes one argument from another column

I have the following data frame df: df = pd.DataFrame({'result' : ['s17h10e7', 's5e3h2S105h90e15', 's17H10e7S5e3H2s105h90e15'], 'status' : [102, 117, ...

equanimity

2,523

asked Mar 9, 2022 at 22:40

0 votes

1 answer

521 views

Python: How to apply TTEST_IND to multiple columns and multiple variants in a Dataframe?

Need a quick way to apply a t-test to multiple groups and multiple variables. Let's assume I have a table like this: df = pd.DataFrame({'group': 'a a b b'.split(), 'B': [1,2,3,4], 'C': [4,6, 5,10]}) ...

Alx

1

asked Feb 19, 2022 at 3:47

1 vote

1 answer

562 views

Pandas simple groupby and apply complains "Columns must be same length as key"

Essentially I have a table of timestamps and some data and want to group by the same timestamps and change the timestamps on a grouping basis. I got something working with Interpolate seconds to ...

Pithikos

20.2k

asked Feb 14, 2022 at 10:43

2 votes

3 answers

237 views

Pandas Groupby and Apply

I am performing a groupby and apply over a dataframe, and it is returning some strange results. I am using pandas 1.3.1. Here is the code: ddf = pd.DataFrame({ "id": [1,1,1,1,2] }) def ...

Ben Muller

321

asked Jan 25, 2022 at 5:29

5 votes

1 answer

147 views

Why does pandas.GroupBy.apply() ignore the sort flag in some situations?

When and why is the sort flag of a DataFrame grouping ignored in pd.GroupBy.apply()? The problem is best understood with an example. In the following 4 equivalent solutions to a dummy problem, ...

normanius

9,712

asked Jan 25, 2022 at 1:56

2 votes

1 answer

4k views

faster alternatives to .apply() in pandas

I am trying to speed up the process of applying a custom function to columns in a data frame. I have found that this: b = b.apply(lambda x: 'not_ticker' if x is None else x) b = b.apply(lambda x: x if ...

rvorse

21

asked Dec 25, 2021 at 10:32

0 votes

2 answers

2k views

How to Compare Multiple Columns and Produce Values in a Single New Column Using the Apply Function in Pandas

Using the apply function in pandas, I want to compare multiple columns in a DataFrame to see if their values are higher or lower than a numerical value. Then, based on the result of the condition, if ...

Calculate

343

asked Dec 22, 2021 at 21:29

1 vote

0 answers

599 views

Python Type Error: Series.name must be a hashable

I have a grouped dataset (by groupby) called level_temp_grouped and I want to apply a function to two boolean columns namely, up and fill_cand, for each group. level_temp['neighbor'] = ...

Ozzie

11

asked Dec 14, 2021 at 22:39

0 votes

1 answer

605 views

Count values using groupby function and using apply function at the same time

I'm trying to count the occurrences of grouped values and write the values to a column using the apply and groupby functions on a dataframe. I have the following data frame: df = pd.DataFrame({'colA': ['name1', '...

plategt

35

asked Nov 20, 2021 at 23:39

1 vote

2 answers

98 views

How can I apply multiple functions involving multiple columns of a pandas dataframe with groupby?

Considering the following dataframe: id cat date max score 1 s1 A 12/06 9 5.4 2 s1 B 12/06 10 5.4 3 s2 C 11/04 13 4.2 4 s2 D 11/04 28 10 5 s3 E 08/02 16 5.4 5 s3 F 08/02 6 5.4 I want to group ...

Josu16

103

asked Nov 18, 2021 at 19:30

0 votes

1 answer

396 views

Using the Pandas Apply Function to loop through a List of Coordinates (csv) in a Google Places API query with Python

For a project, I have a csv file with 60 coordinates and the radius (for each centroid of a city district) I want to get my Google Maps results of. Aim is to loop through the coordinates and the ...

marinade

5

asked Nov 14, 2021 at 20:09

0 votes

2 answers

875 views

Pandas df.apply function returns None [closed]

What I'm trying to do: Pass a column through a regex search in order to return a value that will be added to another column How: By writing a function with simple if-else clauses: def category(series): ...

yd132

13

asked Sep 3, 2021 at 9:40

0 votes

2 answers

197 views

Keep NaN groups when using GroupBy apply

I'm looking to keep the structure when using apply on a GroupBy object where some groups are NaN. Using dropna=False does not appear to help, NaN groups are still lost with apply. mux = pd.MultiIndex....

misantroop

2,545

asked Aug 26, 2021 at 13:48

1 vote

1 answer

881 views

How to iterate over an array using a lambda function with pandas apply

I have the following dataset: 0 1 2 0 2.0 2.0 4 0 1.0 1.0 2 0 1.0 1.0 3 3 1.0 1.0 5 4 1.0 1.0 2 5 1.0 NaN 1 6 NaN 1.0 1 ...

user16470918

67

asked Aug 21, 2021 at 5:38

0 votes

1 answer

121 views

Issues Converting Python Code Block to Function

There is a block of code I use regularly in my analysis to standardize the description of the types of devices used by customers to access an internet provider's services. The block of code is as ...

Gideone

5

asked Jul 26, 2021 at 21:18

1 vote

0 answers

67 views

how does DataFrameGroupBy.apply handle large dataframes with duplicate index in pandas?

Suppose that we have a large dataframe with duplicate index, # IPython In [1]: import pandas as pd In [2]: from numpy.random import randint In [3]: df = pd.DataFrame({'a': randint(1, 10, 10000)}, ...

Vvvvvv

164

asked Jun 27, 2021 at 8:09

0 votes

1 answer

392 views

How to make a decimal-part rounding filter using the pandas DataFrame apply method

I want to build a decimal filter for a pandas DataFrame. The filter should ceil or floor each value depending on its decimal part. For example, with thresholds of 0.3 and 0.7: 0.75 -> 1 1.99 -> 2 9.13 -> 9 326.2 -> 326 ...

Luckydipper

17

asked Jun 13, 2021 at 14:36
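
A vectorized sketch of such a filter, which avoids a per-row `apply`. The excerpt is cut off before saying what should happen to decimal parts strictly between the two thresholds, so leaving those values unchanged is purely an assumption here; the function name `threshold_round` and its parameters are illustrative as well.

```python
import numpy as np
import pandas as pd


def threshold_round(values: pd.Series, low: float = 0.3, high: float = 0.7) -> pd.Series:
    """Ceil when the decimal part is >= high, floor when it is <= low.

    Assumption: values whose decimal part lies strictly between the
    thresholds are returned unchanged (the excerpt does not specify this).
    """
    frac = values - np.floor(values)  # decimal part of each value
    rounded = np.where(
        frac >= high,
        np.ceil(values),
        np.where(frac <= low, np.floor(values), values),
    )
    return pd.Series(rounded, index=values.index)


s = pd.Series([0.75, 1.99, 9.13, 326.2, 2.5])
result = threshold_round(s)
```

The first four inputs are the examples from the question; `2.5` is an added value that falls between the thresholds.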

0 votes

1 answer

81 views

Enhancing performance of pandas groupby & apply

Lately I've been stuck on the problem of speeding up groupby & apply. Here is the code: dat = dat.groupby(['glass_id','label','step'])['equip'].apply(lambda x:'_'.join(sorted(list(x)))).reset_index() ...

Beytab

11

asked May 24, 2021 at 8:56
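
One option that may be worth timing against the original, offered as a sketch: sort the whole frame by `equip` once, then aggregate with the built-in `str.join` instead of sorting inside a Python lambda for every group. The sample data below is made up for illustration; only the column names and the original expression come from the excerpt.

```python
import pandas as pd

# Made-up sample data using the column names from the question.
dat = pd.DataFrame({
    "glass_id": ["g1", "g1", "g1", "g2", "g2"],
    "label": [0, 0, 0, 1, 1],
    "step": ["s1", "s1", "s2", "s1", "s1"],
    "equip": ["B", "A", "C", "Z", "Y"],
})
keys = ["glass_id", "label", "step"]

# Expression from the question: sorts inside a lambda for every group.
original = (
    dat.groupby(keys)["equip"]
    .apply(lambda x: "_".join(sorted(list(x))))
    .reset_index()
)

# Alternative: a single global sort, then join each group's values in that order.
alternative = (
    dat.sort_values("equip")
    .groupby(keys)["equip"]
    .agg("_".join)
    .reset_index()
)
```

Both produce the same joined strings on this sample; whether the second is actually faster depends on the number and size of groups, so it should be measured on the real data.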

0 votes

1 answer

114 views

Problems with Dataframe.apply() in combine phase

Issue I am trying to use DataFrame.apply() to add new columns to a dataframe. The number of columns being added is dependent on each row of the original dataframe. There is overlap between the columns ...

Peter

83

asked May 18, 2021 at 20:23

1 vote

2 answers

246 views

Python pandas dataframe apply result of function to multiple columns where NaN

I have a dataframe with three columns and a function that calculates the values of columns y and z given the value of column x. I only need to calculate the values where they are missing (NaN). def ...

Inthu

1,029

asked May 2, 2021 at 15:00

0 votes

1 answer

33 views

Make apply return two series

Say I have the following dataframe id | dict_col ---+--------- 1 {"age":[1,2],"name":["john","doe"]} 2 {"age":[3,4],"name":["foo&...

CutePoison

5,305

asked Apr 21, 2021 at 7:25

1 vote

1 answer

230 views

Pandas, apply simple function to NaN returns value instead of NaN?

import pandas as pd import numpy as np pd.DataFrame( {'a':[0,1,2,3], 'b':[np.nan, np.nan, np.nan,3]} ).apply(lambda x: x > 1) returns False for column b, whereas I would like to get ...

kakk11

918

asked Mar 26, 2021 at 11:08
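
Comparisons against NaN evaluate to False, which is why column b comes back as False. A minimal sketch of one way to keep the missing values, assuming (from the title) that NaN is the desired result wherever the input was NaN: compute the comparison, then mask it with `where` using the original non-null positions. The name `mask` is illustrative.

```python
import numpy as np
import pandas as pd

# Frame copied from the question excerpt.
df = pd.DataFrame({"a": [0, 1, 2, 3], "b": [np.nan, np.nan, np.nan, 3]})

# Keep the comparison result only where the input was not NaN;
# everywhere else the result is NaN instead of False.
mask = (df > 1).where(df.notna())
```

Note that a column mixing booleans and NaN can no longer be a plain boolean column, so column b ends up with a non-boolean dtype.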
