0 votes
0 answers

DataFrame.apply converts integer data to float even in absence of mixed numerical types

I am using DataFrame.apply() to calculate a new "Metric" column by taking an existing integer categorical column and looking up the integer in a list, i.e., indexing into the list. It works ...
user2153235's user avatar
  • 1,115
2 votes
1 answer

advanced logic with groupby, apply and transform - compare row value with previous value and create new column

I have the following pandas dataframe: d= {'Time': [0,1,2,0,1,2,2,3,4], 'Price': ['Auction', 'Auction','800','900','By Negotiation','700','250','250','Make Offer'],'Item': ['Picasso', 'Picasso', '...
doctorblaze's user avatar
1 vote
2 answers

DataFrame Creation from DataFrame.apply

I have a function that returns a pd.DataFrame given a row of another dataframe: def func(row): if row['col1']== 1: return pd.DataFrame({'a':[1], 'b':[11]}) else: return pd....
Ottpocket's user avatar
  • 155
0 votes
1 answer

Why is .apply / .map only running against the first column?

All! I have two [2] pd.DataFrames: df_colors [20,3] and df_master [200,13]. I am attempting to update the values for each item in each row. df_colors.head(5) COLOR CODE WHITE 0 BLACK 1 GREEN 0 ...
OctoCatKnows's user avatar
0 votes
0 answers

create multiple columns with df.apply that uses multi return function

I am creating a new column with df.apply and a function and would like to use the same function to create another column in the dataframe. The function returns several values from which I'm selecting ...
peetman's user avatar
  • 707
1 vote
3 answers

How do I sum a column based on separate category types, and preserve the zeros?

Here is my dataframe: Id Category Hours 1 A 1 1 A 3 1 B 4 2 A 2 2 B 6 3 A 3 And here is the output I want: Id Total Hours A_Hours B_Hours 1 5 4 4 2 8 2 6 3 3 3 0 How do I achieve this? I ...
hailthedawn's user avatar
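
One possible approach to the question above, as a minimal sketch rather than an accepted answer: `pivot_table` with `fill_value=0` keeps a zero for categories an Id never logged. The column names follow the excerpt; the variable names `per_category` and `result` are illustrative. Note that with the rows exactly as shown, Id 1 sums to 8 hours.

```python
import pandas as pd

# Data copied from the question excerpt.
df = pd.DataFrame({
    "Id": [1, 1, 1, 2, 2, 3],
    "Category": ["A", "A", "B", "A", "B", "A"],
    "Hours": [1, 3, 4, 2, 6, 3],
})

# One column per category; fill_value=0 preserves missing categories as zeros.
per_category = df.pivot_table(
    index="Id", columns="Category", values="Hours", aggfunc="sum", fill_value=0
).add_suffix("_Hours")

# Row-wise total across the category columns, placed first.
per_category.insert(0, "Total Hours", per_category.sum(axis=1))

result = per_category.reset_index()
print(result)
```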
0 votes
0 answers

Apply logical comparison but truth value ambiguous

I am trying to do a mapping of current resources to available resources in gcp. I want to check the current RAM and vCPUs(calculated elsewhere) then create a new column for each machine series that ...
Alex Cahill's user avatar
1 vote
0 answers

Apply a function with multiple arguments to a large Pandas dataframe efficiently

My dataframe (1,957,046 x 4) is of baby names by year, count and gender, as follows: Year Name Gender Count 1880 A F 1 1880 B M 5 1880 C F 2 ... ... ... ... 2018 X M 7 2018 Y F 4 2018 Z M 2 I ...
PeteyPablo's user avatar
1 vote
1 answer

Vectorized dataframe filtering with complex logic

I have a very big dataframe with five columns, ID and four numerical. Let's say, integers between 0 and 50. My goal is to calculate cosine similarity matrix for every ID. However, I want to force some ...
Emil Mirzayev's user avatar
2 votes
1 answer

Get correlation per groupby/apply in Python Polars

I have a pandas DataFrame df: d = {'era': ["a", "a", "b","b","c", "c"], 'feature1': [3, 4, 5, 6, 7, 8], 'feature2': [7, 8, 9, 10, 11, 12], '...
jbssm's user avatar
  • 7,151
0 votes
1 answer

Assigning values to columns inside df.apply()

I need to assign multiple values to multiple columns inside a pandas.DataFrame. What I want to do looks like that: df.apply( lambda x: x['card_{}'.format(card)] = score for card, score in zip(...
Любовь Пономарева's user avatar
0 votes
1 answer

Speed up groupby rolling apply utilising multiple columns

I'm trying to create a Brier Score for a grouped rolling window. As the function that calculates the Brier Score utilises multiple columns in the grouped rolling window I've had to use the answer here ...
Jossy's user avatar
  • 959
1 vote
1 answer

index compatibility of dataframe with multiindex result from apply on group

We have to apply an algorithm to columns in a dataframe, the data has to be grouped by a key and the result shall form a new column in the dataframe. Since it is a common use-case we wonder if we have ...
barium's user avatar
  • 53
1 vote
1 answer

How do I keep values based on dataframe values?

I have the following dataframe. ID path1 path2 path3 1 12 NaN NaN 1 1 5 NaN 1 2 NaN '' 1 2 4 111 2 123 NaN NaN 3 11 ...
daily update's user avatar
0 votes
1 answer

Add columns to Dataframe when apply custom function that returns dictionary

def tFunc(row): if (random.random()>0.5): info={'A': 'a', 'B': 'b', 'C': 'c', 'D': 'd'} else: info={'A': 'a', 'B': 'b', 'C': 'c'} return info # Workaround # for ...
Dr.PB's user avatar
  • 1,067
0 votes
1 answer

How to create a column in a dataframe based on another value in the row (Python)

I have the following data: country code continent plants invertebrates vertebrates total Afghanistan AFG Asia 5 2 33 40 Albania ALB Europe 5 71 61 137 Algeria DZA Africa 24 40 81 145 I want to ...
Jas's user avatar
  • 25
2 votes
3 answers

Pandas : Concat rows of a dataframe with same index to form custom string in pairs

Say I have a dataframe df = pd.DataFrame({'colA' : ['ABC', 'JKL', 'STU', '123'], 'colB' : ['DEF', 'MNO', 'VWX', '456'], 'colC' : ['GHI', 'PQR', 'YZ', '789'],}, ...
Himanshu Poddar's user avatar
1 vote
1 answer

Pandas : Prevent groupby-apply to sort the results according to index

Say I have a dataframe, dict_ = { 'Query' : ['apple', 'banana', 'mango', 'bat', 'cat', 'rat', 'lion', 'potato', 'london', 'new jersey'], 'Category': ['fruits', 'fruits', 'fruits', 'animal', '...
Himanshu Poddar's user avatar
1 vote
1 answer

Apply T-Test test per group

I have dataframe like this: features_df = pd.DataFrame({ 'group': np.array([0,0,0,0,1,1,1,1,0,0,0,0,1,1,1,1]), 'variable': ['var1'] * 8 + ['var2'] * 8, 'value': np.array([5.582443, 7....
Arseny Sokolov's user avatar
0 votes
1 answer

How to improve performance of pandas.apply() to perform text cleaning operations on large sized pandas column?

I'm working on a tweet dataset where one column is the text of the tweet. The following function cleans the tweet, which involves removal of punctuation, stopwords, lower case conversion, ...
Ravindra S's user avatar
  • 6,392
0 votes
3 answers

Check within a column if a certain value is contained, if yes set a value

I have a problem. I want to run a loop through the whole series and check if it contains a certain value. If this row contains a certain value, it should be set to true. I get the following error: ...
Test's user avatar
  • 549
1 vote
0 answers

Numba - how to return multiple columns ( arrays) - after group by apply

I would like to run groupby and then apply Numba function on top of a pandas. This is the example : @nb.jit(nopython=True) def my_Numba_function(arr1,arr2): arr1[:] =11 arr2[:] =22 ...
Boris's user avatar
  • 2,075
0 votes
1 answer

pandas df.apply Including condition with NaN values

I'm trying to replace outliers and NaN values in my pandas.DataFrame with the mode of the series, using the apply method and a lambda function and filtering by a property. I've tried in three ...
Miguel Gómez's user avatar
0 votes
1 answer

Python Dataframe add two columns containing lists

I have a dataframe of two columns, each containing lists as elements. I want to perform row-wise, element-wise addition of the two lists. My solution below is inspired by this answer Element-wise addition of 2 ...
Mainland's user avatar
  • 4,534
0 votes
2 answers

DataFrame apply/append a function that returns a dict to each row

I'm looking to apply get_sentiment to each row in a dataframe and have the returned dict append to that row. Is there a good way of doing this? def get_sentiment(txt: str) -> dict: response = ...
Connor's user avatar
  • 423
0 votes
1 answer

How to apply a function that takes multiple arguments to a pandas DataFrame

I want to create two functions, apply those functions on the DataFrame, and return the result to column interval_ratio import seaborn as sns import pandas as pd import numpy as np max_testing_data = ...
titlefight23's user avatar
0 votes
1 answer

Pandas locate and apply changes to column

This is something I always struggle with and is very beginner. Essentially, I want to locate and apply changes to a column based on a filter from another column. Example input. import pandas as pd ...
Dodd-learning's user avatar
0 votes
1 answer

How to return a DataFrame when using pandas.apply()

I'm trying to get a concatenated DataFrame using pandas.apply(); there is a demo below: As the code shows, apply() returns a Series instead of the concatenated DataFrame that I expected, how can I ...
ZJ Z's user avatar
  • 1
0 votes
1 answer

Pandas: error when creating a new column using a function that takes one argument from another column

I have the following data frame df: df = pd.DataFrame({'result' : ['s17h10e7', 's5e3h2S105h90e15', 's17H10e7S5e3H2s105h90e15'], 'status' : [102, 117, ...
equanimity's user avatar
  • 2,523
0 votes
1 answer

Python: How to apply TTEST_IND to multiple columns and multiple variants in a Dataframe?

Need a quick way to apply a t-test to multiple groups and multiple variables. Let's assume I have a table like this: df = pd.DataFrame({'group': 'a a b b'.split(), 'B': [1,2,3,4], 'C': [4,6, 5,10]}) ...
Alx's user avatar
  • 1
1 vote
1 answer

Pandas simple groupby and apply complains "Columns must be same length as key"

Essentially I have a table of timestamps and some data and want to group by the same timestamps and change the timestamps on a grouping basis. I got something working with Interpolate seconds to ...
Pithikos's user avatar
  • 20.2k
2 votes
3 answers

Pandas Groupby and Apply

I am performing a groupby and apply over a dataframe that is returning some strange results. I am using pandas 1.3.1. Here is the code: ddf = pd.DataFrame({ "id": [1,1,1,1,2] }) def ...
Ben Muller's user avatar
5 votes
1 answer

Why does pandas.GroupBy.apply() ignore the sort flag in some situations?

When and why is the sort flag of a DataFrame grouping ignored in pd.GroupBy.apply()? The problem is best understood with an example. In the following 4 equivalent solutions to a dummy problem, ...
normanius's user avatar
  • 9,712
2 votes
1 answer

faster alternatives to .apply() in pandas

I am trying to speed up the process of applying a custom function to columns in a data frame. I have found that this: b = b.apply(lambda x: 'not_ticker' if x is None else x) b = b.apply(lambda x: x if ...
rvorse's user avatar
  • 21
0 votes
2 answers

How to Compare Multiple Columns, and Produce Values in single New Column , Using Apply Function in Pandas

Using the apply function in pandas, I want to compare multiple columns in a DataFrame to see if their values are higher or lower than a numerical value. Then, based on the result of the condition, if ...
Calculate's user avatar
  • 343
1 vote
0 answers

Python Type Error: must be a hashable

I have a grouped dataset (by groupby) called level_temp_grouped and I want to apply a function to two boolean columns namely, up and fill_cand, for each group. level_temp['neighbor'] = ...
Ozzie's user avatar
  • 11
0 votes
1 answer

Count values using groupby function and using apply function at the same time

I'm trying to count the occurrences of grouped values and write the values in a column using the apply and groupby functions on a dataframe. I have the following data frame: df = pd.DataFrame({'colA': ['name1', '...
plategt's user avatar
  • 35
1 vote
2 answers

How can I apply multiple functions involving multiple columns of a pandas dataframe with groupby?

Considering the following dataframe: id cat date max score 1 s1 A 12/06 9 5.4 2 s1 B 12/06 10 5.4 3 s2 C 11/04 13 4.2 4 s2 D 11/04 28 10 5 s3 E 08/02 16 5.4 5 s3 F 08/02 6 5.4 I want to group ...
Josu16's user avatar
  • 103
0 votes
1 answer

Using Panda's Apply Function to loop through List of Coordinates (csv) in Google Places API query with Python

For a project, I have a csv file with 60 coordinates and the radius (for each centroid of a city district) I want to get my Google Maps results of. Aim is to loop through the coordinates and the ...
marinade's user avatar
0 votes
2 answers

Pandas df.apply function returns None [closed]

What I'm trying to do: Pass a column through a regex search in order to return that will be added to another column How: By writing a function with simple if-else clauses: def category(series): ...
yd132's user avatar
  • 13
0 votes
2 answers

Keep NaN groups when using GroupBy apply

I'm looking to keep the structure when using apply on a GroupBy object where some groups are NaN. Using dropna=False does not appear to help, NaN groups are still lost with apply. mux = pd.MultiIndex....
misantroop's user avatar
  • 2,545
1 vote
1 answer

How to iterate over an array using a lambda function with pandas apply

I have the following dataset: 0 1 2 0 2.0 2.0 4 0 1.0 1.0 2 0 1.0 1.0 3 3 1.0 1.0 5 4 1.0 1.0 2 5 1.0 NaN 1 6 NaN 1.0 1 ...
user16470918's user avatar
0 votes
1 answer

Issues Converting Python Code Block to Function

There is a block of code I use regularly in my analysis to standardize the description of the types of devices used by customers to access an internet provider's services. The block of code is as ...
Gideone's user avatar
1 vote
0 answers

how does DataFrameGroupBy.apply handle large dataframes with duplicate index in pandas?

Suppose that we have a large dataframe with duplicate index, # IPython In [1]: import pandas as pd In [2]: from numpy.random import randint In [3]: df = pd.DataFrame({'a': randint(1, 10, 10000)}, ...
Vvvvvv's user avatar
  • 164
0 votes
1 answer

How to make decimal part rounding filter, using python pandas Dataframe apply method

I want to make decimal filter with pandas Dataframe. Filter will ceiling and flooring their decimal part. Like this threshold is 0.3 and 0.7 0.75 -> 1 1.99 -> 2 9.13 -> 9 326.2 -> 326 ...
Luckydipper's user avatar
0 votes
1 answer

Enhancing performance of pandas groupby&apply

These days I've been stuck on the problem of speeding up groupby&apply. Here is the code: dat = dat.groupby(['glass_id','label','step'])['equip'].apply(lambda x:'_'.join(sorted(list(x)))).reset_index() ...
Beytab's user avatar
  • 11
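
A hedged sketch of one direction for the question above: sort the frame once up front, then aggregate with the built-in `'_'.join` instead of a per-group lambda. The sample rows are invented for illustration, and whether this is actually faster on the asker's data is an assumption that would need timing.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the excerpt.
dat = pd.DataFrame({
    "glass_id": ["g1", "g1", "g1", "g2"],
    "label": [0, 0, 0, 1],
    "step": ["s1", "s1", "s2", "s1"],
    "equip": ["b", "a", "c", "d"],
})
keys = ["glass_id", "label", "step"]

# Pattern from the question: sort inside a Python lambda for every group.
original = (
    dat.groupby(keys)["equip"]
    .apply(lambda x: "_".join(sorted(list(x))))
    .reset_index()
)

# Alternative: one global sort, then join. groupby keeps the row order
# within each group, so the joined strings come out sorted.
faster = (
    dat.sort_values("equip")
    .groupby(keys)["equip"]
    .agg("_".join)
    .reset_index()
)

print(faster)
```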
0 votes
1 answer

Problems with Dataframe.apply() in combine phase

Issue I am trying to use DataFrame.apply() to add new columns to a dataframe. The number of columns being added is dependent on each row of the original dataframe. There is overlap between the columns ...
Peter's user avatar
  • 83
1 vote
2 answers

Python pandas dataframe apply result of function to multiple columns where NaN

I have a dataframe with three columns and a function that calculates the values of column y and z given the value of column x. I need to only calculate the values if they are missing NaN. def ...
Inthu's user avatar
  • 1,029
0 votes
1 answer

Make apply return two series

Say I have the following dataframe id | dict_col ---+--------- 1 {"age":[1,2],"name":["john","doe"]} 2 {"age":[3,4],"name":["foo&...
CutePoison's user avatar
  • 5,305
1 vote
1 answer

Pandas, apply simple function to NaN returns value instead of NaN?

import pandas as pd import numpy as np pd.DataFrame( {'a':[0,1,2,3], 'b':[np.nan, np.nan, np.nan,3]} ).apply(lambda x: x> 1) returns me False for the column b, whereas I would like to get ...
kakk11's user avatar
  • 918
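
For the question above, a minimal sketch of one way to keep missing values missing: comparisons against NaN evaluate to False, so the result can be masked afterwards with `where` and `notna`. The names `compared` and `result` are illustrative, and this is one possible approach rather than the accepted answer.

```python
import numpy as np
import pandas as pd

# Frame from the question excerpt.
df = pd.DataFrame({"a": [0, 1, 2, 3], "b": [np.nan, np.nan, np.nan, 3]})

# NaN > 1 evaluates to False, which is the behaviour described in the question.
compared = df > 1

# Keep the comparison only where the input was not missing; elsewhere use NaN.
result = compared.where(df.notna())

print(result)
```

The masked columns no longer have a plain boolean dtype, because they now hold both truth values and NaN.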