Newest 'python+dataframe' Questions

2 votes

0 answers

23 views

streamlit update df interactively

I want to update a df interactively. The user selects a row, and then I want the displayed df to "disappear" so that only the leftover options stay visible. In the original problem, ...

Claudia Behnke

21

asked yesterday

0 votes

2 answers

34 views

Cannot compare tz-naive and tz-aware timestamps

I'm getting the error below: Cannot compare tz-naive and tz-aware timestamps. How can I convert the dates to fix the issue? The error appears at the end of the code below. from datetime import datetime, ...

lpca

5

asked yesterday
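
A minimal sketch of the usual cause, assuming one side of the comparison is a tz-naive datetime column and the other is a tz-aware timestamp. The column name `when`, the sample dates, and the `cutoff` value are hypothetical and not taken from the question's code:

```python
import pandas as pd

# Hypothetical data: a tz-naive column and a tz-aware (UTC) cutoff.
df = pd.DataFrame({"when": pd.to_datetime(["2024-12-01 10:00", "2024-12-15 10:00"])})
cutoff = pd.Timestamp("2024-12-10", tz="UTC")

# Option 1: declare the naive values to be UTC so both sides are tz-aware.
aware = df["when"].dt.tz_localize("UTC")
after_cutoff_aware = aware > cutoff

# Option 2: strip the timezone from the aware side so both sides are tz-naive.
after_cutoff_naive = df["when"] > cutoff.tz_localize(None)

print(after_cutoff_aware.tolist(), after_cutoff_naive.tolist())
```

Which option is appropriate depends on whether the naive values really are in the timezone you localize them to.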

0 votes

0 answers

95 views

How to efficiently make a large matrix of 1s and 0s

I have two numpy arrays x and y of same length, and I am trying to make a square matrix A such that the (i,j) entry of the matrix will contain a 1 if a certain relationship holds between x[i], x[j], y[...

JLB

109

asked yesterday
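
The excerpt is cut off before the relationship is stated, so the sketch below uses a made-up condition (`x[i] < x[j]` and `y[i] < y[j]`) only to illustrate building such a matrix with NumPy broadcasting instead of a double loop:

```python
import numpy as np

x = np.array([1, 3, 2])
y = np.array([5, 4, 6])

# Hypothetical relationship: A[i, j] = 1 when x[i] < x[j] and y[i] < y[j].
# x[:, None] has shape (n, 1) and x[None, :] has shape (1, n), so each
# comparison broadcasts to an (n, n) boolean array.
A = ((x[:, None] < x[None, :]) & (y[:, None] < y[None, :])).astype(np.uint8)

print(A)
```

Storing the result as `uint8` (or keeping it boolean) uses one byte per entry, which matters when the arrays are long.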

1 vote

0 answers

37 views

Snowflake - Error while creating Temp View from snowpark dataframe

Hope you are all doing well. I am facing a weird issue in Snowpark (Python) while creating a temp view from Dataframe. I have searched online and while I have had hits, there is no proper solution. ...

rainingdistros

627

asked yesterday

0 votes

0 answers

31 views

problems when using Dask in a Dataframe in Python

A newbie here using parallel computing in Python. I have ~80 huge CSV files (32 GB each) that I need to process in Python to retrieve some rows from them. The file structure is 'Barra', 'D1', 'D2','D3'...

Eduardo

1

asked 2 days ago

0 votes

0 answers

19 views

plot pre-computed mean and confidence intervals for two types of firms (python)

I want to plot two time-series lines with mean and CI each for low and high diversity firms (variable div_ind) throughout 8 calendar years (variable calyr). May I know how to do it? Online sources ...

Jaevapple

9

asked Dec 14 at 16:51

-2 votes

0 answers

40 views

How do I get rid of '0' in index column?

I try to rename the Index column to 'idx' and get rid of 0 using this code: df1.index.rename(name='idx', inplace=True) However, I end up with the second dataframe as below. It results in messing up ...

Hưng Trần

7

asked Dec 14 at 9:12

1 vote

1 answer

48 views

merge several columns of the same data into one

I've created a dataframe that adds data from several sources. Here is an example subset: index CompanyName Source1site Source2site Source3site City 1 Comp1 web1.com Nan ...

Anas

43

asked Dec 13 at 21:38

2 votes

2 answers

73 views

How to use numpy.where in a pipe function for pandas dataframe groupby?

Here is a script to simulate the issue I am facing: import pandas as pd import numpy as np data = { 'a':[1,2,1,1,2,1,1], 'b':[10,40,20,10,40,10,20], 'c':[0.3, 0.2, 0.6, 0.4, 0....

learner

655

asked Dec 13 at 16:56

1 vote

2 answers

63 views

Update Pandas DataFrame slice row-wise using dictionary

Question I am updating the values in a slice of a pandas.DataFrame by row such that each row of the slice has a unique value. I am using pandas version 2.2.3. I have found an approach that seems to work ...

Ouroboroski

171

asked Dec 13 at 15:23

0 votes

0 answers

24 views

python df astype doesn't work on object from csv_read

I have a function that reads a CSV into a df. The CSV is quite large, but it's a combination of strings (categories) and numbers in different columns: import pandas as pd df_temp=pd.read_csv('somefile....

Ohad

69

asked Dec 13 at 10:57

0 votes

0 answers

31 views

AttributeError: 'numpy.ndarray' object has no attribute 'categories'

Modin DataFrame Merge Issue After dropna on Categorical Column: I'm encountering an issue when using Modin to merge DataFrames that contain categorical columns. The issue arose after I performed a ...

Sumukha G C

13

asked Dec 13 at 10:37

0 votes

1 answer

42 views

Indices mismatch during merge in pandas

I am trying to merge two dataframes in Python, pandas, df1 and df2. I am trying to merge them on Column1, and then assign value of Column2 from df2 to df1. This is my code: df1 = df1.reset_index() ...

melca

13

asked Dec 13 at 10:18

0 votes

3 answers

54 views

How to separate multiple tickers into individual dataframes with yfinance downloaded data

I'm trying to download stock data information using yfinance. Currently, I can successfully download a single ticker using yf.download which returns a dataframe with information I can use. This API ...

Ryanc88

182

asked Dec 13 at 0:21

0 votes

1 answer

13 views

QTableView's ComboBox Delegate model is not synchronizing with the PandasModel

I've got a QTableView that uses a combobox delegate in 2 of the table's columns. The selected item from the combobox displays in the TableView correctly if no columns are sorted. When a column is ...

Tadpole

15

asked Dec 12 at 22:42

-2 votes

0 answers

14 views

How to impute OPEN_CLS_STS based on values in DT_CLS in Python [duplicate]

I'm trying to impute the OPEN_CLS_STS based on the values in DT_CLS. IF DT_CLS has a date populated then OPEN_CLS_STS should have a value 'C'. Otherwise OPEN_CLS_STS should have a value 'O'. I tried ...

Derekhe1988

1

asked Dec 12 at 17:24
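
The rule described in the excerpt can be sketched with `numpy.where`, assuming `DT_CLS` is a datetime column in which missing dates are `NaT`. The sample values are invented:

```python
import numpy as np
import pandas as pd

# Hypothetical sample: the second row has no closing date.
df = pd.DataFrame({"DT_CLS": pd.to_datetime(["2024-01-05", None, "2024-03-10"])})

# 'C' where a closing date is populated, 'O' otherwise.
df["OPEN_CLS_STS"] = np.where(df["DT_CLS"].notna(), "C", "O")

print(df)
```

If `DT_CLS` is stored as strings, blank strings would need to be converted to missing values first, for example with `pd.to_datetime(..., errors="coerce")`.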

-1 votes

1 answer

48 views

How to impute OPEN_CLS_STS based on values in DT_CLS [duplicate]

I'm trying to impute the OPEN_CLS_STS based on the values in DT_CLS. IF DT_CLS has a date populated then OPEN_CLS_STS should have a value 'C'. Otherwise OPEN_CLS_STS should have a value 'O'. I tried ...

Derekhe1988

1

asked Dec 12 at 16:44

0 votes

0 answers

25 views

Creating separate groups in a dataframe when column values repeat [duplicate]

I have a dataframe with numbers formatted as follows: df = pd.DataFrame({"ColumnA": [1,2,3,4,5,6,7,8,9,10], "ColumnB": [1,3,5,6,4,7,5,4,1,2], "ColumnC": [0,1,1,2,0,2,1,1,...

zard

1

asked Dec 12 at 15:04

0 votes

1 answer

72 views

Manipulation of a Pandas dataframe most time- and memory-efficiently

Please imagine I have a dataframe like this: df = pd.DataFrame(index=pd.Index(['1', '1', '2', '2'], name='from'), columns=['to'], data= ['2', '2', '4', '5']) df: Now, I would like to calculate a ...

Saeed

2,078

asked Dec 12 at 14:36

1 vote

2 answers

40 views

Pandas Dataframe Multiindex - Calculate Mean and add additional column to each level of the index

Given the following dataframe: Year 2024 2023 2022 Header N Result SD N Result SD N Result SD Vendor A 5 20 3 5 22 4 1 21 3 B 4 25 2 ...

Tumas04

63

asked Dec 12 at 12:05

-1 votes

0 answers

50 views

My Exponential Moving Average calculations are still somehow wrong?

Where am I going wrong... Here is my Python code which interacts with the MetaTrader5 API. import numpy as np import MetaTrader5 as mt5 import pandas as pd from sklearn.preprocessing import ...

RHO

49

asked Dec 11 at 21:23

2 votes

1 answer

27 views

python dataframe slicing by row number

all Python experts, I'm a Python newbie, stuck with a problem which may look very simple to you. Say I have a data frame of 100 rows, how can I split it into 5 sub-frames, each of which contains the ...

Jasper

85

asked Dec 11 at 20:18
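
The excerpt is cut off before it says what each sub-frame should contain. One possible reading is five consecutive blocks of 20 rows, which positional slicing with `iloc` handles. The column name `value` is hypothetical:

```python
import pandas as pd

df = pd.DataFrame({"value": range(100)})

n_parts = 5
size = len(df) // n_parts  # 20 rows each; assumes the length divides evenly

# Positional slices: rows 0-19, 20-39, ..., 80-99.
parts = [df.iloc[i * size:(i + 1) * size] for i in range(n_parts)]

for part in parts:
    print(part.index[0], part.index[-1], len(part))
```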

0 votes

0 answers

46 views

I cannot get all data to export to CSV

# Collect batting stats for the 2022, 2023, and 2024 seasons try: print("Collecting batting stats from 2022 to 2024...") batting_data = batting_stats(2021, 2024, league="all"...

Chad Broussard

1

asked Dec 11 at 19:16

1 vote

1 answer

40 views

Comparing empty dataframes

I have a function, extract_redundant_values, to extract redundant rows from a pandas dataframe. I am testing it by running on in_df to generate out_df. I am then comparing this against my expected ...

Tim Kirkwood

706

asked Dec 11 at 18:50
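
A common stumbling block here is that two empty frames can have the same columns but different dtypes, so a strict comparison treats them as different. The helper below, `frames_match`, is a hypothetical illustration and not the asker's function. It assumes that matching column labels are enough for two empty frames to count as equal:

```python
import pandas as pd

def frames_match(a: pd.DataFrame, b: pd.DataFrame) -> bool:
    """Hypothetical helper: empty frames are equal when their column labels match."""
    if a.empty and b.empty:
        return list(a.columns) == list(b.columns)
    # Non-empty frames fall back to the strict pandas comparison.
    return a.equals(b)

in_df = pd.DataFrame({"a": [1, 2, 3]})
out_df = in_df[in_df["a"] > 10]          # empty, column keeps its int64 dtype
expected = pd.DataFrame(columns=["a"])   # empty, column has object dtype

print(frames_match(out_df, expected))
```

Whether ignoring dtypes is acceptable depends on what the test is meant to guarantee.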

0 votes

1 answer

51 views

loop over date range and appending new values to a new data frame

I wish to loop each row of the data frame below over each date of the date range below, check the following condition and return the current date of the date range in a new data frame with all columns we have ...

lpca

5

asked Dec 11 at 17:30

1 vote

1 answer

60 views

Fill in rows to dataframe based on another dataframe

I have 2 dataframes that look like this: import pandas as pd data = {'QuarterYear': ["Q3 2023", "Q4 2023", "Q1 2024", 'Q2 2024', "Q3 2024", "Q4 2024"]...

TIC-FLY

173

asked Dec 11 at 10:07

1 vote

1 answer

49 views

Alternate background colors in styled pandas df that also apply to MultiIndex in python pandas

SETUP I have the following df: import pandas as pd import numpy as np arrays = [ np.array(["fruit", "fruit", "fruit","vegetable", "vegetable", "...

bismo

1,439

asked Dec 10 at 20:19

1 vote

1 answer

35 views

How to style all cells in a row of a specific MultiIndex value in pandas

SETUP I have the following df: import pandas as pd import numpy as np arrays = [ np.array(["fruit", "fruit", "fruit","vegetable", "vegetable", "...

bismo

1,439

asked Dec 10 at 16:38

1 vote

1 answer

39 views

Pyspark computation time increases with less data

I'm faced with a problem where I have to iterate the same computations on each row of data until they converge. My train of thought was to remove the converged rows after each iteration so the ...

Søren Jensen

163

asked Dec 10 at 11:48

-1 votes

0 answers

40 views

XML to Pandas dataFrame [closed]

0 A 51 non-null object 1 B 51 non-null object 2 C 51 non-null object 3 D 45 non-null object This is the info of the dataframe. It is fine when I just return it ...

Debasish Kundu

1

asked Dec 9 at 18:33

0 votes

0 answers

30 views

How to transform nested data to be used with a tabular learning network

I have a dataframe containing measurements and errors with their corresponding status and counter, and I want to feed it into e.g. TabNet. Raw dataframe protocolid time measurement_1 ...

Max_Och

1

asked Dec 9 at 17:21

0 votes

0 answers

33 views

Data retrieving and SQL database update

I'm trying to retrieve some data from an API and save them to a local database I created. All data come from Google Ads campaigns, and I need to make two separate calls because of their docs, but that'...

Davide

454

asked Dec 9 at 9:51

0 votes

0 answers

31 views

Why am I getting an error from: sns.lineplot(x=anomaly_df['Date'], y=scaler.inverse_transform(anomaly_df['Close/Last']))

import numpy as np from keras.models import Sequential from keras.layers import LSTM, Input, Dropout from keras.layers import Dense from keras.layers import RepeatVector from keras.layers import ...

Asiedu

1

asked Dec 9 at 9:32

0 votes

0 answers

17 views

Trying to iterate over different teams in mean data from a larger dataset [duplicate]

Basically after the mean data for the teams home and away games is taken, I want to plot multiple graphs for each team in one loop, essentially, in the code below, where Arsenal is in quotes in the ...

Daragh

1

asked Dec 9 at 9:31

1 vote

2 answers

78 views

How can I clean a year column with messy values?

I have a project I'm working on for a data analysis course, where we pick a data set and go through the steps of cleaning and exploring the data with a question to answer in mind. I want to be able to ...

Jubilbee Draws

13

asked Dec 7 at 21:07
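
The excerpt does not show what the messy values look like, so the sample strings below are invented. Under that assumption, one way to pull a four-digit year out of free-form text is `Series.str.extract` followed by `pd.to_numeric`:

```python
import pandas as pd

# Hypothetical messy values; the real column may look different.
years = pd.Series(["1999", "c. 2005", "2010-2012", "unknown", None])

# Take the first run of four digits; rows without one become missing.
extracted = years.str.extract(r"(\d{4})", expand=False)
clean = pd.to_numeric(extracted, errors="coerce").astype("Int64")

print(clean)
```

The nullable `Int64` dtype keeps the years as integers while still allowing missing entries.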

0 votes

3 answers

81 views

Pandas dataframe reshape with columns name [closed]

I have a dataframe like this: >>> df TYPE A B C D 0 IN 550 350 600 360 1 OUT 340 270 420 190 I want to reshape it to this shape: AIN AOUT BIN BOUT CIN COUT ...

Sun Jar

163

asked Dec 7 at 14:36
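
The target layout is truncated in the excerpt. The sketch below assumes it continues as a single row with columns `AIN, AOUT, BIN, BOUT, CIN, COUT, DIN, DOUT`. It melts the frame to long form, joins the column letter with `TYPE`, and transposes back to one row:

```python
import pandas as pd

df = pd.DataFrame(
    {"TYPE": ["IN", "OUT"], "A": [550, 340], "B": [350, 270], "C": [600, 420], "D": [360, 190]}
)

# Long form: one row per (TYPE, original column) pair, ordered A, A, B, B, ...
long = df.melt(id_vars="TYPE", var_name="col", value_name="value")

# New column label = original column name + TYPE, e.g. "A" + "IN" -> "AIN".
long["name"] = long["col"] + long["TYPE"]

# Back to a single wide row.
wide = long.set_index("name")["value"].to_frame().T.reset_index(drop=True)
wide.columns.name = None

print(wide)
```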

2 votes

1 answer

57 views

Dropping duplicates by column in PySpark

I have a PySpark dataframe like this but with a lot more data: user_id event_date 123 '2024-01-01 14:45:12.00' 123 '2024-01-02 14:45:12.00' 456 '2024-01-01 14:45:12.00' 456 '2024-03-01 14:45:12....

Myakotka247

23

asked Dec 6 at 10:38

0 votes

0 answers

34 views

Create a new line for comma separated values in pandas column - I don't want to add new rows, I want to have same rows in output [duplicate]

I have a dataframe like this, df col1 col2 1 'abc,pqr' 2 'ghv' 3 'mrr, jig' Now I want to create a new line for each comma separated values in col2, so the output would look ...

Kallol

2,189

asked Dec 6 at 9:31
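
Reading "new line" as a newline character inside the same cell (an assumption, since the expected output is cut off), a regex replace on the column keeps the row count unchanged:

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2, 3], "col2": ["abc,pqr", "ghv", "mrr, jig"]})

# Replace each comma (and any spaces after it) with a newline; no rows are added.
df["col2"] = df["col2"].str.replace(r",\s*", "\n", regex=True)

print(df["col2"].tolist())
```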

0 votes

1 answer

49 views

How can I change a column data type in pandas without creating null values in the whole column in my dataframe

I have been getting null values when trying to convert a column with non-numeric values to a numeric type. I have been using the line of code below to change my column data ...

Samuel Sepeku

1

asked Dec 6 at 9:08

0 votes

0 answers

49 views

How to Increase Precision of Decimal Points in Python DataFrames? [closed]

I am developing a system in Python that replicates another written in LabWindows. A part of the design involves calculating the Periodogram, which returns a decimal array. I then add this array to a ...

S N B

61

asked Dec 6 at 5:05

0 votes

0 answers

28 views

Pandas DataFrame uses more memory than it claimed

My program is very simple. I run it in Jupyter Notebook. It loads data from MongoDB. I tried to store the data as pandas.DataFrame at first. import pandas as pd import pymongo mongo = pymongo....

SerSmile

11

asked Dec 6 at 3:53

0 votes

3 answers

60 views

Add columns to dataframe from a dictionary

There are many answers out there to this question, but I couldn't find one that applies to my case. I have a dataframe that contains ID's: df = pd.DataFrame({"id": [0, 1, 2, 3, 4]}) Now, I ...

mrgou

2,418

asked Dec 5 at 16:53

0 votes

1 answer

45 views

How to check pyspark dataframe column for incorrect value type using pytest? [closed]

I am trying to write a test to see if the spark dataframe has records with incorrect value type, but I'm stuck. There is the dataframe: schema1 = StructType( [ StructField("id_key"...

user28640934

9

asked Dec 5 at 12:06

0 votes

0 answers

23 views

Reassigning pandas columns in chained .assign() gives incorrect values [duplicate]

I often follow the convention (for better or worse) of loading data and preprocessing manipulations in a single line of chained pandas commands. In one such manipulation, I need to multiply a set of ...

Patrick

1

asked Dec 4 at 18:13

0 votes

2 answers

67 views

How to convert string scientific notation to float within a txt file

I have code in a .txt file that has scientific notation values stored as strings, and I am trying to convert them to floats so that I can perform calculations on them. However, when I try to attempt ...

n00bcoder_24

13

asked Dec 4 at 16:36
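
Python's built-in `float` already parses scientific notation, and `pd.to_numeric` does the same for a whole column. The sample strings below are hypothetical, and the question's file layout is not shown in the excerpt:

```python
import pandas as pd

# Hypothetical values as they might be read from the text file.
raw = ["1.5e-3", "2E+04", "not a number"]

# A single well-formed string converts directly.
single = float(raw[0])

# For many values, coerce anything unparseable to NaN instead of raising.
values = pd.to_numeric(pd.Series(raw), errors="coerce")

print(single)
print(values)
```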

-1 votes

0 answers

51 views

Pandas read_excel is throwing an issue related to datetime conversion while reading an .xlsx or .xls file, but file doesn’t have any datetime columns

I am facing an issue with the code below. I am trying to read .xlsx as well as .xls files. df = pd.read_excel(filepath,sheet_name = "Package ID Informatio", header=hd, dtype=str) Code is running well ...

Tejaswini Jadhav

11

asked Dec 4 at 15:58

1 vote

2 answers

78 views

Pandas dataframe - combine cell values as strings [duplicate]

I have a dataframe: Email | Col1 | Col2 | Col3 | Name -------------------------------------------------------------------- [email protected] | CellStr11 | 1.4 | CellStr13 |...

badbadllama

67

asked Dec 3 at 15:51

1 vote

2 answers

42 views

Pandas dataframe - finding row comparing two cell values

I have a dataframe: Email | ... | Name -------------------------------------- [email protected] | ... | John Cena [email protected] | ... | John Cena I need to find a row, that ...

badbadllama

67

asked Dec 3 at 11:56

1 vote

1 answer

37 views

Filter Pandas DataFrame when all IDs are blank [duplicate]

This is how I am populating my DataFrame: import pandas as pd data = {'ID1': ['BBG01Q69DW37', 'BBG01Q69DW37','BBG01Q69TEST','BBG01Q69TES1'], 'ID2': ['YU3384903', 'YU3384903','','YU338TES1'], ...

Sachin

63

asked Dec 2 at 22:54
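
The sample data in the excerpt is truncated, so the frame below is a shortened, partly invented stand-in with only two ID columns. Assuming "blank" means an empty string, rows where every ID column is blank can be flagged with `eq("").all(axis=1)`:

```python
import pandas as pd

# Hypothetical stand-in for the question's data.
df = pd.DataFrame(
    {"ID1": ["BBG01Q69DW37", "", "BBG01Q69TEST"], "ID2": ["YU3384903", "", ""]}
)
id_cols = ["ID1", "ID2"]

# True for rows in which every ID column is an empty string.
all_blank = df[id_cols].eq("").all(axis=1)

kept = df[~all_blank]      # rows with at least one ID
dropped = df[all_blank]    # rows with no ID at all

print(kept)
```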

-2 votes

0 answers

26 views

Errors reading csv file from different URLs [duplicate]

I cannot figure out why the same approach in pandas cannot be used to read the CSV file of the two following URLs. import pandas as pd url1 = "https://data.ontario.ca/dataset/a2dfa674-a173-45b3-...

Trung Nguyen

103

asked Dec 2 at 21:17
