All Questions

2 votes
0 answers
23 views

streamlit update df interactively

I want to update a df interactively. The user is selecting a row, and then I want the displayed df to "disappear" and only the leftover options should stay visible. In the original problem, ...
Claudia Behnke's user avatar
0 votes
2 answers
34 views

Cannot compare tz-naive and tz-aware timestamps

I'm getting the error below: Cannot compare tz-naive and tz-aware timestamps. How can I convert dates to fix the issue? The error appears at the end of the code below. from datetime import datetime, ...
lpca's user avatar
  • 5
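
A minimal sketch of the usual cause of this error and two common ways around it. The timestamps and the choice of UTC are assumptions for illustration; the asker's actual data is not shown in the excerpt.

```python
import pandas as pd

# Hypothetical values: one timestamp without a timezone, one with UTC.
naive = pd.Timestamp("2024-01-01 12:00")
aware = pd.Timestamp("2024-01-01 12:00", tz="UTC")

# Ordering comparisons between the two kinds raise TypeError.
try:
    naive < aware
    raised = False
except TypeError:
    raised = True

# Option 1: attach a timezone to the naive value
# (assumes the naive value was meant as UTC).
localized = naive.tz_localize("UTC")

# Option 2: drop the timezone from the aware value, keeping the wall time.
stripped = aware.tz_localize(None)

print(raised, localized == aware, stripped == naive)
```

For whole columns, the equivalent calls are `Series.dt.tz_localize(...)` and `Series.dt.tz_convert(...)`; which side to convert depends on what the naive values were meant to represent.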
0 votes
0 answers
95 views

How to efficiently make a large matrix of 1s and 0s

I have two numpy arrays x and y of same length, and I am trying to make a square matrix A such that the (i,j) entry of the matrix will contain a 1 if a certain relationship holds between x[i], x[j], y[...
JLB's user avatar
  • 109
1 vote
0 answers
37 views

Snowflake - Error while creating Temp View from snowpark dataframe

Hope you are all doing well. I am facing a weird issue in Snowpark (Python) while creating a temp view from Dataframe. I have searched online and while I have had hits, there is no proper solution. ...
rainingdistros's user avatar
0 votes
0 answers
31 views

problems when using Dask in a Dataframe in Python

A newbie here using parallel computing in Python. I have ~80 huge CSV files (32 GB each) that I need to process in Python to retrieve some rows from them. The file structure is 'Barra', 'D1', 'D2','D3'...
Eduardo's user avatar
0 votes
0 answers
19 views

plot pre-computed mean and confidence intervals for two types of firms (python)

I want to plot two time-series lines with mean and CI each for low and high diversity firms (variable div_ind) throughout 8 calendar years (variable calyr). May I know how to do it? Online sources ...
Jaevapple's user avatar
-2 votes
0 answers
40 views

How do I Get rid of '0' in index column?

I try to rename the Index column to 'idx' and get rid of 0 using this code: df1.index.rename(name='idx', inplace=True) However, I end up with the second dataframe as below. It results in messing up ...
Hưng Trần's user avatar
1 vote
1 answer
48 views

merge several columns of the same data into one

I've created a dataframe that adds data from several sources. Here is an example subset: index CompanyName Source1site Source2site Source3site City 1 Comp1 web1.com Nan ...
Anas's user avatar
  • 43
2 votes
2 answers
73 views

How to use numpy.where in a pipe function for pandas dataframe groupby?

Here is a script to simulate the issue I am facing: import pandas as pd import numpy as np data = { 'a':[1,2,1,1,2,1,1], 'b':[10,40,20,10,40,10,20], 'c':[0.3, 0.2, 0.6, 0.4, 0....
learner's user avatar
  • 655
1 vote
2 answers
63 views

Update Pandas DataFrame slice row-wise using dictionary

Question I am updating the values in a slice of a pandas.DataFrame by row such that each row of the slice has unique value. I am using pandas version 2.2.3. I have found an approach that seems to work ...
Ouroboroski's user avatar
0 votes
0 answers
24 views

python df astype doesn't work on object from csv_read

I have a function that reads a csv into a df. the csv is quite large, but it's a combination of strings (categories) and numbers in different columns: import pandas as pd df_temp=pd.read_csv('somefile....
Ohad's user avatar
  • 69
0 votes
0 answers
31 views

AttributeError: 'numpy.ndarray' object has no attribute 'categories'

Modin DataFrame Merge Issue After dropna on Categorical Column: I'm encountering an issue when using Modin to merge DataFrames that contain categorical columns. The issue arose after I performed a ...
Sumukha G C's user avatar
0 votes
1 answer
42 views

Indices mismatch during merge in pandas

I am trying to merge two dataframes in Python, pandas, df1 and df2. I am trying to merge them on Column1, and then assign value of Column2 from df2 to df1. This is my code: df1 = df1.reset_index() ...
melca's user avatar
  • 13
0 votes
3 answers
54 views

How to separate multiple tickers into individual dataframes with yfinance downloaded data

I'm trying to download stock data information using yfinance. Currently, I can successfully download a single ticker using yf.download which returns a dataframe with information I can use. This API ...
Ryanc88's user avatar
  • 182
0 votes
1 answer
13 views

QTableView's ComboBox Delegate model is not synchronizing with the PandasModel

I've got a QTableView that uses a combobox delegate in 2 of the table's columns. The selected item from the combobox displays in the TableView correctly if no columns are sorted. When a column is ...
Tadpole's user avatar
  • 15
-2 votes
0 answers
14 views

How to impute OPEN_CLS_STS based on values in DT_CLS in Python [duplicate]

I'm trying to impute the OPEN_CLS_STS based on the values in DT_CLS. IF DT_CLS has a date populated then OPEN_CLS_STS should have a value 'C'. Otherwise OPEN_CLS_STS should have a value 'O'. I tried ...
Derekhe1988's user avatar
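
The rule stated in the excerpt above ('C' when DT_CLS is populated, 'O' otherwise) can be sketched as follows. Only the column names come from the question; the sample dates are hypothetical, and this assumes missing dates are stored as NaT/NaN rather than as empty strings.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the middle row has no close date.
df = pd.DataFrame({"DT_CLS": pd.to_datetime(["2024-01-15", None, "2024-03-02"])})

# 'C' where a close date is populated, 'O' where it is missing.
df["OPEN_CLS_STS"] = np.where(df["DT_CLS"].notna(), "C", "O")

print(df)
```

If blanks are stored as empty strings instead, they would need to be converted to missing values first (for example with `pd.to_datetime(..., errors="coerce")`).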
-1 votes
1 answer
48 views

How to impute OPEN_CLS_STS based on values in DT_CLS [duplicate]

I'm trying to impute the OPEN_CLS_STS based on the values in DT_CLS. IF DT_CLS has a date populated then OPEN_CLS_STS should have a value 'C'. Otherwise OPEN_CLS_STS should have a value 'O'. I tried ...
Derekhe1988's user avatar
0 votes
0 answers
25 views

Creating separate groups in a dataframe when column values repeat [duplicate]

I have a dataframe with numbers formatted as follows: df = pd.DataFrame({"ColumnA": [1,2,3,4,5,6,7,8,9,10], "ColumnB": [1,3,5,6,4,7,5,4,1,2], "ColumnC": [0,1,1,2,0,2,1,1,...
zard's user avatar
  • 1
0 votes
1 answer
72 views

Manipulation of a Pandas dataframe most time- and memory-efficiently

Please imagine I have a dataframe like this: df = pd.DataFrame(index=pd.Index(['1', '1', '2', '2'], name='from'), columns=['to'], data= ['2', '2', '4', '5']) df: Now, I would like to calculate a ...
Saeed's user avatar
  • 2,078
1 vote
2 answers
40 views

Pandas Dataframe Multiindex - Calculate Mean and add additional column to each level of the index

Given the following dataframe: Year 2024 2023 2022 Header N Result SD N Result SD N Result SD Vendor A 5 20 3 5 22 4 1 21 3 B 4 25 2 ...
Tumas04's user avatar
  • 63
-1 votes
0 answers
50 views

My Exponential Moving Average calculations are still somehow wrong?

Where am I going wrong... Here is my Python code which interacts with the MetaTrader5 API. import numpy as np import MetaTrader5 as mt5 import pandas as pd from sklearn.preprocessing import ...
RHO's user avatar
  • 49
2 votes
1 answer
27 views

python dataframe slicing by row number

all Python experts, I'm a Python newbie, stuck with a problem which may look very simple to you. Say I have a data frame of 100 rows, how can I split it into 5 sub-frames, each of which contains the ...
Jasper's user avatar
  • 85
0 votes
0 answers
46 views

I cannot get all data to export to CSV

# Collect batting stats for the 2022, 2023, and 2024 seasons try: print("Collecting batting stats from 2022 to 2024...") batting_data = batting_stats(2021, 2024, league="all&...
Chad Broussard's user avatar
1 vote
1 answer
40 views

Comparing empty dataframes

I have a function, extract_redundant_values, to extract redundant rows from a pandas dataframe. I am testing it by running on in_df to generate out_df. I am then comparing this against my expected ...
Tim Kirkwood's user avatar
0 votes
1 answer
51 views

loop over date range and appending new values to a new data frame

I wish to loop each row of the data frame below over each date of the date range below, check the following condition and return the current date of the date range in a new data frame with all columns we have ...
lpca's user avatar
  • 5
1 vote
1 answer
60 views

Fill in rows to dataframe based on another dataframe

I have 2 dataframes that look like this: import pandas as pd data = {'QuarterYear': ["Q3 2023", "Q4 2023", "Q1 2024", 'Q2 2024', "Q3 2024", "Q4 2024"]...
TIC-FLY's user avatar
  • 173
1 vote
1 answer
49 views

Alternate background colors in styled pandas df that also apply to MultiIndex in python pandas

SETUP I have the following df: import pandas as pd import numpy as np arrays = [ np.array(["fruit", "fruit", "fruit","vegetable", "vegetable", &...
bismo's user avatar
  • 1,439
1 vote
1 answer
35 views

How to style all cells in a row of a specific MultiIndex value in pandas

SETUP I have the following df: import pandas as pd import numpy as np arrays = [ np.array(["fruit", "fruit", "fruit","vegetable", "vegetable", &...
bismo's user avatar
  • 1,439
1 vote
1 answer
39 views

Pyspark computation time increases with less data

I'm faced with a problem where I have to iterate the same computations on each row of data until they converge. My train of thought was to remove the converged rows after each iteration so the ...
Søren Jensen's user avatar
-1 votes
0 answers
40 views

XML to Pandas dataFrame [closed]

0 A 51 non-null object 1 B 51 non-null object 2 C 51 non-null object 3 D 45 non-null object This is the info of the dataframe. It is fine when I just return it ...
Debasish Kundu's user avatar
0 votes
0 answers
30 views

How to transform nested data to be used with a tabular learning network

I have a dataframe containing measurements and errors with their corresponding status and counter, and I want to feed it into e.g. TabNet. Raw dataframe protocolid time measurement_1 ...
Max_Och's user avatar
0 votes
0 answers
33 views

Data retrieving and SQL database update

I'm trying to retrieve some data from an API and save them to a local database I created. All data come from Google Ads campaigns, and I need to make two separate calls because of their docs, but that'...
Davide's user avatar
  • 454
0 votes
0 answers
31 views

why am I am getting an the: sns.lineplot(x=anomaly_df['Date'], y=scaler.inverse_transform(anomaly_df['Close/Last']))

import numpy as np from keras.models import Sequential from keras.layers import LSTM, Input, Dropout from keras.layers import Dense from keras.layers import RepeatVector from keras.layers import ...
Asiedu's user avatar
  • 1
0 votes
0 answers
17 views

Trying to iterate over different teams in mean data from a larger dataset [duplicate]

Basically after the mean data for the teams home and away games is taken, I want to plot multiple graphs for each team in one loop, essentially, in the code below, where Arsenal is in quotes in the ...
Daragh's user avatar
  • 1
1 vote
2 answers
78 views

How can I clean a year column with messy values?

I have a project I'm working on for a data analysis course, where we pick a data set and go through the steps of cleaning and exploring the data with a question to answer in mind. I want to be able to ...
Jubilbee Draws's user avatar
0 votes
3 answers
81 views

Pandas dataframe reshape with columns name [closed]

I have a dataframe like this: >>> df TYPE A B C D 0 IN 550 350 600 360 1 OUT 340 270 420 190 I want to reshape it to this shape: AIN AOUT BIN BOUT CIN COUT ...
Sun Jar's user avatar
  • 163
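
One possible way to get the column layout described in the excerpt above, as a sketch only. The input frame is rebuilt from the values shown; the single-row output shape is an assumption, since the expected result is cut off in the excerpt.

```python
import pandas as pd

# Input rebuilt from the excerpt.
df = pd.DataFrame({
    "TYPE": ["IN", "OUT"],
    "A": [550, 340],
    "B": [350, 270],
    "C": [600, 420],
    "D": [360, 190],
})

# Unstacking gives a Series indexed by (column name, TYPE),
# ordered A/IN, A/OUT, B/IN, B/OUT, ...
s = df.set_index("TYPE").unstack()

# Join the two index levels into flat names such as "AIN", "AOUT".
s.index = [f"{col}{typ}" for col, typ in s.index]

# Turn the Series into a one-row frame (assumed target shape).
wide = s.to_frame().T

print(wide)
```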
2 votes
1 answer
57 views

Dropping duplicates by column in PySpark

I have a PySpark dataframe like this but with a lot more data: user_id event_date 123 '2024-01-01 14:45:12.00' 123 '2024-01-02 14:45:12.00' 456 '2024-01-01 14:45:12.00' 456 '2024-03-01 14:45:12....
Myakotka247's user avatar
0 votes
0 answers
34 views

Create a new line for comma separated values in pandas column - I dont want to add new rows, I want to have same rows in output [duplicate]

I have a dataframe like this, df col1 col2 1 'abc,pqr' 2 'ghv' 3 'mrr, jig' Now I want to create a new line for each comma separated values in col2, so the output would look ...
Kallol's user avatar
  • 2,189
0 votes
1 answer
49 views

How can I change a column data type in pandas without creating null values in the whole column in my dataframe

I have been getting null values when trying to convert a column with non-numeric type values to a column with numeric type values. I have been using the below code line to change my column data ...
Samuel Sepeku's user avatar
0 votes
0 answers
49 views

How to Increase Precision of Decimal Points in Python DataFrames? [closed]

I am developing a system in Python that replicates another written in LabWindows. A part of the design involves calculating the Periodogram, which returns a decimal array. I then add this array to a ...
S N B's user avatar
  • 61
0 votes
0 answers
28 views

Pandas DataFrame uses more memory than it claimed

My program is very simple. I run it in Jupyter Notebook. It loads data from MongoDB. I tried to store the data as pandas.DataFrame at first. import pandas as pd import pymongo mongo = pymongo....
SerSmile's user avatar
0 votes
3 answers
60 views

Add columns to dataframe from a dictionary

There are many answers out there to this question, but I couldn't find one that applies to my case. I have a dataframe that contains ID's: df = pd.DataFrame({"id": [0, 1, 2, 3, 4]}) Now, I ...
mrgou's user avatar
  • 2,418
0 votes
1 answer
45 views

How to check pyspark dataframe column for incorrect value type using pytest? [closed]

I am trying to write a test to see if the spark dataframe has records with incorrect value type, but I'm stuck. There is the dataframe: schema1 = StructType( [ StructField("id_key&...
user28640934's user avatar
0 votes
0 answers
23 views

Reassigning pandas columns in chained .assign() gives incorrect values [duplicate]

I often follow the convention (for better or worse) of loading data and preprocessing manipulations in a single line of chained pandas commands. In one such manipulation, I need to multiply a set of ...
Patrick's user avatar
0 votes
2 answers
67 views

How to convert string scientific notation to float within a txt file

I have code in a .txt file that has scientific notation values stored as strings, and I am trying to convert them to floats so that I can perform calculations on them. However, when I try to attempt ...
n00bcoder_24's user avatar
-1 votes
0 answers
51 views

Pandas read_excel is throwing an issue related to datetime conversion while reading an .xlsx or .xls file, but file doesn’t have any datetime columns

I am facing an issue with the code below. I am trying to read .xlsx as well as .xls files. df = pd.read_excel(filepath,sheet_name = "Package ID Informatio", header=hd, dtype=str) Code is running well ...
Tejaswini Jadhav's user avatar
1 vote
2 answers
78 views

Pandas dataframe - combine cell values as strings [duplicate]

I have a dataframe: Email | Col1 | Col2 | Col3 | Name -------------------------------------------------------------------- [email protected] | CellStr11 | 1.4 | CellStr13 |...
badbadllama's user avatar
1 vote
2 answers
42 views

Pandas dataframe - finding row comparing two cell values

I have a dataframe: Email | ... | Name -------------------------------------- [email protected] | ... | John Cena [email protected] | ... | John Cena I need to find a row, that ...
badbadllama's user avatar
1 vote
1 answer
37 views

Filter Pandas DataFrame when all IDs are blank [duplicate]

This is how I am populating my DataFrame: import pandas as pd data = {'ID1': ['BBG01Q69DW37', 'BBG01Q69DW37','BBG01Q69TEST','BBG01Q69TES1'], 'ID2': ['YU3384903', 'YU3384903','','YU338TES1'], ...
Sachin's user avatar
  • 63
-2 votes
0 answers
26 views

Errors reading csv file from different URLs [duplicate]

I cannot figure out why the same approach in pandas cannot be used to read the CSV file of the two following URLs. import pandas as pd url1 = "https://data.ontario.ca/dataset/a2dfa674-a173-45b3-...
Trung Nguyen's user avatar
