All Questions
78,644 questions
0
votes
2
answers
34
views
Cannot compare tz-naive and tz-aware timestamps
I'm getting the error below:
Cannot compare tz-naive and tz-aware timestamps
How can I convert the dates to fix the issue? The error appears at the end of the code below.
from datetime import datetime, ...
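A minimal sketch of two common ways around this error. The question's own data is elided, so the column name `when`, the sample values, and the assumption that the naive timestamps represent UTC are all hypothetical:

```python
import pandas as pd

# Hypothetical data: the frame from the question is not shown.
df = pd.DataFrame({"when": pd.to_datetime(["2024-01-01 12:00", "2024-06-01 12:00"])})
cutoff = pd.Timestamp("2024-03-01", tz="UTC")  # tz-aware value

# Comparing df["when"] (tz-naive) with cutoff directly raises a TypeError.

# Option 1: make the naive column tz-aware (assumes the values are UTC).
aware = df["when"].dt.tz_localize("UTC")
mask_aware = aware > cutoff

# Option 2: strip the timezone from the aware value instead.
mask_naive = df["when"] > cutoff.tz_localize(None)
```

Which side to convert depends on what the naive values actually mean; localizing to the wrong zone silently shifts the comparison.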
0
votes
0
answers
95
views
How to efficiently make a large matrix of 1s and 0s
I have two numpy arrays x and y of same length, and I am trying to make a square matrix A such that the (i,j) entry of the matrix will contain a 1 if a certain relationship holds between x[i], x[j], y[...
-2
votes
0
answers
40
views
How do I get rid of '0' in index column?
I try to rename the Index column to 'idx' and get rid of 0 using this code:
df1.index.rename(name='idx', inplace=True)
However, I end up with the second dataframe as below. It results in messing up ...
1
vote
1
answer
48
views
merge several columns of the same data into one
I've created a dataframe that adds data from several sources. Here is an example subset:
index CompanyName Source1site Source2site Source3site City
1 Comp1 web1.com Nan ...
2
votes
2
answers
73
views
How to use numpy.where in a pipe function for pandas dataframe groupby?
Here is a script to simulate the issue I am facing:
import pandas as pd
import numpy as np
data = {
'a':[1,2,1,1,2,1,1],
'b':[10,40,20,10,40,10,20],
'c':[0.3, 0.2, 0.6, 0.4, 0....
1
vote
2
answers
63
views
Update Pandas DataFrame slice row-wise using dictionary
Question
I am updating the values in a slice of a pandas.DataFrame by row such that each row of the slice has a unique value. I am using pandas version 2.2.3.
I have found an approach that seems to work ...
0
votes
1
answer
42
views
Indices mismatch during merge in pandas
I am trying to merge two dataframes in Python, pandas, df1 and df2.
I am trying to merge them on Column1, and then assign value of Column2 from df2 to df1.
This is my code:
df1 = df1.reset_index()
...
0
votes
3
answers
54
views
How to separate multiple tickers into individual dataframes with yfinance downloaded data
I'm trying to download stock data information using yfinance. Currently, I can successfully download a single ticker using yf.download which returns a dataframe with information I can use. This API ...
-2
votes
0
answers
14
views
How to impute OPEN_CLS_STS based on values in DT_CLS in Python [duplicate]
I'm trying to impute the OPEN_CLS_STS based on the values in DT_CLS.
If DT_CLS has a date populated then OPEN_CLS_STS should have a value 'C'. Otherwise OPEN_CLS_STS should have a value 'O'.
I tried ...
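The rule described above can be sketched with `numpy.where`. The sample dates are hypothetical, since the question's data is not shown; only the column names `DT_CLS` and `OPEN_CLS_STS` come from the question:

```python
import numpy as np
import pandas as pd

# Hypothetical sample: the second row has no close date.
df = pd.DataFrame({"DT_CLS": pd.to_datetime(["2024-01-15", None, "2024-03-02"])})

# 'C' where a close date is populated, 'O' where it is missing (NaT).
df["OPEN_CLS_STS"] = np.where(df["DT_CLS"].notna(), "C", "O")
```

This assumes missing dates are real nulls; if they are empty strings, they would need converting first (for example with `pd.to_datetime(..., errors="coerce")`).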
-1
votes
1
answer
48
views
How to impute OPEN_CLS_STS based on values in DT_CLS [duplicate]
I'm trying to impute the OPEN_CLS_STS based on the values in DT_CLS.
If DT_CLS has a date populated then OPEN_CLS_STS should have a value 'C'. Otherwise OPEN_CLS_STS should have a value 'O'.
I tried ...
0
votes
0
answers
25
views
Creating separate groups in a dataframe when column values repeat [duplicate]
I have a dataframe with numbers formatted as follows:
df = pd.DataFrame({"ColumnA": [1,2,3,4,5,6,7,8,9,10], "ColumnB": [1,3,5,6,4,7,5,4,1,2], "ColumnC": [0,1,1,2,0,2,1,1,...
0
votes
1
answer
72
views
Manipulation of a Pandas dataframe most time- and memory-efficiently
Please imagine I have a dataframe like this:
df = pd.DataFrame(index=pd.Index(['1', '1', '2', '2'], name='from'), columns=['to'], data= ['2', '2', '4', '5'])
df:
Now, I would like to calculate a ...
1
vote
2
answers
40
views
Pandas Dataframe Multiindex - Calculate Mean and add additional column to each level of the index
Given the following dataframe:
Year 2024 2023 2022
Header N Result SD N Result SD N Result SD
Vendor
A 5 20 3 5 22 4 1 21 3
B 4 25 2 ...
-1
votes
0
answers
50
views
My Exponential Moving Average calculations are still somehow wrong?
Where am I going wrong...
Here is my Python code which interacts with the MetaTrader5 API.
import numpy as np
import MetaTrader5 as mt5
import pandas as pd
from sklearn.preprocessing import ...
1
vote
1
answer
40
views
Comparing empty dataframes
I have a function, extract_redundant_values, to extract redundant rows from a pandas dataframe. I am testing it by running it on in_df to generate out_df. I am then comparing this against my expected ...
0
votes
1
answer
51
views
loop over date range and appending new values to a new data frame
I wish to loop each row of the data frame below over each date of the date range below, check the following condition and return the current date of the date range in a new data frame with all columns we have ...
1
vote
1
answer
60
views
Fill in rows to dataframe based on another dataframe
I have 2 dataframes that look like this:
import pandas as pd
data = {'QuarterYear': ["Q3 2023", "Q4 2023", "Q1 2024", 'Q2 2024', "Q3 2024", "Q4 2024"]...
1
vote
1
answer
49
views
Alternate background colors in styled pandas df that also apply to MultiIndex in python pandas
SETUP
I have the following df:
import pandas as pd
import numpy as np
arrays = [
np.array(["fruit", "fruit", "fruit","vegetable", "vegetable", ...
1
vote
1
answer
35
views
How to style all cells in a row of a specific MultiIndex value in pandas
SETUP
I have the following df:
import pandas as pd
import numpy as np
arrays = [
np.array(["fruit", "fruit", "fruit","vegetable", "vegetable", ...
-1
votes
0
answers
40
views
XML to Pandas dataFrame [closed]
0 A 51 non-null object
1 B 51 non-null object
2 C 51 non-null object
3 D 45 non-null object
This is the info of the dataframe.
It is fine when I just return it ...
0
votes
0
answers
33
views
Data retrieving and SQL database update
I'm trying to retrieve some data from an API and save them to a local database I created. All data come from Google Ads campaigns, and I need to make two separate calls because of their docs, but that'...
0
votes
0
answers
17
views
Trying to iterate over different teams in mean data from a larger dataset [duplicate]
Basically after the mean data for the teams home and away games is taken, I want to plot multiple graphs for each team in one loop, essentially, in the code below, where Arsenal is in quotes in the ...
1
vote
2
answers
78
views
How can I clean a year column with messy values?
I have a project I'm working on for a data analysis course, where we pick a data set and go through the steps of cleaning and exploring the data with a question to answer in mind.
I want to be able to ...
0
votes
3
answers
81
views
Pandas dataframe reshape with columns name [closed]
I have a dataframe like this:
>>> df
TYPE A B C D
0 IN 550 350 600 360
1 OUT 340 270 420 190
I want reshape it to this shape:
AIN AOUT BIN BOUT CIN COUT ...
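One possible way to get that single-row shape, sketched with `melt`. It assumes the truncated target continues the same pattern (`DIN`, `DOUT`); the intermediate names `long`, `col`, `value` and `name` are illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "TYPE": ["IN", "OUT"],
    "A": [550, 340],
    "B": [350, 270],
    "C": [600, 420],
    "D": [360, 190],
})

# Long format: one row per (TYPE, original column) pair, ordered A, A, B, B, ...
long = df.melt(id_vars="TYPE", var_name="col", value_name="value")

# Build the combined header, e.g. "A" + "IN" -> "AIN".
long["name"] = long["col"] + long["TYPE"]

# Turn the long series back into a single-row frame.
wide = long.set_index("name")["value"].to_frame().T.reset_index(drop=True)
wide.columns.name = None
```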
0
votes
0
answers
34
views
Create a new line for comma separated values in pandas column - I don't want to add new rows, I want to have same rows in output [duplicate]
I have a dataframe like this,
df
col1 col2
1 'abc,pqr'
2 'ghv'
3 'mrr, jig'
Now I want to create a new line for each comma-separated value in col2, so the output would look ...
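Since the expected output is truncated, the following is only a sketch of one reading of the request: keep the same rows and replace each comma (plus any trailing spaces) with a newline inside the cell.

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2, 3], "col2": ["abc,pqr", "ghv", "mrr, jig"]})

# Replace each comma and any following whitespace with a newline;
# the number of rows is unchanged.
df["col2"] = df["col2"].str.replace(r",\s*", "\n", regex=True)
```

Whether the line breaks are visible depends on where the frame is rendered; a plain `print(df)` shows them as `\n`.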
0
votes
1
answer
49
views
How can I change a column data type in pandas without creating null values in the whole column in my dataframe
I have been getting null values when trying to convert a column with the non-numeric type values to a column with numeric type values
I have been using the below code line to change my column data ...
0
votes
0
answers
28
views
Pandas DataFrame uses more memory than it claimed
My program is very simple. I run it in Jupyter Notebook. It loads data from MongoDB. I tried to store the data as pandas.DataFrame at first.
import pandas as pd
import pymongo
mongo = pymongo....
0
votes
3
answers
60
views
Add columns to dataframe from a dictionary
There are many answers out there to this question, but I couldn't find one that applies to my case.
I have a dataframe that contains ID's:
df = pd.DataFrame({"id": [0, 1, 2, 3, 4]})
Now, I ...
0
votes
0
answers
23
views
Reassigning pandas columns in chained .assign() gives incorrect values [duplicate]
I often follow the convention (for better or worse) of loading data and preprocessing manipulations in a single line of chained pandas commands. In one such manipulation, I need to multiply a set of ...
-1
votes
0
answers
51
views
Pandas read_excel is throwing an issue related to datetime conversion while reading an .xlsx or .xls file, but file doesn’t have any datetime columns
I am facing an issue with the code below:
I am trying to read .xlsx as well as .xls files.
df = pd.read_excel(filepath,sheet_name = "Package ID Informatio", header=hd, dtype=str)
Code is running well ...
1
vote
2
answers
78
views
Pandas dataframe - combine cell values as strings [duplicate]
I have a dataframe:
Email | Col1 | Col2 | Col3 | Name
--------------------------------------------------------------------
[email protected] | CellStr11 | 1.4 | CellStr13 |...
1
vote
2
answers
42
views
Pandas dataframe - finding row comparing two cell values
I have a dataframe:
Email | ... | Name
--------------------------------------
[email protected] | ... | John Cena
[email protected] | ... | John Cena
I need to find a row, that ...
1
vote
1
answer
37
views
Filter Pandas DataFrame when all IDs are blank [duplicate]
This is how I am populating my DataFrame:
import pandas as pd
data = {'ID1': ['BBG01Q69DW37', 'BBG01Q69DW37','BBG01Q69TEST','BBG01Q69TES1'],
'ID2': ['YU3384903', 'YU3384903','','YU338TES1'],
...
2
votes
2
answers
74
views
How to Write a Pandas DataFrame to CSV With Strings Quoted and Integers/Empty Cells Unaltered Without Adding Escape Characters for Commas?
I am working on a Python script to write a DataFrame to a CSV file. My goal is to:
Enclose all string values in double quotes (").
Keep numeric values unchanged (no quotes).
Leave empty cells as ...
-1
votes
2
answers
68
views
Conditionally color dataframe cells in python terminal
I have a dataframe.
import pandas as pd
data = {
"Name": ['Alice','Bob', 'Sue','Joe','rose','cindy'],
"Age": [11, 6, 3, 16, 21, 8],
"Num": [2, 17, 12, 7, 22, ...
1
vote
3
answers
78
views
How do I transfer data from pandas Dataframe to a Dataframe variable of a different type
So I have a Dataframe of car data which has a price column.
So I want to transfer the data from the price column to a variable which is a Dataframe which has the same data as the original column, but I ...
1
vote
3
answers
86
views
How to access the last 10 rows and first two columns of a DataFrame
I was given a file to practice pandas on, and was asked this question:
Q: Access the last 10 rows and the first two columns of the index dataframe.
So, I tried this code:
df = index[(index.tail(10)) &...
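A short sketch of positional selection with `iloc`, which is one standard way to take the last 10 rows and first two columns. The frame contents here are hypothetical; only the variable name `index` comes from the question:

```python
import pandas as pd

# Hypothetical stand-in for the practice file: 25 rows, 3 columns.
index = pd.DataFrame({"a": range(25), "b": range(25, 50), "c": range(50, 75)})

# Rows: last 10 by position. Columns: first two by position.
subset = index.iloc[-10:, :2]
```

An equivalent chained form is `index.tail(10).iloc[:, :2]`.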
1
vote
2
answers
74
views
Converting a nested json three levels deep to dataframe
I have a json that is three levels deep.
I want to flatten it into a dataframe that has five columns.
id
name
code
level
parent_id
So:
The part I struggle with is that I can extract each nested item,...
1
vote
2
answers
52
views
flattening pandas columns in a non-trivial way
I have a pandas dataframe which looks like the following:
site pay delta over under
phase a a b
ID
D01 London 12.3 10.3 -2.0 0.0 -2.0
D02 ...
-1
votes
0
answers
65
views
Pandas can't convert to JSON
df = pd.DataFrame(list(zip(job_title, company, location, pay)),
columns=['Job Title', 'Company',
'Location', 'Payment'])
df.to_json('Job_Thai.json', ...
1
vote
2
answers
52
views
Perform a binary op on values in a pandas dataframe column by a value in that same column chosen based on a value in another column
Sorry for the mouthful of a title. I think this is best illustrated by an example. Let's say we have an item that has different rarity levels, all of which have different prices in different shops. I want ...
0
votes
2
answers
56
views
Remove rows of one pandas df using another df, but enable "wildcard" behavior when second df is missing values
I have a df of hundreds of thousands of rows, which can contain errors. A team is manually reviewing/identifying, and I'm trying to enable flexible removal of combinations to purge.
There are three ...
0
votes
0
answers
16
views
Use pandas to turn cell value to column header and show count of that value as new value [duplicate]
I have a dataframe df that returns this table
City         State       Indicent
New York     New York    foo
New York     New York    bar
Los Angeles  California  foo
Los Angeles  California  bar
Miami        Florida     foo
Miami        ...
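A sketch of one way to turn the values of a column into headers with counts, using `pd.crosstab`. It uses only the rows visible in the excerpt (the last row is truncated, so it is left out), and the result name `counts` is illustrative:

```python
import pandas as pd

# Only the rows fully visible in the excerpt above.
df = pd.DataFrame({
    "City": ["New York", "New York", "Los Angeles", "Los Angeles", "Miami"],
    "State": ["New York", "New York", "California", "California", "Florida"],
    "Indicent": ["foo", "bar", "foo", "bar", "foo"],
})

# One row per (City, State), one column per distinct Indicent value,
# each cell holding how often that value occurs.
counts = pd.crosstab([df["City"], df["State"]], df["Indicent"])
```

Calling `counts.reset_index()` afterwards gives `City` and `State` back as ordinary columns.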
0
votes
2
answers
45
views
Calculate difference between rows in Pandas dataframe using conditional logic [duplicate]
I am trying to use the pandas.DataFrame.diff function to calculate the difference between rows in a dataframe. The catch is I only want to calculate the difference for certain values using some simple ...
1
vote
1
answer
123
views
Python using oracledb to connect to Oracle database with Pandas DataFrame Error: "pandas only supports SQLAlchemy connectable (engine/connection)"
I am pretty new to Python and even newer to Pandas and am hoping for some guidance
My company has an on-prem DEV Oracle database that I am trying to connect to using Python & Pandas. After some ...
3
votes
2
answers
52
views
Shift column in dataframe without deleting one
Here is my dataframe:
A       B              C
First   row to delete  row to shift
Second  row to delete  row to shift
And I want this output :
A
B
C
First
row to shift
Second
row to shift
I tried this code :
df....
1
vote
1
answer
51
views
How to prevent Pandas to_csv double quoting empty fields in output csv
I currently have a sample python script that reads a csv with double quotes as a text qualifier and removes ascii characters and line feeds in the fields via a dataframe. It then outputs the dataframe ...
0
votes
3
answers
69
views
For loop in dataframe column
I want to iterate through dataframe column, ['Status'], and based on value calculate days since date in column ['Date'] and write to a third column,['Days'].
import pandas as pd
from datetime import ...
0
votes
2
answers
62
views
Why does this Pandas DataFrame column operation fail?
This script works fine with Python 3.11 and Pandas 2.2:
import pandas as pd
df = pd.read_csv(f'test.csv', comment='#')
df['x1'] = df['x1']*8
# df['y1'] = df['y1']*8
print(df)
and prints:
x1 y1
...
0
votes
0
answers
28
views
Combine two dataframes with partially overlapping column values [duplicate]
Hi :) I want to combine two dataframes, I have been looking on this site in the hopes of getting an answer but I still could not figure it out.
My first dataframe looks like this (I put it as table ...