All Questions
249,122 questions
0
votes
0
answers
10
views
Export Python charts excel with grid layout
I'm trying to write a Python script that generates a series of charts from Excel. The Excel file has the price of different cars on each day.
Column 1 is the Date; Column 2 is price for Green cars; ...
0
votes
0
answers
24
views
Creating groups from dataset based on time intervals [duplicate]
I am working on expanding a script I have made to do some reporting for one of our projects. I need to determine length of video, location and a few other attributes. This part works well when there ...
-1
votes
0
answers
41
views
How do I replace text in a word document table using information from an excel file?
I have a template where I am trying to replace certain text with information from excel. The problem is that once I have replaced and saved the file, the conditions I have set no longer apply and all ...
0
votes
0
answers
28
views
Return something from onclick?
I have a function that creates a chart with pandas & matplotlib. In the function there is an onclick handler that creates a table and then shows it with plt.show() upon click.
When the entire ...
0
votes
0
answers
63
views
How to calculate daily weights with certain conditions in Python
I have the following pandas dataframe which represents the consumption of 7 days (day_0 is today, day_-1 is yesterday etc) of 10 people (ids):
import pandas as pd
import numpy as np
df = pd.DataFrame(...
1
vote
0
answers
25
views
Use attribute name as column name for sqlalchemy in pandas read_sql()
I am using pd.read_sql() together with SQLAlchemy. However, it is not using the attribute name of the mapping for the columns in the pandas dataframe, but the original SQL column name:
class DBtable(...
2
votes
2
answers
72
views
Find average rate per group in specific years using groupby transform
I'm trying to find a better/faster way to do this. I have a rather large dataset (~200M rows) with individual dates per row. I want to find the average yearly rate per group from 2018 to 2019. I know ...
0
votes
2
answers
33
views
Cannot compare tz-naive and tz-aware timestamps
I'm finding the error below:
Cannot compare tz-naive and tz-aware timestamps
How can I convert the dates to fix the issue? The error appears at the end of the code below.
from datetime import datetime, ...
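A minimal sketch of two common ways around this error, assuming the mismatch is between a tz-naive datetime column and a tz-aware timestamp. The frame `df`, the column `when`, the `cutoff` value, and the choice of UTC are all hypothetical, since the code in the excerpt is truncated.

```python
import pandas as pd

# Hypothetical data: a tz-naive datetime column.
df = pd.DataFrame({"when": pd.to_datetime(["2024-01-01 12:00", "2024-06-01 12:00"])})

# A tz-aware cutoff; comparing it with the naive column directly raises an error.
cutoff = pd.Timestamp("2024-03-01", tz="UTC")

# Option 1: make the column tz-aware (assumes the naive values represent UTC).
aware = df["when"].dt.tz_localize("UTC")
mask_aware = aware > cutoff

# Option 2: strip the timezone from the aware side instead.
mask_naive = df["when"] > cutoff.tz_localize(None)
```

Which option fits depends on whether the naive values really are in a known timezone; localizing to the wrong zone silently shifts the comparison.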
0
votes
4
answers
81
views
pd.read_csv() not working with parse_dates
I'm using the Netflix Movies and TV Shows dataset to better understand pandas.
The column date_added is in the format: "September 21, 2024" which, as I understand, would be parsed as "%...
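One possible approach, sketched under assumptions: read the column as text and convert it afterwards with an explicit format. The strptime directives `%B` (full month name), `%d` (day of month), and `%Y` (four-digit year) match a string such as "September 21, 2024". The two-row CSV below is made up for illustration, and the `str.strip()` call is only a precaution in case the real data contains stray whitespace.

```python
import pandas as pd
from io import StringIO

# Hypothetical two-row stand-in for the real dataset.
csv_text = 'title,date_added\nA,"September 21, 2024"\nB,"January 5, 2020"\n'

# Read date_added as plain strings first.
df = pd.read_csv(StringIO(csv_text))

# Convert with an explicit format; strip() guards against padded values.
df["date_added"] = pd.to_datetime(df["date_added"].str.strip(), format="%B %d, %Y")
```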
0
votes
0
answers
95
views
How to efficiently make a large matrix of 1s and 0s
I have two numpy arrays x and y of same length, and I am trying to make a square matrix A such that the (i,j) entry of the matrix will contain a 1 if a certain relationship holds between x[i], x[j], y[...
0
votes
0
answers
47
views
Reindexing only valid with uniquely valued index objects
There are a couple of articles about this issue already, but none of them solves my issue.
I have two sets of Python dataframes (df_A1, df_A2 and df_B1, df_B2) and I want to combine the A's together ...
0
votes
2
answers
47
views
Pandas conditional filtering on categorical values
I have the following table in pandas:
Index  Ingredient  Dish
1      Potato      Pie
2      Potato      Juice
3      Potato      Fries
4      Potato      Pure
5      Apple       Pie
6      Apple       Juice
7      Apple       Fries
8      Apple       Pure
and want to apply ...
-3
votes
0
answers
57
views
pandas complex merge
How can I merge these two dataframes?
df1
df2
Rule:
Merge based on callsign and re (if not null). If de and d_runway do not match, add a label indicating a mismatch.
...
1
vote
0
answers
52
views
Pandas - Change specific input based on row and column name
Let's say I have the following dataframe:
data = [['Tom', 180], ['Adam', 174], ['Bob', 182]]
df = pd.DataFrame(data, columns=['Name', 'Height'])
If I wish to change the value of Bob's height, I could ...
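A small sketch of label-based assignment with `df.loc`, using the data from the excerpt. The new value 185 is an arbitrary example, and this may or may not be the approach the truncated question goes on to describe.

```python
import pandas as pd

data = [['Tom', 180], ['Adam', 174], ['Bob', 182]]
df = pd.DataFrame(data, columns=['Name', 'Height'])

# Select the row with a boolean mask on Name and the column by its label,
# then assign in one step. 185 is an arbitrary example value.
df.loc[df['Name'] == 'Bob', 'Height'] = 185
```

Doing the row and column selection in a single `.loc` call avoids chained indexing, which can end up assigning to a temporary copy.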
1
vote
2
answers
64
views
Split a Pandas column of lists with different lengths into multiple columns [duplicate]
I have a Pandas DataFrame that looks like:
ID result
1 [.1,.5]
2 [.4,-.2,-.3,.1,0]
3 [0,.1,.6]
How can I split this column of lists into two columns?
Desired result:
ID result_1 result_2 ...
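A minimal sketch of one way to do this, using the sample data from the excerpt: build a new frame from the lists, which pads shorter rows with NaN. The `result_N` column names follow the desired output shown above; the variable names `wide` and `out` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 2, 3],
    "result": [[.1, .5], [.4, -.2, -.3, .1, 0], [0, .1, .6]],
})

# Building a frame from the ragged lists pads shorter rows with NaN.
wide = pd.DataFrame(df["result"].tolist(), index=df.index)

# Rename the positional columns 0, 1, ... to result_1, result_2, ...
wide.columns = [f"result_{i + 1}" for i in wide.columns]

out = df[["ID"]].join(wide)
```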
0
votes
5
answers
80
views
How to transpose and modify a pandas dataframe based on column values
I have a set of data like below:
name A B C
foo 1 0 0
bar 0 1 0
coo 0 0 1
That I am trying to alter to look like the table below:
name
A foo
B bar
C coo
I've done research but have gotten ...
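One possible sketch for the sample data above, assuming each of the columns A, B, and C contains exactly one 1: after moving `name` into the index, `idxmax` returns, for each column, the index label of the row holding the maximum.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["foo", "bar", "coo"],
    "A": [1, 0, 0],
    "B": [0, 1, 0],
    "C": [0, 0, 1],
})

# With name as the index, idxmax gives the name of the row holding the 1
# in each column; to_frame turns the resulting Series back into a table.
out = df.set_index("name").idxmax().to_frame("name")
```

If a column could contain several 1s or none, `idxmax` would still return a single label, so that case would need different handling.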
0
votes
2
answers
64
views
Calculate all missing values for specific data using pivot tables in pandas
I am working on a dataset called titanic.csv. Let's simplify the problem and include some data here:
I need to calculate all missing values for child, which, as you can see, is a value in the who column. ...
0
votes
0
answers
31
views
ImportError from Pandas while running XGBoost model in Python
I am trying to run a basic XGBoost model in Python (v 3.8.5), but I am getting an error that I cannot resolve. I'd appreciate your help, thanks!
My code is as below:
import seaborn as sns
import pandas ...
0
votes
1
answer
34
views
Getting empty dataframes after webscraping from Wikipedia
I am trying to extract data from a wikipedia page and load it into a dataframe. After webscraping and running the data frame, python is returning an empty data frame which shouldn't be the case. Here'...
0
votes
0
answers
29
views
IndexError: index out of range in self when training Dynamic Word Embedding Model (DWB)
I am a beginner in text mining analysis and am currently learning Dynamic Word Embedding (DWB) techniques. While running the replication codes from this Kaggle notebook, I encountered the following ...
-2
votes
0
answers
40
views
How do I get rid of '0' in index column?
I try to rename the Index column to 'idx' and get rid of 0 using this code:
df1.index.rename(name='idx', inplace=True)
However, I end up with the second dataframe as below. It results in messing up ...
2
votes
2
answers
48
views
Skip rows with pandas.read_csv(..., comment="#") but allow hash in the data?
Is there any way in pandas to ignore #-commented lines in their entirety, but leave the # symbol alone in the CSV body?
import pandas as pd
from io import StringIO
csv_content = """\
# ...
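A hedged sketch of one workaround: filter out whole-line comments before handing the text to `pd.read_csv`, instead of passing `comment="#"`. The CSV content below is hypothetical, since the original is truncated.

```python
import pandas as pd
from io import StringIO

# Hypothetical content: comment lines plus a "#" inside a data field.
csv_content = """\
# a comment line
id,tag
1,#first
# another comment
2,plain
"""

# Drop lines that start with "#" ourselves; comment="#" would also cut off
# everything after a "#" that appears inside a field.
lines = [ln for ln in csv_content.splitlines() if not ln.lstrip().startswith("#")]
df = pd.read_csv(StringIO("\n".join(lines)))
```

This line-based filter assumes no quoted field spans several lines with a continuation line starting with `#`; such input would need a real CSV-aware pass.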
1
vote
1
answer
39
views
Behavior of df.map() inside another df.apply()
I find this code very interesting. I modified the code a little to improve the question. Essentially, the code uses a DataFrame to format the style of another DataFrame using pd.style.
t1 = pd....
1
vote
1
answer
48
views
merge several columns of the same data into one
I've created a dataframe that adds data from several sources. Here is an example subset:
index CompanyName Source1site Source2site Source3site City
1 Comp1 web1.com Nan ...
2
votes
2
answers
50
views
Interpolating time series data for step values
I have time series data that looks like this (mm/dd hh:mm):
3.100 12/14 05:42
3.250 12/14 05:24
3.300 12/14 05:23
3.600 12/14 02:45
3.700 12/13 10:54
3.600 12/12 13:19
3.900 12/12 10:43
...
2
votes
2
answers
73
views
How to use numpy.where in a pipe function for pandas dataframe groupby?
Here is a script to simulate the issue I am facing:
import pandas as pd
import numpy as np
data = {
'a':[1,2,1,1,2,1,1],
'b':[10,40,20,10,40,10,20],
'c':[0.3, 0.2, 0.6, 0.4, 0....
1
vote
2
answers
63
views
Update Pandas DataFrame slice row-wise using dictionary
Question
I am updating the values in a slice of a pandas.DataFrame by row such that each row of the slice has unique value. I am using pandas version 2.2.3.
I have found an approach that seems to work ...
0
votes
0
answers
36
views
Monkeypatching pandas series to_csv with pytest
I am testing a function, write_query(), in module.py. My test is in test_module.py, which is carried out using pytest. I am using the pytest monkeypatch fixture to monkeypatch Series.to_csv(), in ...
0
votes
1
answer
42
views
Indices mismatch during merge in pandas
I am trying to merge two dataframes in Python, pandas, df1 and df2.
I am trying to merge them on Column1, and then assign value of Column2 from df2 to df1.
This is my code:
df1 = df1.reset_index()
...
0
votes
3
answers
54
views
How to separate multiple tickers into individual dataframes with yfinance downloaded data
I'm trying to download stock data information using yfinance. Currently, I can successfully download a single ticker using yf.download which returns a dataframe with information I can use. This API ...
-1
votes
1
answer
35
views
Exception has occurred: DatatypeMismatch column "occurence_timestamp" is of type timestamp without time zone but expression is of type bigint
Here are the core steps and logic of my script below:
Create and instantiate a PostgreSQLDB class object that performs database operations
Use view vw_valid_case_from_db1 to get a list of case_id which ...
2
votes
1
answer
58
views
How to convert the column with lists into one hot encoded columns? [duplicate]
Assume there is a DataFrame such as the following:
import pandas as pd
import numpy as np
df = pd.DataFrame({'id':range(1,4),
'items':[['A', 'B'], ['A', 'B', 'C'], ['A', 'C']]})
...
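A minimal sketch using the frame from the excerpt, assuming the list items are strings that never contain the `|` character: join each list into a delimited string and let `Series.str.get_dummies` split it into indicator columns.

```python
import pandas as pd

df = pd.DataFrame({'id': range(1, 4),
                   'items': [['A', 'B'], ['A', 'B', 'C'], ['A', 'C']]})

# Join each list into "A|B"-style strings, then expand into 0/1 columns.
dummies = df['items'].str.join('|').str.get_dummies(sep='|')

out = df[['id']].join(dummies)
```

For items that might contain the separator, exploding the lists with `df.explode('items')` and cross-tabulating is an alternative.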
1
vote
0
answers
45
views
How do I pass a row to a function using df.apply in Pandas [duplicate]
I have a fairly complicated function that I need to run on each row of my dataframe - lambda functions won't work here.
I want to pass multiple columns from the dataframe to the function.
In my ...
1
vote
2
answers
32
views
Delete indexed observation based on other column entry
I have an example Excel spreadsheet that looks like this, with many more entries, which I read in with pd.read_excel. I would like to delete all of ID 3's entries since the status is deleted ...
0
votes
1
answer
42
views
Sparse matrix in pandas/scipy with row and column indices
I have a dataframe in pandas that looks like this:
>>> df[['BranchNumber', 'ModelArticleNumber', 'ActualSellingPrice']].info()
<class 'pandas.core.frame.DataFrame'>
Index: 447970 ...
-2
votes
0
answers
14
views
How to impute OPEN_CLS_STS based on values in DT_CLS in Python [duplicate]
I'm trying to impute the OPEN_CLS_STS based on the values in DT_CLS.
If DT_CLS has a date populated, then OPEN_CLS_STS should have the value 'C'. Otherwise, OPEN_CLS_STS should have the value 'O'.
I tried ...
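The rule stated above can be sketched with `numpy.where`, assuming DT_CLS is a datetime column in which missing dates are NaT. The three-row frame is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: the second row has no closing date.
df = pd.DataFrame({"DT_CLS": pd.to_datetime(["2024-01-15", None, "2023-11-30"])})

# 'C' where a date is populated, 'O' otherwise.
df["OPEN_CLS_STS"] = np.where(df["DT_CLS"].notna(), "C", "O")
```

If the dates are stored as strings, empty strings would count as populated under `notna()`, so they would need converting or replacing with missing values first.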
-1
votes
1
answer
48
views
How to impute OPEN_CLS_STS based on values in DT_CLS [duplicate]
I'm trying to impute the OPEN_CLS_STS based on the values in DT_CLS.
If DT_CLS has a date populated, then OPEN_CLS_STS should have the value 'C'. Otherwise, OPEN_CLS_STS should have the value 'O'.
I tried ...
1
vote
2
answers
60
views
How to store the result of a DataFrame grouping separately using pandas in Python
I'm facing a challenge when trying to group a dataframe. Here's a fake dataframe which is similar to the real one:
**univertcity country sector name firstname code**
Evergreen College ...
0
votes
0
answers
25
views
Creating separate groups in a dataframe when column values repeat [duplicate]
I have a dataframe with numbers formatted as follows:
df = pd.DataFrame({"ColumnA": [1,2,3,4,5,6,7,8,9,10], "ColumnB": [1,3,5,6,4,7,5,4,1,2], "ColumnC": [0,1,1,2,0,2,1,1,...
0
votes
1
answer
72
views
Manipulation of a Pandas dataframe most time- and memory-efficiently
Please imagine I have a dataframe like this:
df = pd.DataFrame(index=pd.Index(['1', '1', '2', '2'], name='from'), columns=['to'], data= ['2', '2', '4', '5'])
df:
Now, I would like to calculate a ...
-5
votes
0
answers
54
views
Clearing outliers in Python using pandas
I am doing a small project (while trying to educate myself in OCR and data analysis), and I am facing a problem I cannot solve.
I am plotting this graph of speed relative to time, but due to the ...
1
vote
2
answers
40
views
Pandas Dataframe Multiindex - Calculate Mean and add additional column to each level of the index
Given the following dataframe:
Year 2024 2023 2022
Header N Result SD N Result SD N Result SD
Vendor
A 5 20 3 5 22 4 1 21 3
B 4 25 2 ...
0
votes
0
answers
39
views
Python - Pandas matching two json API dumps as dataframes
I'm a network admin working on a small project at work using Cisco's Meraki API wrapper for Python.
The end goal is to have a dashboard that displays the number of authenticated users on a single ...
2
votes
4
answers
69
views
pandas multi index subset selection
import pandas as pd
import numpy as np
# Sample data
index = pd.MultiIndex.from_tuples([
('A', 'a1', 'x'),
('A', 'a1', 'y'),
('A', 'a2', 'x'),
('A', 'a2', 'y'),
('B', 'b1', 'x'),
...
1
vote
1
answer
39
views
What role does min value and max value play in reducing memory usage?
I am studying the code from this GitHub repository: Intrusion Detection (CIC-IDS2017).
Here is the code and the result that the authors use to reduce the memory, but I don't know why the author made adjustments ...
0
votes
1
answer
37
views
Download and read an Excel file into a pandas DataFrame without saving the Excel file
I'm making a request to download an Excel file. The response is in byte form. What I usually do is save this response body as an Excel file using:
with open('filename.xlsx', 'wb') as f:
f.write(...
-1
votes
0
answers
50
views
My Exponential Moving Average calculations are still somehow wrong?
Where am I going wrong...
Here is my Python code which interacts with the MetaTrader5 API.
import numpy as np
import MetaTrader5 as mt5
import pandas as pd
from sklearn.preprocessing import ...
0
votes
1
answer
61
views
Not getting decimals when extracting values [duplicate]
So I am practicing data wrangling and I have encountered an issue.
food['GPA'].unique()
And the output is
array(['2.4', '3.654', '3.3', '3.2', '3.5', '2.25', '3.8', '3.904', '3.4',
'3.6', '3.1'...
1
vote
1
answer
40
views
Comparing empty dataframes
I have a function, extract_redundant_values, to extract redundant rows from a pandas dataframe. I am testing it by running it on in_df to generate out_df. I am then comparing this against my expected ...
1
vote
2
answers
76
views
Rearrange and encode columns in pandas
I have data structured like this (working with pandas):
ID|comp_1_name|comp_1_percentage|comp_2_name|comp_2_percentage|
1| name_1 | 13 | name_2 | 33 |
2| name_3 | ...