All Questions
735 questions
2
votes
1
answer
50
views
Comparing Multiple Values Across Columns of Pandas Dataframe Based on Column Names
I have a pandas dataframe with a number of thresholds and values associated with epochs. I want to compare all of the thresholds with their associated values simultaneously to remove rows as ...
2
votes
2
answers
97
views
How to set the start date of the workweek for a month using Python?
I have a database with a date column in the following format:
However, I need to work with this database so that it can be recognized as a time series. In this case, the names "week 1", ...
2
votes
1
answer
51
views
Python: Merge two dataframes based on created time
I have two DataFrames that have to be merged by class and joining date. Please check the DataFrames below:
df1
class teacher age instructor_joining_date
A mark 50 2024-01-20 07:18:29....
0
votes
2
answers
62
views
How to handle duplicates using the asfreq() function? Is there any other way to do this?
I have some hourly data on electricity generation from various sources in various countries. I download data from the ENTSO-E Transparency Platform website and found a problem with data inconsistency. ...
1
vote
1
answer
64
views
Calculate mean values of the past x years for every month
I have historical data from 2012 to 2023. I am trying to calculate the average for every month during these years to build a 'reference year' or 'baseline'. My DataFrame (final_df) looks like this:
...
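A minimal sketch of one way to build such a monthly baseline, assuming final_df has a DatetimeIndex and a single numeric column. The column name "value" and the sample data are hypothetical, since the excerpt does not show the real frame:

```python
import pandas as pd

# Hypothetical stand-in for final_df: one value per month, 2012 to 2023.
idx = pd.date_range("2012-01-01", "2023-12-01", freq="MS")
values = (idx.month + (idx.year - 2012)).to_numpy(dtype=float)
final_df = pd.DataFrame({"value": values}, index=idx)

# Average each calendar month across all years to get a 12-row baseline.
baseline = final_df.groupby(final_df.index.month).mean()
baseline.index.name = "month"
```

Grouping on the index's month number collapses all twelve Januaries into one row, all twelve Februaries into the next, and so on.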
0
votes
0
answers
34
views
How to treat/fill missing data in ohlcv ccxt
I'm trying to find a solution to treat/fill missing OHLCV data coming from a cryptocurrency exchange API.
I'm running this code:
bars = exchange.fetch_ohlcv(symbol, interval, limit = limitklines)
...
0
votes
1
answer
91
views
Segmenting time series with Python
I have a task that I don't know how to solve. I have a time series of production steps. Now I want to segment every production step and give it an ID. My time series looks like this:
The ...
0
votes
1
answer
71
views
How to label the date based on seasons in time-series pandas dataframe?
I have a pandas dataframe with datetime and values columns as below, and I would like to label a third column as indicated in the df below:
| datetime | |values| | Label |
| ----------------- ...
0
votes
1
answer
53
views
Counting and comparing values across dataframes to create new dataframe
Let's say I have 50 dataframes with an index of dates and a column labeled yes/no.
I would like to count the number of yes's/no's for a specific date across the dataframes to create a new dataframe ...
1
vote
3
answers
54
views
Convert energy consumption sessions to hourly energy consumption timeseries in python
I have a dataframe with start and end timestamps, duration in HH:MM:SS and energy consumed. A sample is shown below:
start end time energy
0 2024-03-...
0
votes
1
answer
31
views
Combining two dataframes with different column name in time-series
I have two data frames, one named sensor and one named train.
The sensor data frame contains the data of a time series, whose index is the ts_sensor column. In the sensor data frame the names of the ...
1
vote
3
answers
65
views
How do I combine the values of a Pandas column from multiple dataframes into one column in one dataframe?
The dataframes can be created using the following code:
import pandas as pd
s1 = pd.DataFrame({'item':['apple','apple','apple']},
index=['1/2/2024','1/5/2024','1/6/2024'])
s2 = pd....
1
vote
2
answers
183
views
Parsing timestamp column with variable timestamp format in a column in pandas
I want to parse the time from a column of a sub-second-precision time series .csv file, but it returns NaT for some timestamps.
A quirk of the dataset is that every non-full second will be represented ...
0
votes
1
answer
21
views
Set x-axis values for dataframe plotting in Python when data is time series
I have drawn my graph in Python using this code:
print(data_filtered['ranking_datetime'])
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%Y/%m/%d'))
plt.gca().xaxis.set_major_locator(mdates....
0
votes
0
answers
39
views
Managing pd.Series and pd.DatetimeIndex in a function simultaneously
I have a function meant to examine time-series data, usually obtained from a pandas dataframe. However, there are usually two ways to store this kind of data: the time-series can be stored in a column ...
0
votes
1
answer
67
views
Subtracting time values from a CSV file with Python
From a CSV file I am reading a time series in the format 23:03:00 using pandas. The Python code reads it as a string and can't convert it to an integer or float to subtract it from another time step.
...
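A minimal sketch of one possible approach, assuming the values are HH:MM:SS strings as described. The column name "time" and the sample rows are hypothetical:

```python
import pandas as pd

# Hypothetical sample data; the column name "time" is an assumption.
df = pd.DataFrame({"time": ["23:03:00", "23:05:30", "23:10:00"]})

# Parse the HH:MM:SS strings into timedeltas so they support arithmetic.
df["time"] = pd.to_timedelta(df["time"])

# Difference between consecutive time steps, expressed in seconds.
df["step_seconds"] = df["time"].diff().dt.total_seconds()
```

The first row has no predecessor, so its difference is NaN.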
0
votes
0
answers
52
views
Extract instances from per-minute timeseries data where the POWER value is 0 continuously for greater than 0 minutes
I have a huge time series dataset containing TIME and POWER as columns. I'm taking this dataset in windows of 30 minutes, and inside each window, if the POWER value is 0 for more than 10 minutes continuously, ...
-4
votes
1
answer
31
views
Python: group dataframe time series by nth weekdays
I need to run statistics in python on a dataframe over several years.
I managed to perform calculations grouping data by:
months, days:
sum = data.groupby([lambda x: x.month, lambda x: x.day]).sum()
...
0
votes
1
answer
79
views
Get the peak value per cycle
I would like to get the date-time, including ns, from the strings below.
0 2023-11-02 12:00:00.039\t-201
1 2023-11-02 12:00:00.996\t-201
2 2023-11-02 12:00:01.396\t-198
3 2023-11-02 12:00:01....
0
votes
1
answer
260
views
Unknown string format error when I try to clean the dataframe I'm working in
I am trying to set up the given data so I can analyze the timeseries data. I am new to coding, especially to Python.
With the code included, I keep getting the error:
ParserError: Unknown string format: ...
1
vote
1
answer
85
views
During aggregation count the longest date streak - using pyspark
Imagine a table:
PersonID  Date        HasDoneWorkout
A         31-01-2001  1
A         01-02-2001  1
A         02-02-2001  1
A         03-02-2001  0
A         04-02-2001  1
B         02-02-2001  1
I would like to create a pyspark aggregation function ...
1
vote
1
answer
286
views
Visualization of record counts of categorical variables including their missing values (`None` or `NaN`)
Let's say I have the following time series dataframe:
import numpy as np
import pandas as pd
import random
np.random.seed(2019)
# Generate TS
#rng = pd.date_range('2019-01-01', freq='MS', ...
1
vote
1
answer
100
views
Keep track of data revisions efficiently
I get a data feed like
datetime,value.
The API gives me the entire history, about 10,000 observations, and it adds new observations about every hour.
I need to add new data to a dataframe and run some ...
0
votes
1
answer
74
views
Any ways to accelerate the for loop in Python?
I'm dealing with time series data one batch at a time in a for loop. Within each batch there are 14,400 rows, and I need to check whether there are any gaps in the time series and fill them if ...
0
votes
0
answers
23
views
I have a function that returns a new homogeneous dataframe every time the code runs; now I want to store this dataframe
I have a function
def Vega_dict():
Sum_CE_Vega = Final_df.loc[Final_df['instrumentType'] == 'CE', 'vega'].sum()
Sum_PE_Vega = Final_df.loc[Final_df['instrumentType'] == 'PE', 'vega'].sum()
...
0
votes
0
answers
48
views
How to deal with missing values in time series data more wisely?
Here is the case:
import numpy as np
import pandas as pd
from itertools import repeat
df = pd.DataFrame({'date': pd.date_range(start='2013-01-01', periods=10, freq='H'), 'value': np.random....
0
votes
1
answer
33
views
Select rows from dataframe by hour
I have a dataset with financial assets. Asset_id means one asset at a given timestamp. My dataframe has a time step per second, but I need to select each Asset_id for each hour in 'Date'. I tried ways ...
0
votes
0
answers
162
views
Why is my Pandas dataframe converting values to NaT?
I have a dataframe with a 'datetime' column of type datetime64[ns], and I'm trying to add a column with the difference between the current time and the next time in a numpy array of datetimes.
...
0
votes
1
answer
51
views
Pandas: Insert a zero in a DataFrame cell
I have a Dataframe with time series, whose values are presented below:
01/05/2023 25.1 25.9 25.1
01/05/2023 1 25.1 25.2 25.0
01/05/2023 2 24.7 25.1 24.7
01/05/2023 3 24.7 24....
0
votes
2
answers
50
views
Pandas: Issue dropping unwanted zeros
I am trying to drop the last two zeros in the hours column
01/05/2023;0000;
01/05/2023;0100;
01/05/2023;0200;
01/05/2023;0300;
01/05/2023;0400;
01/05/2023;0500;
01/05/2023;0600;
01/05/2023;0700;
and so ...
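A minimal sketch of one way to do this, assuming the hours were read as strings (for example with dtype=str) so the leading zeros survive. The column names "date" and "hour" are hypothetical:

```python
import pandas as pd

# Hypothetical frame mirroring the sample rows; values kept as strings.
df = pd.DataFrame({
    "date": ["01/05/2023", "01/05/2023", "01/05/2023"],
    "hour": ["0000", "0100", "0200"],
})

# Keep only the first two characters, which drops the trailing "00".
df["hour"] = df["hour"].str[:2]
```

Slicing by position avoids accidentally stripping the zeros that belong to the hour itself, as in "00" or "10".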
0
votes
1
answer
35
views
How to shift elements from a column to another based on time
I have a df
Date_time        Sam_house  Jack_House  Stella_House
01.01.2023 1:00  Sam        Jack        Stella
Considering
(i) Sam moves from one house to another every 10 minutes
(ii) Jack moves from one house to ...
-1
votes
1
answer
52
views
How to move column values from one column to another based on the data_time value on time-series
I have a df
Date_time
Universe_Location_1
Universe_Location_2
Universe_Location_3
Universe_Location_4
20.06.2023 11:00
Saturn
mars
earth
Considering
(i) Saturn moves from one location to another ...
1
vote
1
answer
84
views
Checking chunks of names in column of pandas dataframe for completeness
I have a pandas df that includes columns of sensor measurements, where each row contains the sensor measurements of one unique sensor node. The order of these rows from the sensor nodes looks like this:
...
0
votes
1
answer
325
views
How to get all sequences with tf.keras.utils.timeseries_dataset_from_array and sampling_rate > 1? The last values are lost
I have a pandas DataFrame with data columns and a category column. I want to create a Dataset to use in a NN.
I'm using code from tensorflow tutorials https://www.tensorflow.org/tutorials/structured_data/...
1
vote
1
answer
139
views
How can I improve my Python code for classifying intermittent signals in a timeseries?
Classifying intermittent signals in the timeseries - is there a better way to write this in Python?
Problem: a sensor produces a signal which can be intermittent, say one period it is 0.01, next ...
0
votes
2
answers
84
views
manipulating timeseries data with frequency counts
I have a dataframe -
starttime Count
0 2013-10-01 00:00:00 274
1 2013-10-01 01:00:00 140
2 2013-10-01 02:00:00 67
3 2013-10-01 03:00:00 37
4 2013-10-01 04:00:00 57
... ... ...
...
0
votes
2
answers
2k
views
Sktime TimeSeriesSplit with pandas dataframe
I am trying to use cross-validation with a timeseries for a pandas dataframe with the sktime TimeSeriesSplit. The dataframe df has a daily format:
timepoint balance
0 2017-03-01 1.0
1 2017-04-01 ...
0
votes
0
answers
1k
views
Correctly formatting a datetime index when reading in a csv file Pandas
I have an hourly time series data set for a full year. The date column in the original csv file is formatted as 'day.month.year' as the snippet below shows.
0 01.01.2018 00:00
1 01.01.2018 ...
0
votes
1
answer
52
views
How to recommend recurring action
I have a dataset that contains the following columns: id_customer (customer identifier), id_receiver (identifier of the person who receives the money), money (money sent), and the date the money was ...
1
vote
1
answer
55
views
Create the evolution of exposure from inception and maturity of a list of loans in Python/Pandas
I have a DataFrame with a list of loans, with inception date, maturity, amount.
I would like to create a DataFrame with the evolution of the exposure as follows:
from:
contract
amount
inception
...
0
votes
0
answers
37
views
Applying a conditional to a DateTime
I have sensor readings that I need to check for malfunctions. Any reading below 550 should be flagged; each of the 8 sensors is recorded every second.
If at any given point it goes below 550 for ...
1
vote
2
answers
105
views
Replace NaN values of DataFrame with values from list
I have a DataFrame with 14 variables which has several NaNs. I would like to fill these NaNs with a specific value existing in a list.
This is the df:
Date CalamarQ InkorQ ... SHelena2P ...
0
votes
1
answer
141
views
How to create a plot using 365 days and 52 weeks instead of 8760 hours in Python
I am new to Python and am having trouble with a small task. I have a time series with 8760 hours and certain values. I have already split the course of the values for one year. But now I want to plot ...
1
vote
2
answers
359
views
How to merge data in DataFrame in overlapping time periods in pandas?
I have a pandas DataFrame like the following:
start_time status duration
0 2023-03-16 01:30:00 OK 0 days 00:02:00
1 2023-03-16 01:31:00 WARN 0 ...
0
votes
1
answer
303
views
Select the same columns from multiple DataFrames in a Dictionary of DataFrames
I need to extract the same column from multiple DataFrames in a Dictionary and save them as a DataFrame so that each column name is labelled as the key in the Dictionary.
By extracting the 'close' ...
1
vote
1
answer
190
views
Calibrating one timeseries with another
I have two timeseries in two different dataframes. What is the best way to find the calibration coefficient between them?
I was thinking of subtracting one dataframe from the other and dividing by ...
0
votes
1
answer
65
views
Removing falsely filled rows in time series data in pandas
I have 1-min time series data with event and duration columns. Sometimes events don't happen for a while but the last event is forward filled until the next event occurs. We know how many minutes each ...
0
votes
1
answer
53
views
Is there a way to reshape a Pandas Series into bins based on time intervals and select one of them?
So I have a timeseries stored on a Pandas Series:
data = pd.Series(data=[0,1,3,5], index=pd.to_timedelta([0, 15, 30, 45], unit='min'))
I wanted to group those data into 30 minute intervals, and then ...
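A minimal sketch of binning this Series into 30-minute intervals and picking one bin. Summing each bin is an assumption, since the excerpt is cut off before it says what should happen to the grouped values:

```python
import pandas as pd

data = pd.Series(data=[0, 1, 3, 5],
                 index=pd.to_timedelta([0, 15, 30, 45], unit='min'))

# Bin the timedelta-indexed values into 30-minute intervals.
# Summing is an assumed aggregation, not taken from the question.
binned = data.resample("30min").sum()

# Select one bin by its left edge, here the interval starting at 30 minutes.
second_bin = binned.loc[pd.Timedelta("30min")]
```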
1
vote
1
answer
49
views
Merging multiple time series
Today I tried to merge multiple time series, corresponding to clinical recordings (Heart rate, Arterial Pressure...), to run a TSfresh analysis. Some of these have the same time step, and others have ...
0
votes
0
answers
124
views
How to resample time series in a DataFrame so that each time series has the same number of rows?
I have a time series in a DataFrame. The time series capture trajectories of the same path traversed, i.e. acceleration and rotation in x, y and z direction and a label (str). The timestamp is used to ...