-1 votes
0 answers
18 views

Get rid of '0' in index column

I am trying to rename the index column to 'idx' and get rid of the 0 using this code: df1.index.rename(name='idx', inplace=True) However, I end up with the second dataframe as below. It results in messing up ...
Hưng Trần's user avatar
0 votes
1 answer
34 views

merge several columns of the same data into one

I've created a dataframe that adds data from several sources. Here is an example subset: index CompanyName Source1site Source2site Source3site City 1 Comp1 web1.com Nan ...
Anas's user avatar
  • 33
0 votes
1 answer
16 views

Debugging Error: Non-Numeric Argument in R Function for Calculating Animal Movement

I have an animal movement dataset ("data") that looks like this: Data ID Time x y u v A 2008-02-01 12:00:00 9155834.12606686 -1085858.899 ...
Cam's user avatar
  • 451
2 votes
2 answers
64 views

How to use numpy.where in a pipe function for pandas dataframe groupby?

Here is a script to simulate the issue I am facing: import pandas as pd import numpy as np data = { 'a':[1,2,1,1,2,1,1], 'b':[10,40,20,10,40,10,20], 'c':[0.3, 0.2, 0.6, 0.4, 0....
learner's user avatar
  • 655
1 vote
2 answers
45 views

Update Pandas DataFrame slice row-wise using dictionary

Question I am trying to update the values in a pandas (version 2.2.3) dataframe by row (such that each row has the same values) using a dictionary with row indices as keys and row values as values. In ...
Ouroboroski's user avatar
0 votes
0 answers
19 views

python df astype doesn't work on object from read_csv

I have a function that reads a CSV into a df. The CSV is quite large, but it's a combination of strings (categories) and numbers in different columns: import pandas as pd df_temp=pd.read_csv('somefile....
Ohad's user avatar
  • 69
0 votes
0 answers
24 views

AttributeError: 'numpy.ndarray' object has no attribute 'categories'

Modin DataFrame Merge Issue After dropna on Categorical Column: I'm encountering an issue when using Modin to merge DataFrames that contain categorical columns. The issue arose after I performed a ...
Sumukha G C's user avatar
0 votes
1 answer
34 views

Indices mismatch during merge in pandas

I am trying to merge two dataframes in Python, pandas, df1 and df2. I am trying to merge them on Column1, and then assign value of Column2 from df2 to df1. This is my code: df1 = df1.reset_index() ...
melca's user avatar
  • 13
0 votes
1 answer
29 views

Pandas read_excel with nrows and skiprows and lazy-loading?

I am searching for ways to read an .xlsx file as chunks of dataframes, instead of loading the whole thing into memory. What exactly happens when I pd.read_excel(nrows, skiprows, usecols) ? Is the ...
zensei's user avatar
  • 1
0 votes
1 answer
28 views

Comparing every latitude and longitude in a dataframe

Probably in over my head here as I'm still learning R... I have a dataframe containing a column with longitude values and another column with corresponding latitude values. I want to find long/lat ...
TwentyandChange's user avatar
0 votes
3 answers
49 views

How to separate multiple tickers into individual dataframes with yfinance downloaded data

I'm trying to download stock data information using yfinance. Currently, I can successfully download a single ticker using yf.download which returns a dataframe with information I can use. This API ...
Ryanc88's user avatar
  • 182
0 votes
0 answers
9 views

QTableView's ComboBox Delegate model is not synchronizing with the PandasModel

I've got a QTableView that uses a combobox delegate in 2 of the table's columns. The selected item from the combobox displays in the TableView correctly if no columns are sorted. When a column is ...
Tadpole's user avatar
  • 15
3 votes
2 answers
82 views

How to extract strings after specific symbols in one column and separate to multiple rows?

I have data that contains the nearest gene sets, including their genomic region and strand, in one column. I want to make a new column for the single gene extracted from that column and separate them ...
AKO's user avatar
  • 33
-2 votes
0 answers
13 views

How to impute OPEN_CLS_STS based on values in DT_CLS in Python [duplicate]

I'm trying to impute the OPEN_CLS_STS based on the values in DT_CLS. IF DT_CLS has a date populated then OPEN_CLS_STS should have a value 'C'. Otherwise OPEN_CLS_STS should have a value 'O'. I tried ...
Derekhe1988's user avatar
-1 votes
1 answer
45 views

How to impute OPEN_CLS_STS based on values in DT_CLS [duplicate]

I'm trying to impute the OPEN_CLS_STS based on the values in DT_CLS. IF DT_CLS has a date populated then OPEN_CLS_STS should have a value 'C'. Otherwise OPEN_CLS_STS should have a value 'O'. I tried ...
Derekhe1988's user avatar
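
The rule stated in the two excerpts above ('C' when DT_CLS has a date, 'O' otherwise) can be sketched with numpy.where. This is only a minimal illustration: the sample rows are hypothetical, and only the column names DT_CLS and OPEN_CLS_STS come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({"DT_CLS": ["2024-01-15", None, "2023-11-02", None]})

# Parse to datetime so missing close dates become NaT.
df["DT_CLS"] = pd.to_datetime(df["DT_CLS"])

# 'C' (closed) where a close date is present, 'O' (open) where it is missing.
df["OPEN_CLS_STS"] = np.where(df["DT_CLS"].notna(), "C", "O")
```

If the real column holds empty strings rather than missing values, those would need to be converted to NaN/NaT first; that detail is not visible in the excerpt.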
0 votes
0 answers
24 views

Creating separate groups in a dataframe when column values repeat [duplicate]

I have a dataframe with numbers formatted as follows: df = pd.DataFrame({"ColumnA": [1,2,3,4,5,6,7,8,9,10], "ColumnB": [1,3,5,6,4,7,5,4,1,2], "ColumnC": [0,1,1,2,0,2,1,1,...
zard's user avatar
  • 1
-1 votes
0 answers
43 views

How to convert complex JSON into a Parquet file

I need to convert the following JSON into a Parquet file. Converting this kind of JSON using PySpark is really easy, but the complexity here is that I have sub-children and have to do more than one explode, and ...
Julio's user avatar
  • 555
0 votes
1 answer
71 views

Manipulation of a Pandas dataframe most time- and memory-efficiently

Please imagine I have a dataframe like this: df = pd.DataFrame(index=pd.Index(['1', '1', '2', '2'], name='from'), columns=['to'], data= ['2', '2', '4', '5']) df: Now, I would like to calculate a ...
Saeed's user avatar
  • 2,068
1 vote
2 answers
39 views

Pandas Dataframe Multiindex - Calculate Mean and add additional column to each level of the index

Given the following dataframe: Year 2024 2023 2022 Header N Result SD N Result SD N Result SD Vendor A 5 20 3 5 22 4 1 21 3 B 4 25 2 ...
Tumas04's user avatar
  • 63
0 votes
1 answer
27 views

Order dataframe within pivot_wider function?

I have a dataframe in a longlist format with duplicate IDs. Each ID has a so-called donornr and timepoint (Tijdspunt). One ID (Deelnemernr.) can have duplicate timepoints like so: Deelnemernr. ...
Debbie Oomen's user avatar
-1 votes
0 answers
49 views

My Exponential Moving Average calculations are still somehow wrong?

Where am I going wrong... Here is my Python code which interacts with the MetaTrader5 API. import numpy as np import MetaTrader5 as mt5 import pandas as pd from sklearn.preprocessing import ...
RHO's user avatar
  • 49
2 votes
1 answer
27 views

python dataframe slicing by row number

all Python experts, I'm a Python newbie, stuck with a problem which may look very simple to you. Say I have a data frame of 100 rows, how can I split it into 5 sub-frames, each of which contains the ...
Jasper's user avatar
  • 85
0 votes
0 answers
45 views

I cannot get all data to export to CSV

# Collect batting stats for the 2022, 2023, and 2024 seasons try: print("Collecting batting stats from 2022 to 2024...") batting_data = batting_stats(2021, 2024, league="all&...
Chad Broussard's user avatar
1 vote
1 answer
38 views

Comparing empty dataframes

I have a function, extract_redundant_values, to extract redundant rows from a pandas dataframe. I am testing it by running on in_df to generate out_df. I am then comparing this against my expected ...
Tim Kirkwood's user avatar
0 votes
1 answer
46 views

loop over date range and appending new values to a new data frame

I wish to loop each row of the data frame below over each date of the date range below, check the following condition and return the current date of the date range in a new data frame with all columns we have ...
lpca's user avatar
  • 3
1 vote
1 answer
58 views

Fill in rows to dataframe based on another dataframe

I have 2 dataframes that look like this: import pandas as pd data = {'QuarterYear': ["Q3 2023", "Q4 2023", "Q1 2024", 'Q2 2024', "Q3 2024", "Q4 2024"]...
TIC-FLY's user avatar
  • 173
-1 votes
0 answers
43 views

Convert null values from a JSON file into an empty string

I need help from all of you. I have to copy a JSON file from an S3 bucket to another S3 bucket, but this new JSON file must contain all the fields that have "null" value as an "" ...
Julio's user avatar
  • 555
1 vote
1 answer
47 views

Alternate background colors in styled pandas df that also apply to MultiIndex in python pandas

SETUP I have the following df: import pandas as pd import numpy as np arrays = [ np.array(["fruit", "fruit", "fruit","vegetable", "vegetable", &...
bismo's user avatar
  • 1,429
0 votes
2 answers
77 views

Combine likert plot from appended data frame and bar plot from pure data frame in R using ggplot2

I have a data frame in R called df : library(tibble) library(tidyverse) library(ggplot2) library(ggstats) var_levels <- c(LETTERS[1:20]) n = 500 likert_levels = c( "Very \n Dissatisfied&...
Homer Jay Simpson's user avatar
0 votes
0 answers
45 views

force a column of all NaNs to be seen as a string

I have two dataframes that I need to merge with an automated process with the corresponding details: They are read as CSV and there is a corresponding type inference of the dtypes. Most of the time, ...
Stephen's user avatar
  • 8,700
1 vote
1 answer
35 views

How to style all cells in a row of a specific MultiIndex value in pandas

SETUP I have the following df: import pandas as pd import numpy as np arrays = [ np.array(["fruit", "fruit", "fruit","vegetable", "vegetable", &...
bismo's user avatar
  • 1,429
1 vote
1 answer
37 views

Pyspark computation time increases with less data

I'm faced with a problem where I have to iterate the same computations on each row of data until they converge. My train of thought was to remove the converged rows after each iteration so the ...
Søren Jensen's user avatar
0 votes
0 answers
18 views

Error creating bean when configuring multiple databases (H2 and MySQL) [closed]

Error creating bean with name 'mysqlEntityManagerFactory' defined in class path resource [com/example/bank/Configure/MysqlConfig.class]: No PersistenceProvider specified in EntityManagerFactory ...
rakesh sharma's user avatar
0 votes
1 answer
31 views

Create an empty schema with struct inside

Hello guys, I have a small question today: when I create an empty dataframe, I want to set an empty schema if the "data" field of the JSON that I receive is empty i ...
Julio's user avatar
  • 555
-1 votes
0 answers
40 views

XML to Pandas dataFrame [closed]

0 A 51 non-null object 1 B 51 non-null object 2 C 51 non-null object 3 D 45 non-null object This is the info of the dataframe. It is fine when I just return it ...
Debasish Kundu's user avatar
0 votes
0 answers
28 views

How to transform nested data to be used with a tabular learning network

I have a dataframe containing measurements and errors with their corresponding status and counter, and I want to feed it into e.g. TabNet. Raw dataframe protocolid time measurement_1 ...
Max_Och's user avatar
0 votes
1 answer
37 views

How can I extract specific values from a .csv file and add them into a specific cell in a pre-existing dataframe/tibble in R automatically?

I want to automatically extract specific values from a .csv file, which is generated by our measuring device, into a dataframe/tibble in R which has a pre-defined layout. The name of the measured ...
Piratepenguin's user avatar
0 votes
0 answers
33 views

Data retrieving and SQL database update

I'm trying to retrieve some data from an API and save them to a local database I created. All data come from Google Ads campaigns, and I need to make two separate calls because of their docs, but that'...
Davide's user avatar
  • 454
0 votes
0 answers
31 views

Why am I getting an error from: sns.lineplot(x=anomaly_df['Date'], y=scaler.inverse_transform(anomaly_df['Close/Last']))

import numpy as np from keras.models import Sequential from keras.layers import LSTM, Input, Dropout from keras.layers import Dense from keras.layers import RepeatVector from keras.layers import ...
Asiedu's user avatar
  • 1
0 votes
0 answers
17 views

Trying to iterate over different teams in mean data from a larger dataset [duplicate]

Basically after the mean data for the teams home and away games is taken, I want to plot multiple graphs for each team in one loop, essentially, in the code below, where Arsenal is in quotes in the ...
Daragh's user avatar
  • 1
0 votes
0 answers
21 views

I reduced a dataframe of times series, and now get "Error in replCmat4" and "Incompatible methods" error messages

I want to estimate a panel VAR model on a large set of data (130 companies, 6 variables, 2 identifiers over 10 years), so large I had to cut my sample by half to have enough memory to run the function....
Guillaume Brasseur's user avatar
1 vote
2 answers
77 views

How can I clean a year column with messy values?

I have a project I'm working on for a data analysis course, where we pick a data set and go through the steps of cleaning and exploring the data with a question to answer in mind. I want to be able to ...
Jubilbee Draws's user avatar
0 votes
3 answers
77 views

Pandas dataframe reshape with columns name [closed]

I have a dataframe like this: >>> df TYPE A B C D 0 IN 550 350 600 360 1 OUT 340 270 420 190 I want reshape it to this shape: AIN AOUT BIN BOUT CIN COUT ...
Sun Jar's user avatar
  • 163
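
One possible way to get the single-row layout described in the excerpt above is to melt the frame and concatenate each column name with its TYPE value. This is a sketch under assumptions: the target layout is truncated in the excerpt, so it assumes the pattern continues as DIN, DOUT, and the intermediate names (long, name, wide) are illustrative only.

```python
import pandas as pd

# Data as shown in the question excerpt.
df = pd.DataFrame(
    {
        "TYPE": ["IN", "OUT"],
        "A": [550, 340],
        "B": [350, 270],
        "C": [600, 420],
        "D": [360, 190],
    }
)

# Long format: one row per (TYPE, original column) pair.
long = df.melt(id_vars="TYPE", var_name="col", value_name="value")

# Build the combined names, e.g. "A" + "IN" -> "AIN".
long["name"] = long["col"] + long["TYPE"]

# Back to a single wide row with the combined names as columns.
wide = (
    long.set_index("name")["value"]
    .to_frame()
    .T.reset_index(drop=True)
    .rename_axis(None, axis=1)
)
```

melt stacks the value columns one after another, so the resulting column order follows A, B, C, D with IN before OUT inside each pair.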
0 votes
0 answers
25 views

Pandas Dataframe rolling mean of last 50 daily values differs from rolling("50D").mean() [duplicate]

I'm trying to find how the "50D" rolling mean is being calculated in the following example because really I cannot find the way. import pandas as pd values = [np.nan, -0.00076194, -0....
Armando Contestabile's user avatar
2 votes
1 answer
55 views

Dropping duplicates by column in PySpark

I have a PySpark dataframe like this but with a lot more data: user_id event_date 123 '2024-01-01 14:45:12.00' 123 '2024-01-02 14:45:12.00' 456 '2024-01-01 14:45:12.00' 456 '2024-03-01 14:45:12....
Myakotka247's user avatar
0 votes
0 answers
34 views

Create a new line for comma separated values in pandas column - I dont want to add new rows, I want to have same rows in output [duplicate]

I have a dataframe like this, df col1 col2 1 'abc,pqr' 2 'ghv' 3 'mrr, jig' Now I want to create a new line for each comma separated values in col2, so the output would look ...
Kallol's user avatar
  • 2,189
1 vote
3 answers
53 views

How can I count combinations of variables in R?

I'm trying to count the number of occurrences of combinations across two variables in a data frame in R. If I have the following dataframe: df <- data.frame(v1 = c("A", "A", &...
decamaramp's user avatar
0 votes
1 answer
49 views

How can I change a column data type in pandas without creating null values in the whole column in my dataframe

I have been getting null values when trying to convert a column with non-numeric type values to a column with numeric type values. I have been using the line of code below to change my column data ...
Samuel Sepeku's user avatar
0 votes
0 answers
47 views

How to Increase Precision of Decimal Points in Python DataFrames? [closed]

I am developing a system in Python that replicates another written in LabWindows. A part of the design involves calculating the Periodogram, which returns a decimal array. I then add this array to a ...
S N B's user avatar
  • 61
0 votes
0 answers
34 views

When I change the status and save the spreadsheet, the status I changed is not modified [closed]

I am comparing two excel spreadsheets, I select the first one that has all the data that should be in the system and the second spreadsheet has the data that was included in the system, after ...
Thoru Tuhi's user avatar
