Newest 'python+scikit-learn+pipeline' Questions

0 votes

2 answers

35 views

Feature Importance with ColumnTransform and OneHotEncoder in RandomForestClassifier

Apologies for bothering you, but I haven't been able to find a definitive answer after searching the site. I'm building a RandomForestClassifier on some clinical data where the target variable (...

Aezhel

11

asked Dec 2 at 13:09

0 votes

0 answers

29 views

How to use Keras with Optuna tuning and Sklearn Pipeline

I am developing a model using Keras and use Optuna for hyperparameter tuning. I need to use the K-fold method for development. However, I cannot run it successfully. Please help. Here is the code: ...

HappyFish

1

asked Nov 26 at 7:32

0 votes

0 answers

23 views

Encountered NaN value in between pipeline steps, sklearn's custom estimators and imblearn's custom sampler

I was trying out a custom estimator and a custom sampler. MyFeatureConcator and MyFeatureResampler are the custom estimators that I would like to use as steps in my pipeline. The error encountered is: ...

Sid

1

asked Nov 6 at 12:59

1 vote

1 answer

44 views

Data Shape Issues in SKL Pipeline using TFIDF

I am stumped on an issue with Python/Sci-Kit Learn/Pipelines. I am receiving an error that the shape of the data as it passes through the pipeline is not what is expected. Specific error: blocks[0,:] ...

Josh Willis

95

asked Aug 20 at 15:52

1 vote

0 answers

30 views

The features selected by SelectKBest do not match those transformed by ColumnTransformer

I am in the process of deploying a machine learning model for study purposes and I have some questions about it: My POST method will send to the API my original features (without transformations ...

leandro.starke

11

asked Jun 22 at 9:47

0 votes

0 answers

29 views

How to implement pipeline into machine learning model

I would like to apply one-hot encoding and label encoding to my dataset using a Pipeline with my random forest model. I have created a function that utilizes Pipeline from scikit-learn together with ...

Stackie

3

asked Jun 21 at 8:29

1 vote

0 answers

65 views

Passing Sample Weights to Sklearn Pipeline object with XGBoost

There are some good questions on this topic, however, I haven't found any solution to this error involving using XGBoost models with sample_weight in sklearn's Pipeline framework. Here is my example ...

a.powell

1,722

asked May 17 at 13:45

1 vote

1 answer

32 views

Label encoder the target in Pipeline

I want to create a pipeline to do preprocessing in both training features and target, then train the model. Dataset would be something like: v1 v2 target 0 1 a yes 1 5 c no 2 3 f ...

Fernando Quintino

193

asked May 16 at 12:59
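
A minimal sketch of one common approach to the question above, not necessarily what the asker ended up using: the transformers inside a Pipeline only receive and transform X, so the target is encoded outside the pipeline with LabelEncoder and decoded after prediction. The rows, the step names `pre` and `clf`, and the choice of LogisticRegression are assumptions made for illustration; only the column names v1, v2 and target come from the excerpt.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Hypothetical data shaped like the excerpt (v1, v2, target).
df = pd.DataFrame({
    "v1": [1, 5, 3, 2, 4, 6],
    "v2": ["a", "c", "f", "a", "c", "f"],
    "target": ["yes", "no", "yes", "no", "yes", "no"],
})
X = df[["v1", "v2"]]

# Encode the target outside the pipeline; Pipeline steps only transform X.
label_encoder = LabelEncoder()
y = label_encoder.fit_transform(df["target"])

pre = ColumnTransformer(
    [("onehot", OneHotEncoder(handle_unknown="ignore"), ["v2"])],
    remainder="passthrough",
)
pipe = Pipeline([("pre", pre), ("clf", LogisticRegression())])
pipe.fit(X, y)

# Map integer predictions back to the original string labels.
decoded = label_encoder.inverse_transform(pipe.predict(X))
```

Keeping the fitted `label_encoder` next to the fitted pipeline (for example, pickling both) lets predictions be decoded later with the same mapping.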

0 votes

0 answers

51 views

Sklearn preprocessors work sequentially but produce NAs when used in Pipeline

Here's the context: I'm working with a dataset containing various feature types (numerical, categorical). My task is the binary prediction of startup success dependent on a target variable defined ...

Elias Hofmann

1

asked May 13 at 18:07

1 vote

0 answers

62 views

Pipeline for ML model using LabelEncoding in a Transformer [duplicate]

I'm attempting to incorporate various transformations into a scikit-learn pipeline along with a LightGBM model. This model aims to predict the prices of second-hand vehicles. Once trained, I plan to ...

alexquilis1

25

asked May 4 at 15:12

0 votes

1 answer

83 views

Error using a custom transformer in an SKLearn pipeline, but not as a standalone transformer

As an exercise I'm trying to create a custom transformer that takes a dataset and labels and returns the transformed dataset keeping only those columns with a correlation with the labels above a ...

Dargscisyhp

145

asked Apr 21 at 17:52

0 votes

0 answers

45 views

Dynamically set K value of SelectKBest

I am using SelectKBest in my pipeline and I want to be able to configure the number of features I want to select using a config.ini file. So essentially in the .ini file I have this : # ...

Hahanaki

1

asked Mar 23 at 12:08
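
A hedged sketch of how the value could be read from an .ini file and applied to the pipeline. The .ini contents are elided in the excerpt, so the section name `feature_selection`, the key `k`, and the step names `select` and `clf` are all hypothetical.

```python
import configparser

from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Hypothetical config contents; with a real file, use config.read("config.ini").
config = configparser.ConfigParser()
config.read_string("""
[feature_selection]
k = 3
""")
k = config.getint("feature_selection", "k")

pipe = Pipeline([
    ("select", SelectKBest(score_func=f_classif)),
    ("clf", LogisticRegression()),
])

# Nested parameters are addressed as <step name>__<parameter name>.
pipe.set_params(select__k=k)
```

Because `set_params` works on an already constructed pipeline, the same pipeline definition can be reused with different config files.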

0 votes

0 answers

29 views

Sklearn : ValueError feature shape during training is different than feature shape during validation

I'm trying to use sklearn to build a custom Pipeline for a school project that uses ML to analyze text. I have established some logging into my custom Transformers and am encountering an issue that ...

Hahanaki

1

asked Mar 22 at 13:33

1 vote

2 answers

884 views

Python raises an AttributeError when methods on the sklearn Pipeline object are called

Problem: I am calling the fit_transform() and transform() methods on a Pipeline object, but Python is raising an AttributeError whenever I try to do so. Here is what I'm trying to run, with imports. (...

Martin

25

asked Mar 15 at 23:15

0 votes

0 answers

60 views

Sklearn: Extract feature names after model fitting with polynomialFeature, onehot encoding and OrdinalEncoder

As suggested in many other posts, there are ways of extracting relevant feature names. However, how do I make sure that the feature names are in the same order as model.coef_? The structure ...

abalone

1

asked Mar 9 at 19:10

-2 votes

1 answer

39 views

Problems creating a transformer for a pipeline

Right now I'm trying to create a pipeline whose first step is random oversampling and whose second step is a custom outlier remover, but I'm having problems executing that pipeline. That ...

Roterun

7

asked Mar 1 at 10:51

0 votes

0 answers

22 views

ColumnTransformer and Pipelines: how to properly use it

I am trying to build a pipeline, but every time I get rid of one issue, I end up with a new one. ColumnTransformer is really playing with me. I want to make some transformations in some columns of a ...

Dimitri

119

asked Feb 23 at 19:05

0 votes

2 answers

221 views

Not sure on how to use the make_pipeline of sklearn correctly

I am playing around with the Titanic dataset and trying to use sklearn's make_pipeline correctly, but I'm getting a little confused about how to build the pipelines correctly. Here's the ...

Dimitri

119

asked Feb 23 at 15:47

2 votes

3 answers

590 views

how to properly incorporate early stopping validation in sklearn Pipeline with ColumnTransformer

I want to setup a lightGBM model with early stop validation. I also want to follow the best practice of using Pipeline to combine preprocessing and model fitting and prediction. Code below: ...

PingPong

965

asked Jan 29 at 7:50

0 votes

1 answer

48 views

ColumnTransformer with non-trivially intersecting column domains

I'm working with a housing dataset containing both numerical and categorical data. The only missing values in my data occur in two of the numerical features. As an example, consider X and y given by ...

vonbecker

93

asked Jan 22 at 19:42

0 votes

0 answers

65 views

Including multiple dataset transformers in custom transformer

Here is my custom transformer, meant to encode and scale the subject dataframe: class DfGrooming(BaseEstimator, TransformerMixin): def __init__(self): self....

Aditya Shandilya

1

asked Nov 29, 2023 at 11:56

3 votes

1 answer

200 views

Pass parameters across sklearn pipelines

I am writing a custom sklearn pipeline as follows: Step 1: class Step1(BaseEstimator, TransformerMixin): def __init__(self, input1: str = "Input1") -> None: self.input1 = ...

Ach Raf

31

asked Nov 5, 2023 at 18:11

0 votes

0 answers

90 views

Permutation feature importance on features transformed within a pipeline (sklearn)

A similar issue has been raised earlier. I need to compute feature importance of preprocessed features via sklearn.inspection.permutation_importance. The preprocessing is implemented within a pipeline....

victoris_93

1

asked Oct 26, 2023 at 14:21

0 votes

1 answer

38 views

Error with encoding categorical data in order

I am trying to encode the categorical data column sex with Ohe, and Blood Pressure and Diet with Oe, and then scale the data before passing it through a classifier in a pipeline. ...

111

17

asked Oct 22, 2023 at 6:09

0 votes

1 answer

175 views

Pipeline for Machine Learning Model has 'Feature shape mismatch' when trying to predict the target for a single observation

Here is the outline of my Machine Learning / Python project: Build a ColumnTransformer called preprocessor containing multiple transformers (e.g. One Hot Encoding, Ordinal Encoding etc) Build a ...

Zoe

35

asked Sep 26, 2023 at 9:06

2 votes

1 answer

460 views

How to manually select features for Scikit-Learn model regression?

There are various methods for doing automated feature selection in Scikit-learn. E.g. my_feature_selector = SelectKBest(score_func=f_regression, k=3) my_feature_selector.fit_transform(X, y) The ...

Bill

11.6k

asked Sep 22, 2023 at 19:20

-2 votes

1 answer

319 views

How to use the SHAP library for text classification?

I have text data and a pipeline model. I want to use the shap library to visualize the impact on all the output classes. I got this error: TypeError: The passed model is not callable and cannot be ...

MelinA

1

asked Aug 8, 2023 at 11:04

0 votes

1 answer

46 views

How can I force a GridSearchCV model (or a pipeline model) to use a given hyperparameter value?

I have used GridSearchCV to find the best hyperparameters of a regularized logistic model. It also includes a pipeline to impute and standardize the covariates. numeric_cols = X_train.select_dtypes(...

skan

7,710

asked Aug 2, 2023 at 0:40

0 votes

0 answers

101 views

The features are not getting considered despite adding feature selection in sklearn Pipeline

The pipeline has a FeatureSelection step, but it is not using the updated feature values. This is what my pipeline looks like: # Define pipeline pipeline = ImbPipeline(steps=[ ('preprocessor', ...

Kaiwalya Patil

62

asked Jul 31, 2023 at 20:37

1 vote

1 answer

173 views

Invalid parameter 'logisticregression' for estimator Pipeline. GridSearchCV and ColumnTransformer

I'm trying to perform a GridSearchCV including a pipeline. I want to impute and standardize the numerical variables. And just impute the categorical ones. I've tried to do it like this: numeric_cols = ...

skan

7,710

asked Jul 31, 2023 at 20:23
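
An "Invalid parameter" error of this kind typically means that a key in the parameter grid does not match any name returned by the pipeline's `get_params()`. The sketch below is illustrative only and does not reproduce the asker's pipeline: the step names `scale` and `clf`, the synthetic data, and the grid values are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Small synthetic dataset, purely for illustration.
X, y = make_classification(n_samples=60, n_features=5, random_state=0)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Every key allowed in param_grid appears here, e.g. "clf__C".
valid_keys = sorted(pipe.get_params().keys())

# Grid keys follow the pattern <step name>__<parameter name>.
param_grid = {"clf__C": [0.1, 1.0, 10.0]}
search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(X, y)
best_C = search.best_params_["clf__C"]
```

When the pipeline is nested (for example a Pipeline inside a ColumnTransformer inside another Pipeline), the same rule applies at each level, so printing `valid_keys` is usually the quickest way to find the right prefix.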

-2 votes

1 answer

128 views

DATA INGESTION -TypeError: cannot unpack non-iterable NoneType object

I am getting this error in data ingestion part (training pipeline). I am trying to run trainining_pipeline.py and this error shows up. Full traceback: Traceback (most recent call last): File "...

bhavay bukkal

1

asked Jun 11, 2023 at 15:22

4 votes

2 answers

147 views

sklearn transformer for outlier removal - returning xy?

I am trying to remove rows that are labeled outliers. I have this partially working, but not in the context of a pipeline and I am not sure why. from sklearn.datasets import make_classification X1, ...

mmann1123

5,275

asked Jun 7, 2023 at 18:49

1 vote

1 answer

181 views

Sklearn pipeline with LDA and KNN

I am trying to use the LinearDiscriminantAnalysis (LDA) class from sklearn as a preprocessing step to reduce the dimensionality of my data, and then apply a KNN classifier. I know that a good ...

Adrien Riaux

523

asked May 5, 2023 at 14:47

1 vote

1 answer

294 views

Drop a step from a sklearn pipeline using the step name

How to remove a step from a sklearn pipeline using the step name? By position I know that it can be done: pipeline.steps.pop(n) But with a very large pipeline, it can be difficult to find the ...

Slevin_42

87

asked Apr 27, 2023 at 9:25
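
One way to drop a step by name, sketched under assumptions: the helper `drop_step` and the step names `scale`, `pca` and `clf` are hypothetical and not part of scikit-learn. The helper builds a new Pipeline from the remaining steps instead of mutating the original.

```python
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=2)),
    ("clf", LogisticRegression()),
])


def drop_step(pipe, name):
    """Return a new Pipeline without the step called `name` (hypothetical helper)."""
    return Pipeline([(n, est) for n, est in pipe.steps if n != name])


trimmed = drop_step(pipeline, "pca")
step_names = [n for n, _ in trimmed.steps]
```

If the step should stay in place but do nothing, `pipeline.set_params(pca="passthrough")` is an alternative that keeps the step name available for later re-enabling.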

0 votes

1 answer

184 views

What is the best practice to chain DL model into sklearn Pipeline() stages and still access hyperparameters e.g, batch_size \ epochs in pipeline?

I want to experiment with a DL regression model over time-series data by implementing the model properly using sklearn Pipeline(). I formed the following DL model in the form of the class WaveNet and would ...

Mario

1,960

asked Apr 23, 2023 at 20:14

1 vote

1 answer

36 views

'Vect' not defined sklearn logistic regression error message

So I have this pipeline i used for a text classifier that works fine. from sklearn.feature_extraction.text import TfidfTransformer from sklearn.feature_extraction.text import CountVectorizer from ...

Barri

44

asked Apr 17, 2023 at 21:07

0 votes

1 answer

807 views

Sklearn. Pipeline. Several transformers. get_feature_names_out

I've implemented a custom sklearn transformer that processes a column of text data. I created a pipeline that combines two transformers, NameTransformer and OneHotEncoder, but I get an error. ...

Anton Troitsky

77

asked Mar 30, 2023 at 17:28

1 vote

2 answers

934 views

How to run sklearn.preprocessing.OrdinalEncoder on several columns?

This code raises an error: import pandas as pd from sklearn.compose import ColumnTransformer from sklearn.pipeline import Pipeline from sklearn.preprocessing import OrdinalEncoder # Define categorical ...

parvij

1,390

asked Mar 23, 2023 at 21:24

0 votes

2 answers

664 views

Problem With Scikit Learn One Hot and Ordinal Encoders

I'm having a problem with Scikit Learn's one-hot and ordinal encoders that I hope someone can explain to me. I'm following along with a Towards Data Science article that uses a Kaggle data set to ...

duffymo

308k

asked Mar 18, 2023 at 1:56

1 vote

1 answer

38 views

What object is a sklearn.pipeline.Pipeline that applies a ColumnTransformer actually fitting on when fit(X, Y) is called on it

I am trying to get an idea of the inner workings of a scikit learn Pipeline. Consider the below data set and pipeline construction. data = pd.DataFrame({ 'Name': ['Alice', 'Bob', 'Charlie'], '...

gebruiker

117

asked Mar 13, 2023 at 8:54

1 vote

2 answers

2k views

What is the correct order in data preprocessing stage for Machine Learning?

I am trying to create some sort of step-by-step guide/cheat sheet for myself on how to correctly go over the data preprocessing stage for Machine Learning. Let's imagine we have a binary ...

Yara1994

391

asked Feb 24, 2023 at 2:41

1 vote

1 answer

206 views

Staged_predict from a Pipeline object

I am having the same issue which was outlined years ago here: https://github.com/scikit-learn/scikit-learn/issues/10197 It seems to not have been resolved so I am looking for a work around. The ...

Keith

4,894

asked Feb 16, 2023 at 20:05

2 votes

1 answer

964 views

Using sample_weight param with XGBoost through a pipeline

I want to use the sample_weight parameter with XGBClassifier from the xgboost package. The problem happens when I want to use it inside a pipeline from sklearn.pipeline. from sklearn.preprocessing ...

Will

1,835

asked Feb 8, 2023 at 15:36
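
A sketch of the usual way to route a fit parameter to one step of a Pipeline: prefix it with the step name and a double underscore. To keep the example free of extra dependencies, LogisticRegression stands in for XGBClassifier here; that swap, the step names `scale` and `clf`, and the random data are assumptions. The same pattern should apply to any final estimator whose `fit` accepts `sample_weight`.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Random illustrative data and per-sample weights.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = (X[:, 0] > 0).astype(int)
weights = rng.uniform(0.5, 2.0, size=40)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression()),  # stand-in for XGBClassifier
])

# <step name>__<fit parameter> sends sample_weight to the "clf" step only.
pipe.fit(X, y, clf__sample_weight=weights)
preds = pipe.predict(X)
```

Passing `sample_weight=weights` without the step prefix is what commonly triggers an error, because the Pipeline cannot tell which step the argument belongs to. In this sketch the scaler is fitted without weights.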

2 votes

1 answer

194 views

Return pipeline score as one of multiple evaluation metrics

I am using a pipeline in a hyperparameter gridsearch in sklearn. I would like the search to return multiple evaluation scores - one a custom scoring function that I wrote, and the other the default ...

David Pellow

65

asked Feb 5, 2023 at 8:15

0 votes

1 answer

40 views

Extracting feature importances along with column names from sklearn pipeline

I have a sklearn pipeline with two steps (a columntransformer preprocessor with a One hot encoder and a randomforestregressor estimator). I would like to get the feature names of the encoded columns ...

Sherwin R

99

asked Jan 26, 2023 at 12:21

0 votes

1 answer

285 views

Error using categorical data in Pipeline with OneHotEncoder

I would like to build a pipeline to predict 'Survival' from the three features 'SibSp_category', 'Parch_category', 'Embarked'. In the preprocessing step, I use (1) OrdinalEncoder to convert the ...

ja_doe

3

asked Jan 26, 2023 at 8:16

3 votes

1 answer

147 views

How to specify the parameter for FeatureUnion to let it pass to underlying transformer

In my code, I am trying to access the sample_weight of the StandardScaler. However, this StandardScaler is within a Pipeline which again is within a FeatureUnion. I can't seem to get this parameter ...

Olivier_s_j

5,152

asked Jan 4, 2023 at 15:18

0 votes

1 answer

685 views

Key Error when passing list of input in .predict() using Pipeline

From what I found out when trying myself and reading here on Stack Overflow, when I pass a pandas dataframe to .predict(), it successfully gives me a prediction value. Like below: pipe = Pipeline([('...

Tejas Padhye

3

asked Dec 23, 2022 at 17:33

1 vote

2 answers

631 views

sklearn to pmml pipeline how to apply postprocessing linear transformation

I'm having a tough time trying to apply a postprocessing step with the sklearn2pmml packages. What I'm trying to do is to apply a linear transformation after applying the predict_proba method within ...

Tom

606

asked Dec 5, 2022 at 14:11

0 votes

1 answer

377 views

sklearn to pmml, can't create pipeline for preprocessing step of categorical columns

I'm having a tough time trying to create a PMML pipeline in the library sklearn2pmml (python). I want to convert categorical variables to numerical ones by reassigning them but don't have any clue, I ...

Tom

606

asked Dec 5, 2022 at 13:52
