Unit 5 Notes New
Data preprocessing is one of the many crucial steps of any data science project. As we
know, real-life data is often unorganized and messy, and models trained on it without
preprocessing tend to perform poorly. We therefore preprocess the data first and then feed
the processed data to our data science models for good performance. One part of
preprocessing is Feature Transformation, which we will discuss in this article.
Feature Transformation refers to the family of algorithms that create new features from
the existing features. These new features may not have the same interpretation as the
original features, but they may have more explanatory power in a different space than in
the original space. Feature Transformation can also be used for Feature Reduction. It can
be done in many ways, for example by linear combinations of the original features or by
non-linear functions. It can also help machine learning algorithms converge faster.
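
As a minimal sketch of the idea above, the snippet below builds new features from two
existing ones, once with a linear combination and once with non-linear functions. The
function name transform_features and the feature names (length, width, perimeter, area,
log_area) are hypothetical and chosen only for illustration; they do not come from any
particular library.

```python
import math


def transform_features(length, width):
    """Build new features from two existing numeric features.

    All names here are illustrative assumptions, not a library API.
    Both inputs must be strictly positive because of the logarithm.
    """
    area = length * width
    return {
        # Linear combination of the original features.
        "perimeter": 2 * length + 2 * width,
        # Non-linear function (product) of the original features.
        "area": area,
        # Another non-linear function: the log compresses large values.
        "log_area": math.log(area),
    }


features = transform_features(4.0, 2.5)
print(features)
```

Keeping only area in place of length and width would be a simple case of feature
reduction: two original features are replaced by one derived feature, which no longer
has the same interpretation as either original.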
Two common power transformations are Box-Cox and Yeo-Johnson. Box-Cox requires the input
data to be strictly positive (not even zero is acceptable), while Yeo-Johnson supports
both positive and negative data.
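
The difference in accepted input can be seen in a minimal sketch of the two transforms
for a single value and a fixed parameter lam. The function names box_cox and yeo_johnson
are written here for illustration, and lam is passed in by hand as an assumption; in
practice, library routines such as scipy.stats.boxcox, scipy.stats.yeojohnson, or
sklearn.preprocessing.PowerTransformer usually estimate it from the data.

```python
import math


def box_cox(x, lam):
    """Box-Cox transform of one value; defined only for x > 0."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive input")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1) / lam


def yeo_johnson(x, lam):
    """Yeo-Johnson transform of one value; defined for any real x."""
    if x >= 0:
        if lam == 0:
            return math.log1p(x)
        return ((x + 1) ** lam - 1) / lam
    # Negative branch: mirrors the positive one with parameter 2 - lam.
    if lam == 2:
        return -math.log1p(-x)
    return -(((-x + 1) ** (2 - lam)) - 1) / (2 - lam)


for value in (4.0, 0.0, -3.0):
    try:
        bc = box_cox(value, 0.5)
    except ValueError as err:
        bc = f"error: {err}"
    print(value, "box-cox:", bc, "yeo-johnson:", yeo_johnson(value, 0.5))
```

With these definitions, box_cox rejects 0.0 and -3.0, while yeo_johnson returns a value
for all three inputs.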