Logistic Regression and Feature Extraction based Fault Diagnosis of Main Bearing of Wind Turbines

Muhammad Kamran Bodla, Sarmad Majeed Malik, Muhammad Numan, Muhammad Zeeshan Ali, Jimmy Baimba Brima, Muhammad Tahir Rasheed

Electrical Department, National University of Science and Technology (NUST)
Electric Power Systems and Automation, North China Electric Power University, Beijing, China

Abstract: The total installed capacity of wind turbines is continuously increasing with the focus on Renewable Energy Sources (RES), and the reliability and efficiency of wind turbines have become a major issue. Faults occurring in the various components of a wind turbine need to be addressed for improved performance. Among these issues, the wind turbine bearing fault is the most significant. Different techniques, such as online monitoring using Artificial Intelligence (AI), have been proposed, and research is still being carried out in this domain. This paper presents a fault diagnosis analysis of the main shaft bearings of wind turbines. The goal is to monitor the condition of wind turbines for early fault prediction so that the turbine can immediately be adjusted for improved performance and extended service life. Different techniques, such as the Fast Fourier Transform (FFT), the Hilbert Huang Transformation (HHT), feature extraction and Logistic Regression (LR), are applied to a real data set of 18 wind turbines to accurately evaluate the health of the turbine. The results highlight the superior and reliable performance of these techniques for bearing fault detection for cost effective operation and maintenance (O&M).

Keywords: Wind Turbine (WT); early condition monitoring; main bearing; fault diagnosis; Hilbert Huang Transformation (HHT); feature extraction; Logistic Regression

I. INTRODUCTION

Wind energy is growing by 24% every year, which is faster than any other renewable energy source [1]. Wind turbines (WTs) are unmanned, complex, remote power plants which endure intense weather conditions, as they use air as a stochastic input. Because of changing weather conditions, WTs are subject to continuously changing loads [2]. Despite the efforts made in this field, wind turbines are expensive to install and operate. The size and power capacity of WTs are increasing day by day, which poses new challenges in reliability and accessibility. Like any electrical system, WTs also face premature component failure, which results in high energy loss and increased operation and maintenance cost [3]. To overcome these challenges, condition monitoring and early fault diagnosis of the WT components are necessary.

Due to the high maintenance cost and long downtime of WTs, research is being carried out on early fault detection and condition monitoring. Many techniques have been developed, such as fault-tolerant control and ANN. Moreover, SCADA systems are becoming more popular as they provide the most useful information about the status of components [4].

Vibration analysis of bearing data is a most challenging task because vibration data is highly susceptible to noise and other unnecessary signals (including cross frequency and very low frequency signals). A small change in air speed or direction, or any small disturbance in weather condition or load, can totally change the pattern of the vibration data. As a result, it is hard to differentiate normal from abnormal behavior of the bearing vibration signal, which implies that monitoring of the main bearing in a WT is extremely important.

A literature review shows that many techniques have been developed to monitor the condition of bearings using vibration signal analysis. For example, FFT was one of the pioneering approaches to achieve that goal [11]. Later, Time Encoded Signal Processing and Recognition (TESPAR) and envelope analysis were used together to save bearings from total failure [12]. Another approach is discussed in [13], which compares the results of an Artificial Neural Network (ANN) and a Genetic Algorithm (GA) to check which method is more accurate in detecting the abnormal behavior of gearbox signals. Moreover, the use of FFT with the Discrete Wavelet Transform has been revised in many different ways which provide more obvious results. The Discrete Wavelet Transform is also used with envelope analysis to extract the characteristic spectrum of rolling bearing vibration data, which helps to diagnose the fault in the bearing [14]-[16].

Active and passive fault tolerance control methods, along with fault diagnosis and detection, are discussed in [3] to find and diagnose the problem. A list of different condition monitoring techniques for each component of a wind turbine is provided in [10], but much effort is still required to develop more precise sensors and data analysis algorithms. References [7] and [8] explain how to detect faults in blades using a novel pattern classification approach and a pattern recognition approach, respectively. Fault detection in bearings using stress wave analysis at lower speed is given in [9]. All the above techniques have been in use for the last two decades, but none of them has delivered the required results. Some techniques are time consuming while others lack reliability. Generic techniques which can be used for the entire system are not available.

This paper presents a detailed and generic approach to monitoring and diagnosing the condition of the main shaft bearing in a WT. It determines the current health of the main bearing of any WT and can be used with any type and any layout of WTs. The only information required is the vibration data of the main bearing together with the electrical and mechanical parameters of the individual WTs. The approach is demonstrated on the vibration data of 18 WTs and the results are presented.

The paper is organized as follows: Section II presents different signal processing methods. A model is developed with the available data in Section III. Section IV discusses the results of the applied techniques, and the conclusion and future work are presented in Section V.

Condition based monitoring (CBM) is defined as a predictive method of maintenance based on continuous observation of the desired equipment. It can be done either automatically or manually. In our case, an automatic method is used to observe the bearing fault through vibration measurements. Vibration data is collected from different WTs working in different kinds of weather conditions. Each turbine is a 1.5 MW unit. The main bearing consists of an inner ring, an outer ring, rolling bodies and a support frame.

The data of this study is extracted from the bearing radial vibration monitoring of the CMS system. Each part of the signal is 6.4 s long and contains 16,384 points sampled at 2560 Hz. Each point in the data table gives the amplitude of vibration.

A. Downtime Data

The system continuously receives vibration data even when the turbine is not working (for any reason, such as weather conditions). Such data carries wrong information and can lower the efficiency of our prediction method, so it must be removed first to ensure that the analysis reflects the actual condition of the bearing. In this paper, downtime data is removed by comparing the time nodes of the vibration data with SCADA data. Only data with a generator rotational speed n > 8.8 r/min is chosen, as this speed ensures that the WT is working.

B. Fast Fourier Transformation (FFT)

FFT is a very fast frequency-based tool which deals with the internal oscillations of a signal and converts it into the frequency domain through the Discrete Fourier Transform (DFT). FFT is expressed in (1):

$$X_k = \sum_{n=0}^{N-1} x_n e^{-i 2\pi k n / N}, \quad 0 \le k \le N-1 \qquad (1)$$

The FFT response of the vibration signal is not very clear because of unwanted mixed noise. A Butterworth filter is applied to the raw data to make the frequency response as flat as possible so that the results can be interpreted more easily. Fig. 1 shows the original data (top: abnormal, bottom: normal). It is obvious from this graph that the condition of the bearing cannot be described because of the random fluctuations and the mixture of noise.

Fig. 1. Vibration Data obtained from WT bearing

After removal of the higher frequencies and noise by a Butterworth band pass filter, the results are shown in Fig. 2. Although the output signal has regular fluctuations, the failure frequency is still not visible. Fig. 3 shows the Fourier frequency response of the filtered vibration data. It is obvious that in the abnormal data the high frequency components of the vibration signal are prominent and have a great influence on the side band components. In the normal data the vibration signal is relatively stable with negligible high frequency shock.

Fig. 2. Signals comparison after removing the downtime data and applying Butterworth band pass filter

Fig. 3. FFT response of the filtered data

Compared with the normal signal, the fault signal contains high frequency impulses and side frequency components, so the high frequency signal can be used as an important parameter for fault feature extraction. Hilbert Huang Transformation (HHT)

is the best tool to extract these featured frequency impulses. HHT analyzes non-stationary and nonlinear data much more effectively than FFT and wavelet analysis. HHT is an amalgam of two components: an algorithm which decomposes the signal into many small components, called empirical mode decomposition, and a Hilbert spectral analysis tool. It is explained in more detail in the next section.

C. Empirical Mode Decomposition (EMD)

EMD is the algorithm in HHT which breaks any complex signal into a small finite set of functions, making the signal very simple to analyze. These functions are called Intrinsic Mode Functions (IMFs). EMD uses an adaptive and iterative sifting procedure to extract IMFs. The decomposition process is as follows:
• Determine all the extrema (maxima and minima) of the signal X(t).
• Connect all the local maxima with a line. This forms the upper envelope of the signal, e.g. $e_u(t)$.
• Connect all the local minima with a line. This forms the lower envelope of the signal, e.g. $e_l(t)$.
• Find the local mean from the upper and lower envelopes.
$$m(t) = 0.5 \times (e_u(t) + e_l(t)) \qquad (2)$$
• Subtract the local mean from the signal.
$$I(t) = X(t) - m(t) \qquad (3)$$
• Replace X(t) by I(t) and repeat the above steps until I(t) satisfies the stopping criterion of the sifting process and meets the IMF criteria.

Mathematically EMD is defined as:

$$X(t) = r_n + \sum_{i=1}^{n} I_i(t) \qquad (4)$$

where $r_n$ is the final residue from which no more IMFs can be extracted. Every IMF should fulfill the following two conditions: (1) the mean value m(t) should be zero at every point and (2) the number of extrema and the number of zero crossings should be equal or differ by at most 1. EMD decomposes the original signal into many IMFs in such a way that the last IMF shows the minimum frequency component of the signal, often known as the trend.

Fig. 4 shows the application of EMD to the vibration data; it decomposes the signal into 14 IMFs, with the last one known as the trend. It is visible that the trends of the normal and abnormal data are out of phase by 180 degrees. The IMF graphs do not directly reveal the failure frequency. To confirm the usefulness of EMD, the envelope spectra of the first three IMFs of the normal and abnormal data are shown in Fig. 5 and Fig. 6.

Fig. 4 EMD Analysis of Normal and Faulted Bearing

In envelope signal processing, the signal is first sent to a band pass filter and the filtered response is sent to an enveloper (rectifier) to extract the repetition rate of the sharp surges of energy. A clear difference between normal and abnormal data is visible after envelope analysis of the IMFs. The envelope spectrum of the abnormal unit in Fig. 5 contains frequency peaks which reveal the presence of severe faults. On the other hand, the signal of the normal unit contains a more uniform fluctuation frequency. This confirms that IMF frequency amplitudes can be used as an important feature to monitor and diagnose an early fault in bearings. Experiments show that the frequency amplitudes of only the first five IMFs have the most impact in detecting the failure frequency.

The method of feature extraction from EMD is composed of the following steps:
• Decompose the signal into IMFs.
• Choose the most influential IMFs.
• Calculate the total energy of the selected IMFs.
$$E_{imf} = \sum_{n=1} I_i(t) \qquad (5)$$
• Calculate the overall IMF energy of all data points.
$$E = \sum^{n} E_i \qquad (6)$$
• Create the feature character.
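As a minimal sketch of the sifting and feature extraction steps listed above, the following Python code builds the envelopes with cubic splines, applies (2) and (3), and then forms one energy value per IMF. The function names, the fixed number of sifting passes, the pinning of the envelopes to the end samples and the sum-of-squares energy definition are assumptions made for illustration; the paper does not specify them.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema


def sift_once(x):
    """One sifting pass: x(t) minus the mean of its upper and lower envelopes."""
    maxima = argrelextrema(x, np.greater)[0]
    minima = argrelextrema(x, np.less)[0]
    if len(maxima) < 3 or len(minima) < 3:
        return None  # too few extrema: treat x as the residue
    t = np.arange(len(x))
    last = len(x) - 1
    # Assumption: pin both envelopes to the end samples so the splines span the signal.
    upper = CubicSpline(np.r_[0, maxima, last], np.r_[x[0], x[maxima], x[-1]])(t)
    lower = CubicSpline(np.r_[0, minima, last], np.r_[x[0], x[minima], x[-1]])(t)
    mean_env = 0.5 * (upper + lower)  # local mean, as in (2)
    return x - mean_env               # candidate IMF, as in (3)


def emd(x, max_imfs=5, n_sift=10):
    """Extract up to max_imfs IMFs; n_sift passes per IMF is an assumed stopping rule."""
    residue = np.asarray(x, dtype=float).copy()
    imfs = []
    for _ in range(max_imfs):
        if sift_once(residue) is None:
            break  # nothing left to decompose
        h = residue
        for _ in range(n_sift):
            nxt = sift_once(h)
            if nxt is None:
                break
            h = nxt
        imfs.append(h)
        residue = residue - h  # keeps x = residue + sum of IMFs, as in (4)
    return np.array(imfs), residue


def imf_energy_features(imfs):
    """Share of the total energy carried by each IMF (assumed sum-of-squares energy)."""
    energy = np.sum(imfs ** 2, axis=1)
    return energy / energy.sum()
```

Calling `emd(signal, max_imfs=5)` and passing the returned IMFs to `imf_energy_features` yields a five-element vector that could serve as one feature row. A production implementation would normally use a proper sifting stop criterion and more careful boundary handling.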

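The downtime removal, band pass filtering, FFT and envelope analysis of Section II can be sketched as follows. Only the 2560 Hz sampling frequency, the 16,384-point segment length and the 8.8 r/min speed limit come from the text. The filter order, the band edges, the function names and the use of a Hilbert-transform envelope in place of a rectifier-based enveloper are assumptions for illustration.

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt

FS = 2560.0  # sampling frequency in Hz, as stated for the vibration data


def drop_downtime(segments, speed_rpm, min_rpm=8.8):
    """Keep only segments recorded while the generator speed exceeds min_rpm."""
    keep = np.asarray(speed_rpm) > min_rpm
    return np.asarray(segments)[keep]


def bandpass(x, low_hz, high_hz, fs=FS, order=4):
    """Zero-phase Butterworth band pass filter (order and band edges are assumed)."""
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x)


def amplitude_spectrum(x, fs=FS):
    """Single-sided amplitude spectrum computed with the FFT."""
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    amps = np.abs(np.fft.rfft(x)) * 2.0 / len(x)
    return freqs, amps


def envelope_spectrum(x, fs=FS):
    """Spectrum of the signal envelope, with the envelope mean removed."""
    envelope = np.abs(hilbert(x))
    return amplitude_spectrum(envelope - envelope.mean(), fs)
```

For a carrier that is amplitude modulated at a fault-related rate, `envelope_spectrum(bandpass(x, low, high))` shows a peak at the modulation rate rather than at the carrier frequency, which is the property that envelope analysis relies on.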
Following the feature extraction steps of Section II-C, the mean amplitude of a specific frequency band is extracted as the feature character from the first five IMFs of one WT bearing data set (normal and abnormal data) and is used to train the model and test its competence.

Fig. 5 Envelope Spectrum of Faulted Bearing IMFs

Fig. 6 Envelope Spectrum of Normal Bearing IMFs

III. TRAINING AND TESTING THE MODEL

The model is trained with the data set of a single WT. The training data set consists of the mean values of the IMF frequency amplitudes (for both normal and abnormal data of the bearing). The model is developed with the Logistic Regression (LR) technique to learn the behavior of the system from the feature characteristics and calculate the current health status of the bearing.

A. Logistic regression (LR)

LR is a statistical technique which estimates the relationship between two variables, one of which is a dependent variable (DV) with a categorical value while the other is an independent variable. In this study the DV is the status of the bearing, which is either 0 or 1, with 0 meaning abnormal status and 1 meaning normal status.

The value of Y is 1 when $\theta_0 + \theta_1 x + \varepsilon$ is greater than the probability threshold and zero otherwise. The probability threshold is determined by cross validation.

LR maps the input features to the probability of the DV, which is a sigmoid function of the training feature vector and label (x, y) and the parameter $\theta$. The parameter $\theta$ is obtained by maximizing the log likelihood function. The stochastic gradient descent method is applied to find the optimal parameter. The sigmoid function is calculated as:

$$\log \frac{F(x)}{1 - F(x)} = \theta_0 + \theta_1 x \qquad (7)$$

This implies

$$F(x) = \frac{1}{1 + e^{-(\theta_0 + \theta_1 x)}} \qquad (8)$$

From Fig. 4, 5 & 6, it is found that we must take Y = 1 when F(x) ≥ 0.5 and Y = 0 when F(x) < 0.5 to minimize inaccurate diagnoses and maximize the likelihood function. In the case study, 80% of the data is used for training and 20% for testing the model, and the classification results are very good, with an accuracy of 82% as shown in Table 1.

Table 1. Testing Output of the model

Model         Value   Count   Percent
Input data    0       636     32.32%
              1       1332    67.68%
Output data   0       617     31.35%
              1       1351    68.65%
Accuracy      82%

The accuracy is calculated as given in (9).

$$\text{Accuracy} = \frac{x + y}{h + k} \times 100\% \qquad (9)$$

where x is the number of correctly predicted fault occurrences, y is the number of correctly predicted normal occurrences, h is the number of fault occurrences and k is the number of normal instances. This model works well for a single WT, but it cannot classify new data from other WTs because the only feature extracted is the mean value of the amplitude of each component, which is not enough as different WTs work in different weather and operating conditions. More characteristic features therefore need to be extracted to generalize the model.

B. Feature Extraction

Time domain feature extraction is a very basic and critical approach to observing the early degradation of bearings in fault diagnosis techniques. Many features can be extracted from vibration data, as it contains series of numerical values representing acceleration, velocity and root mean square (RMS) values. In this paper 13 features (dimensional and non-dimensional) are extracted for our model.

The graph in Fig. 7 shows the 8 dimensional parameters (maximum value, minimum value, peak value, mean value, root mean square value, RMSE, standard deviation and variance) for 8 WT units. Four WTs are working in normal conditions (named 1, 2, 5 and 7) while the other four WTs are already in abnormal condition (named 6, 10, 25 and 26). It is found that the dimensional parameters are not stable and cannot be used as feature characters for the model. The behavior of some non-dimensional parameters (kurtosis, margin factor, impulse factor, crest factor and waveform index) is therefore observed in Fig. 8 for the same WT units.
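The logistic regression model of Section III-A can be illustrated with a short NumPy sketch. It uses the sigmoid of (8), a per-sample gradient step on the log likelihood, the 0.5 decision threshold and the accuracy of (9). The learning rate, the number of epochs and all function names are assumptions chosen for illustration, not values reported in the paper.

```python
import numpy as np


def sigmoid(z):
    """Logistic function F = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic_sgd(X, y, lr=0.05, epochs=30, seed=0):
    """Fit theta by stochastic gradient steps that increase the log likelihood."""
    rng = np.random.default_rng(seed)
    Xb = np.hstack([np.ones((len(X), 1)), X])  # leading 1 carries theta_0
    theta = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(Xb)):
            p = sigmoid(Xb[i] @ theta)
            theta += lr * (y[i] - p) * Xb[i]   # one-sample log likelihood gradient
    return theta


def predict(theta, X, threshold=0.5):
    """Label 1 (normal) when F(x) >= threshold, otherwise 0 (abnormal)."""
    Xb = np.hstack([np.ones((len(X), 1)), X])
    return (sigmoid(Xb @ theta) >= threshold).astype(int)


def accuracy_percent(y_true, y_pred):
    """Accuracy in the form (x + y) / (h + k) * 100."""
    x = np.sum((y_true == 0) & (y_pred == 0))  # correctly predicted faults
    y = np.sum((y_true == 1) & (y_pred == 1))  # correctly predicted normal cases
    h = np.sum(y_true == 0)                    # fault instances
    k = np.sum(y_true == 1)                    # normal instances
    return 100.0 * (x + y) / (h + k)
```

With a feature matrix `X` of shape (samples, features) and labels `y` in {0, 1}, `predict(fit_logistic_sgd(X_train, y_train), X_test)` returns the predicted bearing status for each test sample.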

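The time domain features named in the Feature Extraction subsection can be computed along the following lines. The formulas are the commonly used textbook definitions and are an assumption here, since the paper does not state them. RMSE is left out because the reference signal it would be measured against is not defined in the text, so the sketch returns 12 of the 13 features.

```python
import numpy as np


def time_domain_features(x):
    """Dimensional and non-dimensional features of one vibration segment."""
    x = np.asarray(x, dtype=float)
    abs_x = np.abs(x)
    peak = abs_x.max()
    mean = x.mean()
    rms = np.sqrt(np.mean(x ** 2))
    var = x.var()
    mean_abs = abs_x.mean()
    return {
        # dimensional parameters
        "max": x.max(),
        "min": x.min(),
        "peak": peak,
        "mean": mean,
        "rms": rms,
        "std": x.std(),
        "variance": var,
        # non-dimensional parameters (assumed definitions)
        "kurtosis": np.mean((x - mean) ** 4) / var ** 2,
        "crest_factor": peak / rms,
        "impulse_factor": peak / mean_abs,
        "margin_factor": peak / np.mean(np.sqrt(abs_x)) ** 2,
        "waveform_index": rms / mean_abs,
    }
```

The same function can be applied either to a raw vibration segment or to a single IMF, which makes it easy to compare how well each feature separates normal from abnormal units in both cases.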
Fig. 7 Comparison of dimensional characters of 8 different WTs

Fig. 8 Comparison of non dimensional characters of 8 different WTs

Fig. 8 shows that the non-dimensional parameters are worse than the dimensional parameters and that direct extraction of features from the original data does not lead to the desired results. Therefore, frequency domain analysis tools need to be applied to make these time domain characters useful for the model. In Section II, the extraction of IMFs by EMD is explained. IMFs are very useful for extracting the feature characters. In the next step, the above mentioned 13 feature characters are calculated from the IMF values of the vibration data instead of the original data, to check whether IMF values can be used for more feature extraction. The same 4 steps mentioned in Section II are repeated to extract the features, and the results are shown in Fig. 9 and Fig. 10.

Fig. 9 Comparison of dimensional characters of IMF of 8 different WTs

A comprehensive comparison of the 8 dimensional features extracted from the first five IMFs is carried out, and it is observed that 5 characters (mean, root mean square value, RMSE, standard deviation and variance) show a perfect discrimination between normal and faulty bearing data. Fig. 9 shows the comparison of the 8 dimensional characters for just the first IMF. On the other hand, it can be seen from Fig. 10 that the non-dimensional characters are unstable and there is no visible discrimination as in Fig. 9.

Fig. 10 Comparison of non dimensional characters of IMF of 8 different WTs

It implies that the mean, root mean square value, RMSE, standard deviation and variance can be used as feature characters to train the model. The next step is to use these characters to extract the feature vector as the training set for the model and to continue testing to get more precise results.

IV. RESULTS AND DISCUSSIONS

The most influential parameters are used to extract the main information from the raw data, and this information is then used to train the model. Data from seven different WTs of a wind farm is used to test the final model. Five random data sets from random WTs are taken as the training set for the model and two data sets are used to test the model. An example is shown in Fig. 11.

Fig. 11 Comparison of actual and predicted events of the Model (plot title: Actual Results VS Predicted Results; legend: actual result, predicted results)
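The turbine-wise evaluation, in which five of the seven WTs provide the training data and the remaining two provide the test data, can be sketched as below. The paper draws its combinations at random; enumerating every possible split, as well as the function name, is an assumption made here for illustration.

```python
from itertools import combinations


def turbine_splits(turbine_ids, n_test=2):
    """Yield (train_ids, test_ids) for every way of holding out n_test turbines."""
    ids = sorted(turbine_ids)
    for test in combinations(ids, n_test):
        train = tuple(i for i in ids if i not in test)
        yield train, test


if __name__ == "__main__":
    # Seven turbines, two held out per split, gives 21 possible combinations.
    for train, test in turbine_splits(range(1, 8)):
        print("train:", train, "test:", test)
```

Training on the `train` turbines and scoring on the `test` turbines for each split gives one accuracy value per combination, which can then be averaged.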

Fig. 11 shows the result of testing the model on WT 4 and 7 after training on the data of WT 2, 3, 1, 6 and 5. The graph shows that the actual data and the predicted data are almost identical for most of the values, with an accuracy of 87.9%. After that, the model was tested for other combinations of WTs, as given in Table 2.

Table 2. Model Results

Training set   Testing set   Accuracy (%)
2/3/6/1/5      4/7           87.91
4/3/7/4/6      1/2           85.5
4/2/6/1/5      3/7           84.41
1/7/5/4/6      2/3           81
1/5/7/2/6      3/4           94.93
6/7/2/4/3      1/5           91.27
5/6/3/4/2      1/7           81.54
Average Accuracy: 87%

Table 2 confirms that the accuracy of the model is high, between 81% and 94.93%, for every tested combination of data. Although these WTs face different wind speeds and wake shocks, the experimental results show that this model is reliable and efficient for condition monitoring of WT bearings. Fig. 12 shows the steps required to develop this model accurately.

Fig. 12 Process of model development

V. CONCLUSION

Vibration analysis is a key part of condition monitoring of wind turbine components. In this paper an experiment is conducted using the vibration data of normal and faulty WTs to monitor the condition of the bearing. Different time domain and frequency domain methods are used to extract the best features from the data. A statistical regression technique is applied to the extracted features and, with a high accuracy of 87%, the presented model is found to be a strong choice for the industry to monitor and diagnose the bearings of WTs and save the high cost of replacing equipment. The modeling approach used in this paper is easy, time saving and stable, as it can be used for any number of WTs in a wind farm without making any change to the model. This paper presents a superior and consistent performance of the model for bearing fault detection, as verified in Section IV.

Although the presented technique has high accuracy, there are some limitations since the size of the fault data was limited. Many faults did not occur due to low frequency data. These faults are very difficult to sense by any modeling methodology. More data from various WT manufacturers is required to confirm the diversity of the model. More feature characters are required to discriminate the normal and abnormal behavior more precisely. The availability of high quality data which is free from any mixture of low frequency data can increase the diversity of this model.

REFERENCES
[1] WWEA, "World Wind Energy Report" 2011.
[2] Tchakoua, P.; Wamkeue, R.; Ouhrouche, M.; Slaoui-Hasnaoui, F.; Tameghe, T.A.; Ekemb, G. Wind Turbine Condition Monitoring: State of the Art Review, New Trends, and Future Challenges. Energies 2014, 7, 2595-2630.
[3] Badihi, Hamed, Youmin Zhang, and Henry Hong. "A review on application of monitoring, diagnosis, and fault-tolerant control to wind turbines." Control and Fault-Tolerant Systems (SysTol), 2013 Conference on. IEEE, 2013.
[4] Bangalore, Pramod, and Lina Bertling Tjernberg. "Self evolving neural network based algorithm for fault prognosis in wind turbines: A case study." Probabilistic Methods Applied to Power Systems (PMAPS), 2014 International Conference on. IEEE, 2014.
[5] Besnard, François, and Lina Bertling. "An approach for condition-based maintenance optimization applied to wind turbine blades." Sustainable Energy, IEEE Transactions on 1.2 (2010): 77-83.
[6] Tsung-Yi Lin, Chen-Yu Lee. "Logistic Regression" (2013)
[7] Toubakh, Houari, Moamar Sayed-Mouchaweh, and Eric Duviella. "Advanced pattern recognition approach for fault diagnosis of wind turbines." Machine Learning and Applications (ICMLA), 2013 12th International Conference on. Vol. 2. IEEE, 2013.
[8] Wisznia, Roman. "Condition Monitoring of Offshore Wind Turbines."
[9] N. Jamludin et al. (2001) Condition monitoring of slow speed rolling elements bearings using stress waves. In: Proceedings of the institution of the Mechanical Engineering. Vol 215, No 4, 200, pp 245-271
[10] T.W. Verbruggen (2003) Wind turbine operation and maintenance based on condition monitoring. ECN-C-03-047. Pages 1-33
[11] Jayaswal P, Agrawal B. New Trends In Wind Turbine Condition Monitoring System. Int J Emerg Trends Eng Dev 2011;3(1):133-48.
[12] Abdussiam S, Ahmed M, Raharjo P, Gu F, Ball AD. Time Encoded Signal Processing and Recognition of incipient bearing faults conference. In: Proceedings of the 17th International Conference on Automation and Computing, ICAC 2011. Huddersfield, United Kingdom. September 10; 2011. Source: DBLP.
[13] Ziani R, Zagadi R, Felkaoui A, Djouada M. Bearing Fault Diagnosis Using Neural Network and Genetic Algorithms with the Trace Criterion. Condition Monit Mach Non-Station Operat 2012:89-96.
[14] M. Sarvajith, B. Shah, S. Kulkarni, S. Jana. Condition Monitoring of Rolling Element Bearing Using Wavelet Transform and Support Vector Machine. Conference: NCCM 2013.
[15] C.K.E. Nizwan, S.A. Ong, M.F.M. Yusof, M.Z. Baharom. A wavelet decomposition analysis of vibration signal for bearing fault detection. IOP Conference Series Materials Science and Engineering 50 1. 2013 doi: 10.1088/1757-899X/50/1/012026.11.
[16] Sun W, Yang GA, Chen Q, Palazoglu A, Feng K. Fault diagnosis of rolling bearing based on wavelet transform and envelope spectrum correlation. J Vib Control 2013;19(6):924-41.

