Real-Time Motor Fault Detection by 1D Convolutional Neural Networks


This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TIE.2016.2582729, IEEE Transactions on Industrial Electronics

Turker Ince, Member, IEEE, Serkan Kiranyaz, Senior Member, IEEE, Levent Eren, Member,
IEEE, Murat Askar and Moncef Gabbouj, Fellow Member, IEEE

Manuscript received December 10, 2015; revised April 21, 2016; accepted May 8, 2016.
T. Ince, L. Eren and M. Askar are with the Electrical & Electronics Engineering Department, Izmir University of Economics, Izmir, Turkey.
S. Kiranyaz is with the Electrical Engineering Department, Qatar University, Doha, Qatar.
M. Gabbouj is with the Department of Signal Processing, Tampere University of Technology, Tampere, Finland.
0278-0046 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Abstract: Early detection of motor faults is essential, and Artificial Neural Networks (ANNs) are widely used for this purpose. Typical systems encapsulate two distinct blocks: feature extraction and classification. Such fixed and hand-crafted features may be a sub-optimal choice and require a significant computational cost that will prevent their usage for real-time applications. In this paper, we propose a fast and accurate motor condition monitoring and early fault detection system using 1D Convolutional Neural Networks (CNNs) that have an inherent adaptive design to fuse the feature extraction and classification phases of motor fault detection into a single learning body. The proposed approach is directly applicable to the raw data (signal) and thus eliminates the need for a separate feature extraction algorithm, resulting in more efficient systems in terms of both speed and hardware. Experimental results obtained using real motor data demonstrate the effectiveness of the proposed method for real-time motor condition monitoring.

Index Terms: Convolutional Neural Networks; Motor Current Signature Analysis

I. INTRODUCTION

MOTOR fault detection and diagnosis methods can be divided into three major categories: model-based, signal-based, and knowledge-based. Model-based methods use mathematical models describing the normal operating conditions of the induction motors [1]. In model-based methods, fault diagnosis algorithms are developed to monitor the consistency between the measured outputs of the practical systems and the model-predicted outputs [2]. The main advantage of a model-based method is that the fault diagnosis is very straightforward if the model parameter has a one-to-one mapping with the physical coefficients [3]. The signal-based methods usually employ one of four main classes of signal processing techniques [4]: time-domain analysis [5],[7], frequency-domain analysis [8],[9], enhanced frequency analysis [10],[11], and time–frequency analysis techniques [12],[14]. The signal-based systems do not require an explicit or complete model of the system but their performance may degrade when working in an unknown or unbalanced condition. It is a well-known fact that as the complexity of the advanced signal processing tools increases, the fault detection capability increases together with the computational cost [6]. The knowledge-based systems may be divided into two groups: qualitative methods on the basis of symbolic intelligence and quantitative methods on the basis of machine learning intelligence [3]. The qualitative methods include fault trees, digraphs, and expert systems whereas quantitative methods include both unsupervised learning systems such as K-means, C-means, nearest neighbor, principal component analysis (PCA), and self-organizing maps (SOM), and supervised learning systems such as artificial neural networks (ANN), fuzzy logic (FL), support vector machines (SVM), partial least squares (PLS), and hybrid systems. The hybrid systems may be more suitable for complex fault detection problems where the features are extracted from statistical projection methods such as PCA and PLS, or signal processing methods such as fast Fourier transform (FFT) and wavelet transform (WT). The performance of knowledge-based methods relies heavily on the training data and the quality of the selected features.
In several studies [15]-[27] different features are proposed. The selected features are presented to classifiers as inputs. Diagnosis of electric stator faults in induction machines using an ANN based approach is proposed in [17]. Machine fault conditions were predicted with less than 2.4% error using only 13 training data patterns and 9 validation data patterns. In [18], Li et al. presented a neural-network based motor bearing fault diagnosis system using time and frequency based features, which achieved average detection rates between 88.75% and 96.25% for different numbers of hidden neurons. In [21], a neural-network-based fault prediction scheme without using any machine parameter or speed information is presented. Speed is estimated from measured terminal voltage and current. With minimal tuning of the neural network, induction machines of different power ratings can be accommodated, and 93% or more detection performance is achieved. In [22], two types of neural detectors, feedforward multi-layer perceptron (MLP) and self-organized Kohonen’s

network, were employed to classify healthy or damaged bearings with 85% accuracy. Tung et al. [23] proposed a CART-ANFIS based classifier to perform fault diagnosis of induction motors and, for six different fault classes with 180 training and 90 test samples, obtained total classification accuracy of 91.11% and 76.67% for vibration and current signals, respectively. In another work [24], by using a self-organizing map, cluster information from frequency-domain features is extracted, and fault mode prediction with an error rate of 1.48% is achieved using a 2-dimensional multi-class SVM.
Although mostly satisfactory levels of anomaly detection accuracies were reported, most of these prior studies had to utilize different features and/or classifiers for various types of motor data. This basically shows how crucial the choice of the right features is for characterizing the specific signals used. Therefore, it is obvious that such features that are either manually selected or hand-crafted may not optimally characterize any motor current signal and thus cannot accomplish a generic solution that can be used for any motor data. In other words, which feature extraction is the optimal choice for a particular signal (motor current data) still remains unanswered to date. Furthermore, feature extraction usually turns out to be a computationally costly operation which eventually may hinder the usage of such methods in real-time monitoring applications. In this study, we aim to address these drawbacks and limitations using Convolutional Neural Networks (CNNs).
CNNs are feed-forward and constrained 2D neural networks that have both alternating convolutional and subsampling layers. Convolutional layers basically model the cells in the human visual cortex [28]. The final layers after the convolutional layers are fully connected and thus resemble MLPs. CNNs aim to mimic the mammalian visual system which can accurately recognize certain patterns and structures such as objects in a visual scenery. CNNs have recently become the de-facto standard for “deep learning” tasks such as object recognition in large image archives and achieved the state-of-the-art performances [29]-[31] with a significant performance gap. In our earlier work, the adaptive CNNs have successfully been used over 1D electrocardiogram (ECG) signals, in particular for the purpose of ECG classification and anomaly detection [32], and exhibited a superior performance in terms of both accuracy and speed. The main reason behind this is that during the training phase the convolution layers of the CNNs basically are optimized to extract highly discriminative features using a large set of 1D filter kernels. The latter layers basically mimic an MLP which performs the classification (learning) task. As a result, when trained properly for a particular signal collection (dataset), they can optimize both feature extraction and classification tasks according to the problem at hand. Usually the optimization technique is a gradient-descent method with random initialization, the so-called back-propagation (BP) method that iteratively searches for the optimal set of network parameters (filter coefficients, MLP weights and biases).
In this paper, we propose a fast, generic and highly accurate motor anomaly detection and condition monitoring system using an adaptive 1D Convolutional Neural Network. With a proper adaptation over the traditional CNNs, the proposed approach can directly classify input signal samples acquired from the motor current, therefore resulting in an efficient system in terms of speed that allows a real-time application. As mentioned earlier, due to the CNNs’ ability to learn to extract the optimal features, with the proper training, the proposed system can achieve an elegant classification and fault detection accuracy. The overview of the proposed system is illustrated in Fig. 1.
In this study, we further aim to demonstrate that simple CNN configurations can easily achieve an elegant detection performance rather than the complex ones commonly used for deep learning. In this way, using compact 1D CNNs one can easily perform a few hundred back-propagation (BP) iterations for efficient training, after which real-time monitoring and continuous anomaly detection can be accomplished since a compact CNN only performs a few hundred 1D convolutions to generate the output decision vector. This makes them an ideal tool to be used in an accurate, real-time, and cost-effective motor fault detection and early fault alert system. In summary, the contributions of the paper are the following:
• We propose a novel approach for motor fault detection using 1D CNNs that can merge feature extraction and classification tasks into a single machine learner. To our knowledge, this is the pioneering work applied for this purpose.
• By directly learning the best possible features from the motor’s training data, the proposed generic classifier can adapt to possible variability of motor current signatures and it is applicable to different types of electrical machine failures.
• The proposed method does not require any form of transformation, feature extraction, and post-processing. It can directly work over the raw data, i.e., the motor current signal, to detect the anomalies.
• As a result, while achieving an elegant classification performance, the computational complexity of the proposed method is significantly lower than any prior work and thus enables the real-time detection capability.
The rest of the paper is organized as follows: A brief introduction to motor faults is provided in Section II. Section III outlines the motor fault diagnosis dataset and the down-sampling process performed over the raw data. The proposed 1D CNNs along with the formulations of the back-propagation training are presented in Section IV. In Section V, the experimental results obtained using the real motor data are presented and performance of the proposed approach is evaluated using the standard performance metrics. Finally, Section VI concludes the paper and suggests topics for the future research.


II. MOTOR FAULTS
The main sources of failure for induction machines include both mechanical types caused by bearing faults and electrical types caused by insulation or winding faults. Bearing faults are by far the highest single cause of all motor failures. They are the most difficult to detect but the least expensive to fix when detected early enough and replaced [12]. Consequently, this study focuses on detecting bearing faults as early as possible.
Bearing faults are mechanical defects and they cause vibration at fault related frequencies. The fault related frequencies can be determined if both bearing geometry and shaft speed are available. Typical ball bearing geometry is depicted in Figure 2.

Fig. 1. Overview of the proposed approach with training (offline) and real-time monitoring and fault detection phases.

Fig. 2. Ball bearing geometry, showing the pitch diameter (PD) and the ball diameter (BD).

The equations used for calculating both characteristic vibration frequencies and current frequencies are given as follows [34]:
Outer race defect frequency, $f_{OD}$, the ball passing frequency on the outer race, is given by

$$f_{OD} = \frac{n}{2} f_{rm} \left(1 - \frac{BD}{PD}\cos\phi\right) \tag{1}$$

where $f_{rm}$ is the rotor speed in revolutions per second, $n$ is the number of balls, and the angle $\phi$ is the contact angle which is zero for ball bearings.
Inner race defect frequency, $f_{ID}$, the ball passing frequency on the inner race, is expressed as

$$f_{ID} = \frac{n}{2} f_{rm} \left(1 + \frac{BD}{PD}\cos\phi\right) \tag{2}$$

Cage defect frequency, $f_{CD}$, caused by irregularity in the rolling element train, is given by

$$f_{CD} = \frac{1}{2} f_{rm} \left(1 - \frac{BD}{PD}\cos\phi\right) \tag{3}$$

Ball defect frequency, $f_{BD}$, the ball spin frequency, is given by

$$f_{BD} = \frac{PD}{2\,BD} f_{rm} \left(1 - \left(\frac{BD}{PD}\right)^{2}\cos^{2}\phi\right) \tag{4}$$

The bearing dimension data ($n$, $PD$, $BD$) can be easily obtained from the manufacturer in most cases. The mechanical vibration due to the bearing defect results in air gap eccentricity. Oscillations in air gap width in turn cause variations in flux density. The variations in flux density affect the machine inductances, producing stator current vibration harmonics [7]. The characteristic current frequencies, $f_{CF}$, due to bearing characteristic vibration frequencies can be expressed as

$$f_{CF} = \left| f_e \pm m f_v \right| \tag{5}$$

where $f_e$ is the line frequency, $m$ is an integer and $f_v$ is the characteristic vibration frequency obtained from Eqs. 1-4.
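As an illustration, Eqs. (1)-(5) can be evaluated with a few lines of Python. This is a minimal sketch rather than code from the study: the function names and the `max_m` harmonic limit are hypothetical, the bearing dimensions in the usage lines are made-up values (real ones come from the manufacturer), and the 60 Hz line frequency is an assumption. Only the 9-ball count and the 1750 r/min rated speed are taken from the test motor described in Section III.

```python
import math


def bearing_vibration_frequencies(f_rm, n_balls, bd, pd, contact_angle=0.0):
    """Characteristic vibration frequencies in Hz, following Eqs. (1)-(4).

    f_rm: rotor speed in revolutions per second
    n_balls: number of balls
    bd, pd: ball and pitch diameters (same unit)
    contact_angle: contact angle in radians (zero for ball bearings)
    """
    ratio = (bd / pd) * math.cos(contact_angle)
    return {
        "outer_race": 0.5 * n_balls * f_rm * (1.0 - ratio),   # Eq. (1)
        "inner_race": 0.5 * n_balls * f_rm * (1.0 + ratio),   # Eq. (2)
        "cage": 0.5 * f_rm * (1.0 - ratio),                   # Eq. (3)
        "ball": pd / (2.0 * bd) * f_rm * (1.0 - ratio ** 2),  # Eq. (4)
    }


def current_frequencies(f_e, f_v, max_m=3):
    """Characteristic current frequencies |f_e +/- m * f_v| for m = 1..max_m, Eq. (5)."""
    freqs = set()
    for m in range(1, max_m + 1):
        freqs.add(abs(f_e + m * f_v))
        freqs.add(abs(f_e - m * f_v))
    return sorted(freqs)


if __name__ == "__main__":
    # Hypothetical geometry in mm; the 60 Hz line frequency is also an assumption.
    vib = bearing_vibration_frequencies(f_rm=1750 / 60, n_balls=9, bd=7.9, pd=39.0)
    for name, f_v in vib.items():
        f_cf = [round(f, 2) for f in current_frequencies(60.0, f_v, max_m=2)]
        print(f"{name}: f_v = {f_v:.2f} Hz, f_CF = {f_cf} Hz")
```

With a zero contact angle, the outer and inner race frequencies always sum to $n f_{rm}$, which is a quick sanity check on any geometry entered.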


III. MOTOR FAULT DATA PREPARATION
Motor fault related frequency components usually show up in close neighborhood of the fundamental frequency in the motor current spectrum. Their magnitudes are very small compared to the magnitude of the power system fundamental frequency. Therefore, the presence of electrical noise and the dominant power system fundamental component in the current frequency spectrum complicate the motor fault detection process. Usually notch filters are used for pre-processing of motor current data to suppress the power system fundamental frequency in the current spectrum.

Fig. 3. Sample healthy motor current signal and its amplitude spectrum before and after preprocessing (panels: original motor current signal, original signal spectrum, preprocessed motor current signal, preprocessed signal spectrum).

Fig. 4. Sample faulty motor current signal and its amplitude spectrum before and after preprocessing (panels: original motor current signal, original signal spectrum, preprocessed motor current signal, preprocessed signal spectrum).

The test system consists of a three-phase, one hp, 200 V, four-pole, 1750 r/min induction motor (US Motors Frame 143T) and a Square D CM4000 industrial circuit monitor to capture current data. The shaft end ball bearing is a 6205-2Z-J/C3 (9 balls) and the opposite end ball bearing is a 6203-2Z-J/C3 (8 balls).
In data collection, baseline data is taken for the motor under monitoring using a healthy set of bearings. Then, the cage of the shaft end bearing is dented to simulate a cage defect, and line current is sampled under the same loading condition to collect data from a motor with a faulty bearing. Motor current is captured at 128 points per cycle for a minute in each trial. The current data is then filtered by a second order notch filter to suppress the fundamental frequency for preprocessing.
The raw input current signal is down-sampled by a factor of 8 by performing a decimation preceded by anti-aliasing filtering. The decimated signal is then normalized properly to be the input of the 1D CNN classifier. The decimation allows the usage of a simpler CNN configuration, which in turn improves both training and detection speeds. Finally, the training and test sets are normalized to have zero mean and unity standard deviation to remove the effect of dc offset and amplitude biases, and then linearly scaled into the [-1, 1] interval before being presented to the CNN classifier. Sample healthy and faulty motor current signals and their amplitude spectrum before and after preprocessing are shown in Figures 3 and 4.

IV. THE PROPOSED SYSTEM WITH 1D CNNS

A. Overview of CNNs
CNNs are biologically inspired feed-forward ANNs that present a simple model for the mammalian visual cortex. They are now widely used and have become the de-facto standard in many image and video recognition systems. Fig. 5 illustrates a 2D CNN model with an input layer accepting 28x28 pixel images. Each convolution layer after the input layer alternates with the sub-sampling layers which decimate propagated 2D maps from the neurons of the previous layer. Unlike hand-crafted and fixed parameters of the 2D filter kernels, in CNNs they are trained (optimized) by the back-propagation (BP) algorithm. However, the kernel size and the sub-sampling factor, which are set to 5 and 2 for illustration in Fig. 5, are the two major parameters of the CNN. The input layer is only a passive layer which accepts an input image and assigns its (R,G,B) color channels as the feature maps of its three neurons. With forward propagation over a sufficient number of sub-sampling layers, they are decimated to a scalar (1-D) at the output of the last sub-sampling layer. The following layers are identical to the layers of an MLP, fully-connected and feed-forward networks that have the output layer estimating the decision (classification) vector.
In order to accomplish decimation until a scalar is achieved at the output CNN layer, the entire CNN configuration (number of convolutional, sub-sampling and MLP layers) has to be arranged according to the input image dimensions. Usually it is the other way around, i.e., the input image dimension is adapted according to the CNN configuration. To address this drawback we performed certain modifications on the CNN topology and further formulated the back-propagation training of a 1D CNN that works over 1D (time) signals.
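The dependency between the input dimension and the layer configuration in a conventional CNN can be made concrete with a short size trace. The sketch below is an illustration under stated assumptions, not a description of the exact network in Fig. 5: it assumes unpadded ("valid") convolutions and non-overlapping sub-sampling, and the function name `trace_map_sizes` is hypothetical.

```python
def trace_map_sizes(input_size, kernel_size, ss, num_pairs):
    """Return the map side length after every convolution and sub-sampling layer.

    Assumes valid (unpadded) convolution and non-overlapping sub-sampling.
    """
    sizes = [input_size]
    size = input_size
    for _ in range(num_pairs):
        if size < kernel_size:
            raise ValueError(f"map of size {size} is smaller than the kernel ({kernel_size})")
        size = size - kernel_size + 1  # convolution shrinks the map by K - 1
        sizes.append(size)
        if size < ss:
            raise ValueError(f"map of size {size} cannot be sub-sampled by {ss}")
        size //= ss  # sub-sampling divides the size by ss
        sizes.append(size)
    return sizes


print(trace_map_sizes(28, 5, 2, 2))  # [28, 24, 12, 8, 4]
print(trace_map_sizes(16, 5, 2, 2))  # [16, 12, 6, 2, 1]
```

Under these assumptions, a 28-sample side with K = 5 and ss = 2 stops at 4 after two convolution/sub-sampling pairs and cannot take a third 5-tap convolution, whereas a 16-sample side reaches a scalar. This is the sense in which the input size and the configuration must be matched to each other in the conventional design.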

Fig. 5. Overview of a sample conventional CNN

Fig. 6. The convolution layers of the proposed adaptive 1D CNN configuration (diagram labels: Layer (l-1), Layer l, Layer (l+1); array sizes 1x22, 1x20, 1x10 and 1x8; SS(2) sub-sampling and US(2) up-sampling).

B. Adaptive 1D CNNs and Back-Propagation
As mentioned earlier we used an adaptive 1D CNN configuration in order to fuse feature extraction and learning (fault detection) phases of the raw motor current signals. The adaptive CNN topology will allow us to work with any input layer dimension. Furthermore, the proposed compact CNNs


have now the hidden neurons of the convolution layers that can perform both convolution and sub-sampling operations as shown in Fig. 6. This is why we call the fusion of a convolution and a sub-sampling layer the “CNN layer” to make the distinction, but still call the remaining layers the MLP layers. So, the 1D CNNs are composed of an input layer, hidden CNN and MLP layers and an output layer.
Further structural differences are visible between the traditional 2D and the proposed 1D CNNs. The main difference is the usage of 1D arrays instead of 2D matrices for both kernels and feature maps. Accordingly, the 2D matrix manipulations such as 2D convolution (conv2D) and lateral rotation (rot180) have now been replaced by their 1D counterparts, conv1D and reverse. Moreover, the parameters for kernel size and sub-sampling are now scalars, K and ss for 1D CNNs, respectively. However, the MLP layers are identical to their 2D counterparts and therefore have the same, traditional BP formulation.
In 1D CNNs, the 1D forward propagation (FP) from a previous convolution layer, l-1, to the input of a neuron in the current layer, l, can be expressed as

$$x_k^l = b_k^l + \sum_{i=1}^{N_{l-1}} \mathrm{conv1D}\left(w_{ik}^{l-1}, s_i^{l-1}\right) \tag{6}$$

where $x_k^l$ is the input, $b_k^l$ is a scalar bias of the kth neuron at layer l, and $s_i^{l-1}$ is the output of the ith neuron at layer l-1. $w_{ik}^{l-1}$ is the kernel from the ith neuron at layer l-1 to the kth neuron at layer l. The intermediate output of the neuron, $y_k^l$, can then be expressed from the input, $x_k^l$, as follows:

$$y_k^l = f\left(x_k^l\right) \quad \text{and} \quad s_k^l = y_k^l \downarrow ss \tag{7}$$

where $s_k^l$ is the output of the neuron and $\downarrow ss$ represents the down-sampling operation with the factor ss.
The adaptive CNN configuration requires the automatic assignment of the sub-sampling factor of the output CNN layer (the last CNN layer). It is set to the size of its input array. For instance, in Figure 6 assume that the layer l+1 is the last CNN layer, then ss = 8 automatically since the input array size is 8. Such a design allows the usage of any number of CNN layers. This adaptation capability is possible in this CNN configuration because the output dimension of the last CNN layer can be automatically downsized to 1 (scalar) regardless of the native sub-sampling factor parameter that was set in advance for the CNN.
We shall now briefly formulate the back-propagation (BP) steps. The BP of the error starts from the output MLP layer. Let l=1 and l=L be the input and output layers, respectively. Also let $N_L$ be the number of classes in the database. For an input vector p, and its corresponding target and output vectors, $t_i^p$ and $[y_1^L, \ldots, y_{N_L}^L]$, respectively, the mean-squared error (MSE) in the output layer for the input p, $E_p$, can be expressed as follows:

$$E_p = \mathrm{MSE}\left(t_i^p, \left[y_1^L, \ldots, y_{N_L}^L\right]\right) = \sum_{i=1}^{N_L} \left(y_i^L - t_i^p\right)^2 \tag{8}$$

The objective of the BP is to minimize the contributions of network parameters to this error. Therefore, we aim to compute the derivative of the MSE with respect to an individual weight (connected to that neuron, k), $w_{ik}^{l-1}$, and bias of the neuron k, $b_k^l$, so that we can perform the gradient descent method to minimize their contributions and hence the overall error in an iterative manner. Specifically, the delta of the kth neuron at layer l, $\Delta_k^l$, will be used to update the bias of that neuron and all weights of the neurons in the previous layer connected to that neuron, as

$$\frac{\partial E}{\partial w_{ik}^{l-1}} = \Delta_k^l \, y_i^{l-1} \quad \text{and} \quad \frac{\partial E}{\partial b_k^l} = \Delta_k^l \tag{9}$$

So from the first MLP layer to the last CNN layer, the regular (scalar) BP is simply performed as

$$\frac{\partial E}{\partial s_k^l} = \Delta s_k^l = \sum_{i=1}^{N_{l+1}} \frac{\partial E}{\partial x_i^{l+1}} \frac{\partial x_i^{l+1}}{\partial s_k^l} = \sum_{i=1}^{N_{l+1}} \Delta_i^{l+1} w_{ki}^l \tag{10}$$

Once the first BP is performed from the next layer, l+1, to the current layer, l, then we can further back-propagate it to the input delta, $\Delta_k^l$. Let the zero order up-sampled map be $us_k^l = \mathrm{up}(s_k^l)$, then one can write:

$$\Delta_k^l = \frac{\partial E}{\partial y_k^l} \frac{\partial y_k^l}{\partial x_k^l} = \frac{\partial E}{\partial us_k^l} \frac{\partial us_k^l}{\partial y_k^l} f'\left(x_k^l\right) = \mathrm{up}\left(\Delta s_k^l\right) \beta \, f'\left(x_k^l\right) \tag{11}$$

where $\beta = (ss)^{-1}$ since each element of $s_k^l$ was obtained by averaging ss number of elements of the intermediate output, $y_k^l$. The inter BP of the delta error ($\Delta s_k^l \xleftarrow{\Sigma} \Delta_i^{l+1}$) can be expressed as

$$\Delta s_k^l = \sum_{i=1}^{N_{l+1}} \mathrm{conv1Dz}\left(\Delta_i^{l+1}, \mathrm{rev}\left(w_{ki}^l\right)\right) \tag{12}$$

where rev(.) reverses the array and conv1Dz(.,.) performs full convolution in 1D with K-1 zero padding. Finally, the weight and bias sensitivities can be expressed as

$$\frac{\partial E}{\partial w_{ki}^l} = \mathrm{conv1D}\left(s_k^l, \Delta_i^{l+1}\right) \quad \text{and} \quad \frac{\partial E}{\partial b_k^l} = \sum_n \Delta_k^l(n) \tag{13}$$

As a result, the iterative flow of the BP algorithm can be stated as follows:
1) Initialize all weights (usually randomly, U(-a, a))
2) For each BP iteration DO:
   a. For each item (or a group of items or all items) in the dataset, DO:


i. FP: Forward propagate from the input layer to the performance. The 1D CNN configuration used in all
output layer to find outputs of each neuron at each experiments has [60 40 40] neurons on the 3 hidden
layer, yi , i  1, N l and l  1, L  .
l convolution layers and 20 neurons on the hidden MLP layer.
The output (MLP) layer size is 2 which is the number of
ii. BP: Compute delta error at the output layer and classes and there is a single input neuron which takes the input
back-propagate it to first hidden layer to compute signal as the 240 (time-domain) samples of the decimated
the delta errors,  k , k  1, N l and l  2, L  1
l
motor current data. The two parameters of the1D CNN, the
kernel size, K, and the sub-sampling factor, ss, are set to 9 and
iii. PP: Post-process to compute the weight and bias 4, respectively. In this case, the sub-sampling factor for the
sensitivities. last CNN layer is set to 4, which is automatically determined
iv. Update: Update the weights and biases with the in the proposed adaptive CNN implementation.
(accumulation of) sensitivities found in (c) scaled For all experiments we assigned a two-fold stopping criteria
with the learning factor, ε: for BP training: the minimum train classification error (CE) is
E 0.5% or the maximum number of BP iterations is 100.
wikl 1 (t  1)  wikl 1 (t )   Whenever either criterion is met the BP training stops. The
wikl 1
(14) learning factor, ε, is initially set as 0.001 and the global
E adaptation is performed during each BP iteration: for the next
b (t  1)  b (t )   l
l l
k k
bk iteration if the train MSE decreases in the current iteration ε is
increased by 5%; otherwise, reduced by 30%. We repeated 10
individual BP runs for each data partition and we reported the
average anomaly detection performances.
V. EXPERIMENTAL RESULTS
In this section, the experimental setup for the test and B. Detection Performance Evaluation
evaluation of the proposed motor condition monitoring An extensive set of experiments are performed using real
approach is first presented. Then, the overall results obtained motor current data samples for a total of 260 healthy (H) and
from the experiments using real motor data are presented in 260 faulty (F) cases. The dataset is obtained from a three-
terms of the most common metrics found in the literature: phase squirrel cage induction motor using an industrial circuit
classification accuracy (Acc), sensitivity (Sen), specificity monitor for capturing motor current data. The proposed
(Spe), and positive predictivity (Ppr). While accuracy adaptive 1D CNN classifier is implemented by C++ using MS
measures the overall system performance over the two classes
of motor data, Healthy (H) and Faulty (F), the other metrics
are specific to each class and they measure how well the
classification algorithm recalls each class. The expressions of
these standard performance metrics using the hit/miss
counters, i.e., true positive (TP), true negative (TN), false
positive (FP), and false negative (FN), are as follows:
Accuracy is the ratio of the number of correctly classified
patterns to the total number of patterns classified, Acc =
(TP+TN)/(TP+TN+FP+FN); Sensitivity (Recall) is the rate of
correctly classified fault events among all fault events, Sen =
TP/(TP+FN); Specificity is the rate of correctly classified
normal (H) events among all H events, Spe = TN/(TN+FP);
and Positive Predictivity (Precision) is the rate of correctly
classified F events in all detected F events, Ppr =
TP/(TP+FP). Finally, the computational complexity of the
proposed method for both training (offline) and classification
(online) will be discussed.

A. Experimental Setup
As described in Section III, motor current signals are
represented as 240 time-domain samples after pre-processing
at the input of the proposed classifier. The 1D CNN-based motor
fault detection system used in all experiments has a compact
configuration with only three hidden convolution layers and two
MLP layers. In this way we aim to achieve the computational
efficiency required for training and, in particular, for
real-time anomaly detection. Besides that, this will also
demonstrate that deep and complex CNN configurations are
not really needed to achieve the desired detection

Visual Studio 2013 in 64bit. For training the 1D CNNs, a 10-fold
cross-validation technique is applied to improve
generalization and avoid the over-fitting problem. Table I
presents the confusion matrix of the motor fault detection problem
for all (10) test runs.
For comparison with major competing signal processing
techniques for current-based bearing fault detection, we
implemented wavelet packet decomposition [12], [13] and
Fast Fourier Transform (FFT) [8], [18] based feature
extraction techniques with three commonly used classifiers
from the literature: Multi-Layer Perceptron (MLP) [18],
Radial Basis Function Networks (RBFN) [13], and Support
Vector Machines (SVM) [24]. We explored various
configurations for these classifiers and empirically selected the
configurations that achieved the best performances ([32 64 32
2] for MLP, [32 32 2] for RBFN, and SVM with the linear
kernel). Classification results using the aforementioned
common metrics are summarized in Table II. While accuracy
measures the overall system performance over all classes, the
other metrics are specific to each class and they measure the
ability of the classification algorithm to distinguish certain
events (i.e., faulty motor) from nonevents (i.e., healthy motor).
In addition, the receiver operating characteristic (ROC) plots are
presented in Figure 7 for better visualization of the
performance of the proposed method.
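The 10-fold cross-validation protocol used for training the 1D CNNs can be sketched as follows. This is only an illustrative sketch: the function name `stratified_kfold`, the fixed seed, and the choice to stratify the folds by class are assumptions, and the dataset size (2600 H and 2600 F patterns) is inferred from the row sums of Table I under the assumption that each pattern is tested exactly once.

```python
import random


def stratified_kfold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs with each class spread evenly over k folds.

    Hypothetical helper: stratification is an assumption, not stated in the paper.
    """
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    folds = [[] for _ in range(k)]
    for idx in by_class.values():
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)  # deal the shuffled indices round-robin
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


# Assumed dataset layout: 2600 Healthy and 2600 Faulty patterns.
labels = ["H"] * 2600 + ["F"] * 2600
splits = list(stratified_kfold(labels, k=10))
for run, (train, test) in enumerate(splits, start=1):
    print(run, len(train), len(test))
```

Each of the ten runs trains on nine folds and tests on the remaining one, so accumulating the test decisions over all runs yields one decision per pattern, which is the form in which a confusion matrix over all test runs is usually reported.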

TABLE I
CONFUSION MATRIX OF THE MOTOR FAULT DETECTION PROBLEM FOR ALL TEST RUNS

                      Classification Result
                      H        F
Ground Truth    H     2522     78
                F     58       2542
TABLE II
MOTOR FAULT DETECTION PERFORMANCES OF THE PROPOSED METHOD WITH SIX MAJOR ALGORITHMS

                            Fault detection (%)
     Method                 Acc     Sen     Spe     Ppr
1    Proposed 1D CNN        97.4    97.8    97.0    97.0
2    WP - MLP               97.9    97.0    98.8    98.9
3    WP - RBFN              99.8    100     99.7    99.7
4    WP - SVM               99.2    100     98.4    98.3
5    FFT - MLP              92.7    90.8    94.9    95.1
6    FFT - RBFN             92.5    90.8    94.4    94.6
7    FFT - SVM              84.2    85.0    83.3    82.9

Fig. 7. ROC plots of classifiers for comparison. The x- and y-axes
represent the false positive rate and true positive rate, respectively.

From the results in Table I and Table II, it is fairly evident
that the proposed method based on the 1D CNN classifier can be
effectively used for motor bearing fault diagnosis. In our
implementation with the Intel® OpenMP API, the training time of
the proposed system was around 4.8 minutes. Note that the
training will be performed only once per motor. Specifically,
for the single-CPU implementation, the total time for a
forward propagation of a single input current data to obtain the
class vector is less than 1 msec. The average execution time of
the proposed algorithm and that of six major algorithms are
compared in Figure 8.

Fig. 8. The average execution times (msec) of the proposed algorithm
(1) and six major algorithms (2-7, in the same order as in Table II).

VI. CONCLUSIONS
In this work, we proposed a novel motor condition monitoring
system with an adaptive implementation of 1D Convolutional
Neural Networks (CNNs) that are able to fuse the two major
blocks of a traditional fault detection approach into a single
learning body: feature extraction and classification. The
proposed system has the ability to learn to extract the
optimal features with proper training and thus it can be
applied to any motor data. This not only achieves a high level
of generalization but also avoids the need for manual parameter
tuning or hand-crafted feature extraction and, furthermore,
promises an optimized solution for the problem at hand.
The proposed system is tested with real motor current data
and the experimental results demonstrate its potential and
effectiveness as a real-time motor condition monitoring
system. It can be easily modified to include the detection and
classification of both mechanical and electrical faults with
signatures on mechanical or electrical quantities (e.g., current).
With the BP training, the convolutional layers of the proposed 1D
CNN can learn to extract optimized features while the MLP
layers perform the classification task. Experimental results
demonstrated that a high fault detection accuracy (> 97%)
can thus be achieved. Due to the simple structure of the 1D
CNNs, which require only 1D convolutions (scalar
multiplications and additions), a hardware implementation
of the proposed system will be feasible and inexpensive. It is
therefore suitable for FPGA or ASIC implementations [33].
Such a hardware implementation and the classification of more
fault types for real-time monitoring will be the topic of our
future work.
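To illustrate why a single forward propagation is cheap, the sketch below runs one 240-sample input through a compact 1D CNN with three convolution layers and two MLP layers, using only scalar multiplications and additions. It is a minimal sketch under stated assumptions, not the authors' implementation: the kernel size (9), the number of feature maps per layer (8), the sub-sampling factors (4, 2, 1), the hidden MLP size (10), the tanh activation, and the random weights are all hypothetical values chosen for illustration.

```python
import math
import random


def conv1d_valid(x, w, b):
    """1D 'valid' convolution layer: x is a list of input channels,
    w[o][i] is the kernel linking input channel i to output channel o."""
    k = len(w[0][0])
    n_out = len(x[0]) - k + 1
    out = []
    for o, kernels in enumerate(w):
        y = [b[o]] * n_out
        for i, kernel in enumerate(kernels):
            for t in range(n_out):
                for j in range(k):
                    y[t] += kernel[j] * x[i][t + j]  # one scalar multiply-add
        out.append(y)
    return out


def subsample(x, s):
    """Average non-overlapping windows of length s in every channel."""
    return [[sum(ch[t:t + s]) / s for t in range(0, len(ch) - s + 1, s)]
            for ch in x]


def dense(h, w, b):
    """Fully connected layer with tanh activation."""
    return [math.tanh(sum(wi * hi for wi, hi in zip(row, h)) + bo)
            for row, bo in zip(w, b)]


def forward(x, conv_layers, pool_factors, dense_layers):
    h = x
    for (w, b), s in zip(conv_layers, pool_factors):
        h = [[math.tanh(v) for v in ch] for ch in conv1d_valid(h, w, b)]
        if s > 1:
            h = subsample(h, s)
    h = [v for ch in h for v in ch]  # flatten feature maps for the MLP layers
    for w, b in dense_layers:
        h = dense(h, w, b)
    return h


rng = random.Random(0)


def rand_conv(c_out, c_in, k):
    return ([[[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(c_in)]
             for _ in range(c_out)], [0.0] * c_out)


def rand_dense(n_out, n_in):
    return ([[rng.gauss(0, 0.1) for _ in range(n_in)] for _ in range(n_out)],
            [0.0] * n_out)


K, C = 9, 8                      # hypothetical kernel size and feature maps
conv_layers = [rand_conv(C, 1, K), rand_conv(C, C, K), rand_conv(C, C, K)]
pool_factors = [4, 2, 1]         # signal length: 240 -> 232 -> 58 -> 50 -> 25 -> 17
dense_layers = [rand_dense(10, C * 17), rand_dense(2, 10)]

x = [[rng.gauss(0, 1) for _ in range(240)]]   # stand-in for one current frame
class_vector = forward(x, conv_layers, pool_factors, dense_layers)
print(len(class_vector))  # two outputs, one per class (H, F)
```

With these assumed sizes the three convolution layers need roughly 55,000 multiply-add operations per input frame, which gives a sense of why such a network maps well onto simple hardware.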
REFERENCES
[1] A. Giantomassi, “Electric motor fault detection and diagnosis by kernel
density estimation and Kullback–Leibler divergence based on stator
current measurements,” IEEE Trans. Ind. Electron., vol. 62, no. 3, pp.
1770-1780, Mar. 2015.


[2] Z. Gao, C. Cecati, and S. X. Ding, “A survey of fault diagnosis and fault-tolerant techniques – part I: fault diagnosis with model-based and signal-based approaches,” IEEE Trans. Ind. Electron., vol. 62, no. 6, pp. 3757–3767, Jun. 2015.
[3] X. Dai and Z. Gao, “From model, signal to knowledge: a data-driven perspective of fault detection and diagnosis,” IEEE Trans. Ind. Informat., vol. 9, no. 4, pp. 2226–2238, Apr. 2013.
[4] F. Filippetti, A. Bellini, and G. A. Capolino, “Condition monitoring and diagnosis of rotor faults in induction machines: state of art and future perspectives,” in Proc. IEEE WEMDCD, Paris, Mar. 2013, pp. 196–209.
[5] W. Zhou, T. Habetler, and R. Harley, “Bearing fault detection via stator current noise cancellation and statistical control,” IEEE Trans. Ind. Electron., vol. 55, no. 12, pp. 4260–4269, Dec. 2008.
[6] A. Bellini, F. Filippetti, C. Tassoni, and G. A. Capolino, “Advances in diagnostic techniques for induction machines,” IEEE Trans. Ind. Electron., vol. 55, no. 12, pp. 4109–4125, Dec. 2008.
[7] C. Kral, T. G. Habetler, and R. G. Harley, “Detection of mechanical imbalances of induction machines without spectral analysis of time domain signals,” IEEE Trans. Ind. Appl., vol. 40, no. 4, pp. 1101–1106, Apr. 2004.
[8] R. R. Schoen, T. G. Habetler, F. Kamran, and R. G. Bartheld, “Motor bearing damage detection using stator current monitoring,” IEEE Trans. Ind. Appl., vol. 31, pp. 1274–1279, Dec. 1995.
[9] G. B. Kliman, W. J. Premerlani, B. Yazici, R. A. Koegl, and J. Mazereeuw, “Sensorless online motor diagnostics,” IEEE Comput. Appl. Power, vol. 10, no. 2, pp. 39–43, Feb. 1997.
[10] J. Pons-Llinares, J. A. Antonino-Daviu, M. Riera-Guasp, S. B. Lee, T. J. Kang, and C. Yang, “Advanced induction motor rotor fault diagnosis via continuous and discrete time–frequency tools,” IEEE Trans. Ind. Electron., vol. 62, no. 3, pp. 1791–1802, Mar. 2015.
[11] D. Z. Li, W. Wang, and F. Ismail, “An enhanced bispectrum technique with auxiliary frequency injection for induction motor health condition monitoring,” IEEE Trans. Instrum. Meas., vol. 67, no. 10, pp. 2279–2287, Oct. 2015.
[12] L. Eren and M. J. Devaney, “Bearing damage detection via wavelet packet decomposition of the stator current,” IEEE Trans. Instrum. Meas., vol. 53, no. 2, pp. 431–436, Feb. 2004.
[13] L. Eren, A. Karahoca, and M. J. Devaney, “Neural network based motor bearing fault detection,” in Proc. IMTC-Instrumentation and Measurement Technology Conference, May 2004, pp. 1657–1660.
[14] Z. Ye, B. Wu, and A. R. Sadeghian, “Current signature analysis of induction motor mechanical faults by wavelet packet decomposition,” IEEE Trans. Ind. Electron., vol. 50, no. 6, pp. 1217–1228, Jun. 2003.
[15] J. Liu, W. Wang, F. Golnaraghi, and K. Liu, “Wavelet spectrum analysis for bearing fault diagnostics,” Meas. Sci. Technol., vol. 19, no. 1, pp. 1–10, Jan. 2008.
[16] R. Yan, R. X. Gao, and X. Chen, “Wavelets for fault diagnosis of rotary machines: A review with applications,” Signal Processing, vol. 96, Part A, pp. 1–15, Mar. 2014.
[17] R. Di Stefano, S. Meo, and M. Scarano, “Induction motor fault diagnostic via artificial neural network,” in Proceedings of the IEEE International Symposium on Industrial Electronics (ISIE'94), Santiago, May 1994, pp. 220–225.
[18] B. Li, M.-Y. Chow, Y. Tipsuwan, and J. C. Hung, “Neural network based motor rolling bearing fault diagnosis,” IEEE Trans. Ind. Electron., vol. 47, no. 5, pp. 1060–1069, Oct. 2000.
[19] G. F. Bin, J. J. Gao, X. J. Li, and B. S. Dhillon, “Early fault diagnosis of rotating machinery based on wavelet packets – empirical mode decomposition feature extraction and neural network,” Mechanical Systems and Signal Processing, vol. 27, pp. 696–711, Feb. 2012.
[20] X. Li, A. Zhang, X. Zhang, C. Li, and L. Zhang, “Rolling element bearing fault detection using support vector machine with improved ant colony optimization,” Measurement, vol. 46, no. 8, pp. 2726–2734, Aug. 2013.
[21] K. Kim, A. G. Parlos, and R. M. Bharadwaj, “Sensorless fault diagnosis of induction motors,” IEEE Trans. Ind. Electron., vol. 50, no. 5, pp. 1038–1051, Oct. 2003.
[22] C. T. Kowalski and T. O-Kowalska, “Neural network application for induction motor faults diagnosis,” Mathematics and Computers in Simulation, vol. 63, no. 3-5, pp. 435–448, Nov. 2003.
[23] V. T. Tung, B.-S. Yang, M.-S. Oh, and A. C. C. Tan, “Fault diagnosis of induction motor based on decision trees and adaptive neurofuzzy inference,” Expert Syst. Appl., vol. 36, no. 2, pp. 1840–1849, Feb. 2009.
[24] W.-Y. Chen, J.-X. Xu, and S. K. Panda, “Application of artificial intelligence techniques to the study of machine signatures,” in Proceedings of the XX IEEE International Conference on Electrical Machines (ICEM'2012), Marseille (France), Sep. 2012, pp. 2390–2396.
[25] M. S. Ballal, Z. J. Khan, H. M. Suryawanshi, and R. L. Sonolikar, “Adaptive neural fuzzy inference system for the detection of inter-turn insulation and bearing wear faults in induction motor,” IEEE Trans. Ind. Electron., vol. 54, no. 1, pp. 250–258, Jan. 2007.
[26] F. Zidani, D. Diallo, M. E. H. Benbouzid, and R. Nait-Said, “A fuzzy based approach for the diagnosis of fault modes in a voltage-fed PWM inverter induction motor drive,” IEEE Trans. Ind. Electron., vol. 55, no. 2, pp. 586–593, Feb. 2008.
[27] K. Kim and A. G. Parlos, “Induction motor fault diagnosis based on neuropredictors and wavelet signal processing,” IEEE/ASME Transactions on Mechatronics, vol. 7, no. 2, pp. 201–219, Jun. 2002.
[28] D. H. Hubel and T. N. Wiesel, “Receptive fields of single neurones in the cat’s striate cortex,” Journal of Physiology, vol. 148, pp. 574–591, Oct. 1959.
[29] D. C. Ciresan, U. Meier, L. M. Gambardella, and J. Schmidhuber, “Deep big simple neural nets for handwritten digit recognition,” Neural Comput., vol. 22, no. 12, pp. 3207–3220, Dec. 2010.
[30] D. Scherer, A. Muller, and S. Behnke, “Evaluation of pooling operations in convolutional architectures for object recognition,” in Proc. Int. Conf. on Artificial Neural Networks (ICANN), Thessaloniki, Greece, Sep. 2010, pp. 92–101.
[31] A. Krizhevsky, I. Sutskever, and G. Hinton, “Imagenet classification with deep convolutional neural networks,” in Proc. Advances in Neural Information Processing Systems (NIPS), Lake Tahoe, Nevada, Dec. 2012, pp. 1097–1105.
[32] S. Kiranyaz, T. Ince, and M. Gabbouj, “Real-time patient-specific ECG classification by 1D convolutional neural networks,” IEEE Trans. Biomed. Eng., vol. 63, no. 3, pp. 664–675, Aug. 2015, DOI: 10.1109/TBME.2015.2468589.
[33] C. Farabet, C. Poulet, J. Han, and Y. LeCun, “CNP: An FPGA-based processor for convolutional networks,” in Proc. IEEE International Conference on Field Programmable Logic and Applications, Prague, Sep. 2009, pp. 32–37.
[34] V. Wowk, Machinery Vibration, Measurement and Analysis, McGraw-Hill Education – Europe, United States, Jul. 1991.

Turker Ince (M’98) received the B.S. degree
from Bilkent University, Ankara, Turkey, in
1994, the M.S. degree from the Middle East
Technical University, Ankara, Turkey, in 1996,
and the Ph.D. degree from the University of
Massachusetts, Amherst (UMass-Amherst), in
2001, all in electrical engineering.
From 1996 to 2001, he was a Research
Assistant at the Microwave Remote Sensing
Laboratory, UMass-Amherst. He worked as a
Design Engineer at Aware, Inc., Boston, from 2001 to 2004, and at
Texas Instruments, Inc., Dallas, from 2004 to 2006. In 2006, Dr. Ince
joined the faculty of Engineering and Computer Science at Izmir
University of Economics, Turkey, where he is currently an Associate
Professor. His research interests include radar remote sensing and
target recognition, signal processing, evolutionary optimization and
machine learning.

Serkan Kiranyaz (SM’15) was born in Turkey,
1972. He received his BS degree from the Electrical
and Electronics Department at Bilkent
University, Ankara, Turkey, in 1994 and MS
degree in Signal and Video Processing from the
same University, in 1996.
He received his PhD degree in 2005 and his
Docency in 2007 from Tampere University of
Technology, Institute of Signal Processing,
respectively. He was working as a Professor in
Signal Processing Department in the same
university during 2009 to 2015 and he held the Research Director


position for the department and also for the Center for Visual Decision
Informatics (CVDI) in Finland.
Prof. Kiranyaz published 2 books, more than 35 journal papers in 10
different IEEE Transactions and other high impact journals, and around
80 papers in international conferences. He made significant
contributions to bio-signal analysis, classification and segmentation,
computer vision with applications to recognition, classification,
multimedia retrieval, evolving systems and evolutionary machine
learning, swarm intelligence and stochastic optimization.

Levent Eren (M’98) received the
B.S., M.S. and Ph.D. degrees in electrical
engineering from the University of Missouri,
Columbia, in 1995, 1998, and 2002
respectively. He was a member of Electrical
and Electronics Engineering Department at
Bahcesehir University, Turkey, between 2003
and 2012.
He is currently working as an associate
professor of Electrical and Electronics
Engineering at Izmir University of Economics, Turkey. His research
interests include motor fault diagnostics, power quality instrumentation,
renewable energy, and signal processing.

Murat Askar received the B.S. and M.S. degrees
from the Department of Electrical Engineering at
Middle East Technical University in 1974 and
1976, respectively, and the Ph.D. degree from
the same university in 1981.
He worked as a graduate assistant from 1974
to 1978 in the Department of Electrical and
Electronics Engineering of Middle East
Technical University. In the same department,
he was promoted to instructor in 1978,
assistant professor in 1981, and associate professor in 1984. He
became professor in 1991 in the same department. Since 2010, Dr.
Askar has been working as a professor at Izmir University of
Economics. His research interests include VLSI design and
communications.

Moncef Gabbouj (F’11) received his BS
degree in electrical engineering in 1985 from
Oklahoma State University, and his MS and
PhD degrees in electrical engineering from
Purdue University, in 1986 and 1989,
respectively.
He is a Professor of Signal Processing at the
Department of Signal Processing, Tampere
University of Technology, Tampere, Finland.
He was Academy of Finland Professor during
2011-2015. His research interests include
multimedia content-based analysis, indexing and retrieval, machine
learning, nonlinear signal and image processing and analysis, voice
conversion, and video processing and coding.
Dr. Gabbouj is a Fellow of the IEEE and member of the Academia
Europaea and the Finnish Academy of Science and Letters. He is the
past Chairman of the IEEE CAS TC on DSP and committee member of
the IEEE Fourier Award for Signal Processing. He served as associate
editor and guest editor of many IEEE and international journals and as
Distinguished Lecturer for the IEEE CASS. He organized several
tutorials and special sessions for major IEEE conferences and
EUSIPCO. Dr. Gabbouj guided 40 PhD students and published 650
papers.
