Comparison of Neural Network Architectures For Machinery Fault Diagnosis
GT2003-38450
N. Rieger
STI Technologies, Inc.
Rochester, New York 14623
mse = (1/Q) ∑_{k=1..Q} e(k)² = (1/Q) ∑_{k=1..Q} (t(k) − a(k))²    (6)

The LMS algorithm adjusts the weights and biases of the linear network to minimize this mean square error. The LMS rule is applied through the Widrow-Hoff algorithm, as follows:

W(k+1) = W(k) + 2αe(k)pᵀ(k)    (7)
b(k+1) = b(k) + 2αe(k)    (8)

Here the error e and the bias b are vectors, and α is the learning rate. If α is large, learning occurs quickly, but if it is too large it may lead to instability, and the errors may even increase. To ensure stable learning, the learning rate must be less than the reciprocal of the largest eigenvalue of the correlation matrix pᵀp of the input vectors.

… makes them suitable for back-propagation applications.
Once the network weights and biases have been initialized, the network is ready for training. The network can be trained for function approximation (nonlinear regression), pattern association, or pattern classification. The training process requires a set of examples of proper network behavior: network inputs p and target outputs t. During training, the weights and biases of the network are iteratively adjusted to minimize the network performance function. The most commonly used performance function for feed-forward networks is the mean square error (MSE), the average squared error between the network outputs a and the target outputs t.
The modified Levenberg-Marquardt algorithm [10] used here was designed to approach second-order training speed without having to compute the Hessian matrix, owing to the Hessian's large memory requirements; it uses the Jacobian matrix to approximate the Hessian matrix, while the standard Levenberg-Marquardt uses the Hessian matrix to accelerate convergence.
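As a concrete illustration, here is a minimal Python sketch of the Widrow-Hoff updates (7)-(8), including the stability bound on the learning rate noted above. NumPy, the function name, and the training-loop details are illustrative assumptions rather than anything specified in the paper:

import numpy as np

def lms_train(P, T, epochs=50):
    # Train a single-layer linear network with the Widrow-Hoff rule.
    # P: (Q, n_inputs) input vectors; T: (Q, n_outputs) target outputs.
    Q, n_in = P.shape
    W = np.zeros((T.shape[1], n_in))
    b = np.zeros(T.shape[1])
    # Stability: keep alpha below the reciprocal of the largest
    # eigenvalue of the input correlation matrix.
    alpha = 0.9 / np.linalg.eigvalsh(P.T @ P / Q).max()
    for _ in range(epochs):
        for k in range(Q):
            p = P[k]
            e = T[k] - (W @ p + b)           # error e(k) = t(k) - a(k)
            W += 2 * alpha * np.outer(e, p)  # Eq. (7)
            b += 2 * alpha * e               # Eq. (8)
    return W, b

For comparison, the Levenberg-Marquardt step discussed above takes the standard form W(k+1) = W(k) − (JᵀJ + μI)⁻¹Jᵀe, where J is the Jacobian of the network errors with respect to the weights and biases, μ is an adaptive damping parameter, and JᵀJ is the Jacobian-based approximation to the Hessian.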
Self-Organizing Networks
Self-organizing networks (SON) learn to classify input
vectors according to how they are grouped in the input space.
They differ from competitive layers in that neighboring
neurons in the self-organizing network learn to recognize
neighboring sections of the input space. Thus, self-organizing
networks learn both the distribution (as do competitive layers)
and topology of the input vectors they are trained on.
A self-organizing network learns to categorize input vectors. It also learns the distribution of input vectors: a SON allocates more neurons to recognize parts of the input space where many input vectors occur, and fewer neurons to parts of the input space where few input vectors occur. Self-organizing networks also learn the topology of their input vectors. Neurons next to each other in the network learn to respond to similar vectors; the layer of neurons can be imagined as a rubber net stretched over the regions of the input space where input vectors occur. Self-organizing networks allow neurons that are neighbors of the winning neuron to output values. Thus, the transition between output vectors is much smoother than that obtained with competitive layers, where only one neuron has an output at a time. Here a self-organizing network identifies a winning neuron i* using the same procedure as employed by a competitive layer [10]. However, instead of updating only the winning neuron, all neurons within a certain neighborhood N_i*(d) of the winning neuron are updated using the Kohonen rule. Specifically, we adjust all such neurons i ∈ N_i*(d) as follows:

w_i(q) = w_i(q−1) + α(p(q) − w_i(q−1))    (12)
or, equivalently, w_i(q) = (1−α)w_i(q−1) + αp(q)    (13)

Figure 4 SON Architecture

Learning Vector Quantization Networks
LVQ networks classify input vectors into target classes by using a competitive layer to find subclasses of input vectors, and then combining them into the target classes. Unlike perceptrons, LVQ networks can classify any set of input vectors, not just linearly separable sets. The only requirement is that the competitive layer must have enough neurons, and each class must be assigned enough competitive neurons.
To ensure that each class is assigned an appropriate number of competitive neurons, it is important that the target vectors used to initialize the LVQ network have the same distribution of targets as the training data the network is trained on. If this is done, target classes with more vectors will be the union of more subclasses.
Learning vector quantization (LVQ) is a method for training competitive layers in a supervised manner. A competitive layer automatically learns to classify input vectors. However, the classes that the competitive layer finds depend only on the distance between input vectors. If two input vectors are very similar, the competitive layer will probably put them in the same class. There is no mechanism in a strictly competitive layer design to say whether or not any two input vectors are in the same class or in different classes.
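To make the Kohonen rule concrete, here is a minimal Python sketch of Eq. (12) (equivalently Eq. (13)) for a one-dimensional grid of neurons. NumPy, the grid layout, and the radius-based neighborhood are illustrative assumptions, not details taken from the paper:

import numpy as np

def kohonen_step(W, p, alpha, d):
    # W: (n_neurons, n_inputs) weight matrix, one row per neuron.
    # Winning neuron i*: the neuron whose weight vector is closest to p.
    i_star = np.argmin(np.linalg.norm(W - p, axis=1))
    # Neighborhood N_i*(d): neurons within distance d of i* on the grid.
    neighbors = np.abs(np.arange(W.shape[0]) - i_star) <= d
    # Eq. (12): w_i(q) = w_i(q-1) + alpha * (p(q) - w_i(q-1)),
    # which is algebraically the same as Eq. (13).
    W[neighbors] += alpha * (p - W[neighbors])
    return W

In an LVQ network the analogous update becomes supervised: the winning neuron is moved toward p when its subclass belongs to the correct target class, and away from p when it does not.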
[Figure 8 Block Diagram of the required Signal Processing: conditioning amplifier → waveform acquisition (sampling) → FFT with phase unwrapping, giving amplitude and phase spectra → averaging → scaling → integration → saving to text file → amplitude spectrum display]

Basically, to obtain a spectrum, the steps shown in the block diagram in Figure 8 are to be implemented in the signal processing.

Figure 9 The test rig showing planted fault locations

No Fault: After setting up the test rig and the acquisition system, nine data files are collected with no faults planted. These files are regarded as the No-Fault (NF) class. Each measurement is performed while the rotor is running at 1800 RPM. This class is characterized by a small 1X amplitude and smaller amplitudes for all other components, as in the sample shown in Figure 10.
Mechanical Unbalance: A mechanical unbalance is introduced by tightening a 10 gm bolt to one of the discs close to the middle of the rotor. Again, nine readings are obtained. The recorded files are labeled as the (UN) class. It is characterized by a large 1X with smaller harmonics, as in the sample shown in Figure 11.
Structural Looseness: In order to introduce a suitable looseness in a safe way, two bolts of the test rig frame are loosened. The accelerometer was held radially on the same side as the loosened frame bolts, as shown in Figure 9. Nine more data files are recorded here and have been labeled as the (SL) class. It is characterized by a large 1X with possibly larger, then decreasing, harmonics, as in the sample shown in Figure 12.
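As an illustration of the Figure 8 chain, the following is a minimal Python sketch that segments a sampled waveform, averages the single-sided FFT amplitude spectra, and integrates from acceleration to velocity. NumPy, the number of averages, the scaling convention, and the output file name are placeholder assumptions, not values from the paper:

import numpy as np

def averaged_spectrum(x, fs, n_avg=8):
    # x: sampled waveform (after the conditioning amplifier); fs: sample rate in Hz.
    seg_len = len(x) // n_avg
    amps = []
    for i in range(n_avg):
        seg = x[i * seg_len:(i + 1) * seg_len]
        X = np.fft.rfft(seg)
        # Single-sided peak scaling (DC and Nyquist bins are approximate).
        amps.append(2.0 * np.abs(X) / seg_len)
        # Phase spectrum, if needed: np.unwrap(np.angle(X))
    freqs = np.fft.rfftfreq(seg_len, d=1.0 / fs)
    amp = np.mean(amps, axis=0)  # spectrum averaging
    # Integration (acceleration -> velocity) in the frequency domain:
    # divide each spectral line by omega = 2*pi*f, skipping the DC bin.
    vel = np.zeros_like(amp)
    vel[1:] = amp[1:] / (2 * np.pi * freqs[1:])
    np.savetxt("spectrum.txt", np.column_stack([freqs, amp, vel]))  # save to text file
    return freqs, amp, vel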
When the order of the data files is changed, the perceptron network fails to achieve the same performance. This is attributed to the original arrangement of the data files producing a trained network whose decision boundary causes misrecognition, owing to the linear inseparability of the data vectors.

CONCLUSION
A neural-based comparison was conducted between five different architectures concerning their capabilities to perform machinery fault diagnosis.
This comparison was demonstrated with the use of a desktop test rig, which was subjected to two different mechanical faults, mass unbalance and structural looseness, seeking to examine the capabilities of the different networks for diagnosing rotating machinery faults based on the characteristic vibration signatures and amplitude spectra of each fault, measured on the desktop test rig.
The ability of the different network architectures to recognize particular vibration signatures and correlate them with different test rig conditions was tested.
The five diagnostic network architectures succeeded, to degrees varying between 75% and 100%, in detecting the different test rig conditions in both the training and validation phases. Most of the networks detected the training data set with 100% success, except the feed-forward network, which reached 87% on the training data set and 75% on the validation set; it could reach better performance with more hidden neurons or more hidden layers.
On the validation data set, two networks, the perceptron and the LVQ, reached 100% correct fault diagnosis. The self-organizing network was particularly interesting because it needed only one data file per test rig condition for the training phase; even with this small amount of training data, it correctly diagnosed 84% of the validation data, which demonstrates its high capability for this task.