Estimation of Difficult-to-Measure Process Variables Using Neural Networks
Abstract - In this paper, two different artificial neural networks are tested and compared with regard to their application in the estimation of difficult-to-measure process variables. Two of the most commonly used neural networks, the MLP (multi-layer perceptron) and RBF (radial basis function) neural networks, with simple structures and standard training methods, are chosen as examples. Neural network training is based on available data from a database of process variables measured over a long time period. The database in this paper is obtained using a simulation model of a real process. Without going deeper into the theoretical background, the relative properties of these neural networks are given through the results obtained by testing the trained networks and the analysis performed on these results.

Keywords: difficult-to-measure process variable, estimation, soft-sensor, artificial neural networks.

I. INTRODUCTION

A basic requirement for introducing automatic control is the existence of continuous and on-line measurement of relevant process variables, especially of output variables that are directly related to the quality of the final product. Unfortunately, in most industrial processes sensors cannot directly measure important output variables. The required information about the value of these (difficult-to-measure) process variables is then obtained through laboratory analysis. This method of 'measuring' is generally expensive and time consuming, is subject to mistakes due to the human factor, and has a very low sample frequency. Therefore, it does not provide adequate monitoring of the process or a basis for introducing automatic control.

A solution that would eliminate, or at least palliate, these problems is the use of an estimator that substitutes the missing measurement of the required process variables with information based on other (easy-to-measure) process variables [1]. Easy-to-measure process variables are measured by sensors and are available on-line. Such an estimator then represents a virtual sensor, often referred to as a soft-sensor, which can indirectly measure the required difficult-to-measure variable. Thus, difficult-to-measure process variables become available on-line with the same sample frequency as the easy-to-measure ones.

In order to perform estimation, a suitable mathematical model of the process must be available. Using theoretical considerations (building a physical-chemical model) to construct such models is in most cases not possible, or is not considered acceptable. Thus, in order to obtain a model, experimental methods should be applied. Current data obtained by measuring process variables from the plant often does not give enough information, especially if the value of the output quantity is determined using laboratory analysis. However, if the data is collected over a relatively long time period, it could contain enough information from which the relationships between certain process variables may be determined.

Continuous progress in process instrumentation and computerization in the process industry has resulted in large databases containing values of process variables recorded over a long period of time. The important fact is that this bulk of data contains significant knowledge of the process itself. Therefore, it is necessary to find a way of using the data for obtaining the required mathematical model, i.e. for designing a suitable estimator of difficult-to-measure process variables from this large dataset [2][1]. Difficulties in designing an estimator mostly arise due to the features of the dataset [3].

Basically, there are two approaches to obtaining the required process model based on such data [2][3]:
- Application of multivariate statistics,
- Application of artificial neural networks (ANN).

These approaches to process modelling are becoming more and more popular because they do not require a deeper understanding of the process. A good overview of the most important methods in both fields is given in [4]. In order to obtain better results (models), these methods are often combined in practice.

Statistical methods of system modelling are based on PCA (Principal Component Analysis) and PLS (Partial Least Squares) methods [2][5]. The benefits of statistical methods with respect to methods based on neural networks are:
- They are computationally less demanding,
- Results do not depend on initial conditions,
- Model development is simpler.

However, traditional statistical methods give poor results when modelling non-linear processes, processes with time-varying parameters and, especially, the dynamics of the process. Neural networks are being used to overcome these disadvantages. Since the computational power of today's computers (including those used in industry) is constantly on the rise, the computational demands of ANNs are less and less of a problem for their practical application in process monitoring and control systems.

This paper is organised as follows. Some basic properties of the neural networks used in this paper are given in section 2. A description of the simulation used in generating the process database is given in section 3. The training and testing of the chosen neural networks, and the results obtained, are given in section 4, while section 5 gives an analysis of the results obtained and the respective conclusions.
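As a pointer to the first of the two approaches listed above, the multivariate-statistics route typically starts by projecting the measured variables onto a few principal components. The following is a hedged illustration on toy data (nothing here comes from the paper's process; all names and values are assumed), computing PCA via the SVD:

```python
import numpy as np

# Hedged illustration of the multivariate-statistics route: PCA of a
# small toy dataset via the SVD. The data and all names are assumed for
# illustration only.
rng = np.random.default_rng(0)
n = 200
t = rng.normal(size=n)                      # hidden common factor
X = np.column_stack([t + 0.1 * rng.normal(size=n),
                     2.0 * t + 0.1 * rng.normal(size=n),
                     rng.normal(size=n)])   # third variable is unrelated

Xc = X - X.mean(axis=0)                     # centre each column
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = s ** 2 / np.sum(s ** 2)         # variance ratio per component
scores = Xc @ Vt.T                          # projections onto the PCs
```

Two correlated measurements collapse onto one dominant component; PLS goes a step further by choosing the projection directions with the help of the output variable as well [5].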
The change of liquid level in the reservoirs with respect to the simulated input control signals for the non-linear model is shown in figure 3.
reservoir (h2) is chosen as the output (difficult-to-measure) variable. The sample time for all the simulations is 6 s. The time period for simulation is up to 2 hours.

A linear model of the same process is also created in order to test the influence of non-linearity on the quality of estimation of difficult-to-measure variables using neural networks. The time constants of the (sub)processes of liquid accumulation in the individual reservoirs are the most dominant and have values between 100 s and 300 s, while the chosen time constants of the valves are about ten times smaller.
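One branch of the linear model described above can be sketched as a discrete-time simulation of a valve lag feeding a reservoir lag, using the paper's 6 s sample time. This is a hedged sketch, not the paper's simulation model: the time constants (20 s for the valve, 200 s for the reservoir) are assumed values chosen inside the stated ranges.

```python
import numpy as np

# Hedged sketch of one branch of the linear process model: a valve lag
# feeding a reservoir lag, discretised at the 6 s sample time. The time
# constants (20 s valve, 200 s reservoir) are assumed values within the
# ranges stated in the text, not the paper's exact parameters.
Ts, tau_v, tau_r = 6.0, 20.0, 200.0
a_v, a_r = np.exp(-Ts / tau_v), np.exp(-Ts / tau_r)

n = 1200                       # 2 h of simulation at 6 s per sample
u = np.ones(n)                 # step on the input control signal
q = np.zeros(n)                # valve flow
h = np.zeros(n)                # liquid level (the output variable)
for kstep in range(1, n):
    q[kstep] = a_v * q[kstep - 1] + (1.0 - a_v) * u[kstep - 1]
    h[kstep] = a_r * h[kstep - 1] + (1.0 - a_r) * q[kstep - 1]
```

Because the reservoir time constant dominates, the level responds roughly ten times more slowly than the valve flow, which is the dynamic property the estimator later has to cope with.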
TABLE I. ESTIMATION RESULTS GIVEN AS MSE
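For clarity, the figure of merit reported in Table 1 is the mean squared error between the recorded ('actual') and estimated values of the output variable; a minimal definition:

```python
import numpy as np

def mse(y_true, y_est):
    """Mean squared error between the recorded ('actual') and the
    estimated values of the output variable."""
    y_true = np.asarray(y_true, dtype=float)
    y_est = np.asarray(y_est, dtype=float)
    return float(np.mean((y_true - y_est) ** 2))
```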
Apart from the described simulations being performed on the non-linear and linear models of the process, simulations are also performed on the same models with noise added (also BLWN signals) at particular positions, representing process disturbances and measurement noise.

Therefore, four models of the same process are used: non-linear and linear models, with and without noise. Based on these four models, four corresponding databases are generated, each with its own features.

IV. PROCESS VARIABLE ESTIMATION

In order to investigate the properties of ANNs as estimators of difficult-to-measure variables, it is logical to start with two of the most commonly used neural networks (MLP and RBF) and determine which is more promising. The properties of the implemented neural networks are investigated through their training and testing on the data from all four databases.

The basic structure of the MLP neural network is chosen: 2 layers, 5 to 15 neurons in the hidden layer, with the 'tansig' activation function in the hidden layer and 'purelin' in the output layer. The Levenberg-Marquardt (LM) algorithm [10] is used for training. Since the MLP neural network interpolates the training set data, this network can easily start learning the noise included in the data and therefore lose its generalization properties. In order to avoid this, a validation set is also included and implicit regularization is performed after each training step, in order to check the network's generalization properties and to stop learning when necessary [9][10].

The basic (commonly used) structure of the RBF neural network is also used: 2 layers, 5 to 15 neurons in the hidden layer, with the Gaussian activation function in the hidden layer and 'purelin' in the output layer. A combined method [9][11] is used for learning. In the combined method, supervised learning is used for the parameters of the output layer (least squares method), while unsupervised learning is used for the hidden layer. For the unsupervised learning, the K-means method is used to determine the positions of the centres, while the 2-nearest-neighbour method (with a certain modification [12]) is used to determine the widths of the radial basis functions. The number of centres, i.e. the number of neurons in the hidden layer, is chosen randomly (as with the MLP network) and is the subject of optimisation of neural networks for this type of application.

For supervised learning, only the sampled data for which the value of the output (difficult-to-measure) variable is known is used, i.e., data for which the set of input-output pairs is complete. In this paper, 3 situations are analysed: when the value of the output variable is known every 60 s, 300 s and 600 s. With respect to the dynamics of the observed process and the very low sampling frequency, in the last two situations the functional relationship between the series of discrete values of the output variable is significantly reduced, as is the relationship between the output variable and the available input variables. Data obtained this way, especially with the added noise, is often referred to as case-data [13]. All available data for the easy-to-measure variables (available every 6 s) is used during the unsupervised learning of the RBF neural network.

The testing of these neural-network-based estimators is performed on all available data in the corresponding databases. The Mean Squared Error (MSE) between the 'actual' output variable and the estimated output variable is used as the measure of estimation performance. The results are given in Table 1. The index 'i' indicates the model of the process without noise, whereas the index 'r' indicates the model of the process with process disturbances and measurement noise.

All the results are obtained after repeatedly training the neural networks. This is because the initial conditions of the neural networks before training, which affect the final results, are generated randomly. Each result is obtained after 10 to 20 training epochs. The mean value of the three best results is taken as the final result (the results shown in Table 1).

The change of the output variable and the change in percentage error over time were also recorded, but these results are not given in this paper due to lack of space.
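The MLP estimator described above can be sketched as follows. This is a simplified stand-in, not the paper's setup: plain gradient descent replaces the Levenberg-Marquardt algorithm, a toy static mapping replaces the process database, and the sizes, learning rate and patience are assumed values. What it does keep is the 'tansig'/'purelin' structure and early stopping on a validation set.

```python
import numpy as np

# Hedged sketch of the MLP estimator: 2 layers, tanh ('tansig') hidden
# units, linear ('purelin') output, early stopping on a validation set.
# Plain gradient descent stands in for Levenberg-Marquardt; the toy
# static data replaces the process database; all sizes and learning
# parameters are assumed values, not the paper's.
rng = np.random.default_rng(1)

X = rng.uniform(-1.0, 1.0, size=(300, 2))
y = np.sin(X[:, :1]) + 0.5 * X[:, 1:] ** 2 + 0.05 * rng.normal(size=(300, 1))
Xtr, ytr, Xva, yva = X[:200], y[:200], X[200:], y[200:]

nh = 10                                   # hidden neurons (5-15 in the paper)
W1 = 0.5 * rng.normal(size=(2, nh)); b1 = np.zeros(nh)
W2 = 0.5 * rng.normal(size=(nh, 1)); b2 = np.zeros(1)

def forward(Xb):
    H = np.tanh(Xb @ W1 + b1)             # 'tansig' hidden layer
    return H, H @ W2 + b2                 # 'purelin' output layer

def val_mse():
    return float(np.mean((forward(Xva)[1] - yva) ** 2))

init_v = val_mse()
best_v, wait, lr = init_v, 0, 0.05
for epoch in range(500):
    H, out = forward(Xtr)
    err = out - ytr                       # d(MSE/2)/d(out)
    gW2 = H.T @ err / len(Xtr); gb2 = err.mean(axis=0)
    dH = (err @ W2.T) * (1.0 - H ** 2)    # backprop through tanh
    gW1 = Xtr.T @ dH / len(Xtr); gb1 = dH.mean(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2
    v = val_mse()
    if v < best_v:                        # validation error improved
        best_v, wait = v, 0
    else:
        wait += 1
        if wait >= 20:                    # stop once validation error
            break                         # keeps rising (early stopping)
```

The validation check after every training step is the 'implicit regularization' mentioned in the text: training halts as soon as the network starts fitting noise instead of the underlying mapping.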
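The combined RBF training described above can be sketched in the same spirit, again on toy data with assumed sizes (the modification of the 2-nearest-neighbour method from [12] is not reproduced): K-means positions the centres, the widths come from the distances to the two nearest neighbouring centres, and the output layer is fitted by least squares.

```python
import numpy as np

# Hedged sketch of the combined RBF training: K-means for the centre
# positions (unsupervised), widths from the two nearest neighbouring
# centres, least squares for the linear output layer (supervised).
# Toy one-dimensional data and all sizes are assumed for illustration.
rng = np.random.default_rng(2)

X = rng.uniform(-3.0, 3.0, size=(400, 1))
y = np.sin(X[:, 0]) + 0.05 * rng.normal(size=400)

k = 12                                    # hidden neurons / centres
C = X[rng.choice(len(X), size=k, replace=False)].copy()
for _ in range(50):                       # K-means on all input data
    d = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)
    lab = d.argmin(axis=1)
    for j in range(k):
        if np.any(lab == j):              # skip empty clusters
            C[j] = X[lab == j].mean(axis=0)

dc = np.linalg.norm(C[:, None, :] - C[None, :, :], axis=2)
dc.sort(axis=1)                           # column 0 is the zero self-distance
sigma = np.maximum(dc[:, 1:3].mean(axis=1), 1e-6)

def design(Xb):
    d = np.linalg.norm(Xb[:, None, :] - C[None, :, :], axis=2)
    G = np.exp(-(d / sigma) ** 2)         # Gaussian hidden layer
    return np.column_stack([G, np.ones(len(Xb))])  # 'purelin' + bias

w, *_ = np.linalg.lstsq(design(X), y, rcond=None)
fit_mse = float(np.mean((design(X) @ w - y) ** 2))
```

Note that only the final least-squares step needs the output values; the K-means step uses input data alone, which is why the RBF network can exploit all the 6 s samples of the easy-to-measure variables.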
For example, figure 4 shows a typical output of an estimator based on the described MLP network with 10 hidden neurons, while figure 5 shows a typical output of an estimator based on the described RBF network with an equal number of hidden neurons. The networks are trained and tested on the data obtained from the non-linear model with noise added (model NL-r).

Figure 4. Output of estimator based on the MLP network.

Figure 5. Output of estimator based on the RBF network.

V. DISCUSSION AND CONCLUSION

Based on the results obtained, it can be noticed that, in all cases, the respective neural network output follows the trend of the 'actual' output variable, with the RBF neural network generally giving a dampened (smoother) response. This can be seen from the results shown in figures 4 and 5.

The MLP neural network is significantly sensitive to noise in the measured data, with the estimation quality worsening as the noise is increased. A smaller sample frequency has a similar effect. The results also show that increasing the number of hidden neurons (beyond 10) in the MLP network gives poorer estimation results. This indicates a worsening of the generalization properties of the network, which is probably related to the choice of the validation set data - learning is stopped too late and the network learns part of the noise included in the learning set data.

The RBF network gives better estimation results with an increase in the number of hidden neurons. When a small number of neurons is chosen, the K-means algorithm (especially with data that includes noise) has difficulty in determining the coordinates of the centres - different initial conditions give quite different results, thus significantly influencing the estimation accuracy.

The most important conclusion arrived at after training the ANNs is that the RBF network estimates the case-data better than the MLP network. The accuracy of the RBF network is less affected by noise and, more importantly for this application, less sensitive to smaller sample frequencies of the variable for which the estimator is designed. In order to point this fact out, the results shown in Table 1 are in bold font in instances where the RBF network gives similar or better results than the MLP network.

Without going deep into the theoretical background of ANNs, the RBF network has an important advantage in this application - it can learn on all available data, i.e., also on sampled data obtained when no knowledge of the output variable exists. This is very important because it is a common situation in practice - there is a lot of data for the easy-to-measure variables (they have a high sampling frequency), whereas the data for the difficult-to-measure variables is sparse (due to the low sampling frequency).

The MLP network might have given better results had a better validation set been chosen and a bigger learning set used. On the other hand, not all the possibilities of the RBF network have been explored - a greater number of neurons would give more accurate results and, more importantly, the part involving unsupervised learning can be considerably improved. This is mostly related to automatically determining, depending on the data structure, the optimal number of centres (i.e. hidden neurons) and more accurate centre coordinates and widths of the radial basis functions.

REFERENCES

[1] Gonzalez, G.D., "Soft Sensors for Processing Plants", Proc. of the 2nd International Conference on Intelligent Processing and Manufacturing of Materials (IPMM'99), IEEE, Part vol. 1, pp. 59-69, 1999.
[2] McAvoy, T.J., "Intelligent 'Control' Applications in the Process Industries", Annual Reviews in Control, 26, pp. 75-86, 2002.
[3] Qin, S.J., T.J. McAvoy, "A Data-Based Process Modeling Approach and Its Applications", in Preprints of the 3rd IFAC Dycord Symposium, pp. 321-326, 1992.
[4] Bakshi, B.R., U. Utojo, "A common framework for the unification of neural, chemometric and statistical modeling methods", Analytica Chimica Acta, 384, pp. 227-247, 1999.
[5] Geladi, P., B.R. Kowalski, "Partial least-squares regression: A tutorial", Analytica Chimica Acta, 185, pp. 1-17, 1986.
[6] Yang, X., J. Zhang, A.J. Morris, "An Artificial Neural Network Approach for Inferential Measurement", Proc. of the International Conference on Neural Information Processing (ICONIP '95), Vol. 1, pp. 485-488, 1995.
[7] Girosi, F., T. Poggio, "Networks and the best approximation property", Biological Cybernetics, vol. 63, pp. 169-176, 1990.
[8] Wray, J., G.R. Green, "Neural networks, approximation theory and finite precision computation", Neural Networks, vol. 8, no. 1, pp. 31-37, 1995.
[9] Haykin, S., Neural Networks - A Comprehensive Foundation, 2nd edition, Prentice Hall, 1999.
[10] Demuth, H., M. Beale, Neural Network Toolbox User's Guide - For Use with MATLAB, Ver. 4, The MathWorks, Inc., 2001.
[11] Moody, J., C. Darken, "Fast Learning in Networks of Locally-Tuned Processing Units", Neural Computation, vol. 1, pp. 281-294, 1989.
[12] Chen, C.L., W.C. Chen, F.Y. Chang, "Hybrid learning algorithm for Gaussian potential function networks", IEE Proc. - D, vol. 140, no. 6, pp. 442-448, 1993.
[13] Wang, X., R. Luo, H. Shao, "Designing a soft sensor for a distillation column with the fuzzy distributed radial basis function neural network", Proc. of the 35th Conference on Decision and Control, pp. 1714-1719, 1996.