Minimum Error Thresholding
Minimum Error Thresholding
Minimum Error Thresholding
00
Printed in Great Britain. Pergamon Press Ltd
t 1986 Pattern Recognition SocieD
M I N I M U M ERROR THRESHOLDING
J. KITTLER and J. ILLINGWORTH
SERC Rutherford Appleton Laboratory, Chilton, Didcot, Oxon OX11 0QX, U.K.
(Received 15 August 1984; in final form 13 dune 1985; receivedfor publication 8 duly 1985)
Abstraet--A computationaily efficient solution to the problem of minimum error thresholding is derived
under the assumption of object and pixel grey level values being normally distributed. The method is
applicable in multithreshold selection.
41
42 J. KITTLERand J. ILLINGWORTH
The problem of minimum error threshold selection is mance figure for the whole image can then be charac-
to determine the threshold level z. terised by the criterion function
The minimum error threshold can be obtained by
solving the quadratic equation defined by equating the
J(T) = ~ h(g). ~(g, T). (12)
¢
left and right hand sides of (4). However, the para-
meters #i, ai and P~ of the mixture density p(g) The role of this criterion function in finding the
associated with an image to be thresholded will not (approximate) minimum error threshold is rather
usually be known. Nevertheless, these parameters can subtle. For a given threshold T the criterion reflects
be estimated from the grey level histogram h(g) using indirectly the amount of overlap between the Gaussian
fitting techniques. This approach has been advocated models of the object and background populations.
by Nagawa and Rosenfeld. ts) Computationally their' Note that the density functions defining the models
method is very involved, as it requires optimisation of will overlap, even though, as a result of the histogram
a goodness-of-fit criterion function by a hill climbing partitioning at threshold T, the tails of the distri-
procedure. butions are ignored in their estimates.
In this paper we derive a much simpler technique for As threshold T is varied the models of the popul-
finding the optimum threshold x. Suppose that we ation distributions change. The better the tit between
threshold the grey level data at some arbitrary level T the data and the models, the smaller the overlap
and model each of the two resulting pixel populations between the density functions and therefore the smal-
by a normal density h(gli, T) with parameters/z~(T), ler the classification error. This behaviour is illustrated
a,~T) and a priori probability P~(T) given, respectively, in Fig. 1 for two different values of threshold T. The
as
corresponding overlaps of the density functions are
indicated by hatching. The value of threshold T
b
PiT)= ~ h(g). (5) yielding the lowest value of criterion J(~, T) will give
g=a the best fit model and therefore the minimum error
threshold.
#,T,=[..~ h(g)g]/P,.(T) (6) In summary, instead of determining ¢ indirectly as in
Nagawa and Rosenfeld, ~s) the problem of minimum
and error threshold selection can be formulated as one of
minimising criterion J(T), i.e.
a~(T)= [.=~{g- u,~T)}2h(o)]/P,<T), (7) J(z) = min J(T). (13)
r
where It should be noted that even at the optimum value T,
0 i=1 the models h(gli, T), i = 1, 2 will represent biased
estimates of the true components of the mixture of
a= T+ 1 i=2 (8)
normal probability distributions, due to the truncation
and of the tails of these distributions by histogram par-
titioning. However, we shall assume that the effect of
T i= 1 this bias is small and can be ignored.
b= . (9)
n i=2 Let us express criterion J(T) as
Now using the models h(gli, T), i = 1, 2, the T
conditional probability e(g, T) of grey level g being J(T) = ~ h(g)
g=O
replaced in the image by a correct binary value is given
by
~,~-~ -_] + 2 log ~,(T) - 2
e(g, T) = h(gli, T)" P,~T)/h(o) i= ~ l 0 <- T. (10)
(2 g>T
+ ~ h(o)
g=T+l
500
200
IO0
500
T:50
5°°i
~ (c)
F ~ / T:70
(b) Pj=0.25 /.L:=38 or,:9 I ~' I I PI : 0.45 /~, :47 ~ l =1~
400 Pz=0.75 /J.2:121 0"z:44
500
A 3 0 0 ~
200 200 --
i
ICO ~00
Fig. 1. (a) Grey level histogram defined by two equipopulous Gaussian distributions. Optimal threshold is at
T = 100. (b, c) Population models for trial thresholds T = 50 and T = 70. Binarisation error is represented
by the hatched overlapping areas of the model population density functions.
II
3. PROPERTIES OF THE CRITERION FUNCTION Criterion (15) is also more appropriate for thresh-
olding images with highly unequal population sizes
The criterion function may have local minima at the exceeding the ratio of 1:10 z or more, where the
boundaries of the interval of grey levels. However, methods of Otsu ~7~ and Ridler~8' 9~ tend to split the
these minima are not of interest. A unique internal larger mode into two halves."°~ Figure 4 is such an
minimum implies histogram bimodality and it corre- image and Fig. 5 shows the corresponding histogram
sponds to the optimum (minimum error) threshold. of grey level values. The minimum of the criterion
This has been verified experimentally on the histogram function, shown in Fig. 6, gives a good threshold value
shown in Fig. 2, containing object and background for segmenting the square from the background. Note
grey level modes of unequal variances. Figure 3 gives that at the peak of the larger mode the criterion
the corresponding criterion function. Note that the function attains a maximum.
popular threshold selection methods of Otsu fT} and The absence of an internal minimum is indicative of
Ridter,{s' ~} which use a less sophisticated model for the a unimodal histogram, which would correspond to a
two modes, give a biased threshold indicated by T Oin homogeneous image. Binarisation of such an image
Fig. 2. The reasons why the bias occurs are discussed in would be inappropriate and this case is illustrated in
Kittler and IllingworthJ 1°} Figs 7 and 8, which show the histogram and the
Fig. 6. Criterion function J{T) for histogram of Fig. 5. Fig. 7. Unimodal grey level histogram.
Minimum error thresholding 45
Fig. 8. J(T) for Fig. 7. No internal minima exist. Fig. 9. Poorly illuminated image of an industrial part.
Fig. 10. Binary image derived using minimum error Fig. l l. Trimodalhistogram. Pt = P2 = P3,~ = 50,#2 =
thresholding. 100, #3 = 150; at = a2 = 0-3 |0.
=
46 J. KITTLERand J. ILLINGWORTH
S. I T E R A T I V E I M P L E M E N T A T I O N
if [ g - PI(T)] 2
at-~-~ -j + 2 [ l o g a t ( T ) - logPl(T)]
Another point to note is that at some values of T the 2. J. Kittler, J. Illingworth, J. F6glein and K. Paler, An
product of the a priori probability and the conditional automatic thresholding method for waveform segmen-
tation, Proc. Int. Conf. on Digital Signal Processing.
density of one class exceeds that of the other. Hence the Florence, pp. 727-732 (1984).
quadratic equation in (22) will have imaginary roots. If 3. J. Kittler, J. Illingworth and J. F6glein, Threshold
such a situation occurs the algorithm must be init- selection based on a simple image statistic, Comput.
ialised at a new starting point. Vision Graphics Image Process. 30, 125-147 (1985).
4. A. Rosenfeid and A. C. Kak, Digital Picture Processing.
6. CONCLUSIONS Academic Press, New York (1976).
5. Y. Nagawa and A. Rosenfeld, Some experiments on
variable thresholding, Pattern Recognition 11, 191-204
A computationally efficient solution to the problem
(1979).
of minimum error thresholding has been derived under 6. P. A. Devijver and J. Kittler, Pattern Recognition: A
the assumption of object and pixel gray level values Statistical Approach. Prentice/Hall, Englewood Cliffs, NJ
being normally distributed. The principal idea behind (1982).
the method is to optimise the average pixel classifi- 7. N, Otsu, A threshold selection method from grey level
histograms, IEEE Trans. Syst. Man Cybernet. SMC-9,
cation error rate directly, using either an exhaustive 62-66 (1979).
search or an iterative algorithm. The method is 8. T. Ridler and S. Calvard, Picture thresholding using an
applicable in multithreshold selection. iterative selection method, IEEE Trans. Syst. Man
Cybernet. SMC-8, 630-632 (1978).
9. H. J. Trussel, Comments on picture thresholding using
REFERENCES an iterative selection method, IEEE Trans. Syst. Man
1. J. Kittler, J. Illingworth, J. F6glein and K. Paler, An Cybernet. SMC-9, 311 (1979).
automatic thresholding algorithm and its performance, 10. J. Kittler and J. Illingworth, On threshold selection using
Proc. 7th Int. Conf. on Pattern Recognition, Montreal, pp. clustering criteria, IEEE Trans. Syst. Man Cybernet.
287-289 (1984). SMC-15 (1985).
About the Autbor--Josar KtTrLERwas awarded a Ph.D. degree in Pattern Recognition in 1974 and since then
has published a number of papers and a book (Pattern Recognition: A Statistical Approach, Prentice Hall,
1982) on topics in pattern recognition and image processing. Since 1980 he has been with the Rutherford
Appleton Laboratory, where he is in charge of a research team working on projects in computer vision and
remote sensing. He is the SERC Coordinator for Pattern Analysis.
Dr Kittler is an Associate Editor of l EEE Transactions on Pattern Analysis and Machine Intelligence, and
is on the editorial board of Pattern Recognition, Pattern Recognition Letters, and Image and Vision
Computing. He has been serving as a member of the BPRA Committee for several years.
About the Author--JoHN ILLINGWORTHreceived B.Sc. and D.Phil. degrees in Physics from the Universities of
Birmingham and Oxford in 1978 and 1983, respectively. He is now employed as a Senior Research Associate
at the Rutherford Appleton Laboratory, undertaking research in computer vision algorithms.