CONTOUR DETECTION BY MULTIRESOLUTION SURROUND INHIBITION
Giuseppe Papari*, Patrizio Campisi**, Nicolai Petkov*, Alessandro Neri**
*Institute of Mathematics and Computing Science
University of Groningen
P. O. Box 800, 9700 AV Groningen, The Netherlands
[email protected],
[email protected]
**Dipartimento di Elettronica Applicata
Università degli Studi di Roma “Roma Tre”,
Via della Vasca Navale 84, 00146 Roma, Italy
[email protected],
[email protected]
ABSTRACT
In natural images, luminance changes occur both on object contours and on textures. Often, the latter are stronger than the former,
so standard edge detectors fail to isolate object contours from
texture. To overcome this problem, we propose a multiresolution
contour detector motivated by biological principles. At each scale,
texture is suppressed by using a bipolar surround inhibition process. The binary contour map is obtained by a contour selection criterion that is more effective than the classical hysteresis thresholding. Robustness to noise is achieved by Bayesian gradient estimation.
Keywords: edge, context, contour, surround suppression, texture
1. INTRODUCTION
Edge and contour detection, an important task in computer vision,
is a fertile field of ongoing research (see [1] for a survey). Standard
edge detectors react to all non-negligible luminance changes in an
image, irrespective of whether they originate from object contours
or from texture (grass, foliage, waves, etc.). Moreover, luminance
changes due to texture are often stronger than those due to contours.
Our goal is to isolate objects in a scene; therefore, some processing
beyond that of general-purpose edge detectors is needed.
Specifically, we use some principles deployed in the Human Visual
System (HVS). Psychophysical studies show that the perception of
an oriented stimulus can be influenced by other similar stimuli in
the surroundings [2]. Neurophysiological research shows that surround modulation is due to a specific neural mechanism. In [3] it
has been suggested that surround suppression effectively enhances
contours in natural images rich in textures. Other psychological
experiments show that the retinal image is decomposed through
band-pass filters. Low-pass filters are responsible for the so-called
pre-attentive stage of vision, corresponding to the first 0.1–0.3 s
of the image persistence on the retina, where only the general morphology is perceived [4]. High-pass filters deliver information for
the subsequent attentive stage where details are recognized. A multiresolution approach to contour detection has been proposed in [5].
In the current work we extend our previous study [6] by combining a multiresolution approach with surround inhibition. We propose a method that detects contours at different resolutions and
combines them by a contour-oriented selection algorithm. At each
scale, noise is reduced by optimal Bayesian Minimum Mean
Square Estimation (MMSE) of the gradient in additive noise and
texture is suppressed by a biologically motivated surround inhibition process.
2. SCALE-DEPENDENT CONTOUR DETECTOR
The proposed single-scale contour detector is depicted in Fig. 1, where I_w(x,y) is a noisy version of a given image I(x,y) corrupted by additive independent noise w(x,y). First, the gradient ∇_σ I_w of the input image I_w is evaluated by convolution with the gradient of a Gaussian kernel g_σ(x,y) [8]:
\[ \nabla_\sigma I_w = I_w * \nabla g_\sigma, \qquad g_\sigma(x,y) = \frac{1}{2\pi\sigma^2}\, e^{-\frac{x^2+y^2}{2\sigma^2}} \tag{1} \]
The gradient estimation depends on the parameter σ, which we will call scale, or resolution.
Fig. 1 Scale-dependent contour detector: gradient computation (I_w → ∇_σ I_w), Bayesian denoising (∇_σ I_w → ∇_σ Î), surround inhibition (∇_σ Î → b_σ).
1424404819/06/$20.00 ©2006 IEEE
Then, Bayesian denoising, described in Section 2.1, is applied on
the noisy image gradient. Surround inhibition is performed as detailed in Section 2.2.
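The gradient computation stage of eq. (1) can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the kernel truncation radius of 3σ, the border clamping, and the function names `gaussian_gradient_kernels` and `gradient_at` are assumptions made for the example.

```python
import math

def gaussian_gradient_kernels(sigma, radius=None):
    """Sample the x- and y-derivatives of the Gaussian g_sigma of eq. (1)."""
    if radius is None:
        radius = int(math.ceil(3 * sigma))  # truncation radius: an assumption
    kx, ky = [], []
    for y in range(-radius, radius + 1):
        row_x, row_y = [], []
        for x in range(-radius, radius + 1):
            g = math.exp(-(x * x + y * y) / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)
            row_x.append(-x / sigma ** 2 * g)  # d g_sigma / d x
            row_y.append(-y / sigma ** 2 * g)  # d g_sigma / d y
        kx.append(row_x)
        ky.append(row_y)
    return kx, ky, radius

def gradient_at(image, px, py, sigma):
    """Gradient of (image * g_sigma) at pixel (px, py); borders are clamped."""
    kx, ky, r = gaussian_gradient_kernels(sigma)
    h, w = len(image), len(image[0])
    gx = gy = 0.0
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            # convolution: kernel sample (dx, dy) pairs with image sample (px - dx, py - dy)
            yy = min(max(py - dy, 0), h - 1)
            xx = min(max(px - dx, 0), w - 1)
            gx += kx[dy + r][dx + r] * image[yy][xx]
            gy += ky[dy + r][dx + r] * image[yy][xx]
    return gx, gy

# Vertical step edge: dark left half, bright right half.
image = [[0.0] * 8 + [1.0] * 8 for _ in range(16)]
gx_edge, gy_edge = gradient_at(image, 8, 8, sigma=1.0)  # on the edge
gx_flat, gy_flat = gradient_at(image, 2, 8, sigma=1.0)  # in the flat region
```

On the step edge the horizontal component is clearly positive while the vertical one vanishes; in the flat region both components are zero. A larger σ widens the kernel and smooths the response, which is the sense in which σ acts as the scale.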
2.1 Bayesian denoising
Our goal is to find the optimal estimator ∇_σ Î = â(z) of the unknown vector a = ∇_σ I, when a noisy version z = ∇_σ I_w = ∇_σ I + ∇_σ w is observed. As is well known from Bayesian estimation theory, the optimal MMSE estimator is given by:
\[ \hat{a}(z) = \frac{\int a\, p_{z|a}(z|a)\, p_a(a)\, da}{\int p_{z|a}(z|a)\, p_a(a)\, da} \tag{2} \]
According to recent statistical studies on natural images [7], both p_a(a) and p_{z|a}(z|a) are assumed to be Gaussian Scale Mixtures (GSM), with covariance matrices A_i and N_k, respectively:
\[ p_a(a) = \sum_{i=1}^{K_1} \lambda_i\, N_2(a, 0, A_i), \qquad p_{z|a}(z|a) = \sum_{k=1}^{K_2} \beta_k\, N_2(z, a, N_k), \qquad \sum_{i=1}^{K_1} \lambda_i = \sum_{k=1}^{K_2} \beta_k = 1 \tag{3} \]
where:
\[ N_2(\xi, \mu, R) = \frac{1}{2\pi\sqrt{\det R}} \exp\left[ -\frac{1}{2} (\xi - \mu)^T R^{-1} (\xi - \mu) \right] \tag{4} \]
By substituting eq. (3) in eq. (2), we obtain the following closed-form expression for the optimal MMSE estimator:
\[ \hat{a}(z) = \frac{\sum_{i,k} \lambda_i \beta_k\, N_2(z, 0, A_i + N_k)\, A_i (A_i + N_k)^{-1} z}{\sum_{i,k} \lambda_i \beta_k\, N_2(z, 0, A_i + N_k)} \tag{5} \]
The nonlinearity defined by eq. (5), applied to each pixel of the gradient ∇_σ I_w, gives the best estimate of ∇_σ I.
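To make the behaviour of the estimator in eq. (5) concrete, the following sketch evaluates it for the special case of isotropic covariances, A_i = a_i·I and N_k = n_k·I. That restriction is an assumption made only to keep the example free of matrix algebra; the function name `gsm_mmse_estimate` and the numeric mixture parameters are likewise hypothetical.

```python
import math

def gsm_mmse_estimate(z, lambdas, a_vars, betas, n_vars):
    """Eq. (5) specialised to isotropic covariances A_i = a_i*I, N_k = n_k*I."""
    zx, zy = z
    r2 = zx * zx + zy * zy
    num = den = 0.0
    for lam, a in zip(lambdas, a_vars):
        for beta, n in zip(betas, n_vars):
            s = a + n  # variance of z under the pair (i, k)
            lik = math.exp(-r2 / (2 * s)) / (2 * math.pi * s)  # N_2(z, 0, (a + n) I)
            weight = lam * beta * lik
            num += weight * a / s  # A_i (A_i + N_k)^{-1} reduces to a / (a + n)
            den += weight
    gain = num / den
    return gain * zx, gain * zy

# One Gaussian for signal and noise: the estimator is the linear Wiener gain a / (a + n).
wiener = gsm_mmse_estimate((2.0, 0.0), [1.0], [3.0], [1.0], [1.0])

# Two-component signal prior: small gradients are shrunk much more than large ones.
small = gsm_mmse_estimate((0.1, 0.0), [0.5, 0.5], [0.1, 10.0], [1.0], [1.0])
large = gsm_mmse_estimate((10.0, 0.0), [0.5, 0.5], [0.1, 10.0], [1.0], [1.0])
gain_small = small[0] / 0.1
gain_large = large[0] / 10.0
```

With a single mixture component the estimator is linear. With several components the gain depends on |z|, so weak gradients (likely noise) are attenuated strongly while strong gradients (likely edges) are largely preserved.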
2.2 Surround inhibition
Next, a surround inhibition operator is deployed, which takes into account the influence of the surroundings of each point. In [3] an inhibition term T_σ(x,y) is introduced as the local average of the gradient magnitude M_σ(x,y) = |∇_σ I(x,y)| on a ring r around each pixel. T_σ(x,y), large in textured areas and small for isolated edges, is then subtracted from M_σ(x,y), thus reducing the response to texture. With this type of inhibition, there is a certain auto-inhibition of isolated edges. Moreover, edges at texture borders are considerably inhibited as well, which is not desirable with respect to the detection of region boundaries. In order to overcome these problems, we propose a bipolar inhibition term and a two-level inhibition process.
ICIP 2006
A. Bipolar inhibition term
The ring r is split along the edge direction θ_σ(x,y) into two halves r+ and r−, on which two local averages T_σ^±(x,y) are evaluated (see Fig. 2), and the smaller value is taken as the inhibition term. Specifically, let us consider the following pairs of orientation-dependent filters w^±_{σ,φ}(x,y), which define two half-rings oriented along an angle φ ∈ [0, π):
\[ w^{\pm}_{\sigma,\varphi}(x,y) = \frac{W^{\pm}_{\sigma,\varphi}(x,y)}{\iint_{\mathbb{R}^2} W^{\pm}_{\sigma,\varphi}(x,y)\, dx\, dy}, \qquad W^{\pm}_{\sigma,\varphi}(x,y) = U\left[ \pm (x \cos\varphi + y \sin\varphi) - a \right] \left| g_{4\sigma}(x,y) - g_\sigma(x,y) \right|^{+} \tag{6} \]
where
\[ |\xi|^{+} = \begin{cases} \xi, & \xi \ge 0 \\ 0, & \xi < 0 \end{cases}, \qquad U(\xi) = \begin{cases} 1, & \xi \ge 0 \\ 0, & \xi < 0 \end{cases} \tag{7} \]
Fig. 2 Half-rings on which M_σ(x,y) is averaged, for: isolated edges (a), textured areas (b), borders of textured areas (c).
The weighted local averages are defined by the following convolutions and, for each pixel, the minimum response is taken:
\[ T^{\pm}_\sigma(x,y) = \left\{ M_\sigma * w^{\pm}_{\sigma,\varphi} \right\}(x,y) \Big|_{\varphi = \theta_\sigma(x,y)}, \qquad T_\sigma(x,y) = \min\left\{ T^{+}_\sigma(x,y),\, T^{-}_\sigma(x,y) \right\} \tag{8} \]
The convolutions are computed for a discrete set of orientations φ_i = π(i − 1)/N_φ, i = 1, …, N_φ, and, for each pixel, the result obtained for the angle φ_i that is closest to the gradient orientation θ_σ(x,y) is used (see Fig. 3).
Fig. 3 Computation scheme of the inhibition term T_σ(x,y).
On isolated edges (Fig. 2a), the local averages on both sides are very low, ideally zero. Consequently, T_σ(x,y) is low and contours are not inhibited. In textured areas (Fig. 2b), the local averages on both sides are high and so is the inhibition term. Borders of textured areas are not inhibited, since T_σ(x,y) is low at such points (Fig. 2c). These modifications lead to a considerable improvement of the surround inhibition effect obtained in [3].
Fig. 4 Surround inhibition block (see Fig. 1).
B. Two-level inhibition
The proposed inhibition scheme is shown in Fig. 4, where the inhibition term T_σ(x,y) is computed as specified in Section 2.2.A. Two inhibited gradient fields c_{σ,α1}(x,y) and c_{σ,α2}(x,y) are evaluated, corresponding to strong and weak inhibition, respectively:
\[ c_{\sigma,\alpha}(x,y) = \left| M_\sigma(x,y) - \alpha\, T_\sigma(x,y) \right|^{+}, \qquad \alpha \in \{\alpha_1, \alpha_2\}, \quad \alpha_1 > \alpha_2 \tag{9} \]
The strongly inhibited gradient c_{σ,α1}(x,y) contains only little texture, but some weak contours are broken (Fig. 5a). On the contrary, the weakly inhibited gradient c_{σ,α2}(x,y) contains all contours, but texture is still present (Fig. 5b). In order to get the advantages of both inhibition levels, we combine c_{σ,α1} and c_{σ,α2} as follows. First, we apply non-maxima suppression and consider the sets S1 and S2 of the non-zero pixels of c_{σ,α1} and c_{σ,α2}:
\[ S_i = \left\{ (x,y) : c_{\sigma,\alpha_i}(x,y) \ne 0,\ \frac{\partial c_{\sigma,\alpha_i}}{\partial u} = 0,\ \frac{\partial^2 c_{\sigma,\alpha_i}}{\partial u^2} < 0 \right\}, \qquad i = 1, 2 \]
where u is the direction of the gradient ∇_σ I. Note that S1 ⊆ S2, since α1 > α2. Then, the output binary map b_σ is defined as the set of pixels of S2 which are connected to at least one point of S1. In this way, the broken contours of S1 are restored and most of the texture present in S2 is removed (Fig. 5c).
Fig. 5 Sets S1 (a) and S2 (b) of the nonzero pixels of c_{σ,α1} and c_{σ,α2}, respectively, with strong and weak inhibition. (c) Output map b_σ in which contours are restored and texture is suppressed.
The sets S1 and S2 have been obtained without thresholding the gradient magnitude in order to preserve the weak contours; thus some undesired edges are still present in Fig. 5c. In the common situation where some object contours are weaker than texture, standard hysteresis thresholding techniques will fail. In the next section, information from several resolutions is combined to select the desired contours and suppress even more texture.
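The two-level inhibition and the merging of the strongly and weakly inhibited maps can be illustrated with the short sketch below. It is a simplification under stated assumptions: the inhibition term T is taken as given, non-maxima suppression is skipped, connectivity is taken to be 8-connectivity, and the names `inhibit` and `merge_two_levels` as well as the sample values are hypothetical.

```python
def inhibit(M, T, alpha):
    """Inhibited gradient: max(M - alpha*T, 0), pixel by pixel (eq. (9))."""
    return [[max(m - alpha * t, 0.0) for m, t in zip(m_row, t_row)]
            for m_row, t_row in zip(M, T)]

def merge_two_levels(M, T, alpha_strong, alpha_weak):
    """Keep the weakly inhibited pixels (S2) that are connected to a strongly
    inhibited pixel (S1). Non-maxima suppression is omitted in this sketch."""
    c1 = inhibit(M, T, alpha_strong)
    c2 = inhibit(M, T, alpha_weak)
    h, w = len(M), len(M[0])
    S1 = {(y, x) for y in range(h) for x in range(w) if c1[y][x] > 0}
    S2 = {(y, x) for y in range(h) for x in range(w) if c2[y][x] > 0}
    keep, stack = set(), list(S1)
    while stack:  # grow from the S1 seeds through 8-connected S2 pixels
        y, x = stack.pop()
        if (y, x) in keep or (y, x) not in S2:
            continue
        keep.add((y, x))
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                stack.append((y + dy, x + dx))
    return [[1 if (y, x) in keep else 0 for x in range(w)] for y in range(h)]

# One image row: a contour with a weak spot at x = 2 and an isolated texture pixel at x = 6.
M = [[1.0, 1.0, 0.4, 1.0, 1.0, 0.0, 0.5]]
T = [[0.1, 0.1, 0.3, 0.1, 0.1, 0.0, 0.4]]
result = merge_two_levels(M, T, alpha_strong=1.5, alpha_weak=0.5)
```

In this toy row the strong inhibition breaks the contour at x = 2 and removes the texture pixel; the weak inhibition keeps both. After merging, the gap at x = 2 is restored because it touches S1 pixels, while the texture pixel at x = 6 is dropped because it is not connected to any S1 pixel.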
3. MULTISCALE CONTOUR COMBINATION
As is well known from multiresolution analysis, coarse scales contain the general morphology and almost no texture. On the other hand, contours detected at coarse scales are smoothed and shifted [9], and the non-maxima suppression destroys the junctions [10]. Thus, information from several scales can be combined in order to obtain contours that are as detailed as at the fine scales, but without the texture that does not appear at coarse scales [5, 6, 11].
In [6], this has been achieved through the pixel-by-pixel logic AND combination of binary contour maps obtained at different scales. A morphological dilation at coarse scales is applied, in order to compensate for the shifting and restore the junctions (see [6] for details). However, this approach has the drawbacks shown in Fig. 6a: part of the object contour C_1^(n−1), detected at the scale n−1, falls outside its counterpart C^(n) at the coarser scale n, and is therefore destroyed by the logic AND. Conversely, the accidental superposition of some residual texture in C_2^(n−1) and C_2^(n) is maintained.
Fig. 6 (a) A pixel-based superposition would destroy part of the object contour C_1^(n−1), and would keep part of the spurious texture C_2^(n−1). (b) The object O_i^(n−1), given by the union of the four connected components C_1^(n−1) - C_4^(n−1) of b_{n−1}, belongs to the connected component C_i^(n) of b̃_n.
To overcome these problems, we propose the multiscale contour detector shown in Fig. 7. First, the binary contour maps b_1, …, b_N at the scales σ_1 < σ_2 < … < σ_N are computed using the contour detector introduced in Section 2. Given two binary contour maps b_{n−1} and b_n, detected at two adjacent scales, a new map b^out_{n−1} = CS(b_{n−1}, b_n) is obtained as detailed in Fig. 8, by means of the mathematical operator CS (Contour Selector). It selects from b_{n−1} (finer scale) the pieces of contours that have a good overlap with the contours of b_n (coarser scale) and that form long chains of non-zero pixels. The operator CS is applied iteratively from the coarsest scale up to the finest one.
Fig. 7 Proposed multiscale contour detector.
In more detail, the block CS operates as follows: first, the morphological dilation b̃_n = b_n ⊕ D_2 is computed, with a disk D_2 of radius 2 pixels as a structuring element. The connected components of b_{n−1} and b̃_n are denoted by C_k^(n−1) and C_i^(n), respectively:
\[ b_{n-1} = \bigcup_k C_k^{(n-1)}, \qquad \tilde{b}_n = \bigcup_i C_i^{(n)} \tag{10} \]
We measure the degree of overlapping between C_k^(n−1) and b̃_n by the following quantity:
\[ F_k^{(n-1)} = \frac{\mathrm{card}\left\{ C_k^{(n-1)} \cap \tilde{b}_n \right\}}{\mathrm{card}\left\{ C_k^{(n-1)} \right\}} \tag{11} \]
where card{X} indicates the cardinality of the set X. All the components C_k^(n−1) such that F_k^(n−1) is below a threshold T_F are removed. Thus the component C_1^(n−1) in Fig. 6a will not be broken and the undesired component C_2^(n−1) will be completely removed.
The second step consists in labeling the components C_k^(n−1) which are included in the same component C_i^(n) as part of the same object O_i^(n−1) (Fig. 6b). For each object O_i^(n−1), we define its weight R_i^(n−1) as the sum of the values of M_σ(x,y) over O_i^(n−1):
\[ O_i^{(n-1)} = \bigcup_{\substack{k:\ F_k^{(n-1)} > T_F \\ C_k^{(n-1)} \cap C_i^{(n)} \ne \emptyset}} C_k^{(n-1)}, \qquad R_i^{(n-1)} = \sum_{(x,y) \in O_i^{(n-1)}} M_\sigma(x,y) \tag{12} \]
The final contour map b^out_{n−1} is given by the union of all the objects O_i^(n−1) whose weight is above a given threshold T_R:
\[ b^{out}_{n-1} = CS(b_{n-1}, b_n) = \bigcup_{i:\ R_i^{(n-1)} > T_R} O_i^{(n-1)} \tag{13} \]
In this way, we exploit the fact that object contours form long chains of non-zero pixels (high weight), whereas texture only forms short rods (low weight). Such an approach is more effective than thresholding the local gradient magnitude, because luminance changes due to texture are often stronger than the ones due to object contours.
4. EXPERIMENTAL RESULTS AND COMPARISON
The performance of the proposed contour detector has been compared with four other existing algorithms. The results are presented in Figs. 9 and 10 for a test image, without and with additive noise (SNR = 13 dB). Other results are available on the webpage www.cs.rug.nl/~papari/resultsICIP06.htm. As can be seen, our approach gives the best results in terms of texture suppression, cleanness of the detected contours, and robustness to noise.
Fig. 8. Block «CS» of the scheme in Fig. 7: morphological dilation of the coarse-scale map b_n, overlapping threshold applied to the fine-scale map b_{n−1}, relevance threshold, output b^out_{n−1}.
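The block CS of Fig. 8 can be sketched as follows. This is an illustrative reading of the two selection steps (overlap with the dilated coarse map, then object weight), not the authors' code. Contour maps are represented as sets of (y, x) pixels, the gradient magnitude as a dictionary, 8-connectivity is assumed, and the names `components`, `dilate`, `contour_selector` and the threshold values are hypothetical.

```python
def components(pixels):
    """Split a set of (y, x) pixels into 8-connected components."""
    pixels, comps = set(pixels), []
    while pixels:
        stack, comp = [pixels.pop()], set()
        while stack:
            y, x = stack.pop()
            comp.add((y, x))
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    q = (y + dy, x + dx)
                    if q in pixels:
                        pixels.remove(q)
                        stack.append(q)
        comps.append(comp)
    return comps

def dilate(pixels, radius=2):
    """Morphological dilation of a pixel set with a disk (no clipping to the image)."""
    disk = [(dy, dx) for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)
            if dy * dy + dx * dx <= radius * radius]
    return {(y + dy, x + dx) for (y, x) in pixels for (dy, dx) in disk}

def contour_selector(b_fine, b_coarse, M, t_overlap, t_weight):
    """Sketch of CS(b_{n-1}, b_n); M maps each fine-scale pixel to its magnitude."""
    coarse = dilate(b_coarse)
    label = {p: i for i, comp in enumerate(components(coarse)) for p in comp}
    objects = {}
    for comp in components(b_fine):
        overlap = len(comp & coarse) / len(comp)       # overlap degree, eq. (11)
        if overlap <= t_overlap:
            continue                                   # not supported at the coarse scale
        for i in {label[p] for p in comp if p in label}:
            objects.setdefault(i, set()).update(comp)  # object O_i, eq. (12)
    out = set()
    for obj in objects.values():
        if sum(M[p] for p in obj) > t_weight:          # weight R_i, eq. (13)
            out |= obj
    return out

# Fine scale: a long contour, a texture rod far from any coarse contour,
# and a short rod that overlaps a small coarse blob.
contour = {(5, x) for x in range(10)}
b_fine = contour | {(0, 0), (0, 1)} | {(20, 20), (20, 21)}
b_coarse = {(6, x) for x in range(10)} | {(20, 20)}  # contour shifted by one pixel
M = {p: 1.0 for p in b_fine}
selected = contour_selector(b_fine, b_coarse, M, t_overlap=0.5, t_weight=5.0)
```

In this toy case the long contour survives although it is shifted by one pixel with respect to the coarse scale, the first rod is rejected by the overlap test, and the second rod passes the overlap test but is rejected because its weight is too low.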
With respect to the Canny edge detector (Figs. 9c, 10c), the benefits of the multiscale analysis [5], without surround inhibition, are shown in Figs. 9d, 10d: some texture is removed and noise is reduced. Comparable texture suppression is achieved with the single-scale surround inhibition algorithm proposed in [3] (see Figs. 9e, 10e). The combination of multiscale analysis and surround inhibition [6] gives the much better results shown in Figs. 9f, 10f. The improvement proposed here leads to the even better results shown in Figs. 9b, 10b.
A ground-truth based performance evaluation has also been performed. The similarity U between each Algorithmic Result (AR) and the Ground Truth (GT) has been computed as follows:
\[ U = \frac{EC}{EC + MC + FP} \tag{14} \]
where EC (Exact Contours) indicates the number of pixels present both in the AR and in the GT, MC (Missing Contours) indicates the number of pixels present in the GT but not in the AR, and FP (False Positives) indicates the number of pixels present in the AR but not in the GT, and thus measures the amount of texture that has not been suppressed. U is always between 0 and 1, with U = 1 iff AR = GT.
For each studied algorithm, the values of U have been computed on a set of 24 images. The average value Ū and the standard deviation σ_U are shown in Fig. 11, both for noiseless and noisy images.
5. SUMMARY AND CONCLUSION
The proposed multiscale contour detector exploits important aspects of the HVS in order to isolate object contours from texture. Edges surrounded by other edges are inhibited, since the HVS perceives them as texture rather than object contours. The bipolar mechanism introduced here avoids the auto-inhibition of weak contours. The two-level inhibition process applies a strong inhibition to textured areas and a weak one to object contours.
Similarly to the HVS, contours are detected at several resolutions. All contour parts having a low overlapping degree F_k with respect to the adjacent coarser scale, or a low weight R_k, are removed. Thresholding global quantities such as F_k and R_k is more effective than thresholding the local gradient magnitude M_σ(x,y).
Robustness to noise, for the general non-Gaussian case, is achieved by using a Bayesian estimator. GSM models are employed for both the image and the noise, and a closed form of the estimator has been provided. As shown by experimental results and performance evaluations, our algorithm outperforms both standard and more sophisticated approaches, based on single-scale and multiscale surround inhibition.
Fig. 11. Quantitative performance comparison.
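The similarity measure of eq. (14), whose averages are reported in Fig. 11, can be computed as in the sketch below. Contour maps are represented as sets of pixel coordinates, which is a choice made for this example, and returning 1 when both maps are empty is an assumed convention that the text does not specify; the name `similarity` is hypothetical.

```python
def similarity(ar, gt):
    """U = EC / (EC + MC + FP) for two contour maps given as sets of pixels."""
    ec = len(ar & gt)  # exact contours: pixels in both AR and GT
    mc = len(gt - ar)  # missing contours: pixels in GT but not in AR
    fp = len(ar - gt)  # false positives: pixels in AR but not in GT
    total = ec + mc + fp
    return ec / total if total else 1.0  # empty-vs-empty case: assumed convention

ar = {(0, 0), (0, 1), (0, 2)}
gt = {(0, 1), (0, 2), (0, 3)}
u_partial = similarity(ar, gt)  # 2 exact, 1 missing, 1 false positive
u_perfect = similarity(gt, gt)
```

Because EC + MC + FP is the size of the union of the two maps, U is the ratio of intersection to union, which explains why it lies between 0 and 1 and equals 1 only when AR = GT.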
6. REFERENCES
[1] M. Basu, “Gaussian-based edge-detection methods: A Survey”, IEEE SMC-C 32 (3) (2002) 252-260.
[2] D.J. Field, A. Hayes, R.F. Hess, “Contour integration by the
human visual system: Evidence for a local association field”,
Vision Research 33 (2) (1993) 173–193.
[3] C. Grigorescu, N. Petkov and M.A. Westenberg: “Contour
detection based on non-classical receptive field inhibition”,
IEEE Trans. on Image Processing, 12 (7) (2003) 729-739.
[4] B. Julesz, “Visual Pattern Discrimination”, IRE Transactions
on Information Theory, 8 (1962) 84-92.
[5] W. Richards, H.K. Nishihara, B. Dawson, “CARTOON: A
biologically motivated edge detection algorithm”, MIT A.I.
Memo No. 668 (1982).
[6] G. Papari, P. Campisi, N. Petkov, A. Neri, “A multiscale approach to contour detection by texture suppression”, SPIE Image Proc.: Alg. and Syst. (2006), Vol. 6064A, San Jose, CA.
[7] J. Portilla and E.P. Simoncelli, “A parametric texture model
based on joint statistics of complex wavelet coefficients” Int.
J. Comput. Vis., 40 (1) (2000) 49–71.
[8] J.F. Canny, “A computational approach to edge detection”,
IEEE PAMI 8 (6) (1986) 679–698.
[9] K.H. Liang, T. Tjahjadi and Y.H. Yang. “Bounded diffusion
for multiscale edge detection using regularized cubic B-spline
fitting” IEEE SMC-B 29 (2) (1999) 291-297.
[10] A. Ding, A. Goshtasby, “On the Canny edge detector”, Pattern Recognition, 34 (3) (2001) 721-725.
[11] F. Bergholm, “Edge focusing”, IEEE PAMI 9 (6) (1987) 726-741.
Fig. 9 Input image (a) and contours detected with: (b) the proposed approach, (c) the Canny edge detector, (d) the multiscale edge detector CARTOON without surround inhibition [5], (e) single-scale [3] and (f) multiscale surround inhibition [6].
Fig. 10. Contours detected on a noisy test image (SNR = 13 dB).