Multi-resolution Stereo Matching Using Genetic Algorithm
Minglun Gong
Yee-Hong Yang
Department of Computer Science
University of Alberta
Abstract
In this paper, a new genetic-based stereo matching
algorithm is presented. Our motivation is to improve
the accuracy of the generated disparity map by
removing the mismatches caused by both occlusions
and false targets. In our approach, the stereo matching
problem is treated as an optimization problem. The
algorithm first takes advantage of multi-view stereo
images to detect occlusions, thereby removing
mismatches caused by visibility problems. A genetic
algorithm is then used to optimize both the
compatibility between corresponding points and the
continuity of the disparity map, which removes
mismatches caused by false targets. In addition, the
quadtree structure is used to implement a
multi-resolution framework. Since nodes at different
levels of the quadtree cover different numbers of pixels,
selecting nodes at different levels has an effect similar
to adjusting the window size at different locations of
the image. The experimental results show that our
approach can generate more accurate disparity maps
than two existing approaches.
Keywords: Multi-resolution image, stereo vision,
genetic algorithm, Markov random fields.
1 Introduction
A stereo vision system normally takes two or more
images captured by parallel cameras as input and
produces a disparity map. A desired disparity map
should be smooth and should contain sufficient detail
for subsequent processing. Previous work in stereo
vision can be coarsely classified into feature-based
methods [4; 5; 7] and intensity-based methods [1; 9; 10;
16].
Feature-based methods first detect features, e.g.
edges and corners, in the source images. The matching
process is then conducted on these features. Generally,
feature-based approaches can provide robust, but
sparse, depth information. A complex interpolation
process is needed to obtain a complete disparity map.
Intensity-based methods select a window centered at
each pixel. Pixels within the window are used to
compute the correlation or the sum of squared
differences between input images. The disparity that
produces the
best match is set as the disparity of the pixel. The
selection of window size is a critical problem for
intensity-based methods. The window size should be
large enough to include enough intensity variation for
reliable matching, but small enough to avoid disparity
variation inside the window. Levine, et al. [10] change
the window size locally according to the intensity
pattern. Kanade and Okutomi [9] propose an iterative
algorithm to select optimal window sizes for different
parts of the image. The advantage of intensity-based
methods is that dense disparity maps can be estimated.
Most of the intensity-based methods compute
disparities based on local information only, and
therefore, are sensitive to noise. To address this
problem, cooperative algorithms apply global
constraints in the matching process. The disparity of a
pixel is influenced by the disparities of its neighbors.
Cooperative algorithms can be formulated as the
process of extracting a surface from a
three-dimensional u-v-d volume [1], i.e., the so-called
disparity space. The value of any voxel in the disparity
space indicates the probability that the disparity of
pixel (u,v) is d. Consequently, the desired disparity
map should be a surface that satisfies two constraints:
(1) the surface is smooth, and (2) it passes through
voxels of high probabilities. In previous approaches [1;
16], three-dimensional functions are used to update the
disparity space iteratively until the process converges.
Recovering disparity information from two views
has its inherent limitations. First of all, the length of
the baseline has to be chosen carefully. With a short
baseline, distance cannot be estimated accurately by
the triangulation procedure. On the other hand,
increasing the baseline means that a larger disparity
range must be searched to find the match, which results
in a higher probability of finding a false match. Hence,
a tradeoff
has to be made between the precision in estimating
distance and the correctness in finding a match. The
multiple-baseline approach [11] is an attempt to address
this tradeoff problem. In this approach, multiple
Proceedings of the IEEE Workshop on Stereo and Multi-Baseline Vision (SMBV’01)
0-7695-1327-1/01 $17.00 © 2001 IEEE
cameras are placed along a line with their optical axes
perpendicular to the line. Matches between different
stereo pairs are considered together to improve the
accuracy of the disparity map.
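The baseline tradeoff can be made concrete with the standard triangulation relation for parallel cameras. The symbols f (focal length), B (baseline length), Z (depth of a scene point), and Δd (error in the measured disparity) are introduced here for illustration only and are not part of the notation used in the rest of the paper:

```latex
Z = \frac{fB}{d}
\quad\Longrightarrow\quad
|\Delta Z| \approx \left|\frac{\partial Z}{\partial d}\right| |\Delta d|
            = \frac{Z^{2}}{fB}\,|\Delta d|,
\qquad
d_{\max} - d_{\min} = fB\left(\frac{1}{Z_{\min}} - \frac{1}{Z_{\max}}\right)
```

Under these assumptions, a larger B reduces the depth error caused by a given disparity error, while the disparity range that must be searched grows in proportion to B.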
Another commonly known issue of analyzing
binocular disparity is that some part of the scene may
be visible in only one of the images. As a result, the
disparity information is not recoverable since it is not
possible to find a correct match. The Stereo by Eye
Array (SEA) approach [13] is proposed to solve this
problem. In this approach, nine cameras are placed in a
3×3 matrix on a plane with the same baseline length
between neighboring cameras along both the vertical
and horizontal directions. The matching algorithm uses
the geometric relation among captured images to detect
occlusions.
2 Genetic-based Stereo Matching
The genetic algorithm (GA) proposed by John
Holland [8] is a general-purpose global optimization
technique based on randomized search and incorporates
some aspects of iterative algorithms. GA is often
regarded as an alternative method for solving complex
optimization problems, especially for combinatorial
optimization problems or problems whose derivatives
cannot be computed numerically. GA and its
applications remain an active research topic [15].
Some researchers have applied GA to stereo vision
[6; 12]. In the approach proposed by Saito and Mori
[12], different window sizes are used to calculate
several candidate disparity values at each pixel location.
The GA is then applied to select the disparity of each
pixel from these candidates.
To simplify the calculation, the disparity map is
partitioned into 8×8 blocks and the GA is applied to
analyze each block. This approach is limited because it
assumes that the correct disparity value is one of the
candidates calculated with different window sizes. This
assumption may not hold when the images are noisy or
when part of the scene has a plain color and lacks
features for matching.
Han et al. [6] divide the reference image into
regions using the modified nonlinear Laplace (MNL)
filter. Chromosomes are then determined adaptively
based on the extracted regions. Eventually, disparities
are calculated by selecting a fitness value among
several candidate genes using a genetic algorithm. The
quality of the disparity map generated by this approach
highly depends on the region extraction result of the
MNL filter.
The approach presented in this paper is an
intensity-based approach. Our motivation is to improve the
accuracy of the disparity map generated by removing
the mismatches caused by both occlusions and false
targets. Therefore, multiple-view stereo images are
used to detect occlusions, and global constraints are
applied in the matching process to eliminate false
targets.
A problem associated with using global constraints
is how to handle the image boundaries. We noticed
this problem when we experimented with the iterative
algorithm implemented by Zitnick and Kanade, which
involves summation over a three-dimensional local
support area [16]. The boundary problem not only
manifests itself at the boundary of the disparity maps,
where it shows up as a black border, but also introduces
errors when the true disparity is close to the given
minimum or maximum disparity because the
summation is also conducted in the disparity dimension.
A possible solution to the second problem is to enlarge
the disparity search range. However, this increases
the memory and the time needed for the calculation and
also raises the probability of false matches.
In this paper, to avoid the above-mentioned
boundary problem, a novel approach is used to apply
the global constraints in the matching process. Here,
we borrow the idea from neighborhood-based
segmentation and use the Markov Random Fields
(MRFs) to model the interactions between neighboring
pixels. Within the MRFs framework, the stereo
matching process is equivalent to finding the optimum
state of the MRF. Because of the Gibbs’ equivalence,
the probability that the MRF is in a particular state can
be calculated using local energies over the entire image.
Consequently, a given stereo matching result can be
evaluated by modeling local interactions, and the
problem of finding the best stereo matching can be
viewed as finding the solution to a combinatorial
optimization problem.
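The step from the MRF to an energy minimization can be written out as follows. This is the standard Gibbs form, stated here as a reading aid; the symbols U (total energy), V_c (potential of clique c), C (set of cliques), τ (temperature), and 𝒵 (normalizing constant) are illustrative and do not appear in the original text:

```latex
P(D) = \frac{1}{\mathcal{Z}} \exp\!\left(-\frac{U(D)}{\tau}\right),
\qquad
U(D) = \sum_{c \in C} V_c(D)
\quad\Longrightarrow\quad
\arg\max_{D} P(D) = \arg\min_{D} U(D)
```

Because the exponential is monotonic and 𝒵 does not depend on D, the most probable disparity map is the one with the lowest sum of local energies.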
The outline of the approach is as follows: First the
three-dimensional disparity space is populated with
dissimilarity values based on the source images. In the
case of multi-view stereo images, the space will be
filled with the output of an occlusion detection function
proposed in SEA. A fitness function is defined based
on the MRF to evaluate a given disparity map. A
genetic algorithm is used to extract the best surface
from the disparity space with respect to the fitness
measure. In addition, the quadtree structure is used to
represent possible disparity maps. Since nodes at
different levels of the quadtree contain different
numbers of pixels, selecting nodes at different
resolution levels has an effect similar to adjusting the
window size at different locations of the image.
In the following subsections, detailed issues are
discussed first, which include initialization of the
disparity space, the encoding mechanism for all
possible disparity maps, the formulation of the fitness
function, and the appropriate crossover and mutation
operators. An outline of the optimization process is
given last.
2.1 Initialization of the Disparity Space
The disparity space is initialized based on the input
stereo images. If only two stereo images, I and Iref, are
available, this space will be filled directly by the
dissimilarity measure between corresponding pixels of I
and Iref, which is calculated by:
$$
S(r,c,d) = e^{ref}(r,c,d) = \sum_{-w \le i,j \le w} \rho\big(I(r+i,\,c+j),\; I_{ref}(r+i,\,c+j+d)\big) \tag{1}
$$
where w is the window radius, I(x,y) the color of pixel
(x,y), and ρ the color dissimilarity function.
The color dissimilarity function could be any
function that produces low values for correct matches,
such as the squared difference or the absolute
difference. In this paper, the absolute difference
function is used and the window radius is set to 1.
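Equation (1) with the absolute difference and w = 1 can be sketched in a few lines of Python. This is only an illustration: the function name `disparity_space`, the representation of a pixel as a tuple of channel values, and the clamping of out-of-range coordinates to the image border are assumptions, since the paper does not say how pixels outside the image are handled.

```python
def disparity_space(img, ref, d_min, d_max, w=1):
    """Return S with S[r][c][d - d_min] following equation (1)."""
    rows, cols = len(img), len(img[0])

    def pixel(image, r, c):
        # Assumption: coordinates outside the image are clamped to the border.
        r = min(max(r, 0), rows - 1)
        c = min(max(c, 0), cols - 1)
        return image[r][c]

    def rho(a, b):
        # Absolute difference, summed over the color channels.
        return sum(abs(p - q) for p, q in zip(a, b))

    space = []
    for r in range(rows):
        row = []
        for c in range(cols):
            costs = []
            for d in range(d_min, d_max + 1):
                total = 0
                for i in range(-w, w + 1):
                    for j in range(-w, w + 1):
                        total += rho(pixel(img, r + i, c + j),
                                     pixel(ref, r + i, c + j + d))
                costs.append(total)
            row.append(costs)
        space.append(row)
    return space
```

The dissimilarity values are computed once and then only looked up during optimization, which is why the later fitness evaluation needs no further access to the images.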
If multi-view stereo images captured by a camera
matrix are available, occlusion detection functions
proposed in SEA can be applied. Basically, eight stereo
pairs are formed using the central camera, I(0,0), and
each of the eight peripheral cameras, I(m,n). For each
pixel in the center image, the dissimilarity values of all
eight stereo pairs are computed. The disparity space is
then filled using the following function:
$$
S(r,c,d) = \sigma\big(
e^{(-1,-1)}(r,c,d),\; e^{(-1,0)}(r,c,d),\; e^{(-1,1)}(r,c,d),\;
e^{(0,-1)}(r,c,d),\; e^{(0,1)}(r,c,d),\;
e^{(1,-1)}(r,c,d),\; e^{(1,0)}(r,c,d),\; e^{(1,1)}(r,c,d)
\big) \tag{2}
$$
where e(m,n)(r,c,d) is the dissimilarity between
corresponding pixels of I(0,0) and I(m,n), and σ the
occlusion detection function.
Different occlusion detection functions can be
applied. Satoh and Ohta give a systematic comparison
of different functions [14]. The “sorting summation”
function is used in this paper since it can detect most of
the mismatches caused by occlusions. This function
normally does not perform as well as some other
functions in removing mismatches caused by false
targets. However, our algorithm relies on the
genetic-based optimization process to remove these
mismatches.
2.2 Encoding Scheme for Disparity Maps
Fundamental to all GAs is the encoding scheme for
representing the solutions of the corresponding
optimization problems. Normally, the method to
encode the solutions depends on the applications to
which the GA is applied.
In our approach, the quadtree structure is used to
provide the multi-resolution scheme. Therefore, the
possible solutions, i.e. the disparity maps, are encoded
by encoding the corresponding quadtrees. The same
encoding scheme has been recently applied to color
image segmentation [3]. The quadtrees used to
represent the disparity maps must satisfy the following
two constraints:
• Every leaf (i.e. a node with no child) k of the quadtree
has an associated disparity value x, which implies that
the disparity values of all the pixels covered by k are
x. A leaf k is said to cover pixel p if k contains p.
• Any interior node in the quadtree cannot have all its
descendents assigned to the same disparity value;
otherwise, all the descendents of the node should be
removed and the node itself should be selected as a
leaf.
Now we need to find a way to encode all the
possible quadtrees. In this paper, an array
representation of a complete quadtree is used so that we
can encode different quadtrees using a one-dimensional
integer string. In such a representation, the index of
node k in the string can be computed by:
$$
O(k) = \sum_{i=0}^{h-1} 4^{i} + y \times 2^{h} + x \tag{3}
$$
where h, x, and y are the height and the x- and
y-coordinates of node k, respectively. The content at
location O(k) in the integer string is determined by:
$$
D[O(k)] =
\begin{cases}
x & \text{if node } k \text{ is a leaf and its disparity is } x; \\
-1 & \text{otherwise.}
\end{cases}
$$
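The indexing rule of equation (3) can be checked with a small sketch. It reads h as the level counted from the root (root at h = 0, with 4^h nodes on level h), which is what the formula implies; the function names and the use of a dictionary keyed by (h, x, y) to hold the leaves are illustrative choices, not the paper's.

```python
def node_index(h, x, y):
    """Index O(k) of the node at level h and grid position (x, y)."""
    nodes_above = sum(4 ** i for i in range(h))  # nodes on levels 0 .. h-1
    return nodes_above + y * 2 ** h + x


def encode(leaves, depth):
    """Build the integer string D from a {(h, x, y): disparity} mapping.

    Positions that do not hold a leaf are filled with -1.
    """
    length = sum(4 ** i for i in range(depth + 1))
    string = [-1] * length
    for (h, x, y), disparity in leaves.items():
        string[node_index(h, x, y)] = disparity
    return string
```

For example, a quadtree of depth 1 whose root is a leaf with disparity 3 is encoded as `[3, -1, -1, -1, -1]`.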
2.3 Fitness Evaluation
The fitness of a given chromosome, which is
represented as a string, controls the evolution process.
The fitter the chromosome, the greater is its probability
to survive from one generation to the next. In the
context of stereo matching, we need a way to evaluate
different disparity maps. Here we use an energy
function, which is based on the MRF, as the fitness
evaluation tool. The smaller the value of the energy
function, the better the disparity map. The format of
the function is:
$$
f(D) = \sum_{k \in P} \Big( \sum_{(r,c) \in k} S(r,c,D[k]) + \lambda T_k \Big) \tag{4}
$$
where D denotes the disparity map, P the set that
contains all the leaves in the quadtree, and D[k] the
disparity value of leaf k. λ is the weight of the penalty
term: a small value of λ favors a detailed disparity
map and a large value encourages a coarser result. T_k is
the length of the boundary of leaf k, which is defined
as:
$$
T_k = \sum_{g \in P \cap Q_k} \big(1 - \delta(D[k] == D[g])\big) \times t_{k,g} \tag{5}
$$
where δ(true)=1 and δ(false)=0, Q_k is the set that
contains all the nodes that share a boundary with node k,
and t_{k,g} is the length of the shared boundary between
leaf k and leaf g.
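Equations (4) and (5) can be evaluated with the following sketch. It makes several assumptions that the paper does not state: the image is square with side 2^depth, boundary length is measured in pixel edges, the outer image border contributes nothing, and disparities are stored as offsets from a hypothetical `d_min`. Instead of enumerating neighboring leaves, the sketch expands the leaves to a per-pixel map; since every unit of boundary between two leaves with different disparities appears once in T_k and once in T_g, the smoothness term equals 2λ times the number of neighboring pixel pairs whose disparities differ.

```python
def rasterize(leaves, depth):
    """Expand {(h, x, y): disparity} leaves into a per-pixel disparity map."""
    side = 2 ** depth
    grid = [[None] * side for _ in range(side)]
    for (h, x, y), disparity in leaves.items():
        size = 2 ** (depth - h)  # side length of the leaf in pixels
        for r in range(y * size, (y + 1) * size):
            for c in range(x * size, (x + 1) * size):
                grid[r][c] = disparity
    return grid


def energy(leaves, depth, space, lam, d_min=0):
    """Energy f(D) of equation (4) for a quadtree given by its leaves."""
    grid = rasterize(leaves, depth)
    side = 2 ** depth
    # Data term: dissimilarity of every pixel at its assigned disparity.
    data = sum(space[r][c][grid[r][c] - d_min]
               for r in range(side) for c in range(side))
    # Smoothness term: count 4-neighbor pixel pairs with different disparity.
    edges = 0
    for r in range(side):
        for c in range(side):
            if c + 1 < side and grid[r][c] != grid[r][c + 1]:
                edges += 1
            if r + 1 < side and grid[r][c] != grid[r + 1][c]:
                edges += 1
    # Each shared boundary is counted twice in the sum of T_k over all leaves.
    return data + lam * 2 * edges
```

Pixel pairs inside one leaf, or across two leaves with the same disparity, contribute nothing, which matches the factor (1 − δ(D[k] == D[g])) in equation (5).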
2.4 Initial Population Generation
The GA needs a number of initial disparity maps as
the initial population to start with. The choice of the
population size is very important. If the selected
population size is too small, the algorithm may result in
premature convergence without finding an appropriate
solution. On the other hand, a large population size will
lead to long computation time. Our experiments show
that premature convergence is likely to occur when the
population size is smaller than 30. In this paper, the
population size is set to 40, which appears to be
sufficient to avoid this problem.
The initial disparity maps are generated purely
randomly. First, a recursive procedure is invoked using
the root of the quadtree as the input parameter. Inside
the procedure, whether or not the given parameter, the
node k, is selected as a leaf depends on a random
number. If the node is selected as a leaf, the
disparity that minimizes the sum of the
dissimilarities between all the pixels in the node and
their corresponding pixels is assigned to node k.
Otherwise, the procedure invokes itself recursively
using the four child nodes of k until the bottom level is
reached.
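The recursive generation procedure can be sketched as follows. The leaf probability `p_leaf = 0.5` is a hypothetical parameter (the paper only says that the decision depends on a random number), the image is assumed to be square with side 2^depth, and disparities are stored as indices into the last dimension of the disparity space. The sketch does not enforce the second quadtree constraint of Section 2.2, so four sibling leaves may receive the same disparity.

```python
import random


def best_disparity(space, h, x, y, depth):
    """Disparity index with the lowest summed dissimilarity over the node."""
    size = 2 ** (depth - h)

    def cost(d):
        return sum(space[r][c][d]
                   for r in range(y * size, (y + 1) * size)
                   for c in range(x * size, (x + 1) * size))

    return min(range(len(space[0][0])), key=cost)


def random_quadtree(space, depth, p_leaf=0.5, rng=random):
    """Return a random quadtree as a {(h, x, y): disparity} mapping of leaves."""
    leaves = {}

    def visit(h, x, y):
        # Bottom-level nodes are always leaves; others are chosen at random.
        if h == depth or rng.random() < p_leaf:
            leaves[(h, x, y)] = best_disparity(space, h, x, y, depth)
        else:
            for dy in (0, 1):
                for dx in (0, 1):
                    visit(h + 1, 2 * x + dx, 2 * y + dy)

    visit(0, 0, 0)
    return leaves


def initial_population(space, depth, size=40, seed=None):
    """Generate `size` random quadtrees (40 is the value used in the paper)."""
    rng = random.Random(seed)
    return [random_quadtree(space, depth, rng=rng) for _ in range(size)]
```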
2.5 Crossover Operator
In our scenario, it is inefficient to apply the
commonly used crossover operators, such as two-point
crossover, because they do not guarantee that the
crossover results of two quadtrees will still be legal
quadtrees. To address this problem, a recently
proposed crossover operation for color image
segmentation, called graft crossover, is applied [3]. For
completeness, the details of this operator are given
below.
Given two strings, which represent two quadtrees,
we want to compare them and find out all the leaves
that appear in only one of the two quadtrees. The
crossover process will be terminated if no such leaves
are found. Otherwise, we randomly select one of these
leaves as a seed node. For example, after comparing
the two quadtrees shown in Figure 1(a) and (b), we can
pick leaf u since it appears only in the left quadtree.
Then the cover node, i.e., the ancestor of the seed
node that appears in both quadtrees, is determined (it is
node v in our example). Finally, we swap all the nodes
that are the descendents of the cover node. Figure 1(c)
and (d) show the results after we do the graft crossover
at cover node v.
Figure 1: Graft crossover for quadtrees, (a) and (b)
the parents; (c) and (d) the offspring.
By construction, this algorithm guarantees that the
results of crossover will still be legal quadtrees. After
crossover, the energies of the two quadtrees may
change and one of the offspring’s strings may have a
lower energy than either of its parents.
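A minimal sketch of graft crossover is given below, with each quadtree held as a {(h, x, y): disparity} mapping of its leaves. Two points are assumptions rather than statements from the paper: leaves are compared by position only (two leaves at the same position with different disparities are not treated as differing), and the cover node is taken to be the nearest proper ancestor of the seed that is present in both trees, falling back to the root.

```python
import random


def ancestor(node, level):
    """Ancestor of `node` on `level` (the root is level 0)."""
    h, x, y = node
    shift = h - level
    return (level, x >> shift, y >> shift)


def exists(tree, node):
    """A node is present in a tree unless a proper ancestor of it is a leaf."""
    return all(ancestor(node, level) not in tree for level in range(node[0]))


def under(tree, cover):
    """Leaves of `tree` inside `cover`, including `cover` if it is a leaf."""
    return {k: v for k, v in tree.items()
            if k[0] >= cover[0] and ancestor(k, cover[0]) == cover}


def graft_crossover(a, b, rng=random):
    """Swap the subtrees below a common cover node; returns two offspring."""
    unique = sorted(set(a) ^ set(b))  # leaves that appear in only one tree
    if not unique:
        return dict(a), dict(b)
    seed = rng.choice(unique)
    cover = (0, 0, 0)
    for level in range(seed[0] - 1, -1, -1):
        candidate = ancestor(seed, level)
        if exists(a, candidate) and exists(b, candidate):
            cover = candidate
            break
    part_a, part_b = under(a, cover), under(b, cover)
    child_a = {k: v for k, v in a.items() if k not in part_a}
    child_a.update(part_b)
    child_b = {k: v for k, v in b.items() if k not in part_b}
    child_b.update(part_a)
    return child_a, child_b
```

Because the cover node is present in both parents, the leaves below it tile exactly the same image region in each tree, so exchanging them always leaves two complete quadtrees.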
2.6 Mutation Operator
The mutation operation is important for the GA
since the crossover operator cannot generate offspring
that have genes that do not appear in the initial
population. In our approach, three mutation operations,
namely splitting, merging, and alteration, are
adopted [3].
The mutation operator randomly selects a number of
pixels from the original image. In our experiments, the
number of pixels selected is equal to half of the
perimeter of the image. For each pixel, we search for
leaf k in the quadtree that covers this pixel. One of the
splitting, merging, and altering operations will be
applied with equal chance.
When trying to merge leaf k with its siblings, we
need to find out all of its siblings first. The merging
operation will be inhibited if any of its siblings has
children. Otherwise, the local energy is computed, and
the merging operation will happen only if the energy is
lower after merging than that before merging.
When trying to split leaf k, the process is similar.
The local energies are computed for disparity maps
before and after splitting. The splitting operation will
occur only when the energy decreases after splitting.
When trying to alter leaf k, all the neighbors of k are
found first. The alteration operation changes the
label of leaf k to the label of one of its neighbors if
doing so reduces the energy.
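The three mutation operations can be sketched as below, again with a quadtree stored as a {(h, x, y): disparity} mapping of leaves. Several parts are assumptions: `energy` is any callable that scores a whole tree (the paper compares local energies, which gives the same accept or reject decision but is cheaper), `pick` is a hypothetical callable that chooses the disparity of a newly created node, and the leaf to mutate is passed in directly rather than found from a randomly selected pixel.

```python
import random


def siblings(node):
    """The four nodes (including `node`) that share its parent."""
    h, x, y = node
    px, py = x - x % 2, y - y % 2
    return [(h, px + dx, py + dy) for dy in (0, 1) for dx in (0, 1)]


def rect(node, depth):
    """Pixel rectangle (x0, y0, x1, y1) covered by `node`."""
    h, x, y = node
    size = 2 ** (depth - h)
    return x * size, y * size, (x + 1) * size, (y + 1) * size


def touching(a, b, depth):
    """True if the two nodes share a boundary of positive length."""
    ax0, ay0, ax1, ay1 = rect(a, depth)
    bx0, by0, bx1, by1 = rect(b, depth)
    beside = (ax1 == bx0 or bx1 == ax0) and min(ay1, by1) > max(ay0, by0)
    stacked = (ay1 == by0 or by1 == ay0) and min(ax1, bx1) > max(ax0, bx0)
    return beside or stacked


def propose_merge(tree, leaf, pick):
    if leaf[0] == 0 or any(s not in tree for s in siblings(leaf)):
        return None  # root leaf, or some sibling has children
    h, x, y = leaf
    merged = {k: v for k, v in tree.items() if k not in siblings(leaf)}
    parent = (h - 1, x // 2, y // 2)
    merged[parent] = pick(parent)
    return merged


def propose_split(tree, leaf, depth, pick):
    if leaf[0] == depth:
        return None  # a single pixel cannot be split
    h, x, y = leaf
    split = {k: v for k, v in tree.items() if k != leaf}
    for child in siblings((h + 1, 2 * x, 2 * y)):
        split[child] = pick(child)
    return split


def propose_alter(tree, leaf, depth, energy):
    labels = {v for k, v in tree.items()
              if k != leaf and v != tree[leaf] and touching(leaf, k, depth)}
    if not labels:
        return None
    # Try every neighboring label and keep the one with the lowest energy.
    return min(({**tree, leaf: d} for d in sorted(labels)), key=energy)


def mutate(tree, leaf, depth, energy, pick, rng=random):
    """Apply one randomly chosen operation; keep it only if energy drops."""
    op = rng.choice(("split", "merge", "alter"))
    if op == "split":
        candidate = propose_split(tree, leaf, depth, pick)
    elif op == "merge":
        candidate = propose_merge(tree, leaf, pick)
    else:
        candidate = propose_alter(tree, leaf, depth, energy)
    if candidate is not None and energy(candidate) < energy(tree):
        return candidate
    return tree
```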
2.7 Minimization Process
After the above issues are addressed, the GA can be
implemented as an iterative procedure. When the
population evolves from one generation to the next, two
strings are picked randomly each time to undergo
crossover and mutation until all the strings in the
population are processed.
The elitist strategy [2] is applied when selecting
strings for the next generation. The energy value
$f(D_i^m)$ of the best string $D_i^m$ of generation m is
compared with the energy value $f(D_j^{m+1})$ of the
worst string $D_j^{m+1}$ of generation m+1. If
$f(D_i^m) < f(D_j^{m+1})$, then string $D_i^m$ is
substituted for $D_j^{m+1}$. By means of this strategy, the
minimum energy value of the population will not
increase during the process of evolution.
The above process is repeated until the termination
condition is satisfied. In our approach, the evolution
process is terminated when the energy difference
between the best string and the worst string in the
population is smaller than 0.01 percent of the average
energy value.
3 Experimental Results
Four datasets are used to test the presented
algorithm. Each of the first two datasets, “head and
lamp” and “plant”, contains nine color images, which
were captured using a camera matrix at the University
of Tsukuba. The third one, “slanted plane”, is a binocular
stereo pair, which contains two grayscale images and is
provided by R. Szeliski of Microsoft Research. The
last dataset, “random dot stereogram”, is a synthetic
stereo image pair with 50 percent density, which was
used in [16].
The SEA approach [13] and the cooperative
algorithm [16] are used for comparison. All the
cooperative algorithm results shown below are
generated using the “ZK Stereo” program written by
Zitnick [17]. The default values are used for most of
the parameters. In particular, the window radius for the
initial match value is set to 1, the local support radius
for the row and column dimensions is set to 2, and the
local support radius for the disparity dimension is set
to 1. The values for the minimum disparity and maximum
disparity depend on the particular stereo dataset. In
order to make sure that the algorithm converges, the
number of iterations is changed to 15, which is the
upper bound according to the documentation of the
program [17]. The program provides two different ways
to compute the initial match values. The “sums of
absolute differences” option is used because it gives
better results and also because we fill the disparity
space in the same way when only two stereo images are
available.
Figure 2: “head and lamp” multi-view stereo (a) source image (b) ground truth (c) our approach (d) SEA
(3×3 window) (e) SEA (5×5 window) (f) cooperative algorithm.
Figure 2 shows the experimental result for the “head
and lamp” dataset. The true disparity range for this
stereo dataset is [5, 14]. For comparison, the results of
using SEA with different window sizes are shown as
Figure 2 (d) and (e). Figure 2 (f) is generated by the
cooperative algorithm, which is quite good considering
it uses two original images only. However, in order to
generate this result, we have to enlarge the disparity
range to [2, 17]. Poor results will be produced if a
tighter bound is used. Since the disparity range is
expanded by 60 percent (from 10 to 16 disparity
levels), it is reasonable to believe that the
computational time and the space required are also
increased by 60 percent.
Figure 3: “plant” multi-view stereo (a) source image (b) our approach (c) SEA (3×3 window) (d) cooperative
algorithm (disparity range [12, 51])
Figure 3 shows the experimental results for the
“plant” dataset. The true disparity range for this stereo
dataset is [15, 48]. Many occlusions exist in this stereo
dataset since the scene is very complex and the length
of the baseline is quite large. Figure 3(c) shows that the
SEA approach can detect most of the occlusions.
However, many mismatches caused by false targets still
exist. For instance, incorrect depth variations show up
in the blue rectangle region. As shown in Figure 3(b),
our approach successfully removed these mismatches.
The result generated by the cooperative algorithm
shown in Figure 3(d) looks smooth. However, closer
inspection of the picture shows that many details
are lost. For example, the leaves are broken, “tears”
show up in the red rectangle area, and part of the
background is covered in the green oval area. These
problems might be caused by occlusions, which are not
recoverable using only two images.
Figure 4 shows the experimental results for the
“slanted plane” dataset. The true disparity range for
this stereo dataset is [4, 28]. Pixels of red color in
Figure 4(b) are occluded in the other source image.
Although mismatches caused by occlusions exist in the
result of our approach, a comparison with the result of
simple window-based matching (Figure 4(d)) shows
that most of the mismatches caused by false targets
have been eliminated. Figure 4(e) shows the boundary
problem of the cooperative algorithm when a tight
disparity range is given. Figure 4(f) shows the result of
the cooperative algorithm using an enlarged disparity
range. Since interpolation is applied, the result looks
even smoother than the ground truth.
Figure 4: “slanted plane” binocular stereo (a) source image (b) ground truth (c) our approach (d) 3×3
window match (e) cooperative algorithm (disparity [4, 28]) (f) cooperative algorithm (disparity [1, 31]).
Figure 5: “random dot stereogram” binocular stereo (a) reference (left) image (b) right image (c) ground
truth (d) our approach (e) cooperative algorithm (disparity range [0, 20]) (f) cooperative algorithm
(disparity range [-3, 23])
Figure 5 shows the experiment results for the
synthetic scene. The true disparity range for this stereo
dataset is [0, 20]. Again, red color is used in Figure
5(c) to mark areas that are not visible in the reference
image. The result of our approach shows that, in areas
without occlusion, most of the computed disparities are
accurate. Figure 5(e) shows that for the cooperative
algorithm, the boundary problem still exists. Figure
5(f) shows that even when the disparity range is
increased to [-3, 23], the brightest vertical bar still does
not show up. This might be caused by the boundary
problem since this area has the highest disparity.
Table 1: Comparison of accuracy for “head and
lamp” dataset.
                                           SEA (3×3 window)   SEA (5×5 window)   Our Approach
% of correct disparities                   78.048             82.314             87.855
% of the disparities within truth value ±1 92.544             94.679             97.452
Table 2: Comparison of accuracy for “slanted
plane” dataset (in areas without occlusion only)
                                           Cooperative Algorithm   Our Approach
% of correct disparities                   69.132                  82.774
% of the disparities within truth value ±1 97.235                  98.547
Table 1 gives the percentage of accurate disparities
found by the SEA algorithm and our approach for the
“head and lamp” dataset. Table 2 gives the percentage
of accurate disparities found by the cooperative
algorithm and our approach for the “slanted plane”
dataset. The result shows that our approach gives more
accurate results for both stereo datasets.
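The two accuracy measures reported in the tables can be computed as in the following sketch. The function name `accuracy` and the optional `occluded` mask (used to restrict the evaluation to areas without occlusion, as in Table 2) are illustrative assumptions; the paper does not describe its evaluation code.

```python
def accuracy(result, truth, occluded=None):
    """Return (% of correct disparities, % of disparities within truth ±1).

    `result` and `truth` are 2-D lists of disparities of equal size;
    `occluded`, if given, marks pixels to be skipped with a true value.
    """
    exact = close = total = 0
    for r, row in enumerate(truth):
        for c, true_d in enumerate(row):
            if occluded is not None and occluded[r][c]:
                continue
            total += 1
            error = abs(result[r][c] - true_d)
            exact += error == 0
            close += error <= 1
    return 100.0 * exact / total, 100.0 * close / total
```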
The computational costs of genetic-based
algorithms are relatively high. Our experiments show
that the new algorithm takes about 500-700 generations
to converge. However, since the dissimilarity values
are pre-calculated and stored in the disparity space, the
computation time is reduced. On our 1.5GHz Pentium
4 computer with 256MB RAM running Windows 2000,
about 4 minutes are needed to calculate the disparity
map for the “head and lamp” example, which is a
384×288 multi-view stereo.
4 Conclusions
In this paper, we proposed a novel genetic-based
stereo matching algorithm. The algorithm can be
applied to both binocular stereo data and multi-view
stereo data that are captured by a camera matrix. When
multi-view stereo images are available, they can be
fully utilized to detect occlusions and to improve
accuracy. A fitness function, which considers both
intensity similarity and disparity smoothness, is
introduced to evaluate a given disparity map. The
genetic algorithm is used to find the best disparity map
with respect to the fitness function.
The quadtree structure is used to implement the
multi-resolution framework, which has an effect similar
to adjusting the window size at different locations of
the image. To apply the genetic algorithm under the
quadtree structure, an encoding mechanism for the
quadtree structure is used. Graft crossover, splitting
mutation, merging mutation, and alteration mutation,
which are suitable for the quadtree structure, are
applied.
Compared with the SEA algorithm, our approach is
more flexible since it can also work with binocular
stereo datasets and produce acceptable results. Our
approach adopts the idea of detecting occlusions using
the geometric relation among captured images.
However, unlike the SEA algorithm, which uses
the output of the occlusion detection function to
determine disparities directly, our approach uses the
output to fill the disparity space, which makes it
possible to further remove mismatches.
Compared with the iterative cooperative
algorithm, our approach uses the MRF framework to
apply the global constraints in the matching process.
The boundary problem is solved since no summation is
needed in the process. In addition, our approach can
fully utilize multi-view stereo images to remove
mismatches caused by occlusions.
In summary, the proposed algorithm naturally
integrates the ideas of occlusion detection using
a camera matrix, matching with adaptive window size,
and incorporating support from neighboring pixels.
The experimental results show that the algorithm can
generate better disparity maps than two existing
approaches. Future research in this direction is
warranted.
Acknowledgments
The authors would like to acknowledge financial
support from NSERC and the University of Alberta.
The authors would like to thank Dr. Y. Ohta of the
University of Tsukuba for supplying the “head and
lamp” and “plant” stereo images, Dr. R. Szeliski of
Microsoft Research for the “slanted plane” stereo
images, and Mr. L. Zitnick of Carnegie Mellon
University for the “random dot stereogram” stereo
images. We also thank Dr. L. Zitnick for sharing his
software on the web.
References
[1] Q. Chen & G. Medioni, "Volumetric Stereo Matching
Method: Application to Image-Based Modeling," In:
Proceedings of the Conference on Computer Vision and
Pattern Recognition Vol. 1, IEEE Computer Society, pp.
29-34, Fort Collins, CO, USA, June 23-25, 1999.
[2] D. E. Goldberg, Genetic Algorithms in Search,
Optimization, and Machine Learning, Addison-Wesley,
Reading, MA, USA, 1989.
[3] M. Gong & Y.-H. Yang, "Genetic-Based Multiresolution
Color Image Segmentation," In: Vision Interface
Proceedings, Canadian Image Processing and Pattern
Recognition Society, pp. 71-80, Ottawa, Ontario,
Canada, June 7-9, 2001.
[4] W. E. L. Grimson, "A Computer Implementation of a
Theory of Human Stereo Vision," Philosophical
Transactions of the Royal Society of London, Vol. B 292,
pp. 217-253, May, 1981.
[5] W. E. L. Grimson, "Computational Experiments with a
Feature Based Stereo Algorithm," IEEE Transactions on
Pattern Analysis and Machine Intelligence, Vol. 7, No.
1, pp. 17-34, January, 1985.
[6] K.-P. Han, K.-W. Song, E.-Y. Chung, S.-J. Cho, & Y.-H.
Ha, "Stereo Matching Using Genetic Algorithm with
Adaptive Chromosomes," Pattern Recognition, Vol. 34,
No. 9, pp. 1729-1740, September, 2001.
[7] W. Hoff & N. Ahuja, "Surfaces from Stereo: Integrating
Feature Matching, Disparity Estimation, and Contour
Detection," IEEE Transactions on Pattern Analysis and
Machine Intelligence, Vol. 11, No. 2, pp. 121-136,
February, 1989.
[8] J. H. Holland, Adaptation in Natural and Artificial
Systems: An Introductory Analysis with Applications to
Biology, Control and Artificial Intelligence, University
of Michigan Press, Ann Arbor, MI, USA, 1975.
[9] T. Kanade & M. Okutomi, "Stereo Matching Algorithm
with an Adaptive Window: Theory and Experiment,"
IEEE Transactions on Pattern Analysis and Machine
Intelligence, Vol. 16, No. 9, pp. 920-932, September,
1994.
[10] M. D. Levine, D. A. O'Handley, & G. M. Yagi,
"Computer Determination of Depth Maps," Computer
Graphics and Image Processing, Vol. 2, No. 4, pp. 131-150,
October, 1973.
[11] M. Okutomi & T. Kanade, "A Multiple-Baseline Stereo,"
IEEE Transactions on Pattern Analysis and Machine
Intelligence, Vol. 15, No. 4, pp. 353-363, April, 1993.
[12] H. Saito & M. Mori, "Application of Genetic Algorithms
to Stereo Matching of Images," Pattern Recognition
Letters, Vol. 16, No. 8, pp. 815-821, August, 1995.
[13] K. Satoh & Y. Ohta, "Occlusion Detectable Stereo Using
a Camera Matrix," In: Proceedings of the 2nd Asian
Conference on Computer Vision, International
Association for Pattern Recognition, pp. 331-335,
Singapore, December 5-8, 1995.
[14] K. Satoh & Y. Ohta, "Occlusion Detectable Stereo -
Systematic Comparison of Detection Algorithms," In:
Proceedings of the 13th International Conference on
Pattern Recognition Vol. 1, International Association for
Pattern Recognition, pp. 280-286, Los Alamitos, CA,
USA, August 25-29, 1996.
[15] M. Srinivas & L. M. Patnaik, "Genetic Algorithms: A
Survey," Computer, Vol. 27, No. 6, pp. 17-26, June,
1994.
[16] L. C. Zitnick & T. Kanade, "A Cooperative Algorithm
for Stereo Matching and Occlusion Detection," IEEE
Transactions on Pattern Analysis and Machine
Intelligence, Vol. 22, No. 7, pp. 675-684, July, 2000.
[17] L. C. Zitnick, Software “ZK Stereo,” available at
http://www.cs.cmu.edu/~clz/stereo.html.