We present a new approach for large-scale multi-view stereo matching, which is designed to operate on ultra high-resolution image sets and efficiently compute dense 3D point clouds. We show that, using a robust descriptor for matching purposes and high-resolution images, we can skip the computationally expensive steps that other algorithms require. As a result, our method has low memory requirements and low computational complexity while producing 3D point clouds containing virtually no outliers.
Given two camera calibrations, this report presents a closed-form algorithm that computes a sequence of 3D points that all project to a single location in one camera and whose projections form a uniformly sampled line in the other camera.
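The idea in the report above can be illustrated with a minimal sketch. This is not the report's algorithm, only one closed-form derivation under stated assumptions: both cameras are given as 3x4 projection matrices `P1` and `P2`, the sampled depth range `[lam_near, lam_far]` lies in front of the second camera, and the function name and parameters are hypothetical. The ray through pixel `x1` is written as `C + lam * d`, its image in the second camera is `(a + lam * b)` in homogeneous coordinates, and solving that relation for `lam` at equally spaced pixel positions gives the 3D points directly.

```python
import numpy as np

def sample_ray_uniform_in_view2(P1, P2, x1, lam_near, lam_far, n):
    """Return n points on the camera-1 ray of pixel x1 whose camera-2
    projections are equally spaced (hypothetical helper, illustrative only)."""
    M, p4 = P1[:, :3], P1[:, 3]
    C = -np.linalg.solve(M, p4)                            # camera-1 centre
    d = np.linalg.solve(M, np.array([x1[0], x1[1], 1.0]))  # ray direction
    a = P2 @ np.append(C, 1.0)   # homogeneous image of the centre in view 2
    b = P2 @ np.append(d, 0.0)   # homogeneous image of the ray direction

    def project(lam):
        h = a + lam * b
        return h[:2] / h[2]

    u0, u1 = project(lam_near), project(lam_far)
    # Work along the image axis where the projected segment is longest.
    k = int(np.argmax(np.abs(u1 - u0)))
    u = np.linspace(u0[k], u1[k], n)
    # From u = (a_k + lam*b_k) / (a_3 + lam*b_3), solved for lam.
    lam = (a[k] - u * a[2]) / (u * b[2] - b[k])
    return C + lam[:, None] * d
```

Because the projection of a ray is a straight line, spacing the samples uniformly along one image axis spaces them uniformly along the whole projected segment.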
Dense disparity maps can be computed from wide-baseline stereo pairs but will inevitably contain large areas where the depth cannot be estimated accurately because the pixels are seen in one view only. A traditional approach to this problem is to introduce a global optimization scheme to fill in the missing information by enforcing spatial consistency, which usually means introducing a geometric regularization term that promotes smoothness. The world, however, is not necessarily smooth, and we argue that a better approach is to monocularly estimate the surface normals and to use them to supply the required constraints. We will show that, even though the estimates are very rough, we nevertheless obtain more accurate depth maps than by enforcing smoothness. Furthermore, this can be done effectively by solving large but sparse linear systems.
3D scene reconstruction from uncalibrated image sequences is a challenging problem. One of its critical subproblems is estimating the fundamental matrix, which encodes the algebraic relations between consecutive images. The 8-point, normalized 8-point, algebraic minimization, and geometric distance minimization methods are tested for their robustness to noise in simulations on synthetic and real images. The methods are also evaluated on determining the camera intrinsic parameters by solving the Kruppa equations. Considering computational complexity and noise robustness, the normalized 8-point algorithm gives performance comparable to that of the more complex algorithms in terms of error, especially when many corresponding points are available.
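For readers unfamiliar with the normalized 8-point algorithm mentioned above, the following is a minimal NumPy sketch of its textbook form, not the implementation evaluated in the paper. The function names are hypothetical, and the inputs are assumed to be two `(N, 2)` arrays of matched pixel coordinates with `N >= 8`. The points are first normalized (centroid at the origin, mean distance sqrt(2)), a linear system is solved by SVD, the rank-2 constraint is enforced, and the normalization is undone.

```python
import numpy as np

def normalize_points(pts):
    """Translate the centroid to the origin and scale the mean distance
    to sqrt(2); return homogeneous points and the 3x3 transform."""
    centroid = pts.mean(axis=0)
    mean_dist = np.linalg.norm(pts - centroid, axis=1).mean()
    s = np.sqrt(2.0) / mean_dist
    T = np.array([[s, 0.0, -s * centroid[0]],
                  [0.0, s, -s * centroid[1]],
                  [0.0, 0.0, 1.0]])
    pts_h = np.column_stack([pts, np.ones(len(pts))])
    return (T @ pts_h.T).T, T

def fundamental_8point(x1, x2):
    """Estimate F with x2_h^T F x1_h = 0 from N >= 8 correspondences."""
    n1, T1 = normalize_points(x1)
    n2, T2 = normalize_points(x2)
    # Each row holds the 9 products x2_i * x1_j, matching row-major F.
    A = np.column_stack([n2[:, [0]] * n1, n2[:, [1]] * n1, n1])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)          # least-squares null vector of A
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0                        # enforce rank 2
    F = U @ np.diag(S) @ Vt
    F = T2.T @ F @ T1                 # undo the normalization
    return F / np.linalg.norm(F)
```

Skipping the two calls to `normalize_points` (using identity transforms instead) gives the plain 8-point method, which makes it easy to compare the two under added pixel noise.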
Multimedia Content Representation, Classification and …, Jan 1, 2006
A novel approach is presented for rejecting correspondence outliers between frames using the parallax-based rigidity constraint for epipolar geometry estimation. In this approach, the invariance of the 3-D relative projective structure of a stationary scene over different views is exploited to eliminate outliers, which are mostly due to independently moving objects in a typical scene. The proposed approach is compared against a well-known RANSAC-based algorithm using a test bed. The results show that using the proposed technique as a preprocessing step before the RANSAC-based approach significantly decreases the execution time of the overall outlier rejection.
Virtual view synthesis from an array of cameras has been an essential element of three-dimensional video broadcasting/conferencing. In this paper, we propose a scheme based on a hybrid camera array consisting of four regular video cameras and one time-of-flight depth camera. During rendering, we use the depth image from the depth camera as initialization, and compute a view-dependent scene geometry using constrained plane sweeping from the regular cameras. View-dependent texture mapping is then deployed to ...
In this study, an algorithm is proposed to solve the multibody structure from motion (SfM) problem for the single-camera case. The algorithm uses the epipolar criterion to segment the features belonging to independently moving objects. Once the features are segmented, the corresponding objects are reconstructed individually by applying a sequential algorithm, which uses the previous structure to estimate the pose of the current frame. A tracker is utilized to increase the baseline and improve the F-matrix estimation, which is beneficial for both segmentation and 3D structure estimation. The experimental results on synthetic and real data demonstrate that our approach efficiently deals with the multibody SfM problem.
I hereby declare that all information in this document has been obtained and presented in accordance with academic rules and ethical conduct. I also declare that, as required by these rules and conduct, I have fully cited and referenced all material and results that are not original to this work.
It has recently been shown that deformable 3D surfaces can be recovered from single video streams. However, existing techniques either require a reference view in which the shape of the surface is known a priori, which is often unavailable, or require tracking points over long sequences, which is hard to do.
In this paper, we introduce a local image descriptor that is inspired by earlier descriptors such as SIFT and GLOH but can be computed much more efficiently for dense wide-baseline matching purposes. We will show that it retains their robustness to perspective distortion and light changes, can be made to handle occlusions correctly, and runs fast on large images.
Papers by Engin Tola