Session 1

Introduction

Michael Bleyer
Course: Stereo Vision
VU Stereo Vision (3.0 ECTS/2.0 WS)
ƒ Credit eligibility:
• Elective in the master's program “Computergraphik & Digitale Bildverarbeitung”
• Elective in the master's program “Medieninformatik”
ƒ Course website:
• http://www.ims.tuwien.ac.at/teaching_detail.php?ims_id=188.HQK
ƒ Lecture dates (9 sessions):
• Fri 5 March (10:00-11:30)
• Fri 12 March (10:00-11:30)
• Fri 19 March (10:00-11:30)
• Fri 26 March (10:00-11:30)
• Fri 16 April (10:00-11:30)
• Fri 23 April (10:00-11:30)
• Wed 28 April (10:00-11:30)
• Fri 7 May (10:00-11:30)
• Wed 12 May (10:00-11:30)
• Oral exam by appointment
ƒ Location:
• Seminar room 188/2
Topics Covered in the Lecture (Preliminary List)
ƒ Basics:
• 3D Perception, Stereo Matching Problem, Applications
• Stereo Pipeline, Challenges in Stereo Matching, Middlebury benchmark
ƒ Local Stereo Methods:
• Adaptive Window Methods
ƒ Global Stereo Matching:
• Global optimization:
− Dynamic Programming, Belief Propagation, Move-Making Algorithms
• Stereo Models:
− Smoothness Priors, Occlusion Handling
ƒ The Data Term:
• Sampling-insensitive measures, role of color, illumination-invariant measures
ƒ Segmentation-Based Stereo
ƒ Recent work at IMS
Homework
ƒ Step 1:
• You will implement block matching for stereo.
ƒ Step 2:
• You will make the algorithm of step 1 computationally fast (sliding window technique).
• Reflects the first branch of stereo research: real-time matching.
ƒ Step 3:
• You will improve the algorithm of step 1 to deliver high-quality results.
• Competition:
− What is your ranking in the Middlebury benchmark?
• It is up to you which tricks from this lecture you use.
• Reflects the second branch of stereo research: high-quality (but slow) matching.
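The first two homework steps can be sketched roughly as follows. This is a minimal illustration, not the reference solution of the course: it assumes rectified grayscale images stored as lists of rows, a sum-of-absolute-differences (SAD) cost, and the convention that a left pixel at column x matches the right pixel at column x - d. The function names and parameters are hypothetical. The fast variant uses an integral image (summed-area table) to obtain each window sum in constant time; the sliding window technique named in the homework may differ in detail.

```python
def block_matching(left, right, max_disp, radius):
    """Step 1 sketch: brute-force SAD block matching.

    For every left pixel (x, y), try each disparity d and keep the one whose
    window around (x - d, y) in the right image gives the smallest SAD.
    """
    height, width = len(left), len(left[0])
    disparity = [[0] * width for _ in range(height)]
    for y in range(radius, height - radius):
        for x in range(radius, width - radius):
            best_cost, best_d = None, 0
            # The shifted window must stay inside the right image.
            for d in range(min(max_disp, x - radius) + 1):
                cost = 0
                for dy in range(-radius, radius + 1):
                    for dx in range(-radius, radius + 1):
                        cost += abs(left[y + dy][x + dx]
                                    - right[y + dy][x + dx - d])
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disparity[y][x] = best_d
    return disparity


def block_matching_fast(left, right, max_disp, radius):
    """Step 2 sketch: same result, window sums taken from an integral image."""
    height, width = len(left), len(left[0])
    disparity = [[0] * width for _ in range(height)]
    best = [[None] * width for _ in range(height)]
    for d in range(max_disp + 1):
        # integral[y + 1][x + 1] = sum of absolute differences over
        # rows 0..y and columns 0..x (columns below d have no partner).
        integral = [[0] * (width + 1) for _ in range(height + 1)]
        for y in range(height):
            row_sum = 0
            for x in range(width):
                if x >= d:
                    row_sum += abs(left[y][x] - right[y][x - d])
                integral[y + 1][x + 1] = integral[y][x + 1] + row_sum
        for y in range(radius, height - radius):
            for x in range(radius + d, width - radius):
                y0, y1 = y - radius, y + radius + 1
                x0, x1 = x - radius, x + radius + 1
                cost = (integral[y1][x1] - integral[y0][x1]
                        - integral[y1][x0] + integral[y0][x0])
                if best[y][x] is None or cost < best[y][x]:
                    best[y][x] = cost
                    disparity[y][x] = d
    return disparity
```

The brute-force version costs on the order of the window area per pixel and disparity, while the integral-image version needs four lookups per window regardless of its size.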
3D Perception

[Figure: the two eyes, separated by about 6.5 cm (human-eye separation), each deliver a 2D image; the brain fuses the left and right 2D images into a 3D view.]

If we ensure that the left eye sees one 2D image and the right eye sees another, our brain will try to overlay the images to generate a 3D impression.

How can we use this for watching 3D movies?
Anaglyphs
ƒ Two images of complementary color are overlaid to generate one image.
ƒ Glasses required (e.g. red/green).
ƒ The red filter cancels out the red image component, the green filter cancels out the green component.
ƒ Each eye gets one image => 3D impression.
ƒ Current 3D cinemas use this principle. However, polarization filters are used instead of color filters.
[Figures: anaglyph image; red/green glasses]
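Overlaying the two views in separate color channels can be sketched as below. This is only an illustration under stated assumptions: images are lists of rows of (R, G, B) tuples, red/green glasses are used, and the left view is written to the red channel and the right view to the green channel. Which view goes into which channel depends on the glasses, so that assignment is an assumption, and `make_anaglyph` is a hypothetical name.

```python
def make_anaglyph(left_rgb, right_rgb):
    """Combine a stereo pair into one red/green anaglyph image.

    The red channel is taken from the left view and the green channel from
    the right view; blue is left empty (assumed channel assignment).
    """
    return [
        [(l[0], r[1], 0) for l, r in zip(left_row, right_row)]
        for left_row, right_row in zip(left_rgb, right_rgb)
    ]


# One-pixel example: red comes from the left view, green from the right view.
print(make_anaglyph([[(10, 20, 30)]], [[(40, 50, 60)]]))
```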
Shutter Glasses
ƒ The display alternates between the left and right image (i.e. each even frame shows the left image, each odd frame shows the right image).
ƒ When the left frame is shown, the shutter glasses close the right eye and vice versa.
ƒ Requires new displays with a high frame rate (120 Hz).
ƒ Currently pushed by Nvidia to address the gaming market.
[Figures: shutter glasses and 120 Hz display; Nvidia artwork]
Autostereoscopic Displays
ƒ No glasses required!
ƒ A matrix of many transparent lenses is put on the display.
ƒ The lenses distort pixels so that the left eye gets a left image and the right eye gets a right image (if you are standing in a sweet spot) => 3D impression.
ƒ Novel viewpoint capability:
• You can walk in front of the display and get a perceptually correct depth impression depending on your current viewpoint.
ƒ You will get a demo soon.
[Figure: Philips WOWvx display]
Free Viewing (No glasses required, but some practice)
ƒ The way you usually look at the display (no 3D): [Figure: both eyes converge on the same point of the display]
ƒ Parallel viewing: the left image is on the left and the right image is on the right.
ƒ Cross-eye viewing: the right image is on the left and the left image is on the right.
• Most likely the simpler method.
Learning Cross Eye Viewing
[Figure: stereo pair arranged for cross-eye viewing, with the right 2D image on the left and the left 2D image on the right]
ƒ Take a pencil and hold it in front of you, centered between your eyes.
ƒ Look at the pencil and slowly change its distance to your eyes.
ƒ Once you have found the right distance, you see a third image in between the left and right images.
ƒ This third image is in 3D.
ƒ Practise, it is worth the effort.
3D on YouTube
Computational Stereo
[Figure: a pair of cameras separated by a displacement (the stereo baseline) records a left and a right 2D image; a computer takes the place of the brain and produces the 3D view.]

Replace human eyes with a pair of slightly displaced cameras.

How can we accomplish a fully automatic 2D to 3D conversion?
What is Disparity?
ƒ The amount by which a single pixel is displaced between the two images is called disparity.
ƒ A pixel’s disparity is inversely proportional to its depth in the scene.
ƒ Background pixels have a small disparity, foreground pixels have a large disparity.
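The inverse relation between disparity and depth can be made concrete with a small sketch. It assumes a rectified camera pair, for which depth is Z = f * B / d, with f the focal length in pixels, B the stereo baseline and d the disparity in pixels. The formula and the parameter names are not given on the slides and are stated here as assumptions.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in metres of a pixel with the given disparity (rectified pair).

    Z = f * B / d; a disparity of zero corresponds to a point at infinity.
    """
    if disparity_px <= 0:
        return float("inf")
    return focal_length_px * baseline_m / disparity_px


# Doubling the disparity halves the depth.
near = depth_from_disparity(100, focal_length_px=1000, baseline_m=0.5)
far = depth_from_disparity(50, focal_length_px=1000, baseline_m=0.5)
print(near, far)
```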
Disparity Encoding
ƒ The disparity of each pixel is encoded by a grey value.
ƒ High grey values represent high disparities (and low grey values small disparities).
ƒ The resulting image is called a disparity map.
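A possible encoding is sketched below, assuming 8-bit grey values and a linear scaling so that the largest allowed disparity maps to 255. The scaling rule and the name `disparity_to_grey` are illustrative assumptions; benchmarks may prescribe their own scale factor.

```python
def disparity_to_grey(disparity, max_disp):
    """Encode a disparity map (list of rows) as 8-bit grey values.

    Disparity 0 becomes black (0) and max_disp becomes white (255).
    """
    scale = 255 / max_disp
    return [[min(255, round(d * scale)) for d in row] for row in disparity]


print(disparity_to_grey([[0, 1, 5]], max_disp=5))
```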
Disparity and Depth
ƒ The disparity map contains sufficient information for generating a 3D model.
ƒ The challenging part is to compute the disparity map. This task is known as the stereo matching problem.
ƒ Stereo matching will be the topic of this lecture!
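How a disparity map turns into a 3D model can be sketched as a back-projection to a point cloud. The sketch assumes a rectified pair and a pinhole camera with focal length f (in pixels), baseline B and principal point (cx, cy); none of these quantities appear on the slides, so they are assumptions, as is the function name. Depth follows from Z = f * B / d, and X and Y follow from the pixel position.

```python
def disparity_map_to_points(disparity, focal_length_px, baseline_m, cx, cy):
    """Back-project every pixel with a positive disparity to a 3D point.

    Returns a list of (X, Y, Z) tuples in metres in the left camera frame.
    """
    points = []
    for y, row in enumerate(disparity):
        for x, d in enumerate(row):
            if d <= 0:
                continue  # no valid match or point at infinity
            z = focal_length_px * baseline_m / d
            points.append(((x - cx) * z / focal_length_px,
                           (y - cy) * z / focal_length_px,
                           z))
    return points


print(disparity_map_to_points([[0, 100]], 1000, 0.5, cx=0, cy=0))
```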
Applications
(just a few examples)
3D Reconstruction from Aerial Images
ƒ Stereo cameras are mounted on an airplane to obtain a terrain map.
ƒ Images taken from http://www.robotic.de/Heiko.Hirschmueller/
3D Reconstruction of Cities
ƒ City of Dubrovnik reconstructed from images taken from Flickr in a fully automatic way.
ƒ [S. Agarwal, N. Snavely, I. Simon, S. Seitz and R. Szeliski, “Building Rome in a Day”, ICCV, 2009]
Driver Assistance / Autonomous Driving
ƒ For example, use stereo to measure the distance to other cars.
ƒ DARPA Grand Challenge
ƒ Image taken from http://www.cs.auckland.ac.nz/~rklette/talks/08_AI.pdf
The Mars Rover

ƒ Reconstruct the surface of Mars using stereo vision


Human Motion Capture
ƒ Fit a 3D model of the human body to the computed point cloud.
ƒ [R. Plänkers and P. Fua, “Articulated Soft Objects for Multiview Shape and Motion Capture”, PAMI, 2003]
Bilayer Segmentation – Z-Keying
ƒ Goal: Divide the image into a foreground and a background region.
ƒ Simple background subtraction will fail if there is motion in the background.
ƒ Solution:
• Compute a depth map.
• If the depth of a pixel is smaller than a predefined threshold (i.e. the pixel is close to the camera), the pixel belongs to the foreground.
ƒ [A. Criminisi, G. Cross, A. Blake and V. Kolmogorov, “Bilayer Segmentation of Live Video”, CVPR, 2006]
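The thresholding step can be sketched as follows. This is a deliberately simple illustration, not the method of the cited paper: it assumes the depth map is a list of rows of depths in metres and that foreground pixels are the ones closer to the camera, i.e. with a depth below the threshold. The names `z_key` and `replace_background` are hypothetical.

```python
def z_key(depth, threshold):
    """Foreground mask: True where the pixel is closer than the threshold."""
    return [[z < threshold for z in row] for row in depth]


def replace_background(image, mask, background_value):
    """Keep foreground pixels and overwrite everything else."""
    return [
        [pixel if is_fg else background_value
         for pixel, is_fg in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]


mask = z_key([[1.0, 5.0]], threshold=2.0)
print(replace_background([[200, 90]], mask, background_value=0))
```

Because the decision depends only on depth, motion behind the threshold plane does not affect the mask.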
Novel View Generation
[Figure: left view (recorded), virtual interpolated view (not recorded), right view (recorded)]
ƒ Given a 3D model of the scene, one can use a virtual camera to record new views from arbitrary viewpoints.
ƒ For example: the freeze-frame effect from the movie The Matrix.
ƒ [L. Zitnick, S. Kang, M. Uyttendaele, S. Winder, and R. Szeliski, "High-quality video view interpolation using a layered representation", SIGGRAPH, 2004]
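A very reduced form of view interpolation is forward warping of the left view along the disparity. The sketch below is not the layered method of the cited paper: it assumes a rectified pair, non-negative disparities for the left view, and a virtual camera on the baseline at fraction `alpha` between the left (0) and right (1) camera. Pixels that receive no value stay `None`, and the name `interpolate_view` is hypothetical.

```python
def interpolate_view(left, disparity, alpha):
    """Forward-warp the left view towards the right view by fraction alpha.

    Each pixel moves by alpha times its disparity. If several pixels land on
    the same target, the one with the larger disparity (closer to the camera)
    wins, because it occludes the others.
    """
    height, width = len(left), len(left[0])
    view = [[None] * width for _ in range(height)]
    winner = [[-1.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            d = disparity[y][x]
            target = round(x - alpha * d)
            if 0 <= target < width and d > winner[y][target]:
                winner[y][target] = d
                view[y][target] = left[y][x]
    return view


print(interpolate_view([[10, 20, 30, 40]], [[0, 0, 2, 2]], alpha=0.5))
```

The remaining `None` entries are the holes that a real system has to fill, for example from the right view.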
Understanding Human Vision
ƒ If we can teach the computer to see in 3D, we can also learn more about how humans perceive depth.
Summary
ƒ 3D Perception
• Principle of human 3D vision
• Ways for watching movies in 3D
ƒ Computational Stereo
• Stereo Matching Problem
ƒ Applications
