The Three R's of Computer Vision:: Jitendra Malik UC Berkeley
Jitendra Malik
UC Berkeley
Recognition, Reconstruction & Reorganization
Recognition
Reconstruction
Reorganization
The Three R’s of Vision
Recognition
Reconstruction Reorganization
Will person B put some money into Person C’s tip bag?
Different aspects of vision
• Perception: study the “laws of seeing” - predict what a human would perceive in an image.
• Neuroscience: understand the mechanisms in the retina and the brain.
• Function: how the laws of optics, and the statistics of the world we live in, make certain interpretations of an image more likely to be valid.
Kinect (PrimeSense)
Velodyne Lidar
Agarwal et al. (2010)
Frahm et al. (2010)
Semantic Segmentation is needed to make this more useful…
Some Pictorial Cues
Shading
Cast Shadows
The Visual Pathway
Hubel and Wiesel (1962) discovered orientation-sensitive neurons in V1
Block Diagram of the Primate Visual System
Superpixel assemblies as candidates
Bottom-up grouping as input to recognition
Input image → Extract region proposals (~2k / image) → Compute CNN features → Classify regions (linear SVM)
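The four stages on this slide (input image, region proposals, per-region features, linear classification) can be sketched in a few lines of Python. This is only an illustration of the data flow, not the actual system: the sliding-window proposal generator, the mean-intensity "feature", and the hand-set weights are all hypothetical stand-ins for bottom-up grouping, CNN features, and a trained linear SVM.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1); hypothetical box format


def sliding_window_proposals(width: int, height: int, size: int, stride: int) -> List[Box]:
    """Stand-in proposal generator (an assumption, NOT a bottom-up grouping method)."""
    boxes = []
    for y0 in range(0, height - size + 1, stride):
        for x0 in range(0, width - size + 1, stride):
            boxes.append((x0, y0, x0 + size, y0 + size))
    return boxes


def mean_intensity_feature(image: Sequence[Sequence[float]], box: Box) -> List[float]:
    """Stand-in for CNN features: [mean pixel value inside the box, constant bias term]."""
    x0, y0, x1, y1 = box
    pixels = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return [sum(pixels) / len(pixels), 1.0]


def linear_score(weights: Sequence[float], feature: Sequence[float]) -> float:
    """Score of a linear classifier: dot product of weights and features."""
    return sum(w * f for w, f in zip(weights, feature))


def detect(image, weights, size=2, stride=1):
    """Run proposals -> features -> linear scoring; return (score, box) pairs, best first."""
    height, width = len(image), len(image[0])
    scored = []
    for box in sliding_window_proposals(width, height, size, stride):
        feature = mean_intensity_feature(image, box)
        scored.append((linear_score(weights, feature), box))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


# Toy 4x4 "image" with a bright 2x2 object; weights are hand-set, not learned.
image = [
    [0, 0, 0, 0],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 0, 0],
]
weights = [1.0, -4.0]  # positive score only when the mean intensity exceeds 4
detections = detect(image, weights)
best_score, best_box = detections[0]
```

In this toy setting the top-scoring region is the one that exactly covers the bright block. A real system would add non-maximum suppression and one classifier per category, both omitted here.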
Recognition Helps Reorganization
Results of Simultaneous Detection and Segmentation
Hariharan, Arbelaez, Girshick & Malik (2014)
Score
Original detection
Regress boxes
Score
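The "Regress boxes" step can be illustrated with a small sketch. The slide does not say how the regression is parameterized, so this assumes the center-and-log-size offsets commonly used in region-based detectors: a predicted `(dx, dy, dw, dh)` shifts the box center by a fraction of its size and rescales width and height exponentially. The function name and box format are hypothetical.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1); hypothetical box format


def apply_box_deltas(box: Box, deltas: Tuple[float, float, float, float]) -> Box:
    """Refine a detection box with regression offsets (assumed parameterization)."""
    x0, y0, x1, y1 = box
    dx, dy, dw, dh = deltas
    w, h = x1 - x0, y1 - y0
    cx, cy = x0 + 0.5 * w, y0 + 0.5 * h
    # Shift the center by a fraction of the box size.
    new_cx = cx + dx * w
    new_cy = cy + dy * h
    # Rescale width and height in log space so they stay positive.
    new_w = w * math.exp(dw)
    new_h = h * math.exp(dh)
    return (
        new_cx - 0.5 * new_w,
        new_cy - 0.5 * new_h,
        new_cx + 0.5 * new_w,
        new_cy + 0.5 * new_h,
    )


original = (0.0, 0.0, 10.0, 20.0)
unchanged = apply_box_deltas(original, (0.0, 0.0, 0.0, 0.0))
widened = apply_box_deltas(original, (0.0, 0.0, math.log(2.0), 0.0))
```

Zero offsets leave the box untouched, and `dw = log 2` doubles the width around the same center. After regression the refined box would be scored again, as the slide's second "Score" suggests.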
Actions and Attributes from Wholes and Parts
G. Gkioxari, R. Girshick & J. Malik
Finding Human Body Joints
Viewpoint Prediction for Objects
Tulsiani & Malik (2014)
The columns show the 15th, 30th, 45th, 60th, 75th and 90th percentile instances, respectively, in terms of the error.
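Picking which instances to display at given error percentiles can be sketched as follows. The slide does not state which percentile convention was used, so this assumes the nearest-rank definition; the function name and the toy error values are hypothetical.

```python
from typing import List, Sequence


def instances_at_percentiles(errors: Sequence[float], percentiles: Sequence[int]) -> List[int]:
    """Return the index of the instance sitting at each error percentile.

    Uses the nearest-rank convention (an assumption): the p-th percentile is
    the instance at rank ceil(p * n / 100) when instances are sorted by error.
    """
    n = len(errors)
    order = sorted(range(n), key=lambda i: errors[i])  # instance indices, lowest error first
    picked = []
    for p in percentiles:
        rank = max(1, (p * n + 99) // 100)  # integer ceiling of p * n / 100
        picked.append(order[rank - 1])
    return picked


# Toy data: 20 instances whose error decreases with the index.
errors = [float(e) for e in range(20, 0, -1)]
columns = instances_at_percentiles(errors, [15, 30, 45, 60, 75, 90])
column_errors = [errors[i] for i in columns]
```

Showing instances at evenly spaced percentiles, rather than only the best ones, gives a view of typical and poor predictions alongside the good ones.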
Keypoint Prediction for Objects
• Idea: Deform a mesh to satisfy silhouettes from different viewpoints
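One ingredient of this idea, checking whether a mesh agrees with a set of silhouettes, can be sketched with a simplified stand-in. The code below is not the deformation method itself: it assumes axis-aligned orthographic cameras and integer vertex coordinates, and it only counts vertices that project outside a binary silhouette mask (a one-sided check). A deformation procedure would try to drive such a count to zero. All names are hypothetical.

```python
from typing import List, Sequence, Tuple

Vertex = Tuple[int, int, int]
Mask = Sequence[Sequence[int]]  # mask[v][u] == 1 inside the silhouette


def project(vertex: Vertex, drop_axis: int) -> Tuple[int, int]:
    """Orthographic projection along one coordinate axis (assumed camera model)."""
    u, v = [c for axis, c in enumerate(vertex) if axis != drop_axis]
    return u, v


def silhouette_violations(vertices: Sequence[Vertex], views: Sequence[Tuple[int, Mask]]) -> int:
    """Count (view, vertex) pairs whose projection lands outside the silhouette."""
    count = 0
    for drop_axis, mask in views:
        for vertex in vertices:
            u, v = project(vertex, drop_axis)
            in_bounds = 0 <= v < len(mask) and 0 <= u < len(mask[0])
            if not (in_bounds and mask[v][u] == 1):
                count += 1
    return count


# Toy silhouette: a 2x2 square in the corner of a 3x3 mask, seen from two views.
square = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 0],
]
views = [(2, square), (0, square)]  # look along z, then along x

cube: List[Vertex] = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
stretched = cube + [(2, 0, 0)]  # one vertex pulled outside the first silhouette

cube_violations = silhouette_violations(cube, views)
stretched_violations = silhouette_violations(stretched, views)
```

The unit cube is consistent with both views, while the stretched vertex violates the view along z but not the view along x, since its extra extent lies along the viewing direction there.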
“Is there a dog in the image?”
“Is there a dog and where is it in the image?”
“Is there a person diving in the video?”
“Is there a person diving and where is it in the video?”
Results on UCF Sports (Gkioxari & Malik, 2014)
Tracking error
Scene Understanding using RGB-D data
Gupta, Girshick, Arbelaez, Malik (ECCV 2014)
Pose Estimation
Instance Segmentation
I hope you enjoy the course!