Gist: A Mobile Robotics Application of Context-Based Vision in Outdoor Environment


Christian Siagian
Laurent Itti
University of Southern California, CA, USA
 Mobile robot localization
 Biological approach to vision
 Gist model
 Testing and results
 Discussion and conclusion
Mobile Robot Localization
 Where are we?

 Localization
Mobile Robot Localization
 Indoors: strong assumptions of flat walls, narrow hallways, and solid angles
• Ranging sensors (laser and sonar) for mapping
 Outdoors: less conforming set of surfaces
• Ranging sensors are less effective; vision is better
Robot Vision Localization
 Object-based Vision Localization
 Objects as landmarks
Robot Vision Localization
 Region-based Vision Localization
 Regions as landmarks
Robot Vision Localization
 Scene-based Vision Localization
 Whole scenes as landmarks
 Color histograms [Ulrich
and Nourbakhsh 2000]
 Fourier Transform
[Oliva & Torralba 2001]
 Wavelet pyramids
[Torralba 2003]
 Histogram of dominant
features [Renninger &
Malik 2004]
 Definition and background
 Essence, holistic characteristics of an image
 Context information obtained within an eye saccade
(approx. 150 ms)
 Evidence of place-recognizing cells in the
Parahippocampal Place Area (PPA)
 Biologically plausible models of Gist are yet to be
 Nature of tasks done with gist
 Scene categorization/context recognition
 Region priming/layout recognition
 Resolution/scale selection
Human Vision
 Visual Cortex:
 Low level filters,
center-surround, and
 Saliency Model:
 Attend to pertinent
 Gist Model:
 Compute image
general characteristics
 High Level Vision:
 Object recognition
 Layout recognition
 Scene understanding
Gist Model
 Utilize the same Visual Cortex raw features in
the saliency model [Itti 2001]
 Gist is theoretically non-redundant with Saliency
 Gist vs. Saliency
 Instead of looking at the most conspicuous locations
in the image, gist looks at the scene as a whole
 Detection of regularities, not irregularities
 Cooperation (Accumulation) vs. competition
(WTA) among locations
 More spatial emphasis in saliency
 Local vs. global/regional interaction
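The contrast between cooperation (accumulation) and competition (WTA) can be illustrated on a single feature map. This is a deliberately simplified sketch: the function names and the toy map are hypothetical, and a real saliency model involves more stages than one arg-max.

```python
def winner_take_all(feature_map):
    """Saliency-style competition: location of the single strongest response."""
    best = max((value, (y, x))
               for y, row in enumerate(feature_map)
               for x, value in enumerate(row))
    return best[1]

def accumulate(feature_map):
    """Gist-style cooperation: mean response over all locations."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)

# Hypothetical 3x3 feature map with one strong central response.
toy_map = [[0.1, 0.2, 0.1],
           [0.2, 0.9, 0.2],
           [0.1, 0.2, 0.1]]

print(winner_take_all(toy_map))  # (1, 1): only the peak location survives
print(accumulate(toy_map))       # every location contributes to the average
```

The first function discards everything except the peak, while the second keeps a summary of the whole map, which is the sense in which gist detects regularities rather than irregularities.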
Gist Model
 V1 Raw image feature-Maps
 Orientation Channel
• Gabor filters at 4 angles
(0°, 45°, 90°, 135°) on 4 scales
= 16 sub-channels
 Color:
• red-green and blue-yellow
center surround each with
6 scale combinations
= 12 sub-channels
 Intensity
• dark-bright center-surround
with 6 scale combinations
= 6 sub-channels
= Total of 34 sub-channels
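The sub-channel count can be checked by enumerating the channels. The sketch below takes the angle, scale, and combination counts from the slide; the specific center and surround pyramid levels (centers 2 to 4, surround offsets 3 and 4) and the scale indices 0 to 3 are assumptions borrowed from the usual saliency-model setup, not something stated here.

```python
from itertools import product

# Orientation: 4 Gabor angles x 4 scales (counts from the slide).
ANGLES_DEG = (0, 45, 90, 135)
ORIENTATION_SCALES = range(4)  # assumed scale indices 0..3

# Assumed center-surround pairs: center c in {2, 3, 4}, surround s = c + d
# with d in {3, 4}, which yields the 6 scale combinations.
CENTER_SURROUND = [(c, c + d) for c in (2, 3, 4) for d in (3, 4)]

def list_sub_channels():
    """Return one label per raw feature sub-channel."""
    channels = [f"orientation/{a}deg/scale{s}"
                for a, s in product(ANGLES_DEG, ORIENTATION_SCALES)]
    for opponency in ("red-green", "blue-yellow"):
        channels += [f"color/{opponency}/c{c}-s{s}" for c, s in CENTER_SURROUND]
    channels += [f"intensity/dark-bright/c{c}-s{s}" for c, s in CENTER_SURROUND]
    return channels

sub_channels = list_sub_channels()
print(len(sub_channels))  # 16 + 12 + 6 = 34
```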
Gist Model Implementation
 Gist Feature Extraction
 Average values over the cells of a predetermined grid
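Grid averaging can be sketched as follows. The 4 x 4 grid is an assumption chosen to match the 16 features per sub-channel reported in the talk; the function names and the constant-valued toy maps are hypothetical.

```python
def grid_means(feature_map, rows=4, cols=4):
    """Average a 2-D feature map over a rows x cols grid of cells.

    The map must be at least rows x cols in size so that no cell is empty.
    """
    height, width = len(feature_map), len(feature_map[0])
    means = []
    for r in range(rows):
        top, bottom = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            left, right = c * width // cols, (c + 1) * width // cols
            cell = [feature_map[y][x]
                    for y in range(top, bottom)
                    for x in range(left, right)]
            means.append(sum(cell) / len(cell))
    return means

def gist_vector(feature_maps):
    """Concatenate the grid means of every sub-channel map."""
    return [m for fmap in feature_maps for m in grid_means(fmap)]

# 34 hypothetical 8x8 maps; map k is filled with the constant k.
maps = [[[float(k)] * 8 for _ in range(8)] for k in range(34)]
print(len(gist_vector(maps)))  # 34 * 16 = 544
```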
Gist Model
 Dimension Reduction
 Original:
34 sub-channels x
16 features
= 544 features
 PCA/ICA reduction:
80 features
• Kept >95% of variance
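The talk does not say how the 80 retained features were chosen. One common approach with PCA, shown here purely as an assumption, is to keep the smallest number of leading components whose cumulative eigenvalue share reaches the target variance. The function name and the example spectrum are hypothetical and are not the talk's data.

```python
def components_for_variance(eigenvalues, target=0.95):
    """Smallest number of leading components whose variance share reaches target."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= target:
            return k
    return len(ordered)

# Hypothetical eigenvalue spectrum for illustration only.
spectrum = [60.0, 30.0, 6.0, 3.0, 1.0]
print(components_for_variance(spectrum))  # 3, since (60 + 30 + 6) / 100 = 0.96
```

Applied to the 544 eigenvalues of the gist feature covariance, the same rule would return the number of components to keep. Note that this criterion applies to PCA; ICA components are not ordered by variance.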
Gist Model
 Place Classification
 Three-layer neural network
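A forward pass through such a classifier can be sketched as below. The slide gives neither layer sizes, activation function, nor training details, so the tanh activation, the toy weights, and the reading of "three-layer" as three fully connected weight layers are all assumptions. In the actual system the input would be the 80 reduced gist features and the output units would correspond to place segments.

```python
import math

def dense(inputs, weights, biases):
    """Fully connected layer with tanh activation (activation is assumed)."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def classify(features, layers):
    """Propagate features through all layers; return the winning output index."""
    activations = features
    for weights, biases in layers:
        activations = dense(activations, weights, biases)
    return max(range(len(activations)), key=lambda i: activations[i])

# Hypothetical toy network: 2 inputs -> 2 -> 2 -> 3 outputs.
toy_layers = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    ([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]], [0.0, 0.0, 0.0]),
]

print(classify([2.0, 0.1], toy_layers))  # 0: the first output unit wins
```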
Testing & Results
 Site selection:
 Different challenges appearance-wise
 Variability in area covered/path
 Various lighting conditions
 Single-view filming
 Clean break between segments
 Scalability: combine all sites
Map of Experiment Sites
Site 1: Building Complex
Site 1 Experiment
[Figure panels: input image, gist feature vectors, PCA/ICA-reduced features, system output]

Site 1 Results

Output Label

Site 2: Vegetation-filled Park
Site 2 Result

Output Label

Site 2 Experiment
[Figure panels: input image, gist feature vectors, PCA/ICA-reduced features, system output]

Site 3: Open Field Park
Site 3 Experiment
[Figure panels: input image, gist feature vectors, PCA/ICA-reduced features, system output]

Site 3 Result

Output Label

Combined Sites Result
Discussion & Conclusion
 Result of current model:
 Success rate between 82.48% and 87.93%
 Combined rate of 85.96%
 4.73% error in inter-site classification
 Integrating saliency for robot navigation
 Localization within segment
• Identifying discriminating cues in the environment
• Issues in object-based systems still apply
 Bad view detection
• Foreground objects sometimes occlude whole view
 Obstacle avoidance, exploration, etc.
 Integration of gist and saliency in general
 Single representation of both models
 Influence of saliency on gist and vice versa
• Involvement of saliency in improving gist estimation
• Gist helpful in identifying/filtering salient locations
 Testing the limits of Gist: psychophysics
• Change blindness test for large scale layout changes
• Varying exposure time
• Isolation of bottom-up and top-down influences
