Computer Vision Nanodegree Syllabus: Before You Start

Computer Vision Nanodegree

Syllabus
Become a Computer Vision Expert

Welcome to the Computer Vision Nanodegree program!
Before You Start

Educational Objectives: In this program, you’ll learn the underlying math and programming concepts that
drive pattern recognition, object and image classification tasks, and object tracking systems. This course will
cover the latest in deep learning architectures used in industry, and you’ll combine current computer vision
and deep learning techniques to power a variety of applications. With the practical skills you gain in this
course, you’ll be able to program your own applications, extract information from any kind of image and
spatial data, and solve real-world challenges.

Prerequisite Knowledge: To succeed in this program, we recommend significant experience with Python and
entry-level experience with probability, statistics, and deep learning architectures.
Specifically, we expect you to be able to write a class in Python and to add comments to your code for
others to read. You should also be familiar with the term “neural networks” and understand the differential
math that drives backpropagation. If you feel you need to add to your Python and statistics skills, we suggest
our Machine Learning program. If you’d like to learn more about neural networks and backpropagation,
consider our Deep Learning program.

Length of Program: The program consists of 1 term lasting 3 months. We expect students to work 10
hours/week on average. Make sure to set aside adequate time on your calendar for focused work.

Instructional Tools Available: Video lectures, Jupyter notebooks, personalized project reviews.

Contact Info
While going through the program, if you have questions about anything, you can reach us at
[email protected].

version 1.0

Nanodegree Program Info

This program is designed to enhance your existing machine learning and deep learning skills with the
addition of computer vision theory and programming techniques. These computer vision skills can be
applied to various applications such as image and video processing, autonomous vehicle navigation, medical
diagnostics, smartphone apps, and much more. This program will not prepare you for a specific career or
role; rather, it will grow your deep learning and computer vision expertise and give you the skills you need
to start applying computer vision techniques to real-world challenges and applications.

The term consists of 3 courses and 3 projects, which are described in detail below. Building a project is
one of the best ways to demonstrate the skills you've learned, and each project will contribute to a
professional portfolio that shows potential employers your mastery of computer vision and deep
learning techniques.

Length of Program: 120 Hours*
Number of Reviewed Projects: 3

* The length of this program is an estimate of the total hours the average student may take to complete all
required coursework, including lecture and project time. Actual hours may vary.

Projects
Throughout this Nanodegree program, you'll master valuable skills by building the following projects:

● Facial Keypoint Detection
● Automatic Image Captioning
● Landmark Detection and Tracking

In the sections below, you'll find a detailed description of each project along with the course material that
presents the skills required to complete the project.

Project: Facial Keypoint Detection

Use image processing and deep learning techniques to detect faces in an image and find facial
keypoints, such as the position of the eyes, nose, and mouth on a face.

This project tests your knowledge of image processing and feature extraction techniques that allow you to
programmatically represent different facial features. You’ll also use your knowledge of deep learning
techniques to program a convolutional neural network to recognize facial keypoints. Facial keypoints include
points around the eyes, nose, and mouth on any face and are used in many applications, from facial
tracking to emotion recognition.

Introduction to Computer Vision

Lesson Title / Learning Outcomes

INTRODUCTION TO COMPUTER VISION
● Learn where computer vision techniques are used in industry.
● Prepare for the course ahead with a detailed topic overview.
● Start programming your own applications!

IMAGE REPRESENTATION AND ANALYSIS
● See how images are represented numerically.
● Implement image processing techniques like color and geometric transforms.
● Program your own convolutional kernel for object edge-detection.

CONVOLUTIONAL NN LAYERS
● Learn about the layers of a deep convolutional neural network: convolutional, max-pooling, and fully-connected layers.
● Build a CNN-based image classifier in PyTorch.
● Learn about layer activation and feature visualization techniques.

FEATURES AND OBJECT RECOGNITION
● Learn why distinguishing features are important in pattern and object recognition tasks.
● Write code to extract information about an object’s color and shape.
● Use features to identify areas on a face and to recognize the shape of a car or pedestrian on a road.

IMAGE SEGMENTATION
● Implement k-means clustering to break an image up into parts.
● Find the contours and edges of multiple objects in an image.
● Learn about background subtraction for video.
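
As an illustration of the "convolutional kernel for object edge-detection" outcome listed for this course, here is a minimal sketch in plain Python. It is not taken from the course materials: the function name `filter2d`, the choice of a Sobel kernel, and the tiny test image are assumptions made for this example. Like most image libraries, it slides the kernel without flipping it (cross-correlation) and keeps only the region where the kernel fits entirely inside the image.

```python
# Sobel kernel that responds to left-to-right intensity changes (vertical edges).
SOBEL_X = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]


def filter2d(image, kernel):
    """Slide `kernel` over `image` (a list of rows of numbers).

    Hypothetical helper for illustration: no padding, no kernel flip,
    so the output is smaller than the input by (kernel size - 1).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            # Weighted sum of the 3x3 neighborhood under the kernel.
            for i in range(kh):
                for j in range(kw):
                    total += kernel[i][j] * image[r + i][c + j]
            row.append(total)
        output.append(row)
    return output


# Tiny grayscale image: dark left half, bright right half.
image = [[0, 0, 0, 255, 255, 255] for _ in range(4)]

edges = filter2d(image, SOBEL_X)
for row in edges:
    print(row)
```

The response is zero in the flat regions and large where the window straddles the dark-to-bright boundary, which is the behavior an edge detector is expected to show.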

Project: Automatic Image Captioning

Combine CNN and RNN knowledge to build a deep learning model that produces captions given an input
image.

Image captioning requires that you create a complex deep learning model with two components: a CNN that
transforms an input image into a set of features, and an RNN that turns those features into rich, descriptive
language. In this project, you will implement these cutting-edge deep learning architectures.

Advanced Computer Vision and Deep Learning

Lesson Title / Learning Outcomes

ADVANCED CNN ARCHITECTURES
● Learn about advances in CNN architectures.
● See how region-based CNNs, such as Faster R-CNN, have allowed for fast, localized object recognition in images.
● Work with a YOLO/single-shot object detection system.

RECURRENT NEURAL NETWORKS
● Learn how recurrent neural networks learn from ordered sequences of data.
● Implement an RNN for sequential text generation.
● Explore how memory can be incorporated into a deep learning model.
● Understand where RNNs are used in deep learning applications.

ATTENTION MECHANISMS
● Learn how attention allows models to focus on a specific piece of input data.
● Understand where attention is useful in natural language and computer vision applications.

IMAGE CAPTIONING
● Learn how to combine CNNs and RNNs to build a complex captioning model.
● Implement an LSTM for caption generation.
● Train a model to predict captions and understand a visual scene.

Project: Landmark Detection and Tracking

Use feature detection and keypoint descriptors to build a map of the environment with SLAM (simultaneous
localization and mapping).

Implement a robust method for tracking an object over time, using elements of probability, motion models,
and linear algebra. This project tests your knowledge of localization techniques that are widely used in
autonomous vehicle navigation.

Object Tracking and Localization

Lesson Title / Learning Outcomes

OBJECT MOTION AND TRACKING
● Learn how to programmatically track a single point over time.
● Understand motion models that define object movement over time.
● Learn how to analyze videos as sequences of individual image frames.

OPTICAL FLOW AND FEATURE MATCHING
● Implement a method for tracking a set of unique features over time.
● Learn how to match features from one image frame to another.
● Track a moving car using optical flow.

ROBOT LOCALIZATION
● Use Bayesian statistics to locate a robot in space.
● Learn how sensor measurements can be used to safely navigate an environment.
● Understand Gaussian uncertainty.
● Implement a histogram filter for robot localization in Python.

GRAPH SLAM
● Identify landmarks and build up a map of an environment.
● Learn how to simultaneously localize an autonomous vehicle and create a map of landmarks.
● Implement move and sense functions for a robotic vehicle.
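
To give a feel for the "histogram filter" and "move and sense functions" outcomes listed for this course, here is a minimal one-dimensional sketch in plain Python. It is an illustration only, not the course's implementation: the cyclic five-cell world, the color labels, and all probability values (`p_hit`, `p_miss`, `p_exact`, `p_under`, `p_over`) are assumptions chosen for this example.

```python
def sense(belief, world, measurement, p_hit=0.6, p_miss=0.2):
    """Bayes update: weight each cell by how well it matches the
    measurement, then normalize so the belief sums to 1."""
    weighted = [
        prob * (p_hit if cell == measurement else p_miss)
        for prob, cell in zip(belief, world)
    ]
    total = sum(weighted)
    return [w / total for w in weighted]


def move(belief, step, p_exact=0.8, p_under=0.1, p_over=0.1):
    """Shift the belief `step` cells to the right on a cyclic world,
    allowing for under- and overshooting by one cell."""
    n = len(belief)
    moved = []
    for i in range(n):
        prob = p_exact * belief[(i - step) % n]      # moved exactly `step`
        prob += p_over * belief[(i - step - 1) % n]  # moved one cell too far
        prob += p_under * belief[(i - step + 1) % n] # moved one cell too few
        moved.append(prob)
    return moved


# Hypothetical map: the robot can only observe the color of its current cell.
world = ["green", "red", "red", "green", "green"]
belief = [1 / len(world)] * len(world)  # start fully uncertain

belief = sense(belief, world, "red")  # measurement sharpens the belief
belief = move(belief, 1)              # motion shifts and blurs it
print([round(p, 3) for p in belief])
```

Sensing concentrates probability on the cells that match the measurement, while moving shifts that probability and spreads it out. Alternating the two steps is the core loop of this style of localization.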
