Deeplearning - Ai

Copyright Notice
These slides are distributed under the Creative Commons License.
DeepLearning.AI makes these slides available for educational purposes. You may not use or distribute
these slides for commercial purposes. You may make copies of these slides and use or distribute them for
educational purposes as long as you cite DeepLearning.AI as the source of the slides.
For the rest of the details of the license, see https://creativecommons.org/licenses/by-sa/2.0/legalcode

Introduction to ML strategy
Why ML Strategy?
deeplearning.ai
Motivating example
Ideas:
• Collect more data
• Collect more diverse training set
• Train algorithm longer with gradient descent
• Try Adam instead of gradient descent
• Try bigger network
• Try smaller network
• Try dropout
• Add L2 regularization
• Network architecture
• Activation functions
• # hidden units
• …
Andrew Ng
Introduction to ML strategy
Orthogonalization
TV tuning example
Car
Chain of assumptions in ML
Fit training set well on cost function
Fit dev set well on cost function
Fit test set well on cost function
Performs well in real world
Setting up your goal
Single number evaluation metric
Using a single number evaluation metric
[Figure: iteration loop of Idea -> Experiment -> Code]
Classifier  Precision  Recall  F1 Score
A           95%        90%     92.4%
B           98%        85%     91.0%
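As a minimal sketch of how the F1 column combines the other two, the snippet below computes F1 as the harmonic mean of precision and recall, using the values from the table. The names `f1_score`, `classifiers` and `best` are illustrative and do not come from the slides.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the table.
classifiers = {"A": (0.95, 0.90), "B": (0.98, 0.85)}

f1_by_classifier = {name: f1_score(p, r) for name, (p, r) in classifiers.items()}

# With a single number per classifier, picking the better one is a one-liner.
best = max(f1_by_classifier, key=f1_by_classifier.get)

for name, f1 in f1_by_classifier.items():
    print(f"{name}: F1 = {f1:.1%}")
print("Best by F1:", best)
```

Ranking on one number avoids having to argue about whether A's better recall outweighs B's better precision.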
Another example
Algorithm  US   China  India  Other  Average
A          3%   7%     5%     9%     6%
B          5%   6%     5%     10%    6.5%
C          2%   3%     4%     5%     3.5%
D          5%   8%     7%     2%     5.5%
E          4%   5%     2%     4%     3.75%
F          7%   11%    8%     12%    9.5%
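The Average column can be reproduced with a short sketch like the one below, which assumes a plain unweighted mean over the four regions; the slides do not say whether the regions should be weighted by traffic. The variable names are illustrative.

```python
# Error rates in percent per region (US, China, India, Other), copied from the table.
errors = {
    "A": [3, 7, 5, 9],
    "B": [5, 6, 5, 10],
    "C": [2, 3, 4, 5],
    "D": [5, 8, 7, 2],
    "E": [4, 5, 2, 4],
    "F": [7, 11, 8, 12],
}

# Unweighted mean error per algorithm (an assumption; a traffic-weighted mean is also plausible).
average_error = {name: sum(vals) / len(vals) for name, vals in errors.items()}

# Lower error is better, so the best algorithm minimizes the average.
best = min(average_error, key=average_error.get)

for name, avg in sorted(average_error.items(), key=lambda item: item[1]):
    print(f"{name}: {avg:.2f}%")
print("Lowest average error:", best)
```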
Setting up your goal
Satisficing and optimizing metrics
Another cat classification example
Classifier  Accuracy  Running time
A           90%       80 ms
B           92%       95 ms
C           95%       1,500 ms
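One way to read this table is to treat accuracy as the optimizing metric and running time as a satisficing metric that only has to stay under a threshold. The sketch below assumes a 100 ms limit, which is a hypothetical value and not stated in the extracted slide text; `pick_classifier` is likewise an illustrative name.

```python
# (accuracy, running time in ms) copied from the table.
classifiers = {"A": (0.90, 80), "B": (0.92, 95), "C": (0.95, 1500)}

# Assumed satisficing threshold; the slide text itself gives no number.
MAX_RUNNING_TIME_MS = 100


def pick_classifier(candidates, max_ms):
    """Return the most accurate classifier whose running time is at most max_ms."""
    feasible = {name: acc for name, (acc, ms) in candidates.items() if ms <= max_ms}
    if not feasible:
        return None  # nothing meets the satisficing constraint
    return max(feasible, key=feasible.get)


chosen = pick_classifier(classifiers, MAX_RUNNING_TIME_MS)
print("Chosen classifier:", chosen)
```

Under that assumed limit, C is excluded despite having the highest accuracy, and the choice is made among the remaining classifiers on accuracy alone.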
Setting up your goal
Train/dev/test distributions
Cat classification dev/test sets
Regions:
• US
• UK
• Other Europe
• South America
• India
• China
• Other Asia
• Australia
[Figure: iteration loop of Idea -> Experiment -> Code]
True story (details changed)
Optimizing on dev set on loan approvals for medium income zip codes
Tested on low income zip codes
Guideline
Choose a dev set and test set to reflect data you expect to get in the future and consider important to do well on.
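One common way to keep the dev and test sets on the same distribution is to pool the data from every region, shuffle it, and only then split. The sketch below illustrates that idea; the slides do not spell out a procedure, so `split_dev_test`, the 50/50 split and the placeholder examples are all assumptions.

```python
import random


def split_dev_test(examples, dev_fraction=0.5, seed=0):
    """Shuffle the pooled examples, then cut them into a dev set and a test set."""
    pooled = list(examples)
    random.Random(seed).shuffle(pooled)  # seeded so the split is reproducible
    cut = int(len(pooled) * dev_fraction)
    return pooled[:cut], pooled[cut:]


regions = ["US", "UK", "Other Europe", "South America",
           "India", "China", "Other Asia", "Australia"]

# Hypothetical placeholder data: 100 (region, example_id) pairs per region.
examples = [(region, i) for region in regions for i in range(100)]

dev_set, test_set = split_dev_test(examples)

print("dev size:", len(dev_set), "test size:", len(test_set))
print("regions in dev:", sorted({region for region, _ in dev_set}))
```

The alternative of assigning whole regions to dev and the rest to test would make the two sets come from different distributions, which is the failure described in the loan approval story.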
Setting up your goal
Size of dev and test sets
Old way of splitting data
Size of dev set
Set your dev set to be big enough to detect differences in the algorithms/models you’re trying out.
Size of test set
Set your test set to be big enough to give high confidence in the overall performance of your system.
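To get a rough feel for what "big enough" means, the sketch below estimates the margin of error on a measured error rate for a few set sizes. It assumes the normal approximation to the binomial and a 95% confidence level (z = 1.96); neither assumption comes from the slides, and the 5% error rate and the set sizes are hypothetical.

```python
import math


def error_margin(error_rate: float, n_examples: int, z: float = 1.96) -> float:
    """Approximate half-width of a confidence interval for a measured error rate.

    Uses the normal approximation to the binomial, which is an assumption
    that works reasonably only when n_examples is large.
    """
    return z * math.sqrt(error_rate * (1.0 - error_rate) / n_examples)


# Hypothetical error rate and set sizes, chosen only for illustration.
for n in (1_000, 10_000, 100_000):
    margin = error_margin(0.05, n)
    print(f"n = {n:>7}: 5% error measured to within about +/- {margin:.2%}")
```

The margin shrinks with the square root of the set size, so detecting smaller differences between models requires a disproportionately larger dev set.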
Setting up your goal
When to change dev/test sets and metrics
Cat dataset examples
Metric: classification error
Algorithm A: 3% error
Algorithm B: 5% error
Orthogonalization for cat pictures: anti-porn
1. So far we’ve only discussed how to define a metric to evaluate classifiers.
2. Worry separately about how to do well on this metric.
Another example
Algorithm A: 3% error
Algorithm B: 5% error
[Figure: dev/test images compared with user images]
If doing well on your metric + dev/test set does not correspond to doing well on your application, change your metric and/or dev/test set.
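One possible way to change a metric so that it tracks what the application cares about is to weight some mistakes more heavily than others. The sketch below is an illustration under that assumption; the extracted slide text does not give a formula, and the weights and data here are hypothetical.

```python
def weighted_error(predictions, labels, weights):
    """Weighted misclassification rate: sum of weights on wrong examples / sum of all weights."""
    total = sum(weights)
    wrong = sum(w for p, y, w in zip(predictions, labels, weights) if p != y)
    return wrong / total


# Hypothetical data: the third example is one the application must not get wrong,
# so it carries a larger (arbitrarily chosen) weight.
labels = [1, 0, 0, 1]
predictions = [1, 0, 1, 1]
weights = [1, 1, 10, 1]

plain = weighted_error(predictions, labels, [1] * len(labels))
weighted = weighted_error(predictions, labels, weights)

print(f"unweighted error: {plain:.3f}")
print(f"weighted error:   {weighted:.3f}")
```

With equal weights this reduces to ordinary classification error, so the change affects only how costly each kind of mistake is, not how the classifier is trained.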
Comparing to human-level performance
Why human-level performance?
Comparing to human-level performance
[Figure: accuracy plotted against time]
Why compare to human-level performance
Humans are quite good at a lot of tasks. So long as
ML is worse than humans, you can:
- Get labeled data from humans.
- Gain insight from manual error analysis: Why did a person get this right?
- Better analysis of bias/variance.
Comparing to human-level performance
Avoidable bias
Bias and Variance
[Figure: three panels labeled high bias, “just right”, and high variance]
Bias and Variance
Cat classification
Training set error:  1%   15%  15%  0.5%
Dev set error:       11%  16%  30%  1%
Cat classification example
Training error  8%   8%
Dev error       10%  10%
Comparing to human-level performance
Understanding human-level performance
Human-level error as a proxy for Bayes error
Medical image classification example:
Suppose:
(a) Typical human: 3% error
(b) Typical doctor: 1% error
(c) Experienced doctor: 0.7% error
(d) Team of experienced doctors: 0.5% error
What is “human-level” error?
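If human-level error is being used as a proxy for Bayes error, a reasonable reading is to take the lowest of the listed rates: Bayes error cannot be higher than an error rate that some group actually achieves. The sketch below encodes that reasoning; the variable names are illustrative.

```python
# Error rates from the list, expressed as fractions.
human_errors = {
    "Typical human": 0.03,
    "Typical doctor": 0.01,
    "Experienced doctor": 0.007,
    "Team of experienced doctors": 0.005,
}

# Bayes error is at most any error rate that is actually achieved,
# so the lowest observed human error is the tightest available upper bound.
bayes_proxy_source = min(human_errors, key=human_errors.get)
bayes_proxy = human_errors[bayes_proxy_source]

print(f"Proxy for Bayes error: {bayes_proxy:.1%} ({bayes_proxy_source})")
```

For other purposes, such as deciding whether a system is good enough to deploy, a different definition of human-level error may be more appropriate.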
Error analysis example
Training error
Dev error
Summary of bias/variance with human-level performance
Human-level error
Training error
Dev error
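The three quantities above define two gaps: avoidable bias (training error minus human-level error, with human-level error standing in for Bayes error) and variance (dev error minus training error). The sketch below computes both. The 8% training and 10% dev errors match the cat classification example, while the two human-level values are hypothetical, since the extracted text does not include them.

```python
def bias_variance_gaps(human_level_error, training_error, dev_error):
    """Return (avoidable_bias, variance), in the same units as the inputs."""
    avoidable_bias = training_error - human_level_error
    variance = dev_error - training_error
    return avoidable_bias, variance


# Errors in percent. The human-level values are hypothetical placeholders.
scenarios = {
    "human-level 1%": (1.0, 8.0, 10.0),
    "human-level 7.5%": (7.5, 8.0, 10.0),
}

for name, (human, train, dev) in scenarios.items():
    bias, variance = bias_variance_gaps(human, train, dev)
    print(f"{name}: avoidable bias = {bias:.1f} points, variance = {variance:.1f} points")
```

The same training and dev errors lead to different priorities depending on the human-level reference, which is why the reference is worth estimating carefully.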
Comparing to human-level performance
Surpassing human-level performance
Surpassing human-level performance
Team of humans
One human
Training error
Dev error
Problems where ML significantly surpasses human-level performance
- Online advertising
- Product recommendations
- Logistics (predicting transit time)
- Loan approvals
Comparing to human-level performance
Improving your model performance
The two fundamental assumptions of supervised learning
1. You can fit the training set pretty well.
2. The training set performance generalizes pretty well to the dev/test set.
Reducing (avoidable) bias and variance
Human-level
  Avoidable bias (gap between human-level and training error):
  • Train bigger model
  • Train longer/better optimization algorithms
  • NN architecture/hyperparameters search
Training error
  Variance (gap between training and dev error):
  • More data
  • Regularization
  • NN architecture/hyperparameters search
Dev error
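As a rough illustration of how this slide might be used, the sketch below maps the larger of the two gaps to the corresponding list of tactics. The tactic lists are taken from the slide; the function name `suggest_tactics` and the rule of simply working on the larger gap first (with ties going to bias) are assumptions, not something the slides prescribe.

```python
BIAS_TACTICS = [
    "Train bigger model",
    "Train longer/better optimization algorithms",
    "NN architecture/hyperparameters search",
]

VARIANCE_TACTICS = [
    "More data",
    "Regularization",
    "NN architecture/hyperparameters search",
]


def suggest_tactics(avoidable_bias, variance):
    """Pick the larger gap and return (focus, tactics). Ties go to bias (an assumption)."""
    if avoidable_bias >= variance:
        return "avoidable bias", BIAS_TACTICS
    return "variance", VARIANCE_TACTICS


# Hypothetical gaps in percentage points, for illustration only.
focus, tactics = suggest_tactics(avoidable_bias=0.5, variance=2.0)
print("Focus on:", focus)
for tactic in tactics:
    print(" -", tactic)
```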
