Backpropagation


SETTING THE LEARNING RATE

ADAPTIVE LEARNING RATES


ADAPTIVE LEARNING RATE ALGORITHMS

 Adagrad
 SGD (non-adaptive baseline)
 Adadelta
 Adam
 RMSProp
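
As a concrete reference for the list above, the sketch below shows how each of these optimizers could be instantiated with PyTorch's torch.optim. The model, learning rate, and data are illustrative placeholders, not part of the original slides.

```python
import torch
import torch.nn as nn

# Illustrative model and learning rate (placeholder values).
model = nn.Linear(10, 1)
lr = 0.01

# Each algorithm named above maps to an optimizer class in torch.optim.
optimizers = {
    "SGD":      torch.optim.SGD(model.parameters(), lr=lr),
    "Adagrad":  torch.optim.Adagrad(model.parameters(), lr=lr),
    "Adadelta": torch.optim.Adadelta(model.parameters(), lr=lr),
    "Adam":     torch.optim.Adam(model.parameters(), lr=lr),
    "RMSprop":  torch.optim.RMSprop(model.parameters(), lr=lr),
}

# A single training step looks identical whichever optimizer is chosen;
# only the parameter-update rule (and how it adapts the step size) differs.
opt = optimizers["Adam"]
x, y = torch.randn(32, 10), torch.randn(32, 1)

opt.zero_grad()                              # clear old gradients
loss = nn.functional.mse_loss(model(x), y)   # forward pass + loss
loss.backward()                              # backpropagation
opt.step()                                   # parameter update
```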
GRADIENT DESCENT
STOCHASTIC GRADIENT DESCENT

The word ‘stochastic’ describes a system or process governed by randomness.
Hence, in Stochastic Gradient Descent, a few samples are selected at random
for each iteration instead of the whole dataset.
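
To make the "random subset per iteration" idea concrete, here is a minimal NumPy sketch of mini-batch SGD for linear regression with a mean-squared-error loss. The data, weights, learning rate, and batch size (X, y, w, lr, batch_size) are illustrative assumptions, not values from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: 1,000 samples, 5 features.
X = rng.normal(size=(1000, 5))
y = X @ np.array([1.0, -2.0, 0.5, 3.0, 0.0]) + rng.normal(scale=0.1, size=1000)

w = np.zeros(5)          # model weights
lr = 0.01                # learning rate
batch_size = 32          # number of randomly selected samples per iteration

for step in range(500):
    # Stochastic part: pick a small random subset instead of the whole dataset.
    idx = rng.choice(len(X), size=batch_size, replace=False)
    X_b, y_b = X[idx], y[idx]

    # Gradient of the MSE loss computed on the mini-batch only.
    grad = 2.0 / batch_size * X_b.T @ (X_b @ w - y_b)
    w -= lr * grad       # gradient descent update
```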

In Gradient Descent, the term “batch” denotes the number of samples from the
dataset used to calculate the gradient in each iteration. In standard (batch)
Gradient Descent, the batch is the entire training set.
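
The sketch below illustrates how the batch size determines how many samples enter each gradient computation, contrasting the full-batch, mini-batch, and single-sample (strictly stochastic) cases. The data, weights, and MSE loss are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 5))    # illustrative dataset
y = rng.normal(size=1000)
w = np.zeros(5)

def mse_gradient(X_b, y_b, w):
    """Gradient of the mean-squared-error loss over the given batch."""
    return 2.0 / len(X_b) * X_b.T @ (X_b @ w - y_b)

# Batch gradient descent: the "batch" is the entire dataset.
grad_full = mse_gradient(X, y, w)

# Mini-batch gradient descent: a random subset, e.g. 32 samples.
idx = rng.choice(len(X), size=32, replace=False)
grad_mini = mse_gradient(X[idx], y[idx], w)

# Stochastic gradient descent (strict sense): a single random sample.
i = rng.integers(len(X))
grad_single = mse_gradient(X[i:i+1], y[i:i+1], w)
```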
TERMINOLOGY
THREE MODES OF GRADIENT DESCENT
SETTING HYPERPARAMETERS
THE PROBLEM OF OVERFITTING
