Machine Learning Notes
Optimization Theory WHY: How do you train the weights of your model so that the
training error is minimized? Answer: optimization.
You need to know how to take derivatives of a loss function with respect to a
parameter so that you can carry out gradient descent. You need to know what
gradients mean, and what Hessians are if you are doing second-order or
quasi-Newton optimization such as L-BFGS (which approximates the Hessian rather
than computing it). You may need to learn what Newton steps are, for example to
solve line searches. You will need to understand functional derivatives to better
understand Gradient Boosted Decision Trees, which perform gradient descent in
function space. You will need to understand the convergence properties of various
optimization methods to get an idea of how fast or slow your algorithm will run.
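As a rough illustration of the gradient descent idea above, here is a minimal sketch that fits a line y = w*x + b by minimizing mean squared error. The function name, the toy data, the learning rate and the step count are all assumptions made for this example, not something prescribed by these notes.

```python
def gradient_descent(xs, ys, lr=0.1, steps=500):
    """Fit y ~ w * x + b by minimizing mean squared error (illustrative sketch)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Loss: L = (1/n) * sum((w*x + b - y)^2)
        # dL/dw = (2/n) * sum((w*x + b - y) * x)
        # dL/db = (2/n) * sum(w*x + b - y)
        grad_w = (2.0 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2.0 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        # Step against the gradient; lr is an assumed, hand-picked learning rate.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (hypothetical example values).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = gradient_descent(xs, ys)
print(w, b)
```

On this toy data the learned w and b should approach 2 and 1. A learning rate that is too large makes the updates diverge, which is one reason the convergence properties mentioned above matter.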
Probability and Statistics WHY: When you are doing machine learning, you are
primarily after some kind of distribution. What is the probability of an output given
my input? Why do I need this? When your machine learning model assigns high
probability to the observations you already know, you have a good model at
hand. It is a goodness criterion. Statistics helps you count well, normalize well,
obtain distributions, and find the mean of an input feature and its standard
deviation. Why do you need these things? You need means and variances to normalize
your input data before you feed it into your machine learning system. This helps in
faster convergence (an optimization theory concept).
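The normalization step described above can be sketched as follows: subtract the mean of a feature and divide by its standard deviation. This is only one common choice (often called standardization); the function name and the sample values are assumptions for illustration.

```python
import statistics


def standardize(values):
    """Rescale a feature to zero mean and unit standard deviation (sketch)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0.0:
        # A constant feature carries no spread; map it to zeros to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical feature column, e.g. heights in centimetres.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
scaled = standardize(heights_cm)
print(scaled)
```

In practice the mean and standard deviation would be computed on the training data only and then reused to scale any new inputs.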
Signal Processing WHY: You usually do not feed raw input to your machine learning
systems. You do some kind of preprocessing. For instance, you may want to extract
features from an input speech signal or an image. Extracting these features
requires you to know the properties of the underlying signals. Digital signal
processing or image processing will help you gain that expertise. You will be in a
better position to know which feature extraction methods work and which do not. You
will want to learn what a Fourier transform is, because you may want to apply it to
a speech signal, or apply a discrete cosine transform to images, before using the
result as features for your machine learning system.
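To make the Fourier transform idea above concrete, here is a small sketch that computes the magnitudes of a discrete Fourier transform directly from its definition and uses them to find the dominant frequency in a synthetic tone. The function name, the sample rate and the tone frequency are assumptions chosen for the example; real systems would normally use an FFT routine from a numerical library rather than this slow direct sum.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Magnitudes of DFT bins 0..n/2 for a real signal, via the direct O(n^2) sum."""
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        # X[k] = sum over t of x[t] * exp(-2*pi*i*k*t/n)
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(signal))
        mags.append(abs(coeff) / n)
    return mags


# Synthetic one-second "speech-like" input: a pure 5 Hz sine sampled at 64 Hz
# (both values are assumptions for illustration).
sample_rate = 64
tone_hz = 5
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate)
          for t in range(sample_rate)]

mags = dft_magnitudes(signal)
# With one second of data, bin k corresponds to k Hz.
peak_bin = max(range(len(mags)), key=mags.__getitem__)
print(peak_bin)
```

The list of magnitudes is the kind of frequency-domain feature vector that could be fed to a learning system in place of the raw samples.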
Pre-requisites for Machine Learning:
1. Vectors Basics
Click on the link below :
vectors basics.pdf