
Machine Learning

Machine Learning Algorithms:


● Supervised Learning.
● Unsupervised Learning.
● Reinforcement Learning.

Supervised Learning: Makes predictions from labeled data.

Unsupervised Learning: Learns patterns from unlabeled data.

Linear Regression:

● Hypothesis Function: hθ(x) = θ0 + θ1x; in general, hθ(x) = θᵀx.
● Objective: Fit the data by minimizing the error between predictions and actual values.

Cost Function:
● Measures the squared error between predicted and actual values: J(θ) = (1/2m) * Σ (hθ(x^(i)) − y^(i))^2.
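
A minimal Python/NumPy sketch of this squared-error cost (the names X, y, and theta are illustrative, not from the notes):

import numpy as np

def compute_cost(X, y, theta):
    # X: (m, n) feature matrix with a leading column of ones; y: (m,) targets
    m = len(y)
    predictions = X @ theta                      # h_theta(x) = theta^T x for every example
    errors = predictions - y
    return (1.0 / (2 * m)) * np.sum(errors ** 2)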

Gradient Descent: (Concept is not clear to me)


● Iterative algorithm that minimizes the cost function by repeatedly updating each parameter in the direction of the negative gradient: θj := θj − α * ∂J(θ)/∂θj.
Gradient Descent for multiple variables:

The partial derivative of the cost function is computed with respect to each parameter, and all parameters are updated simultaneously to minimize the error.
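
A minimal sketch of batch gradient descent for linear regression, assuming the same illustrative X, y, theta layout as the cost sketch above:

import numpy as np

def gradient_descent(X, y, theta, alpha=0.01, num_iters=1000):
    # Repeatedly step every parameter in the direction of the negative gradient.
    m = len(y)
    for _ in range(num_iters):
        errors = X @ theta - y
        gradient = (1.0 / m) * (X.T @ errors)   # partial derivatives of J(theta)
        theta = theta - alpha * gradient        # simultaneous update of all parameters
    return theta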

Feature Scaling:
● Standardizes input features for faster convergence during optimization.
● Method: Mean Normalization: xj := (xj − μj) / sj, where μj is the mean of feature j and sj its range (or standard deviation).
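
A small NumPy sketch of mean normalization (here sj is taken as the standard deviation; the range max − min is also common):

import numpy as np

def mean_normalize(X):
    mu = X.mean(axis=0)       # per-feature mean
    sigma = X.std(axis=0)     # per-feature spread
    return (X - mu) / sigma, mu, sigma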

Learning Rate: α

● Determines the size of each step taken to minimize the cost function.
● For sufficiently small α, the cost function decreases on every iteration.
● If α is too large, gradient descent may overshoot the minimum and fail to converge.

Logistic Regression for Classification

● Predicts probabilities for binary outcomes using the sigmoid function.


● Assigns a label based on a threshold (e.g., 0.5).

Normal equation:

● A closed-form solution for linear regression: θ = (XᵀX)⁻¹ Xᵀ y. No iterations and no learning rate are needed, but it becomes slow when the number of features is very large.
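
A minimal NumPy sketch of this closed-form solution (np.linalg.pinv is used for numerical stability; X and y are illustrative placeholders):

import numpy as np

def normal_equation(X, y):
    # theta = (X^T X)^(-1) X^T y, computed with a pseudo-inverse
    return np.linalg.pinv(X.T @ X) @ X.T @ y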

Hypothesis:

● The hypothesis hθ(x) models the relationship between the inputs and the output.

● hθ(x) = 1 / (1 + e^(−θᵀx)), i.e., the sigmoid function applied to θᵀx.

Decision Boundary:
● Separates the data classes based on the hypothesis.
● If hθ(x) >= 0.5, predict y = 1.
● If hθ(x) < 0.5, predict y = 0.
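
A short sketch of the logistic hypothesis and the 0.5 threshold rule (illustrative names only):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(X, theta, threshold=0.5):
    probabilities = sigmoid(X @ theta)                 # h_theta(x) in [0, 1]
    return (probabilities >= threshold).astype(int)    # y = 1 if h_theta(x) >= threshold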

Advanced Optimization for Logistic Regression:


Advanced optimization methods (e.g., conjugate gradient, BFGS, L-BFGS) aim to minimize the cost function J(θ) more efficiently than plain gradient descent, often without manually picking a learning rate.

One-vs-All Classification:
One-vs-All trains a separate binary classifier for each class and predicts the class whose classifier gives the highest probability.
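
A sketch of one-vs-all built on top of a generic binary trainer; train_binary and predict_proba are hypothetical placeholders for any binary logistic-regression routine:

import numpy as np

def one_vs_all_train(X, y, num_classes, train_binary):
    # Train one classifier per class: class k vs. everything else.
    return [train_binary(X, (y == k).astype(int)) for k in range(num_classes)]

def one_vs_all_predict(X, classifiers, predict_proba):
    # Pick the class whose classifier is most confident.
    scores = np.column_stack([predict_proba(clf, X) for clf in classifiers])
    return np.argmax(scores, axis=1)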

Octave:
A language that helps with plotting data and performing matrix operations when prototyping machine learning algorithms.

Neural Networks:

● Neural networks are models inspired by the human brain, designed to process inputs,
learn patterns, and make predictions.
● They are capable of learning both linear and non-linear relationships in the data.

Input layer: Takes in the data (e.g., an image or a number).

Hidden layers: Process the data using weights and biases (e.g., detecting shape/color features).
Output layer: Gives the final prediction.

Non-Linear Hypothesis
Neural networks create non-linear hypotheses, enabling them to handle complex problems like
classifying patterns.

Neurons and Brain:

Algorithms that try to mimic the brain.

Model Representation:
● Receives raw input features.
● Performs transformations using weights, biases, and activation functions.
● Produces a prediction.
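
A minimal forward-propagation sketch for one hidden layer with sigmoid activations (the weight and bias names W1, b1, W2, b2 are illustrative):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_propagate(x, W1, b1, W2, b2):
    # Hidden layer: linear transformation followed by a non-linear activation.
    a1 = sigmoid(W1 @ x + b1)
    # Output layer: produces the final prediction.
    return sigmoid(W2 @ a1 + b2)
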
Examples and Intuitions:
Binary Classification: Spam vs. Not Spam.
Non-Linear Problems: Solving tasks like the XOR problem, where linear models fail (simple gates such as AND/OR are still linearly separable).
Multi-Class Classification: Recognizing handwritten digits.

Multi-Class Classification:

Multi-class classification is the task of predicting one label from three or more possible classes.

Example: Handwritten digit recognition (classes: 0–9).

Cost Function
The cost function measures how well the neural network's predictions match the actual labels.
● Binary Classification: A single output unit; measures the error for outputs in [0, 1].
● Multi-Class Classification: K output units, one per class; the cost sums the error over all K outputs.

Backpropagation Algorithm:
Backpropagation calculates the gradients of the cost function with respect to every weight and bias:
● Perform forward propagation to compute predictions.
● Calculate errors at the output layer.
● Propagate the error backward through the layers, adjusting weights and biases.
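
A compact sketch of one backpropagation step for the same one-hidden-layer network sketched earlier (sigmoid activations, cross-entropy-style output error; all names are illustrative):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop_step(x, y, W1, b1, W2, b2, alpha=0.1):
    # Forward propagation to compute predictions.
    a1 = sigmoid(W1 @ x + b1)
    a2 = sigmoid(W2 @ a1 + b2)
    # Error at the output layer.
    delta2 = a2 - y
    # Propagate the error backward through the hidden layer.
    delta1 = (W2.T @ delta2) * a1 * (1 - a1)
    # Gradients with respect to weights and biases, then a gradient-descent update.
    W2 = W2 - alpha * np.outer(delta2, a1)
    b2 = b2 - alpha * delta2
    W1 = W1 - alpha * np.outer(delta1, x)
    b1 = b1 - alpha * delta1
    return W1, b1, W2, b2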

Implementation Note - Unrolling Parameters:


● Have initial parameter matrices Θ1, Θ2, Θ3.
● Advanced optimization routines require the parameters in vector form.
● Unrolling parameters involves converting weight matrices and bias vectors into a single
column vector.
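
A small sketch of unrolling two parameter matrices into one vector and reshaping them back (the Θ shapes are illustrative):

import numpy as np

Theta1 = np.random.rand(10, 11)   # e.g., 10 hidden units, 10 inputs + bias
Theta2 = np.random.rand(1, 11)    # e.g., 1 output unit, 10 hidden units + bias

# Unroll: flatten each matrix and concatenate into a single vector.
theta_vec = np.concatenate([Theta1.ravel(), Theta2.ravel()])

# Reshape back when the individual matrices are needed again.
Theta1_back = theta_vec[:Theta1.size].reshape(Theta1.shape)
Theta2_back = theta_vec[Theta1.size:].reshape(Theta2.shape)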

Gradient checking
Gradient checking ensures the correctness of backpropagation by comparing analytically
computed gradients to numerical approximations.

d/dθ J(θ) ≈ (J(θ + ε) − J(θ − ε)) / (2ε)
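
A sketch of numerical gradient checking using this two-sided difference; cost_function is a hypothetical callable that returns J(θ) for a parameter vector:

import numpy as np

def numerical_gradient(cost_function, theta, epsilon=1e-4):
    # Approximate dJ/d(theta_i) as (J(theta + eps) - J(theta - eps)) / (2 * eps).
    grad = np.zeros_like(theta)
    for i in range(len(theta)):
        theta_plus, theta_minus = theta.copy(), theta.copy()
        theta_plus[i] += epsilon
        theta_minus[i] -= epsilon
        grad[i] = (cost_function(theta_plus) - cost_function(theta_minus)) / (2 * epsilon)
    return grad

# Compare against the backpropagation gradient; the two should agree to several decimal places.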


Random Initialization:
● Weights are initialized to small random values to avoid symmetry problems, where all
neurons in a layer learn the same features.
● Random initialization ensures diverse learning paths for different neurons.
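
A one-line sketch of symmetry-breaking initialization in a small range [−ε_init, ε_init] (the 0.12 value and matrix shape are illustrative):

import numpy as np

epsilon_init = 0.12
Theta1 = np.random.rand(10, 11) * 2 * epsilon_init - epsilon_init   # values in [-0.12, 0.12]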

Summary of Neural Networks:


● Initialize weights and biases.
● Perform forward propagation to calculate predictions.
● Compute cost/loss.
● Use backpropagation to calculate gradients.
● Update weights using gradient descent.

Example:
Neural networks are used for lane detection, object recognition, and path planning in autonomous vehicles.

Advice for Applying Machine Learning:


Evaluating a Hypothesis

● Focuses on understanding the performance of a model using training, validation, and test datasets.
● Overfitting occurs when the model performs well on training data but poorly on unseen data.

Model Selection:
● Training set: 60%
● Cross-validation set: 20%
● Test set: 20%

Diagnosing Bias vs Variance:

Bias (Underfit):

● Training set error is high.
● Cross-validation error is high.

Variance (Overfit):

● Training set error is low.
● Cross-validation error is much greater than the training set error.

Regularization:

Helps reduce variance (overfitting) and improves the generalization ability of the model without significantly increasing bias (underfitting). A penalty term such as (λ/2m) * Σ θj² is added to the cost function, with λ controlling the strength of the penalty.

Learning Curves:

If a learning algorithm is suffering from high bias, getting more data will not help much.

Error Metrics:

Precision = True Positives / Predicted Positives.

Recall = True Positives / Actual Positives.

Trading Off Precision and Recall:

● Lower threshold: increases recall, decreases precision.
● Higher threshold: increases precision, decreases recall.

The F1 score balances precision and recall:

F1 = 2 * (P * R) / (P + R)
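
A small sketch computing these metrics from 0/1 label arrays (illustrative; assumes at least one predicted and one actual positive):

import numpy as np

def precision_recall_f1(y_true, y_pred):
    tp = np.sum((y_pred == 1) & (y_true == 1))   # true positives
    fp = np.sum((y_pred == 1) & (y_true == 0))   # false positives
    fn = np.sum((y_pred == 0) & (y_true == 1))   # false negatives
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1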

Support Vector Machines (SVMs):


An SVM finds the best boundary to separate data into classes.

● SVM tries to find a hyperplane that maximizes the margin between classes.
● The points closest to the hyperplane are called support vectors.

Large Margin Intuition:

● Maximize margin to improve model performance.


● Reduce risk of overfitting.
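
A minimal scikit-learn sketch of fitting a linear SVM on a tiny made-up dataset (library usage for illustration only; C trades margin width against training errors):

import numpy as np
from sklearn.svm import SVC

X = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]])   # two separable classes
y = np.array([0, 0, 1, 1])

model = SVC(kernel="linear", C=1.0)
model.fit(X, y)
print(model.support_vectors_)                    # the points closest to the hyperplane
print(model.predict([[0.1, 0.0], [1.0, 0.9]]))   # class predictions for new points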

Clustering:

K-Means Algorithm: Alternates between assigning points to the nearest centroid and
updating centroids.

Choosing Number of Clusters:

Elbow method: look for the point where adding more clusters doesn't significantly reduce the within-cluster variance.
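
A compact NumPy sketch of the k-means alternation (random initial centroids; a robust version would also handle empty clusters):

import numpy as np

def k_means(X, k, num_iters=100):
    rng = np.random.default_rng(0)
    centroids = X[rng.choice(len(X), size=k, replace=False)]   # k random points as initial centroids
    for _ in range(num_iters):
        # Assignment step: each point goes to its nearest centroid.
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(distances, axis=1)
        # Update step: each centroid moves to the mean of its assigned points.
        centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return centroids, labels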
