ANN Unit IV Notes

Artificial Neural Networks: Unit IV

1) List of Activation Functions and Detailed Explanation:

Activation functions are used in neural networks to introduce non-linearity into the model, helping the network learn complex patterns.

a) Identity Function

- Formula: f(x) = x

- Explanation: This is a linear function where the output is the same as the input. It is mainly used in the input layer as it does not introduce any non-linearity.

- Use Case: Not often used in hidden layers as it cannot learn complex patterns, but it can be used in regression problems.

b) Step (Threshold) Function

- Formula: f(x) = 1 if x >= 0, else f(x) = 0

- Explanation: It outputs 1 if the input meets or exceeds a certain threshold (usually 0) and outputs 0 otherwise.

- Use Case: Useful for simple binary classification (e.g., the perceptron), but it is not differentiable and its gradient is zero wherever it is defined, which rules out gradient-based training and limits its use in complex networks.

c) ReLU (Rectified Linear Unit)

- Formula: f(x) = max(0, x)

- Explanation: ReLU is the most commonly used activation function in deep learning; because its gradient does not saturate for positive inputs, it helps mitigate the vanishing gradient problem.

- Use Case: Used in convolutional and deep networks for tasks like image processing, NLP, etc.

d) Sigmoid Function

- Formula: f(x) = 1 / (1 + e^(-x))

- Explanation: Outputs values between 0 and 1, useful for probabilistic outputs, but suffers from vanishing gradients.

- Use Case: Binary classification tasks, often used in the output layer.

e) Hyperbolic Tangent (Tanh) Function

- Formula: f(x) = (e^x - e^-x) / (e^x + e^-x)

- Explanation: Similar to the sigmoid but outputs values between -1 and 1; its zero-centered output often makes gradient-based training behave better than the sigmoid.

- Use Case: Frequently used in the hidden layers of deep learning models.

f) Leaky ReLU

- Formula: f(x) = x if x > 0, else f(x) = 0.01x

- Explanation: Allows a small non-zero output (and gradient) for negative inputs, which helps prevent the dying ReLU problem.

- Use Case: Used in deep learning networks for image processing tasks.
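
The functions above can be sketched in a few lines of NumPy. This is a minimal, illustrative implementation (the function names are choices made here; the 0.01 slope for Leaky ReLU follows the formula above):

import numpy as np

def identity(x):
    return x                              # f(x) = x

def step(x):
    return np.where(x >= 0, 1, 0)         # 1 if x >= 0, else 0

def relu(x):
    return np.maximum(0, x)               # max(0, x)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))           # 1 / (1 + e^(-x))

def tanh(x):
    return np.tanh(x)                     # (e^x - e^-x) / (e^x + e^-x)

def leaky_relu(x, slope=0.01):
    return np.where(x > 0, x, slope * x)  # small slope for negative inputs

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x))     # [0.  0.  0.  1.5]
print(sigmoid(x))  # values strictly between 0 and 1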

2) Backpropagation Algorithm (with Example)

The Backpropagation Algorithm trains neural networks by minimizing the error using gradient descent.

Steps:

1. Initialization: Randomly initialize the weights and biases.

2. Forward Pass: Inputs pass through the network to produce an output.

3. Compute Error: Calculate the difference between the predicted and the actual output.

4. Backward Pass: Propagate the error back through the network to compute the gradient for each weight.

5. Weight Update: Use gradient descent to update the weights.

Example:

If the actual output is 0.8 and the network predicts 0.5, the error will be propagated back to adjust the weights, reducing the difference between prediction and actual output.
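
The figures in this example (prediction 0.5, target 0.8) can be reproduced with a minimal sketch: a single sigmoid neuron with one input, starting from zero weight and bias so that its first prediction is exactly 0.5. The input value 1.0 and learning rate 0.5 are illustrative choices, not values from the notes.

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

x, t = 1.0, 0.8      # input and target output (target 0.8 as in the example)
w, b = 0.0, 0.0      # zero weight and bias -> first prediction is sigmoid(0) = 0.5
alpha = 0.5          # learning rate (illustrative)

for _ in range(20):
    y = sigmoid(w * x + b)          # forward pass
    error = 0.5 * (t - y) ** 2      # compute error E = 1/2 (t - y)^2
    delta = (y - t) * y * (1 - y)   # backward pass: dE/dz for a sigmoid unit
    w -= alpha * delta * x          # weight update by gradient descent
    b -= alpha * delta

print(y)  # has moved from 0.5 toward the target 0.8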

3) Differentiating Single Layer, Multilayer Feedforward, and Recurrent Networks

a) Single Layer Feedforward Network:

- Contains one layer of neurons from input to output.

- No hidden layers.

- Use Case: Suitable for simple, linearly separable problems such as basic classification.

b) Multilayer Feedforward Network:

- Contains one or more hidden layers.

- Can learn non-linear relationships.

- Use Case: Suitable for complex tasks like image recognition.

c) Recurrent Neural Network (RNN):

- Contains loops allowing information to persist.

- Suitable for time-dependent data.

- Use Case: Used in tasks like speech recognition and time series forecasting.
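
A rough structural sketch of the difference (illustrative NumPy code with arbitrary layer sizes): a feedforward layer maps each input independently, while a recurrent step also feeds the previous hidden state back in, letting information persist across time steps.

import numpy as np

rng = np.random.default_rng(0)
W_in = rng.normal(size=(3, 4))   # input -> hidden weights
W_h = rng.normal(size=(4, 4))    # hidden -> hidden (recurrent) weights

def feedforward_step(x):
    # output depends only on the current input
    return np.tanh(x @ W_in)

def recurrent_step(x, h_prev):
    # output also depends on the previous hidden state (the "loop")
    return np.tanh(x @ W_in + h_prev @ W_h)

sequence = rng.normal(size=(5, 3))  # 5 time steps, 3 features each
h = np.zeros(4)
for x_t in sequence:
    h = recurrent_step(x_t, h)      # state carries information forward in time
print(h.shape)                      # (4,)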

4) What is Multi-layer Feedforward Network?

A Multi-layer Feedforward Network consists of input, hidden, and output layers.


Importance of Layers:

- Hidden Layers: Enable the network to learn complex patterns.

- Output Layer: Produces the final output based on learned patterns.

The more hidden layers, the more abstract and complex relationships the network can learn.
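
A minimal forward pass through such a stack might look like the sketch below; the layer sizes, random weights, and the choice of tanh/sigmoid activations are illustrative assumptions, not part of the notes.

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(4, 5)), np.zeros(5)  # input (4 features) -> hidden layer 1 (5 units)
W2, b2 = rng.normal(size=(5, 3)), np.zeros(3)  # hidden layer 1 -> hidden layer 2 (3 units)
W3, b3 = rng.normal(size=(3, 1)), np.zeros(1)  # hidden layer 2 -> output (1 unit)

def forward(x):
    h1 = np.tanh(x @ W1 + b1)      # first hidden layer extracts simple features
    h2 = np.tanh(h1 @ W2 + b2)     # second hidden layer combines them into more abstract ones
    return sigmoid(h2 @ W3 + b3)   # output layer produces the final prediction

x = rng.normal(size=4)
print(forward(x))  # a single value between 0 and 1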

5) Derivation of Backpropagation Training Algorithm:

Backpropagation uses gradient descent to minimize error by adjusting weights.

Cost Function:

E = (1/2) * sum_k (t_k - y_k)^2, where t_k is the target and y_k the actual output of output neuron k.

Gradient of the Error:

dE/dw_jk = delta_k * a_j, where delta_k = (y_k - t_k) * f'(net_k) is the error term of output neuron k and a_j is the activation of the neuron j feeding into it.

Weight Update Rule:

w_new = w_old - alpha * dE/dw

Role of Learning Rate (alpha):

- Controls the step size during weight updates.

- Too high: May overshoot the optimal solution.

- Too low: Slow convergence.

A well-adjusted learning rate ensures faster convergence without overshooting or getting stuck in local minima.
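
Plugging illustrative numbers into the rules above (these values are made up purely for the arithmetic): activation a_j = 0.6, error term delta_k = 0.05, old weight 0.4, learning rate alpha = 0.1.

a_j = 0.6       # activation of the sending neuron j (illustrative)
delta_k = 0.05  # error term at output neuron k (illustrative)
alpha = 0.1     # learning rate
w_old = 0.4     # current weight (illustrative)

grad = delta_k * a_j           # dE/dw = 0.05 * 0.6 = 0.03
w_new = w_old - alpha * grad   # 0.4 - 0.1 * 0.03
print(w_new)                   # approximately 0.397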
