ANN Unit IV Notes

Artificial Neural Networks: Unit IV

1) List of Activation Functions and Detailed Explanation:

Activation functions are used in neural networks to introduce non-linearity into the model, helping the network learn complex patterns.

a) Identity Function

- Formula: f(x) = x

- Explanation: This is a linear function where the output is the same as the input. It is mainly used in the input layer as it does not introduce any non-linearity.

- Use Case: Not often used in hidden layers as it cannot learn complex patterns, but it can be used in regression problems.

b) Step (Threshold) Function

- Formula: f(x) = 1 if x >= 0, else f(x) = 0

- Explanation: It outputs 1 if the input meets or exceeds a certain threshold (usually 0) and outputs 0 otherwise.

- Use Case: Useful for simple binary classification (e.g., the perceptron), but it is not differentiable and its gradient is zero wherever it is defined, which rules out gradient-based training and limits its use in complex networks.

c) ReLU (Rectified Linear Unit)

- Formula: f(x) = max(0, x)

- Explanation: ReLU is the most commonly used activation function in deep learning; because its gradient does not saturate for positive inputs, it helps mitigate the vanishing gradient problem.

- Use Case: Used in convolutional and deep networks for tasks like image processing, NLP, etc.

d) Sigmoid Function

- Formula: f(x) = 1 / (1 + e^(-x))

- Explanation: Outputs values between 0 and 1, useful for probabilistic outputs, but suffers from vanishing gradients.

- Use Case: Binary classification tasks, often used in the output layer.

e) Hyperbolic Tangent (Tanh) Function

- Formula: f(x) = (e^x - e^-x) / (e^x + e^-x)

- Explanation: Similar to the sigmoid but outputs values between -1 and 1; its zero-centered output often makes gradient-based training behave better than the sigmoid.

- Use Case: Frequently used in the hidden layers of deep learning models.

f) Leaky ReLU

- Formula: f(x) = x if x > 0, else f(x) = 0.01x

- Explanation: Allows a small non-zero output (and gradient) for negative inputs, which helps prevent the dying ReLU problem.

- Use Case: Used in deep learning networks for image processing tasks.
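
The functions above can be sketched in a few lines of NumPy. This is a minimal, illustrative implementation (the function names are choices made here; the 0.01 slope for Leaky ReLU follows the formula above):

import numpy as np

def identity(x):
    return x                              # f(x) = x

def step(x):
    return np.where(x >= 0, 1, 0)         # 1 if x >= 0, else 0

def relu(x):
    return np.maximum(0, x)               # max(0, x)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))           # 1 / (1 + e^(-x))

def tanh(x):
    return np.tanh(x)                     # (e^x - e^-x) / (e^x + e^-x)

def leaky_relu(x, slope=0.01):
    return np.where(x > 0, x, slope * x)  # small slope for negative inputs

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x))     # [0.  0.  0.  1.5]
print(sigmoid(x))  # values strictly between 0 and 1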

2) Backpropagation Algorithm (with Example)

The Backpropagation Algorithm trains neural networks by minimizing the error using gradient descent.

Steps:

1. Initialization: Randomly initialize the weights and biases.

2. Forward Pass: Inputs pass through the network to produce an output.

3. Compute Error: Calculate the difference between the predicted and the actual output.

4. Backward Pass: Propagate the error back through the network to compute the gradient for each weight.

5. Weight Update: Use gradient descent to update the weights.

Example:

If the actual output is 0.8 and the network predicts 0.5, the error will be propagated back to adjust the weights, reducing the difference between prediction and actual output.
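
The figures in this example (prediction 0.5, target 0.8) can be reproduced with a minimal sketch: a single sigmoid neuron with one input, starting from zero weight and bias so that its first prediction is exactly 0.5. The input value 1.0 and learning rate 0.5 are illustrative choices, not values from the notes.

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

x, t = 1.0, 0.8      # input and target output (target 0.8 as in the example)
w, b = 0.0, 0.0      # zero weight and bias -> first prediction is sigmoid(0) = 0.5
alpha = 0.5          # learning rate (illustrative)

for _ in range(20):
    y = sigmoid(w * x + b)          # forward pass
    error = 0.5 * (t - y) ** 2      # compute error E = 1/2 (t - y)^2
    delta = (y - t) * y * (1 - y)   # backward pass: dE/dz for a sigmoid unit
    w -= alpha * delta * x          # weight update by gradient descent
    b -= alpha * delta

print(y)  # has moved from 0.5 toward the target 0.8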

3) Differentiating Single Layer, Multilayer Feedforward, and Recurrent Networks

a) Single Layer Feedforward Network:

- Contains one layer of neurons from input to output.

- No hidden layers.

- Use Case: Suitable for simple, linearly separable problems such as basic classification.

b) Multilayer Feedforward Network:

- Contains one or more hidden layers.

- Can learn non-linear relationships.

- Use Case: Suitable for complex tasks like image recognition.

c) Recurrent Neural Network (RNN):

- Contains loops allowing information to persist.

- Suitable for time-dependent data.

- Use Case: Used in tasks like speech recognition and time series forecasting.
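
A rough structural sketch of the difference (illustrative NumPy code with arbitrary layer sizes): a feedforward layer maps each input independently, while a recurrent step also feeds the previous hidden state back in, letting information persist across time steps.

import numpy as np

rng = np.random.default_rng(0)
W_in = rng.normal(size=(3, 4))   # input -> hidden weights
W_h = rng.normal(size=(4, 4))    # hidden -> hidden (recurrent) weights

def feedforward_step(x):
    # output depends only on the current input
    return np.tanh(x @ W_in)

def recurrent_step(x, h_prev):
    # output also depends on the previous hidden state (the "loop")
    return np.tanh(x @ W_in + h_prev @ W_h)

sequence = rng.normal(size=(5, 3))  # 5 time steps, 3 features each
h = np.zeros(4)
for x_t in sequence:
    h = recurrent_step(x_t, h)      # state carries information forward in time
print(h.shape)                      # (4,)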

4) What is Multi-layer Feedforward Network?

A Multi-layer Feedforward Network consists of input, hidden, and output layers.


Importance of Layers:

- Hidden Layers: Enable the network to learn complex patterns.

- Output Layer: Produces the final output based on learned patterns.

The more hidden layers, the more abstract and complex relationships the network can learn.
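
A minimal forward pass through such a stack might look like the sketch below; the layer sizes, random weights, and the choice of tanh/sigmoid activations are illustrative assumptions, not part of the notes.

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(4, 5)), np.zeros(5)  # input (4 features) -> hidden layer 1 (5 units)
W2, b2 = rng.normal(size=(5, 3)), np.zeros(3)  # hidden layer 1 -> hidden layer 2 (3 units)
W3, b3 = rng.normal(size=(3, 1)), np.zeros(1)  # hidden layer 2 -> output (1 unit)

def forward(x):
    h1 = np.tanh(x @ W1 + b1)      # first hidden layer extracts simple features
    h2 = np.tanh(h1 @ W2 + b2)     # second hidden layer combines them into more abstract ones
    return sigmoid(h2 @ W3 + b3)   # output layer produces the final prediction

x = rng.normal(size=4)
print(forward(x))  # a single value between 0 and 1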

5) Derivation of Backpropagation Training Algorithm:

Backpropagation uses gradient descent to minimize error by adjusting weights.

Cost Function:

E = (1/2) * sum_k (t_k - y_k)^2, where t_k is the target and y_k the actual output of output neuron k.

Gradient of the Error:

dE/dw_jk = delta_k * a_j, where delta_k = (y_k - t_k) * f'(net_k) is the error term of output neuron k and a_j is the activation of the neuron j feeding into it.

Weight Update Rule:

w_new = w_old - alpha * dE/dw

Role of Learning Rate (alpha):

- Controls the step size during weight updates.

- Too high: May overshoot the optimal solution.

- Too low: Slow convergence.

A well-adjusted learning rate ensures faster convergence without overshooting or getting stuck in local minima.
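
Plugging illustrative numbers into the rules above (these values are made up purely for the arithmetic): activation a_j = 0.6, error term delta_k = 0.05, old weight 0.4, learning rate alpha = 0.1.

a_j = 0.6       # activation of the sending neuron j (illustrative)
delta_k = 0.05  # error term at output neuron k (illustrative)
alpha = 0.1     # learning rate
w_old = 0.4     # current weight (illustrative)

grad = delta_k * a_j           # dE/dw = 0.05 * 0.6 = 0.03
w_new = w_old - alpha * grad   # 0.4 - 0.1 * 0.03
print(w_new)                   # approximately 0.397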
