ML & DL


1. Explain about Types of Machine Learning Systems.

Ans:
Machine Learning:
“Machine Learning is defined as the study of computer programs
that leverage algorithms and statistical models to learn through
inference and patterns without being explicitly programmed. The machine
learning field has undergone significant developments in the last
decade.”
With machine learning algorithms, AI was able to develop beyond just
performing the tasks it was programmed to do. Before ML entered the
mainstream, AI programs were only used to automate low-level tasks in
business and enterprise settings. This included tasks like intelligent
automation or simple rule-based classification. This meant that AI
algorithms were restricted to the domain they were programmed for.
However, with machine learning, computers were able to move past doing
only what they were programmed to do and began improving with each
iteration.
Machine learning is set apart from other approaches within artificial
intelligence by its capability to improve with experience. Using various
programming techniques, machine learning algorithms are able to
process large amounts of data and extract useful information. In this
way, they can improve upon their previous iterations by learning from
the data they are provided.
We cannot talk about machine learning without speaking about big data,
one of the most important aspects of machine learning algorithms. Any
type of AI usually depends on the quality of its dataset for good
results, as the field makes heavy use of statistical methods.

Types of Machine Learning


As with any method, there are different ways to train machine learning
algorithms, each with their own advantages and disadvantages. To
understand the pros and cons of each type of machine learning, we must
first look at what kind of data they ingest. In ML, there are two kinds of
data — labeled data and unlabeled data.

Supervised Learning:
Supervised learning is one of the most basic types of machine learning.
In this type, the machine learning algorithm is trained on labeled data.
Even though the data needs to be labeled accurately for this method to
work, supervised learning is extremely powerful when used in the right
circumstances.
In supervised learning, the ML algorithm is given a training dataset to
work with. This training dataset is a smaller part of the full dataset
and serves to give the algorithm a basic idea of the problem, solution,
and data points to be dealt with. The training dataset is also very
similar to the final dataset in its characteristics and provides the
algorithm with the labeled examples required for the problem.
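As a minimal sketch of learning from labeled data, the following plain
JavaScript snippet implements a 1-nearest-neighbour classifier. The tiny
dataset, its labels, and the function names are illustrative assumptions
and not part of the original notes.

```javascript
// Hypothetical labeled training set: [hoursStudied, hoursSlept] -> label.
const trainingSet = [
  { features: [1, 4], label: "fail" },
  { features: [2, 5], label: "fail" },
  { features: [7, 7], label: "pass" },
  { features: [8, 6], label: "pass" },
];

// Squared Euclidean distance between two feature vectors of equal length.
function squaredDistance(a, b) {
  return a.reduce((sum, value, i) => sum + (value - b[i]) ** 2, 0);
}

// Predict the label of the closest labeled example (1-nearest neighbour).
function predictNearest(examples, query) {
  let best = examples[0];
  for (const example of examples) {
    if (squaredDistance(example.features, query) <
        squaredDistance(best.features, query)) {
      best = example;
    }
  }
  return best.label;
}

console.log(predictNearest(trainingSet, [6, 8])); // "pass"
console.log(predictNearest(trainingSet, [1, 5])); // "fail"
```

The labels are what make this supervised: the prediction is simply the
label attached to the most similar training example.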

Unsupervised Learning:
Unsupervised machine learning holds the advantage of being able to
work with unlabeled data. This means that human labor is not required
to make the dataset machine-readable, allowing much larger datasets to
be worked on by the program.
In supervised learning, the labels allow the algorithm to find the exact
nature of the relationship between any two data points. Unsupervised
learning, however, does not have labels to work from, so the algorithm
has to discover hidden structures in the data on its own. Relationships
between data points are perceived by the algorithm in an abstract
manner, with no input required from human beings.
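To illustrate how structure can be found without labels, here is a
hedged sketch of k-means clustering on one-dimensional data. The
function name `kMeans1D`, the sample numbers, and the initial centroids
are assumptions chosen for the example.

```javascript
// Minimal 1-D k-means: groups unlabeled numbers around k centroids.
function kMeans1D(points, initialCentroids, iterations = 10) {
  let centroids = [...initialCentroids];
  let assignments = [];
  for (let step = 0; step < iterations; step++) {
    // Assignment step: attach each point to its nearest centroid.
    assignments = points.map((p) => {
      let nearest = 0;
      centroids.forEach((c, i) => {
        if (Math.abs(p - c) < Math.abs(p - centroids[nearest])) nearest = i;
      });
      return nearest;
    });
    // Update step: move each centroid to the mean of its assigned points.
    centroids = centroids.map((c, i) => {
      const members = points.filter((_, j) => assignments[j] === i);
      return members.length === 0
        ? c
        : members.reduce((s, v) => s + v, 0) / members.length;
    });
  }
  return { centroids, assignments };
}

const unlabeled = [1, 2, 3, 10, 11, 12];
const clusters = kMeans1D(unlabeled, [1, 12]);
console.log(clusters.centroids);   // [2, 11]
console.log(clusters.assignments); // [0, 0, 0, 1, 1, 1]
```

No label is ever supplied; the two groups emerge only from the distances
between the points.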
Reinforcement Learning:
Reinforcement learning takes inspiration directly from how human beings
learn from data in
their lives. It features an algorithm that improves upon itself and learns
from new situations using a trial-and-error method. Favorable outputs
are encouraged or ‘reinforced’, and non-favorable outputs are
discouraged or ‘punished’.
Based on the psychological concept of conditioning, reinforcement
learning works by putting the algorithm in a work environment with an
interpreter and a reward system. In every iteration of the algorithm, the
output result is given to the interpreter, which decides whether the
outcome is favorable or not.
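The loop described above can be sketched in a few lines. This is a
deliberately simplified, deterministic illustration: the `interpreter`
function, the two actions, and the learning rate are all hypothetical,
and real reinforcement learning algorithms explore far more carefully.

```javascript
// Hypothetical interpreter: rewards action "B" (+1) and punishes "A" (-1).
function interpreter(action) {
  return action === "B" ? 1 : -1;
}

// Learned preference score per action, all starting at zero.
const preferences = { A: 0, B: 0 };
const actions = Object.keys(preferences);
const learningRate = 0.5;

// Pick the action with the highest preference so far.
const bestAction = () =>
  actions.reduce((best, a) => (preferences[a] > preferences[best] ? a : best));

for (let step = 0; step < 10; step++) {
  // Trial and error: try every action once, then exploit the best one.
  const action = step < actions.length ? actions[step] : bestAction();
  const reward = interpreter(action);
  // Move the preference toward the observed reward (reinforce or punish).
  preferences[action] += learningRate * (reward - preferences[action]);
}

const learnedAction = bestAction();
console.log(preferences, learnedAction);
```

After the loop the punished action keeps a negative score while the
rewarded action's score approaches 1, so the agent settles on "B".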
Applications of Machine Learning
→Machine learning algorithms are used in circumstances where the
solution is required to continue improving post-deployment.
→The dynamic nature of adaptable machine learning solutions is one of
the main selling points for its adoption by companies and organizations
across verticals.
→Machine learning algorithms and solutions are versatile and can be
used as a substitute for medium-skilled human labor given the right
circumstances.
→For example, customer service executives in large B2C companies
have now been replaced by natural language processing machine
learning algorithms known as chatbots.
→These chatbots can analyze customer queries and provide support for
human customer support executives or deal with the customers directly.
→Machine learning algorithms also help to improve user experience and
customization for online platforms.
→Facebook, Netflix, Google, and Amazon all use recommendation
systems to prevent content glut and provide unique content to individual
users based on their likes and dislikes.

2. Explain about Overfitting the Training Data and Underfitting
the Training Data?
Ans:
Overfitting and Underfitting in Machine Learning:
Overfitting and Underfitting are the two main problems that occur in
machine learning and degrade the performance of the machine learning
models.
The main goal of each machine learning model is to generalize well.
Here, generalization means the ability of an ML model to produce a
suitable output for inputs it has not seen before. In other words,
after being trained on the dataset, the model can produce reliable and
accurate output on new data. Hence, underfitting and overfitting are
the two conditions that need to be checked to assess the performance of
the model and whether it is generalizing well or not.
Before looking at overfitting and underfitting, let's understand some
basic terms that will help with this topic:
• Signal: It refers to the true underlying pattern of the data that helps
the machine learning model to learn from the data.
• Noise: Noise is unnecessary and irrelevant data that reduces the
performance of the model.
• Bias: Bias is a prediction error that is introduced in the model due
to oversimplifying the machine learning algorithm. It shows up as a
systematic difference between the predicted values and the actual
values.
• Variance: Variance is the sensitivity of the model to the particular
training set it was given. High variance shows up when the model
performs well on the training dataset but does not perform well on
the test dataset.
Overfitting:
Overfitting occurs when our machine learning model tries to cover all
the data points, or more than the required data points, present in the
given dataset. Because of this, the model starts capturing the noise and
inaccurate values present in the dataset, and all these factors reduce
the efficiency and accuracy of the model. The overfitted model has low
bias and high variance.
The chance of overfitting increases the longer we train our model on the
same data: the more we train, the more likely the model is to become
overfitted.
Overfitting is the main problem that occurs in supervised learning.
Example: The concept of overfitting can be understood from a regression
output in which the model tries to cover all the data points present in
the scatter plot. It may look efficient, but in reality it is not. The
goal of the regression model is to find the best-fit line, but here no
best fit has been found, so the model will generate prediction errors
on new data.
How to avoid the Overfitting in Model
Both overfitting and underfitting degrade the performance of the
machine learning model. Overfitting is the more common cause, and there
are several ways by which we can reduce its occurrence in our model:
• Cross-Validation
• Training with more data
• Removing features
• Early stopping the training
• Regularization
• Ensembling
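Cross-validation, the first item in the list, can be sketched as
follows. This is an illustrative outline only: `kFoldIndices`,
`crossValidate`, and the `trainAndScore` callback are hypothetical
names, the toy "model" just predicts the mean of the training targets,
and in practice the data would normally be shuffled before folding.

```javascript
// Split sample indices 0..n-1 into k folds of (nearly) equal size.
function kFoldIndices(n, k) {
  const folds = Array.from({ length: k }, () => []);
  for (let i = 0; i < n; i++) folds[i % k].push(i);
  return folds;
}

// Average a score over k train/validation splits.
// `trainAndScore(trainIdx, valIdx)` is a caller-supplied callback.
function crossValidate(n, k, trainAndScore) {
  const folds = kFoldIndices(n, k);
  const scores = folds.map((valIdx, f) => {
    const trainIdx = folds.filter((_, j) => j !== f).flat();
    return trainAndScore(trainIdx, valIdx);
  });
  return scores.reduce((s, v) => s + v, 0) / k;
}

// Toy usage: the "model" predicts the mean of the training targets and
// the score is the mean squared error on the held-out fold.
const targets = [1, 2, 3, 4, 5, 6];
const cvError = crossValidate(targets.length, 3, (trainIdx, valIdx) => {
  const mean = trainIdx.reduce((s, i) => s + targets[i], 0) / trainIdx.length;
  return valIdx.reduce((s, i) => s + (targets[i] - mean) ** 2, 0) / valIdx.length;
});
console.log(cvError); // 3.75
```

Because every sample is held out exactly once, the averaged score
estimates how the model behaves on data it was not trained on.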
Underfitting:
Underfitting occurs when our machine learning model is not able to
capture the underlying trend of the data. For example, when the feeding
of training data is stopped at an early stage in order to avoid
overfitting, the model may not learn enough from the training data. As a
result, it may fail to find the best fit of the dominant trend in the
data.
In the case of underfitting, the model is not able to learn enough from
the training data, and hence its accuracy is reduced and its predictions
are unreliable.
An underfitted model has high bias and low variance.
Example: Underfitting can be understood from the output of a linear
regression model in which the fitted line is unable to capture the data
points present in the plot.
How to avoid underfitting:
• By increasing the training time of the model.
• By increasing the number of features.

3. Explain about Measuring Accuracy Using Cross-Validation?


Ans:
Random subspaces:
The random subspace method is a technique used to introduce variation
among the predictors in an ensemble model. This is done because
decreasing the correlation between the predictors increases the
performance of the ensemble model. The random subspace method is also
known as feature or attribute bagging. It creates subsets of the
training set that contain only certain features. The chosen number of
features is randomly sampled from the training set with replacement.
However, most implementations allow the user to specify whether
features are sampled with or without replacement. These subsets are
then used to train the predictors of an ensemble.
Random patches method:
When the random subspace method is used along with bagging or
pasting, it is known as the random patches method. The random
subspaces and random patches methods are closely related in purpose to
bagging and pasting.

4. Explain about Linear Neurons and Their Limitations.


Ans:
Linear Neurons and Their Limitations:
Most neuron types are defined by the function f they apply to their logit
z. Let’s first consider layers of neurons that use a linear function in the
form of f(z) = az + b. For example, a neuron that attempts to estimate
the cost of a meal in a fast-food restaurant would use a linear neuron
where a = 1 and b = 0. In other words, using f(z) = z and weights
equal to the price of each item, the linear neuron in Figure 1-10 would
take in some ordered triple of servings of burgers, fries, and sodas and
output the price of the combination.

Figure 1-10. An example of a linear neuron
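The meal-price neuron can be written out directly. The prices and the
order below are made-up values for illustration; only the form
f(z) = az + b with a = 1 and b = 0 comes from the text.

```javascript
// Linear neuron: logit z = w . x, output f(z) = a * z + b.
function linearNeuron(weights, inputs, a = 1, b = 0) {
  const z = weights.reduce((sum, w, i) => sum + w * inputs[i], 0);
  return a * z + b;
}

// Hypothetical prices in cents for a burger, fries, and a soda.
const prices = [350, 200, 150];
// Order: 2 burgers, 1 fries, 3 sodas.
const servings = [2, 1, 3];

console.log(linearNeuron(prices, servings)); // 1350
```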


Linear neurons are easy to compute with, but they run into serious
limitations. In fact, it can be shown that any feed-forward neural
network consisting of only linear neurons can be expressed as a network
with no hidden layers. This is problematic because, as we discussed
before, hidden layers are what enable us to learn important features from
the input data. In other words, in order to learn complex relationships,
we need to use neurons that employ some sort of nonlinearity.
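A short derivation shows why stacked linear layers collapse. Assume, as
notation introduced here for illustration, a hidden layer with weight
matrix W_1 and bias b_1 followed by an output layer with weight matrix
W_2 and bias b_2:

```latex
\begin{aligned}
\mathbf{h} &= W_1 \mathbf{x} + \mathbf{b}_1 \\
\mathbf{y} &= W_2 \mathbf{h} + \mathbf{b}_2
            = (W_2 W_1)\,\mathbf{x} + (W_2 \mathbf{b}_1 + \mathbf{b}_2)
            = W' \mathbf{x} + \mathbf{b}'
\end{aligned}
```

The result is again a single linear layer with weights W' = W_2 W_1 and
bias b' = W_2 b_1 + b_2, so the hidden layer adds no expressive power.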
Expressing Linear Perceptrons as Neurons:
In “The Mechanics of Machine Learning”, we talked about using
machine learning models to capture the relationship between success on
exams and time spent studying and sleeping. To tackle this problem, we
constructed a linear perceptron classifier that divided the Cartesian
coordinate plane into two halves:
h(\mathbf{x}, \theta) = \begin{cases} -1 & \text{if } 3x_1 + 4x_2 - 24 < 0 \\ 1 & \text{if } 3x_1 + 4x_2 - 24 \geq 0 \end{cases}
As shown in Figure 1-4, this is an optimal choice for θ because it
correctly classifies every sample in our dataset. Here, we show that our
model h is easily expressed using a neuron. Consider the neuron depicted
in Figure 1-8. The neuron has two inputs, a bias, and uses the function:
f(z) = \begin{cases} -1 & \text{if } z < 0 \\ 1 & \text{if } z \geq 0 \end{cases}
It’s very easy to show that our linear perceptron and the neuronal model
are perfectly equivalent. And in general, it’s quite simple to show that
single neurons are strictly more expressive than linear perceptrons. In
other words, every linear perceptron can be expressed as a single neuron,
but single neurons can also express models that cannot be expressed by
any linear perceptron.
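The equivalence can be checked numerically with a small sketch. The
weights 3 and 4 and the bias -24 come from the classifier above; the
function names and the test points are illustrative.

```javascript
// Step activation from the text: -1 for negative logits, 1 otherwise.
function step(z) {
  return z < 0 ? -1 : 1;
}

// Neuron with two inputs and a bias that reproduces h(x, theta).
function perceptronNeuron(x1, x2) {
  const weights = [3, 4];
  const bias = -24;
  const z = weights[0] * x1 + weights[1] * x2 + bias;
  return step(z);
}

console.log(perceptronNeuron(2, 2)); // 3*2 + 4*2 - 24 = -10 -> -1
console.log(perceptronNeuron(4, 3)); // 3*4 + 4*3 - 24 = 0   -> 1
console.log(perceptronNeuron(6, 5)); // 3*6 + 4*5 - 24 = 14  -> 1
```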

5. Explain about Backpropagation Algorithm?


Ans:
We can define the backpropagation algorithm as an algorithm that trains
a given feed-forward neural network on input patterns whose
classifications are known to us. As each entry of the sample set is
presented to the network, the network examines its output response to
the sample input pattern. The output response is then compared with the
expected output, and the error value is measured. Finally, the
connection weights are adjusted based on the measured error value.
Before we dive deep into backpropagation, we should be aware of who
introduced this concept and when. It was first introduced in the 1960s
and was later popularized by David Rumelhart, Geoffrey Hinton, and
Ronald Williams in their famous 1986 paper. In this paper, they
discussed various neural networks. Today, backpropagation is the
standard way in which neural networks are trained. With this approach,
we fine-tune the weights of a neural net based on the error rate
obtained in the previous run. Applying this technique correctly reduces
error rates and makes the model more reliable. Backpropagation computes
the required gradients using the chain rule. In simple terms, after
each feed-forward pass through the network, the algorithm performs a
backward pass to adjust the model’s parameters (its weights and
biases). A typical supervised learning algorithm attempts to find a
function that maps input data to the right output. Backpropagation
works with a multi-layered neural network and learns internal
representations of the input-to-output mapping.
How does backpropagation work?
Let us take a look at how backpropagation works. The example network
has four layers: an input layer, a first hidden layer, a second hidden
layer, and a final output layer.
So, the three main types of layer are:
1. Input layer
2. Hidden layer
3. Output layer
Each layer has its own role in producing the desired results. Let us
discuss the other details needed to summarize this algorithm.

The following steps summarize the functioning of the backpropagation
approach.
1. Input layer receives x
2. Input is modeled using weights w
3. Each hidden layer calculates the output and data is ready at the
output layer
4. Difference between actual output and desired output is known as
the error
5. Go back to the hidden layers and adjust the weights so that this
error is reduced in future runs
This process is repeated till we get the desired output. The training phase
is done with supervision. Once the model is stable, it is used in
production.
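The five steps can be traced in a tiny network with one input, one
sigmoid hidden neuron, and one linear output neuron. This is a minimal
sketch under stated assumptions: the initial parameter values, the
single training example, the squared-error loss, and the learning rate
are all chosen for illustration.

```javascript
const sigmoid = (z) => 1 / (1 + Math.exp(-z));

// Tiny network: x -> hidden (sigmoid) -> output (linear).
const params = { w1: 0.5, b1: 0.0, w2: 0.5, b2: 0.0 };
const x = 1.0;      // step 1: input received by the input layer
const target = 1.0; // desired output (the known label)
const learningRate = 0.5;

// Steps 2-3: forward pass through the weights to the output layer.
function forward(p, input) {
  const h = sigmoid(p.w1 * input + p.b1);
  const y = p.w2 * h + p.b2;
  return { h, y };
}

// Step 4: the error, measured here as half the squared difference.
const loss = (y, t) => 0.5 * (y - t) ** 2;
const initialLoss = loss(forward(params, x).y, target);

for (let epoch = 0; epoch < 100; epoch++) {
  const { h, y } = forward(params, x);
  // Step 5: backward pass, applying the chain rule layer by layer.
  const dy = y - target;                    // dLoss/dy
  const dw2 = dy * h;                       // dLoss/dw2
  const db2 = dy;                           // dLoss/db2
  const dz1 = dy * params.w2 * h * (1 - h); // dLoss/d(hidden logit)
  const dw1 = dz1 * x;                      // dLoss/dw1
  const db1 = dz1;                          // dLoss/db1
  // Adjust the weights so that the error is smaller on the next run.
  params.w1 -= learningRate * dw1;
  params.b1 -= learningRate * db1;
  params.w2 -= learningRate * dw2;
  params.b2 -= learningRate * db2;
}

const finalLoss = loss(forward(params, x).y, target);
console.log(initialLoss, finalLoss);
```

The printed loss after training should be far smaller than the initial
one, which is the "error is reduced in future runs" behaviour described
in step 5.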

6. Explain the TensorFlow Operations.


Ans:
TensorFlow:
TensorFlow is an open-source library developed by Google primarily for
deep learning applications. It also supports traditional machine learning.
TensorFlow was originally developed for large numerical computations
without keeping deep learning in mind. However, it proved to be very
useful for deep learning development as well, and therefore Google
open-sourced it.
TensorFlow accepts data in the form of multi-dimensional arrays called
tensors. Multi-dimensional arrays are very handy in handling large
amounts of data.
TensorFlow works on the basis of data flow graphs that have nodes and
edges. As the execution mechanism is in the form of graphs, it is much
easier to execute TensorFlow code in a distributed manner across a
cluster of computers while using GPUs.
Operations in TensorFlow (the examples below use the TensorFlow.js API,
where tf is the imported library object):
• Add
• Subtract
• Multiply
• Divide
• Square
• Reshape
Tensor Addition:
You can add two tensors using tensorA.add(tensorB):
Example:
const tensorA = tf.tensor([[1, 2], [3, 4], [5, 6]]);
const tensorB = tf.tensor([[1,-1], [2,-2], [3,-3]]);

// Tensor Addition
const tensorNew = tensorA.add(tensorB);

// Result: [ [2, 1], [5, 2], [8, 3] ]


Tensor Subtraction:
You can subtract two tensors using tensorA.sub(tensorB):
Example:
const tensorA = tf.tensor([[1, 2], [3, 4], [5, 6]]);
const tensorB = tf.tensor([[1,-1], [2,-2], [3,-3]]);

// Tensor Subtraction
const tensorNew = tensorA.sub(tensorB);

// Result: [ [0, 3], [1, 6], [2, 9] ]


Tensor Multiplication:
You can multiply two tensors element-wise using tensorA.mul(tensorB):
Example:
const tensorA = tf.tensor([1, 2, 3, 4]);
const tensorB = tf.tensor([4, 4, 2, 2]);

// Tensor Multiplication
const tensorNew = tensorA.mul(tensorB);

// Result: [ 4, 8, 6, 8 ]
Tensor Division:
You can divide two tensors using tensorA.div(tensorB):
Example:
const tensorA = tf.tensor([[1, 2], [3, 4], [5, 6]]);
const tensorB = tf.tensor([[1,-1], [2,-2], [3,-3]]);

// Tensor Division
const tensorNew = tensorA.div(tensorB);

// Result: [ [1, -2], [1.5, -2], [1.6667, -2] ] (values rounded)
Tensor Square:
You can square a tensor using tensor.square():
Example:
const tensorA = tf.tensor([1, 2, 3, 4]);

// Tensor Square
const tensorNew = tensorA.square();

// Result: [ 1, 4, 9, 16 ]
Tensor Reshape:
The number of elements in a tensor is the product of the sizes in the
shape.
Since there can be different shapes with the same size, it is often useful
to reshape a tensor to other shapes with the same size.
You can reshape a tensor using tensor.reshape():
Example:
const tensorA = tf.tensor([[1, 2], [3, 4]]);
const tensorB = tensorA.reshape([4, 1]);

// Result: [ [1], [2], [3], [4] ]

7. Testing and validating?


Ans:
Testing Set:
This dataset is independent of the training set but has a similar
probability distribution of classes. It is used as a benchmark to
evaluate the model, and only after the training of the model is
complete. The testing set is usually a properly organized dataset
containing all kinds of data for the scenarios that the model would
probably face when used in the real world. Often the validation set and
the testing set are combined and used as a single testing set, which is
not considered good practice. If the accuracy of the model on the
training data is substantially greater than that on the testing data,
the model is said to be overfitting. This set is approximately 20-25%
of the total data available for the project.
Validation Set:
The validation set is used to fine-tune the hyperparameters of the model
and is considered a part of the training of the model. The model only
sees this data for evaluation and does not learn its weights from it,
which gives a more objective evaluation of the model during training.
The validation dataset can also be used for regularization through
early stopping, i.e. interrupting the training of the model when the
loss on the validation dataset becomes greater than the loss on the
training dataset and keeps growing, which limits overfitting. This set
is approximately 10-15% of the total data available for the project,
but this can change depending on the number of hyperparameters: if the
model has many hyperparameters, using a larger validation set will give
better results. Whenever the accuracy of the model on the validation
data is close to or greater than that on the training data, the model
is said to have generalized well.
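As a rough sketch of the proportions mentioned above, the following
function splits a dataset into training, validation, and test parts.
The function name `splitDataset` and the 65/15/20 split are assumptions
picked from within the quoted ranges; real data would usually be
shuffled first.

```javascript
// Split an array into training, validation, and test parts by fraction.
function splitDataset(samples, validationFraction = 0.15, testFraction = 0.2) {
  const nTest = Math.round(samples.length * testFraction);
  const nValidation = Math.round(samples.length * validationFraction);
  const nTrain = samples.length - nValidation - nTest;
  return {
    train: samples.slice(0, nTrain),
    validation: samples.slice(nTrain, nTrain + nValidation),
    test: samples.slice(nTrain + nValidation),
  };
}

// 100 placeholder samples, identified only by their index.
const samples = Array.from({ length: 100 }, (_, i) => i);
const parts = splitDataset(samples);
console.log(parts.train.length, parts.validation.length, parts.test.length);
// 65 15 20
```

The three parts never overlap, so the test set stays untouched until
training and hyperparameter tuning are finished.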

8. Error Analysis?
Ans:
Error analysis is the process of isolating, observing, and diagnosing
erroneous ML predictions, thereby helping to understand pockets of high
and low performance of the model. When it is said that “the model
accuracy is 90%”, that accuracy might not be uniform across subgroups of
the data, and there might be some input conditions under which the
model fails more often. Error analysis is therefore the next step from
aggregate metrics to a more in-depth review of model errors for
improvement.
For example, a dog detection image recognition model might do better
for dogs in an outdoor setting than in low-lit indoor settings. This
might be due to skewed datasets, and error analysis helps identify
whether such cases impact model performance. Moving from aggregate to
group-wise errors provides a better picture of model performance.
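The move from aggregate to group-wise metrics can be sketched as
follows. The evaluation records, the `setting` field, and the function
name `accuracyByGroup` are invented for this illustration.

```javascript
// Hypothetical evaluation log: one record per prediction, tagged by setting.
const results = [
  ...Array(6).fill({ setting: "outdoor", correct: true }),
  ...Array(2).fill({ setting: "indoor-low-light", correct: true }),
  ...Array(2).fill({ setting: "indoor-low-light", correct: false }),
];

// Compute accuracy separately for each value of the grouping key.
function accuracyByGroup(rows, key) {
  const totals = {};
  for (const row of rows) {
    const group = row[key];
    totals[group] = totals[group] || { correct: 0, count: 0 };
    totals[group].count += 1;
    if (row.correct) totals[group].correct += 1;
  }
  const accuracy = {};
  for (const [group, t] of Object.entries(totals)) {
    accuracy[group] = t.correct / t.count;
  }
  return accuracy;
}

const overall = results.filter((r) => r.correct).length / results.length;
const perGroup = accuracyByGroup(results, "setting");
console.log(overall);  // 0.8
console.log(perGroup); // { outdoor: 1, 'indoor-low-light': 0.5 }
```

An aggregate accuracy of 80% hides the fact that the model is right
only half of the time in the low-light indoor subgroup.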
9. Limits of Traditional Computer Programs?
Ans:
Traditional Programming:
Traditional programming is a manual process, meaning a person (the
programmer) creates the program. The logic does not arise on its own:
someone has to manually formulate and code the rules.

In machine learning, on the other hand, the algorithm automatically
formulates the rules from the data.

10. Challenges of Machine Learning?


Ans:
7 Major Challenges Faced By Machine Learning Professionals
1. Poor Quality of Data:
Data plays a significant role in the machine learning process. One of the
significant issues that machine learning professionals face is the absence
of good quality data. Unclean and noisy data can make the whole
process extremely exhausting.
2. Underfitting of Training Data:
Underfitting occurs when the model is unable to establish an accurate
relationship between input and output variables. It is like trying to
fit into undersized jeans. It signifies that the model is too simple to
establish a precise relationship. To overcome this issue:
• Maximize the training time
• Enhance the complexity of the model
• Add more features to the data
• Reduce regularization
3. Overfitting of Training Data:
Overfitting refers to a machine learning model that has fitted its
training data, including noisy and biased data, so closely that its
performance on new data suffers. It is like trying to fit into oversized
jeans. Unfortunately, this is one of the significant issues faced by
machine learning professionals.
We can tackle this issue by:
• Analyzing the data with the utmost care
• Using data augmentation techniques
• Removing outliers from the training set
• Selecting a model with fewer features
4. Machine Learning is a Complex Process:
The machine learning industry is young and is continuously changing.
Rapid trial-and-error experiments are being carried out. The process is
still transforming, and hence there is a high chance of error, which
makes the learning complex.
5. Lack of Training Data:
The most important task in the machine learning process is to train the
model on enough data to achieve an accurate output. Too little training
data will produce inaccurate or heavily biased predictions. As an
analogy, training a machine learning algorithm is similar to training a
child.
6. Slow Implementation:
This is one of the common issues faced by machine learning
professionals. Machine learning models can be highly effective at
providing accurate results, but doing so can take a tremendous amount
of time. Slow programs, data overload, and excessive requirements all
add to the time needed to provide accurate results.
7. Imperfections in the Algorithm When Data Grows:
So you have found quality data, trained your model well on it, and the
predictions are really precise and accurate. Yay, you have learned how
to create a machine learning algorithm! But wait, there is a twist: the
model may become useless in the future as the data grows.
11. Gradient Boosting?
Ans:
Gradient Boosting is a popular boosting algorithm. In gradient
boosting, each predictor corrects its predecessor’s error. In contrast to
AdaBoost, the weights of the training instances are not tweaked; instead,
each predictor is trained using the residual errors of its predecessor
as labels.
Gradient Boosted Trees is a technique of this kind whose base
learner is CART (Classification and Regression Trees).
Gradient Boosted Trees for Regression
Gradient boosted trees are trained for regression problems as follows.
The ensemble consists of N trees. Tree1 is trained using the feature
matrix X and the labels y. The predictions labelled y1(hat) are used to
determine the training set residual errors r1. Tree2 is then trained using
the feature matrix X and the residual errors r1 of Tree1 as labels. The
predicted results r1(hat) are then used to determine the residual r2. The
process is repeated until all the N trees forming the ensemble are trained.
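The residual-fitting procedure can be sketched with very small trees.
As assumptions for this example, depth-1 regression trees (stumps) on a
single feature stand in for CART, no learning-rate shrinkage is applied,
and the data values are made up.

```javascript
// Fit a depth-1 regression tree (a stump) on 1-D inputs by least squares.
function fitStump(xs, ys) {
  const mean = (v) => v.reduce((s, a) => s + a, 0) / v.length;
  let best = null;
  for (const threshold of xs) {
    const left = ys.filter((_, i) => xs[i] <= threshold);
    const right = ys.filter((_, i) => xs[i] > threshold);
    if (right.length === 0) continue; // skip splits with an empty side
    const lm = mean(left);
    const rm = mean(right);
    const error = ys.reduce(
      (s, y, i) => s + (y - (xs[i] <= threshold ? lm : rm)) ** 2, 0);
    if (best === null || error < best.error) best = { threshold, lm, rm, error };
  }
  return (x) => (x <= best.threshold ? best.lm : best.rm);
}

// Train N stumps: the first on y, each later one on the current residuals.
function gradientBoost(xs, ys, nTrees) {
  const trees = [];
  let residuals = [...ys];
  for (let t = 0; t < nTrees; t++) {
    const tree = fitStump(xs, residuals);
    trees.push(tree);
    residuals = residuals.map((r, i) => r - tree(xs[i]));
  }
  // The ensemble prediction is the sum of all tree predictions.
  return (x) => trees.reduce((sum, tree) => sum + tree(x), 0);
}

const xs = [1, 2, 3, 4, 5, 6];
const ys = [1, 1.5, 3, 3.2, 6, 6.5];
const mse = (model) =>
  ys.reduce((s, y, i) => s + (y - model(xs[i])) ** 2, 0) / ys.length;

const oneTree = gradientBoost(xs, ys, 1);
const fiveTrees = gradientBoost(xs, ys, 5);
console.log(mse(oneTree), mse(fiveTrees));
```

The training error of the five-tree ensemble is lower than that of the
single tree because each added tree fits what its predecessors left
unexplained.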
12. Delta Rule?
Ans:
While the delta rule is similar to the perceptron’s update rule, the
derivation is different. The perceptron uses the Heaviside step function
as the activation function g(h) and that means that g’(h) does not exist at
zero, and is equal to zero elsewhere, which makes the direct application
of the delta rule impossible.
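For reference, the two update rules can be written side by side. The
symbols below are notation assumed for this sketch: η is the learning
rate, t the target, y the neuron's output, x_i the i-th input, and w_i
the corresponding weight.

```latex
\begin{aligned}
E &= \tfrac{1}{2}\,(t - y)^2, \qquad y = g(h), \qquad h = \sum_i w_i x_i \\
\Delta w_i &= -\eta\,\frac{\partial E}{\partial w_i}
            = \eta\,(t - y)\,g'(h)\,x_i
            && \text{(delta rule)} \\
\Delta w_i &= \eta\,(t - y)\,x_i
            && \text{(perceptron update rule)}
\end{aligned}
```

The delta rule needs the derivative g'(h), which is why it applies to
differentiable activations, while the perceptron rule has the same form
without that factor.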
