CS3491 Artificial Intelligence and Machine Learning
Question Bank
Vision of Institution
To build Jeppiaar Engineering College as an Institution of Academic Excellence in Technical
education and Management education and to become a World Class University.
Mission of Institution
M3 To equip students with values, ethics and life skills needed to enrich their lives and
enable them to meaningfully contribute to the progress of society
M4 To prepare students for higher studies and lifelong learning, enrich them with the
practical and entrepreneurial skills necessary to excel as future professionals and
contribute to the Nation’s economy
PO12 Life-long learning: Recognize the need for, and have the preparation and ability to
engage in independent and life-long learning in the broadest context of technological
change.
Vision of Department
To emerge as a globally prominent department, developing ethical computer professionals,
innovators and entrepreneurs with academic excellence through quality education and research.
Mission of Department
M1 To create computer professionals with an ability to identify and formulate the
engineering problems and also to provide innovative solutions through effective
teaching learning process.
M2 To strengthen the core-competence in computer science and engineering and to create
an ability to interact effectively with industries.
M3 To produce engineers with good professional skills, ethical values and life skills for the
betterment of the society.
PSO2 To interpret real-time problems with analytical skills and to arrive at cost effective and
optimal solution using advanced tools and techniques.
SYLLABUS
PRACTICAL EXERCISES:
1. Implementation of Uninformed search algorithms (BFS, DFS)
2. Implementation of Informed search algorithms (A*, memory-bounded A*)
COURSE OUTCOMES:
At the end of this course, the students will be able to:
CO1: Use appropriate search algorithms for problem solving
CO2: Apply reasoning under uncertainty
CO3: Build supervised learning models
CO4: Build ensembling and unsupervised models
CO5: Build deep learning neural network models
TEXT BOOKS:
1. Stuart Russell and Peter Norvig, “Artificial Intelligence – A Modern Approach”, Fourth
Edition, Pearson Education, 2021.
2. Ethem Alpaydin, “Introduction to Machine Learning”, MIT Press, Fourth Edition, 2020.
REFERENCES:
1. Dan W. Patterson, “Introduction to Artificial Intelligence and Expert Systems”, Pearson
Education, 2007
2. Kevin Knight, Elaine Rich, and Nair B., “Artificial Intelligence”, McGraw Hill, 2008
3. Patrick H. Winston, "Artificial Intelligence", Third Edition, Pearson Education, 2006
4. Deepak Khemani, “Artificial Intelligence”, Tata McGraw Hill Education, 2013
(http://nptel.ac.in/)
5. Christopher M. Bishop, “Pattern Recognition and Machine Learning”, Springer, 2006.
6. Tom Mitchell, “Machine Learning”, McGraw Hill, 3rd Edition, 1997.
7. Charu C. Aggarwal, “Data Classification Algorithms and Applications”, CRC Press, 2014
8. Mehryar Mohri, Afshin Rostamizadeh, Ameet Talwalkar, “Foundations of Machine
Learning”, MIT Press, 2012.
9. Ian Goodfellow, Yoshua Bengio, Aaron Courville, “Deep Learning”, MIT Press, 2016
Part A
2. What is an agent?
An agent is anything that can be viewed as perceiving its environment through
sensors and acting upon that environment through actuators.
7. Are reflex actions (such as flinching from a hot stove) rational? Are they
intelligent?
Reflex actions can be considered rational. Flinching from a hot stove keeps the body
out of danger, and a slower, deliberated response would tend to cause more damage;
such reflexes can be seen as the result of evolutionary adaptation.
Reflex actions are not necessarily intelligent, however. Intelligence suggests that
there is reasoning and logic involved in the action itself, whereas a reflex is
triggered without deliberation.
Automatic Assembly
Internet searching
Toy problem Examples:
8-queens problem
8-puzzle problem
Vacuum world problem
Depth-limited search
Iterative deepening depth-first search
Bidirectional search
21. What is the power of heuristic search? (or) Why does one go for
heuristic search?
Heuristic search uses problem-specific knowledge while searching the state
space, which improves average search performance. It relies on evaluation
functions that estimate the relative desirability (goodness) of expanding a node, so
the most promising nodes are explored first. This makes the search more efficient and
faster. One should go for heuristic search because it can solve large, hard problems
in affordable time.
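As a minimal sketch of how an evaluation function guides the search, the Python example below runs the same best-first search twice on a small grid: once with the Manhattan distance as the heuristic (A*-style) and once with no heuristic at all. The grid, the function names and the unit-cost, 4-connected moves are assumptions made for illustration only.

```python
import heapq

def manhattan(a, b):
    """Manhattan distance between two (row, col) cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def zero(a, b):
    """No heuristic knowledge: the search becomes uniform-cost search."""
    return 0

def best_first(grid, start, goal, heuristic):
    """Return (path_cost, nodes_expanded), ordering the frontier by f = g + h.

    grid is a list of equal-length strings where '#' is a wall.
    Moves are up/down/left/right and cost 1 each (assumed for this sketch).
    """
    rows, cols = len(grid), len(grid[0])
    h0 = heuristic(start, goal)
    frontier = [(h0, h0, 0, start)]           # entries are (f, h, g, cell)
    best_g = {start: 0}
    expanded = 0
    while frontier:
        _, _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            return g, expanded
        if g > best_g[cell]:
            continue                           # stale entry: a cheaper route exists
        expanded += 1
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == "#":
                continue
            new_g = g + 1
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                h = heuristic(nxt, goal)
                heapq.heappush(frontier, (new_g + h, h, new_g, nxt))
    return None, expanded

# Hypothetical 3x4 grid with a two-cell wall in the middle row.
GRID = ["....",
        ".##.",
        "...."]

cost_informed, expanded_informed = best_first(GRID, (0, 0), (2, 3), manhattan)
cost_blind, expanded_blind = best_first(GRID, (0, 0), (2, 3), zero)
```

Both runs return the same path cost, but the informed run expands fewer nodes before reaching the goal, which is the efficiency gain the answer describes.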
23. Why does hill climbing often get stuck?
A local maximum is a state where the hill climbing algorithm is sure to get stuck.
A local maximum is a peak that is higher than each of its neighbouring states but lower
than the global maximum. Every move from it looks worse, so the search stops there and
misses the better state. The search effort spent so far is wasted; it is like a dead end.
25. What do you mean by local maxima with respect to search technique?
A local maximum is a peak that is higher than each of its neighbouring states but
lower than the global maximum, i.e. a local maximum is a tiny hill on the surface
whose peak is not as high as the main peak (which is the optimal solution). Hill
climbing fails to find the optimum solution when it encounters a local maximum, because
any small move from there makes things worse (at least temporarily). At a local maximum
the search effort spent so far is wasted; it is like a dead end.
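The behaviour described above can be sketched in a few lines of Python. The one-dimensional landscape and the function name are hypothetical and chosen only so that one start point gets trapped on the small hill while another reaches the main peak.

```python
def hill_climb(values, start):
    """Steepest-ascent hill climbing over a 1-D list of heights.

    Moves to the higher neighbour until no neighbour is higher, then
    returns the index where it stopped. Assumes len(values) >= 2.
    """
    current = start
    while True:
        neighbours = [i for i in (current - 1, current + 1) if 0 <= i < len(values)]
        best = max(neighbours, key=lambda i: values[i])
        if values[best] <= values[current]:
            return current                    # no uphill move: a peak, maybe only local
        current = best

# Hypothetical landscape: a small hill at index 2, the main peak at index 7.
LANDSCAPE = [1, 3, 5, 4, 2, 6, 8, 9, 7]

stuck_at = hill_climb(LANDSCAPE, start=0)     # climbs 1 -> 3 -> 5 and stops
found_at = hill_climb(LANDSCAPE, start=4)     # climbs 2 -> 6 -> 8 -> 9
```

Starting from index 0 the search stops on the small hill, because both neighbours are lower even though a higher peak exists further right.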
Ridges and plateaus in hill climbing can be handled with methods such as
backtracking and making big jumps. Backtracking and making big jumps help to escape
plateaus, whereas the application of multiple rules helps to avoid the problem of ridges.
In a game of chance, we can add an extra level of chance nodes to the game search tree.
The successors of a chance node are the possible outcomes of the random element, and
each outcome di has a probability P(di) attached to it. The minimax algorithm is extended
to take the probability-weighted average of the successor values at a chance node. The
successor function S(N, di) gives the moves from position N for outcome di.
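The chance-node idea can be sketched as follows. This is a minimal illustration, assuming a game tree encoded as nested tuples; the encoding, the example tree and the function name are not from the original text.

```python
def expectiminimax(node):
    """Value of a game tree with MAX, MIN and chance nodes.

    A node is a number (leaf utility), ("max", [children]),
    ("min", [children]) or ("chance", [(probability, child), ...]).
    """
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(child) for child in children)
    if kind == "min":
        return min(expectiminimax(child) for child in children)
    if kind == "chance":
        # Probability-weighted average over the outcomes of the random element.
        return sum(p * expectiminimax(child) for p, child in children)
    raise ValueError(f"unknown node kind: {kind!r}")

# Hypothetical tree: MAX picks a move, a fair coin is flipped, then MIN replies.
TREE = ("max", [
    ("chance", [(0.5, ("min", [2, 4])), (0.5, ("min", [6, 8]))]),   # 0.5*2 + 0.5*6
    ("chance", [(0.5, ("min", [0, 10])), (0.5, ("min", [5, 7]))]),  # 0.5*0 + 0.5*5
])

value = expectiminimax(TREE)
```

In this example MAX prefers the first move, whose expected value (4.0) is higher than that of the second (2.5).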
Part B
1. State the classical “Water Jug Problem”. Describe the state space for
this problem and also give the solution.
2. How to define a problem as state space search? Discuss it with the help of
an example
3. Solve the given problem. Describe the operators involved in it.
Consider a Water Jug Problem : You are given two jugs, a 4-gallon one and
a 3-gallon one. Neither has any measuring markers on it. There is a pump
that can be used to fill the jugs with water. How can you get exactly 2 gallons
of water into the 4-gallon jug ? Explicit Assumptions: A jug can be filled
from the pump, water can be poured out of a jug onto the ground, water can
be poured from one jug to another and that there are no other measuring
devices available.
4. Define the following problems. What type of control strategy is used in
each of them?
i. The Tower of Hanoi
ii. Crypt-arithmetic
iii. The Missionaries and Cannibals problem
iv. 8-puzzle problem
5. Discuss uninformed search methods with examples.
6. Give an example of a problem for which breadth first search would
work better than depth first search.
7. Explain the algorithm for steepest hill climbing
8. Explain the A* search and give the proof of optimality of A*
9. Explain the AO* algorithm with a suitable example. State the limitations of the
algorithm.
10. Explain the nature of heuristics with an example. What is the effect of
heuristic accuracy?
11. Explain the various types of hill climbing search techniques.
12. Discuss the constraint satisfaction problem with an algorithm for solving
a crypt-arithmetic problem.
13. Solve the following crypt-arithmetic problem using the constraint
satisfaction search procedure.
CROSS
+ROADS
------------
DANGER
----------------
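For question 13 above, an answer can be checked with a short Python sketch. It is not the constraint satisfaction search procedure the question asks for: it is a brute-force enumeration that applies only two constraints up front (D = 1, because the sum of two five-digit numbers has a leading carry of 1, and the units column S + S ending in R). The function names are assumptions made for illustration.

```python
from itertools import permutations

def word_value(word, digits):
    """Read a word as a decimal number under a letter-to-digit mapping."""
    value = 0
    for letter in word:
        value = value * 10 + digits[letter]
    return value

def solve_cross_roads():
    """Return all digit assignments that satisfy CROSS + ROADS = DANGER."""
    solutions = []
    letters = "CROSANGE"                      # every letter except D
    candidates = [d for d in range(10) if d != 1]
    for perm in permutations(candidates, len(letters)):
        digits = dict(zip(letters, perm))
        digits["D"] = 1                       # leading carry of the sum
        if digits["C"] == 0 or digits["R"] == 0:
            continue                          # no leading zeros
        if (2 * digits["S"]) % 10 != digits["R"]:
            continue                          # units column: S + S ends in R
        total = word_value("CROSS", digits) + word_value("ROADS", digits)
        if total == word_value("DANGER", digits):
            solutions.append(digits)
    return solutions

solutions = solve_cross_roads()
```

One assignment this finds is 96233 + 62513 = 158746, i.e. C=9, R=6, O=2, S=3, A=5, D=1, N=8, G=7, E=4.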
Part A
1. Why does uncertainty arise?
Agents almost never have access to the whole truth about their environment.
Uncertainty arises because of both laziness and ignorance. It is inescapable in
complex, nondeterministic, or partially observable environments, where
agents cannot find a categorical answer.
Uncertainty can also arise because of incompleteness or incorrectness in the
agent's understanding of the properties of the environment.
2. Differentiate uncertainty from ignorance.
A key condition that differentiates ignorance from uncertainty is the absence
of knowledge about the factors that influence the issues.
P(A|B) = [P(A) * P(B|A)] / P(B)
0.4 = (0.3 * P(B|A)) / 0.5
P(B|A) = (0.4 * 0.5) / 0.3 = 0.67 (approximately)
Part B
Part A
Supervised Learning
Unsupervised Learning
Semi-supervised Learning
Reinforcement Learning
Transduction
Learning to Learn
8. What are the three stages to build the hypotheses or model in machine
learning?
Model building
Model testing
Applying the model
11. What is the difference between artificial intelligence and machine learning?
Machine learning is the design and development of algorithms whose behaviour is
learned from empirical data. Artificial intelligence includes machine learning
and also covers other aspects such as knowledge
representation, natural language processing, planning, robotics etc.
13. What is the main key difference between supervised and unsupervised
machine learning?
Supervised learning: The supervised learning technique needs labelled data to train
the model. For example, to solve a classification problem (a supervised learning task),
you need to have labelled data to train the model and to classify the data into your
labelled groups.
Unsupervised learning: Unsupervised learning does not need any labelled dataset. This
is the main key difference between supervised learning and unsupervised learning.
17. What is the difference between stochastic gradient descent (SGD) and
gradient descent (GD)?
Both algorithms are methods for finding a set of parameters that minimize a
loss function by evaluating parameters against data and then making adjustments.
In standard gradient descent, you'll evaluate all training samples for each set of
parameters. This is akin to taking big, slow steps toward the solution. In stochastic
gradient descent, you'll evaluate only 1 training sample for the set of parameters
before updating them. This is akin to taking small, quick steps toward the solution.
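The difference can be sketched on a toy problem. The example below is a minimal illustration, assuming a one-variable linear model fitted with mean squared error; the data, learning rates and function names are hypothetical.

```python
import random

# Hypothetical noise-free data drawn from y = 2x + 1.
XS = [0.0, 1.0, 2.0, 3.0, 4.0]
YS = [2.0 * x + 1.0 for x in XS]

def gradient(w, b, xs, ys):
    """Gradient of the mean squared error of y = w*x + b on the given samples."""
    n = len(xs)
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dw, db

def batch_gd(epochs=2000, lr=0.05):
    """Standard gradient descent: every update uses all training samples."""
    w = b = 0.0
    for _ in range(epochs):
        dw, db = gradient(w, b, XS, YS)
        w, b = w - lr * dw, b - lr * db
    return w, b

def sgd(epochs=2000, lr=0.02, seed=0):
    """Stochastic gradient descent: every update uses a single sample."""
    rng = random.Random(seed)
    w = b = 0.0
    order = list(range(len(XS)))
    for _ in range(epochs):
        rng.shuffle(order)                    # visit samples in random order
        for i in order:
            dw, db = gradient(w, b, [XS[i]], [YS[i]])
            w, b = w - lr * dw, b - lr * db
    return w, b

w_batch, b_batch = batch_gd()
w_sgd, b_sgd = sgd()
```

Both loops approach w = 2 and b = 1 here. The batch version makes one update per pass over the data, while the stochastic version makes one update per sample, so it takes five times as many (cheaper) steps per epoch in this example.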
19. What is the difference between least squares regression and multiple
regression?
The goal of multiple linear regression is to model the linear relationship
between the explanatory (independent) variables and response (dependent)
variables. In essence, multiple regression is the extension of ordinary least-squares
(OLS) regression because it involves more than one explanatory variable.
36. Do you think 50 small decision trees are better than a large one? Why?
Yes, because a random forest is an ensemble method that combines many weak
decision trees into a strong learner. Random forests are generally more accurate, more
robust, and less prone to overfitting than a single large tree.
37. You’ve built a random forest model with 10000 trees. You got delighted after
getting training error as 0.00. But, the validation error is 34.23. What is going
on? Haven’t you trained your model perfectly?
The model has overfitted. A training error of 0.00 means the classifier has
fitted the training data so closely that it has learned patterns which are
not present in unseen data. Hence, when this classifier was run on an unseen sample, it
couldn’t find those patterns and returned predictions with higher error. In a random
forest, this usually points to trees that are too complex (for example, grown to full
depth); adding more trees does not by itself normally increase overfitting. Hence, to
avoid this situation, we should tune hyperparameters such as the tree depth and the
number of trees using cross-validation.
38. When would you use random forests vs SVM and why?
There are several reasons why a random forest can be a better choice of
model than a support vector machine:
● Random forests allow you to determine the feature importance. SVMs
do not provide this directly.
● Random forests are much quicker and simpler to build than an SVM.
● For multi-class classification problems, SVMs require a one-vs-rest method,
which is less scalable and more memory intensive.
Part B
1. Assume a disease so rare that it is seen in only one person out of every
million. Assume also that we have a test that is effective in that if a person has
the disease, there is a 99 percent chance that the test result will be positive;
however, the test is not perfect, and there is a one in a thousand chance that
the test result will be positive on a healthy person. Assume that a new patient
arrives and the test result is positive. What is the probability that the patient
has the disease?
2. Explain Naïve Bayes Classifier with an Example.
3. Explain SVM Algorithm in Detail.
4. Explain Decision Tree Classification.
5. Explain the principle of the gradient descent algorithm. Accompany your
explanation with a diagram. Explain the use of all the terms and constants
that you introduce and comment on the range of values that they can take.
6. Explain the following
a) Linear regression
b) Logistic Regression
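For question 1 above (the rare disease test), the arithmetic can be checked with a short Python sketch of Bayes' rule. The function and parameter names are assumptions made for illustration; the three numbers are the ones given in the question.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(disease | positive test) by Bayes' rule.

    prior               = P(disease)
    sensitivity         = P(positive | disease)
    false_positive_rate = P(positive | healthy)
    """
    evidence = sensitivity * prior + false_positive_rate * (1.0 - prior)
    return sensitivity * prior / evidence

# One in a million has the disease; 99% sensitivity; 1 in 1000 false positives.
p_disease = posterior(prior=1e-6, sensitivity=0.99, false_positive_rate=1e-3)
```

The result is roughly 0.00099, i.e. about 0.1 percent: because the disease is so rare, most positive results come from healthy people.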
Part A
1. What is bagging and boosting in ensemble learning?
Bagging is a way to decrease the variance in the prediction by generating additional
data for training from dataset using combinations with repetitions to produce multi-sets of the
original data. Boosting is an iterative technique which adjusts the weight of an observation
based on the last classification.
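The two building blocks mentioned above can be sketched as follows: drawing a bootstrap sample (sampling with repetition) for bagging, and one AdaBoost-style reweighting round for boosting. AdaBoost is used here as one common boosting scheme; the function names and the toy inputs are assumptions made for illustration.

```python
import math
import random

def bootstrap_sample(data, rng):
    """Bagging: draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]

def reweight(weights, correct):
    """Boosting: one AdaBoost-style weight update.

    weights are the current observation weights (summing to 1) and
    correct[i] says whether the last classifier got observation i right.
    Assumes the weighted error lies strictly between 0 and 1.
    """
    error = sum(w for w, ok in zip(weights, correct) if not ok)
    alpha = 0.5 * math.log((1.0 - error) / error)
    raw = [w * math.exp(-alpha if ok else alpha) for w, ok in zip(weights, correct)]
    total = sum(raw)
    return [w / total for w in raw]           # renormalise to sum to 1

sample = bootstrap_sample([1, 2, 3, 4, 5], random.Random(0))
new_weights = reweight([0.25, 0.25, 0.25, 0.25], [True, True, True, False])
```

After the update the single misclassified observation carries half of the total weight, so the next classifier in the sequence is pushed to focus on it.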
8. What are Gaussian mixture models? How is expectation maximization used in them?
Expectation maximization provides an iterative solution to maximum
likelihood estimation with latent variables. Gaussian mixture models are an approach
to density estimation where the parameters of the distributions are fit using the
expectation-maximization algorithm.
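A minimal sketch of this fitting loop for a one-dimensional mixture is given below. The data, the initial means, the fixed number of iterations and the variance floor are assumptions made for illustration, not a production implementation.

```python
import math

def normal_pdf(x, mu, var):
    """Density of a 1-D Gaussian with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm_1d(data, initial_means, iterations=50):
    """Fit a 1-D Gaussian mixture with EM; returns (weights, means, variances)."""
    k = len(initial_means)
    means = list(initial_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            probs = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(probs)
            resp.append([p / total for p in probs])
        # M-step: re-estimate parameters from the responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)     # floor to avoid a collapsed component
    return weights, means, variances

# Hypothetical data: two well-separated groups around 1 and 5.
DATA = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]
weights, means, variances = em_gmm_1d(DATA, initial_means=[0.0, 6.0])
```

The cluster membership of each point is the latent variable: the E-step estimates it softly, and the M-step uses those estimates to update the mixture weights, means and variances.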
9. What is k-means unsupervised learning?
K-Means clustering is an unsupervised learning algorithm. There is no labelled
data for this clustering, unlike in supervised learning. K-Means divides the
objects into clusters such that objects in the same cluster are similar to each other
and dissimilar to the objects belonging to other clusters. 'K' is the number of
clusters, which is chosen in advance.
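A minimal sketch of the usual assign-then-update loop (Lloyd's algorithm) on one-dimensional points is shown below. The data, the initial centroids and the fixed iteration count are assumptions made for illustration.

```python
def k_means(points, centroids, iterations=10):
    """K-Means on 1-D points, starting from the given K initial centroids.

    Returns (centroids, clusters) after a fixed number of iterations.
    """
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its old centroid).
        centroids = [sum(c) / len(c) if c else old
                     for c, old in zip(clusters, centroids)]
    return centroids, clusters

# Hypothetical data with two obvious groups; K = 2.
POINTS = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = k_means(POINTS, centroids=[1.0, 2.0])
```

No labels are used anywhere: the grouping comes only from distances between points, and the result can depend on the initial centroids.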
Part B
1. Explain briefly the structure of unsupervised learning.
2. Explain the various learning techniques involved in unsupervised learning.
3. What is a Gaussian process? Explain in detail Gaussian parameter
estimation with suitable examples.
4. Explain the concepts of clustering approaches. How do they differ from classification?
5. List the applications of clustering and identify the advantages and disadvantages of
clustering algorithms.
6. Explain the EM algorithm.
16. What is stochastic gradient descent and why is it used in the training of neural
networks?
Stochastic Gradient Descent is an optimization algorithm that can be used to train neural
network models. The Stochastic Gradient Descent algorithm requires gradients to be
calculated for each variable in the model so that new values for the variables can be
calculated.
17. What are the three main types of gradient descent algorithm?
There are three types of gradient descent learning algorithms: batch gradient descent,
stochastic gradient descent and mini-batch gradient descent.
19. How do you solve the vanishing gradient problem within a deep neural network?
The vanishing gradient problem is caused by the derivative of the activation function used
in the neural network: when these derivatives are small, their product across many layers
shrinks towards zero. The simplest solution to the problem is to replace the activation
function of the network. Instead of sigmoid, use an activation function such as ReLU.
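The effect can be illustrated with a few lines of Python. This is a simplified sketch that multiplies only the activation derivatives across layers and ignores the weights, which is an assumption made to keep the example short; the function names are hypothetical.

```python
import math

def sigmoid_grad(z):
    """Derivative of the sigmoid; its maximum value is 0.25 (at z = 0)."""
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)

def relu_grad(z):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if z > 0 else 0.0

def backprop_factor(grad_fn, z, layers):
    """Product of the activation derivative over `layers` layers,
    with the same pre-activation z at every layer and weights ignored."""
    return grad_fn(z) ** layers

sigmoid_factor = backprop_factor(sigmoid_grad, 0.0, layers=20)   # 0.25 ** 20
relu_factor = backprop_factor(relu_grad, 1.0, layers=20)         # 1.0 ** 20
```

Even in the best case for the sigmoid, twenty layers scale the gradient by less than one part in a trillion, whereas an active ReLU unit passes the gradient through unchanged.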
Part B
1. Draw the architecture of a single layer perceptron (SLP) and explain its
operation. Mention its advantages and disadvantages.
2. Draw the architecture of a Multilayer perceptron (MLP) and explain its
operation. Mention its advantages and disadvantages.
3. Explain the stochastic optimization methods for weight determination.
4. Describe back propagation and features of back propagation.
5. Write the flowchart of error back-propagation training algorithm.
6. Develop a Back propagation algorithm for Multilayer Feed forward neural
network consisting of one input layer, one hidden layer and output layer from
first principles.
7. List the factors that affect the performance of multilayer feed-forward neural
network.
8. Explain the difference between a shallow net and a deep learning net.
9. How do you tune hyperparameters for better neural network performance?
Explain in detail.