IIS Lecture 3
Chapter 5
Machine-Learning Techniques for
Predictive Analytics
Copyright © 2020, 2015, 2011 Pearson Education, Inc. All Rights Reserved
Neural Network Concepts
• Neural networks (NN): a human brain metaphor for
information processing
• Neural computing
• Artificial neural network (ANN)
• Many uses of ANN
– pattern recognition, forecasting, prediction, and
classification
• Many application areas
– finance, marketing, manufacturing, operations,
information systems, and so on
Biological Neural Networks
Processing Information in ANN
Elements of ANN
• Processing element (PE)
• Network architecture
– Hidden layers
– Parallel processing
• Network information processing
– Inputs
– Outputs
– Connection weights
– Summation function
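The elements listed above can be sketched in a few lines of Python: a single processing element that applies the summation function to weighted inputs and passes the result through a sigmoid transfer function. The inputs, connection weights, and bias below are made-up values for illustration only.

```python
import math

def processing_element(inputs, weights, bias=0.0):
    """One ANN processing element (PE): a weighted sum of the inputs
    (the summation function) passed through a sigmoid activation."""
    s = sum(x * w for x, w in zip(inputs, weights)) + bias  # summation function
    return 1.0 / (1.0 + math.exp(-s))                       # sigmoid transfer

# Hypothetical example: two inputs with connection weights 0.6 and -0.4
output = processing_element([1.0, 0.5], [0.6, -0.4], bias=0.1)
print(round(output, 3))  # -> 0.622
```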
Neural Network Architectures
• Architecture of a neural network is driven by the task it is
intended to address
– Classification, regression, clustering, general
optimization, association
• Feedforward, multi-layered perceptron with backpropagation learning algorithm
– The most popular architecture
– This ANN architecture will be covered in Chapter 6
• Other ANN architectures – recurrent networks, self-organizing feature maps, Hopfield networks, …
Neural Network Architectures
Recurrent Neural Networks
Support Vector Machines (SVM)
(2 of 4)
• Goal of SVM: to generate mathematical functions that
map input variables to desired outputs for classification or
regression type prediction problems.
– First, SVM uses nonlinear kernel functions to
transform non-linear relationships among the variables
into linearly separable feature spaces.
– Then, the maximum-margin hyperplanes are
constructed to optimally separate different classes
from each other based on the training dataset.
• SVM has a solid mathematical foundation!
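A minimal sketch of the two steps above, assuming scikit-learn as the library; the toy 2-D data below is invented for illustration. The kernel choice handles the nonlinear transformation, and fitting the classifier constructs the maximum-margin hyperplane.

```python
from sklearn.svm import SVC

# Hypothetical 2-D training data: two well-separated classes
X = [[0, 0], [1, 1], [0, 1], [1, 0], [3, 3], [4, 4], [3, 4], [4, 3]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

# Step 1: pick a kernel function (RBF, the most common choice)
# Step 2: fit the maximum-margin separator in the transformed feature space
clf = SVC(kernel="rbf", C=1.0)
clf.fit(X, y)

print(clf.predict([[0.5, 0.5], [3.5, 3.5]]))  # one query point per class
```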
Support Vector Machines (SVM)
(3 of 4)
• A hyperplane is a geometric concept used to describe the
separation surface between different classes of things.
– In SVM, two parallel hyperplanes are constructed on
each side of the separation space with the aim of
maximizing the distance between them.
• A kernel function in SVM uses the kernel trick (a method
for using a linear classifier algorithm to solve a nonlinear
problem)
– The most commonly used kernel function is the radial
basis function (RBF).
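The RBF kernel is K(x, z) = exp(−γ·‖x − z‖²): it scores the similarity of two points in an implicit high-dimensional feature space without ever computing that space explicitly, which is the kernel trick in action. A self-contained sketch:

```python
import math

def rbf_kernel(x, z, gamma=1.0):
    """Radial basis function kernel: K(x, z) = exp(-gamma * ||x - z||^2).
    Similar points score near 1.0; distant points score near 0.0."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

print(rbf_kernel([0, 0], [0, 0]))    # identical points -> 1.0
print(rbf_kernel([0, 0], [10, 10]))  # distant points -> near 0.0
```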
Support Vector Machines (SVM)
(4 of 4)
SVM Applications
• SVMs are the most widely used kernel-learning algorithms for a
wide range of classification and regression problems
• SVMs represent the state of the art by virtue of their
excellent generalization performance, superior prediction
power, ease of use, and rigorous theoretical foundation
• Most comparative studies show their superiority in both
regression and classification type prediction problems.
• SVM versus ANN?
k-Nearest Neighbor Method (k-NN)
(1 of 2)
• ANNs and SVMs → time-demanding, computationally
intensive iterative derivations
• k-NN is a simple and intuitive prediction method that
produces very competitive results
• k-NN is a prediction method for classification as well as
regression problems (similar to ANN & SVM)
• k : the number of neighbors used in the model
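The idea needs no iterative training at all and can be sketched without any library: find the k training points closest to the query and take a majority vote of their labels. The 2-D data below is made up for illustration.

```python
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote of its k nearest training
    points. `train` is a list of (features, label) pairs; Euclidean
    distance is used as the similarity measure."""
    dist = lambda p, q: sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
    nearest = sorted(train, key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical data: class "A" clustered near the origin, "B" near (5, 5)
train = [([0, 0], "A"), ([1, 0], "A"), ([0, 1], "A"),
         ([5, 5], "B"), ([6, 5], "B"), ([5, 6], "B")]
print(knn_classify(train, [0.5, 0.5], k=3))  # -> "A"
```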
k-Nearest Neighbor Method (k-NN)
(2 of 2)
• The answer to "Which class does a data point belong to?"
depends on the value of k
The Process of k-NN Method
k-NN Model Parameter (1 of 2)
1. Similarity Measure: The Distance Metric
Minkowski distance:
d(i, j) = (|x_i1 − x_j1|^q + |x_i2 − x_j2|^q + ... + |x_ip − x_jp|^q)^(1/q)
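The Minkowski distance is a direct translation into code; q = 1 gives the Manhattan distance and q = 2 the familiar Euclidean distance:

```python
def minkowski(xi, xj, q=2):
    """Minkowski distance of order q between two points.
    q = 1 -> Manhattan distance; q = 2 -> Euclidean distance."""
    return sum(abs(a - b) ** q for a, b in zip(xi, xj)) ** (1.0 / q)

print(minkowski([0, 0], [3, 4], q=2))  # Euclidean -> 5.0
print(minkowski([0, 0], [3, 4], q=1))  # Manhattan -> 7.0
```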
k-NN Model Parameter (2 of 2)
2. Number of Neighbors (the value of k)
– The best value depends on the data
– Larger values reduce the effect of noise but also
make the boundaries between classes less distinct
– An “optimal” value can be found heuristically
• Cross Validation is often used to determine the best value
for k and the distance measure
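The cross-validation search described above can be sketched as follows, assuming scikit-learn and its bundled iris dataset (both assumptions; any dataset and CV routine would do). Each candidate k is scored by 5-fold cross-validation and the best-scoring value is kept:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Score each candidate k by mean 5-fold cross-validation accuracy
scores = {}
for k in (1, 3, 5, 7, 9, 11):
    model = KNeighborsClassifier(n_neighbors=k, metric="minkowski", p=2)
    scores[k] = cross_val_score(model, X, y, cv=5).mean()

best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```

The same loop could also search over the distance measure (e.g. `p=1` for Manhattan) at the same time.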
Naïve Bayes Method for
Classification (1 of 2)
• Naïve Bayes is a simple probability-based classification
method
– Naïve - assumption of independence among the input
variables
• Can use both numeric and nominal input variables
– Numeric variables need to be discretized
• Can be used for classification only (not for regression)
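A tiny hand-rolled sketch of the method for nominal inputs: class priors times per-variable likelihoods, multiplied as if the input variables were independent (the "naïve" assumption). The weather-style data is invented for illustration, and add-one (Laplace) smoothing is an added detail to avoid zero probabilities.

```python
from collections import Counter

def naive_bayes(train, query):
    """Tiny Naive Bayes classifier for nominal inputs. `train` is a
    list of (features_tuple, label) pairs. Assumes input variables
    are conditionally independent given the class."""
    class_counts = Counter(label for _, label in train)
    best_label, best_score = None, 0.0
    for label, n in class_counts.items():
        score = n / len(train)  # prior P(class)
        for i, value in enumerate(query):
            matches = sum(1 for f, l in train if l == label and f[i] == value)
            score *= (matches + 1) / (n + 2)  # smoothed P(value | class)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical data: (outlook, windy) -> decision
train = [(("sunny", "no"), "play"), (("sunny", "no"), "play"),
         (("sunny", "yes"), "play"), (("rain", "yes"), "stay"),
         (("rain", "no"), "stay"), (("rain", "yes"), "stay")]
print(naive_bayes(train, ("sunny", "yes")))  # -> "play"
```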