IIS Lecture 3
Chapter 5
Machine-Learning Techniques for
Predictive Analytics
Copyright © 2020, 2015, 2011 Pearson Education, Inc. All Rights Reserved
Neural Network Concepts
• Neural networks (NN): a human brain metaphor for
information processing
• Neural computing
• Artificial neural network (ANN)
• Many uses of ANN
– pattern recognition, forecasting, prediction, and
classification
• Many application areas
– finance, marketing, manufacturing, operations,
information systems, and so on
Biological Neural Networks
Processing Information in ANN
Elements of ANN
• Processing element (PE)
• Network architecture
– Hidden layers
– Parallel processing
• Network information processing
– Inputs
– Outputs
– Connection weights
– Summation function
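The elements listed above can be sketched in a few lines of Python: a single processing element that applies the summation function to weighted inputs and passes the result through a sigmoid transfer function. The inputs, connection weights, and bias below are made-up values for illustration only.

```python
import math

def processing_element(inputs, weights, bias=0.0):
    """One ANN processing element (PE): a weighted sum of the inputs
    (the summation function) passed through a sigmoid activation."""
    s = sum(x * w for x, w in zip(inputs, weights)) + bias  # summation function
    return 1.0 / (1.0 + math.exp(-s))                       # sigmoid transfer

# Hypothetical example: two inputs with connection weights 0.6 and -0.4
output = processing_element([1.0, 0.5], [0.6, -0.4], bias=0.1)
print(round(output, 3))  # -> 0.622
```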
Neural Network Architectures
• Architecture of a neural network is driven by the task it is
intended to address
– Classification, regression, clustering, general
optimization, association
• Feedforward, multi-layered perceptron with backpropagation learning algorithm
– The most popular architecture
– This ANN architecture will be covered in Chapter 6
• Other ANN architectures – recurrent networks, self-organizing feature maps, Hopfield networks, …
Neural Network Architectures
Recurrent Neural Networks
Support Vector Machines (SVM)
(2 of 4)
• Goal of SVM: to generate mathematical functions that
map input variables to desired outputs for classification or
regression type prediction problems.
– First, SVM uses nonlinear kernel functions to
transform non-linear relationships among the variables
into linearly separable feature spaces.
– Then, the maximum-margin hyperplanes are
constructed to optimally separate different classes
from each other based on the training dataset.
• SVM has a solid mathematical foundation!
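A minimal sketch of the two steps above, assuming scikit-learn as the library; the toy 2-D data below is invented for illustration. The kernel choice handles the nonlinear transformation, and fitting the classifier constructs the maximum-margin hyperplane.

```python
from sklearn.svm import SVC

# Hypothetical 2-D training data: two well-separated classes
X = [[0, 0], [1, 1], [0, 1], [1, 0], [3, 3], [4, 4], [3, 4], [4, 3]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

# Step 1: pick a kernel function (RBF, the most common choice)
# Step 2: fit the maximum-margin separator in the transformed feature space
clf = SVC(kernel="rbf", C=1.0)
clf.fit(X, y)

print(clf.predict([[0.5, 0.5], [3.5, 3.5]]))  # one query point per class
```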
Support Vector Machines (SVM)
(3 of 4)
• A hyperplane is a geometric concept used to describe the
separation surface between different classes of things.
– In SVM, two parallel hyperplanes are constructed on
each side of the separation space with the aim of
maximizing the distance between them.
• A kernel function in SVM uses the kernel trick (a method
for using a linear classifier algorithm to solve a nonlinear
problem)
– The most commonly used kernel function is the radial
basis function (RBF).
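The RBF kernel is K(x, z) = exp(−γ·‖x − z‖²): it scores the similarity of two points in an implicit high-dimensional feature space without ever computing that space explicitly, which is the kernel trick in action. A self-contained sketch:

```python
import math

def rbf_kernel(x, z, gamma=1.0):
    """Radial basis function kernel: K(x, z) = exp(-gamma * ||x - z||^2).
    Similar points score near 1.0; distant points score near 0.0."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

print(rbf_kernel([0, 0], [0, 0]))    # identical points -> 1.0
print(rbf_kernel([0, 0], [10, 10]))  # distant points -> near 0.0
```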
Support Vector Machines (SVM)
(4 of 4)
SVM Applications
• SVMs are the most widely used kernel-learning algorithms for a
wide range of classification and regression problems
• SVMs represent the state of the art by virtue of their
excellent generalization performance, superior prediction
power, ease of use, and rigorous theoretical foundation
• Most comparative studies show their superiority in both
regression and classification type prediction problems.
• SVM versus ANN?
k-Nearest Neighbor Method (k-NN)
(1 of 2)
• ANNs and SVMs → time-demanding, computationally
intensive iterative derivations
• k-NN is a simple and intuitive prediction method that
produces very competitive results
• k-NN is a prediction method for classification as well as
regression problems (similar to ANN & SVM)
• k : the number of neighbors used in the model
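The idea needs no iterative training at all and can be sketched without any library: find the k training points closest to the query and take a majority vote of their labels. The 2-D data below is made up for illustration.

```python
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote of its k nearest training
    points. `train` is a list of (features, label) pairs; Euclidean
    distance is used as the similarity measure."""
    dist = lambda p, q: sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
    nearest = sorted(train, key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical data: class "A" clustered near the origin, "B" near (5, 5)
train = [([0, 0], "A"), ([1, 0], "A"), ([0, 1], "A"),
         ([5, 5], "B"), ([6, 5], "B"), ([5, 6], "B")]
print(knn_classify(train, [0.5, 0.5], k=3))  # -> "A"
```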
k-Nearest Neighbor Method (k-NN)
(2 of 2)
• The answer to "Which class does a data point belong to?"
depends on the value of k
The Process of k-NN Method
k-NN Model Parameter (1 of 2)
1. Similarity Measure: The Distance Metric
Minkowski distance:
d(i, j) = (|x_i1 − x_j1|^q + |x_i2 − x_j2|^q + ... + |x_ip − x_jp|^q)^(1/q)
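The Minkowski distance is a direct translation into code; q = 1 gives the Manhattan distance and q = 2 the familiar Euclidean distance:

```python
def minkowski(xi, xj, q=2):
    """Minkowski distance of order q between two points.
    q = 1 -> Manhattan distance; q = 2 -> Euclidean distance."""
    return sum(abs(a - b) ** q for a, b in zip(xi, xj)) ** (1.0 / q)

print(minkowski([0, 0], [3, 4], q=2))  # Euclidean -> 5.0
print(minkowski([0, 0], [3, 4], q=1))  # Manhattan -> 7.0
```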
k-NN Model Parameter (2 of 2)
2. Number of Neighbors (the value of k)
– The best value depends on the data
– Larger values reduce the effect of noise but also
make the boundaries between classes less distinct
– An “optimal” value can be found heuristically
• Cross Validation is often used to determine the best value
for k and the distance measure
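The cross-validation search described above can be sketched as follows, assuming scikit-learn and its bundled iris dataset (both assumptions; any dataset and CV routine would do). Each candidate k is scored by 5-fold cross-validation and the best-scoring value is kept:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Score each candidate k by mean 5-fold cross-validation accuracy
scores = {}
for k in (1, 3, 5, 7, 9, 11):
    model = KNeighborsClassifier(n_neighbors=k, metric="minkowski", p=2)
    scores[k] = cross_val_score(model, X, y, cv=5).mean()

best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```

The same loop could also search over the distance measure (e.g. `p=1` for Manhattan) at the same time.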
Naïve Bayes Method for
Classification (1 of 2)
• Naïve Bayes is a simple probability-based classification
method
– Naïve - assumption of independence among the input
variables
• Can use both numeric and nominal input variables
– Numeric variables need to be discretized
• Can be used for classification only (not for regression)
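A tiny hand-rolled sketch of the method for nominal inputs: class priors times per-variable likelihoods, multiplied as if the input variables were independent (the "naïve" assumption). The weather-style data is invented for illustration, and add-one (Laplace) smoothing is an added detail to avoid zero probabilities.

```python
from collections import Counter

def naive_bayes(train, query):
    """Tiny Naive Bayes classifier for nominal inputs. `train` is a
    list of (features_tuple, label) pairs. Assumes input variables
    are conditionally independent given the class."""
    class_counts = Counter(label for _, label in train)
    best_label, best_score = None, 0.0
    for label, n in class_counts.items():
        score = n / len(train)  # prior P(class)
        for i, value in enumerate(query):
            matches = sum(1 for f, l in train if l == label and f[i] == value)
            score *= (matches + 1) / (n + 2)  # smoothed P(value | class)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical data: (outlook, windy) -> decision
train = [(("sunny", "no"), "play"), (("sunny", "no"), "play"),
         (("sunny", "yes"), "play"), (("rain", "yes"), "stay"),
         (("rain", "no"), "stay"), (("rain", "yes"), "stay")]
print(naive_bayes(train, ("sunny", "yes")))  # -> "play"
```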