Indian Institute of Science: Roblem
Indian Institute of Science: Roblem
Indian Institute of Science: Roblem
P ROBLEM 1: Write down the equation for the output y of the network as shown in Figure 1. (5 pts.)
z −1
00
w11
w11 y1
x1
wO1
w21 0
w21 y
0
w12
w12 wO2
x2 w22 y2
00
w22
−1
z
as follows:
XNx
x= αn xn (1)
n=1
where αn ≥ 0 and N Nz d
P x
n=1 αn = 1. Consider a second set of data points {z m }m=1 ∈ R and define the
corresponding convex hull. The two datasets are linearly separable if there exists a vector w and a scalar b
such that wT xn + b > 0 for all xn and wT z m + b < 0 for all z m . Show that
(a) If two convex hulls intersect, the two datasets cannot be linearly separable.
(b) If the two datasets are linearly separable, their convex hulls do not intersect.
(10 pts.)
P ROBLEM 3: Consider a N -class classification problem of the input data x ∈ Rd . Let {Lkj }N k,j=1 be the
cost associated with assigning class label k to an input which belongs to class j, P (Ck | x) be the condition
probability that the input x belongs to class Ck and, Rk be the set of all x for which the class label is Ck .
(a) Compute the total expected loss.
(b) Derive the optimal rule for assigning class labels.
(10 pts.)
P ROBLEM 4:
(a) The perceptron may be used to perform numerous logic functions. Demonstrate the implementation
of the binary logic functions AND, OR, and COMPLEMENT. Provide scatter plots with decision
boundaries. Attach your codes in an Appendix.
(b) A basic limitation of the perceptron is that it cannot implement the EXCLUSIVE OR function.
Explain the reason for this limitation.
(10 pts.)
1
2
P ROBLEM 5:
y
C1 Rm
x
D
Rin
Rout C2
F IGURE 2. Set of data points uniformly distributed within a circle of radius Rm and cres-
cent with inner and outer radius Rin and Rout .
(a) Generate a set of N = 2000 data points (with 1000 data points in each class) as shown in Figure 2
with Rm : Rin : Rout = 1 : 2 : 3, and D = 0. Provide scatter plots and attach your codes in an
Appendix.
(b) By changing the value of D, classify the set of data points into classes C1 and C2 using the perceptron
algorithm configured in online and batch modes. Provide scatter plots with decision boundaries.
Attach your codes in an Appendix.
(c) Fix the D value for linear separability. Start with different initial conditions for the weight vector
and the bias. Check whether you get the same decision boundary and comment upon this.
(d) With the initial weight vector fixed in online mode, randomize the sequence of inputs to the percep-
tron. Check whether you get the same decision boundary and comment upon this.
(e) Add Gaussian noise with 0 mean and variance ranging from 1 to Rin (in steps of 2) to the set of data
points shown in Figure 2. What is your stopping criterion for learning? What can you comment
upon the classification accuracy experimentally?
(f) Repeat the experiment 5(b) (fix the D value for linear separability) by varying the learning rate η
from 0.1 to 1 in steps of 0.2. Report the number of steps nmax required for the convergence of
perceptron algorithm for each value of η and fix the initial weight vectors in all your experiments.
(g) Justify your observations made in 5(f) theoretically.
(35 pts.)