Lecture 2
University of Queensland
Outline
- Linear regression.
- Accuracy of linear regression.
- Problems with the linear regression model.
- Nearest-neighbor regression.
- First comparison of parametric (linear regression) vs. nonparametric (nearest-neighbor) learning.
Supervised learning setup
- Given a random variable X, predict another variable Y.
- Example:
  - Y = Sales.
  - X = Advertising.
- Solution:
  - Learn a function f̂ from the data.
  - Given input X, output Ŷ = f̂(X).
- Linear regression assumes f(X) = β0 + β1 X, so that Y ≈ f(X) = β0 + β1 X.
- With the error term made explicit:

  Y = f(X) + ε
    = β0 + β1 X + ε
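The model above can be simulated directly. A minimal NumPy sketch, using illustrative coefficient values (β0 = 2, β1 = 0.5) and the Sales-on-Advertising example from the setup slide:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical true coefficients, for illustration only.
beta0, beta1 = 2.0, 0.5

n = 100
x = rng.uniform(0, 10, size=n)        # advertising spend
eps = rng.normal(0, 1.0, size=n)      # irreducible error term
y = beta0 + beta1 * x + eps           # sales generated by Y = f(X) + eps

# The systematic part f(X) and the observed response differ exactly by eps.
f_x = beta0 + beta1 * x
print(np.allclose(y - f_x, eps))  # True
```

The point of the simulation: even knowing f exactly, Y is not predictable perfectly because of ε.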
How to learn the linear regression model?
- Residuals: ei = yi − β̂0 − β̂1 xi
- Objective function (sum of squared residuals):

  Q(β0, β1) = Σ_{i=1}^n (yi − β0 − β1 xi)² = Σ_{i=1}^n ei²
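The objective Q can be written as a short function. A sketch with made-up data points, just to show that Q is small near a good fit and large far from it:

```python
import numpy as np

def Q(b0, b1, x, y):
    """Sum of squared residuals for candidate intercept b0 and slope b1."""
    resid = y - b0 - b1 * x
    return np.sum(resid ** 2)

# Illustrative data (not from the lecture).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 2.5, 4.0, 4.5])

# Q is much smaller near the least-squares solution than at an arbitrary point.
print(Q(1.0, 0.9, x, y))   # 0.2
print(Q(0.0, 0.0, x, y))   # 46.5
```

Least squares picks the (β0, β1) that minimize this surface.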
Linear regression model in matrix form
- Stack the data: y = (y1, …, yn)ᵀ, ε = (ε1, …, εn)ᵀ, and let X be the n × 2 design matrix with i-th row (1, xi).
- Vector of coefficients:

  β = (β0, β1)ᵀ
Linear regression model in matrix form (continued)
y1 = β0 + β1 x1 + ε1
  ⋮
yi = β0 + β1 xi + εi
  ⋮
yn = β0 + β1 xn + εn

Compactly: y = Xβ + ε
Least Squares Minimization
- Least squares chooses β̂ to minimize the sum of squared residuals.

Multiple linear regression
- Y = β0 + β1 X1 + β2 X2 + ε
- In matrix form: y = Xβ + ε
- Least squares estimator: β̂ = (XᵀX)⁻¹ Xᵀ y
- Variance: Var(β̂) = σ² (XᵀX)⁻¹
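Both formulas are a few lines of NumPy. A sketch on simulated data with illustrative true coefficients (1, 2, −3); `np.linalg.solve` is used instead of forming the inverse explicitly, which is numerically preferable:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data from a known model (coefficient values are illustrative).
n = 500
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
y = 1.0 + 2.0 * X1 - 3.0 * X2 + rng.normal(scale=0.5, size=n)

# Design matrix with an intercept column.
X = np.column_stack([np.ones(n), X1, X2])

# beta_hat = (X'X)^{-1} X'y, computed by solving the normal equations.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Var(beta_hat) = sigma^2 (X'X)^{-1}, with sigma^2 estimated from residuals.
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - X.shape[1])
var_beta = sigma2_hat * np.linalg.inv(X.T @ X)

print(beta_hat)  # should be close to (1, 2, -3)
```

With n = 500 observations the estimates land within a few standard errors of the true values.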
How good is the regression model?
f̂(x0) = x0ᵀ β̂ = β̂0 + β̂1 x0,1 + ⋯ + β̂p x0,p
- We can now compute:
  - Training mean squared error (it is linked to R²).
  - Test MSE (based on points x0 in a held-out test sample).
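The distinction between the two errors can be sketched with a simulated train/test split (sizes and coefficients below are illustrative): fit on the training data only, then evaluate the same fitted β̂ on both samples.

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative simulated data, split into train and held-out test sets.
n_train, n_test = 80, 40
x = rng.uniform(-2, 2, size=n_train + n_test)
y = 1.0 + 0.5 * x + rng.normal(scale=0.3, size=x.size)
x_tr, y_tr = x[:n_train], y[:n_train]
x_te, y_te = x[n_train:], y[n_train:]

# Fit least squares on the training data only.
X_tr = np.column_stack([np.ones(n_train), x_tr])
beta_hat = np.linalg.lstsq(X_tr, y_tr, rcond=None)[0]

def mse(x, y, beta):
    """Mean squared error of the fitted line at the given points."""
    pred = beta[0] + beta[1] * x
    return np.mean((y - pred) ** 2)

print(mse(x_tr, y_tr, beta_hat))  # training MSE
print(mse(x_te, y_te, beta_hat))  # test MSE, computed at held-out x0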
Special input data
- Example: a binary input:

  studenti = 1 if i is a student, 0 otherwise

- Regression function:

  balancei = β0 + β1 incomei + β2 studenti + εi
           = (β0 + β2) + β1 incomei + εi   if i is a student
           = β0 + β1 incomei + εi          otherwise
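A dummy variable fits directly with ordinary least squares: the indicator column goes into the design matrix like any other regressor. A sketch on simulated data (all coefficient values, including a student effect of 380, are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical data: income and a binary student indicator.
n = 200
income = rng.uniform(20, 100, size=n)
student = rng.integers(0, 2, size=n)   # 1 if student, 0 otherwise
balance = (200.0 + 5.0 * income + 380.0 * student
           + rng.normal(scale=30, size=n))

# Design matrix: intercept, income, student dummy.
X = np.column_stack([np.ones(n), income, student])
b0, b1, b2 = np.linalg.lstsq(X, balance, rcond=None)[0]

# Students and non-students get parallel lines with intercepts b0+b2 and b0.
print(b0, b1, b2)
```

The fitted b2 recovers the simulated gap between the two parallel lines.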
Discrete variable coding
- Example: a discrete input in the regression of Y on X, where X takes the values red, blue, or green.
- A factor with three levels is coded with two dummy variables; the omitted level serves as the baseline.
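The two-dummy coding for a three-level factor can be built by hand; here "red" is taken as the baseline (an arbitrary choice for illustration), so red observations get a row of (1, 0, 0) in the design matrix:

```python
import numpy as np

# A small illustrative factor with three levels.
colors = np.array(["red", "blue", "green", "blue", "red"])

# Two dummies; "red" is the omitted baseline level.
blue = (colors == "blue").astype(float)
green = (colors == "green").astype(float)

# Design matrix columns: intercept, X_blue, X_green.
X = np.column_stack([np.ones(colors.size), blue, green])
print(X)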
- Simpson's paradox: a relationship seen in the aggregated data can reverse once a grouping variable is added to the regression.
Nearest-neighbor regression
- KNN regression averages the responses of the K training points nearest to x0 (the neighborhood N0):

  f̂(x0) = (1/K) Σ_{xi ∈ N0} yi
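The estimator is a few lines of code. A minimal one-dimensional sketch with toy data; `knn_predict` and the data values are illustrative, not from the lecture:

```python
import numpy as np

def knn_predict(x0, x_train, y_train, K):
    """KNN regression: average the responses of the K nearest training points."""
    dist = np.abs(x_train - x0)
    nearest = np.argsort(dist)[:K]   # indices of the neighborhood N0
    return y_train[nearest].mean()

# Toy training data.
x_train = np.array([1.0, 2.0, 3.0, 10.0])
y_train = np.array([1.0, 2.0, 3.0, 10.0])

# The 3 nearest points to x0 = 2.1 are x = 1, 2, 3, so the prediction
# is the mean of their responses.
print(knn_predict(2.1, x_train, y_train, K=3))  # 2.0
```

Unlike linear regression, no functional form for f is assumed; K controls the bias-variance trade-off.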
Comparison of parametric and nonparametric regression
- Curse of dimensionality: in high dimensions the K nearest neighbors of x0 are no longer close to x0, so nonparametric methods like KNN deteriorate, while a parametric model keeps a fixed, small number of coefficients to estimate.
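The curse of dimensionality can be seen numerically: with a fixed number of uniform points, the distance from the origin to its nearest neighbor grows rapidly with the dimension p, so "local" averaging stops being local. A small illustrative simulation:

```python
import numpy as np

rng = np.random.default_rng(4)

# n points uniform on [0, 1]^p; measure the distance from the origin
# to its nearest neighbor as p grows.
n = 1000
nearest = {}
for p in (1, 2, 10, 50):
    X = rng.uniform(size=(n, p))
    nearest[p] = np.min(np.linalg.norm(X, axis=1))
    print(p, round(nearest[p], 3))
```

In p = 1 the nearest of 1000 points is essentially on top of the origin; in p = 50 even the nearest point is far away, which is exactly why KNN needs exponentially more data as p grows.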