7 Nonlinear
step functions,
splines,
Polynomial Regression
$$y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \beta_3 x_i^3 + \cdots + \beta_d x_i^d + \epsilon_i$$
[Figure: Degree-4 Polynomial. Left panel: Wage versus Age. Right panel: Pr(Wage>250 | Age) versus Age.]
Details
Create new variables $X_1 = X$, $X_2 = X^2$, etc., and then treat the problem as multiple linear regression.
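A minimal sketch of this idea in plain Python: build the columns $1, x, x^2, \ldots, x^d$ and fit them by ordinary least squares. The helper names (`polynomial_features`, `solve_linear_system`, `fit_least_squares`) and the toy data are assumptions made for illustration, not part of the slides.

```python
def polynomial_features(x, degree):
    """Return rows [1, x, x^2, ..., x^degree] for each value in x."""
    return [[xi ** j for j in range(degree + 1)] for xi in x]


def solve_linear_system(a, b):
    """Solve a @ beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides together.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    beta = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * beta[c] for c in range(r + 1, n))
        beta[r] = (m[r][n] - tail) / m[r][r]
    return beta


def fit_least_squares(design, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    p = len(design[0])
    xtx = [[sum(row[j] * row[k] for row in design) for k in range(p)]
           for j in range(p)]
    xty = [sum(row[j] * yi for row, yi in zip(design, y)) for j in range(p)]
    return solve_linear_system(xtx, xty)


# Toy data generated exactly from y = 1 + 2x - 0.5x^2 (made up for illustration).
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1 + 2 * xi - 0.5 * xi ** 2 for xi in x]
beta = fit_least_squares(polynomial_features(x, 2), y)
print([round(b, 6) for b in beta])
```

The normal equations are used here only to keep the sketch dependency-free; for high degrees the powers of $x$ become badly conditioned, and a QR-based solver or an orthogonal polynomial basis is usually preferable.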
Details continued
Logistic regression follows naturally. For example, in the figure
we model $\Pr(\text{Wage} > 250 \mid \text{Age})$.
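Written out, one plausible form of this model puts the polynomial inside the logistic function. Here $y_i$ denotes wage, $x_i$ denotes age, and $d$ is the polynomial degree (the figure is labelled degree 4); this notation is assumed to match the polynomial regression slide.

```latex
\Pr(y_i > 250 \mid x_i)
  = \frac{\exp\left(\beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \cdots + \beta_d x_i^d\right)}
         {1 + \exp\left(\beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \cdots + \beta_d x_i^d\right)}
```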
Step Functions
Another way of creating transformations of a variable is to cut
the variable into distinct regions.
$C_1(X) = I(X < 35),$
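The indicator construction can be sketched as follows. Only the cut at 35 appears in the text; the cuts at 50 and 65 and the function name `step_basis` are assumptions made for illustration.

```python
def step_basis(x, cutpoints):
    """Return indicator values [C_1(x), ..., C_{K+1}(x)] for sorted cutpoints.

    C_1(x) = I(x < c_1), C_j(x) = I(c_{j-1} <= x < c_j), and the last
    indicator is I(x >= c_K). Exactly one entry is 1 for any x.
    """
    edges = [float("-inf")] + list(cutpoints) + [float("inf")]
    return [1 if lo <= x < hi else 0 for lo, hi in zip(edges[:-1], edges[1:])]


# Only the cut at 35 comes from the text; 50 and 65 are assumed for illustration.
cutpoints = [35, 50, 65]
for age in [22, 35, 49, 70]:
    print(age, step_basis(age, cutpoints))
```

Because the indicators always sum to one, a regression that also includes an intercept would drop one of them to avoid perfect collinearity.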
[Figure: Piecewise Constant. Left panel: Wage versus Age. Right panel: Pr(Wage>250 | Age) versus Age.]
Piecewise Polynomials
continuity.
[Figure: Wage versus Age, with panels titled Piecewise Cubic, Cubic Spline, and Linear Spline.]
Linear Splines
A linear spline with knots at $\xi_k$, $k = 1, \ldots, K$, is a piecewise
linear polynomial continuous at each knot.
We can represent this model as
$$y_i = \beta_0 + \beta_1 b_1(x_i) + \beta_2 b_2(x_i) + \cdots + \beta_{K+1} b_{K+1}(x_i) + \epsilon_i,$$
where the $b_k$ are basis functions.
$$b_1(x_i) = x_i, \qquad b_{k+1}(x_i) = (x_i - \xi_k)_+, \quad k = 1, \ldots, K$$
Here $(\cdot)_+$ denotes the positive part: $(z)_+ = z$ if $z > 0$, and $0$ otherwise.
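As a small sketch of how these basis functions behave, the code below evaluates $b_1$ and the truncated terms at a few points. The knot location, the coefficients, and the function names are hypothetical and chosen only for illustration.

```python
def positive_part(z):
    """Return z if z > 0, else 0."""
    return z if z > 0 else 0.0


def linear_spline_basis(x, knots):
    """Return [b_1(x), ..., b_{K+1}(x)] with b_1 = x and b_{k+1} = (x - knot_k)_+."""
    return [x] + [positive_part(x - knot) for knot in knots]


def linear_spline(x, beta0, beta, knots):
    """Evaluate beta0 + sum_k beta_k * b_k(x)."""
    return beta0 + sum(b * v for b, v in zip(beta, linear_spline_basis(x, knots)))


# Hypothetical knot and coefficients: slope 1 before x = 0.6, slope 1 + 2 = 3 after.
knots = [0.6]
beta0, beta = 0.0, [1.0, 2.0]
for x in [0.2, 0.6, 0.8]:
    print(x, linear_spline(x, beta0, beta, knots))
```

Each truncated term is zero up to its knot and linear afterwards, so the fitted function stays continuous while its slope changes by the corresponding coefficient at each knot.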
[Figure: two panels plotting f(x) and the basis function b(x) against x.]
Cubic Splines
A cubic spline with knots at $\xi_k$, $k = 1, \ldots, K$, is a piecewise
cubic polynomial with continuous derivatives up to order 2 at
each knot.
Again we can represent this model with truncated power basis
functions
$$y_i = \beta_0 + \beta_1 b_1(x_i) + \beta_2 b_2(x_i) + \cdots + \beta_{K+3} b_{K+3}(x_i) + \epsilon_i,$$
$$b_1(x_i) = x_i, \quad b_2(x_i) = x_i^2, \quad b_3(x_i) = x_i^3, \quad b_{k+3}(x_i) = (x_i - \xi_k)^3_+, \quad k = 1, \ldots, K$$
where
$$(x_i - \xi_k)^3_+ = \begin{cases} (x_i - \xi_k)^3 & \text{if } x_i > \xi_k \\ 0 & \text{otherwise} \end{cases}$$
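The truncated power basis above can be evaluated directly. This is a sketch only: the knot locations and function names are hypothetical.

```python
def truncated_cubic(x, knot):
    """Return (x - knot)^3 if x > knot, else 0."""
    return (x - knot) ** 3 if x > knot else 0.0


def cubic_spline_basis(x, knots):
    """Return [x, x^2, x^3, (x - knot_1)^3_+, ..., (x - knot_K)^3_+]."""
    return [x, x ** 2, x ** 3] + [truncated_cubic(x, knot) for knot in knots]


# Hypothetical knots; K = 2 knots give K + 3 = 5 basis functions.
knots = [0.3, 0.6]
for x in [0.2, 0.5, 0.9]:
    print(x, [round(v, 4) for v in cubic_spline_basis(x, knots)])
```

Together with the intercept, a cubic spline with $K$ knots therefore has $K + 4$ coefficients.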
[Figure: two panels plotting f(x) and the basis function b(x) against x.]
[Figure: Wage on the vertical axis; horizontal axis from 20 to 70.]
[Figure: Left panel: Wage versus Age. Right panel: Pr(Wage>250 | Age) versus Age.]
Knot placement
One strategy is to decide K, the number of knots, and then
[Figure: Wage versus Age. Comparison of a degree-14 polynomial and a natural cubic spline, each with 15df. Legend: ns(age, df=14), poly(age, deg=14).]
Smoothing Splines
This section is a little bit mathematical.
Consider this criterion for fitting a smooth function $g(x)$ to
some data:
$$\underset{g \in \mathcal{S}}{\text{minimize}} \; \sum_{i=1}^{n} (y_i - g(x_i))^2 + \lambda \int g''(t)^2 \, dt$$
The first term is RSS, and tries to make $g(x)$ match the
data at each $x_i$.
The second term is a roughness penalty and controls how
wiggly $g(x)$ is. It is modulated by the tuning parameter
$\lambda \ge 0$.
The smaller $\lambda$, the more wiggly the function, eventually
interpolating $y_i$ when $\lambda = 0$.
As $\lambda \to \infty$, the function $g(x)$ becomes linear.
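To make the two terms concrete, here is a rough numerical sketch that evaluates the criterion for a given candidate function. It does not fit a smoothing spline. The function name, the finite-difference approximation of $g''$, and the toy data are all assumptions made for illustration.

```python
def penalized_criterion(g, xs, ys, lam, lo, hi, n_grid=2000):
    """Approximate sum_i (y_i - g(x_i))^2 + lam * integral of g''(t)^2 dt on [lo, hi].

    g'' is approximated by central second differences and the integral by a
    Riemann sum over interior grid points, so this is only a numerical sketch.
    """
    rss = sum((y - g(x)) ** 2 for x, y in zip(xs, ys))
    h = (hi - lo) / n_grid
    penalty = 0.0
    for j in range(1, n_grid):
        t = lo + j * h
        second_diff = (g(t - h) - 2 * g(t) + g(t + h)) / h ** 2
        penalty += second_diff ** 2 * h
    return rss + lam * penalty


# Toy data lying exactly on the line y = 1 + 2x (made up for illustration).
xs = [0.0, 0.5, 1.0]
ys = [1.0, 2.0, 3.0]


def line(t):
    return 1 + 2 * t


def wiggly(t):
    return 1 + 2 * t + 0.1 * (t * (1 - t) * (t - 0.5)) * 50


# Both candidates pass through the three data points, so RSS is zero for each;
# only the roughness penalty separates them.
print(penalized_criterion(line, xs, ys, lam=1.0, lo=0.0, hi=1.0))
print(penalized_criterion(wiggly, xs, ys, lam=1.0, lo=0.0, hi=1.0))
```

A straight line has $g'' = 0$ everywhere, so it pays no penalty however large $\lambda$ is. That is why the solution is pushed toward a linear fit as $\lambda$ grows.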
$$\sum_{i=1}^{n} \{S_\lambda\}_{ii}.$$
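$$\mathrm{RSS}_{cv}(\lambda) = \sum_{i=1}^{n} \left(y_i - \hat{g}_\lambda^{(-i)}(x_i)\right)^2 = \sum_{i=1}^{n} \left[\frac{y_i - \hat{g}_\lambda(x_i)}{1 - \{S_\lambda\}_{ii}}\right]^2$$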
In R: smooth.spline(age, wage)
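The leave-one-out identity above holds for linear smoothers in general. As a hedged illustration, the sketch below checks it numerically on a straight-line least squares fit, used here as a stand-in because its smoother-matrix diagonal has a simple closed form. It does not build $S_\lambda$ for a smoothing spline; the function names and data are made up.

```python
def fit_line(xs, ys):
    """Least squares intercept and slope for y = a + b x."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    return ybar - b * xbar, b


def loocv_brute_force(xs, ys):
    """Refit n times, each time leaving out one observation."""
    total = 0.0
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        total += (ys[i] - (a + b * xs[i])) ** 2
    return total


def loocv_shortcut(xs, ys):
    """One fit on all data, with residuals rescaled by 1 - S_ii."""
    n = len(xs)
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    a, b = fit_line(xs, ys)
    total = 0.0
    for x, y in zip(xs, ys):
        s_ii = 1 / n + (x - xbar) ** 2 / sxx  # diagonal of the hat matrix
        total += ((y - (a + b * x)) / (1 - s_ii)) ** 2
    return total


# Made-up data for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1, 6.3]
print(loocv_brute_force(xs, ys), loocv_shortcut(xs, ys))
```

The two printed values should agree up to floating-point error, which is the point of the shortcut: a single fit gives the leave-one-out error without refitting $n$ times.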
[Figure: Smoothing Spline. Wage versus Age. Legend: 16 Degrees of Freedom; 6.8 Degrees of Freedom (LOOCV).]
Local Regression
[Figure: Local Regression. Two scatterplot panels.]
[Figure: Three panels: f1(year) versus year, f2(age) versus age, and f3(education) versus education (levels shown include <HS, <Coll, Coll, >Coll).]
GAM details
Can fit a GAM simply using, e.g. natural splines:
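For a binary response, the additive model is placed on the logit scale:
$$\log\left(\frac{p(X)}{1 - p(X)}\right) = \beta_0 + f_1(X_1) + f_2(X_2) + \cdots + f_p(X_p).$$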
[Figure: Three panels: f1(year) versus year, f2(age) versus age, and f3(education) versus education (levels shown include HS, <Coll, Coll, >Coll).]