Chapter 5 - Excursion: B-Splines
Bernd Bischl, Julia Moosbauer, Andreas Groll © Winter term 2020/21 Advanced Statistical Learning
SPLINES
In this chapter we look in detail at a specific class of basis functions,
the so-called B-splines.
Basics of Bases
BASIC LINEAR REGRESSION
Least squares: minimize
$$ S = \sum_{i=1}^{n} (y_i - \alpha_0 - \alpha_1 x_i)^2 = \sum_{i=1}^{n} (y_i - \mu_i)^2 $$
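Matrix notation:
$$ X = \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix}, \qquad \alpha = \begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix}, \qquad \mu = X\alpha $$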
Minimize:
$$ \|y - X\alpha\|^2 \;\Longrightarrow\; X^T X \hat{\alpha} = X^T y \;\Longrightarrow\; \hat{\alpha} = (X^T X)^{-1} X^T y $$
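The normal equations can be checked numerically. The following is a minimal sketch in Python with NumPy; the synthetic data, the seed, and all variable names are assumptions made for illustration and are not part of the lecture.

```python
import numpy as np

# Synthetic data (assumption): a noisy straight line
rng = np.random.default_rng(0)
n = 50
x = np.linspace(0.0, 10.0, n)
y = 2.0 + 0.5 * x + rng.normal(scale=0.3, size=n)

# Design matrix X: a column of ones (intercept) and a column holding x
X = np.column_stack([np.ones(n), x])

# Solve the normal equations X^T X alpha = X^T y
alpha_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Fitted values mu = X alpha_hat
mu_hat = X @ alpha_hat
print(alpha_hat)
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse explicitly; for badly conditioned X, `np.linalg.lstsq` is usually the safer choice.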
CURVED RELATIONSHIPS
Linear fit too simple? Add higher powers of x:
$$ \mu_i = \alpha_0 + \alpha_1 x_i + \alpha_2 x_i^2 + \alpha_3 x_i^3 + \dots $$
BASIS FUNCTIONS
Regression model: $\mu = X\alpha$
The columns of X are the basis functions; here: a polynomial basis
Plotted against sorted x for a clean visual representation
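As a small illustration of "columns of X as basis functions", the sketch below builds a cubic polynomial basis with `np.vander` and fits it by least squares. The data, the degree, and the variable names are assumptions chosen for the example.

```python
import numpy as np

# Sorted covariate values (assumption: uniform on [0, 1])
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 1.0, size=30))
y = np.sin(2 * np.pi * x)  # hypothetical response

degree = 3
# Each column is one basis function evaluated at x: 1, x, x^2, x^3
X = np.vander(x, N=degree + 1, increasing=True)

# Least-squares coefficients and fitted values mu = X alpha
alpha_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
mu_hat = X @ alpha_hat
```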
EXAMPLE: MOTORCYCLE DATA
A high polynomial degree is needed for a decent curve fit
Poor numerical conditioning (use orthogonal polynomials instead)
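The conditioning problem can be made visible by comparing condition numbers. This is a sketch under assumptions: the covariate range [0, 60], the degree 9, and the use of Legendre polynomials on a rescaled covariate as the orthogonal family are all illustrative choices, not taken from the lecture.

```python
import numpy as np
from numpy.polynomial import legendre

x = np.linspace(0.0, 60.0, 100)  # hypothetical covariate range
degree = 9

# Raw polynomial basis: columns 1, x, ..., x^9
X_raw = np.vander(x, N=degree + 1, increasing=True)

# Legendre basis on the covariate rescaled to [-1, 1]
z = 2.0 * (x - x.min()) / (x.max() - x.min()) - 1.0
X_leg = legendre.legvander(z, degree)

cond_raw = np.linalg.cond(X_raw)
cond_leg = np.linalg.cond(X_leg)
print(f"raw powers: {cond_raw:.3g}, Legendre: {cond_leg:.3g}")
```

Both matrices span the same space of polynomials, so the fitted curve is the same in exact arithmetic; only the numerical stability of the computation differs.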
Small changes in the data can change the fit drastically
Example: longer left part of the data (near zero)
Notice the wiggles
TROUBLE WITH POLYNOMIALS
B-Splines
B-SPLINE TYPES
Definition of B-splines: Basis functions are non-zero only over the
intervals between d + 2 adjacent knots, where d is the degree of the
basis (for example, d = 3 for a cubic spline). To define an M-parameter
B-spline basis, we need to define M + d + 1 knots,
$t_1 < t_2 < \dots < t_{M+d+1}$, where the interval over which the spline is to
be evaluated lies within $[t_{d+1}, t_{M+1}]$ (so that the first and last d knot
locations are essentially arbitrary). A spline of degree d can then be
represented as
$$ f(x) = \sum_{j=1}^{M} \alpha_j B_j(x; d). $$
The j-th B-spline basis function of degree d is defined recursively as
$$ B_j(x; d) = \frac{x - t_j}{t_{j+d} - t_j}\, B_j(x; d-1) + \frac{t_{j+d+1} - x}{t_{j+d+1} - t_{j+1}}\, B_{j+1}(x; d-1), \qquad j = 1, \dots, M, $$
where
$$ B_j(x; 0) = \begin{cases} 1, & \text{if } x \in [t_j, t_{j+1}), \\ 0, & \text{otherwise.} \end{cases} $$
More details on the B-spline basis can be found, for example, in
Eilers and Marx (1996).
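The recursion translates almost line by line into code. The sketch below implements the standard Cox-de Boor recursion, in which the second term uses the neighbouring basis function $B_{j+1}(x; d-1)$. The function name `bspline_basis`, the 0-based index j, and the integer knots are assumptions for illustration; the knots must be strictly increasing so that no denominator is zero.

```python
import numpy as np

def bspline_basis(x, knots, j, d):
    """Evaluate the j-th B-spline (0-based j) of degree d at the points x."""
    x = np.asarray(x, dtype=float)
    if d == 0:
        # Indicator of the half-open knot interval [t_j, t_{j+1})
        return np.where((knots[j] <= x) & (x < knots[j + 1]), 1.0, 0.0)
    left = (x - knots[j]) / (knots[j + d] - knots[j])
    right = (knots[j + d + 1] - x) / (knots[j + d + 1] - knots[j + 1])
    return (left * bspline_basis(x, knots, j, d - 1)
            + right * bspline_basis(x, knots, j + 1, d - 1))

# Cubic basis with M = 4 functions needs M + d + 1 = 8 knots
d, M = 3, 4
knots = np.arange(M + d + 1, dtype=float)

# Evaluate inside [t_{d+1}, t_{M+1}], which is [3, 4) for these knots
x = np.linspace(3.0, 4.0, 11, endpoint=False)
B = np.column_stack([bspline_basis(x, knots, j, d) for j in range(M)])
row_sums = B.sum(axis=1)
```

On the evaluation interval the M basis functions sum to one at every x, which is a convenient sanity check for any implementation. The plain recursion recomputes lower-degree terms many times, so it is meant for illustration rather than speed.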
Linear B-Spline:
Two pieces, each a straight line, rest zero
Nicely connected at the knots ($t_1$ to $t_3$): same values
Slope jumps at knots
Quadratic B-Spline:
Three pieces, each a quadratic segment, rest zero
Nicely connected at the knots ($t_1$ to $t_4$): same values and slopes
Shape similar to Gaussian
Cubic B-Spline:
Four pieces, each a cubic segment, rest zero
At the knots ($t_1$ to $t_5$): same values, first and second derivatives
Shape more similar to Gaussian
B-SPLINES IN ACTION
Basis matrix B
Columns are B-splines
$$ B = \begin{pmatrix} B_1(x_1) & B_2(x_1) & B_3(x_1) & \dots & B_M(x_1) \\ B_1(x_2) & B_2(x_2) & B_3(x_2) & \dots & B_M(x_2) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ B_1(x_n) & B_2(x_n) & B_3(x_n) & \dots & B_M(x_n) \end{pmatrix} $$
B-splines are local functions that look like Gaussian curves
B-splines are the columns of the basis matrix B
Scaling and summing gives the fitted values: $\mu = B\alpha$
The knots determine the B-spline basis
Polynomial pieces make up the B-splines and join at the knots
General patterns of knots are possible
We consider only equally spaced knots
The number of knots determines the width and number of B-splines
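The points above can be sketched in a few lines: build equally spaced knots, assemble the basis matrix B column by column, and obtain fitted values as $\mu = B\alpha$. The helper `bspline_design`, the synthetic data, and the choice of 10 segments on [0, 1] are assumptions for illustration; the recursion used is the standard Cox-de Boor one.

```python
import numpy as np

def bspline_design(x, knots, d):
    """Basis matrix with one column per B-spline of degree d."""
    x = np.asarray(x, dtype=float)
    # Degree 0: indicator of each knot interval [t_j, t_{j+1})
    B = ((knots[:-1] <= x[:, None]) & (x[:, None] < knots[1:])).astype(float)
    for k in range(1, d + 1):
        n_cols = len(knots) - k - 1
        B_new = np.zeros((len(x), n_cols))
        for j in range(n_cols):
            left = (x - knots[j]) / (knots[j + k] - knots[j])
            right = (knots[j + k + 1] - x) / (knots[j + k + 1] - knots[j + 1])
            B_new[:, j] = left * B[:, j] + right * B[:, j + 1]
        B = B_new
    return B

# Synthetic data on [0, 1) (assumption)
rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0.0, 1.0, size=200))
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)

# Equally spaced knots: n_seg segments on [0, 1], d extra knots on each side
d, n_seg = 3, 10
h = 1.0 / n_seg
knots = h * np.arange(-d, n_seg + d + 1)

B = bspline_design(x, knots, d)            # shape (n, M) with M = n_seg + d
alpha_hat, *_ = np.linalg.lstsq(B, y, rcond=None)
mu_hat = B @ alpha_hat                     # scaled and summed B-splines
```

Each row of B has at most d + 1 non-zero entries, which reflects the local support of the B-splines. Increasing `n_seg` gives more and narrower basis functions.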
P-Splines
SMOOTHNESS AND ROUGHNESS
The coefficients determine roughness
High roughness: α erratic
Little roughness: smoothly varying α
Simple numerical measure:
$$ R = \sum_{j=2}^{M} (\alpha_j - \alpha_{j-1})^2 $$
Or RMS "change to neighbor": $r = \sqrt{R / (M - 1)}$
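Both measures are easy to compute. The sketch below compares a smoothly varying and an erratic coefficient vector; the function name `roughness` and the two example vectors are assumptions for illustration.

```python
import numpy as np

def roughness(alpha):
    """Return (R, r): sum of squared neighbor differences and its RMS version."""
    alpha = np.asarray(alpha, dtype=float)
    diffs = np.diff(alpha)                 # alpha_j - alpha_{j-1}, j = 2..M
    R = np.sum(diffs ** 2)
    r = np.sqrt(R / (len(alpha) - 1))
    return R, r

smooth = np.linspace(0.0, 1.0, 11)         # constant steps of 0.1
erratic = np.array([0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0], dtype=float)

R_smooth, r_smooth = roughness(smooth)
R_erratic, r_erratic = roughness(erratic)
print(R_smooth, r_smooth, R_erratic, r_erratic)
```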
DIFFERENCES AND MATRICES
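We are interested in $\Delta\alpha = (\alpha_2 - \alpha_1, \dots, \alpha_M - \alpha_{M-1})$
Special matrix makes life easy: $\Delta\alpha = D\alpha$, for example with M = 4:
$$ D = \begin{pmatrix} -1 & 1 & 0 & 0 \\ 0 & -1 & 1 & 0 \\ 0 & 0 & -1 & 1 \end{pmatrix} $$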
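One convenient way to build such a matrix, shown here as a sketch, is to difference the rows of an identity matrix. The example vector `alpha` is an assumption for illustration.

```python
import numpy as np

M = 4
# Row i of D is e_{i+1} - e_i, i.e. the pattern (-1, 1) shifted along the diagonal
D = np.diff(np.eye(M), axis=0)

alpha = np.array([1.0, 4.0, 9.0, 16.0])    # hypothetical coefficients
delta = D @ alpha                          # first differences of alpha

# Sum of squared neighbor differences, written with the matrix D
R = alpha @ D.T @ D @ alpha
print(D)
print(delta, R)
```

Multiplying by D gives the same result as `np.diff(alpha)`, so the sum of squared differences of neighboring coefficients can be written as a quadratic form in `alpha`.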
ROUGHNESS WITH A MATRIX
Roughness measure R: $R = \sum_{j=2}^{M} (\alpha_j - \alpha_{j-1})^2 = \|D\alpha\|^2 = \alpha^T D^T D \alpha$
PENALIZING LEAST SQUARES
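Minimize the penalized least-squares criterion $\|y - B\alpha\|^2 + \lambda \|D\alpha\|^2$
Small modification of $B^T B\alpha = B^T y$: $(B^T B + \lambda D^T D)\hat{\alpha} = B^T y$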
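A penalized fit can be sketched as follows, assuming the usual P-spline criterion $\|y - B\alpha\|^2 + \lambda\|D\alpha\|^2$, whose minimizer solves $(B^T B + \lambda D^T D)\hat{\alpha} = B^T y$. The helpers `bspline_design` and `pspline_fit`, their default arguments, and the synthetic data are assumptions for illustration, not the lecture's implementation.

```python
import numpy as np

def bspline_design(x, knots, d):
    """Basis matrix with one column per B-spline of degree d (Cox-de Boor)."""
    x = np.asarray(x, dtype=float)
    B = ((knots[:-1] <= x[:, None]) & (x[:, None] < knots[1:])).astype(float)
    for k in range(1, d + 1):
        n_cols = len(knots) - k - 1
        B_new = np.zeros((len(x), n_cols))
        for j in range(n_cols):
            left = (x - knots[j]) / (knots[j + k] - knots[j])
            right = (knots[j + k + 1] - x) / (knots[j + k + 1] - knots[j + 1])
            B_new[:, j] = left * B[:, j] + right * B[:, j + 1]
        B = B_new
    return B

def pspline_fit(x, y, n_seg=20, d=3, lam=1.0, diff_order=1):
    """Solve (B^T B + lam D^T D) alpha = B^T y on equally spaced knots."""
    xl, xr = x.min(), x.max()
    h = (xr - xl) / n_seg
    knots = xl + h * np.arange(-d, n_seg + d + 1)
    B = bspline_design(x, knots, d)
    D = np.diff(np.eye(B.shape[1]), n=diff_order, axis=0)
    alpha_hat = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)
    return B, D, alpha_hat

# Synthetic data (assumption)
rng = np.random.default_rng(3)
x = np.sort(rng.uniform(0.0, 1.0, size=200))
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)

B, D, alpha_small = pspline_fit(x, y, lam=0.01)
_, _, alpha_large = pspline_fit(x, y, lam=100.0)

rough_small = np.sum((D @ alpha_small) ** 2)
rough_large = np.sum((D @ alpha_large) ** 2)
print(rough_small, rough_large)
```

A larger λ puts more weight on the penalty, so the coefficients of the second fit vary less between neighbors and the fitted curve `B @ alpha_large` is smoother.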
HIGHER ORDER DIFFERENCES
First order: $\Delta\alpha = (\alpha_2 - \alpha_1, \dots, \alpha_M - \alpha_{M-1})$
Second order: $\Delta(\Delta\alpha) = \Delta^2\alpha$
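Third-order difference matrix, for example with M = 6:
$$ D_3 = \begin{pmatrix} -1 & 3 & -3 & 1 & 0 & 0 \\ 0 & -1 & 3 & -3 & 1 & 0 \\ 0 & 0 & -1 & 3 & -3 & 1 \end{pmatrix} $$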
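Difference matrices of any order can be generated by differencing an identity matrix repeatedly. The sketch below does this for orders 1 to 3 with M = 6; the example coefficient vector is an assumption for illustration.

```python
import numpy as np

M = 6
D1 = np.diff(np.eye(M), n=1, axis=0)   # shape (5, 6), row pattern (-1, 1)
D2 = np.diff(np.eye(M), n=2, axis=0)   # shape (4, 6), row pattern (1, -2, 1)
D3 = np.diff(np.eye(M), n=3, axis=0)   # shape (3, 6), row pattern (-1, 3, -3, 1)

# A quadratic coefficient sequence has vanishing third differences
alpha = np.arange(M, dtype=float) ** 2
third_diff = D3 @ alpha
print(D3)
print(third_diff)
```

A penalty of order k on the coefficients leaves sequences that are polynomial of degree k - 1 in the index j unpenalized, which is what the last two lines illustrate for k = 3.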
INTERPOLATION WITHOUT A PENALTY
INTERPOLATION WITH A PENALTY
EXTRAPOLATION
INTER- AND EXTRAPOLATION
with d = 1
with d = 2
with d = 3
STRONG SMOOTHING
Example ($\lambda = 10^6$):
WRAP-UP