STA 405: Linear Modelling 2
Dr. Idah
March 29, 2023
Model Building
\[
\hat{y} = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k
\]
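As a minimal, illustrative sketch (made-up data, not the lecture example), the coefficients of such a model can be estimated by least squares with NumPy:

```python
import numpy as np

# Made-up illustrative data: n = 6 observations, k = 2 predictors.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y  = np.array([3.1, 4.0, 6.2, 6.9, 9.1, 9.8])

# Design matrix with a leading column of ones for the intercept beta_0.
X = np.column_stack([np.ones_like(x1), x1, x2])

# Least-squares estimates of (beta_0, beta_1, beta_2) and fitted values.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat
print(beta_hat)
```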
Model Selection and Validation
1) Hypothesis Testing
In many regression situations, individual coefficients are of importance
to the experimenter.
In addition to arriving at a workable prediction equation, one is also
interested in deleting variables when the situation dictates it.
The "best regression", involving only variables that are useful
predictors, should be obtained.
One criterion that is commonly used to illustrate the adequacy of a
fitted regression model is the coefficient of multiple determination
\[
R^2 = \frac{SSR}{SST} = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
\]
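As a small illustration (with made-up numbers, not the lecture example), R² can be computed directly from this definition once the observed and fitted responses are available:

```python
import numpy as np

# Illustrative observed and fitted responses.
y     = np.array([25.5, 31.2, 25.9, 38.4, 18.4])
y_hat = np.array([26.0, 30.5, 27.1, 37.0, 19.5])

SSR = np.sum((y_hat - y.mean()) ** 2)   # regression sum of squares
SST = np.sum((y - y.mean()) ** 2)       # total sum of squares
R2  = SSR / SST                         # coefficient of multiple determination
print(R2)
```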
The square root of R² is called the multiple correlation coefficient
between Y and the set x1, x2, · · · , xk.
The regression sum of squares can be used to give some indication
concerning whether or not the model is an adequate explanation of
the true situation.
One can test the hypothesis H0 that the regression is not significant
by merely forming the ratio
\[
f = \frac{SSR/k}{SSE/(n-k-1)} = \frac{SSR/k}{s^2}
\]
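A sketch of this computation, assuming SSR, SSE, n and k are already known (the numbers below are illustrative only; SciPy supplies the reference F distribution):

```python
from scipy import stats

# Illustrative quantities from a fitted model with k predictors.
SSR, SSE = 320.0, 85.0
n, k = 20, 3

s2 = SSE / (n - k - 1)             # error mean square
f  = (SSR / k) / s2                # overall F statistic
p  = stats.f.sf(f, k, n - k - 1)   # upper-tail p-value
print(f, p)
```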
Example
Consider the data below.
y x1 x2 x3
25.5 1.74 5.30 10.80
31.2 6.32 5.42 9.40
25.9 6.22 8.41 7.20
38.4 10.52 4.63 8.50
18.4 1.19 11.60 9.40
26.7 1.22 5.85 9.90
26.4 4.10 6.62 8.00
25.9 6.32 8.72 9.10
32.0 4.08 4.42 8.70
25.2 4.15 7.60 9.20
39.7 10.15 4.83 9.40
35.7 1.72 3.12 7.60
26.5 1.70 5.30 8.20
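For the computations that follow, the table can be entered as arrays; a minimal sketch in NumPy:

```python
import numpy as np

# Data from the table: response y and predictors x1, x2, x3.
y  = np.array([25.5, 31.2, 25.9, 38.4, 18.4, 26.7, 26.4, 25.9, 32.0, 25.2, 39.7, 35.7, 26.5])
x1 = np.array([1.74, 6.32, 6.22, 10.52, 1.19, 1.22, 4.10, 6.32, 4.08, 4.15, 10.15, 1.72, 1.70])
x2 = np.array([5.30, 5.42, 8.41, 4.63, 11.60, 5.85, 6.62, 8.72, 4.42, 7.60, 4.83, 3.12, 5.30])
x3 = np.array([10.80, 9.40, 7.20, 8.50, 9.40, 9.90, 8.00, 9.10, 8.70, 9.20, 9.40, 7.60, 8.20])

n = len(y)   # 13 observations
```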
From the experimental data we list the following sums of squares and
products:
\[
\sum_{i=1}^{13} y_i = 377.5, \quad \sum_{i=1}^{13} y_i^2 = 11400.15, \quad \sum_{i=1}^{13} x_{1i} = 59.43,
\]
\[
\sum_{i=1}^{13} x_{2i} = 81.82, \quad \sum_{i=1}^{13} x_{3i} = 115.40, \quad \sum_{i=1}^{13} x_{1i}^2 = 394.7255,
\]
\[
\sum_{i=1}^{13} x_{2i}^2 = 576.7264, \quad \sum_{i=1}^{13} x_{3i}^2 = 1035.96, \quad \sum_{i=1}^{13} x_{1i} y_i = 1877.567,
\]
\[
\sum_{i=1}^{13} x_{2i} y_i = 2246.661, \quad \sum_{i=1}^{13} x_{3i} y_i = 3337.78, \quad \sum_{i=1}^{13} x_{1i} x_{2i} = 360.6621,
\]
\[
\sum_{i=1}^{13} x_{1i} x_{3i} = 522.078, \quad \sum_{i=1}^{13} x_{2i} x_{3i} = 728.31, \quad n = 13.
\]
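These sums can be checked numerically from the data (a small sketch; any tiny discrepancies would only reflect rounding in the listed values):

```python
import numpy as np

# Same data as in the table above.
y  = np.array([25.5, 31.2, 25.9, 38.4, 18.4, 26.7, 26.4, 25.9, 32.0, 25.2, 39.7, 35.7, 26.5])
x1 = np.array([1.74, 6.32, 6.22, 10.52, 1.19, 1.22, 4.10, 6.32, 4.08, 4.15, 10.15, 1.72, 1.70])
x2 = np.array([5.30, 5.42, 8.41, 4.63, 11.60, 5.85, 6.62, 8.72, 4.42, 7.60, 4.83, 3.12, 5.30])
x3 = np.array([10.80, 9.40, 7.20, 8.50, 9.40, 9.90, 8.00, 9.10, 8.70, 9.20, 9.40, 7.60, 8.20])

print(y.sum(), (y**2).sum())                              # sum y_i, sum y_i^2
print(x1.sum(), x2.sum(), x3.sum())                       # sums of the predictors
print((x1**2).sum(), (x2**2).sum(), (x3**2).sum())        # sums of squares
print((x1*y).sum(), (x2*y).sum(), (x3*y).sum())           # cross-products with y
print((x1*x2).sum(), (x1*x3).sum(), (x2*x3).sum())        # cross-products of predictors
```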
In matrix form, the normal equations X'X β̂ = X'y for these data are
\[
\begin{pmatrix}
13 & 59.43 & 81.82 & 115.40 \\
59.43 & 394.7255 & 360.6621 & 522.078 \\
81.82 & 360.6621 & 576.7264 & 728.31 \\
115.40 & 522.078 & 728.31 & 1035.96
\end{pmatrix}
\begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_3 \end{pmatrix}
=
\begin{pmatrix} 377.5 \\ 1877.567 \\ 2246.661 \\ 3337.780 \end{pmatrix}
\]
Using the relation
\[
\hat{\beta} = (X'X)^{-1} X'y
\]
where
\[
X'X = \begin{pmatrix}
n & \sum_{i=1}^{n} x_{1i} & \sum_{i=1}^{n} x_{2i} & \cdots & \sum_{i=1}^{n} x_{ki} \\
\sum_{i=1}^{n} x_{1i} & \sum_{i=1}^{n} x_{1i}^2 & \sum_{i=1}^{n} x_{1i} x_{2i} & \cdots & \sum_{i=1}^{n} x_{1i} x_{ki} \\
\vdots & \vdots & \vdots & & \vdots \\
\sum_{i=1}^{n} x_{ki} & \sum_{i=1}^{n} x_{ki} x_{1i} & \sum_{i=1}^{n} x_{ki} x_{2i} & \cdots & \sum_{i=1}^{n} x_{ki}^2
\end{pmatrix}
\]
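As a sketch, the estimates for the example can be obtained by solving the normal equations assembled above (using a linear solver rather than forming the inverse explicitly):

```python
import numpy as np

# (X'X) and X'y assembled from the sums of squares and products above.
XtX = np.array([[ 13.0,    59.43,    81.82,   115.40 ],
                [ 59.43,  394.7255, 360.6621, 522.078],
                [ 81.82,  360.6621, 576.7264, 728.31 ],
                [115.40,  522.078,  728.31,  1035.96 ]])
Xty = np.array([377.5, 1877.567, 2246.661, 3337.780])

# Solve (X'X) beta = X'y for the least-squares estimates.
beta_hat = np.linalg.solve(XtX, Xty)
print(beta_hat)   # (beta0_hat, beta1_hat, beta2_hat, beta3_hat)
```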
In forward selection, variables are entered one at a time, and the
contribution of each candidate is judged by a partial F-statistic.
For example, once x1 is in the model,
\[
f = \frac{R(\beta_2 \mid \beta_1)}{s^2}
\]
can be determined to test the appropriateness of x2 in the model.
Similarly,
\[
f = \frac{R(\beta_3 \mid \beta_1, \beta_2)}{s^2}
\]
tests the appropriateness of x3 in the model.
If, say,
\[
f = \frac{R(\beta_5 \mid \beta_1, \beta_3, \beta_4)}{s^2}
\]
is insignificant, the variable x5 is removed from the model.
At each step the s² used in the F-test is the error mean square for
the regression model at that stage.
Given SST = 8.7837, use forward selection to select the best-fitting model.
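The exercise can also be attacked numerically; below is a minimal sketch of forward selection with partial F-tests on the data above (the α = 0.05 entry level and the helper ssr are my own illustrative choices, not from the notes):

```python
import numpy as np
from scipy import stats

# Example data from the table above.
y = np.array([25.5, 31.2, 25.9, 38.4, 18.4, 26.7, 26.4, 25.9, 32.0, 25.2, 39.7, 35.7, 26.5])
x = {"x1": np.array([1.74, 6.32, 6.22, 10.52, 1.19, 1.22, 4.10, 6.32, 4.08, 4.15, 10.15, 1.72, 1.70]),
     "x2": np.array([5.30, 5.42, 8.41, 4.63, 11.60, 5.85, 6.62, 8.72, 4.42, 7.60, 4.83, 3.12, 5.30]),
     "x3": np.array([10.80, 9.40, 7.20, 8.50, 9.40, 9.90, 8.00, 9.10, 8.70, 9.20, 9.40, 7.60, 8.20])}

def ssr(cols):
    """Regression sum of squares for an intercept model with the given predictor columns."""
    X = np.column_stack([np.ones_like(y)] + cols)
    y_hat = X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return np.sum((y_hat - y.mean()) ** 2)

SST = np.sum((y - y.mean()) ** 2)
alpha = 0.05                                   # assumed entry level for the partial F-test
selected, remaining = [], list(x)

while remaining:
    base = ssr([x[v] for v in selected]) if selected else 0.0
    best, best_f, best_p = None, -np.inf, 1.0
    for v in remaining:
        cols = [x[u] for u in selected + [v]]
        df_err = len(y) - len(cols) - 1
        s2 = (SST - ssr(cols)) / df_err        # error mean square at this step
        f = (ssr(cols) - base) / s2            # partial F: R(beta_v | current model) / s^2
        p = stats.f.sf(f, 1, df_err)
        if f > best_f:
            best, best_f, best_p = v, f, p
    if best_p >= alpha:                        # stop when the best candidate is not significant
        break
    selected.append(best)
    remaining.remove(best)

print("Selected variables:", selected)
```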