Fu Ch11 Linear Regression
1. Representation of Some Phenomenon
Non-Math/Stats Model
2. Types
- Deterministic Models (no randomness)
- Probabilistic Models
  - Regression Models
    - Simple: Linear, Non-Linear
    - Multiple: Linear, Non-Linear
  - Correlation Models
  - Other Models
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
Dependent (Response) Variable $Y_i$ (e.g., CD+ c.)
Independent (Explanatory) Variable $X_i$ (e.g., Years s. serocon.)
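A minimal sketch of the model above, assuming normally distributed random errors (the slides do not state a distribution at this point). The function name `simulate_responses` and all parameter values are hypothetical and chosen only for illustration. With `sigma=0` the model is deterministic; with any positive `sigma` it is probabilistic.

```python
import random


def simulate_responses(xs, beta0, beta1, sigma, seed=0):
    """Generate Y_i = beta0 + beta1 * X_i + eps_i.

    Assumption: eps_i is drawn from Normal(0, sigma). The model on the
    slide only says eps_i is a random error.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]


xs = [1, 2, 3, 4, 5]

# No randomness: every Y_i falls exactly on the line beta0 + beta1 * X_i.
deterministic = simulate_responses(xs, beta0=2.0, beta1=0.5, sigma=0.0)

# With randomness: each Y_i is the line value plus a random error.
probabilistic = simulate_responses(xs, beta0=2.0, beta1=0.5, sigma=1.0)

print(deterministic)
print(probabilistic)
```

The deterministic values are the population line itself. The probabilistic values scatter around it, which is the situation regression is designed for.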
Population & Sample
Regression Models
EPI 809/Spring 2008 32
Population: Unknown Relationship
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
Random Sample:
$Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{\varepsilon}_i$
Population Linear Regression
Model
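$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$ (Observed value)
$\varepsilon_i$ = Random error
$E(Y) = \beta_0 + \beta_1 X_i$
[Figure: population regression line $E(Y)$ plotted against X, with an observed value lying off the line by its random error.]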
Sample Linear Regression
Model
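$Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{\varepsilon}_i$
$\hat{\varepsilon}_i$ = Random error
$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$
[Figure: sample regression line plotted against X, showing an observed value and an unsampled observation.]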
Estimating Parameters:
Least Squares Method
[Figure: scatter plot of Y against X, both axes running 0 to 60.]
[Figure: scatter plot of Y against X with a candidate line. Slope changed, intercept unchanged.]
Thinking Challenge
How would you draw a line through the
points? How do you determine which line
‘fits best’?
[Figure: scatter plot of Y against X with a candidate line. Slope unchanged, intercept changed.]
[Figure: scatter plot of Y against X with a candidate line. Slope changed, intercept changed.]
Least Squares
1. ‘Best Fit’ Means the Differences Between Actual Y Values & Predicted Y Values Are a Minimum. But Positive Differences Offset Negative Ones, So the Differences Are Squared Before Summing:
$\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} \hat{\varepsilon}_i^2$
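The offsetting problem can be checked numerically. The sketch below uses four made-up points (an assumption, not data from the course) and two candidate lines. For both lines the raw differences sum to zero, so the raw sum cannot tell them apart, while the sum of squared differences can.

```python
def residuals(xs, ys, intercept, slope):
    """Differences between actual Y values and the line's predicted Y values."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


def sum_squared_errors(xs, ys, intercept, slope):
    """Sum of squared differences (SSE) for one candidate line."""
    return sum(e * e for e in residuals(xs, ys, intercept, slope))


# Hypothetical illustration data.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 2.0, 4.0]

# Candidate A: a flat line through the mean of Y (intercept 2.5, slope 0).
# Candidate B: a sloped line (intercept 0.5, slope 0.8).
raw_a = sum(residuals(xs, ys, 2.5, 0.0))
raw_b = sum(residuals(xs, ys, 0.5, 0.8))
sse_a = sum_squared_errors(xs, ys, 2.5, 0.0)
sse_b = sum_squared_errors(xs, ys, 0.5, 0.8)

print("raw sums:", raw_a, raw_b)   # both essentially zero
print("SSE:", sse_a, sse_b)        # candidate B is clearly smaller
```

Squaring removes the sign, so large misses in either direction are penalized and the two lines can be ranked.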
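[Figure: least squares shown graphically. Four observations with residuals $\hat{\varepsilon}_1, \ldots, \hat{\varepsilon}_4$ lie about the fitted line $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$. For example, $Y_2 = \hat{\beta}_0 + \hat{\beta}_1 X_2 + \hat{\varepsilon}_2$.]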
Coefficient Equations
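Prediction equation
$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$
Sample slope
$\hat{\beta}_1 = \frac{SS_{xy}}{SS_{xx}} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$
Sample Y-intercept
$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$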
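The coefficient equations translate directly into code. This is a minimal sketch using only the Python standard library. The function names `fit_simple_linear` and `predict` are hypothetical, and the four data points are made up for illustration.

```python
from statistics import fmean


def fit_simple_linear(xs, ys):
    """Return (intercept, slope) from SS_xy / SS_xx and y_bar - slope * x_bar."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    if ss_xx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = ss_xy / ss_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict(intercept, slope, x):
    """Prediction equation: y_hat = intercept + slope * x."""
    return intercept + slope * x


# Hypothetical illustration data.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 2.0, 4.0]

b0, b1 = fit_simple_linear(xs, ys)
print("intercept:", b0, "slope:", b1)
print("prediction at x = 5:", predict(b0, b1, 5.0))
```

For these points SS_xy = 4 and SS_xx = 5, so the slope is 0.8 and the intercept is 2.5 - 0.8 * 2.5 = 0.5.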
Derivation of Parameters (1)
Least Squares (L-S):
Minimize squared error
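$\sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2$
$0 = \frac{\partial \sum \varepsilon_i^2}{\partial \beta_0} = \frac{\partial \sum (y_i - \beta_0 - \beta_1 x_i)^2}{\partial \beta_0}$
$= -2 (n\bar{y} - n\beta_0 - n\beta_1 \bar{x})$
$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$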
Derivation of Parameters (1)
Least Squares (L-S):
Minimize squared error
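$0 = \frac{\partial \sum \varepsilon_i^2}{\partial \beta_1} = \frac{\partial \sum (y_i - \beta_0 - \beta_1 x_i)^2}{\partial \beta_1}$
$= -2 \sum x_i (y_i - \beta_0 - \beta_1 x_i)$
$= -2 \sum x_i (y_i - \bar{y} + \beta_1 \bar{x} - \beta_1 x_i)$
$\beta_1 \sum x_i (x_i - \bar{x}) = \sum x_i (y_i - \bar{y})$
$\beta_1 \sum (x_i - \bar{x})(x_i - \bar{x}) = \sum (x_i - \bar{x})(y_i - \bar{y})$
$\hat{\beta}_1 = \frac{SS_{xy}}{SS_{xx}}$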
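The derivation sets both partial derivatives of the squared error to zero. As a numerical sanity check, the sketch below evaluates those two derivatives at the least squares estimates and at another candidate line. The helper name `sse_gradient` and the four data points are hypothetical.

```python
from statistics import fmean


def sse_gradient(xs, ys, b0, b1):
    """Partial derivatives of sum((y - b0 - b1*x)^2) with respect to b0 and b1."""
    resid = [y - b0 - b1 * x for x, y in zip(xs, ys)]
    d_b0 = -2.0 * sum(resid)
    d_b1 = -2.0 * sum(x * e for x, e in zip(xs, resid))
    return d_b0, d_b1


# Hypothetical illustration data.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 2.0, 4.0]

# Least squares estimates from the closed-form results of the derivation.
x_bar, y_bar = fmean(xs), fmean(ys)
b1_hat = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
          / sum((x - x_bar) ** 2 for x in xs))
b0_hat = y_bar - b1_hat * x_bar

grad_at_estimates = sse_gradient(xs, ys, b0_hat, b1_hat)
grad_elsewhere = sse_gradient(xs, ys, 0.5, 1.0)  # same intercept, steeper slope

print("gradient at estimates:", grad_at_estimates)  # both components near zero
print("gradient elsewhere:", grad_elsewhere)        # clearly non-zero
```

Both derivatives vanish (up to floating-point rounding) only at the least squares estimates, which is the condition the derivation solves for.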
[Figure: scatter plot of birthweight (Y) against estriol level (X).]
$\sum_{i=1}^{n} X_i^2 = 55$, $n = 5$
Parameter Estimates
Variable    DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Estimates shown: $\hat{\beta}_0$, $\hat{\beta}_1$
[Figure: scatter plot of milk yield (lb.) against food intake (lb.).]
$\sum_{i=1}^{n} X_i^2 = 296$, $n = 4$
2. Y-Intercept ($\hat{\beta}_0$)
Average Milk yield (Y) Is Expected to Be 0.8
lb. When Food intake (X) Is 0
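An intercept of 0.8 lb. can be reproduced with the computational (shortcut) forms $SS_{xy} = \sum X_i Y_i - (\sum X_i)(\sum Y_i)/n$ and $SS_{xx} = \sum X_i^2 - (\sum X_i)^2/n$. The four (food intake, milk yield) pairs below are an assumption: illustrative values chosen to agree with the intercept quoted above, not data recovered from the slides. The function name `shortcut_fit` is hypothetical.

```python
def shortcut_fit(xs, ys):
    """Return (intercept, slope) using the computational forms of SS_xy and SS_xx."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    ss_xy = sum_xy - sum_x * sum_y / n
    ss_xx = sum_xx - sum_x ** 2 / n
    slope = ss_xy / ss_xx
    intercept = sum_y / n - slope * sum_x / n
    return intercept, slope


# Assumed illustration data (lb.), not taken from the source.
food_intake = [4.0, 6.0, 10.0, 12.0]
milk_yield = [3.0, 5.5, 6.5, 9.0]

intercept, slope = shortcut_fit(food_intake, milk_yield)
print("intercept:", round(intercept, 2), "slope:", round(slope, 2))

# Expected milk yield when food intake is 0 is the intercept itself.
print("predicted yield at X = 0:", round(intercept + slope * 0.0, 2))
```

Because none of the assumed X values is near 0, the intercept here is an extrapolation of the fitted line and should be read with that in mind.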