
Statistics 203: Introduction to Regression and Analysis of Variance

Introduction + Simple Linear Regression

Jonathan Taylor
Course outline

Topics: What is a regression model? · Simple linear regression model · Parsing the name · Least Squares: Computation · Solving the normal equations · Geometry of least squares · Residuals · Estimating $\sigma^2$ · Distribution of $\hat\beta$, $e$ · Inference for $\hat\beta$: t-statistics · Statistics software · General themes in regression models

This course is not an exhaustive survey of regression methodology. We will focus on regression models: a large class of statistical models used in applied practice. In our survey, we will emphasize common themes among these models.

The first half of the course bears some similarity to STATS 191 (Introduction to Applied Statistics), but we will focus a little more on the theoretical aspects of the models than in STATS 191.

Prerequisites: STATS 200 + familiarity with matrix algebra.

Evaluation: 4 assignments (60%), 1 take-home final exam (40%).
What is a regression model?

A regression model is essentially a model of the relationship between some covariates (predictors) and an outcome.

It is often used in an exploratory setting: it can sometimes be used for confirmatory studies, but generally not for establishing causal relationships.

Example: predict the height of the wife in a couple based on the husband's height. Wife is the outcome; the covariate is Husband.

A regression model is a model of the average outcome given the covariates, i.e. a model of the conditional expectation

$$E(\text{Wife} \mid \text{Husband}),$$

which is a function of Husband.
Simple linear regression model

Assume that we only have information on Husband, and we observe $n$ pairs $(Y_i, X_i)$.

Specifying the model: given $(X_1, \ldots, X_n)$ we assume that

$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i, \qquad \varepsilon \sim N(0, \sigma^2 I_{n \times n}).$$

Fitting the model: how do we estimate $(\beta_0, \beta_1)$? Least squares:

$$(\hat\beta_0, \hat\beta_1) = \mathop{\mathrm{argmin}}_{(\beta_0, \beta_1)} \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i)^2.$$

Computation: how do we find $(\hat\beta_0, \hat\beta_1)$?

Inference: what can we say about $\beta_1$ based on the $n$ observations?
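The data-generating model above can be sketched in a few lines of plain Python (the course itself uses R). The parameter values $\beta_0 = 1$, $\beta_1 = 2$, $\sigma = 0.5$ and $n = 10$ are illustrative choices, not from the slides:

```python
import random

# Illustrative (not from the slides) true parameters of the model
# Y_i = beta0 + beta1 * X_i + eps_i, with eps_i ~ N(0, sigma^2).
beta0, beta1, sigma = 1.0, 2.0, 0.5
n = 10
random.seed(0)

X = [float(i) for i in range(n)]
# Draw each error independently and form the outcomes Y_i.
Y = [beta0 + beta1 * x + random.gauss(0.0, sigma) for x in X]

print(len(Y))  # n observed pairs (Y_i, X_i)
```

Fitting then means recovering $(\beta_0, \beta_1)$ from the observed pairs alone.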
Parsing the name

Why is it called a simple linear regression model?

Because we model the height of Wife ($Y$, the dependent variable) on Husband ($X$, the independent variable) alone, we have only one covariate: hence it is a simple model.

In the model,

$$E(Y \mid X) = \beta_0 + \beta_1 X,$$

i.e. the conditional expectation of $Y$ given $X$ is linear in $X$. Hence it is a linear regression model.

In general, a linear regression model for an outcome $Y$ and covariates $X_1, \ldots, X_p$ states that

$$E(Y \mid X_1, \ldots, X_p) = \beta_0 + \sum_{j=1}^{p} \beta_j X_j.$$

The covariates could also be a linear combination of known functions of the $X_j$: polynomials, etc.
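The last point can be made concrete with a small sketch: a quadratic mean function is still a *linear* regression model, because $E(Y \mid X)$ is linear in the coefficients, with covariates $1$, $X$, $X^2$ that are known functions of $X$. The coefficient values below are hypothetical:

```python
# Hypothetical coefficients (beta_0, beta_1, beta_2) of a quadratic mean
# function E(Y|X) = beta_0 + beta_1 * X + beta_2 * X^2.
betas = [1.0, -0.5, 2.0]

def design_row(x):
    # Known functions of X: 1, X, X^2 -- one row of the design matrix.
    return [1.0, x, x * x]

def conditional_mean(x):
    # E(Y|X=x) is a *linear* combination of the design row; that linearity
    # in the coefficients is what makes the model "linear".
    return sum(b * f for b, f in zip(betas, design_row(x)))

print(conditional_mean(2.0))  # 1.0 - 0.5*2 + 2.0*4 = 8.0
```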
Least Squares: Computation

In the Wife's height model, least squares regression chooses the line that minimizes

$$SSE(\beta_0, \beta_1) = \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i)^2.$$

Normal equations:

$$\frac{\partial SSE}{\partial \beta_0} = -2 \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i)$$

$$\frac{\partial SSE}{\partial \beta_1} = -2 \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i) X_i$$
Solving the normal equations

Solution: at a critical point $(\hat\beta_0, \hat\beta_1)$,

$$\hat\beta_0 = \bar{Y} - \hat\beta_1 \bar{X}, \qquad \hat\beta_1 = \frac{S_{yx}}{S_{xx}},$$

where

$$S_{yx} = \sum_{i=1}^{n} (Y_i - \bar{Y})(X_i - \bar{X}), \qquad S_{xx} = \sum_{i=1}^{n} (X_i - \bar{X})^2.$$

The vector of fitted values is

$$\hat{Y} = \hat\beta_0 + \hat\beta_1 X.$$
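The closed-form solution is easy to check by hand. A small worked example in plain Python (the data are made up; the course itself uses R):

```python
# Made-up data for illustrating beta1_hat = Syx/Sxx, beta0_hat = Ybar - beta1_hat*Xbar.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 8.0, 9.8]

n = len(X)
Xbar = sum(X) / n
Ybar = sum(Y) / n
Syx = sum((y - Ybar) * (x - Xbar) for x, y in zip(X, Y))
Sxx = sum((x - Xbar) ** 2 for x in X)

beta1_hat = Syx / Sxx            # slope
beta0_hat = Ybar - beta1_hat * Xbar  # intercept
fitted = [beta0_hat + beta1_hat * x for x in X]

print(round(beta1_hat, 3), round(beta0_hat, 3))  # 1.95 0.15
```

Here $S_{yx} = 19.5$ and $S_{xx} = 10$, giving slope $1.95$ and intercept $0.15$.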
Geometry of least squares

For each pair $(\beta_0, \beta_1)$, the vector $P_{(\beta_0,\beta_1)}$ with components

$$P_{i,(\beta_0,\beta_1)} = \beta_0 + \beta_1 X_i$$

is a linear combination of the vectors $X$ and $\mathbf{1} = (1, \ldots, 1)$.

The SSE can be expressed as

$$\sum_{i=1}^{n} (Y_i - P_{i,(\beta_0,\beta_1)})^2 = \|Y - P_{(\beta_0,\beta_1)}\|^2.$$

Minimizing this length over $(\beta_0, \beta_1)$ finds the vector $P_{(\hat\beta_0,\hat\beta_1)}$ in the plane $L$ spanned by $\mathbf{1}$ and $X$ that is closest to $Y$. In other words, least squares projects the vector $Y$ onto the plane $L$.
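A quick numerical illustration of the minimization (made-up data, plain Python): the SSE at the least squares solution can only increase if we perturb $(\hat\beta_0, \hat\beta_1)$, consistent with $P_{(\hat\beta_0,\hat\beta_1)}$ being the point of the plane $L$ closest to $Y$:

```python
# Made-up data for checking that the closed-form solution minimizes the SSE.
X = [0.0, 1.0, 2.0, 3.0]
Y = [1.0, 2.5, 2.0, 4.0]

def sse(b0, b1):
    # Squared distance ||Y - P_(b0,b1)||^2 from Y to a point of the plane L.
    return sum((y - b0 - b1 * x) ** 2 for x, y in zip(X, Y))

Xbar = sum(X) / len(X); Ybar = sum(Y) / len(Y)
Sxx = sum((x - Xbar) ** 2 for x in X)
Syx = sum((y - Ybar) * (x - Xbar) for x, y in zip(X, Y))
b1_hat = Syx / Sxx
b0_hat = Ybar - b1_hat * Xbar

best = sse(b0_hat, b1_hat)
# Every nearby pair (b0, b1) gives a point of L at least as far from Y.
for db0 in (-0.1, 0.0, 0.1):
    for db1 in (-0.1, 0.0, 0.1):
        assert sse(b0_hat + db0, b1_hat + db1) >= best
print("minimum SSE:", round(best, 4))
```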
Residuals

The residuals are defined as

$$e_i = Y_i - \hat{Y}_i,$$

or equivalently $e = Y - \hat{Y}$: $e$ is the projection of $Y$ onto the orthogonal complement $L^\perp$ of the plane $L$ spanned by $\mathbf{1}$ and $X$.

This implies

$$\sum_{i=1}^{n} e_i = e \cdot \mathbf{1} = 0, \qquad \sum_{i=1}^{n} e_i X_i = e \cdot X = 0, \qquad \sum_{i=1}^{n} e_i \hat{Y}_i = e \cdot \hat{Y} = 0.$$

The vector of residuals $e$ is independent of $\hat{Y}$.
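The three orthogonality identities are easy to verify numerically. A short check on made-up data (plain Python):

```python
# Made-up data: verify that the residual vector e is orthogonal to 1, X, and Yhat.
X = [1.0, 2.0, 3.0, 4.0]
Y = [1.2, 1.9, 3.1, 3.8]

n = len(X)
Xbar = sum(X) / n; Ybar = sum(Y) / n
b1 = sum((y - Ybar) * (x - Xbar) for x, y in zip(X, Y)) / sum((x - Xbar) ** 2 for x in X)
b0 = Ybar - b1 * Xbar
Yhat = [b0 + b1 * x for x in X]     # fitted values
e = [y - yh for y, yh in zip(Y, Yhat)]  # residuals

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# All three sums are zero up to floating-point rounding.
print(round(sum(e), 10), round(dot(e, X), 10), round(dot(e, Yhat), 10))
```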


Estimating $\sigma^2$

If we knew $(\beta_0, \beta_1)$, then

$$\varepsilon_i = Y_i - \beta_0 - \beta_1 X_i$$

and

$$\|\varepsilon\|^2 = \sum_{i=1}^{n} \varepsilon_i^2 = SSE(\beta_0, \beta_1) \sim \sigma^2 \chi^2_n,$$

so

$$E\left(\frac{1}{n} \sum_{i=1}^{n} \varepsilon_i^2\right) = \sigma^2,$$

and $\|\varepsilon\|^2 / n$ would be an unbiased estimate of $\sigma^2$.
Estimating $\sigma^2$

As $(\beta_0, \beta_1)$ is unknown, we might think of using estimates of the $\varepsilon_i$ instead:

$$\|e\|^2 = SSE(\hat\beta_0, \hat\beta_1) \sim \sigma^2 \chi^2_{n-2},$$

and

$$\hat\sigma^2 = MSE(\hat\beta_0, \hat\beta_1) = \frac{SSE(\hat\beta_0, \hat\beta_1)}{n-2}$$

is an unbiased estimate of $\sigma^2$.

Why $n-2$? Because $e$ is the projection of $\varepsilon$ onto an $(n-2)$-dimensional subspace, so its squared norm can be written as the sum of squares of $n-2$ independent $N(0, \sigma^2)$ random variables.
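The unbiasedness of $\hat\sigma^2 = SSE/(n-2)$ can be illustrated by a seeded Monte Carlo sketch (plain Python; the true parameters below are assumptions made for the demo, not from the slides). Averaging the MSE over many simulated data sets should land close to the true $\sigma^2$:

```python
import random

# Assumed true parameters for the simulation (illustrative only).
random.seed(42)
beta0, beta1, sigma = 1.0, 2.0, 1.0
X = [float(i) for i in range(1, 11)]
n = len(X)
Xbar = sum(X) / n
Sxx = sum((x - Xbar) ** 2 for x in X)

def mse_once():
    # Simulate one data set, fit by least squares, return SSE/(n-2).
    Y = [beta0 + beta1 * x + random.gauss(0.0, sigma) for x in X]
    Ybar = sum(Y) / n
    b1 = sum((y - Ybar) * (x - Xbar) for x, y in zip(X, Y)) / Sxx
    b0 = Ybar - b1 * Xbar
    sse = sum((y - b0 - b1 * x) ** 2 for x, y in zip(X, Y))
    return sse / (n - 2)

reps = 2000
avg = sum(mse_once() for _ in range(reps)) / reps
print(round(avg, 2))  # should be close to sigma^2 = 1
```

Dividing by $n$ instead of $n-2$ would bias the average downward by the factor $(n-2)/n$.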
Distribution of $\hat\beta$, $e$

The vector $\hat\beta = (\hat\beta_0, \hat\beta_1)$ is a function of $\hat{Y}$, so it is independent of $e$.

Both $\hat\beta$ and $\hat{Y}$ are linear transformations of $Y$, so they are normally distributed.

It can be shown that (we will prove this more generally later)

$$E((\hat\beta_0, \hat\beta_1)) = (\beta_0, \beta_1),$$

$$Var(\hat\beta_1) = \frac{\sigma^2}{S_{xx}}, \qquad Var(\hat\beta_0) = \sigma^2 \left( \frac{1}{n} + \frac{\bar{X}^2}{S_{xx}} \right).$$

Natural estimates of variance:

$$\widehat{Var}(\hat\beta_1) = \frac{\hat\sigma^2}{S_{xx}}, \qquad \widehat{Var}(\hat\beta_0) = \hat\sigma^2 \left( \frac{1}{n} + \frac{\bar{X}^2}{S_{xx}} \right).$$
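The variance formula for the slope can be checked by a seeded simulation (plain Python; the true parameters are illustrative assumptions). Across repeated samples from the same design, the empirical variance of $\hat\beta_1$ should track $\sigma^2 / S_{xx}$:

```python
import random
import statistics

# Assumed true parameters for the simulation (illustrative only).
random.seed(7)
beta0, beta1, sigma = 0.5, 1.5, 1.0
X = [float(i) for i in range(10)]
n = len(X)
Xbar = sum(X) / n
Sxx = sum((x - Xbar) ** 2 for x in X)  # = 82.5 for X = 0, 1, ..., 9

def beta1_hat():
    # One simulated data set and the resulting least squares slope.
    Y = [beta0 + beta1 * x + random.gauss(0.0, sigma) for x in X]
    Ybar = sum(Y) / n
    return sum((y - Ybar) * (x - Xbar) for x, y in zip(X, Y)) / Sxx

draws = [beta1_hat() for _ in range(4000)]
# Empirical variance across samples vs. the formula sigma^2 / Sxx.
print(round(statistics.variance(draws), 4), round(sigma ** 2 / Sxx, 4))
```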
Inference for $\hat\beta$: t-statistics

Because $e$ is independent of $\hat\beta$, it follows that $\widehat{Var}(\hat\beta_1)$ and $\widehat{Var}(\hat\beta_0)$ are independent of $\hat\beta$.

Under the hypothesis $H_0: \beta_1 = \beta_1^0$,

$$T = \frac{\hat\beta_1 - \beta_1^0}{\sqrt{\widehat{Var}(\hat\beta_1)}} \sim t_{n-2}.$$

(Why?)

To test this hypothesis, compare $|T|$ to $t_{n-2, 1-\alpha/2}$, the $1-\alpha/2$ quantile of the $t$ distribution with $n-2$ degrees of freedom. Reject $H_0$ if $|T| > t_{n-2, 1-\alpha/2}$.

More on inference in the next class.
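A worked example of the test on made-up data (plain Python), testing $H_0: \beta_1 = 0$ at level $\alpha = 0.05$; the critical value $t_{8,\,0.975} = 2.306$ is taken from a standard $t$ table:

```python
import math

# Made-up data with a clear linear trend.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
Y = [1.1, 2.3, 2.8, 4.5, 4.9, 6.2, 6.8, 8.1, 9.0, 9.9]

n = len(X)
Xbar = sum(X) / n; Ybar = sum(Y) / n
Sxx = sum((x - Xbar) ** 2 for x in X)
b1 = sum((y - Ybar) * (x - Xbar) for x, y in zip(X, Y)) / Sxx
b0 = Ybar - b1 * Xbar

sse = sum((y - b0 - b1 * x) ** 2 for x, y in zip(X, Y))
sigma2_hat = sse / (n - 2)             # MSE
se_b1 = math.sqrt(sigma2_hat / Sxx)    # sqrt of Var-hat(beta1_hat)

T = (b1 - 0.0) / se_b1                 # H0: beta1 = 0
t_crit = 2.306                         # t_{n-2, 1-alpha/2}, n-2 = 8 dof, from a t table
print(abs(T) > t_crit)                 # True: reject H0
```

Here the fitted slope is about $0.97$ with a tiny standard error, so $|T|$ far exceeds $2.306$ and $H_0$ is rejected.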
Statistics software

We will use R in this class. R is an open-source, multi-platform statistics programming environment.

[The slide shows the R code and output used to fit this dummy model; the figure is not reproduced here.]
General themes in regression models

Specifying regression models.
- What is the joint (conditional) distribution of all outcomes given all covariates?
- Are outcomes independent (conditional on covariates)? If not, what is an appropriate model?

Fitting the models.
- Once a model is specified, how are we going to estimate the parameters?
- Is there an algorithm or some existing software to fit the model?

Comparing regression models.
- Inference for coefficients in the model: are some zero (i.e. is a smaller model better)?
- What if there are two competing models for the data? Why would one be preferable to the other?
- What if there are many models for the data? How do we compare them?
