SP Anova Lect 3

Analysis of Variance and Design of
Experiments
General Linear Hypothesis and Analysis of Variance
Lecture 3
Regression and Analysis of Variance Models
Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur
Slides can be downloaded from http://home.iitk.ac.in/~shalab/sp1

Regression model for the general linear hypothesis:
Let $Y_1, Y_2, \ldots, Y_n$ be a sequence of $n$ independent random
variables associated with responses. Then we can write
$$E(Y_i) = \sum_{j=1}^{p} \beta_j x_{ij}, \quad i = 1, 2, \ldots, n,$$
$$\mathrm{Var}(Y_i) = \sigma^2.$$
This is the linear model in the expectation form, where
$\beta_1, \beta_2, \ldots, \beta_p$ are the unknown parameters and the
$x_{ij}$'s are the known values of the independent covariates
$X_1, X_2, \ldots, X_p$.
Alternatively, the linear model can be expressed as
$$Y_i = \sum_{j=1}^{p} \beta_j x_{ij} + \varepsilon_i, \quad i = 1, 2, \ldots, n,$$
where the $\varepsilon_i$'s are independently and identically
distributed random error components with mean 0 and variance
$\sigma^2$, i.e.,
$$E(\varepsilon_i) = 0, \quad \mathrm{Var}(\varepsilon_i) = \sigma^2$$
and $\mathrm{Cov}(\varepsilon_i, \varepsilon_j) = 0$ $(i \neq j)$.
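As a rough illustration of the error form of the model, the following sketch simulates responses from it. The covariate values, the parameter vector `beta`, the value of `sigma` and the helper name `simulate_responses` are all made up for this example, and normal errors are an extra assumption: the model itself only requires mean 0, constant variance and zero covariance.

```python
import numpy as np

def simulate_responses(x, beta, sigma, rng):
    """Draw Y_i = sum_j beta_j * x_ij + eps_i with independent errors."""
    n = x.shape[0]
    # Normality is assumed here only for convenience; the model fixes
    # just E(eps_i) = 0 and Var(eps_i) = sigma^2.
    eps = rng.normal(loc=0.0, scale=sigma, size=n)
    return x @ beta + eps, eps

rng = np.random.default_rng(0)
n, p = 10_000, 3
x = rng.uniform(0.0, 1.0, size=(n, p))  # hypothetical known values x_ij
beta = np.array([1.0, -2.0, 0.5])       # hypothetical parameters
sigma = 2.0                             # hypothetical error s.d.

y, eps = simulate_responses(x, beta, sigma, rng)

# The sample mean and variance of the errors should be close to 0 and sigma^2.
print(eps.mean(), eps.var())
```

With a large `n`, the printed sample mean and variance of the errors should sit near 0 and `sigma**2` respectively.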
In matrix notation, the linear model can be expressed as
$$Y = X\beta + \varepsilon$$
where
• $Y = (Y_1, Y_2, \ldots, Y_n)'$ is an $n \times 1$ vector of
observations on the response variable,
• the matrix
$$X = \begin{pmatrix}
X_{11} & X_{12} & \cdots & X_{1p} \\
X_{21} & X_{22} & \cdots & X_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
X_{n1} & X_{n2} & \cdots & X_{np}
\end{pmatrix}$$
is an $n \times p$ matrix of $n$ observations on $p$ independent
covariates $X_1, X_2, \ldots, X_p$,
• $\beta = (\beta_1, \beta_2, \ldots, \beta_p)'$ is a $p \times 1$
vector of unknown regression parameters (or regression coefficients)
$\beta_1, \beta_2, \ldots, \beta_p$ associated with
$X_1, X_2, \ldots, X_p$, respectively, and
• $\varepsilon = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n)'$
is an $n \times 1$ vector of random errors or disturbances.
• We assume that $E(\varepsilon) = 0$, the covariance matrix
$V(\varepsilon) = E(\varepsilon\varepsilon') = \sigma^2 I_n$, and
$\mathrm{rank}(X) = p$.
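The matrix form can be checked numerically against the summation form. The sketch below uses an arbitrary random matrix `X` and a made-up `beta` and `sigma` (none of these values come from the lecture); it verifies that the matrix product reproduces the elementwise sums and that the rank condition holds for this particular `X`.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 6, 3
X = rng.normal(size=(n, p))        # hypothetical n x p matrix of covariate values
beta = np.array([2.0, 0.0, -1.0])  # hypothetical p x 1 parameter vector
sigma = 0.5                        # hypothetical error s.d.

# Summation form: E(Y_i) = sum_j beta_j * x_ij, one observation at a time.
mean_sum = np.array([sum(beta[j] * X[i, j] for j in range(p)) for i in range(n)])

# Matrix form: E(Y) = X beta.
mean_mat = X @ beta

# The error vector is n x 1, so its covariance matrix sigma^2 * I is n x n.
cov_eps = sigma**2 * np.eye(n)

# Full column rank: rank(X) = p.
rank = np.linalg.matrix_rank(X)
print(np.allclose(mean_sum, mean_mat), cov_eps.shape, rank)
```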
In the context of analysis of variance and design of experiments,
• the matrix $X$ is termed the design matrix;
• the unknown $\beta_1, \beta_2, \ldots, \beta_p$ are termed effects;
• the covariates $X_1, X_2, \ldots, X_p$ are counter variables or
indicator variables, where $x_{ij}$ counts the number of times the
effect $\beta_j$ occurs in the $i$th observation $x_i$;
• $x_{ij}$ mostly takes the values 1 or 0, but not always;
• the value $x_{ij} = 1$ indicates the presence of effect $\beta_j$ in
$x_i$, and $x_{ij} = 0$ indicates the absence of effect $\beta_j$ in
$x_i$.
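A design matrix of 0/1 indicator variables can be built directly from a record of which effect is present in each observation. The following is a minimal sketch; the function name `indicator_design_matrix` and the example labels are hypothetical, and effects are numbered from 0 to follow Python indexing.

```python
import numpy as np

def indicator_design_matrix(effect_of_obs, p):
    """Return the n x p matrix with x_ij = 1 if effect j is present
    in observation i and x_ij = 0 otherwise (effects numbered 0..p-1)."""
    effect_of_obs = np.asarray(effect_of_obs)
    n = effect_of_obs.shape[0]
    X = np.zeros((n, p), dtype=int)
    X[np.arange(n), effect_of_obs] = 1  # mark the single effect present in each row
    return X

# Hypothetical data: six observations, three effects.
effects = [0, 0, 1, 2, 2, 2]
X = indicator_design_matrix(effects, p=3)
print(X)
```

Each row of `X` contains exactly one 1 in this example, and each column sum gives the number of observations in which that effect occurs.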
Note that in the linear regression model, the covariates are usually
continuous variables.
When some of the covariates are counter variables and the rest are
continuous variables, the model is called a mixed model and is used
in the analysis of covariance.
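To make the mixed case concrete, the sketch below places indicator columns and one continuous covariate side by side in a single design matrix. The group labels and the temperature values are invented for illustration only.

```python
import numpy as np

# Hypothetical data: six observations, three effects, one continuous covariate.
group = np.array([0, 0, 1, 1, 2, 2])
temperature = np.array([21.0, 23.5, 19.0, 22.0, 24.5, 20.0])

# Counter (indicator) variables: column j is 1 where effect j is present.
indicators = (group[:, None] == np.arange(3)).astype(float)

# Design matrix with both kinds of covariates side by side.
X = np.column_stack([indicators, temperature])
print(X)
```

The first three columns of `X` are counter variables and the last column is continuous, which is the structure the analysis of covariance works with.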
Relationship between the regression and ANOVA models:
The same linear model is used in the linear regression analysis as
well as in the analysis of variance.
So it is important to understand the role of a linear model in the
context of linear regression analysis and analysis of variance.
Consider the multiple linear model
$$Y = \beta_0 + X_1\beta_1 + X_2\beta_2 + \ldots + X_p\beta_p + \varepsilon.$$
In the case of analysis of variance model,
• the one‐way classification considers only one covariate,
• two‐way classification model considers two covariates,
• three‐way classification model considers three covariates and
so on.
If $\alpha$, $\beta$ and $\gamma$ denote the effects associated with
the covariates $X$, $Z$ and $W$, which are the counter variables, then in
One‐way model: $Y = \mu + \alpha X + \varepsilon$
Two‐way model: $Y = \mu + \alpha X + \beta Z + \varepsilon$
Three‐way model: $Y = \mu + \alpha X + \beta Z + \gamma W + \varepsilon$ and so on.
Consider an example of agricultural yield. The study variable
denotes the yield, which depends on various covariates
$X_1, X_2, \ldots, X_p$.
In the case of regression analysis, the covariates
$X_1, X_2, \ldots, X_p$ are different variables like temperature, the
quantity of fertilizer, the amount of irrigation, etc.
Now consider the case of the one‐way model and try to understand
its interpretation in terms of the multiple regression model.
The covariate $X$ is now measured at different levels. For example, if
$X$ is the quantity of fertilizer, suppose there are $p$ possible
values, say 1 Kg., 2 Kg., ..., $p$ Kg. Then $X_1, X_2, \ldots, X_p$
denote these $p$ values in the following way.
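The linear model now can be expressed as
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_p X_p + \varepsilon$$
by defining
$$X_1 = \begin{cases}
1 & \text{if effect of 1 Kg. fertilizer is present} \\
0 & \text{if effect of 1 Kg. fertilizer is absent}
\end{cases}$$
$$X_2 = \begin{cases}
1 & \text{if effect of 2 Kg. fertilizer is present} \\
0 & \text{if effect of 2 Kg. fertilizer is absent}
\end{cases}$$
$$\vdots$$
$$X_p = \begin{cases}
1 & \text{if effect of } p \text{ Kg. fertilizer is present} \\
0 & \text{if effect of } p \text{ Kg. fertilizer is absent.}
\end{cases}$$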
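If the effect of 1 Kg. of fertilizer is present, then the other
effects are absent and the linear model is expressible as
$$\begin{aligned}
Y &= \beta_0 + \beta_1 (X_1 = 1) + \beta_2 (X_2 = 0) + \ldots + \beta_p (X_p = 0) + \varepsilon \\
  &= \beta_0 + \beta_1 + \varepsilon.
\end{aligned}$$
If the effect of 2 Kg. of fertilizer is present then
$$\begin{aligned}
Y &= \beta_0 + \beta_1 (X_1 = 0) + \beta_2 (X_2 = 1) + \ldots + \beta_p (X_p = 0) + \varepsilon \\
  &= \beta_0 + \beta_2 + \varepsilon.
\end{aligned}$$
If the effect of $p$ Kg. of fertilizer is present then
$$\begin{aligned}
Y &= \beta_0 + \beta_1 (X_1 = 0) + \beta_2 (X_2 = 0) + \ldots + \beta_p (X_p = 1) + \varepsilon \\
  &= \beta_0 + \beta_p + \varepsilon
\end{aligned}$$
and so on.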
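The reduction above can be reproduced numerically. In this sketch the values of `beta0` and `beta` are invented, three fertilizer levels are assumed, and `expected_yield` is a hypothetical helper; it sets the indicator of the level that is present to 1, all others to 0, and evaluates the mean of $Y$.

```python
import numpy as np

beta0 = 10.0                      # hypothetical general mean effect
beta = np.array([1.5, 2.0, 3.5])  # hypothetical effects of 1, 2 and 3 Kg.
p = beta.size

def expected_yield(level):
    """E(Y) when the effect of `level` Kg. (level = 1, ..., p) is present."""
    x = np.zeros(p)
    x[level - 1] = 1.0            # X_level = 1, every other indicator is 0
    return beta0 + x @ beta       # reduces to beta0 + beta_level

means = [expected_yield(k) for k in range(1, p + 1)]
print(means)
```

Each entry of `means` equals `beta0` plus the single effect that is present, matching the case-by-case expressions for the model.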
