Regression Analysis
1 Overview
Response variable: y_i, i = 1, 2, ..., n
p explanatory (independent) variables: x_{i,1}, x_{i,2}, ..., x_{i,p}
General Linear Model. For each i, the conditional distribution [y_i | x_i] is
given by
y_i = ŷ_i + ϵ_i
where
ŷ_i = ∑_{j=1}^p β_j x_{i,j}
β = (β_1, β_2, ..., β_p)^T are the p regression parameters
the distribution of ϵ_i varies from model to model
1. Polynomial Model: x_{i,j} is replaced by (x_i)^j
2. Fourier Model: x_{i,j} is replaced by sin(j x_i) and cos(j x_i)
3. Time series regressions: time is indexed by i, and the explanatory variables include lagged values of the response variable.
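The basis substitutions above can be sketched as design-matrix constructions. The data, dimensions, and variable names below are illustrative assumptions, not part of the notes:

```python
import numpy as np

# Hypothetical 1-D inputs x_i; seed and sizes are made up for illustration.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 2.0 * np.pi, size=50)
p = 3  # number of basis terms

# Polynomial model: the j-th column is (x_i)^j, j = 1, ..., p
X_poly = np.column_stack([x**j for j in range(1, p + 1)])

# Fourier model: paired columns sin(j x_i) and cos(j x_i), j = 1, ..., p
X_fourier = np.column_stack(
    [f(j * x) for j in range(1, p + 1) for f in (np.sin, np.cos)]
)

print(X_poly.shape)     # (50, 3)
print(X_fourier.shape)  # (50, 6)
```

Either matrix can then play the role of x in the general linear model ŷ = Xβ.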
(1) Propose a model:
1. specify the scale of the response variable Y
2. select the appropriate form of the independent variables X
3. assume a distribution for ϵ
(2) Specify a criterion for judging the parameters
(3) Apply the criterion to the given data to estimate the parameters
(4) Check the assumptions made in (1)
In step (1) we may assume different forms for the residual distribution:
Gauss-Markov: zero mean, constant variance, uncorrelated
Normal-linear models: the ϵ_i are i.i.d. N(0, σ²)
Generalized Gauss-Markov: zero mean and a general covariance matrix, i.e. the off-diagonal elements of the covariance matrix need not be zero, so the errors may be correlated and have unequal variances.
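The four steps can be sketched for a normal-linear model. Everything below (sample size, true coefficients, noise level) is an invented illustration, with least squares as the criterion in step (2):

```python
import numpy as np

rng = np.random.default_rng(1)

# Step (1): propose a normal-linear model y = X beta + eps, eps ~ N(0, sigma^2).
# Sizes and coefficients here are made-up illustrative values.
n, p = 200, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, -2.0])
y = X @ beta_true + rng.normal(scale=0.5, size=n)

# Step (2): criterion Q(beta) = (y - X beta)^T (y - X beta)  (least squares)
# Step (3): apply the criterion to the data, i.e. minimize Q over beta
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Step (4): check the assumptions, e.g. residuals should average to ~zero
resid = y - X @ beta_hat
print(beta_hat, resid.mean())
```

With this seed the fitted coefficients land close to the true values, and the residual mean is numerically zero because the model includes an intercept.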
In matrix form, with X the n × p design matrix,

ŷ = (ŷ_1, ŷ_2, ..., ŷ_n)^T = Xβ,   Q(β) = (y − Xβ)^T (y − Xβ)

β̂ solves the OLS problem when ∂Q(β)/∂β_j = 0 for j = 1, 2, ..., p.
∂Q(β)/∂β_i = ∂/∂β_i ∑_{j=1}^n [y_j − (β_1 x_{j,1} + β_2 x_{j,2} + ... + β_p x_{j,p})]²
= −2 ∑_{j=1}^n x_{j,i} (y_j − (β_1 x_{j,1} + β_2 x_{j,2} + ... + β_p x_{j,p}))
= −2 x_{[i]}^T (y − Xβ)

where x_{[i]} denotes the i-th column of X.
Stacking the p partial derivatives into a vector gives the gradient

∂Q(β)/∂β = (−2x_{[1]}^T(y − Xβ), −2x_{[2]}^T(y − Xβ), ..., −2x_{[p]}^T(y − Xβ))^T = −2X^T(y − Xβ)

Setting this gradient to zero yields the normal equations X^T X β̂ = X^T y.
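A quick numerical sketch (with made-up data and sizes) checks the gradient formula −2Xᵀ(y − Xβ) against finite differences, and confirms it vanishes at the least-squares solution of the normal equations XᵀXβ̂ = Xᵀy:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 30, 3
X = rng.normal(size=(n, p))  # arbitrary design matrix for illustration
y = rng.normal(size=n)

def Q(beta):
    """Sum-of-squares criterion Q(beta) = (y - X beta)^T (y - X beta)."""
    r = y - X @ beta
    return r @ r

def grad(beta):
    """Gradient derived above: -2 X^T (y - X beta)."""
    return -2.0 * X.T @ (y - X @ beta)

# Central-difference check of the gradient at an arbitrary point
beta0 = rng.normal(size=p)
h = 1e-6
num = np.array([
    (Q(beta0 + h * e) - Q(beta0 - h * e)) / (2 * h)
    for e in np.eye(p)
])
print(np.allclose(num, grad(beta0), atol=1e-4))  # True

# The gradient vanishes at the solution of the normal equations
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(np.allclose(grad(beta_hat), 0.0, atol=1e-8))  # True
```

Because Q is quadratic in β, the central difference matches the analytic gradient up to rounding error.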