Oheads Chapter8 PDF

Chapter 8: Regression with Lagged
Explanatory Variables
• Time series data: Yt for t = 1, ..., T
• End goal: Regression model relating a dependent variable to explanatory variables.
With time series, new issues arise:
1. One variable can influence another with a time lag.
2. If the data are nonstationary, a problem known as spurious regression may arise.
• Issue 2 is covered later; you are not expected to understand it at this stage.
• In this chapter, we focus on 1.
• Assume data are stationary (what this means is explained later).
The Regression Model with Lagged
Explanatory Variables
Yt = α + β0Xt + β1Xt-1 + ... + βqXt-q + et
• Multiple regression model with current and past values (lags) of X used as explanatory variables.
• q = lag length = lag order
• OLS estimation can be carried out as in Chapters 4-6.
• Statistical methods same as in Chapters 4-6.
• Verbal interpretation same as in Chapter 6.
Ex. “β2 measures the effect of the explanatory variable 2 periods ago on the dependent variable, ceteris paribus”.
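As a rough illustration of how such a model could be estimated, the sketch below builds the lagged columns and solves the OLS normal equations in plain Python. The data are synthetic and noise-free, and the helper names (lag_rows, ols) and the coefficient values are assumptions made for this example only; in practice a spreadsheet or statistics package would do this step.

```python
import random

def lag_rows(y, x, q):
    """Build targets and design rows for Yt = a + b0*Xt + ... + bq*Xt-q + et.

    The first q periods are dropped because their lags are unavailable,
    which leaves T - q usable observations.
    """
    targets, rows = [], []
    for t in range(q, len(x)):
        rows.append([1.0] + [x[t - j] for j in range(q + 1)])
        targets.append(y[t])
    return targets, rows

def ols(targets, rows):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, targets))]
         for i in range(k)]
    for col in range(k):
        # Partial pivoting: move the largest remaining entry onto the diagonal
        pivot = max(range(col, k), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(k):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [v - factor * p for v, p in zip(a[row], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

# Synthetic, noise-free data (hypothetical values, not from the slides)
rng = random.Random(0)
x = [rng.uniform(10, 30) for _ in range(60)]
true_coefs = [100.0, -1.0, -3.0, -2.0]   # alpha, beta0, beta1, beta2
q = 2
y = [0.0] * len(x)
for t in range(q, len(x)):
    y[t] = true_coefs[0] + sum(true_coefs[j + 1] * x[t - j] for j in range(q + 1))

targets, rows = lag_rows(y, x, q)
estimates = ols(targets, rows)
print(len(rows))                          # 58 usable observations (T - q)
print([round(b, 4) for b in estimates])   # estimates of alpha, beta0, beta1, beta2
```

Because the synthetic Y contains no error term, the estimates should reproduce the chosen coefficients up to rounding; with real data they would come with standard errors and P-values as in Chapters 4-6.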
Aside on Lagged Variables
• Xt is the value of the variable in period t.
• Xt-1 is the value of the variable in period t-1 or “lagged one period” or “lagged X”.
Defining X and lagged X in a spreadsheet:
“X”     “lagged X”
X2      X1
X3      X2
X4      X3
...     ...
XT      XT-1
• Each column will have T-1 observations.
• In general, when creating “X lagged q periods” you will have T-q observations.
Example: Lagged Variables
T = 10
Yt = α + β1Xt + β2Xt-1 + β3Xt-2 + β4Xt-3 + et
        Col. A   Col. B   Col. C      Col. D      Col. E
        Y        X        X lagged    X lagged    X lagged
                          1 period    2 periods   3 periods
Row 1   Y4       X4       X3          X2          X1
...     ...      ...      ...         ...         ...
Row 7   Y10      X10      X9          X8          X7
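The spreadsheet layout above can be reproduced with a few lines of code. This is a minimal sketch in which string labels stand in for the data values, and the function name lagged_table is an assumption for this example.

```python
def lagged_table(y, x, q):
    """Return rows of (Yt, Xt, Xt-1, ..., Xt-q) for t = q+1, ..., T."""
    rows = []
    for t in range(q, len(x)):            # 0-based index t is period t + 1
        rows.append([y[t]] + [x[t - j] for j in range(q + 1)])
    return rows

T, q = 10, 3
y = [f"Y{t}" for t in range(1, T + 1)]    # labels stand in for data values
x = [f"X{t}" for t in range(1, T + 1)]

table = lagged_table(y, x, q)
print(len(table))     # 7 rows, i.e. T - q
print(table[0])       # ['Y4', 'X4', 'X3', 'X2', 'X1']
print(table[-1])      # ['Y10', 'X10', 'X9', 'X8', 'X7']
```

The first three periods are lost because X lagged 3 periods does not exist for them, which is why the table starts at Y4.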
Example: Long Run Prediction of a
Stock Market Price Index
The issue of whether stock market returns are predictable is a very important (but difficult) one in finance.
This is not a book on financial theory and, hence, we will not describe the theoretical model which motivates this example.
Variables: stock prices, dividends and returns.
The basic equation relating these three concepts is:
Return = Rt = (Pt − Pt-1 + Dt) / Pt-1 × 100,
where Rt is the return on holding a share from period t-1 through t,
Pt is the price of the stock at the end of period t,
Dt is the dividend earned between period t-1 and t.
This relationship, along with assumptions about how these variables evolve in the future, can be used to develop various theoretical financial models.
One example: the ratio of dividends to stock price should have some predictive power for future returns, particularly at long horizons.
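A small worked example of the return formula may help. The prices and dividend below are made-up numbers chosen for easy arithmetic, and the function name pct_return is an assumption for this sketch.

```python
def pct_return(price_now, price_prev, dividend):
    """Percentage return Rt = (Pt - Pt-1 + Dt) / Pt-1 * 100."""
    return (price_now - price_prev + dividend) / price_prev * 100

# Hypothetical values: price rises from 100 to 105 and a dividend of 2 is paid
r = pct_return(price_now=105.0, price_prev=100.0, dividend=2.0)
print(round(r, 2))   # 7.0
```

Here the capital gain contributes 5 percentage points and the dividend contributes 2, giving a 7% return for the period.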
How does such a theory relate to our regression model with lagged explanatory variables?
Dependent variable (Y) is the total return on the stock market index over a future period but the explanatory variable (X) is the current dividend-price ratio.
Yt+h = α + βXt + et+h,
where h is the forecast horizon and Yt+h is calculated using the returns Rt+1, Rt+2, ..., Rt+h.
Equivalently:
Yt = α + βXt-h + et.
This is a specialized version of the regression model with lagged explanatory variables (only the lag of order h appears).
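The alignment between Xt and Yt+h can be sketched as follows. The slides do not say how the h-period return is built from Rt+1, ..., Rt+h, so this example assumes a simple sum of the one-period returns; the numbers and the function name horizon_pairs are also assumptions.

```python
def horizon_pairs(returns, dp_ratio, h):
    """Pair Xt with Yt+h, where Yt+h is assumed to be Rt+1 + ... + Rt+h."""
    pairs = []
    for t in range(len(returns) - h):
        y_future = sum(returns[t + 1 : t + h + 1])   # returns over the next h periods
        pairs.append((dp_ratio[t], y_future))
    return pairs

# Hypothetical one-period returns (%) and dividend-price ratios
returns = [1.0, 2.0, -1.0, 0.5, 1.5]
dp_ratio = [3.0, 3.1, 2.9, 3.2, 3.0]

pairs = horizon_pairs(returns, dp_ratio, h=2)
print(pairs)   # [(3.0, 1.0), (3.1, -0.5), (2.9, 2.0)]
```

Each pair is one observation for the regression: the dividend-price ratio known today next to the return earned over the following h periods. The last h periods drop out because their future returns are not yet observed.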
Financial theory suggests that the explanatory power for this regression should be poor at short horizons (e.g. h=1 or 2) but improve at longer horizons.
Our data (monthly):
Y = twelve month returns (i.e. h=12) from a stock market
X = dividend-price ratio (twelve months ago).
         Coeff    t Stat   P-value   Lower 95%   Upper 95%
Inter.   -0.003   -0.662   0.508     -0.013      0.006
Xt-12     0.022    4.833   1.5E-6     0.013      0.032
The dividend-price ratio does have significant explanatory power for twelve month returns (since the P-value is less than .05).
The theory that the dividend-price ratio has some predictive power for long run returns is supported.
However, R² = 0.019, indicating that this predictive power is weak.
Only 1.9% of the variation in twelve month returns can be explained by the dividend-price ratio.
Example: The Effect of Bad News on
Market Capitalization
Motivation: Share price of a company can be sensitive to bad news.
E.g. Company B is in an industry which is particularly sensitive to the price of oil.
If the price of oil goes up, then the profits of Company B will tend to go down and some investors, anticipating this, will sell their shares in Company B driving its price (and market capitalization) down.
However, this effect might not happen immediately so lagged explanatory variables might be appropriate.
Monthly data for 5 years (i.e. 60 months) on the
following variables:
• Y = market capitalization of company ($000)
• X = price of oil (dollars per barrel)
Yt = α + β0Xt + β1Xt-1 + β2Xt-2 + β3Xt-3 + β4Xt-4 + et
Market Capitalization (cont.)
Results:
         Coeff.    St. Err.   t Stat   P-val    Lower 95%   Upper 95%
Inter.   92001.5   2001.7     45.96    6E-42    87979       96024
Xt       -145.0    47.6       -3.04    .0037    -240.7      -49.3
Xt-1     -462.1    47.7       -9.70    6E-13    -557.9      -366.4
Xt-2     -424.5    46.2       -9.19    3E-12    -517.3      -331.6
Xt-3     -199.6    47.8       -4.18    .0001    -295.5      -103.6
Xt-4     -36.9     47.5       -0.78    .44      -132.3      58.5
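To see how the columns of this table fit together, the sketch below recomputes the t statistics and approximate 95% confidence intervals from the reported coefficients and standard errors. It assumes 56 usable observations (60 months minus 4 lags) and 6 estimated coefficients, hence 50 degrees of freedom, and it hard-codes the tabulated critical value 2.009 for that case. Small differences from the table are expected because the reported figures are rounded.

```python
T_CRIT_50DF = 2.009        # assumed: two-sided 5% critical value of t with 50 df

T, q = 60, 4
n_obs = T - q              # 56 usable observations after creating 4 lags
df = n_obs - (q + 2)       # minus the intercept and the q + 1 slope coefficients

reported = {               # name: (coefficient, standard error) from the table
    "Xt":   (-145.0, 47.6),
    "Xt-1": (-462.1, 47.7),
    "Xt-2": (-424.5, 46.2),
    "Xt-3": (-199.6, 47.8),
    "Xt-4": (-36.9, 47.5),
}

intervals = {}
for name, (coef, se) in reported.items():
    t_stat = coef / se                       # t Stat column
    lower = coef - T_CRIT_50DF * se          # Lower 95% column
    upper = coef + T_CRIT_50DF * se          # Upper 95% column
    intervals[name] = (lower, upper)
    print(f"{name:5s} t = {t_stat:6.2f}   95% CI = ({lower:7.1f}, {upper:7.1f})")
```

Only the interval for Xt-4 contains zero, which matches its P-value of .44 being above .05.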
What can the company conclude about the effect of the price of oil on its market capitalization?
Y is measured in $000, so each coefficient is multiplied by 1,000 to express it in dollars.
Increasing the price of oil by $1 per barrel in a given month is associated with:
• An immediate reduction in market capitalization of $145,000, ceteris paribus.
• A reduction in market capitalization of $462,140 one month later, ceteris paribus.
• A reduction in market capitalization of $424,470 two months later, ceteris paribus.
• A reduction in market capitalization of $199,550 three months later, ceteris paribus.
• A reduction in market capitalization of $36,900 four months later, ceteris paribus.
Intuition about what the ceteris paribus condition implies:
“Increasing the oil price by one dollar in a given month will tend to reduce market capitalization in the following month by $462,140, assuming that no other changes in the oil price occur.”
Total effect = $145,000 + $462,140 + $424,470 + $199,550 + $36,900 = $1,268,060
“After four months, the effect of adding one dollar to the price of oil is to decrease market capitalization by $1,268,060”.
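The total effect is simply the running sum of the per-lag effects. A minimal sketch, using the dollar figures quoted on the slides (the variable names are assumptions for this example):

```python
# Reduction in market capitalization (dollars) by number of months after a
# $1 per barrel increase in the oil price, as quoted in the text above
effects = {0: 145_000, 1: 462_140, 2: 424_470, 3: 199_550, 4: 36_900}

running = 0
cumulative = {}
for lag in sorted(effects):
    running += effects[lag]
    cumulative[lag] = running
    print(f"after {lag} month(s): cumulative reduction = ${running:,}")

total = cumulative[4]
print(total)   # 1268060
```

The running totals show that most of the effect has accumulated by the second month after the price change.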
Selection of Lag Order
How to choose q (lag length)?
One way: Use t-tests discussed in Chapter 5 sequentially (another way is to use information criteria which we will discuss later).
Step 1
Choose the maximum possible lag length, qmax, that seems reasonable to you.
Step 2
Estimate the model:
Yt = α + β0Xt + β1Xt-1 + ... + βqmaxXt-qmax + et
If the P-value for testing βqmax=0 is less than the significance level you choose (e.g. .05) then go no further. Use qmax as lag length. Otherwise go on to the next step.
Selection of Lag Order (cont.)
Step 3
Estimate the model:
Yt = α + β0Xt + β1Xt-1 + ... + βqmax-1Xt-qmax+1 + et
If the P-value for testing βqmax-1=0 is less than the significance level you choose (e.g. .05) then go no further. Use qmax-1 as lag length. Otherwise go on to the next step.
Step 4
Estimate the model:
Yt = α + β0Xt + β1Xt-1 + ... + βqmax-2Xt-qmax+2 + et
If the P-value for testing βqmax-2=0 is less than the significance level you choose (e.g. .05) then go no further. Use qmax-2 as lag length. Otherwise go on to the next step, etc.
Aside: Lag Length
• The number of observations used in a model with lagged explanatory variables is equal to the original number of observations, T, minus the maximum lag length.
• In Step 2 you have T-qmax observations.
• In Step 3, T-qmax+1 observations, etc.
• Suppose qmax=4
• P-value for Xt-4 = .44>.05 (see previous table)
• Drop Xt-4 and re-estimate with q = 3.
         Coeff.     St. Err.   t Stat   P-value   Lower 95%   Upper 95%
Inter.   90402.2    1643.18    55.02    9E-48     87104.9     93699.5
Xt       -125.90    46.24      -2.72    .0088     -218.69     -33.11
Xt-1     -443.49    45.88      -9.67    3E-13     -535.56     -351.42
Xt-2     -417.61    45.73      -9.13    2E-12     -509.38     -325.84
Xt-3     -179.90    46.25      -3.89    .0003     -272.72     -87.09
• P-value for Xt-3 is .0003 < .05.
• Select q=3 and present this table in a report.
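The sequential procedure can be written as a short loop. In this sketch the function select_lag_order and its callback argument are hypothetical names; the callback stands in for "estimate the model with lag length q and report the P-value of the longest lag". Here it simply looks up the two P-values reported on the slides. Returning 0 when no lag is significant is an assumption, since the slides end the steps with "etc.".

```python
def select_lag_order(q_max, p_value_of_longest_lag, alpha=0.05):
    """Start at q_max and shorten the lag length until the longest lag is significant."""
    for q in range(q_max, 0, -1):
        if p_value_of_longest_lag(q) < alpha:
            return q        # longest lag is significant: stop here
    return 0                # assumption: no lag was significant, keep only Xt

# P-values of the longest lag from the two tables: q = 4 (Xt-4) and q = 3 (Xt-3)
reported_p_values = {4: 0.44, 3: 0.0003}

chosen_q = select_lag_order(4, lambda q: reported_p_values[q])
print(chosen_q)   # 3
```

With real data the callback would re-estimate the regression at each step, using T-q observations for lag length q.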
Chapter Summary
1. Regressions with time series variables involve two issues we have not dealt with in the past. First, one variable can influence another with a time lag. Second, if the variables are nonstationary, the spurious regression problem can result. The latter issue will be dealt with later on.
2. Distributed lag models have the dependent variable depending on an explanatory variable and lags of the explanatory variable.
3. If the variables in the distributed lag model are stationary, then OLS estimates are reliable and the statistical techniques of multiple regression (e.g. looking at P-values or confidence intervals) can be used in a straightforward manner.
4. The lag length in a distributed lag model can be selected by sequentially using t-tests beginning with a reasonably large lag length.
