Tute6Answers ECON339


Tutorial 6 (Week 7)

Assumptions and Diagnostics

Tutorial assignment:

What might Ramsey’s RESET test be used for?

Ans: Ramsey’s RESET test is a test of whether the functional form of the
regression is appropriate. In other words, we test whether the relationship
between the dependent variable and the independent variables really should be
linear, or whether a non-linear form would be more appropriate. The test works
by adding powers of the fitted values from the original regression as extra
regressors in a second regression. If the appropriate model were a linear one,
the powers of the fitted values would not be significant in this second
regression.
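The mechanics of the test can be sketched in Python on synthetic data (a minimal illustration only — the tutorial itself runs the test in EViews, and the data-generating numbers below are invented):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(1, 10, n)
y = x**2 + rng.normal(0, 1, n)           # true relationship is quadratic

# Restricted model: regress y on a constant and x only
Xr = np.column_stack([np.ones(n), x])
br, rss_r = np.linalg.lstsq(Xr, y, rcond=None)[:2]
fitted = Xr @ br

# Unrestricted model: add the squared fitted values (RESET with power 2)
Xu = np.column_stack([Xr, fitted**2])
bu, rss_u = np.linalg.lstsq(Xu, y, rcond=None)[:2]

# F-test on the added term: if it is significant, the linear form is rejected
q, k = 1, Xu.shape[1]
F = ((rss_r[0] - rss_u[0]) / q) / (rss_u[0] / (n - k))
p_value = stats.f.sf(F, q, n - k)
print(F, p_value)                        # tiny p-value here: reject linearity
```

Because the data were generated from a quadratic model, the squared fitted values pick up the curvature the linear model misses, and the null of correct (linear) specification is rejected.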

What could be done if it were found that the RESET test failed?

Ans: The test is performed under the null hypothesis that the model is linear.
Rejection of the null implies that a non-linear model is supported by the data;
however, the test does not tell us what the correct non-linear functional form
is.

If the regression fails Ramsey’s RESET test, the easiest “solution” is probably
to transform all of the variables into logarithms. This has the effect of
turning a multiplicative model into an additive one.

If the logged model still fails the test, then we have to accept that the
relationship between the dependent variable and the independent variables is
probably not linear after all. We must then either estimate a non-linear model
for the data (which is beyond the scope of this course) or go back to the
drawing board and run a different regression containing different variables.
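The log-transformation remedy can be illustrated on synthetic data (a sketch; the constants 3.0 and 1.5 are invented for the example). A multiplicative model y = A·x^β·e^u becomes linear in logs, ln y = ln A + β ln x + u, so OLS on the logged variables recovers β directly:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
x = rng.uniform(1, 10, n)
y = 3.0 * x**1.5 * np.exp(rng.normal(0, 0.1, n))    # multiplicative model

# After taking logs the model is linear: ln y = ln 3 + 1.5 ln x + u
X = np.column_stack([np.ones(n), np.log(x)])
beta, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
print(beta)                                          # approximately [ln 3, 1.5]
```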

Objectives:
1. Identify multicollinearity and possible solutions to the problem.
2. Perform a Chow test to determine whether parameters are the same for
different groups.

Question 1
The data are available in the file “hedonic1.xls”. Consider the following
multiple regression model for new houses only (age = 0).

Proc>Set sample> if age=0

a) Estimate the econometric model


ln(SP)_t = β1 + β2·SFLA_t + β3·BEDS_t + β4·BATHS_t + β5·STORIES_t + β6·VACANT_t + u_t

Quick>Estimate equation>
Log(selling_price) c sfla beds baths stories vacant

Dependent Variable: LOG(SELLING_PRICE)
Method: Least Squares
Date: 07/09/05 Time: 10:42
Sample: 1 6660 IF AGE=0
Included observations: 151

Variable            Coefficient   Std. Error   t-Statistic   Prob.

C                   11.24201      0.124654     90.18541      0.0000
SFLA                0.000707      5.48E-05     12.90426      0.0000
BEDS               -0.084611      0.033665    -2.513295      0.0131
BATHS               0.034427      0.084073     0.409495      0.6828
STORIES            -0.141884      0.060394    -2.349299      0.0202
VACANT              0.068117      0.035346     1.927130      0.0559

R-squared           0.784029   Mean dependent var      12.02137
Adjusted R-squared  0.776582   S.D. dependent var      0.431160
S.E. of regression  0.203797   Akaike info criterion  -0.304458
Sum squared resid   6.022332   Schwarz criterion      -0.184566
Log likelihood      28.98658   F-statistic             105.2772
Durbin-Watson stat  0.213214   Prob(F-statistic)       0.000000

b) Do the coefficients take the expected signs? Check for any evidence of
multicollinearity.

The coefficients for BEDS and STORIES do not take the expected signs: we would
expect more bedrooms and more storeys to raise the selling price, but this is
not borne out by the data for new houses. Vacant houses also appear on average
more expensive than occupied houses, although this effect is not significant at
the 5% level.

One way to check for multicollinearity is to check the correlation coefficients of all
the explanatory variables.

In the workfile window> hold control and click all the explanatory variables> right
click and select to open them as a group.

View> covariance analysis>tick the option correlation>OK

          BATHS      BEDS       SFLA       STORIES    VACANT
BATHS     1.000000   0.676223   0.862870   0.673716  -0.178327
BEDS      0.676223   1.000000   0.657986   0.515717  -0.074098
SFLA      0.862870   0.657986   1.000000   0.658058  -0.131856
STORIES   0.673716   0.515717   0.658058   1.000000  -0.029968
VACANT   -0.178327  -0.074098  -0.131856  -0.029968   1.000000

Correlations above 0.8 are commonly taken as evidence of multicollinearity.
There is, however, no strict cut-off: the higher the correlations between the
explanatory variables, the more severe the problem is likely to be.
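The same correlation check can be sketched in Python (synthetic data with invented parameters, since the real exercise computes the matrix in EViews):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 151
sfla = rng.normal(1600, 600, n)                # floor area
baths = 0.001 * sfla + rng.normal(0, 0.3, n)   # strongly tied to floor area
stories = rng.normal(1.5, 0.5, n)              # unrelated to the others

X = np.column_stack([sfla, baths, stories])
corr = np.corrcoef(X, rowvar=False)
print(np.round(corr, 3))

# Flag any pair with |r| above the informal 0.8 threshold
pairs = np.argwhere(np.abs(np.triu(corr, 1)) > 0.8)
print(pairs)                                   # expect the (sfla, baths) pair
```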

Another way to detect multicollinearity is to run auxiliary regressions, in
which each explanatory variable is regressed on the remaining explanatory
variables.

For example:
Quick>Estimate Equation>
Sfla c beds baths stories vacant

Dependent Variable: SFLA
Method: Least Squares
Date: 07/09/05 Time: 12:00
Sample: 1 6660 IF AGE=0
Included observations: 151

Variable            Coefficient   Std. Error   t-Statistic   Prob.

C                  -1312.035     153.8351     -8.528843      0.0000
BEDS                111.6102     50.00764      2.231862      0.0271
BATHS               1015.968     95.17536      10.67469      0.0000
STORIES             204.8847     89.63886      2.285668      0.0237
VACANT              6.589078     53.38976      0.123415      0.9019

R-squared           0.763486   Mean dependent var     1621.993
Adjusted R-squared  0.757006   S.D. dependent var     624.5071
S.E. of regression  307.8469   Akaike info criterion  14.32963
Sum squared resid   13836377   Schwarz criterion      14.42954
Log likelihood     -1076.887   F-statistic            117.8251
Durbin-Watson stat  0.343464   Prob(F-statistic)      0.000000

A high R-squared in the auxiliary regression indicates evidence of
multicollinearity; here R-squared is about 0.76, so SFLA is largely explained
by the other regressors.
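An auxiliary regression can be sketched as follows (synthetic data again; the variance inflation factor VIF = 1/(1 − R²) is a standard summary of the same idea, although the tutorial itself only reports the R-squared):

```python
import numpy as np

def aux_r_squared(X, j):
    """R-squared from regressing column j of X on the remaining columns."""
    y = X[:, j]
    A = np.column_stack([np.ones(len(y)), np.delete(X, j, axis=1)])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1 - (resid**2).sum() / ((y - y.mean())**2).sum()

rng = np.random.default_rng(3)
n = 151
sfla = rng.normal(1600, 600, n)
baths = 0.001 * sfla + rng.normal(0, 0.2, n)   # nearly collinear with sfla
stories = rng.normal(1.5, 0.5, n)
X = np.column_stack([sfla, baths, stories])

r2 = aux_r_squared(X, 0)                       # sfla on baths and stories
vif = 1 / (1 - r2)
print(r2, vif)                                 # high R-squared: collinearity
```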

c) What are the possible effects of multicollinearity? What are the possible solutions
to the problem?

Possible consequences include incorrect signs and sizes of the estimated
coefficients and inflated standard errors, so that variables can appear
individually insignificant by t-tests even though they are jointly significant
in an F-test.

Possible solutions include dropping one of the collinear variables, combining
them into a ratio, or gathering more data with which to estimate the model.

d) Create a dummy variable for the entire dataset which has a value of 1 for a new
house and 0 for any other house.

Proc> Set sample> clear the if statement

Genr> new=0
Genr> new=1, and type if age=0 in the sample window next to @all

Alternatively, in the command window above the workfile, type dum1 = (age=0).

A new variable dum1 will appear in the workfile. To check that the dummy has
been created properly, graph dum1: the spikes with value 1 should occur exactly
at the observations where age=0.
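In Python the same logical-expression trick creates the dummy in one line (a sketch with an invented age vector):

```python
import numpy as np

age = np.array([0, 5, 0, 12, 3, 0])
new = (age == 0).astype(int)   # 1 for a new house, 0 otherwise
print(new)                     # [1 0 1 0 0 1]
```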

e) Do a Chow test for the complete dataset to see if the equation changes depending
on whether the house is a new house or not.

Quick>Estimate equation>
Log(selling_price) c sfla beds baths stories vacant new new*sfla new*beds
new*baths new*stories new*vacant

Dependent Variable: LOG(SELLING_PRICE)
Method: Least Squares
Date: 07/09/05 Time: 12:24
Sample: 1 6660
Included observations: 6660

Variable            Coefficient   Std. Error   t-Statistic   Prob.

C                   11.06159      0.016676     663.3423      0.0000
SFLA                0.000465      1.12E-05     41.67081      0.0000
BEDS               -0.025994      0.007025    -3.700036      0.0002
BATHS               0.168266      0.009302     18.08933      0.0000
STORIES            -0.055975      0.010797    -5.184200      0.0000
VACANT             -0.111657      0.007842    -14.23829      0.0000
NEW                 0.180419      0.174618     1.033220      0.3015
NEW*SFLA            0.000242      7.72E-05     3.136150      0.0017
NEW*BEDS           -0.058617      0.047466    -1.234926      0.2169
NEW*BATHS          -0.133838      0.117601    -1.138073      0.2551
NEW*STORIES        -0.085909      0.084904    -1.011839      0.3117
NEW*VACANT          0.179774      0.049907     3.602154      0.0003

R-squared           0.538565   Mean dependent var     11.91930
Adjusted R-squared  0.537802   S.D. dependent var     0.418000
S.E. of regression  0.284178   Akaike info criterion  0.323366
Sum squared resid   536.8724   Schwarz criterion      0.335626
Log likelihood     -1064.810   F-statistic            705.3853
Durbin-Watson stat  0.818210   Prob(F-statistic)      0.000000

View>Coefficient tests>Wald – Coefficient restrictions>
C(7)=0,C(8)=0,C(9)=0,C(10)=0,C(11)=0,C(12)=0

Wald Test:
Equation: EQ01

Test Statistic   Value      df         Probability
F-statistic      4.102574   (6, 6648)  0.0004
Chi-square       24.61545   6          0.0004

H0: The coefficients are the same regardless of whether the house is new
H1: The coefficients change depending on whether the house is new
Assume a 5% level of significance.
p-value = 0.0004 < 0.05, so reject the null.
At the 5% level we conclude that the effects differ between new and existing
houses.
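The dummy-interaction version of the Chow test amounts to an F-test that all the interaction coefficients are jointly zero. It can be sketched on synthetic data (the coefficients and sample sizes below are invented for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n = 400
x = rng.normal(0, 1, n)
d = (rng.uniform(size=n) < 0.3).astype(float)      # group dummy ("new house")
y = 1.0 + 0.5 * x + 0.4 * d * x + rng.normal(0, 0.5, n)  # slope shifts when d=1

# Restricted model: same coefficients in both groups
Xr = np.column_stack([np.ones(n), x])
rss_r = np.linalg.lstsq(Xr, y, rcond=None)[1][0]

# Unrestricted model: intercept and slope may differ by group
Xu = np.column_stack([Xr, d, d * x])
rss_u = np.linalg.lstsq(Xu, y, rcond=None)[1][0]

q = 2                                              # restrictions: d and d*x
k = Xu.shape[1]
F = ((rss_r - rss_u) / q) / (rss_u / (n - k))
p_value = stats.f.sf(F, q, n - k)
print(F, p_value)                                  # small p: coefficients differ
```

This is the same calculation the Wald test performs on the estimated equation: the restricted model pools the two groups, the unrestricted model lets the dummy and interaction terms shift the coefficients, and the F-statistic compares the two residual sums of squares.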
