Solution Manual For Microeconometrics
BOOK PREFACE
This book provides a detailed treatment of microeconometric analysis, the analysis of individual-level data on the economic behavior of individuals or firms. This usually entails regression methods
applied to cross-section and panel data.
The book aims to provide the practitioner with a comprehensive coverage of statistical methods
and their application in modern applied microeconometrics research. These methods include
nonlinear modelling, inference under minimal distributional assumptions, identifying and
measuring causation rather than mere association, and correcting for departures from simple
random sampling. Many of these features are of relevance to individual-level data analysis
throughout the social sciences.
The ambitious agenda has determined the characteristics of this book. First, although oriented to
the practitioner, the book is relatively advanced in places. A cookbook approach is inadequate because
when two or more complications occur simultaneously, a common situation, the practitioner must
know enough to be able to adapt available methods. Second, the book provides considerable
coverage of practical data problems (see especially the last three chapters). Third, the book includes
substantial empirical examples in many chapters, to illustrate some of the methods covered. Finally,
the book is unusually long. Despite this length we have been space-constrained. We had intended to
include even more empirical examples, and abbreviated presentations will at times fail to recognize
the accomplishments of researchers who have made substantive contributions.
The book assumes a basic understanding of the linear regression model with matrix algebra. It is
written at the mathematical level of the first-year economics Ph.D. sequence, comparable to Greene
(2003). We have two types of readers in mind. First, the book can be used as a course text for a
microeconometrics course, typically taught in the second year of the Ph.D., or for data-oriented
microeconomics field courses such as labor economics, public economics and industrial
organization. Second, the book can be used as a reference work for graduate students and applied
researchers who despite training in microeconometrics will inevitably have gaps that they wish to
fill.
For instructors using this book as an econometrics course text it is best to introduce the basic
nonlinear cross-section and linear panel data models as early as possible, initially skipping many of
the methods chapters. The key methods chapter (chapter 5) covers maximum likelihood and
nonlinear least squares estimation. ML and NLS provide adequate background for the most
commonly-used nonlinear cross-section models (chapters 14-17, 20), basic linear panel data models
(chapter 21) and treatment evaluation methods (chapter 25). Generalized method of moments
estimation (chapter 6) is needed especially for advanced linear panel data methods (chapter 22).
For readers using this book as a reference work, the chapters have been written to be as self-contained as possible. The notable exception is that some command of general estimation results in
chapter 5, and occasionally chapter 6, will be necessary. Most models chapters are structured to
begin with a discussion and example that is accessible to a wide audience.
The web-site www.econ.ucdavis.edu/faculty/cameron/mmabook provides all the data and
computer programs used in this book, and related materials useful for instructional purposes.
This project has been long and arduous, and at times seemingly without an end. Its completion
has been greatly aided by our colleagues, friends, and graduate students. We would like to thank
especially the following for reading and commenting on specific chapters: Bijan Borah, Kurt
Brännäs, Pian Chen, Tim Cogley, Partha Deb, David Drukker, Massimiliano De Santis, Jeff Gill,
Tue Gorgens, Shiferaw Gurmu, Lu Ji, Oscar Jorda, Roger Koenker, Chenghui Li, Tong Li, Doug
Miller, Murat Munkin, Jim Prieger, Ahmed Rahmen, Sunil Sapra, Haruki Seitani, Yacheng Sun,
Xiaoyong Zheng, and David Zimmer. We thank Rajeev Dehejia, Bronwyn Hall, Cathy Kling,
Jeffrey Kling, Will Manning, Brian McCall and Jim Ziliak for making their data available for
empirical illustrations. We thank our respective departments for facilitating our collaboration, and
for the production and distribution of the draft manuscript at various stages. We benefitted from the
comments of two anonymous reviewers. Guidance, advice and encouragement from our CUP
editor, Scott Parris, have been invaluable.
Our interest in econometrics owes much to the training and environments we encountered as
students and in the initial stages of our academic careers. The first author thanks The Australian
National University, Stanford University, especially Takeshi Amemiya and Tom MaCurdy, and The
Ohio State University. The second author thanks the London School of Economics and The
Australian National University.
Our interest in writing a book oriented to the practitioner owes much to our exposure to the
research of graduate students and colleagues at our respective institutions, UC-Davis and IU-Bloomington.
Finally, we would like to thank our families for their patience and understanding without which
completion of this project would not have been possible.
A. Colin Cameron
Davis, California
Pravin K. Trivedi
Bloomington, Indiana
TABLE OF CONTENTS
I: PRELIMINARIES
1. Overview
2. Causal and Noncausal Models
3. Microeconomic Data Structures
II: CORE METHODS
4. Linear Models
5. ML and NLS Estimation
6. GMM and Systems Estimation
7. Hypothesis Tests
8. Specification Tests and Model Selection
9. Semiparametric Methods
10. Numerical Optimization
III: SIMULATION-BASED METHODS
11. Bootstrap Methods
12. Simulation-based Methods
13. Bayesian Methods
IV: CROSS-SECTION DATA MODELS
14. Binary Outcome Models
15. Multinomial Models
16. Tobit and Selection Models
17. Transition Data: Survival Analysis
18. Mixture Models and Unobserved Heterogeneity
19. Models of Multiple Hazards
20. Count Data Models
V: PANEL MODELS
A. Asymptotic Theory
B. Making Pseudo-Random Draws
Part 2 presents the core methods (least squares, method of moments, and maximum likelihood) of estimation and inference in nonlinear regression models that are central in microeconometrics.
Both traditional topics and more modern topics like quantile regression, sequential
estimation, empirical likelihood, bootstrap, and semi- and nonparametric regression are covered. In
general the discussion is at a level intended to provide enough background and detail to enable the
practitioner to read and comprehend articles in the leading econometrics journals. We presume
prior familiarity with linear regression analysis.
Chapter 4 begins with the linear regression model. It then covers at an introductory level quantile
regression, which models distributional features other than the conditional mean. It provides a
lengthy expository treatment of instrumental variables estimation, a major semiparametric method
of causal inference. Chapter 5 presents the most commonly-used estimation methods for nonlinear
models, beginning with the quite general topic of m-estimation, before specialization to maximum
likelihood and nonlinear least squares regression. Chapter 6 provides a comprehensive treatment of
generalized method of moments, which is a quite general estimation framework, applicable both in
linear and nonlinear, and single- and multi-equation settings. The chapter emphasizes the special
case of instrumental variables estimation.
Chapter 7 covers both the classical and bootstrap approaches to hypothesis testing, while Chapter 8
presents relatively more modern methods of model selection and specification analysis. Because of
their importance the bootstrap methods also get a more detailed stand-alone treatment in Chapter
11. As much as possible testing methods are presented in a unified manner in these chapters, but
specific applications occur throughout the book.
Chapter 9 is a stand-alone chapter that presents nonparametric and semiparametric estimation
methods that place a flexible structure on the econometric model. Chapter 10 presents the
computational methods used to compute the nonlinear estimators presented in chapters 5 and 6.
This material becomes especially relevant to the practitioner if an estimator is not automatically
computed by an econometrics package.
Part 1 emphasized that: (1) Microeconometric models are often nonlinear; (2) they are frequently
estimated using large and heterogeneous data sets; and (3) the data often come from surveys that
are complex and subject to a variety of sampling biases. A realistic depiction of the economic
phenomena in such settings often requires the use of models that are difficult to estimate and
analyze. Advances in computing hardware and software now make it feasible to tackle such tasks.
Part 3 presents modern, computer-intensive, simulation-based methods of inference that mitigate
some of these difficulties. The background required to cover this material varies somewhat with the
chapter but the essential base is least squares and maximum likelihood estimation.
Chapter 11 presents bootstrap methods for statistical inference. These methods have the attraction
of providing a simple way to obtain standard errors when the formulae from asymptotic theory are
complex, as is the case for some two-step estimators. Furthermore, if implemented appropriately, a
bootstrap can lead to a more refined asymptotic theory that may then lead to better statistical
inference in small samples.
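To make the standard-error use of the bootstrap concrete, here is a minimal Python sketch. It is not code from the book, whose programs are mainly in Stata. The function name bootstrap_se, the default of 999 replications, and the seeds are illustrative assumptions. The sketch resamples observations with replacement, recomputes the statistic on each resample, and reports the standard deviation of the replicated estimates. It does not attempt the asymptotic refinements mentioned above.

```python
import random
import statistics


def bootstrap_se(data, statistic, n_reps=999, seed=10101):
    """Nonparametric bootstrap standard error of statistic(data).

    n_reps and seed are illustrative defaults, not values from the book.
    """
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_reps):
        # Draw n observations with replacement from the original sample.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    # The standard deviation across replications is the bootstrap SE.
    return statistics.stdev(estimates)


if __name__ == "__main__":
    rng = random.Random(1)
    sample = [rng.gauss(0.0, 1.0) for _ in range(200)]
    # For the mean an analytic formula exists, which gives a check.
    analytic = statistics.stdev(sample) / len(sample) ** 0.5
    print("analytic SE of mean:   ", round(analytic, 4))
    print("bootstrap SE of mean:  ", round(bootstrap_se(sample, statistics.mean), 4))
    # For the median the analytic formula is less convenient.
    print("bootstrap SE of median:", round(bootstrap_se(sample, statistics.median), 4))
```

For the sample mean the bootstrap value can be compared with the usual formula s/sqrt(n); the same function then applies unchanged to statistics, such as the median, for which the formula is less convenient.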
Chapter 12 presents simulation-based estimation methods. These methods permit estimation in
situations where standard computational methods may not permit calculation of an estimator,
because of the presence of an integral over a probability distribution for which there is no closed-form
solution.
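As an illustration of the kind of integral involved, the following Python sketch approximates a probability of the form E[Lambda(a + sigma*v)], where Lambda is the logistic function and v is standard normal, by averaging over random draws. This is a generic Monte Carlo integration example under assumed names (simulated_probability, n_draws, seed); it is not one of the book's programs, and the logit-with-normal-heterogeneity setting is chosen only because the integral has no closed form.

```python
import math
import random


def logistic(z):
    """Logistic cdf, Lambda(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def simulated_probability(index, sigma, n_draws=10000, seed=10101):
    """Approximate the integral of Lambda(index + sigma*v) * phi(v) dv.

    phi is the standard normal density. The integral is replaced by an
    average over n_draws pseudo-random draws of v (illustrative default).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        v = rng.gauss(0.0, 1.0)
        total += logistic(index + sigma * v)
    return total / n_draws


if __name__ == "__main__":
    # More draws reduce simulation noise at the cost of computing time.
    for draws in (100, 1000, 10000):
        print(draws, simulated_probability(0.5, 1.0, n_draws=draws))
```

In a simulation-based estimator an average of this kind would replace the intractable integral inside the objective function, with the same draws reused at every parameter value.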
Chapter 13 surveys Bayesian methods that provide an approach to estimation and inference that is
quite different from the classical approach used in other chapters of this book. Despite this different
approach, the Bayesian toolkit can also be adapted to permit classical estimation and inference for
problems that are otherwise intractable.
Part 4, consisting of chapters 14 to 20, covers the core nonlinear limited dependent variable models
for cross-section data, defined by the range of values taken by the dependent variable. Topics
covered include models for binary and multinomial data, duration data and count data. The
complications of censoring, truncation and sample selection are also studied.
Chapters 14-15 cover models for binary and multinomial data that are standard in the analysis of
discrete choice and outcomes. Maximum likelihood methods are dominant. Different
parameterizations for the conditional probabilities in these models lead to different models, notably
logit and probit models, which are well-established. Recent literature has focused on less restrictive
modeling with more flexible functional forms for conditional probabilities and on accommodating
individual unobserved heterogeneity. These objectives motivate the use of semiparametric methods
and simulation-based estimation methods.
Censoring, truncation or sample selection generate empirically several important classes of models
that are analyzed in Chapter 16. The long-established Tobit model is central to this literature, but its
estimation and inference rely on strong distributional assumptions to permit consistent estimation.
We also examine the newer semiparametric methods that require weaker assumptions.
Chapters 17-19 consider duration models in which the focus is on either the determinants of spell
lengths, such as length of an unemployment spell, or on modeling the hazard rate of transitions from
one initial state to another. The relative importance of state dependence and unobserved
heterogeneity as determinants of the average length of spell is a central issue, whose resolution
raises fundamental questions about alternative modeling approaches. The analysis covers both
discrete and continuous time models, and both parametric and semiparametric formulations,
including the standard models like the exponential, the Weibull, and the proportional hazards
model. Chapter 18 covers formulation and interpretation of richer models that incorporate
unobserved heterogeneity. Chapter 19 deals with models with several types of events using the
competing risks formulation and models of multiple spells.
Chapter 20 covers the analysis of event counts of the kind very common in health economics. There
are many strong connections and parallels between count data models and duration models because
of their common foundation in stochastic processes. We analyze the widely-used Poisson and
negative binomial regression models, together with important variants such as the two-part or
hurdle model, zero-inflated models, latent class models, and endogenous regressor models, all of
which accommodate different facets of the event processes.
Cross section models have certain inherent limitations. They are predominantly equilibrium models
that generally do not shed light on intertemporal dependence of events. They also cannot
satisfactorily resolve fundamental issues about the sources of persistence in behavior. Such
persistence may be behavioral, i.e. arising from true state dependence, or it may be spurious, being
an artifact of the inability to control for heterogeneous behavior in the population. Because panel
data, also called longitudinal data, contain periodically repeated observations of the same subjects,
they have a large potential for resolving issues that cross section models cannot satisfactorily
handle. Chapters 21 through 23 present methods for panel data. We progress systematically from
linear models for continuous data in Chapter 21 to nonlinear panel data models for limited
dependent variables in Chapter 23. Both fixed effects and random effects models are considered. A
persistent theme through these three chapters is the importance of using robust methods of
inference.
Chapter 21, which reviews the key general results for linear panel data regression models, can be
read easily by those with a good grasp of linear regression; it does not require the material covered
in Parts 2 to 4. We recommend that even those who are interested in more advanced material should
quickly peruse the contents of this chapter first to gain familiarity with key concepts and
definitions.
Chapter 22 covers important extensions of Chapter 21, especially to dynamic panels which allow
for Markovian dependence structure of current variables. The analysis is in the GMM framework
that is currently favored by many practitioners in this area. The analysis here is at times intricate,
involving many issues of detail. A strong grasp of GMM will be helpful in absorbing the main
results of this chapter.
The results of Chapters 21 and 22 do not extend to nonlinear panel models of Chapter 23 in a
general and unified fashion. There are relatively fewer general results for limited dependent variable
panel models. Despite this, in Chapter 23 we begin by presenting an analysis of some general issues
and approaches. Later sections can be treated as panel data extensions of the counterpart cross
section models in Part 4. These analyze four categories of models for binary, count, censored, and
duration data, respectively. These should be accessible to a suitably prepared reader familiar with
the parallel cross section models.
Frequently in empirical work data present not one but multiple complications that the analysis must
simultaneously deal with. Examples of such complications include departures from simple random
sampling, clustering of observations, measurement errors, and missing data. When they occur,
individually or jointly, and in the context of any of the models developed in Parts 4 and 5,
identification of parameters of interest will be compromised. Three chapters in Part 6 (Chapters
24, 26, and 27) analyze the consequences of such complications and then present methods that
attempt to overcome the consequences. The methods are illustrated using examples taken from the
earlier parts of the book. This feature gives points of connection between Part 6 and the rest of the
book.
Chapter 24, which deals with features of data from complex surveys, complements various topics
covered in Chapters 3, 5, and 16. Chapter 26, which deals with measurement errors, complements
topics in Chapters 4, 14, and 20. Chapter 27 is a stand-alone chapter on missing data and multiple
imputation, but its use of the EM algorithm and Gibbs sampler also gives it points of contact with
Chapters 10 and 13, respectively.
Chapter 25 deals with the important topic of treatment evaluation. Treatment is a broad term that
refers to the impact of one variable, e.g. schooling, on some outcome variable, e.g. income.
Treatment variables may be exogenously assigned, or may be endogenously chosen. The topic of
treatment evaluation concerns the identifiability of the impact of treatment on outcome, as measured
by either the marginal effects or certain functions of marginal effects. A variety of methods are used
including instrumental variables regression and propensity score matching. The problem of
treatment evaluation can arise in the context of any model considered in parts 4 and 5. This chapter
may also be read on its own, but it does presume familiarity with many other topics covered in the
book, including instrumental variables and selection models, which is why it is placed in the last
part.
The book assumes a basic understanding of the linear regression model with matrix algebra. It is
written at the mathematical level of the first-year economics Ph.D. sequence, comparable to Greene
(2000).
While some of the material in this book is covered in a first-year sequence, most of the material in
this book appears in second year econometrics Ph.D. courses or in data-oriented microeconomics
field courses such as labor economics, public economics or industrial organization. This book is
intended to be used as both an econometrics text and as an adjunct for such field courses. More
generally, the book is intended to be useful as a reference work for applied researchers in
economics, in related social sciences such as sociology and political science, and in epidemiology.
The models chapters have been written to be as self-contained as possible, to minimize the amount
of background material in the methods chapters that needs to be read. For the specific models
presented in parts four and five (chapters 14-23) it will generally be sufficient to read the relevant
chapter in isolation, except that some command of the general estimation results in chapter 5 and in
some cases chapter 6 will be necessary. Most chapters are structured to begin with a discussion and
example that is accessible to a wide audience.
For instructors using this book as a course text it is best to introduce the basic nonlinear cross-section and linear panel data models as early as possible, skipping many of the methods chapters.
The most commonly-used nonlinear cross-section models are presented in chapters 14-16, and
require knowledge of maximum likelihood and least squares estimation, presented in chapter five.
Chapter twenty-one on linear panel data models requires even less preparation, essentially just
chapter four.
Table 1.2 provides an outline for a one-quarter second-year graduate course taught at the University
of California - Davis, immediately following the required first-year statistics and econometrics
sequence. A quarter provides sufficient time to cover the basic results given in the first half of the
chapters in this outline. With additional time one can go into further detail or cover a subset of
chapters eleven to thirteen on computationally-intensive estimation methods (simulation-based
estimation, the bootstrap, which is also briefly presented in chapter seven, and Bayesian methods);
additional cross-section models (durations and counts) presented in chapters seventeen to twenty;
and additional panel data models (linear model extensions and nonlinear models) given in chapters
twenty-two and twenty-three.
Outline of a twenty-lecture ten-week course:
Lectures   Chapter   Topic
1-3        4         Review of linear models and asymptotic theory
4-7
           10
9-11       14, 15
12-14      16
15                   Estimation: GMM
16
17-19      21
20         9         Estimation: Semiparametric
At Indiana University - Bloomington, a fifteen-week semester long field course in
microeconometrics is based on material in most of Parts 4 and 5 (chapters 14-23). The prerequisite
courses for this course cover material similar to the material in Part 2 (chapters 4-10).
Some exercises are provided at the end of each chapter after the first three introductory chapters.
These exercises are usually learning-by-doing exercises; some are purely methodological while
others entail analysis of generated or actual data. The level of difficulty of the questions is mostly
related to the level of difficulty of the topic.
Detailed programs and data for all the data applications (using either actual data or generated data)
will be made available at the book website.
ADVANCE REVIEWS
"This book presents an elegant and accessible treatment of the broad range of rapidly expanding
topics currently being studied by microeconometricians. Thoughtful, intuitive, and careful in laying
out central concepts of sophisticated econometric methodologies, it is not only an excellent
textbook for students, but also an invaluable reference text for practitioners and researchers."
- Cheng Hsiao, University of Southern California
"I wish "Microeconometrics" was available when I was a student! Here, in one place -- and in clear
and readable prose -- you can find all of the tools that are necessary to do cutting-edge applied
economic analysis, and with many helpful examples."
- Alan Krueger, Princeton University
"Cameron and Trivedi have written a remarkably thorough and up-to-date treatment of
microeconometric methods. This is not a superficial cookbook; the early chapters carefully lay the
theoretical foundations on which the authors build their discussion of methods for discrete and
limited dependent variables and for analysis of longitudinal data. A distinctive feature of the book
is its attention to cutting-edge topics like semiparametric regression, bootstrap methods, simulation-based estimation, and empirical likelihood estimation. A highly valuable book."
- Gary Solon, University of Michigan
"The empirical analysis of micro data is more widespread than ever before. The book by Cameron
and Trivedi contains a superb treatment of all the methods that economists like to apply to such
data. What is more, it fully integrates a number of exciting new methods that have become
applicable due to recent advances in computer technology. The text is in perfect balance between
econometric theory and empirical intuition, and it contains many insightful examples."
-
Example
4.5.3
84-5
* mma04p1wls.asc
4.6.4
88-90
Quantile
and
Regression
qreg0902.dta
qreg0902.asc
4.8.8
102-3
Instrumental
Regression
4.9.6
110-2
Median mma04p2qreg.do
mma04p2qreg.txt
Variables mma04p3iv.do
mma04p3iv.txt
Data
[* means generated]
or
* mma04p3iv.asc
DATA66.dat
DATA66.dct
and
* mma05data.asc
* mma05data.asc
* mma05data.asc
5.9.4
* mma05data.asc
6.5.4
198-9
Nonlinear
Limdep
* mma06p1nl2sls.asc
6.5.4
198-9
* mma06p1nl2sls.asc
7.4
241-3
Likelihood-based
Hypothesis Tests
* mma07p1mltests.asc
7.6.3
248-9
No data
7.7.1-5 250-4
Data
for
many
simulations not saved
7.8
254-6
Bootstrap example
* mma07p4boot.asc
8.2.9
* mma08p1cmtests.asc
8.5.5
283-4
Nonnested
2SLS:
models
Using mma06p1nl2sls.lim
mma06p1nl2sls.out
mma07p1mltests.do
mma07p1mltests.txt
mma07p4boot.do
mma07p4boot.txt
test mma08p2nonnested.do
example
mma08p2nonnested.txt
8.7.3
290-1
Model
example
diagnostics mma08p3diagnostics.do
mma08p3diagnostics.txt
9.2
295-7
Nonparametric
density mma09p1np.do
estimation and regression: mma09p1np.txt
application
mma08p2nonnested.asc
*
mma08p3diagnostics.asc
* mma09p2npmore.asc
9.3.3
* mma09p3kernels.asc
299-300
10.2.5 338-9
mma09p3kernels.do
mma09p3kernels.txt
PROGRAMS:
No data
(chapters 11-13)
Section
Pages
Example
Data
11.3
366-8
Bootstrap example
mma11p1boot.do
mma11p1boot.txt
* mma11p1boot.asc
12.3.3
391-2
Integral
Example
12.4.5,
12.5.6
397-7,
403-4
Maximum
Simulated mma12p2mslmsm.do
Likelihood and Maximum mma12p2mslmsm.txt
Simulated Score Example
*
mma12p2mslmsm.asc
12.8.2
412-3
No data
13.2.2
424
No data
13.6
452-4
PROGRAMS:
IV.
Models
for
Cross-Section
Data
Section Pages
Example
14.2
Logit
and
Probit mma14p1binary.do
Application (fishing mode) mma14p1binary.txt
464-5
(chapters
14-20)
Data
Nldata.asc
14.7.5
486
mma14p1binary.asc
15.2.1- 491-5
3
Nldata.asc
15.6.3
511
Nldata.asc
15.2.2
493-4
Nldata.asc
mma15p3mnl.lim
mma15p3mnl.out
15.2.1- 491-5
3
mma15p4gev.asc
16.2.1
530-1,
565
mma16p1tobit.asc
16.3.4
540
No data
16.6
553-5
Selection
Application
expenditures)
17.2
17.5.1
574-5
581-3
strkdur.dta
strkdur.asc
17.5.1
581-2
Data in program
17.6.1
584-6
Weibull
distribution mma17p3weib.do
functions plotted
mma17p3weib.txt
No data
17.11
603-8
ema1996.dta
or ema1996.asc
18.8
632-6
19.5
658-3
ema1996.dta
or ema1996.asc
20.2
20.7
671-4
690
randdata.dta
mma20p1count.asc
mma16p2mills.do
mma16p2mills.txt
Model mma16p3selection.do
(medical mma16p3selection.txt
randdata.dta
or
mma16p3selection.asc
or
PROGRAMS:
V.
Models
for
Data
(chapters
Pages
21.3.1-3
MOM.dat
21.3.2
21.3.4
710
719
MOM.dat
21.3.4
713-5
Linear
Panel
Residual mma21p3panresiduals.do
Analysis (hours and wages) mma21p3panresiduals.txt
MOM.dat
21.5.5
725
MOM.dat
22.3
754-6
Linear
Panel
GMM mma22p1gmmpanel.do
Application (hours and mma22p1gmmpanel.txt
wages)
MOMprecise.dat
23.3
792-5
patr7079.asc
VI.
Example
21-23)
Section
PROGRAMS:
Example
Panel
Further
Methods
Section
Pages
24.7
Data
(chapters
Clustered
Poisson mma24p2poiscluster.do
Regression
(individual mma24p2poiscluster.txt
pharmacy visits clustered on
commune)
24-27)
Data
vietnam_ex1.dta
or vietnam_ex1.asc
vietnam_ex2.dta
or vietnam_ex2.asc
25.8.1-4
889-93 Treatment
Evaluation: mma25p1treatment.do
Simple
calculations mma25p1treatment.txt
(training on earnings)
nswpsid.da1
or nswpsid.dta
25.8.5
893-6
nswpsid.da1
or nswpsid.dta
25.8
889-96 Treatment
Treatment
Evaluation: mma25p2matching.do
Propensity score matching mma25p2matching.txt
(training on earnings):
Evaluation: mma25p3extra.do
nswre74_treated.dta
26.5
919-20 Measurement
Example
27.8
935-9
Error
Bias To
Missing
Data
MCMC To come
Imputation Example
and
nswre74_control.dta
or nswre74_all.asc
propensity_cps.dta
or
propensity_cps.asc
come Generated data
Generated data
DATA
SETS
Data in fixed format text files have extension .asc or .dat [and if a Stata dictionary is used the extension is
.dct].
Stata data files have extension .dta.
We thank Rajeev Dehejia, Bronwyn Hall, Cathy Kling, Jeffrey Kling, Will Manning, Brian McCall
and Jim Ziliak for making their data available for empirical illustrations. The relevant citations are
given below. For "Authors' extract" the citation is A. C. Cameron and P. K. Trivedi (2005),
"Microeconometrics: Methods and Applications," Cambridge University Press, New York.
Many more examples use generated data - see programs.
Pages
Topic
Data Source
Data
88-90
or
110-2
Instrumental
National
Longitudinal
Survey DATA66.dat
variables with weak J. R. Kling (2001) "Interpreting DATA66.dct
instruments
Instrumental Variables Estimates of the
Return to Schooling," Journal of Business
and Economic Statistics, 19, 358-364.
and
295-7
300
Panel Survey of
Nonparametric
density estimation Authors' extract
and regression
463-6
486
491-5
Binary
multinomial
outcomes
553-6
565
Selection models
574-5
582
Duration models
Strike
duration
data strkdur.asc
J. Kennan (1985), "The Duration of strkdur.asc
Contract strikes in U.S. Manufacturing,"
Journal of Econometrics, 28, 5-28.
or
603-8
632-6
658-62
Duration models
or
671-4
692
708-15
Linear
panel Panel Survey of Income Dynamics MOM.dat
models: basics
J. Ziliak (1997), "Efficient Estimation
Income
Dynamics psidf3050.dat
choice
data Nldata.asc
and Fishing-mode
J. A. Herriges and C. L. Kling (1999), mma15p4gev.asc
"Nonlinear Income Effects in Random
Utility Models," Review of Economics
and Statistics, 81, 62-72.
or
Experiment randdata.dta
or
mma16p3selection.asc
Linear
panel Panel Survey of Income Dynamics MOMprecise.dat
models: GMM
J. Ziliak (1997) - see previous cite.
792-5
Nonlinear
models
848-53
Clustered data
889-95
panel Patents-R&D
data patr7079.asc
B. H. Hall, Z. Griliches and J. A.
Hausman (1986), "Patents and R&D: Is
There a Lag?", International Economic
Review, 27, 265-283.
Treatment
evaluation
[nswpsid:
NSW
treated vs PSID
control used in text.
The other data sets
not used in text but
used
in
mma25p3extra.do]
vietnam_ex1.dta
vietnam_ex1.asc
vietnam_ex2.dta
vietnam_ex2.asc
nswpsid.da1
or
nswpsid.dta
nswre74_treated.dta
and
nswre74_control.dta
or
nswre74_all.asc
propensity_cps.dta
or propensity_cps.asc
or
or
Most programs are in Stata version 8.0, executed on a MSWindows PC with Stata 8.2.
Stata 7 will usually be okay. Exceptions where Stata 8 is needed include:
(1) Estimates command (for tabulating regression results) is not available in version 7.
Comment out occurrences of "estimates store ..."
and "estimates table ...."
(2) Graphics commands (used to obtain the figures in the book) changed substantially from 7 to 8.
This only affects generating figures. If graphs are important, it is best to upgrade to Stata 8, as it is so
much better.
(3) In some places free Stata add-ons have been included. These are noted in programs.
To download these programs, e.g. knnreg, in Stata give the command "search knnreg" and follow
directions.
The Stata programs vary from very problem-specific code to code that potentially can be adapted to
one's own needs.
Some programs use Limdep version 7.0 and Nlogit 2.0, executed on an MSWindows PC.
Some programs use SAS / IML. SAS version 8.0 used on a Unix machine.
FILE NAMING CONVENTIONS:
For Stata: as an example for chapter 4.5.3 we provide:
- mma04p1wls.do Stata program
- mma04p1wls.txt Output from this program
- mma04p1wls.asc The generated data as fixed width ascii data set
[permits analysis with programs other than Stata]
For Limdep: as an example for chapter 14.5.3 we provide:
- mma15p3mnl.lim Limdep program
- mma15p3mnl.out Output from this program
For SAS: as an example for chapter 13.6 we provide:
- mma13p2bayesgibbs.sas SAS program
- mma13p2bayesgibbs.lst SAS output
- mma13p2bayesgibbs.log SAS logfile
For data sets the extensions are:
- .dta for Stata data set
- .asc for ascii (text) data set that is usually both space delimited and fixed width
For descriptions of the data sets see the relevant program that uses the data set, and the associated
output.
PROGRAM CPU TIME
Programs generally take little time to run.
The exception is programs that entail simulation, including bootstrapping.
Programs can be sped up by reducing the number of simulations / replications, though final
analysis should use many simulations / replications.
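For readers without Stata, the design of the first example program, mma04p1wls.do (chapter 4.5.3), can be sketched in plain Python. The sketch below is an assumption-laden translation, not the authors' code: the function names simulate and ols_slope_with_ses are hypothetical, Python's random number generator differs from Stata's so the numbers will not match the program's log, and only the OLS slope is treated. It generates y = 1 + x + u with u = |x|*e, x ~ N(0, 5^2), e ~ N(0, 2^2), then computes the usual standard error and a heteroskedasticity-robust (sandwich) standard error with the n/(n-2) degrees-of-freedom scaling.

```python
import math
import random


def simulate(n=100, seed=10105):
    """Generate y = 1 + 1*x + u with u = |x|*e, so V[u|x] = 4*x^2."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 5.0)   # x ~ N(0, 5^2)
        e = rng.gauss(0.0, 2.0)   # e ~ N(0, 2^2)
        u = abs(x) * e            # conditionally heteroskedastic error
        xs.append(x)
        ys.append(1.0 + 1.0 * x + u)
    return xs, ys


def ols_slope_with_ses(xs, ys):
    """OLS slope of y on x (with intercept), with usual and robust SEs."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    resid = [y - intercept - slope * x for x, y in zip(xs, ys)]
    # Usual SE: assumes a constant error variance s2.
    s2 = sum(r * r for r in resid) / (n - 2)
    se_usual = math.sqrt(s2 / sxx)
    # Robust SE: sandwich form using squared residuals, scaled by n/(n-2).
    meat = sum(((x - xbar) * r) ** 2 for x, r in zip(xs, resid))
    se_robust = math.sqrt(n / (n - 2) * meat / sxx ** 2)
    return slope, se_usual, se_robust


if __name__ == "__main__":
    xs, ys = simulate()
    slope, se_usual, se_robust = ols_slope_with_ses(xs, ys)
    print("slope:", slope)
    print("usual SE:", se_usual)
    print("robust SE:", se_robust)
```

Because the error variance grows with x^2, the usual standard error tends to understate the sampling variability of the slope in this design, which is why the program's comments call for robust standard errors for OLS and WLS.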
------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma04p1wls.txt
log type: text
opened on: 17 May 2005, 13:41:48
.
. ********** OVERVIEW OF MMA04P1WLS.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 4.5.3 pages 84-5
. * Robust Standard Errors for OLS, WLS and GLS
. * (1) Robust and nonrobust standard errors for OLS, WLS and GLS.
. * (2) Table 4.3
. * using generated data (see below)
.
. ********** SETUP **********
.
. set more off
. version 8
. set scheme s1mono /* Used for graphs */
.
. ********** GENERATE DATA and SUMMARIZE **********
.
. * Model is y = 1 + 1*x + u
. * where u = abs(x)*e
. *   x ~ N(0, 5^2)
. *   e ~ N(0, 2^2)
.
. * Errors are conditionally heteroskedastic with V[u|x]=4*x^2
. * OLS, WLS and GLS are consistent
. * but need to use robust standard errors for OLS and WLS.
.
. set seed 10105
. set obs 100
obs was 0, now 100
. gen x = 5*invnorm(uniform())
. gen e = 2*invnorm(uniform())
. gen u = abs(x)*e
. gen y = 1 + 1*x + u
.
. * Descriptive Statistics
. summarize
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
           x |       100   -.1322828     4.64293   -11.05289   10.63336
           e |       100     .350339    2.033639   -3.776468   5.150759
           u |       100    1.215709    8.187081   -19.58098    32.6086
           y |       100    2.083426    9.364465   -27.63657   39.93944
.
. * Write data to a text (ascii) file so can use with programs other than Stata
. outfile y x e u using mma04p1wls.asc, replace
.
. ********** ESTIMATE THE MODELS **********
.
. ** (1) OLS - first column of Table 4.3
.
. * (1A) OLS with wrong standard errors
. regress y x
      Source |       SS       df       MS              Number of obs =     100
-------------+------------------------------           F(  1,    98) =   30.23
       Model |  2046.73901     1  2046.73901           Prob > F      =  0.0000
    Residual |  6634.88855    98  67.7029444           R-squared     =  0.2358
-------------+------------------------------           Adj R-squared =  0.2280
       Total |  8681.62755    99  87.6932076           Root MSE      =  8.2282
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |    .979313   .1781124     5.50   0.000     .6258548    1.332771
       _cons |   2.212973   .8231553     2.69   0.008     .5794478    3.846497
------------------------------------------------------------------------------
. estimates store olsusual
.
. * (1B) OLS with correct standard errors (robust sandwich)
. regress y x, robust
                                                       Number of obs =     100
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |    .979313   .2750617     3.56   0.001     .4334621    1.525164
       _cons |   2.212973   .8198253     2.70   0.008      .586056    3.839889
------------------------------------------------------------------------------
. estimates store olsrobust
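The gap between the usual and the robust standard error of the slope (.178 versus .275 above) comes from the variance formula, not from the point estimate. The Python sketch below computes both for a simple regression with an intercept. It is an illustration on hypothetical toy data, not the book's code, and it assumes the robust option applies the n/(n-k) degrees-of-freedom scaling (often called HC1).

```python
import math

def slope_with_ses(y, x):
    """Simple regression of y on x with an intercept.

    Returns (slope, conventional SE, heteroskedasticity-robust SE).
    """
    n = len(y)
    xbar = sum(x) / n
    ybar = sum(y) / n
    dx = [xi - xbar for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - ybar) for d, yi in zip(dx, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - intercept - slope * xi for yi, xi in zip(y, x)]
    k = 2  # intercept and slope
    # Conventional: s^2 / Sxx, valid only under homoskedasticity
    s2 = sum(r * r for r in resid) / (n - k)
    se_usual = math.sqrt(s2 / sxx)
    # Sandwich: sum(dx^2 * resid^2) / Sxx^2, scaled by n/(n-k) (assumed scaling)
    meat = sum((d * r) ** 2 for d, r in zip(dx, resid))
    se_robust = math.sqrt(n / (n - k) * meat / sxx ** 2)
    return slope, se_usual, se_robust

# Hypothetical toy data
slope, se_usual, se_robust = slope_with_ses([2.0, 4.0, 5.0, 8.0],
                                            [1.0, 2.0, 3.0, 4.0])
print(slope, se_usual, se_robust)
```

The slope is identical under both calculations; only the estimated variance changes, which is why the coefficient columns of (1A) and (1B) agree.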
.
. ** (2) WLS - second column of Table 4.3
.
. * (2A) WLS with wrong standard errors
. * Use the aweight option (not clearly explained in Stata manual).
. * The aweight option MULTIPLIES y and x by sqrt(aweight).
. * Here we suppose V[u]=constant*|x|
. * So want to divide by sqrt(|x|), so let aweight=1/|x|
. gen absx = abs(x)
. regress y x [aweight=1/absx]
(sum of wgt is 5.7885e+02)
      Source |       SS       df       MS              Number of obs =     100
-------------+------------------------------           F(  1,    98) =   25.29
       Model |   56.759883     1   56.759883           Prob > F      =  0.0000
    Residual |  219.985987    98  2.24475497           R-squared     =  0.2051
-------------+------------------------------           Adj R-squared =  0.1970
       Total |   276.74587    99  2.79541283           Root MSE      =  1.4983
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .9569768   .1903115     5.03   0.000     .5793097    1.334644
       _cons |   1.060374   .1498265     7.08   0.000     .7630484      1.3577
------------------------------------------------------------------------------
. estimates store wlsusual
.
. * (2B) WLS with correct standard errors (robust sandwich)
. regress y x [aweight=1/absx], robust
(sum of wgt is 5.7885e+02)
Regression with robust standard errors                 Number of obs =     100
                                                       F(  1,    98) =   17.07
                                                       Prob > F      =  0.0001
                                                       R-squared     =  0.2051
                                                       Root MSE      =  1.4983
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .9569768    .231612     4.13   0.000     .4973503    1.416603
       _cons |   1.060374    .050533    20.98   0.000     .9600931    1.160655
------------------------------------------------------------------------------
. estimates store wlsrobust
.
. ** (3) GLS - last column of Table 4.3
.
. * (3A) GLS with usual standard errors (correct)
. * Here we know V[u]=constant*x^2
. * So want to divide by x, so let aweight=1/(x^2)
. gen xsq = x*x
. regress y x [aweight=1/xsq]
(sum of wgt is 1.0314e+05)
      Source |       SS       df       MS              Number of obs =     100
-------------+------------------------------           F(  1,    98) =   20.70
       Model |  .086075004     1  .086075004           Prob > F      =  0.0000
    Residual |  .407542418    98  .004158596           R-squared     =  0.1744
-------------+------------------------------           Adj R-squared =  0.1660
       Total |  .493617422    99  .004986035           Root MSE      =  .06449
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .9516457   .2091752     4.55   0.000     .5365444    1.366747
       _cons |   .9964956   .0065131   153.00   0.000     .9835706    1.009421
------------------------------------------------------------------------------
. estimates store glsusual
.
. * (3B) GLS with standard errors (robust sandwich - unnecessary here)
. regress y x [aweight=1/xsq], robust
(sum of wgt is 1.0314e+05)
Regression with robust standard errors                 Number of obs =     100
                                                       F(  1,    98) =   20.89
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.1744
                                                       Root MSE      =  .06449
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .9516457   .2082145     4.57   0.000     .5384508    1.364841
       _cons |   .9964956   .0078922   126.26   0.000     .9808337    1.012157
------------------------------------------------------------------------------
. estimates store glsrobust
.
. * (3C) Check that aweight works as expected.
. * Do GLS by OLS on data transformed by dividing by x.
. gen try = y/x
. gen trint = 1/x
. gen trx = x/x
. regress try trx trint, noconstant
      Source |       SS       df       MS              Number of obs =     100
-------------+------------------------------           F(  2,    98) =11850.15
       Model |  101659.545     2  50829.7726           Prob > F      =  0.0000
    Residual |  420.359033    98  4.28937789           R-squared     =  0.9959
-------------+------------------------------           Adj R-squared =  0.9958
       Total |  102079.904   100  1020.79904           Root MSE      =  2.0711
------------------------------------------------------------------------------
         try |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         trx |   .9516457   .2091752     4.55   0.000     .5365444    1.366747
       trint |   .9964956   .0065131   153.00   0.000     .9835706    1.009421
------------------------------------------------------------------------------
.
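The check in (3C) rests on an algebraic identity: weighted least squares with weights 1/x^2 minimizes the same sum of squares as OLS of y/x on x/x and 1/x without a constant. The Python sketch below verifies this on hypothetical toy data. The function names are illustrative, and only the point estimates are compared.

```python
def wls_line(y, x, w):
    """Weighted least squares fit of y = a + b*x with weights w; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return ybar - b * xbar, b

def ols_two_regressors_no_constant(y, r1, r2):
    """OLS of y on r1 and r2 without a constant; returns (coef on r1, coef on r2)."""
    s11 = sum(a * a for a in r1)
    s22 = sum(b * b for b in r2)
    s12 = sum(a * b for a, b in zip(r1, r2))
    s1y = sum(a * yi for a, yi in zip(r1, y))
    s2y = sum(b * yi for b, yi in zip(r2, y))
    det = s11 * s22 - s12 * s12
    # Cramer's rule on the 2x2 normal equations
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det

# Hypothetical toy data (x must be nonzero)
x = [1.0, 2.0, -3.0, 4.0, -0.5]
y = [2.5, 2.0, -1.0, 6.0, 0.0]
a_wls, b_wls = wls_line(y, x, [1 / xi ** 2 for xi in x])

ty = [yi / xi for yi, xi in zip(y, x)]   # y/x
trx = [1.0 for _ in x]                   # x/x
trint = [1 / xi for xi in x]             # 1/x
b_ols, a_ols = ols_two_regressors_no_constant(ty, trx, trint)
print(a_wls, a_ols, b_wls, b_ols)
```

As in the log, the coefficient on trx is the slope and the coefficient on trint is the intercept of the original model.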
. ********** DISPLAY KEY RESULTS **********
.
. * Table 4.3
. estimates table olsusual olsrobust wlsusual wlsrobust glsusual glsrobust, /*
> */ se stats(N r2) b(%7.3f) keep(_cons x)
--------------------------------------------------------------------------
    Variable | olsus~l   olsro~t   wlsus~l   wlsro~t   glsus~l   glsro~t
-------------+------------------------------------------------------------
       _cons |   2.213     2.213     1.060     1.060     0.996     0.996
             |   0.823     0.820     0.150     0.051     0.007     0.008
           x |   0.979     0.979     0.957     0.957     0.952     0.952
             |   0.178     0.275     0.190     0.232     0.209     0.208
.
. ********** DATA DESCRIPTION **********
.
. * The data from World Bank 1997 Vietnam Living Standards Survey
. * are described in chapter 4.6.4.
. * A larger sample from this survey is studied in Chapter 24.7
.
. ********** READ DATA, TRANSFORM and SAMPLE SELECTION **********
.
. use qreg0902
. describe
Contains data from qreg0902.dta
  obs:         5,999
 vars:             9                          19 Sep 2002 21:45
 size:       191,968 (98.1% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
sex             byte   %8.0g                  Gender of HH.head (1:M;2:F)
age             int    %8.0g                  Age of household head
educyr98        float  %9.0g                  schooling year of HH.head
farm            float  %9.0g       loaiho     Type of HH (1:farm; 0:nonfarm)
urban98         byte   %8.0g       urban      1:urban 98; 0:rural 98
hhsize          long   %12.0g                 Household size
lhhexp1         float  %9.0g
lhhex12m        float  %9.0g
lnrlfood        float  %9.0g
-------------------------------------------------------------------------------
Sorted by:
. summarize
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
         sex |      5999    1.270712    .4443645          1          2
         age |      5999    48.01284     13.7702         16         95
    educyr98 |      5999    7.094419    4.416092          0         22
        farm |      5999    .5730955    .4946694          0          1
     urban98 |      5999    .2883814    .4530472          0          1
-------------+---------------------------------------------------------
      hhsize |      5999    4.752292    1.954292          1         19
     lhhexp1 |      5999    9.341561    .6877458   6.543108   12.20242
    lhhex12m |      5006    6.310585    1.593083          0   12.36325
    lnrlfood |      5999    8.679536    .5368118   6.356364   11.38385
.
. * Write data to a text (ascii) file so can use with programs other than Stata
. outfile sex age educyr98 farm urban98 hhsize lhhexp1 lhhex12m lnrlfood /*
>
.
. * drop zero observations for medical expenditures
. drop if lhhex12m == .
(993 observations deleted)
.
. * lhhexp1 is natural logarithm of household total expenditure
. * lhhex12m is natural logarithm of household medical expenditure
. gen lntotal = lhhexp1
. gen lnmed = lhhex12m
. label variable lntotal "Log household total expenditure"
. label variable lnmed "Log household medical expenditure"
. describe
Contains data from qreg0902.dta
  obs:         5,006
 vars:            11                          19 Sep 2002 21:45
 size:       200,240 (98.0% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
sex             byte   %8.0g                  Gender of HH.head (1:M;2:F)
age             int    %8.0g                  Age of household head
educyr98        float  %9.0g                  schooling year of HH.head
farm            float  %9.0g       loaiho     Type of HH (1:farm; 0:nonfarm)
urban98         byte   %8.0g       urban      1:urban 98; 0:rural 98
hhsize          long   %12.0g                 Household size
lhhexp1         float  %9.0g
lhhex12m        float  %9.0g
lnrlfood        float  %9.0g
lntotal         float  %9.0g                  Log household total expenditure
lnmed           float  %9.0g                  Log household medical expenditure
-------------------------------------------------------------------------------
Sorted by:
     Note: dataset has changed since last saved
. summarize
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
         sex |      5006    1.269676     .443836          1          2
         age |      5006    48.06133    13.79974         18         95
    educyr98 |      5006    7.147956    4.333304          0         21
        farm |      5006    .5679185    .4954151          0          1
     urban98 |      5006    .2920495    .4547504          0          1
-------------+---------------------------------------------------------
      hhsize |      5006    4.832601     1.95257          1         19
     lhhexp1 |      5006    9.370402    .6726841   6.543108   12.20242
    lhhex12m |      5006    6.310585    1.593083          0   12.36325
    lnrlfood |      5006    8.697963    .5309517   6.356364   11.38385
     lntotal |      5006    9.370402    .6726841   6.543108   12.20242
-------------+---------------------------------------------------------
       lnmed |      5006    6.310585    1.593083          0   12.36325
.
. ********* ANALYSIS: QUANTILE REGRESSION **********
.
. * (0) OLS
. reg lnmed lntotal
      Source |       SS       df       MS              Number of obs =    5006
-------------+------------------------------           F(  1,  5004) =  311.91
       Model |  745.293239     1  745.293239           Prob > F      =  0.0000
    Residual |  11956.9671  5004  2.38948183           R-squared     =  0.0587
-------------+------------------------------           Adj R-squared =  0.0585
       Total |  12702.2603  5005  2.53791415           Root MSE      =  1.5458
------------------------------------------------------------------------------
       lnmed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lntotal |   .5736545   .0324817    17.66   0.000     .5099761    .6373328
       _cons |   .9352117   .3051496     3.06   0.002     .3369847    1.533439
------------------------------------------------------------------------------
. predict pols
(option xb assumed; fitted values)
. reg lnmed lntotal, robust
Regression with robust standard errors                 Number of obs =    5006
                                                       F(  1,  5004) =  318.05
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.0587
                                                       Root MSE      =  1.5458
------------------------------------------------------------------------------
             |               Robust
       lnmed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lntotal |   .5736545   .0321665    17.83   0.000      .510594     .636715
       _cons |   .9352117    .298119     3.14   0.002     .3507677    1.519656
------------------------------------------------------------------------------
. * Bootstrap standard errors for OLS
6112.4546
6098.5295
6097.2178
6097.1564
Median regression                                    Number of obs =      5006
  Raw sum of deviations 6324.265 (about 6.3716121)
  Min sum of deviations 6097.156                     Pseudo R2     =    0.0359
------------------------------------------------------------------------------
       lnmed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lntotal |   .6210917   .0388194    16.00   0.000     .5449886    .6971948
       _cons |   .5921626   .3646869     1.62   0.104    -.1227836    1.307109
3279.5575
2691.3839
2521.5214
2506.303
2505.1952
2505.1334
2505.1314
2505.1313
.9 Quantile regression                               Number of obs =      5006
  Raw sum of deviations 2687.692 (about 8.2789364)
  Min sum of deviations 2505.131                     Pseudo R2     =    0.0679
------------------------------------------------------------------------------
       lnmed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lntotal |   .8003569   .0517225    15.47   0.000     .6989581    .9017558
       _cons |   .6750967   .4857563     1.39   0.165    -.2771985    1.627392
------------------------------------------------------------------------------
. predict pqreg90
(option xb assumed; fitted values)
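Quantile regression minimizes a sum of asymmetrically weighted absolute residuals, and the reported Pseudo R2 is one minus the ratio of the minimized sum to the raw sum. The Python sketch below defines the check function and reproduces the two Pseudo R2 values from the figures in the qreg outputs above. It is an illustration only: the function names are hypothetical, and any common scaling of the two sums cancels in the ratio.

```python
def check_loss(residuals, q):
    """Sum of check-function losses rho_q(u) = u * (q - 1[u < 0])."""
    return sum(u * (q - (1.0 if u < 0 else 0.0)) for u in residuals)

def pseudo_r2(raw_sum, min_sum):
    """One minus the ratio of the minimized to the raw sum of deviations."""
    return 1.0 - min_sum / raw_sum

# Figures copied from the median and .9 quantile regression outputs above
r2_median = pseudo_r2(6324.265, 6097.156)
r2_q90 = pseudo_r2(2687.692, 2505.131)
print(round(r2_median, 4), round(r2_q90, 4))

# Asymmetry: at q = 0.9 a positive residual costs nine times a negative one
loss_high = check_loss([2.0, -1.0], 0.9)
loss_low = check_loss([2.0, -1.0], 0.1)
print(loss_high, loss_low)
```

At q = 0.5 the check function is half the absolute residual for either sign, which is why median regression is also called least absolute deviations.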
.
. * (2) Create Figure 4.2 on page 90 first as this is easy
. graph twoway (scatter lnmed lntotal, msize(vsmall)) (lfit pqreg90 lntotal, clstyle(p2)) /*
> */ (lfit pqreg50 lntotal, clstyle(p1)) (lfit pqreg10 lntotal, clstyle(p3)), /*
> */ scale (1.2) plotregion(style(none)) /*
> */ title("Regression Lines as Quantile Varies") /*
> */ xtitle("Log Household Total Expenditure", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Log Household Medical Expenditure", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend(pos(11) ring(0) col(1)) legend(size(small)) /*
> */ legend( label(1 "Actual Data") label(2 "90th percentile") /*
> */ label(3 "Median") label(4 "10th percentile"))
. graph export ch4fig2QR.wmf, replace
(file c:\Imbook\bwebpage\Section2\ch4fig2QR.wmf written in Windows Metafile format)
.
. * (3) Create Figure 4.1 second as this is more difficult
. * Simultaneous quantile regression for quantiles 0.05, 0.10, ..., 0.90, 0.95
. * with standard errors by bootstrap - here 200 replications
. set seed 10101
                                                     Number of obs =      5006
.05 Pseudo R2 = 0.0015
.10 Pseudo R2 = 0.0012
.15 Pseudo R2 = 0.0058
.20 Pseudo R2 = 0.0106
.25 Pseudo R2 = 0.0149
.30 Pseudo R2 = 0.0183
.35 Pseudo R2 = 0.0242
.40 Pseudo R2 = 0.0274
.45 Pseudo R2 = 0.0326
.50 Pseudo R2 = 0.0359
.55 Pseudo R2 = 0.0408
.60 Pseudo R2 = 0.0464
.65 Pseudo R2 = 0.0500
.70 Pseudo R2 = 0.0520
.75 Pseudo R2 = 0.0563
.80 Pseudo R2 = 0.0603
.85 Pseudo R2 = 0.0630
.90 Pseudo R2 = 0.0679
.95 Pseudo R2 = 0.0795
------------------------------------------------------------------------------
             |              Bootstrap
       lnmed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
q5           |
     lntotal |   .1536332   .0791236     1.94   0.052    -.0014838    .3087501
       _cons |   2.095395   .7559016     2.77   0.006     .6134964    3.577293
-------------+----------------------------------------------------------------
q10          |
     lntotal |   .1512009    .085018     1.78   0.075    -.0154716    .3178734
       _cons |   2.825072   .7697613     3.67   0.000     1.316002    4.334141
-------------+----------------------------------------------------------------
q15          |
     lntotal |   .2695707   .0580757     4.64   0.000     .1557168    .3834245
       _cons |   2.231293   .5429047     4.11   0.000     1.166962    3.295624
-------------+----------------------------------------------------------------
q20          |
     lntotal |   .3552251   .0504688     7.04   0.000     .2562841    .4541662
       _cons |   1.740233   .4649551     3.74   0.000     .8287172    2.651749
-------------+----------------------------------------------------------------
q25          |
     lntotal |   .4034632   .0421514     9.57   0.000     .3208279    .4860984
. * (1) IV Regression (with robust s.e.'s though not needed here for iid error).
. * (2) Table 4.4
. * using generated data (see below)
.
. ********** SETUP **********
.
. set more off
. version 8
.
. ********** GENERATE DATA and SUMMARIZE **********
.
. * Model is
. * y = b1 + b2*x + u
. * x = c1 + c2*z + v
. * z ~ N[2,1]
. * where b1=0, b2=0.5, c1=0 and c2=1
. * and u and v are joint normal (0,0,1,1,0.8)
.
. * OLS of y on x is inconsistent as x is correlated with u
. * Instead need to do IV with instrument z for x
. * Also try using
.
. set seed 10001
. set obs 10000
obs was 0, now 10000
. scalar b1 = 0
. scalar b2 = 0.5
. scalar c1 = 0
. scalar c2 = 1
.
. * Generate errors u and v
. * Use fact that u is N(0,1)
. * and v | u is N(0 + (.8/1)(u - 0), 1 - .8x.8/1 = 0.36)
. gen u = 1*invnorm(uniform())
. gen muvgivnu = 0.8*u
. gen v = 1*(muvgivnu+sqrt(0.36)*invnorm(uniform()))
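The two-step construction of u and v uses the conditional distribution of a bivariate normal: draw u, then draw v given u with mean 0.8*u and variance 1 - 0.8^2 = 0.36. The Python sketch below applies the same idea and checks the sample correlation. It is an illustration with hypothetical function names. Python's random number generator differs from Stata's, so the draws will not match the log.

```python
import math
import random

def draw_uv(n, rho=0.8, seed=10001):
    """Draw n pairs (u, v), each standard normal, with correlation rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)
        # v | u ~ N(rho * u, 1 - rho^2)
        v = rho * u + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        pairs.append((u, v))
    return pairs

def sample_corr(pairs):
    """Sample correlation of the two coordinates."""
    n = len(pairs)
    mu = sum(p[0] for p in pairs) / n
    mv = sum(p[1] for p in pairs) / n
    suu = sum((p[0] - mu) ** 2 for p in pairs)
    svv = sum((p[1] - mv) ** 2 for p in pairs)
    suv = sum((p[0] - mu) * (p[1] - mv) for p in pairs)
    return suv / math.sqrt(suu * svv)

pairs = draw_uv(20000)
corr_uv = sample_corr(pairs)
print(corr_uv)
```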
.
. * Generate instrument z (which is purely random)
. gen z = 2 + 1*invnorm(uniform())
.
. * Generate regressor x which is correlated with z, and with u via v
. gen x = c1 + c2*z + v
.
. * Generate dependent variable y
. gen y = b1 + b2*x + u
.
. * Generate z-cubed. Used as an alternative instrument
. gen zcube = z*z*z
.
. * Descriptive Statistics
. describe
Contains data
  obs:        10,000
 vars:             7
 size:       320,000 (96.9% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
u               float  %9.0g
muvgivnu        float  %9.0g
v               float  %9.0g
z               float  %9.0g
x               float  %9.0g
y               float  %9.0g
zcube           float  %9.0g
-------------------------------------------------------------------------------
Sorted by:
     Note: dataset has changed since last saved
. summarize
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
           u |     10000     .003772    1.010726  -4.010302   4.267661
    muvgivnu |     10000    .0030176    .8085809  -3.208241   3.414129
           v |     10000    .0097031    1.005874  -3.992237    3.79261
           z |     10000    1.997786    1.013118  -1.895752    5.81496
           x |     10000    2.007489    1.436511  -3.139744   7.366555
-------------+---------------------------------------------------------
           y |     10000    1.007516    1.538611  -5.309155   7.794924
       zcube |     10000    14.14145    17.88016  -6.813095   196.6257
. correlate y x z u v
(obs=10000)
             |        y        x        z        u        v
-------------+---------------------------------------------
           y |   1.0000
           x |   0.8423   1.0000
           z |   0.3403   0.7140   1.0000
           u |   0.9237   0.5716   0.0107   1.0000
           v |   0.8601   0.7090   0.0124   0.8055   1.0000
. correlate y x z u v, cov
(obs=10000)
             |        y        x        z        u        v
-------------+---------------------------------------------
           y |  2.36732
           x |  1.86165  2.06356
           z |  .530456   1.0391  1.02641
           u |   1.4365  .829866  .010909  1.02157
           v |  1.33119  1.02447  .012687  .818958  1.01178
. graph matrix y x z u v
.
. * Write data to a text (ascii) file so can use with programs other than Stata
. outfile y x z u v using mma04p3iv.asc, replace
.
. ********** DO THE ANALYSIS: ESTIMATE MODELS **********
.
. * (1) OLS is inconsistent (first column of Table 4.4)
. regress y x
      Source |       SS       df       MS              Number of obs =   10000
-------------+------------------------------           F(  1,  9998) =24412.17
       Model |  16793.2198     1  16793.2198           Prob > F      =  0.0000
    Residual |  6877.65935  9998  .687903516           R-squared     =  0.7094
-------------+------------------------------           Adj R-squared =  0.7094
       Total |  23670.8791  9999  2.36732464           Root MSE      =   .8294
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .9021522    .005774   156.24   0.000      .890834    .9134704
       _cons |  -.8035441    .014253   -56.38   0.000    -.8314827   -.7756054
------------------------------------------------------------------------------
. regress y x, robust
Regression with robust standard errors                 Number of obs =   10000
                                                       F(  1,  9998) =24780.49
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.7094
                                                       Root MSE      =   .8294
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .9021522   .0057309   157.42   0.000     .8909184    .9133859
       _cons |  -.8035441   .0141056   -56.97   0.000    -.8311939   -.7758942
------------------------------------------------------------------------------
. estimates store olswrong
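The OLS slope of about 0.90 is close to what the data generating process implies. With x = c1 + c2*z + v, the probability limit of OLS is b2 + Cov(x,u)/Var(x). The Python sketch below evaluates this from the parameters stated in the program's comments. The variable names are illustrative, and independence of z from u and v is assumed, as in the generated data.

```python
# Parameters stated in the program's comments
b2, c2 = 0.5, 1.0
var_z = 1.0    # z ~ N[2,1]
var_v = 1.0    # v has unit variance
cov_uv = 0.8   # correlation of u and v, both with unit variance

var_x = c2 ** 2 * var_z + var_v   # z is independent of v
cov_xu = cov_uv                   # x depends on u only through v
plim_ols = b2 + cov_xu / var_x
print(plim_ols)
```

The implied limit of 0.9 is far from the true b2 = 0.5, which is consistent with the OLS estimate of .902 in the output above.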
.
. * (2) IV with instrument z is consistent and efficient (second column of Table 4.4)
. ivreg y (x = z)
Instrumental variables (2SLS) regression
      Source |       SS       df       MS              Number of obs =   10000
-------------+------------------------------           F(  1,  9998) = 2728.97
       Model |  13628.1781     1  13628.1781           Prob > F      =  0.0000
    Residual |   10042.701  9998    1.004471           R-squared     =  0.5757
-------------+------------------------------           Adj R-squared =  0.5757
       Total |  23670.8791  9999  2.36732464           Root MSE      =  1.0022
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .5104982   .0097723    52.24   0.000     .4913426    .5296538
       _cons |   -.017303   .0220296    -0.79   0.432    -.0604854    .0258793
------------------------------------------------------------------------------
Instrumented:  x
Instruments:   z
------------------------------------------------------------------------------
. ivreg y (x = z), robust
IV (2SLS) regression with robust standard errors       Number of obs =   10000
                                                       F(  1,  9998) = 2670.19
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.5757
                                                       Root MSE      =  1.0022
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .5104982   .0098792    51.67   0.000     .4911329    .5298635
       _cons |   -.017303   .0220785    -0.78   0.433    -.0605813    .0259752
------------------------------------------------------------------------------
Instrumented:  x
Instruments:   z
------------------------------------------------------------------------------
. estimates store iv
.
. * (3) IV estimator in (2) can be computed by
. *     regress y on z gives dy/dz
. *     regress x on z gives dx/dz
. *     and divide the two
. regress y z
      Source |       SS       df       MS              Number of obs =   10000
-------------+------------------------------           F(  1,  9998) = 1309.44
       Model |  2741.16635     1  2741.16635           Prob > F      =  0.0000
    Residual |  20929.7128  9998  2.09338995           R-squared     =  0.1158
-------------+------------------------------           Adj R-squared =  0.1157
       Total |  23670.8791  9999  2.36732464           Root MSE      =  1.4469
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           z |    .516808   .0142819    36.19   0.000     .4888126    .5448035
       _cons |  -.0249553    .031991    -0.78   0.435    -.0876642    .0377535
------------------------------------------------------------------------------
. matrix byonz = e(b)
. regress x z
      Source |       SS       df       MS              Number of obs =   10000
-------------+------------------------------           F(  1,  9998) =10396.43
       Model |  10518.3341     1  10518.3341           Prob > F      =  0.0000
    Residual |  10115.2362  9998  1.01172597           R-squared     =  0.5098
-------------+------------------------------           Adj R-squared =  0.5097
       Total |  20633.5703  9999  2.06356339           Root MSE      =  1.0058
------------------------------------------------------------------------------
           x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           z |    1.01236   .0099287   101.96   0.000     .9928979    1.031822
       _cons |  -.0149899     .02224    -0.67   0.500    -.0585847     .028605
------------------------------------------------------------------------------
. matrix bxonz = e(b)
. matrix ivfirstprinciples = byonz[1,1]/bxonz[1,1]
. matrix list byonz
byonz[1,2]
            z       _cons
y1  .51680804  -.02495533
. matrix list bxonz
bxonz[1,2]
            z       _cons
y1  1.0123602  -.01498985
. matrix list ivfirstprinciples
symmetric ivfirstprinciples[1,1]
           c1
r1   .5104982
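The ratio of the two reduced-form slopes equals Cov(z,y)/Cov(z,x), so the IV estimate can also be recovered from the covariance matrix printed by correlate y x z u v, cov earlier in this log. The Python sketch below does this and, for comparison, recovers the OLS slope as Cov(x,y)/Var(x). The function names are hypothetical and the toy data are for illustration only.

```python
def cov(a, b):
    """Sample covariance with divisor n - 1."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (n - 1)

def iv_slope(y, x, z):
    """Just-identified IV slope: (slope of y on z) / (slope of x on z)."""
    return (cov(z, y) / cov(z, z)) / (cov(z, x) / cov(z, z))

# Covariances reported by `correlate y x z u v, cov` in this log
cov_yz, cov_xz = 0.530456, 1.0391
cov_xy, var_x = 1.86165, 2.06356
iv_from_log = cov_yz / cov_xz     # ratio of reduced-form slopes
ols_from_log = cov_xy / var_x     # OLS slope for comparison
print(round(iv_from_log, 4), round(ols_from_log, 4))

# Hypothetical toy data: x = 2*z and y = 1 + z, so the IV slope is 0.5
toy_iv = iv_slope([1.0, 2.0, 3.0, 4.0], [0.0, 2.0, 4.0, 6.0], [0.0, 1.0, 2.0, 3.0])
```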
.
. * (4) IV can be computed as 2SLS, but wrong standard errors
. * (third column of Table 4.4)
. * (4A) OLS of x on z gives xhat
. regress x z
      Source |       SS       df       MS              Number of obs =   10000
-------------+------------------------------           F(  1,  9998) =10396.43
       Model |  10518.3341     1  10518.3341           Prob > F      =  0.0000
    Residual |  10115.2362  9998  1.01172597           R-squared     =  0.5098
-------------+------------------------------           Adj R-squared =  0.5097
       Total |  20633.5703  9999  2.06356339           Root MSE      =  1.0058
------------------------------------------------------------------------------
           x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           z |    1.01236   .0099287   101.96   0.000     .9928979    1.031822
       _cons |  -.0149899     .02224    -0.67   0.500    -.0585847     .028605
------------------------------------------------------------------------------
. predict xhat, xb
. * (4B) OLS of y on xhat gives IV but wrong standard errors
. regress y xhat
      Source |       SS       df       MS              Number of obs =   10000
-------------+------------------------------           F(  1,  9998) = 1309.44
       Model |  2741.16636     1  2741.16636           Prob > F      =  0.0000
    Residual |  20929.7127  9998  2.09338995           R-squared     =  0.1158
-------------+------------------------------           Adj R-squared =  0.1157
       Total |  23670.8791  9999  2.36732464           Root MSE      =  1.4469
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.5745
                                                       Root MSE      =  1.0037
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .5086427   .0116871    43.52   0.000     .4857337    .5315517
       _cons |  -.0135782   .0253208    -0.54   0.592     -.063212    .0360556
------------------------------------------------------------------------------
Instrumented:  x
Instruments:   zcube
------------------------------------------------------------------------------
. estimates store ivineff
.
. ********** DISPLAY KEY RESULTS in Table 4.4 p.103 **********
.
. * Table 4.4 page 103
. estimates table olswrong iv twosls ivineff, se stats(N r2) b(%8.3f) keep(_cons x xhat)
----------------------------------------------------------
    Variable | olswrong       iv      twosls    ivineff
-------------+--------------------------------------------
       _cons |   -0.804    -0.017    -0.017    -0.014
             |    0.014     0.022     0.032     0.025
           x |    0.902     0.510               0.509
             |    0.006     0.010               0.012
        xhat |                        0.510
             |                        0.014
-------------+--------------------------------------------
           N |  1.0e+04   1.0e+04   1.0e+04   1.0e+04
          r2 |    0.709     0.576     0.116     0.574
----------------------------------------------------------
                                              legend: b/se
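The twosls column has the right coefficient but the wrong standard error because the second-stage residuals are computed with xhat in place of x. With one regressor, the two standard errors differ only by the ratio of the two Root MSE values. The Python sketch below checks this using numbers copied from this log. It is an illustrative calculation, not part of the original program, and it relies on xhat being a linear function of z.

```python
# Numbers copied from this log
se_y_on_z = 0.0142819        # SE of the z coefficient in `regress y z`
slope_x_on_z = 1.01236       # first-stage slope in `regress x z`
rmse_second_stage = 1.4469   # Root MSE of `regress y xhat`
rmse_iv = 1.0022             # Root MSE reported by `ivreg y (x = z)`

# xhat = a + slope_x_on_z * z, so the second-stage SE is a rescaling
se_second_stage = se_y_on_z / slope_x_on_z
# Replace the second-stage residual scale with the IV residual scale
se_corrected = se_second_stage * rmse_iv / rmse_second_stage
print(round(se_second_stage, 3), round(se_corrected, 4))
```

The corrected value agrees, to the precision of the rounded inputs, with the standard error .0097723 reported by ivreg.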
.
. ********** CLOSE OUTPUT
. log close
log: c:\Imbook\bwebpage\Section2\mma04p3iv.txt
log type: text
closed on: 17 May 2005, 13:44:41
------------------------------------------------------------------------------
------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma04p4ivweak.txt
log type: text
opened on: 17 May 2005, 13:45:59
.
. ********** OVERVIEW OF MMA04P4IVWEAK.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 4.9.5 pages 110-2
. * IV regression with potentially weak instruments
. * (1) Compares OLS and IV estimation of log-wages on schooling regression
. * where schooling, experience and experience-squared are endogenous
. * and proximity to 4-year college, age and age-squared are instruments
. * so model is just-identified.
. * (2) Verifies that here can treat errors as homoskedastic
. * (3) Looks at weak instruments
. * (A) instrument relevance: Whether Shea's partial R-squared is low
. * (B) finite sample bias: whether first-stage partial F is low
. * (4) Provides Table 4.5
. * (5) Does more analysis than reported in the book
.
. * To run this program you need data and dictionary files
. * DATA66.dat ASCII data set
. * DATA66.dct Stata dictionary that labels variables
.
. ********** SETUP **********
.
. set more off
. version 8.0
. set memory 20m
(20480k)
. set linesize 150 /* Permits long inputline commands with delimit */
.
. ********** ORIGINAL DATA SOURCE **********
.
. * Program mma04p4ivweak.do based on Kling Analys66.do September 2003
. * written for Jeffrey R. Kling (2001) "Interpreting Instrumental Variables Estimates
. * of the Return to Schooling", Journal of Business and Economic Statistics,
. * July 2001, 19 (3), pp.358-364.
. * This program focuses on Columns (1) and (2) of Kling's Table 1 on p.359
. * in turn based on
. * David Card (1995), "Using Geographic Variation in College Proximity to
. * Estimate the Returns to Schooling", in
. * Aspects of Labor Market Behavior: Essays in Honor of John Vanderkamp,
. * eds. L.N. Christofides et al., Toronto: University of Toronto Press, pp.201-221.
.
_column(333) reg9     %8f  "If lived in Region 9 (region= P ) "
_column(342) momdad14 %8f  "If lived with both parents at age 14 "
_column(351) sinmom14 %8f  "If lived with mother only at age 14 "
_column(360) nodaded  %1f  "If father has no formal education "
_column(362) nomomed  %1f  "If mother has no formal education "
_column(365) daded    %10f "Mean grade level of father "
_column(377) momed    %10f "Mean grade level of mother "
_column(396) famed    %8f  "Father's and mother's education "
_column(405) famed1   %8f  "If mgrade> 12 & fgrade> 12 (famed=1) "
_column(414) famed2   %8f  "If mgrade>=12 & fgrade>=12 (famed=2) "
_column(423) famed3   %8f  "If mgrade==12 & fgrade==12 (famed=3) "
_column(432) famed4   %8f  "If mgrade>=12 & fgrade==-1 (famed=4) "
_column(441) famed5   %8f  "If fgrade>=12 (famed=5) "
_column(450) famed6   %8f  "If mgrade>=12 & fgrade> -1 (famed=6) "
_column(459) famed7   %8f  "If mgrade>=9 & fgrade>=9 (famed=7) "
_column(468) famed8   %8f  "If mgrade> -1 & fgrade> -1 (famed=8) "
_column(477) famed9   %8f  "If famed not in range (1-8)"
_column(486) int76    %8f  "If wt76 not missing "
_column(495) age1415 %8f "If in age group =14-15"
_column(504) age1617 %8f "If in age group =16-17"
_column(513) age1819 %8f "If in age group =18-19"
_column(522) age2021 %8f "If in age group =20-21"
_column(531) age2224 %8f "If in age group =20-24"
_column(540) cage1415 %8f "If in age group =14,15 and lived near college"
_column(549) cage1617 %8f "If in age group =16,17 and lived near college"
_column(558) cage1819 %8f "If in age group =18,19 and lived near college"
_column(567) cage2021 %8f "If in age group =20,21 and lived near college"
_column(576) cage2224 %8f "If in age group =20-24 and lived near college"
_column(585) cage66
%8f "Age in 66 and whether lived near college "
_column(594) a1
%8f "If age in 66 = 14 (age66= 14)"
_column(603) a2
%8f "If age in 66 = 15 (age66= 15)"
_column(612) a3
%8f "If age in 66 = 16 (age66= 16)"
_column(621) a4
%8f "If age in 66 = 17 (age66= 17)"
_column(630) a5
%8f "If age in 66 = 18 (age66= 18)"
_column(639) a6
%8f "If age in 66 = 19 (age66= 19)"
_column(648) a7
%8f "If age in 66 = 20 (age66= 20)"
_column(657) a8
%8f "If age in 66 = 21 (age66= 21)"
_column(666) a9
%8f "If age in 66 = 22 (age66= 22)"
_column(675) a10
%8f "If age in 66 = 23 (age66= 23)"
_column(684) a11
%8f "If age in 66 = 24 (age66= 24)"
_column(693) ca1
%8f "Not lived near college in 66"
_column(702) ca2
%8f "If age in 66 = 14 and lived near college"
_column(711) ca3
%8f "If age in 66 = 15 and lived near college"
_column(720) ca4
%8f "If age in 66 = 16 and lived near college"
_column(729) ca5
%8f "If age in 66 = 17 and lived near college"
_column(738) ca6
%8f "If age in 66 = 18 and lived near college"
_column(747) ca7
%8f "If age in 66 = 19 and lived near college"
_column(756) ca8
%8f "If age in 66 = 20 and lived near college"
_column(765) ca9
%8f "If age in 66 = 21 and lived near college"
_column(774) ca10
%2f "If age in 66 = 22 and lived near college"
_column(777) ca11
%2f "If age in 66 = 23 and lived near college"
_column(780) ca12
%8f "If age in 66 = 24 and lived near college"
_column(782) g25
%12f "Grade level when 25 years old "
_column(795) g25i
%12f "If =g25 and intrvwed in year used for determining g25 "
_column(819) intmo66 %8f "Intvw month in 1966, used to identify cases incl by CARD"
_column(828) nlsflt
%8f "Flag to identify if the case was used by CARD"
_column(837) nsib
%8f "Number of siblings "
_column(846) ns1
%8f "If number of siblings = 0 (nsib= 0)"
_column(855) ns2
%8f "If number of siblings = 2 (nsib= 2)"
_column(864) ns3
%8f "If number of siblings = 3 (nsib= 3)"
_column(873) ns4
%8f "If number of siblings = 4 (nsib= 4)"
_column(882) ns5
%8f "If number of siblings = 6 (nsib= 6)"
_column(891) ns6
%8f "If number of siblings = 9 (nsib= 9)"
_column(900) ns7
%8f "If number of siblings =18 (nsib=18)"
}
(5226 observations read)
. * save DATA66, replace
. desc
Contains data
  obs:         5,226
 vars:           101
 size:     2,132,208 (89.8% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
id              float  %9.0g                  ID CODE (r0000100) n= 5225
                                                mean= 2613.000 min= 1 max=
                                                5225
black           float  %9.0g                  Race (r0002300) n= 5225 mean=
                                                1.296 min= 1 max=3
imigrnt         float  %9.0g                  Was r's brthpl in the US?
                                                (r0038000) n=4965 mean=0.98
                                                mn=0 mx=1
hhead           float  %9.0g                  Person R lived w/ @ age 14
                                                (r0039700) n= 5213 mean=1.92
                                                mn=1 mx=9
mag_14          float  %9.0g                  Were magznes avail at age 14
                                                (r0039900) n=5167 mean=0.69
                                                mn=0 mx=1
news_14         float  %9.0g                  Were nwspaprs avail at age 14
                                                (r0040000) n=5195 mean=0.85
                                                mn=0 mx=1
lib_14          float  %9.0g                  Were lib-card avail at age14
                                                (r0040100) n=5204 mean=0.66
                                                mn=0 mx=1
num_sib         float  %9.0g                  Tot # sibs r 66 (r0056900)
                                                n=5168 mean=3.408 min=0
                                                max=18
fgrade          float  %9.0g
mgrade          float  %9.0g
iq              float  %9.0g
bdate           float  %9.0g
gfill76         float  %9.0g
wt76            float  %9.0g
grade76         float  %9.0g
grade66         float  %9.0g
age66           float  %9.0g
smsa66          float  %9.0g
region          float  %9.0g
smsa76          float  %9.0g
col4            float  %9.0g
mcol4           float  %9.0g
col4pub         float  %9.0g
south76         float  %9.0g
wage76          float  %9.0g
exp76           float  %9.0g
expsq76         float  %9.0g
age76           float  %9.0g
agesq76         float  %9.0g
reg1            float  %9.0g
reg2            float  %9.0g
reg3            float  %9.0g
reg4            float  %9.0g
reg5            float  %9.0g
reg6            float  %9.0g
reg7            float  %9.0g
reg8            float  %9.0g
reg9            float  %9.0g
momdad14        float  %9.0g
sinmom14        float  %9.0g
nodaded         float  %9.0g
nomomed         float  %9.0g
daded           float  %9.0g
momed           float  %9.0g
famed           float  %9.0g
famed1          float  %9.0g
famed2          float  %9.0g
famed3          float  %9.0g
famed4          float  %9.0g
famed5          float  %9.0g
famed6          float  %9.0g
famed7          float  %9.0g
famed8          float  %9.0g
famed9          float  %9.0g
int76           float  %9.0g
age1415         float  %9.0g
age1617         float  %9.0g
age1819         float  %9.0g
age2021         float  %9.0g
age2224         float  %9.0g
cage1415        float  %9.0g
cage1617        float  %9.0g
cage1819        float  %9.0g
cage2021        float  %9.0g
cage2224        float  %9.0g
cage66          float  %9.0g
a1              float  %9.0g
a2              float  %9.0g
a3              float  %9.0g
a4              float  %9.0g
a5              float  %9.0g
a6              float  %9.0g
a7              float  %9.0g
a8              float  %9.0g
a9              float  %9.0g
a10             float  %9.0g
a11             float  %9.0g
ca1             float  %9.0g
ca2             float  %9.0g
Sorted by:
     Note:  dataset has changed since last saved
Sorted by:
Note: dataset has changed since last saved
. sum
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
          id |      5225        2613    1508.472          1       5225
       black |      5225    .2752153    .4466655          0          1
     imigrnt |      5225    .0237321    .1522277          0          1
       hhead |      5225   -.3783732    47.95128       -999          9
      mag_14 |      5225    .6861566    .4616275          0          1
-------------+--------------------------------------------------------
     news_14 |      5225    .8483024    .3577176          0          1
      lib_14 |      5225     .658469    .4733619          0          1
     num_sib |      5168    3.407701    2.586307          0         18
      fgrade |      3930     9.93715    3.777654          0         18
      mgrade |      4573    10.25104     3.17986          0         18
-------------+--------------------------------------------------------
          iq |      3369    101.5818    15.93225         50        158
       bdate |      5204    472926.6    31765.04     360823     521224
     gfill76 |      5225    12.78718    2.802705          0         18
        wt76 |      3695    475512.5    265188.5      98617    2582192
     grade76 |      3671    13.23018    2.747627          0         18
-------------+--------------------------------------------------------
     grade66 |      5225    10.58431    2.433696          0         18
       age66 |      5225    18.09129    3.157657         14         24
      smsa66 |      5225    .6599043    .4737864          0          1
      region |      5225    4.721722    2.300767          1          9
      smsa76 |      5225     .491866    .4999817          0          1
-------------+--------------------------------------------------------
        col4 |      5225     .691866    .4617664          0          1
       mcol4 |      5225    .6874641    .4635713          0          1
     col4pub |      5225    .5129187    .4998809          0          1
     south76 |      3695    .3964817    .4892328          0          1
      wage76 |      3078    1.658013    .4430234          0     3.1797
-------------+--------------------------------------------------------
       exp76 |      3671    8.933533    4.212664          0         25
     expsq76 |      3671    .9754971    .8778352          0       6.25
       age76 |      5225    28.09129    3.157657         24         34
     agesq76 |      5225    799.0896    182.0539        576       1156
        reg1 |      5225         .04    .1959779          0          1
-------------+--------------------------------------------------------
        reg2 |      5225    .1617225    .3682313          0          1
        reg3 |      5225    .1900478    .3923763          0          1
        reg4 |      5225    .0639234    .2446399          0          1
        reg5 |      5225    .2126316    .4092083          0          1
        reg6 |      5225    .0895694    .2855912          0          1
-------------+--------------------------------------------------------
        reg7 |      5225    .1083254    .3108206          0          1
        reg8 |      5225    .0304306    .1717855          0          1
        reg9 |      5225    .1033493    .3044437          0          1
    momdad14 |      5225    .7680383    .4221251          0          1
    sinmom14 |      5225    .1182775    .3229673          0          1
-------------+--------------------------------------------------------
     nodaded |      5225    .2478469    .4318038          0          1
     nomomed |      5225    .1247847    .3305062          0          1
       daded |      5225    9.937162    3.276134          0         18
       momed |      5225    10.25103    2.974812          0         18
       famed |      5225     6.05933    2.643855          1          9
-------------+--------------------------------------------------------
      famed1 |      5225    .0610526    .2394497          0          1
      famed2 |      5225    .0742584     .262216          0          1
      famed3 |      5225    .1144498    .3183872          0          1
      famed4 |      5225    .0474641    .2126498          0          1
      famed5 |      5225     .077512    .2674276          0          1
-------------+--------------------------------------------------------
      famed6 |      5225    .1245933    .3302888          0          1
      famed7 |      5225    .0486124     .215077          0          1
      famed8 |      5225    .2273684    .4191726          0          1
      famed9 |      5225     .224689    .4174173          0          1
       int76 |      5225     .707177    .4551014          0          1
-------------+--------------------------------------------------------
     age1415 |      5225    .2595215    .4384141          0          1
     age1617 |      5225    .2482297    .4320271          0          1
     age1819 |      5225    .1751196    .3801058          0          1
     age2021 |      5225      .11311    .3167576          0          1
     age2224 |      5225    .2040191    .4030216          0          1
-------------+--------------------------------------------------------
    cage1415 |      5225    .1755024    .3804327          0          1
    cage1617 |      5225    .1680383    .3739361          0          1
    cage1819 |      5225    .1245933    .3302888          0          1
    cage2021 |      5225    .0796172    .2707256          0          1
    cage2224 |      5225    .1441148    .3512397          0          1
-------------+--------------------------------------------------------
      cage66 |      5225    12.56115    8.785895          0         24
          a1 |      5225    .1314833    .3379605          0          1
          a2 |      5225    .1280383    .3341644          0          1
          a3 |      5225    .1326316    .3392086          0          1
          a4 |      5225    .1155981    .3197729          0          1
-------------+--------------------------------------------------------
          a5 |      5225     .098756    .2983627          0          1
          a6 |      5225    .0763636    .2656045          0          1
          a7 |      5225    .0560766    .2300915          0          1
          a8 |      5225    .0570335    .2319288          0          1
          a9 |      5225    .0666029    .2493568          0          1
-------------+--------------------------------------------------------
         a10 |      5225    .0683254    .2523275          0          1
         a11 |      5225    .0690909    .2536329          0          1
         ca1 |      5225     .308134    .4617664          0          1
         ca2 |      5225    .0876555    .2828203          0          1
         ca3 |      5225    .0878469    .2830992          0          1
-------------+--------------------------------------------------------
         ca4 |      5225    .0870813    .2819812          0          1
         ca5 |      5225    .0809569    .2727951          0          1
         ca6 |      5225    .0708134    .2565374          0          1
         ca7 |      5225    .0537799    .2256044          0          1
         ca8 |      5225    .0390431     .193716          0          1
-------------+--------------------------------------------------------
         ca9 |      5225    .0405742    .1973204          0          1
        ca10 |      5225    .0465072    .2106009          0          1
        ca11 |      5225    .0484211    .2146748          0          1
        ca12 |      5225    12.52593    2.740455          0         18
         g25 |      5225    12.53923    2.749407          0         18
-------------+--------------------------------------------------------
        g25i |      4148    12.77929    2.740756          0         18
     intmo66 |      5225   -5.790239    128.4984       -999         12
      nlsflt |      5225    .9835407    .1272459          0          1
        nsib |      5225    2.818565    2.473752          0         18
         ns1 |      5225    .2547368    .4357549          0          1
-------------+--------------------------------------------------------
         ns2 |      5225    .3534928    .4780998          0          1
         ns3 |      5225    .0109091    .1038853          0          1
         ns4 |      5225    .1892823    .3917702          0          1
         ns5 |      5225     .135311    .3420882          0          1
         ns6 |      5225    .0558852    .2297218          0          1
-------------+--------------------------------------------------------
         ns7 |      5225    .0003828    .0195628          0          1
.
. * Define the exogenous regressors using the global macro exogregressors
. global exogregressors black south76 smsa76 reg2-reg9 /*
> */ smsa66 momdad14 sinmom14 nodaded nomomed daded momed famed1-famed8
.
. * Write data to a text (ascii) file so can use with programs other than stata
. outfile wage76 grade76 exp76 expsq76 col4 age76 agesq76 black south76 smsa76 reg2-reg9 /*
> */ smsa66 momdad14 sinmom14 nodaded nomomed daded momed famed1-famed8 /*
> */ using mma04p4ivweak.asc, replace
.
.
. ********** (1) OLS AND IV ESTIMATES: COLUMNS 1 AND 2 OF KLING TABLE 1
.
. * RETAIN cases for the analysis
. * Here drop if missing wages or missing schooling or not at first interview
. keep if wage76!=. & grade76!=. & nlsflt==1
(2216 observations deleted)
.
. * DESCRIBE dependent variable, regressors and instruments
. desc wage76 grade76 exp76 expsq76 col4 age76 agesq76 $exogregressors
                                                (famed=3)
famed4          float  %9.0g                  If mgrade>=12 & fgrade==-1
                                                (famed=4)
famed5          float  %9.0g                  If fgrade>=12 (famed=5)
famed6          float  %9.0g                  If mgrade>=12 & fgrade> -1
                                                (famed=6)
famed7          float  %9.0g                  If mgrade>=9 & fgrade>=9
                                                (famed=7)
famed8          float  %9.0g                  If mgrade> -1 & fgrade> -1
                                                (famed=8)
.
. * SUMMARIZE dependent variable, regressors and instruments
. sum wage76 grade76 exp76 expsq76 col4 age76 agesq76 $exogregressors
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
      wage76 |      3010    1.656664     .443798          0     3.1797
     grade76 |      3010    13.26346    2.676913          1         18
       exp76 |      3010    8.856146    4.141672          0         23
     expsq76 |      3010    .9557907    .8461831          0       5.29
        col4 |      3010    .6820598    .4657535          0          1
-------------+--------------------------------------------------------
       age76 |      3010     28.1196    3.137004         24         34
     agesq76 |      3010    800.5495    180.7484        576       1156
       black |      3010    .2335548    .4231624          0          1
     south76 |      3010    .4036545    .4907113          0          1
      smsa76 |      3010    .7129568    .4524571          0          1
-------------+--------------------------------------------------------
        reg2 |      3010    .1607973     .367405          0          1
        reg3 |      3010    .1956811      .39679          0          1
        reg4 |      3010    .0641196    .2450066          0          1
        reg5 |      3010    .2083056     .406164          0          1
        reg6 |      3010    .0960133    .2946584          0          1
-------------+--------------------------------------------------------
        reg7 |      3010    .1099668    .3129003          0          1
        reg8 |      3010    .0282392     .165683          0          1
        reg9 |      3010    .0903654    .2867522          0          1
      smsa66 |      3010    .6495017    .4772053          0          1
    momdad14 |      3010    .7893688    .4078247          0          1
-------------+--------------------------------------------------------
    sinmom14 |      3010    .1006645    .3009339          0          1
     nodaded |      3010    .2292359    .4204111          0          1
     nomomed |      3010    .1172757     .321802          0          1
       daded |      3010    9.988262    3.266511          0         18
       momed |      3010    10.33675    2.987507          0         18
-------------+--------------------------------------------------------
      famed1 |      3010    .0614618    .2402153          0          1
      famed2 |      3010    .0787375    .2693734          0          1
      famed3 |      3010    .1249169    .3306796          0          1
      famed4 |      3010    .0475083    .2127588          0          1
      famed5 |      3010    .0790698    .2698925          0          1
-------------+--------------------------------------------------------
      famed6 |      3010    .1328904    .3395126          0          1
      famed7 |      3010    .0504983    .2190073          0          1
      famed8 |      3010    .2202658    .4144947          0          1
.
. * OLS estimates of return to schooling.
. * This regression computes schooling coeff, se for Table1 col 1 p.359
. * based on all cases (age grp 14-24) reported highest grd cmpl 76
.
. reg wage76 grade76 exp76 expsq76 $exogregressors
      Source |       SS       df       MS              Number of obs =    3010
-------------+------------------------------           F( 29,  2980) =   44.94
       Model |  180.320527    29  6.21794919           Prob > F      =  0.0000
    Residual |   412.32209  2980  .138363117           R-squared     =  0.3043
-------------+------------------------------           Adj R-squared =  0.2975
       Total |  592.642616  3009  .196956669           Root MSE      =  .37197

------------------------------------------------------------------------------
      wage76 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     grade76 |    .072635   .0036984    19.64   0.000     .0653833    .0798868
       exp76 |   .0845293   .0066819    12.65   0.000     .0714277    .0976308
     expsq76 |  -.2289581   .0319499    -7.17   0.000    -.2916041   -.1663121
       black |  -.1894065   .0194462    -9.74   0.000    -.2275358   -.1512773
     south76 |  -.1464841   .0260345    -5.63   0.000    -.1975314   -.0954368
      smsa76 |   .1377121   .0201334     6.84   0.000     .0982353    .1771889
        reg2 |   .1023805   .0360137     2.84   0.005     .0317662    .1729947
        reg3 |   .1488958   .0352521     4.22   0.000     .0797748    .2180168
        reg4 |   .0601267   .0417556     1.44   0.150     -.021746    .1419994
        reg5 |   .1348504   .0419098     3.22   0.001     .0526752    .2170255
        reg6 |   .1452831   .0453155     3.21   0.001     .0564302    .2341359
        reg7 |   .1301968    .044965     2.90   0.004     .0420312    .2183624
        reg8 |  -.0444289   .0513937    -0.86   0.387    -.1451997    .0563419
        reg9 |   .1285658   .0389959     3.30   0.001     .0521042    .2050274
      smsa66 |   .0233775    .019544     1.20   0.232    -.0149436    .0616987
    momdad14 |   .0693317   .0263402     2.63   0.009      .017685    .1209785
    sinmom14 |   .0335387   .0354168     0.95   0.344    -.0359052    .1029825
     nodaded |  -.0390477   .0531089    -0.74   0.462    -.1431815    .0650862
     nomomed |   .0168143   .0348295     0.48   0.629     -.051478    .0851066
       daded |  -.0017839   .0043977    -0.41   0.685    -.0104068    .0068389
       momed |   .0081443   .0041513     1.96   0.050     4.64e-06    .0162839
      famed1 |  -.1166029   .0788125    -1.48   0.139    -.2711354    .0379296
      famed2 |   -.052544   .0712753    -0.74   0.461    -.1922977    .0872097
      famed3 |  -.0719675   .0654608    -1.10   0.272    -.2003205    .0563856
      famed4 |  -.0197095   .0437058    -0.45   0.652    -.1054062    .0659872
      famed5 |  -.0252185   .0643526    -0.39   0.695    -.1513985    .1009615
      famed6 |  -.0733887   .0621076    -1.18   0.237    -.1951667    .0483894
      famed7 |   -.059927   .0656929    -0.91   0.362     -.188735     .068881
>
------------------------------------------------------------------
    Variable |     ols        olshet         iv         ivhet
-------------+----------------------------------------------------
     grade76 |     0.0726      0.0726      0.1324      0.1324
             |     0.0037      0.0039      0.0493      0.0488
       exp76 |     0.0845      0.0845      0.0632      0.0632
             |     0.0067      0.0068      0.0241      0.0241
     expsq76 |    -0.2290     -0.2290     -0.1267     -0.1267
             |     0.0319      0.0322      0.1185      0.1182
       black |    -0.1894     -0.1894     -0.1644     -0.1644
             |     0.0194      0.0198      0.0292      0.0285
     south76 |    -0.1465     -0.1465     -0.1400     -0.1400
             |     0.0260      0.0280      0.0284      0.0292
      smsa76 |     0.1377      0.1377      0.0910      0.0910
             |     0.0201      0.0193      0.0441      0.0440
        reg2 |     0.1024      0.1024      0.0753      0.0753
             |     0.0360      0.0350      0.0444      0.0432
        reg3 |     0.1489      0.1489      0.1231      0.1231
             |     0.0353      0.0338      0.0432      0.0418
        reg4 |     0.0601      0.0601      0.0242      0.0242
             |     0.0418      0.0412      0.0535      0.0531
        reg5 |     0.1349      0.1349      0.1248      0.1248
             |     0.0419      0.0428      0.0455      0.0459
        reg6 |     0.1453      0.1453      0.1358      0.1358
             |     0.0453      0.0452      0.0490      0.0483
        reg7 |     0.1302      0.1302      0.1064      0.1064
             |     0.0450      0.0457      0.0519      0.0516
        reg8 |    -0.0444     -0.0444     -0.0851     -0.0851
             |     0.0514      0.0509      0.0643      0.0619
        reg9 |     0.1286      0.1286      0.0916      0.0916
             |     0.0390      0.0388      0.0516      0.0504
      smsa66 |     0.0234      0.0234      0.0380      0.0380
             |     0.0195      0.0187      0.0241      0.0231
    momdad14 |     0.0693      0.0693      0.0432      0.0432
             |     0.0263      0.0257      0.0354      0.0352
    sinmom14 |     0.0335      0.0335      0.0258      0.0258
             |     0.0354      0.0359      0.0383      0.0384
     nodaded |    -0.0390     -0.0390     -0.0462     -0.0462
             |     0.0531      0.0511      0.0571      0.0550
     nomomed |     0.0168      0.0168      0.0266      0.0266
             |     0.0348      0.0344      0.0383      0.0375
       daded |    -0.0018     -0.0018     -0.0111     -0.0111
             |     0.0044      0.0044      0.0090      0.0089
       momed |     0.0081      0.0081     -0.0018     -0.0018
             |     0.0042      0.0042      0.0093      0.0093
      famed1 |    -0.1166     -0.1166     -0.2133     -0.2133
             |     0.0788      0.0792      0.1160      0.1160
      famed2 |    -0.0525     -0.0525     -0.1567     -0.1567
             |     0.0713      0.0698      0.1146      0.1132
.
. * Not relevant here as more than one endogenous regressor
. * If only one endogenous regressor x1 Bound et al purge the effect of x2
. * by (1) get residual from regress x1 on x2
. * (2) get the residuals from regress z on x2
. * and then get the R-squared from regress (1) on (2).
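The two-residual recipe in the comment above can be sketched outside Stata. The following Python/NumPy example is only an illustration on simulated data, not part of the original program; the function names `ols_residuals` and `bound_partial_r2` and the simulated design are assumptions made here.

```python
import numpy as np

def ols_residuals(y, X):
    """Residuals from a least-squares regression of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ coef

def bound_partial_r2(x1, Z, X2):
    """Partial R-squared for one endogenous regressor x1.

    x1 : (n,)   endogenous regressor
    Z  : (n, m) excluded instruments
    X2 : (n, k) exogenous regressors, including a constant column
    """
    e_x = ols_residuals(x1, X2)   # (1) purge x2 from x1
    e_z = ols_residuals(Z, X2)    # (2) purge x2 from the instruments
    u = ols_residuals(e_x, e_z)   # regress (1) on (2)
    # e_x has mean zero because X2 contains a constant, so this is a centered R-squared
    return 1.0 - (u @ u) / (e_x @ e_x)

# Simulated example (hypothetical design): population partial R-squared is 0.25/1.25 = 0.2
rng = np.random.default_rng(0)
n = 5000
x2 = rng.standard_normal(n)
z = rng.standard_normal(n)
x1 = 1.0 + x2 + 0.5 * z + rng.standard_normal(n)
X2 = np.column_stack([np.ones(n), x2])
r2_bound = bound_partial_r2(x1, z[:, None], X2)
print(round(r2_bound, 3))
```

Because the exogenous regressors are partialled out of both sides first, the measure reflects only the explanatory power of the excluded instruments.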
.
. **** (D) Shea (1997) partial R-squared [Given in Table 4.5]
.
. * Here we have three endogenous regressors.
. * Focus on the endogenous schooling regressor.
. * For the other two just need to replace the first line of (1)
. * e.g. quietly reg exp76 grade76 expsq76 $exogregressors
. * and replace the first line of (2B)
. * e.g. quietly reg exp76hat grade76hat expsq76hat $exogregressors
.
. * (1) Form x1 - x1tilda: residual from regress x1 on other regressors
. quietly reg grade76 exp76 expsq76 $exogregressors
. predict x1minusx1tilda, resid
.
. * (2) Form x1hat - x1hattilda: residual from regress x1hat on fitted values of other regressors
. * (2A) First get the fitted values from regress endogenous on instruments
. quietly reg grade76 col4 age76 agesq76 $exogregressors
. predict grade76hat, xb
. di e(r2) " r2 from regress x1 on Z"
.29677588 r2 from regress x1 on Z
. quietly reg exp76 col4 age76 agesq76 $exogregressors
. predict exp76hat, xb
. di e(r2) " r2 from regress second endog regressor on Z"
.70622765 r2 from regress second endog regressor on Z
. quietly reg expsq76 col4 age76 agesq76 $exogregressors
. predict expsq76hat, xb
. di e(r2) " r2 from regress third endog regressor on Z"
.67573235 r2 from regress third endog regressor on Z
. * Fitted values for the exogenous from regress exogenous on instruments are the exogenous
. * (2B) Run the regression of x1hat on fitted values of other regressors
. quietly reg grade76hat exp76hat expsq76hat $exogregressors
. di e(r2) " r2 from regress prediction of x1 on predictions of x2"
.98987117 r2 from regress prediction of x1 on predictions of x2
.
. **** DISPLAY RESULT IN TABLE 4.5 page 111
.
. * Shea's Partial R^2 in Table 4.5
. di r(rho)^2 " Shea's partial R-squared measure"
.00640757 Shea's partial R-squared measure
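Steps (1), (2A) and (2B) can be collected into one function. The sketch below is a Python/NumPy illustration on simulated data with two endogenous regressors; it is not the authors' code, and the names `shea_partial_r2`, `project` and the simulated design are assumptions introduced here.

```python
import numpy as np

def ols_residuals(y, X):
    """Residuals from a least-squares regression of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ coef

def project(V, Z):
    """Fitted values from regressing each column of V on the instrument matrix Z."""
    coef, *_ = np.linalg.lstsq(Z, V, rcond=None)
    return Z @ coef

def shea_partial_r2(x1, X_other, Z):
    """Squared correlation between (x1 - x1tilde) and (x1hat - x1hattilde).

    x1      : (n,)   endogenous regressor of interest
    X_other : (n, k) all other regressors, including the constant
    Z       : (n, m) all instruments, including the constant and exogenous regressors
    """
    e1 = ols_residuals(x1, X_other)                            # step (1)
    x1_hat = project(x1, Z)                                    # step (2A)
    X_other_hat = project(X_other, Z)                          # exogenous columns project onto themselves
    e2 = ols_residuals(x1_hat, X_other_hat)                    # step (2B)
    return np.corrcoef(e1, e2)[0, 1] ** 2

# Simulated example (hypothetical design) with two endogenous regressors
rng = np.random.default_rng(1)
n = 5000
w = rng.standard_normal(n)                    # exogenous regressor
z1, z2 = rng.standard_normal(n), rng.standard_normal(n)
x1 = w + z1 + 0.5 * z2 + rng.standard_normal(n)
x2 = w + 0.5 * z1 + z2 + rng.standard_normal(n)
Z = np.column_stack([np.ones(n), w, z1, z2])
X_other = np.column_stack([np.ones(n), w, x2])
r2_shea = shea_partial_r2(x1, X_other, Z)
print(round(r2_shea, 3))
```

A value close to zero, as in the log above, indicates that little independent instrument variation remains for the regressor once the other regressors are accounted for.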
.
. sum grade76 grade76hat exp76 exp76hat expsq76 expsq76hat grade76 x1minusx1tilda x1hatminusx1hattilda grade76hat
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
     grade76 |      3010    13.26346    2.676913          1         18
  grade76hat |      3010    13.26346    1.458306   8.919074   17.42063
       exp76 |      3010    8.856146    4.141672          0         23
    exp76hat |      3010    8.856146    3.480551   1.329216   17.68953
     expsq76 |      3010    .9557907    .8461831          0       5.29
-------------+--------------------------------------------------------
  expsq76hat |      3010    .9557907    .6955874  -.3913698   2.917523
     grade76 |      3010    13.26346    2.676913          1         18
x1minusx1t~a |      3010   -8.71e-10    1.833502  -6.948598   5.661138
x1hatminus~a |      3010   -6.86e-11    .1467669  -.3732457   .3033035
  grade76hat |      3010    13.26346    1.458306   8.919074   17.42063
.
. **** (E) Poskitt-Skeels (2002) partial R-squared
. * Not done here
.
. **** (F) If model was over-identified then do test of over-identifying restrictions
. * Not done here as model is just-identified
.
. ********** CLOSE OUTPUT
. log close
log: c:\Imbook\bwebpage\Section2\mma04p4ivweak.txt
log type: text
closed on: 17 May 2005, 13:46:03
-------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma05p1mle.txt
log type: text
opened on: 17 May 2005, 13:48:11
.
. ********** OVERVIEW OF MMA05P1MLE.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 5.9 pp.159-63
. * Maximum likelihood analysis.
.
. * Provides first two columns of Table 5.7
. * (1) OLS using Stata command regress
. * (2) MLE using Stata command exp for exponential MLE
. * (3) MLE using Stata command ml for user-provided log-likelihood
. * using generated data (see below)
.
. * Related programs:
. * mma05p2nls.do         NLS, WNLS, FGNLS for same data using nl command
. * mma05p3nlsbyml.do     NLS, WNLS, FGNLS for same data using ml command
. * mma05p4margeffects.do Calculates marginal effects
.
. ********** SETUP **********
.
. set more off
. version 8
.
. ********** GENERATE DATA and SUMMARIZE **********
.
. * Model is y ~ exponential(exp(a + bx))
. *          x ~ N[mux, sigx^2]
. *       f(y) = exp(a + bx)*exp(-y*exp(a + bx))
. *     lnf(y) = (a + bx) - y*exp(a + bx)
. *       E[y] = exp(-(a + bx))   note sign reversal for the mean
. *       V[y] = exp(-2(a + bx)) = E[y]^2
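As a check on the moments stated in the comment block, the mean and variance follow from the exponential density with rate $\lambda = \exp(a + bx)$. The derivation below is a standard result written in notation assumed here ($\lambda$ for the rate), not text from the original program.

```latex
\begin{aligned}
f(y \mid x) &= \lambda e^{-\lambda y}, \qquad \lambda = \exp(a + bx), \quad y > 0, \\
\mathrm{E}[y \mid x] &= \int_0^\infty y \,\lambda e^{-\lambda y}\, dy = \frac{1}{\lambda} = \exp\{-(a + bx)\}, \\
\mathrm{E}[y^2 \mid x] &= \int_0^\infty y^2 \,\lambda e^{-\lambda y}\, dy = \frac{2}{\lambda^2}, \\
\mathrm{V}[y \mid x] &= \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}
  = \exp\{-2(a + bx)\} = \bigl(\mathrm{E}[y \mid x]\bigr)^2 .
\end{aligned}
```

The variance therefore equals the squared mean, which is why the errors in a regression of y on x are heteroskedastic in this design.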
.
. * The dgp sets particular values of a, b, mux and sigx
. * Here a = 2, b = -1 and x ~ N[1, 1]
. scalar a = 2
. scalar b = -1
. scalar mux = 1
. scalar sigx = 1
.
. * Set the sample size. Table 5.7 uses N=10,000
. set obs 10000
obs was 0, now 10000
.
. * Generate x and y
. set seed 2003
. gen x = mux + sigx*invnorm(uniform())
. gen lamda = exp(a + b*x)
. gen Ey = 1/lamda
. * To generate exponential with mean mu=Ey use
. * Integral 0 to a of (1/mu)exp(-x/mu) dx by change of variables
. * = Integral 0 to a/mu of exp(-t)dt
. * = incomplete gamma function P(1,a/mu) in the terminology of Stata
. gen y = Ey*invgammap(1,uniform())
. gen lny = ln(y)
. gen lnfy = ln(lamda) - y*lamda
. * twoway scatter Ey x
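The data-generating step above draws y by inverting the exponential CDF. A minimal Python/NumPy sketch of the same idea follows; it assumes the parameter values stated in the log (a = 2, b = -1, x ~ N[1, 1], N = 10,000) and reuses the seed value only as a label, so the draws will not reproduce Stata's random numbers.

```python
import numpy as np

rng = np.random.default_rng(2003)   # same seed value as the log, but a different generator
a, b, mux, sigx, n = 2.0, -1.0, 1.0, 1.0, 10_000

x = mux + sigx * rng.standard_normal(n)
lam = np.exp(a + b * x)             # rate parameter, called lamda in the log
Ey = 1.0 / lam                      # conditional mean of y

# Inverse-CDF draw: if u ~ U[0,1) then -mu*ln(1-u) is exponential with mean mu.
# This is the closed form of what Ey*invgammap(1, u) returns in the Stata code.
u = rng.uniform(size=n)
y = -Ey * np.log1p(-u)

lny = np.log(y)
lnfy = np.log(lam) - y * lam        # log-density evaluated at the true parameters
print(x.mean(), Ey.mean(), y.mean())
```

With these parameter values the unconditional mean of y is exp(-0.5), about 0.61, which is in line with the sample means of Ey and y reported in the summary table.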
.
. * Descriptive Statistics
. describe
Contains data
  obs:        10,000
 vars:             6
 size:       280,000 (97.3% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
x               float  %9.0g
lamda           float  %9.0g
Ey              float  %9.0g
y               float  %9.0g
lny             float  %9.0g
lnfy            float  %9.0g
-------------------------------------------------------------------------------
Sorted by:
     Note:  dataset has changed since last saved
. summarize
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
           x |     10000    1.014313    1.004905  -2.895741   4.994059
       lamda |     10000    4.457478    5.939084   .0500838   133.7191
          Ey |     10000    .6185677    .8294007   .0074784   19.96655
           y |     10000    .6194352    1.291416   .0000445   30.60636
         lny |     10000   -1.554348     1.62358  -10.02114   3.421208
-------------+--------------------------------------------------------
        lnfy |     10000   -.0209485    1.419595   -7.52596   4.402257
.
. ********** WRITE DATA TO A TEXT FILE **********
.
. * Write data to a text (ascii) file
. * used for programs mma05p2nls.do, mma05p3nlsbyml.do
. * and mma05p4margeffects.do
. * and can also use with programs other than Stata
. outfile y x using mma05data.asc, replace
.
. ********** DO THE ANALYSIS: OLS and MLE **********
.
. ** (1) OLS ESTIMATION
.
. * OLS is inconsistent in this example
. regress y x
      Source |       SS       df       MS              Number of obs =   10000
-------------+------------------------------           F(  1,  9998) = 3030.74
       Model |  3879.13606     1  3879.13606           Prob > F      =  0.0000
    Residual |  12796.7438  9998  1.27993037           R-squared     =  0.2326
-------------+------------------------------           Adj R-squared =  0.2325
       Total |  16675.8799  9999  1.66775476           Root MSE      =  1.1313

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .6198182   .0112587    55.05   0.000     .5977488    .6418876
       _cons |  -.0092545    .016075    -0.58   0.565    -.0407648    .0222558
------------------------------------------------------------------------------
. estimates store rols
. regress y x, robust
Regression with robust standard errors
                                                       F(  1,  9998) =  596.30
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.2326
                                                       Root MSE      =  1.1313

------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .6198182   .0253823    24.42   0.000     .5700638    .6695725
       _cons |  -.0092545   .0171978    -0.54   0.591    -.0429655    .0244566
------------------------------------------------------------------------------
. estimates store rolsrobust
.
. ** (2) ML ESTIMATION USING STATA COMMAND FOR EXPONENTIAL MLE
.
. * The following uses Stata duration model commands.
. * First need to define the duration variable (here y)
. stset y
     failure event:  (assumed to fail at time=y)
obs. time interval:  (0, y]
 exit on or before:  failure

------------------------------------------------------------------------------
    10000  total obs.
        0  exclusions
------------------------------------------------------------------------------
    10000  obs. remaining, representing
    10000  failures in single record/single failure data
 6194.352  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =  30.60636
. streg x, dist(exp) nohr
         failure _d:  1 (meaning all fail)
   analysis time _t:  y

Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:

No. of subjects =        10000                     Number of obs   =     10000
No. of failures =        10000
Time at risk    =  6194.352495
                                                   LR chi2(1)      =  10003.63
Log likelihood  =    -15752.19                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |  -.9896276   .0098692  -100.27   0.000    -1.008971   -.9702842
       _cons |   1.982921   .0141496   140.14   0.000     1.955188    2.010654
------------------------------------------------------------------------------
. estimates store rexp
. streg x, dist(exp) nohr robust
         failure _d:  1 (meaning all fail)
   analysis time _t:  y

Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:

No. of subjects       =        10000               Number of obs   =     10000
No. of failures       =        10000
Time at risk          =  6194.352495
                                                   Wald chi2(1)    =   9914.62
Log pseudo-likelihood =    -15752.19               Prob > chi2     =    0.0000

------------------------------------------------------------------------------
             |               Robust
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |  -.9896276   .0099388   -99.57   0.000    -1.009107   -.9701479
       _cons |   1.982921   .0144307   137.41   0.000     1.954637    2.011205
------------------------------------------------------------------------------
. estimates store rexprobust
.
. ** (3) ML ESTIMATION USING STATA ML COMMAND
.
. * For MLE computation can use the following Stata commands
. * ml model lf   provide the log-density
. * ml model D0   provide the log-likelihood
. * ml model D1   provide the log-likelihood and gradient
. * ml model D2   provide the log-likelihood, gradient and hessian
.
. * At a minimum need to provide
. * (A) program define fcn where fcn is the function name
. *     defines the log-density (independent observations assumed)
. * (B) ml model lf fcn + some extras
. *     the extras give the dependent variable and regressors
. * (C) ml maximize
. *     obtains the mle
. * (D) ml model lf fcn + some extras, robust
. *     provides robust sandwich standard errors
.
. * Here we provide the log-density (ml model lf) as this is simplest,
. * and the Stata manual says that numerically only D2 is better.
.
. * (A) Define the log-density
. *     lnf(y) = (a+bx) - y*exp(a+bx) = theta - y*exp(theta) where theta = x'b
. program define mleexp0
  1. version 8.0
  2. args lnf theta /* Must use lnf while could use name other than theta */
  3. quietly replace `lnf' = `theta' - $ML_y1*exp(`theta')
  4. end
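The log-density lnf(y) = theta - y*exp(theta) with theta = x'b is globally concave in b, so a plain Newton-Raphson iteration is enough to maximize it. The Python/NumPy sketch below illustrates that computation on freshly simulated data; it is not what Stata's ml does internally, and the function name `exp_mle`, the starting values and the simulated sample are assumptions made for this example.

```python
import numpy as np

def exp_mle(y, X, tol=1e-10, max_iter=100):
    """Newton-Raphson MLE for lnf = theta - y*exp(theta), theta = X @ beta.

    X must have the constant in its first column.
    Returns (beta, standard errors, maximized log-likelihood).
    """
    beta = np.zeros(X.shape[1])
    beta[0] = -np.log(y.mean())                  # constant-only MLE as a starting value
    for _ in range(max_iter):
        lam = np.exp(X @ beta)
        grad = X.T @ (1.0 - y * lam)             # sum of scores
        A = (X * (y * lam)[:, None]).T @ X       # minus the Hessian
        step = np.linalg.solve(A, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    lam = np.exp(X @ beta)
    A = (X * (y * lam)[:, None]).T @ X
    se = np.sqrt(np.diag(np.linalg.inv(A)))      # nonrobust (inverse information) s.e.'s
    loglik = np.sum(X @ beta - y * lam)
    return beta, se, loglik

# Simulated data with the dgp described in the log: a = 2, b = -1, x ~ N[1, 1]
rng = np.random.default_rng(42)
n = 10_000
x = 1.0 + rng.standard_normal(n)
X = np.column_stack([np.ones(n), x])
y = rng.exponential(scale=np.exp(-(2.0 - 1.0 * x)))   # mean = exp(-(a + bx))

beta_hat, se_hat, ll = exp_mle(y, X)
print(beta_hat, se_hat, ll)
```

The estimates should land close to the true values (2, -1); they will not equal the figures in the log because the simulated sample differs.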
.
. * (B) Say that dependent variable is y and regressors are x plus a constant
. ml model lf mleexp0 (y = x)
.
. * (C) Obtain the MLE
. ml search /* Optional - can provide better starting values */
initial:       log likelihood = -6194.3525
improve:       log likelihood = -6194.3525
alternative:   log likelihood = -5212.7607
rescale:       log likelihood = -5212.7607
. ml maximize
initial:       log likelihood = -5212.7607
rescale:       log likelihood = -5212.7607
Iteration 0:   log likelihood = -5212.7607
Iteration 1:   log likelihood = -1563.9176
Iteration 2:   log likelihood =  -217.6055
Iteration 3:   log likelihood = -208.73633
Iteration 4:   log likelihood = -208.71383
Iteration 5:   log likelihood = -208.71383

                                                  Number of obs   =      10000
                                                  Wald chi2(1)    =   10054.85
Log likelihood = -208.71383                       Prob > chi2     =     0.0000

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |  -.9896276   .0098692  -100.27   0.000    -1.008971   -.9702842
       _cons |   1.982921   .0141496   140.14   0.000     1.955188    2.010654
------------------------------------------------------------------------------
. estimates store rmle
.
. * (D) Obtain robust standard errors
. ml model lf mleexp0 (y = x), robust
. ml search
initial:       log pseudo-likelihood = -6194.3525
improve:       log pseudo-likelihood = -6194.3525
alternative:   log pseudo-likelihood = -5212.7607
rescale:       log pseudo-likelihood = -5212.7607
. ml maximize
initial:       log pseudo-likelihood = -5212.7607
rescale:       log pseudo-likelihood = -5212.7607
Iteration 0:   log pseudo-likelihood = -5212.7607
Iteration 1:   log pseudo-likelihood = -1563.9176
Iteration 2:   log pseudo-likelihood =  -217.6055
Iteration 3:   log pseudo-likelihood = -208.73633
Iteration 4:   log pseudo-likelihood = -208.71383
Iteration 5:   log pseudo-likelihood = -208.71383

                                                  Number of obs   =      10000
                                                  Wald chi2(1)    =    9914.62
Log pseudo-likelihood = -208.71383                Prob > chi2     =     0.0000

------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |  -.9896276   .0099388   -99.57   0.000    -1.009107   -.9701479
       _cons |   1.982921   .0144307   137.41   0.000     1.954637    2.011205
------------------------------------------------------------------------------
. estimates store rmlerobust
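The robust option replaces the inverse-information variance with a sandwich of the form A^-1 B A^-1, where A is minus the Hessian and B is the sum of outer products of the per-observation scores. The Python/NumPy sketch below illustrates that formula for the exponential model on simulated data; the variable names are assumptions made here, and any finite-sample scaling that a package might apply is ignored.

```python
import numpy as np

# Simulated data with the dgp described in the log: a = 2, b = -1, x ~ N[1, 1]
rng = np.random.default_rng(12345)
n = 10_000
x = 1.0 + rng.standard_normal(n)
X = np.column_stack([np.ones(n), x])                  # constant first, then x
y = rng.exponential(scale=np.exp(-(2.0 - 1.0 * x)))   # mean = exp(-(a + bx))

# Newton-Raphson for lnf = theta - y*exp(theta), started at the constant-only MLE
beta = np.array([-np.log(y.mean()), 0.0])
for _ in range(100):
    lam = np.exp(X @ beta)
    grad = X.T @ (1.0 - y * lam)
    A = (X * (y * lam)[:, None]).T @ X                # minus the Hessian
    step = np.linalg.solve(A, grad)
    beta = beta + step
    if np.max(np.abs(step)) < 1e-10:
        break

# Evaluate A and the per-observation scores at the MLE
lam = np.exp(X @ beta)
A = (X * (y * lam)[:, None]).T @ X
scores = X * (1.0 - y * lam)[:, None]                 # n x 2 matrix of score contributions
B = scores.T @ scores                                 # outer product of the scores

A_inv = np.linalg.inv(A)
se_default = np.sqrt(np.diag(A_inv))                  # inverse-information s.e.'s
se_robust = np.sqrt(np.diag(A_inv @ B @ A_inv))       # sandwich s.e.'s
print(beta, se_default, se_robust)
```

Because the exponential density is correctly specified in this design, A and B estimate the same matrix, so the two sets of standard errors should be close, as they are in the log above.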
.
. * (E) Calculate R-squared and log-likelihood at the ML estimates
. * lnL sums lnf(y) = ln(lamda) - y*lamda
. gen lamdaml = exp(_b[_cons] + _b[x]*x)
. gen lnfml = ln(lamdaml) - y*lamdaml
. quietly means lnfml
.
. * (2) MLE by command ereg - nonrobust and robust standard errors
. estimates table rexp rexprobust, b(%10.4f) se(%10.4f) t stats(N ll) keep(_cons x)
----------------------------------------
    Variable |    rexp      rexprobust
-------------+--------------------------
       _cons |     1.9829       1.9829
             |     0.0141       0.0144
             |     140.14       137.41
           x |    -0.9896      -0.9896
             |     0.0099       0.0099
             |    -100.27       -99.57
-------------+--------------------------
           N | 10000.0000   10000.0000
          ll | -1.575e+04   -1.575e+04
----------------------------------------
                          legend: b/se/t
.
. * (3) MLE by command ml - nonrobust and robust standard errors
. estimates table rmle rmlerobust, b(%10.4f) se(%10.4f) t stats(N ll) keep(_cons x)
----------------------------------------
    Variable |    rmle      rmlerobust
-------------+--------------------------
       _cons |     1.9829       1.9829
             |     0.0141       0.0144
             |     140.14       137.41
           x |    -0.9896      -0.9896
             |     0.0099       0.0099
             |    -100.27       -99.57
-------------+--------------------------
           N | 10000.0000   10000.0000
          ll |  -208.7138    -208.7138
----------------------------------------
                          legend: b/se/t
. * And ML log-likelihood (check) and R-squared (needed to be computed)
. di "Log likelihood for ML: " LLml
Log likelihood for ML: -208.71383
. di "R-squared for MLE: " Rsqml
R-squared for MLE: .39062307
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section2\mma05p1mle.txt
log type: text
closed on: 17 May 2005, 13:48:18
-------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma05p2nls.txt
log type: text
opened on: 17 May 2005, 13:53:31
.
. ********** OVERVIEW OF MMA05P2NLS.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 5.9 pp.159-63
. * Nonlinear least squares
.
. * Provides last three columns of Table 5.7 results for
. * (1) NLS using Stata command nl (hard to get robust s.e.'s)
. * (2) FGNLS using Stata command nl (hard to get robust s.e.'s)
. * (3) WNLS using Stata command nl (hard to get robust s.e.'s)
. * using generated data set mma05data.asc
.
. * Note: Stata 8 does not give robust se's for nl
. *       But ml does - see program mma05p3nlsbyml.do
. *       New Stata 9 does have a robust se option (unlike Stata 8)
.
. * Related programs:
. * mma05p1mle.do         OLS and MLE for the same data
. * mma05p3nlsbyml.do     NLS using ml rather than nl
. * mma05p4margeffects.do Calculates marginal effects
.
. * To run this program you need data and dictionary files
. * mma05data.asc ASCII data set generated by mma05p1mle.do
.
. ********** SETUP **********
.
. set more off
. version 8
.
. ********** READ IN DATA and SUMMARIZE **********
.
. * Model is  y ~ exponential(exp(a + bx))
. *           x ~ N[mux, sigx^2]
. *        f(y) = exp(a + bx)*exp(-y*exp(a + bx))
. *      lnf(y) = (a + bx) - y*exp(a + bx)
. *        E[y] = exp(-(a + bx))    note sign reversal for the mean
. *        V[y] = exp(-2(a + bx)) = E[y]^2
. * Here a = 2, b = -1 and x ~ N[mux=1, sigx^2=1]
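The moments quoted in these comments follow from the exponential density. A short derivation is sketched here; the symbol \lambda for the rate is introduced only for this note and does not appear in the program.

```latex
\begin{align*}
f(y \mid x) &= \lambda e^{-\lambda y}, \qquad \lambda = \exp(a + bx), \quad y > 0 \\
\ln f(y \mid x) &= (a + bx) - y \exp(a + bx) \\
E[y \mid x] &= 1/\lambda = \exp(-(a + bx)) \\
V[y \mid x] &= 1/\lambda^{2} = \exp(-2(a + bx)) = \left(E[y \mid x]\right)^{2}
\end{align*}
```

Because the conditional variance equals the squared conditional mean, the NLS error is heteroskedastic, which is why the program goes on to compute robust standard errors.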
  5.     global b2x=0
  6.     exit}
  7.   replace `1'=exp(-$b1int-$b2x*x) /* calculate function */
  8. end
.
. * (1B) Do NLS of y on the function expnls defined in (A)
. nl expnls y
(obs = 10000)
Iteration 0:  residual SS =  17308.68
Iteration 1:  residual SS =  10333.37
Iteration 2:  residual SS =  10150.66
Iteration 3:  residual SS =  10149.86
Iteration 4:  residual SS =  10149.86
Iteration 5:  residual SS =  10149.86
      Source |       SS       df       MS            Number of obs =     10000
-------------+------------------------------         F(  2,  9998) =   5103.98
       Model |  10363.0157     2  5181.50784         Prob > F      =    0.0000
    Residual |  10149.8633  9998  1.01518937         R-squared     =    0.5052
-------------+------------------------------         Adj R-squared =    0.5051
       Total |   20512.879 10000   2.0512879         Root MSE      =  1.007566
                                                     Res. dev.     =  28527.52
(expnls)
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.       t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       b1int |   1.887563   .0306819    61.52   0.000      1.82742    1.947705
         b2x |  -.9574684   .0097419   -98.28   0.000    -.9765645   -.9383724
------------------------------------------------------------------------------
(SEs, P values, CIs, and correlations are asymptotic approximations)
. estimates store bnls
.
. * Complications now begin: getting standard errors. Easier to use (1) !!
.
. * (1C) Get sandwich heteroskedastic-robust standard errors for NLS
.
. * Note that robust option does not work for nl
. * So wrong standard errors are given for this problem as errors are heteroskedastic
.
. * To get robust standard errors is not straightforward
.
. * Obtain them by OLS regress y - g(x,b) on dg/db with robust option.
. * Explanation: OLS regress y - g(x,b) = (dg/db)'a + v
. * This is NR algorithm for update of b
. * But a = 0 since iterations have converged, so v = y - g(x,b)
. * So nonrobust standard errors from this OLS regression yield
. * V[a] = s^2 (Sum_i (dg_i/db)(dg_i/db)')^-1
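The auxiliary-regression idea in these comments can be sketched outside Stata. The following is a minimal illustration, not the authors' program: the function name `aux_regression_se` and the four-observation toy inputs are hypothetical, and the N/(N-k) scaling of the sandwich is an assumption chosen to mimic what `regress, robust` reports.

```python
import math

def aux_regression_se(resid, d1, d2):
    """Standard errors from OLS of resid on (d1, d2) without a constant.

    Assumes NLS has converged, so the OLS coefficients are zero and the
    OLS residual is resid itself. Returns (nonrobust, robust) lists.
    """
    n, k = len(resid), 2
    # Elements of D'D and its 2x2 inverse.
    a11 = sum(u * u for u in d1)
    a12 = sum(u * v for u, v in zip(d1, d2))
    a22 = sum(v * v for v in d2)
    det = a11 * a22 - a12 * a12
    inv = [[a22 / det, -a12 / det], [-a12 / det, a11 / det]]

    # Nonrobust: s^2 (D'D)^-1 with s^2 = SSR / (N - k).
    s2 = sum(r * r for r in resid) / (n - k)
    nonrobust = [math.sqrt(s2 * inv[j][j]) for j in range(k)]

    # Robust "meat": Sum_i r_i^2 d_i d_i'.
    m11 = sum(r * r * u * u for r, u in zip(resid, d1))
    m12 = sum(r * r * u * v for r, u, v in zip(resid, d1, d2))
    m22 = sum(r * r * v * v for r, v in zip(resid, d2))
    meat = [[m11, m12], [m12, m22]]

    # Sandwich (D'D)^-1 meat (D'D)^-1, scaled by N/(N-k) (an assumption).
    scale = n / (n - k)
    robust = []
    for j in range(k):
        vjj = sum(inv[j][p] * meat[p][q] * inv[q][j]
                  for p in range(k) for q in range(k))
        robust.append(math.sqrt(scale * vjj))
    return nonrobust, robust

# Hypothetical toy inputs: residuals are larger where d2 is active,
# and D'resid = 0 as it would be at an NLS solution.
resid = [1.0, -1.0, 3.0, -3.0]
d1 = [1.0, 1.0, 0.0, 0.0]
d2 = [0.0, 0.0, 1.0, 1.0]
se_nonrobust, se_robust = aux_regression_se(resid, d1, d2)
print(se_nonrobust, se_robust)
```

In the toy data the nonrobust standard errors are equal for both columns, while the robust ones differ, which is the same qualitative pattern as the nonrobust versus robust NLS standard errors in this log.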
. gen d1 = yhatnls
. gen d2 = x*yhatnls
. * This OLS regression gives robust standard errors
. regress residnls d1 d2, noconstant robust
Regression with robust standard errors                 Number of obs =   10000
                                                       F(  2,  9998) =    0.00
                                                       Prob > F      =  1.0000
                                                       R-squared     =  0.0000
                                                       Root MSE      =  1.0076
------------------------------------------------------------------------------
             |               Robust
    residnls |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          d1 |   4.46e-07   .1420794     0.00   1.000    -.2785037    .2785046
          d2 |  -1.49e-07   .0611969    -0.00   1.000    -.1199583     .119958
------------------------------------------------------------------------------
. estimates store bnlsrobust
.
. * Check: Do OLS regression that gives nonrobust standard errors
. *        and verify that same results as in (1B)
. regress residnls d1 d2, noconstant
      Source |       SS       df       MS              Number of obs =   10000
-------------+------------------------------           F(  2,  9998) =    0.00
       Model |  2.6739e-10     2  1.3370e-10           Prob > F      =  1.0000
    Residual |  10149.8633  9998  1.01518937           R-squared     =  0.0000
-------------+------------------------------           Adj R-squared = -0.0002
       Total |  10149.8633 10000  1.01498633           Root MSE      =  1.0076
------------------------------------------------------------------------------
    residnls |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
/* DD commented above */
Iteration 3:  residual SS =  220.6796
Iteration 4:  residual SS =  220.2856
Iteration 5:  residual SS =  220.2851
Iteration 6:  residual SS =  220.2851
      Source |       SS       df       MS            Number of obs =     10000
-------------+------------------------------         F(  2,  9998) =   4946.06
       Model |   217.95244     2   108.97622         Prob > F      =    0.0000
    Residual |  220.285065  9998  .022032913         R-squared     =    0.4973
-------------+------------------------------         Adj R-squared =    0.4972
       Total |  438.237505 10000  .043823751         Root MSE      =  .1484349
                                                     Res. dev.     =  8924.231
(expnls)
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.       t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       b1int |   1.984035   .0147737   134.30   0.000     1.955075    2.012994
         b2x |   -.990691     .01001   -98.97   0.000    -1.010313   -.9710694
------------------------------------------------------------------------------
(SEs, P values, CIs, and correlations are asymptotic approximations)
. estimates store bfgnls
.
. * (2C) Robust standard errors
. * The standard errors given are consistent
. * assuming correct model for heteroskedasticity.
. * To guard against misspecification use similar approach to nls case
. * Obtain the derivatives dg/db
. * Here g = exp(x'b) so dg/db = exp(x'b)*x = yhat*x
. predict residoptnls, residuals
. predict yhatoptnls, yhat
. gen d1opt = yhatoptnls
. gen d2opt = x*yhatoptnls
. * This OLS regression gives robust standard errors
. regress residoptnls d1opt d2opt [aweight=wfgnls], noconstant robust
(sum of wgt is 4.0558e+05)
Regression with robust standard errors                 Number of obs =   10000
                                                       F(  2,  9998) =    0.00
                                                       Prob > F      =  1.0000
                                                       R-squared     =  0.0000
                                                       Root MSE      =  .14843
------------------------------------------------------------------------------
             |               Robust
 residoptnls |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
" Rsqfgnls
-208.71965
.39056605
.
. ** (3) WNLS ESTIMATION USING STATA NL COMMAND
.
. * To get WNLS estimates in Table 5.7
. * replace gen wfgnls = (1/yhatnls)^2 in (2) FGNLS by gen wfgnls = 1/yhatnls
. * Code is shorter as all comments are dropped
.
. gen wwnls = 1/yhatnls
. nl expnls y [aweight=wwnls]
(sum of wgt is 39858.614)
Iteration 0:  residual SS =  2630.417
Iteration 1:  residual SS =  1694.802
Iteration 2:  residual SS =  1500.277
Iteration 3:  residual SS =  1494.658
Iteration 4:  residual SS =  1494.653
      Source |       SS       df       MS            Number of obs =     10000
-------------+------------------------------         F(  2,  9998) =   5073.75
       Model |  1517.00087     2  758.500436         Prob > F      =    0.0000
    Residual |   1494.6525  9998  .149495149         R-squared     =    0.5037
-------------+------------------------------         Adj R-squared =    0.5036
       Total |  3011.65337 10000  .301165337         Root MSE      =   .386646
                                                     Res. dev.     =  14035.49
(expnls)
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.       t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       b1int |   1.990623   .0224903    88.51   0.000     1.946537    2.034708
         b2x |  -.9960671    .009777  -101.88   0.000    -1.015232   -.9769022
------------------------------------------------------------------------------
(SEs, P values, CIs, and correlations are asymptotic approximations)
. estimates store bwnls
. predict residwnls, residuals
. predict yhatwnls, yhat
. gen d1w = yhatwnls
. gen d2w = x*yhatwnls
. regress residwnls d1w d2w [aweight=wwnls], noconstant robust
(sum of wgt is 3.9859e+04)
Regression with robust standard errors                 Number of obs =   10000
                                                       F(  2,  9998) =    0.00
                                                       Prob > F      =  1.0000
                                                       R-squared     =  0.0000
                                                       Root MSE      =  .38665
------------------------------------------------------------------------------
             |               Robust
   residwnls |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         d1w |  -1.11e-07   .0358551    -0.00   1.000    -.0702833    .0702831
         d2w |   5.35e-08   .0224175     0.00   1.000    -.0439428     .043943
------------------------------------------------------------------------------
. estimates store bwnlsrobust
. regress residwnls d1w d2w [aweight=wwnls], noconstant
(sum of wgt is 3.9859e+04)
      Source |       SS       df       MS              Number of obs =   10000
-------------+------------------------------           F(  2,  9998) =    0.00
       Model |  1.8190e-12     2  9.0949e-13           Prob > F      =  1.0000
    Residual |   1494.6525  9998  .149495149           R-squared     =  0.0000
-------------+------------------------------           Adj R-squared = -0.0002
       Total |   1494.6525 10000   .14946525           Root MSE      =  .38665
------------------------------------------------------------------------------
   residwnls |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         d1w |  -1.11e-07   .0224903    -0.00   1.000    -.0440856    .0440853
         d2w |   5.35e-08    .009777     0.00   1.000    -.0191649     .019165
------------------------------------------------------------------------------
. estimates store bwnlscheck
. gen lamdawnls = 1 / yhatwnls
.
. ***** PRINT RESULTS: Last three columns of Table 5.7 page 161
.
. * (1) NLS using NL - nonrobust and robust standard errors
. * Here nonrobust differs from robust asymptotically
.
. * Table 5.7 NLS nonrobust standard errors
. estimates table bnls, b(%10.4f) se(%10.4f) t stats(N ll)
--------------------------
    Variable |    bnls
-------------+------------
       b1int |    1.8876
             |    0.0307
             |     61.52
         b2x |   -0.9575
             |    0.0097
             |    -98.28
-------------+------------
           N | 10000.0000
          ll |
--------------------------
legend: b/se/t
. * Table 5.7 NLS robust standard errors
. estimates table bnlscheck bnlsrobust, b(%10.4f) se(%10.4f) t stats(N ll)
---------------------------------------
    Variable | bnlscheck   bnlsrobust
-------------+-------------------------
          d1 |    0.0000      0.0000
             |    0.0307      0.1421
             |      0.00        0.00
          d2 |   -0.0000     -0.0000
             |    0.0097      0.0612
             |     -0.00       -0.00
-------------+-------------------------
           N | 10000.0000  10000.0000
          ll | -1.426e+04  -1.426e+04
---------------------------------------
legend: b/se/t
.
. /*
> * Check: Nonrobust standard errors of NLS b1int and b2x:
> di seb1intnlsnr " " seb2xnlsnr
> * Robust standard errors of NLS estimates of b1int and b2x:
> di seb1intnls " " seb2xnls
> */
. * Alternative Robust standard errors of NLS estimates of b1int and b2x:
--------------------------
    Variable |   bfgnls
-------------+------------
       b1int |    1.9840
             |    0.0148
             |    134.30
         b2x |   -0.9907
             |    0.0100
             |    -98.97
-------------+------------
           N | 10000.0000
          ll |
--------------------------
legend: b/se/t
. * Table 5.7 FGNLS robust standard errors
. estimates table bfgnlscheck bfgnlsrobust, b(%10.4f) se(%10.4f) t stats(N ll)
---------------------------------------
    Variable | bfgnlsch~k  bfgnlsro~t
-------------+-------------------------
       d1opt |   -0.0000     -0.0000
             |    0.0148      0.0146
             |     -0.00       -0.00
       d2opt |    0.0000      0.0000
             |    0.0100      0.0101
             |      0.00        0.00
-------------+-------------------------
           N | 10000.0000  10000.0000
          ll |  4887.7042   4887.7042
---------------------------------------
legend: b/se/t
.
. * (4) Print the various log-likelihoods and R-squared
. * Log-likelihood for NLS and FGNLS
. di "LLnls: " LLnls " LLfgnls: " LLfgnls " LLwnls: " LLwnls
LLnls: -232.97524 LLfgnls: -208.71965 LLwnls: -208.93381
. * R-squared for MLE, NLS and FGNLS
. di "Rsqnls: " Rsqnls " Rsqfgnls: " Rsqfgnls " Rsqwnls: " Rsqwnls
Rsqnls: .39134462 Rsqfgnls: .39056605 Rsqwnls: .39017996
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section2\mma05p2nls.txt
log type: text
closed on: 17 May 2005, 13:53:34
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma05p3nlsbyml.txt
.
. * Descriptive Statistics
. describe
Contains data
  obs:        10,000
 vars:             2
 size:       120,000 (98.8% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
y               float  %9.0g
x               float  %9.0g
-------------------------------------------------------------------------------
Sorted by:
     Note:  dataset has changed since last saved
. summarize
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           y |     10000    .6194352    1.291416   .0000445   30.60636
           x |     10000    1.014313    1.004905  -2.895741   4.994059
.
. ********** DO THE ANALYSIS: NLS using STATA COMMAND ML **********
.
. * (1) NLS ESTIMATION USING STATA ML COMMAND (maximum likelihood)
.
. * Advantage: ml command has robust standard errors as an option
.
. * The NLS estimator minimizes SUM_i (y_i - g(x_i'b))^2.
. * Here let g(x'b) = exp(a + b*x) = exp(b1int + b2x*x) say.
. * In fact for this dgp E[y] = exp(-(a + bx)) so sign reversal for the mean.
.
. * To adjust this code to other NLS problems
. * (a) If more regressors, say x1 x2 and x3, replace ml model line with
. *       ml model lf mlexp (y = x1 x2 x3) / sigma
. * (b) If different functional form for mean, say g(x'b), redefine `res' as
. *       `res' = $ML_y1 - g(`theta')
. * (c) If functional form for mean is not single-index then the program
. * will become considerably more complicated with more args.
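Maximizing a normal log-likelihood over the mean parameters is the same as minimizing the residual sum of squares, and the ML estimate of sigma is then sqrt(SSR/N). The sketch below checks this arithmetic using the residual SS that the nl command reported for the same data in mma05p2nls (10149.8633 with N = 10000). The function name `concentrated_normal_loglik` is hypothetical and only illustrates the calculation.

```python
import math

# Values reported by the nl run on the same data (taken from the mma05p2nls log).
n_obs = 10000
ssr = 10149.8633

def concentrated_normal_loglik(ssr, n):
    """Normal log-likelihood evaluated at the ML estimate sigma^2 = SSR / n."""
    sigma2 = ssr / n
    loglik = -0.5 * n * (math.log(2.0 * math.pi) + math.log(sigma2) + 1.0)
    return math.sqrt(sigma2), loglik

sigma_hat, ll_hat = concentrated_normal_loglik(ssr, n_obs)
print(sigma_hat, ll_hat)
```

The two numbers can be compared with the sigma coefficient (1.007465) and the log pseudo-likelihood (-14263.761) that the ml command reports in this log.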
.
. * (1A) The program "mlexp" defines the objective function
. program define mlexp
  1.   version 8.0
  2.   args lnf theta sigma    /* theta contains b1int and b2x; sigma is st.dev.of error */
  3.   tempvar res             /* create to shorten expression for lnf */
  4.   quietly gen double `res' = $ML_y1 - exp(-`theta')
0.0000
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
eq1          |
           x |  -.9574683   .0093471  -102.43   0.000    -.9757883   -.9391483
       _cons |   1.887562   .0295701    63.83   0.000     1.829606    1.945519
-------------+----------------------------------------------------------------
sigma        |
       _cons |   1.007465   .0071239   141.42   0.000     .9935028    1.021428
------------------------------------------------------------------------------
. estimates store bnlsbymle
.
. * (1C) Adding ,robust gives Heteroskedastic robust standard errors
. ml model lf mlexp (y = x) / sigma, robust
. ml search
initial:       log pseudo-likelihood =     -<inf>  (could not be evaluated)
feasible:      log pseudo-likelihood = -35613.002
improve:       log pseudo-likelihood = -17310.807
rescale:       log pseudo-likelihood = -17310.807
rescale eq:    log pseudo-likelihood = -16777.282
. ml maximize
initial:       log pseudo-likelihood = -16777.282
rescale:       log pseudo-likelihood = -16777.282
rescale eq:    log pseudo-likelihood = -16777.282
Iteration 0: log pseudo-likelihood = -16777.282 (not concave)
Iteration 1: log pseudo-likelihood = -16097.359
Iteration 2: log pseudo-likelihood = -16013.711
Iteration 3: log pseudo-likelihood = -14412.885
Iteration 4: log pseudo-likelihood = -14264.159
Iteration 5: log pseudo-likelihood = -14263.761
Iteration 6: log pseudo-likelihood = -14263.761
                                                  Number of obs   =      10000
                                                  Wald chi2(1)    =     288.75
Log pseudo-likelihood = -14263.761                Prob > chi2     =     0.0000
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
eq1          |
           x |  -.9574683   .0563463   -16.99   0.000    -1.067905   -.8470317
       _cons |   1.887562    .127832    14.77   0.000     1.637016    2.138108
-------------+----------------------------------------------------------------
sigma        |
       _cons |   1.007465   .0561714    17.94   0.000     .8973713    1.117559
------------------------------------------------------------------------------
. estimates store bnlsbymlerobust
.
. ***** PRINT RESULTS: Third column of Table 5.7 p.161 **********
.
. * (1) NLS by ML - nonrobust and robust standard errors
. * The coefficient estimates are exactly the same as those using the nl command
. * The estimated standard errors are close - within 10% of those using the nl command
. * Table 5.7 reports the standard errors using the nl command
. estimates table bnlsbymle bnlsbymlerobust, b(%10.4f) se(%10.4f) t stats(N ll)
---------------------------------------
    Variable | bnlsbymle   bnlsbyml~t
-------------+-------------------------
eq1          |
           x |   -0.9575     -0.9575
             |    0.0093      0.0563
             |   -102.43      -16.99
       _cons |    1.8876      1.8876
             |    0.0296      0.1278
             |     63.83       14.77
-------------+-------------------------
sigma        |
       _cons |    1.0075      1.0075
             |    0.0071      0.0562
             |    141.42       17.94
-------------+-------------------------
Statistics   |
           N | 10000.0000  10000.0000
          ll | -1.426e+04  -1.426e+04
---------------------------------------
legend: b/se/t
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section2\mma05p3nlsbyml.txt
log type: text
closed on: 17 May 2005, 13:54:27
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma05p4margeffects.txt
log type: text
opened on: 17 May 2005, 13:57:02
.
. ********** OVERVIEW OF MMA05P4MARGINALEFFECTS.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 5.9.4 pp.162-3
. * Marginal effects analysis for a nonlinear model (here exponential regression).
.
. * Provides
. * (1) Sample average marginal effect using derivative
. * (2) Sample average marginal effect using first difference
. * (3) Marginal effect evaluated at the sample mean
. * (4) Marginal effects (1)-(3) when model estimated by Stata ml command
. * using generated data (see below)
.
. * Related programs:
. *   mma05p1mle.do      OLS and MLE for the same data
. *   mma05p2nls.do      NLS, WNLS, FGNLS for same data using nl command
. *   mma05p3nlsbyml.do  NLS for same data using ml command
.
. gen xoriginal = x
. replace x = x+0.0001*sdx
(10000 real changes made)
. predict y1, mean time
. gen dEydxnumericalderivative = (y1 - y0)/(0.0001*sdx)
. quietly sum dEydxnumericalderivative
. scalar mesand = r(mean)
. di "Sample average marginal effect by numerical derivative = " mesand
Sample average marginal effect by numerical derivative = .60949044
. replace x = xoriginal
(10000 real changes made)
. drop xoriginal sdx y0 y1
.
. ** (2) FINITE DIFFERENCE METHOD FOR SAMPLE AVERAGE MARGINAL EFFECT
.
. streg x, distribution(exponential) nohr /* y is dependent variable */
failure _d: 1 (meaning all fail)
analysis time _t: y
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:
                                                   Number of obs   =     10000
                                                   LR chi2(1)      =  10003.63
Log likelihood =  -15752.19                        Prob > chi2     =    0.0000
------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |  -.9896276   .0098692  -100.27   0.000    -1.008971   -.9702842
       _cons |   1.982921   .0141496   140.14   0.000     1.955188    2.010654
------------------------------------------------------------------------------
.
. * The following method can be used following many stata estimation commands
. * 1. Predict y using sample data.
. * Need to say predict the mean as this is not the streg default.
. predict y0, mean time
. * 2. Predict y with regressor of x increased by one
. gen xoriginal = x
. replace x = x+1
(10000 real changes made)
. predict y1, mean time
. replace x = xoriginal /* Put x back to initial value for later analysis */
(10000 real changes made)
. * 3. Calculate difference
. gen dEydxfinitedifference = y1 - y0
. quietly sum dEydxfinitedifference
. scalar mesafd = r(mean)
. di "Sample average marginal effect by first differences = " mesafd
Sample average marginal effect by first differences = 1.0414485
. drop xoriginal y0 y1
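The gap between the first-difference figure and the derivative figure reported earlier in this log can be understood analytically for the exponential mean. The derivation below is a sketch using the model's own notation, with a and b the coefficients on _cons and x in the streg output.

```latex
\begin{align*}
E[y \mid x] &= \exp(-(a + bx)) \\
\frac{\partial E[y \mid x]}{\partial x} &= -b \exp(-(a + bx)) = -b\, E[y \mid x] \\
E[y \mid x + 1] - E[y \mid x] &= \left(e^{-b} - 1\right) E[y \mid x]
\end{align*}
```

With b close to -0.99, the factor e^{-b} - 1 is roughly 1.69 while -b is roughly 0.99, so a one-unit change gives an effect about 1.7 times the derivative. This is in line with the reported sample averages of 1.0414 and 0.6095.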
.
. ** (3) DERIVATIVE METHOD FOR MARGINAL EFFECT AT SAMPLE MEAN
.
. * (3A) Use Stata command mfx
. quietly streg x, distribution(exponential) nohr
. * Need to tell mfx to predict the mean as this is not the streg default.
. mfx compute, dydx predict(mean time)
Marginal effects after ereg
y = predicted mean _t (predict, mean time)
= .37563828
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
       x |    .371742      .00525   70.81   0.000   .361452  .382032   1.01431
------------------------------------------------------------------------------
. di "Marginal effect by analytical derivative at mean of x using mfx: "
Marginal effect by analytical derivative at mean of x using mfx:
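The mfx figures can be cross-checked by hand from the streg coefficients. The sketch below is an illustration rather than part of the original program: the function names `mean_y` and `marginal_effect` are hypothetical, and the inputs are copied from the streg and mfx output in this log.

```python
import math

# Coefficients from the streg output and the evaluation point X from the mfx output.
a_hat = 1.982921
b_hat = -0.9896276
x_bar = 1.01431

def mean_y(x, a=a_hat, b=b_hat):
    """Conditional mean E[y|x] = exp(-(a + b*x)) for the exponential model."""
    return math.exp(-(a + b * x))

def marginal_effect(x, a=a_hat, b=b_hat):
    """Analytical derivative dE[y|x]/dx = -b * exp(-(a + b*x))."""
    return -b * mean_y(x, a, b)

pred_at_mean = mean_y(x_bar)
me_at_mean = marginal_effect(x_bar)
print(pred_at_mean, me_at_mean)
```

The two values should agree with the predicted mean (.37563828) and dy/dx (.371742) reported by mfx, up to the rounding of the inputs.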
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma06p2Theil.txt
log type: text
opened on: 18 May 2005, 17:45:50
.
. ********** OVERVIEW OF MMA06P2THEIL.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * NOTE: Stata does not have a NL2SLS command
.
. * Chapter 6.5.4 nonlinear 2SLS example.
. * Table 6.4 partial only
. * (1) OLS        inconsistent
. * (2) NL2SLS     consistent     NOT INCLUDED AS STATA DOES NOT DO
. * (3) Wrong 2SLS inconsistent
.
. * To run this program you need data set
. *   mma06p1nl2sls.asc
. * generated by Limdep program MMA06P1NL2SLS.LIM
.
. * Some of the analysis is done in Limdep which (unlike Stata) has
. * an NL2SLS command
.
. ********** SETUP **********
.
. set more off
. version 8.0
.
. ********** READ DATA and SUMMARIZE **********
.
. * Model is y = 1*x^2 + u
. *           x = 1*z + v
. * where u and v are joint normal (0,0,1,1,0.8)
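The direction of the OLS inconsistency can be worked out from the stated design. The derivation below is a sketch under two assumptions that are not spelled out in the comments: that (0,0,1,1,0.8) means zero means, unit variances and correlation 0.8, and that z = 1 for every observation, as the summarize output in this log indicates.

```latex
\begin{align*}
\operatorname{plim} \hat\beta_{OLS} &= \frac{E[x^{2} y]}{E[x^{4}]} = 1 + \frac{E[x^{2} u]}{E[x^{4}]} \\
E[x^{2} u] &= E[(1 + 2v + v^{2})\,u] = 2\,E[vu] = 2(0.8) = 1.6 \\
E[x^{4}] &= 1 + 6 + 3 = 10 \quad \text{for } x \sim N[1, 1] \\
\operatorname{plim} \hat\beta_{OLS} &= 1 + 1.6/10 = 1.16
\end{align*}
```

Under these assumptions OLS overstates the true coefficient of 1, and the OLS coefficient reported in this log (1.189) also lies above 1.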
.
. infile y x xsq z zsq u v using mma06p1nl2sls.asc
(200 observations read)
.
. * Descriptive Statistics
. describe
Contains data
  obs:           200
 vars:             7
 size:         6,400 (99.9% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
y               float  %9.0g
x               float  %9.0g
xsq             float  %9.0g
z               float  %9.0g
zsq             float  %9.0g
u               float  %9.0g
v               float  %9.0g
-------------------------------------------------------------------------------
Sorted by:
     Note:  dataset has changed since last saved
. summarize
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           y |       200    1.632794    2.418096  -2.332656   9.354863
           x |       200    .9970513    .8330302  -1.908285   2.696363
         xsq |       200    1.684581    1.638509   .0000948   7.270374
           z |       200           1           0          1          1
         zsq |       200           1           0          1          1
-------------+--------------------------------------------------------
           u |       200   -.0517871    .9427286  -2.816687   2.202356
           v |       200   -.0029487    .8330302  -2.908285   1.696363
.
. ********** DO THE ANALYSIS: ESTIMATE MODELS **********
.
. * (1) OLS is inconsistent (first column of Table 6.4)
. regress y xsq, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  1,   199) = 2250.83
       Model |  1558.96322     1  1558.96322           Prob > F      =  0.0000
    Residual |   137.83055   199  .692615831           R-squared     =  0.9188
-------------+------------------------------           Adj R-squared =  0.9184
       Total |  1696.79377   200  8.48396883           Root MSE      =  .83224
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         xsq |   1.189495   .0250721    47.44   0.000     1.140054    1.238936
------------------------------------------------------------------------------
. estimates store olswrong
200
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         xsq |   1.189495   .0191687    62.05   0.000     1.151695    1.227295
------------------------------------------------------------------------------
. estimates store olswrongrob
.
. * (2) NL2SLS command Stata does not have
. * See LIMDEP program MMA06P1NL2SLS.LIM
.
. * (3A) Theil's 2sls where first regress x on z is inconsistent
. regress x z, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  1,   199) =  286.51
       Model |  198.822258     1  198.822258           Prob > F      =  0.0000
    Residual |  138.093918   199  .693939288           R-squared     =  0.5901
-------------+------------------------------           Adj R-squared =  0.5881
       Total |  336.916176   200  1.68458088           Root MSE      =  .83303
------------------------------------------------------------------------------
           x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           z |   .9970513   .0589041    16.93   0.000     .8808949    1.113208
------------------------------------------------------------------------------
. predict xhat
(option xb assumed; fitted values)
. gen xhatsq = xhat*xhat
. regress y xhatsq, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  1,   199) =   91.19
       Model |  533.203113     1  533.203113           Prob > F      =  0.0000
    Residual |  1163.59065   199  5.84718921           R-squared     =  0.3142
-------------+------------------------------           Adj R-squared =  0.3108
       Total |  1696.79377   200  8.48396883           Root MSE      =  2.4181
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      xhatsq |   1.642466   .1719981     9.55   0.000     1.303293    1.981638
------------------------------------------------------------------------------
. estimates store ivwrong
.
. ********** DISPLAY KEY RESULTS Table 6.4 p.199 **********
.
. * Table 6.4 p.199
. estimates table olswrong olswrongrob ivwrong, b(%8.3f) se stats(N r2) keep(xsq xhatsq)
----------------------------------------------
    Variable | olswrong   olswro~b   ivwrong
-------------+--------------------------------
         xsq |    1.189      1.189
             |    0.025      0.019
      xhatsq |                          1.642
             |                          0.172
-------------+--------------------------------
           N |  200.000    200.000    200.000
          r2 |    0.919      0.919      0.314
----------------------------------------------
legend: b/se
.
. * (3B) IV with instrument xsq for zsq should work but Stata cannot do
. ivreg y (xsq = xsq), noconstant
Instrumental variables (2SLS) regression
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  1,   199) =       .
       Model |  1558.96322     1  1558.96322           Prob > F      =       .
    Residual |   137.83055   199  .692615831           R-squared     =       .
-------------+------------------------------           Adj R-squared =       .
       Total |  1696.79377   200  8.48396883           Root MSE      =  .83224
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         xsq |   1.189495   .0250721    47.44   0.000     1.140054    1.238936
------------------------------------------------------------------------------
Instrumented:  xsq
Instruments:   xsq
------------------------------------------------------------------------------
. corr xsq xsq
(obs=200)
             |      xsq      xsq
-------------+------------------
         xsq |   1.0000
         xsq |   1.0000   1.0000
. corr xsq z
(obs=200)
             |      xsq        z
-------------+------------------
         xsq |   1.0000
           z |        .        .
. gen one = 1
. regress y one, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  1,   199) =   91.19
       Model |  533.203113     1  533.203113           Prob > F      =  0.0000
    Residual |  1163.59065   199  5.84718921           R-squared     =  0.3142
-------------+------------------------------           Adj R-squared =  0.3108
       Total |  1696.79377   200  8.48396883           Root MSE      =  2.4181
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         one |   1.632794   .1709852     9.55   0.000     1.295618    1.969969
------------------------------------------------------------------------------
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section2\mma06p2Theil.txt
log type: text
closed on: 18 May 2005, 17:45:50
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma06p2twostage.txt
log type: text
opened on: 18 May 2005, 17:59:06
.
. ********** OVERVIEW OF MMA06P2TWOSTAGE.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * NOTE: Stata does not have a NL2SLS command
.
. * Chapter 6.5.4 nonlinear 2SLS example on pages 198-9.
.
. * Table 6.4 partial only
. * (1) OLS        inconsistent
. * (2) NL2SLS     consistent     NOT INCLUDED AS STATA DOES NOT DO
. * (3) Twostage   Here 2SLS using Theil's interpretation of 2SLS is inconsistent
.
. * To run this program you need data set
. *   mma06p1nl2sls.asc
. * generated by Limdep program MMA06P1NL2SLS.LIM
.
. * Some of the analysis is done in Limdep which (unlike Stata) has
. * an NL2SLS command
.
. ********** SETUP **********
.
. set more off
. version 8.0
.
. ********** READ DATA and SUMMARIZE **********
.
. * Model is y = 1*x^2 + u
. *           x = 1*z + v
. * where u and v are joint normal (0,0,1,1,0.8)
.
. infile y x xsq z zsq u v using mma06p1nl2sls.asc
(200 observations read)
.
. * Descriptive Statistics
. describe
Contains data
  obs:           200
 vars:             7
 size:         6,400 (99.9% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
y               float  %9.0g
x               float  %9.0g
xsq             float  %9.0g
z               float  %9.0g
zsq             float  %9.0g
u               float  %9.0g
v               float  %9.0g
-------------------------------------------------------------------------------
Sorted by:
     Note:  dataset has changed since last saved
. summarize
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           y |       200    1.632794    2.418096  -2.332656   9.354863
           x |       200    .9970513    .8330302  -1.908285   2.696363
         xsq |       200    1.684581    1.638509   .0000948   7.270374
           z |       200           1           0          1          1
         zsq |       200           1           0          1          1
-------------+--------------------------------------------------------
           u |       200   -.0517871
           v |       200   -.0029487
.
. ********** DO THE ANALYSIS: ESTIMATE MODELS **********
.
. * (1) OLS is inconsistent (first column of Table 6.4)
. regress y xsq, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  1,   199) = 2250.83
       Model |  1558.96322     1  1558.96322           Prob > F      =  0.0000
    Residual |   137.83055   199  .692615831           R-squared     =  0.9188
-------------+------------------------------           Adj R-squared =  0.9184
       Total |  1696.79377   200  8.48396883           Root MSE      =  .83224
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         xsq |   1.189495   .0250721    47.44   0.000     1.140054    1.238936
------------------------------------------------------------------------------
. estimates store olswrong
. regress y xsq, noconstant robust
Regression with robust standard errors                 Number of obs =     200
                                                       F(  1,   199) = 3850.71
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.9188
                                                       Root MSE      =  .83224
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         xsq |   1.189495   .0191687    62.05   0.000     1.151695    1.227295
------------------------------------------------------------------------------
. estimates store olswrongrob
.
. * (2) NL2SLS command Stata does not have
. * See LIMDEP program MMA06P1NL2SLS.LIM
. * See also code further down
.
. * (3A) Theil's 2sls where first regress x on z
. *      and then use xhat^2 as instrument for x^2 is inconsistent
.
. regress x z, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  1,   199) =  286.51
       Model |  198.822258     1  198.822258           Prob > F      =  0.0000
    Residual |  138.093918   199  .693939288           R-squared     =  0.5901
-------------+------------------------------           Adj R-squared =  0.5881
       Total |  336.916176   200  1.68458088           Root MSE      =  .83303
------------------------------------------------------------------------------
           x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           z |   .9970513   .0589041    16.93   0.000     .8808949    1.113208
------------------------------------------------------------------------------
. predict xhat
(option xb assumed; fitted values)
. gen xhatsq = xhat*xhat
. regress y xhatsq, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  1,   199) =   91.19
       Model |  533.203113     1  533.203113           Prob > F      =  0.0000
    Residual |  1163.59065   199  5.84718921           R-squared     =  0.3142
-------------+------------------------------           Adj R-squared =  0.3108
       Total |  1696.79377   200  8.48396883           Root MSE      =  2.4181
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      xhatsq |   1.642466   .1719981     9.55   0.000     1.303293    1.981638
------------------------------------------------------------------------------
. estimates store twostage
.
. ********** DISPLAY KEY RESULTS Table 6.4 p.199 **********
.
. * Table 6.4 p.199 first and third columns
. estimates table olswrong twostage, b(%8.3f) se stats(N r2) keep(xsq xhatsq)
----------------------------------------
    Variable | olswrong    twostage
-------------+--------------------------
         xsq |    1.189
             |    0.025
      xhatsq |                1.642
             |                0.172
-------------+--------------------------
           N |  200.000     200.000
          r2 |    0.919       0.314
----------------------------------------
                            legend: b/se
.
. ********** FURTHER ANALYSIS **********
.
. * For this particular example there are ways to get linear IV to work
. * as the problem is not very nonlinear
.
. * (2A) Regress xsq on z to obtain xsqhat, then regress y on xsqhat.
. *      This gives the NL2SLS estimator, though not the correct standard errors.
.
. * Note that we get the estimate 0.969, which is correct; Table 6.4 had a typo.
. regress xsq z, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  1,   199) =  211.41
       Model |  567.562553     1  567.562553           Prob > F      =  0.0000
    Residual |  534.257348   199  2.68471029           R-squared     =  0.5151
-------------+------------------------------           Adj R-squared =  0.5127
       Total |   1101.8199   200  5.50909951           Root MSE      =  1.6385
------------------------------------------------------------------------------
         xsq |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           z |   1.684581   .1158601    14.54   0.000      1.45611    1.913052
------------------------------------------------------------------------------
. predict xsqhat
(option xb assumed; fitted values)
. regress y xsqhat, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  1,   199) =   91.19
       Model |  533.203113     1  533.203113           Prob > F      =  0.0000
    Residual |  1163.59065   199  5.84718921           R-squared     =  0.3142
-------------+------------------------------           Adj R-squared =  0.3108
       Total |  1696.79377   200  8.48396883           Root MSE      =  2.4181
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      xsqhat |   .9692582   .1015002     9.55   0.000     .7691043    1.169412
------------------------------------------------------------------------------
.
. * (2B) IV with instrument z for xsq should work, but Stata cannot do it,
. *      apparently because here z = 1, which has no variation
. ivreg y (xsq = z), noconstant
note: z dropped due to collinearity
------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma07p1mltests.txt
log type: text
opened on: 17 May 2005, 13:59:20
.
. ********** OVERVIEW OF MMA07P1MLTESTS.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 7.4 pp.241-3
. * Likelihood-based hypothesis tests
.
. * Implements the three likelihood-based tests presented in Table 7.1:
. * Wald test
. * LR test
. * LM test direct
. * LM test via auxiliary regression
. * for a Poisson model with simulated data (see below).
.
. * NOTE: To implement this program requires:
. *   the free Stata add-on rndpoix
. * To obtain this, in Stata give command: search rndpoix
. * If you don't want to do this, instead use the data set
.
. ********** SETUP ***********
.
. version 8
. set more off
.
. ********** GENERATE DATA ***********
.
. * Model is
. * y ~ Poisson[exp(b1 + b2*x2 + b3*x3 + b4*x4)]
. * where
. * x2, x3 and x4 are iid ~ N[0,1]
. * and b1=0, b2=0.1, b3=0.1 and b4=0.1
.
. set seed 10001
. set obs 200
obs was 0, now 200
. scalar b1 = 0
. scalar b2 = 0.1
. scalar b3 = 0.1
. scalar b4 = 0.1
.
. * Generate regressors
. gen x2 = invnorm(uniform())
. gen x3 = invnorm(uniform())
. gen x4 = invnorm(uniform())
.
. * Generate y
. gen mupoiss = exp(b1+b2*x2+b3*x3+b4*x4)
. * The next requires Stata add-on. In Stata: search rndpoix
. rndpoix(mupoiss)
( Generating ....... )
Variable xp created.
. gen y = xp
.
. sum
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          x2 |       200   -.0091098    1.010072  -2.857666   2.149822
          x3 |       200   -.1459839    1.109521  -3.086754   3.111421
          x4 |       200   -.0325314    .9674748  -2.852186   2.379461
     mupoiss |       200    1.000447    .1993649   .6191922   1.903112
          xp |       200        .845     .951579          0          6
-------------+--------------------------------------------------------
           y |       200        .845     .951579          0          6
.
. * Write data to a text (ascii) file so can use with programs other than Stata
. outfile y x2 x3 x4 using mma07p1mltests.asc, replace
.
. ********** ANALYSIS: LIKELIHOOD-BASED HYPOTHESIS TESTS ***********
.
. * Hypotheses to test are
. * (A) Single exclusion: b3 = 0
. * (B) Multiple exclusion: b3 = 0, b4 = 0
. * (C) Linear:             b3 = b4
. * (D) Nonlinear:          b3/b4 = 1
.
                                                  Number of obs   =        200
                                                  LR chi2(3)      =       8.30
                                                  Prob > chi2     =     0.0401
Log likelihood = -238.77153                       Pseudo R2       =     0.0171
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x2 |  -.0275702   .0767909    -0.36   0.720    -.1780775    .1229371
          x3 |   .1630037   .0670848     2.43   0.015     .0315199    .2944874
          x4 |   .1026568   .0802139     1.28   0.201    -.0545595    .2598732
       _cons |  -.1653238   .0773479    -2.14   0.033     -.316923   -.0137246
------------------------------------------------------------------------------
.
. * (1A) Stata Wald test command
. test (x3=0) (x4=0)
( 1) [y]x3 = 0
( 2) [y]x4 = 0
chi2( 2) = 8.57
Prob > chi2 = 0.0138
.
. * (1B) Wald test done manually
. * Use h'[R*V*R']-inv*h.
. * Details below will change for each example.
. * In particular, for nonlinear restrictions more work in forming R
. * Note that Stata puts the intercept last, not first.
. * So here the second and third elements of b are set to zero.
. matrix bfull = e(b)                   /* 1xq row vector */
. matrix vfull = e(V)                   /* qxq matrix */
. matrix h = (bfull[1,2]\bfull[1,3])    /* hx1 vector */
. matrix R = (0,1,0,0\0,0,1,0)          /* h x q matrix */
                                                  Number of obs   =        200
                                                  LR chi2(3)      =       8.30
                                                  Prob > chi2     =     0.0401
Log likelihood = -238.77153                       Pseudo R2       =     0.0171
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x2 |  -.0275702   .0767909    -0.36   0.720    -.1780775    .1229371
          x3 |   .1630037   .0670848     2.43   0.015     .0315199    .2944874
          x4 |   .1026568   .0802139     1.28   0.201    -.0545595    .2598732
       _cons |  -.1653238   .0773479    -2.14   0.033     -.316923   -.0137246
------------------------------------------------------------------------------
. estimates store unrestricted
. scalar llunrest = e(ll)
. poisson y x2
Iteration 0: log likelihood = -242.92271
Iteration 1: log likelihood = -242.92271 (backed up)
Poisson regression                                Number of obs   =        200
                                                  LR chi2(1)      =       0.00
                                                  Prob > chi2     =     0.9608
Log likelihood = -242.92271                       Pseudo R2       =     0.0000
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x2 |  -.0037493   .0763386    -0.05   0.961    -.1533701    .1458716
       _cons |  -.1684599   .0769294    -2.19   0.029    -.3192388   -.0176811
------------------------------------------------------------------------------
. estimates store restrictedB
. scalar llrestB = e(ll)
.
. * (2A) Stata likelihood ratio test
. lrtest unrestricted restrictedB
likelihood-ratio test                                  LR chi2(2)  =      8.30
(Assumption: restrictedB nested in unrestricted)       Prob > chi2 =    0.0157
.
. * (2B) Likelihood ratio test done manually
. scalar LRB = -2*(llrestB-llunrest)
. di "LR " LRB
LR 8.3023503
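The manual LR computation above can be checked outside Stata. The following is a minimal Python sketch, not part of the original program. It takes the two log likelihoods as printed in this log (so it agrees with Stata only up to the rounding of those printed values) and uses the closed form exp(-x/2) for the upper tail of a chi-square with 2 degrees of freedom. All variable and function names are illustrative.

```python
import math

# Log likelihoods copied from the Poisson output printed in this log
ll_unrestricted = -238.77153   # poisson y x2 x3 x4
ll_restricted = -242.92271     # poisson y x2  (imposes b3 = 0, b4 = 0)


def chi2_tail_df2(x):
    """Upper-tail probability of a chi-square(2) variate: exp(-x/2)."""
    return math.exp(-x / 2.0)


# LR = -2 * (lnL restricted - lnL unrestricted), as in the Stata scalar LRB
lr_stat = -2.0 * (ll_restricted - ll_unrestricted)
p_value = chi2_tail_df2(lr_stat)

print(f"LR = {lr_stat:.4f}, p-value = {p_value:.4f}")
```

The closed-form tail applies only because the test has exactly 2 restrictions; other degrees of freedom would need a general chi-square routine.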
.
. * (3) LM test via direct computation requires estimating only the restricted model.
.
. * For exclusion restrictions in the Poisson, from 7.6.2
. * LM = dlnL/db * V[b]-inv * dlnL/db where b evaluated at restricted
. * = [Sum_i u_i*x_i]'[Sum_i exp(x_i'b)*x_i*x_i']-inv[Sum_i u_i*x_i]
. * First calculate Sum_i u_i*x_i' : a 1x4 row vector
.
. quietly poisson y x2
. predict yhatrest
(option n assumed; predicted number of events)
. gen u = y - yhatrest
. gen one = 1
. matrix vecaccum dlnL_db = u one x2 x3 x4, noconstant
. * Then calculate Sum_i exp(x_i'b)*x_i*x_i'
. gen trx1 = sqrt(yhatrest)
. gen trx2 = sqrt(yhatrest)*x2
. gen trx3 = sqrt(yhatrest)*x3
. gen trx4 = sqrt(yhatrest)*x4
. matrix accum Vb = trx1 trx2 trx3 trx4, noconstant
(obs=200)
. matrix LMdirect = dlnL_db*syminv(Vb)*dlnL_db'
. matrix list dlnL_db
dlnL_db[1,4]
           one          x2          x3          x4
u    1.192e-07  -4.632e-08   37.578639   19.933299
. matrix list Vb
symmetric Vb[4,4]
            trx1        trx2        trx3        trx4
trx1         169
trx2  -2.1828434   171.62608
trx3  -24.733563   16.929495   210.68156
trx4   -5.561359     17.0457   23.027167   157.58531
. matrix list LMdirect
symmetric LMdirect[1,1]
           u
u  8.5750886
. scalar LMdirectB = LMdirect[1,1]
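The quadratic form behind LMdirect can be reproduced from the two matrices listed above. The sketch below is a hedged Python illustration, not the authors' code: it copies dlnL_db and the lower triangle of Vb from this log, solves Vb * a = s with a small hand-written Gaussian elimination (standing in for Stata's syminv), and forms s'a. The helper name solve and the variable names are assumptions made for this example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix [a | b]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


# Score vector Sum_i u_i*x_i at the restricted estimates (matrix dlnL_db in the log)
score = [1.192e-07, -4.632e-08, 37.578639, 19.933299]

# Lower triangle of Sum_i exp(x_i'b)*x_i*x_i' (matrix Vb in the log)
lower = [
    [169.0],
    [-2.1828434, 171.62608],
    [-24.733563, 16.929495, 210.68156],
    [-5.561359, 17.0457, 23.027167, 157.58531],
]
# Fill in the full symmetric 4x4 matrix from its lower triangle
info = [[lower[max(i, j)][min(i, j)] for j in range(4)] for i in range(4)]

# LM = s' * inverse(info) * s, computed as s' * (solution of info * a = s)
lm = sum(s * a for s, a in zip(score, solve(info, score)))
print(f"LM direct = {lm:.4f}")
```

Solving the linear system rather than inverting the matrix explicitly gives the same quadratic form and keeps the example short.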
.
. * (4) LM test via auxiliary regression
.
. * N uncentered Rsq from regress (noconstant) 1 on the scores
. * Begin by computing the unrestricted scores at the restricted estimates.
. * This varies from problem to problem.
. * In general could compute lnf(y) at current parameters
. * and then get numerical derivative when perturb beta a little.
. * Here use analytical derivative.
. * s_j = dlnf(y)/db_j = (y-exp(x'b))*x_j for the Poisson
.
. drop yhatrest
. quietly poisson y x2
. predict yhatrest
(option n assumed; predicted number of events)
. gen s1 = (y-yhatrest)*1
. gen s2 = (y-yhatrest)*x2
. gen s3 = (y-yhatrest)*x3
. gen s4 = (y-yhatrest)*x4
. regress one s1 s2 s3 s4, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  4,   196) =    2.36
       Model |  9.18577727     4  2.29644432           Prob > F      =  0.0549
    Residual |  190.814223   196  .973541953           R-squared     =  0.0459
-------------+------------------------------           Adj R-squared =  0.0265
       Total |         200   200           1           Root MSE      =  .98668
------------------------------------------------------------------------------
         one |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          s1 |  -.0265153   .0748092    -0.35   0.723    -.1740497     .121019
          s2 |  -.0102806   .0809418    -0.13   0.899    -.1699093    .1493481
          s3 |   .1794153   .0697359     2.57   0.011     .0418862    .3169444
          s4 |   .1225885   .0821671     1.49   0.137    -.0394566    .2846336
------------------------------------------------------------------------------
. * LM equals N times uncentered Rsq
. scalar LMauxB = e(N)*e(r2)
. * Check: LM equals explained sum of squares
. scalar LMauxB2 = e(mss)
. di "LMauxB " LMauxB " LMauxB2 " LMauxB2
LMauxB 9.1857773 LMauxB2 9.1857773
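Why the two scalars above coincide: the dependent variable is a column of ones, so the uncentered total sum of squares equals N, and N times the uncentered R-squared is just the model sum of squares. A small Python sketch of that arithmetic follows, using only the sums of squares printed in the regression table above. The variable names are illustrative and the chi-square(2) tail again uses the closed form exp(-x/2).

```python
import math

n_obs = 200
model_ss = 9.18577727   # "Model" sum of squares from the auxiliary regression
total_ss = 200.0        # uncentered total: sum of 1^2 over the 200 observations

r2_uncentered = model_ss / total_ss
lm_aux = n_obs * r2_uncentered          # equals model_ss because total_ss == n_obs

p_value = math.exp(-lm_aux / 2.0)       # upper tail of chi-square(2)
print(f"LM aux = {lm_aux:.4f}, p-value = {p_value:.5f}")
```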
.
. * (5) DISPLAY RESULTS
.
. estimates table unrestricted restrictedB, se stats(N ll r2) b(%8.3f)
----------------------------------------
    Variable | unrest~d    restri~B
-------------+--------------------------
          x2 |   -0.028      -0.004
             |    0.077       0.076
          x3 |    0.163
             |    0.067
          x4 |    0.103
             |    0.080
       _cons |   -0.165      -0.168
             |    0.077       0.077
-------------+--------------------------
           N |  200.000     200.000
          ll | -238.772    -242.923
          r2 |
----------------------------------------
                            legend: b/se
. * Wald test using stata default Poisson variance matrix
. di "WaldB " WaldB " p-value " chi2tail(2,WaldB)
WaldB 8.5701855 p-value .01377234
. * LR test using Poisson log-likelihoods
. di " LRB " LRB " p-value " chi2tail(2,LRB)
LRB 8.3023503 p-value .0157459
. * LM test direct
. di " LMdirectB " LMdirectB " p-value " chi2tail(2,LMdirectB)
LMdirectB 8.5750886 p-value .01373862
. * LM test by auxiliary regression
. di " LMauxB " LMauxB " p-value " chi2tail(2,LMauxB)
LMauxB 9.1857773 p-value .01012357
.
. ****** (A) TEST H0: b3 = 0
.
. * (1) Wald test
. quietly poisson y x2 x3 x4
. test (x3=0)
( 1) [y]x3 = 0
chi2( 1) = 5.90
Prob > chi2 = 0.0151
. scalar WaldA = r(chi2)
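For a single exclusion restriction the Wald chi-square statistic is the square of the usual z ratio. The Python sketch below illustrates this with the x3 coefficient and standard error printed in the unrestricted Poisson output of this log. It is an illustration only; the names are assumptions, and the chi-square(1) tail is obtained from the standard normal via math.erfc.

```python
import math

# Coefficient and standard error of x3 from the unrestricted Poisson output
b3 = 0.1630037
se_b3 = 0.0670848

z = b3 / se_b3
wald = z ** 2                                  # chi-square(1) under H0: b3 = 0

# P(chi2(1) > wald) = P(|Z| > sqrt(wald)) = erfc(sqrt(wald / 2))
p_value = math.erfc(math.sqrt(wald / 2.0))
print(f"Wald = {wald:.2f}, p-value = {p_value:.4f}")
```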
.
. * (2) LR test
. poisson y x2 x4
Iteration 0: log likelihood = -241.64842
                                                  Number of obs   =        200
                                                  LR chi2(2)      =       2.55
                                                  Prob > chi2     =     0.2793
Log likelihood = -241.64842                       Pseudo R2       =     0.0053
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x2 |  -.0163179   .0770381    -0.21   0.832    -.1673098     .134674
          x4 |   .1278017   .0800348     1.60   0.110    -.0290637     .284667
       _cons |  -.1719505   .0772389    -2.23   0.026    -.3233359   -.0205651
------------------------------------------------------------------------------
. estimates store restrictedA
. lrtest unrestricted
likelihood-ratio test                                  LR chi2(1)  =      5.75
(Assumption: restrictedA nested in unrestricted)       Prob > chi2 =    0.0165
. gen one = 1
. matrix vecaccum dlnL_db = u one x2 x3 x4, noconstant
. gen trx1 = sqrt(yhatrest)
. gen trx2 = sqrt(yhatrest)*x2
. gen trx3 = sqrt(yhatrest)*x3
. gen trx4 = sqrt(yhatrest)*x4
. matrix accum Vb = trx1 trx2 trx3 trx4, noconstant
(obs=200)
. matrix LMdirect = dlnL_db*syminv(Vb)*dlnL_db'
. matrix list dlnL_db
dlnL_db[1,4]
           one          x2          x3          x4
u   -1.788e-07  -1.717e-07   34.832631  -3.179e-07
. matrix list Vb
symmetric Vb[4,4]
            trx1        trx2        trx3        trx4
trx1         169
trx2  -2.1828435   170.25918
trx3  -21.987555   15.647287    212.5673
trx4   14.371941    16.35821   22.067372   158.94405
. matrix list LMdirect
symmetric LMdirect[1,1]
           u
u  5.9159017
. scalar LMdirectA = LMdirect[1,1]
.
. * (4) LM test via auxiliary regression
. * See (B) for more explanation
. drop yhatrest s1 s2 s3 s4 one
. quietly poisson y x2 x4
. predict yhatrest
(option n assumed; predicted number of events)
. gen s1 = (y-yhatrest)*1
. gen s2 = (y-yhatrest)*x2
. gen s3 = (y-yhatrest)*x3
. gen s4 = (y-yhatrest)*x4
. gen one = 1
. regress one s1 s2 s3 s4, noconstant
      Source |       SS       df       MS
-------------+------------------------------
                                                  Number of obs   =        200
                                                  LR chi2(3)      =       8.30
                                                  Prob > chi2     =     0.0401
Log likelihood = -238.77153                       Pseudo R2       =     0.0171
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x2 |  -.0275702   .0767909    -0.36   0.720    -.1780775    .1229371
          x3 |   .1630037   .0670848     2.43   0.015     .0315199    .2944874
          x4 |   .1026568   .0802139     1.28   0.201    -.0545595    .2598732
       _cons |  -.1653238   .0773479    -2.14   0.033     -.316923   -.0137246
------------------------------------------------------------------------------
. test (x3=x4)
( 1) [y]x3 - [y]x4 = 0
chi2( 1) = 0.29
Prob > chi2 = 0.5883
.
. * (1B) Wald test done manually
. * Note that Stata puts the intercept last, not first.
. * So here the second and third elements of b are tested as equal.
. matrix drop h R Wald
. matrix bfull = e(b)
. matrix vfull = e(V)                   /* qxq matrix */
. matrix h = (bfull[1,2]-bfull[1,3])    /* hx1 vector */
. matrix R = (0,1,-1,0)                 /* h x q matrix */
symmetric h[1,1]
c1
r1 .06034684
. matrix list R
R[1,4]
c1 c2 c3 c4
r1 0 1 -1 0
. matrix list Wald
symmetric Wald[1,1]
c1
c1 .29301766
. scalar WaldC = Wald[1,1]
. di " WaldC " WaldC " p-value " chi2tail(1,WaldC)
WaldC .29301766 p-value .5882932
.
. * (2) LR Test
. * In general getting the restricted MLE requires constrained ML
. * Here this is simple: if b3=b4 then the mean is exp(b1+b2*x2+b3*(x3+x4))
. gen x3plusx4 = x3+x4
. poisson y x2 x3plusx4
Iteration 0: log likelihood = -238.91785
Iteration 1: log likelihood = -238.91785
Poisson regression                                Number of obs   =        200
                                                  LR chi2(2)      =       8.01
                                                  Prob > chi2     =     0.0182
Log likelihood = -238.91785                       Pseudo R2       =     0.0165
likelihood-ratio test                                  LR chi2(1)  =      0.29
                                                       Prob > chi2 =    0.5885
. gen one = 1
. matrix vecaccum dlnL_db = u one x2 x3 x4, noconstant
. gen trx1 = sqrt(yhatrest)
. gen trx2 = sqrt(yhatrest)*x2
. gen trx3 = sqrt(yhatrest)*x3
. gen trx4 = sqrt(yhatrest)*x4
. matrix accum Vb = trx1 trx2 trx3 trx4, noconstant
(obs=200)
. matrix LMdirect = dlnL_db*syminv(Vb)*dlnL_db'
. matrix list dlnL_db
dlnL_db[1,4]
           one          x2          x3          x4
u    8.345e-07  -3.601e-07   4.8459933  -4.8459932
. matrix list Vb
symmetric Vb[4,4]
            trx1        trx2        trx3        trx4
trx1         169
trx2  -2.1828442   171.13986
trx3   7.9990827   13.105974   225.99023
trx4   19.217934    15.11254   28.153892   161.75506
.
. * (5) DISPLAY RESULTS in Table 7.1 page 242
.
. estimates table unrestricted restrictedC, se stats(N ll r2) b(%8.3f)
----------------------------------------
    Variable | unrest~d    restri~C
-------------+--------------------------
          x2 |   -0.028      -0.029
             |    0.077       0.077
          x3 |    0.163
             |    0.067
          x4 |    0.103
             |    0.080
    x3plusx4 |                0.137
             |                0.048
       _cons |   -0.165      -0.167
             |    0.077       0.077
-------------+--------------------------
           N |  200.000     200.000
          ll | -238.772    -238.918
          r2 |
----------------------------------------
                            legend: b/se
. di "WaldC " WaldC " p-value " chi2tail(1,WaldC)
WaldC .29301766 p-value .5882932
. di " LRC " LRC " p-value " chi2tail(1,LRC)
LRC .29264001 p-value .5885337
. di " LMdirectC " LMdirectC " p-value " chi2tail(1,LMdirectC)
LMdirectC .29306257 p-value .58826462
. di " LMauxC " LMauxC " p-value " chi2tail(1,LMauxC)
LMauxC .31510777 p-value .57456264
.
. ****** (D) TEST H0: b3/b4 - 1 = 0
.
. * (1) Wald test of b3 /b4 - 1 = 0
. * Stata does not do nonlinear hypotheses.
. * Instead do 7.2.5 algebra.
. matrix drop h R Wald
. matrix h = (bfull[1,2]/bfull[1,3] - 1)
. matrix R = (0, 1/bfull[1,3], -bfull[1,2]/(bfull[1,3]^2), 0)
. matrix Wald = h'*syminv(R*vfull*R')*h
. matrix list h
symmetric h[1,1]
           c1
r1  .58785028
. matrix list R
R[1,4]
            c1          c2          c3          c4
r1           0   9.7411946  -15.467559
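The entries of h and R listed above follow from the delta method applied to h(b) = b3/b4 - 1, whose gradient with respect to (b2, b3, b4, b1) is (0, 1/b4, -b3/b4^2, 0). The Python sketch below re-derives them from the rounded x3 and x4 coefficients printed earlier in this log, so it matches the Stata values only to about five or six digits. The names are illustrative, and the Wald statistic itself is not recomputed because the full variance matrix e(V) is not printed in the log.

```python
# Unrestricted Poisson estimates as printed in this log (rounded)
b3 = 0.1630037   # coefficient on x3, bfull[1,2] in the Stata code
b4 = 0.1026568   # coefficient on x4, bfull[1,3] in the Stata code

# Nonlinear restriction h(b) = b3/b4 - 1 = 0
h = b3 / b4 - 1.0

# Gradient of h with respect to (b2, b3, b4, b1), matching Stata's ordering
# in which the intercept comes last
R = [0.0, 1.0 / b4, -b3 / b4 ** 2, 0.0]

print(f"h = {h:.6f}")
print("R =", [round(r, 5) for r in R])
```

Given the 4x4 variance matrix V, the Wald statistic would then be h^2 / (R V R'), as in the Stata line that builds the matrix Wald.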
.
. * Auxiliary Regression LM test statistics
. di "LM* A to D: (A) " %8.3f LMauxA " (B) " %8.3f LMauxB " (C) " %8.3f LMauxC " (D) " %8.3f LMauxC
LM* A to D: (A) 6.218 (B) 9.186 (C) 0.315 (D) 0.315
. di " p-values : (A) " %8.3f chi2tail(1,LMauxA) " (B) " %8.3f chi2tail(2,LMauxB) " (C) " %8.3f chi2tail(1,LMauxC) " (D) " %8.3f chi2tail(1,LMauxC)
p-values : (A) 0.013 (B) 0.010 (C) 0.575 (D) 0.575
.
. ********** CLOSE OUTPUT ***********
. log close
log: c:\Imbook\bwebpage\Section2\mma07p1mltests.txt
log type: text
closed on: 17 May 2005, 13:59:21
------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma07p2power.txt
log type: text
opened on: 17 May 2005, 14:00:49
.
. ********** OVERVIEW OF MMA07P2POWER.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 7.6.3 pages 248-9
. * Asymptotic Power of Wald test
.
. * (1) Chapter 7.6.3 obtains power for noncentral chisquare
. * (2) Figure 7.2 (ch7power.wmf) plots against the noncentrality parameter lamda
. * No data needed
.
. ********** SETUP **********
.
. set more off
. version 8.0
. set scheme s1mono /* Graphics scheme */
.
. ********** ANALYSIS **********
.
. * Obtain power of chi-square tests
/* Degrees of freedom */
. * For lamda = 0 have size = power, here 0.01, 0.05 and 0.10
. list if lamda==0 | lamda==5 | lamda==10 | lamda==20
     +----------------------------------------+
     | lamda    power01    power05    power10 |
     |----------------------------------------|
  1. |     0        .01        .05         .1 |
 51. |     5   .3670189   .6087795   .7228636 |
101. |    10   .7212129   .8853791   .9354209 |
201. |    20   .9710402   .9940005   .9976528 |
     +----------------------------------------+
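The power values listed above can be reproduced without a noncentral chi-square routine if the test has one degree of freedom, because a noncentral chi-square(1) variate with noncentrality lamda is the square of a N[sqrt(lamda), 1] variate. The degrees-of-freedom setting is not visible in this excerpt of the log, so df = 1 is an assumption here; it does reproduce the listed numbers. The Python sketch below uses only the standard library, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def wald_power_df1(ncp, size):
    """Power of a chi-square(1) test of the given size at noncentrality ncp.

    Uses P(chi2_1(ncp) > c^2) = P(|Z + sqrt(ncp)| > c) with Z ~ N[0,1],
    where c is the two-sided normal critical value for the test size.
    """
    std_normal = NormalDist()
    crit = std_normal.inv_cdf(1.0 - size / 2.0)
    shift = sqrt(ncp)
    upper = 1.0 - std_normal.cdf(crit - shift)   # P(Z + shift >  crit)
    lower = std_normal.cdf(-crit - shift)        # P(Z + shift < -crit)
    return upper + lower


for ncp in (0, 5, 10, 20):
    row = [wald_power_df1(ncp, size) for size in (0.01, 0.05, 0.10)]
    print(ncp, [round(p, 4) for p in row])
```

At lamda = 0 the function returns the test size itself, which is the "size = power" remark in the log.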
.
. ********** FIGURE 7.1 (p.249): PLOT THE POWER FUNCTION **********
.
. graph twoway (line power10 lamda, clstyle(p1)) /*
> */ (line power05 lamda, clstyle(p2)) /*
> */ (line power01 lamda, clstyle(p3)), /*
> */ scale (1.2) plotregion(style(none)) /*
> */ title("Test Power as a function of the ncp") /*
> */ xtitle("Noncentrality parameter lamda", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Test Power", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend(pos(3) ring(0) col(1)) legend(size(small)) /*
> */ legend( label(1 "Test size = 0.10") label(2 "Test size = 0.05") /*
> */ label(3 "Test size = 0.01"))
. graph export ch7power.wmf, replace
(file c:\Imbook\bwebpage\Section2\ch7power.wmf written in Windows Metafile format)
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section2\mma07p2power.txt
log type: text
closed on: 17 May 2005, 14:00:52
------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma07p3montecarlo.txt
log type: text
opened on: 18 May 2005, 11:28:58
.
. ********** OVERVIEW OF MMA07P3MONTECARLO.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 7.7.1-7.7.5 pp. 250-4
. display "Wald: Lower 2.5 percentile = " r(r1) " Upper 2.5 percentile = " r(r2)
Wald: Lower 2.5 percentile = -1.904708 Upper 2.5 percentile = 2.0034728
.
. * The density of the simulated values of the Wald test should be
. * a standard normal density if Wald ~ N[0,1]
. * The following plots kernel estimate of density of Wald and a N[0,1] density
. * Could also do Student[N-k] but this looks same as N[0,1] if N>=30.
. gen N01density = normden(Wald)
. sum Wald
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        Wald |     10000    .1141294    .9558451  -4.087344   2.278257
.
. graph twoway (kdensity Wald, range(-3 3) clstyle(p1)) /*
> */ (connect N01density Wald if Wald>-3 & Wald<3, clstyle(p2) sort(Wald) s(i)), /*
> */ scale (1.2) plotregion(style(none)) /*
> */ title("Monte Carlo Simulations of Wald Test") /*
> */ xtitle("Wald Test Statistic", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Density", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend(pos(11) ring(0) col(1)) legend(size(small)) /*
> */ legend( label(1 "Monte Carlo") label(2 "Standard Normal") /*
> */ label(3 "Test size = 0.01"))
. graph export ch7montecarlo.wmf, replace
(file c:\Imbook\bwebpage\Section2\ch7montecarlo.wmf written in Windows Metafile format)
.
. ****** (2) ACTUAL SIZE OF THE WALD TEST STATISTIC (Table 7.2, p.253)
.
. * Obtain the size properties of a two-sided Wald test
. * That rejects if |Wald| > z_alpha/2 where alpha = .01, .05, .1, .2
.
. * Convert to two-sided test by taking absolute value
. gen absWald = abs(Wald)
.
. * Give key percentiles of |Wald|
. * Percentiles must be in ascending order for Stata
. _pctile absWald, p(0.80,0.90,0.95,0.99)
. display "I[Upper percentiles of |Wald|: " " 1 " r(r4) " 5 " r(r3) " 10 " r(r2) " 20 " r(r1)
I[Upper percentiles of |Wald|: 1 .0115847 5 .01074749 10 .00998338 20 .00923005
.
. * Program to calculate actual size given nominal size
. * Temporary variables and scalars are in quotes ` '
. program size, rclass
  1.   version 8.0
  2.   args nominalsize
  3.   tempvar reject
  4.   tempname normalcriticalvalue
  5.   quietly {
  6.     scalar `normalcriticalvalue' = invnorm(1-(`nominalsize'/2))
  7.     gen `reject' = 0
  8.     replace `reject' = 1 if absWald > `normalcriticalvalue'
  9.     summarize `reject'
 10.     return scalar actualsize = r(mean)
 11.   }
 12. end
.
. * Calculate actual size for nominal sizes 0.01, 0.05, 0.10 and 0.20
. size 0.01
. scalar actualsize01 = r(actualsize)
. size 0.05
. scalar actualsize05 = r(actualsize)
. size 0.10
. scalar actualsize10 = r(actualsize)
. size 0.20
. scalar actualsize20 = r(actualsize)
.
. * Following gives Actual Size column of Table 7.2 (p.253)
. * Nominal Sizes and Actual Sizes of Two-sided Wald Test
. di "0.01: " actualsize01 _new "0.05: " actualsize05 _new /*
> */ "0.10: " actualsize10 _new "0.20: " actualsize20
0.01: .0053
0.05: .0294
0.10: .0805
0.20: .1922
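The logic of the size program above is simply a rejection frequency: the share of simulated Wald statistics whose absolute value exceeds the two-sided normal critical value. A minimal Python sketch of the same calculation follows. It is not the authors' code, the function name is an assumption, and the six statistics in the usage lines are made-up illustrative values, not simulation output from this log.

```python
from statistics import NormalDist


def actual_size(wald_stats, nominal_size):
    """Fraction of statistics rejected by a two-sided test of the nominal size."""
    crit = NormalDist().inv_cdf(1.0 - nominal_size / 2.0)
    rejections = sum(1 for w in wald_stats if abs(w) > crit)
    return rejections / len(wald_stats)


# Made-up statistics, for illustration only
illustrative_stats = [-2.7, -1.2, 0.3, 0.9, 1.7, 2.1]
for nominal in (0.05, 0.10):
    print(nominal, actual_size(illustrative_stats, nominal))
```

Applied to the 10,000 simulated Wald statistics, the same function would play the role of the Stata calls size 0.01 through size 0.20.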
.
. ****** (3) ACTUAL POWER OF THE WALD TEST STATISTIC (Table 7.2, p.253)
.
. * Consider power when b2 = 2 rather than 1
.
. * Obtain the actual power by simulation
. * Use the same program simprobit as for size,
. * except the argument b2true is 2.0 rather than 1.0
.
. drop _all
.
. * For power calculations set trueb2 = 2
. simulate "simprobit 2" ymean=r(ymean) yvar=r(yvar) b2hat=r(b2hat) /*
> */ seb2hat=r(seb2hat) ztestforb2eq1=r(ztestforb2eq1), reps(10000)
command:      simprobit 2
statistics:   ymean      = r(ymean)
              yvar       = r(yvar)
              b2hat      = r(b2hat)
              seb2hat    = r(seb2hat)
              ztestfor~1 = r(ztestforb2eq1)
.
. * Calculate |Wald|
. gen Wald = ztestforb2eq1
(71 missing values generated)
. gen absWald = abs(Wald)
(71 missing values generated)
.
. summarize
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       ymean |      9929    .4998389    .0791531       .225       .825
        yvar |      9929     .249985    .0090933   .1480769   .2564103
       b2hat |      9929    2.581075     2.73046   .8547966   209.9805
     seb2hat |      9929    1.002628    5.799384   .2816004   540.1536
ztestforb2~1 |      9929    1.667773    .3853416  -.4042006    2.59991
-------------+--------------------------------------------------------
        Wald |      9929    1.667773    .3853416  -.4042006    2.59991
     absWald |      9929    1.668285     .383118   .0033462    2.59991
.
. * Calculate actual power for nominal sizes 0.01, 0.05, 0.10 and 0.20
. * This can use the earlier program size
. size 0.01
. scalar actualpower01 = r(actualsize)
. size 0.05
. scalar actualpower05 = r(actualsize)
. size 0.10
. scalar actualpower10 = r(actualsize)
. size 0.20
.
. * This program has two arguments
. * - numsims = desired number of simulations
. * - trueb2 = slope coefficient used to generate the data
.
. drop _all
.
. program simprobit2
  1.   version 8.0
  2.   args numsims trueb2
  3.   tempname sim
  4.   postfile `sim' meany vary beta sterror ztestforbeta using probitsimresults, replace
  5.   quietly {
  6.     forvalues i = 1/`numsims' {
  7.       drop _all
  8.       set obs $numobs             /* may need to change */
  9.       gen x = invnorm(uniform())
 10.       /* If instead want same x in each simulation
>             replace above line with: use xforsim */
.         gen y = 0
 11.       /* Use b2 = 1.0 for size and 1.5 for power */
.         replace y = 1 if 0+`trueb2'*x+invnorm(uniform()) > 0
 12.       summarize y
 13.       scalar meany=r(mean)
 14.       scalar vary=r(Var)
 15.       probit y x
 16.       scalar beta=_b[x]
 17.       scalar sterror = _se[x]
 18.       scalar ztestforbeta = (beta-1)/sterror
 19.       post `sim' (meany) (vary) (beta) (sterror) (ztestforbeta)
 20.     }
 21.   }
 22.   postclose `sim'
 23. end
.
. simprobit2 $numsims 1
. use probitsimresults, clear
.
. * Here we just summarize results for comparison with earlier
. * But could do the further analysis as above
. sum
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       meany |     10000    .4989575    .0791248       .225       .775
        vary |     10000    .2499885    .0090127   .1788462   .2564103
        beta |     10000    1.135003    .4315248   .0901358   7.205799
. * where x is N[0,1]
. * and a = 0 and b = 1
.
. * Change the following for different sample size N
. global numobs "40"
.
. * Probit example with slope coefficient equal to 1
. set seed 10105
. set obs $numobs
obs was 0, now 40
. gen x = invnorm(uniform())
. gen y = 0
. replace y = 1 if 0+1.0*x+invnorm(uniform()) > 0
(19 real changes made)
. save xyforsim, replace
file xyforsim.dta saved
. summarize
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           x |        40   -.0359197    .9203391  -2.210579    1.45199
           y |        40        .475    .5057363          0          1
. probit y x
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Probit estimates                                  Number of obs   =         40
                                                  LR chi2(1)      =       9.88
                                                  Prob > chi2     =     0.0017
Log likelihood = -22.733966                       Pseudo R2       =     0.1786
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .8168831   .2942893     2.78   0.006     .2400867    1.393679
       _cons |  -.0725436   .2162576    -0.34   0.737    -.4964006    .3513135
------------------------------------------------------------------------------
. save mma07p4boot, replace
Bootstrap statistics                              Number of obs    =        40
                                                  Replications     =       999
                                                  Number of obs    =        40
                                                  Replications     =       999
.
. * (1C) This bootstrap repeats (2)
. * but will permit bootstrapping if Stata commands are more than one line
. use mma07p4boot, clear
. program define commandtobootstrap, rclass
1. version 8.0
2. quietly probit y x
3. return scalar b2hat=_b[x]
4. return scalar seb2hat=_se[x]
5. end
. set seed 10105
. bootstrap "commandtobootstrap" r(b2hat) r(seb2hat), reps($breps)
command:      commandtobootstrap
statistics:   _bs_1      = r(b2hat)
              _bs_2      = r(seb2hat)
Bootstrap statistics                              Number of obs    =        40
                                                  Replications     =       999
                                                  Number of obs    =        40
                                                  Replications     =       999
Chapter 11.2.6-11.2.7
. * For sample s compute t-test(s) = (bhat(s)-bhat) / se(s)
. * where bhat is initial estimate
. * and bhat(s) and se(s) are for sth round.
. * Order the t-test(s) statistics and choose the alpha/2 percentiles
. * which give the critical values for the t-test
.
. * Implementation requires saving the results from each bootstrap replication
. * in order to obtain critical values from percentiles of bootstrap distribution
.
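The percentile-t recipe in the comments above can be sketched in a few lines. The Python function below is a hedged illustration rather than the authors' implementation: it forms t(s) = (bhat(s) - bhat) / se(s) for each replication, sorts the values, and returns the order statistics at ranks alpha/2 * (S+1) and (1 - alpha/2) * (S+1). That rank convention is an assumption chosen because it gives whole numbers when S = 999, and the data in the usage lines are made up for illustration.

```python
def percentile_t_critical_values(boot_estimates, boot_ses, estimate, alpha=0.05):
    """Lower and upper bootstrap critical values for the t-statistic.

    boot_estimates, boot_ses: estimate and standard error from each replication.
    estimate: the original-sample estimate bhat used to center each t(s).
    """
    t_stats = sorted((b - estimate) / se for b, se in zip(boot_estimates, boot_ses))
    reps = len(t_stats)
    # Ranks are 1-based order statistics, clamped to the available range
    lower_rank = max(round((alpha / 2.0) * (reps + 1)), 1)
    upper_rank = min(round((1.0 - alpha / 2.0) * (reps + 1)), reps)
    return t_stats[lower_rank - 1], t_stats[upper_rank - 1]


# Made-up replications, for illustration only: 19 draws around bhat = 0.8
bhat = 0.8
fake_estimates = [bhat + 0.1 * k for k in range(-9, 10)]
fake_ses = [0.1] * len(fake_estimates)
low, high = percentile_t_critical_values(fake_estimates, fake_ses, bhat, alpha=0.10)
print(low, high)
```

The original Wald statistic (bhat - 1) / se would then be compared with these two critical values instead of the standard normal ones.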
. * (3A) Here bootstrap computes (b(s) - bhat) / se(s) s = 1,...,S
.
. use mma07p4boot, clear
. * Save the estimate and the Wald test statistic
. quietly probit y x
. scalar b2est = _b[x]
. scalar Wald = (_b[x] - 1)/_se[x]
. * Then bootstrap calculates (b(s) - bhat) / se(s)
. set seed 10105
. bootstrap "probit y x" ((_b[x]-b2est)/_se[x]), reps($breps) /*
> */ level(95) saving(mma07p4bootreps) replace
command:      probit y x
statistic:    _bs_1      = (_b[x]-b2est)/_se[x]
Bootstrap statistics                              Number of obs    =        40
                                                  Replications     =       999
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       _bs_1 |       999    .1003619    .9350234  -3.032139   2.572848
. gen b2test = _bs_1 /* _bs_1 is the bootstrap result of interest */
. sum b2test, detail /* Gives percentiles but not 2.5% and 97.5% */
                           b2test
-------------------------------------------------------------
      Percentiles      Smallest
 1%    -2.188575      -3.032139
 5%    -1.540843      -2.605178
10%    -1.137846      -2.599248       Obs                 999
25%    -.4995352      -2.566578       Sum of Wgt.         999

50%     .1238111                      Mean           .1003619
                        Largest       Std. Dev.      .9350234
75%     .7789762        2.22565
90%     1.338348       2.359132       Variance       .8742688
95%     1.560646       2.377491       Skewness      -.2505319
99%     2.014282       2.572848       Kurtosis       2.853737
.
. * (3B) Equivalently bootstrap calculates b(s) and se(s) s = 1,...,S
. *      and then later calculate (b(s) - bhat) / se(s)
.
. use mma07p4boot, clear
. * Save the estimate and the Wald test statistic
. quietly probit y x
. scalar b2est = _b[x]
. scalar Wald = (_b[x] - 1)/_se[x]
. * Then bootstrap calculates b(s) and se(s)
. set seed 10105
. bootstrap "probit y x" _b[x] _se[x], reps($breps) /*
> */ level(95) saving(mma07p4bootreps) replace
command:      probit y x
statistics:   _bs_1      = _b[x]
              _bs_2      = _se[x]
Bootstrap statistics                              Number of obs    =        40
                                                  Replications     =       999
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       _bs_1 |       999     .918616    .3763803   .0030288   3.806198
       _bs_2 |       999    .3364898    .0932673   .2162534    1.34312
------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma08p1cmtests.txt
log type: text
opened on: 17 May 2005, 14:04:20
.
. ********** OVERVIEW OF MMA08P1CMTESTS.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 8.2.6 pages 269-71
. * Conditional moment tests example producing Table 8.1
.
. * (A) TEST OF THE CONDITIONAL MEAN
. * (B) TEST THAT CONDITIONAL VARIANCE = MEAN
. * (C) ALTERNATIVE TEST THAT CONDITIONAL VARIANCE = MEAN
. * (D) INFORMATION MATRIX TEST
. * (E) CHI-SQUARE GOODNESS OF FIT TEST
. * for a Poisson model with generated data (see below).
.
. * The data generation requires free Stata add-on command rndpoix
. * In Stata: search rndpoix
.
. ********** SETUP **********
.
. set more off
. version 8.0
. set scheme s1mono /* Used for graphs */
.
. ********** GENERATE DATA **********
.
. * Model is
. * y ~ Poisson[exp(b1 + b2*x2)]
. * where
. * x2 is iid ~ N[0,1]
. * and b1=0 and b2=1.
.
. set seed 10001
. set obs 200
obs was 0, now 200
. scalar b1 = 0
. scalar b2 = 1
.
. * Generate regressors
. gen x2 = invnorm(uniform())
.
. * Generate y
. gen mupoiss = exp(b1+b2*x2)
. * The next requires Stata add-on. In Stata: search rndpoix
. rndpoix(mupoiss)
( Generating ................ )
Variable xp created.
. gen y = xp
.
. * Write data to a text (ascii) file so can use with programs other than Stata
. outfile y x2 using mma08p1cmtests.asc, replace
.
. ********* POISSON REGRESSION **********
.
. poisson y x2
Iteration 0:   log likelihood = -263.53818
Iteration 1:   log likelihood =  -263.5288
Iteration 2:   log likelihood =  -263.5288

Poisson regression                                Number of obs   =        200
                                                  LR chi2(1)      =     321.75
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.3791
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x2 |    1.12402   .0687868    16.34   0.000     .9892006     1.25884
       _cons |  -.1652935    .089065    -1.86   0.063    -.3398578    .0092707
------------------------------------------------------------------------------
. * Obtain exp(x'b)
.
. * Obtain the scores to be used later
. predict yhat
(option n assumed; predicted number of events)
. * For the Poisson s = dlnf(y)/db = (y - exp(x'b))*x
. gen s1 = (y - yhat)
. gen s2 = (y - yhat)*x2
.
. * Summarize data
. * Should get s1 and s2 summing to zero
. sum
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
          x2 |       200   -.0091098    1.010072  -2.857666   2.149822
     mupoiss |       200    1.599601    1.674071   .0574026    8.58333
          xp |       200       1.525    2.363749          0         15
           y |       200       1.525    2.363749          0         15
        yhat |       200       1.525    1.803242   .0341372   9.498652
-------------+--------------------------------------------------------
          s1 |       200    1.36e-09     1.36719  -3.148933   6.245292
          s2 |       200    6.69e-09    1.889198  -6.420406   12.97311
.
. ********** ANALYSIS: CONDITIONAL MOMENTS TESTS **********
.
. * The program is appropriate for MLE with density assumed to be correctly specified.
. * Let H0: E[m(y,x,theta)] = 0
. * Then CM = explained sum of squares or N times uncentered Rsq from
. * auxiliary regression of 1 on m and the components of s = dlnf(y)/dtheta
. * The test is chi-squared with dim(m) degrees of freedom.
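The auxiliary-regression form of the CM statistic described above can be sketched outside Stata. The Python below is an illustration only: the function names are hypothetical, and it solves the normal equations directly rather than calling a regression routine. Because the dependent variable is identically 1, the uncentered total sum of squares is N, so N times the uncentered R-squared equals the explained sum of squares b'X'1.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("regressors are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def cm_statistic(columns):
    """Explained sum of squares from regressing 1 on the given columns.

    columns is a list of equal-length lists (the moments m and the scores s),
    with no constant added, matching the noconstant option in the log.
    """
    k = len(columns)
    # Cross-product matrix X'X and the vector X'1.
    xtx = [[sum(u * v for u, v in zip(columns[i], columns[j]))
            for j in range(k)] for i in range(k)]
    xt1 = [sum(col) for col in columns]
    coef = solve_linear_system(xtx, xt1)
    # ESS = b'X'1, which equals N times the uncentered R-squared here.
    return sum(c * t for c, t in zip(coef, xt1))
```

If every column sums to zero in the sample, X'1 is a zero vector and the statistic is exactly zero. This is why the log notes that z cannot be x for the Poisson: the scores alone carry no information about the test.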
.
. * Define the dependent variable one for the auxiliary regressions
. gen one = 1
.
. *** (A) TEST OF THE CONDITIONAL MEAN (Table 8.1 p.270 row 1)
.
. * Test H0: E[(y - exp(x'b))*z] = 0 where z = x2sq
.
. * A similar test is relevant for many nonlinear models
. * Just change the expression for the conditional mean.
. * Here we used E[y|x] = exp(x'b) for the Poisson
. * Also for the Poisson z cannot be x as this sums to zero by Poisson foc
. * For some other models (basically non-LEF models) z can be x
.
. gen z = x2*x2
. gen mA = (y - yhat)*z
. regress one mA s1 s2, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  3,   197) =    1.09
       Model |  3.27177115     3  1.09059038           Prob > F      =  0.3536
    Residual |  196.728229   197  .998620451           R-squared     =  0.0164
-------------+------------------------------
       Total |         200   200           1

------------------------------------------------------------------------------
         one |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          mA |   .1046155   .0577969     1.81   0.072    -.0093646    .2185956
          s1 |  -.0377486   .0822939    -0.46   0.647    -.2000387    .1245415
          s2 |  -.1544278   .1029465    -1.50   0.135    -.3574463    .0485908
------------------------------------------------------------------------------
. scalar CMA = e(N)*e(r2)
. di "CMA: " CMA " p-value: " chi2tail(1,CMA)
CMA: 3.2717711 p-value: .07048149
.
. * Check that three different ways give same answer.
. di "N times Uncentered R-squared: " e(N)*e(r2)
N times Uncentered R-squared: 3.2717711
. di "Explained Sum of Squares: " e(mss)
Explained Sum of Squares: 3.2717711
. di "N minus Residual Sum of Squares: " e(N) - e(rss)
N minus Residual Sum of Squares: 3.2717711
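The p-values in this log come from Stata's `chi2tail(df, x)`. For readers checking the numbers elsewhere, here is a small Python sketch of the same upper-tail probability for integer degrees of freedom. The implementation (closed forms for df = 1 and 2 plus a standard recurrence) is an assumption chosen for illustration, not Stata's own algorithm.

```python
import math


def chi2tail(df, x):
    """Upper-tail probability of a chi-squared(df) variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 1:
        tail = math.erfc(math.sqrt(half))   # exact for df = 1
        k = 1
    else:
        tail = math.exp(-half)              # exact for df = 2
        k = 2
    # Recurrence: Q(k+2, x) = Q(k, x) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < df:
        tail += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return tail


# Statistics copied from the log: CMA with 1 df and CME with 3 df.
p_cma = chi2tail(1, 3.2717711)
p_cme = chi2tail(3, 2.5005672)
```

Applied to the statistics printed in the log, this agrees with the reported p-values to the displayed precision.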
.
. *** (B) TEST THAT CONDITIONAL VARIANCE = MEAN (Table 8.1 p.270 row 2)
.
. * Test H0: E[{(y - exp(x'b))^2 - exp(x'b)}*x] = 0
.
. * This test is peculiar to Poisson which restricts mean = variance
.
. * Here m has 2 terms
. gen mB1 = ((y - yhat)^2 - yhat)
. gen mB2 = ((y - yhat)^2 - yhat)*x2
. regress one mB1 mB2 s1 s2, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  4,   196) =    0.60
       Model |  2.43400011     4  .608500026           Prob > F      =  0.6604
    Residual |     197.566   196   1.0079898           R-squared     =  0.0122
-------------+------------------------------           Adj R-squared = -0.0080
       Total |         200   200           1           Root MSE      =   1.004

------------------------------------------------------------------------------
         one |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
. replace d0 = 1 if y==0
(87 real changes made)
. gen d1 = 0
. replace d1 = 1 if y==1
(51 real changes made)
. gen d2 = 0
. replace d2 = 1 if y==2
(22 real changes made)
. gen p0 = exp(-yhat)
. gen p1 = exp(-yhat)*yhat
. gen p2 = exp(-yhat)*(yhat^2)/2
. gen mE1 = d0 - p0
. gen mE2 = d1 - p1
. gen mE3 = d2 - p2
. regress one mE1 mE2 mE3 s1 s2, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  5,   195) =    0.49
       Model |  2.50056717     5  .500113433           Prob > F      =  0.7807
    Residual |  197.499433   195   1.0128176           R-squared     =  0.0125
-------------+------------------------------           Adj R-squared = -0.0128
       Total |         200   200           1           Root MSE      =  1.0064

------------------------------------------------------------------------------
         one |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mE1 |   1.020078   .7290569     1.40   0.163    -.4177712    2.457927
         mE2 |   .7149016   .5053259     1.41   0.159    -.2817042    1.711507
         mE3 |   .2705081    .383646     0.71   0.482    -.4861201    1.027136
          s1 |   .2916116   .2217763     1.31   0.190    -.1457765    .7289997
          s2 |  -.1341565   .1125046    -1.19   0.235    -.3560384    .0877255
------------------------------------------------------------------------------
. scalar CME = e(N)*e(r2)
. di "CME: " CME " p-value: " chi2tail(3,CME)
CME: 2.5005672 p-value: .47518859
.
. * Wrong alternative is basic chisquare
. quietly sum d0
. scalar sumd0 = r(sum)
. quietly sum d1
. scalar sumd1 = r(sum)
. quietly sum d2
. scalar sumd2 = r(sum)
. scalar sumd3 = 1 - sumd0 - sumd1 - sumd2
. quietly sum p0
. scalar sump0 = r(sum)
. quietly sum p1
. scalar sump1 = r(sum)
. quietly sum p2
. scalar sump2 = r(sum)
. scalar sump3 = 1 - sump0 - sump1 - sump2
. scalar chisq = (sumd0-sump0)^2/sump0 + (sumd1-sump1)^2/sump1 /*
> */ + (sumd2-sump2)^2/sump2 + (sumd3-sump3)^2/sump3
. di "Wrong Traditional chi-square: " chisq " p = " chi2tail(3,chisq)
Wrong Traditional chi-square: .47431003 p = .92449803
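The traditional Pearson chi-square above can be retraced from numbers printed in this log: the cell counts 87, 51 and 22 (the "real changes made" lines) and the means of p0, p1 and p2 in the summary table. The Python sketch below is an illustration only and its helper names are hypothetical. It also highlights one detail: the log forms the residual cell as 1 minus the other cells although `r(sum)` returns counts, not shares. Using N minus the other cells instead is an assumption about the intent, shown for comparison.

```python
def pearson_chisq(observed, expected):
    """Sum over cells of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def with_residual_cell(values, total):
    """Append a final cell holding whatever is left of the total."""
    return list(values) + [total - sum(values)]


n_obs = 200
counts = [87, 51, 22]                         # observations with y = 0, 1, 2
mean_probs = [0.429237, 0.2406035, 0.1235594] # sample means of p0, p1, p2
fitted = [n_obs * p for p in mean_probs]      # fitted counts, as r(sum) gives

# As coded in the log: residual cell = 1 - sum of the other cells.
as_logged = pearson_chisq(with_residual_cell(counts, 1),
                          with_residual_cell(fitted, 1))

# Assumed intent: residual cell = N - sum of the other cells.
as_counts = pearson_chisq(with_residual_cell(counts, n_obs),
                          with_residual_cell(fitted, n_obs))
```

The first value reproduces the .4743 reported in the log; the second is somewhat larger because the residual cell then contributes a positive term. Either way the log labels this traditional statistic as the wrong one to use here, in contrast to the CM version that includes the scores.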
.
.
. ********** DISPLAY RESULTS (Table 8.1 p.270) **********
.
. sum
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
          x2 |       200   -.0091098    1.010072  -2.857666   2.149822
     mupoiss |       200    1.599601    1.674071   .0574026    8.58333
          xp |       200       1.525    2.363749          0         15
           y |       200       1.525    2.363749          0         15
        yhat |       200       1.525    1.803242   .0341372   9.498652
-------------+--------------------------------------------------------
          s1 |       200    1.36e-09     1.36719  -3.148933   6.245292
          s2 |       200    6.69e-09    1.889198  -6.420406   12.97311
         one |       200           1           0          1          1
           z |       200    1.015227    1.286795   .0000877   8.166255
          mA |       200    .1563713    3.403966  -13.52498   26.94856
-------------+--------------------------------------------------------
         mB1 |       200     .334863    3.470417  -6.436038   30.24896
         mB2 |       200      .43869    5.749749  -11.74974   62.83503
         mC1 |       200     .334863    3.077815  -6.838236   24.00367
         mC2 |       200      .43869    4.897291    -12.484   49.86192
         mD1 |       200     .334863    3.077815  -6.838236   24.00367
-------------+--------------------------------------------------------
         mD2 |       200      .43869    4.897291    -12.484   49.86192
         mD3 |       200    .8381842    9.190652    -22.791   103.5763
          d0 |       200        .435    .4970011          0          1
          d1 |       200        .255     .436955          0          1
          d2 |       200         .11    .3136749          0          1
-------------+--------------------------------------------------------
          p0 |       200     .429237    .2918348    .000075   .9664389
          p1 |       200    .2406035    .1137756    .000712    .367864
          p2 |       200    .1235594    .0894167   .0005631   .2706694
         mE1 |       200     .005763    .4287003  -.9289918   .9571021
         mE2 |       200    .0143965    .4210301   -.367864   .9315748
-------------+--------------------------------------------------------
         mE3 |       200   -.0135594    .3065698  -.2706694   .9688674
.
. * Gives Rows 1-5 of Table 8.1 (The CMxnoscores are not reported)
. di "CMA: " CMA " p-value: " chi2tail(1,CMA)
CMA: 3.2717711 p-value: .07048149
. di "CMB: " CMB " p-value: " chi2tail(2,CMB)
CMB: 2.4340001 p-value: .29611717
. di "CMC: " CMC " p-value: " chi2tail(2,CMC)
CMC: 2.4340001 p-value: .29611717
. di "CMD: " CMD " p-value: " chi2tail(3,CMD)
CMD: 2.9463051 p-value: .39997818
. di "CME: " CME " p-value: " chi2tail(3,CME)
CME: 2.5005672 p-value: .47518859
. di "CMCnoscores: " CMCnoscores " p-value: " chi2tail(2,CMCnoscores)
CMCnoscores: 2.4069518 p-value: .30014911
. di "CMDnoscores: " CMDnoscores " p-value: " chi2tail(3,CMDnoscores)
CMDnoscores: 2.7344575 p-value: .43440333
.
. ********** FURTHER ANALYSIS gives M** column in Table 8.1 **********
.
. * The following drops the scores from the regression. Provides lower bound.
. * Results are reported in last column in Table 8.1
                                                  Number of obs   =        100
                                                  LR chi2(1)      =      16.28
                                                  Prob > chi2     =     0.0001
Log likelihood = -183.43146                       Pseudo R2       =     0.0425
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x2 |    .291164    .072311     4.03   0.000     .1494371    .4328909
       _cons |   .6084331   .0752833     8.08   0.000     .4608806    .7559857
------------------------------------------------------------------------------
. estimates store model1
                                                  Number of obs   =        100
                                                  LR chi2(2)      =      30.96
                                                  Prob > chi2     =     0.0000
Log likelihood = -176.09119                       Pseudo R2       =     0.0808
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x3 |   .3588412     .07035     5.10   0.000     .2209578    .4967245
        x3sq |   .0912999   .0514311     1.78   0.076    -.0095032    .1921029
       _cons |    .492656   .0958903     5.14   0.000     .3047144    .6805975
------------------------------------------------------------------------------
. estimates store model2
. scalar ll2 = e(ll)
. scalar q2 = e(k)
. scalar N2 = e(N)
. scalar aic2 = -2*ll2 + 2*q2
. scalar bic2 = -2*ll2 + ln(N2)*q2
. scalar caic2 = -2*ll2 + (1 + ln(N2))*q2
.
. * Display results given in first three rows of Table 8.2 page 284
.
. estimates table model1 model2, stats(N k ll aic bic)
----------------------------------------
    Variable |   model1       model2
-------------+--------------------------
          x2 |  .29116396
          x3 |               .35884118
        x3sq |               .09129986
       _cons |  .60843314    .49265596
-------------+--------------------------
           N |        100          100
           k |          2            3
          ll | -183.43146   -176.09119
         aic |  370.86292    358.18238
         bic |  376.07326    365.99789
----------------------------------------
.
. di "Model 1: " _n "lnL: " ll1 " q: " q1 _n " N: " N1
Model 1:
lnL: -183.43146 q: 2
N: 100
. di "-2lnL: " -2*ll1 _n "AIC: " aic1 _n " BIC: " bic1 _n "caic: " caic1
-2lnL: 366.86292
AIC: 370.86292
BIC: 376.07326
caic: 378.07326
.
. di "Model 2: " _n "lnL: " ll2 " q: " q2 _n " N: " N2
Model 2:
lnL: -176.09119 q: 3
N: 100
. di "-2lnL: " -2*ll2 _n "AIC: " aic2 _n " BIC: " bic2 _n "caic: " caic2
-2lnL: 352.18238
AIC: 358.18238
BIC: 365.99789
caic: 368.99789
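The three information criteria displayed above follow directly from the formulas in the scalar definitions: AIC = -2lnL + 2q, BIC = -2lnL + ln(N)q and CAIC = -2lnL + (1 + ln(N))q. A short Python sketch, with a hypothetical function name, reproduces them from the log-likelihood, the number of parameters q and the sample size N.

```python
import math


def information_criteria(loglik, q, n):
    """Return AIC, BIC and CAIC using the same formulas as the log."""
    minus2ll = -2.0 * loglik
    return {
        "aic": minus2ll + 2.0 * q,
        "bic": minus2ll + math.log(n) * q,
        "caic": minus2ll + (1.0 + math.log(n)) * q,
    }


# Values copied from the log for the two Poisson models.
model1 = information_criteria(-183.43146, q=2, n=100)
model2 = information_criteria(-176.09119, q=3, n=100)
```

All three criteria are smaller for model 2, so each of them prefers model 2 despite its extra parameter.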
.
. ********* (B) VUONG TEST FOR OVERLAPPING MODELS *********
.
. * The test has three variants
. * (1) Nested models: G is contained in F
. * (2) Strictly non-nested models: F intersection G equals null set
. * (3) Overlapping models: F intersection G does not equal null set
.
. * Need to compute lnf(y) for models 1 and 2,
. * where density f is model 1 and density g is model 2
.
. * The procedures will vary with model. Here use Poisson.
.
. * (0) COMPUTE THE LR TEST STATISTIC
.
. * This is LR = Sum_i [ ln (fy1_i / gy2_i) ]
. *           = Sum_i lnfy1_i - Sum_i lngy2_i
. *           = difference in log-likelihood for the two models
.
. * Easiest if program output gives logL
. * Otherwise need to generate manually
.
. quietly poisson y $XLISTMODEL1
. scalar llf = e(ll)
. quietly poisson y $XLISTMODEL2
. scalar llg = e(ll)
. scalar LR = llf - llg
. di "LR = " LR " and llf = " llf " llg = " llg
LR = -7.3402698 and llf = -183.43146 llg = -176.09119
.
. * (1) NESTED MODELS
.
. * Not done here as not relevant for the example of this application.
.
. * (1A) Usual LR test if assume densities correctly specified.
.
. * (1B) If instead want robustified version then need to compute W
. * and use the weighted chi-square test.
. * This is not the appropriate test here,
. * but in 3(A) below W is computed and a weighted chi-square test used.
. * This code could be easily adapted to here.
.
. * (2) STRICTLY NON-NESTED MODELS
.
. * Not done here as not relevant for the example of this application.
. * Test uses LR/what ~ normal where what is computed in 3(B) below.
.
. * (3) OVERLAPPING MODELS
.
. * This is the relevant test here
. * First test whether the models are overlapping (even though here we know that they are)
. * Then do the test
.
. * (3A-1) Compute what^2
.
. * Calculate what^2
. * = (1/N)*Sum_i[ln(fy1_i/gy2_i)]^2 - [(1/N)*Sum_i ln(fy1_i/gy2_i)]^2
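The two ingredients defined in the comments above, LR and what^2, depend only on the per-observation log densities under the two models. A minimal Python sketch follows; the function names are hypothetical. The division by sqrt(N) in the normal statistic is the normalization in Vuong (1989), stated here as an assumption because the comment in the log writes the ratio simply as LR/what.

```python
import math


def vuong_lr_and_variance(lnf, lng):
    """Return LR = Sum_i ln(f_i/g_i) and what^2, the sample variance
    (divisor N) of the per-observation log density ratios."""
    d = [a - b for a, b in zip(lnf, lng)]
    n = len(d)
    lr = sum(d)
    mean_d = lr / n
    what_sq = sum(v * v for v in d) / n - mean_d ** 2
    return lr, what_sq


def vuong_z(lnf, lng):
    """LR / (sqrt(N) * what), compared with N[0,1] in the non-nested case."""
    lr, what_sq = vuong_lr_and_variance(lnf, lng)
    return lr / (math.sqrt(len(lnf)) * math.sqrt(what_sq))
```

Here `lnf` and `lng` are lists of log densities evaluated at each observation under model f and model g respectively, which for the Poisson can be built from y and the fitted means.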
. *** Display results given in second last row of Table 8.2 page 284
.
. di "LR = " LR " and LRcheck = " LRcheck
LR = -7.3402698 and LRcheck = -7.3402702
.
. * (3A-2) Find the critical value by first finding W, then its eigenvalues lamda, then simulating
.
. * Calculate estimate of the W matrix on page ?? of Vuong.
. * (a) Can estimate Af = E[d2lnf(y)/dbdb'] as inverse of usual ML variance matrix
. * (b) Since the robust ML variance matrix is V = Ainv*B*Ainv
. * can estimate Bf = -E[dlnf(y)/dbxdlnf(y)/db'] by A*V*A where A is in (a)
. * (c) For Ag same as in part (a) except for model g
. * (d) For Bg same as in part (b) except for model g
. * (e) The only tricky bit is computation of Bfg
.
. gen one = 1
. * (a) Af
. quietly poisson y one $XLISTMODEL1, noconstant
. matrix Af = syminv(e(V))
. * (b) Bf
. quietly poisson y one $XLISTMODEL1, noconstant robust
. * robust gives Ainv*B*Ainv so pre and post multiply by A gives B
. * Also make adjustment as Stata divides by (_N-1). Here use _N.
. matrix Bf = Af*e(V)*Af*(_N-1)/_N
. * (c) Ag
. quietly poisson y one $XLISTMODEL2, noconstant
. matrix Ag = syminv(e(V))
. * (d) Bg
. quietly poisson y one $XLISTMODEL2, noconstant robust
. matrix Bg = Ag*e(V)*Ag*(_N-1)/_N
.
. * (e) Bfg requires more specialized code peculiar to this example
. * For Poisson dlnf(y)/db = Sum_i (y_i - mu_i)*x_i
. * so Bfg = (1/N)*Sum_i [(y_i - muf_i)*xf_i]*[(y_i - mug_i)*xg_i]'
. * For model 1 x is intercept and x2 (global XLISTMODEL1 x2)
. gen bf1 = (y - yhatf)      /* yhatf saved earlier = y - muf */
. gen bf2 = (y - yhatf)*x2
. * For model 2 x is intercept, x3 and x3sq (global XLISTMODEL2 x3 x3sq)
. gen bg1 = (y - yhatg)      /* yhatg saved earlier = y - mug */
              y:          y:          y:          y:          y:
             one          x2         one          x3        x3sq
y:one   1.5571072   .01745302   1.3738479   .03868485   -.1702893
 y:x2   .05110494   1.4484966   .61074273   .07847014  -.15039712
  bg1   1.1488275    .1064062   1.6030095    .0647251  -.18944561
  bg2   .39558125   .08428705   .20709641   1.0650899  -.05677421
  bg3   1.1180355   -.0564763   .19914593   .07617139   .90718177
.
. * Calculate the eigenvalues of W
. matrix eigenvalues reigvalW ceigvalW = W
. * Real eigenvalues
. matrix list reigvalW
reigvalW[1,5]
              y:          y:          y:          y:          y:
             one          x2         one          x3        x3sq
real   2.7511946   .29082285   1.4750881   1.0021719   1.0616075
. * Complex eigenvalues - hopefully none
. matrix list ceigvalW
ceigvalW[1,5]
y: y: y: y: y:
one x2 one x3 x3sq
complex 0 0 0 0 0
.
. * This gives the vector lamda of eigenvalues of W
. matrix lamda = reigvalW
. scalar l1 = lamda[1,1]
. scalar l2 = lamda[1,2]
. scalar l3 = lamda[1,3]
. scalar l4 = lamda[1,4]
. scalar l5 = lamda[1,5]
.
. * Now obtain the p-value and critical value at level 0.05
. preserve
. * Obtain the 5 percent critical value by simulating 10000 draws from
. * M_p+q(lamda) = Sum_j lamda_j*z_j^2 where z_j are N[0,1] so z_j^2 are chi-squared(1)
. set seed 10101
. set obs 10000
obs was 100, now 10000
. gen randomdraw = l1*invnorm(uniform())^2 + l2*invnorm(uniform())^2 + /*
> */ l3*invnorm(uniform())^2 + l4*invnorm(uniform())^2 + l5*invnorm(uniform())^2
. gen indicator = Nwhatsq >= randomdraw
. quietly sum indicator
. di "p-value for the Omegahatsq test = " 1-r(mean)
p-value for the Omegahatsq test = 0
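The simulation step above draws from a weighted sum of independent chi-squared(1) variables, with weights given by the eigenvalues of W. The same idea can be sketched in Python. The function names are hypothetical, and the random number generator differs from Stata's, so individual draws will not match the log even with the same seed value.

```python
import random


def simulate_weighted_chisq(lamda, draws=10000, seed=10101):
    """Draw from Sum_j lamda_j * z_j^2 with z_j independent N[0,1]."""
    rng = random.Random(seed)
    return [sum(l * rng.gauss(0.0, 1.0) ** 2 for l in lamda)
            for _ in range(draws)]


def simulated_p_value(statistic, sims):
    """Share of simulated draws that exceed the observed statistic."""
    return sum(1 for s in sims if s > statistic) / len(sims)


def simulated_critical_value(sims, level=0.05):
    """Empirical upper-tail critical value at the given level."""
    ordered = sorted(sims)
    index = min(int((1 - level) * len(ordered)), len(ordered) - 1)
    return ordered[index]


# Real eigenvalues of W copied from the log.
lamda = [2.7511946, 0.29082285, 1.4750881, 1.0021719, 1.0616075]
sims = simulate_weighted_chisq(lamda)
```

Since E[z^2] = 1, the simulated mean should be close to the sum of the eigenvalues (about 6.58), which gives a quick sanity check on the draws. The p-value definition matches the log, where `1-r(mean)` is the share of draws strictly above the statistic.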
. sum randomdraw, detail
                         randomdraw
-------------------------------------------------------------
      Percentiles      Smallest
 1%     .6438425       .0756691
 5%     1.286375       .1250253
10%     1.850972       .1326376       Obs               10000
25%     3.137835       .1402145       Sum of Wgt.       10000

50%     5.359223                      Mean           6.614841
                        Largest       Std. Dev.       4.90562
75%     8.751276       38.32291
90%      12.8871       38.75208       Variance       24.06511
95%     16.10237       40.94431       Skewness       1.733549
99%     23.85304       44.08449       Kurtosis       7.514808
. gen x3 = invnorm(uniform())
.
. * Generate y
. gen mupoiss = exp(b1+b2*x2+b3*x3)
. * The next requires Stata add-on. In Stata: search rndpoix
. rndpoix(mupoiss)
( Generating ......... )
Variable xp created.
. gen y = xp
. sum
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
          x2 |       100    .0053689    1.000686  -2.173506   2.106561
          x3 |       100   -.0235884    1.024207  -2.857666   2.149822
     mupoiss |       100    2.020511    1.400564   .3380426   7.029678
          xp |       100        1.92    1.835013          0          8
           y |       100        1.92    1.835013          0          8
.
. * Write data to a text (ascii) file so can use with programs other than Stata
. outfile y x2 x3 using mma08p3diagnostics.asc, replace
.
. ********* SETUP FOR THIS PROGRAM **********
.
. * Change this if want different regressors
. gen x3sq = x3*x3
. * global XLIST x3       /* Model 1 */
. global XLIST x3 x3sq    /* Model 2 */
.
. ********* R-SQUARED (reported in Table 8.3 p.291) **********
.
. * The following code can be changed to different models than poisson
. * For RsqRES, RsqEXP and RsqCOR need
. * y dependent variable
. * yhat predicted value of dependent variable
. * For RsqWRSS additionally need
. * sigmasq predicted variance of dependent variable
. * For RsqRG need log density evaluated at values given below
.
. * Obtain exp(x'b) Will vary with the model
. poisson y $XLIST
Iteration 0:   log likelihood = -176.09611
                                                  Number of obs   =        100
                                                  LR chi2(2)      =      30.96
                                                  Prob > chi2     =     0.0000
Log likelihood = -176.09119                       Pseudo R2       =     0.0808
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x3 |   .3588412     .07035     5.10   0.000     .2209578    .4967245
        x3sq |   .0912999   .0514311     1.78   0.076    -.0095032    .1921029
       _cons |    .492656   .0958903     5.14   0.000     .3047144    .6805975
------------------------------------------------------------------------------
. predict yhat
(option n assumed; predicted number of events)
. scalar dof = e(N)-e(k)
.
. * RsqRES and RsqEXP are R-squared from sums of squares
. * First get TSS, ESS and RSS
. egen ybar = mean(y)
. gen ylessybarsq = (y - ybar)^2
. quietly sum ylessybarsq
. scalar totalss = r(mean)
. gen yhatlessybarsq = (yhat - ybar)^2
. quietly sum yhatlessybarsq
. scalar explainedss = r(mean)
. gen residualsq = (y - yhat)^2
. quietly sum residualsq
. scalar residualss = r(mean)
. * Second compute the R-squareds
. scalar sereg = sqrt(residualss/dof)
. scalar RsqRES = 1 - residualss/totalss
. scalar RsqEXP = explainedss/totalss
.
. * RsqCOR uses sample correlation
. quietly correlate y yhat
. scalar RsqCOR = r(rho)^2
.
. di "standard error of regression: " sereg
standard error of regression: .16620308
. di "totalss: " totalss _n "explainedss: " explainedss _n "residualss: " residualss
totalss: 3.3336
explainedss: .69556676
residualss: 2.6794761
. di "RsqRES: " RsqRES _n "RsqEXP: " RsqEXP _n "RsqCOR: " RsqCOR
RsqRES: .19622149
RsqEXP: .20865333
RsqCOR: .19640666
.
. * RsqWRSS uses weighted sums of squares
. * First generate estimated variance of y
. * Here for Poisson use fact that variance = mean
. gen sigmasq = yhat
. gen weightedylessybarsq = ((y - ybar)^2) / sigmasq
. quietly sum weightedylessybarsq
. scalar weightedtotalss = r(mean)
. gen weightedresidualsq = ((y - yhat)^2) / sigmasq
. quietly sum weightedresidualsq
. scalar weightedresidualss = r(mean)
. scalar RsqWRSS = 1 - weightedresidualss/weightedtotalss
. di "RsqWRSS: " RsqWRSS
RsqWRSS: .16945018
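The four R-squared measures computed so far need only y and the fitted mean yhat. The Python sketch below collects them in one hypothetical function for illustration. It uses sums where the log uses means, which leaves every ratio unchanged, and it assumes the Poisson variance function sigmasq = yhat for the weighted version.

```python
def r_squared_measures(y, yhat):
    """RsqRES, RsqEXP, RsqCOR and RsqWRSS for fitted means yhat > 0."""
    n = len(y)
    ybar = sum(y) / n
    fbar = sum(yhat) / n
    tss = sum((yi - ybar) ** 2 for yi in y)
    ess = sum((fi - ybar) ** 2 for fi in yhat)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, yhat))
    # Squared sample correlation between y and yhat.
    cov = sum((yi - ybar) * (fi - fbar) for yi, fi in zip(y, yhat))
    var_f = sum((fi - fbar) ** 2 for fi in yhat)
    # Weighted sums of squares with weights 1/sigmasq, where sigmasq = yhat.
    wtss = sum((yi - ybar) ** 2 / fi for yi, fi in zip(y, yhat))
    wrss = sum((yi - fi) ** 2 / fi for yi, fi in zip(y, yhat))
    return {
        "RsqRES": 1.0 - rss / tss,
        "RsqEXP": ess / tss,
        "RsqCOR": cov ** 2 / (tss * var_f),
        "RsqWRSS": 1.0 - wrss / wtss,
    }


# Small made-up example, not the data used in the log.
example = r_squared_measures([0, 1, 2, 3, 4], [0.5, 1.0, 2.0, 3.0, 3.5])
```

RsqRES and RsqEXP need not coincide in nonlinear models because the total sum of squares no longer splits exactly into explained and residual parts. The log shows this with .196 against .209.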
.
. * RsqRG is from ML. Difficult to generalize beyond LEF models.
. * Need
. * lnL_fit log-likelihood at fitted values (the usual)
. * lnL_0 log-likelihood at intercept only
. * lnL_max log-likelihood at best fit
. quietly poisson y $XLIST
        ybar |       100        1.92           0       1.92       1.92
 ylessybarsq |       100      3.3336    5.966374      .0064    36.9664
yhatlessyb~q |       100    .6955668    1.572256   4.82e-06   12.09783
-------------+--------------------------------------------------------
  residualsq |       100    2.679476    4.830379   .0000825   36.93972
     sigmasq |       100        1.92     .838208   1.150405   5.398193
weightedyl~q |       100    1.681324    2.560112   .0018502   19.23135
weightedre~q |       100    1.396423    2.424518   .0000276   19.21747
        ylny |       100     2.15694     3.48234          0   16.63553
-------------+--------------------------------------------------------
   lnfyatmax |       100    -1.01124    .6233793  -1.969071          0
. poisson y $XLIST /* Stata Rsq = RsqQ */
Iteration 0: log likelihood = -176.09611
Iteration 1: log likelihood = -176.09119
Iteration 2: log likelihood = -176.09119
Poisson regression                                Number of obs   =        100
                                                  LR chi2(2)      =      30.96
                                                  Prob > chi2     =     0.0000
Log likelihood = -176.09119                       Pseudo R2       =     0.0808
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x3 |   .3588412     .07035     5.10   0.000     .2209578    .4967245
        x3sq |   .0912999   .0514311     1.78   0.076    -.0095032    .1921029
       _cons |    .492656   .0958903     5.14   0.000     .3047144    .6805975
------------------------------------------------------------------------------
.
. *** The following results are for Model 2 in Table 8.3 p.291
. *** For model 1 R-squareds need to rerun with only x3 as regressor
. di "standard error of regression: " sereg
standard error of regression: .16620308
. di "RsqRES: " RsqRES _n "RsqEXP: " RsqEXP _n "RsqCOR: " RsqCOR
RsqRES: .19622149
RsqEXP: .20865333
RsqCOR: .19640666
. di "RsqWRSS: " RsqWRSS _n "RsqRG: " RsqRG _n "RsqQ: " RsqQ
RsqWRSS: .16945018
RsqRG: .17115358
RsqQ: .08080754
.
. ********* RESIDUAL ANALYSIS (text bottom p.290 to top p.291) **********
.
. * Assume that from earlier have yhat
.
. * raw residual
. gen raw = y - yhat
. gen sigma = sqrt(yhat)
. gen Pearson = (y - yhat)/sigma
. * Note that earlier defined ylny = 0 if y=0 and = yln(y) otherwise
. gen deviance = sign(y-yhat)*sqrt(2*(-y+ylny)-2*(-yhat+y*ln(yhat)))
.
. *** The following are results reported in text bottom p.290 to top p.291
. sum raw Pearson deviance
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
         raw |       100   -2.38e-09    1.645157  -2.993904   6.077806
     Pearson |       100   -.0014455    1.187656  -1.498094   4.383774
    deviance |       100   -.2103819    1.212345  -2.016939   3.264961
. corr raw Pearson deviance
(obs=100)
             |      raw  Pearson deviance
-------------+---------------------------
         raw |   1.0000
     Pearson |   0.9852   1.0000
    deviance |   0.9625   0.9818   1.0000
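The three residuals summarized above can be written as one function of y and the fitted mean. This Python sketch, with a hypothetical function name, mirrors the Stata expressions in the log, including the convention that y*ln(y) is set to 0 when y = 0.

```python
import math


def poisson_residuals(y, mu):
    """Return (raw, Pearson, deviance) residuals for a Poisson fit, mu > 0."""
    raw = y - mu
    pearson = raw / math.sqrt(mu)            # the Poisson variance equals mu
    ylny = 0.0 if y == 0 else y * math.log(y)
    # Twice the gap between the log density at the best fit (mu = y) and at mu.
    dev_sq = 2.0 * (ylny - y) - 2.0 * (y * math.log(mu) - mu)
    sign = (raw > 0) - (raw < 0)
    # max() guards against tiny negative values caused by rounding.
    deviance = sign * math.sqrt(max(dev_sq, 0.0))
    return raw, pearson, deviance
```

For example, an observation with y = 0 and fitted mean 1 has raw and Pearson residuals of -1 and a deviance residual of -sqrt(2).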
------------------------------------------------------------------------------
       log:  c:\Imbook\bwebpage\Section2\mma09p1np.txt
log type: text
opened on: 17 May 2005, 14:16:51
.
. ********** OVERVIEW OF MMA09P1NP.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 9.2 p.295-297
. * Nonparametric density estimation and nonparametric regression using actual data.
.
. * (1) Histogram: Figure 9.1 in chapter 9.2.1 (ch9hist)
. * (2) Kernel density estimate as bandwidth varies: Figure 9.2 in chapter 9.2.1 (ch9kd1)
. * (3) Kernel density estimate as kernel varies: Figure 9.4 in chapter 9.3.4 (ch9kdensu1)
. * (4) Lowess regression: Figure 9.3 in chapter 9.4.3 (ch9ksm1)
. * (5) Extra: Nearest neighbours regression: using Lowess and using add-on knnreg
. * (6) Extra: Kernel regression: using add-on kernreg
.
. * using data on earnings and education (see below)
.
. * NOTE: This particular program uses version 8.2 rather than 8.0
. *   For kernel density Stata uses an alternative formulation of Epanechnikov
. *   To follow book and e.g. Hardle (1990) use epan2 rather than epan
. *   epan = epan2 if epan bandwidth is epan2 bandwidth divided by sqrt(5)
. *   where kernel epan2 is an update to Stata version 8.2
.
. * To run this program you need file
. * psidf3050.dat
. * in your directory
.
. * To do (5) and (6) you need Stata add-ons knnreg and kernreg
. * In Stata give command search knnreg and search kernreg
.
. * See also mma9p2npmore.do for more on nonparametric regression (Figures 9.5-9.7)
.
. ********** SETUP
.
. di "mma09p1np.do Cameron and Trivedi: Stata nonparametrics with wages and education"
mma09p1np.do Cameron and Trivedi: Stata nonparametrics with wages and education
. set more off
. version 8
. set scheme s1mono /* Graphics scheme */
.
. ********** DATA DESCRIPTION
.*
. * The original data are from the PSID Individual Level Final Release 1993 data
. * From www.isr.umich.edu/src/psid then choose Data Center
. * 4856 observations on 9 variables for Females 30 to 50 years
.
. * Fixed width data
. * intnum 1-4 V30001="1968 INTERVIEW NUMBER"
. * persnum 5-7 V30002="PERSON NUMBER"
. * age      8-9   V30809="AGE OF INDIVIDUAL 93"
. * educatn  10-11 V30820="G90 HIGHEST GRADE COMPLETED 93"
. * earnings 12-17 V30821="TOTAL LABOR INCOME 93"
. * hours    18-21 V30823="1992 ANNUAL WORK HOURS 93"
. * sex      22    V32000="SEX OF INDIVIDUAL"
. * kids     23-24 V32022="# LIVE BIRTHS TO THIS INDIVIDUAL"
. * [NOTE: DO NOT USE THE kids VARIABLE AS IT IS NUMBER OF BIRTHS
. *  NOT NUMBER OF KIDS CURRENTLY IN HOUSEHOLD]
. * married 25 V32049="LAST KNOWN MARITAL STATUS"
.
. ********** READ DATA **********
.
. * Data are fixed format so use infix
. infix intnum 1-4 persnum 5-7 age 8-9 educatn 10-11 earnings 12-17 /*
> */ hours 18-21 sex 22 kids 23-24 married 25 using psidf3050.dat
(4856 observations read)
. summarize
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
      intnum |      4856    4598.101    2761.971          4       9306
     persnum |      4856    59.21355    79.74856          1        205
         age |      4856    38.46293    5.595116         30         50
     educatn |      4855    16.37714     18.4495          0         99
    earnings |      4856    14244.51    15985.45          0     240000
-------------+--------------------------------------------------------
       hours |      4856    1235.335    947.1758          0       5160
         sex |      4856           2           0          2          2
        kids |      4856     4.48126    14.88786          0         99
     married |      4856    1.920717    1.504848          1          9
.
. ********** MISSING VALUES, DATA TRANSFORMATIONS and SAMPLE SELECTION
.
. * For Highest grade codes the missing codes are 98 DK and 99 NA and 0 inappropriate
. * Here treat these as missing
. replace educatn = . if (educatn==0 | educatn==98 | educatn==99)
(290 real changes made, 290 to missing)
.
. * For marital status the codes are
. * 1 married; 2 Never married; 3 Widowed; 4 Divorced, annulment;
. * 5 Separated; 8 NA / DK; 9 No histories 85-93
. * Recode 2-5 as not married and treat 8 and 9 as missing
. replace married = . if (married==8 | married==9)
(52 real changes made, 52 to missing)
. replace married = 0 if married > 1
(1785 real changes made)
.
. * For kids the missing codes are 98 DK/NA and 99 no birth history
. replace kids = . if (kids==98 | kids==99)
(118 real changes made, 118 to missing)
. * But do not use these data as it is number of births
. * not number of kids currently in household
. * So I drop kids
. drop kids
.
. * Work with positive earnings only
. drop if earnings==0
(1204 observations deleted)
. * Topcode women with very high earnings
. replace earnings=100000 if earnings>100000
(11 real changes made)
. * Create log hourly wage
. gen hwage = earnings/hours
. gen lnhwage = ln(hwage)
.
. * Work with age 36 and nonmissing education data
. keep if age == 36
(3468 observations deleted)
. drop if educatn == .
(7 observations deleted)
. summarize
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
      intnum |       177    4699.853    2765.081         14       9240
     persnum |       177    59.53672    79.73001          1        188
         age |       177          36           0         36         36
     educatn |       177    12.58757    2.841347          3         17
    earnings |       177    17470.55    13513.56         87      70000
-------------+--------------------------------------------------------
       hours |       177    1506.401    698.4145          8       3160
         sex |       177           2           0          2          2
     married |       177    .7457627    .4366669          0          1
       hwage |       177    12.71631    16.58889   .6837607        175
     lnhwage |       177    2.198163    .8281614  -.3801473   5.164786
.
. * Write data to a text (ascii) file so can use with programs other than Stata
. outfile intnum persnum age educatn earnings hours sex married hwage /*
> */ lnhwage using mma09p1np.asc, replace
.
. ********* ANALYSIS: (1)-(3) NONPARAMETRIC DENSITY ESTIMATES
.
. set scheme s1mono
.
. * Here give bin width for histogram and kdensity
.
. * Calculate Silverman's plugin estimate of optimal bandwidth in (9.13)
. * with delta given in Table 9.1 for Epanechnikov kernel
. quietly sum lnhwage, detail
. global sadj = min(r(sd),(r(p75)-r(p25))/1.349)
. di "sadj: " $sadj " iqr/1349: " (r(p75)-r(p25))/1.349 " stdev: " r(sd)
sadj: .65488184 iqr/1349: .65488184 stdev: .82816143
. global bwepan2 = 1.3643*1.7188*$sadj/(r(N)^0.2)
. di "Bandwidth: " $bwepan2
Bandwidth: .54538542
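The plug-in bandwidth just computed is a one-line formula: 1.3643 times the kernel-specific constant delta, times the smaller of the standard deviation and iqr/1.349, divided by N^0.2. A Python sketch with a hypothetical function name reproduces the value in the log. The inputs are numbers printed above, with the interquartile range backed out from the displayed iqr/1.349 because the log does not print the quartiles themselves.

```python
def silverman_bandwidth(sd, iqr, n, delta):
    """Plug-in bandwidth 1.3643 * delta * min(sd, iqr/1.349) / n^0.2."""
    sadj = min(sd, iqr / 1.349)
    return 1.3643 * delta * sadj / n ** 0.2


sd = 0.82816143            # standard deviation of lnhwage, from the log
iqr = 0.65488184 * 1.349   # backed out from the displayed iqr/1.349
n = 177

bw_epan2 = silverman_bandwidth(sd, iqr, n, delta=1.7188)  # Epanechnikov
bw_gauss = silverman_bandwidth(sd, iqr, n, delta=0.7764)  # Gaussian
```

Changing only `delta` gives the bandwidth for another kernel, so the Gaussian constant 0.7764 used elsewhere in this program yields a bandwidth of about 0.246.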
.
. * HISTOGRAM ONLY - Figure 9.1
. graph twoway (histogram lnhwage, bin(20) bcolor(*.2)), /*
> */ scale (1.2) plotregion(style(none)) /*
> */ title("Histogram for Log Wage") /*
> */ xtitle("Log Hourly Wage", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Density", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend(pos(10) ring(0) col(1)) legend(size(small)) /*
> */ legend( label(1 "Histogram") label(2 "Kernel"))
. graph save ch9hist, replace
(file ch9hist.gph saved)
. graph export ch9hist.wmf, replace
(file c:\Imbook\bwebpage\Section2\ch9hist.wmf written in Windows Metafile format)
.
. * COMBINED HISTOGRAM AND KERNEL DENSITY ESTIMATE
. graph twoway (histogram lnhwage, bin(20) bcolor(*.2)) /*
> */ (kdensity lnhwage, width($bwepan2) epan2 clstyle(p1)), /*
> */ title("Histogram and Kernel Density for Log Wage") /*
> */ caption("Note: Kernel is Epanechnikov with bandwidth 0.55")
.
. * KERNEL DENSITY ESTIMATE FOR 3 BANDWIDTHS - Figure 9.2
. global bwonehalf = 0.5*$bwepan2
. global btwotimes = 2*$bwepan2
. graph twoway (kdensity lnhwage, width($bwonehalf) epan2 clstyle(p2)) /*
> */ (kdensity lnhwage, width($bwepan2) epan2 clstyle(p1)) /*
> */ (kdensity lnhwage, width($btwotimes) epan2 clstyle(p3)), /*
> */ scale (1.2) plotregion(style(none)) /*
> */ title("Density Estimates as Bandwidth Varies") /*
> */ xtitle("Log Hourly Wage", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Kernel density estimates", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend(pos(1) ring(0) col(1)) legend(size(small)) /*
> */ legend( label(1 "One-half plug-in") label(2 "Plug-in") /*
> */ label(3 "Two times plug-in"))
. graph save ch9kd1, replace
(file ch9kd1.gph saved)
. graph export ch9kd1.wmf, replace
(file c:\Imbook\bwebpage\Section2\ch9kd1.wmf written in Windows Metafile format)
.
. * KERNEL DENSITY ESTIMATE FOR 4 DIFFERENT KERNELS - Figure 9.4
. * Calculate Silverman's plugin optimal bandwidths using (9.13)
. * with delta given in Table 9.1 for the different kernels
.
. * Use sadj calculated earlier for Epanechnikov
. global bwgauss = 1.3643*0.7764*$sadj/(_N^0.2)
. global bwbiweight = 1.3643*2.0362*$sadj/(_N^0.2)
. global bwrectang = 0.5*1.3643*1.3510*$sadj/(_N^0.2)
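The bandwidths above share the same scale term, so they differ only through the kernel-specific constant delta. The following is a minimal sketch of that calculation, not the authors' code. It assumes (the definition is not shown in this part of the log) that sadj is the smaller of the sample standard deviation and the interquartile range divided by 1.349. The scalar names sadj_chk and bwgauss_chk are hypothetical.

```stata
* Sketch only: recompute one plug-in bandwidth from first principles.
* Assumption: sadj = min(std. dev., interquartile range/1.349).
quietly summarize lnhwage, detail
scalar sadj_chk = min(r(sd), (r(p75) - r(p25))/1.349)
* Gaussian kernel constant 0.7764, as used in the log above
scalar bwgauss_chk = 1.3643*0.7764*sadj_chk/(_N^0.2)
display "Gaussian plug-in bandwidth (check): " bwgauss_chk
```

Because only delta changes, the ratio of two bandwidths equals the ratio of their constants. For example, 0.7764/2.0362 is about 0.381, which agrees with the ratio of the logged Gaussian and quartic bandwidths (.24635632/.64609832).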
. di "Usual Epanechnikov (epan2): " $bwepan2
Usual Epanechnikov (epan2): .54538542
. di "Gaussian: " $bwgauss
Gaussian: .24635632
. di "Quartic or biweight: " $bwbiweight
Quartic or biweight: .64609832
. di "Uniform or rectangular: " $bwrectang
Uniform or rectangular: .21434015
. graph twoway (kdensity lnhwage, width($bwepan2) epan2) /*
> */ (kdensity lnhwage, width($bwgauss) gauss) /*
> */ (kdensity lnhwage, width($bwbiweight) biweight) /*
> */ (kdensity lnhwage, width($bwrectang) rectangle), /*
> */ scale (1.2) plotregion(style(none)) /*
> */ title("Density Estimates as Kernel Varies") /*
> */ xtitle("Log Hourly Wage", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Kernel density estimates", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend(pos(3) ring(0) col(1)) legend(size(small)) /*
> */ legend( label(1 "Epanechnikov (h=0.545)") label(2 "Gaussian (h=0.246)") /*
> */ label(3 "Quartic (h=0.646)") label(4 "Uniform (h=0.214)"))
. graph save ch9kdensu1, replace
(file ch9kdensu1.gph saved)
. graph export ch9kdensu1.wmf, replace
(file c:\Imbook\bwebpage\Section2\ch9kdensu1.wmf written in Windows Metafile format)
.
. * SHOW THAT STATA EPANECHNIKOV = USUAL EPANECHNIKOV
. * once the usual Epanechnikov bandwidth is divided by sqrt(5).
. * (Pagan and Ullah (1999, p.28) give the formulae.)
. global bwepan = $bwepan2/sqrt(5)
. graph twoway (kdensity lnhwage, width($bwepan2) epan2) /*
> */ (kdensity lnhwage, width($bwepan) epan), /*
> */ title("Epan = Epan2 if bandwidth adjusted") /*
> */ legend( label(1 "Usual Epanechnikov") label(2 "Stata Epanechnikov"))
.
.
. ********* ANALYSIS: (4) LOWESS NONPARAMETRIC REGRESSION ESTIMATES
.
. * LOWESS WITH DEFAULT BANDWIDTH of 0.8
. lowess lnhwage educatn
.
. * LOWESS REGRESSION WITH BANDWIDTHS of 0.1, 0.4 and 0.8 - Figure 9.3
. graph twoway (scatter lnhwage educatn, msize(medsmall) msymbol(o)) /*
> */ (lowess lnhwage educatn, bwidth(0.8) clstyle(p2)) /*
> */ (lowess lnhwage educatn, bwidth(0.4) clstyle(p1)) /*
> */ (lowess lnhwage educatn, bwidth(0.1) clstyle(p3)), /*
> */ scale (1.2) plotregion(style(none)) /*
> */ title("Nonparametric Regression as Bandwidth Varies") /*
> */ xtitle("Years of Schooling", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Log Hourly Wage", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend(pos(12) ring(0) col(2)) legend(size(small)) /*
> */ legend( label(1 "Actual data") label(2 "Bandwidth h=0.8") /*
> */
(obs=125)
             | knnlow~d knnreg~d
-------------+------------------
knnlowessp~d |   1.0000
  knnregpred |   1.0000   1.0000
.
. ********* ANALYSIS: (6) EXTRA: KERNEL NONPARAMETRIC REGRESSION
.
. * KERNEL REGRESSION
. * Kercode 1 = Uniform; 2 = Triangle; 3 = Epanechnikov; 4 = Quartic (Biweight);
. *         5 = Triweight; 6 = Gaussian; 7 = Cosinus
. * bwidth(#) defines width of the weight function window around each grid point.
. * npoint(#) specifies the number of equally spaced grid points over range of x.
. * Here bwidth(3) gives e.g. positive weight from x=4 to x=10 if current x0=7
. kernreg lnhwage educatn, bwidth(3) kercode(3) npoint(100) ylabel gen(kernregpred1 xkernreg)
. graph twoway (lowess lnhwage educatn, bwidth(0.5) clstyle(p2)) /*
> */ (line kernregpred xkernreg, clstyle(p1)), /*
> */ title("Lowess versus kernel regression") /*
> */ legend( label(1 "Lowess") label(2 "Kernreg"))
.
. ********** CLOSE OUTPUT
. log close
log: c:\Imbook\bwebpage\Section2\mma09p1np.txt
log type: text
closed on: 17 May 2005, 14:17:05
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma09p2npmore.txt
log type: text
opened on: 17 May 2005, 14:17:35
.
. ********** OVERVIEW OF MMA09P2NPMORE.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 9.4-9.5 (pages 307-19)
. * More on nonparametric regression, including Figures 9.5 - 9.7
.
. * It provides
. * (1) Nonparametric regression
. *     k-nearest neighbors regression: Figure 9.5 in chapter 9.4.2 (ch9ksmma)
. *     Lowess regression: Figure 9.6 in chapter 9.4.3 (ch9ksmlowess)
. *     Kernel regression (using Stata add-on kernreg)
. * (2) Nonparametric derivative estimation
. *     Figure 9.7 in chapter 9.5.5 (ch9kderiv)
. * (3) Cross-validation - still incomplete
. * using generated data (see below)
.
. * See also mma09p1np.do for nonparametric density estimation and regression
.
. * This program uses free Stata add-on command kernreg
. * To obtain in Stata give command search kernreg
.
. ********** SETUP **********
.
. di "mma09p2npmore.do Cameron and Trivedi: Stata nonparametrics with generated data"
mma09p2npmore.do Cameron and Trivedi: Stata nonparametrics with generated data
. set more off
. version 8.0
. set scheme s1mono /* Graphics scheme */
.
. ********** GENERATE DATA **********
.
. * Model is y = 150 + 6.5*x - 0.15*x^2 + 0.001*x^3 + u
. * where u ~ N[0, 25^2]
. *   x = 1, 2, 3, ... , 100
. *   e ~ N[0, 2^2]
.
. set seed 10101
. set obs 100
obs was 0, now 100
. gen u = 25*invnorm(uniform())
. gen x = _n
. gen y = 150 + 6.5*x - 0.15*x^2 + 0.001*x^3 + u
. sum
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
           u |       100    2.809606    25.26291  -71.97334   73.59318
           x |       100        50.5    29.01149          1        100
           y |       100    228.5596    35.25377   132.2952   345.5873
.
. * Write data to a text (ascii) file so can use with programs other than Stata
. outfile y x using mma09p2npmore.asc, replace
.
. ******** PARAMETRIC REGRESSION **********
.
. * OLS regression on cubic polynomial
. gen xsquared = x^2
. gen xcubed = x^3
. reg y x xsquared xcubed
      Source |       SS       df       MS              Number of obs =     100
-------------+------------------------------           F(  3,    96) =   31.15
       Model |  60691.6801     3    20230.56           Prob > F      =  0.0000
    Residual |  62348.2994    96  649.461452           R-squared     =  0.4933
-------------+------------------------------           Adj R-squared =  0.4774
       Total |   123039.98    99  1242.82808           Root MSE      =  25.485

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   6.055295   .9033915     6.70   0.000     4.262077    7.848513
    xsquared |  -.1402283   .0207284    -6.77   0.000    -.1813738   -.0990828
      xcubed |   .0009492   .0001349     7.03   0.000     .0006814    .0012171
       _cons |   155.1521   10.58835    14.65   0.000     134.1344    176.1698
------------------------------------------------------------------------------
. predict ycubic
(option xb assumed; fitted values)
. summarize y ycubic
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
           y |       100    228.5596    35.25377   132.2952   345.5873
      ycubic |       100    228.5596    24.75979   161.0681   307.6293
.
. ******** (1) NONPARAMETRIC REGRESSION **********
.
. * K-NEAREST NEIGHBORS REGRESSION - FIGURE 9.5
. * ksm without options gives running mean = moving average = centered kNN
. * Here _N = 100 so bwidth = 0.05 gives 100*0.05 = 5 nearest neighbours
. graph twoway (scatter y x, msize(medsmall) msymbol(o)) /*
> */ (lowess y x, mean noweight bwidth(0.05) clstyle(p1)) /*
> */ (lfit y x, clstyle(p3)) /*
> */ (lowess y x, mean noweight bwidth(0.25) clstyle(p2)), /*
> */ scale (1.2) plotregion(style(none)) /*
> */ title("k-Nearest Neighbours Regression as k Varies") /*
>
>
>
>
>
         cv8 |       100    84655.13     34359.8   10940.13   162417.9
         py9 |       100    228.0408    8.046055   217.6967   243.0812
         cv9 |       100    84655.13     34359.8   10940.13   162417.9
        py10 |       100    228.0408    8.046055   217.6967   243.0812
-------------+--------------------------------------------------------
        cv10 |       100    84655.13     34359.8   10940.13   162417.9
        py11 |       100    228.0408    8.046055   217.6967   243.0812
        cv11 |       100    84655.13     34359.8   10940.13   162417.9
        py12 |       100    228.0408    8.046055   217.6967   243.0812
        cv12 |       100    84655.13     34359.8   10940.13   162417.9
-------------+--------------------------------------------------------
        py13 |       100    228.0408    8.046055   217.6967   243.0812
        cv13 |       100    84655.13     34359.8   10940.13   162417.9
        py14 |       100    228.0408    8.046055   217.6967   243.0812
        cv14 |       100    84655.13     34359.8   10940.13   162417.9
        py15 |       100    228.0408    8.046055   217.6967   243.0812
-------------+--------------------------------------------------------
        cv15 |       100    84655.13     34359.8   10940.13   162417.9
        py16 |       100    228.0408    8.046055   217.6967   243.0812
        cv16 |       100    84655.13     34359.8   10940.13   162417.9
        py17 |       100    228.0408    8.046055   217.6967   243.0812
        cv17 |       100    84655.13     34359.8   10940.13   162417.9
-------------+--------------------------------------------------------
        py18 |       100    228.0408    8.046055   217.6967   243.0812
        cv18 |       100    84655.13     34359.8   10940.13   162417.9
        py19 |       100    228.0408    8.046055   217.6967   243.0812
        cv19 |       100    84655.13     34359.8   10940.13   162417.9
        py20 |       100    228.0408    8.046055   217.6967   243.0812
-------------+--------------------------------------------------------
        cv20 |       100    84655.13     34359.8   10940.13   162417.9
        py21 |       100    228.0408    8.046055   217.6967   243.0812
        cv21 |       100    84655.13     34359.8   10940.13   162417.9
        py22 |       100    228.0408    8.046055   217.6967   243.0812
        cv22 |       100    84655.13     34359.8   10940.13   162417.9
-------------+--------------------------------------------------------
        py23 |       100    228.0408    8.046055   217.6967   243.0812
        cv23 |       100    84655.13     34359.8   10940.13   162417.9
        py24 |       100    228.0408    8.046055   217.6967   243.0812
        cv24 |       100    84655.13     34359.8   10940.13   162417.9
        py25 |       100    228.0408    8.046055   217.6967   243.0812
-------------+--------------------------------------------------------
        cv25 |       100    84655.13     34359.8   10940.13   162417.9
. * Then need to choose the `i' with minimum cv`i'
. * Problem here is that this gives e.g. $bw5 = 5 not 0.05
.
. ********** CLOSE OUTPUT
. log close
log: c:\Imbook\bwebpage\Section2\mma09p2npmore.txt
log type: text
closed on: 17 May 2005, 14:17:43
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma09p3kernels.txt
log type: text
opened on: 18 May 2005, 21:31:55
.
. ********** OVERVIEW OF MMA09P3KERNELS.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * This program plots different kernel regression functions
. * This is not included in the book
. * There is no data
.
. * Results:
. * Epanstata is similar to Gaussian kernel. Less peaked than Epanechnikov.
. * Triangular, Quartic, Triweight and Tricubic are similar,
. * and are more peaked than Epanechnikov
. * The fourth order kernels can take negative values.
.
. * NOTE: For kernel density Stata uses an alternative formulation of Epanechnikov
. *       To follow book and e.g. Hardle (1990) use epan2
. *       (available in Stata version 8.2) rather than epan
.
. ********** SETUP **********
.
. di "mma09p3kernels.do Cameron and Trivedi: Stata Kernel Functions"
mma09p3kernels.do Cameron and Trivedi: Stata Kernel Functions
. set more off
. version 8.0
. set scheme s1mono /* Graphics scheme */
.
. ********** GENERATE DATA **********
.
. * Graphs will be for z = -2.5 to 2.5 in increments of 0.02
. set obs 251
obs was 0, now 251
. gen z = -2.52 + 0.02*_n
.
. ********** CALCULATE THE KERNELS **********
.
. * Indicator for |z| < 1
. gen abszltone = 1
. replace abszltone = 0 if abs(z)>=1
(152 real changes made)
.
. gen kuniform = 0.5*abszltone
.
. gen ktriangular = (1 - abs(z))*abszltone
.
. * Stata calls the usual Epanechnikov kernel epan2
. gen kepanechnikov = (3/4)*(1 - z^2)*abszltone
.
. * Stata uses alternative epanechnikov
. gen abszltsqrtfive = 1
. replace abszltsqrtfive = 0 if abs(z)>=sqrt(5)
(28 real changes made)
. gen kepanstata = (3/4)*(1 - (z^2)/5)/sqrt(5)*abszltsqrtfive
.
. gen kquartic = (15/16)*((1 - z^2)^2)*abszltone
.
. gen ktriweight = (35/32)*((1 - z^2)^3)*abszltone
.
. gen ktricubic = (70/81)*((1 - (abs(z))^3)^3)*abszltone
.
. gen kgaussian = normden(z)
.
. gen k4thordergauss = (1/2)*(3-(z^2))*normden(z)
.
. * This is the optimal 4th order - Pagan and Ullah p.57
. gen k4thorderquartic = (15/32)*(3 - 10*z^2 + 7*z^4)*abszltone
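A quick way to check a kernel formula such as those above is to confirm that it integrates to about one over the grid. The lines below are a minimal sketch of that check, assuming the variables generated above are in memory and using the grid spacing of 0.02 stated in the program. The check itself is an illustration added here, not part of the authors' program.

```stata
* Sketch only: Riemann-sum check that a kernel integrates to about one.
* The grid for z has spacing 0.02, so integral ~ (sum of kernel values)*0.02
quietly summarize kquartic
display "Approximate integral of quartic kernel: " r(sum)*0.02
* The Gaussian kernel has tails beyond |z| = 2.5, so its sum falls short of one
quietly summarize kgaussian
display "Approximate integral of Gaussian kernel on [-2.5,2.5]: " r(sum)*0.02
```

The same check can be read off the summary table for this program: a mean of .1992032 over 251 grid points gives 251 x .1992032 x 0.02, which is 1.000 to three decimals.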
.
. sum
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
           z |       251           0    1.452033       -2.5        2.5
   abszltone |       251    .3944223    .4897027          0          1
    kuniform |       251    .1972112    .2448514          0         .5
 ktriangular |       251    .1992032    .3058094          0          1
kepanechni~v |       251    .1991833    .2831384          0        .75
-------------+--------------------------------------------------------
abszltsqrt~e |       251    .8884462    .3154457          0          1
  kepanstata |       251     .199203    .1175801          0   .3354102
    kquartic |       251    .1992032    .3209618          0      .9375
  ktriweight |       251    .1992032     .351183          0    1.09375
   ktricubic |       251    .1992032    .3191548          0   .8641976
-------------+--------------------------------------------------------
   kgaussian |       251    .1967985    .1323354   .0175283   .3989423
k4thorderg~s |       251    .2053453    .2297148  -.0327459   .5984134
k4thorderq~c |       251     .199253    .4584096  -.2676096    1.40625
.
. * Write data to a text (ascii) file so can use with programs other than Stata
. outfile z abszltone kuniform ktriangular kepanechnikov abszltsqrtfive /*
> */ kepanstata kquartic ktriweight ktricubic kgaussian /*
> */ k4thordergauss k4thorderquartic using mma09p3kernels.asc, replace
.
. ********** PLOT THE KERNEL FUNCTIONS **********
.
. * Epanstata is similar to Gaussian kernel. Less peaked than Epanechnikov
. graph twoway (line kuniform z) (line kepanechnikov z) (line kepanstata z) /*
> */ (line kgaussian z), title("Four standard kernel functions")
.
. * Triangular, Quartic, Triweight and Tricubic are similar
. * and are more peaked than Epanechnikov
. graph twoway (line ktriangular z) (line kquartic z) (line ktriweight z) /*
> */ (line ktricubic z), title("Four similar kernel functions")
.
. graph twoway (line k4thordergauss z) (line k4thorderquartic z), /*
> */ title("Two fourth order kernel functions")
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section2\mma09p3kernels.txt
log type: text
closed on: 18 May 2005, 21:32:00
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma10p1gradient.txt
log type: text
opened on: 17 May 2005, 14:21:11
.
. ********** OVERVIEW OF MMA10P1GRADIENT.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 10.2.4 page 338-9
. * Gradient Method Example (Newton-Raphson)
. * using artificial data
.
. ********** SETUP **********
.
. set more off
. version 8.0
. set scheme s1mono /* Used for graphs */
.
. ********** ANALYSIS: FIRST SIX ROUNDS OF NR **********
.
. * General algorithm is
. *   b_s+1 = b_s + A_s*g_s
.
. * For the example in section 10.2.4
. *   Q(b) = -(1/2N) * Sum_i {(y_i-exp(b))^2}
. *        = -(1/2N) * Sum_i {(y_i)^2 - 2*y_i*exp(b) + exp(b)^2}
. *        = ymean*exp(b) - 0.5*(exp(b))^2 - (1/2N) * Sum_i {(y_i)^2}
.
. * so the gradient vector (here a scalar) is
. *   g_s = dQ_s / db
. *       = (ymean - exp(b))*exp(b)
.
. * and, using the method of scoring variation of Newton-Raphson,
. * the weighting matrix (here a scalar) is
. *   A_s = Inv [ - E[d^2 Q_s / db^2 ] ]
. *       = Inv [ - E[(ymean - exp(b))*exp(b) - exp(b)*exp(b)] ]
. *       = Inv [ exp(2b) ]   since E[ymean - exp(b)] = 0
. *       = exp(-2b)
.
. * Data
. scalar ymean = 2.0
.
. * Starting value
. scalar b_1 = 0.0
.
. * First round
. scalar g_1 = (ymean - exp(b_1))*exp(b_1)
. scalar A_1 = exp(-2*b_1)
. scalar b_2 = b_1 + A_1*g_1
.
. * Second round
. scalar g_2 = (ymean - exp(b_2))*exp(b_2)
. scalar A_2 = exp(-2*b_2)
. scalar b_3 = b_2 + A_2*g_2
.
. * Third round
. scalar g_3 = (ymean - exp(b_3))*exp(b_3)
. scalar A_3 = exp(-2*b_3)
. scalar b_4 = b_3 + A_3*g_3
.
. * Fourth round
. scalar g_4 = (ymean - exp(b_4))*exp(b_4)
. scalar A_4 = exp(-2*b_4)
. scalar b_5 = b_4 + A_4*g_4
.
. * Fifth round
. scalar g_5 = (ymean - exp(b_5))*exp(b_5)
. scalar A_5 = exp(-2*b_5)
. scalar b_6 = b_5 + A_5*g_5
.
. * Sixth round
. scalar g_6 = (ymean - exp(b_6))*exp(b_6)
. scalar A_6 = exp(-2*b_6)
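The six rounds above repeat the same three assignments, so the recursion can also be written as a loop. The following is a minimal sketch under that reading, not the authors' code. The scalar names b_nr, g_nr and A_nr are hypothetical and are chosen so that they do not overwrite the scalars b_1 to b_6 defined above.

```stata
* Sketch only: the method-of-scoring recursion b_s+1 = b_s + A_s*g_s as a loop
scalar ymean_nr = 2.0
scalar b_nr = 0.0
forvalues s = 1/6 {
    * gradient and weighting scalar evaluated at the current b
    scalar g_nr = (ymean_nr - exp(b_nr))*exp(b_nr)
    scalar A_nr = exp(-2*b_nr)
    * update step
    scalar b_nr = b_nr + A_nr*g_nr
    display "After round `s': b = " b_nr
}
```

The gradient is zero when exp(b) equals ymean, so with ymean = 2 the iterates should settle near ln(2), which is about 0.693.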
.
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 11.3 pages 366-368
. * Bootstrap applied to exponential regression model
. * Provides
. * (1) Bootstrap distribution of beta and t-statistic (Table 11.1)
. * (2) Various statistics from bootstrap (pages 366-8)
. * (3) Bootstrap density of the t-statistic (Figure 11.1)
. * using generated data (see below)
.
. * Note: To speed up program reduce breps - the number of bootstrap replications
. *       But final program should use many replications
.
. * Note: This program uses ereg which is an old Stata command
. *       superseded by streg, dist(exp)
.
. * Note: For bootstrap see also mma07p4boot.do
. *       which has additional commands / ways to bootstrap
.
. ********** SETUP **********
.
. set more off
. version 8
.
. ********** GENERATE DATA **********
.
. * Model is y ~ exponential(exp(a + bx + cz))
. * where x and z are joint normal (1,1,0.1,0.1,0.5)
. * i.e. means 0.1 and 0.1
. *      sd's 0.1 and 0.1 and correln 0.5 (so correln^2 = .25)
. *      variances 0.01 and 0.01 and covariance 0.005
.
. * Generate data from joint normal
. * Use fact that x is N(0.1, 0.1^2)
. *      and z | x is N(0.1 + .05/.1*(x - .1), .01 x .75 = .0075)
. *      so that st dev = sqrt(0.0075) = 0.0866025
.
. set obs 50
obs was 0, now 50
. set seed 10001
. * Generate x and z bivariate normal
. scalar mu1=0.1
. scalar mu2=0.1
. scalar sig1=0.1
. scalar sig2=0.1
. scalar rho=0.5
. scalar sig12=rho*sig1*sig2
. gen x = mu1 + sig1*invnorm(uniform())
. gen muzgivx = mu2+(sig12/(sig1*sig1))*(x-mu1)
. gen sigzgivx = sqrt(sig2*sig2*(1-rho*rho))
. gen z = muzgivx + sigzgivx*invnorm(uniform())
. * To generate y exponential with mean mu=Ey use
. * Integral 0 to a of (1/mu)exp(-x/mu) dx by change of variables
. * = Integral 0 to a/mu of exp(-t)dt
. * = incomplete gamma function P(1,a/mu) in the terminology of Stata
. gen Ey = exp(-2.0+2*x+2*z)
. gen y = Ey*invgammap(1,uniform())
. gen logy = log(y)
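The exponential draws above rely on inverting the exponential distribution function. With shape parameter 1 the incomplete gamma function reduces to 1 - exp(-t), so invgammap(1,u) should coincide with -ln(1-u). The lines below are a minimal standalone sketch of that equivalence. They clear the data in memory, so they are meant to be run separately from the program above, and the variable names u_chk, e1, e2 and absdiff are hypothetical.

```stata
* Sketch only: two equivalent ways to turn a uniform draw into a
* unit-mean exponential draw. Run separately: this clears the data.
clear
set obs 1000
set seed 10001
gen double u_chk = uniform()
gen double e1 = invgammap(1, u_chk)
gen double e2 = -ln(1 - u_chk)
gen double absdiff = abs(e1 - e2)
* absdiff should be zero up to rounding error; e1 and e2 should have mean near 1
summarize e1 e2 absdiff
```

Multiplying such a draw by Ey, as the program does, rescales it to an exponential variate with mean Ey.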
.
. * Descriptive Statistics
. summarize
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
           x |        50    .0935209    .1031485  -.1173506   .2778609
     muzgivx |        50    .0967604    .0515742  -.0086753   .1889304
    sigzgivx |        50    .0866025           0   .0866025   .0866025
           z |        50    .1033014    .0909297  -.0885447   .3137469
          Ey |        50    .2114837     .071719   .0945722   .4314067
-------------+--------------------------------------------------------
           y |        50    .2024206    .2237202   .0005293   .9601147
        logy |        50   -2.282336     1.45494  -7.543878  -.0407026
. ereg y x z
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
50
0.0126
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .2670543   1.417339     0.19   0.851    -2.510879    3.044988
           z |   4.663384   1.740712     2.68   0.007     1.251652    8.075117
       _cons |  -2.191619   .2328589    -9.41   0.000    -2.648014   -1.735224
------------------------------------------------------------------------------
.
. save mma11p1boot, replace
file mma11p1boot.dta saved
.
. * Write data to a text (ascii) file so can use with programs other than Stata
. outfile y x z using mma11p1boot.asc, replace
.
. ********** SIMPLE BOOTSTRAP **********
.
. * Stata produces four bootstrap 100*(1-alpha) confidence intervals
. * (N) and (P) have no asymptotic refinement
. * (BC)-(BCA) have asymptotic refinement
. * For details see program mma07p4boot.do
.
. * Change the following for different number of simulations S
. * From page 399, for testing better to use 999 than 1000
. global breps = 999 /* The number of bootstrap reps used below */
.
. set seed 20001
.
. * A simple and adequate bootstrap command for the slope coefficients is
. bs "ereg y x z" "_b[x] _b[z]", reps($breps) level(95)
command:
ereg y x z
statistics: _bs_1
= _b[x]
_bs_2
= _b[z]
Bootstrap statistics
Number of obs =
Replications =
999
50
Number of obs =
Replications =
999
50
(N)
(N)
(N)
(N)
.
. * Now use the bootstrap estimates
. use mma11p1bootreps, clear
(bootstrap: ereg y x z)
. sum
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
       _bs_1 |       999    .0785034    1.420956  -9.431229   4.278278
       _bs_2 |       999    4.715863    1.939086  -1.747643   12.09208
       _bs_3 |       999    1.481759    .1718393   1.145421   2.761842
       _bs_4 |       999    1.831722     .186631   1.387625   2.910449
50%      4.77197                      Mean           4.715863
                        Largest       Std. Dev.      1.939086
75%     5.970802       10.10243
90%     7.100958       10.42623       Variance       3.760056
95%     7.810663       10.76733       Skewness      -.1344324
99%     9.426978       12.09208       Kurtosis       3.545415

                           ttestzs
-------------------------------------------------------------
      Percentiles      Smallest
 1%     -2.66391      -3.921595
 5%    -1.727528      -3.483456
10%     -1.32364      -3.201425       Obs                 999
25%    -.6209012      -2.975815       Sum of Wgt.         999

50%     .0618649                      Mean           .0261125
                        Largest       Std. Dev.      1.046855
75%     .7034938       2.693856
90%     1.323415       3.087892       Variance       1.095904
95%      1.70558        3.11692       Skewness      -.1596043
99%     2.529097       3.738328       Kurtosis       3.337749
.
. * Additionally need the 2.5 and 97.5 percentiles not given in summarize, d
.
. * Coefficient of z
. _pctile bzs, p(2.5,97.5)
. di " Lower 2.5 and upper 2.5 percentile of coeff b for z: " r(r1) " and " r(r2)
Lower 2.5 and upper 2.5 percentile of coeff b for z: .50060469 and 8.4838924
.
. * t-statistic for z
. _pctile ttestzs, p(2.5,97.5)
. di " Lower 2.5 and upper 2.5 percentile of ttest on z: " r(r1) " and " r(r2)
Lower 2.5 and upper 2.5 percentile of ttest on z: -2.1827998 and 2.0659592
.
. ********** (2) RESULTS IN TEXT PAGES 366-7 **********
.
. * (2A) Bootstrap standard error estimate (no refinement)
. * These are given earlier in bootstrap table output
. * Equivalently get the standard deviation of bzs
.
. quietly sum bzs
. scalar bzbootse = r(sd)
. di "Bootstrap estimate of standard error: " bzbootse
Bootstrap estimate of standard error: 1.9390864
.
. * (2B) Test b3 = 0 using percentile-t method (asymptotic refinement)
. * Use the 2.5% and 97.5% bootstrap critical values for t-statistic for z
.
. _pctile ttestzs, p(2.5,97.5)
. di " Lower 2.5 and upper 2.5 percentile of ttest on z: " r(r1) " and " r(r2)
Lower 2.5 and upper 2.5 percentile of ttest on z: -2.1827998 and 2.0659592
.
. * (2D) 95% confidence interval with asymptotic refinement
. * Use the preceding critical values
.
. scalar lbz = $bz + r(r1)*$sez /* Note the plus sign here */
. scalar ubz = $bz + r(r2)*$sez
. di " Percentile-t interval lower and upper bounds: (" lbz "," ubz ")"
Percentile-t interval lower and upper bounds: (.86375888,8.2596243)
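The interval bounds in the log can be checked by hand from numbers that appear earlier in the output: the estimated coefficient of z and its standard error from the ereg table (4.663384 and 1.740712), and the 2.5 and 97.5 percentiles of the bootstrap t-statistics. The lines below are a minimal sketch of that arithmetic. The scalar names ending in _chk are hypothetical, and it is an assumption that $bz and $sez hold these two ereg values.

```stata
* Sketch only: reproduce the percentile-t interval arithmetic by hand.
* Assumption: $bz and $sez equal the ereg estimate and standard error for z.
scalar bz_chk  = 4.663384
scalar sez_chk = 1.740712
* Bootstrap critical values for the t-statistic, copied from the log above
scalar tlow_chk  = -2.1827998
scalar thigh_chk =  2.0659592
scalar lbz_chk = bz_chk + tlow_chk*sez_chk
scalar ubz_chk = bz_chk + thigh_chk*sez_chk
display "Percentile-t interval: (" lbz_chk ", " ubz_chk ")"
```

Up to rounding of the inputs this gives about 0.8638 and 8.2596, in line with the bounds reported in the log.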
.
. * (2B-Var) Variation for symmetric two-sided test on z
.
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section3\mma12p1integration.txt
log type: text
opened on: 18 May 2005, 21:17:14
.
. ********** OVERVIEW OF MMA12P1INTEGRATION.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 12.3.3 pages 391-2
. * Computes integral numerically and by simulation
. * (1) Illustrate Midpoint Rule (page 392)
. * (2) Illustrate Monte Carlo integral (Table 12.1 page 392)
.*
. * for computing E[x] and E[exp(-exp(x))] for x ~ N[0,1]
.
. * No data need be read in.
.
. ********** SETUP **********
.
. set more off
. version 8.0
.
. ********** (1) NUMERICAL INTEGRATION USING MIDPOINT RULE **********
.
. * Midpoint rule for n evaluation points between a and b is
. * Integral = Sum (j=1 to n) [(b-a)/n]*f(xbar_j)
. * where xbar_j is midpoint between x_j-1 and x_j
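The midpoint rule described above can be tried interactively before wrapping it in a program. The following is a minimal standalone sketch for E[exp(-exp(x))] with x ~ N[0,1]. The limits -5 and 5, the choice of 1000 evaluation points and the names lower_lim, upper_lim, width, xmid and fxmid are all assumptions made for illustration. The lines clear the data in memory.

```stata
* Sketch only: midpoint rule for the integral of exp(-exp(x))*phi(x).
* Assumed settings: 1000 evaluation points between -5 and 5.
clear
set obs 1000
scalar lower_lim = -5
scalar upper_lim = 5
scalar width = (upper_lim - lower_lim)/_N
* Midpoint of the j-th subinterval
gen double xmid = lower_lim - 0.5*width + width*_n
* Integrand evaluated at each midpoint
gen double fxmid = exp(-exp(xmid))*exp(-xmid^2/2)/sqrt(2*_pi)
quietly summarize fxmid
display "Midpoint-rule approximation: " r(sum)*width
```

Truncating the range at plus or minus 5 drops only a negligible amount of standard normal probability mass, which is why finite limits are workable here.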
.
. program midpointrule, rclass
1. version 8
2. /* define arguments: number of evaluation points neval and limits a, b */
. args neval a b
3. drop _all
4. scalar increment = (`b'-`a') / `neval'
5. set obs `neval'
6. /* Compute the function of interest */
. gen xbar = `a' - 0.5*increment + increment*_n
7. gen density = exp(-xbar*xbar/2)/sqrt(2*_pi)
8. * Following is contribution to E[x] when x ~ N[0,1]
. gen f1xbar = xbar*density
9. * Following is contribution to E[exp(-exp(x))] when x ~ N[0,1]
. gen f2xbar = exp(-exp(xbar))*density
10. /* Compute the averages */
.
. * Write data to a text (ascii) file so can use with programs other than Stata
. outfile u e y using mma12p2mslmsm.asc, replace
.
. * Use the variant ml d0 as this gives the entire likelihood, not just one observation.
. * I want this so that seed is only reset for the entire data.
. * My program is inefficient as variates need to be redrawn at each iteration
. program define msl
1. version 6.0
2. args todo b lnf /* Need to use the names todo b and lnf
>                     todo always contains 1 and may be ignored
>                     b is parameters and lnf is log-density */
3. tempvar theta1 /* create as needed to calculate lf, g, ... */
4. mleval `theta1' = `b', eq(1) /* theta1 is theta1_i = x_i'b */
5. local y "$ML_y1" /* create to make program more readable */
6. set seed 10101
7. tempvar denssim
8. global isim=1
9. quietly gen `denssim' = exp(-0.5*(`y'-`theta1'+log(-log(uniform())))^2)/sqrt(2*_pi)
10. while $isim < $simreps {
11. quietly replace `denssim' = `denssim' + exp(-0.5*(`y'-`theta1'+log(-log(uniform())))^2)/sqrt(2*_pi)
12. global isim=$isim+1
13. }
14. mlsum `lnf' = ln(`denssim'/$isim)
15. end
.
. gen one = 1
. ml model d0 msl (y = one, nocons )
. ml maximize
initial:       log likelihood = -216.68168
alternative:   log likelihood = -199.54479
rescale:       log likelihood = -191.09715
Iteration 0:   log likelihood = -191.09715
Iteration 1:   log likelihood =  -190.4391  (not concave)
Iteration 2:   log likelihood = -190.43885
Iteration 3:   log likelihood =  -190.4385
Iteration 4:   log likelihood =  -190.4385
                                                  Number of obs   =        100
                                                  Wald chi2(1)    =      65.72
                                                  Prob > chi2     =     0.0000
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
5. }
. gen usimbar = usim/$isim
. gen esimbar = esim/$isim
. gen theta = y - usimbar - esimbar
. summarize
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
           u |       100    .7236045    1.372637  -1.827296   6.423636
           e |       100    .0415449    .9472174  -2.906972   2.302204
           y |       100    1.765149    1.684177  -2.227185   8.143228
        usim |       100    57.36345    13.16979   21.96637   90.07499
        esim |       100   -.9702956    11.38655  -26.38858   33.28406
-------------+--------------------------------------------------------
     usimbar |       100    .5736345    .1316979   .2196637   .9007499
     esimbar |       100    -.009703    .1138655  -.2638858   .3328406
       theta |       100    1.201218    1.681435  -2.757669    7.75245
.
. * Results for Table 12.3 on page 404
. * Here the st. error of theta_MSM is approximated by the st. dev. of theta
. * divided by the square root of S (the number of simulations)
. quietly sum theta
. scalar theta_MSM = r(mean)
. scalar approx_sterror = r(sd)/sqrt($simreps)
.
. * Display MSM results in one column of Table 12.3 p.404
. di "For number of simulations S = " $simreps
For number of simulations S = 100
. di "MSM estimator: " theta_MSM
MSM estimator: 1.2012178
. di "Approximate standard error: " approx_sterror
Approximate standard error: .16814348
.
. * As written this will not give the correct standard errors (see p.403).
. * Can get this by also computing the squared rv to get E[y^2]
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section3\mma12p2mslmsm.txt
log type: text
8
1.2
3
5
100
10
50
. summarize
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
       xeval |       150        7.55    4.344537         .1         15
  likelihood |       150    .0666548    .0944174   6.44e-12   .2820948
       prior |       150    .0665247    .0804685   1.33e-08   .2303294
   posterior |       150    .0666667    .1131755   1.85e-12   .3641828
.
. graph twoway (line likelihood xeval, clstyle(p2)) /*
> */ (line prior xeval, clstyle(p3)) /*
> */ (line posterior xeval, clstyle(p1)), /*
> */ scale (1.2) plotregion(style(none)) /*
> */ title("Bayes: Likelihood, Prior and Posterior") /*
> */ xtitle("Evaluation point", size(medlarge)) xscale(titlegap(*5)) /*
>
>
>
>
NOTE: Copyright (c) 2002-2003 by SAS Institute Inc., Cary, NC, USA.
NOTE: SAS (r) 9.1 (TS1M2)
Licensed to UNIV OF CA/DAVIS, Site 0029107010.
NOTE: This session is executing on the SunOS 5.9 platform.
You are running SAS 9. Some SAS 8 files will be automatically converted
by the V9 engine; others are incompatible. Please see
http://support.sas.com/rnd/migration/planning/platform/64bit.html
PROC MIGRATE will preserve current SAS file attributes and is
recommended for converting all your SAS libraries from any
SAS 8 release to SAS 9. For details and examples, please see
http://support.sas.com/rnd/migration/index.html
This message is contained in the SAS news file, and is presented upon
initialization. Edit the file "news" in the "misc/base" directory to
display site-specific news and information in the program log.
The command line option "-nonews" will prevent this display.
* MMA13P2BAYES.1ST SAS Output with one column of Table 13.3
* MMA13P2BAYES.LOG SAS log file

* This program uses generated data - so no data set required
* This program uses a lot of memory - 1 gigabyte should do
* In Unix give command sas -MEMSIZE 1G mma13p2bayesgibbs.sas

*********************************************************************;
*****   BIVARIATE NORMAL-BAYESIAN-ESTIMATION-BY-MCMC   **************;
*********************************************************************;

OPTIONS LS=75;
options NOTES;

PROC IML;
NOTE: IML Ready
start main;

print "A. Colin Cameron and Pravin K. Trivedi (2005)";
print "Microeconometrics: Methods and Applications, CUP";
print "MCMC Example: Gibbs Sampler for SUR";

************* GENERATING DATA: 2 EQUATION SUR ****************;

nobs = 1000;
replics = 50000;
burn = 5000;
replics = replics + burn;

npar1 = 2;
npar2 = 2;

alpha1 ={1,1};
alpha2 ={1,1};
isigma = inv(sigma);
LL = ((isigma[1,1]*R1`*R1||isigma[1,2]*R1`*R2)//
(isigma[2,1]*R2`*R1||isigma[2,2]*R2`*R2));
LisigY = ((isigma[1,1]*R1`*Y1+isigma[1,2]*R1`*Y2)//
(isigma[2,1]*R2`*Y1+isigma[2,2]*R2`*Y2));
point300:
end;
*************
! **************;
*********************************************************************;
***** RESULTS: COMPARE LAST HALF WITH ALL (AFTER BURN-IN) *******;
*********************************************************************;
replics = replics-burn;
out1 = output1[replics/2+1:replics,];
out = output1[1:replics,];
create exp from out1;
append from out1;
summary var _num_;
close exp;
create exp from out;
append from out;
summary var _num_;
close exp;
*********************************************************************;
****** RESULTS: POSTERIOR MEAN AND SD - TABLE 13.3 P.454 ********;
*********************************************************************;
xnames1 = {"CONSTANT"} || {"R1"};
xnames2 = {"CONSTANT"} || {"R2"};
parnames = concat({"d1"}," ",xnames1)||concat({"d2"}," ",xnames2)||{"SIGMA11"}||{"SIGMA12"}||{"SIGMA22"};
meanout = out[+,]/replics;
stderr = sqrt(((out-j(replics,1,1)*meanout)##2)[+,]/(replics-1));
parm = meanout;
stderr = stderr`;
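The summary step above takes column means and standard deviations (divisor replics-1) of the retained draws, once for all draws and once for the last half. The following Python sketch mirrors that arithmetic for illustration only; the function names and the small `draws` matrix are assumptions made for the example and are not part of the original SAS program.

```python
import math


def posterior_summary(draws):
    """Return (means, sds) by column; sds use the n-1 divisor as in the SAS code."""
    n = len(draws)
    k = len(draws[0])
    means = [sum(row[j] for row in draws) / n for j in range(k)]
    sds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in draws) / (n - 1))
        for j in range(k)
    ]
    return means, sds


def last_half(draws):
    """Keep the second half of the retained draws (rows n/2+1 .. n, 1-based)."""
    return draws[len(draws) // 2:]


# Hypothetical retained draws: 4 iterations, 2 parameters (not from the log).
draws = [[1.0, 2.0], [3.0, 2.0], [5.0, 8.0], [7.0, 8.0]]

means_all, sds_all = posterior_summary(draws)
means_half, sds_half = posterior_summary(last_half(draws))
print("all draws :", means_all, sds_all)
print("last half :", means_half, sds_half)
```

If the chain has converged, the two sets of summaries should be close; a large gap between them would suggest that the burn-in was too short.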
do k = 1 to 3;
   covd1 = corr[,2*k-1:2*k];
   print covd1;
end;
covd1 = corr[,7];
print covd1;

finish main;
NOTE: Module MAIN defined.

run main;
NOTE: The data set WORK.EXP has 25000 observations and 7 variables.
NOTE: The data set WORK.EXP has 50000 observations and 7 variables.
NOTE: Exiting IML.
NOTE: 65925 workspace compresses.
NOTE: The PROCEDURE IML printed pages 1-6.
NOTE: PROCEDURE IML used (Total process time):
      real time           5:44.35
      cpu time            5:44.04
NOTE: SAS Institute Inc., SAS Campus Drive, Cary, NC USA 27513-2414
NOTE: The SAS System used:
      real time           5:45.48
      cpu time            5:45.15
-------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section4\mma14p1binary.txt
log type: text
opened on: 19 May 2005, 09:01:28
.
. ********** OVERVIEW OF MMA14P1BINARY.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 14.2 (pages 464-6) Logit and probit models.
. * Provides
. * (1) Table 14.1: Data summary
. * (2) Table 14.2: Logit, Probit and OLS slope estimates
. * (3) Figure 14.1: Plot of Logit Probit and OLS predicted probabilities
.
. * To run this program you need data file
. * Nldata.asc
.
. ********** SETUP
.
. set more off
. version 8.0
. set scheme s1mono /* Graphics scheme */
.
. ********** DATA DESCRIPTION
.
. * Data Set comes from :
. * J. A. Herriges and C. L. Kling,
. * "Nonlinear Income Effects in Random Utility Models",
. * Review of Economics and Statistics, 81(1999): 62-72
.
. * The data are given as a combined observation with data on all 4 choices.
. * This will work for multinomial logit program.
. * For conditional logit will need to make a new data set which has
. * four separate entries for each observation as there are four alternatives.
.
. * Filename: NLDATA.ASC
. * Format: Ascii
. * Number of Observations: 1182
. * Each observation appears over 4 lines with 4 variables per line
. * so 4 x 1182 = 4728 lines
. * Variable Number and Description
. * 1 Recreation mode choice. = 1 if beach, = 2 if pier; = 3 if private boat; = 4 if charter
    qprivate |      1182    .1712146    .2097885      .0002      .7369
    qcharter |      1182    .6293679    .7061142      .0021     2.3101
-------------+--------------------------------------------------------
      income |      1182    4099.337    2461.964   416.6667      12500
    ydiv1000 |      1182    4.099337    2.461964   .4166667       12.5
.
. ********** CREATE BINARY DATA: CHARTER vs PIER **********
.
. * Binary logit of charter (mode = 4) versus pier (mode = 2)
. keep if mode == 2 | mode == 4
(552 observations deleted)
. * charter is 1 if fish from charter boat and 0 if fish from pier
. gen charter = 0
. replace charter = 1 if mode == 4
(452 real changes made)
.
. gen pratio = 100*ln(pcharter/ppier)
. gen lnrelp = ln(pchart/ppier)
.
. * Overall summary
. summarize
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
        mode |       630    3.434921    .9011843          2          4
       price |       630    62.51669    52.31219       1.29    387.208
       crate |       630    .5533478    .6953035      .0014     2.3101
      dbeach |       630           0           0          0          0
       dpier |       630    .2825397    .4505921          0          1
-------------+--------------------------------------------------------
    dprivate |       630           0           0          0          0
    dcharter |       630    .7174603    .4505921          0          1
      pbeach |       630    95.19802    95.62037       1.29    578.048
       ppier |       630    95.19802    95.62037       1.29    578.048
    pprivate |       630    55.26221    59.99482       2.29    494.058
-------------+--------------------------------------------------------
    pcharter |       630    84.89158    60.79327      27.29    529.058
      qbeach |       630    .2546022    .1983357      .0678      .5333
       qpier |       630    .1716835    .1687288      .0014      .4522
    qprivate |       630    .1695303    .2033172      .0014      .7369
    qcharter |       630    .6368509     .688508      .0029     2.3101
-------------+--------------------------------------------------------
      income |       630    3741.402     2145.71   416.6667      12500
    ydiv1000 |       630    3.741402     2.14571   .4166667       12.5
     charter |       630    .7174603    .4505921          0          1
       ppier |       452    120.6483    99.78664       4.29    578.048
    pprivate |       452    44.56376    52.23744       2.29    362.208
-------------+--------------------------------------------------------
    pcharter |       452    75.09694    52.51942      27.29    387.208
      qbeach |       452    .2519077    .1997956      .0678      .5333
       qpier |       452    .1595341    .1667353      .0014      .4522
    qprivate |       452    .1771628    .2318749      .0014      .7369
    qcharter |       452    .6914998    .7714728      .0029     2.3101
-------------+--------------------------------------------------------
      income |       452      3880.9    2050.028   416.6667      12500
    ydiv1000 |       452      3.8809    2.050028   .4166667       12.5
     charter |       452           1           0          1          1
      pratio |       452   -26.43243    87.53686  -215.3976   235.8242
      lnrelp |       452   -.2643243    .8753686  -2.153976   2.358242
.
. * Write final data to a text (ascii) file so can use with programs other than Stata
. outfile charter lnrelp using mma14p1binary.asc, replace
.
. ********** TABLE 14.1 - DATA SUMMARY BY OUTCOME AND OVERALL **********
.
. * Following gives Table 14.1 page 464
. summarize charter pcharter ppier lnrelp
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
     charter |       630    .7174603    .4505921          0          1
    pcharter |       630    84.89158    60.79327      27.29    529.058
       ppier |       630    95.19802    95.62037       1.29    578.048
      lnrelp |       630    .2745581    1.262598  -2.153976   4.062713
. sort mode
. by mode: summarize charter pcharter ppier lnrelp
-------------------------------------------------------------------------------
-> mode = pier
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
     charter |       178           0           0          0          0
    pcharter |       178    109.7633    72.37726      27.29    529.058
       ppier |       178    30.57133    35.58442       1.29    224.296
      lnrelp |       178    1.642956    1.043052  -.7913917   4.062713
-------------------------------------------------------------------------------
-> mode = charter
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
     charter |       452           1           0          1          1
    pcharter |       452    75.09694    52.51942      27.29    387.208
       ppier |       452    120.6483    99.78664       4.29    578.048
      lnrelp |       452   -.2643243    .8753686  -2.153976   2.358242
.
. ********** TABLE 14.2 - ESTIMATE LOGIT, PROBIT AND OLS MODELS
.
. logit charter lnrelp
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:
Logit estimates                                   Number of obs   =        630
                                                  LR chi2(1)      =     336.47
                                                  Prob > chi2     =     0.0000
Log likelihood = -206.82697                       Pseudo R2       =     0.4486
------------------------------------------------------------------------------
     charter |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnrelp |   -1.82253   .1445681   -12.61   0.000    -2.105879   -1.539182
       _cons |   2.053125   .1689307    12.15   0.000     1.722027    2.384223
------------------------------------------------------------------------------
. estimates store blogit
.
. probit charter lnrelp
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Probit estimates                                  Number of obs   =        630
                                                  LR chi2(1)      =     341.30
                                                  Prob > chi2     =     0.0000
Log likelihood = -204.41087                       Pseudo R2       =     0.4550
------------------------------------------------------------------------------
     charter |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnrelp |  -1.055515   .0761117   -13.87   0.000    -1.204691   -.9063383
Logit estimates                                   Number of obs   =        630
                                                  Wald chi2(1)    =     194.28
                                                  Prob > chi2     =     0.0000
Log pseudo-likelihood = -206.82697                Pseudo R2       =     0.4486
------------------------------------------------------------------------------
             |               Robust
     charter |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnrelp |   -1.82253   .1307556   -13.94   0.000    -2.078807   -1.566254
       _cons |   2.053125   .1473477    13.93   0.000     1.764329    2.341921
------------------------------------------------------------------------------
. estimates store bloghet
.
. probit charter lnrelp, robust
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Probit estimates                                  Number of obs   =        630
                                                  Wald chi2(1)    =     232.07
                                                  Prob > chi2     =     0.0000
Log pseudo-likelihood = -204.41087                Pseudo R2       =     0.4550
------------------------------------------------------------------------------
             |               Robust
     charter |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnrelp |  -1.055515   .0692881   -15.23   0.000    -1.191317   -.9197122
       _cons |    1.19436   .0794429    15.03   0.000     1.038655    1.350066
------------------------------------------------------------------------------
. estimates store bprobhet
.
. regress charter lnrelp, robust
Regression with robust standard errors                 Number of obs =     630
                                                       F(  1,   628) =  792.44
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.4633
                                                       Root MSE      =  .33036
------------------------------------------------------------------------------
             |               Robust
     charter |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnrelp |  -.2429137   .0086292   -28.15   0.000    -.2598592   -.2259681
       _cons |   .7841542   .0119566    65.58   0.000     .7606744    .8076341
------------------------------------------------------------------------------
. estimates store bOLShet
.
. * Following gives Table 14.2 page 465
. estimates table blogit bprobit bOLS bloghet bprobhet bOLShet, /*
> */ t stats(N ll r2 r2_p) b(%8.3f) keep(_cons lnrelp)
--------------------------------------------------------------------------------
    Variable |   blogit    bprobit      bOLS    bloghet   bprobhet    bOLShet
-------------+------------------------------------------------------------------
       _cons |    2.053      1.194     0.784      2.053      1.194      0.784
             |    12.15      13.34     58.21      13.93      15.03      65.58
      lnrelp |   -1.823     -1.056    -0.243     -1.823     -1.056     -0.243
             |   -12.61     -13.87    -23.28     -13.94     -15.23     -28.15
-------------+------------------------------------------------------------------
           N |  630.000    630.000   630.000    630.000    630.000    630.000
          ll | -206.827   -204.411  -195.167   -206.827   -204.411   -195.167
          r2 |                         0.463                            0.463
        r2_p |    0.449      0.455                0.449      0.455
--------------------------------------------------------------------------------
                                                                   legend: b/t
.
. ********** FIGURE 14.1 - PLOT PREDICTED PROBABILITY AGAINST X FOR MODELS
.
. quietly logit charter lnrelp
. predict plogit, p
.
. quietly probit charter lnrelp
. predict pprobit, p
.
. quietly regress charter lnrelp
. predict pOLS
(option xb assumed; fitted values)
.
. sum charter plogit pprobit pOLS
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
     charter |       630    .7174603    .4505921          0          1
      plogit |       630    .7174603    .3193077   .0047196   .9974746
     pprobit |       630      .72019    .3196164   .0009877   .9997377
        pOLS |       630    .7174603    .3067022  -.2027341   1.307384
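As a check on the summary above, the smallest and largest fitted values can be recomputed from the reported coefficients at the extreme values of lnrelp (-2.153976 and 4.062713). The Python sketch below is a minimal illustration, assuming the point estimates printed in this log; the probit intercept is taken from the robust probit output, whose point estimates match the non-robust fit in Table 14.2. Function names are illustrative.

```python
import math

X_MIN, X_MAX = -2.153976, 4.062713  # range of lnrelp in the estimation sample


def logit_prob(x, a=2.053125, b=-1.82253):
    """Logit fitted probability: 1 / (1 + exp(-(a + b*x)))."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))


def probit_prob(x, a=1.19436, b=-1.055515):
    """Probit fitted probability: standard normal cdf evaluated at a + b*x."""
    z = a + b * x
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ols_fit(x, a=0.7841542, b=-0.2429137):
    """Linear probability model fitted value: a + b*x."""
    return a + b * x


for name, f in (("logit", logit_prob), ("probit", probit_prob), ("OLS", ols_fit)):
    # The slope is negative, so the minimum fit occurs at the largest lnrelp.
    print(name, "min fit:", f(X_MAX), "max fit:", f(X_MIN))
```

The OLS fitted values fall outside the unit interval at both extremes, whereas the logit and probit probabilities stay strictly between 0 and 1.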
.
. sort lnrelp
.
. * Following gives Figure 14.1 page 466
. graph twoway (scatter charter lnrelp, msize(vsmall) jitter(3)) /*
> */ (line plogit lnrelp, clstyle(p1)) /*
> */ (line pprobit lnrelp, clstyle(p2)) /*
> */ (line pOLS lnrelp, clstyle(p3)), /*
> */ scale (1.2) plotregion(style(none)) /*
>
>
>
>
>
>
-------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section4\mma15p1mnl.txt
log type: text
opened on: 19 May 2005, 12:16:20
.
. ********** OVERVIEW OF MMA15P1MNL.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 15.2.1-3 pages 491-5
. * Multinomial and conditional logit models analysis.
. * It provides ....
. * (0) Data summary (Table 15.1)
. * (1A) Multinomial Logit estimates (Table 15.2)
. * (1B) Multinomial Logit marginal effects (text page 494)
. * (2A) Conditional Logit estimates (Table 15.2)
. * (2B) Conditional Logit marginal effects (Table 15.3)
. * (3) Multinomial estimates obtained using Conditional Logit
. * (4) "Mixed Model" estimates (Table 15.1)
.
. * Related programs are
. * mma15p2gev.do estimates a nested logit model using Stata
. * mma15p3mnl.lim estimates multinomial models using Limdep
. * mma15p4gev.lim estimates conditional and nested logit models using Limdep
.
. * To run this program you need data file
. * Nldata.asc
.
. /* Program summary:
>
> (1) Multinomial logit of mode on alternative-invariant regressor (income)
>       mlogit mode income
>
> (2) Conditional logit of mode on alternative-specific regressor (price, catch rate)
>       First reshape data so 4 observations per individual - one for each mode.
>       clogit mode p q
>
> (3) Conditional logit of mode on alternative-invariant regressor (income)
>       First reshape data so 4 observations per individual - one for each mode.
>       Then create dummy variables for each mode d2 d3 d4
>       clogit mode d2 d3 d4 d2y d3y d4y
>       This gives same results as (1)
>
> (4) Conditional logit of mode on alternative-invariant regressor (income)
>       and on alternative-specific regressor (price, catch rate)
>       First reshape data so 4 observations per individual - one for each mode.
>       Then create dummy variables for each mode d2 d3 d4
>       clogit mode d2 d3 d4 d2y d3y d4y p q
> */
.
. ********** SETUP **********
.
. set more off
. version 8.0
. set scheme s1mono /* Graphics scheme */
.
. ********** DATA DESCRIPTION **********
.
. * Data Set comes from :
. * J. A. Herriges and C. L. Kling,
. * "Nonlinear Income Effects in Random Utility Models",
. * Review of Economics and Statistics, 81(1999): 62-72
.
. * The data are given as a combined observation with data on all 4 choices.
. * This will work for multinomial logit program.
. * For conditional logit will need to make a new data set which has
. * four separate entries for each observation as there are four alternatives.
.
. * Filename: NLDATA.ASC
. * Format: Ascii
. * Number of Observations: 1182
. * Each observation appears over 4 lines with 4 variables per line
. * so 4 x 1182 = 4728 lines
. * Variable Number and Description
. * 1 Recreation mode choice. = 1 if beach, = 2 if pier; = 3 if private boat; = 4 if charter
. * 2 Price for chosen alternative
. * 3 Catch rate for chosen alternative
. * 4 = 1 if beach mode chosen; = 0 otherwise
. * 5 = 1 if pier mode chosen; = 0 otherwise
. * 6 = 1 if private boat mode chosen; = 0 otherwise
. * 7 = 1 if charter boat mode chosen; = 0 otherwise
. * 8 = price for beach mode
. * 9 = price for pier mode
. * 10 = price for private boat mode
. * 11 = price for charter boat mode
. * 12 = catch rate for beach mode
. * 13 = catch rate for pier mode
. * 14 = catch rate for private boat mode
. * 15 = catch rate for charter boat mode
. * 16 = monthly income
.
. ********** READ IN DATA and SUMMARIZE (Table 15.1, p.492) **********
.
. * Method to read in depends on model used
.
. /* Data are on fishing mode: 1 beach, 2 pier, 3 private boat, 4 charter
> Data come as one observation having data for all 4 modes.
> Both alternative specific and alternative invariant regressors.
> */
.
. infile mode price crate dbeach dpier dprivate dcharter pbeach ppier /*
> */ pprivate pcharter qbeach qpier qprivate qcharter income /*
> */ using nldata.asc
(1182 observations read)
.
. gen ydiv1000 = income/1000
.
. * Look at data by alternative
. label define modetype 1 "beach" 2 "pier" 3 "private" 4 "charter"
. label values mode modetype
.
. summarize
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
        mode |      1182    3.005076    .9936162          1          4
       price |      1182    52.08197    53.82997       1.29     666.11
       crate |      1182    .3893684    .5605964      .0002     2.3101
      dbeach |      1182    .1133672    .3171753          0          1
       dpier |      1182    .1505922    .3578023          0          1
-------------+--------------------------------------------------------
    dprivate |      1182    .3536379    .4783008          0          1
    dcharter |      1182    .3824027    .4861799          0          1
      pbeach |      1182     103.422     103.641       1.29    843.186
       ppier |      1182     103.422     103.641       1.29    843.186
    pprivate |      1182    55.25657    62.71344       2.29     666.11
-------------+--------------------------------------------------------
    pcharter |      1182    84.37924    63.54465      27.29     691.11
      qbeach |      1182    .2410113    .1907524      .0678      .5333
       qpier |      1182    .1622237    .1603898      .0014      .4522
    qprivate |      1182    .1712146    .2097885      .0002      .7369
    qcharter |      1182    .6293679    .7061142      .0021     2.3101
-------------+--------------------------------------------------------
      income |      1182    4099.337    2461.964   416.6667      12500
    ydiv1000 |      1182    4.099337    2.461964   .4166667       12.5
. sort mode
. by mode: summarize
-------------------------------------------------------------------------------
.
. * Following commands give Table 15.1, p.492
. summarize ydiv100 pbeach ppier pprivate pcharter qbeach qpier /*
> */ qprivate qcharter dbeach dpier dprivate dcharter
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
    ydiv1000 |      1182    4.099337    2.461964   .4166667       12.5
      pbeach |      1182     103.422     103.641       1.29    843.186
       ppier |      1182     103.422     103.641       1.29    843.186
    pprivate |      1182    55.25657    62.71344       2.29     666.11
    pcharter |      1182    84.37924    63.54465      27.29     691.11
-------------+--------------------------------------------------------
      qbeach |      1182    .2410113    .1907524      .0678      .5333
       qpier |      1182    .1622237    .1603898      .0014      .4522
    qprivate |      1182    .1712146    .2097885      .0002      .7369
    qcharter |      1182    .6293679    .7061142      .0021     2.3101
      dbeach |      1182    .1133672    .3171753          0          1
-------------+--------------------------------------------------------
       dpier |      1182    .1505922    .3578023          0          1
    dprivate |      1182    .3536379    .4783008          0          1
    dcharter |      1182    .3824027    .4861799          0          1
. sort mode
. by mode: summarize ydiv100 pbeach ppier pprivate pcharter qbeach qpier /*
> */ qprivate qcharter dbeach dpier dprivate dcharter
-------------------------------------------------------------------------------
-> mode = beach
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
    ydiv1000 |       134    4.051617     2.50542   .4166667       12.5
      pbeach |       134    35.69949    43.09414       1.29     306.82
       ppier |       134    35.69949    43.09414       1.29     306.82
    pprivate |       134    97.80913    75.43844       2.29    392.946
    pcharter |       134    125.0032    78.37641      27.29    427.946
-------------+--------------------------------------------------------
      qbeach |       134    .2791948    .1938734      .0678      .5333
       qpier |       134    .2190015    .1677117      .0025      .4522
    qprivate |       134    .1593985    .0948855      .0008      .2601
    qcharter |       134    .5176089    .3629096      .0027     1.0266
      dbeach |       134           1           0          1          1
-------------+--------------------------------------------------------
       dpier |       134           0           0          0          0
    dprivate |       134           0           0          0          0
    dcharter |       134           0           0          0          0
    pcharter |       452    75.09694    52.51942      27.29    387.208
-------------+--------------------------------------------------------
      qbeach |       452    .2519077    .1997956      .0678      .5333
       qpier |       452    .1595341    .1667353      .0014      .4522
    qprivate |       452    .1771628    .2318749      .0014      .7369
    qcharter |       452    .6914998    .7714728      .0029     2.3101
      dbeach |       452           0           0          0          0
-------------+--------------------------------------------------------
       dpier |       452           0           0          0          0
    dprivate |       452           0           0          0          0
    dcharter |       452           1           0          1          1
.
. ********** (1) MULTINOMIAL LOGIT: ALTERNATIVE-INVARIANT REGRESSOR *********
.
. *** (1A) Estimate the model
.
. * Data are already in form for mlogit
.
. * The following gives MNL column of Table 15.2, p.493
. mlogit mode ydiv1000, basecategory(1)
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Multinomial logistic regression                   Number of obs   =       1182
                                                  LR chi2(3)      =      41.14
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.0137
------------------------------------------------------------------------------
        mode |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
pier         |
    ydiv1000 |  -.1434029   .0532882    -2.69   0.007    -.2478459     -.03896
       _cons |   .8141503   .2286316     3.56   0.000     .3660405     1.26226
-------------+----------------------------------------------------------------
private      |
    ydiv1000 |   .0919064   .0406638     2.26   0.024     .0122069    .1716059
       _cons |   .7389208   .1967309     3.76   0.000     .3533352    1.124506
-------------+----------------------------------------------------------------
charter      |
    ydiv1000 |  -.0316399   .0418463    -0.76   0.450    -.1136571    .0503774
       _cons |   1.341291   .1945167     6.90   0.000     .9600457    1.722537
------------------------------------------------------------------------------
(Outcome mode==beach is the comparison group)
.
. *** (1B) Calculate the marginal effects
.
. quietly mlogit mode ydiv1000, basecategory(1)
. * Predict by default gives the probabilities
. predict p1 p2 p3 p4
(option p assumed; predicted probabilities)
.
. * As check compare predicted to actual probabilities
. summarize dbeach p1 dpier p2 dprivate p3 dcharter p4
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
      dbeach |      1182    .1133672    .3171753          0          1
          p1 |      1182    .1133672    .0036716   .0947395   .1153659
       dpier |      1182    .1505922    .3578023          0          1
          p2 |      1182    .1505922    .0444575   .0356142   .2342903
    dprivate |      1182    .3536379    .4783008          0          1
-------------+--------------------------------------------------------
          p3 |      1182    .3536379    .0797714   .2396973    .625706
    dcharter |      1182    .3824027    .4861799          0          1
          p4 |      1182    .3824027    .0346281   .2439403   .4158273
.
. * Quick way to compute marginal effects (or semi-elasticities dp/dlnx or elasticities)
. * is to use built-in Stata function which evaluates at sample mean
. * dydx, eyex, dyex or eydx
. mfx compute, dydx predict(outcome(1))
Marginal effects after mlogit
      y  = Pr(mode==1) (predict, outcome(1))
         =  .11541492
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
ydiv1000 |    .000075      .00393    0.02   0.985  -.007635  .007785   4.09934
------------------------------------------------------------------------------
. mfx compute, dydx predict(outcome(2))
Marginal effects after mlogit
      y  = Pr(mode==2) (predict, outcome(2))
         =  .14472379
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
ydiv1000 |  -.0206598      .00487   -4.24   0.000  -.030212 -.011108   4.09934
------------------------------------------------------------------------------
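The two mfx results above can be reproduced by hand from the mlogit coefficients. With beach as the base category, p_j = exp(a_j + b_j x) / sum_k exp(a_k + b_k x) and dp_j/dx = p_j (b_j - sum_k p_k b_k), evaluated at x = 4.09934. The following Python sketch is a minimal illustration under those formulas; the function and variable names are assumptions made for the example.

```python
import math

# (intercept, slope on ydiv1000) from the mlogit output; beach is the base.
COEF = {
    "beach": (0.0, 0.0),
    "pier": (0.8141503, -0.1434029),
    "private": (0.7389208, 0.0919064),
    "charter": (1.341291, -0.0316399),
}
X_BAR = 4.09934  # evaluation point reported by mfx (sample mean of ydiv1000)


def mnl_probs(x, coef=COEF):
    """Multinomial logit choice probabilities at regressor value x."""
    expv = {k: math.exp(a + b * x) for k, (a, b) in coef.items()}
    denom = sum(expv.values())
    return {k: v / denom for k, v in expv.items()}


def mnl_marginal_effects(x, coef=COEF):
    """dp_j/dx = p_j * (b_j - sum_k p_k * b_k) for each alternative j."""
    p = mnl_probs(x, coef)
    b_bar = sum(p[k] * coef[k][1] for k in coef)
    return {k: p[k] * (coef[k][1] - b_bar) for k in coef}


probs = mnl_probs(X_BAR)
effects = mnl_marginal_effects(X_BAR)
for k in COEF:
    print(k, round(probs[k], 7), round(effects[k], 7))
```

Because the probabilities sum to one, the four marginal effects sum to zero, which is a quick sanity check on any such calculation.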
.
. * Note that here these are similar to the earlier values at means
. * This is because little variation in predicted probability across individuals here
.
. * ASIDE: Binary logit will differ a little from MNL
. keep if mode == 1 | mode == 2
(870 observations deleted)
. mlogit mode ydiv1000
Iteration 0: log likelihood = -213.14899
Iteration 1: log likelihood = -210.28877
Iteration 2: log likelihood = -210.28833
Multinomial logistic regression                   Number of obs   =        312
                                                  LR chi2(1)      =       5.72
                                                  Prob > chi2     =     0.0168
Log likelihood = -210.28833                       Pseudo R2       =     0.0134
------------------------------------------------------------------------------
        mode |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
beach        |
    ydiv1000 |   .1134757   .0481736     2.36   0.018     .0190571    .2078942
       _cons |  -.7037127   .2125851    -3.31   0.001    -1.120372   -.2870535
------------------------------------------------------------------------------
(Outcome mode==pier is the comparison group)
.
. ******* (2) CONDITIONAL LOGIT: ALTERNATIVE-SPECIFIC REGRESSOR *********
.
. *** (2A) Estimate the model
.
. * This requires reshaping the data
. clear
. infile mode price crate dbeach dpier dprivate dcharter pbeach ppier /*
> */ pprivate pcharter qbeach qpier qprivate qcharter income /*
> */ using nldata.asc
(1182 observations read)
.
. gen ydiv1000 = income/1000
.
. * Data are one entry per individual
. * Need to reshape to 4 observations per individual - one for each alternative
. * Use reshape to do this which also creates variable (see below)
qcharter        float  %9.0g
income          float  %9.0g
ydiv1000        float  %9.0g
id              float  %9.0g
d1              float  %9.0g
p1              float  %9.0g
q1              float  %9.0g
d2              float  %9.0g
p2              float  %9.0g
q2              float  %9.0g
d3              float  %9.0g
p3              float  %9.0g
q3              float  %9.0g
d4              float  %9.0g
p4              float  %9.0g
q4              float  %9.0g
-------------------------------------------------------------------------------
Sorted by:
Note: dataset has changed since last saved
. summarize
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
        mode |      1182    3.005076    .9936162          1          4
       price |      1182    52.08197    53.82997       1.29     666.11
       crate |      1182    .3893684    .5605964      .0002     2.3101
      dbeach |      1182    .1133672    .3171753          0          1
       dpier |      1182    .1505922    .3578023          0          1
-------------+--------------------------------------------------------
    dprivate |      1182    .3536379    .4783008          0          1
    dcharter |      1182    .3824027    .4861799          0          1
      pbeach |      1182     103.422     103.641       1.29    843.186
       ppier |      1182     103.422     103.641       1.29    843.186
    pprivate |      1182    55.25657    62.71344       2.29     666.11
-------------+--------------------------------------------------------
    pcharter |      1182    84.37924    63.54465      27.29     691.11
      qbeach |      1182    .2410113    .1907524      .0678      .5333
       qpier |      1182    .1622237    .1603898      .0014      .4522
    qprivate |      1182    .1712146    .2097885      .0002      .7369
    qcharter |      1182    .6293679    .7061142      .0021     2.3101
-------------+--------------------------------------------------------
      income |      1182    4099.337    2461.964   416.6667      12500
    ydiv1000 |      1182    4.099337    2.461964   .4166667       12.5
          id |      1182       591.5    341.3583          1       1182
          d1 |      1182    .1133672    .3171753          0          1
          p1 |      1182     103.422     103.641       1.29    843.186
-------------+--------------------------------------------------------
          q1 |      1182    .2410113    .1907524      .0678      .5333
          d2 |      1182    .1505922    .3578023          0          1
          p2 |      1182     103.422     103.641       1.29    843.186
          q2 |      1182    .1622237    .1603898      .0014      .4522
          d3 |      1182    .3536379    .4783008          0          1
-------------+--------------------------------------------------------
          p3 |      1182    55.25657    62.71344       2.29     666.11
          q3 |      1182    .1712146    .2097885      .0002      .7369
          d4 |      1182    .3824027    .4861799          0          1
          p4 |      1182    84.37924    63.54465      27.29     691.11
          q4 |      1182    .6293679    .7061142      .0021     2.3101
.
. reshape long d p q, i(id) j(alterntv)
(note: j = 1 2 3 4)
Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                     1182   ->    4728
Number of variables                  30   ->      22
j variable (4 values)                     ->   alterntv
xij variables:
                           d1 d2 ... d4   ->   d
                           p1 p2 ... p4   ->   p
                           q1 q2 ... q4   ->   q
-----------------------------------------------------------------------------
. * This automatically creates alterntv = 1 (beach), ... 4 (charter)
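The wide-to-long step above turns one record per individual (d1..d4, p1..p4, q1..q4) into four records per individual, indexed by alterntv. The Python sketch below illustrates the same transformation on plain dictionaries; the function name and the two example records are hypothetical and are not taken from Nldata.asc.

```python
def reshape_long(rows, stubs=("d", "p", "q"), alternatives=(1, 2, 3, 4)):
    """Expand each wide record into one record per alternative.

    Columns named stub+j (for example p3) become a single column named stub,
    and a new column alterntv records j, mirroring reshape long ..., i(id) j(alterntv).
    """
    wide_names = {f"{s}{j}" for s in stubs for j in alternatives}
    long_rows = []
    for row in rows:
        # Variables that do not vary across alternatives are copied as-is.
        fixed = {k: v for k, v in row.items() if k not in wide_names}
        for j in alternatives:
            rec = dict(fixed)
            rec["alterntv"] = j
            for s in stubs:
                rec[s] = row[f"{s}{j}"]
            long_rows.append(rec)
    return long_rows


# Two hypothetical individuals in wide form.
wide = [
    {"id": 1, "ydiv1000": 4.0,
     "d1": 0, "d2": 0, "d3": 1, "d4": 0,
     "p1": 30.0, "p2": 30.0, "p3": 20.0, "p4": 50.0,
     "q1": 0.10, "q2": 0.05, "q3": 0.20, "q4": 0.50},
    {"id": 2, "ydiv1000": 2.5,
     "d1": 1, "d2": 0, "d3": 0, "d4": 0,
     "p1": 15.0, "p2": 15.0, "p3": 60.0, "p4": 90.0,
     "q1": 0.30, "q2": 0.20, "q3": 0.10, "q4": 0.40},
]

long_rows = reshape_long(wide)
for rec in long_rows:
    print(rec)
```

Each individual contributes exactly one row with d equal to 1, which is the structure that clogit with group(id) expects.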
. describe
Contains data
  obs:         4,728
 vars:            22
 size:       420,792 (95.9% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
id              float  %9.0g
alterntv        byte   %9.0g
mode            float  %9.0g
price           float  %9.0g
crate           float  %9.0g
dbeach          float  %9.0g
dpier           float  %9.0g
dprivate        float  %9.0g
dcharter        float  %9.0g
pbeach          float  %9.0g
ppier           float  %9.0g
pprivate        float  %9.0g
pcharter        float  %9.0g
qbeach          float  %9.0g
qpier           float  %9.0g
qprivate        float  %9.0g
qcharter        float  %9.0g
income          float  %9.0g
ydiv1000        float  %9.0g
d               float  %9.0g
p               float  %9.0g
q               float  %9.0g
-------------------------------------------------------------------------------
Sorted by: id alterntv
Note: dataset has changed since last saved
. summarize
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
          id |      4728       591.5      341.25          1       1182
    alterntv |      4728         2.5    1.118152          1          4
        mode |      4728    3.005076    .9933008          1          4
       price |      4728    52.08197    53.81289       1.29     666.11
       crate |      4728    .3893684    .5604185      .0002     2.3101
-------------+--------------------------------------------------------
      dbeach |      4728    .1133672    .3170746          0          1
       dpier |      4728    .1505922    .3576888          0          1
    dprivate |      4728    .3536379     .478149          0          1
    dcharter |      4728    .3824027    .4860256          0          1
      pbeach |      4728     103.422    103.6081       1.29    843.186
-------------+--------------------------------------------------------
       ppier |      4728     103.422    103.6081       1.29    843.186
    pprivate |      4728    55.25657    62.69354       2.29     666.11
    pcharter |      4728    84.37924    63.52448      27.29     691.11
      qbeach |      4728    .2410113    .1906919      .0678      .5333
       qpier |      4728    .1622237    .1603389      .0014      .4522
-------------+--------------------------------------------------------
    qprivate |      4728    .1712146    .2097219      .0002      .7369
    qcharter |      4728    .6293679    .7058901      .0021     2.3101
      income |      4728    4099.337    2461.183   416.6667      12500
    ydiv1000 |      4728    4.099337    2.461183   .4166667       12.5
           d |      4728         .25    .4330585          0          1
-------------+--------------------------------------------------------
           p |      4728    86.61996    88.01813       1.29    843.186
           q |      4728    .3009544    .4335593      .0002     2.3101
.
. clogit d q, group(id)
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
                                                  Number of obs   =       4728
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.0207
------------------------------------------------------------------------------
           d |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           q |   .6307908   .0757624     8.33   0.000     .4822993    .7792823
------------------------------------------------------------------------------
. clogit d p, group(id)
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:
.
. *** (2B) Calculate the marginal effects
.
. quietly clogit d p q, group(id)
. predict pinitial
(option pc1 assumed; conditional probability for single outcome within group)
.
. * Now compute marginal effects
. * Consider in turn a change in each price and catch rate
. * Change price by 1 unit and then multiply by 100 as in Table 15.2
. * Change catch rate by 0.001 and then multiply by 1000
.
. * Change p1: price beach
. replace p = p + 1 if alterntv==1
(1182 real changes made)
. predict pnewp1
(option pc1 assumed; conditional probability for single outcome within group)
. gen mep1 = 100*(pnewp1 - pinitial)
. replace p = p - 1 if alterntv==1
(1182 real changes made)
.
. * Change p2: price pier
. replace p = p + 1 if alterntv==2
(1182 real changes made)
. predict pnewp2
(option pc1 assumed; conditional probability for single outcome within group)
. gen mep2 = 100*(pnewp2 - pinitial)
. replace p = p - 1 if alterntv==2
(1182 real changes made)
.
. * Change p3: price private boat
. replace p = p + 1 if alterntv==3
(1182 real changes made)
. predict pnewp3
(option pc1 assumed; conditional probability for single outcome within group)
. gen mep3 = 100*(pnewp3 - pinitial)
. replace p = p - 1 if alterntv==3
        meq3 |      1182   -.0374514
        meq4 |      1182   -.0297604
-------------------------------------------------------------------------------
-> alterntv = 3
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
    pinitial |      1182    .3298317     .173932   .0000756   .6739099
        mep1 |      1182     .084509    .0561326          0   .1815647
        mep2 |      1182    .0799891    .0542687          0    .172469
        mep3 |      1182   -.3897785    .1364849  -.5119085  -.0001532
        mep4 |      1182    .2248109    .1606873   1.24e-08   .5118489
-------------+--------------------------------------------------------
        meq1 |      1182   -.0395636      .02626  -.0849366          0
        meq2 |      1182   -.0374553    .0253917  -.0807345          0
        meq3 |      1182    .1818861    .0633881   .0000721   .2382994
        meq4 |      1182    -.104879    .0748259  -.2382398  -7.28e-09
-------------------------------------------------------------------------------
-> alterntv = 4
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
    pinitial |      1182    .2926737    .1807255    .000078   .7322331
        mep1 |      1182    .0674624    .0398696          0   .1958013
        mep2 |      1182    .0635479    .0381287          0   .1772434
        mep3 |      1182      .22499    .1608719   1.24e-08    .511682
        mep4 |      1182   -.3559665    .1370352  -.5119085  -.0001582
-------------+--------------------------------------------------------
        meq1 |      1182   -.0315891     .018653  -.0915825          0
        meq2 |      1182   -.0297618    .0178418  -.0829399          0
        meq3 |      1182   -.1048757    .0748219  -.2382398  -7.28e-09
        meq4 |      1182    .1662257    .0636901   .0000744   .2382994
.
. ******* (3) CONDITIONAL LOGIT: ALTERNATIVE-INVARIANT REGRESSOR *********
.
. * Here we get clogit to do something that is easier done by mlogit
.
. clear
. infile mode price crate dbeach dpier dprivate dcharter pbeach ppier /*
> */ pprivate pcharter qbeach qpier qprivate qcharter income /*
> */ using nldata.asc
(1182 observations read)
.
. gen ydiv1000 = income/1000
.
. * Data are one entry per individual
. * Need to reshape to 4 observations per individual - one for each alternative
. * Use reshape to do this but first create variable
. * Alternative = 1 if beach, = 2 if pier; = 3 if private boat; = 4 if charter
. gen id = _n
. gen d1 = dbeach
. gen d2 = dpier
. gen d3 = dprivate
. gen d4 = dcharter
. describe
Contains data
  obs:         1,182
 vars:            22
 size:       108,744 (98.9% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
mode            float  %9.0g
price           float  %9.0g
crate           float  %9.0g
dbeach          float  %9.0g
dpier           float  %9.0g
dprivate        float  %9.0g
dcharter        float  %9.0g
pbeach          float  %9.0g
ppier           float  %9.0g
pprivate        float  %9.0g
pcharter        float  %9.0g
qbeach          float  %9.0g
qpier           float  %9.0g
qprivate        float  %9.0g
qcharter        float  %9.0g
income          float  %9.0g
ydiv1000        float  %9.0g
id              float  %9.0g
d1              float  %9.0g
d2              float  %9.0g
d3              float  %9.0g
d4              float  %9.0g
-------------------------------------------------------------------------------
Sorted by:
     Note:  dataset has changed since last saved
. summarize
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        mode |      1182    3.005076    .9936162          1          4
       price |      1182    52.08197    53.82997       1.29     666.11
       crate |      1182    .3893684    .5605964      .0002     2.3101
      dbeach |      1182    .1133672    .3171753          0          1
       dpier |      1182    .1505922    .3578023          0          1
-------------+--------------------------------------------------------
    dprivate |      1182    .3536379    .4783008          0          1
    dcharter |      1182    .3824027    .4861799          0          1
      pbeach |      1182     103.422     103.641       1.29    843.186
       ppier |      1182     103.422     103.641       1.29    843.186
    pprivate |      1182    55.25657    62.71344       2.29     666.11
-------------+--------------------------------------------------------
    pcharter |      1182    84.37924    63.54465      27.29     691.11
      qbeach |      1182    .2410113    .1907524      .0678      .5333
       qpier |      1182    .1622237    .1603898      .0014      .4522
    qprivate |      1182    .1712146    .2097885      .0002      .7369
    qcharter |      1182    .6293679    .7061142      .0021     2.3101
-------------+--------------------------------------------------------
      income |      1182    4099.337    2461.964   416.6667      12500
    ydiv1000 |      1182    4.099337    2.461964   .4166667       12.5
          id |      1182       591.5    341.3583          1       1182
          d1 |      1182    .1133672    .3171753          0          1
          d2 |      1182    .1505922    .3578023          0          1
-------------+--------------------------------------------------------
          d3 |      1182    .3536379    .4783008          0          1
          d4 |      1182    .3824027    .4861799          0          1
.
. reshape long d, i(id) j(alterntv)
(note: j = 1 2 3 4)
Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                     1182   ->    4728
Number of variables                  22   ->      20
j variable (4 values)                     ->   alterntv
xij variables:
                           d1 d2 ... d4   ->   d
-----------------------------------------------------------------------------
. describe
Contains data
  obs:         4,728
 vars:            20
 size:       382,968 (96.3% of memory free)
-------------------------------------------------------------------------------
      income |      4728    4099.337    2461.183   416.6667      12500
    ydiv1000 |      4728    4.099337    2.461183   .4166667       12.5
           d |      4728         .25    .4330585          0          1
.
. gen obsnum=_n
. gen d2 = 0
. replace d2 = 1 if mod(obsnum,4)==2
(1182 real changes made)
. gen d3 = 0
. replace d3 = 1 if mod(obsnum,4)==3
(1182 real changes made)
. gen d4 = 0
. replace d4 = 1 if mod(obsnum,4)==0
(1182 real changes made)
. gen d2y = 0
. replace d2y = d2*ydiv1000
(1182 real changes made)
. gen d3y = 0
. replace d3y = d3*ydiv1000
(1182 real changes made)
. gen d4y = 0
. replace d4y = d4*ydiv1000
(1182 real changes made)
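The data steps above (reshape to one row per individual and alternative, then build alternative dummies and their income interactions) can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the two records are made up rather than read from nldata.asc, and `wide_to_long` is a hypothetical helper name.

```python
def wide_to_long(records, n_alt=4):
    """Expand one row per individual into one row per (individual, alternative)."""
    long_rows = []
    for rec in records:
        for alt in range(1, n_alt + 1):
            row = {
                "id": rec["id"],
                "alterntv": alt,
                "d": rec[f"d{alt}"],            # 1 if this alternative was chosen
                "ydiv1000": rec["ydiv1000"],
            }
            # Alternative-specific dummies and income interactions; alternative 1 is the base
            for j in range(2, n_alt + 1):
                row[f"d{j}"] = 1 if alt == j else 0
                row[f"d{j}y"] = row[f"d{j}"] * rec["ydiv1000"]
            long_rows.append(row)
    return long_rows

# Two made-up individuals (illustrative values only)
wide = [
    {"id": 1, "d1": 0, "d2": 0, "d3": 0, "d4": 1, "ydiv1000": 7.0},
    {"id": 2, "d1": 0, "d2": 1, "d3": 0, "d4": 0, "ydiv1000": 1.25},
]
long_rows = wide_to_long(wide)
```

The sketch keys each dummy on `alterntv` rather than on row position. The log instead uses `mod(obsnum,4)`, which gives the same result only while the data remain sorted by `id` and `alterntv`, as they are immediately after `reshape`.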
. summarize
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          id |      4728       591.5      341.25          1       1182
    alterntv |      4728         2.5    1.118152          1          4
        mode |      4728    3.005076    .9933008          1          4
       price |      4728    52.08197    53.81289       1.29     666.11
       crate |      4728    .3893684    .5604185      .0002     2.3101
-------------+--------------------------------------------------------
      dbeach |      4728    .1133672    .3170746          0          1
       dpier |      4728    .1505922    .3576888          0          1
    dprivate |      4728    .3536379     .478149          0          1
    dcharter |      4728    .3824027    .4860256          0          1
      pbeach |      4728     103.422    103.6081       1.29    843.186
-------------+--------------------------------------------------------
       ppier |      4728     103.422    103.6081       1.29    843.186
    pprivate |      4728    55.25657    62.69354       2.29     666.11
    pcharter |      4728    84.37924    63.52448      27.29     691.11
      qbeach |      4728    .2410113    .1906919      .0678      .5333
       qpier |      4728    .1622237    .1603389      .0014      .4522
-------------+--------------------------------------------------------
    qprivate |      4728    .1712146    .2097219      .0002      .7369
    qcharter |      4728    .6293679    .7058901      .0021     2.3101
      income |      4728    4099.337    2461.183   416.6667      12500
    ydiv1000 |      4728    4.099337    2.461183   .4166667       12.5
           d |      4728         .25    .4330585          0          1
-------------+--------------------------------------------------------
      obsnum |      4728      2364.5        1365          1       4728
          d2 |      4728         .25    .4330585          0          1
          d3 |      4728         .25    .4330585          0          1
          d4 |      4728         .25    .4330585          0          1
         d2y |      4728    1.024834    2.160064          0       12.5
-------------+--------------------------------------------------------
         d3y |      4728    1.024834    2.160064          0       12.5
         d4y |      4728    1.024834    2.160064          0       12.5
.
. * The following gives MNL column of Table 15.2, p.493,
. * which was more easily obtained using mlogit earlier
. clogit d d2 d3 d4 d2y d3y d4y, group(id)
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Number of obs.                     1182   ->    4728
Number of variables                  30   ->      22
j variable (4 values)                     ->   alterntv
xij variables:
                           d1 d2 ... d4   ->   d
                           p1 p2 ... p4   ->   p
                           q1 q2 ... q4   ->   q
-----------------------------------------------------------------------------
. summarize
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          id |      4728       591.5      341.25          1       1182
    alterntv |      4728         2.5    1.118152          1          4
        mode |      4728    3.005076    .9933008          1          4
       price |      4728    52.08197    53.81289       1.29     666.11
       crate |      4728    .3893684    .5604185      .0002     2.3101
-------------+--------------------------------------------------------
      dbeach |      4728    .1133672    .3170746          0          1
       dpier |      4728    .1505922    .3576888          0          1
    dprivate |      4728    .3536379     .478149          0          1
    dcharter |      4728    .3824027    .4860256          0          1
      pbeach |      4728     103.422    103.6081       1.29    843.186
-------------+--------------------------------------------------------
       ppier |      4728     103.422    103.6081       1.29    843.186
    pprivate |      4728    55.25657    62.69354       2.29     666.11
    pcharter |      4728    84.37924    63.52448      27.29     691.11
      qbeach |      4728    .2410113    .1906919      .0678      .5333
       qpier |      4728    .1622237    .1603389      .0014      .4522
-------------+--------------------------------------------------------
    qprivate |      4728    .1712146    .2097219      .0002      .7369
    qcharter |      4728    .6293679    .7058901      .0021     2.3101
      income |      4728    4099.337    2461.183   416.6667      12500
    ydiv1000 |      4728    4.099337    2.461183   .4166667       12.5
           d |      4728         .25    .4330585          0          1
-------------+--------------------------------------------------------
           p |      4728    86.61996    88.01813       1.29    843.186
           q |      4728    .3009544    .4335593      .0002     2.3101
.
. * Bring in alternative specific dummies
. * Since d2-d4 already used instead call them dummy2 - dummy4
. gen obsnum=_n
. gen dummy1 = 0
. replace dummy1 = 1 if mod(obsnum,4)==1
(1182 real changes made)
. gen dummy2 = 0
-------------+--------------------------------------------------------
       ppier |      4728     103.422    103.6081       1.29    843.186
    pprivate |      4728    55.25657    62.69354       2.29     666.11
    pcharter |      4728    84.37924    63.52448      27.29     691.11
      qbeach |      4728    .2410113    .1906919      .0678      .5333
       qpier |      4728    .1622237    .1603389      .0014      .4522
-------------+--------------------------------------------------------
    qprivate |      4728    .1712146    .2097219      .0002      .7369
    qcharter |      4728    .6293679    .7058901      .0021     2.3101
      income |      4728    4099.337    2461.183   416.6667      12500
    ydiv1000 |      4728    4.099337    2.461183   .4166667       12.5
           d |      4728         .25    .4330585          0          1
-------------+--------------------------------------------------------
           p |      4728    86.61996    88.01813       1.29    843.186
           q |      4728    .3009544    .4335593      .0002     2.3101
      obsnum |      4728      2364.5        1365          1       4728
      dummy1 |      4728         .25    .4330585          0          1
      dummy2 |      4728         .25    .4330585          0          1
-------------+--------------------------------------------------------
      dummy3 |      4728         .25    .4330585          0          1
      dummy4 |      4728         .25    .4330585          0          1
         d1y |      4728    1.024834    2.160064          0       12.5
         d2y |      4728    1.024834    2.160064          0       12.5
         d3y |      4728    1.024834    2.160064          0       12.5
-------------+--------------------------------------------------------
         d4y |      4728    1.024834    2.160064          0       12.5
.
. clogit d dummy2 dummy3 dummy4 p q, group(id)
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:
Iteration 6:
-----------------------------------------------------------------------------.
. * The following gives Mixed column of Table 15.2, p.493
. clogit d p q dummy2 dummy3 dummy4 d2y d3y d4y, group(id)
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:
Iteration 6:
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 15.6.3 page 511
. * Nested logit (GEV) model analysis.
. * (1) Set data up and reproduce Mixed estimates in Table 15.2 p.493
. * (2A) Nested logit model estimates (page 511)
. * (2B) Restricted nested logit model estimates (page 511)
. * (2C) Equivalent conditional logit model estimates (same as (2B))
.
. * Related programs are
. * mma15p1mnl.do multinomial and conditional logit using Stata
. * mma15p3mnl.lim multinomial logit using Limdep
. * mma15p4gev.lim conditional and nested logit using Limdep and Nlogit
.
. * To run this program you need data file
. * Nldata.asc
.
. * NOTE: The example here is deliberately simple and merely illustrative.
. *       with nesting structure
. *            /     \
. *          /  \   /  \
. * In this case with parameter rho_j differing across alternatives
. * Stata 8 estimates the earlier variant of the nested logit model
. * rather than the preferred variant given in the text.
. * See the discussion at bottom of page 511 and also Train (2003, p.88)
.
. ********** SETUP **********
.
. set more off
. version 8.0
. set scheme s1mono /* Graphics scheme */
.
. ********** DATA DESCRIPTION **********
.
. * Data Set comes from :
. * J. A. Herriges and C. L. Kling,
. * "Nonlinear Income Effects in Random Utility Models",
. * Review of Economics and Statistics, 81(1999): 62-72
.
. * The data are given as a combined observation with data on all 4 choices.
. * This will work for multinomial logit program.
. * For conditional logit will need to make a new data set which has
. * four separate entries for each observation as there are four alternatives.
.
. * Filename: NLDATA.ASC
. * Format: Ascii
. * Number of Observations: 1182
. * Each observation appears over 3 lines with 4 variables per line
. * so 4 x 1182 = 4728 observations
. * Variable Number and Description
. * 1 Recreation mode choice. = 1 if beach, = 2 if pier; = 3 if private boat; = 4 if charter
. * 2 Price for chosen alternative
. * 3 Catch rate for chosen alternative
. * 4 = 1 if beach mode chosen; = 0 otherwise
. * 5 = 1 if pier mode chosen; = 0 otherwise
. * 6 = 1 if private boat mode chosen; = 0 otherwise
. * 7 = 1 if charter boat mode chosen; = 0 otherwise
. * 8 = price for beach mode
. * 9 = price for pier mode
. * 10 = price for private boat mode
. * 11 = price for charter boat mode
. * 12 = catch rate for beach mode
. * 13 = catch rate for pier mode
. * 14 = catch rate for private boat mode
. * 15 = catch rate for charter boat mode
. * 16 = monthly income
.
. ******* (1) CONDITIONAL LOGIT MODEL (Table 15.2 p.493 Mixed column) *********
.
. infile mode price crate dbeach dpier dprivate dcharter pbeach ppier /*
> */ pprivate pcharter qbeach qpier qprivate qcharter income /*
> */ using nldata.asc
(1182 observations read)
.
. gen ydiv1000 = income/1000
.
. * Data are one entry per individual
. * Need to reshape to 4 observations per individual - one for each alternative
. * Use reshape to do this which also creates variable (see below)
. * alterntv = 1 if beach, = 2 if pier; = 3 if private boat; = 4 if charter
. gen id = _n
. gen d1 = dbeach
. gen p1 = pbeach
. gen q1 = qbeach
. gen d2 = dpier
. gen p2 = ppier
. gen q2 = qpier
. gen d3 = dprivate
. gen p3 = pprivate
. gen q3 = qprivate
. gen d4 = dcharter
. gen p4 = pcharter
. gen q4 = qcharter
. summarize
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        mode |      1182    3.005076    .9936162          1          4
       price |      1182    52.08197    53.82997       1.29     666.11
       crate |      1182    .3893684    .5605964      .0002     2.3101
      dbeach |      1182    .1133672    .3171753          0          1
       dpier |      1182    .1505922    .3578023          0          1
-------------+--------------------------------------------------------
    dprivate |      1182    .3536379    .4783008          0          1
    dcharter |      1182    .3824027    .4861799          0          1
      pbeach |      1182     103.422     103.641       1.29    843.186
       ppier |      1182     103.422     103.641       1.29    843.186
    pprivate |      1182    55.25657    62.71344       2.29     666.11
-------------+--------------------------------------------------------
    pcharter |      1182    84.37924    63.54465      27.29     691.11
      qbeach |      1182    .2410113    .1907524      .0678      .5333
       qpier |      1182    .1622237    .1603898      .0014      .4522
    qprivate |      1182    .1712146    .2097885      .0002      .7369
    qcharter |      1182    .6293679    .7061142      .0021     2.3101
-------------+--------------------------------------------------------
      income |      1182    4099.337    2461.964   416.6667      12500
    ydiv1000 |      1182    4.099337    2.461964   .4166667       12.5
          id |      1182       591.5    341.3583          1       1182
          d1 |      1182    .1133672    .3171753          0          1
          p1 |      1182     103.422     103.641       1.29    843.186
-------------+--------------------------------------------------------
          q1 |      1182    .2410113    .1907524      .0678      .5333
          d2 |      1182    .1505922    .3578023          0          1
          p2 |      1182     103.422     103.641       1.29    843.186
          q2 |      1182    .1622237    .1603898      .0014      .4522
          d3 |      1182    .3536379    .4783008          0          1
-------------+--------------------------------------------------------
          p3 |      1182    55.25657    62.71344       2.29     666.11
          q3 |      1182    .1712146    .2097885      .0002      .7369
          d4 |      1182    .3824027    .4861799          0          1
          p4 |      1182    84.37924    63.54465      27.29     691.11
          q4 |      1182    .6293679    .7061142      .0021     2.3101
.
. reshape long d p q, i(id) j(alterntv)
(note: j = 1 2 3 4)
Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                     1182   ->    4728
Number of variables                  30   ->      22
j variable (4 values)                     ->   alterntv
xij variables:
                           d1 d2 ... d4   ->   d
                           p1 p2 ... p4   ->   p
                           q1 q2 ... q4   ->   q
-----------------------------------------------------------------------------
. * This automatically creates alterntv = 1 (beach), ... 4 (charter)
. describe
Contains data
  obs:         4,728
 vars:            22
 size:       420,792 (95.9% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
id              float  %9.0g
alterntv        byte   %9.0g
mode            float  %9.0g
price           float  %9.0g
crate           float  %9.0g
dbeach          float  %9.0g
dpier           float  %9.0g
dprivate        float  %9.0g
dcharter        float  %9.0g
pbeach          float  %9.0g
ppier           float  %9.0g
pprivate        float  %9.0g
pcharter        float  %9.0g
qbeach          float  %9.0g
qpier           float  %9.0g
qprivate        float  %9.0g
qcharter        float  %9.0g
income          float  %9.0g
ydiv1000        float  %9.0g
d               float  %9.0g
p               float  %9.0g
q               float  %9.0g
-------------------------------------------------------------------------------
                                                Number of obs      =      4728
                                                LR chi2(6)         =  917.1687
                                                Prob > chi2        =    0.0000
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
alterntv     |
           p |  -.0013303    .001081    -1.23   0.218     -.003449    .0007883
           q |   .1284825   .1038986     1.24   0.216     -.075155      .33212
-------------+----------------------------------------------------------------
type         |
      dshore |  -11.40196    9.15307    -1.25   0.213    -29.34164    6.537733
     dshorey |   .1108341   .0531049     2.09   0.037     .0067505    .2149178
-------------+----------------------------------------------------------------
(incl. value |
 parameters) |
type         |
      /shore |   29.98591   24.40089     1.23   0.219    -17.83896    77.81078
       /boat |   14.06438   11.39886     1.23   0.217    -8.276971    36.40572
------------------------------------------------------------------------------
LR test of homoskedasticity (iv = 1): chi2(2)=  145.39   Prob > chi2 = 0.0000
------------------------------------------------------------------------------
. estimates store nlogitunrest
.
. *** (2B) Estimate the restricted nested logit model
. ***      This is the model on p.511 that has log L = -1252
.
. * Set the inclusive value parameters to 1
. nlogit d (alterntv = p q) (type = dshore dshorey), group(id) ivc(shore=1, boat=1)
tree structure specified for the nested logit model
 top --> bottom
     type     alterntv
-------------------------
    shore            1
                     2
     boat            3
                     4
User-defined constraint(s):
IV constraint(s):
 [shore]_cons = 1
 [boat]_cons = 1
initial:       log likelihood = -1256.8179
rescale:       log likelihood = -1256.8179
rescale eq:    log likelihood = -1228.6278
Iteration 0:   log likelihood = -1264.4012
Iteration 1:   log likelihood = -1264.1213  (backed up)
Iteration 2:   log likelihood = -1256.9241  (backed up)
Iteration 3:   log likelihood = -1255.0984  (backed up)
Iteration 4:   log likelihood = -1254.4838
Iteration 5:   log likelihood = -1252.7216
Iteration 6:   log likelihood = -1252.7111
Iteration 7:   log likelihood =  -1252.711
Nested logit estimates
Levels             =          2                 Number of obs      =      4728
Dependent variable =          d                 LR chi2(4)         =  771.7778
Log likelihood     =  -1252.711                 Prob > chi2        =    0.0000
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
alterntv     |
           p |   -.020246   .0012832   -15.78   0.000     -.022761    -.017731
           q |   .7552644   .0918004     8.23   0.000      .575339    .9351899
-------------+----------------------------------------------------------------
type         |
      dshore |  -.5897435   .1565201    -3.77   0.000    -.8965172   -.2829697
     dshorey |  -.0790869   .0381453    -2.07   0.038    -.1538503   -.0043235
-------------+----------------------------------------------------------------
(incl. value |
 parameters) |
type         |
      /shore |          1          .        .       .            .           .
       /boat |          1          .        .       .            .           .
------------------------------------------------------------------------------
LR test of homoskedasticity (iv = 1): chi2(0)=    0.00   Prob > chi2 =      .
------------------------------------------------------------------------------
. estimates store nlogitrest
.
. * Perform a likelihood ratio test that inclusive parameters = 1
. lrtest nlogitunrest nlogitrest
likelihood-ratio test                                  LR chi2(2)  =    145.39
(Assumption: nlogitrest nested in nlogitunrest)        Prob > chi2 =    0.0000
.
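The likelihood-ratio arithmetic behind the `lrtest` result can be checked by hand. The sketch below is illustrative Python, not part of the original program: it takes the reported LR statistic (145.39) and the restricted log likelihood (-1252.711) from this log, backs out the unrestricted log likelihood that they imply (a derived value, not one printed in this excerpt), and evaluates the chi-squared upper tail with a closed form that holds only for even degrees of freedom. The function name `chi2_sf_even_df` is hypothetical.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-squared variate; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i            # term is (x/2)^i / i!
        total += term
    return math.exp(-half) * total

lr_stat = 145.39                    # LR chi2(2) reported by lrtest
loglik_restricted = -1252.711       # reported for the restricted nested logit
# Implied by LR = 2*(lnL_unrestricted - lnL_restricted); not printed in this excerpt
loglik_unrestricted_implied = loglik_restricted + lr_stat / 2.0
p_value = chi2_sf_even_df(lr_stat, 2)
```

The p-value is far below any conventional significance level, consistent with the `Prob > chi2 = 0.0000` shown by Stata, so the restriction that both inclusive value parameters equal 1 is rejected.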
. *** (2C) As a check, verify that this restricted nested logit = conditional logit
.
. clogit d p q dshore dshorey, group(id)
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:
------------------------------------------------------------------------------------------------------
       log:  c:\Imbook\bwebpage\Section4\mma16p1tobit.txt
log type: text
opened on: 19 May 2005, 13:00:31
.
. ********** OVERVIEW OF MMA16P1TOBIT.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 16.2.1 pages 530-1 and 16.9.2 page 565
. * Classic Tobit model with generated data
. * Provides
. * (1) Graph of various conditional means Figure 16.1 (ch16condmeans.wmf)
. * (2) Tobit model estimation: various estimators not reported in book
. * (3) Tobit model estimation: CLAD estimation mentioned on page 565
. * using generated data (see below)
.
. ********** SETUP **********
.
. set more off
. version 8.0
. set scheme s1mono /* Used for graphs */
.
. ********** GENERATE DATA **********
.
. * Data generating process is
. * Regressor:           lnwage ~ N(2.75, 0.6^2)
. * Error term:          e ~ N(0, 1000^2)
. * Latent variable:     ystar = -2500 + 1000*lnwage + e
. * Truncated variable: ytrunc = 1(ystar>0)*ystar
. * Censored variable: ycens = 1(ystar<=0)*0 + 1(ystar>0)*ystar
. * Censoring Indicator: dy = 1(ycens>0)
.
. set seed 10101
. set obs 200
obs was 0, now 200
. gen e = 1000*invnorm(uniform( ))
. gen lnwage = 2.75 + 0.6*invnorm(uniform( ))
. gen ystar = -2500 + 1000*lnwage + e
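The mapping from the latent variable to the observed outcomes described in the data generating process comments can be written out explicitly. This is a small Python sketch with hypothetical function names. One assumption is flagged in the code: a truncated observation is treated as unobserved (`None`), which is how the 130-observation regressions on `ytrunc` later in this log appear to behave.

```python
def latent_ystar(lnwage, e):
    """Latent variable from the comments: ystar = -2500 + 1000*lnwage + e."""
    return -2500.0 + 1000.0 * lnwage + e

def observed_outcomes(ystar):
    """Return (ytrunc, ycens, dy) for a given latent ystar."""
    dy = 1 if ystar > 0 else 0                 # censoring indicator
    ycens = ystar if ystar > 0 else 0.0        # censored from below at zero
    # Assumption: truncated cases are dropped (None here, missing in Stata)
    ytrunc = ystar if ystar > 0 else None
    return ytrunc, ycens, dy

# Evaluated at the mean of lnwage with a zero error draw
example = observed_outcomes(latent_ystar(2.75, 0.0))
```

At the regressor mean with a zero error the latent value is 250, so the observation is uncensored; any draw with `ystar <= 0` is recorded as 0 in `ycens` and is absent from the truncated sample.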
.
. * (2A) ESTIMATE THE VARIOUS MODELS
.
. *** UNCENSORED OLS REGRESSION
. * Possible here since for these generated data we actually know ystar
. * Yields consistent estimate. Expect slope = 1000 approximately.
. regress ystar lnwage, robust
Regression with robust standard errors                 Number of obs =     200
                                                       F(  1,   198) =   96.32
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.2944
                                                       Root MSE      =     980
------------------------------------------------------------------------------
             |               Robust
       ystar |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnwage |    1010.39   102.9518     9.81   0.000     807.3673    1213.413
       _cons |   -2452.05   303.2432    -8.09   0.000    -3050.051   -1854.049
------------------------------------------------------------------------------
. estimates store ols
. predict ystarols
(option xb assumed; fitted values)
.
. *** CENSORED OLS REGRESSION
. * Yields inconsistent estimates
. * From subsection 16.3.6 for slope coefficient OLS converges to p times b
. * where p is fraction of sample with positive values. Here 0.65*1000 = 650.
. regress ycens lnwage, robust
Regression with robust standard errors                 Number of obs =     200
                                                       F(  1,   198) =   84.20
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.2522
                                                       Root MSE      =  660.04
------------------------------------------------------------------------------
             |               Robust
       ycens |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnwage |   611.8108   66.67493     9.18   0.000     480.3267    743.2949
       _cons |  -1027.577   176.0776    -5.84   0.000    -1374.805   -680.3484
------------------------------------------------------------------------------
. estimates store censols
. predict ycensols
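The comment before this regression states that the censored OLS slope converges to p times b, where p is the share of positive outcomes. A rough simulation check of that claim is sketched below in Python. It is an illustration under assumptions, not a replication: Python's random stream differs from Stata's, so the draws do not match the log, and the sketch uses the population share implied by the data generating process (about 0.585) rather than the sample share of 0.65 quoted in the comment.

```python
import random
from statistics import NormalDist

def censored_ols_slope(n=100_000, seed=12345):
    """OLS slope of max(ystar, 0) on lnwage for data simulated from the stated DGP."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        lnwage = rng.gauss(2.75, 0.6)
        ystar = -2500.0 + 1000.0 * lnwage + rng.gauss(0.0, 1000.0)
        xs.append(lnwage)
        ys.append(max(ystar, 0.0))           # censor from below at zero
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    return sxy / sxx

# ystar is N(250, 600^2 + 1000^2), so p = Pr(ystar > 0)
p_positive = 1.0 - NormalDist(250.0, (600.0**2 + 1000.0**2) ** 0.5).cdf(0.0)
predicted_slope = p_positive * 1000.0
simulated_slope = censored_ols_slope()
```

In a large simulated sample the censored OLS slope sits close to `predicted_slope`, well below the true value of 1000. The 200-observation sample in the log has a higher share of positives (0.65) and a slope of 611.81, which is in line with the same attenuation.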
                                                       Number of obs =     130
------------------------------------------------------------------------------
             |               Robust
      ytrunc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnwage |   442.6319   94.26938     4.70   0.000     256.1038      629.16
       _cons |  -282.4444   282.9091    -1.00   0.320    -842.2285    277.3396
------------------------------------------------------------------------------
. estimates store truncols
. predict ytrunols
(option xb assumed; fitted values)
.
. *** CENSORED TOBIT MLE REGRESSION for HWAGE
. * Yields consistent estimates
. tobit ycens lnwage, ll(0)
Tobit estimates                                   Number of obs   =        200
                                                  LR chi2(1)      =      65.64
                                                  Prob > chi2     =     0.0000
Log likelihood = -1118.3857                       Pseudo R2       =     0.0285
------------------------------------------------------------------------------
       ycens |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnwage |   956.4877   116.8382     8.19   0.000     726.0879    1186.887
       _cons |  -2244.567   346.8778    -6.47   0.000    -2928.595   -1560.539
-------------+----------------------------------------------------------------
         _se |   896.6811   59.14988           (Ancillary parameter)
------------------------------------------------------------------------------
  Obs. summary:        130
. predict ycenstob
(option xb assumed; fitted values)
.
. *** TRUNCATED TOBIT MLE REGRESSION for HWAGE
. * If done properly yields consistent estimates
. * Not sure how to do this in Stata
. * The obvious command is
. * tobit ytrunc lnwage, ll(0)
. * but this gives the same estimates as truncated OLS
.
. *** PROBIT REGRESSION for HWAGE
. * Yields consistent estimates for slope b/s = 1000/1000 = 1
. * but uses less information so expect less efficient than tobit
. probit dy lnwage
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Probit estimates                                  Number of obs   =        200
                                                  LR chi2(1)      =      48.39
                                                  Prob > chi2     =     0.0000
Log likelihood = -105.29672                       Pseudo R2       =     0.1868
------------------------------------------------------------------------------
          dy |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnwage |   1.173851   .1870053     6.28   0.000     .8073277    1.540375
       _cons |  -2.795715    .508104    -5.50   0.000     -3.79158   -1.799849
------------------------------------------------------------------------------
. estimates store probit
. predict yprobit
(option p assumed; Pr(dy))
.
. *** HECKMAN 2-STEP ESTIMATOR DONE MANUALLY
. * Yields consistent estimates but less efficient than censored tobit MLE
. * The second stage standard errors will be incorrect
. probit dy lnwage
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Probit estimates                                  Number of obs   =        200
                                                  LR chi2(1)      =      48.39
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.1868
------------------------------------------------------------------------------
          dy |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnwage |   1.173851   .1870053     6.28   0.000     .8073277    1.540375
       _cons |  -2.795715    .508104    -5.50   0.000     -3.79158   -1.799849
------------------------------------------------------------------------------
. predict probity, xb
. gen invmills = normd(probity)/normprob(probity)
. summarize dy probity invmills
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          dy |       200         .65    .4781665          0          1
     probity |       200     .482335    .7335506  -1.734574    2.33808
    invmills |       200    .5867037    .3823083   .0261866   2.140342
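The `invmills` variable is the standard normal density divided by the standard normal cdf, both evaluated at the fitted probit index. A short Python sketch of that formula follows; the function name is hypothetical, and the two index values are the minimum and maximum of `probity` reported in the summary above.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def inverse_mills(z):
    """phi(z) / Phi(z): selection-correction term at probit index z."""
    return _STD_NORMAL.pdf(z) / _STD_NORMAL.cdf(z)

# The ratio is decreasing in z, so the largest index gives the smallest ratio
lam_at_max_index = inverse_mills(2.33808)
lam_at_min_index = inverse_mills(-1.734574)
```

These two values agree with the reported minimum (.0261866) and maximum (2.140342) of `invmills`. Because the ratio is a smooth, nearly linear function of the index over this range, and the index is itself linear in `lnwage`, the second-stage regression that includes both `lnwage` and `invmills` is close to collinear.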
. regress ytrunc lnwage invmills
      Source |       SS       df       MS              Number of obs =     130
-------------+------------------------------           F(  2,   127) =    9.41
       Model |  8440402.78     2  4220201.39           Prob > F      =  0.0002
    Residual |  56971158.9   127  448591.802           R-squared     =  0.1290
-------------+------------------------------           Adj R-squared =  0.1153
       Total |  65411561.6   129  507066.369           Root MSE      =  669.77
------------------------------------------------------------------------------
      ytrunc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnwage |   176.6468   418.2392     0.42   0.673    -650.9731    1004.267
    invmills |  -498.9958   760.3525    -0.66   0.513    -2003.596    1005.604
       _cons |   745.3069   1597.558     0.47   0.642    -2415.972    3906.586
------------------------------------------------------------------------------
. estimates store heck2step
. correlate lnwage invmills
(obs=200)
             |   lnwage invmills
-------------+------------------
      lnwage |   1.0000
    invmills |  -0.9745   1.0000
                                                       Number of obs =     130
------------------------------------------------------------------------------
             |               Robust
      ytrunc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnwage |   176.6468   379.1739     0.47   0.642    -573.6699    926.9636
    invmills |  -498.9958   635.4917    -0.79   0.434    -1756.519    758.5276
       _cons |   745.3069   1431.149     0.52   0.603     -2086.68    3577.293
------------------------------------------------------------------------------
. estimates store heck2srobust
.
. *** HECKMAN 2-STEP ESTIMATOR DONE USING BUILT-IN HECKMAN COMMAND
. * Yields consistent estimates but less efficient than censored tobit MLE
. heckman ytrunc lnwage, select(lnwage) twostep
Heckman selection model -- two-step estimates   Number of obs      =       200
(regression model with sample selection)        Censored obs       =        70
                                                Uncensored obs     =       130
                                                Wald chi2(2)       =     39.57
                                                Prob > chi2        =    0.0000
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
ytrunc       |
      lnwage |   176.6469   425.0025     0.42   0.678    -656.3428    1009.636
       _cons |   745.3067   1617.583     0.46   0.645    -2425.098    3915.711
-------------+----------------------------------------------------------------
select       |
      lnwage |   1.173851   .1870053     6.28   0.000     .8073277    1.540375
       _cons |  -2.795715    .508104    -5.50   0.000     -3.79158   -1.799849
-------------+----------------------------------------------------------------
mills        |
      lambda |  -498.9957   760.5005    -0.66   0.512    -1989.549    991.5578
-------------+----------------------------------------------------------------
         rho |   -0.67419
       sigma |   740.1433
      lambda | -498.99575   760.5005
------------------------------------------------------------------------------
. estimates store heckman
. predict ystarhec, xb
. predict ytrunhec, ycond
. predict ycenshec, yexpected
. predict yinvmill, mills
. predict yprobsel, psel
. correlate lnwage yinvmill
(obs=200)
| lnwage yinvmill
-------------+-----------------lnwage | 1.0000
yinvmill | -0.9745 1.0000
.
. * (2B) DISPLAY COEFFICIENT ESTIMATES
.
. * OLS estimates True model is -2500 + 1000*lnwage
. estimates table ols censols truncols, b(%10.2f) se(%10.2f) t stats(N ll)
-----------------------------------------------------
    Variable |    ols        censols     truncols
-------------+---------------------------------------
      lnwage |   1010.39      611.81      442.63
             |    102.95       66.67       94.27
             |      9.81        9.18        4.70
       _cons |  -2452.05    -1027.58     -282.44
             |    303.24      176.08      282.91
             |     -8.09       -5.84       -1.00
-------------+---------------------------------------
           N |    200.00      200.00      130.00
          ll |  -1660.29    -1581.24    -1029.07
-----------------------------------------------------
                                       legend: b/se/t
.
. * Tobit estimates True model is -2500 + 1000*lnwage
. estimates table censtobit probit, b(%10.2f) se(%10.2f) t stats(N ll)
----------------------------------------
    Variable | censtobit     probit
-------------+--------------------------
      lnwage |    956.49        1.17
             |    116.84        0.19
             |      8.19        6.28
         _se |    896.68
             |     59.15
             |     15.16
       _cons |  -2244.57       -2.80
             |    346.88        0.51
             |     -6.47       -5.50
-------------+--------------------------
           N |    200.00      200.00
          ll |  -1118.39     -105.30
----------------------------------------
                          legend: b/se/t
.
. * Tobit estimates using Heckman manual True model is -2500 + 1000*lnwage
. estimates table heck2step heck2srobust, b(%10.2f) se(%10.2f) t stats(N ll)
----------------------------------------
    Variable | heck2step   heck2sro~t
-------------+--------------------------
      lnwage |    176.65      176.65
             |    418.24      379.17
             |      0.42        0.47
    invmills |   -499.00     -499.00
             |    760.35      635.49
             |     -0.66       -0.79
       _cons |    745.31      745.31
             |   1597.56     1431.15
             |      0.47        0.52
-------------+--------------------------
           N |    130.00      130.00
          ll |  -1028.85    -1028.85
----------------------------------------
                          legend: b/se/t
.
. * Tobit estimates using Heckman built-in True model is -2500 + 1000*lnwage
. estimates table heckman, b(%10.2f) se(%10.2f) t stats(N ll)
---------------------------
    Variable |  heckman
-------------+-------------
ytrunc       |
      lnwage |    176.65
             |    425.00
             |      0.42
       _cons |    745.31
             |   1617.58
             |      0.46
-------------+-------------
select       |
      lnwage |      1.17
             |      0.19
             |      6.28
       _cons |     -2.80
             |      0.51
             |     -5.50
-------------+-------------
mills        |
      lambda |   -499.00
             |    760.50
             |     -0.66
-------------+-------------
Statistics   |
           N |    200.00
          ll |
---------------------------
             legend: b/se/t
.
. ********** (3) CLAD ESTIMATION FOR THESE DATA page 565 **********
.
. * Compare tobit MLE with censored least absolute deviations (CLAD) estimator
. * Gives results at end of section 16.9.3 page 565
.
. tobit ycens lnwage, ll(0)
Tobit estimates                                   Number of obs   =        200
                                                  LR chi2(1)      =      65.64
                                                  Prob > chi2     =     0.0000
Log likelihood = -1118.3857                       Pseudo R2       =     0.0285
------------------------------------------------------------------------------
       ycens |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnwage |   956.4877   116.8382     8.19   0.000     726.0879    1186.887
       _cons |  -2244.567   346.8778    -6.47   0.000    -2928.595   -1560.539
-------------+----------------------------------------------------------------
         _se |   896.6811   59.14988           (Ancillary parameter)
------------------------------------------------------------------------------
  Obs. summary:        130
. gen c = 4*(50-_n)/100
. gen PHIc = norm(c)
. gen phic = normden(c)
. gen lamdac = phic/(1-PHIc)
.
. * Descriptive statistics
. summarize
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           c |       100        -.02     1.16046         -2       1.96
        PHIc |       100    .4952275     .338039   .0227501   .9750021
        phic |       100    .2386177    .1157086    .053991   .3989423
      lamdac |       100    .9284788    .7023349   .0552479   2.337835
.
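The four `gen` commands and the summary above can be reproduced outside Stata. The following is a minimal sketch in Python, assuming only the standard library: `statistics.NormalDist` stands in for Stata's `norm()` and `normden()`, and the names `mills_truncated_below`, `cutoffs` and `lamdac_values` are illustrative rather than part of the original program.

```python
from statistics import NormalDist

std_normal = NormalDist()  # N[0,1]

def mills_truncated_below(c):
    """phi(c) / (1 - PHI(c)): the quantity stored in lamdac."""
    return std_normal.pdf(c) / (1.0 - std_normal.cdf(c))

# Same grid as "gen c = 4*(50-_n)/100" with _n = 1, ..., 100
cutoffs = [4 * (50 - n) / 100 for n in range(1, 101)]
lamdac_values = [mills_truncated_below(c) for c in cutoffs]

print(min(cutoffs), max(cutoffs))               # range of the cutoff grid
print(min(lamdac_values), max(lamdac_values))   # compare with the lamdac row
```

The minimum and maximum printed by the sketch can be compared with the `c` and `lamdac` rows of the summary table; small differences in the last digits are to be expected because Stata stores the generated variables in single precision.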
. *********** FIGURE 16.2 page 540 ***********
.
. * This graph shows Mills ratio and cdf and density
. graph twoway (scatter lamdac c, c(l) msize(vtiny) clstyle(p1) clwidth(medthick)) /*
> */ (scatter PHIc c, c(l) msize(vtiny) clstyle(p3) clwidth(medthick)) /*
> */ (scatter phic c, c(l) msize(vtiny) clstyle(p2) clwidth(medthick)), /*
> */ scale (1.2) plotregion(style(none)) /*
> */ title("Inverse Mills Ratio as Cutoff Varies") /*
> */ xtitle("Cutoff point c", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Inverse Mills, pdf and cdf", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend(pos(11) ring(0) col(1)) legend(size(small)) /*
> */ legend( label(1 "Inverse Mills ratio") label(2 "N[0,1] Cdf") label(3 "N[0,1] Density"))
. graph export ch16millsratio.wmf, replace
(file c:\Imbook\bwebpage\Section4\ch16millsratio.wmf written in Windows Metafile format)
.
. ********** CLOSE OUTPUT ***********
. log close
log: c:\Imbook\bwebpage\Section4\mma16p2mills.txt
log type: text
closed on: 19 May 2005, 13:02:15
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section4\mma16p3selection.txt
log type: text
opened on: 19 May 2005, 13:04:33
.
. ********** OVERVIEW OF MMA16P3SELECTION.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 16.6 pages 553-5
. * Selection models example
. * It provides
. * (1) Two-part model estimation (Table 16.1)
. * (2) Selection model estimation
. * (2A) ML estimates (Table 16.1)
. * (2B) Heckman 2-step estimates (Table 16.1)
. * (2C) Check for possible collinearity problems in Heckman 2-Step
.
. * To use this program you need health expenditure data in Stata data set
. * randdata.dta
.
. ********** SETUP **********
.
. set more off
. version 8.0
. set scheme s1mono /* Used for graphs */
.
. ********** DATA DESCRIPTION **********
.
. * Essentially same data as in P. Deb and P.K. Trivedi (2002)
. * "The Structure of Demand for Medical Care: Latent Class versus
. * Two-Part Models", Journal of Health Economics, 21, 601-625
. * except that paper used different outcome (counts rather than $)
.
. * Each observation is for an individual over a year.
. * Individuals may appear in up to five years.
. * All available sample is used except only fee for service plans included.
. * In analysis here only year 2 is used so panel complications are avoided.
. * Clustering of individuals within household is ignored here.
.
. * Dependent variable is
. *    MED       med       Annual medical expenditures in constant dollars
. *                        excluding dental and outpatient mental
. *    LNMED     lnmeddol  Ln(Medical expenditures) given meddol > 0
. *                        Missing otherwise
. *    DMED      binexp    1 if medical expenditures > 0
.
. * Regressors are
. * - Health insurance measures
. *    LC        logc      log(coinsrate+1) where coinsurance rate is 0 to 100
. *    IDP       idp       1 if individual deductible plan
. *    LPI       lpi       log(annual participation incentive payment) or 0 if no payment
. *    FMDE      fmde      log(max(medical deductible expenditure)) if IDP=1 and MDE>1 or 0 otherwise.
. * - Health status measures
. *    NDISEASE  disea     number of chronic diseases
. *    PHYSLIM   physlm    1 if physical limitation
. *    HLTHG     hlthg     1 if good health
. *    HLTHF     hlthf     1 if fair health
. *    HLTHP     hlthp     1 if poor health (omitted is excellent)
. * - Socioeconomic characteristics
. *    LINC      linc      log of annual family income (in $)
. *    LFAM      lfam      log of family size
. *    EDUCDEC   educdec   years of schooling of decision maker
. *    AGE       xage      exact age
. *    BLACK     black     1 if black
. *    FEMALE    female    1 if female
. *    CHILD     child     1 if child
. *    FEMCHILD  fchild    1 if female child
.
. * If panel data used then clustering is on
. *    zper      person id
.
. ********** READ DATA **********
.
. use randdata.dta, clear
. sum
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        plan |     20190    11.17553    3.976751          1         19
        site |     20190    3.298811     1.80382          1          6
       coins |     20190     26.3056    36.40386          0        100
    tookphys |     20190    .5974245    .4904288          0          1
        year |     20190    2.420109    1.217141          1          5
-------------+--------------------------------------------------------
        zper |     20190    357965.5    180868.1     125024     632167
       black |     20190    .1814983    .3827071          0          1
      income |     20190    8037.409    4058.371          0   29237.54
        xage |     20190    25.72233    16.76945          0   64.27515
      female |     20190    .5170381     .499722          0          1
-------------+--------------------------------------------------------
     educdec |     20186    11.96681    2.806255          0         25
        time |     20190    .9989561    .0259741   .0767123          1
     outpdol |     20190    51.12649    94.92627          0   2599.902
     drugdol |     20190     13.1687    33.76212          0   706.3979
     suppdol |     20190      6.8024    21.39346          0    1009.47
-------------+--------------------------------------------------------
     mentdol |     20190    6.870347    58.41298          0   1340.834
. global XLIST LC IDP LPI FMDE PHYSLIM NDISEASE HLTHG HLTHF HLTHP /*
> */ LINC LFAM EDUCDEC AGE FEMALE CHILD FEMCHILD BLACK
.
. * Summarize the dependents and regressors
. sum MED DMED LNMED $XLIST
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         MED |      5574    169.7247    802.8303          0   39182.02
        DMED |      5574    .7680301    .4221277          0          1
       LNMED |      4281    4.069462    1.499372  -.5343859   10.57597
          LC |      5574    2.420739    2.043883          0   4.564348
         IDP |      5574     .261751    .4396272          0          1
-------------+--------------------------------------------------------
         LPI |      5574    4.726834    2.681354          0   7.163699
        FMDE |      5574    4.065015    3.450558          0   8.294049
     PHYSLIM |      5574    .1242463    .3233768          0          1
    NDISEASE |      5574    11.20526    6.788959          0       58.6
       HLTHG |      5574    .3649085    .4814477          0          1
-------------+--------------------------------------------------------
       HLTHF |      5574    .0782203     .268542          0          1
       HLTHP |      5574    .0156082     .123965          0          1
        LINC |      5574    8.696929    1.220592          0   10.28324
        LFAM |      5574    1.241407    .5403965          0   2.564949
     EDUCDEC |      5574     11.9466    2.837492          0         25
-------------+--------------------------------------------------------
         AGE |      5574    25.57613    16.73011   .0253251   63.27515
      FEMALE |      5574    .5184787    .4997032          0          1
       CHILD |      5574    .4050951    .4909545          0          1
    FEMCHILD |      5574    .1955508    .3966597          0          1
       BLACK |      5574    .1859852    .3860055          0          1
.
. * Detailed summary shows that MED>0 very skewed whereas LNMED is not
. sum MED LNMED if MED>0, detail
               medical exp excl outpatient men
-------------------------------------------------------------
      Percentiles      Smallest
 1%     2.109705       .5860291
 5%     5.752914       .6630728
10%     9.376465       .6770833       Obs                4281
25%     21.31435       .6770833       Sum of Wgt.        4281

50%     52.64357                      Mean            220.987
                        Largest       Std. Dev.      909.9021
75%     136.4518       12044.11
90%     453.8059       17465.98       Variance       827921.9
95%      904.328       18641.98       Skewness       24.00829
99%     2666.309       39182.02       Kurtosis        873.379

                            LNMED
-------------------------------------------------------------
      Percentiles      Smallest
 1%      .746548      -.5343859
 5%     1.749707      -.4108706
10%     2.238203      -.3899609       Obs                4281
25%     3.059381      -.3899609       Sum of Wgt.        4281

50%     3.963544                      Mean           4.069462
                        Largest       Std. Dev.      1.499372
75%     4.915971       9.396331
90%      6.11767        9.76801       Variance       2.248116
95%     6.807192       9.833171       Skewness        .347695
99%     7.888451       10.57597       Kurtosis        3.28909
.
. * Write final data to a text (ascii) file so can use with programs other than Stata
. outfile DMED MED LNMED LC IDP LPI FMDE PHYSLIM NDISEASE HLTHG HLTHF HLTHP /*
> */ LINC LFAM EDUCDEC AGE FEMALE CHILD FEMCHILD BLACK /*
> */ using mma16p3selection.asc, replace
.
. ****************** CHAPTER 16.6 REGRESSION ANALYSIS **************
.
. * The analysis below models log expenditure (lny), not expenditure (y)
. * where here y = MED and lny = LNMED.
.
. * This makes regular tobit difficult as it is not clear
. * what the censoring/truncation point is since ln(0) = -infinity
. * Also note that some LNMED<0 as 0<MED<1 is possible.
. * So just do two-part model and sample selection model.
.
. * Interested in comparing MED not LNMED at end of day.
. * So use
. * If lny = xb + u, u ~ N[0, s^2] for y > 0
. * Then E[y] = exp(xb + (s^2)/2) for y > 0
. * and E[y] = Pr[y>0]*exp(xb + (s^2)/2) for all y
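The retransformation formulas in the comments above convert a prediction of ln(y) into a prediction of y in levels. A minimal Python sketch of those two formulas follows; the function names and the numeric inputs are made up for illustration and are not estimates from this log.

```python
import math

def expected_level_given_positive(xb, s):
    """E[y | y>0] = exp(xb + s^2/2) when ln y = xb + u, u ~ N[0, s^2]."""
    return math.exp(xb + 0.5 * s ** 2)

def expected_level_all(prob_positive, xb, s):
    """E[y] = Pr[y>0] * E[y | y>0], averaging over the zeroes as well."""
    return prob_positive * expected_level_given_positive(xb, s)

# Illustrative (made-up) inputs
print(expected_level_given_positive(4.0, 1.5))
print(expected_level_all(0.77, 4.0, 1.5))
```

The s^2/2 term matters: simply exponentiating the predicted log understates the conditional mean whenever s > 0, and the formula relies on the normality and homoskedasticity of u stated in the comments.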
.
. * The models estimated are
. * (1) Two-part model using
. * (a) probit for whether positive y
. * (b) regress with lny as dependent variable
. * (2) Sample selection model similar to (1)
. * except that inverse Mills ratio appears in (b), estimated by
. * (a) MLE
. * (b) Heckman 2-step
.
. * Additionally censored tobit and truncated tobit commands in levels
. * are given below for completeness.
.
. ************ (1) TWO-PART MODEL ************
.
. * Two-part model: binary probit and then lognormal for expenditures
.
. * First part: probit for MED > 0
. probit DMED $XLIST      /* global XLIST defined earlier */
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Probit estimates                          Number of obs = 5574
                                          LR chi2(17)   = 657.11
                                          Prob > chi2   = 0.0000
Log likelihood = -2690.5768               Pseudo R2     = 0.1088
------------------------------------------------------------------------------
DMED | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
LC | -.118708 .0269005 -4.41 0.000 -.1714319 -.065984
IDP | -.1279483 .0522351 -2.45 0.014 -.2303272 -.0255693
LPI | .0283091 .0088793 3.19 0.001 .010906 .0457121
FMDE | .0075319 .0161584 0.47 0.641 -.024138 .0392018
PHYSLIM | .2732013 .0743761 3.67 0.000 .1274268 .4189758
NDISEASE | .0224861 .0035958 6.25 0.000 .0154384 .0295338
HLTHG | .0387516 .0438545 0.88 0.377 -.0472016 .1247049
HLTHF | .1920062 .0836688 2.29 0.022 .0280185 .355994
HLTHP | .6397294 .2126322 3.01 0.003 .222978 1.056481
LINC | .0518413 .0168128 3.08 0.002 .0188889 .0847938
LFAM | -.0335599 .041728 -0.80 0.421 -.1153452 .0482253
EDUCDEC | .036307 .0076536 4.74 0.000 .0213062 .0513078
AGE | .0002631 .0021606 0.12 0.903 -.0039715 .0044978
FEMALE | .4451035 .054292 8.20 0.000 .3386932 .5515138
CHILD | .111489 .0808338 1.38 0.168 -.0469424 .2699203
FEMCHILD | -.4512845 .0799219 -5.65 0.000 -.6079284 -.2946405
BLACK | -.6057367 .0523148 -11.58 0.000 -.7082718 -.5032017
_cons | -.271605 .1877345 -1.45 0.148 -.6395579 .0963478
------------------------------------------------------------------------------
. estimates store twoparta
. scalar llprobit = e(ll)          /* Log-likelihood */
. predict probsel2part, p          /* Pr[y>0] = PHI(x'b) */
. predict xbprobit, xb             /* x'b */
.
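The LR chi2 and Pseudo R2 entries in the probit header can be checked by hand, because an intercept-only probit fits the sample share of positives exactly. The sketch below is a Python illustration under that standard result; the count of 4281 positive observations is taken from elsewhere in this log (the Obs column for LNMED), and the variable names are illustrative.

```python
import math

n_obs, n_pos = 5574, 4281      # Number of obs; observations with MED > 0
ll_full = -2690.5768           # "Log likelihood" reported by probit

# Log-likelihood of the intercept-only model depends only on the counts
p_hat = n_pos / n_obs
ll_null = n_pos * math.log(p_hat) + (n_obs - n_pos) * math.log(1 - p_hat)

lr_chi2 = 2 * (ll_full - ll_null)      # compare with LR chi2(17)
pseudo_r2 = 1 - ll_full / ll_null      # compare with Pseudo R2

print(round(lr_chi2, 2), round(pseudo_r2, 4))
```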
. predict pLNMED, xb
.
. * Compare predictions to actual including zeroes
. sum MED pMEDall2part DMED probsel2part
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         MED |      5574    169.7247    802.8303          0   39182.02
pMEDall2part |      5574     140.966    120.2022   4.880651   1729.783
        DMED |      5574    .7680301    .4221277          0          1
probsel2part |      5574    .7678377    .1457464   .1526731    .999246
. corr MED pMEDall2part DMED probsel2part
(obs=5574)
             |      MED pMEDal~t     DMED probse~t
-------------+------------------------------------
         MED |   1.0000
pMEDall2part |   0.1772   1.0000
        DMED |   0.1162   0.2158   1.0000
probsel2part |   0.1031   0.6380   0.3467   1.0000
.
. ************ (2) SELECTION MODEL ************
.
. * Sample selection model for log expenditures
. * Selection equation:
. *    Observe y = y* if I = z'a + u > 0,  u ~ N[0,1]
. * Regression equation:
. *    y* = x'b + v,  v ~ N[0,s^2] and Corr[u,v]=rho
.
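Under the bivariate normal setup stated in the comments above, the mean of y for selected observations is x'b plus a correction term, rho*s*lambda(z'a), where lambda is the inverse Mills ratio. The Python sketch below illustrates that textbook result with made-up index values; the function names are hypothetical and it is not a reconstruction of how the program computes its own predictions.

```python
from statistics import NormalDist

std_normal = NormalDist()

def inverse_mills(za):
    """lambda(z'a) = phi(z'a) / PHI(z'a)."""
    return std_normal.pdf(za) / std_normal.cdf(za)

def prob_selected(za):
    """Pr[I > 0] = PHI(z'a)."""
    return std_normal.cdf(za)

def mean_given_selected(xb, za, rho, s):
    """E[y | I > 0] = x'b + rho*s*lambda(z'a) under joint normality of (u, v)."""
    return xb + rho * s * inverse_mills(za)

# Illustrative (made-up) index values
print(prob_selected(0.8))
print(mean_given_selected(xb=4.0, za=0.8, rho=0.5, s=1.4))
```

With rho = 0 the correction term vanishes and the outcome equation can be estimated separately from the selection equation, which is the independence assumption behind the two-part model of part (1).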
. * (2A) MLE for sample selection model
. heckman LNMED $XLIST, select (DMED = $XLIST)
Iteration 0: log likelihood = -10183.753 (not concave)
Iteration 1: log likelihood = -10183.676 (not concave)
Iteration 2: log likelihood = -10183.593 (not concave)
Iteration 3: log likelihood = -10183.525 (not concave)
Iteration 4: log likelihood = -10183.467 (not concave)
Iteration 5: log likelihood = -10183.408 (not concave)
Iteration 6: log likelihood = -10183.311 (not concave)
Iteration 7: log likelihood = -10183.21 (not concave)
Iteration 8: log likelihood = -10179.155
Iteration 9: log likelihood = -10176.799
Iteration 10: log likelihood = -10170.17
Iteration 11: log likelihood = -10170.11
Iteration 12: log likelihood = -10170.11
Heckman selection model                         Number of obs   = 5574
(regression model with sample selection)        Censored obs    = 1293
                                                Uncensored obs  = 4281
                                                Wald chi2(17)   = 805.17
                                                Prob > chi2     = 0.0000
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
LNMED |
LC | -.0760236 .0337456 -2.25 0.024 -.1421638 -.0098833
IDP | -.1497199 .0661379 -2.26 0.024 -.2793478 -.020092
LPI | .01493 .0105015 1.42 0.155 -.0056526 .0355127
FMDE | -.023522 .0194745 -1.21 0.227 -.0616913 .0146474
PHYSLIM | .3548628 .0755425 4.70 0.000 .2068023 .5029233
NDISEASE | .0286474 .0037972 7.54 0.000 .0212051 .0360897
HLTHG | .1559173 .0521775 2.99 0.003 .0536513 .2581834
HLTHF | .4451223 .0955263 4.66 0.000 .2578942 .6323505
HLTHP | .9986065 .1878791 5.32 0.000 .6303701 1.366843
LINC | .1214009 .0230845 5.26 0.000 .0761562 .1666457
LFAM | -.1583018 .0497464 -3.18 0.001 -.255803 -.0608005
EDUCDEC | .0175951 .0090183 1.95 0.051 -.0000805 .0352707
AGE | .0057376 .0024426 2.35 0.019 .0009501 .0105251
FEMALE | .5503441 .0633313 8.69 0.000 .4262171 .6744711
CHILD | -.1976875 .097398 -2.03 0.042 -.3885841 -.006791
FEMCHILD | -.5653227 .0975292 -5.80 0.000 -.7564765 -.374169
BLACK | -.5358684 .0749191 -7.15 0.000 -.6827072 -.3890296
_cons | 2.107745 .2442285 8.63 0.000 1.629066 2.586424
-------------+----------------------------------------------------------------
DMED |
LC | -.1068027 .0264766 -4.03 0.000 -.1586959 -.0549096
IDP | -.108769 .0509938 -2.13 0.033 -.2087149 -.0088231
LPI | .0294804 .0086214 3.42 0.001 .0125827 .0463781
FMDE | .0007403 .0158738 0.05 0.963 -.0303719 .0318524
PHYSLIM | .2848256 .0722656 3.94 0.000 .1431877 .4264635
NDISEASE | .0210805 .0034967 6.03 0.000 .0142271 .027934
HLTHG | .0576901 .042799 1.35 0.178 -.0261945 .1415747
HLTHF | .2237238 .0814547 2.75 0.006 .0640755 .3833721
HLTHP | .7984291 .2048087 3.90 0.000 .3970114 1.199847
LINC | .0553122 .0166179 3.33 0.001 .0227416 .0878827
LFAM | -.031201 .0402985 -0.77 0.439 -.1101846 .0477827
EDUCDEC | .031499 .0074987 4.20 0.000 .0168018 .0461961
AGE | -.0006072 .0021064 -0.29 0.773 -.0047357 .0035212
FEMALE | .4093059 .0532548 7.69 0.000 .3049283 .5136834
CHILD | .0530643 .0786326 0.67 0.500 -.1010527 .2071813
FEMCHILD | -.3953421 .0783811 -5.04 0.000 -.5489662 -.241718
BLACK | -.5831049 .0520534 -11.20 0.000 -.6851277 -.4810822
_cons | -.2141574 .1842169 -1.16 0.245 -.5752159 .146901
-------------+----------------------------------------------------------------
/athrho | .9408188 .0736303 12.78 0.000 .796506 1.085132
/lnsigma | .4511091 .0177227 25.45 0.000 .4163732 .485845
-------------+----------------------------------------------------------------
/* Log-likelihood */
/* s where Var[v]=s^2 */
.
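Stata's `heckman` maximizes over the transformed parameters athrho = atanh(rho) and lnsigma = ln(sigma), so the /athrho and /lnsigma rows of the MLE output need to be converted back before they can be compared with the two-step estimates of rho and sigma. A minimal Python sketch of that conversion, using the two coefficients printed above (the variable names are illustrative):

```python
import math

athrho, lnsigma = 0.9408188, 0.4511091   # /athrho and /lnsigma from the MLE output

rho = math.tanh(athrho)      # inverse of athrho = atanh(rho)
sigma = math.exp(lnsigma)    # inverse of lnsigma = ln(sigma)
lam = rho * sigma            # implied coefficient on the inverse Mills ratio

print(rho, sigma, lam)
```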
. * Save the Stata predictions:
. * Distinguish between ystar=E[y*], ypos=E[y|I>0] and yall=E[y]
. predict ystarhml, xb             /* E[y*] = x'b */
. predict yposhml, ycond
/* lamda(z'a) = phi(z'a)/PHI(z'a) */
/* PHI(z'a) */
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       LNMED |      4281    4.069462    1.499372  -.5343859   10.57597
     yposhml |      4281    4.071295    .5573439    2.50515    6.92955
         MED |      4281     220.987    909.9021   .5860291   39182.02
  pMEDposhml |      4281    240.4096    185.0424   42.00053    3505.48
. corr LNMED yposhml MED pMEDpos2part if MED > 0
(obs=4281)
             |    LNMED  yposhml      MED pMEDpo~t
-------------+------------------------------------
       LNMED |   1.0000
     yposhml |   0.3690   1.0000
         MED |   0.4560   0.1592   1.0000
pMEDpos2part |   0.3387   0.9343   0.1669   1.0000
.
. * Compare predictions to actual including zeroes
. sum MED pMEDallhml DMED probselhml
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         MED |      5574    169.7247    802.8303          0   39182.02
  pMEDallhml |      5574    184.5571    174.1649   8.814864   3503.564
        DMED |      5574    .7680301    .4221277          0          1
  probselhml |      5574    .7674107    .1404707   .1737047   .9994534
. corr MED pMEDallhml DMED probselhml
(obs=5574)
             |      MED pMEDal~l     DMED probse~l
-------------+------------------------------------
         MED |   1.0000
  pMEDallhml |   0.1734   1.0000
        DMED |   0.1162   0.2015   1.0000
  probselhml |   0.1074   0.6092   0.3468   1.0000
.
. * (2B) Heckman 2 step for sample selection model
. * Same as MLE except add option twostep in heckman command
. heckman LNMED $XLIST, select (DMED = $XLIST) twostep
Heckman selection model -- two-step estimates   Number of obs   = 5574
(regression model with sample selection)        Censored obs    = 1293
                                                Uncensored obs  = 4281
                                                Wald chi2(34)   = 944.44
                                                Prob > chi2     = 0.0000
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
LNMED |
LC | -.0279209 .039754 -0.70 0.482 -.1058373 .0499955
IDP | -.0922898 .0680191 -1.36 0.175 -.2256048 .0410252
LPI | .0052225 .0111057 0.47 0.638 -.0165442 .0269893
FMDE | -.0295212 .0182427 -1.62 0.106 -.0652762 .0062339
PHYSLIM | .2814948 .0804535 3.50 0.000 .1238088 .4391808
NDISEASE | .021617 .0050395 4.29 0.000 .0117398 .0314943
HLTHG | .1474026 .0490497 3.01 0.003 .051267 .2435381
HLTHF | .3821683 .0961284 3.98 0.000 .19376 .5705765
HLTHP | .833294 .1974488 4.22 0.000 .4463015 1.220287
LINC | .0990973 .0251548 3.94 0.000 .0497948 .1483998
LFAM | -.1441358 .0468074 -3.08 0.002 -.2358766 -.052395
EDUCDEC | .0033639 .0109501 0.31 0.759 -.0180979 .0248257
AGE | .0055556 .0022549 2.46 0.014 .0011361 .0099751
FEMALE | .3846323 .1032799 3.72 0.000 .1822074 .5870573
CHILD | -.2565136 .0936771 -2.74 0.006 -.4401173 -.0729098
FEMCHILD | -.392146 .125089 -3.13 0.002 -.637316 -.146976
BLACK | -.2633649 .1577542 -1.67 0.095 -.5725574 .0458276
_cons | 2.882514 .4698969 6.13 0.000 1.961533 3.803495
-------------+----------------------------------------------------------------
DMED |
LC | -.118708 .0269005 -4.41 0.000 -.1714319 -.065984
IDP | -.1279483 .0522351 -2.45 0.014 -.2303272 -.0255693
LPI | .0283091 .0088793 3.19 0.001 .010906 .0457121
FMDE | .0075319 .0161584 0.47 0.641 -.024138 .0392018
PHYSLIM | .2732013 .0743761 3.67 0.000 .1274268 .4189758
NDISEASE | .0224861 .0035958 6.25 0.000 .0154384 .0295338
HLTHG | .0387516 .0438545 0.88 0.377 -.0472016 .1247049
HLTHF | .1920062 .0836688 2.29 0.022 .0280185 .355994
HLTHP | .6397294 .2126322 3.01 0.003 .222978 1.056481
LINC | .0518413 .0168128 3.08 0.002 .0188889 .0847938
LFAM | -.0335599 .041728 -0.80 0.421 -.1153452 .0482253
EDUCDEC | .036307 .0076536 4.74 0.000 .0213062 .0513078
AGE | .0002631 .0021606 0.12 0.903 -.0039715 .0044978
FEMALE | .4451035 .054292 8.20 0.000 .3386932 .5515138
CHILD | .111489 .0808338 1.38 0.168 -.0469424 .2699203
FEMCHILD | -.4512845 .0799219 -5.65 0.000 -.6079284 -.2946405
BLACK | -.6057367 .0523148 -11.58 0.000 -.7082718 -.5032017
_cons | -.271605 .1877345 -1.45 0.148 -.6395579 .0963478
-------------+----------------------------------------------------------------
mills |
lambda | .2358048 .5018117 0.47 0.638 -.7477282 1.219338
-------------+----------------------------------------------------------------
rho | 0.16833
sigma | 1.4008246
lambda | .23580476 .5018117
------------------------------------------------------------------------------
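The lambda row of the two-step output can be read in two ways: as a Wald test of no selection effect, and as the product rho*sigma of the ancillary estimates printed beneath it. The following Python sketch recomputes the z statistic, p-value and 95% interval from the reported coefficient and standard error; it assumes the usual normal-based Wald interval, and the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()

coef, se = 0.2358048, 0.5018117      # lambda row of the two-step output
rho, sigma = 0.16833, 1.4008246      # ancillary rows of the same output

z = coef / se
p_value = 2 * (1 - std_normal.cdf(abs(z)))
crit = std_normal.inv_cdf(0.975)     # two-sided 5% critical value
ci = (coef - crit * se, coef + crit * se)

print(round(z, 2), round(p_value, 3), ci)
print(rho * sigma)                   # compare with coef
```

Because the interval comfortably includes zero, the two-step estimates give little evidence against rho = 0, in contrast to the large z statistic on /athrho in the MLE output.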
/* s where Var[v]=s^2 */
.
. * Save the Stata predictions:
. * Distinguish between ystar=E[y*], ypos=E[y|I>0] and yall=E[y]
. predict ystarh2s, xb             /* E[y*] = x'b */
. predict yposh2s, ycond
/* lamda(z'a) = phi(z'a)/PHI(z'a) */
/* PHI(z'a) */
             |    LNMED  yposh2s      MED pMEDpo~t
-------------+------------------------------------
       LNMED |   1.0000
     yposh2s |   0.3697   1.0000
         MED |   0.4560   0.1584   1.0000
pMEDpos2part |   0.3387   0.9240   0.1669   1.0000
.
. * Compare predictions to actual including zeroes
. sum MED pMEDallh2s DMED probselh2s
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         MED |      5574    169.7247    802.8303          0   39182.02
  pMEDallh2s |      5574    142.1438    123.2964   5.272963   1910.182
        DMED |      5574    .7680301    .4221277          0          1
  probselh2s |      5574    .7678377    .1457464   .1526731    .999246
. corr MED pMEDallh2s DMED probselh2s
(obs=5574)
             |      MED pMEDa~2s     DMED probs~2s
-------------+------------------------------------
         MED |   1.0000
  pMEDallh2s |   0.1772   1.0000
        DMED |   0.1162   0.2132   1.0000
  probselh2s |   0.1031   0.6298   0.3467   1.0000
.
. * (2C) Check for possible collinearity problems in Heckman 2-Step
.
. * Check variation in inverse mills ratio and related measures
. gen zprimea = invnorm(probselh2s)
. gen zprimeasq = zprimea*zprimea
. sum invmillh2s probselh2s zprimea ystarh2s
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
  invmillh2s |      5574    .3955256    .2253329    .002599   1.545223
  probselh2s |      5574    .7678377    .1457464   .1526731    .999246
     zprimea |      5574    .8217315    .5175712  -1.025036    3.17314
    ystarh2s |      5574    3.904371     .589474   2.005307   6.573941
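The extremes in this table are linked: the smallest selection probability gives the smallest index z'a and the largest inverse Mills ratio, and vice versa. A short Python sketch of the two transformations follows, assuming `NormalDist.inv_cdf` as the counterpart of Stata's `invnorm()`; the function names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()

def index_from_probability(p):
    """z'a = PHI^{-1}(p), the counterpart of Stata's invnorm(p)."""
    return std_normal.inv_cdf(p)

def inverse_mills(za):
    """lambda(z'a) = phi(z'a) / PHI(z'a)."""
    return std_normal.pdf(za) / std_normal.cdf(za)

# Smallest and largest fitted selection probabilities from the table above
for p in (0.1526731, 0.999246):
    za = index_from_probability(p)
    print(p, za, inverse_mills(za))
```

The printed values can be compared with the Min and Max columns of `zprimea` and `invmillh2s`; agreement is only up to the rounding of the displayed probabilities.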
. sum invmillh2s probselh2s zprimea ystarh2s, detail
                        Mills' ratio
-------------------------------------------------------------
      Percentiles      Smallest
 1%     .0443035        .002599
 5%     .1081773       .0065964
10%     .1479522       .0074306       Obs                5574
25%     .2404661       .0111331       Sum of Wgt.        5574

50%     .3522253                      Mean           .3955256
                        Largest       Std. Dev.      .2253329
75%     .5044507        1.42819
90%     .7088638        1.42819       Variance       .0507749
95%      .863094       1.466996       Skewness       1.105156
99%     1.080771       1.545223       Kurtosis       4.403004

                          Pr(DMED)
-------------------------------------------------------------
      Percentiles      Smallest
 1%      .338421       .1526731
 5%     .4598847       .1769602
10%     .5570307       .1900167       Obs                5574
25%     .6946899       .1900167       Sum of Wgt.        5574

50%     .7984734                      Mean           .7678377
                        Largest       Std. Dev.      .1457464
75%     .8717066       .9962835
90%      .927941       .9976236       Variance        .021242
95%     .9502093       .9979156       Skewness      -1.048826
99%     .9823552        .999246       Kurtosis       3.903288

                           zprimea
-------------------------------------------------------------
      Percentiles      Smallest
 1%    -.4167765      -1.025036
 5%    -.1007243      -.9270119
10%     .1434453      -.8778346       Obs                5574
25%     .5091883      -.8778346       Sum of Wgt.        5574

50%     .8361809                      Mean           .8217315
                        Largest       Std. Dev.      .5175712
75%     1.134495       2.676793
90%     1.460626        2.82333       Variance       .2678799
95%     1.646887       2.865093       Skewness      -.0298741
99%     2.105021        3.17314       Kurtosis       3.462529

                      Linear prediction
-------------------------------------------------------------
      Percentiles      Smallest
 1%     2.770451       2.005307
 5%     3.096997       2.005307
10%     3.248734       2.066777       Obs                5574
25%     3.460358       2.093177       Sum of Wgt.        5574

50%     3.818303                      Mean           3.904371
                        Largest       Std. Dev.       .589474
75%     4.304362       6.054721
90%      4.68132       6.055911       Variance       .3474796
95%     4.946257       6.273092       Skewness       .5047628
99%     5.495563       6.573941       Kurtosis       3.235111
.
. * Check for Mills ratio linear in zprimea
. regress invmillh2s zprimea
      Source |       SS       df       MS              Number of obs =    5574
-------------+------------------------------           F(  1,  5572) =84783.34
       Model |  265.518552     1  265.518552           Prob > F      =  0.0000
    Residual |  17.4500012  5572   .00313173           R-squared     =  0.9383
-------------+------------------------------           Adj R-squared =  0.9383
       Total |  282.968553  5573  .050774906           Root MSE      =  .05596
------------------------------------------------------------------------------
  invmillh2s |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     zprimea |  -.4217284   .0014484  -291.18   0.000    -.4245677    -.418889
       _cons |   .7420731   .0014065   527.59   0.000     .7393158    .7448305
------------------------------------------------------------------------------
. regress invmillh2s zprimea zprimeasq
      Source |       SS       df       MS              Number of obs =    5574
-------------+------------------------------           F(  2,  5571) =       .
       Model |  282.919807     2  141.459904           Prob > F      =  0.0000
    Residual |   .04874607  5571  8.7500e-06           R-squared     =  0.9998
-------------+------------------------------           Adj R-squared =  0.9998
       Total |  282.968553  5573  .050774906           Root MSE      =  .00296
------------------------------------------------------------------------------
  invmillh2s |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     zprimea |  -.6381933   .0001715 -3720.60   0.000    -.6385296   -.6378571
   zprimeasq |   .1329635   .0000943  1410.22   0.000     .1327787    .1331484
       _cons |   .7945547   .0000831  9556.73   0.000     .7943917    .7947177
------------------------------------------------------------------------------
. * twoway scatter yinvmill probitxb
.
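The two regressions above show that, over the range of the fitted index, the inverse Mills ratio is close to linear and almost exactly quadratic in z'a, which is why the two-step estimator can suffer from collinearity when the selection and outcome equations share the same regressors. The Python sketch below illustrates the same point on an evenly spaced grid; the grid is an assumption made for illustration (it is not the sample distribution of `zprimea`), so the R-squared values it prints will differ from those in the log. `ols_r_squared` is a hypothetical helper written for this example.

```python
from statistics import NormalDist

std_normal = NormalDist()

def inverse_mills(z):
    """lambda(z) = phi(z) / PHI(z)."""
    return std_normal.pdf(z) / std_normal.cdf(z)

def ols_r_squared(regressors, y):
    """R-squared from OLS of y on an intercept plus the given regressor columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in regressors] for i in range(n)]
    k = len(X[0])
    # Normal equations (X'X) b = X'y, solved by Gaussian elimination
    A = [[sum(row[r] * row[s] for row in X) for s in range(k)] for r in range(k)]
    c = [sum(X[i][r] * y[i] for i in range(n)) for r in range(k)]
    for p in range(k):
        pivot = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[pivot] = A[pivot], A[p]
        c[p], c[pivot] = c[pivot], c[p]
        for r in range(p + 1, k):
            f = A[r][p] / A[p][p]
            for s in range(p, k):
                A[r][s] -= f * A[p][s]
            c[r] -= f * c[p]
    b = [0.0] * k
    for r in range(k - 1, -1, -1):
        b[r] = (c[r] - sum(A[r][s] * b[s] for s in range(r + 1, k))) / A[r][r]
    fitted = [sum(xj * bj for xj, bj in zip(row, b)) for row in X]
    ybar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Evenly spaced index values roughly spanning the observed range of zprimea
z = [-1.0 + 0.01 * i for i in range(401)]
lam = [inverse_mills(v) for v in z]
zsq = [v * v for v in z]

r2_linear = ols_r_squared([z], lam)
r2_quadratic = ols_r_squared([z, zsq], lam)
print(r2_linear, r2_quadratic)
```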
. * Check R-squared from regress yinvmill on other regressors
. regress invmillh2s $XLIST
      Source |       SS       df       MS              Number of obs =    5574
-------------+------------------------------           F( 17,  5556) = 7477.36
       Model |  271.118403    17  15.9481414           Prob > F      =  0.0000
    Residual |    11.85015  5556  .002132856           R-squared     =  0.9581
/* Pr[y>0] = PHI(x'b) */
. predict xbmanual, xb             /* x'b */
             |     -5.65       -3.92
       BLACK |    -0.606      -0.196
             |    -11.58       -2.90
       _cons |    -0.272       3.077
             |     -1.45       13.90
-------------+-------------------------
           N |  5574.000    4281.000
          ll | -2690.577   -7493.499
        rank |    18.000      18.000
         aic |  5417.154   15022.998
         bic |  5536.419   15137.513
---------------------------------------
                           legend: b/t
. di "lltwopart = " lltwopart
lltwopart = -10184.076
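The `aic` and `bic` rows of the table fragment above follow from the reported log-likelihood, rank and N through AIC = -2*ll + 2*k and BIC = -2*ll + k*ln(N). The Python sketch below recomputes them; the tuple names `first_column` and `second_column` are illustrative labels for the two columns of that fragment, whose headers are not shown in this part of the log.

```python
import math

def aic(ll, k):
    """Akaike information criterion with k estimated parameters."""
    return -2 * ll + 2 * k

def bic(ll, k, n):
    """Schwarz (Bayesian) information criterion with n observations."""
    return -2 * ll + k * math.log(n)

# (ll, rank, N) as reported in the table fragment above
first_column = (-2690.577, 18, 5574)
second_column = (-7493.499, 18, 4281)

for ll, k, n in (first_column, second_column):
    print(round(aic(ll, k), 3), round(bic(ll, k, n), 3))

# The displayed lltwopart matches the sum of the two log-likelihoods
print(first_column[0] + second_column[0])
```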
.
. * Last four columns of Table 16.1 (page 554)
. * Sample selection estimates: 2step and MLE estimates
. set matsize 60
. estimates table heck2step heckmle, t stats(N ll rank aic bic) b(%10.3f)
---------------------------------------
    Variable | heck2step    heckmle
-------------+-------------------------
LNMED        |
          LC |    -0.028      -0.076
             |     -0.70       -2.25
         IDP |    -0.092      -0.150
             |     -1.36       -2.26
         LPI |     0.005       0.015
             |      0.47        1.42
        FMDE |    -0.030      -0.024
             |     -1.62       -1.21
     PHYSLIM |     0.281       0.355
             |      3.50        4.70
    NDISEASE |     0.022       0.029
             |      4.29        7.54
       HLTHG |     0.147       0.156
             |      3.01        2.99
       HLTHF |     0.382       0.445
             |      3.98        4.66
       HLTHP |     0.833       0.999
             |      4.22        5.32
        LINC |     0.099       0.121
             |      3.94        5.26
        LFAM |    -0.144      -0.158
             |     -3.08       -3.18
     EDUCDEC |     0.003       0.018
             |      0.31        1.95
         AGE |     0.006       0.006
             |      2.46        2.35
      FEMALE |     0.385       0.550
             |      3.72        8.69
       CHILD |    -0.257      -0.198
             |     -2.74       -2.03
    FEMCHILD |    -0.392      -0.565
             |     -3.13       -5.80
       BLACK |    -0.263      -0.536
             |     -1.67       -7.15
       _cons |     2.883       2.108
             |      6.13        8.63
-------------+-------------------------
DMED         |
          LC |    -0.119      -0.107
             |     -4.41       -4.03
         IDP |    -0.128      -0.109
             |     -2.45       -2.13
         LPI |     0.028       0.029
             |      3.19        3.42
        FMDE |     0.008       0.001
             |      0.47        0.05
     PHYSLIM |     0.273       0.285
             |      3.67        3.94
    NDISEASE |     0.022       0.021
             |      6.25        6.03
       HLTHG |     0.039       0.058
             |      0.88        1.35
       HLTHF |     0.192       0.224
             |      2.29        2.75
       HLTHP |     0.640       0.798
             |      3.01        3.90
        LINC |     0.052       0.055
             |      3.08        3.33
        LFAM |    -0.034      -0.031
             |     -0.80       -0.77
     EDUCDEC |     0.036       0.031
             |      4.74        4.20
         AGE |     0.000      -0.001
             |      0.12       -0.29
      FEMALE |     0.445       0.409
             |      8.20        7.69
       CHILD |     0.111       0.053
             |      1.38        0.67
    FEMCHILD |    -0.451      -0.395
             |     -5.65       -5.04
       BLACK |    -0.606      -0.583
             |    -11.58      -11.20
       _cons |    -0.272      -0.214
             |     -1.45       -1.16
-------------+-------------------------
mills        |
      lambda |     0.236
             |      0.47
-------------+-------------------------
athrho       |
       _cons |                 0.941
             |                 12.78
-------------+-------------------------
lnsigma      |
       _cons |                 0.451
             |                 25.45
-------------+-------------------------
Statistics   |
           N |  5574.000    5574.000
          ll |            -10170.110
        rank |    37.000      38.000
         aic |         .   20416.221
         bic |         .   20668.004
---------------------------------------
                           legend: b/t
.
. ************ (4) A LITTLE FURTHER ANALYSIS **********
.
. * Predictions
. * Compare predictions to actual for MED > 0
. sum MED pMEDpos2part pMEDposhml pMEDposh2s if MED > 0
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         MED |      4281     220.987    909.9021   .5860291   39182.02
pMEDpos2part |      4281     183.462    126.0213   26.37827   1731.088
  pMEDposhml |      4281    240.4096    185.0424   42.00053    3505.48
  pMEDposh2s |      4281    184.9993    129.5432   27.63657   1911.624
. corr MED pMEDpos2part pMEDposhml pMEDposh2s if MED > 0
(obs=4281)
             |      MED pMEDpo~t pMEDpo~l pMEDp~2s
-------------+------------------------------------
         MED |   1.0000
pMEDpos2part |   0.1669   1.0000
  pMEDposhml |   0.1617   0.9830   1.0000
  pMEDposh2s |   0.1669   0.9994   0.9887   1.0000
.
. * Compare predictions to actual including zeroes
. sum MED pMEDall2part pMEDallhml pMEDallh2s DMED probsel2part probselhml probselh2s
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         MED |      5574    169.7247    802.8303          0   39182.02
pMEDall2part |      5574     140.966    120.2022   4.880651   1729.783
  pMEDallhml |      5574    184.5571    174.1649   8.814864   3503.564
  pMEDallh2s |      5574    142.1438    123.2964   5.272963   1910.182
        DMED |      5574    .7680301    .4221277          0          1
-------------+--------------------------------------------------------
probsel2part |      5574    .7678377    .1457464   .1526731    .999246
  probselhml |      5574    .7674107    .1404707   .1737047   .9994534
  probselh2s |      5574    .7678377    .1457464   .1526731    .999246
. corr MED pMEDall2part pMEDallhml pMEDallh2s DMED probsel2part probselhml probselh2s
(obs=5574)
             |      MED pMEDal~t pMEDal~l pMEDa~2s     DMED probse~t probse~l probs~2s
-------------+------------------------------------------------------------------------
         MED |   1.0000
pMEDall2part |   0.1772   1.0000
  pMEDallhml |   0.1734   0.9861   1.0000
  pMEDallh2s |   0.1772   0.9995   0.9909   1.0000
        DMED |   0.1162   0.2158   0.2015   0.2132   1.0000
probsel2part |   0.1031   0.6380   0.5939   0.6298   0.3467   1.0000
  probselhml |   0.1074   0.6552   0.6092   0.6468   0.3468   0.9980   1.0000
  probselh2s |   0.1031   0.6380   0.5939   0.6298   0.3467   1.0000   0.9980   1.0000
.
. ********** CLOSE OUTPUT
. log close
log: c:\Imbook\bwebpage\Section4\mma16p3selection.txt
log type: text
closed on: 19 May 2005, 13:04:40
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section4\mma17p1km.txt
log type: text
opened on: 19 May 2005, 13:19:55
.
. ********** OVERVIEW OF MMA17P1KM.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 17.2 (pages 574-5) and 17.5.1 (pages 581-3)
. * Nonparametric Duration Analysis
. * It provides
. * (1) Kaplan-Meier Survival Estimate Graph (Figure 17.1: kennanstrk.wmf)
. * (2) Nelson-Aalen Cumulative Hazard Estimate Graph
. * (3) Kaplan-Meier Survivor Function Estimates (Table 17.3)
. * (4) Shows that Cox regression on intercept gives same results
.
. * To run this program you need data file
. * strkdur.dta
.
. ********** SETUP **********
.
. set more off
. version 8
. set scheme s1mono /* Used for graphs */
.
. ********** DATA DESCRIPTION
.
. * The data is the same data as given in Table 1 of
. * J. Kennan, "The Duration of Contract Strikes in U.S. Manufacturing",
. * Journal of Econometrics, 1985, Vol. 28, pp.5-28.
.
. * There are 566 observations from 1968-1976 with two variables
. * 1. dur is duration of the strike in days
. * 2. gdp is a measure of stage of business cycle
. *    (deviation of monthly log industrial production in manufacturing
. *    from prediction from OLS on time, time-squared and monthly dummies)
.
. * All observations are complete for these data. There is no censoring !!
. * For an example with censoring see mma17p2kmextra.do or mma17p4duration.do
.
. ********** READ DATA **********
.
. use strkdur.dta
. sum
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         dur |       566    43.62367    44.66641          1        235
         gdp |       566    .0060411    .0499072    -.13996     .08554
.
. * Create ASCII data set so that can use programs other than Stata
. outfile dur gdp using strkdur.asc, replace
.
. ********* ANALYSIS: NONPARAMETRIC SURVIVAL CURVE AND HAZARD FUNCTION **********
.
. * Stata st curves require defining the dependent variable
. stset dur
     failure event:  (assumed to fail at time=dur)
obs. time interval:  (0, dur]
 exit on or before:  failure
------------------------------------------------------------------------------
      566  total obs.
        0  exclusions
------------------------------------------------------------------------------
      566  obs. remaining, representing
      566  failures in single record/single failure data
    24691  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =       235
.
. * The data here are complete. If dur is instead right-censored,
. * then also need to define a censoring indicator. For example
. * stset dur, fail(censor=1)
. * where the variable censor=1 if data are right-censored and =0 otherwise
. * See mma17p4duration.do
.
. * (1) GRAPH KAPLAN-MEIER SURVIVAL CURVE
.
. * Minimal command that gives 95% confidence bands
. sts graph, gwood
failure _d: 1 (meaning all fail)
analysis time _t: dur
.
. * Longer command for Figure 17.1 (page 575)
. * Nicer graphs and also confidence bands are bolder and easier to read
. sts gen surv = s
. sts gen lbsurv = lb(s)
. sts gen ubsurv = ub(s)
. sort dur
. graph twoway (line ubsurv dur, msize(vtiny) mstyle(p2) c(J) clstyle(p1) clcolor(gs10)) /*
> */ (line surv dur, msize(vtiny) mstyle(p1) c(J) clstyle(p1)) /*
> */ (line lbsurv dur, msize(vtiny) mstyle(p2) c(J) clstyle(p1) clcolor(gs10)), /*
> */ scale(1.2) plotregion(style(none)) /*
> */ title("Kaplan-Meier Survival Function Estimate") /*
> */ xtitle("Strike duration in days", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Survival Probability", size(medlarge)) yscale(titlegap(*5)) /*
> */ ylabel(0.00(0.25)1.00,grid)/*
> */ legend(pos(3) ring(0) col(1)) legend(size(small)) /*
> */ legend( label(1 "Upper 95% confidence band") label(2 "Survival Function") /*
> */ label(3 "Lower 95% confidence band") )
. graph export kennanstrk.wmf, replace
(file c:\Imbook\bwebpage\Section4\kennanstrk.wmf written in Windows Metafile format)
.
. * (2) GRAPH NELSON-AALEN CUMULATIVE HAZARD FUNCTION
.
. * Minimal command that gives 95% confidence bands
. sts graph, cna
failure _d: 1 (meaning all fail)
analysis time _t: dur
.
. * Longer command gives nicer figure
. sts graph, cna /*
> */ scale (1.2) plotregion(style(none)) /*
> */ title("Nelson-Aalen Cumulative Hazard") /*
> */ xtitle("Strike duration in days", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Cumulative Hazard", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend(pos(12) ring(0) col(1)) legend(size(small)) /*
> */ legend(label(1 "95% confidence bands") label(2 "Cumulative Hazard"))
failure _d: 1 (meaning all fail)
analysis time _t: dur
.
. * (3) LIST SURVIVOR and NELSON-AALEN CUMULATIVE HAZARD ESTIMATES
.
. * Gives a lot of output
.
           Beg.           Net           Survivor      Std.
  Time    Total   Fail   Lost           Function     Error     [95% Conf. Int.]
-------------------------------------------------------------------------------
    43      207      4      0             0.3587    0.0202     0.3193    0.3981
    44      203      9      0             0.3428    0.0200     0.3039    0.3819
    45      194      3      0             0.3375    0.0199     0.2988    0.3765
    46      191      4      0             0.3304    0.0198     0.2919    0.3693
    47      187      5      0             0.3216    0.0196     0.2834    0.3602
    48      182      3      0             0.3163    0.0195     0.2783    0.3548
    49      179      5      0             0.3074    0.0194     0.2698    0.3457
    50      174      8      0             0.2933    0.0191     0.2563    0.3312
    51      166      1      0             0.2915    0.0191     0.2546    0.3293
    52      165      8      0             0.2774    0.0188     0.2411    0.3147
    53      157      6      0             0.2668    0.0186     0.2310    0.3037
    54      151      1      0             0.2650    0.0186     0.2294    0.3019
    55      150      2      0             0.2615    0.0185     0.2260    0.2982
    56      148      3      0             0.2562    0.0183     0.2210    0.2927
    57      145      3      0             0.2509    0.0182     0.2159    0.2872
    58      142      1      0             0.2491    0.0182     0.2143    0.2854
    59      141      4      0             0.2420    0.0180     0.2076    0.2780
    60      137      6      0             0.2314    0.0177     0.1976    0.2669
    61      131      5      0             0.2226    0.0175     0.1893    0.2577
    62      126      2      0             0.2191    0.0174     0.1860    0.2540
    63      124      2      0             0.2155    0.0173     0.1827    0.2503
    64      122      5      0             0.2067    0.0170     0.1744    0.2410
    65      117      3      0             0.2014    0.0169     0.1695    0.2354
    67      114      1      0             0.1996    0.0168     0.1678    0.2335
    68      113      1      0             0.1979    0.0167     0.1662    0.2317
    70      112      4      0             0.1908    0.0165     0.1596    0.2242
    71      108      1      0             0.1890    0.0165     0.1580    0.2223
    72      107      1      0             0.1873    0.0164     0.1563    0.2205
    74      106      1      0             0.1855    0.0163     0.1547    0.2186
    75      105      1      0             0.1837    0.0163     0.1530    0.2167
    77      104      3      0             0.1784    0.0161     0.1481    0.2111
    82      101      2      0             0.1749    0.0160     0.1449    0.2073
    83       99      1      0             0.1731    0.0159     0.1432    0.2055
    84       98      3      0             0.1678    0.0157     0.1384    0.1998
    85       95      2      0             0.1643    0.0156     0.1351    0.1960
    86       93      1      0             0.1625    0.0155     0.1335    0.1942
    87       92      1      0             0.1608    0.0154     0.1319    0.1923
    88       91      1      0             0.1590    0.0154     0.1302    0.1904
    90       90      1      0             0.1572    0.0153     0.1286    0.1885
    91       89      2      0             0.1537    0.0152     0.1254    0.1847
    92       87      1      0             0.1519    0.0151     0.1238    0.1828
    94       86      1      0             0.1502    0.0150     0.1222    0.1809
    98       85      3      0             0.1449    0.0148     0.1173    0.1752
    99       82      3      0             0.1396    0.0146     0.1125    0.1695
   100       79      2      0             0.1360    0.0144     0.1093    0.1657
   101       77      3      0             0.1307    0.0142     0.1045    0.1600
   102       74      2      0             0.1272    0.0140     0.1013    0.1561
   103       72      1      0             0.1254    0.0139     0.0997    0.1542
   104       71      3      0             0.1201    0.0137     0.0950    0.1485
   105       68      1      0             0.1184    0.0136     0.0934    0.1465
   106       67      2      0             0.1148    0.0134     0.0902    0.1427
   107       65      2      0             0.1113    0.0132     0.0871    0.1388
   108       63      2      0             0.1078    0.0130     0.0839    0.1349
   109       61      2      0             0.1042    0.0128     0.0808    0.1311
   111       59      1      0             0.1025    0.0127     0.0792    0.1291
   112       58      1      0             0.1007    0.0126     0.0777    0.1272
   114       57      1      0             0.0989    0.0126     0.0761    0.1252
   115       56      1      0             0.0972    0.0124     0.0745    0.1233
   116       55      1      0             0.0954    0.0123     0.0730    0.1213
   117       54      2      0             0.0919    0.0121     0.0699    0.1174
   118       52      1      0             0.0901    0.0120     0.0683    0.1155
   119       51      1      0             0.0883    0.0119     0.0668    0.1135
   122       50      3      0             0.0830    0.0116     0.0622    0.1076
   123       47      1      0             0.0813    0.0115     0.0606    0.1056
   124       46      1      0             0.0795    0.0114     0.0591    0.1037
   125       45      2      0             0.0760    0.0111     0.0561    0.0997
   126       43      1      0             0.0742    0.0110     0.0545    0.0977
   127       42      2      0             0.0707    0.0108     0.0515    0.0937
   130       40      2      0             0.0671    0.0105     0.0485    0.0897
   131       38      1      0             0.0654    0.0104     0.0470    0.0877
   133       37      1      0             0.0636    0.0103     0.0455    0.0857
   135       36      1      0             0.0618    0.0101     0.0440    0.0837
   136       35      2      0             0.0583    0.0098     0.0410    0.0797
   139       33      2      0             0.0548    0.0096     0.0381    0.0756
   140       31      1      0             0.0530    0.0094     0.0366    0.0736
   141       30      3      0             0.0477    0.0090     0.0323    0.0675
   142       27      1      0             0.0459    0.0088     0.0308    0.0654
   143       26      1      0             0.0442    0.0086     0.0294    0.0633
   146       25      2      0             0.0406    0.0083     0.0265    0.0592
   147       23      1      0             0.0389    0.0081     0.0251    0.0571
   148       22      2      0             0.0353    0.0078     0.0223    0.0529
   151       20      1      0             0.0336    0.0076     0.0209    0.0508
   152       19      1      0             0.0318    0.0074     0.0196    0.0487
   153       18      2      0             0.0283    0.0070     0.0169    0.0444
   154       16      1      0             0.0265    0.0068     0.0155    0.0423
   160       15      1      0             0.0247    0.0065     0.0142    0.0401
   163       14      2      0             0.0212    0.0061     0.0116    0.0357
   165       12      1      0             0.0194    0.0058     0.0103    0.0335
   168       11      1      0             0.0177    0.0055     0.0091    0.0312
   174       10      1      0             0.0159    0.0053     0.0079    0.0290
   175        9      1      0             0.0141    0.0050     0.0067    0.0267
   179        8      1      0             0.0124    0.0046     0.0055    0.0244
   191        7      1      0             0.0106    0.0043     0.0044    0.0220
   192        6      1      0             0.0088    0.0039     0.0034    0.0196
   205        5      1      0             0.0071    0.0035     0.0024    0.0171
   208        4      1      0             0.0053    0.0031     0.0015    0.0146
   216        3      1      0             0.0035    0.0025     0.0007    0.0121
   226        2      1      0             0.0018    0.0018     0.0002    0.0095
   235        1      1      0             0.0000         .          .         .
-------------------------------------------------------------------------------
.
No. of subjects =          566                     Number of obs   =       566
No. of failures =          566
Time at risk    =        24691
                                                   LR chi2(0)      =      0.00
Log likelihood  =    -3032.134                     Prob > chi2     =         .
. * (5) ESTIMATE HAZARD FUNCTION
.
. * sts graph does not give the true hazard function - it instead gives the
. * difference in the cumulative hazard (without division by time difference).
.
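The remark above can be illustrated outside Stata with a minimal Python sketch, which is not part of mma17p1km.do. It assumes that the increment in the Nelson-Aalen cumulative hazard at a failure time is (number failing)/(number at risk), and that a crude hazard estimate divides that increment by the time elapsed since the previous failure time. The function name and the three input rows (durations 109, 111 and 112 from the listing earlier in this log) are chosen for illustration only.

```python
def na_increments_and_hazard(times, at_risk, failures):
    """Return (time, increment, hazard) for each failure time after the first.

    increment = d_j / r_j is the jump in the Nelson-Aalen cumulative hazard.
    hazard    = increment / (t_j - t_{j-1}) rescales the jump to a per-day rate.
    The first time point is used only as the starting time.
    """
    rows = []
    for j in range(1, len(times)):
        increment = failures[j] / at_risk[j]
        width = times[j] - times[j - 1]
        rows.append((times[j], increment, increment / width))
    return rows


# Illustrative rows from the strike-duration listing:
# duration in days, number at risk, number of strikes ending
times = [109, 111, 112]
at_risk = [61, 59, 58]
failures = [2, 1, 1]

for t, inc, haz in na_increments_and_hazard(times, at_risk, failures):
    print(f"t={t}: increment={inc:.4f}, hazard per day={haz:.4f}")
```

The jump at day 111 is spread over a two-day gap, so the per-day hazard is half the increment, while at day 112 the two coincide. This is the distinction the comment draws between the plotted differences and a true hazard rate.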
. ********** CLOSE OUTPUT
. log close
log: c:\Imbook\bwebpage\Section4\mma17p1km.txt
log type: text
closed on: 19 May 2005, 13:20:01
------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section4\mma17p2kmextra.txt
log type: text
opened on: 19 May 2005, 13:24:01
.
. ********** OVERVIEW OF MMA17P2KMEXTRA.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
      duration     failed
  1. 10 1
  2. 10 1
  3. 10 1
  4. 10 1
  5. 10 1
  6. 10 1
  7. 15 0
  8. 15 0
  9. 15 0
 10. 15 0
 11. 20 1
 12. 20 1
 13. 20 1
 14. 20 1
 15. 20 1
 16. 25 0
 17. 25 0
 18. 25 0
 19. 30 1
 20. 30 1
 21. 35 0
 22. 40 1
 23. 45 0
 24. 45 0
 25. 45 0
 26. 45 0
 27. 45 0
 28. 45 0
 29. 45 0
 30. 45 0
 31. 45 0
 32. 45 0
 33. 45 0
 34. 45 0
 35. 45 0
 36. 45 0
 37. 45 0
 38. 45 0
 39. 45 0
 40. 45 0
 41. 45 0
 42. 45 0
 43. 45 0
 44. 45 0
 45. 45 0
 46. 45 0
 47. 45 0
 48. 45 0
 49. 45 0
 50. 45 0
 51. 45 0
 52. 45 0
 53. 45 0
 54. 45 0
 55. 50 1
 56. 50 1
 57. 50 1
 58. 50 1
 59. 50 1
 60. 50 1
 61. 50 1
 62. 50 1
 63. 50 1
 64. 50 1
 65. 50 1
 66. 50 1
 67. 50 1
 68. 50 1
 69. 50 1
 70. 50 1
 71. 50 1
 72. 50 1
 73. 50 1
 74. 50 1
 75. 50 1
 76. 50 1
 77. 50 1
 78. 50 1
 79. 50 1
 80. 50 1
 81. end
.
. sum
    Variable |     Obs        Mean  Std. Dev.        Min        Max
-------------+-----------------------------------------------------
    duration |      80      39.625   13.40166         10         50
      failed |      80          .5   .5031546          0          1
.
. ***** COMPUTATION USING STATA **********
.
. * Stata st curves require defining the dependent variable
. stset duration, fail(failed=1)
     failure event:  failed == 1
obs. time interval:  (0, duration]
 exit on or before:  failure
------------------------------------------------------------------------------
       80  total obs.
        0  exclusions
------------------------------------------------------------------------------
       80  obs. remaining, representing
       40  failures in single record/single failure data
     3170  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =        50
. stsum
         failure _d:  failed == 1
   analysis time _t:  duration
         |               incidence       no. of    |------ Survival time -----|
         | time at risk     rate        subjects        25%       50%       75%
---------+---------------------------------------------------------------------
   total |         3170   .0126183            80         50        50        50
. stdes
         failure _d:  failed == 1
   analysis time _t:  duration
                                   |-------------- per subject --------------|
Category                   total        mean         min     median        max
------------------------------------------------------------------------------
no. of subjects               80
no. of records                80           1           1          1          1
(first) entry time                         0           0          0          0
(final) exit time                     39.625          10         45         50
subjects with gap              0
time on gap if gap             0
time at risk                3170      39.625          10         45         50
failures                      40          .5           0         .5          1
------------------------------------------------------------------------------
.
. * K-M survival graph
. * sts graph, gwood
.
. * N-A Cumulative Hazard
. * sts graph, cna
.
. * Kaplan-Meier Survivor Function listed (last column Table 17.2)
. sts list
         failure _d:  failed == 1
   analysis time _t:  duration
           Beg.           Net           Survivor      Std.
  Time    Total   Fail   Lost           Function     Error     [95% Conf. Int.]
-------------------------------------------------------------------------------
    10       80      6      0             0.9250    0.0294     0.8407    0.9656
    15       74      0      4             0.9250    0.0294     0.8407    0.9656
    20       70      5      0             0.8589    0.0395     0.7596    0.9193
    25       65      0      3             0.8589    0.0395     0.7596    0.9193
    30       62      2      0             0.8312    0.0428     0.7268    0.8984
    35       60      0      1             0.8312    0.0428     0.7268    0.8984
    40       59      1      0             0.8171    0.0443     0.7104    0.8875
    45       58      0     32             0.8171    0.0443     0.7104    0.8875
    50       26     26      0             0.0000         .          .         .
-------------------------------------------------------------------------------
.
.
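As a cross-check on the sts list output above, here is a minimal Python sketch of the Kaplan-Meier product-limit calculation for the same 80 observations. It is not part of the original Stata program. The function name and the grouped encoding of the data are illustrative assumptions, and failures at a given time are taken to occur before censorings at that time, the usual convention.

```python
from collections import Counter


def kaplan_meier(durations, failed):
    """Kaplan-Meier survivor function at each distinct observed time.

    Returns a list of (time, at_risk, n_fail, n_lost, survivor) tuples.
    The survivor estimate is the running product of (1 - d_j / r_j).
    """
    fails = Counter(t for t, f in zip(durations, failed) if f == 1)
    lost = Counter(t for t, f in zip(durations, failed) if f == 0)
    at_risk = len(durations)
    surv = 1.0
    table = []
    for t in sorted(set(durations)):
        d, c = fails[t], lost[t]
        surv *= 1.0 - d / at_risk
        table.append((t, at_risk, d, c, surv))
        at_risk -= d + c  # failures and censored spells both leave the risk set
    return table


# The 80 observations entered above, grouped as (duration, failed, count)
groups = [(10, 1, 6), (15, 0, 4), (20, 1, 5), (25, 0, 3), (30, 1, 2),
          (35, 0, 1), (40, 1, 1), (45, 0, 32), (50, 1, 26)]
durations = [t for t, f, n in groups for _ in range(n)]
failed = [f for t, f, n in groups for _ in range(n)]

for t, r, d, c, s in kaplan_meier(durations, failed):
    print(f"{t:4d} {r:4d} {d:3d} {c:3d} {s:8.4f}")
```

The printed columns correspond to Time, Beg. Total, Fail, Net Lost and Survivor Function in the Stata listing. The standard errors and confidence bands are not computed here.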
. ********** OVERVIEW OF MMA17P3WEIB.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 17.6.1 (pages 584-6)
. * Plot of Weibull density, survivor, hazard and cumulative hazard functions
. * Provides
. * (1) Figure 17.2 (ch17weibull.wmf)
.
. * This program requires no data
.
. ********** SETUP **********
.
. set more off
. version 8.0
. set scheme s1mono /* Used for graphs */
.
. ********** GENERATE DATA AND FUNCTIONS **********
.
. set obs 800
obs was 0, now 800
.
. gen t = 0.1*_n /* duration time */
.
. * Generate the survivor, hazard, cumulative hazard and density
. scalar g = 0.01 /* gamma */
. scalar a = 1.5 /* alpha */
. gen surv = exp(-g*(t^(a)))
. gen density = g*a*(t^(a-1))*exp(-g*(t^(a)))
. gen hazard = g*a*(t^(a-1))
. gen cumhaz = -ln(surv)
.
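The four generated variables follow the Weibull formulas S(t) = exp(-gamma t^alpha), hazard(t) = gamma alpha t^(alpha-1), f(t) = hazard(t) S(t) and cumulative hazard = -ln S(t) = gamma t^alpha. A minimal Python sketch of the same calculation, using the parameter values gamma = 0.01 and alpha = 1.5 set above, is given below. The function name is an assumption made for illustration and the sketch is not part of mma17p3weib.do.

```python
import math


def weibull_functions(t, gamma=0.01, alpha=1.5):
    """Survivor, density, hazard and cumulative hazard of the Weibull at t > 0."""
    surv = math.exp(-gamma * t ** alpha)
    hazard = gamma * alpha * t ** (alpha - 1)
    density = hazard * surv          # f(t) = hazard(t) * S(t)
    cumhaz = gamma * t ** alpha      # equals -ln S(t)
    return surv, density, hazard, cumhaz


for t in (1.0, 10.0, 50.0):
    s, f, h, H = weibull_functions(t)
    print(f"t={t:5.1f}  S={s:.4f}  f={f:.4f}  hazard={h:.4f}  cumhaz={H:.4f}")
```

With alpha greater than one the hazard rises with duration.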
. ********** DO THE FOUR SEPARATE GRAPHS FOR FIGURE 17.2 **********
.
. * Weibull density
. graph twoway (scatter density t, c(l) msize(vtiny) clwidth(medthick) clstyle(p1)), /*
> */ scale (1.2) plotregion(style(none)) /*
> */ xtitle("Duration time", size(large)) xscale(titlegap(*5)) /*
> */ ytitle("Weibull density", size(large)) yscale(titlegap(*5)) /*
> */ xlabel(,labsize(medlarge)) ylabel(,labsize(medlarge))
. graph save ch17fig2a, replace
(file ch17fig2a.gph saved)
.
. * Weibull survivor
. graph twoway (scatter surv t, c(l) msize(vtiny) clwidth(medthick) clstyle(p1)), /*
> */ scale (1.2) plotregion(style(none)) /*
> */ xtitle("Duration time", size(large)) xscale(titlegap(*5)) /*
> */ ytitle("Weibull survivor", size(large)) yscale(titlegap(*5)) /*
> */ xlabel(,labsize(medlarge)) ylabel(,labsize(medlarge))
. graph save ch17fig2b, replace
(file ch17fig2b.gph saved)
.
. * Weibull hazard
. graph twoway (scatter hazard t, c(l) msize(vtiny) clwidth(medthick) clstyle(p1)), /*
> */ scale (1.2) plotregion(style(none)) /*
> */ xtitle("Duration time", size(large)) xscale(titlegap(*5)) /*
> */ ytitle("Weibull hazard", size(large)) yscale(titlegap(*5)) /*
> */ xlabel(,labsize(medlarge)) ylabel(,labsize(medlarge))
. graph save ch17fig2c, replace
(file ch17fig2c.gph saved)
.
. * Weibull cumulative hazard
. graph twoway (scatter cumhaz t, c(l) msize(vtiny) clwidth(medthick) clstyle(p1)), /*
> */ scale (1.2) plotregion(style(none)) /*
> */ xtitle("Duration time", size(large)) xscale(titlegap(*5)) /*
> */ ytitle("Cumulative hazard", size(large)) yscale(titlegap(*5)) /*
> */ xlabel(,labsize(medlarge)) ylabel(,labsize(medlarge))
. graph save ch17fig2d, replace
(file ch17fig2d.gph saved)
.
. ********** COMBINE THE FOUR GRAPHS FOR FIGURE 17.2 (page 585) **********
.
. graph combine ch17fig2a.gph ch17fig2b.gph ch17fig2c.gph ch17fig2d.gph, /*
> */ title("Weibull Distribution", margin(b=2) size(vlarge))
. graph export ch17weibull.wmf, replace
(file c:\Imbook\bwebpage\Section4\ch17weibull.wmf written in Windows Metafile format)
.
. ********** CLOSE OUTPUT
. log close
log: c:\Imbook\bwebpage\Section4\mma17p3weib.txt
log type: text
closed on: 19 May 2005, 14:22:39
------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section4\mma17p4duration.txt
log type: text
opened on: 19 May 2005, 15:25:00
.
. ********** OVERVIEW OF MMA17P4DURATION.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 17.11 (pages 603-8)
. * Duration regression with censored data example
. * Provides
. * (1) Data summary: Table 17.6
. * (2) List of Survivor Function and Cumulative Hazard Estimates: Table 17.7
. * (3) Various graphs describing the data
. *     (3A) K-M Survival Graph for all data (Figure 17.3: km_pt1.wmf)
. *     (3B) K-M Survival Graph by unemployment insurance (Figure 17.4: km_pt2.wmf)
. *     (3C) N-A Cumulative Hazard Graph for all data (Figure 17.5: na_pt1.wmf)
. *     (3D) N-A Cumulative Hazard Graph by unemployment insurance (Figure 17.6: na_pt2.wmf)
. * (4) Coefficient Estimates of Some Parametric Models (Table 17.8)
. * (5) Hazard Ratio Estimates of Some Parametric Models (Table 17.9)
.
. * To run this program you need data file
. * ema1996.dta
.
. ********** SETUP **********
.
. set more off
. version 8.0
. set scheme s1mono /* Used for graphs */
. set matsize 100
.
. ********** DATA DESCRIPTION **********
.
      ychild |    3343    .1956327   .3967463          0          1
    nonwhite |    3343    .1390966   .3460991          0          1
-------------+-----------------------------------------------------
         age |    3343    35.44331    10.6402         20         61
     schlt12 |    3343    .2811846   .4496446          0          1
     schgt12 |    3343    .3356267   .4722797          0          1
        smsa |    3343    .7241998   .4469835          0          1
    bluecoll |    3343    .6036494    .489212          0          1
-------------+-----------------------------------------------------
      mining |    3343     .029315   .1687132          0          1
      constr |    3343    .1480706   .3552231          0          1
      transp |    3343    .0646126   .2458778          0          1
       trade |    3343    .1848639   .3882452          0          1
        fire |    3343    .0514508   .2209484          0          1
-------------+-----------------------------------------------------
    services |    3343    .1699073   .3756075          0          1
    pubadmin |    3343    .0095722    .097383          0          1
      year85 |    3343    .2677236    .442839          0          1
      year87 |    3343    .2174693   .4125862          0          1
      year89 |    3343    .1998205   .3999251          0          1
-------------+-----------------------------------------------------
      midatl |    3343    .1088842   .3115405          0          1
       encen |    3343    .1429853   .3501103          0          1
       wncen |    3343    .0643135   .2453472          0          1
    southatl |    3343    .2375112   .4256217          0          1
       escen |    3343    .0532456   .2245564          0          1
-------------+-----------------------------------------------------
       wscen |    3343    .1441819   .3513266          0          1
    mountain |    3343    .1079868   .3104102          0          1
     pacific |    3343    .0260245    .159232          0          1
.
. * The following gives variables in same order as Table 2 p.657 of McCall (1996)
. * which gives fuller names for the variables
. sum spell censor1 censor2 censor3 censor4 age /*
> */ ui reprate disrate logwage tenure slack abolpos explose bluecoll /*
> */ houshead married child ychild female schlt12 schgt12 nonwhite smsa /*
> */ midatl encen wncen southatl escen wscen mountain pacific /*
> */ mining constr transp trade fire services pubadmin /*
> */ year85 year87 year89
    Variable |     Obs        Mean  Std. Dev.        Min        Max
-------------+-----------------------------------------------------
       spell |    3343    6.247981   5.611271          1         28
     censor1 |    3343    .3209692   .4669188          0          1
     censor2 |    3343    .1014059   .3019106          0          1
     censor3 |    3343    .1717021   .3771777          0          1
     censor4 |    3343    .3754113   .4843014          0          1
-------------+-----------------------------------------------------
         age |    3343    35.44331    10.6402         20         61
          ui |    3343    .5527969   .4972791          0          1
     reprate |    3343    .4544717   .1137918       .066      2.059
     disrate |    3343    .1094376   .0735274       .002       1.02
     logwage |    3343    5.692994   .5356591    2.70805   7.600402
-------------+-----------------------------------------------------
      tenure |    3343    4.114867   5.862322          0         40
       slack |    3343    .4884834   .4999421          0          1
     abolpos |    3343    .1456775   .3528354          0          1
     explose |    3343    .5025426   .5000683          0          1
    bluecoll |    3343    .6036494    .489212          0          1
-------------+-----------------------------------------------------
    houshead |    3343    .6120251   .4873617          0          1
     married |    3343    .5860006   .4926221          0          1
       child |    3343    .4501944   .4975876          0          1
      ychild |    3343    .1956327   .3967463          0          1
      female |    3343    .3478911   .4763725          0          1
-------------+-----------------------------------------------------
     schlt12 |    3343    .2811846   .4496446          0          1
     schgt12 |    3343    .3356267   .4722797          0          1
    nonwhite |    3343    .1390966   .3460991          0          1
        smsa |    3343    .7241998   .4469835          0          1
      midatl |    3343    .1088842   .3115405          0          1
-------------+-----------------------------------------------------
       encen |    3343    .1429853   .3501103          0          1
       wncen |    3343    .0643135   .2453472          0          1
    southatl |    3343    .2375112   .4256217          0          1
       escen |    3343    .0532456   .2245564          0          1
       wscen |    3343    .1441819   .3513266          0          1
-------------+-----------------------------------------------------
    mountain |    3343    .1079868   .3104102          0          1
     pacific |    3343    .0260245    .159232          0          1
      mining |    3343     .029315   .1687132          0          1
      constr |    3343    .1480706   .3552231          0          1
      transp |    3343    .0646126   .2458778          0          1
-------------+-----------------------------------------------------
       trade |    3343    .1848639   .3882452          0          1
        fire |    3343    .0514508   .2209484          0          1
    services |    3343    .1699073   .3756075          0          1
    pubadmin |    3343    .0095722    .097383          0          1
      year85 |    3343    .2677236    .442839          0          1
-------------+-----------------------------------------------------
      year87 |    3343    .2174693   .4125862          0          1
      year89 |    3343    .1998205   .3999251          0          1
.
. * The following creates a space-delimited data set with
. * variables in same order as Table 2 p.657 of McCall (1996)
. * Permits use by programs other than Stata
. * Note that order has been changed a little from the original Stata data set
.
. outfile spell censor1 censor2 censor3 censor4 age /*
> */ ui reprate disrate logwage tenure slack abolpos explose bluecoll /*
>
>
>
>
.
. ********* ANALYSIS: UNEMPLOYMENT DURATION **********
.
. * Stata st curves require defining the dependent variable
. * and the censoring variable if there is one
. stset spell, fail(censor1=1)
     failure event:  censor1 == 1
obs. time interval:  (0, spell]
 exit on or before:  failure
------------------------------------------------------------------------------
     3343  total obs.
        0  exclusions
------------------------------------------------------------------------------
     3343  obs. remaining, representing
     1073  failures in single record/single failure data
    20887  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =        28
. stdes
         failure _d:  censor1 == 1
   analysis time _t:  spell
                                   |-------------- per subject --------------|
Category                   total        mean         min     median        max
------------------------------------------------------------------------------
no. of subjects             3343
no. of records              3343           1           1          1          1
(first) entry time                         0           0          0          0
(final) exit time                   6.247981           1          5         28
subjects with gap              0
time on gap if gap             0
time at risk               20887    6.247981           1          5         28
failures                    1073    .3209692           0          0          1
------------------------------------------------------------------------------
.
. * (1) SUMMARIZE KEY VARIABLES (Table 17.6, p.603)
.
. sum spell censor1 censor2 censor3 censor4 ui reprate disrate tenure logwage
    Variable |     Obs        Mean  Std. Dev.        Min        Max
-------------+-----------------------------------------------------
       spell |    3343    6.247981   5.611271          1         28
     censor1 |    3343    .3209692   .4669188          0          1
     censor2 |    3343    .1014059   .3019106          0          1
     censor3 |    3343    .1717021   .3771777          0          1
     censor4 |    3343    .3754113   .4843014          0          1
-------------+-----------------------------------------------------
          ui |    3343    .5527969   .4972791          0          1
     reprate |    3343    .4544717   .1137918       .066      2.059
     disrate |    3343    .1094376   .0735274       .002       1.02
      tenure |    3343    4.114867   5.862322          0         40
     logwage |    3343    5.692994   .5356591    2.70805   7.600402
.
. * (2) LIST SURVIVAL CURVE AND CUMULATIVE HAZARD ESTIMATES (Table 17.7, p.605)
.
. * Kaplan-Meier Estimates of Survival Function
. sts list
         failure _d:  censor1 == 1
   analysis time _t:  spell
           Beg.           Net           Survivor      Std.
  Time    Total   Fail   Lost           Function     Error     [95% Conf. Int.]
-------------------------------------------------------------------------------
     1     3343    294    246             0.9121    0.0049     0.9019    0.9212
     2     2803    178    304             0.8541    0.0062     0.8415    0.8659
     3     2321    119    305             0.8103    0.0071     0.7960    0.8238
     4     1897     56    165             0.7864    0.0076     0.7712    0.8008
     5     1676    104    233             0.7376    0.0085     0.7206    0.7538
     6     1339     32    111             0.7200    0.0088     0.7023    0.7369
     7     1196     85    178             0.6688    0.0098     0.6492    0.6876
     8      933     15     70             0.6581    0.0100     0.6380    0.6773
     9      848     33     98             0.6325    0.0106     0.6113    0.6528
    10      717      3     55             0.6298    0.0106     0.6086    0.6503
    11      659     26     77             0.6050    0.0113     0.5825    0.6267
    12      556      7     40             0.5974    0.0115     0.5744    0.6195
    13      509     25     69             0.5680    0.0123     0.5434    0.5918
    14      415     30     74             0.5270    0.0135     0.5001    0.5531
    15      311     19     40             0.4948    0.0146     0.4658    0.5230
    16      252     10     41             0.4751    0.0153     0.4449    0.5047
    17      201      8     24             0.4562    0.0161     0.4245    0.4874
    18      169      7     13             0.4373    0.0169     0.4040    0.4702
    19      149      4     15             0.4256    0.0174     0.3912    0.4595
    20      130      3     18             0.4158    0.0179     0.3804    0.4507
    21      109      4     23             0.4005    0.0188     0.3635    0.4372
    22       82      4      9             0.3810    0.0203     0.3412    0.4206
    23       69      0      9             0.3810    0.0203     0.3412    0.4206
    24       60      0      2             0.3810    0.0203     0.3412    0.4206
    25       58      0     10             0.3810    0.0203     0.3412    0.4206
    26       48      2     13             0.3651    0.0223     0.3214    0.4088
    27       33      5     24             0.3098    0.0296     0.2528    0.3684
    28        4      0      4             0.3098    0.0296     0.2528    0.3684
-------------------------------------------------------------------------------
.
. * Nelson-Aalen Estimates of Cumulative Hazard
. sts list, na
         failure _d:  censor1 == 1
   analysis time _t:  spell
           Beg.           Net       Nelson-Aalen      Std.
  Time    Total   Fail   Lost          Cum. Haz.     Error     [95% Conf. Int.]
-------------------------------------------------------------------------------
     1     3343    294    246             0.0879    0.0051     0.0784    0.0986
     2     2803    178    304             0.1514    0.0070     0.1383    0.1658
     3     2321    119    305             0.2027    0.0084     0.1869    0.2199
     4     1897     56    165             0.2322    0.0093     0.2147    0.2512
     5     1676    104    233             0.2943    0.0111     0.2733    0.3169
     6     1339     32    111             0.3182    0.0119     0.2957    0.3424
     7     1196     85    178             0.3893    0.0142     0.3624    0.4181
     8      933     15     70             0.4053    0.0148     0.3774    0.4353
     9      848     33     98             0.4443    0.0162     0.4135    0.4773
    10      717      3     55             0.4484    0.0164     0.4174    0.4818
    11      659     26     77             0.4879    0.0182     0.4536    0.5248
    12      556      7     40             0.5005    0.0188     0.4650    0.5387
    13      509     25     69             0.5496    0.0212     0.5096    0.5927
    14      415     30     74             0.6219    0.0250     0.5748    0.6728
    15      311     19     40             0.6830    0.0286     0.6291    0.7415
    16      252     10     41             0.7227    0.0313     0.6639    0.7866
    17      201      8     24             0.7625    0.0343     0.6982    0.8327
    18      169      7     13             0.8039    0.0377     0.7333    0.8812
    19      149      4     15             0.8307    0.0400     0.7559    0.9130
    20      130      3     18             0.8538    0.0422     0.7750    0.9406
    21      109      4     23             0.8905    0.0460     0.8048    0.9853
    22       82      4      9             0.9393    0.0521     0.8426    1.0470
    23       69      0      9             0.9393    0.0521     0.8426    1.0470
    24       60      0      2             0.9393    0.0521     0.8426    1.0470
    25       58      0     10             0.9393    0.0521     0.8426    1.0470
    26       48      2     13             0.9809    0.0598     0.8705    1.1055
    27       33      5     24             1.1325    0.0904     0.9685    1.3242
    28        4      0      4             1.1325    0.0904     0.9685    1.3242
-------------------------------------------------------------------------------
.
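The two listings above are linked by simple recursions on the Beg. Total and Fail columns: the Nelson-Aalen estimate is the running sum of d_j/r_j, and the Kaplan-Meier estimate is the running product of (1 - d_j/r_j). A minimal Python sketch, not part of mma17p4duration.do and with function names chosen for illustration, reproduces the first five rows of each listing.

```python
def nelson_aalen(at_risk, failures):
    """Cumulative hazard: running sum of d_j / r_j over ordered failure times."""
    total, out = 0.0, []
    for r, d in zip(at_risk, failures):
        total += d / r
        out.append(total)
    return out


def kaplan_meier_from_counts(at_risk, failures):
    """Survivor function: running product of (1 - d_j / r_j)."""
    surv, out = 1.0, []
    for r, d in zip(at_risk, failures):
        surv *= 1.0 - d / r
        out.append(surv)
    return out


# Beg. Total and Fail for spell = 1,...,5, copied from the listings above
at_risk = [3343, 2803, 2321, 1897, 1676]
failures = [294, 178, 119, 56, 104]

print([round(h, 4) for h in nelson_aalen(at_risk, failures)])
print([round(s, 4) for s in kaplan_meier_from_counts(at_risk, failures)])
```

Censored spells (the Net Lost column) enter only through the size of the next period's risk set, which is why the at-risk counts fall by more than the number of failures.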
. * (3) VARIOUS GRAPHS (Figures 17.3-17.6)
.
. * (3A) Figure 17.3: Overall Survival Function (page 604)
. sort spell
. graph twoway (line ubcumhaz spell, msize(vtiny) mstyle(p2) c(J) clstyle(p1) clcolor(gs10)) /*
> */ (line cumhaz spell, msize(vtiny) mstyle(p1) c(J) clstyle(p1)) /*
> */ (line lbcumhaz spell, msize(vtiny) mstyle(p2) c(J) clstyle(p1) clcolor(gs10)), /*
> */ scale(1.2) plotregion(style(none)) /*
> */ title("Overall Cumulative Hazard Estimate") /*
> */ xtitle("Unemployment Duration in 2-week intervals", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Cumulative Hazard", size(medlarge)) yscale(titlegap(*5)) /*
> */ ylabel(0.00(0.50)1.50,grid)/*
> */ legend(pos(11) ring(0) col(1)) legend(size(small)) /*
> */ legend( label(1 "Upper 95% confidence band") label(2 "Cumulative Hazard Estimate") /*
> */ label(3 "Lower 95% confidence band") )
. graph export na_pt1.wmf, replace
(file c:\Imbook\bwebpage\Section4\na_pt1.wmf written in Windows Metafile format)
.
. * (3D) Figure 17.6: Cumulative Hazard Function by Treatment (here ui) (p.606)
. * sts graph, na by(ui)
. sts graph, na by(ui) /*
> */ scale (1.2) plotregion(style(none)) /*
> */ title("Cumulative Hazard Estimates by UI Status") /*
> */ xtitle("Unemployment Duration in 2-week intervals", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Cumulative Hazard", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend(pos(1) ring(0) col(1)) legend(size(small)) /*
> */ legend(label(1 "No UI (UI = 0)") label(2 "Received UI (UI = 1)") )
failure _d: censor1 == 1
analysis time _t: spell
. graph export na_pt2.wmf, replace
(file c:\Imbook\bwebpage\Section4\na_pt2.wmf written in Windows Metafile format)
.
. * (4) VARIOUS PARAMETRIC MODELS: COEFFICIENTS (Table 17.8)
.
. * streg default is to report hazard ratios rather than coefficients
. * streg with nohr option reports coefficients
.
. * Create regressors
. gen RR = reprate
. gen DR = disrate
. gen UI = ui
. gen RRUI = RR*UI
. gen DRUI = DR*UI
No. of subjects =         3343                     Number of obs   =      3343
No. of failures =         1073
Time at risk    =        20887
                                                   Prob > chi2     =    0.0000
------------------------------------------------------------------------------
             |               Robust
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          RR |   .4720235   .6005534     0.79   0.432    -.7050396    1.649087
          DR |  -.5756396   .7624489    -0.75   0.450    -2.070012    .9187327
          UI |  -1.424561   .2493917    -5.71   0.000     -1.91336   -.9357622
        RRUI |   .9655904   .6118408     1.58   0.115    -.2335956    2.164776
        DRUI |  -.1990635   1.019118    -0.20   0.845    -2.196498    1.798371
     LOGWAGE |   .3508005    .115598     3.03   0.002     .1242327    .5773684
      tenure |  -.0001462   .0064637    -0.02   0.982    -.0128147    .0125224
       slack |  -.2593666   .0759363    -3.42   0.001    -.4081991   -.1105342
     abolpos |  -.1550897   .0953306    -1.63   0.104    -.3419342    .0317549
     explose |    .198458   .0648354     3.06   0.002      .071383    .3255331
     stateur |   -.064626   .0229903    -2.81   0.005    -.1096862   -.0195659
    houshead |   .3812208   .0836602     4.56   0.000     .2172499    .5451918
     married |    .369552   .0786145     4.70   0.000     .2154705    .5236335
No. of subjects =         3343                     Number of obs   =      3343
No. of failures =         1073
Time at risk    =        20887
                                                   Prob > chi2     =    0.0000
------------------------------------------------------------------------------
             |               Robust
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          RR |   .4481156   .6381895     0.70   0.483    -.8027127    1.698944
          DR |  -.4269187   .8086983    -0.53   0.598    -2.011938    1.158101
          UI |  -1.496066   .2639679    -5.67   0.000    -2.013434   -.9786984
        RRUI |   1.015226   .6455611     1.57   0.116    -.2500501    2.280503
        DRUI |  -.2988417   1.065384    -0.28   0.779    -2.386956    1.789272
     LOGWAGE |   .3655253     .12212     2.99   0.003     .1261745    .6048761
      tenure |  -.0011127   .0068716    -0.16   0.871    -.0145809    .0123554
       slack |  -.2652154   .0803214    -3.30   0.001    -.4226424   -.1077883
     abolpos |  -.1604227   .1012942    -1.58   0.113    -.3589557    .0381103
     explose |   .2075085   .0684715     3.03   0.002     .0733068    .3417103
     stateur |  -.0708745   .0242117    -2.93   0.003    -.1183286   -.0234204
    houshead |   .3976626   .0887192     4.48   0.000     .2237762     .571549
     married |   .3786057   .0830317     4.56   0.000     .2158665     .541345
      female |   .1260829   .0896987     1.41   0.160    -.0497233     .301889
       child |  -.0336778   .0839956    -0.40   0.688    -.1983061    .1309505
      ychild |  -.1613066    .108947    -1.48   0.139    -.3748389    .0522256
    nonwhite |  -.7025504     .12426    -5.65   0.000    -.9460956   -.4590052
         age |  -.0235823   .0041922    -5.63   0.000    -.0317989   -.0153658
     schlt12 |  -.1226759   .1022762    -1.20   0.230    -.3231335    .0777816
     schgt12 |   .1162848   .0880692     1.32   0.187    -.0563278    .2888973
        smsa |   .1999567   .0841129     2.38   0.017     .0350985    .3648149
    bluecoll |  -.1994925   .0899354    -2.22   0.027    -.3757626   -.0232223
      mining |  -.1015676   .2036644    -0.50   0.618    -.5007425    .2976073
      constr |  -.0253737   .1135609    -0.22   0.823     -.247949    .1972016
      transp |  -.1981522   .1672141    -1.19   0.236    -.5258858    .1295814
       trade |  -.0311361   .1079502    -0.29   0.773    -.2427146    .1804423
        fire |   .1262153   .1492527     0.85   0.398    -.1663145    .4187452
    services |   .2031673   .1038945     1.96   0.051    -.0004622    .4067968
    pubadmin |   .1117728   .3087374     0.36   0.717    -.4933415     .716887
      year85 |   .2374972    .093387     2.54   0.011      .054462    .4205325
      year87 |   .3787397   .1011782     3.74   0.000     .1804341    .5770454
      year89 |   .4920278   .1180472     4.17   0.000     .2606596    .7233959
      midatl |     .02465   .1542139     0.16   0.873    -.2776037    .3269036
       encen |  -.0014111   .1579065    -0.01   0.993    -.3109023      .30808
       wncen |   .1844363   .1694444     1.09   0.276    -.1476687    .5165413
    southatl |   .2740974   .1250481     2.19   0.028     .0290076    .5191872
No. of subjects =         3343                     Number of obs   =      3343
No. of failures =         1073
Time at risk    =        20887
Log pseudo-likelihood =                            Prob > chi2     =    0.0000
------------------------------------------------------------------------------
             |               Robust
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          RR |    .472405   .6033813     0.78   0.434    -.7102005    1.655011
          DR |  -.5627894   .7646131    -0.74   0.462    -2.061404    .9358247
No. of subjects =         3343                     Number of obs   =      3343
No. of failures =         1073
Time at risk    =        20887
                                                   Prob > chi2     =    0.0000
------------------------------------------------------------------------------
             |               Robust
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          RR |   .5222796   .5711698     0.91   0.361    -.5971926    1.641752
          DR |   -.752507     .72175    -1.04   0.297    -2.167111    .6620971
          UI |  -1.317719   .2372893    -5.55   0.000    -1.782798   -.8526409
        RRUI |   .8822462    .582115     1.52   0.130    -.2586783    2.023171
        DRUI |  -.0951357    .977774    -0.10   0.922    -2.011538    1.821266
     LOGWAGE |   .3352639   .1106483     3.03   0.002     .1183972    .5521306
      tenure |   .0008278   .0061286     0.14   0.893    -.0111841    .0128396
       slack |   -.247863   .0721173    -3.44   0.001    -.3892103   -.1065158
     abolpos |  -.1511638   .0905035    -1.67   0.095    -.3285475    .0262198
     explose |   .1865068   .0615742     3.03   0.002     .0658236      .30719
     stateur |  -.0590475    .022085    -2.67   0.008    -.1023334   -.0157616
    houshead |   .3601866   .0794827     4.53   0.000     .2044035    .5159698
     married |    .358819   .0746355     4.81   0.000     .2125362    .5051019
      female |   .1002758   .0813277     1.23   0.218    -.0591236    .2596753
       child |  -.0396054   .0755365    -0.52   0.600    -.1876542    .1084435
      ychild |  -.1276638   .0967856    -1.32   0.187    -.3173602    .0620325
    nonwhite |  -.6394475   .1151332    -5.55   0.000    -.8651043   -.4137906
         age |  -.0204623   .0037593    -5.44   0.000    -.0278305   -.0130942
     schlt12 |  -.1220585   .0920073    -1.33   0.185    -.3023895    .0582726
     schgt12 |   .1104817   .0783542     1.41   0.159    -.0430897    .2640531
        smsa |   .1864841   .0766075     2.43   0.015     .0363361    .3366321
    bluecoll |  -.2108023    .080867    -2.61   0.009    -.3692986    -.052306
      mining |  -.1238251   .1906352    -0.65   0.516    -.4974632     .249813
      constr |   -.054455   .1029488    -0.53   0.597     -.256231    .1473209
      transp |  -.1551657   .1466515    -1.06   0.290    -.4425973    .1322659
       trade |  -.0383252   .0968106    -0.40   0.692    -.2280706    .1514201
        fire |   .1097585   .1300779     0.84   0.399    -.1451895    .3647065
    services |   .1666262   .0939507     1.77   0.076    -.0175138    .3507662
    pubadmin |   .1022002   .2829817     0.36   0.718    -.4524336    .6568341
      year85 |    .204162    .084908     2.40   0.016     .0377454    .3705786
          UI |    -1.318
             |     -5.55
        RRUI |     0.882
             |      1.52
        DRUI |    -0.095
             |     -0.10
     LOGWAGE |     0.335
             |      3.03
-------------+----------
           N |  3343.000
          ll |  -7.7e+03
------------------------
             legend: b/t
.
. * (5) VARIOUS PARAMETRIC MODELS: HAZARD RATIOS (Table 17.9, page 608)
.
. * streg default is to report hazard ratios rather than coefficients
. * streg with nohr option reports coefficients
.
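The hazard ratios that streg reports by default are the exponentiated coefficients obtained with the nohr option. As a minimal Python sketch of that mapping (not part of mma17p4duration.do; the function name and the delta-method standard error exp(b) x se(b) are stated assumptions), the RR row of the exponential model can be reproduced from the coefficient .4720235 and robust standard error .6005534 listed earlier in this log.

```python
import math


def hazard_ratio(coef, se, z=1.959964):
    """Hazard ratio exp(b), delta-method s.e. exp(b)*se(b), and 95% interval.

    The interval exponentiates the endpoints b +/- z*se(b) of the
    coefficient interval, so it is not symmetric around the hazard ratio.
    """
    hr = math.exp(coef)
    return hr, hr * se, math.exp(coef - z * se), math.exp(coef + z * se)


# RR coefficient and robust standard error from the coefficient listing
hr, hr_se, lo, hi = hazard_ratio(0.4720235, 0.6005534)
print(f"HR={hr:.6f}  se={hr_se:.6f}  95% CI=({lo:.6f}, {hi:.6f})")
```

The z statistic and p-value are unchanged by the transformation, which is why they are identical in the coefficient and hazard-ratio tables.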
. * Exponential regression
. streg $xlist, robust dist(exponential)
failure _d: censor1 == 1
analysis time _t: spell
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:
No. of subjects = 3343
No. of failures = 1073
Time at risk = 20887
Number of obs = 3343
Prob > chi2 = 0.0000
------------------------------------------------------------------------------
| Robust
_t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | 1.603235 .9628283 0.79 0.432 .494089 5.202226
DR | .5623451 .4287594 -0.75 0.450 .1261843 2.506112
UI | .2406141 .0600072 -5.71 0.000 .1475837 .3922867
RRUI | 2.626338 1.606901 1.58 0.115 .7916819 8.712654
DRUI | .8194978 .8351649 -0.20 0.845 .1111919 6.039799
LOGWAGE | 1.420204 .1641727 3.03 0.002 1.132279 1.781344
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:
No. of subjects = 3343
No. of failures = 1073
Time at risk = 20887
Number of obs = 3343
Prob > chi2 = 0.0000
------------------------------------------------------------------------------
| Robust
_t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | 1.56536 .998996 0.70 0.483 .4481117 5.46817
DR | .6525166 .527689 -0.53 0.598 .1337292 3.183881
UI | .2240097 .0591314 -5.67 0.000 .1335294 .3757999
RRUI | 2.759988 1.781741 1.57 0.116 .7787618 9.781599
DRUI | .7416768 .7901705 -0.28 0.779 .0919091 5.985096
LOGWAGE | 1.441271 .176008 2.99 0.003 1.13448 1.831025
tenure | .9988879 .006864 -0.16 0.871 .9855249 1.012432
slack | .7670407 .0616098 -3.30 0.001 .6553129 .8978176
abolpos | .8517837 .0862808 -1.58 0.113 .6984053 1.038846
explose | 1.230608 .0842616 3.03 0.002 1.076061 1.407352
stateur | .9315788 .0225551 -2.93 0.003 .8884041 .9768517
houshead | 1.488342 .1320445 4.48 0.000 1.250791 1.771008
married | 1.460247 .1212469 4.56 0.000 1.240937 1.718316
female | 1.134376 .101752 1.41 0.160 .9514927 1.352411
child | .966883 .0812139 -0.40 0.688 .8201188 1.139911
ychild | .8510311 .0927173 -1.48 0.139 .6874 1.053613
nonwhite | .4953204 .0615485 -5.65 0.000 .388254 .6319119
age | .9766936 .0040945 -5.63 0.000 .9687014 .9847517
schlt12 | .8845503 .0904684 -1.20 0.230 .7238772 1.080887
schgt12 | 1.123316 .0989295 1.32 0.187 .9452293 1.334955
smsa | 1.22135 .1027313 2.38 0.017 1.035722 1.440247
bluecoll | .8191464 .0736702 -2.22 0.027 .6867654 .9770452
mining | .9034201 .1839945 -0.50 0.618 .6060805 1.346633
constr | .9749455 .1107157 -0.22 0.823 .7803997 1.21799
transp | .820245 .1371565 -1.19 0.236 .5910316 1.138352
trade | .9693436 .1046408 -0.29 0.773 .7844954 1.197747
fire | 1.134526 .1693311 0.85 0.398 .8467799 1.520053
services | 1.225277 .1272996 1.96 0.051 .9995379 1.501999
pubadmin | 1.118259 .3452483 0.36 0.717 .6105827 2.048048
year85 | 1.268072 .1184214 2.54 0.011 1.055972 1.522772
year87 | 1.460443 .147765 3.74 0.000 1.197737 1.780769
year89 | 1.63563 .1930814 4.17 0.000 1.297786 2.061422
No. of subjects = 3343
No. of failures = 1073
Time at risk = 20887
Number of obs = 3343
Prob > chi2 = 0.0000
------------------------------------------------------------------------------
| Robust
_t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | 1.603847 .9677311 0.78 0.434 .4915456 5.233135
No. of subjects = 3343
No. of failures = 1073
Time at risk = 20887
Number of obs = 3343
Prob > chi2 = 0.0000
------------------------------------------------------------------------------
| Robust
_t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | 1.685866 .962916 0.91 0.361 .5503545 5.164209
DR | .4711838 .3400769 -1.04 0.297 .1145079 1.938854
UI | .2677452 .0635331 -5.55 0.000 .168167 .4262877
RRUI | 2.416321 1.406577 1.52 0.130 .7720714 7.562264
DRUI | .9092495 .8890406 -0.10 0.922 .1337828 6.179678
LOGWAGE | 1.398309 .1547206 3.03 0.002 1.125691 1.73695
tenure | 1.000828 .0061337 0.14 0.893 .9888782 1.012922
slack | .7804668 .0562851 -3.44 0.001 .6775918 .8989608
abolpos | .8597068 .0778065 -1.67 0.095 .7199688 1.026567
explose | 1.205033 .0741989 3.03 0.002 1.068038 1.359599
stateur | .942662 .0208187 -2.67 0.008 .9027285 .9843619
houshead | 1.433597 .1139461 4.53 0.000 1.226793 1.675262
married | 1.431638 .106851 4.81 0.000 1.236811 1.657154
female | 1.105476 .0899059 1.23 0.218 .9425903 1.296509
child | .9611687 .0726033 -0.52 0.600 .8289013 1.114542
ychild | .8801492 .0851858 -1.32 0.187 .7280685 1.063997
nonwhite | .5275839 .0607424 -5.55 0.000 .4210076 .6611394
age | .9797456 .0036832 -5.44 0.000 .9725532 .9869912
schlt12 | .8850966 .0814354 -1.33 0.185 .7390501 1.060004
schgt12 | 1.116816 .0875072 1.41 0.159 .9578255 1.302197
smsa | 1.205005 .0923125 2.43 0.015 1.037004 1.400224
bluecoll | .8099341 .0654969 -2.61 0.009 .6912189 .9490384
mining | .8835344 .1684327 -0.65 0.516 .6080713 1.283785
constr | .9470011 .0974926 -0.53 0.597 .7739632 1.158726
transp | .8562733 .1255737 -1.06 0.290 .6423659 1.141412
trade | .9623999 .0931706 -0.40 0.692 .796068 1.163485
fire | 1.116009 .1451681 0.84 0.399 .8648584 1.440091
services | 1.181313 .1109851 1.77 0.076 .9826387 1.420155
pubadmin | 1.107605 .313432 0.36 0.718 .6360783 1.928677
year85 | 1.226497 .1041394 2.40 0.016 1.038467 1.448572
year87 | 1.402734 .1261218 3.76 0.000 1.176095 1.673046
year89 | 1.566206 .1643529 4.28 0.000 1.275047 1.92385
------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section4\mma18p1heterogeneity.txt
log type: text
opened on: 19 May 2005, 17:58:22
.
. ********** OVERVIEW OF MMA18P1HETEROGENEITY.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 18.8 Pages 632-6
. * Unobserved Heterogeneity with Duration data Example
. * (1) Exponential with and without heterogeneity
. *     Residuals Plots: Figures 18.2 (exp.wmf) and 18.3 (exp_gamma.wmf)
. *     Tabulate Model Estimates: Table 18.1
. * (2) Weibull with and without heterogeneity: Generalized Residuals Plots
. *     Residuals Plots: Figures 18.4 (Weibul16.wmf) and 18.5 (Weibul16_IG.wmf)
. *     Tabulate Model Estimates: Table 18.2
.
. * To run this program you need data file
. * ema1996.dta
.
. ********** SETUP **********
.
. set more off
. version 8.0
. set scheme s1mono /* Used for graphs */
. set matsize 100
.
. ********** DATA DESCRIPTION **********
.
. * The data is from
. * B.P. McCall (1996), "Unemployment Insurance Rules, Joblessness,
. *   and Part-time Work," Econometrica, 64, 647-682.
.
. * There are 3343 observations from the CPS Displaced Worker Surveys
. * of 1986, 1988, 1990 and 1992 on 33 variables including
. * spell = length of spell in number of two-week intervals
. * CENSOR1 = 1 if re-employed at full-time job
.
. * See program mma17p4duration.do for further description of the data set
.
. ********** READ DATA **********
.
. use ema1996.dta
(Sample for 1996 EMA paper: part-time= worked part-time last week)
.
. ********** CREATE ADDITIONAL VARIABLES **********
.
. gen RR = reprate
. gen DR = disrate
. gen UI = ui
. gen RRUI = RR*UI
. gen DRUI = DR*UI
. gen LOGWAGE = logwage
. sum
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
spell | 3343 6.247981 5.611271 1 28
censor1 | 3343 .3209692 .4669188 0 1
censor2 | 3343 .1014059 .3019106 0 1
censor3 | 3343 .1717021 .3771777 0 1
censor4 | 3343 .3754113 .4843014 0 1
-------------+--------------------------------------------------------
ui | 3343 .5527969 .4972791 0 1
reprate | 3343 .4544717 .1137918 .066 2.059
logwage | 3343 5.692994 .5356591 2.70805 7.600402
tenure | 3343 4.114867 5.862322 0 40
disrate | 3343 .1094376 .0735274 .002 1.02
-------------+--------------------------------------------------------
slack | 3343 .4884834 .4999421 0 1
abolpos | 3343 .1456775 .3528354 0 1
explose | 3343 .5025426 .5000683 0 1
stateur | 3343 6.5516 1.803825 2.5 13
houshead | 3343 .6120251 .4873617 0 1
-------------+--------------------------------------------------------
married | 3343 .5860006 .4926221 0 1
female | 3343 .3478911 .4763725 0 1
child | 3343 .4501944 .4975876 0 1
ychild | 3343 .1956327 .3967463 0 1
nonwhite | 3343 .1390966 .3460991 0 1
-------------+--------------------------------------------------------
age | 3343 35.44331 10.6402 20 61
schlt12 | 3343 .2811846 .4496446 0 1
schgt12 | 3343 .3356267 .4722797 0 1
smsa | 3343 .7241998 .4469835 0 1
bluecoll | 3343 .6036494 .489212 0 1
-------------+--------------------------------------------------------
mining | 3343 .029315 .1687132 0 1
constr | 3343 .1480706 .3552231 0 1
transp | 3343 .0646126 .2458778 0 1
trade | 3343 .1848639 .3882452 0 1
fire | 3343 .0514508 .2209484 0 1
-------------+--------------------------------------------------------
services | 3343 .1699073 .3756075 0 1
pubadmin | 3343 .0095722 .097383 0 1
year85 | 3343 .2677236 .442839 0 1
year87 | 3343 .2174693 .4125862 0 1
year89 | 3343 .1998205 .3999251 0 1
-------------+--------------------------------------------------------
midatl | 3343 .1088842 .3115405 0 1
encen | 3343 .1429853 .3501103 0 1
wncen | 3343 .0643135 .2453472 0 1
southatl | 3343 .2375112 .4256217 0 1
escen | 3343 .0532456 .2245564 0 1
-------------+--------------------------------------------------------
wscen | 3343 .1441819 .3513266 0 1
mountain | 3343 .1079868 .3104102 0 1
pacific | 3343 .0260245 .159232 0 1
RR | 3343 .4544717 .1137918 .066 2.059
DR | 3343 .1094376 .0735274 .002 1.02
-------------+--------------------------------------------------------
UI | 3343 .5527969 .4972791 0 1
RRUI | 3343 .2478687 .2380667 0 2.059
DRUI | 3343 .0602776 .0754261 0 .824
LOGWAGE | 3343 5.692994 .5356591 2.70805 7.600402
.
. ********* ANALYSIS: UNEMPLOYMENT DURATION **********
.
. * Stata st commands require defining the dependent variable
. * and the censoring variable if there is one
. stset spell, fail(censor1=1)
failure event: censor1 == 1
obs. time interval: (0, spell]
exit on or before: failure
------------------------------------------------------------------------------
3343 total obs.
0 exclusions
------------------------------------------------------------------------------
3343 obs. remaining, representing
1073 failures in single record/single failure data
20887 total analysis time at risk, at risk from t = 0
earliest observed entry t = 0
last observed exit t = 28
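As a rough sanity check on the stset summary above, the sketch below (Python, not part of the original Stata program) shows how a duration variable and a failure indicator combine into failures, time at risk, and failures per unit of time at risk. The last quantity is the maximum likelihood estimate of the constant hazard in an exponential model with no regressors. The helper `exposure_summary` and the toy records are hypothetical; only the totals 3343, 1073 and 20887 come from the log.

```python
def exposure_summary(spells, failed):
    """Return (failures, time at risk, failures per unit of time at risk).

    spells : durations, one per subject, all entering at t = 0
    failed : 1 if the spell ended in the event of interest, 0 if censored
    """
    failures = sum(failed)
    time_at_risk = sum(spells)
    return failures, time_at_risk, failures / time_at_risk

# Hypothetical toy records: three completed spells and two censored ones
toy = exposure_summary([1, 5, 28, 3, 7], [1, 0, 0, 1, 1])
print(toy)

# Totals reported by stset: 1073 failures over 20887 two-week periods at risk
crude_hazard = 1073 / 20887
mean_spell = 20887 / 3343
print(round(crude_hazard, 4), round(mean_spell, 6))
```

The mean spell computed this way matches the 6.247981 reported by sum, which confirms that total analysis time is simply the sum of spell over the 3343 single-record subjects.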
. stdes
failure _d: censor1 == 1
analysis time _t: spell
                                   |-------------- per subject --------------|
Category                   total        mean         min     median        max
------------------------------------------------------------------------------
no. of subjects             3343
no. of records              3343           1           1          1          1
(first) entry time                         0           0          0          0
(final) exit time                   6.247981           1          5         28
subjects with gap              0
time on gap if gap             0
time at risk               20887    6.247981           1          5         28
failures                    1073    .3209692           0          0          1
------------------------------------------------------------------------------
.
. * Define $xlist = list of regressors used in subsequent regressions
. global xlist RR DR UI RRUI DRUI LOGWAGE /*
> */ tenure slack abolpos explose stateur houshead married /*
> */ female child ychild nonwhite age schlt12 schgt12 smsa bluecoll /*
> */ mining constr transp trade fire services pubadmin /*
> */ year85 year87 year89 midatl /*
> */ encen wncen southatl escen wscen mountain pacific
.
. * (1) EXPONENTIAL REGRESSION
.
. * Estimate exponential without heterogeneity
. streg $xlist, nolog nohr dist(exponential) robust
failure _d: censor1 == 1
analysis time _t: spell
Exponential regression -- log relative-hazard form
No. of subjects = 3343
No. of failures = 1073
Time at risk = 20887
Number of obs = 3343
Prob > chi2 = 0.0000
------------------------------------------------------------------------------
| Robust
_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]
28
No. of subjects = 3343
No. of failures = 1073
Time at risk = 20887
Number of obs = 3343
Prob > chi2 = 0.0000
------------------------------------------------------------------------------
| Robust
_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | .5005828 .6187508 0.81 0.419 -.7121465 1.713312
DR | -.8824469 .7894395 -1.12 0.264 -2.42972 .664826
UI | -1.584537 .2622252 -6.04 0.000 -2.098489 -1.070586
RRUI | 1.091168 .6327026 1.72 0.085 -.1489067 2.331242
DRUI | .0574048 1.047123 0.05 0.956 -1.994919 2.109729
LOGWAGE | .3792805 .1191278 3.18 0.001 .1457944 .6127666
tenure | .0007938 .0065903 0.12 0.904 -.012123 .0137106
slack | -.2862928 .0770348 -3.72 0.000 -.4372782 -.1353074
abolpos | -.1842749 .0977213 -1.89 0.059 -.3758051 .0072552
explose | .2151452 .0663117 3.24 0.001 .0851767 .3451137
stateur | -.0650451 .023552 -2.76 0.006 -.1112061 -.0188841
houshead | .3960399 .0847153 4.67 0.000 .2300009 .5620789
married | .3961194 .0806744 4.91 0.000 .2380005 .5542384
female | .1102564 .0869256 1.27 0.205 -.0601147 .2806275
child | -.0464355 .0815869 -0.57 0.569 -.206343 .113472
ychild | -.1213622 .103309 -1.17 0.240 -.3238441 .0811196
nonwhite | -.6909793 .1217489 -5.68 0.000 -.9296027 -.4523559
age | -.0225342 .0040184 -5.61 0.000 -.0304101 -.0146582
schlt12 | -.1513782 .0968026 -1.56 0.118 -.3411079 .0383515
schgt12 | .1011742 .0834622 1.21 0.225 -.0624088 .2647572
smsa | .212363 .081774 2.60 0.009 .052089 .372637
bluecoll | -.220439 .0862751 -2.56 0.011 -.3895351 -.0513429
mining | -.1721823 .2051663 -0.84 0.401 -.5743008 .2299362
constr | -.0897602 .11034 -0.81 0.416 -.3060225 .1265022
transp | -.1572488 .1563607 -1.01 0.315 -.4637102 .1492126
trade | -.0451107 .1034986 -0.44 0.663 -.2479642 .1577428
fire | .0881685 .1386688 0.64 0.525 -.1836175 .3599544
services | .1682835 .1005405 1.67 0.094 -.0287723 .3653393
pubadmin | .0961407 .3092103 0.31 0.756 -.5099004 .7021817
year85 | .1940199 .0906564 2.14 0.032 .0163366 .3717031
year87 | .3564373 .0959014 3.72 0.000 .1684741 .5444005
.
. * Estimate Weibull without heterogeneity
. stset spell, fail(censor1=1)
failure event: censor1 == 1
obs. time interval: (0, spell]
exit on or before: failure
------------------------------------------------------------------------------
3343 total obs.
0 exclusions
------------------------------------------------------------------------------
3343 obs. remaining, representing
1073 failures in single record/single failure data
20887 total analysis time at risk, at risk from t = 0
earliest observed entry t = 0
last observed exit t = 28
. streg $xlist, nolog nohr dist(weibull) robust
failure _d: censor1 == 1
analysis time _t: spell
Weibull regression -- log relative-hazard form
No. of subjects = 3343
No. of failures = 1073
Time at risk = 20887
Number of obs = 3343
Prob > chi2 = 0.0000
------------------------------------------------------------------------------
| Robust
_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | .4481156 .6381895 0.70 0.483 -.8027127 1.698944
DR | -.4269187 .8086983 -0.53 0.598 -2.011938 1.158101
UI | -1.496066 .2639679 -5.67 0.000 -2.013434 -.9786984
RRUI | 1.015226 .6455611 1.57 0.116 -.2500501 2.280503
DRUI | -.2988417 1.065384 -0.28 0.779 -2.386956 1.789272
LOGWAGE | .3655253 .12212 2.99 0.003 .1261745 .6048761
tenure | -.0011127 .0068716 -0.16 0.871 -.0145809 .0123554
slack | -.2652154 .0803214 -3.30 0.001 -.4226424 -.1077883
abolpos | -.1604227 .1012942 -1.58 0.113 -.3589557 .0381103
explose | .2075085 .0684715 3.03 0.002 .0733068 .3417103
stateur | -.0708745 .0242117 -2.93 0.003 -.1183286 -.0234204
houshead | .3976626 .0887192 4.48 0.000 .2237762 .571549
married | .3786057 .0830317 4.56 0.000 .2158665 .541345
female | .1260829 .0896987 1.41 0.160 -.0497233 .301889
child | -.0336778 .0839956 -0.40 0.688 -.1983061 .1309505
ychild | -.1613066 .108947 -1.48 0.139 -.3748389 .0522256
No. of subjects = 3343
No. of failures = 1073
Time at risk = 20887
Number of obs = 3343
Prob > chi2 = 0.0000
------------------------------------------------------------------------------
| Robust
_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | .7356277 .9058181 0.81 0.417 -1.039743 2.510998
DR | -1.072566 1.149098 -0.93 0.351 -3.324758 1.179625
UI | -2.574752 .3843798 -6.70 0.000 -3.328123 -1.821381
RRUI | 1.733571 .9333928 1.86 0.063 -.0958458 3.562987
DRUI | -.060621 1.537813 -0.04 0.969 -3.07468 2.953438
LOGWAGE | .575656 .1766599 3.26 0.001 .2294089 .9219031
tenure | -.0009848 .0097472 -0.10 0.920 -.0200889 .0181194
slack | -.4416007 .1142976 -3.86 0.000 -.6656199 -.2175814
abolpos | -.2873066 .1465357 -1.96 0.050 -.5745113 -.0001019
explose | .3641943 .0976897 3.73 0.000 .1727259 .5556627
stateur | -.0981133 .0346763 -2.83 0.005 -.1660775 -.030149
houshead | .5924383 .1256739 4.71 0.000 .3461219 .8387546
married | .6083214 .1183487 5.14 0.000 .3763624 .8402805
female | .1788439 .1285074 1.39 0.164 -.0730259 .4307137
child | -.0914227 .121778 -0.75 0.453 -.3301031 .1472578
ychild | -.1805373 .1527477 -1.18 0.237 -.4799173 .1188426
nonwhite | -1.008517 .1725174 -5.85 0.000 -1.346645 -.6703894
age | -.0333776 .0059183 -5.64 0.000 -.0449772 -.0217779
schlt12 | -.2258621 .1439543 -1.57 0.117 -.5080075 .0562832
schgt12 | .1505129 .124469 1.21 0.227 -.0934418 .3944677
smsa | .3009952 .119907 2.51 0.012 .0659819 .5360086
bluecoll | -.3211857 .1253163 -2.56 0.010 -.5668012 -.0755702
mining | -.2319827 .3008491 -0.77 0.441 -.8216361 .3576708
constr | -.1260324 .1633669 -0.77 0.440 -.4462257 .1941609
transp | -.2763858 .225893 -1.22 0.221 -.7191279 .1663562
trade | -.0687616 .1518284 -0.45 0.651 -.3663399 .2288166
fire | .0668973 .2131814 0.31 0.754 -.3509306 .4847252
services | .231914 .1494712 1.55 0.121 -.0610441 .5248721
pubadmin | .0901949 .4579252 0.20 0.844 -.807322 .9877117
year85 | .2780139 .1339053 2.08 0.038 .0155644 .5404634
year87 | .5208783 .1415375 3.68 0.000 .2434699 .7982867
year89 | .7209598 .1655487 4.35 0.000 .3964903 1.045429
midatl | -.0192077 .2222646 -0.09 0.931 -.4548382 .4164228
encen | -.0297055 .2284931 -0.13 0.897 -.4775438 .4181328
wncen | .2460338 .24216 1.02 0.310 -.2285911 .7206586
southatl | .3563643 .1793284 1.99 0.047 .0048872 .7078415
escen | .5461543 .2910193 1.88 0.061 -.024233 1.116542
wscen | .4606814 .2140966 2.15 0.031 .0410598 .880303
mountain | .017581 .2293804 0.08 0.939 -.4319963 .4671584
pacific | .1379886 .3636985 0.38 0.704 -.5748475 .8508247
_cons | -5.303059 1.34133 -3.95 0.000 -7.932017 -2.6741
-------------+----------------------------------------------------------------
/ln_p | .5611667 .0225898 24.84 0.000 .5168915 .6054418
------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section4\mma19p1comprisks.txt
log type: text
opened on: 19 May 2005, 17:52:44
.
. ********** OVERVIEW OF MMA19P1COMPRISKS.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 19.5 pages 658-62
. * Competing Risks Example with a censoring mechanism for each of the three risks
. * (1A) Table 19.2 p.659 Exponential
. * (1B) Table 19.2 p.659 Exponential with IG frailty
. * (2A) Table 19.3 p.659 Weibull
. * (2B) Table 19.3 p.659 Weibull with IG frailty
. * (2C) Table 19.3 p.660 Cox model
. * (2D) Graph the resulting Cox baseline survival and cumulative hazards
. *      Figure 19.1: (combined_bsf.wmf) baseline survival functions
. *      Figure 19.2: (combined_cbh.wmf) baseline cumulative hazards
.
. * To run this program you need data file
. * ema1996.dta
.
. * NOTE: The IG Heterogeneity estimation was unsuccessful for exponential
. *       but successful for Weibull
.
. ********** SETUP **********
.
. set more off
. version 8
. set scheme s1mono /* Used for graphs */
. set matsize 80
.
. ********** DATA DESCRIPTION **********
.
. * The data is from
. * B.P. McCall (1996), "Unemployment Insurance Rules, Joblessness,
. *   and Part-time Work," Econometrica, 64, 647-682.
.
. * There are 3343 observations from the CPS Displaced Worker Surveys
. * of 1986, 1988, 1990 and 1992 on 33 variables including
. * spell = length of spell in number of two-week intervals
nonwhite | 3343 .1390966 .3460991 0 1
-------------+--------------------------------------------------------
age | 3343 35.44331 10.6402 20 61
schlt12 | 3343 .2811846 .4496446 0 1
schgt12 | 3343 .3356267 .4722797 0 1
smsa | 3343 .7241998 .4469835 0 1
bluecoll | 3343 .6036494 .489212 0 1
-------------+--------------------------------------------------------
mining | 3343 .029315 .1687132 0 1
constr | 3343 .1480706 .3552231 0 1
transp | 3343 .0646126 .2458778 0 1
trade | 3343 .1848639 .3882452 0 1
fire | 3343 .0514508 .2209484 0 1
-------------+--------------------------------------------------------
services | 3343 .1699073 .3756075 0 1
pubadmin | 3343 .0095722 .097383 0 1
year85 | 3343 .2677236 .442839 0 1
year87 | 3343 .2174693 .4125862 0 1
year89 | 3343 .1998205 .3999251 0 1
-------------+--------------------------------------------------------
midatl | 3343 .1088842 .3115405 0 1
encen | 3343 .1429853 .3501103 0 1
wncen | 3343 .0643135 .2453472 0 1
southatl | 3343 .2375112 .4256217 0 1
escen | 3343 .0532456 .2245564 0 1
-------------+--------------------------------------------------------
wscen | 3343 .1441819 .3513266 0 1
mountain | 3343 .1079868 .3104102 0 1
pacific | 3343 .0260245 .159232 0 1
RR | 3343 .4544717 .1137918 .066 2.059
DR | 3343 .1094376 .0735274 .002 1.02
-------------+--------------------------------------------------------
UI | 3343 .5527969 .4972791 0 1
RRUI | 3343 .2478687 .2380667 0 2.059
DRUI | 3343 .0602776 .0754261 0 .824
LOGWAGE | 3343 5.692994 .5356591 2.70805 7.600402
.
. ********* COMPETING RISKS FOR UNEMPLOYMENT DURATION **********
.
. * Stata analysis requires using stset to define the dependent variable
. * and the censoring variable if there is one
.
. * For the competing risks model there are three censoring variables
. * CENSOR1 = 1 if re-employed at full-time job
. * CENSOR2 = 1 if re-employed at part-time job
. * CENSOR3 = 1 if re-employed but left job: pt-ft status unknown
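The cause-specific setup described in the comments above can be sketched as follows: for each risk, exits through that risk count as failures and every other spell is treated as censored, which is what re-running stset with a different censoring variable achieves. This Python sketch is illustrative only and not part of the original program; the function `cause_specific_indicator`, the exit codes, and the toy spells are assumptions.

```python
def cause_specific_indicator(exit_codes, risk):
    """Failure indicator for one risk: 1 if the spell ended via `risk`, else 0.

    Spells that ended through any other risk, or that were still in
    progress, are treated as censored for this risk.
    """
    return [1 if code == risk else 0 for code in exit_codes]

# Hypothetical exit codes: 1 = full-time job, 2 = part-time job,
# 3 = re-employed but status unknown, 0 = censored
exit_codes = [1, 0, 2, 1, 3, 0, 1]
spells = [4, 28, 2, 9, 6, 13, 1]

for risk in (1, 2, 3):
    d = cause_specific_indicator(exit_codes, risk)
    # failures for this risk and the crude cause-specific hazard
    print(risk, sum(d), sum(d) / sum(spells))
```

Time at risk is the same for every risk because each subject contributes the full spell regardless of how it ended; only the failure count changes, as in the three sets of estimates that follow.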
.
. * Define $xlist = list of regressors used in subsequent regressions
. global xlist RR DR UI RRUI DRUI LOGWAGE /*
> */ tenure slack abolpos explose stateur houshead married /*
>
>
>
>
.
. *** (1A) EXPONENTIAL WITH NO HETEROGENEITY Table 19.2
.
. stset spell, fail(censor1=1)
failure event: censor1 == 1
obs. time interval: (0, spell]
exit on or before: failure
------------------------------------------------------------------------------
3343 total obs.
0 exclusions
------------------------------------------------------------------------------
3343 obs. remaining, representing
1073 failures in single record/single failure data
20887 total analysis time at risk, at risk from t = 0
earliest observed entry t = 0
last observed exit t = 28
. streg $xlist, nolog nohr robust dist(exponential)
failure _d: censor1 == 1
analysis time _t: spell
Exponential regression -- log relative-hazard form
No. of subjects = 3343
No. of failures = 1073
Time at risk = 20887
Number of obs = 3343
Prob > chi2 = 0.0000
------------------------------------------------------------------------------
| Robust
_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | .4720235 .6005534 0.79 0.432 -.7050396 1.649087
DR | -.5756396 .7624489 -0.75 0.450 -2.070012 .9187327
UI | -1.424561 .2493917 -5.71 0.000 -1.91336 -.9357622
RRUI | .9655904 .6118408 1.58 0.115 -.2335956 2.164776
DRUI | -.1990635 1.019118 -0.20 0.845 -2.196498 1.798371
LOGWAGE | .3508005 .115598 3.03 0.002 .1242327 .5773684
tenure | -.0001462 .0064637 -0.02 0.982 -.0128147 .0125224
slack | -.2593666 .0759363 -3.42 0.001 -.4081991 -.1105342
abolpos | -.1550897 .0953306 -1.63 0.104 -.3419342 .0317549
explose | .198458 .0648354 3.06 0.002 .071383 .3255331
No. of subjects = 3343
No. of failures = 339
Time at risk = 20887
Number of obs = 3343
Prob > chi2 = 0.0000
------------------------------------------------------------------------------
| Robust
_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | -.0928628 .9761428 -0.10 0.924 -2.006068 1.820342
DR | -.9600127 1.246692 -0.77 0.441 -3.403483 1.483458
UI | -1.047747 .5236826 -2.00 0.045 -2.074146 -.021348
RRUI | -.6698307 1.191869 -0.56 0.574 -3.005851 1.666189
DRUI | 1.987208 1.726509 1.15 0.250 -1.396688 5.371105
LOGWAGE | -.2577715 .1793075 -1.44 0.151 -.6092077 .0936646
tenure | .0053684 .0125538 0.43 0.669 -.0192366 .0299734
slack | -.2636908 .1311029 -2.01 0.044 -.5206477 -.0067339
abolpos | -.5626836 .202701 -2.78 0.006 -.9599703 -.1653969
explose | .0490271 .1130116 0.43 0.664 -.1724715 .2705258
stateur | -.1032439 .0406788 -2.54 0.011 -.182973 -.0235148
houshead | -.073544 .1343412 -0.55 0.584 -.3368479 .18976
married | -.0618813 .1339552 -0.46 0.644 -.3244287 .2006661
female | .4531912 .1384047 3.27 0.001 .181923 .7244594
child | -.2164986 .1452571 -1.49 0.136 -.5011973 .0682002
ychild | .149031 .1815684 0.82 0.412 -.2068365 .5048986
nonwhite | -.4563527 .1820135 -2.51 0.012 -.8130927 -.0996127
age | -.001781 .0064207 -0.28 0.781 -.0143653 .0108033
schlt12 | -.1803101 .1661528 -1.09 0.278 -.5059636 .1453433
schgt12 | -.0534463 .1462829 -0.37 0.715 -.3401555 .2332629
smsa | .1295376 .1384588 0.94 0.349 -.1418367 .400912
bluecoll | .0088207 .1510547 0.06 0.953 -.2872411 .3048825
mining | -.0141252 .4078632 -0.03 0.972 -.8135225 .785272
constr | .1867498 .1896106 0.98 0.325 -.1848802 .5583799
transp | -.402533 .2898061 -1.39 0.165 -.9705426 .1654766
trade | .1106678 .1735195 0.64 0.524 -.2294241 .4507598
fire | -.3396026 .3006096 -1.13 0.259 -.9287865 .2495813
services | .1619867 .1705571 0.95 0.342 -.172299 .4962724
pubadmin | .7445446 .5413463 1.38 0.169 -.3164746 1.805564
year85 | -.0548375 .149323 -0.37 0.713 -.3475052 .2378301
year87 | -.12113 .1616797 -0.75 0.454 -.4380164 .1957563
year89 | .1244437 .1950397 0.64 0.523 -.257827 .5067144
midatl | -.3969537 .2577568 -1.54 0.124 -.9021477 .1082403
No. of subjects = 3343
No. of failures = 574
Time at risk = 20887
Number of obs = 3343
Prob > chi2 = 0.0000
------------------------------------------------------------------------------
| Robust
_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | -.6011551 .724665 -0.83 0.407 -2.021472 .8191621
DR | 1.121525 .9012528 1.24 0.213 -.6448975 2.887948
UI | -.9672682 .4486302 -2.16 0.031 -1.846567 -.0879691
RRUI | -.4326869 1.014413 -0.43 0.670 -2.4209 1.555526
DRUI | 2.102012 1.302564 1.61 0.107 -.450967 4.654991
             |      0.762        1.247        0.901
          UI |     -1.425       -1.048       -0.967
             |      0.249        0.524        0.449
        RRUI |      0.966       -0.670       -0.433
             |      0.612        1.192        1.014
        DRUI |     -0.199        1.987        2.102
             |      1.019        1.727        1.303
     LOGWAGE |      0.351       -0.258        0.003
             |      0.116        0.179        0.145
      tenure |     -0.000        0.005       -0.048
             |      0.006        0.013        0.012
-------------+--------------------------------------
           N |   3343.000     3343.000     3343.000
          ll |  -2700.690    -1250.545    -1742.396
----------------------------------------------------
legend: b/se
.
. *** (1B) EXPONENTIAL WITH IG HETEROGENEITY Table 19.2
.
. /* Did not work even though Weibull with IG heterogeneity did
>
> stset spell, fail(censor1=1)
> streg $xlist, nohr robust dist(exponential) frailty(invgauss)
> estimates store bexpigr1
>
> stset spell, fail(censor2=1)
> streg $xlist, nolog nohr robust dist(exponential) frailty(invgauss)
> estimates store bexpigr2
>
> stset spell, fail(censor3=1)
> streg $xlist, nolog nohr robust dist(exponential) frailty(invgauss)
> estimates store bexpigr3
>
> * Table 19.2 (page 658) first three columns
> estimates table bexpigr1 bexpigr2 bexpigr3, b(%10.3f) se(%10.3f) stats(N ll) /*
> */ keep(RR DR UI RRUI DRUI LOGWAGE tenure)
>
> */
.
. *** (2A) WEIBULL WITH NO HETEROGENEITY Table 19.3
.
. stset spell, fail(censor1=1)
failure event: censor1 == 1
obs. time interval: (0, spell]
exit on or before: failure
------------------------------------------------------------------------------
3343 total obs.
0 exclusions
No. of subjects = 3343
No. of failures = 1073
Time at risk = 20887
Number of obs = 3343
Prob > chi2 = 0.0000
------------------------------------------------------------------------------
| Robust
_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | .4481156 .6381895 0.70 0.483 -.8027127 1.698944
DR | -.4269187 .8086983 -0.53 0.598 -2.011938 1.158101
UI | -1.496066 .2639679 -5.67 0.000 -2.013434 -.9786984
RRUI | 1.015226 .6455611 1.57 0.116 -.2500501 2.280503
DRUI | -.2988417 1.065384 -0.28 0.779 -2.386956 1.789272
LOGWAGE | .3655253 .12212 2.99 0.003 .1261745 .6048761
tenure | -.0011127 .0068716 -0.16 0.871 -.0145809 .0123554
slack | -.2652154 .0803214 -3.30 0.001 -.4226424 -.1077883
abolpos | -.1604227 .1012942 -1.58 0.113 -.3589557 .0381103
explose | .2075085 .0684715 3.03 0.002 .0733068 .3417103
stateur | -.0708745 .0242117 -2.93 0.003 -.1183286 -.0234204
houshead | .3976626 .0887192 4.48 0.000 .2237762 .571549
married | .3786057 .0830317 4.56 0.000 .2158665 .541345
female | .1260829 .0896987 1.41 0.160 -.0497233 .301889
child | -.0336778 .0839956 -0.40 0.688 -.1983061 .1309505
ychild | -.1613066 .108947 -1.48 0.139 -.3748389 .0522256
nonwhite | -.7025504 .12426 -5.65 0.000 -.9460956 -.4590052
age | -.0235823 .0041922 -5.63 0.000 -.0317989 -.0153658
schlt12 | -.1226759 .1022762 -1.20 0.230 -.3231335 .0777816
schgt12 | .1162848 .0880692 1.32 0.187 -.0563278 .2888973
smsa | .1999567 .0841129 2.38 0.017 .0350985 .3648149
bluecoll | -.1994925 .0899354 -2.22 0.027 -.3757626 -.0232223
mining | -.1015676 .2036644 -0.50 0.618 -.5007425 .2976073
constr | -.0253737 .1135609 -0.22 0.823 -.247949 .1972016
transp | -.1981522 .1672141 -1.19 0.236 -.5258858 .1295814
trade | -.0311361 .1079502 -0.29 0.773 -.2427146 .1804423
fire | .1262153 .1492527 0.85 0.398 -.1663145 .4187452
No. of subjects = 3343
No. of failures = 339
Time at risk = 20887
Number of obs = 3343
Wald chi2(40) = 222.95
Prob > chi2 = 0.0000
------------------------------------------------------------------------------
| Robust
_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | -.0855974 .9920715 -0.09 0.931 -2.030022 1.858827
DR | -.9387836 1.279111 -0.73 0.463 -3.445794 1.568227
UI | -1.110175 .5267037 -2.11 0.035 -2.142496 -.0778551
RRUI | -.6171912 1.203735 -0.51 0.608 -2.976469 1.742086
DRUI | 1.973269 1.756599 1.12 0.261 -1.469601 5.41614
LOGWAGE | -.2437885 .1833224 -1.33 0.184 -.6030938 .1155168
tenure | .0050643 .0127387 0.40 0.691 -.0199031 .0300317
slack | -.2689689 .133176 -2.02 0.043 -.529989 -.0079487
abolpos | -.5721689 .2059292 -2.78 0.005 -.9757826 -.1685551
explose | .0555267 .1147555 0.48 0.628 -.16939 .2804433
stateur | -.1087083 .0413647 -2.63 0.009 -.1897816 -.027635
houshead | -.0679894 .13661 -0.50 0.619 -.3357401 .1997613
married | -.060856 .1362403 -0.45 0.655 -.327882 .20617
female | .4583892 .1408831 3.25 0.001 .1822634 .734515
child | -.2228982 .147376 -1.51 0.130 -.5117499 .0659535
ychild | .1463598 .1844362 0.79 0.427 -.2151284 .507848
nonwhite | -.485664 .186033 -2.61 0.009 -.8502819 -.121046
age | -.0027009 .0065569 -0.41 0.680 -.0155521 .0101503
schlt12 | -.1837633 .1684487 -1.09 0.275 -.5139167 .1463901
schgt12 | -.0488958 .1485385 -0.33 0.742 -.340026 .2422343
smsa | .1380042 .1410747 0.98 0.328 -.1384971 .4145055
bluecoll | .0132584 .1537386 0.09 0.931 -.2880637 .3145805
mining | -.0138734 .4110202 -0.03 0.973 -.8194583 .7917115
constr | .1973771 .1920481 1.03 0.304 -.1790303 .5737845
transp | -.4116241 .2927848 -1.41 0.160 -.9854717 .1622234
trade | .1125741 .1765277 0.64 0.524 -.2334139 .4585621
fire | -.3378747 .3046641 -1.11 0.267 -.9350054 .2592561
services | .1700335 .1729565 0.98 0.326 -.1689551 .5090221
pubadmin | .7553679 .5487635 1.38 0.169 -.3201889 1.830925
year85 | -.0501695 .1515048 -0.33 0.741 -.3471135 .2467745
year87 | -.1116858 .1645254 -0.68 0.497 -.4341497 .2107781
year89 | .1344555 .1987084 0.68 0.499 -.2550059 .5239168
midatl | -.4039691 .2606153 -1.55 0.121 -.9147658 .1068276
encen | -.5105877 .2608364 -1.96 0.050 -1.021818 .0006423
wncen | -.0579723 .2607792 -0.22 0.824 -.5690902 .4531456
southatl | -.2682241 .1972983 -1.36 0.174 -.6549216 .1184733
escen | .079807 .3146812 0.25 0.800 -.5369568 .6965709
wscen | -.0854421 .2368638 -0.36 0.718 -.5496865 .3788024
mountain | .2441762 .2300886 1.06 0.289 -.2067892 .6951416
pacific | -.1999107 .4003467 -0.50 0.618 -.9845758 .5847544
_cons | -1.055211 1.353275 -0.78 0.436 -3.707582 1.597159
-------------+----------------------------------------------------------------
/ln_p | .0815649 .0308379 2.64 0.008 .0211236 .1420061
-------------+----------------------------------------------------------------
p | 1.084984 .0334587 1.021348 1.152584
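The Weibull output immediately above reports both /ln_p and p. The following Python sketch is not part of the original program; it only cross-checks that the reported p row is consistent with p = exp(ln_p), with a delta-method standard error and transformed interval endpoints. All numbers are copied from the log, and the variable names are illustrative.

```python
import math

# Values copied from the /ln_p row of the Weibull output above; nothing is re-estimated.
ln_p, se_ln_p = 0.0815649, 0.0308379
ci_ln_p = (0.0211236, 0.1420061)

p = math.exp(ln_p)                          # Weibull shape parameter
se_p = p * se_ln_p                          # delta-method standard error of p
ci_p = tuple(math.exp(b) for b in ci_ln_p)  # interval endpoints transformed from /ln_p

print(f"p = {p:.6f}, se = {se_p:.7f}, CI = ({ci_p[0]:.6f}, {ci_p[1]:.6f})")
```

Because the interval for /ln_p excludes zero, the interval for p excludes one, which is what a test of constant hazard (p = 1) against the Weibull alternative would look at.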
No. of subjects =         3343                     Number of obs   =      3343
No. of failures =          574
Time at risk    =        20887
                                                   Prob > chi2     =    0.0000
------------------------------------------------------------------------------
             |               Robust
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | -.6946399 .762754 -0.91 0.362 -2.18961 .8003305
DR | 1.361414 .9691375 1.40 0.160 -.5380611 3.260888
UI | -1.098453 .4595297 -2.39 0.017 -1.999115 -.1977918
RRUI | -.3055217 1.046769 -0.29 0.770 -2.357151 1.746107
DRUI | 1.990913 1.37004 1.45 0.146 -.6943156 4.676141
LOGWAGE | .0401096 .1526549 0.26 0.793 -.2590886 .3393078
tenure | -.0495153 .0126559 -3.91 0.000 -.0743204 -.0247103
slack | -.473113 .1025776 -4.61 0.000 -.6741614 -.2720647
abolpos | -.2910168 .1465355 -1.99 0.047 -.5782212 -.0038124
explose | .0315602 .0906338 0.35 0.728 -.1460787 .2091991
stateur | -.1199252 .0337488 -3.55 0.000 -.1860717 -.0537787
houshead | .5592843 .1107798 5.05 0.000 .3421598 .7764087
             |      0.264        0.527        0.460
        RRUI |      1.015       -0.617       -0.306
             |      0.646        1.204        1.047
        DRUI |     -0.299        1.973        1.991
             |      1.065        1.757        1.370
     LOGWAGE |      0.366       -0.244        0.040
             |      0.122        0.183        0.153
      tenure |     -0.001        0.005       -0.050
             |      0.007        0.013        0.013
-------------+--------------------------------------
           N |   3343.000     3343.000     3343.000
          ll |  -2687.600    -1248.686    -1729.836
----------------------------------------------------
                                        legend: b/se
.
. *** (2B) WEIBULL WITH IG HETEROGENEITY Table 19.3
.
. stset spell, fail(censor1=1)
failure event: censor1 == 1
obs. time interval: (0, spell]
exit on or before: failure
------------------------------------------------------------------------------
     3343  total obs.
        0  exclusions
------------------------------------------------------------------------------
     3343  obs. remaining, representing
     1073  failures in single record/single failure data
    20887  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =        28
. streg $xlist, nohr robust dist(weibull) frailty(invgauss)
failure _d: censor1 == 1
analysis time _t: spell
Fitting weibull model:
Fitting constant-only model:
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:
Iteration 6:
No. of subjects =         3343                     Number of obs   =      3343
No. of failures =         1073
Time at risk    =        20887
                                                   Prob > chi2     =    0.0000
------------------------------------------------------------------------------
             |               Robust
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | .7356277 .9058181 0.81 0.417 -1.039743 2.510998
DR | -1.072566 1.149098 -0.93 0.351 -3.324758 1.179625
UI | -2.574752 .3843798 -6.70 0.000 -3.328123 -1.821381
RRUI | 1.733571 .9333928 1.86 0.063 -.0958458 3.562987
DRUI | -.060621 1.537813 -0.04 0.969 -3.07468 2.953438
LOGWAGE | .575656 .1766599 3.26 0.001 .2294089 .9219031
tenure | -.0009848 .0097472 -0.10 0.920 -.0200889 .0181194
slack | -.4416007 .1142976 -3.86 0.000 -.6656199 -.2175814
abolpos | -.2873066 .1465357 -1.96 0.050 -.5745113 -.0001019
explose | .3641943 .0976897 3.73 0.000 .1727259 .5556627
stateur | -.0981133 .0346763 -2.83 0.005 -.1660775 -.030149
houshead | .5924383 .1256739 4.71 0.000 .3461219 .8387546
married | .6083214 .1183487 5.14 0.000 .3763624 .8402805
female | .1788439 .1285074 1.39 0.164 -.0730259 .4307137
child | -.0914227 .121778 -0.75 0.453 -.3301031 .1472578
ychild | -.1805373 .1527477 -1.18 0.237 -.4799173 .1188426
nonwhite | -1.008517 .1725174 -5.85 0.000 -1.346645 -.6703894
age | -.0333776 .0059183 -5.64 0.000 -.0449772 -.0217779
schlt12 | -.2258621 .1439543 -1.57 0.117 -.5080075 .0562832
schgt12 | .1505129 .124469 1.21 0.227 -.0934418 .3944677
smsa | .3009952 .119907 2.51 0.012 .0659819 .5360086
bluecoll | -.3211857 .1253163 -2.56 0.010 -.5668012 -.0755702
mining | -.2319827 .3008491 -0.77 0.441 -.8216361 .3576708
constr | -.1260324 .1633669 -0.77 0.440 -.4462257 .1941609
transp | -.2763858 .225893 -1.22 0.221 -.7191279 .1663562
trade | -.0687616 .1518284 -0.45 0.651 -.3663399 .2288166
fire | .0668973 .2131814 0.31 0.754 -.3509306 .4847252
services | .231914 .1494712 1.55 0.121 -.0610441 .5248721
pubadmin | .0901949 .4579252 0.20 0.844 -.807322 .9877117
No. of subjects =         3343                     Number of obs   =      3343
No. of failures =          339
Time at risk    =        20887
                                                   Prob > chi2     =    0.0000
------------------------------------------------------------------------------
             |               Robust
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | -.3802006 1.452095 -0.26 0.793 -3.226255 2.465854
DR | -1.689504 1.779553 -0.95 0.342 -5.177363 1.798355
UI | -2.063963 .7469659 -2.76 0.006 -3.527989 -.5999369
RRUI | -.3019038 1.702153 -0.18 0.859 -3.638063 3.034255
DRUI | 3.263067 2.469908 1.32 0.186 -1.577863 8.103998
LOGWAGE | -.4954862 .2614747 -1.89 0.058 -1.007967 .0169948
tenure | .0174014 .0192239 0.91 0.365 -.0202768 .0550795
slack | -.3889861 .1911789 -2.03 0.042 -.7636898 -.0142824
abolpos | -.8027208 .2877528 -2.79 0.005 -1.366706 -.2387356
explose | .1187808 .1663987 0.71 0.475 -.2073546 .4449162
stateur | -.1753726 .059272 -2.96 0.003 -.2915437 -.0592015
houshead | -.0832153 .1944376 -0.43 0.669 -.464306 .2978754
married | -.0092249 .1945187 -0.05 0.962 -.3904747 .3720248
female | .6284921 .2064768 3.04 0.002 .223805 1.033179
child | -.389325 .2127697 -1.83 0.067 -.806346 .0276959
ychild | .3144939 .2663886 1.18 0.238 -.2076182 .836606
nonwhite | -.6691885 .2633831 -2.54 0.011 -1.18541 -.1529671
age | -.0034533 .0093696 -0.37 0.712 -.0218174 .0149108
schlt12 | -.3242365 .2380109 -1.36 0.173 -.7907293 .1422562
schgt12 | -.0745655 .2138285 -0.35 0.727 -.4936618 .3445307
smsa | .2107394 .2012744 1.05 0.295 -.1837512 .60523
bluecoll | -.0065426 .2175612 -0.03 0.976 -.4329548 .4198696
mining | .1293103 .6093175 0.21 0.832 -1.06493 1.323551
constr | .2870954 .2728176 1.05 0.293 -.2476172 .8218081
transp | -.6470251 .4118414 -1.57 0.116 -1.454219 .1601692
trade | .1901489 .2529975 0.75 0.452 -.3057172 .6860149
fire | -.4680763 .4488502 -1.04 0.297 -1.347807 .411654
services | .2462185 .2531429 0.97 0.331 -.2499325 .7423696
pubadmin | 1.351206 .7621665 1.77 0.076 -.1426127 2.845025
year85 | -.1501166 .2195046 -0.68 0.494 -.5803377 .2801044
year87 | -.2400145 .236954 -1.01 0.311 -.7044358 .2244069
year89 | .1828811 .2831188 0.65 0.518 -.3720216 .7377838
midatl | -.4074373 .3806192 -1.07 0.284 -1.153437 .3385627
encen | -.6525035 .381508 -1.71 0.087 -1.400245 .0952385
wncen | -.1300751 .3835973 -0.34 0.735 -.8819119 .6217617
southatl | -.3491396 .2954776 -1.18 0.237 -.928265 .2299859
escen | .2960895 .4558667 0.65 0.516 -.5973927 1.189572
wscen | -.0903554 .3527441 -0.26 0.798 -.7817212 .6010104
mountain | .3721587 .3457717 1.08 0.282 -.3055413 1.049859
pacific | -.1996218 .6042626 -0.33 0.741 -1.383955 .9847112
_cons | 1.157635 1.957298 0.59 0.554 -2.678599 4.993869
-------------+----------------------------------------------------------------
/ln_p | .5004283 .0361284 13.85 0.000 .429618 .5712386
/ln_the | 2.896807 .1749249 16.56 0.000 2.55396 3.239653
No. of subjects =         3343                     Number of obs   =      3343
No. of failures =          574
Time at risk    =        20887
                                                   Prob > chi2     =    0.0000
------------------------------------------------------------------------------
             |               Robust
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | -.4326716 1.111223 -0.39 0.697 -2.610628 1.745285
DR | 1.166629 1.377826 0.85 0.397 -1.533861 3.867119
UI | -1.761667 .623017 -2.83 0.005 -2.982758 -.5405758
RRUI | -.5160276 1.418361 -0.36 0.716 -3.295964 2.263909
DRUI | 3.668779 1.93489 1.90 0.058 -.1235355 7.461093
LOGWAGE | -.0069584 .2162461 -0.03 0.974 -.4307929 .4168762
tenure | -.0677151 .0174959 -3.87 0.000 -.1020065 -.0334237
slack | -.7093182 .145145 -4.89 0.000 -.9937971 -.4248392
-------------+--------------------------------------
          RR |      0.736       -0.380       -0.433
             |      0.906        1.452        1.111
          DR |     -1.073       -1.690        1.167
             |      1.149        1.780        1.378
          UI |     -2.575       -2.064       -1.762
             |      0.384        0.747        0.623
        RRUI |      1.734       -0.302       -0.516
             |      0.933        1.702        1.418
        DRUI |     -0.061        3.263        3.669
             |      1.538        2.470        1.935
     LOGWAGE |      0.576       -0.495       -0.007
             |      0.177        0.261        0.216
      tenure |     -0.001        0.017       -0.068
             |      0.010        0.019        0.017
-------------+--------------------------------------
           N |   3343.000     3343.000     3343.000
          ll |  -2616.322    -1230.164    -1696.846
----------------------------------------------------
                                        legend: b/se
.
. *** (2C) ESTIMATE COX MODEL SPECIFICATION OF COMPETING RISKS
.
. stset spell, fail(censor1=1)
failure event: censor1 == 1
obs. time interval: (0, spell]
exit on or before: failure
------------------------------------------------------------------------------
     3343  total obs.
        0  exclusions
------------------------------------------------------------------------------
     3343  obs. remaining, representing
     1073  failures in single record/single failure data
    20887  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =        28
. stcox $xlist, nolog nohr robust basesurv(survrisk1) basechazard(chrisk1)
failure _d: censor1 == 1
analysis time _t: spell
Cox regression -- Breslow method for ties
No. of subjects =         3343                     Number of obs   =      3343
No. of failures =         1073
Time at risk    =        20887
                                                   Wald chi2(40)   =    540.98
                                                   Prob > chi2     =    0.0000
------------------------------------------------------------------------------
             |               Robust
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | .5222796 .5711698 0.91 0.361 -.5971926 1.641752
DR | -.752507 .72175 -1.04 0.297 -2.167111 .6620971
UI | -1.317719 .2372893 -5.55 0.000 -1.782798 -.8526409
RRUI | .8822462 .582115 1.52 0.130 -.2586783 2.023171
DRUI | -.0951357 .977774 -0.10 0.922 -2.011538 1.821266
LOGWAGE | .3352639 .1106483 3.03 0.002 .1183972 .5521306
tenure | .0008278 .0061286 0.14 0.893 -.0111841 .0128396
slack | -.247863 .0721173 -3.44 0.001 -.3892103 -.1065158
abolpos | -.1511638 .0905035 -1.67 0.095 -.3285475 .0262198
explose | .1865068 .0615742 3.03 0.002 .0658236 .30719
stateur | -.0590475 .022085 -2.67 0.008 -.1023334 -.0157616
houshead | .3601866 .0794827 4.53 0.000 .2044035 .5159698
married | .358819 .0746355 4.81 0.000 .2125362 .5051019
female | .1002758 .0813277 1.23 0.218 -.0591236 .2596753
child | -.0396054 .0755365 -0.52 0.600 -.1876542 .1084435
ychild | -.1276638 .0967856 -1.32 0.187 -.3173602 .0620325
nonwhite | -.6394475 .1151332 -5.55 0.000 -.8651043 -.4137906
age | -.0204623 .0037593 -5.44 0.000 -.0278305 -.0130942
schlt12 | -.1220585 .0920073 -1.33 0.185 -.3023895 .0582726
schgt12 | .1104817 .0783542 1.41 0.159 -.0430897 .2640531
smsa | .1864841 .0766075 2.43 0.015 .0363361 .3366321
bluecoll | -.2108023 .080867 -2.61 0.009 -.3692986 -.052306
mining | -.1238251 .1906352 -0.65 0.516 -.4974632 .249813
constr | -.054455 .1029488 -0.53 0.597 -.256231 .1473209
transp | -.1551657 .1466515 -1.06 0.290 -.4425973 .1322659
trade | -.0383252 .0968106 -0.40 0.692 -.2280706 .1514201
fire | .1097585 .1300779 0.84 0.399 -.1451895 .3647065
services | .1666262 .0939507 1.77 0.076 -.0175138 .3507662
pubadmin | .1022002 .2829817 0.36 0.718 -.4524336 .6568341
year85 | .204162 .084908 2.40 0.016 .0377454 .3705786
year87 | .3384229 .0899115 3.76 0.000 .1621997 .5146462
year89 | .4486559 .104937 4.28 0.000 .2429832 .6543286
midatl | .0342238 .140515 0.24 0.808 -.2411805 .3096282
encen | .0174597 .1438862 0.12 0.903 -.2645521 .2994716
wncen | .1650967 .1532559 1.08 0.281 -.1352795 .4654728
southatl | .2518023 .1127138 2.23 0.025 .0308874 .4727172
escen | .3450422 .1839818 1.88 0.061 -.0155554 .7056398
wscen | .3316752 .1359801 2.44 0.015 .0651591 .5981914
mountain | .009484 .1468626 0.06 0.949 -.2783613 .2973293
pacific | .0720292 .2263339 0.32 0.750 -.3715771 .5156355
------------------------------------------------------------------------------
. estimates store bcoxrisk1
.
No. of subjects =         3343                     Number of obs   =      3343
No. of failures =          339
Time at risk    =        20887
Log pseudo-likelihood =                            Prob > chi2     =    0.0000
------------------------------------------------------------------------------
             |               Robust
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | -.0719673 .9513101 -0.08 0.940 -1.936501 1.792566
DR | -1.0236 1.193087 -0.86 0.391 -3.362007 1.314807
UI | -.906022 .5109396 -1.77 0.076 -1.907445 .0954013
RRUI | -.7818457 1.166182 -0.67 0.503 -3.06752 1.503829
DRUI | 2.031968 1.671862 1.22 0.224 -1.244821 5.308756
LOGWAGE | -.2800345 .1736454 -1.61 0.107 -.6203732 .0603043
tenure | .0059934 .0122664 0.49 0.625 -.0180483 .0300352
slack | -.2476685 .12775 -1.94 0.053 -.498054 .0027169
abolpos | -.5434923 .1976775 -2.75 0.006 -.9309331 -.1560516
explose | .0334802 .1101886 0.30 0.761 -.1824856 .2494459
stateur | -.0923228 .0393339 -2.35 0.019 -.1694157 -.0152299
houshead | -.0864111 .1303336 -0.66 0.507 -.3418602 .1690379
married | -.065464 .1298376 -0.50 0.614 -.3199409 .189013
female | .4386603 .1340263 3.27 0.001 .1759735 .7013471
child | -.2049337 .1413612 -1.45 0.147 -.4819966 .0721293
ychild | .1556684 .1766059 0.88 0.378 -.1904727 .5018095
nonwhite | -.3956483 .1761206 -2.25 0.025 -.7408382 -.0504583
age | .0001207 .0062519 0.02 0.985 -.0121327 .0123741
No. of subjects =         3343                     Number of obs   =      3343
No. of failures =          574
Time at risk    =        20887
                                                   Prob > chi2     =    0.0000
------------------------------------------------------------------------------
             |               Robust
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | -.4692082 .7157644 -0.66 0.512 -1.872081 .9336643
DR | .8759221 .8786992 1.00 0.319 -.8462967 2.598141
UI | -.9051384 .4449384 -2.03 0.042 -1.777202 -.0330753
RRUI | -.5392752 1.002388 -0.54 0.591 -2.503919 1.425369
DRUI | 2.293752 1.274021 1.80 0.072 -.2032836 4.790787
LOGWAGE | -.0140883 .1415912 -0.10 0.921 -.291602 .2634253
tenure | -.0465013 .0118142 -3.94 0.000 -.0696567 -.0233458
slack | -.4587556 .0952092 -4.82 0.000 -.6453621 -.2721491
abolpos | -.2743895 .136703 -2.01 0.045 -.5423223 -.0064566
explose | .0199625 .0843281 0.24 0.813 -.1453176 .1852426
stateur | -.1013309 .0311307 -3.26 0.001 -.1623459 -.0403159
houshead | .5154239 .1031203 5.00 0.000 .3133117 .717536
married | .0280002 .1037338 0.27 0.787 -.1753143 .2313148
female | .2477194 .1071841 2.31 0.021 .0376425 .4577962
child | -.1477253 .1086376 -1.36 0.174 -.3606511 .0652005
ychild | -.0702224 .1341067 -0.52 0.601 -.3330667 .1926219
nonwhite | -.4472066 .1401892 -3.19 0.001 -.7219723 -.1724409
age | -.0227849 .0053188 -4.28 0.000 -.0332096 -.0123602
schlt12 | -.1050265 .1191449 -0.88 0.378 -.3385462 .1284931
schgt12 | .0912594 .1057371 0.86 0.388 -.1159815 .2985004
smsa | .0078536 .0994133 0.08 0.937 -.1869928 .2027
bluecoll | .2916892 .1085873 2.69 0.007 .0788619 .5045165
mining | .2392902 .2514416 0.95 0.341 -.2535263 .7321067
constr | .0659352 .1393882 0.47 0.636 -.2072606 .339131
transp | -.0724276 .1845329 -0.39 0.695 -.4341054 .2892502
trade | .0824395 .1260009 0.65 0.513 -.1645178 .3293967
fire | -.3901171 .2648329 -1.47 0.141 -.90918 .1289458
services | .0007351 .1296195 0.01 0.995 -.2533144 .2547847
pubadmin | -1.749927 1.038715 -1.68 0.092 -3.785771 .2859182
year85 | .2810465 .1124259 2.50 0.012 .0606957 .5013973
year87 | .4139684 .117016 3.54 0.000 .1846212 .6433155
year89 | -.1485614 .1590621 -0.93 0.350 -.4603173 .1631946
midatl | -.5271828 .2165005 -2.44 0.015 -.9515159 -.1028497
encen | -.063171 .1962513 -0.32 0.748 -.4478166 .3214745
wncen | .134275 .2051501 0.65 0.513 -.2678118 .5363617
southatl | .1522905 .1610446 0.95 0.344 -.1633512 .4679321
escen | -.5030762 .3118938 -1.61 0.107 -1.114377 .1082245
wscen | .0116807 .1858946 0.06 0.950 -.352666 .3760273
mountain | .2043736 .1827277 1.12 0.263 -.1537662 .5625134
pacific | .4327009 .2661013 1.63 0.104 -.088848 .9542498
------------------------------------------------------------------------------
.
. * Figure 19.2 (page 659) - Plot the three baseline cumulative hazards
. sort _t
. graph twoway (scatter chrisk1 _t, c(J) msymbol(i) msize(small) clstyle(p1)) /*
> */ (scatter chrisk2 _t, c(J) msymbol(i) msize(small) clstyle(p2)) /*
> */ (scatter chrisk3 _t, c(J) msymbol(i) msize(small) clstyle(p3)), /*
> */ scale (1.2) plotregion(style(none)) /*
> */ title("Baseline Cumulative Hazard Functions") /*
> */ xtitle("Unemployment Duration in 2-week intervals", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Baseline Cumulative Hazard", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend(pos(11) ring(0) col(1)) legend(size(small)) /*
> */ legend( label(1 "Risk 1 (full-time job)") label(2 "Risk 2 (part-time job)") label(3 "Risk 3 (
> unknown job)"))
. graph export combined_cbh.wmf, replace
(file c:\Imbook\bwebpage\Section4\combined_cbh.wmf written in Windows Metafile format)
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section4\mma19p1comprisks.txt
log type: text
closed on: 19 May 2005, 17:53:08
------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section4\mma20p1count.txt
log type: text
opened on: 20 May 2005, 08:41:33
.
. ********* OVERVIEW OF MMA20P1COUNT.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 20.3 pages 671-4 and 20.7 page 690
. * Count data regression example
. * It provides
. * (1) Frequency distribution for count (Table 20.3)
. * (2) Data summary (Table 20.4)
. * (3) Poisson regression with various standard errors (Table 20.5)
. * (4) Negative binomial regression with various standard errors (Table 20.5)
.
. * To use this program you need health expenditure data in Stata data set
. * randdata.dta
.
. ********** SETUP **********
.
. set more off
. version 8.0
. set scheme s1mono /* Used for graphs */
.
. ********** DATA DESCRIPTION **********
.
. * Essentially same data as in P. Deb and P.K. Trivedi (2002)
. * "The Structure of Demand for Medical Care: Latent Class versus
. * Two-Part Models", Journal of Health Economics, 21, 601-625
. * except that paper used different outcome (counts rather than $)
.
. * Each observation is for an individual over a year.
. * Individuals may appear in up to five years.
. * All available sample is used except only fee for service plans included.
. * In analysis here only year 2 is used so panel complications are avoided.
. * Clustering of individuals within household is ignored here.
.
. * Dependent variable is
. *     MED       med       Annual medical expenditures in constant dollars
. *                         excluding dental and outpatient mental
. *     LNMED     lnmeddol  Ln(Medical expenditures) given meddol > 0
. *                         Missing otherwise
. *     DMED      binexp    1 if medical expenditures > 0
.
. * Regressors are
. * - Health insurance measures
. *     LC        logc      log(coinsrate+1) where coinsurance rate is 0 to 100
. *     IDP       idp       1 if individual deductible plan
. *     LPI       lpi       log(annual participation incentive payment) or 0 if no payment
. *     FMDE      fmde      log(max(medical deductible expenditure)) if IDP=1 and MDE>1 or 0 otherwise
. * - Health status measures
. *     NDISEASE  disea     number of chronic diseases
. *     PHYSLIM   physlm    1 if physical limitation
. *     HLTHG     hlthg     1 if good health
. *     HLTHF     hlthf     1 if fair health
. *     HLTHP     hlthp     1 if poor health (omitted is excellent)
. * - Socioeconomic characteristics
. *     LINC      linc      log of annual family income (in $)
. *     LFAM      lfam      log of family size
. *     EDUCDEC   educdec   years of schooling of decision maker
. *     AGE       xage      exact age
. *     BLACK     black     1 if black
. *     FEMALE    female    1 if female
. *     CHILD     child     1 if child
. *     FEMCHILD  fchild    1 if female child
.
. * If panel data used then clustering is on
. *     zper      person id
.
. ********** READ DATA, SELECT AND TRANSFORM **********
.
. use randdata.dta, clear
. sum
    Variable |     Obs        Mean  Std. Dev.        Min        Max
-------------+-----------------------------------------------------
        plan |   20190    11.17553   3.976751          1         19
        site |   20190    3.298811    1.80382          1          6
       coins |   20190     26.3056   36.40386          0        100
    tookphys |   20190    .5974245   .4904288          0          1
        year |   20190    2.420109   1.217141          1          5
-------------+-----------------------------------------------------
        zper |   20190    357965.5   180868.1     125024     632167
       black |   20190    .1814983   .3827071          0          1
      income |   20190    8037.409   4058.371          0   29237.54
        xage |   20190    25.72233   16.76945          0   64.27515
      female |   20190    .5170381    .499722          0          1
-------------+-----------------------------------------------------
     educdec |   20186    11.96681   2.806255          0         25
> */
.
. * Note that unlike chapter 16 we use all years, not just year 2
.
. * educdec is missing for some observations
. drop if educdec==.
(4 observations deleted)
.
. * rename variables
. rename mdvis MDU
. rename meddol MED
. rename binexp DMED
. rename lnmeddol LNMED
. rename linc LINC
. rename lfam LFAM
. rename educdec EDUCDEC
. rename xage AGE
. rename female FEMALE
. rename child CHILD
. rename fchild FEMCHILD
. rename black BLACK
. rename disea NDISEASE
. rename physlm PHYSLIM
. rename hlthg HLTHG
. rename hlthf HLTHF
. rename hlthp HLTHP
. rename idp IDP
. rename logc LC
. rename lpi LPI
. rename fmde FMDE
.
. * Define the regressor list which in commands can refer to as $XLIST
. global XLIST LC IDP LPI FMDE PHYSLIM NDISEASE HLTHG HLTHF HLTHP /*
>          */ LINC LFAM EDUCDEC AGE FEMALE CHILD FEMCHILD BLACK
.
. sum MDU $XLIST
    Variable |     Obs        Mean  Std. Dev.        Min        Max
-------------+-----------------------------------------------------
         MDU |   20186    2.860696   4.504765          0         77
          LC |   20186    2.383588   2.041713          0   4.564348
         IDP |   20186    .2599822   .4386354          0          1
         LPI |   20186    4.708827   2.697293          0   7.163699
        FMDE |   20186    4.030322   3.471234          0   8.294049
-------------+-----------------------------------------------------
     PHYSLIM |   20186    .1235247   .3220437          0          1
    NDISEASE |   20186     11.2445   6.741647          0       58.6
       HLTHG |   20186    .3620826   .4806144          0          1
       HLTHF |   20186    .0772813   .2670439          0          1
       HLTHP |   20186    .0149609   .1213992          0          1
-------------+-----------------------------------------------------
        LINC |   20186    8.708167    1.22841          0   10.28324
        LFAM |   20186    1.248404   .5390681          0   2.639057
     EDUCDEC |   20186    11.96681   2.806255          0         25
         AGE |   20186    25.71844   16.76759          0   64.27515
      FEMALE |   20186    .5169424   .4997252          0          1
-------------+-----------------------------------------------------
       CHILD |   20186    .4014168   .4901972          0          1
    FEMCHILD |   20186    .1937481   .3952436          0          1
       BLACK |   20186    .1815343   .3827365          0          1
.
. * Write final data to a text (ascii) file so can use with programs other than Stata
. outfile MDU LC IDP LPI FMDE PHYSLIM NDISEASE HLTHG HLTHF HLTHP /*
>          */ LINC LFAM EDUCDEC AGE FEMALE CHILD FEMCHILD BLACK /*
>          */ using mma20p1count.asc, replace
.
. ********** (1) FREQUENCIES OF COUNT (Table 20.3, page 672) **********
.
. * Following gives Table 20.3 (page 672) frequencies
. tabulate MDU
     number |
face-to-fac |
t md visits |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |      6,308       31.25       31.25
          1 |      3,815       18.90       50.15
          2 |      2,795       13.85       63.99
          3 |      1,884        9.33       73.33
          4 |      1,345        6.66       79.99
          5 |        968        4.80       84.79
          6 |        689        3.41       88.20
          7 |        531        2.63       90.83
          8 |        408        2.02       92.85
          9 |        287        1.42       94.27
         10 |        206        1.02       95.29
         11 |        190        0.94       96.24
         12 |        118        0.58       96.82
         13 |        109        0.54       97.36
         14 |         82        0.41       97.77
         15 |         59        0.29       98.06
         16 |         56        0.28       98.34
         17 |         33        0.16       98.50
         18 |         37        0.18       98.68
         19 |         35        0.17       98.86
         20 |         26        0.13       98.98
         21 |         22        0.11       99.09
         22 |         19        0.09       99.19
         23 |         19        0.09       99.28
         24 |         13        0.06       99.35
         25 |          8        0.04       99.39
         26 |         10        0.05       99.44
         27 |          6        0.03       99.46
         28 |         12        0.06       99.52
         29 |          6        0.03       99.55
         30 |          8        0.04       99.59
         31 |          8        0.04       99.63
         32 |          4        0.02       99.65
         33 |          5        0.02       99.68
         34 |          9        0.04       99.72
         35 |          5        0.02       99.75
         37 |          5        0.02       99.77
         38 |          9        0.04       99.82
         39 |          1        0.00       99.82
         40 |          3        0.01       99.84
         41 |          5        0.02       99.86
         44 |          6        0.03       99.89
         45 |          2        0.01       99.90
         46 |          2        0.01       99.91
         48 |          2        0.01       99.92
         51 |          1        0.00       99.93
         52 |          3        0.01       99.94
         55 |          1        0.00       99.95
         56 |          1        0.00       99.95
         57 |          1        0.00       99.96
         58 |          1        0.00       99.96
         62 |          1        0.00       99.97
         63 |          1        0.00       99.97
         65 |          1        0.00       99.98
         69 |          1        0.00       99.98
         72 |          1        0.00       99.99
         74 |          1        0.00       99.99
         76 |          1        0.00      100.00
         77 |          1        0.00      100.00
------------+-----------------------------------
      Total |     20,186      100.00
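The frequency table already hints at why the Poisson standard errors need care later in this program. The Python sketch below is not part of the original do-file; it uses only figures copied from the log (the zero count, N, and the sample mean and standard deviation of MDU) to compare the observed share of zeros with what a Poisson distribution with the same mean would imply, ignoring regressors. The comparison is unconditional, so it is suggestive rather than a formal test.

```python
import math

# Figures copied from the log: N, zero count, sample mean and std. dev. of MDU.
n_obs, n_zero = 20186, 6308
mean_mdu, sd_mdu = 2.860696, 4.504765

share_zero = n_zero / n_obs            # observed fraction with MDU = 0
poisson_p0 = math.exp(-mean_mdu)       # P(Y = 0) for a Poisson with this mean
var_to_mean = sd_mdu ** 2 / mean_mdu   # equals 1 under Poisson equidispersion

print(f"observed share of zeros : {share_zero:.4f}")
print(f"Poisson-implied P(Y = 0): {poisson_p0:.4f}")
print(f"variance / mean         : {var_to_mean:.2f}")
```

The observed share of zeros is several times the Poisson-implied value and the raw variance is well above the mean, both of which are symptoms of overdispersion in the marginal distribution of MDU.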
.
. * Histogram with kernel density estimate
. hist MDU, discrete kdensity
(start=0, width=1)
.
. ********** (2) DATA SUMMARY (Table 20.4, page 672) **********
.
. * Following gives variables in same order as Table 20.4 (page 672)
. sum MDU LC IDP LPI FMDE LINC LFAM AGE FEMALE CHILD FEMCHILD BLACK /*
>          */ EDUCDEC PHYSLIM NDISEASE HLTHG HLTHF HLTHP
    Variable |     Obs        Mean  Std. Dev.        Min        Max
-------------+-----------------------------------------------------
         MDU |   20186    2.860696   4.504765          0         77
          LC |   20186    2.383588   2.041713          0   4.564348
         IDP |   20186    .2599822   .4386354          0          1
         LPI |   20186    4.708827   2.697293          0   7.163699
        FMDE |   20186    4.030322   3.471234          0   8.294049
-------------+-----------------------------------------------------
        LINC |   20186    8.708167    1.22841          0   10.28324
        LFAM |   20186    1.248404   .5390681          0   2.639057
         AGE |   20186    25.71844   16.76759          0   64.27515
      FEMALE |   20186    .5169424   .4997252          0          1
       CHILD |   20186    .4014168   .4901972          0          1
-------------+-----------------------------------------------------
    FEMCHILD |   20186    .1937481   .3952436          0          1
       BLACK |   20186    .1815343   .3827365          0          1
     EDUCDEC |   20186    11.96681   2.806255          0         25
     PHYSLIM |   20186    .1235247   .3220437          0          1
    NDISEASE |   20186     11.2445   6.741647          0       58.6
-------------+-----------------------------------------------------
       HLTHG |   20186    .3620826   .4806144          0          1
       HLTHF |   20186    .0772813   .2670439          0          1
       HLTHP |   20186    .0149609   .1213992          0          1
.
.
. *********** (3, 4) REGRESSION ANALYSIS **************
.
. * Here just two estimators - Poisson and negative binomial
. * but three ways to calculate standard errors
. * (A) default ML
. * (B) robust (to misspecification of heteroskedasticity)
. * (C) cluster-robust needed here as data are actually panel (see chapter 21, 24)
.
. *** Table 20.5 Poisson regression estimates
.
. * Default standard errors assume variance = mean (ignoring overdispersion)
. * This is first t-ratio in Table 20.5
. poisson MDU $XLIST
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Poisson regression                                Number of obs   =      20186
                                                  LR chi2(17)     =   13106.07
                                                  Prob > chi2     =     0.0000
Log likelihood = -60087.622                       Pseudo R2       =     0.0983
------------------------------------------------------------------------------
         MDU |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
LC | -.0427332 .0060785 -7.03 0.000 -.0546469 -.0308195
IDP | -.1613169 .0116218 -13.88 0.000 -.1840952 -.1385385
LPI | .0128511 .0018362 7.00 0.000 .0092523 .0164499
FMDE | -.020613 .0035521 -5.80 0.000 -.027575 -.0136511
PHYSLIM | .2684048 .0123624 21.71 0.000 .2441749 .2926347
NDISEASE | .023183 .0006081 38.12 0.000 .0219912 .0243749
HLTHG | .0394004 .0095884 4.11 0.000 .0206074 .0581934
HLTHF | .2531119 .016212 15.61 0.000 .2213369 .2848869
HLTHP | .5216034 .0272382 19.15 0.000 .4682176 .5749892
LINC | .0834099 .0051656 16.15 0.000 .0732854 .0935343
LFAM | -.1296626 .0089603 -14.47 0.000 -.1472245 -.1121008
EDUCDEC | .0176149 .0016387 10.75 0.000 .0144031 .0208268
AGE | .0023756 .0004311 5.51 0.000 .0015306 .0032206
FEMALE | .3487667 .0113504 30.73 0.000 .3265203 .371013
CHILD | .3361904 .0178194 18.87 0.000 .3012649 .3711158
FEMCHILD | -.3625218 .0179396 -20.21 0.000 -.3976827 -.3273608
BLACK | -.6800518 .0155484 -43.74 0.000 -.7105262 -.6495775
_cons | -.1898766 .0491731 -3.86 0.000 -.2862541 -.093499
------------------------------------------------------------------------------
. estimates store poisml
.
. * Should always control for possible overdispersion
. * This is second t-ratio in Table 20.5
. poisson MDU $XLIST, robust
Iteration 0: log pseudo-likelihood = -60097.599
                                                  Number of obs   =      20186
                                                  Wald chi2(17)   =    1924.78
                                                  Prob > chi2     =     0.0000
Log pseudo-likelihood = -60087.622                Pseudo R2       =     0.0983
------------------------------------------------------------------------------
             |               Robust
         MDU |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
LC | -.0427332 .0150712 -2.84 0.005 -.0722723 -.0131942
IDP | -.1613169 .0279441 -5.77 0.000 -.2160863 -.1065474
LPI | .0128511 .0044136 2.91 0.004 .0042007 .0215015
FMDE | -.020613 .0088874 -2.32 0.020 -.0380319 -.0031941
PHYSLIM | .2684048 .0325743 8.24 0.000 .2045604 .3322493
NDISEASE | .023183 .0017189 13.49 0.000 .019814 .0265521
HLTHG | .0394004 .023194 1.70 0.089 -.006059 .0848598
HLTHF | .2531119 .0429454 5.89 0.000 .1689405 .3372833
HLTHP | .5216034 .0748808 6.97 0.000 .3748398 .668367
LINC | .0834099 .0139182 5.99 0.000 .0561306 .1106891
LFAM | -.1296626 .0226793 -5.72 0.000 -.1741132 -.085212
EDUCDEC | .0176149 .004042 4.36 0.000 .0096927 .0255371
AGE | .0023756 .0011184 2.12 0.034 .0001837 .0045675
FEMALE | .3487667 .0283549 12.30 0.000 .293192 .4043413
CHILD | .3361904 .040411 8.32 0.000 .2569863 .4153945
FEMCHILD | -.3625218 .04415 -8.21 0.000 -.4490542 -.2759893
BLACK | -.6800518 .0368748 -18.44 0.000 -.7523252 -.6077785
_cons | -.1898766 .127516 -1.49 0.136 -.4398033 .0600502
------------------------------------------------------------------------------
. estimates store poisrobust
.
. * Should also control here for clustering (see chapter 24)
. * as up to four years of data for each person.
. * Table 20.5 did not report these results
. poisson MDU $XLIST, cluster(zper)
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Poisson regression                                Number of obs   =      20186
                                                  Wald chi2(17)   =     827.07
Log pseudo-likelihood = -60087.622                Prob > chi2     =     0.0000
                             (standard errors adjusted for clustering on zper)
------------------------------------------------------------------------------
             |               Robust
         MDU |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
LC | -.0427332 .0226824 -1.88 0.060 -.0871899 .0017235
IDP | -.1613169 .0424591 -3.80 0.000 -.2445352 -.0780986
LPI | .0128511 .0067697 1.90 0.058 -.0004173 .0261195
FMDE | -.020613 .0134449 -1.53 0.125 -.0469646 .0057386
PHYSLIM | .2684048 .0491061 5.47 0.000 .1721586 .364651
NDISEASE | .023183 .0027457 8.44 0.000 .0178015 .0285645
HLTHG | .0394004 .0354001 1.11 0.266 -.0299825 .1087833
HLTHF | .2531119 .0675164 3.75 0.000 .1207822 .3854416
HLTHP | .5216034 .1163731 4.48 0.000 .2935163 .7496905
LINC | .0834099 .0200881 4.15 0.000 .0440379 .1227818
LFAM | -.1296626 .0340038 -3.81 0.000 -.1963089 -.0630164
EDUCDEC | .0176149 .0062678 2.81 0.005 .0053302 .0298996
AGE | .0023756 .0016549 1.44 0.151 -.0008681 .0056192
FEMALE | .3487667 .0432567 8.06 0.000 .263985 .4335483
CHILD | .3361904 .0586109 5.74 0.000 .2213151 .4510656
FEMCHILD | -.3625218 .0660639 -5.49 0.000 -.4920045 -.233039
BLACK | -.6800518 .0544268 -12.49 0.000 -.7867263 -.5733774
_cons | -.1898766 .1860343 -1.02 0.307 -.5544971 .174744
------------------------------------------------------------------------------
. estimates store poiscluster
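The three Poisson runs above share the same coefficient estimates and differ only in their standard errors. The Python sketch below is not part of the original program; it copies a handful of standard errors from the three tables (the choice of regressors is arbitrary) and computes how much each correction inflates them. The dictionary and variable names are illustrative.

```python
# Standard errors copied from the default, robust, and cluster-robust Poisson
# tables above; the coefficients themselves are identical across the three runs.
coef = {"LC": -0.0427332, "IDP": -0.1613169, "NDISEASE": 0.023183, "BLACK": -0.6800518}
se = {
    #            default ML  robust     cluster(zper)
    "LC":       (0.0060785, 0.0150712, 0.0226824),
    "IDP":      (0.0116218, 0.0279441, 0.0424591),
    "NDISEASE": (0.0006081, 0.0017189, 0.0027457),
    "BLACK":    (0.0155484, 0.0368748, 0.0544268),
}

ratios = {}
for name, (ml, robust, cluster) in se.items():
    ratios[name] = (robust / ml, cluster / robust)
    z_cluster = coef[name] / cluster   # z statistic under cluster-robust errors
    print(f"{name:9s} robust/ML = {ratios[name][0]:.2f}  "
          f"cluster/robust = {ratios[name][1]:.2f}  z(cluster) = {z_cluster:.2f}")
```

For these regressors the robust standard errors are more than twice the default ones, consistent with substantial overdispersion, and clustering on the person identifier raises them by roughly a further half. A coefficient such as LC, which looks strongly significant under default errors, is only borderline once clustering is taken into account.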
.
. *** Table 20.5 Negative binomial regression estimates
.
. * Default standard errors assume the negative binomial variance function is correctly specified
. * This is first t-ratio in Table 20.5
. nbreg MDU $XLIST
Fitting Poisson model:
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
                                                  Number of obs   =      20186
                                                  LR chi2(17)     =    2828.01
                                                  Prob > chi2     =     0.0000
Log likelihood = -42777.611                       Pseudo R2       =     0.0320
------------------------------------------------------------------------------
         MDU |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
LC | -.0504405 .0128694 -3.92 0.000 -.0756641 -.0252169
IDP | -.1475976 .0254099 -5.81 0.000 -.1974001 -.0977951
LPI | .0158351 .0040586 3.90 0.000 .0078805 .0237898
FMDE | -.021335 .0075119 -2.84 0.005 -.036058 -.0066119
PHYSLIM | .2751715 .0295572 9.31 0.000 .2172404 .3331026
NDISEASE | .0259352 .0014827 17.49 0.000 .0230292 .0288412
HLTHG | .0065371 .0202235 0.32 0.747 -.0331002 .0461744
HLTHF | .2368643 .0374086 6.33 0.000 .1635448 .3101837
HLTHP | .4256563 .0741812 5.74 0.000 .2802638 .5710488
LINC | .0845165 .0085659 9.87 0.000 .0677277 .1013053
LFAM | -.1226764 .019308 -6.35 0.000 -.1605195 -.0848333
EDUCDEC | .0162582 .0034846 4.67 0.000 .0094285 .0230879
AGE | .0025943 .0009433 2.75 0.006 .0007455 .0044432
FEMALE | .3672884 .024005 15.30 0.000 .3202395 .4143373
CHILD | .3060317 .0385618 7.94 0.000 .230452 .3816115
FEMCHILD | -.3755503 .0371392 -10.11 0.000 -.4483418 -.3027587
BLACK | -.7104372 .0274929 -25.84 0.000 -.7643223 -.6565521
_cons | -.2069298 .0899431 -2.30 0.021 -.3832151 -.0306445
-------------+----------------------------------------------------------------
/lnalpha | .1674206 .0147901 .1384326 .1964087
-------------+----------------------------------------------------------------
alpha | 1.182251 .0174856 1.148472 1.217024
------------------------------------------------------------------------------
Likelihood-ratio test of alpha=0: chibar2(01) = 3.5e+04 Prob>=chibar2 = 0.000
. estimates store nbml
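The alpha row in the nbreg output is a transformation of the estimated /lnalpha. As a cross-check done outside Stata (a minimal Python sketch, not part of the original program), alpha = exp(lnalpha), and a delta-method standard error of alpha times se(lnalpha) reproduces the reported values. The interval step assumes the interval for alpha is formed by exponentiating the endpoints of the interval for lnalpha, which is consistent with the numbers shown.

```python
import math

# Values copied from the nbreg output (default standard errors).
lnalpha = 0.1674206
se_lnalpha = 0.0147901
ci_lnalpha = (0.1384326, 0.1964087)

# alpha is the exponential of the estimated log-alpha.
alpha = math.exp(lnalpha)

# Delta method: d exp(x)/dx = exp(x), so se(alpha) is about alpha * se(lnalpha).
se_alpha = alpha * se_lnalpha

# Interval for alpha obtained by exponentiating the interval for lnalpha.
ci_alpha = tuple(math.exp(v) for v in ci_lnalpha)

print(round(alpha, 6), round(se_alpha, 7))
print(tuple(round(v, 6) for v in ci_alpha))
```

The same arithmetic applies to the robust and cluster-robust runs; only se(lnalpha) changes, so alpha itself is identical across the three sets of standard errors.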
.
. * Should always control for possible overdispersion
. * This is second t-ratio in Table 20.5
. nbreg MDU $XLIST, robust
Fitting Poisson model:
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Number of obs = 20186
Wald chi2(17) = 2203.12
Prob > chi2 = 0.0000
Log pseudo-likelihood = -42777.611
Pseudo R2 = 0.0320
------------------------------------------------------------------------------
| Robust
MDU | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
LC | -.0504405 .0156238 -3.23 0.001 -.0810625 -.0198184
IDP | -.1475976 .0303777 -4.86 0.000 -.2071367 -.0880585
LPI | .0158351 .004431 3.57 0.000 .0071505 .0245197
FMDE | -.021335 .0090748 -2.35 0.019 -.0391211 -.0035488
PHYSLIM | .2751715 .0341067 8.07 0.000 .2083235 .3420195
NDISEASE | .0259352 .0016925 15.32 0.000 .022618 .0292524
HLTHG | .0065371 .023814 0.27 0.784 -.0401375 .0532118
HLTHF | .2368643 .0436579 5.43 0.000 .1512963 .3224322
HLTHP | .4256563 .0686042 6.20 0.000 .2911945 .560118
LINC | .0845165 .0113918 7.42 0.000 .0621891 .106844
LFAM | -.1226764 .0231639 -5.30 0.000 -.1680769 -.0772759
EDUCDEC | .0162582 .0040332 4.03 0.000 .0083533 .024163
AGE | .0025943 .0011128 2.33 0.020 .0004133 .0047753
FEMALE | .3672884 .0285724 12.85 0.000 .3112876 .4232892
CHILD | .3060317 .0428976 7.13 0.000 .221954 .3901095
FEMCHILD | -.3755503 .0447039 -8.40 0.000 -.4631682 -.2879323
BLACK | -.7104372 .0359462 -19.76 0.000 -.7808903 -.639984
_cons | -.2069298 .1130753 -1.83 0.067 -.4285533 .0146938
-------------+----------------------------------------------------------------
/lnalpha | .1674206 .0187562 .1306591 .2041821
-------------+----------------------------------------------------------------
alpha | 1.182251 .0221746 1.139579 1.226522
------------------------------------------------------------------------------
. estimates store nbrobust
.
. * Should also control here for clustering (see chapter 24)
. * as up to four years of data for each person.
Number of obs = 20186
Wald chi2(17) = 1034.43
Prob > chi2 = 0.0000
Log pseudo-likelihood = -42777.611
(standard errors adjusted for clustering on zper)
------------------------------------------------------------------------------
| Robust
MDU | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
LC | -.0504405 .0236804 -2.13 0.033 -.0968533 -.0040277
IDP | -.1475976 .0457769 -3.22 0.001 -.2373186 -.0578766
LPI | .0158351 .0066968 2.36 0.018 .0027096 .0289607
FMDE | -.021335 .0137245 -1.55 0.120 -.0482344 .0055645
PHYSLIM | .2751715 .0489905 5.62 0.000 .1791519 .371191
NDISEASE | .0259352 .0025814 10.05 0.000 .0208758 .0309946
HLTHG | .0065371 .0359676 0.18 0.856 -.0639581 .0770323
HLTHF | .2368643 .0653989 3.62 0.000 .1086848 .3650437
HLTHP | .4256563 .1000813 4.25 0.000 .2295005 .621812
LINC | .0845165 .0152197 5.55 0.000 .0546864 .1143467
LFAM | -.1226764 .0340453 -3.60 0.000 -.189404 -.0559488
EDUCDEC | .0162582 .0059501 2.73 0.006 .0045962 .0279202
AGE | .0025943 .001581 1.64 0.101 -.0005045 .0056931
FEMALE | .3672884 .0420327 8.74 0.000 .2849059 .4496709
CHILD | .3060317 .0598167 5.12 0.000 .1887932 .4232702
FEMCHILD | -.3755503 .0649845 -5.78 0.000 -.5029175 -.2481831
BLACK | -.7104372 .0531155 -13.38 0.000 -.8145417 -.6063326
_cons | -.2069298 .1576721 -1.31 0.189 -.5159613 .1021018
| -20.208 -8.211 -5.487
BLACK | -0.6801 -0.6801 -0.6801
| -43.738 -18.442 -12.495
_cons | -0.1899 -0.1899 -0.1899
| -3.861 -1.489 -1.021
-------------+--------------------------------------
N | 20186.0000 20186.0000 20186.0000
ll | -6.009e+04 -6.009e+04 -6.009e+04
rank | 18.0000 18.0000 18.0000
aic | 1.202e+05 1.202e+05 1.202e+05
bic | 1.204e+05 1.204e+05 1.204e+05
----------------------------------------------------
legend: b/t
.
. * Last columns of Table 20.5 (page 673) give bnbml. Also give others.
. estimates table nbml nbrobust nbcluster, t stats(N ll rank aic bic) b(%10.4f) t(%10.3f)
----------------------------------------------------
    Variable |       nbml     nbrobust    nbcluster
-------------+--------------------------------------
MDU          |
          LC |    -0.0504      -0.0504      -0.0504
             |     -3.919       -3.228       -2.130
         IDP |    -0.1476      -0.1476      -0.1476
             |     -5.809       -4.859       -3.224
         LPI |     0.0158       0.0158       0.0158
             |      3.902        3.574        2.365
        FMDE |    -0.0213      -0.0213      -0.0213
             |     -2.840       -2.351       -1.555
     PHYSLIM |     0.2752       0.2752       0.2752
             |      9.310        8.068        5.617
    NDISEASE |     0.0259       0.0259       0.0259
             |     17.492       15.324       10.047
       HLTHG |     0.0065       0.0065       0.0065
             |      0.323        0.275        0.182
       HLTHF |     0.2369       0.2369       0.2369
             |      6.332        5.425        3.622
       HLTHP |     0.4257       0.4257       0.4257
             |      5.738        6.205        4.253
        LINC |     0.0845       0.0845       0.0845
             |      9.867        7.419        5.553
        LFAM |    -0.1227      -0.1227      -0.1227
             |     -6.354       -5.296       -3.603
     EDUCDEC |     0.0163       0.0163       0.0163
             |      4.666        4.031        2.732
         AGE |     0.0026       0.0026       0.0026
             |      2.750        2.331        1.641
      FEMALE |     0.3673       0.3673       0.3673
             |     15.300       12.855        8.738
       CHILD |     0.3060       0.3060       0.3060
             |      7.936        7.134        5.116
    FEMCHILD |    -0.3756      -0.3756      -0.3756
             |    -10.112       -8.401       -5.779
       BLACK |    -0.7104      -0.7104      -0.7104
             |    -25.841      -19.764      -13.375
       _cons |    -0.2069      -0.2069      -0.2069
             |     -2.301       -1.830       -1.312
-------------+--------------------------------------
lnalpha      |
       _cons |     0.1674       0.1674       0.1674
             |     11.320        8.926        6.628
-------------+--------------------------------------
Statistics   |
           N | 20186.0000   20186.0000   20186.0000
          ll | -4.278e+04   -4.278e+04   -4.278e+04
        rank |    19.0000      19.0000      19.0000
         aic | 85593.2220   85593.2220   85593.2220
         bic | 85743.5642   85743.5642   85743.5642
----------------------------------------------------
                                        legend: b/t
.
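The aic and bic rows of the negative binomial table follow from the log likelihood, the number of estimated parameters (rank) and the sample size. A minimal Python sketch of that arithmetic, assuming the usual definitions AIC = -2 ll + 2k and BIC = -2 ll + k ln(N), reproduces the reported values up to rounding of the displayed log likelihood.

```python
import math

# Values copied from the negative binomial output in this log.
n_obs = 20186
loglik = -42777.611
k = 19  # "rank": 18 regression coefficients plus lnalpha

# Assumed definitions of the two information criteria.
aic = -2.0 * loglik + 2.0 * k
bic = -2.0 * loglik + k * math.log(n_obs)

print(round(aic, 3), round(bic, 3))
```

Because the three negative binomial runs share the same likelihood and differ only in how the standard errors are computed, the criteria are identical across the nbml, nbrobust and nbcluster columns.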
. * For Poisson correcting for overdispersion is most important.
. * For negative binomial overdispersion is already incorporated.
. * For both, controlling for clustering (in this example with panel data)
. * is also needed.
.
. ********** CLOSE OUTPUT
. log close
log: c:\Imbook\bwebpage\Section4\mma20p1count.txt
log type: text
closed on: 20 May 2005, 08:41:56
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section5\mma21p1panfeandre.txt
log type: text
opened on: 23 May 2005, 11:27:25
.
. ********** OVERVIEW OF MMA21P1PANFEANDRE.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 21.3.1-3 pages 709-14
. * Program performs basic panel analysis, mainly using XTREG:
. * It derives most of Table 21.2 and Figures 21.1-21.4
. * (1) pooled OLS
. * (2) between
. * (3) within (or fixed effects)
. * (4) first differences
. * (5) random effects - GLS
. * (6) random effects - MLE
. * (7) Hausman test of FE versus RE
. * Standard errors are default plus panel bootstrap
.
. * The individual effects model is
. * y_it = x_it'b + a_i + e_it
. * Default panel output assumes e_it is i.i.d.
. * This is usually too strong an assumption.
. * Instead should get panel-robust or cluster-robust errors after xtreg
. * See Section 21.2.3 pages 709-12
. * Stata Version 8 does not do this but Stata version 9 does.
.
. * Three ways to obtain panel-robust se's for fixed and random effects models:
. * (1) Use Stata version 9 and cluster option in xtreg
. * (2) Use Stata version 8 xtreg and then panel bootstrap (this program)
. * (3) Use Stata version 8 regress cluster option on transformed model (next program)
.
. * The four basic linear panel programs are
. * mma21p1panfeandre.do Linear fixed and random effects using xtreg
. * mma21p2panmanual.do Linear fe and re using transformation and regress
. *   plus also has valid Hausman test
. * mma21p3panresiduals.do Residual analysis after linear fe and re
. * mma21p4pangls.do Pooled panel OLS and GLS
.
. * To run this program you need data file
. * MOM.dat
.
. * To speed up this program reduce nreps, the number of bootstraps
. * used in the panel bootstrap to get panel-robust standard errors
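The panel bootstrap referred to above resamples whole individuals (clusters) with replacement, keeping all years for a drawn individual together, and takes the standard deviation of the estimate across replications. The following is a minimal Python sketch of that idea using only the standard library and a hypothetical toy panel; the function names and the simulated data are illustrative assumptions and are not the Stata bootstrap command used in this program.

```python
import random
import statistics

def ols_slope(rows):
    """OLS slope of y on x (with intercept) for a list of (x, y) pairs."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    sxy = sum((x - mx) * (y - my) for x, y in rows)
    sxx = sum((x - mx) ** 2 for x, _ in rows)
    return sxy / sxx

def panel_bootstrap_se(panel, nreps, seed):
    """Standard deviation of the slope over resamples of whole individuals."""
    rng = random.Random(seed)
    ids = list(panel)
    slopes = []
    for _ in range(nreps):
        # Draw individuals with replacement; keep all years of each draw together.
        draw = [rng.choice(ids) for _ in ids]
        rows = [obs for i in draw for obs in panel[i]]
        slopes.append(ols_slope(rows))
    return statistics.stdev(slopes)

# Hypothetical toy panel: 30 individuals, 5 periods, with an individual effect.
rng = random.Random(1)
panel = {}
for i in range(30):
    a_i = rng.gauss(0.0, 1.0)
    panel[i] = []
    for _ in range(5):
        x = rng.gauss(0.0, 1.0)
        y = 0.5 * x + a_i + rng.gauss(0.0, 0.5)
        panel[i].append((x, y))

se = panel_bootstrap_se(panel, nreps=200, seed=10001)
print(se)
```

Reducing nreps shortens the run time at the cost of a noisier standard error, which is the trade-off the comment about nreps describes.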
.
. ********** SETUP **********
.
. set more off
. version 8.0
. set scheme s1mono /* Graphics scheme */
.
. ********** DATA DESCRIPTION **********
.
. * The original data is from
. * Jim Ziliak (1997)
. * "Efficient Estimation With Panel Data when Instruments are Predetermined:
. * An Empirical Comparison of Moment-Condition Estimators"
. * Journal of Business and Economic Statistics, 15, 419-431
.
. * File MOM.dat has data on 532 men over 10 years (1979-1988)
. * Data are space-delimited ordered by person with separate line for each year
. * So id 1 1979, id 1 1980, ..., id 1 1988, id 2 1979, id 2 1980, ...
. * 8 variables:
. * lnhr lnwg kids ageh agesq disab id year
.
. * File MOM.dat is the version of the data posted at the JBES website
. * Note that in chapter 22 we instead use MOMprecise.dat
. * which is the same data set but with more significant digits
.
. ********** READ DATA **********
.
. * The data are in ascii file MOM.dat
. * There are 532 individuals with 10 lines (years) per individual
. * Read in using Infile: FREE FORMAT WITHOUT DICTIONARY
. infile lnhr lnwg kids ageh agesq disab id year using MOM.dat
(5320 observations read)
.
. ********** DATA TRANSFORMATIONS AND CHECK **********
.
. * Create year dummies
. tabulate year, generate(dyear)
       year |      Freq.     Percent        Cum.
------------+-----------------------------------
       1979 |        532       10.00       10.00
       1980 |        532       10.00       20.00
       1981 |        532       10.00       30.00
       1982 |        532       10.00       40.00
       1983 |        532       10.00       50.00
       1984 |        532       10.00       60.00
       1985 |        532       10.00       70.00
       1986 |        532       10.00       80.00
       1987 |        532       10.00       90.00
       1988 |        532       10.00      100.00
------------+-----------------------------------
      Total |      5,320      100.00
.
. * The following lists the variables in data set and summarizes data
. describe
Contains data
  obs:         5,320
 vars:            18
 size:       244,720 (97.6% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
lnhr            float  %9.0g
lnwg            float  %9.0g
kids            float  %9.0g
ageh            float  %9.0g
agesq           float  %9.0g
disab           float  %9.0g
id              float  %9.0g
year            float  %9.0g
dyear1          byte   %8.0g                  year== 1979.0000
dyear2          byte   %8.0g                  year== 1980.0000
dyear3          byte   %8.0g                  year== 1981.0000
dyear4          byte   %8.0g                  year== 1982.0000
dyear5          byte   %8.0g                  year== 1983.0000
dyear6          byte   %8.0g                  year== 1984.0000
dyear7          byte   %8.0g                  year== 1985.0000
dyear8          byte   %8.0g                  year== 1986.0000
dyear9          byte   %8.0g                  year== 1987.0000
dyear10         byte   %8.0g                  year== 1988.0000
-------------------------------------------------------------------------------
Sorted by:
Note: dataset has changed since last saved
. summarize
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        lnhr |      5320     7.65743    .2855914       2.77       8.56
        lnwg |      5320    2.609436    .4258924       -.26       4.69
        kids |      5320    1.555827    1.195924          0          6
        ageh |      5320    38.91823    8.450351         22         60
       agesq |      5320    1586.024    689.7759        484       3600
-------------+--------------------------------------------------------
       disab |      5320    .0609023    .2391734          0          1
          id |      5320       266.5    153.5893          1        532
        year |      5320      1983.5    2.872551       1979       1988
      dyear1 |      5320          .1    .3000282          0          1
      dyear2 |      5320          .1    .3000282          0          1
-------------+--------------------------------------------------------
      dyear3 |      5320          .1    .3000282          0          1
      dyear4 |      5320          .1    .3000282          0          1
      dyear5 |      5320          .1    .3000282          0          1
      dyear6 |      5320          .1    .3000282          0          1
      dyear7 |      5320          .1    .3000282          0          1
-------------+--------------------------------------------------------
      dyear8 |      5320          .1    .3000282          0          1
      dyear9 |      5320          .1    .3000282          0          1
     dyear10 |      5320          .1    .3000282          0          1
. save mom, replace
file mom.dta saved
.
. * The following summarizes panel features for completeness
. iis id
. tis year
. xtdes
      id:  1, 2, ..., 532                                    n =        532
    year:  1979, 1980, ..., 1988                             T =         10
           Delta(year) = 1; (1988-1979)+1 = 10
           (id*year uniquely identifies each observation)

Distribution of T_i:   min      5%     25%       50%       75%     95%     max
                        10      10      10        10        10      10      10
                 |                                            |
kids     overall |  1.555827   1.195924          0          6 |     N =    5320
         between |             1.032205          0        5.4 |     n =     532
         within  |              .605468  -2.444173   5.055827 |     T =      10
                 |                                            |
ageh     overall |  38.91823   8.450351         22         60 |     N =    5320
         between |             7.945371       26.5       55.5 |     n =     532
         within  |             2.895916   32.71823   52.21823 |     T =      10
                 |                                            |
agesq    overall |  1586.024   689.7759        484       3600 |     N =    5320
         between |             650.9138      710.5     3088.5 |     n =     532
         within  |             229.8235   963.3239   2581.724 |     T =      10
                 |                                            |
disab    overall |  .0609023   .2391734          0          1 |     N =    5320
         between |             .1657419          0          1 |     n =     532
         within  |             .1725689  -.8390977   .9609023 |     T =      10
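The overall, between and within standard deviations reported here split the variation in each variable into variation across individuals (in their time averages) and variation over time around each individual's own average. A minimal Python sketch on a hypothetical two-person toy panel illustrates the decomposition. The divisors used (N-1 for overall and within, n-1 for between) are an assumption about the convention and should be checked against the Stata documentation.

```python
import math

def xtsum(panel):
    """Overall, between and within standard deviations for a dict id -> values."""
    values = [v for obs in panel.values() for v in obs]
    n_total = len(values)
    grand_mean = sum(values) / n_total
    group_means = {i: sum(obs) / len(obs) for i, obs in panel.items()}

    # Overall: deviations of every observation from the grand mean.
    overall = math.sqrt(sum((v - grand_mean) ** 2 for v in values) / (n_total - 1))
    # Between: deviations of individual means from the grand mean.
    between = math.sqrt(
        sum((m - grand_mean) ** 2 for m in group_means.values()) / (len(panel) - 1)
    )
    # Within: deviations of each observation from its own individual mean.
    within = math.sqrt(
        sum((v - group_means[i]) ** 2 for i, obs in panel.items() for v in obs)
        / (n_total - 1)
    )
    return overall, between, within

# Hypothetical toy panel: two individuals observed for three periods each.
toy = {1: [1.0, 2.0, 3.0], 2: [4.0, 5.0, 6.0]}
overall, between, within = xtsum(toy)
print(overall, between, within)
```

In the output for kids, for example, the between standard deviation exceeds the within standard deviation, so most of the variation in that regressor is across individuals rather than over time.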
.
. ********** DEFINE GLOBALS INCLUDING REGRESSOR LIST **********
.
. * Number of reps for the bootstrap
. * Table 21.2 page 710 used 500
. global nreps 500
.
. * The regressions below are of lnhr on lnwg
. * Additional regressors to be included below are defined in xextra
. * Choose one of the following
.
. * No additional regressors
. global xextra
. global xextrashort
.
. * Include year dummies with one omitted (or two omitted for first differences)
. * global xextra dyear1 dyear2 dyear3 dyear4 dyear5 dyear6 dyear7 dyear8 dyear9
. * global xextrashort dyear2 dyear3 dyear4 dyear5 dyear6 dyear7 dyear8 dyear9
.
. * Include socioeconomic characteristics
. * global xextra kids ageh agesq disab
. * global xextrashort kids ageh agesq disab
.
. ********* DIFFERENT PANEL ESTIMATES pages 709-14 **********
.
. * Note that the first xt command needs the option i(id)
. * to indicate that id identifies the individual for each observation
.
. * XTDATA permits plots of between, within and overall
. * Useful for looking at the data. See Stata manual under xtdata for example.
. * XTREG gives between, within and RE estimates, though not the correct standard errors
.
. * The graphs below use new Stata 8 graphics
. * Change graphics scheme from default s2color to s1mono for printing
. set scheme s1mono
. * The following graphs include
. * legend(pos(4) ring(0) col(1))
. *   changes position of legend to four o'clock
. * legend( label(1 "Data used") label(2 "Smoothed fit") label(3 "Linear fit"))
. *   changes labels for the legends
.
. *** (1) POOLED OLS (OVERALL) REGRESSION (Table 21.2 POLS column and Figure 21.1)
.
. use mom, clear
.
. * Wrong: default OLS standard errors require that e_it is i.i.d.
. regress lnhr lnwg $xextra
      Source |       SS       df       MS              Number of obs =    5320
-------------+------------------------------           F(  1,  5318) =   82.22
       Model |  6.60538417     1  6.60538417           Prob > F      =  0.0000
    Residual |  427.225206  5318  .080335691           R-squared     =  0.0152
-------------+------------------------------           Adj R-squared =  0.0150
       Total |  433.830591  5319  .081562435           Root MSE      =  .28344
------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .0827436   .0091251     9.07   0.000     .0648545    .1006326
       _cons |   7.441516   .0241265   308.44   0.000     7.394219    7.488814
------------------------------------------------------------------------------
. estimates store polsiid
.
. * Wrong: White heteroskedastic-consistent standard errors
. * require that e_it is independent over both i and t
. regress lnhr lnwg $xextra, robust
Regression with robust standard errors                 Number of obs =    5320
                                                       F(  1,  5318) =   16.61
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.0152
                                                       Root MSE      =  .28344
------------------------------------------------------------------------------
             |               Robust
        lnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .0827436   .0203042     4.08   0.000     .0429391     .122548
Number of obs = 5320
N of clusters = 532
Replications = 500
P = percentile
BC = bias-corrected
. matrix polsbootse = e(se)
.
. * Overall plot of data with lowess local regression line - Figure 21.1 page 712
. graph twoway (scatter lnhr lnwg, msize(vsmall)) (lowess lnhr lnwg) (lfit lnhr lnwg), /*
> */ scale (1.2) plotregion(style(none)) /*
> */ title("Pooled (Overall) Regression") /*
> */ xtitle("Log hourly wage", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Log annual hours", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend(pos(4) ring(0) col(1)) legend(size(small)) /*
> */ legend( label(1 "Original data") label(2 "Nonparametric fit") label(3 "Linear fit"))
. graph export ch21pantot.wmf, replace
(file c:\Imbook\bwebpage\Section5\ch21pantot.wmf written in Windows Metafile format)
.
. *** (2) BETWEEN REGRESSION (Table 21.2 Between column and Figure 21.2)
.
. use mom, clear
.
. * Usual standard errors assume iid error
. xtreg lnhr lnwg, be i(id)
Between regression (regression on group means)  Number of obs      =      5320
Group variable (i): id                          Number of groups   =       532
R-sq:  within  = 0.0162                         Obs per group: min =        10
       between = 0.0213
       overall = 0.0152
                                                F(1,530)           =     11.55
sd(u_i + avg(e_i.))=  .1772555                  Prob > F           =    0.0007
------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .0668379   .0196635     3.40   0.001     .0282099    .1054658
       _cons |   7.483021   .0518829   144.23   0.000       7.3811    7.584943
------------------------------------------------------------------------------
. estimates store beiid
.
. * Heteroskedasticity robust standard errors
. * Stata has no option for this. See ch21panel2.do
.
. * Correct panel bootstrap standard errors
Number of obs = 5320
N of clusters = 532
Replications = 500

Number of obs      =      5320
Number of groups   =       532
Obs per group: min =        10
F(1,4787)          =     78.96
Prob > F           =    0.0000
------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .1676755     .01887     8.89   0.000     .1306816    .2046694
       _cons |   7.219892   .0493434   146.32   0.000     7.123156    7.316628
-------------+----------------------------------------------------------------
     sigma_u |  .18142881
     sigma_e |  .23278339
         rho |  .37789558   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(531, 4787) =     5.83           Prob > F = 0.0000
. estimates store feiid
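The rho reported by xtreg is described in the output as the fraction of variance due to u_i. A minimal Python sketch of that calculation, assuming rho = sigma_u^2 / (sigma_u^2 + sigma_e^2), reproduces the fixed effects value from the sigma_u and sigma_e shown for this regression. The random effects GLS values used for comparison are copied from the RE output later in this log.

```python
def variance_share(sigma_u, sigma_e):
    """Fraction of the total error variance attributed to the individual effect."""
    return sigma_u ** 2 / (sigma_u ** 2 + sigma_e ** 2)

# Fixed effects (within) output: sigma_u and sigma_e as displayed by xtreg, fe.
rho_fe = variance_share(0.18142881, 0.23278339)

# Random effects GLS output later in this log (xtreg, re).
rho_re = variance_share(0.16124733, 0.23278339)

print(round(rho_fe, 6), round(rho_re, 6))
```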
.
. * Correct panel robust standard errors
. * Stata has no option for this. See ch21panel2.do
.
. * Correct panel bootstrap standard errors
. set seed 10001
. bootstrap "xtreg lnhr lnwg $xextra, fe i(id)" "_b[lnwg] _b[_cons]", cluster(id) reps($nreps) level
> (95)
command:      xtreg lnhr lnwg , fe i(id)
statistics:   _bs_1      = _b[lnwg]
              _bs_2      = _b[_cons]
Bootstrap statistics                              Number of obs    =      5320
                                                  N of clusters    =       532
                                                  Replications     =       500
4788
------------------------------------------------------------------------------
             |               Robust
       dlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       dlnwg |   .1089851   .0688514     1.58   0.114    -.0259952    .2439654
       _cons |   .0008283   .0042856     0.19   0.847    -.0075735    .0092301
------------------------------------------------------------------------------
. estimates store fdiffhet
.
. * Correct panel bootstrap standard errors
. set seed 10001
. bs "regress dlnhr dlnwg $xextrashort" "_b[dlnwg] _b[_cons]", cluster(id) reps($nreps) level(95)
command:      regress dlnhr dlnwg
statistics:   _bs_1      = _b[dlnwg]
              _bs_2      = _b[_cons]
Bootstrap statistics                              Number of obs    =      4788
                                                  N of clusters    =       532
                                                  Replications     =       500
. graph twoway (scatter dlnhr dlnwg, msize(vsmall)) (lowess dlnhr dlnwg) (lfit dlnhr dlnwg), /*
> */ scale (1.2) plotregion(style(none)) /*
> */ title("First Differences Regression") /*
> */ xtitle("Log hourly wage", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Log annual hours", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend(pos(4) ring(0) col(1)) legend(size(small)) /*
> */ legend( label(1 "First differences") label(2 "Nonparametric fit") label(3 "Linear fit"))
. graph export ch21panfd.wmf, replace
(file c:\Imbook\bwebpage\Section5\ch21panfd.wmf written in Windows Metafile format)
.
. *** (5) RANDOM EFFECTS GLS REGRESSION (Table 21.2 RE-GLS column)
.
. use mom, clear
.
. * Usual standard errors assume iid error
. xtreg lnhr lnwg, re i(id)
Random-effects GLS regression                   Number of obs      =      5320
Group variable (i): id                          Number of groups   =       532
R-sq:  within  = 0.0162                         Obs per group: min =        10
       between = 0.0213                                        avg =      10.0
       overall = 0.0152                                        max =        10
Random effects u_i ~ Gaussian                   Wald chi2(1)       =     76.64
corr(u_i, X)       = 0 (assumed)                Prob > chi2        =    0.0000
------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .1193322   .0136312     8.75   0.000     .0926155     .146049
       _cons |   7.346041   .0363925   201.86   0.000     7.274713    7.417368
-------------+----------------------------------------------------------------
     sigma_u |  .16124733
     sigma_e |  .23278339
         rho |  .32424354   (fraction of variance due to u_i)
------------------------------------------------------------------------------
. estimates store reglsiid
.
. * Correct panel robust standard errors
. * Stata has no option for this. See ch21panel2.do
. * or use xtgee corr(exchangeable), robust see ch21panel4.do
.
. * Correct panel bootstrap standard errors
. set seed 10001
. bootstrap "xtreg lnhr lnwg, re i(id)" "_b[lnwg] _b[_cons]", cluster(id) reps($nreps) level(95)
command:      xtreg lnhr lnwg , re i(id)
statistics:   _bs_1      = _b[lnwg]
              _bs_2      = _b[_cons]
Bootstrap statistics                              Number of obs    =      5320
                                                  N of clusters    =       532
                                                  Replications     =       500
                                                Number of obs      =      5320
                                                Number of groups   =       532
                                                Obs per group: min =        10
                                                LR chi2(1)         =     76.14
Log likelihood  = -266.91155                    Prob > chi2        =    0.0000
------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .1195474   .0137484     8.70   0.000      .092601    .1464938
       _cons |   7.345479   .0366973   200.16   0.000     7.273554    7.417404
-------------+----------------------------------------------------------------
    /sigma_u |    .162175   .0060469    26.82   0.000     .1503233    .1740266
    /sigma_e |   .2329172   .0023819    97.79   0.000     .2282488    .2375856
-------------+----------------------------------------------------------------
         rho |   .3265097    .017266                      .2934209    .3610233
------------------------------------------------------------------------------
Likelihood-ratio test of sigma_u=0: chibar2(01)= 1147.08 Prob>=chibar2 = 0.000
. estimates store remleiid
.
. * Correct panel robust standard errors
. * Stata has no option for this. See ch21panel2.do
.
. * Correct panel bootstrap standard errors
. set seed 10001
. bootstrap "xtreg lnhr lnwg, mle i(id)" "_b[lnwg] _b[_cons]", cluster(id) reps($nreps) level(95)
command:      xtreg lnhr lnwg , mle i(id)
statistics:   _bs_1      = _b[lnwg]
              _bs_2      = _b[_cons]
Bootstrap statistics                              Number of obs    =      5320
                                                  N of clusters    =       532
                                                  Replications     =       500
.
. * Population averaged is similar to re (gives similar to mle version of re)
. * Exactly same as xtgee, i(id)
. xtreg lnhr lnwg, pa i(id)
Iteration 1: tolerance = .03364039
Iteration 2: tolerance = .00033468
Iteration 3: tolerance = 4.733e-06
Iteration 4: tolerance = 6.715e-08
GEE population-averaged model                   Number of obs      =      5320
Group variable:                         id      Number of groups   =       532
Link:                             identity      Obs per group: min =        10
Family:                           Gaussian                     avg =      10.0
Correlation:                  exchangeable                     max =        10
                                                Wald chi2(1)       =     76.70
Scale parameter:                  .0805511      Prob > chi2        =    0.0000
------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .1195474   .0136507     8.76   0.000     .0927925    .1463023
       _cons |   7.345479   .0364481   201.53   0.000     7.274042    7.416916
------------------------------------------------------------------------------
. estimates store paiid
.
. *** (7) HAUSMAN TEST (NOT ROBUST)
.
. * Hausman test of fixed versus random effects
. * The FE estimates are saved in feiid
. * The RE estimates are saved in reglsiid
.
. * From Section 21.4.3 pages 717-9 this usual implementation of the Hausman test
. * is invalid if there is any intracluster correlation left in the RE model
. * as then the RE estimator is no longer fully efficient
. * so Var[b_RE - b_FE] does not equal Var[b_FE] - Var[b_RE]
.
. * Following is not valid - see MMA21P2PANMANUAL.DO for robust version
. hausman feiid reglsiid
                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |     feiid       reglsiid      Difference          S.E.
-------------+----------------------------------------------------------------
        lnwg |    .1676755     .1193322        .0483432        .0130486
------------------------------------------------------------------------------
b = consistent under Ho and Ha; obtained from xtreg
B = inconsistent under Ha, efficient under Ho; obtained from xtreg
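With a single coefficient, the standard (non-robust) Hausman statistic reduces to the squared difference of the FE and RE estimates divided by the difference of their estimated variances. A minimal Python sketch of that arithmetic, using the coefficients and default standard errors stored in feiid and reglsiid, is given here as a cross-check only. As the comments in this program note, this form is not valid if intracluster correlation remains in the RE model. The p-value line assumes a chi-squared distribution with one degree of freedom.

```python
import math

# Coefficient on lnwg and its default standard error from each model.
b_fe, se_fe = 0.1676755, 0.01887      # feiid  (xtreg, fe)
b_re, se_re = 0.1193322, 0.0136312    # reglsiid (xtreg, re)

diff = b_fe - b_re
# Valid only if RE is fully efficient: Var[b_FE - b_RE] = Var[b_FE] - Var[b_RE].
var_diff = se_fe ** 2 - se_re ** 2
se_diff = math.sqrt(var_diff)
hausman = diff ** 2 / var_diff

# Upper-tail probability of chi-squared(1): P(Z^2 > h) = erfc(sqrt(h / 2)).
p_value = math.erfc(math.sqrt(hausman / 2.0))

print(round(diff, 7), round(se_diff, 7), round(hausman, 2), p_value)
```

The difference and its standard error agree with the (b-B) and S.E. columns above up to rounding of the displayed inputs.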
| 0.037
-------------+----------------------------------------------------------------
sigma_u |
_cons | 0.162
| 0.006
-------------+----------------------------------------------------------------
sigma_e |
_cons | 0.233
| 0.002
-------------+----------------------------------------------------------------
Statistics |
N | 4788.000 4788.000 4788.000 5320.000 5320.000
ll | -956.059 -956.059 -956.059 -266.912
r2 | 0.005 0.005 0.005
tss |
rss | 417.944 417.944 417.944
mss | 2.279 2.279 2.279
rmse | 0.296 0.296 0.296
df_r | 4786.000 4786.000 531.000
------------------------------------------------------------------------------
legend: b/se
. estimates table paiid, se stats(N ll r2 tss rss mss rmse df_r) b(%10.3f)
--------------------------
    Variable |   paiid
-------------+------------
        lnwg |      0.120
             |      0.014
       _cons |      7.345
             |      0.036
-------------+------------
           N |   5320.000
          ll |
          r2 |
         tss |
         rss |
         mss |
        rmse |
        df_r |
--------------------------
             legend: b/se
.
. * Standard errors using panel bootstrap (regular bootstrap for between)
. matrix list polsbootse
polsbootse[1,2]
        _bs_1      _bs_2
se  .02983953   .0805676
.
. ********** DATA DESCRIPTION **********
.
. * The original data is from
. * Jim Ziliak (1997)
. * "Efficient Estimation With Panel Data when Instruments are Predetermined:
. * An Empirical Comparison of Moment-Condition Estimators"
. * Journal of Business and Economic Statistics, 15, 419-431
.
. * File MOM.dat has data on 532 men over 10 years (1979-1988)
. * Data are space-delimited ordered by person with separate line for each year
. * So id 1 1979, id 1 1980, ..., id 1 1988, id 2 1979, id 2 1980, ...
. * 8 variables:
. * lnhr lnwg kids ageh agesq disab id year
.
. * File MOM.dat is the version of the data posted at the JBES website
. * Note that in chapter 22 we instead use MOMprecise.dat
. * which is the same data set but with more significant digits
.
. ********** READ DATA **********
.
. * The data are in ascii file MOM.dat
. * There are 532 individuals with 10 lines (years) per individual
. * Read in using Infile: FREE FORMAT WITHOUT DICTIONARY
. infile lnhr lnwg kids ageh agesq disab id year using MOM.dat
(5320 observations read)
. summarize
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        lnhr |      5320     7.65743    .2855914       2.77       8.56
        lnwg |      5320    2.609436    .4258924       -.26       4.69
        kids |      5320    1.555827    1.195924          0          6
        ageh |      5320    38.91823    8.450351         22         60
       agesq |      5320    1586.024    689.7759        484       3600
-------------+--------------------------------------------------------
       disab |      5320    .0609023    .2391734          0          1
          id |      5320       266.5    153.5893          1        532
        year |      5320      1983.5    2.872551       1979       1988
.
. ********** DEFINE GLOBALS **********
.
. * Number of reps for the bootstrap
. * Table 21.2 used 500
. global nreps 500
.
. ******** RUN REGRESSIONS USING XTREG **********
.
                                                Number of obs      =      5320
                                                Obs per group: min =        10
                                                F(1,530)           =     11.55
sd(u_i + avg(e_i.))=  .1772555                  Prob > F           =    0.0007
------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .0668379   .0196635     3.40   0.001     .0282099    .1054658
       _cons |   7.483021   .0518829   144.23   0.000       7.3811    7.584943
------------------------------------------------------------------------------
. estimates store bextreg
.
. xtreg lnhr lnwg, fe i(id)
Fixed-effects (within) regression               Number of obs      =      5320
Group variable (i): id                          Number of groups   =       532
R-sq:  within  = 0.0162                         Obs per group: min =        10
       between = 0.0213                                        avg =      10.0
       overall = 0.0152                                        max =        10
                                                F(1,4787)          =     78.96
                                                Prob > F           =    0.0000
------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .1676755     .01887     8.89   0.000     .1306816    .2046694
       _cons |   7.219892   .0493434   146.32   0.000     7.123156    7.316628
-------------+----------------------------------------------------------------
     sigma_u |  .18142881
     sigma_e |  .23278339
         rho |  .37789558   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(531, 4787) =     5.83           Prob > F = 0.0000
                                                Number of obs      =      5320
                                                Number of groups   =       532
                                                Obs per group: min =        10
                                                Wald chi2(1)       =     76.64
                                                Prob > chi2        =    0.0000
------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .1193322   .0136312     8.75   0.000     .0926155     .146049
       _cons |   7.346041   .0363925   201.86   0.000     7.274713    7.417368
-------------+----------------------------------------------------------------
     sigma_u |  .16124733
     sigma_e |  .23278339
         rho |  .32424354   (fraction of variance due to u_i)
------------------------------------------------------------------------------
. estimates store reglsxtreg
. scalar sesq = e(sigma_e)^2
. scalar susq = e(sigma_u)^2
. scalar lamdaregls = 1 - sqrt( sesq / (e(Tbar)*susq + sesq) )
. di lamdaregls
.58470925
.
. xtreg lnhr lnwg, mle i(id)
Fitting constant-only model:
Iteration 0: log likelihood = -305.19469
Iteration 1: log likelihood = -304.97993
Iteration 2: log likelihood = -304.97987
Fitting full model:
Iteration 0: log likelihood = -270.51687
Iteration 1: log likelihood = -266.91794
Iteration 2: log likelihood = -266.91155
Random-effects ML regression                    Number of obs      =      5320
                                                Number of groups   =       532
                                                Obs per group: min =        10
                                                LR chi2(1)         =     76.14
Log likelihood  = -266.91155                    Prob > chi2        =    0.0000
------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .1195474   .0137484     8.70   0.000      .092601    .1464938
       _cons |   7.345479   .0366973   200.16   0.000     7.273554    7.417404
-------------+----------------------------------------------------------------
    /sigma_u |    .162175   .0060469    26.82   0.000     .1503233    .1740266
    /sigma_e |   .2329172   .0023819    97.79   0.000     .2282488    .2375856
-------------+----------------------------------------------------------------
         rho |   .3265097    .017266                      .2934209    .3610233
------------------------------------------------------------------------------
Likelihood-ratio test of sigma_u=0: chibar2(01)= 1147.08 Prob>=chibar2 = 0.000
. estimates store remlextreg
. scalar sesq2 = e(sigma_e)^2
. scalar susq2 = e(sigma_u)^2
. scalar lamdaremle = 1 - sqrt( sesq2 / (e(g_avg)*susq2 + sesq2) )
. di lamdaremle
.58648101
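The ML output above reports both the variance components and `rho`, the fraction of the error variance due to the individual effect. A short Python sketch, offered only as a cross-check, recomputes `rho` from `/sigma_u` and `/sigma_e` and then writes the same lambda in terms of `rho`, using the algebraic identity lambda = 1 - sqrt((1 - rho) / (1 + (T - 1)*rho)). T = 10 is an assumption based on the balanced panel; the variable names are illustrative.

```python
import math

# ML estimates of the variance components, copied from the xtreg, mle output above.
sigma_u = 0.162175
sigma_e = 0.2329172
T = 10  # assumed periods per individual (5320 obs / 532 groups)

# Intraclass correlation: share of the error variance due to the individual effect.
rho = sigma_u ** 2 / (sigma_u ** 2 + sigma_e ** 2)

# The same quasi-differencing weight expressed through rho alone.
lamda_from_rho = 1.0 - math.sqrt((1.0 - rho) / (1.0 + (T - 1) * rho))

print(round(rho, 5), round(lamda_from_rho, 5))
```

Both numbers should match the `rho` row of the table and the displayed `lamdaremle` up to rounding of the printed inputs.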
.
. ******** ANALYSIS: FE, RE and FD ESTIMATORS CALCULATED MANUALLY **********
.
. *** FIRST TRANSFORM DATA FROM LONG FORM TO WIDE FORM
.
. * Here just do this for lnhr and lnwg
. keep lnhr lnwg id year
. reshape wide lnhr lnwg, i(id) j(year)
(note: j = 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988)
Data                               long   ->   wide
-----------------------------------------------------------------------------
Number of obs.                     5320   ->     532
Number of variables                   4   ->      21
j variable (10 values)             year   ->   (dropped)
xij variables:
.
. * Should replicate xtreg, be
. regress avelnhr avelnwg
      Source |       SS       df       MS              Number of obs =     532
-------------+------------------------------           F(  1,   530) =   11.55
       Model |  .363013807     1  .363013807           Prob > F      =  0.0007
    Residual |  16.6523404   530   .03141951           R-squared     =  0.0213
-------------+------------------------------           Adj R-squared =  0.0195
       Total |  17.0153542   531  .032043982           Root MSE      =  .17726

------------------------------------------------------------------------------
     avelnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     avelnwg |   .0668379   .0196635     3.40   0.001     .0282099    .1054658
       _cons |   7.483021   .0518829   144.23   0.000       7.3811    7.584943
------------------------------------------------------------------------------
. estimates store bebyols
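The regression above is the between estimator: OLS of each individual's time-averaged `lnhr` on the time-averaged `lnwg`, with one observation per individual (532 here). The mechanics can be sketched in plain Python on a small made-up panel. The data and the helper names `group_means` and `ols_slope_intercept` are hypothetical and only illustrate the two steps (average within individual, then run OLS on the averages).

```python
from collections import defaultdict

def group_means(ids, values):
    """Return {id: mean of that individual's values}."""
    sums, counts = defaultdict(float), defaultdict(int)
    for i, v in zip(ids, values):
        sums[i] += v
        counts[i] += 1
    return {i: sums[i] / counts[i] for i in sums}

def ols_slope_intercept(x, y):
    """Bivariate OLS of y on x with an intercept."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    slope = sxy / sxx
    return slope, ybar - slope * xbar

# Hypothetical toy panel: 3 individuals observed for 2 periods each.
ids = [1, 1, 2, 2, 3, 3]
x = [1.0, 3.0, 2.0, 4.0, 4.0, 6.0]
y = [2.0, 4.0, 5.0, 5.0, 8.0, 10.0]

xbar_i = group_means(ids, x)
ybar_i = group_means(ids, y)
keys = sorted(xbar_i)

# Between estimator: OLS on one averaged observation per individual.
be_slope, be_cons = ols_slope_intercept([xbar_i[k] for k in keys],
                                        [ybar_i[k] for k in keys])
print(be_slope, be_cons)
```

Because only cross-individual variation in the means is used, the between slope can differ sharply from the within (fixed effects) slope computed from the same data.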
.
. * The following is better as it gives heteroskedasticity-robust standard errors
. regress avelnhr avelnwg, robust
Regression with robust standard errors                 Number of obs =     532
                                                       F(  1,   530) =    7.55
                                                       Prob > F      =  0.0062
                                                       R-squared     =  0.0213
                                                       Root MSE      =  .17726

------------------------------------------------------------------------------
             |               Robust
     avelnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     avelnwg |   .0668379   .0243185     2.75   0.006     .0190654    .1146103
       _cons |   7.483021   .0657699   113.78   0.000      7.35382    7.612223
------------------------------------------------------------------------------
. estimates store behet
.
. * Or could bootstrap
. bootstrap "regress avelnhr avelnwg" "_b[avelnwg] _b[_cons]", reps(200) level(95)
command:      regress avelnhr avelnwg
statistics:   _bs_1      = _b[avelnwg]
              _bs_2      = _b[_cons]

Bootstrap statistics                              Number of obs    =       532
                                                  Replications     =       200
Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                      532   ->    5320
Number of variables                  85   ->      14
j variable (10 values)                    ->   year
xij variables:
              lnhr1979 lnhr1980 ... lnhr1988   ->   lnhr
              lnwg1979 lnwg1980 ... lnwg1988   ->   lnwg
        mdlnhr1979 mdlnhr1980 ... mdlnhr1988   ->   mdlnhr
        mdlnwg1979 mdlnwg1980 ... mdlnwg1988   ->   mdlnwg
reglsdlnhr1979 reglsdlnhr1980 ... reglsdlnhr1988   ->   reglsdlnhr
reglsdlnwg1979 reglsdlnwg1980 ... reglsdlnwg1988   ->   reglsdlnwg
remledlnhr1979 remledlnhr1980 ... remledlnhr1988   ->   remledlnhr
remledlnwg1979 remledlnwg1980 ... remledlnwg1988   ->   remledlnwg
-----------------------------------------------------------------------------
.
. describe
Contains data
  obs:         5,320
 vars:            14
 size:       276,640 (97.2% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
id              float  %9.0g
year            int    %9.0g
lnhr            float  %9.0g
lnwg            float  %9.0g
avelnhr         float  %9.0g
avelnwg         float  %9.0g
_est_bebyols    byte   %8.0g                  esample() from estimates store
_est_behet      byte   %8.0g                  esample() from estimates store
mdlnhr          float  %9.0g
mdlnwg          float  %9.0g
reglsdlnhr      float  %9.0g
reglsdlnwg      float  %9.0g
remledlnhr      float  %9.0g
remledlnwg      float  %9.0g
-------------------------------------------------------------------------------
Sorted by:  id  year
     Note:  dataset has changed since last saved
. summarize
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
          id |      5320       266.5    153.5893           1        532
        year |      5320      1983.5    2.872551        1979       1988
        lnhr |      5320     7.65743    .2855914        2.77       8.56
        lnwg |      5320    2.609436    .4258924        -.26       4.69
     avelnhr |      5320     7.65743    .1788568       6.416      8.242
-------------+---------------------------------------------------------
     avelnwg |      5320    2.609436    .3908626       1.346      4.543
_est_bebyols |      5320           1           0           1          1
  _est_behet |      5320           1           0           1          1
      mdlnhr |      5320   -1.21e-09    .2226492      -3.988      1.344
      mdlnwg |      5320   -9.86e-10    .1691472       -2.54      1.878
-------------+---------------------------------------------------------
  reglsdlnhr |      5320     3.18006    .2347122   -1.181465   4.008506
  reglsdlnwg |      5320    1.083675    .2344336   -1.593137   2.966892
  remledlnhr |      5320    3.166493    .2346121   -1.193439   3.997138
  remledlnwg |      5320    1.079051    .2339546   -1.597177   2.962247
. save MOM2, replace
file MOM2.dta saved
.
. *** (4) FIXED EFFECTS ESTIMATOR USING DIFFERENCED DATA
.
. * This should replicate xtreg, fe
. regress mdlnhr mdlnwg
      Source |       SS       df       MS              Number of obs =    5320
-------------+------------------------------           F(  1,  5318) =   87.72
       Model |  4.27857391     1  4.27857391           Prob > F      =  0.0000
    Residual |   259.39846  5318  .048777446           R-squared     =  0.0162
-------------+------------------------------           Adj R-squared =  0.0160
       Total |  263.677034  5319   .04957267           Root MSE      =  .22086

------------------------------------------------------------------------------
      mdlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      mdlnwg |   .1676755   .0179032     9.37   0.000      .132578     .202773
       _cons |  -1.04e-09    .003028    -0.00   1.000    -.0059361    .0059361
------------------------------------------------------------------------------
. estimates store febyols
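OLS on the mean-differenced data reproduces the `xtreg, fe` slope (.1676755) but not its default standard error. `regress` divides the residual sum of squares by NT - 2 = 5318, whereas the within estimator uses NT - N - 1 = 4787 degrees of freedom to account for the 532 estimated individual means. The arithmetic below is a sketch of that rescaling using only numbers printed in this log; the variable names are illustrative.

```python
import math

se_ols = 0.0179032   # Std. Err. of mdlnwg from regress mdlnhr mdlnwg
df_ols = 5318        # residual df used by regress (5320 - 2)

n_obs, n_groups, n_slopes = 5320, 532, 1
df_within = n_obs - n_groups - n_slopes   # df used by the within estimator

# Rescale the OLS standard error to the within-estimator degrees of freedom.
se_within = se_ols * math.sqrt(df_ols / df_within)
print(df_within, round(se_within, 5))
```

The rescaled value can be compared with the standard error that `xtreg lnhr lnwg, fe` reports for `lnwg` elsewhere in this log (.01887).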
.
. * This gives panel corrected standard errors
. regress mdlnhr mdlnwg, cluster(id)
Regression with robust standard errors                 Number of obs =    5320
                                                       F(  1,   531) =    3.89
                                                       Prob > F      =  0.0490
                                                       R-squared     =  0.0162
Number of clusters (id) = 532                          Root MSE      =  .22086

------------------------------------------------------------------------------
             |               Robust
      mdlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      mdlnwg |   .1676755   .0849706     1.97   0.049     .0007557    .3345953
       _cons |  -1.04e-09   6.39e-09    -0.16   0.870    -1.36e-08    1.15e-08
------------------------------------------------------------------------------
. estimates store fepanel
.
. * This gives panel bootstrap standard errors
. * Similar to bootstrap applied to xtreg, fe
. set seed 10001
. bs "regress mdlnhr mdlnwg" "_b[mdlnwg] _b[_cons]", cluster(id) reps($nreps) level(95)
command:      regress mdlnhr mdlnwg
statistics:   _bs_1      = _b[mdlnwg]
              _bs_2      = _b[_cons]

Bootstrap statistics                              Number of obs    =      5320
                                                  N of clusters    =       532
                                                  Replications     =       500

                                                       Number of obs =    5320
------------------------------------------------------------------------------
             |               Robust
      mdlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      mdlnwg |   .1676755   .0600942     2.79   0.005     .0498662    .2854848
       _cons |  -1.04e-09    .003028    -0.00   1.000    -.0059361    .0059361
------------------------------------------------------------------------------
. estimates store fehet
.
. *** (5) RANDOM EFFECTS - GLS ESTIMATOR USING DIFFERENCED DATA
.
. * Should give same coefficient estimates as xtreg
. * May give different standard errors as treats lamda as known
. * but in practice the difference is not great as lamda is precisely estimated
.
. * This should replicate xtreg, re
. regress reglsdlnhr reglsdlnwg
      Source |       SS       df       MS              Number of obs =    5320
-------------+------------------------------           F(  1,  5318) =   76.64
       Model |  4.16279701     1  4.16279701           Prob > F      =  0.0000
    Residual |  288.860014  5318  .054317415           R-squared     =  0.0142
-------------+------------------------------           Adj R-squared =  0.0140
       Total |  293.022811  5319  .055089831           Root MSE      =  .23306

------------------------------------------------------------------------------
  reglsdlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  reglsdlnwg |   .1193323   .0136312     8.75   0.000     .0926095     .146055
       _cons |   3.050743   .0151135   201.86   0.000     3.021114    3.080371
------------------------------------------------------------------------------
. estimates store reglsbyols
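The slope above matches `xtreg, re`, but the constant (3.050743) does not match the `xtreg, re` intercept (7.346041). In the quasi-differenced model y_it - lambda*ybar_i = (1 - lambda)*alpha + (x_it - lambda*xbar_i)*beta + error, the intercept regressor is (1 - lambda) rather than 1, so `regress` with an ordinary constant estimates (1 - lambda)*alpha. The Python lines below are a small check of that scaling, using only values printed in this log; the variable names are illustrative.

```python
lamda = 0.58470925          # value displayed by: di lamdaregls
cons_xtreg = 7.346041       # _cons from xtreg lnhr lnwg, re
se_cons_xtreg = 0.0363925   # its standard error

# Constant and standard error implied for OLS on the transformed data.
cons_ols = (1.0 - lamda) * cons_xtreg
se_cons_ols = (1.0 - lamda) * se_cons_xtreg

print(round(cons_ols, 6), round(se_cons_ols, 7))
```

Both values can be compared with the `_cons` row of the regression on `reglsdlnhr` above; the z and t ratios (201.86) coincide because coefficient and standard error are scaled by the same factor.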
.
. * This gives panel corrected standard errors
. regress reglsdlnhr reglsdlnwg, cluster(id)
Regression with robust standard errors                 Number of obs =    5320
                                                       F(  1,   531) =    5.39
                                                       Prob > F      =  0.0206
                                                       R-squared     =  0.0142
Number of clusters (id) = 532                          Root MSE      =  .23306

------------------------------------------------------------------------------
             |               Robust
  reglsdlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  reglsdlnwg |   .1193323   .0514016     2.32   0.021     .0183568    .2203077

                                                  Number of obs    =      5320
                                                  N of clusters    =       532
                                                  Replications     =       500

                                                       Number of obs =    5320
------------------------------------------------------------------------------
             |               Robust
  reglsdlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  reglsdlnwg |   .1193323   .0426897     2.80   0.005      .035643    .2030215
.
. * This gives panel bootstrap standard errors
. * Similar to bootstrap applied to xtreg, fe
. set seed 10001
. bs "regress remledlnhr remledlnwg" "_b[remledlnwg] _b[_cons]", cluster(id) reps($nreps) level(95)
command:      regress remledlnhr remledlnwg
statistics:   _bs_1      = _b[remledlnwg]
              _bs_2      = _b[_cons]

Bootstrap statistics                              Number of obs    =      5320
                                                  N of clusters    =       532
                                                  Replications     =       500

                                                       Number of obs =    5320
------------------------------------------------------------------------------
             |               Robust
  reglsdlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  reglsdlnwg |   .1193323   .0426897     2.80   0.005      .035643    .2030215
       _cons |   3.050743    .047821    63.80   0.000     2.956994    3.144491
------------------------------------------------------------------------------
. estimates store remlehet
.
. *** (7) ROBUST VARIANT OF HAUSMAN TEST
.
. * From Section 21.4.3 pages 717-9 the usual implementation of the Hausman test
. * is invalid if there is any intracluster correlation left in the RE model
. * as then the RE estimator is no longer fully efficient
. * so Var[b_RE - b_FE] does not equal Var[b_FE] - V[b_RE]
.
. * (7A) Nonrobust version of Hausman test by auxiliary regression
. * [will be similar to nonrobust version in mma21p1panfeandre.do]
. regress reglsdlnhr reglsdlnwg mdlnwg
      Source |       SS       df       MS              Number of obs =    5320
-------------+------------------------------           F(  2,  5317) =   45.26
       Model |  4.90465081     2  2.45232541           Prob > F      =  0.0000
    Residual |   288.11816  5317  .054188106           R-squared     =  0.0167
-------------+------------------------------           Adj R-squared =  0.0164
       Total |  293.022811  5319  .055089831           Root MSE      =  .23278

------------------------------------------------------------------------------
  reglsdlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  reglsdlnwg |   .0668379   .0196635     3.40   0.001     .0282893    .1053864
      mdlnwg |   .1008376   .0272531     3.70   0.000     .0474104    .1542648
       _cons |    3.10763   .0215465   144.23   0.000      3.06539     3.14987
------------------------------------------------------------------------------
. scalar Hnonrobust = (_b[mdlnwg]/_se[mdlnwg])^2
. di Hnonrobust
13.690344
.
. * Perform preferred valid robust version of Hausman test
. * This gives the results presented on p.719
. regress reglsdlnhr reglsdlnwg mdlnwg, cluster(id)
Regression with robust standard errors                 Number of obs =    5320
                                                       F(  2,   531) =    4.24
                                                       Prob > F      =  0.0149
                                                       R-squared     =  0.0167
Number of clusters (id) = 532                          Root MSE      =  .23278

------------------------------------------------------------------------------
             |               Robust
  reglsdlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  reglsdlnwg |   .0668379   .0243001     2.75   0.006     .0191016    .1145741
      mdlnwg |   .1008376   .0785137     1.28   0.200     -.053398    .2550732
       _cons |    3.10763    .027293   113.86   0.000     3.054014    3.161245
-------------+----------------------------------------------------
sigma_e      |
       _cons |     0.233
             |     0.002
-------------+----------------------------------------------------
_            |
  remledlnwg |                 0.120                   0.120
             |                 0.014                   0.052
  reglsdlnwg |                             0.119
             |                             0.043
       _cons |                 3.037       3.051       3.037
             |                 0.015       0.048       0.057
-------------+----------------------------------------------------
Statistics   |
           N |  5320.000    5320.000    5320.000    5320.000
          ll |  -266.912     202.872     200.589     202.872
          r2 |                 0.014       0.014       0.014
         tss |
         rss |               288.612     288.860     288.612
         mss |                 4.161       4.163       4.161
        rmse |                 0.233       0.233       0.233
        df_r |              5318.000    5318.000     531.000
------------------------------------------------------------------
                                                      legend: b/se
.
. * The following are (panel) bootstrap standard errors
. matrix list bebootse
bebootse[1,2]
        _bs_1      _bs_2
se  .02394857  .06483965
. matrix list febootse
febootse[1,2]
        _bs_1      _bs_2
se  .08446309  6.497e-09
. * Note that the following two differ from mma21p1panfeandre.do
. * as here the same value of lamda is used throughout the bootstraps
. matrix list remlebootse
remlebootse[1,2]
        _bs_1      _bs_2
se  .05181879  .05710419
. matrix list reglsbootse
reglsbootse[1,2]
        _bs_1      _bs_2
se  .05167569  .05719414
.
. * For completeness give lamda
. di lamdaregls
.58470925
. di lamdaremle
.58648101
.
. * Robust and nonrobust versions of Hausman test given on p.719
. di Hnonrobust /* Not valid if intracluster correlation */
13.690344
. di Hrobust
1.6495074
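Both Hausman statistics are the squared t-ratio on `mdlnwg` in the auxiliary regression of `reglsdlnhr` on `reglsdlnwg` and `mdlnwg`: the nonrobust version uses the default standard error and the robust version, as the displayed value suggests, uses the `cluster(id)` standard error. The Python sketch below recomputes both from the printed coefficient tables and attaches p-values under an assumed chi-squared(1) reference distribution (one tested coefficient). The helper `chi2_1_pvalue` is illustrative and not part of the original program.

```python
import math

def chi2_1_pvalue(stat):
    """Upper-tail probability of a chi-squared(1) variate, P(Z^2 > stat)."""
    return math.erfc(math.sqrt(stat / 2.0))

b_mdlnwg = 0.1008376     # coefficient on mdlnwg in the auxiliary regression
se_default = 0.0272531   # default OLS standard error
se_cluster = 0.0785137   # cluster(id) standard error

H_nonrobust = (b_mdlnwg / se_default) ** 2
H_robust = (b_mdlnwg / se_cluster) ** 2

p_nonrobust = chi2_1_pvalue(H_nonrobust)
p_robust = chi2_1_pvalue(H_robust)

print(round(H_nonrobust, 3), round(H_robust, 3))
print(p_nonrobust < 0.05, p_robust < 0.05)
```

The two versions lead to opposite conclusions at the 5 percent level, which is why the nonrobust statistic is flagged as invalid when intracluster correlation remains in the RE model.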
.
. ********** CLOSE OUTPUT
. log close
       log:  c:\Imbook\bwebpage\Section5\mma21p2panmanual.txt
  log type:  text
 closed on:  23 May 2005, 11:35:55
------------------------------------------------------------------------------
       log:  c:\Imbook\bwebpage\Section5\mma21p2panresiduals.txt
  log type:  text
 opened on:  23 May 2005, 11:37:22
.
. ********** OVERVIEW OF MMA21P3PANRESIDUALS.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 21.3.4 pages 713-15 Residual analysis
. * This program
. * (1) estimates correlations for
. * - dependent variable
. * - regressors variable
. * - residuals from pooled ols [Table 21.3]
. * - residuals from within estimation [Table 21.4]
. * - residuals from random effects estimation
. * (2) separately estimates correlations for
. * - residuals from first differences estimation
. * (3) gets correlations for each individual observation
.
. summarize
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
        lnhr |      5320     7.65743    .2855914        2.77       8.56
        lnwg |      5320    2.609436    .4258924        -.26       4.69
        kids |      5320    1.555827    1.195924           0          6
        ageh |      5320    38.91823    8.450351          22         60
       agesq |      5320    1586.024    689.7759         484       3600
-------------+---------------------------------------------------------
       disab |      5320    .0609023    .2391734           0          1
          id |      5320       266.5    153.5893           1        532
        year |      5320      1983.5    2.872551        1979       1988
.
. ************ (1) ANALYSIS: OBTAIN KEY AUTOCORRELATIONS Tables 21.3, 21.4 **********
.
. ** RUN REGRESSIONS AND GET RESIDUALS OF INTEREST
.
. * pooled ols
. regress lnhr lnwg
      Source |       SS       df       MS              Number of obs =    5320
-------------+------------------------------           F(  1,  5318) =   82.22
       Model |  6.60538417     1  6.60538417           Prob > F      =  0.0000
    Residual |  427.225206  5318  .080335691           R-squared     =  0.0152
-------------+------------------------------           Adj R-squared =  0.0150
       Total |  433.830591  5319  .081562435           Root MSE      =  .28344

------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .0827436   .0091251     9.07   0.000     .0648545    .1006326
       _cons |   7.441516   .0241265   308.44   0.000     7.394219    7.488814
------------------------------------------------------------------------------
. predict upols, residuals
.
. * fixed effects (within)
. xtreg lnhr lnwg, fe i(id)
Fixed-effects (within) regression               Number of obs      =      5320
Group variable (i): id                          Number of groups   =       532

R-sq:  within  = 0.0162                         Obs per group: min =        10
       between = 0.0213                                        avg =      10.0
       overall = 0.0152                                        max =        10

                                                F(1,4787)          =     78.96
                                                Prob > F           =    0.0000
------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .1676755     .01887     8.89   0.000     .1306816    .2046694
       _cons |   7.219892   .0493434   146.32   0.000     7.123156    7.316628
-------------+----------------------------------------------------------------
     sigma_u |  .18142881
     sigma_e |  .23278339
         rho |  .37789558   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(531, 4787) =     5.83           Prob > F = 0.0000
. predict ufe, e
.
. * random effects
. xtreg lnhr lnwg, re i(id)
Random-effects GLS regression                   Number of obs      =      5320
Group variable (i): id                          Number of groups   =       532

R-sq:  within  = 0.0162                         Obs per group      =        10
       between = 0.0213
       overall = 0.0152

Random effects u_i ~ Gaussian                   Wald chi2(1)       =     76.64
corr(u_i, X)       = 0 (assumed)                Prob > chi2        =    0.0000
------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .1193322   .0136312     8.75   0.000     .0926155     .146049
       _cons |   7.346041   .0363925   201.86   0.000     7.274713    7.417368
-------------+----------------------------------------------------------------
     sigma_u |  .16124733
     sigma_e |  .23278339
         rho |  .32424354   (fraction of variance due to u_i)
------------------------------------------------------------------------------
. predict ure, e
.
. summarize upols ufe ure
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
       upols |      5320   -1.27e-10    .2834089   -4.826247    .964581
         ufe |      5320   -5.52e-11    .2208354   -4.003929     1.2719
         ure |      5320   -9.00e-11    .2231118   -4.131111   1.085362

-------------+---------------------------------------------------------
     ure1981 |       532    .0100382    .1596593    -1.02491   .8517824
    lnhr1982 |       532     7.64609    .2427195        5.38       8.31
    lnwg1982 |       532     2.61468    .4014363        1.21       4.61
   upols1982 |       532   -.0117742    .2422735   -2.264238   .6897579
     ufe1982 |       532   -.0122196    .1890237   -1.623214   .7918997
-------------+---------------------------------------------------------
     ure1982 |       532   -.0119661    .1875585   -1.737484   .6666697
    lnhr1983 |       532    7.613064     .382703        2.77       8.37
    lnwg1983 |       532    2.610526    .4111869        1.08       4.62
   upols1983 |       532   -.0444568    .3778255   -4.826247   .7307264
     ufe1983 |       532   -.0445494    .2836351   -3.577253   .5196197
-------------+---------------------------------------------------------
     ure1983 |       532   -.0444967     .294545   -3.804399   .5078294
    lnhr1984 |       532    7.636523    .3316735        3.18       8.44
    lnwg1984 |       532    2.600188    .4621549        -.26       4.65
   upols1984 |       532   -.0201427    .3208512   -4.240003   .8263766
     ufe1984 |       532   -.0193572     .225836   -2.810104   .8327778
-------------+---------------------------------------------------------
     ure1984 |       532   -.0198043    .2378605   -3.140221   .7036628
    lnhr1985 |       532    7.668365    .2597423        5.08       8.54
    lnwg1985 |       532    2.614944    .4347554        1.33       4.69
   upols1985 |       532    .0104785     .259051   -2.503835   .8624523
     ufe1985 |       532    .0100107    .1856724   -1.581894   .7944546
-------------+---------------------------------------------------------
     ure1985 |       532     .010277    .1886509   -1.752727   .7370209
    lnhr1986 |       532    7.659286    .3330862        2.77       8.38
    lnwg1986 |       532    2.602632    .4432807         .07       4.59
   upols1986 |       532    .0024183    .3312105   -4.801424   .7439653
     ufe1986 |       532    .0029962    .2595405   -4.003929   .6384854
-------------+---------------------------------------------------------
     ure1986 |       532    .0026673     .264328   -4.131111   .5111209
    lnhr1987 |       532     7.67406    .2745015        4.38       8.56
    lnwg1987 |       532    2.614699    .4300122        1.28       4.03
   upols1987 |       532    .0161942    .2749153   -3.283269    .964581
     ufe1987 |       532    .0157472    .2141618   -2.817174   1.009662
-------------+---------------------------------------------------------
     ure1987 |       532    .0160016    .2148092   -2.897725   .8441463
    lnhr1988 |       532    7.679831    .2552894        4.79       8.53
    lnwg1988 |       532    2.625602    .4701759        -.22        4.6
   upols1988 |       532    .0210628    .2519891   -2.633313   .9072749
     ufe1988 |       532    .0196898    .2048927    -1.68379   1.123516
-------------+---------------------------------------------------------
     ure1988 |       532    .0204713    .2022375   -1.897506   .9393954
.
. ** OBTAIN THE VARIOUS CORRELATIONS
.
. corr lnhr1979 lnhr1980 lnhr1981 lnhr1982 lnhr1983 lnhr1984 lnhr1985 lnhr1986 lnhr1987 lnhr1988
(obs=532)

             | upo~1979 upo~1980 upo~1981 upo~1982 upo~1983 upo~1984 upo~1985 upo~1986 upo~1987
-------------+---------------------------------------------------------------------------------
   upols1979 |   1.0000
   upols1980 |   0.3283   1.0000
   upols1981 |   0.4442   0.4035   1.0000
   upols1982 |   0.3008   0.3140   0.5678   1.0000
   upols1983 |   0.2089   0.2298   0.3739   0.4684   1.0000
   upols1984 |   0.2025   0.2289   0.3194   0.3360   0.6398   1.0000
   upols1985 |   0.2395   0.3246   0.4087   0.3484   0.3898   0.5800   1.0000
   upols1986 |   0.1987   0.1903   0.2797   0.2470   0.3109   0.3535   0.3991   1.0000
   upols1987 |   0.2091   0.3167   0.3340   0.2877   0.3097   0.3361   0.3941   0.3496   1.0000
   upols1988 |   0.1619   0.2456   0.3016   0.2582   0.2083   0.2470   0.3436   0.5545   0.5242

             | upo~1988
-------------+---------
   upols1988 |   1.0000
. corr ure1979 ure1980 ure1981 ure1982 ure1983 ure1984 ure1985 ure1986 ure1987 ure1988
(obs=532)
             |  ure1979  ure1980  ure1981  ure1982  ure1983  ure1984  ure1985  ure1986  ure1987
-------------+---------------------------------------------------------------------------------
     ure1979 |   1.0000
     ure1980 |   0.0778   1.0000
     ure1981 |   0.1777   0.0604   1.0000
     ure1982 |  -0.0250  -0.0519   0.2492   1.0000
     ure1983 |  -0.2339  -0.2277  -0.1609   0.0587   1.0000
     ure1984 |  -0.2482  -0.2431  -0.2691  -0.1709   0.3795   1.0000
     ure1985 |  -0.1842  -0.0919  -0.1054  -0.1581  -0.0939   0.2197   1.0000
     ure1986 |  -0.1860  -0.2333  -0.2434  -0.2405  -0.1110  -0.0763  -0.0361   1.0000
     ure1987 |  -0.1665  -0.0481  -0.1580  -0.1904  -0.1710  -0.1506  -0.0646  -0.0553   1.0000
     ure1988 |  -0.1960  -0.1251  -0.1646  -0.1949  -0.3265  -0.2786  -0.1221   0.2708   0.2379

             |  ure1988
-------------+---------
     ure1988 |   1.0000

     ufe1987 |  -0.1519  -0.0497  -0.1561  -0.2008  -0.2399  -0.2066  -0.0918  -0.0908   1.0000
     ufe1988 |  -0.1650  -0.1109  -0.1385  -0.1772  -0.3816  -0.3096  -0.1268   0.2420   0.2439

             |  ufe1988
-------------+---------
     ufe1988 |   1.0000
.
. * The following does estimation for just one year
. regress lnhr1979 lnwg1979
      Source |       SS       df       MS              Number of obs =     532
-------------+------------------------------           F(  1,   530) =    0.00
       Model |  .000035507     1  .000035507           Prob > F      =  0.9810
    Residual |  33.0180361   530  .062298181           R-squared     =  0.0000
-------------+------------------------------           Adj R-squared = -0.0019
       Total |  33.0180716   531  .062180926           Root MSE      =   .2496

------------------------------------------------------------------------------
    lnhr1979 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    lnwg1979 |   .0006173   .0258574     0.02   0.981    -.0501783    .0514129
       _cons |   7.667738   .0680375   112.70   0.000     7.534082    7.801395
------------------------------------------------------------------------------
.
. ************ (2) ANALYSIS: OBTAIN AUTOCORRELATIONS FOR FIRST DIFFERENCES
.
. ** SET UP THE DATA
. use mom, clear
. gen dlnhr = lnhr - lnhr[_n-1]
(1 missing value generated)
. gen dlnwg = lnwg - lnwg[_n-1]
(1 missing value generated)
. * The following drops the first year which here is 1979
. drop if year == 1979
(532 observations deleted)
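The two `gen` commands above subtract the previous row without regard to `id`, which is why only one missing value is generated. For every individual after the first, the 1979 "difference" is therefore taken against the previous individual's 1988 value, and `drop if year == 1979` removes exactly those invalid rows. A different way to organise the same step, shown here only as a hedged Python sketch on made-up data, is to difference within each individual so that no cross-individual differences are ever formed. The function `first_differences` and the toy rows are hypothetical, and the sketch assumes no gaps in the years.

```python
def first_differences(rows):
    """rows: iterable of (id, year, value) tuples with no gaps in year.

    Returns {(id, year): value minus the same individual's previous value}.
    The first year of each individual has no lag and is omitted.
    """
    out = {}
    previous = {}
    for pid, year, value in sorted(rows):
        if pid in previous:
            out[(pid, year)] = value - previous[pid]
        previous[pid] = value
    return out

# Hypothetical toy panel: 2 individuals, 3 years each.
rows = [
    (1, 1979, 7.5), (1, 1980, 7.75), (1, 1981, 7.25),
    (2, 1979, 8.0), (2, 1980, 8.5), (2, 1981, 8.25),
]
diffs = first_differences(rows)
print(diffs)
```

With N individuals and T years this leaves N(T - 1) differenced observations, matching the 4788 = 532 x 9 observations used in the first-differences regression in this log.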
. regress dlnhr dlnwg
      Source |       SS       df       MS              Number of obs =    4788
-------------+------------------------------           F(  1,  4786) =   26.09
       Model |  2.27870825     1  2.27870825           Prob > F      =  0.0000
    Residual |  417.943979  4786  .087326364           R-squared     =  0.0054
-------------+------------------------------           Adj R-squared =  0.0052
       Total |  420.222687  4787  .087784142           Root MSE      =  .29551

------------------------------------------------------------------------------
       dlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       dlnwg |   .1089851   .0213351     5.11   0.000     .0671584    .1508118
       _cons |   .0008283   .0042712     0.19   0.846    -.0075452    .0092018
------------------------------------------------------------------------------
. predict ufdiff, residuals
. * Here just do this for lnhr and lnwg and the residuals
. keep dlnhr dlnwg ufdiff id year
. reshape wide dlnhr dlnwg ufdiff, i(id) j(year)
(note: j = 1980 1981 1982 1983 1984 1985 1986 1987 1988)
Data                               long   ->   wide
-----------------------------------------------------------------------------
Number of obs.                     4788   ->     532
Number of variables                   5   ->      28
j variable (9 values)              year   ->   (dropped)
xij variables:
                                  dlnhr   ->   dlnhr1980 dlnhr1981 ... dlnhr1988
                                  dlnwg   ->   dlnwg1980 dlnwg1981 ... dlnwg1988
                                 ufdiff   ->   ufdiff1980 ufdiff1981 ... ufdiff1988
-----------------------------------------------------------------------------
. summarize
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
          id |       532       266.5    153.7194           1        532
   dlnhr1980 |       532   -.0092481    .3023508        -2.5       1.71
   dlnwg1980 |       532    .0046053    .2301879       -2.12       1.05
  ufdiff1980 |       532   -.0105783    .3014161   -2.499738   1.690644
   dlnhr1981 |       532    .0075564    .2668644        -1.2       2.32
-------------+---------------------------------------------------------
   dlnwg1981 |       532    .0085902    .1818033        -.79       1.62
  ufdiff1981 |       532    .0057919    .2669213   -1.145188   2.343149
   dlnhr1982 |       532   -.0215602     .212834       -2.06       1.14
   dlnwg1982 |       532    .0037218    .1755574       -1.17        .74
  ufdiff1982 |       532   -.0227941     .213709   -2.036851   1.135902
-------------+---------------------------------------------------------
   dlnhr1983 |       532   -.0330263    .3413969       -4.51   .9899998
   dlnwg1983 |       532   -.0041541    .1673057        -.88   .6399999
  ufdiff1983 |       532   -.0334019    .3398726   -4.419281   .9780819
   dlnhr1984 |       532    .0234586    .3034213       -2.31       2.57
   dlnwg1984 |       532   -.0103383    .2342514       -2.13        .77
-------------+---------------------------------------------------------
  ufdiff1984 |       532    .0237571    .3004287   -2.168058   2.502691
   dlnhr1985 |       532    .0318421    .2772558       -1.46       3.52
   dlnwg1985 |       532    .0147556    .2371054       -1.33       3.06
  ufdiff1985 |       532    .0294057    .2697542   -1.315878   3.185677
   dlnhr1986 |       532   -.0090789    .3270724       -4.79        1.8
-------------+---------------------------------------------------------
   dlnwg1986 |       532    -.012312    .1804162       -1.83       1.04
  ufdiff1986 |       532   -.0085654    .3299129   -4.796278   1.789363
   dlnhr1987 |       532    .0147744    .3470122       -3.24       4.52
   dlnwg1987 |       532    .0120677    .1845692   -.9400001       1.95
  ufdiff1987 |       532    .0126309    .3494111   -3.243008   4.550777
-------------+---------------------------------------------------------
   dlnhr1988 |       532    .0057707    .2587991        -2.5       2.74
   dlnwg1988 |       532    .0109023     .194813        -1.5       1.22
  ufdiff1988 |       532    .0037542    .2576554   -2.337351   2.739172
.
. ** GET THE CORRELATIONS
. corr dlnhr1980 dlnhr1981 dlnhr1982 dlnhr1983 dlnhr1984 dlnhr1985 dlnhr1986 dlnhr1987 dlnhr1988
(obs=532)

             | dlnhr1~0 dlnhr1~1 dlnhr1~2 dlnhr1~3 dlnhr1~4 dlnhr1~5 dlnhr1~6 dlnhr1~7 dlnhr1~8
-------------+---------------------------------------------------------------------------------
   dlnhr1980 |   1.0000
   dlnhr1981 |  -0.6289   1.0000
   dlnhr1982 |   0.0402  -0.2306   1.0000
   dlnhr1983 |   0.0144  -0.0204  -0.2209   1.0000
   dlnhr1984 |  -0.0001  -0.0570  -0.1410  -0.4495   1.0000
   dlnhr1985 |   0.0393  -0.0320  -0.0827  -0.4035  -0.1969   1.0000
   dlnhr1986 |  -0.0629   0.0322   0.0112   0.0233  -0.1192  -0.2334   1.0000
   dlnhr1987 |   0.0811  -0.0709  -0.0029  -0.0448  -0.0202   0.0093  -0.6231   1.0000
   dlnhr1988 |  -0.0341   0.0461  -0.0082  -0.1020   0.0261   0.0682   0.2486  -0.6064   1.0000
.
. ************ (3) ANALYSIS: CORRELATIONS FOR AN INDIVIDUAL OBSERVATION
.
. * Look at correlations for each individual
.
. ** TRANSFORM DATA FROM LONG FORM TO WIDE FORM FOR INDIVIDUALS
.
. use mom3, replace
. * Here just do this for lnhr and lnwg and the residuals
. keep lnhr lnwg id year
. reshape wide lnhr lnwg, i(year) j(id)
(note: j = 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
33
> 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64
65 6
> 6 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97
98
> 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120
121 122 123
> 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145
146 147 1
> 48 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169
170 171 172
> 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194
195 196 1
> 97 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218
219 220 221
> 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243
244 245 2
> 46 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267
268 269 270
> 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292
293 294 2
> 95 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316
317 318 319
> 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341
342 343 3
> 44 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365
366 367 368
> 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390
391 392 3
> 93 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414
415 416 417
> 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439
440 441 4
> 42 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463
464 465 466
> 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488
489 490 4
> 91 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512
513 514 515
> 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532)
Data                               long   ->   wide
-----------------------------------------------------------------------------
Number of obs.                     5320   ->      10
Number of variables                   4   ->    1065
j variable (532 values)              id   ->   (dropped)
xij variables:
                                   lnhr   ->   lnhr1 lnhr2 ... lnhr532
                                   lnwg   ->   lnwg1 lnwg2 ... lnwg532
-----------------------------------------------------------------------------
. * Note that i and j are reversed
.
. * Since id is 1 to 532 this will create
. * lnhr1 to lnhr532 and lnwg1 to lnwg532
.
. tsset year
time variable: year, 1979 to 1988
.
. * First-order Correlation over T years for the first observation
. corr lnhr1 L.lnhr1
(obs=9)

             |                 L.
             |    lnhr1    lnhr1
-------------+------------------
       lnhr1 |
          -- |   1.0000
          L1 |   0.6378   1.0000
. * And so on
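The `corr lnhr1 L.lnhr1` command above correlates one individual's series with its own first lag, so a series of T = 10 years yields T - 1 = 9 usable pairs, as the `(obs=9)` line shows. The calculation can be sketched in plain Python as follows. The series values are made up for illustration and are not the data for individual 1; `pearson` and `lag1_autocorr` are hypothetical helper names.

```python
import math

def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    syy = sum((b - ybar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def lag1_autocorr(series):
    """Correlation between y_t and y_{t-1}; returns (correlation, number of pairs)."""
    current, lagged = series[1:], series[:-1]
    return pearson(current, lagged), len(current)

# Hypothetical 10-year series for a single individual.
lnhr_one = [7.5, 7.6, 7.4, 7.7, 7.8, 7.6, 7.9, 8.0, 7.8, 8.1]
r1, n_pairs = lag1_autocorr(lnhr_one)
print(n_pairs, round(r1, 4))
```

With only nine pairs per individual such a correlation is very noisy, which is one reason the tables in this section instead pool the 532 individuals year by year.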
.
. ********** CLOSE OUTPUT
. log close
       log:  c:\Imbook\bwebpage\Section5\mma21p2panresiduals.txt
  log type:  text
 closed on:  23 May 2005, 11:37:30
------------------------------------------------------------------------------
       log:  c:\Imbook\bwebpage\Section5\mma21p3panresiduals.txt
  log type:  text
 opened on:  23 May 2005, 13:01:06
.
. ********** OVERVIEW OF MMA21P3PANRESIDUALS.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 21.3.4 pages 713-15 Residual analysis
. * This program
. * (1) estimates correlations for
. * - dependent variable
. * - regressors variable
. * - residuals from pooled ols [Table 21.3]
. * - residuals from within estimation [Table 21.4]
. * - residuals from random effects estimation
. * (2) separately estimates correlations for
. * - residuals from first differences estimation
. * (3) gets correlations for each individual observation
.
. * The code is very limited:
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
        lnhr |      5320     7.65743    .2855914        2.77       8.56
        lnwg |      5320    2.609436    .4258924        -.26       4.69
        kids |      5320    1.555827    1.195924           0          6
        ageh |      5320    38.91823    8.450351          22         60
       agesq |      5320    1586.024    689.7759         484       3600
-------------+---------------------------------------------------------
       disab |      5320    .0609023    .2391734           0          1
          id |      5320       266.5    153.5893           1        532
        year |      5320      1983.5    2.872551        1979       1988
.
. ************ (1) ANALYSIS: OBTAIN KEY AUTOCORRELATIONS Tables 21.3, 21.4 **********
.
. ** RUN REGRESSIONS AND GET RESIDUALS OF INTEREST
.
. * pooled ols
. regress lnhr lnwg
      Source |       SS       df       MS              Number of obs =    5320
-------------+------------------------------           F(  1,  5318) =   82.22
       Model |  6.60538417     1  6.60538417           Prob > F      =  0.0000
    Residual |  427.225206  5318  .080335691           R-squared     =  0.0152
-------------+------------------------------           Adj R-squared =  0.0150
       Total |  433.830591  5319  .081562435           Root MSE      =  .28344

------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .0827436   .0091251     9.07   0.000     .0648545    .1006326
       _cons |   7.441516   .0241265   308.44   0.000     7.394219    7.488814
------------------------------------------------------------------------------
. predict upols, residuals
.
. * fixed effects (within)
. xtreg lnhr lnwg, fe i(id)
Fixed-effects (within) regression               Number of obs      =      5320
Group variable (i): id                          Number of groups   =       532

R-sq:  within  = 0.0162                         Obs per group: min =        10
       between = 0.0213                                        avg =      10.0
       overall = 0.0152                                        max =        10

                                                F(1,4787)          =     78.96
                                                Prob > F           =    0.0000

------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .1676755     .01887     8.89   0.000     .1306816    .2046694
       _cons |   7.219892   .0493434   146.32   0.000     7.123156    7.316628
-------------+----------------------------------------------------------------
     sigma_u |  .18142881
     sigma_e |  .23278339
         rho |  .37789558   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(531, 4787) =     5.83           Prob > F = 0.0000
. predict ufe, e
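The rho line in the xtreg output is the share of the total error variance attributed to the individual effect, sigma_u^2 / (sigma_u^2 + sigma_e^2). A minimal check, using only the sigma_u and sigma_e values printed in the fixed-effects output, might look like the following; the second line assumes it is run directly after xtreg, fe so that the stored results are still in memory.

```stata
* Recompute rho by hand from the reported sigma_u and sigma_e
display .18142881^2 / (.18142881^2 + .23278339^2)

* Same calculation from the stored results (assumes xtreg, fe was the last estimation command)
display e(sigma_u)^2 / (e(sigma_u)^2 + e(sigma_e)^2)
```

Both lines should agree with the reported rho of about .378, up to rounding.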
.
. * random effects
. xtreg lnhr lnwg, re i(id)
Random-effects GLS regression                   Number of obs      =      5320
Group variable (i): id                          Number of groups   =       532

R-sq:  within  = 0.0162                         Obs per group: min =        10
       between = 0.0213                                        avg =      10.0
       overall = 0.0152                                        max =        10

Random effects u_i ~ Gaussian                   Wald chi2(1)       =     76.64
corr(u_i, X)       = 0 (assumed)                Prob > chi2        =    0.0000

------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .1193322   .0136312     8.75   0.000     .0926155     .146049
       _cons |   7.346041   .0363925   201.86   0.000     7.274713    7.417368
-------------+----------------------------------------------------------------
     sigma_u |  .16124733
     sigma_e |  .23278339
         rho |  .32424354   (fraction of variance due to u_i)
------------------------------------------------------------------------------
. predict ure, e
.
. summarize upols ufe ure
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
       upols |      5320   -1.27e-10    .2834089  -4.826247    .964581
         ufe |      5320   -5.52e-11    .2208354  -4.003929     1.2719
         ure |      5320   -9.00e-11    .2231118  -4.131111   1.085362
     ure1981 |       532    .0100382    .1596593   -1.02491   .8517824
    lnhr1982 |       532     7.64609    .2427195       5.38       8.31
    lnwg1982 |       532     2.61468    .4014363       1.21       4.61
   upols1982 |       532   -.0117742    .2422735  -2.264238   .6897579
     ufe1982 |       532   -.0122196    .1890237  -1.623214   .7918997
-------------+---------------------------------------------------------
     ure1982 |       532   -.0119661    .1875585  -1.737484   .6666697
    lnhr1983 |       532    7.613064     .382703       2.77       8.37
    lnwg1983 |       532    2.610526    .4111869       1.08       4.62
   upols1983 |       532   -.0444568    .3778255  -4.826247   .7307264
     ufe1983 |       532   -.0445494    .2836351  -3.577253   .5196197
-------------+---------------------------------------------------------
     ure1983 |       532   -.0444967     .294545  -3.804399   .5078294
    lnhr1984 |       532    7.636523    .3316735       3.18       8.44
    lnwg1984 |       532    2.600188    .4621549       -.26       4.65
   upols1984 |       532   -.0201427    .3208512  -4.240003   .8263766
     ufe1984 |       532   -.0193572     .225836  -2.810104   .8327778
-------------+---------------------------------------------------------
     ure1984 |       532   -.0198043    .2378605  -3.140221   .7036628
    lnhr1985 |       532    7.668365    .2597423       5.08       8.54
    lnwg1985 |       532    2.614944    .4347554       1.33       4.69
   upols1985 |       532    .0104785     .259051  -2.503835   .8624523
     ufe1985 |       532    .0100107    .1856724  -1.581894   .7944546
-------------+---------------------------------------------------------
     ure1985 |       532     .010277    .1886509  -1.752727   .7370209
    lnhr1986 |       532    7.659286    .3330862       2.77       8.38
    lnwg1986 |       532    2.602632    .4432807        .07       4.59
   upols1986 |       532    .0024183    .3312105  -4.801424   .7439653
     ufe1986 |       532    .0029962    .2595405  -4.003929   .6384854
-------------+---------------------------------------------------------
     ure1986 |       532    .0026673     .264328  -4.131111   .5111209
    lnhr1987 |       532     7.67406    .2745015       4.38       8.56
    lnwg1987 |       532    2.614699    .4300122       1.28       4.03
   upols1987 |       532    .0161942    .2749153  -3.283269    .964581
     ufe1987 |       532    .0157472    .2141618  -2.817174   1.009662
-------------+---------------------------------------------------------
     ure1987 |       532    .0160016    .2148092  -2.897725   .8441463
    lnhr1988 |       532    7.679831    .2552894       4.79       8.53
    lnwg1988 |       532    2.625602    .4701759       -.22        4.6
   upols1988 |       532    .0210628    .2519891  -2.633313   .9072749
     ufe1988 |       532    .0196898    .2048927   -1.68379   1.123516
-------------+---------------------------------------------------------
     ure1988 |       532    .0204713    .2022375  -1.897506   .9393954
.
. ** OBTAIN THE VARIOUS CORRELATIONS
.
. corr lnhr1979 lnhr1980 lnhr1981 lnhr1982 lnhr1983 lnhr1984 lnhr1985 lnhr1986 lnhr1987 lnhr1988
(obs=532)
             | upo~1979 upo~1980 upo~1981 upo~1982 upo~1983 upo~1984 upo~1985 upo~1986 upo~1987
-------------+---------------------------------------------------------------------------------
   upols1980 |   0.3283   1.0000
   upols1981 |   0.4442   0.4035   1.0000
   upols1982 |   0.3008   0.3140   0.5678   1.0000
   upols1983 |   0.2089   0.2298   0.3739   0.4684   1.0000
   upols1984 |   0.2025   0.2289   0.3194   0.3360   0.6398   1.0000
   upols1985 |   0.2395   0.3246   0.4087   0.3484   0.3898   0.5800   1.0000
   upols1986 |   0.1987   0.1903   0.2797   0.2470   0.3109   0.3535   0.3991   1.0000
   upols1987 |   0.2091   0.3167   0.3340   0.2877   0.3097   0.3361   0.3941   0.3496   1.0000
   upols1988 |   0.1619   0.2456   0.3016   0.2582   0.2083   0.2470   0.3436   0.5545   0.5242

             | upo~1988
-------------+---------
   upols1988 |   1.0000
. corr ure1979 ure1980 ure1981 ure1982 ure1983 ure1984 ure1985 ure1986 ure1987 ure1988
(obs=532)
             |  ure1979  ure1980  ure1981  ure1982  ure1983  ure1984  ure1985  ure1986  ure1987
-------------+---------------------------------------------------------------------------------
     ure1979 |   1.0000
     ure1980 |   0.0778   1.0000
     ure1981 |   0.1777   0.0604   1.0000
     ure1982 |  -0.0250  -0.0519   0.2492   1.0000
     ure1983 |  -0.2339  -0.2277  -0.1609   0.0587   1.0000
     ure1984 |  -0.2482  -0.2431  -0.2691  -0.1709   0.3795   1.0000
     ure1985 |  -0.1842  -0.0919  -0.1054  -0.1581  -0.0939   0.2197   1.0000
     ure1986 |  -0.1860  -0.2333  -0.2434  -0.2405  -0.1110  -0.0763  -0.0361   1.0000
     ure1987 |  -0.1665  -0.0481  -0.1580  -0.1904  -0.1710  -0.1506  -0.0646  -0.0553   1.0000
     ure1988 |  -0.1960  -0.1251  -0.1646  -0.1949  -0.3265  -0.2786  -0.1221   0.2708   0.2379

             |  ure1988
-------------+---------
     ure1988 |   1.0000

     ufe1988 |  -0.1650  -0.1109  -0.1385  -0.1772  -0.3816  -0.3096  -0.1268   0.2420   0.2439

             |  ufe1988
-------------+---------
     ufe1988 |   1.0000
.
. * The following does estimation for just one year
. regress lnhr1979 lnwg1979
      Source |       SS       df       MS              Number of obs =     532
-------------+------------------------------           F(  1,   530) =    0.00
       Model |  .000035507     1  .000035507           Prob > F      =  0.9810
    Residual |  33.0180361   530  .062298181           R-squared     =  0.0000
-------------+------------------------------           Adj R-squared = -0.0019
       Total |  33.0180716   531  .062180926           Root MSE      =   .2496

------------------------------------------------------------------------------
    lnhr1979 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    lnwg1979 |   .0006173   .0258574     0.02   0.981    -.0501783    .0514129
       _cons |   7.667738   .0680375   112.70   0.000     7.534082    7.801395
------------------------------------------------------------------------------
.
. ************ (2) ANALYSIS: OBTAIN AUTOCORRELATIONS FOR FIRST DIFFERENCES
.
. ** SET UP THE DATA
. use mom, clear
. gen dlnhr = lnhr - lnhr[_n-1]
(1 missing value generated)
. gen dlnwg = lnwg - lnwg[_n-1]
(1 missing value generated)
. * The following drops the first year which here is 1979
. drop if year == 1979
(532 observations deleted)
. regress dlnhr dlnwg
      Source |       SS       df       MS              Number of obs =    4788
-------------+------------------------------           F(  1,  4786) =   26.09
       Model |  2.27870825     1  2.27870825           Prob > F      =  0.0000
    Residual |  417.943979  4786  .087326364           R-squared     =  0.0054
-------------+------------------------------           Adj R-squared =  0.0052
       Total |  420.222687  4787  .087784142           Root MSE      =  .29551

------------------------------------------------------------------------------
       dlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       dlnwg |   .1089851   .0213351     5.11   0.000     .0671584    .1508118
       _cons |   .0008283   .0042712     0.19   0.846    -.0075452    .0092018
------------------------------------------------------------------------------
. predict ufdiff, residuals
. * Here just do this for lnhr and lnwg and the residuals
. keep dlnhr dlnwg ufdiff id year
. reshape wide dlnhr dlnwg ufdiff, i(id) j(year)
(note: j = 1980 1981 1982 1983 1984 1985 1986 1987 1988)
Data                               long   ->   wide
-----------------------------------------------------------------------------
Number of obs.                     4788   ->     532
Number of variables                   5   ->      28
j variable (9 values)              year   ->   (dropped)
xij variables:
                                  dlnhr   ->   dlnhr1980 dlnhr1981 ... dlnhr1988
                                  dlnwg   ->   dlnwg1980 dlnwg1981 ... dlnwg1988
                                 ufdiff   ->   ufdiff1980 ufdiff1981 ... ufdiff1988
-----------------------------------------------------------------------------
. summarize
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
          id |       532       266.5    153.7194          1        532
   dlnhr1980 |       532   -.0092481    .3023508       -2.5       1.71
   dlnwg1980 |       532    .0046053    .2301879      -2.12       1.05
  ufdiff1980 |       532   -.0105783    .3014161  -2.499738   1.690644
   dlnhr1981 |       532    .0075564    .2668644       -1.2       2.32
-------------+---------------------------------------------------------
   dlnwg1981 |       532    .0085902    .1818033       -.79       1.62
  ufdiff1981 |       532    .0057919    .2669213  -1.145188   2.343149
   dlnhr1982 |       532   -.0215602     .212834      -2.06       1.14
   dlnwg1982 |       532    .0037218    .1755574      -1.17        .74
  ufdiff1982 |       532   -.0227941     .213709  -2.036851   1.135902
-------------+---------------------------------------------------------
   dlnhr1983 |       532   -.0330263    .3413969      -4.51   .9899998
   dlnwg1983 |       532   -.0041541    .1673057       -.88   .6399999
  ufdiff1983 |       532   -.0334019    .3398726  -4.419281   .9780819
   dlnhr1984 |       532    .0234586    .3034213      -2.31       2.57
   dlnwg1984 |       532   -.0103383    .2342514      -2.13        .77
-------------+---------------------------------------------------------
  ufdiff1984 |       532    .0237571    .3004287  -2.168058   2.502691
   dlnhr1985 |       532    .0318421    .2772558      -1.46       3.52
   dlnwg1985 |       532    .0147556    .2371054      -1.33       3.06
  ufdiff1985 |       532    .0294057    .2697542  -1.315878   3.185677
   dlnhr1986 |       532   -.0090789    .3270724      -4.79        1.8
-------------+---------------------------------------------------------
   dlnwg1986 |       532    -.012312    .1804162      -1.83       1.04
  ufdiff1986 |       532   -.0085654    .3299129  -4.796278   1.789363
   dlnhr1987 |       532    .0147744    .3470122      -3.24       4.52
   dlnwg1987 |       532    .0120677    .1845692  -.9400001       1.95
  ufdiff1987 |       532    .0126309    .3494111  -3.243008   4.550777
-------------+---------------------------------------------------------
   dlnhr1988 |       532    .0057707    .2587991       -2.5       2.74
   dlnwg1988 |       532    .0109023     .194813       -1.5       1.22
  ufdiff1988 |       532    .0037542    .2576554  -2.337351   2.739172
.
. ** GET THE CORRELATIONS
. corr dlnhr1980 dlnhr1981 dlnhr1982 dlnhr1983 dlnhr1984 dlnhr1985 dlnhr1986 dlnhr1987 dlnhr1988
(obs=532)

             | dlnhr1~0 dlnhr1~1 dlnhr1~2 dlnhr1~3 dlnhr1~4 dlnhr1~5 dlnhr1~6 dlnhr1~7 dlnhr1~8
-------------+---------------------------------------------------------------------------------
   dlnhr1980 |   1.0000
   dlnhr1981 |  -0.6289   1.0000
   dlnhr1982 |   0.0402  -0.2306   1.0000
   dlnhr1983 |   0.0144  -0.0204  -0.2209   1.0000
   dlnhr1984 |  -0.0001  -0.0570  -0.1410  -0.4495   1.0000
   dlnhr1985 |   0.0393  -0.0320  -0.0827  -0.4035  -0.1969   1.0000
   dlnhr1986 |  -0.0629   0.0322   0.0112   0.0233  -0.1192  -0.2334   1.0000
   dlnhr1987 |   0.0811  -0.0709  -0.0029  -0.0448  -0.0202   0.0093  -0.6231   1.0000
   dlnhr1988 |  -0.0341   0.0461  -0.0082  -0.1020   0.0261   0.0682   0.2486  -0.6064   1.0000
> f1988
(obs=532)

             | ufd~1980 ufd~1981 ufd~1982 ufd~1983 ufd~1984 ufd~1985 ufd~1986 ufd~1987 ufd~1988
-------------+---------------------------------------------------------------------------------
  ufdiff1980 |   1.0000
  ufdiff1981 |  -0.6263   1.0000
  ufdiff1982 |   0.0451  -0.2389   1.0000
  ufdiff1983 |   0.0128  -0.0239  -0.2316   1.0000
  ufdiff1984 |  -0.0010  -0.0588  -0.1291  -0.4804   1.0000
  ufdiff1985 |   0.0453  -0.0285  -0.0868  -0.3731  -0.1853   1.0000
  ufdiff1986 |  -0.0674   0.0321   0.0110   0.0256  -0.1138  -0.2538   1.0000
  ufdiff1987 |   0.0811  -0.0711  -0.0077  -0.0533  -0.0081   0.0211  -0.6250   1.0000
  ufdiff1988 |  -0.0323   0.0499   0.0022  -0.1019   0.0368   0.0543   0.2326  -0.5943   1.0000
.
. ************ (3) ANALYSIS: CORRELATIONS FOR AN INDIVIDUAL OBSERVATION
.
. * Look at correlations for each individual
.
. ** TRANSFORM DATA FROM LONG FORM TO WIDE FORM FOR INDIVIDUALS
.
. use mom3, replace
. * Here just do this for lnhr and lnwg and the residuals
. keep lnhr lnwg id year
. reshape wide lnhr lnwg, i(year) j(id)
(note: j = 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
33
> 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64
65 6
> 6 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97
98
> 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120
121 122 123
> 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145
146 147 1
> 48 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169
170 171 172
> 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194
195 196 1
> 97 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218
219 220 221
> 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243
244 245 2
> 46 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267
268 269 270
> 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292
293 294 2
> 95 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316
317 318 319
> 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341
342 343 3
> 44 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365
366 367 368
> 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390
391 392 3
> 93 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414
415 416 417
> 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439
440 441 4
> 42 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463
464 465 466
> 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488
489 490 4
> 91 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512
513 514 515
> 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532)
Data                               long   ->   wide
-----------------------------------------------------------------------------
Number of obs.                     5320   ->      10
Number of variables                   4   ->    1065
j variable (532 values)              id   ->   (dropped)
xij variables:
                                   lnhr   ->   lnhr1 lnhr2 ... lnhr532
                                   lnwg   ->   lnwg1 lnwg2 ... lnwg532
-----------------------------------------------------------------------------
. * Note that i and j are reversed
.
. * Since id is 1 to 532 this will create
. * lnhr1 to lnhr532 and lnwg1 to lnwg532
.
. tsset year
time variable: year, 1979 to 1988
.
. * First-order Correlation over T years for the first observation
. corr lnhr1 L.lnhr1
(obs=9)

             |                L.
             |    lnhr1    lnhr1
-------------+------------------
       lnhr1 |
          -- |   1.0000
          L1 |   0.6378   1.0000
. * And so on
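The "and so on" step can be automated with a loop. The following is a sketch only, assuming the wide-form data created above (variables lnhr1 to lnhr532, with tsset year already issued); the choice of the first five individuals is arbitrary and purely illustrative.

```stata
* Sketch: first-order autocorrelation of lnhr for the first five individuals
* Assumes wide-form data (lnhr1 ... lnhr532) and that tsset year has been run
forvalues i = 1/5 {
    quietly corr lnhr`i' L.lnhr`i'
    * r(rho) holds the correlation between the two variables just listed
    display "id `i': first-order autocorrelation = " %6.4f r(rho)
}
```

Changing the range to 1/532 would cover every individual, at the cost of a long listing.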
.
. ********** CLOSE OUTPUT
. log close
log: c:\Imbook\bwebpage\Section5\mma21p3panresiduals.txt
log type: text
closed on: 23 May 2005, 13:01:15
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section5\mma21p4pangls.txt
log type: text
opened on: 23 May 2005, 11:38:01
.
. ********** OVERVIEW OF MMA21P4PANGLS.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 21.5.5 page 725 Table 21.6 Pooled panel OLS and GLS
. * Demonstrate pooled GLS estimation using XTGEE
. * (1) No correlation (i.e. pooled OLS)
. * (2) Equicorrelated
. * (3) AR1
. * (4) Unrestricted
. * Standard errors are default plus panel bootstrap
.
. * To run you need file
. * MOM.dat
. * in your directory
.
. * The four basic linear panel programs are
. * mma21p1panfeandre.do Linear fixed and random effects using xtreg
. * mma21p2panfeandre.do Linear fe and re using transformation and regress
. *                        plus also has valid Hausman test
. * mma21p3panresiduals.do Residual analysis after linear fe and re
. * mma21p4pangls.do Pooled panel OLS and GLS
.
. ********** SETUP **********
.
. set more off
. version 8.0
. set scheme s1mono /* Graphics scheme */
.
. ********** DATA DESCRIPTION **********
.
. * The original data is from
. * Jim Ziliak (1997)
. * "Efficient Estimation With Panel Data when Instruments are Predetermined:
. * An Empirical Comparison of Moment-Condition Estimators"
. * Journal of Business and Economic Statistics, 15, 419-431
.
. * File MOM.dat has data on 532 men over 10 years (1979-1988)
. * Data are space-delimited ordered by person with separate line for each year
. * So id 1 1979, id 1 1980, ..., id 1 1988, id 2 1979, id 2 1980, ...
. * 8 variables:
. * lnhr lnwg kids ageh agesq disab id year
.
. * File MOM.dat is the version of the data posted at the JBES website
. * Note that in chapter 22 we instead use MOMprecise.dat
. * which is the same data set but with more significant digits
.
. ********** READ DATA AND SUMMARIZE **********
.*
. * The data are in ascii file MOM.dat
. * There are 532 individuals with 10 lines (years) per individual
. * Read in using Infile: FREE FORMAT WITHOUT DICTIONARY
. infile lnhr lnwg kids ageh agesq disab id year using MOM.dat
(5320 observations read)
.
. describe
Contains data
  obs:         5,320
 vars:             8
 size:       191,520 (98.1% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
lnhr            float  %9.0g
lnwg            float  %9.0g
kids            float  %9.0g
ageh            float  %9.0g
agesq           float  %9.0g
disab           float  %9.0g
id              float  %9.0g
year            float  %9.0g
-------------------------------------------------------------------------------
Sorted by:
     Note:  dataset has changed since last saved
. summarize
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
        lnhr |      5320     7.65743    .2855914       2.77       8.56
        lnwg |      5320    2.609436    .4258924       -.26       4.69
        kids |      5320    1.555827    1.195924          0          6
        ageh |      5320    38.91823    8.450351         22         60
       agesq |      5320    1586.024    689.7759        484       3600
-------------+---------------------------------------------------------
       disab |      5320    .0609023    .2391734          0          1
          id |      5320       266.5    153.5893          1        532
        year |      5320      1983.5    2.872551       1979       1988
.
. ********** DEFINE GLOBALS INCLUDING REGRESSOR LIST *********
.
. * Number of reps for the bootstrap
. * Table 21.6 used 500
. global nreps 500
.
. ********* ANALYSIS: DIFFERENT POOLED GLS ESTIMATES USING XTGEE *********
.
. *** (1) NO ERROR CORRELATION - SAME AS POOLED OLS Table 21.7 first column
.
. * Default standard error
. xtgee lnhr lnwg, corr(independent) i(id)
Iteration 1: tolerance = 3.405e-13

GEE population-averaged model                   Number of obs      =      5320
Group variable:                         id      Number of groups   =       532
Link:                             identity      Obs per group: min =        10
Family:                           Gaussian                     avg =      10.0
Correlation:                   independent                     max =        10
                                                Wald chi2(1)       =     82.25
Scale parameter:                  .0803055      Prob > chi2        =    0.0000

Pearson chi2(5320):                 427.23      Deviance           =    427.23
Dispersion (Pearson):             .0803055      Dispersion         =  .0803055

------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .0827436   .0091234     9.07   0.000      .064862    .1006251
       _cons |   7.441516   .0241219   308.50   0.000     7.394238    7.488795
------------------------------------------------------------------------------
. estimates store ind
. * "Robust" standard error
. xtgee lnhr lnwg, corr(independent) i(id) robust
Iteration 1: tolerance = 3.405e-13

GEE population-averaged model                   Number of obs      =      5320
Group variable:                         id      Number of groups   =       532
Link:                             identity      Obs per group: min =        10
Family:                           Gaussian                     avg =      10.0
Correlation:                   independent                     max =        10
                                                Wald chi2(1)       =      7.99
Scale parameter:                  .0803055      Prob > chi2        =    0.0047

Pearson chi2(5320):                 427.23      Deviance           =    427.23
Dispersion (Pearson):             .0803055      Dispersion         =  .0803055
Number of obs    =      5320
N of clusters    =       532
Replications     =       500
Number of obs    =      5320
N of clusters    =       532
Replications     =       500
.
. *** (3) AR(1) Table 21.7 third column
.
. * Default standard error
. xtgee lnhr lnwg, corr(ar 1) i(id) t(year)
Iteration 1: tolerance = .001507
Iteration 2: tolerance = 2.246e-06
Iteration 3: tolerance = 1.547e-09

GEE population-averaged model                   Number of obs      =      5320
Group and time vars:               id year      Number of groups   =       532
Link:                             identity      Obs per group: min =        10
Family:                           Gaussian                     avg =      10.0
Correlation:                         AR(1)                     max =        10
                                                Wald chi2(1)       =     46.73
Scale parameter:                  .0803129      Prob > chi2        =    0.0000

------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .0843777   .0123428     6.84   0.000     .0601862    .1085691
       _cons |   7.439893   .0327698   227.04   0.000     7.375665     7.50412
------------------------------------------------------------------------------
. estimates store ar1
. * "Robust" standard error
. xtgee lnhr lnwg, corr(ar 1) i(id) t(year) robust
Iteration 1: tolerance = .001507
Iteration 2: tolerance = 2.246e-06
Iteration 3: tolerance = 1.547e-09

GEE population-averaged model                   Number of obs      =      5320
Group and time vars:               id year      Number of groups   =       532
Link:                             identity      Obs per group: min =        10
Family:                           Gaussian                     avg =      10.0
Correlation:                         AR(1)                     max =        10
                                                Wald chi2(1)       =      5.15
Scale parameter:                  .0803129      Prob > chi2        =    0.0232

                                  (standard errors adjusted for clustering on id)
------------------------------------------------------------------------------
             |             Semi-robust
        lnhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .0843777   .0371764     2.27   0.023     .0115133    .1572421
       _cons |   7.439893    .100308    74.17   0.000     7.243293    7.636493
------------------------------------------------------------------------------
Number of obs    =      5320
N of clusters    =       532
Replications     =       500
------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .0910023   .0137712     6.61   0.000     .0640113    .1179933
       _cons |   7.426262   .0366836   202.44   0.000     7.354363     7.49816
------------------------------------------------------------------------------
. estimates store unstr
. * "Robust" standard error
. xtgee lnhr lnwg, corr(unstructured) i(id) t(year) robust
Iteration 1: tolerance = .00721446
Iteration 2: tolerance = .0003951
Iteration 3: tolerance = .00001469
Iteration 4: tolerance = 4.230e-07

GEE population-averaged model                   Number of obs      =      5320
Group and time vars:               id year      Number of groups   =       532
Link:                             identity      Obs per group: min =        10
Family:                           Gaussian                     avg =      10.0
Correlation:                  unstructured                     max =        10
                                                Wald chi2(1)       =      3.29
Scale parameter:                  .0803575      Prob > chi2        =    0.0695

                                  (standard errors adjusted for clustering on id)
------------------------------------------------------------------------------
             |             Semi-robust
        lnhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .0910023   .0501344     1.82   0.069    -.0072594     .189264
       _cons |   7.426262   .1328255    55.91   0.000     7.165929    7.686595
------------------------------------------------------------------------------
. estimates store unstrrob
. * Correct panel bootstrap standard errors
. set seed 10001
. /* For some reason the following did not work
> bootstrap "xtgee lnhr lnwg, corr(unstructured) i(id)" "_b[lnwg] _b[_cons]", cluster(id) reps($nrep
> s) level(95)
> matrix unstrbootse = e(se)
> */
.
. ********** DISPLAY RESULTS IN TABLE 21.7 page 725 **********
.
. * Standard error using iid errors and in some cases panel
. estimates table ind indrob exch exchrob, /*
> */ se stats(N ll r2 tss rss mss rmse df_r) b(%10.3f)
------------------------------------------------------------------
    Variable |    ind       indrob       exch      exchrob
-------------+----------------------------------------------------
        lnwg |      0.083      0.083      0.120      0.120
             |      0.009      0.029      0.014      0.052
       _cons |      7.442      7.442      7.345      7.345
             |      0.024      0.080      0.036      0.138
-------------+----------------------------------------------------
           N |   5320.000   5320.000   5320.000   5320.000
          ll |
          r2 |
         tss |
         rss |
         mss |
        rmse |
        df_r |
------------------------------------------------------------------
                                                      legend: b/se
. estimates table ar1 ar1rob unstr unstrrob, /*
> */ se stats(N ll r2 tss rss mss rmse df_r) b(%10.3f)
------------------------------------------------------------------
    Variable |    ar1       ar1rob      unstr     unstrrob
-------------+----------------------------------------------------
        lnwg |      0.084      0.084      0.091      0.091
             |      0.012      0.037      0.014      0.050
       _cons |      7.440      7.440      7.426      7.426
             |      0.033      0.100      0.037      0.133
-------------+----------------------------------------------------
           N |   5320.000   5320.000   5320.000   5320.000
          ll |
          r2 |
         tss |
         rss |
         mss |
        rmse |
        df_r |
------------------------------------------------------------------
                                                      legend: b/se
.
. * Standard errors using panel bootstrap (regular bootstrap for between)
. matrix list indbootse
indbootse[1,2]
        _bs_1      _bs_2
se  .03178369   .0861859
. matrix list exchbootse
exchbootse[1,2]
        _bs_1      _bs_2
se  .05989501  .15855561
. matrix list ar1bootse
ar1bootse[1,2]
        _bs_1      _bs_2
se  .05039303  .13673201
. matrix list unstrbootse
matrix unstrbootse not found
r(111);
end of do-file
r(111);
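The do-file stops with r(111) because the bootstrap for the unstructured case is commented out above, so the matrix unstrbootse is never created. One way to let the program run to completion, sketched here as a suggestion rather than as the authors' code, is to check that the matrix exists before listing it.

```stata
* Sketch: list unstrbootse only if it exists, so the do-file does not stop with r(111)
capture confirm matrix unstrbootse
if _rc == 0 {
    matrix list unstrbootse
}
else {
    display "unstrbootse not found: the bootstrap for corr(unstructured) was not run"
}
```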
. exit, clear
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section5\mma22p1pangmm.txt
log type: text
opened on: 23 May 2005, 11:52:35
.
. ********** OVERVIEW OF MMA22P1PANGMM.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 22.3 pages 754-6
. * Panel 2SLS and GMM for a linear model with endogenous regressors
. * Fixed effects are first differenced.
. * Then 2SLS and GMM applied to first differenced model.
.
. * Program derives Table 22.2 and does other analysis in section
. * (1) pooled OLS
. * (2) 2SLS in base instruments case
. * (3) 2SLS in stacked instruments case
. * (4) 2SGMM in base instruments case
. * (5) 2SGMM in stacked instruments case
. * (6) F-statistics for weak instruments
. * (7) Partial R-squared for weak instruments
.
. * The pooled OLS and 2SLS replicate Ziliak (1997) Table 1 Top left-hand corner
. * for Base Case (9 instruments) and first Stacked Case (72 instruments)
. * 2SLS in first differences where both 1979 and 1980 are dropped
.
. * To run you need file
. * MOMprecise.dat
. * in your directory
.
. * NOTE: This data set is different from MOM.dat used in chapter 21.
. *       The data here has more significant digits,
. *       leading to some difference in resulting coefficient estimates.
.
. ********** SETUP **********
.
. set more off
. version 8.0
. set scheme s1mono /* Graphics scheme */
.
. ********** DATA DESCRIPTION **********
.
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
        lnhr |      5320    7.657458      .28564   2.772589   8.556414
        lnwg |      5320    2.609477    .4260333  -.2613648   4.686474
        kids |      5320    1.555827    1.195924          0          6
        ageh |      5320    38.91823    8.450351         22         60
       agesq |      5320    1586.024    689.7759        484       3600
-------------+---------------------------------------------------------
       disab |      5320    .0609023    .2391734          0          1
          id |      5320       266.5    153.5893          1        532
        year |      5320      1983.5    2.872551       1979       1988
.
. ********** FIRST DIFFERENCES REGRESSION **********
.
. * Stata has no command for first differences regression
. * Though may be possible with xtivreg
.
. * The following only works if each observation is (i,t)
. * and within i the data are ordered by t
. gen dlnhr = lnhr - lnhr[_n-1]
(1 missing value generated)
. gen dlnwg = lnwg - lnwg[_n-1]
(1 missing value generated)
. gen dkids = kids - kids[_n-1]
(1 missing value generated)
. gen dageh = ageh - ageh[_n-1]
(1 missing value generated)
. gen dagesq = agesq - agesq[_n-1]
(1 missing value generated)
. gen ddisab = disab - disab[_n-1]
(1 missing value generated)
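Because lnhr[_n-1] simply refers to the previous row, only the very first observation is set to missing; the first year of each later individual holds a difference taken across two different individuals and has to be excluded afterwards (the chapter 21 program does this with drop if year == 1979). A hedged alternative that respects panel boundaries is sketched below. The variable names ending in _ts and _by are hypothetical, chosen so they do not clash with the variables generated above.

```stata
* Sketch: first differences computed within each individual
* Assumes id and year uniquely identify each observation
tsset id year
gen dlnhr_ts = D.lnhr      /* missing in the first year of each id */
gen dlnwg_ts = D.lnwg

* Equivalent construction without time-series operators
sort id year
by id: gen dlnhr_by = lnhr - lnhr[_n-1]
by id: gen dlnwg_by = lnwg - lnwg[_n-1]
```

With either construction the first-year differences are missing for every individual, so regression commands drop those rows automatically.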
.
. * The regression is of
. * dlnhr on constant dlnwg dkids dageh dagesq ddisab
.
. ********** GENERATE THE INSTRUMENTS **********
.
. * The endogenous variable is dlnwg. The others are exogenous.
. * It is not clear whether current values of the exogenous variables are used as instruments.
. * I would think so but there is no mention in the paper of this.
. * In addition Table 1 considers various instrument sets
. * We consider the first (first rows) and second (second rows)
.
. * (1) Use the levels of the exogenous regressors lagged one and two periods
. * and the level of the endogenous regressor lagged two periods
        z8y1 |      5320    .0054511    .0736372          0          1
        z9y1 |      5320    .2597756    .7905791          0    4.61522
        z1y2 |      5320     3.63891    11.20265          0         53
        z2y2 |      5320    138.7175    458.8032          0       2809
        z3y2 |      5320    .1590226    .6057112          0          6
-------------+---------------------------------------------------------
        z4y2 |      5320    .0039474    .0627099          0          1
        z5y2 |      5320    3.544549    10.92972          0         52
        z6y2 |      5320    132.0002    438.9997          0       2704
        z7y2 |      5320    .1567669    .5978681          0          6
        z8y2 |      5320    .0048872    .0697442          0          1
-------------+---------------------------------------------------------
        z9y2 |      5320    .2602349    .7906729          0    4.60976
        z1y3 |      5320    3.737218    11.49054          0         54
        z2y3 |      5320    145.9744    480.6547          0       2916
        z3y3 |      5320    .1637218    .6172305          0          6
        z4y3 |      5320    .0052632    .0723633          0          1
-------------+---------------------------------------------------------
        z5y3 |      5320     3.63891    11.20265          0         53
        z6y3 |      5320    138.7175    458.8032          0       2809
        z7y3 |      5320    .1590226    .6057112          0          6
        z8y3 |      5320    .0039474    .0627099          0          1
        z9y3 |      5320    .2610997    .7928738          0    4.52656
-------------+---------------------------------------------------------
        z1y4 |      5320     3.83985    11.79093          0         55
        z2y4 |      5320    153.7444    503.9576          0       3025
        z3y4 |      5320    .1620301    .6132476          0          6
        z4y4 |      5320    .0037594    .0612043          0          1
        z5y4 |      5320    3.737218    11.49054          0         54
-------------+---------------------------------------------------------
        z6y4 |      5320    145.9744    480.6547          0       2916
        z7y4 |      5320    .1637218    .6172305          0          6
        z8y4 |      5320    .0052632    .0723633          0          1
        z9y4 |      5320    .2614749    .7946793          0   4.607767
        z1y5 |      5320    3.940414    12.08767          0         56
-------------+---------------------------------------------------------
        z2y5 |      5320    161.6111    527.9522          0       3136
        z3y5 |      5320    .1595865     .608814          0          6
        z4y5 |      5320     .006015    .0773303          0          1
        z5y5 |      5320     3.83985    11.79093          0         55
        z6y5 |      5320    153.7444    503.9576          0       3025
-------------+---------------------------------------------------------
        z7y5 |      5320    .1620301    .6132476          0          6
        z8y5 |      5320    .0037594    .0612043          0          1
        z9y5 |      5320    .2610663    .7939903          0   4.618777
        z1y6 |      5320    4.047368    12.40128          0         57
        z2y6 |      5320     170.144    553.5552          0       3249
-------------+---------------------------------------------------------
        z3y6 |      5320    .1575188    .6042401          0          5
        z4y6 |      5320    .0065789    .0808511          0          1
        z5y6 |      5320    3.940414    12.08767          0         56
        z6y6 |      5320    161.6111    527.9522          0       3136
        z7y6 |      5320    .1595865     .608814          0          6
-------------+---------------------------------------------------------
        z8y6 |      5320     .006015    .0773303          0          1
        z9y6 |      5320    .2600271    .7937085  -.2613648   4.648325
        z1y7 |      5320    4.140602    12.67474          0         58
        z2y7 |      5320    177.7635    576.2959          0       3364
        z3y7 |      5320    .1537594    .5983346          0          5
-------------+---------------------------------------------------------
        z4y7 |      5320     .006203    .0785219          0          1
        z5y7 |      5320    4.047368    12.40128          0         57
        z6y7 |      5320     170.144    553.5552          0       3249
        z7y7 |      5320    .1575188    .6042401          0          5
        z8y7 |      5320    .0065789    .0808511          0          1
-------------+---------------------------------------------------------
        z9y7 |      5320     .261494    .7964894          0   4.686474
        z1y8 |      5320    4.240414    12.96638          0         59
        z2y8 |      5320    186.0765    600.9297          0       3481
        z3y8 |      5320    .1494361    .5901043          0          5
        z4y8 |      5320    .0090226    .0945665          0          1
-------------+---------------------------------------------------------
        z5y8 |      5320    4.140602    12.67474          0         58
        z6y8 |      5320    177.7635    576.2959          0       3364
        z7y8 |      5320    .1537594    .5983346          0          5
        z8y8 |      5320     .006203    .0785219          0          1
        z9y8 |      5320    .2602616    .7933278          0     4.5933
.
. * Define variable lists for regressors X and instruments Z
.
. global XREG dlnwg dkids dageh dagesq ddisab
.
. global ZBASECASE kidsl1 agehl1 agesql1 disabl1 agehl2 kidsl2 agesql2 disabl2 lnwgl2
.
. global ZSTACKED z1y1 z2y1 z3y1 z4y1 z5y1 z6y1 z7y1 z8y1 z9y1 /*
>   */ z1y2 z2y2 z3y2 z4y2 z5y2 z6y2 z7y2 z8y2 z9y2 /*
>   */ z1y3 z2y3 z3y3 z4y3 z5y3 z6y3 z7y3 z8y3 z9y3 /*
>   */ z1y4 z2y4 z3y4 z4y4 z5y4 z6y4 z7y4 z8y4 z9y4 /*
>   */ z1y5 z2y5 z3y5 z4y5 z5y5 z6y5 z7y5 z8y5 z9y5 /*
>   */ z1y6 z2y6 z3y6 z4y6 z5y6 z6y6 z7y6 z8y6 z9y6 /*
>   */ z1y7 z2y7 z3y7 z4y7 z5y7 z6y7 z7y7 z8y7 z9y7 /*
>   */ z1y8 z2y8 z3y8 z4y8 z5y8 z6y8 z7y8 z8y8 z9y8
.
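The stacked instrument list follows a regular pattern: instruments z1 to z9 within each of the eight year blocks y1 to y8. As a minimal sketch (not part of the original program), the same global could be built with nested loops. The local macro name zlist is hypothetical, and the final line is only a sanity check that should report 72 names.

```stata
* Sketch only: build the stacked instrument list with nested loops
local zlist
forvalues y = 1/8 {
    forvalues z = 1/9 {
        local zlist `zlist' z`z'y`y'
    }
}
global ZSTACKED `zlist'
* Count the names placed in the global
display "`: word count $ZSTACKED' instruments"
```

The outer loop runs over years so that the order (z1y1 ... z9y1, z1y2 ...) matches the hand-typed definition.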
. * Define variable lists for weak instruments test which drops
.
. save momfdiffgmm, replace
file momfdiffgmm.dta saved
. sum
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
        lnhr |      5320    7.657458      .28564   2.772589   8.556414
        lnwg |      5320    2.609477    .4260333  -.2613648   4.686474
        kids |      5320    1.555827    1.195924          0          6
        ageh |      5320    38.91823    8.450351         22         60
       agesq |      5320    1586.024    689.7759        484       3600
-------------+--------------------------------------------------------
       disab |      5320    .0609023    .2391734          0          1
          id |      5320       266.5    153.5893          1        532
        year |      5320      1983.5    2.872551       1979       1988
       dlnhr |      5319    .0000192    .3016322  -4.787492   4.521109
       dlnwg |      5319    .0001115    .2718437   -2.32463   3.062298
-------------+--------------------------------------------------------
       dkids |      5319    -.000188    .6629109         -5          6
       dageh |      5319    .0030081    4.611209        -36         19
      dagesq |      5319    .2105659    371.0841      -3024       1577
      ddisab |      5319           0    .2429913         -1          1
      kidsl1 |      5319    1.555932    1.196012          0          6
-------------+--------------------------------------------------------
      kidsl2 |      5318    1.556036    1.196101          0          6
      agehl1 |      5319    38.91747     8.45096         22         60
      agehl2 |      5318    38.91707    8.451706         22         60
     agesql1 |      5319    1585.974    689.8313        484       3600
     agesql2 |      5318    1585.957    689.8949        484       3600
-------------+--------------------------------------------------------
     disabl1 |      5319    .0609137    .2391944          0          1
     disabl2 |      5318    .0609252    .2392155          0          1
      lnwgl2 |      5318    2.609513    .4261095  -.2613648   4.686474
        z1y1 |      5320    3.544549    10.92972          0         52
        z2y1 |      5320    132.0002    438.9997          0       2704
-------------+--------------------------------------------------------
        z3y1 |      5320    .1567669    .5978681          0          6
        z4y1 |      5320    .0048872    .0697442          0          1
        z5y1 |      5320    3.445489    10.64043          0         51
        z6y1 |      5320    125.0688    418.0247          0       2601
        z7y1 |      5320    .1520677    .5938801          0          6
-------------+--------------------------------------------------------
        z8y1 |      5320    .0054511    .0736372          0          1
        z9y1 |      5320    .2597756    .7905791          0    4.61522
        z1y2 |      5320     3.63891    11.20265          0         53
        z2y2 |      5320    138.7175    458.8032          0       2809
        z3y2 |      5320    .1590226    .6057112          0          6
-------------+--------------------------------------------------------
        z4y2 |      5320    .0039474    .0627099          0          1
        z5y2 |      5320    3.544549    10.92972          0         52
        z6y2 |      5320    132.0002    438.9997          0       2704
        z7y2 |      5320    .1567669    .5978681          0          6
        z8y2 |      5320    .0048872    .0697442          0          1
-------------+--------------------------------------------------------
        z9y2 |      5320    .2602349    .7906729          0    4.60976
        z1y3 |      5320    3.737218    11.49054          0         54
        z2y3 |      5320    145.9744    480.6547          0       2916
        z3y3 |      5320    .1637218    .6172305          0          6
        z4y3 |      5320    .0052632    .0723633          0          1
-------------+--------------------------------------------------------
        z5y3 |      5320     3.63891    11.20265          0         53
        z6y3 |      5320    138.7175    458.8032          0       2809
        z7y3 |      5320    .1590226    .6057112          0          6
        z8y3 |      5320    .0039474    .0627099          0          1
        z9y3 |      5320    .2610997    .7928738          0    4.52656
-------------+--------------------------------------------------------
        z1y4 |      5320     3.83985    11.79093          0         55
        z2y4 |      5320    153.7444    503.9576          0       3025
        z3y4 |      5320    .1620301    .6132476          0          6
        z4y4 |      5320    .0037594    .0612043          0          1
        z5y4 |      5320    3.737218    11.49054          0         54
-------------+--------------------------------------------------------
        z6y4 |      5320    145.9744    480.6547          0       2916
        z7y4 |      5320    .1637218    .6172305          0          6
        z8y4 |      5320    .0052632    .0723633          0          1
        z9y4 |      5320    .2614749    .7946793          0   4.607767
        z1y5 |      5320    3.940414    12.08767          0         56
-------------+--------------------------------------------------------
        z2y5 |      5320    161.6111    527.9522          0       3136
        z3y5 |      5320    .1595865     .608814          0          6
        z4y5 |      5320     .006015    .0773303          0          1
        z5y5 |      5320     3.83985    11.79093          0         55
        z6y5 |      5320    153.7444    503.9576          0       3025
-------------+--------------------------------------------------------
        z7y5 |      5320    .1620301    .6132476          0          6
        z8y5 |      5320    .0037594    .0612043          0          1
        z9y5 |      5320    .2610663    .7939903          0   4.618777
        z1y6 |      5320    4.047368    12.40128          0         57
        z2y6 |      5320     170.144    553.5552          0       3249
-------------+--------------------------------------------------------
        z3y6 |      5320    .1575188    .6042401          0          5
        z4y6 |      5320    .0065789    .0808511          0          1
        z5y6 |      5320    3.940414    12.08767          0         56
        z6y6 |      5320    161.6111    527.9522          0       3136
        z7y6 |      5320    .1595865     .608814          0          6
-------------+--------------------------------------------------------
        z8y6 |      5320     .006015    .0773303          0          1
        z9y6 |      5320    .2600271    .7937085  -.2613648   4.648325
        z1y7 |      5320    4.140602    12.67474          0         58
        z2y7 |      5320    177.7635    576.2959          0       3364
        z3y7 |      5320    .1537594    .5983346          0          5
-------------+--------------------------------------------------------
        z4y7 |      5320     .006203    .0785219          0          1
        z5y7 |      5320    4.047368    12.40128          0         57
        z6y7 |      5320     170.144    553.5552          0       3249
        z7y7 |      5320    .1575188    .6042401          0          5
        z8y7 |      5320    .0065789    .0808511          0          1
-------------+--------------------------------------------------------
        z9y7 |      5320     .261494    .7964894          0   4.686474
        z1y8 |      5320    4.240414    12.96638          0         59
        z2y8 |      5320    186.0765    600.9297          0       3481
        z3y8 |      5320    .1494361    .5901043          0          5
        z4y8 |      5320    .0090226    .0945665          0          1
-------------+--------------------------------------------------------
        z5y8 |      5320    4.140602    12.67474          0         58
        z6y8 |      5320    177.7635    576.2959          0       3364
        z7y8 |      5320    .1537594    .5983346          0          5
        z8y8 |      5320     .006203    .0785219          0          1
        z9y8 |      5320    .2602616    .7933278          0     4.5933
.
. ********** (1)-(3) 2SLS USING IVREG IS STRAIGHTFORWARD (Table 22.2, p.755) **********
.
. * Note that this will automatically include the exogenous variables as instruments
. * It is not clear that Ziliak does this
.
. * The following drops the first two years which here are 1979 and 1980
. drop if year == 1979 | year == 1980
(1064 observations deleted)
.
. * (1) OLS results at bottom Ziliak table 1
. * Table 22.2 (page 755) OLS column with various standard errors estimates
. regress dlnhr $XREG, noconstant
      Source |       SS       df       MS              Number of obs =    4256
-------------+------------------------------           F(  5,  4251) =    5.38
       Model |   2.3389287     5  .467785741           Prob > F      =  0.0001
    Residual |  369.369193  4251  .086889954           R-squared     =  0.0063
-------------+------------------------------           Adj R-squared =  0.0051
       Total |  371.708121  4256  .087337435           Root MSE      =  .29477

------------------------------------------------------------------------------
       dlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       dlnwg |   .1115114   .0230566     4.84   0.000     .0663084    .1567144
       dkids |  -.0062887   .0116719    -0.54   0.590    -.0291717    .0165943
       dageh |   .0066935   .0212744     0.31   0.753    -.0350154    .0484025
      dagesq |  -.0000797   .0002644    -0.30   0.763     -.000598    .0004387
      ddisab |  -.0352603   .0199796    -1.76   0.078    -.0744306    .0039101
------------------------------------------------------------------------------
. estimates store olsiid
4256
------------------------------------------------------------------------------
             |               Robust
       dlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       dlnwg |   .1115114   .0791674     1.41   0.159     -.043698    .2667207
       dkids |  -.0062887    .011057    -0.57   0.570    -.0279662    .0153888
       dageh |   .0066935   .0243788     0.27   0.784    -.0411016    .0544887
      dagesq |  -.0000797   .0003147    -0.25   0.800    -.0006965    .0005372
      ddisab |  -.0352603   .0364021    -0.97   0.333    -.1066273    .0361067
------------------------------------------------------------------------------
. estimates store olshet
. regress dlnhr $XREG, noconstant cluster(id)
Regression with robust standard errors                 Number of obs =    4256
                                                       F(  5,   531) =    0.52
                                                       Prob > F      =  0.7617
                                                       R-squared     =  0.0063
Number of clusters (id) = 532                          Root MSE      =  .29477

------------------------------------------------------------------------------
             |               Robust
       dlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       dlnwg |   .1115114   .0960926     1.16   0.246    -.0772569    .3002797
       dkids |  -.0062887   .0109558    -0.57   0.566    -.0278107    .0152333
       dageh |   .0066935    .012339     0.54   0.588    -.0175458    .0309328
      dagesq |  -.0000797   .0001551    -0.51   0.608    -.0003843     .000225
      ddisab |  -.0352603   .0452557    -0.78   0.436    -.1241625     .053642
------------------------------------------------------------------------------
. estimates store olspanel
.
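The three OLS runs above share the same coefficients and differ only in the variance estimator (iid, heteroskedasticity-robust, and panel-robust clustered on id). As a minimal sketch, assuming Stata 10 or later where the vce() option is available, the sequence could be written as below. The middle command is an assumption inferred from the stored name olshet and the "Robust" table header, since the corresponding command line does not survive in this log.

```stata
* Sketch only: one OLS fit under three variance estimators
regress dlnhr $XREG, noconstant                    // default (iid) errors
estimates store olsiid
regress dlnhr $XREG, noconstant vce(robust)        // heteroskedasticity-robust
estimates store olshet
regress dlnhr $XREG, noconstant vce(cluster id)    // panel-robust, clustered on id
estimates store olspanel
* Side-by-side coefficients and standard errors
estimates table olspanel olshet olsiid, se b(%10.4f)
```

Comparing the standard error for dlnwg across columns shows how much the default errors understate sampling variability in this panel.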
. * (2) 2SLS using the base case instrument set
. * Table 22.2 (page 755) 2SLS column base case with various se estimates
. ivreg dlnhr ($XREG = $ZBASECASE), noconstant
Instrumental variables (2SLS) regression
      Source |       SS       df       MS              Number of obs =    4256
-------------+------------------------------           F(  5,  4251) =       .
4256
------------------------------------------------------------------------------
             |               Robust
       dlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       dlnwg |   .2091087    .423312     0.49   0.621    -.6208038    1.039021
       dkids |  -.0296864   .0400461    -0.74   0.459    -.1081977    .0488249
       dageh |    .026388   .0361631     0.73   0.466    -.0445106    .0972866
      dagesq |  -.0003411   .0004555    -0.75   0.454    -.0012342     .000552
      ddisab |    .000402   .0731433     0.01   0.996     -.142997     .143801
------------------------------------------------------------------------------
Instrumented:  dlnwg dkids dageh dagesq ddisab
Instruments:   kidsl1 agehl1 agesql1 disabl1 agehl2 kidsl2 agesql2 disabl2
               lnwgl2
------------------------------------------------------------------------------
. estimates store basehet
. ivreg dlnhr ($XREG = $ZBASECASE), noconstant cluster(id)
IV (2SLS) regression with robust standard errors       Number of obs =    4256
                                                       F(  5,   531) =    1.44
                                                       Prob > F      =  0.2087
                                                       R-squared     =       .
Number of clusters (id) = 532                          Root MSE      =  .29564

------------------------------------------------------------------------------
             |               Robust
       dlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       dlnwg |   .2091087   .3741705     0.56   0.576    -.5259273    .9441447
       dkids |  -.0296864   .0293678    -1.01   0.313    -.0873777    .0280048
       dageh |    .026388   .0153921     1.71   0.087    -.0038488    .0566249
      dagesq |  -.0003411   .0001837    -1.86   0.064    -.0007019    .0000198
      ddisab |    .000402   .0667719     0.01   0.995    -.1307674    .1315714
------------------------------------------------------------------------------
Instrumented:  dlnwg dkids dageh dagesq ddisab
Instruments:   kidsl1 agehl1 agesql1 disabl1 agehl2 kidsl2 agesql2 disabl2
               lnwgl2
------------------------------------------------------------------------------
. estimates store basepanel
.
. * (3) 2SLS using the stacked instrument set
. * Table 22.2 (page 755) 2SLS column stacked case with various se estimates
. set matsize 100
. ivreg dlnhr ($XREG = $ZSTACKED), noconstant
Instrumental variables (2SLS) regression

      Source |       SS       df       MS              Number of obs =    4256
-------------+------------------------------           F(  5,  4251) =       .
       Model | -29.3711267     5 -5.87422533           Prob > F      =       .
    Residual |  401.079248  4251  .094349388           R-squared     =       .
-------------+------------------------------           Adj R-squared =       .
       Total |  371.708121  4256  .087337435           Root MSE      =  .30716

------------------------------------------------------------------------------
       dlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       dlnwg |    .542827   .1691348     3.21   0.001     .2112345    .8744195
       dkids |  -.0482932   .0393723    -1.23   0.220    -.1254834     .028897
       dageh |   .0268935   .0288808     0.93   0.352     -.029728    .0835151
      dagesq |  -.0003511   .0003671    -0.96   0.339    -.0010709    .0003687
      ddisab |   .0079759   .0397995     0.20   0.841    -.0700519    .0860037
------------------------------------------------------------------------------
Instrumented:  dlnwg dkids dageh dagesq ddisab
Instruments:   z1y1 z2y1 z3y1 z4y1 z5y1 z6y1 z7y1 z8y1 z9y1 z1y2 z2y2 z3y2 z4y2
               z5y2 z6y2 z7y2 z8y2 z9y2 z1y3 z2y3 z3y3 z4y3 z5y3 z6y3 z7y3 z8y3
               z9y3 z1y4 z2y4 z3y4 z4y4 z5y4 z6y4 z7y4 z8y4 z9y4 z1y5 z2y5 z3y5
               z4y5 z5y5 z6y5 z7y5 z8y5 z9y5 z1y6 z2y6 z3y6 z4y6 z5y6 z6y6 z7y6
               z8y6 z9y6 z1y7 z2y7 z3y7 z4y7 z5y7 z6y7 z7y7 z8y7 z9y7 z1y8 z2y8
4256
------------------------------------------------------------------------------
             |               Robust
       dlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       dlnwg |    .542827   .2260738     2.40   0.016     .0996043    .9860497
       dkids |  -.0482932   .0350149    -1.38   0.168    -.1169408    .0203544
       dageh |   .0268935   .0339561     0.79   0.428    -.0396781    .0934652
      dagesq |  -.0003511   .0004324    -0.81   0.417    -.0011989    .0004966
      ddisab |   .0079759    .064012     0.12   0.901    -.1175211    .1334729
------------------------------------------------------------------------------
Instrumented:  dlnwg dkids dageh dagesq ddisab
Instruments:   z1y1 z2y1 z3y1 z4y1 z5y1 z6y1 z7y1 z8y1 z9y1 z1y2 z2y2 z3y2 z4y2
               z5y2 z6y2 z7y2 z8y2 z9y2 z1y3 z2y3 z3y3 z4y3 z5y3 z6y3 z7y3 z8y3
               z9y3 z1y4 z2y4 z3y4 z4y4 z5y4 z6y4 z7y4 z8y4 z9y4 z1y5 z2y5 z3y5
               z4y5 z5y5 z6y5 z7y5 z8y5 z9y5 z1y6 z2y6 z3y6 z4y6 z5y6 z6y6 z7y6
               z8y6 z9y6 z1y7 z2y7 z3y7 z4y7 z5y7 z6y7 z7y7 z8y7 z9y7 z1y8 z2y8
               z3y8 z4y8 z5y8 z6y8 z7y8 z8y8 z9y8
------------------------------------------------------------------------------
. estimates store stackhet
. ivreg dlnhr ($XREG = $ZSTACKED), noconstant cluster(id)
IV (2SLS) regression with robust standard errors       Number of obs =    4256
                                                       F(  5,   531) =    2.41
                                                       Prob > F      =  0.0357
                                                       R-squared     =       .
Number of clusters (id) = 532                          Root MSE      =  .30716

------------------------------------------------------------------------------
             |               Robust
       dlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       dlnwg |    .542827   .2085225     2.60   0.009     .1331968    .9524572
       dkids |  -.0482932   .0245011    -1.97   0.049    -.0964242   -.0001622
       dageh |   .0268935   .0149934     1.79   0.073    -.0025602    .0563473
      dagesq |  -.0003511   .0001866    -1.88   0.060    -.0007176    .0000154
      ddisab |   .0079759   .0624423     0.13   0.898    -.1146884    .1306402
------------------------------------------------------------------------------
Instrumented:  dlnwg dkids dageh dagesq ddisab
Instruments:   z1y1 z2y1 z3y1 z4y1 z5y1 z6y1 z7y1 z8y1 z9y1 z1y2 z2y2 z3y2 z4y2
               z5y2 z6y2 z7y2 z8y2 z9y2 z1y3 z2y3 z3y3 z4y3 z5y3 z6y3 z7y3 z8y3
               z9y3 z1y4 z2y4 z3y4 z4y4 z5y4 z6y4 z7y4 z8y4 z9y4 z1y5 z2y5 z3y5
               z4y5 z5y5 z6y5 z7y5 z8y5 z9y5 z1y6 z2y6 z3y6 z4y6 z5y6 z6y6 z7y6
               z8y6 z9y6 z1y7 z2y7 z3y7 z4y7 z5y7 z6y7 z7y7 z8y7 z9y7 z1y8 z2y8
               z3y8 z4y8 z5y8 z6y8 z7y8 z8y8 z9y8
------------------------------------------------------------------------------
.
. * DISPLAY THE OLS AND 2SLS RESULTS
.
. * The following are used in Table 22.2 (page 755)
.
. * OLS column with various standard errors estimates
. estimates table olspanel olshet olsiid, /*
> */ se stats(N ll r2 tss rss mss rmse df_r) b(%10.3f)
-----------------------------------------------------
    Variable |  olspanel      olshet      olsiid
-------------+---------------------------------------
       dlnwg |     0.112       0.112       0.112
             |     0.096       0.079       0.023
       dkids |    -0.006      -0.006      -0.006
             |     0.011       0.011       0.012
       dageh |     0.007       0.007       0.007
             |     0.012       0.024       0.021
      dagesq |    -0.000      -0.000      -0.000
             |     0.000       0.000       0.000
      ddisab |    -0.035      -0.035      -0.035
             |     0.045       0.036       0.020
-------------+---------------------------------------
           N |  4256.000    4256.000    4256.000
          ll |  -837.557    -837.557    -837.557
          r2 |     0.006       0.006       0.006
         tss |
         rss |   369.369     369.369     369.369
         mss |     2.339       2.339       2.339
        rmse |     0.295       0.295       0.295
        df_r |   531.000    4251.000    4251.000
-----------------------------------------------------
                                         legend: b/se
.
. * 2SLS column base case with various standard errors estimates
. estimates table basepanel basehet baseiid, /*
> */ se stats(N ll r2 tss rss mss rmse df_r) b(%10.3f)
-----------------------------------------------------
    Variable | basepanel     basehet     baseiid
-------------+---------------------------------------
       dlnwg |     0.209       0.209       0.209
             |     0.374       0.423       0.389
       dkids |    -0.030      -0.030      -0.030
             |     0.029       0.040       0.044
       dageh |     0.026       0.026       0.026
             |     0.015       0.036       0.029
      dagesq |    -0.000      -0.000      -0.000
             |     0.000       0.000       0.000
      ddisab |     0.000       0.000       0.000
             |     0.067       0.073       0.043
-------------+---------------------------------------
           N |  4256.000    4256.000    4256.000
          ll |
          r2 |         .           .           .
         tss |
         rss |   371.543     371.543     371.543
         mss |     0.165       0.165       0.165
        rmse |     0.296       0.296       0.296
        df_r |   531.000    4251.000    4251.000
-----------------------------------------------------
                                         legend: b/se
.
. * 2SLS column stacked case with various standard errors estimates
. estimates table stackpanel stackhet stackiid, /*
> */ se stats(N ll r2 tss rss mss rmse df_r) b(%10.3f)
-----------------------------------------------------
    Variable | stackpanel    stackhet    stackiid
-------------+---------------------------------------
       dlnwg |     0.543       0.543       0.543
             |     0.209       0.226       0.169
       dkids |    -0.048      -0.048      -0.048
             |     0.025       0.035       0.039
       dageh |     0.027       0.027       0.027
             |     0.015       0.034       0.029
      dagesq |    -0.000      -0.000      -0.000
             |     0.000       0.000       0.000
      ddisab |     0.008       0.008       0.008
             |     0.062       0.064       0.040
-------------+---------------------------------------
           N |  4256.000    4256.000    4256.000
          ll |
          r2 |         .           .           .
         tss |
         rss |   401.079     401.079     401.079
         mss |   -29.371     -29.371     -29.371
        rmse |     0.307       0.307       0.307
        df_r |   531.000    4251.000    4251.000
-----------------------------------------------------
                                         legend: b/se
.
. ********** (4)-(5) 2SGMM REQUIRES SPECIAL MATRIX CODING **********
.
. *** PROGRAM PANELGMM DOES 2SLS (as check) and 2SGMM USING MATRIX COMMANDS
.
. * This program:
. * - requires as inputs the global macros
. *     y gives the dependent variable name
. *     X gives the list of regressor names
. *     Z gives the list of instrument names
. * - assumes the appropriate data is in memory
. * - assumes the cluster identifier is called id
.
. * If the regressors and instruments include an intercept include
. * this as a separate regressor, say called ONE, in X and Z.
. * Then continue to use the following code with the noconstant option for accum and optaccum.
. * (accum and optaccum automatically include a constant AT THE END,
. * which is not where we want the constant.)
.
(obs=4256)
(obs=4256)
(obs=4256)
2SLS results:
b2SLS[5,1]
dlnhr
dlnwg .20910869
dkids -.02968643
dageh .02638804
dagesq -.00034108
ddisab .00040197
rmse = .29563723
se2SLS[5,1]
c1
r1 .3736429
r2 .02932634
r3 .01537039
r4 .00018343
r5 .06667771
2SGMM results:
b2SGMM[5,1]
dlnhr
dlnwg .54679602
dkids -.04490416
dageh .02747594
dagesq -.00035912
ddisab -.0468348
rmse = .30719932
se2SGMM[5,1]
c1
r1 .32762396
r2 .02714405
r3 .01295984
r4 .00015941
r5 .06236006
Over-identifying restrictions test 5.4503878 dof 4 p-value .24412497
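The 2SGMM estimates and the over-identifying restrictions test above come from the hand-coded matrix program. As a rough cross-check rather than a replication, Stata 10 and later (which postdate this log) provide a built-in two-step GMM estimator. The sketch below assumes the globals XREG and ZBASECASE and the cluster identifier id defined earlier, and is meant to be run from a do-file because of the line continuation. Standard errors and the test statistic may differ somewhat from the matrix program because finite-sample adjustments need not coincide.

```stata
* Sketch only: built-in two-step GMM with a cluster-robust weighting matrix
* (ivregress requires Stata 10 or later; the log above was run under version 8)
ivregress gmm dlnhr ($XREG = $ZBASECASE), noconstant ///
    wmatrix(cluster id) vce(cluster id)

* Hansen J test of the over-identifying restrictions:
* 9 instruments less 5 regressors leaves 4 degrees of freedom
estat overid
```

The degrees of freedom agree with the "dof 4" reported by the matrix program, since both count instruments minus regressors.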
.
. * (5) 2SGMM (and 2SLS as check) using the stacked instrument set
. * Gives 2SGMM Stacked Case column of Table 22.2 (page 755)
.
. drop uhat yhat uhatsq uhat2 /* Obtained in panelgmm */
. global Z $ZSTACKED
.
. * Test weak instruments for dlnwg using panel robust inference
. quietly regress dlnwg $ZBASECASE, cluster(id)
. quietly test $ZBASECASE
. * This value should have been reported in the text on page 756
. * [Instead by mistake the F assuming iid errors below was reported]
. di "r2 = " e(r2) " F = " r(F) " p = " r(p) " dof = " r(df)
r2 = .00590049 F = 2.3790046 p = .01209278 dof = 9
.
. * Same except use wrong inference assuming iid errors
. quietly regress dlnwg $ZBASECASE
. quietly test $ZBASECASE
. di "r2 = " e(r2) " F = " r(F) " p = " r(p) " dof = " r(df)
r2 = .00590049 F = 2.800243 p = .00281135 dof = 9
.
. * (2) Weak Instruments using stacked instrument set
.
. * Test weak instruments for dlnwg using panel robust inference
. quietly regress dlnwg $ZSTACKED, cluster(id)
. quietly test $ZSTACKED
. * This value was reported in the text on page 756
. di "r2 = " e(r2) " F = " r(F) " p = " r(p) " dof = " r(df)
r2 = .02256803 F = 1.9000813 p = .00003808 dof = 72
.
. * Same except use wrong inference assuming iid errors
. quietly regress dlnwg $ZSTACKED
. quietly test $ZSTACKED
. di "r2 = " e(r2) " F = " r(F) " p = " r(p) " dof = " r(df)
r2 = .02256803 F = 1.341413 p = .02961833 dof = 72
.
. * (3) Weak Instruments for other regressors
. * Here all regressors are instrumented. So should test all as above.
. * These find no problems.
. * For example, for dkids and base case instrument set
. quietly regress dkids $ZSTACKED, cluster(id)
. quietly test $ZSTACKED
. di "r2 = " e(r2) " F = " r(F) " p = " r(p) " dof = " r(df)
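Because every regressor is instrumented, the comment above suggests repeating the first-stage test for each one. A minimal sketch of that loop for the base case instrument set is given below. It assumes the globals XREG and ZBASECASE from earlier in this log and uses the same cluster(id) option as the commands above; the display formats are illustrative choices.

```stata
* Sketch only: first-stage R-squared and joint F test for every regressor
foreach v of global XREG {
    quietly regress `v' $ZBASECASE, cluster(id)
    quietly test $ZBASECASE
    display "`v'" _col(10) "r2 = " %6.4f e(r2) "  F = " %7.3f r(F) ///
        "  p = " %6.4f r(p) "  dof = " r(df)
}
```

Replacing $ZBASECASE with $ZSTACKED in both lines gives the corresponding checks for the stacked instrument set.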
.
. * (2) Form x1hat - x1hattilda: residual from regress x1hat on fitted values of other regressors
. * (2A) First get the fitted values from regress endogenous on instruments
. quietly reg dlnwg $ZBASECASE
. predict dlnwghat, xb
. di e(r2) " r2 from regress x1 on Z"
.00590049 r2 from regress x1 on Z
. quietly reg dkids $ZBASECASE
. predict dkidshat, xb
. di e(r2) " r2 from regress second endog regressor on Z"
.1473738 r2 from regress second endog regressor on Z
. quietly reg dageh $ZBASECASE
. predict dagehhat, xb
. di e(r2) " r2 from regress third endog regressor on Z"
.13903221 r2 from regress third endog regressor on Z
. quietly reg dagesq $ZBASECASE
. predict dagesqhat, xb
. di e(r2) " r2 from regress fourth endog regressor on Z"
.3049799 r2 from regress fourth endog regressor on Z
. quietly reg ddisab $ZBASECASE
. predict ddisabhat, xb
. di e(r2) " r2 from regress fifth endog regressor on Z"
.26087493 r2 from regress fifth endog regressor on Z
. * (2B) Run the regression of x1hat on fitted values of other regressors
. quietly reg dlnwghat dkidshat dagehhat dagesqhat ddisabhat
. * quietly reg dkidshat dlnwghat dagehhat dagesqhat ddisabhat
. di e(r2) " r2 from regress prediction of x1 on predictions of x2"
.38268288 r2 from regress prediction of x1 on predictions of x2
. predict x1hatminusx1hattilda, resid
.
. * (3) Form the correlation between (1) and (2)
. * This value is reported in the text on page 756
. corr x1minusx1tilda x1hatminusx1hattilda
(obs=4256)

             | x1minu~a x1hatm~a
-------------+------------------
x1minusx1t~a |   1.0000
x1hatminus~a |   0.0604   1.0000
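If the quantity of interest is a partial R-squared of the Shea type, it is the square of the correlation between the two residuals formed in steps (1) and (2), so the displayed value 0.0604 corresponds to roughly 0.0036. A minimal sketch of that final step, assuming the two residual variables above are still in memory:

```stata
* Sketch only: square the correlation between the two residual series
quietly correlate x1minusx1tilda x1hatminusx1hattilda
display "squared correlation = " %8.6f r(rho)^2
```

Using r(rho) avoids the rounding in the four-decimal correlation table.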
-------------------------------------------------------------------------------
       log:  c:\Imbook\bwebpage\Section5\mma23p1pannonlin.txt
  log type:  text
 opened on:  23 May 2005, 12:46:16
.
. ********** OVERVIEW OF MMA23P1PANNONLIN.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 23.3 pages 792-5
. * Example of nonlinear model (multiplicative effects)
.
. * This program derives Table 23.1 and Figure 23.1.
. * It performs nonlinear panel analysis for multiplicative effects model
. * y_it = a_i*exp(x_it'b) = exp(c_i+x_it'b)
. * and parametric count data models
.
. * (1) Linear (xtreg) for log(PAT) with adjustment for PAT=0
. *     Output includes Figure 23.1
. * (2) Poisson (xtpoisson) fixed and random effects
. * (3) GEE (xtgee) which includes pooled NLS
.
. * The Poisson individual effects model is
. * y_it ~ Poisson[exp(x_it'b + c_i)], where c_i = ln(a_i)
. * The standard errors assume this model is correctly specified
. * i.e. Variance = mean given x_it and a_i
.
. * For panel-robust se's see section 23.2.6 pages 788-791
. * To obtain standard errors that are more panel robust, this program uses a panel bootstrap
. * Note that the panel se entries of 0.033 under GEE, Poisson-RE and Poisson-FE
. * are not panel robust to the extent that the bootstrap se's are panel robust
. * and in fact are the usual se's in the case of Poisson-RE and Poisson-FE
. * Unlike ch.21 here "panel se" means "default panel se" and not "panel-robust se".
.
. * To speed up program reduce nreps, the number of bootstrap replications
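A panel bootstrap resamples whole firms (all years for a given id) rather than individual observations. As a minimal sketch, assuming Stata 10 or later and the variables created later in this program, the built-in vce(bootstrap) option of xtpoisson could be used as below. The regressor list, the year restriction, the number of replications, and the seed are all illustrative assumptions and are not taken from the original program, which codes its own bootstrap.

```stata
* Sketch only: panel (cluster) bootstrap for Poisson fixed effects
* Regressors, sample restriction, reps and seed are illustrative assumptions
xtset id
xtpoisson PAT LOGR dyear2-dyear5 if year >= 75, fe ///
    vce(bootstrap, reps(200) seed(10101))
```

Only the panel variable is declared in xtset here, since the fixed effects Poisson estimator does not need a time variable and resampled panels are then handled without duplicate time values.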
.
. * To run this program you need data file
. * patr7079.asc
.
. ********** SETUP **********
.
. set more off
. version 8.0
. set scheme s1mono /* Graphics scheme */
.
. ********** DATA DESCRIPTION **********
.
. * There are ten years of data but only five years 1975-79 are used in estimation
.
. * The original data is from
. * Bronwyn Hall, Zvi Griliches, and Jerry Hausman (1986),
. * "Patents and R&D: Is There a Lag?",
. * International Economic Review, 27, 265-283.
.
. * File patr7079.dat has data on 346 firms
. * There are 4 lines per firm, with 25 variables
. * Time-invariant: CUSIP,ARDSSIC,SCISECT,LOGK,SUMPAT,
. * Time-varying X: LOGR70,LOGR71,LOGR72, ....., LOGR77,LOGR78,LOGR79
. * Time-varying Y: PAT70,PAT71,PAT72, ....., PAT77,PAT78,PAT79
. * in the format:
. * I7,I3,I2,5F12.6/6F12.6/6F12.6/5F12.6/
. * where
. * CUSIP   Compustat's identifying number for the firm (Committee on
. *         Uniform Security Identification Procedures number).
. * ARDSSIC A two-digit code for the applied R&D industrial classification
. *         (roughly that in Bound, Cummins, Griliches, Hall, and Jaffe, in
. *         the Griliches R&D, Patents, and Productivity volume).
. * SCISECT Dummy equal to one for firms in the scientific sector.
. * LOGK    The logarithm of the book value of capital in 1972.
. * SUMPAT  The sum of patents applied for between 1972-1979.
. * LOGR70- The logarithm of R&D spending during the year (in 1972 dollars).
. * LOGR79
. * PAT70-  The number of patents applied for during the year that were
. * PAT79   eventually granted.
.
. ********** READ DATA **********
.
. * The data are in ascii file patr7079.asc
. * There are 346 observations on 25 variables with four lines per obs
. * The data are fixed format with
. * line 1 variables 1-8 I7,I3,I2,5F12.6
. * line 2 variables 9-14 6F12.6
. * line 3 variables 15-20 6F12.6
. * line 4 variables 21-25 5F12.6
.
. * Read in using Infile: FREE FORMAT WITHOUT DICTIONARY
. * As there is space between each observation data is also space-delimited
. * free format and then there is no need for a dictionary file
. * The following command spans more than one line so use /* and */
. infile CUSIP ARDSSIC SCISECT LOGK SUMPAT LOGR70 LOGR71 LOGR72 LOGR73 /*
> */ LOGR74 LOGR75 LOGR76 LOGR77 LOGR78 LOGR79 PAT70 PAT71 PAT72 /*
> */ PAT73 PAT74 PAT75 PAT76 PAT77 PAT78 PAT79 using patr7079.asc
(346 observations read)
.
. ********** DATA TRANSFORMATIONS **********
.
. * Use observation number as an identifier, not just CUSIP
. gen id = _n
. label variable id "id"
. * The following lists the variables in data set and summarizes data
. describe
Contains data
  obs:           346
 vars:            26
 size:        37,368 (99.6% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
CUSIP           float  %9.0g
ARDSSIC         float  %9.0g
SCISECT         float  %9.0g
LOGK            float  %9.0g
SUMPAT          float  %9.0g
LOGR70          float  %9.0g
LOGR71          float  %9.0g
LOGR72          float  %9.0g
LOGR73          float  %9.0g
LOGR74          float  %9.0g
LOGR75          float  %9.0g
LOGR76          float  %9.0g
LOGR77          float  %9.0g
LOGR78          float  %9.0g
LOGR79          float  %9.0g
PAT70           float  %9.0g
PAT71           float  %9.0g
PAT72           float  %9.0g
PAT73           float  %9.0g
PAT74           float  %9.0g
PAT75           float  %9.0g
PAT76           float  %9.0g
PAT77           float  %9.0g
PAT78           float  %9.0g
PAT79           float  %9.0g
id              float  %9.0g                  id
-------------------------------------------------------------------------------
Sorted by:
     Note:  dataset has changed since last saved
. summarize
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
       CUSIP |       346    531201.2    282074.9        800     989399
     ARDSSIC |       336     9.97619    5.459706          1         21
     SCISECT |       346    .4248555    .4950369          0          1
        LOGK |       346    3.921063    2.095542   -1.76965    9.66626
      SUMPAT |       346    284.7312    571.1136          0       3806
-------------+--------------------------------------------------------
      LOGR70 |       346    1.198348    1.941968   -3.67354    6.56641
      LOGR71 |       346    1.169182    1.929444   -3.53055    6.95687
      LOGR72 |       346    1.185953    1.929078   -3.35241    6.97009
      LOGR73 |       346    1.231135    1.934896   -3.67395    7.06211
      LOGR74 |       346    1.232636    1.946417   -3.15274    7.06524
-------------+--------------------------------------------------------
      LOGR75 |       346    1.165802     1.98001    -3.5476    6.76486
      LOGR76 |       346    1.212888    1.979273   -3.84868     6.8285
      LOGR77 |       346    1.250034    2.003002   -3.47884    6.90253
      LOGR78 |       346    1.306511    2.019792    -3.2832    6.96345
      LOGR79 |       346    1.345581    2.054982   -3.57742    7.03432
-------------+--------------------------------------------------------
       PAT70 |       346    40.00289    82.50335          0        608
       PAT71 |       346    38.10983    78.40308          0        553
       PAT72 |       346    36.30925    74.81591          0        557
       PAT73 |       346    36.95376    77.91971          0        595
       PAT74 |       346    37.60983    75.94388          0        528
-------------+--------------------------------------------------------
       PAT75 |       346    36.87283    75.98788          0        508
       PAT76 |       346    35.84682    73.31613          0        487
       PAT77 |       346    36.23121    72.75146          0        456
       PAT78 |       346    32.80636     65.6505          0        434
       PAT79 |       346    32.10116    66.36197          0        515
-------------+--------------------------------------------------------
          id |       346       173.5    100.0258          1        346
.
. ******** CHANGE ORGANIZATION OF DATA USING RESHAPE AND MORE TRANSFORMATIONS
.
. reshape long PAT LOGR, i(id) j(year)
(note: j = 70 71 72 73 74 75 76 77 78 79)

Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                      346   ->    3460
Number of variables                  26   ->       9
j variable (10 values)                    ->   year
xij variables:
                  PAT70 PAT71 ... PAT79   ->   PAT
               LOGR70 LOGR71 ... LOGR79   ->   LOGR
-----------------------------------------------------------------------------
. describe
Contains data
  obs:         3,460
 vars:             9
 size:       128,020 (98.7% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
id              float  %9.0g                  id
year            byte   %9.0g
CUSIP           float  %9.0g
ARDSSIC         float  %9.0g
SCISECT         float  %9.0g
LOGK            float  %9.0g
SUMPAT          float  %9.0g
LOGR            float  %9.0g
PAT             float  %9.0g
-------------------------------------------------------------------------------
Sorted by:  id  year
     Note:  dataset has changed since last saved
. summarize
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
          id |      3460       173.5    99.89562          1        346
        year |      3460        74.5    2.872696         70         79
       CUSIP |      3460    531201.2    281707.7        800     989399
     ARDSSIC |      3360     9.97619    5.452387          1         21
     SCISECT |      3460    .4248555    .4943925          0          1
-------------+--------------------------------------------------------
        LOGK |      3460    3.921063    2.092814   -1.76965    9.66626
      SUMPAT |      3460    284.7312    570.3701          0       3806
        LOGR |      3460    1.229807    1.970524   -3.84868    7.06524
         PAT |      3460    36.28439    74.46563          0        608
.
. * Create new variable log(patents) with adjustment for patents = 0
. gen NEWPAT = PAT
. replace NEWPAT = 0.5 if NEWPAT==0.
(605 real changes made)
. gen LPAT = ln(NEWPAT)
. label variable LPAT "Ln(Patents)"
. label variable PAT "Patents"
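The adjustment above replaces zero patent counts by 0.5 before taking logs, so that the log is defined for all 3,460 observations. As a minimal sketch, the same transformation can be written in one step with cond(). The name LPAT2 is hypothetical, chosen so as not to overwrite LPAT, and the assert simply checks that the two versions agree.

```stata
* Sketch only: log of patents with zeros replaced by 0.5, in one step
* (LPAT2 is a hypothetical name used to avoid clashing with LPAT above)
gen LPAT2 = ln(cond(PAT == 0, 0.5, PAT))
assert LPAT2 == LPAT
```

Note that ln(0.5) is about -0.693, which is the minimum of LPAT reported in the later summary.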
. gen dyear3 = 0
. replace dyear3 = 1 if year==77
(346 real changes made)
. gen dyear4 = 0
. replace dyear4 = 1 if year==78
(346 real changes made)
. gen dyear5 = 0
. replace dyear5 = 1 if year==79
(346 real changes made)
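The year indicators above follow a fixed pattern (dyear3 for 1977, dyear4 for 1978, dyear5 for 1979). As a minimal sketch, the three visible ones can be generated in a single loop. The names dyr3 to dyr5 are hypothetical, and the assert checks each against the indicator created above.

```stata
* Sketch only: generate the year indicators in a loop
* (dyr3-dyr5 are hypothetical names; they should equal dyear3-dyear5)
forvalues j = 3/5 {
    gen byte dyr`j' = (year == 74 + `j')
    assert dyr`j' == dyear`j'
}
```

A logical expression such as (year == 77) evaluates to 1 or 0, which removes the need for the separate gen and replace steps.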
.
. * Check data and Save data as Stata data set
. describe
Contains data
  obs:         3,460
 vars:            22
 size:       307,940 (97.0% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
id              float  %9.0g                  id
year            byte   %9.0g
CUSIP           float  %9.0g
ARDSSIC         float  %9.0g
SCISECT         float  %9.0g
LOGK            float  %9.0g
SUMPAT          float  %9.0g
LOGR            float  %9.0g                  Ln(R&D)
PAT             float  %9.0g                  Patents
NEWPAT          float  %9.0g
LPAT            float  %9.0g                  Ln(Patents)
DPAT            float  %9.0g                  Patent Indicator
RANDD           float  %9.0g                  R&D
LOGRL1          float  %9.0g                  Ln(R&D) lagged once
LOGRL2          float  %9.0g                  Ln(R&D) lagged twice
LOGRL3          float  %9.0g                  Ln(R&D) lagged three times
LOGRL4          float  %9.0g                  Ln(R&D) lagged four times
LOGRL5          float  %9.0g                  Ln(R&D) lagged five times
dyear2          float  %9.0g
dyear3          float  %9.0g
dyear4          float  %9.0g
dyear5          float  %9.0g
-------------------------------------------------------------------------------
Sorted by:  id  year
         PAT |      3460    36.28439    74.46563          0        608
        LPAT |      3460    1.935464    1.949421  -.6931472   6.410175
-------------+--------------------------------------------------------
        DPAT |      3460    .8251445    .3798984          0          1
       RANDD |      3460    23.02263    82.90186   .0213078   1170.563
      LOGRL1 |      3114    1.216943    1.960836   -3.84868    7.06524
      LOGRL2 |      2768    1.205747    1.953427   -3.84868    7.06524
      LOGRL3 |      2422     1.19942    1.946583   -3.84868    7.06524
-------------+--------------------------------------------------------
      LOGRL4 |      2076    1.197176    1.941555   -3.67395    7.06524
      LOGRL5 |      1730    1.203451    1.934293   -3.67395    7.06524
      dyear2 |      3460          .1    .3000434          0          1
      dyear3 |      3460          .1    .3000434          0          1
      dyear4 |      3460          .1    .3000434          0          1
-------------+--------------------------------------------------------
      dyear5 |      3460          .1    .3000434          0          1
. xtsum, i(id)
Variable         |       Mean  Std. Dev.        Min        Max |    Observations
-----------------+---------------------------------------------+----------------
id       overall |      173.5   99.89562          1        346 |     N =    3460
         between |              100.0258          1        346 |     n =     346
         within  |                     0      173.5      173.5 |     T =      10
                 |                                             |
year     overall |       74.5   2.872696         70         79 |     N =    3460
         between |                     0       74.5       74.5 |     n =     346
         within  |              2.872696         70         79 |     T =      10
                 |                                             |
CUSIP    overall |   531201.2   281707.7        800     989399 |     N =    3460
         between |              282074.9        800     989399 |     n =     346
         within  |                     0   531201.2   531201.2 |     T =      10
                 |                                             |
ARDSSIC  overall |    9.97619   5.452387          1         21 |     N =    3360
         between |              5.459706          1         21 |     n =     336
         within  |                     0    9.97619    9.97619 |     T =      10
                 |                                             |
SCISECT  overall |   .4248555   .4943925          0          1 |     N =    3460
         between |              .4950369          0          1 |     n =     346
         within  |                     0   .4248555   .4248555 |     T =      10
                 |                                             |
LOGK     overall |   3.921063   2.092814   -1.76965    9.66626 |     N =    3460
         between |              2.095542   -1.76965    9.66626 |     n =     346
         within  |                     0   3.921063   3.921063 |     T =      10
                 |                                             |
SUMPAT   overall |   284.7312   570.3701          0       3806 |     N =    3460
         between |              571.1136          0       3806 |     n =     346
         within  |                     0   284.7312   284.7312 |     T =      10
                 |                                             |
LOGR     overall |   1.229807   1.970524   -3.84868    7.06524 |     N =    3460
         between |              1.944421  -3.120133   6.911438 |     n =     346
         within  |              .3347099   -1.19673   4.218814 |     T =      10
                 |                                             |
PAT      overall |   36.28439   74.46563          0        608 |     N =    3460
         between |               72.5989          0      484.8 |     n =     346
         within  |              16.97772  -177.7156   224.3844 |     T =      10
                 |                                             |
LPAT     overall |   1.935464   1.949421  -.6931472   6.410175 |     N =    3460
         between |              1.873181  -.6931472   6.180623 |     n =     346
         within  |              .5482375  -.2643028   4.368045 |     T =      10
                 |                                             |
DPAT     overall |   .8251445   .3798984          0          1 |     N =    3460
         between |              .2831052          0          1 |     n =     346
         within  |              .2537376  -.0748555   1.725145 |     T =      10
                 |                                             |
RANDD    overall |   23.02263   82.90186   .0213078   1170.563 |     N =    3460
         between |              81.69163   .0582575   1014.058 |     n =     346
         within  |              14.71596  -280.2214     311.47 |     T =      10
                 |                                             |
LOGRL1   overall |   1.216943   1.960836   -3.84868    7.06524 |     N =    3114
         between |              1.937733  -3.123236   6.897784 |     n =     346
         within  |              .3157841  -.6151992   4.203909 |     T =       9
                 |                                             |
LOGRL2   overall |   1.205747   1.953427   -3.84868    7.06524 |     N =    2768
         between |              1.932143   -3.12461   6.889576 |     n =     346
         within  |              .3035537   -.486563   4.187752 |     T =       8
                 |                                             |
LOGRL3   overall |    1.19942   1.946583   -3.84868    7.06524 |     N =    2422
         between |              1.926813  -3.074006   6.887726 |     n =     346
         within  |              .2928787  -.2381882   4.153968 |     T =       7
                 |                                             |
LOGRL4   overall |   1.197176   1.941555   -3.67395    7.06524 |     N =    2076
         between |              1.923302  -2.989647   6.897597 |     n =     346
         within  |              .2818841  -.2335892   4.095286 |     T =       6
                 |                                             |
LOGRL5   overall |   1.203451   1.934293   -3.67395    7.06524 |     N =    1730
         between |              1.917687   -2.99075   6.924144 |     n =     346
         within  |              .2692134  -.1899074   4.062701 |     T =       5
                 |                                             |
dyear2   overall |         .1   .3000434          0          1 |     N =    3460
         between |                     0         .1         .1 |     n =     346
         within  |              .3000434          0          1 |     T =      10
                 |                                             |
dyear3   overall |         .1   .3000434          0          1 |     N =    3460
         between |                     0         .1         .1 |     n =     346
         within  |              .3000434          0          1 |     T =      10
                 |                                             |
dyear4   overall |         .1   .3000434          0          1 |     N =    3460
         between |                     0         .1         .1 |     n =     346
         within  |              .3000434          0          1 |     T =      10
                 |                                             |
dyear5   overall |         .1   .3000434          0          1 |     N =    3460
         between |                     0         .1         .1 |     n =     346
         within  |              .3000434          0          1 |     T =      10
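The overall, between, and within rows reported by xtsum can be reproduced by hand: "between" summarizes the unit means, and "within" summarizes each observation's deviation from its unit mean with the grand mean added back. A minimal sketch on a hypothetical three-unit, two-period panel (invented values; `xtset` assumes a release newer than the Stata 8 used in this log):

```stata
* Toy balanced panel: 3 units observed in 2 periods (illustrative values only)
clear
input id t x
1 1 1
1 2 3
2 1 4
2 2 6
3 1 8
3 2 8
end

egen double xbar_i = mean(x), by(id)         // unit means
egen double xbar   = mean(x)                 // grand mean
generate double xwithin = x - xbar_i + xbar  // within-transformed x
egen byte first = tag(id)                    // flags one row per unit

summarize x                   // overall
summarize xbar_i if first     // between: one mean per unit
summarize xwithin             // within

* Compare with the built-in command
xtset id t
xtsum x
```

This is why time-invariant variables such as LOGK show a within standard deviation of 0 above, and why year, which has the same values for every unit, shows a between standard deviation of 0.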
.
. ********** DEFINE GLOBALS INCLUDING REGRESSOR LIST **********
.
. * Number of reps for the bootstrap
. * Table 23.1 used 500
. global nreps 500
.
. * The regressions below are of patents on LOGR ??? on ???
. * Additional regressors to be included below are defined in xextra
. * Here no additional regressors
. global xextra
.
. ********** (1) LINEAR PANEL RANDOM AND FIXED EFFECTS FOR LOG(PAT) **********
.
. * This ad hoc method uses as dependent variable
. *   LPAT = ln(PAT) if PAT > 0
. *        = ln(0.5) if PAT = 0
. * which is analyzed using chapter 21 methods
.
. * Note that in the first xt command need to give , i(id)
. * to indicate that the ith observation is for the ith id
. * Time invariant regressors LOGK SCISECT are not included
.
. use patr7079, clear
. drop if year<75
(1730 observations deleted)
.
. * Overall plot of data
. * The graphs below use new Stata 8 graphics
. * Change graphics scheme from default s2color to s1mono for printing
. set scheme s1mono
.
. * Figure 21.1 page 792 [with axis labels corrected - book is wrong]
. graph twoway (scatter LPAT LOGR, msize(vsmall)) (lowess LPAT LOGR) (lfit LPAT LOGR), /*
> */ scale (1.2) plotregion(style(none)) /*
> */ title("Pooled (Overall) Regression") /*
> */ xtitle("Log R&D Spending", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Log Patents", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend(pos(4) ring(0) col(1)) legend(size(small)) /*
> */ legend( label(1 "Original data") label(2 "Nonparametric fit") label(3 "Linear fit"))
. graph export ch23fig1.wmf, replace
                                                Number of obs      =      1730
                                                Number of groups   =       346
                                                Obs per group: min =
                                                               avg =       5.0
                                                               max =         5
                                                F(1,1383)          =      3.63
                                                Prob > F           =    0.0570
------------------------------------------------------------------------------
        LPAT |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        LOGR |   .1067505   .0560364     1.91   0.057    -.0031749     .216676
       _cons |   1.709116   .0714557    23.92   0.000     1.568943    1.849289
-------------+----------------------------------------------------------------
     sigma_u |  1.7380872
     sigma_e |  .51119065
         rho |  .92038546   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(345, 1383) =    16.96           Prob > F = 0.0000
. estimates store linfe
.
. * Random effects
. xtreg LPAT LOGR $xextra, re i(id)
Random-effects GLS regression                   Number of obs      =      1730
Group variable (i): id                          Number of groups   =       346
R-sq:  within  = 0.0026
       between = 0.7669
       overall = 0.7192
Random effects u_i ~ Gaussian                   Wald chi2(1)       =    915.90
corr(u_i, X)       = 0 (assumed)                Prob > chi2        =    0.0000
------------------------------------------------------------------------------
        LPAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        LOGR |   .7202377   .0237986    30.26   0.000     .6735932    .7668821
       _cons |   .9384761   .0599584    15.65   0.000     .8209598    1.055992
-------------+----------------------------------------------------------------
     sigma_u |  .90057544
     sigma_e |  .51119065
         rho |   .7563152   (fraction of variance due to u_i)
------------------------------------------------------------------------------
. estimates store linre
.
.
. ********** (2) POISSON RANDOM AND FIXED EFFECTS (Table 23.1 p.794) **********
.
. use patr7079, clear
. drop if year<75
(1730 observations deleted)
.
. * Poisson Cross-section with Poisson standard errors
. * Table 23.1 Poisson column
.
. poisson PAT LOGR $xextra
Iteration 0: log likelihood = -21030.607
Iteration 1: log likelihood = -21030.583
Iteration 2: log likelihood = -21030.583
Poisson regression                                Number of obs   =       1730
                                                  LR chi2(1)      =  108479.76
                                                  Prob > chi2     =     0.0000
Log likelihood = -21030.583                       Pseudo R2       =     0.7206
------------------------------------------------------------------------------
         PAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        LOGR |   .6929337   .0022454   308.61   0.000     .6885329    .6973346
       _cons |   1.711528    .009767   175.24   0.000     1.692385    1.730671
------------------------------------------------------------------------------
. estimates store poisiid
.
. * Poisson Cross-section with heteroskedastic robust standard errors
. poisson PAT LOGR $xextra, robust
Iteration 0: log pseudo-likelihood = -21030.607
Iteration 1: log pseudo-likelihood = -21030.583
Iteration 2: log pseudo-likelihood = -21030.583
Poisson regression                                Number of obs   =       1730
                                                  Wald chi2(1)    =    1223.63
                                                  Prob > chi2     =     0.0000
Log pseudo-likelihood = -21030.583                Pseudo R2       =     0.7206
------------------------------------------------------------------------------
             |               Robust
         PAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        LOGR |   .6929337   .0198092    34.98   0.000     .6541084     .731759
       _cons |   1.711528   .0620025    27.60   0.000     1.590006    1.833051
------------------------------------------------------------------------------
. estimates store poishet
.
. * Poisson Cross-section with panel robust standard errors
. poisson PAT LOGR $xextra, cluster(id)
Iteration 0: log pseudo-likelihood = -21030.607
Iteration 1: log pseudo-likelihood = -21030.583
Iteration 2: log pseudo-likelihood = -21030.583
Poisson regression                                Number of obs   =       1730
                                                  Wald chi2(1)    =     259.15
Log pseudo-likelihood = -21030.583                Prob > chi2     =     0.0000
                                                                          1620
                                                  Wald chi2(1)    =       1.35
                                                  Prob > chi2     =     0.2460
------------------------------------------------------------------------------
         PAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        LOGR |  -.0377642   .0325518    -1.16   0.246    -.1015645     .026036
------------------------------------------------------------------------------
. estimates store poisfe
.
. /*
> * Alternative way is to put in dummy variables
> set matsize 400
> xi: poisson PAT LOGR $xextra i.id
> */
.
. * Poisson panel random effects
. * Table 23.1 p.794 Poisson-RE column
.
. * Poisson random effects
. xtpoisson PAT LOGR $xextra, re i(id)
Fitting Poisson model:
Iteration 0: log likelihood = -21030.607
Iteration 1: log likelihood = -21030.583
Iteration 2: log likelihood = -21030.583
                                                                   =      1730
                                                                           346
                                                  Wald chi2(1)     =    110.20
Log likelihood  = -5553.1787                      Prob > chi2      =    0.0000
------------------------------------------------------------------------------
         PAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        LOGR |   .3487832   .0332254    10.50   0.000     .2836625    .4139039
       _cons |   2.312705    .124758    18.54   0.000     2.068184    2.557226
-------------+----------------------------------------------------------------
    /lnalpha |   .5454692   .0899144                      .3692402    .7216983
-------------+----------------------------------------------------------------
       alpha |   1.725418   .1551399                      1.446635    2.057925
------------------------------------------------------------------------------
Likelihood-ratio test of alpha=0: chibar2(01) =  3.1e+04 Prob>=chibar2 = 0.000
. estimates store poisre
.
. * Poisson random effects with normal error
. xtpoisson PAT LOGR $xextra, re i(id) normal
Fitting comparison Poisson model:
Iteration 0: log likelihood = -21030.607
Iteration 1: log likelihood = -21030.583
Iteration 2: log likelihood = -21030.583
Fitting constant-only model:
tau =  0.0
tau =  0.1
tau =  0.2
tau =  0.3
tau =  0.4
tau =  0.5
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
                                                                   =      1730
                                                                           346
                                                  LR chi2(0)       =   2649.21
Log likelihood  = -6261.9825                      Prob > chi2      =
------------------------------------------------------------------------------
         PAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        LOGR |    .815977          .        .       .            .           .
       _cons |   1.156293          .        .       .            .           .
-------------+----------------------------------------------------------------
    /lnsig2u |  -1.310299          .        .       .            .           .
-------------+----------------------------------------------------------------
     sigma_u |   .5193643          .                             .           .
------------------------------------------------------------------------------
Likelihood-ratio test of sigma_u=0: chibar2(01) =  3.0e+04 Pr>=chibar2 = 0.000
. estimates store poisrenormal
.
. * Poisson random effects population averaged
. xtpoisson PAT LOGR $xextra, pa i(id)
Iteration 1: tolerance = .09172122
Iteration 2: tolerance = .02686915
Iteration 3: tolerance = .00712438
Iteration 4: tolerance = .00159015
Iteration 5: tolerance = .00032104
Iteration 6: tolerance = .00006195
Iteration 7: tolerance = .00001174
Iteration 8: tolerance = 2.209e-06
.
. ********** (3) POISSON GEE (GENERALIZED ESTIMATING EQUATIONS) **********
.
. * Xtgee should reproduce Poisson random effects population averaged
. xtgee PAT LOGR $xextra, corr(exchangeable) family(poisson) link(log) i(id)
Iteration 1: tolerance = .09172122
Iteration 2: tolerance = .02686915
Iteration 3: tolerance = .00712438
Iteration 4: tolerance = .00159015
Iteration 5: tolerance = .00032104
Iteration 6: tolerance = .00006195
Iteration 7: tolerance = .00001174
Iteration 8: tolerance = 2.209e-06
Iteration 9: tolerance = 4.146e-07
GEE population-averaged model                   Number of obs      =      1730
Group variable:                         id      Number of groups   =       346
Link:                                  log      Obs per group: min =         5
Family:                            Poisson                     avg =       5.0
Correlation:                  exchangeable                     max =         5
                                                Wald chi2(1)       =  16317.27
Scale parameter:                         1      Prob > chi2        =    0.0000
------------------------------------------------------------------------------
         PAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        LOGR |   .5595302   .0043803   127.74   0.000      .550945    .5681153
       _cons |   2.067515   .0185166   111.66   0.000     2.031223    2.103807
------------------------------------------------------------------------------
. estimates store poisgee
.
. * Xtgee should reproduce Poisson random effects population averaged with robust se
. xtgee PAT LOGR $xextra, corr(exchangeable) family(poisson) link(log) i(id) robust
Iteration 1: tolerance = .09172122
Iteration 2: tolerance = .02686915
Iteration 3: tolerance = .00712438
Iteration 4: tolerance = .00159015
Iteration 5: tolerance = .00032104
Iteration 6: tolerance = .00006195
Iteration 7: tolerance = .00001174
Iteration 8: tolerance = 2.209e-06
Iteration 9: tolerance = 4.146e-07
GEE population-averaged model                   Number of obs      =      1730
Group variable:                         id      Number of groups   =       346
Link:                                  log      Obs per group: min =         5
Family:                            Poisson                     avg =       5.0
Correlation:                  exchangeable                     max =         5
                                                Wald chi2(1)       =    293.80
Scale parameter:                         1      Prob > chi2        =    0.0000
                                 3565052.8      Deviance           = 3565052.8
                                  2060.724      Dispersion         =  2060.724
------------------------------------------------------------------------------
         PAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        LOGR |   .5084673   .0105636    48.13   0.000      .487763    .5291716
       _cons |   2.528729   .0544558    46.44   0.000     2.421997     2.63546
------------------------------------------------------------------------------
. estimates store nls
.
. * Xtgee should give NLS of exponential mean with robust standard errors
. xtgee PAT LOGR $xextra, corr(independent) family(gaussian) link(log) i(id) robust
Iteration 1: tolerance = 8.014e-08
GEE population-averaged model                   Number of obs      =      1730
Group variable:                         id      Number of groups   =       346
Link:                                  log      Obs per group: min =         5
Family:                           Gaussian                     avg =       5.0
Correlation:                   independent                     max =         5
                                                Wald chi2(1)       =     85.32
Scale parameter:                  2060.724      Prob > chi2        =    0.0000
Pearson chi2(1730):              3565052.8      Deviance           = 3565052.8
Dispersion (Pearson):             2060.724      Dispersion         =  2060.724
                                                  Number of obs    =      1730
                                                  N of clusters    =       346
                                                  Replications     =       500
BC = bias-corrected
. matrix poisbootse = e(se)
.
. * Poisson fixed effects panel bootstrap standard errors
. set seed 10001
. bootstrap "xtpoisson PAT LOGR $xextra, fe i(id)" "_b[LOGR]", cluster(id) reps($nreps) level(95)
command:      xtpoisson PAT LOGR , fe i(id)
statistic:    _bs_1      = _b[LOGR]
Bootstrap statistics                              Number of obs    =      1620
                                                  N of clusters    =       324
                                                  Replications     =       500
                                                  Number of obs    =      1730
                                                  N of clusters    =       346
                                                  Replications     =       500
             |                                          .2775298   .5040658  (BC)
       _bs_2 |   500  2.312705  .5382745  .4384781      1.451214   3.174196   (N)
             |                                          2.104445   3.743506   (P)
             |                                          1.804036   2.552794  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
. matrix poisrebootse = e(se)
.
. * Poisson population averaged panel bootstrap standard errors
. set seed 10001
. bootstrap "xtpoisson PAT LOGR $xextra, pa i(id)" "_b[LOGR] _b[_cons]", cluster(id) reps($nreps) level(95)
command:      xtpoisson PAT LOGR , pa i(id)
statistics:   _bs_1      = _b[LOGR]
              _bs_2      = _b[_cons]
Bootstrap statistics                              Number of obs    =      1730
                                                  N of clusters    =       346
                                                  Replications     =       500
command:      xtgee PAT LOGR , corr(independent) family(gaussian) link(log) i(id)
statistics:   _bs_1      = _b[LOGR]
              _bs_2      = _b[_cons]
Bootstrap statistics                              Number of obs    =      1730
                                                  N of clusters    =       346
                                                  Replications     =       500
. ********** DISPLAY RESULTS FOR (1)-(3) GIVEN IN TABLE 23.1 page 794 **********
.
. * Standard error using iid errors and in some cases panel
.
. estimates table linolspan linfe linre, t se /*
> */ stats(N ll r2 tss rss mss rmse df_r) b(%10.3f)
-----------------------------------------------------
    Variable |  linolspan        linfe        linre
-------------+---------------------------------------
        LOGR |      0.834        0.107        0.720
             |      0.023        0.056        0.024
             |      36.48         1.91        30.26
       _cons |      0.795        1.709        0.938
             |      0.058        0.071        0.060
             |      13.73        23.92        15.65
-------------+---------------------------------------
           N |   1730.000     1730.000     1730.000
          ll |  -2531.658    -1100.267
          r2 |      0.719        0.003
         tss |                6732.584
         rss |   1890.831      361.400
         mss |   4841.753        0.948
        rmse |      1.046        0.511
        df_r |    345.000     1383.000
-----------------------------------------------------
                                       legend: b/se/t
. estimates table poisiid poishet poispan, t se /*
> */ stats(N ll r2 tss rss mss rmse df_r) b(%10.3f)
-----------------------------------------------------
    Variable |    poisiid      poishet      poispan
-------------+---------------------------------------
        LOGR |      0.693        0.693        0.693
             |      0.002        0.020        0.043
             |     308.61        34.98        16.10
       _cons |      1.712        1.712        1.712
             |      0.010        0.062        0.134
             |     175.24        27.60        12.77
-------------+---------------------------------------
           N |   1730.000     1730.000     1730.000
          ll | -21030.583   -21030.583   -21030.583
          r2 |
         tss |
         rss |
         mss |
        rmse |
        df_r |
-----------------------------------------------------
                                       legend: b/se/t
        LOGR |      0.560        0.560        0.508        0.508
             |      0.004        0.033        0.011        0.055
             |     127.74        17.14        48.13         9.24
       _cons |      2.068        2.068        2.529        2.529
             |      0.019        0.111        0.054        0.218
             |     111.66        18.57        46.44        11.62
-------------+----------------------------------------------------
           N |   1730.000     1730.000     1730.000     1730.000
          ll |
          r2 |
         tss |
         rss |
         mss |
        rmse |
        df_r |
------------------------------------------------------------------
                                                    legend: b/se/t
.
. ********** CLOSE OUTPUT
. log close
log: c:\Imbook\bwebpage\Section5\mma23p1pannonlin.txt
log type: text
closed on: 23 May 2005, 12:53:45
------------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section6\mma24p1olscluster.txt
log type: text
opened on: 24 May 2005, 14:33:58
.
. ********** OVERVIEW OF MMA24P1OLSCLUSTER.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 24.7 pages 848-53 Table 24.4
. * Cluster robust inference for OLS cross-section application using
. * Vietnam Living Standard Survey data
.
. * (0) Descriptive Statistics (Table 24.3 first half)
. * (1) Linear regression (in logs) with household data (Table 24.4)
.
. * For Tables 24.5-6 for clustered count data see MMA24P2POISCLUSTER.DO
.
. * The cluster effects model is
. * y_it = x_it'b + a_i + e_it
. * Default xtreg output assumes e_it is iid.
. * This is usually too strong an assumption.
. * Instead should get cluster-robust errors after xtreg
. * See Section 21.2.3 pages 709-12
. * Stata Version 8 does not do this but Stata version 9 does.
. * Here we do a panel bootstrap - results not reported in the text
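In releases where xtreg accepts a `vce()` option, the cluster-robust standard errors mentioned above can be requested directly instead of through a bootstrap. The following is a minimal sketch on simulated data, not the Vietnam sample; the variable names, the data-generating process, and the use of `rnormal()` and `xtset` are assumptions that require a release newer than the Stata 8 used in this log.

```stata
* Simulated clustered data (hypothetical): 100 clusters of 10 observations
clear
set seed 12345
set obs 1000
generate id = ceil(_n/10)
generate x  = rnormal()
* Cluster effect a_i: draw for every row, then copy the first draw within each cluster
generate a = rnormal()
bysort id: replace a = a[1]
generate y = 1 + 0.5*x + a + rnormal()

xtset id
* Default standard errors assume e_it is iid
xtreg y x, re
* Cluster-robust standard errors allow arbitrary correlation within id
xtreg y x, re vce(cluster id)
xtreg y x, fe vce(cluster id)
```

Comparing the default and cluster-robust standard errors on the same fit gives a quick check of how restrictive the iid assumption on e_it is for a given dataset.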
.
. * To speed up programs reduce breps - the number of bootstrap reps
.
. * To run this program you need data set
. * vietnam_ex1.dta
.
. ********** SETUP **********
.
. set more off
. version 8.0
. set scheme s1mono /* Used for graphs */
.
. ********** DATA DESCRIPTION **********
.
. * The data comes from World Bank 1997 Vietnam Living Standards Survey
. * A subset was used in chapter 4.6.4.
. * The larger sample here is described on pages 848-9
.
. * The data are HOUSEHOLD data
. * There are N=5006 households in 194 clusters
.
. * The separate data set vietnam_ex2.dta has household-level data
.
. ********** READ IN HOUSEHOLD DATA and SUMMARIZE (Table 24.3) **********
.
. use vietnam_ex1.dta
. desc
Contains data from vietnam_ex1.dta
  obs:         5,999
 vars:             8                          11 Apr 2005 12:39
 size:       185,969 (98.2% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
sex             byte   %8.0g                  Gender of HH.head (1:M;2:F)
age             int    %8.0g                  Age of household head
comped98        float  %9.0g       diploma    completed diploma HH.head
farm            float  %9.0g       loaiho     Type of HH (1:farm; 0:nonfarm)
hhsize          long   %12.0g                 Household size
commune         float  %9.0g                  commune code PSU-SVY commands
lhhexp1         float  %9.0g
lhhex12m        float  %9.0g
-------------------------------------------------------------------------------
Sorted by:
. sum
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         sex |      5999    1.270712    .4443645          1          2
         age |      5999    48.01284     13.7702         16         95
    comped98 |      5999    3.385564    2.037543          0          9
        farm |      5999    .5730955    .4946694          0          1
      hhsize |      5999    4.752292    1.954292          1         19
-------------+--------------------------------------------------------
     commune |      5999    98.26588    56.00461          1        194
     lhhexp1 |      5999    9.341561    .6877458   6.543108   12.20242
    lhhex12m |      5006    6.310585    1.593083          0   12.36325
.
. rename sex SEX
. rename age AGE
. rename comped98 EDUC
                                                                          5006
------------------------------------------------------------------------------
             |               Robust
    LNEXP12M |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     LNHHEXP |   .6702328   .0425223    15.76   0.000     .5868705    .7535952
         AGE |   .0105766   .0016634     6.36   0.000     .0073157    .0138376
         SEX |    .097444   .0519606     1.88   0.061    -.0044217    .1993096
      HHSIZE |   .0289812   .0134698     2.15   0.031     .0025744     .055388
        FARM |   .1346891   .0494286     2.72   0.006     .0377873    .2315908
        EDUC |  -.0903599   .0127869    -7.07   0.000    -.1154278   -.0652919
       _cons |  -.5107135   .3812665    -1.34   0.180    -1.258163    .2367362
------------------------------------------------------------------------------
. estimates store olshet
.
. * OLS with cluster-robust standard errors (Table 24.4 column 4)
. regress LNEXP12M $XLISTLINEAR, cluster(COMMUNE)
Regression with robust standard errors                 Number of obs =    5006
                                                       F(  6,   193) =   54.91
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.0896
Number of clusters (COMMUNE) = 194                     Root MSE      =  1.5209
------------------------------------------------------------------------------
             |               Robust
    LNEXP12M |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     LNHHEXP |   .6702328   .0528536    12.68   0.000      .565988    .7744777
         AGE |   .0105766   .0019371     5.46   0.000     .0067561    .0143972
         SEX |    .097444   .0595084     1.64   0.103    -.0199263    .2148142
      HHSIZE |   .0289812   .0153602     1.89   0.061    -.0013142    .0592766
        FARM |   .1346891   .0608046     2.22   0.028     .0147622    .2546159
        EDUC |  -.0903599   .0149743    -6.03   0.000    -.1198942   -.0608255
       _cons |  -.5107135   .4706163    -1.09   0.279    -1.438925    .4174979
------------------------------------------------------------------------------
. estimates store olsclust
.
. * Random effects estimation (FGLS) (Table 24.4 columns 5-6)
. * This uses the xtreg command which first requires identifying the cluster
. iis COMMUNE
. xtreg LNEXP12M $XLISTLINEAR, re
Random-effects GLS regression                   Number of obs      =      5006
Group variable (i): COMMUNE                     Number of groups   =       194
R-sq:  within  = 0.0518                         Obs per group: min =
       between = 0.2884                                        avg =      25.8
       overall = 0.0883                                        max =        39
Random effects u_i ~ Gaussian                   Wald chi2(6)       =    335.12
corr(u_i, X)       = 0 (assumed)                Prob > chi2        =    0.0000
------------------------------------------------------------------------------
    LNEXP12M |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     LNHHEXP |   .6268899   .0468004    13.39   0.000     .5351627     .718617
         AGE |   .0112334   .0016411     6.85   0.000      .008017    .0144499
         SEX |   .1069915   .0511849     2.09   0.037     .0066709    .2073121
      HHSIZE |   .0158302   .0135166     1.17   0.242    -.0106618    .0423222
        FARM |   .0928509   .0549544     1.69   0.091    -.0148578    .2005595
        EDUC |  -.0638447   .0129744    -4.92   0.000    -.0892741   -.0384153
       _cons |  -.1660698   .4202027    -0.40   0.693     -.989652    .6575123
-------------+----------------------------------------------------------------
     sigma_u |  .46739871
     sigma_e |  1.4526468
         rho |  .09381491   (fraction of variance due to u_i)
------------------------------------------------------------------------------
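The rho line in the random-effects output above is the share of the composite error variance attributed to the commune effect u_i, that is, sigma_u^2 / (sigma_u^2 + sigma_e^2). A short arithmetic check, with the two standard deviations copied from that output (the scalar names are illustrative):

```stata
* Values copied from the random-effects output above
scalar sigma_u = .46739871
scalar sigma_e = 1.4526468

* Fraction of variance due to u_i
scalar rho = sigma_u^2 / (sigma_u^2 + sigma_e^2)
display "rho = " %9.7f scalar(rho)
```

The result should agree with the reported rho of .09381491 up to rounding of the inputs.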
                                                Number of obs      =      5006
                                                Number of groups   =       194
                                                Obs per group: min =
                                                               avg =      25.8
                                                               max =        39
                                                F(6,4806)          =     43.92
                                                Prob > F           =    0.0000
------------------------------------------------------------------------------
    LNEXP12M |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     LNHHEXP |   .6037139   .0520178    11.61   0.000     .5017352    .7056926
         AGE |   .0115845   .0016706     6.93   0.000     .0083092    .0148597
         SEX |    .112821   .0520014     2.17   0.030     .0108745    .2147675
      HHSIZE |   .0107124   .0141127     0.76   0.448     -.016955    .0383797
        FARM |   .0693037   .0609002     1.14   0.255    -.0500885    .1886959
        EDUC |  -.0510325   .0135817    -3.76   0.000    -.0776588   -.0244062
       _cons |   .0361552    .461482     0.08   0.938    -.8685606    .9408711
-------------+----------------------------------------------------------------
     sigma_u |  .57732514
     sigma_e |  1.4526468
         rho |  .13640519   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(193, 4806) =     3.49           Prob > F = 0.0000
. estimates store fe
.
. * Note that can cluster bootstrap if desired to get more robust standard errors
. * This is done at end of program
.
. * Random effects estimation by MLE assuming normality (Table 24.4 columns 5-6)
. * This uses the xtreg command which first requires identifying the cluster
. iis COMMUNE
. xtreg LNEXP12M $XLISTLINEAR, mle
Fitting constant-only model:
Iteration 0: log likelihood = -9262.6182
Iteration 1: log likelihood = -9252.6974
                                                Number of obs      =      5006
                                                Number of groups   =       194
                                                Obs per group: min =
                                                               avg =      25.8
                                                               max =        39
                                                LR chi2(6)         =    319.19
Log likelihood  = -9092.5546                    Prob > chi2        =    0.0000
------------------------------------------------------------------------------
    LNEXP12M |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     LNHHEXP |   .6276456   .0467072    13.44   0.000      .536101    .7191901
         AGE |     .01122   .0016406     6.84   0.000     .0080045    .0144354
         SEX |   .1067788   .0511618     2.09   0.037     .0065035     .207054
      HHSIZE |     .01603   .0135121     1.19   0.235    -.0104533    .0425133
        FARM |   .0936529   .0548379     1.71   0.088    -.0138274    .2011332
        EDUC |  -.0643046   .0130222    -4.94   0.000    -.0898277   -.0387816
       _cons |  -.1718111   .4192856    -0.41   0.682    -.9935959    .6499737
-------------+----------------------------------------------------------------
    /sigma_u |    .455472   .0329742    13.81   0.000     .3908438    .5201002
    /sigma_e |   1.452303   .0148092    98.07   0.000     1.423278    1.481329
-------------+----------------------------------------------------------------
         rho |   .0895499   .0120221                      .0682208    .1154799
------------------------------------------------------------------------------
Likelihood-ratio test of sigma_u=0: chibar2(01)=  212.57 Prob>=chibar2 = 0.000
. estimates store remle
.
. * Test of the RE specification using Breusch-Pagan test
. * This is statistic in third bottom row of Table 24.4
. quietly xtreg LNEXP12M $XLISTLINEAR, re
. xttest0
Breusch and Pagan Lagrangian multiplier test for random effects:
        LNEXP12M[COMMUNE,t] = Xb + u[COMMUNE] + e[COMMUNE,t]
        Estimated results:
                         |       Var     sd = sqrt(Var)
                ---------+-----------------------------
                LNEXP12M |   2.537914       1.593083
                       e |   2.110183       1.452647
                       u |   .2184615       .4673987
        Test:   Var(u) = 0
                              chi2(1) =   432.75
                          Prob > chi2 =     0.0000
.
. * Hausman test of FE vs. RE specification
. * This test is not a robust version.
. * Its validity assumes that errors are iid after including COMMUNE-specific effect
. * For this example this may be reasonable as cluster bootstrap se's close to usual se's
. xthausman
(Warning: xthausman is no longer a supported command; use -hausman-. For instructions, see help
hausman.)
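As the warning says, the supported route is the `hausman` command applied to stored estimates. A minimal sketch on simulated data follows; the data-generating process and variable names are hypothetical, and `rnormal()` and `xtset` assume a release newer than the Stata 8 used in this log.

```stata
* Simulated panel (hypothetical): 200 units, 5 periods
clear
set seed 2005
set obs 1000
generate id = ceil(_n/5)
generate a = rnormal()
bysort id: replace a = a[1]          // unit effect, constant within id
generate x = rnormal() + 0.5*a       // regressor correlated with the unit effect
generate y = 1 + 0.5*x + a + rnormal()

xtset id
xtreg y x, fe
estimates store fe
xtreg y x, re
estimates store re

* Consistent estimator (FE) first, efficient estimator (RE) second
hausman fe re, sigmamore
```

The `sigmamore` option bases both covariance matrices on the same error-variance estimate, which helps keep the variance difference positive definite in finite samples. Like xthausman, this version of the test is not robust to within-cluster correlation.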
                                                Number of obs      =      5006
Group variable:                    COMMUNE      Number of groups   =       194
Link:                             identity      Obs per group: min =         1
Family:                           Gaussian                     avg =      25.8
Correlation:                  exchangeable                     max =        39
                                                Wald chi2(6)       =    338.97
Scale parameter:                  2.314413      Prob > chi2        =    0.0000
------------------------------------------------------------------------------
    LNEXP12M |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     LNHHEXP |   .6281447   .0466076    13.48   0.000     .5367955     .719494
         AGE |   .0112111   .0016411     6.83   0.000     .0079946    .0144275
         SEX |   .1066389   .0511914     2.08   0.037     .0063056    .2069722
      HHSIZE |   .0161625    .013502     1.20   0.231    -.0103009    .0426259
        FARM |   .0941811   .0547349     1.72   0.085    -.0130973    .2014594
        EDUC |  -.0646085   .0129528    -4.99   0.000    -.0899956   -.0392215
       _cons |  -.1756087   .4185566    -0.42   0.675    -.9959645    .6447472
------------------------------------------------------------------------------
. estimates store pa
.
. ********** DISPLAY TABLE 24.4 RESULTS page 851 **********
.
. estimates table olsiid olshet olsclust, /*
> */ b(%10.3f) t(%10.2f) stats(r2 N)
-----------------------------------------------------
    Variable |     olsiid       olshet     olsclust
-------------+---------------------------------------
     LNHHEXP |      0.670        0.670        0.670
             |      16.01        15.76        12.68
         AGE |      0.011        0.011        0.011
             |       6.39         6.36         5.46
         SEX |      0.097        0.097        0.097
             |       1.88         1.88         1.64
      HHSIZE |      0.029        0.029        0.029
             |       2.19         2.15         1.89
        FARM |      0.135        0.135        0.135
             |       2.73         2.72         2.22
        EDUC |     -0.090       -0.090       -0.090
             |      -7.36        -7.07        -6.03
       _cons |     -0.511       -0.511       -0.511
             |      -1.34        -1.34        -1.09
-------------+---------------------------------------
          r2 |      0.090        0.090        0.090
           N |   5006.000     5006.000     5006.000
-----------------------------------------------------
                                          legend: b/t
. estimates table pa fe refgls remle, /*
>
------------------------------------------------------------------
    Variable |         pa           fe       refgls        remle
-------------+----------------------------------------------------
_            |
     LNHHEXP |      0.628        0.604        0.627
             |      13.48        11.61        13.39
         AGE |      0.011        0.012        0.011
             |       6.83         6.93         6.85
         SEX |      0.107        0.113        0.107
             |       2.08         2.17         2.09
      HHSIZE |      0.016        0.011        0.016
             |       1.20         0.76         1.17
        FARM |      0.094        0.069        0.093
             |       1.72         1.14         1.69
        EDUC |     -0.065       -0.051       -0.064
             |      -4.99        -3.76        -4.92
       _cons |     -0.176        0.036       -0.166
             |      -0.42         0.08        -0.40
-------------+----------------------------------------------------
LNEXP12M     |
     LNHHEXP |                                             0.628
             |                                             13.44
         AGE |                                             0.011
             |                                              6.84
         SEX |                                             0.107
             |                                              2.09
      HHSIZE |                                             0.016
             |                                              1.19
        FARM |                                             0.094
             |                                              1.71
        EDUC |                                            -0.064
             |                                             -4.94
       _cons |                                            -0.172
             |                                             -0.41
-------------+----------------------------------------------------
sigma_u      |
       _cons |                                             0.455
             |                                             13.81
-------------+----------------------------------------------------
sigma_e      |
       _cons |                                             1.452
             |                                             98.07
-------------+----------------------------------------------------
Statistics   |
          r2 |                   0.052
           N |   5006.000     5006.000     5006.000     5006.000
------------------------------------------------------------------
                                                       legend: b/t
581
.
. ********** ADDITIONALLY DO CLUSTER BOOTSTRAPS **********
.
. * These results not given in the text
.
. global breps = 500
.
. * Note that can bootstrap if desired to get more robust standard errors
. * The first reproduces reg , cluster(COMMUNE)
. bootstrap "reg LNEXP12M $XLISTLINEAR" _b, cluster(COMMUNE) reps($breps) level(95)
command:      reg LNEXP12M LNHHEXP AGE SEX HHSIZE FARM EDUC
statistics:   b_LNHHEXP  = _b[LNHHEXP]
              b_AGE      = _b[AGE]
              b_SEX      = _b[SEX]
              b_HHSIZE   = _b[HHSIZE]
              b_FARM     = _b[FARM]
              b_EDUC     = _b[EDUC]
              b_cons     = _b[_cons]

Bootstrap statistics                              Number of obs    =      5006
                                                  N of clusters    =       194
                                                  Replications     =       500
P = percentile
BC = bias-corrected
. * The t-statistic vector is e(b)./e(se) where ./ is elt. by elt. division
. * But Stata Version 8 does not do ./ so instead need the following
. matrix tols = (vecdiag(diag(e(b))*syminv(diag(e(se)))))'
. matrix list tols, format(%10.2f)
tols[7,1]
r1
b_LNHHEXP 12.26
b_AGE 5.41
b_SEX 1.62
b_HHSIZE 1.81
b_FARM 2.40
b_EDUC -6.03
b_cons -1.04
.
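The workaround above multiplies diag(b) by the inverse of diag(se), so the diagonal of the product holds b_j/se_j. As a hedged sketch that is not part of the original program: in Stata 9 or later, Mata provides the element-wise operator `:/`, so the same t-statistics could be formed directly. The matrix name `tols_mata` is hypothetical, and the sketch assumes it is run immediately after a `bootstrap` command so that `e(b)` and `e(se)` are in memory.

```stata
* Sketch only: assumes Stata 9+ (Mata) and that bootstrap has just been run,
* leaving the row vectors e(b) and e(se) in memory.
* tols_mata is a hypothetical name, not used in the original program.
mata: st_matrix("tols_mata", (st_matrix("e(b)") :/ st_matrix("e(se)"))')
matrix list tols_mata, format(%10.2f)
```

Row names are not carried over by this sketch, so the rows of `tols_mata` follow the column order of `e(b)`.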
. * The next two reproduce xtreg , cluster(COMMUNE)
. * but the cluster option for xtreg is not available for Stata version 8
.
. * For this example the cluster bootstrap se's are within 10 percent
. * of the usual xtreg se's, so usual se's may be okay here
.
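For readers on a later Stata release, the cluster bootstrap below may not be needed. The following is a minimal sketch, assuming a version in which `xtreg` accepts `vce(cluster ...)` and `xtset` is available (Stata 10 or later), and assuming COMMUNE is the panel identifier as in the rest of this program. It was not run for this log.

```stata
* Sketch only: assumes a later Stata release than the version 8 used in this log.
* COMMUNE is taken to be the panel (cluster) identifier.
xtset COMMUNE
xtreg LNEXP12M $XLISTLINEAR, fe vce(cluster COMMUNE)
xtreg LNEXP12M $XLISTLINEAR, re vce(cluster COMMUNE)
```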
. * Fixed effects estimator
. bootstrap "xtreg LNEXP12M $XLISTLINEAR, fe" _b, cluster(COMMUNE) reps($breps) level(95)
command:      xtreg LNEXP12M LNHHEXP AGE SEX HHSIZE FARM EDUC , fe
statistics:   b_LNHHEXP  = _b[LNHHEXP]
              b_AGE      = _b[AGE]
              b_SEX      = _b[SEX]
              b_HHSIZE   = _b[HHSIZE]
              b_FARM     = _b[FARM]
              b_EDUC     = _b[EDUC]
              b_cons     = _b[_cons]

Bootstrap statistics                              Number of obs    =      5006
                                                  N of clusters    =       194
                                                  Replications     =       500
             |                                         .0084701   .0152766  (BC)
       b_SEX |   500    .112821  -.0017372   .0546362  .0054756   .2201664   (N)
             |                                         .0129603   .2214846   (P)
             |                                          .017047    .235448  (BC)
    b_HHSIZE |   500   .0107124  -.0004379   .0150286 -.0188148   .0402395   (N)
             |                                        -.0195233   .0415316   (P)
             |                                        -.0184428    .044119  (BC)
      b_FARM |   500   .0693037  -.0010067   .0497627 -.0284666    .167074   (N)
             |                                        -.0291446   .1679352   (P)
             |                                        -.0259051   .1705921  (BC)
      b_EDUC |   500  -.0510325   .0003307   .0153224  -.081137   -.020928   (N)
             |                                        -.0818133  -.0219096   (P)
             |                                        -.0844261  -.0230367  (BC)
      b_cons |   500   .0361552   .0087515   .5186644 -.9828799    1.05519   (N)
             |                                         -.934128   1.087458   (P)
             |                                         -.934128   1.087458  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
. matrix tfe = (vecdiag(diag(e(b))*syminv(diag(e(se)))))'
. matrix list tfe, format(%10.2f)
tfe[7,1]
r1
b_LNHHEXP 10.35
b_AGE 6.63
b_SEX 2.06
b_HHSIZE 0.71
b_FARM 1.39
b_EDUC -3.33
b_cons 0.07
.
. * Random effects estimator
. bootstrap "xtreg LNEXP12M $XLISTLINEAR, re" _b, cluster(COMMUNE) reps($breps) level(95)
command:      xtreg LNEXP12M LNHHEXP AGE SEX HHSIZE FARM EDUC , re
statistics:   b_LNHHEXP  = _b[LNHHEXP]
              b_AGE      = _b[AGE]
              b_SEX      = _b[SEX]
              b_HHSIZE   = _b[HHSIZE]
              b_FARM     = _b[FARM]
              b_EDUC     = _b[EDUC]
              b_cons     = _b[_cons]

Bootstrap statistics                              Number of obs    =      5006
                                                  N of clusters    =       194
                                                  Replications     =       500
------------------------------------------------------------------------------
       log:  c:\Imbook\bwebpage\Section6\mma24p2poiscluster.txt
log type: text
opened on: 24 May 2005, 16:35:22
.
. ********** OVERVIEW OF MMA24P2POISCLUSTER.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 24.7 pages 848-53 Table 24.6
. * Cluster robust inference for Poisson cross-section application using
. * Vietnam Living Standard Survey data
.
. * (0) Descriptive Statistics (Table 24.3 second half)
. * (1) Frequencies of data (Table 24.5)
. * (2) Poisson regression with individual-level data (Table 24.6)
.
. * The results differ in second significant digit from those in text
. * despite same sample size. Not sure why.
.
. * For Table 24.4 for clustered household data see MMA24P1OLSCLUSTER.DO
.
. * The Poisson cluster effects model is
. * y_it ~ Poisson(x_it'b + a_i)
. * Default xtpois output assumes Poisson distribution - var = mean.
. * This is usually too strong an assumption.
. * Instead should get cluster-robust errors after xtpois
. * See Section 21.2.3 pages 709-12 and section 23.26 pages 788-9
. * Stata Version 8 does not do this.
. * Here we do a panel bootstrap - results not reported in the text
.
. * To speed up programs reduce breps - the number of bootstrap reps
. * This program takes a long time if bootstrap
.
. * To run this program you need data set
. * vietnam_ex2.dta
.
. ********** SETUP **********
.
. set more off
. version 8.0
. set scheme s1mono /* Used for graphs */
.
. ********** DATA DESCRIPTION **********
.
. * The data comes from World Bank 1997 Vietnam Living Standards Survey
. * A subset was used in chapter 4.6.4.
. * The larger sample here is described on pages 848-9
.
. * The data are INDIVIDUAL-level data
. * There are N=27765 individuals in 194 clusters (communes)
.
. * The separate data set vietnam_ex1.dta has household-level data
.
. ********** READ IN INDIVIDUAL-LEVEL DATA and SUMMARIZE (Table 24.3) **********
.
. use vietnam_ex2.dta, clear
. desc
Contains data from vietnam_ex2.dta
  obs:        27,766
 vars:            12                          11 Apr 2005 12:33
 size:     1,443,832 (85.9% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
COMPED98        float  %9.0g
SEX             float  %9.0g
AGE             float  %9.0g
MARRIED         float  %9.0g
ILLDUM          float  %9.0g
INJDUM          float  %9.0g
ILLDAYS         float  %9.0g
ACTDAYS         float  %9.0g
PHARVIS         float  %9.0g
HLTHINS         float  %9.0g
lnhhinc         float  %9.0g
commune         float  %9.0g
-------------------------------------------------------------------------------
Sorted by:
. sum
    Variable |     Obs        Mean   Std. Dev.       Min        Max
-------------+-----------------------------------------------------
    COMPED98 |   27765    3.390672    1.93115          0         11
         SEX |   27765    .5111471   .4998847          0          1
         AGE |   27765    2.977504   .9671446          0    4.59512
     MARRIED |   27765    .3988835   .4896775          0          1
      ILLDUM |   27765    .6219701   .8995068          0          9
      Percentiles      Smallest
 1%     1.302267       .0467014
 5%     1.658267       .1111674
10%     1.875315       .3755146       Obs               27765
25%     2.188848       .4177101       Sum of Wgt.       27765

50%     2.534935                      Mean            2.60261
                        Largest       Std. Dev.      .6244145
75%     2.962732       5.405502
90%     3.458658       5.405502       Variance       .3898934
95%     3.737957       5.405502       Skewness       .4925002
99%     4.295394       5.405502       Kurtosis       3.583693
.
. * Following gives Table 24.5 (page 852) frequencies
. * These differ in some places from Table 24.5 - especially for number = 0
. tabulate PHARVIS
    PHARVIS |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |     20,668       74.44       74.44
          1 |      3,829       13.79       88.23
          2 |      1,716        6.18       94.41
          3 |        777        2.80       97.21
          4 |        359        1.29       98.50
          5 |        174        0.63       99.13
          6 |         64        0.23       99.36
          7 |         43        0.15       99.51
          8 |         16        0.06       99.57
          9 |          4        0.01       99.59
         10 |         78        0.28       99.87
         11 |          1        0.00       99.87
         12 |          5        0.02       99.89
         13 |          1        0.00       99.89
         14 |          3        0.01       99.90
         15 |          9        0.03       99.94
         16 |          1        0.00       99.94
         20 |          8        0.03       99.97
         22 |          2        0.01       99.97
         27 |          1        0.00       99.98
         28 |          3        0.01       99.99
         30 |          3        0.01      100.00
------------+-----------------------------------
      Total |     27,765      100.00
.
. * Histogram with kernel density estimate
. hist PHARVIS, discrete kdensity
(start=0, width=1)
.
. * Write data to a text (ascii) file so can use with programs other than Stata
. outfile PHARVIS LNHHEXP AGE SEX MARRIED EDUC ILLNESS INJURY ILLDAYS /*
> */ ACTDAYS INSURANCE COMMUNE using vietnam_ex2.asc, replace
.
. ********** ANALYSIS: CLUSTER ANALYSIS FOR POISSON MODEL [Table 24.6 p.851] *********
.
. * Regressor list for the Poisson regressions
. global XLISTPOISSON LNHHEXP INSURANCE SEX AGE MARRIED ILLDAYS ACTDAYS
INJURY ILLNESS EDUC
.
. * Poisson with usual standard errors (Table 24.6 columns 1-2)
. poisson PHARVIS $XLISTPOISSON
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Poisson regression                                Number of obs   =      27765
                                                  LR chi2(10)     =   13226.50
                                                  Prob > chi2     =     0.0000
Log likelihood = -25281.786                       Pseudo R2       =     0.2073
------------------------------------------------------------------------------
     PHARVIS |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     LNHHEXP |    .078686   .0138419     5.68   0.000     .0515564    .1058156
   INSURANCE |  -.2485716   .0259704    -9.57   0.000    -.2994727   -.1976706
         SEX |   .0851733   .0171697     4.96   0.000     .0515213    .1188253
         AGE |   .0252426   .0106126     2.38   0.017     .0044423    .0460429
     MARRIED |   .1239639   .0209267     5.92   0.000     .0829483    .1649795
     ILLDAYS |   .0429083   .0010728    40.00   0.000     .0408057    .0450109
     ACTDAYS |   .0089793   .0052409     1.71   0.087    -.0012927    .0192514
      INJURY |   .1717029   .0747292     2.30   0.022     .0252364    .3181694
     ILLNESS |   .5623976   .0064536    87.15   0.000     .5497488    .5750464
        EDUC |  -.0524459   .0048173   -10.89   0.000    -.0618878   -.0430041
       _cons |  -1.640821   .0458542   -35.78   0.000    -1.730694   -1.550949
------------------------------------------------------------------------------
. estimates store poisiid
.
. * Poisson with heteroskedastic-robust standard errors (Table 24.6 column 3)
. poisson PHARVIS $XLISTPOISSON, robust
Iteration 0: log pseudo-likelihood = -26309.924
Iteration 1: log pseudo-likelihood = -25300.337
                                                  Number of obs   =      27765
                                                  Wald chi2(10)   =    2423.07
                                                  Prob > chi2     =     0.0000
Log pseudo-likelihood = -25281.786                Pseudo R2       =     0.2073
------------------------------------------------------------------------------
             |               Robust
     PHARVIS |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     LNHHEXP |    .078686   .0255091     3.08   0.002     .0286891    .1286829
   INSURANCE |  -.2485716   .0437892    -5.68   0.000    -.3343969   -.1627464
         SEX |   .0851733    .030907     2.76   0.006     .0245967    .1457499
         AGE |   .0252426   .0198448     1.27   0.203    -.0136526    .0641377
     MARRIED |   .1239639   .0419107     2.96   0.003     .0418205    .2061073
     ILLDAYS |   .0429083   .0028779    14.91   0.000     .0372678    .0485488
     ACTDAYS |   .0089793   .0207444     0.43   0.665     -.031679    .0496377
      INJURY |   .1717029   .2043534     0.84   0.401    -.2288224    .5722282
     ILLNESS |   .5623976   .0228635    24.60   0.000      .517586    .6072092
        EDUC |  -.0524459   .0081043    -6.47   0.000    -.0683301   -.0365618
       _cons |  -1.640821   .0872497   -18.81   0.000    -1.811828   -1.469815
------------------------------------------------------------------------------
. estimates store poishet
.
. * Poisson with cluster-robust standard errors (Table 24.6 column 4)
. poisson PHARVIS $XLISTPOISSON, cluster(COMMUNE)
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Poisson regression                                Number of obs   =      27765
                                                  Wald chi2(10)   =    1295.38
Log pseudo-likelihood = -25281.786                Prob > chi2     =     0.0000
                         (standard errors adjusted for clustering on COMMUNE)
------------------------------------------------------------------------------
             |               Robust
     PHARVIS |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     LNHHEXP |    .078686   .0472052     1.67   0.096    -.0138344    .1712065
   INSURANCE |  -.2485716   .0617873    -4.02   0.000    -.3696725   -.1274708
         SEX |   .0851733   .0327427     2.60   0.009     .0209988    .1493478
         AGE |   .0252426   .0262626     0.96   0.336    -.0262311    .0767163
                                                Number of obs      =     27765
                                                Number of groups   =       194
                                                Obs per group: min =        51
                                                               avg =     143.1
                                                               max =       206
                                                Wald chi2(10)      =  13723.01
Log likelihood  = -23419.132                    Prob > chi2        =    0.0000
------------------------------------------------------------------------------
     PHARVIS |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     LNHHEXP |  -.1013746   .0187549    -5.41   0.000    -.1381336   -.0646157
   INSURANCE |  -.1675953   .0273642    -6.12   0.000    -.2212283   -.1139624
         SEX |    .099303   .0172541     5.76   0.000     .0654855    .1331206
         AGE |   .0047406   .0107899     0.44   0.660    -.0164073    .0258884
                                                Number of obs      =     27765
                                                Number of groups   =       194
                                                Obs per group: min =        51
                                                               avg =     143.1
                                                               max =       206
                                                Wald chi2(10)      =  13723.01
Log likelihood  = -23419.132                    Prob > chi2        =    0.0000
------------------------------------------------------------------------------
     PHARVIS |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     LNHHEXP |  -.1013746   .0187549    -5.41   0.000    -.1381336   -.0646157
   INSURANCE |  -.1675953   .0273642    -6.12   0.000    -.2212283   -.1139624
27671
193
51
                                                Wald chi2(10)      =  13621.76
                                                Prob > chi2        =    0.0000
------------------------------------------------------------------------------
     PHARVIS |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     LNHHEXP |  -.1146402    .019025    -6.03   0.000    -.1519285   -.0773519
   INSURANCE |   -.163603   .0274193    -5.97   0.000    -.2173438   -.1098622
         SEX |   .0997415   .0172564     5.78   0.000     .0659195    .1335635
         AGE |   .0033591   .0107945     0.31   0.756    -.0177977     .024516
     MARRIED |   .1606792   .0212958     7.55   0.000     .1189403    .2024182
     ILLDAYS |    .046148   .0011453    40.29   0.000     .0439032    .0483929
     ACTDAYS |   .0189184   .0054666     3.46   0.001      .008204    .0296328
>
---------------------------------------
    Variable |   poisre       poisfe
-------------+-------------------------
PHARVIS      |
     LNHHEXP |     -0.101      -0.115
             |      -5.41       -6.03
   INSURANCE |     -0.168      -0.164
             |      -6.12       -5.97
         SEX |      0.099       0.100
             |       5.76        5.78
         AGE |      0.005       0.003
             |       0.44        0.31
     MARRIED |      0.158       0.161
             |       7.42        7.55
     ILLDAYS |      0.046       0.046
             |      40.32       40.29
     ACTDAYS |      0.019       0.019
             |       3.41        3.46
      INJURY |      0.148       0.148
             |       1.89        1.89
     ILLNESS |      0.580       0.580
             |      75.49       75.09
        EDUC |     -0.028      -0.027
             |      -5.10       -4.84
       _cons |     -1.277
             |     -17.66
-------------+-------------------------
lnalpha      |
       _cons |     -1.040
             |     -10.04
-------------+-------------------------
Statistics   |
          r2 |
           N |  27765.000   27671.000
---------------------------------------
                            legend: b/t
.
. ********** ADDITIONALLY DO CLUSTER BOOTSTRAPS **********
.
. * These results not given in the text
.
. * Output at website uses breps 500
. global breps 50
.
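The bootstrap calls that follow use the Stata 8 quoted-command syntax and set no seed, so the replications differ from run to run. A hedged sketch of the same cluster bootstrap in the prefix syntax of Stata 9 and later is given here; the seed value 10101 is an arbitrary, hypothetical choice and is not taken from the original program.

```stata
* Sketch only: Stata 9+ prefix syntax, to be run from a do-file (/// continues the line).
* seed(10101) is an arbitrary, hypothetical value that makes the draws reproducible.
bootstrap _b, cluster(COMMUNE) reps($breps) seed(10101) level(95): ///
    poisson PHARVIS $XLISTPOISSON
```

Resampling whole communes, as `cluster(COMMUNE)` does, keeps the within-commune dependence intact in every replication.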
. * Note that can bootstrap if desired to get more robust standard errors
. * The first reproduces pois , cluster(COMMUNE)
. bootstrap "poisson PHARVIS $XLISTPOISSON" _b, cluster(COMMUNE) reps($breps) level(95)
command:      poisson PHARVIS LNHHEXP INSURANCE SEX AGE MARRIED ILLDAYS ACTDAYS INJURY ILLNESS EDUC
statistics:   b_LNHHEXP  = [PHARVIS]_b[LNHHEXP]
              b_INSURA~E = [PHARVIS]_b[INSURANCE]
              b_SEX      = [PHARVIS]_b[SEX]
              b_AGE      = [PHARVIS]_b[AGE]
              b_MARRIED  = [PHARVIS]_b[MARRIED]
              b_ILLDAYS  = [PHARVIS]_b[ILLDAYS]
              b_ACTDAYS  = [PHARVIS]_b[ACTDAYS]
              b_INJURY   = [PHARVIS]_b[INJURY]
              b_ILLNESS  = [PHARVIS]_b[ILLNESS]
              b_EDUC     = [PHARVIS]_b[EDUC]
              b_cons     = [PHARVIS]_b[_cons]

Bootstrap statistics                              Number of obs    =     27765
                                                  N of clusters    =       194
                                                  Replications     =        50
             |                                        -.0850821  -.0256777  (BC)
      b_cons |    50  -1.640821  -.0414073   .1460702  -1.93436  -1.347282   (N)
             |                                        -1.984352  -1.399226   (P)
             |                                        -1.867373  -1.310915  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
. * The t-statistic vector is e(b)./e(se) where ./ is elt. by elt. division
. * But Stata Version 8 does not do ./ so instead need the following
. matrix tpois = (vecdiag(diag(e(b))*syminv(diag(e(se)))))'
. matrix list tpois, format(%10.2f)
tpois[11,1]
r1
b_LNHHEXP 1.66
b_INSURANCE -3.23
b_SEX 2.46
b_AGE 0.93
b_MARRIED 3.05
b_ILLDAYS 12.62
b_ACTDAYS 0.36
b_INJURY 0.82
b_ILLNESS 19.08
b_EDUC -3.28
b_cons -11.23
.
. * The next two reproduce xtpois , cluster(COMMUNE)
. * but xtpois has no cluster option so instead cluster bootstrap
.
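The limitation noted above applies to the Stata version 8 used for this log. As a hedged sketch only, and assuming a more recent Stata release in which `xtpoisson` with the `fe` option accepts `vce(robust)`, cluster-robust standard errors for the fixed effects Poisson could be requested directly. The sketch also assumes COMMUNE is declared as the panel identifier; it was not run for this log.

```stata
* Sketch only: assumes a recent Stata release; not available in Stata 8.
* COMMUNE is taken to be the panel (cluster) identifier.
xtset COMMUNE
xtpoisson PHARVIS $XLISTPOISSON, fe vce(robust)
```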
. * Fixed effects estimator
. bootstrap "xtpois PHARVIS $XLISTPOISSON, fe" _b, cluster(COMMUNE) reps($breps) level(95)
command:      xtpois PHARVIS LNHHEXP INSURANCE SEX AGE MARRIED ILLDAYS ACTDAYS INJURY ILLNESS EDUC , fe
statistics:   b_LNHHEXP  = [PHARVIS]_b[LNHHEXP]
              b_INSURA~E = [PHARVIS]_b[INSURANCE]
              b_SEX      = [PHARVIS]_b[SEX]
              b_AGE      = [PHARVIS]_b[AGE]
              b_MARRIED  = [PHARVIS]_b[MARRIED]
              b_ILLDAYS  = [PHARVIS]_b[ILLDAYS]
              b_ACTDAYS  = [PHARVIS]_b[ACTDAYS]
              b_INJURY   = [PHARVIS]_b[INJURY]
              b_ILLNESS  = [PHARVIS]_b[ILLNESS]
              b_EDUC     = [PHARVIS]_b[EDUC]

Bootstrap statistics                              Number of obs    =     27671
                                                  N of clusters    =       193
                                                  Replications     =        50
b_AGE 0.15
b_MARRIED 3.69
b_ILLDAYS 16.54
b_ACTDAYS 1.07
b_INJURY 0.67
b_ILLNESS 29.14
b_EDUC -2.41
.
. * Random effects estimator
. bootstrap "xtpois PHARVIS $XLISTPOISSON, re" _b, cluster(COMMUNE) reps($breps) level(95)
command:      xtpois PHARVIS LNHHEXP INSURANCE SEX AGE MARRIED ILLDAYS ACTDAYS INJURY ILLNESS EDUC , re
statistics:   b_LNHHEXP  = [PHARVIS]_b[LNHHEXP]
              b_INSURA~E = [PHARVIS]_b[INSURANCE]
              b_SEX      = [PHARVIS]_b[SEX]
              b_AGE      = [PHARVIS]_b[AGE]
              b_MARRIED  = [PHARVIS]_b[MARRIED]
              b_ILLDAYS  = [PHARVIS]_b[ILLDAYS]
              b_ACTDAYS  = [PHARVIS]_b[ACTDAYS]
              b_INJURY   = [PHARVIS]_b[INJURY]
              b_ILLNESS  = [PHARVIS]_b[ILLNESS]
              b_EDUC     = [PHARVIS]_b[EDUC]
              b_cons     = [PHARVIS]_b[_cons]
              b_1cons    = [lnalpha]_b[_cons]

Bootstrap statistics                              Number of obs    =     27765
                                                  N of clusters    =       194
                                                  Replications     =        50
------------------------------------------------------------------------------
       log:  c:\Imbook\bwebpage\Section6\mma25p1treatment.txt
log type: text
opened on: 26 May 2005, 10:26:17
.
. ********** OVERVIEW OF MMA25P1TREATMENT.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 25.8.1-25.8.4 pages 889-893 Tables 25.3-25.4 and Fig. 25.3
. * Evaluating treatment effect of training on Earnings
. * using Dehejia-Wahba data (originally Lalonde data)
.
. * (0) Summarize data for treatments and controls (Table 25.3)
. * (1) Calculate the treatment effect by simple methods (Table 25.4)
. * To replicate some results in DW 1999
. * (1A) treatment-control
. * (1B) control function
. * (1C) before-after comparison
. * (1D) differences-in-differences
. * (2) Calculate treatment effect by propensity score (matching by strata)
. * Last entry in Table 25.4 and Figure 25.3.
.
. * The program MMA25P2MATCHING.DO uses propensity scores with matching
. * methods more sophisticated than those used in MMA25P1TREATMENT.DO
.
. * To run this program you need file
. * nswpsid.da1
.
. ********** STATA SETUP **********
.
. set more off
. version 8
. set scheme s1mono /* Used for graphs */
.
. ********** DATA DESCRIPTION **********
.
. * Data set nswpsid.da1 is data set nswpsid.da1 from Guido Imbens
. * http://emlab.berkeley.edu/users/imbens/index.shtml
.
. * Data originally from DW99
. * R.H. Dehejia and S. Wahba (1999)
. * "Causal Effects in Nonexperimental Studies: reevaluating the
-------------+--------------------------------------------------------
        RE78 |    2675    20502.38    15632.52          0     121174
       TREAT |    2675    .0691589    .2537716          0          1
       AGESQ |    2675     1281.61    766.8415        289       3025
      EDUCSQ |    2675    153.1862    70.62231          0        289
      RE74SQ |    2675    5.21e+08    8.47e+08          0   1.88e+10
-------------+--------------------------------------------------------
      RE75SQ |    2675    5.11e+08    8.91e+08          0   2.45e+10
    U74BLACK |    2675    .0549533    .2279316          0          1
     U74HISP |    2675    .0056075    .0746868          0          1
.
. * Reproduce DW99 Table 1: RE74subset Treated and PSID-1 rows
. * Same as CT Table 25.3 page 890
. * except for changes to U74, U75 and U74BLACK
. bysort TREAT: sum AGE EDUC NODEGREE BLACK HISP MARR U74 U75 RE74 RE75
RE78 TREAT /*
> */ AGESQ EDUCSQ RE74SQ RE75SQ U74BLACK
------------------------------------------------------------------------------
-> TREAT = 0
    Variable |     Obs        Mean   Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         AGE |    2490     34.8506    10.44076         18         55
        EDUC |    2490    12.11687    3.082435          0         17
    NODEGREE |    2490    .3052209    .4605934          0          1
       BLACK |    2490    .2506024     .433447          0          1
        HISP |    2490    .0325301    .1774389          0          1
-------------+--------------------------------------------------------
        MARR |    2490    .8662651    .3404357          0          1
         U74 |    2490    .0863454    .2809298          0          1
         U75 |    2490          .1    .3000603          0          1
        RE74 |    2490    19428.75    13406.88          0     137149
        RE75 |    2490    19063.34    13596.95          0     156653
-------------+--------------------------------------------------------
        RE78 |    2490    21553.92    15555.35          0     121174
       TREAT |    2490           0           0          0          0
       AGESQ |    2490     1323.53     769.796        324       3025
      EDUCSQ |    2490    156.3161    71.43048          0        289
      RE74SQ |    2490    5.57e+08    8.66e+08          0   1.88e+10
-------------+--------------------------------------------------------
      RE75SQ |    2490    5.48e+08    9.12e+08          0   2.45e+10
    U74BLACK |    2490    .0144578    .1193923          0          1
------------------------------------------------------------------------------
-> TREAT = 1
    Variable |     Obs        Mean   Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         AGE |     185    25.81622    7.155019         17         48
        EDUC |     185    10.34595     2.01065          4         16
    NODEGREE |     185    .7081081    .4558666          0          1
       BLACK |     185    .8432432    .3645579          0          1
        HISP |     185    .0594595    .2371244          0          1
-------------+--------------------------------------------------------
        MARR |     185    .1891892    .3927217          0          1
         U74 |     185    .7081081    .4558666          0          1
         U75 |     185          .6    .4912274          0          1
        RE74 |     185    2095.574    4886.623          0    35040.1
        RE75 |     185    1532.056    3219.251          0    25142.2
-------------+--------------------------------------------------------
        RE78 |     185    6349.145    7867.405          0    60307.9
       TREAT |     185           1           0          1          1
       AGESQ |     185    717.3946    431.2517        289       2304
      EDUCSQ |     185    111.0595    39.30388         16        256
      RE74SQ |     185    2.81e+07    1.14e+08          0   1.23e+09
-------------+--------------------------------------------------------
      RE75SQ |     185    1.27e+07    5.60e+07          0   6.32e+08
    U74BLACK |     185          .6    .4912274          0          1
.
. save nswpsid, replace
file nswpsid.dta saved
.
. ********** ANALYSIS: (1) CALCULATE EFFECT OF TRAINING (Table 25.4, p.891) **********
.
. ***** (1A) TREATMENT-CONTROL COMPARISON USING POST_TREATMENT EARNINGS
. *****      [Difference in means]
.
. * DW99 Table 5 column 1 and Table 3 column 1
. regress RE78 T
      Source |       SS       df       MS              Number of obs =    2675
-------------+------------------------------           F(  1,  2673) =  173.41
       Model |  3.9811e+10     1  3.9811e+10           Prob > F      =  0.0000
    Residual |  6.1365e+11  2673   229573201           R-squared     =  0.0609
-------------+------------------------------           Adj R-squared =  0.0606
       Total |  6.5346e+11  2674   244375675           Root MSE      =   15152
------------------------------------------------------------------------------
        RE78 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       TREAT |  -15204.78   1154.614   -13.17   0.000     -17468.8   -12940.75
       _cons |   21553.92   303.6414    70.98   0.000     20958.53    22149.32
------------------------------------------------------------------------------
.
2675
------------------------------------------------------------------------------
             |               Robust
        RE78 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       TREAT |  -15204.78   655.9143   -23.18   0.000    -16490.93   -13918.63
       _cons |   21553.92    311.785    69.13   0.000     20942.56    22165.29
------------------------------------------------------------------------------
. estimates store treatcontrol
.
. ***** (1B) CONTROL FUNCTION ESTIMATOR Additionally Include pre-treatment controls
.
. * DW99 Table 5 column 2 using regressors in footnote a
. * Same as DW99 Table 2 column 14
. regress RE78 TREAT AGE AGESQ EDUC NODEGREE BLACK HISP RE74 RE75
      Source |       SS       df       MS              Number of obs =    2675
-------------+------------------------------           F(  9,  2665) =  419.22
       Model |  3.8296e+11     9  4.2551e+10           Prob > F      =  0.0000
    Residual |  2.7050e+11  2665   101500967           R-squared     =  0.5860
-------------+------------------------------           Adj R-squared =  0.5847
       Total |  6.5346e+11  2674   244375675           Root MSE      =   10075
------------------------------------------------------------------------------
        RE78 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       TREAT |   217.9438   866.1968     0.25   0.801    -1480.542     1916.43
         AGE |   158.5058   155.4065     1.02   0.308    -146.2239    463.2354
       AGESQ |  -3.232885    2.11617    -1.53   0.127    -7.382386    .9166173
        EDUC |   564.6237     103.56     5.45   0.000     361.5577    767.6898
    NODEGREE |   502.0912   647.0243     0.78   0.438    -766.6292    1770.812
       BLACK |  -699.3353   493.1811    -1.42   0.156    -1666.392    267.7211
        HISP |   2226.535    1092.71     2.04   0.042     83.88965    4369.181
        RE74 |   .2791682   .0279297    10.00   0.000     .2244021    .3339343
        RE75 |   .5680874   .0275763    20.60   0.000     .5140143    .6221605
       _cons |  -2836.703   2901.443    -0.98   0.328     -8526.01    2852.604
------------------------------------------------------------------------------
.
. * CT Table 25.4 p.891 second row uses heteroskedastic-robust standard errors
. regress RE78 TREAT AGE AGESQ EDUC NODEGREE BLACK HISP RE74 RE75, robust
2675
------------------------------------------------------------------------------
             |               Robust
        RE78 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       TREAT |   217.9438   767.8811     0.28   0.777    -1287.759    1723.647
         AGE |   158.5058   151.0305     1.05   0.294    -137.6431    454.6546
       AGESQ |  -3.232885   2.103324    -1.54   0.124    -7.357197     .891428
        EDUC |   564.6237   121.6483     4.64   0.000     326.0891    803.1583
    NODEGREE |   502.0912   632.3685     0.79   0.427    -737.8914    1742.074
       BLACK |  -699.3353   432.4582    -1.62   0.106    -1547.323    148.6523
        HISP |   2226.535    1219.08     1.83   0.068    -163.9034    4616.974
        RE74 |   .2791682   .0618802     4.51   0.000     .1578301    .4005063
        RE75 |   .5680874   .0663995     8.56   0.000     .4378876    .6982872
       _cons |  -2836.703   2937.385    -0.97   0.334    -8596.487    2923.081
------------------------------------------------------------------------------
. estimates store controlfunction
.
. * Variation that lets OLS coefficients differ across treatment and controls
. * Interaction of regressors with T
. gen TAGE = TREAT*AGE
. gen TAGESQ = TREAT*AGESQ
. gen TEDUC = TREAT*EDUC
. gen TNODEGREE = TREAT*NODEGREE
. gen TBLACK = TREAT*BLACK
. gen THISP = TREAT*HISP
. gen TRE74 = TREAT*RE74
. gen TRE75 = TREAT*RE75
. regress RE78 TREAT AGE AGESQ EDUC NODEGREE BLACK HISP RE74 RE75 /*
> */TAGE TAGESQ TEDUC TNODEGREE TBLACK THISP TRE74 TRE75
      Source |       SS       df       MS              Number of obs =    2675
-------------+------------------------------           F( 17,  2657) =  223.17
       Model |  3.8431e+11    17  2.2607e+10           Prob > F      =  0.0000
    Residual |  2.6915e+11  2657   101297131           R-squared     =  0.5881
. gen dyear2 = 0
. replace dyear2 = 1 if year==2
(2675 real changes made)
. gen Tdyear2 = TREAT*dyear2
. regress EARNS Tdyear2 TREAT dyear2
      Source |       SS       df       MS              Number of obs =    5350
-------------+------------------------------           F(  3,  5346) =  169.20
       Model |  1.0214e+11     3  3.4047e+10           Prob > F      =  0.0000
    Residual |  1.0757e+12  5346   201218724           R-squared     =  0.0867
-------------+------------------------------           Adj R-squared =  0.0862
       Total |  1.1779e+12  5349   220201247           Root MSE      =   14185
------------------------------------------------------------------------------
       EARNS |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     Tdyear2 |   2326.505   1528.712     1.52   0.128    -670.3928    5323.403
       TREAT |  -17531.28   1080.962   -16.22   0.000    -19650.41   -15412.15
      dyear2 |   2490.585   402.0217     6.20   0.000     1702.458    3278.711
       _cons |   19063.34   284.2723    67.06   0.000     18506.05    19620.63
------------------------------------------------------------------------------
.
. * CT Table 25.4 p.891 fourth row uses heteroskedastic-robust standard errors
. regress EARNS Tdyear2 TREAT dyear2, robust
Regression with robust standard errors                 Number of obs =    5350
                                                       F(  3,  5346) = 1222.98
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.0867
                                                       Root MSE      =   14185
------------------------------------------------------------------------------
             |               Robust
       EARNS |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     Tdyear2 |   2326.505   748.5021     3.11   0.002     859.1359    3793.875
       TREAT |  -17531.28   360.5992   -48.62   0.000     -18238.2   -16824.36
      dyear2 |   2490.585   414.1056     6.01   0.000     1678.769      3302.4
       _cons |   19063.34   272.5318    69.95   0.000     18529.06    19597.61
------------------------------------------------------------------------------
. estimates store diffindiff
.
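The Tdyear2 coefficient in the regression above is the difference-in-differences of group means. As an illustrative check (not part of the original program), the arithmetic can be reproduced from the RE75 and RE78 means reported in the bysort TREAT summary earlier in this log; the scalar names are hypothetical.

```stata
* Sketch only: group means copied from the bysort TREAT summary in this log
* (RE75 = earnings before, RE78 = earnings after). Scalar names are hypothetical.
scalar d_treated = 6349.145 - 1532.056    // change in mean earnings, treated
scalar d_control = 21553.92 - 19063.34    // change in mean earnings, controls
display "Difference-in-differences: " d_treated - d_control
```

The displayed value is about 2326.5, which agrees with the Tdyear2 coefficient up to rounding of the displayed means.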
. * Adding pretreatment controls makes no difference as they are time-invariant
. regress EARNS Tdyear2 TREAT dyear2 AGE AGESQ EDUC NODEGREE BLACK HISP
      Source |       SS       df       MS              Number of obs =    5350
-------------+------------------------------           F(  9,  5340) =  184.54
       Model |  2.7943e+11     9  3.1048e+10           Prob > F      =  0.0000
    Residual |  8.9843e+11  5340   168245017           R-squared     =  0.2372
-------------+------------------------------           Adj R-squared =  0.2359
       Total |  1.1779e+12  5349   220201247           Root MSE      =   12971
------------------------------------------------------------------------------
       EARNS |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     Tdyear2 |   2326.505   1397.856     1.66   0.096    -413.8634    5066.874
       TREAT |  -9766.469   1043.296    -9.36   0.000    -11811.76   -7721.183
      dyear2 |   2490.585   367.6092     6.78   0.000      1769.92    3211.249
         AGE |   1357.093   139.6885     9.72   0.000     1083.246    1630.939
       AGESQ |  -15.23373   1.911801    -7.97   0.000    -18.98164   -11.48582
        EDUC |   1504.728   91.99622    16.36   0.000     1324.377    1685.078
    NODEGREE |  -447.8275   588.8841    -0.76   0.447    -1602.281    706.6257
       BLACK |  -3177.524   446.5098    -7.12   0.000    -4052.865   -2302.182
        HISP |  -360.5058   993.7164    -0.36   0.717    -2308.596    1587.584
       _cons |  -25357.74   2618.207    -9.69   0.000    -30490.49   -20224.98
------------------------------------------------------------------------------
.
. ***** (1C) BEFORE-AFTER COMPARISON
.
. * Regression for treated only
. regress EARNS Tdyear2 if TREAT==1
      Source |       SS       df       MS              Number of obs =     370
-------------+------------------------------           F(  1,   368) =   59.41
       Model |  2.1464e+09     1  2.1464e+09           Prob > F      =  0.0000
    Residual |  1.3296e+10   368  36129816.6           R-squared     =  0.1390
-------------+------------------------------           Adj R-squared =  0.1367
       Total |  1.5442e+10   369  41848713.4           Root MSE      =  6010.8
------------------------------------------------------------------------------
       EARNS |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     Tdyear2 |    4817.09   624.9741     7.71   0.000     3588.121    6046.058
       _cons |   1532.056   441.9234     3.47   0.001     663.0436    2401.068
------------------------------------------------------------------------------
.
. * CT Table 25.4 p.891 third row uses heteroskedastic-robust standard errors
. regress EARNS Tdyear2 if TREAT==1, robust
Regression with robust standard errors                 Number of obs =     370
                                                       F(  1,   368) =   59.41
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.1390
                                                       Root MSE      =  6010.8
------------------------------------------------------------------------------
             |               Robust
       EARNS |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     Tdyear2 |    4817.09   624.9741     7.71   0.000     3588.121    6046.058
       _cons |   1532.056    236.684     6.47   0.000     1066.633    1997.478
------------------------------------------------------------------------------
. estimates store beforeafter
.
. ***** DISPLAY RESULTS FOR FIRST FOUR ROWS OF Table 25.4, p.891
.
. estimates table treatcontrol controlfunction beforeafter diffindiff, /*
> */ b(%10.0f) se(%10.0f) stats(N)
-----------------------------------------------------------------
    Variable | treatcon~l   controlf~n   beforeaf~r   diffindiff
-------------+---------------------------------------------------
       TREAT |     -15205          218                    -17531
             |        656          768                       361
         AGE |                     159
             |                     151
       AGESQ |                      -3
             |                       2
        EDUC |                     565
             |                     122
    NODEGREE |                     502
             |                     632
       BLACK |                    -699
             |                     432
        HISP |                    2227
             |                    1219
        RE74 |                       0
             |                       0
        RE75 |                       1
             |                       0
     Tdyear2 |                                  4817        2327
             |                                   625         749
      dyear2 |                                              2491
             |                                               414
       _cons |      21554        -2837          1532       19063
             |        312         2937           237         273
-------------+---------------------------------------------------
           N |       2675         2675           370        5350
-----------------------------------------------------------------
                                                     legend: b/se
.
. ********** ANALYSIS: (2) PROPENSITY SCORE USING STRATA (Table 25.4, p.891) **********
.
. use nswpsid, clear
.
. ***** (2A) COMPUTE PROPENSITY SCORE
.
. * Calculate propensity score using regressors in DW99 Table 3 footnote e
. logit TREAT AGE AGESQ EDUC EDUCSQ MARR NODEGREE BLACK HISP RE74 RE75
RE74SQ RE75SQ U74BLACK
Iteration 0: log likelihood = -672.64954
Iteration 1: log likelihood = -499.56574
Iteration 2: log likelihood = -318.55053
Iteration 3: log likelihood = -248.28844
Iteration 4: log likelihood = -225.08984
Iteration 5: log likelihood = -219.00396
Iteration 6: log likelihood = -209.30653
Iteration 7: log likelihood = -208.38887
Iteration 8: log likelihood = -205.17689
Iteration 9: log likelihood = -204.93156
Iteration 10: log likelihood = -204.92951
Iteration 11: log likelihood = -204.9295
Logit estimates                                   Number of obs   =       2675
                                                  LR chi2(13)     =     935.44
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.6953

------------------------------------------------------------------------------
       TREAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         AGE |   .3305734   .1203353     2.75   0.006     .0947206    .5664262
       AGESQ |  -.0063429   .0018561    -3.42   0.001    -.0099808   -.0027049
        EDUC |   .8247711   .3534216     2.33   0.020     .1320775    1.517465
      EDUCSQ |  -.0483153   .0186057    -2.60   0.009    -.0847819   -.0118488
        MARR |  -1.884062   .2994614    -6.29   0.000    -2.470996   -1.297129
    NODEGREE |   .1299868   .4284278     0.30   0.762    -.7097163      .96969
       BLACK |   1.132961    .352088     3.22   0.001     .4428814    1.823041
        HISP |   1.962762   .5673735     3.46   0.001     .8507302    3.074793
        RE74 |  -.0001047   .0000355    -2.95   0.003    -.0001743   -.0000351
        RE75 |  -.0002172   .0000415    -5.23   0.000    -.0002986   -.0001357
      RE74SQ |   2.36e-09   6.57e-10     3.59   0.000     1.07e-09    3.65e-09
      RE75SQ |   1.58e-10   6.68e-10     0.24   0.813    -1.15e-09    1.47e-09
    U74BLACK |   2.137042   .4273667     5.00   0.000     1.299419    2.974665
       _cons |  -7.552458   2.451721    -3.08   0.002    -12.35774   -2.747173
------------------------------------------------------------------------------
note: 19 failures and 0 successes completely determined.
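The two fit statistics in the logit header can be checked from the iteration log. The sketch below assumes that iteration 0 gives the log likelihood of the constant-only model (as is usual for Stata's logit) and that the last iteration gives the fitted log likelihood; the scalar names are illustrative.

```stata
* Log likelihoods copied from the iteration log above
* (ll0: iteration 0, assumed constant-only; ll1: final iteration)
scalar ll0 = -672.64954
scalar ll1 = -204.9295

* Likelihood-ratio statistic for the 13 slope coefficients
display "LR chi2(13) = " %9.2f (2*(ll1 - ll0))
* McFadden pseudo R-squared
display "Pseudo R2   = " %6.4f (1 - ll1/ll0)
```

Both values should match the reported 935.44 and 0.6953 up to rounding.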
. * now lose 1344 controls and 6 treated leaving 1325
. * versus DW Figure 1 1333 controls are dropped leaving 1342
. * and DW Table 3 column 6 says that there are 1255 left
.
. ***** (2C) CREATE FIGURE 25.3 ON PAGE 892
.
. * This will differ a little from figure in text due to U74 and U75 corrected
.
. label define tstatus 0 Comparison_sample 1 Treated_sample
. label values TREAT tstatus
. label variable TREAT "Treatment Status"
. graph twoway (scatter RE78 PSCORE if RE78 < 20000, msize(small)) /*
> */ (lowess RE78 PSCORE, bwidth(0.5) clpattern(solid)), /*
>
*/ by(TREAT, title("Post-treatment Earnings against Propensity Score", margin(b=3)
size(vlarge))
> ) /*
> */ subtitle(, bfcolor(none)) /*
> */ scale (1.2) plotregion(style(none)) /*
> */ xtitle(" Propensity Score
Propensity Score", size(medlarge))
> xscale(titlegap(*5)) /*
> */ ytitle("Real Earnings 1978", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend(pos(12) ring(0) col(2)) /*
> */ legend( label(1 "Original data") label(2 "Nonparametric regression"))
. graph export ch25treatment.wmf, replace
(file c:\Imbook\bwebpage\Section6\ch25treatment.wmf written in Windows Metafile format)
.
. ***** (2D) ADJUSTED DIFFERENCE Use PSCORE to summarize pre-treatment controls
.
. * A simple method regresses RE78 on a quadratic in PSCORE and on TREAT
. * and measures the treatment effect as the coefficient of TREAT
.
. gen PSCORESQ = PSCORE*PSCORE
. regress RE78 TREAT PSCORE PSCORESQ
      Source |       SS       df       MS              Number of obs =    1325
-------------+------------------------------           F(  3,  1321) =   46.14
       Model |  1.5152e+10     3  5.0505e+09           Prob > F      =  0.0000
    Residual |  1.4458e+11  1321   109450232           R-squared     =  0.0949
-------------+------------------------------           Adj R-squared =  0.0928
       Total |  1.5974e+11  1324   120645977           Root MSE      =   10462

------------------------------------------------------------------------------
        RE78 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       TREAT |   301.5344   1388.756     0.22   0.828    -2422.874    3025.943
.
. ***** (2F) Test for similar regressor means for treated and nontreated within each Strata
.
. * Compare means within Strata across treatment status
. tab STRATA TREAT, sum(AGE) nostand nofreq
            Means of AGE

           |   Treatment Status
    STRATA | Compariso  Treated_s |     Total
-----------+----------------------+----------
         1 | 31.427308  30.363636 | 31.415938
         2 | 28.037736  28.714286 | 28.116667
         3 | 27.833333  27.909091 | 27.857143
         4 | 27.529412      28.25 | 27.878788
         5 |    28.875       27.8 | 28.461538
         6 |        25       23.4 | 23.857143
         7 |    24.875       24.5 | 24.636364
         8 |      24.8         32 | 29.230769
         9 |         .  29.461538 | 29.461538
        10 | 23.285714  23.367089 | 23.360465
-----------+----------------------+----------
     Total | 30.961606  25.765363 | 30.259623
. tab STRATA TREAT, sum(EDUC) nostand nofreq
            Means of EDUC

           |   Treatment Status
    STRATA | Compariso  Treated_s |     Total
-----------+----------------------+----------
         1 | 11.229862  11.545455 | 11.233236
         2 | 10.433962  10.714286 | 10.466667
         3 | 10.583333  10.181818 | 10.457143
         4 | 10.647059    10.0625 | 10.363636
         5 |    10.625        9.4 | 10.153846
         6 | 9.3333333  10.066667 | 9.8571429
         7 |     9.875  11.071429 | 10.636364
         8 |      10.8      11.25 | 11.076923
         9 |         .         11 |        11
        10 | 10.571429  10.164557 | 10.197674
-----------+----------------------+----------
     Total | 11.141361  10.413408 | 11.043019
. tab STRATA TREAT, sum(MARR) nostand nofreq
            Means of MARR

           |   Treatment Status
    STRATA | Compariso  Treated_s |     Total
-----------+----------------------+----------
         1 |  .8280943  .81818182 | .82798834
         2 | .56603774  .85714286 |        .6
         3 | .29166667  .18181818 | .25714286
         4 | .23529412        .25 | .24242424
         5 |       .25          0 | .15384615
         6 | .16666667  .06666667 |  .0952381
         7 |      .125  .07142857 | .09090909
         8 |        .2       .625 | .46153846
         9 |         .  .53846154 | .53846154
        10 |         0          0 |         0
-----------+----------------------+----------
     Total | .77574171  .19553073 | .69735849
. tab STRATA TREAT, sum(NODEGREE) nostand nofreq
            Means of NODEGREE

           |   Treatment Status
    STRATA | Compariso  Treated_s |     Total
-----------+----------------------+----------
         1 | .38408644  .36363636 | .38386783
         2 | .62264151  .57142857 | .61666667
         3 |      .625  .54545455 |        .6
         4 | .52941176       .625 | .57575758
         5 |      .625         .8 | .69230769
         6 | .83333333         .8 | .80952381
         7 |      .625  .64285714 | .63636364
         8 |        .8        .75 | .76923077
         9 |         .  .76923077 | .76923077
        10 | .71428571  .75949367 | .75581395
-----------+----------------------+----------
     Total | .41186736  .69832402 | .45056604
. tab STRATA TREAT, sum(BLACK) nostand nofreq
            Means of BLACK

           |   Treatment Status
    STRATA | Compariso  Treated_s |     Total
-----------+----------------------+----------
         1 | .36247544  .63636364 |  .3654033
         2 | .60377358  .57142857 |        .6
         3 | .66666667  .54545455 | .62857143
         4 | .88235294       .875 | .87878788
         5 |         1         .4 | .76923077
         6 | .83333333         .6 | .66666667
         7 |      .875  .92857143 | .90909091
         8 |        .8          1 | .92307692
         9 |         .  .92307692 | .92307692
        10 |         1  .94936709 | .95348837
-----------+----------------------+----------
     Total | .40401396  .83798883 | .46264151
. tab STRATA TREAT, sum(HISP) nostand nofreq
            Means of HISP

           |   Treatment Status
    STRATA | Compariso  Treated_s |     Total
-----------+----------------------+----------
         1 | .04911591          0 | .04859086
         2 |  .0754717  .28571429 |        .1
         3 | .08333333          0 | .05714286
         4 |         0          0 |         0
         5 |         0         .2 | .07692308
         6 | .16666667  .13333333 | .14285714
         7 |      .125  .07142857 | .09090909
         8 |        .2          0 | .07692308
         9 |         .  .07692308 | .07692308
        10 |         0  .05063291 | .04651163
-----------+----------------------+----------
     Total | .05148342  .06145251 | .05283019
. tab STRATA TREAT, sum(RE74) nostand nofreq
            Means of RE74

           |   Treatment Status
    STRATA | Compariso  Treated_s |     Total
-----------+----------------------+----------
         1 | 12216.528   12142.62 | 12215.738
         2 | 5989.8844  2031.6573 | 5528.0912
         3 | 6476.1906  5884.7335 | 6290.3041
         4 |  4790.868    4895.09 | 4841.3999
         5 | 2375.3662  5715.8799 | 3660.1792
         6 | 3173.6867  2402.9567 | 2623.1653
         7 | 1533.1259  2269.1672 | 2001.5158
         8 |  1567.414          0 | 602.85154
         9 |         .  34.243847 | 34.243847
        10 |         0          0 |         0
-----------+----------------------+----------
     Total | 11386.483  2165.8167 | 10140.823
. tab STRATA TREAT, sum(RE75) nostand nofreq
            Means of RE75

           |   Treatment Status
    STRATA | Compariso  Treated_s |     Total
-----------+----------------------+----------
         1 | 10352.924  8964.4728 | 10338.081
         2 |  3916.448  3250.0113 |  3838.697
         3 | 2417.8314  2694.2624 | 2504.7097
         4 |   3134.96   2905.615 | 3023.7624
         5 | 3204.6788   1917.262 | 2709.5185
         6 |   2878.54  1731.1554 | 2058.9796
         7 | 643.84411  1230.5051 | 1017.1739
         8 | 2539.0337  1501.9275 | 1900.8145
         9 |         .  201.91542 | 201.91542
        10 | 127.88014  234.47151 | 225.79547
-----------+----------------------+----------
     Total | 9528.6389  1583.4094 | 8455.2834
. tab STRATA TREAT, sum(U74BLACK) nostand nofreq
            Means of U74BLACK

           |   Treatment Status
    STRATA | Compariso  Treated_s |     Total
-----------+----------------------+----------
         1 | .01473477          0 | .01457726
         2 | .05660377  .14285714 | .06666667
         3 | .08333333  .09090909 | .08571429
         4 | .17647059      .1875 | .18181818
         5 |       .25         .2 | .23076923
         6 | .16666667  .06666667 |  .0952381
         7 |      .125  .21428571 | .18181818
         8 |        .4          1 | .76923077
         9 |         .  .92307692 | .92307692
        10 |         1  .94936709 | .95348837
-----------+----------------------+----------
     Total | .03141361  .58659218 | .10641509
.
. * Formal test of difference in means within strata across treatment status
. * Example is for education
. * bysort STRATA: oneway EDUC T
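One way to carry out the formal within-stratum comparison mentioned in the comment above is a two-sample t test run stratum by stratum. The sketch below is an assumption about how this could be done, not the authors' code: it uses simulated data with hypothetical values rather than the NSW-PSID sample, and it swaps in ttest for the commented-out oneway command.

```stata
* Simulated data for illustration only (not the NSW-PSID sample)
clear
set seed 10101
set obs 400
generate STRATA = 1 + mod(_n, 4)
generate TREAT = runiform() < 0.4
generate EDUC = 10 + 2*rnormal()

* Two-sample t test of equal mean EDUC for treated and untreated,
* repeated within each stratum
bysort STRATA: ttest EDUC, by(TREAT)
```

A stratum that contains only treated or only comparison observations, as stratum 9 does in the tables above, has no second group to compare against and would need to be excluded first.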
.
. ***** (2G) Calculate weighted average of within strata mean difference in outcome
.
. #delimit ;
delimiter now ;
. global sum = 0 ;
.
* Sums the estimate of interest over strata ;
. global sumwgt = 0 ;
. /* Sums the number of treated obs over strata */
> global count = 0 ;
.
/* This gives the number of Strata used
> global numcut = 10;
*/
12.
global sum = $sum + $addon * $tobs ;
13.
global sumwgt = $sumwgt + $tobs ;
14.
global count = $count + 1 ;
15. } ;
16. } ;
1 estimate = -4410.946812653378   Top cut = .1   #treat obs = 11
2 estimate = -2113.275144674707   Top cut = .2   #treat obs = 7
3 estimate = 1486.684503266305    Top cut = .3   #treat obs = 11
4 estimate = -6085.742371951832   Top cut = .4   #treat obs = 16
5 estimate = 1899.984014892578    Top cut = .5   #treat obs = 5
6 estimate = -411.1481648763024   Top cut = .6   #treat obs = 15
7 estimate = 133.9267490931921    Top cut = .7   #treat obs = 14
8 estimate = 1848.656362915039    Top cut = .8   #treat obs = 8
9 estimate = 0                    Top cut = .9   #treat obs = 13
10 estimate = 4857.563579676591   Top cut =      #treat obs = 79
. #delimit cr ;
delimiter now cr
.
.
. ***** DISPLAY RESULT: "Propensity Score" estimate in last row Table 25.4
.
. * Weighted estimate
. di $sum / $sumwgt " Count = " $count
1562.7274 Count = 9
.
. * This differs from value 995 given in text due to
. * previously mentioned correction of U74 and U75.
. * Now get 1562 with se not estimated
. * compared to DW99 estimates Table 3 column 4 1608 and column 5 1494
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section6\mma25p1treatment.txt
log type: text
closed on: 26 May 2005, 10:26:22
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section6\mma25p2matching.txt
log type: text
opened on: 26 May 2005, 10:26:31
.
. ********** OVERVIEW OF MMA25P2MATCHING.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         U74 |      2675    .1345794    .3413376          0          1
         U75 |      2675    .1293458     .335645          0          1
.
. * Correct the original data
. drop U74 U75
. gen U74 = cond(RE74 == 0, 1, 0)
. gen U75 = cond(RE75 == 0, 1, 0)
.
. * Correct U74 and U75
. sum U74 U75
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         U74 |      2675    .1293458     .335645          0          1
         U75 |      2675    .1345794    .3413376          0          1
.
. * Create regressors used as additional controls in regressions below
. gen AGESQ = AGE*AGE
. gen EDUCSQ = EDUC*EDUC
. * DW99 do not define NODEGREE but following gives Table 1 means
. gen NODEGREE = 0
. replace NODEGREE = 1 if EDUC < 12
(891 real changes made)
. gen RE74SQ = RE74*RE74
. gen RE75SQ = RE75*RE75
. gen U74BLACK = U74*BLACK
. gen U74HISP = U74*HISP
.
. sum AGE EDUC NODEGREE BLACK HISP MARR U74 U75 RE74 RE75 RE78 TREAT /*
> */ AGESQ EDUCSQ RE74SQ RE75SQ U74BLACK U74HISP
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         AGE |      2675    34.22579    10.49984         17         55
        EDUC |      2675    11.99439    3.053556          0         17
    NODEGREE |      2675    .3330841    .4714045          0          1
       BLACK |      2675    .2915888    .4545789          0          1
        HISP |      2675    .0343925    .1822693          0          1
-------------+--------------------------------------------------------
        MARR |      2675    .8194393    .3847257          0          1
         U74 |      2675    .1293458     .335645          0          1
         U75 |      2675    .1345794    .3413376          0          1
        RE74 |      2675       18230    13722.25          0     137149
        RE75 |      2675    17850.89    13877.78          0     156653
-------------+--------------------------------------------------------
        RE78 |      2675    20502.38    15632.52          0     121174
       TREAT |      2675    .0691589    .2537716          0          1
       AGESQ |      2675     1281.61    766.8415        289       3025
      EDUCSQ |      2675    153.1862    70.62231          0        289
      RE74SQ |      2675    5.21e+08    8.47e+08          0   1.88e+10
-------------+--------------------------------------------------------
      RE75SQ |      2675    5.11e+08    8.91e+08          0   2.45e+10
    U74BLACK |      2675    .0549533    .2279316          0          1
     U74HISP |      2675    .0056075    .0746868          0          1
.
. bysort TREAT: sum AGE EDUC NODEGREE BLACK HISP MARR U74 U75 RE74 RE75
RE78 TREAT /*
> */ AGESQ EDUCSQ RE74SQ RE75SQ U74BLACK U74HISP
----------------------------------------------------------------------------------------------------
-> TREAT = 0

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         AGE |      2490     34.8506    10.44076         18         55
        EDUC |      2490    12.11687    3.082435          0         17
    NODEGREE |      2490    .3052209    .4605934          0          1
       BLACK |      2490    .2506024     .433447          0          1
        HISP |      2490    .0325301    .1774389          0          1
-------------+--------------------------------------------------------
        MARR |      2490    .8662651    .3404357          0          1
         U74 |      2490    .0863454    .2809298          0          1
         U75 |      2490          .1    .3000603          0          1
        RE74 |      2490    19428.75    13406.88          0     137149
        RE75 |      2490    19063.34    13596.95          0     156653
-------------+--------------------------------------------------------
        RE78 |      2490    21553.92    15555.35          0     121174
       TREAT |      2490           0           0          0          0
       AGESQ |      2490     1323.53     769.796        324       3025
      EDUCSQ |      2490    156.3161    71.43048          0        289
      RE74SQ |      2490    5.57e+08    8.66e+08          0   1.88e+10
-------------+--------------------------------------------------------
      RE75SQ |      2490    5.48e+08    9.12e+08          0   2.45e+10
    U74BLACK |      2490    .0144578    .1193923          0          1
     U74HISP |      2490    .0036145    .0600237          0          1
----------------------------------------------------------------------------------------------------
-> TREAT = 1

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         AGE |       185    25.81622    7.155019         17         48
        EDUC |       185    10.34595     2.01065          4         16
    NODEGREE |       185    .7081081    .4558666          0          1
       BLACK |       185    .8432432    .3645579          0          1
        HISP |       185    .0594595    .2371244          0          1
-------------+--------------------------------------------------------
        MARR |       185    .1891892    .3927217          0          1
         U74 |       185    .7081081    .4558666          0          1
         U75 |       185          .6    .4912274          0          1
        RE74 |       185    2095.574    4886.623          0    35040.1
        RE75 |       185    1532.056    3219.251          0    25142.2
-------------+--------------------------------------------------------
        RE78 |       185    6349.145    7867.405          0    60307.9
       TREAT |       185           1           0          1          1
       AGESQ |       185    717.3946    431.2517        289       2304
      EDUCSQ |       185    111.0595    39.30388         16        256
      RE74SQ |       185    2.81e+07    1.14e+08          0   1.23e+09
-------------+--------------------------------------------------------
      RE75SQ |       185    1.27e+07    5.60e+07          0   6.32e+08
    U74BLACK |       185          .6    .4912274          0          1
     U74HISP |       185    .0324324    .1776263          0          1
.
. *** NOTE: The benchmark estimate obtained from NSW experiment is
. ***       $1,794 = Average(RE_78 for NSW treated) - Average(RE_78 for NSW controls)
. ***       See MMA25P3EXTRA.DO
.
. ********** (1) ANALYSIS for DW02 SPECIFICATION OF THE PROPENSITY SCORE **********
.
. * Following defines number of bootstrap replications
. * Table 25.6 used 200 (or 100 in some places)
. global breps 200
.
. * From DW02 Table 3 footnote a the propensity score uses the following regressors
. global XDW02 AGE AGESQ EDUC EDUCSQ MARR NODEGREE BLACK HISP RE74 RE75
RE74SQ U74 U75 U74HISP
.
. **** Table 25.5 p.894 summarizes propensity score
. **** using just those observations with common support
.
****************************************************
Algorithm to estimate the propensity score
****************************************************
Logit estimates                                   Number of obs   =       2675
                                                  LR chi2(14)     =     951.10
                                                  Prob > chi2     =     0.0000
Log likelihood = -197.10175                       Pseudo R2       =     0.7070

------------------------------------------------------------------------------
       TREAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         AGE |   .2628422    .120206     2.19   0.029     .0272428    .4984416
       AGESQ |  -.0053794   .0018341    -2.93   0.003    -.0089742   -.0017846
        EDUC |   .7149774   .3418173     2.09   0.036     .0450278    1.384927
      EDUCSQ |  -.0426178   .0179039    -2.38   0.017    -.0777088   -.0075269
        MARR |  -1.780857    .301802    -5.90   0.000    -2.372378   -1.189336
    NODEGREE |   .1891046   .4257533     0.44   0.657    -.6453564    1.023566
       BLACK |   2.519383    .370358     6.80   0.000     1.793495    3.245272
        HISP |   3.087327   .7340486     4.21   0.000     1.648618    4.526036
        RE74 |  -.0000448   .0000425    -1.05   0.292     -.000128    .0000385
50%     .0090427                      Mean           .1447205
                        Largest       Std. Dev.      .2809511
75%     .0897599       .9803043
90%      .656286       .9830988       Variance       .0789335
95%     .9392306       .9855413       Skewness       2.049999
99%     .9640553       .9857676       Kurtosis       5.748631
******************************************************
Step 1: Identification of the optimal number of blocks
Use option detail if you want more detailed output
******************************************************
**********************************************************
*******************************************
End of the algorithm to estimate the pscore
*******************************************
.
. **** For completeness do same with common support option NOT selected
.
. drop myscore myblock
. pscore TREAT $XDW02, pscore(myscore) blockid(myblock) numblo(5) level(0.005) logit
****************************************************
Algorithm to estimate the propensity score
****************************************************
------------+-----------------------------------
      Total |      2,675      100.00
Logit estimates                                   Number of obs   =       2675
                                                  LR chi2(14)     =     951.10
                                                  Prob > chi2     =     0.0000
Log likelihood = -197.10175                       Pseudo R2       =     0.7070

------------------------------------------------------------------------------
       TREAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         AGE |   .2628422    .120206     2.19   0.029     .0272428    .4984416
       AGESQ |  -.0053794   .0018341    -2.93   0.003    -.0089742   -.0017846
        EDUC |   .7149774   .3418173     2.09   0.036     .0450278    1.384927
      EDUCSQ |  -.0426178   .0179039    -2.38   0.017    -.0777088   -.0075269
        MARR |  -1.780857    .301802    -5.90   0.000    -2.372378   -1.189336
    NODEGREE |   .1891046   .4257533     0.44   0.657    -.6453564    1.023566
       BLACK |   2.519383    .370358     6.80   0.000     1.793495    3.245272
        HISP |   3.087327   .7340486     4.21   0.000     1.648618    4.526036
        RE74 |  -.0000448   .0000425    -1.05   0.292     -.000128    .0000385
        RE75 |  -.0002678   .0000485    -5.52   0.000    -.0003628   -.0001727
      RE74SQ |   1.99e-09   7.75e-10     2.57   0.010     4.72e-10    3.51e-09
         U74 |   3.100056   .5187391     5.98   0.000     2.083346    4.116766
         U75 |  -1.273525   .4644557    -2.74   0.006    -2.183842   -.3632088
     U74HISP |  -1.925803    1.07186    -1.80   0.072     -4.02661    .1750032
       _cons |  -7.407524   2.445692    -3.03   0.002    -12.20099   -2.614056
------------------------------------------------------------------------------
note: 65 failures and 0 successes completely determined.
      Percentiles      Smallest
 1%     2.36e-09       1.76e-12
 5%     8.39e-08       5.07e-12
10%     4.47e-07       1.14e-11       Obs                2675
25%     .0000107       1.14e-11       Sum of Wgt.        2675

50%     .0002558                      Mean           .0691589
                        Largest       Std. Dev.      .2074207
75%     .0071195       .9830988
90%      .129801       .9855413       Variance       .0430234
95%     .6394923       .9857676       Skewness       3.407447
99%     .9572224        .986626       Kurtosis       13.56404
******************************************************
Step 1: Identification of the optimal number of blocks
Use option detail if you want more detailed output
******************************************************
**********************************************************
Step 2: Test of balancing property of the propensity score
Use option detail if you want more detailed output
**********************************************************
Variable BLACK is not balanced in block 1
The balancing property is not satisfied
Try a different specification of the propensity score
  Inferior |
  of block |         TREAT
 of pscore |         0          1 |     Total
-----------+----------------------+----------
         0 |     2,265          7 |     2,272
       .05 |        98          2 |       100
        .1 |        56         10 |        66
        .2 |        33         14 |        47
        .4 |        22         24 |        46
        .6 |         7         33 |        40
        .8 |         9         95 |       104
-----------+----------------------+----------
     Total |     2,490        185 |     2,675
*******************************************
End of the algorithm to estimate the pscore
*******************************************
.
. **** All of the following use common support
.
. ****************************************************************************
. **** Note: The results in the first half of Table 25.6
. ****       erroneously added RE75SQ as a regressor.
. ****       This does not affect Table 25.5 (done correctly) or
. ****       stratification estimates (which used myscore from correct model).
. ****       But it does affect NN, radius and kernel estimates.
. ****       To enable comparison with the text we do analysis here
. ****       both with and without RE75SQ.
. ****       Even dropping RE75SQ the results continue to differ from DW02.
. ****
. ****                              Text         Corrected
. ****                              Table 25.6   Table 25.6   DW 2002
. ****       NN                       2385         1286         1202
. ****       Radius = 0.001          -7815        -7808         1187
. ****       Radius = 0.0001         -9333        -6401         1191
. ****       Radius = 0.00001        -2200        -1135         1198
. ****       Stratification           1497         1497
. ****       Kernel                   1309         1342
. ****************************************************************************
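For readers without the user-written pscore, attnd, attr and attk commands used in this log, nearest-neighbor matching on an estimated logit propensity score can also be sketched with Stata's official teffects psmatch command (available from Stata 13). This is a swapped-in technique rather than the routine used here, so its numbers would not reproduce Table 25.6, and the data below are simulated with hypothetical variable names.

```stata
* Simulated data for illustration only (not the NSW-PSID sample)
clear
set seed 10101
set obs 1000
generate x = rnormal()
* Treatment is more likely for larger x, so a raw comparison is confounded
generate treat = runiform() < invlogit(-1 + x)
* True treatment effect is 2 in this simulation
generate y = 1 + 2*treat + x + rnormal()

* Average treatment effect on the treated, one nearest neighbor,
* propensity score from a logit of treat on x
teffects psmatch (y) (treat x, logit), atet nneighbor(1)
```

The atet option targets the same estimand as the ATT rows of Table 25.6; the standard errors reported by teffects psmatch are not bootstrap standard errors.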
.
. **** Row 1 Table 25.6: Nearest neighbor matching (random version)
. set seed 10101
. attnd RE78 TREAT $XDW02 RE75SQ, comsup boot reps($breps) dots logit
          53    2385.430    1792.028       1.331

Bootstrap statistics                              Number of obs    =      2675
                                                  Replications     =       200

          53    2385.430    1094.969       2.179

          60    1285.782    3895.044       0.330

Bootstrap statistics                              Number of obs    =      2675
                                                  Replications     =       200

          60    1285.782    1275.405       1.008
         517   -7815.382    1118.181      -6.989
....................................................................................................
> ..................................................................................................
> ..
Bootstrap statistics                              Number of obs    =      2675
                                                  Replications     =       200

         517   -7815.381    3794.466      -2.060
51
         541   -7808.241    1146.418      -6.811

Bootstrap statistics                              Number of obs    =      2675
                                                  Replications     =       200

         541   -7808.242    3770.093      -2.071

          92   -9333.120    2285.624      -4.083

Bootstrap statistics                              Number of obs    =      2675
                                                  Replications     =       200
BC = bias-corrected

          92   -9333.120    5211.110      -1.791

          91   -6401.345    2054.218      -3.116
> ..................................................................................................
> ..
Bootstrap statistics                              Number of obs    =      2675
                                                  Replications     =       200

          91   -6401.345    5618.880      -1.139
15
          19   -2200.022    2986.211      -0.737

Bootstrap statistics                              Number of obs    =      2675
                                                  Replications     =       200

          19   -2200.022    7009.510      -0.314

          17   -1135.184    3189.367      -0.356

Bootstrap statistics                              Number of obs    =      2675
                                                  Replications     =       200

          17   -1135.184    7030.204      -0.161

        1086    1497.484     920.688       1.626
---------------------------------------------------------

Bootstrap statistics                              Number of obs    =      2675
                                                  Replications     =       200

        1086    1497.484     913.129       1.640
---------------------------------------------------------
.
. **** Row 6 Table 25.6: Kernel Matching
. set seed 10101
. attk RE78 TREAT $XDW02 RE75SQ, comsup boot reps($breps) dots logit
        1058    1309.217

Bootstrap statistics                              Number of obs    =      2675
                                                  Replications     =       200

        1058    1309.217     958.180       1.366

        1086    1342.016

Bootstrap statistics                              Number of obs    =      2675
                                                  Replications     =       200

        1086    1342.016     933.867       1.437
---------------------------------------------------------
.
. ********** (2) ANALYSIS for DW99 SPECIFICATION OF THE PROPENSITY SCORE **********
.
. * From DW99 Table 3 footnote e the propensity score uses the following regressors
. global XDW99 AGE AGESQ EDUC EDUCSQ MARR NODEGREE BLACK HISP RE74 RE75
RE74SQ RE75SQ U74BLACK
.
. * Note that CT Table 25.6 footnote b erroneously lists RE74*RE75 as regressor
. * but this program (correctly) did not include RE74*RE75
.
. **** Propensity score with just those observations with common support
.
. drop myscore myblock
. pscore TREAT $XDW99, pscore(myscore) comsup blockid(myblock) numblo($breps)
level(0.005) logit
****************************************************
Algorithm to estimate the propensity score
****************************************************
                                                  Number of obs   =       2675
                                                  LR chi2(13)     =     935.44
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.6953

------------------------------------------------------------------------------
       TREAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         AGE |   .3305734   .1203353     2.75   0.006     .0947206    .5664262
       AGESQ |  -.0063429   .0018561    -3.42   0.001    -.0099808   -.0027049
        EDUC |   .8247711   .3534216     2.33   0.020     .1320775    1.517465
      EDUCSQ |  -.0483153   .0186057    -2.60   0.009    -.0847819   -.0118488
        MARR |  -1.884062   .2994614    -6.29   0.000    -2.470996   -1.297129
    NODEGREE |   .1299868   .4284278     0.30   0.762    -.7097163      .96969
       BLACK |   1.132961    .352088     3.22   0.001     .4428814    1.823041
        HISP |   1.962762   .5673735     3.46   0.001     .8507302    3.074793
        RE74 |  -.0001047   .0000355    -2.95   0.003    -.0001743   -.0000351
        RE75 |  -.0002172   .0000415    -5.23   0.000    -.0002986   -.0001357
      RE74SQ |   2.36e-09   6.57e-10     3.59   0.000     1.07e-09    3.65e-09
      RE75SQ |   1.58e-10   6.68e-10     0.24   0.813    -1.15e-09    1.47e-09
    U74BLACK |   2.137042   .4273667     5.00   0.000     1.299419    2.974665
       _cons |  -7.552458   2.451721    -3.08   0.002    -12.35774   -2.747173
------------------------------------------------------------------------------
note: 19 failures and 0 successes completely determined.
50%     .0111854                      Mean           .1388772
                        Largest       Std. Dev.       .275571
75%     .0779976       .9744237
90%     .6200607       .9747552       Variance       .0759394
95%     .9494181       .9747918       Skewness        2.17177
99%      .970738       .9748754       Kurtosis       6.296349
******************************************************
Step 1: Identification of the optimal number of blocks
Use option detail if you want more detailed output
******************************************************
**********************************************************
Step 2: Test of balancing property of the propensity score
Use option detail if you want more detailed output
**********************************************************
       .09 |         8          1 |         9
      .095 |         6          0 |         6
        .1 |         9          0 |         9
      .105 |         4          0 |         4
       .11 |         8          0 |         8
      .115 |         3          0 |         3
       .12 |         1          0 |         1
      .125 |         2          3 |         5
       .13 |         6          1 |         7
      .135 |         1          0 |         1
       .14 |         1          1 |         2
      .145 |         1          0 |         1
       .15 |         2          0 |         2
      .155 |         4          0 |         4
       .16 |         3          0 |         3
      .165 |         2          0 |         2
      .175 |         1          0 |         1
       .18 |         0          1 |         1
      .185 |         1          0 |         1
       .19 |         2          0 |         2
      .195 |         2          1 |         3
        .2 |         1          0 |         1
      .205 |         1          0 |         1
      .215 |         5          0 |         5
      .225 |         2          1 |         3
       .23 |         2          1 |         3
      .235 |         2          3 |         5
       .24 |         2          0 |         2
      .245 |         0          1 |         1
       .25 |         0          2 |         2
       .26 |         1          1 |         2
      .265 |         1          0 |         1
       .27 |         1          0 |         1
       .28 |         1          0 |         1
      .285 |         1          0 |         1
       .29 |         2          1 |         3
      .295 |         2          1 |         3
        .3 |         2          0 |         2
      .305 |         0          1 |         1
      .315 |         1          0 |         1
       .32 |         0          1 |         1
      .325 |         2          1 |         3
       .33 |         1          0 |         1
      .335 |         0          1 |         1
       .34 |         1          1 |         2
      .345 |         1          2 |         3
       .35 |         2          0 |         2
      .355 |         0          1 |         1
      .365 |         1          0 |         1
       .37 |         2          0 |         2
      .375 |         2          2 |         4
       .38 |         1          2 |         3
      .385 |         1          4 |         5
        .4 |         0          1 |         1
      .405 |         0          2 |         2
       .42 |         0          1 |         1
      .425 |         1          0 |         1
       .45 |         2          0 |         2
       .47 |         1          0 |         1
       .48 |         1          1 |         2
      .485 |         2          0 |         2
      .495 |         1          0 |         1
        .5 |         0          2 |         2
       .51 |         0          2 |         2
      .515 |         2          1 |         3
      .525 |         0          1 |         1
       .53 |         0          2 |         2
      .535 |         0          1 |         1
       .54 |         1          0 |         1
      .555 |         0          1 |         1
       .56 |         1          1 |         2
      .565 |         1          0 |         1
       .57 |         0          1 |         1
      .575 |         1          1 |         2
       .59 |         0          1 |         1
      .595 |         0          1 |         1
        .6 |         0          1 |         1
      .605 |         0          1 |         1
       .61 |         1          2 |         3
      .615 |         0          1 |         1
       .62 |         0          1 |         1
      .625 |         0          1 |         1
      .635 |         1          2 |         3
       .64 |         1          1 |         2
      .645 |         2          0 |         2
      .665 |         0          1 |         1
       .67 |         1          0 |         1
      .675 |         0          3 |         3
       .68 |         1          0 |         1
       .69 |         1          0 |         1
       .71 |         1          1 |         2
      .735 |         0          1 |         1
       .74 |         1          0 |         1
      .745 |         2          0 |         2
      .765 |         1          1 |         2
       .79 |         0          4 |         4
      .795 |         0          1 |         1
        .8 |         0          1 |         1
      .805 |         0          2 |         2
      .815 |         0          3 |         3
      .825 |         0          1 |         1
       .84 |         0          1 |         1
      .845 |         0          1 |         1
       .85 |         0          1 |         1
       .86 |         0          1 |         1
      .865 |         0          1 |         1
      .895 |         0          1 |         1
        .9 |         0          2 |         2
      .905 |         0          2 |         2
      .915 |         0          1 |         1
       .92 |         0          1 |         1
      .925 |         0          7 |         7
       .93 |         0          2 |         2
      .935 |         0          1 |         1
       .94 |         0          3 |         3
      .945 |         1          6 |         7
       .95 |         1         14 |        15
      .955 |         0         16 |        16
       .96 |         1          5 |         6
      .965 |         3         12 |        15
       .97 |         1         13 |        14
-----------+----------------------+----------
     Total |     1,146        185 |     1,331
Note: the common support option has been selected
*******************************************
End of the algorithm to estimate the pscore
*******************************************
.
. **** For completeness do same with common support option NOT selected
.
. drop myscore myblock
. pscore TREAT $XDW99, pscore(myscore) blockid(myblock) numblo($breps) level(0.005) logit
****************************************************
Algorithm to estimate the propensity score
****************************************************
                                                  Number of obs   =       2675
                                                  LR chi2(13)     =     935.44
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.6953

------------------------------------------------------------------------------
       TREAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         AGE |   .3305734   .1203353     2.75   0.006     .0947206    .5664262
       AGESQ |  -.0063429   .0018561    -3.42   0.001    -.0099808   -.0027049
        EDUC |   .8247711   .3534216     2.33   0.020     .1320775    1.517465
      EDUCSQ |  -.0483153   .0186057    -2.60   0.009    -.0847819   -.0118488
        MARR |  -1.884062   .2994614    -6.29   0.000    -2.470996   -1.297129
    NODEGREE |   .1299868   .4284278     0.30   0.762    -.7097163      .96969
       BLACK |   1.132961    .352088     3.22   0.001     .4428814    1.823041
        HISP |   1.962762   .5673735     3.46   0.001     .8507302    3.074793
        RE74 |  -.0001047   .0000355    -2.95   0.003    -.0001743   -.0000351
        RE75 |  -.0002172   .0000415    -5.23   0.000    -.0002986   -.0001357
      RE74SQ |   2.36e-09   6.57e-10     3.59   0.000     1.07e-09    3.65e-09
      RE75SQ |   1.58e-10   6.68e-10     0.24   0.813    -1.15e-09    1.47e-09
    U74BLACK |   2.137042   .4273667     5.00   0.000     1.299419    2.974665
       _cons |  -7.552458   2.451721    -3.08   0.002    -12.35774   -2.747173
------------------------------------------------------------------------------
note: 19 failures and 0 successes completely determined.
      Percentiles      Smallest
 1%     2.84e-08       4.49e-11
 5%     4.47e-07       4.88e-10
10%     2.07e-06       4.88e-10       Obs                2675
25%      .000034       4.95e-10       Sum of Wgt.        2675

50%     .0006388                      Mean           .0691589
                        Largest       Std. Dev.      .2063646
75%      .010941       .9744237
90%     .1336877       .9747552       Variance       .0425863
95%     .6200607       .9747918       Skewness       3.471137
99%     .9651648       .9748754       Kurtosis       14.05057
******************************************************
Step 1: Identification of the optimal number of blocks
Use option detail if you want more detailed output
******************************************************
**********************************************************
Step 2: Test of balancing property of the propensity score
Use option detail if you want more detailed output
**********************************************************
Variable BLACK is not balanced in block 1
The balancing property is not satisfied
Try a different specification of the propensity score
  Inferior |
  of block |         TREAT
 of pscore |         0          1 |     Total
-----------+----------------------+----------
         0 |     1,845          2 |     1,847
      .005 |       143          3 |       146
       .01 |        78          0 |        78
      .015 |        42          0 |        42
       .02 |        38          0 |        38
      .025 |        29          1 |        30
       .03 |        22          0 |        22
      .035 |        23          0 |        23
       .04 |        22          0 |        22
      .045 |        17          1 |        18
       .05 |        23          0 |        23
      .055 |        13          1 |        14
       .06 |        12          0 |        12
      .065 |         9          0 |         9
       .07 |        11          1 |        12
      .075 |         9          1 |        10
       .08 |         6          0 |         6
      .085 |         6          0 |         6
       .09 |         8          1 |         9
      .095 |         6          0 |         6
        .1 |         9          0 |         9
      .105 |         4          0 |         4
       .11 |         8          0 |         8
      .115 |         3          0 |         3
       .12 |         1          0 |         1
      .125 |         2          3 |         5
       .13 |         6          1 |         7
      .135 |         1          0 |         1
       .14 |         1          1 |         2
      .145 |         1          0 |         1
       .15 |         2          0 |         2
      .155 |         4          0 |         4
       .16 |         3          0 |         3
      .165 |         2          0 |         2
      .175 |         1          0 |         1
       .18 |         0          1 |         1
      .185 |         1          0 |         1
       .19 |         2          0 |         2
      .195 |         2          1 |         3
        .2 |         1          0 |         1
      .205 |         1          0 |         1
      .215 |         5          0 |         5
      .225 |         2          1 |         3
       .23 |         2          1 |         3
      .235 |         2          3 |         5
       .24 |         2          0 |         2
      .245 |         0          1 |         1
       .25 |         0          2 |         2
       .26 |         1          1 |         2
      .265 |         1          0 |         1
       .27 |         1          0 |         1
       .28 |         1          0 |         1
      .285 |         1          0 |         1
       .29 |         2          1 |         3
      .295 |         2          1 |         3
        .3 |         2          0 |         2
      .305 |         0          1 |         1
      .315 |         1          0 |         1
       .32 |         0          1 |         1
      .325 |         2          1 |         3
       .33 |         1          0 |         1
      .335 |         0          1 |         1
       .34 |         1          1 |         2
      .345 |         1          2 |         3
       .35 |         2          0 |         2
      .355 |         0          1 |         1
      .365 |         1          0 |         1
       .37 |         2          0 |         2
      .375 |         2          2 |         4
       .38 |         1          2 |         3
      .385 |         1          4 |         5
        .4 |         0          1 |         1
      .405 |         0          2 |         2
       .42 |         0          1 |         1
      .425 |         1          0 |         1
       .45 |         2          0 |         2
       .47 |         1          0 |         1
       .48 |         1          1 |         2
      .485 |         2          0 |         2
      .495 |         1          0 |         1
        .5 |         0          2 |         2
       .51 |         0          2 |         2
      .515 |         2          1 |         3
      .525 |         0          1 |         1
       .53 |         0          2 |         2
      .535 |         0          1 |         1
       .54 |         1          0 |         1
      .555 |         0          1 |         1
       .56 |         1          1 |         2
      .565 |         1          0 |         1
       .57 |         0          1 |         1
      .575 |         1          1 |         2
       .59 |         0          1 |         1
      .595 |         0          1 |         1
        .6 |         0          1 |         1
      .605 |         0          1 |         1
       .61 |         1          2 |         3
      .615 |         0          1 |         1
       .62 |         0          1 |         1
      .625 |         0          1 |         1
      .635 |         1          2 |         3
       .64 |         1          1 |         2
      .645 |         2          0 |         2
      .665 |         0          1 |         1
       .67 |         1          0 |         1
      .675 |         0          3 |         3
       .68 |         1          0 |         1
       .69 |         1          0 |         1
       .71 |         1          1 |         2
      .735 |         0          1 |         1
       .74 |         1          0 |         1
      .745 |         2          0 |         2
      .765 |         1          1 |         2
       .79 |         0          4 |         4
      .795 |         0          1 |         1
        .8 |         0          1 |         1
      .805 |         0          2 |         2
      .815 |         0          3 |         3
      .825 |         0          1 |         1
       .84 |         0          1 |         1
      .845 |         0          1 |         1
       .85 |         0          1 |         1
       .86 |         0          1 |         1
      .865 |         0          1 |         1
      .895 |         0          1 |         1
        .9 |         0          2 |         2
      .905 |         0          2 |         2
      .915 |         0          1 |         1
       .92 |         0          1 |         1
      .925 |         0          7 |         7
       .93 |         0          2 |         2
      .935 |         0          1 |         1
       .94 |         0          3 |         3
      .945 |         1          6 |         7
       .95 |         1         14 |        15
      .955 |         0         16 |        16
       .96 |         1          5 |         6
      .965 |         3         12 |        15
       .97 |         1         13 |        14
-----------+----------------------+----------
     Total |     2,490        185 |     2,675
*******************************************
End of the algorithm to estimate the pscore
*******************************************
.
. **** All of the following use common support
.
. **** Row 7 Table 25.6: Nearest neighbor matching (random version)
. set seed 10101
. attnd RE78 TREAT $XDW99, comsup boot reps($breps) dots logit
57   560.287   2205.663   0.254

Bootstrap statistics                              Number of obs    =      2675
                                                  Replications     =       200
---------------------------------------------------------
185   57   560.287   1331.294   0.421
583   -9358.228   997.561   -9.381

Bootstrap statistics                              Number of obs    =      2675
                                                  Replications     =       200
583   -9358.228   3079.824   -3.039

76   -7847.460   2066.697   -3.797

Bootstrap statistics                              Number of obs    =      2675
                                                  Replications     =       200
76   -7847.460   4850.874   -1.618

13   223.468   4551.850   0.049

Bootstrap statistics                              Number of obs    =      2675
                                                  Replications     =       200
13   223.468   5608.927   0.040
1233   1322.160
---------------------------------------------------------
Bootstrap statistics                              Number of obs    =      2675
                                                  Replications     =       200
             |                                  -1383.034   4034.298  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
1233   1322.160   1276.237   1.036
---------------------------------------------------------
.
. **** Row 12 Table 25.6: Kernel Matching
. * pscore TREAT $XDW99, pscore(myscore) comsup blockid(myblock) numblo($breps) level(0.005) logit
. set seed 10101
. attk RE78 TREAT $XDW99, comsup boot reps($breps) dots logit
1146   1518.694
command:   attk RE78 TREAT AGE AGESQ EDUC EDUCSQ MARR NODEGREE BLACK HISP RE74 RE75 RE74SQ RE75SQ
> U74BLACK , pscore() logit comsup bwidth(.06)
statistic: attk = r(attk)
....................................................................................................
> ..................................................................................................
> ..
Bootstrap statistics                              Number of obs    =      2675
                                                  Replications     =       200
1146   1518.694   808.339   1.879
---------------------------------------------------------
.
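In each matching block the reported t statistic is the ATT estimate divided by its standard error; with the boot option that standard error is the bootstrap one, computed from the replications. As a check on the kernel-matching output just above (the symbols are notation assumed here, not printed in the log):

```latex
t \;=\; \frac{\widehat{\mathrm{ATT}}}{\widehat{\mathrm{se}}_{\text{boot}}}
  \;=\; \frac{1518.694}{808.339} \;\approx\; 1.879
```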
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section6\mma25p2matching.txt
log type: text
closed on: 26 May 2005, 11:15:53
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section6\mma25p3extra.txt
log type: text
opened on: 26 May 2005, 11:33:04
.
. ********** OVERVIEW OF MMA25P3EXTRA.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 25.8 pages 889-893
. * Evaluating treatment effect of training on Earnings
. * This program provides additional analysis and data not in the book
. * (1) Compare NSW experiment treated to NSW experiment controls
. * (2) Compare NSW experiment treated to CPS "controls"
. * [Same as text except "controls" are from CPS not PSID]
.
. * The program is based on
. *   MMA25P2MATCHING.DO   propensity score matching
.
. * To run this program you need STATA data files
. * nswre74_treated.dta NSW Treated sample
. * nswre74_control.dta NSW Control sample (not analyzed earlier)
. * propensity_cps.dta   CPS Control sample (rather than PSID)
.
. * To run this program you need the Stata add-ons
. * pscore.ado, atts.ado, attr.ado, attnd.ado, attnw.ado
. * due to Sascha O. Becker and Andrea Ichino (2002)
. * "Estimation of average treatment effects based on propensity scores",
. * The Stata Journal, Vol.2, No.4, pp. 358-377.
.
. * This program uses version 2.02 May 13 2005 for Stata version 8
. * downloadable from http://www.iue.it/Personal/Ichino/#pscore
. * We earlier used version 1.29 October 8 2002 for Stata version 7
. * downloadable from http://www.iue.it/Personal/Ichino/#pscore
. * and obtained the same results
.
. * To speed up the program reduce breps: the number of bootstrap
. * replications used to obtain bootstrap standard errors
. * Bootstrap se's will differ from text as here seed is set to 10101
.
. ********** STATA SETUP **********
.
. set more off
. version 8
. set scheme s1mono /* Used for graphs */
.
. ********** DATA DESCRIPTION **********
.
. * Data originally from DW99
. * R.H. Dehejia and S. Wahba (1999)
. * "Causal Effects in Nonexperimental Studies: reevaluating the
       black |       260    .8269231    .3790434          0          1
        hisp |       260    .1076923    .3105893          0          1
-------------+--------------------------------------------------------
     married |       260    .1538462    .3614971          0          1
    nodegree |       260    .8346154    .3722439          0          1
        re74 |       260    2107.027    5687.906          0   39570.68
        re75 |       260    1266.909    3102.982          0   23031.98
        re78 |       260    4554.801    5483.836          0   39483.53
-------------+--------------------------------------------------------
         u74 |       260         .25    .4338478          0          1
         u75 |       260    .3153846    .4655651          0          1
------------------------------------------------------------------------------
-> treat = 1

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       treat |       185           1           0          1          1
         age |       185    25.81622    7.155019         17         48
         edu |       185    10.34595     2.01065          4         16
       black |       185    .8432432    .3645579          0          1
        hisp |       185    .0594595    .2371244          0          1
-------------+--------------------------------------------------------
     married |       185    .1891892    .3927217          0          1
    nodegree |       185    .7081081    .4558666          0          1
        re74 |       185    2095.574     4886.62          0   35040.07
        re75 |       185    1532.055    3219.251          0   25142.24
        re78 |       185    6349.144    7867.402          0   60307.93
-------------+--------------------------------------------------------
         u74 |       185    .2918919    .4558666          0          1
         u75 |       185          .4    .4912274          0          1
.
. * Write data to a text (ascii) file so can use with programs other than Stata
. outfile treat age edu black hisp married nodegree re74 re75 re78 u74 u75 /*
> */using nswre74_all.asc, replace
.
. ** Calculate the benchmark Treatment Effect
. ** Same as DW02 Tables 2 and 3 NSW row second last column
. ** and is the number given in CT page 894 second last line
.
. regress re78 treat
      Source |       SS       df       MS              Number of obs =     445
-------------+------------------------------           F(  1,   443) =    8.04
       Model |   348013183     1   348013183           Prob > F      =  0.0048
    Residual |  1.9178e+10   443  43290369.3           R-squared     =  0.0178
-------------+------------------------------           Adj R-squared =  0.0156
       Total |  1.9526e+10   444  43976681.9           Root MSE      =  6579.5

------------------------------------------------------------------------------
        re78 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       treat |   1794.342   632.8534     2.84   0.005     550.5745     3038.11
       _cons |   4554.801   408.0459    11.16   0.000     3752.855    5356.747
------------------------------------------------------------------------------
.
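Because treat is the only regressor and is binary, the OLS slope equals the difference in mean re78 between the two groups, and the intercept equals the control mean. This can be checked against the group means of re78 reported in the summary statistics earlier in this log (the difference in the last digit is rounding in the displayed means):

```latex
\hat{\beta}_{\text{treat}} \;=\; \bar{y}_{\text{treat}=1} - \bar{y}_{\text{treat}=0}
  \;=\; 6349.144 - 4554.801 \;=\; 1794.343 \;\approx\; 1794.342,
\qquad
t \;=\; \frac{1794.342}{632.8534} \;\approx\; 2.84
```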
. ********** (2) ANALYSIS: NSW TREATED VERSUS CPS CONTROLS **********
.
. * This data set has NSW treated and full CPS controls
. use propensity_cps.dta, clear
.
. * Variables u74, u75 were evaluated wrongly in the original file
. * So make the following correction
. drop u74 u75
. gen u74=0
. replace u74=1 if re74==0
(2044 real changes made)
. gen u75=0
. replace u75=1 if re75==0
(1859 real changes made)
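The gen/replace pairs above rebuild the zero-earnings indicators. A minimal self-contained sketch of the same recode follows; the four observations are invented for illustration and are not the CPS data, and only the variable names re74, re75, u74 and u75 come from the log. It also shows the equivalent one-line form that uses a logical expression.

```stata
* Hypothetical toy data (invented values); only the variable names follow the log
clear
input re74 re75
0 0
1200 0
0 350
5000 4200
end

* Two-step form, as in the log
generate u74 = 0
replace u74 = 1 if re74 == 0

* One-step form: the logical expression (re75 == 0) evaluates to 1 or 0
generate u75 = (re75 == 0)

list re74 re75 u74 u75
```

Both forms give the same indicator; the one-step form simply avoids the separate replace.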
. gen age2=age*age
. gen age3=age2*age
. gen edu2=edu*edu
. gen edure74=edu*re74
. * Not sure whether this is needed
. * Does DW99 use edu*re74*age3 or separately edu*re74 and age3 ?
. gen edre74age3=edu*re74*age3
.
. ** Summarize these data
. sum
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       treat |     16177     .011436    .1063292          0          1
         age |     16177    33.14051    11.03651         16         55
         edu |     16177    12.00828    2.868005          0         18
       black |     16177    .0823391    .2748892          0          1
         age |       185    25.81622    7.155019         17         48
         edu |       185    10.34595     2.01065          4         16
       black |       185    .8432432    .3645579          0          1
        hisp |       185    .0594595    .2371244          0          1
-------------+--------------------------------------------------------
     married |       185    .1891892    .3927217          0          1
    nodegree |       185    .7081081    .4558666          0          1
        re74 |       185    2095.574     4886.62          0   35040.07
        re75 |       185    1532.055    3219.251          0   25142.24
        re78 |       185    6349.144    7867.402          0   60307.93
-------------+--------------------------------------------------------
         u74 |       185    .7081081    .4558666          0          1
         u75 |       185          .6    .4912274          0          1
        age2 |       185    717.3946    431.2517        289       2304
        age3 |       185    21554.66    20964.71       4913     110592
        edu2 |       185    111.0595    39.30388         16        256
-------------+--------------------------------------------------------
     edure74 |       185    22898.73    57393.97          0     490561
  edre74age3 |       185    4.28e+08    1.24e+09          0   8.75e+09
.
. * Write data to a text (ascii) file so can use with programs other than Stata
. * This has data as original except for recode of u74 and u75
. outfile treat age edu black hisp married nodegree re74 re75 re78 u74 u75 /*
> */ using propensity_cps.asc, replace
.
. ** Number of replications to use in the bootstrap
. ** Ideally at least 400
. global breps 200
.
. *** (2A) CPS propensity score model from DW02 Table 2 footnote A
.
. global CPSDW02 age age2 age3 edu edu2 married nodegree black hisp re74 re75 u74 u75 edure74
.
. * With common support option
. pscore treat $CPSDW02, pscore(myscore) blockid(myblock) comsup numblo(5) level(0.005) logit
****************************************************
Algorithm to estimate the propensity score
****************************************************
      treat |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |     15,992       98.86       98.86
          1 |        185        1.14      100.00
------------+-----------------------------------
      Total |     16,177      100.00
Logit estimates                                   Number of obs   =      16177
                                                  LR chi2(14)     =    1213.82
                                                  Prob > chi2     =     0.0000
Log likelihood = -404.15991                       Pseudo R2       =     0.6003

------------------------------------------------------------------------------
       treat |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   2.425229   .3500652     6.93   0.000     1.739114    3.111344
        age2 |  -.0672395   .0111308    -6.04   0.000    -.0890555   -.0454234
        age3 |   .0005685   .0001113     5.11   0.000     .0003505    .0007866
         edu |   .9247848   .2500694     3.70   0.000     .4346577    1.414912
        edu2 |  -.0572021   .0136202    -4.20   0.000    -.0838972   -.0305071
     married |  -1.556471   .2517687    -6.18   0.000    -2.049929   -1.063014
    nodegree |   .9270591   .3254621     2.85   0.004     .2891651    1.564953
       black |   3.850668   .2662868    14.46   0.000     3.328755     4.37258
        hisp |   1.673885    .409913     4.08   0.000     .8704705      2.4773
        re74 |  -.0002203   .0001086    -2.03   0.043    -.0004332   -7.40e-06
        re75 |  -.0001969   .0000378    -5.21   0.000     -.000271   -.0001228
         u74 |   1.749522   .2897311     6.04   0.000      1.18166    2.317385
         u75 |     .00944    .257531     0.04   0.971    -.4953115    .5141915
     edure74 |   .0000222   9.08e-06     2.45   0.014     4.43e-06      .00004
       _cons |  -35.22098   3.797922    -9.27   0.000    -42.66477   -27.77719
------------------------------------------------------------------------------
note: 3 failures and 0 successes completely determined.
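The coefficient column defines the estimated propensity score through the logit link, and the reported Pseudo R2 can be recovered from the LR statistic and the log likelihood. The notation below is assumed here rather than taken from the log: x_i collects the regressors in $CPSDW02 plus a constant, and L_0 is the likelihood of the constant-only model.

```latex
\hat{p}(x_i) \;=\; \Pr(\text{treat}_i = 1 \mid x_i)
  \;=\; \frac{\exp(x_i'\hat{\beta})}{1 + \exp(x_i'\hat{\beta})},
\qquad
\text{Pseudo } R^2 \;=\; 1 - \frac{\ln L}{\ln L_0}
  \;=\; \frac{LR}{LR - 2\ln L}
  \;=\; \frac{1213.82}{1213.82 + 808.32} \;\approx\; 0.6003
```

The second equality uses LR = 2(ln L - ln L_0), so ln L_0 = ln L - LR/2.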
50%     .0053823                      Mean           .0452964
                        Largest       Std. Dev.      .1326324
75%     .0156111       .9356451
90%     .0856723         .93718       Variance       .0175914
95%      .282253       .9374608       Skewness       4.475994
99%      .822637       .9384554       Kurtosis       24.36564
******************************************************
Step 1: Identification of the optimal number of blocks
Use option detail if you want more detailed output
******************************************************
**********************************************************
Step 2: Test of balancing property of the propensity score
Use option detail if you want more detailed output
**********************************************************
  Inferior |
  of block |     treat
 of pscore |      0      1 |  Total
-----------+---------------+-------
  .0010614 |  3,214     18 |  3,232
      .025 |    240      8 |    248
       .05 |    172     14 |    186
        .1 |     96     19 |    115
        .2 |     86     32 |    118
        .4 |     31     38 |     69
        .6 |      9     20 |     29
        .8 |      8     36 |     44
-----------+---------------+-------
     Total |  3,856    185 |  4,041
Note: the common support option has been selected
*******************************************
End of the algorithm to estimate the pscore
*******************************************
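With comsup the block table covers 4,041 of the 16,177 observations and keeps all 185 treated. A minimal sketch of how such a restriction can be built by hand is given below. It assumes the common-support region is the range of the estimated score among the treated, which is consistent with the table above but is not confirmed by the log; the seven observations and the names insupport, smin and smax are hypothetical.

```stata
* Hypothetical toy data: the myscore values are invented for illustration
clear
input treat myscore
1 .20
1 .55
1 .90
0 .05
0 .25
0 .60
0 .95
end

* Range of the estimated score among treated units
summarize myscore if treat == 1
scalar smin = r(min)
scalar smax = r(max)

* Flag observations whose score lies inside [smin, smax]
generate byte insupport = (myscore >= smin & myscore <= smax)
tabulate treat insupport
```

In this toy example every treated unit is flagged, and the controls with scores .05 and .95 fall outside the region.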
.
. * Without common support option
. drop myscore myblock
. pscore treat $CPSDW02, pscore(myscore) blockid(myblock) numblo(5) level(0.005) logit
****************************************************
Algorithm to estimate the propensity score
****************************************************
                                                  Number of obs   =      16177
                                                  LR chi2(14)     =    1213.82
                                                  Prob > chi2     =     0.0000
Log likelihood = -404.15991                       Pseudo R2       =     0.6003

------------------------------------------------------------------------------
       treat |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   2.425229   .3500652     6.93   0.000     1.739114    3.111344
        age2 |  -.0672395   .0111308    -6.04   0.000    -.0890555   -.0454234
        age3 |   .0005685   .0001113     5.11   0.000     .0003505    .0007866
         edu |   .9247848   .2500694     3.70   0.000     .4346577    1.414912
        edu2 |  -.0572021   .0136202    -4.20   0.000    -.0838972   -.0305071
     married |  -1.556471   .2517687    -6.18   0.000    -2.049929   -1.063014
    nodegree |   .9270591   .3254621     2.85   0.004     .2891651    1.564953
       black |   3.850668   .2662868    14.46   0.000     3.328755     4.37258
        hisp |   1.673885    .409913     4.08   0.000     .8704705      2.4773
        re74 |  -.0002203   .0001086    -2.03   0.043    -.0004332   -7.40e-06
        re75 |  -.0001969   .0000378    -5.21   0.000     -.000271   -.0001228
         u74 |   1.749522   .2897311     6.04   0.000      1.18166    2.317385
         u75 |     .00944    .257531     0.04   0.971    -.4953115    .5141915
     edure74 |   .0000222   9.08e-06     2.45   0.014     4.43e-06      .00004
       _cons |  -35.22098   3.797922    -9.27   0.000    -42.66477   -27.77719
------------------------------------------------------------------------------
note: 3 failures and 0 successes completely determined.
50%     .0001247                      Mean            .011436
                        Largest       Std. Dev.      .0691037
75%     .0010579       .9356451
90%     .0073933         .93718       Variance       .0047753
95%     .0250635       .9374608       Skewness       9.281842
99%     .3620009       .9384554       Kurtosis       99.39697
******************************************************
Step 1: Identification of the optimal number of blocks
Use option detail if you want more detailed output
******************************************************
**********************************************************
Step 2: Test of balancing property of the propensity score
Use option detail if you want more detailed output
**********************************************************
*******************************************
End of the algorithm to estimate the pscore
*******************************************
.
. * Nearest neighbor matching (random version)
. attnd re78 treat $CPSDW02, comsup boot reps($breps) dots logit
155   730.380   1049.321   0.696

Bootstrap statistics                              Number of obs    =     16177
                                                  Replications     =       200
       P   = percentile
       BC  = bias-corrected
155   730.380   941.076   0.776
1027   -2935.932   888.041   -3.306
statistic: attr = r(attr)
....................................................................................................
> ..................................................................................................
> ..
Bootstrap statistics                              Number of obs    =     16177
                                                  Replications     =       200
1027   -2935.932   1332.096   -2.204
185   3856   1267.716

Bootstrap statistics                              Number of obs    =     16177
                                                  Replications     =       200
3856   1267.716   720.580   1.759
---------------------------------------------------------
.
. * Stratification Matching
. atts re78 treat, pscore(myscore) blockid(myblock) comsup boot reps($breps) dots
3856   1505.512   734.270   2.050
---------------------------------------------------------
Bootstrap statistics                              Number of obs    =     16177
                                                  Replications     =       200
3856   1505.512   665.184   2.263
---------------------------------------------------------
.
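The stratification estimator pools within-block comparisons. A sketch of its usual form is given below, in notation assumed here rather than printed in the log: b indexes the blocks stored in myblock, N_b^T is the number of treated units in block b, N^T is the total number of treated units, and the bars denote block means of re78 for treated (T) and controls (C).

```latex
\widehat{\mathrm{ATT}}^{S} \;=\; \sum_{b} \frac{N_b^{T}}{N^{T}}
  \left( \bar{y}_b^{T} - \bar{y}_b^{C} \right),
\qquad
t_{\text{boot}} \;=\; \frac{1505.512}{665.184} \;\approx\; 2.263
```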
. *** (2B) CPS propensity score model from DW99 Table 2 footnote A
.
. global CPSDW99 age age2 edu edu2 nodegree married black hisp re74 re75 u74 u75 edure74 age3
.
. * With common support option
. drop myscore myblock
. pscore treat $CPSDW99, pscore(myscore) blockid(myblock) comsup numblo(5) level(0.005) logit
****************************************************
Algorithm to estimate the propensity score
****************************************************
Logit estimates                                   Number of obs   =      16177
                                                  LR chi2(14)     =    1213.82
                                                  Prob > chi2     =     0.0000
Log likelihood = -404.15991                       Pseudo R2       =     0.6003

------------------------------------------------------------------------------
       treat |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
50%     .0053823                      Mean           .0452964
                        Largest       Std. Dev.      .1326324
75%     .0156111       .9356451
90%     .0856723         .93718       Variance       .0175914
95%      .282253       .9374608       Skewness       4.475994
99%      .822637       .9384554       Kurtosis       24.36564
******************************************************
Step 1: Identification of the optimal number of blocks
Use option detail if you want more detailed output
******************************************************
**********************************************************
Step 2: Test of balancing property of the propensity score
Use option detail if you want more detailed output
**********************************************************
*******************************************
End of the algorithm to estimate the pscore
*******************************************
.
. * Without common support option
. drop myscore myblock
. pscore treat $CPSDW99, pscore(myscore) blockid(myblock) numblo(5) level(0.005) logit
****************************************************
Algorithm to estimate the propensity score
****************************************************
Logit estimates                                   Number of obs   =      16177
                                                  LR chi2(14)     =    1213.82
                                                  Prob > chi2     =     0.0000
Log likelihood = -404.15991                       Pseudo R2       =     0.6003

------------------------------------------------------------------------------
       treat |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   2.425229   .3500652     6.93   0.000     1.739114    3.111344
        age2 |  -.0672395   .0111308    -6.04   0.000    -.0890555   -.0454234
         edu |   .9247848   .2500694     3.70   0.000     .4346577    1.414912
        edu2 |  -.0572021   .0136202    -4.20   0.000    -.0838972   -.0305071
    nodegree |   .9270591   .3254621     2.85   0.004     .2891651    1.564953
     married |  -1.556471   .2517687    -6.18   0.000    -2.049929   -1.063014
       black |   3.850668   .2662868    14.46   0.000     3.328755     4.37258
        hisp |   1.673885    .409913     4.08   0.000     .8704705      2.4773
        re74 |  -.0002203   .0001086    -2.03   0.043    -.0004332   -7.40e-06
        re75 |  -.0001969   .0000378    -5.21   0.000     -.000271   -.0001228
         u74 |   1.749522   .2897311     6.04   0.000      1.18166    2.317385
         u75 |     .00944    .257531     0.04   0.971    -.4953115    .5141915
     edure74 |   .0000222   9.08e-06     2.45   0.014     4.43e-06      .00004
        age3 |   .0005685   .0001113     5.11   0.000     .0003505    .0007866
       _cons |  -35.22098   3.797922    -9.27   0.000    -42.66477   -27.77719
50%     .0001247                      Mean            .011436
                        Largest       Std. Dev.      .0691037
75%     .0010579       .9356451
90%     .0073933         .93718       Variance       .0047753
95%     .0250635       .9374608       Skewness       9.281842
99%     .3620009       .9384554       Kurtosis       99.39697
******************************************************
Step 1: Identification of the optimal number of blocks
Use option detail if you want more detailed output
******************************************************
**********************************************************
Step 2: Test of balancing property of the propensity score
Use option detail if you want more detailed output
**********************************************************
  Inferior |
  of block |     treat
 of pscore |      0      1 |  Total
-----------+---------------+-------
         0 | 11,635      0 | 11,635
  .0007813 |  1,056      2 |  1,058
  .0015625 |    932      5 |    937
   .003125 |    712      2 |    714
    .00625 |    709      2 |    711
     .0125 |    306      7 |    313
      .025 |    240      8 |    248
       .05 |    172     14 |    186
        .1 |     96     19 |    115
        .2 |     86     32 |    118
        .4 |     31     38 |     69
        .6 |      9     20 |     29
        .8 |      8     36 |     44
-----------+---------------+-------
     Total | 15,992    185 | 16,177
*******************************************
End of the algorithm to estimate the pscore
*******************************************
.
. * Nearest neighbor matching (random version)
. attnd re78 treat $CPSDW99, comsup boot reps($breps) dots logit
155   730.380   1049.321   0.696

Bootstrap statistics                              Number of obs    =     16177
                                                  Replications     =       200
155   730.380   964.544   0.757
1027   -2935.932   888.041   -3.306

Bootstrap statistics                              Number of obs    =     16177
                                                  Replications     =       200
---------------------------------------------------------
67   1027   -2935.932   1276.508   -2.300
3856   1267.716

Bootstrap statistics                              Number of obs    =     16177
                                                  Replications     =       200
3856   1267.716   751.290   1.687
---------------------------------------------------------
.
. * Stratification Matching
. atts re78 treat, pscore(myscore) blockid(myblock) comsup boot reps($breps) dots
3856   1505.512   734.270   2.050
---------------------------------------------------------
Bootstrap statistics                              Number of obs    =     16177
                                                  Replications     =       200
3856   1505.512   741.786   2.030
---------------------------------------------------------
.
. *** (2C) CPS propensity score model from Becker-Ichino, 2002 (BI02)
.
. gen re742 = re74*re74
. gen re752 = re75*re75
. gen blacku74 = black*u74
. global CPSBI02 age age2 edu edu2 married black hisp re74 re75 re742 re752 blacku74
.
. * With common support option
. drop myscore myblock
. pscore treat $CPSBI02, pscore(myscore) blockid(myblock) comsup numblo(5) level(0.005) logit
****************************************************
Algorithm to estimate the propensity score
****************************************************
Logit estimates                                   Number of obs   =      16177
                                                  LR chi2(12)     =    1170.86
                                                  Prob > chi2     =     0.0000
Log likelihood = -425.64309                       Pseudo R2       =     0.5790

------------------------------------------------------------------------------
       treat |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .7902073   .0940972     8.40   0.000     .6057803    .9746344
        age2 |  -.0128161   .0015894    -8.06   0.000    -.0159313   -.0097009
         edu |   .9953909   .2558663     3.89   0.000     .4939022     1.49688
        edu2 |  -.0636036   .0131378    -4.84   0.000    -.0893532   -.0378541
     married |  -1.534639   .2516679    -6.10   0.000    -2.027899   -1.041379
       black |   3.340175   .3032312    11.02   0.000     2.745853    3.934497
        hisp |   1.636367   .3971529     4.12   0.000     .8579614    2.414772
        re74 |  -.0001744   .0000626    -2.79   0.005    -.0002971   -.0000517
        re75 |   -.000168   .0000693    -2.42   0.015    -.0003039   -.0000322
       re742 |   8.06e-09   2.61e-09     3.09   0.002     2.95e-09    1.32e-08
       re752 |  -2.05e-09   3.97e-09    -0.52   0.605    -9.83e-09    5.73e-09
    blacku74 |   1.033264    .288037     3.59   0.000     .4687217    1.597806
       _cons |  -18.16269   1.865757    -9.73   0.000    -21.81951   -14.50588
------------------------------------------------------------------------------
note: 112 failures and 0 successes completely determined.
50%     .0040446                      Mean           .0343457
                        Largest       Std. Dev.      .1120884
75%     .0089357       .8905055
90%     .0495031        .898552       Variance       .0125638
95%     .1913766       .9023286       Skewness       4.931471
99%     .6773557       .9038652       Kurtosis       29.27201
******************************************************
Step 1: Identification of the optimal number of blocks
Use option detail if you want more detailed output
******************************************************
**********************************************************
Step 2: Test of balancing property of the propensity score
Use option detail if you want more detailed output
**********************************************************
Variable blacku74 is not balanced in block 3
The balancing property is not satisfied
Try a different specification of the propensity score
  Inferior |
  of block |     treat
 of pscore |      0      1 |  Total
-----------+---------------+-------
         0 |  4,230     13 |  4,243
     .0125 |    330      7 |    337
      .025 |    231      9 |    240
       .05 |    126     14 |    140
        .1 |    108     23 |    131
        .2 |     87     30 |    117
        .4 |     29     20 |     49
        .5 |     10     24 |     34
        .6 |     12     25 |     37
        .8 |      6     20 |     26
-----------+---------------+-------
     Total |  5,169    185 |  5,354
Note: the common support option has been selected
*******************************************
End of the algorithm to estimate the pscore
*******************************************
.
. * Without common support option
. drop myscore myblock
. pscore treat $CPSBI02, pscore(myscore) blockid(myblock) numblo(5) level(0.005) logit
****************************************************
Algorithm to estimate the propensity score
****************************************************
Iteration 4:
Iteration 5:
Iteration 6:
Iteration 7:
Iteration 8:
Logit estimates                                   Number of obs   =      16177
                                                  LR chi2(12)     =    1170.86
                                                  Prob > chi2     =     0.0000
Log likelihood = -425.64309                       Pseudo R2       =     0.5790

------------------------------------------------------------------------------
       treat |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .7902073   .0940972     8.40   0.000     .6057803    .9746344
        age2 |  -.0128161   .0015894    -8.06   0.000    -.0159313   -.0097009
         edu |   .9953909   .2558663     3.89   0.000     .4939022     1.49688
        edu2 |  -.0636036   .0131378    -4.84   0.000    -.0893532   -.0378541
     married |  -1.534639   .2516679    -6.10   0.000    -2.027899   -1.041379
       black |   3.340175   .3032312    11.02   0.000     2.745853    3.934497
        hisp |   1.636367   .3971529     4.12   0.000     .8579614    2.414772
        re74 |  -.0001744   .0000626    -2.79   0.005    -.0002971   -.0000517
        re75 |   -.000168   .0000693    -2.42   0.015    -.0003039   -.0000322
       re742 |   8.06e-09   2.61e-09     3.09   0.002     2.95e-09    1.32e-08
       re752 |  -2.05e-09   3.97e-09    -0.52   0.605    -9.83e-09    5.73e-09
    blacku74 |   1.033264    .288037     3.59   0.000     .4687217    1.597806
       _cons |  -18.16269   1.865757    -9.73   0.000    -21.81951   -14.50588
------------------------------------------------------------------------------
note: 112 failures and 0 successes completely determined.
50%     .0001313                      Mean            .011436
                        Largest       Std. Dev.      .0664629
75%     .0016513       .8905055
90%     .0074369        .898552       Variance       .0044173
95%     .0234798       .9023286       Skewness       8.811019
99%     .3855562       .9038652       Kurtosis       89.82108
******************************************************
Step 1: Identification of the optimal number of blocks
Use option detail if you want more detailed output
******************************************************
**********************************************************
Step 2: Test of balancing property of the propensity score
Use option detail if you want more detailed output
**********************************************************
Variable blacku74 is not balanced in block 7
The balancing property is not satisfied
Try a different specification of the propensity score
  Inferior |
  of block |     treat
 of pscore |      0      1 |  Total
-----------+---------------+-------
         0 | 11,076      1 | 11,077
  .0007813 |    968      2 |    970
  .0015625 |  1,020      2 |  1,022
   .003125 |  1,185      3 |  1,188
    .00625 |    804      5 |    809
     .0125 |    330      7 |    337
      .025 |    231      9 |    240
       .05 |    126     14 |    140
        .1 |    108     23 |    131
        .2 |     87     30 |    117
        .4 |     29     20 |     49
        .5 |     10     24 |     34
        .6 |     12     25 |     37
        .8 |      6     20 |     26
-----------+---------------+-------
     Total | 15,992    185 | 16,177
*******************************************
End of the algorithm to estimate the pscore
*******************************************
.
. * Nearest neighbor matching (random version)
. attnd re78 treat $CPSBI02, comsup boot reps($breps) dots logit
147   1214.888   988.298   1.229

Bootstrap statistics                              Number of obs    =     16177
                                                  Replications     =       200
       P   = percentile
       BC  = bias-corrected
147   1214.888   924.342   1.314
1089   -3094.104   857.247   -3.609
....................................................................................................
> ..................................................................................................
> ..
Bootstrap statistics                              Number of obs    =     16177
                                                  Replications     =       200
1089   -3094.104   1724.927   -1.794
5169   881.520   .

Bootstrap statistics                              Number of obs    =     16177
                                                  Replications     =       200
5169   881.520   741.305   1.189
---------------------------------------------------------
.
. * Stratification Matching
. atts re78 treat, pscore(myscore) blockid(myblock) comsup boot reps($breps) dots
5169   1538.713
---------------------------------------------------------
Bootstrap statistics                              Number of obs    =     16177
                                                  Replications     =       200
5169   1538.713   748.444   2.056
---------------------------------------------------------
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section6\mma25p3extra.txt
log type: text
closed on: 26 May 2005, 13:26:49
----------------------------------------------------------------------------------------------------
BOOK FIGURES
Most of these figures are produced by Stata programs given at this website.
Page  Figure  Brief caption                                   File
50    3.1     Social experiment with random assignment        ch3-fig1.wmf
89    4.1                                                     ch4fig1qr.wmf
90    4.2                                                     ch4fig2qr.wmf
249   7.1                                                     ch7power.wmf
253   7.2                                                     ch7montecarlo.wmf
296   9.1                                                     ch9hist.wmf
296   9.2                                                     ch9kd1.wmf
297   9.3                                                     ch9ksm1.wmf
300   9.4                                                     ch9kdensu1.wmf
309   9.5     k-NN regression                                 ch9ksmma.wmf
310   9.6                                                     ch9ksmlowess.wmf
317   9.7
368   11.1
411   12.1
413   12.2
414   12.3                                                    ch12fig3envelope.wmf
424   13.1                                                    ch13_bayes1.wmf
466   14.1                                                    ch14binary.wmf
516   15.1                                                    ch15-Gen-RUM2.wmf
531   16.1                                                    ch16condmeans.wmf
540   16.2                                                    ch16millsratio.wmf
575   17.1                                                    kennanstrk.wmf
585   17.2
604   17.3
605   17.4    Unemployment duration: unemployment insurance
606   17.5
606   17.6
627   18.1
633   18.2    Unemployment duration: generalized residuals
633   18.3
635   18.4
636   18.5
661   19.1
662   19.2
712   21.1                                                    ch21pantot.wmf
713   21.2                                                    ch21panbe.wmf
713   21.3                                                    ch21panfe.wmf
714   21.4                                                    ch21panfd.wmf
793   23.1
880   25.1                                                    ch25-fig1-rd.wmf
883   25.2                                                    ch25-fig2-rd.wmf
892   25.3
924   27.1                                                    ch27fig1.wmf
Caption and file fragments not attached to a row: survival; ch11boot.wmf; functions; exponential-gamma; km_pt1.wmf; by km_pt2.wmf; ch18lbias.wmf; model exp.wmf
Assign to
treatment
Yes
Eligible
subject
invited to
participate
Randomize
Agrees to
participate?
Assign to
control
No
Drop from
study
744
1
.8
.6
.4
.2
.2
.4
.6
.8
15
10
10th percentile
Quantile
10
12
745
.6
.4
.2
Test Power
.8
10
15
20
.4
.2
.1
0
Density
.3
Standard Normal
-4
-2
746
.2
Density
.4
.6
One-half plug-in
Plug-in
.2
.4
.6
.8
747
Bandwidth h=0.8
Bandwidth h=0.4
Bandwidth h=0.1
Actual data
10
15
20
Years of Schooling
.4
Epanechnikov (h=0.545)
Gaussian (h=0.246)
Quartic (h=0.646)
.2
Uniform (h=0.214)
.6
748
350
Actual Data
300
kNN (k=5)
Linear OLS
200
250
kNN (k=25)
150
Dependent variable y
20
40
60
80
100
Regressor x
Actual Data
300
Lowess (k=25)
200
250
150
Dependent variable y
350
20
40
60
80
100
Regressor x
749
-2
Dependent variable y
20
40
60
80
100
Regressor x
.4
.2
.1
0
Density
.3
Standard Normal
-4
-2
750
.6
.4
0
.2
Cdf F(x)
.8
Random variable x
Draw of 0.64 (vertical axis) yields x = 1.02 (horizontal axis).
.6
Accept-reject Method
Desired density f(x)
.4
.2
0
Envelope kg(x)
10
Random variable x
751
.4
.2
0
.1
Density
.3
Posterior N[8,1.2]
10
15
Evaluation point
1.5
Probit
.5
OLS
-.5
Predicted probability
Logit
-2
752
Explanatory
variables
Disturbances
Disturbances
Latent
classes
Indicators
Latent
variables
Stated preference
indicators
Utilities
Observable
variable
Indicators
Unobservable
variable
Structural
relationship
Disturbances
Revealed preference
indicator y
-2000
2000
4000
-4000
Measurement
relationship
Uncensored Mean
753
N[0,1] Cdf
.5
1.5
N[0,1] Density
2.5
-2
-1
Cutoff point c
.75
.5
.25
Survival Probability
50
100
150
200
250
754
20
40
60
0 .2 .4 .6 .8 1
Weibull survivor
0 .01.02 .03.04
Weibull density
Weibull Distribution
80
20
40
60
80
Duration time
60
80
0 2 4 6 8
Cumulative hazard
.05 .1 .15
0
40
Duration time
Weibull hazard
Duration time
20
20
40
60
80
Duration time
.25
.5
.75
Survival Probability
Survival Estimate
10
20
30
755
1.00
0.75
0.50
0.25
0.00
Survival Probability
Received UI (UI = 1)
10
20
30
1.5
.5
Cumulative Hazard
10
20
30
756
1.50
1.00
0.50
0.00
Cumulative Hazard
Received UI (UI = 1)
10
20
30
S3
S2
S1
S5
12-month
survey
period
S4
S7
Survey date
S9
S6
S8
757
4
3
2
1
Cumulative Hazard
Cumulative Hazard
45 degree line
3
2
1
Cumulative Hazard
45 degree line
Cumulative Hazard
758
4
2
Cumulative Hazard
Cumulative Hazard
45 degree line
4
3
2
1
Cumulative Hazard
45 degree line
Cumulative Hazard
759
1
.8
.6
.2
.4
10
20
30
10
10
20
30
760
8
6
4
10
Original data
Nonparametric fit
Linear fit
8
7.5
7
Averages
Nonparametric fit
6.5
8.5
Between Regression
Linear fit
761
7
6
5
Nonparametric fit
Linear fit
First differences
Nonparametric fit
Linear fit
-5
-2
-1
762
4
2
0
Log Patents
Original data
-2
Nonparametric fit
Linear fit
-5
[Figure: Outcome y against Selection variable S. Series: Actual data, No treat (low), Treat (high).]
[Figure: Comparison_sample.]
[Figure: Propensity Score. Original data and Nonparametric regression.]
[Figure: Sharp Design and Fuzzy design. Horizontal axis: Selection variable S.]
[Figure: Treated_sample and Comparison_sample.]
[Figure: Propensity Score. Original data and Nonparametric regression.]
BOOK CORRECTIONS - June 9, 2005, plus some but not all corrections added since then

p. 85
p. 68, 147 (11/22/2005): "Liebler" should be spelt "Leibler". [Joerg Stoye, NYU]
p. 89 (3/30/2006): Third last line should be "q = 0.1, 0.5, and 0.9" and not "q = 0.1, 0.2, ..., 0.9". [James MacKinnon, Queen's]
p. 113 (5/27/2005): Exercise 4-2 part (b) should be "Hence directly obtain a consistent estimate of the variance of _hat" (and not "Hence directly obtain the variance of y_bar").
p. 114 (6/9/2005)
p. 164 (6/9/2005)
p. 165 (6/9/2005)
p. 168 (3/3/2006)
p. 178 (3/3/2006): Last displayed equation. The first and third matrices are wrong and should be similar to G_hat in (6.21). For these matrices the two terms being summed over i should be x_i*x_i' and 3*utilde_i^2*x_i*x_i'. [Doug Miller, UC-Davis]
p. 189 (3/6/2006)
p. 190 (3/6/2006)
p. 193 (3/6/2006)
p. 199 (5/18/2005): In Table 6.4 the NL2SLS column is 0.969, 0.041, 0.84 (and not 0.960, 0.046, 0.85).
p. 214 (3/28/2006): In the displayed equation for the 3SLS estimator the matrix OMEGA_hat should be SIGMA_hat. Same change two lines down and four lines down. SIGMA_hat = definition given for OMEGA_hat.
p. 220 (5/27/2005): ... linear.
(5/18/2005)
p. 255 (5/18/2005)
p. 256 (5/18/2005): In Section 7.8.3 the percentiles should be -1.89 and 1.80 (and not -2.62 and 1.83).
(5/18/2005): Figure 12.3 vertical axis label should be f(x) and kg(x), and the legend should be kg(x) (and not g(x)).
p. 493 (2/18/2006): First two lines should be "in the probability of fishing from a beach, and an increase of 0.119, 0.080, and 0.068, respectively, in the probability of fishing from a pier, a private boat, and a charter boat." [Jeff Smith, Michigan]
p. 501 (3/22/2006): (15.17) and the line before should have a minus sign before the expected Hessian. [Frank Windmeijer, Bristol]
p. 505 (3/22/2006)
p. 508 (3/22/2006)
p. 569 (5/19/2005): Bibliographic note 16.3 should refer to Tobin (1958) (and not Tobit (1958)). [Kevin Hoover, UCD]
p. 793 (4/7/2005): Figure 23.1 axes labels are reversed. Vertical axis is log(patents) and horizontal axis is log(R&D).
p. 839 (4/10/2006): Second equality for SIGMA_c^-1 should not have the inverse at the end.
p. 839 (4/10/2006): Formula for [I + aee']^(1/2) should finish with ee' and not Mee'.
p. 895 (5/26/2005)