Solution Manual For Microeconometrics

BOOK PREFACE
This book provides a detailed treatment of microeconometric analysis, the analysis of individual-level data on the economic behavior of individuals or firms. This usually entails regression methods
applied to cross-section and panel data.
The book aims to provide the practitioner with a comprehensive coverage of statistical methods
and their application in modern applied microeconometrics research. These methods include
nonlinear modelling, inference under minimal distributional assumptions, identifying and
measuring causation rather than mere association, and correcting for departures from simple
random sampling. Many of these features are of relevance to individual-level data analysis
throughout the social sciences.
The ambitious agenda has determined the characteristics of this book. First, although oriented to
the practitioner, the book is relatively advanced in places. A cookbook approach is inadequate because,
when two or more complications occur simultaneously (a common situation), the practitioner must
know enough to be able to adapt available methods. Second, the book provides considerable
coverage of practical data problems (see especially the last three chapters). Third, the book includes
substantial empirical examples in many chapters, to illustrate some of the methods covered. Finally,
the book is unusually long. Despite this length we have been space-constrained. We had intended to
include even more empirical examples. And abbreviated presentations will at times fail to recognize
the accomplishments of researchers who have made substantive contributions.
The book assumes a basic understanding of the linear regression model with matrix algebra. It is
written at the mathematical level of the first-year economics Ph.D. sequence, comparable to Greene
(2003). We have two types of readers in mind. First, the book can be used as a course text for a
microeconometrics course, typically taught in the second year of the Ph.D., or for data-oriented
microeconomics field courses such as labor economics, public economics and industrial
organization. Second, the book can be used as a reference work for graduate students and applied
researchers who despite training in microeconometrics will inevitably have gaps that they wish to
fill.
For instructors using this book as an econometrics course text it is best to introduce the basic
nonlinear cross-section and linear panel data models as early as possible, initially skipping many of
the methods chapters. The key methods chapter (chapter 5) covers maximum likelihood and
nonlinear least squares estimation. ML and NLS provide adequate background for the most
commonly-used nonlinear cross-section models (chapters 14-17, 20), basic linear panel data models
(chapter 21) and treatment evaluation methods (chapter 25). Generalized method of moments
estimation (chapter 6) is needed especially for advanced linear panel data methods (chapter 22).
For readers using this book as a reference work, the chapters have been written to be as self-contained as possible. The notable exception is that some command of general estimation results in
chapter 5, and occasionally chapter 6, will be necessary. Most models chapters are structured to
begin with a discussion and example that is accessible to a wide audience.
The web-site www.econ.ucdavis.edu/faculty/cameron/mmabook provides all the data and
computer programs used in this book, and related materials useful for instructional purposes.
This project has been long and arduous, and at times seemingly without an end. Its completion
has been greatly aided by our colleagues, friends, and graduate students. We would like to thank
especially the following for reading and commenting on specific chapters: Bijan Borah, Kurt
Brännäs, Pian Chen, Tim Cogley, Partha Deb, David Drukker, Massimiliano De Santis, Jeff Gill,
Tue Gorgens, Shiferaw Gurmu, Lu Ji, Oscar Jorda, Roger Koenker, Chenghui Li, Tong Li, Doug
Miller, Murat Munkin, Jim Prieger, Ahmed Rahmen, Sunil Sapra, Haruki Seitani, Yacheng Sun,
Xiaoyong Zheng, and David Zimmer. We thank Rajeev Dehejia, Bronwyn Hall, Cathy Kling,
Jeffrey Kling, Will Manning, Brian McCall and Jim Ziliak for making their data available for
empirical illustrations. We thank our respective departments for facilitating our collaboration, and
for the production and distribution of the draft manuscript at various stages. We benefitted from the
comments of two anonymous reviewers. Guidance, advice and encouragement from our CUP
editor, Scott Parris, have been invaluable.
Our interest in econometrics owes much to the training and environments we encountered as
students and in the initial stages of our academic careers. The first author thanks The Australian
National University, Stanford University, especially Takeshi Amemiya and Tom MaCurdy, and The
Ohio State University. The second author thanks the London School of Economics and The
Australian National University.
Our interest in writing a book oriented to the practitioner owes much to our exposure to the
research of graduate students and colleagues at our respective institutions, UC-Davis and IU-Bloomington.
Finally, we would like to thank our families for their patience and understanding without which
completion of this project would not have been possible.
A. Colin Cameron
Davis, California
Pravin K. Trivedi
Bloomington, Indiana
TABLE OF CONTENTS
I: PRELIMINARIES
1. Overview
2. Causal and Noncausal Models
3. Microeconomic Data Structures
II: CORE METHODS
4. Linear models
5. ML and NLS estimation
6. GMM and Systems Estimation
7. Hypothesis Tests
8. Specification Tests and Model Selection
9. Semiparametric Methods
10. Numerical Optimization
III: SIMULATION-BASED METHODS
11. Bootstrap Methods
12. Simulation-based Methods
13. Bayesian Methods
IV: CROSS-SECTION DATA MODELS
14. Binary Outcome Models
15. Multinomial Models
16. Tobit and Selection Models
17. Transition Data: Survival Analysis
18. Mixture Models and Unobserved Heterogeneity
19. Models of Multiple Hazards
20. Count Data Models
V: PANEL DATA MODELS
21. Linear Panel Models: Basics
22. Linear Panel Models: Extensions
23. Nonlinear Panel Models
VI: FURTHER TOPICS
24. Stratified and Clustered Samples
25. Treatment Evaluation
26. Measurement Error Models
27. Missing Data and Imputation
APPENDICES
A. Asymptotic Theory
B. Making Pseudo-Random Draws
PART 1 (chapters 1-3)
Part 1 covers the essential components of microeconometric analysis -- an economic specification, a
statistical model and a data set.
Chapter 1 discusses the distinctive aspects of microeconometrics, and provides an outline of the
book. It emphasizes that discreteness of data, and nonlinearity and heterogeneity of behavioral
relationships are key aspects of disaggregated microeconometric models. It concludes by presenting
the notation and conventions used throughout the book.
Chapters 2 and 3 set the scene for the remainder of the book by introducing the reader to key model
and data concepts that shape the analyses of later chapters.
A key distinction in econometrics is between, on the one hand, essentially descriptive models and data
summaries at various levels of statistical sophistication and, on the other, models that go beyond
associations and attempt to estimate causal parameters. The classic definitions of causality in econometrics derive from the
Cowles Commission simultaneous equations models that draw sharp distinctions between
exogenous and endogenous variables, and between structure and reduced form parameters.
Although reduced form models are very useful for prediction, knowledge of structural or causal
parameters is essential for policy analyses. Identification of structural parameters within the
simultaneous equations framework poses numerous conceptual and practical difficulties. An
alternative approach, based on the potential outcome model, also attempts to identify causal
parameters, but it does so by posing limited questions within a more manageable framework.
Chapter 2 attempts to provide an overview of the fundamental issues that arise in these alternative
frameworks. Readers who initially find this material challenging should return to this chapter later
after gaining greater familiarity with specific models covered later in the book.
The empirical researcher's ability to identify causal parameters depends not only on the statistical
tools and models but also on the type of data available. An experimental framework provides a
standard for establishing causal connections. However, observational, not experimental, data form
the basis of much of econometric inference. Chapter 3 surveys the pros and cons of three main types
of data available: observational data, data from social experiments, and those from natural
experiments. The potential as well as the difficulties of conducting causal inference based on each
type of data are reviewed.
Part 2 presents the core methods -- least squares, method of moments, and maximum likelihood --
of estimation and inference in nonlinear regression models that are central in microeconometrics.
Both traditional topics and more modern topics like quantile regression, sequential
estimation, empirical likelihood, bootstrap, and semi- and nonparametric regression are covered. In
general the discussion is at a level intended to provide enough background and detail to enable the
practitioner to read and comprehend articles in the leading econometrics journals. We presume
prior familiarity with linear regression analysis.
Chapter 4 begins with the linear regression model. It then covers at an introductory level quantile
regression, which models distributional features other than the conditional mean. It provides a
lengthy expository treatment of instrumental variables estimation, a major semiparametric method
of causal inference. Chapter 5 presents the most commonly-used estimation methods for nonlinear
models, beginning with the quite general topic of m-estimation, before specialization to maximum
likelihood and nonlinear least squares regression. Chapter 6 provides a comprehensive treatment of
generalized method of moments, which is a quite general estimation framework, applicable both in
linear and nonlinear, and single- and multi-equation settings. The chapter emphasizes the special
case of instrumental variables estimation.
Chapter 7 covers both the classical and bootstrap approaches to hypothesis testing, while Chapter 8
presents relatively more modern methods of model selection and specification analysis. Because of
their importance, the bootstrap methods also get a more detailed stand-alone treatment in Chapter
11. As much as possible, testing methods are presented in a unified manner in these chapters, but
specific applications occur throughout the book.
Chapter 9 is a stand-alone chapter that presents nonparametric and semiparametric estimation
methods that place a flexible structure on the econometric model. Chapter 10 presents the
computational methods used to compute the nonlinear estimators presented in chapters 5 and 6.
This material becomes especially relevant to the practitioner if an estimator is not automatically
computed by an econometrics package.
Part 1 emphasized that: (1) Microeconometric models are often nonlinear; (2) they are frequently
estimated using large and heterogeneous data sets; and (3) the data often come from surveys that
are complex and subject to a variety of sampling biases. A realistic depiction of the economic
phenomena in such settings often requires the use of models that are difficult to estimate and
analyze. Advances in computing hardware and software now make it feasible to tackle such tasks.
Part 3 presents modern, computer-intensive, simulation-based methods of inference that mitigate
some of these difficulties. The background required to cover this material varies somewhat with the
chapter but the essential base is least squares and maximum likelihood estimation.
Chapter 11 presents bootstrap methods for statistical inference. These methods have the attraction
of providing a simple way to obtain standard errors when the formulae from asymptotic theory are
complex, as is the case for some two-step estimators. Furthermore, if implemented appropriately, a
bootstrap can lead to a more refined asymptotic theory that may then lead to better statistical
inference in small samples.
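As a rough illustration of the first point, the following Python sketch computes a nonparametric bootstrap standard error by resampling observations with replacement. It is not taken from the book's programs (which are Stata .do files); the function name bootstrap_se, the generated data and the number of replications are assumptions chosen for illustration, and the sketch does not attempt the asymptotic refinement mentioned above.

```python
import random
import statistics


def bootstrap_se(data, estimator, n_boot=999, seed=12345):
    """Bootstrap standard error of estimator(data), resampling with replacement."""
    rng = random.Random(seed)
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        # Draw n observations with replacement from the original sample.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        replicates.append(estimator(resample))
    # The standard deviation of the replicates estimates the standard error.
    return statistics.stdev(replicates)


# Generated data (illustrative assumption): 200 draws from N(10, 2^2).
data_rng = random.Random(0)
sample = [data_rng.gauss(10.0, 2.0) for _ in range(200)]

se_boot = bootstrap_se(sample, statistics.mean)
se_formula = statistics.stdev(sample) / len(sample) ** 0.5
se_median = bootstrap_se(sample, statistics.median)

print(f"bootstrap SE of the mean:   {se_boot:.4f}")
print(f"analytical SE of the mean:  {se_formula:.4f}")
print(f"bootstrap SE of the median: {se_median:.4f}")
```

For the sample mean the bootstrap value should be close to the usual s/sqrt(n) formula, which gives a check on the code. The median has no equally simple formula, which is the kind of case where the bootstrap is convenient.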
Chapter 12 presents simulation-based estimation methods. These methods permit estimation in
situations where standard computational methods may not permit calculation of an estimator,
because of the presence of an integral over a probability distribution for which there is no closed-form
solution.
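To make the idea concrete, here is a minimal Python sketch of replacing an integral that has no closed form with an average over simulated draws. The example integral (a logit probability averaged over a standard normal random effect) and all names (logistic, simulated_probability, quadrature_probability) are illustrative assumptions, not the book's own example or code.

```python
import math
import random


def logistic(z):
    """Logistic cdf."""
    return 1.0 / (1.0 + math.exp(-z))


def simulated_probability(a, sigma, n_draws=20000, seed=2005):
    """Approximate E[logistic(a + sigma*u)], u ~ N(0,1), by averaging over draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        u = rng.gauss(0.0, 1.0)
        total += logistic(a + sigma * u)
    return total / n_draws


def quadrature_probability(a, sigma, n_grid=4001, width=8.0):
    """Trapezoid-rule check of the same integral on [-width, width]."""
    h = 2.0 * width / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        u = -width + i * h
        weight = 0.5 if i in (0, n_grid - 1) else 1.0
        density = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
        total += weight * logistic(a + sigma * u) * density
    return total * h


p_sim = simulated_probability(0.5, 1.0)
p_quad = quadrature_probability(0.5, 1.0)
print(f"simulated:  {p_sim:.4f}")
print(f"quadrature: {p_quad:.4f}")
```

In one dimension numerical quadrature is a practical alternative, and it is used here only to check the simulated value. Simulation becomes the more attractive route when the integral is over many dimensions.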
Chapter 13 surveys Bayesian methods that provide an approach to estimation and inference that is
quite different from the classical approach used in other chapters of this book. Despite this different
approach, the Bayesian toolkit can also be adapted to permit classical estimation and inference for
problems that are otherwise intractable.
Part 4, consisting of chapters 14 to 20, covers the core nonlinear limited dependent variable models
for cross-section data, defined by the range of values taken by the dependent variable. Topics
covered include models for binary and multinomial data, duration data and count data. The
complications of censoring, truncation and sample selection are also studied.
Chapters 14-15 cover models for binary and multinomial data that are standard in the analysis of
discrete choice and outcomes. Maximum likelihood methods are dominant. Different
parameterizations for the conditional probabilities in these models lead to different models, notably
logit and probit models, which are well-established. Recent literature has focused on less restrictive
modeling with more flexible functional forms for conditional probabilities and on accommodating
individual unobserved heterogeneity. These objectives motivate the use of semiparametric methods
and simulation-based estimation methods.
Censoring, truncation or sample selection generate several empirically important classes of models
that are analyzed in Chapter 16. The long-established Tobit model is central to this literature, but its
estimation and inference rely on strong distributional assumptions to permit consistent estimation.
We also examine the newer semiparametric methods that require weaker assumptions.
Chapters 17-19 consider duration models in which the focus is on either the determinants of spell
lengths, such as length of an unemployment spell, or on modeling the hazard rate of transitions from
one initial state to another. The relative importance of state dependence and unobserved
heterogeneity as determinants of the average length of spell is a central issue, whose resolution
raises fundamental questions about alternative modeling approaches. The analysis covers both
discrete and continuous time models, and both parametric and semiparametric formulations,
including the standard models like the exponential, the Weibull, and the proportional hazards
model. Chapter 18 covers formulation and interpretation of richer models that incorporate
unobserved heterogeneity. Chapter 19 deals with models with several types of events using the
competing risks formulation and models of multiple spells.
Chapter 20 covers the analysis of event counts of the kind very common in health economics. There
are many strong connections and parallels between count data models and duration models because
of their common foundation in stochastic processes. We analyze the widely-used Poisson and
negative binomial regression models, together with important variants such as the two-part or
hurdle model, zero-inflated models, latent class models, and endogenous regressor models, all of
which accommodate different facets of the event processes.
Cross section models have certain inherent limitations. They are predominantly equilibrium models
that generally do not shed light on intertemporal dependence of events. They also cannot
satisfactorily resolve fundamental issues about the sources of persistence in behavior. Such
persistence may be behavioral, i.e. arising from true state dependence, or it may be spurious, being
an artifact of the inability to control for heterogeneous behavior in the population. Because panel
data, also called longitudinal data, contain periodically repeated observations of the same subjects,
they have a large potential for resolving issues that cross section models cannot satisfactorily
handle. Chapters 21 through 23 present methods for panel data. We progress systematically from
linear models for continuous data in Chapter 21 to nonlinear panel data models for limited
dependent variables in Chapter 23. Both fixed effects and random effects models are considered. A
persistent theme through these three chapters is the importance of using robust methods of
inference.
Chapter 21, which reviews the key general results for linear panel data regression models, can be
read easily by those with a good grasp of linear regression; it does not require the material covered
in Parts 2 to 4. We recommend that even those who are interested in more advanced material should
quickly peruse the contents of this chapter first to gain familiarity with key concepts and
definitions.
Chapter 22 covers important extensions of Chapter 21, especially to dynamic panels which allow
for Markovian dependence structure of current variables. The analysis is in the GMM framework
that is currently favored by many practitioners in this area. The analysis here is at times intricate,
involving many issues of detail. A strong grasp of GMM will be helpful in absorbing the main
results of this chapter.
The results of Chapters 21 and 22 do not extend to nonlinear panel models of Chapter 23 in a
general and unified fashion. There are relatively fewer general results for limited dependent variable
panel models. Despite this, in Chapter 23 we begin by presenting an analysis of some general issues
and approaches. Later sections can be treated as panel data extensions of the counterpart cross
section models in Part 4. These analyze four categories of models for binary, count, censored, and
duration data, respectively. These should be accessible to a suitably prepared reader familiar with
the parallel cross section models.
Frequently in empirical work data present not one but multiple complications that the analysis must
simultaneously deal with. Examples of such complications include departures from simple random
sampling, clustering of observations, measurement errors, and missing data. When they occur,
individually or jointly, and in the context of any of the models developed in Parts 4 and 5,
identification of parameters of interest will be compromised. Three chapters in Part 6 -- Chapters
24, 26, and 27 -- analyze the consequences of such complications and then present methods that
attempt to overcome the consequences. The methods are illustrated using examples taken from the
earlier parts of the book. This feature gives points of connection between Part 6 and the rest of the
book.
Chapter 24, which deals with features of data from complex surveys, complements various topics
covered in Chapters 3, 5, and 16. Chapter 26, which deals with measurement errors, complements
topics in Chapters 4, 14, and 20. Chapter 27 is a stand-alone chapter on missing data and multiple
imputation, but its use of the EM algorithm and Gibbs sampler also gives it points of contact with
Chapters 10 and 13, respectively.
Chapter 25 deals with the important topic of treatment evaluation. Treatment is a broad term that
refers to the impact of one variable, e.g. schooling, on some outcome variable, e.g. income.
Treatment variables may be exogenously assigned, or may be endogenously chosen. The topic of
treatment evaluation concerns the identifiability of the impact of treatment on outcome, as measured
by either the marginal effects or certain functions of marginal effects. A variety of methods are used,
including instrumental variables regression and propensity score matching. The problem of
treatment evaluation can arise in the context of any model considered in parts 4 and 5. This chapter
may also be read on its own, but it does presume familiarity with many other topics covered in the
book, including instrumental variables and selection models, which is why it is placed in the last
part.
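The distinction between exogenously assigned and endogenously chosen treatment can be illustrated with a small simulation. The Python sketch below uses a hypothetical data-generating process (the variable "ability", the true effect of 2.0 and the function name naive_difference are all assumptions, not taken from the book) to compare a simple difference in mean outcomes under the two assignment mechanisms.

```python
import random
import statistics


def naive_difference(randomized, n=20000, effect=2.0, seed=25):
    """Difference in mean outcomes between treated and untreated units."""
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        ability = rng.gauss(0.0, 1.0)               # unobserved by the analyst
        y0 = 10.0 + ability + rng.gauss(0.0, 1.0)   # potential outcome without treatment
        y1 = y0 + effect                            # potential outcome with treatment
        if randomized:
            d = rng.random() < 0.5                  # exogenously assigned treatment
        else:
            d = ability + rng.gauss(0.0, 1.0) > 0.0  # endogenously chosen treatment
        # Only one potential outcome is observed for each unit.
        if d:
            treated.append(y1)
        else:
            untreated.append(y0)
    return statistics.mean(treated) - statistics.mean(untreated)


diff_randomized = naive_difference(randomized=True)
diff_selected = naive_difference(randomized=False)
print(f"true effect:                        2.00")
print(f"difference in means, randomized:    {diff_randomized:.2f}")
print(f"difference in means, self-selected: {diff_selected:.2f}")
```

Under random assignment the difference in means should be close to the true effect. Under self-selection it also picks up the difference in unobserved ability between the two groups, which is why methods such as instrumental variables or matching are needed.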
GUIDE FOR INSTRUCTORS AND OTHER READERS
The book assumes a basic understanding of the linear regression model with matrix algebra. It is
written at the mathematical level of the first-year economics Ph.D. sequence, comparable to Greene
(2000).
While some of the material in this book is covered in a first-year sequence, most of the material in
this book appears in second year econometrics Ph.D. courses or in data-oriented microeconomics
field courses such as labor economics, public economics or industrial organization. This book is
intended to be used as both an econometrics text and as an adjunct for such field courses. More
generally, the book is intended to be useful as a reference work for applied researchers in
economics, in related social sciences such as sociology and political science, and in epidemiology.
The models chapters have been written to be as self-contained as possible, to minimize the amount
of background material in the methods chapters that needs to be read. For the specific models
presented in parts four and five (chapters 14-23) it will generally be sufficient to read the relevant
chapter in isolation, except that some command of the general estimation results in chapter 5 and in
some cases chapter 6 will be necessary. Most chapters are structured to begin with a discussion and
example that is accessible to a wide audience.
For instructors using this book as a course text it is best to introduce the basic nonlinear cross-section and linear panel data models as early as possible, skipping many of the methods chapters.
The most commonly-used nonlinear cross-section models are presented in chapters 14-16, and
require knowledge of maximum likelihood and least squares estimation, presented in chapter five.
Chapter twenty-one on linear panel data models requires even less preparation, essentially just
chapter four.
Table 1.2 provides an outline for a one-quarter second-year graduate course taught at the University
of California - Davis, immediately following the required first-year statistics and econometrics
sequence. A quarter provides sufficient time to cover the basic results given in the first half of the
chapters in this outline. With additional time one can go into further detail or cover a subset of
chapters eleven to thirteen on computationally-intensive estimation methods (simulation-based
estimation, the bootstrap, which is also briefly presented in chapter seven, and Bayesian methods);
additional cross-section models (durations and counts) presented in chapters seventeen to twenty;
and additional panel data models (linear model extensions and nonlinear models) given in chapters
twenty-two and twenty-three.
Outline of a twenty-lecture ten-week course:
Lectures | Chapter | Topic
1-3 | 4 | Review of linear models and asymptotic theory
4-7 | 5 | Estimation: M-estimation, ML and NLS
8 | 10 | Estimation: Numerical Optimization
9-11 | 14, 15 | Models: Binary and multinomial
12-14 | 16 | Models: Censored and Truncated
15 | 6 | Estimation: GMM
16 | 7 | Testing: Hypothesis Tests
17-19 | 21 | Models: Basic Linear Panel
20 | 9 | Estimation: Semiparametric
At Indiana University - Bloomington, a fifteen-week semester long field course in
microeconometrics is based on material in most of Parts 4 and 5 (chapters 14-23). The prerequisite
courses for this course cover material similar to the material in Part 2 (chapters 4-10).
Some exercises are provided at the end of each chapter after the first three introductory chapters.
These exercises are usually learning-by-doing exercises; some are purely methodological while
others entail analysis of generated or actual data. The level of difficulty of the questions is mostly
related to the level of difficulty of the topic.
Detailed programs and data for all the data applications (using either actual data or generated data)
will be made available at the book website.
ADVANCE REVIEWS
"This book presents an elegant and accessible treatment of the broad range of rapidly expanding
topics currently being studied by microeconometricians. Thoughtful, intuitive, and careful in laying
out central concepts of sophisticated econometric methodologies, it is not only an excellent
textbook for students, but also an invaluable reference text for practitioners and researchers."
- Cheng Hsiao, University of Southern California
"I wish "Microeconometrics" was available when I was a student! Here, in one place -- and in clear
and readable prose -- you can find all of the tools that are necessary to do cutting-edge applied
economic analysis, and with many helpful examples."
- Alan Krueger, Princeton University
"Cameron and Trivedi have written a remarkably thorough and up-to-date treatment of
microeconometric methods. This is not a superficial cookbook; the early chapters carefully lay the
theoretical foundations on which the authors build their discussion of methods for discrete and
limited dependent variables and for analysis of longitudinal data. A distinctive feature of the book
is its attention to cutting-edge topics like semiparametric regression, bootstrap methods, simulation-based estimation, and empirical likelihood estimation. A highly valuable book."
- Gary Solon, University of Michigan
"The empirical analysis of micro data is more widespread than ever before. The book by Cameron
and Trivedi contains a superb treatment of all the methods that economists like to apply to such
data. What is more, it fully integrates a number of exciting new methods that have become
applicable due to recent advances in computer technology. The text is in perfect balance between
econometric theory and empirical intuition, and it contains many insightful examples."
- Gerard J. van den Berg, Free University, Amsterdam, The Netherlands
PROGRAMS: I. INTRODUCTION (chapters 1-3)

No programs.
PROGRAMS: II. CORE METHODS (chapters 4-10)

Section | Pages | Example | Program and Output | Data [* means generated]
4.5.3 | 84-5 | Robust Standard Errors for OLS, WLS and GLS | mma04p1wls.do, mma04p1wls.txt | * mma04p1wls.asc
4.6.4 | 88-90 | Quantile and Median Regression | mma04p2qreg.do, mma04p2qreg.txt | qreg0902.dta or qreg0902.asc
4.8.8 | 102-3 | Instrumental Variables Regression | mma04p3iv.do, mma04p3iv.txt | * mma04p3iv.asc
4.9.6 | 110-2 | IV Application with Weak Instruments | mma04p4ivweak.do, mma04p4ivweak.txt | DATA66.dat and DATA66.dct
5.9.2-3 | 159-63 | Exponential: MLE using ml command | mma05p1mle.do, mma05p1mle.txt | * mma05data.asc
5.9.2-3 | 159-63 | Exponential: NLS using nl command | mma05p2nls.do, mma05p2nls.txt | * mma05data.asc
5.9.2-3 | 159-63 | Exponential: NLS using ml command | mma05p3nlsbyml.do, mma05p3nlsbyml.txt | * mma05data.asc
5.9.4 | 159-63 | Exponential: Computation of marginal effects | mma05p4margeffects.do, mma05p4margeffects.txt | * mma05data.asc
6.5.4 | 198-9 | Nonlinear 2SLS: Using Limdep | mma06p1nl2sls.lim, mma06p1nl2sls.out | * mma06p1nl2sls.asc
6.5.4 | 198-9 | Part of preceding using Stata | mma06p2twostage.do, mma06p2twostage.txt | * mma06p1nl2sls.asc
7.4 | 241-3 | Likelihood-based Hypothesis Tests | mma07p1mltests.do, mma07p1mltests.txt | * mma07p1mltests.asc
7.6.3 | 248-9 | Asymptotic Power of Wald Test | mma07p2power.do, mma07p2power.txt | No data
7.7.1-5 | 250-4 | Monte Carlo Simulation of Wald Test | mma07p3montecarlo.do, mma07p3montecarlo.txt | Data for many simulations not saved
7.8 | 254-6 | Bootstrap example | mma07p4boot.do, mma07p4boot.txt | * mma07p4boot.asc
8.2.9 | 269-71 | Conditional moment tests example | mma08p1cmtests.do, mma08p1cmtests.txt | * mma08p1cmtests.asc
8.5.5 | 283-4 | Nonnested models test example | mma08p2nonnested.do, mma08p2nonnested.txt | mma08p2nonnested.asc
8.7.3 | 290-1 | Model diagnostics example | mma08p3diagnostics.do, mma08p3diagnostics.txt | * mma08p3diagnostics.asc
9.2 | 295-7 | Nonparametric density estimation and regression: application | mma09p1np.do, mma09p1np.txt |
9.4-9.5 | 307-19 | Nonparametric regression: more | mma09p2npmore.do, mma09p2npmore.txt | * mma09p2npmore.asc
9.3.3 | 299-300 | Kernel functions plotted | mma09p3kernels.do, mma09p3kernels.txt | * mma09p3kernels.asc
10.2.5 | 338-9 | Gradient method example (Newton Raphson) | mma10p1gradient.do, mma10p1gradient.txt | No data
PROGRAMS: III. Computationally-Intensive Methods (chapters 11-13)

Section | Pages | Example | Program and Output | Data
11.3 | 366-8 | Bootstrap example | mma11p1boot.do, mma11p1boot.txt | * mma11p1boot.asc
12.3.3 | 391-2 | Integral Computation Example | mma12p1integration.do, mma12p1integration.txt | No data
12.4.5, 12.5.6 | 397-7, 403-4 | Maximum Simulated Likelihood and Maximum Simulated Score Example | mma12p2mslmsm.do, mma12p2mslmsm.txt | * mma12p2mslmsm.asc
12.8.2 | 412-3 | Illustration of Methods to Draw Random Variates | mma12p3draws.do, mma12p3draws.txt | No data
13.2.2 | 424 | Bayes Theorem Illustration for Normal Distribution and Prior | mma13p1bayesthm.do, mma13p1bayesthm.txt | No data
13.6 | 452-4 | MCMC Example: Gibbs Sampler for SUR | mma13p2bayesgibbs.sas, mma13p2bayesgibbs.lst, mma13p2bayesgibbs.log | Program generated
PROGRAMS: IV. Models for Cross-Section Data (chapters 14-20)

Section | Pages | Example | Program and Output | Data
14.2 | 464-5 | Logit and Probit Application (fishing mode) | mma14p1binary.do, mma14p1binary.txt | Nldata.asc
14.7.5 | 486 | Maximum score estimator for binary outcome | mma14p2maxscore.lim, mma14p2maxscore.out | mma14p1binary.asc
15.2.1-3 | 491-5 | Multinomial Logit and Conditional Logit Application (fishing mode) | mma15p1mnl.do, mma15p1mnl.txt | Nldata.asc
15.6.3 | 511 | Nested Logit (or GEV) estimation | mma15p2gev.do, mma15p2gev.txt | Nldata.asc
15.2.2 | 493-4 | Limdep multinomial logit | mma15p3mnl.lim, mma15p3mnl.out | Nldata.asc
15.2.1-3 | 491-5 | Limdep and addon Nlogit for conditional and nested logit | mma15p4gev.lim, mma15p4gev.out | mma15p4gev.asc
16.2.1 | 530-1, 565 | Classic Tobit MLE and CLAD | mma16p1tobit.do, mma16p1tobit.txt | mma16p1tobit.asc
16.3.4 | 540 | Inverse Mills ratio plotted | mma16p2mills.do, mma16p2mills.txt | No data
16.6 | 553-5 | Selection Model Application (medical expenditures) | mma16p3selection.do, mma16p3selection.txt | randdata.dta or mma16p3selection.asc
17.2, 17.5.1 | 574-5, 581-3 | Nonparametric estimation (KM and NA) for survival data (strike duration) | mma17p1km.do, mma17p1km.txt | strkdur.dta or strkdur.asc
17.5.1 | 581-2 | Nonparametric estimation (KM and NA) for survival data (artificial) | mma17p2kmextra.do, mma17p2kmextra.txt | Data in program
17.6.1 | 584-6 | Weibull distribution functions plotted | mma17p3weib.do, mma17p3weib.txt | No data
17.11 | 603-8 | Duration regression models (unemployment duration) | mma17p4duration.do, mma17p4duration.txt | ema1996.dta or ema1996.asc
18.8 | 632-6 | Duration regression with unobserved heterogeneity (unemployment duration) | mma18p1heterogeneity.do, mma18p1heterogeneity.txt | ema1996.dta or ema1996.asc
19.5 | 658-3 | Competing risks model (unemployment duration) | mma19p1comprisks.do, mma19p1comprisks.txt | ema1996.dta or ema1996.asc
20.2, 20.7 | 671-4, 690 | Count regression (doctor contacts) | mma20p1count.do, mma20p1count.txt | randdata.dta or mma20p1count.asc
PROGRAMS: V. Models for Panel Data (chapters 21-23)

Section | Pages | Example | Program and Output | Data
21.3.1-3 | 708-13 | Linear Panel Fixed and Random Effects Application (hours and wages) | mma21p1panfeandre.do, mma21p1panfeandre.txt | MOM.dat
21.3.2, 21.3.4 | 710, 719 | Linear Panel Estimators manually obtained by OLS on transformed equation (hours and wages) | mma21p2panmanual.do, mma21p2panmanual.txt | MOM.dat
21.3.4 | 713-5 | Linear Panel Residual Analysis (hours and wages) | mma21p3panresiduals.do, mma21p3panresiduals.txt | MOM.dat
21.5.5 | 725 | Linear Panel pooled OLS and GLS estimation (hours and wages) | mma21p4pangls.do, mma21p4pangls.txt | MOM.dat
22.3 | 754-6 | Linear Panel GMM Application (hours and wages) | mma22p1gmmpanel.do, mma22p1gmmpanel.txt | MOMprecise.dat
23.3 | 792-5 | Nonlinear Panel Application (patents and R&D) | mma23p1pannonlin.do, mma23p1pannonlin.txt | patr7079.asc
PROGRAMS: VI. Further Methods (chapters 24-27)

Section | Pages | Example | Program and Output | Data
24.7 | 848-53 | Clustered Linear Regression (household medical expenditure clustered on commune) | mma24p1olscluster.do, mma24p1olscluster.txt | vietnam_ex1.dta or vietnam_ex1.asc
 | | Clustered Poisson Regression (individual pharmacy visits clustered on commune) | mma24p2poiscluster.do, mma24p2poiscluster.txt | vietnam_ex2.dta or vietnam_ex2.asc
25.8.1-4 | 889-93 | Treatment Evaluation: Simple calculations (training on earnings) | mma25p1treatment.do, mma25p1treatment.txt | nswpsid.da1 or nswpsid.dta
25.8.5 | 893-6 | Treatment Evaluation: Propensity score matching (training on earnings) | mma25p2matching.do, mma25p2matching.txt | nswpsid.da1 or nswpsid.dta
25.8 | 889-96 | Treatment Evaluation: Additional analysis not in book using additional data sets (NSW experimental controls and CPS controls) | mma25p3extra.do, mma25p3extra.txt | nswre74_treated.dta and nswre74_control.dta or nswre74_all.asc; propensity_cps.dta or propensity_cps.asc
26.5 | 919-20 | Measurement Error Bias Example | To come | Generated data
27.8 | 935-9 | Missing Data MCMC Imputation Example | To come | Generated data
DATA SETS
Data in fixed-format text files have extension .asc or .dat [if a Stata dictionary is used, its extension is .dct].
Stata data files have extension .dta.
We thank Rajeev Dehejia, Bronwyn Hall, Cathy Kling, Jeffrey Kling, Will Manning, Brian McCall
and Jim Ziliak for making their data available for empirical illustrations. The relevant citations are
given below. For "Authors' extract" the citation is A. C. Cameron and P. K. Trivedi (2005),
"Microeconometrics: Methods and Applications," Cambridge University Press, New York.
Many more examples use generated data - see programs.
Pages | Topic | Data Source | Data
88-90 | Median and quantile regression | Vietnam World Bank Living Standards Survey. Authors' extract | qreg0902.dta or qreg0902.asc
110-2 | Instrumental variables with weak instruments | National Longitudinal Survey. J. R. Kling (2001), "Interpreting Instrumental Variables Estimates of the Return to Schooling," Journal of Business and Economic Statistics, 19, 358-364. | DATA66.dat and DATA66.dct
295-7, 300 | Nonparametric density estimation and regression | Panel Survey of Income Dynamics. Authors' extract | psidf3050.dat
463-6, 486, 491-5 | Binary and multinomial outcomes | Fishing-mode choice data. J. A. Herriges and C. L. Kling (1999), "Nonlinear Income Effects in Random Utility Models," Review of Economics and Statistics, 81, 62-72. | Nldata.asc or mma15p4gev.asc
553-6, 565 | Selection models | Rand Health Insurance Experiment. Authors' extract | randdata.dta or mma16p3selection.asc
574-5, 582 | Duration models | Strike duration data. J. Kennan (1985), "The Duration of Contract Strikes in U.S. Manufacturing," Journal of Econometrics, 28, 5-28. | strkdur.dta or strkdur.asc
603-8, 632-6, 658-62 | Duration models | Current Population Survey Displaced Workers Supplement. B. P. McCall (1996), "Unemployment Insurance Rules, Joblessness, and Part-time Work," Econometrica, 64, 647-682. | ema1996.dta or ema1996.asc
671-4, 692 | Count data models | Rand Health Insurance Experiment. P. Deb and P.K. Trivedi (2002), "The Structure of Demand for Medical Care: Latent Class versus Two-Part Models," Journal of Health Economics, 21, 601-625. | randdata.dta or mma20p1count.asc
708-15 | Linear panel models: basics | Panel Survey of Income Dynamics. J. Ziliak (1997), "Efficient Estimation With Panel Data when Instruments are Predetermined: An Empirical Comparison of Moment-Condition Estimators," Journal of Business and Economic Statistics, 15, 419-431. | MOM.dat
754-6 | Linear panel models: GMM | Panel Survey of Income Dynamics. J. Ziliak (1997) - see previous cite. | MOMprecise.dat
792-5 | Nonlinear panel models | Patents-R&D data. B. H. Hall, Z. Griliches and J. A. Hausman (1986), "Patents and R&D: Is There a Lag?", International Economic Review, 27, 265-283. | patr7079.asc
848-53 | Clustered data | Vietnam World Bank Living Standards Survey. Authors' extract: (1) Household data (2) Individual data | vietnam_ex1.dta or vietnam_ex1.asc; vietnam_ex2.dta or vietnam_ex2.asc
889-95 | Treatment evaluation [nswpsid: NSW treated vs PSID control used in text. The other data sets not used in text but used in mma25p3extra.do] | National Supported Work demonstration project and controls. R.H. Dehejia and S. Wahba (1999), "Causal Effects in Nonexperimental Studies: Reevaluating the Evaluation of Training Programs," JASA, 1053-1062, and / or R.H. Dehejia and S. Wahba (2002), "Propensity-score Matching Methods for Nonexperimental Causal Studies," ReStat, 151-161. | nswpsid.da1 or nswpsid.dta; nswre74_treated.dta and nswre74_control.dta or nswre74_all.asc; propensity_cps.dta or propensity_cps.asc
EXPLANATION OF BOOK PROGRAMS

PROGRAMS USED:
Most programs are written for Stata version 8.0 and were executed on an MS Windows PC with Stata 8.2.
Stata 7 will usually be okay. Exceptions where Stata 8 is needed include:
(1) The estimates command (for tabulating regression results) is not available in version 7.
Comment out occurrences of "estimates store ..." and "estimates table ...".
(2) Graphics commands (used to obtain the figures in the book) changed substantially from 7 to 8.
This only affects generating figures. If graphs are important, it is best to upgrade to Stata 8, as its graphics are much better.
(3) In some places free Stata add-ons have been included. These are noted in the programs.
To download an add-on, e.g. knnreg, give the command "search knnreg" in Stata and follow the directions.
The Stata programs vary from very problem-specific code to code that potentially can be adapted to
one's own needs.
Some programs use Limdep version 7.0 and Nlogit 2.0, executed on an MS Windows PC.
Some programs use SAS / IML. SAS version 8.0 was used on a Unix machine.
FILE NAMING CONVENTIONS:
For Stata: as an example for chapter 4.5.3 we provide:
- mma04p1wls.do Stata program
- mma04p1wls.txt Output from this program
- mma04p1wls.asc The generated data as fixed width ascii data set [permits analysis with programs other than Stata]
For Limdep: as an example for chapter 14.5.3 we provide:
- mma15p3mnl.lim Limdep program
- mma15p3mnl.out Output from this program
For SAS: as an example for chapter 13.6 we provide:
- mma13p2bayesgibbs.sas SAS program
- mma13p2bayesgibbs.lst SAS output
- mma13p2bayesgibbs.log SAS logfile
For data sets the extensions are:
- .dta for Stata data set
- .asc for ascii (text) data set that is usually both space delimited and fixed width
For descriptions of the data sets see the relevant program that uses the data set, and the associated output.
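The naming convention above can be checked mechanically. The following is a minimal Python sketch, not part of the book's programs: the function name parse_program_name, the regular expression, and the descriptions attached to each extension are assumptions inferred from the examples listed above (mmaCCpN followed by a short label and an extension).

```python
import re

# Hypothetical pattern inferred from names such as mma04p1wls.do:
# "mma" + two-digit chapter + "p" + program number + label + "." + extension.
NAME_PATTERN = re.compile(r"^mma(\d{2})p(\d+)([a-z0-9]+)\.([a-z0-9]+)$")

# Extensions named in the conventions; the descriptions are paraphrased.
KINDS = {
    "do": "Stata program",
    "txt": "Stata output",
    "asc": "ascii data set",
    "lim": "Limdep program",
    "out": "Limdep output",
    "sas": "SAS program",
    "lst": "SAS output",
    "log": "SAS logfile",
    "dta": "Stata data set",
}


def parse_program_name(filename):
    """Split a name such as 'mma04p1wls.do' into its parts, or return None."""
    match = NAME_PATTERN.match(filename.lower())
    if match is None:
        return None
    chapter, program, label, ext = match.groups()
    return {
        "chapter": int(chapter),
        "program": int(program),
        "label": label,
        "kind": KINDS.get(ext, "unknown"),
    }


for name in ["mma04p1wls.do", "mma15p3mnl.lim", "mma13p2bayesgibbs.log"]:
    print(name, parse_program_name(name))
```

Data set names that do not follow the program pattern (for example randdata.dta) return None, so a helper like this only applies to program, output and generated-data files.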
PROGRAM CPU TIME
Programs generally take little time to run.
The exceptions are programs that entail simulation, including bootstrapping.
These can be sped up by reducing the number of simulations / replications, though any final
analysis should use many simulations / replications.
Chapter 4. Linear models
------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma04p1wls.txt
log type: text
opened on: 17 May 2005, 13:41:48
.
. ********** OVERVIEW OF MMA04P1WLS.DO **********
.
. * STATA Program
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Microeconometrics: Methods and Applications"
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press
.
. * Chapter 4.5.3 pages 84-5
. * Robust Standard Errors for OLS, WLS and GLS
. * (1) Robust and nonrobust standard errors for OLS, WLS and GLS.
. * (2) Table 4.3
. * using generated data (see below)
.
. ********** SETUP **********
.
. set more off
. version 8
. set scheme s1mono /* Used for graphs */
.
. ********** GENERATE DATA and SUMMARIZE **********
.
. * Model is y = 1 + 1*x + u
. * where u = abs(x)*e
. *        x ~ N(0, 5^2)
. *        e ~ N(0, 2^2)
.
. * Errors are conditionally heteroskedastic with V[u|x]=4*x^2
. * OLS, WLS and GLS are consistent
. * but need to use robust standard errors for OLS and WLS.
.
. set seed 10105
. set obs 100
obs was 0, now 100
. gen x = 5*invnorm(uniform())
. gen e = 2*invnorm(uniform())
. gen u = abs(x)*e
. gen y = 1 + 1*x + u
.
. * Descriptive Statistics
. summarize
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
           x |       100   -.1322828     4.64293   -11.05289   10.63336
           e |       100     .350339    2.033639   -3.776468   5.150759
           u |       100    1.215709    8.187081   -19.58098    32.6086
           y |       100    2.083426    9.364465   -27.63657   39.93944
.
. * Write data to a text (ascii) file so can use with programs other than Stata
. outfile y x e u using mma04p1wls.asc, replace
.
. ********** ESTIMATE THE MODELS **********
.
. ** (1) OLS - first column of Table 4.3
.
. * (1A) OLS with wrong standard errors
. regress y x
      Source |       SS       df       MS              Number of obs =     100
-------------+------------------------------           F(  1,    98) =   30.23
       Model |  2046.73901     1  2046.73901           Prob > F      =  0.0000
    Residual |  6634.88855    98  67.7029444           R-squared     =  0.2358
-------------+------------------------------           Adj R-squared =  0.2280
       Total |  8681.62755    99  87.6932076           Root MSE      =  8.2282
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |    .979313   .1781124     5.50   0.000     .6258548    1.332771
       _cons |   2.212973   .8231553     2.69   0.008     .5794478    3.846497
------------------------------------------------------------------------------
. estimates store olsusual
.
. * (1B) OLS with correct standard errors (robust sandwich)
. regress y x, robust
Regression with robust standard errors                 Number of obs =     100
                                                       F(  1,    98) =   12.68
                                                       Prob > F      =  0.0006
                                                       R-squared     =  0.2358
                                                       Root MSE      =  8.2282
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |    .979313   .2750617     3.56   0.001     .4334621    1.525164
       _cons |   2.212973   .8198253     2.70   0.008      .586056    3.839889
------------------------------------------------------------------------------
. estimates store olsrobust
.
. ** (2) WLS - second column of Table 4.3
.
. * (2A) WLS with wrong standard errors
. * Use the aweight option (not clearly explained in Stata manual).
. * The aweight option MULTIPLIES y and x by sqrt(aweight).
. * Here we suppose V[u]=constant*|x|
. * So want to divide by sqrt(|x|), so let aweight=1/|x|
. gen absx = abs(x)
. regress y x [aweight=1/absx]
(sum of wgt is 5.7885e+02)
      Source |       SS       df       MS              Number of obs =     100
-------------+------------------------------           F(  1,    98) =   25.29
       Model |   56.759883     1   56.759883           Prob > F      =  0.0000
    Residual |  219.985987    98  2.24475497           R-squared     =  0.2051
-------------+------------------------------           Adj R-squared =  0.1970
       Total |   276.74587    99  2.79541283           Root MSE      =  1.4983
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .9569768   .1903115     5.03   0.000     .5793097    1.334644
       _cons |   1.060374   .1498265     7.08   0.000     .7630484      1.3577
------------------------------------------------------------------------------
. estimates store wlsusual
.
. * (2B) WLS with correct standard errors (robust sandwich)
. regress y x [aweight=1/absx], robust
                                                       Number of obs =     100
                                                       F(  1,    98) =   17.07
                                                       Prob > F      =  0.0001
                                                       R-squared     =  0.2051
                                                       Root MSE      =  1.4983
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .9569768    .231612     4.13   0.000     .4973503    1.416603
       _cons |   1.060374    .050533    20.98   0.000     .9600931    1.160655
------------------------------------------------------------------------------
. estimates store wlsrobust
.
. ** (3) GLS - last column of Table 4.3
.
. * (3A) GLS with usual standard errors (correct)
. * Here we know V[u]=constant*x^2
. * So want to divide by x, so let aweight=1/(x^2)
. gen xsq = x*x
. regress y x [aweight=1/xsq]
      Source |       SS       df       MS              Number of obs =     100
-------------+------------------------------           F(  1,    98) =   20.70
       Model |  .086075004     1  .086075004           Prob > F      =  0.0000
    Residual |  .407542418    98  .004158596           R-squared     =  0.1744
-------------+------------------------------           Adj R-squared =  0.1660
       Total |  .493617422    99  .004986035           Root MSE      =  .06449
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .9516457   .2091752     4.55   0.000     .5365444    1.366747
       _cons |   .9964956   .0065131   153.00   0.000     .9835706    1.009421
------------------------------------------------------------------------------
. estimates store glsusual
.
. * (3B) GLS with standard errors (robust sandwich - unnecessary here)
. regress y x [aweight=1/xsq], robust
                                                       Number of obs =     100
                                                       F(  1,    98) =   20.89
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.1744
                                                       Root MSE      =  .06449
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .9516457   .2082145     4.57   0.000     .5384508    1.364841
       _cons |   .9964956   .0078922   126.26   0.000     .9808337    1.012157
------------------------------------------------------------------------------
. estimates store glsrobust
.
. * (3C) Check that aweight works as expected.
. * Do GLS by OLS on data transformed by dividing by x.
. gen try = y/x
. gen trint = 1/x
. gen trx = x/x
. regress try trx trint, noconstant
      Source |       SS       df       MS              Number of obs =     100
-------------+------------------------------           F(  2,    98) =11850.15
       Model |  101659.545     2  50829.7726           Prob > F      =  0.0000
    Residual |  420.359033    98  4.28937789           R-squared     =  0.9959
-------------+------------------------------           Adj R-squared =  0.9958
       Total |  102079.904   100  1020.79904           Root MSE      =  2.0711
------------------------------------------------------------------------------
         try |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         trx |   .9516457   .2091752     4.55   0.000     .5365444    1.366747
       trint |   .9964956   .0065131   153.00   0.000     .9835706    1.009421
------------------------------------------------------------------------------
.
. ********** DISPLAY KEY RESULTS **********
.
. * Table 4.3
. estimates table olsusual olsrobust wlsusual wlsrobust glsusual glsrobust, /*
> */ se stats(N r2) b(%7.3f) keep(_cons x)
--------------------------------------------------------------------------
    Variable | olsus~l   olsro~t   wlsus~l   wlsro~t   glsus~l   glsro~t
-------------+------------------------------------------------------------
       _cons |   2.213     2.213     1.060     1.060     0.996     0.996
             |   0.823     0.820     0.150     0.051     0.007     0.008
           x |   0.979     0.979     0.957     0.957     0.952     0.952
             |   0.178     0.275     0.190     0.232     0.209     0.208
-------------+------------------------------------------------------------
           N | 100.000   100.000   100.000   100.000   100.000   100.000
          r2 |   0.236     0.236     0.205     0.205     0.174     0.174
--------------------------------------------------------------------------
                                                              legend: b/se
.
. * Minor typo in Table 4.3:
. * for GLS Constant has robust s.e. of [0.008] not [0.006]
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section2\mma04p1wls.txt
log type: text
closed on: 17 May 2005, 13:41:48
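For readers working outside Stata, the calculations behind the mma04p1wls log can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' code: the helper names wls_line and ols_slope_se are hypothetical, Python's random number generator does not reproduce Stata's draws (so the numbers will differ from the log), and the robust standard error uses the n/(n-k) degrees-of-freedom scaling. The last step mirrors check (3C): GLS with weights 1/x^2 should coincide with OLS on data divided by x.

```python
import math
import random


def wls_line(x, y, w):
    """Weighted least squares fit of y = a + b*x with weights w; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return ybar - b * xbar, b


def ols_slope_se(x, y):
    """OLS slope with its default and heteroskedasticity-robust standard errors."""
    n = len(x)
    a, b = wls_line(x, y, [1.0] * n)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    s2 = sum(r * r for r in resid) / (n - 2)
    se_default = math.sqrt(s2 / sxx)
    # Sandwich "meat" for the slope: sum of (x_i - xbar)^2 * e_i^2.
    meat = sum(((xi - xbar) * r) ** 2 for xi, r in zip(x, resid))
    se_robust = math.sqrt(n / (n - 2) * meat) / sxx
    return b, se_default, se_robust


# Same design as the log: y = 1 + 1*x + u, u = |x|*e, x ~ N(0,5^2), e ~ N(0,2^2).
rng = random.Random(10105)  # seed value is arbitrary here; draws differ from Stata
n = 100
x = [rng.gauss(0, 5) for _ in range(n)]
e = [rng.gauss(0, 2) for _ in range(n)]
y = [1 + xi + abs(xi) * ei for xi, ei in zip(x, e)]

b_ols, se_default, se_robust = ols_slope_se(x, y)

# GLS: weights proportional to 1/V[u|x] = 1/x^2.
a_gls, b_gls = wls_line(x, y, [1 / xi ** 2 for xi in x])

# Check: OLS of y/x on a constant and 1/x. The constant estimates the original
# slope and the coefficient on 1/x estimates the original intercept.
b_chk, a_chk = wls_line([1 / xi for xi in x],
                        [yi / xi for xi, yi in zip(x, y)],
                        [1.0] * n)

print("OLS slope", b_ols, "default se", se_default, "robust se", se_robust)
print("GLS intercept, slope", a_gls, b_gls)
print("Transformed OLS intercept, slope", a_chk, b_chk)
```

The two GLS computations agree up to rounding, which is the point of step (3C) in the log.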
------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma04p2qreg.txt
log type: text
opened on: 17 May 2005, 13:43:21
.
. ********** OVERVIEW OF MMA04P2QREG.DO **********
.
. * STATA Program
.
. * Quantile Regression analysis.
. * (1) Quantile regression estimates for different quantiles
. * (2) Figure 4.1: Quantile Slope Coefficient Estimates as Quantile Varies
. * (3) Figure 4.2: Quantile Regression Lines as Quantile Varies
.
. * To run this program you need data file
. * qreg0902.dta
. * or for programs other than Stata use qreg0902.asc
.
. * Step (3) takes a long time due to bootstrap to get standard errors.
. * To speed up the program reduce the number of repetitions in sqreg
. * But any final results should use a large number of bootstraps
.
. ********** SETUP **********
.
. set more off
. version 8.0
.
. ********** DATA DESCRIPTION **********
.
. * The data from World Bank 1997 Vietnam Living Standards Survey
. * are described in chapter 4.6.4.
. * A larger sample from this survey is studied in Chapter 24.7
.
. ********** READ DATA, TRANSFORM and SAMPLE SELECTION **********
.
. use qreg0902
. describe
Contains data from qreg0902.dta
  obs:         5,999
 vars:             9                          19 Sep 2002 21:45
 size:       191,968 (98.1% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
sex             byte   %8.0g                  Gender of HH.head (1:M;2:F)
age             int    %8.0g                  Age of household head
educyr98        float  %9.0g                  schooling year of HH.head
farm            float  %9.0g       loaiho     Type of HH (1:farm; 0:nonfarm)
urban98         byte   %8.0g       urban      1:urban 98; 0:rural 98
hhsize          long   %12.0g                 Household size
lhhexp1         float  %9.0g
lhhex12m        float  %9.0g
lnrlfood        float  %9.0g
-------------------------------------------------------------------------------
Sorted by:
. summarize
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
         sex |      5999    1.270712    .4443645          1          2
         age |      5999    48.01284     13.7702         16         95
    educyr98 |      5999    7.094419    4.416092          0         22
        farm |      5999    .5730955    .4946694          0          1
     urban98 |      5999    .2883814    .4530472          0          1
-------------+---------------------------------------------------------
      hhsize |      5999    4.752292    1.954292          1         19
     lhhexp1 |      5999    9.341561    .6877458   6.543108   12.20242
    lhhex12m |      5006    6.310585    1.593083          0   12.36325
    lnrlfood |      5999    8.679536    .5368118   6.356364   11.38385
.
. outfile sex age educyr98 farm urban98 hhsize lhhexp1 lhhex12m lnrlfood /*
> */ using qreg0902.asc, replace
.
. * drop zero observations for medical expenditures
. drop if lhhex12m == .
(993 observations deleted)
.
. * lhhexp1 is natural logarithm of household total expenditure
. * lhhex12m is natural logarithm of household medical expenditure
. gen lntotal = lhhexp1
. gen lnmed = lhhex12m
. label variable lntotal "Log household total expenditure"
. label variable lnmed "Log household medical expenditure"
. describe
Contains data from qreg0902.dta
  obs:         5,006
 vars:            11                          19 Sep 2002 21:45
 size:
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
sex             byte   %8.0g
age             int    %8.0g
educyr98        float  %9.0g                  schooling year of HH.head
farm            float  %9.0g
urban98         byte   %8.0g       urban      1:urban 98; 0:rural 98
hhsize          long   %12.0g                 Household size
lhhexp1         float  %9.0g
lhhex12m        float  %9.0g
lnrlfood        float  %9.0g
lntotal         float  %9.0g                  Log household total expenditure
lnmed           float  %9.0g                  Log household medical expenditure
-------------------------------------------------------------------------------
Sorted by:
Note: dataset has changed since last saved
. summarize
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
         sex |      5006    1.269676     .443836          1          2
         age |      5006    48.06133    13.79974         18         95
    educyr98 |      5006    7.147956    4.333304          0         21
        farm |      5006    .5679185    .4954151          0          1
     urban98 |      5006    .2920495    .4547504          0          1
-------------+---------------------------------------------------------
      hhsize |      5006    4.832601     1.95257          1         19
     lhhexp1 |      5006    9.370402    .6726841   6.543108   12.20242
    lhhex12m |      5006    6.310585    1.593083          0   12.36325
    lnrlfood |      5006    8.697963    .5309517   6.356364   11.38385
     lntotal |      5006    9.370402    .6726841   6.543108   12.20242
-------------+---------------------------------------------------------
       lnmed |      5006    6.310585    1.593083          0   12.36325
.
. ********* ANALYSIS: QUANTILE REGRESSION **********
.
. * (0) OLS
. reg lnmed lntotal
      Source |       SS       df       MS              Number of obs =    5006
-------------+------------------------------           F(  1,  5004) =  311.91
       Model |  745.293239     1  745.293239           Prob > F      =  0.0000
    Residual |  11956.9671  5004  2.38948183           R-squared     =  0.0587
-------------+------------------------------           Adj R-squared =  0.0585
       Total |  12702.2603  5005  2.53791415           Root MSE      =  1.5458
------------------------------------------------------------------------------
       lnmed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lntotal |   .5736545   .0324817    17.66   0.000     .5099761    .6373328
       _cons |   .9352117   .3051496     3.06   0.002     .3369847    1.533439
------------------------------------------------------------------------------
. predict pols
(option xb assumed; fitted values)
. reg lnmed lntotal, robust
                                                       Number of obs =    5006
                                                       F(  1,  5004) =  318.05
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.0587
                                                       Root MSE      =  1.5458
------------------------------------------------------------------------------
             |               Robust
       lnmed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lntotal |   .5736545   .0321665    17.83   0.000      .510594     .636715
       _cons |   .9352117    .298119     3.14   0.002     .3507677    1.519656
------------------------------------------------------------------------------
. * Bootstrap standard errors for OLS
. set seed 10101
. * bs "reg lnmed lntotal" "_b[lntotal]", reps(100)
.
. * (1) Quantile and median regression for quantiles 0.1, 0.5 and 0.9
. * Save prediction to construct Figure 4.2.
. qreg lnmed lntotal, quant(.10)
Iteration 1: WLS sum of weighted deviations = 3554.0793
Iteration 1: sum of abs. weighted deviations = 3555.3279
.1 Quantile regression                               Number of obs =      5006
  Raw sum of deviations 2936.097 (about 4.1743875)
  Min sum of deviations 2932.443                     Pseudo R2     =    0.0012
------------------------------------------------------------------------------
       lnmed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lntotal |   .1512009   .0552584     2.74   0.006     .0428702    .2595317
       _cons |   2.825072   .5194064     5.44   0.000     1.806808    3.843336
------------------------------------------------------------------------------
. predict pqreg10
Iterations: sum of weighted deviations = 6112.4546, 6098.5295, 6097.2178, 6097.1564
Median regression                                    Number of obs =      5006
                                                     Pseudo R2     =    0.0359
------------------------------------------------------------------------------
       lnmed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lntotal |   .6210917   .0388194    16.00   0.000     .5449886    .6971948
       _cons |   .5921626   .3646869     1.62   0.104    -.1227836    1.307109
------------------------------------------------------------------------------
. predict pqreg50

Iterations: sum of weighted deviations = 3279.5575, 2691.3839, 2521.5214, 2506.303, 2505.1952, 2505.1334, 2505.1314, 2505.1313
.9 Quantile regression                               Number of obs =      5006
                                                     Pseudo R2     =    0.0679
------------------------------------------------------------------------------
       lnmed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lntotal |   .8003569   .0517225    15.47   0.000     .6989581    .9017558
       _cons |   .6750967   .4857563     1.39   0.165    -.2771985    1.627392
------------------------------------------------------------------------------
. predict pqreg90
.
. * (2) Create Figure 4.2 on page 90 first as this is easy
. graph twoway (scatter lnmed lntotal, msize(vsmall)) (lfit pqreg90 lntotal, clstyle(p2)) /*
> */ (lfit pqreg50 lntotal, clstyle(p1)) (lfit pqreg10 lntotal, clstyle(p3)), /*
> */ scale(1.2) plotregion(style(none)) /*
> */ title("Regression Lines as Quantile Varies") /*
> */ xtitle("Log Household Total Expenditure", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Log Household Medical Expenditure", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend(pos(11) ring(0) col(1)) legend(size(small)) /*
> */ legend( label(1 "Actual Data") label(2 "90th percentile") /*
> */         label(3 "Median") label(4 "10th percentile"))
. graph export ch4fig2QR.wmf, replace
(file c:\Imbook\bwebpage\Section2\ch4fig2QR.wmf written in Windows Metafile format)
.
. * (3) Create Figure 4.1 second as this is more difficult
. * Simultaneous quantile regression for quantiles 0.05, 0.10, ..., 0.90, 0.95
. * with standard errors by bootstrap - here 200 replications
. set seed 10101
. sqreg lnmed lntotal, quant(.05,.1,.15,.2,.25,.3,.35,.4,.45,.5,.55,.6,.65,.7,.75,.8,.85,.9,.95) reps(200)
(fitting base model)
(bootstrapping .....................................................................................
> ..................................................................................................
> .................)
Simultaneous quantile regression
bootstrap(200) SEs
Number of obs =
5006
.05 Pseudo R2 = 0.0015
.10 Pseudo R2 = 0.0012
.15 Pseudo R2 = 0.0058
.20 Pseudo R2 = 0.0106
.25 Pseudo R2 = 0.0149
.30 Pseudo R2 = 0.0183
.35 Pseudo R2 = 0.0242
.40 Pseudo R2 = 0.0274
.45 Pseudo R2 = 0.0326
.50 Pseudo R2 = 0.0359
.55 Pseudo R2 = 0.0408
.60 Pseudo R2 = 0.0464
.65 Pseudo R2 = 0.0500
.70 Pseudo R2 = 0.0520
.75 Pseudo R2 = 0.0563
.80 Pseudo R2 = 0.0603
.85 Pseudo R2 = 0.0630
.90 Pseudo R2 = 0.0679
.95 Pseudo R2 = 0.0795
------------------------------------------------------------------------------
             |              Bootstrap
       lnmed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
q5           |
     lntotal |   .1536332   .0791236     1.94   0.052    -.0014838    .3087501
       _cons |   2.095395   .7559016     2.77   0.006     .6134964    3.577293
-------------+----------------------------------------------------------------
q10          |
     lntotal |   .1512009    .085018     1.78   0.075    -.0154716    .3178734
       _cons |   2.825072   .7697613     3.67   0.000     1.316002    4.334141
-------------+----------------------------------------------------------------
q15          |
     lntotal |   .2695707   .0580757     4.64   0.000     .1557168    .3834245
       _cons |   2.231293   .5429047     4.11   0.000     1.166962    3.295624
-------------+----------------------------------------------------------------
q20          |
     lntotal |   .3552251   .0504688     7.04   0.000     .2562841    .4541662
       _cons |   1.740233   .4649551     3.74   0.000     .8287172    2.651749
-------------+----------------------------------------------------------------
q25          |
     lntotal |   .4034632   .0421514     9.57   0.000     .3208279    .4860984
       _cons |   1.567055   .3844967     4.08   0.000     .8132731    2.320837
-------------+----------------------------------------------------------------
q30          |
     lntotal |   .4797723   .0478081    10.04   0.000     .3860474    .5734972
       _cons |   1.097107   .4299363     2.55   0.011     .2542435     1.93997
-------------+----------------------------------------------------------------
q35          |
     lntotal |     .52179   .0440082    11.86   0.000     .4355147    .6080652
       _cons |   .9213684   .4064355     2.27   0.023     .1245768     1.71816
-------------+----------------------------------------------------------------
q40          |
     lntotal |   .5691746   .0412824    13.79   0.000     .4882429    .6501062
       _cons |   .6808693   .3754568     1.81   0.070    -.0551906    1.416929
-------------+----------------------------------------------------------------
q45          |
     lntotal |   .6123663   .0402805    15.20   0.000     .5333989    .6913337
       _cons |   .4890392    .373467     1.31   0.190    -.2431197    1.221198
-------------+----------------------------------------------------------------
q50          |
     lntotal |   .6210917   .0414602    14.98   0.000     .5398117    .7023718
       _cons |   .5921626   .3866997     1.53   0.126    -.1659383    1.350263
-------------+----------------------------------------------------------------
q55          |
     lntotal |   .6523013     .02904    22.46   0.000     .5953701    .7092324
       _cons |   .4913988    .264271     1.86   0.063    -.0266881    1.009486
-------------+----------------------------------------------------------------
q60          |
     lntotal |   .6531127   .0321585    20.31   0.000     .5900679    .7161575
       _cons |   .6631971   .2981433     2.22   0.026     .0787056    1.247689
-------------+----------------------------------------------------------------
q65          |
     lntotal |   .6843844     .03378    20.26   0.000     .6181608    .7506079
       _cons |   .5550968   .3162769     1.76   0.079    -.0649445    1.175138
-------------+----------------------------------------------------------------
q70          |
     lntotal |    .714783   .0330755    21.61   0.000     .6499406    .7796255
       _cons |   .4732288   .3028818     1.56   0.118    -.1205524     1.06701
-------------+----------------------------------------------------------------
q75          |
     lntotal |   .7416898   .0369607    20.07   0.000     .6692306     .814149
       _cons |   .4298887   .3416755     1.26   0.208     -.239945    1.099722
-------------+----------------------------------------------------------------
q80          |
     lntotal |   .7675658   .0443925    17.29   0.000      .680537    .8545946
       _cons |   .3966887   .4132223     0.96   0.337    -.4134081    1.206785
-------------+----------------------------------------------------------------
q85          |
     lntotal |   .8009016    .056703    14.12   0.000     .6897389    .9120642
       _cons |   .3649957   .5369325     0.68   0.497    -.6876273    1.417619
-------------+----------------------------------------------------------------
q90          |
     lntotal |   .8003569   .0473557    16.90   0.000     .7075189    .8931949
       _cons |   .6750967   .4450068     1.52   0.129    -.1973116    1.547505
-------------+----------------------------------------------------------------
q95          |
     lntotal |    .767308   .0507532    15.12   0.000     .6678094    .8668066
       _cons |   1.487137   .4739756     3.14   0.002     .5579371    2.416337
------------------------------------------------------------------------------
. * Test equality of slope coefficients for 25th and 75th quantiles
. test [q25]lntotal = [q75]lntotal
( 1) [q25]lntotal - [q75]lntotal = 0
F( 1, 5004) = 55.14
Prob > F = 0.0000
. * Create vectors of slope coefficients and estimated variances
. * Code here is specific to this problem:
. * with a single regressor the slope coefficient is the 1st, 3rd, 5th, ... entry
. matrix b = e(b)
. matrix bslopevector = b[1,1]\b[1,3]\b[1,5]\b[1,7]\b[1,9]\b[1,11]\b[1,13] /*
> */ \b[1,15]\b[1,17]\b[1,19]\b[1,21]\b[1,23]\b[1,25] /*
> */ \b[1,27]\b[1,29]\b[1,31]\b[1,33]\b[1,35]\b[1,37]
. matrix V = e(V)
. matrix Vslopevector = V[1,1]\V[3,3]\V[5,5]\V[7,7]\V[9,9]\V[11,11]\V[13,13] /*
> */ \V[15,15]\V[17,17]\V[19,19]\V[21,21]\V[23,23]\V[25,25] /*
> */ \V[27,27]\V[29,29]\V[31,31]\V[33,33]\V[35,35]\V[37,37]
. matrix q = e(q1)\e(q2)\e(q3)\e(q4)\e(q5)\e(q6)\e(q7)\e(q8)\e(q9)\e(q10) /*
> */ \e(q11)\e(q12)\e(q13)\e(q14)\e(q15)\e(q16)\e(q17)\e(q18)\e(q19)
. * Convert column vectors to variables as graph handles variables
. svmat bslopevector, name(bslope)
. svmat Vslopevector, name(Vslope)
. svmat q, name(quantiles)
. gen upper = bslope1 + 1.96*sqrt(Vslope1)
(4987 missing values generated)
. gen lower = bslope1 - 1.96*sqrt(Vslope1)
. * Also include OLS slope coefficient
. quietly reg lnmed lntotal
. gen bols=_b[lntotal]
. sum upper bslope1 lower bols

    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
       upper |        19    .6564067    .1904354   .3087155   .9120393
     bslope1 |        19    .5641943     .209318   .1512009   .8009015
       lower |        19    .4719818    .2302585  -.0154343   .7075397
        bols |      5006    .5736545           0   .5736545   .5736545
.
. * Following produces Figure 4.1 on page 89
. graph twoway (line upper quantiles1, msize(vtiny) mstyle(p2) clstyle(p1) clcolor(gs12)) /*
> */ (line bslope1 quantiles1, msize(vtiny) mstyle(p1) clstyle(p1)) /*
> */ (line lower quantiles1, msize(vtiny) mstyle(p2) clstyle(p1) clcolor(gs12)) /*
> */ (line bols quantiles1, msize(vtiny) mstyle(p3) clstyle(p2)), /*
> */ scale(1.2) plotregion(style(none)) /*
> */ title("Slope Estimates as Quantile Varies") /*
> */ xtitle("Quantile", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Slope and confidence bands", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend( label(1 "Upper 95% confidence band") label(2 "Quantile slope coefficient") /*
> */         label(3 "Lower 95% confidence band") label(4 "OLS slope coefficient") )
. graph export ch4fig1QR.wmf, replace
(file c:\Imbook\bwebpage\Section2\ch4fig1QR.wmf written in Windows Metafile format)
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section2\mma04p2qreg.txt
log type: text
closed on: 17 May 2005, 13:51:21
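The quantile regressions in the log above minimize a sum of asymmetrically weighted absolute residuals. The following Python sketch illustrates that objective on a small generated sample. It is an illustration only: the function names are hypothetical, the data are simulated rather than the Vietnam extract, and the brute-force search over pairs of points is a stand-in chosen for transparency, not the algorithm Stata's qreg uses. It relies on the fact that some minimizer of the objective passes exactly through two data points.

```python
import random


def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def objective(a, b, x, y, tau):
    """Sum of check losses for the line y = a + b*x."""
    return sum(check_loss(yi - a - b * xi, tau) for xi, yi in zip(x, y))


def quantile_line(x, y, tau):
    """Brute-force quantile regression of y on x for a small sample.

    Tries every line through two data points and keeps the one with the
    smallest objective. Cost grows like n^3, so use small n only.
    """
    best = None
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            if x[i] == x[j]:
                continue  # vertical line, not a valid regression fit
            b = (y[j] - y[i]) / (x[j] - x[i])
            a = y[i] - b * x[i]
            loss = objective(a, b, x, y, tau)
            if best is None or loss < best[0]:
                best = (loss, a, b)
    return best[1], best[2]


# Simulated heteroskedastic data (assumed design, for illustration only):
# the error scale grows with x, so the true quantile slopes rise with tau.
rng = random.Random(10101)
n = 50
x = [rng.uniform(1, 3) for _ in range(n)]
y = [1 + 0.5 * xi + 0.3 * xi * rng.gauss(0, 1) for xi in x]

fits = {tau: quantile_line(x, y, tau) for tau in (0.1, 0.5, 0.9)}
for tau, (a, b) in fits.items():
    print(f"tau={tau:.1f}  intercept={a:.3f}  slope={b:.3f}")
```

Standard errors are not computed here. As in the log, they would typically come from a bootstrap, with the number of replications kept large for any final analysis.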
------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma04p3iv.txt
log type: text
opened on: 17 May 2005, 13:44:29
.
. ********** OVERVIEW OF MMA04P3IV.DO **********
.
. * STATA Program
.
. * Instrumental variables analysis.
. * (1) IV Regression (with robust s.e.'s though not needed here for iid error).
. * (2) Table 4.4
.
. ********** SETUP **********
.
. set more off
. version 8
.
.
. * Model is
. * y = b1 + b2*x + u
. * x = c1 + c2*z + v
. * z ~ N[2,1]
. * where b1=0, b2=0.5, c1=0 and c2=1
. * and u and v are joint normal (0,0,1,1,0.8)
.
. * OLS of y on x is inconsistent as x is correlated with u
. * Instead need to do IV with instrument z for x
. * Also try using z-cubed as an alternative instrument
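A short worked calculation, using only the parameter values listed in the comments above and assuming z is independent of u and v, suggests what the OLS and IV slope estimates should converge to in this design. It is an illustration added here, not part of the original program.

```latex
\begin{aligned}
\operatorname{Var}(x) &= c_2^2\,\operatorname{Var}(z) + \operatorname{Var}(v) = 1 + 1 = 2,\\
\operatorname{Cov}(x,u) &= \operatorname{Cov}(v,u) = 0.8,\\
\operatorname{plim}\,\hat b_{2,\mathrm{OLS}} &= b_2 + \frac{\operatorname{Cov}(x,u)}{\operatorname{Var}(x)} = 0.5 + \frac{0.8}{2} = 0.9,\\
\operatorname{plim}\,\hat b_{2,\mathrm{IV}} &= \frac{\operatorname{Cov}(z,y)}{\operatorname{Cov}(z,x)} = \frac{b_2 c_2\,\operatorname{Var}(z)}{c_2\,\operatorname{Var}(z)} = 0.5 .
\end{aligned}
```

These limits are in line with the OLS slope of about 0.90 and the IV slope of about 0.51 reported in this log.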
.
. set seed 10001
. set obs 10000
obs was 0, now 10000
. scalar b1 = 0
. scalar b2 = 0.5
. scalar c1 = 0
. scalar c2 = 1
.
. * Generate errors u and v
. * Use fact that u is N(0,1)
. * and v | u is N(0 + (.8/1)(u - 0), 1 - .8x.8/1 = 0.36)
. gen u = 1*invnorm(uniform())
. gen muvgivnu = 0.8*u
. gen v = 1*(muvgivnu+sqrt(0.36)*invnorm(uniform()))
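The two-step construction of v follows from the standard conditional distribution of a bivariate normal. Written out with the values used here (zero means, unit variances, correlation 0.8), as a sketch of the reasoning behind the comment in the program:

```latex
v \mid u \sim N\!\left(\mu_v + \rho\,\frac{\sigma_v}{\sigma_u}\,(u-\mu_u),\; \sigma_v^2\,(1-\rho^2)\right)
= N\!\left(0.8\,u,\; 1 - 0.8^2\right) = N\!\left(0.8\,u,\; 0.36\right),
\qquad\text{so}\qquad v = 0.8\,u + 0.6\,\varepsilon,\quad \varepsilon \sim N(0,1).
```

Here the symbol epsilon stands for the second standard normal draw, which is independent of u.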
.
. * Generate instrument z (which is purely random)
. gen z = 2 + 1*invnorm(uniform())
.
. * Generate regressor x which is correlated with z, and with u via v
. gen x = c1 + c2*z + v
.
. * Generate dependent variable y
. gen y = b1 + b2*x + u
.
. * Generate z-cubed. Used as an alternative instrument
. gen zcube = z*z*z
.
. describe
Contains data
  obs:        10,000
 vars:             7
 size:
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
u               float  %9.0g
muvgivnu        float  %9.0g
v               float  %9.0g
z               float  %9.0g
x               float  %9.0g
y               float  %9.0g
zcube           float  %9.0g
------------------------------------------------------------------------------
Sorted by:
. summarize
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
           u |     10000     .003772    1.010726  -4.010302   4.267661
    muvgivnu |     10000    .0030176    .8085809  -3.208241   3.414129
           v |     10000    .0097031    1.005874  -3.992237    3.79261
           z |     10000    1.997786    1.013118  -1.895752    5.81496
           x |     10000    2.007489    1.436511  -3.139744   7.366555
-------------+---------------------------------------------------------
           y |     10000    1.007516    1.538611  -5.309155   7.794924
       zcube |     10000    14.14145    17.88016  -6.813095   196.6257
. correlate y x z u v
(obs=10000)
             |        y        x        z        u        v
-------------+---------------------------------------------
           y |   1.0000
           x |   0.8423   1.0000
           z |   0.3403   0.7140   1.0000
           u |   0.9237   0.5716   0.0107   1.0000
           v |   0.8601   0.7090   0.0124   0.8055   1.0000
. correlate y x z u v, cov
(obs=10000)
             |        y        x        z        u        v
-------------+---------------------------------------------
           y |  2.36732
           x |  1.86165  2.06356
           z |  .530456   1.0391  1.02641
           u |   1.4365  .829866  .010909  1.02157
           v |  1.33119  1.02447  .012687  .818958  1.01178
. graph matrix y x z u v
.
. outfile y x z u v using mma04p3iv.asc, replace
.
. ********** DO THE ANALYSIS: ESTIMATE MODELS **********
.
. * (1) OLS is inconsistent (first column of Table 4.4)
. regress y x
      Source |       SS       df       MS
-------------+------------------------------           F(  1,  9998) =24412.17
       Model |  16793.2198     1  16793.2198           Prob > F      =  0.0000
    Residual |  6877.65935  9998  .687903516           R-squared     =  0.7094
-------------+------------------------------           Adj R-squared =  0.7094
       Total |  23670.8791  9999  2.36732464           Root MSE      =   .8294
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .9021522    .005774   156.24   0.000      .890834    .9134704
       _cons |  -.8035441    .014253   -56.38   0.000    -.8314827   -.7756054
------------------------------------------------------------------------------
. regress y x, robust
                                                       F(  1,  9998) =24780.49
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.7094
                                                       Root MSE      =   .8294
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .9021522   .0057309   157.42   0.000     .8909184    .9133859
       _cons |  -.8035441   .0141056   -56.97   0.000    -.8311939   -.7758942
------------------------------------------------------------------------------
. estimates store olswrong
.
. * (2) IV with instrument z is consistent and efficient (second column of Table 4.4)
. ivreg y (x = z)
Instrumental variables (2SLS) regression
      Source |       SS       df       MS
-------------+------------------------------           F(  1,  9998) = 2728.97
       Model |  13628.1781     1  13628.1781           Prob > F      =  0.0000
    Residual |   10042.701  9998    1.004471           R-squared     =  0.5757
-------------+------------------------------           Adj R-squared =  0.5757
       Total |  23670.8791  9999  2.36732464           Root MSE      =  1.0022
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .5104982   .0097723    52.24   0.000     .4913426    .5296538
       _cons |   -.017303   .0220296    -0.79   0.432    -.0604854    .0258793
------------------------------------------------------------------------------
Instrumented:  x
Instruments:   z
------------------------------------------------------------------------------
. ivreg y (x = z), robust
IV (2SLS) regression with robust standard errors
                                                       F(  1,  9998) = 2670.19
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.5757
                                                       Root MSE      =  1.0022
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .5104982   .0098792    51.67   0.000     .4911329    .5298635
       _cons |   -.017303   .0220785    -0.78   0.433    -.0605813    .0259752
------------------------------------------------------------------------------
Instrumented:  x
Instruments:   z
------------------------------------------------------------------------------
. estimates store iv
.
. * (3) IV estimator in (3) can be computed by
. *     regress y on z gives dy/dz
. *     regress x on z gives dx/dz
. * and divide the two
. regress y z
      Source |       SS       df       MS
-------------+------------------------------           F(  1,  9998) = 1309.44
       Model |  2741.16635     1  2741.16635           Prob > F      =  0.0000
    Residual |  20929.7128  9998  2.09338995           R-squared     =  0.1158
-------------+------------------------------           Adj R-squared =  0.1157
       Total |  23670.8791  9999  2.36732464           Root MSE      =  1.4469
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           z |    .516808   .0142819    36.19   0.000     .4888126    .5448035
       _cons |  -.0249553    .031991    -0.78   0.435    -.0876642    .0377535
------------------------------------------------------------------------------
. matrix byonz = e(b)
. regress x z
      Source |       SS       df       MS
-------------+------------------------------           F(  1,  9998) =10396.43
       Model |  10518.3341     1  10518.3341           Prob > F      =  0.0000
    Residual |  10115.2362  9998  1.01172597           R-squared     =  0.5098
-------------+------------------------------           Adj R-squared =  0.5097
       Total |  20633.5703  9999  2.06356339           Root MSE      =  1.0058
------------------------------------------------------------------------------
           x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           z |    1.01236   .0099287   101.96   0.000     .9928979    1.031822
       _cons |  -.0149899     .02224    -0.67   0.500    -.0585847     .028605
------------------------------------------------------------------------------
. matrix bxonz = e(b)
. matrix ivfirstprinciples = byonz[1,1]/bxonz[1,1]
. matrix list byonz
byonz[1,2]
             z       _cons
y1   .51680804  -.02495533
. matrix list bxonz
bxonz[1,2]
             z       _cons
y1   1.0123602  -.01498985
. matrix list ivfirstprinciples
symmetric ivfirstprinciples[1,1]
           c1
r1   .5104982
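The ratio of the two OLS slopes is the same thing as the ratio of sample covariances, Cov(z,y)/Cov(z,x), because the variance of z cancels. The following is a minimal sketch of that check; it is not part of mma04p3iv.do, the scalar names cov_yz and cov_xz are made up for illustration, and it assumes the simulated data are still in memory.

```stata
* Hypothetical check, not part of the original do-file.
* The just-identified IV slope equals Cov(z,y)/Cov(z,x).
quietly correlate y z, covariance
scalar cov_yz = r(cov_12)
quietly correlate x z, covariance
scalar cov_xz = r(cov_12)
display "IV slope from covariances = " cov_yz/cov_xz
```

Using the covariances already printed in this log, .530456/1.0391 is approximately .5105, which agrees with the ivreg slope.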
.
. * (4) IV can be computed as 2SLS, but wrong standard errors
. * (third column of Table 4.4)
. * (4A) OLS of x on z gives xhat
. regress x z
      Source |       SS       df       MS
-------------+------------------------------           F(  1,  9998) =10396.43
       Model |  10518.3341     1  10518.3341           Prob > F      =  0.0000
    Residual |  10115.2362  9998  1.01172597           R-squared     =  0.5098
-------------+------------------------------           Adj R-squared =  0.5097
       Total |  20633.5703  9999  2.06356339           Root MSE      =  1.0058
------------------------------------------------------------------------------
           x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           z |    1.01236   .0099287   101.96   0.000     .9928979    1.031822
       _cons |  -.0149899     .02224    -0.67   0.500    -.0585847     .028605
------------------------------------------------------------------------------
. predict xhat, xb
. * (4B) OLS of y on xhat gives IV but wrong standard errors
. regress y xhat
      Source |       SS       df       MS
-------------+------------------------------           F(  1,  9998) = 1309.44
       Model |  2741.16636     1  2741.16636           Prob > F      =  0.0000
    Residual |  20929.7127  9998  2.09338995           R-squared     =  0.1158
-------------+------------------------------           Adj R-squared =  0.1157
       Total |  23670.8791  9999  2.36732464           Root MSE      =  1.4469
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        xhat |   .5104982   .0141075    36.19   0.000     .4828446    .5381518
       _cons |   -.017303   .0318026    -0.54   0.586    -.0796425    .0450364
------------------------------------------------------------------------------
. regress y xhat, robust
                                                       F(  1,  9998) = 1271.86
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.1158
                                                       Root MSE      =  1.4469
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        xhat |   .5104982   .0143144    35.66   0.000      .482439    .5385574
       _cons |   -.017303   .0319207    -0.54   0.588    -.0798741     .045268
------------------------------------------------------------------------------
. estimates store twosls
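The second-stage standard errors are wrong because OLS of y on xhat bases its residual variance on y - a - b*xhat, whereas the IV residual is y - a - b*x. A minimal sketch of the nonrobust correction follows. It is not part of the original program, the names xhat2, uhat_iv, b_iv, a_iv, rmse_wrong and rmse_right are hypothetical, and it assumes the simulated data are in memory.

```stata
* Hypothetical sketch, not part of the original do-file.
* Redo the two stages manually
quietly regress x z
predict double xhat2, xb
quietly regress y xhat2
scalar b_iv = _b[xhat2]
scalar a_iv = _b[_cons]
scalar rmse_wrong = e(rmse)
* Correct residual uses the actual regressor x, not the fitted value
generate double uhat_iv = y - a_iv - b_iv*x
quietly summarize uhat_iv
* Residual has mean zero, so Var*(N-1)/(N-2) equals SSR/(N-2)
scalar rmse_right = sqrt(r(Var)*(r(N)-1)/(r(N)-2))
* Rescale the second-stage standard error by the ratio of root MSEs
display "corrected se of slope = " _se[xhat2]*rmse_right/rmse_wrong
```

With the values in this log the rescaling is .0141075 x 1.0022/1.4469, or roughly .0098, which is close to the ivreg standard error of .0097723.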
.
. * (5) IV with instrument zcube is consistent but inefficient
. * (last column of Table 4.4)
. ivreg y (x = zcube)
      Source |       SS       df       MS
-------------+------------------------------           F(  1,  9998) = 2001.31
       Model |  13598.1181     1  13598.1181           Prob > F      =  0.0000
    Residual |   10072.761  9998   1.0074776           R-squared     =  0.5745
-------------+------------------------------           Adj R-squared =  0.5744
       Total |  23670.8791  9999  2.36732464           Root MSE      =  1.0037
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .5086427   .0113699    44.74   0.000     .4863555    .5309299
       _cons |  -.0135782   .0249344    -0.54   0.586    -.0624546    .0352982
------------------------------------------------------------------------------
Instrumented:  x
Instruments:   zcube
------------------------------------------------------------------------------
. ivreg y (x = zcube), robust
                                                       F(  1,  9998) = 1894.15
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.5745
                                                       Root MSE      =  1.0037
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .5086427   .0116871    43.52   0.000     .4857337    .5315517
       _cons |  -.0135782   .0253208    -0.54   0.592     -.063212    .0360556
------------------------------------------------------------------------------
Instrumented:  x
Instruments:   zcube
------------------------------------------------------------------------------
. estimates store ivineff
.
. ********** DISPLAY KEY RESULTS in Table 4.4 p.103 **********
.
. * Table 4.4 page 103
. estimates table olswrong iv twosls ivineff, se stats(N r2) b(%8.3f) keep(_cons x xhat)
----------------------------------------------------------
    Variable | olswrong       iv       twosls    ivineff
-------------+--------------------------------------------
       _cons |   -0.804     -0.017     -0.017     -0.014
             |    0.014      0.022      0.032      0.025
           x |    0.902      0.510                 0.509
             |    0.006      0.010                 0.012
        xhat |                          0.510
             |                          0.014
-------------+--------------------------------------------
           N |  1.0e+04    1.0e+04    1.0e+04    1.0e+04
          r2 |    0.709      0.576      0.116      0.574
----------------------------------------------------------
                                              legend: b/se
.
. ********** CLOSE OUTPUT
. log close
log: c:\Imbook\bwebpage\Section2\mma04p3iv.txt
log type: text
closed on: 17 May 2005, 13:44:41
------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma04p4ivweak.txt
log type: text
opened on: 17 May 2005, 13:45:59
.
. ********** OVERVIEW OF MMA04P4IVWEAK.DO **********
.
. * STATA Program
.
. * IV regression with potentially weak instruments
. * (1) Compares OLS and IV estimation of log-wages on schooling regression
. * where schooling, experience and experience-squared are endogenous
. * and proximity to 4-year college, age and age-squared are instruments
. * so model is just-identified.
. * (2) Verifies that here can treat errors as homoskedastic
. * (3) Looks at weak instruments
. * (A) instrument relevance: Whether Shea's partial R-squared is low
. * (B) finite sample bias: whether first-stage partial F is low
. * (4) Provides Table 4.5
. * (5) Does more analysis than reported in the book
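The first-stage partial F check in item (3B) can be sketched as follows. This is an illustrative outline only, not the code from mma04p4ivweak.do: it assumes the analysis sample is in memory and that the global macro exogregressors has already been defined as it is later in this program. It does not compute Shea's partial R-squared.

```stata
* Hypothetical sketch, not taken from the original do-file.
* First-stage regression of each endogenous regressor on the three
* instruments plus the exogenous regressors, followed by a joint test
* of the excluded instruments (the first-stage partial F statistic).
regress grade76 col4 age76 agesq76 $exogregressors
test col4 age76 agesq76
regress exp76 col4 age76 agesq76 $exogregressors
test col4 age76 agesq76
regress expsq76 col4 age76 agesq76 $exogregressors
test col4 age76 agesq76
```

A low F statistic in any of these tests would point to possible finite-sample bias of the IV estimator.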
.
. * To run this program you need data and dictionary files
. * DATA66.dat ASCII data set
. * DATA66.dct Stata dictionary that labels variables
.
. ********** SETUP **********
.
. set more off
. version 8.0
. set memory 20m
(20480k)
. set linesize 150 /* Permits long inputline commands with delimit */
.
. ********** ORIGINAL DATA SOURCE **********
.
. * Program mma04p4ivweak.do based on Kling Analys66.do September 2003
. * written for Jeffrey R. Kling (2001) "Interpreting Instrumental Variables Estimates
. * of the Return to Schooling", Journal of Business and Economic Statistics,
. * July 2001, 19 (3), pp.358-364.
. * This program focuses on Columns (1) and (2) of Kling's Table 1 on p.359
. * in turn based on
. * David Card (1995), "Using Geographic Variation in College Proximity to
. * Estimate the Returns to Schooling", in
. * Aspects of Labor Market Behavior: Essays in Honor of John Vanderkamp,
. * eds. L.N. Christofides et al., Toronto: University of Toronto Press, pp.201-221.
.
. ********** READ IN DATA and SUMMARIZE **********

.
. infile using DATA66.dct, using(DATA66.dat)
dictionary using DATA66.dat {
_column(1)   id       %8f  "ID CODE (r0000100) n= 5225 mean= 2613.000 min= 1 max= 5225"
_column(9)   black    %3f  "Race (r0002300) n= 5225 mean= 1.296 min= 1 max=3"
_column(13)  imigrnt  %3f  "Was r's brthpl in the US? (r0038000) n=4965 mean=0.98 mn=0 mx=1"
_column(17)  hhead    %8f  "Person R lived w/ @ age 14 (r0039700) n= 5213 mean=1.92 mn=1 mx=9"
_column(28)  mag_14   %10f "Were magznes avail at age 14 (r0039900) n=5167 mean=0.69 mn=0 mx=1"
_column(40)  news_14  %10f "Were nwspaprs avail at age 14 (r0040000) n=5195 mean=0.85 mn=0 mx=1"
_column(52)  lib_14   %10f "Were lib-card avail at age14 (r0040100) n=5204 mean=0.66 mn=0 mx=1"
_column(63)  num_sib  %8f  "Tot # sibs r 66 (r0056900) n=5168 mean=3.408 min=0 max=18"
_column(72)  fgrade   %8f  "Hgc by father, 66 (r0063100) n=3930 mean=9.937 min=0 max=18"
_column(81)  mgrade   %8f  "Hgc by mother, 66 (r0063300) n=4573 mean=10.25 min=0 max=18"
_column(90)  iq       %8f  "Iq_score (r0171100) n= 3369 mean=101.582 min=50 max=158"
_column(99)  bdate    %8f  "Birthdate - STATA formatted"
_column(108) gfill76  %8f  "'76 Grade level, some values filled from prevs reports"
_column(117) wt76     %8f  "'76 Weight"
_column(126) grade76  %8f  "'76 Grade level"
_column(135) grade66  %8f  "'66 Grade level"
_column(144) age66    %8f  "Age reported by screener (r0002200)"
_column(153) smsa66   %8f  "If lived in SMSA in 1966 (r0002455=1,2)"
_column(162) region   %8f  "Census Region in 1966 (r0002900)"
_column(171) smsa76   %8f  "If lived in SMSA in 1976 (r0437515=1,2)"
_column(180) col4     %8f  "If any 4-year college nearby (r0004000!=4)"
_column(189) mcol4    %8f  "If male 4-year college nearby (r0004100=1,2)"
_column(198) col4pub  %8f  "If public 4-year college nearby (r0004000=2,3)"
_column(207) south76  %1f  "If lived in South in 1976 (r0437511=1)"
_column(209) wage76   %10f "'76 Wage"
_column(219) exp76    %8f  "'76 experience, (10 + age66) - grade76 - 6)"
_column(230) expsq76  %10f "'76 experience, exp76 ^2/100"
_column(243) age76    %8f  "'76 age (age66 +10)"
_column(252) agesq76  %8f  "'76 age squared (age76^2)"
_column(261) reg1     %8f  "region==NE"
_column(270) reg2     %8f  "If lived in Region 2 (region= MidAtl)"
_column(279) reg3     %8f  "If lived in Region 3 (region= ENC)"
_column(288) reg4     %8f  "If lived in Region 4 (region= WNC)"
_column(297) reg5     %8f  "If lived in Region 5 (region= SA )"
_column(306) reg6     %8f  "If lived in Region 6 (region= ESC)"
_column(315) reg7     %8f  "If lived in Region 7 (region= WSC)"
_column(324) reg8     %8f  "If lived in Region 8 (region= M )"
_column(333) reg9     %8f  "If lived in Region 9 (region= P )"
_column(342) momdad14 %8f  "If lived with both parents at age 14"
_column(351) sinmom14 %8f  "If lived with mother only at age 14"
_column(360) nodaded  %1f  "If father has no formal education"
_column(362) nomomed  %1f  "If mother has no formal education"
_column(365) daded    %10f "Mean grade level of father"
_column(377) momed    %10f "Mean grade level of mother"
_column(396) famed    %8f  "Father's and mother's education"
_column(405) famed1   %8f  "If mgrade> 12 & fgrade> 12 (famed=1)"
_column(414) famed2   %8f  "If mgrade>=12 & fgrade>=12 (famed=2)"
_column(423) famed3   %8f  "If mgrade==12 & fgrade==12 (famed=3)"
_column(432) famed4   %8f  "If mgrade>=12 & fgrade==-1 (famed=4)"
_column(441) famed5   %8f  "If fgrade>=12 (famed=5)"
_column(450) famed6   %8f  "If mgrade>=12 & fgrade> -1 (famed=6)"
_column(459) famed7   %8f  "If mgrade>=9 & fgrade>=9 (famed=7)"
_column(468) famed8   %8f  "If mgrade> -1 & fgrade> -1 (famed=8)"
_column(477) famed9   %8f  "If famed not in range (1-8)"
_column(486) int76    %8f  "If wt76 not missing"
_column(495) age1415  %8f  "If in age group =14-15"
_column(540) cage1415 %8f  "If in age group =14,15 and lived near college"
_column(576) cage2224 %8f  "If in age group =20-24 and lived near college"
_column(585) cage66   %8f  "Age in 66 and whether lived near college"
_column(594) a1       %8f  "If age in 66 = 14 (age66= 14)"
_column(603) a2       %8f  "If age in 66 = 15 (age66= 15)"
_column(612) a3       %8f  "If age in 66 = 16 (age66= 16)"
_column(621) a4       %8f  "If age in 66 = 17 (age66= 17)"
_column(630) a5       %8f  "If age in 66 = 18 (age66= 18)"
_column(639) a6       %8f  "If age in 66 = 19 (age66= 19)"
_column(648) a7       %8f  "If age in 66 = 20 (age66= 20)"
_column(657) a8       %8f  "If age in 66 = 21 (age66= 21)"
_column(666) a9       %8f  "If age in 66 = 22 (age66= 22)"
_column(675) a10      %8f  "If age in 66 = 23 (age66= 23)"
_column(684) a11      %8f  "If age in 66 = 24 (age66= 24)"
_column(693) ca1      %8f  "Not lived near college in 66"
_column(702) ca2      %8f  "If age in 66 = 14 and lived near college"
_column(711) ca3
_column(720) ca4
_column(729) ca5
_column(738) ca6
_column(747) ca7
_column(756) ca8
_column(765) ca9
_column(774) ca10
_column(777) ca11
_column(780) ca12
_column(782) g25      %12f "Grade level when 25 years old"
_column(795) g25i     %12f "If =g25 and intrvwed in year used for determining g25"
_column(819) intmo66  %8f  "Intvw month in 1966, used to identify cases incl by CARD"
_column(828) nlsflt   %8f  "Flag to identify if the case was used by CARD"
_column(837) nsib     %8f  "Number of siblings"
_column(846) ns1      %8f  "If number of siblings = 0 (nsib= 0)"
_column(855) ns2
_column(864) ns3
_column(873) ns4
_column(882) ns5
_column(891) ns6
_column(900) ns7      %8f  "If number of siblings =18 (nsib=18)"
}
(5226 observations read)
. * save DATA66, replace
. desc
Contains data
  obs:         5,226
 vars:           101
 size:     2,132,208 (89.8% of memory free)
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
id              float  %9.0g                  ID CODE (r0000100) n= 5225 mean= 2613.000 min= 1 max= 5225
black           float  %9.0g                  Race (r0002300) n= 5225 mean= 1.296 min= 1 max=3
imigrnt         float  %9.0g                  Was r's brthpl in the US? (r0038000) n=4965 mean=0.98 mn=0 mx=1
hhead           float  %9.0g                  Person R lived w/ @ age 14 (r0039700) n= 5213 mean=1.92 mn=1 mx=9
mag_14          float  %9.0g                  Were magznes avail at age 14 (r0039900) n=5167 mean=0.69 mn=0 mx=1
news_14         float  %9.0g                  Were nwspaprs avail at age 14 (r0040000) n=5195 mean=0.85 mn=0 mx=1
lib_14          float  %9.0g                  Were lib-card avail at age14 (r0040100) n=5204 mean=0.66 mn=0 mx=1
num_sib         float  %9.0g                  Tot # sibs r 66 (r0056900) n=5168 mean=3.408 min=0 max=18
fgrade          float  %9.0g                  Hgc by father, 66 (r0063100) n=3930 mean=9.937 min=0 max=18
mgrade          float  %9.0g                  Hgc by mother, 66 (r0063300) n=4573 mean=10.25 min=0 max=18
iq              float  %9.0g                  Iq_score (r0171100) n= 3369 mean=101.582 min=50 max=158
bdate           float  %9.0g                  Birthdate - STATA formatted
gfill76         float  %9.0g                  '76 Grade level, some values filled from prevs reports
wt76            float  %9.0g                  '76 Weight
grade76         float  %9.0g                  '76 Grade level
grade66         float  %9.0g                  '66 Grade level
age66           float  %9.0g                  Age reported by screener (r0002200)
smsa66          float  %9.0g                  If lived in SMSA in 1966 (r0002455=1,2)
region          float  %9.0g                  Census Region in 1966 (r0002900)
smsa76          float  %9.0g                  If lived in SMSA in 1976 (r0437515=1,2)
col4            float  %9.0g                  If any 4-year college nearby (r0004000!=4)
mcol4           float  %9.0g                  If male 4-year college nearby (r0004100=1,2)
col4pub         float  %9.0g                  If public 4-year college nearby (r0004000=2,3)
south76         float  %9.0g                  If lived in South in 1976 (r0437511=1)
wage76          float  %9.0g                  '76 Wage
exp76           float  %9.0g                  '76 experience, (10 + age66) - grade76 - 6)
expsq76         float  %9.0g                  '76 experience, exp76 ^2/100
age76           float  %9.0g                  '76 age (age66 +10)
agesq76         float  %9.0g                  '76 age squared (age76^2)
reg1            float  %9.0g                  region==NE
reg2            float  %9.0g                  If lived in Region 2 (region= MidAtl)
reg3            float  %9.0g                  If lived in Region 3 (region= ENC)
reg4            float  %9.0g                  If lived in Region 4 (region= WNC)
reg5            float  %9.0g                  If lived in Region 5 (region= SA )
reg6            float  %9.0g                  If lived in Region 6 (region= ESC)
reg7            float  %9.0g                  If lived in Region 7 (region= WSC)
reg8            float  %9.0g                  If lived in Region 8 (region= M )
reg9            float  %9.0g                  If lived in Region 9 (region= P )
momdad14        float  %9.0g                  If lived with both parents at age 14
sinmom14        float  %9.0g                  If lived with mother only at age 14
nodaded         float  %9.0g                  If father has no formal education
nomomed         float  %9.0g                  If mother has no formal education
daded           float  %9.0g                  Mean grade level of father
momed           float  %9.0g                  Mean grade level of mother
famed           float  %9.0g                  Father's and mother's education
famed1          float  %9.0g                  If mgrade> 12 & fgrade> 12 (famed=1)
famed2          float  %9.0g                  If mgrade>=12 & fgrade>=12 (famed=2)
famed3          float  %9.0g                  If mgrade==12 & fgrade==12 (famed=3)
famed4          float  %9.0g                  If mgrade>=12 & fgrade==-1 (famed=4)
famed5          float  %9.0g                  If fgrade>=12 (famed=5)
famed6          float  %9.0g                  If mgrade>=12 & fgrade> -1 (famed=6)
famed7          float  %9.0g                  If mgrade>=9 & fgrade>=9 (famed=7)
famed8          float  %9.0g                  If mgrade> -1 & fgrade> -1 (famed=8)
famed9          float  %9.0g                  If famed not in range (1-8)
int76           float  %9.0g                  If wt76 not missing
age1415         float  %9.0g                  If in age group =14-15
age1617         float  %9.0g
age1819         float  %9.0g
age2021         float  %9.0g
age2224         float  %9.0g
cage1415        float  %9.0g                  If in age group =14,15 and lived near college
cage1617        float  %9.0g                  lived near college
cage1819        float  %9.0g                  lived near college
cage2021        float  %9.0g                  lived near college
cage2224        float  %9.0g                  If in age group =20-24 and lived near college
cage66          float  %9.0g                  Age in 66 and whether lived near college
a1              float  %9.0g                  If age in 66 = 14 (age66= 14)
a2              float  %9.0g                  If age in 66 = 15 (age66= 15)
a3              float  %9.0g                  If age in 66 = 16 (age66= 16)
a4              float  %9.0g                  If age in 66 = 17 (age66= 17)
a5              float  %9.0g                  If age in 66 = 18 (age66= 18)
a6              float  %9.0g                  If age in 66 = 19 (age66= 19)
a7              float  %9.0g                  If age in 66 = 20 (age66= 20)
a8              float  %9.0g                  If age in 66 = 21 (age66= 21)
a9              float  %9.0g                  If age in 66 = 22 (age66= 22)
a10             float  %9.0g                  If age in 66 = 23 (age66= 23)
a11             float  %9.0g                  If age in 66 = 24 (age66= 24)
ca1             float  %9.0g                  Not lived near college in 66
ca2             float  %9.0g                  If age in 66 = 14 and lived near college
ca3             float  %9.0g                  near college
ca4             float  %9.0g                  near college
ca5             float  %9.0g                  near college
ca6             float  %9.0g                  near college
ca7             float  %9.0g                  near college
ca8             float  %9.0g                  near college
ca9             float  %9.0g                  near college
ca10            float  %9.0g                  near college
ca11            float  %9.0g                  near college
ca12            float  %9.0g                  near college
g25             float  %9.0g                  Grade level when 25 years old
g25i            float  %9.0g                  If =g25 and intrvwed in year used for determining g25
intmo66         float  %9.0g                  Intvw month in 1966, used to identify cases incl by CARD
nlsflt          float  %9.0g                  Flag to identify if the case was used by CARD
nsib            float  %9.0g                  Number of siblings
ns1             float  %9.0g                  If number of siblings = 0 (nsib= 0)
ns2             float  %9.0g                  (nsib= 2)
ns3             float  %9.0g                  (nsib= 3)
ns4             float  %9.0g                  (nsib= 4)
ns5             float  %9.0g                  (nsib= 6)
ns6             float  %9.0g                  (nsib= 9)
ns7             float  %9.0g                  If number of siblings =18 (nsib=18)
------------------------------------------------------------------------------
Sorted by:
. sum
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
          id |      5225        2613    1508.472          1       5225
       black |      5225    .2752153    .4466655          0          1
     imigrnt |      5225    .0237321    .1522277          0          1
       hhead |      5225   -.3783732    47.95128       -999          9
      mag_14 |      5225    .6861566    .4616275          0          1
-------------+---------------------------------------------------------
     news_14 |      5225    .8483024    .3577176          0          1
      lib_14 |      5225     .658469    .4733619          0          1
     num_sib |      5168    3.407701    2.586307          0         18
      fgrade |      3930     9.93715    3.777654          0         18
      mgrade |      4573    10.25104     3.17986          0         18
-------------+---------------------------------------------------------
          iq |      3369    101.5818    15.93225         50        158
       bdate |      5204    472926.6    31765.04     360823     521224
     gfill76 |      5225    12.78718    2.802705          0         18
        wt76 |      3695    475512.5    265188.5      98617    2582192
     grade76 |      3671    13.23018    2.747627          0         18
-------------+---------------------------------------------------------
     grade66 |      5225    10.58431    2.433696          0         18
       age66 |      5225    18.09129    3.157657         14         24
      smsa66 |      5225    .6599043    .4737864          0          1
      region |      5225    4.721722    2.300767          1          9
      smsa76 |      5225     .491866    .4999817          0          1
-------------+---------------------------------------------------------
        col4 |      5225     .691866    .4617664          0          1
       mcol4 |      5225    .6874641    .4635713          0          1
     col4pub |      5225    .5129187    .4998809          0          1
     south76 |      3695    .3964817    .4892328          0          1
      wage76 |      3078    1.658013    .4430234          0     3.1797
-------------+---------------------------------------------------------
       exp76 |      3671    8.933533    4.212664          0         25
     expsq76 |      3671    .9754971    .8778352          0       6.25
       age76 |      5225    28.09129    3.157657         24         34
     agesq76 |      5225    799.0896    182.0539        576       1156
        reg1 |      5225         .04    .1959779          0          1
-------------+---------------------------------------------------------
        reg2 |      5225    .1617225    .3682313          0          1
        reg3 |      5225    .1900478    .3923763          0          1
        reg4 |      5225    .0639234    .2446399          0          1
        reg5 |      5225    .2126316    .4092083          0          1
        reg6 |      5225    .0895694    .2855912          0          1
-------------+---------------------------------------------------------
        reg7 |      5225    .1083254    .3108206          0          1
        reg8 |      5225    .0304306    .1717855          0          1
        reg9 |      5225    .1033493    .3044437          0          1
    momdad14 |      5225    .7680383    .4221251          0          1
    sinmom14 |      5225    .1182775    .3229673          0          1
-------------+---------------------------------------------------------
     nodaded |      5225    .2478469    .4318038          0          1
     nomomed |      5225    .1247847    .3305062          0          1
       daded |      5225    9.937162    3.276134          0         18
       momed |      5225    10.25103    2.974812          0         18
       famed |      5225     6.05933    2.643855          1          9
-------------+---------------------------------------------------------
      famed1 |      5225    .0610526    .2394497          0          1
      famed2 |      5225    .0742584     .262216          0          1
      famed3 |      5225    .1144498    .3183872          0          1
      famed4 |      5225    .0474641    .2126498          0          1
      famed5 |      5225     .077512    .2674276          0          1
-------------+---------------------------------------------------------
      famed6 |      5225    .1245933    .3302888          0          1
      famed7 |      5225    .0486124     .215077          0          1
      famed8 |      5225    .2273684    .4191726          0          1
      famed9 |      5225     .224689    .4174173          0          1
       int76 |      5225     .707177    .4551014          0          1
-------------+---------------------------------------------------------
     age1415 |      5225    .2595215    .4384141          0          1
     age1617 |      5225    .2482297    .4320271          0          1
     age1819 |      5225    .1751196    .3801058          0          1
     age2021 |      5225      .11311    .3167576          0          1
     age2224 |      5225    .2040191    .4030216          0          1
-------------+---------------------------------------------------------
    cage1415 |      5225    .1755024    .3804327          0          1
    cage1617 |      5225    .1680383    .3739361          0          1
    cage1819 |      5225    .1245933    .3302888          0          1
    cage2021 |      5225    .0796172    .2707256          0          1
    cage2224 |      5225    .1441148    .3512397          0          1
-------------+---------------------------------------------------------
      cage66 |      5225    12.56115    8.785895          0         24
          a1 |      5225    .1314833    .3379605          0          1
          a2 |      5225    .1280383    .3341644          0          1
          a3 |      5225    .1326316    .3392086          0          1
          a4 |      5225    .1155981    .3197729          0          1
-------------+---------------------------------------------------------
          a5 |      5225     .098756    .2983627          0          1
          a6 |      5225    .0763636    .2656045          0          1
          a7 |      5225    .0560766    .2300915          0          1
          a8 |      5225    .0570335    .2319288          0          1
          a9 |      5225    .0666029    .2493568          0          1
-------------+---------------------------------------------------------
         a10 |      5225    .0683254    .2523275          0          1
         a11 |      5225    .0690909    .2536329          0          1
         ca1 |      5225     .308134    .4617664          0          1
         ca2 |      5225    .0876555    .2828203          0          1
         ca3 |      5225    .0878469    .2830992          0          1
-------------+---------------------------------------------------------
         ca4 |      5225    .0870813    .2819812          0          1
         ca5 |      5225    .0809569    .2727951          0          1
         ca6 |      5225    .0708134    .2565374          0          1
         ca7 |      5225    .0537799    .2256044          0          1
         ca8 |      5225    .0390431     .193716          0          1
-------------+---------------------------------------------------------
         ca9 |      5225    .0405742    .1973204          0          1
        ca10 |      5225    .0465072    .2106009          0          1
        ca11 |      5225    .0484211    .2146748          0          1
        ca12 |      5225    12.52593    2.740455          0         18
         g25 |      5225    12.53923    2.749407          0         18
-------------+---------------------------------------------------------
        g25i |      4148    12.77929    2.740756          0         18
     intmo66 |      5225   -5.790239    128.4984       -999         12
      nlsflt |      5225    .9835407    .1272459          0          1
        nsib |      5225    2.818565    2.473752          0         18
         ns1 |      5225    .2547368    .4357549          0          1
-------------+---------------------------------------------------------
         ns2 |      5225    .3534928    .4780998          0          1
         ns3 |      5225    .0109091    .1038853          0          1
         ns4 |      5225    .1892823    .3917702          0          1
         ns5 |      5225     .135311    .3420882          0          1
         ns6 |      5225    .0558852    .2297218          0          1
-------------+---------------------------------------------------------
         ns7 |      5225    .0003828    .0195628          0          1
.
. * Define the exogenous regressors using the global macro exogregressors
. global exogregressors black south76 smsa76 reg2-reg9 /*
> */ smsa66 momdad14 sinmom14 nodaded nomomed daded momed famed1-famed8
.
. * Write data to a text (ascii) file so can use with programs other than stata
. outfile wage76 grade76 exp76 expsq76 col4 age76 agesq76 black south76 smsa76 reg2-reg9 /*
> */ smsa66 momdad14 sinmom14 nodaded nomomed daded momed famed1-famed8 /*
> */ using mma04p4ivweak.asc, replace
.
.
. ********** (1) OLS AND IV ESTIMATES: COLUMNS 1 AND 2 OF KLING TABLE 1
.
. * RETAIN cases for the analysis
. * Here drop if missing wages or missing schooling or not at first interview
. keep if wage76!=. & grade76!=. & nlsflt==1
.
. * DESCRIBE dependent variable, regressors and instruments
. desc wage76 grade76 exp76 expsq76 col4 age76 agesq76 $exogregressors
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
wage76          float  %9.0g                  '76 Wage
grade76         float  %9.0g                  '76 Grade level
exp76           float  %9.0g                  '76 experience, (10 + age66) - grade76 - 6)
expsq76         float  %9.0g                  '76 experience, exp76 ^2/100
col4            float  %9.0g                  If any 4-year college nearby (r0004000!=4)
age76           float  %9.0g                  '76 age (age66 +10)
agesq76         float  %9.0g                  '76 age squared (age76^2)
black           float  %9.0g                  Race (r0002300) n= 5225 mean= 1.296 min= 1 max=3
south76         float  %9.0g                  If lived in South in 1976 (r0437511=1)
smsa76          float  %9.0g                  If lived in SMSA in 1976 (r0437515=1,2)
reg2            float  %9.0g                  If lived in Region 2 (region= MidAtl)
reg3            float  %9.0g                  If lived in Region 3 (region= ENC)
reg4            float  %9.0g                  If lived in Region 4 (region= WNC)
reg5            float  %9.0g                  If lived in Region 5 (region= SA )
reg6            float  %9.0g                  If lived in Region 6 (region= ESC)
reg7            float  %9.0g                  If lived in Region 7 (region= WSC)
reg8            float  %9.0g                  If lived in Region 8 (region= M )
reg9            float  %9.0g                  If lived in Region 9 (region= P )
smsa66          float  %9.0g                  If lived in SMSA in 1966 (r0002455=1,2)
momdad14        float  %9.0g                  If lived with both parents at age 14
sinmom14        float  %9.0g                  If lived with mother only at age 14
nodaded         float  %9.0g                  If father has no formal education
nomomed         float  %9.0g                  If mother has no formal education
daded           float  %9.0g                  Mean grade level of father
momed           float  %9.0g                  Mean grade level of mother
famed1          float  %9.0g                  If mgrade> 12 & fgrade> 12 (famed=1)
famed2          float  %9.0g                  If mgrade>=12 & fgrade>=12 (famed=2)
famed3          float  %9.0g                  If mgrade==12 & fgrade==12 (famed=3)
famed4          float  %9.0g                  If mgrade>=12 & fgrade==-1 (famed=4)
famed5          float  %9.0g                  If fgrade>=12 (famed=5)
famed6          float  %9.0g                  If mgrade>=12 & fgrade> -1 (famed=6)
famed7          float  %9.0g                  If mgrade>=9 & fgrade>=9 (famed=7)
famed8          float  %9.0g                  If mgrade> -1 & fgrade> -1 (famed=8)
.
. * SUMMARIZE dependent variable, regressors and instruments
. sum wage76 grade76 exp76 expsq76 col4 age76 agesq76 $exogregressors
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
      wage76 |      3010    1.656664     .443798          0     3.1797
     grade76 |      3010    13.26346    2.676913          1         18
       exp76 |      3010    8.856146    4.141672          0         23
     expsq76 |      3010    .9557907    .8461831          0       5.29
        col4 |      3010    .6820598    .4657535          0          1
-------------+---------------------------------------------------------
       age76 |      3010     28.1196    3.137004         24         34
     agesq76 |      3010    800.5495    180.7484        576       1156
       black |      3010    .2335548    .4231624          0          1
     south76 |      3010    .4036545    .4907113          0          1
      smsa76 |      3010    .7129568    .4524571          0          1
-------------+---------------------------------------------------------
        reg2 |      3010    .1607973     .367405          0          1
        reg3 |      3010    .1956811      .39679          0          1
        reg4 |      3010    .0641196    .2450066          0          1
        reg5 |      3010    .2083056     .406164          0          1
        reg6 |      3010    .0960133    .2946584          0          1
-------------+---------------------------------------------------------
        reg7 |      3010    .1099668    .3129003          0          1
        reg8 |      3010    .0282392     .165683          0          1
        reg9 |      3010    .0903654    .2867522          0          1
      smsa66 |      3010    .6495017    .4772053          0          1
    momdad14 |      3010    .7893688    .4078247          0          1
-------------+---------------------------------------------------------
    sinmom14 |      3010    .1006645    .3009339          0          1
     nodaded |      3010    .2292359    .4204111          0          1
     nomomed |      3010    .1172757     .321802          0          1
       daded |      3010    9.988262    3.266511          0         18
       momed |      3010    10.33675    2.987507          0         18
-------------+---------------------------------------------------------
      famed1 |      3010    .0614618    .2402153          0          1
      famed2 |      3010    .0787375    .2693734          0          1
      famed3 |      3010    .1249169    .3306796          0          1
      famed4 |      3010    .0475083    .2127588          0          1
      famed5 |      3010    .0790698    .2698925          0          1
-------------+---------------------------------------------------------
      famed6 |      3010    .1328904    .3395126          0          1
      famed7 |      3010    .0504983    .2190073          0          1
      famed8 |      3010    .2202658    .4144947          0          1
.
. * OLS estimates of return to schooling.
. * This regression computes schooling coeff, se for Table1 col 1 p.359
. * based on all cases (age grp 14-24) reported highest grd cmpl 76
.
. reg wage76 grade76 exp76 expsq76 $exogregressors
      Source |       SS       df       MS
-------------+------------------------------           F( 29,  2980) =   44.94
       Model |  180.320527    29  6.21794919           Prob > F      =  0.0000
    Residual |   412.32209  2980  .138363117           R-squared     =  0.3043
-------------+------------------------------           Adj R-squared =  0.2975
       Total |  592.642616  3009  .196956669           Root MSE      =  .37197
------------------------------------------------------------------------------
      wage76 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     grade76 |    .072635   .0036984    19.64   0.000     .0653833    .0798868
       exp76 |   .0845293   .0066819    12.65   0.000     .0714277    .0976308
     expsq76 |  -.2289581   .0319499    -7.17   0.000    -.2916041   -.1663121
       black |  -.1894065   .0194462    -9.74   0.000    -.2275358   -.1512773
     south76 |  -.1464841   .0260345    -5.63   0.000    -.1975314   -.0954368
      smsa76 |   .1377121   .0201334     6.84   0.000     .0982353    .1771889
        reg2 |   .1023805   .0360137     2.84   0.005     .0317662    .1729947
        reg3 |   .1488958   .0352521     4.22   0.000     .0797748    .2180168
        reg4 |   .0601267   .0417556     1.44   0.150     -.021746    .1419994
        reg5 |   .1348504   .0419098     3.22   0.001     .0526752    .2170255
        reg6 |   .1452831   .0453155     3.21   0.001     .0564302    .2341359
        reg7 |   .1301968    .044965     2.90   0.004     .0420312    .2183624
        reg8 |  -.0444289   .0513937    -0.86   0.387    -.1451997    .0563419
        reg9 |   .1285658   .0389959     3.30   0.001     .0521042    .2050274
      smsa66 |   .0233775    .019544     1.20   0.232    -.0149436    .0616987
    momdad14 |   .0693317   .0263402     2.63   0.009      .017685    .1209785
    sinmom14 |   .0335387   .0354168     0.95   0.344    -.0359052    .1029825
     nodaded |  -.0390477   .0531089    -0.74   0.462    -.1431815    .0650862
     nomomed |   .0168143   .0348295     0.48   0.629     -.051478    .0851066
       daded |  -.0017839   .0043977    -0.41   0.685    -.0104068    .0068389
       momed |   .0081443   .0041513     1.96   0.050     4.64e-06    .0162839
      famed1 |  -.1166029   .0788125    -1.48   0.139    -.2711354    .0379296
      famed2 |   -.052544   .0712753    -0.74   0.461    -.1922977    .0872097
      famed3 |  -.0719675   .0654608    -1.10   0.272    -.2003205    .0563856
      famed4 |  -.0197095   .0437058    -0.45   0.652    -.1054062    .0659872
      famed5 |  -.0252185   .0643526    -0.39   0.695    -.1513985    .1009615
      famed6 |  -.0733887   .0621076    -1.18   0.237    -.1951667    .0483894
      famed7 |   -.059927   .0656929    -0.91   0.362     -.188735     .068881
      famed8 |  -.0738951   .0572428    -1.29   0.197    -.1861345    .0383444
       _cons |  -.0278815   .1005974    -0.28   0.782    -.2251288    .1693659
------------------------------------------------------------------------------
. estimates store ols
.
. * IV Instrumental variables estimates of return to schooling.
. * This regression computes schooling coeff and se for Table 1. col 2 p.359
. * Endogenous variables: schooling, experience, experience squared
. * Excl instruments: college in cnty, age age^2
. * based on all cases (age grp 14-24) reported highest grd cmpl 76 ***/
.
. ivreg wage76 $exogregressors /*
> */ (grade76 exp76 expsq76 = col4 age76 agesq76 $exogregressors)
      Source |       SS       df       MS
-------------+------------------------------           F( 29,  2980) =   34.56
       Model |  122.395448    29  4.22053269           Prob > F      =  0.0000
    Residual |  470.247169  2980  .157801063           R-squared     =  0.2065
-------------+------------------------------           Adj R-squared =  0.1988
       Total |  592.642616  3009  .196956669           Root MSE      =  .39724
------------------------------------------------------------------------------
      wage76 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     grade76 |   .1324485   .0493419     2.68   0.007     .0357009    .2291961
       exp76 |   .0632411   .0241061     2.62   0.009     .0159748    .1105074
     expsq76 |  -.1266694   .1184765    -1.07   0.285    -.3589735    .1056347
       black |  -.1643766   .0292248    -5.62   0.000    -.2216795   -.1070737
     south76 |  -.1400178   .0283887    -4.93   0.000    -.1956812   -.0843545
      smsa76 |   .0909867   .0441338     2.06   0.039     .0044509    .1775224
        reg2 |   .0753178   .0444167     1.70   0.090    -.0117726    .1624083
        reg3 |   .1231473   .0431763     2.85   0.004      .038489    .2078057
        reg4 |   .0241968   .0534911     0.45   0.651    -.0806865    .1290801
        reg5 |   .1247819   .0455148     2.74   0.006     .0355383    .2140255
        reg6 |    .135761   .0490304     2.77   0.006      .039624    .2318979
        reg7 |   .1063645   .0519274     2.05   0.041     .0045472    .2081817
        reg8 |  -.0850609    .064327    -1.32   0.186    -.2111907    .0410688
        reg9 |   .0916464   .0515551     1.78   0.076    -.0094409    .1927337
      smsa66 |   .0379821   .0241116     1.58   0.115    -.0092951    .0852592
    momdad14 |    .043168   .0354056     1.22   0.223    -.0262539      .11259
    sinmom14 |    .025849   .0383465     0.67   0.500    -.0493392    .1010373
     nodaded |  -.0462392   .0570684    -0.81   0.418    -.1581366    .0656583
     nomomed |   .0266252   .0383434     0.69   0.487     -.048557    .1018074
       daded |  -.0110565   .0089768    -1.23   0.218    -.0286579    .0065449
       momed |  -.0017539   .0093223    -0.19   0.851    -.0200326    .0165249
      famed1 |   -.213271   .1160049    -1.84   0.066    -.4407287    .0141867
      famed2 |  -.1567074   .1145696    -1.37   0.171    -.3813508    .0679361
      famed3 |  -.1354685   .0872725    -1.55   0.121    -.3065889     .035652
      famed4 |  -.0707323   .0627189    -1.13   0.260     -.193709    .0522444
      famed5 |  -.0699675    .077928    -0.90   0.369    -.2227656    .0828306
      famed6 |  -.1171712   .0754408    -1.55   0.120    -.2650926    .0307502
      famed7 |  -.0921498   .0749801    -1.23   0.219    -.2391679    .0548683
      famed8 |  -.1184618   .0713021    -1.66   0.097    -.2582681    .0213445
       _cons |  -.4311125   .3567904    -1.21   0.227    -1.130693    .2684678
------------------------------------------------------------------------------
Instrumented:  grade76 exp76 expsq76
Instruments:   black south76 smsa76 reg2 reg3 reg4 reg5 reg6 reg7 reg8 reg9
               smsa66 momdad14 sinmom14 nodaded nomomed daded momed famed1
               famed2 famed3 famed4 famed5 famed6 famed7 famed8 col4 age76
               agesq76
------------------------------------------------------------------------------
. estimates store iv
.
. ********** (2) NEW ANALYSIS: HETEROSKEDASTIC ROBUST STANDARD ERRORS **********
.
. * Allowing for heteroskedastic errors makes little difference here.
.
. quietly reg wage76 grade76 exp76 expsq76 $exogregressors
. hettest    /* Shows that here there is no heteroskedasticity for OLS */
Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of wage76
         chi2(1)      =     0.42
         Prob > chi2  =   0.5191
. quietly reg wage76 grade76 exp76 expsq76 $exogregressors, robust
. estimates store olshet
.
. quietly ivreg wage76 $exogregressors /*
> */ (grade76 exp76 expsq76 = col4 age76 agesq76 $exogregressors), robust
. estimates store ivhet
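
The comparison of default and robust standard errors above can be illustrated outside Stata. The following is a minimal Python sketch (not part of the original program) for a single-regressor OLS model. It assumes the robust variance is the usual sandwich form with an n/(n-k) finite-sample scaling, which is my understanding of what the robust option does for regress. The function name ols_simple and the four data points are hypothetical.

```python
import math

def ols_simple(x, y):
    """OLS of y on a constant and x.
    Returns the slope, its classical SE, and its heteroskedasticity-robust SE."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Classical variance: s^2 / Sxx with s^2 = RSS / (n - 2)
    s2 = sum(u * u for u in resid) / (n - 2)
    se_classical = math.sqrt(s2 / sxx)
    # Sandwich variance: Sum (x - xbar)^2 u^2 / Sxx^2, scaled by n / (n - 2)
    meat = sum(((xi - xbar) ** 2) * u * u for xi, u in zip(x, resid))
    se_robust = math.sqrt(n / (n - 2) * meat / sxx ** 2)
    return slope, se_classical, se_robust

# Hypothetical data whose squared residuals are all equal
x = [-1.0, -1.0, 1.0, 1.0]
y = [-2.0, 0.0, 0.0, 2.0]
slope, se_c, se_r = ols_simple(x, y)
print(slope, se_c, se_r)
```

When every squared residual is the same, as in this toy data, the two standard errors coincide. They differ only to the extent that the squared residuals vary with the regressor, which is consistent with the small differences between the ols and olshet columns reported here.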
.
. **** DISPLAY RESULTS IN TABLE 4.5 p.111
.
. * Table 4.5 p.111: OLS and IV estimates, s.e.'s and R^2 in Table 4.5
.
. * Table reports only the coefficient and standard errors for grade76
. estimates table ols olshet iv ivhet, /*
> */ se stats(N ll r2 rss mss rmse df_r) b(%10.4f)
--------------------------------------------------------------
    Variable |     ols        olshet         iv        ivhet
-------------+------------------------------------------------
     grade76 |      0.0726      0.0726      0.1324      0.1324
             |      0.0037      0.0039      0.0493      0.0488
       exp76 |      0.0845      0.0845      0.0632      0.0632
             |      0.0067      0.0068      0.0241      0.0241
     expsq76 |     -0.2290     -0.2290     -0.1267     -0.1267
             |      0.0319      0.0322      0.1185      0.1182
       black |     -0.1894     -0.1894     -0.1644     -0.1644
             |      0.0194      0.0198      0.0292      0.0285
     south76 |     -0.1465     -0.1465     -0.1400     -0.1400
             |      0.0260      0.0280      0.0284      0.0292
      smsa76 |      0.1377      0.1377      0.0910      0.0910
             |      0.0201      0.0193      0.0441      0.0440
        reg2 |      0.1024      0.1024      0.0753      0.0753
             |      0.0360      0.0350      0.0444      0.0432
        reg3 |      0.1489      0.1489      0.1231      0.1231
             |      0.0353      0.0338      0.0432      0.0418
        reg4 |      0.0601      0.0601      0.0242      0.0242
             |      0.0418      0.0412      0.0535      0.0531
        reg5 |      0.1349      0.1349      0.1248      0.1248
             |      0.0419      0.0428      0.0455      0.0459
        reg6 |      0.1453      0.1453      0.1358      0.1358
             |      0.0453      0.0452      0.0490      0.0483
        reg7 |      0.1302      0.1302      0.1064      0.1064
             |      0.0450      0.0457      0.0519      0.0516
        reg8 |     -0.0444     -0.0444     -0.0851     -0.0851
             |      0.0514      0.0509      0.0643      0.0619
        reg9 |      0.1286      0.1286      0.0916      0.0916
             |      0.0390      0.0388      0.0516      0.0504
      smsa66 |      0.0234      0.0234      0.0380      0.0380
             |      0.0195      0.0187      0.0241      0.0231
    momdad14 |      0.0693      0.0693      0.0432      0.0432
             |      0.0263      0.0257      0.0354      0.0352
    sinmom14 |      0.0335      0.0335      0.0258      0.0258
             |      0.0354      0.0359      0.0383      0.0384
     nodaded |     -0.0390     -0.0390     -0.0462     -0.0462
             |      0.0531      0.0511      0.0571      0.0550
     nomomed |      0.0168      0.0168      0.0266      0.0266
             |      0.0348      0.0344      0.0383      0.0375
       daded |     -0.0018     -0.0018     -0.0111     -0.0111
             |      0.0044      0.0044      0.0090      0.0089
       momed |      0.0081      0.0081     -0.0018     -0.0018
             |      0.0042      0.0042      0.0093      0.0093
      famed1 |     -0.1166     -0.1166     -0.2133     -0.2133
             |      0.0788      0.0792      0.1160      0.1160
      famed2 |     -0.0525     -0.0525     -0.1567     -0.1567
             |      0.0713      0.0698      0.1146      0.1132
      famed3 |     -0.0720     -0.0720     -0.1355     -0.1355
             |      0.0655      0.0644      0.0873      0.0865
      famed4 |     -0.0197     -0.0197     -0.0707     -0.0707
             |      0.0437      0.0416      0.0627      0.0601
      famed5 |     -0.0252     -0.0252     -0.0700     -0.0700
             |      0.0644      0.0625      0.0779      0.0763
      famed6 |     -0.0734     -0.0734     -0.1172     -0.1172
             |      0.0621      0.0601      0.0754      0.0735
      famed7 |     -0.0599     -0.0599     -0.0921     -0.0921
             |      0.0657      0.0640      0.0750      0.0730
      famed8 |     -0.0739     -0.0739     -0.1185     -0.1185
             |      0.0572      0.0545      0.0713      0.0682
       _cons |     -0.0279     -0.0279     -0.4311     -0.4311
             |      0.1006      0.0997      0.3568      0.3528
-------------+------------------------------------------------
           N |   3010.0000   3010.0000   3010.0000   3010.0000
          ll |  -1279.2297  -1279.2297
          r2 |      0.3043      0.3043      0.2065      0.2065
         rss |    412.3221    412.3221    470.2472    470.2472
         mss |    180.3205    180.3205    122.3954    122.3954
        rmse |      0.3720      0.3720      0.3972      0.3972
        df_r |   2980.0000   2980.0000   2980.0000   2980.0000
--------------------------------------------------------------
                                                  legend: b/se
.
. ********** (3) NEW ANALYSIS: CHECK FOR WEAK INSTRUMENTS **********
.
. * Model is y = b1*x1 + x2'b2 + u
. * where x1 is scalar endogenous (grade76)
. * where x2 is vector of regressors that includes
. *       exp76 and expsq76 which are also endogenous
. *       and $exogregressors which are exogenous
. * and the instruments Z are col4 age76 agesq76 $exogregressors
.
. * Check for weak instruments
. * Focus on grade76 but can also do this for the other two endogenous regressors.
. * In this example no problems for the other two:
. * as age and age-squared are good instruments for exp and exp-squared.
.
. **** (A) Simple analysis R-squared and F-test [Given in Table 4.5]
.
. * R2 from regress endogenous regressor on instruments
. * This is the same as the squared correlation between x1 and the projection of x1 on Z
. quietly reg grade76 col4 age76 agesq76 $exogregressors
. di e(r2) " r2 of x1 on Z"
.29677588 r2 of x1 on Z
.
. * Do the partial F-test on the three instruments
. * This is the standard first-stage regression F-test

.
. **** DISPLAY RESULT IN TABLE 4.5 page 111
.
. * First-stage F statistic given in Table 4.5
. test col4 age76 agesq76
( 1) col4 = 0
( 2) age76 = 0
( 3) agesq76 = 0
F( 3, 2980) = 8.07
Prob > F = 0.0000
.
. * Compare this to the R-squared from regress on Z without the three additional instruments
. quietly reg grade76 $exogregressors
. di e(r2) " r2 of x1 on Z with the three additional instruments dropped"
.29106483 r2 of x1 on Z with the three additional instruments dropped
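
As a cross-check outside Stata, the first-stage F statistic reported above can be recovered from the two displayed R-squared values using the standard formula F = [(R2_full - R2_restricted)/q] / [(1 - R2_full)/(N - K)]. The following Python sketch is an illustration only; the function name partial_f is hypothetical, and q = 3 and N - K = 2980 are taken from the F( 3, 2980) line in the log.

```python
def partial_f(r2_full, r2_restricted, q, df_resid):
    """F statistic for excluding q regressors, computed from the R-squared of
    the full and restricted regressions (same dependent variable, same sample)."""
    return ((r2_full - r2_restricted) / q) / ((1.0 - r2_full) / df_resid)

# R-squared values displayed in the log for grade76 on Z, with and without
# the three additional instruments col4, age76 and agesq76
f_stat = partial_f(0.29677588, 0.29106483, q=3, df_resid=2980)
print(round(f_stat, 2))   # 8.07
```

The result agrees with the value given by the test command, which shows that the three additional instruments raise the first-stage R-squared for schooling by less than 0.006.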
.
. * Obtain first-stage F for the other two endogenous regressors
. quietly reg exp76 col4 age76 agesq76 $exogregressors
. test col4 age76 agesq76
( 1) col4 = 0
( 2) age76 = 0
( 3) agesq76 = 0
F( 3, 2980) = 1772.03
Prob > F = 0.0000
. quietly reg expsq76 col4 age76 agesq76 $exogregressors
. test col4 age76 agesq76
( 1) col4 = 0
( 2) age76 = 0
( 3) agesq76 = 0
F( 3, 2980) = 1542.36
Prob > F = 0.0000
.
. **** (B) Minimum eigenvalue of matrix analog of the first-stage F statistic
. *      proposed by Stock et al (2002) and tables in Stock and Yogo (2003)
. * This test is not done here.
.
. **** (C) Bound et al (1995) partial R-squared
.
. * Not relevant here as more than one endogenous regressor
. * If only one endogenous regressor x1 Bound et al purge the effect of x2
. * by (1) get residual from regress x1 on x2
. * (2) get the residuals from regress z on x2
. * and then get the R-squared from regress (1) on (2).
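
The three steps just described can be sketched in Python for the simplest case of one endogenous regressor x1, one exogenous regressor x2 and one instrument z. This is an illustration under stated assumptions, not part of the original program: the data are simulated, and the names residualize and r_squared_no_const are hypothetical.

```python
import random

def residualize(v, w):
    """Residuals from an OLS regression of v on a constant and w."""
    n = len(v)
    vbar, wbar = sum(v) / n, sum(w) / n
    sww = sum((wi - wbar) ** 2 for wi in w)
    slope = sum((wi - wbar) * (vi - vbar) for wi, vi in zip(w, v)) / sww
    return [(vi - vbar) - slope * (wi - wbar) for wi, vi in zip(w, v)]

def r_squared_no_const(dep, reg):
    """R-squared from regressing a mean-zero variable on another mean-zero
    variable; this equals their squared correlation."""
    sxy = sum(d * r for d, r in zip(dep, reg))
    return sxy ** 2 / (sum(d * d for d in dep) * sum(r * r for r in reg))

# Simulated data (assumption): x1 depends on x2, on the instrument z, and on noise
rng = random.Random(12345)
n = 2000
x2 = [rng.gauss(0, 1) for _ in range(n)]
z = [rng.gauss(0, 1) for _ in range(n)]
x1 = [a + 0.5 * b + rng.gauss(0, 1) for a, b in zip(x2, z)]

res_x1 = residualize(x1, x2)      # step (1): purge x2 from x1
res_z = residualize(z, x2)        # step (2): purge x2 from z
partial_r2 = r_squared_no_const(res_x1, res_z)   # step (3)
print(partial_r2)
```

In this design the population value of the partial R-squared is 0.25/1.25 = 0.2, so the printed value should be close to 0.2.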
.
. **** (D) Shea (1997) partial R-squared [Given in Table 4.5]
.
. * Here we have three endogenous regressors.
. * Focus on the endogenous schooling regressor.
. * For the other two just need to replace the first line of (1)
. * e.g. quietly reg exp76 grade76 expsq76 $exogregressors
. * and replace the first line of (2B)
. * e.g. quietly reg exp76hat grade76hat expsq76hat $exogregressors
.
. * (1) Form x1 - x1tilda: residual from regress x1 on other regressors
. quietly reg grade76 exp76 expsq76 $exogregressors
. predict x1minusx1tilda, resid
.
. * (2) Form x1hat - x1hattilda: residual from regress x1hat on fitted values of other regressors
. * (2A) First get the fitted values from regress endogenous on instruments
. quietly reg grade76 col4 age76 agesq76 $exogregressors
. predict grade76hat, xb
. di e(r2) " r2 from regress x1 on Z"
.29677588 r2 from regress x1 on Z
. quietly reg exp76 col4 age76 agesq76 $exogregressors
. predict exp76hat, xb
. di e(r2) " r2 from regress second endog regressor on Z"
.70622765 r2 from regress second endog regressor on Z
. quietly reg expsq76 col4 age76 agesq76 $exogregressors
. predict expsq76hat, xb
. di e(r2) " r2 from regress third endog regressor on Z"
.67573235 r2 from regress third endog regressor on Z
. * Fitted values for the exogenous from regress exogenous on instruments are the exogenous
. * (2B) Run the regression of x1hat on fitted values of other regressors
. quietly reg grade76hat exp76hat expsq76hat $exogregressors
. di e(r2) " r2 from regress prediction of x1 on predictions of x2"
.98987117 r2 from regress prediction of x1 on predictions of x2
. predict x1hatminusx1hattilda, resid

.
. * (3) Form the correlation between (1) and (2)
. corr x1minusx1tilda x1hatminusx1hattilda
(obs=3010)
             | x1minu~a x1hatm~a
-------------+------------------
x1minusx1t~a |   1.0000
x1hatminus~a |   0.0800   1.0000
.
. **** DISPLAY RESULT IN TABLE 4.5 page 111
.
. * Shea's Partial R^2 in Table 4.5
. di r(rho)^2 " Shea's partial R-squared measure"
.00640757 Shea's partial R-squared measure
.
. sum grade76 grade76hat exp76 exp76hat expsq76 expsq76hat grade76 x1minusx1tilda
> x1hatminusx1hattilda grade76hat
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
     grade76 |      3010    13.26346    2.676913          1         18
  grade76hat |      3010    13.26346    1.458306   8.919074   17.42063
       exp76 |      3010    8.856146    4.141672          0         23
    exp76hat |      3010    8.856146    3.480551   1.329216   17.68953
     expsq76 |      3010    .9557907    .8461831          0       5.29
-------------+--------------------------------------------------------
  expsq76hat |      3010    .9557907    .6955874  -.3913698   2.917523
     grade76 |      3010    13.26346    2.676913          1         18
x1minusx1t~a |      3010   -8.71e-10    1.833502  -6.948598   5.661138
x1hatminus~a |      3010   -6.86e-11    .1467669  -.3732457   .3033035
  grade76hat |      3010    13.26346    1.458306   8.919074   17.42063
.
. **** (E) Poskitt-Skeels (2002) partial R-squared
. * Not done here
.
. **** (F) If model was over-identified then do test of over-identifying restrictions
. * Not done here as model is just-identified
.
. log close
log: c:\Imbook\bwebpage\Section2\mma04p4ivweak.txt
log type: text
closed on: 17 May 2005, 13:46:03
-----------------------------------------------------------------------------------------------------------------------------------------------------
Chapter 5.9 pp.159-63
----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma05p1mle.txt
log type: text
opened on: 17 May 2005, 13:48:11
.
. ********** OVERVIEW OF MMA05P1MLE.DO **********
.
. * STATA Program
.
. * Chapter 5.9 pp.159-63
. * Maximum likelihood analysis.
.
. * Provides first two columns of Table 5.7
. * (1) OLS    using Stata command regress
. * (2) MLE    using Stata command streg with dist(exp) for exponential MLE
. * (3) MLE    using Stata command ml for user-provided log-likelihood
.
. * Related programs:
. * mma05p2nls.do          NLS, WNLS, FGNLS for same data using nl command
. * mma05p3nlsbyml.do      NLS, WNLS, FGNLS for same data using ml command
. * mma05p4margeffects.do  Calculates marginal effects
.
. ********** SETUP **********
.
. set more off
. version 8
.
.
. * Model is y ~ exponential(exp(a + bx))
. *          x ~ N[mux, sigx^2]
. *       f(y) = exp(a + bx)*exp(-y*exp(a + bx))
. *     lnf(y) = (a + bx) - y*exp(a + bx)
. *       E[y] = exp(-(a + bx))    note sign reversal for the mean
. *       V[y] = exp(-2(a + bx)) = E[y]^2
.
. * The dgp sets particular values of a, b, mux and sigx
. * Here a = 2, b = -1 and x ~ N[1, 1]
. scalar a = 2
. scalar b = -1
. scalar mux = 1
. scalar sigx = 1
.
. * Set the sample size. Table 5.7 uses N=10,000
. set obs 10000
.
. * Generate x and y
. set seed 2003
. gen x = mux + sigx*invnorm(uniform())
. gen lamda = exp(a + b*x)
. gen Ey = 1/lamda
. * To generate exponential with mean mu=Ey use
. * Integral 0 to a of (1/mu)exp(-x/mu) dx by change of variables
. * = Integral 0 to a/mu of exp(-t)dt
. * = incomplete gamma function P(1,a/mu) in the terminology of Stata
. gen y = Ey*invgammap(1,uniform())
. gen lny = ln(y)
. gen lnfy = ln(lamda) - y*lamda
. * twoway scatter Ey x
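
The draw of y above is an inverse-transform draw. Since the incomplete gamma function with shape parameter 1 is 1 - exp(-t), its inverse at u is -ln(1 - u), so y = mu*(-ln(1 - u)) is exponential with mean mu. The Python sketch below (an illustration, not part of the original program) mimics the data-generating process with a = 2, b = -1 and x ~ N[1, 1]. The function name draw_exponential and the sample size are hypothetical, and the random numbers will not match Stata's.

```python
import math
import random

def draw_exponential(mean, u):
    """Inverse-transform draw: solve 1 - exp(-y/mean) = u for y."""
    return -mean * math.log(1.0 - u)

rng = random.Random(2003)            # seed chosen only for reproducibility
a, b = 2.0, -1.0
xs = [rng.gauss(1.0, 1.0) for _ in range(20000)]
# Conditional mean is E[y|x] = exp(-(a + b*x)); rng.random() lies in [0, 1)
ys = [draw_exponential(math.exp(-(a + b * x)), rng.random()) for x in xs]
sample_mean = sum(ys) / len(ys)
print(sample_mean)
```

For this design the unconditional mean is E[exp(-(2 - x))] = exp(-0.5), about 0.61, which is in line with the sample mean of y shown in the summarize output that follows.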
.
. * Descriptive Statistics
. describe
Contains data
  obs:        10,000
 vars:             6
 size:
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
x               float  %9.0g
lamda           float  %9.0g
Ey              float  %9.0g
y               float  %9.0g
lny             float  %9.0g
lnfy            float  %9.0g
-------------------------------------------------------------------------------
Sorted by:
. summarize
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
           x |     10000    1.014313    1.004905  -2.895741   4.994059
       lamda |     10000    4.457478    5.939084   .0500838   133.7191
          Ey |     10000    .6185677    .8294007   .0074784   19.96655
           y |     10000    .6194352    1.291416   .0000445   30.60636
         lny |     10000   -1.554348     1.62358  -10.02114   3.421208
-------------+--------------------------------------------------------
        lnfy |     10000   -.0209485    1.419595   -7.52596   4.402257
.
. ********** WRITE DATA TO A TEXT FILE **********
.
. * Write data to a text (ascii) file
. * used for programs mma05p2nls.do, mma05p3nlsbyml.do
. * and mma05p4margeffects.do
. * and can also use with programs other than Stata
. outfile y x using mma05data.asc, replace
.
. ********** DO THE ANALYSIS: OLS and MLE **********
.
. ** (1) OLS ESTIMATION
.
. * OLS is inconsistent in this example
. regress y x
      Source |       SS       df       MS
-------------+------------------------------           F(  1,  9998) = 3030.74
       Model |  3879.13606     1  3879.13606           Prob > F      =  0.0000
    Residual |  12796.7438  9998  1.27993037           R-squared     =  0.2326
-------------+------------------------------           Adj R-squared =  0.2325
       Total |  16675.8799  9999  1.66775476           Root MSE      =  1.1313
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .6198182   .0112587    55.05   0.000     .5977488    .6418876
       _cons |  -.0092545    .016075    -0.58   0.565    -.0407648    .0222558
------------------------------------------------------------------------------
. estimates store rols
. regress y x, robust
                                                       F(  1,  9998) =  596.30
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.2326
                                                       Root MSE      =  1.1313
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .6198182   .0253823    24.42   0.000     .5700638    .6695725
       _cons |  -.0092545   .0171978    -0.54   0.591    -.0429655    .0244566
------------------------------------------------------------------------------
. estimates store rolsrobust
.
. ** (2) ML ESTIMATION USING STATA COMMAND FOR EXPONENTIAL MLE
.
. * The following uses Stata duration model commands.
. * First need to define the duration variable (here y)
. stset y
     failure event:  (assumed to fail at time=y)
obs. time interval:  (0, y]
 exit on or before:  failure
------------------------------------------------------------------------------
    10000  total obs.
        0  exclusions
------------------------------------------------------------------------------
    10000  obs. remaining, representing
    10000  failures in single record/single failure data
 6194.352  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =  30.60636
. streg x, dist(exp) nohr
failure _d: 1 (meaning all fail)
analysis time _t: y
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:
log likelihood = -20754.005

Exponential regression -- log relative-hazard form

No. of subjects =        10000                     Number of obs   =     10000
No. of failures =        10000
Time at risk    =  6194.352495
                                                   LR chi2(1)      =  10003.63
Log likelihood  =    -15752.19                     Prob > chi2     =    0.0000
------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |  -.9896276   .0098692  -100.27   0.000    -1.008971   -.9702842
       _cons |   1.982921   .0141496   140.14   0.000     1.955188    2.010654
------------------------------------------------------------------------------
. estimates store rexp
. streg x, dist(exp) nohr robust
analysis time _t: y
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:
log pseudo-likelihood = -20754.005


No. of subjects       =        10000
No. of failures       =        10000
Time at risk          =  6194.352495
                                                   Wald chi2(1)    =   9914.62
Log pseudo-likelihood =    -15752.19               Prob > chi2     =    0.0000
------------------------------------------------------------------------------
             |               Robust
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |  -.9896276   .0099388   -99.57   0.000    -1.009107   -.9701479
       _cons |   1.982921   .0144307   137.41   0.000     1.954637    2.011205
------------------------------------------------------------------------------
. estimates store rexprobust
.
. ** (3) ML ESTIMATION USING STATA ML COMMAND
.
. * For MLE computation can use the following Stata commands
. * ml model lf    provide the log-density
. * ml model D0    provide the log-likelihood
. * ml model D1    provide the log-likelihood and gradient
. * ml model D2    provide the log-likelihood, gradient and hessian
.
. * At a minimum need to provide
. * (A) program define fcn    where fcn is the function name
. *       defines the log-density (independent observations assumed)
. * (B) ml model lf fcn + some extras
. *       the extras give the dependent variable and regressors
. * (C) ml maximize
. *       obtains the mle
. * (D) ml model lf fcn + some extras, robust
. *       provides robust sandwich standard errors
.
. * Here we provide the log-density (ml model lf) as this is simplest,
. * and the Stata manual says that numerically only D2 is better.
.
. * (A) Define the log-density
. *     lnf(y) = (a+bx) - y*exp(a+bx) = theta - y*exp(theta) where theta = x'b
. program define mleexp0
1. version 8.0
2. args lnf theta    /* Must use lnf while could use name other than theta */
3. quietly replace `lnf' = `theta' - $ML_y1*exp(`theta')
4. end
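
To make explicit what maximizing this log-density involves, here is a minimal Python sketch of Newton-Raphson iterations for lnL(a,b) = Sum_i [theta_i - y_i*exp(theta_i)] with theta_i = a + b*x_i. It is an illustration only: it is not the algorithm Stata's ml command uses internally, the data are simulated afresh rather than read from the Stata data set, and the names simulate and exp_mle are hypothetical.

```python
import math
import random

def simulate(n, a=2.0, b=-1.0, seed=2003):
    """Simulated data: x ~ N[1,1], y exponential with rate exp(a + b*x)."""
    rng = random.Random(seed)
    x = [rng.gauss(1.0, 1.0) for _ in range(n)]
    y = [-math.log(1.0 - rng.random()) / math.exp(a + b * xi) for xi in x]
    return y, x

def exp_mle(y, x, tol=1e-10, max_iter=50):
    """Newton-Raphson for lnL = Sum (theta - y*exp(theta)), theta = a + b*x.
    Returns (a, b, se_a, se_b), with SEs from the inverse negative Hessian."""
    n = len(y)
    a = -math.log(sum(y) / n)      # starting value: constant-only MLE
    b = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for yi, xi in zip(y, x):
            w = yi * math.exp(a + b * xi)
            g0 += 1.0 - w          # d lnL / da
            g1 += (1.0 - w) * xi   # d lnL / db
            h00 += w               # elements of the negative Hessian
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b, math.sqrt(h11 / det), math.sqrt(h00 / det)

y_sim, x_sim = simulate(5000)
a_hat, b_hat, se_a, se_b = exp_mle(y_sim, x_sim)
print(a_hat, b_hat, se_a, se_b)
```

With the true values a = 2 and b = -1 the estimates should be close to (2, -1), mirroring the estimates obtained in the log from streg and from ml.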
.
. * (B) Say that dependent variable is y and regressors are x plus a constant
. ml model lf mleexp0 (y = x)
.
. * (C) Obtain the MLE
. ml search    /* Optional - can provide better starting values */
initial:
improve:
alternative: log likelihood = -5212.7607
rescale:
. ml maximize
initial:
rescale:
Iteration 0: log likelihood = -5212.7607
                                                  Number of obs   =      10000
                                                  Wald chi2(1)    =   10054.85
Log likelihood = -208.71383                       Prob > chi2     =     0.0000
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |  -.9896276   .0098692  -100.27   0.000    -1.008971   -.9702842
       _cons |   1.982921   .0141496   140.14   0.000     1.955188    2.010654
------------------------------------------------------------------------------
. estimates store rmle
.
. * (D) Obtain robust standard errors
. ml model lf mleexp0 (y = x), robust
. ml search
initial:
improve:
alternative: log pseudo-likelihood = -5212.7607
rescale:
. ml maximize
initial:
rescale:
Iteration 0: log pseudo-likelihood = -5212.7607
                                                  Number of obs   =      10000
                                                  Wald chi2(1)    =    9914.62
                                                  Prob > chi2     =     0.0000
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |  -.9896276   .0099388   -99.57   0.000    -1.009107   -.9701479
       _cons |   1.982921   .0144307   137.41   0.000     1.954637    2.011205
------------------------------------------------------------------------------
. estimates store rmlerobust
.
. * (E) Calculate R-squared and log-likelihood at the ML estimates
. * lnL sums lnf(y) = ln(lamda) - y*lamda
. gen lamdaml = exp(_b[_cons] + _b[x]*x)
. gen lnfml = ln(lamdaml) - y*lamdaml
. quietly means lnfml
. scalar LLml = r(mean)*r(N)

. * R-squared = 1 - Sum_i(y_i - yhat_i)^2 / Sum_i(y_i - ybar)^2
. gen yhatml = 1/lamdaml
. egen ybar = mean(y)
. * quietly means y
. * scalar ybar = r(mean)
. gen y_yhatsqml = (y - yhatml)^2
. gen y_ybarsq = (y - ybar)^2
. quietly means y_yhatsqml
. scalar SSresidml = r(mean)
. quietly means y_ybarsq
. scalar SStotal = r(mean)
. scalar Rsqml = 1 - SSresidml/SStotal
. di LLml " " Rsqml
-208.71383 .39062307
.
. ********** DISPLAY RESULTS: First two columns of Table 5.7 p.161
.
. * (1) OLS - nonrobust and robust standard errors
. * Here OLS is inconsistent.
. * And expect sign reversal for slope as in true model mean E[y] = exp(-x'b)
. estimates table rols rolsrobust, b(%10.4f) se(%10.4f) t stats(N ll r2) keep(_cons x)
----------------------------------------
    Variable |    rols      rolsrobust
-------------+--------------------------
       _cons |     -0.0093      -0.0093
             |      0.0161       0.0172
             |       -0.58        -0.54
           x |      0.6198       0.6198
             |      0.0113       0.0254
             |       55.05        24.42
-------------+--------------------------
           N |  10000.0000   10000.0000
          ll |  -1.542e+04   -1.542e+04
          r2 |      0.2326       0.2326
----------------------------------------
                          legend: b/se/t
.
. * (2) MLE by command streg, dist(exp) - nonrobust and robust standard errors
. estimates table rexp rexprobust, b(%10.4f) se(%10.4f) t stats(N ll) keep(_cons x)
----------------------------------------
    Variable |    rexp      rexprobust
-------------+--------------------------
       _cons |      1.9829       1.9829
             |      0.0141       0.0144
             |      140.14       137.41
           x |     -0.9896      -0.9896
             |      0.0099       0.0099
             |     -100.27       -99.57
-------------+--------------------------
           N |  10000.0000   10000.0000
          ll |  -1.575e+04   -1.575e+04
----------------------------------------
                          legend: b/se/t
.
. * (3) MLE by command ml - nonrobust and robust standard errors
. estimates table rmle rmlerobust, b(%10.4f) se(%10.4f) t stats(N ll) keep(_cons x)
----------------------------------------
    Variable |    rmle      rmlerobust
-------------+--------------------------
       _cons |      1.9829       1.9829
             |      0.0141       0.0144
             |      140.14       137.41
           x |     -0.9896      -0.9896
             |      0.0099       0.0099
             |     -100.27       -99.57
-------------+--------------------------
           N |  10000.0000   10000.0000
          ll |   -208.7138    -208.7138
----------------------------------------
                          legend: b/se/t
. * And ML log-likelihood (check) and R-squared (needed to be computed)
. di "Log likelihood for ML: " LLml
Log likelihood for ML: -208.71383
. di "R-squared for MLE: " Rsqml
R-squared for MLE: .39062307
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section2\mma05p1mle.txt
log type: text
closed on: 17 May 2005, 13:48:18
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma05p2nls.txt
log type: text
opened on: 17 May 2005, 13:53:31
.
. ********** OVERVIEW OF MMA05P2NLS.DO **********
.
. * STATA Program
.
. * Chapter 5.9 pp.159-63
. * Nonlinear least squares
.
. * Provides last three columns of Table 5.7 results for
. * (1) NLS using Stata command nl (hard to get robust s.e.'s)
. * (2) FGNLS using Stata command nl (hard to get robust s.e.'s)
. * (3) WNLS using Stata command nl (hard to get robust s.e.'s)
. * using generated data set mma05data.asc
.
. * Note: Stata 8 does not give robust se's for nl
. *       But ml does - see program mma05p3nlsbyml.do
. *       New Stata 9 does have a robust se option (unlike Stata 8)
.
. * mma05p1mle.do      OLS and MLE for the same data
. * mma05p3nlsbyml.do  NLS using ml rather than nl
.
. * mma05data.asc ASCII data set generated by mma05p1mle.do
.
. ********** SETUP **********
.
. set more off
. version 8
.
.
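. * Model is y ~ exponential(exp(a + bx))
. *          x ~ N[mux, sigx^2]
. *       f(y) = exp(a + bx)*exp(-y*exp(a + bx))
. *     lnf(y) = (a + bx) - y*exp(a + bx)
. *       E[y] = exp(-(a + bx))    note sign reversal for the mean
. *       V[y] = exp(-2(a + bx)) = E[y]^2
. * Here a = 2, b = -1 and x ~ N[mux=1, sigx^2=1]
. * and Table 5.7 uses N=10,000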

.
. * Data was generated by program mma05p1mle.do
. infile y x using mma05data.asc
.
. describe
Contains data
  obs:        10,000
 vars:             2
 size:
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
y               float  %9.0g
x               float  %9.0g
-------------------------------------------------------------------------------
Sorted by:
. summarize
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
           y |     10000    .6194352    1.291416   .0000445   30.60636
           x |     10000    1.014313    1.004905  -2.895741   4.994059
.
. ********** DO THE ANALYSIS: NLS, WNLS and FGNLS **********
.
. *** (1) NLS ESTIMATION USING STATA NL COMMAND (Nonlinear LS)
.
. * To do this in Stata
. * (A) program define nlfcn    where fcn is the function name
. *       defines g(x_i'b) and says what the regressors x are
. * (B) nl fcn y    where fcn is the function name in (A)
. *       and y is the dependent variable
. *       does NLS of y on fcn defined in (A)
. * (C) Heteroskedastic-consistent standard errors requires extra coding
.
. * (1A) Define g(x'b)
. *      Note: Since E[y] = exp(-(a + bx)) there is sign reversal for the mean
. program define nlexpnls
1. version 7.0
2. if "`1'" == "?" {                  /* if query call ... */
3.   global S_1 "b1int b2x"           /* declare parameters */
4.   global b1int=1                   /* initial values */
5.   global b2x=0
6.   exit}
7. replace `1'=exp(-$b1int-$b2x*x)    /* calculate function */
8. end
.
. * (1B) Do NLS of y on the function expnls defined in (A)
. nl expnls y
(obs = 10000)
Iteration 0:  residual SS =  17308.68
Iteration 1:  residual SS =  10333.37
Iteration 2:  residual SS =  10150.66
Iteration 3:  residual SS =  10149.86
Iteration 4:  residual SS =  10149.86
Iteration 5:  residual SS =  10149.86
      Source |       SS       df       MS
-------------+------------------------------           F(  2,  9998) = 5103.98
       Model |  10363.0157     2  5181.50784           Prob > F      =  0.0000
    Residual |  10149.8633  9998  1.01518937           R-squared     =  0.5052
-------------+------------------------------           Adj R-squared =  0.5051
       Total |   20512.879 10000   2.0512879           Root MSE      = 1.007566
                                                       Res. dev.     = 28527.52
(expnls)
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       b1int |   1.887563   .0306819    61.52   0.000      1.82742    1.947705
         b2x |  -.9574684   .0097419   -98.28   0.000    -.9765645   -.9383724
------------------------------------------------------------------------------
(SEs, P values, CIs, and correlations are asymptotic approximations)
. estimates store bnls
.
. * Complications now begin: getting standard errors. Easier to use (1) !!
.
. * (1C) Get sandwich heteroskedastic-robust standard errors for NLS
.
. * Note that robust option does not work for nl
. * So wrong standard errors are given for this problem as errors are heteroskedastic
.
. * To get robust standard errors is not straightforward
.
. * Obtain them by OLS regression of y - g(x,b) on dg/db with the robust option.
. * Explanation: the OLS regression is y - g(x,b) = (dg/db)'a + v
. * This is the Gauss-Newton step for the update of b
. * But a = 0 since iterations have converged, so v = y - g(x,b)
. * So nonrobust standard errors from this OLS regression come from
. *   V[a] = s^2 (Sum_i (dg_i/db)(dg_i/db)')^-1
. * where s^2 = Sum_i (y_i - g(x_i,b))^2 / (N-K)
. * These are the nonrobust standard errors for NLS
. * And the robust option gives robust standard errors from this OLS regression.
.
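The argument in the comments above can be written compactly. The following is a sketch only, in notation introduced here for illustration: d_i is the K x 1 gradient dg_i/db evaluated at the NLS estimate, D is the N x K matrix with rows d_i', and e_i is the NLS residual. Stata's robust option also applies a finite-sample scaling of N/(N-K), which is omitted here.

```latex
\begin{aligned}
\hat{V}_{\text{nonrobust}}[\hat{b}]
  &= s^2\,(D'D)^{-1},
  \qquad s^2 = \frac{1}{N-K}\sum_{i=1}^{N}\bigl(y_i - g(x_i,\hat{b})\bigr)^2, \\
\hat{V}_{\text{robust}}[\hat{b}]
  &= (D'D)^{-1}\Bigl(\sum_{i=1}^{N}\hat{e}_i^{\,2}\, d_i d_i'\Bigr)(D'D)^{-1},
  \qquad \hat{e}_i = y_i - g(x_i,\hat{b}).
\end{aligned}
```

The first line is what the nonrobust OLS regression of the residual on the gradient reports; the second is the sandwich form that the robust option delivers.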
. * Obtain the derivatives dg/db
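. * Here g = exp(-x'b) so dg/db = -exp(-x'b)*x = -yhat*x
. * The minus sign is dropped below: it flips the sign of a but leaves the standard errors unchanged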
. quietly nl expnls y
. predict residnls, residuals
. predict yhatnls, yhat
. scalar snls = e(rmse)                          /* Use in earlier code */
. gen d1 = yhatnls
. gen d2 = x*yhatnls
. * This OLS regression gives robust standard errors
. regress residnls d1 d2, noconstant robust
                                                       F(  2,  9998) =    0.00
                                                       Prob > F      =  1.0000
                                                       R-squared     =  0.0000
                                                       Root MSE      =  1.0076
------------------------------------------------------------------------------
             |               Robust
    residnls |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          d1 |   4.46e-07   .1420794     0.00   1.000    -.2785037    .2785046
          d2 |  -1.49e-07   .0611969    -0.00   1.000    -.1199583     .119958
------------------------------------------------------------------------------
. estimates store bnlsrobust
.
. * Check: Do OLS regression that gives nonrobust standard errors
. *        and verify that same results as in (1B)
. regress residnls d1 d2, noconstant
      Source |       SS       df       MS
-------------+------------------------------           F(  2,  9998) =    0.00
       Model |  2.6739e-10     2  1.3370e-10           Prob > F      =  1.0000
    Residual |  10149.8633  9998  1.01518937           R-squared     =  0.0000
-------------+------------------------------           Adj R-squared = -0.0002
       Total |  10149.8633 10000  1.01498633           Root MSE      =  1.0076
------------------------------------------------------------------------------
    residnls |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          d1 |   4.46e-07   .0306819     0.00   1.000    -.0601423    .0601432
          d2 |  -1.49e-07   .0097419    -0.00   1.000    -.0190961    .0190958
------------------------------------------------------------------------------
. estimates store bnlscheck
.
. * (1D) Alternative to (1C): robust NLS standard errors that are better.
. * These are sandwich form but use knowledge that V[u]=exp(x'b)^2
. * which can be estimated by Vhat[u] = yhat^2
. * Now use this knowledge here in computing S in DSD.
. * Form DSDknown = D'SD with S = Diag(yhat^2)
. gen ds1known = yhatnls*yhatnls
. gen ds2known = x*yhatnls*yhatnls
. matrix accum DSDknown = ds1known ds2known, noconstant
(obs=10000)
. matrix accum DD2 = d1 d2, noconstant           /* DD commented above */
(obs=10000)
. * Form the robust variance matrix estimate
. matrix vnlsknown = syminv(DD2)*DSDknown*syminv(DD2)
. * Calculate the robust standard errors
. scalar seb1intnlsknown = sqrt(vnlsknown[1,1])
. scalar seb2xnlsknown = sqrt(vnlsknown[2,2])
. di "Robust standard errors of NLS estimates of b1int and b2x: "
Robust standard errors of NLS estimates of b1int and b2x:
. di "Using knowledge that Var[u] = exp(x'b)^2 estimated by yhat"
Using knowledge that Var[u] = exp(x'b)^2 estimated by yhat
. di seb1intnlsknown " " seb2xnlsknown
.21097066 .08798113
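The sandwich computation in (1C) and (1D) can be sketched outside Stata. The following is a minimal illustration in plain Python, not the program used for the log: the parameter values, x values and residuals are made up, and the helper names (inv2, mul2, cross, sandwich_se) are hypothetical. It forms (D'D)^-1 D'SD (D'D)^-1 for a two-parameter model with d1 = yhat and d2 = x*yhat, once with S = Diag(yhat^2) as in (1D) and once with S = Diag(e^2) as in (1C), without Stata's N/(N-K) scaling.

```python
import math

def inv2(m):
    """Invert a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def mul2(p, q):
    """Multiply two 2x2 matrices."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def cross(d, w):
    """Return Sum_i w_i * d_i d_i' for two-column gradient rows d_i."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for (d1, d2), wi in zip(d, w):
        s[0][0] += wi * d1 * d1
        s[0][1] += wi * d1 * d2
        s[1][0] += wi * d2 * d1
        s[1][1] += wi * d2 * d2
    return (tuple(s[0]), tuple(s[1]))

def sandwich_se(d, var_u):
    """Standard errors from (D'D)^-1 D'SD (D'D)^-1 with S = Diag(var_u)."""
    bread = inv2(cross(d, [1.0] * len(d)))
    meat = cross(d, var_u)
    v = mul2(mul2(bread, meat), bread)
    return math.sqrt(v[0][0]), math.sqrt(v[1][1])

# Made-up inputs for illustration: yhat_i = exp(-(a + b*x_i)).
a, b = 2.0, -1.0
x = [0.0, 0.5, 1.0, 1.5, 2.0]
yhat = [math.exp(-(a + b * xi)) for xi in x]
d = [(yh, xi * yh) for xi, yh in zip(x, yhat)]    # d1 = yhat, d2 = x*yhat

# (1D): use the known variance function Var[u_i] = yhat_i^2.
se_known = sandwich_se(d, [yh ** 2 for yh in yhat])

# (1C)-style White version: use squared residuals instead (made-up residuals).
resid = [0.1, -0.2, 0.3, -0.1, 0.2]
se_white = sandwich_se(d, [e ** 2 for e in resid])

# Sanity check: with constant variance s2 the sandwich collapses to s2*(D'D)^-1.
s2 = 0.7
se_homo = sandwich_se(d, [s2] * len(d))
bread = inv2(cross(d, [1.0] * len(d)))
se_plain = (math.sqrt(s2 * bread[0][0]), math.sqrt(s2 * bread[1][1]))
```

The last check shows why the nonrobust and robust formulas agree only when the error variance is constant, which is not the case for this data generating process.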
.
. * (1E) Calculate R-squared and log-likelihood at the NLS estimates
. * Note that Stata version 8 reports the wrong R-squared
. * as it uses TSS = Sum_i y_i^2 and not Sum_i(y_i - ybar)^2
. gen lamdanls = 1 / yhatnls                     /* yhatnls saved earlier */
. gen lnfnls = ln(lamdanls) - y*lamdanls
. quietly means lnfnls
. scalar LLnls = r(mean)*r(N)
. * quietly means y
. * scalar ybar = r(mean)
. gen y_ybarsq = (y - ybar)^2
. quietly means y_ybarsq
. scalar SStotal = r(mean)
. gen y_yhatsqnls = (y - yhatnls)^2
. quietly means y_yhatsqnls
. scalar SSresidnls = r(mean)
. scalar Rsqnls = 1 - SSresidnls/SStotal         /* SStotal found earlier */
. di LLnls " " Rsqnls
-232.97524 .39134462
.
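The gap between the R-squared of 0.5052 reported by nl and the 0.3913 computed in (1E) can be checked by hand. The following is a small Python sketch, not part of the original program. The residual and total sums of squares are copied from the nl output in (1B), and the mean of y is the value that summarize reports for this data set; the variable names are chosen here for illustration.

```python
# Figures copied from the Stata output for this data set.
ss_resid = 10149.8633       # Residual SS from nl in (1B)
ss_total_raw = 20512.879    # "Total" SS from nl, i.e. Sum_i y_i^2
n_obs = 10000
ybar = 0.6194352            # sample mean of y from summarize

# Uncentered R-squared: the version that nl reports.
r2_uncentered = 1 - ss_resid / ss_total_raw

# Centered total SS: Sum_i (y_i - ybar)^2 = Sum_i y_i^2 - N*ybar^2.
ss_total_centered = ss_total_raw - n_obs * ybar ** 2
r2_centered = 1 - ss_resid / ss_total_centered

print(round(r2_uncentered, 4), round(r2_centered, 4))
```

Up to rounding of the inputs, the two values reproduce the R-squared in the nl table and the Rsqnls displayed in (1E).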
. ** (2) FGNLS ESTIMATION USING STATA NL COMMAND
.
. * The following gives FGNLS in Table 5.7
. * To instead get the WNLS estimates in Table 5.7
. * replace gen wfgnls = (1/yhatnls)^2 below by gen wfgnls = 1/yhatnls
.
. * The Feasible generalized NLS estimator minimizes
. * SUM_i (y_i - g(x_i'b))^2 / s_i^2 where s_i^2 = estimate of sigma_i^2
. * This is y_i = g(x_i'b) + u_i where u_i ~ (0,s_i^2)
. * Can do NLS with weighting option [aweight = 1/(s_i^2)]
. * Here s_i^2 = [exp(x_i'b)]^2 = yhatnls^2
.
. * The simplest way to proceed is to use the aweights option.
.
. * (2A) nls program expnls already defined in (1A)
.
. * (2B) For FGNLS do this nls but now with weights
. gen wfgnls = (1/yhatnls)^2
. * gen wfgnls = 1/yhatnls
. nl expnls y [aweight=wfgnls]
(sum of wgt is 405584.32)
Iteration 0:  residual SS =  1127.256
Iteration 3:  residual SS =  220.6796
Iteration 4:  residual SS =  220.2856
Iteration 5:  residual SS =  220.2851
Iteration 6:  residual SS =  220.2851

      Source |       SS       df       MS
-------------+------------------------------           F(  2,  9998) = 4946.06
       Model |   217.95244     2   108.97622           Prob > F      =  0.0000
    Residual |  220.285065  9998  .022032913           R-squared     =  0.4973
-------------+------------------------------           Adj R-squared =  0.4972
       Total |  438.237505 10000  .043823751           Root MSE      = .1484349
                                                       Res. dev.     = 8924.231
(expnls)
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       b1int |   1.984035   .0147737   134.30   0.000     1.955075    2.012994
         b2x |   -.990691     .01001   -98.97   0.000    -1.010313   -.9710694
------------------------------------------------------------------------------
. estimates store bfgnls
.
. * (2C) Robust standard errors
. * The standard errors given above are consistent
. * assuming a correct model for the heteroskedasticity.
. * To guard against misspecification use a similar approach to the NLS case
. * Obtain the derivatives dg/db
. * Here g = exp(-x'b) so dg/db = -exp(-x'b)*x = -yhat*x (sign dropped below; standard errors are unaffected)
. predict residoptnls, residuals
. predict yhatoptnls, yhat
. gen d1opt = yhatoptnls
. gen d2opt = x*yhatoptnls
. * This OLS regression gives robust standard errors
. regress residoptnls d1opt d2opt [aweight=wfgnls], noconstant robust
                                                       F(  2,  9998) =    0.00
                                                       Prob > F      =  1.0000
                                                       R-squared     =  0.0000
                                                       Root MSE      =  .14843
------------------------------------------------------------------------------
             |               Robust
 residoptnls |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       d1opt |  -9.85e-09   .0145803    -0.00   1.000    -.0285803    .0285802
       d2opt |   8.81e-09   .0101319     0.00   1.000    -.0198606    .0198606
------------------------------------------------------------------------------
. estimates store bfgnlsrobust
. * This OLS regression gives nonrobust standard errors
. * It is a check and should equal the standard errors in (2B)
. regress residoptnls d1opt d2opt [aweight=wfgnls], noconstant
      Source |       SS       df       MS
-------------+------------------------------           F(  2,  9998) =    0.00
       Model |  2.2737e-13     2  1.1369e-13           Prob > F      =  1.0000
    Residual |  220.285065  9998  .022032913           R-squared     =  0.0000
-------------+------------------------------           Adj R-squared = -0.0002
       Total |  220.285065 10000  .022028506           Root MSE      =  .14843
------------------------------------------------------------------------------
 residoptnls |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       d1opt |  -9.85e-09   .0147737    -0.00   1.000    -.0289594    .0289594
       d2opt |   8.81e-09     .01001     0.00   1.000    -.0196216    .0196216
------------------------------------------------------------------------------
. estimates store bfgnlscheck
.
. * (2D) Calculate R-squared and log-likelihood at the FGNLS estimates
. * Note that Stata version 8 reports the wrong R-squared
. * as it uses TSS = Sum_i y_i^2 and not Sum_i(y_i - ybar)^2
. gen lamdafgnls = 1 / yhatoptnls                /* yhatoptnls saved earlier */
. gen lnffgnls = ln(lamdafgnls) - y*lamdafgnls
. quietly means lnffgnls
. scalar LLfgnls = r(mean)*r(N)
. gen y_yhatsqfgnls = (y - yhatoptnls)^2
. quietly means y_yhatsqfgnls
. scalar SSresidfgnls = r(mean)
. scalar Rsqfgnls = 1 - SSresidfgnls/SStotal
. di LLfgnls " " Rsqfgnls
-208.71965 .39056605
.
. ** (3) WNLS ESTIMATION USING STATA NL COMMAND
.
. * To get WNLS estimates in Table 5.7
. * replace gen wfgnls = (1/yhatnls)^2 in (2) FGNLS by gen wfgnls = 1/yhatnls
. * Code is shorter as all comments are dropped
.
. gen wwnls = 1/yhatnls
. nl expnls y [aweight=wwnls]
(sum of wgt is 39858.614)
Iteration 0:  residual SS =  2630.417
Iteration 1:  residual SS =  1694.802
Iteration 2:  residual SS =  1500.277
Iteration 3:  residual SS =  1494.658
Iteration 4:  residual SS =  1494.653

      Source |       SS       df       MS
-------------+------------------------------           F(  2,  9998) = 5073.75
       Model |  1517.00087     2  758.500436           Prob > F      =  0.0000
    Residual |   1494.6525  9998  .149495149           R-squared     =  0.5037
-------------+------------------------------           Adj R-squared =  0.5036
       Total |  3011.65337 10000  .301165337           Root MSE      = .386646
                                                       Res. dev.     = 14035.49
(expnls)
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       b1int |   1.990623   .0224903    88.51   0.000     1.946537    2.034708
         b2x |  -.9960671    .009777  -101.88   0.000    -1.015232   -.9769022
------------------------------------------------------------------------------
. estimates store bwnls
. predict residwnls, residuals
. predict yhatwnls, yhat
. gen d1w = yhatwnls
. gen d2w = x*yhatwnls
. regress residwnls d1w d2w [aweight=wwnls], noconstant robust
                                                       F(  2,  9998) =    0.00
                                                       Prob > F      =  1.0000
                                                       R-squared     =  0.0000
                                                       Root MSE      =  .38665
------------------------------------------------------------------------------
             |               Robust
   residwnls |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         d1w |  -1.11e-07   .0358551    -0.00   1.000    -.0702833    .0702831
         d2w |   5.35e-08   .0224175     0.00   1.000    -.0439428     .043943
------------------------------------------------------------------------------
. estimates store bwnlsrobust
. regress residwnls d1w d2w [aweight=wwnls], noconstant
      Source |       SS       df       MS
-------------+------------------------------           F(  2,  9998) =    0.00
       Model |  1.8190e-12     2  9.0949e-13           Prob > F      =  1.0000
    Residual |   1494.6525  9998  .149495149           R-squared     =  0.0000
-------------+------------------------------           Adj R-squared = -0.0002
       Total |   1494.6525 10000   .14946525           Root MSE      =  .38665
------------------------------------------------------------------------------
   residwnls |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         d1w |  -1.11e-07   .0224903    -0.00   1.000    -.0440856    .0440853
         d2w |   5.35e-08    .009777     0.00   1.000    -.0191649     .019165
------------------------------------------------------------------------------
. estimates store bwnlscheck
. gen lamdawnls = 1 / yhatwnls                   /* yhatwnls saved earlier */
. gen lnfwnls = ln(lamdawnls) - y*lamdawnls
. quietly means lnfwnls
. scalar LLwnls = r(mean)*r(N)
. gen y_yhatsqwnls = (y - yhatwnls)^2
. quietly means y_yhatsqwnls
. scalar SSresidwnls = r(mean)
. scalar Rsqwnls = 1 - SSresidwnls/SStotal
. di LLwnls " " Rsqwnls
-208.93381 .39017996
.
. ***** PRINT RESULTS: Last three columns of Table 5.7 page 161
.
. * (1) NLS using NL - nonrobust and robust standard errors
. * Here nonrobust differs from robust asymptotically
.
. * Table 5.7 NLS nonrobust standard errors
. estimates table bnls, b(%10.4f) se(%10.4f) t stats(N ll)
--------------------------
    Variable |    bnls
-------------+------------
       b1int |     1.8876
             |     0.0307
             |      61.52
         b2x |    -0.9575
             |     0.0097
             |     -98.28
-------------+------------
           N | 10000.0000
          ll |
--------------------------
            legend: b/se/t
. * Table 5.7 NLS robust standard errors
. estimates table bnlscheck bnlsrobust, b(%10.4f) se(%10.4f) t stats(N ll)
----------------------------------------
    Variable | bnlscheck    bnlsrobust
-------------+--------------------------
          d1 |     0.0000       0.0000
             |     0.0307       0.1421
             |       0.00         0.00
          d2 |    -0.0000      -0.0000
             |     0.0097       0.0612
             |      -0.00        -0.00
-------------+--------------------------
           N | 10000.0000   10000.0000
          ll | -1.426e+04   -1.426e+04
----------------------------------------
                          legend: b/se/t
.
. /*
> * Check: Nonrobust standard errors of NLS b1int and b2x:
> di seb1intnlsnr " " seb2xnlsnr
> * Robust standard errors of NLS estimates of b1int and b2x:
> di seb1intnls " " seb2xnls
> */
. * Alternative Robust standard errors of NLS estimates of b1int and b2x:
. * These use knowledge that Var[u] = exp(x'b)^2
. di seb1intnlsknown " " seb2xnlsknown
.21097066 .08798113
.
. * (3) WNLS - nonrobust and robust standard errors
. * Here nonrobust = robust asymptotically as WNLS in LEF
. * Also should be same as MLE asymptotically
. * Table 5.7 WNLS nonrobust standard errors
. estimates table bwnls, b(%10.4f) se(%10.4f) t stats(N ll)
--------------------------
    Variable |   bwnls
-------------+------------
       b1int |     1.9906
             |     0.0225
             |      88.51
         b2x |    -0.9961
             |     0.0098
             |    -101.88
-------------+------------
           N | 10000.0000
          ll |
--------------------------
            legend: b/se/t
. * Table 5.7 WNLS robust standard errors
. estimates table bwnlscheck bwnlsrobust, b(%10.4f) se(%10.4f) t stats(N ll)
----------------------------------------
    Variable | bwnlscheck   bwnlsrob~t
-------------+--------------------------
         d1w |    -0.0000      -0.0000
             |     0.0225       0.0359
             |      -0.00        -0.00
         d2w |     0.0000       0.0000
             |     0.0098       0.0224
             |       0.00         0.00
-------------+--------------------------
           N | 10000.0000   10000.0000
          ll | -4685.9286   -4685.9286
----------------------------------------
                          legend: b/se/t
.
. * (2) FGNLS - nonrobust and robust standard errors
. * Here nonrobust = robust asymptotically as FGNLS in LEF
. * Also should be same as MLE asymptotically
. * Table 5.7 FGNLS nonrobust standard errors
. estimates table bfgnls, b(%10.4f) se(%10.4f) t stats(N ll)
--------------------------
    Variable |   bfgnls
-------------+------------
       b1int |     1.9840
             |     0.0148
             |     134.30
         b2x |    -0.9907
             |     0.0100
             |     -98.97
-------------+------------
           N | 10000.0000
          ll |
--------------------------
            legend: b/se/t
. * Table 5.7 FGNLS robust standard errors
. estimates table bfgnlscheck bfgnlsrobust, b(%10.4f) se(%10.4f) t stats(N ll)
----------------------------------------
    Variable | bfgnlsch~k   bfgnlsro~t
-------------+--------------------------
       d1opt |    -0.0000      -0.0000
             |     0.0148       0.0146
             |      -0.00        -0.00
       d2opt |     0.0000       0.0000
             |     0.0100       0.0101
             |       0.00         0.00
-------------+--------------------------
           N | 10000.0000   10000.0000
          ll |  4887.7042    4887.7042
----------------------------------------
                          legend: b/se/t
.
. * (4) Print the various log-likelihoods and R-squared
. * Log-likelihood for NLS, FGNLS and WNLS
. di "LLnls: " LLnls " LLfgnls: " LLfgnls " LLwnls: " LLwnls
LLnls: -232.97524 LLfgnls: -208.71965 LLwnls: -208.93381
. * R-squared for NLS, FGNLS and WNLS
. di "Rsqnls: " Rsqnls " Rsqfgnls: " Rsqfgnls " Rsqwnls: " Rsqwnls
Rsqnls: .39134462 Rsqfgnls: .39056605 Rsqwnls: .39017996
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section2\mma05p2nls.txt
log type: text
closed on: 17 May 2005, 13:53:34
-------------------------------------------------------------------------------
       log:  c:\Imbook\bwebpage\Section2\mma05p3nlsbyml.txt
  log type:  text
 opened on:  17 May 2005, 13:54:20
.
. ********** OVERVIEW OF MMA05P3NLSBYML.DO **********
.
. * STATA Program
.
. * Chapter 5.9 pp.159-63
. * Nonlinear Least Squares using Stata command ml
.
. * Provides third column of Table 5.7 for
. * (1) NLS using Stata ml command (easy to get robust s.e.'s)
. * using generated data set mma05data.asc
.
. * Note: Use ml rather than nl as then much easier to get robust s.e.'s
. *       Can instead use stata command nl see program mma05p2nlsbynl.do
.
. * mma05p1mle.do
. * mma05p2nls.do     NLS (and WNLS and FGNLS) using Stata command nl
.
.
. ********** SETUP **********
.
. set more off
. version 8
.
.
. *       x ~ N[mux, sigx^2]
. *    V[y] = exp(-2(a + bx)) = E[y]^2
.
.
. describe
Contains data
  obs:        10,000
 vars:             2
 size:
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
y               float  %9.0g
x               float  %9.0g
-------------------------------------------------------------------------------
Sorted by:
. summarize
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           y |     10000    .6194352    1.291416   .0000445   30.60636
           x |     10000    1.014313    1.004905  -2.895741   4.994059
.
. ********** DO THE ANALYSIS: NLS using STATA COMMAND ML **********
.
. * (1) NLS ESTIMATION USING STATA ML COMMAND (maximum likelihood)
.
. * Advantage: ml command has robust standard errors as an option
.
. * The NLS estimator minimizes SUM_i (y_i - g(x_i'b))^2.
. * Here let g(x'b) = exp(a + b*x) = exp(b1int + b2x*x) say.
. * In fact for this dgp E[y] = exp(-(a + bx)) so sign reversal for the mean.
.
. * To adjust this code to other NLS problems
. * (a) If more regressors, say x1 x2 and x3, replace ml model line with
. *       ml model lf mlexp (y = x1 x2 x3) / sigma
. * (b) If different functional form for mean, say g(x'b), redefine `res' as
. *       `res' = $ML_y1 - g(`theta')
. * (c) If functional form for mean is not single-index then the program
. * will become considerably more complicated with more args.
.
. * (1A) The program "mlexp" defines the objective function
. program define mlexp
  1. version 8.0
  2. args lnf theta sigma      /* theta contains b1int and b2x; sigma is st.dev. of error */
  3. tempvar res               /* create to shorten expression for lnf */
  4. quietly gen double `res' = $ML_y1 - exp(-`theta')
  5. quietly replace `lnf' = -0.5*ln(2*_pi) - ln(`sigma') - 0.5*`res'^2/`sigma'^2
  6. end
.
. * (1B) The following command gives the dep variable (y) and regressors (x + intercept)
. ml model lf mlexp (y = x) / sigma
. ml search
initial:       log likelihood =     -<inf>  (could not be evaluated)
feasible:
improve:
rescale:
rescale eq:    log likelihood = -16938.923
. ml maximize
initial:
rescale:
Iteration 0:   log likelihood = -16938.923  (not concave)

                                                  Number of obs   =      10000
                                                  Wald chi2(1)    =   10492.88
                                                  Prob > chi2     =     0.0000
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
eq1          |
           x |  -.9574683   .0093471  -102.43   0.000    -.9757883   -.9391483
       _cons |   1.887562   .0295701    63.83   0.000     1.829606    1.945519
-------------+----------------------------------------------------------------
sigma        |
       _cons |   1.007465   .0071239   141.42   0.000     .9935028    1.021428
------------------------------------------------------------------------------
. estimates store bnlsbymle
.
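The sigma estimate of 1.007465 from ml differs slightly from the Root MSE of 1.007566 reported by nl. A short Python check, offered as a sketch rather than part of the original program, suggests this is only a degrees-of-freedom difference: maximizing the normal log-likelihood over sigma gives sqrt(SSR/N), whereas nl divides by N-K. The residual sum of squares is copied from the nl output; the variable names are chosen here for illustration.

```python
import math

ss_resid = 10149.8633    # residual SS at the NLS estimates (nl output)
n_obs = 10000
k_params = 2             # b1int and b2x

# ML estimate of sigma under normal errors divides by N.
sigma_ml = math.sqrt(ss_resid / n_obs)

# nl reports the root MSE, which divides by N - K.
rmse_nl = math.sqrt(ss_resid / (n_obs - k_params))

print(round(sigma_ml, 6), round(rmse_nl, 6))
```

This accounts for the difference in sigma only. It does not by itself account for the gap between the ml and nl standard errors of the coefficients.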
. * (1C) Adding ,robust gives Heteroskedastic robust standard errors
. ml model lf mlexp (y = x) / sigma, robust
. ml search
initial:       log pseudo-likelihood =     -<inf>  (could not be evaluated)
feasible:
improve:
rescale:
rescale eq:    log pseudo-likelihood = -16777.282
. ml maximize
initial:
rescale:
rescale eq:    log pseudo-likelihood = -16777.282
Iteration 0:   log pseudo-likelihood = -16777.282  (not concave)

                                                  Number of obs   =      10000
                                                  Wald chi2(1)    =     288.75
                                                  Prob > chi2     =     0.0000
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
eq1          |
           x |  -.9574683   .0563463   -16.99   0.000    -1.067905   -.8470317
       _cons |   1.887562    .127832    14.77   0.000     1.637016    2.138108
-------------+----------------------------------------------------------------
sigma        |
       _cons |   1.007465   .0561714    17.94   0.000     .8973713    1.117559
------------------------------------------------------------------------------
. estimates store bnlsbymlerobust
.
. ***** PRINT RESULTS: Third column of Table 5.7 p.161 **********
.
. * (1) NLS by ML - nonrobust and robust standard errors
. * The coefficient estimates are exactly the same as those using the nl command
. * The estimated standard errors are close - within 10% of those using the nl command
. * Table 5.7 reports the standard errors using the nl command
. estimates table bnlsbymle bnlsbymlerobust, b(%10.4f) se(%10.4f) t stats(N ll)
----------------------------------------
    Variable | bnlsbymle    bnlsbyml~t
-------------+--------------------------
eq1          |
           x |    -0.9575      -0.9575
             |     0.0093       0.0563
             |    -102.43       -16.99
       _cons |     1.8876       1.8876
             |     0.0296       0.1278
             |      63.83        14.77
-------------+--------------------------
sigma        |
       _cons |     1.0075       1.0075
             |     0.0071       0.0562
             |     141.42        17.94
-------------+--------------------------
Statistics   |
           N | 10000.0000   10000.0000
          ll | -1.426e+04   -1.426e+04
----------------------------------------
                          legend: b/se/t
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section2\mma05p3nlsbyml.txt
log type: text
closed on: 17 May 2005, 13:54:27
-------------------------------------------------------------------------------
       log:  c:\Imbook\bwebpage\Section2\mma05p4margeffects.txt
  log type:  text
 opened on:  17 May 2005, 13:57:02
.
. ********** OVERVIEW OF MMA05P4MARGINALEFFECTS.DO **********
.
. * STATA Program
.
. * Chapter 5.9.4 pp.162-3
. * Marginal effects analysis for a nonlinear model (here exponential regression).
.
. * Provides
. * (1) Sample average marginal effect using derivative
. * (2) Sample average marginal effect using first difference
. * (3) Marginal effect evaluated at the sample mean
. * (4) Marginal effects (1)-(3) when model estimated by Stata ml command
.
. * mma05p1mle.do
. * mma05p2nls.do     NLS, WNLS, FGNLS for same data using nl command
. * mma05p3nlsbyml.do NLS for same data using ml command
.
.
. ********** SETUP **********
.
. set more off
. version 8
.
.
. *       x ~ N[mux, sigx^2]
. *    V[y] = exp(-2(a + bx)) = E[y]^2
.
.
. describe
Contains data
  obs:        10,000
 vars:             2
 size:
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
y               float  %9.0g
x               float  %9.0g
-------------------------------------------------------------------------------
Sorted by:
. summarize
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           y |     10000    .6194352    1.291416   .0000445   30.60636
           x |     10000    1.014313    1.004905  -2.895741   4.994059
.
. ********** MARGINAL EFFECTS for CHAPTER 5.9.4 **********

.
. ** (1) DERIVATIVE METHOD FOR SAMPLE AVERAGE MARGINAL EFFECT
.
. * (1A) METHOD A: Use analytical results
. * Since E[y] = exp(-(a + bx))  (Note: here sign reversal for the mean !!)
. *       dE[y]/dx = -b*exp(-(a + bx)) = -b*E[y]
.
. * Estimate the model
. * The Stata code for exponential regression is unusual as it uses a survival-time (st) command
. * Need to declare data to be st data with dependent variable y
. stset y
     failure event:  (assumed to fail at time=y)
obs. time interval:  (0, y]
------------------------------------------------------------------------------
    10000  total obs.
        0  exclusions
 6194.352  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
. quietly streg x, distribution(exponential) nohr
. gen dEydxanalyticalderivative = -_b[x]*exp(-_b[_cons] - _b[x]*x)
. * Alternative is to (1) predict the mean and (2) multiply by -_b[x]
. quietly sum dEydxanalyticalderivative
. scalar mesaad = r(mean)
. di "Sample average marginal effect by analytical derivative = " mesaad
Sample average marginal effect by analytical derivative = .60976598
.
. * (1B) METHOD B: Use numerical derivative (here one-sided)
. * This is same as first difference code, except have small change in x
. * Note: precision problems can arise with small changes in x
. * The following code tries to minimize such problems
. * Change in x will be 0.0001 times the standard deviation of x
. egen sdx = sd(x)
. * Need to tell streg to predict the mean as this is not the default.
. predict y0, mean time
. gen xoriginal = x
. replace x = x+0.0001*sdx
(10000 real changes made)
. predict y1, mean time
. gen dEydxnumericalderivative = (y1 - y0)/(0.0001*sdx)
. quietly sum dEydxnumericalderivative
. scalar mesand = r(mean)
. di "Sample average marginal effect by numerical derivative = " mesand
Sample average marginal effect by numerical derivative = .60949044
. replace x = xoriginal
. drop xoriginal sdx y0 y1
.
. ** (2) FINITE DIFFERENCE METHOD FOR SAMPLE AVERAGE MARGINAL EFFECT
.
. streg x, distribution(exponential) nohr /* y is dependent variable */
   analysis time _t:  y

Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:

No. of subjects =        10000                     Number of obs   =     10000
No. of failures =        10000
Time at risk    =  6194.352464
                                                   LR chi2(1)      =  10003.63
Log likelihood  =    -15752.19                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |  -.9896276   .0098692  -100.27   0.000    -1.008971   -.9702842
       _cons |   1.982921   .0141496   140.14   0.000     1.955188    2.010654
------------------------------------------------------------------------------
.
. * The following method can be used following many stata estimation commands
. * 1. Predict y using sample data.
. *    Need to say predict the mean as this is not the streg default.
. predict y0, mean time
. * 2. Predict y with regressor of x increased by one
. gen xoriginal = x
. replace x = x+1
. predict y1, mean time
. replace x = xoriginal      /* Put x back to initial value for later analysis */
. * 3. Calculate difference
. gen dEydxfinitedifference = y1 - y0
. quietly sum dEydxfinitedifference
. scalar mesafd = r(mean)
. di "Sample average marginal effect by first differences = " mesafd
Sample average marginal effect by first differences = 1.0414485
. drop xoriginal y0 y1
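The first-difference effect (1.0414) is much larger than the derivative-based effect (0.6098). For an exponential mean this gap has a closed form, sketched below in Python as a cross-check outside Stata. The coefficient and the (1A) average are copied from the log above; the variable names are chosen here for illustration.

```python
import math

b_x = -0.9896276             # streg coefficient on x
me_derivative = 0.60976598   # (1A) sample average of -b*E[y|x]

# For E[y|x] = exp(-(a + b*x)):
#   derivative effect   = -b * E[y|x]
#   unit-change effect  = E[y|x+1] - E[y|x] = (exp(-b) - 1) * E[y|x]
# The ratio of the two does not depend on x, so it also holds for the
# sample averages.
ratio = (math.exp(-b_x) - 1) / (-b_x)
me_first_difference = ratio * me_derivative

print(round(ratio, 4), round(me_first_difference, 5))
```

The ratio is about 1.71, so a one-unit change in x is far from a small change for this model, and the two methods are not expected to agree.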
.
. ** (3) DERIVATIVE METHOD FOR MARGINAL EFFECT AT SAMPLE MEAN
.
. * (3A) Use Stata command mfx
. * Need to tell mfx to predict the mean as this is not the streg default.
. mfx compute, dydx predict(mean time)
Marginal effects after ereg
      y  = predicted mean _t (predict, mean time)
         =  .37563828
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
       x |    .371742      .00525   70.81   0.000   .361452  .382032   1.01431
------------------------------------------------------------------------------
. di "Marginal effect by analytical derivative at mean of x using mfx: "
Marginal effect by analytical derivative at mean of x using mfx:
. matrix list e(Xmfx_dydx)

symmetric e(Xmfx_dydx)[1,1]
          x
r1  .371742
.
. * (3B) Write one's own code
. quietly sum x
. scalar meanx = r(mean)
. scalar dEydxatmeanx = -_b[x]*exp(-_b[_cons] - _b[x]*meanx)
. di "Marginal effect by analytical derivative at mean of x done manually: "
Marginal effect by analytical derivative at mean of x done manually:
. di dEydxatmeanx
.371742
.
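The marginal effect at the sample mean can be reproduced from the reported estimates alone. The Python sketch below is a cross-check outside Stata, not part of the original program. It uses the streg coefficients and the mean of x copied from the log, and adds a central-difference numerical derivative, which is a swapped-in variant of the one-sided difference used in (1B). The function name mean_y is chosen here for illustration.

```python
import math

b_cons, b_x = 1.982921, -0.9896276   # streg estimates from the log
mean_x = 1.014313                    # sample mean of x from summarize

def mean_y(x):
    """Conditional mean E[y|x] = exp(-(a + b*x)) at the estimates."""
    return math.exp(-(b_cons + b_x * x))

# Analytical derivative at the mean of x: dE[y]/dx = -b * E[y|x].
me_analytical = -b_x * mean_y(mean_x)

# Central-difference check with a small step h.
h = 1e-5
me_numerical = (mean_y(mean_x + h) - mean_y(mean_x - h)) / (2 * h)

print(round(mean_y(mean_x), 6), round(me_analytical, 6))
```

Up to rounding of the inputs, the predicted mean matches the .37563828 shown by mfx and the derivative matches .371742.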
. ** (4) MARGINAL EFFECTS AFTER ML COMMAND
.
. * Preceding (1) - (3) presume there is a built-in command to get MLE.
. * Now consider ML estimation using Stata's ml command.
. * After ml command cannot use predict or mfx.
. * Need to be more manual, as follows.
.
. * Estimate model by ml: for details see mma05p1mle.do
. program define mleexp0
  1. version 8.0
  2. args lnf theta            /* Must use lnf while could use name other than theta */
  3. quietly replace `lnf' = `theta' - $ML_y1*exp(`theta')
  4. end
. quietly ml model lf mleexp0 (y = x)
. quietly ml search
. quietly ml maximize
.
. * Note that here the mean is in fact exp(-a-b*x)
.
. * (1A) Sample average marginal effect by calculus methods
. gen mldEydxanalyticalderivative = -_b[x]*exp(-_b[_cons] - _b[x]*x)
. quietly sum mldEydxanalyticalderivative
. scalar mlmesaad = r(mean)

. di "Sample average marginal effect by analytical derivative = " mlmesaad
Sample average marginal effect by analytical derivative = .60976598
.
. * (1B) Sample average marginal effect by numerical derivative
. egen sdx = sd(x)
. gen y0 = exp(-_b[_cons] - _b[x]*x)
. gen xoriginal = x
. replace x = x+0.0001*sdx
. gen y1 = exp(-_b[_cons] - _b[x]*x)
. gen mldEydxnumericalderivative = (y1 - y0)/(0.0001*sdx)
. quietly sum mldEydxnumericalderivative
. scalar mlmesand = r(mean)
. di "ML sample average marginal effect by numerical derivative = " mlmesand
ML sample average marginal effect by numerical derivative = .60949063
. replace x = xoriginal
. drop xoriginal sdx y0 y1
.
. * (2) Sample average marginal effect by increase x by one unit (finite difference)
. gen mldEydxfinitedifference = exp(-_b[_cons]-_b[x]*(x+1)) - exp(-_b[_cons]-_b[x]*x)
. quietly sum mldEydxfinitedifference
. scalar mlmesafd = r(mean)
. di "Sample average marginal effect by first difference = " mlmesafd
Sample average marginal effect by first difference = 1.0414485
.
. * (3) Marginal effect estimated at the sample mean of x
. quietly sum x
. scalar meanx = r(mean)
. scalar mldEydxatmeanx = -_b[x]*exp(-_b[_cons] - _b[x]*meanx)
. di "ML marginal effect at mean of x by analytical derivative: "

ML marginal effect at mean of x by analytical derivative:
. di mldEydxatmeanx
.371742
.
. ********** DISPLAY RESULTS on p.162-3 **********
.
. di "Marginal Effects: (1A) Analytical deriv (1B) Numerical Deriv (2) First diff"
Marginal Effects: (1A) Analytical deriv (1B) Numerical Deriv (2) First diff
. sum dEydxfinitedifference dEydxanalyticalderivative dEydxnumericalderivative
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
dEydxfinit~e |     10000    1.041449    1.373144     .01325   32.59646
dEydxanaly~e |     10000     .609766    .8039727   .0077578   19.08516
dEydxnumer~e |     10000    .6094904    .8035654   .0077479   19.11325
.
. di "KEY RESULTS FOR CHAPTER 5.9.4 pp.162-3 FOLLOW"
KEY RESULTS FOR CHAPTER 5.9.4 pp.162-3 FOLLOW
. di "(1A) Sample average marginal effect by analytical derivative = " mesaad
(1A) Sample average marginal effect by analytical derivative = .60976598
. di "(1B) Sample average marginal effect by numerical derivative = " mesand
(1B) Sample average marginal effect by numerical derivative = .60949044
. di "(2) Sample average marginal effect by first differences = " mesafd
(2) Sample average marginal effect by first differences = 1.0414485
. di "(3) Marginal effect at mean of x by analytical derivative = " dEydxatmeanx
(3) Marginal effect at mean of x by analytical derivative = .371742
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section2\mma05p4margeffects.txt
log type: text
closed on: 17 May 2005, 13:57:06
-------------------------------------------------------------------------------
       log:  c:\Imbook\bwebpage\Section2\mma06p2Theil.txt
  log type:  text
 opened on:  18 May 2005, 17:45:50
.
. ********** OVERVIEW OF MMA06P2THEIL.DO **********
.
. * STATA Program
.
. * NOTE: Stata does not have a NL2SLS command
.
. * Chapter 6.5.4 nonlinear 2SLS example.
. * Table 6.4 partial only
. * (1) OLS            inconsistent
. * (2) NL2SLS         consistent    NOT INCLUDED AS STATA DOES NOT DO
. * (3) Wrong 2SLS     inconsistent
.
. * To run this program you need data set
. *      mma06p1nl2sls.asc
. * generated by Limdep program MMA06P1NL2SLS.LIM
.
. * Some of the analysis is done in Limdep which (unlike Stata) has
. * an NL2SLS command
.
. ********** SETUP **********
.
. set more off
. version 8.0
.
. ********** READ DATA and SUMMARIZE **********
.
. * Model is  y = 1*x^2 + u
. *           x = 1*z + v
. * where u and v are joint normal (0,0,1,1,0.8)
.
. infile y x xsq z zsq u v using mma06p1nl2sls.asc
.
. describe
Contains data
  obs:           200
 vars:             7
 size:
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
y               float  %9.0g
x               float  %9.0g
xsq             float  %9.0g
z               float  %9.0g
zsq             float  %9.0g
u               float  %9.0g
v               float  %9.0g
-------------------------------------------------------------------------------
Sorted by:
. summarize
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           y |       200    1.632794    2.418096  -2.332656   9.354863
           x |       200    .9970513    .8330302  -1.908285   2.696363
         xsq |       200    1.684581    1.638509   .0000948   7.270374
           z |       200           1           0          1          1
         zsq |       200           1           0          1          1
-------------+--------------------------------------------------------
           u |       200   -.0517871    .9427286  -2.816687   2.202356
           v |       200   -.0029487    .8330302  -2.908285   1.696363
.
.
. regress y xsq, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  1,   199) = 2250.83
       Model |  1558.96322     1  1558.96322           Prob > F      =  0.0000
    Residual |   137.83055   199  .692615831           R-squared     =  0.9188
-------------+------------------------------           Adj R-squared =  0.9184
       Total |  1696.79377   200  8.48396883           Root MSE      =  .83224
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         xsq |   1.189495   .0250721    47.44   0.000     1.140054    1.238936
------------------------------------------------------------------------------
. regress y xsq, noconstant robust
                                                       Number of obs =     200
                                                       F(  1,   199) = 3850.71
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.9188
                                                       Root MSE      =  .83224

------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         xsq |   1.189495   .0191687    62.05   0.000     1.151695    1.227295
------------------------------------------------------------------------------
. estimates store olswrongrob
.
. * (2) NL2SLS command Stata does not have
. * See LIMDEP program MMA06P1NL2SLS.LIM
.
. * (3A) Theil's 2sls where first regress x on z is inconsistent
. regress x z, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  1,   199) =  286.51
       Model |  198.822258     1  198.822258           Prob > F      =  0.0000
    Residual |  138.093918   199  .693939288           R-squared     =  0.5901
-------------+------------------------------           Adj R-squared =  0.5881
       Total |  336.916176   200  1.68458088           Root MSE      =  .83303

------------------------------------------------------------------------------
           x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           z |   .9970513   .0589041    16.93   0.000     .8808949    1.113208
------------------------------------------------------------------------------
. predict xhat
. gen xhatsq = xhat*xhat
. regress y xhatsq, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  1,   199) =   91.19
       Model |  533.203113     1  533.203113           Prob > F      =  0.0000
    Residual |  1163.59065   199  5.84718921           R-squared     =  0.3142
-------------+------------------------------           Adj R-squared =  0.3108
       Total |  1696.79377   200  8.48396883           Root MSE      =  2.4181

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      xhatsq |   1.642466   .1719981     9.55   0.000     1.303293    1.981638
------------------------------------------------------------------------------
. estimates store ivwrong
.
. ********** DISPLAY KEY RESULTS Table 6.4 p.199 **********
.
. * Table 6.4 p.199
. estimates table olswrong olswrongrob ivwrong, b(%8.3f) se stats(N r2) keep(xsq xhatsq)

-----------------------------------------------
    Variable | olswrong   olswro~b   ivwrong
-------------+---------------------------------
         xsq |    1.189      1.189
             |    0.025      0.019
      xhatsq |                          1.642
             |                          0.172
-------------+---------------------------------
           N |  200.000    200.000    200.000
          r2 |    0.919      0.919      0.314
-----------------------------------------------
                                   legend: b/se
.
. * (3B) IV with instrument xsq for zsq should work but Stata cannot do
. ivreg y (xsq = xsq), noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  1,   199) =       .
       Model |  1558.96322     1  1558.96322           Prob > F      =       .
    Residual |   137.83055   199  .692615831           R-squared     =       .
-------------+------------------------------           Adj R-squared =       .
       Total |  1696.79377   200  8.48396883           Root MSE      =  .83224

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         xsq |   1.189495   .0250721    47.44   0.000     1.140054    1.238936
------------------------------------------------------------------------------
Instrumented:  xsq
Instruments:   xsq
------------------------------------------------------------------------------
. corr xsq xsq
(obs=200)

             |      xsq      xsq
-------------+------------------
         xsq |   1.0000
         xsq |   1.0000   1.0000

. corr xsq z
(obs=200)

             |      xsq        z
-------------+------------------
         xsq |   1.0000
           z |        .        .
. regress xsq z, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  1,   199) =  211.41
       Model |  567.562553     1  567.562553           Prob > F      =  0.0000
    Residual |  534.257348   199  2.68471029           R-squared     =  0.5151
-------------+------------------------------           Adj R-squared =  0.5127
       Total |   1101.8199   200  5.50909951           Root MSE      =  1.6385

------------------------------------------------------------------------------
         xsq |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           z |   1.684581   .1158601    14.54   0.000      1.45611    1.913052
------------------------------------------------------------------------------
. predict xsqhat
. regress y xsqhat, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  1,   199) =   91.19
       Model |  533.203113     1  533.203113           Prob > F      =  0.0000
    Residual |  1163.59065   199  5.84718921           R-squared     =  0.3142
-------------+------------------------------           Adj R-squared =  0.3108
       Total |  1696.79377   200  8.48396883           Root MSE      =  2.4181

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      xsqhat |   .9692582   .1015002     9.55   0.000     .7691043    1.169412
------------------------------------------------------------------------------
. * ivreg y (xsq = z), noconstant
.
. gen one = 1
. regress y one, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  1,   199) =   91.19
       Model |  533.203113     1  533.203113           Prob > F      =  0.0000
    Residual |  1163.59065   199  5.84718921           R-squared     =  0.3142
-------------+------------------------------           Adj R-squared =  0.3108
       Total |  1696.79377   200  8.48396883           Root MSE      =  2.4181

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         one |   1.632794   .1709852     9.55   0.000     1.295618    1.969969
------------------------------------------------------------------------------
.
. ********** CLOSE OUTPUT **********
. log close
       log:  c:\Imbook\bwebpage\Section2\mma06p2Theil.txt
  log type:  text
 closed on:  18 May 2005, 17:45:50
------------------------------------------------------------------------------
------------------------------------------------------------------------------
       log:  c:\Imbook\bwebpage\Section2\mma06p2twostage.txt
  log type:  text
 opened on:  18 May 2005, 17:59:06
.
. ********** OVERVIEW OF MMA06P2TWOSTAGE.DO **********
.
. * STATA Program
.
. * NOTE: Stata does not have a NL2SLS command
.
. * Chapter 6.5.4 nonlinear 2SLS example on pages 198-9.
.
. * Table 6.4 partial only
. * (1) OLS inconsistent
. * (2) NL2SLS consistent NOT INCLUDED AS STATA DOES NOT DO
. * (3) Twostage Here 2SLS using Theil's interpretation of 2SLS is inconsistent
.
. * To run this program you need data set
. *    mma06p1nl2sls.asc
. * generated by Limdep program MMA06P1NL2SLS.LIM
.
. * Some of the analysis is done in Limdep which (unlike Stata) has
. * an NL2SLS command
.
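The note above reflects the Stata release used for this log (version 8.0). As a hedged aside that is not part of the original program, later Stata releases include a gmm command that can fit the same kind of moment condition. The sketch below uses simulated data, and the varying instrument z, the sample size and the seed are all illustrative assumptions (in the data set used in this log, z equals 1 for every observation).

```stata
* Sketch only: needs a Stata release that provides gmm and rnormal().
* The data-generating process, seed and sample size are assumptions.
clear
set seed 12345
set obs 2000
* u and v are standard normal with correlation 0.8
gen v = rnormal()
gen u = 0.8*v + sqrt(1 - 0.8^2)*rnormal()
* Hypothetical instrument with variation
gen z = rnormal()
gen x = z + v
gen y = x^2 + u
gen zsq = z^2

* Moment condition E[zsq*(y - b*x^2)] = 0, with b the only parameter
gmm (y - {b=1}*x^2), instruments(zsq, noconstant) onestep
```

Because the model is linear in b once x^2 is treated as the regressor, this exactly identified fit should coincide with linear IV of y on x^2 using zsq as the instrument.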
. ********** SETUP **********
.
. set more off
. version 8.0
.
. ********** READ DATA and SUMMARIZE **********
.
. * Model is  y = 1*x^2 + u
. *           x = 1*z + v
. * where u and v are joint normal (0,0,1,1,0.8)
.
. infile y x xsq z zsq u v using mma06p1nl2sls.asc
.
. describe
Contains data
  obs:           200
 vars:             7
 size:
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
y               float  %9.0g
x               float  %9.0g
xsq             float  %9.0g
z               float  %9.0g
zsq             float  %9.0g
u               float  %9.0g
v               float  %9.0g
-------------------------------------------------------------------------------
Sorted by:
. summarize
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
           y |       200    1.632794    2.418096   -2.332656   9.354863
           x |       200    .9970513    .8330302   -1.908285   2.696363
         xsq |       200    1.684581    1.638509    .0000948   7.270374
           z |       200           1           0           1          1
         zsq |       200           1           0           1          1
-------------+---------------------------------------------------------
           u |       200   -.0517871    .9427286   -2.816687   2.202356
           v |       200   -.0029487    .8330302   -2.908285   1.696363
.
.
. regress y xsq, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  1,   199) = 2250.83
       Model |  1558.96322     1  1558.96322           Prob > F      =  0.0000
    Residual |   137.83055   199  .692615831           R-squared     =  0.9188
-------------+------------------------------           Adj R-squared =  0.9184
       Total |  1696.79377   200  8.48396883           Root MSE      =  .83224

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         xsq |   1.189495   .0250721    47.44   0.000     1.140054    1.238936
------------------------------------------------------------------------------
. estimates store olswrong
. regress y xsq, noconstant robust
                                                       Number of obs =     200
                                                       F(  1,   199) = 3850.71
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.9188
                                                       Root MSE      =  .83224

------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         xsq |   1.189495   .0191687    62.05   0.000     1.151695    1.227295
------------------------------------------------------------------------------
. estimates store olswrongrob
.
. * (2) NL2SLS command Stata does not have
. * See LIMDEP program MMA06P1NL2SLS.LIM
. * See also code further down
.
. * (3A) Theil's 2sls where first regress x on z
. *      and then use xhat^2 as instrument for x^2 is inconsistent
.
. regress x z, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  1,   199) =  286.51
       Model |  198.822258     1  198.822258           Prob > F      =  0.0000
    Residual |  138.093918   199  .693939288           R-squared     =  0.5901
-------------+------------------------------           Adj R-squared =  0.5881
       Total |  336.916176   200  1.68458088           Root MSE      =  .83303

------------------------------------------------------------------------------
           x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           z |   .9970513   .0589041    16.93   0.000     .8808949    1.113208
------------------------------------------------------------------------------
. predict xhat
. gen xhatsq = xhat*xhat
. regress y xhatsq, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  1,   199) =   91.19
       Model |  533.203113     1  533.203113           Prob > F      =  0.0000
    Residual |  1163.59065   199  5.84718921           R-squared     =  0.3142
-------------+------------------------------           Adj R-squared =  0.3108
       Total |  1696.79377   200  8.48396883           Root MSE      =  2.4181

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      xhatsq |   1.642466   .1719981     9.55   0.000     1.303293    1.981638
------------------------------------------------------------------------------
. estimates store twostage
.
. ********** DISPLAY KEY RESULTS Table 6.4 p.199 **********
.
. * Table 6.4 p.199 first and third columns
. estimates table olswrong twostage, b(%8.3f) se stats(N r2) keep(xsq xhatsq)

------------------------------------
    Variable | olswrong   twostage
-------------+----------------------
         xsq |    1.189
             |    0.025
      xhatsq |               1.642
             |               0.172
-------------+----------------------
           N |  200.000    200.000
          r2 |    0.919      0.314
------------------------------------
                        legend: b/se
.
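The inconsistency of the plug-in approach in (3A) is easier to see in a large simulated sample. The sketch below is not part of the original program: it assumes a Stata release with ivregress and rnormal(), and it uses a hypothetical instrument z that varies across observations (unlike the data set in this log, where z = 1 throughout). The seed and sample size are arbitrary.

```stata
* Sketch only: simulated data under the model y = x^2 + u, x = z + v.
clear
set seed 10101
set obs 100000
gen v = rnormal()
* 0.8*v + 0.6*e gives var(u) = 1 and corr(u,v) = 0.8
gen u = 0.8*v + 0.6*rnormal()
gen z = rnormal()
gen x = z + v
gen y = x^2 + u
gen xsq = x^2
gen zsq = z^2

* Plug-in two-stage estimator as in (3A): square the fitted value of x
quietly regress x z, noconstant
predict double xhat, xb
gen double xhatsq = xhat^2
regress y xhatsq, noconstant

* Linear IV treating xsq as the regressor and zsq as its instrument
ivregress 2sls y (xsq = zsq), noconstant
```

Under this particular design the plug-in slope should settle near 4/3 rather than the true value 1, since E[z^2*x^2]/E[z^4] = (3 + 1)/3, while the IV estimate should be close to 1.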
. ********** FURTHER ANALYSIS **********
.
. * For this particular example there are ways to get linear IV to work
. * as the problem is not very nonlinear
.
. * (2A) regress xsq on z giving xsqhat and then regress y on xsqhat
. *      Gives nl2sls estimator though not correct standard errors
.
. * Note we get estimator 0.969 which is correct - Table 6.4 had typo
. regress xsq z, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  1,   199) =  211.41
       Model |  567.562553     1  567.562553           Prob > F      =  0.0000
    Residual |  534.257348   199  2.68471029           R-squared     =  0.5151
-------------+------------------------------           Adj R-squared =  0.5127
       Total |   1101.8199   200  5.50909951           Root MSE      =  1.6385

------------------------------------------------------------------------------
         xsq |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           z |   1.684581   .1158601    14.54   0.000      1.45611    1.913052
------------------------------------------------------------------------------
. predict xsqhat
. regress y xsqhat, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  1,   199) =   91.19
       Model |  533.203113     1  533.203113           Prob > F      =  0.0000
    Residual |  1163.59065   199  5.84718921           R-squared     =  0.3142
-------------+------------------------------           Adj R-squared =  0.3108
       Total |  1696.79377   200  8.48396883           Root MSE      =  2.4181

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      xsqhat |   .9692582   .1015002     9.55   0.000     .7691043    1.169412
------------------------------------------------------------------------------
.
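Because z equals 1 for every observation in this data set, the fitted value xsqhat is simply the sample mean of xsq, so the no-constant regression of y on xsqhat reduces to mean(y)/mean(xsq). A quick arithmetic check, using the two means copied from the summarize output in this log (the scalar names are illustrative):

```stata
* Means of y and xsq as reported by summarize in this log
scalar ybar   = 1.632794
scalar xsqbar = 1.684581
* Slope from regressing y on a constant equal to mean(xsq), no intercept
display "mean(y)/mean(xsq) = " %9.6f ybar/xsqbar
```

The ratio is about 0.969, in line with the coefficient on xsqhat reported in (2A).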
. * (2B) IV with instrument z for xsq should work but Stata cannot do,
. *      apparently because here z = 1, which has no variation
. ivreg y (xsq = z), noconstant
note: z dropped due to collinearity
equation not identified; must have at least as many instruments not in
the regression as there are instrumented variables
r(481);

end of do-file
r(481);

. exit, clear
------------------------------------------------------------------------------
       log:  c:\Imbook\bwebpage\Section2\mma07p1mltests.txt
  log type:  text
 opened on:  17 May 2005, 13:59:20
.
. ********** OVERVIEW OF MMA07P1MLTESTS.DO **********
.
. * STATA Program
.
. * Chapter 7.4 pp.241-3
. * Likelihood-based hypothesis tests
.
. * Implements the three likelihood-based tests presented in Table 7.1:
. * Wald test
. * LR test
. * LM test direct
. * LM test via auxiliary regression
. * for a Poisson model with simulated data (see below).
.
. * NOTE: To implement this program requires:
. *    the free Stata add-on rndpoix
. * To obtain this, in Stata give command: search rndpoix
. * If you don't want to do this, instead use the data set
.
. ********** SETUP ***********
.
. version 8
. set more off
.
. ********** GENERATE DATA ***********
.
. * Model is
. * y ~ Poisson[exp(b1 + b2*x2 + b3*x3 + b4*x4)]
. * where
. * x2, x3 and x4 are iid ~ N[0,1]
. * and b1=0, b2=0.1, b3=0.1 and b4=0.1
.
. set seed 10001
. set obs 200
obs was 0, now 200
. scalar b1 = 0
. scalar b2 = 0.1
. scalar b3 = 0.1
. scalar b4 = 0.1
.
. * Generate regressors
. gen x2 = invnorm(uniform())
. gen x3 = invnorm(uniform())
. gen x4 = invnorm(uniform())
.
. * Generate y
. gen mupoiss = exp(b1+b2*x2+b3*x3+b4*x4)
. * The next requires Stata add-on. In Stata: search rndpoix
. rndpoix(mupoiss)
( Generating ....... )
Variable xp created.
. gen y = xp
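The rndpoix add-on is what this program uses to draw the Poisson counts. As a hedged alternative that is not part of the original program, Stata releases that provide the built-in functions rnormal() and rpoisson() can generate data of the same form without an add-on. The draws will not match the values in this log, because the random-number functions differ.

```stata
* Sketch only: assumes a Stata release with rnormal() and rpoisson().
clear
set seed 10001
set obs 200
scalar b1 = 0
scalar b2 = 0.1
scalar b3 = 0.1
scalar b4 = 0.1
gen x2 = rnormal()
gen x3 = rnormal()
gen x4 = rnormal()
* Conditional mean of the Poisson model
gen mupoiss = exp(b1 + b2*x2 + b3*x3 + b4*x4)
* One Poisson draw per observation with mean mupoiss
gen y = rpoisson(mupoiss)
summarize y mupoiss
```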
.
. sum
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
          x2 |       200   -.0091098    1.010072   -2.857666   2.149822
          x3 |       200   -.1459839    1.109521   -3.086754   3.111421
          x4 |       200   -.0325314    .9674748   -2.852186   2.379461
     mupoiss |       200    1.000447    .1993649    .6191922   1.903112
          xp |       200        .845     .951579           0          6
-------------+---------------------------------------------------------
           y |       200        .845     .951579           0          6
.
. outfile y x2 x3 x4 using mma07p1mltests.asc, replace
.
. ********** ANALYSIS: LIKELIHOOD-BASED HYPOTHESIS TESTS ***********
.
. * Hypotheses to test are
. * (A) Single exclusion:   b3 = 0
. * (B) Multiple exclusion: b3 = 0, b4 = 0
. * (C) Linear:             b3 = b4
. * (D) Nonlinear:          b3/b4 = 1
.
. * Tests are Wald, LR, LM and LM (auxiliary)

.
. ****** (A) TEST H0: b3 = 0
.
. * First skip to (B) where many comments given.
.
. ****** (B) TEST H0: b3 = 0, b4 = 0.
.
. * (1) Wald test requires estimation of unrestricted model only
. poisson y x2 x3 x4
Poisson regression                                Number of obs   =        200
                                                  LR chi2(3)      =       8.30
                                                  Prob > chi2     =     0.0401
                                                  Pseudo R2       =     0.0171

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x2 |  -.0275702   .0767909    -0.36   0.720    -.1780775    .1229371
          x3 |   .1630037   .0670848     2.43   0.015     .0315199    .2944874
          x4 |   .1026568   .0802139     1.28   0.201    -.0545595    .2598732
       _cons |  -.1653238   .0773479    -2.14   0.033     -.316923   -.0137246
------------------------------------------------------------------------------
.
. * (1A) Stata Wald test command
. test (x3=0) (x4=0)
( 1) [y]x3 = 0
( 2) [y]x4 = 0
chi2( 2) = 8.57
Prob > chi2 = 0.0138
.
. * (1B) Wald test done manually
. * Use h'[RVR']-inv*h.
. * Details below will change for each example.
. * In particular, for nonlinear restrictions more work in forming R
. * Note that Stata puts the intercept last, not first.
. * So here the second and third elements of b are set to zero.
. matrix bfull = e(b)                      /* 1xq row vector */
. matrix vfull = e(V)                      /* qxq matrix */
. matrix h = (bfull[1,2]\bfull[1,3])       /* hx1 vector */
. matrix R = (0,1,0,0\0,0,1,0)             /* h x q matrix */
. matrix Wald = h'*syminv(R*vfull*R')*h    /* scalar */

. matrix list h
h[2,1]
c1
r1 .16300365
r2 .10265681
. matrix list R
R[2,4]
c1 c2 c3 c4
r1 0 1 0 0
r2 0 0 1 0
. matrix list Wald
symmetric Wald[1,1]
c1
c1 8.5701855
. scalar WaldB = Wald[1,1]
.
. * (2) Likelihood ratio test requires estimating both models
. poisson y x2 x3 x4
Poisson regression                                Number of obs   =        200
                                                  LR chi2(3)      =       8.30
                                                  Prob > chi2     =     0.0401
                                                  Pseudo R2       =     0.0171

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x2 |  -.0275702   .0767909    -0.36   0.720    -.1780775    .1229371
          x3 |   .1630037   .0670848     2.43   0.015     .0315199    .2944874
          x4 |   .1026568   .0802139     1.28   0.201    -.0545595    .2598732
       _cons |  -.1653238   .0773479    -2.14   0.033     -.316923   -.0137246
------------------------------------------------------------------------------
. estimates store unrestricted             /* Used for Stata lrtest */
. scalar llunrest = e(ll)                  /* Used for manual lrtest */
. poisson y x2
Iteration 1: log likelihood = -242.92271 (backed up)
Poisson regression                                Number of obs   =        200
                                                  LR chi2(1)      =       0.00
                                                  Prob > chi2     =     0.9608
                                                  Pseudo R2       =     0.0000

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x2 |  -.0037493   .0763386    -0.05   0.961    -.1533701    .1458716
       _cons |  -.1684599   .0769294    -2.19   0.029    -.3192388   -.0176811
------------------------------------------------------------------------------
. estimates store restrictedB
. scalar llrestB = e(ll)

.
. * (2A) Stata likelihood ratio test
. lrtest unrestricted restrictedB
likelihood-ratio test                                  LR chi2(2)  =      8.30
(Assumption: restrictedB nested in unrestricted)       Prob > chi2 =    0.0157
.
. * (2B) Likelihood ratio test done manually
. scalar LRB = -2*(llrestB-llunrest)
. di "LR " LRB
LR 8.3023503
.
. * (3) LM test via direct computation requires estimating only the restricted model.
.
. * For exclusion restrictions in the Poisson, from 7.6.2
. * LM = dlnL/db * V[b]-inv * dlnL/db where b evaluated at restricted
. *    = [Sum_i u_i*x_i]'[Sum_i exp(x_i'b)*x_i*x_i']-inv[Sum_i u_i*x_i]
. * First calculate Sum_i u_i*x_i' : a 1x4 row vector
.
. quietly poisson y x2
. predict yhatrest
(option n assumed; predicted number of events)
. gen u = y - yhatrest           /* yhatrest = exp(x_brest) calculated earlier */
. gen one = 1
. matrix vecaccum dlnL_db = u one x2 x3 x4, noconstant
. * Then calculate Sum_i exp(x_i'b)*x_i*x_i'
. gen trx1 = sqrt(yhatrest)
. gen trx2 = sqrt(yhatrest)*x2
. gen trx3 = sqrt(yhatrest)*x3
. gen trx4 = sqrt(yhatrest)*x4
. matrix accum Vb = trx1 trx2 trx3 trx4, noconstant
(obs=200)
. matrix LMdirect = dlnL_db*syminv(Vb)*dlnL_db'
. matrix list dlnL_db
dlnL_db[1,4]
           one          x2          x3          x4
u    1.192e-07  -4.632e-08   37.578639   19.933299
. matrix list Vb

symmetric Vb[4,4]
            trx1        trx2        trx3        trx4
trx1         169
trx2  -2.1828434   171.62608
trx3  -24.733563   16.929495   210.68156
trx4   -5.561359     17.0457   23.027167   157.58531
. matrix list LMdirect

symmetric LMdirect[1,1]
           u
u  8.5750886
. scalar LMdirectB = LMdirect[1,1]
.
. * (4) LM test via auxiliary regression
.
. * N uncentered Rsq from regress (noconstant) 1 on the scores
. * Begin by computing the unrestricted scores at the restricted estimates.
. * This varies from problem to problem.
. * In general could compute lnf(y) at current parameters
. * and then get numerical derivative when perturb beta a little.
. * Here use analytical derivative.
. * s_j = dlnf(y)/db_j = (y-exp(x'b))*x_j for the Poisson
.
. drop yhatrest
. quietly poisson y x2
. predict yhatrest
. gen s1 = (y-yhatrest)*1
. gen s2 = (y-yhatrest)*x2
. gen s3 = (y-yhatrest)*x3
. gen s4 = (y-yhatrest)*x4
. regress one s1 s2 s3 s4, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  4,   196) =    2.36
       Model |  9.18577727     4  2.29644432           Prob > F      =  0.0549
    Residual |  190.814223   196  .973541953           R-squared     =  0.0459
-------------+------------------------------           Adj R-squared =  0.0265
       Total |         200   200           1           Root MSE      =  .98668

------------------------------------------------------------------------------
         one |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          s1 |  -.0265153   .0748092    -0.35   0.723    -.1740497     .121019
          s2 |  -.0102806   .0809418    -0.13   0.899    -.1699093    .1493481
          s3 |   .1794153   .0697359     2.57   0.011     .0418862    .3169444
          s4 |   .1225885   .0821671     1.49   0.137    -.0394566    .2846336
------------------------------------------------------------------------------
. * LM equals N times uncentered Rsq
. scalar LMauxB = e(N)*e(r2)
. * Check: LM equals explained sum of squares
. scalar LMauxB2 = e(mss)
. di "LMauxB " LMauxB " LMauxB2 " LMauxB2
LMauxB 9.1857773 LMauxB2 9.1857773
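The two numbers agree because the dependent variable is identically 1, so the uncentered total sum of squares equals N and N times the uncentered R-squared equals the explained sum of squares. The auxiliary-regression steps can be wrapped in a small helper. In the sketch below, lmaux is a hypothetical program name and the data are simulated; it assumes a Stata release with rnormal() and rpoisson() and is not part of the original program.

```stata
* Sketch only: lmaux is a hypothetical helper, not a built-in command.
capture program drop lmaux
program define lmaux, rclass
    syntax varlist(numeric)
    tempvar one
    quietly generate byte `one' = 1
    * Regress 1 on the scores without a constant
    quietly regress `one' `varlist', noconstant
    * With noconstant, e(r2) is the uncentered R-squared
    return scalar lm  = e(N)*e(r2)
    return scalar ess = e(mss)
end

* Simulated data in the spirit of this program (seed is arbitrary)
clear
set seed 10001
set obs 200
generate x2 = rnormal()
generate x3 = rnormal()
generate x4 = rnormal()
generate y = rpoisson(exp(0.1*x2 + 0.1*x3 + 0.1*x4))

* Unrestricted scores evaluated at the restricted estimates (H0: b3 = b4 = 0)
quietly poisson y x2
predict double mu, n
generate double s1 = (y - mu)
generate double s2 = (y - mu)*x2
generate double s3 = (y - mu)*x3
generate double s4 = (y - mu)*x4
lmaux s1 s2 s3 s4
display "LM (aux) = " r(lm) "   p-value = " chi2tail(2, r(lm))
```

The degrees of freedom passed to chi2tail() are the number of restrictions (two here) and would need to change for other hypotheses.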
.
. * (5) DISPLAY RESULTS
.
. estimates table unrestricted restrictedB, se stats(N ll r2) b(%8.3f)
------------------------------------
    Variable | unrest~d   restri~B
-------------+----------------------
          x2 |   -0.028     -0.004
             |    0.077      0.076
          x3 |    0.163
             |    0.067
          x4 |    0.103
             |    0.080
       _cons |   -0.165     -0.168
             |    0.077      0.077
-------------+----------------------
           N |  200.000    200.000
          ll | -238.772   -242.923
          r2 |
------------------------------------
                        legend: b/se
. * Wald test using stata default Poisson variance matrix
. di "WaldB " WaldB " p-value " chi2tail(2,WaldB)
WaldB 8.5701855 p-value .01377234
. * LR test using Poisson log-likelihoods
. di " LRB " LRB " p-value " chi2tail(2,LRB)
LRB 8.3023503 p-value .0157459
. * LM test direct
. di " LMdirectB " LMdirectB " p-value " chi2tail(2,LMdirectB)
LMdirectB 8.5750886 p-value .01373862
. * LM test by auxiliary regression
. di " LMauxB " LMauxB " p-value " chi2tail(2,LMauxB)
LMauxB 9.1857773 p-value .01012357
.
. ****** (A) TEST H0: b3 = 0
.
. * (1) Wald test
. quietly poisson y x2 x3 x4
. test (x3=0)
( 1) [y]x3 = 0
chi2( 1) = 5.90
Prob > chi2 = 0.0151
. scalar WaldA = r(chi2)
.
. * (2) LR test
. poisson y x2 x4
Poisson regression                                Number of obs   =        200
                                                  LR chi2(2)      =       2.55
                                                  Prob > chi2     =     0.2793
                                                  Pseudo R2       =     0.0053

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x2 |  -.0163179   .0770381    -0.21   0.832    -.1673098     .134674
          x4 |   .1278017   .0800348     1.60   0.110    -.0290637     .284667
       _cons |  -.1719505   .0772389    -2.23   0.026    -.3233359   -.0205651
------------------------------------------------------------------------------
. estimates store restrictedA
. lrtest unrestricted            /* Uses estimates store unrestricted from earlier */
likelihood-ratio test                                  LR chi2(1)  =      5.75
(Assumption: restrictedA nested in unrestricted)       Prob > chi2 =    0.0165
. scalar LRA = r(chi2)

.
. * (3) LM test via direct computation requires estimating only the restricted model.
. * See (B) for more explanation
. drop one yhatrest u trx1 trx2 trx3 trx4
. matrix drop dlnL_db Vb LMdirect
. quietly poisson y x2 x4
. predict yhatrest
. gen one = 1
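. gen u = y - yhatrest
. matrix vecaccum dlnL_db = u one x2 x3 x4, noconstant
. gen trx1 = sqrt(yhatrest)
. gen trx2 = sqrt(yhatrest)*x2
. gen trx3 = sqrt(yhatrest)*x3
. gen trx4 = sqrt(yhatrest)*x4
. matrix accum Vb = trx1 trx2 trx3 trx4, noconstant
(obs=200)
. matrix LMdirect = dlnL_db*syminv(Vb)*dlnL_db'
. matrix list dlnL_db

dlnL_db[1,4]
           one          x2          x3          x4
u   -1.788e-07  -1.717e-07   34.832631  -3.179e-07
. matrix list Vb

symmetric Vb[4,4]
            trx1        trx2        trx3        trx4
trx1         169
trx2  -2.1828435   170.25918
trx3  -21.987555   15.647287    212.5673
trx4   14.371941    16.35821   22.067372   158.94405
. matrix list LMdirect

symmetric LMdirect[1,1]
           u
u  5.9159017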
. scalar LMdirectA = LMdirect[1,1]
.
. * (4) LM test via auxiliary regression
. * See (B) for more explanation
. drop yhatrest s1 s2 s3 s4 one
. quietly poisson y x2 x4
. predict yhatrest
. gen one = 1
. gen s1 = (y-yhatrest)*1
. gen s2 = (y-yhatrest)*x2
. gen s3 = (y-yhatrest)*x3
. gen s4 = (y-yhatrest)*x4
. regress one s1 s2 s3 s4, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  4,   196) =    1.57
       Model |  6.21794802     4    1.554487           Prob > F      =  0.1832
    Residual |  193.782052   196  .988683939           R-squared     =  0.0311
-------------+------------------------------           Adj R-squared =  0.0113
       Total |         200   200           1           Root MSE      =  .99433

------------------------------------------------------------------------------
         one |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          s1 |   -.021781   .0760166    -0.29   0.775    -.1716964    .1281344
          s2 |   .0237921    .082791     0.29   0.774    -.1394834    .1870675
          s3 |   .1785093   .0711813     2.51   0.013     .0381297    .3188889
          s4 |  -.0065009    .084884    -0.08   0.939    -.1739042    .1609024
------------------------------------------------------------------------------
. scalar LMauxA = e(N)*e(r2)
. di "LMauxA " LMauxA
LMauxA 6.217948
.
. * (5) DISPLAY RESULTS in Table 7.1 page 242
.
. estimates table unrestricted restrictedA, se stats(N ll r2) b(%8.3f)
------------------------------------
    Variable | unrest~d   restri~A
-------------+----------------------
          x2 |   -0.028     -0.016
             |    0.077      0.077
          x3 |    0.163
             |    0.067
          x4 |    0.103      0.128
             |    0.080      0.080
       _cons |   -0.165     -0.172
             |    0.077      0.077
-------------+----------------------
           N |  200.000    200.000
          ll | -238.772   -241.648
          r2 |
------------------------------------
                        legend: b/se
. di "WaldA " WaldA " p-value " chi2tail(1,WaldA)
WaldA 5.9040087 p-value .01510647
. di " LRA " LRA " p-value " chi2tail(1,LRA)
LRA 5.7537678 p-value .01645333
. di " LMdirectA " LMdirectA " p-value " chi2tail(1,LMdirectA)
LMdirectA 5.9159017 p-value .01500482
. di " LMauxA " LMauxA " p-value " chi2tail(1,LMauxA)

LMauxA 6.217948 p-value .01264616
.
. ****** (C) TEST H0: b3 = b4
.
. * (1A) Wald test
. poisson y x2 x3 x4
Poisson regression                                Number of obs   =        200
                                                  LR chi2(3)      =       8.30
                                                  Prob > chi2     =     0.0401
                                                  Pseudo R2       =     0.0171

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x2 |  -.0275702   .0767909    -0.36   0.720    -.1780775    .1229371
          x3 |   .1630037   .0670848     2.43   0.015     .0315199    .2944874
          x4 |   .1026568   .0802139     1.28   0.201    -.0545595    .2598732
       _cons |  -.1653238   .0773479    -2.14   0.033     -.316923   -.0137246
------------------------------------------------------------------------------
. test (x3=x4)
( 1) [y]x3 - [y]x4 = 0
chi2( 1) = 0.29
Prob > chi2 = 0.5883
.
. * (1B) Wald test done manually
. * Note that Stata puts the intercept last, not first.
. * So here the second and third elements of b are tested as equal.
. matrix drop h R Wald
. matrix bfull = e(b)                      /* 1xq row vector */
. matrix vfull = e(V)                      /* qxq matrix */
. matrix h = (bfull[1,2]-bfull[1,3])       /* hx1 vector */
. matrix R = (0,1,-1,0)                    /* h x q matrix */
. matrix Wald = h'*syminv(R*vfull*R')*h    /* scalar */

. matrix list h
symmetric h[1,1]
c1
r1 .06034684
. matrix list R
R[1,4]
c1 c2 c3 c4
r1 0 1 -1 0
. matrix list Wald
symmetric Wald[1,1]
c1
c1 .29301766
. scalar WaldC = Wald[1,1]
. di " WaldC " WaldC " p-value " chi2tail(1,WaldC)
WaldC .29301766 p-value .5882932
.
. * (2) LR Test
. * In general getting the restricted MLE requires constrained ML
. * Here simple as if b3=b4 then mean is exp(b1+b2*x2+B3*(x3+x4))
. gen x3plusx4 = x3+x4
. poisson y x2 x3plusx4
Poisson regression                                Number of obs   =        200
                                                  LR chi2(2)      =       8.01
                                                  Prob > chi2     =     0.0182
                                                  Pseudo R2       =     0.0165

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x2 |  -.0287235   .0768651    -0.37   0.709    -.1793763    .1219293
    x3plusx4 |   .1374814   .0479519     2.87   0.004     .0434974    .2314653
       _cons |  -.1672262   .0773265    -2.16   0.031    -.3187832   -.0156691
------------------------------------------------------------------------------
. estimates store restrictedC
. lrtest unrestricted            /* Uses estimates store unrestricted from earlier */
likelihood-ratio test                                  LR chi2(1)  =      0.29
(Assumption: restrictedC nested in unrestricted)       Prob > chi2 =    0.5885
. scalar LRC = r(chi2)
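The comment in (2) notes that the restricted MLE generally requires constrained ML, and that here the restriction b3 = b4 can be imposed by using x3+x4 as a single regressor. As a hedged illustration that is not part of the original program, the sketch below imposes the same restriction through Stata's constraint mechanism and compares the two restricted log likelihoods. It assumes a Stata release in which poisson accepts the constraints() option and in which rnormal() and rpoisson() exist; the data are simulated and the scalar names are illustrative.

```stata
* Sketch only: simulated data; the seed is arbitrary.
clear
set seed 10001
set obs 200
gen x2 = rnormal()
gen x3 = rnormal()
gen x4 = rnormal()
gen y = rpoisson(exp(0.1*x2 + 0.1*x3 + 0.1*x4))

* Unrestricted model
quietly poisson y x2 x3 x4
scalar ll_u = e(ll)

* Restricted MLE by imposing b3 = b4 as a linear constraint
constraint define 1 x3 = x4
quietly poisson y x2 x3 x4, constraints(1)
scalar ll_c = e(ll)

* Restricted MLE by the reparameterization used in this log
gen x3plusx4 = x3 + x4
quietly poisson y x2 x3plusx4
scalar ll_r = e(ll)

display "LR via constraints()     = " 2*(ll_u - ll_c)
display "LR via x3plusx4 regressor = " 2*(ll_u - ll_r)
```

The two LR statistics should agree up to convergence tolerance, since both routes maximize the same restricted likelihood.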

.
. * (3) LM test direct
. * Can use same code as earlier. Just different restricted estimates.
. * Now from poisson y x2 x3plusx4
. drop one yhatrest u trx1 trx2 trx3 trx4
. matrix drop dlnL_db Vb
. quietly poisson y x2 x3plusx4
. predict yhatrest
. gen one = 1
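. gen u = y - yhatrest
. matrix vecaccum dlnL_db = u one x2 x3 x4, noconstant
. gen trx1 = sqrt(yhatrest)
. gen trx2 = sqrt(yhatrest)*x2
. gen trx3 = sqrt(yhatrest)*x3
. gen trx4 = sqrt(yhatrest)*x4
. matrix accum Vb = trx1 trx2 trx3 trx4, noconstant
(obs=200)
. matrix LMdirect = dlnL_db*syminv(Vb)*dlnL_db'
. matrix list dlnL_db

dlnL_db[1,4]
           one          x2          x3          x4
u    8.345e-07  -3.601e-07   4.8459933  -4.8459932
. matrix list Vb

symmetric Vb[4,4]
            trx1        trx2        trx3        trx4
trx1         169
trx2  -2.1828442   171.13986
trx3   7.9990827   13.105974   225.99023
trx4   19.217934    15.11254   28.153892   161.75506
. matrix list LMdirect

symmetric LMdirect[1,1]
           u
u  .29306257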
. scalar LMdirectC = LMdirect[1,1]
.
. drop yhatrest s1 s2 s3 s4 one
. quietly poisson y x2 x3plusx4
. predict yhatrest
. gen one = 1
. gen s1 = (y-yhatrest)*1
. gen s2 = (y-yhatrest)*x2
. gen s3 = (y-yhatrest)*x3
. gen s4 = (y-yhatrest)*x4
. regress one s1 s2 s3 s4, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  4,   196) =    0.08
       Model |   .31510777     4  .078776943           Prob > F      =  0.9891
    Residual |  199.684892   196  1.01880047           R-squared     =  0.0016
-------------+------------------------------           Adj R-squared = -0.0188
       Total |         200   200           1           Root MSE      =  1.0094

------------------------------------------------------------------------------
         one |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          s1 |   -.000531    .077731    -0.01   0.995    -.1538275    .1527654
          s2 |    .012802   .0857027     0.15   0.881    -.1562159    .1818199
          s3 |   .0283145   .0761713     0.37   0.711     -.121906    .1785351
          s4 |  -.0367099   .0869889    -0.42   0.673    -.2082642    .1348445
------------------------------------------------------------------------------
. scalar LMauxC = e(N)*e(r2)
. di "LMauxC " LMauxC
LMauxC .31510777
.
. * (5) DISPLAY RESULTS in Table 7.1 page 242
.
. estimates table unrestricted restrictedC, se stats(N ll r2) b(%8.3f)
------------------------------------
    Variable | unrest~d   restri~C
-------------+----------------------
          x2 |   -0.028     -0.029
             |    0.077      0.077
          x3 |    0.163
             |    0.067
          x4 |    0.103
             |    0.080
    x3plusx4 |               0.137
             |               0.048
       _cons |   -0.165     -0.167
             |    0.077      0.077
-------------+----------------------
           N |  200.000    200.000
          ll | -238.772   -238.918
          r2 |
------------------------------------
                        legend: b/se
. di "WaldC " WaldC " p-value " chi2tail(1,WaldC)
WaldC .29301766 p-value .5882932
. di " LRC " LRC " p-value " chi2tail(1,LRC)
LRC .29264001 p-value .5885337
. di " LMdirectC " LMdirectC " p-value " chi2tail(1,LMdirectC)
LMdirectC .29306257 p-value .58826462
. di " LMauxC " LMauxC " p-value " chi2tail(1,LMauxC)
LMauxC .31510777 p-value .57456264
.
. ****** (D) TEST H0: b3/b4 - 1 = 0
.
. * (1) Wald test of b3 /b4 - 1 = 0
. * Stata does not do nonlinear hypotheses.
. * Instead do 7.2.5 algebra.
. matrix drop h R Wald
. matrix h = (bfull[1,2]/bfull[1,3] - 1)
. matrix R = (0, 1/bfull[1,3], -bfull[1,2]/(bfull[1,3]^2), 0)
. matrix Wald = h'*syminv(R*vfull*R')*h
. matrix list h

symmetric h[1,1]
           c1
r1  .58785028
. matrix list R

R[1,4]
            c1          c2          c3          c4
r1           0   9.7411946  -15.467559           0
. matrix list Wald

symmetric Wald[1,1]
           c1
c1  .15768686
. scalar WaldD = Wald[1,1]
. di " WaldD " WaldD " p-value " chi2tail(1,WaldD)
WaldD .15768686 p-value .69129516
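The manual calculation above applies the delta method by hand, with R holding the derivatives of b3/b4 - 1 with respect to the coefficients. As a hedged aside that is not part of the original program, Stata releases that include testnl and nlcom can carry out the same delta-method Wald test directly after estimation. The sketch uses simulated data, so its numbers will differ from those in this log; it also assumes rnormal() and rpoisson() are available.

```stata
* Sketch only: simulated data; the seed is arbitrary.
clear
set seed 10001
set obs 200
gen x2 = rnormal()
gen x3 = rnormal()
gen x4 = rnormal()
gen y = rpoisson(exp(0.1*x2 + 0.1*x3 + 0.1*x4))

quietly poisson y x2 x3 x4

* Delta-method estimate and standard error of b3/b4 - 1
nlcom (ratio: _b[x3]/_b[x4] - 1)

* Wald test of the nonlinear restriction b3/b4 = 1
testnl _b[x3]/_b[x4] = 1
display "Wald chi2(" r(df) ") = " r(chi2) "   p-value = " r(p)
```

The chi-squared statistic from testnl should equal the square of the z statistic that nlcom reports for the ratio, which mirrors the h'[RVR']-inv*h calculation used in this program.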
.
. * (2) LR Test
. * This requires MLE subject to nonlinear constraints.
. * This is difficult so not done here.
. * But note that here will get same result as if
. * get MLE subject to b3 = b4 which was done in (C).
.
. * (3) LM test direct
. * Like (2) requires restricted MLE.
. * This is difficult so not done here.
. * But note that here will get same result as if
. * get MLE subject to b3 = b4 which was done in (C).
.
. * (4) LM test via auxiliary regression
. * Same as for (3)
.
. * (5) DISPLAY RESULTS
. di "WaldD " WaldD " p-value " chi2tail(1,WaldD)
WaldD .15768686 p-value .69129516
.
.
. *********** DISPLAY RESULTS GIVEN IN TABLE 7.1 on page 242 ***********
.
. estimates table unrestricted restrictedA restrictedB restrictedC, se stats(N ll r2) b(%8.3f)
----------------------------------------------------------
    Variable | unrest~d   restri~A   restri~B   restri~C
-------------+--------------------------------------------
          x2 |   -0.028     -0.016     -0.004     -0.029
             |    0.077      0.077      0.076      0.077
          x3 |    0.163
             |    0.067
          x4 |    0.103      0.128
             |    0.080      0.080
    x3plusx4 |                                     0.137
             |                                     0.048
       _cons |   -0.165     -0.172     -0.168     -0.167
             |    0.077      0.077      0.077      0.077
-------------+--------------------------------------------
           N |  200.000    200.000    200.000    200.000
          ll | -238.772   -241.648   -242.923   -238.918
          r2 |
----------------------------------------------------------
                                              legend: b/se
. di "WaldA " WaldA " p-value " chi2tail(1,WaldA)
WaldA 5.9040087 p-value .01510647
.
. * Wald test statistics
. di "Wald A to D: (A) " %8.3f WaldA " (B) " %8.3f WaldB " (C) " %8.3f WaldC " (D) " %8.3f WaldD
Wald A to D: (A) 5.904 (B) 8.570 (C) 0.293 (D) 0.158
. di " p-values : (A) " %8.3f chi2tail(1,WaldA) " (B) " %8.3f chi2tail(2,WaldB) " (C) " %8.3f chi2t
> ail(1,WaldC) " (D) " %8.3f chi2tail(1,WaldD)
p-values : (A) 0.015 (B) 0.014 (C) 0.588 (D) 0.691
.
. * LR test statistics
. di "LR A to D: (A) " %8.3f LRA " (B) " %8.3f LRB " (C) " %8.3f LRC " (D) " %8.3f LRC
LR A to D: (A) 5.754 (B) 8.302 (C) 0.293 (D) 0.293
. di " p-values : (A) " %8.3f chi2tail(1,LRA) " (B) " %8.3f chi2tail(2,LRB) " (C) " %8.3f chi2tail(
> 1,LRC) " (D) " %8.3f chi2tail(1,LRC)
p-values : (A) 0.016 (B) 0.016 (C) 0.589 (D) 0.589
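Each LR statistic above is twice the gap between the unrestricted and restricted log likelihoods reported in the estimates table. A minimal sketch of that arithmetic for test (A), using the rounded ll values from the table, so the last digit can differ from the logged 5.754. The scalar names llu, llr and LRcheck are illustrative and are not part of the original program.

```stata
* Illustrative check only: LR = 2*(ll_unrestricted - ll_restricted)
scalar llu = -238.772      /* unrestricted log likelihood (rounded) */
scalar llr = -241.648      /* restricted model A log likelihood (rounded) */
scalar LRcheck = 2*(llu - llr)
* One restriction, so compare with chi-square(1)
di "LR (A) = " %8.3f LRcheck "  p-value = " %8.3f chi2tail(1, LRcheck)
```

The same two lines with the ll values for restrictedB (and 2 degrees of freedom) or restrictedC reproduce the other LR entries up to rounding.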
.
. * Direct LM test statistics
. di "LM A to D: (A) " %8.3f LMdirectA " (B) " %8.3f LMdirectB " (C) " %8.3f LMdirectC " (D) " %8.
> 3f LMdirectC
LM A to D: (A) 5.916 (B) 8.575 (C) 0.293 (D) 0.293
. di " p-values: (A) " %8.3f chi2tail(1,LMdirectA) " (B) " %8.3f chi2tail(2,LMdirectB) " (C) " %8.
> 3f chi2tail(1,LMdirectC) " (D) " %8.3f chi2tail(1,LMdirectC)
p-values: (A) 0.015 (B) 0.014 (C) 0.588 (D) 0.588
.
. * Auxiliary Regression LM test statistics
. di "LM* A to D: (A) " %8.3f LMauxA " (B) " %8.3f LMauxB " (C) " %8.3f LMauxC " (D) " %8.3f LMauxC
LM* A to D: (A) 6.218 (B) 9.186 (C) 0.315 (D) 0.315
. di " p-values : (A) " %8.3f chi2tail(1,LMauxA) " (B) " %8.3f chi2tail(2,LMauxB) " (C) " %8.3f chi
> 2tail(1,LMauxC) " (D) " %8.3f chi2tail(1,LMauxC)
p-values : (A) 0.013 (B) 0.010 (C) 0.575 (D) 0.575
.
. ********** CLOSE OUTPUT ***********
. log close
log: c:\Imbook\bwebpage\Section2\mma07p1mltests.txt
log type: text
closed on: 17 May 2005, 13:59:21
------------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma07p2power.txt
log type: text
opened on: 17 May 2005, 14:00:49
.
. ********** OVERVIEW OF MMA07P2POWER.DO **********
.
. * STATA Program
.
. * Asymptotic Power of Wald test
.
. * (1) Chapter 7.6.3 obtains power for noncentral chisquare
. * (2) Figure 7.1 (ch7power.wmf) plots against the noncentrality parameter lamda
. * No data needed
.
. ********** SETUP **********
.
. set more off
. version 8.0
. set scheme s1mono /* Graphics scheme */
.
. ********** ANALYSIS **********
.
. * Obtain power of chi-square tests
. * with df degrees of freedom

. * and noncentrality parameter (ncp) lamda from 0 to 20
. * for size alpha = 0.01, 0.05 and 0.10
.
. set obs 201
obs was 0, now 201
. scalar df = 1 /* Degrees of freedom */
. gen lamda = 0.1*(_n-1) /* Lamda = 0, 0.1, 0.2, ..., 19.9, 20.0 */

.
. * Obtain power
. *   = Pr[W > chi-square(alpha) | W ~ noncentral chi-square(df,lamda)]
. * for alpha = 0.01, 0.05 and 0.10
.
. * Critical value at size alpha uses central chisquare
. * invchi2tail gives cv such that Pr(Chi2 > cv) = alpha
. * Power is 1 minus cdf of noncentral chisquare
. * nchi2 gives the cdf of noncentral chisquare
.
. scalar alpha = 0.01
. scalar criticalvalue = invchi2tail(df,alpha)
. gen power01 = 1-nchi2(df,lamda,criticalvalue)
.
.
.
. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       lamda |       201          10    5.816786          0         20
     power01 |       201    .6230651    .3095508        .01   .9710402
     power05 |       201    .7583101    .2717153        .05   .9940005
     power10 |       201    .8152767    .2396043         .1   .9976528
. * For lamda = 0 have size = power, here 0.01, 0.05 and 0.10
. list if lamda==0 | lamda==5 | lamda==10 | lamda==20
     +----------------------------------------+
     | lamda    power01    power05    power10 |
     |----------------------------------------|
  1. |     0        .01        .05         .1 |
 51. |     5   .3670189   .6087795   .7228636 |
101. |    10   .7212129   .8853791   .9354209 |
201. |    20   .9710402   .9940005   .9976528 |
     +----------------------------------------+
.
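The log shows the power01 commands only, although power05 and power10 appear in the summary and listing. A minimal sketch of how all three columns could be generated in one pass, assuming the same grid and the same functions used above (invchi2tail for the central critical value, nchi2 for the noncentral cdf). The loop macro a and the scalar cv are illustrative names.

```stata
* Illustrative sketch: power of a chi-square(1) test on a grid of ncp values
clear
set obs 201
scalar df = 1
gen lamda = 0.1*(_n-1)
foreach a in 01 05 10 {
    * critical value from the central chi-square at size 0.`a'
    scalar cv = invchi2tail(df, 0.`a')
    * power = 1 - cdf of the noncentral chi-square at that critical value
    gen power`a' = 1 - nchi2(df, lamda, cv)
}
* rows 1, 51, 101 and 201 hold lamda = 0, 5, 10 and 20
list lamda power01 power05 power10 if _n==1 | _n==51 | _n==101 | _n==201
```

Selecting rows by _n rather than by lamda==5 avoids relying on exact equality of floating-point values.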
. ********** FIGURE 7.1 (p.249): PLOT THE POWER FUNCTION **********
.
. graph twoway (line power10 lamda, clstyle(p1)) /*
> */ (line power05 lamda, clstyle(p2)) /*
> */ (line power01 lamda, clstyle(p3)), /*
> */ title("Test Power as a function of the ncp") /*
> */ xtitle("Noncentrality parameter lamda", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Test Power", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend( label(1 "Test size = 0.10") label(2 "Test size = 0.05") /*
> */ label(3 "Test size = 0.01"))
. graph export ch7power.wmf, replace
(file c:\Imbook\bwebpage\Section2\ch7power.wmf written in Windows Metafile format)
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section2\mma07p2power.txt
log type: text
closed on: 17 May 2005, 14:00:52
------------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma07p3montecarlo.txt
log type: text
opened on: 18 May 2005, 11:28:58
.
. ********** OVERVIEW OF MMA07P3MONTECARLO.DO **********
.
. * STATA Program
.
. * Chapter 7.7.1-7.7.5 pp. 250-4
. * Size and power of the Wald test

.
. * (1) Figure 7.2 Density of Wald test statistic
. * (2) Table 7.2 Actual size of Wald test at various nominal sizes
. * (3) Table 7.2 Actual power of Wald test at various nominal sizes
. * (4) Table 7.2 Asymptotic power of Wald test at various nominal sizes
. * (5) Alternative way to simulate using postfile rather than simulate
.
. * on the slope coefficient for a Probit model with simulated data (see below).
.
. * NOTE: Because this is a simulation using many samples (here 10,000)
. * the generated data are not saved in a text file.
.
. * Problem can arise if in one of the simulations all of sample is y=0 or y=1
. * Then the probit model is not estimable.
. * Then need increase sample size, change dgp or reduce number of simulations.
. * Here used N=40 with S=10000 for size and for power
. * Another possible change is to have same regressors x across simulations
.
. ********** SETUP **********
.
. set more off
. version 8.0
.
. ********** MONTE CARLO OVERVIEW **********
.
. * The data generating process is
. * - Probit with Pr[y=1] = Phi(b1 + b2*x2)
. * - where b1 = 0 and b2 = 1
. * - and regressor x ~ N[0,1] is redrawn in each simulation (see below for fixed x)
.
. * The sample size N set below in the global numobs
. * The number of simulations S is set below in the global numsims
. * A third option is to switch to same x in each sample. This needs to be done manually.
.
. * The simulation is done using stata command simulate
. * At the end of the program, an alternative using postfile is given
.
. * The program investigates both size and power
. * of the Wald test that b2 = 1.
. * For power the dgp instead uses b2 = 2.
.
. ********** INITIAL SIMULATION SET UP **********
.
. set seed 10101
. * Change the following for different sample size N
. global numobs "40"

. * Change the following for different number of simulations S
. global numsims "10000"
.
. ****** ANALYSIS: SIMULATION OF PROBIT MODEL SLOPE ESTIMATES AND WALD TEST
.
. * The program is rclass.
. * This means the results returned by the program are put into r( )
. * Here we return ymean, yvar, b2hat, seb2hat, ztestforb2eq1
.
. * The probit model is Pr[y=1] = Phi(b1 + b2*x2) where b1=0 and b2=1
. * For size calculations: b2 = 1
. * For power calculations: b2 = 2 (as an example)
. * So pass trueb2 as an argument.
.
. * The following three lines are only needed
. * if the regressors are constant across simulations,
. * as then need to generate once and put in a data file to be reused.
. * They are commented out here as here (x,y) both resampled.
. * Also simprobit and simprobit2 need one line changed if x is fixed.
. /*
> set obs numobs
> gen x = invnorm(uniform())
> save xforsim, replace
> */
. * This version of the program instead redraws both x and y in each simulation
.
. * The program has one argument
. * - trueb2 = value of b2 in the dgp
.
. program simprobit, rclass
1. version 8.0
2. /* define arguments. Here trueb2 = b2 in Phi(b1 + b2*x2) */
. args trueb2
3. /* Generate the data: here x and y */
. drop _all
4. set obs $numobs
5. gen x = invnorm(uniform())
6. /* If instead want same x in each simulation,
> replace above line with: use xforsim */
. gen y = 0
7. replace y = 1 if 0 + `trueb2'*x + invnorm(uniform()) > 0
8. /* Summarize the generated data as a check */
. summarize y
9. return scalar ymean=r(mean)
10. return scalar yvar=r(Var)
11. /* Do probit and store key results */
. probit y x
12. return scalar b2hat=_b[x]

13. return scalar seb2hat = _se[x]
14. return scalar ztestforb2eq1 = (_b[x]-1)/_se[x]
15. end
.
. ****** (1) DISTRIBUTION OF WALD TEST STATISTIC (Figure 7.2 p.253)
.
. * Now call the program simprobit where
. * - include values for each argument within the quotes " "
. * (here the argument is trueb2 and is set to 1 for size and 2 for power)
. * - make sure that ask for each of the returned results
.
. * For size calculations set trueb2 = 1
. simulate "simprobit 1" ymean=r(ymean) yvar=r(yvar) b2hat=r(b2hat) /*
> */ seb2hat=r(seb2hat) ztestforb2eq1=r(ztestforb2eq1), reps($numsims)
command:      simprobit 1
statistics:   ymean      = r(ymean)
              yvar       = r(yvar)
              b2hat      = r(b2hat)
              seb2hat    = r(seb2hat)
              ztestfor~1 = r(ztestforb2eq1)
.
. * Summary of the results returned by simulate
. * For Wald test key output is ztestforb2eq1
. describe
Contains data
  obs:        10,000                          simulate: simprobit 1
 vars:             5                          18 May 2005 11:29
 size:
-------------------------------------------------------------------------------
variable name   storage type   display format   variable label
-------------------------------------------------------------------------------
ymean           float          %9.0g            r(ymean)
yvar            float          %9.0g            r(yvar)
b2hat           float          %9.0g            r(b2hat)
seb2hat         float          %9.0g            r(seb2hat)
ztestforb2eq1   float          %9.0g            r(ztestforb2eq1)
-------------------------------------------------------------------------------
Sorted by:
. summarize

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       ymean |     10000      .49946    .0794447       .225       .775
        yvar |     10000    .2499373    .0089917   .1788462   .2564103
       b2hat |     10000    1.133952    .4516738  -.0306482   9.389184
     seb2hat |     10000    .3589645    .1561059   .1902922   4.583915
ztestforb2~1 |     10000    .1141294    .9558451  -4.087344   2.278257
.
. * For b2hat there are two ways to estimate the standard deviation.
. * One is the average of seb2hat, the standard error of b2hat
. * The other is the standard deviation of b2hat.
. * These are equal asymptotically, but perhaps not in small samples due to bias.
. * Also aveseb2hat is used later in calculating asymptotic power.
. sum seb2hat

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
     seb2hat |     10000    .3589645    .1561059   .1902922   4.583915
. scalar aveseb2hat = r(mean)
. sum b2hat

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       b2hat |     10000    1.133952    .4516738  -.0306482   9.389184
. scalar stdevb2hat = r(sd)
. di "Average standard error of b2hat: " aveseb2hat
Average standard error of b2hat: .3589645
. di "Standard deviation of b2hat: " stdevb2hat
Standard deviation of b2hat: .45167383
.
. * The Wald test statistic will be called Wald
. gen Wald = ztestforb2eq1
. label var Wald "Wald test statistic"
.
. * The mean and st.dev. should be 0 and 1 if Wald ~ N[0,1]
. sum Wald

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        Wald |     10000    .1141294    .9558451  -4.087344   2.278257
.
. * The 2.5 and 97.5 percentiles should be -1.96 and 1.96 if Wald ~ N[0,1]
. * They can be used to get size-adjusted Wald test at 5 percent.
. _pctile Wald, p(2.5,99.5)
. display "Wald: Lower 2.5 percentile = " r(r1) " Upper 2.5 percentile = " r(r2)
Wald: Lower 2.5 percentile = -1.904708 Upper 2.5 percentile = 2.0034728
.
. * The density of the simulated values of the Wald test should be
. * a standard normal density if Wald ~ N[0,1]
. * The following plots kernel estimate of density of Wald and a N[0,1] density
. * Could also do Student[N-k] but this looks same as N[0,1] if N>=30.
. gen N01density = normden(Wald)
. sum Wald

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        Wald |     10000    .1141294    .9558451  -4.087344   2.278257
.
. graph twoway (kdensity Wald, range(-3 3) clstyle(p1)) /*
> */ (connect N01density Wald if Wald>-3 & Wald<3, clstyle(p2) sort(Wald) s(i)), /*
> */ title("Monte Carlo Simulations of Wald Test") /*
> */ xtitle("Wald Test Statistic", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Density", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend( label(1 "Monte Carlo") label(2 "Standard Normal") /*
> */ label(3 "Test size = 0.01"))
. graph export ch7montecarlo.wmf, replace
(file c:\Imbook\bwebpage\Section2\ch7montecarlo.wmf written in Windows Metafile format)
.
. ****** (2) ACTUAL SIZE OF THE WALD TEST STATISTIC (Table 7.2, p.253)
.
. * Obtain the size properties of a two-sided Wald test
. * That rejects if |Wald| > z_alpha/2 where alpha = .01, .05, .1, .2
.
. * Convert to two-sided test by taking absolute value
. gen absWald = abs(Wald)
.
. * Give key percentiles of |Wald|
. * Percentiles must be in ascending order for Stata
. _pctile absWald, p(0.80,0.90,0.95,0.99)
. display "I[Upper percentiles of |Wald|: " " 1 " r(r4) " 5 " r(r3) " 10 " r(r2) " 20 " r(r1)
I[Upper percentiles of |Wald|: 1 .0115847 5 .01074749 10 .00998338 20 .00923005
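_pctile reads the numbers in p() as percentages on the 0 to 100 scale, so p(0.80,0.90,0.95,0.99) asks for the 0.8th to 0.99th percentiles. That is why the values printed above are all close to 0.01 rather than close to the normal critical values. Likewise the earlier call with p(2.5,99.5) returns the 99.5th rather than the 97.5th percentile as its upper value. A minimal sketch of the intended calls, using simulated N[0,1] draws as a stand-in for the Wald statistics (an assumption, so the numbers will not match this log):

```stata
* Illustrative sketch with simulated N(0,1) draws standing in for Wald
clear
set seed 10101
set obs 10000
gen Wald = invnorm(uniform())
gen absWald = abs(Wald)
* _pctile takes percentiles on the 0-100 scale, in ascending order
_pctile absWald, p(80,90,95,99)
di "Upper percentiles of |Wald|: 1% " r(r4) "  5% " r(r3) "  10% " r(r2) "  20% " r(r1)
* Symmetric 2.5 percent tails for a size-adjusted two-sided test at 5 percent
_pctile Wald, p(2.5,97.5)
di "Lower 2.5 percentile = " r(r1) "  Upper 2.5 percentile = " r(r2)
```

For exactly normal statistics the four upper percentiles of |Wald| should be near 2.58, 1.96, 1.64 and 1.28.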
.
. * Program to calculate actual size given nominal size
. * Temporary variables and scalars are in quotes ` '
. program size, rclass
  1.   version 8.0
  2.   args nominalsize
  3.   tempvar reject
  4.   tempname normalcriticalvalue
  5.   quietly {
  6.     scalar `normalcriticalvalue' = invnorm(1-(`nominalsize'/2))
  7.     gen `reject' = 0
  8.     replace `reject' = 1 if absWald > `normalcriticalvalue'
  9.     summarize `reject'
 10.     return scalar actualsize = r(mean)
 11.   }
 12. end
.
. * Calculate actual size for nominal sizes 0.01, 0.05, 0.10 and 0.20
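. size 0.01
. scalar actualsize01 = r(actualsize)
. size 0.05
. scalar actualsize05 = r(actualsize)
. size 0.10
. scalar actualsize10 = r(actualsize)
. size 0.20
. scalar actualsize20 = r(actualsize)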
.
. * Following gives Actual Size column of Table 7.2 (p.253)
. * Nominal Sizes and Actual Sizes of Two-sided Wald Test
. di "0.01: " actualsize01 _new "0.05: " actualsize05 _new /*
> */ "0.10: " actualsize10 _new "0.20: " actualsize20
0.01: .0053
0.05: .0294
0.10: .0805
0.20: .1922
.
. ****** (3) ACTUAL POWER OF THE WALD TEST STATISTIC (Table 7.2, p.253)
.
. * Consider power when b2 = 2 rather than 1
.
. * Obtain the actual power by simulation
. * Use the same program simprobit as for size,
. * except the argument trueb2 is 2.0 rather than 1.0
.
. drop _all
.
. * For power calculations set trueb2 = 2
. simulate "simprobit 2" ymean=r(ymean) yvar=r(yvar) b2hat=r(b2hat) /*
> */ seb2hat=r(seb2hat) ztestforb2eq1=r(ztestforb2eq1), reps(10000)
command:      simprobit 2
statistics:   ymean      = r(ymean)
              yvar       = r(yvar)
              b2hat      = r(b2hat)
              seb2hat    = r(seb2hat)
              ztestfor~1 = r(ztestforb2eq1)
.
. * Calculate |Wald|
. gen Wald = ztestforb2eq1
. gen absWald = abs(Wald)
.
. summarize

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       ymean |      9929    .4998389    .0791531       .225       .825
        yvar |      9929     .249985    .0090933   .1480769   .2564103
       b2hat |      9929    2.581075     2.73046   .8547966   209.9805
     seb2hat |      9929    1.002628    5.799384   .2816004   540.1536
ztestforb2~1 |      9929    1.667773    .3853416  -.4042006    2.59991
-------------+--------------------------------------------------------
        Wald |      9929    1.667773    .3853416  -.4042006    2.59991
     absWald |      9929    1.668285     .383118   .0033462    2.59991
.
. * Calculate actual power for nominal sizes 0.01, 0.05, 0.10 and 0.20
. * This can use the earlier program size
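. size 0.01
. scalar actualpower01 = r(actualsize)
. size 0.05
. scalar actualpower05 = r(actualsize)
. size 0.10
. scalar actualpower10 = r(actualsize)
. size 0.20
. scalar actualpower20 = r(actualsize)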
.
. * Following gives Actual Power column of Table 7.2 (p.253)
. * Nominal Sizes and Actual Power of Two-sided Wald Test
. di "0.01: " actualpower01 _new "0.05: " actualpower05 _new /*
> */ "0.10: " actualpower10 _new "0.20: " actualpower20
0.01: .0073
0.05: .2257
0.10: .6077
0.20: .8583
.
. ****** (4) ASYMPTOTIC POWER OF THE WALD TEST STATISTIC (Table 7.2, p.253)
.
. * Consider power when b2 = 2 rather than 1
.
. * Calculate asymptotic theoretical power using noncentral chisquare
. * Asymptotic power = Pr[W > chi-square(alpha) | W ~ noncentral chi-square(alpha,ncp)]
. * The noncentrality parameter is 0.5*(delta^2)/(se[b2]^2)
. * Here size has b2 = 1 and power has b2 = 1+delta
. * So delta = b2true - 1.
. * Need to find the standard error of b2.
. * Use the average from earlier simulations.
.
. * Program to calculate asymptotic power given nominal size
. * Temporary variables and scalars and arguments are in quotes ` '
. * invchi2tail gives cv such that Pr(Chi2 > cv) = nominalsize
. * Power is 1 minus cdf of noncentral chisquare
. * nchi2 gives the cdf of noncentral chisquare
.
. drop _all
.
. * Arguments are alpha (size), lamda and df (degrees of freedom)
. program power, rclass
  1.   version 8.0
  2.   args alpha lamda df
  3.   tempname criticalvalue powervianoncentralchi
  4.   quietly {
  5.     scalar `criticalvalue' = invchi2tail(`df',`alpha')
  6.     scalar `powervianoncentralchi' = 1-nchi2(`df',`lamda',`criticalvalue')
  7.     return scalar asymppower = `powervianoncentralchi'
  8.   }
  9. end
.
. * scalar criticalvalue = invchi2tail(df,alpha)
. * replace power = 1-nchi2(df,lamda,criticalvalue)
.
. * Calculate df and lamda.

. * This uses an estimate of se[beta] obtained earlier
. scalar delta = 1 /* Here 2 - 1. Changes for different alternatives */
. scalar lamda = 0.5*(delta*delta)/(aveseb2hat*aveseb2hat)
. scalar df = 1
. di "delta: " delta " aveseb2hat: " aveseb2hat " lamda: " lamda " df: " df
delta: 1 aveseb2hat: .3589645 lamda: 3.8803151 df: 1
.
. * Calculate asymptotic power for nominal sizes 0.01, 0.05, 0.10 and 0.20
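. power 0.01 lamda df
. scalar asymppower01 = r(asymppower)
. power 0.05 lamda df
. scalar asymppower05 = r(asymppower)
. power 0.10 lamda df
. scalar asymppower10 = r(asymppower)
. power 0.20 lamda df
. scalar asymppower20 = r(asymppower)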
.
. * Following gives Asymptotic Power column of Table 7.2 (p.253)
. * Nominal Sizes and Asymptotic Power of Two-sided Wald Test
. di "0.01: " asymppower01 _new "0.05: " asymppower05 _new /*
> */ "0.10: " asymppower10 _new "0.20: " asymppower20
0.01: .2722675
0.05: .50398701
0.10: .62755902
0.20: .75494224
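The asymptotic power column follows directly from the logged formula lamda = 0.5*(delta^2)/(se[b2]^2) with delta = 1 and the average standard error .3589645. A minimal sketch of that arithmetic without the power program; the scalar names sehat, cv and pw are illustrative, and the standard error is typed in from the rounded value displayed above, so the last digits may differ slightly from the log.

```stata
* Illustrative sketch: asymptotic power from the noncentral chi-square(1)
scalar delta = 2 - 1            /* alternative b2 = 2 versus null b2 = 1 */
scalar sehat = .3589645         /* average se of b2hat reported in the log */
scalar lamda = 0.5*(delta^2)/(sehat^2)
scalar df = 1
di "lamda = " lamda
foreach a in 0.01 0.05 0.10 0.20 {
    * central chi-square critical value at nominal size `a'
    scalar cv = invchi2tail(df, `a')
    * power = 1 - noncentral chi-square cdf at that critical value
    scalar pw = 1 - nchi2(df, lamda, cv)
    di "nominal size `a': asymptotic power = " pw
}
```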
.
. ****** (5) ALTERNATIVE ANALYSIS: SIMULATION METHOD USING POSTFILE
.
. * This is an alternative, given for completeness.
. * This fails if the model is not estimable in any of the simulation samples.
. * By contrast, simulate just drops that simulation sample and continues simulating.
.
. * For each round of the simulation, the variables in `sim' are sent
. * as a new line to a Stata data set probitsimresults.
. * The names of these variables are given in quotes after S_1
. * Need as many names in quotes after S_1 as variables at post
. * Then can analyze these using summarize etcetera
.
. * This program has two arguments
. * - numsims = desired number of simulations
. * - trueb2 = slope coefficient used to generate the data
.
. drop _all
.
. program simprobit2
  1.   version 8.0
  2.   args numsims trueb2
  3.   tempname sim
  4.   postfile `sim' meany vary beta sterror ztestforbeta using probitsimresults, replace
  5.   quietly {
  6.     forvalues i = 1/`numsims' {
  7.       drop _all
  8.       set obs $numobs /* may need to change */
  9.       gen x = invnorm(uniform())
 10.       /* If instead want same x in each simulation
>          replace above line with: use xforsim */
  .       gen y = 0
 11.       /* Use b2 = 1.0 for size and 2.0 for power */
  .       replace y = 1 if 0+`trueb2'*x+invnorm(uniform()) > 0
 12.       summarize y
 13.       scalar meany=r(mean)
 14.       scalar vary=r(Var)
 15.       probit y x
 16.       scalar beta=_b[x]
 17.       scalar sterror = _se[x]
 18.       scalar ztestforbeta = (beta-1)/sterror
 19.       post `sim' (meany) (vary) (beta) (sterror) (ztestforbeta)
 20.     }
 21.   }
 22.   postclose `sim'
 23. end
.
. simprobit2 $numsims 1
. use probitsimresults, clear
.
. * Here we just summarize results for comparison with earlier
. * But could do the further analysis as above
. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       meany |     10000    .4989575    .0791248       .225       .775
        vary |     10000    .2499885    .0090127   .1788462   .2564103
        beta |     10000    1.135003    .4315248   .0901358   7.205799
     sterror |     10000    .3583266     .133302   .1863547   3.360862
ztestforbeta |     10000    .1218973     .954814  -3.401833   2.299991
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section2\mma07p3montecarlo.txt
log type: text
closed on: 18 May 2005, 11:29:29
------------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma07p4boot.txt
log type: text
opened on: 18 May 2005, 21:36:29
.
. ********** OVERVIEW OF MMA07P4BOOT.DO **********
.
. * STATA Program
.
. * Chapter 7.8 pages 254-256
. * Bootstrap applied to probit model
. * Provides
. * (1) Bootstrap confidence intervals
. * (2) Bootstrap hypothesis test without refinement
. * (3) Bootstrap hypothesis test with refinement: percentile-t method
.
. * Note corrections to book
. * - sample size is N=40 not N=30
. * - use 999 bootstrap replications not 1000
. * - for asymptotic refinement p.256 the critical region
. *   is (-1.89, 1.80) not (-2.62, 1.83)
.
. * For more detail on bootstrap see
. * Chapter 11: Bootstrap Methods pages 355-383
. * and program mma11p1boot.do
.
. ********** SETUP **********
.
. set more off
. version 8
.
. ********** GENERATE DATA **********
.
. * DGP is Probit: Pr[y=1] = PHI(a + bx)
. * where x is N[0,1]
. * and a = 0 and b = 1
.
. * Change the following for different sample size N
. global numobs "40"
.
. * Probit example with slope coefficient equal to 1
. set seed 10105
. set obs $numobs
obs was 0, now 40
. gen x = invnorm(uniform())
. gen y = 0
. replace y = 1 if 0+1.0*x+invnorm(uniform()) > 0
. save xyforsim, replace
file xyforsim.dta saved
. summarize

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           x |        40   -.0359197    .9203391  -2.210579    1.45199
           y |        40        .475    .5057363          0          1
. probit y x
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:

Probit estimates                                  Number of obs   =         40
                                                  LR chi2(1)      =       9.88
                                                  Prob > chi2     =     0.0017
                                                  Pseudo R2       =     0.1786

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .8168831   .2942893     2.78   0.006     .2400867    1.393679
       _cons |  -.0725436   .2162576    -0.34   0.737    -.4964006    .3513135
------------------------------------------------------------------------------
. save mma07p4boot, replace
file mma07p4boot.dta saved
.
. outfile y x using mma07p4boot.asc, replace
.
. ********** (1) BOOTSTRAP CONFIDENCE INTERVALS **********
.
. * Stata produces four bootstrap 100*(1-alpha) confidence intervals
. * (1)-(2) have no asymptotic refinement
. * (3)-(4) have asymptotic refinement
.
. * (1) Regular asymptotic normal: bhat +/- t(S-1)_alpha/2*se(bhat)
. * except instead of using the initial se(bhat)
. * we use the standard deviation of bhat from the bootstrap reps
. * and use t(S-1) rather than z for critical value
. * where S = number of bootstrap reps
.
. * (2) Percentile method: which orders the bhat(s) from simulations and
. * goes from alpha/2 lowest bhat(s) to the alpha/2 highest bhat(s)
. * where (s) denotes the s-th bootstrap sample
.
. * (3) Bias-corrected. Same as (4) with a=0
.
. * (4) Bias-corrected and accelerated.
. * This works with the pivotal Wald statistic.
. * See the manual [R] bootstrap or a textbook,
. * e.g. Efron and Tibshirani (1993, pp.184-188) with a=0
. * This orders the bhat(s) from the simulations and
. * goes from the p1 to the p2 quantile,
. * where p1 and p2 are bias-correction adjustments to alpha/2 and 1-alpha/2
. * Let p1 = Phi(2z0 - z_alpha/2)
. *     p2 = Phi(2z0 + z_alpha/2)
. *     z0 measures the median bias in bhat with
. *     z0 = Phi-inv(fraction of the bhat(s) < bhat)
. * And if z0=0 then p1 = alpha/2 and no correction
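The bias-corrected endpoints described above can be illustrated numerically. This is a minimal sketch of the formulas p1 = Phi(2z0 - z_alpha/2) and p2 = Phi(2z0 + z_alpha/2) only, not of Stata's internal bootstrap code. It is written with normal() and invnormal(), the names used in Stata 9 and later (invnormal() corresponds to the invnorm() called in this log). The value frac = 0.45 is a made-up share of bootstrap estimates below bhat, chosen purely for illustration.

```stata
* Illustrative sketch of the bias-corrected (BC) percentile adjustment
scalar alpha = 0.05
scalar zcrit = invnormal(1 - alpha/2)
* frac is a hypothetical fraction of the bhat(s) that fall below bhat
scalar frac = 0.45
scalar z0 = invnormal(frac)
scalar p1 = normal(2*z0 - zcrit)
scalar p2 = normal(2*z0 + zcrit)
di "z0 = " z0 "  p1 = " p1 "  p2 = " p2
* With no median bias (frac = 0.5, z0 = 0) the usual percentile bounds return
scalar p1nobias = normal(-zcrit)
scalar p2nobias = normal(zcrit)
di "z0 = 0 gives p1 = " p1nobias "  p2 = " p2nobias
```

The BC interval then runs from the p1 quantile to the p2 quantile of the ordered bootstrap estimates.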
.
. * From page 399, for testing better to use 999 than 1000
. global breps "999" /* The number of bootstrap reps used below */
.
. * (1A) Simplest bootstrap is of all the estimated coefficients
. set seed 10105
. bootstrap "probit y x" _b, reps($breps) bca
command:      probit y x
statistics:   b_x        = _b[x]
              b_cons     = _b[_cons]

Bootstrap statistics                              Number of obs    =        40
                                                  Replications     =       999

------------------------------------------------------------------------------
Variable     |  Reps  Observed      Bias  Std. Err. [95% Conf. Interval]
-------------+----------------------------------------------------------------
         b_x |   999  .8168831  .1017329  .3763803   .0782956   1.555471   (N)
             |                                       .3495505   1.878616   (P)
             |                                       .2808956   1.600026  (BC)
             |                                       .1552112   1.480223 (BCa)
      b_cons |   999 -.0725436 -.0176301  .2448404  -.5530047   .4079175   (N)
             |                                       -.596443   .4247662   (P)
             |                                      -.5528302   .4381396  (BC)
             |                                      -.5205303   .4445401 (BCa)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
       BCa = bias-corrected and accelerated
.
. * (1B) This bootstrap is of MLE of b2 and the associated standard error
. * and additionally gives the bias-accelerated method of Efron
. set seed 10105
. bootstrap "probit y x" _b[x] _se[x], reps($breps) bca
command:      probit y x
statistics:   _bs_1      = _b[x]
              _bs_2      = _se[x]

Bootstrap statistics                              Number of obs    =        40
                                                  Replications     =       999

------------------------------------------------------------------------------
Variable     |  Reps  Observed      Bias  Std. Err. [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _bs_1 |   999  .8168831  .1017329  .3763803   .0782956   1.555471   (N)
             |                                       .3495505   1.878616   (P)
             |                                       .2808956   1.600026  (BC)
             |                                       .1552112   1.480223 (BCa)
       _bs_2 |   999  .2942893  .0422005  .0932673   .1112667   .4773118   (N)
             |                                       .2323841   .5831083   (P)
             |                                       .2214397   .4475662  (BC)
             |                                       .2162534   .4143377 (BCa)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
.
. * (1C) This bootstrap repeats (1B)
. * but will permit bootstrapping if Stata commands are more than one line
. use mma07p4boot, clear
. program define commandtobootstrap, rclass
1. version 8.0
2. quietly probit y x
3. return scalar b2hat=_b[x]
4. return scalar seb2hat=_se[x]
5. end
. set seed 10105
. bootstrap "commandtobootstrap" r(b2hat) r(seb2hat), reps($breps)
command:      commandtobootstrap
statistics:   _bs_1      = r(b2hat)
              _bs_2      = r(seb2hat)

Bootstrap statistics                              Number of obs    =        40
                                                  Replications     =       999

------------------------------------------------------------------------------
Variable     |  Reps  Observed      Bias  Std. Err. [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _bs_1 |   999  .8168831  .1017329  .3763803   .0782956   1.555471   (N)
             |                                       .3495505   1.878616   (P)
             |                                       .2808956   1.600026  (BC)
       _bs_2 |   999  .2942893  .0422005  .0932673   .1112667   .4773118   (N)
             |                                       .2323841   .5831083   (P)
             |                                       .2214397   .4475662  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
.
. ********** (2) BOOTSTRAP HYPOTHESIS TESTS - NO REFINEMENT p.255 **********
.
. * We want to test H0: b2 = 1 against Ha: b2 not equal 1
.
. * For a simple test such as this we can just use
. * the bootstrap confidence intervals from (1)
. * and reject if bhat2 is not in the confidence interval
.
. * Here we instead present a common method without refinement
. * essentially (1) above, performing the usual Wald test,
. * except the standard error is estimated by bootstrap.
. * This is useful when hard to obtain standard error by other means.
. * Here W = (b2hat - b2_0) / seb2hat_boot where b2_0 = 1

. * and reject at level .05 if |W| > z_.025 = 1.96
.
. * Save the estimate
. quietly probit y x
. scalar b2est = _b[x]
. * Obtain the bootstrap standard error
. set seed 10105
. bootstrap "probit y x" _b, reps($breps) bca
command:      probit y x
statistics:   b_x        = _b[x]
              b_cons     = _b[_cons]

Bootstrap statistics                              Number of obs    =        40
                                                  Replications     =       999

------------------------------------------------------------------------------
Variable     |  Reps  Observed      Bias  Std. Err. [95% Conf. Interval]
-------------+----------------------------------------------------------------
         b_x |   999  .8168831  .1017329  .3763803   .0782956   1.555471   (N)
             |                                       .3495505   1.878616   (P)
             |                                       .2808956   1.600026  (BC)
             |                                       .1552112   1.480223 (BCa)
      b_cons |   999 -.0725436 -.0176301  .2448404  -.5530047   .4079175   (N)
             |                                       -.596443   .4247662   (P)
             |                                      -.5528302   .4381396  (BC)
             |                                      -.5205303   .4445401 (BCa)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
. matrix sebboot = e(se)
. scalar seb2boot = sebboot[1,1] /* x is first then constant */
. * Calculate the test statistic
. scalar Wald = (b2est - 1)/seb2boot
.
. * DISPLAY RESULTS at bottom p.255
. * Note: Text had typo:
. * (1-0.817)/0.376 = -0.487 should be (0.817-1)/0.376 = -0.487
.
. di "Probit slope estimate is: " b2est
Probit slope estimate is: .8168831
. di "Bootstrap standard error is: " seb2boot
Bootstrap standard error is: .37638029
. di "Wald statistic (no refinement) is: " Wald
Wald statistic (no refinement) is: -.48652096
. di "Reject at level .05 if |Wald| > 1.96"
Reject at level .05 if |Wald| > 1.96
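The statistic just displayed can be checked by hand from the two logged numbers. A minimal sketch, with the estimate and bootstrap standard error typed in from the output above. The names Wcheck and pcheck are illustrative, and the two-sided p-value from the standard normal is an added convenience that the original program does not compute. normal() is the Stata 9+ name of the standard normal cdf.

```stata
* Illustrative check of the Wald test with a bootstrap standard error
scalar b2est = .8168831        /* probit slope estimate from the log */
scalar seb2boot = .37638029    /* bootstrap standard error from the log */
* W = (b2hat - b2_0)/se_boot with b2_0 = 1
scalar Wcheck = (b2est - 1)/seb2boot
* two-sided p-value under the N[0,1] approximation
scalar pcheck = 2*(1 - normal(abs(Wcheck)))
di "Wald = " Wcheck "  two-sided p-value = " pcheck
```

Since |Wald| is well below 1.96, H0: b2 = 1 is not rejected at the 5 percent level.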
.
. ********** (3) BOOTSTRAP HYPOTHESIS TESTS - PERCENTILE-T p.256 **********
.
. * Stata does not give this. For methods see
. * e.g. Efron and Tibshirani (1993, pp.160-162)
. * e.g. Cameron and Trivedi (2005) Chapter 11.2.6-11.2.7
. * For sample s compute t-test(s) = (bhat(s)-bhat) / se(s)
. * where bhat is initial estimate
. * and bhat(s) and se(s) are for sth round.
. * Order the t-test(s) statistics and choose the alpha/2 percentiles
. * which give the critical values for the t-test
.
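The percentile-t steps listed above can also be written as an explicit resampling loop. The sketch below swaps the bootstrap command used in this program for a hand-coded loop built from bsample and postfile, so it illustrates the method rather than reproducing the author's code. The data are regenerated inside the sketch and the names tstar, sim and reps are illustrative, so the critical values need not match those in this log.

```stata
* Illustrative percentile-t bootstrap for H0: b2 = 1 in a probit model
clear
set seed 10105
set obs 40
gen x = invnorm(uniform())
gen y = (x + invnorm(uniform()) > 0)
quietly probit y x
scalar b2est = _b[x]
scalar Wald = (_b[x] - 1)/_se[x]
tempname sim
tempfile reps
postfile `sim' tstar using `reps'
forvalues s = 1/999 {
    preserve
    * resample N observations with replacement
    bsample
    capture probit y x
    if _rc == 0 {
        * t-statistic centred on the original estimate, not on 1
        capture post `sim' ((_b[x] - b2est)/_se[x])
    }
    restore
}
postclose `sim'
use `reps', clear
* bootstrap critical values replace -1.96 and 1.96
_pctile tstar, p(2.5,97.5)
di "Reject H0 if Wald = " Wald " lies outside (" r(r1) ", " r(r2) ")"
```

Replications in which the probit cannot be estimated are skipped, which mirrors how simulate drops failed samples in the Monte Carlo program.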
. * Implementation requires saving the results from each bootstrap replication
. * in order to obtain critical values from percentiles of bootstrap distribution
.
. * (3A) Here bootstrap computes (b(s) - bhat) / se(s) s = 1,...,S
.
. * Save the estimate and the Wald test statistic
. scalar Wald = (_b[x] - 1)/_se[x]
. * Then bootstrap calculates (b(s) - bhat) / se(s)
. set seed 10105
. bootstrap "probit y x" ((_b[x]-b2est)/_se[x]), reps($breps) /*
> */ level(95) saving(mma07p4bootreps) replace
command:      probit y x
statistic:    _bs_1      = (_b[x]-b2est)/_se[x]

Bootstrap statistics                              Number of obs    =        40
                                                  Replications     =       999

------------------------------------------------------------------------------
Variable     |  Reps  Observed      Bias  Std. Err. [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _bs_1 |   999         0  .1003619  .9350234  -1.834837   1.834837   (N)
             |                                      -1.890602   1.801358   (P)
             |                                      -2.101316   1.565618  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
. * Then get data sets with result from each bootstrap
. use mma07p4bootreps, clear
(bootstrap: probit y x)
. sum /* Here just _bs_1 */

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       _bs_1 |       999    .1003619    .9350234  -3.032139   2.572848
. gen b2test = _bs_1 /* _bs_1 is the bootstrap result of interest */
. sum b2test, detail /* Gives percentiles but not 2.5% and 97.5% */

                           b2test
-------------------------------------------------------------
      Percentiles      Smallest
 1%    -2.188575      -3.032139
 5%    -1.540843      -2.605178
10%    -1.137846      -2.599248       Obs                 999
25%    -.4995352      -2.566578       Sum of Wgt.         999

50%     .1238111                      Mean           .1003619
                        Largest       Std. Dev.      .9350234
75%     .7789762        2.22565
90%     1.338348       2.359132       Variance       .8742688
95%     1.560646       2.377491       Skewness      -.2505319
99%     2.014282       2.572848       Kurtosis       2.853737
. _pctile b2test, p(2.5,97.5)

.
. * DISPLAY RESULTS on p.256
.
. * Note: Error on p.256 Here get (-1.89, 1.80) not (-2.62, 1.83)
. di "Lower 2.5 and upper 2.5 percentile of coeff b for z: " r(r1) " and " r(r2)
Lower 2.5 and upper 2.5 percentile of coeff b for z: -1.8906019 and 1.8013585
. di "Reject H0 if Wald = " Wald " lies outside (" r(r1) ", " r(r2) ")"
Reject H0 if Wald = -.62223436 lies outside (-1.8906019, 1.8013585)
.
. * (3B) Equivalently bootstrap calculates b(s) and se(s) s = 1,...,S
. *      and then later calculate (b(s) - bhat) / se(s)
.
. * Save the estimate and the Wald test statistic
. scalar Wald = (_b[x] - 1)/_se[x]
. * Then bootstrap calculates b(s) and se(s)
. set seed 10105
. bootstrap "probit y x" _b[x] _se[x], reps($breps) /*
> */ level(95) saving(mma07p4bootreps) replace
command:      probit y x
statistics:   _bs_1      = _b[x]
              _bs_2      = _se[x]

Bootstrap statistics                              Number of obs    =        40
                                                  Replications     =       999

------------------------------------------------------------------------------
Variable     |  Reps  Observed      Bias  Std. Err. [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _bs_1 |   999  .8168831  .1017329  .3763803   .0782956   1.555471   (N)
             |                                       .3495505   1.878616   (P)
             |                                       .2808956   1.600026  (BC)
       _bs_2 |   999  .2942893  .0422005  .0932673   .1112667   .4773118   (N)
             |                                       .2323841   .5831083   (P)
             |                                       .2214397   .4475662  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
. * Then get data sets with result from each bootstrap
. use mma07p4bootreps, clear
(bootstrap: probit y x)
. sum /* Here _bs_1 and _bs_2 */

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       _bs_1 |       999     .918616    .3763803   .0030288   3.806198
       _bs_2 |       999    .3364898    .0932673   .2162534    1.34312
. gen b2test = (_bs_1 - b2est)/_bs_2

. _pctile b2test, p(2.5,97.5)
.
. * DISPLAY RESULTS on p.256
. * Note: Error on p.256 Here get (-1.89, 1.80) not (-2.62, 1.83)
. di "Lower 2.5 and upper 2.5 percentile of coeff b for z: " r(r1) " and " r(r2)
Lower 2.5 and upper 2.5 percentile of coeff b for z: -1.8906019 and 1.8013583
. di "Reject H0 if Wald = " Wald " lies outside " r(r1) " ," r(r2) ")"
Reject H0 if Wald = -.62223436 lies outside -1.8906019 ,1.8013583)
.
. log close
log: c:\Imbook\bwebpage\Section2\mma07p4boot.txt
log type: text
closed on: 18 May 2005, 21:36:36
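
The percentile-t calculation in the log above can also be sketched outside Stata. The Python block below is a minimal illustration under stated assumptions, not the authors' code: the function names percentile and percentile_t_test are hypothetical, and the linear-interpolation percentile rule is an assumption that may differ slightly from the rule used by Stata's _pctile. It forms t*(s) = (b(s) - bhat)/se(s) for each replication, takes the lower and upper percentiles as critical values, and compares the original Wald statistic with them.

```python
def percentile(sorted_vals, p):
    """Percentile p (0-100) by linear interpolation between order statistics.

    Assumption: this is one common convention; Stata's _pctile may use a
    slightly different rule, so results can differ in the last digits.
    """
    if not sorted_vals:
        raise ValueError("empty sample")
    pos = (len(sorted_vals) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def percentile_t_test(b_hat, se_hat, b0, boot_b, boot_se, level=0.05):
    """Percentile-t bootstrap test of H0: b = b0.

    b_hat, se_hat : estimate and standard error from the original sample
    boot_b, boot_se : estimates and standard errors from each replication
    Returns (wald, lower, upper, reject).
    """
    if len(boot_b) != len(boot_se):
        raise ValueError("boot_b and boot_se must have the same length")
    wald = (b_hat - b0) / se_hat
    # Studentized bootstrap statistics, centered at the original estimate
    t_star = sorted((b - b_hat) / se for b, se in zip(boot_b, boot_se))
    lower = percentile(t_star, 100.0 * level / 2.0)
    upper = percentile(t_star, 100.0 * (1.0 - level / 2.0))
    reject = wald < lower or wald > upper
    return wald, lower, upper, reject
```

With bhat = .8168831, se = .2942893 and a null value of 1, the first returned value is about -0.6222, which agrees with the Wald statistic displayed in the log.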
------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma08p1cmtests.txt
log type: text
opened on: 17 May 2005, 14:04:20
.
. ********** OVERVIEW OF MMA08P1CMTESTS.DO **********
.
. * STATA Program
.
. * Conditional moment tests example producing Table 8.1
.
. * (A) TEST OF THE CONDITIONAL MEAN
. * (B) TEST THAT CONDITIONAL VARIANCE = MEAN
. * (C) ALTERNATIVE TEST THAT CONDITIONAL VARIANCE = MEAN
. * (D) INFORMATION MATRIX TEST
. * (E) CHI-SQUARE GOODNESS OF FIT TEST
. * for a Poisson model with generated data (see below).
.
. * The data generation requires free Stata add-on command rndpoix
. * In Stata: search rndpoix
.
. ********** SETUP **********
.
. set more off
. version 8.0
.
. ********** GENERATE DATA **********
.
. * Model is
. * y ~ Poisson[exp(b1 + b2*x2)]
. * where
. * x2 is iid ~ N[0,1]
. * and b1=0 and b2=1.
.
. set seed 10001
. set obs 200
obs was 0, now 200
. scalar b1 = 0
. scalar b2 = 1
.
. * Generate x2
. gen x2 = invnorm(uniform())
.
. * Generate y
. gen mupoiss = exp(b1+b2*x2)
. rndpoix(mupoiss)
( Generating ................ )
. gen y = xp
.
. outfile y x2 using mma08p1cmtests.asc, replace
.
. ********* POISSON REGRESSION **********
.
. poisson y x2
Poisson regression                                Number of obs   =        200
                                                  LR chi2(1)      =     321.75
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.3791

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x2 |    1.12402   .0687868    16.34   0.000     .9892006     1.25884
       _cons |  -.1652935    .089065    -1.86   0.063    -.3398578    .0092707
------------------------------------------------------------------------------
. * Obtain exp(x'b)
.
. * Obtain the scores to be used later
. predict yhat
. * For the Poisson s = dlnf(y)/db = (y - exp(x'b))*x
. gen s1 = (y - yhat)
. gen s2 = (y - yhat)*x2
.
. * Summarize data
. * Should get s1 and s2 summing to zero
. sum
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          x2 |       200   -.0091098    1.010072  -2.857666   2.149822
     mupoiss |       200    1.599601    1.674071   .0574026    8.58333
          xp |       200       1.525    2.363749          0         15
           y |       200       1.525    2.363749          0         15
        yhat |       200       1.525    1.803242   .0341372   9.498652
-------------+--------------------------------------------------------
          s1 |       200    1.36e-09     1.36719  -3.148933   6.245292
          s2 |       200    6.69e-09    1.889198  -6.420406   12.97311
.
. ********** ANALYSIS: CONDITIONAL MOMENTS TESTS **********
.
. * The program is appropriate for MLE with density assumed to be correctly specified.
. * Let H0: E[m(y,x,theta)] = 0
. * Then CM = explained sum of squares or N times uncentered Rsq from
. * auxiliary regression of 1 on m and the components of s = dlnf(y)/dtheta
. * The test is chi-squared with dim(m) degrees of freedom.
.
. * Define the dependent variable one for the auxiliary regressions
. gen one = 1
.
. *** (A) TEST OF THE CONDITIONAL MEAN (Table 8.1 p.270 row 1)
.
. * Test H0: E[(y - exp(x'b))*z] = 0 where z = x2sq
.
. * A similar test is relevant for many nonlinear models
. * Just change the expression for the conditional mean.
. * Here we used E[y|x] = exp(x'b) for the Poisson
. * Also for the Poisson z cannot be x as this sums to zero by Poisson foc
. * For some other models (basically non-LEF models) z can be x
.
. gen z = x2*x2
. gen mA = (y - yhat)*z
. regress one mA s1 s2, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  3,   197) =    1.09
       Model |  3.27177115     3  1.09059038           Prob > F      =  0.3536
    Residual |  196.728229   197  .998620451           R-squared     =  0.0164
-------------+------------------------------           Adj R-squared =  0.0014
       Total |         200   200           1           Root MSE      =  .99931

------------------------------------------------------------------------------
         one |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          mA |   .1046155   .0577969     1.81   0.072    -.0093646    .2185956
          s1 |  -.0377486   .0822939    -0.46   0.647    -.2000387    .1245415
          s2 |  -.1544278   .1029465    -1.50   0.135    -.3574463    .0485908
------------------------------------------------------------------------------
. scalar CMA = e(N)*e(r2)
. di "CMA: " CMA " p-value: " chi2tail(1,CMA)
CMA: 3.2717711 p-value: .07048149
.
. * Check that three different ways give same answer.
. di "N times Uncentered R-squared: " e(N)*e(r2)
N times Uncentered R-squared: 3.2717711
. di "Explained Sum of Squares:         " e(mss)
Explained Sum of Squares:         3.2717711
. di "N minus Residual Sum of Squares: " e(N) - e(rss)
N minus Residual Sum of Squares: 3.2717711
.
. *** (B) TEST THAT CONDITIONAL VARIANCE = MEAN (Table 8.1 p.270 row 2)
.
. * Test H0: E[{(y - exp(x'b))^2 - exp(x'b)}*x] = 0
.
. * This test is peculiar to Poisson which restricts mean = variance
.
. * Here m has 2 terms
. gen mB1 = ((y - yhat)^2 - yhat)
. gen mB2 = ((y - yhat)^2 - yhat)*x2
. regress one mB1 mB2 s1 s2, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  4,   196) =    0.60
       Model |  2.43400011     4  .608500026           Prob > F      =  0.6604
    Residual |     197.566   196   1.0079898           R-squared     =  0.0122
-------------+------------------------------           Adj R-squared = -0.0080
       Total |         200   200           1           Root MSE      =   1.004

------------------------------------------------------------------------------
         one |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mB1 |   .0432045   .0542516     0.80   0.427    -.0637873    .1501963
         mB2 |  -.0052374   .0357193    -0.15   0.884    -.0756808     .065206
          s1 |  -.0399879   .1073712    -0.37   0.710     -.251739    .1717633
          s2 |   -.003196   .0852726    -0.04   0.970    -.1713655    .1649735
------------------------------------------------------------------------------
. scalar CMB = e(N)*e(r2)
. di "CMB: " CMB " p-value: " chi2tail(2,CMB)
CMB: 2.4340001 p-value: .29611717
.
. *** (C) ALTERNATIVE TEST THAT CONDITIONAL VARIANCE = MEAN (Table 8.1 p.270 row 3)
.
. * Test H0: E[{(y - exp(x'b))^2 - y}*x] = 0
.
. * This test is peculiar to Poisson which restricts mean = variance
. * This test is also peculiar as here dm/db = 0
.
. gen mC1 = ((y - yhat)^2 - y)
. gen mC2 = ((y - yhat)^2 - y)*x2
.
. * To be consistent with other tests include s1 and s2.
. regress one mC1 mC2 s1 s2, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  4,   196) =    0.60
       Model |  2.43400011     4  .608500027           Prob > F      =  0.6604
    Residual |     197.566   196   1.0079898           R-squared     =  0.0122
-------------+------------------------------           Adj R-squared = -0.0080
       Total |         200   200           1           Root MSE      =   1.004

------------------------------------------------------------------------------
         one |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mC1 |   .0432045   .0542516     0.80   0.427    -.0637873    .1501963
         mC2 |  -.0052374   .0357192    -0.15   0.884    -.0756808     .065206
          s1 |   .0032166   .0825345     0.04   0.969    -.1595531    .1659863
          s2 |  -.0084334   .0641096    -0.13   0.895    -.1348665    .1179997
------------------------------------------------------------------------------
. scalar CMC = e(N)*e(r2)
. di "CMC: " CMC " p-value: " chi2tail(2,CMC)
CMC: 2.4340001 p-value: .29611717
.
. * Since dm/db = 0 could just do the regression without the scores

. regress one mC1 mC2, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  2,   198) =    1.21
       Model |  2.40695177     2  1.20347588           Prob > F      =  0.3016
    Residual |  197.593048   198  .997944688           R-squared     =  0.0120
-------------+------------------------------           Adj R-squared =  0.0021
       Total |         200   200           1           Root MSE      =  .99897

------------------------------------------------------------------------------
         one |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mC1 |   .0458705   .0510111     0.90   0.370    -.0547243    .1464652
         mC2 |  -.0075807     .03212    -0.24   0.814    -.0709218    .0557605
------------------------------------------------------------------------------
. scalar CMCnoscores = e(N)*e(r2)
. di "CMCnoscores: " CMCnoscores " p-value: " chi2tail(2,CMCnoscores)
CMCnoscores: 2.4069518 p-value: .30014911
.
. *** (D) INFORMATION MATRIX TEST (Table 8.1 p.270 row 4)
.
. * Test H0: E[{(y - exp(x'b))^2 - y}*vech(xx')] = 0
.
. * A similar test is relevant for other parametric models
. * In general m = vech(d2lnf(y)/dbdb')
. * and for Poisson this yields above
.
. * Here m is a 3x1 vector
. gen mD1 = ((y - yhat)^2 - y)
. gen mD2 = ((y - yhat)^2 - y)*x2
. gen mD3 = ((y - yhat)^2 - y)*x2*x2
.
. * To be consistent with other tests include s1 and s2.
. regress one mD1 mD2 mD3 s1 s2, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  5,   195) =    0.58
       Model |   2.9463051     5   .58926102           Prob > F      =  0.7129
    Residual |  197.053695   195  1.01053177           R-squared     =  0.0147
-------------+------------------------------           Adj R-squared = -0.0105
       Total |         200   200           1           Root MSE      =  1.0053

------------------------------------------------------------------------------
         one |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mD1 |   .0546342   .0566422     0.96   0.336    -.0570759    .1663442
         mD2 |  -.0712751   .0994042    -0.72   0.474    -.2673205    .1247703
         mD3 |   .0330527   .0464213     0.71   0.477    -.0584996     .124605
          s1 |  -.0098554   .0846533    -0.12   0.907     -.176809    .1570982
          s2 |  -.0146441   .0647803    -0.23   0.821    -.1424041    .1131158
------------------------------------------------------------------------------
. scalar CMD = e(N)*e(r2)
. di "CMD: " CMD " p-value: " chi2tail(3,CMD)
CMD: 2.9463051 p-value: .39997818
.
. * Since dm/db = 0 could just do the regression without the scores
. regress one mD1 mD2 mD3, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  3,   197) =    0.91
       Model |  2.73445751     3  .911485837           Prob > F      =  0.4370
    Residual |  197.265542   197  1.00134793           R-squared     =  0.0137
-------------+------------------------------           Adj R-squared = -0.0013
       Total |         200   200           1           Root MSE      =  1.0007

------------------------------------------------------------------------------
         one |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mD1 |    .056165    .054176     1.04   0.301    -.0506743    .1630043
         mD2 |   -.056325   .0911035    -0.62   0.537    -.2359884    .1233384
         mD3 |   .0233527   .0408339     0.57   0.568     -.057175    .1038805
------------------------------------------------------------------------------
. scalar CMDnoscores = e(N)*e(r2)
. di "CMDnoscores: " CMDnoscores " p-value: " chi2tail(3,CMDnoscores)
CMDnoscores: 2.7344575 p-value: .43440333
.
. *** (E) CHI-SQUARE GOODNESS OF FIT TEST (Table 8.1 p.270 row 5)
.
. * Test H0: E[d_j - Pr[y = j]] = 0
. * where d_j = 1 if y = j for j = 0, 1, 2, and 3 or more
. * and Pr[y = j] = exp(-lamda)*lamda^j/j! for lamda = exp(x'b)
. * Cells get too small if more cells are used than 0, 1, 2, and 3 or more.
.
. * A similar test is relevant for other parametric models,
. * though a natural partitioning for y may be less obvious.
.
. gen d0 = 0
. replace d0 = 1 if y==0
. gen d1 = 0
. replace d1 = 1 if y==1
. gen d2 = 0
. replace d2 = 1 if y==2
. gen p0 = exp(-yhat)
. gen p1 = exp(-yhat)*yhat
. gen p2 = exp(-yhat)*(yhat^2)/2
. gen mE1 = d0 - p0
. gen mE2 = d1 - p1
. gen mE3 = d2 - p2
. regress one mE1 mE2 mE3 s1 s2, noconstant
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  5,   195) =    0.49
       Model |  2.50056717     5  .500113433           Prob > F      =  0.7807
    Residual |  197.499433   195   1.0128176           R-squared     =  0.0125
-------------+------------------------------           Adj R-squared = -0.0128
       Total |         200   200           1           Root MSE      =  1.0064

------------------------------------------------------------------------------
         one |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mE1 |   1.020078   .7290569     1.40   0.163    -.4177712    2.457927
         mE2 |   .7149016   .5053259     1.41   0.159    -.2817042    1.711507
         mE3 |   .2705081    .383646     0.71   0.482    -.4861201    1.027136
          s1 |   .2916116   .2217763     1.31   0.190    -.1457765    .7289997
          s2 |  -.1341565   .1125046    -1.19   0.235    -.3560384    .0877255
------------------------------------------------------------------------------
. scalar CME = e(N)*e(r2)
. di "CME: " CME " p-value: " chi2tail(3,CME)
CME: 2.5005672 p-value: .47518859
.
. * Wrong alternative is basic chisquare
. quietly sum d0
. scalar sumd0 = r(sum)
. quietly sum d1
. scalar sumd1 = r(sum)
. quietly sum d2
. scalar sumd2 = r(sum)
. scalar sumd3 = 1 - sumd0 - sumd1 - sumd2
. quietly sum p0
. scalar sump0 = r(sum)
. quietly sum p1
. scalar sump1 = r(sum)
. quietly sum p2
. scalar sump2 = r(sum)
. scalar sump3 = 1 - sump0 - sump1 - sump2
. scalar chisq = (sumd0-sump0)^2/sump0 + (sumd1-sump1)^2/sump1 /*
> */ + (sumd2-sump2)^2/sump2 + (sumd3-sump3)^2/sump3
. di "Wrong Traditional chi-square: " chisq " p = " chi2tail(3,chisq)
Wrong Traditional chi-square: .47431003 p = .92449803
.
.
. ********** DISPLAY RESULTS (Table 8.1 p.270) **********
.
. sum
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          x2 |       200   -.0091098    1.010072  -2.857666   2.149822
     mupoiss |       200    1.599601    1.674071   .0574026    8.58333
          xp |       200       1.525    2.363749          0         15
           y |       200       1.525    2.363749          0         15
        yhat |       200       1.525    1.803242   .0341372   9.498652
-------------+--------------------------------------------------------
          s1 |       200    1.36e-09     1.36719  -3.148933   6.245292
          s2 |       200    6.69e-09    1.889198  -6.420406   12.97311
         one |       200           1           0          1          1
           z |       200    1.015227    1.286795   .0000877   8.166255
          mA |       200    .1563713    3.403966  -13.52498   26.94856
-------------+--------------------------------------------------------
         mB1 |       200     .334863    3.470417  -6.436038   30.24896
         mB2 |       200      .43869    5.749749  -11.74974   62.83503
         mC1 |       200     .334863    3.077815  -6.838236   24.00367
         mC2 |       200      .43869    4.897291    -12.484   49.86192
         mD1 |       200     .334863    3.077815  -6.838236   24.00367
-------------+--------------------------------------------------------
         mD2 |       200      .43869    4.897291    -12.484   49.86192
         mD3 |       200    .8381842    9.190652    -22.791   103.5763
          d0 |       200        .435    .4970011          0          1
          d1 |       200        .255     .436955          0          1
          d2 |       200         .11    .3136749          0          1
-------------+--------------------------------------------------------
          p0 |       200     .429237    .2918348    .000075   .9664389
          p1 |       200    .2406035    .1137756    .000712    .367864
          p2 |       200    .1235594    .0894167   .0005631   .2706694
         mE1 |       200     .005763    .4287003  -.9289918   .9571021
         mE2 |       200    .0143965    .4210301   -.367864   .9315748
-------------+--------------------------------------------------------
         mE3 |       200   -.0135594    .3065698  -.2706694   .9688674
.
. * Gives Rows 1-5 of Table 8.1 (The CMxnoscores are not reported)
. di "CMA: " CMA " p-value: " chi2tail(1,CMA)
CMA: 3.2717711 p-value: .07048149
. di "CMB: " CMB " p-value: " chi2tail(2,CMB)
CMB: 2.4340001 p-value: .29611717
. di "CMC: " CMC " p-value: " chi2tail(2,CMC)
CMC: 2.4340001 p-value: .29611717
. di "CMD: " CMD " p-value: " chi2tail(3,CMD)
CMD: 2.9463051 p-value: .39997818
. di "CME: " CME " p-value: " chi2tail(3,CME)
CME: 2.5005672 p-value: .47518859
. di "CMCnoscores: " CMCnoscores " p-value: " chi2tail(2,CMCnoscores)
CMCnoscores: 2.4069518 p-value: .30014911
. di "CMDnoscores: " CMDnoscores " p-value: " chi2tail(3,CMDnoscores)
CMDnoscores: 2.7344575 p-value: .43440333
.
. ********** FURTHER ANALYSIS gives M** column in Table 8.1 **********
.
. * The following drops the scores from the regression. Provides lower bound.
. * Results are reported in last column in Table 8.1
. quietly regress one mA, noconstant

. di "CMA without scores:" e(N)*e(r2) " with p = " chi2tail(1,e(N)*e(r2))
CMA without scores:.42328231 with p = .51530376
. quietly regress one mB1 mB2, noconstant
. di "CMB without scores:" e(N)*e(r2) " with p = " chi2tail(2,e(N)*e(r2))
CMB without scores:1.8897296 with p = .38873213
. quietly regress one mC1 mC2, noconstant
. di "CMC without scores:" e(N)*e(r2) " with p = " chi2tail(2,e(N)*e(r2))
CMC without scores:2.4069518 with p = .30014911
. quietly regress one mD1 mD2 mD3, noconstant
. di "CMD without scores:" e(N)*e(r2) " with p = " chi2tail(3,e(N)*e(r2))
CMD without scores:2.7344575 with p = .43440333
. quietly regress one mE1 mE2 mE3, noconstant
. di "CME without scores:" e(N)*e(r2) " with p = " chi2tail(3,e(N)*e(r2))
CME without scores:.73842732 with p = .86413036
.
. log close
log: c:\Imbook\bwebpage\Section2\mma08p1cmtests.txt
log type: text
closed on: 17 May 2005, 14:04:20
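
The arithmetic behind the conditional moment statistics in this log can be checked by hand. The Python sketch below is illustrative only and is not part of the original program: the function names chi2_tail and cm_from_aux_regression are hypothetical, and the closed-form tail probabilities are written out only for 1, 2 and 3 degrees of freedom, which are the cases used in Table 8.1. It relies on the fact stated in the log that, with dependent variable equal to one and no constant, N times the uncentered R-squared equals N minus the residual sum of squares.

```python
import math


def chi2_tail(df, x):
    """Upper-tail probability of a chi-squared(df) variate, df in {1, 2, 3}.

    Closed forms only; a general df would need an incomplete gamma routine.
    """
    if x < 0:
        raise ValueError("x must be nonnegative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    if df == 3:
        return (math.erfc(math.sqrt(x / 2.0))
                + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))
    raise ValueError("this sketch implements df = 1, 2, 3 only")


def cm_from_aux_regression(n_obs, rss):
    """CM statistic from the auxiliary regression of 1 on m and the scores.

    The total (uncentered) sum of squares of a vector of ones is N, so the
    explained sum of squares, and hence N * uncentered R-squared, is N - RSS.
    """
    return n_obs - rss
```

For test (A), cm_from_aux_regression(200, 196.728229) gives about 3.2718 and chi2_tail(1, 3.2717711) gives about 0.0705, in line with the CMA output above.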
------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma08p2nonnested.txt
log type: text
opened on: 18 May 2005, 21:27:00
.
. ********** OVERVIEW OF MMA08P2NONNESTED.DO **********
.
. * STATA Program
.
. * Nonnested model comparison given in Table 8.2:
.
. * (A) AIC AND VARIATIONS
. * (B) VUONG TEST for Overlapping Models

.
. * This example requires the free Stata add-on command rndpoix.
.
. ********** SETUP **********
.
. set more off
. version 8.0
.
. ********** GENERATE DATA **********
.
. * Dgp is
. * y ~ Poisson[exp(b1 + b2*x2 + b3*x3)]
. * where
. * x2 and x3 are iid ~ N[0,1]
. * and b1=0.5 and b2=0.5 and b3=0.5.
.
. * The Models compared are
. * Poisson of y on x2
. * Poisson of y on x3 and x3^2
.
. set seed 10001
. set obs 100
obs was 0, now 100
. scalar b1 = 0.5
. scalar b2 = 0.5
. scalar b3 = 0.5
.
. gen x2 = invnorm(uniform())
. gen x3 = invnorm(uniform())
. gen x2sq = x2*x2
. gen x3sq = x3*x3
.
. * Generate y
. gen mupoiss = exp(b1+b2*x2+b3*x3)
. rndpoix(mupoiss)
( Generating ......... )
. gen y = xp
.
. outfile y x2 x3 x2sq x3sq using mma08p2nonnested.asc, replace
.
. ********* SETUP FOR THIS PROGRAM *********
.
. * Change this if want different regressors
. * Here both models differ from the dgp
. * The Vuong test below assumes that the two models are OVERLAPPING
. global XLISTMODEL1 x2
. global XLISTMODEL2 x3 x3sq
.
. ********* (A) AIC AND VARIATIONS *********
.
. * Stata output from Poisson saves much of this.
. * Also calculate manually.
.
. * The following code can be changed to different models than poisson
. * provided
. * ereturn list yields N = e(N); q = e(k); and LnL = e(ll)
. * We use AIC = -2lnL+2q; BIC = -2lnL+lnN*q; CAIC = -2lnL+(1+lnN)*q
.
. poisson y $XLISTMODEL1
Poisson regression                                Number of obs   =        100
                                                  LR chi2(1)      =      16.28
                                                  Prob > chi2     =     0.0001
                                                  Pseudo R2       =     0.0425

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x2 |    .291164    .072311     4.03   0.000     .1494371    .4328909
       _cons |   .6084331   .0752833     8.08   0.000     .4608806    .7559857
------------------------------------------------------------------------------
. estimates store model1
. scalar ll1 = e(ll)

. scalar q1 = e(k)
. scalar N1 = e(N)
. scalar aic1 = -2*ll1 + 2*q1
. scalar bic1 = -2*ll1 + ln(N1)*q1
. scalar caic1 = -2*ll1 + (1 + ln(N1))*q1
.
. poisson y $XLISTMODEL2
Poisson regression                                Number of obs   =        100
                                                  LR chi2(2)      =      30.96
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.0808

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x3 |   .3588412     .07035     5.10   0.000     .2209578    .4967245
        x3sq |   .0912999   .0514311     1.78   0.076    -.0095032    .1921029
       _cons |    .492656   .0958903     5.14   0.000     .3047144    .6805975
------------------------------------------------------------------------------
. estimates store model2
. scalar ll2 = e(ll)
. scalar q2 = e(k)
. scalar N2 = e(N)
. scalar aic2 = -2*ll2 + 2*q2
. scalar bic2 = -2*ll2 + ln(N2)*q2
. scalar caic2 = -2*ll2 + (1 + ln(N2))*q2
.
. * Display results given in first three rows of Table 8.2 page 284
.
. estimates table model1 model2, stats(N k ll aic bic)

----------------------------------------
    Variable |   model1       model2
-------------+--------------------------
          x2 |  .29116396
          x3 |               .35884118
        x3sq |               .09129986
       _cons |  .60843314    .49265596
-------------+--------------------------
           N |        100          100
           k |          2            3
          ll | -183.43146   -176.09119
         aic |  370.86292    358.18238
         bic |  376.07326    365.99789
----------------------------------------
.
. di "Model 1: " _n "lnL: " ll1 " q: " q1 _n " N: " N1
Model 1:
lnL: -183.43146 q: 2
N: 100
. di "-2lnL: " -2*ll1 _n "AIC: " aic1 _n " BIC: " bic1 _n "caic: " caic1
-2lnL: 366.86292
AIC: 370.86292
BIC: 376.07326
caic: 378.07326
.
. di "Model 2: " _n "lnL: " ll2 " q: " q2 _n " N: " N2
Model 2:
lnL: -176.09119 q: 3
N: 100
. di "-2lnL: " -2*ll2 _n "AIC: " aic2 _n " BIC: " bic2 _n "caic: " caic2
-2lnL: 352.18238
AIC: 358.18238
BIC: 365.99789
caic: 368.99789
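
The three criteria displayed above follow directly from the formulas stated earlier in the log (AIC = -2lnL+2q; BIC = -2lnL+lnN*q; CAIC = -2lnL+(1+lnN)*q). The short Python sketch below reproduces them from the reported log-likelihoods; the function name information_criteria and its argument names are hypothetical and chosen only for this illustration.

```python
import math


def information_criteria(loglik, n_params, n_obs):
    """Return AIC, BIC and CAIC using the definitions given in the log.

    loglik   : maximized log-likelihood lnL
    n_params : number of estimated parameters q
    n_obs    : sample size N
    """
    minus2ll = -2.0 * loglik
    log_n = math.log(n_obs)
    return {
        "aic": minus2ll + 2.0 * n_params,
        "bic": minus2ll + log_n * n_params,
        "caic": minus2ll + (1.0 + log_n) * n_params,
    }
```

Calling information_criteria(-183.43146, 2, 100) and information_criteria(-176.09119, 3, 100) returns values that match the Model 1 and Model 2 lines above to the displayed precision; the smaller value of each criterion favors Model 2.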
.
. ********* (B) VUONG TEST FOR OVERLAPPING MODELS *********
.
. * The test has three variants
. * (1) Nested models: G is contained in F
. * (2) Strictly non-nested models: F intersection G equals null set
. * (3) Overlapping models: F intersection G does not equal null set
.
. * Need to compute lnf(y) for models 1 and 2,
. * where density f is model 1 and density g is model 2
.
. * The procedures will vary with model. Here use Poisson.
.
. * (0) COMPUTE THE LR TEST STATISTIC
.
. * This is LR = Sum_i [ ln (fy1_i / gy2_i) ]
. *            = Sum_i lnfy1_i - Sum_i lngy2_i
. *            = difference in log-likelihood for the two models
.
. * Easiest if program output gives logL
. * Otherwise need to generate manually
.
. quietly poisson y $XLISTMODEL1
. scalar llf = e(ll)
. quietly poisson y $XLISTMODEL2
. scalar llg = e(ll)
. scalar LR = llf - llg
. di "LR = " LR " and llf = " llf " llg = " llg
LR = -7.3402698 and llf = -183.43146 llg = -176.09119
.
. * (1) NESTED MODELS
.
. * Not done here as not relevant for the example of this application.
.
. * (1A) Usual LR test if assume densities correctly specified.
.
. * (1B) If instead want robustified version then need to compute W
. * and use the weighted chi-square test.
. * This is not the appropriate test here,
. * but in 3(A) below W is computed and a weighted chi-square test used.
. * This code could be easily adapted to here.
.
. * (2) STRICTLY NON-NESTED MODELS
.
. * Not done here as not relevant for the example of this application.
. * Test uses LR/what ~ normal where what is computed in 3(B) below.
.
. * (3) OVERLAPPING MODELS
.
. * This is the relevant test here
. * First test whether the models are overlapping (even though here we know that they are)
. * Then do the test
.
. * (3A-1) Compute what^2
.
. * Calculate what^2
. * = (1/N)*Sum_i[ln(fy1_i/gy2_i)^2] - [(1/N)*Sum_i ln(fy1_i/gy2_i)]^2
. * = (1/N) * Sum_i [(ln(fy1_i) - ln(gy2_i))^2] - (LR/N)^2
.
. * For the Poisson
. *    f(y) = exp(-mu)*mu^y/y!
. * so lnf(y) = -mu + y*ln(mu) - lny!
. quietly poisson y $XLISTMODEL1
. predict yhatf
. * Poisson default predict gives yhat = exp(x'b)
. gen lnf = -yhatf + y*ln(yhatf) - lnfact(y)
. quietly poisson y $XLISTMODEL2
. predict yhatg
. gen lng = -yhatg + y*ln(yhatg) - lnfact(y)
. gen lnratiosq = (lnf-lng)^2
. sum lnratiosq
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
   lnratiosq |       100    .6967792    1.816804   .0000331   13.85592
. scalar whatsq = r(sum)/_N - (LR/_N)^2
. scalar Nwhatsq = _N*whatsq
. di "First-stage test statistic whatsq - still need to find critical value"
First-stage test statistic whatsq - still need to find critical value
. di "N*omegahatsq = " Nwhatsq
N*omegahatsq = 69.139128
.
. * Aside: Check by recomputing LR this long way
. gen lnratio = (lnf-lng)
. sum lnratio
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
     lnratio |       100   -.0734027    .8356883  -3.722355   2.571382
. scalar LRcheck = r(sum)
.
. *** Display results given in second last row of Table 8.2 page 284
.
. di "LR = " LR " and LRcheck = " LRcheck
LR = -7.3402698 and LRcheck = -7.3402702
.
. * (3A-2) Find the critical value by first finding W, then its eigenvalues lamda, then simulating
.
. * Calculate estimate of the W matrix on page ?? of Vuong.
. * (a) Can estimate Af = E[d2lnf(y)/dbdb'] as inverse of usual ML variance matrix
. * (b) Since the robust ML variance matrix is V = Ainv*B*Ainv
. * can estimate Bf = -E[dlnf(y)/dbxdlnf(y)/db'] by A*V*A where A is in (a)
. * (c) For Ag same as in part (a) except for model g
. * (d) For Bg same as in part (b) except for model g
. * (e) The only tricky bit is computation of Bfg
.
. gen one = 1
. * (a) Af
. quietly poisson y one $XLISTMODEL1, noconstant
. matrix Af = syminv(e(V))
. * (b) Bf
. quietly poisson y one $XLISTMODEL1, noconstant robust
. * robust gives Ainv*B*Ainv so pre and post multiply by A gives B
. * Also make adjustment as Stata divides by (_N-1). Here use _N.
. matrix Bf = Af*e(V)*Af*(_N-1)/_N
. * (c) Ag
. quietly poisson y one $XLISTMODEL2, noconstant
. matrix Ag = syminv(e(V))
. * (d) Bg
. quietly poisson y one $XLISTMODEL2, noconstant robust
. matrix Bg = Ag*e(V)*Ag*(_N-1)/_N
.
. * (e) Bfg requires more specialized code peculiar to this example
. * For Poisson dlnf(y)/db = Sum_i (y_i - mu_i)*x_i
. * so Bfg = (1/N)*Sum_i [(y_i - muf_i)*xf_i]*[(y_i - mug_i)*xg_i]'
. * For model 1 x is intercept and x2 (global XLISTMODEL1 x2)
. gen bf1 = (y - yhatf)      /* yhatf saved earlier = y - muf */
. gen bf2 = (y - yhatf)*x2
. * For model 2 x is intercept, x3 and x3sq (global XLISTMODEL2 x3 x3sq)
. gen bg1 = (y - yhatg)      /* yhatg saved earlier = y - mug */
. gen bg2 = (y - yhatg)*x3
. gen bg3 = (y - yhatg)*x3sq
. * Create Bfg
. matrix accum BfBg = bf1 bf2 bg1 bg2 bg3, noconstant
(obs=100)
. * and Bfg is the (1,2) submatrix: rows 1 to 2 and columns 3 to 5
. matrix Bfg = BfBg[1..2,3..5]
.
. * Form the matrix W
. * Note there is no need for minus sign as A has been defined as -A
. matrix W11 = Bf*syminv(Af)
. matrix W12 = Bfg*syminv(Ag)
. matrix W21 = Bfg'*syminv(Af)
. matrix W22 = Bg*syminv(Ag)
. matrix W = W11,W12\W21,W22
. matrix list W

W[5,5]
                y:          y:          y:          y:          y:
              one          x2         one          x3        x3sq
y:one   1.5571072   .01745302   1.3738479   .03868485   -.1702893
 y:x2   .05110494   1.4484966   .61074273   .07847014  -.15039712
  bg1   1.1488275    .1064062   1.6030095    .0647251  -.18944561
  bg2   .39558125   .08428705   .20709641   1.0650899  -.05677421
  bg3   1.1180355   -.0564763   .19914593   .07617139   .90718177
.
. * Calculate the eigenvalues of W
. matrix eigenvalues reigvalW ceigvalW = W
. * Real eigenvalues
. matrix list reigvalW
reigvalW[1,5]
              y:          y:          y:          y:          y:
            one          x2         one          x3        x3sq
real  2.7511946   .29082285   1.4750881   1.0021719   1.0616075
. * Complex eigenvalues - hopefully none
. matrix list ceigvalW
ceigvalW[1,5]
y: y: y: y: y:
one x2 one x3 x3sq
complex 0 0 0 0 0
.
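. * This gives the vector lamda of eigenvalues of W
. matrix lamda = reigvalW
. scalar l1 = lamda[1,1]
. scalar l2 = lamda[1,2]
. scalar l3 = lamda[1,3]
. scalar l4 = lamda[1,4]
. scalar l5 = lamda[1,5]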
.
. * Now obtain the p-value and critical value at level 0.05
. preserve
. * Obtain the 5 percent critical value by simulating 10000 draws from
. * M_p+q(lamda) = Sum_j lamda_j*z_j^2 where z_j are N[0,1] so z_j^2 are chi-squared(1)
. set seed 10101
. set obs 10000
. gen randomdraw = l1*invnorm(uniform())^2 + l2*invnorm(uniform())^2 + /*
> */ l3*invnorm(uniform())^2 + l4*invnorm(uniform())^2 + l5*invnorm(uniform())^2
. gen indicator = Nwhatsq >= randomdraw
. quietly sum indicator
. di "p-value for the Omegahatsq test = " 1-r(mean)
p-value for the Omegahatsq test = 0
. sum randomdraw, detail
                         randomdraw
-------------------------------------------------------------
      Percentiles      Smallest
 1%     .6438425       .0756691
 5%     1.286375       .1250253
10%     1.850972       .1326376       Obs               10000
25%     3.137835       .1402145       Sum of Wgt.       10000

50%     5.359223                      Mean           6.614841
                        Largest       Std. Dev.       4.90562
75%     8.751276       38.32291
90%      12.8871       38.75208       Variance       24.06511
95%     16.10237       40.94431       Skewness       1.733549
99%     23.85304       44.08449       Kurtosis       7.514808
. di "Reject overlapping at level .05 if N*omegahatsq exceeds " r(p95)
Reject overlapping at level .05 if N*omegahatsq exceeds 16.102374
. restore
. di "where N*omegahatequals " Nwhatsq
where N*omegahatequals 69.139128
. di "If reject then continue to second step."
If reject then continue to second step.
. di "Otherwise stop as cannot determine whether models are overlapping."
Otherwise stop as cannot determine whether models are overlapping.
.
. * (3B) Do the second stage test if reject at (3A)
. gen TLR = (LR/sqrt(whatsq))/sqrt(_N)
.
. *** Display results given in second last row of Table 8.2 page 284
.
. di "TLR is N[0,1]. Here TLR = " TLR
TLR is N[0,1]. Here TLR = -.88277513
. di "Two-tailed test p-value: " chi2tail(1,TLR^2)
Two-tailed test p-value: .37735778
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section2\mma08p2nonnested.txt
log type: text
closed on: 18 May 2005, 21:27:00
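
The quantities used in steps (3A-1) and (3B) of the log above depend only on the observation-level log-densities under the two models. The Python sketch below is a minimal illustration of those two formulas and is not the authors' code: the function names vuong_statistics and two_sided_normal_pvalue are hypothetical. It computes LR as the sum of lnf - lng, omegahat squared as the mean squared difference minus (LR/N)^2, and TLR = LR/(sqrt(N)*omegahat). It does not cover the eigenvalue simulation of step (3A-2).

```python
import math


def vuong_statistics(lnf, lng):
    """Return (LR, omega_sq, TLR) from per-observation log-densities.

    lnf, lng : sequences of ln f(y_i) and ln g(y_i) for models f and g
    """
    if len(lnf) != len(lng) or len(lnf) == 0:
        raise ValueError("lnf and lng must be nonempty and of equal length")
    n = len(lnf)
    diffs = [f - g for f, g in zip(lnf, lng)]
    lr = sum(diffs)
    # Sample variance of the log-density ratio, using divisor N as in the log
    omega_sq = sum(d * d for d in diffs) / n - (lr / n) ** 2
    if omega_sq <= 0.0:
        raise ValueError("omega_sq is not positive; TLR is undefined")
    tlr = lr / (math.sqrt(n) * math.sqrt(omega_sq))
    return lr, omega_sq, tlr


def two_sided_normal_pvalue(z):
    """Two-sided p-value for a standard normal statistic.

    Equal to the chi-squared(1) upper tail evaluated at z squared.
    """
    return math.erfc(abs(z) / math.sqrt(2.0))
```

As a check against the log, N = 100, a mean squared log-ratio of .6967792 and LR = -7.3402698 imply omegahat squared of about .69139, N times that of about 69.139, and TLR of about -.8828; two_sided_normal_pvalue(-.88277513) is about .3774.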
------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma08p3diagnostics.txt
log type: text
opened on: 17 May 2005, 14:10:13
.
. ********** OVERVIEW OF MMA08P3DIAGNOSTICS.DO **********
.
. * STATA Program
.
. * Model diagnostics example (Table 8.3)
.
. * (A) DIFFERENT R-SQUAREDS
. * (B) CALCULATION OF RESIDUALS
.
. * The data generation requires free Stata add-on command rndpoix
.
. * This program gives results for model 2
. * For model 1 need to rerun with only x3 as regressor
.
. ********** SETUP **********
.
. set more off
. version 8.0
.
. ********** GENERATE DATA **********
.
. * Model is
. * y ~ Poisson[exp(b1 + b2*x2 + b3*x3)]
. * where
. * x2 and x3 are iid ~ N[0,1]
. * and b1=0.5 and b2=0.5 and b3=0.5.
.
. * The Diagnostics below are from Poisson regression of y on x3 alone
. * or from Poisson regression of y on x3 and x3sq. [Note: x2 is omitted]
.
. set seed 10001
. set obs 100
obs was 0, now 100
. scalar b1 = 0.5
. scalar b2 = 0.5
. scalar b3 = 0.5
.
. gen x2 = invnorm(uniform())
. gen x3 = invnorm(uniform())
.
. * Generate y
. gen mupoiss = exp(b1+b2*x2+b3*x3)
. rndpoix(mupoiss)
( Generating ......... )
. gen y = xp
. sum
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          x2 |       100    .0053689    1.000686  -2.173506   2.106561
          x3 |       100   -.0235884    1.024207  -2.857666   2.149822
     mupoiss |       100    2.020511    1.400564   .3380426   7.029678
          xp |       100        1.92    1.835013          0          8
           y |       100        1.92    1.835013          0          8
.
. outfile y x2 x3 using mma08p3diagnostics.asc, replace
.
. ********* SETUP FOR THIS PROGRAM **********
.
. * Change this if want different regressors
. gen x3sq = x3*x3
. * global XLIST x3       /* Model 1 */
. global XLIST x3 x3sq    /* Model 2 */
.
. ********* R-SQUARED (reported in Table 8.3 p.291) **********
.
. * The following code can be changed to different models than poisson
. * For RsqRES, RsqEXP and RsqCOR need
. *   y       dependent variable
. *   yhat    predicted value of dependent variable
. * For RsqWRSS additionally need
. * sigmasq predicted variance of dependent variable
. * For RsqRG need log density evaluated at values given below
.
. * Obtain exp(x'b) Will vary with the model
. poisson y $XLIST

Poisson regression                                Number of obs   =        100
                                                  LR chi2(2)      =      30.96
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.0808

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x3 |   .3588412     .07035     5.10   0.000     .2209578    .4967245
        x3sq |   .0912999   .0514311     1.78   0.076    -.0095032    .1921029
       _cons |    .492656   .0958903     5.14   0.000     .3047144    .6805975
------------------------------------------------------------------------------
. predict yhat
. scalar dof = e(N)-e(k)
.
. * RsqRES and RsqEXP are R-squared from sums of squares
. * First get TSS, ESS and RSS
. gen ylessybarsq = (y - ybar)^2
. quietly sum ylessybarsq
. scalar totalss = r(mean)
. gen yhatlessybarsq = (yhat - ybar)^2
. quietly sum yhatlessybarsq
. scalar explainedss = r(mean)
. gen residualsq = (y - yhat)^2
. quietly sum residualsq
. scalar residualss = r(mean)
. * Second compute the R-squared
. scalar sereg = sqrt(residualss/dof)
. scalar RsqRES = 1 - residualss/totalss
. scalar RsqEXP = explainedss/totalss
.
. * RsqCOR uses sample correlation
. quietly correlate y yhat
. scalar RsqCOR = r(rho)^2
.
. di "standard error of regression: " sereg
standard error of regression: .16620308
. di "totalss: " totalss _n "explainedss: " explainedss _n "residualss: " residualss
totalss: 3.3336
explainedss: .69556676
residualss: 2.6794761
. di "RsqRES: " RsqRES _n "RsqEXP: " RsqEXP _n "RsqCOR: " RsqCOR
RsqRES: .19622149
RsqEXP: .20865333
RsqCOR: .19640666
.
. * RsqWRSS uses weighted sums of squares
. * First generate estimated variance of y
. * Here for Poisson use fact that variance = mean
. gen sigmasq = yhat
. gen weightedylessybarsq = ((y - ybar)^2) / sigmasq
. quietly sum weightedylessybarsq
. scalar weightedtotalss = r(mean)
. gen weightedresidualsq = ((y - yhat)^2) / sigmasq
. quietly sum weightedresidualsq
. scalar weightedresidualss = r(mean)
. scalar RsqWRSS = 1 - weightedresidualss/weightedtotalss
. di "RsqWRSS: " RsqWRSS
RsqWRSS: .16945018
.
. * RsqRG is from ML. Difficult to generalize beyond LEF models.
. * Need
. * lnL_fit log-likelihood at fitted values (the usual)
. * lnL_0 log-likelihood at intercept only
. * lnL_max log-likelihood at best fit
. quietly poisson y $XLIST
. scalar lnL_fit = e(ll)

. scalar lnL_0 = e(ll_0)
. * The following applies only for Poisson. Differs for other models.
. * lnf(y) = -mu + y*ln(mu) - ln(y!)
. * is maximized at mu = y
. * so compute lnL_max = sum of [-y + y*ln(y) - ln(y!)]
. * Following sets 0*ln0 = 0
. gen ylny = 0
. replace ylny = y*ln(y) if y > 0
. gen lnfyatmax = -y + ylny - lnfact(y)
. quietly sum lnfyatmax
. scalar lnL_max = r(sum)
. scalar RsqRG = (lnL_fit - lnL_0) / (lnL_max - lnL_0)
.
. * RsqQ should only be used for binary and other discrete choice models
. * And definitely use only if lnL_fit < 0
. scalar RsqQ = 1 - lnL_fit/lnL_0
.
. di "lnL_0: " lnL_0 _n "lnL_fit: " lnL_fit _n "lnL_max: " lnL_max
lnL_0: -191.57162
lnL_fit: -176.09119
lnL_max: -101.12402
. di "RsqRG: " RsqRG _n "RsqQ: " RsqQ
RsqRG: .17115358
RsqQ: .08080754
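
As a check, the two likelihood-based measures can be recomputed by hand from the three log-likelihoods displayed above. This is a worked calculation using only the values in this log, rounded to four decimals:

```latex
\begin{align*}
R^2_{RG} &= \frac{\ln L_{fit} - \ln L_{0}}{\ln L_{max} - \ln L_{0}}
          = \frac{-176.09119 - (-191.57162)}{-101.12402 - (-191.57162)}
          = \frac{15.48043}{90.44760} \approx 0.1712, \\
R^2_{Q}  &= 1 - \frac{\ln L_{fit}}{\ln L_{0}}
          = 1 - \frac{-176.09119}{-191.57162} \approx 0.0808 .
\end{align*}
```

The second value agrees with the Pseudo R2 of 0.0808 in the Poisson output.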
.
. * Check
. sum
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          x2 |       100    .0053689    1.000686  -2.173506   2.106561
          x3 |       100   -.0235884    1.024207  -2.857666   2.149822
     mupoiss |       100    2.020511    1.400564   .3380426   7.029678
          xp |       100        1.92    1.835013          0          8
           y |       100        1.92    1.835013          0          8
-------------+--------------------------------------------------------
        x3sq |       100    1.039067    1.446146   .0000877   8.166255
        yhat |       100        1.92     .838208   1.150405   5.398193
        ybar |       100        1.92           0       1.92       1.92
 ylessybarsq |       100      3.3336    5.966374      .0064    36.9664
yhatlessyb~q |       100    .6955668    1.572256   4.82e-06   12.09783
-------------+--------------------------------------------------------
  residualsq |       100    2.679476    4.830379   .0000825   36.93972
     sigmasq |       100        1.92     .838208   1.150405   5.398193
weightedyl~q |       100    1.681324    2.560112   .0018502   19.23135
weightedre~q |       100    1.396423    2.424518   .0000276   19.21747
        ylny |       100     2.15694     3.48234          0   16.63553
-------------+--------------------------------------------------------
   lnfyatmax |       100    -1.01124    .6233793  -1.969071          0
. poisson y $XLIST /* Stata Rsq = RsqQ */
Poisson regression                                Number of obs   =        100
                                                  LR chi2(2)      =      30.96
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.0808
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x3 |   .3588412     .07035     5.10   0.000     .2209578    .4967245
        x3sq |   .0912999   .0514311     1.78   0.076    -.0095032    .1921029
       _cons |    .492656   .0958903     5.14   0.000     .3047144    .6805975
------------------------------------------------------------------------------
.
. *** The following results are for Model 2 in Table 8.3 p.291
. *** For model 1 R-squareds need to rerun with only x3 as regressor
. di "standard error of regression: " sereg
standard error of regression: .16620308
. di "RsqRES: " RsqRES _n "RsqEXP: " RsqEXP _n "RsqCOR: " RsqCOR
RsqRES: .19622149
RsqEXP: .20865333
RsqCOR: .19640666
. di "RsqWRSS: " RsqWRSS _n "RsqRG: " RsqRG _n "RsqQ: " RsqQ
RsqWRSS: .16945018
RsqRG: .17115358
RsqQ: .08080754
.
. ********* RESIDUAL ANALYSIS (text bottom p.290 to top p.291) **********
.
. * Assume that from earlier have yhat
.
. * raw residual
. gen raw = y - yhat
. gen sigma = sqrt(yhat)
. gen Pearson = (y - yhat)/sigma
. * Note that ylny was defined earlier as 0 if y=0 and as y*ln(y) otherwise
. gen deviance = sign(y-yhat)*sqrt(2*(-y+ylny)-2*(-yhat+y*ln(yhat)))
.
. *** The following are results reported in text bottom p.290 to top p.291
. sum raw Pearson deviance
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         raw |       100   -2.38e-09    1.645157  -2.993904   6.077806
     Pearson |       100   -.0014455    1.187656  -1.498094   4.383774
    deviance |       100   -.2103819    1.212345  -2.016939   3.264961
. corr raw Pearson deviance
(obs=100)
             |      raw  Pearson deviance
-------------+---------------------------
         raw |   1.0000
     Pearson |   0.9852   1.0000
    deviance |   0.9625   0.9818   1.0000
. * Example of use to find whether x3 belongs in the model

. * graph twoway scatter Pearson x3
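
One property that ties the deviance residuals back to the R-squared section is that, for the Poisson model, their squares sum to 2*(lnL_max - lnL_fit). In this log that is 2*(-101.12402 + 176.09119), about 149.93. The sketch below checks the identity on simulated data. The data-generating process, seed, and the variable names devsq and devsum are assumptions for illustration, and rnormal(), rpoisson() and lnfactorial() are the names used in recent Stata versions, not in version 8.

```stata
* Hypothetical simulated data for illustration only
clear
set seed 10101
set obs 100
generate double x3 = rnormal()
generate y = rpoisson(exp(0.5 + 0.3*x3))
quietly poisson y x3
scalar lnL_fit = e(ll)
predict double yhat, n

* Log-likelihood at the best possible fit (mu = y), with 0*ln(0) set to 0
generate double ylny = cond(y > 0, y*ln(y), 0)
generate double lnfyatmax = -y + ylny - lnfactorial(y)
quietly summarize lnfyatmax
scalar lnL_max = r(sum)

* Squared deviance residuals; max(.,0) guards against tiny negative rounding
generate double devsq = max(2*(-y + ylny) - 2*(-yhat + y*ln(yhat)), 0)
generate double deviance = sign(y - yhat)*sqrt(devsq)
quietly summarize devsq
scalar devsum = r(sum)
display "sum of squared deviance residuals: " devsum
display "2*(lnL_max - lnL_fit):             " 2*(lnL_max - lnL_fit)
```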
.
. log close
log: c:\Imbook\bwebpage\Section2\mma08p3diagnostics.txt
log type: text
closed on: 17 May 2005, 14:10:13
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma09p1np.txt
log type: text
opened on: 17 May 2005, 14:16:51
.
. ********** OVERVIEW OF MMA09P1NP.DO **********
.
. * STATA Program
.
. * Chapter 9.2 p.295-297
. * Nonparametric density estimation and nonparametric regression using actual data.
.
. * (1) Histogram: Figure 9.1 in chapter 9.2.1 (ch9hist)
. * (2) Kernel density estimate as bandwidth varies: Figure 9.2 in chapter 9.2.1 (ch9kd1)
. * (3) Kernel density estimate as kernel varies: Figure 9.4 in chapter 9.3.4 (ch9kdensu1)
. * (4) Lowess regression: Figure 9.3 in chapter 9.4.3 (ch9ksm1)
. * (5) Extra: Nearest neighbours regression: using Lowess and using add-on knnreg
. * (6) Extra: Kernel regression: using add-on kernreg
.
. * using data on earnings and education (see below)
.
. * NOTE: This particular program uses version 8.2 rather than 8.0
. *   For kernel density Stata uses an alternative formulation of Epanechnikov
. *   To follow book and e.g. Hardle (1990) use epan2 rather than epan
. *   epan = epan2 if epan bandwidth is epan2 bandwidth divided by sqrt(5)
. *   where kernel epan2 is an update to Stata version 8.2
.
. * To run this program you need file
. * psidf3050.dat
. * in your directory
.
. * To do (5) and (6) you need Stata add-ons knnreg and kernreg
. * In Stata give command search knnreg and search kernreg
.
. * See also mma9p2npmore.do for more on nonparametric regression (Figures 9.5-9.7)
.
. ********** SETUP
.
. di "mma09p1np.do Cameron and Trivedi: Stata nonparametrics with wages and education"
mma09p1np.do Cameron and Trivedi: Stata nonparametrics with wages and education
. set more off
. version 8
.
. ********** DATA DESCRIPTION
.*
. * The original data are from the PSID Individual Level Final Release 1993 data
. * From www.isr.umich.edu/src/psid then choose Data Center
. * 4856 observations on 9 variables for Females 30 to 50 years
.
. * Fixed width data
. * intnum    1-4    V30001="1968 INTERVIEW NUMBER"
. * persnum   5-7    V30002="PERSON NUMBER"
. * age       8-9    V30809="AGE OF INDIVIDUAL 93"
. * educatn   10-11  V30820="G90 HIGHEST GRADE COMPLETED 93"
. * earnings  12-17  V30821="TOTAL LABOR INCOME 93"
. * hours     18-21  V30823="1992 ANNUAL WORK HOURS 93"
. * sex       22     V32000="SEX OF INDIVIDUAL"
. * kids      23-24  V32022="# LIVE BIRTHS TO THIS INDIVIDUAL"
. * [NOTE: DO NOT USE THE kids VARIABLE AS IT IS NUMBER OF BIRTHS
. *  NOT NUMBER OF KIDS CURRENTLY IN HOUSEHOLD]
. * married   25     V32049="LAST KNOWN MARITAL STATUS"
.
. ********** READ DATA **********
.
. * Data are fixed format so use infix
. infix intnum 1-4 persnum 5-7 age 8-9 educatn 10-11 earnings 12-17 /*
> */ hours 18-21 sex 22 kids 23-24 married 25 using psidf3050.dat
. summarize
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      intnum |      4856    4598.101    2761.971          4       9306
     persnum |      4856    59.21355    79.74856          1        205
         age |      4856    38.46293    5.595116         30         50
     educatn |      4855    16.37714     18.4495          0         99
    earnings |      4856    14244.51    15985.45          0     240000
-------------+--------------------------------------------------------
       hours |      4856    1235.335    947.1758          0       5160
         sex |      4856           2           0          2          2
        kids |      4856     4.48126    14.88786          0         99
     married |      4856    1.920717    1.504848          1          9
.
. ********** MISSING VALUES, DATA TRANSFORMATIONS and SAMPLE SELECTION
.
. * For Highest grade codes the missing codes are 98 DK and 99 NA and 0 inappropriate
. * Here treat these as missing
. replace educatn = . if (educatn==0 | educatn==98 | educatn==99)
(290 real changes made, 290 to missing)
.
. * For marital status the codes are
. * 1 married; 2 Never married; 3 Widowed; 4 Divorced, annulment;
. * 5 Separated; 8 NA / DK; 9 No histories 85-93
. * Recode 2-5 as not married and treat 8 and 9 as missing
. replace married = . if (married==8 | married==9)
. replace married = 0 if married > 1
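
One caution on the recode above: Stata treats a missing value as larger than any number, so the condition married > 1 is also true for observations that were just set to missing, and the second replace turns them into 0. Whether this affects the age-36 sample used below cannot be determined from the log. The sketch below illustrates the behaviour on a five-row toy dataset. The data values and the variable names married_log and married_safe are hypothetical.

```stata
* Toy data: marital status codes 1, 2, 8, 4, 9 (hypothetical values)
clear
input married
1
2
8
4
9
end
generate married_log = married
generate married_safe = married

* Recode exactly as in the log above: missing values end up as 0
replace married_log = . if (married_log==8 | married_log==9)
replace married_log = 0 if married_log > 1

* Recode that excludes missing values from the second replace
replace married_safe = . if (married_safe==8 | married_safe==9)
replace married_safe = 0 if married_safe > 1 & married_safe < .

list married married_log married_safe
```

In the listing, rows with codes 8 and 9 show 0 for married_log but missing for married_safe.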
.
. * For kids the missing codes are 98 DK/NA and 99 no birth history
. replace kids = . if (kids==98 | kids==99)
. * But do not use these data as it is number of births
. * not number of kids currently in household
. * So I drop kids
. drop kids
.
. * Work with positive earnings only
. drop if earnings==0
. * Topcode women with very high earnings
. replace earnings=100000 if earnings>100000
. * Create log hourly wage
. gen hwage = earnings/hours
. gen lnhwage = ln(hwage)
.
. * Work with age 36 and nonmissing education data
. keep if age == 36
. drop if educatn == .
. summarize
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      intnum |       177    4699.853    2765.081         14       9240
     persnum |       177    59.53672    79.73001          1        188
         age |       177          36           0         36         36
     educatn |       177    12.58757    2.841347          3         17
    earnings |       177    17470.55    13513.56         87      70000
-------------+--------------------------------------------------------
       hours |       177    1506.401    698.4145          8       3160
         sex |       177           2           0          2          2
     married |       177    .7457627    .4366669          0          1
       hwage |       177    12.71631    16.58889   .6837607        175
     lnhwage |       177    2.198163    .8281614  -.3801473   5.164786
.
. outfile intnum persnum age educatn earnings hours sex married hwage /*
> */ lnhwage using mma09p1np.asc, replace
.
. ********* ANALYSIS: (1)-(3) NONPARAMETRIC DENSITY ESTIMATES
.
. set scheme s1mono
.
. * Here give bin width for histogram and kdensity
.
. * Calculate Silverman's plug-in estimate of optimal bandwidth in (9.13)
. * with delta given in Table 9.1 for Epanechnikov kernel
. quietly sum lnhwage, detail
. global sadj = min(r(sd),(r(p75)-r(p25))/1.349)
. di "sadj: " $sadj " iqr/1.349: " (r(p75)-r(p25))/1.349 " stdev: " r(sd)
sadj: .65488184 iqr/1.349: .65488184 stdev: .82816143
. global bwepan2 = 1.3643*1.7188*$sadj/(r(N)^0.2)
. di "Bandwidth: " $bwepan2
Bandwidth: .54538542
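
The plug-in calculation above can be wrapped in a small reusable program. This is a sketch only: plugin_bw is a hypothetical helper name, not a built-in Stata command, and the option delta() stands for the kernel-specific constant from Table 9.1 (1.7188 for Epanechnikov, 0.7764 for Gaussian, as used in this log). The usage example runs on auto.dta, which ships with Stata, and is not meant to reproduce any figure in the book.

```stata
* plugin_bw is a hypothetical helper, not a built-in Stata command
capture program drop plugin_bw
program define plugin_bw, rclass
    syntax varname(numeric) [, Delta(real 1.7188)]
    quietly summarize `varlist', detail
    local n = r(N)
    * Smaller of the standard deviation and the scaled interquartile range
    local sadj = min(r(sd), (r(p75) - r(p25))/1.349)
    return scalar sadj = `sadj'
    return scalar bw = 1.3643*`delta'*`sadj'/(`n'^0.2)
end

* Usage on Stata's shipped example data: Epanechnikov, then Gaussian
sysuse auto, clear
plugin_bw price
display "Epanechnikov (epan2) bandwidth: " r(bw)
plugin_bw price, delta(0.7764)
display "Gaussian bandwidth:             " r(bw)

* Arithmetic check against the sadj and N displayed in this log
display 1.3643*1.7188*.65488184/(177^0.2)
```

The final display line recomputes the bandwidth from the rounded sadj and N = 177 shown above and agrees with the logged .54538542 to about six decimals.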
.
. * HISTOGRAM ONLY - Figure 9.1
. graph twoway (histogram lnhwage, bin(20) bcolor(*.2)), /*
> */ title("Histogram for Log Wage") /*
> */ xtitle("Log Hourly Wage", size(medlarge)) xscale(titlegap(*5)) /*
> */ legend( label(1 "Histogram") label(2 "Kernel"))
. graph save ch9hist, replace
(file ch9hist.gph saved)
. graph export ch9hist.wmf, replace
(file c:\Imbook\bwebpage\Section2\ch9hist.wmf written in Windows Metafile format)
.
. * COMBINED HISTOGRAM AND KERNEL DENSITY ESTIMATE
. graph twoway (histogram lnhwage, bin(20) bcolor(*.2)) /*
> */ (kdensity lnhwage, width($bwepan2) epan2 clstyle(p1)), /*
> */ title("Histogram and Kernel Density for Log Wage") /*
> */ caption("Note: Kernel is Epanechnikov with bandwidth 0.55")
.
. * KERNEL DENSITY ESTIMATE FOR 3 BANDWIDTHS - Figure 9.2
. global bwonehalf = 0.5*$bwepan2
. global btwotimes = 2*$bwepan2
. graph twoway (kdensity lnhwage, width($bwonehalf) epan2 clstyle(p2)) /*
> */ (kdensity lnhwage, width($bwepan2) epan2 clstyle(p1)) /*
> */ (kdensity lnhwage, width($btwotimes) epan2 clstyle(p3)), /*
> */ title("Density Estimates as Bandwidth Varies") /*
> */ ytitle("Kernel density estimates", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend( label(1 "One-half plug-in") label(2 "Plug-in") /*
> */ label(3 "Two times plug-in"))
. graph save ch9kd1, replace
(file ch9kd1.gph saved)
. graph export ch9kd1.wmf, replace
(file c:\Imbook\bwebpage\Section2\ch9kd1.wmf written in Windows Metafile format)
.
. * KERNEL DENSITY ESTIMATE FOR 4 DIFFERENT KERNELS - Figure 9.4
. * Calculate Silverman's plug-in optimal bandwidths using (9.13)
. * with delta given in Table 9.1 for the different kernels
.
. * Use sadj calculated earlier for Epanechnikov
. global bwgauss = 1.3643*0.7764*$sadj/(_N^0.2)
. global bwbiweight = 1.3643*2.0362*$sadj/(_N^0.2)
. global bwrectang = 0.5*1.3643*1.3510*$sadj/(_N^0.2)
. di "Usual Epanechnikov (epan2): " $bwepan2
Usual Epanechnikov (epan2): .54538542
. di "Gaussian: " $bwgauss
Gaussian: .24635632
. di "Quartic or biweight: " $bwbiweight
Quartic or biweight: .64609832
. di "Uniform or rectangular: " $bwrectang
Uniform or rectangular: .21434015
. graph twoway (kdensity lnhwage, width($bwepan2) epan2) /*
> */ (kdensity lnhwage, width($bwgauss) gauss) /*
> */ (kdensity lnhwage, width($bwbiweight) biweight) /*
> */ (kdensity lnhwage, width($bwrectang) rectangle), /*
> */ title("Density Estimates as Kernel Varies") /*
> */ ytitle("Kernel density estimates", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend( label(1 "Epanechnikov (h=0.545)") label(2 "Gaussian (h=0.246)") /*
> */ label(3 "Quartic (h=0.646)") label(4 "Uniform (h=0.214)"))
. graph save ch9kdensu1, replace
(file ch9kdensu1.gph saved)
. graph export ch9kdensu1.wmf, replace
(file c:\Imbook\bwebpage\Section2\ch9kdensu1.wmf written in Windows Metafile format)
.
. * SHOW THAT STATA EPANECHNIKOV = USUAL EPANECHNIKOV
. * Once divide usual Epanechnikov bandwidth by sqrt(5).
. * (Pagan and Ullah (1999, p.28) have formulae.)
. global bwepan = $bwepan2/sqrt(5)
. graph twoway (kdensity lnhwage, width($bwepan2) epan2) /*
> */ (kdensity lnhwage, width($bwepan) epan), /*
> */ title("Epan = Epan2 if bandwidth adjusted") /*
> */ legend( label(1 "Usual Epanechnikov") label(2 "Stata Epanechnikov"))
.
.
. ********* ANALYSIS: (4) LOWESS NONPARAMETRIC REGRESSION ESTIMATES
.
. * LOWESS WITH DEFAULT BANDWIDTH of 0.8
. lowess lnhwage educatn
.
. * LOWESS REGRESSION WITH BANDWIDTHS of 0.1, 0.4 and 0.8 - Figure 9.3
. graph twoway (scatter lnhwage educatn, msize(medsmall) msymbol(o)) /*
> */ (lowess lnhwage educatn, bwidth(0.8) clstyle(p2)) /*
> */ (lowess lnhwage educatn, bwidth(0.4) clstyle(p1)) /*
> */ (lowess lnhwage educatn, bwidth(0.1) clstyle(p3)), /*
> */ title("Nonparametric Regression as Bandwidth Varies") /*
> */ xtitle("Years of Schooling", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Log Hourly Wage", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend( label(1 "Actual data") label(2 "Bandwidth h=0.8") /*
> */ label(3 "Bandwidth h=0.4") label(4 "Bandwidth h=0.1"))
. graph save ch9ksm1, replace

(file ch9ksm1.gph saved)
. graph export ch9ksm1.wmf, replace
(file c:\Imbook\bwebpage\Section2\ch9ksm1.wmf written in Windows Metafile format)
.
. ********* ANALYSIS: (5) EXTRA: K-NEAREST NEIGHBORS NONPARAMETRIC REGRESSION
.
. * NEAREST NEIGHBOURS REGRESSION USING LOWESS
. * Use lowess with mean and noweight options to give running means = centered kNN
. global knnbwidth = 0.3
. di "knn via Lowess uses following fraction of sample: " $knnbwidth
knn via Lowess uses following fraction of sample: .3
. lowess lnhwage educatn, bwidth($knnbwidth) mean noweight
.
. * LOWESS COMPARED TO NEAREST NEIGHBOURS
. graph twoway (lowess lnhwage educatn, bwidth(0.3) mean noweight) /*
> */ (lowess lnhwage educatn, bwidth(0.3)), /*
> */ title("Centered kNN versus Lowess") /*
> */ legend( label(1 "Centered kNN") label(2 "Lowess 0.3"))
.
. * NEAREST NEIGHBOURS REGRESSION USING KNNREG COMPARED TO USING LOWESS
. * knnreg is a Stata add-on (in Stata search knnreg to find and download)
. * Here we verify that same as lowess knn except knnreg drops endpoints
. global k = round($knnbwidth*_N)
. di "knnreg uses following number of neighbours: " $k
knnreg uses following number of neighbours: 53
. knnreg lnhwage educatn, k($k) gen(knnregpred) ylabel nograph
. lowess lnhwage educatn, bwidth($knnbwidth) gen(knnlowesspred) mean noweight nograph
. * Following shows that the same except knnreg drops endpoints and lowess does not
. sum knnlowesspred knnregpred
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
knnlowessp~d |       177    2.180308    .4522163   1.475512   2.954416
  knnregpred |       125    2.184309    .3412013   1.529874   2.802865
. corr knnlowesspred knnregpred
(obs=125)
             | knnlow~d knnreg~d
-------------+------------------
knnlowessp~d |   1.0000
  knnregpred |   1.0000   1.0000
.
. ********* ANALYSIS: (6) EXTRA: KERNEL NONPARAMETRIC REGRESSION
.
. * KERNEL REGRESSION
. * Kercode 1 = Uniform; 2 = Triangle; 3 = Epanechnikov; 4 = Quartic (Biweight);
. *   5 = Triweight; 6 = Gaussian; 7 = Cosinus
. * bwidth(#) defines width of the weight function window around each grid point.
. * npoint(#) specifies the number of equally spaced grid points over range of x.
. * Here bwidth(3) gives e.g. positive weight from x=4 to x=10 if current x0=7
. kernreg lnhwage educatn, bwidth(3) kercode(3) npoint(100) ylabel gen(kernregpred1 xkernreg)
. graph twoway (lowess lnhwage educatn, bwidth(0.5) clstyle(p2)) /*
> */ (line kernregpred xkernreg, clstyle(p1)), /*
> */ title("Lowess versus kernel regression") /*
> */ legend( label(1 "Lowess") label(2 "Kernreg"))
.
. log close
log: c:\Imbook\bwebpage\Section2\mma09p1np.txt
log type: text
closed on: 17 May 2005, 14:17:05
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma09p2npmore.txt
log type: text
opened on: 17 May 2005, 14:17:35
.
. ********** OVERVIEW OF MMA09P2NPMORE.DO **********
.
. * STATA Program
.
. * Chapter 9.4-9.5 (pages 307-19)
. * More on nonparametric regression, including Figures 9.5 - 9.7
.
. * It provides
. * (1) Nonparametric regression
. *     k-nearest neighbors regression: Figure 9.5 in chapter 9.4.2 (ch9ksmma)
. *     Lowess regression: Figure 9.6 in chapter 9.4.3 (ch9ksmlowess)
. *     Kernel regression (using Stata add-on kernreg)
. * (2) Nonparametric derivative estimation
. *     Figure 9.7 in chapter 9.5.5 (ch9kderiv)
. * (3) Cross-validation - still incomplete
.
. * See also mma09p1np.do for nonparametric density estimation and regression
.
. * This program uses free Stata add-on command kernreg
. * To obtain in Stata give command search kernreg
.
. ********** SETUP **********
.
. di "mma09p2npmore.do Cameron and Trivedi: Stata nonparametrics with generated data"
mma09p2npmore.do Cameron and Trivedi: Stata nonparametrics with generated data
. set more off
. version 8.0
.
. ********** GENERATE DATA **********
.
. * Model is y = 150 + 6.5*x - 0.15*x^2 + 0.001*x^3 + u
. * where u ~ N[0, 25^2]
. *       x = 1, 2, 3, ... , 100
.
. set seed 10101
. set obs 100
obs was 0, now 100
. gen u = 25*invnorm(uniform())
. gen x = _n
. gen y = 150 + 6.5*x - 0.15*x^2 + 0.001*x^3 + u
. sum
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           u |       100    2.809606    25.26291  -71.97334   73.59318
           x |       100        50.5    29.01149          1        100
           y |       100    228.5596    35.25377   132.2952   345.5873
.
. outfile y x using mma09p2npmore.asc, replace
.
. ******** PARAMETRIC REGRESSION **********
.
. * OLS regression on cubic polynomial
. gen xsquared = x^2
. gen xcubed = x^3
. reg y x xsquared xcubed
      Source |       SS       df       MS              Number of obs =     100
-------------+------------------------------           F(  3,    96) =   31.15
       Model |  60691.6801     3    20230.56           Prob > F      =  0.0000
    Residual |  62348.2994    96  649.461452           R-squared     =  0.4933
-------------+------------------------------           Adj R-squared =  0.4774
       Total |   123039.98    99  1242.82808           Root MSE      =  25.485
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   6.055295   .9033915     6.70   0.000     4.262077    7.848513
    xsquared |  -.1402283   .0207284    -6.77   0.000    -.1813738   -.0990828
      xcubed |   .0009492   .0001349     7.03   0.000     .0006814    .0012171
       _cons |   155.1521   10.58835    14.65   0.000     134.1344    176.1698
------------------------------------------------------------------------------
. predict ycubic
. summarize y ycubic
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           y |       100    228.5596    35.25377   132.2952   345.5873
      ycubic |       100    228.5596    24.75979   161.0681   307.6293
.
. ******** (1) NONPARAMETRIC REGRESSION **********
.
. * K-NEAREST NEIGHBORS REGRESSION - FIGURE 9.5
. * lowess with the mean and noweight options gives running mean = moving average = centered kNN
. * Here _N = 100 so bwidth = 0.05 gives 100*0.05 = 5 nearest neighbours
. graph twoway (scatter y x, msize(medsmall) msymbol(o)) /*
> */ (lowess y x, mean noweight bwidth(0.05) clstyle(p1)) /*
> */ (lfit y x, clstyle(p3)) /*
> */ (lowess y x, mean noweight bwidth(0.25) clstyle(p2)), /*
> */ title("k-Nearest Neighbours Regression as k Varies") /*
> */ xtitle("Regressor x", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Dependent variable y", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend(pos(12) ring(0) col(1)) legend(size(small)) /*
> */ legend( label(1 "Actual Data") label(2 "kNN (k=5)") /*
> */ label(3 "Linear OLS") label(4 "kNN (k=25)"))
. graph save ch9ksmma, replace

(file ch9ksmma.gph saved)
. graph export ch9ksmma.wmf, replace
(file c:\Imbook\bwebpage\Section2\ch9ksmma.wmf written in Windows Metafile format)
.
. * VERIFY THAT kNN SAME AS MOVING AVERAGE
. * Do moving average by hand for k = 5
. gen yma5 = (y[_n-2] + y[_n-1] + y + y[_n+1] + y[_n+2])/5
. replace yma5 = (y[_n]+y[_n+1]+y[_n+2])/3 if _n==1
(1 real change made)
. replace yma5 = (y[_n-1]+y[_n]+y[_n+1]+y[_n+2])/4 if _n==2
. replace yma5 = (y[_n+1]+y[_n]+y[_n-1]+y[_n-2])/4 if _n==99
. replace yma5 = (y[_n]+y[_n-1]+y[_n-2])/3 if _n==100
. lowess y x, mean noweight bwidth(0.05) nogr gen(yknn5)
. sum yma5 yknn5
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        yma5 |       100    228.6037    26.63323   157.1434   297.4832
       yknn5 |       100    228.6037    26.63323   157.1434   297.4832
.
. * LOWESS REGRESSION - FIGURE 9.6
. graph twoway (scatter y x, msize(medsmall) msymbol(o)) /*
> */ (lowess y x, bwidth(0.25) clstyle(p1)) /*
> */ (line ycubic x, clstyle(p3)), /*
> */ title("Lowess Nonparametric Regression") /*
> */ xtitle("Regressor x", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Dependent variable y", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend( label(1 "Actual Data") label(2 "Lowess (k=25)") /*
> */ label(3 "OLS Cubic Regression") )
. graph save ch9ksmlowess, replace

(file ch9ksmlowess.gph saved)
. graph export ch9ksmlowess.wmf, replace
(file c:\Imbook\bwebpage\Section2\ch9ksmlowess.wmf written in Windows Metafile format)
.
. * KERNEL REGRESSION COMPARED TO k NEAREST NEIGHBORS REGRESSION
. * For this artificial example (with equally spaced x)
. * knn = kernel regression using uniform kernel
. * Kercode 1 = Uniform; 2 = Triangle; 3 = Epanechnikov; 4 = Quartic (Biweight);
. *   5 = Triweight; 6 = Gaussian; 7 = Cosinus
. * bwidth(#) defines width of the weight function window around each grid point.
. * npoint(#) specifies the number of equally spaced grid points over range of x.
. * Here bwidth(12) gives e.g. positive weight from x=15 to x=39 if current x=27
. kernreg y x, bwidth(12) kercode(1) npoint(100) ylabel gen(pykernreg xkernreg)
. lowess y x, mean noweight bwidth(0.25) gen(yknn25)
. sum pykernreg yknn25
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
   pykernreg |       100    228.6856    18.75275   181.1579   272.5488
      yknn25 |       100    228.6856    18.75275   181.1578   272.5488
.
. ******** (2) DERIVATIVE ESTIMATION **********
.
. * DERIVATIVE ESTIMATION - FIGURE 9.7
. * Here use Lowess regression
. lowess y x, xlab ylab bwidth(0.25) lowess nogr gen(yplowess)
. * Need to first sort data on regressor if data on regressor are not ordered
. sort x
. gen dydxlowess = (yplowess - yplowess[_n-1])/(x - x[_n-1])
(1 missing value generated)
. * And do the same for the earlier fitted cubic
. gen dydxcubic = (ycubic - ycubic[_n-1])/(x - x[_n-1])
. graph twoway (line dydxlowess x, clstyle(p1)) /*
> */ (line dydxcubic x, clstyle(p3)), /*
> */ title("Nonparametric Derivative Estimation") /*
> */ xtitle("Regressor x", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Dependent variable y", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend( label(1 "From Lowess (k=25)") /*
> */ label(2 "From OLS Cubic Regression") )
. graph save ch9kderiv, replace
(file ch9kderiv.gph saved)
. graph export ch9kderiv.wmf, replace
(file c:\Imbook\bwebpage\Section2\ch9kderiv.wmf written in Windows Metafile format)
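
The backward difference used above estimates the slope between x[_n-1] and x[_n], not at x[_n] itself. A central difference is a common alternative. The sketch below compares the two on the noise-free cubic from this example, where the true derivative is known. The variable names m, dbackward, dcentral, dtrue, errb and errc are assumptions for illustration, and this comparison is not part of the original program.

```stata
* Noise-free cubic from the data-generating model, so the true slope is known
clear
set obs 100
generate double x = _n
generate double m = 150 + 6.5*x - 0.15*x^2 + 0.001*x^3
generate double dtrue = 6.5 - 0.30*x + 0.003*x^2

* Data must be sorted on the regressor before differencing
sort x
generate double dbackward = (m - m[_n-1])/(x - x[_n-1])
generate double dcentral = (m[_n+1] - m[_n-1])/(x[_n+1] - x[_n-1])

* Absolute errors relative to the analytical derivative
generate double errb = abs(dbackward - dtrue)
generate double errc = abs(dcentral - dtrue)
summarize errb errc
```

With unit spacing the central difference loses one observation at each end of the sample, while the backward difference loses only the first.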
.
. ******** (3) CROSS-VALIDATION [PRELIMINARY] **********
.
. /* The following does not work.
> I need to figure out use of macros */
.
. forvalues i = 5/25 {
2. scalar bd`i' = 0.01*`i'
3. global bw`i' = bd`i'
4. lowess y x, mean noweight bwidth($bw`i') gen(py`i') nogr
5. gen cv`i' = sum(3/2*(y-py`i')^2)
6. }
. sum
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           u |       100    2.809606    25.26291  -71.97334   73.59318
           x |       100        50.5    29.01149          1        100
           y |       100    228.5596    35.25377   132.2952   345.5873
    xsquared |       100      3383.5    3024.356          1      10000
      xcubed |       100      255025    289320.7          1    1000000
-------------+--------------------------------------------------------
      ycubic |       100    228.5596    24.75979   161.0681   307.6293
        yma5 |       100    228.6037    26.63323   157.1434   297.4832
       yknn5 |       100    228.6037    26.63323   157.1434   297.4832
   pykernreg |       100    228.6856    18.75275   181.1579   272.5488
    xkernreg |       100        50.5    29.01149          1        100
-------------+--------------------------------------------------------
      yknn25 |       100    228.6856    18.75275   181.1578   272.5488
    yplowess |       100    228.6494    25.46305   156.8217   302.5474
  dydxlowess |        99    1.471977     2.20262  -1.953159   6.964434
   dydxcubic |        99    1.480416    2.100452  -.8495026   6.342957
         py5 |       100    228.0408    8.046055   217.6967   243.0812
-------------+--------------------------------------------------------
         cv5 |       100    84655.13     34359.8   10940.13   162417.9
         py6 |       100    228.0408    8.046055   217.6967   243.0812
         cv6 |       100    84655.13     34359.8   10940.13   162417.9
         py7 |       100    228.0408    8.046055   217.6967   243.0812
         cv7 |       100    84655.13     34359.8   10940.13   162417.9
-------------+--------------------------------------------------------
         py8 |       100    228.0408    8.046055   217.6967   243.0812
         cv8 |       100    84655.13     34359.8   10940.13   162417.9
         py9 |       100    228.0408    8.046055   217.6967   243.0812
         cv9 |       100    84655.13     34359.8   10940.13   162417.9
        py10 |       100    228.0408    8.046055   217.6967   243.0812
-------------+--------------------------------------------------------
        cv10 |       100    84655.13     34359.8   10940.13   162417.9
        py11 |       100    228.0408    8.046055   217.6967   243.0812
        cv11 |       100    84655.13     34359.8   10940.13   162417.9
        py12 |       100    228.0408    8.046055   217.6967   243.0812
        cv12 |       100    84655.13     34359.8   10940.13   162417.9
-------------+--------------------------------------------------------
        py13 |       100    228.0408    8.046055   217.6967   243.0812
        cv13 |       100    84655.13     34359.8   10940.13   162417.9
        py14 |       100    228.0408    8.046055   217.6967   243.0812
        cv14 |       100    84655.13     34359.8   10940.13   162417.9
        py15 |       100    228.0408    8.046055   217.6967   243.0812
-------------+--------------------------------------------------------
        cv15 |       100    84655.13     34359.8   10940.13   162417.9
        py16 |       100    228.0408    8.046055   217.6967   243.0812
        cv16 |       100    84655.13     34359.8   10940.13   162417.9
        py17 |       100    228.0408    8.046055   217.6967   243.0812
        cv17 |       100    84655.13     34359.8   10940.13   162417.9
-------------+--------------------------------------------------------
        py18 |       100    228.0408    8.046055   217.6967   243.0812
        cv18 |       100    84655.13     34359.8   10940.13   162417.9
        py19 |       100    228.0408    8.046055   217.6967   243.0812
        cv19 |       100    84655.13     34359.8   10940.13   162417.9
        py20 |       100    228.0408    8.046055   217.6967   243.0812
-------------+--------------------------------------------------------
        cv20 |       100    84655.13     34359.8   10940.13   162417.9
        py21 |       100    228.0408    8.046055   217.6967   243.0812
        cv21 |       100    84655.13     34359.8   10940.13   162417.9
        py22 |       100    228.0408    8.046055   217.6967   243.0812
        cv22 |       100    84655.13     34359.8   10940.13   162417.9
-------------+--------------------------------------------------------
        py23 |       100    228.0408    8.046055   217.6967   243.0812
        cv23 |       100    84655.13     34359.8   10940.13   162417.9
        py24 |       100    228.0408    8.046055   217.6967   243.0812
        cv24 |       100    84655.13     34359.8   10940.13   162417.9
        py25 |       100    228.0408    8.046055   217.6967   243.0812
-------------+--------------------------------------------------------
        cv25 |       100    84655.13     34359.8   10940.13   162417.9
. * Then need to choose the `i' with minimum cv`i'
. * Problem here is that this gives e.g. $bw5 = 5 not 0.05
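
A plausible reading of the macro problem noted above is that the reference to the global with the loop index appended is parsed as the (empty) global $bw followed by the local loop index, which is why the bandwidth arrives as 5 and not 0.05. Wrapping the name in braces, as in ${bw`i'}, or using a local macro throughout, avoids this. Two further points would still need attention: sum() inside generate is a running sum, and an in-sample criterion always favours the smallest bandwidth. The sketch below is not the authors' method. It computes a leave-one-out criterion for a centred running mean by hand, under the assumption of sorted, equally spaced x with no ties. The names loo, cnt, sqerr, cv, bestcv and bestk are illustrative, and rnormal() needs a newer Stata than version 8.

```stata
* Hypothetical regeneration of data in the spirit of this example
clear
set seed 10101
set obs 100
generate double x = _n
generate double y = 150 + 6.5*x - 0.15*x^2 + 0.001*x^3 + 25*rnormal()
sort x

scalar bestcv = .
scalar bestk = .
forvalues k = 2/12 {
    * Leave-one-out running mean using up to k neighbours on each side
    quietly generate double loo`k' = 0
    quietly generate double cnt`k' = 0
    forvalues j = 1/`k' {
        quietly replace loo`k' = loo`k' + y[_n-`j'] if _n - `j' >= 1
        quietly replace cnt`k' = cnt`k' + 1 if _n - `j' >= 1
        quietly replace loo`k' = loo`k' + y[_n+`j'] if _n + `j' <= _N
        quietly replace cnt`k' = cnt`k' + 1 if _n + `j' <= _N
    }
    quietly replace loo`k' = loo`k'/cnt`k'
    * Cross-validation criterion: sum of squared leave-one-out errors
    quietly generate double sqerr`k' = (y - loo`k')^2
    quietly summarize sqerr`k'
    scalar cv`k' = r(sum)
    display "k = `k'   CV = " scalar(cv`k')
    if scalar(cv`k') < scalar(bestcv) {
        scalar bestcv = scalar(cv`k')
        scalar bestk = `k'
    }
}
display "half-width with smallest CV: " scalar(bestk)
```

Here the window for half-width k contains 2k+1 points in the interior, so k = 2 corresponds to the five-point moving average verified earlier in this log.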
.
. log close
log: c:\Imbook\bwebpage\Section2\mma09p2npmore.txt
log type: text
closed on: 17 May 2005, 14:17:43
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma09p3kernels.txt
log type: text
opened on: 18 May 2005, 21:31:55
.
. ********** OVERVIEW OF MMA09P3KERNELS.DO **********
.
. * STATA Program
.
. * This program plots different kernel regression functions
. * This is not included in the book
. * There is no data
.
. * Results:
. * Epanstata is similar to Gaussian kernel. Less peaked than Epanechnikov.
. * Triangular, Quartic, Triweight and Tricubic are similar,
. * and are more peaked than Epanechnikov
. * The fourth order kernels can take negative values.
.
. * NOTE: For kernel density Stata uses an alternative formulation of Epanechnikov
. *   To follow book and e.g. Hardle (1990) use epan2
. *   (available in Stata version 8.2) rather than epan
.
. ********** SETUP **********
.
. di "mma09p3kernels.do Cameron and Trivedi: Stata Kernel Functions"
mma09p3kernels.do Cameron and Trivedi: Stata Kernel Functions
. set more off
. version 8.0
.
. ********** GENERATE DATA **********
.
. * Graphs will be for z = -2.5 to 2.5 in increments of 0.02
. set obs 251
obs was 0, now 251
. gen z = -2.52 + 0.02*_n
.
. ********** CALCULATE THE KERNELS **********
.
. * Indicator for |z| < 1
. gen abszltone = 1
. replace abszltone = 0 if abs(z)>=1
.
. gen kuniform = 0.5*abszltone
.
. gen ktriangular = (1 - abs(z))*abszltone
.
. * Stata calls the usual Epanechnikov kernel epan2
. gen kepanechnikov = (3/4)*(1 - z^2)*abszltone
.
. * Stata uses alternative epanechnikov
. gen abszltsqrtfive = 1
. replace abszltsqrtfive = 0 if abs(z)>=sqrt(5)
. gen kepanstata = (3/4)*(1 - (z^2)/5)/sqrt(5)*abszltsqrtfive
.
. gen kquartic = (15/16)*((1 - z^2)^2)*abszltone
.
. gen ktriweight = (35/32)*((1 - z^2)^3)*abszltone
.
. gen ktricubic = (70/81)*((1 - (abs(z))^3)^3)*abszltone
.
. gen kgaussian = normden(z)
.
. gen k4thordergauss = (1/2)*(3-(z^2))*normden(z)
.
. * This is the optimal 4th order - Pagan and Ullah p.57
. gen k4thorderquartic = (15/32)*(3 - 10*z^2 + 7*z^4)*abszltone
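
Two properties of the Epanechnikov kernels defined above can be checked numerically: Stata's epan kernel is the usual Epanechnikov kernel K rescaled as K(z/sqrt(5))/sqrt(5), which is why dividing the epan2 bandwidth by sqrt(5) reproduces the same density estimate, and each kernel integrates to about one. The sketch below uses the same grid as this program. The variable names kepan2, kepan, kepan_rescaled and gap are assumptions for illustration, and the integral is only a Riemann-sum approximation with step 0.02.

```stata
* Same grid as above: z = -2.5 to 2.5 in steps of 0.02
clear
set obs 251
generate double z = -2.52 + 0.02*_n

* Usual Epanechnikov (Stata's epan2) and Stata's alternative epan
generate double kepan2 = (3/4)*(1 - z^2)*(abs(z) < 1)
generate double kepan = (3/4)*(1 - (z^2)/5)/sqrt(5)*(abs(z) < sqrt(5))

* epan written as a rescaled version of the usual kernel: K(z/sqrt(5))/sqrt(5)
generate double kepan_rescaled = (3/4)*(1 - (z/sqrt(5))^2)*(abs(z/sqrt(5)) < 1)/sqrt(5)
generate double gap = abs(kepan - kepan_rescaled)
summarize gap

* Riemann-sum approximation to the integral of each kernel
quietly summarize kepan2
display "approximate integral of epan2 kernel: " r(sum)*0.02
quietly summarize kepan
display "approximate integral of epan kernel:  " r(sum)*0.02
```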
.
. sum
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           z |       251           0    1.452033       -2.5        2.5
   abszltone |       251    .3944223    .4897027          0          1
    kuniform |       251    .1972112    .2448514          0         .5
 ktriangular |       251    .1992032    .3058094          0          1
kepanechni~v |       251    .1991833    .2831384          0        .75
-------------+--------------------------------------------------------
abszltsqrt~e |       251    .8884462    .3154457          0          1
  kepanstata |       251     .199203    .1175801          0   .3354102
    kquartic |       251    .1992032    .3209618          0      .9375
  ktriweight |       251    .1992032     .351183          0    1.09375
   ktricubic |       251    .1992032    .3191548          0   .8641976
-------------+--------------------------------------------------------
   kgaussian |       251    .1967985    .1323354   .0175283   .3989423
k4thorderg~s |       251    .2053453    .2297148  -.0327459   .5984134
k4thorderq~c |       251     .199253    .4584096  -.2676096    1.40625
.
. outfile z abszltone kuniform ktriangular kepanechnikov abszltsqrtfive /*
> */ kepanstata kquartic ktriweight ktricubic kgaussian /*
> */ k4thordergauss k4thorderquartic using mma09p3kernels.asc, replace
.
. ********** PLOT THE KERNEL FUNCTIONS **********
.
. * Epanstata is similar to Gaussian kernel. Less peaked than Epanechnikov
. graph twoway (line kuniform z) (line kepanechnikov z) (line kepanstata z) /*
> */ (line kgaussian z), title("Four standard kernel functions")
.
. * Triangular, Quartic, Triweight and Tricubic are similar
. * and are more peaked than Epanechnikov
. graph twoway (line ktriangular z) (line kquartic z) (line ktriweight z) /*
> */ (line ktricubic z), title("Four similar kernel functions")
.
. graph twoway (line k4thordergauss z) (line k4thorderquartic z), /*
> */ title("Two fourth order kernel functions")
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section2\mma09p3kernels.txt
log type: text
closed on: 18 May 2005, 21:32:00
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section2\mma10p1gradient.txt
log type: text
opened on: 17 May 2005, 14:21:11
.
. ********** OVERVIEW OF MMA10P1GRADIENT.DO **********
.
. * STATA Program
.
. * Chapter 10.2.4 page 338-9
. * Gradient Method Example (Newton-Raphson)
. * using artificial data
.
. ********** SETUP **********
.
. set more off
. version 8.0
.
. ********** ANALYSIS: FIRST SIX ROUNDS OF NR **********
.
. * General algorithm is
. * b_(s+1) = b_s + A_s*g_s
.
. * For the example in section 10.2.4
. * Q(b) = -(1/2N) * Sum_i {(y_i-exp(b))^2}
. *      = -(1/2N) * Sum_i {(y_i)^2 - 2*y_i*exp(b) + exp(b)^2}
. *      = ymean*exp(b) - 0.5*(exp(b))^2 - (1/2N) * Sum_i {(y_i)^2}
.
. * so the gradient vector (here a scalar) is
. * g_s = dQ_s / db
. *     = (ymean - exp(b))*exp(b)
.
. * and using the method of scoring variation of Newton-Raphson
. * the weighting matrix (here a scalar) is
. * A_s = Inv [ - E[d^2 Q_s / db^2 ] ]
. *     = Inv [ - E[(ymean - exp(b))*exp(b) - exp(b)*exp(b)] ]
. *     = Inv [ exp(2b) ]   since E[ymean - exp(b)] = 0
. *     = exp(-2b)
.
. * Data
. scalar ymean = 2.0
.
. * Starting value
. scalar b_1 = 0.0
.
. * First round
. scalar g_1 = (ymean - exp(b_1))*exp(b_1)
. scalar A_1 = exp(-2*b_1)
. scalar b_2 = b_1 + A_1*g_1
.
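. * Second round
. scalar g_2 = (ymean - exp(b_2))*exp(b_2)
. scalar A_2 = exp(-2*b_2)
. scalar b_3 = b_2 + A_2*g_2
.
. * Third round
. scalar g_3 = (ymean - exp(b_3))*exp(b_3)
. scalar A_3 = exp(-2*b_3)
. scalar b_4 = b_3 + A_3*g_3
.
. * Fourth round
. scalar g_4 = (ymean - exp(b_4))*exp(b_4)
. scalar A_4 = exp(-2*b_4)
. scalar b_5 = b_4 + A_4*g_4
.
. * Fifth round
. scalar g_5 = (ymean - exp(b_5))*exp(b_5)
. scalar A_5 = exp(-2*b_5)
. scalar b_6 = b_5 + A_5*g_5
.
. * Sixth round
. scalar g_6 = (ymean - exp(b_6))*exp(b_6)
. scalar A_6 = exp(-2*b_6)
.
. * We also calculate the objective function at each round
. * (ignoring the term -(1/2N) * Sum_i {(y_i)^2} which does not depend on b)
. scalar Q_1 = ymean*exp(b_1) - 0.5*(exp(b_1))^2
. scalar Q_2 = ymean*exp(b_2) - 0.5*(exp(b_2))^2
. scalar Q_3 = ymean*exp(b_3) - 0.5*(exp(b_3))^2
. scalar Q_4 = ymean*exp(b_4) - 0.5*(exp(b_4))^2
. scalar Q_5 = ymean*exp(b_5) - 0.5*(exp(b_5))^2
. scalar Q_6 = ymean*exp(b_6) - 0.5*(exp(b_6))^2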
.
. * DISPLAY THE RESULTS GIVEN IN TABLE 10.1 page 339
. di "Round Estimate Gradient Weight Function"
Round Estimate Gradient Weight Function
. di " 1: " b_1 %8.6f " " g_1 %8.6f " " A_1 %8.6f " " Q_1 %8.6f
1: 0 1 1 1.5
. di " 2: " b_2 %8.6f " " g_2 %8.6f " " A_2 %8.6f " " Q_2 %8.6f
2: 1 -1.9524924 .13533528 1.7420356
. di " 3: " b_3 %8.6f " " g_3 %8.6f " " A_3 %8.6f " " Q_3 %8.6f
3: .73575888 -.18171081 .22957678 1.9962098
. di " 4: " b_4 %8.6f " " g_4 %8.6f " " A_4 %8.6f " " Q_4 %8.6f
4: .6940423 -.00358529 .24955284 1.9999984
. di " 5: " b_5 %8.6f " " g_5 %8.6f " " A_5 %8.6f " " Q_5 %8.6f
5: .69314758 -1.602e-06 .2499998 2
. di " 6: " b_6 %8.6f " " g_6 %8.6f " " A_6 %8.6f " " Q_6 %-8.6f
6: .69314718 -3.206e-13 .25 2
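The six rounds displayed above follow directly from the recursion b_(s+1) = b_s + A_s*g_s. As an illustration only, here is a short Python sketch of the same iteration. The function name scoring_iterations is hypothetical; the formulas for g, A and Q, the starting value 0 and ymean = 2 are taken from the log.

```python
import math


def scoring_iterations(ymean=2.0, b_start=0.0, rounds=6):
    """Return a list of (b, g, A, Q), one tuple per round of b_{s+1} = b_s + A_s*g_s."""
    rows = []
    b = b_start
    for _ in range(rounds):
        g = (ymean - math.exp(b)) * math.exp(b)            # gradient dQ/db
        A = math.exp(-2.0 * b)                              # method-of-scoring weight exp(-2b)
        Q = ymean * math.exp(b) - 0.5 * math.exp(b) ** 2    # objective, dropping the term free of b
        rows.append((b, g, A, Q))
        b = b + A * g                                       # update for the next round
    return rows


rows = scoring_iterations()
print("Round  Estimate    Gradient     Weight      Function")
for s, (b, g, A, Q) in enumerate(rows, start=1):
    print(f"{s:5d}  {b:.8f}  {g: .8f}  {A:.8f}  {Q:.8f}")
```

The estimate settles at ln(2), where exp(b) equals ymean and the gradient is zero.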
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section2\mma10p1gradient.txt
log type: text
closed on: 17 May 2005, 14:21:11
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section3\mma11p1boot.txt
log type: text
opened on: 18 May 2005, 15:52:55
.
. ********** OVERVIEW OF MMA11P1BOOT.DO **********
.
. * STATA Program
.
. * Bootstrap applied to exponential regression model
. * Provides
. * (1) Bootstrap distribution of beta and t-statistic (Table 11.1)
. * (2) Various statistics from bootstrap (pages 366-8)
. * (3) Bootstrap density of the t-statistic (Figure 11.1)
.
. * Note: To speed up the program reduce breps - the number of bootstrap replications
. *       But the final program should use many replications
.
. * Note: This program uses ereg which is an old Stata command
. *       superseded by streg, dist(exp)
.
. * Note: For bootstrap see also mma07p4boot.do
. *       which has additional commands / ways to bootstrap
.
. ********** SETUP **********
.
. set more off
. version 8
.
. ********** GENERATE DATA **********
.
. * Model is y ~ exponential(exp(a + bx + cz))
. * where x and z are joint normal (0.1,0.1,0.1,0.1,0.5)
. * i.e. means 0.1 and 0.1
. *      sd's 0.1 and 0.1 and correln 0.5 (so correln^2 = .25)
. *      variances 0.01 and 0.01 and covariance 0.005
.
. * Generate data from joint normal
. * Use fact that x is N(0.1, 0.1^2)
. *      and z | x is N(0.1 + (.005/.01)*(x - .1), .01 x .75 = .0075)
. *      so that st dev = sqrt(0.0075) = 0.0866025
.
. set obs 50
obs was 0, now 50
. set seed 10001
. * Generate x and z bivariate normal
. scalar mu1=0.1
. scalar mu2=0.1
. scalar sig1=0.1
. scalar sig2=0.1
. scalar rho=0.5
. scalar sig12=rho*sig1*sig2
. gen x = mu1 + sig1*invnorm(uniform())
. gen muzgivx = mu2+(sig12/(sig1*sig1))*(x-mu1)
. gen sigzgivx = sqrt(sig2*sig2*(1-rho*rho))
. gen z = muzgivx + sigzgivx*invnorm(uniform())
. * To generate y exponential with mean mu=Ey use
. * Integral 0 to a of (1/mu)exp(-x/mu) dx by change of variables
. * = Integral 0 to a/mu of exp(-t)dt
. * = incomplete gamma function P(1,a/mu) in the terminology of Stata
. gen Ey = exp(-2.0+2*x+2*z)
. gen y = Ey*invgammap(1,uniform())
. gen logy = log(y)
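The data generation above combines two standard devices: drawing z from its conditional normal distribution given x, and drawing an exponential variate by inverting its cdf. A minimal Python sketch of both steps follows, under the parameter values in the log. The names draw_sample and correlation are hypothetical, and Python's generator will not reproduce Stata's seed 10001 draws, so only the distributional properties carry over.

```python
import math
import random


def draw_sample(n, seed=10001, mu1=0.1, mu2=0.1, sig1=0.1, sig2=0.1, rho=0.5):
    """Draw (x, z, Ey, y) with (x, z) bivariate normal and y exponential with mean Ey."""
    rng = random.Random(seed)
    sig12 = rho * sig1 * sig2
    rows = []
    for _ in range(n):
        x = rng.gauss(mu1, sig1)
        # z | x is normal with mean mu2 + (sig12/sig1^2)(x - mu1), sd sig2*sqrt(1 - rho^2)
        mean_z = mu2 + (sig12 / sig1**2) * (x - mu1)
        sd_z = sig2 * math.sqrt(1.0 - rho**2)
        z = rng.gauss(mean_z, sd_z)
        ey = math.exp(-2.0 + 2.0 * x + 2.0 * z)
        # Inverse transform: F(y) = 1 - exp(-y/Ey) = u  implies  y = -Ey*log(1 - u)
        y = -ey * math.log(1.0 - rng.random())
        rows.append((x, z, ey, y))
    return rows


def correlation(a, b):
    """Sample correlation coefficient of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)


sample = draw_sample(50)
xs = [r[0] for r in sample]
zs = [r[1] for r in sample]
print("corr(x, z) in a sample of 50:", round(correlation(xs, zs), 3))
```

With only 50 observations the sample correlation can differ noticeably from 0.5; larger samples move it closer.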
.
. summarize
    Variable |     Obs        Mean   Std. Dev.       Min        Max
-------------+-----------------------------------------------------
           x |      50    .0935209   .1031485 -.1173506   .2778609
     muzgivx |      50    .0967604   .0515742 -.0086753   .1889304
    sigzgivx |      50    .0866025          0  .0866025   .0866025
           z |      50    .1033014   .0909297 -.0885447   .3137469
          Ey |      50    .2114837    .071719  .0945722   .4314067
-------------+-----------------------------------------------------
           y |      50    .2024206   .2237202  .0005293   .9601147
        logy |      50   -2.282336    1.45494 -7.543878  -.0407026
. ereg y x z
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:

Exponential regression -- entry time 0
                          log expected-time form
                                                  Number of obs   =         50
                                                  LR chi2(2)      =       8.75
                                                  Prob > chi2     =     0.0126
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .2670543   1.417339     0.19   0.851    -2.510879    3.044988
           z |   4.663384   1.740712     2.68   0.007     1.251652    8.075117
       _cons |  -2.191619   .2328589    -9.41   0.000    -2.648014   -1.735224
------------------------------------------------------------------------------
.
. save mma11p1boot, replace
file mma11p1boot.dta saved
.
. outfile y x z using mma11p1boot.asc, replace
.
. ********** SIMPLE BOOTSTRAP **********
.
. * Stata produces four bootstrap 100*(1-alpha) confidence intervals
. * (N) and (P) have no asymptotic refinement
. * (BC)-(BCA) have asymptotic refinement
. * For details see program mma07p4boot.do
.
. * From page 399, for testing better to use 999 than 1000
. global breps = 999 /* The number of bootstrap reps used below */
.
. set seed 20001
.
. * A simple and adequate bootstrap command for the slope coefficients is
. bs "ereg y x z" "_b[x] _b[z]", reps($breps) level(95)
command:      ereg y x z
statistics:   _bs_1      = _b[x]
              _bs_2      = _b[z]
                                                  Number of obs    =        50
                                                  Replications     =       999

------------------------------------------------------------------------------
Variable     |  Reps  Observed      Bias  Std. Err. [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _bs_1 |   999  .2670543 -.1885509  1.420956   -2.52135   3.055458   (N)
             |                                        -2.9054   2.696445   (P)
             |                                      -2.590993   2.864327  (BC)
       _bs_2 |   999  4.663384  .0524786  1.939086   .8582302   8.468539   (N)
             |                                       .5006047   8.483892   (P)
             |                                        .231034   8.174835  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
.
. ********** MORE DETAILED BOOTSTRAP **********
.
. * The following bootstrap also gives standard error at each replication
. * and saves data from replications for further analysis
.
. * In particular, want to use the percentile-t method,
. * which provides asymptotic refinement
.
. * Stata does not give this. For methods see
. * e.g. Efron and Tibshirani (1993, pp.160-162)
. * e.g. Cameron and Trivedi (2005) Chapter 11.2.6-11.2.7
. * For sample s compute t-test(s) = (bhat(s)-bhat) / se(s)
. * where bhat is initial estimate
. * and bhat(s) and se(s) are for sth round.
. * Order the t-test(s) statistics and choose the alpha/2 percentiles
. * which give the critical values for the t-test
.
. * Implementation requires saving the results from each bootstrap replication
. * in order to obtain critical values from percentiles of bootstrap distribution
.
. use mma11p1boot.dta, clear
.
. * Get and store coefficients (b)
. * for regressors in the original model and data before bootstrap
. quietly ereg y x z
. global bx=_b[x]
. global sex=_se[x]
. global bz=_b[z]
. global sez=_se[z]
. di " Coefficients bx: " $bx " and bz: " $bz
Coefficients bx: .26705432 and bz: 4.6633845
. di " Standard error sex: " $sex " and sez: " $sez
Standard error sex: 1.4173391 and sez: 1.7407119

.
. * Bootstrap and save coeff estimates and se's from each replication
. set seed 20001
. bs "ereg y x z" "_b[x] _b[z] _se[x] _se[z]", reps($breps) level(95) saving(mma11p1bootreps) repl
> ace
command:      ereg y x z
statistics:   _bs_1      = _b[x]
              _bs_2      = _b[z]
              _bs_3      = _se[x]
              _bs_4      = _se[z]
                                                  Number of obs    =        50
                                                  Replications     =       999

------------------------------------------------------------------------------
Variable     |  Reps  Observed      Bias  Std. Err. [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _bs_1 |   999  .2670543 -.1885509  1.420956   -2.52135   3.055458   (N)
             |                                        -2.9054   2.696445   (P)
             |                                      -2.590993   2.864327  (BC)
       _bs_2 |   999  4.663384  .0524786  1.939086   .8582302   8.468539   (N)
             |                                       .5006047   8.483892   (P)
             |                                        .231034   8.174835  (BC)
       _bs_3 |   999  1.417339  .0644196  .1718393   1.080131   1.754547   (N)
             |                                       1.234399   1.902349   (P)
             |                                       1.196068   1.742845  (BC)
       _bs_4 |   999  1.740712  .0910103   .186631   1.374478   2.106946   (N)
             |                                       1.542322   2.257937   (P)
             |                                       1.453673   2.058318  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
.
. * Now use the bootstrap estimates
. use mma11p1bootreps, clear
(bootstrap: ereg y x z)
. sum
    Variable |     Obs        Mean   Std. Dev.       Min        Max
-------------+-----------------------------------------------------
       _bs_1 |     999    .0785034   1.420956 -9.431229   4.278278
       _bs_2 |     999    4.715863   1.939086 -1.747643   12.09208
       _bs_3 |     999    1.481759   .1718393  1.145421   2.761842
       _bs_4 |     999    1.831722    .186631  1.387625   2.910449
. * Order comes from "_b[x] _b[z] _se[x] _se[z]" in earlier bs

. gen bxs = _bs_1
. gen bzs = _bs_2
. gen sexs = _bs_3
. gen sezs = _bs_4
. gen ttestxs = (bxs - $bx)/sexs
. gen ttestzs = (bzs - $bz)/sezs
.
. ********** (1) TABLE 11.1 (page 367)
.
. summarize bzs ttestzs, d
                             bzs
-------------------------------------------------------------
      Percentiles      Smallest
 1%    -.3361366      -1.747643
 5%     1.544816      -1.716207
10%     2.270323      -1.366866       Obs                 999
25%     3.570291      -1.205571       Sum of Wgt.         999

50%      4.77197                      Mean           4.715863
                        Largest       Std. Dev.      1.939086
75%     5.970802       10.10243
90%     7.100958       10.42623       Variance       3.760056
95%     7.810663       10.76733       Skewness      -.1344324
99%     9.426978       12.09208       Kurtosis       3.545415

                           ttestzs
-------------------------------------------------------------
      Percentiles      Smallest
 1%     -2.66391      -3.921595
 5%    -1.727528      -3.483456
10%     -1.32364      -3.201425       Obs                 999
25%    -.6209012      -2.975815       Sum of Wgt.         999

50%     .0618649                      Mean           .0261125
                        Largest       Std. Dev.      1.046855
75%     .7034938       2.693856
90%     1.323415       3.087892       Variance       1.095904
95%      1.70558        3.11692       Skewness      -.1596043
99%     2.529097       3.738328       Kurtosis       3.337749
.
. * Additionally need the 2.5 and 97.5 percentiles not given in summarize, d
.
. * Coefficient of z
. _pctile bzs, p(2.5,97.5)
. di " Lower 2.5 and upper 2.5 percentile of coeff b for z: " r(r1) " and " r(r2)
Lower 2.5 and upper 2.5 percentile of coeff b for z: .50060469 and 8.4838924
.
. * t-statistic for z
. _pctile ttestzs, p(2.5,97.5)
. di " Lower 2.5 and upper 2.5 percentile of ttest on z: " r(r1) " and " r(r2)
Lower 2.5 and upper 2.5 percentile of ttest on z: -2.1827998 and 2.0659592
.
. ********** (2) RESULTS IN TEXT PAGES 366-7 **********
.
. * (2A) Bootstrap standard error estimate (no refinement)
. * These are given earlier in bootstrap table output
. * Equivalently get the standard deviation of bzs
.
. quietly sum bzs
. scalar bzbootse = r(sd)
. di "Bootstrap estimate of standard error: " bzbootse
Bootstrap estimate of standard error: 1.9390864
.
. * (2B) Test b3 = 0 using percentile-t method (asymptotic refinement)
. * Use the 2.5% and 97.5% bootstrap critical values for t-statistic for z
.
. _pctile ttestzs, p(2.5,97.5)
. di " Lower 2.5 and upper 2.5 percentile of ttest on z: " r(r1) " and " r(r2)
Lower 2.5 and upper 2.5 percentile of ttest on z: -2.1827998 and 2.0659592
.
. * (2D) 95% confidence interval with asymptotic refinement
. * Use the preceding critical values
.
. scalar lbz = $bz + r(r1)*$sez /* Note the plus sign here */
. scalar ubz = $bz + r(r2)*$sez
. di " Percentile-t interval lower and upper bounds: (" lbz "," ubz ")"
Percentile-t interval lower and upper bounds: (.86375888,8.2596243)
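The percentile-t steps used above (compute a t-statistic in each bootstrap resample, take its 2.5 and 97.5 percentiles, then rescale by the original standard error) can be sketched in Python. This is an illustrative sketch only: it bootstraps the mean of artificial exponential data rather than the exponential regression in the log, the helper names are hypothetical, and the nearest-rank percentile used here may differ slightly from Stata's _pctile. The sign convention follows the log (both critical values are added to the estimate).

```python
import math
import random


def percentile(sorted_vals, p):
    """Nearest-rank empirical percentile (an assumption; _pctile may interpolate)."""
    k = math.ceil(p / 100.0 * len(sorted_vals)) - 1
    return sorted_vals[max(0, min(len(sorted_vals) - 1, k))]


def percentile_t_interval(bhat, se, tstats, alpha=0.05):
    """Interval (bhat + t_lo*se, bhat + t_hi*se) from bootstrap t-statistics."""
    t_sorted = sorted(tstats)
    t_lo = percentile(t_sorted, 100.0 * alpha / 2.0)
    t_hi = percentile(t_sorted, 100.0 * (1.0 - alpha / 2.0))
    return bhat + t_lo * se, bhat + t_hi * se


def mean_and_se(sample):
    """Sample mean and its conventional standard error."""
    n = len(sample)
    m = sum(sample) / n
    var = sum((v - m) ** 2 for v in sample) / (n - 1)
    return m, math.sqrt(var / n)


rng = random.Random(20001)
data = [rng.expovariate(1.0) for _ in range(50)]   # artificial data, not the log's
bhat, se = mean_and_se(data)

tstats = []
for _ in range(999):
    resample = [rng.choice(data) for _ in data]
    b_s, se_s = mean_and_se(resample)
    tstats.append((b_s - bhat) / se_s)              # t-test(s) = (bhat(s) - bhat) / se(s)

lower, upper = percentile_t_interval(bhat, se, tstats)
print("Percentile-t interval for the toy mean:", (round(lower, 4), round(upper, 4)))

# Arithmetic check of the interval reported in the log, using its critical values.
log_lower = 4.6633845 + (-2.1827998) * 1.7407119
log_upper = 4.6633845 + 2.0659592 * 1.7407119
print("Interval from the log's values:", (round(log_lower, 5), round(log_upper, 5)))
```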
.
. * (2B-Var) Variation for symmetric two-sided test on z
.
. gen absttestzs = abs(ttestzs)

. _pctile absttestzs, p(95)
. di " Upper 5 percentile of symmetric two-sided test on z: " r(r1)
Upper 5 percentile of symmetric two-sided test on z: 2.0775187
.
. * (2C) Test b3 = 0 without asymptotic refinement
. * Usual Wald test except use bootstrap estimate of standard error
.
. scalar Wald = ($bz - 0) / bzbootse
. di "Wald statistic using bootstrap standard error: " Wald
Wald statistic using bootstrap standard error: 2.404939
.
. * (2E) Bootstrap estimate of bias
. * This is given in the earlier bootstrap results table
. * and is explained in the text
.
. ********** (3) FIGURE 11.1 (p.368) PLOTS ESTIMATED DENSITY OF T-STATISTIC FOR Z
.
. set scheme s1mono
. label var ttestzs "Bootstrap t-statistic"
. kdensity ttestzs, normal /*
> */ title("Bootstrap Density of 't-Statistic'") /*
> */ xtitle("t-statistic from each bootstrap replication", size(medlarge)) xscale(titlegap(*5)) /*
> */ legend( label(1 "Bootstrap Estimate") label(2 "Standard Normal"))
. graph save ch11boot, replace
(file ch11boot.gph saved)
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section3\mma11p1boot.txt
log type: text
closed on: 18 May 2005, 15:53:47
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section3\mma12p1integration.txt
log type: text
opened on: 18 May 2005, 21:17:14
.
. ********** OVERVIEW OF MMA12P1INTEGRATION.DO **********
.
. * STATA Program
.
. * Computes integral numerically and by simulation
. * (1) Illustrate Midpoint Rule (page 392)
. * (2) Illustrate Monte Carlo integral (Table 12.1 page 392)
. *     for computing E[x] and E[exp(-exp(x))] for x ~ N[0,1]
.
. * No data need be read in.
.
. ********** SETUP **********
.
. set more off
. version 8.0
.
. ********** (1) NUMERICAL INTEGRATION USING MIDPOINT RULE **********
.
. * Midpoint rule for n evaluation points between a and b is
. * Integral = Sum (j=1 to n) [(b-a)/n]*f(xbar_j)
. * where xbar_j is midpoint between x_j-1 and x_j
.
. program midpointrule, rclass
1. version 8
. args neval a b
3. drop _all
4. scalar increment = (`b'-`a') / `neval'
5. set obs `neval'
6. /* Compute the function of interest */
. gen xbar = `a' - 0.5*increment + increment*_n
7. gen density = exp(-xbar*xbar/2)/sqrt(2*_pi)
8. * Following is contribution to E[x] when x ~ N[0,1]
. gen f1xbar = xbar*density
9. * Following is contribution to E[exp(-exp(x))] when x ~ N[0,1]
. gen f2xbar = exp(-exp(xbar))*density
10. /* Compute the averages */
. quietly sum f1xbar
11. scalar Ex = r(sum)*increment
12. quietly sum f2xbar
13. scalar Eexpminexpx = r(sum)*increment
14. /* Print results */
. di "Evaluation points: " `neval' " over range: (" `a' "," `b' ")"
15. di "Midpoint rule estimate of E[x] is: " Ex
16. di "Midpoint rule estimate of E[exp(-exp(x))] is: " Eexpminexpx
17. end
.
. midpointrule 20 -5 5
obs was 0, now 20
Evaluation points: 20 over range: (-5,5)
Midpoint rule estimate of E[x] is: 0
Midpoint rule estimate of E[exp(-exp(x))] is: .38175625
obs was 0, now 200
obs was 0, now 2000
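The midpoint rule used above is short enough to restate in Python as a cross-check. This is a sketch with hypothetical function names; it uses the same formula as the Stata program, Integral = Sum_j [(b-a)/n]*f(xbar_j), with 20 evaluation points on (-5, 5). Stata stores generated variables as floats by default, so agreement with the logged value is to about seven digits.

```python
import math


def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def midpoint_rule(f, a, b, n):
    """Midpoint-rule approximation to the integral of f over (a, b) with n points."""
    h = (b - a) / n
    # xbar_j = a + (j - 0.5)*h is the midpoint of the j-th subinterval
    return h * sum(f(a + (j - 0.5) * h) for j in range(1, n + 1))


ex = midpoint_rule(lambda x: x * normal_pdf(x), -5.0, 5.0, 20)
eexp = midpoint_rule(lambda x: math.exp(-math.exp(x)) * normal_pdf(x), -5.0, 5.0, 20)

print("Midpoint rule estimate of E[x] is:", ex)
print("Midpoint rule estimate of E[exp(-exp(x))] is:", round(eexp, 8))
```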
.
. ********** (2) MONTE CARLO INTEGRATION USING DRAWS FROM DENSITY OF X **********
.
. * To get E[g(x)]
. * make draws from N[0,1], compute g(x), and average over draws
.
. program simintegration, rclass
1. version 8
. args nsims
3. /* Generate the data: here x */
. drop _all
4. set obs `nsims'
5. set seed 10101
6. gen x = invnorm(uniform())
7. /* Compute the function of interest */
. gen f1x = x /* For E[x] just need x */
8. gen f2x = exp(-exp(x)) /* For E[exp(-exp(x))] */
9. /* Compute the averages */
. quietly sum f1x
10. scalar Ex = r(mean)
11. quietly sum f2x

12. scalar Eexpminexpx = r(mean)
13. di "Number of simulations: " `nsims'
14. di "Monte Carlo estimate of E[x] is: " Ex
15. di "Monte Carlo estimate of E[exp(-exp(x))] is: " Eexpminexpx
16. end
.
. * Note a different program was used to obtain Table 12.1 on page 392
. * So results will differ somewhat from text, except for very high number of simulations
.
. simintegration 10
obs was 0, now 10
Number of simulations: 10
Monte Carlo estimate of E[x] is: -.10143571
Monte Carlo estimate of E[exp(-exp(x))] is: .42635197
. simintegration 25
obs was 0, now 25
Monte Carlo estimate of E[x] is: .17496346
. simintegration 50
obs was 0, now 50
. simintegration 100
obs was 0, now 100
obs was 0, now 500
obs was 0, now 1000
obs was 0, now 1000
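The Monte Carlo approach above replaces the integral by an average of g(x) over draws of x from N[0,1]. A minimal Python sketch of the same idea follows. The function name monte_carlo_mean is hypothetical, and Python's generator with seed 10101 will not reproduce Stata's draws, so individual estimates differ from those in the log.

```python
import math
import random


def monte_carlo_mean(g, nsims, seed=10101):
    """Average g(x) over nsims independent draws of x from N[0,1]."""
    rng = random.Random(seed)
    return sum(g(rng.gauss(0.0, 1.0)) for _ in range(nsims)) / nsims


for nsims in (10, 100, 1000, 10000):
    ex = monte_carlo_mean(lambda x: x, nsims)
    eexp = monte_carlo_mean(lambda x: math.exp(-math.exp(x)), nsims)
    print(f"S = {nsims:6d}   E[x] ~ {ex: .5f}   E[exp(-exp(x))] ~ {eexp:.5f}")
```

The estimates settle near 0 and near the midpoint-rule value of about 0.3818 only as the number of simulations grows, at the usual 1/sqrt(S) rate.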

. clear
. set mem 20m
(20480k)
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section3\mma12p1integration.txt
log type: text
closed on: 18 May 2005, 21:17:16
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section3\mma12p2mslmsm.txt
log type: text
opened on: 18 May 2005, 21:46:27
.
. ********** OVERVIEW OF MMA12P2MSLMSM.DO **********
.
. * STATA Program
.
. * Chapter 12.4.5 pages 397-8 and 12.5.5 pages 402-4
. * Computes integral numerically and by simulation
. * (1) Maximum Simulated likelihood Table 12.2
. * (2) Method of Simulated Moments Table 12.3
. * with application to generated data
.
. * The application is only illustrative.
. * This is not a template program for MSL or MSM.
.
. * Different number of simulations S lead to different estimators.
. * This program gives entries in Tables 12.2 and 12.3 for S = 100
. * For other values of S change the value of simreps
. * from the current global simreps 100

.
. ********** SETUP **********
.
. set more off
. version 8
.
.
. * Model is y = theta + u + e
. * where theta is a scalar parameter equal to 1
. *       u is extreme value type 1
. *       e is N(0,1)
. * n is set in global numobs
.
. ********** DEFINE GLOBALS **********
.
. global simreps 100 /* change this to change the number of simulations */
. global numobs 100 /* change this to change the number of observations */
.
.
. ********** (1) MAXIMUM SIMULATED LIKELIHOOD (Table 12.2 p.398) **********
.
. * This MSL program is inefficiently written computer code
. * as it requires drawing the same random variates at each iteration
.
. * Generate data
. clear
. set obs $numobs
obs was 0, now 100
. set seed 10101
. gen u = -log(-log(uniform()))
. gen e = invnorm(uniform())
. gen y = 1 + u + e
. summarize u e y
    Variable |     Obs        Mean   Std. Dev.       Min        Max
-------------+-----------------------------------------------------
           u |     100    .7236045   1.372637 -1.827296   6.423636
           e |     100    .0415449   .9472174 -2.906972   2.302204
           y |     100    1.765149   1.684177 -2.227185   8.143228
.
. outfile u e y using mma12p2mslmsm.asc, replace
.
. * Use the variant ml d0 as this gives the entire likelihood, not just one observation.
. * I want this so that the seed is only reset for the entire data.
. * My program is inefficient as the variates need to be redrawn at each iteration
. program define msl
1. version 6.0
2. args todo b lnf /* Need to use the names todo b and lnf
>                     todo always contains 1 and may be ignored
>                     b is parameters and lnf is log-density */
3. tempvar theta1 /* create as needed to calculate lf, g, ... */
4. mleval `theta1' = `b', eq(1) /* theta1 is theta1_i = x_i'b */
5. local y "$ML_y1" /* create to make program more readable */
6. set seed 10101
7. tempvar denssim
8. global isim=1
9. quietly gen `denssim' = exp(-0.5*(`y'-`theta1'+log(-log(uniform())))^2)/sqrt(2*_pi)
10. while $isim < $simreps {
11. quietly replace `denssim' = `denssim' + exp(-0.5*(`y'-`theta1'+log(-log(uniform())))^2)/sqrt(2*_pi)
12. global isim=$isim+1
13. }
14. mlsum `lnf' = ln(`denssim'/$isim)
15. end
.
. gen one = 1
. ml model d0 msl (y = one, nocons )
. ml maximize
initial:
alternative:   log likelihood = -199.54479
rescale:
                                                  Number of obs   =        100
                                                  Wald chi2(1)    =      65.72
                                                  Prob > chi2     =     0.0000
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         one |   1.177456   .1452451     8.11   0.000     .8927806    1.462131
------------------------------------------------------------------------------
.
. *** Display MSL results in one column of Table 12.2 p.398
.
. di "For number of simulations S = " $simreps
For number of simulations S = 100
. di "MSL estimator: " _b[one]
MSL estimator: 1.1774557
. di "Standard error: " _se[one]
Standard error: .14524511
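The simulated likelihood maximized above averages the N(0,1) density of y - theta - u over S draws of the extreme value error u, holding the draws fixed across evaluations of theta. The Python sketch below illustrates that construction under several assumptions that are not in the log: it uses 200 observations, Python's own random draws, and a plain grid search over theta in place of Stata's ml optimizer. All function names are hypothetical.

```python
import math
import random


def draw_ev1(rng):
    """Type 1 extreme value draw by inverse transform, u = -log(-log(U))."""
    v = rng.random()
    while v <= 0.0:                     # guard against log(0)
        v = rng.random()
    return -math.log(-math.log(v))


def simulated_loglik(theta, ys, draws):
    """Sum over i of log[(1/S) * Sum_s phi(y_i - theta - u_is)], with the draws held fixed."""
    const = 1.0 / math.sqrt(2.0 * math.pi)
    total = 0.0
    for y, us in zip(ys, draws):
        dens = const * sum(math.exp(-0.5 * (y - theta - u) ** 2) for u in us) / len(us)
        total += math.log(dens)
    return total


rng = random.Random(10101)
n_obs, n_sims = 200, 100                # assumed sizes for this sketch
ys = [1.0 + draw_ev1(rng) + rng.gauss(0.0, 1.0) for _ in range(n_obs)]
draws = [[draw_ev1(rng) for _ in range(n_sims)] for _ in range(n_obs)]

# Crude grid search over theta in [-1, 3]; a real application would use an optimizer.
grid = [-1.0 + 0.05 * k for k in range(81)]
theta_hat = max(grid, key=lambda t: simulated_loglik(t, ys, draws))
print("MSL estimate of theta (grid search):", round(theta_hat, 2))
```

Reusing the same draws at every theta keeps the simulated objective smooth in theta, which is the purpose of resetting the seed inside the Stata evaluator.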
.
. ********** (2) METHOD OF SIMULATED MOMENTS (Table 12.3 p.404) **********
.
. clear
. set obs $numobs
obs was 0, now 100
. set seed 10101
. gen u = -log(-log(uniform()))
. gen e = invnorm(uniform())
. gen y = 1 + u + e
. summarize u e y
    Variable |     Obs        Mean   Std. Dev.       Min        Max
-------------+-----------------------------------------------------
           u |     100    .7236045   1.372637 -1.827296   6.423636
           e |     100    .0415449   .9472174 -2.906972   2.302204
           y |     100    1.765149   1.684177 -2.227185   8.143228
.
. global isim=1
. gen usim = -log(-log(uniform()))
. gen esim = invnorm(uniform())
. while $isim < $simreps {
2. quietly replace usim = usim-log(-log(uniform()))
3. quietly replace esim = esim+invnorm(uniform())
4. global isim=$isim+1
5. }
. gen usimbar = usim/$isim
. gen esimbar = esim/$isim
. gen theta = y - usimbar - esimbar
. summarize
    Variable |     Obs        Mean   Std. Dev.       Min        Max
-------------+-----------------------------------------------------
           u |     100    .7236045   1.372637 -1.827296   6.423636
           e |     100    .0415449   .9472174 -2.906972   2.302204
           y |     100    1.765149   1.684177 -2.227185   8.143228
        usim |     100    57.36345   13.16979  21.96637   90.07499
        esim |     100   -.9702956   11.38655 -26.38858   33.28406
-------------+-----------------------------------------------------
     usimbar |     100    .5736345   .1316979  .2196637   .9007499
     esimbar |     100    -.009703   .1138655 -.2638858   .3328406
       theta |     100    1.201218   1.681435 -2.757669    7.75245
.
. * Results for Table 12.3 on page 404
. * Here the st. error of theta_MSM is approximated by the st. dev. of theta
. * divided by the square root of S (the number of simulations)
. quietly sum theta
. scalar theta_MSM = r(mean)
. scalar approx_sterror = r(sd)/sqrt($simreps)
.
. * Display MSM results in one column of Table 12.3 p.404
. di "For number of simulations S = " $simreps
For number of simulations S = 100
. di "MSM estimator: " theta_MSM
MSM estimator: 1.2012178
. di "Approximate standard error: " approx_sterror
Approximate standard error: .16814348
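The MSM estimator above solves the moment condition E[y - theta - u - e] = 0 with the means of u and e replaced by averages of S simulated draws per observation. A minimal Python sketch follows; the names are hypothetical and the random draws are Python's, not Stata's. As an assumption of this sketch, the approximate standard error divides the standard deviation of theta_i by the square root of the number of observations, which gives the same divisor as the log's sqrt(S) there because both the number of observations and S equal 100.

```python
import math
import random


def draw_ev1(rng):
    """Type 1 extreme value draw by inverse transform, u = -log(-log(U))."""
    v = rng.random()
    while v <= 0.0:                     # guard against log(0)
        v = rng.random()
    return -math.log(-math.log(v))


def msm_estimate(ys, n_sims, rng):
    """Return (theta_MSM, approximate standard error) for y = theta + u + e."""
    thetas = []
    for y in ys:
        u_bar = sum(draw_ev1(rng) for _ in range(n_sims)) / n_sims
        e_bar = sum(rng.gauss(0.0, 1.0) for _ in range(n_sims)) / n_sims
        thetas.append(y - u_bar - e_bar)          # theta_i = y_i - usimbar_i - esimbar_i
    n = len(thetas)
    mean = sum(thetas) / n
    sd = math.sqrt(sum((t - mean) ** 2 for t in thetas) / (n - 1))
    return mean, sd / math.sqrt(n)


rng = random.Random(10101)
ys = [1.0 + draw_ev1(rng) + rng.gauss(0.0, 1.0) for _ in range(2000)]
theta_msm, approx_se = msm_estimate(ys, 100, rng)
print("MSM estimator:", round(theta_msm, 4))
print("Approximate standard error:", round(approx_se, 4))
```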
.
. * As written this will not give the correct standard errors (see p.403).
. * Can get this by also computing the squared rv to get E[y^2]
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section3\mma12p2mslmsm.txt
log type: text
closed on: 18 May 2005, 21:46:28
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section3\mma12p3draws.txt
log type: text
opened on: 18 May 2005, 21:48:36
.
. ********** OVERVIEW OF MMA12P3DRAWS.DO **********
.
. * STATA Program
.
. * Draws figures that illustrate two common ways to draw random variates
.
. * (1) Illustrate Inverse Transformation method: Figure 12.2
. * (2) Illustrate Envelope method: Figure 12.3
.
. * No data need be read in.
.
. ********** SETUP **********
.
. set more off
. version 8
. set scheme s1mono
.
. ********** (1) INVERSE TRANSFORMATION - FIGURE 12.2 page 413 **********
.
. * Graph is for x = 0 to 5 in increments of 0.05
. set obs 100
obs was 0, now 100
. gen x = 0.05*_n
. * Unit Exponential cdf
. gen Fx = 1 - exp(-x)
. * Suppose uniform draw is 0.64
. gen uniformdraw = 0.64
.
. graph twoway (line Fx x, yline(0.64) xline(1.02)), /*
> */ title("Inverse Transformation Method") /*
> */ xtitle("Random variable x", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Cdf F(x)", size(medlarge)) yscale(titlegap(*5)) /*
> */ caption(" " "Draw of 0.64 (vertical axis) yields x = 1.02 (horizontal axis).")
. graph save ch12fig2invtransform, replace
(file ch12fig2invtransform.gph saved)
. graph export ch12fig2invtransform.wmf, replace
(file c:\Imbook\bwebpage\Section3\ch12fig2invtransform.wmf written in Windows Metafile format)
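The figure's caption can be verified by inverting the unit exponential cdf: F(x) = 1 - exp(-x) = u gives x = -log(1 - u). The short Python sketch below, with a hypothetical function name, computes the value for the uniform draw 0.64 and then uses the same inverse to generate a batch of exponential variates.

```python
import math
import random


def unit_exponential_inverse_cdf(u):
    """Solve F(x) = 1 - exp(-x) = u for x."""
    return -math.log(1.0 - u)


x_from_draw = unit_exponential_inverse_cdf(0.64)
print("Uniform draw 0.64 yields x =", round(x_from_draw, 2))

# Inverse transformation applied to many uniform draws gives unit exponential variates.
rng = random.Random(10101)
variates = [unit_exponential_inverse_cdf(rng.random()) for _ in range(50000)]
print("Sample mean of the variates:", round(sum(variates) / len(variates), 3))
```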
.
. ********** (2) ENVELOPE METHOD - FIGURE 12.3 **********
.
. * The following is a modification of the figure in the book
. * making clear that the envelope is a scaling up of g(x)
.
. clear
.
. * Graph is for x = 0 to 10 in increments of 0.1
. set obs 101
obs was 0, now 101
. gen x = -0.05 + 0.1*_n
. * Desired density f(x) is N(4,1); envelope is 1.5*f(x) + 0.005
. gen fx = normden(x-4)
. gen gx = 1.5*normden(x-4)+0.005
.
. graph twoway (line fx x, clstyle(p1)) /*
> */ (line gx x, clstyle(p1) clwidth(*2) clcolor(gs12)), /*
> */ title("Accept-reject Method") /*
> */ xtitle("Random variable x", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("f(x) and kg(x)", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend( label(1 "Desired density f(x)") label(2 "Envelope kg(x)") )
. graph save ch12fig3envelope, replace
(file ch12fig3envelope.gph saved)
. graph export ch12fig3envelope.wmf, replace
(file c:\Imbook\bwebpage\Section3\ch12fig3envelope.wmf written in Windows Metafile format)
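The figure only plots a desired density under its envelope; the sampling step itself is not in the program. As an illustration of the accept-reject idea, the Python sketch below uses a different, deliberately simple pair chosen for this example and not taken from the book: target f(x) = 6x(1-x) on [0,1], proposal g uniform on [0,1], and scale k = 1.5 so that f(x) <= k*g(x) everywhere. A proposal x is accepted when u*k*g(x) <= f(x) for an independent uniform u.

```python
import random

K = 1.5  # envelope scale: max of f is 1.5 at x = 0.5, and g(x) = 1 on [0,1]


def target_density(x):
    """Desired density f(x) = 6x(1-x) on [0,1] (assumed for this example)."""
    return 6.0 * x * (1.0 - x)


def accept_reject(n, seed=10101):
    """Return n draws from f and the observed acceptance rate."""
    rng = random.Random(seed)
    accepted = []
    proposals = 0
    while len(accepted) < n:
        x = rng.random()                 # proposal from g = Uniform(0,1)
        u = rng.random()                 # uniform used for the accept decision
        proposals += 1
        if u * K * 1.0 <= target_density(x):
            accepted.append(x)
    return accepted, len(accepted) / proposals


draws, rate = accept_reject(20000)
print("Mean of accepted draws:", round(sum(draws) / len(draws), 3))
print("Acceptance rate:", round(rate, 3), "(theoretical value 1/k)")
```

A tighter envelope (smaller k) wastes fewer proposals, which is why the figure stresses that the envelope is a scaled-up version of g(x).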
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section3\mma12p3draws.txt
log type: text
closed on: 18 May 2005, 21:48:42
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section3\mma13p1bayesthm.txt
log type: text
opened on: 24 May 2005, 11:04:08
.
. ********** OVERVIEW OF MMA13P1BAYESTHM.DO **********
.
. * STATA Program
.
. * Chapter 13.2.2 page 424
. * Create Figure 13.1
. * (1) Bayes Analysis illustrated using normal distribution and prior
.
. * No data are needed.
.
. ********** SETUP **********
.
. set more off
. version
version 8.2
.
.
. * Model is y ~ normal(theta, sigmasq) where sigmasq is known,
. * and the prior is theta ~ normal(mu, tausq)
. * which gives a normal posterior
. * n is set below in set obs
.
. ********** CREATE DATA **********
.
. * The likelihood and prior are normal so the posterior is also normal
.
. * Will evaluate the densities at points between 0 and 15
. set obs 150
obs was 0, now 150
. gen xeval = 0.1*_n
.
. * Likelihood with sigmasq known

. scalar nobs = 50
. scalar ybar = 10
. scalar sigmasq = 100
. gen likelihood = normden(xeval,ybar,sqrt(sigmasq/nobs))
.
. * Prior
. scalar mu = 5
. scalar tausq = 3
. gen prior = normden(xeval,mu,sqrt(tausq))
.
. * Posterior given sample mean ybar
. scalar tau1sq=1/((nobs/sigmasq)+(1/tausq))
. scalar mu1 = tau1sq*((ybar*nobs/sigmasq)+(mu/tausq))
. gen posterior = normden(xeval,mu1,sqrt(tau1sq))
.
. scalar list
       mu1 =          8
    tau1sq =        1.2
     tausq =          3
        mu =          5
   sigmasq =        100
      ybar =         10
      nobs =         50
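The posterior scalars listed above follow from the normal-normal updating formulas in the program: the posterior precision is the sum of the data precision nobs/sigmasq and the prior precision 1/tausq, and the posterior mean is the precision-weighted average of ybar and mu. A small Python sketch with a hypothetical function name reproduces the two numbers.

```python
def normal_posterior(ybar, nobs, sigmasq, mu, tausq):
    """Posterior mean and variance of theta for y ~ N(theta, sigmasq), theta ~ N(mu, tausq)."""
    precision = nobs / sigmasq + 1.0 / tausq          # data precision + prior precision
    tau1sq = 1.0 / precision
    mu1 = tau1sq * (ybar * nobs / sigmasq + mu / tausq)
    return mu1, tau1sq


mu1, tau1sq = normal_posterior(ybar=10.0, nobs=50, sigmasq=100.0, mu=5.0, tausq=3.0)
print("Posterior mean mu1:", round(mu1, 6))
print("Posterior variance tau1sq:", round(tau1sq, 6))
```

The posterior mean of 8 lies between the prior mean 5 and the sample mean 10, closer to the sample mean because the data precision 0.5 exceeds the prior precision 1/3.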
. summarize
    Variable |     Obs        Mean   Std. Dev.       Min        Max
-------------+-----------------------------------------------------
       xeval |     150        7.55   4.344537        .1         15
  likelihood |     150    .0666548   .0944174  6.44e-12   .2820948
       prior |     150    .0665247   .0804685  1.33e-08   .2303294
   posterior |     150    .0666667   .1131755  1.85e-12   .3641828
.
. graph twoway (line likelihood xeval, clstyle(p2)) /*
> */ (line prior xeval, clstyle(p3)) /*
> */ (line posterior xeval, clstyle(p1)), /*
> */ title("Bayes: Likelihood, Prior and Posterior") /*
> */ xtitle("Evaluation point", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Density", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend( label(1 "Likelihood N[10,2]") label(2 "Prior N[5,3]") /*
> */ label(3 "Posterior N[8,1.2]") )
. graph save Ch13_Bayes1, replace

(file Ch13_Bayes1.gph saved)
. graph export Ch13_Bayes1.wmf, replace
(file c:\Imbook\bwebpage\Section3\Ch13_Bayes1.wmf written in Windows Metafile format)
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section3\mma13p1bayesthm.txt
log type: text
closed on: 24 May 2005, 11:04:12
1                          The SAS System          08:50 Wednesday, May 25, 2005
NOTE: Copyright (c) 2002-2003 by SAS Institute Inc., Cary, NC, USA.
NOTE: SAS (r) 9.1 (TS1M2)
Licensed to UNIV OF CA/DAVIS, Site 0029107010.
NOTE: This session is executing on the SunOS 5.9 platform.
You are running SAS 9. Some SAS 8 files will be automatically converted
by the V9 engine; others are incompatible. Please see
http://support.sas.com/rnd/migration/planning/platform/64bit.html
PROC MIGRATE will preserve current SAS file attributes and is
recommended for converting all your SAS libraries from any
SAS 8 release to SAS 9. For details and examples, please see
http://support.sas.com/rnd/migration/index.html
This message is contained in the SAS news file, and is presented upon
initialization. Edit the file "news" in the "misc/base" directory to
display site-specific news and information in the program log.
The command line option "-nonews" will prevent this display.
NOTE: SAS initialization used:
      real time           0.11 seconds
      cpu time            0.10 seconds
* MMA13P2BAYES.SAS March 2005 for SAS version 8.2

********** OVERVIEW OF MMA13P2BAYES.SAS **********

* SAS Program
* copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi
* used for "Microeconometrics: Methods and Applications"
* by A. Colin Cameron and Pravin K. Trivedi (2005)
* Cambridge University Press
* Chapter 13.6 p.452-4
* MCMC Example: Gibbs Sampler for 2 equation SUR
* Program creates the first column of Table 13.3
* (though differs somewhat due to use of different seed)
* For different columns of Table 13.3 change
* nobs = Sample size N (1000 or 10000)
* replics = Gibbs sample replications (50000 or 100000)
* tau = 1, 10 or 0.1
* This program does first column: tau=10, nobs=1000, replics=50000
* Note that the program does not exactly replicate Table 13.3
* Table 13.3 used the computer clock for seed,
* with third argument zero in rannor(j( , ,0))
* Here instead the seed is consecutively 10101, 20101, ... , 70101
* so third argument is eg rannor(j( , ,10101))
* to permit reproducibility by other users
* This program creates
* MMA13P2BAYES.1ST SAS Output with one column of Table 13.3
* MMA13P2BAYES.LOG SAS log file

* This program uses generated data - so no data set required
* This program uses a lot of memory - 1 gigabyte should do
* In Unix give command sas -MEMSIZE 1G mma13p2bayesgibbs.sas

*********************************************************************;
***** BIVARIATE NORMAL-BAYESIAN-ESTIMATION-BY-MCMC **************;
*********************************************************************;

OPTIONS LS=75;
options NOTES;

PROC IML;
NOTE: IML Ready
start main;

print "A. Colin Cameron and Pravin K. Trivedi (2005)";
print "Microeconometrics: Methods and Applications, CUP";
print "MCMC Example: Gibbs Sampler for SUR";

************* GENERATING DATA: 2 EQUATION SUR ****************;

nobs = 1000;
replics = 50000;
burn = 5000;
replics = replics + burn;

npar1 = 2;
npar2 = 2;

alpha1 ={1,1};
alpha2 ={1,1};

sigma = {1 -0.5,-0.5 1};
T = {0.15 2.18 0.725 0.45};
EPS = 1e-20;
IC = (1/2.506628275);

R1 = j(nobs,1,1)||rannor(j(nobs,1,10101));
R2 = j(nobs,1,1)||rannor(j(nobs,1,20101));
e = rannor(j(nobs,2,30101))*root(sigma);

e1 = e[,1];
e2 = e[,2];
Y1 = R1*alpha1 + e1;

Y2 = R2*alpha2 + e2;

************* SPECIFY PRIOR DISTRIBUTIONS ******************;

alpha01 = j(npar1,1,0);
alpha02 = j(npar2,1,0);
sigma = I(2);
p = 3;
df = 5;
tau = 10;
MUalpha = alpha01//alpha02;
OMalpha = tau*I(npar1+npar2);
OMphi = I(2);

************ ANALYSIS: GIBBS SAMPLING BEGINS HERE ***************;
do rep = 1 to replics;

************* GENERATE ALPHA1 ALPHA2 RHO *******************;
isigma = inv(sigma);

LL = ((isigma[1,1]*R1`*R1||isigma[1,2]*R1`*R2)//
(isigma[2,1]*R2`*R1||isigma[2,2]*R2`*R2));

LisigY = ((isigma[1,1]*R1`*Y1+isigma[1,2]*R1`*Y2)//
(isigma[2,1]*R2`*Y1+isigma[2,2]*R2`*Y2));

alpha = inv(inv(OMalpha)+ LL)*(LisigY + inv(OMalpha)*MUalpha)
+ root(inv(inv(OMalpha)+ LL))`*rannor(j(npar1+npar2,1,40101));

alpha1 = alpha[1:npar1];
alpha2 = alpha[npar1+1:npar1+npar2];

e1 = Y1 - R1*alpha1;
e2 = Y2 - R2*alpha2;

************* GENERATE SIGMA *******************;

mt = (sqrt((rannor(j(1,nobs+df,50101))##2)[,+])||0)//
(rannor(j(1,1,60101))||sqrt((rannor(j(1,nobs+df-1,70101))##2)[,+]));
mv = mt*mt`;
e=(e1||e2);
ms = e`*e+inv(OMphi);
ml = root(inv(ms))`;
mg = ml*mv*ml`;
sigma = inv(mg);

free mt mv e ml;

************* WRITE TO OUTPUT FILE IF AFTER BURN-IN **************;

if rep <= burn then goto point300;
sigma3 = sigma[1,1]||sigma[1,2]||sigma[2,2];
out1 = alpha1`||alpha2`||sigma3;

output1=output1//out1;

point300: end;

************* END OF GIBBS SAMPLING **************;

*********************************************************************;
***** RESULTS: COMPARE LAST HALF WITH ALL (AFTER BURN-IN) *******;
*********************************************************************;

replics = replics-burn;

out1 = output1[replics/2+1:replics,];
out = output1[1:replics,];

create exp from out1;
append from out1;
summary var _num_;
close exp;

create exp from out;
append from out;
summary var _num_;
close exp;

*********************************************************************;
****** RESULTS: POSTERIOR MEAN AND SD - TABLE 13.3 P.454 ********;
*********************************************************************;

xnames1 = {"CONSTANT"} || {"R1"};
xnames2 = {"CONSTANT"} || {"R2"};
parnames = concat({"d1"}," ",xnames1)||concat({"d2"}," ",xnames2)||{"SIGMA11"}||{"SIGMA12"}||{"SIGMA22"};

meanout = out[+,]/replics;
stderr = sqrt(((out-j(replics,1,1)*meanout)##2)[+,]/(replics-1));
parm = meanout;
stderr = stderr`;
tnpar = npar1 + npar2 + 3;

tstat = parm`/ stderr;
coeff = parm` || stderr || tstat;
info = tau // nobs // replics // burn // tnpar;
rowinfo={'TAU' '# OBSERVATIONS' '# REPLICATIONS' '# BURN-IN' '# PARAMETERS'};
estcol ={ 'ESTIMATE' 'STD ERR' 'T-STAT'};
mattrib info rowname=rowinfo label={" "};
mattrib coeff rowname=parnames colname=estcol label={" "};
print / "Results for Table 13.3 p.454";
print info;
print coeff;

*********************************************************************;
********** RESULTS: CONVERGENCE CHECK: SEE P.454 ***************;
*********************************************************************;

print / "Convergence check on p.454";

corr = j(20,7,0);

do i = 1 to 7;
cov = covlag(out[,i],20)`;
corr[,i] = cov/cov[1];
end;

covd1 = j(20,2,0);

do k = 1 to 3;
covd1 = corr[,2*k-1:2*k];
print covd1;
end;

covd1 = corr[,7];
print covd1;

finish main;
NOTE: Module MAIN defined.

run main;
NOTE: The data set WORK.EXP has 25000 observations and 7 variables.
NOTE: The data set WORK.EXP has 50000 observations and 7 variables.
NOTE: Exiting IML.
NOTE: 65925 workspace compresses.
NOTE: The PROCEDURE IML printed pages 1-6.
NOTE: PROCEDURE IML used (Total process time):
      real time           5:44.35
      cpu time            5:44.04
NOTE: SAS Institute Inc., SAS Campus Drive, Cary, NC USA 27513-2414
NOTE: The SAS System used:
      real time           5:45.48
      cpu time            5:45.15
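
The convergence check at the end of the program divides the lagged autocovariances of each retained parameter chain by the lag-0 value, which gives autocorrelations at lags 0 to 19. The Python sketch below illustrates that calculation on a toy chain. It is not a translation of the IML code: the function name `autocorrelations` and the toy data are hypothetical, and the sketch assumes mean-corrected autocovariances with divisor n.

```python
def autocorrelations(chain, nlags=20):
    """Return autocorrelations at lags 0..nlags-1 for a list of draws."""
    n = len(chain)
    mean = sum(chain) / n
    dev = [x - mean for x in chain]
    # Lag-k autocovariance with divisor n (an assumption of this sketch).
    cov = [sum(dev[t] * dev[t + k] for t in range(n - k)) / n
           for k in range(nlags)]
    # Normalize by the lag-0 autocovariance, as in corr[,i] = cov/cov[1].
    return [c / cov[0] for c in cov]


# Toy chain that alternates in sign, so successive draws are strongly
# negatively correlated (hypothetical data, not Gibbs output).
toy_chain = [1.0 if t % 2 == 0 else -1.0 for t in range(100)]
acf = autocorrelations(toy_chain, nlags=5)
print([round(r, 2) for r in acf])
```

For a well-mixing sampler the autocorrelations of each parameter chain should fall toward zero within a few lags; values that stay large at long lags suggest that more replications or thinning may be needed.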
------------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section4\mma14p1binary.txt
log type: text
opened on: 19 May 2005, 09:01:28
.
. ********** OVERVIEW OF MMA14P1BINARY.DO **********
.
. * STATA Program
.
. * Chapter 14.2 (pages 464-6) Logit and probit models.
. * Provides
. * (1) Table 14.1: Data summary
. * (2) Table 14.2: Logit, Probit and OLS slope estimates
. * (3) Figure 14.1: Plot of Logit Probit and OLS predicted probabilities
.
. * Nldata.asc
.
. ********** SETUP
.
. set more off
. version 8.0
.
.
. * Data Set comes from :
. * J. A. Herriges and C. L. Kling,
. * "Nonlinear Income Effects in Random Utility Models",
. * Review of Economics and Statistics, 81(1999): 62-72
.
. * The data are given as a combined observation with data on all 4 choices.
. * This will work for multinomial logit program.
. * For conditional logit will need to make a new data set which has
. * four separate entries for each observation as there are four alternatives.
.
. * Filename: NLDATA.ASC
. * Format: Ascii
. * Number of Observations: 1182
. * Each observation appears over 4 lines with 4 variables per line
. * so 4 x 1182 = 4728 lines in the file
. * Variable Number and Description
. * 1 Recreation mode choice. = 1 if beach, = 2 if pier; = 3 if private boat; = 4 if charter
. * 2 Price for chosen alternative
. * 3 Catch rate for chosen alternative
. * 4 = 1 if beach mode chosen; = 0 otherwise
. * 5 = 1 if pier mode chosen; = 0 otherwise
. * 6 = 1 if private boat mode chosen; = 0 otherwise
. * 7 = 1 if charter boat mode chosen; = 0 otherwise
. * 8 = price for beach mode
. * 9 = price for pier mode
. * 10 = price for private boat mode
. * 11 = price for charter boat mode
. * 12 = catch rate for beach mode
. * 13 = catch rate for pier mode
. * 14 = catch rate for private boat mode
. * 15 = catch rate for charter boat mode
. * 16 = monthly income
.
. ********** READ IN DATA **********
.
. infile mode price crate dbeach dpier dprivate dcharter pbeach ppier /*
> */ pprivate pcharter qbeach qpier qprivate qcharter income /*
> */ using nldata.asc
.
. * Divide income by 1000 so that results are easy to read
. gen ydiv1000 = income/1000
.
. label define modetype 1 "beach" 2 "pier" 3 "private" 4 "charter"
. label values mode modetype
. summarize
    Variable |        Obs       Mean  Std. Dev.        Min        Max
-------------+-------------------------------------------------------
        mode |       1182   3.005076   .9936162          1          4
       price |       1182   52.08197   53.82997       1.29     666.11
       crate |       1182   .3893684   .5605964      .0002     2.3101
      dbeach |       1182   .1133672   .3171753          0          1
       dpier |       1182   .1505922   .3578023          0          1
-------------+-------------------------------------------------------
    dprivate |       1182   .3536379   .4783008          0          1
    dcharter |       1182   .3824027   .4861799          0          1
      pbeach |       1182    103.422    103.641       1.29    843.186
       ppier |       1182    103.422    103.641       1.29    843.186
    pprivate |       1182   55.25657   62.71344       2.29     666.11
-------------+-------------------------------------------------------
    pcharter |       1182   84.37924   63.54465      27.29     691.11
      qbeach |       1182   .2410113   .1907524      .0678      .5333
       qpier |       1182   .1622237   .1603898      .0014      .4522
    qprivate |       1182   .1712146   .2097885      .0002      .7369
    qcharter |       1182   .6293679   .7061142      .0021     2.3101
-------------+-------------------------------------------------------
      income |       1182   4099.337   2461.964   416.6667      12500
    ydiv1000 |       1182   4.099337   2.461964   .4166667       12.5
.
. ********** CREATE BINARY DATA: CHARTER vs PIER **********
.
. * Binary logit of charter (mode = 4) versus pier (mode = 2)
. keep if mode == 2 | mode == 4
. * charter is 1 if fish from charter boat and 0 if fish from pier
. gen charter = 0
. replace charter = 1 if mode == 4
.
. gen pratio = 100*ln(pcharter/ppier)
. gen lnrelp = ln(pchart/ppier)
.
. * Overall summary
. summarize
    Variable |        Obs       Mean  Std. Dev.        Min        Max
-------------+-------------------------------------------------------
        mode |        630   3.434921   .9011843          2          4
       price |        630   62.51669   52.31219       1.29    387.208
       crate |        630   .5533478   .6953035      .0014     2.3101
      dbeach |        630          0          0          0          0
       dpier |        630   .2825397   .4505921          0          1
-------------+-------------------------------------------------------
    dprivate |        630          0          0          0          0
    dcharter |        630   .7174603   .4505921          0          1
      pbeach |        630   95.19802   95.62037       1.29    578.048
       ppier |        630   95.19802   95.62037       1.29    578.048
    pprivate |        630   55.26221   59.99482       2.29    494.058
-------------+-------------------------------------------------------
    pcharter |        630   84.89158   60.79327      27.29    529.058
      qbeach |        630   .2546022   .1983357      .0678      .5333
       qpier |        630   .1716835   .1687288      .0014      .4522
    qprivate |        630   .1695303   .2033172      .0014      .7369
    qcharter |        630   .6368509    .688508      .0029     2.3101
-------------+-------------------------------------------------------
      income |        630   3741.402    2145.71   416.6667      12500
    ydiv1000 |        630   3.741402    2.14571   .4166667       12.5
     charter |        630   .7174603   .4505921          0          1
      pratio |        630   27.45581   126.2598  -215.3976   406.2712
      lnrelp |        630   .2745581   1.262598  -2.153976   4.062713
. * Summary by charter or by pier
. sort mode
. by mode: summarize
---------------------------------------------------------------------
-> mode = pier

    Variable |        Obs       Mean  Std. Dev.        Min        Max
-------------+-------------------------------------------------------
        mode |        178          2          0          2          2
       price |        178   30.57133   35.58442       1.29    224.296
       crate |        178   .2025348   .1702942      .0014      .4522
      dbeach |        178          0          0          0          0
       dpier |        178          1          0          1          1
-------------+-------------------------------------------------------
    dprivate |        178          0          0          0          0
    dcharter |        178          0          0          0          0
      pbeach |        178   30.57133   35.58442       1.29    224.296
       ppier |        178   30.57133   35.58442       1.29    224.296
    pprivate |        178   82.42908   69.30802       2.29    494.058
-------------+-------------------------------------------------------
    pcharter |        178   109.7633   72.37726      27.29    529.058
      qbeach |        178   .2614444   .1949684      .0678      .5333
       qpier |        178   .2025348   .1702942      .0014      .4522
    qprivate |        178   .1501489   .0968393      .0014      .2601
    qcharter |        178   .4980798   .3756255      .0029     1.0266
-------------+-------------------------------------------------------
      income |        178   3387.172   2340.324   416.6667      12500
    ydiv1000 |        178   3.387172   2.340324   .4166667       12.5
     charter |        178          0          0          0          0
      pratio |        178   164.2956   104.3052  -79.13918   406.2712
      lnrelp |        178   1.642956   1.043052  -.7913917   4.062713

---------------------------------------------------------------------
-> mode = charter

    Variable |        Obs       Mean  Std. Dev.        Min        Max
-------------+-------------------------------------------------------
        mode |        452          4          0          4          4
       price |        452   75.09694   52.51942      27.29    387.208
       crate |        452   .6914998   .7714728      .0029     2.3101
      dbeach |        452          0          0          0          0
       dpier |        452          0          0          0          0
-------------+-------------------------------------------------------
    dprivate |        452          0          0          0          0
    dcharter |        452          1          0          1          1
      pbeach |        452   120.6483   99.78664       4.29    578.048
       ppier |        452   120.6483   99.78664       4.29    578.048
    pprivate |        452   44.56376   52.23744       2.29    362.208
-------------+-------------------------------------------------------
    pcharter |        452   75.09694   52.51942      27.29    387.208
      qbeach |        452   .2519077   .1997956      .0678      .5333
       qpier |        452   .1595341   .1667353      .0014      .4522
    qprivate |        452   .1771628   .2318749      .0014      .7369
    qcharter |        452   .6914998   .7714728      .0029     2.3101
-------------+-------------------------------------------------------
      income |        452     3880.9   2050.028   416.6667      12500
    ydiv1000 |        452     3.8809   2.050028   .4166667       12.5
     charter |        452          1          0          1          1
      pratio |        452  -26.43243   87.53686  -215.3976   235.8242
      lnrelp |        452  -.2643243   .8753686  -2.153976   2.358242
.
. * Write final data to a text (ascii) file so can use with programs other than Stata
. outfile charter lnrelp using mma14p1binary.asc, replace
.
. ********** TABLE 14.1 - DATA SUMMARY BY OUTCOME AND OVERALL **********
.
. * Following gives Table 14.1 page 464
. summarize charter pcharter ppier lnrelp
    Variable |        Obs       Mean  Std. Dev.        Min        Max
-------------+-------------------------------------------------------
     charter |        630   .7174603   .4505921          0          1
    pcharter |        630   84.89158   60.79327      27.29    529.058
       ppier |        630   95.19802   95.62037       1.29    578.048
      lnrelp |        630   .2745581   1.262598  -2.153976   4.062713
. sort mode
. by mode: summarize charter pcharter ppier lnrelp
---------------------------------------------------------------------
-> mode = pier

    Variable |        Obs       Mean  Std. Dev.        Min        Max
-------------+-------------------------------------------------------
     charter |        178          0          0          0          0
    pcharter |        178   109.7633   72.37726      27.29    529.058
       ppier |        178   30.57133   35.58442       1.29    224.296
      lnrelp |        178   1.642956   1.043052  -.7913917   4.062713

---------------------------------------------------------------------
-> mode = charter

    Variable |        Obs       Mean  Std. Dev.        Min        Max
-------------+-------------------------------------------------------
     charter |        452          1          0          1          1
    pcharter |        452   75.09694   52.51942      27.29    387.208
       ppier |        452   120.6483   99.78664       4.29    578.048
      lnrelp |        452  -.2643243   .8753686  -2.153976   2.358242
.
. ********** TABLE 14.2 - ESTIMATE LOGIT, PROBIT AND OLS MODELS
.
. logit charter lnrelp
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:

Logit estimates                                   Number of obs   =        630
                                                  LR chi2(1)      =     336.47
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.4486
------------------------------------------------------------------------------
     charter |      Coef.  Std. Err.        z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnrelp |   -1.82253   .1445681   -12.61    0.000    -2.105879   -1.539182
       _cons |   2.053125   .1689307    12.15    0.000     1.722027    2.384223
------------------------------------------------------------------------------
. estimates store blogit
.
. probit charter lnrelp
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:

Probit estimates                                  Number of obs   =        630
                                                  LR chi2(1)      =     341.30
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.4550
------------------------------------------------------------------------------
     charter |      Coef.  Std. Err.        z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnrelp |  -1.055515   .0761117   -13.87    0.000    -1.204691   -.9063383
       _cons |    1.19436    .089504    13.34    0.000     1.018936    1.369785
------------------------------------------------------------------------------
. estimates store bprobit
.
. regress charter lnrelp
      Source |       SS       df       MS              Number of obs =     630
-------------+------------------------------           F(  1,   628) =  542.12
       Model |  59.1676598     1  59.1676598           Prob > F      =  0.0000
    Residual |  68.5402767   628  .109140568           R-squared     =  0.4633
-------------+------------------------------           Adj R-squared =  0.4624
       Total |  127.707937   629  .203033285           Root MSE      =  .33036
------------------------------------------------------------------------------
     charter |      Coef.  Std. Err.        t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnrelp |  -.2429137   .0104328   -23.28    0.000    -.2634011   -.2224262
       _cons |   .7841542   .0134701    58.21    0.000     .7577023    .8106061
------------------------------------------------------------------------------
. estimates store bOLS
.
. * Heteroskedastic robust standard errors only needed for OLS
. * but given for other models for completeness
.
. logit charter lnrelp, robust
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:

Logit estimates                                   Number of obs   =        630
                                                  Wald chi2(1)    =     194.28
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.4486
------------------------------------------------------------------------------
             |              Robust
     charter |      Coef.  Std. Err.        z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnrelp |   -1.82253   .1307556   -13.94    0.000    -2.078807   -1.566254
       _cons |   2.053125   .1473477    13.93    0.000     1.764329    2.341921
------------------------------------------------------------------------------
. estimates store bloghet
.
. probit charter lnrelp, robust
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:

Probit estimates                                  Number of obs   =        630
                                                  Wald chi2(1)    =     232.07
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.4550
------------------------------------------------------------------------------
             |              Robust
     charter |      Coef.  Std. Err.        z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnrelp |  -1.055515   .0692881   -15.23    0.000    -1.191317   -.9197122
       _cons |    1.19436   .0794429    15.03    0.000     1.038655    1.350066
------------------------------------------------------------------------------
. estimates store bprobhet
.
. regress charter lnrelp, robust
                                                       Number of obs =     630
                                                       F(  1,   628) =  792.44
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.4633
                                                       Root MSE      =  .33036
------------------------------------------------------------------------------
             |              Robust
     charter |      Coef.  Std. Err.        t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnrelp |  -.2429137   .0086292   -28.15    0.000    -.2598592   -.2259681
       _cons |   .7841542   .0119566    65.58    0.000     .7606744    .8076341
------------------------------------------------------------------------------
. estimates store bOLShet
.
. * Following gives Table 14.2 page 465
. estimates table blogit bprobit bOLS bloghet bprobhet bOLShet, /*
> */ t stats(N ll r2 r2_p) b(%8.3f) keep(_cons lnrelp)
--------------------------------------------------------------------------
    Variable |    blogit   bprobit      bOLS   bloghet  bprobhet   bOLShet
-------------+------------------------------------------------------------
       _cons |     2.053     1.194     0.784     2.053     1.194     0.784
             |     12.15     13.34     58.21     13.93     15.03     65.58
      lnrelp |    -1.823    -1.056    -0.243    -1.823    -1.056    -0.243
             |    -12.61    -13.87    -23.28    -13.94    -15.23    -28.15
-------------+------------------------------------------------------------
           N |   630.000   630.000   630.000   630.000   630.000   630.000
          ll |  -206.827  -204.411  -195.167  -206.827  -204.411  -195.167
          r2 |                         0.463                         0.463
        r2_p |     0.449     0.455               0.449     0.455
--------------------------------------------------------------------------
                                                               legend: b/t
.
. ********** FIGURE 14.1 - PLOT PREDICTED PROBABILITY AGAINST X FOR MODELS
.
. quietly logit charter lnrelp
. predict plogit, p
.
. quietly probit charter lnrelp
. predict pprobit, p
.
. quietly regress charter lnrelp
. predict pOLS
.
. sum charter plogit pprobit pOLS
    Variable |        Obs       Mean  Std. Dev.        Min        Max
-------------+-------------------------------------------------------
     charter |        630   .7174603   .4505921          0          1
      plogit |        630   .7174603   .3193077   .0047196   .9974746
     pprobit |        630     .72019   .3196164   .0009877   .9997377
        pOLS |        630   .7174603   .3067022  -.2027341   1.307384
.
. sort lnrelp
.
. * Following gives Figure 14.1 page 466
. graph twoway (scatter charter lnrelp, msize(vsmall) jitter(3)) /*
> */ (line plogit lnrelp, clstyle(p1)) /*
> */ (line pprobit lnrelp, clstyle(p2)) /*
> */ (line pOLS lnrelp, clstyle(p3)), /*
> */ title("Predicted Probabilities Across Models") /*
> */ xtitle("Log relative price (lnrelp)", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Predicted probability", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend( label(1 "Actual Data (jittered)") label(2 "Logit") /*
> */ label(3 "Probit") label(4 "OLS"))
. graph export ch14binary.wmf, replace
(file c:\Imbook\bwebpage\Section4\ch14binary.wmf written in Windows Metafile format)
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section4\mma14p1binary.txt
log type: text
closed on: 19 May 2005, 09:01:31
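
The predicted probabilities summarized in the log can be reproduced by hand from the reported coefficients, which also shows why the OLS (linear probability) predictions leave the unit interval while the logit and probit predictions do not. The Python sketch below is an illustration under that reading; the function names are hypothetical, and the coefficients and lnrelp values are copied from the output above.

```python
import math

# (constant, slope on lnrelp) copied from the logit, probit and OLS output.
LOGIT = (2.053125, -1.82253)
PROBIT = (1.19436, -1.055515)
OLS = (0.7841542, -0.2429137)


def p_logit(x):
    """Logistic cdf evaluated at the fitted index."""
    a, b = LOGIT
    return 1.0 / (1.0 + math.exp(-(a + b * x)))


def p_probit(x):
    """Standard normal cdf evaluated at the fitted index."""
    a, b = PROBIT
    z = a + b * x
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_ols(x):
    """Linear prediction, which is not constrained to lie in [0, 1]."""
    a, b = OLS
    return a + b * x


# Smallest, mean and largest lnrelp in the 630-observation sample.
for x in (-2.153976, 0.2745581, 4.062713):
    print(x, round(p_logit(x), 4), round(p_probit(x), 4), round(p_ols(x), 4))
```

At the two extremes of lnrelp the linear prediction falls below 0 and above 1, which matches the Min and Max reported for pOLS.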
------------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section4\mma15p1mnl.txt
log type: text
opened on: 19 May 2005, 12:16:20
.
. ********** OVERVIEW OF MMA15P1MNL.DO **********
.
. * STATA Program
.
. * Chapter 15.2.1-3 pages 491-5
. * Multinomial and conditional logit models analysis.
. * It provides ....
. * (0) Data summary (Table 15.1)
. * (1A) Multinomial Logit estimates (Table 15.2)
. * (1B) Multinomial Logit marginal effects (text page 494)
. * (2A) Conditional Logit estimates (Table 15.2)
. * (2B) Conditional Logit marginal effects (Table 15.3)
. * (3) Multinomial estimates obtained using Conditional Logit
. * (4) "Mixed Model" estimates (Table 15.1)
.
. * Related programs are
. * mma15p2gev.do estimates a nested logit model using Stata
. * mma15p3mnl.lim estimates multinomial models using Limdep
. * mma15p4gev.lim estimates conditional and nested logit models using Limdep
.
. * Nldata.asc
.
. /* Program summary:
>
> (1) Multinomial logit of mode on alternative-invariant regressor (income)
>       mlogit mode income
>
> (2) Conditional logit of mode on alternative-specific regressor (price, catch rate)
>       First reshape data so 4 observations per individual - one for each mode.
>       clogit mode p q
>
> (3) Conditional logit of mode on alternative-invariant regressor (income)
>       Then create dummy variables for each mode d2 d3 d4
>       clogit mode d2 d3 d4 d2y d3y d4y
>       This gives same results as (1)
>
> (4) Conditional logit of mode on alternative-invariant regressor (income)
>       and on alternative-specific regressor (price, catch rate)
>       Then create dummy variables for each mode d2 d3 d4
>       clogit mode d2 d3 d4 d2y d3y d4y p q
> */
.
. ********** SETUP **********
.
. set more off
. version 8.0
.
.
.
.
. * Format: Ascii
.
. ********** READ IN DATA and SUMMARIZE (Table 15.1, p.492) **********
.
. * Method to read in depends on model used
.
. /* Data are on fishing mode: 1 beach, 2 pier, 3 private boat, 4 charter
> Data come as one observation having data for all 4 modes.
> Both alternative specific and alternative invariant regressors.
> */
.
.
.
. * Look at data by alternative
. label define modetype 1 "beach" 2 "pier" 3 "private" 4 "charter"
. label values mode modetype
.
. summarize
    Variable |        Obs       Mean  Std. Dev.        Min        Max
-------------+-------------------------------------------------------
        mode |       1182   3.005076   .9936162          1          4
       price |       1182   52.08197   53.82997       1.29     666.11
       crate |       1182   .3893684   .5605964      .0002     2.3101
      dbeach |       1182   .1133672   .3171753          0          1
       dpier |       1182   .1505922   .3578023          0          1
-------------+-------------------------------------------------------
    dprivate |       1182   .3536379   .4783008          0          1
    dcharter |       1182   .3824027   .4861799          0          1
      pbeach |       1182    103.422    103.641       1.29    843.186
       ppier |       1182    103.422    103.641       1.29    843.186
    pprivate |       1182   55.25657   62.71344       2.29     666.11
-------------+-------------------------------------------------------
    pcharter |       1182   84.37924   63.54465      27.29     691.11
      qbeach |       1182   .2410113   .1907524      .0678      .5333
       qpier |       1182   .1622237   .1603898      .0014      .4522
    qprivate |       1182   .1712146   .2097885      .0002      .7369
    qcharter |       1182   .6293679   .7061142      .0021     2.3101
-------------+-------------------------------------------------------
      income |       1182   4099.337   2461.964   416.6667      12500
    ydiv1000 |       1182   4.099337   2.461964   .4166667       12.5
. sort mode
. by mode: summarize
---------------------------------------------------------------------
-> mode = beach

    Variable |        Obs       Mean  Std. Dev.        Min        Max
-------------+-------------------------------------------------------
        mode |        134          1          0          1          1
       price |        134   35.69949   43.09414       1.29     306.82
       crate |        134   .2791948   .1938734      .0678      .5333
      dbeach |        134          1          0          1          1
       dpier |        134          0          0          0          0
-------------+-------------------------------------------------------
    dprivate |        134          0          0          0          0
    dcharter |        134          0          0          0          0
      pbeach |        134   35.69949   43.09414       1.29     306.82
       ppier |        134   35.69949   43.09414       1.29     306.82
    pprivate |        134   97.80913   75.43844       2.29    392.946
-------------+-------------------------------------------------------
    pcharter |        134   125.0032   78.37641      27.29    427.946
      qbeach |        134   .2791948   .1938734      .0678      .5333
       qpier |        134   .2190015   .1677117      .0025      .4522
    qprivate |        134   .1593985   .0948855      .0008      .2601
    qcharter |        134   .5176089   .3629096      .0027     1.0266
-------------+-------------------------------------------------------
      income |        134   4051.617    2505.42   416.6667      12500
    ydiv1000 |        134   4.051617    2.50542   .4166667       12.5

---------------------------------------------------------------------
-> mode = pier

    Variable |        Obs       Mean  Std. Dev.        Min        Max
-------------+-------------------------------------------------------
        mode |        178          2          0          2          2
       price |        178   30.57133   35.58442       1.29    224.296
       crate |        178   .2025348   .1702942      .0014      .4522
      dbeach |        178          0          0          0          0
       dpier |        178          1          0          1          1
-------------+-------------------------------------------------------
    dprivate |        178          0          0          0          0
    dcharter |        178          0          0          0          0
      pbeach |        178   30.57133   35.58442       1.29    224.296
       ppier |        178   30.57133   35.58442       1.29    224.296
    pprivate |        178   82.42908   69.30802       2.29    494.058
-------------+-------------------------------------------------------
    pcharter |        178   109.7633   72.37726      27.29    529.058
      qbeach |        178   .2614444   .1949684      .0678      .5333
       qpier |        178   .2025348   .1702942      .0014      .4522
    qprivate |        178   .1501489   .0968393      .0014      .2601
    qcharter |        178   .4980798   .3756255      .0029     1.0266
-------------+-------------------------------------------------------
      income |        178   3387.172   2340.324   416.6667      12500
    ydiv1000 |        178   3.387172   2.340324   .4166667       12.5

---------------------------------------------------------------------
-> mode = private

    Variable |        Obs       Mean  Std. Dev.        Min        Max
-------------+-------------------------------------------------------
        mode |        418          3          0          3          3
       price |        418   41.60681   55.90806       2.29     666.11
       crate |        418   .1775411   .2435798      .0002      .7369
      dbeach |        418          0          0          0          0
       dpier |        418          0          0          0          0
-------------+-------------------------------------------------------
    dprivate |        418          1          0          1          1
    dcharter |        418          0          0          0          0
      pbeach |        418   137.5271   115.3058       2.29    843.186
       ppier |        418   137.5271   115.3058       2.29    843.186
    pprivate |        418   41.60681   55.90806       2.29     666.11
-------------+-------------------------------------------------------
    pcharter |        418   70.58409   56.39575      27.29     691.11
      qbeach |        418   .2082868   .1729351      .0678      .5333
       qpier |        418   .1297646   .1368029      .0025      .4522
    qprivate |        418   .1775411   .2435798      .0002      .7369
    qcharter |        418   .6539167   .8064379      .0021     2.3101
-------------+-------------------------------------------------------
      income |        418   4654.107   2777.898   416.6667      12500
    ydiv1000 |        418   4.654107   2.777898   .4166667       12.5

---------------------------------------------------------------------
-> mode = charter

    Variable |        Obs       Mean  Std. Dev.        Min        Max
-------------+-------------------------------------------------------
        mode |        452          4          0          4          4
       price |        452   75.09694   52.51942      27.29    387.208
       crate |        452   .6914998   .7714728      .0029     2.3101
      dbeach |        452          0          0          0          0
       dpier |        452          0          0          0          0
-------------+-------------------------------------------------------
    dprivate |        452          0          0          0          0
    dcharter |        452          1          0          1          1
      pbeach |        452   120.6483   99.78664       4.29    578.048
       ppier |        452   120.6483   99.78664       4.29    578.048
    pprivate |        452   44.56376   52.23744       2.29    362.208
-------------+-------------------------------------------------------
    pcharter |        452   75.09694   52.51942      27.29    387.208
      qbeach |        452   .2519077   .1997956      .0678      .5333
       qpier |        452   .1595341   .1667353      .0014      .4522
    qprivate |        452   .1771628   .2318749      .0014      .7369
    qcharter |        452   .6914998   .7714728      .0029     2.3101
-------------+-------------------------------------------------------
      income |        452     3880.9   2050.028   416.6667      12500
    ydiv1000 |        452     3.8809   2.050028   .4166667       12.5
.
. * Following commands give Table 15.1, p.492
. summarize ydiv100 pbeach ppier pprivate pcharter qbeach qpier /*
> */ qprivate qcharter dbeach dpier dprivate dcharter
    Variable |        Obs       Mean  Std. Dev.        Min        Max
-------------+-------------------------------------------------------
    ydiv1000 |       1182   4.099337   2.461964   .4166667       12.5
      pbeach |       1182    103.422    103.641       1.29    843.186
       ppier |       1182    103.422    103.641       1.29    843.186
    pprivate |       1182   55.25657   62.71344       2.29     666.11
    pcharter |       1182   84.37924   63.54465      27.29     691.11
-------------+-------------------------------------------------------
      qbeach |       1182   .2410113   .1907524      .0678      .5333
       qpier |       1182   .1622237   .1603898      .0014      .4522
    qprivate |       1182   .1712146   .2097885      .0002      .7369
    qcharter |       1182   .6293679   .7061142      .0021     2.3101
      dbeach |       1182   .1133672   .3171753          0          1
-------------+-------------------------------------------------------
       dpier |       1182   .1505922   .3578023          0          1
    dprivate |       1182   .3536379   .4783008          0          1
    dcharter |       1182   .3824027   .4861799          0          1
. sort mode
. by mode: summarize ydiv100 pbeach ppier pprivate pcharter qbeach qpier /*
> */ qprivate qcharter dbeach dpier dprivate dcharter
---------------------------------------------------------------------
-> mode = beach

    Variable |        Obs       Mean  Std. Dev.        Min        Max
-------------+-------------------------------------------------------
    ydiv1000 |        134   4.051617    2.50542   .4166667       12.5
      pbeach |        134   35.69949   43.09414       1.29     306.82
       ppier |        134   35.69949   43.09414       1.29     306.82
    pprivate |        134   97.80913   75.43844       2.29    392.946
    pcharter |        134   125.0032   78.37641      27.29    427.946
-------------+-------------------------------------------------------
      qbeach |        134   .2791948   .1938734      .0678      .5333
       qpier |        134   .2190015   .1677117      .0025      .4522
    qprivate |        134   .1593985   .0948855      .0008      .2601
    qcharter |        134   .5176089   .3629096      .0027     1.0266
      dbeach |        134          1          0          1          1
-------------+-------------------------------------------------------
       dpier |        134          0          0          0          0
    dprivate |        134          0          0          0          0
    dcharter |        134          0          0          0          0

---------------------------------------------------------------------
-> mode = pier

    Variable |        Obs       Mean  Std. Dev.        Min        Max
-------------+-------------------------------------------------------
    ydiv1000 |        178   3.387172   2.340324   .4166667       12.5
      pbeach |        178   30.57133   35.58442       1.29    224.296
       ppier |        178   30.57133   35.58442       1.29    224.296
    pprivate |        178   82.42908   69.30802       2.29    494.058
    pcharter |        178   109.7633   72.37726      27.29    529.058
-------------+-------------------------------------------------------
      qbeach |        178   .2614444   .1949684      .0678      .5333
       qpier |        178   .2025348   .1702942      .0014      .4522
    qprivate |        178   .1501489   .0968393      .0014      .2601
    qcharter |        178   .4980798   .3756255      .0029     1.0266
      dbeach |        178          0          0          0          0
-------------+-------------------------------------------------------
       dpier |        178          1          0          1          1
    dprivate |        178          0          0          0          0
    dcharter |        178          0          0          0          0

---------------------------------------------------------------------
-> mode = private

    Variable |        Obs       Mean  Std. Dev.        Min        Max
-------------+-------------------------------------------------------
    ydiv1000 |        418   4.654107   2.777898   .4166667       12.5
      pbeach |        418   137.5271   115.3058       2.29    843.186
       ppier |        418   137.5271   115.3058       2.29    843.186
    pprivate |        418   41.60681   55.90806       2.29     666.11
    pcharter |        418   70.58409   56.39575      27.29     691.11
-------------+-------------------------------------------------------
      qbeach |        418   .2082868   .1729351      .0678      .5333
       qpier |        418   .1297646   .1368029      .0025      .4522
    qprivate |        418   .1775411   .2435798      .0002      .7369
    qcharter |        418   .6539167   .8064379      .0021     2.3101
      dbeach |        418          0          0          0          0
-------------+-------------------------------------------------------
       dpier |        418          0          0          0          0
    dprivate |        418          1          0          1          1
    dcharter |        418          0          0          0          0

---------------------------------------------------------------------
-> mode = charter

    Variable |        Obs       Mean  Std. Dev.        Min        Max
-------------+-------------------------------------------------------
    ydiv1000 |        452     3.8809   2.050028   .4166667       12.5
      pbeach |        452   120.6483   99.78664       4.29    578.048
       ppier |        452   120.6483   99.78664       4.29    578.048
    pprivate |        452   44.56376   52.23744       2.29    362.208
    pcharter |        452   75.09694   52.51942      27.29    387.208
-------------+-------------------------------------------------------
      qbeach |        452   .2519077   .1997956      .0678      .5333
       qpier |        452   .1595341   .1667353      .0014      .4522
    qprivate |        452   .1771628   .2318749      .0014      .7369
    qcharter |        452   .6914998   .7714728      .0029     2.3101
      dbeach |        452          0          0          0          0
-------------+-------------------------------------------------------
       dpier |        452          0          0          0          0
    dprivate |        452          0          0          0          0
    dcharter |        452          1          0          1          1
.
. ********** (1) MULTINOMIAL LOGIT: ALTERNATIVE-INVARIANT REGRESSOR *********
.
. *** (1A) Estimate the model
.
. * Data are already in form for mlogit
.
. * The following gives MNL column of Table 15.2, p.493
. mlogit mode ydiv1000, basecategory(1)
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:

Multinomial logistic regression                   Number of obs   =       1182
                                                  LR chi2(3)      =      41.14
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.0137
------------------------------------------------------------------------------
        mode |      Coef.  Std. Err.        z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
pier         |
    ydiv1000 |  -.1434029   .0532882    -2.69    0.007    -.2478459     -.03896
       _cons |   .8141503   .2286316     3.56    0.000     .3660405     1.26226
-------------+----------------------------------------------------------------
private      |
    ydiv1000 |   .0919064   .0406638     2.26    0.024     .0122069    .1716059
       _cons |   .7389208   .1967309     3.76    0.000     .3533352    1.124506
-------------+----------------------------------------------------------------
charter      |
    ydiv1000 |  -.0316399   .0418463    -0.76    0.450    -.1136571    .0503774
       _cons |   1.341291   .1945167     6.90    0.000     .9600457    1.722537
------------------------------------------------------------------------------
(Outcome mode==beach is the comparison group)
267
.
. *** (1B) Calculate the marginal effects
.
. quietly mlogit mode ydiv1000, basecategory(1)
. * Predict by default gives the probabilities
. predict p1 p2 p3 p4
(option p assumed; predicted probabilities)
.
. * As check compare predicted to actual probabilities
. summarize dbeach p1 dpier p2 dprivate p3 dcharter p4
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      dbeach |      1182    .1133672    .3171753          0          1
          p1 |      1182    .1133672    .0036716   .0947395   .1153659
       dpier |      1182    .1505922    .3578023          0          1
          p2 |      1182    .1505922    .0444575   .0356142   .2342903
    dprivate |      1182    .3536379    .4783008          0          1
-------------+--------------------------------------------------------
          p3 |      1182    .3536379    .0797714   .2396973    .625706
    dcharter |      1182    .3824027    .4861799          0          1
          p4 |      1182    .3824027    .0346281   .2439403   .4158273
.
. * Quick way to compute marginal effects (or semi-elasticities dp/dlnx or elasticities)
. * is to use built-in Stata function which evaluates at sample mean
. * dydx, eyex, dyex or eydx
. mfx compute, dydx predict(outcome(1))
Marginal effects after mlogit
      y  = Pr(mode==1) (predict, outcome(1))
         =  .11541492
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
ydiv1000 |    .000075      .00393    0.02   0.985  -.007635  .007785   4.09934
------------------------------------------------------------------------------
. mfx compute, dydx predict(outcome(2))
Marginal effects after mlogit
      y  = Pr(mode==2) (predict, outcome(2))
         =  .14472379
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
ydiv1000 |  -.0206598      .00487   -4.24   0.000  -.030212 -.011108   4.09934
------------------------------------------------------------------------------
. mfx compute, dydx predict(outcome(3))
Marginal effects after mlogit
      y  = Pr(mode==3) (predict, outcome(3))
         =  .35220366
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
ydiv1000 |   .0325985      .00569    5.73   0.000   .021442  .043755   4.09934
------------------------------------------------------------------------------
. mfx compute, dydx predict(outcome(4))
Marginal effects after mlogit
      y  = Pr(mode==4) (predict, outcome(4))
         =  .38765763
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
ydiv1000 |  -.0120137      .00608   -1.98   0.048  -.023922 -.000106   4.09934
------------------------------------------------------------------------------
.
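As a cross-check outside Stata, the mfx values at the sample mean can be reproduced by hand from the mlogit coefficients. The sketch below is illustrative only: it assumes the standard multinomial logit formulas p_j = exp(a_j + b_j*y) / sum_k exp(a_k + b_k*y) and dp_j/dy = p_j*(b_j - sum_k p_k*b_k), with the coefficients typed in from the log above and y set to the mean 4.09934 shown in the X column. The function and variable names are ours, not part of the original program.

```python
import math

# Coefficients copied from the mlogit output (beach is the base category,
# so its constant and slope are normalized to zero).
COEF = {
    "beach":   (0.0, 0.0),
    "pier":    (0.8141503, -0.1434029),
    "private": (0.7389208, 0.0919064),
    "charter": (1.341291, -0.0316399),
}


def mnl_probs(y):
    """Multinomial logit choice probabilities at income y (ydiv1000 units)."""
    expv = {m: math.exp(a + b * y) for m, (a, b) in COEF.items()}
    total = sum(expv.values())
    return {m: v / total for m, v in expv.items()}


def mnl_marginal_effects(y):
    """Analytic marginal effects dp_j/dy = p_j * (b_j - sum_k p_k * b_k)."""
    p = mnl_probs(y)
    b_bar = sum(p[m] * COEF[m][1] for m in COEF)
    return {m: p[m] * (COEF[m][1] - b_bar) for m in COEF}


y_mean = 4.09934  # value in the X column of the mfx output
probs = mnl_probs(y_mean)
effects = mnl_marginal_effects(y_mean)
for m in COEF:
    print(f"{m:8s} p = {probs[m]:.6f}   dp/dy = {effects[m]: .6f}")
```

The four marginal effects sum to zero by construction, since the probabilities always sum to one.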
. * Better is to evaluate marginal effect for each observation and average
. * The following calculates marginal effects using noncalculus methods
. * by comparing the predicted probability before and after change in x
. * Here consider small change of 0.0001 - then multiply by 10000
. * So should be similar to using calculus methods.
. replace ydiv1000 = ydiv1000 + 0.0001
. predict p1new p2new p3new p4new
(option p assumed; predicted probabilities)
. gen dp1dy = 10000*(p1new - p1)
. gen dp2dy = 10000*(p2new - p2)
. gen dp3dy = 10000*(p3new - p3)
. gen dp4dy = 10000*(p4new - p4)
.
. * The computed marginal effects follow.
. * These are close to those given in text page 494 (which were calculated using Limdep)
. sum dp1dy dp2dy dp3dy dp4dy
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       dp1dy |      1182    .0001549    .0015919  -.0042468   .0027567
       dp2dy |      1182   -.0207849    .0046004  -.0278652  -.0067055
       dp3dy |      1182    .0318045    .0014852   .0280142   .0336766
       dp4dy |      1182   -.0111929    .0041308  -.0190735  -.0026822
.
. * Note that here these are similar to the earlier values at means
. * This is because little variation in predicted probability across individuals here
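The noncalculus approach used above (perturb income by h = 0.0001, difference the predicted probabilities, rescale by 1/h = 10000) can be compared with the analytic derivative in a few lines. This is a minimal sketch under stated assumptions: the coefficients are those of the mlogit run, the three evaluation points are the minimum, mean and maximum of ydiv1000 from the summary tables, and all names are ours.

```python
import math

# mlogit coefficients (constant, income slope); beach is the base category.
COEF = {
    "beach":   (0.0, 0.0),
    "pier":    (0.8141503, -0.1434029),
    "private": (0.7389208, 0.0919064),
    "charter": (1.341291, -0.0316399),
}


def probs(y):
    expv = {m: math.exp(a + b * y) for m, (a, b) in COEF.items()}
    total = sum(expv.values())
    return {m: v / total for m, v in expv.items()}


def analytic(y):
    """dp_j/dy = p_j * (b_j - sum_k p_k * b_k)."""
    p = probs(y)
    b_bar = sum(p[m] * COEF[m][1] for m in COEF)
    return {m: p[m] * (COEF[m][1] - b_bar) for m in COEF}


def finite_difference(y, h=0.0001):
    """Change in predicted probability after adding h, rescaled by 1/h."""
    before, after = probs(y), probs(y + h)
    return {m: (after[m] - before[m]) / h for m in COEF}


max_gap = 0.0
for y in (0.4166667, 4.099337, 12.5):  # min, mean, max of ydiv1000
    fd, an = finite_difference(y), analytic(y)
    for m in COEF:
        max_gap = max(max_gap, abs(fd[m] - an[m]))
        print(f"y={y:9.4f} {m:8s} finite diff {fd[m]: .7f}  analytic {an[m]: .7f}")
print("largest gap:", max_gap)
```

Python floats are double precision, so the two columns agree closely here. With a step this small, results computed from variables stored in single precision (Stata's default float type for new variables) may show some extra rounding noise.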
.
. * ASIDE: Binary logit will differ a little from MNL
. keep if mode == 1 | mode == 2
. mlogit mode ydiv1000
Multinomial logistic regression                   Number of obs   =        312
                                                  LR chi2(1)      =       5.72
                                                  Prob > chi2     =     0.0168
                                                  Pseudo R2       =     0.0134
------------------------------------------------------------------------------
        mode |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
beach        |
    ydiv1000 |   .1134757   .0481736     2.36   0.018     .0190571    .2078942
       _cons |  -.7037127   .2125851    -3.31   0.001    -1.120372   -.2870535
------------------------------------------------------------------------------
(Outcome mode==pier is the comparison group)
.
. ******* (2) CONDITIONAL LOGIT: ALTERNATIVE-SPECIFIC REGRESSOR *********
.
. *** (2A) Estimate the model
.
. * This requires reshaping the data
. clear
.
.
. * Data are one entry per individual
. * Need to reshape to 4 observations per individual - one for each alternative
. * Use reshape to do this which also creates variable (see below)
. * alterntv = 1 if beach, = 2 if pier; = 3 if private boat; = 4 if charter

. gen id = _n
. gen d1 = dbeach
. gen p1 = pbeach
. gen q1 = qbeach
. gen d2 = dpier
. gen p2 = ppier
. gen q2 = qpier
. gen d3 = dprivate
. gen p3 = pprivate
. gen q3 = qprivate
. gen d4 = dcharter
. gen p4 = pcharter
. gen q4 = qcharter
. describe
Contains data
obs:
1,182
vars:
30
size:
label
variable label
------------------------------------------------------------------------------mode
float %9.0g
price
float %9.0g
crate
float %9.0g
dbeach
float %9.0g
dpier
float %9.0g
dprivate
float %9.0g
dcharter
float %9.0g
pbeach
float %9.0g
ppier
float %9.0g
pprivate
float %9.0g
pcharter
float %9.0g
qbeach
float %9.0g
qpier
float %9.0g
qprivate
float %9.0g
qcharter
float %9.0g
income
float %9.0g
ydiv1000
float %9.0g
id
float %9.0g
d1
float %9.0g
p1
float %9.0g
q1
float %9.0g
d2
float %9.0g
p2
float %9.0g
q2
float %9.0g
d3
float %9.0g
p3
float %9.0g
q3
float %9.0g
d4
float %9.0g
p4
float %9.0g
q4
float %9.0g
------------------------------------------------------------------------------Sorted by:
. summarize
Variable |
Obs
Mean Std. Dev.
Min
Max
-------------+-------------------------------------------------------mode |
1182 3.005076 .9936162
1
4
price |
1182 52.08197 53.82997
1.29 666.11
crate |
1182 .3893684 .5605964
.0002 2.3101
dbeach |
1182 .1133672 .3171753
0
1
dpier |
1182 .1505922 .3578023
0
1
-------------+-------------------------------------------------------dprivate |
1182 .3536379 .4783008
0
1
dcharter |
1182 .3824027 .4861799
0
1
pbeach |
1182 103.422 103.641
1.29 843.186
ppier |
1182 103.422 103.641
1.29 843.186
pprivate |
1182 55.25657 62.71344
2.29 666.11
-------------+-------------------------------------------------------pcharter |
1182 84.37924 63.54465
27.29 691.11
qbeach |
1182 .2410113 .1907524
.0678
.5333
qpier |
1182 .1622237 .1603898
.0014
.4522
qprivate |
1182 .1712146 .2097885 .0002 .7369
qcharter |
1182 .6293679 .7061142
.0021 2.3101
-------------+-------------------------------------------------------income |
1182 4099.337 2461.964 416.6667
12500
ydiv1000 |
1182 4.099337 2.461964 .4166667
12.5
id |
1182
591.5 341.3583
1
1182
d1 |
1182 .1133672 .3171753
0
1
p1 |
1182 103.422 103.641
1.29 843.186
-------------+-------------------------------------------------------q1 |
1182 .2410113 .1907524
.0678
.5333
d2 |
1182 .1505922 .3578023
0
1
p2 |
1182 103.422 103.641
1.29 843.186
q2 |
1182 .1622237 .1603898
.0014
.4522
d3 |
1182 .3536379 .4783008
0
1
-------------+-------------------------------------------------------p3 |
1182 55.25657 62.71344
2.29 666.11
q3 |
1182 .1712146 .2097885
.0002
.7369
d4 |
1182 .3824027 .4861799
0
1
p4 |
1182 84.37924 63.54465
27.29 691.11
q4 |
1182 .6293679 .7061142
.0021 2.3101
.
. reshape long d p q, i(id) j(alterntv)
(note: j = 1 2 3 4)
Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                     1182   ->    4728
Number of variables                  30   ->      22
j variable (4 values)                     ->   alterntv
xij variables:
                           d1 d2 ... d4   ->   d
                           p1 p2 ... p4   ->   p
                           q1 q2 ... q4   ->   q
-----------------------------------------------------------------------------
. * This automatically creates alterntv = 1 (beach), ... 4 (charter)
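For readers unfamiliar with reshape, the wide-to-long step can be pictured as follows. This is a minimal pure-Python sketch of the same transformation, not the Stata implementation: each individual's record with suffixed variables d1-d4, p1-p4, q1-q4 becomes four records indexed by alterntv. The two input records are made-up values for illustration only and are not taken from the data file.

```python
ALTERNATIVES = (1, 2, 3, 4)   # 1 beach, 2 pier, 3 private boat, 4 charter
STUBS = ("d", "p", "q")       # chosen dummy, price, catch rate

# Hypothetical wide records (one row per individual); values are invented.
wide = [
    {"id": 1, "ydiv1000": 7.08,
     "d1": 0, "d2": 0, "d3": 0, "d4": 1,
     "p1": 157.93, "p2": 157.93, "p3": 157.93, "p4": 182.93,
     "q1": 0.0678, "q2": 0.0503, "q3": 0.2601, "q4": 0.5391},
    {"id": 2, "ydiv1000": 1.25,
     "d1": 0, "d2": 1, "d3": 0, "d4": 0,
     "p1": 15.11, "p2": 15.11, "p3": 10.53, "p4": 34.53,
     "q1": 0.1049, "q2": 0.0451, "q3": 0.1574, "q4": 0.4671},
]


def reshape_long(rows):
    """Return one record per (individual, alternative) pair."""
    long_rows = []
    for row in rows:
        # Variables without an alternative suffix are repeated on every row.
        fixed = {k: v for k, v in row.items()
                 if not (k[0] in STUBS and k[1:].isdigit())}
        for j in ALTERNATIVES:
            rec = dict(fixed, alterntv=j)
            for stub in STUBS:
                rec[stub] = row[f"{stub}{j}"]
            long_rows.append(rec)
    return long_rows


long_data = reshape_long(wide)
for rec in long_data:
    print(rec)
```

After the reshape, d equals one on exactly one of the four rows for each id, which is the layout that clogit with group(id) expects.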
. describe
Contains data
obs:
4,728
vars:
22
size:
label
variable label
------------------------------------------------------------------------------id
float %9.0g
alterntv
byte %9.0g
mode
float %9.0g
price
float %9.0g
crate
float %9.0g
dbeach
float %9.0g
dpier
float %9.0g
dprivate
float %9.0g
dcharter
float %9.0g
pbeach
float %9.0g
ppier
float %9.0g
pprivate
float %9.0g
pcharter
float %9.0g
qbeach
float %9.0g
qpier
float %9.0g
qprivate
float %9.0g
qcharter
float %9.0g
income
float %9.0g
ydiv1000
float %9.0g
d
float %9.0g
p
float %9.0g
q
float %9.0g
------------------------------------------------------------------------------Sorted by: id alterntv
. summarize
Variable |
Obs
Mean Std. Dev.
Min
Max
-------------+-------------------------------------------------------id |
4728
591.5
341.25
1
1182
alterntv |
4728
2.5 1.118152
1
4
mode |
4728 3.005076 .9933008
1
4
price |
4728 52.08197 53.81289
1.29 666.11
crate |
4728 .3893684 .5604185
.0002 2.3101
-------------+-------------------------------------------------------dbeach |
4728 .1133672 .3170746
0
1
dpier |
4728 .1505922 .3576888
0
1
dprivate |
4728 .3536379 .478149
0
1
dcharter |
4728 .3824027 .4860256
0
1
pbeach |
4728 103.422 103.6081
1.29 843.186
-------------+-------------------------------------------------------ppier |
4728 103.422 103.6081
1.29 843.186
pprivate |
4728 55.25657 62.69354
2.29 666.11
pcharter |
4728 84.37924 63.52448
27.29 691.11
qbeach |
4728 .2410113 .1906919
.0678
.5333
qpier |
4728 .1622237 .1603389
.0014
.4522
-------------+-------------------------------------------------------qprivate |
4728 .1712146 .2097219 .0002 .7369
qcharter |
4728 .6293679 .7058901
.0021 2.3101
income |
4728 4099.337 2461.183 416.6667
12500
ydiv1000 |
4728 4.099337 2.461183 .4166667
12.5
d|
4728
.25 .4330585
0
1
-------------+-------------------------------------------------------p|
4728 86.61996 88.01813
1.29 843.186
q|
4728 .3009544 .4335593
.0002 2.3101
.
. clogit d q, group(id)
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:

Conditional (fixed-effects) logistic regression   Number of obs   =       4728
                                                  LR chi2(1)      =      67.97
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.0207
------------------------------------------------------------------------------
           d |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           q |   .6307908   .0757624     8.33   0.000     .4822993    .7792823
------------------------------------------------------------------------------
. clogit d p, group(id)
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:


Conditional (fixed-effects) logistic regression   Number of obs   =       4728
                                                  LR chi2(1)      =     531.33
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.1621
------------------------------------------------------------------------------
           d |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           p |  -.0179501   .0010694   -16.79   0.000    -.0200461   -.0158542
------------------------------------------------------------------------------
.
. * The following gives CL column of Table 15.2
. clogit d p q, group(id)
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:


Conditional (fixed-effects) logistic regression   Number of obs   =       4728
                                                  LR chi2(2)      =     653.24
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.1993
------------------------------------------------------------------------------
           d |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           p |  -.0204765   .0012231   -16.74   0.000    -.0228737   -.0180794
           q |   .9530985   .0894134    10.66   0.000     .7778514    1.128346
------------------------------------------------------------------------------
.
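The LR chi2 and Pseudo R2 lines of the three clogit runs can be tied together with a short calculation. The sketch assumes that the LR statistic is measured against a model with all coefficients equal to zero, so that each of the four alternatives has probability 1/4 and the null log-likelihood is -1182*ln(4). Under that assumption McFadden's pseudo R2, 1 - lnL/lnL0, equals LR/(-2*lnL0). The dictionary labels are ours.

```python
import math

N_GROUPS = 1182        # individuals, each making one choice
N_ALTERNATIVES = 4     # beach, pier, private boat, charter

# Log-likelihood with every coefficient at zero (assumed null model).
ll_null = -N_GROUPS * math.log(N_ALTERNATIVES)

# (LR chi2, reported pseudo R2) copied from the three clogit runs above.
reported = {
    "q only":  (67.97, 0.0207),
    "p only":  (531.33, 0.1621),
    "p and q": (653.24, 0.1993),
}


def pseudo_r2(lr_chi2):
    """McFadden pseudo R2 written as LR / (-2 * ll_null)."""
    return lr_chi2 / (-2.0 * ll_null)


implied = {name: pseudo_r2(lr) for name, (lr, _) in reported.items()}
print(f"null log-likelihood: {ll_null:.2f}")
for name, (lr, r2) in reported.items():
    print(f"{name:8s} LR = {lr:7.2f}  implied R2 = {implied[name]:.4f}  reported = {r2:.4f}")
```

The implied values agree with the reported ones to the four decimals shown, which supports the assumed null model.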
. *** (2B) Calculate the marginal effects
.
. quietly clogit d p q, group(id)
. predict pinitial
(option pc1 assumed; conditional probability for single outcome within group)
.
. * Now compute marginal effects
. * Consider in turn a change in each price and catch rate
. * Change price by 1 unit and then multiply by 100 as in Table 15.2
. * Change catch rate by 0.001 and then multiply by 1000
.
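. * Change p1: price beach
. replace p = p + 1 if alterntv==1
. predict pnewp1
. gen mep1 = 100*(pnewp1 - pinitial)
. replace p = p - 1 if alterntv==1
.
. * Change p2: price pier
. replace p = p + 1 if alterntv==2
. predict pnewp2
. gen mep2 = 100*(pnewp2 - pinitial)
. replace p = p - 1 if alterntv==2
.
. * Change p3: price private boat
. replace p = p + 1 if alterntv==3
. predict pnewp3
. gen mep3 = 100*(pnewp3 - pinitial)
. replace p = p - 1 if alterntv==3
.
. * Change p4: price charter boat
. replace p = p + 1 if alterntv==4
. predict pnewp4
. gen mep4 = 100*(pnewp4 - pinitial)
. replace p = p - 1 if alterntv==4
.
. * Change q1: catch rate beach
. replace q = q + 0.001 if alterntv==1
. predict pnewq1
. gen meq1 = 1000*(pnewq1 - pinitial)
. replace q = q - 0.001 if alterntv==1
.
. * Change q2: catch rate pier
. replace q = q + 0.001 if alterntv==2
. predict pnewq2
. gen meq2 = 1000*(pnewq2 - pinitial)
. replace q = q - 0.001 if alterntv==2
.
. * Change q3: catch rate private boat
. replace q = q + 0.001 if alterntv==3
. predict pnewq3
. gen meq3 = 1000*(pnewq3 - pinitial)
. replace q = q - 0.001 if alterntv==3
.
. * Change q4: catch rate charter boat
. replace q = q + 0.001 if alterntv==4
. predict pnewq4
. gen meq4 = 1000*(pnewq4 - pinitial)
. replace q = q - 0.001 if alterntv==4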
.
. * Following gives Table 15.3 on page 493
. sort alterntv
. by alterntv: sum pinitial mep1 mep2 mep3 mep4 meq1 meq2 meq3 meq4
-------------------------------------------------------------------------------
-> alterntv = 1
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
    pinitial |      1182    .1942074    .1545855   6.19e-08   .6159062
        mep1 |      1182   -.2703818    .1753241  -.5119085  -1.26e-07
        mep2 |      1182    .1183563    .1425011          0   .5107701
        mep3 |      1182    .0846517    .0561764   6.24e-08   .1818448
        mep4 |      1182    .0675326    .0398588   6.44e-08   .1960158
-------------+--------------------------------------------------------
        meq1 |      1182    .1264198    .0817316   5.91e-08   .2382994
        meq2 |      1182   -.0552685    .0664207  -.2378225          0
        meq3 |      1182   -.0395602    .0262581  -.0849366  -2.91e-08
        meq4 |      1182   -.0315872    .0186528  -.0915527  -3.00e-08
-------------------------------------------------------------------------------
-> alterntv = 2
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
    pinitial |      1182    .1832872    .1456892   5.73e-08    .484103
        mep1 |      1182    .1184102    .1425963          0   .5111754
        mep2 |      1182   -.2618934    .1742628  -.5112112  -1.16e-07
        mep3 |      1182    .0801368    .0543153   5.78e-08   .1729459
        mep4 |      1182    .0636229    .0381182   5.96e-08   .1775354
-------------+--------------------------------------------------------
        meq1 |      1182   -.0552672    .0664175  -.2378225          0
        meq2 |      1182    .1224849    .0812789   5.47e-08   .2380311
        meq3 |      1182   -.0374514    .0253908  -.0807345  -2.69e-08
        meq4 |      1182   -.0297604    .0178421  -.0829101  -2.78e-08
-------------------------------------------------------------------------------
-> alterntv = 3
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
    pinitial |      1182    .3298317     .173932   .0000756   .6739099
        mep1 |      1182     .084509    .0561326          0   .1815647
        mep2 |      1182    .0799891    .0542687          0    .172469
        mep3 |      1182   -.3897785    .1364849  -.5119085  -.0001532
        mep4 |      1182    .2248109    .1606873   1.24e-08   .5118489
-------------+--------------------------------------------------------
        meq1 |      1182   -.0395636      .02626  -.0849366          0
        meq2 |      1182   -.0374553    .0253917  -.0807345          0
        meq3 |      1182    .1818861    .0633881   .0000721   .2382994
        meq4 |      1182    -.104879    .0748259  -.2382398  -7.28e-09
-------------------------------------------------------------------------------
-> alterntv = 4
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
    pinitial |      1182    .2926737    .1807255    .000078   .7322331
        mep1 |      1182    .0674624    .0398696          0   .1958013
        mep2 |      1182    .0635479    .0381287          0   .1772434
        mep3 |      1182      .22499    .1608719   1.24e-08    .511682
        mep4 |      1182   -.3559665    .1370352  -.5119085  -.0001582
-------------+--------------------------------------------------------
        meq1 |      1182   -.0315891     .018653  -.0915825          0
        meq2 |      1182   -.0297618    .0178418  -.0829399          0
        meq3 |      1182   -.1048757    .0748219  -.2382398  -7.28e-09
        meq4 |      1182    .1662257    .0636901   .0000744   .2382994
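The sign pattern in these tables (negative own-price effects, positive cross-price effects) follows from the conditional logit formulas dp_j/dx_j = b*p_j*(1 - p_j) and dp_j/dx_k = -b*p_j*p_k. The sketch below compares those analytic derivatives with the one-unit price change used in the log, for a single hypothetical angler. The price and catch-rate values are invented for illustration, the coefficients are those of the clogit d p q run, and the function names are ours.

```python
import math

BETA_P = -0.0204765   # price coefficient from clogit d p q
BETA_Q = 0.9530985    # catch rate coefficient from clogit d p q

# Hypothetical prices and catch rates (beach, pier, private, charter).
price = [100.0, 100.0, 55.0, 85.0]
catch = [0.24, 0.16, 0.17, 0.63]


def cl_probs(p, q):
    """Conditional logit probabilities for one individual."""
    v = [BETA_P * pi + BETA_Q * qi for pi, qi in zip(p, q)]
    vmax = max(v)                      # subtract the max for numerical stability
    e = [math.exp(x - vmax) for x in v]
    total = sum(e)
    return [x / total for x in e]


def analytic_price_effects(p, q, k):
    """dp_j/dprice_k for every alternative j."""
    pr = cl_probs(p, q)
    return [BETA_P * pr[j] * ((1.0 if j == k else 0.0) - pr[k])
            for j in range(len(pr))]


def discrete_price_effects(p, q, k, step=1.0):
    """Change in each probability when price k rises by `step`."""
    base = cl_probs(p, q)
    bumped = list(p)
    bumped[k] += step
    new = cl_probs(bumped, q)
    return [b - a for a, b in zip(base, new)]


k = 0  # raise the beach price
an = analytic_price_effects(price, catch, k)
fd = discrete_price_effects(price, catch, k)
for j, name in enumerate(("beach", "pier", "private", "charter")):
    print(f"{name:8s} 100*analytic = {100 * an[j]: .4f}   100*discrete = {100 * fd[j]: .4f}")

# p*(1-p) is at most 0.25, which caps the own-price derivative (times 100).
own_bound = 100 * 0.25 * abs(BETA_P)
print("bound on 100*|own effect|:", round(own_bound, 4))
```

The bound of about 0.5119 lines up with the most negative own-price value, -.5119085, reported in the tables above.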
.
. ******* (3) CONDITIONAL LOGIT: ALTERNATIVE-INVARIANT REGRESSOR *********
.
. * Here we get clogit to do something that is easier done by mlogit
.
. clear
.
.
. * Use reshape to do this but first create variable
. * Alternative = 1 if beach, = 2 if pier; = 3 if private boat; = 4 if charter
. gen id = _n
. gen d1 = dbeach
. gen d2 = dpier
. gen d3 = dprivate
. gen d4 = dcharter
. describe
Contains data
obs:
1,182
vars:
22
size:
label
variable label
------------------------------------------------------------------------------mode
float %9.0g
price
float %9.0g
crate
float %9.0g
dbeach
float %9.0g
dpier
float %9.0g
dprivate
float %9.0g
dcharter
float %9.0g
pbeach
float %9.0g
ppier
float %9.0g
pprivate
float %9.0g
pcharter
float %9.0g
qbeach
float %9.0g
qpier
float %9.0g
qprivate
float %9.0g
qcharter
float %9.0g
income
float %9.0g
ydiv1000
float %9.0g
id
float %9.0g
d1
float %9.0g
d2
float %9.0g
d3
float %9.0g
d4
float %9.0g
------------------------------------------------------------------------------Sorted by:
. summarize
Variable |
Obs
Mean Std. Dev.
Min
Max
-------------+-------------------------------------------------------mode |
1182 3.005076 .9936162
1
4
price |
1182 52.08197 53.82997
1.29 666.11
crate |
1182 .3893684 .5605964
.0002 2.3101
dbeach |
1182 .1133672 .3171753
0
1
dpier |
1182 .1505922 .3578023
0
1
-------------+-------------------------------------------------------dprivate | 1182 .3536379 .4783008
0
1
dcharter |
1182 .3824027 .4861799
0
1
pbeach |
1182 103.422 103.641
1.29 843.186
ppier |
1182 103.422 103.641
1.29 843.186
pprivate |
1182 55.25657 62.71344
2.29 666.11
-------------+-------------------------------------------------------pcharter |
1182 84.37924 63.54465
27.29 691.11
qbeach |
1182 .2410113 .1907524 .0678 .5333
qpier |
1182 .1622237 .1603898
.0014
.4522
qprivate |
1182 .1712146 .2097885
.0002
.7369
qcharter |
1182 .6293679 .7061142
.0021 2.3101
-------------+-------------------------------------------------------income |
1182 4099.337 2461.964 416.6667
12500
ydiv1000 |
1182 4.099337 2.461964 .4166667
12.5
id |
1182
591.5 341.3583
1
1182
d1 |
1182 .1133672 .3171753
0
1
d2 |
1182 .1505922 .3578023
0
1
-------------+-------------------------------------------------------d3 | 1182 .3536379 .4783008
0
1
d4 |
1182 .3824027 .4861799
0
1
.
. reshape long d, i(id) j(alterntv)
(note: j = 1 2 3 4)
Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                     1182   ->    4728
Number of variables                  22   ->      20
j variable (4 values)                     ->   alterntv
xij variables:
                           d1 d2 ... d4   ->   d
-----------------------------------------------------------------------------
. describe
Contains data
obs:
4,728
vars:
20
size:
------------------------------------------------------------------------------
storage display value

label
variable label
------------------------------------------------------------------------------id
float %9.0g
alterntv
byte %9.0g
mode
float %9.0g
price
float %9.0g
crate
float %9.0g
dbeach
float %9.0g
dpier
float %9.0g
dprivate
float %9.0g
dcharter
float %9.0g
pbeach
float %9.0g
ppier
float %9.0g
pprivate
float %9.0g
pcharter
float %9.0g
qbeach
float %9.0g
qpier
float %9.0g
qprivate
float %9.0g
qcharter
float %9.0g
income
float %9.0g
ydiv1000
float %9.0g
d
float %9.0g
------------------------------------------------------------------------------Sorted by: id alterntv
. summarize
Variable |
Obs
Mean Std. Dev.
Min
Max
-------------+-------------------------------------------------------id |
4728
591.5
341.25
1
1182
alterntv |
4728
2.5 1.118152
1
4
mode |
4728 3.005076 .9933008
1
4
price |
4728 52.08197 53.81289
1.29 666.11
crate |
4728 .3893684 .5604185
.0002 2.3101
-------------+-------------------------------------------------------dbeach |
4728 .1133672 .3170746
0
1
dpier |
4728 .1505922 .3576888
0
1
dprivate |
4728 .3536379 .478149
0
1
dcharter |
4728 .3824027 .4860256
0
1
pbeach |
4728 103.422 103.6081
1.29 843.186
-------------+-------------------------------------------------------ppier |
4728 103.422 103.6081
1.29 843.186
pprivate |
4728 55.25657 62.69354
2.29 666.11
pcharter |
4728 84.37924 63.52448
27.29 691.11
qbeach |
4728 .2410113 .1906919
.0678
.5333
qpier |
4728 .1622237 .1603389
.0014
.4522
-------------+-------------------------------------------------------qprivate |
4728 .1712146 .2097219
.0002
.7369
qcharter |
4728 .6293679 .7058901
.0021 2.3101
      income |      4728    4099.337    2461.183   416.6667      12500
    ydiv1000 |      4728    4.099337    2.461183   .4166667       12.5
           d |      4728         .25    .4330585          0          1
.
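. gen obsnum=_n
. gen d2 = 0
. replace d2 = 1 if mod(obsnum,4)==2
. gen d3 = 0
. replace d3 = 1 if mod(obsnum,4)==3
. gen d4 = 0
. replace d4 = 1 if mod(obsnum,4)==0
. gen d2y = 0
. replace d2y = d2*ydiv1000
. gen d3y = 0
. replace d3y = d3*ydiv1000
. gen d4y = 0
. replace d4y = d4*ydiv1000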
. summarize
Variable |
Obs
Mean Std. Dev.
Min
Max
-------------+-------------------------------------------------------id |
4728
591.5
341.25
1
1182
alterntv |
4728
2.5 1.118152
1
4
mode |
4728 3.005076 .9933008
1
4
price |
4728 52.08197 53.81289
1.29 666.11
crate |
4728 .3893684 .5604185
.0002 2.3101
-------------+-------------------------------------------------------dbeach |
4728 .1133672 .3170746
0
1
dpier |
4728 .1505922 .3576888
0
1
dprivate |
4728 .3536379 .478149
0
1
dcharter |
4728 .3824027 .4860256
0
1
pbeach |
4728 103.422 103.6081
1.29 843.186
-------------+-------------------------------------------------------ppier |
4728 103.422 103.6081
1.29 843.186
pprivate |
4728 55.25657 62.69354
2.29 666.11
pcharter |
4728 84.37924 63.52448
27.29 691.11
qbeach |
4728 .2410113 .1906919 .0678 .5333
qpier |
4728 .1622237 .1603389
.0014
.4522
-------------+-------------------------------------------------------qprivate |
4728 .1712146 .2097219
.0002 .7369
qcharter |
4728 .6293679 .7058901
.0021 2.3101
income |
4728 4099.337 2461.183 416.6667
12500
ydiv1000 |
4728 4.099337 2.461183 .4166667
12.5
d|
4728
.25 .4330585
0
1
-------------+-------------------------------------------------------obsnum |
4728
2364.5
1365
1
4728
d2 |
4728
.25 .4330585
0
1
d3 |
4728
.25 .4330585
0
1
d4 |
4728
.25 .4330585
0
1
d2y |
4728 1.024834 2.160064
0
12.5
-------------+-------------------------------------------------------d3y |
4728 1.024834 2.160064
0
12.5
d4y |
4728 1.024834 2.160064
0
12.5
.
. * The following gives MNL column of Table 15.2, p.493,
. * which was more easily obtained using mlogit earlier
. clogit d d2 d3 d4 d2y d3y d4y, group(id)
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:


Conditional (fixed-effects) logistic regression   Number of obs   =       4728
                                                  LR chi2(6)      =     322.90
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.0985
------------------------------------------------------------------------------
           d |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          d2 |   .8141503    .228632     3.56   0.000     .3660399    1.262261
          d3 |   .7389208   .1967309     3.76   0.000     .3533352    1.124506
          d4 |   1.341291   .1945167     6.90   0.000     .9600457    1.722537
         d2y |  -.1434029   .0532884    -2.69   0.007    -.2478463   -.0389595
         d3y |   .0919064   .0406637     2.26   0.024     .0122069    .1716058
         d4y |  -.0316399   .0418463    -0.76   0.450    -.1136571    .0503774
------------------------------------------------------------------------------
.
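The reason clogit reproduces the mlogit estimates is that alternative dummies and their interactions with income give each alternative its own intercept and slope, which is exactly the multinomial logit index. The sketch below checks this numerically for a few income values. It is an illustration under our own naming: long_rows builds the four long-form rows for one individual the same way the dummies d2-d4 and interactions d2y-d4y are built in the log.

```python
import math

# Coefficients copied from the clogit run with dummies and income interactions.
CL_COEF = {
    "d2": 0.8141503, "d3": 0.7389208, "d4": 1.341291,
    "d2y": -0.1434029, "d3y": 0.0919064, "d4y": -0.0316399,
}

# The same numbers arranged as mlogit intercepts and slopes (beach = base).
MNL_CONST = [0.0, 0.8141503, 0.7389208, 1.341291]
MNL_SLOPE = [0.0, -0.1434029, 0.0919064, -0.0316399]


def long_rows(y):
    """Regressors for the four long-form rows of one individual with income y."""
    rows = []
    for alt in (1, 2, 3, 4):
        row = {}
        for j in (2, 3, 4):
            dummy = 1.0 if alt == j else 0.0
            row[f"d{j}"] = dummy
            row[f"d{j}y"] = dummy * y
        rows.append(row)
    return rows


def softmax(index):
    e = [math.exp(v) for v in index]
    total = sum(e)
    return [v / total for v in e]


def cl_probs(y):
    index = [sum(CL_COEF[name] * value for name, value in row.items())
             for row in long_rows(y)]
    return softmax(index)


def mnl_probs(y):
    return softmax([a + b * y for a, b in zip(MNL_CONST, MNL_SLOPE)])


for y in (0.4166667, 4.09934, 12.5):
    print(y, [round(p, 6) for p in cl_probs(y)], [round(p, 6) for p in mnl_probs(y)])
```

The two sets of probabilities coincide at every income level, so the likelihoods and hence the estimates coincide as well.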
. ******* (4) "MIXED LOGIT" = CONDITIONAL LOGIT WITH BOTH
. *           ALTERNATIVE-SPECIFIC REGRESSOR
. *           AND ALTERNATIVE INVARIANT REGRESSOR *********
.
. clear
.
.
. * Use reshape to do this but first create variable
. * Alternative = 1 if beach, = 2 if pier; = 3 if private boat; = 4 if charter
. gen id = _n
. gen d1 = dbeach
. gen p1 = pbeach
. gen q1 = qbeach
. gen d2 = dpier
. gen p2 = ppier
. gen q2 = qpier
. gen d3 = dprivate
. gen p3 = pprivate
. gen q3 = qprivate
. gen d4 = dcharter
. gen p4 = pcharter
. gen q4 = qcharter
.
. reshape long d p q, i(id) j(alterntv)
(note: j = 1 2 3 4)
Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                     1182   ->    4728
Number of variables                  30   ->      22
j variable (4 values)                     ->   alterntv
xij variables:
                           d1 d2 ... d4   ->   d
                           p1 p2 ... p4   ->   p
                           q1 q2 ... q4   ->   q
-----------------------------------------------------------------------------
. summarize
Variable |
Obs
Mean Std. Dev.
Min
Max
-------------+-------------------------------------------------------id |
4728
591.5
341.25
1
1182
alterntv |
4728
2.5 1.118152
1
4
mode |
4728 3.005076 .9933008
1
4
price |
4728 52.08197 53.81289
1.29 666.11
crate |
4728 .3893684 .5604185
.0002 2.3101
-------------+-------------------------------------------------------dbeach |
4728 .1133672 .3170746
0
1
dpier |
4728 .1505922 .3576888
0
1
dprivate |
4728 .3536379 .478149
0
1
dcharter |
4728 .3824027 .4860256
0
1
pbeach |
4728 103.422 103.6081
1.29 843.186
-------------+-------------------------------------------------------ppier |
4728 103.422 103.6081
1.29 843.186
pprivate |
4728 55.25657 62.69354
2.29 666.11
pcharter | 4728 84.37924 63.52448
27.29 691.11
qbeach |
4728 .2410113 .1906919
.0678
.5333
qpier |
4728 .1622237 .1603389
.0014
.4522
-------------+-------------------------------------------------------qprivate |
4728 .1712146 .2097219
.0002
.7369
qcharter |
4728 .6293679 .7058901
.0021 2.3101
income |
4728 4099.337 2461.183 416.6667
12500
ydiv1000 |
4728 4.099337 2.461183 .4166667
12.5
d|
4728
.25 .4330585
0
1
-------------+-------------------------------------------------------p|
4728 86.61996 88.01813
1.29 843.186
q|
4728 .3009544 .4335593
.0002 2.3101
.
. * Bring in alternative specific dummies
. * Since d2-d4 already used instead call them dummy2 - dummy4
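. gen obsnum=_n
. gen dummy1 = 0
. replace dummy1 = 1 if mod(obsnum,4)==1
. gen dummy2 = 0
. replace dummy2 = 1 if mod(obsnum,4)==2
. gen dummy3 = 0
. replace dummy3 = 1 if mod(obsnum,4)==3
. gen dummy4 = 0
. replace dummy4 = 1 if mod(obsnum,4)==0
. * And interact with income
. gen d1y = 0
. replace d1y = dummy1*ydiv1000
. gen d2y = 0
. replace d2y = dummy2*ydiv1000
. gen d3y = 0
. replace d3y = dummy3*ydiv1000
. gen d4y = 0
. replace d4y = dummy4*ydiv1000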
.
. summarize
Variable |
Obs
Mean Std. Dev.
Min
Max
-------------+-------------------------------------------------------id |
4728
591.5
341.25
1
1182
alterntv |
4728
2.5 1.118152
1
4
mode |
4728 3.005076 .9933008
1
4
price |
4728 52.08197 53.81289
1.29 666.11
crate |
4728 .3893684 .5604185
.0002 2.3101
-------------+-------------------------------------------------------dbeach |
4728 .1133672 .3170746
0
1
dpier |
4728 .1505922 .3576888
0
1
dprivate |
4728 .3536379 .478149
0
1
dcharter |
4728 .3824027 .4860256
0
1
pbeach |
4728 103.422 103.6081
1.29 843.186
-------------+-------------------------------------------------------ppier |
4728 103.422 103.6081
1.29 843.186
pprivate |
4728 55.25657 62.69354
2.29 666.11
pcharter |
4728 84.37924 63.52448
27.29 691.11
qbeach |
4728 .2410113 .1906919
.0678
.5333
qpier |
4728 .1622237 .1603389
.0014
.4522
-------------+-------------------------------------------------------qprivate |
4728 .1712146 .2097219 .0002 .7369
qcharter |
4728 .6293679 .7058901
.0021 2.3101
income |
4728 4099.337 2461.183 416.6667
12500
ydiv1000 |
4728 4.099337 2.461183 .4166667
12.5
d|
4728
.25 .4330585
0
1
-------------+-------------------------------------------------------p|
4728 86.61996 88.01813
1.29 843.186
q|
4728 .3009544 .4335593
.0002 2.3101
obsnum |
4728
2364.5
1365
1
4728
dummy1 |
4728
.25 .4330585
0
1
dummy2 |
4728
.25 .4330585
0
1
-------------+-------------------------------------------------------dummy3 |
4728
.25 .4330585
0
1
dummy4 |
4728
.25 .4330585
0
1
d1y |
4728 1.024834 2.160064
0
12.5
d2y |
4728 1.024834 2.160064
0
12.5
d3y |
4728 1.024834 2.160064
0
12.5
-------------+-------------------------------------------------------d4y |
4728 1.024834 2.160064
0
12.5
.
. clogit d dummy2 dummy3 dummy4 p q, group(id)
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:
Iteration 6:


Conditional (fixed-effects) logistic regression   Number of obs   =       4728
                                                  LR chi2(5)      =     815.63
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.2489
------------------------------------------------------------------------------
           d |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      dummy2 |   .3070552   .1145738     2.68   0.007     .0824947    .5316158
      dummy3 |   .8713749   .1140428     7.64   0.000     .6478551    1.094895
      dummy4 |   1.498888   .1329328    11.28   0.000     1.238345    1.759432
           p |  -.0247896   .0017044   -14.54   0.000    -.0281301    -.021449
           q |   .3771689   .1099707     3.43   0.001     .1616303    .5927074
------------------------------------------------------------------------------
.
. * The following gives Mixed column of Table 15.2, p.493
. clogit d p q dummy2 dummy3 dummy4 d2y d3y d4y, group(id)
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:
Iteration 6:


Conditional (fixed-effects) logistic regression   Number of obs   =       4728
                                                  LR chi2(8)      =     846.92
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.2584
------------------------------------------------------------------------------
           d |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           p |  -.0251166   .0017317   -14.50   0.000    -.0285106   -.0217225
           q |    .357782   .1097733     3.26   0.001     .1426302    .5729337
      dummy2 |   .7779594   .2204939     3.53   0.000     .3457992     1.21012
      dummy3 |   .5272788   .2227927     2.37   0.018     .0906131    .9639444
      dummy4 |   1.694366   .2240506     7.56   0.000     1.255235    2.133497
         d2y |  -.1275771   .0506395    -2.52   0.012    -.2268288   -.0283255
         d3y |   .0894398   .0500671     1.79   0.074    -.0086898    .1875695
         d4y |  -.0332917   .0503409    -0.66   0.508     -.131958    .0653746
------------------------------------------------------------------------------
.
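The two clogit runs in this section are nested: the second adds the income interactions d2y, d3y and d4y to the first. If both LR chi2 values are measured against the same zero-coefficient model (an assumption about how the statistic is reported), their difference is the likelihood ratio statistic for the three interactions jointly, with 3 degrees of freedom. The sketch below computes it and the corresponding p-value using the closed form of the chi-squared(3) upper tail, so that no statistics library is needed. Names are ours.

```python
import math


def chi2_sf_3df(x):
    """Upper-tail probability of a chi-squared variate with 3 degrees of freedom.

    For 3 df the survival function has the closed form
    erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


lr_restricted = 815.63   # clogit d dummy2 dummy3 dummy4 p q
lr_full = 846.92         # clogit d p q dummy2 dummy3 dummy4 d2y d3y d4y

lr_stat = lr_full - lr_restricted   # LR test of d2y = d3y = d4y = 0
p_value = chi2_sf_3df(lr_stat)

print(f"LR statistic = {lr_stat:.2f} with 3 df, p-value = {p_value:.2e}")
print(f"check: P(chi2(3) > 7.815) = {chi2_sf_3df(7.815):.4f}")
```

On this calculation the statistic is 31.29, well above the 5 percent critical value of 7.815, so the income interactions are jointly significant even though d3y and d4y are not individually significant at 5 percent.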
. * Output data file for Read into Limdep program mma15p4gev.lim
. outfile id d p q ydiv1000 dummy2 dummy3 dummy4 d2y d3y d4y using mma15p4gev.asc, replace
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section4\mma15p1mnl.txt
log type: text
closed on: 19 May 2005, 12:16:24
------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section4\mma15p2gev.txt
log type: text
opened on: 19 May 2005, 12:16:29
.
. ********** OVERVIEW OF MMA15P2GEV.DO **********
.
. * STATA Program
.
. * Nested logit (GEV) model analysis.
. * (1) Set data up and reproduce Mixed estimates in Table 15.2 p.493
. * (2A) Nested logit model estimates (page 511)
. * (2B) Restricted nested logit model estimates (page 511)
. * (2C) Equivalent conditional logit model estimates (same as (2B))
.
. * Related programs are
. * mma15p1mnl.do multinomial and conditional logit using Stata
. * mma15p3mnl.lim multinomial logit using Limdep
. * mma15p4gev.lim conditional and nested logit using Limdep and Nlogit
.
. * Nldata.asc
.
. * NOTE: The example here is deliberately simple and merely illustrative.
. *       with nesting structure
. *              /       \
. *            /  \     /  \
. * In this case with parameter rho_j differing across alternatives
. * Stata 8 estimates the earlier variant of the nested logit model
. * rather than the preferred variant given in the text.
. * See the discussion at bottom of page 511 and also Train (2003, p.88)
.
. ********** SETUP **********
.
. set more off
. version 8.0
.
.
.
.
. * Format: Ascii
.
. ******* (1) CONDITIONAL LOGIT MODEL (Table 15.2 p.493 Mixed column) *********
.
.
.
. * Use reshape to do this which also creates variable (see below)
. * alterntv = 1 if beach, = 2 if pier; = 3 if private boat; = 4 if charter
. gen id = _n
. gen d1 = dbeach
. gen p1 = pbeach
. gen q1 = qbeach
. gen d2 = dpier
. gen p2 = ppier
. gen q2 = qpier
. gen d3 = dprivate
. gen p3 = pprivate
. gen q3 = qprivate
. gen d4 = dcharter
. gen p4 = pcharter
. gen q4 = qcharter
. summarize

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        mode |      1182    3.005076    .9936162          1          4
       price |      1182    52.08197    53.82997       1.29     666.11
       crate |      1182    .3893684    .5605964      .0002     2.3101
      dbeach |      1182    .1133672    .3171753          0          1
       dpier |      1182    .1505922    .3578023          0          1
-------------+--------------------------------------------------------
    dprivate |      1182    .3536379    .4783008          0          1
    dcharter |      1182    .3824027    .4861799          0          1
      pbeach |      1182     103.422     103.641       1.29    843.186
       ppier |      1182     103.422     103.641       1.29    843.186
    pprivate |      1182    55.25657    62.71344       2.29     666.11
-------------+--------------------------------------------------------
    pcharter |      1182    84.37924    63.54465      27.29     691.11
      qbeach |      1182    .2410113    .1907524      .0678      .5333
       qpier |      1182    .1622237    .1603898      .0014      .4522
    qprivate |      1182    .1712146    .2097885      .0002      .7369
    qcharter |      1182    .6293679    .7061142      .0021     2.3101
-------------+--------------------------------------------------------
      income |      1182    4099.337    2461.964   416.6667      12500
    ydiv1000 |      1182    4.099337    2.461964   .4166667       12.5
          id |      1182       591.5    341.3583          1       1182
          d1 |      1182    .1133672    .3171753          0          1
          p1 |      1182     103.422     103.641       1.29    843.186
-------------+--------------------------------------------------------
          q1 |      1182    .2410113    .1907524      .0678      .5333
          d2 |      1182    .1505922    .3578023          0          1
          p2 |      1182     103.422     103.641       1.29    843.186
          q2 |      1182    .1622237    .1603898      .0014      .4522
          d3 |      1182    .3536379    .4783008          0          1
-------------+--------------------------------------------------------
          p3 |      1182    55.25657    62.71344       2.29     666.11
          q3 |      1182    .1712146    .2097885      .0002      .7369
          d4 |      1182    .3824027    .4861799          0          1
          p4 |      1182    84.37924    63.54465      27.29     691.11
          q4 |      1182    .6293679    .7061142      .0021     2.3101
.
(note: j = 1 2 3 4)
Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                     1182   ->    4728
Number of variables                  30   ->      22
j variable (4 values)                     ->   alterntv
xij variables:
                           d1 d2 ... d4   ->   d
                           p1 p2 ... p4   ->   p
                           q1 q2 ... q4   ->   q
-----------------------------------------------------------------------------
. * This automatically creates alterntv = 1 (beach), ... 4 (charter)
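
For readers replicating this step outside Stata, the wide-to-long reshape can be sketched in Python. This is only an illustration under stated assumptions: the helper name wide_to_long and the single record shown are hypothetical, while the stubs d, p, q and the index name alterntv follow the log above.

```python
# Illustrative wide-to-long reshape (hypothetical record, standard library only).
def wide_to_long(record, stubs=("d", "p", "q"), alternatives=(1, 2, 3, 4)):
    """Turn one wide record (d1..d4, p1..p4, q1..q4) into one row per alternative."""
    rows = []
    for j in alternatives:
        row = {"id": record["id"], "alterntv": j}
        for stub in stubs:
            # d1 -> d for alternative 1, d2 -> d for alternative 2, and so on
            row[stub] = record[f"{stub}{j}"]
        rows.append(row)
    return rows

# Hypothetical individual who chose alternative 4 (charter); values are made up.
wide = {"id": 1,
        "d1": 0, "d2": 0, "d3": 0, "d4": 1,
        "p1": 157.93, "p2": 157.93, "p3": 157.93, "p4": 182.93,
        "q1": 0.0678, "q2": 0.0503, "q3": 0.2601, "q4": 0.5391}

long_rows = wide_to_long(wide)
for row in long_rows:
    print(row)
```

Each individual contributes four rows, which is why the 1182 observations in wide form become 4728 in long form.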
. describe

Contains data
  obs:         4,728
 vars:            22
 size:
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
id              float  %9.0g
alterntv        byte   %9.0g
mode            float  %9.0g
price           float  %9.0g
crate           float  %9.0g
dbeach          float  %9.0g
dpier           float  %9.0g
dprivate        float  %9.0g
dcharter        float  %9.0g
pbeach          float  %9.0g
ppier           float  %9.0g
pprivate        float  %9.0g
pcharter        float  %9.0g
qbeach          float  %9.0g
qpier           float  %9.0g
qprivate        float  %9.0g
qcharter        float  %9.0g
income          float  %9.0g
ydiv1000        float  %9.0g
d               float  %9.0g
p               float  %9.0g
q               float  %9.0g
-------------------------------------------------------------------------------
Sorted by:  id  alterntv

. summarize

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          id |      4728       591.5      341.25          1       1182
    alterntv |      4728         2.5    1.118152          1          4
        mode |      4728    3.005076    .9933008          1          4
       price |      4728    52.08197    53.81289       1.29     666.11
       crate |      4728    .3893684    .5604185      .0002     2.3101
-------------+--------------------------------------------------------
      dbeach |      4728    .1133672    .3170746          0          1
       dpier |      4728    .1505922    .3576888          0          1
    dprivate |      4728    .3536379     .478149          0          1
    dcharter |      4728    .3824027    .4860256          0          1
      pbeach |      4728     103.422    103.6081       1.29    843.186
-------------+--------------------------------------------------------
       ppier |      4728     103.422    103.6081       1.29    843.186
    pprivate |      4728    55.25657    62.69354       2.29     666.11
    pcharter |      4728    84.37924    63.52448      27.29     691.11
      qbeach |      4728    .2410113    .1906919      .0678      .5333
       qpier |      4728    .1622237    .1603389      .0014      .4522
-------------+--------------------------------------------------------
    qprivate |      4728    .1712146    .2097219      .0002      .7369
    qcharter |      4728    .6293679    .7058901      .0021     2.3101
      income |      4728    4099.337    2461.183   416.6667      12500
    ydiv1000 |      4728    4.099337    2.461183   .4166667       12.5
           d |      4728         .25    .4330585          0          1
-------------+--------------------------------------------------------
           p |      4728    86.61996    88.01813       1.29    843.186
           q |      4728    .3009544    .4335593      .0002     2.3101
.
. * Bring in alternative specific dummies
. * Since d2-d4 already used instead call them dummy2 - dummy4
. gen obsnum=_n
. gen dummy1 = (mod(obsnum,4)==1) * 1
. gen d1y = (mod(obsnum,4)==1) * ydiv1000

.
. summarize

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          id |      4728       591.5      341.25          1       1182
    alterntv |      4728         2.5    1.118152          1          4
        mode |      4728    3.005076    .9933008          1          4
       price |      4728    52.08197    53.81289       1.29     666.11
       crate |      4728    .3893684    .5604185      .0002     2.3101
-------------+--------------------------------------------------------
      dbeach |      4728    .1133672    .3170746          0          1
       dpier |      4728    .1505922    .3576888          0          1
    dprivate |      4728    .3536379     .478149          0          1
    dcharter |      4728    .3824027    .4860256          0          1
      pbeach |      4728     103.422    103.6081       1.29    843.186
-------------+--------------------------------------------------------
       ppier |      4728     103.422    103.6081       1.29    843.186
    pprivate |      4728    55.25657    62.69354       2.29     666.11
    pcharter |      4728    84.37924    63.52448      27.29     691.11
      qbeach |      4728    .2410113    .1906919      .0678      .5333
       qpier |      4728    .1622237    .1603389      .0014      .4522
-------------+--------------------------------------------------------
    qprivate |      4728    .1712146    .2097219      .0002      .7369
    qcharter |      4728    .6293679    .7058901      .0021     2.3101
      income |      4728    4099.337    2461.183   416.6667      12500
    ydiv1000 |      4728    4.099337    2.461183   .4166667       12.5
           d |      4728         .25    .4330585          0          1
-------------+--------------------------------------------------------
           p |      4728    86.61996    88.01813       1.29    843.186
           q |      4728    .3009544    .4335593      .0002     2.3101
      obsnum |      4728      2364.5        1365          1       4728
      dummy1 |      4728         .25    .4330585          0          1
      dummy2 |      4728         .25    .4330585          0          1
-------------+--------------------------------------------------------
      dummy3 |      4728         .25    .4330585          0          1
      dummy4 |      4728         .25    .4330585          0          1
         d1y |      4728    1.024834    2.160064          0       12.5
         d2y |      4728    1.024834    2.160064          0       12.5
         d3y |      4728    1.024834    2.160064          0       12.5
-------------+--------------------------------------------------------
         d4y |      4728    1.024834    2.160064          0       12.5
.
. * The following gives Mixed column of Table 15.2 p.493
. * Note that dummy1 and d1y are omitted to avoid the dummy variable trap
.
. clogit d dummy2 dummy3 dummy4 d2y d3y d4y p q, group(id)

Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:
Iteration 6:

                                                  Number of obs   =       4728
                                                  LR chi2(8)      =     846.92
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.2584
------------------------------------------------------------------------------
           d |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      dummy2 |   .7779594   .2204939     3.53   0.000     .3457992     1.21012
      dummy3 |   .5272788   .2227927     2.37   0.018     .0906131    .9639444
      dummy4 |   1.694366   .2240506     7.56   0.000     1.255235    2.133497
         d2y |  -.1275771   .0506395    -2.52   0.012    -.2268288   -.0283255
         d3y |   .0894398   .0500671     1.79   0.074    -.0086898    .1875695
         d4y |  -.0332917   .0503409    -0.66   0.508     -.131958    .0653746
           p |  -.0251166   .0017317   -14.50   0.000    -.0285106   -.0217225
           q |    .357782   .1097733     3.26   0.001     .1426302    .5729337
------------------------------------------------------------------------------
.
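
To see how the fitted coefficients translate into choice probabilities, the following Python sketch applies the conditional logit formula exp(v_j) / sum_k exp(v_k). It is an illustration, not part of the original program: the function name choice_probabilities is hypothetical, and the inputs are simply the long-run sample means of price, catch rate and income reported in the summarize output earlier in this log, so the result describes an artificial "average" individual.

```python
import math

# Coefficients copied from the clogit output above; alternative 1 (beach) is the base.
intercept = {1: 0.0, 2: 0.7779594, 3: 0.5272788, 4: 1.694366}
income_coef = {1: 0.0, 2: -0.1275771, 3: 0.0894398, 4: -0.0332917}
b_price, b_crate = -0.0251166, 0.357782

def choice_probabilities(prices, crates, ydiv1000):
    """Conditional logit probabilities for alternatives 1..4."""
    v = {j: intercept[j] + income_coef[j] * ydiv1000
            + b_price * prices[j] + b_crate * crates[j]
         for j in prices}
    vmax = max(v.values())                      # subtract the max for numerical stability
    expv = {j: math.exp(v[j] - vmax) for j in v}
    total = sum(expv.values())
    return {j: expv[j] / total for j in v}

# Sample means from the summarize output (illustrative evaluation point only).
prices = {1: 103.422, 2: 103.422, 3: 55.25657, 4: 84.37924}
crates = {1: 0.2410113, 2: 0.1622237, 3: 0.1712146, 4: 0.6293679}
probs = choice_probabilities(prices, crates, ydiv1000=4.099337)
for j, pr in probs.items():
    print(j, round(pr, 4))
```

Probabilities evaluated at sample means need not match the sample choice shares; averaging individual-level predictions would be the appropriate comparison.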
. ******* (2) NESTED LOGIT MODEL (p.511) *********
.
. * Define the Tree for Nested logit
. *       with nesting structure
. *                /     \
. *              /  \   /  \
. * In this case with parameter rho_j differing across alternatives
. * Stata 8 estimates the earlier variant of the nested logit model
. * rather than the preferred variant given in the text.
. * See the discussion at bottom of page 511 and also Train (2003, p.88)
.
. nlogitgen type = alterntv(shore: 1 | 2 , boat: 3 | 4)
new variable type is generated with 2 groups
label list lb_type
lb_type:
1 shore
2 boat
. nlogittree alterntv type

tree structure specified for the nested logit model

 top --> bottom

    type     alterntv
-------------------------
   shore            1
                    2
    boat            3
                    4
.
. *** (2A) Estimate the nested logit model
. *** This is the model on p.511 that has "higher log-likelihood"
.
. * For the top level we use regressors that do not vary at the lower level
. * So not p or q, but could be income or alternative dummy
. * Here use income and alternative dummy
. gen dshore = (type ==1) * 1
. gen dshorey = (type ==1) * ydiv1000
. nlogit d (alterntv = p q) (type = dshore dshorey), group(id)

 top --> bottom

    type     alterntv
-------------------------
   shore            1
                    2
    boat            3
                    4

initial:
rescale:
BFGS stepping has contracted, resetting BFGS Hessian (0)

Nested logit estimates
Levels             =          2                 Number of obs      =      4728
Dependent variable =          d                 LR chi2(6)         =  917.1687
                                                Prob > chi2        =    0.0000
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
alterntv     |
           p |  -.0013303    .001081    -1.23   0.218     -.003449    .0007883
           q |   .1284825   .1038986     1.24   0.216     -.075155      .33212
-------------+----------------------------------------------------------------
type         |
      dshore |  -11.40196    9.15307    -1.25   0.213    -29.34164    6.537733
     dshorey |   .1108341   .0531049     2.09   0.037     .0067505    .2149178
-------------+----------------------------------------------------------------
(incl. value |
 parameters) |
type         |
      /shore |   29.98591   24.40089     1.23   0.219    -17.83896    77.81078
       /boat |   14.06438   11.39886     1.23   0.217    -8.276971    36.40572
------------------------------------------------------------------------------
LR test of homoskedasticity (iv = 1): chi2(2)=  145.39    Prob > chi2 = 0.0000
------------------------------------------------------------------------------
. estimates store nlogitunrest
.
. *** (2B) Estimate the restricted nested logit model
. *** This is the model on p.511 that has log L = -1252
.
. * Set the inclusive value parameters to 1
. nlogit d (alterntv = p q) (type = dshore dshorey), group(id) ivc(shore=1, boat=1)

 top --> bottom

    type     alterntv
-------------------------
   shore            1
                    2
    boat            3
                    4

User-defined constraint(s):
IV constraint(s):
 [shore]_cons = 1
 [boat]_cons = 1

initial:
rescale:

Nested logit estimates
Levels             =          2                 Number of obs      =      4728
Dependent variable =          d                 LR chi2(4)         =  771.7778
                                                Prob > chi2        =    0.0000
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
alterntv     |
           p |   -.020246   .0012832   -15.78   0.000     -.022761    -.017731
           q |   .7552644   .0918004     8.23   0.000      .575339    .9351899
-------------+----------------------------------------------------------------
type         |
      dshore |  -.5897435   .1565201    -3.77   0.000    -.8965172   -.2829697
     dshorey |  -.0790869   .0381453    -2.07   0.038    -.1538503   -.0043235
-------------+----------------------------------------------------------------
(incl. value |
 parameters) |
type         |
      /shore |          1          .        .       .            .           .
       /boat |          1          .        .       .            .           .
------------------------------------------------------------------------------
LR test of homoskedasticity (iv = 1): chi2(0)=    0.00    Prob > chi2 =      .
------------------------------------------------------------------------------
. estimates store nlogitrest
.
. * Perform a likelihood ratio test that inclusive parameters = 1
. lrtest nlogitunrest nlogitrest

                                                       LR chi2(2)  =    145.39
(Assumption: nlogitrest nested in nlogitunrest)        Prob > chi2 =    0.0000
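
The arithmetic behind this likelihood ratio test can be checked by hand. The Python sketch below assumes that the two LR chi2 statistics reported for models (2A) and (2B) are measured against the same baseline model, so that their difference equals twice the difference in log-likelihoods; that assumption is consistent with the figures in the log but is not stated there. With 2 degrees of freedom the chi-square upper-tail probability has the closed form exp(-x/2).

```python
import math

# LR chi2 statistics copied from the two nlogit outputs above.
lr_unrestricted = 917.1687   # model (2A), LR chi2(6)
lr_restricted = 771.7778     # model (2B), LR chi2(4), inclusive value parameters set to 1

# Assumption: both statistics share the same baseline, so the difference
# is the LR statistic for the two restrictions (6 - 4 = 2 degrees of freedom).
lr_stat = lr_unrestricted - lr_restricted
df = 6 - 4

# For a chi-square(2) variable, Pr(X > x) = exp(-x / 2).
p_value = math.exp(-lr_stat / 2.0)

print(f"LR chi2({df}) = {lr_stat:.2f}, Prob > chi2 = {p_value:.4f}")
```

The difference reproduces the 145.39 reported by lrtest, and the restriction that both inclusive value parameters equal 1 is rejected at any conventional level.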
.
. *** (2C) As a check, verify that this restricted nested logit = conditional logit
.
. clogit d p q dshore dshorey, group(id)

Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:

                                                  Number of obs   =       4728
                                                  LR chi2(4)      =     771.78
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.2355
------------------------------------------------------------------------------
           d |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           p |  -.0202461   .0012832   -15.78   0.000    -.0227611   -.0177311
           q |   .7552646   .0918003     8.23   0.000     .5753392    .9351899
      dshore |  -.5897442     .15652    -3.77   0.000    -.8965178   -.2829706
     dshorey |  -.0790866   .0381453    -2.07   0.038    -.1538499   -.0043232
------------------------------------------------------------------------------
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section4\mma15p2gev.txt
log type: text
closed on: 19 May 2005, 12:19:10
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section4\mma16p1tobit.txt
log type: text
opened on: 19 May 2005, 13:00:31
.
. ********** OVERVIEW OF MMA16P1TOBIT.DO **********
.
. * STATA Program
.
. * Chapter 16.2.1 pages 530-1 and 16.9.2 page 565
. * Classic Tobit model with generated data
. * Provides
. * (1) Graph of various conditional means Figure 16.1 (ch16condmeans.wmf)
. * (2) Tobit model estimation: various estimators not reported in book
. * (3) Tobit model estimation: CLAD estimation mentioned on page 565
.
. ********** SETUP **********
.
. set more off
. version 8.0
.
. ********** GENERATE DATA **********
.
. * Data generating process is
. * Regressor:           lnwage ~ N(2.75, 0.6^2)
. * Error term:          e ~ N(0, 1000^2)
. * Latent variable:     ystar = -2500 + 1000*lnwage + e
. * Truncated variable: ytrunc = 1(ystar>0)*ystar
. * Censored variable: ycens = 1(ystar<=0)*0 + 1(ystar>0)*ystar
. * Censoring Indicator: dy = 1(ycens>0)
.
. set seed 10101
. set obs 200
obs was 0, now 200
. gen e = 1000*invnorm(uniform( ))
. gen lnwage = 2.75 + 0.6*invnorm(uniform( ))
. gen ystar = -2500 + 1000*lnwage + e
. gen ytrunc = ystar

. replace ytrunc = . if (ystar < 0)
. gen ycens = ystar
. replace ycens = 0 if (ystar < 0)
. gen dy = ycens
. replace dy = 1 if (ycens>0)
.
. summarize

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           e |       200    76.96455    977.5598  -2906.972   2943.727
      lnwage |       200    2.792559    .6249093   .9039821   4.373462
       ystar |       200    369.5237    1163.722  -2852.944   3105.383
      ytrunc |       130    1047.602    712.0859   17.88135   3105.383
       ycens |       200    680.9414    761.3346          0   3105.383
-------------+--------------------------------------------------------
          dy |       200         .65    .4781665          0          1
.
. * Save data as text (ascii) so that can use programs other than Stata
. outfile e lnwage ystar ytrunc ycens dy using mma16p1tobit.asc, replace
.
. ********** (1) PLOT THEORETICAL CONDITIONAL MEANS **********
.
. * Here we use the true parameter values used in the dgp
.
. * Compute the censored and truncated means
. gen xb = -2500 + 1000*lnwage
. gen sigma = 1000
. gen capphixb = normprob(xb/sigma)
. gen phixb = normd(xb/sigma)
. gen lamda = phixb/capphixb
. gen eytrunc = xb + sigma*lamda
. gen eycens = capphixb*eytrunc
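
The three conditional means generated above can be reproduced outside Stata. The Python sketch below mirrors the logged commands using the true DGP parameters: the uncensored mean xb, the truncated mean xb + sigma*lambda(xb/sigma), and the censored mean Phi(xb/sigma) times the truncated mean. The function names are hypothetical and the evaluation point lnwage = 2.5 is chosen only because it makes xb equal to zero.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_means(lnwage, b0=-2500.0, b1=1000.0, sigma=1000.0):
    """Return (uncensored, truncated, censored) conditional means at lnwage."""
    xb = b0 + b1 * lnwage
    z = xb / sigma
    lam = norm_pdf(z) / norm_cdf(z)      # inverse Mills ratio, as in gen lamda
    eytrunc = xb + sigma * lam           # E[y | x, y > 0]
    eycens = norm_cdf(z) * eytrunc       # E[y | x] with censoring at zero
    return xb, eytrunc, eycens

xb, eytrunc, eycens = tobit_means(2.5)
print(xb, round(eytrunc, 2), round(eycens, 2))
```

At xb = 0 the inverse Mills ratio equals phi(0)/Phi(0), about 0.798, so the truncated mean lies well above the latent mean of zero and the censored mean is half the truncated mean.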

.
. summarize

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           e |       200    76.96455    977.5598  -2906.972   2943.727
      lnwage |       200    2.792559    .6249093   .9039821   4.373462
       ystar |       200    369.5237    1163.722  -2852.944   3105.383
      ytrunc |       130    1047.602    712.0859   17.88135   3105.383
       ycens |       200    680.9414    761.3346          0   3105.383
-------------+--------------------------------------------------------
          dy |       200         .65    .4781665          0          1
          xb |       200    292.5592    624.9093  -1596.018   1873.462
       sigma |       200        1000           0       1000       1000
    capphixb |       200    .5983181    .2092614   .0552424   .9694977
       phixb |       200    .3271769    .0771531   .0689849   .3989196
-------------+--------------------------------------------------------
       lamda |       200    .6687834    .3533611   .0711553   2.020711
     eytrunc |       200    961.3426    283.2587    424.693   1944.617
      eycens |       200    631.3493    380.6074   23.46106   1885.302
.
. * Plot Figure 16.1 on page 531
. sort lnwage
. graph twoway (scatter ystar lnwage, msize(small)) /*
> */ (scatter eytrunc lnwage, c(l) msize(vtiny) clstyle(p3) clwidth(medthick)) /*
> */ (scatter eycens lnwage, c(l) msize(vtiny) clstyle(p2) clwidth(medthick)) /*
> */ (scatter xb lnwage, c(l) msize(vtiny) clstyle(p1) clwidth(medthick)), /*
> */ title("Tobit: Censored and Truncated Means") /*
> */ xtitle("Natural Logarithm of Wage", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Different Conditional Means", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend( label(1 "Actual Latent Variable") label(2 "Truncated Mean") /*
> */         label(3 "Censored Mean") label(4 "Uncensored Mean"))
. graph export ch16condmeans.wmf, replace
(file c:\Imbook\bwebpage\Section4\ch16condmeans.wmf written in Windows Metafile format)
.
. ********** (2) TOBIT MODEL ESTIMATION FOR THESE DATA **********
.
. * These are computations not reported in the book.
.
. * With only 200 observations the Heckman 2-step estimates given below
. * are very inefficient. To verify that they are consistent
. * increase the sample size e.g. set obs 20000
.
. * (2A) ESTIMATE THE VARIOUS MODELS
.
. *** UNCENSORED OLS REGRESSION
. * Possible here since for these generated data we actually know ystar
. * Yields consistent estimate. Expect slope = 1000 approximately.
. regress ystar lnwage, robust

                                                       Number of obs =     200
                                                       F(  1,   198) =   96.32
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.2944
                                                       Root MSE      =     980

------------------------------------------------------------------------------
             |               Robust
       ystar |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnwage |    1010.39   102.9518     9.81   0.000     807.3673    1213.413
       _cons |   -2452.05   303.2432    -8.09   0.000    -3050.051   -1854.049
------------------------------------------------------------------------------
. estimates store ols
. predict ystarols
.
. *** CENSORED OLS REGRESSION
. * Yields inconsistent estimates
. * From subsection 16.3.6 for slope coefficient OLS converges to p times b
. * where p is fraction of sample with positive values. Here 0.65*1000 = 650.
. regress ycens lnwage, robust

                                                       Number of obs =     200
                                                       F(  1,   198) =   84.20
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.2522
                                                       Root MSE      =  660.04

------------------------------------------------------------------------------
             |               Robust
       ycens |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnwage |   611.8108   66.67493     9.18   0.000     480.3267    743.2949
       _cons |  -1027.577   176.0776    -5.84   0.000    -1374.805   -680.3484
------------------------------------------------------------------------------
. estimates store censols
. predict ycensols
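
The "p times b" approximation quoted in the comments can be put next to the estimate of 611.81. The Python sketch below computes two versions: the one used in the log, with p equal to the sample fraction of positive observations (0.65), and a population analogue in which p = Pr(ystar > 0) is obtained from the DGP. The population calculation is an addition that relies on lnwage and ystar being jointly normal, which holds for this DGP by construction; it is not taken from the book.

```python
import math

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

b = 1000.0                                    # true slope in the DGP

# Version used in the log: p is the sample fraction with ycens > 0.
p_sample = 0.65
limit_sample = p_sample * b                   # 0.65 * 1000 = 650

# Population analogue (assumption: joint normality of lnwage and ystar).
mean_ystar = -2500.0 + b * 2.75               # E[ystar] = 250
sd_ystar = math.sqrt((b * 0.6) ** 2 + 1000.0 ** 2)
p_pop = norm_cdf(mean_ystar / sd_ystar)       # Pr(ystar > 0)
limit_pop = p_pop * b

print(round(limit_sample, 1), round(p_pop, 4), round(limit_pop, 1))
```

With only 200 observations the sample fraction of 0.65 sits somewhat above the population probability, and the censored OLS slope of 611.81 falls between the two approximations.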

.
. *** TRUNCATED OLS REGRESSION for POSITIVE WAGE
. * Yields inconsistent estimates
. * See subsection 16.3.6 for discussion.
. regress ytrunc lnwage, robust

                                                       Number of obs =     130
                                                       F(  1,   128) =   22.05
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.1261
                                                       Root MSE      =  668.28

------------------------------------------------------------------------------
             |               Robust
      ytrunc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnwage |   442.6319   94.26938     4.70   0.000     256.1038      629.16
       _cons |  -282.4444   282.9091    -1.00   0.320    -842.2285    277.3396
------------------------------------------------------------------------------
. estimates store truncols
. predict ytrunols
.
. *** CENSORED TOBIT MLE REGRESSION for HWAGE
. * Yields consistent estimates
. tobit ycens lnwage, ll(0)

Tobit estimates                                   Number of obs   =        200
                                                  LR chi2(1)      =      65.64
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.0285

------------------------------------------------------------------------------
       ycens |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnwage |   956.4877   116.8382     8.19   0.000     726.0879    1186.887
       _cons |  -2244.567   346.8778    -6.47   0.000    -2928.595   -1560.539
-------------+----------------------------------------------------------------
         _se |   896.6811   59.14988           (Ancillary parameter)
------------------------------------------------------------------------------

  Obs. summary:         70  left-censored observations at ycens<=0
                       130     uncensored observations
. estimates store censtobit
. predict ycenstob
.
. *** TRUNCATED TOBIT MLE REGRESSION for HWAGE
. * If done properly yields consistent estimates
. * Not sure how to do this in Stata
. * The obvious command is
. * tobit ytrunc lnwage, ll(0)
. * but this gives the same estimates as truncated OLS
.
. *** PROBIT REGRESSION for HWAGE
. * Yields consistent estimates for slope b/s = 1000/1000 = 1
. * but uses less information so expect less efficient than tobit
. probit dy lnwage
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:

Probit estimates                                  Number of obs   =        200
                                                  LR chi2(1)      =      48.39
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.1868

------------------------------------------------------------------------------
          dy |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnwage |   1.173851   .1870053     6.28   0.000     .8073277    1.540375
       _cons |  -2.795715    .508104    -5.50   0.000     -3.79158   -1.799849
------------------------------------------------------------------------------
. estimates store probit
. predict yprobit
(option p assumed; Pr(dy))
.
. *** HECKMAN 2-STEP ESTIMATOR DONE MANUALLY
. * Yields consistent estimates but less efficient than censored tobit MLE
. * The second stage standard errors will be incorrect
. probit dy lnwage
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:

Probit estimates                                  Number of obs   =        200
                                                  LR chi2(1)      =      48.39
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.1868

------------------------------------------------------------------------------
          dy |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnwage |   1.173851   .1870053     6.28   0.000     .8073277    1.540375
       _cons |  -2.795715    .508104    -5.50   0.000     -3.79158   -1.799849
------------------------------------------------------------------------------
. predict probity, xb
. gen invmills = normd(probity)/normprob(probity)
. summarize dy probity invmills

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          dy |       200         .65    .4781665          0          1
     probity |       200     .482335    .7335506  -1.734574    2.33808
    invmills |       200    .5867037    .3823083   .0261866   2.140342
. regress ytrunc lnwage invmills

      Source |       SS       df       MS              Number of obs =     130
-------------+------------------------------           F(  2,   127) =    9.41
       Model |  8440402.78     2  4220201.39           Prob > F      =  0.0002
    Residual |  56971158.9   127  448591.802           R-squared     =  0.1290
-------------+------------------------------           Adj R-squared =  0.1153
       Total |  65411561.6   129  507066.369           Root MSE      =  669.77

------------------------------------------------------------------------------
      ytrunc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnwage |   176.6468   418.2392     0.42   0.673    -650.9731    1004.267
    invmills |  -498.9958   760.3525    -0.66   0.513    -2003.596    1005.604
       _cons |   745.3069   1597.558     0.47   0.642    -2415.972    3906.586
------------------------------------------------------------------------------
. estimates store heck2step
. correlate lnwage invmills
(obs=200)

             |   lnwage invmills
-------------+------------------
      lnwage |   1.0000
    invmills |  -0.9745   1.0000

. * And more robust standard errors may be found by

. regress ytrunc lnwage invmills, robust

                                                       Number of obs =     130
                                                       F(  2,   127) =   13.96
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.1290
                                                       Root MSE      =  669.77

------------------------------------------------------------------------------
             |               Robust
      ytrunc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnwage |   176.6468   379.1739     0.47   0.642    -573.6699    926.9636
    invmills |  -498.9958   635.4917    -0.79   0.434    -1756.519    758.5276
       _cons |   745.3069   1431.149     0.52   0.603     -2086.68    3577.293
------------------------------------------------------------------------------
. estimates store heck2srobust
.
. *** HECKMAN 2-STEP ESTIMATOR DONE USING BUILT-IN HECKMAN COMMAND
. * Yields consistent estimates but less efficient than censored tobit MLE
. heckman ytrunc lnwage, select(lnwage) twostep

Heckman selection model -- two-step estimates   Number of obs      =       200
(regression model with sample selection)        Censored obs       =        70
                                                Uncensored obs     =       130

                                                Wald chi2(2)       =     39.57
                                                Prob > chi2        =    0.0000

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
ytrunc       |
      lnwage |   176.6469   425.0025     0.42   0.678    -656.3428    1009.636
       _cons |   745.3067   1617.583     0.46   0.645    -2425.098    3915.711
-------------+----------------------------------------------------------------
select       |
      lnwage |   1.173851   .1870053     6.28   0.000     .8073277    1.540375
       _cons |  -2.795715    .508104    -5.50   0.000     -3.79158   -1.799849
-------------+----------------------------------------------------------------
mills        |
      lambda |  -498.9957   760.5005    -0.66   0.512    -1989.549    991.5578
-------------+----------------------------------------------------------------
         rho |   -0.67419
       sigma |   740.1433
      lambda | -498.99575   760.5005
------------------------------------------------------------------------------
. estimates store heckman
. predict ystarhec, xb
. predict ytrunhec, ycond
. predict ycenshec, yexpected
. predict yinvmill, mills
. predict yprobsel, psel
. correlate lnwage yinvmill
(obs=200)

             |   lnwage yinvmill
-------------+------------------
      lnwage |   1.0000
    yinvmill |  -0.9745   1.0000

.
. * (2B) DISPLAY COEFFICIENT ESTIMATES
.
. * OLS estimates True model is -2500 + 1000*lnwage
. estimates table ols censols truncols, b(%10.2f) se(%10.2f) t stats(N ll)

----------------------------------------------------
    Variable |    ols        censols      truncols
-------------+--------------------------------------
      lnwage |    1010.39      611.81      442.63
             |     102.95       66.67       94.27
             |       9.81        9.18        4.70
       _cons |   -2452.05    -1027.58     -282.44
             |     303.24      176.08      282.91
             |      -8.09       -5.84       -1.00
-------------+--------------------------------------
           N |     200.00      200.00      130.00
          ll |   -1660.29    -1581.24    -1029.07
----------------------------------------------------
                                      legend: b/se/t
.
. * Tobit estimates True model is -2500 + 1000*lnwage
. estimates table censtobit probit, b(%10.2f) se(%10.2f) t stats(N ll)

----------------------------------------
    Variable | censtobit      probit
-------------+--------------------------
      lnwage |     956.49        1.17
             |     116.84        0.19
             |       8.19        6.28
         _se |     896.68
             |      59.15
             |      15.16
       _cons |   -2244.57       -2.80
             |     346.88        0.51
             |      -6.47       -5.50
-------------+--------------------------
           N |     200.00      200.00
          ll |   -1118.39     -105.30
----------------------------------------
                          legend: b/se/t
.
. * Tobit estimates using Heckman manual True model is -2500 + 1000*lnwage
. estimates table heck2step heck2srobust, b(%10.2f) se(%10.2f) t stats(N ll)

----------------------------------------
    Variable | heck2step    heck2sro~t
-------------+--------------------------
      lnwage |     176.65      176.65
             |     418.24      379.17
             |       0.42        0.47
    invmills |    -499.00     -499.00
             |     760.35      635.49
             |      -0.66       -0.79
       _cons |     745.31      745.31
             |    1597.56     1431.15
             |       0.47        0.52
-------------+--------------------------
           N |     130.00      130.00
          ll |   -1028.85    -1028.85
----------------------------------------
                          legend: b/se/t
.
. * Tobit estimates using Heckman built-in True model is -2500 + 1000*lnwage
. estimates table heckman, b(%10.2f) se(%10.2f) t stats(N ll)

---------------------------
    Variable |  heckman
-------------+-------------
ytrunc       |
      lnwage |     176.65
             |     425.00
             |       0.42
       _cons |     745.31
             |    1617.58
             |       0.46
-------------+-------------
select       |
      lnwage |       1.17
             |       0.19
             |       6.28
       _cons |      -2.80
             |       0.51
             |      -5.50
-------------+-------------
mills        |
      lambda |    -499.00
             |     760.50
             |      -0.66
-------------+-------------
Statistics   |
           N |     200.00
          ll |
---------------------------
             legend: b/se/t
.
. ********** (3) CLAD ESTIMATION FOR THESE DATA page 565 **********
.
. * Compare tobit MLE with censored least absolute deviations (CLAD) estimator
. * Gives results at end of section 16.9.3 page 565
.
. tobit ycens lnwage, ll(0)

Tobit estimates                                   Number of obs   =        200
                                                  LR chi2(1)      =      65.64
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.0285

------------------------------------------------------------------------------
       ycens |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lnwage |   956.4877   116.8382     8.19   0.000     726.0879    1186.887
       _cons |  -2244.567   346.8778    -6.47   0.000    -2928.595   -1560.539
-------------+----------------------------------------------------------------
         _se |   896.6811   59.14988           (Ancillary parameter)
------------------------------------------------------------------------------

  Obs. summary:         70  left-censored observations at ycens<=0
                       130     uncensored observations

. clad ycens lnwage, reps(100) ll(0)

Initial sample size = 200
Final sample size = 159
Pseudo R2 = .12380382

Variable |   Reps   Observed       Bias   Std. Err.   [95% Conf. Interval]
---------+-------------------------------------------------------------------
  lnwage |    100   838.2366   59.09127   165.7476    509.3575   1167.116  (N)
         |                                            666.9485   1298.217  (P)
         |                                             664.528   1247.371  (BC)
---------+-------------------------------------------------------------------
   const |    100  -1897.847  -184.2656   529.6713    -2948.83  -846.8643  (N)
         |                                           -3406.233  -1435.466  (P)
         |                                           -3406.233  -1435.466  (BC)
-----------------------------------------------------------------------------
                              N = normal, P = percentile, BC = bias-corrected
.
. log close
log: c:\Imbook\bwebpage\Section4\mma16p1tobit.txt
log type: text
closed on: 19 May 2005, 13:00:37
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section4\mma16p2mills.txt
log type: text
opened on: 19 May 2005, 13:02:12
.
. ********** OVERVIEW OF MMA16P2MILLS.DO **********
.
. * STATA Program
.
. * Presentation of Mills ratio
. * It provides
. * (1) Figure 16.2 (ch16millsratio.wmf)
. * This program requires no data
.
. ********** SETUP ***********
.
. set more off
. version 8
.
. ********** GENERATE DATA AND FUNCTIONS
.
. * Create density cdf Mills ratio for N[0,1]
. set obs 100
obs was 0, now 100
. gen c = 4*(50-_n)/100
. gen PHIc = norm(c)
. gen phic = normden(c)
. gen lamdac = phic/(1-PHIc)
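
The grid of cutoffs and the inverse Mills ratio lamdac = phi(c) / (1 - Phi(c)) generated above can be reproduced with a few lines of Python. This is a sketch for checking the numbers in double precision, not a replacement for the Stata program; the helper names are hypothetical, and small differences from the logged values can arise because Stata stored these variables as floats.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Same grid as gen c = 4*(50-_n)/100 with _n = 1, ..., 100.
cutoffs = [4 * (50 - n) / 100 for n in range(1, 101)]

# Inverse Mills ratio for truncation from below at c.
mills = [norm_pdf(c) / (1.0 - norm_cdf(c)) for c in cutoffs]

print(min(cutoffs), max(cutoffs))
print(round(min(mills), 6), round(max(mills), 6))
```

The ratio rises monotonically in the cutoff c, which is the pattern plotted in the figure produced by this program.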
.
. * Descriptive statistics
. summarize

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           c |       100        -.02     1.16046         -2       1.96
        PHIc |       100    .4952275     .338039   .0227501   .9750021
        phic |       100    .2386177    .1157086    .053991   .3989423
      lamdac |       100    .9284788    .7023349   .0552479   2.337835
.
. *********** FIGURE 16.2 page 540 ***********
.
. * This graph shows Mills ratio and cdf and density
. graph twoway (scatter lamdac c, c(l) msize(vtiny) clstyle(p1) clwidth(medthick)) /*
> */ (scatter PHIc c, c(l) msize(vtiny) clstyle(p3) clwidth(medthick)) /*
> */ (scatter phic c, c(l) msize(vtiny) clstyle(p2) clwidth(medthick)), /*
> */ title("Inverse Mills Ratio as Cutoff Varies") /*
> */ xtitle("Cutoff point c", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Inverse Mills, pdf and cdf", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend( label(1 "Inverse Mills ratio") label(2 "N[0,1] Cdf") label(3 "N[0,1] Density"))
. graph export ch16millsratio.wmf, replace
(file c:\Imbook\bwebpage\Section4\ch16millsratio.wmf written in Windows Metafile format)
.
. ********** CLOSE OUTPUT ***********
. log close
log: c:\Imbook\bwebpage\Section4\mma16p2mills.txt
log type: text
closed on: 19 May 2005, 13:02:15
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section4\mma16p3selection.txt
log type: text
opened on: 19 May 2005, 13:04:33
.
. ********** OVERVIEW OF MMA16P3SELECTION.DO **********
.
. * STATA Program
.
. * Selection models example
. * It provides
. * (1) Two-part model estimation (Table 16.1)
. * (2) Selection model estimation
. * (2A) ML estimates (Table 16.1)
. * (2B) Heckman 2-step estimates (Table 16.1)
. * (2C) Check for possible collinearity problems in Heckman 2-Step
.
. * To use this program you need health expenditure data in Stata data set
. * randdata.dta
.
. ********** SETUP **********
.
. set more off
. version 8.0
.
.
. * Essentially same data as in P. Deb and P.K. Trivedi (2002)
. * "The Structure of Demand for Medical Care: Latent Class versus
. * Two-Part Models", Journal of Health Economics, 21, 601-625
. * except that paper used different outcome (counts rather than $)
.
. * Each observation is for an individual over a year.
. * Individuals may appear in up to five years.
. * All available sample is used except only fee for service plans included.
. * In analysis here only year 2 is used so panel complications are avoided.
. * Clustering of individuals within household is ignored here.
.
. * Dependent variable is
. *   MED       med       Annual medical expenditures in constant dollars
. *                       excluding dental and outpatient mental
. *   LNMED     lnmeddol  Ln(Medical expenditures) given meddol > 0
. *                       Missing otherwise
. *   DMED      binexp    1 if medical expenditures > 0
.
. * Regressors are
. * - Health insurance measures
. *   LC        logc      log(coinsrate+1) where coinsurance rate is 0 to 100
. *   IDP       idp       1 if individual deductible plan
. *   LPI       lpi       log(annual participation incentive payment) or 0 if no payment
. *   FMDE      fmde      log(max(medical deductible expenditure)) if IDP=1 and MDE>1 or 0 otherwise.
. * - Health status measures
. *   NDISEASE  disea     number of chronic diseases
. *   PHYSLIM   physlm    1 if physical limitation
. *   HLTHG     hlthg     1 if good health
. *   HLTHF     hlthf     1 if fair health
. *   HLTHP     hlthp     1 if poor health (omitted is excellent)
. * - Socioeconomic characteristics
. *   LINC      linc      log of annual family income (in $)
. *   LFAM      lfam      log of family size
. *   EDUCDEC   educdec   years of schooling of decision maker
. *   AGE       xage      exact age
. *   BLACK     black     1 if black
. *   FEMALE    female    1 if female
. *   CHILD     child     1 if child
. *   FEMCHILD  fchild    1 if female child
.
. * If panel data used then clustering is on
.*
zper
person id
.
. ********** READ DATA **********
.
. use randdata.dta, clear
. sum
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
        plan |     20190    11.17553    3.976751          1         19
        site |     20190    3.298811     1.80382          1          6
       coins |     20190     26.3056    36.40386          0        100
    tookphys |     20190    .5974245    .4904288          0          1
        year |     20190    2.420109    1.217141          1          5
-------------+--------------------------------------------------------
        zper |     20190    357965.5    180868.1     125024     632167
       black |     20190    .1814983    .3827071          0          1
      income |     20190    8037.409    4058.371          0   29237.54
        xage |     20190    25.72233    16.76945          0   64.27515
      female |     20190    .5170381     .499722          0          1
-------------+--------------------------------------------------------
     educdec |     20186    11.96681    2.806255          0         25
        time |     20190    .9989561    .0259741   .0767123          1
     outpdol |     20190    51.12649    94.92627          0   2599.902
     drugdol |     20190     13.1687    33.76212          0   706.3979
     suppdol |     20190      6.8024    21.39346          0    1009.47
-------------+--------------------------------------------------------
     mentdol |     20190    6.870347    58.41298          0   1340.834
      inpdol |     20190    100.4694    655.6215          0   38649.81
      meddol |     20190    171.5679    698.2015          0   39182.02
      totadm |     20190    .1127291    .4111857          0          8
      inpmis |     20190    .0039624     .062824          0          1
-------------+--------------------------------------------------------
     mentvis |     20190    .4322437    3.430789          0         62
       mdvis |     20190    2.860426    4.504365          0         77
    notmdvis |     20190    .6855869    3.763543          0        109
         num |     20190    3.954235    1.853034          1         14
         mhi |     20190    76.55584    12.50224       12.2        100
-------------+--------------------------------------------------------
       disea |     20190    11.24449    6.741449          0       58.6
      physlm |     20190    .1235003    .3220164          0          1
      ghindx |     14967    73.09055    15.99371        3.7        100
      mdeoff |     20185    417.8422    384.1199          0       1000
       pioff |     20185     446.677     367.466          0    1291.68
-------------+--------------------------------------------------------
       child |     20190    .4013373    .4901812          0          1
      fchild |     20190    .1937098    .3952139          0          1
        lfam |     20190    1.248156     .539301          0   2.639057
         lpi |     20190    4.707894     2.69784          0   7.163699
         idp |     20190    .2599802    .4386343          0          1
-------------+--------------------------------------------------------
        logc |     20190    2.383342    2.041776          0   4.564348
        fmde |     20190    4.029524    3.471353          0   8.294049
       hlthg |     20190    .3620109    .4805938          0          1
       hlthf |     20190     .077266    .2670196          0          1
       hlthp |     20190    .0149579    .1213874          0          1
-------------+--------------------------------------------------------
     xghindx |     20190     73.2375     14.2332        3.7        100
        linc |     20190    8.708265    1.228309          0   10.28324
        lnum |     20190    1.248156     .539301          0   2.639057
    lnmeddol |     15737    4.109318    1.484654  -.8495329   10.57597
      binexp |     20190    .7794453     .414631          0          1
.
. /* Describe and summarize the original data.
> describe
> summarize
> * The original data are a panel.
> * The following summarizes panel features for completeness
> iis zper
> tis year
> xtdes
> xtsum meddol lnmeddol binexp
> */
.
. ********** DATA SELECTION AND TRANSFORMATIONS **********
.
. * Use only Year 2
. keep if year==2
.
. * educdec is missing for one observation
. drop if educdec==.
(1 observation deleted)
.
. * rename variables
. rename meddol MED
. rename binexp DMED
. rename lnmeddol LNMED
. rename linc LINC
. rename lfam LFAM
. rename educdec EDUCDEC
. rename xage AGE
. rename female FEMALE
. rename child CHILD
. rename fchild FEMCHILD
. rename black BLACK
. rename disea NDISEASE
. rename physlm PHYSLIM
. rename hlthg HLTHG
. rename hlthf HLTHF
. rename hlthp HLTHP
. rename idp IDP
. rename logc LC
. rename lpi LPI
. rename fmde FMDE
.
. * Define the regressor list, which commands can refer to as $XLIST
. global XLIST LC IDP LPI FMDE PHYSLIM NDISEASE HLTHG HLTHF HLTHP /*
>     */ LINC LFAM EDUCDEC AGE FEMALE CHILD FEMCHILD BLACK
.
. * Summarize the dependents and regressors
. sum MED DMED LNMED $XLIST
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
         MED |      5574    169.7247    802.8303          0   39182.02
        DMED |      5574    .7680301    .4221277          0          1
       LNMED |      4281    4.069462    1.499372  -.5343859   10.57597
          LC |      5574    2.420739    2.043883          0   4.564348
         IDP |      5574     .261751    .4396272          0          1
-------------+--------------------------------------------------------
         LPI |      5574    4.726834    2.681354          0   7.163699
        FMDE |      5574    4.065015    3.450558          0   8.294049
     PHYSLIM |      5574    .1242463    .3233768          0          1
    NDISEASE |      5574    11.20526    6.788959          0       58.6
       HLTHG |      5574    .3649085    .4814477          0          1
-------------+--------------------------------------------------------
       HLTHF |      5574    .0782203     .268542          0          1
       HLTHP |      5574    .0156082     .123965          0          1
        LINC |      5574    8.696929    1.220592          0   10.28324
        LFAM |      5574    1.241407    .5403965          0   2.564949
     EDUCDEC |      5574     11.9466    2.837492          0         25
-------------+--------------------------------------------------------
         AGE |      5574    25.57613    16.73011   .0253251   63.27515
      FEMALE |      5574    .5184787    .4997032          0          1
       CHILD |      5574    .4050951    .4909545          0          1
    FEMCHILD |      5574    .1955508    .3966597          0          1
       BLACK |      5574    .1859852    .3860055          0          1
.
. * Detailed summary shows that MED>0 very skewed whereas LNMED is not
. sum MED LNMED if MED>0, detail
               medical exp excl outpatient men
-------------------------------------------------------------
      Percentiles      Smallest
 1%     2.109705       .5860291
 5%     5.752914       .6630728
10%     9.376465       .6770833       Obs                4281
25%     21.31435       .6770833       Sum of Wgt.        4281

50%     52.64357                      Mean            220.987
                        Largest       Std. Dev.      909.9021
75%     136.4518       12044.11
90%     453.8059       17465.98       Variance       827921.9
95%      904.328       18641.98       Skewness       24.00829
99%     2666.309       39182.02       Kurtosis        873.379

                            LNMED
-------------------------------------------------------------
      Percentiles      Smallest
 1%      .746548      -.5343859
 5%     1.749707      -.4108706
10%     2.238203      -.3899609       Obs                4281
25%     3.059381      -.3899609       Sum of Wgt.        4281

50%     3.963544                      Mean           4.069462
                        Largest       Std. Dev.      1.499372
75%     4.915971       9.396331
90%      6.11767        9.76801       Variance       2.248116
95%     6.807192       9.833171       Skewness        .347695
99%     7.888451       10.57597       Kurtosis        3.28909
.
. outfile DMED MED LNMED LC IDP LPI FMDE PHYSLIM NDISEASE HLTHG HLTHF HLTHP /*
>     */ LINC LFAM EDUCDEC AGE FEMALE CHILD FEMCHILD BLACK /*
>     */ using mma16p3selection.asc, replace
.
. ****************** CHAPTER 16.6 REGRESSION ANALYSIS **************
.
. * The analysis below models log expenditure (lny), not expenditure (y)
. * where here y = MED and lny = LNMED.
.
. * This makes regular tobit difficult as it is not clear
. * what the censoring/truncation point is since ln(0) = -infinity
. * Also note that some LNMED<0 as 0<MED<1 is possible.
. * So just do two-part model and sample selection model.
.
. * Interested in comparing MED not LNMED at end of day.
. * So use
. * If lny = xb + u, u ~ N[0, s^2] for y > 0
. * Then E[y] = exp(xb + (s^2)/2)            for y > 0
. * and E[y] = Pr[y>0]*exp(xb + (s^2)/2) for all y
.
. * The models estimated are
. * (1) Two-part model using
. * (a) probit for whether positive y
. * (b) regress with lny as dependent variable
. * (2) Sample selection model similar to (1)
. * except that inverse Mills ratio appears in (b), estimated by
. * (a) MLE
. * (b) Heckman 2-step
.
. * Additionally censored tobit and truncated tobit commands in levels
. * are given below for completeness.
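
The retransformation from logs to levels described in the comments above can be sketched outside Stata. The Python snippet below is only an illustration and is not part of the original program: the function names `mean_given_positive` and `unconditional_mean` are hypothetical, the inputs are made-up values, and the formulas assume normal, homoskedastic errors in the log equation.

```python
import math

def mean_given_positive(xb, s):
    """E[y | y > 0] = exp(xb + s^2/2) when ln y = xb + u, u ~ N[0, s^2]."""
    return math.exp(xb + 0.5 * s ** 2)

def unconditional_mean(prob_positive, xb, s):
    """E[y] = Pr[y > 0] * E[y | y > 0] over all observations."""
    return prob_positive * mean_given_positive(xb, s)

# Illustrative inputs only (not estimates taken from the log).
print(mean_given_positive(4.0, 1.4))
print(unconditional_mean(0.77, 4.0, 1.4))
```

The s^2/2 term matters: with s near 1.4 it roughly multiplies the naive prediction exp(xb) by exp(0.98), so omitting it would understate predicted expenditure substantially.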
.
. ************ (1) TWO-PART MODEL ************
.
. * Two-part model: binary probit and then lognormal for expenditures
.
. * First part: probit for MED > 0
. probit DMED $XLIST            /* global XLIST defined earlier */
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:

Probit estimates                                  Number of obs   =       5574
                                                  LR chi2(17)     =     657.11
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.1088
------------------------------------------------------------------------------
DMED | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
LC | -.118708 .0269005 -4.41 0.000 -.1714319 -.065984
IDP | -.1279483 .0522351 -2.45 0.014 -.2303272 -.0255693
LPI | .0283091 .0088793 3.19 0.001 .010906 .0457121
FMDE | .0075319 .0161584 0.47 0.641 -.024138 .0392018
PHYSLIM | .2732013 .0743761 3.67 0.000 .1274268 .4189758
NDISEASE | .0224861 .0035958 6.25 0.000 .0154384 .0295338
HLTHG | .0387516 .0438545 0.88 0.377 -.0472016 .1247049
HLTHF | .1920062 .0836688 2.29 0.022 .0280185 .355994
HLTHP | .6397294 .2126322 3.01 0.003 .222978 1.056481
LINC | .0518413 .0168128 3.08 0.002 .0188889 .0847938
LFAM | -.0335599 .041728 -0.80 0.421 -.1153452 .0482253
EDUCDEC | .036307 .0076536 4.74 0.000 .0213062 .0513078
AGE | .0002631 .0021606 0.12 0.903 -.0039715 .0044978
FEMALE | .4451035 .054292 8.20 0.000 .3386932 .5515138
CHILD | .111489 .0808338 1.38 0.168 -.0469424 .2699203
FEMCHILD | -.4512845 .0799219 -5.65 0.000 -.6079284 -.2946405
BLACK | -.6057367 .0523148 -11.58 0.000 -.7082718 -.5032017
_cons | -.271605 .1877345 -1.45 0.148 -.6395579 .0963478
------------------------------------------------------------------------------
. estimates store twoparta      /* version 8 command for later table */
. scalar llprobit = e(ll)       /* Log-likelihood */
. predict probsel2part, p       /* Pr[y>0] = PHI(x'b) */
. predict xbprobit, xb          /* x'b */
.
. * Second part: OLS for log of positive values
. * Here LNMED is used, where LNMED is missing if MED = 0
. regress LNMED $XLIST
      Source |       SS       df       MS              Number of obs =    4281
-------------+------------------------------           F( 17,  4263) =   39.69
       Model |  1314.70352    17   77.335501           Prob > F      =  0.0000
    Residual |  8307.23358  4263  1.94868252           R-squared     =  0.1366
-------------+------------------------------           Adj R-squared =  0.1332
       Total |   9621.9371  4280  2.24811614           Root MSE      =   1.396
------------------------------------------------------------------------------
LNMED | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
LC | -.0164006 .0312495 -0.52 0.600 -.0776658 .0448647
IDP | -.0789998 .061796 -1.28 0.201 -.2001522 .0421526
LPI | .0027057 .0097138 0.28 0.781 -.0163383 .0217498
FMDE | -.0306123 .0180695 -1.69 0.090 -.0660379 .0048134
PHYSLIM | .2619829 .0687459 3.81 0.000 .1272052 .3967607
NDISEASE | .0198922 .0034441 5.78 0.000 .01314 .0266444
HLTHG | .1438008 .0483778 2.97 0.003 .0489553 .2386464
HLTHF | .3642649 .0881004 4.13 0.000 .1915422 .5369876
HLTHP | .7865099 .1700502 4.63 0.000 .453123 1.119897
LINC | .0931988 .0217849 4.28 0.000 .0504891 .1359085
LFAM | -.1408033 .046203 -3.05 0.002 -.2313852 -.0502214
EDUCDEC | -5.66e-06 .0082599 -0.00 0.999 -.0161993 .016188
AGE | .0055602 .002251 2.47 0.014 .0011471 .0099733
FEMALE | .3442509 .0571573 6.02 0.000 .2321929 .456309
CHILD | -.2677921 .0904307 -2.96 0.003 -.4450833 -.0905009
FEMCHILD | -.3512207 .0896517 -3.92 0.000 -.5269847 -.1754568
BLACK | -.1964412 .0677021 -2.90 0.004 -.3291725 -.0637099
_cons | 3.077182 .2213448 13.90 0.000 2.64323 3.511133
------------------------------------------------------------------------------
. estimates store twopartb
. scalar lllognormal = e(ll)    /* Log-likelihood */
. scalar sols = e(rmse)         /* Standard error of the regression */
. predict pLNMED, xb            /* Predicted mean from OLS */
. predict rLNMED, residuals
.
. * Check for normal errors
. hettest
Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of LNMED

         chi2(1)      =    17.11
         Prob > chi2  =   0.0000
. * imtest
. sktest LNMED rLNMED
                   Skewness/Kurtosis tests for Normality
                                                 ------- joint ------
    Variable |  Pr(Skewness)   Pr(Kurtosis)  adj chi2(2)    Prob>chi2
-------------+-------------------------------------------------------
       LNMED |      0.000          0.001          .          0.0000
      rLNMED |      0.000          0.000          .          0.0000
.
. * Create two-part model log-likelihood
. scalar lltwopart = llprobit + lllognormal
. di "lltwopart = " lltwopart
lltwopart = -10184.076
.
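
As a rough cross-check on the two-part log-likelihood, the log-normal component can be recomputed from the residual sum of squares in the OLS output. The sketch below assumes that `e(ll)` after `regress` is the Gaussian log-likelihood evaluated at the ML variance estimate RSS/n; the function name `ols_loglik` is hypothetical. Note that this is the likelihood of ln(MED), with no Jacobian term for the level of MED.

```python
import math

def ols_loglik(rss, n):
    """Gaussian log-likelihood of a linear regression at the ML variance
    estimate rss/n: -(n/2) * (ln(2*pi) + ln(rss/n) + 1)."""
    return -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1.0)

# Residual sum of squares and sample size from the regress output above.
ll_lognormal = ols_loglik(8307.23358, 4281)

# Probit log-likelihood implied by the reported two-part total.
ll_probit_implied = -10184.076 - ll_lognormal

print(round(ll_lognormal, 1), round(ll_probit_implied, 1))
```

The implied probit value can be compared with the pseudo R-squared and LR statistic in the probit header as a consistency check.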
. * Create predictions of level of expenditures not logs
. * E[y] = exp(pLNMED + (s^2)/2) for y > 0
. * and E[y] = Pr[y>0]*exp(xb + (s^2)/2) for all y
. gen pMEDpos2part = exp(pLNMED + (sols^2)/2)
. gen pMEDall2part = probsel2part*pMEDpos2part
.
. * Compare predictions to actual for MED > 0
. sum LNMED pLNMED MED pMEDpos2part if MED > 0
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
       LNMED |      4281    4.069462    1.499372  -.5343859   10.57597
      pLNMED |      4281    4.069462    .5542326   2.298199   6.482164
         MED |      4281     220.987    909.9021   .5860291   39182.02
pMEDpos2part |      4281     183.462    126.0213   26.37827   1731.088
. corr LNMED pLNMED MED pMEDpos2part if MED > 0
(obs=4281)
             |    LNMED   pLNMED      MED pMEDpo~t
-------------+------------------------------------
       LNMED |   1.0000
      pLNMED |   0.3696   1.0000
         MED |   0.4560   0.1576   1.0000
pMEDpos2part |   0.3387   0.9204   0.1669   1.0000
.
. * Compare predictions to actual including zeroes
. sum MED pMEDall2part DMED probsel2part
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
         MED |      5574    169.7247    802.8303          0   39182.02
pMEDall2part |      5574     140.966    120.2022   4.880651   1729.783
        DMED |      5574    .7680301    .4221277          0          1
probsel2part |      5574    .7678377    .1457464   .1526731    .999246
. corr MED pMEDall2part DMED probsel2part
(obs=5574)
             |      MED pMEDal~t     DMED probse~t
-------------+------------------------------------
         MED |   1.0000
pMEDall2part |   0.1772   1.0000
        DMED |   0.1162   0.2158   1.0000
probsel2part |   0.1031   0.6380   0.3467   1.0000
.
. ************ (2) SELECTION MODEL ************
.
. * Sample selection model for log expenditures
. * Selection equation:
. *    Observe y = y* if I = z'a + u > 0   u ~ N[0,1]
. * Regression equation:
. *    y* = x'b + v   v ~ N[0,s^2] and Corr[u,v]=rho
.
. * (2A) MLE for sample selection model
. heckman LNMED $XLIST, select (DMED = $XLIST)
Heckman selection model                         Number of obs      =      5574
                                                Censored obs       =      1293
                                                Uncensored obs     =      4281

                                                Wald chi2(17)      =    805.17
                                                Prob > chi2        =    0.0000
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
LNMED |
LC | -.0760236 .0337456 -2.25 0.024 -.1421638 -.0098833
IDP | -.1497199 .0661379 -2.26 0.024 -.2793478 -.020092
LPI | .01493 .0105015 1.42 0.155 -.0056526 .0355127
FMDE | -.023522 .0194745 -1.21 0.227 -.0616913 .0146474
PHYSLIM | .3548628 .0755425 4.70 0.000 .2068023 .5029233
NDISEASE | .0286474 .0037972 7.54 0.000 .0212051 .0360897
HLTHG | .1559173 .0521775 2.99 0.003 .0536513 .2581834
HLTHF | .4451223 .0955263 4.66 0.000 .2578942 .6323505
HLTHP | .9986065 .1878791 5.32 0.000 .6303701 1.366843
LINC | .1214009 .0230845 5.26 0.000 .0761562 .1666457
LFAM | -.1583018 .0497464 -3.18 0.001 -.255803 -.0608005
EDUCDEC | .0175951 .0090183 1.95 0.051 -.0000805 .0352707
AGE | .0057376 .0024426 2.35 0.019 .0009501 .0105251
FEMALE | .5503441 .0633313 8.69 0.000 .4262171 .6744711
CHILD | -.1976875 .097398 -2.03 0.042 -.3885841 -.006791
FEMCHILD | -.5653227 .0975292 -5.80 0.000 -.7564765 -.374169
BLACK | -.5358684 .0749191 -7.15 0.000 -.6827072 -.3890296
_cons | 2.107745 .2442285 8.63 0.000 1.629066 2.586424
-------------+----------------------------------------------------------------
DMED |
LC | -.1068027 .0264766 -4.03 0.000 -.1586959 -.0549096
IDP | -.108769 .0509938 -2.13 0.033 -.2087149 -.0088231
LPI | .0294804 .0086214 3.42 0.001 .0125827 .0463781
FMDE | .0007403 .0158738 0.05 0.963 -.0303719 .0318524
PHYSLIM | .2848256 .0722656 3.94 0.000 .1431877 .4264635
NDISEASE | .0210805 .0034967 6.03 0.000 .0142271 .027934
HLTHG | .0576901 .042799 1.35 0.178 -.0261945 .1415747
HLTHF | .2237238 .0814547 2.75 0.006 .0640755 .3833721
HLTHP | .7984291 .2048087 3.90 0.000 .3970114 1.199847
LINC | .0553122 .0166179 3.33 0.001 .0227416 .0878827
LFAM | -.031201 .0402985 -0.77 0.439 -.1101846 .0477827
EDUCDEC | .031499 .0074987 4.20 0.000 .0168018 .0461961
AGE | -.0006072 .0021064 -0.29 0.773 -.0047357 .0035212
FEMALE | .4093059 .0532548 7.69 0.000 .3049283 .5136834
CHILD | .0530643 .0786326 0.67 0.500 -.1010527 .2071813
FEMCHILD | -.3953421 .0783811 -5.04 0.000 -.5489662 -.241718
BLACK | -.5831049 .0520534 -11.20 0.000 -.6851277 -.4810822
_cons | -.2141574 .1842169 -1.16 0.245 -.5752159 .146901
-------------+----------------------------------------------------------------
/athrho | .9408188 .0736303 12.78 0.000 .796506 1.085132
/lnsigma | .4511091 .0177227 25.45 0.000 .4163732 .485845
-------------+----------------------------------------------------------------
rho | .7355982 .0337886 .6620789 .7950943
sigma | 1.570053 .0278256 1.516452 1.625548
lambda | 1.154928 .0702985 1.017145 1.29271
------------------------------------------------------------------------------
LR test of indep. eqns. (rho = 0): chi2(1) = 27.93 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
. estimates store heckmle
. scalar llhecklogs = e(ll)     /* Log-likelihood */
. scalar shml = e(sigma)        /* s where Var[v]=s^2 */
.
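
The ancillary parameters reported by the MLE are transformations of the two estimated quantities /athrho and /lnsigma. The short Python check below, which is not part of the original program, recovers rho, sigma and lambda from the values printed in the output above, assuming the usual definitions athrho = atanh(rho), lnsigma = ln(sigma) and lambda = rho*sigma.

```python
import math

athrho = 0.9408188    # /athrho from the heckman MLE output
lnsigma = 0.4511091   # /lnsigma from the heckman MLE output

rho = math.tanh(athrho)      # inverse of athrho = atanh(rho)
sigma = math.exp(lnsigma)    # inverse of lnsigma = ln(sigma)
lam = rho * sigma            # coefficient on the inverse Mills ratio

print(rho, sigma, lam)
```

Estimating atanh(rho) and ln(sigma) rather than rho and sigma directly keeps the optimizer unconstrained while guaranteeing |rho| < 1 and sigma > 0.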
. * Save the Stata predictions:
. * Distinguish between ystar=E[y*], ypos=E[y|I>0] and yall=E[y]
. predict ystarhml, xb          /* E[y*] = x'b */
. predict yposhml, ycond        /* E[y|I>0] = E[y*|I>0] = x'b+c*lambda(z'a) */
. predict invmillhml, mills     /* lambda(z'a) = phi(z'a)/PHI(z'a) */
. predict probselhml, psel      /* PHI(z'a) */
. * The following not appropriate here as it sets y=0 if I<0
. * whereas here data is in logs and y=ln(MED)=-infinity if I<0
. predict yallhml, yexpected    /* E[y] = PHI(z'a)*E[y|I>0] */
. sum ystarhml yposhml invmillhml probselhml yallhml
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
    ystarhml |      5574    3.543161    .7462608   .9570364    6.92732
     yposhml |      5574    4.000607    .5482433    2.50515    6.92955
  invmillhml |      5574     .396082    .2165116   .0019309   1.476998
  probselhml |      5574    .7674107    .1404707   .1737047   .9994534
     yallhml |      5574    3.124032    .9125439   .4932862   6.925763
.
. * E[y] = exp(ypos + (s^2)/2) for y > 0 Var[v]=s^2
. * and E[y] = Pr[y>0]*exp(ypos + (s^2)/2) for all y
. gen pMEDposhml = exp(yposhml + (shml^2)/2)
. gen pMEDallhml = probselhml*pMEDposhml
.
. sum LNMED yposhml MED pMEDposhml if MED > 0
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
       LNMED |      4281    4.069462    1.499372  -.5343859   10.57597
     yposhml |      4281    4.071295    .5573439    2.50515    6.92955
         MED |      4281     220.987    909.9021   .5860291   39182.02
  pMEDposhml |      4281    240.4096    185.0424   42.00053    3505.48
. corr LNMED yposhml MED pMEDpos2part if MED > 0
(obs=4281)
             |    LNMED  yposhml      MED pMEDpo~t
-------------+------------------------------------
       LNMED |   1.0000
     yposhml |   0.3690   1.0000
         MED |   0.4560   0.1592   1.0000
pMEDpos2part |   0.3387   0.9343   0.1669   1.0000
.
. sum MED pMEDallhml DMED probselhml
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
         MED |      5574    169.7247    802.8303          0   39182.02
  pMEDallhml |      5574    184.5571    174.1649   8.814864   3503.564
        DMED |      5574    .7680301    .4221277          0          1
  probselhml |      5574    .7674107    .1404707   .1737047   .9994534
. corr MED pMEDallhml DMED probselhml
(obs=5574)
             |      MED pMEDal~l     DMED probse~l
-------------+------------------------------------
         MED |   1.0000
  pMEDallhml |   0.1734   1.0000
        DMED |   0.1162   0.2015   1.0000
  probselhml |   0.1074   0.6092   0.3468   1.0000
.
. * (2B) Heckman 2 step for sample selection model
. * Same as MLE except add option twostep in heckman command
. heckman LNMED $XLIST, select (DMED = $XLIST) twostep
Heckman selection model -- two-step estimates   Number of obs      =      5574
                                                Censored obs       =      1293
                                                Uncensored obs     =      4281

                                                Wald chi2(34)      =    944.44
                                                Prob > chi2        =    0.0000
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
LNMED |
LC | -.0279209 .039754 -0.70 0.482 -.1058373 .0499955
IDP | -.0922898 .0680191 -1.36 0.175 -.2256048 .0410252
LPI | .0052225 .0111057 0.47 0.638 -.0165442 .0269893
FMDE | -.0295212 .0182427 -1.62 0.106 -.0652762 .0062339
PHYSLIM | .2814948 .0804535 3.50 0.000 .1238088 .4391808
NDISEASE | .021617 .0050395 4.29 0.000 .0117398 .0314943
HLTHG | .1474026 .0490497 3.01 0.003 .051267 .2435381
HLTHF | .3821683 .0961284 3.98 0.000 .19376 .5705765
HLTHP | .833294 .1974488 4.22 0.000 .4463015 1.220287
LINC | .0990973 .0251548 3.94 0.000 .0497948 .1483998
LFAM | -.1441358 .0468074 -3.08 0.002 -.2358766 -.052395
EDUCDEC | .0033639 .0109501 0.31 0.759 -.0180979 .0248257
AGE | .0055556 .0022549 2.46 0.014 .0011361 .0099751
FEMALE | .3846323 .1032799 3.72 0.000 .1822074 .5870573
CHILD | -.2565136 .0936771 -2.74 0.006 -.4401173 -.0729098
FEMCHILD | -.392146 .125089 -3.13 0.002 -.637316 -.146976
BLACK | -.2633649 .1577542 -1.67 0.095 -.5725574 .0458276
_cons | 2.882514 .4698969 6.13 0.000 1.961533 3.803495
-------------+----------------------------------------------------------------
DMED |
LC | -.118708 .0269005 -4.41 0.000 -.1714319 -.065984
IDP | -.1279483 .0522351 -2.45 0.014 -.2303272 -.0255693
LPI | .0283091 .0088793 3.19 0.001 .010906 .0457121
FMDE | .0075319 .0161584 0.47 0.641 -.024138 .0392018
PHYSLIM | .2732013 .0743761 3.67 0.000 .1274268 .4189758
NDISEASE | .0224861 .0035958 6.25 0.000 .0154384 .0295338
HLTHG | .0387516 .0438545 0.88 0.377 -.0472016 .1247049
HLTHF | .1920062 .0836688 2.29 0.022 .0280185 .355994
HLTHP | .6397294 .2126322 3.01 0.003 .222978 1.056481
LINC | .0518413 .0168128 3.08 0.002 .0188889 .0847938
LFAM | -.0335599 .041728 -0.80 0.421 -.1153452 .0482253
EDUCDEC | .036307 .0076536 4.74 0.000 .0213062 .0513078
AGE | .0002631 .0021606 0.12 0.903 -.0039715 .0044978
FEMALE | .4451035 .054292 8.20 0.000 .3386932 .5515138
CHILD | .111489 .0808338 1.38 0.168 -.0469424 .2699203
FEMCHILD | -.4512845 .0799219 -5.65 0.000 -.6079284 -.2946405
BLACK | -.6057367 .0523148 -11.58 0.000 -.7082718 -.5032017
_cons | -.271605 .1877345 -1.45 0.148 -.6395579 .0963478
-------------+----------------------------------------------------------------
mills |
lambda | .2358048 .5018117 0.47 0.638 -.7477282 1.219338
-------------+----------------------------------------------------------------
rho | 0.16833
sigma | 1.4008246
lambda | .23580476 .5018117
------------------------------------------------------------------------------
. estimates store heck2step
. scalar sh2s = e(sigma)        /* s where Var[v]=s^2 */
.
. * Save the Stata predictions:
. * Distinguish between ystar=E[y*], ypos=E[y|I>0] and yall=E[y]
. predict ystarh2s, xb          /* E[y*] = x'b */
. predict yposh2s, ycond        /* E[y|I>0] = E[y*|I>0] = x'b+c*lambda(z'a) */
. predict invmillh2s, mills     /* lambda(z'a) = phi(z'a)/PHI(z'a) */
. predict probselh2s, psel      /* PHI(z'a) */
. * The following not appropriate here as it sets y=0 if I<0
. * whereas here data is in logs and y=ln(MED)=-infinity if I<0
. predict yallh2s, yexpected    /* E[y] = PHI(z'a)*E[y|I>0] */
. sum ystarh2s yposh2s invmillh2s probselh2s yallh2s
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
    ystarh2s |      5574    3.904371     .589474   2.005307   6.573941
     yposh2s |      5574    3.997637    .5516546   2.337985   6.574553
  invmillh2s |      5574    .3955256    .2253329    .002599   1.545223
  probselh2s |      5574    .7678377    .1457464   .1526731    .999246
     yallh2s |      5574    3.124344    .9213697   .4450346   6.569597
.
. * E[y] = exp(ypos + (s^2)/2) for y > 0 Var[v]=s^2
. * and E[y] = Pr[y>0]*exp(ypos + (s^2)/2) for all y
. gen pMEDposh2s = exp(yposh2s + (sh2s^2)/2)
. gen pMEDallh2s = probselh2s*pMEDposh2s
.
. sum LNMED yposh2s MED pMEDposh2s if MED > 0
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
       LNMED |      4281    4.069462    1.499372  -.5343859   10.57597
     yposh2s |      4281    4.069462    .5543231   2.337985   6.574553
         MED |      4281     220.987    909.9021   .5860291   39182.02
  pMEDposh2s |      4281    184.9993    129.5432   27.63657   1911.624
. corr LNMED yposh2s MED pMEDpos2part if MED > 0
(obs=4281)
             |    LNMED  yposh2s      MED pMEDpo~t
-------------+------------------------------------
       LNMED |   1.0000
     yposh2s |   0.3697   1.0000
         MED |   0.4560   0.1584   1.0000
pMEDpos2part |   0.3387   0.9240   0.1669   1.0000
.
. sum MED pMEDallh2s DMED probselh2s
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
         MED |      5574    169.7247    802.8303          0   39182.02
  pMEDallh2s |      5574    142.1438    123.2964   5.272963   1910.182
        DMED |      5574    .7680301    .4221277          0          1
  probselh2s |      5574    .7678377    .1457464   .1526731    .999246
. corr MED pMEDallh2s DMED probselh2s
(obs=5574)
             |      MED pMEDa~2s     DMED probs~2s
-------------+------------------------------------
         MED |   1.0000
  pMEDallh2s |   0.1772   1.0000
        DMED |   0.1162   0.2132   1.0000
  probselh2s |   0.1031   0.6298   0.3467   1.0000
.
. * (2C) Check for possible collinearity problems in Heckman 2-Step
.
. * Check variation in inverse mills ratio and related measures
. gen zprimea = invnorm(probselh2s)
. gen zprimeasq = zprimea*zprimea
. sum invmillh2s probselh2s zprimea ystarh2s
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
  invmillh2s |      5574    .3955256    .2253329    .002599   1.545223
  probselh2s |      5574    .7678377    .1457464   .1526731    .999246
     zprimea |      5574    .8217315    .5175712  -1.025036    3.17314
    ystarh2s |      5574    3.904371     .589474   2.005307   6.573941
. sum invmillh2s probselh2s zprimea ystarh2s, detail
                        Mills' ratio
-------------------------------------------------------------
      Percentiles      Smallest
 1%     .0443035        .002599
 5%     .1081773       .0065964
10%     .1479522       .0074306       Obs                5574
25%     .2404661       .0111331       Sum of Wgt.        5574

50%     .3522253                      Mean           .3955256
                        Largest       Std. Dev.      .2253329
75%     .5044507        1.42819
90%     .7088638        1.42819       Variance       .0507749
95%      .863094       1.466996       Skewness       1.105156
99%     1.080771       1.545223       Kurtosis       4.403004

                          Pr(DMED)
-------------------------------------------------------------
      Percentiles      Smallest
 1%      .338421       .1526731
 5%     .4598847       .1769602
10%     .5570307       .1900167       Obs                5574
25%     .6946899       .1900167       Sum of Wgt.        5574

50%     .7984734                      Mean           .7678377
                        Largest       Std. Dev.      .1457464
75%     .8717066       .9962835
90%      .927941       .9976236       Variance        .021242
95%     .9502093       .9979156       Skewness      -1.048826
99%     .9823552        .999246       Kurtosis       3.903288

                           zprimea
-------------------------------------------------------------
      Percentiles      Smallest
 1%    -.4167765      -1.025036
 5%    -.1007243      -.9270119
10%     .1434453      -.8778346       Obs                5574
25%     .5091883      -.8778346       Sum of Wgt.        5574

50%     .8361809                      Mean           .8217315
                        Largest       Std. Dev.      .5175712
75%     1.134495       2.676793
90%     1.460626        2.82333       Variance       .2678799
95%     1.646887       2.865093       Skewness      -.0298741
99%     2.105021        3.17314       Kurtosis       3.462529

                      Linear prediction
-------------------------------------------------------------
      Percentiles      Smallest
 1%     2.770451       2.005307
 5%     3.096997       2.005307
10%     3.248734       2.066777       Obs                5574
25%     3.460358       2.093177       Sum of Wgt.        5574

50%     3.818303                      Mean           3.904371
                        Largest       Std. Dev.       .589474
75%     4.304362       6.054721
90%      4.68132       6.055911       Variance       .3474796
95%     4.946257       6.273092       Skewness       .5047628
99%     5.495563       6.573941       Kurtosis       3.235111
.
. * Check for Mills ratio linear in zprimea
. regress invmillh2s zprimea
      Source |       SS       df       MS              Number of obs =    5574
-------------+------------------------------           F(  1,  5572) =84783.34
       Model |  265.518552     1  265.518552           Prob > F      =  0.0000
    Residual |  17.4500012  5572   .00313173           R-squared     =  0.9383
-------------+------------------------------           Adj R-squared =  0.9383
       Total |  282.968553  5573  .050774906           Root MSE      =  .05596
------------------------------------------------------------------------------
invmillh2s | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
zprimea | -.4217284 .0014484 -291.18 0.000 -.4245677 -.418889
_cons | .7420731 .0014065 527.59 0.000 .7393158 .7448305
------------------------------------------------------------------------------
. regress invmillh2s zprimea zprimeasq
      Source |       SS       df       MS              Number of obs =    5574
-------------+------------------------------           F(  2,  5571) =       .
       Model |  282.919807     2  141.459904           Prob > F      =  0.0000
    Residual |   .04874607  5571  8.7500e-06           R-squared     =  0.9998
-------------+------------------------------           Adj R-squared =  0.9998
       Total |  282.968553  5573  .050774906           Root MSE      =  .00296
------------------------------------------------------------------------------
invmillh2s | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
zprimea | -.6381933 .0001715 -3720.60 0.000 -.6385296 -.6378571
zprimeasq | .1329635 .0000943 1410.22 0.000 .1327787 .1331484
_cons | .7945547 .0000831 9556.73 0.000 .7943917 .7947177
------------------------------------------------------------------------------
. * twoway scatter yinvmill probitxb
.
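
The regressions above show that, over the range of z'a in this sample, the inverse Mills ratio is almost a linear (and nearly exactly a quadratic) function of z'a. The Python sketch below, which is not part of the original program, evaluates lambda(z) = phi(z)/PHI(z) at the minimum, mean and maximum of zprimea reported in the summary table; the function name `inverse_mills` is hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1

def inverse_mills(z):
    """Inverse Mills ratio lambda(z) = phi(z) / PHI(z)."""
    return _STD_NORMAL.pdf(z) / _STD_NORMAL.cdf(z)

# Minimum, zero, mean and maximum of zprimea from the log.
for z in (-1.025036, 0.0, 0.8217315, 3.17314):
    print(z, round(inverse_mills(z), 6))
```

Because the selection and outcome equations use the same regressors, lambda(z'a) is identified only through this mild nonlinearity, which is why its coefficient is imprecisely estimated in the two-step results.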
. * Check R-squared from regress yinvmill on other regressors
. regress invmillh2s $XLIST
      Source |       SS       df       MS              Number of obs =    5574
-------------+------------------------------           F( 17,  5556) = 7477.36
       Model |  271.118403    17  15.9481414           Prob > F      =  0.0000
    Residual |    11.85015  5556  .002132856           R-squared     =  0.9581
-------------+------------------------------           Adj R-squared =  0.9580
       Total |  282.968553  5573  .050774906           Root MSE      =  .04618
------------------------------------------------------------------------------
invmillh2s | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
LC | .0529008 .000877 60.32 0.000 .0511815 .0546202
IDP | .0590603 .0017037 34.67 0.000 .0557204 .0624003
LPI | -.0113774 .0002792 -40.75 0.000 -.0119247 -.01083
FMDE | -.0054681 .0005178 -10.56 0.000 -.0064831 -.004453
PHYSLIM | -.0864947 .0021028 -41.13 0.000 -.090617 -.0823724
NDISEASE | -.0077731 .0001032 -75.31 0.000 -.0079754 -.0075707
HLTHG | -.0155696 .0013947 -11.16 0.000 -.0183037 -.0128355
HLTHF | -.0844067 .0025693 -32.85 0.000 -.0894435 -.0793698
HLTHP | -.2164141 .0052914 -40.90 0.000 -.2267872 -.206041
LINC | -.0293205 .0005678 -51.64 0.000 -.0304337 -.0282074
LFAM | .0170455 .0013216 12.90 0.000 .0144545 .0196364
EDUCDEC | -.0152414 .0002405 -63.38 0.000 -.0157128 -.01477
AGE | .0001145 .0000665 1.72 0.085 -.0000158 .0002448
FEMALE | -.1792718 .0016754 -107.00 0.000 -.1825563 -.1759873
CHILD | -.0474152 .0025807 -18.37 0.000 -.0524744 -.042356
FEMCHILD | .1803783 .002565 70.32 0.000 .1753498 .1854067
BLACK | .3020816 .0017915 168.62 0.000 .2985695 .3055937
_cons | .875215 .0061051 143.36 0.000 .8632467 .8871833
------------------------------------------------------------------------------
.
. * Find the condition number with inverse mills ratio included
. matrix accum XX = invmillh2s $XLIST
(obs=5574)
. matrix XXScaled = corr(XX)
. matrix symeigen XXSeigvec XXSeigval = XXScaled
. scalar rowsXX = rowsof(XX)
. scalar condnum1 = sqrt(XXSeigval[1,1]/XXSeigval[1,rowsXX])
. scalar condnum2 = sqrt(XXSeigval[1,1]/XXSeigval[1,(rowsXX-1)])
.
. * Find the condition number without inverse mills ratio
. matrix accum ZZ = $XLIST
(obs=5574)
. matrix ZZScaled = corr(ZZ)
. matrix symeigen ZZSeigvec ZZSeigval = ZZScaled
. scalar rowsZZ = rowsof(ZZ)
. scalar condnumnoinvmills1 = sqrt(ZZSeigval[1,1]/ZZSeigval[1,rowsZZ])
. scalar condnumnoinvmills2 = sqrt(ZZSeigval[1,1]/ZZSeigval[1,(rowsZZ-1)])
.
. * Condition numbers between 30 and 100 indicate a strong near dependency
. scalar list condnum1 condnum2
condnum1 = 82.333696
condnum2 = 24.558474
. scalar list condnumnoinvmills1 condnumnoinvmills2
condnumnoinvmills1 = 36.660119
condnumnoinvmills2 = 20.990872
.
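
The condition number used above is the square root of the ratio of the largest to the smallest eigenvalue of the cross-product matrix after scaling it to unit diagonal. The Python sketch below illustrates the idea for the simplest case of a constant plus one regressor, where the eigenvalues have a closed form. It is a toy illustration under that assumption, not a reimplementation of the matrix commands in the log, and the function name `scaled_condition_number` is hypothetical.

```python
import math

def scaled_condition_number(x):
    """Condition number for a two-column design [1, x].

    X'X is scaled to unit diagonal, leaving a single off-diagonal element c,
    so the eigenvalues are 1 + |c| and 1 - |c|.  The function fails with a
    division by zero if x is constant (perfect collinearity) or all zero.
    """
    n = len(x)
    sum_x = sum(x)
    sum_x2 = sum(v * v for v in x)
    c = sum_x / math.sqrt(n * sum_x2)
    return math.sqrt((1 + abs(c)) / (1 - abs(c)))

# A regressor far from zero mean is nearly collinear with the constant.
print(scaled_condition_number([1.0, 2.0, 3.0]))
# A regressor orthogonal to the constant gives the minimum value of 1.
print(scaled_condition_number([-1.0, 0.0, 1.0]))
```

In the log, adding the inverse Mills ratio raises the condition number from about 37 to about 82, which is the numerical counterpart of the high R-squared obtained when the ratio is regressed on the other regressors.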
. * (2D) Do Heckman 2 step manually (this is unnecessary)
. quietly probit DMED $XLIST    /* global XLIST defined earlier */
. predict pselmanual, p         /* Pr[y>0] = PHI(x'b) */
. predict xbmanual, xb          /* x'b */
. gen invmillsmanual = normden(xbmanual)/pselmanual
. regress LNMED $XLIST invmillsmanual if MED > 0
      Source |       SS       df       MS              Number of obs =    4281
-------------+------------------------------           F( 18,  4262) =   37.49
       Model |  1315.13292    18    73.06294           Prob > F      =  0.0000
    Residual |  8306.80418  4262  1.94903899           R-squared     =  0.1367
-------------+------------------------------           Adj R-squared =  0.1330
       Total |   9621.9371  4280  2.24811614           Root MSE      =  1.3961
------------------------------------------------------------------------------
LNMED | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
LC | -.0279209 .0397381 -0.70 0.482 -.1058282 .0499864
IDP | -.0922898 .067979 -1.36 0.175 -.225564 .0409844
LPI | .0052225 .0110962 0.47 0.638 -.0165318 .0269769
FMDE | -.0295212 .01822 -1.62 0.105 -.065242 .0061996
PHYSLIM | .2814948 .0803424 3.50 0.000 .1239819 .4390076
NDISEASE | .0216171 .0050367 4.29 0.000 .0117426 .0314915
HLTHG | .1474026 .0489869 3.01 0.003 .0513627 .2434424
HLTHF | .3821683 .0960103 3.98 0.000 .1939381 .5703985
HLTHP | .833294 .1971219 4.23 0.000 .4468325 1.219756
LINC | .0990973 .0251514 3.94 0.000 .0497875 .1484071
LFAM | -.1441358 .0467495 -3.08 0.002 -.2357891 -.0524825
EDUCDEC | .0033639 .0109441 0.31 0.759 -.0180922 .0248201
AGE | .0055556 .0022512 2.47 0.014 .001142 .0099692
FEMALE | .3846324 .103291 3.72 0.000 .1821281 .5871366
CHILD | -.2565135 .0935766 -2.74 0.006 -.4399725 -.0730546
FEMCHILD | -.392146 .1250644 -3.14 0.002 -.6373374 -.1469547
BLACK | -.2633649 .1578399 -1.67 0.095 -.5728134 .0460835
invmillsma~l | .235805 .5023784 0.47 0.639 -.7491182 1.220728
_cons | 2.882514 .470116 6.13 0.000 1.960841 3.804186
-----------------------------------------------------------------------------. predict yposmanual, xb
. * Predictions here should equal those from heckman two-step earlier
. sum yposh2s yposmanual invmillh2s invmillsmanual probselh2s pselmanual
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
     yposh2s |      5574    3.997637     .5516546   2.337985   6.574553
  yposmanual |      5574    3.997637     .5516546   2.337985   6.574553
  invmillh2s |      5574    .3955256     .2253329    .002599   1.545223
invmillsma~l |      5574    .3955256     .2253329    .002599   1.545223
  probselh2s |      5574    .7678377     .1457464   .1526731    .999246
-------------+---------------------------------------------------------
  pselmanual |      5574    .7678377     .1457464   .1526731    .999246
. * And put in squared invmills ratio
. gen invmillssq = invmillsmanual*invmillsmanual
. regress LNMED $XLIST invmillsmanual invmillssq if MED > 0
      Source |       SS       df       MS              Number of obs =    4281
-------------+------------------------------           F( 19,  4261) =   35.64
       Model |  1319.30272    19  69.4369854           Prob > F      =  0.0000
    Residual |  8302.63438  4261  1.94851781           R-squared     =  0.1371
-------------+------------------------------           Adj R-squared =  0.1333
       Total |   9621.9371  4280  2.24811614           Root MSE      =  1.3959

------------------------------------------------------------------------------
       LNMED |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          LC |  -.0793176   .0530386    -1.50   0.135    -.1833009    .0246658
         IDP |  -.1419148    .075965    -1.87   0.062    -.2908457    .0070161
         LPI |   .0174224   .0138796     1.26   0.209    -.0097888    .0446337
        FMDE |  -.0258495   .0183897    -1.41   0.160    -.0619029    .0102039
     PHYSLIM |   .3867535   .1078448     3.59   0.000     .1753217    .5981854
    NDISEASE |   .0305019   .0078898     3.87   0.000     .0150337    .0459701
       HLTHG |   .1652111   .0504705     3.27   0.001     .0662626    .2641596
       HLTHF |   .4576241   .1089774     4.20   0.000     .2439716    .6712766
       HLTHP |   1.056745   .2493566     4.24   0.000     .5678762    1.545614
        LINC |   .1169339    .027948     4.18   0.000     .0621414    .1717264
        LFAM |  -.1550441   .0473343    -3.28   0.001    -.2478439   -.0622443
     EDUCDEC |    .018452   .0150373     1.23   0.220     -.011029     .047933
         AGE |   .0057227   .0022538     2.54   0.011      .001304    .0101414
      FEMALE |   .5748999   .1660813     3.46   0.001     .2492941    .9005056
       CHILD |  -.2096856   .0988886    -2.12   0.034    -.4035587   -.0158125
    FEMCHILD |  -.5873068   .1828525    -3.21   0.001    -.9457929   -.2288207
       BLACK |  -.5010232   .2264954    -2.21   0.027    -.9450721   -.0569744
invmillsma~l |   2.159812   1.407886     1.53   0.125    -.6003768    4.920001
  invmillssq |  -1.043357   .7132265    -1.46   0.144    -2.441653    .3549381
       _cons |   1.909849   .8142753     2.35   0.019     .3134454    3.506253
------------------------------------------------------------------------------
.
. ************ (3) DISPLAY RESULTS FOR TABLE 16.1 (page 554) ************
.
. * Note for brevity the coefficients for only some of the regressors are reported
.
. * First two columns of Table 16.1 (page 554)
. * Two part estimates: probit for first part and lognormal for second
. estimates table twoparta twopartb, t stats(N ll rank aic bic) b(%10.3f)
------------------------------------
    Variable |   twoparta   twopartb
-------------+----------------------
          LC |     -0.119     -0.016
             |      -4.41      -0.52
         IDP |     -0.128     -0.079
             |      -2.45      -1.28
         LPI |      0.028      0.003
             |       3.19       0.28
        FMDE |      0.008     -0.031
             |       0.47      -1.69
     PHYSLIM |      0.273      0.262
             |       3.67       3.81
    NDISEASE |      0.022      0.020
             |       6.25       5.78
       HLTHG |      0.039      0.144
             |       0.88       2.97
       HLTHF |      0.192      0.364
             |       2.29       4.13
       HLTHP |      0.640      0.787
             |       3.01       4.63
        LINC |      0.052      0.093
             |       3.08       4.28
        LFAM |     -0.034     -0.141
             |      -0.80      -3.05
     EDUCDEC |      0.036     -0.000
             |       4.74      -0.00
         AGE |      0.000      0.006
             |       0.12       2.47
      FEMALE |      0.445      0.344
             |       8.20       6.02
       CHILD |      0.111     -0.268
             |       1.38      -2.96
    FEMCHILD |     -0.451     -0.351
             |      -5.65      -3.92
       BLACK |     -0.606     -0.196
             |     -11.58      -2.90
       _cons |     -0.272      3.077
             |      -1.45      13.90
-------------+----------------------
           N |   5574.000   4281.000
          ll |  -2690.577  -7493.499
        rank |     18.000     18.000
         aic |   5417.154  15022.998
         bic |   5536.419  15137.513
------------------------------------
                        legend: b/t
. di "lltwopart = " lltwopart
lltwopart = -10184.076
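
The statistics in the two-part table can be cross-checked by hand. The sketch below, in Python rather than Stata, assumes the usual definitions AIC = -2 ll + 2k and BIC = -2 ll + k ln(N), with k taken to be the reported rank. Those definitions are an assumption here, although they do reproduce the figures printed above. The two-part log likelihood is the sum of the probit and lognormal log likelihoods.

```python
import math

def aic(loglik, k):
    """Akaike information criterion: -2*ll + 2*k."""
    return -2.0 * loglik + 2.0 * k

def bic(loglik, k, n):
    """Schwarz (Bayesian) information criterion: -2*ll + k*ln(n)."""
    return -2.0 * loglik + k * math.log(n)

# Values copied from the estimates table above
ll_probit, n_probit = -2690.577, 5574        # first part (twoparta)
ll_lognormal, n_lognormal = -7493.499, 4281  # second part (twopartb)
rank = 18

ll_two_part = ll_probit + ll_lognormal
print(f"lltwopart = {ll_two_part:.3f}")
print(f"probit:    aic = {aic(ll_probit, rank):.3f}  bic = {bic(ll_probit, rank, n_probit):.3f}")
print(f"lognormal: aic = {aic(ll_lognormal, rank):.3f}  bic = {bic(ll_lognormal, rank, n_lognormal):.3f}")
```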
.
. * Last four columns of Table 16.1 (page 554)
. * Sample selection estimates: 2step and MLE estimates
. set matsize 60
. estimates table heck2step heckmle, t stats(N ll rank aic bic) b(%10.3f)
------------------------------------
    Variable |  heck2step    heckmle
-------------+----------------------
LNMED        |
          LC |     -0.028     -0.076
             |      -0.70      -2.25
         IDP |     -0.092     -0.150
             |      -1.36      -2.26
         LPI |      0.005      0.015
             |       0.47       1.42
        FMDE |     -0.030     -0.024
             |      -1.62      -1.21
     PHYSLIM |      0.281      0.355
             |       3.50       4.70
    NDISEASE |      0.022      0.029
             |       4.29       7.54
       HLTHG |      0.147      0.156
             |       3.01       2.99
       HLTHF |      0.382      0.445
             |       3.98       4.66
       HLTHP |      0.833      0.999
             |       4.22       5.32
        LINC |      0.099      0.121
             |       3.94       5.26
        LFAM |     -0.144     -0.158
             |      -3.08      -3.18
     EDUCDEC |      0.003      0.018
             |       0.31       1.95
         AGE |      0.006      0.006
             |       2.46       2.35
      FEMALE |      0.385      0.550
             |       3.72       8.69
       CHILD |     -0.257     -0.198
             |      -2.74      -2.03
    FEMCHILD |     -0.392     -0.565
             |      -3.13      -5.80
       BLACK |     -0.263     -0.536
             |      -1.67      -7.15
       _cons |      2.883      2.108
             |       6.13       8.63
-------------+----------------------
DMED         |
          LC |     -0.119     -0.107
             |      -4.41      -4.03
         IDP |     -0.128     -0.109
             |      -2.45      -2.13
         LPI |      0.028      0.029
             |       3.19       3.42
        FMDE |      0.008      0.001
             |       0.47       0.05
     PHYSLIM |      0.273      0.285
             |       3.67       3.94
    NDISEASE |      0.022      0.021
             |       6.25       6.03
       HLTHG |      0.039      0.058
             |       0.88       1.35
       HLTHF |      0.192      0.224
             |       2.29       2.75
       HLTHP |      0.640      0.798
             |       3.01       3.90
        LINC |      0.052      0.055
             |       3.08       3.33
        LFAM |     -0.034     -0.031
             |      -0.80      -0.77
     EDUCDEC |      0.036      0.031
             |       4.74       4.20
         AGE |      0.000     -0.001
             |       0.12      -0.29
      FEMALE |      0.445      0.409
             |       8.20       7.69
       CHILD |      0.111      0.053
             |       1.38       0.67
    FEMCHILD |     -0.451     -0.395
             |      -5.65      -5.04
       BLACK |     -0.606     -0.583
             |     -11.58     -11.20
       _cons |     -0.272     -0.214
             |      -1.45      -1.16
-------------+----------------------
mills        |
      lambda |      0.236
             |       0.47
-------------+----------------------
athrho       |
       _cons |                 0.941
             |                 12.78
-------------+----------------------
lnsigma      |
       _cons |                 0.451
             |                 25.45
-------------+----------------------
Statistics   |
           N |   5574.000   5574.000
          ll |            -10170.110
        rank |     37.000     38.000
         aic |          .  20416.221
         bic |          .  20668.004
------------------------------------
                        legend: b/t
.
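
The heckmle column reports the ancillary parameters as athrho and lnsigma rather than rho and sigma. As a hedged sketch in Python (not part of the original program), the block below applies the standard back-transformations rho = tanh(athrho), sigma = exp(lnsigma) and lambda = rho*sigma to the point estimates in the table. The variable names are illustrative.

```python
import math

# Point estimates copied from the heckmle column of the table above
athrho = 0.941
lnsigma = 0.451

rho = math.tanh(athrho)      # error correlation, constrained to (-1, 1)
sigma = math.exp(lnsigma)    # outcome-equation error s.d., constrained to be positive
lam = rho * sigma            # implied coefficient on the inverse Mills ratio

print(f"rho    = {rho:.4f}")
print(f"sigma  = {sigma:.4f}")
print(f"lambda = {lam:.4f}")
```

The implied ML value of lambda can be set against the two-step estimate of 0.236 (t = 0.47) in the heck2step column. The gap between the two is one way to see how imprecisely the selection term is determined in these data.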
. ************ (4) A LITTLE FURTHER ANALYSIS **********
.
. * Predictions
. sum MED pMEDpos2part pMEDposhml pMEDposh2s if MED > 0
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
         MED |      4281     220.987     909.9021   .5860291   39182.02
pMEDpos2part |      4281     183.462     126.0213   26.37827   1731.088
  pMEDposhml |      4281    240.4096     185.0424   42.00053    3505.48
  pMEDposh2s |      4281    184.9993     129.5432   27.63657   1911.624
. corr MED pMEDpos2part pMEDposhml pMEDposh2s if MED > 0
(obs=4281)
             |      MED pMEDpo~t pMEDpo~l pMEDp~2s
-------------+------------------------------------
         MED |   1.0000
pMEDpos2part |   0.1669   1.0000
  pMEDposhml |   0.1617   0.9830   1.0000
  pMEDposh2s |   0.1669   0.9994   0.9887   1.0000
.
. sum MED pMEDall2part pMEDallhml pMEDallh2s DMED probsel2part probselhml probselh2s
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
         MED |      5574    169.7247     802.8303          0   39182.02
pMEDall2part |      5574     140.966     120.2022   4.880651   1729.783
  pMEDallhml |      5574    184.5571     174.1649   8.814864   3503.564
  pMEDallh2s |      5574    142.1438     123.2964   5.272963   1910.182
        DMED |      5574    .7680301     .4221277          0          1
-------------+---------------------------------------------------------
probsel2part |      5574    .7678377     .1457464   .1526731    .999246
  probselhml |      5574    .7674107     .1404707   .1737047   .9994534
  probselh2s |      5574    .7678377     .1457464   .1526731    .999246
. corr MED pMEDall2part pMEDallhml pMEDallh2s DMED probsel2part probselhml probselh2s
(obs=5574)
             |      MED pMEDal~t pMEDal~l pMEDa~2s     DMED probse~t probse~l probs~2s
-------------+------------------------------------------------------------------------
         MED |   1.0000
pMEDall2part |   0.1772   1.0000
  pMEDallhml |   0.1734   0.9861   1.0000
  pMEDallh2s |   0.1772   0.9995   0.9909   1.0000
        DMED |   0.1162   0.2158   0.2015   0.2132   1.0000
probsel2part |   0.1031   0.6380   0.5939   0.6298   0.3467   1.0000
  probselhml |   0.1074   0.6552   0.6092   0.6468   0.3468   0.9980   1.0000
  probselh2s |   0.1031   0.6380   0.5939   0.6298   0.3467   1.0000   0.9980   1.0000
.
. log close
log: c:\Imbook\bwebpage\Section4\mma16p3selection.txt
log type: text
closed on: 19 May 2005, 13:04:40
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section4\mma17p1km.txt
log type: text
opened on: 19 May 2005, 13:19:55
.
. ********** OVERVIEW OF MMA17P1KM.DO **********
.
. * STATA Program
.
. * Chapter 17.2 (pages 574-5) and 17.5.1 (pages 581-3)
. * Nonparametric Duration Analysis
. * It provides
. * (1) Kaplan-Meier Survival Estimate Graph (Figure 17.1: kennanstrk.wmf)
. * (2) Nelson-Aalen Cumulative Hazard Estimate Graph
. * (3) Kaplan-Meier Survivor Function Estimates (Table 17.3)
. * (4) Shows that Cox regression on intercept gives same results
.
. * strkdur.dta
.
. ********** SETUP **********
.
. set more off
. version 8
.
.
. * The data are the same as those given in Table 1 of
. * J. Kennan, "The Duration of Contract Strikes in U.S. Manufacturing",
. * Journal of Econometrics, 1985, Vol. 28, pp.5-28.
.
. * There are 566 observations from 1968-1976 with two variables
. * 1. dur is duration of the strike in days
. * 2. gdp is a measure of stage of business cycle
. *      (deviation of monthly log industrial production in manufacturing
. *      from prediction from OLS on time, time-squared and monthly dummies)
.
. * All observations are complete for these data. There is no censoring !!
. * For an example with censoring see mma17p2kmextra.do or mma17p4duration.do
.
. ********** READ DATA **********
.
. use strkdur.dta
. sum
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
         dur |       566    43.62367     44.66641          1        235
         gdp |       566    .0060411     .0499072    -.13996     .08554
.
. * Create ASCII data set so that can use programs other than Stata
. outfile dur gdp using strkdur.asc, replace
.
. ********* ANALYSIS: NONPARAMETRIC SURVIVAL CURVE AND HAZARD FUNCTION **********
.
. * Stata st curves require defining the dependent variable
. stset dur
     failure event:  (assumed to fail at time=dur)
obs. time interval:  (0, dur]
------------------------------------------------------------------------------
      566  total obs.
        0  exclusions
    24691  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =       235
.
. * The data here are complete. If dur is instead right-censored,
. * then also need to define a failure indicator. For example
. * stset dur, fail(censor=1)
. * where fail(censor=1) marks observations with censor=1 as failures (completed
. * spells); all other observations are treated as right-censored
. * See mma17p4duration.do
.
. * (1) GRAPH KAPLAN-MEIER SURVIVAL CURVE
.
. * Minimal command that gives 95% confidence bands
. sts graph, gwood
analysis time _t: dur
.
. * Longer command for Figure 17.1 (page 575)
. * Nicer graphs and also confidence bands are bolder and easier to read
. sts gen surv = s
. sts gen lbsurv = lb(s)
. sts gen ubsurv = ub(s)
. sort dur
. graph twoway (line ubsurv dur, msize(vtiny) mstyle(p2) c(J) clstyle(p1) clcolor(gs10)) /*
> */ (line surv dur, msize(vtiny) mstyle(p1) c(J) clstyle(p1)) /*
> */ (line lbsurv dur, msize(vtiny) mstyle(p2) c(J) clstyle(p1) clcolor(gs10)), /*
> */ title("Kaplan-Meier Survival Function Estimate") /*
> */ xtitle("Strike duration in days", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Survival Probability", size(medlarge)) yscale(titlegap(*5)) /*
> */ ylabel(0.00(0.25)1.00,grid)/*
> */ legend( label(1 "Upper 95% confidence band") label(2 "Survival Function") /*
> */ label(3 "Lower 95% confidence band") )
. graph export kennanstrk.wmf, replace
(file c:\Imbook\bwebpage\Section4\kennanstrk.wmf written in Windows Metafile format)
.
. * (2) GRAPH NELSON-AALEN CUMULATIVE HAZARD FUNCTION
.
. * Minimal command that gives 95% confidence bands
. sts graph, cna
.
. * Longer command gives nicer figure
. sts graph, cna /*
> */ title("Nelson-Aalen Cumulative Hazard") /*
> */ xtitle("Strike duration in days", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Cumulative Hazard", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend(label(1 "95% confidence bands") label(2 "Cumulative Hazard"))
.
. * (3) LIST SURVIVOR and NELSON-AALEN CUMULATIVE HAZARD ESTIMATES
.
. * Gives a lot of output
.
. * Table 17.3: Kaplan-Meier Survivor Function (page 583)

. sts list
           Beg.          Net           Survivor      Std.
  Time    Total   Fail   Lost           Function     Error     [95% Conf. Int.]
-------------------------------------------------------------------------------
     1      566     10      0             0.9823    0.0055     0.9674    0.9905
     2      556     21      0             0.9452    0.0096     0.9230    0.9612
     3      535     16      0             0.9170    0.0116     0.8910    0.9369
     4      519     17      0             0.8869    0.0133     0.8578    0.9104
     5      502     18      0             0.8551    0.0148     0.8234    0.8816
     6      484      9      0             0.8392    0.0154     0.8063    0.8670
     7      475     12      0             0.8180    0.0162     0.7837    0.8474
     8      463     12      0             0.7968    0.0169     0.7613    0.8277
     9      451     13      0             0.7739    0.0176     0.7371    0.8061
    10      438      8      0             0.7597    0.0180     0.7223    0.7928
    11      430      9      0             0.7438    0.0183     0.7058    0.7777
    12      421     10      0             0.7261    0.0187     0.6874    0.7609
    13      411     11      0             0.7067    0.0191     0.6673    0.7424
    14      400     11      0             0.6873    0.0195     0.6473    0.7237
    15      389     12      0             0.6661    0.0198     0.6256    0.7033
    16      377      8      0             0.6519    0.0200     0.6111    0.6896
    17      369      6      0             0.6413    0.0202     0.6003    0.6793
    18      363      8      0             0.6272    0.0203     0.5860    0.6656
    19      355      7      0             0.6148    0.0205     0.5734    0.6535
    20      348      7      0             0.6025    0.0206     0.5609    0.6415
    21      341      5      0             0.5936    0.0206     0.5519    0.6328
    22      336     11      0             0.5742    0.0208     0.5324    0.6137
    23      325     10      0             0.5565    0.0209     0.5146    0.5964
    24      315      8      0             0.5424    0.0209     0.5004    0.5824
    25      307      4      0             0.5353    0.0210     0.4934    0.5754
    26      303      7      0             0.5230    0.0210     0.4810    0.5632
    27      296      6      0             0.5124    0.0210     0.4704    0.5527
    28      290      9      0             0.4965    0.0210     0.4546    0.5369
    29      281      5      0             0.4876    0.0210     0.4458    0.5281
    30      276      5      0             0.4788    0.0210     0.4371    0.5193
    31      271      8      0             0.4647    0.0210     0.4231    0.5051
    32      263      5      0             0.4558    0.0209     0.4144    0.4963
    33      258      6      0             0.4452    0.0209     0.4039    0.4857
    34      252      5      0             0.4364    0.0208     0.3952    0.4768
    35      247      4      0             0.4293    0.0208     0.3883    0.4697
    36      243      6      0             0.4187    0.0207     0.3779    0.4590
    37      237      6      0             0.4081    0.0207     0.3675    0.4483
    38      231      8      0             0.3940    0.0205     0.3537    0.4340
    39      223      3      0             0.3887    0.0205     0.3485    0.4287
    40      220      1      0             0.3869    0.0205     0.3468    0.4269
    41      219      4      0             0.3799    0.0204     0.3399    0.4197
    42      215      8      0             0.3657    0.0202     0.3261    0.4053
    43      207      4      0             0.3587    0.0202     0.3193    0.3981
    44      203      9      0             0.3428    0.0200     0.3039    0.3819
    45      194      3      0             0.3375    0.0199     0.2988    0.3765
    46      191      4      0             0.3304    0.0198     0.2919    0.3693
    47      187      5      0             0.3216    0.0196     0.2834    0.3602
    48      182      3      0             0.3163    0.0195     0.2783    0.3548
    49      179      5      0             0.3074    0.0194     0.2698    0.3457
    50      174      8      0             0.2933    0.0191     0.2563    0.3312
    51      166      1      0             0.2915    0.0191     0.2546    0.3293
    52      165      8      0             0.2774    0.0188     0.2411    0.3147
    53      157      6      0             0.2668    0.0186     0.2310    0.3037
    54      151      1      0             0.2650    0.0186     0.2294    0.3019
    55      150      2      0             0.2615    0.0185     0.2260    0.2982
    56      148      3      0             0.2562    0.0183     0.2210    0.2927
    57      145      3      0             0.2509    0.0182     0.2159    0.2872
    58      142      1      0             0.2491    0.0182     0.2143    0.2854
    59      141      4      0             0.2420    0.0180     0.2076    0.2780
    60      137      6      0             0.2314    0.0177     0.1976    0.2669
    61      131      5      0             0.2226    0.0175     0.1893    0.2577
    62      126      2      0             0.2191    0.0174     0.1860    0.2540
    63      124      2      0             0.2155    0.0173     0.1827    0.2503
    64      122      5      0             0.2067    0.0170     0.1744    0.2410
    65      117      3      0             0.2014    0.0169     0.1695    0.2354
    67      114      1      0             0.1996    0.0168     0.1678    0.2335
    68      113      1      0             0.1979    0.0167     0.1662    0.2317
    70      112      4      0             0.1908    0.0165     0.1596    0.2242
    71      108      1      0             0.1890    0.0165     0.1580    0.2223
    72      107      1      0             0.1873    0.0164     0.1563    0.2205
    74      106      1      0             0.1855    0.0163     0.1547    0.2186
    75      105      1      0             0.1837    0.0163     0.1530    0.2167
    77      104      3      0             0.1784    0.0161     0.1481    0.2111
    82      101      2      0             0.1749    0.0160     0.1449    0.2073
    83       99      1      0             0.1731    0.0159     0.1432    0.2055
    84       98      3      0             0.1678    0.0157     0.1384    0.1998
    85       95      2      0             0.1643    0.0156     0.1351    0.1960
    86       93      1      0             0.1625    0.0155     0.1335    0.1942
    87       92      1      0             0.1608    0.0154     0.1319    0.1923
    88       91      1      0             0.1590    0.0154     0.1302    0.1904
    90       90      1      0             0.1572    0.0153     0.1286    0.1885
    91       89      2      0             0.1537    0.0152     0.1254    0.1847
    92       87      1      0             0.1519    0.0151     0.1238    0.1828
    94       86      1      0             0.1502    0.0150     0.1222    0.1809
    98       85      3      0             0.1449    0.0148     0.1173    0.1752
    99       82      3      0             0.1396    0.0146     0.1125    0.1695
   100       79      2      0             0.1360    0.0144     0.1093    0.1657
   101       77      3      0             0.1307    0.0142     0.1045    0.1600
   102       74      2      0             0.1272    0.0140     0.1013    0.1561
   103       72      1      0             0.1254    0.0139     0.0997    0.1542
   104       71      3      0             0.1201    0.0137     0.0950    0.1485
   105       68      1      0             0.1184    0.0136     0.0934    0.1465
   106       67      2      0             0.1148    0.0134     0.0902    0.1427
   107       65      2      0             0.1113    0.0132     0.0871    0.1388
   108       63      2      0             0.1078    0.0130     0.0839    0.1349
   109       61      2      0             0.1042    0.0128     0.0808    0.1311
   111       59      1      0             0.1025    0.0127     0.0792    0.1291
   112       58      1      0             0.1007    0.0126     0.0777    0.1272
   114       57      1      0             0.0989    0.0126     0.0761    0.1252
   115       56      1      0             0.0972    0.0124     0.0745    0.1233
   116       55      1      0             0.0954    0.0123     0.0730    0.1213
   117       54      2      0             0.0919    0.0121     0.0699    0.1174
   118       52      1      0             0.0901    0.0120     0.0683    0.1155
   119       51      1      0             0.0883    0.0119     0.0668    0.1135
   122       50      3      0             0.0830    0.0116     0.0622    0.1076
   123       47      1      0             0.0813    0.0115     0.0606    0.1056
   124       46      1      0             0.0795    0.0114     0.0591    0.1037
   125       45      2      0             0.0760    0.0111     0.0561    0.0997
   126       43      1      0             0.0742    0.0110     0.0545    0.0977
   127       42      2      0             0.0707    0.0108     0.0515    0.0937
   130       40      2      0             0.0671    0.0105     0.0485    0.0897
   131       38      1      0             0.0654    0.0104     0.0470    0.0877
   133       37      1      0             0.0636    0.0103     0.0455    0.0857
   135       36      1      0             0.0618    0.0101     0.0440    0.0837
   136       35      2      0             0.0583    0.0098     0.0410    0.0797
   139       33      2      0             0.0548    0.0096     0.0381    0.0756
   140       31      1      0             0.0530    0.0094     0.0366    0.0736
   141       30      3      0             0.0477    0.0090     0.0323    0.0675
   142       27      1      0             0.0459    0.0088     0.0308    0.0654
   143       26      1      0             0.0442    0.0086     0.0294    0.0633
   146       25      2      0             0.0406    0.0083     0.0265    0.0592
   147       23      1      0             0.0389    0.0081     0.0251    0.0571
   148       22      2      0             0.0353    0.0078     0.0223    0.0529
   151       20      1      0             0.0336    0.0076     0.0209    0.0508
   152       19      1      0             0.0318    0.0074     0.0196    0.0487
   153       18      2      0             0.0283    0.0070     0.0169    0.0444
   154       16      1      0             0.0265    0.0068     0.0155    0.0423
   160       15      1      0             0.0247    0.0065     0.0142    0.0401
   163       14      2      0             0.0212    0.0061     0.0116    0.0357
   165       12      1      0             0.0194    0.0058     0.0103    0.0335
   168       11      1      0             0.0177    0.0055     0.0091    0.0312
   174       10      1      0             0.0159    0.0053     0.0079    0.0290
   175        9      1      0             0.0141    0.0050     0.0067    0.0267
   179        8      1      0             0.0124    0.0046     0.0055    0.0244
   191        7      1      0             0.0106    0.0043     0.0044    0.0220
   192        6      1      0             0.0088    0.0039     0.0034    0.0196
   205        5      1      0             0.0071    0.0035     0.0024    0.0171
   208        4      1      0             0.0053    0.0031     0.0015    0.0146
   216        3      1      0             0.0035    0.0025     0.0007    0.0121
   226        2      1      0             0.0018    0.0018     0.0002    0.0095
   235        1      1      0             0.0000         .          .         .
-------------------------------------------------------------------------------
.
. * And Nelson-Aalen Integrated Hazard

. * sts list, na
.
. * (4) STCOX REGRESS ON INTERCEPT GIVES SAME RESULTS AS ABOVE
.
. * Cox Regression on an intercept
. gen one = 1
. stcox one, basesurv(coxbasesurv) basechazard(coxbasecumhaz) basehc(coxbasehaz)
note: one dropped due to collinearity
Refining estimates:
Cox regression -- Breslow method for ties

No. of subjects =          566                     Number of obs   =       566
No. of failures =          566
Time at risk    =        24691
                                                   LR chi2(0)      =      0.00
Log likelihood  =    -3032.134                     Prob > chi2     =

------------------------------------------------------------------------------
          _t | Haz. Ratio   Std. Err.
-------------+----------------------------------------------------------------
------------------------------------------------------------------------------
.
. * Instead use sts which analyzes dependent in isolation
. * sts gen surv = s
. sts gen cumhaz = na
. sts gen haz = h
.
. * Compare to verify that same answers
. sum surv coxbasesurv cumhaz coxbasecumhaz haz coxbasehaz
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
        surv |       566     .493014     .2848417          0   .9823322
 coxbasesurv |       566     .493014     .2848417          0   .9823322
      cumhaz |       566           1     .9834583   .0176678   6.871446
coxbasecum~z |       566           1     .9834583   .0176678   6.871446
         haz |       566    .0345186     .0515235   .0045455          1
-------------+---------------------------------------------------------
  coxbasehaz |       566    .0345186     .0515235   .0045455          1
. corr surv coxbasesurv
(obs=566)
             |     surv coxbas~v
-------------+------------------
        surv |   1.0000
 coxbasesurv |   1.0000   1.0000
. corr cumhaz coxbasecumhaz
(obs=566)
             |   cumhaz cox~mhaz
-------------+------------------
      cumhaz |   1.0000
coxbasecum~z |   1.0000   1.0000
. corr haz coxbasehaz
(obs=566)
             |      haz cox~ehaz
-------------+------------------
         haz |   1.0000
  coxbasehaz |   1.0000   1.0000
.
. * (5) ESTIMATE HAZARD FUNCTION
.
. * sts graph does not give the true hazard function - it instead gives the
. * difference in the cumulative hazard (without division by time difference).
.
. log close
log: c:\Imbook\bwebpage\Section4\mma17p1km.txt
log type: text
closed on: 19 May 2005, 13:20:01
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section4\mma17p2kmextra.txt
log type: text
opened on: 19 May 2005, 13:24:01
.
. ********** OVERVIEW OF MMA17P2KMEXTRA.DO **********
.
. * STATA Program
.
. * Nonparametric Survival Analysis
. * Provides
. * (1) K-M Survivor Function and N-A Cum Hazard Estimates (Table 17.2)
. * using artificial data
.
. ********** SETUP **********
.
. set more off
. version 8.0
.
. ********** GENERATE DATA **********
.
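. * The time does not matter except for the hazard.
. * Here failures occur at durations 10, 20, 30, 40 and 50 (times t1,...,t5)
. * 1. At t = 10 (time t1):  6 failures
. * 2. At t = 15:            4 censored (lost) between t1 and t2
. * 3. At t = 20 (time t2):  5 failures
. * 4. At t = 25:            3 censored (lost) between t2 and t3
. * 5. At t = 30 (time t3):  2 failures
. * 6. At t = 35:            1 censored (lost) between t3 and t4
. * 7. At t = 40 (time t4):  1 failure
. * 8. At t = 45:           32 censored (lost) between t4 and t5
. * 9. At t = 50 (time t5): 26 failures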
.
. * Indicator failed = 1 if fail and 0 if censored
. input duration failed

      duration     failed
  1. 10 1
  2. 10 1
  3. 10 1
  4. 10 1
  5. 10 1
  6. 10 1
  7. 15 0
  8. 15 0
  9. 15 0
 10. 15 0
 11. 20 1
 12. 20 1
 13. 20 1
 14. 20 1
 15. 20 1
 16. 25 0
 17. 25 0
 18. 25 0
 19. 30 1
 20. 30 1
 21. 35 0
 22. 40 1
 23. 45 0
 24. 45 0
 25. 45 0
 26. 45 0
 27. 45 0
 28. 45 0
 29. 45 0
 30. 45 0
 31. 45 0
 32. 45 0
 33. 45 0
 34. 45 0
 35. 45 0
 36. 45 0
 37. 45 0
 38. 45 0
 39. 45 0
 40. 45 0
 41. 45 0
 42. 45 0
 43. 45 0
 44. 45 0
 45. 45 0
 46. 45 0
 47. 45 0
 48. 45 0
 49. 45 0
 50. 45 0
 51. 45 0
 52. 45 0
 53. 45 0
 54. 45 0
 55. 50 1
 56. 50 1
 57. 50 1
 58. 50 1
 59. 50 1
 60. 50 1
 61. 50 1
 62. 50 1
 63. 50 1
 64. 50 1
 65. 50 1
 66. 50 1
 67. 50 1
 68. 50 1
 69. 50 1
 70. 50 1
 71. 50 1
 72. 50 1
 73. 50 1
 74. 50 1
 75. 50 1
 76. 50 1
 77. 50 1
 78. 50 1
 79. 50 1
 80. 50 1
 81. end
.
. sum
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
    duration |        80      39.625     13.40166         10         50
      failed |        80          .5     .5031546          0          1
.
. ***** COMPUTATION USING STATA **********
.
. stset duration, fail(failed=1)
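     failure event:  failed == 1
obs. time interval:  (0, duration]
------------------------------------------------------------------------------
       80  total obs.
        0  exclusions
     3170  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =        50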
. stsum
         failure _d:  failed == 1
   analysis time _t:  duration

         |               incidence       no. of    |------ Survival time -----|
         | time at risk     rate        subjects        25%       50%       75%
---------+---------------------------------------------------------------------
   total |         3170   .0126183            80         50        50        50
. stdes
                                   |-------------- per subject --------------|
Category                   total        mean         min     median        max
------------------------------------------------------------------------------
no. of subjects               80
no. of records                80           1           1          1          1

(first) entry time                         0           0          0          0
(final) exit time                     39.625          10         45         50

subjects with gap              0
time on gap if gap             0
time at risk                3170      39.625          10         45         50

failures                      40          .5           0         .5          1
------------------------------------------------------------------------------
.
. * K-M survival graph
. * sts graph, gwood
.
. * N-A Cumulative Hazard
. * sts graph, cna
.
. * Kaplan-Meier Survivor Function listed (last column Table 17.2)
. sts list
           Beg.          Net           Survivor      Std.
  Time    Total   Fail   Lost           Function     Error     [95% Conf. Int.]
-------------------------------------------------------------------------------
    10       80      6      0             0.9250    0.0294     0.8407    0.9656
    15       74      0      4             0.9250    0.0294     0.8407    0.9656
    20       70      5      0             0.8589    0.0395     0.7596    0.9193
    25       65      0      3             0.8589    0.0395     0.7596    0.9193
    30       62      2      0             0.8312    0.0428     0.7268    0.8984
    35       60      0      1             0.8312    0.0428     0.7268    0.8984
    40       59      1      0             0.8171    0.0443     0.7104    0.8875
    45       58      0     32             0.8171    0.0443     0.7104    0.8875
    50       26     26      0             0.0000         .          .         .
-------------------------------------------------------------------------------
.
. * Nelson-Aalen Cumulative Hazard Listed (second last column Table 17.2)

. sts list, na
           Beg.          Net        Nelson-Aalen      Std.
  Time    Total   Fail   Lost          Cum. Haz.     Error     [95% Conf. Int.]
-------------------------------------------------------------------------------
    10       80      6      0             0.0750    0.0306     0.0337    0.1669
    15       74      0      4             0.0750    0.0306     0.0337    0.1669
    20       70      5      0             0.1464    0.0442     0.0810    0.2648
    25       65      0      3             0.1464    0.0442     0.0810    0.2648
    30       62      2      0             0.1787    0.0498     0.1035    0.3085
    35       60      0      1             0.1787    0.0498     0.1035    0.3085
    40       59      1      0             0.1956    0.0526     0.1155    0.3313
    45       58      0     32             0.1956    0.0526     0.1155    0.3313
    50       26     26      0             1.1956    0.2030     0.8571    1.6678
-------------------------------------------------------------------------------
.
. ***** MANUAL COMPUTATION AS IN TABLE 17.2 (page 582) **********
.
. scalar cumhaz1 = 6/80
. scalar cumhaz2 = 6/80 + 5/70
. scalar cumhaz3 = 6/80 + 5/70 + 2/62
. scalar surv1 = 1-6/80
. scalar surv2 = (1-6/80)*(1-5/70)
. scalar surv3 = (1-6/80)*(1-5/70)*(1-2/62)
. di "Cumulative hazard at t1: " cumhaz1 " at t2: " cumhaz2 " at t3: " cumhaz3
Cumulative hazard at t1: .075 at t2: .14642857 at t3: .17868664
. di "Survivor function at t1: " surv1 " at t2: " surv2 " at t3: " surv3
Survivor function at t1: .925 at t2: .85892857 at t3: .8312212
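
The manual calculation above covers the first three failure times only. As a cross-check outside Stata, the following Python sketch (an illustration, not part of the original program; the function name is an assumption) applies the Kaplan-Meier product and the Nelson-Aalen sum to the full artificial data set entered above. At each failure time the number at risk is the number of spells with duration at least that long, so censored spells leave the risk set without contributing a failure.

```python
def km_na(durations, failed):
    """Kaplan-Meier survivor and Nelson-Aalen cumulative hazard estimates.

    Returns one tuple (time, at_risk, failures, survivor, cum_hazard)
    for each distinct time at which at least one failure occurs.
    """
    data = list(zip(durations, failed))
    surv, cumhaz = 1.0, 0.0
    rows = []
    for t in sorted(set(durations)):
        at_risk = sum(1 for d, _ in data if d >= t)          # still under observation at t
        fails = sum(1 for d, f in data if d == t and f == 1)  # failures exactly at t
        if fails > 0:
            surv *= 1.0 - fails / at_risk      # K-M: product of (1 - d_j/n_j)
            cumhaz += fails / at_risk          # N-A: sum of d_j/n_j
            rows.append((t, at_risk, fails, surv, cumhaz))
    return rows

# Artificial data from the log: failed = 1 if fail, 0 if censored
durations = [10]*6 + [15]*4 + [20]*5 + [25]*3 + [30]*2 + [35] + [40] + [45]*32 + [50]*26
failed    = [1]*6  + [0]*4  + [1]*5  + [0]*3  + [1]*2  + [0]  + [1]  + [0]*32  + [1]*26

for t, n, d, s, h in km_na(durations, failed):
    print(f"t={t:2d}  at risk={n:2d}  fail={d:2d}  survivor={s:.4f}  cum. hazard={h:.4f}")
```

The printed survivor and cumulative hazard values can be compared line by line with the sts list and sts list, na output above.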
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section4\mma17p2kmextra.txt
log type: text
closed on: 19 May 2005, 13:24:01
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section4\mma17p3weib.txt
log type: text
opened on: 19 May 2005, 14:22:25
.
. ********** OVERVIEW OF MMA17P3WEIB.DO **********
.
. * STATA Program
.
. * Chapter 17.6.1 (pages 584-6)
. * Plot of Weibull density, survivor, hazard and cumulative hazard functions
. * Provides
. * (1) Figure 17.2 (ch17weibull.wmf)
.
. * This program requires no data
.
. ********** SETUP **********
.
. set more off
. version 8.0
.
. ********** GENERATE DATA AND FUNCTIONS **********
.
. set obs 800
obs was 0, now 800
.
. gen t = 0.1*_n /* duration time */
.
. * Generate the survivor, hazard, cumulative hazard and density
. scalar g = 0.01 /* gamma */
. scalar a = 1.5 /* alpha */
. gen surv = exp(-g*(t^(a)))
. gen density = g*a*(t^(a-1))*exp(-g*(t^(a)))
. gen hazard = g*a*(t^(a-1))
. gen cumhaz = -ln(surv)
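
The four generated variables are linked by standard identities: the density equals the hazard times the survivor function, and the cumulative hazard equals minus the log of the survivor function, which for the Weibull is gamma*t^alpha. A minimal Python sketch (not part of the original program; function names are illustrative) using the same gamma = 0.01 and alpha = 1.5:

```python
import math

GAMMA = 0.01   # scalar g in the log
ALPHA = 1.5    # scalar a in the log

def weibull_survivor(t, g=GAMMA, a=ALPHA):
    return math.exp(-g * t ** a)

def weibull_hazard(t, g=GAMMA, a=ALPHA):
    return g * a * t ** (a - 1)

def weibull_density(t, g=GAMMA, a=ALPHA):
    return g * a * t ** (a - 1) * math.exp(-g * t ** a)

def weibull_cumhaz(t, g=GAMMA, a=ALPHA):
    return g * t ** a

# Same grid as the log: t = 0.1, 0.2, ..., 80.0 (800 points); print a few of them
for n in (1, 100, 400, 800):
    t = 0.1 * n
    print(f"t={t:5.1f}  S={weibull_survivor(t):.4f}  f={weibull_density(t):.5f}  "
          f"h={weibull_hazard(t):.5f}  H={weibull_cumhaz(t):.4f}")
```

With alpha greater than one the hazard rises with duration, which is the positive duration dependence visible in the hazard panel of the figure.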
.
. ********** DO THE FOUR SEPARATE GRAPHS FOR FIGURE 17.2 **********
.
. * Weibull density
. graph twoway (scatter density t, c(l) msize(vtiny) clwidth(medthick) clstyle(p1)), /*
> */ xtitle("Duration time", size(large)) xscale(titlegap(*5)) /*
> */ ytitle("Weibull density", size(large)) yscale(titlegap(*5)) /*
> */ xlabel(,labsize(medlarge)) ylabel(,labsize(medlarge))
. graph save ch17fig2a, replace
(file ch17fig2a.gph saved)
.
. * Weibull survivor
. graph twoway (scatter surv t, c(l) msize(vtiny) clwidth(medthick) clstyle(p1)), /*
> */ ytitle("Weibull survivor", size(large)) yscale(titlegap(*5)) /*
. graph save ch17fig2b, replace
(file ch17fig2b.gph saved)
.
. * Weibull hazard
. graph twoway (scatter hazard t, c(l) msize(vtiny) clwidth(medthick) clstyle(p1)), /*
> */ ytitle("Weibull hazard", size(large)) yscale(titlegap(*5)) /*
. graph save ch17fig2c, replace
(file ch17fig2c.gph saved)
.
. * Weibull cumulative hazard
. graph twoway (scatter cumhaz t, c(l) msize(vtiny) clwidth(medthick) clstyle(p1)), /*
> */ ytitle("Cumulative hazard", size(large)) yscale(titlegap(*5)) /*
. graph save ch17fig2d, replace
(file ch17fig2d.gph saved)
.
. ********** COMBINE THE FOUR GRAPHS FOR FIGURE 17.2 (page 585) **********
.
. graph combine ch17fig2a.gph ch17fig2b.gph ch17fig2c.gph ch17fig2d.gph, /*
> */ title("Weibull Distribution", margin(b=2) size(vlarge))
. graph export ch17weibull.wmf, replace
(file c:\Imbook\bwebpage\Section4\ch17weibull.wmf written in Windows Metafile format)
.
. log close
log: c:\Imbook\bwebpage\Section4\mma17p3weib.txt
log type: text
closed on: 19 May 2005, 14:22:39
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section4\mma17p4duration.txt
log type: text
opened on: 19 May 2005, 15:25:00
.
. ********** OVERVIEW OF MMA17P4DURATION.DO **********
.
. * STATA Program
.
. * Chapter 17.11 (pages 603-8)
. * Duration regression with censored data example
. * Provides
. * (1) Data summary: Table 17.6
. * (2) List of Survivor Function and Cumulative Hazard Estimates: Table 17.7
. * (3) Various graphs describing the data
. *     (3A) K-M Survival Graph for all data (Figure 17.3: km_pt1.wmf)
. *     (3B) K-M Survival Graph by unemployment insurance (Figure 17.4: km_pt2.wmf)
. *     (3C) N-A Cumulative Hazard Graph for all data (Figure 17.5: na_pt1.wmf)
. *     (3D) N-A Cumulative Hazard Graph by unemployment insurance (Figure 17.6: na_pt2.wmf)
. * (4) Coefficient Estimates of Some Parametric Models (Table 17.8)
. * (5) Hazard Rate Estimates of Some Parametric Models (Table 17.9)
.
. * ema1996.dta
.
. ********** SETUP **********
.
. set more off
. version 8.0
. set matsize 100
.
.
. * The data is from
. * B.P. McCall (1996), "Unemployment Insurance Rules, Joblessness,
. *      and Part-time Work," Econometrica, 64, 647-682.
.
. * McCall's data set named ema_1996_pt_lastweek.dta
. * has name changed to ema1996.dta
.
. * There are 3343 observations from the CPS Displaced Worker Surveys
. * of 1986, 1988, 1990 and 1992
. * 1. spell is length of spell in number of two-week intervals
. * 2. CENSOR1 = 1 if re-employed at full-time job
. * 3. CENSOR2 = 1 if re-employed at part-time job
. * 4. CENSOR3 = 1 if re-employed but left job: pt-ft status unknown
. * 5. CENSOR4 = 1 if still jobless
. * 6. ui (UI) = 1 if filed UI claim
. * 7. reprate (RR) = eligible replacement rate
. * 8. disrate (DR) = eligible disregard rate
. * 9. tenure (TENURE) = years tenure in lost job
. * 10. logwage (LOGWAGE) = log weekly earnings in lost job (1985$)
. * 11.-43. other variables listed in McCall (1996) table 2 p.657
.
. ********** READ DATA **********
.
. use ema1996.dta
(Sample for 1996 EMA paper: part-time= worked part-time last week)
. sum
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
       spell |      3343    6.247981     5.611271          1         28
     censor1 |      3343    .3209692     .4669188          0          1
     censor2 |      3343    .1014059     .3019106          0          1
     censor3 |      3343    .1717021     .3771777          0          1
     censor4 |      3343    .3754113     .4843014          0          1
-------------+---------------------------------------------------------
          ui |      3343    .5527969     .4972791          0          1
     reprate |      3343    .4544717     .1137918       .066      2.059
     logwage |      3343    5.692994     .5356591    2.70805   7.600402
      tenure |      3343    4.114867     5.862322          0         40
     disrate |      3343    .1094376     .0735274       .002       1.02
-------------+---------------------------------------------------------
       slack |      3343    .4884834     .4999421          0          1
     abolpos |      3343    .1456775     .3528354          0          1
     explose |      3343    .5025426     .5000683          0          1
     stateur |      3343      6.5516     1.803825        2.5         13
    houshead |      3343    .6120251     .4873617          0          1
-------------+---------------------------------------------------------
     married |      3343    .5860006     .4926221          0          1
      female |      3343    .3478911     .4763725          0          1
       child |      3343    .4501944     .4975876          0          1
      ychild |      3343    .1956327     .3967463          0          1
    nonwhite |      3343    .1390966     .3460991          0          1
-------------+---------------------------------------------------------
         age |      3343    35.44331      10.6402         20         61
     schlt12 |      3343    .2811846     .4496446          0          1
     schgt12 |      3343    .3356267     .4722797          0          1
        smsa |      3343    .7241998     .4469835          0          1
    bluecoll |      3343    .6036494      .489212          0          1
-------------+---------------------------------------------------------
      mining |      3343     .029315     .1687132          0          1
      constr |      3343    .1480706     .3552231          0          1
      transp |      3343    .0646126     .2458778          0          1
       trade |      3343    .1848639     .3882452          0          1
        fire |      3343    .0514508     .2209484          0          1
-------------+---------------------------------------------------------
    services |      3343    .1699073     .3756075          0          1
    pubadmin |      3343    .0095722      .097383          0          1
      year85 |      3343    .2677236      .442839          0          1
      year87 |      3343    .2174693     .4125862          0          1
      year89 |      3343    .1998205     .3999251          0          1
-------------+---------------------------------------------------------
      midatl |      3343    .1088842     .3115405          0          1
       encen |      3343    .1429853     .3501103          0          1
       wncen |      3343    .0643135     .2453472          0          1
    southatl |      3343    .2375112     .4256217          0          1
       escen |      3343    .0532456     .2245564          0          1
-------------+---------------------------------------------------------
       wscen |      3343    .1441819     .3513266          0          1
    mountain |      3343    .1079868     .3104102          0          1
     pacific |      3343    .0260245      .159232          0          1
.
. * The following gives variables in same order as Table 2 p.657 of McCall (1996)
. * which gives fuller names for the variables
. sum spell censor1 censor2 censor3 censor4 age /*
> */ ui reprate disrate logwage tenure slack abolpos explose bluecoll /*
> */ houshead married child ychild female schlt12 schgt12 nonwhite smsa /*
> */ midatl encen wncen southatl escen wscen mountain pacific /*
> */ mining constr transp trade fire services pubadmin /*
> */ year85 year87 year89
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
       spell |      3343    6.247981    5.611271          1         28
     censor1 |      3343    .3209692    .4669188          0          1
     censor2 |      3343    .1014059    .3019106          0          1
     censor3 |      3343    .1717021    .3771777          0          1
     censor4 |      3343    .3754113    .4843014          0          1
-------------+--------------------------------------------------------
         age |      3343    35.44331     10.6402         20         61
          ui |      3343    .5527969    .4972791          0          1
     reprate |      3343    .4544717    .1137918       .066      2.059
     disrate |      3343    .1094376    .0735274       .002       1.02
     logwage |      3343    5.692994    .5356591    2.70805   7.600402
-------------+--------------------------------------------------------
      tenure |      3343    4.114867    5.862322          0         40
       slack |      3343    .4884834    .4999421          0          1
     abolpos |      3343    .1456775    .3528354          0          1
     explose |      3343    .5025426    .5000683          0          1
    bluecoll |      3343    .6036494     .489212          0          1
-------------+--------------------------------------------------------
    houshead |      3343    .6120251    .4873617          0          1
     married |      3343    .5860006    .4926221          0          1
       child |      3343    .4501944    .4975876          0          1
      ychild |      3343    .1956327    .3967463          0          1
      female |      3343    .3478911    .4763725          0          1
-------------+--------------------------------------------------------
     schlt12 |      3343    .2811846    .4496446          0          1
     schgt12 |      3343    .3356267    .4722797          0          1
    nonwhite |      3343    .1390966    .3460991          0          1
        smsa |      3343    .7241998    .4469835          0          1
      midatl |      3343    .1088842    .3115405          0          1
-------------+--------------------------------------------------------
       encen |      3343    .1429853    .3501103          0          1
       wncen |      3343    .0643135    .2453472          0          1
    southatl |      3343    .2375112    .4256217          0          1
       escen |      3343    .0532456    .2245564          0          1
       wscen |      3343    .1441819    .3513266          0          1
-------------+--------------------------------------------------------
    mountain |      3343    .1079868    .3104102          0          1
     pacific |      3343    .0260245     .159232          0          1
      mining |      3343     .029315    .1687132          0          1
      constr |      3343    .1480706    .3552231          0          1
      transp |      3343    .0646126    .2458778          0          1
-------------+--------------------------------------------------------
       trade |      3343    .1848639    .3882452          0          1
        fire |      3343    .0514508    .2209484          0          1
    services |      3343    .1699073    .3756075          0          1
    pubadmin |      3343    .0095722     .097383          0          1
      year85 |      3343    .2677236     .442839          0          1
-------------+--------------------------------------------------------
      year87 |      3343    .2174693    .4125862          0          1
      year89 |      3343    .1998205    .3999251          0          1
.
. * The following creates a space-delimited data set with
. * variables in same order as Table 2 p.657 of McCall (1996)
. * Permits use by programs other than Stata
. * Note that order has been changed a little from the original Stata data set
.
. outfile spell censor1 censor2 censor3 censor4 age /*
> */ ui reprate disrate logwage tenure slack abolpos explose bluecoll /*
> */ houshead married child ychild female schlt12 schgt12 nonwhite smsa /*
> */ midatl encen wncen southatl escen wscen mountain pacific /*
> */ mining constr transp trade fire services pubadmin /*
> */ year85 year87 year89 using ema1996.asc, replace
.
. ********* ANALYSIS: UNEMPLOYMENT DURATION **********
.
. * and the censoring variable if there is one
. stset spell, fail(censor1=1)
failure event: censor1 == 1
obs. time interval: (0, spell]
------------------------------------------------------------------------------
     3343  total obs.
        0  exclusions

                                          at risk from t =         0
                               earliest observed entry t =         0
                                    last observed exit t =        28
. stdes
         failure _d:  censor1 == 1
   analysis time _t:  spell

                                   |-------------- per subject --------------|
Category                   total        mean         min     median        max
------------------------------------------------------------------------------
no. of subjects             3343
no. of records              3343           1           1          1          1

(first) entry time                         0           0          0          0
(final) exit time                   6.247981           1          5         28

subjects with gap              0
time on gap if gap             0
time at risk               20887    6.247981           1          5         28

failures                    1073    .3209692           0          0          1
------------------------------------------------------------------------------
.
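The totals reported by stdes can be cross-checked by hand. The short Python sketch below (an illustration outside the Stata log, not part of the original program) recomputes the failure share and mean spell length from the three totals, and also the crude hazard, failures divided by total time at risk, which is the maximum likelihood estimate of a constant hazard when there are no regressors. The variable names are hypothetical.

```python
# Totals copied from the stdes output above.
subjects = 3343
failures = 1073          # spells with censor1 == 1
time_at_risk = 20887     # total periods at risk (2-week intervals)

failure_share = failures / subjects        # stdes reports .3209692
mean_spell = time_at_risk / subjects       # stdes reports 6.247981
crude_hazard = failures / time_at_risk     # constant-hazard estimate, no regressors

print(f"share of spells ending in failure: {failure_share:.7f}")
print(f"mean spell length (periods):       {mean_spell:.6f}")
print(f"crude hazard per period:           {crude_hazard:.4f}")
```

The crude hazard of roughly 0.05 per period is only a benchmark; the regressions later in the program let the hazard vary with the regressors.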
. * (1) SUMMARIZE KEY VARIABLES (Table 17.6, p.603)
.
. sum spell censor1 censor2 censor3 censor4 ui reprate disrate tenure logwage
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
       spell |      3343    6.247981    5.611271          1         28
     censor1 |      3343    .3209692    .4669188          0          1
     censor2 |      3343    .1014059    .3019106          0          1
     censor3 |      3343    .1717021    .3771777          0          1
     censor4 |      3343    .3754113    .4843014          0          1
-------------+--------------------------------------------------------
          ui |      3343    .5527969    .4972791          0          1
     reprate |      3343    .4544717    .1137918       .066      2.059
     disrate |      3343    .1094376    .0735274       .002       1.02
      tenure |      3343    4.114867    5.862322          0         40
     logwage |      3343    5.692994    .5356591    2.70805   7.600402
.
. * (2) LIST SURVIVAL CURVE AND CUMULATIVE HAZARD ESTIMATES (Table 17.7, p.605)
.
. * Kaplan-Meier Estimates of Survival Function
. sts list
           Beg.           Net           Survivor      Std.
  Time    Total   Fail   Lost           Function     Error     [95% Conf. Int.]
-------------------------------------------------------------------------------
     1     3343    294    246             0.9121    0.0049     0.9019    0.9212
     2     2803    178    304             0.8541    0.0062     0.8415    0.8659
     3     2321    119    305             0.8103    0.0071     0.7960    0.8238
     4     1897     56    165             0.7864    0.0076     0.7712    0.8008
     5     1676    104    233             0.7376    0.0085     0.7206    0.7538
     6     1339     32    111             0.7200    0.0088     0.7023    0.7369
     7     1196     85    178             0.6688    0.0098     0.6492    0.6876
     8      933     15     70             0.6581    0.0100     0.6380    0.6773
     9      848     33     98             0.6325    0.0106     0.6113    0.6528
    10      717      3     55             0.6298    0.0106     0.6086    0.6503
    11      659     26     77             0.6050    0.0113     0.5825    0.6267
    12      556      7     40             0.5974    0.0115     0.5744    0.6195
    13      509     25     69             0.5680    0.0123     0.5434    0.5918
    14      415     30     74             0.5270    0.0135     0.5001    0.5531
    15      311     19     40             0.4948    0.0146     0.4658    0.5230
    16      252     10     41             0.4751    0.0153     0.4449    0.5047
    17      201      8     24             0.4562    0.0161     0.4245    0.4874
    18      169      7     13             0.4373    0.0169     0.4040    0.4702
    19      149      4     15             0.4256    0.0174     0.3912    0.4595
    20      130      3     18             0.4158    0.0179     0.3804    0.4507
    21      109      4     23             0.4005    0.0188     0.3635    0.4372
    22       82      4      9             0.3810    0.0203     0.3412    0.4206
    23       69      0      9             0.3810    0.0203     0.3412    0.4206
    24       60      0      2             0.3810    0.0203     0.3412    0.4206
    25       58      0     10             0.3810    0.0203     0.3412    0.4206
    26       48      2     13             0.3651    0.0223     0.3214    0.4088
    27       33      5     24             0.3098    0.0296     0.2528    0.3684
    28        4      0      4             0.3098    0.0296     0.2528    0.3684
-------------------------------------------------------------------------------
.
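The survivor function column can be reproduced from the first three columns of the listing. The Python sketch below (an illustration outside the Stata log; names are hypothetical) applies the Kaplan-Meier product-limit formula, multiplying 1 - failures / number at risk over successive times, to the first five rows and compares the result with the reported values.

```python
# (time, number at risk at start of period, failures) for the first five rows
rows = [(1, 3343, 294), (2, 2803, 178), (3, 2321, 119), (4, 1897, 56), (5, 1676, 104)]
reported = [0.9121, 0.8541, 0.8103, 0.7864, 0.7376]  # Survivor Function column

survivor = 1.0
km = []
for _, at_risk, failed in rows:
    # Product-limit step: survive this period with probability 1 - d/n
    survivor *= 1.0 - failed / at_risk
    km.append(survivor)

for (t, _, _), s, r in zip(rows, km, reported):
    print(f"t={t}: computed {s:.4f}, reported {r:.4f}")
```

Spells censored at time t (the Net Lost column) remain in the risk set at t and drop out only afterwards, which is why each Beg. Total equals the previous Beg. Total minus the previous Fail and Net Lost.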
. * Nelson-Aalen Estimates of Cumulative Hazard
. sts list, na
           Beg.           Net       Nelson-Aalen      Std.
  Time    Total   Fail   Lost          Cum. Haz.     Error     [95% Conf. Int.]
-------------------------------------------------------------------------------
     1     3343    294    246             0.0879    0.0051     0.0784    0.0986
     2     2803    178    304             0.1514    0.0070     0.1383    0.1658
     3     2321    119    305             0.2027    0.0084     0.1869    0.2199
     4     1897     56    165             0.2322    0.0093     0.2147    0.2512
     5     1676    104    233             0.2943    0.0111     0.2733    0.3169
     6     1339     32    111             0.3182    0.0119     0.2957    0.3424
     7     1196     85    178             0.3893    0.0142     0.3624    0.4181
     8      933     15     70             0.4053    0.0148     0.3774    0.4353
     9      848     33     98             0.4443    0.0162     0.4135    0.4773
    10      717      3     55             0.4484    0.0164     0.4174    0.4818
    11      659     26     77             0.4879    0.0182     0.4536    0.5248
    12      556      7     40             0.5005    0.0188     0.4650    0.5387
    13      509     25     69             0.5496    0.0212     0.5096    0.5927
    14      415     30     74             0.6219    0.0250     0.5748    0.6728
    15      311     19     40             0.6830    0.0286     0.6291    0.7415
    16      252     10     41             0.7227    0.0313     0.6639    0.7866
    17      201      8     24             0.7625    0.0343     0.6982    0.8327
    18      169      7     13             0.8039    0.0377     0.7333    0.8812
    19      149      4     15             0.8307    0.0400     0.7559    0.9130
    20      130      3     18             0.8538    0.0422     0.7750    0.9406
    21      109      4     23             0.8905    0.0460     0.8048    0.9853
    22       82      4      9             0.9393    0.0521     0.8426    1.0470
    23       69      0      9             0.9393    0.0521     0.8426    1.0470
    24       60      0      2             0.9393    0.0521     0.8426    1.0470
    25       58      0     10             0.9393    0.0521     0.8426    1.0470
    26       48      2     13             0.9809    0.0598     0.8705    1.1055
    27       33      5     24             1.1325    0.0904     0.9685    1.3242
    28        4      0      4             1.1325    0.0904     0.9685    1.3242
-------------------------------------------------------------------------------
.
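The Nelson-Aalen column is a running sum of failures divided by the number at risk. As a check, the Python sketch below (an illustration outside the Stata log; names are hypothetical) recomputes the first five cumulative hazard values from the listing.

```python
# (time, number at risk at start of period, failures) for the first five rows
rows = [(1, 3343, 294), (2, 2803, 178), (3, 2321, 119), (4, 1897, 56), (5, 1676, 104)]
reported = [0.0879, 0.1514, 0.2027, 0.2322, 0.2943]  # Nelson-Aalen Cum. Haz. column

cum_hazard = 0.0
na = []
for _, at_risk, failed in rows:
    # Each period adds its discrete hazard d/n to the running total
    cum_hazard += failed / at_risk
    na.append(cum_hazard)

for (t, _, _), h, r in zip(rows, na, reported):
    print(f"t={t}: computed {h:.4f}, reported {r:.4f}")
```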
. * (3) VARIOUS GRAPHS (Figures 17.3-17.6)
.
. * (3A) Figure 17.3: Overall Survival Function (page 604)
. * sts graph, gwood

. sts gen surv = s
. sts gen lbsurv = lb(s)
. sts gen ubsurv = ub(s)
. sort spell
. graph twoway (line ubsurv spell, msize(vtiny) mstyle(p2) c(J) clstyle(p1) clcolor(gs10)) /*
> */ (line surv spell, msize(vtiny) mstyle(p1) c(J) clstyle(p1)) /*
> */ (line lbsurv spell, msize(vtiny) mstyle(p2) c(J) clstyle(p1) clcolor(gs10)), /*
> */ title("Overall Survival Function Estimate") /*
> */ xtitle("Unemployment Duration in 2-week intervals", size(medlarge)) xscale(titlegap(*5)) /*
> */ ylabel(0.00(0.25)1.00,grid)/*
> */ legend( label(1 "Upper 95% confidence band") label(2 "Survival Estimate") /*
> */
. graph export km_pt1.wmf, replace
(file c:\Imbook\bwebpage\Section4\km_pt1.wmf written in Windows Metafile format)
.
. * (3B) Figure 17.4: Survival Function by Treatment (here ui) (p.605)
. * sts graph, by(ui)
. sts graph, by(ui) /*
> */ title("Survival Function Estimates by UI Status") /*
> */ legend(label(1 "No UI (UI = 0)") label(2 "Received UI (UI = 1)") )
. graph export km_pt2.wmf, replace
(file c:\Imbook\bwebpage\Section4\km_pt2.wmf written in Windows Metafile format)
.
. * (3C) Figure 17.5: Overall Cumulative Hazard Function (p.606)
. * sts graph, cna
. sts gen cumhaz = na
. sts gen lbcumhaz = lb(na)
. sts gen ubcumhaz = ub(na)
. sort spell
. graph twoway (line ubcumhaz spell, msize(vtiny) mstyle(p2) c(J) clstyle(p1) clcolor(gs10)) /*
> */ (line cumhaz spell, msize(vtiny) mstyle(p1) c(J) clstyle(p1)) /*
> */ (line lbcumhaz spell, msize(vtiny) mstyle(p2) c(J) clstyle(p1) clcolor(gs10)), /*
> */ title("Overall Cumulative Hazard Estimate") /*
> */ ylabel(0.00(0.50)1.50,grid)/*
> */ legend( label(1 "Upper 95% confidence band") label(2 "Cumulative Hazard Estimate") /*
> */
. graph export na_pt1.wmf, replace
(file c:\Imbook\bwebpage\Section4\na_pt1.wmf written in Windows Metafile format)
.
. * (3D) Figure 17.6: Cumulative Hazard Function by Treatment (here ui) (p.606)
. * sts graph, na by(ui)
. sts graph, na by(ui) /*
> */ title("Cumulative Hazard Estimates by UI Status") /*
> */ legend(label(1 "No UI (UI = 0)") label(2 "Received UI (UI = 1)") )
. graph export na_pt2.wmf, replace
(file c:\Imbook\bwebpage\Section4\na_pt2.wmf written in Windows Metafile format)
.
. * (4) VARIOUS PARAMETRIC MODELS: COEFFICIENTS (Table 17.8)
.
. * streg default is to report hazard ratios rather than coefficients
. * streg with nohr option reports coefficients
.
. * Create regressors
. gen RR = reprate
. gen DR = disrate
. gen UI = ui
. gen RRUI = RR*UI
. gen DRUI = DR*UI
. gen LOGWAGE = logwage

.
. * Define $xlist = list of regressors used in subsequent regressions
. global xlist RR DR UI RRUI DRUI LOGWAGE /*
> */ tenure slack abolpos explose stateur houshead married /*
> */ female child ychild nonwhite age schlt12 schgt12 smsa bluecoll /*
> */ mining constr transp trade fire services pubadmin /*
> */ year85 year87 year89 midatl /*
> */ encen wncen southatl escen wscen mountain pacific
.
. * Exponential regression
. streg $xlist, nohr robust dist(exponential)
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:


No. of subjects =         3343                     Number of obs   =      3343
No. of failures =         1073
Time at risk    =        20887
                                                   Wald chi2(40)   =    565.24
                                                   Prob > chi2     =    0.0000
------------------------------------------------------------------------------
| Robust
_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | .4720235 .6005534 0.79 0.432 -.7050396 1.649087
DR | -.5756396 .7624489 -0.75 0.450 -2.070012 .9187327
UI | -1.424561 .2493917 -5.71 0.000 -1.91336 -.9357622
RRUI | .9655904 .6118408 1.58 0.115 -.2335956 2.164776
DRUI | -.1990635 1.019118 -0.20 0.845 -2.196498 1.798371
LOGWAGE | .3508005 .115598 3.03 0.002 .1242327 .5773684
tenure | -.0001462 .0064637 -0.02 0.982 -.0128147 .0125224
slack | -.2593666 .0759363 -3.42 0.001 -.4081991 -.1105342
abolpos | -.1550897 .0953306 -1.63 0.104 -.3419342 .0317549
explose | .198458 .0648354 3.06 0.002 .071383 .3255331
stateur | -.064626 .0229903 -2.81 0.005 -.1096862 -.0195659
houshead | .3812208 .0836602 4.56 0.000 .2172499 .5451918
married | .369552 .0786145 4.70 0.000 .2154705 .5236335
female | .1164067 .0852986 1.36 0.172 -.0507754 .2835888

child | -.0333008 .0794577 -0.42 0.675 -.1890352 .1224335
ychild | -.1449722 .1022781 -1.42 0.156 -.3454336 .0554892
nonwhite | -.6692066 .1188272 -5.63 0.000 -.9021037 -.4363095
age | -.0220821 .0039256 -5.63 0.000 -.0297762 -.0143879
schlt12 | -.1231414 .0966102 -1.27 0.202 -.3124939 .066211
schgt12 | .1114395 .082945 1.34 0.179 -.0511297 .2740087
smsa | .1922291 .0799904 2.40 0.016 .0354508 .3490075
bluecoll | -.2033718 .085129 -2.39 0.017 -.3702215 -.036522
mining | -.1205818 .1973575 -0.61 0.541 -.5073955 .2662319
constr | -.04475 .1081519 -0.41 0.679 -.2567237 .1672238
transp | -.1786694 .156034 -1.15 0.252 -.4844906 .1271517
trade | -.0345159 .1019152 -0.34 0.735 -.234266 .1652341
fire | .1120549 .1386716 0.81 0.419 -.1597365 .3838462
services | .1840002 .0983911 1.87 0.061 -.0088428 .3768432
pubadmin | .1090606 .2954211 0.37 0.712 -.4699541 .6880752
year85 | .2147661 .0888664 2.42 0.016 .0405911 .388941
year87 | .3541162 .0948499 3.73 0.000 .1682139 .5400186
year89 | .467082 .1104355 4.23 0.000 .2506325 .6835316
midatl | .0264112 .1465647 0.18 0.857 -.2608503 .3136727
encen | .0043916 .1502813 0.03 0.977 -.2901544 .2989375
wncen | .1724311 .1607689 1.07 0.283 -.1426703 .4875324
southatl | .2638807 .1183726 2.23 0.026 .0318747 .4958867
escen | .35414 .19317 1.83 0.067 -.0244664 .7327463
wscen | .3385896 .1433308 2.36 0.018 .0576664 .6195128
mountain | .0063693 .1538821 0.04 0.967 -.2952341 .3079727
pacific | .0770202 .2393505 0.32 0.748 -.3920982 .5461385
_cons | -4.079107 .8767097 -4.65 0.000 -5.797426 -2.360788
------------------------------------------------------------------------------
. estimates store bexponential
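Because UI enters the model directly and through the interactions RRUI = RR*UI and DRUI = DR*UI, the UI coefficient on its own is the effect at RR = DR = 0. One rough way to read the combined effect is to evaluate UI + RRUI*RR + DRUI*DR at the sample means of reprate and disrate from the summary statistics. The Python sketch below is an illustrative calculation, not part of the original program; evaluating at the means is an assumption made here, and a standard error for the combination would need the full covariance matrix (for example from Stata's lincom).

```python
import math

# Exponential-model coefficients from the output above
b_ui, b_rrui, b_drui = -1.424561, 0.9655904, -0.1990635

# Sample means of reprate and disrate from the summary statistics
mean_rr, mean_dr = 0.4544717, 0.1094376

# Effect of UI on the log hazard at mean RR and DR (assumed evaluation point)
ui_effect = b_ui + b_rrui * mean_rr + b_drui * mean_dr
ui_hazard_ratio = math.exp(ui_effect)

print(f"UI effect on log hazard at the means: {ui_effect:.4f}")
print(f"implied hazard ratio:                 {ui_hazard_ratio:.3f}")
```

Under this evaluation the effect is about -1.01 on the log-hazard scale, noticeably smaller in magnitude than the raw UI coefficient of -1.42, because the positive RRUI term offsets part of it.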
.
. * Weibull regression
. streg $xlist, nohr robust dist(weibull)
Fitting constant-only model:
Fitting full model:
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:


Weibull regression -- log relative-hazard form
No. of subjects =         3343                     Number of obs   =      3343
No. of failures =         1073
Time at risk    =        20887
                                                   Wald chi2(40)   =    501.65
                                                   Prob > chi2     =    0.0000
------------------------------------------------------------------------------
| Robust
_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | .4481156 .6381895 0.70 0.483 -.8027127 1.698944
DR | -.4269187 .8086983 -0.53 0.598 -2.011938 1.158101
UI | -1.496066 .2639679 -5.67 0.000 -2.013434 -.9786984
RRUI | 1.015226 .6455611 1.57 0.116 -.2500501 2.280503
DRUI | -.2988417 1.065384 -0.28 0.779 -2.386956 1.789272
LOGWAGE | .3655253 .12212 2.99 0.003 .1261745 .6048761
tenure | -.0011127 .0068716 -0.16 0.871 -.0145809 .0123554
slack | -.2652154 .0803214 -3.30 0.001 -.4226424 -.1077883
abolpos | -.1604227 .1012942 -1.58 0.113 -.3589557 .0381103
explose | .2075085 .0684715 3.03 0.002 .0733068 .3417103
stateur | -.0708745 .0242117 -2.93 0.003 -.1183286 -.0234204
houshead | .3976626 .0887192 4.48 0.000 .2237762 .571549
married | .3786057 .0830317 4.56 0.000 .2158665 .541345
female | .1260829 .0896987 1.41 0.160 -.0497233 .301889
child | -.0336778 .0839956 -0.40 0.688 -.1983061 .1309505
ychild | -.1613066 .108947 -1.48 0.139 -.3748389 .0522256
nonwhite | -.7025504 .12426 -5.65 0.000 -.9460956 -.4590052
age | -.0235823 .0041922 -5.63 0.000 -.0317989 -.0153658
schlt12 | -.1226759 .1022762 -1.20 0.230 -.3231335 .0777816
schgt12 | .1162848 .0880692 1.32 0.187 -.0563278 .2888973
smsa | .1999567 .0841129 2.38 0.017 .0350985 .3648149
bluecoll | -.1994925 .0899354 -2.22 0.027 -.3757626 -.0232223
mining | -.1015676 .2036644 -0.50 0.618 -.5007425 .2976073
constr | -.0253737 .1135609 -0.22 0.823 -.247949 .1972016
transp | -.1981522 .1672141 -1.19 0.236 -.5258858 .1295814
trade | -.0311361 .1079502 -0.29 0.773 -.2427146 .1804423
fire | .1262153 .1492527 0.85 0.398 -.1663145 .4187452
services | .2031673 .1038945 1.96 0.051 -.0004622 .4067968
pubadmin | .1117728 .3087374 0.36 0.717 -.4933415 .716887
year85 | .2374972 .093387 2.54 0.011 .054462 .4205325
year87 | .3787397 .1011782 3.74 0.000 .1804341 .5770454
year89 | .4920278 .1180472 4.17 0.000 .2606596 .7233959
midatl | .02465 .1542139 0.16 0.873 -.2776037 .3269036
encen | -.0014111 .1579065 -0.01 0.993 -.3109023 .30808
wncen | .1844363 .1694444 1.09 0.276 -.1476687 .5165413
southatl | .2740974 .1250481 2.19 0.028 .0290076 .5191872
escen | .367742 .2024771 1.82 0.069 -.0291058 .7645899

wscen | .3440005 .1527804 2.25 0.024 .0445563 .6434446
mountain | .0159627 .1620188 0.10 0.922 -.3015883 .3335136
pacific | .0849532 .2504077 0.34 0.734 -.4058368 .5757432
_cons | -4.357886 .9196792 -4.74 0.000 -6.160424 -2.555347
-------------+----------------------------------------------------------------
/ln_p | .1215314 .0194374 6.25 0.000 .0834348 .1596281
-------------+----------------------------------------------------------------
p | 1.129225 .0219492 1.087014 1.173075
1/p | .8855632 .0172131 .8524608 .9199511
------------------------------------------------------------------------------
. estimates store bweibull
.
. * Gompertz regression
. streg $xlist, nohr robust dist(gompertz)
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:

Fitting full model:

Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:

Gompertz regression -- log relative-hazard form

No. of subjects =         3343                     Number of obs   =      3343
No. of failures =         1073
Time at risk    =        20887
                                                   Wald chi2(40)   =    529.75
Log pseudo-likelihood =  -2700.605                 Prob > chi2     =    0.0000
------------------------------------------------------------------------------
| Robust
_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | .472405 .6033813 0.78 0.434 -.7102005 1.655011
DR | -.5627894 .7646131 -0.74 0.462 -2.061404 .9358247
UI | -1.428355 .2508349 -5.69 0.000 -1.919982 -.9367272

RRUI | .9689413 .6144464 1.58 0.115 -.2353514 2.173234
DRUI | -.2112495 1.021112 -0.21 0.836 -2.212593 1.790094
LOGWAGE | .3524722 .1162698 3.03 0.002 .1245876 .5803567
tenure | -.0002233 .0065002 -0.03 0.973 -.0129635 .0125168
slack | -.2593933 .0762829 -3.40 0.001 -.4089051 -.1098815
abolpos | -.1552595 .0958002 -1.62 0.105 -.3430244 .0325053
explose | .1991286 .0650876 3.06 0.002 .0715592 .326698
stateur | -.065244 .0231645 -2.82 0.005 -.1106456 -.0198424
houshead | .3822818 .0841671 4.54 0.000 .2173173 .5472464
married | .3700141 .0789107 4.69 0.000 .215352 .5246762
female | .1170987 .0856236 1.37 0.171 -.0507206 .2849179
child | -.0331425 .0798246 -0.42 0.678 -.1895958 .1233108
ychild | -.1466596 .102884 -1.43 0.154 -.3483085 .0549893
nonwhite | -.6720521 .1197092 -5.61 0.000 -.9066778 -.4374264
age | -.0222175 .0039787 -5.58 0.000 -.0300157 -.0144193
schlt12 | -.1228615 .097015 -1.27 0.205 -.3130075 .0672845
schgt12 | .1121295 .0831976 1.35 0.178 -.0509348 .2751938
smsa | .1925807 .0803478 2.40 0.017 .0351019 .3500596
bluecoll | -.203405 .0854986 -2.38 0.017 -.3709791 -.0358309
mining | -.1183683 .1976441 -0.60 0.549 -.5057435 .269007
constr | -.0423947 .1082891 -0.39 0.695 -.2546375 .169848
transp | -.1799724 .1570001 -1.15 0.252 -.487687 .1277422
trade | -.0341793 .1023611 -0.33 0.738 -.2348034 .1664447
fire | .1143611 .1398161 0.82 0.413 -.1596734 .3883955
services | .1854033 .0987923 1.88 0.061 -.0082261 .3790327
pubadmin | .1089298 .2965867 0.37 0.713 -.4723694 .690229
year85 | .2172389 .0890506 2.44 0.015 .0427028 .3917749
year87 | .3564181 .095298 3.74 0.000 .1696374 .5431988
year89 | .4690752 .1114266 4.21 0.000 .250683 .6874674
midatl | .026766 .1471298 0.18 0.856 -.2616031 .3151351
encen | .0043808 .15089 0.03 0.977 -.2913581 .3001198
wncen | .1735986 .1614007 1.08 0.282 -.142741 .4899382
southatl | .2647448 .1188746 2.23 0.026 .031755 .4977347
escen | .3560917 .1938142 1.84 0.066 -.0237772 .7359606
wscen | .3393956 .1442438 2.35 0.019 .0566829 .6221082
mountain | .0076507 .1545162 0.05 0.961 -.2951954 .3104969
pacific | .0778885 .2400495 0.32 0.746 -.3925999 .5483769
_cons | -4.09733 .8802997 -4.65 0.000 -5.822686 -2.371975
-------------+----------------------------------------------------------------
gamma | .002658 .0067759 0.39 0.695 -.0106225 .0159386
------------------------------------------------------------------------------
. estimates store bgompertz
.
. stcox $xlist, nohr robust

Refining estimates:
No. of subjects =         3343                     Number of obs   =      3343
No. of failures =         1073
Time at risk    =        20887
                                                   Wald chi2(40)   =    540.98
                                                   Prob > chi2     =    0.0000
------------------------------------------------------------------------------
| Robust
_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | .5222796 .5711698 0.91 0.361 -.5971926 1.641752
DR | -.752507 .72175 -1.04 0.297 -2.167111 .6620971
UI | -1.317719 .2372893 -5.55 0.000 -1.782798 -.8526409
RRUI | .8822462 .582115 1.52 0.130 -.2586783 2.023171
DRUI | -.0951357 .977774 -0.10 0.922 -2.011538 1.821266
LOGWAGE | .3352639 .1106483 3.03 0.002 .1183972 .5521306
tenure | .0008278 .0061286 0.14 0.893 -.0111841 .0128396
slack | -.247863 .0721173 -3.44 0.001 -.3892103 -.1065158
abolpos | -.1511638 .0905035 -1.67 0.095 -.3285475 .0262198
explose | .1865068 .0615742 3.03 0.002 .0658236 .30719
stateur | -.0590475 .022085 -2.67 0.008 -.1023334 -.0157616
houshead | .3601866 .0794827 4.53 0.000 .2044035 .5159698
married | .358819 .0746355 4.81 0.000 .2125362 .5051019
female | .1002758 .0813277 1.23 0.218 -.0591236 .2596753
child | -.0396054 .0755365 -0.52 0.600 -.1876542 .1084435
ychild | -.1276638 .0967856 -1.32 0.187 -.3173602 .0620325
nonwhite | -.6394475 .1151332 -5.55 0.000 -.8651043 -.4137906
age | -.0204623 .0037593 -5.44 0.000 -.0278305 -.0130942
schlt12 | -.1220585 .0920073 -1.33 0.185 -.3023895 .0582726
schgt12 | .1104817 .0783542 1.41 0.159 -.0430897 .2640531
smsa | .1864841 .0766075 2.43 0.015 .0363361 .3366321
bluecoll | -.2108023 .080867 -2.61 0.009 -.3692986 -.052306
mining | -.1238251 .1906352 -0.65 0.516 -.4974632 .249813
constr | -.054455 .1029488 -0.53 0.597 -.256231 .1473209
transp | -.1551657 .1466515 -1.06 0.290 -.4425973 .1322659
trade | -.0383252 .0968106 -0.40 0.692 -.2280706 .1514201
fire | .1097585 .1300779 0.84 0.399 -.1451895 .3647065
services | .1666262 .0939507 1.77 0.076 -.0175138 .3507662
pubadmin | .1022002 .2829817 0.36 0.718 -.4524336 .6568341
year85 | .204162 .084908 2.40 0.016 .0377454 .3705786
year87 | .3384229 .0899115 3.76 0.000 .1621997 .5146462

year89 | .4486559 .104937 4.28 0.000 .2429832 .6543286
midatl | .0342238 .140515 0.24 0.808 -.2411805 .3096282
encen | .0174597 .1438862 0.12 0.903 -.2645521 .2994716
wncen | .1650967 .1532559 1.08 0.281 -.1352795 .4654728
southatl | .2518023 .1127138 2.23 0.025 .0308874 .4727172
escen | .3450422 .1839818 1.88 0.061 -.0155554 .7056398
wscen | .3316752 .1359801 2.44 0.015 .0651591 .5981914
mountain | .009484 .1468626 0.06 0.949 -.2783613 .2973293
pacific | .0720292 .2263339 0.32 0.750 -.3715771 .5156355
------------------------------------------------------------------------------
. estimates store bcox
.
. * Display Results for Table 17.8 (page 607)
. estimates table bexponential bweibull bgompertz, t stats(N ll) b(%8.3f) /*
> */ keep(RR DR UI RRUI DRUI LOGWAGE _cons)
-----------------------------------------------
    Variable | bexpon~l    bweibull    bgompe~z
-------------+---------------------------------
          RR |    0.472       0.448       0.472
             |     0.79        0.70        0.78
          DR |   -0.576      -0.427      -0.563
             |    -0.75       -0.53       -0.74
          UI |   -1.425      -1.496      -1.428
             |    -5.71       -5.67       -5.69
        RRUI |    0.966       1.015       0.969
             |     1.58        1.57        1.58
        DRUI |   -0.199      -0.299      -0.211
             |    -0.20       -0.28       -0.21
     LOGWAGE |    0.351       0.366       0.352
             |     3.03        2.99        3.03
       _cons |   -4.079      -4.358      -4.097
             |    -4.65       -4.74       -4.65
-------------+---------------------------------
           N | 3343.000    3343.000    3343.000
          ll | -2.7e+03    -2.7e+03    -2.7e+03
-----------------------------------------------
                                    legend: b/t
. estimates table bcox, t stats(N ll) b(%8.3f) keep(RR DR UI RRUI DRUI LOGWAGE)
--------------------------
    Variable |     bcox
-------------+------------
          RR |    0.522
             |     0.91
          DR |   -0.753
             |    -1.04
          UI |   -1.318
             |    -5.55
        RRUI |    0.882
             |     1.52
        DRUI |   -0.095
             |    -0.10
     LOGWAGE |    0.335
             |     3.03
-------------+------------
           N | 3343.000
          ll | -7.7e+03
--------------------------
               legend: b/t
.
. * (5) VARIOUS PARAMETRIC MODELS: HAZARD RATIOS (Table 17.9, page 608)
.
. * streg default is to report hazard ratios rather than coefficients
. * streg with nohr option reports coefficients
.
. * Exponential regression
. streg $xlist, robust dist(exponential)
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:


No. of subjects =         3343                     Number of obs   =      3343
No. of failures =         1073
Time at risk    =        20887
                                                   Wald chi2(40)   =    565.24
                                                   Prob > chi2     =    0.0000
------------------------------------------------------------------------------
| Robust
_t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | 1.603235 .9628283 0.79 0.432 .494089 5.202226
DR | .5623451 .4287594 -0.75 0.450 .1261843 2.506112
UI | .2406141 .0600072 -5.71 0.000 .1475837 .3922867
RRUI | 2.626338 1.606901 1.58 0.115 .7916819 8.712654
DRUI | .8194978 .8351649 -0.20 0.845 .1111919 6.039799
LOGWAGE | 1.420204 .1641727 3.03 0.002 1.132279 1.781344
tenure | .9998539 .0064627 -0.02 0.982 .9872671 1.012601

slack | .7715401 .0585879 -3.42 0.001 .6648465 .8953557
abolpos | .8563384 .0816353 -1.63 0.104 .7103949 1.032264
explose | 1.219521 .0790681 3.06 0.002 1.073992 1.384769
stateur | .937418 .0215515 -2.81 0.005 .8961153 .9806243
houshead | 1.464071 .1224844 4.56 0.000 1.242655 1.724939
married | 1.447086 .1137619 4.70 0.000 1.240445 1.68815
female | 1.123453 .0958289 1.36 0.172 .9504921 1.327887
child | .9672475 .0768553 -0.42 0.675 .8277574 1.130244
ychild | .8650463 .0884753 -1.42 0.156 .7079133 1.057058
nonwhite | .5121147 .0608532 -5.63 0.000 .4057153 .6464176
age | .9781599 .0038399 -5.63 0.000 .9706627 .9857151
schlt12 | .8841386 .0854168 -1.27 0.202 .7316201 1.068452
schgt12 | 1.117886 .0927231 1.34 0.179 .9501554 1.315226
smsa | 1.211948 .0969443 2.40 0.016 1.036087 1.41766
bluecoll | .8159748 .0694631 -2.39 0.017 .6905813 .9641369
mining | .8864046 .1749386 -0.61 0.541 .6020616 1.305038
constr | .9562365 .1034188 -0.41 0.679 .7735819 1.182019
transp | .8363823 .1305041 -1.15 0.252 .6160109 1.135589
trade | .966073 .0984575 -0.34 0.735 .7911514 1.179669
fire | 1.118574 .1551145 0.81 0.419 .8523684 1.46792
services | 1.202016 .1182677 1.87 0.061 .9911962 1.457676
pubadmin | 1.11523 .3294624 0.37 0.712 .625031 1.989882
year85 | 1.239572 .1101563 2.42 0.016 1.041426 1.475418
year87 | 1.424921 .1351536 3.73 0.000 1.18319 1.716039
year89 | 1.595332 .1761812 4.23 0.000 1.284838 1.980861
midatl | 1.026763 .1504872 0.18 0.857 .7703962 1.368442
encen | 1.004401 .1509427 0.03 0.977 .7481481 1.348425
wncen | 1.18819 .191024 1.07 0.283 .8670399 1.628293
southatl | 1.301973 .1541179 2.23 0.026 1.032388 1.641953
escen | 1.424955 .2752586 1.83 0.067 .9758305 2.080787
wscen | 1.402967 .2010884 2.36 0.018 1.059362 1.858023
mountain | 1.00639 .1548654 0.04 0.967 .7443573 1.360664
pacific | 1.080064 .2585138 0.32 0.748 .6756378 1.726573
------------------------------------------------------------------------------
.
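The hazard-ratio output is a transformation of the coefficient output from the nohr run rather than a separate estimate. The Python sketch below (an illustration outside the Stata log; names are hypothetical) checks this for the UI row of the exponential model: the hazard ratio is exp(coefficient), its standard error follows from the delta method as hazard ratio times the coefficient standard error, and the confidence limits are the exponentiated coefficient limits. The z statistic and p-value are the same in both tables.

```python
import math

# UI row of the exponential regression with the nohr option
coef, se_coef = -1.424561, 0.2493917
ci_coef = (-1.91336, -0.9357622)

hazard_ratio = math.exp(coef)                      # reported .2406141
se_hr = hazard_ratio * se_coef                     # delta method, reported .0600072
ci_hr = tuple(math.exp(b) for b in ci_coef)        # reported .1475837, .3922867

print(f"hazard ratio = {hazard_ratio:.7f}, se = {se_hr:.7f}")
print(f"95% CI = ({ci_hr[0]:.7f}, {ci_hr[1]:.7f})")
```

The interval for the hazard ratio is not symmetric around the point estimate, because it is formed on the coefficient scale and then exponentiated.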
. streg $xlist, robust dist(weibull)
Fitting full model:
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:


No. of subjects =         3343                     Number of obs   =      3343
No. of failures =         1073
Time at risk    =        20887
                                                   Wald chi2(40)   =    501.65
                                                   Prob > chi2     =    0.0000
------------------------------------------------------------------------------
| Robust
_t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | 1.56536 .998996 0.70 0.483 .4481117 5.46817
DR | .6525166 .527689 -0.53 0.598 .1337292 3.183881
UI | .2240097 .0591314 -5.67 0.000 .1335294 .3757999
RRUI | 2.759988 1.781741 1.57 0.116 .7787618 9.781599
DRUI | .7416768 .7901705 -0.28 0.779 .0919091 5.985096
LOGWAGE | 1.441271 .176008 2.99 0.003 1.13448 1.831025
tenure | .9988879 .006864 -0.16 0.871 .9855249 1.012432
slack | .7670407 .0616098 -3.30 0.001 .6553129 .8978176
abolpos | .8517837 .0862808 -1.58 0.113 .6984053 1.038846
explose | 1.230608 .0842616 3.03 0.002 1.076061 1.407352
stateur | .9315788 .0225551 -2.93 0.003 .8884041 .9768517
houshead | 1.488342 .1320445 4.48 0.000 1.250791 1.771008
married | 1.460247 .1212469 4.56 0.000 1.240937 1.718316
female | 1.134376 .101752 1.41 0.160 .9514927 1.352411
child | .966883 .0812139 -0.40 0.688 .8201188 1.139911
ychild | .8510311 .0927173 -1.48 0.139 .6874 1.053613
nonwhite | .4953204 .0615485 -5.65 0.000 .388254 .6319119
age | .9766936 .0040945 -5.63 0.000 .9687014 .9847517
schlt12 | .8845503 .0904684 -1.20 0.230 .7238772 1.080887
schgt12 | 1.123316 .0989295 1.32 0.187 .9452293 1.334955
smsa | 1.22135 .1027313 2.38 0.017 1.035722 1.440247
bluecoll | .8191464 .0736702 -2.22 0.027 .6867654 .9770452
mining | .9034201 .1839945 -0.50 0.618 .6060805 1.346633
constr | .9749455 .1107157 -0.22 0.823 .7803997 1.21799
transp | .820245 .1371565 -1.19 0.236 .5910316 1.138352
trade | .9693436 .1046408 -0.29 0.773 .7844954 1.197747
fire | 1.134526 .1693311 0.85 0.398 .8467799 1.520053
services | 1.225277 .1272996 1.96 0.051 .9995379 1.501999
pubadmin | 1.118259 .3452483 0.36 0.717 .6105827 2.048048
year85 | 1.268072 .1184214 2.54 0.011 1.055972 1.522772
year87 | 1.460443 .147765 3.74 0.000 1.197737 1.780769
year89 | 1.63563 .1930814 4.17 0.000 1.297786 2.061422
midatl | 1.024956 .1580625 0.16 0.873 .757597 1.386668
encen | .9985899 .1576839 -0.01 0.993 .7327855 1.36081
wncen | 1.20254 .2037638 1.09 0.276 .8627169 1.67622
southatl | 1.315343 .1644812 2.19 0.028 1.029432 1.680661
escen | 1.444469 .292472 1.82 0.069 .9713137 2.148113
wscen | 1.410579 .2155089 2.25 0.024 1.045564 1.903025
mountain | 1.016091 .1646258 0.10 0.922 .7396425 1.395864
pacific | 1.088666 .2726104 0.34 0.734 .6664189 1.778452
-------------+----------------------------------------------------------------
/ln_p | .1215314 .0194374 6.25 0.000 .0834348 .1596281
-------------+----------------------------------------------------------------
p | 1.129225 .0219492 1.087014 1.173075
1/p | .8855632 .0172131 .8524608 .9199511
------------------------------------------------------------------------------
.
. * Gompertz regression
. streg $xlist, robust dist(gompertz)
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:

Fitting full model:

Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:

Gompertz regression -- log relative-hazard form

No. of subjects =         3343                     Number of obs   =      3343
No. of failures =         1073
Time at risk    =        20887
                                                   Wald chi2(40)   =    529.75
Log pseudo-likelihood =  -2700.605                 Prob > chi2     =    0.0000
------------------------------------------------------------------------------
| Robust
_t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | 1.603847 .9677311 0.78 0.434 .4915456 5.233135
DR | .5696179 .4355373 -0.74 0.462 .1272752 2.549315

UI | .239703 .0601259 -5.69 0.000 .1466096 .3919084
RRUI | 2.635153 1.61916 1.58 0.115 .7902931 8.786655
DRUI | .809572 .8266639 -0.21 0.836 .1094166 5.990014
LOGWAGE | 1.42258 .165403 3.03 0.002 1.132681 1.786676
tenure | .9997767 .0064987 -0.03 0.973 .9871202 1.012595
slack | .7715195 .0588538 -3.40 0.001 .6643773 .8959403
abolpos | .856193 .0820234 -1.62 0.105 .7096209 1.033039
explose | 1.220339 .079429 3.06 0.002 1.074182 1.386383
stateur | .9368388 .0217014 -2.82 0.005 .895256 .9803531
houshead | 1.465625 .1233575 4.54 0.000 1.242738 1.728487
married | 1.447755 .1142433 4.69 0.000 1.240298 1.689912
female | 1.12423 .0962607 1.37 0.171 .9505442 1.329653
child | .9674007 .0772224 -0.42 0.678 .8272934 1.131236
ychild | .8635879 .0888493 -1.43 0.154 .7058811 1.056529
nonwhite | .5106596 .0611307 -5.61 0.000 .4038637 .6456961
age | .9780275 .0038913 -5.58 0.000 .9704303 .9856841
schlt12 | .8843861 .0857988 -1.27 0.205 .7312444 1.0696
schgt12 | 1.118658 .0930697 1.35 0.178 .9503406 1.316786
smsa | 1.212374 .0974117 2.40 0.017 1.035725 1.419152
bluecoll | .8159478 .0697624 -2.38 0.017 .6900584 .9648035
mining | .8883688 .1755808 -0.60 0.549 .603057 1.308664
constr | .9584913 .1037942 -0.39 0.695 .7751974 1.185125
transp | .8352933 .1311411 -1.15 0.252 .614045 1.13626
trade | .9663982 .0989216 -0.33 0.738 .7907263 1.181098
fire | 1.121157 .1567557 0.82 0.413 .8524222 1.474613
services | 1.203704 .1189167 1.88 0.061 .9918076 1.460871
pubadmin | 1.115084 .3307191 0.37 0.713 .6235232 1.994172
year85 | 1.242641 .110658 2.44 0.015 1.043628 1.479605
year87 | 1.428205 .1361051 3.74 0.000 1.184875 1.721505
year89 | 1.598515 .1781172 4.21 0.000 1.284903 1.988673
midatl | 1.027127 .1511211 0.18 0.856 .7698165 1.370444
encen | 1.00439 .1515525 0.03 0.977 .747248 1.35002
wncen | 1.189578 .1919987 1.08 0.282 .8669786 1.632215
southatl | 1.303098 .1549053 2.23 0.026 1.032265 1.644991
escen | 1.427739 .276716 1.84 0.066 .9765033 2.087486
wscen | 1.404099 .2025325 2.35 0.019 1.05832 1.862851
mountain | 1.00768 .1557029 0.05 0.961 .7443861 1.364103
pacific | 1.081002 .2594941 0.32 0.746 .6752989 1.730442
-------------+----------------------------------------------------------------
gamma | .002658 .0067759 0.39 0.695 -.0106225 .0159386
------------------------------------------------------------------------------
.
. * Cox regression
. stcox $xlist, robust

Refining estimates:
No. of subjects =         3343               Number of obs   =      3343
No. of failures =         1073
Time at risk    =        20887
                                             Wald chi2(40)   =    540.98
                                             Prob > chi2     =    0.0000
------------------------------------------------------------------------------
             |               Robust
          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | 1.685866 .962916 0.91 0.361 .5503545 5.164209
DR | .4711838 .3400769 -1.04 0.297 .1145079 1.938854
UI | .2677452 .0635331 -5.55 0.000 .168167 .4262877
RRUI | 2.416321 1.406577 1.52 0.130 .7720714 7.562264
DRUI | .9092495 .8890406 -0.10 0.922 .1337828 6.179678
LOGWAGE | 1.398309 .1547206 3.03 0.002 1.125691 1.73695
tenure | 1.000828 .0061337 0.14 0.893 .9888782 1.012922
slack | .7804668 .0562851 -3.44 0.001 .6775918 .8989608
abolpos | .8597068 .0778065 -1.67 0.095 .7199688 1.026567
explose | 1.205033 .0741989 3.03 0.002 1.068038 1.359599
stateur | .942662 .0208187 -2.67 0.008 .9027285 .9843619
houshead | 1.433597 .1139461 4.53 0.000 1.226793 1.675262
married | 1.431638 .106851 4.81 0.000 1.236811 1.657154
female | 1.105476 .0899059 1.23 0.218 .9425903 1.296509
child | .9611687 .0726033 -0.52 0.600 .8289013 1.114542
ychild | .8801492 .0851858 -1.32 0.187 .7280685 1.063997
nonwhite | .5275839 .0607424 -5.55 0.000 .4210076 .6611394
age | .9797456 .0036832 -5.44 0.000 .9725532 .9869912
schlt12 | .8850966 .0814354 -1.33 0.185 .7390501 1.060004
schgt12 | 1.116816 .0875072 1.41 0.159 .9578255 1.302197
smsa | 1.205005 .0923125 2.43 0.015 1.037004 1.400224
bluecoll | .8099341 .0654969 -2.61 0.009 .6912189 .9490384
mining | .8835344 .1684327 -0.65 0.516 .6080713 1.283785
constr | .9470011 .0974926 -0.53 0.597 .7739632 1.158726
transp | .8562733 .1255737 -1.06 0.290 .6423659 1.141412
trade | .9623999 .0931706 -0.40 0.692 .796068 1.163485
fire | 1.116009 .1451681 0.84 0.399 .8648584 1.440091
services | 1.181313 .1109851 1.77 0.076 .9826387 1.420155
pubadmin | 1.107605 .313432 0.36 0.718 .6360783 1.928677
year85 | 1.226497 .1041394 2.40 0.016 1.038467 1.448572
year87 | 1.402734 .1261218 3.76 0.000 1.176095 1.673046
year89 | 1.566206 .1643529 4.28 0.000 1.275047 1.92385
midatl | 1.034816 .1454072 0.24 0.808 .7856998 1.362918

encen | 1.017613 .1464205 0.12 0.903 .7675496 1.349146
wncen | 1.179507 .1807665 1.08 0.281 .8734718 1.592767
southatl | 1.286342 .1449884 2.23 0.025 1.031369 1.604348
escen | 1.41205 .2597913 1.88 0.061 .984565 2.025142
wscen | 1.3933 .1894611 2.44 0.015 1.067329 1.818826
mountain | 1.009529 .148262 0.06 0.949 .7570232 1.346259
pacific | 1.074687 .243238 0.32 0.750 .6896459 1.674702
------------------------------------------------------------------------------
.
. * Display results for Table 17.9 page 608
. * Not possible here as estimates table gives coefficients not hazard rates
. * Instead need to use output for each model
. * Not sure why t-statistics differ somewhat from those in Table 17.9
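The hazard-ratio tables above report z statistics that are computed on the coefficient scale. A minimal Python sketch, assuming the usual delta-method link between a hazard ratio and its coefficient (b = ln HR, se(b) = se(HR)/HR), recovers the coefficient-scale figures from the UI row of the Gompertz output. The function name `hr_to_coef` is hypothetical and is not part of the original program.

```python
import math

def hr_to_coef(hazard_ratio, se_hr):
    """Convert a hazard ratio and its delta-method SE to the coefficient scale."""
    b = math.log(hazard_ratio)      # coefficient = ln(hazard ratio)
    se_b = se_hr / hazard_ratio     # delta method: se(HR) = HR * se(b)
    return b, se_b, b / se_b

# UI row of the Gompertz output: Haz. Ratio .239703, robust Std. Err. .0601259
b, se_b, z = hr_to_coef(0.239703, 0.0601259)
print(f"coef = {b:.4f}, se = {se_b:.4f}, z = {z:.2f}")
```

Under these assumptions the z value agrees with the -5.69 shown for UI in the Gompertz table, which is why the same z applies whether coefficients or hazard ratios are displayed.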
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section4\mma17p4duration.txt
log type: text
closed on: 19 May 2005, 15:25:17
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section4\mma18p1heterogeneity.txt
log type: text
opened on: 19 May 2005, 17:58:22
.
. ********** OVERVIEW OF MMA18P1HETEROGENEITY.DO **********
.
. * STATA Program
.
. * Chapter 18.8 Pages 632-6
. * Unobserved Heterogeneity with Duration data Example
. * (1) Exponential with and without heterogeneity
. *     Residuals Plots: Figures 18.2 (exp.wmf) and 18.3 (exp_gamma.wmf)
. *     Tabulate Model Estimates: Table 18.1
. * (2) Weibull with and without heterogeneity: Generalized Residuals Plots
. *     Residuals Plots: Figures 18.4 (Weibul16.wmf) and 18.5 (Weibul16_IG.wmf)
. *     Tabulate Model Estimates: Table 18.2
.
. * ema1996.dta
.
. ********** SETUP **********
.
. set more off
. version 8.0
. set matsize 100
.
.
.*
.
. * of 1986, 1988, 1990 and 1992 on 33 variables including
. * spell = length of spell in number of two-week intervals
. * CENSOR1 = 1 if re-employed at full-time job
.
. * See program mma17p4duration.do for further description of the data set
.
. ********** READ DATA **********
.
. use ema1996.dta
.
. ********** CREATE ADDITIONAL VARIABLES **********
.
. gen RR = reprate
. gen DR = disrate
. gen UI = ui
. gen RRUI = RR*UI
. gen DRUI = DR*UI
. sum
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       spell |      3343    6.247981    5.611271          1         28
     censor1 |      3343    .3209692    .4669188          0          1
     censor2 |      3343    .1014059    .3019106          0          1
     censor3 |      3343    .1717021    .3771777          0          1
     censor4 |      3343    .3754113    .4843014          0          1
-------------+--------------------------------------------------------
          ui |      3343    .5527969    .4972791          0          1
     reprate |      3343    .4544717    .1137918       .066      2.059
     logwage |      3343    5.692994    .5356591    2.70805   7.600402
      tenure |      3343    4.114867    5.862322          0         40
     disrate |      3343    .1094376    .0735274       .002       1.02
-------------+--------------------------------------------------------
       slack |      3343    .4884834    .4999421          0          1
     abolpos |      3343    .1456775    .3528354          0          1
     explose |      3343    .5025426    .5000683          0          1
     stateur |      3343      6.5516    1.803825        2.5         13
    houshead |      3343    .6120251    .4873617          0          1
-------------+--------------------------------------------------------
     married |      3343    .5860006    .4926221          0          1
      female |      3343    .3478911    .4763725          0          1
       child |      3343    .4501944    .4975876          0          1
      ychild |      3343    .1956327    .3967463          0          1
    nonwhite |      3343    .1390966    .3460991          0          1
-------------+--------------------------------------------------------
         age |      3343    35.44331     10.6402         20         61
     schlt12 |      3343    .2811846    .4496446          0          1
     schgt12 |      3343    .3356267    .4722797          0          1
        smsa |      3343    .7241998    .4469835          0          1
    bluecoll |      3343    .6036494     .489212          0          1
-------------+--------------------------------------------------------
      mining |      3343     .029315    .1687132          0          1
      constr |      3343    .1480706    .3552231          0          1
      transp |      3343    .0646126    .2458778          0          1
       trade |      3343    .1848639    .3882452          0          1
        fire |      3343    .0514508    .2209484          0          1
-------------+--------------------------------------------------------
    services |      3343    .1699073    .3756075          0          1
    pubadmin |      3343    .0095722     .097383          0          1
      year85 |      3343    .2677236     .442839          0          1
      year87 |      3343    .2174693    .4125862          0          1
      year89 |      3343    .1998205    .3999251          0          1
-------------+--------------------------------------------------------
      midatl |      3343    .1088842    .3115405          0          1
       encen |      3343    .1429853    .3501103          0          1
       wncen |      3343    .0643135    .2453472          0          1
    southatl |      3343    .2375112    .4256217          0          1
       escen |      3343    .0532456    .2245564          0          1
-------------+--------------------------------------------------------
       wscen |      3343    .1441819    .3513266          0          1
    mountain |      3343    .1079868    .3104102          0          1
     pacific |      3343    .0260245     .159232          0          1
          RR |      3343    .4544717    .1137918       .066      2.059
          DR |      3343    .1094376    .0735274       .002       1.02
-------------+--------------------------------------------------------
          UI |      3343    .5527969    .4972791          0          1
        RRUI |      3343    .2478687    .2380667          0      2.059
        DRUI |      3343    .0602776    .0754261          0       .824
     LOGWAGE |      3343    5.692994    .5356591    2.70805   7.600402
.
. ********* ANALYSIS: UNEMPLOYMENT DURATION **********
.
------------------------------------------------------------------------------
3343 total obs.
0 exclusions
0
0
28
. stdes
                                   |-------------- per subject --------------|
Category                   total        mean         min     median        max
------------------------------------------------------------------------------
no. of subjects             3343
no. of records              3343           1           1          1          1

(first) entry time                         0           0          0          0
(final) exit time                   6.247981           1          5         28

subjects with gap              0
time on gap if gap             0
time at risk               20887    6.247981           1          5         28

failures                    1073    .3209692           0          0          1
------------------------------------------------------------------------------
.
> */ female child ychild nonwhite age schlt12 schgt12 smsa bluecoll /*
> */ year85 year87 year89 midatl /*
> */ encen wncen southatl escen wscen mountain pacific
.
. * (1) EXPONENTIAL REGRESSION
.
. * Estimate exponential without heterogeneity
. streg $xlist, nolog nohr dist(exponential) robust
No. of subjects =         3343               Number of obs   =      3343
No. of failures =         1073
Time at risk    =        20887
                                             Wald chi2(40)   =    565.24
                                             Prob > chi2     =    0.0000
------------------------------------------------------------------------------
             |               Robust
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | .4720235 .6005534 0.79 0.432 -.7050396 1.649087
DR | -.5756396 .7624489 -0.75 0.450 -2.070012 .9187327
UI | -1.424561 .2493917 -5.71 0.000 -1.91336 -.9357622
RRUI | .9655904 .6118408 1.58 0.115 -.2335956 2.164776
DRUI | -.1990635 1.019118 -0.20 0.845 -2.196498 1.798371
LOGWAGE | .3508005 .115598 3.03 0.002 .1242327 .5773684
tenure | -.0001462 .0064637 -0.02 0.982 -.0128147 .0125224
slack | -.2593666 .0759363 -3.42 0.001 -.4081991 -.1105342
abolpos | -.1550897 .0953306 -1.63 0.104 -.3419342 .0317549
explose | .198458 .0648354 3.06 0.002 .071383 .3255331
stateur | -.064626 .0229903 -2.81 0.005 -.1096862 -.0195659
houshead | .3812208 .0836602 4.56 0.000 .2172499 .5451918
married | .369552 .0786145 4.70 0.000 .2154705 .5236335
female | .1164067 .0852986 1.36 0.172 -.0507754 .2835888
child | -.0333008 .0794577 -0.42 0.675 -.1890352 .1224335
ychild | -.1449722 .1022781 -1.42 0.156 -.3454336 .0554892
nonwhite | -.6692066 .1188272 -5.63 0.000 -.9021037 -.4363095
age | -.0220821 .0039256 -5.63 0.000 -.0297762 -.0143879
schlt12 | -.1231414 .0966102 -1.27 0.202 -.3124939 .066211
schgt12 | .1114395 .082945 1.34 0.179 -.0511297 .2740087
smsa | .1922291 .0799904 2.40 0.016 .0354508 .3490075
bluecoll | -.2033718 .085129 -2.39 0.017 -.3702215 -.036522
mining | -.1205818 .1973575 -0.61 0.541 -.5073955 .2662319
constr | -.04475 .1081519 -0.41 0.679 -.2567237 .1672238
transp | -.1786694 .156034 -1.15 0.252 -.4844906 .1271517
trade | -.0345159 .1019152 -0.34 0.735 -.234266 .1652341
fire | .1120549 .1386716 0.81 0.419 -.1597365 .3838462
services | .1840002 .0983911 1.87 0.061 -.0088428 .3768432
pubadmin | .1090606 .2954211 0.37 0.712 -.4699541 .6880752
year85 | .2147661 .0888664 2.42 0.016 .0405911 .388941
year87 | .3541162 .0948499 3.73 0.000 .1682139 .5400186
year89 | .467082 .1104355 4.23 0.000 .2506325 .6835316
midatl | .0264112 .1465647 0.18 0.857 -.2608503 .3136727
encen | .0043916 .1502813 0.03 0.977 -.2901544 .2989375
wncen | .1724311 .1607689 1.07 0.283 -.1426703 .4875324
southatl | .2638807 .1183726 2.23 0.026 .0318747 .4958867
escen | .35414 .19317 1.83 0.067 -.0244664 .7327463
wscen | .3385896 .1433308 2.36 0.018 .0576664 .6195128
mountain | .0063693 .1538821 0.04 0.967 -.2952341 .3079727
pacific | .0770202 .2393505 0.32 0.748 -.3920982 .5461385
_cons | -4.079107 .8767097 -4.65 0.000 -5.797426 -2.360788
------------------------------------------------------------------------------
. estimates store bexp
.
. * Figure 18.2 (p.633) - Generalized (Cox-Snell) Residuals for Exponential
. predict resid, csnell
. stset resid, fail(censor1)
failure event: censor1 != 0 & censor1 < .

obs. time interval: (0, resid]
------------------------------------------------------------------------------
3343 total obs.
0 exclusions
0
0
. sts generate survivor=s
. generate cumhaz = -ln(survivor)
. sort resid
. graph twoway (scatter cumhaz resid, c(J) msymbol(i) msize(small) clstyle(p1)) /*
> */ (scatter resid resid, c(l) msymbol(i) msize(small) clstyle(p2)), /*
> */ title("Exponential Model Residuals") /*
> */ xtitle("Generalized (Cox-Snell) Residual", size(medlarge)) xscale(titlegap(*5)) /*
> */ legend( label(1 "Cumulative Hazard") label(2 "45 degree line"))
. graph export exp.wmf, replace
(file c:\Imbook\bwebpage\Section4\exp.wmf written in Windows Metafile format)
. drop resid survivor cumhaz
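The residual check above rests on the fact that, if the fitted model is correct, the Cox-Snell residuals behave like a censored unit-exponential sample, so the estimated cumulative hazard of the residuals, -ln S(r), should lie close to the 45 degree line. A minimal Python sketch of the same steps follows. The residuals and failure indicators are made-up illustrative values, and the `kaplan_meier` helper is hypothetical, standing in for `sts generate survivor=s`.

```python
import math

def kaplan_meier(times, events):
    """Kaplan-Meier survivor estimate at each distinct failure time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    out = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = leaving = 0
        # Group ties: everyone sharing this time leaves the risk set together.
        while i < len(data) and data[i][0] == t:
            failures += data[i][1]
            leaving += 1
            i += 1
        if failures:
            surv *= 1.0 - failures / at_risk
            out.append((t, surv))
        at_risk -= leaving
    return out

# Hypothetical Cox-Snell residuals and failure indicators (1 = failure, 0 = censored)
resid = [0.2, 0.5, 0.5, 0.9, 1.4, 2.0]
fail = [1, 1, 0, 1, 0, 1]

for r, s in kaplan_meier(resid, fail):
    if s > 0:  # -ln(0) is undefined once the survivor estimate reaches zero
        print(f"resid = {r:.2f}  cumhaz = {-math.log(s):.3f}")
```

Plotting the printed cumulative hazard against the residual, together with the line y = x, gives the same kind of diagnostic as the `graph twoway` command in the log.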
.
. * Estimate exponential with gamma heterogeneity
. stset spell, fail(censor1)
------------------------------------------------------------------------------
3343 total obs.
0 exclusions
0
0
28
. streg $xlist, nolog nohr dist(exponential) frailty(gamma) robust

failure _d: censor1
Gamma frailty
No. of subjects =         3343               Number of obs   =      3343
No. of failures =         1073
Time at risk    =        20887
                                             Wald chi2(40)   =    576.86
                                             Prob > chi2     =    0.0000
------------------------------------------------------------------------------
             |               Robust
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | .5005828 .6187508 0.81 0.419 -.7121465 1.713312
DR | -.8824469 .7894395 -1.12 0.264 -2.42972 .664826
UI | -1.584537 .2622252 -6.04 0.000 -2.098489 -1.070586
RRUI | 1.091168 .6327026 1.72 0.085 -.1489067 2.331242
DRUI | .0574048 1.047123 0.05 0.956 -1.994919 2.109729
LOGWAGE | .3792805 .1191278 3.18 0.001 .1457944 .6127666
tenure | .0007938 .0065903 0.12 0.904 -.012123 .0137106
slack | -.2862928 .0770348 -3.72 0.000 -.4372782 -.1353074
abolpos | -.1842749 .0977213 -1.89 0.059 -.3758051 .0072552
explose | .2151452 .0663117 3.24 0.001 .0851767 .3451137
stateur | -.0650451 .023552 -2.76 0.006 -.1112061 -.0188841
houshead | .3960399 .0847153 4.67 0.000 .2300009 .5620789
married | .3961194 .0806744 4.91 0.000 .2380005 .5542384
female | .1102564 .0869256 1.27 0.205 -.0601147 .2806275
child | -.0464355 .0815869 -0.57 0.569 -.206343 .113472
ychild | -.1213622 .103309 -1.17 0.240 -.3238441 .0811196
nonwhite | -.6909793 .1217489 -5.68 0.000 -.9296027 -.4523559
age | -.0225342 .0040184 -5.61 0.000 -.0304101 -.0146582
schlt12 | -.1513782 .0968026 -1.56 0.118 -.3411079 .0383515
schgt12 | .1011742 .0834622 1.21 0.225 -.0624088 .2647572
smsa | .212363 .081774 2.60 0.009 .052089 .372637
bluecoll | -.220439 .0862751 -2.56 0.011 -.3895351 -.0513429
mining | -.1721823 .2051663 -0.84 0.401 -.5743008 .2299362
constr | -.0897602 .11034 -0.81 0.416 -.3060225 .1265022
transp | -.1572488 .1563607 -1.01 0.315 -.4637102 .1492126
trade | -.0451107 .1034986 -0.44 0.663 -.2479642 .1577428
fire | .0881685 .1386688 0.64 0.525 -.1836175 .3599544
services | .1682835 .1005405 1.67 0.094 -.0287723 .3653393
pubadmin | .0961407 .3092103 0.31 0.756 -.5099004 .7021817
year85 | .1940199 .0906564 2.14 0.032 .0163366 .3717031
year87 | .3564373 .0959014 3.72 0.000 .1684741 .5444005
year89 | .4924007 .1101907 4.47 0.000 .2764308 .7083705

midatl | .0156736 .1488094 0.11 0.916 -.2759874 .3073347
encen | .0089345 .1538505 0.06 0.954 -.2926069 .3104759
wncen | .1742124 .1634726 1.07 0.287 -.1461881 .4946129
southatl | .2676635 .1192515 2.24 0.025 .0339348 .5013922
escen | .3741169 .199389 1.88 0.061 -.0166783 .7649121
wscen | .361461 .1423856 2.54 0.011 .0823903 .6405316
mountain | -.00019 .1557385 -0.00 0.999 -.3054318 .3050519
pacific | .0800478 .2463547 0.32 0.745 -.4027986 .5628941
_cons | -4.095067 .9086039 -4.51 0.000 -5.875898 -2.314236
-------------+----------------------------------------------------------------
/ln_the | -1.462995 .31608 -4.63 0.000 -2.0825 -.8434894
-------------+----------------------------------------------------------------
theta | .2315418 .0731857 .1246183 .4302067
------------------------------------------------------------------------------
. estimates store bexpgamma
.
. * Figure 18.3 (p.633) - Generalized (Cox-Snell) Residuals for Exponential-Gamma
(option unconditional assumed)
------------------------------------------------------------------------------
3343 total obs.
0 exclusions
0
0
. sort resid
> */ title("Exponential-Gamma Model Residuals") /*

. graph export exp_gamma.wmf, replace
(file c:\Imbook\bwebpage\Section4\exp_gamma.wmf written in Windows Metafile format)
.
. /*
> * Following did not work, even with starting values provided
> * Results in book obtained on different computer with different Stata version
> * Estimate exponential with IG heterogeneity
> stset spell, fail(censor1=1)
> quietly streg $xlist, nolog nohr dist(exponential) robust
> matrix theta = 1.6
> matrix bstart = e(b),theta
> streg $xlist, nohr dist(exponential) frailty(invgauss) robust from(bstart)
> * estimates store bexpIG
> */
.
. * Table 18.1 (p.634) - Display Parameter Estimates
. * Note that exponential-IG missing
. estimates table bexp bexpgamma, t(%9.3f) stats(N ll) b(%9.3f) /*
--------------------------------------
    Variable |     bexp    bexpgamma
-------------+------------------------
          RR |     0.472       0.501
             |     0.786       0.809
          DR |    -0.576      -0.882
             |    -0.755      -1.118
          UI |    -1.425      -1.585
             |    -5.712      -6.043
        RRUI |     0.966       1.091
             |     1.578       1.725
        DRUI |    -0.199       0.057
             |    -0.195       0.055
     LOGWAGE |     0.351       0.379
             |     3.035       3.184
       _cons |    -4.079      -4.095
             |    -4.653      -4.507
-------------+------------------------
           N |  3343.000    3343.000
          ll | -2700.690   -2695.352
--------------------------------------
                          legend: b/t
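The two log-likelihoods in the table above can be compared directly. A minimal Python sketch, assuming a conventional likelihood-ratio comparison with one restriction (theta = 0). Because both models were fit with the robust option, the reported values are pseudo-likelihoods, and theta = 0 lies on the boundary of the parameter space, so the p-values are only a rough guide. The halving for the boundary case is an assumption of this sketch, not something the log reports.

```python
import math

ll_exp = -2700.690        # bexp, from the table above
ll_exp_gamma = -2695.352  # bexpgamma, from the table above

lr = 2.0 * (ll_exp_gamma - ll_exp)
# Upper tail of chi-squared(1): P(X > x) = erfc(sqrt(x / 2))
p_chi2_1 = math.erfc(math.sqrt(lr / 2.0))
# Halve for the one-sided boundary case (assumption, see the note above)
p_boundary = 0.5 * p_chi2_1
print(f"LR = {lr:.3f}, p = {p_chi2_1:.4f}, boundary-adjusted p = {p_boundary:.4f}")
```

A small p-value here would point in the same direction as the estimated theta, whose confidence interval in the exponential-gamma output excludes zero.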
.
. * (2) WEIBULL REGRESSION
.
. * Estimate Weibull without heterogeneity
------------------------------------------------------------------------------
3343 total obs.
0 exclusions
0
0
28
. streg $xlist, nolog nohr dist(weibull) robust
No. of subjects =         3343               Number of obs   =      3343
No. of failures =         1073
Time at risk    =        20887
                                             Wald chi2(40)   =    501.65
                                             Prob > chi2     =    0.0000
------------------------------------------------------------------------------
             |               Robust
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | .4481156 .6381895 0.70 0.483 -.8027127 1.698944
DR | -.4269187 .8086983 -0.53 0.598 -2.011938 1.158101
UI | -1.496066 .2639679 -5.67 0.000 -2.013434 -.9786984
RRUI | 1.015226 .6455611 1.57 0.116 -.2500501 2.280503
DRUI | -.2988417 1.065384 -0.28 0.779 -2.386956 1.789272
LOGWAGE | .3655253 .12212 2.99 0.003 .1261745 .6048761
tenure | -.0011127 .0068716 -0.16 0.871 -.0145809 .0123554
slack | -.2652154 .0803214 -3.30 0.001 -.4226424 -.1077883
abolpos | -.1604227 .1012942 -1.58 0.113 -.3589557 .0381103
explose | .2075085 .0684715 3.03 0.002 .0733068 .3417103
stateur | -.0708745 .0242117 -2.93 0.003 -.1183286 -.0234204
houshead | .3976626 .0887192 4.48 0.000 .2237762 .571549
married | .3786057 .0830317 4.56 0.000 .2158665 .541345
female | .1260829 .0896987 1.41 0.160 -.0497233 .301889
child | -.0336778 .0839956 -0.40 0.688 -.1983061 .1309505
ychild | -.1613066 .108947 -1.48 0.139 -.3748389 .0522256
nonwhite | -.7025504 .12426 -5.65 0.000 -.9460956 -.4590052

age | -.0235823 .0041922 -5.63 0.000 -.0317989 -.0153658
schlt12 | -.1226759 .1022762 -1.20 0.230 -.3231335 .0777816
schgt12 | .1162848 .0880692 1.32 0.187 -.0563278 .2888973
smsa | .1999567 .0841129 2.38 0.017 .0350985 .3648149
bluecoll | -.1994925 .0899354 -2.22 0.027 -.3757626 -.0232223
mining | -.1015676 .2036644 -0.50 0.618 -.5007425 .2976073
constr | -.0253737 .1135609 -0.22 0.823 -.247949 .1972016
transp | -.1981522 .1672141 -1.19 0.236 -.5258858 .1295814
trade | -.0311361 .1079502 -0.29 0.773 -.2427146 .1804423
fire | .1262153 .1492527 0.85 0.398 -.1663145 .4187452
services | .2031673 .1038945 1.96 0.051 -.0004622 .4067968
pubadmin | .1117728 .3087374 0.36 0.717 -.4933415 .716887
year85 | .2374972 .093387 2.54 0.011 .054462 .4205325
year87 | .3787397 .1011782 3.74 0.000 .1804341 .5770454
year89 | .4920278 .1180472 4.17 0.000 .2606596 .7233959
midatl | .02465 .1542139 0.16 0.873 -.2776037 .3269036
encen | -.0014111 .1579065 -0.01 0.993 -.3109023 .30808
wncen | .1844363 .1694444 1.09 0.276 -.1476687 .5165413
southatl | .2740974 .1250481 2.19 0.028 .0290076 .5191872
escen | .367742 .2024771 1.82 0.069 -.0291058 .7645899
wscen | .3440005 .1527804 2.25 0.024 .0445563 .6434446
mountain | .0159627 .1620188 0.10 0.922 -.3015883 .3335136
pacific | .0849532 .2504077 0.34 0.734 -.4058368 .5757432
_cons | -4.357886 .9196792 -4.74 0.000 -6.160424 -2.555347
-------------+----------------------------------------------------------------
/ln_p | .1215314 .0194374 6.25 0.000 .0834348 .1596281
-------------+----------------------------------------------------------------
p | 1.129225 .0219492 1.087014 1.173075
1/p | .8855632 .0172131 .8524608 .9199511
------------------------------------------------------------------------------
. estimates store bweib
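Stata estimates the Weibull shape parameter on the log scale (/ln_p) and then reports p and 1/p. A minimal Python sketch of that back-transformation, assuming the delta method for the standard error and exponentiated endpoints for the interval. The variable names are illustrative and are not taken from the original program.

```python
import math

ln_p, se_ln_p = 0.1215314, 0.0194374   # /ln_p row of the Weibull output above
ci_ln_p = (0.0834348, 0.1596281)       # its 95% confidence interval

p = math.exp(ln_p)
se_p = p * se_ln_p                      # delta method: se(p) = p * se(ln p)
ci_p = tuple(math.exp(v) for v in ci_ln_p)

print(f"p = {p:.6f}, se = {se_p:.7f}, 95% CI = ({ci_p[0]:.6f}, {ci_p[1]:.6f})")
print(f"1/p = {1.0 / p:.7f}")
# p > 1 corresponds to a hazard that rises with elapsed duration.
```

Under these assumptions the results reproduce the p and 1/p rows of the output to the printed precision.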
.
. * Figure 18.4 (p.635) - Generalized (Cox-Snell) Residuals for Weibull
------------------------------------------------------------------------------
3343 total obs.
0 exclusions
0
0
. sort resid
> */ title("Weibull Model Residuals") /*
. graph export Weibul16.wmf, replace
(file c:\Imbook\bwebpage\Section4\Weibul16.wmf written in Windows Metafile format)
.
. * Estimate Weibull with inverse-Gaussian (IG) heterogeneity
------------------------------------------------------------------------------
3343 total obs.
0 exclusions
0
0
28
. streg $xlist, nolog nohr dist(weibull) frailty(invgauss) robust
Inverse-Gaussian frailty
No. of subjects =         3343               Number of obs   =      3343
No. of failures =         1073
Time at risk    =        20887
                                             Wald chi2(40)   =    643.00
                                             Prob > chi2     =    0.0000
------------------------------------------------------------------------------
             |               Robust
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | .7356277 .9058181 0.81 0.417 -1.039743 2.510998
DR | -1.072566 1.149098 -0.93 0.351 -3.324758 1.179625
UI | -2.574752 .3843798 -6.70 0.000 -3.328123 -1.821381
RRUI | 1.733571 .9333928 1.86 0.063 -.0958458 3.562987
DRUI | -.060621 1.537813 -0.04 0.969 -3.07468 2.953438
LOGWAGE | .575656 .1766599 3.26 0.001 .2294089 .9219031
tenure | -.0009848 .0097472 -0.10 0.920 -.0200889 .0181194
slack | -.4416007 .1142976 -3.86 0.000 -.6656199 -.2175814
abolpos | -.2873066 .1465357 -1.96 0.050 -.5745113 -.0001019
explose | .3641943 .0976897 3.73 0.000 .1727259 .5556627
stateur | -.0981133 .0346763 -2.83 0.005 -.1660775 -.030149
houshead | .5924383 .1256739 4.71 0.000 .3461219 .8387546
married | .6083214 .1183487 5.14 0.000 .3763624 .8402805
female | .1788439 .1285074 1.39 0.164 -.0730259 .4307137
child | -.0914227 .121778 -0.75 0.453 -.3301031 .1472578
ychild | -.1805373 .1527477 -1.18 0.237 -.4799173 .1188426
nonwhite | -1.008517 .1725174 -5.85 0.000 -1.346645 -.6703894
age | -.0333776 .0059183 -5.64 0.000 -.0449772 -.0217779
schlt12 | -.2258621 .1439543 -1.57 0.117 -.5080075 .0562832
schgt12 | .1505129 .124469 1.21 0.227 -.0934418 .3944677
smsa | .3009952 .119907 2.51 0.012 .0659819 .5360086
bluecoll | -.3211857 .1253163 -2.56 0.010 -.5668012 -.0755702
mining | -.2319827 .3008491 -0.77 0.441 -.8216361 .3576708
constr | -.1260324 .1633669 -0.77 0.440 -.4462257 .1941609
transp | -.2763858 .225893 -1.22 0.221 -.7191279 .1663562
trade | -.0687616 .1518284 -0.45 0.651 -.3663399 .2288166
fire | .0668973 .2131814 0.31 0.754 -.3509306 .4847252
services | .231914 .1494712 1.55 0.121 -.0610441 .5248721
pubadmin | .0901949 .4579252 0.20 0.844 -.807322 .9877117
year85 | .2780139 .1339053 2.08 0.038 .0155644 .5404634
year87 | .5208783 .1415375 3.68 0.000 .2434699 .7982867
year89 | .7209598 .1655487 4.35 0.000 .3964903 1.045429
midatl | -.0192077 .2222646 -0.09 0.931 -.4548382 .4164228
encen | -.0297055 .2284931 -0.13 0.897 -.4775438 .4181328
wncen | .2460338 .24216 1.02 0.310 -.2285911 .7206586
southatl | .3563643 .1793284 1.99 0.047 .0048872 .7078415
escen | .5461543 .2910193 1.88 0.061 -.024233 1.116542
wscen | .4606814 .2140966 2.15 0.031 .0410598 .880303
mountain | .017581 .2293804 0.08 0.939 -.4319963 .4671584
pacific | .1379886 .3636985 0.38 0.704 -.5748475 .8508247
_cons | -5.303059 1.34133 -3.95 0.000 -7.932017 -2.6741
-------------+----------------------------------------------------------------
/ln_p | .5611667 .0225898 24.84 0.000 .5168915 .6054418
/ln_the | 1.852696 .0896755 20.66 0.000 1.676935 2.028457
-------------+----------------------------------------------------------------
p | 1.752716 .0395935 1.676807 1.832062
1/p | .570543 .0128884 .5458332 .5963715
theta | 6.376987 .5718595 5.349136 7.602343
------------------------------------------------------------------------------
. estimates store bweibIG
.
. * Figure 18.5 (p.636) - Generalized (Cox-Snell) Residuals for Weibull-IG
(option unconditional assumed)
------------------------------------------------------------------------------
3343 total obs.
0 exclusions
0
0
. sort resid
> */ title("Weibull-IG Model Residuals") /*
. graph export Weibul16_IG.wmf, replace
(file c:\Imbook\bwebpage\Section4\Weibul16_IG.wmf written in Windows Metafile format)
.
. * Table 18.2 (p.635) - Display Parameter Estimates

. estimates table bweibIG bweib, t(%9.3f) stats(N ll) b(%9.3f) /*
--------------------------------------
    Variable |   bweibIG      bweib
-------------+------------------------
          RR |     0.736       0.448
             |     0.812       0.702
          DR |    -1.073      -0.427
             |    -0.933      -0.528
          UI |    -2.575      -1.496
             |    -6.698      -5.668
        RRUI |     1.734       1.015
             |     1.857       1.573
        DRUI |    -0.061      -0.299
             |    -0.039      -0.281
     LOGWAGE |     0.576       0.366
             |     3.259       2.993
       _cons |    -5.303      -4.358
             |    -3.954      -4.738
-------------+------------------------
           N |  3343.000    3343.000
          ll | -2616.322   -2687.600
--------------------------------------
                          legend: b/t
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section4\mma18p1heterogeneity.txt
log type: text
closed on: 19 May 2005, 17:58:38
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section4\mma19p1comprisks.txt
log type: text
opened on: 19 May 2005, 17:52:44
.
. ********** OVERVIEW OF MMA19P1COMPRISKS.DO **********
.
. * STATA Program
.
. * Competing Risks Example with a censoring mechanism for each of the three risks
. * (1A) Table 19.2 p.659 Exponential
. * (1B) Table 19.2 p.659 Exponential with IG frailty
. * (2A) Table 19.3 p.659 Weibull
. * (2B) Table 19.3 p.659 Weibull with IG frailty
. * (2C) Table 19.3 p.660 Cox model
. * (2D) Graph the resulting Cox baseline survival and cumulative hazards
. *     Figure 19.1: (combined_bsf.wmf) baseline survival functions
. *     Figure 19.2: (combined_cbh.wmf) baseline cumulative hazards
.
. * ema1996.dta
.
. * NOTE: The IG Heterogeneity estimation was unsuccessful for exponential
. *       but successful for Weibull
.
. ********** SETUP **********
.
. set more off
. version 8
. set matsize 80   /* Needed for this program */
.
.
.*
.
. * of 1986, 1988, 1990 and 1992 on 33 variables including
. * spell = length of spell in number of two-week intervals

. * CENSOR2 = 1 if re-employed at part-time job
. * CENSOR3 = 1 if re-employed but left job: pt-ft status unknown
. * CENSOR4 = 1 if still jobless
.
. * See program mma17p4duration.do for further description of the data set
.
. ********** READ DATA and CREATE ADDITIONAL VARIABLES **********
.
. use ema1996.dta
.
. gen RR = reprate
. gen DR = disrate
. gen UI = ui
. gen RRUI = RR*UI
. gen DRUI = DR*UI
. sum
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       spell |      3343    6.247981    5.611271          1         28
     censor1 |      3343    .3209692    .4669188          0          1
     censor2 |      3343    .1014059    .3019106          0          1
     censor3 |      3343    .1717021    .3771777          0          1
     censor4 |      3343    .3754113    .4843014          0          1
-------------+--------------------------------------------------------
          ui |      3343    .5527969    .4972791          0          1
     reprate |      3343    .4544717    .1137918       .066      2.059
     logwage |      3343    5.692994    .5356591    2.70805   7.600402
      tenure |      3343    4.114867    5.862322          0         40
     disrate |      3343    .1094376    .0735274       .002       1.02
-------------+--------------------------------------------------------
       slack |      3343    .4884834    .4999421          0          1
     abolpos |      3343    .1456775    .3528354          0          1
     explose |      3343    .5025426    .5000683          0          1
     stateur |      3343      6.5516    1.803825        2.5         13
    houshead |      3343    .6120251    .4873617          0          1
-------------+--------------------------------------------------------
     married |      3343    .5860006    .4926221          0          1
      female |      3343    .3478911    .4763725          0          1
       child |      3343    .4501944    .4975876          0          1
      ychild |      3343    .1956327    .3967463          0          1
    nonwhite |      3343    .1390966    .3460991          0          1
-------------+--------------------------------------------------------
         age |      3343    35.44331     10.6402         20         61
     schlt12 |      3343    .2811846    .4496446          0          1
     schgt12 |      3343    .3356267    .4722797          0          1
        smsa |      3343    .7241998    .4469835          0          1
    bluecoll |      3343    .6036494     .489212          0          1
-------------+--------------------------------------------------------
      mining |      3343     .029315    .1687132          0          1
      constr |      3343    .1480706    .3552231          0          1
      transp |      3343    .0646126    .2458778          0          1
       trade |      3343    .1848639    .3882452          0          1
        fire |      3343    .0514508    .2209484          0          1
-------------+--------------------------------------------------------
    services |      3343    .1699073    .3756075          0          1
    pubadmin |      3343    .0095722     .097383          0          1
      year85 |      3343    .2677236     .442839          0          1
      year87 |      3343    .2174693    .4125862          0          1
      year89 |      3343    .1998205    .3999251          0          1
-------------+--------------------------------------------------------
      midatl |      3343    .1088842    .3115405          0          1
       encen |      3343    .1429853    .3501103          0          1
       wncen |      3343    .0643135    .2453472          0          1
    southatl |      3343    .2375112    .4256217          0          1
       escen |      3343    .0532456    .2245564          0          1
-------------+--------------------------------------------------------
       wscen |      3343    .1441819    .3513266          0          1
    mountain |      3343    .1079868    .3104102          0          1
     pacific |      3343    .0260245     .159232          0          1
          RR |      3343    .4544717    .1137918       .066      2.059
          DR |      3343    .1094376    .0735274       .002       1.02
-------------+--------------------------------------------------------
          UI |      3343    .5527969    .4972791          0          1
        RRUI |      3343    .2478687    .2380667          0      2.059
        DRUI |      3343    .0602776    .0754261          0       .824
     LOGWAGE |      3343    5.692994    .5356591    2.70805   7.600402
.
. ********* COMPETING RISKS FOR UNEMPLOYMENT DURATION **********
.
. * Stata analysis requires using stset to define the dependent variable
.
. * For the competing risks model there are three censoring variables
. * CENSOR2 = 1 if re-employed at part-time job
. * CENSOR3 = 1 if re-employed but left job: pt-ft status unknown
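In this setup each exit type is analysed in turn with its own failure indicator, so spells that do not end in that exit type enter the fit as censored. A minimal Python sketch, using only the sample size and indicator means from the `sum` output above, recovers the failure counts for the first two risks. These match the "No. of failures" lines reported in this log (1073 and 339). The dictionary layout is illustrative and is not part of the original program.

```python
n_obs = 3343
# Means of the 0/1 exit indicators, copied from the summary statistics
indicator_means = {"censor1": 0.3209692, "censor2": 0.1014059}

# The mean of a 0/1 indicator times the sample size gives the number of exits
failures = {name: round(n_obs * mean) for name, mean in indicator_means.items()}

for name, count in failures.items():
    print(f"{name}: {count} failures, {n_obs - count} spells treated as censored")
```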
.
> */ female child ychild nonwhite age schlt12 schgt12 smsa bluecoll /*
> */ mining constr transp trade fire services pubadmin /*
> */ year85 year87 year89 midatl /*
> */ encen wncen southatl escen wscen mountain pacific
.
. *** (1A) EXPONENTIAL WITH NO HETEROGENEITY Table 19.2
.
------------------------------------------------------------------------------
3343 total obs.
0 exclusions
0
0
28
. streg $xlist, nolog nohr robust dist(exponential)
No. of subjects =         3343               Number of obs   =      3343
No. of failures =         1073
Time at risk    =        20887
                                             Wald chi2(40)   =    565.24
                                             Prob > chi2     =    0.0000
------------------------------------------------------------------------------
             |               Robust
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | .4720235 .6005534 0.79 0.432 -.7050396 1.649087
DR | -.5756396 .7624489 -0.75 0.450 -2.070012 .9187327
UI | -1.424561 .2493917 -5.71 0.000 -1.91336 -.9357622
RRUI | .9655904 .6118408 1.58 0.115 -.2335956 2.164776
DRUI | -.1990635 1.019118 -0.20 0.845 -2.196498 1.798371
LOGWAGE | .3508005 .115598 3.03 0.002 .1242327 .5773684
tenure | -.0001462 .0064637 -0.02 0.982 -.0128147 .0125224
slack | -.2593666 .0759363 -3.42 0.001 -.4081991 -.1105342
abolpos | -.1550897 .0953306 -1.63 0.104 -.3419342 .0317549
explose | .198458 .0648354 3.06 0.002 .071383 .3255331
stateur | -.064626 .0229903 -2.81 0.005 -.1096862 -.0195659

houshead | .3812208 .0836602 4.56 0.000 .2172499 .5451918
married | .369552 .0786145 4.70 0.000 .2154705 .5236335
female | .1164067 .0852986 1.36 0.172 -.0507754 .2835888
child | -.0333008 .0794577 -0.42 0.675 -.1890352 .1224335
ychild | -.1449722 .1022781 -1.42 0.156 -.3454336 .0554892
nonwhite | -.6692066 .1188272 -5.63 0.000 -.9021037 -.4363095
age | -.0220821 .0039256 -5.63 0.000 -.0297762 -.0143879
schlt12 | -.1231414 .0966102 -1.27 0.202 -.3124939 .066211
schgt12 | .1114395 .082945 1.34 0.179 -.0511297 .2740087
smsa | .1922291 .0799904 2.40 0.016 .0354508 .3490075
bluecoll | -.2033718 .085129 -2.39 0.017 -.3702215 -.036522
mining | -.1205818 .1973575 -0.61 0.541 -.5073955 .2662319
constr | -.04475 .1081519 -0.41 0.679 -.2567237 .1672238
transp | -.1786694 .156034 -1.15 0.252 -.4844906 .1271517
trade | -.0345159 .1019152 -0.34 0.735 -.234266 .1652341
fire | .1120549 .1386716 0.81 0.419 -.1597365 .3838462
services | .1840002 .0983911 1.87 0.061 -.0088428 .3768432
pubadmin | .1090606 .2954211 0.37 0.712 -.4699541 .6880752
year85 | .2147661 .0888664 2.42 0.016 .0405911 .388941
year87 | .3541162 .0948499 3.73 0.000 .1682139 .5400186
year89 | .467082 .1104355 4.23 0.000 .2506325 .6835316
midatl | .0264112 .1465647 0.18 0.857 -.2608503 .3136727
encen | .0043916 .1502813 0.03 0.977 -.2901544 .2989375
wncen | .1724311 .1607689 1.07 0.283 -.1426703 .4875324
southatl | .2638807 .1183726 2.23 0.026 .0318747 .4958867
escen | .35414 .19317 1.83 0.067 -.0244664 .7327463
wscen | .3385896 .1433308 2.36 0.018 .0576664 .6195128
mountain | .0063693 .1538821 0.04 0.967 -.2952341 .3079727
pacific | .0770202 .2393505 0.32 0.748 -.3920982 .5461385
_cons | -4.079107 .8767097 -4.65 0.000 -5.797426 -2.360788
------------------------------------------------------------------------------
. estimates store bexpr1
.
------------------------------------------------------------------------------
3343 total obs.
0 exclusions
at risk from t = 0
earliest observed entry t = 0
last observed exit t = 28

No. of subjects = 3343                             Number of obs = 3343
No. of failures = 339
Time at risk    = 20887
                                                   Wald chi2(40) = 227.08
                                                   Prob > chi2   = 0.0000
------------------------------------------------------------------------------
| Robust
_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | -.0928628 .9761428 -0.10 0.924 -2.006068 1.820342
DR | -.9600127 1.246692 -0.77 0.441 -3.403483 1.483458
UI | -1.047747 .5236826 -2.00 0.045 -2.074146 -.021348
RRUI | -.6698307 1.191869 -0.56 0.574 -3.005851 1.666189
DRUI | 1.987208 1.726509 1.15 0.250 -1.396688 5.371105
LOGWAGE | -.2577715 .1793075 -1.44 0.151 -.6092077 .0936646
tenure | .0053684 .0125538 0.43 0.669 -.0192366 .0299734
slack | -.2636908 .1311029 -2.01 0.044 -.5206477 -.0067339
abolpos | -.5626836 .202701 -2.78 0.006 -.9599703 -.1653969
explose | .0490271 .1130116 0.43 0.664 -.1724715 .2705258
stateur | -.1032439 .0406788 -2.54 0.011 -.182973 -.0235148
houshead | -.073544 .1343412 -0.55 0.584 -.3368479 .18976
married | -.0618813 .1339552 -0.46 0.644 -.3244287 .2006661
female | .4531912 .1384047 3.27 0.001 .181923 .7244594
child | -.2164986 .1452571 -1.49 0.136 -.5011973 .0682002
ychild | .149031 .1815684 0.82 0.412 -.2068365 .5048986
nonwhite | -.4563527 .1820135 -2.51 0.012 -.8130927 -.0996127
age | -.001781 .0064207 -0.28 0.781 -.0143653 .0108033
schlt12 | -.1803101 .1661528 -1.09 0.278 -.5059636 .1453433
schgt12 | -.0534463 .1462829 -0.37 0.715 -.3401555 .2332629
smsa | .1295376 .1384588 0.94 0.349 -.1418367 .400912
bluecoll | .0088207 .1510547 0.06 0.953 -.2872411 .3048825
mining | -.0141252 .4078632 -0.03 0.972 -.8135225 .785272
constr | .1867498 .1896106 0.98 0.325 -.1848802 .5583799
transp | -.402533 .2898061 -1.39 0.165 -.9705426 .1654766
trade | .1106678 .1735195 0.64 0.524 -.2294241 .4507598
fire | -.3396026 .3006096 -1.13 0.259 -.9287865 .2495813
services | .1619867 .1705571 0.95 0.342 -.172299 .4962724
pubadmin | .7445446 .5413463 1.38 0.169 -.3164746 1.805564
year85 | -.0548375 .149323 -0.37 0.713 -.3475052 .2378301
year87 | -.12113 .1616797 -0.75 0.454 -.4380164 .1957563
year89 | .1244437 .1950397 0.64 0.523 -.257827 .5067144
midatl | -.3969537 .2577568 -1.54 0.124 -.9021477 .1082403
encen | -.5115788 .2576815 -1.99 0.047 -1.016625 -.0065323
wncen | -.0674875 .257402 -0.26 0.793 -.5719862 .4370113
southatl | -.2719375 .1944647 -1.40 0.162 -.6530813 .1092062
escen | .065407 .3099463 0.21 0.833 -.5420766 .6728905
wscen | -.0941963 .2338712 -0.40 0.687 -.5525754 .3641827
mountain | .2287682 .2264905 1.01 0.312 -.215145 .6726814
pacific | -.2060074 .3970221 -0.52 0.604 -.9841563 .5721415
_cons | -.8636363 1.325425 -0.65 0.515 -3.461421 1.734148
.
------------------------------------------------------------------------------
3343 total obs.
0 exclusions
at risk from t = 0
earliest observed entry t = 0
last observed exit t = 28

No. of subjects = 3343                             Number of obs = 3343
No. of failures = 574
Time at risk    = 20887
                                                   Wald chi2(40) = 372.34
                                                   Prob > chi2   = 0.0000
------------------------------------------------------------------------------
| Robust
_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | -.6011551 .724665 -0.83 0.407 -2.021472 .8191621
DR | 1.121525 .9012528 1.24 0.213 -.6448975 2.887948
UI | -.9672682 .4486302 -2.16 0.031 -1.846567 -.0879691
RRUI | -.4326869 1.014413 -0.43 0.670 -2.4209 1.555526
DRUI | 2.102012 1.302564 1.61 0.107 -.450967 4.654991
LOGWAGE | .0029166 .1448149 0.02 0.984 -.2809153 .2867485
tenure | -.0479889 .0121403 -3.95 0.000 -.0717835 -.0241942
slack | -.4583215 .097709 -4.69 0.000 -.6498277 -.2668154
abolpos | -.2736409 .1396283 -1.96 0.050 -.5473073 .0000255
explose | .0246749 .0862551 0.29 0.775 -.144382 .1937319
stateur | -.1086692 .0319298 -3.40 0.001 -.1712504 -.046088
houshead | .5298135 .1054798 5.02 0.000 .3230769 .7365501
married | .0268657 .1062998 0.25 0.800 -.1814781 .2352095
female | .2590041 .109547 2.36 0.018 .0442959 .4737122
child | -.141802 .1114763 -1.27 0.203 -.3602915 .0766876
ychild | -.0885931 .136915 -0.65 0.518 -.3569416 .1797553
nonwhite | -.4668153 .143211 -3.26 0.001 -.7475036 -.186127
age | -.0247346 .0054431 -4.54 0.000 -.0354029 -.0140662
schlt12 | -.1034495 .1224893 -0.84 0.398 -.3435241 .1366251
schgt12 | .0952043 .1081669 0.88 0.379 -.1167988 .3072075
smsa | .0128711 .1021476 0.13 0.900 -.1873344 .2130767
bluecoll | .3098248 .1110841 2.79 0.005 .0921038 .5275457
mining | .2388579 .2604652 0.92 0.359 -.2716445 .7493603
constr | .0983356 .1419787 0.69 0.489 -.1799376 .3766088
transp | -.0783446 .1897853 -0.41 0.680 -.4503169 .2936278
trade | .1033278 .1292151 0.80 0.424 -.1499291 .3565847
fire | -.3607287 .2689374 -1.34 0.180 -.8878363 .166379
services | .0248212 .1323061 0.19 0.851 -.234494 .2841363
pubadmin | -1.770536 1.040329 -1.70 0.089 -3.809544 .2684714
year85 | .295673 .1143137 2.59 0.010 .0716222 .5197237
year87 | .4303606 .1198341 3.59 0.000 .1954901 .6652311
year89 | -.1373874 .1627204 -0.84 0.398 -.4563135 .1815386
midatl | -.5339921 .2188609 -2.44 0.015 -.9629516 -.1050326
encen | -.075022 .1998626 -0.38 0.707 -.4667454 .3167014
wncen | .1239805 .2095321 0.59 0.554 -.2866948 .5346559
southatl | .1522514 .1635982 0.93 0.352 -.1683951 .472898
escen | -.5123015 .3170723 -1.62 0.106 -1.133752 .1091488
wscen | .0198459 .1898764 0.10 0.917 -.3523051 .3919968
mountain | .1999108 .1869463 1.07 0.285 -.1664972 .5663188
pacific | .4481059 .2705097 1.66 0.098 -.0820833 .9782951
_cons | -1.620926 1.072666 -1.51 0.131 -3.723312 .4814595
.
. * Table 19.2 (page 658) first three columns
. estimates table bexpr1 bexpr2 bexpr3, b(%10.3f) se(%10.3f) stats(N ll) /*
> */ keep(RR DR UI RRUI DRUI LOGWAGE tenure)
----------------------------------------------------
    Variable |     bexpr1       bexpr2       bexpr3
-------------+--------------------------------------
          RR |      0.472       -0.093       -0.601
             |      0.601        0.976        0.725
          DR |     -0.576       -0.960        1.122
             |      0.762        1.247        0.901
          UI |     -1.425       -1.048       -0.967
             |      0.249        0.524        0.449
        RRUI |      0.966       -0.670       -0.433
             |      0.612        1.192        1.014
        DRUI |     -0.199        1.987        2.102
             |      1.019        1.727        1.303
     LOGWAGE |      0.351       -0.258        0.003
             |      0.116        0.179        0.145
      tenure |     -0.000        0.005       -0.048
             |      0.006        0.013        0.012
-------------+--------------------------------------
           N |   3343.000     3343.000     3343.000
          ll |  -2700.690    -1250.545    -1742.396
----------------------------------------------------
                                        legend: b/se
.
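Because nohr is specified, streg reports coefficients rather than hazard ratios. In the proportional hazards metric a coefficient b corresponds to a hazard ratio exp(b). The following Python sketch is not part of the original Stata program; the names `ui`, `hazard_ratio` and `hr` are illustrative. It converts the UI coefficients and robust standard errors of the three exponential models into hazard ratios with normal-based 95% intervals.

```python
import math

# UI coefficients and robust standard errors copied from the three
# exponential models in the log (bexpr1, bexpr2, bexpr3).
ui = {
    "bexpr1": (-1.424561, 0.2493917),
    "bexpr2": (-1.047747, 0.5236826),
    "bexpr3": (-0.9672682, 0.4486302),
}

def hazard_ratio(coef, se, z=1.959964):
    """Return exp(coef) with the exponentiated normal-based 95% interval."""
    return (math.exp(coef),
            math.exp(coef - z * se),
            math.exp(coef + z * se))

hr = {name: hazard_ratio(b, se) for name, (b, se) in ui.items()}
for name, (ratio, lower, upper) in hr.items():
    print(f"{name}: hazard ratio = {ratio:.3f}  [{lower:.3f}, {upper:.3f}]")
```

All three ratios are below one, so in each model the UI indicator is associated with a lower hazard of exit, holding the other regressors (including the RRUI and DRUI interactions) fixed.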
. *** (1B) EXPONENTIAL WITH IG HETEROGENEITY Table 19.2
.
. /* Did not work even though Weibull with IG heterogeneity did
>
> streg $xlist, nohr robust dist(exponential) frailty(invgauss)
> estimates store bexpigr1
>
> streg $xlist, nolog nohr robust dist(exponential) frailty(invgauss)
> estimates store bexpigr2
>
> streg $xlist, nolog nohr robust dist(exponential)
> estimates store bexpigr3
>
> * Table 19.2 (page 658) first three columns
> estimates table bexpigr1 bexpigr2 bexpigr3, b(%10.3f) se(%10.3f) stats(N ll) /*
>
> */
.
. *** (2A) WEIBULL WITH NO HETEROGENEITY Table 19.3
.
------------------------------------------------------------------------------
3343 total obs.
0 exclusions
at risk from t = 0
earliest observed entry t = 0
last observed exit t = 28

. streg $xlist, nolog nohr robust dist(weibull)

No. of subjects = 3343                             Number of obs = 3343
No. of failures = 1073
Time at risk    = 20887
                                                   Wald chi2(40) = 501.65
                                                   Prob > chi2   = 0.0000
------------------------------------------------------------------------------
| Robust
_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | .4481156 .6381895 0.70 0.483 -.8027127 1.698944
DR | -.4269187 .8086983 -0.53 0.598 -2.011938 1.158101
UI | -1.496066 .2639679 -5.67 0.000 -2.013434 -.9786984
RRUI | 1.015226 .6455611 1.57 0.116 -.2500501 2.280503
DRUI | -.2988417 1.065384 -0.28 0.779 -2.386956 1.789272
LOGWAGE | .3655253 .12212 2.99 0.003 .1261745 .6048761
tenure | -.0011127 .0068716 -0.16 0.871 -.0145809 .0123554
slack | -.2652154 .0803214 -3.30 0.001 -.4226424 -.1077883
abolpos | -.1604227 .1012942 -1.58 0.113 -.3589557 .0381103
explose | .2075085 .0684715 3.03 0.002 .0733068 .3417103
stateur | -.0708745 .0242117 -2.93 0.003 -.1183286 -.0234204
houshead | .3976626 .0887192 4.48 0.000 .2237762 .571549
married | .3786057 .0830317 4.56 0.000 .2158665 .541345
female | .1260829 .0896987 1.41 0.160 -.0497233 .301889
child | -.0336778 .0839956 -0.40 0.688 -.1983061 .1309505
ychild | -.1613066 .108947 -1.48 0.139 -.3748389 .0522256
nonwhite | -.7025504 .12426 -5.65 0.000 -.9460956 -.4590052
age | -.0235823 .0041922 -5.63 0.000 -.0317989 -.0153658
schlt12 | -.1226759 .1022762 -1.20 0.230 -.3231335 .0777816
schgt12 | .1162848 .0880692 1.32 0.187 -.0563278 .2888973
smsa | .1999567 .0841129 2.38 0.017 .0350985 .3648149
bluecoll | -.1994925 .0899354 -2.22 0.027 -.3757626 -.0232223
mining | -.1015676 .2036644 -0.50 0.618 -.5007425 .2976073
constr | -.0253737 .1135609 -0.22 0.823 -.247949 .1972016
transp | -.1981522 .1672141 -1.19 0.236 -.5258858 .1295814
trade | -.0311361 .1079502 -0.29 0.773 -.2427146 .1804423
fire | .1262153 .1492527 0.85 0.398 -.1663145 .4187452
services | .2031673 .1038945 1.96 0.051 -.0004622 .4067968
pubadmin | .1117728 .3087374 0.36 0.717 -.4933415 .716887
year85 | .2374972 .093387 2.54 0.011 .054462 .4205325
year87 | .3787397 .1011782 3.74 0.000 .1804341 .5770454
year89 | .4920278 .1180472 4.17 0.000 .2606596 .7233959
midatl | .02465 .1542139 0.16 0.873 -.2776037 .3269036
encen | -.0014111 .1579065 -0.01 0.993 -.3109023 .30808
wncen | .1844363 .1694444 1.09 0.276 -.1476687 .5165413
southatl | .2740974 .1250481 2.19 0.028 .0290076 .5191872
escen | .367742 .2024771 1.82 0.069 -.0291058 .7645899
wscen | .3440005 .1527804 2.25 0.024 .0445563 .6434446
mountain | .0159627 .1620188 0.10 0.922 -.3015883 .3335136
pacific | .0849532 .2504077 0.34 0.734 -.4058368 .5757432
_cons | -4.357886 .9196792 -4.74 0.000 -6.160424 -2.555347
-------------+----------------------------------------------------------------
/ln_p | .1215314 .0194374 6.25 0.000 .0834348 .1596281
-------------+----------------------------------------------------------------
p | 1.129225 .0219492 1.087014 1.173075
1/p | .8855632 .0172131 .8524608 .9199511
------------------------------------------------------------------------------
. estimates store bweibr1
.
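The rows for p and 1/p follow from the estimated /ln_p: the point estimates are exp(ln_p) and exp(-ln_p), the standard errors can be obtained by the delta method, and the interval for p is the exponentiated interval for ln_p. The Python sketch below is an illustration only, not part of the original program, and its variable names are assumptions. It reproduces these rows for the first Weibull model from the /ln_p estimate and its robust standard error.

```python
import math

# /ln_p estimate and robust standard error for the first Weibull model.
ln_p, se_ln_p = 0.1215314, 0.0194374
z = 1.959964  # standard normal 97.5% quantile

p = math.exp(ln_p)                       # Weibull shape parameter
se_p = p * se_ln_p                       # delta method: d exp(x)/dx = exp(x)
ci_p = (math.exp(ln_p - z * se_ln_p),    # transform the interval for ln_p
        math.exp(ln_p + z * se_ln_p))

inv_p = math.exp(-ln_p)                  # 1/p
se_inv_p = inv_p * se_ln_p               # delta method: |d exp(-x)/dx| = exp(-x)

print(f"p   = {p:.6f}  se = {se_p:.7f}  CI = [{ci_p[0]:.6f}, {ci_p[1]:.6f}]")
print(f"1/p = {inv_p:.7f}  se = {se_inv_p:.7f}")
```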
------------------------------------------------------------------------------
3343 total obs.
0 exclusions
at risk from t = 0
earliest observed entry t = 0
last observed exit t = 28

No. of subjects = 3343                             Number of obs = 3343
No. of failures = 339
Time at risk    = 20887
                                                   Wald chi2(40) = 222.95
                                                   Prob > chi2   = 0.0000
------------------------------------------------------------------------------
| Robust
_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | -.0855974 .9920715 -0.09 0.931 -2.030022 1.858827
DR | -.9387836 1.279111 -0.73 0.463 -3.445794 1.568227
UI | -1.110175 .5267037 -2.11 0.035 -2.142496 -.0778551
RRUI | -.6171912 1.203735 -0.51 0.608 -2.976469 1.742086
DRUI | 1.973269 1.756599 1.12 0.261 -1.469601 5.41614
LOGWAGE | -.2437885 .1833224 -1.33 0.184 -.6030938 .1155168
tenure | .0050643 .0127387 0.40 0.691 -.0199031 .0300317
slack | -.2689689 .133176 -2.02 0.043 -.529989 -.0079487
abolpos | -.5721689 .2059292 -2.78 0.005 -.9757826 -.1685551
explose | .0555267 .1147555 0.48 0.628 -.16939 .2804433
stateur | -.1087083 .0413647 -2.63 0.009 -.1897816 -.027635
houshead | -.0679894 .13661 -0.50 0.619 -.3357401 .1997613
married | -.060856 .1362403 -0.45 0.655 -.327882 .20617
female | .4583892 .1408831 3.25 0.001 .1822634 .734515
child | -.2228982 .147376 -1.51 0.130 -.5117499 .0659535
ychild | .1463598 .1844362 0.79 0.427 -.2151284 .507848
nonwhite | -.485664 .186033 -2.61 0.009 -.8502819 -.121046
age | -.0027009 .0065569 -0.41 0.680 -.0155521 .0101503
schlt12 | -.1837633 .1684487 -1.09 0.275 -.5139167 .1463901
schgt12 | -.0488958 .1485385 -0.33 0.742 -.340026 .2422343
smsa | .1380042 .1410747 0.98 0.328 -.1384971 .4145055
bluecoll | .0132584 .1537386 0.09 0.931 -.2880637 .3145805
mining | -.0138734 .4110202 -0.03 0.973 -.8194583 .7917115
constr | .1973771 .1920481 1.03 0.304 -.1790303 .5737845
transp | -.4116241 .2927848 -1.41 0.160 -.9854717 .1622234
trade | .1125741 .1765277 0.64 0.524 -.2334139 .4585621
fire | -.3378747 .3046641 -1.11 0.267 -.9350054 .2592561
services | .1700335 .1729565 0.98 0.326 -.1689551 .5090221
pubadmin | .7553679 .5487635 1.38 0.169 -.3201889 1.830925
year85 | -.0501695 .1515048 -0.33 0.741 -.3471135 .2467745
year87 | -.1116858 .1645254 -0.68 0.497 -.4341497 .2107781
year89 | .1344555 .1987084 0.68 0.499 -.2550059 .5239168
midatl | -.4039691 .2606153 -1.55 0.121 -.9147658 .1068276
encen | -.5105877 .2608364 -1.96 0.050 -1.021818 .0006423
wncen | -.0579723 .2607792 -0.22 0.824 -.5690902 .4531456
southatl | -.2682241 .1972983 -1.36 0.174 -.6549216 .1184733
escen | .079807 .3146812 0.25 0.800 -.5369568 .6965709
wscen | -.0854421 .2368638 -0.36 0.718 -.5496865 .3788024
mountain | .2441762 .2300886 1.06 0.289 -.2067892 .6951416
pacific | -.1999107 .4003467 -0.50 0.618 -.9845758 .5847544
_cons | -1.055211 1.353275 -0.78 0.436 -3.707582 1.597159
-------------+----------------------------------------------------------------
/ln_p | .0815649 .0308379 2.64 0.008 .0211236 .1420061
-------------+----------------------------------------------------------------
p | 1.084984 .0334587 1.021348 1.152584
1/p | .9216729 .0284225 .8676159 .9790979
.
------------------------------------------------------------------------------
3343 total obs.
0 exclusions
at risk from t = 0
earliest observed entry t = 0
last observed exit t = 28

No. of subjects = 3343                             Number of obs = 3343
No. of failures = 574
Time at risk    = 20887
                                                   Wald chi2(40) = 350.72
                                                   Prob > chi2   = 0.0000
------------------------------------------------------------------------------
| Robust
_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | -.6946399 .762754 -0.91 0.362 -2.18961 .8003305
DR | 1.361414 .9691375 1.40 0.160 -.5380611 3.260888
UI | -1.098453 .4595297 -2.39 0.017 -1.999115 -.1977918
RRUI | -.3055217 1.046769 -0.29 0.770 -2.357151 1.746107
DRUI | 1.990913 1.37004 1.45 0.146 -.6943156 4.676141
LOGWAGE | .0401096 .1526549 0.26 0.793 -.2590886 .3393078
tenure | -.0495153 .0126559 -3.91 0.000 -.0743204 -.0247103
slack | -.473113 .1025776 -4.61 0.000 -.6741614 -.2720647
abolpos | -.2910168 .1465355 -1.99 0.047 -.5782212 -.0038124
explose | .0315602 .0906338 0.35 0.728 -.1460787 .2091991
stateur | -.1199252 .0337488 -3.55 0.000 -.1860717 -.0537787
houshead | .5592843 .1107798 5.05 0.000 .3421598 .7764087
married | .032312 .1115613 0.29 0.772 -.1863442 .2509681
female | .2764899 .1147909 2.41 0.016 .0515039 .5014759
child | -.149619 .1167679 -1.28 0.200 -.3784799 .079242
ychild | -.1018703 .1436607 -0.71 0.478 -.3834401 .1796996
nonwhite | -.5164388 .1517355 -3.40 0.001 -.8138349 -.2190427
age | -.0275549 .0057648 -4.78 0.000 -.0388536 -.0162561
schlt12 | -.1115642 .1291366 -0.86 0.388 -.3646673 .1415389
schgt12 | .1015553 .1135108 0.89 0.371 -.1209217 .3240324
smsa | .0270168 .1078739 0.25 0.802 -.1844122 .2384459
bluecoll | .3229431 .1167884 2.77 0.006 .094042 .5518443
mining | .2437267 .2731206 0.89 0.372 -.2915799 .7790332
constr | .1307943 .1484399 0.88 0.378 -.1601425 .4217311
transp | -.1004424 .2004105 -0.50 0.616 -.4932397 .2923549
trade | .1181562 .136055 0.87 0.385 -.1485068 .3848192
fire | -.344603 .2792784 -1.23 0.217 -.8919787 .2027726
services | .0519644 .1386656 0.37 0.708 -.2198151 .3237438
pubadmin | -1.780582 1.049217 -1.70 0.090 -3.837009 .2758459
year85 | .311726 .1192592 2.61 0.009 .0779822 .5454698
year87 | .4514345 .126241 3.58 0.000 .2040067 .6988623
year89 | -.1180122 .1713414 -0.69 0.491 -.4538352 .2178108
midatl | -.5476552 .224463 -2.44 0.015 -.9875945 -.1077158
encen | -.084084 .20745 -0.41 0.685 -.4906786 .3225106
wncen | .1288938 .2191536 0.59 0.556 -.3006393 .5584268
southatl | .16223 .1702456 0.95 0.341 -.1714454 .4959053
escen | -.5110545 .3270884 -1.56 0.118 -1.152136 .130027
wscen | .0218047 .1978693 0.11 0.912 -.3660121 .4096214
mountain | .2045852 .1949939 1.05 0.294 -.1775957 .5867662
pacific | .4535074 .2840292 1.60 0.110 -.1031795 1.010194
_cons | -2.017592 1.123888 -1.80 0.073 -4.220372 .1851884
-------------+----------------------------------------------------------------
/ln_p | .163312 .0235045 6.95 0.000 .117244 .2093801
-------------+----------------------------------------------------------------
p | 1.177404 .0276744 1.124394 1.232914
1/p | .8493261 .019963 .8110869 .8893682
.
. estimates table bweibr1 bweibr2 bweibr3, b(%10.3f) se(%10.3f) stats(N ll) /*
----------------------------------------------------
    Variable |    bweibr1      bweibr2      bweibr3
-------------+--------------------------------------
          RR |      0.448       -0.086       -0.695
             |      0.638        0.992        0.763
          DR |     -0.427       -0.939        1.361
             |      0.809        1.279        0.969
          UI |     -1.496       -1.110       -1.098
             |      0.264        0.527        0.460
        RRUI |      1.015       -0.617       -0.306
             |      0.646        1.204        1.047
        DRUI |     -0.299        1.973        1.991
             |      1.065        1.757        1.370
     LOGWAGE |      0.366       -0.244        0.040
             |      0.122        0.183        0.153
      tenure |     -0.001        0.005       -0.050
             |      0.007        0.013        0.013
-------------+--------------------------------------
           N |   3343.000     3343.000     3343.000
          ll |  -2687.600    -1248.686    -1729.836
----------------------------------------------------
                                        legend: b/se
.
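In the Weibull proportional hazards model the baseline hazard is proportional to t^(p-1), so p > 1 implies a hazard that rises with elapsed duration and p = 1 (ln p = 0) gives the exponential model. The Python sketch below is illustrative only and not part of the original program; the function `hazard_growth` and the comparison of durations 1 and 10 (in the units of the duration variable) are assumptions chosen for the example. It recomputes the Wald z statistic for ln p = 0 and the implied growth of the hazard for the three models without heterogeneity.

```python
import math

# /ln_p estimates and robust standard errors from the three Weibull models
# without heterogeneity (bweibr1, bweibr2, bweibr3), copied from the log.
ln_p = {
    "bweibr1": (0.1215314, 0.0194374),
    "bweibr2": (0.0815649, 0.0308379),
    "bweibr3": (0.1633120, 0.0235045),
}

def hazard_growth(p, t_early, t_late):
    """Ratio h(t_late)/h(t_early) when the hazard is proportional to t**(p-1)."""
    return (t_late / t_early) ** (p - 1.0)

summary = {}
for name, (b, se) in ln_p.items():
    p = math.exp(b)
    summary[name] = {
        "p": p,
        "z_ln_p": b / se,  # Wald z for H0: ln p = 0 (exponential model)
        "growth_1_to_10": hazard_growth(p, 1.0, 10.0),
    }
    print(f"{name}: p = {p:.4f}, z = {b / se:.2f}, "
          f"h(10)/h(1) = {summary[name]['growth_1_to_10']:.3f}")
```

The z statistics agree with the /ln_p rows of the log, and all three shape estimates exceed one, which indicates positive duration dependence under this specification.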
. *** (2B) WEIBULL WITH IG HETEROGENEITY Table 19.3
.
------------------------------------------------------------------------------
3343 total obs.
0 exclusions
at risk from t = 0
earliest observed entry t = 0
last observed exit t = 28
. streg $xlist, nohr robust dist(weibull) frailty(invgauss)
Fitting weibull model:
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:
Iteration 6:
log pseudo-likelihood = -3134.2376 (not concave)

Fitting full model:

Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:


No. of subjects = 3343                             Number of obs = 3343
No. of failures = 1073
Time at risk    = 20887
                                                   Wald chi2(40) = 643.00
                                                   Prob > chi2   = 0.0000
------------------------------------------------------------------------------
| Robust
_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | .7356277 .9058181 0.81 0.417 -1.039743 2.510998
DR | -1.072566 1.149098 -0.93 0.351 -3.324758 1.179625
UI | -2.574752 .3843798 -6.70 0.000 -3.328123 -1.821381
RRUI | 1.733571 .9333928 1.86 0.063 -.0958458 3.562987
DRUI | -.060621 1.537813 -0.04 0.969 -3.07468 2.953438
LOGWAGE | .575656 .1766599 3.26 0.001 .2294089 .9219031
tenure | -.0009848 .0097472 -0.10 0.920 -.0200889 .0181194
slack | -.4416007 .1142976 -3.86 0.000 -.6656199 -.2175814
abolpos | -.2873066 .1465357 -1.96 0.050 -.5745113 -.0001019
explose | .3641943 .0976897 3.73 0.000 .1727259 .5556627
stateur | -.0981133 .0346763 -2.83 0.005 -.1660775 -.030149
houshead | .5924383 .1256739 4.71 0.000 .3461219 .8387546
married | .6083214 .1183487 5.14 0.000 .3763624 .8402805
female | .1788439 .1285074 1.39 0.164 -.0730259 .4307137
child | -.0914227 .121778 -0.75 0.453 -.3301031 .1472578
ychild | -.1805373 .1527477 -1.18 0.237 -.4799173 .1188426
nonwhite | -1.008517 .1725174 -5.85 0.000 -1.346645 -.6703894
age | -.0333776 .0059183 -5.64 0.000 -.0449772 -.0217779
schlt12 | -.2258621 .1439543 -1.57 0.117 -.5080075 .0562832
schgt12 | .1505129 .124469 1.21 0.227 -.0934418 .3944677
smsa | .3009952 .119907 2.51 0.012 .0659819 .5360086
bluecoll | -.3211857 .1253163 -2.56 0.010 -.5668012 -.0755702
mining | -.2319827 .3008491 -0.77 0.441 -.8216361 .3576708
constr | -.1260324 .1633669 -0.77 0.440 -.4462257 .1941609
transp | -.2763858 .225893 -1.22 0.221 -.7191279 .1663562
trade | -.0687616 .1518284 -0.45 0.651 -.3663399 .2288166
fire | .0668973 .2131814 0.31 0.754 -.3509306 .4847252
services | .231914 .1494712 1.55 0.121 -.0610441 .5248721
pubadmin | .0901949 .4579252 0.20 0.844 -.807322 .9877117
year85 | .2780139 .1339053 2.08 0.038 .0155644 .5404634
year87 | .5208783 .1415375 3.68 0.000 .2434699 .7982867
year89 | .7209598 .1655487 4.35 0.000 .3964903 1.045429
midatl | -.0192077 .2222646 -0.09 0.931 -.4548382 .4164228
encen | -.0297055 .2284931 -0.13 0.897 -.4775438 .4181328
wncen | .2460338 .24216 1.02 0.310 -.2285911 .7206586
southatl | .3563643 .1793284 1.99 0.047 .0048872 .7078415
escen | .5461543 .2910193 1.88 0.061 -.024233 1.116542
wscen | .4606814 .2140966 2.15 0.031 .0410598 .880303
mountain | .017581 .2293804 0.08 0.939 -.4319963 .4671584
pacific | .1379886 .3636985 0.38 0.704 -.5748475 .8508247
_cons | -5.303059 1.34133 -3.95 0.000 -7.932017 -2.6741
-------------+----------------------------------------------------------------
/ln_p | .5611667 .0225898 24.84 0.000 .5168915 .6054418
/ln_the | 1.852696 .0896755 20.66 0.000 1.676935 2.028457
-------------+----------------------------------------------------------------
p | 1.752716 .0395935 1.676807 1.832062
1/p | .570543 .0128884 .5458332 .5963715
theta | 6.376987 .5718595 5.349136 7.602343
------------------------------------------------------------------------------
. estimates store bweibigr1
.
------------------------------------------------------------------------------
3343 total obs.
0 exclusions
at risk from t = 0
earliest observed entry t = 0
last observed exit t = 28

. streg $xlist, nolog nohr robust dist(weibull) frailty(invgauss)

No. of subjects = 3343                             Number of obs = 3343
No. of failures = 339
Time at risk    = 20887
                                                   Wald chi2(40) = 253.77
                                                   Prob > chi2   = 0.0000
------------------------------------------------------------------------------
| Robust
_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | -.3802006 1.452095 -0.26 0.793 -3.226255 2.465854
DR | -1.689504 1.779553 -0.95 0.342 -5.177363 1.798355
UI | -2.063963 .7469659 -2.76 0.006 -3.527989 -.5999369
RRUI | -.3019038 1.702153 -0.18 0.859 -3.638063 3.034255
DRUI | 3.263067 2.469908 1.32 0.186 -1.577863 8.103998
LOGWAGE | -.4954862 .2614747 -1.89 0.058 -1.007967 .0169948
tenure | .0174014 .0192239 0.91 0.365 -.0202768 .0550795
slack | -.3889861 .1911789 -2.03 0.042 -.7636898 -.0142824
abolpos | -.8027208 .2877528 -2.79 0.005 -1.366706 -.2387356
explose | .1187808 .1663987 0.71 0.475 -.2073546 .4449162
stateur | -.1753726 .059272 -2.96 0.003 -.2915437 -.0592015
houshead | -.0832153 .1944376 -0.43 0.669 -.464306 .2978754
married | -.0092249 .1945187 -0.05 0.962 -.3904747 .3720248
female | .6284921 .2064768 3.04 0.002 .223805 1.033179
child | -.389325 .2127697 -1.83 0.067 -.806346 .0276959
ychild | .3144939 .2663886 1.18 0.238 -.2076182 .836606
nonwhite | -.6691885 .2633831 -2.54 0.011 -1.18541 -.1529671
age | -.0034533 .0093696 -0.37 0.712 -.0218174 .0149108
schlt12 | -.3242365 .2380109 -1.36 0.173 -.7907293 .1422562
schgt12 | -.0745655 .2138285 -0.35 0.727 -.4936618 .3445307
smsa | .2107394 .2012744 1.05 0.295 -.1837512 .60523
bluecoll | -.0065426 .2175612 -0.03 0.976 -.4329548 .4198696
mining | .1293103 .6093175 0.21 0.832 -1.06493 1.323551
constr | .2870954 .2728176 1.05 0.293 -.2476172 .8218081
transp | -.6470251 .4118414 -1.57 0.116 -1.454219 .1601692
trade | .1901489 .2529975 0.75 0.452 -.3057172 .6860149
fire | -.4680763 .4488502 -1.04 0.297 -1.347807 .411654
services | .2462185 .2531429 0.97 0.331 -.2499325 .7423696
pubadmin | 1.351206 .7621665 1.77 0.076 -.1426127 2.845025
year85 | -.1501166 .2195046 -0.68 0.494 -.5803377 .2801044
year87 | -.2400145 .236954 -1.01 0.311 -.7044358 .2244069
year89 | .1828811 .2831188 0.65 0.518 -.3720216 .7377838
midatl | -.4074373 .3806192 -1.07 0.284 -1.153437 .3385627
encen | -.6525035 .381508 -1.71 0.087 -1.400245 .0952385
wncen | -.1300751 .3835973 -0.34 0.735 -.8819119 .6217617
southatl | -.3491396 .2954776 -1.18 0.237 -.928265 .2299859
escen | .2960895 .4558667 0.65 0.516 -.5973927 1.189572
wscen | -.0903554 .3527441 -0.26 0.798 -.7817212 .6010104
mountain | .3721587 .3457717 1.08 0.282 -.3055413 1.049859
pacific | -.1996218 .6042626 -0.33 0.741 -1.383955 .9847112
_cons | 1.157635 1.957298 0.59 0.554 -2.678599 4.993869
-------------+----------------------------------------------------------------
/ln_p | .5004283 .0361284 13.85 0.000 .429618 .5712386
/ln_the | 2.896807 .1749249 16.56 0.000 2.55396 3.239653
-------------+----------------------------------------------------------------
p | 1.649428 .0595911 1.53667 1.770459
1/p | .6062709 .0219036 .5648254 .6507577
theta | 18.11621 3.168976 12.85793 25.52487
.
------------------------------------------------------------------------------
3343 total obs.
0 exclusions
at risk from t = 0
earliest observed entry t = 0
last observed exit t = 28

. streg $xlist, nolog nohr robust dist(weibull) frailty(invgauss)

No. of subjects = 3343                             Number of obs = 3343
No. of failures = 574
Time at risk    = 20887
                                                   Wald chi2(40) = 416.91
                                                   Prob > chi2   = 0.0000
------------------------------------------------------------------------------
| Robust
_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | -.4326716 1.111223 -0.39 0.697 -2.610628 1.745285
DR | 1.166629 1.377826 0.85 0.397 -1.533861 3.867119
UI | -1.761667 .623017 -2.83 0.005 -2.982758 -.5405758
RRUI | -.5160276 1.418361 -0.36 0.716 -3.295964 2.263909
DRUI | 3.668779 1.93489 1.90 0.058 -.1235355 7.461093
LOGWAGE | -.0069584 .2162461 -0.03 0.974 -.4307929 .4168762
tenure | -.0677151 .0174959 -3.87 0.000 -.1020065 -.0334237
slack | -.7093182 .145145 -4.89 0.000 -.9937971 -.4248392
abolpos | -.4327781 .2106818 -2.05 0.040 -.8457069 -.0198494
explose | .0930879 .1284587 0.72 0.469 -.1586864 .3448623
stateur | -.1684826 .0472936 -3.56 0.000 -.2611764 -.0757887
houshead | .7760519 .1555864 4.99 0.000 .4711081 1.080996
married | .0849334 .1585652 0.54 0.592 -.2258487 .3957154
female | .329107 .1637254 2.01 0.044 .0082111 .6500028
child | -.2734744 .1667453 -1.64 0.101 -.6002892 .0533403
ychild | -.101407 .2021952 -0.50 0.616 -.4977024 .2948883
nonwhite | -.7325977 .211777 -3.46 0.001 -1.147673 -.3175223
age | -.0354358 .007992 -4.43 0.000 -.0510998 -.0197719
schlt12 | -.1729163 .1803828 -0.96 0.338 -.5264602 .1806275
schgt12 | .0955174 .1615133 0.59 0.554 -.2210429 .4120777
smsa | .0225321 .1500451 0.15 0.881 -.2715509 .3166151
bluecoll | .4311626 .1651405 2.61 0.009 .1074931 .7548321
mining | .4464055 .3724328 1.20 0.231 -.2835495 1.17636
constr | .1875875 .2104018 0.89 0.373 -.2247926 .5999675
transp | -.0190191 .2877627 -0.07 0.947 -.5830237 .5449855
trade | .1708654 .1960546 0.87 0.383 -.2133945 .5551253
fire | -.3548846 .3851005 -0.92 0.357 -1.109668 .3998985
services | .0199891 .1978478 0.10 0.920 -.3677854 .4077636
pubadmin | -2.249289 1.450209 -1.55 0.121 -5.091646 .5930688
year85 | .3978277 .1726143 2.30 0.021 .0595099 .7361456
year87 | .6809662 .1807412 3.77 0.000 .32672 1.035212
year89 | -.1380237 .2307311 -0.60 0.550 -.5902485 .314201
midatl | -.7908245 .3280754 -2.41 0.016 -1.43384 -.1478085
encen | -.1035781 .2984816 -0.35 0.729 -.6885913 .4814351
wncen | .2578004 .3150731 0.82 0.413 -.3597316 .8753324
southatl | .2314723 .2430344 0.95 0.341 -.2448663 .7078109
escen | -.6777305 .4486486 -1.51 0.131 -1.557065 .2016045
wscen | .0308173 .2842933 0.11 0.914 -.5263874 .5880219
mountain | .2849032 .2816226 1.01 0.312 -.267067 .8368734
pacific | .7162217 .4103619 1.75 0.081 -.0880727 1.520516
_cons | -1.42279 1.617429 -0.88 0.379 -4.592894 1.747313
-------------+----------------------------------------------------------------
/ln_p | .5795747 .026888 21.56 0.000 .5268752 .6322742
/ln_the | 2.262575 .1322516 17.11 0.000 2.003367 2.521783
-------------+----------------------------------------------------------------
p | 1.785279 .0480026 1.693632 1.881886
1/p | .5601365 .0150609 .5313819 .5904471
theta | 9.607798 1.270647 7.413974 12.45078
.
. estimates table bweibigr1 bweibigr2 bweibigr3, b(%10.3f) se(%10.3f) stats(N ll) /*
----------------------------------------------------
    Variable |  bweibigr1    bweibigr2    bweibigr3
-------------+--------------------------------------
          RR |      0.736       -0.380       -0.433
             |      0.906        1.452        1.111
          DR |     -1.073       -1.690        1.167
             |      1.149        1.780        1.378
          UI |     -2.575       -2.064       -1.762
             |      0.384        0.747        0.623
        RRUI |      1.734       -0.302       -0.516
             |      0.933        1.702        1.418
        DRUI |     -0.061        3.263        3.669
             |      1.538        2.470        1.935
     LOGWAGE |      0.576       -0.495       -0.007
             |      0.177        0.261        0.216
      tenure |     -0.001        0.017       -0.068
             |      0.010        0.019        0.017
-------------+--------------------------------------
           N |   3343.000     3343.000     3343.000
          ll |  -2616.322    -1230.164    -1696.846
----------------------------------------------------
                                        legend: b/se
.
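The ll rows of the three estimates tables can be compared across the nested specifications (exponential, Weibull, Weibull with inverse Gaussian heterogeneity). The Python sketch below is illustrative and not part of the original program; the names `ll`, `twice_gain` and `gain` are assumptions. It computes twice the gain in the log pseudo-likelihood for each competing-risk model. Because the models are estimated with the robust option, and because the no-heterogeneity case lies on the boundary of the parameter space, these quantities are best read as descriptive comparisons rather than formal likelihood ratio tests.

```python
# Log pseudo-likelihoods copied from the three `estimates table` outputs,
# ordered as (first, second, third) competing-risk model in each table.
ll = {
    "exponential": (-2700.690, -1250.545, -1742.396),
    "weibull":     (-2687.600, -1248.686, -1729.836),
    "weibull_ig":  (-2616.322, -1230.164, -1696.846),
}

def twice_gain(restricted, unrestricted):
    """Return 2 * (ll_unrestricted - ll_restricted) for each model in turn."""
    return [2.0 * (u - r) for r, u in zip(restricted, unrestricted)]

gain = {
    "weibull_vs_exponential": twice_gain(ll["exponential"], ll["weibull"]),
    "weibull_ig_vs_weibull": twice_gain(ll["weibull"], ll["weibull_ig"]),
}
for comparison, values in gain.items():
    print(comparison, [round(v, 3) for v in values])
```

The gain from adding inverse Gaussian heterogeneity is larger than the gain from moving from the exponential to the Weibull in all three models, consistent with the large estimated theta values reported in the log.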
. *** (2C) ESTIMATE COX MODEL SPECIFICATION OF COMPETING RISKS
.
------------------------------------------------------------------------------
3343 total obs.
0 exclusions
at risk from t = 0
earliest observed entry t = 0
last observed exit t = 28

. stcox $xlist, nolog nohr robust basesurv(survrisk1) basechazard(chrisk1)

No. of subjects = 3343                             Number of obs = 3343
No. of failures = 1073
Time at risk    = 20887
                                                   Wald chi2(40) = 540.98
                                                   Prob > chi2   = 0.0000
------------------------------------------------------------------------------
| Robust
_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | .5222796 .5711698 0.91 0.361 -.5971926 1.641752
DR | -.752507 .72175 -1.04 0.297 -2.167111 .6620971
UI | -1.317719 .2372893 -5.55 0.000 -1.782798 -.8526409
RRUI | .8822462 .582115 1.52 0.130 -.2586783 2.023171
DRUI | -.0951357 .977774 -0.10 0.922 -2.011538 1.821266
LOGWAGE | .3352639 .1106483 3.03 0.002 .1183972 .5521306
tenure | .0008278 .0061286 0.14 0.893 -.0111841 .0128396
slack | -.247863 .0721173 -3.44 0.001 -.3892103 -.1065158
abolpos | -.1511638 .0905035 -1.67 0.095 -.3285475 .0262198
explose | .1865068 .0615742 3.03 0.002 .0658236 .30719
stateur | -.0590475 .022085 -2.67 0.008 -.1023334 -.0157616
houshead | .3601866 .0794827 4.53 0.000 .2044035 .5159698
married | .358819 .0746355 4.81 0.000 .2125362 .5051019
female | .1002758 .0813277 1.23 0.218 -.0591236 .2596753
child | -.0396054 .0755365 -0.52 0.600 -.1876542 .1084435
ychild | -.1276638 .0967856 -1.32 0.187 -.3173602 .0620325
nonwhite | -.6394475 .1151332 -5.55 0.000 -.8651043 -.4137906
age | -.0204623 .0037593 -5.44 0.000 -.0278305 -.0130942
schlt12 | -.1220585 .0920073 -1.33 0.185 -.3023895 .0582726
schgt12 | .1104817 .0783542 1.41 0.159 -.0430897 .2640531
smsa | .1864841 .0766075 2.43 0.015 .0363361 .3366321
bluecoll | -.2108023 .080867 -2.61 0.009 -.3692986 -.052306
mining | -.1238251 .1906352 -0.65 0.516 -.4974632 .249813
constr | -.054455 .1029488 -0.53 0.597 -.256231 .1473209
transp | -.1551657 .1466515 -1.06 0.290 -.4425973 .1322659
trade | -.0383252 .0968106 -0.40 0.692 -.2280706 .1514201
fire | .1097585 .1300779 0.84 0.399 -.1451895 .3647065
services | .1666262 .0939507 1.77 0.076 -.0175138 .3507662
pubadmin | .1022002 .2829817 0.36 0.718 -.4524336 .6568341
year85 | .204162 .084908 2.40 0.016 .0377454 .3705786
year87 | .3384229 .0899115 3.76 0.000 .1621997 .5146462
year89 | .4486559 .104937 4.28 0.000 .2429832 .6543286
midatl | .0342238 .140515 0.24 0.808 -.2411805 .3096282
encen | .0174597 .1438862 0.12 0.903 -.2645521 .2994716
wncen | .1650967 .1532559 1.08 0.281 -.1352795 .4654728
southatl | .2518023 .1127138 2.23 0.025 .0308874 .4727172
escen | .3450422 .1839818 1.88 0.061 -.0155554 .7056398
wscen | .3316752 .1359801 2.44 0.015 .0651591 .5981914
mountain | .009484 .1468626 0.06 0.949 -.2783613 .2973293
pacific | .0720292 .2263339 0.32 0.750 -.3715771 .5156355
------------------------------------------------------------------------------
. estimates store bcoxrisk1
.

------------------------------------------------------------------------------
3343 total obs.
0 exclusions
0
0
28
No. of subjects =         3343                     Number of obs   =      3343
No. of failures =          339
Time at risk    =        20887
                                                   Wald chi2(40)   =    211.82
Log pseudo-likelihood =  -2444.342                 Prob > chi2     =    0.0000
------------------------------------------------------------------------------
             |               Robust
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | -.0719673 .9513101 -0.08 0.940 -1.936501 1.792566
DR | -1.0236 1.193087 -0.86 0.391 -3.362007 1.314807
UI | -.906022 .5109396 -1.77 0.076 -1.907445 .0954013
RRUI | -.7818457 1.166182 -0.67 0.503 -3.06752 1.503829
DRUI | 2.031968 1.671862 1.22 0.224 -1.244821 5.308756
LOGWAGE | -.2800345 .1736454 -1.61 0.107 -.6203732 .0603043
tenure | .0059934 .0122664 0.49 0.625 -.0180483 .0300352
slack | -.2476685 .12775 -1.94 0.053 -.498054 .0027169
abolpos | -.5434923 .1976775 -2.75 0.006 -.9309331 -.1560516
explose | .0334802 .1101886 0.30 0.761 -.1824856 .2494459
stateur | -.0923228 .0393339 -2.35 0.019 -.1694157 -.0152299
houshead | -.0864111 .1303336 -0.66 0.507 -.3418602 .1690379
married | -.065464 .1298376 -0.50 0.614 -.3199409 .189013
female | .4386603 .1340263 3.27 0.001 .1759735 .7013471
child | -.2049337 .1413612 -1.45 0.147 -.4819966 .0721293
ychild | .1556684 .1766059 0.88 0.378 -.1904727 .5018095
nonwhite | -.3956483 .1761206 -2.25 0.025 -.7408382 -.0504583
age | .0001207 .0062519 0.02 0.985 -.0121327 .0123741
schlt12 | -.1723734 .1618354 -1.07 0.287 -.489565 .1448182

schgt12 | -.0583556 .142103 -0.41 0.681 -.3368724 .2201611
smsa | .1120279 .1334106 0.84 0.401 -.1494521 .3735079
bluecoll | -.0021333 .1460376 -0.01 0.988 -.2883617 .2840951
mining | -.0132972 .401138 -0.03 0.974 -.7995132 .7729188
constr | .1654229 .1852256 0.89 0.372 -.1976127 .5284584
transp | -.3818733 .2831048 -1.35 0.177 -.9367485 .1730019
trade | .1065755 .1677346 0.64 0.525 -.2221782 .4353293
fire | -.345295 .2945472 -1.17 0.241 -.9225969 .2320068
services | .1443583 .1664345 0.87 0.386 -.1818474 .470564
pubadmin | .7203208 .5238954 1.37 0.169 -.3064953 1.747137
year85 | -.0647735 .1460286 -0.44 0.657 -.3509844 .2214373
year87 | -.138436 .1574958 -0.88 0.379 -.4471221 .1702502
year89 | .100033 .1887671 0.53 0.596 -.2699437 .4700097
midatl | -.3838124 .2529706 -1.52 0.129 -.8796257 .1120009
encen | -.5058645 .2521219 -2.01 0.045 -1.000014 -.0117146
wncen | -.081463 .2512893 -0.32 0.746 -.5739811 .411055
southatl | -.2799968 .1891246 -1.48 0.139 -.6506742 .0906805
escen | .0372908 .2993588 0.12 0.901 -.5494417 .6240233
wscen | -.1157119 .2286912 -0.51 0.613 -.5639385 .3325146
mountain | .204597 .2206239 0.93 0.354 -.2278179 .6370119
pacific | -.2138749 .3899895 -0.55 0.583 -.9782404 .5504905
------------------------------------------------------------------------------
. estimates store bcoxrisk2
.
------------------------------------------------------------------------------
3343 total obs.
0 exclusions
0
0
28
No. of subjects =         3343                     Number of obs   =      3343
No. of failures =          574
Time at risk    =        20887
                                                   Wald chi2(40)   =    357.81
                                                   Prob > chi2     =    0.0000
------------------------------------------------------------------------------
             |               Robust
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
RR | -.4692082 .7157644 -0.66 0.512 -1.872081 .9336643
DR | .8759221 .8786992 1.00 0.319 -.8462967 2.598141
UI | -.9051384 .4449384 -2.03 0.042 -1.777202 -.0330753
RRUI | -.5392752 1.002388 -0.54 0.591 -2.503919 1.425369
DRUI | 2.293752 1.274021 1.80 0.072 -.2032836 4.790787
LOGWAGE | -.0140883 .1415912 -0.10 0.921 -.291602 .2634253
tenure | -.0465013 .0118142 -3.94 0.000 -.0696567 -.0233458
slack | -.4587556 .0952092 -4.82 0.000 -.6453621 -.2721491
abolpos | -.2743895 .136703 -2.01 0.045 -.5423223 -.0064566
explose | .0199625 .0843281 0.24 0.813 -.1453176 .1852426
stateur | -.1013309 .0311307 -3.26 0.001 -.1623459 -.0403159
houshead | .5154239 .1031203 5.00 0.000 .3133117 .717536
married | .0280002 .1037338 0.27 0.787 -.1753143 .2313148
female | .2477194 .1071841 2.31 0.021 .0376425 .4577962
child | -.1477253 .1086376 -1.36 0.174 -.3606511 .0652005
ychild | -.0702224 .1341067 -0.52 0.601 -.3330667 .1926219
nonwhite | -.4472066 .1401892 -3.19 0.001 -.7219723 -.1724409
age | -.0227849 .0053188 -4.28 0.000 -.0332096 -.0123602
schlt12 | -.1050265 .1191449 -0.88 0.378 -.3385462 .1284931
schgt12 | .0912594 .1057371 0.86 0.388 -.1159815 .2985004
smsa | .0078536 .0994133 0.08 0.937 -.1869928 .2027
bluecoll | .2916892 .1085873 2.69 0.007 .0788619 .5045165
mining | .2392902 .2514416 0.95 0.341 -.2535263 .7321067
constr | .0659352 .1393882 0.47 0.636 -.2072606 .339131
transp | -.0724276 .1845329 -0.39 0.695 -.4341054 .2892502
trade | .0824395 .1260009 0.65 0.513 -.1645178 .3293967
fire | -.3901171 .2648329 -1.47 0.141 -.90918 .1289458
services | .0007351 .1296195 0.01 0.995 -.2533144 .2547847
pubadmin | -1.749927 1.038715 -1.68 0.092 -3.785771 .2859182
year85 | .2810465 .1124259 2.50 0.012 .0606957 .5013973
year87 | .4139684 .117016 3.54 0.000 .1846212 .6433155
year89 | -.1485614 .1590621 -0.93 0.350 -.4603173 .1631946
midatl | -.5271828 .2165005 -2.44 0.015 -.9515159 -.1028497
encen | -.063171 .1962513 -0.32 0.748 -.4478166 .3214745
wncen | .134275 .2051501 0.65 0.513 -.2678118 .5363617
southatl | .1522905 .1610446 0.95 0.344 -.1633512 .4679321
escen | -.5030762 .3118938 -1.61 0.107 -1.114377 .1082245
wscen | .0116807 .1858946 0.06 0.950 -.352666 .3760273
mountain | .2043736 .1827277 1.12 0.263 -.1537662 .5625134
pacific | .4327009 .2661013 1.63 0.104 -.088848 .9542498
------------------------------------------------------------------------------
. estimates store bcoxrisk3

.
. * Table 19.3 (page 659) last three columns
. * NOTE: The results from this program differ a little from those
. *       given in text. Need to resolve this.
. estimates table bcoxrisk1 bcoxrisk2 bcoxrisk3, b(%10.3f) se(%10.3f) stats(N ll) /*
-----------------------------------------------------
    Variable |  bcoxrisk1    bcoxrisk2    bcoxrisk3
-------------+---------------------------------------
          RR |      0.522       -0.072       -0.469
             |      0.571        0.951        0.716
          DR |     -0.753       -1.024        0.876
             |      0.722        1.193        0.879
          UI |     -1.318       -0.906       -0.905
             |      0.237        0.511        0.445
        RRUI |      0.882       -0.782       -0.539
             |      0.582        1.166        1.002
        DRUI |     -0.095        2.032        2.294
             |      0.978        1.672        1.274
     LOGWAGE |      0.335       -0.280       -0.014
             |      0.111        0.174        0.142
      tenure |      0.001        0.006       -0.047
             |      0.006        0.012        0.012
-------------+---------------------------------------
           N |   3343.000     3343.000     3343.000
          ll |  -7717.233    -2444.342    -4094.236
-----------------------------------------------------
                                         legend: b/se
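The coefficients in the table above are on the log-hazard scale. As a reading aid, the sketch below converts two of the reported point estimates to hazard ratios by exponentiating them. It is an illustration added here and is not part of the original do-file; the numbers are copied by hand from the table.

```stata
* Illustration only (not in the original program).
* Hazard ratio = exp(coefficient); values copied from the table above.
display "UI, risk 1 (full-time job):      " exp(-1.318)
display "LOGWAGE, risk 1 (full-time job): " exp(0.335)
```

These come out at roughly 0.27 and 1.40. A hazard ratio below 1 means a lower hazard of exit to that destination, other regressors held fixed. The output above reports coefficients rather than hazard ratios, which suggests the Cox models were fitted with the nohr option.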
.
. *** (2D) GRAPHS FOR COX COMPETING RISKS MODEL
.
. * Figure 19.1 (page 661) - Plot the three baseline survival functions
. sort _t
. graph twoway (scatter survrisk1 _t, c(J) msymbol(i) msize(small) clstyle(p1)) /*
> */ (scatter survrisk2 _t, c(J) msymbol(i) msize(small) clstyle(p2)) /*
> */ (scatter survrisk3 _t, c(J) msymbol(i) msize(small) clstyle(p3)), /*
> */ title("Baseline Survival Functions") /*
> */ ytitle("Baseline Survival Probability", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend( label(1 "Risk 1 (full-time job)") label(2 "Risk 2 (part-time job)") label(3 "Risk 3 (
> unknown job)"))
. graph export combined_bsf.wmf, replace
(file c:\Imbook\bwebpage\Section4\combined_bsf.wmf written in Windows Metafile format)
.
. * Figure 19.2 (page 659) - Plot the three baseline cumulative hazards
. sort _t
. graph twoway (scatter chrisk1 _t, c(J) msymbol(i) msize(small) clstyle(p1)) /*
> */ (scatter chrisk2 _t, c(J) msymbol(i) msize(small) clstyle(p2)) /*
> */ (scatter chrisk3 _t, c(J) msymbol(i) msize(small) clstyle(p3)), /*
> */ title("Baseline Cumulative Hazard Functions") /*
> */ ytitle("Baseline Cumulative Hazard", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend( label(1 "Risk 1 (full-time job)") label(2 "Risk 2 (part-time job)") label(3 "Risk 3 (
> unknown job)"))
. graph export combined_cbh.wmf, replace
(file c:\Imbook\bwebpage\Section4\combined_cbh.wmf written in Windows Metafile format)
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section4\mma19p1comprisks.txt
log type: text
closed on: 19 May 2005, 17:53:08
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section4\mma20p1count.txt
log type: text
opened on: 20 May 2005, 08:41:33
.
. ********* OVERVIEW OF MMA20P1COUNT.DO **********
.
. * STATA Program
.
. * Chapter 20.3 pages 671-4 and 20.7 page 690
. * Count data regression example
. * It provides
. * (1) Frequency distribution for count (Table 20.3)
. * (2) Data summary (Table 20.4)
. * (3) Poisson regression with various standard errors (Table 20.5)
. * (4) Negative binomial regression with various standard errors (Table 20.5)
.
. * To use this program you need health expenditure data in Stata data set
. * randdata.dta
.
. ********** SETUP **********
.
. set more off
. version 8.0
.
.
. * Essentially same data as in P. Deb and P.K. Trivedi (2002)
. * "The Structure of Demand for Medical Care: Latent Class versus
. * Two-Part Models", Journal of Health Economics, 21, 601-625
. * except that paper used different outcome (counts rather than $)
.
. * Each observation is for an individual over a year.
. * Individuals may appear in up to five years.
. * All available sample is used except only fee for service plans included.
. * In analysis here only year 2 is used so panel complications are avoided.
. * Clustering of individuals within household is ignored here.
.
. * Dependent variable is
. *     MED       med       Annual medical expenditures in constant dollars
. *                         excluding dental and outpatient mental
. *     LNMED     lnmeddol  Ln(Medical expenditures) given meddol > 0
. *                         Missing otherwise
. *     DMED      binexp    1 if medical expenditures > 0
.
. * Regressors are
. * - Health insurance measures
. *     LC        logc      log(coinsrate+1) where coinsurance rate is 0 to 100
. *     IDP       idp       1 if individual deductible plan
. *     LPI       lpi       log(annual participation incentive payment) or 0 if no payment
. *     FMDE      fmde      log(max(medical deductible expenditure)) if IDP=1 and MDE>1 or 0 otherwise.
. * - Health status measures
. *     NDISEASE  disea     number of chronic diseases
. *     PHYSLIM   physlm    1 if physical limitation
. *     HLTHG     hlthg     1 if good health
. *     HLTHF     hlthf     1 if fair health
. *     HLTHP     hlthp     1 if poor health (omitted is excellent)
. * - Socioeconomic characteristics
. *     LINC      linc      log of annual family income (in $)
. *     LFAM      lfam      log of family size
. *     EDUCDEC   educdec   years of schooling of decision maker
. *     AGE       xage      exact age
. *     BLACK     black     1 if black
. *     FEMALE    female    1 if female
. *     CHILD     child     1 if child
. *     FEMCHILD  fchild    1 if female child
.
. * If panel data used then clustering is on
. *     zper                person id
.
. ********** READ DATA, SELECT AND TRANSFORM **********
.
. use randdata.dta, clear
. sum
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
        plan |     20190    11.17553    3.976751           1         19
        site |     20190    3.298811     1.80382           1          6
       coins |     20190     26.3056    36.40386           0        100
    tookphys |     20190    .5974245    .4904288           0          1
        year |     20190    2.420109    1.217141           1          5
-------------+---------------------------------------------------------
        zper |     20190    357965.5    180868.1      125024     632167
       black |     20190    .1814983    .3827071           0          1
      income |     20190    8037.409    4058.371           0   29237.54
        xage |     20190    25.72233    16.76945           0   64.27515
      female |     20190    .5170381     .499722           0          1
-------------+---------------------------------------------------------
     educdec |     20186    11.96681    2.806255           0         25
        time |     20190    .9989561    .0259741    .0767123          1
     outpdol |     20190    51.12649    94.92627           0   2599.902
     drugdol |     20190     13.1687    33.76212           0   706.3979
     suppdol |     20190      6.8024    21.39346           0    1009.47
-------------+---------------------------------------------------------
     mentdol |     20190    6.870347    58.41298           0   1340.834
      inpdol |     20190    100.4694    655.6215           0   38649.81
      meddol |     20190    171.5679    698.2015           0   39182.02
      totadm |     20190    .1127291    .4111857           0          8
      inpmis |     20190    .0039624     .062824           0          1
-------------+---------------------------------------------------------
     mentvis |     20190    .4322437    3.430789           0         62
       mdvis |     20190    2.860426    4.504365           0         77
    notmdvis |     20190    .6855869    3.763543           0        109
         num |     20190    3.954235    1.853034           1         14
         mhi |     20190    76.55584    12.50224        12.2        100
-------------+---------------------------------------------------------
       disea |     20190    11.24449    6.741449           0       58.6
      physlm |     20190    .1235003    .3220164           0          1
      ghindx |     14967    73.09055    15.99371         3.7        100
      mdeoff |     20185    417.8422    384.1199           0       1000
       pioff |     20185     446.677     367.466           0    1291.68
-------------+---------------------------------------------------------
       child |     20190    .4013373    .4901812           0          1
      fchild |     20190    .1937098    .3952139           0          1
        lfam |     20190    1.248156     .539301           0   2.639057
         lpi |     20190    4.707894     2.69784           0   7.163699
         idp |     20190    .2599802    .4386343           0          1
-------------+---------------------------------------------------------
        logc |     20190    2.383342    2.041776           0   4.564348
        fmde |     20190    4.029524    3.471353           0   8.294049
       hlthg |     20190    .3620109    .4805938           0          1
       hlthf |     20190     .077266    .2670196           0          1
       hlthp |     20190    .0149579    .1213874           0          1
-------------+---------------------------------------------------------
     xghindx |     20190     73.2375     14.2332         3.7        100
        linc |     20190    8.708265    1.228309           0   10.28324
        lnum |     20190    1.248156     .539301           0   2.639057
    lnmeddol |     15737    4.109318    1.484654   -.8495329   10.57597
      binexp |     20190    .7794453     .414631           0          1
.
. /* Describe and summarize the original data.
> describe
> summarize
> * The original data are a panel.
> * The following summarizes panel features for completeness
> iis zper
> tis year
> xtdes
> xtsum meddol lnmeddol binexp
> */
.
. * Note that unlike chapter 16 we use all years, not just year 2
.
. * educdec is missing for some observations
. drop if educdec==.
.
. * rename variables
. rename mdvis MDU
. rename meddol MED
. rename binexp DMED
. rename lnmeddol LNMED
. rename linc LINC
. rename lfam LFAM
. rename educdec EDUCDEC
. rename xage AGE
. rename female FEMALE
. rename child CHILD
. rename fchild FEMCHILD
. rename black BLACK
. rename disea NDISEASE
. rename physlm PHYSLIM
. rename hlthg HLTHG
. rename hlthf HLTHF
. rename hlthp HLTHP
. rename idp IDP
. rename logc LC
. rename lpi LPI
. rename fmde FMDE
.
. * Define the regressor list, which later commands can refer to as $XLIST
. global XLIST LC IDP LPI FMDE PHYSLIM NDISEASE HLTHG HLTHF HLTHP /*
> */ LINC LFAM EDUCDEC AGE FEMALE CHILD FEMCHILD BLACK
.
. sum MDU $XLIST
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
         MDU |     20186    2.860696    4.504765           0         77
          LC |     20186    2.383588    2.041713           0   4.564348
         IDP |     20186    .2599822    .4386354           0          1
         LPI |     20186    4.708827    2.697293           0   7.163699
        FMDE |     20186    4.030322    3.471234           0   8.294049
-------------+---------------------------------------------------------
     PHYSLIM |     20186    .1235247    .3220437           0          1
    NDISEASE |     20186     11.2445    6.741647           0       58.6
       HLTHG |     20186    .3620826    .4806144           0          1
       HLTHF |     20186    .0772813    .2670439           0          1
       HLTHP |     20186    .0149609    .1213992           0          1
-------------+---------------------------------------------------------
        LINC |     20186    8.708167     1.22841           0   10.28324
        LFAM |     20186    1.248404    .5390681           0   2.639057
     EDUCDEC |     20186    11.96681    2.806255           0         25
         AGE |     20186    25.71844    16.76759           0   64.27515
      FEMALE |     20186    .5169424    .4997252           0          1
-------------+---------------------------------------------------------
       CHILD |     20186    .4014168    .4901972           0          1
    FEMCHILD |     20186    .1937481    .3952436           0          1
       BLACK |     20186    .1815343    .3827365           0          1
.
. outfile MDU LC IDP LPI FMDE PHYSLIM NDISEASE HLTHG HLTHF HLTHP /*
> */ LINC LFAM EDUCDEC AGE FEMALE CHILD FEMCHILD BLACK /*
> */ using mma20p1count.asc, replace
.
. ********** (1) FREQUENCIES OF COUNT (Table 20.3, page 672) **********
.
. * Following gives Table 20.3 (page 672) frequencies
. tabulate MDU
     number |
face-to-fac |
t md visits |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |      6,308       31.25       31.25
          1 |      3,815       18.90       50.15
          2 |      2,795       13.85       63.99
          3 |      1,884        9.33       73.33
          4 |      1,345        6.66       79.99
          5 |        968        4.80       84.79
          6 |        689        3.41       88.20
          7 |        531        2.63       90.83
          8 |        408        2.02       92.85
          9 |        287        1.42       94.27
         10 |        206        1.02       95.29
         11 |        190        0.94       96.24
         12 |        118        0.58       96.82
         13 |        109        0.54       97.36
         14 |         82        0.41       97.77
         15 |         59        0.29       98.06
         16 |         56        0.28       98.34
         17 |         33        0.16       98.50
         18 |         37        0.18       98.68
         19 |         35        0.17       98.86
         20 |         26        0.13       98.98
         21 |         22        0.11       99.09
         22 |         19        0.09       99.19
         23 |         19        0.09       99.28
         24 |         13        0.06       99.35
         25 |          8        0.04       99.39
         26 |         10        0.05       99.44
         27 |          6        0.03       99.46
         28 |         12        0.06       99.52
         29 |          6        0.03       99.55
         30 |          8        0.04       99.59
         31 |          8        0.04       99.63
         32 |          4        0.02       99.65
         33 |          5        0.02       99.68
         34 |          9        0.04       99.72
         35 |          5        0.02       99.75
         37 |          5        0.02       99.77
         38 |          9        0.04       99.82
         39 |          1        0.00       99.82
         40 |          3        0.01       99.84
         41 |          5        0.02       99.86
         44 |          6        0.03       99.89
         45 |          2        0.01       99.90
         46 |          2        0.01       99.91
         48 |          2        0.01       99.92
         51 |          1        0.00       99.93
         52 |          3        0.01       99.94
         55 |          1        0.00       99.95
         56 |          1        0.00       99.95
         57 |          1        0.00       99.96
         58 |          1        0.00       99.96
         62 |          1        0.00       99.97
         63 |          1        0.00       99.97
         65 |          1        0.00       99.98
         69 |          1        0.00       99.98
         72 |          1        0.00       99.99
         74 |          1        0.00       99.99
         76 |          1        0.00      100.00
         77 |          1        0.00      100.00
------------+-----------------------------------
      Total |     20,186      100.00
.
. * Histogram with kernel density estimate
. hist MDU, discrete kdensity
(start=0, width=1)
.
. ********** (2) DATA SUMMARY (Table 20.4, page 672) **********
.
. * Following gives variables in same order as Table 20.4 (page 672)
. sum MDU LC IDP LPI FMDE LINC LFAM AGE FEMALE CHILD FEMCHILD BLACK /*
> */ EDUCDEC PHYSLIM NDISEASE HLTHG HLTHF HLTHP
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
         MDU |     20186    2.860696    4.504765           0         77
          LC |     20186    2.383588    2.041713           0   4.564348
         IDP |     20186    .2599822    .4386354           0          1
         LPI |     20186    4.708827    2.697293           0   7.163699
        FMDE |     20186    4.030322    3.471234           0   8.294049
-------------+---------------------------------------------------------
        LINC |     20186    8.708167     1.22841           0   10.28324
        LFAM |     20186    1.248404    .5390681           0   2.639057
         AGE |     20186    25.71844    16.76759           0   64.27515
      FEMALE |     20186    .5169424    .4997252           0          1
       CHILD |     20186    .4014168    .4901972           0          1
-------------+---------------------------------------------------------
    FEMCHILD |     20186    .1937481    .3952436           0          1
       BLACK |     20186    .1815343    .3827365           0          1
     EDUCDEC |     20186    11.96681    2.806255           0         25
     PHYSLIM |     20186    .1235247    .3220437           0          1
    NDISEASE |     20186     11.2445    6.741647           0       58.6
-------------+---------------------------------------------------------
       HLTHG |     20186    .3620826    .4806144           0          1
       HLTHF |     20186    .0772813    .2670439           0          1
       HLTHP |     20186    .0149609    .1213992           0          1
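The summary statistics for MDU already hint at overdispersion, since a Poisson variable has variance equal to its mean. The sketch below, added here as an illustration and not part of the original program, computes the raw variance-to-mean ratio, first from the figures reported above and then from the data in memory (this second step assumes MDU is still loaded).

```stata
* Illustration only (not in the original do-file).
* Raw variance-to-mean ratio of MDU from the reported summary statistics
* (mean 2.860696, std. dev. 4.504765).
display 4.504765^2 / 2.860696

* Same quantity computed from the data, using results saved by summarize.
quietly summarize MDU
display r(Var) / r(mean)
```

The ratio is roughly 7. This is an unconditional comparison, so part of the excess variance may be explained once regressors are included.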
.
.
. *********** (3, 4) REGRESSION ANALYSIS **************
.
. * Here just two estimators - Poisson and negative binomial
. * but three ways to calculate standard errors
. * (A) default ML
. * (B) robust (to misspecification of heteroskedasticity)
. * (C) cluster-robust needed here as data are actually panel (see chapter 21, 24)
.
. *** Table 20.5 Poisson regression estimates
.
. * Default standard errors assume variance = mean (ignoring overdispersion)
. * This is first t-ratio in Table 20.5
. poisson MDU $XLIST
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:

Poisson regression                                Number of obs   =      20186
                                                  LR chi2(17)     =   13106.07
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.0983
------------------------------------------------------------------------------
         MDU |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
LC | -.0427332 .0060785 -7.03 0.000 -.0546469 -.0308195
IDP | -.1613169 .0116218 -13.88 0.000 -.1840952 -.1385385
LPI | .0128511 .0018362 7.00 0.000 .0092523 .0164499
FMDE | -.020613 .0035521 -5.80 0.000 -.027575 -.0136511
PHYSLIM | .2684048 .0123624 21.71 0.000 .2441749 .2926347
NDISEASE | .023183 .0006081 38.12 0.000 .0219912 .0243749
HLTHG | .0394004 .0095884 4.11 0.000 .0206074 .0581934
HLTHF | .2531119 .016212 15.61 0.000 .2213369 .2848869
HLTHP | .5216034 .0272382 19.15 0.000 .4682176 .5749892
LINC | .0834099 .0051656 16.15 0.000 .0732854 .0935343
LFAM | -.1296626 .0089603 -14.47 0.000 -.1472245 -.1121008
EDUCDEC | .0176149 .0016387 10.75 0.000 .0144031 .0208268
AGE | .0023756 .0004311 5.51 0.000 .0015306 .0032206
FEMALE | .3487667 .0113504 30.73 0.000 .3265203 .371013
CHILD | .3361904 .0178194 18.87 0.000 .3012649 .3711158
FEMCHILD | -.3625218 .0179396 -20.21 0.000 -.3976827 -.3273608
BLACK | -.6800518 .0155484 -43.74 0.000 -.7105262 -.6495775
_cons | -.1898766 .0491731 -3.86 0.000 -.2862541 -.093499
------------------------------------------------------------------------------
. estimates store poisml
.
. * Should always control for possible overdispersion
. * This is second t-ratio in Table 20.5
. poisson MDU $XLIST, robust

Poisson regression                                Number of obs   =      20186
                                                  Wald chi2(17)   =    1924.78
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.0983
------------------------------------------------------------------------------
             |               Robust
         MDU |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
LC | -.0427332 .0150712 -2.84 0.005 -.0722723 -.0131942
IDP | -.1613169 .0279441 -5.77 0.000 -.2160863 -.1065474
LPI | .0128511 .0044136 2.91 0.004 .0042007 .0215015
FMDE | -.020613 .0088874 -2.32 0.020 -.0380319 -.0031941
PHYSLIM | .2684048 .0325743 8.24 0.000 .2045604 .3322493
NDISEASE | .023183 .0017189 13.49 0.000 .019814 .0265521
HLTHG | .0394004 .023194 1.70 0.089 -.006059 .0848598
HLTHF | .2531119 .0429454 5.89 0.000 .1689405 .3372833
HLTHP | .5216034 .0748808 6.97 0.000 .3748398 .668367
LINC | .0834099 .0139182 5.99 0.000 .0561306 .1106891
LFAM | -.1296626 .0226793 -5.72 0.000 -.1741132 -.085212
EDUCDEC | .0176149 .004042 4.36 0.000 .0096927 .0255371
AGE | .0023756 .0011184 2.12 0.034 .0001837 .0045675
FEMALE | .3487667 .0283549 12.30 0.000 .293192 .4043413
CHILD | .3361904 .040411 8.32 0.000 .2569863 .4153945
FEMCHILD | -.3625218 .04415 -8.21 0.000 -.4490542 -.2759893
BLACK | -.6800518 .0368748 -18.44 0.000 -.7523252 -.6077785
_cons | -.1898766 .127516 -1.49 0.136 -.4398033 .0600502
------------------------------------------------------------------------------
. estimates store poisrobust
.
. * Should also control here for clustering (see chapter 24)
. * as up to four years of data for each person.
. * Table 20.5 did not report these results
. poisson MDU $XLIST, cluster(zper)
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:

Poisson regression                                Number of obs   =      20186
                                                  Wald chi2(17)   =     827.07
                                                  Prob > chi2     =     0.0000
(standard errors adjusted for clustering on zper)
------------------------------------------------------------------------------
             |               Robust
         MDU |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
LC | -.0427332 .0226824 -1.88 0.060 -.0871899 .0017235
IDP | -.1613169 .0424591 -3.80 0.000 -.2445352 -.0780986
LPI | .0128511 .0067697 1.90 0.058 -.0004173 .0261195
FMDE | -.020613 .0134449 -1.53 0.125 -.0469646 .0057386
PHYSLIM | .2684048 .0491061 5.47 0.000 .1721586 .364651
NDISEASE | .023183 .0027457 8.44 0.000 .0178015 .0285645
HLTHG | .0394004 .0354001 1.11 0.266 -.0299825 .1087833
HLTHF | .2531119 .0675164 3.75 0.000 .1207822 .3854416
HLTHP | .5216034 .1163731 4.48 0.000 .2935163 .7496905
LINC | .0834099 .0200881 4.15 0.000 .0440379 .1227818
LFAM | -.1296626 .0340038 -3.81 0.000 -.1963089 -.0630164
EDUCDEC | .0176149 .0062678 2.81 0.005 .0053302 .0298996
AGE | .0023756 .0016549 1.44 0.151 -.0008681 .0056192
FEMALE | .3487667 .0432567 8.06 0.000 .263985 .4335483
CHILD | .3361904 .0586109 5.74 0.000 .2213151 .4510656
FEMCHILD | -.3625218 .0660639 -5.49 0.000 -.4920045 -.233039
BLACK | -.6800518 .0544268 -12.49 0.000 -.7867263 -.5733774
_cons | -.1898766 .1860343 -1.02 0.307 -.5544971 .174744
------------------------------------------------------------------------------
. estimates store poiscluster
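Poisson coefficients are semi-elasticities of the conditional mean, so exponentiating a coefficient gives the multiplicative effect on the expected number of visits (an incidence-rate ratio). The sketch below is an added illustration, not part of the original program; the two coefficients are copied from the output above, and the last command assumes the data and $XLIST are still in memory.

```stata
* Illustration only (not in the original do-file).
* Incidence-rate ratios implied by two reported Poisson coefficients.
display "BLACK:  " exp(-.6800518)
display "FEMALE: " exp(.3487667)

* Alternatively, have poisson report incidence-rate ratios directly,
* keeping the cluster-robust standard errors.
poisson MDU $XLIST, irr cluster(zper)
```

The two ratios are roughly 0.51 and 1.42. The point estimates are identical across the default, robust and cluster-robust fits; only the standard errors change.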
.
. *** Table 20.5 Negative binomial regression estimates
.
. * Default standard errors assume the negative binomial variance function is correct
. * This is first t-ratio in Table 20.5
. nbreg MDU $XLIST
Fitting Poisson model:
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:


Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:

Fitting full model:


Negative binomial regression                      Number of obs   =      20186
                                                  LR chi2(17)     =    2828.01
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.0320
------------------------------------------------------------------------------
         MDU |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
LC | -.0504405 .0128694 -3.92 0.000 -.0756641 -.0252169
IDP | -.1475976 .0254099 -5.81 0.000 -.1974001 -.0977951
LPI | .0158351 .0040586 3.90 0.000 .0078805 .0237898
FMDE | -.021335 .0075119 -2.84 0.005 -.036058 -.0066119
PHYSLIM | .2751715 .0295572 9.31 0.000 .2172404 .3331026
NDISEASE | .0259352 .0014827 17.49 0.000 .0230292 .0288412
HLTHG | .0065371 .0202235 0.32 0.747 -.0331002 .0461744
HLTHF | .2368643 .0374086 6.33 0.000 .1635448 .3101837
HLTHP | .4256563 .0741812 5.74 0.000 .2802638 .5710488
LINC | .0845165 .0085659 9.87 0.000 .0677277 .1013053
LFAM | -.1226764 .019308 -6.35 0.000 -.1605195 -.0848333
EDUCDEC | .0162582 .0034846 4.67 0.000 .0094285 .0230879
AGE | .0025943 .0009433 2.75 0.006 .0007455 .0044432
FEMALE | .3672884 .024005 15.30 0.000 .3202395 .4143373
CHILD | .3060317 .0385618 7.94 0.000 .230452 .3816115
FEMCHILD | -.3755503 .0371392 -10.11 0.000 -.4483418 -.3027587
BLACK | -.7104372 .0274929 -25.84 0.000 -.7643223 -.6565521
_cons | -.2069298 .0899431 -2.30 0.021 -.3832151 -.0306445
-------------+----------------------------------------------------------------
/lnalpha | .1674206 .0147901 .1384326 .1964087
-------------+----------------------------------------------------------------
alpha | 1.182251 .0174856 1.148472 1.217024
------------------------------------------------------------------------------
Likelihood-ratio test of alpha=0: chibar2(01) = 3.5e+04 Prob>=chibar2 = 0.000
. estimates store nbml
.
. * Should always control for possible overdispersion
. * This is second t-ratio in Table 20.5
. nbreg MDU $XLIST, robust
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:


Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:

Fitting full model:

Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:

                                                  Number of obs   =      20186
                                                  Wald chi2(17)   =    2203.12
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.0320
------------------------------------------------------------------------------
             |               Robust
         MDU |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
LC | -.0504405 .0156238 -3.23 0.001 -.0810625 -.0198184
IDP | -.1475976 .0303777 -4.86 0.000 -.2071367 -.0880585
LPI | .0158351 .004431 3.57 0.000 .0071505 .0245197
FMDE | -.021335 .0090748 -2.35 0.019 -.0391211 -.0035488
PHYSLIM | .2751715 .0341067 8.07 0.000 .2083235 .3420195
NDISEASE | .0259352 .0016925 15.32 0.000 .022618 .0292524
HLTHG | .0065371 .023814 0.27 0.784 -.0401375 .0532118
HLTHF | .2368643 .0436579 5.43 0.000 .1512963 .3224322
HLTHP | .4256563 .0686042 6.20 0.000 .2911945 .560118
LINC | .0845165 .0113918 7.42 0.000 .0621891 .106844
LFAM | -.1226764 .0231639 -5.30 0.000 -.1680769 -.0772759
EDUCDEC | .0162582 .0040332 4.03 0.000 .0083533 .024163
AGE | .0025943 .0011128 2.33 0.020 .0004133 .0047753
FEMALE | .3672884 .0285724 12.85 0.000 .3112876 .4232892
CHILD | .3060317 .0428976 7.13 0.000 .221954 .3901095
FEMCHILD | -.3755503 .0447039 -8.40 0.000 -.4631682 -.2879323
BLACK | -.7104372 .0359462 -19.76 0.000 -.7808903 -.639984
_cons | -.2069298 .1130753 -1.83 0.067 -.4285533 .0146938
-------------+----------------------------------------------------------------
/lnalpha | .1674206 .0187562 .1306591 .2041821
-------------+----------------------------------------------------------------
alpha | 1.182251 .0221746 1.139579 1.226522
------------------------------------------------------------------------------
. estimates store nbrobust
.
. * Should also control here for clustering (see chapter 24)
. * as up to four years of data for each person.
. * Table 20.5 did not report these results

. nbreg MDU $XLIST, cluster(zper)
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:


Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:

Fitting full model:

Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:

                                                  Number of obs   =      20186
                                                  Wald chi2(17)   =    1034.43
                                                  Prob > chi2     =     0.0000
(standard errors adjusted for clustering on zper)
------------------------------------------------------------------------------
             |               Robust
         MDU |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
LC | -.0504405 .0236804 -2.13 0.033 -.0968533 -.0040277
IDP | -.1475976 .0457769 -3.22 0.001 -.2373186 -.0578766
LPI | .0158351 .0066968 2.36 0.018 .0027096 .0289607
FMDE | -.021335 .0137245 -1.55 0.120 -.0482344 .0055645
PHYSLIM | .2751715 .0489905 5.62 0.000 .1791519 .371191
NDISEASE | .0259352 .0025814 10.05 0.000 .0208758 .0309946
HLTHG | .0065371 .0359676 0.18 0.856 -.0639581 .0770323
HLTHF | .2368643 .0653989 3.62 0.000 .1086848 .3650437
HLTHP | .4256563 .1000813 4.25 0.000 .2295005 .621812
LINC | .0845165 .0152197 5.55 0.000 .0546864 .1143467
LFAM | -.1226764 .0340453 -3.60 0.000 -.189404 -.0559488
EDUCDEC | .0162582 .0059501 2.73 0.006 .0045962 .0279202
AGE | .0025943 .001581 1.64 0.101 -.0005045 .0056931
FEMALE | .3672884 .0420327 8.74 0.000 .2849059 .4496709
CHILD | .3060317 .0598167 5.12 0.000 .1887932 .4232702
FEMCHILD | -.3755503 .0649845 -5.78 0.000 -.5029175 -.2481831
BLACK | -.7104372 .0531155 -13.38 0.000 -.8145417 -.6063326
_cons | -.2069298 .1576721 -1.31 0.189 -.5159613 .1021018
-------------+----------------------------------------------------------------
/lnalpha | .1674206 .0252599 .1179121 .2169291
-------------+----------------------------------------------------------------
alpha | 1.182251 .0298635 1.125145 1.242256
------------------------------------------------------------------------------
. estimates store nbcluster
.
. ************ DISPLAY RESULTS FOR TABLE 20.5 (page 673) ************
.
. * Note for brevity the coefficients for only some of the regressors
. * are given in Table 20.5
.
. * First columns of Table 20.5 (page 673) plus cluster-robust
. estimates table poisml poisrobust poiscluster, t stats(N ll rank aic bic) b(%10.4f) t(%10.3f)
-----------------------------------------------------
    Variable |   poisml     poisrobust   poisclus~r
-------------+---------------------------------------
          LC |    -0.0427      -0.0427      -0.0427
             |     -7.030       -2.835       -1.884
         IDP |    -0.1613      -0.1613      -0.1613
             |    -13.881       -5.773       -3.799
         LPI |     0.0129       0.0129       0.0129
             |      6.999        2.912        1.898
        FMDE |    -0.0206      -0.0206      -0.0206
             |     -5.803       -2.319       -1.533
     PHYSLIM |     0.2684       0.2684       0.2684
             |     21.711        8.240        5.466
    NDISEASE |     0.0232       0.0232       0.0232
             |     38.124       13.487        8.443
       HLTHG |     0.0394       0.0394       0.0394
             |      4.109        1.699        1.113
       HLTHF |     0.2531       0.2531       0.2531
             |     15.613        5.894        3.749
       HLTHP |     0.5216       0.5216       0.5216
             |     19.150        6.966        4.482
        LINC |     0.0834       0.0834       0.0834
             |     16.147        5.993        4.152
        LFAM |    -0.1297      -0.1297      -0.1297
             |    -14.471       -5.717       -3.813
     EDUCDEC |     0.0176       0.0176       0.0176
             |     10.749        4.358        2.810
         AGE |     0.0024       0.0024       0.0024
             |      5.510        2.124        1.435
      FEMALE |     0.3488       0.3488       0.3488
             |     30.727       12.300        8.063
       CHILD |     0.3362       0.3362       0.3362
             |     18.866        8.319        5.736
    FEMCHILD |    -0.3625      -0.3625      -0.3625
             |    -20.208       -8.211       -5.487
       BLACK |    -0.6801      -0.6801      -0.6801
             |    -43.738      -18.442      -12.495
       _cons |    -0.1899      -0.1899      -0.1899
             |     -3.861       -1.489       -1.021
-------------+---------------------------------------
           N | 20186.0000   20186.0000   20186.0000
          ll | -6.009e+04   -6.009e+04   -6.009e+04
        rank |    18.0000      18.0000      18.0000
         aic |  1.202e+05    1.202e+05    1.202e+05
         bic |  1.204e+05    1.204e+05    1.204e+05
-----------------------------------------------------
                                          legend: b/t
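Because the three Poisson fits share the same coefficients, the differences in t-ratios come entirely from the standard errors. The sketch below is an added illustration, not part of the original program; it uses the LC standard errors copied from the three regressions above to show how much the default ML standard error understates the robust and cluster-robust ones.

```stata
* Illustration only (not in the original do-file).
* Ratio of robust and cluster-robust standard errors to the default ML
* standard error for LC (values copied from the Poisson output above).
display "robust / default:  " .0150712/.0060785
display "cluster / default: " .0226824/.0060785
```

The ratios are roughly 2.5 and 3.7, which matches the fall in the t-ratio for LC from -7.030 to -2.835 to -1.884 in the table.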
.
. * Last columns of Table 20.5 (page 673) give nbml. Also give others.
. estimates table nbml nbrobust nbcluster, t stats(N ll rank aic bic) b(%10.4f) t(%10.3f)
-----------------------------------------------------
    Variable |    nbml       nbrobust     nbcluster
-------------+---------------------------------------
MDU          |
          LC |    -0.0504      -0.0504      -0.0504
             |     -3.919       -3.228       -2.130
         IDP |    -0.1476      -0.1476      -0.1476
             |     -5.809       -4.859       -3.224
         LPI |     0.0158       0.0158       0.0158
             |      3.902        3.574        2.365
        FMDE |    -0.0213      -0.0213      -0.0213
             |     -2.840       -2.351       -1.555
     PHYSLIM |     0.2752       0.2752       0.2752
             |      9.310        8.068        5.617
    NDISEASE |     0.0259       0.0259       0.0259
             |     17.492       15.324       10.047
       HLTHG |     0.0065       0.0065       0.0065
             |      0.323        0.275        0.182
       HLTHF |     0.2369       0.2369       0.2369
             |      6.332        5.425        3.622
       HLTHP |     0.4257       0.4257       0.4257
             |      5.738        6.205        4.253
        LINC |     0.0845       0.0845       0.0845
             |      9.867        7.419        5.553
        LFAM |    -0.1227      -0.1227      -0.1227
             |     -6.354       -5.296       -3.603
     EDUCDEC |     0.0163       0.0163       0.0163
             |      4.666        4.031        2.732
         AGE |     0.0026       0.0026       0.0026
             |      2.750        2.331        1.641
      FEMALE |     0.3673       0.3673       0.3673
             |     15.300       12.855        8.738
       CHILD |     0.3060       0.3060       0.3060
             |      7.936        7.134        5.116
    FEMCHILD |    -0.3756      -0.3756      -0.3756
             |    -10.112       -8.401       -5.779
       BLACK |    -0.7104      -0.7104      -0.7104
             |    -25.841      -19.764      -13.375
       _cons |    -0.2069      -0.2069      -0.2069
             |     -2.301       -1.830       -1.312
-------------+---------------------------------------
lnalpha      |
       _cons |     0.1674       0.1674       0.1674
             |     11.320        8.926        6.628
-------------+---------------------------------------
Statistics   |
           N | 20186.0000   20186.0000   20186.0000
          ll | -4.278e+04   -4.278e+04   -4.278e+04
        rank |    19.0000      19.0000      19.0000
         aic | 85593.2220   85593.2220   85593.2220
         bic | 85743.5642   85743.5642   85743.5642
-----------------------------------------------------
                                          legend: b/t
.
. * For Poisson correcting for overdispersion is most important.
. * For negative binomial overdispersion is already incorporated.
. * For both controlling for clustering (in this example with panel data)
. * is also needed.
.
. log close
log: c:\Imbook\bwebpage\Section4\mma20p1count.txt
log type: text
closed on: 20 May 2005, 08:41:56
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section5\mma21p1panfeandre.txt
log type: text
opened on: 23 May 2005, 11:27:25
.
. ********** OVERVIEW OF MMA21P1PANFEANDRE.DO **********
.
. * STATA Program
.
. * Chapter 21.3.1-3 pages 709-14
. * Program performs basic panel analysis, mainly using XTREG:
. * It derives most of Table 21.2 and Figures 21.1-21.4
. * (1) pooled OLS
. * (2) between
. * (3) within (or fixed effects)
. * (4) first differences
. * (5) random effects - GLS
. * (6) random effects - MLE
. * (7) Hausman test of FE versus RE
. * Standard errors are default plus panel bootstrap
.
. * The individual effects model is
. * y_it = x_it'b + a_i + e_it
. * Default panel output assumes e_it is i.i.d.
. * This is usually too strong an assumption.
. * Instead should get panel-robust or cluster-robust errors after xtreg
. * See Section 21.2.3 pages 709-12
. * Stata Version 8 does not do this but Stata version 9 does.
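
The following Python sketch is not part of the original Stata program. It illustrates, on a made-up and noise-free two-person panel, why the within transformation removes the individual effect a_i from the model y_it = x_it'b + a_i + e_it. The data values and the helper names slope and demean_by_group are assumptions chosen for illustration. Here a_i is negatively correlated with x_it, so pooled OLS gets the sign wrong while the within slope recovers b.

```python
# Illustrative only: toy data, not MOM.dat.
# Model: y_it = b*x_it + a_i, with no idiosyncratic error e_it.

def slope(x, y):
    """OLS slope of y on x with an intercept (single regressor)."""
    xbar = sum(x) / len(x)
    ybar = sum(y) / len(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx

def demean_by_group(ids, v):
    """Subtract each individual's own time average from its observations."""
    totals, counts = {}, {}
    for i, vi in zip(ids, v):
        totals[i] = totals.get(i, 0.0) + vi
        counts[i] = counts.get(i, 0) + 1
    return [vi - totals[i] / counts[i] for i, vi in zip(ids, v)]

b = 0.5
a = {1: 10.0, 2: 0.0}                   # hypothetical individual effects
ids = [1, 1, 1, 2, 2, 2]
x = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]   # person 2 has high x but low a_i
y = [a[i] + b * xi for i, xi in zip(ids, x)]

pooled_slope = slope(x, y)
within_slope = slope(demean_by_group(ids, x), demean_by_group(ids, y))

print("pooled OLS slope:", round(pooled_slope, 3))
print("within (FE) slope:", round(within_slope, 3))
```

In the log itself the same pattern appears in milder form: the pooled OLS coefficient on lnwg is .083 and the within coefficient is .168.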
.
. * Three ways to obtain panel-robust se's for fixed and random effects models:
. * (1) Use Stata version 9 and cluster option in xtreg
. * (2) Use Stata version 8 xtreg and then panel bootstrap (this program)
. * (3) Use Stata version 8 regress cluster option on transformed model (next program)
.
. * The four basic linear panel programs are
. * mma21p1panfeandre.do Linear fixed and random effects using xtreg
. * mma21p2panmanual.do Linear fe and re using transformation and regress
. *   plus also has valid Hausman test
. * mma21p3panresiduals.do Residual analysis after linear fe and re
. * mma21p4panpangls.do Pooled panel OLS and GLS
.
. * MOM.dat
.
. * To speed up this program reduce nreps, the number of bootstraps
. * used in the panel bootstrap to get panel-robust standard errors
.
. ********** SETUP **********
.
. set more off
. version 8.0
.
.
. * The original data is from
. * Jim Ziliak (1997)
. * "Efficient Estimation With Panel Data when Instruments are Predetermined:
. * An Empirical Comparison of Moment-Condition Estimators"
. * Journal of Business and Economic Statistics, 15, 419-431
.
. * File MOM.dat has data on 532 men over 10 years (1979-1988)
. * Data are space-delimited ordered by person with separate line for each year
. * So id 1 1979, id 1 1980, ..., id 1 1988, id 2 1979, id 2 1980, ...
. * 8 variables:
. * lnhr lnwg kids ageh agesq disab id year
.
. * File MOM.dat is the version of the data posted at the JBES website
. * Note that in chapter 22 we instead use MOMprecise.dat
. * which is the same data set but with more significant digits
.
. ********** READ DATA **********
.
. * The data are in ascii file MOM.dat
. * There are 532 individuals with 10 lines (years) per individual
. * Read in using Infile: FREE FORMAT WITHOUT DICTIONARY
. infile lnhr lnwg kids ageh agesq disab id year using MOM.dat
.
. ********** DATA TRANSFORMATIONS AND CHECK **********
.
. * Create year dummies
. tabulate year, generate(dyear)
       year |      Freq.     Percent        Cum.
------------+-----------------------------------
       1979 |        532       10.00       10.00
       1980 |        532       10.00       20.00
       1981 |        532       10.00       30.00
       1982 |        532       10.00       40.00
       1983 |        532       10.00       50.00
       1984 |        532       10.00       60.00
       1985 |        532       10.00       70.00
       1986 |        532       10.00       80.00
       1987 |        532       10.00       90.00
       1988 |        532       10.00      100.00
------------+-----------------------------------
      Total |      5,320      100.00
.
. * The following lists the variables in data set and summarizes data
. describe
Contains data
  obs:         5,320
 vars:            18
 size:
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
lnhr            float  %9.0g
lnwg            float  %9.0g
kids            float  %9.0g
ageh            float  %9.0g
agesq           float  %9.0g
disab           float  %9.0g
id              float  %9.0g
year            float  %9.0g
dyear1          byte   %8.0g                  year== 1979.0000
dyear2          byte   %8.0g                  year== 1980.0000
dyear3          byte   %8.0g                  year== 1981.0000
dyear4          byte   %8.0g                  year== 1982.0000
dyear5          byte   %8.0g                  year== 1983.0000
dyear6          byte   %8.0g                  year== 1984.0000
dyear7          byte   %8.0g                  year== 1985.0000
dyear8          byte   %8.0g                  year== 1986.0000
dyear9          byte   %8.0g                  year== 1987.0000
dyear10         byte   %8.0g                  year== 1988.0000
-------------------------------------------------------------------------------
Sorted by:
. summarize
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        lnhr |      5320     7.65743    .2855914       2.77       8.56
        lnwg |      5320    2.609436    .4258924       -.26       4.69
        kids |      5320    1.555827    1.195924          0          6
        ageh |      5320    38.91823    8.450351         22         60
       agesq |      5320    1586.024    689.7759        484       3600
-------------+--------------------------------------------------------
       disab |      5320    .0609023    .2391734          0          1
          id |      5320       266.5    153.5893          1        532
        year |      5320      1983.5    2.872551       1979       1988
      dyear1 |      5320          .1    .3000282          0          1
      dyear2 |      5320          .1    .3000282          0          1
-------------+--------------------------------------------------------
      dyear3 |      5320          .1    .3000282          0          1
      dyear4 |      5320          .1    .3000282          0          1
      dyear5 |      5320          .1    .3000282          0          1
      dyear6 |      5320          .1    .3000282          0          1
      dyear7 |      5320          .1    .3000282          0          1
-------------+--------------------------------------------------------
      dyear8 |      5320          .1    .3000282          0          1
      dyear9 |      5320          .1    .3000282          0          1
     dyear10 |      5320          .1    .3000282          0          1
. save mom, replace
file mom.dta saved
.
. * The following summarizes panel features for completeness
. iis id
. tis year
. xtdes
      id:  1, 2, ..., 532                                    n =        532
    year:  1979, 1980, ..., 1988                             T =         10
           Delta(year) = 1; (1988-1979)+1 = 10
           (id*year uniquely identifies each observation)

Distribution of T_i:   min      5%     25%     50%     75%     95%     max
                        10      10      10      10      10      10      10

     Freq.  Percent    Cum. |  Pattern
 ---------------------------+------------
      532    100.00  100.00 |  1111111111
 ---------------------------+------------
      532    100.00         |  XXXXXXXXXX
. xtsum lnhr lnwg kids ageh agesq disab
Variable         |      Mean   Std. Dev.       Min        Max |    Observations
-----------------+--------------------------------------------+----------------
lnhr     overall |   7.65743   .2855914       2.77       8.56 |     N =    5320
         between |             .1790083      6.416      8.242 |     n =     532
         within  |             .2226492    3.66943   9.001431 |     T =      10
                 |                                            |
lnwg     overall |  2.609436   .4258924       -.26       4.69 |     N =    5320
         between |             .3911937      1.346      4.543 |     n =     532
         within  |             .1691472   .0694361   4.487436 |     T =      10
                 |                                            |
kids     overall |  1.555827   1.195924          0          6 |     N =    5320
         between |             1.032205          0        5.4 |     n =     532
         within  |              .605468  -2.444173   5.055827 |     T =      10
                 |                                            |
ageh     overall |  38.91823   8.450351         22         60 |     N =    5320
         between |             7.945371       26.5       55.5 |     n =     532
         within  |             2.895916   32.71823   52.21823 |     T =      10
                 |                                            |
agesq    overall |  1586.024   689.7759        484       3600 |     N =    5320
         between |             650.9138      710.5     3088.5 |     n =     532
         within  |             229.8235   963.3239   2581.724 |     T =      10
                 |                                            |
disab    overall |  .0609023   .2391734          0          1 |     N =    5320
         between |             .1657419          0          1 |     n =     532
         within  |             .1725689  -.8390977   .9609023 |     T =      10
.
. ********** DEFINE GLOBALS INCLUDING REGRESSOR LIST **********
.
. * Number of reps for the bootstrap
. * Table 21.2 page 710 used 500
. global nreps 500
.
. * The regressions below are of lnhr on lnwg
. * Additional regressors to be included below are defined in xextra
. * Choose one of the following
.
. * No additional regressors
. global xextra
. global xextrashort
.
. * Include year dummies with one omitted (or two omitted for first differences)
. * global xextra dyear1 dyear2 dyear3 dyear4 dyear5 dyear6 dyear7 dyear8 dyear9
. * global xextrashort dyear2 dyear3 dyear4 dyear5 dyear6 dyear7 dyear8 dyear9
.
. * Include socioeconomic characteristics
. * global xextra kids ageh agesq disab
. * global xextrashort kids ageh agesq disab
.
. ********* DIFFERENT PANEL ESTIMATES pages 709-14 **********
.
. * Note that in the first xt command need to give , i(id)
. * to indicate that the ith observation is for the ith id
.
. * XTDATA permits plots of between, within and overall
. * Useful for looking at the data. See Stata manual under xtdata for example.
. * XTREG gives between, within and RE estimates though not correct standard errors
.
. * The graphs below use new Stata 8 graphics
. * Change graphics scheme from default s2color to s1mono for printing
. set scheme s1mono
. * The following graphs include
. * legend(pos(4) ring(0) col(1))
. *   changes position of legend to four o'clock
. * legend( label(1 "Data used") label(2 "Smoothed fit") label(3 "Linear fit"))
. *   changes labels for the legends
.
. *** (1) POOLED OLS (OVERALL) REGRESSION (Table 21.2 POLS column and Figure 21.1)
.
. use mom, clear
.
. * Wrong formula OLS standard errors require e_it is i.i.d.
. regress lnhr lnwg $xextra
      Source |       SS       df       MS
-------------+------------------------------           F(  1,  5318) =   82.22
       Model |  6.60538417     1  6.60538417           Prob > F      =  0.0000
    Residual |  427.225206  5318  .080335691           R-squared     =  0.0152
-------------+------------------------------           Adj R-squared =  0.0150
       Total |  433.830591  5319  .081562435           Root MSE      =  .28344

------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .0827436   .0091251     9.07   0.000     .0648545    .1006326
       _cons |   7.441516   .0241265   308.44   0.000     7.394219    7.488814
------------------------------------------------------------------------------
. estimates store polsiid
.
. * Wrong White heteroskedastic-consistent standard errors
. * These require that e_it is uncorrelated across observations, including over t for a given i
. regress lnhr lnwg $xextra, robust
Number of obs =
F( 1, 5318) = 16.61
Prob > F
= 0.0000
R-squared = 0.0152
Root MSE = .28344
5320
-----------------------------------------------------------------------------|
Robust
lnhr |
Coef. Std. Err.
-------------+---------------------------------------------------------------lnwg | .0827436 .0203042 4.08 0.000 .0429391 .122548
_cons | 7.441516 .0548992 135.55 0.000 7.333891 7.549141

------------------------------------------------------------------------------
. estimates store polshet
.
. * Correct panel robust standard errors
. regress lnhr lnwg $xextra, cluster(id)
F( 1, 531) = 7.99
Prob > F
= 0.0049
R-squared = 0.0152
Number of clusters (id) = 532
Root MSE
= .28344
-----------------------------------------------------------------------------|
Robust
lnhr |
Coef. Std. Err.
-------------+---------------------------------------------------------------lnwg | .0827436 .0292711 2.83 0.005 .0252421 .140245
_cons | 7.441516 .079587 93.50 0.000 7.285172 7.59786
------------------------------------------------------------------------------
. estimates store polspanel
.
. * Correct panel bootstrap standard errors
. * Note that use cluster option so that bootstrap is over just i and not both i and t
. set seed 10001
. bs "regress lnhr lnwg $xextra" "_b[lnwg] _b[_cons]", cluster(id) reps($nreps) level(95)
command:
regress lnhr lnwg
statistics: _bs_1
= _b[lnwg]
_bs_2
= _b[_cons]
Number of obs =
N of clusters =
532
Replications =
500
5320

-------------+---------------------------------------------------------------_bs_1 | 500 .0827435 -.0005317 .0298395 .024117 .1413701 (N)
|
.027782 .1408137 (P)
|
.0284079 .1434854 (BC)
_bs_2 | 500 7.441516 .001375 .0805676 7.283223 7.59981 (N)
|
7.281352 7.593587 (P)
|
7.269371 7.585756 (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
. matrix polsbootse = e(se)
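
The cluster(id) option above makes the bootstrap resample whole individuals rather than single person-year rows. The Python sketch below shows only that resampling step on a hypothetical 5-person, 3-year panel. The function name panel_bootstrap_rows and the toy ids are assumptions, and the draws will not match Stata's random number stream. A panel bootstrap standard error would then be the standard deviation of the coefficient estimates across many such resamples.

```python
import random

def panel_bootstrap_rows(ids, rng):
    """Resample individuals (clusters) with replacement.

    Returns the drawn ids and the row indices of the bootstrap sample;
    every drawn individual contributes all of its time periods.
    """
    rows_by_id = {}
    for row, i in enumerate(ids):
        rows_by_id.setdefault(i, []).append(row)
    unique_ids = list(rows_by_id)
    drawn = [rng.choice(unique_ids) for _ in unique_ids]
    rows = [row for i in drawn for row in rows_by_id[i]]
    return drawn, rows

# Hypothetical balanced panel: 5 individuals observed for 3 years each.
ids = [i for i in range(1, 6) for _ in range(3)]
rng = random.Random(10001)   # seed chosen only so the sketch is reproducible
drawn, rows = panel_bootstrap_rows(ids, rng)

print("individuals drawn:", drawn)
print("rows in bootstrap sample:", rows)
```

Because each draw keeps all years of an individual together, any within-person correlation of e_it is preserved in every bootstrap sample.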
.
. * Overall plot of data with lowess local regression line - Figure 21.1 page 712
. graph twoway (scatter lnhr lnwg, msize(vsmall)) (lowess lnhr lnwg) (lfit lnhr lnwg), /*
> */ title("Pooled (Overall) Regression") /*
> */ xtitle("Log hourly wage", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Log annual hours", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend( label(1 "Original data") label(2 "Nonparametric fit") label(3 "Linear fit"))
. graph export ch21pantot.wmf, replace
(file c:\Imbook\bwebpage\Section5\ch21pantot.wmf written in Windows Metafile format)
.
. *** (2) BETWEEN REGRESSION (Table 21.2 Between column and Figure 21.2)
.
. use mom, clear
.
. * Usual standard errors assume iid error
. xtreg lnhr lnwg, be i(id)
Between regression (regression on group means) Number of obs
=
Group variable (i): id
Number of groups =
532
R-sq: within = 0.0162
between = 0.0213
overall = 0.0152
F(1,530)
sd(u_i + avg(e_i.))= .1772555
Obs per group: min =

avg =
10.0
max =
10
= 11.55
Prob > F
=
5320
10
0.0007
-----------------------------------------------------------------------------lnhr |
Coef. Std. Err.
-------------+---------------------------------------------------------------lnwg | .0668379 .0196635 3.40 0.001 .0282099 .1054658
_cons | 7.483021 .0518829 144.23 0.000
7.3811 7.584943
------------------------------------------------------------------------------
. estimates store beiid
.
. * Heteroskedasticity robust standard errors
. * Stata has no option for this. See ch21panel2.do
.
. set seed 10001

. bootstrap "xtreg lnhr lnwg, be i(id)" "_b[lnwg] _b[_cons]", cluster(id) reps($nreps) level(95)
command:
xtreg lnhr lnwg , be i(id)
statistics: _bs_1
= _b[lnwg]
_bs_2
= _b[_cons]
Number of obs =
N of clusters =
532
Replications =
500
5320

-------------+---------------------------------------------------------------_bs_1 | 500 .0668379 -.0005547 .0192363 .0290438 .1046319 (N)
|
.0240799 .1059889 (P)
|
.0274993 .1066802 (BC)
_bs_2 | 500 7.483021 .0016537 .0519151 7.381022 7.58502 (N)
|
7.383433 7.595335 (P)
|
7.382822 7.592656 (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
. matrix bebootse = e(se)
.
. * Between plot of data with lowess local regression line - Figure 21.2 page 712
. iis id
. xtdata, be
> */ title("Between Regression") /*
> */ legend( label(1 "Averages") label(2 "Nonparametric fit") label(3 "Linear fit"))
. graph export ch21panbe.wmf, replace
(file c:\Imbook\bwebpage\Section5\ch21panbe.wmf written in Windows Metafile format)
.
. *** (3) WITHIN (FIXED EFFECTS) REGRESSION (Table 21.2 Within column and Figure 21.3)
.
. use mom, clear
.

. xtreg lnhr lnwg $xextra, fe i(id)
Fixed-effects (within) regression
Number of obs
=
5320
Number of groups =
532

between = 0.0213
overall = 0.0152
corr(u_i, Xb) = -0.1995

avg =
10.0
max =
10
F(1,4787)
=
Prob > F
10
78.96
= 0.0000
-----------------------------------------------------------------------------lnhr |
Coef. Std. Err.
-------------+---------------------------------------------------------------lnwg | .1676755 .01887 8.89 0.000 .1306816 .2046694
_cons | 7.219892 .0493434 146.32 0.000 7.123156 7.316628
-------------+---------------------------------------------------------------sigma_u | .18142881
sigma_e | .23278339
rho | .37789558 (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(531, 4787) =     5.83           Prob > F = 0.0000
. estimates store feiid
.
.
. set seed 10001
. bootstrap "xtreg lnhr lnwg $xextra, fe i(id)" "_b[lnwg] _b[_cons]", cluster(id) reps($nreps) level
> (95)
command:
xtreg lnhr lnwg , fe i(id)
statistics: _bs_1
= _b[lnwg]
_bs_2
= _b[_cons]
Number of obs =
N of clusters =
532
Replications =
500
5320

-------------+---------------------------------------------------------------_bs_1 | 500 .1676755 -.0055543 .0844631 .0017284 .3336226 (N)
|
.0213276 .3318829 (P)
|
.0300515 .3605573 (BC)
_bs_2 | 500 7.219892 .01461 .223047 6.781665 7.658119 (N)

|
6.782279 7.604026 (P)
|
6.683465 7.574718 (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
. matrix febootse = e(se)
.
. * Within plot of data with lowess local regression line - Figure 21.3 page 712
. iis id
. xtdata, fe
> */ title("Within (Fixed Effects) Regression") /*
> */ legend( label(1 "Deviations from average") label(2 "Nonparametric fit") label(3 "Linear fit")
>)
. graph export ch21panfe.wmf, replace
(file c:\Imbook\bwebpage\Section5\ch21panfe.wmf written in Windows Metafile format)
.
. *** (4) FIRST DIFFERENCES REGRESSION (Table 21.2 First diff column and Figure 21.4)
.
. * Stata has no command for first differences regression
. * Though may be possible with xtabond
. * Instead need to create differenced data
.
. use mom, clear
. * The following only works if each observation is (i,t)
. * and within i the data are ordered by t
. gen dlnhr = lnhr - lnhr[_n-1]
. gen dlnwg = lnwg - lnwg[_n-1]
. gen dkids = kids - kids[_n-1]
. gen dageh = ageh - ageh[_n-1]
. gen dagesq = agesq - agesq[_n-1]

. gen ddisab = disab - disab[_n-1]
. * The following drops the first year which here is 1979
. drop if year == 1979
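
The differencing above lags by one row regardless of id, so the 1979 row of each person is differenced against the last year of the previous person; dropping year 1979 then removes exactly those invalid rows. The Python sketch below, using hypothetical values and the assumed helper names first_differences and naive_then_drop, compares that approach with differencing explicitly within each id. The two agree only when the data are sorted by id and then year, which is the caveat stated in the comments above.

```python
def first_differences(ids, years, values):
    """Difference each observation from the previous year of the same id."""
    rows = sorted(zip(ids, years, values))
    return [(cur[0], cur[1], cur[2] - prev[2])
            for prev, cur in zip(rows, rows[1:])
            if prev[0] == cur[0]]

def naive_then_drop(ids, years, values, first_year):
    """Mimic the log: lag by one row regardless of id, then drop first_year."""
    out = []
    for k in range(1, len(values)):
        if years[k] != first_year:
            out.append((ids[k], years[k], values[k] - values[k - 1]))
    return out

# Hypothetical panel, already sorted by id and then year.
ids    = [1, 1, 1, 2, 2, 2]
years  = [1979, 1980, 1981, 1979, 1980, 1981]
values = [7.0, 7.5, 7.25, 8.0, 8.0, 8.5]

safe = first_differences(ids, years, values)
naive = naive_then_drop(ids, years, values, 1979)

print(safe)
print(safe == naive)
```

The group-aware version sorts internally, so it gives the same answer even if the rows arrive in a different order.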
.
. regress dlnhr dlnwg $xextrashort
Source |
SS
df
MS
-------------+-----------------------------F( 1, 4786) = 26.09
Model | 2.27870825 1 2.27870825
Prob > F
= 0.0000
Residual | 417.943979 4786 .087326364
R-squared = 0.0054
-------------+-----------------------------Adj R-squared = 0.0052
Total | 420.222687 4787 .087784142
Root MSE
= .29551
-----------------------------------------------------------------------------dlnhr |
Coef. Std. Err.
-------------+---------------------------------------------------------------dlnwg | .1089851 .0213351 5.11 0.000 .0671584 .1508118
_cons | .0008283 .0042712 0.19 0.846 -.0075452 .0092018
------------------------------------------------------------------------------
. estimates store fdiffiid
.
. regress dlnhr dlnwg $xextrashort, cluster(id)
F( 1, 531) = 1.69
Prob > F
= 0.1936
R-squared = 0.0054
Root MSE
= .29551
-----------------------------------------------------------------------------|
Robust
dlnhr |
Coef. Std. Err.
-------------+---------------------------------------------------------------dlnwg | .1089851 .0837266 1.30 0.194 -.0554909 .2734612
_cons | .0008283 .0016148 0.51 0.608 -.0023439 .0040005
------------------------------------------------------------------------------
. estimates store fdiffpanel
.
. * "Robust" standard errors only control for heteroskedasticity

. regress dlnhr dlnwg $xextrashort, robust
Number of obs =
F( 1, 4786) = 2.51
Prob > F
= 0.1135
R-squared = 0.0054
Root MSE = .29551
4788
-----------------------------------------------------------------------------|
Robust
dlnhr |
Coef. Std. Err.
-------------+---------------------------------------------------------------dlnwg | .1089851 .0688514 1.58 0.114 -.0259952 .2439654
_cons | .0008283 .0042856 0.19 0.847 -.0075735 .0092301
------------------------------------------------------------------------------
. estimates store fdiffhet
.
. set seed 10001
. bs "regress dlnhr dlnwg $xextrashort" "_b[dlnwg] _b[_cons]", cluster(id) reps($nreps) level(95)
command:
regress dlnhr dlnwg
statistics: _bs_1
= _b[dlnwg]
_bs_2
= _b[_cons]
Number of obs =
N of clusters =
532
Replications =
500
4788

-------------+---------------------------------------------------------------_bs_1 | 500 .1089851 -.0092694 .0832844 -.0546462 .2726165 (N)
|
-.0486034 .2608319 (P)
|
-.0329857 .2929305 (BC)
_bs_2 | 500 .0008283 -8.39e-06 .0015843 -.0022843 .003941 (N)
|
-.0023564 .0038644 (P)
|
-.0023692 .003842 (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
. matrix fdiffbootse = e(se)
.
. * First differences plot with lowess local regression line - Figure 21.4 page 713
. graph twoway (scatter dlnhr dlnwg, msize(vsmall)) (lowess dlnhr dlnwg) (lfit dlnhr dlnwg), /*
> */ title("First Differences Regression") /*
> */ legend( label(1 "First differences") label(2 "Nonparametric fit") label(3 "Linear fit"))
. graph export ch21panfd.wmf, replace
(file c:\Imbook\bwebpage\Section5\ch21panfd.wmf written in Windows Metafile format)
.
. *** (5) RANDOM EFFECTS GLS REGRESSION (Table 21.2 RE-GLS column)
.
. use mom, clear
.
. xtreg lnhr lnwg, re i(id)
Random-effects GLS regression
between = 0.0213
overall = 0.0152
Random effects u_i ~ Gaussian
corr(u_i, X)
= 0 (assumed)
Number of obs
Number of groups =
avg =
10.0
max =
10
=
5320
532
10
Wald chi2(1)
= 76.64
Prob > chi2
= 0.0000
-----------------------------------------------------------------------------lnhr |
Coef. Std. Err.
-------------+---------------------------------------------------------------lnwg | .1193322 .0136312 8.75 0.000 .0926155 .146049
_cons | 7.346041 .0363925 201.86 0.000 7.274713 7.417368
-------------+---------------------------------------------------------------sigma_u | .16124733
sigma_e | .23278339
------------------------------------------------------------------------------
. estimates store reglsiid
.
. * or use xtgee corr(exchangeable), robust see ch21panel4.do
.
. set seed 10001
. bootstrap "xtreg lnhr lnwg, re i(id)" "_b[lnwg] _b[_cons]", cluster(id) reps($nreps) level(95)
command:
xtreg lnhr lnwg , re i(id)
statistics: _bs_1
= _b[lnwg]
_bs_2
= _b[_cons]
Number of obs =
N of clusters =
532
Replications =
500
5320

-------------+---------------------------------------------------------------_bs_1 | 500 .1193322 .0084025 .0563763 .008568 .2300965 (N)
|
.0332454 .2379648 (P)
|
.0203328 .2199058 (BC)
_bs_2 | 500 7.346041 -.0217114 .1492226 7.052859 7.639223 (N)
|
7.029869 7.577236 (P)
|
7.082208 7.614716 (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
. matrix reglsbootse = e(se)
.
. *** (6) RANDOM EFFECTS MLE REGRESSION (Table 21.2 RE-MLE column)
.
. use mom, clear
.
. xtreg lnhr lnwg, mle i(id)
Fitting full model:
Random-effects ML regression
Number of obs
Number of groups =
=
5320
532

avg =
10.0
max =
10
10
LR chi2(1)
= 76.14
Prob > chi2
=
0.0000
-----------------------------------------------------------------------------lnhr |
Coef. Std. Err.
-------------+---------------------------------------------------------------lnwg | .1195474 .0137484 8.70 0.000 .092601 .1464938
_cons | 7.345479 .0366973 200.16 0.000 7.273554 7.417404
-------------+---------------------------------------------------------------/sigma_u | .162175 .0060469 26.82 0.000 .1503233 .1740266
/sigma_e | .2329172 .0023819 97.79 0.000 .2282488 .2375856
-------------+---------------------------------------------------------------rho | .3265097 .017266
.2934209 .3610233
-----------------------------------------------------------------------------Likelihood-ratio test of sigma_u=0: chibar2(01)= 1147.08 Prob>=chibar2 = 0.000
. estimates store remleiid
.
.
. set seed 10001
. bootstrap "xtreg lnhr lnwg, mle i(id)" "_b[lnwg] _b[_cons]", cluster(id) reps($nreps) level(95)
command:
xtreg lnhr lnwg , mle i(id)
statistics: _bs_1
= _b[lnwg]
_bs_2
= _b[_cons]
Number of obs =
N of clusters =
532
Replications =
500
5320

-------------+---------------------------------------------------------------_bs_1 | 500 .1195474 .0094957 .0582585 .0050852 .2340096 (N)
|
.0333037 .2445228 (P)
|
.0209889 .2249033 (BC)
_bs_2 | 500 7.345479 -.0245541 .1540811 7.042751 7.648207 (N)
|
7.013718 7.577084 (P)
|
7.070499 7.613971 (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
. matrix remlebootse = e(se)
.
. * Population averaged is similar to re (gives similar to mle version of re)
. * Exactly same as xtgee, i(id)
. xtreg lnhr lnwg, pa i(id)
Iteration 1: tolerance = .03364039
Iteration 3: tolerance = 4.733e-06
GEE population-averaged model
Number of obs
=
5320
Group variable:
id
Number of groups =
532
Link:
identity
10
Family:
Gaussian
avg =
10.0
Correlation:
exchangeable
max =
10
Wald chi2(1)
= 76.70
Scale parameter:
.0805511
Prob > chi2
= 0.0000
-----------------------------------------------------------------------------lnhr |
Coef. Std. Err.
-------------+---------------------------------------------------------------lnwg | .1195474 .0136507 8.76 0.000 .0927925 .1463023
_cons | 7.345479 .0364481 201.53 0.000 7.274042 7.416916
------------------------------------------------------------------------------
. estimates store paiid
.
. *** (7) HAUSMAN TEST (NOT ROBUST)
.
. * Hausman test of fixed versus random effects
. * The FE estimates are saved in feiid
. * The RE estimates are saved in reglsiid
.
. * From Section 21.4.3 pages 717-9 this usual implementation of the Hausman test
. * is invalid if there is any intracluster correlation left in the RE model
. * as then the RE estimator is no longer fully efficient
. * so Var[b_RE - b_FE] does not equal Var[b_FE] - Var[b_RE]
.
. * Following is not valid - see MMA21P2PANMANUAL.DO for robust version
. hausman feiid reglsiid
                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |     feiid      reglsiid       Difference          S.E.
-------------+----------------------------------------------------------------
        lnwg |    .1676755     .1193322        .0483432        .0130486
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from xtreg
            B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic

                  chi2(1) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =       13.73
                Prob>chi2 =      0.0002
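
As a check on the arithmetic, the non-robust Hausman statistic for this single-regressor case can be rebuilt from the coefficients and default standard errors that xtreg reported earlier in this log. The Python sketch below is an illustration rather than part of the original program; the variable names are assumptions, and small differences in the last digit come from using rounded inputs.

```python
from math import erfc, sqrt

# Values copied from the xtreg, fe and xtreg, re output in this log.
b_fe, se_fe = 0.1676755, 0.01887
b_re, se_re = 0.1193322, 0.0136312

diff = b_fe - b_re                    # (b-B) in the hausman output
var_diff = se_fe ** 2 - se_re ** 2    # V_b - V_B, valid only if RE is fully efficient
se_diff = sqrt(var_diff)              # sqrt(diag(V_b-V_B))
hausman = diff ** 2 / var_diff        # chi-squared with 1 degree of freedom
p_value = erfc(sqrt(hausman / 2))     # upper-tail probability of chi2(1)

print(round(diff, 7), round(se_diff, 7), round(hausman, 2), round(p_value, 4))
```

The variance difference V_b - V_B used here is the step that the comments above flag as invalid when intracluster correlation remains in the RE model.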
.
. ********* DISPLAY RESULTS - Table 21.2 on page 710 *********
.
. * Standard errors using iid errors and in some cases panel-robust errors
. estimates table polsiid polshet polspanel beiid feiid, /*
> */ se stats(N ll r2 tss rss mss rmse df_r) b(%10.3f)
------------------------------------------------------------------------------Variable | polsiid
polshet polspanel
beiid
feiid
-------------+----------------------------------------------------------------lnwg |
0.083
0.083
0.083
0.067
0.168
|
0.009
0.020
0.029
0.020
0.019
_cons |
7.442
7.442
7.442
7.483
7.220
|
0.024
0.055
0.080
0.052
0.049
-------------+----------------------------------------------------------------N | 5320.000 5320.000 5320.000 5320.000 5320.000
ll | -840.453 -840.453 -840.453
166.573
486.743
r2 |
0.015
0.015
0.015
0.021
0.016
tss |
433.831
rss | 427.225
427.225
427.225
16.652
259.398
mss |
6.605
6.605
6.605
0.363
4.279
rmse |
0.283
0.283
0.283
0.177
0.233
df_r | 5318.000 5318.000
531.000
530.000 4787.000
------------------------------------------------------------------------------legend: b/se
. estimates table fdiffiid fdiffhet fdiffpanel reglsiid remleiid, /*
------------------------------------------------------------------------------Variable | fdiffiid fdiffhet fdiffpanel reglsiid remleiid
-------------+----------------------------------------------------------------_
|
dlnwg |
0.109
0.109
0.109
|
0.021
0.069
0.084
lnwg |
0.119
|
0.014
_cons |
0.001
0.001
0.001
7.346
|
0.004
0.004
0.002
0.036
-------------+----------------------------------------------------------------lnhr
|
lnwg |
0.120
|
0.014
_cons |
7.345
|
0.037
-------------+----------------------------------------------------------------sigma_u
|
_cons |
0.162
|
0.006
-------------+----------------------------------------------------------------sigma_e
|
_cons |
0.233
|
0.002
-------------+----------------------------------------------------------------Statistics |
N | 4788.000 4788.000 4788.000 5320.000 5320.000
ll | -956.059 -956.059 -956.059
-266.912
r2 | 0.005
0.005
0.005
tss |
rss | 417.944
417.944
417.944
mss |
2.279
2.279
2.279
rmse |
0.296
0.296
0.296
df_r | 4786.000 4786.000
531.000
------------------------------------------------------------------------------legend: b/se
. estimates table paiid, se stats(N ll r2 tss rss mss rmse df_r) b(%10.3f)
--------------------------
    Variable |   paiid
-------------+------------
        lnwg |     0.120
             |     0.014
       _cons |     7.345
             |     0.036
-------------+------------
           N |  5320.000
          ll |
          r2 |
         tss |
         rss |
         mss |
        rmse |
        df_r |
--------------------------
              legend: b/se
.
. * Standard errors using panel bootstrap (regular bootstrap for between)
. matrix list polsbootse

polsbootse[1,2]
         _bs_1       _bs_2
se   .02983953    .0805676

. matrix list bebootse

bebootse[1,2]
         _bs_1       _bs_2
se   .01923625   .05191507

. matrix list febootse

febootse[1,2]
         _bs_1       _bs_2
se   .08446309   .22304703

. matrix list fdiffbootse

fdiffbootse[1,2]
         _bs_1       _bs_2
se   .08328443   .00158427

. matrix list reglsbootse

reglsbootse[1,2]
         _bs_1       _bs_2
se   .05637633   .14922264

. matrix list remlebootse

remlebootse[1,2]
         _bs_1       _bs_2
se   .05825849   .15408111
.
. ********** CLOSE OUTPUT *********
. log close
log: c:\Imbook\bwebpage\Section5\mma21p1panfeandre.txt
log type: text
closed on: 23 May 2005, 11:34:06
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section5\mma21p2panmanual.txt
log type: text
opened on: 23 May 2005, 11:34:50
.
. ********** OVERVIEW OF MMA21P2PANMANUAL.DO **********
.
. * STATA Program
.
. * Chapter 21.3.1-3 pages 709-14

. * Program performs basic panel analysis and gets panel robust se's
. * by first transforming model and then using REGRESS
. * It also presents a valid Hausman test of FE versus RE model
.
. * This program estimates
. * (2) between estimator by regress y_bar on x_bar
. * (4) within estimator by regress (y - y_bar) on (x - x_bar)
. * (5) random effects gls by regress (y - rho*y_bar) on (x - rho*x_bar)
. * (6) random effects mle by regress (y - rho*y_bar) on (x - rho*x_bar)
. * (7) robust variant of the Hausman test
. * and calculates
. * - usual standard errors
. *   (which may differ from xtreg due to different degrees of freedom)
. * - panel robust standard errors
. *   (which for RE simplify by assuming lamda_hat is known not estimated)
. * - panel bootstrap standard errors
. *   (which should equal panel robust from ch21panel.do as #bootstrap reps --> infinity)
. * - heteroskedasticity robust standard errors
. *   (which are wrong but included for comparison with others)
.
. * The code is very limited:
. * - it considers only one regressor
. * - it assumes a balanced data set with exactly 10 years of data per individual
. * - it does not use loops for transformations which would generalize code
.
. * NOTE: If have Stata Version 9 (rather than version 8) a simpler way to proceed is
. * to directly use XTREG (see program mma21p1panfeandre.do) with option cluster(id)
.
.*
.
. * MOM.dat
.
. * To speed up this program reduce nreps, the number of bootstraps
. * used in the panel bootstrap.
.
. ********** SETUP **********
.
. set more off
. version 8.0
.
.
. * An Empirical Comparison of Moment-Condition Estimators"
.
. * So id 1 1979, id 1 1980, ..., id 1 1988, id 2 1979, id 2 1980, ...
. * 8 variables:
.
.
. ********** READ DATA **********
.
. summarize
Variable |
Obs
Mean Std. Dev.
Min
Max
-------------+-------------------------------------------------------lnhr |
5320 7.65743 .2855914
2.77
8.56
lnwg |
5320 2.609436 .4258924
-.26
4.69
kids |
5320 1.555827 1.195924
0
6
ageh |
5320 38.91823 8.450351
22
60
agesq |
5320 1586.024 689.7759
484
3600
-------------+-------------------------------------------------------disab |
5320 .0609023 .2391734
0
1
id |
5320
266.5 153.5893
1
532
year |
5320
1983.5 2.872551
1979
1988
.
. ********** DEFINE GLOBALS **********
.
. * Table 21.2 used 500
. global nreps 500
.
. ******** RUN REGRESSIONS USING XTREG **********
.
. * This is to verify alternative estimates later on
. * And for random effects it saves lamda
. * used later on to construct transformed regression
. * of (y - lamda*y_1) on (x - lamda*x_1)
.
. xtreg lnhr lnwg, be i(id)
Between regression (regression on group means)  Number of obs      =      5320
                                                Number of groups   =       532

R-sq:  between = 0.0213                         Obs per group: min =        10
       overall = 0.0152                                        avg =      10.0
                                                               max =        10

                                                F(1,530)           =     11.55
sd(u_i + avg(e_i.))=  .1772555                  Prob > F           =    0.0007
------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .0668379   .0196635     3.40   0.001     .0282099    .1054658
       _cons |   7.483021   .0518829   144.23   0.000       7.3811    7.584943
------------------------------------------------------------------------------
. estimates store bextreg
.
. xtreg lnhr lnwg, fe i(id)
R-sq:  between = 0.0213                         Number of obs      =      5320
       overall = 0.0152                         Number of groups   =       532

                                                Obs per group: min =        10
                                                               avg =      10.0
                                                               max =        10

                                                F(1,4787)          =     78.96
corr(u_i, Xb)  = -0.1995                        Prob > F           =    0.0000
------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .1676755     .01887     8.89   0.000     .1306816    .2046694
       _cons |   7.219892   .0493434   146.32   0.000     7.123156    7.316628
-------------+----------------------------------------------------------------
     sigma_u |  .18142881
     sigma_e |  .23278339
------------------------------------------------------------------------------
F test that all u_i=0:                          Prob > F = 0.0000
. estimates store fextreg
.
R-sq:  between = 0.0213                         Number of obs      =      5320
       overall = 0.0152                         Number of groups   =       532

                                                Obs per group: min =        10
                                                               avg =      10.0
                                                               max =        10

                                                Wald chi2(1)       =     76.64
corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0000
------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .1193322   .0136312     8.75   0.000     .0926155     .146049
       _cons |   7.346041   .0363925   201.86   0.000     7.274713    7.417368
-------------+----------------------------------------------------------------
     sigma_u |  .16124733
     sigma_e |  .23278339
------------------------------------------------------------------------------
. estimates store reglsxtreg
. scalar sesq = e(sigma_e)^2
. scalar susq = e(sigma_u)^2
. scalar lamdaregls = 1 - sqrt( sesq / (e(Tbar)*susq + sesq) )
. di lamdaregls
.58470925
.
. xtreg lnhr lnwg, mle i(id)
Fitting full model:
                                                Number of obs      =      5320
                                                Number of groups   =       532

                                                Obs per group: min =        10
                                                               avg =      10.0
                                                               max =        10

                                                LR chi2(1)         =     76.14
                                                Prob > chi2        =    0.0000
------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .1195474   .0137484     8.70   0.000      .092601    .1464938
       _cons |   7.345479   .0366973   200.16   0.000     7.273554    7.417404
-------------+----------------------------------------------------------------
    /sigma_u |    .162175   .0060469    26.82   0.000     .1503233    .1740266
    /sigma_e |   .2329172   .0023819    97.79   0.000     .2282488    .2375856
-------------+----------------------------------------------------------------
         rho |   .3265097    .017266                      .2934209    .3610233
------------------------------------------------------------------------------
. estimates store remlextreg
. scalar sesq2 = e(sigma_e)^2
. scalar susq2 = e(sigma_u)^2
. scalar lamdaremle = 1 - sqrt( sesq2 / (e(g_avg)*susq2 + sesq2) )
. di lamdaremle
.58648101
.
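As a worked check of the two scalars lamdaregls and lamdaremle, the formula used in the scalar commands can be evaluated by hand. This is only a sketch: it assumes T = 10, which is what e(Tbar) and e(g_avg) return for this balanced panel, and it plugs in the sigma_u and sigma_e values reported by xtreg.

```latex
\hat\lambda \;=\; 1 - \sqrt{\frac{\hat\sigma_e^{2}}{T\,\hat\sigma_u^{2} + \hat\sigma_e^{2}}}

% RE-GLS: sigma_u = .16124733, sigma_e = .23278339
\hat\lambda_{\text{GLS}} = 1 - \sqrt{\frac{0.054188}{10(0.026001) + 0.054188}}
                         = 1 - \sqrt{0.172466} \approx 1 - 0.41529 = 0.58471

% RE-MLE: sigma_u = .162175, sigma_e = .2329172
\hat\lambda_{\text{MLE}} = 1 - \sqrt{\frac{0.054250}{10(0.026301) + 0.054250}}
                         = 1 - \sqrt{0.170997} \approx 1 - 0.41352 = 0.58648
```

Both values agree with the displayed .58470925 and .58648101 to the precision of the rounded inputs.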
. ******** ANALYSIS: FE, RE and FD ESTIMATORS CALCULATED MANUALLY **********
.
. *** FIRST TRANSFORM DATA FROM LONG FORM TO WIDE FORM
.
. * Here just do this for lnhr and lnwg
. keep lnhr lnwg id year
. reshape wide lnhr lnwg, i(id) j(year)
(note: j = 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988)
Data                               long   ->   wide
-----------------------------------------------------------------------------
Number of obs.                     5320   ->     532
Number of variables                   4   ->      21
                                   year   ->   (dropped)
xij variables:
                                   lnhr   ->   lnhr1979 lnhr1980 ... lnhr1988
                                   lnwg   ->   lnwg1979 lnwg1980 ... lnwg1988
-----------------------------------------------------------------------------
.
. * Since year is 1979 to 1988 this will create
. * lnhr1979 to lnhr1988 and lnwg1979 to lnwg1988
.
. summarize
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          id |       532       266.5    153.7194          1        532
    lnhr1979 |       532    7.669342     .249361       5.89       8.54
    lnwg1979 |       532    2.597763    .4188951        .52       4.62
    lnhr1980 |       532    7.660094    .2691995       5.22       8.34
    lnwg1980 |       532    2.602368    .3945963         .8       4.61
-------------+--------------------------------------------------------
    lnhr1981 |       532     7.66765    .2105797       6.36        8.4
    lnwg1981 |       532    2.610959    .3870011       1.53       4.53
    lnhr1982 |       532     7.64609    .2427195       5.38       8.31
    lnwg1982 |       532     2.61468    .4014363       1.21       4.61
    lnhr1983 |       532    7.613064     .382703       2.77       8.37
-------------+--------------------------------------------------------
    lnwg1983 |       532    2.610526    .4111869       1.08       4.62
    lnhr1984 |       532    7.636523    .3316735       3.18       8.44
    lnwg1984 |       532    2.600188    .4621549       -.26       4.65
    lnhr1985 |       532    7.668365    .2597423       5.08       8.54
    lnwg1985 |       532    2.614944    .4347554       1.33       4.69
-------------+--------------------------------------------------------
    lnhr1986 |       532    7.659286    .3330862       2.77       8.38
    lnwg1986 |       532    2.602632    .4432807        .07       4.59
    lnhr1987 |       532     7.67406    .2745015       4.38       8.56
    lnwg1987 |       532    2.614699    .4300122       1.28       4.03
    lnhr1988 |       532    7.679831    .2552894       4.79       8.53
-------------+--------------------------------------------------------
    lnwg1988 |       532    2.625602    .4701759       -.22        4.6
.
. *** (1) POOLED OLS (OVERALL) REGRESSION
.
. * Not relevant
.
. *** (2) CREATE INDIVIDUAL AVERAGES AND DO BETWEEN REGRESSION
.
. gen avelnhr = (lnhr1979+lnhr1980+lnhr1981+lnhr1982+lnhr1983+lnhr1984+ /*
>      */ lnhr1985+lnhr1986+lnhr1987+lnhr1988) / 10
. gen avelnwg = (lnwg1979+lnwg1980+lnwg1981+lnwg1982+lnwg1983+lnwg1984+ /*
>      */ lnwg1985+lnwg1986+lnwg1987+lnwg1988) / 10
.
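The individual averages above are built by summing the ten yearly variables by hand. A shorter alternative is sketched here; it is not part of the original program. It assumes Stata 9 or later (for the egen function rowmean), the wide-form data created by the reshape, and no missing values. The names avelnhr_alt and avelnwg_alt are hypothetical.

```stata
* Sketch only, not from the original do-file.
* The wildcard lnhr19?? matches lnhr1979 ... lnhr1988 and nothing else here.
egen avelnhr_alt = rowmean(lnhr19??)
egen avelnwg_alt = rowmean(lnwg19??)

* Compare with the hand-coded averages (tolerance allows for float rounding)
assert abs(avelnhr_alt - avelnhr) < 1e-5
assert abs(avelnwg_alt - avelnwg) < 1e-5
```

Note that rowmean ignores missing values, so with an unbalanced panel it would average over the available years rather than divide by 10.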
. * Should replicate xtreg, be
. regress avelnhr avelnwg
      Source |       SS       df       MS              Number of obs =     532
-------------+------------------------------           F(  1,   530) =   11.55
       Model |  .363013807     1  .363013807           Prob > F      =  0.0007
    Residual |  16.6523404   530   .03141951           R-squared     =  0.0213
-------------+------------------------------           Adj R-squared =  0.0195
       Total |  17.0153542   531  .032043982           Root MSE      =  .17726
------------------------------------------------------------------------------
     avelnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     avelnwg |   .0668379   .0196635     3.40   0.001     .0282099    .1054658
       _cons |   7.483021   .0518829   144.23   0.000       7.3811    7.584943
------------------------------------------------------------------------------
. estimates store bebyols
.
. * Better is the following, as it gives heteroskedasticity-robust standard errors
. regress avelnhr avelnwg, robust
                                                       Number of obs =     532
                                                       F(  1,   530) =    7.55
                                                       Prob > F      =  0.0062
                                                       R-squared     =  0.0213
                                                       Root MSE      =  .17726
------------------------------------------------------------------------------
             |               Robust
     avelnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     avelnwg |   .0668379   .0243185     2.75   0.006     .0190654    .1146103
       _cons |   7.483021   .0657699   113.78   0.000      7.35382    7.612223
------------------------------------------------------------------------------
. estimates store behet
.
. * Or could bootstrap
. bootstrap "regress avelnhr avelnwg" "_b[avelnwg] _b[_cons]", reps(200) level(95)
command:      regress avelnhr avelnwg
statistics:   _bs_1      = _b[avelnwg]
              _bs_2      = _b[_cons]

                                                Number of obs    =       532
                                                Replications     =       200
------------------------------------------------------------------------------
Variable     |  Reps   Observed       Bias  Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _bs_1 |   200   .0668379  -.0010221   .0239486   .0196123   .1140634  (N)
             |                                          .0233175   .1143305  (P)
             |                                          .0266221   .1175503 (BC)
       _bs_2 |   200   7.483021   .0029632   .0648396    7.35516   7.610882  (N)
             |                                          7.362745   7.600107  (P)
             |                                          7.358079   7.591704 (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
. matrix bebootse = e(se)
.
. *** (3) CREATE DIFFERENCED DATA FOR FE AND RE
.
. * Continue with the data already in wide form and then reshape
. * Mean difference for FE and quasi for RE-GLS and RE-MLE
.
. * Mean difference for FE
. gen mdlnhr1979 = lnhr1979 - avelnhr
. gen mdlnwg1979 = lnwg1979 - avelnwg
.
. * Quasi difference for RE - GLS
. gen reglsdlnhr1979 = lnhr1979 - lamdaregls*avelnhr
. gen reglsdlnwg1979 = lnwg1979 - lamdaregls*avelnwg
.
. * Quasi difference for RE - MLE
. gen remledlnhr1979 = lnhr1979 - lamdaremle*avelnhr
. gen remledlnwg1979 = lnwg1979 - lamdaremle*avelnwg
.
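The program header notes that the code does not use loops for the transformations. A minimal sketch of the looped version follows, under these assumptions: the data are in wide form, avelnhr and avelnwg already exist, and the scalars lamdaregls and lamdaremle have been set as above. It is an illustration, not the authors' code.

```stata
* Sketch only, not from the original do-file.
* Creates the mean-differenced and quasi-differenced variables for
* every year 1979-1988 and for both lnhr and lnwg.
forvalues yr = 1979/1988 {
    foreach v in lnhr lnwg {
        gen md`v'`yr'     = `v'`yr' - ave`v'
        gen reglsd`v'`yr' = `v'`yr' - lamdaregls*ave`v'
        gen remled`v'`yr' = `v'`yr' - lamdaremle*ave`v'
    }
}
```

Changing the year range in forvalues, or adding names to the foreach list, extends the same transformations to other panels or more regressors, provided matching ave* variables are created first.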
. *** NOW BACK TO LONG FORM
.
. * Then back to long form
. reshape long lnhr lnwg mdlnhr mdlnwg reglsdlnhr reglsdlnwg remledlnhr remledlnwg, i(id) j(year)
(note: j = 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988)
Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                      532   ->    5320
Number of variables                  85   ->      14
                                          ->   year
xij variables:
            lnhr1979 lnhr1980 ... lnhr1988   ->   lnhr
            lnwg1979 lnwg1980 ... lnwg1988   ->   lnwg
      mdlnhr1979 mdlnhr1980 ... mdlnhr1988   ->   mdlnhr
      mdlnwg1979 mdlnwg1980 ... mdlnwg1988   ->   mdlnwg
reglsdlnhr1979 reglsdlnhr1980 ... reglsdlnhr1988   ->   reglsdlnhr
reglsdlnwg1979 reglsdlnwg1980 ... reglsdlnwg1988   ->   reglsdlnwg
remledlnhr1979 remledlnhr1980 ... remledlnhr1988   ->   remledlnhr
remledlnwg1979 remledlnwg1980 ... remledlnwg1988   ->   remledlnwg
-----------------------------------------------------------------------------
.
. describe
Contains data
  obs:         5,320
 vars:            14
 size:
------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
id              float  %9.0g
year            int    %9.0g
lnhr            float  %9.0g
lnwg            float  %9.0g
avelnhr         float  %9.0g
avelnwg         float  %9.0g
_est_bebyols    byte   %8.0g                  esample() from estimates store
_est_behet      byte   %8.0g                  esample() from estimates store
mdlnhr          float  %9.0g
mdlnwg          float  %9.0g
reglsdlnhr      float  %9.0g
reglsdlnwg      float  %9.0g
remledlnhr      float  %9.0g
remledlnwg      float  %9.0g
------------------------------------------------------------------------------
Sorted by:  id  year
. summarize
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          id |      5320       266.5    153.5893          1        532
        year |      5320      1983.5    2.872551       1979       1988
        lnhr |      5320     7.65743    .2855914       2.77       8.56
        lnwg |      5320    2.609436    .4258924       -.26       4.69
     avelnhr |      5320     7.65743    .1788568      6.416      8.242
-------------+--------------------------------------------------------
     avelnwg |      5320    2.609436    .3908626      1.346      4.543
_est_bebyols |      5320           1           0          1          1
  _est_behet |      5320           1           0          1          1
      mdlnhr |      5320   -1.21e-09    .2226492     -3.988      1.344
      mdlnwg |      5320   -9.86e-10    .1691472      -2.54      1.878
-------------+--------------------------------------------------------
  reglsdlnhr |      5320     3.18006    .2347122  -1.181465   4.008506
  reglsdlnwg |      5320    1.083675    .2344336  -1.593137   2.966892
  remledlnhr |      5320    3.166493    .2346121  -1.193439   3.997138
  remledlnwg |      5320    1.079051    .2339546  -1.597177   2.962247
. save MOM2, replace
file MOM2.dta saved
.
. *** (4) FIXED EFFECTS ESTIMATOR USING DIFFERENCED DATA
.
. * This should replicate xtreg, fe
. regress mdlnhr mdlnwg
      Source |       SS       df       MS
-------------+------------------------------           F(  1,  5318) =   87.72
       Model |  4.27857391     1  4.27857391           Prob > F      =  0.0000
    Residual |   259.39846  5318  .048777446           R-squared     =  0.0162
-------------+------------------------------           Adj R-squared =  0.0160
       Total |  263.677034  5319   .04957267           Root MSE      =  .22086
------------------------------------------------------------------------------
      mdlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      mdlnwg |   .1676755   .0179032     9.37   0.000      .132578     .202773
       _cons |  -1.04e-09    .003028    -0.00   1.000    -.0059361    .0059361
------------------------------------------------------------------------------
. estimates store febyols
.
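The coefficient matches xtreg, fe, but the usual standard error does not (.0179032 here against .01887 from xtreg). The gap is consistent with the degrees-of-freedom difference mentioned in the program header: OLS on the mean-differenced data uses 5318 residual degrees of freedom, while xtreg, fe uses 4787 because the 532 individual means are also estimated. A worked check, offered as a sketch using only numbers from the log:

```latex
\operatorname{se}_{\text{xtreg,fe}}
  \;=\; \operatorname{se}_{\text{OLS}} \times \sqrt{\frac{5318}{4787}}
  \;\approx\; 0.0179032 \times 1.05400
  \;\approx\; 0.018870
```

The same ratio links the two root mean squared errors, since .22086 x 1.05400 is approximately .23278, the sigma_e reported by xtreg, fe.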
. * This gives panel corrected standard errors
. regress mdlnhr mdlnwg, cluster(id)
                                                       F(  1,   531) =    3.89
                                                       Prob > F      =  0.0490
                                                       R-squared     =  0.0162
                                                       Root MSE      =  .22086
------------------------------------------------------------------------------
             |               Robust
      mdlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      mdlnwg |   .1676755   .0849706     1.97   0.049     .0007557    .3345953
       _cons |  -1.04e-09   6.39e-09    -0.16   0.870    -1.36e-08    1.15e-08
------------------------------------------------------------------------------
. estimates store fepanel
.
. * This gives panel bootstrap standard errors
. * Similar to bootstrap applied to xtreg, fe
. set seed 10001
. bs "regress mdlnhr mdlnwg" "_b[mdlnwg] _b[_cons]", cluster(id) reps($nreps) level(95)
command:      regress mdlnhr mdlnwg
statistics:   _bs_1      = _b[mdlnwg]
              _bs_2      = _b[_cons]

                                                Number of obs    =      5320
                                                N of clusters    =       532
                                                Replications     =       500
------------------------------------------------------------------------------
Variable     |  Reps   Observed       Bias  Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _bs_1 |   500   .1676755  -.0055543   .0844631   .0017284   .3336226  (N)
             |                                          .0213276   .3318829  (P)
             |                                          .0300515   .3605573 (BC)
       _bs_2 |   500  -1.04e-09   2.79e-10   6.50e-09  -1.38e-08   1.17e-08  (N)
             |                                         -1.39e-08   1.28e-08  (P)
             |                                         -1.41e-08   1.17e-08 (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
. matrix febootse = e(se)
.
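The bs command above uses the Stata version 8 syntax, with the command and statistics passed as quoted strings. For readers on Stata 9 or later, a sketch of the same panel (cluster) bootstrap in prefix form is given here. It assumes the long-form data with mdlnhr and mdlnwg and the global nreps are in memory; the matrix name febootse_alt is hypothetical, and the replicate draws need not match those in this log.

```stata
* Sketch only, not from the original do-file (Stata 9+ prefix syntax).
* Resamples whole individuals (clusters) rather than single observations.
bootstrap _b[mdlnwg] _b[_cons], cluster(id) reps($nreps) seed(10001) level(95): ///
    regress mdlnhr mdlnwg
matrix febootse_alt = e(se)

* Panel robust standard errors in the newer option form
regress mdlnhr mdlnwg, vce(cluster id)
```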
. * This gives heteroskedasticity corrected standard errors that are not panel robust
. regress mdlnhr mdlnwg, robust
                                                       Number of obs =    5320
                                                       F(  1,  5318) =    7.79
                                                       Prob > F      =  0.0053
                                                       R-squared     =  0.0162
                                                       Root MSE      =  .22086
------------------------------------------------------------------------------
             |               Robust
      mdlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      mdlnwg |   .1676755   .0600942     2.79   0.005     .0498662    .2854848
       _cons |  -1.04e-09    .003028    -0.00   1.000    -.0059361    .0059361
------------------------------------------------------------------------------
. estimates store fehet
.
. *** (5) RANDOM EFFECTS - GLS ESTIMATOR USING DIFFERENCED DATA
.
. * Should give same coefficient estimates as xtreg
. * May give different standard errors as treats lamda as known
. * but in practice the difference is not great as lamda is precisely estimated
.
. * This should replicate xtreg, re
. regress reglsdlnhr reglsdlnwg
      Source |       SS       df       MS
-------------+------------------------------           F(  1,  5318) =   76.64
       Model |  4.16279701     1  4.16279701           Prob > F      =  0.0000
    Residual |  288.860014  5318  .054317415           R-squared     =  0.0142
-------------+------------------------------           Adj R-squared =  0.0140
       Total |  293.022811  5319  .055089831           Root MSE      =  .23306
------------------------------------------------------------------------------
  reglsdlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  reglsdlnwg |   .1193323   .0136312     8.75   0.000     .0926095     .146055
       _cons |   3.050743   .0151135   201.86   0.000     3.021114    3.080371
------------------------------------------------------------------------------
. estimates store reglsbyols
.
. regress reglsdlnhr reglsdlnwg, cluster(id)
                                                       F(  1,   531) =    5.39
                                                       Prob > F      =  0.0206
                                                       R-squared     =  0.0142
                                                       Root MSE      =  .23306
------------------------------------------------------------------------------
             |               Robust
  reglsdlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  reglsdlnwg |   .1193323   .0514016     2.32   0.021     .0183568    .2203077
       _cons |   3.050743   .0571367    53.39   0.000     2.938501    3.162984
------------------------------------------------------------------------------
. estimates store reglspanel
.
. set seed 10001
. bs "regress reglsdlnhr reglsdlnwg" "_b[reglsdlnwg] _b[_cons]", cluster(id) reps($nreps) level(95)
command:      regress reglsdlnhr reglsdlnwg
statistics:   _bs_1      = _b[reglsdlnwg]
              _bs_2      = _b[_cons]

                                                Number of obs    =      5320
                                                N of clusters    =       532
                                                Replications     =       500
------------------------------------------------------------------------------
Variable     |  Reps   Observed       Bias  Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _bs_1 |   500   .1193323  -.0020689   .0516757   .0178035    .220861  (N)
             |                                          .0300938   .2277364  (P)
             |                                          .0339291    .236732 (BC)
       _bs_2 |   500   3.050743   .0022622   .0571941   2.938372   3.163114  (N)
             |                                           2.93212   3.148191  (P)
             |                                          2.920954   3.143819 (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
. matrix reglsbootse = e(se)
.
. regress reglsdlnhr reglsdlnwg, robust
                                                       Number of obs =    5320
                                                       F(  1,  5318) =    7.81
                                                       Prob > F      =  0.0052
                                                       R-squared     =  0.0142
                                                       Root MSE      =  .23306
------------------------------------------------------------------------------
             |               Robust
  reglsdlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  reglsdlnwg |   .1193323   .0426897     2.80   0.005      .035643    .2030215
       _cons |   3.050743    .047821    63.80   0.000     2.956994    3.144491
------------------------------------------------------------------------------
. estimates store reglshet
.
. *** (6) RANDOM EFFECTS - MLE ESTIMATOR USING DIFFERENCED DATA
.
. * Should give same coefficient estimates as xtreg
. * May give different standard errors as treats lamda as known
. * but in practice the difference is not great as lamda is precisely estimated
.
. * This should replicate xtreg, mle
. regress remledlnhr remledlnwg
      Source |       SS       df       MS
-------------+------------------------------           F(  1,  5318) =   76.67
       Model |  4.16076808     1  4.16076808           Prob > F      =  0.0000
    Residual |  288.612179  5318  .054270812           R-squared     =  0.0142
-------------+------------------------------           Adj R-squared =  0.0140
       Total |  292.772947  5319  .055042855           Root MSE      =  .23296
------------------------------------------------------------------------------
  remledlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  remledlnwg |   .1195474   .0136533     8.76   0.000     .0927814    .1463134
       _cons |   3.037495   .0150748   201.49   0.000     3.007942    3.067048
------------------------------------------------------------------------------
. estimates store remlebyols
.
. regress remledlnhr remledlnwg, cluster(id)
                                                       F(  1,   531) =    5.38
                                                       Prob > F      =  0.0208
                                                       R-squared     =  0.0142
                                                       Root MSE      =  .23296
------------------------------------------------------------------------------
             |               Robust
  remledlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  remledlnwg |   .1195474   .0515474     2.32   0.021     .0182855    .2208093
       _cons |   3.037495   .0570501    53.24   0.000     2.925424    3.149567
------------------------------------------------------------------------------
. estimates store remlepanel
.
. set seed 10001
. bs "regress remledlnhr remledlnwg" "_b[remledlnwg] _b[_cons]", cluster(id) reps($nreps) level(95)
command:      regress remledlnhr remledlnwg
statistics:   _bs_1      = _b[remledlnwg]
              _bs_2      = _b[_cons]

                                                Number of obs    =      5320
                                                N of clusters    =       532
                                                Replications     =       500
------------------------------------------------------------------------------
Variable     |  Reps   Observed       Bias  Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _bs_1 |   500   .1195474  -.0020813   .0518188   .0177375   .2213573  (N)
             |                                          .0300552   .2282355  (P)
             |                                          .0339668   .2372786 (BC)
       _bs_2 |   500   3.037495   .0022658   .0571042   2.925301   3.149689  (N)
             |                                          2.919076   3.134685  (P)
             |                                          2.907989    3.13043 (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
. matrix remlebootse = e(se)
.
. * NOTE: the next command repeats the RE-GLS regression from step (5); the RE-MLE
. * analogue would use remledlnhr and remledlnwg, so remlehet holds RE-GLS results
. regress reglsdlnhr reglsdlnwg, robust
                                                       Number of obs =    5320
                                                       F(  1,  5318) =    7.81
                                                       Prob > F      =  0.0052
                                                       R-squared     =  0.0142
                                                       Root MSE      =  .23306
------------------------------------------------------------------------------
             |               Robust
  reglsdlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  reglsdlnwg |   .1193323   .0426897     2.80   0.005      .035643    .2030215
       _cons |   3.050743    .047821    63.80   0.000     2.956994    3.144491
------------------------------------------------------------------------------
. estimates store remlehet
.
. *** (7) ROBUST VARIANT OF HAUSMAN TEST
.
. * From Section 21.4.3 pages 717-9 the usual implementation of the Hausman test
. * is invalid if there is any intracluster correlation left in the RE model
. * as then the RE estimator is no longer fully efficient
. * so Var[b_RE - b_FE] does not equal Var[b_FE] - Var[b_RE]
.
. * (7A) Nonrobust version of Hausman test by auxiliary regression
. *      [will be similar to nonrobust version in mma21p1panfeandre.do]
. regress reglsdlnhr reglsdlnwg mdlnwg
      Source |       SS       df       MS
-------------+------------------------------           F(  2,  5317) =   45.26
       Model |  4.90465081     2  2.45232541           Prob > F      =  0.0000
    Residual |   288.11816  5317  .054188106           R-squared     =  0.0167
-------------+------------------------------           Adj R-squared =  0.0164
       Total |  293.022811  5319  .055089831           Root MSE      =  .23278
------------------------------------------------------------------------------
  reglsdlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  reglsdlnwg |   .0668379   .0196635     3.40   0.001     .0282893    .1053864
      mdlnwg |   .1008376   .0272531     3.70   0.000     .0474104    .1542648
       _cons |    3.10763   .0215465   144.23   0.000      3.06539     3.14987
------------------------------------------------------------------------------
. scalar Hnonrobust = (_b[mdlnwg]/_se[mdlnwg])^2
. di Hnonrobust
13.690344
.
. * Perform preferred valid robust version of Hausman test
. * This gives the results presented on p.719
. regress reglsdlnhr reglsdlnwg mdlnwg, cluster(id)
                                                       F(  2,   531) =    4.24
                                                       Prob > F      =  0.0149
                                                       R-squared     =  0.0167
                                                       Root MSE      =  .23278
------------------------------------------------------------------------------
             |               Robust
  reglsdlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  reglsdlnwg |   .0668379   .0243001     2.75   0.006     .0191016    .1145741
      mdlnwg |   .1008376   .0785137     1.28   0.200     -.053398    .2550732
       _cons |    3.10763    .027293   113.86   0.000     3.054014    3.161245
------------------------------------------------------------------------------
. scalar Hrobust = (_b[mdlnwg]/_se[mdlnwg])^2
. di Hrobust
1.6495074
.
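Both Hausman statistics are the squared t-ratio on mdlnwg in the auxiliary regression of reglsdlnhr on reglsdlnwg and mdlnwg; only the standard error differs. The arithmetic is sketched below using the coefficient and standard errors reported in the log. The comparison with the 5 percent critical value of a chi-square with one degree of freedom (3.84) is added here for illustration.

```latex
H \;=\; \left( \frac{\hat\gamma}{\operatorname{se}(\hat\gamma)} \right)^{2},
\qquad \hat\gamma = \text{coefficient on mdlnwg} = 0.1008376

H_{\text{nonrobust}} = \left( \frac{0.1008376}{0.0272531} \right)^{2}
                     \approx 3.7000^{2} \approx 13.69 \;>\; 3.84

H_{\text{robust}} = \left( \frac{0.1008376}{0.0785137} \right)^{2}
                  \approx 1.2843^{2} \approx 1.65 \;<\; 3.84
```

So the nonrobust version rejects the random effects specification at the 5 percent level while the panel robust version does not. In this output the coefficient on mdlnwg also equals the gap between the fixed effects and between estimates, .1676755 - .0668379 = .1008376.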
. ********* DISPLAY RESULTS - Table 21.2 on page 710 *********
.
. * All estimates should be equal for a given estimator.
. * The standard errors will vary.
. * The first and second assume iid errors and generally will be the same.
. * The third assumes heteroskedastic errors, but are not panel robust.
. * The fourth are panel robust and also allow for heteroskedasticity.
. estimates table bextreg bebyols behet, b(%10.3f) se /*
> */ stats(N ll r2 tss rss mss rmse df_r)
---------------------------------------------------
    Variable |   bextreg     bebyols       behet
-------------+-------------------------------------
        lnwg |     0.067
             |     0.020
     avelnwg |                 0.067       0.067
             |                 0.020       0.024
       _cons |     7.483       7.483       7.483
             |     0.052       0.052       0.066
-------------+-------------------------------------
           N |  5320.000     532.000     532.000
          ll |   166.573     166.573     166.573
          r2 |     0.021       0.021       0.021
         tss |
         rss |    16.652      16.652      16.652
         mss |     0.363       0.363       0.363
        rmse |     0.177       0.177       0.177
        df_r |   530.000     530.000     530.000
---------------------------------------------------
                                       legend: b/se
. estimates table fextreg febyols fehet fepanel, b(%10.3f) se /*
---------------------------------------------------------------
    Variable |   fextreg     febyols       fehet     fepanel
-------------+-------------------------------------------------
        lnwg |     0.168
             |     0.019
      mdlnwg |                 0.168       0.168       0.168
             |                 0.018       0.060       0.085
       _cons |     7.220      -0.000      -0.000      -0.000
             |     0.049       0.003       0.003       0.000
-------------+-------------------------------------------------
           N |  5320.000    5320.000    5320.000    5320.000
          ll |   486.743     486.743     486.743     486.743
          r2 |     0.016       0.016       0.016       0.016
         tss |   433.831
         rss |   259.398     259.398     259.398     259.398
         mss |     4.279       4.279       4.279       4.279
        rmse |     0.233       0.221       0.221       0.221
        df_r |  4787.000    5318.000    5318.000     531.000
---------------------------------------------------------------
                                                   legend: b/se
. estimates table reglsxtreg reglsbyols reglshet reglspanel, b(%10.3f) se /*
---------------------------------------------------------------
    Variable | reglsxtreg  reglsbyols    reglshet  reglspanel
-------------+-------------------------------------------------
        lnwg |     0.119
             |     0.014
  reglsdlnwg |                 0.119       0.119       0.119
             |                 0.014       0.043       0.051
       _cons |     7.346       3.051       3.051       3.051
             |     0.036       0.015       0.048       0.057
-------------+-------------------------------------------------
           N |  5320.000    5320.000    5320.000    5320.000
          ll |               200.589     200.589     200.589
          r2 |                 0.014       0.014       0.014
         tss |
         rss |               288.860     288.860     288.860
         mss |                 4.163       4.163       4.163
        rmse |                 0.233       0.233       0.233
        df_r |              5318.000    5318.000     531.000
---------------------------------------------------------------
                                                   legend: b/se
. estimates table remlextreg remlebyols remlehet remlepanel, b(%10.3f) se /*
---------------------------------------------------------------
    Variable | remlextreg  remlebyols    remlehet  remlepanel
-------------+-------------------------------------------------
lnhr         |
        lnwg |     0.120
             |     0.014
       _cons |     7.345
             |     0.037
-------------+-------------------------------------------------
sigma_u      |
       _cons |     0.162
             |     0.006
-------------+-------------------------------------------------
sigma_e      |
       _cons |     0.233
             |     0.002
-------------+-------------------------------------------------
_            |
  remledlnwg |                 0.120                   0.120
             |                 0.014                   0.052
  reglsdlnwg |                             0.119
             |                             0.043
       _cons |                 3.037       3.051       3.037
             |                 0.015       0.048       0.057
-------------+-------------------------------------------------
Statistics   |
           N |  5320.000    5320.000    5320.000    5320.000
          ll |  -266.912     202.872     200.589     202.872
          r2 |                 0.014       0.014       0.014
         tss |
         rss |               288.612     288.860     288.612
         mss |                 4.161       4.163       4.161
        rmse |                 0.233       0.233       0.233
        df_r |              5318.000    5318.000     531.000
---------------------------------------------------------------
                                                   legend: b/se
.
. * The following are (panel) bootstrap standard errors
. matrix list bebootse
bebootse[1,2]
        _bs_1      _bs_2
se  .02394857  .06483965
. matrix list febootse
febootse[1,2]
        _bs_1      _bs_2
se  .08446309  6.497e-09
. * Note that the following two differ from mma21p1panfeandre.do
. * as here the same value of lamda is used throughout the bootstraps
. matrix list remlebootse
remlebootse[1,2]
        _bs_1      _bs_2
se  .05181879  .05710419
. matrix list reglsbootse
reglsbootse[1,2]
        _bs_1      _bs_2
se  .05167569  .05719414
.
. * For completeness give lamda
. di lamdaregls
.58470925
. di lamdaremle
.58648101
.
. * Robust and nonrobust versions of Hausman test given on p.719
. di Hnonrobust /* Not valid if intracluster correlation */
13.690344
. di Hrobust    /* Valid if intracluster correlation */
1.6495074
.
. log close
log: c:\Imbook\bwebpage\Section5\mma21p2panmanual.txt
log type: text
closed on: 23 May 2005, 11:35:55
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section5\mma21p2panresiduals.txt
log type: text
opened on: 23 May 2005, 11:37:22
.
. ********** OVERVIEW OF MMA21P3PANRESIDUALS.DO **********
.
. * STATA Program
.
. * Chapter 21.3.4 pages 713-15 Residual analysis
. * This program
. * (1) estimates correlations for
. * - dependent variable
. * - regressor variable
. * - residuals from pooled ols [Table 21.3]
. * - residuals from within estimation [Table 21.4]
. * - residuals from random effects estimation
. * (2) separately estimates correlations for
. * - residuals from first differences estimation
. * (3) gets correlations for each individual observation
.
.
.*
.
. * To run you need file
. * MOM.dat
.
. ********** SETUP **********
.
. set more off
. version 8.0
.
.
.
. * So id 1 1979, id 1 1980, ..., id 1 1988, id 2 1979, id 2 1980, ...
. * 8 variables:
.
.
. ********** READ DATA **********
.*
. summarize
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        lnhr |      5320     7.65743    .2855914       2.77       8.56
        lnwg |      5320    2.609436    .4258924       -.26       4.69
        kids |      5320    1.555827    1.195924          0          6
        ageh |      5320    38.91823    8.450351         22         60
       agesq |      5320    1586.024    689.7759        484       3600
-------------+--------------------------------------------------------
       disab |      5320    .0609023    .2391734          0          1
          id |      5320       266.5    153.5893          1        532
        year |      5320      1983.5    2.872551       1979       1988
.
. ************ (1) ANALYSIS: OBTAIN KEY AUTOCORRELATIONS Tables 21.3, 21.4 **********
.
. ** RUN REGRESSIONS AND GET RESIDUALS OF INTEREST
.
. * pooled ols
. regress lnhr lnwg
      Source |       SS       df       MS
-------------+------------------------------           F(  1,  5318) =   82.22
       Model |  6.60538417     1  6.60538417           Prob > F      =  0.0000
    Residual |  427.225206  5318  .080335691           R-squared     =  0.0152
-------------+------------------------------           Adj R-squared =  0.0150
       Total |  433.830591  5319  .081562435           Root MSE      =  .28344
------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .0827436   .0091251     9.07   0.000     .0648545    .1006326
       _cons |   7.441516   .0241265   308.44   0.000     7.394219    7.488814
------------------------------------------------------------------------------
. predict upols, residuals
.
. * fixed effects (within)
R-sq:  between = 0.0213                         Number of obs      =      5320
       overall = 0.0152                         Number of groups   =       532

                                                Obs per group: min =        10
                                                               avg =      10.0
                                                               max =        10

                                                F(1,4787)          =     78.96
corr(u_i, Xb)  = -0.1995                        Prob > F           =    0.0000
------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .1676755     .01887     8.89   0.000     .1306816    .2046694
       _cons |   7.219892   .0493434   146.32   0.000     7.123156    7.316628
-------------+----------------------------------------------------------------
     sigma_u |  .18142881
     sigma_e |  .23278339
------------------------------------------------------------------------------
F test that all u_i=0:                          Prob > F = 0.0000
. predict ufe, e
.
. * random effects
R-sq:  between = 0.0213                         Number of obs      =      5320
       overall = 0.0152                         Number of groups   =       532

                                                Obs per group: min =        10
                                                               avg =      10.0
                                                               max =        10

                                                Wald chi2(1)       =     76.64
corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0000
------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .1193322   .0136312     8.75   0.000     .0926155     .146049
       _cons |   7.346041   .0363925   201.86   0.000     7.274713    7.417368
-------------+----------------------------------------------------------------
     sigma_u |  .16124733
     sigma_e |  .23278339
------------------------------------------------------------------------------
. predict ure, e
.
. summarize upols ufe ure
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       upols |      5320   -1.27e-10    .2834089  -4.826247    .964581
         ufe |      5320   -5.52e-11    .2208354  -4.003929     1.2719
         ure |      5320   -9.00e-11    .2231118  -4.131111   1.085362
. save mom3, replace
file mom3.dta saved
.
. ** TRANSFORM DATA FROM LONG FORM TO WIDE FORM
.
. * Here just do this for lnhr and lnwg and the residuals
. keep lnhr lnwg id year upols ufe ure
. reshape wide lnhr lnwg upols ufe ure, i(id) j(year)
(note: j = 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988)
Data                               long   ->   wide
-----------------------------------------------------------------------------
Number of obs.                     5320   ->     532
Number of variables                   7   ->      51
                                   year   ->   (dropped)
xij variables:
                                  upols   ->   upols1979 upols1980 ... upols1988
                                    ufe   ->   ufe1979 ufe1980 ... ufe1988
                                    ure   ->   ure1979 ure1980 ... ure1988
-----------------------------------------------------------------------------
.
.
. summarize
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          id |       532       266.5    153.7194          1        532
    lnhr1979 |       532    7.669342     .249361       5.89       8.54
    lnwg1979 |       532    2.597763    .4188951        .52       4.62
   upols1979 |       532    .0128775    .2517228  -1.764168   .8312218
     ufe1979 |       532    .0138689    .2249175  -1.578105     1.2719
-------------+--------------------------------------------------------
     ure1979 |       532    .0133046    .2200196  -1.618987   1.085362
    lnhr1980 |       532    7.660094    .2691995       5.22       8.34
    lnwg1980 |       532    2.602368    .3945963         .8       4.61
   upols1980 |       532    .0032483    .2679463  -2.354734   .6659743
     ufe1980 |       532    .0038486    .2253673  -2.085636   1.128546
-------------+--------------------------------------------------------
     ure1980 |       532    .0035069    .2238723  -2.089847   .9429754
    lnhr1981 |       532     7.66765    .2105797       6.36        8.4
    lnwg1981 |       532    2.610959    .3870011       1.53       4.53
   upols1981 |       532    .0100939    .2133106  -1.342159   .7582438
     ufe1981 |       532    .0099646     .163407  -1.001722    1.03687
-------------+--------------------------------------------------------
     ure1981 |       532    .0100382    .1596593   -1.02491   .8517824
    lnhr1982 |       532     7.64609    .2427195       5.38       8.31
    lnwg1982 |       532     2.61468    .4014363       1.21       4.61
   upols1982 |       532   -.0117742    .2422735  -2.264238   .6897579
     ufe1982 |       532   -.0122196    .1890237  -1.623214   .7918997
-------------+--------------------------------------------------------
     ure1982 |       532   -.0119661    .1875585  -1.737484   .6666697
    lnhr1983 |       532    7.613064     .382703       2.77       8.37
    lnwg1983 |       532    2.610526    .4111869       1.08       4.62
   upols1983 |       532   -.0444568    .3778255  -4.826247   .7307264
     ufe1983 |       532   -.0445494    .2836351  -3.577253   .5196197
-------------+--------------------------------------------------------
     ure1983 |       532   -.0444967     .294545  -3.804399   .5078294
    lnhr1984 |       532    7.636523    .3316735       3.18       8.44
    lnwg1984 |       532    2.600188    .4621549       -.26       4.65
   upols1984 |       532   -.0201427    .3208512  -4.240003   .8263766
     ufe1984 |       532   -.0193572     .225836  -2.810104   .8327778
-------------+--------------------------------------------------------
     ure1984 |       532   -.0198043    .2378605  -3.140221   .7036628
    lnhr1985 |       532    7.668365    .2597423       5.08       8.54
    lnwg1985 |       532    2.614944    .4347554       1.33       4.69
   upols1985 |       532    .0104785     .259051  -2.503835   .8624523
     ufe1985 |       532    .0100107    .1856724  -1.581894   .7944546
-------------+--------------------------------------------------------
     ure1985 |       532     .010277    .1886509  -1.752727   .7370209
    lnhr1986 |       532    7.659286    .3330862       2.77       8.38
    lnwg1986 |       532    2.602632    .4432807        .07       4.59
   upols1986 |       532    .0024183    .3312105  -4.801424   .7439653
     ufe1986 |       532    .0029962    .2595405  -4.003929   .6384854
-------------+--------------------------------------------------------
     ure1986 |       532    .0026673     .264328  -4.131111   .5111209
    lnhr1987 |       532     7.67406    .2745015       4.38       8.56
    lnwg1987 |       532    2.614699    .4300122       1.28       4.03
   upols1987 |       532    .0161942    .2749153  -3.283269    .964581
     ufe1987 |       532    .0157472    .2141618  -2.817174   1.009662
-------------+--------------------------------------------------------
     ure1987 |       532    .0160016    .2148092  -2.897725   .8441463
    lnhr1988 |       532    7.679831    .2552894       4.79       8.53
    lnwg1988 |       532    2.625602    .4701759       -.22        4.6
   upols1988 |       532    .0210628    .2519891  -2.633313   .9072749
     ufe1988 |       532    .0196898    .2048927   -1.68379   1.123516
-------------+--------------------------------------------------------
     ure1988 |       532    .0204713    .2022375  -1.897506   .9393954
.
. ** OBTAIN THE VARIOUS CORRELATIONS
.
. corr lnhr1979 lnhr1980 lnhr1981 lnhr1982 lnhr1983 lnhr1984 lnhr1985 lnhr1986 lnhr1987 lnhr1988
(obs=532)
| lnhr1979 lnhr1980 lnhr1981 lnhr1982 lnhr1983 lnhr1984 lnhr1985 lnhr1986 lnhr1987
-------------+--------------------------------------------------------------------------------
lnhr1979 | 1.0000
lnhr1980 | 0.3220 1.0000
lnhr1981 | 0.4321 0.4022 1.0000
lnhr1982 | 0.2947 0.3142 0.5670 1.0000
lnhr1983 | 0.2070 0.2324 0.3788 0.4781 1.0000
lnhr1984 | 0.1908 0.2235 0.3141 0.3318 0.6476 1.0000
lnhr1985 | 0.2284 0.3184 0.3999 0.3453 0.3930 0.5839 1.0000
lnhr1986 | 0.1934 0.1931 0.2813 0.2524 0.3162 0.3595 0.4128 1.0000
lnhr1987 | 0.1986 0.3160 0.3322 0.2951 0.3261 0.3464 0.3987 0.3603 1.0000
lnhr1988 | 0.1640 0.2551 0.3081 0.2674 0.2267 0.2537 0.3509 0.5741 0.5248
| lnhr1988
-------------+--------
lnhr1988 | 1.0000
. corr lnwg1979 lnwg1980 lnwg1981 lnwg1982 lnwg1983 lnwg1984 lnwg1985 lnwg1986 lnwg1987 lnwg1988
(obs=532)
| lnwg1979 lnwg1980 lnwg1981 lnwg1982 lnwg1983 lnwg1984 lnwg1985 lnwg1986 lnwg1987
-------------+--------------------------------------------------------------------------------
lnwg1979 | 1.0000
lnwg1980 | 0.8415 1.0000
lnwg1981 | 0.8283 0.8920 1.0000
lnwg1982 | 0.7984 0.8559 0.9015 1.0000
lnwg1983 | 0.7795 0.8408 0.8787 0.9155 1.0000
lnwg1984 | 0.7208 0.7737 0.8102 0.8267 0.8625 1.0000
lnwg1985 | 0.7424 0.7929 0.8290 0.8511 0.8636 0.8620 1.0000
lnwg1986 | 0.7250 0.7714 0.8122 0.8286 0.8530 0.8399 0.9157 1.0000
lnwg1987 | 0.7188 0.7639 0.8029 0.8282 0.8525 0.8681 0.9117 0.9111 1.0000
lnwg1988 | 0.7220 0.7604 0.7900 0.8139 0.8326 0.8373 0.8787 0.8743 0.9101
| lnwg1988
-------------+--------
lnwg1988 | 1.0000
. * The following gives Table 21.3 p.714

. corr upols1979 upols1980 upols1981 upols1982 upols1983 upols1984 upols1985 upols1986 upols1987 upols1988
(obs=532)
| upo~1979 upo~1980 upo~1981 upo~1982 upo~1983 upo~1984 upo~1985 upo~1986 upo~1987
-------------+--------------------------------------------------------------------------------
upols1979 | 1.0000
upols1980 | 0.3283 1.0000
upols1981 | 0.4442 0.4035 1.0000
upols1982 | 0.3008 0.3140 0.5678 1.0000
upols1983 | 0.2089 0.2298 0.3739 0.4684 1.0000
upols1984 | 0.2025 0.2289 0.3194 0.3360 0.6398 1.0000
upols1985 | 0.2395 0.3246 0.4087 0.3484 0.3898 0.5800 1.0000
upols1986 | 0.1987 0.1903 0.2797 0.2470 0.3109 0.3535 0.3991 1.0000
upols1987 | 0.2091 0.3167 0.3340 0.2877 0.3097 0.3361 0.3941 0.3496 1.0000
upols1988 | 0.1619 0.2456 0.3016 0.2582 0.2083 0.2470 0.3436 0.5545 0.5242
| upo~1988
-------------+--------
upols1988 | 1.0000
. corr ure1979 ure1980 ure1981 ure1982 ure1983 ure1984 ure1985 ure1986 ure1987 ure1988
(obs=532)
| ure1979 ure1980 ure1981 ure1982 ure1983 ure1984 ure1985 ure1986 ure1987
-------------+--------------------------------------------------------------------------------
ure1979 | 1.0000
ure1980 | 0.0778 1.0000
ure1981 | 0.1777 0.0604 1.0000
ure1982 | -0.0250 -0.0519 0.2492 1.0000
ure1983 | -0.2339 -0.2277 -0.1609 0.0587 1.0000
ure1984 | -0.2482 -0.2431 -0.2691 -0.1709 0.3795 1.0000
ure1985 | -0.1842 -0.0919 -0.1054 -0.1581 -0.0939 0.2197 1.0000
ure1986 | -0.1860 -0.2333 -0.2434 -0.2405 -0.1110 -0.0763 -0.0361 1.0000
ure1987 | -0.1665 -0.0481 -0.1580 -0.1904 -0.1710 -0.1506 -0.0646 -0.0553 1.0000
ure1988 | -0.1960 -0.1251 -0.1646 -0.1949 -0.3265 -0.2786 -0.1221 0.2708 0.2379
| ure1988
-------------+--------
ure1988 | 1.0000

. corr ufe1979 ufe1980 ufe1981 ufe1982 ufe1983 ufe1984 ufe1985 ufe1986 ufe1987 ufe1988
(obs=532)
| ufe1979 ufe1980 ufe1981 ufe1982 ufe1983 ufe1984 ufe1985 ufe1986 ufe1987
-------------+--------------------------------------------------------------------------------
ufe1979 | 1.0000
ufe1980 | 0.1017 1.0000
ufe1981 | 0.2082 0.0802 1.0000
ufe1982 | 0.0003 -0.0380 0.2631 1.0000
ufe1983 | -0.2632 -0.2691 -0.2113 0.0089 1.0000
ufe1984 | -0.2594 -0.2698 -0.3004 -0.2037 0.3249 1.0000
ufe1985 | -0.1757 -0.0958 -0.1069 -0.1685 -0.1617 0.1713 1.0000
ufe1986 | -0.1915 -0.2534 -0.2644 -0.2676 -0.1723 -0.1364 -0.0865 1.0000
ufe1987 | -0.1519 -0.0497 -0.1561 -0.2008 -0.2399 -0.2066 -0.0918 -0.0908 1.0000
ufe1988 | -0.1650 -0.1109 -0.1385 -0.1772 -0.3816 -0.3096 -0.1268 0.2420 0.2439
| ufe1988
-------------+--------
ufe1988 | 1.0000
.
. * The following does estimation for just one year
. regress lnhr1979 lnwg1979
      Source |       SS       df       MS              Number of obs =     532
-------------+------------------------------           F(  1,   530) =    0.00
       Model |  .000035507     1  .000035507           Prob > F      =  0.9810
    Residual |  33.0180361   530  .062298181           R-squared     =  0.0000
-------------+------------------------------           Adj R-squared = -0.0019
       Total |  33.0180716   531  .062180926           Root MSE      =   .2496

------------------------------------------------------------------------------
    lnhr1979 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    lnwg1979 |   .0006173   .0258574     0.02   0.981    -.0501783    .0514129
       _cons |   7.667738   .0680375   112.70   0.000     7.534082    7.801395
------------------------------------------------------------------------------
.
. ************ (2) ANALYSIS: OBTAIN AUTOCORRELATIONS FOR FIRST DIFFERENCES
.
. ** SET UP THE DATA
. use mom, clear
. regress dlnhr dlnwg
      Source |       SS       df       MS
-------------+------------------------------           F(  1,  4786) =   26.09
       Model |  2.27870825     1  2.27870825           Prob > F      =  0.0000
    Residual |  417.943979  4786  .087326364           R-squared     =  0.0054
-------------+------------------------------           Adj R-squared =  0.0052
       Total |  420.222687  4787  .087784142           Root MSE      =  .29551

------------------------------------------------------------------------------
       dlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       dlnwg |   .1089851   .0213351     5.11   0.000     .0671584    .1508118
       _cons |   .0008283   .0042712     0.19   0.846    -.0075452    .0092018
------------------------------------------------------------------------------
. predict ufdiff, residuals
. keep dlnhr dlnwg ufdiff id year
. reshape wide dlnhr dlnwg ufdiff, i(id) j(year)
(note: j = 1980 1981 1982 1983 1984 1985 1986 1987 1988)
Data                               long   ->   wide
-----------------------------------------------------------------------------
Number of obs.                     4788   ->     532
Number of variables                   5   ->      28
year -> (dropped)
xij variables:
dlnhr -> dlnhr1980 dlnhr1981 ... dlnhr1988
dlnwg -> dlnwg1980 dlnwg1981 ... dlnwg1988
ufdiff -> ufdiff1980 ufdiff1981 ... ufdiff1988
-----------------------------------------------------------------------------
. summarize

    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
          id |       532       266.5    153.7194          1        532
   dlnhr1980 |       532   -.0092481    .3023508       -2.5       1.71
   dlnwg1980 |       532    .0046053    .2301879      -2.12       1.05
  ufdiff1980 |       532   -.0105783    .3014161  -2.499738   1.690644
   dlnhr1981 |       532    .0075564    .2668644       -1.2       2.32
-------------+--------------------------------------------------------
   dlnwg1981 |       532    .0085902    .1818033       -.79       1.62
  ufdiff1981 |       532    .0057919    .2669213  -1.145188   2.343149
   dlnhr1982 |       532   -.0215602     .212834      -2.06       1.14
   dlnwg1982 |       532    .0037218    .1755574      -1.17        .74
  ufdiff1982 |       532   -.0227941     .213709  -2.036851   1.135902
-------------+--------------------------------------------------------
   dlnhr1983 |       532   -.0330263    .3413969      -4.51   .9899998
   dlnwg1983 |       532   -.0041541    .1673057       -.88   .6399999
  ufdiff1983 |       532   -.0334019    .3398726  -4.419281   .9780819
   dlnhr1984 |       532    .0234586    .3034213      -2.31       2.57
   dlnwg1984 |       532   -.0103383    .2342514      -2.13        .77
-------------+--------------------------------------------------------
  ufdiff1984 |       532    .0237571    .3004287  -2.168058   2.502691
   dlnhr1985 |       532    .0318421    .2772558      -1.46       3.52
   dlnwg1985 |       532    .0147556    .2371054      -1.33       3.06
  ufdiff1985 |       532    .0294057    .2697542  -1.315878   3.185677
   dlnhr1986 |       532   -.0090789    .3270724      -4.79        1.8
-------------+--------------------------------------------------------
   dlnwg1986 |       532    -.012312    .1804162      -1.83       1.04
  ufdiff1986 |       532   -.0085654    .3299129  -4.796278   1.789363
   dlnhr1987 |       532    .0147744    .3470122      -3.24       4.52
   dlnwg1987 |       532    .0120677    .1845692  -.9400001       1.95
  ufdiff1987 |       532    .0126309    .3494111  -3.243008   4.550777
-------------+--------------------------------------------------------
   dlnhr1988 |       532    .0057707    .2587991       -2.5       2.74
   dlnwg1988 |       532    .0109023     .194813       -1.5       1.22
  ufdiff1988 |       532    .0037542    .2576554  -2.337351   2.739172
.
. ** GET THE CORRELATIONS
. corr dlnhr1980 dlnhr1981 dlnhr1982 dlnhr1983 dlnhr1984 dlnhr1985 dlnhr1986 dlnhr1987 dlnhr1988
(obs=532)
| dlnhr1~0 dlnhr1~1 dlnhr1~2 dlnhr1~3 dlnhr1~4 dlnhr1~5 dlnhr1~6 dlnhr1~7 dlnhr1~8
-------------+--------------------------------------------------------------------------------
dlnhr1980 | 1.0000
dlnhr1981 | -0.6289 1.0000
dlnhr1982 | 0.0402 -0.2306 1.0000
dlnhr1983 | 0.0144 -0.0204 -0.2209 1.0000
dlnhr1984 | -0.0001 -0.0570 -0.1410 -0.4495 1.0000
dlnhr1985 | 0.0393 -0.0320 -0.0827 -0.4035 -0.1969 1.0000
dlnhr1986 | -0.0629 0.0322 0.0112 0.0233 -0.1192 -0.2334 1.0000
dlnhr1987 | 0.0811 -0.0709 -0.0029 -0.0448 -0.0202 0.0093 -0.6231 1.0000
dlnhr1988 | -0.0341 0.0461 -0.0082 -0.1020 0.0261 0.0682 0.2486 -0.6064 1.0000
. corr dlnwg1980 dlnwg1981 dlnwg1982 dlnwg1983 dlnwg1984 dlnwg1985 dlnwg1986 dlnwg1987 dlnwg1988
(obs=532)
| dlnwg1~0 dlnwg1~1 dlnwg1~2 dlnwg1~3 dlnwg1~4 dlnwg1~5 dlnwg1~6 dlnwg1~7 dlnwg1~8
-------------+--------------------------------------------------------------------------------
dlnwg1980 | 1.0000
dlnwg1981 | -0.3507 1.0000
dlnwg1982 | -0.0149 -0.2849 1.0000
dlnwg1983 | 0.0215 -0.0351 -0.3338 1.0000
dlnwg1984 | -0.0112 0.0098 -0.0686 -0.1899 1.0000
dlnwg1985 | -0.0135 -0.0085 0.0141 -0.1179 -0.5560 1.0000
dlnwg1986 | -0.0121 0.0289 -0.0303 0.0725 -0.0526 -0.2665 1.0000
dlnwg1987 | -0.0042 -0.0119 0.0382 -0.0083 0.1200 -0.1482 -0.5043 1.0000
dlnwg1988 | -0.0281 -0.0377 0.0157 -0.0133 -0.0174 -0.0058 -0.0174 -0.2627 1.0000
. corr ufdiff1980 ufdiff1981 ufdiff1982 ufdiff1983 ufdiff1984 ufdiff1985 ufdiff1986 ufdiff1987 ufdiff1988
(obs=532)
| ufd~1980 ufd~1981 ufd~1982 ufd~1983 ufd~1984 ufd~1985 ufd~1986 ufd~1987 ufd~1988
-------------+--------------------------------------------------------------------------------
ufdiff1980 | 1.0000
ufdiff1981 | -0.6263 1.0000
ufdiff1982 | 0.0451 -0.2389 1.0000
ufdiff1983 | 0.0128 -0.0239 -0.2316 1.0000
ufdiff1984 | -0.0010 -0.0588 -0.1291 -0.4804 1.0000
ufdiff1985 | 0.0453 -0.0285 -0.0868 -0.3731 -0.1853 1.0000
ufdiff1986 | -0.0674 0.0321 0.0110 0.0256 -0.1138 -0.2538 1.0000
ufdiff1987 | 0.0811 -0.0711 -0.0077 -0.0533 -0.0081 0.0211 -0.6250 1.0000
ufdiff1988 | -0.0323 0.0499 0.0022 -0.1019 0.0368 0.0543 0.2326 -0.5943 1.0000
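
A possible benchmark for reading the first-difference residual correlations above, offered as a sketch under an assumption the log itself does not make: suppose the levels error u_it were i.i.d. over time with variance sigma^2. Differencing would then induce a lag-one correlation of -1/2 and zero correlation at longer lags, which is a pattern the estimated matrix can be compared against.

```latex
\Delta u_{it} = u_{it} - u_{i,t-1}, \qquad
\operatorname{Var}(\Delta u_{it}) = 2\sigma^2, \qquad
\operatorname{Cov}(\Delta u_{it}, \Delta u_{i,t-1})
  = \operatorname{Cov}(u_{it} - u_{i,t-1},\; u_{i,t-1} - u_{i,t-2}) = -\sigma^2,
\]
\[
\operatorname{Corr}(\Delta u_{it}, \Delta u_{i,t-1}) = \frac{-\sigma^2}{2\sigma^2} = -\tfrac{1}{2},
\qquad
\operatorname{Corr}(\Delta u_{it}, \Delta u_{i,t-k}) = 0 \quad (k \ge 2).
```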
.
. ************ (3) ANALYSIS: CORRELATIONS FOR AN INDIVIDUAL OBSERVATION
.
. * Look at correlations for each individual
.
. ** TRANSFORM DATA FROM LONG FORM TO WIDE FORM FOR INDIVIDUALS
.
. use mom3, replace
. reshape wide lnhr lnwg, i(year) j(id)
(note: j = 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
33
> 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64
65 6
> 6 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97
98
> 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120
121 122 123
> 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145
146 147 1
> 48 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169
170 171 172
> 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194
195 196 1
> 97 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218
219 220 221
> 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243
244 245 2
> 46 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267
268 269 270
> 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292
293 294 2
> 95 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316
317 318 319
> 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341
342 343 3
> 44 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365
366 367 368
> 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390
391 392 3
> 93 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414
415 416 417
> 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439
440 441 4
> 42 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463
464 465 466
> 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488
489 490 4
> 91 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512
513 514 515
> 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532)
Data                               long   ->   wide
-----------------------------------------------------------------------------
Number of obs.                     5320   ->      10
Number of variables                   4   ->    1065
id -> (dropped)
xij variables:
-----------------------------------------------------------------------------
. * Note that i and j are reversed
.
.
. tsset year
time variable: year, 1979 to 1988
.
. * First-order Correlation over T years for the first observation
. corr lnhr1 L.lnhr1
(obs=9)
             |                 L.
             |    lnhr1    lnhr1
-------------+------------------
       lnhr1 |
          -- |   1.0000
          L1 |   0.6378   1.0000
. * First-order Correlation over T years for the second observation
. corr lnhr2 L.lnhr2
(obs=9)
             |                 L.
             |    lnhr2    lnhr2
-------------+------------------
       lnhr2 |
          -- |   1.0000
          L1 |   0.5553   1.0000
. * And so on
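
The per-individual calculation above can be sketched outside Stata. The following is a minimal illustration in Python (standard library only), not part of the original program: the function name `lag1_corr` is hypothetical, and it assumes that `corr x L.x` amounts to the Pearson correlation between the series and its one-period lag over the T-1 usable pairs (9 pairs when T = 10, matching `(obs=9)` above).

```python
from math import sqrt


def lag1_corr(series):
    """Pearson correlation between x_t and x_{t-1} for one individual.

    `series` is that individual's values in time order. With T values
    there are T-1 usable (x_t, x_{t-1}) pairs.
    """
    if len(series) < 3:
        raise ValueError("need at least 3 time periods")
    current = series[1:]    # x_t     for t = 2..T
    lagged = series[:-1]    # x_{t-1} for t = 2..T
    n = len(current)
    mean_c = sum(current) / n
    mean_l = sum(lagged) / n
    sxy = sum((c - mean_c) * (l - mean_l) for c, l in zip(current, lagged))
    sxx = sum((c - mean_c) ** 2 for c in current)
    syy = sum((l - mean_l) ** 2 for l in lagged)
    return sxy / sqrt(sxx * syy)
```

Applying such a function to each individual's ten values of lnhr would give one first-order correlation per person, which is what the "And so on" comment alludes to.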
.
. log close
log: c:\Imbook\bwebpage\Section5\mma21p2panresiduals.txt
log type: text
closed on: 23 May 2005, 11:37:30
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section5\mma21p3panresiduals.txt
log type: text
opened on: 23 May 2005, 13:01:06
.
. ********** OVERVIEW OF MMA21P3PANRESIDUALS.DO **********
.
. * STATA Program
.
. * Chapter 21.3.4 pages 713-15 Residual analysis
. * This program
. * (1) estimates correlations for
. * - dependent variable
. * - regressors variable
. * - residuals from pooled ols [Table 21.3]
. * - residuals from within estimation [Table 21.4]
. * - residuals from random effects estimation
. * (2) separately estimates correlations for
. * - residuals from first differences estimation
. * (3) gets correlations for each individual observation
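
Steps (1) and (2) of the overview share one idea: reshape long-form panel data so that each year becomes a column, then correlate the year columns across individuals. A minimal sketch of that idea in Python (standard library only) follows. It is an illustration under stated assumptions, not the program's own code: the names `pearson` and `year_correlations` are hypothetical, and individuals missing any year are simply dropped.

```python
from collections import defaultdict
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def year_correlations(rows):
    """Cross-year correlation matrix from long-form panel records.

    `rows` is an iterable of (id, year, value) tuples, e.g. a residual
    per person-year. Returns (years, matrix), where matrix[a][b] is the
    correlation across individuals between years[a] and years[b].
    """
    # Long -> wide: one dict of {year: value} per individual.
    wide = defaultdict(dict)
    for pid, year, value in rows:
        wide[pid][year] = value
    years = sorted({y for rec in wide.values() for y in rec})
    # Assumption: keep only individuals observed in every year.
    complete = [rec for rec in wide.values() if all(y in rec for y in years)]
    cols = {y: [rec[y] for rec in complete] for y in years}
    matrix = [[pearson(cols[a], cols[b]) for b in years] for a in years]
    return years, matrix
```

Feeding in the pooled OLS, within, or random effects residuals in turn would produce matrices of the kind reported in the log for upols, ufe and ure.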
.

.
.*
.
. * MOM.dat
.
. ********** SETUP **********
.
. set more off
. version 8.0
.
.
.
. * So id 1 1979, id 1 1980, ..., id 1 1988, id 2 1979, id 2 1980, ...
. * 8 variables:
.
.
. ********** READ DATA **********
.*
. summarize

    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
        lnhr |      5320     7.65743    .2855914       2.77       8.56
        lnwg |      5320    2.609436    .4258924       -.26       4.69
        kids |      5320    1.555827    1.195924          0          6
        ageh |      5320    38.91823    8.450351         22         60
       agesq |      5320    1586.024    689.7759        484       3600
-------------+--------------------------------------------------------
       disab |      5320    .0609023    .2391734          0          1
          id |      5320       266.5    153.5893          1        532
        year |      5320      1983.5    2.872551       1979       1988
.
. ************ (1) ANALYSIS: OBTAIN KEY AUTOCORRELATIONS Tables 21.3, 21.4 **********
.
. ** RUN REGRESSIONS AND GET RESIDUALS OF INTEREST
.
. * pooled ols
. regress lnhr lnwg
      Source |       SS       df       MS
-------------+------------------------------           F(  1,  5318) =   82.22
       Model |  6.60538417     1  6.60538417           Prob > F      =  0.0000
    Residual |  427.225206  5318  .080335691           R-squared     =  0.0152
-------------+------------------------------           Adj R-squared =  0.0150
       Total |  433.830591  5319  .081562435           Root MSE      =  .28344

------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .0827436   .0091251     9.07   0.000     .0648545    .1006326
       _cons |   7.441516   .0241265   308.44   0.000     7.394219    7.488814
------------------------------------------------------------------------------
. predict upols, residuals
.
. * fixed effects (within)
       between = 0.0213                         Number of obs      =      5320
       overall = 0.0152                         Number of groups   =       532

                                                Obs per group: min =        10
                                                               avg =      10.0
                                                               max =        10

                                                F(1,4787)          =     78.96
corr(u_i, Xb)  = -0.1995                        Prob > F           =    0.0000

------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .1676755     .01887     8.89   0.000     .1306816    .2046694
       _cons |   7.219892   .0493434   146.32   0.000     7.123156    7.316628
-------------+----------------------------------------------------------------
     sigma_u |  .18142881
     sigma_e |  .23278339
                                                Prob > F           =    0.0000
. predict ufe, e
.
. * random effects
       between = 0.0213                         Number of obs      =      5320
       overall = 0.0152                         Number of groups   =       532

                                                Obs per group: min =        10
                                                               avg =      10.0
                                                               max =        10

                                                Wald chi2(1)       =     76.64
corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0000

------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .1193322   .0136312     8.75   0.000     .0926155     .146049
       _cons |   7.346041   .0363925   201.86   0.000     7.274713    7.417368
-------------+----------------------------------------------------------------
     sigma_u |  .16124733
     sigma_e |  .23278339
------------------------------------------------------------------------------
. predict ure, e
.
. summarize upols ufe ure

    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
       upols |      5320   -1.27e-10    .2834089  -4.826247    .964581
         ufe |      5320   -5.52e-11    .2208354  -4.003929     1.2719
         ure |      5320   -9.00e-11    .2231118  -4.131111   1.085362
. save mom3, replace

file mom3.dta saved
.
. ** TRANSFORM DATA FROM LONG FORM TO WIDE FORM
.
. keep lnhr lnwg id year upols ufe ure
. reshape wide lnhr lnwg upols ufe ure, i(id) j(year)
(note: j = 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988)
Data                               long   ->   wide
-----------------------------------------------------------------------------
Number of obs.                     5320   ->     532
Number of variables                   7   ->      51
year -> (dropped)
xij variables:
upols -> upols1979 upols1980 ... upols1988
ufe -> ufe1979 ufe1980 ... ufe1988
ure -> ure1979 ure1980 ... ure1988
-----------------------------------------------------------------------------
.
.
. summarize

    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
          id |       532       266.5    153.7194          1        532
    lnhr1979 |       532    7.669342     .249361       5.89       8.54
    lnwg1979 |       532    2.597763    .4188951        .52       4.62
   upols1979 |       532    .0128775    .2517228  -1.764168   .8312218
     ufe1979 |       532    .0138689    .2249175  -1.578105     1.2719
-------------+--------------------------------------------------------
     ure1979 |       532    .0133046    .2200196  -1.618987   1.085362
    lnhr1980 |       532    7.660094    .2691995       5.22       8.34
    lnwg1980 |       532    2.602368    .3945963         .8       4.61
   upols1980 |       532    .0032483    .2679463  -2.354734   .6659743
     ufe1980 |       532    .0038486    .2253673  -2.085636   1.128546
-------------+--------------------------------------------------------
     ure1980 |       532    .0035069    .2238723  -2.089847   .9429754
    lnhr1981 |       532     7.66765    .2105797       6.36        8.4
    lnwg1981 |       532    2.610959    .3870011       1.53       4.53
   upols1981 |       532    .0100939    .2133106  -1.342159   .7582438
     ufe1981 |       532    .0099646     .163407  -1.001722    1.03687
-------------+--------------------------------------------------------
     ure1981 |       532    .0100382    .1596593   -1.02491   .8517824
    lnhr1982 |       532     7.64609    .2427195       5.38       8.31
    lnwg1982 |       532     2.61468    .4014363       1.21       4.61
   upols1982 |       532   -.0117742    .2422735  -2.264238   .6897579
     ufe1982 |       532   -.0122196    .1890237  -1.623214   .7918997
-------------+--------------------------------------------------------
     ure1982 |       532   -.0119661    .1875585  -1.737484   .6666697
    lnhr1983 |       532    7.613064     .382703       2.77       8.37
    lnwg1983 |       532    2.610526    .4111869       1.08       4.62
   upols1983 |       532   -.0444568    .3778255  -4.826247   .7307264
     ufe1983 |       532   -.0445494    .2836351  -3.577253   .5196197
-------------+--------------------------------------------------------
     ure1983 |       532   -.0444967     .294545  -3.804399   .5078294
    lnhr1984 |       532    7.636523    .3316735       3.18       8.44
    lnwg1984 |       532    2.600188    .4621549       -.26       4.65
   upols1984 |       532   -.0201427    .3208512  -4.240003   .8263766
     ufe1984 |       532   -.0193572     .225836  -2.810104   .8327778
-------------+--------------------------------------------------------
     ure1984 |       532   -.0198043    .2378605  -3.140221   .7036628
    lnhr1985 |       532    7.668365    .2597423       5.08       8.54
    lnwg1985 |       532    2.614944    .4347554       1.33       4.69
   upols1985 |       532    .0104785     .259051  -2.503835   .8624523
     ufe1985 |       532    .0100107    .1856724  -1.581894   .7944546
-------------+--------------------------------------------------------
     ure1985 |       532     .010277    .1886509  -1.752727   .7370209
    lnhr1986 |       532    7.659286    .3330862       2.77       8.38
    lnwg1986 |       532    2.602632    .4432807        .07       4.59
   upols1986 |       532    .0024183    .3312105  -4.801424   .7439653
     ufe1986 |       532    .0029962    .2595405  -4.003929   .6384854
-------------+--------------------------------------------------------
     ure1986 |       532    .0026673     .264328  -4.131111   .5111209
    lnhr1987 |       532     7.67406    .2745015       4.38       8.56
    lnwg1987 |       532    2.614699    .4300122       1.28       4.03
   upols1987 |       532    .0161942    .2749153  -3.283269    .964581
     ufe1987 |       532    .0157472    .2141618  -2.817174   1.009662
-------------+--------------------------------------------------------
     ure1987 |       532    .0160016    .2148092  -2.897725   .8441463
    lnhr1988 |       532    7.679831    .2552894       4.79       8.53
    lnwg1988 |       532    2.625602    .4701759       -.22        4.6
   upols1988 |       532    .0210628    .2519891  -2.633313   .9072749
     ufe1988 |       532    .0196898    .2048927   -1.68379   1.123516
-------------+--------------------------------------------------------
     ure1988 |       532    .0204713    .2022375  -1.897506   .9393954
.
. ** OBTAIN THE VARIOUS CORRELATIONS
.
. corr lnhr1979 lnhr1980 lnhr1981 lnhr1982 lnhr1983 lnhr1984 lnhr1985 lnhr1986 lnhr1987 lnhr1988
(obs=532)
| lnhr1979 lnhr1980 lnhr1981 lnhr1982 lnhr1983 lnhr1984 lnhr1985 lnhr1986 lnhr1987
-------------+--------------------------------------------------------------------------------
lnhr1979 | 1.0000
lnhr1980 | 0.3220 1.0000
lnhr1981 | 0.4321 0.4022 1.0000
lnhr1982 | 0.2947 0.3142 0.5670 1.0000
lnhr1983 | 0.2070 0.2324 0.3788 0.4781 1.0000
lnhr1984 | 0.1908 0.2235 0.3141 0.3318 0.6476 1.0000
lnhr1985 | 0.2284 0.3184 0.3999 0.3453 0.3930 0.5839 1.0000
lnhr1986 | 0.1934 0.1931 0.2813 0.2524 0.3162 0.3595 0.4128 1.0000
lnhr1987 | 0.1986 0.3160 0.3322 0.2951 0.3261 0.3464 0.3987 0.3603 1.0000
lnhr1988 | 0.1640 0.2551 0.3081 0.2674 0.2267 0.2537 0.3509 0.5741 0.5248
| lnhr1988
-------------+--------
lnhr1988 | 1.0000
. corr lnwg1979 lnwg1980 lnwg1981 lnwg1982 lnwg1983 lnwg1984 lnwg1985 lnwg1986 lnwg1987 lnwg1988
(obs=532)
| lnwg1979 lnwg1980 lnwg1981 lnwg1982 lnwg1983 lnwg1984 lnwg1985 lnwg1986 lnwg1987
-------------+--------------------------------------------------------------------------------
lnwg1979 | 1.0000
lnwg1980 | 0.8415 1.0000
lnwg1981 | 0.8283 0.8920 1.0000
lnwg1982 | 0.7984 0.8559 0.9015 1.0000
lnwg1983 | 0.7795 0.8408 0.8787 0.9155 1.0000
lnwg1984 | 0.7208 0.7737 0.8102 0.8267 0.8625 1.0000
lnwg1985 | 0.7424 0.7929 0.8290 0.8511 0.8636 0.8620 1.0000
lnwg1986 | 0.7250 0.7714 0.8122 0.8286 0.8530 0.8399 0.9157 1.0000
lnwg1987 | 0.7188 0.7639 0.8029 0.8282 0.8525 0.8681 0.9117 0.9111 1.0000
lnwg1988 | 0.7220 0.7604 0.7900 0.8139 0.8326 0.8373 0.8787 0.8743 0.9101
| lnwg1988
-------------+--------
lnwg1988 | 1.0000

. corr upols1979 upols1980 upols1981 upols1982 upols1983 upols1984 upols1985 upols1986 upols1987 upols1988
(obs=532)
| upo~1979 upo~1980 upo~1981 upo~1982 upo~1983 upo~1984 upo~1985 upo~1986 upo~1987
-------------+--------------------------------------------------------------------------------
upols1979 | 1.0000
upols1980 | 0.3283 1.0000
upols1981 | 0.4442 0.4035 1.0000
upols1982 | 0.3008 0.3140 0.5678 1.0000
upols1983 | 0.2089 0.2298 0.3739 0.4684 1.0000
upols1984 | 0.2025 0.2289 0.3194 0.3360 0.6398 1.0000
upols1985 | 0.2395 0.3246 0.4087 0.3484 0.3898 0.5800 1.0000
upols1986 | 0.1987 0.1903 0.2797 0.2470 0.3109 0.3535 0.3991 1.0000
upols1987 | 0.2091 0.3167 0.3340 0.2877 0.3097 0.3361 0.3941 0.3496 1.0000
upols1988 | 0.1619 0.2456 0.3016 0.2582 0.2083 0.2470 0.3436 0.5545 0.5242
| upo~1988
-------------+--------
upols1988 | 1.0000
. corr ure1979 ure1980 ure1981 ure1982 ure1983 ure1984 ure1985 ure1986 ure1987 ure1988
(obs=532)
| ure1979 ure1980 ure1981 ure1982 ure1983 ure1984 ure1985 ure1986 ure1987
-------------+--------------------------------------------------------------------------------
ure1979 | 1.0000
ure1980 | 0.0778 1.0000
ure1981 | 0.1777 0.0604 1.0000
ure1982 | -0.0250 -0.0519 0.2492 1.0000
ure1983 | -0.2339 -0.2277 -0.1609 0.0587 1.0000
ure1984 | -0.2482 -0.2431 -0.2691 -0.1709 0.3795 1.0000
ure1985 | -0.1842 -0.0919 -0.1054 -0.1581 -0.0939 0.2197 1.0000
ure1986 | -0.1860 -0.2333 -0.2434 -0.2405 -0.1110 -0.0763 -0.0361 1.0000
ure1987 | -0.1665 -0.0481 -0.1580 -0.1904 -0.1710 -0.1506 -0.0646 -0.0553 1.0000
ure1988 | -0.1960 -0.1251 -0.1646 -0.1949 -0.3265 -0.2786 -0.1221 0.2708 0.2379
| ure1988
-------------+--------
ure1988 | 1.0000

. corr ufe1979 ufe1980 ufe1981 ufe1982 ufe1983 ufe1984 ufe1985 ufe1986 ufe1987 ufe1988
(obs=532)
| ufe1979 ufe1980 ufe1981 ufe1982 ufe1983 ufe1984 ufe1985 ufe1986 ufe1987
-------------+--------------------------------------------------------------------------------
ufe1979 | 1.0000
ufe1980 | 0.1017 1.0000
ufe1981 | 0.2082 0.0802 1.0000
ufe1982 | 0.0003 -0.0380 0.2631 1.0000
ufe1983 | -0.2632 -0.2691 -0.2113 0.0089 1.0000
ufe1984 | -0.2594 -0.2698 -0.3004 -0.2037 0.3249 1.0000
ufe1985 | -0.1757 -0.0958 -0.1069 -0.1685 -0.1617 0.1713 1.0000
ufe1986 | -0.1915 -0.2534 -0.2644 -0.2676 -0.1723 -0.1364 -0.0865 1.0000
ufe1987 | -0.1519 -0.0497 -0.1561 -0.2008 -0.2399 -0.2066 -0.0918 -0.0908 1.0000
ufe1988 | -0.1650 -0.1109 -0.1385 -0.1772 -0.3816 -0.3096 -0.1268 0.2420 0.2439
| ufe1988
-------------+--------
ufe1988 | 1.0000
.
. * The following does estimation for just one year
. regress lnhr1979 lnwg1979
      Source |       SS       df       MS              Number of obs =     532
-------------+------------------------------           F(  1,   530) =    0.00
       Model |  .000035507     1  .000035507           Prob > F      =  0.9810
    Residual |  33.0180361   530  .062298181           R-squared     =  0.0000
-------------+------------------------------           Adj R-squared = -0.0019
       Total |  33.0180716   531  .062180926           Root MSE      =   .2496

------------------------------------------------------------------------------
    lnhr1979 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    lnwg1979 |   .0006173   .0258574     0.02   0.981    -.0501783    .0514129
       _cons |   7.667738   .0680375   112.70   0.000     7.534082    7.801395
------------------------------------------------------------------------------
.
. ************ (2) ANALYSIS: OBTAIN AUTOCORRELATIONS FOR FIRST DIFFERENCES
.
. ** SET UP THE DATA
. use mom, clear
. regress dlnhr dlnwg
      Source |       SS       df       MS
-------------+------------------------------           F(  1,  4786) =   26.09
       Model |  2.27870825     1  2.27870825           Prob > F      =  0.0000
    Residual |  417.943979  4786  .087326364           R-squared     =  0.0054
-------------+------------------------------           Adj R-squared =  0.0052
       Total |  420.222687  4787  .087784142           Root MSE      =  .29551

------------------------------------------------------------------------------
       dlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       dlnwg |   .1089851   .0213351     5.11   0.000     .0671584    .1508118
       _cons |   .0008283   .0042712     0.19   0.846    -.0075452    .0092018
------------------------------------------------------------------------------
. predict ufdiff, residuals
. keep dlnhr dlnwg ufdiff id year
. reshape wide dlnhr dlnwg ufdiff, i(id) j(year)
(note: j = 1980 1981 1982 1983 1984 1985 1986 1987 1988)
Data                               long   ->   wide
-----------------------------------------------------------------------------
Number of obs.                     4788   ->     532
Number of variables                   5   ->      28
year -> (dropped)
xij variables:
dlnhr -> dlnhr1980 dlnhr1981 ... dlnhr1988
dlnwg -> dlnwg1980 dlnwg1981 ... dlnwg1988
ufdiff -> ufdiff1980 ufdiff1981 ... ufdiff1988
-----------------------------------------------------------------------------
. summarize

    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
          id |       532       266.5    153.7194          1        532
   dlnhr1980 |       532   -.0092481    .3023508       -2.5       1.71
   dlnwg1980 |       532    .0046053    .2301879      -2.12       1.05
  ufdiff1980 |       532   -.0105783    .3014161  -2.499738   1.690644
   dlnhr1981 |       532    .0075564    .2668644       -1.2       2.32
-------------+--------------------------------------------------------
   dlnwg1981 |       532    .0085902    .1818033       -.79       1.62
  ufdiff1981 |       532    .0057919    .2669213  -1.145188   2.343149
   dlnhr1982 |       532   -.0215602     .212834      -2.06       1.14
   dlnwg1982 |       532    .0037218    .1755574      -1.17        .74
  ufdiff1982 |       532   -.0227941     .213709  -2.036851   1.135902
-------------+--------------------------------------------------------
   dlnhr1983 |       532   -.0330263    .3413969      -4.51   .9899998
   dlnwg1983 |       532   -.0041541    .1673057       -.88   .6399999
  ufdiff1983 |       532   -.0334019    .3398726  -4.419281   .9780819
   dlnhr1984 |       532    .0234586    .3034213      -2.31       2.57
   dlnwg1984 |       532   -.0103383    .2342514      -2.13        .77
-------------+--------------------------------------------------------
  ufdiff1984 |       532    .0237571    .3004287  -2.168058   2.502691
   dlnhr1985 |       532    .0318421    .2772558      -1.46       3.52
   dlnwg1985 |       532    .0147556    .2371054      -1.33       3.06
  ufdiff1985 |       532    .0294057    .2697542  -1.315878   3.185677
   dlnhr1986 |       532   -.0090789    .3270724      -4.79        1.8
-------------+--------------------------------------------------------
   dlnwg1986 |       532    -.012312    .1804162      -1.83       1.04
  ufdiff1986 |       532   -.0085654    .3299129  -4.796278   1.789363
   dlnhr1987 |       532    .0147744    .3470122      -3.24       4.52
   dlnwg1987 |       532    .0120677    .1845692  -.9400001       1.95
  ufdiff1987 |       532    .0126309    .3494111  -3.243008   4.550777
-------------+--------------------------------------------------------
   dlnhr1988 |       532    .0057707    .2587991       -2.5       2.74
   dlnwg1988 |       532    .0109023     .194813       -1.5       1.22
  ufdiff1988 |       532    .0037542    .2576554  -2.337351   2.739172
.
. ** GET THE CORRELATIONS
. corr dlnhr1980 dlnhr1981 dlnhr1982 dlnhr1983 dlnhr1984 dlnhr1985 dlnhr1986 dlnhr1987 dlnhr1988
(obs=532)
| dlnhr1~0 dlnhr1~1 dlnhr1~2 dlnhr1~3 dlnhr1~4 dlnhr1~5 dlnhr1~6 dlnhr1~7 dlnhr1~8
-------------+--------------------------------------------------------------------------------
dlnhr1980 | 1.0000
dlnhr1981 | -0.6289 1.0000
dlnhr1982 | 0.0402 -0.2306 1.0000
dlnhr1983 | 0.0144 -0.0204 -0.2209 1.0000
dlnhr1984 | -0.0001 -0.0570 -0.1410 -0.4495 1.0000
dlnhr1985 | 0.0393 -0.0320 -0.0827 -0.4035 -0.1969 1.0000
dlnhr1986 | -0.0629 0.0322 0.0112 0.0233 -0.1192 -0.2334 1.0000
dlnhr1987 | 0.0811 -0.0709 -0.0029 -0.0448 -0.0202 0.0093 -0.6231 1.0000
dlnhr1988 | -0.0341 0.0461 -0.0082 -0.1020 0.0261 0.0682 0.2486 -0.6064 1.0000
. corr dlnwg1980 dlnwg1981 dlnwg1982 dlnwg1983 dlnwg1984 dlnwg1985 dlnwg1986 dlnwg1987 dlnwg1988
(obs=532)
| dlnwg1~0 dlnwg1~1 dlnwg1~2 dlnwg1~3 dlnwg1~4 dlnwg1~5 dlnwg1~6 dlnwg1~7 dlnwg1~8
-------------+--------------------------------------------------------------------------------
dlnwg1980 | 1.0000
dlnwg1981 | -0.3507 1.0000
dlnwg1982 | -0.0149 -0.2849 1.0000
dlnwg1983 | 0.0215 -0.0351 -0.3338 1.0000
dlnwg1984 | -0.0112 0.0098 -0.0686 -0.1899 1.0000
dlnwg1985 | -0.0135 -0.0085 0.0141 -0.1179 -0.5560 1.0000
dlnwg1986 | -0.0121 0.0289 -0.0303 0.0725 -0.0526 -0.2665 1.0000
dlnwg1987 | -0.0042 -0.0119 0.0382 -0.0083 0.1200 -0.1482 -0.5043 1.0000
dlnwg1988 | -0.0281 -0.0377 0.0157 -0.0133 -0.0174 -0.0058 -0.0174 -0.2627 1.0000
. corr ufdiff1980 ufdiff1981 ufdiff1982 ufdiff1983 ufdiff1984 ufdiff1985 ufdiff1986 ufdiff1987 ufdiff1988
(obs=532)

             | ufd~1980 ufd~1981 ufd~1982 ufd~1983 ufd~1984 ufd~1985 ufd~1986 ufd~1987 ufd~1988
-------------+---------------------------------------------------------------------------------
  ufdiff1980 |   1.0000
  ufdiff1981 |  -0.6263   1.0000
  ufdiff1982 |   0.0451  -0.2389   1.0000
  ufdiff1983 |   0.0128  -0.0239  -0.2316   1.0000
  ufdiff1984 |  -0.0010  -0.0588  -0.1291  -0.4804   1.0000
  ufdiff1985 |   0.0453  -0.0285  -0.0868  -0.3731  -0.1853   1.0000
  ufdiff1986 |  -0.0674   0.0321   0.0110   0.0256  -0.1138  -0.2538   1.0000
  ufdiff1987 |   0.0811  -0.0711  -0.0077  -0.0533  -0.0081   0.0211  -0.6250   1.0000
  ufdiff1988 |  -0.0323   0.0499   0.0022  -0.1019   0.0368   0.0543   0.2326  -0.5943   1.0000

.
. ************ (3) ANALYSIS: CORRELATIONS FOR AN INDIVIDUAL OBSERVATION
.
. * Look at correlations for each individual
.
. ** TRANSFORM DATA FROM LONG FORM TO WIDE FORM FOR INDIVIDUALS
.
. use mom3, replace
. reshape wide lnhr lnwg, i(year) j(id)
(note: j = 1 2 3 4 5 6 7 8 9 10 ... 530 531 532)

Data                               long   ->   wide
-----------------------------------------------------------------------------
Number of obs.                     5320   ->      10
Number of variables                   4   ->    1065
                                     id   ->   (dropped)
xij variables:
-----------------------------------------------------------------------------
. * Note that i and j are reversed
.
.
. tsset year
.
. * First-order Correlation over T years for the first observation
(obs=9)

             |                L.
             |    lnhr1    lnhr1
-------------+------------------
       lnhr1 |
          -- |   1.0000
          L1 |   0.6378   1.0000

. * First-order Correlation over T years for the second observation
(obs=9)

             |                L.
             |    lnhr2    lnhr2
-------------+------------------
       lnhr2 |
          -- |   1.0000
          L1 |   0.5553   1.0000

. * And so on
.
. log close
log: c:\Imbook\bwebpage\Section5\mma21p3panresiduals.txt
log type: text
closed on: 23 May 2005, 13:01:15
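
The commands that produced the two per-individual correlation tables are not echoed in the log above. The following is a minimal sketch of one way to obtain them, assuming the wide-form data created by the reshape (variables lnhr1, lnhr2, ...) are still in memory; the loop over the first two individuals is illustrative and is not taken from the original do-file.

```stata
* Sketch only: assumes the wide-form data (lnhr1, lnhr2, ...) are in memory
tsset year
forvalues j = 1/2 {
    * correlation of individual j's log hours with its own one-year lag
    corr lnhr`j' L.lnhr`j'
}
```

Each correlation uses 9 observations, because the lag is missing in the first of the 10 years.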
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section5\mma21p4pangls.txt
log type: text
opened on: 23 May 2005, 11:38:01
.
. ********** OVERVIEW OF MMA21P4PANGLS.DO **********
.
. * STATA Program
.
. * Chapter 21.5.5 page 725 Table 21.6 Pooled panel OLS and GLS
. * Demonstrate pooled GLS estimation using XTGEE
. * (1) No correlation (i.e. pooled OLS)
. * (2) Equicorrelated
. * (3) AR1
. * (4) Unrestricted
. * Standard errors are default plus panel bootstrap
.
. * MOM.dat
.
.*
.
. ********** SETUP **********
.
. set more off
. version 8.0
.
.
.
. * So id 1 1979, id 1 1980, ..., id 1 1988, id 2 1979, id 2 1980, ...
. * 8 variables:
.
.
. ********** READ DATA AND SUMMARIZE **********
.*
.
. describe

Contains data
  obs:         5,320
 vars:             8
 size:
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
lnhr            float  %9.0g
lnwg            float  %9.0g
kids            float  %9.0g
ageh            float  %9.0g
agesq           float  %9.0g
disab           float  %9.0g
id              float  %9.0g
year            float  %9.0g
-------------------------------------------------------------------------------
Sorted by:

. summarize

    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
        lnhr |      5320     7.65743    .2855914       2.77       8.56
        lnwg |      5320    2.609436    .4258924       -.26       4.69
        kids |      5320    1.555827    1.195924          0          6
        ageh |      5320    38.91823    8.450351         22         60
       agesq |      5320    1586.024    689.7759        484       3600
-------------+--------------------------------------------------------
       disab |      5320    .0609023    .2391734          0          1
          id |      5320       266.5    153.5893          1        532
        year |      5320      1983.5    2.872551       1979       1988
.
. ********** DEFINE GLOBALS INCLUDING REGRESSOR LIST *********
.
. * Table 21.6 used 500
. global nreps 500
.
. ********* ANALYSIS: DIFFERENT POOLED GLS ESTIMATES USING XTGEE *********
.
. *** (1) NO ERROR CORRELATION - SAME AS POOLED OLS Table 21.7 first column
.
. * Default standard error
. xtgee lnhr lnwg, corr(independent) i(id)

                                                Number of obs      =      5320
Group variable:                         id      Number of groups   =       532
Link:                             identity      Obs per group: min =        10
Family:                           Gaussian                     avg =      10.0
Correlation:                   independent                     max =        10
                                                Wald chi2(1)       =     82.25
Scale parameter:                  .0803055      Prob > chi2        =    0.0000

Pearson chi2(5320):                 427.23      Deviance           =    427.23
Dispersion (Pearson):             .0803055      Dispersion         =  .0803055

------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .0827436   .0091234     9.07   0.000      .064862    .1006251
       _cons |   7.441516   .0241219   308.50   0.000     7.394238    7.488795
------------------------------------------------------------------------------

. estimates store ind

. * "Robust" standard error
. xtgee lnhr lnwg, corr(independent) i(id) robust

                                                Number of obs      =      5320
Group variable:                         id      Number of groups   =       532
Link:                             identity      Obs per group: min =        10
Family:                           Gaussian                     avg =      10.0
Correlation:                   independent                     max =        10
                                                Wald chi2(1)       =      7.99
Scale parameter:                  .0803055      Prob > chi2        =    0.0047

Pearson chi2(5320):                 427.23      Deviance           =    427.23
Dispersion (Pearson):             .0803055      Dispersion         =  .0803055

                             (standard errors adjusted for clustering on id)
------------------------------------------------------------------------------
             |             Semi-robust
        lnhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .0827436   .0292684     2.83   0.005     .0253785    .1401086
       _cons |   7.441516   .0795795    93.51   0.000     7.285543    7.597489
------------------------------------------------------------------------------

. estimates store indrob

. set seed 10001
. bootstrap "xtgee lnhr lnwg, corr(independent) i(id)" "_b[lnwg] _b[_cons]", cluster(id) reps($nreps
> ) level(95)

command:      xtgee lnhr lnwg , corr(independent) i(id)
statistics:   _bs_1      = _b[lnwg]
              _bs_2      = _b[_cons]

                                                  Number of obs    =      5320
                                                  N of clusters    =       532
                                                  Replications     =       500

------------------------------------------------------------------------------
Variable     |  Reps  Observed       Bias  Std. Err.   [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _bs_1 |    72  .0827435  -.0007854  .0317837    .0193687   .1461184  (N)
             |                                         .0090096   .1413525  (P)
             |                                         .0154833   .1413525  (BC)
       _bs_2 |    72  7.441516   .0024828  .0861859    7.269667   7.613366  (N)
             |                                          7.27043   7.635125  (P)
             |                                          7.27043   7.631187  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected

. matrix indbootse = e(se)
.
. *** (2) EQUICORRELATED - SAME AS RE-GLS Table 21.7 second column
.
. xtgee lnhr lnwg, corr(exchangeable) i(id)

                                                Number of obs      =      5320
Group variable:                         id      Number of groups   =       532
Link:                             identity      Obs per group: min =        10
Family:                           Gaussian                     avg =      10.0
Correlation:                  exchangeable                     max =        10
                                                Wald chi2(1)       =     76.70
Scale parameter:                  .0805511      Prob > chi2        =    0.0000

------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .1195474   .0136507     8.76   0.000     .0927925    .1463023
       _cons |   7.345479   .0364481   201.53   0.000     7.274042    7.416916
------------------------------------------------------------------------------

. estimates store exch

. xtgee lnhr lnwg, corr(exchangeable) i(id) robust

                                                Number of obs      =      5320
Group variable:                         id      Number of groups   =       532
Link:                             identity      Obs per group: min =        10
Family:                           Gaussian                     avg =      10.0
Correlation:                  exchangeable                     max =        10
                                                Wald chi2(1)       =      5.38
Scale parameter:                  .0805511      Prob > chi2        =    0.0204

------------------------------------------------------------------------------
             |             Semi-robust
        lnhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .1195474   .0515426     2.32   0.020     .0185258     .220569
       _cons |   7.345479   .1379494    53.25   0.000     7.075103    7.615855
------------------------------------------------------------------------------

. estimates store exchrob

. set seed 10001
. bootstrap "xtgee lnhr lnwg, corr(exchangeable) i(id)" "_b[lnwg] _b[_cons]", cluster(id) reps($nrep
> s) level(95)

command:      xtgee lnhr lnwg , corr(exchangeable) i(id)
statistics:   _bs_1      = _b[lnwg]
              _bs_2      = _b[_cons]

                                                  Number of obs    =      5320
                                                  N of clusters    =       532
                                                  Replications     =       500

------------------------------------------------------------------------------
Variable     |  Reps  Observed       Bias  Std. Err.   [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _bs_1 |    72  .1195474   .0068755   .059895    .0001201   .2389747  (N)
             |                                         .0256504   .2573869  (P)
             |                                         .0256504   .2286118  (BC)
       _bs_2 |    72  7.345479  -.0179736  .1585556    7.029328    7.66163  (N)
             |                                         6.990765   7.605015  (P)
             |                                         7.066358   7.605015  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected

. matrix exchbootse = e(se)
.
. *** (3) AR(1) Table 21.7 third column
.
. xtgee lnhr lnwg, corr(ar 1) i(id) t(year)

                                                Number of obs      =      5320
Group and time vars:               id year      Number of groups   =       532
Link:                             identity      Obs per group: min =        10
Family:                           Gaussian                     avg =      10.0
Correlation:                         AR(1)                     max =        10
                                                Wald chi2(1)       =     46.73
Scale parameter:                  .0803129      Prob > chi2        =    0.0000

------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .0843777   .0123428     6.84   0.000     .0601862    .1085691
       _cons |   7.439893   .0327698   227.04   0.000     7.375665     7.50412
------------------------------------------------------------------------------

. estimates store ar1

. xtgee lnhr lnwg, corr(ar 1) i(id) t(year) robust

                                                Number of obs      =      5320
Group and time vars:               id year      Number of groups   =       532
Link:                             identity      Obs per group: min =        10
Family:                           Gaussian                     avg =      10.0
Correlation:                         AR(1)                     max =        10
                                                Wald chi2(1)       =      5.15
Scale parameter:                  .0803129      Prob > chi2        =    0.0232

------------------------------------------------------------------------------
             |             Semi-robust
        lnhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .0843777   .0371764     2.27   0.023     .0115133    .1572421
       _cons |   7.439893    .100308    74.17   0.000     7.243293    7.636493
------------------------------------------------------------------------------

. estimates store ar1rob

. set seed 10001
. bootstrap "xtgee lnhr lnwg, corr(ar 1) i(id)" "_b[lnwg] _b[_cons]", cluster(id) reps($nreps) level
> (95)

command:      xtgee lnhr lnwg , corr(ar 1) i(id)
statistics:   _bs_1      = _b[lnwg]
              _bs_2      = _b[_cons]

                                                  Number of obs    =      5320
                                                  N of clusters    =       532
                                                  Replications     =       500

------------------------------------------------------------------------------
Variable     |  Reps  Observed       Bias  Std. Err.   [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _bs_1 |   500  .0843777  -.0025819   .050393    -.014631   .1833863  (N)
             |                                        -.0060264    .184696  (P)
             |                                        -.0031327   .1860251  (BC)
       _bs_2 |   500  7.439893   .0077122   .136732    7.171251   7.708534  (N)
             |                                         7.165532   7.686645  (P)
             |                                         7.157923   7.676162  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected

. matrix ar1bootse = e(se)
.
. *** (4) HOMOSKEDASTIC UNSTRUCTURED Table 21.7 fourth column
.
. xtgee lnhr lnwg, corr(unstructured) i(id) t(year)

                                                Number of obs      =      5320
Group and time vars:               id year      Number of groups   =       532
Link:                             identity      Obs per group: min =        10
Family:                           Gaussian                     avg =      10.0
Correlation:                  unstructured                     max =        10
                                                Wald chi2(1)       =     43.67
Scale parameter:                  .0803575      Prob > chi2        =    0.0000

------------------------------------------------------------------------------
        lnhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .0910023   .0137712     6.61   0.000     .0640113    .1179933
       _cons |   7.426262   .0366836   202.44   0.000     7.354363     7.49816
------------------------------------------------------------------------------

. estimates store unstr

. xtgee lnhr lnwg, corr(unstructured) i(id) t(year) robust

                                                Number of obs      =      5320
Group and time vars:               id year      Number of groups   =       532
Link:                             identity      Obs per group: min =        10
Family:                           Gaussian                     avg =      10.0
Correlation:                  unstructured                     max =        10
                                                Wald chi2(1)       =      3.29
Scale parameter:                  .0803575      Prob > chi2        =    0.0695

------------------------------------------------------------------------------
             |             Semi-robust
        lnhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnwg |   .0910023   .0501344     1.82   0.069    -.0072594     .189264
       _cons |   7.426262   .1328255    55.91   0.000     7.165929    7.686595
------------------------------------------------------------------------------

. estimates store unstrrob

. set seed 10001
. /* For some reason the following did not work
> bootstrap "xtgee lnhr lnwg, corr(unstructured) i(id)" "_b[lnwg] _b[_cons]", cluster(id) reps($nrep
> s) level(95)
> matrix unstrbootse = e(se)
> */
.
. ********** DISPLAY RESULTS IN TABLE 21.7 page 725 **********
.
. * Standard error using iid errors and in some cases panel
. estimates table ind indrob exch exchrob, /*

------------------------------------------------------------------
    Variable |    ind        indrob        exch       exchrob
-------------+----------------------------------------------------
        lnwg |     0.083       0.083       0.120       0.120
             |     0.009       0.029       0.014       0.052
       _cons |     7.442       7.442       7.345       7.345
             |     0.024       0.080       0.036       0.138
-------------+----------------------------------------------------
           N |  5320.000    5320.000    5320.000    5320.000
          ll |
          r2 |
         tss |
         rss |
         mss |
        rmse |
        df_r |
------------------------------------------------------------------
                                                      legend: b/se

. estimates table ar1 ar1rob unstr unstrrob, /*

------------------------------------------------------------------
    Variable |    ar1        ar1rob       unstr      unstrrob
-------------+----------------------------------------------------
        lnwg |     0.084       0.084       0.091       0.091
             |     0.012       0.037       0.014       0.050
       _cons |     7.440       7.440       7.426       7.426
             |     0.033       0.100       0.037       0.133
-------------+----------------------------------------------------
           N |  5320.000    5320.000    5320.000    5320.000
          ll |
          r2 |
         tss |
         rss |
         mss |
        rmse |
        df_r |
------------------------------------------------------------------
                                                      legend: b/se

.
. * Standard errors using panel bootstrap (regular bootstrap for between)
. matrix list indbootse

indbootse[1,2]
        _bs_1      _bs_2
se  .03178369   .0861859

. matrix list exchbootse

exchbootse[1,2]
        _bs_1      _bs_2
se  .05989501  .15855561

. matrix list ar1bootse

ar1bootse[1,2]
        _bs_1      _bs_2
se  .05039303  .13673201

. matrix list unstrbootse
matrix unstrbootse not found
r(111);
end of do-file
r(111);
. exit, clear
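
The do-file ends with r(111) because the unstructured bootstrap was commented out, so the matrix unstrbootse was never created. The following is a hedged sketch of two ways around this, neither taken from the original do-file. Part (a) simply guards the listing. Part (b) assumes a later Stata release than the version 8.0 used in this log, one that provides xtset and the vce() option on xtgee; whether it avoids the problem the author ran into is not known from the log.

```stata
* Sketch only; not part of the original do-file.

* (a) Do not abort the do-file if the matrix was never created
capture matrix list unstrbootse
if _rc != 0 {
    display "unstrbootse not available: the unstructured bootstrap was skipped"
}

* (b) Assumes a later Stata release with xtset and vce(bootstrap).
*     Whole panels (ids) are resampled, as in the cluster(id) bootstraps above.
xtset id year
xtgee lnhr lnwg, corr(unstructured) vce(bootstrap, reps(500) seed(10001))
```

With (b) the bootstrap standard errors appear directly in the coefficient table, so no separate matrix needs to be stored.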
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section5\mma22p1pangmm.txt
log type: text
opened on: 23 May 2005, 11:52:35
.
. ********** OVERVIEW OF MMA22P1PANGMM.DO **********
.
. * STATA Program
.
. * Panel 2SLS and GMM for a linear model with endogenous regressors
. * Fixed effects are first differenced.
. * Then 2SLS and GMM applied to first differenced model.
.
. * Program derives Table 22.2 and does other analysis in section
. * (1) pooled OLS
. * (2) 2SLS in base instruments case
. * (3) 2SLS in stacked instruments case
. * (4) 2SGMM in base instruments case
. * (5) 2SGMM in stacked instruments case
. * (6) F-statistics for weak instruments
. * (7) Partial R-squared for weak instruments
.
. * The pooled OLS and 2SLS replicate Ziliak (1997) Table 1 Top left-hand corner
. * for Base Case (9 instruments) and first Stacked Case (72 instruments)
. * 2SLS in first differences where both 1979 and 1980 are dropped
.
. * MOMprecise.dat
.
. * NOTE: This data set is different from MOM.dat used in chapter 21.
. *       The data here have more significant digits,
. *       leading to some differences in the resulting coefficient estimates.
.
. ********** SETUP **********
.
. set more off
. version 8.0
.
.

. * NOTE: Data originally posted on the JBES website were given to only 2 decimal places
. * Here more accurate data are used (the same as the data used by Ziliak)
. * Ziliak used Gauss. Here Stata is used.
.
. * So id 1 1979, id 1 1980, ..., id 1 1988, id 2 1979, id 2 1980, ...
. * 8 variables:
.
. * File MOMprecise.dat has more significant digits than file MOM.dat
. * (the version of the data posted at the JBES website, used in chapter 21)
.
. ********** READ DATA **********
.
. infile lnhr lnwg kids ageh agesq disab id year using MOMprecise.dat
. describe

Contains data
  obs:         5,320
 vars:             8
 size:
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
lnhr            float  %9.0g
lnwg            float  %9.0g
kids            float  %9.0g
ageh            float  %9.0g
agesq           float  %9.0g
disab           float  %9.0g
id              float  %9.0g
year            float  %9.0g
-------------------------------------------------------------------------------
Sorted by:

. summarize

    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
        lnhr |      5320    7.657458      .28564   2.772589   8.556414
        lnwg |      5320    2.609477    .4260333  -.2613648   4.686474
        kids |      5320    1.555827    1.195924          0          6
        ageh |      5320    38.91823    8.450351         22         60
       agesq |      5320    1586.024    689.7759        484       3600
-------------+--------------------------------------------------------
       disab |      5320    .0609023    .2391734          0          1
          id |      5320       266.5    153.5893          1        532
        year |      5320      1983.5    2.872551       1979       1988
.
. ********** FIRST DIFFERENCES REGRESSION **********
.
. * Stata has no command for first differences regression
. * Though may be possible with xtivreg
.
. * The following only works if each observation is (i,t)
. * and within i the data are ordered by t
. gen dkids = kids - kids[_n-1]
. gen dageh = ageh - ageh[_n-1]
. gen dagesq = agesq - agesq[_n-1]
. gen ddisab = disab - disab[_n-1]
.
. * The regression is of
. * dlnhr on constant dlnwg dkids dageh dagesq ddisab
.
. ********** GENERATE THE INSTRUMENTS **********
.
. * The endogenous variable is dlnwg. The others are exogenous.
. * It is not clear whether current values of the exogenous variables are used as instruments.
. * I would think so but there is no mention in the paper of this.
. * In addition Table 1 considers various instrument sets
. * We consider the first (first rows) and second (second rows)
.
. * (1) Use the levels of the exogenous regressors lagged one and two periods
. * and the level of the endogenous regressor lagged two periods
. * This gives nine instruments

. gen kidsl1 = kids[_n-1]
. gen kidsl2 = kids[_n-2]
. gen agehl1 = ageh[_n-1]
. gen agehl2 = ageh[_n-2]
. gen agesql1 = agesq[_n-1]
. gen agesql2 = agesq[_n-2]
. gen disabl1 = disab[_n-1]
. gen disabl2 = disab[_n-2]
. gen lnwgl2 = lnwg[_n-2]
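
Because `x[_n-1]` refers to the previous row of the data set, the first observation of each individual is differenced against the last observation of the preceding individual. The summary statistics later in this log show 5,319 nonmissing differences, whereas differencing within individuals would leave 532 x 9 = 4,788. The log sidesteps the problem by dropping 1979 and 1980 before estimation, so the remaining differences and lags all lie within one individual. A minimal sketch of a panel-aware alternative, to be used in place of the gen statements above, is given below; it assumes the data can be declared a panel with tsset and is not taken from the original do-file.

```stata
* Sketch only: panel-aware differences and lags via time-series operators
tsset id year

* D.x is x minus its own previous-year value within id (missing in 1979)
gen dkids  = D.kids
gen ddisab = D.disab

* L.x and L2.x are the first and second lags within id
gen kidsl1 = L.kids
gen kidsl2 = L2.kids
gen lnwgl2 = L2.lnwg
```

With this version the 1979 differences and the 1979 and 1980 second lags are missing rather than contaminated, so any command that uses them drops those rows automatically.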
.
. * (2) Use the same instruments as in (1) except now stacked so that
. * now the instrument matrix is block-diagonal.
. * This gives nine instruments times number of time periods.
. * The original data are 1979 to 1988.
. * We will eventually drop the first two years, since two years are lost to the lags.
. * For shorthand, call the instruments z1 to z9 and the years 1981 to 1988 y1 to y8.
. * Pad out to 8 x 9 = 72 instruments for 8 years
.
. program define makeZ
  1. forvalues i=1(1)8 {
  2. gen z1y`i'=0
  3. replace z1y`i' = ageh[_n-1] if year==1980+`i'
  4. gen z2y`i'=0
  5. replace z2y`i' = agesq[_n-1] if year==1980+`i'
  6. gen z3y`i'=0
  7. replace z3y`i' = kids[_n-1] if year==1980+`i'
  8. gen z4y`i'=0
  9. replace z4y`i' = disab[_n-1] if year==1980+`i'
 10. gen z5y`i'=0
 11. replace z5y`i' = ageh[_n-2] if year==1980+`i'
 12. gen z6y`i'=0
 13. replace z6y`i' = agesq[_n-2] if year==1980+`i'
 14. gen z7y`i'=0
 15. replace z7y`i' = kids[_n-2] if year==1980+`i'
 16. gen z8y`i'=0
 17. replace z8y`i' = disab[_n-2] if year==1980+`i'
 18. gen z9y`i'=0
 19. replace z9y`i' = lnwg[_n-2] if year==1980+`i'
 20. }
 21. end
. quietly makeZ
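
The makeZ program writes out each of the nine instruments by hand. The following is a minimal sketch of the same construction with nested loops, to be run in place of makeZ. The local macro names k, lag, and v are illustrative and do not appear in the original do-file; the variable order (ageh, agesq, kids, disab at lag 1, then at lag 2, then lnwg at lag 2) follows the program above.

```stata
* Sketch only: builds the same 72 variables z1y1 ... z9y8 as makeZ
local k = 0
foreach lag in 1 2 {
    foreach v in ageh agesq kids disab {
        local k = `k' + 1
        forvalues i = 1/8 {
            * instrument k is nonzero only in year 1980+i (block-diagonal Z)
            gen z`k'y`i' = 0
            replace z`k'y`i' = `v'[_n-`lag'] if year == 1980 + `i'
        }
    }
}

* z9: the endogenous regressor (log wage) lagged two periods
forvalues i = 1/8 {
    gen z9y`i' = 0
    replace z9y`i' = lnwg[_n-2] if year == 1980 + `i'
}
```

Like makeZ, this relies on the data being sorted by id and then year, so that `[_n-1]` and `[_n-2]` point to the same individual for every year from 1981 on.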
. sum

    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
        lnhr |      5320    7.657458      .28564   2.772589   8.556414
        lnwg |      5320    2.609477    .4260333  -.2613648   4.686474
        kids |      5320    1.555827    1.195924          0          6
        ageh |      5320    38.91823    8.450351         22         60
       agesq |      5320    1586.024    689.7759        484       3600
-------------+--------------------------------------------------------
       disab |      5320    .0609023    .2391734          0          1
          id |      5320       266.5    153.5893          1        532
        year |      5320      1983.5    2.872551       1979       1988
       dlnhr |      5319    .0000192    .3016322  -4.787492   4.521109
       dlnwg |      5319    .0001115    .2718437   -2.32463   3.062298
-------------+--------------------------------------------------------
       dkids |      5319    -.000188    .6629109         -5          6
       dageh |      5319    .0030081    4.611209        -36         19
      dagesq |      5319    .2105659    371.0841      -3024       1577
      ddisab |      5319           0    .2429913         -1          1
      kidsl1 |      5319    1.555932    1.196012          0          6
-------------+--------------------------------------------------------
      kidsl2 |      5318    1.556036    1.196101          0          6
      agehl1 |      5319    38.91747     8.45096         22         60
      agehl2 |      5318    38.91707    8.451706         22         60
     agesql1 |      5319    1585.974    689.8313        484       3600
     agesql2 |      5318    1585.957    689.8949        484       3600
-------------+--------------------------------------------------------
     disabl1 |      5319    .0609137    .2391944          0          1
     disabl2 |      5318    .0609252    .2392155          0          1
      lnwgl2 |      5318    2.609513    .4261095  -.2613648   4.686474
        z1y1 |      5320    3.544549    10.92972          0         52
        z2y1 |      5320    132.0002    438.9997          0       2704
-------------+--------------------------------------------------------
        z3y1 |      5320    .1567669    .5978681          0          6
        z4y1 |      5320    .0048872    .0697442          0          1
        z5y1 |      5320    3.445489    10.64043          0         51
        z6y1 |      5320    125.0688    418.0247          0       2601
        z7y1 |      5320    .1520677    .5938801          0          6
-------------+--------------------------------------------------------
        z8y1 |      5320    .0054511    .0736372          0          1
        z9y1 |      5320    .2597756    .7905791          0    4.61522
        z1y2 |      5320     3.63891    11.20265          0         53
        z2y2 |      5320    138.7175    458.8032          0       2809
        z3y2 |      5320    .1590226    .6057112          0          6
-------------+--------------------------------------------------------
        z4y2 |      5320    .0039474    .0627099          0          1
        z5y2 |      5320    3.544549    10.92972          0         52
        z6y2 |      5320    132.0002    438.9997          0       2704
        z7y2 |      5320    .1567669    .5978681          0          6
        z8y2 |      5320    .0048872    .0697442          0          1
-------------+--------------------------------------------------------
        z9y2 |      5320    .2602349    .7906729          0    4.60976
        z1y3 |      5320    3.737218    11.49054          0         54
        z2y3 |      5320    145.9744    480.6547          0       2916
        z3y3 |      5320    .1637218    .6172305          0          6
        z4y3 |      5320    .0052632    .0723633          0          1
-------------+--------------------------------------------------------
        z5y3 |      5320     3.63891    11.20265          0         53
        z6y3 |      5320    138.7175    458.8032          0       2809
        z7y3 |      5320    .1590226    .6057112          0          6
        z8y3 |      5320    .0039474    .0627099          0          1
        z9y3 |      5320    .2610997    .7928738          0    4.52656
-------------+--------------------------------------------------------
        z1y4 |      5320     3.83985    11.79093          0         55
        z2y4 |      5320    153.7444    503.9576          0       3025
        z3y4 |      5320    .1620301    .6132476          0          6
        z4y4 |      5320    .0037594    .0612043          0          1
        z5y4 |      5320    3.737218    11.49054          0         54
-------------+--------------------------------------------------------
        z6y4 |      5320    145.9744    480.6547          0       2916
        z7y4 |      5320    .1637218    .6172305          0          6
        z8y4 |      5320    .0052632    .0723633          0          1
        z9y4 |      5320    .2614749    .7946793          0   4.607767
        z1y5 |      5320    3.940414    12.08767          0         56
-------------+--------------------------------------------------------
        z2y5 |      5320    161.6111    527.9522          0       3136
        z3y5 |      5320    .1595865     .608814          0          6
        z4y5 |      5320     .006015    .0773303          0          1
        z5y5 |      5320     3.83985    11.79093          0         55
        z6y5 |      5320    153.7444    503.9576          0       3025
-------------+--------------------------------------------------------
        z7y5 |      5320    .1620301    .6132476          0          6
        z8y5 |      5320    .0037594    .0612043          0          1
        z9y5 |      5320    .2610663    .7939903          0   4.618777
        z1y6 |      5320    4.047368    12.40128          0         57
        z2y6 |      5320     170.144    553.5552          0       3249
-------------+--------------------------------------------------------
        z3y6 |      5320    .1575188    .6042401          0          5
        z4y6 |      5320    .0065789    .0808511          0          1
        z5y6 |      5320    3.940414    12.08767          0         56
        z6y6 |      5320    161.6111    527.9522          0       3136
        z7y6 |      5320    .1595865     .608814          0          6
-------------+--------------------------------------------------------
        z8y6 |      5320     .006015    .0773303          0          1
        z9y6 |      5320    .2600271    .7937085  -.2613648   4.648325
        z1y7 |      5320    4.140602    12.67474          0         58
        z2y7 |      5320    177.7635    576.2959          0       3364
        z3y7 |      5320    .1537594    .5983346          0          5
-------------+--------------------------------------------------------
        z4y7 |      5320     .006203    .0785219          0          1
        z5y7 |      5320    4.047368    12.40128          0         57
        z6y7 |      5320     170.144    553.5552          0       3249
        z7y7 |      5320    .1575188    .6042401          0          5
        z8y7 |      5320    .0065789    .0808511          0          1
-------------+--------------------------------------------------------
        z9y7 |      5320     .261494    .7964894          0   4.686474
        z1y8 |      5320    4.240414    12.96638          0         59
        z2y8 |      5320    186.0765    600.9297          0       3481
        z3y8 |      5320    .1494361    .5901043          0          5
        z4y8 |      5320    .0090226    .0945665          0          1
-------------+--------------------------------------------------------
        z5y8 |      5320    4.140602    12.67474          0         58
        z6y8 |      5320    177.7635    576.2959          0       3364
        z7y8 |      5320    .1537594    .5983346          0          5
        z8y8 |      5320     .006203    .0785219          0          1
        z9y8 |      5320    .2602616    .7933278          0     4.5933
.
. * Define variable lists for regressors X and instruments Z
.
. global XREG dlnwg dkids dageh dagesq ddisab
.
. global ZBASECASE kidsl1 agehl1 agesql1 disabl1 agehl2 kidsl2 agesql2 disabl2 lnwgl2
.
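. global ZSTACKED z1y1 z2y1 z3y1 z4y1 z5y1 z6y1 z7y1 z8y1 z9y1 /*
> */ z1y2 z2y2 z3y2 z4y2 z5y2 z6y2 z7y2 z8y2 z9y2 /*
> */ z1y3 z2y3 z3y3 z4y3 z5y3 z6y3 z7y3 z8y3 z9y3 /*
> */ z1y4 z2y4 z3y4 z4y4 z5y4 z6y4 z7y4 z8y4 z9y4 /*
> */ z1y5 z2y5 z3y5 z4y5 z5y5 z6y5 z7y5 z8y5 z9y5 /*
> */ z1y6 z2y6 z3y6 z4y6 z5y6 z6y6 z7y6 z8y6 z9y6 /*
> */ z1y7 z2y7 z3y7 z4y7 z5y7 z6y7 z7y7 z8y7 z9y7 /*
> */ z1y8 z2y8 z3y8 z4y8 z5y8 z6y8 z7y8 z8y8 z9y8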
.
. * Define variable lists for weak instruments test which drops
.
. save momfdiffgmm, replace
file momfdiffgmm.dta saved
. sum

    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
        lnhr |      5320    7.657458      .28564   2.772589   8.556414
        lnwg |      5320    2.609477    .4260333  -.2613648   4.686474
        kids |      5320    1.555827    1.195924          0          6
        ageh |      5320    38.91823    8.450351         22         60
       agesq |      5320    1586.024    689.7759        484       3600
-------------+--------------------------------------------------------
       disab |      5320    .0609023    .2391734          0          1
          id |      5320       266.5    153.5893          1        532
        year |      5320      1983.5    2.872551       1979       1988
       dlnhr |      5319    .0000192    .3016322  -4.787492   4.521109
       dlnwg |      5319    .0001115    .2718437   -2.32463   3.062298
-------------+--------------------------------------------------------
       dkids |      5319    -.000188    .6629109         -5          6
       dageh |      5319    .0030081    4.611209        -36         19
      dagesq |      5319    .2105659    371.0841      -3024       1577
      ddisab |      5319           0    .2429913         -1          1
      kidsl1 |      5319    1.555932    1.196012          0          6
-------------+--------------------------------------------------------
      kidsl2 |      5318    1.556036    1.196101          0          6
      agehl1 |      5319    38.91747     8.45096         22         60
      agehl2 |      5318    38.91707    8.451706         22         60
     agesql1 |      5319    1585.974    689.8313        484       3600
     agesql2 |      5318    1585.957    689.8949        484       3600
-------------+--------------------------------------------------------
     disabl1 |      5319    .0609137    .2391944          0          1
     disabl2 |      5318    .0609252    .2392155          0          1
      lnwgl2 |      5318    2.609513    .4261095  -.2613648   4.686474
        z1y1 |      5320    3.544549    10.92972          0         52
        z2y1 |      5320    132.0002    438.9997          0       2704
-------------+--------------------------------------------------------
        z3y1 |      5320    .1567669    .5978681          0          6
        z4y1 |      5320    .0048872    .0697442          0          1
        z5y1 |      5320    3.445489    10.64043          0         51
        z6y1 |      5320    125.0688    418.0247          0       2601
        z7y1 |      5320    .1520677    .5938801          0          6
-------------+--------------------------------------------------------
        z8y1 |      5320    .0054511    .0736372          0          1
        z9y1 |      5320    .2597756    .7905791          0    4.61522
        z1y2 |      5320     3.63891    11.20265          0         53
        z2y2 |      5320    138.7175    458.8032          0       2809
        z3y2 |      5320    .1590226    .6057112          0          6
-------------+--------------------------------------------------------
        z4y2 |      5320    .0039474    .0627099          0          1
        z5y2 |      5320    3.544549    10.92972          0         52
        z6y2 |      5320    132.0002    438.9997          0       2704
        z7y2 |      5320    .1567669    .5978681          0          6
        z8y2 |      5320    .0048872    .0697442          0          1
-------------+--------------------------------------------------------
        z9y2 |      5320    .2602349    .7906729          0    4.60976
        z1y3 |      5320    3.737218    11.49054          0         54
        z2y3 |      5320    145.9744    480.6547          0       2916
        z3y3 |      5320    .1637218    .6172305          0          6
        z4y3 |      5320    .0052632    .0723633          0          1
-------------+--------------------------------------------------------
        z5y3 |      5320     3.63891    11.20265          0         53
        z6y3 |      5320    138.7175    458.8032          0       2809
        z7y3 |      5320    .1590226    .6057112          0          6
        z8y3 |      5320    .0039474    .0627099          0          1
        z9y3 |      5320    .2610997    .7928738          0    4.52656
-------------+--------------------------------------------------------
        z1y4 |      5320     3.83985    11.79093          0         55
        z2y4 |      5320    153.7444    503.9576          0       3025
        z3y4 |      5320    .1620301    .6132476          0          6
        z4y4 |      5320    .0037594    .0612043          0          1
        z5y4 |      5320    3.737218    11.49054          0         54
-------------+--------------------------------------------------------
        z6y4 |      5320    145.9744    480.6547          0       2916
        z7y4 |      5320    .1637218    .6172305          0          6
        z8y4 |      5320    .0052632    .0723633          0          1
        z9y4 |      5320    .2614749    .7946793          0   4.607767
        z1y5 |      5320    3.940414    12.08767          0         56
-------------+--------------------------------------------------------
        z2y5 |      5320    161.6111    527.9522          0       3136
        z3y5 |      5320    .1595865     .608814          0          6
        z4y5 |      5320     .006015    .0773303          0          1
        z5y5 |      5320     3.83985    11.79093          0         55
        z6y5 |      5320    153.7444    503.9576          0       3025
-------------+--------------------------------------------------------
        z7y5 |      5320    .1620301    .6132476          0          6
        z8y5 |      5320    .0037594    .0612043          0          1
        z9y5 |      5320    .2610663    .7939903          0   4.618777
        z1y6 |      5320    4.047368    12.40128          0         57
        z2y6 |      5320     170.144    553.5552          0       3249
-------------+--------------------------------------------------------
        z3y6 |      5320    .1575188    .6042401          0          5
        z4y6 |      5320    .0065789    .0808511          0          1
        z5y6 |      5320    3.940414    12.08767          0         56
        z6y6 |      5320    161.6111    527.9522          0       3136
        z7y6 |      5320    .1595865     .608814          0          6
-------------+--------------------------------------------------------
        z8y6 |      5320     .006015    .0773303          0          1
        z9y6 |      5320    .2600271    .7937085  -.2613648   4.648325
        z1y7 |      5320    4.140602    12.67474          0         58
        z2y7 |      5320    177.7635    576.2959          0       3364
        z3y7 |      5320    .1537594    .5983346          0          5
-------------+--------------------------------------------------------
        z4y7 |      5320     .006203    .0785219          0          1
        z5y7 |      5320    4.047368    12.40128          0         57
        z6y7 |      5320     170.144    553.5552          0       3249
        z7y7 |      5320    .1575188    .6042401          0          5
        z8y7 |      5320    .0065789    .0808511          0          1
-------------+--------------------------------------------------------
        z9y7 |      5320     .261494    .7964894          0   4.686474
        z1y8 |      5320    4.240414    12.96638          0         59
        z2y8 |      5320    186.0765    600.9297          0       3481
        z3y8 |      5320    .1494361    .5901043          0          5
        z4y8 |      5320    .0090226    .0945665          0          1
-------------+--------------------------------------------------------
        z5y8 |      5320    4.140602    12.67474          0         58
        z6y8 |      5320    177.7635    576.2959          0       3364
        z7y8 |      5320    .1537594    .5983346          0          5
        z8y8 |      5320     .006203    .0785219          0          1
        z9y8 |      5320    .2602616    .7933278          0     4.5933
.
. ********** (1)-(3) 2SLS USING IVREG IS STRAIGHTFORWARD (Table 22.2, p.755) **********
.
. * Note that this will automatically include the exogenous variables as instruments
. * It is not clear that Ziliak does this
.
. * The following drops the first two years which here are 1979 and 1980
. drop if year == 1979 | year == 1980
.
. * (1) OLS results at bottom Ziliak table 1
. * Table 22.2 (page 755) OLS column with various standard errors estimates
. regress dlnhr $XREG, noconstant

      Source |       SS       df       MS              Number of obs =    4256
-------------+------------------------------           F(  5,  4251) =    5.38
       Model |   2.3389287     5  .467785741           Prob > F      =  0.0001
    Residual |  369.369193  4251  .086889954           R-squared     =  0.0063
-------------+------------------------------           Adj R-squared =  0.0051
       Total |  371.708121  4256  .087337435           Root MSE      =  .29477

------------------------------------------------------------------------------
       dlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       dlnwg |   .1115114   .0230566     4.84   0.000     .0663084    .1567144
       dkids |  -.0062887   .0116719    -0.54   0.590    -.0291717    .0165943
       dageh |   .0066935   .0212744     0.31   0.753    -.0350154    .0484025
      dagesq |  -.0000797   .0002644    -0.30   0.763     -.000598    .0004387
      ddisab |  -.0352603   .0199796    -1.76   0.078    -.0744306    .0039101
------------------------------------------------------------------------------

. estimates store olsiid

528
. regress dlnhr $XREG, noconstant robust

                                                       Number of obs =    4256
                                                       F(  5,  4251) =    0.70
                                                       Prob > F      =  0.6246
                                                       R-squared     =  0.0063
                                                       Root MSE      =  .29477
------------------------------------------------------------------------------
             |               Robust
       dlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       dlnwg |   .1115114   .0791674     1.41   0.159     -.043698    .2667207
       dkids |  -.0062887    .011057    -0.57   0.570    -.0279662    .0153888
       dageh |   .0066935   .0243788     0.27   0.784    -.0411016    .0544887
      dagesq |  -.0000797   .0003147    -0.25   0.800    -.0006965    .0005372
      ddisab |  -.0352603   .0364021    -0.97   0.333    -.1066273    .0361067
------------------------------------------------------------------------------
. estimates store olshet
. regress dlnhr $XREG, noconstant cluster(id)
                                                       Number of obs =    4256
                                                       F(  5,   531) =    0.52
                                                       Prob > F      =  0.7617
                                                       R-squared     =  0.0063
                                                       Root MSE      =  .29477
------------------------------------------------------------------------------
             |               Robust
       dlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       dlnwg |   .1115114   .0960926     1.16   0.246    -.0772569    .3002797
       dkids |  -.0062887   .0109558    -0.57   0.566    -.0278107    .0152333
       dageh |   .0066935    .012339     0.54   0.588    -.0175458    .0309328
      dagesq |  -.0000797   .0001551    -0.51   0.608    -.0003843     .000225
      ddisab |  -.0352603   .0452557    -0.78   0.436    -.1241625     .053642
------------------------------------------------------------------------------
. estimates store olspanel
.
. * (2) 2SLS using the base case instrument set
. * Table 22.2 (page 755) 2SLS column base case with various se estimates
. ivreg dlnhr ($XREG = $ZBASECASE), noconstant
      Source |       SS       df       MS              Number of obs =    4256
-------------+------------------------------           F(  5,  4251) =       .
       Model |  .164904559     5  .032980912           Prob > F      =       .
    Residual |  371.543217  4251  .087401368           R-squared     =       .
-------------+------------------------------           Adj R-squared =       .
       Total |  371.708121  4256  .087337435           Root MSE      =  .29564
------------------------------------------------------------------------------
       dlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       dlnwg |   .2091087   .3886332     0.54   0.591    -.5528154    .9710328
       dkids |  -.0296864   .0437001    -0.68   0.497    -.1153615    .0559886
       dageh |    .026388   .0289908     0.91   0.363     -.030449    .0832251
      dagesq |  -.0003411   .0003688    -0.92   0.355    -.0010641     .000382
      ddisab |    .000402   .0429076     0.01   0.993    -.0837194    .0845233
------------------------------------------------------------------------------
Instrumented:  dlnwg dkids dageh dagesq ddisab
Instruments:   kidsl1 agehl1 agesql1 disabl1 agehl2 kidsl2 agesql2 disabl2
               lnwgl2
------------------------------------------------------------------------------
. estimates store baseiid
. ivreg dlnhr ($XREG = $ZBASECASE), noconstant robust
                                                       Number of obs =    4256
                                                       F(  5,  4251) =    0.23
                                                       Prob > F      =  0.9510
                                                       R-squared     =       .
                                                       Root MSE      =  .29564
------------------------------------------------------------------------------
             |               Robust
       dlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       dlnwg |   .2091087    .423312     0.49   0.621    -.6208038    1.039021
       dkids |  -.0296864   .0400461    -0.74   0.459    -.1081977    .0488249
       dageh |    .026388   .0361631     0.73   0.466    -.0445106    .0972866
      dagesq |  -.0003411   .0004555    -0.75   0.454    -.0012342     .000552
      ddisab |    .000402   .0731433     0.01   0.996     -.142997     .143801
------------------------------------------------------------------------------
Instrumented:  dlnwg dkids dageh dagesq ddisab
Instruments:   kidsl1 agehl1 agesql1 disabl1 agehl2 kidsl2 agesql2 disabl2
               lnwgl2
------------------------------------------------------------------------------
. estimates store basehet
. ivreg dlnhr ($XREG = $ZBASECASE), noconstant cluster(id)
                                                       Number of obs =    4256
                                                       F(  5,   531) =    1.44
                                                       Prob > F      =  0.2087
                                                       R-squared     =       .
                                                       Root MSE      =  .29564
------------------------------------------------------------------------------
             |               Robust
       dlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       dlnwg |   .2091087   .3741705     0.56   0.576    -.5259273    .9441447
       dkids |  -.0296864   .0293678    -1.01   0.313    -.0873777    .0280048
       dageh |    .026388   .0153921     1.71   0.087    -.0038488    .0566249
      dagesq |  -.0003411   .0001837    -1.86   0.064    -.0007019    .0000198
      ddisab |    .000402   .0667719     0.01   0.995    -.1307674    .1315714
------------------------------------------------------------------------------
Instrumented:  dlnwg dkids dageh dagesq ddisab
Instruments:   kidsl1 agehl1 agesql1 disabl1 agehl2 kidsl2 agesql2 disabl2
               lnwgl2
------------------------------------------------------------------------------
. estimates store basepanel
.
. * (3) 2SLS using the stacked instrument set
. * Table 22.2 (page 755) 2SLS column stacked case with various se estimates
. set matsize 100
. ivreg dlnhr ($XREG = $ZSTACKED), noconstant
      Source |       SS       df       MS              Number of obs =    4256
-------------+------------------------------           F(  5,  4251) =       .
       Model | -29.3711267     5 -5.87422533           Prob > F      =       .
    Residual |  401.079248  4251  .094349388           R-squared     =       .
-------------+------------------------------           Adj R-squared =       .
       Total |  371.708121  4256  .087337435           Root MSE      =  .30716
------------------------------------------------------------------------------
       dlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       dlnwg |    .542827   .1691348     3.21   0.001     .2112345    .8744195
       dkids |  -.0482932   .0393723    -1.23   0.220    -.1254834     .028897
       dageh |   .0268935   .0288808     0.93   0.352     -.029728    .0835151
      dagesq |  -.0003511   .0003671    -0.96   0.339    -.0010709    .0003687
      ddisab |   .0079759   .0397995     0.20   0.841    -.0700519    .0860037
------------------------------------------------------------------------------
Instruments:   z1y1 z2y1 z3y1 z4y1 z5y1 z6y1 z7y1 z8y1 z9y1 z1y2 z2y2 z3y2 z4y2
               z5y2 z6y2 z7y2 z8y2 z9y2 z1y3 z2y3 z3y3 z4y3 z5y3 z6y3 z7y3 z8y3
               ...
               z3y8 z4y8 z5y8 z6y8 z7y8 z8y8 z9y8
------------------------------------------------------------------------------
. estimates store stackiid
. ivreg dlnhr ($XREG = $ZSTACKED), noconstant robust
                                                       Number of obs =    4256
                                                       F(  5,  4251) =    1.59
                                                       Prob > F      =  0.1596
                                                       R-squared     =       .
                                                       Root MSE      =  .30716
------------------------------------------------------------------------------
             |               Robust
       dlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       dlnwg |    .542827   .2260738     2.40   0.016     .0996043    .9860497
       dkids |  -.0482932   .0350149    -1.38   0.168    -.1169408    .0203544
       dageh |   .0268935   .0339561     0.79   0.428    -.0396781    .0934652
      dagesq |  -.0003511   .0004324    -0.81   0.417    -.0011989    .0004966
      ddisab |   .0079759    .064012     0.12   0.901    -.1175211    .1334729
------------------------------------------------------------------------------
. estimates store stackhet
. ivreg dlnhr ($XREG = $ZSTACKED), noconstant cluster(id)
                                                       Number of obs =    4256
                                                       F(  5,   531) =    2.41
                                                       Prob > F      =  0.0357
                                                       R-squared     =       .
                                                       Root MSE      =  .30716
------------------------------------------------------------------------------
             |               Robust
       dlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       dlnwg |    .542827   .2085225     2.60   0.009     .1331968    .9524572
       dkids |  -.0482932   .0245011    -1.97   0.049    -.0964242   -.0001622
       dageh |   .0268935   .0149934     1.79   0.073    -.0025602    .0563473
      dagesq |  -.0003511   .0001866    -1.88   0.060    -.0007176    .0000154
      ddisab |   .0079759   .0624423     0.13   0.898    -.1146884    .1306402
------------------------------------------------------------------------------
. estimates store stackpanel
. ivreg dlnhr ($XREG = $ZSTACKED), noconstant robust cluster(id)
                                                       Number of obs =    4256
                                                       F(  5,   531) =    2.41
                                                       Prob > F      =  0.0357
                                                       R-squared     =       .
                                                       Root MSE      =  .30716
------------------------------------------------------------------------------
             |               Robust
       dlnhr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       dlnwg |    .542827   .2085225     2.60   0.009     .1331968    .9524572
       dkids |  -.0482932   .0245011    -1.97   0.049    -.0964242   -.0001622
       dageh |   .0268935   .0149934     1.79   0.073    -.0025602    .0563473
      dagesq |  -.0003511   .0001866    -1.88   0.060    -.0007176    .0000154
      ddisab |   .0079759   .0624423     0.13   0.898    -.1146884    .1306402
------------------------------------------------------------------------------
.
. * DISPLAY THE OLS AND 2SLS RESULTS
.
. * The following are used in Table 22.2 (page 755)
.
. * OLS column with various standard errors estimates
. estimates table olspanel olshet olsiid, /*
--------------------------------------------------
    Variable |    olspanel      olshet      olsiid
-------------+------------------------------------
       dlnwg |       0.112       0.112       0.112
             |       0.096       0.079       0.023
       dkids |      -0.006      -0.006      -0.006
             |       0.011       0.011       0.012
       dageh |       0.007       0.007       0.007
             |       0.012       0.024       0.021
      dagesq |      -0.000      -0.000      -0.000
             |       0.000       0.000       0.000
      ddisab |      -0.035      -0.035      -0.035
             |       0.045       0.036       0.020
-------------+------------------------------------
           N |    4256.000    4256.000    4256.000
          ll |    -837.557    -837.557    -837.557
          r2 |       0.006       0.006       0.006
         tss |
         rss |     369.369     369.369     369.369
         mss |       2.339       2.339       2.339
        rmse |       0.295       0.295       0.295
        df_r |     531.000    4251.000    4251.000
--------------------------------------------------
                                      legend: b/se
.
. * 2SLS column base case with various standard errors estimates
. estimates table basepanel basehet baseiid, /*
--------------------------------------------------
    Variable |   basepanel     basehet     baseiid
-------------+------------------------------------
       dlnwg |       0.209       0.209       0.209
             |       0.374       0.423       0.389
       dkids |      -0.030      -0.030      -0.030
             |       0.029       0.040       0.044
       dageh |       0.026       0.026       0.026
             |       0.015       0.036       0.029
      dagesq |      -0.000      -0.000      -0.000
             |       0.000       0.000       0.000
      ddisab |       0.000       0.000       0.000
             |       0.067       0.073       0.043
-------------+------------------------------------
           N |    4256.000    4256.000    4256.000
          ll |
          r2 |           .           .           .
         tss |
         rss |     371.543     371.543     371.543
         mss |       0.165       0.165       0.165
        rmse |       0.296       0.296       0.296
        df_r |     531.000    4251.000    4251.000
--------------------------------------------------
                                      legend: b/se
.
. * 2SLS column stacked case with various standard errors estimates
. estimates table stackpanel stackhet stackiid, /*
--------------------------------------------------
    Variable |  stackpanel    stackhet    stackiid
-------------+------------------------------------
       dlnwg |       0.543       0.543       0.543
             |       0.209       0.226       0.169
       dkids |      -0.048      -0.048      -0.048
             |       0.025       0.035       0.039
       dageh |       0.027       0.027       0.027
             |       0.015       0.034       0.029
      dagesq |      -0.000      -0.000      -0.000
             |       0.000       0.000       0.000
      ddisab |       0.008       0.008       0.008
             |       0.062       0.064       0.040
-------------+------------------------------------
           N |    4256.000    4256.000    4256.000
          ll |
          r2 |           .           .           .
         tss |
         rss |     401.079     401.079     401.079
         mss |     -29.371     -29.371     -29.371
        rmse |       0.307       0.307       0.307
        df_r |     531.000    4251.000    4251.000
--------------------------------------------------
                                      legend: b/se
.
. ********** (4)-(5) 2SGMM REQUIRES SPECIAL MATRIX CODING **********
.
. *** PROGRAM PANELGMM DOES 2SLS (as check) and 2SGMM USING MATRIX COMMANDS
.
. * This program:
. * - requires as inputs the global macros
. *     y gives the dependent variable name
. *     X gives the list of regressor names
. *     Z gives the list of instrument names
. * - assumes the appropriate data is in memory
. * - assumes the cluster identifier is called id
.
. * If the regressors and instruments include an intercept include
. * this as a separate regressor, say called ONE, in X and Z.
. * Then continue to use the following code with the noconstant option for accum and opaccum.
. * (accum and opaccum automatically include a constant AT THE END,
. * which is not where we want the constant.)
.
. * This program computes the 2SLS and two-step GMM estimators
. *     [(X'Z)(Z'Z)_inv Z'X]_inv (X'Z)(Z'Z)_inv Z'y
. * and [(X'Z)S_inv Z'X]_inv (X'Z)S_inv Z'y
. * and appropriate panel robust standard errors
. * assuming a short panel with errors correlated over t for given i and heteroskedastic.
.
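The two estimator formulas above can also be sketched outside Stata. The following Python block is an illustration only, not the program used for Table 22.2: the helper names (`linear_gmm`, `cluster_S`, `two_step_gmm`) and the eight-observation data set are made up for the example, and the matrix inverse is a plain Gauss-Jordan routine suitable only for small problems. It computes 2SLS with weighting matrix (Z'Z)_inv, forms S by summing Z_i'u_i within each cluster, and then computes two-step GMM with weighting matrix S_inv.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(p * q for p, q in zip(row, col)) for col in bt] for row in a]

def inverse(a):
    """Gauss-Jordan inverse with partial pivoting; intended for small matrices."""
    n = len(a)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if abs(aug[p][c]) < 1e-12:
            raise ValueError("singular matrix")
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def linear_gmm(y, X, Z, W):
    """b = [(X'Z) W (Z'X)]^{-1} (X'Z) W (Z'y) for a given weighting matrix W."""
    Zt = transpose(Z)
    ZX = matmul(Zt, X)                      # Z'X
    Zy = matmul(Zt, [[v] for v in y])       # Z'y
    XZW = matmul(transpose(ZX), W)          # (X'Z) W
    b = matmul(inverse(matmul(XZW, ZX)), matmul(XZW, Zy))
    return [row[0] for row in b]

def residuals(y, X, b):
    return [yi - sum(xj * bj for xj, bj in zip(row, b)) for yi, row in zip(y, X)]

def cluster_S(Z, u, ids):
    """S = sum_i (Z_i'u_i)(Z_i'u_i)', with Z_i'u_i summed within cluster i."""
    q = len(Z[0])
    score = {}
    for zrow, ui, g in zip(Z, u, ids):
        s = score.setdefault(g, [0.0] * q)
        for j in range(q):
            s[j] += zrow[j] * ui
    return [[sum(s[j] * s[k] for s in score.values()) for k in range(q)]
            for j in range(q)]

def two_step_gmm(y, X, Z, ids):
    b2sls = linear_gmm(y, X, Z, inverse(matmul(transpose(Z), Z)))
    S = cluster_S(Z, residuals(y, X, b2sls), ids)
    b2sgmm = linear_gmm(y, X, Z, inverse(S))
    return b2sls, b2sgmm

# Made-up data: 4 clusters, 2 observations each, 1 regressor, 2 instruments.
ids = [1, 1, 2, 2, 3, 3, 4, 4]
Z = [[1.0, 1.0], [2.0, 0.0], [3.0, 1.0], [4.0, 0.0],
     [5.0, 0.0], [6.0, 1.0], [7.0, 1.0], [8.0, 0.0]]
X = [[1.2], [0.9], [2.8], [1.7], [2.9], [3.8], [4.6], [3.9]]
y = [2.1, 2.3, 5.2, 3.9, 5.5, 8.1, 9.0, 7.6]
b2sls, b2sgmm = two_step_gmm(y, X, Z, ids)
print("2SLS :", b2sls)
print("2SGMM:", b2sgmm)
```

When the model is just-identified (as many instruments as regressors) the weighting matrix drops out and both estimators reduce to the simple IV estimator, which is a convenient check on any implementation.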
. program define panelgmm
1.
. * (1) Create Z'Z and check that full rank
. matrix accum ZZ = $Z, noconstant
2. scalar dimz = rowsof(ZZ)
3. scalar detzz = det(ZZ)
4. di "Redundant instruments if det(Z'Z) zero. Here det(Z'Z) = " detzz
5.
. * (2) Create Z'X which is trickier
. * Create ZX'ZX = [Z X]' [Z X] using accum which automatically adds a constant
. matrix accum ZXZX = $Z $X, noconstant
6. * Then Z'X is the (1,2) submatrix: rows 1 to dimz and columns dimz+1 to dimzx
. scalar dimzx = rowsof(ZXZX)
7. * Also need dimension of X
. matrix accum XX = $X, noconstant
8. scalar dimx = rowsof(XX)
9. matrix ZX = ZXZX[1..dimz,dimz+1...]
10.
. * (3) Create Z'y
. * Create Zy'Zy = [Z y]' [Z y] using accum which automatically adds a constant
. matrix accum ZyZy = $Z $y, noconstant
11. * Then Z'y is the (1,2) submatrix: rows 1 to dimz and the last column
. matrix Zy = ZyZy[1..dimz,dimz+1]
12.
. * (4) Compute 2SLS Estimator
. di " "
13. di "2SLS results: "
14. matrix b2SLS = syminv(ZX'*syminv(ZZ)*ZX)*ZX'*syminv(ZZ)*Zy
15. matrix list b2SLS
16.
. * (5) Compute S = Sum_i Zi'u_i*u_i'Z_i using opaccum
. * Key is use of opaccum.
. * Need to compute the residuals.
. gen yhat = 0
17. foreach var of varlist $X {
18. matrix a`var' = b2SLS["`var'",1]
19. scalar b`var' = trace(a`var') /* converts matrix to scalar */
20. quietly replace yhat = yhat + (b`var')*(`var')
21. }
22. gen uhat = $y - yhat
23. gen uhatsq = uhat*uhat
24. quietly sum(uhatsq)
25. scalar rmse = sqrt(r(sum)/(_N-dimx))
26. di "rmse = " rmse
27. * Alternative and check uses ivreg.
. quietly ivreg $y ($X = $Z), noconstant cluster(id)

28. predict uhat2, residuals
29. quietly sum uhat uhat2
30. * Sort data for opaccum to work
. preserve
31. sort id
32. matrix opaccum S = $Z, group(id) opvar(uhat) noconstant
33. /*
> * Ziliak uses heteroskedastic errors but not correlated.
> * Then instead use the following which assumes time identifier is year.
> * Make a unique identifier obsid so that group(obsid) does not group
> gen obsid = 10000*id + year
> sort obsid
> matrix opaccum S = $Z, group(obsid) opvar(uhat) noconstant
> */
. restore
34.
. * (6) Compute Variance of 2SLS.
. matrix v2SLS = syminv(ZX'*syminv(ZZ)*ZX)*ZX'*syminv(ZZ)*S*syminv(ZZ)*ZX*syminv(ZX'*syminv(ZZ)*ZX)
35. * matrix list v2SLS
. * Now need to get standard errors
. matrix se2SLS = J(dimx,1,0) /* Initially column vector of zeroes */
36. scalar icol = 1
37. * Need loop here as Stata does not do square root on a vector
. while icol <= dimx {
38. matrix se2SLS[icol,1] = sqrt(v2SLS[icol,icol])
39. scalar icol = icol+1
40. }
41. matrix list se2SLS
42.
. * (7) Compute Two-step GMM
. di " "
43. di "2SGMM results: "
44. matrix b2SGMM = syminv(ZX'*syminv(S)*ZX)*ZX'*syminv(S)*Zy
45. matrix list b2SGMM
46.
. * (8) Compute Variance of Two-step GMM
. * Compute the residuals to recompute S at the new estimates.
. * Note that could just use the old S
. drop yhat uhat uhatsq
47. gen yhat = 0
48. foreach var of varlist $X {
49. matrix a`var' = b2SGMM["`var'",1]
50. scalar b`var' = trace(a`var') /* converts matrix to scalar */
51. quietly replace yhat = yhat + (b`var')*(`var')
52. }
53. gen uhat = $y - yhat
54. gen uhatsq = uhat*uhat
55. quietly sum(uhatsq)
56. scalar rmse = sqrt(r(sum)/(_N-dimx))

57. di "rmse = " rmse
58. * Sort data for opaccum to work
. preserve
59. sort id
60. matrix opaccum S = $Z, group(id) opvar(uhat) noconstant
61. matrix v2SGMM = syminv(ZX'*syminv(S)*ZX)
62. * matrix list v2SGMM
. matrix se2SGMM = J(dimx,1,0) /* Initially column vector of zeroes */
63. scalar icol = 1
64. * Need loop here as Stata does not do square root on a vector
. while icol <= dimx {
65. matrix se2SGMM[icol,1] = sqrt(v2SGMM[icol,icol])
66. scalar icol = icol+1
67. }
68. matrix list se2SGMM
69.
. * (9) Compute the overidentifying restrictions test
. * Create row vector u'Z using vecaccum which automatically adds a constant
. matrix vecaccum uZ = uhat $Z, noconstant
70. matrix maxobjfunction = uZ*syminv(S)*uZ'
71. scalar ortest = maxobjfunction[1,1]
72. scalar dof = dimz - dimx
73. di " Over-identifying restrictions test " ortest " dof " dof " p-value " chi2tail(dof,ortest)
74.
. end
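Step (9) of the program compares the statistic u'Z S_inv Z'u with a chi-squared distribution with dimz - dimx degrees of freedom. As a rough cross-check of that last step only, the sketch below computes the chi-squared upper-tail probability in plain Python using the standard finite-sum expressions for integer degrees of freedom. The function names `chi2_tail` and `overid_test` are hypothetical and simply mirror Stata's `chi2tail(dof, x)` argument order; the two calls at the end use the test statistics and instrument counts (9 and 72 instruments, 5 regressors) from the runs of `panelgmm` reported in this log.

```python
import math

def chi2_tail(dof, x):
    """Upper-tail probability P(chi2(dof) > x) for integer dof >= 1."""
    if x <= 0:
        return 1.0
    if dof % 2 == 0:
        # Even dof: exp(-x/2) * sum_{k=0}^{dof/2-1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, dof // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Odd dof: 2*Q(chi) + 2*phi(chi) * sum_{r=1}^{(dof-1)/2} chi^(2r-1) / (1*3*...*(2r-1))
    chi = math.sqrt(x)
    two_q = math.erfc(chi / math.sqrt(2))            # equals 2*Q(chi), Q the normal tail
    phi = math.exp(-x / 2) / math.sqrt(2 * math.pi)  # standard normal density at chi
    term, total = chi, 0.0
    for r in range(1, (dof - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return two_q + 2 * phi * total

def overid_test(j_stat, n_instruments, n_regressors):
    """Return (degrees of freedom, p-value) for the over-identifying restrictions test."""
    dof = n_instruments - n_regressors
    return dof, chi2_tail(dof, j_stat)

print(overid_test(5.4503878, 9, 5))    # base case instrument set
print(overid_test(69.506226, 72, 5))   # stacked instrument set
```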
.
. *** EXECUTE THE PROGRAM PANEL GMM FOR THESE DATA
.
. * Note that Ziliak does not use an intercept.
. * If have an intercept then need to add in the constant explicitly
. * generate ONE = 1
. * and then add this to the X and Z
.
. * Define the dependent variable
. global y dlnhr
.
. * Define the regressors.
. global X $XREG
.
. * (4) 2SGMM (and 2SLS as check) using the base case instrument set
. * Gives 2SGMM Base Case column of Table 22.2 (page 755)
.
. global Z $ZBASECASE
. panelgmm
(obs=4256)
Redundant instruments if det(Z'Z) zero. Here det(Z'Z) = 6.375e+37
(obs=4256)
(obs=4256)
(obs=4256)
2SLS results:
b2SLS[5,1]
dlnhr
dlnwg .20910869
dkids -.02968643
dageh .02638804
dagesq -.00034108
ddisab .00040197
rmse = .29563723
se2SLS[5,1]
c1
r1 .3736429
r2 .02932634
r3 .01537039
r4 .00018343
r5 .06667771
2SGMM results:
b2SGMM[5,1]
dlnhr
dlnwg .54679602
dkids -.04490416
dageh .02747594
dagesq -.00035912
ddisab -.0468348
rmse = .30719932
se2SGMM[5,1]
c1
r1 .32762396
r2 .02714405
r3 .01295984
r4 .00015941
r5 .06236006
Over-identifying restrictions test 5.4503878 dof 4 p-value .24412497
.
. * (5) 2SGMM (and 2SLS as check) using the stacked instrument set
. * Gives 2SGMM Stacked Case column of Table 22.2 (page 755)
.
. drop uhat yhat uhatsq uhat2 /* Obtained in panelgmm */
. global Z $ZSTACKED
. * dlnwg dkids dageh dagesq ddisab

. panelgmm
(obs=4256)
Redundant instruments if det(Z'Z) zero. Here det(Z'Z) = 7.52e+234
(obs=4256)
(obs=4256)
(obs=4256)
2SLS results:
b2SLS[5,1]
dlnhr
dlnwg .54282703
dkids -.0482932
dageh .02689353
dagesq -.00035113
ddisab .0079759
rmse = .30716345
se2SLS[5,1]
c1
r1 .20822845
r2 .02446659
r3 .01497229
r4 .0001863
r5 .0623543
2SGMM results:
b2SGMM[5,1]
dlnhr
dlnwg .32999732
dkids -.01681724
dageh .01637783
dagesq -.00019221
ddisab -.02010632
rmse = .29791501
se2SGMM[5,1]
c1
r1 .10965082
r2 .01356737
r3 .00834178
r4 .0001037
r5 .02357317
Over-identifying restrictions test 69.506226 dof 67 p-value .39307324
.
. ********** (6) F-STATISTICS FOR WEAK INSTRUMENTS (page 756) **********
.
. * (1) Weak Instruments using base case instrument set
.
. * Test weak instruments for dlnwg using panel robust inference
. quietly regress dlnwg $ZBASECASE, cluster(id)
. quietly test $ZBASECASE
. * This value should have been reported in the text on page 756
. * [Instead by mistake the F assuming iid errors below was reported]
. di "r2 = " e(r2) " F = " r(F) " p = " r(p) " dof = " r(df)
r2 = .00590049 F = 2.3790046 p = .01209278 dof = 9
.
. * Same except use wrong inference assuming iid errors
. quietly regress dlnwg $ZBASECASE
r2 = .00590049 F = 2.800243 p = .00281135 dof = 9
.
. * (2) Weak Instruments using stacked instrument set
.
. quietly regress dlnwg $ZSTACKED, cluster(id)
. quietly test $ZSTACKED
. * This value was reported in the text on page 756
r2 = .02256803 F = 1.9000813 p = .00003808 dof = 72
.
. * Same except use wrong inference assuming iid errors
. quietly regress dlnwg $ZSTACKED
r2 = .02256803 F = 1.341413 p = .02961833 dof = 72
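The F statistics labelled "wrong inference assuming iid errors" are the usual overall F statistics, so they can be recovered from the reported first-stage R-squared alone. The short Python sketch below (the function name `f_from_r2` is made up for this illustration) applies F = (R2/q) / ((1-R2)/(N-q-1)) with N = 4256 and q = 9 or 72 instruments, assuming an intercept is included as in the `regress` commands above. The cluster-robust F statistics cannot be recovered this way because they depend on the within-cluster error correlation.

```python
def f_from_r2(r2, n_obs, n_instruments):
    """Overall F statistic under iid errors for a regression with an intercept."""
    df_resid = n_obs - n_instruments - 1
    return (r2 / n_instruments) / ((1.0 - r2) / df_resid)

# First-stage regressions of dlnwg on the instruments (values taken from this log)
print(round(f_from_r2(0.00590049, 4256, 9), 4))    # base case instrument set
print(round(f_from_r2(0.02256803, 4256, 72), 4))   # stacked instrument set
```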
.
. * (3) Weak Instruments for other regressors
. * Here all regressors are instrumented. So should test all as above.
. * These find no problems.
. * For example, for dkids and the stacked instrument set (as coded below)
. quietly regress dkids $ZSTACKED, cluster(id)
r2 = .16281613 F = 8.4145744 p = 3.349e-52 dof = 72

. quietly regress dageh $ZSTACKED, cluster(id)
r2 = .22076423 F = 24.002499 p = 6.30e-126 dof = 72
. quietly regress dagesq $ZSTACKED, cluster(id)
r2 = .36856999 F = 150.79951 p = 4.10e-309 dof = 72
. quietly regress ddisab $ZSTACKED, cluster(id)
r2 = .28591864 F = 25.786283 p = 4.70e-132 dof = 72
.
. ********** PARTIAL R-SQUARED FOR WEAK INSTRUMENTS (page 756) **********
.
. * (1) Weak Instruments using base case instrument set
.
. quietly regress dlnwg $ZBASECASE, cluster(id)
r2 = .00590049 F = 2.3790046 p = .01209278 dof = 9
.
. **** (D) Shea (1997) partial R-squared
.
. * Here we have five endogenous regressors and no exogenous regressors.
. * Need to change code below if there are exogenous regressors. See ch4ivkling.do
. * Focus on the endogenous wage regressor.
. * For the other four just need to replace dlnwg in the first line of (1)
. * and replace the first line of (2B)
.
. * (1) Form x1 - x1tilda: residual from regress x1 on other regressors
. quietly reg dlnwg dkids dageh dagesq ddisab
. * quietly reg dkids dlnwg dageh dagesq ddisab
. predict x1minusx1tilda, resid
.
. * (2) Form x1hat - x1hattilda: residual from regress x1hat on fitted values of other regressors
. * (2A) First get the fitted values from regress endogenous on instruments
. quietly reg dlnwg $ZBASECASE
. predict dlnwghat, xb
. di e(r2) " r2 from regress x1 on Z"
.00590049 r2 from regress x1 on Z
. quietly reg dkids $ZBASECASE
. predict dkidshat, xb
. di e(r2) " r2 from regress second endog regressor on Z"
.1473738 r2 from regress second endog regressor on Z
. quietly reg dageh $ZBASECASE
. predict dagehhat, xb
. di e(r2) " r2 from regress third endog regressor on Z"
.13903221 r2 from regress third endog regressor on Z
. quietly reg dagesq $ZBASECASE
. predict dagesqhat, xb
. di e(r2) " r2 from regress fourth endog regressor on Z"
.3049799 r2 from regress fourth endog regressor on Z
. quietly reg ddisab $ZBASECASE
. predict ddisabhat, xb
. di e(r2) " r2 from regress fifth endog regressor on Z"
.26087493 r2 from regress fifth endog regressor on Z
. * (2B) Run the regression of x1hat on fitted values of other regressors
. quietly reg dlnwghat dkidshat dagehhat dagesqhat ddisabhat
. * quietly reg dkidshat dlnwghat dagehhat dagesqhat ddisabhat
. di e(r2) " r2 from regress prediction of x1 on predictions of x2"
.38268288 r2 from regress prediction of x1 on predictions of x2
. predict x1hatminusx1hattilda, resid
.
. * (3) Form the correlation between (1) and (2)
. * This value is reported in the text on page 756
. corr x1minusx1tilda x1hatminusx1hattilda
(obs=4256)

             | x1minu~a x1hatm~a
-------------+------------------
x1minusx1t~a |   1.0000
x1hatminus~a |   0.0604   1.0000

. di r(rho)^2 " Shea's partial R-squared measure"

.00364741 Shea's partial R-squared measure
.
.
. log close
       log:  c:\Imbook\bwebpage\Section5\mma22p1pangmm.txt
  log type:  text
 closed on:  23 May 2005, 11:52:42
------------------------------------------------------------------------------
       log:  c:\Imbook\bwebpage\Section5\mma23p1pannonlin.txt
  log type:  text
 opened on:  23 May 2005, 12:46:16
.
. ********** OVERVIEW OF MMA23P1PANNONLIN.DO **********
.
. * STATA Program
.
. * Example of nonlinear model (multiplicative effects)
.
. * This program derives Table 23.1 and Figure 23.1.
. * It performs nonlinear panel analysis for multiplicative effects model
. * y_it = a_i*exp(x_it'b) = exp(c_i+x_it'b)
. * and parametric count data models
.
. * (1) Linear (xtreg) for log(PAT) with adjustment for PAT=0
. *     Output includes Figure 23.1
. * (2) Poisson (xtpoisson) fixed and random effects
. * (3) GEE (xtgee) which includes pooled NLS
.
. * The Poisson individual effects model is
. * y_it ~ Poisson(exp(x_it'b + c_i)) where c_i = ln(a_i)
. * The standard errors assume this model is correctly specified
. * i.e. Variance = mean given x_it and a_i
.
. * For panel robust se's see section 23.2.6 pages 788-791
. * To obtain more panel robust standard errors this program panel bootstraps
. * Note that the panel se entries of 0.033 under GEE, Poisson-RE and Poisson-FE
. * are not panel robust to the extent that the bootstrap se's are panel robust
. * and in fact are the usual se's in the case of Poisson-RE and Poisson-FE
. * Unlike ch.21 here "panel se" means "default panel se" and not "panel-robust se".
.
. * To speed up program reduce nreps, the number of bootstrap replications
.
. * patr7079.asc
.
. ********** SETUP **********
.
. set more off
. version 8.0
.
.
. * There are ten years of data but only five years 1975-79 are used in estimation
.
. * Bronwyn Hall, Zvi Griliches, and Jerry Hausman (1986),
. * "Patents and R&D: Is There a Lag?",
. * International Economic Review, 27, 265-283.
.
. * File patr7079.dat has data on 346 firms
. * There are 4 lines per firm, with 25 variables
. * Time-invariant: CUSIP,ARDSSIC,SCISECT,LOGK,SUMPAT,
. * Time-varying X: LOGR70,LOGR71,LOGR72, ....., LOGR77,LOGR78,LOGR79
. * Time-varying Y: PAT70,PAT71,PAT72, ....., PAT77,PAT78,PAT79
. * in the format:
. * I7,I3,I2,5F12.6/6F12.6/6F12.6/5F12.6/
. * where
. * CUSIP Compustat's identifying number for the firm (Committee on
. *       Uniform Security Identification Procedures number).
. * ARDSSIC A two-digit code for the applied R&D industrial classification
. *       (roughly that in Bound, Cummins, Griliches, Hall, and Jaffe, in
. *       the Griliches R&D, Patents, and Productivity volume).
. * SCISECT Dummy equal to one for firms in the scientific sector.
. * LOGK The logarithm of the book value of capital in 1972.
. * SUMPAT The sum of patents applied for between 1972-1979.
. * LOGR70- The logarithm of R&D spending during the year (in 1972 dollars).
. * LOGR79
. * PAT70- The number of patents applied for during the year that were
. * PAT79 eventually granted.
.
. ********** READ DATA **********
.
. * The data are in ascii file patr7079.asc
. * There are 346 observations on 25 variables with four lines per obs
. * The data are fixed format with
. * line 1 variables 1-8 I7,I3,I2,5F12.6
. * line 2 variables 9-14 6F12.6
.
. * As there is space between each observation data is also space-delimited
. * free format and then there is no need for a dictionary file
. * The following command spans more than one line so use /* and */
. infile CUSIP ARDSSIC SCISECT LOGK SUMPAT LOGR70 LOGR71 LOGR72 LOGR73 /*
> */ LOGR74 LOGR75 LOGR76 LOGR77 LOGR78 LOGR79 PAT70 PAT71 PAT72 /*
> */ PAT73 PAT74 PAT75 PAT76 PAT77 PAT78 PAT79 using patr7079.asc
.
. ********** DATA TRANSFORMATIONS **********
.
. * Use observation number as an identifier, not just CUSIP
. gen id = _n
. label variable id "id"
. * The following lists the variables in data set and summarizes data
. describe
Contains data
  obs:           346
 vars:            26
 size:
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
CUSIP           float  %9.0g
ARDSSIC         float  %9.0g
SCISECT         float  %9.0g
LOGK            float  %9.0g
SUMPAT          float  %9.0g
LOGR70          float  %9.0g
LOGR71          float  %9.0g
LOGR72          float  %9.0g
LOGR73          float  %9.0g
LOGR74          float  %9.0g
LOGR75          float  %9.0g
LOGR76          float  %9.0g
LOGR77          float  %9.0g
LOGR78          float  %9.0g
LOGR79          float  %9.0g
PAT70           float  %9.0g
PAT71           float  %9.0g
PAT72           float  %9.0g
PAT73           float  %9.0g
PAT74           float  %9.0g
PAT75           float  %9.0g
PAT76           float  %9.0g
PAT77           float  %9.0g
PAT78           float  %9.0g
PAT79           float  %9.0g
id              float  %9.0g                  id
-------------------------------------------------------------------------------
Sorted by:
. summarize
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       CUSIP |       346    531201.2    282074.9        800     989399
     ARDSSIC |       336     9.97619    5.459706          1         21
     SCISECT |       346    .4248555    .4950369          0          1
        LOGK |       346    3.921063    2.095542   -1.76965    9.66626
      SUMPAT |       346    284.7312    571.1136          0       3806
-------------+--------------------------------------------------------
      LOGR70 |       346    1.198348    1.941968   -3.67354    6.56641
      LOGR71 |       346    1.169182    1.929444   -3.53055    6.95687
      LOGR72 |       346    1.185953    1.929078   -3.35241    6.97009
      LOGR73 |       346    1.231135    1.934896   -3.67395    7.06211
      LOGR74 |       346    1.232636    1.946417   -3.15274    7.06524
-------------+--------------------------------------------------------
      LOGR75 |       346    1.165802     1.98001    -3.5476    6.76486
      LOGR76 |       346    1.212888    1.979273   -3.84868     6.8285
      LOGR77 |       346    1.250034    2.003002   -3.47884    6.90253
      LOGR78 |       346    1.306511    2.019792    -3.2832    6.96345
      LOGR79 |       346    1.345581    2.054982   -3.57742    7.03432
-------------+--------------------------------------------------------
       PAT70 |       346    40.00289    82.50335          0        608
       PAT71 |       346    38.10983    78.40308          0        553
       PAT72 |       346    36.30925    74.81591          0        557
       PAT73 |       346    36.95376    77.91971          0        595
       PAT74 |       346    37.60983    75.94388          0        528
-------------+--------------------------------------------------------
       PAT75 |       346    36.87283    75.98788          0        508
       PAT76 |       346    35.84682    73.31613          0        487
       PAT77 |       346    36.23121    72.75146          0        456
       PAT78 |       346    32.80636     65.6505          0        434
       PAT79 |       346    32.10116    66.36197          0        515
-------------+--------------------------------------------------------
          id |       346       173.5    100.0258          1        346
.
. ******** CHANGE ORGANIZATION OF DATA USING RESHAPE AND MORE TRANSFORMATIONS
.
. reshape long PAT LOGR, i(id) j(year)
(note: j = 70 71 72 73 74 75 76 77 78 79)
Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                      346   ->    3460
Number of variables                  26   ->       9
j variable (10 values)                    ->   year
xij variables:
                  PAT70 PAT71 ... PAT79   ->   PAT
               LOGR70 LOGR71 ... LOGR79   ->   LOGR
-----------------------------------------------------------------------------
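For readers working outside Stata, the wide-to-long reorganization that `reshape long PAT LOGR, i(id) j(year)` performs can be sketched as follows. This is a minimal illustration in Python under stated assumptions: the function name `reshape_long` and the two made-up firm records are hypothetical, and the caller supplies the list of year suffixes explicitly rather than having them detected from the variable names.

```python
def reshape_long(wide_rows, stubs, j_values, j_name="year"):
    """Turn one record per unit into one record per unit and period."""
    wide_cols = {f"{s}{j}" for s in stubs for j in j_values}
    long_rows = []
    for row in wide_rows:
        # Time-invariant variables are copied onto every long record
        fixed = {k: v for k, v in row.items() if k not in wide_cols}
        for j in j_values:
            rec = dict(fixed)
            rec[j_name] = j
            for s in stubs:
                rec[s] = row[f"{s}{j}"]   # e.g. PAT70 becomes PAT with year = 70
            long_rows.append(rec)
    return long_rows

# Two made-up firms with two years each (not taken from patr7079.asc)
firms = [
    {"id": 1, "LOGK": 2.5, "PAT70": 3, "PAT71": 0, "LOGR70": 0.4, "LOGR71": 0.6},
    {"id": 2, "LOGK": 4.1, "PAT70": 12, "PAT71": 15, "LOGR70": 1.9, "LOGR71": 2.0},
]
long_rows = reshape_long(firms, stubs=["PAT", "LOGR"], j_values=[70, 71])
for rec in long_rows:
    print(rec)
```

With 346 firms and ten years the same logic yields the 3460 long records reported above.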
. describe
Contains data
  obs:         3,460
 vars:             9
 size:
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
id              float  %9.0g                  id
year            byte   %9.0g
CUSIP           float  %9.0g
ARDSSIC         float  %9.0g
SCISECT         float  %9.0g
LOGK            float  %9.0g
SUMPAT          float  %9.0g
LOGR            float  %9.0g
PAT             float  %9.0g
-------------------------------------------------------------------------------
Sorted by:  id  year
. summarize
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          id |      3460       173.5    99.89562          1        346
        year |      3460        74.5    2.872696         70         79
       CUSIP |      3460    531201.2    281707.7        800     989399
     ARDSSIC |      3360     9.97619    5.452387          1         21
     SCISECT |      3460    .4248555    .4943925          0          1
-------------+--------------------------------------------------------
        LOGK |      3460    3.921063    2.092814   -1.76965    9.66626
      SUMPAT |      3460    284.7312    570.3701          0       3806
        LOGR |      3460    1.229807    1.970524   -3.84868    7.06524
         PAT |      3460    36.28439    74.46563          0        608
.
. * Create new variable log(patents) with adjustment for patents = 0
. gen NEWPAT = PAT
. replace NEWPAT = 0.5 if NEWPAT==0.
. gen LPAT = ln(NEWPAT)
. label variable LPAT "Ln(Patents)"
. label variable PAT "Patents"
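The zero adjustment used above replaces a patent count of zero by 0.5 before taking logs, so that ln(PAT) is defined for every firm-year. A minimal Python sketch of the same transformation follows; the function name `log_patents` and its `floor` argument are hypothetical names chosen for this illustration.

```python
import math

def log_patents(pat, floor=0.5):
    """ln(patents), with a zero count replaced by `floor` before taking logs."""
    adjusted = floor if pat == 0 else pat
    return math.log(adjusted)

print(log_patents(0))     # ln(0.5), the smallest possible value of LPAT
print(log_patents(608))   # the largest patent count in these data
```

The two calls correspond to the minimum and maximum of LPAT reported later in this log (-.6931472 and 6.410175).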
. * Dummy variable for logit analysis

. gen DPAT = 0
. replace DPAT = 1 if PAT>0
. label variable DPAT "Patent Indicator"
. * R and D
. gen RANDD = exp(LOGR)
. label variable LOGR "Ln(R&D)"
. label variable RANDD "R&D"
. * Lagged log R and D
. tsset id year
panel variable: id, 1 to 346
. gen LOGRL1 = L1.LOGR
. label variable LOGRL1 "Ln(R&D) lagged once"
. label variable LOGRL2 "Ln(R&D) lagged twice"
. label variable LOGRL3 "Ln(R&D) lagged three times"
. label variable LOGRL4 "Ln(R&D) lagged four times"
. label variable LOGRL5 "Ln(R&D) lagged five times"
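After `tsset id year`, a lag operator such as `L1.LOGR` looks up the value for the same firm one year earlier and returns missing when that year is not in the data. The Python sketch below illustrates that lookup; `add_lag` is a hypothetical helper, the variable values are placeholders rather than the patent data, and `None` plays the role of Stata's missing value.

```python
def add_lag(rows, var, k, id_key="id", t_key="year"):
    """Add <var>L<k>: the value of var for the same id, k periods earlier."""
    lookup = {(r[id_key], r[t_key]): r[var] for r in rows}
    name = f"{var}L{k}"
    for r in rows:
        # None where the lagged period is not observed for this id
        r[name] = lookup.get((r[id_key], r[t_key] - k))
    return rows

# Placeholder panel with the same shape as the data here: 346 firms, years 70-79
panel = [{"id": i, "year": t, "LOGR": 0.1 * t} for i in range(1, 347) for t in range(70, 80)]
for k in range(1, 6):
    add_lag(panel, "LOGR", k)
    n_obs = sum(r[f"LOGRL{k}"] is not None for r in panel)
    print(k, n_obs)
```

Each additional lag loses one year per firm, which is why the observation counts for the lagged variables in the summary statistics fall by 346 per lag.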
. * Year dummies
. gen dyear2 = 0
. replace dyear2 = 1 if year==76
. gen dyear3 = 0
. gen dyear4 = 0
. gen dyear5 = 0
.
. * Check data and Save data as Stata data set
. describe
Contains data
  obs:         3,460
 vars:            22
 size:
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
id              float  %9.0g                  id
year            byte   %9.0g
CUSIP           float  %9.0g
ARDSSIC         float  %9.0g
SCISECT         float  %9.0g
LOGK            float  %9.0g
SUMPAT          float  %9.0g
LOGR            float  %9.0g                  Ln(R&D)
PAT             float  %9.0g                  Patents
NEWPAT          float  %9.0g
LPAT            float  %9.0g                  Ln(Patents)
DPAT            float  %9.0g                  Patent Indicator
RANDD           float  %9.0g                  R&D
LOGRL1          float  %9.0g                  Ln(R&D) lagged once
LOGRL2          float  %9.0g                  Ln(R&D) lagged twice
LOGRL3          float  %9.0g                  Ln(R&D) lagged three times
LOGRL4          float  %9.0g                  Ln(R&D) lagged four times
LOGRL5          float  %9.0g                  Ln(R&D) lagged five times
dyear2          float  %9.0g
dyear3          float  %9.0g
dyear4          float  %9.0g
dyear5          float  %9.0g
-------------------------------------------------------------------------------
Sorted by:  id  year
. summarize
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          id |      3460       173.5    99.89562          1        346
        year |      3460        74.5    2.872696         70         79
       CUSIP |      3460    531201.2    281707.7        800     989399
     ARDSSIC |      3360     9.97619    5.452387          1         21
     SCISECT |      3460    .4248555    .4943925          0          1
-------------+--------------------------------------------------------
        LOGK |      3460    3.921063    2.092814   -1.76965    9.66626
      SUMPAT |      3460    284.7312    570.3701          0       3806
        LOGR |      3460    1.229807    1.970524   -3.84868    7.06524
         PAT |      3460    36.28439    74.46563          0        608
      NEWPAT |      3460    36.37182    74.42325         .5        608
-------------+--------------------------------------------------------
        LPAT |      3460    1.935464    1.949421  -.6931472   6.410175
        DPAT |      3460    .8251445    .3798984          0          1
       RANDD |      3460    23.02263    82.90186   .0213078   1170.563
      LOGRL1 |      3114    1.216943    1.960836   -3.84868    7.06524
      LOGRL2 |      2768    1.205747    1.953427   -3.84868    7.06524
-------------+--------------------------------------------------------
      LOGRL3 |      2422     1.19942    1.946583   -3.84868    7.06524
      LOGRL4 |      2076    1.197176    1.941555   -3.67395    7.06524
      LOGRL5 |      1730    1.203451    1.934293   -3.67395    7.06524
      dyear2 |      3460          .1    .3000434          0          1
      dyear3 |      3460          .1    .3000434          0          1
-------------+--------------------------------------------------------
      dyear4 |      3460          .1    .3000434          0          1
      dyear5 |      3460          .1    .3000434          0          1
. drop NEWPAT
. save patr7079, replace
file patr7079.dta saved
. summarize
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+----------------------------------------------------------
          id |      3460       173.5    99.89562          1        346
        year |      3460        74.5    2.872696         70         79
       CUSIP |      3460    531201.2    281707.7        800     989399
     ARDSSIC |      3360     9.97619    5.452387          1         21
     SCISECT |      3460    .4248555    .4943925          0          1
-------------+----------------------------------------------------------
        LOGK |      3460    3.921063    2.092814   -1.76965    9.66626
      SUMPAT |      3460    284.7312    570.3701          0       3806
        LOGR |      3460    1.229807    1.970524   -3.84868    7.06524
         PAT |      3460    36.28439    74.46563          0        608
        LPAT |      3460    1.935464    1.949421  -.6931472   6.410175
-------------+----------------------------------------------------------
        DPAT |      3460    .8251445    .3798984          0          1
       RANDD |      3460    23.02263    82.90186   .0213078   1170.563
      LOGRL1 |      3114    1.216943    1.960836   -3.84868    7.06524
      LOGRL2 |      2768    1.205747    1.953427   -3.84868    7.06524
      LOGRL3 |      2422     1.19942    1.946583   -3.84868    7.06524
-------------+----------------------------------------------------------
      LOGRL4 |      2076    1.197176    1.941555   -3.67395    7.06524
      LOGRL5 |      1730    1.203451    1.934293   -3.67395    7.06524
      dyear2 |      3460          .1    .3000434          0          1
      dyear3 |      3460          .1    .3000434          0          1
      dyear4 |      3460          .1    .3000434          0          1
-------------+----------------------------------------------------------
      dyear5 |      3460          .1    .3000434          0          1
. xtsum, i(id)
Variable         |       Mean   Std. Dev.        Min        Max |  Observations
-----------------+----------------------------------------------+---------------
id       overall |      173.5   99.89562          1        346 |   N =    3460
         between |              100.0258          1        346 |   n =     346
         within  |                     0      173.5      173.5 |   T =      10
                 |                                              |
year     overall |       74.5   2.872696         70         79 |   N =    3460
         between |                     0       74.5       74.5 |   n =     346
         within  |              2.872696         70         79 |   T =      10
                 |                                              |
CUSIP    overall |   531201.2   281707.7        800     989399 |   N =    3460
         between |              282074.9        800     989399 |   n =     346
         within  |                     0   531201.2   531201.2 |   T =      10
                 |                                              |
ARDSSIC  overall |    9.97619   5.452387          1         21 |   N =    3360
         between |              5.459706          1         21 |   n =     336
         within  |                     0    9.97619    9.97619 |   T =      10
                 |                                              |
SCISECT  overall |   .4248555   .4943925          0          1 |   N =    3460
         between |              .4950369          0          1 |   n =     346
         within  |                     0   .4248555   .4248555 |   T =      10
                 |                                              |
LOGK     overall |   3.921063   2.092814   -1.76965    9.66626 |   N =    3460
         between |              2.095542   -1.76965    9.66626 |   n =     346
         within  |                     0   3.921063   3.921063 |   T =      10
                 |                                              |
SUMPAT   overall |   284.7312   570.3701          0       3806 |   N =    3460
         between |              571.1136          0       3806 |   n =     346
         within  |                     0   284.7312   284.7312 |   T =      10
                 |                                              |
LOGR     overall |   1.229807   1.970524   -3.84868    7.06524 |   N =    3460
         between |              1.944421  -3.120133   6.911438 |   n =     346
         within  |              .3347099   -1.19673   4.218814 |   T =      10
                 |                                              |
PAT      overall |   36.28439   74.46563          0        608 |   N =    3460
         between |               72.5989          0      484.8 |   n =     346
         within  |              16.97772  -177.7156   224.3844 |   T =      10
                 |                                              |
LPAT     overall |   1.935464   1.949421  -.6931472   6.410175 |   N =    3460
         between |              1.873181  -.6931472   6.180623 |   n =     346
         within  |              .5482375  -.2643028   4.368045 |   T =      10
                 |                                              |
DPAT     overall |   .8251445   .3798984          0          1 |   N =    3460
         between |              .2831052          0          1 |   n =     346
         within  |              .2537376  -.0748555   1.725145 |   T =      10
                 |                                              |
RANDD    overall |   23.02263   82.90186   .0213078   1170.563 |   N =    3460
         between |              81.69163   .0582575   1014.058 |   n =     346
         within  |              14.71596  -280.2214     311.47 |   T =      10
                 |                                              |
LOGRL1   overall |   1.216943   1.960836   -3.84868    7.06524 |   N =    3114
         between |              1.937733  -3.123236   6.897784 |   n =     346
         within  |              .3157841  -.6151992   4.203909 |   T =       9
                 |                                              |
LOGRL2   overall |   1.205747   1.953427   -3.84868    7.06524 |   N =    2768
         between |              1.932143   -3.12461   6.889576 |   n =     346
         within  |              .3035537   -.486563   4.187752 |   T =       8
                 |                                              |
LOGRL3   overall |    1.19942   1.946583   -3.84868    7.06524 |   N =    2422
         between |              1.926813  -3.074006   6.887726 |   n =     346
         within  |              .2928787  -.2381882   4.153968 |   T =       7
                 |                                              |
LOGRL4   overall |   1.197176   1.941555   -3.67395    7.06524 |   N =    2076
         between |              1.923302  -2.989647   6.897597 |   n =     346
         within  |              .2818841  -.2335892   4.095286 |   T =       6
                 |                                              |
LOGRL5   overall |   1.203451   1.934293   -3.67395    7.06524 |   N =    1730
         between |              1.917687   -2.99075   6.924144 |   n =     346
         within  |              .2692134  -.1899074   4.062701 |   T =       5
                 |                                              |
dyear2   overall |         .1   .3000434          0          1 |   N =    3460
         between |                     0         .1         .1 |   n =     346
         within  |              .3000434          0          1 |   T =      10
                 |                                              |
dyear3   overall |         .1   .3000434          0          1 |   N =    3460
         between |                     0         .1         .1 |   n =     346
         within  |              .3000434          0          1 |   T =      10
                 |                                              |
dyear4   overall |         .1   .3000434          0          1 |   N =    3460
         between |                     0         .1         .1 |   n =     346
         within  |              .3000434          0          1 |   T =      10
                 |                                              |
dyear5   overall |         .1   .3000434          0          1 |   N =    3460
         between |                     0         .1         .1 |   n =     346
         within  |              .3000434          0          1 |   T =      10
.
. ********** DEFINE GLOBALS INCLUDING REGRESSOR LIST **********
.
. * Number of reps for the bootstrap
. * Table 23.1 used 500
. global nreps 500
.
. * The regressions below are of patents on LOGR ??? on ???
. * Additional regressors to be included below are defined in xextra
. * Here no additional regressors
. global xextra
.
. ********** (1) LINEAR PANEL RANDOM AND FIXED EFFECTS FOR LOG(PAT) **********
.
. * This ad hoc method uses as dependent variable
. *     LPAT = ln(PAT)   if PAT > 0
. *          = ln(0.5)   if PAT = 0
. * which is analyzed using chapter 21 methods
.
. * Note that in the first xt command need to give , i(id)
. * to indicate that the ith observation is for the ith id
. * Time invariant regressors LOGK SCISECT are not included
.
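The comments above define the ad hoc dependent variable, but the log does not show the lines that built it. A minimal sketch of how it could be checked follows. It assumes patr7079.dta is in the working directory; the names newpat, lpat_check and dpat_check are hypothetical and are used only to avoid overwriting the saved variables.

```stata
* Sketch only: rebuild the ad hoc log-patents variable and compare with LPAT.
use patr7079, clear

* Replace zero counts by 0.5 before taking logs (hypothetical variable names)
gen double newpat = PAT
replace newpat = 0.5 if PAT == 0
gen double lpat_check = ln(newpat)

* Indicator for at least one patent
gen byte dpat_check = (PAT > 0)

* The two pairs should have matching summary statistics
summarize LPAT lpat_check DPAT dpat_check
```

The minimum of LPAT reported earlier in the log, -.6931472, equals ln(0.5), which is consistent with this construction.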
. use patr7079, clear
. drop if year<75
.
. * Overall plot of data
. * The graphs below use new Stata 8 graphics
. * Change graphics scheme from default s2color to s1mono for printing
. set scheme s1mono
.
. * Figure 21.1 page 792 [with axis labels corrected - book is wrong]
. graph twoway (scatter LPAT LOGR, msize(vsmall)) (lowess LPAT LOGR) (lfit LPAT LOGR), /*
> */ title("Pooled (Overall) Regression") /*
> */ xtitle("Log R&D Spending", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Log Patents", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend( label(1 "Original data") label(2 "Nonparametric fit") label(3 "Linear fit"))
. graph export ch23fig1.wmf, replace
(file c:\Imbook\bwebpage\Section5\ch23fig1.wmf written in Windows Metafile format)

.
. * OLS
. regress LPAT LOGR $xextra, cluster(id)
                                                       F(  1,   345) = 1330.60
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.7192
                                                       Root MSE      =  1.0461
------------------------------------------------------------------------------
             |               Robust
        LPAT |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        LOGR |   .8340745   .0228655    36.48   0.000     .7891012    .8790478
       _cons |   .7954785   .0579246    13.73   0.000     .6815487    .9094083
------------------------------------------------------------------------------
. estimates store linolspan
.
. * Fixed effects
. xtreg LPAT LOGR $xextra, fe i(id)
R-sq:  between = 0.7669                         Number of obs      =      1730
       overall = 0.7192                         Number of groups   =       346
                                                Obs per group: avg =       5.0
                                                               max =         5
                                                F(1,1383)          =      3.63
corr(u_i, Xb)  = 0.8405                         Prob > F           =    0.0570
------------------------------------------------------------------------------
        LPAT |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        LOGR |   .1067505   .0560364     1.91   0.057    -.0031749     .216676
       _cons |   1.709116   .0714557    23.92   0.000     1.568943    1.849289
-------------+----------------------------------------------------------------
     sigma_u |  1.7380872
     sigma_e |  .51119065
------------------------------------------------------------------------------
F test that all u_i=0:                                       Prob > F = 0.0000
. estimates store linfe
.
. * Random effects
. xtreg LPAT LOGR $xextra, re i(id)
R-sq:  between = 0.7669                         Number of obs      =      1730
       overall = 0.7192                         Number of groups   =       346
                                                Obs per group: avg =       5.0
                                                               max =         5
                                                Wald chi2(1)       =    915.90
corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0000
------------------------------------------------------------------------------
        LPAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        LOGR |   .7202377   .0237986    30.26   0.000     .6735932    .7668821
       _cons |   .9384761   .0599584    15.65   0.000     .8209598    1.055992
-------------+----------------------------------------------------------------
     sigma_u |  .90057544
     sigma_e |  .51119065
------------------------------------------------------------------------------
. estimates store linre
.
.
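The fixed effects slope (.107) is far from the random effects slope (.720), and the reported corr(u_i, Xb) of 0.84 points the same way. One way to formalize the comparison, not run in the original program, is a Hausman test on the two stored results. The sketch below assumes patr7079.dta is available and simply refits both models with the same names used in the log.

```stata
* Sketch only: Hausman test of fixed versus random effects for Ln(Patents).
use patr7079, clear
drop if year < 75

xtreg LPAT LOGR, fe i(id)
estimates store linfe

xtreg LPAT LOGR, re i(id)
estimates store linre

* First argument: always-consistent estimator; second: efficient under H0
hausman linfe linre
```

This version of the test relies on the random effects estimator being fully efficient under the null, so it is not robust to the within-firm error correlation that the cluster-robust results in the log allow for.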
. ********** (2) POISSON RANDOM AND FIXED EFFECTS (Table 23.1 p.794) **********
.
. use patr7079, clear
. drop if year<75
.
. * Poisson Cross-section with Poisson standard errors
. * Table 23.1 Poisson column
.
. poisson PAT LOGR $xextra
Poisson regression                                Number of obs   =       1730
                                                  LR chi2(1)      =  108479.76
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.7206
------------------------------------------------------------------------------
         PAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        LOGR |   .6929337   .0022454   308.61   0.000     .6885329    .6973346
       _cons |   1.711528    .009767   175.24   0.000     1.692385    1.730671
------------------------------------------------------------------------------
. estimates store poisiid
.
. * Poisson Cross-section with heteroskedastic robust standard errors
. poisson PAT LOGR $xextra, robust
Poisson regression                                Number of obs   =       1730
                                                  Wald chi2(1)    =    1223.63
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.7206
------------------------------------------------------------------------------
             |               Robust
         PAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        LOGR |   .6929337   .0198092    34.98   0.000     .6541084     .731759
       _cons |   1.711528   .0620025    27.60   0.000     1.590006    1.833051
------------------------------------------------------------------------------
. estimates store poishet
.
. * Poisson Cross-section with panel robust standard errors
. poisson PAT LOGR $xextra, cluster(id)
Poisson regression                                Number of obs   =       1730
                                                  Wald chi2(1)    =     259.15
                                                  Prob > chi2     =     0.0000
------------------------------------------------------------------------------
             |               Robust
         PAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        LOGR |   .6929337   .0430441    16.10   0.000     .6085688    .7772987
       _cons |   1.711528   .1340309    12.77   0.000     1.448832    1.974224
------------------------------------------------------------------------------
. estimates store poispan

.
. * Poisson panel fixed effects
. * Table 23.1 p.794 Poisson-FE column
.
. * Poisson fixed effects
. xtpoisson PAT LOGR $xextra, fe i(id)
note: 22 groups (110 obs) dropped due to all zero outcomes

Conditional fixed-effects Poisson regression    Number of obs      =      1620
                                                Number of groups   =       324
                                                Obs per group: avg =       5.0
                                                               max =         5
                                                Wald chi2(1)       =      1.35
                                                Prob > chi2        =    0.2460
------------------------------------------------------------------------------
         PAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        LOGR |  -.0377642   .0325518    -1.16   0.246    -.1015645     .026036
------------------------------------------------------------------------------
. estimates store poisfe
.
. /*
> * Alternative way is to put in dummy variables
> set matsize 400
> xi: poisson PAT LOGR $xextra i.id
> */
.
. * Poisson panel random effects
. * Table 23.1 p.794 Poisson-RE column
.
. * Poisson random effects
. xtpoisson PAT LOGR $xextra, re i(id)
Fitting full model:

Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:

Random-effects Poisson regression               Number of obs      =      1730
                                                Number of groups   =       346
Random effects u_i ~ Gamma                      Obs per group: avg =       5.0
                                                               max =         5
                                                Wald chi2(1)       =    110.20
                                                Prob > chi2        =    0.0000
------------------------------------------------------------------------------
         PAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        LOGR |   .3487832   .0332254    10.50   0.000     .2836625    .4139039
       _cons |   2.312705    .124758    18.54   0.000     2.068184    2.557226
-------------+----------------------------------------------------------------
    /lnalpha |   .5454692   .0899144                      .3692402    .7216983
-------------+----------------------------------------------------------------
       alpha |   1.725418   .1551399                      1.446635    2.057925
------------------------------------------------------------------------------
Likelihood-ratio test of alpha=0: chibar2(01) = 3.1e+04 Prob>=chibar2 = 0.000
. estimates store poisre
.
. * Poisson random effects with normal error
. xtpoisson PAT LOGR $xextra, re i(id) normal
Fitting comparison Poisson model:

tau =  0.0
tau =  0.1
tau =  0.2
tau =  0.3
tau =  0.4
tau =  0.5

Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:

Fitting full model:

tau =  0.0
tau =  0.1
tau =  0.2
tau =  0.3

Iteration 0:
Iteration 1:
Iteration 2:

                                                Number of obs      =      1730
                                                Number of groups   =       346
                                                Obs per group: avg =       5.0
                                                               max =         5
                                                LR chi2(0)         =   2649.21
                                                Prob > chi2        =
------------------------------------------------------------------------------
         PAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        LOGR |    .815977          .        .       .            .           .
       _cons |   1.156293          .        .       .            .           .
-------------+----------------------------------------------------------------
    /lnsig2u |  -1.310299          .        .       .            .           .
-------------+----------------------------------------------------------------
     sigma_u |   .5193643          .                             .           .
------------------------------------------------------------------------------
Likelihood-ratio test of sigma_u=0: chibar2(01) = 3.0e+04 Pr>=chibar2 = 0.000
. estimates store poisrenormal
.
. * Poisson random effects population averaged
. xtpoisson PAT LOGR $xextra, pa i(id)
                                                Number of obs      =      1730
Group variable:                         id      Number of groups   =       346
Link:                                  log      Obs per group: min =         5
Family:                            Poisson                     avg =       5.0
Correlation:                  exchangeable                     max =         5
                                                Wald chi2(1)       =  16317.27
Scale parameter:                         1      Prob > chi2        =    0.0000
------------------------------------------------------------------------------
         PAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        LOGR |   .5595302   .0043803   127.74   0.000      .550945    .5681153
       _cons |   2.067515   .0185166   111.66   0.000     2.031223    2.103807
------------------------------------------------------------------------------
. estimates store poispa
.
. * Poisson random effects population averaged with robust se
. xtpoisson PAT LOGR $xextra, robust pa i(id)
                                                Number of obs      =      1730
Group variable:                         id      Number of groups   =       346
Link:                                  log      Obs per group: min =         5
Family:                            Poisson                     avg =       5.0
Correlation:                  exchangeable                     max =         5
                                                Wald chi2(1)       =    293.80
Scale parameter:                         1      Prob > chi2        =    0.0000
------------------------------------------------------------------------------
             |             Semi-robust
         PAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        LOGR |   .5595302   .0326436    17.14   0.000     .4955499    .6235104
       _cons |   2.067515   .1113256    18.57   0.000     1.849321    2.285709
------------------------------------------------------------------------------
. estimates store poispapan
.
. ********** (3) POISSON GEE (GENERALIZED ESTIMATING EQUATIONS) **********
.
. * Xtgee should reproduce Poisson random effects population averaged
. xtgee PAT LOGR $xextra, corr(exchangeable) family(poisson) link(log) i(id)
                                                Number of obs      =      1730
Group variable:                         id      Number of groups   =       346
Link:                                  log      Obs per group: min =         5
Family:                            Poisson                     avg =       5.0
Correlation:                  exchangeable                     max =         5
                                                Wald chi2(1)       =  16317.27
Scale parameter:                         1      Prob > chi2        =    0.0000
------------------------------------------------------------------------------
         PAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        LOGR |   .5595302   .0043803   127.74   0.000      .550945    .5681153
       _cons |   2.067515   .0185166   111.66   0.000     2.031223    2.103807
------------------------------------------------------------------------------
. estimates store poisgee
.
. * Xtgee should reproduce Poisson random effects population averaged with robust se
. xtgee PAT LOGR $xextra, corr(exchangeable) family(poisson) link(log) i(id) robust
                                                Number of obs      =      1730
Group variable:                         id      Number of groups   =       346
Link:                                  log      Obs per group: min =         5
Family:                            Poisson                     avg =       5.0
Correlation:                  exchangeable                     max =         5
                                                Wald chi2(1)       =    293.80
Scale parameter:                         1      Prob > chi2        =    0.0000
------------------------------------------------------------------------------
             |             Semi-robust
         PAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        LOGR |   .5595302   .0326436    17.14   0.000     .4955499    .6235104
       _cons |   2.067515   .1113256    18.57   0.000     1.849321    2.285709
------------------------------------------------------------------------------
. estimates store poisgeepan
.
. * Xtgee should give NLS of exponential mean with iid standard errors
. xtgee PAT LOGR $xextra, corr(independent) family(gaussian) link(log) i(id)
                                                Number of obs      =      1730
Group variable:                         id      Number of groups   =       346
Link:                                  log      Obs per group: min =         5
Family:                           Gaussian                     avg =       5.0
Correlation:                   independent                     max =         5
                                                Wald chi2(1)       =   2316.87
Scale parameter:                  2060.724      Prob > chi2        =    0.0000

Pearson chi2(1730):              3565052.8      Deviance           = 3565052.8
Dispersion (Pearson):             2060.724      Dispersion         =  2060.724
------------------------------------------------------------------------------
         PAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        LOGR |   .5084673   .0105636    48.13   0.000      .487763    .5291716
       _cons |   2.528729   .0544558    46.44   0.000     2.421997     2.63546
------------------------------------------------------------------------------
. estimates store nls
.
. * Xtgee should give NLS of exponential mean with robust standard errors
. xtgee PAT LOGR $xextra, corr(independent) family(gaussian) link(log) i(id) robust
                                                Number of obs      =      1730
Group variable:                         id      Number of groups   =       346
Link:                                  log      Obs per group: min =         5
Family:                           Gaussian                     avg =       5.0
Correlation:                   independent                     max =         5
                                                Wald chi2(1)       =     85.32
Scale parameter:                  2060.724      Prob > chi2        =    0.0000

Pearson chi2(1730):              3565052.8      Deviance           = 3565052.8
Dispersion (Pearson):             2060.724      Dispersion         =  2060.724
------------------------------------------------------------------------------
             |             Semi-robust
         PAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        LOGR |   .5084673    .055046     9.24   0.000     .4005791    .6163554
       _cons |   2.528729   .2176674    11.62   0.000     2.102109    2.955349
------------------------------------------------------------------------------
. estimates store nlspan
.
. ********** (4) PANEL ROBUST STANDARD ERRORS BY BOOTSTRAP **********
.
. * For discussion of panel robust standard errors
. * see text Section 23.2.6 page 788-9 (nonlinear panel)
. * and text Section 21.2.3 page 705-8 (linear panel)
.
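The bootstrap commands in this section use the Stata 8 syntax, with the estimation command and the statistics passed as quoted strings. From Stata 9 onward bootstrap is a prefix command. A hedged sketch of the pooled Poisson case in that later syntax follows; it assumes patr7079.dta is available, and the replicate draws need not match the log because the random-number generator differs across Stata versions.

```stata
* Sketch only: cluster (panel) bootstrap of pooled Poisson in prefix syntax.
use patr7079, clear
drop if year < 75

* Resample whole firms (clusters on id) rather than firm-year observations
bootstrap _b[LOGR] _b[_cons], reps(500) cluster(id) seed(10001): ///
    poisson PAT LOGR
```

Resampling entire firms keeps each firm's five yearly observations together, which is what makes the resulting standard errors panel robust.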
. * Pooled Poisson panel robust bootstrap standard errors
. set seed 10001
. bootstrap "poisson PAT LOGR $xextra" "_b[LOGR] _b[_cons]", cluster(id) reps($nreps) level(95)
command:      poisson PAT LOGR
statistics:   _bs_1      = _b[LOGR]
              _bs_2      = _b[_cons]

                                                  Number of obs    =      1730
                                                  N of clusters    =       346
                                                  Replications     =       500

------------------------------------------------------------------------------
Variable     |  Reps  Observed      Bias  Std. Err. [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _bs_1 |   500  .6929337  .0081667  .0473006   .6000008   .7858666   (N)
             |                                       .6250867   .8100113   (P)
             |                                       .6209522   .8025689  (BC)
       _bs_2 |   500  1.711528 -.0267995   .141745   1.433038   1.990019   (N)
             |                                       1.336657   1.924925   (P)
             |                                       1.355381   1.935691  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
. matrix poisbootse = e(se)
.
. * Poisson fixed effects panel bootstrap standard errors
. set seed 10001
. bootstrap "xtpoisson PAT LOGR $xextra, fe i(id)" "_b[LOGR]", cluster(id) reps($nreps) level(95)
command:      xtpoisson PAT LOGR , fe i(id)
statistic:    _bs_1      = _b[LOGR]

                                                  Number of obs    =      1620
                                                  N of clusters    =       324
                                                  Replications     =       500

------------------------------------------------------------------------------
Variable     |  Reps  Observed      Bias  Std. Err. [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _bs_1 |   500 -.0377642  .0057448  .1067039  -.2474085     .17188   (N)
             |                                      -.2458792   .1454112   (P)
             |                                      -.3182177   .1310303  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
. matrix poisfebootse = e(se)
.
. * Poisson random effects panel bootstrap standard errors
. set seed 10001
. bootstrap "xtpoisson PAT LOGR $xextra, re i(id)" "_b[LOGR] _b[_cons]", cluster(id) reps($nreps) level(95)
command:      xtpoisson PAT LOGR , re i(id)
statistics:   _bs_1      = _b[LOGR]
              _bs_2      = _b[_cons]

                                                  Number of obs    =      1730
                                                  N of clusters    =       346
                                                  Replications     =       500

------------------------------------------------------------------------------
Variable     |  Reps  Observed      Bias  Std. Err. [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _bs_1 |   500  .3487832 -.1581585  .1194127   .1141695   .5833969   (N)
             |                                      -.0414326   .4028537   (P)
             |                                       .2775298   .5040658  (BC)
       _bs_2 |   500  2.312705  .5382745  .4384781   1.451214   3.174196   (N)
             |                                       2.104445   3.743506   (P)
             |                                       1.804036   2.552794  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
. matrix poisrebootse = e(se)
.
. * Poisson population averaged panel bootstrap standard errors
. set seed 10001
. bootstrap "xtpoisson PAT LOGR $xextra, pa i(id)" "_b[LOGR] _b[_cons]", cluster(id) reps($nreps) level(95)
command:      xtpoisson PAT LOGR , pa i(id)
statistics:   _bs_1      = _b[LOGR]
              _bs_2      = _b[_cons]

                                                  Number of obs    =      1730
                                                  N of clusters    =       346
                                                  Replications     =       500

------------------------------------------------------------------------------
Variable     |  Reps  Observed      Bias  Std. Err. [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _bs_1 |   338  .5595301 -.0013448  .1072904   .3484868   .7705734   (N)
             |                                       .1938364   .6946551   (P)
             |                                       .0630385   .6535396  (BC)
       _bs_2 |   338  2.067515 -.0016997  .2940233   1.489163   2.645867   (N)
             |                                       1.675453   3.034075   (P)
             |                                        1.80883   3.352539  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
. matrix poispabootse = e(se)
. set seed 10001
.
. * Xtgee should give exponential mean (NLS) with iid errors with bootstrap se's
. bootstrap "xtgee PAT LOGR $xextra, corr(independent) family(gaussian) link(log) i(id)" "_b[LOGR] _b[_cons]", cluster(id) reps($nreps) level(95)
command:      xtgee PAT LOGR , corr(independent) family(gaussian) link(log) i(id)
statistics:   _bs_1      = _b[LOGR]
              _bs_2      = _b[_cons]

                                                  Number of obs    =      1730
                                                  N of clusters    =       346
                                                  Replications     =       500

------------------------------------------------------------------------------
Variable     |  Reps  Observed      Bias  Std. Err. [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _bs_1 |   500  .5084673  .0122215  .0541264   .4021235    .614811   (N)
             |                                       .4453159   .6547906   (P)
             |                                       .4372376   .6397901  (BC)
       _bs_2 |   500  2.528729 -.0502655   .198022   2.139669   2.917789   (N)
             |                                       1.953206   2.763821   (P)
             |                                       2.084754   2.820513  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
.
. * Results given in same order as in Table 23.1 page 794
. matrix nlsbootse = e(se)
. matrix list poisbootse

poisbootse[1,2]
        _bs_1      _bs_2
se  .04730061  .14174498

. matrix list poisfebootse

symmetric poisfebootse[1,1]
        _bs_1
se  .10670389

. matrix list poisrebootse

poisrebootse[1,2]
        _bs_1      _bs_2
se  .11941272  .43847813

. matrix list poispabootse

poispabootse[1,2]
        _bs_1      _bs_2
se  .10729042  .29402327
.
. ********** DISPLAY RESULTS FOR (1)-(3) GIVEN IN TABLE 23.1 page 794 **********
.
. * Standard error using iid errors and in some cases panel
.
. estimates table linolspan linfe linre, t se /*
> */ stats(N ll r2 tss rss mss rmse df_r) b(%10.3f)
----------------------------------------------------
    Variable | linolspan      linfe       linre
-------------+--------------------------------------
        LOGR |     0.834       0.107       0.720
             |     0.023       0.056       0.024
             |     36.48        1.91       30.26
       _cons |     0.795       1.709       0.938
             |     0.058       0.071       0.060
             |     13.73       23.92       15.65
-------------+--------------------------------------
           N |  1730.000    1730.000    1730.000
          ll | -2531.658   -1100.267
          r2 |     0.719       0.003
         tss |  6732.584
         rss |  1890.831     361.400
         mss |  4841.753       0.948
        rmse |     1.046       0.511
        df_r |   345.000    1383.000
----------------------------------------------------
                                      legend: b/se/t
. estimates table poisiid poishet poispan, t se /*
----------------------------------------------------
    Variable |  poisiid     poishet     poispan
-------------+--------------------------------------
        LOGR |      0.693       0.693       0.693
             |      0.002       0.020       0.043
             |     308.61       34.98       16.10
       _cons |      1.712       1.712       1.712
             |      0.010       0.062       0.134
             |     175.24       27.60       12.77
-------------+--------------------------------------
           N |   1730.000    1730.000    1730.000
          ll | -21030.583  -21030.583  -21030.583
          r2 |
         tss |
         rss |
         mss |
        rmse |
        df_r |
----------------------------------------------------
                                      legend: b/se/t
. estimates table poisfe poisre poisrenormal poispa poispapan, t se /*

------------------------------------------------------------------------------
    Variable |   poisfe      poisre    poisreno~l     poispa     poispapan
-------------+----------------------------------------------------------------
PAT          |
        LOGR |    -0.038       0.349       0.816
             |     0.033       0.033       0.000
             |     -1.16       10.50           .
       _cons |                 2.313       1.156
             |                 0.125       0.000
             |                 18.54           .
-------------+----------------------------------------------------------------
lnalpha      |
       _cons |                 0.545
             |                 0.090
             |                  6.07
-------------+----------------------------------------------------------------
lnsig2u      |
       _cons |                            -1.310
             |                             0.000
             |                                 .
-------------+----------------------------------------------------------------
_            |
        LOGR |                                         0.560       0.560
             |                                         0.004       0.033
             |                                        127.74       17.14
       _cons |                                         2.068       2.068
             |                                         0.019       0.111
             |                                        111.66       18.57
-------------+----------------------------------------------------------------
Statistics   |
           N |  1620.000    1730.000    1730.000    1730.000    1730.000
          ll | -3659.593   -5553.179   -6261.982
          r2 |
         tss |
         rss |
         mss |
        rmse |
        df_r |
------------------------------------------------------------------------------
                                                                legend: b/se/t
. estimates table poisgee poisgeepan nls nlspan, t se /*
-----------------------------------------------------------------
    Variable |  poisgee   poisgeepan       nls       nlspan
-------------+---------------------------------------------------
        LOGR |     0.560       0.560       0.508       0.508
             |     0.004       0.033       0.011       0.055
             |    127.74       17.14       48.13        9.24
       _cons |     2.068       2.068       2.529       2.529
             |     0.019       0.111       0.054       0.218
             |    111.66       18.57       46.44       11.62
-------------+---------------------------------------------------
           N |  1730.000    1730.000    1730.000    1730.000
          ll |
          r2 |
         tss |
         rss |
         mss |
        rmse |
        df_r |
-----------------------------------------------------------------
                                                   legend: b/se/t
.
. log close
log: c:\Imbook\bwebpage\Section5\mma23p1pannonlin.txt
log type: text
closed on: 23 May 2005, 12:53:45
-------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section6\mma24p1olscluster.txt
log type: text
opened on: 24 May 2005, 14:33:58
.
. ********** OVERVIEW OF MMA24P1OLSCLUSTER.DO **********
.
. * STATA Program
.
. * Chapter 24.7 pages 848-53 Table 24.4
. * Cluster robust inference for OLS cross-section application using
. * Vietnam Living Standard Survey data
.
. * (0) Descriptive Statistics (Table 24.3 first half)
. * (1) Linear regression (in logs) with household data (Table 24.4)
.
. * For Tables 24.5-6 for clustered count data see MMA24P2POISCLUSTER.DO
.
. * The cluster effects model is
. * y_it = x_it'b + a_i + e_it
. * Default xtreg output assumes e_it is iid.
. * Instead should get cluster-robust errors after xtreg
. * See Section 21.2.3 pages 709-12
. * Stata Version 8 does not do this but Stata version 9 does.
. * Here we do a panel bootstrap - results not reported in the text
.
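The comments above note that later Stata versions can report cluster-robust standard errors directly after xtreg. A hedged sketch of that route, in the syntax of more recent Stata releases rather than the version 8 used for this log, is given below. It assumes vietnam_ex1.dta is in the working directory and uses the original lower-case variable names shown in the describe output later in the log.

```stata
* Sketch only: random effects with cluster-robust standard errors by commune.
use vietnam_ex1.dta, clear

* Declare the commune as the panel (cluster) identifier
xtset commune

* Same regressors as the linear model; vce(cluster) relaxes the iid e_it assumption
xtreg lhhex12m lhhexp1 age sex hhsize farm comped98, re vce(cluster commune)
```

The coefficient estimates should match the default random effects fit; only the standard errors change.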
. * To speed up programs reduce breps - the number of bootstrap reps
.
. * vietnam_ex1.dta
.
. ********** SETUP **********
.
. set more off
. version 8.0
.
.
. * The data comes from World Bank 1997 Vietnam Living Standards Survey
. * A subset was used in chapter 4.6.4.
. * The larger sample here is described on pages 848-9
.
. * The data are HOUSEHOLD data
. * There are N=5006 households in 194 clusters
.
. * The separate data set vietnam_ex2.dta has household-level data
.
. ********** READ IN HOUSEHOLD DATA and SUMMARIZE (Table 24.3) **********
.
. use vietnam_ex1.dta
. desc
Contains data from vietnam_ex1.dta
  obs:         5,999
 vars:             8                          11 Apr 2005 12:39
 size:
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
sex             byte   %8.0g
age             int    %8.0g
comped98        float  %9.0g       diploma    completed diploma HH.head
farm            float  %9.0g
hhsize          long   %12.0g                 Household size
commune         float  %9.0g                  commune code PSU-SVY commands
lhhexp1         float  %9.0g
lhhex12m        float  %9.0g
-------------------------------------------------------------------------------
Sorted by:
. sum
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+----------------------------------------------------------
         sex |      5999    1.270712    .4443645          1          2
         age |      5999    48.01284     13.7702         16         95
    comped98 |      5999    3.385564    2.037543          0          9
        farm |      5999    .5730955    .4946694          0          1
      hhsize |      5999    4.752292    1.954292          1         19
-------------+----------------------------------------------------------
     commune |      5999    98.26588    56.00461          1        194
     lhhexp1 |      5999    9.341561    .6877458   6.543108   12.20242
    lhhex12m |      5006    6.310585    1.593083          0   12.36325
.
. rename sex SEX
. rename age AGE
. rename comped98 EDUC
. rename farm FARM

. rename hhsize HHSIZE
. rename commune COMMUNE
. rename lhhexp1 LNHHEXP
. rename lhhex12m LNEXP12M
. gen HHEXP = exp(LNHHEXP)
.
. * Following should give same descriptive statistics
. * as in top half (Household) in Table 24.3 p.850
. * But there are some differences plus here have FARM not URBAN
. sum LNEXP12M AGE SEX HHSIZE FARM EDUC HHEXP LNHHEXP COMMUNE
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+----------------------------------------------------------
    LNEXP12M |      5006    6.310585    1.593083          0   12.36325
         AGE |      5999    48.01284     13.7702         16         95
         SEX |      5999    1.270712    .4443645          1          2
      HHSIZE |      5999    4.752292    1.954292          1         19
        FARM |      5999    .5730955    .4946694          0          1
-------------+----------------------------------------------------------
        EDUC |      5999    3.385564    2.037543          0          9
       HHEXP |      5999    14599.23    12582.31   694.4419     199271
     LNHHEXP |      5999    9.341561    .6877458   6.543108   12.20242
     COMMUNE |      5999    98.26588    56.00461          1        194
.
. * Note that LNEXP12M has some missing values coded as .
. outfile LNEXP12M AGE SEX HHSIZE FARM EDUC LNHHEXP COMMUNE /*
> */using vietnam_ex1.asc, replace
.
. ********** ANALYSIS: CLUSTER ANALYSIS FOR LINEAR MODEL [Table 24.4 p.851] **********
.
. * Regressor list for the linear regressions
. global XLISTLINEAR LNHHEXP AGE SEX HHSIZE FARM EDUC
.
. * OLS with usual standard errors (Table 24.4 columns 1-2)
. regress LNEXP12M $XLISTLINEAR
      Source |       SS       df       MS
-------------+------------------------------           F(  6,  4999) =   82.02
       Model |  1138.38332     6  189.730553           Prob > F      =  0.0000
    Residual |   11563.877  4999  2.31323805           R-squared     =  0.0896
-------------+------------------------------           Adj R-squared =  0.0885
       Total |  12702.2603  5005  2.53791415           Root MSE      =  1.5209
------------------------------------------------------------------------------
    LNEXP12M |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     LNHHEXP |   .6702328   .0418711    16.01   0.000     .5881472    .7523185
         AGE |   .0105766   .0016554     6.39   0.000     .0073312     .013822
         SEX |    .097444   .0518961     1.88   0.060    -.0042952    .1991832
      HHSIZE |   .0289812   .0132524     2.19   0.029     .0030007    .0549617
        FARM |   .1346891   .0493325     2.73   0.006     .0379757    .2314025
        EDUC |  -.0903599   .0122803    -7.36   0.000    -.1144346   -.0662852
       _cons |  -.5107135   .3799642    -1.34   0.179     -1.25561     .234183
------------------------------------------------------------------------------
. estimates store olsiid
.
. * OLS with heteroskedastic-robust standard errors (Table 24.4 column 3)
. regress LNEXP12M $XLISTLINEAR, robust
                                                       Number of obs =    5006
                                                       F(  6,  4999) =   80.80
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.0896
                                                       Root MSE      =  1.5209
------------------------------------------------------------------------------
             |               Robust
    LNEXP12M |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     LNHHEXP |   .6702328   .0425223    15.76   0.000     .5868705    .7535952
         AGE |   .0105766   .0016634     6.36   0.000     .0073157    .0138376
         SEX |    .097444   .0519606     1.88   0.061    -.0044217    .1993096
      HHSIZE |   .0289812   .0134698     2.15   0.031     .0025744     .055388
        FARM |   .1346891   .0494286     2.72   0.006     .0377873    .2315908
        EDUC |  -.0903599   .0127869    -7.07   0.000    -.1154278   -.0652919
       _cons |  -.5107135   .3812665    -1.34   0.180    -1.258163    .2367362
------------------------------------------------------------------------------
. estimates store olshet
.
. * OLS with cluster-robust standard errors (Table 24.4 column 4)
. regress LNEXP12M $XLISTLINEAR, cluster(COMMUNE)
                                                       Number of obs =    5006
                                                       F(  6,   193) =   54.91
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.0896
Number of clusters (COMMUNE) = 194                     Root MSE      =  1.5209
------------------------------------------------------------------------------
             |               Robust
    LNEXP12M |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     LNHHEXP |   .6702328   .0528536    12.68   0.000      .565988    .7744777
         AGE |   .0105766   .0019371     5.46   0.000     .0067561    .0143972
         SEX |    .097444   .0595084     1.64   0.103    -.0199263    .2148142
      HHSIZE |   .0289812   .0153602     1.89   0.061    -.0013142    .0592766
        FARM |   .1346891   .0608046     2.22   0.028     .0147622    .2546159
        EDUC |  -.0903599   .0149743    -6.03   0.000    -.1198942   -.0608255
       _cons |  -.5107135   .4706163    -1.09   0.279    -1.438925    .4174979
------------------------------------------------------------------------------
. estimates store olsclust
.
. * Random effects estimation (FGLS) (Table 24.4 columns 5-6)
. * This uses the xtreg command which first requires identifying the cluster
. iis COMMUNE
. xtreg LNEXP12M $XLISTLINEAR, re
Group variable (i): COMMUNE                     Number of obs      =      5006
R-sq:  between = 0.2884                         Number of groups   =       194
       overall = 0.0883                         Obs per group: avg =      25.8
                                                               max =        39
                                                Wald chi2(6)       =    335.12
corr(u_i, X)       = 0 (assumed)                Prob > chi2        =    0.0000
------------------------------------------------------------------------------
    LNEXP12M |      Coef.   Std. Err.        z   P>|z|      [95% Conf. Interval]
-------------+----------------------------------------------------------------
     LNHHEXP |   .6268899    .0468004    13.39   0.000      .5351627     .718617
         AGE |   .0112334    .0016411     6.85   0.000       .008017    .0144499
         SEX |   .1069915    .0511849     2.09   0.037      .0066709    .2073121
      HHSIZE |   .0158302    .0135166     1.17   0.242     -.0106618    .0423222
        FARM |   .0928509    .0549544     1.69   0.091     -.0148578    .2005595
        EDUC |  -.0638447    .0129744    -4.92   0.000     -.0892741   -.0384153
       _cons |  -.1660698    .4202027    -0.40   0.693      -.989652    .6575123
-------------+----------------------------------------------------------------
     sigma_u |  .46739871
     sigma_e |  1.4526468
------------------------------------------------------------------------------
. estimates store refgls

.
. * Note that one can cluster bootstrap if desired to get more robust standard errors
. * This is done at end of program
.
. * Fixed effects estimation (FGLS) (Table 24.4 columns 7-8)
. xtreg LNEXP12M $XLISTLINEAR, fe
R-sq:  between = 0.2787                         Number of obs      =      5006
       overall = 0.0865                         Number of groups   =       194
                                                Obs per group: avg =      25.8
                                                               max =        39
                                                F(6,4806)          =     43.92
corr(u_i, Xb)  = 0.0797                         Prob > F           =    0.0000
------------------------------------------------------------------------------
    LNEXP12M |      Coef.   Std. Err.        t   P>|t|      [95% Conf. Interval]
-------------+----------------------------------------------------------------
     LNHHEXP |   .6037139    .0520178    11.61   0.000      .5017352    .7056926
         AGE |   .0115845    .0016706     6.93   0.000      .0083092    .0148597
         SEX |    .112821    .0520014     2.17   0.030      .0108745    .2147675
      HHSIZE |   .0107124    .0141127     0.76   0.448      -.016955    .0383797
        FARM |   .0693037    .0609002     1.14   0.255     -.0500885    .1886959
        EDUC |  -.0510325    .0135817    -3.76   0.000     -.0776588   -.0244062
       _cons |   .0361552     .461482     0.08   0.938     -.8685606    .9408711
-------------+----------------------------------------------------------------
     sigma_u |  .57732514
     sigma_e |  1.4526468
------------------------------------------------------------------------------
Prob > F = 0.0000
. estimates store fe
.
.
. * Random effects estimation by MLE assuming normality (Table 24.4 columns 5-6)
. * This uses the xtreg command which first requires identifying the cluster
. iis COMMUNE
. xtreg LNEXP12M $XLISTLINEAR, mle
Fitting full model:
                                                Number of obs      =      5006
                                                Number of groups   =       194
                                                Obs per group: avg =      25.8
                                                               max =        39
                                                LR chi2(6)         =    319.19
                                                Prob > chi2        =    0.0000
------------------------------------------------------------------------------
    LNEXP12M |      Coef.   Std. Err.        z   P>|z|      [95% Conf. Interval]
-------------+----------------------------------------------------------------
     LNHHEXP |   .6276456    .0467072    13.44   0.000       .536101    .7191901
         AGE |     .01122    .0016406     6.84   0.000      .0080045    .0144354
         SEX |   .1067788    .0511618     2.09   0.037      .0065035     .207054
      HHSIZE |     .01603    .0135121     1.19   0.235     -.0104533    .0425133
        FARM |   .0936529    .0548379     1.71   0.088     -.0138274    .2011332
        EDUC |  -.0643046    .0130222    -4.94   0.000     -.0898277   -.0387816
       _cons |  -.1718111    .4192856    -0.41   0.682     -.9935959    .6499737
-------------+----------------------------------------------------------------
    /sigma_u |    .455472    .0329742    13.81   0.000      .3908438    .5201002
    /sigma_e |   1.452303    .0148092    98.07   0.000      1.423278    1.481329
-------------+----------------------------------------------------------------
         rho |   .0895499    .0120221                       .0682208    .1154799
------------------------------------------------------------------------------
. estimates store remle
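
The rho row of the random effects MLE output is the share of the total error variance due to the commune effect, rho = sigma_u^2 / (sigma_u^2 + sigma_e^2). The following is a minimal Python sketch (not part of the Stata program) that recomputes it from the two reported standard deviations; the function name is hypothetical.

```python
def intraclass_correlation(sigma_u, sigma_e):
    """Share of total error variance due to the cluster-specific effect."""
    var_u = sigma_u ** 2
    var_e = sigma_e ** 2
    return var_u / (var_u + var_e)

# Values copied from the /sigma_u and /sigma_e rows of the xtreg, mle output.
rho = intraclass_correlation(0.455472, 1.452303)
print(round(rho, 7))
```

The result should agree with the reported rho up to rounding of the inputs.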
.
. * Test of the RE specification using Breusch-Pagan test
. * This is statistic in third bottom row of Table 24.4
. quietly xtreg LNEXP12M $XLISTLINEAR, re
. xttest0
Breusch and Pagan Lagrangian multiplier test for random effects:
        LNEXP12M[COMMUNE,t] = Xb + u[COMMUNE] + e[COMMUNE,t]
        Estimated results:
                         |       Var     sd = sqrt(Var)
                ---------+-----------------------------
                LNEXP12M |   2.537914       1.593083
                       e |   2.110183       1.452647
                       u |   .2184615       .4673987
        Test:   Var(u) = 0
                              chi2(1) =   432.75
                          Prob > chi2 =     0.0000
.
. * Hausman test of FE vs. RE specification
. * This test is not a robust version.
. * Its validity assumes that errors are iid after including COMMUNE-specific effect
. * For this example this may be reasonable as cluster bootstrap se's close to usual se's
. xthausman
(Warning: xthausman is no longer a supported command; use -hausman-. For instructions, see help hausman.)
Hausman specification test
                ---- Coefficients ----
             |      Fixed       Random
    LNEXP12M |    Effects      Effects     Difference
-------------+----------------------------------------
     LNHHEXP |   .6037139     .6268899     -.0231759
         AGE |   .0115845     .0112334       .000351
         SEX |    .112821     .1069915      .0058295
      HHSIZE |   .0107124     .0158302     -.0051179
        FARM |   .0693037     .0928509     -.0235472
        EDUC |  -.0510325    -.0638447      .0128122
    Test:  Ho:  difference in coefficients not systematic
                 chi2(  6) = (b-B)'[S^(-1)](b-B), S = (S_fe - S_re)
                           =    17.89
                 Prob>chi2 =     0.0065
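
The reported p-value can be checked by hand. The sketch below is Python, not part of the Stata program, and the function name is hypothetical. It uses the closed form of the chi-square upper tail for an even number of degrees of freedom, P(X > x) = exp(-x/2) * sum_{j=0}^{df/2-1} (x/2)^j / j!, so it applies to this test (df = 6) but not to odd df.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """Upper-tail probability of a chi-square(df) variate, valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form used here requires a positive even df")
    half = x / 2.0
    term = 1.0   # (x/2)^j / j! for j = 0
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total

# Hausman statistic and degrees of freedom copied from the output above.
p_value = chi2_upper_tail_even_df(17.89, 6)
print(round(p_value, 4))
```

A p-value below 0.01 rejects the random effects specification at the 1 percent level, subject to the iid caveat stated in the comments before the test.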
.
. * Alternative GLS estimation using the GEE approach
. * Same as xtgee with family(gaussian) link(id) corr(exchangeable)
. * So GLS with equicorrelated errors
. xtreg LNEXP12M $XLISTLINEAR, pa
                                                Number of obs      =      5006
Group variable:                    COMMUNE      Number of groups   =       194
Link:                             identity      Obs per group: min =         1
Family:                           Gaussian                     avg =      25.8
Correlation:                  exchangeable                     max =        39
                                                Wald chi2(6)       =    338.97
Scale parameter:                  2.314413      Prob > chi2        =    0.0000
------------------------------------------------------------------------------
    LNEXP12M |      Coef.   Std. Err.        z   P>|z|      [95% Conf. Interval]
-------------+----------------------------------------------------------------
     LNHHEXP |   .6281447    .0466076    13.48   0.000      .5367955     .719494
         AGE |   .0112111    .0016411     6.83   0.000      .0079946    .0144275
         SEX |   .1066389    .0511914     2.08   0.037      .0063056    .2069722
      HHSIZE |   .0161625     .013502     1.20   0.231     -.0103009    .0426259
        FARM |   .0941811    .0547349     1.72   0.085     -.0130973    .2014594
        EDUC |  -.0646085    .0129528    -4.99   0.000     -.0899956   -.0392215
       _cons |  -.1756087    .4185566    -0.42   0.675     -.9959645    .6447472
------------------------------------------------------------------------------
. estimates store pa
.
. ********** DISPLAY TABLE 24.4 RESULTS page 851 **********
.
. estimates table olsiid olshet olsclust, /*
> */ b(%10.3f) t(%10.2f) stats(r2 N)
----------------------------------------------------
    Variable |   olsiid       olshet      olsclust
-------------+--------------------------------------
     LNHHEXP |      0.670        0.670        0.670
             |      16.01        15.76        12.68
         AGE |      0.011        0.011        0.011
             |       6.39         6.36         5.46
         SEX |      0.097        0.097        0.097
             |       1.88         1.88         1.64
      HHSIZE |      0.029        0.029        0.029
             |       2.19         2.15         1.89
        FARM |      0.135        0.135        0.135
             |       2.73         2.72         2.22
        EDUC |     -0.090       -0.090       -0.090
             |      -7.36        -7.07        -6.03
       _cons |     -0.511       -0.511       -0.511
             |      -1.34        -1.34        -1.09
-------------+--------------------------------------
          r2 |      0.090        0.090        0.090
           N |   5006.000     5006.000     5006.000
----------------------------------------------------
                                        legend: b/t
. estimates table pa fe refgls remle, /*
> */ b(%10.3f) t(%10.2f) stats(r2 N)
-----------------------------------------------------------------
    Variable |     pa           fe         refgls       remle
-------------+---------------------------------------------------
_            |
     LNHHEXP |      0.628        0.604        0.627
             |      13.48        11.61        13.39
         AGE |      0.011        0.012        0.011
             |       6.83         6.93         6.85
         SEX |      0.107        0.113        0.107
             |       2.08         2.17         2.09
      HHSIZE |      0.016        0.011        0.016
             |       1.20         0.76         1.17
        FARM |      0.094        0.069        0.093
             |       1.72         1.14         1.69
        EDUC |     -0.065       -0.051       -0.064
             |      -4.99        -3.76        -4.92
       _cons |     -0.176        0.036       -0.166
             |      -0.42         0.08        -0.40
-------------+---------------------------------------------------
LNEXP12M     |
     LNHHEXP |                                             0.628
             |                                             13.44
         AGE |                                             0.011
             |                                              6.84
         SEX |                                             0.107
             |                                              2.09
      HHSIZE |                                             0.016
             |                                              1.19
        FARM |                                             0.094
             |                                              1.71
        EDUC |                                            -0.064
             |                                             -4.94
       _cons |                                            -0.172
             |                                             -0.41
-------------+---------------------------------------------------
sigma_u      |
       _cons |                                             0.455
             |                                             13.81
-------------+---------------------------------------------------
sigma_e      |
       _cons |                                             1.452
             |                                             98.07
-------------+---------------------------------------------------
Statistics   |
          r2 |                   0.052
           N |   5006.000     5006.000     5006.000     5006.000
-----------------------------------------------------------------
                                                     legend: b/t
.
. ********** ADDITIONALLY DO CLUSTER BOOTSTRAPS **********
.
. * These results not given in the text
.
. global breps = 500
.
. * Note that one can bootstrap if desired to get more robust standard errors
. * The first reproduces reg , cluster(COMMUNE)
. bootstrap "reg LNEXP12M $XLISTLINEAR" _b, cluster(COMMUNE) reps($breps) level(95)
command:      reg LNEXP12M LNHHEXP AGE SEX HHSIZE FARM EDUC
statistics:   b_LNHHEXP  = _b[LNHHEXP]
              b_AGE      = _b[AGE]
              b_SEX      = _b[SEX]
              b_HHSIZE   = _b[HHSIZE]
              b_FARM     = _b[FARM]
              b_EDUC     = _b[EDUC]
              b_cons     = _b[_cons]
                                                Number of obs    =      5006
                                                N of clusters    =       194
                                                Replications     =       500
------------------------------------------------------------------------------
    Variable | Reps   Observed       Bias  Std. Err.   [95% Conf. Interval]
-------------+----------------------------------------------------------------
   b_LNHHEXP |  500   .6702328   .0000939   .0546562    .5628482   .7776175  (N)
             |                                          .5575338   .7715588  (P)
             |                                          .5502583   .7638555  (BC)
       b_AGE |  500   .0105766   .0000108   .0019538    .0067379   .0144154  (N)
             |                                          .0067395   .0143774  (P)
             |                                           .006576   .0141968  (BC)
       b_SEX |  500    .097444  -.0023301   .0602315   -.0208945   .2157825  (N)
             |                                         -.0210348   .2196117  (P)
             |                                         -.0261246   .2083439  (BC)
    b_HHSIZE |  500   .0289812  -.0008009   .0160043   -.0024629   .0604252  (N)
             |                                         -.0004838   .0628019  (P)
             |                                          .0028144   .0662394  (BC)
      b_FARM |  500   .1346891   .0026611   .0560327    .0245999   .2447782  (N)
             |                                          .0293473   .2510255  (P)
             |                                          .0202142   .2483591  (BC)
      b_EDUC |  500  -.0903599    -.00006    .014992    -.119815  -.0609047  (N)
             |                                         -.1205786  -.0618314  (P)
             |                                         -.1204532  -.0615499  (BC)
      b_cons |  500  -.5107135   .0044955   .4893788    -1.47221   .4507834  (N)
             |                                         -1.435498   .4444398  (P)
             |                                         -1.388972   .4859312  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
. * The t-statistic vector is e(b)./e(se) where ./ is elt. by elt. division
. * But Stata Version 8 does not do ./ so instead need the following
. matrix tols = (vecdiag(diag(e(b))*syminv(diag(e(se)))))'
. matrix list tols, format(%10.2f)
tols[7,1]
r1
b_LNHHEXP 12.26
b_AGE 5.41
b_SEX 1.62
b_HHSIZE 1.81
b_FARM 2.40
b_EDUC -6.03
b_cons -1.04
.
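
The matrix expression above is just an element-by-element division written with diagonal matrices. A minimal Python sketch of the same calculation follows (not part of the Stata program; the function name is hypothetical), using the observed coefficients and bootstrap standard errors from the table above.

```python
def t_statistics(b, se):
    """Element-by-element ratio b[i] / se[i]."""
    if len(b) != len(se):
        raise ValueError("b and se must have the same length")
    return [bi / si for bi, si in zip(b, se)]

names = ["LNHHEXP", "AGE", "SEX", "HHSIZE", "FARM", "EDUC", "_cons"]
# Observed coefficients and cluster-bootstrap standard errors copied from the log.
b = [0.6702328, 0.0105766, 0.097444, 0.0289812, 0.1346891, -0.0903599, -0.5107135]
se = [0.0546562, 0.0019538, 0.0602315, 0.0160043, 0.0560327, 0.014992, 0.4893788]

t = t_statistics(b, se)
for name, value in zip(names, t):
    print(f"{name:>8} {value:8.2f}")
```

The printed values should match the tols vector listed above.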
. * The next two reproduce xtreg , cluster(COMMUNE)
. * but the cluster option for xtreg is not available for Stata version 8
.
. * For this example the cluster bootstrap se's are within 10 percent
. * of the usual xtreg se's, so usual se's may be okay here
.
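
The cluster bootstrap used here resamples whole communes with replacement, not individual observations, and re-estimates the model on each resample. The following Python sketch illustrates the idea under loud assumptions: the data are synthetic, the model is a one-regressor OLS slope, and all function names are hypothetical. It is not the Stata implementation.

```python
import random
import statistics

def ols_slope(xs, ys):
    """Slope from a simple regression of y on x with an intercept."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

def make_clusters(n_clusters=30, size=10, seed=1):
    """Synthetic clustered data: y = 0.5*x + cluster effect + noise (assumed DGP)."""
    rng = random.Random(seed)
    data = {}
    for g in range(n_clusters):
        effect = rng.gauss(0, 1)  # common shock shared within the cluster
        data[g] = [(x, 0.5 * x + effect + rng.gauss(0, 1))
                   for x in (rng.gauss(0, 1) for _ in range(size))]
    return data

def cluster_bootstrap_se(clusters, reps, seed=0):
    """Standard deviation of the slope over resamples of whole clusters."""
    rng = random.Random(seed)
    ids = list(clusters)
    slopes = []
    for _ in range(reps):
        drawn = [rng.choice(ids) for _ in ids]  # draw clusters with replacement
        rows = [row for g in drawn for row in clusters[g]]
        slopes.append(ols_slope([r[0] for r in rows], [r[1] for r in rows]))
    return statistics.stdev(slopes)

data = make_clusters()
print(cluster_bootstrap_se(data, reps=200))
```

Resampling at the cluster level keeps the within-commune dependence intact in every replication, which is what makes the resulting standard errors robust to it.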
. * Fixed effects estimator
. bootstrap "xtreg LNEXP12M $XLISTLINEAR, fe" _b, cluster(COMMUNE) reps($breps) level(95)
command:      xtreg LNEXP12M LNHHEXP AGE SEX HHSIZE FARM EDUC , fe
              b_AGE      = _b[AGE]
              b_SEX      = _b[SEX]
              b_FARM     = _b[FARM]
              b_EDUC     = _b[EDUC]
              b_cons     = _b[_cons]
                                                Number of obs    =      5006
                                                N of clusters    =       194
                                                Replications     =       500
------------------------------------------------------------------------------
    Variable | Reps   Observed       Bias  Std. Err.   [95% Conf. Interval]
-------------+----------------------------------------------------------------
   b_LNHHEXP |  500   .6037139  -.0006143   .0583525    .4890671   .7183608  (N)
             |                                          .4852716   .7172067  (P)
             |                                          .4841806   .7148217  (BC)
       b_AGE |  500   .0115845   5.02e-06   .0017464    .0081532   .0150157  (N)
             |                                          .0082637   .0151613  (P)
             |                                          .0084701   .0152766  (BC)
       b_SEX |  500    .112821  -.0017372   .0546362    .0054756   .2201664  (N)
             |                                          .0129603   .2214846  (P)
             |                                           .017047    .235448  (BC)
    b_HHSIZE |  500   .0107124  -.0004379   .0150286   -.0188148   .0402395  (N)
             |                                         -.0195233   .0415316  (P)
             |                                         -.0184428    .044119  (BC)
      b_FARM |  500   .0693037  -.0010067   .0497627   -.0284666    .167074  (N)
             |                                         -.0291446   .1679352  (P)
             |                                         -.0259051   .1705921  (BC)
      b_EDUC |  500  -.0510325   .0003307   .0153224    -.081137   -.020928  (N)
             |                                         -.0818133  -.0219096  (P)
             |                                         -.0844261  -.0230367  (BC)
      b_cons |  500   .0361552   .0087515   .5186644   -.9828799    1.05519  (N)
             |                                          -.934128   1.087458  (P)
             |                                          -.934128   1.087458  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
. matrix tfe = (vecdiag(diag(e(b))*syminv(diag(e(se)))))'
. matrix list tfe, format(%10.2f)
tfe[7,1]
r1
b_LNHHEXP 10.35
b_AGE 6.63
b_SEX 2.06
b_HHSIZE 0.71
b_FARM 1.39
b_EDUC -3.33
b_cons 0.07
.
. * Random effects estimator
. bootstrap "xtreg LNEXP12M $XLISTLINEAR, re" _b, cluster(COMMUNE) reps($breps) level(95)
command:      xtreg LNEXP12M LNHHEXP AGE SEX HHSIZE FARM EDUC , re
              b_AGE      = _b[AGE]
              b_SEX      = _b[SEX]
              b_FARM     = _b[FARM]
              b_EDUC     = _b[EDUC]
              b_cons     = _b[_cons]
                                                Number of obs    =      5006
                                                N of clusters    =       194
                                                Replications     =       500
------------------------------------------------------------------------------
    Variable | Reps   Observed       Bias  Std. Err.   [95% Conf. Interval]
-------------+----------------------------------------------------------------
   b_LNHHEXP |  500   .6268899  -.0079169   .0486878    .5312314   .7225483  (N)
             |                                          .5261016   .7155449  (P)
             |                                           .540477   .7254891  (BC)
       b_AGE |  500   .0112334   .0001211   .0017668    .0077622   .0147047  (N)
             |                                          .0080698   .0152565  (P)
             |                                          .0077655   .0147142  (BC)
       b_SEX |  500   .1069915   .0058127   .0561182   -.0032656   .2172486  (N)
             |                                          .0046711   .2187323  (P)
             |                                         -.0109273   .2045939  (BC)
    b_HHSIZE |  500   .0158302  -.0014562   .0146506   -.0129543   .0446147  (N)
             |                                          -.017179   .0459636  (P)
             |                                         -.0108163   .0482198  (BC)
      b_FARM |  500   .0928509  -.0071707   .0442312    .0059485   .1797532  (N)
             |                                         -.0014455   .1728321  (P)
             |                                          .0053411   .1906732  (BC)
      b_EDUC |  500  -.0638447   .0049481    .014058   -.0914648  -.0362246  (N)
             |                                         -.0871102   -.029496  (P)
             |                                          -.094956  -.0407984  (BC)
      b_cons |  500  -.1660698   .0535286   .4305953   -1.012073   .6799335  (N)
             |                                         -.8970464   .6892154  (P)
             |                                         -.9512222   .6032417  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
. matrix tre = (vecdiag(diag(e(b))*syminv(diag(e(se)))))'
. matrix list tre, format(%10.2f)
tre[7,1]
r1
b_LNHHEXP 12.88
b_AGE 6.36
b_SEX 1.91
b_HHSIZE 1.08
b_FARM 2.10
b_EDUC -4.54
b_cons -0.39
.
. log close
log: c:\Imbook\bwebpage\Section6\mma24p1olscluster.txt
log type: text
closed on: 24 May 2005, 14:44:12
------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section6\mma24p2poiscluster.txt
log type: text
opened on: 24 May 2005, 16:35:22
.
. ********** OVERVIEW OF MMA24P2POISCLUSTER.DO **********
.
. * STATA Program
.
. * Chapter 24.7 pages 848-53 Table 24.6
. * Cluster robust inference for Poisson cross-section application using
. * Vietnam Living Standard Survey data
.
. * (0) Descriptive Statistics (Table 24.3 second half)
. * (1) Frequencies of data (Table 24.5)
. * (2) Poisson regression with individual-level data (Table 24.6)
.
. * The results differ in second significant digit from those in text
. * despite same sample size. Not sure why.
.
. * For Table 24.4 for clustered household data see MMA24P1OLSCLUSTER.DO
.
. * The Poisson cluster effects model is
. * y_it ~ Poisson[exp(x_it'b + a_i)]
. * Default xtpois output assumes Poisson distribution - var = mean.
. * Instead should get cluster-robust errors after xtpois
. * See Section 21.2.3 pages 709-12 and section 23.26 pages 788-9
. * Stata Version 8 does not do this.
. * Here we do a panel bootstrap - results not reported in the text
.
. * To speed up programs reduce breps - the number of bootstrap reps
. * This program takes a long time if bootstrap
.
. * vietnam_ex2.dta
.
. ********** SETUP **********
.
. set more off
. version 8.0
.
.
. * The data comes from World Bank 1997 Vietnam Living Standards Survey
. * A subset was used in chapter 4.6.4.
. * The larger sample here is described on pages 848-9
.
. * The data are INDIVIDUAL data
. * There are N=27765 individuals in 194 clusters (communes)
.
. * The separate data set vietnam_ex1.dta has household level data
.
. ********** READ IN INDIVIDUAL-LEVEL DATA and SUMMARIZE (Table 24.3) **********
.
. use vietnam_ex2.dta, clear
. desc
Contains data from vietnam_ex2.dta
  obs:        27,766
 vars:            12                          11 Apr 2005 12:33
 size:     1,443,832 (85.9% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
COMPED98        float  %9.0g
SEX             float  %9.0g
AGE             float  %9.0g
MARRIED         float  %9.0g
ILLDUM          float  %9.0g
INJDUM          float  %9.0g
ILLDAYS         float  %9.0g
ACTDAYS         float  %9.0g
PHARVIS         float  %9.0g
HLTHINS         float  %9.0g
lnhhinc         float  %9.0g
commune         float  %9.0g
-------------------------------------------------------------------------------
Sorted by:
. sum
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
    COMPED98 |     27765    3.390672     1.93115           0         11
         SEX |     27765    .5111471    .4998847           0          1
         AGE |     27765    2.977504    .9671446           0    4.59512
     MARRIED |     27765    .3988835    .4896775           0          1
      ILLDUM |     27765    .6219701    .8995068           0          9
-------------+---------------------------------------------------------
      INJDUM |     27765    .0096885    .0979537           0          1
     ILLDAYS |     27765    2.804034     5.45823           0         60
     ACTDAYS |     27765    .0657302    1.115939           0         30
     PHARVIS |     27765    .5117594    1.313427           0         30
     HLTHINS |     27765    .1625788    .3689876           0          1
-------------+---------------------------------------------------------
     lnhhinc |     27765     2.60261    .6244145    .0467014   5.405502
     commune |     27765    101.5266    56.28334           1        194
.
. rename COMPED98 EDUC
. rename ILLDUM ILLNESS
. rename INJDUM INJURY
. rename HLTHINS INSURANCE
. rename lnhhinc LNHHEXP
. rename commune COMMUNE
.
. * Following should give same descriptive statistics
. * as in bottom half (Individual) in Table 24.3 p.850
. * But there is a difference for LNHHEXP plus here no data on MEDEXP
. sum PHARVIS LNHHEXP AGE SEX MARRIED EDUC ILLNESS INJURY ILLDAYS ACTDAYS INSURANCE COMMUNE
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
     PHARVIS |     27765    .5117594    1.313427           0         30
     LNHHEXP |     27765     2.60261    .6244145    .0467014   5.405502
         AGE |     27765    2.977504    .9671446           0    4.59512
         SEX |     27765    .5111471    .4998847           0          1
     MARRIED |     27765    .3988835    .4896775           0          1
-------------+---------------------------------------------------------
        EDUC |     27765    3.390672     1.93115           0         11
     ILLNESS |     27765    .6219701    .8995068           0          9
      INJURY |     27765    .0096885    .0979537           0          1
     ILLDAYS |     27765    2.804034     5.45823           0         60
     ACTDAYS |     27765    .0657302    1.115939           0         30
-------------+---------------------------------------------------------
   INSURANCE |     27765    .1625788    .3689876           0          1
     COMMUNE |     27765    101.5266    56.28334           1        194
. sum LNHHEXP, detail
                           LNHHEXP
-------------------------------------------------------------
      Percentiles      Smallest
 1%     1.302267       .0467014
 5%     1.658267       .1111674
10%     1.875315       .3755146       Obs               27765
25%     2.188848       .4177101       Sum of Wgt.       27765

50%     2.534935                      Mean            2.60261
                        Largest       Std. Dev.      .6244145
75%     2.962732       5.405502
90%     3.458658       5.405502       Variance       .3898934
95%     3.737957       5.405502       Skewness       .4925002
99%     4.295394       5.405502       Kurtosis       3.583693
.
. * Following gives Table 24.5 (page 852) frequencies
. * These differ in some places from Table 24.5 - especially for number = 0
. tabulate PHARVIS
    PHARVIS |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |     20,668       74.44       74.44
          1 |      3,829       13.79       88.23
          2 |      1,716        6.18       94.41
          3 |        777        2.80       97.21
          4 |        359        1.29       98.50
          5 |        174        0.63       99.13
          6 |         64        0.23       99.36
          7 |         43        0.15       99.51
          8 |         16        0.06       99.57
          9 |          4        0.01       99.59
         10 |         78        0.28       99.87
         11 |          1        0.00       99.87
         12 |          5        0.02       99.89
         13 |          1        0.00       99.89
         14 |          3        0.01       99.90
         15 |          9        0.03       99.94
         16 |          1        0.00       99.94
         20 |          8        0.03       99.97
         22 |          2        0.01       99.97
         27 |          1        0.00       99.98
         28 |          3        0.01       99.99
         30 |          3        0.01      100.00
------------+-----------------------------------
      Total |     27,765      100.00
.
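
The program overview notes that default Poisson standard errors assume the variance equals the mean. A rough unconditional check of that restriction can be made from the PHARVIS frequencies alone. The Python sketch below is not part of the Stata program; it recomputes the sample mean and standard deviation from the tabulated counts and forms the variance-to-mean ratio. It ignores regressors, so it is only suggestive.

```python
import math

# Frequencies of PHARVIS copied from the tabulate output in this log.
freq = {0: 20668, 1: 3829, 2: 1716, 3: 777, 4: 359, 5: 174, 6: 64, 7: 43,
        8: 16, 9: 4, 10: 78, 11: 1, 12: 5, 13: 1, 14: 3, 15: 9, 16: 1,
        20: 8, 22: 2, 27: 1, 28: 3, 30: 3}

n = sum(freq.values())
mean = sum(k * f for k, f in freq.items()) / n
# Sample variance with the usual n - 1 divisor, as reported by summarize.
var = sum(f * (k - mean) ** 2 for k, f in freq.items()) / (n - 1)
sd = math.sqrt(var)
dispersion = var / mean  # equals 1 under the Poisson restriction

print(n, round(mean, 7), round(sd, 6), round(dispersion, 2))
```

A ratio well above one indicates overdispersion relative to the Poisson, which is consistent with the robust and cluster-robust standard errors in the regressions that follow being larger than the default ones.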
. * Histogram with kernel density estimate
. hist PHARVIS, discrete kdensity
(start=0, width=1)
.
. outfile PHARVIS LNHHEXP AGE SEX MARRIED EDUC ILLNESS INJURY ILLDAYS /*
> */ ACTDAYS INSURANCE COMMUNE using vietnam_ex2.asc, replace
.
. ********** ANALYSIS: CLUSTER ANALYSIS FOR POISSON MODEL [Table 24.6 p.851] *********
.
. * Regressor list for the Poisson regressions
. global XLISTPOISSON LNHHEXP INSURANCE SEX AGE MARRIED ILLDAYS ACTDAYS INJURY ILLNESS EDUC
.
. * Poisson with usual standard errors (Table 24.6 columns 1-2)
. poisson PHARVIS $XLISTPOISSON
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:

Poisson regression                              Number of obs   =      27765
                                                LR chi2(10)     =   13226.50
                                                Prob > chi2     =     0.0000
                                                Pseudo R2       =     0.2073
------------------------------------------------------------------------------
     PHARVIS |      Coef.   Std. Err.        z   P>|z|      [95% Conf. Interval]
-------------+----------------------------------------------------------------
     LNHHEXP |    .078686    .0138419     5.68   0.000      .0515564    .1058156
   INSURANCE |  -.2485716    .0259704    -9.57   0.000     -.2994727   -.1976706
         SEX |   .0851733    .0171697     4.96   0.000      .0515213    .1188253
         AGE |   .0252426    .0106126     2.38   0.017      .0044423    .0460429
     MARRIED |   .1239639    .0209267     5.92   0.000      .0829483    .1649795
     ILLDAYS |   .0429083    .0010728    40.00   0.000      .0408057    .0450109
     ACTDAYS |   .0089793    .0052409     1.71   0.087     -.0012927    .0192514
      INJURY |   .1717029    .0747292     2.30   0.022      .0252364    .3181694
     ILLNESS |   .5623976    .0064536    87.15   0.000      .5497488    .5750464
        EDUC |  -.0524459    .0048173   -10.89   0.000     -.0618878   -.0430041
       _cons |  -1.640821    .0458542   -35.78   0.000     -1.730694   -1.550949
------------------------------------------------------------------------------
. estimates store poisiid
.
. * Poisson with heteroskedastic-robust standard errors (Table 24.6 column 3)
. poisson PHARVIS $XLISTPOISSON, robust
Poisson regression                              Number of obs   =      27765
                                                Wald chi2(10)   =    2423.07
                                                Prob > chi2     =     0.0000
                                                Pseudo R2       =     0.2073
------------------------------------------------------------------------------
             |               Robust
     PHARVIS |      Coef.   Std. Err.        z   P>|z|      [95% Conf. Interval]
-------------+----------------------------------------------------------------
     LNHHEXP |    .078686    .0255091     3.08   0.002      .0286891    .1286829
   INSURANCE |  -.2485716    .0437892    -5.68   0.000     -.3343969   -.1627464
         SEX |   .0851733     .030907     2.76   0.006      .0245967    .1457499
         AGE |   .0252426    .0198448     1.27   0.203     -.0136526    .0641377
     MARRIED |   .1239639    .0419107     2.96   0.003      .0418205    .2061073
     ILLDAYS |   .0429083    .0028779    14.91   0.000      .0372678    .0485488
     ACTDAYS |   .0089793    .0207444     0.43   0.665      -.031679    .0496377
      INJURY |   .1717029    .2043534     0.84   0.401     -.2288224    .5722282
     ILLNESS |   .5623976    .0228635    24.60   0.000       .517586    .6072092
        EDUC |  -.0524459    .0081043    -6.47   0.000     -.0683301   -.0365618
       _cons |  -1.640821    .0872497   -18.81   0.000     -1.811828   -1.469815
------------------------------------------------------------------------------
. estimates store poishet
.
. * Poisson with cluster-robust standard errors (Table 24.6 column 4)
. poisson PHARVIS $XLISTPOISSON, cluster(COMMUNE)
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:

Poisson regression                              Number of obs   =      27765
                                                Wald chi2(10)   =    1295.38
                                                Prob > chi2     =     0.0000
(standard errors adjusted for clustering on COMMUNE)
------------------------------------------------------------------------------
             |               Robust
     PHARVIS |      Coef.   Std. Err.        z   P>|z|      [95% Conf. Interval]
-------------+----------------------------------------------------------------
     LNHHEXP |    .078686    .0472052     1.67   0.096     -.0138344    .1712065
   INSURANCE |  -.2485716    .0617873    -4.02   0.000     -.3696725   -.1274708
         SEX |   .0851733    .0327427     2.60   0.009      .0209988    .1493478
         AGE |   .0252426    .0262626     0.96   0.336     -.0262311    .0767163
     MARRIED |   .1239639     .048607     2.55   0.011       .028696    .2192318
     ILLDAYS |   .0429083    .0037384    11.48   0.000      .0355811    .0502355
     ACTDAYS |   .0089793    .0190493     0.47   0.637     -.0283567    .0463154
      INJURY |   .1717029    .2214258     0.78   0.438     -.2622836    .6056894
     ILLNESS |   .5623976     .028512    19.72   0.000       .506515    .6182802
        EDUC |  -.0524459    .0153841    -3.41   0.001     -.0825982   -.0222937
       _cons |  -1.640821    .1541108   -10.65   0.000     -1.942873    -1.33877
------------------------------------------------------------------------------
. estimates store poisclust
.
. * Random effects estimation (Table 24.6 columns 5-6)
. * This uses the xtpois command which first requires identifying the cluster
. iis COMMUNE
. xtpois PHARVIS $XLISTPOISSON, re
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:

Fitting full model:

Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:


                                                Number of obs      =     27765
                                                Number of groups   =       194
                                                Obs per group: min =        51
                                                               avg =     143.1
                                                               max =       206
                                                Wald chi2(10)      =  13723.01
                                                Prob > chi2        =    0.0000
------------------------------------------------------------------------------
     PHARVIS |      Coef.   Std. Err.        z   P>|z|      [95% Conf. Interval]
-------------+----------------------------------------------------------------
     LNHHEXP |  -.1013746    .0187549    -5.41   0.000     -.1381336   -.0646157
   INSURANCE |  -.1675953    .0273642    -6.12   0.000     -.2212283   -.1139624
         SEX |    .099303    .0172541     5.76   0.000      .0654855    .1331206
         AGE |   .0047406    .0107899     0.44   0.660     -.0164073    .0258884
     MARRIED |   .1579958    .0212825     7.42   0.000      .1162828    .1997088
     ILLDAYS |    .046055    .0011422    40.32   0.000      .0438164    .0482937
     ACTDAYS |   .0186084    .0054546     3.41   0.001      .0079176    .0292991
      INJURY |   .1479464    .0780863     1.89   0.058        -.0051    .3009928
     ILLNESS |   .5801872    .0076855    75.49   0.000       .565124    .5952505
        EDUC |  -.0284493    .0055827    -5.10   0.000     -.0393911   -.0175075
       _cons |  -1.276974    .0723199   -17.66   0.000     -1.418718   -1.135229
-------------+----------------------------------------------------------------
    /lnalpha |  -1.039839    .1035295                      -1.242753    -.836925
-------------+----------------------------------------------------------------
       alpha |   .3535115    .0365989                       .2885885    .4330401
------------------------------------------------------------------------------
Likelihood-ratio test of alpha=0: chibar2(01) = 3725.31 Prob>=chibar2 = 0.000
. estimates store poisre
.
. * Following shows that cluster option for xtpois in Stata version 8 does nothing
. xtpois PHARVIS $XLISTPOISSON, i(COMMUNE) re
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:

Fitting full model:

Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:


                                                Number of obs      =     27765
                                                Number of groups   =       194
                                                Obs per group: min =        51
                                                               avg =     143.1
                                                               max =       206
                                                Wald chi2(10)      =  13723.01
                                                Prob > chi2        =    0.0000
------------------------------------------------------------------------------
     PHARVIS |      Coef.   Std. Err.        z   P>|z|      [95% Conf. Interval]
-------------+----------------------------------------------------------------
     LNHHEXP |  -.1013746    .0187549    -5.41   0.000     -.1381336   -.0646157
   INSURANCE |  -.1675953    .0273642    -6.12   0.000     -.2212283   -.1139624
         SEX |    .099303    .0172541     5.76   0.000      .0654855    .1331206
         AGE |   .0047406    .0107899     0.44   0.660     -.0164073    .0258884
     MARRIED |   .1579958    .0212825     7.42   0.000      .1162828    .1997088
     ILLDAYS |    .046055    .0011422    40.32   0.000      .0438164    .0482937
     ACTDAYS |   .0186084    .0054546     3.41   0.001      .0079176    .0292991
      INJURY |   .1479464    .0780863     1.89   0.058        -.0051    .3009928
     ILLNESS |   .5801872    .0076855    75.49   0.000       .565124    .5952505
        EDUC |  -.0284493    .0055827    -5.10   0.000     -.0393911   -.0175075
       _cons |  -1.276974    .0723199   -17.66   0.000     -1.418718   -1.135229
-------------+----------------------------------------------------------------
    /lnalpha |  -1.039839    .1035295                      -1.242753    -.836925
-------------+----------------------------------------------------------------
       alpha |   .3535115    .0365989                       .2885885    .4330401
------------------------------------------------------------------------------
Likelihood-ratio test of alpha=0: chibar2(01) = 3725.31 Prob>=chibar2 = 0.000
.
.
. * Fixed effects estimation (FGLS) (Table 24.6 columns 7-8)
. xtpois PHARVIS $XLISTPOISSON, fe
note: 1 group (94 obs) dropped due to all zero outcomes
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:

Conditional fixed-effects Poisson regression    Number of obs      =     27671
                                                Number of groups   =       193
                                                Obs per group: min =        51
                                                               avg =     143.4
                                                               max =       206
                                                Wald chi2(10)      =  13621.76
                                                Prob > chi2        =    0.0000
------------------------------------------------------------------------------
     PHARVIS |      Coef.   Std. Err.        z   P>|z|      [95% Conf. Interval]
-------------+----------------------------------------------------------------
     LNHHEXP |  -.1146402     .019025    -6.03   0.000     -.1519285   -.0773519
   INSURANCE |   -.163603    .0274193    -5.97   0.000     -.2173438   -.1098622
         SEX |   .0997415    .0172564     5.78   0.000      .0659195    .1335635
         AGE |   .0033591    .0107945     0.31   0.756     -.0177977     .024516
     MARRIED |   .1606792    .0212958     7.55   0.000      .1189403    .2024182
     ILLDAYS |    .046148    .0011453    40.29   0.000      .0439032    .0483929
     ACTDAYS |   .0189184    .0054666     3.46   0.001       .008204    .0296328
      INJURY |   .1479319     .078183     1.89   0.058     -.0053039    .3011677
     ILLNESS |   .5803719    .0077289    75.09   0.000      .5652235    .5955203
        EDUC |  -.0272099    .0056191    -4.84   0.000     -.0382232   -.0161966
------------------------------------------------------------------------------
. estimates store poisfe
.
.
. ********** DISPLAY TABLE 24.6 RESULTS page 852 **********
.
. * The results here differ in the second significant digit from those in text
. * despite same sample size. Not sure why.
.
. estimates table poisiid poishet poisclust, /*
> */ b(%10.3f) t(%10.2f) stats(r2 N)
----------------------------------------------------
    Variable |  poisiid      poishet     poisclust
-------------+--------------------------------------
     LNHHEXP |      0.079        0.079        0.079
             |       5.68         3.08         1.67
   INSURANCE |     -0.249       -0.249       -0.249
             |      -9.57        -5.68        -4.02
         SEX |      0.085        0.085        0.085
             |       4.96         2.76         2.60
         AGE |      0.025        0.025        0.025
             |       2.38         1.27         0.96
     MARRIED |      0.124        0.124        0.124
             |       5.92         2.96         2.55
     ILLDAYS |      0.043        0.043        0.043
             |      40.00        14.91        11.48
     ACTDAYS |      0.009        0.009        0.009
             |       1.71         0.43         0.47
      INJURY |      0.172        0.172        0.172
             |       2.30         0.84         0.78
     ILLNESS |      0.562        0.562        0.562
             |      87.15        24.60        19.72
        EDUC |     -0.052       -0.052       -0.052
             |     -10.89        -6.47        -3.41
       _cons |     -1.641       -1.641       -1.641
             |     -35.78       -18.81       -10.65
-------------+--------------------------------------
          r2 |
           N |  27765.000    27765.000    27765.000
----------------------------------------------------
                                        legend: b/t
. estimates table poisre poisfe, /*
> */ b(%10.3f) t(%10.2f) stats(r2 N)
---------------------------------------
    Variable |   poisre       poisfe
-------------+-------------------------
PHARVIS      |
     LNHHEXP |     -0.101       -0.115
             |      -5.41        -6.03
   INSURANCE |     -0.168       -0.164
             |      -6.12        -5.97
         SEX |      0.099        0.100
             |       5.76         5.78
         AGE |      0.005        0.003
             |       0.44         0.31
     MARRIED |      0.158        0.161
             |       7.42         7.55
     ILLDAYS |      0.046        0.046
             |      40.32        40.29
     ACTDAYS |      0.019        0.019
             |       3.41         3.46
      INJURY |      0.148        0.148
             |       1.89         1.89
     ILLNESS |      0.580        0.580
             |      75.49        75.09
        EDUC |     -0.028       -0.027
             |      -5.10        -4.84
       _cons |     -1.277
             |     -17.66
-------------+-------------------------
lnalpha      |
       _cons |     -1.040
             |     -10.04
-------------+-------------------------
Statistics   |
          r2 |
           N |  27765.000    27671.000
---------------------------------------
                           legend: b/t
.
. ********** ADDITIONALLY DO CLUSTER BOOTSTRAPS **********
.
. * These results not given in the text
.
. * Output at website uses breps 500
. global breps 50
.
. * Note that one can bootstrap if desired to get more robust standard errors
. * The first reproduces pois , cluster(COMMUNE)
. bootstrap "poisson PHARVIS $XLISTPOISSON" _b, cluster(COMMUNE) reps($breps) level(95)
command:      poisson PHARVIS LNHHEXP INSURANCE SEX AGE MARRIED ILLDAYS ACTDAYS INJURY ILLNESS EDUC
statistics:   b_LNHHEXP  = [PHARVIS]_b[LNHHEXP]
              b_INSURA~E = [PHARVIS]_b[INSURANCE]
              b_SEX      = [PHARVIS]_b[SEX]
              b_AGE      = [PHARVIS]_b[AGE]
              b_MARRIED  = [PHARVIS]_b[MARRIED]
              b_ILLDAYS  = [PHARVIS]_b[ILLDAYS]
              b_ACTDAYS  = [PHARVIS]_b[ACTDAYS]
              b_INJURY   = [PHARVIS]_b[INJURY]
              b_ILLNESS  = [PHARVIS]_b[ILLNESS]
              b_EDUC     = [PHARVIS]_b[EDUC]
              b_cons     = [PHARVIS]_b[_cons]
                                                Number of obs    =     27765
                                                N of clusters    =       194
                                                Replications     =        50
------------------------------------------------------------------------------
    Variable | Reps   Observed       Bias  Std. Err.   [95% Conf. Interval]
-------------+----------------------------------------------------------------
   b_LNHHEXP |   50    .078686   .0072233   .0475425   -.0168542   .1742262  (N)
             |                                         -.0050689   .1878158  (P)
             |                                         -.0204097   .1710779  (BC)
 b_INSURANCE |   50  -.2485716   .0013929   .0770506   -.4034107  -.0937326  (N)
             |                                         -.3640907  -.1004183  (P)
             |                                         -.4677969  -.1004183  (BC)
       b_SEX |   50   .0851733  -.0039062   .0345537    .0157351   .1546115  (N)
             |                                           .022333   .1494552  (P)
             |                                           .022333   .1494552  (BC)
       b_AGE |   50   .0252426   .0012812   .0270715   -.0291596   .0796447  (N)
             |                                          -.025843   .0726057  (P)
             |                                         -.0479862   .0726057  (BC)
   b_MARRIED |   50   .1239639  -.0017894   .0406114    .0423522   .2055756  (N)
             |                                          .0132484   .2024732  (P)
             |                                          .0132484   .2101617  (BC)
   b_ILLDAYS |   50   .0429083  -.0005122      .0034    .0360757   .0497409  (N)
             |                                          .0358535   .0481521  (P)
             |                                          .0363203   .0500312  (BC)
   b_ACTDAYS |   50   .0089793  -.0021093   .0249974   -.0412549   .0592135  (N)
             |                                         -.0343906   .0573651  (P)
             |                                         -.0352626   .0573651  (BC)
    b_INJURY |   50   .1717029  -.0321969   .2090263   -.2483512    .591757  (N)
             |                                         -.3271621   .4807015  (P)
             |                                         -.1896703    .648314  (BC)
   b_ILLNESS |   50   .5623976   .0061368   .0294736    .5031682    .621627  (N)
             |                                          .5206931   .6271017  (P)
             |                                          .5192547   .6206369  (BC)
      b_EDUC |   50  -.0524459   .0027244     .01598   -.0845589  -.0203329  (N)
             |                                         -.0825952   -.017323  (P)
             |                                         -.0850821  -.0256777  (BC)
      b_cons |   50  -1.640821  -.0414073   .1460702    -1.93436  -1.347282  (N)
             |                                         -1.984352  -1.399226  (P)
             |                                         -1.867373  -1.310915  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
. * The t-statistic vector is e(b)./e(se) where ./ is elt. by elt. division
. * But Stata Version 8 does not do ./ so instead need the following
. matrix tpois = (vecdiag(diag(e(b))*syminv(diag(e(se)))))'
. matrix list tpois, format(%10.2f)
tpois[11,1]
r1
b_LNHHEXP 1.66
b_INSURANCE -3.23
b_SEX 2.46
b_AGE 0.93
b_MARRIED 3.05
b_ILLDAYS 12.62
b_ACTDAYS 0.36
b_INJURY 0.82
b_ILLNESS 19.08
b_EDUC -3.28
b_cons -11.23
.
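The matrix expression used for tpois is simply the element-by-element ratio of coefficients to bootstrap standard errors. The following Python sketch is not part of the original Stata log; the function names are illustrative only. It computes the ratio directly and via the same diagonal-matrix construction, using the first three rows of the bootstrap table above as inputs.

```python
def t_statistics(coefs, std_errs):
    """Element-by-element ratio b_i / se_i."""
    return [b / se for b, se in zip(coefs, std_errs)]


def t_statistics_via_diag(coefs, std_errs):
    """Same ratios, taken as the diagonal of diag(b) * inverse(diag(se))."""
    k = len(coefs)
    diag_b = [[coefs[i] if i == j else 0.0 for j in range(k)] for i in range(k)]
    # The inverse of a diagonal matrix is the diagonal matrix of reciprocals.
    inv_diag_se = [[1.0 / std_errs[i] if i == j else 0.0 for j in range(k)]
                   for i in range(k)]
    product = [[sum(diag_b[i][m] * inv_diag_se[m][j] for m in range(k))
                for j in range(k)] for i in range(k)]
    return [product[i][i] for i in range(k)]


# Observed coefficients and bootstrap standard errors for
# b_LNHHEXP, b_INSURANCE and b_SEX, copied from the table above.
b = [0.078686, -0.2485716, 0.0851733]
se = [0.0475425, 0.0770506, 0.0345537]

t_direct = t_statistics(b, se)
t_diag = t_statistics_via_diag(b, se)
print([round(t, 2) for t in t_direct])  # [1.66, -3.23, 2.46]
```

The rounded values agree with the first three entries of tpois listed above.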
. * The next two reproduce xtpois, cluster(COMMUNE)
. * but xtpois has no cluster option so instead use a cluster bootstrap
.
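The cluster bootstrap used by the next two commands resamples whole communes rather than individual observations. The following is a minimal Python sketch of that idea, not part of the original Stata log. The function names, the seed and the toy data are hypothetical, the statistic is a simple mean rather than a Poisson coefficient, and the percentile interval uses a simple order-statistic rule that may differ in detail from Stata's.

```python
import random
import statistics


def cluster_bootstrap(rows, cluster_of, statistic, reps=50, seed=10101):
    """Return `reps` replicates of `statistic`, resampling whole clusters."""
    rng = random.Random(seed)
    # Group the observations by cluster identifier.
    by_cluster = {}
    for row in rows:
        by_cluster.setdefault(cluster_of(row), []).append(row)
    ids = sorted(by_cluster)
    replicates = []
    for _ in range(reps):
        # Draw as many clusters as in the original sample, with replacement,
        # and keep every observation of each drawn cluster.
        drawn = [rng.choice(ids) for _ in ids]
        sample = [row for cid in drawn for row in by_cluster[cid]]
        replicates.append(statistic(sample))
    return replicates


def percentile_interval(replicates, level=0.95):
    """Order-statistic percentile interval (a convention assumed here)."""
    ordered = sorted(replicates)
    tail = (1.0 - level) / 2.0
    last = len(ordered) - 1
    return ordered[round(tail * last)], ordered[round((1.0 - tail) * last)]


def mean_visits(sample):
    """Illustrative statistic: mean of the second field of each row."""
    return sum(v for _, v in sample) / len(sample)


# Hypothetical toy data: (commune id, number of pharmacy visits).
toy = [(1, 0), (1, 2), (2, 1), (2, 1), (2, 4), (3, 0), (3, 3), (4, 5), (4, 1)]

replicates = cluster_bootstrap(toy, lambda row: row[0], mean_visits)
boot_se = statistics.stdev(replicates)          # bootstrap standard error
ci_low, ci_high = percentile_interval(replicates)
print(len(replicates), boot_se, ci_low, ci_high)
```

Because all observations of a drawn commune enter the resample together, any within-commune correlation is preserved in each replicate, which is the reason for clustering the bootstrap.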
. * Fixed effects estimator
. bootstrap "xtpois PHARVIS $XLISTPOISSON, fe" _b, cluster(COMMUNE) reps($breps) level(95)
command: xtpois PHARVIS LNHHEXP INSURANCE SEX AGE MARRIED ILLDAYS ACTDAYS INJURY ILLNESS EDUC , fe
             b_SEX      = [PHARVIS]_b[SEX]
             b_AGE      = [PHARVIS]_b[AGE]

Number of obs = 27671
N of clusters = 193
Replications  = 50

-------------------------------------------------------------------------------------
Variable     | Reps    Observed        Bias  Std. Err.    [95% Conf. Interval]
-------------+-----------------------------------------------------------------------
b_LNHHEXP    |   50   -.1146402    .0046925    .042981   -.2010138   -.0282666  (N)
             |                                           -.1801919   -.0258064  (P)
             |                                           -.1841975    -.043704  (BC)
b_INSURANCE  |   50    -.163603    .0145077   .0513299   -.2667543   -.0604516  (N)
             |                                           -.2391983   -.0581847  (P)
             |                                            -.269962   -.0993868  (BC)
b_SEX        |   50    .0997415    .0030381   .0298361    .0397836    .1596994  (N)
             |                                            .0581716    .1630876  (P)
             |                                             .055771    .1562326  (BC)
b_AGE        |   50    .0033591   -.0017336   .0228288    -.042517    .0492353  (N)
             |                                           -.0508069     .040935  (P)
             |                                           -.0508069    .0541492  (BC)
b_MARRIED    |   50    .1606793     .009603   .0435503    .0731616    .2481969  (N)
             |                                            .1091381     .260388  (P)
             |                                            .0877519    .2407327  (BC)
b_ILLDAYS    |   50     .046148   -.0004107   .0027904    .0405406    .0517555  (N)
             |                                            .0397139    .0504146  (P)
             |                                            .0397139     .050898  (BC)
b_ACTDAYS    |   50    .0189184   -.0049228   .0176306   -.0165115    .0543484  (N)
             |                                           -.0169987    .0490534  (P)
             |                                           -.0158923    .0497731  (BC)
b_INJURY     |   50    .1479319    .0204617   .2194316   -.2930323    .5888962  (N)
             |                                           -.2735089    .5520838  (P)
             |                                           -.3044733    .5520838  (BC)
b_ILLNESS    |   50    .5803719    .0003675   .0199171     .540347    .6203969  (N)
             |                                            .5370637    .6163648  (P)
             |                                            .5370637    .6163648  (BC)
b_EDUC       |   50   -.0272099   -.0003993   .0112987   -.0499155   -.0045043  (N)
             |                                           -.0521668   -.0068456  (P)
             |                                           -.0531845   -.0068456  (BC)
-------------------------------------------------------------------------------------
Note: N  = normal
      P  = percentile
      BC = bias-corrected
. matrix tpoisfe = (vecdiag(diag(e(b))*syminv(diag(e(se)))))'
. matrix list tpoisfe, format(%10.2f)
tpoisfe[10,1]
r1
b_LNHHEXP -2.67
b_INSURANCE -3.19
b_SEX 3.34
b_AGE 0.15
b_MARRIED 3.69
b_ILLDAYS 16.54
b_ACTDAYS 1.07
b_INJURY 0.67
b_ILLNESS 29.14
b_EDUC -2.41
.
. * Random effects estimator
. bootstrap "xtpois PHARVIS $XLISTPOISSON, re" _b, cluster(COMMUNE) reps($breps) level(95)
command: xtpois PHARVIS LNHHEXP INSURANCE SEX AGE MARRIED ILLDAYS ACTDAYS INJURY ILLNESS EDUC , re
             b_SEX      = [PHARVIS]_b[SEX]
             b_AGE      = [PHARVIS]_b[AGE]
             b_cons     = [PHARVIS]_b[_cons]
             b_1cons    = [lnalpha]_b[_cons]

Number of obs = 27765
N of clusters = 194
Replications  = 50

-------------------------------------------------------------------------------------
Variable     | Reps    Observed        Bias  Std. Err.    [95% Conf. Interval]
-------------+-----------------------------------------------------------------------
b_LNHHEXP    |   50   -.1013746    .0038095   .0406385   -.1830407   -.0197086  (N)
             |                                           -.1794194   -.0319058  (P)
             |                                           -.1977448   -.0319058  (BC)
b_INSURANCE  |   50   -.1675954   -.0053195     .04945   -.2669688   -.0682219  (N)
             |                                           -.2912881   -.0900193  (P)
             |                                           -.2677689    -.088337  (BC)
b_SEX        |   50     .099303   -.0008622    .032962    .0330634    .1655427  (N)
             |                                            .0463968    .1569125  (P)
             |                                            .0463968    .1569125  (BC)
b_AGE        |   50    .0047406    -.002087   .0196285   -.0347045    .0441856  (N)
             |                                           -.0319554    .0398893  (P)
             |                                           -.0212454    .0454795  (BC)
b_MARRIED    |   50    .1579958    .0045701   .0386327    .0803604    .2356311  (N)
             |                                            .1002202    .2446688  (P)
             |                                            .0595091    .2383231  (BC)
b_ILLDAYS    |   50     .046055   -.0000891   .0033445     .039334    .0527761  (N)
             |                                            .0400018    .0525925  (P)
             |                                            .0400018    .0528012  (BC)
b_ACTDAYS    |   50    .0186084   -.0013996   .0204209    -.022429    .0596457  (N)
             |                                           -.0251694    .0533912  (P)
             |                                           -.0251694    .0624974  (BC)
b_INJURY     |   50    .1479464   -.0122248   .2130704   -.2802346    .5761274  (N)
             |                                           -.2971589    .4662884  (P)
             |                                           -.3564237    .4662884  (BC)
b_ILLNESS    |   50    .5801873     .002013    .019375    .5412517    .6191228  (N)
             |                                            .5488635     .621733  (P)
             |                                            .5488635    .6328769  (BC)
b_EDUC       |   50   -.0284493   -.0017922   .0117021   -.0519655   -.0049331  (N)
             |                                            -.050308   -.0116823  (P)
             |                                            -.050308   -.0065941  (BC)
b_cons       |   50   -1.276974   -.0036143   .1309168   -1.540061   -1.013887  (N)
             |                                           -1.523902   -.9686469  (P)
             |                                           -1.523902   -.9686469  (BC)
b_1cons      |   50   -1.039839    .0148765   .0966908   -1.234147   -.8455317  (N)
             |                                           -1.170977   -.8494586  (P)
             |                                           -1.183111   -.8494586  (BC)
-------------------------------------------------------------------------------------
Note: N  = normal
      P  = percentile
      BC = bias-corrected
. matrix tpoisre = (vecdiag(diag(e(b))*syminv(diag(e(se)))))'
. matrix list tpoisre, format(%10.2f)
tpoisre[12,1]
r1
b_LNHHEXP -2.49
b_INSURANCE -3.39
b_SEX 3.01
b_AGE 0.24
b_MARRIED 4.09
b_ILLDAYS 13.77
b_ACTDAYS 0.91
b_INJURY 0.69
b_ILLNESS 29.95
b_EDUC -2.43
b_cons -9.75
b_1cons -10.75
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section6\mma24p2poiscluster.txt
log type: text
closed on: 24 May 2005, 16:50:38
------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section6\mma25p1treatment.txt
log type: text
opened on: 26 May 2005, 10:26:17
.
. ********** OVERVIEW OF MMA25P1TREATMENT.DO **********
.
. * STATA Program
.
. * Chapter 25.8.1-25.8.4 pages 889-893 Tables 25.3-25.4 and Fig. 25.3
. * Evaluating treatment effect of training on Earnings
. * using Dehejia-Wahba data (originally Lalonde data)
.
. * (0) Summarize data for treatments and controls (Table 25.3)
. * (1) Calculate the treatment effect by simple methods (Table 25.4)
. * To replicate some results in DW 1999
. * (1A) treatment-control
. * (1B) control function
. * (1C) before-after comparison
. * (1D) differences-in-differences
. * (2) Calculate treatment effect by propensity score (matching by strata)
. * Last entry in Table 25.4 and Figure 25.3.
.
. * The program MMA25P2MATCHING.DO uses propensity scores with matching
. * methods more sophisticated than those used in MMA25P1TREATMENT.DO
.
. * To run this program you need file
. * nswpsid.da1
.
. ********** STATA SETUP **********
.
. set more off
. version 8
.
.
. * Data set nswpsid.da1 is from Guido Imbens
. * http://emlab.berkeley.edu/users/imbens/index.shtml
.
. * Data originally from DW99
. * R.H. Dehejia and S. Wahba (1999)
. * "Causal Effects in Nonexperimental Studies: Reevaluating the
. * Evaluation of Training Programs", JASA, 1053-1062
. * or DW02
. * "Propensity-score Matching Methods for Nonexperimental Causal
. * Studies", ReStat, 151-161
. * which in turn are from
. * Lalonde, R. (1986), "Evaluating the Econometric Evaluations of
. * Training Programs with Experimental Data," AER, 604-620.
.
. * Each observation is for an individual.
. * There are 2,675 observations: 185 in treated group and 2490 in control
.
. * Variables are
. * TREAT 1 if treated (NSW treated) and 0 if not (PSID-1 control)
. * AGE in years
. * EDUC in years
. * BLACK 1 if black
. * HISP 1 if hispanic
. * MARR 1 if married
. * RE74 Real annual earnings in 1974 (pre-treatment)
. * RE75 Real annual earnings in 1975 (pre-treatment)
. * RE78 Real annual earnings in 1978 (post-treatment)
. * U74 1 if unemployed in 1974
. * U75 1 if unemployed in 1975
.
. * NOTE: U74 and U75 are miscoded in these data and also in the
. *       summary statistics table of DW02
. *       See below for correction to data
.
. ********** READ DATA AND TRANSFORMATIONS **********
.
. infile TREAT AGE EDUC BLACK HISP MARR RE74 RE75 RE78 U74 U75 /*
> */ using nswpsid.da1
.
. * The original data reversed U74 and U75
. * Should be U74=1 if RE74=0 and U74=0 if RE74>0, and similarly for U75
. * This affects results with the propensity score though not earlier results
.
. * Wrong U74 and U75
. sum U74 U75
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
         U74 |      2675    .1345794    .3413376          0          1
         U75 |      2675    .1293458     .335645          0          1
.
. * Correct the original data
. drop U74 U75
. gen U74 = cond(RE74 == 0, 1, 0)
. gen U75 = cond(RE75 == 0, 1, 0)
.
. * Correct U74 and U75
. sum U74 U75
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
         U74 |      2675    .1293458     .335645          0          1
         U75 |      2675    .1345794    .3413376          0          1
.
. * Create regressors used as additional controls in regressions below
. gen AGESQ = AGE*AGE
. gen EDUCSQ = EDUC*EDUC
. * DW99 do not define NODEGREE but following gives Table 1 means
. gen NODEGREE = 0
. replace NODEGREE = 1 if EDUC < 12
. gen RE74SQ = RE74*RE74
. gen RE75SQ = RE75*RE75
. gen U74BLACK = U74*BLACK
. gen U74HISP = U74*HISP
.
. sum AGE EDUC NODEGREE BLACK HISP MARR U74 U75 RE74 RE75 RE78 TREAT /*
> */ AGESQ EDUCSQ RE74SQ RE75SQ U74BLACK U74HISP
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
         AGE |      2675    34.22579    10.49984         17         55
        EDUC |      2675    11.99439    3.053556          0         17
    NODEGREE |      2675    .3330841    .4714045          0          1
       BLACK |      2675    .2915888    .4545789          0          1
        HISP |      2675    .0343925    .1822693          0          1
-------------+--------------------------------------------------------
        MARR |      2675    .8194393    .3847257          0          1
         U74 |      2675    .1293458     .335645          0          1
         U75 |      2675    .1345794    .3413376          0          1
        RE74 |      2675       18230    13722.25          0     137149
        RE75 |      2675    17850.89    13877.78          0     156653
-------------+--------------------------------------------------------
        RE78 |      2675    20502.38    15632.52          0     121174
       TREAT |      2675    .0691589    .2537716          0          1
       AGESQ |      2675     1281.61    766.8415        289       3025
      EDUCSQ |      2675    153.1862    70.62231          0        289
      RE74SQ |      2675    5.21e+08    8.47e+08          0   1.88e+10
-------------+--------------------------------------------------------
      RE75SQ |      2675    5.11e+08    8.91e+08          0   2.45e+10
    U74BLACK |      2675    .0549533    .2279316          0          1
     U74HISP |      2675    .0056075    .0746868          0          1
.
. * Reproduce DW99 Table 1: RE74subset Treated and PSID-1 rows
. * Same as CT Table 25.3 page 890
. * except for changes to U74, U75 and U74BLACK
. bysort TREAT: sum AGE EDUC NODEGREE BLACK HISP MARR U74 U75 RE74 RE75 RE78 TREAT /*
> */ AGESQ EDUCSQ RE74SQ RE75SQ U74BLACK

------------------------------------------------------------------------------
-> TREAT = 0

    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
         AGE |      2490     34.8506    10.44076         18         55
        EDUC |      2490    12.11687    3.082435          0         17
    NODEGREE |      2490    .3052209    .4605934          0          1
       BLACK |      2490    .2506024     .433447          0          1
        HISP |      2490    .0325301    .1774389          0          1
-------------+--------------------------------------------------------
        MARR |      2490    .8662651    .3404357          0          1
         U74 |      2490    .0863454    .2809298          0          1
         U75 |      2490          .1    .3000603          0          1
        RE74 |      2490    19428.75    13406.88          0     137149
        RE75 |      2490    19063.34    13596.95          0     156653
-------------+--------------------------------------------------------
        RE78 |      2490    21553.92    15555.35          0     121174
       TREAT |      2490           0           0          0          0
       AGESQ |      2490     1323.53     769.796        324       3025
      EDUCSQ |      2490    156.3161    71.43048          0        289
      RE74SQ |      2490    5.57e+08    8.66e+08          0   1.88e+10
-------------+--------------------------------------------------------
      RE75SQ |      2490    5.48e+08    9.12e+08          0   2.45e+10
    U74BLACK |      2490    .0144578    .1193923          0          1

------------------------------------------------------------------------------
-> TREAT = 1

    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
         AGE |       185    25.81622    7.155019         17         48
        EDUC |       185    10.34595     2.01065          4         16
    NODEGREE |       185    .7081081    .4558666          0          1
       BLACK |       185    .8432432    .3645579          0          1
        HISP |       185    .0594595    .2371244          0          1
-------------+--------------------------------------------------------
        MARR |       185    .1891892    .3927217          0          1
         U74 |       185    .7081081    .4558666          0          1
         U75 |       185          .6    .4912274          0          1
        RE74 |       185    2095.574    4886.623          0    35040.1
        RE75 |       185    1532.056    3219.251          0    25142.2
-------------+--------------------------------------------------------
        RE78 |       185    6349.145    7867.405          0    60307.9
       TREAT |       185           1           0          1          1
       AGESQ |       185    717.3946    431.2517        289       2304
      EDUCSQ |       185    111.0595    39.30388         16        256
      RE74SQ |       185    2.81e+07    1.14e+08          0   1.23e+09
-------------+--------------------------------------------------------
      RE75SQ |       185    1.27e+07    5.60e+07          0   6.32e+08
    U74BLACK |       185          .6    .4912274          0          1
.
. save nswpsid, replace
file nswpsid.dta saved
.
. ********** ANALYSIS: (1) CALCULATE EFFECT OF TRAINING (Table 25.4, p.891) **********
.
. ***** (1A) TREATMENT-CONTROL COMPARISON USING POST_TREATMENT EARNINGS
. ***** [Difference in means]
.
. * DW99 Table 5 column 1 and Table 3 column 1
. regress RE78 T
      Source |          SS     df           MS
-------------+--------------------------------   F(  1,  2673) =  173.41
       Model |  3.9811e+10      1   3.9811e+10   Prob > F      =  0.0000
    Residual |  6.1365e+11   2673    229573201   R-squared     =  0.0609
-------------+--------------------------------   Adj R-squared =  0.0606
       Total |  6.5346e+11   2674    244375675   Root MSE      =   15152

----------------------------------------------------------------------------
        RE78 |      Coef.  Std. Err.        t  P>|t|    [95% Conf. Interval]
-------------+--------------------------------------------------------------
       TREAT |  -15204.78   1154.614   -13.17  0.000    -17468.8   -12940.75
       _cons |   21553.92   303.6414    70.98  0.000    20958.53    22149.32
----------------------------------------------------------------------------
.
. * CT Table 25.4 p.891 first row uses heteroskedastic-robust standard errors

. regress RE78 TREAT, robust
Number of obs =    2675
F(  1,  2673) =  537.36
Prob > F      =  0.0000
R-squared     =  0.0609
Root MSE      =   15152

----------------------------------------------------------------------------
             |             Robust
        RE78 |      Coef.  Std. Err.        t  P>|t|    [95% Conf. Interval]
-------------+--------------------------------------------------------------
       TREAT |  -15204.78   655.9143   -23.18  0.000   -16490.93   -13918.63
       _cons |   21553.92    311.785    69.13  0.000    20942.56    22165.29
----------------------------------------------------------------------------
. estimates store treatcontrol
.
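With a single 0/1 regressor, the OLS slope equals the difference in group means and the intercept equals the control-group mean, which is why the TREAT coefficient above matches the gap between the RE78 means reported for TREAT = 1 and TREAT = 0. A small Python sketch (not part of the original Stata log; the function name and toy data are hypothetical) illustrates this:

```python
def ols_on_dummy(y, d):
    """OLS of y on a constant and a 0/1 dummy d (simple-regression formulas)."""
    n = len(y)
    ybar = sum(y) / n
    dbar = sum(d) / n
    sxy = sum((di - dbar) * (yi - ybar) for yi, di in zip(y, d))
    sxx = sum((di - dbar) ** 2 for di in d)
    slope = sxy / sxx
    intercept = ybar - slope * dbar
    return intercept, slope


# Hypothetical toy sample: three controls (d = 0) and two treated (d = 1).
y = [10.0, 12.0, 14.0, 3.0, 5.0]
d = [0, 0, 0, 1, 1]
intercept, slope = ols_on_dummy(y, d)
mean_control = sum(yi for yi, di in zip(y, d) if di == 0) / d.count(0)
mean_treated = sum(yi for yi, di in zip(y, d) if di == 1) / d.count(1)
print(intercept, slope, mean_control, mean_treated - mean_control)

# Same identity with the group means of RE78 reported in the log
# (6349.145 for the treated, 21553.92 for the controls).
reported_gap = 6349.145 - 21553.92
print(round(reported_gap, 2))
```

The reported gap agrees with the TREAT coefficient of -15204.78 up to rounding of the printed means.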
. ***** (1B) CONTROL FUNCTION ESTIMATOR Additionally Include pre-treatment controls
.
. * DW99 Table 5 column 2 using regressors in footnote a
. * Same as DW99 Table 2 column 14
. regress RE78 TREAT AGE AGESQ EDUC NODEGREE BLACK HISP RE74 RE75
      Source |          SS     df           MS
-------------+--------------------------------   F(  9,  2665) =  419.22
       Model |  3.8296e+11      9   4.2551e+10   Prob > F      =  0.0000
    Residual |  2.7050e+11   2665    101500967   R-squared     =  0.5860
-------------+--------------------------------   Adj R-squared =  0.5847
       Total |  6.5346e+11   2674    244375675   Root MSE      =   10075

----------------------------------------------------------------------------
        RE78 |      Coef.  Std. Err.        t  P>|t|    [95% Conf. Interval]
-------------+--------------------------------------------------------------
       TREAT |   217.9438   866.1968     0.25  0.801   -1480.542     1916.43
         AGE |   158.5058   155.4065     1.02  0.308   -146.2239    463.2354
       AGESQ |  -3.232885    2.11617    -1.53  0.127   -7.382386    .9166173
        EDUC |   564.6237     103.56     5.45  0.000    361.5577    767.6898
    NODEGREE |   502.0912   647.0243     0.78  0.438   -766.6292    1770.812
       BLACK |  -699.3353   493.1811    -1.42  0.156   -1666.392    267.7211
        HISP |   2226.535    1092.71     2.04  0.042    83.88965    4369.181
        RE74 |   .2791682   .0279297    10.00  0.000    .2244021    .3339343
        RE75 |   .5680874   .0275763    20.60  0.000    .5140143    .6221605
       _cons |  -2836.703   2901.443    -0.98  0.328    -8526.01    2852.604
----------------------------------------------------------------------------
.
. * CT Table 25.4 p.891 second row uses heteroskedastic-robust standard errors
. regress RE78 TREAT AGE AGESQ EDUC NODEGREE BLACK HISP RE74 RE75, robust
Number of obs =    2675
F(  9,  2665) =  232.85
Prob > F      =  0.0000
R-squared     =  0.5860
Root MSE      =   10075

----------------------------------------------------------------------------
             |             Robust
        RE78 |      Coef.  Std. Err.        t  P>|t|    [95% Conf. Interval]
-------------+--------------------------------------------------------------
       TREAT |   217.9438   767.8811     0.28  0.777   -1287.759    1723.647
         AGE |   158.5058   151.0305     1.05  0.294   -137.6431    454.6546
       AGESQ |  -3.232885   2.103324    -1.54  0.124   -7.357197     .891428
        EDUC |   564.6237   121.6483     4.64  0.000    326.0891    803.1583
    NODEGREE |   502.0912   632.3685     0.79  0.427   -737.8914    1742.074
       BLACK |  -699.3353   432.4582    -1.62  0.106   -1547.323    148.6523
        HISP |   2226.535    1219.08     1.83  0.068   -163.9034    4616.974
        RE74 |   .2791682   .0618802     4.51  0.000    .1578301    .4005063
        RE75 |   .5680874   .0663995     8.56  0.000    .4378876    .6982872
       _cons |  -2836.703   2937.385    -0.97  0.334   -8596.487    2923.081
----------------------------------------------------------------------------
. estimates store controlfunction
.
. * Variation that lets OLS coefficients differ across treatment and controls
. * Interaction of regressors with T
. gen TAGE = TREAT*AGE
. gen TAGESQ = TREAT*AGESQ
. gen TEDUC = TREAT*EDUC
. gen TNODEGREE = TREAT*NODEGREE
. gen TBLACK = TREAT*BLACK
. gen THISP = TREAT*HISP
. gen TRE74 = TREAT*RE74
. gen TRE75 = TREAT*RE75
. regress RE78 TREAT AGE AGESQ EDUC NODEGREE BLACK HISP RE74 RE75 /*
> */TAGE TAGESQ TEDUC TNODEGREE TBLACK THISP TRE74 TRE75
      Source |          SS     df           MS
-------------+--------------------------------   F( 17,  2657) =  223.17
       Model |  3.8431e+11     17   2.2607e+10   Prob > F      =  0.0000
    Residual |  2.6915e+11   2657    101297131   R-squared     =  0.5881
-------------+--------------------------------   Adj R-squared =  0.5855
       Total |  6.5346e+11   2674    244375675   Root MSE      =   10065

----------------------------------------------------------------------------
        RE78 |      Coef.  Std. Err.        t  P>|t|    [95% Conf. Interval]
-------------+--------------------------------------------------------------
       TREAT |  -8202.823   11960.39    -0.69  0.493   -31655.45     15249.8
         AGE |   79.46291   165.6177     0.48  0.631   -245.2897    404.2155
       AGESQ |  -2.260967   2.239074    -1.01  0.313   -6.651471    2.129537
        EDUC |   567.4906   106.2026     5.34  0.000    359.2424    775.7388
    NODEGREE |   655.3534   679.5015     0.96  0.335    -677.052    1987.759
       BLACK |  -707.0551   505.0048    -1.40  0.162   -1697.297    283.1872
        HISP |   2553.662   1154.726     2.21  0.027    289.4107    4817.914
        RE74 |   .2869368   .0282197    10.17  0.000     .231602    .3422715
        RE75 |   .5677759   .0277689    20.45  0.000    .5133251    .6222267
        TAGE |   668.0022   745.1401     0.90  0.370   -793.1112    2129.116
      TAGESQ |  -8.651515   12.26876    -0.71  0.481    -32.7088    15.40577
       TEDUC |  -27.54033   529.1855    -0.05  0.958   -1065.197    1010.117
   TNODEGREE |  -963.4163   2410.973    -0.40  0.689   -5690.989    3764.157
      TBLACK |  -384.5853   2593.349    -0.15  0.882   -5469.772    4700.601
       THISP |  -2126.096   4086.539    -0.52  0.603   -10139.22    5887.023
       TRE74 |  -.2540934   .2070566    -1.23  0.220   -.6601018    .1519151
       TRE75 |   -.472797   .3097211    -1.53  0.127   -1.080116    .1345218
       _cons |  -1603.593   3069.895    -0.52  0.601   -7623.219    4416.032
----------------------------------------------------------------------------
.
. ***** (1D) DIFFERENCE-IN-DIFFERENCES
.
. * Need to stack two separate years of data RE75 and RE78
. * into a panel of two years on RE
. gen id = _n
. label variable id "id"
. gen EARNS1 = RE75
. gen EARNS2 = RE78
. reshape long EARNS, i(id) j(year)
(note: j = 1 2)
Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                     2675   ->   5350
Number of variables                  31   ->     31
j variable (2 values)                     ->   year
xij variables:
                          EARNS1 EARNS2   ->   EARNS
-----------------------------------------------------------------------------
. gen dyear2 = 0
. replace dyear2 = 1 if year == 2
. gen Tdyear2 = TREAT*dyear2
. regress EARNS Tdyear2 TREAT dyear2
      Source |          SS     df           MS
-------------+--------------------------------   F(  3,  5346) =  169.20
       Model |  1.0214e+11      3   3.4047e+10   Prob > F      =  0.0000
    Residual |  1.0757e+12   5346    201218724   R-squared     =  0.0867
-------------+--------------------------------   Adj R-squared =  0.0862
       Total |  1.1779e+12   5349    220201247   Root MSE      =   14185

----------------------------------------------------------------------------
       EARNS |      Coef.  Std. Err.        t  P>|t|    [95% Conf. Interval]
-------------+--------------------------------------------------------------
     Tdyear2 |   2326.505   1528.712     1.52  0.128   -670.3928    5323.403
       TREAT |  -17531.28   1080.962   -16.22  0.000   -19650.41   -15412.15
      dyear2 |   2490.585   402.0217     6.20  0.000    1702.458    3278.711
       _cons |   19063.34   284.2723    67.06  0.000    18506.05    19620.63
----------------------------------------------------------------------------
.
. * CT Table 25.4 p.891 fourth row uses heteroskedastic-robust standard errors
. regress EARNS Tdyear2 TREAT dyear2, robust
Number of obs =    5350
F(  3,  5346) = 1222.98
Prob > F      =  0.0000
R-squared     =  0.0867
Root MSE      =   14185

----------------------------------------------------------------------------
             |             Robust
       EARNS |      Coef.  Std. Err.        t  P>|t|    [95% Conf. Interval]
-------------+--------------------------------------------------------------
     Tdyear2 |   2326.505   748.5021     3.11  0.002    859.1359    3793.875
       TREAT |  -17531.28   360.5992   -48.62  0.000    -18238.2   -16824.36
      dyear2 |   2490.585   414.1056     6.01  0.000    1678.769      3302.4
       _cons |   19063.34   272.5318    69.95  0.000    18529.06    19597.61
----------------------------------------------------------------------------
. estimates store diffindiff
.
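The coefficients of this saturated regression are simple functions of four group means: the RE75 and RE78 means for TREAT = 0 and TREAT = 1 reported in the summary tables earlier in this log. As a rough Python cross-check (not part of the original Stata log; the function name is illustrative):

```python
def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Return (DiD estimate, before-after change for treated, control change)."""
    before_after = treated_post - treated_pre
    control_change = control_post - control_pre
    return before_after - control_change, before_after, control_change


# Group means copied from the bysort TREAT summary output:
# RE75 (pre) and RE78 (post) for treated and controls.
treated_pre, treated_post = 1532.056, 6349.145
control_pre, control_post = 19063.34, 21553.92

did, before_after, control_change = diff_in_diff(
    treated_pre, treated_post, control_pre, control_post)

# Mapping to the regression of EARNS on Tdyear2, TREAT and dyear2:
#   _cons   = control mean before
#   TREAT   = treated mean before - control mean before
#   dyear2  = change in the control mean
#   Tdyear2 = difference-in-differences
treat_coef = treated_pre - control_pre
print(round(did, 2), round(before_after, 2),
      round(control_change, 2), round(treat_coef, 2))
```

The results agree with the reported coefficients (2326.505, 2490.585, -17531.28) up to rounding of the printed means; the before-after change for the treated is the quantity estimated in part (1C).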
. * Adding pretreatment controls makes no difference to the Tdyear2 coefficient as they are time-invariant
. regress EARNS Tdyear2 TREAT dyear2 AGE AGESQ EDUC NODEGREE BLACK HISP
      Source |          SS     df           MS
-------------+--------------------------------   F(  9,  5340) =  184.54
       Model |  2.7943e+11      9   3.1048e+10   Prob > F      =  0.0000
    Residual |  8.9843e+11   5340    168245017   R-squared     =  0.2372
-------------+--------------------------------   Adj R-squared =  0.2359
       Total |  1.1779e+12   5349    220201247   Root MSE      =   12971

----------------------------------------------------------------------------
       EARNS |      Coef.  Std. Err.        t  P>|t|    [95% Conf. Interval]
-------------+--------------------------------------------------------------
     Tdyear2 |   2326.505   1397.856     1.66  0.096   -413.8634    5066.874
       TREAT |  -9766.469   1043.296    -9.36  0.000   -11811.76   -7721.183
      dyear2 |   2490.585   367.6092     6.78  0.000     1769.92    3211.249
         AGE |   1357.093   139.6885     9.72  0.000    1083.246    1630.939
       AGESQ |  -15.23373   1.911801    -7.97  0.000   -18.98164   -11.48582
        EDUC |   1504.728   91.99622    16.36  0.000    1324.377    1685.078
    NODEGREE |  -447.8275   588.8841    -0.76  0.447   -1602.281    706.6257
       BLACK |  -3177.524   446.5098    -7.12  0.000   -4052.865   -2302.182
        HISP |  -360.5058   993.7164    -0.36  0.717   -2308.596    1587.584
       _cons |  -25357.74   2618.207    -9.69  0.000   -30490.49   -20224.98
----------------------------------------------------------------------------
.
. ***** (1C) BEFORE-AFTER COMPARISON
.
. * Regression for treated only
. regress EARNS Tdyear2 if TREAT==1
      Source |          SS     df           MS   Number of obs =     370
-------------+--------------------------------   F(  1,   368) =   59.41
       Model |  2.1464e+09      1   2.1464e+09   Prob > F      =  0.0000
    Residual |  1.3296e+10    368   36129816.6   R-squared     =  0.1390
-------------+--------------------------------   Adj R-squared =  0.1367
       Total |  1.5442e+10    369   41848713.4   Root MSE      =  6010.8

----------------------------------------------------------------------------
       EARNS |      Coef.  Std. Err.        t  P>|t|    [95% Conf. Interval]
-------------+--------------------------------------------------------------
     Tdyear2 |    4817.09   624.9741     7.71  0.000    3588.121    6046.058
       _cons |   1532.056   441.9234     3.47  0.001    663.0436    2401.068
----------------------------------------------------------------------------
.
. * CT Table 25.4 p.891 third row uses heteroskedastic-robust standard errors
. regress EARNS Tdyear2 if TREAT==1, robust
Number of obs =     370
F(  1,   368) =   59.41
Prob > F      =  0.0000
R-squared     =  0.1390
Root MSE      =  6010.8

----------------------------------------------------------------------------
             |             Robust
       EARNS |      Coef.  Std. Err.        t  P>|t|    [95% Conf. Interval]
-------------+--------------------------------------------------------------
     Tdyear2 |    4817.09   624.9741     7.71  0.000    3588.121    6046.058
       _cons |   1532.056    236.684     6.47  0.000    1066.633    1997.478
----------------------------------------------------------------------------
. estimates store beforeafter
.
. ***** DISPLAY RESULTS FOR FIRST FOUR ROWS OF Table 25.4, p.891
.
. estimates table treatcontrol controlfunction beforeafter diffindiff, /*
> */ b(%10.0f) se(%10.0f) stats(N)
------------------------------------------------------------------
    Variable | treatcon~l   controlf~n   beforeaf~r   diffindiff
-------------+----------------------------------------------------
       TREAT |     -15205          218                    -17531
             |        656          768                       361
         AGE |                     159
             |                     151
       AGESQ |                      -3
             |                       2
        EDUC |                     565
             |                     122
    NODEGREE |                     502
             |                     632
       BLACK |                    -699
             |                     432
        HISP |                    2227
             |                    1219
        RE74 |                       0
             |                       0
        RE75 |                       1
             |                       0
     Tdyear2 |                                  4817         2327
             |                                   625          749
      dyear2 |                                               2491
             |                                                414
       _cons |      21554        -2837         1532        19063
             |        312         2937          237          273
-------------+----------------------------------------------------
           N |       2675         2675          370         5350
------------------------------------------------------------------
                                                     legend: b/se
.
. ********** ANALYSIS: (2) PROPENSITY SCORE USING STRATA (Table 25.4, p.891) **********
.
. use nswpsid, clear
.
. ***** (2A) COMPUTE PROPENSITY SCORE
.
. * Calculate propensity score using regressors in DW99 Table 3 footnote e
. logit TREAT AGE AGESQ EDUC EDUCSQ MARR NODEGREE BLACK HISP RE74 RE75 RE74SQ RE75SQ U74BLACK

Logit estimates
Number of obs =    2675
LR chi2(13)   =  935.44
Prob > chi2   =  0.0000
Pseudo R2     =  0.6953

----------------------------------------------------------------------------
       TREAT |      Coef.  Std. Err.        z  P>|z|    [95% Conf. Interval]
-------------+--------------------------------------------------------------
         AGE |   .3305734   .1203353     2.75  0.006    .0947206    .5664262
       AGESQ |  -.0063429   .0018561    -3.42  0.001   -.0099808   -.0027049
        EDUC |   .8247711   .3534216     2.33  0.020    .1320775    1.517465
      EDUCSQ |  -.0483153   .0186057    -2.60  0.009   -.0847819   -.0118488
        MARR |  -1.884062   .2994614    -6.29  0.000   -2.470996   -1.297129
    NODEGREE |   .1299868   .4284278     0.30  0.762   -.7097163      .96969
       BLACK |   1.132961    .352088     3.22  0.001    .4428814    1.823041
        HISP |   1.962762   .5673735     3.46  0.001    .8507302    3.074793
        RE74 |  -.0001047   .0000355    -2.95  0.003   -.0001743   -.0000351
        RE75 |  -.0002172   .0000415    -5.23  0.000   -.0002986   -.0001357
      RE74SQ |   2.36e-09   6.57e-10     3.59  0.000    1.07e-09    3.65e-09
      RE75SQ |   1.58e-10   6.68e-10     0.24  0.813   -1.15e-09    1.47e-09
    U74BLACK |   2.137042   .4273667     5.00  0.000    1.299419    2.974665
       _cons |  -7.552458   2.451721    -3.08  0.002   -12.35774   -2.747173
----------------------------------------------------------------------------
note: 19 failures and 0 successes completely determined.
. * Note that Table 25.6 footnote b is wrong in stating RE74*RE75 is regressor

. predict PSCORE
(option p assumed; Pr(TREAT))
.
. ***** (2B) PLOT PROPENSITY SCORE BY TREATMENT STATUS TO SEE OVERLAP
.
. * Observations with no overlap in propensity score across treatment status are dropped
.
. sum PSCORE if TREAT==1
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
      PSCORE |       185    .6876511    .3095136   .0006526   .9748755
. scalar PTMIN = r(min)
. scalar PTMAX = r(max)
. sum PSCORE if TREAT==0
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
      PSCORE |      2490    .0232066    .0901373   4.49e-11   .9735255
. scalar PCMIN = r(min)
. scalar PCMAX = r(max)
. drop if PSCORE < PTMIN
. drop if PSCORE < PCMIN
. drop if PSCORE > PTMAX
. drop if PSCORE > PCMAX
. * Following gives number of observations left
. sum PSCORE
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
      PSCORE |      1325    .1350934    .2703797   .0006526   .9735255
.
. * This differs from CT text page 893 as now U74 and U75 are corrected
. * Instead of losing 1423 controls and 8 treated leaving 1244
. * now lose 1344 controls and 6 treated leaving 1325
. * versus DW Figure 1 where 1333 controls are dropped leaving 1342
. * and DW Table 3 column 6 says that there are 1255 left
.
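The four drop commands in (2B) keep only observations whose propensity score lies between the larger of the two group minima and the smaller of the two group maxima. A minimal Python sketch of that trimming rule follows; it is not part of the original Stata log, and the function names and toy records are hypothetical.

```python
def common_support(treated_scores, control_scores):
    """Overlap region: [max of the group minima, min of the group maxima]."""
    lower = max(min(treated_scores), min(control_scores))
    upper = min(max(treated_scores), max(control_scores))
    return lower, upper


def keep_on_support(records, lower, upper):
    """Keep (treat, pscore) records whose score lies inside the overlap."""
    return [(d, p) for d, p in records if lower <= p <= upper]


# Extremes of PSCORE by treatment status, copied from the log
# (only the minimum and maximum of each group matter for the bounds).
lower, upper = common_support([0.0006526, 0.9748755], [4.49e-11, 0.9735255])
print(lower, upper)  # 0.0006526 0.9735255

# Hypothetical records: one treated above the overlap, one control below it.
toy = [(1, 0.98), (1, 0.50), (0, 1e-6), (0, 0.30)]
kept = keep_on_support(toy, lower, upper)
print(kept)  # [(1, 0.5), (0, 0.3)]
```

The bounds reproduce the minimum and maximum of PSCORE in the trimmed sample of 1325 observations.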
. ***** (2C) CREATE FIGURE 25.3 ON PAGE 892
.
. * This will differ a little from figure in text due to U74 and U75 corrected
.
. label define tstatus 0 Comparison_sample 1 Treated_sample
. label values TREAT tstatus
. label variable TREAT "Treatment Status"
. graph twoway (scatter RE78 PSCORE if RE78 < 20000, msize(small)) /*
> */ (lowess RE78 PSCORE, bwidth(0.5) clpattern(solid)), /*
> */ by(TREAT, title("Post-treatment Earnings against Propensity Score", margin(b=3) size(vlarge))) /*
> */ subtitle(, bfcolor(none)) /*
> */ xtitle(" Propensity Score Propensity Score", size(medlarge)) xscale(titlegap(*5)) /*
> */ ytitle("Real Earnings 1978", size(medlarge)) yscale(titlegap(*5)) /*
> */ legend(pos(12) ring(0) col(2)) /*
> */ legend( label(1 "Original data") label(2 "Nonparametric regression"))
. graph export ch25treatment.wmf, replace
(file c:\Imbook\bwebpage\Section6\ch25treatment.wmf written in Windows Metafile format)
.
. ***** (2D) ADJUSTED DIFFERENCE Use PSCORE to summarize pre-treatment controls
.
. * A simple method regresses RE78 on a quadratic in PSCORE and on TREAT
. * and measures the treatment effect as the coefficient of TREAT
.
. gen PSCORESQ = PSCORE*PSCORE
. regress RE78 TREAT PSCORE PSCORESQ
      Source |          SS     df           MS
-------------+--------------------------------   F(  3,  1321) =   46.14
       Model |  1.5152e+10      3   5.0505e+09   Prob > F      =  0.0000
    Residual |  1.4458e+11   1321    109450232   R-squared     =  0.0949
-------------+--------------------------------   Adj R-squared =  0.0928
       Total |  1.5974e+11   1324    120645977   Root MSE      =   10462

----------------------------------------------------------------------------
        RE78 |      Coef.  Std. Err.        t  P>|t|    [95% Conf. Interval]
-------------+--------------------------------------------------------------
       TREAT |   301.5344   1388.756     0.22  0.828   -2422.874    3025.943
      PSCORE |  -39475.21   4836.678    -8.16  0.000   -48963.62    -29986.8
    PSCORESQ |   33122.86   5037.943     6.57  0.000    23239.61     43006.1
       _cons |   14560.51   347.3596    41.92  0.000    13879.07    15241.95
----------------------------------------------------------------------------
.
. * This yields coefficient of 301 with nonrobust se of 1389
. * which is close to DW 99 Table 3 column 3
. * coefficient of 294 with nonrobust se of 1389
.
. ***** (2E) CREATE STRATA
.
. * DW are not clear on how the strata are formed.
. * NBER Working Paper W6829 appendix suggests forming five cells
. * according to range of PSCORE (where nonoverlapping PSCOREs already dropped)
.
. * Here we instead create ten strata
. * for PSCORE <0.1, 0.1-0.2, ...., 0.8-0.9 and > 0.9
. global cut1 = 0.1
. global cut2 = 0.2
. global cut3 = 0.3
. global cut4 = 0.4
. global cut5 = 0.5
. global cut6 = 0.6
. global cut7 = 0.7
. global cut8 = 0.8
. global cut9 = 0.9
. gen STRATA = 1
. replace STRATA = 2 if PSCORE > $cut1 & PSCORE <= $cut2
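. replace STRATA = 3 if PSCORE > $cut2 & PSCORE <= $cut3
. replace STRATA = 4 if PSCORE > $cut3 & PSCORE <= $cut4
. replace STRATA = 5 if PSCORE > $cut4 & PSCORE <= $cut5
. replace STRATA = 6 if PSCORE > $cut5 & PSCORE <= $cut6
. replace STRATA = 7 if PSCORE > $cut6 & PSCORE <= $cut7
. replace STRATA = 8 if PSCORE > $cut7 & PSCORE <= $cut8
. replace STRATA = 9 if PSCORE > $cut8 & PSCORE <= $cut9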
. replace STRATA = 10 if PSCORE > $cut9
.
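The stratum rule above assigns each propensity score to one of ten bins of width 0.1, with each bin open on the left and closed on the right. A compact Python sketch of the same rule follows; it is not part of the original Stata log, the function name is illustrative, and floating-point storage of PSCORE in Stata could place a boundary value differently.

```python
import bisect

# Cut points 0.1, 0.2, ..., 0.9, as in the globals cut1-cut9.
CUTS = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]


def stratum(pscore):
    """Stratum 1 if pscore <= 0.1, k if cut(k-1) < pscore <= cut(k), 10 if > 0.9."""
    # bisect_left counts the cut points strictly below pscore, so a score
    # exactly equal to a cut point stays in the lower stratum.
    return 1 + bisect.bisect_left(CUTS, pscore)


examples = [0.05, 0.1, 0.15, 0.9, 0.95]
print([stratum(p) for p in examples])  # [1, 1, 2, 9, 10]
```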
. tab STRATA T
           |   Treatment Status
    STRATA | Compariso  Treated_s |     Total
-----------+----------------------+----------
         1 |     1,018         11 |     1,029
         2 |        53          7 |        60
         3 |        24         11 |        35
         4 |        17         16 |        33
         5 |         8          5 |        13
         6 |         6         15 |        21
         7 |         8         14 |        22
         8 |         5          8 |        13
         9 |         0         13 |        13
        10 |         7         79 |        86
-----------+----------------------+----------
     Total |     1,146        179 |     1,325
.
. ***** (2F) Test for similar regressor means for treated and nontreated within each Strata
.
. * Compare means within Strata across treatment status
. tab STRATA TREAT, sum(AGE) nostand nofreq
Means of AGE

           |   Treatment Status
    STRATA | Compariso  Treated_s |     Total
-----------+----------------------+----------
         1 | 31.427308  30.363636 | 31.415938
         2 | 28.037736  28.714286 | 28.116667
         3 | 27.833333  27.909091 | 27.857143
         4 | 27.529412      28.25 | 27.878788
         5 |    28.875       27.8 | 28.461538
         6 |        25       23.4 | 23.857143
         7 |    24.875       24.5 | 24.636364
         8 |      24.8         32 | 29.230769
         9 |         .  29.461538 | 29.461538
        10 | 23.285714  23.367089 | 23.360465
-----------+----------------------+----------
     Total | 30.961606  25.765363 | 30.259623
. tab STRATA TREAT, sum(EDUC) nostand nofreq
Means of EDUC

           |   Treatment Status
    STRATA | Compariso  Treated_s |     Total
-----------+----------------------+----------
         1 | 11.229862  11.545455 | 11.233236
         2 | 10.433962  10.714286 | 10.466667
         3 | 10.583333  10.181818 | 10.457143
         4 | 10.647059    10.0625 | 10.363636
         5 |    10.625        9.4 | 10.153846
         6 | 9.3333333  10.066667 | 9.8571429
         7 |     9.875  11.071429 | 10.636364
         8 |      10.8      11.25 | 11.076923
         9 |         .         11 |        11
        10 | 10.571429  10.164557 | 10.197674
-----------+----------------------+----------
     Total | 11.141361  10.413408 | 11.043019
. tab STRATA TREAT, sum(MARR) nostand nofreq
Means of MARR

           |   Treatment Status
    STRATA | Compariso  Treated_s |     Total
-----------+----------------------+----------
         1 |  .8280943  .81818182 | .82798834
         2 | .56603774  .85714286 |        .6
         3 | .29166667  .18181818 | .25714286
         4 | .23529412        .25 | .24242424
         5 |       .25          0 | .15384615
         6 | .16666667  .06666667 |  .0952381
         7 |      .125  .07142857 | .09090909
         8 |        .2       .625 | .46153846
         9 |         .  .53846154 | .53846154
        10 |         0          0 |         0
-----------+----------------------+----------
     Total | .77574171  .19553073 | .69735849
. tab STRATA TREAT, sum(NODEGREE) nostand nofreq
Means of NODEGREE

           |   Treatment Status
    STRATA | Compariso  Treated_s |     Total
-----------+----------------------+----------
         1 | .38408644  .36363636 | .38386783
         2 | .62264151  .57142857 | .61666667
         3 |      .625  .54545455 |        .6
         4 | .52941176       .625 | .57575758
         5 |      .625         .8 | .69230769
         6 | .83333333         .8 | .80952381
         7 |      .625  .64285714 | .63636364
         8 |        .8        .75 | .76923077
         9 |         .  .76923077 | .76923077
        10 | .71428571  .75949367 | .75581395
-----------+----------------------+----------
     Total | .41186736  .69832402 | .45056604
. tab STRATA TREAT, sum(BLACK) nostand nofreq
Means of BLACK

           |   Treatment Status
    STRATA | Compariso  Treated_s |     Total
-----------+----------------------+----------
         1 | .36247544  .63636364 |  .3654033
         2 | .60377358  .57142857 |        .6
         3 | .66666667  .54545455 | .62857143
         4 | .88235294       .875 | .87878788
         5 |         1         .4 | .76923077
         6 | .83333333         .6 | .66666667
         7 |      .875  .92857143 | .90909091
         8 |        .8          1 | .92307692
         9 |         .  .92307692 | .92307692
        10 |         1  .94936709 | .95348837
-----------+----------------------+----------
     Total | .40401396  .83798883 | .46264151
. tab STRATA TREAT, sum(HISP) nostand nofreq
Means of HISP

           |   Treatment Status
    STRATA | Compariso  Treated_s |     Total
-----------+----------------------+----------
         1 | .04911591          0 | .04859086
         2 |  .0754717  .28571429 |        .1
         3 | .08333333          0 | .05714286
         4 |         0          0 |         0
         5 |         0         .2 | .07692308
         6 | .16666667  .13333333 | .14285714
         7 |      .125  .07142857 | .09090909
         8 |        .2          0 | .07692308
         9 |         .  .07692308 | .07692308
        10 |         0  .05063291 | .04651163
-----------+----------------------+----------
     Total | .05148342  .06145251 | .05283019
. tab STRATA TREAT, sum(RE74) nostand nofreq
Means of RE74

           |   Treatment Status
    STRATA | Compariso  Treated_s |     Total
-----------+----------------------+----------
         1 | 12216.528   12142.62 | 12215.738
         2 | 5989.8844  2031.6573 | 5528.0912
         3 | 6476.1906  5884.7335 | 6290.3041
         4 |  4790.868    4895.09 | 4841.3999
         5 | 2375.3662  5715.8799 | 3660.1792
         6 | 3173.6867  2402.9567 | 2623.1653
         7 | 1533.1259  2269.1672 | 2001.5158
         8 |  1567.414          0 | 602.85154
         9 |         .  34.243847 | 34.243847
        10 |         0          0 |         0
-----------+----------------------+----------
     Total | 11386.483  2165.8167 | 10140.823
. tab STRATA TREAT, sum(RE75) nostand nofreq
Means of RE75

           |   Treatment Status
    STRATA | Compariso  Treated_s |     Total
-----------+----------------------+----------
         1 | 10352.924  8964.4728 | 10338.081
         2 |  3916.448  3250.0113 |  3838.697
         3 | 2417.8314  2694.2624 | 2504.7097
         4 |   3134.96   2905.615 | 3023.7624
         5 | 3204.6788   1917.262 | 2709.5185
         6 |   2878.54  1731.1554 | 2058.9796
         7 | 643.84411  1230.5051 | 1017.1739
         8 | 2539.0337  1501.9275 | 1900.8145
         9 |         .  201.91542 | 201.91542
        10 | 127.88014  234.47151 | 225.79547
-----------+----------------------+----------
     Total | 9528.6389  1583.4094 | 8455.2834
. tab STRATA TREAT, sum(U74BLACK) nostand nofreq
Means of U74BLACK

           |   Treatment Status
    STRATA | Compariso  Treated_s |     Total
-----------+----------------------+----------
         1 | .01473477          0 | .01457726
         2 | .05660377  .14285714 | .06666667
         3 | .08333333  .09090909 | .08571429
         4 | .17647059      .1875 | .18181818
         5 |       .25         .2 | .23076923
         6 | .16666667  .06666667 |  .0952381
         7 |      .125  .21428571 | .18181818
         8 |        .4          1 | .76923077
         9 |         .  .92307692 | .92307692
        10 |         1  .94936709 | .95348837
-----------+----------------------+----------
     Total | .03141361  .58659218 | .10641509
.
. * Formal test of difference in means within strata across treatment status
. * Example is for education
. * bysort STRATA: oneway EDUC T
.
. ***** (2G) Calculate weighted average of within strata mean difference in outcome
.
. #delimit ;
delimiter now ;
. global sum = 0 ;
.
* Sums the estimate of interest over strata ;
. global sumwgt = 0 ;
. /* Sums the number of treated obs over strata */
> global count = 0 ;
.
/* This gives the number of Strata used
> global numcut = 10;
*/
. * Possibly include extra regressors.

> * Not clear which ones, so same as DW99 Table 3 footnote a for column 2
> global XLIST AGE AGESQ EDUC NODEGREE BLACK HISP RE74 RE75;
. forvalues i = 1/$numcut { ;
  2.   global addon = 0 ;
  3.   /* Within strata estimate of interest */
>      global tobs = 0 ;
  4.   /* Within strata number of treated obs */
>      capture { ;
  5.     quiet regress RE78 TREAT $XLIST if STRATA == `i' ;
  6.     global addon = _b[TREAT] ;
  7.     quiet summarize TREAT if TREAT==1 & STRATA==`i' ;
  8.     global tobs = _result(1) ;
  9.     * # of treatment observations ;
  .    };
 10.   di "`i' estimate = $addon   Top cut = ${cut`i'}   #treat obs = $tobs" ;
 11.   if $addon ~= 0 { ;
 12.     global sum = $sum + $addon * $tobs ;
 13.     global sumwgt = $sumwgt + $tobs ;
 14.     global count = $count + 1 ;
 15.   } ;
 16. } ;
1 estimate = -4410.946812653378   Top cut = .1   #treat obs = 11
2 estimate = -2113.275144674707   Top cut = .2   #treat obs = 7
3 estimate = 1486.684503266305   Top cut = .3   #treat obs = 11
4 estimate = -6085.742371951832   Top cut = .4   #treat obs = 16
5 estimate = 1899.984014892578   Top cut = .5   #treat obs = 5
6 estimate = -411.1481648763024   Top cut = .6   #treat obs = 15
7 estimate = 133.9267490931921   Top cut = .7   #treat obs = 14
8 estimate = 1848.656362915039   Top cut = .8   #treat obs = 8
9 estimate = 0   Top cut = .9   #treat obs = 13
10 estimate = 4857.563579676591   Top cut =    #treat obs = 79
. #delimit cr ;
delimiter now cr
.
.
. ***** DISPLAY RESULT: "Propensity Score" estimate in last row Table 25.4
.
. * Weighted estimate
. di $sum / $sumwgt "   Count = " $count
1562.7274   Count = 9
.
. * This differs from value 995 given in text due to
. * previously mentioned correction of U74 and U75.
. * Now get 1562 with se not estimated
. * compared to DW99 estimates Table 3 column 4 1608 and column 5 1494
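
The weighted estimate displayed above can be checked by hand from the per-stratum output. The following is a minimal Python sketch, not part of the original Stata program: it takes the ten within-stratum TREAT coefficients and treated counts printed by the loop, skips any stratum whose estimate is 0 (stratum 9, which has no comparison observations), and weights the rest by the number of treated observations. The variable names are illustrative only.

```python
# Within-stratum estimates of the TREAT coefficient and treated counts,
# copied from the loop output (strata 1 to 10).
estimates = [-4410.946812653378, -2113.275144674707, 1486.684503266305,
             -6085.742371951832, 1899.984014892578, -411.1481648763024,
             133.9267490931921, 1848.656362915039, 0.0, 4857.563579676591]
treated_obs = [11, 7, 11, 16, 5, 15, 14, 8, 13, 79]

# Mirror the Stata loop: a stratum contributes only if its estimate is nonzero.
used = [(b, n) for b, n in zip(estimates, treated_obs) if b != 0]

weighted_sum = sum(b * n for b, n in used)   # plays the role of $sum
total_weight = sum(n for _, n in used)       # plays the role of $sumwgt
count = len(used)                            # plays the role of $count

weighted_estimate = weighted_sum / total_weight
print(round(weighted_estimate, 4), count)
```

Because stratum 9 is dropped, the weights sum to 166 treated observations rather than 179.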
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section6\mma25p1treatment.txt
log type: text
closed on: 26 May 2005, 10:26:22
-----------------------------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section6\mma25p2matching.txt
log type: text
opened on: 26 May 2005, 10:26:31
.
. ********** OVERVIEW OF MMA25P2MATCHING.DO **********
.
. * STATA Program

.
. * Chapter 25.8.5 pages 893-6 Tables 25.5-25.7
. * using Dehejia-Wahba data (originally Lalonde data)
.
. * (1) For DW 2002 specification of the logit model for propensity score
. * calculate treatment effect by matching methods (Tables 25.5-6)
. * ( ) give distribution of propensity score (Table 25.5)
. * (1A) nearest neighbor matching
. * (1B) radius matching r = 0.001
. * (1C) radius matching r = 0.0001
. * (1D) radius matching r = 0.00001
. * (1E) stratification
. * (1F) kernel matching
. * (2) For DW 1999 specification of the logit model for propensity score
. * calculate treatment effect by matching methods (Table 25.6)
.
. * The program MMA25P1TREATMENT.DO provides simpler nonmatching methods
. * for the same data.
.
. * nswpsid.da1
.
. * To run this program you need the Stata add-ons
. * pscore.ado, atts.ado, attr.ado, attnd.ado, attnw.ado
. * due to Sascha O. Becker and Andrea Ichino (2002)
. * "Estimation of average treatment effects based on propensity scores",
. * The Stata Journal, Vol.2, No.4, pp. 358-377.
.
. * This program uses version 2.02 May 13 2005 for Stata version 8
. * downloadable from http://www.iue.it/Personal/Ichino/#pscore
. * We earlier used version 1.29 October 8 2002 for Stata version 7
. * and obtained the same results
.
. * To speed up the program reduce breps: the number of bootstrap
. * replications used to obtain bootstrap standard errors
. * Bootstrap se's will differ from text as here seed is set to 10101
.
. ********** STATA SETUP **********
.
. set more off
. version 8
.
.
. * Data set nswpsid.da1 is data set nswpsid.da1 from Guido Imbens

. * http://emlab.berkeley.edu/users/imbens/index.shtml
.
. * or DW02
.
. * Each observation is for an individual.
. * There are 2,675 observations: 185 in treated group and 2490 in control
.
. * Variables are
. * TREAT 1 if treated (NSW treated) and 0 if not (PSID-1 control)
. * AGE in years
. * EDUC in years
. * BLACK 1 if black
. * HISP 1 if hispanic
. * MARR 1 if married
. * RE78 Real annual earnings in 1978 (post-treatment)
.
. * NOTE: U74 and U75 are miscoded in these data and also in the
. *       summary statistics table of DW02
. *       See below for correction to data
.
. ********** READ DATA AND TRANSFORMATIONS **********
.
. ****** propensity score for nsw-psid composite sample*************
. ****** output for MMA Tables 25.6 & 25.7 ***********************
.
. infile TREAT AGE EDUC BLACK HISP MARR RE74 RE75 RE78 U74 U75 /*
> */ using nswpsid.da1
.
. * The original data reversed U74 and U75
. * Should be U74=1 if RE74=0 and U74=0 if RE74>0 and similar for U75
. * This affects results with propensity score though not earlier results
.
. * Wrong U74 and U75
. sum U74 U75
    Variable |     Obs        Mean   Std. Dev.       Min        Max
-------------+-----------------------------------------------------
         U74 |    2675    .1345794   .3413376          0          1
         U75 |    2675    .1293458    .335645          0          1
.
. * Correct the original data
. drop U74 U75
. gen U74 = cond(RE74 == 0, 1, 0)
. gen U75 = cond(RE75 == 0, 1, 0)
.
. * Correct U74 and U75
. sum U74 U75
    Variable |     Obs        Mean   Std. Dev.       Min        Max
-------------+-----------------------------------------------------
         U74 |    2675    .1293458    .335645          0          1
         U75 |    2675    .1345794   .3413376          0          1
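
The correction above recodes each indicator as 1 when earnings in that year are exactly zero. A minimal Python sketch of the same rule follows; it only mirrors `cond(RE74 == 0, 1, 0)`, and the function name and sample values are hypothetical, not taken from the data.

```python
def zero_earnings_indicator(earnings):
    """Return 1 where earnings are exactly zero and 0 otherwise."""
    return [1 if e == 0 else 0 for e in earnings]

# Hypothetical earnings values, for illustration only.
re74_example = [0.0, 12000.0, 0.0, 350.5]
u74_example = zero_earnings_indicator(re74_example)
print(u74_example)
```

Note that the two summary tables are consistent with a simple swap: the mean of the original U74 equals the mean of the corrected U75, and vice versa.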
.
. * Create regressors used as additional controls in regressions below
. gen AGESQ = AGE*AGE
. gen EDUCSQ = EDUC*EDUC
. * DW99 do not define NODEGREE but following gives Table 1 means
. gen NODEGREE = 0
. replace NODEGREE = 1 if EDUC < 12
. gen U74BLACK = U74*BLACK
. gen U74HISP = U74*HISP
.
. sum AGE EDUC NODEGREE BLACK HISP MARR U74 U75 RE74 RE75 RE78 TREAT /*
    Variable |     Obs        Mean   Std. Dev.       Min        Max
-------------+-----------------------------------------------------
         AGE |    2675    34.22579   10.49984         17         55
        EDUC |    2675    11.99439   3.053556          0         17
    NODEGREE |    2675    .3330841   .4714045          0          1
       BLACK |    2675    .2915888   .4545789          0          1
        HISP |    2675    .0343925   .1822693          0          1
-------------+-----------------------------------------------------
        MARR |    2675    .8194393   .3847257          0          1
         U74 |    2675    .1293458    .335645          0          1
         U75 |    2675    .1345794   .3413376          0          1
        RE74 |    2675       18230   13722.25          0     137149
        RE75 |    2675    17850.89   13877.78          0     156653
-------------+-----------------------------------------------------
        RE78 |    2675    20502.38   15632.52          0     121174
       TREAT |    2675    .0691589   .2537716          0          1
       AGESQ |    2675     1281.61   766.8415        289       3025
      EDUCSQ |    2675    153.1862   70.62231          0        289
      RE74SQ |    2675    5.21e+08   8.47e+08          0   1.88e+10
-------------+-----------------------------------------------------
      RE75SQ |    2675    5.11e+08   8.91e+08          0   2.45e+10
    U74BLACK |    2675    .0549533   .2279316          0          1
     U74HISP |    2675    .0056075   .0746868          0          1
.
. bysort TREAT: sum AGE EDUC NODEGREE BLACK HISP MARR U74 U75 RE74 RE75 RE78 TREAT /*

-----------------------------------------------------------------------------------------------------
-> TREAT = 0

    Variable |     Obs        Mean   Std. Dev.       Min        Max
-------------+-----------------------------------------------------
         AGE |    2490     34.8506   10.44076         18         55
        EDUC |    2490    12.11687   3.082435          0         17
    NODEGREE |    2490    .3052209   .4605934          0          1
       BLACK |    2490    .2506024    .433447          0          1
        HISP |    2490    .0325301   .1774389          0          1
-------------+-----------------------------------------------------
        MARR |    2490    .8662651   .3404357          0          1
         U74 |    2490    .0863454   .2809298          0          1
         U75 |    2490          .1   .3000603          0          1
        RE74 |    2490    19428.75   13406.88          0     137149
        RE75 |    2490    19063.34   13596.95          0     156653
-------------+-----------------------------------------------------
        RE78 |    2490    21553.92   15555.35          0     121174
       TREAT |    2490           0          0          0          0
       AGESQ |    2490     1323.53    769.796        324       3025
      EDUCSQ |    2490    156.3161   71.43048          0        289
      RE74SQ |    2490    5.57e+08   8.66e+08          0   1.88e+10
-------------+-----------------------------------------------------
      RE75SQ |    2490    5.48e+08   9.12e+08          0   2.45e+10
    U74BLACK |    2490    .0144578   .1193923          0          1
     U74HISP |    2490    .0036145   .0600237          0          1

-----------------------------------------------------------------------------------------------------
-> TREAT = 1

    Variable |     Obs        Mean   Std. Dev.       Min        Max
-------------+-----------------------------------------------------
         AGE |     185    25.81622   7.155019         17         48
        EDUC |     185    10.34595    2.01065          4         16
    NODEGREE |     185    .7081081   .4558666          0          1
       BLACK |     185    .8432432   .3645579          0          1
        HISP |     185    .0594595   .2371244          0          1
-------------+-----------------------------------------------------
        MARR |     185    .1891892   .3927217          0          1
         U74 |     185    .7081081   .4558666          0          1
         U75 |     185          .6   .4912274          0          1
        RE74 |     185    2095.574   4886.623          0    35040.1
        RE75 |     185    1532.056   3219.251          0    25142.2
-------------+-----------------------------------------------------
        RE78 |     185    6349.145   7867.405          0    60307.9
       TREAT |     185           1          0          1          1
       AGESQ |     185    717.3946   431.2517        289       2304
      EDUCSQ |     185    111.0595   39.30388         16        256
      RE74SQ |     185    2.81e+07   1.14e+08          0   1.23e+09
-------------+-----------------------------------------------------
      RE75SQ |     185    1.27e+07   5.60e+07          0   6.32e+08
    U74BLACK |     185          .6   .4912274          0          1
     U74HISP |     185    .0324324   .1776263          0          1
.
. *** NOTE: The benchmark estimate obtained from NSW experiment is
. ***       $1,794 = Average(RE_78 for NSW treated) - Average(RE_78 for NSW controls)
. ***       See MMA25P3EXTRA.DO
.
. ********** (1) ANALYSIS for DW02 SPECIFICATION OF THE PROPENSITY SCORE **********
.
. * Following defines number of bootstrap replications
. * Table 25.6 used 200 (or 100 in some places)
. global breps 200
.
. * From DW02 Table 3 footnote a the propensity score uses the following regressors
. global XDW02 AGE AGESQ EDUC EDUCSQ MARR NODEGREE BLACK HISP RE74 RE75 RE74SQ U74 U75 U74HISP
.
. **** Table 25.5 p.894 summarizes propensity score
. **** using just those observations with common support
.
. pscore TREAT $XDW02, pscore(myscore) comsup blockid(myblock) numblo(5) level(0.005) logit
****************************************************
Algorithm to estimate the propensity score
****************************************************
The treatment is TREAT

      TREAT |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |      2,490       93.08       93.08
          1 |        185        6.92      100.00
------------+-----------------------------------
      Total |      2,675      100.00

Estimation of the propensity score

Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:
Iteration 6:
Iteration 7:
Iteration 8:
Iteration 9:

Logit estimates                                   Number of obs   =       2675
                                                  LR chi2(14)     =     951.10
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.7070

------------------------------------------------------------------------------
       TREAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         AGE |   .2628422    .120206     2.19   0.029     .0272428    .4984416
       AGESQ |  -.0053794   .0018341    -2.93   0.003    -.0089742   -.0017846
        EDUC |   .7149774   .3418173     2.09   0.036     .0450278    1.384927
      EDUCSQ |  -.0426178   .0179039    -2.38   0.017    -.0777088   -.0075269
        MARR |  -1.780857    .301802    -5.90   0.000    -2.372378   -1.189336
    NODEGREE |   .1891046   .4257533     0.44   0.657    -.6453564    1.023566
       BLACK |   2.519383    .370358     6.80   0.000     1.793495    3.245272
        HISP |   3.087327   .7340486     4.21   0.000     1.648618    4.526036
        RE74 |  -.0000448   .0000425    -1.05   0.292     -.000128    .0000385
        RE75 |  -.0002678   .0000485    -5.52   0.000    -.0003628   -.0001727
      RE74SQ |   1.99e-09   7.75e-10     2.57   0.010     4.72e-10    3.51e-09
         U74 |   3.100056   .5187391     5.98   0.000     2.083346    4.116766
         U75 |  -1.273525   .4644557    -2.74   0.006    -2.183842   -.3632088
     U74HISP |  -1.925803    1.07186    -1.80   0.072     -4.02661    .1750032
       _cons |  -7.407524   2.445692    -3.03   0.002    -12.20099   -2.614056
Note: the common support option has been selected

The region of common support is [.00036433, .98576756]
Description of the estimated propensity score
in region of common support

                 Estimated propensity score
-------------------------------------------------------------
      Percentiles      Smallest
 1%     .0003871       .0003643
 5%     .0004805       .0003669
10%     .0006343       .0003702       Obs                1271
25%     .0016393       .0003714       Sum of Wgt.        1271

50%     .0090427                      Mean           .1447205
                        Largest       Std. Dev.      .2809511
75%     .0897599       .9803043
90%      .656286       .9830988       Variance       .0789335
95%     .9392306       .9855413       Skewness       2.049999
99%     .9640553       .9857676       Kurtosis       5.748631

******************************************************
Step 1: Identification of the optimal number of blocks
Use option detail if you want more detailed output
******************************************************
The final number of blocks is 6

This number of blocks ensures that the mean propensity score
is not different for treated and controls in each blocks
**********************************************************
Step 2: Test of balancing property of the propensity score

**********************************************************
The balancing property is satisfied
This table shows the inferior bound, the number of treated
and the number of controls for each block

  Inferior |
  of block |         TREAT
 of pscore |         0          1 |     Total
-----------+----------------------+----------
  .0003643 |       960          9 |       969
        .1 |        56         10 |        66
        .2 |        33         14 |        47
        .4 |        22         24 |        46
        .6 |         7         33 |        40
        .8 |         8         95 |       103
-----------+----------------------+----------
     Total |     1,086        185 |     1,271

*******************************************
End of the algorithm to estimate the pscore
*******************************************
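
The `comsup` option restricts the sample to a region of common support. The sketch below is a minimal Python illustration under a stated assumption: that the region is the range of estimated scores among the treated, with every observation inside that range retained. This reading is consistent with the log (the reported upper bound .98576756 is the largest retained score, and a single control at .986626 appears only when `comsup` is off), but the exact rule inside pscore.ado is not shown in this log. The function name and sample scores are hypothetical.

```python
def common_support(scores, treated):
    """Return (lower, upper, keep_flags), assuming the common support region
    is the [min, max] range of propensity scores among treated units."""
    treated_scores = [p for p, d in zip(scores, treated) if d == 1]
    lower, upper = min(treated_scores), max(treated_scores)
    keep = [lower <= p <= upper for p in scores]
    return lower, upper, keep

# Hypothetical scores: three controls (0) followed by two treated units (1).
scores = [0.0001, 0.05, 0.99, 0.02, 0.90]
treated = [0, 0, 0, 1, 1]
lower, upper, keep = common_support(scores, treated)
print(lower, upper, keep)
```

Under this rule all treated units are kept by construction, which matches the 185 treated observations in the block table above.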
.
. **** For completeness do same with common support option NOT selected
.
. drop myscore myblock
. pscore TREAT $XDW02, pscore(myscore) blockid(myblock) numblo(5) level(0.005) logit
****************************************************
****************************************************

      TREAT |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |      2,490       93.08       93.08
          1 |        185        6.92      100.00
------------+-----------------------------------
      Total |      2,675      100.00

Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:
Iteration 6:
Iteration 7:
Iteration 8:
Iteration 9:

Logit estimates                                   Number of obs   =       2675
                                                  LR chi2(14)     =     951.10
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.7070

------------------------------------------------------------------------------
       TREAT |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         AGE |   .2628422    .120206     2.19   0.029     .0272428    .4984416
       AGESQ |  -.0053794   .0018341    -2.93   0.003    -.0089742   -.0017846
        EDUC |   .7149774   .3418173     2.09   0.036     .0450278    1.384927
      EDUCSQ |  -.0426178   .0179039    -2.38   0.017    -.0777088   -.0075269
        MARR |  -1.780857    .301802    -5.90   0.000    -2.372378   -1.189336
    NODEGREE |   .1891046   .4257533     0.44   0.657    -.6453564    1.023566
       BLACK |   2.519383    .370358     6.80   0.000     1.793495    3.245272
        HISP |   3.087327   .7340486     4.21   0.000     1.648618    4.526036
        RE74 |  -.0000448   .0000425    -1.05   0.292     -.000128    .0000385
        RE75 |  -.0002678   .0000485    -5.52   0.000    -.0003628   -.0001727
      RE74SQ |   1.99e-09   7.75e-10     2.57   0.010     4.72e-10    3.51e-09
         U74 |   3.100056   .5187391     5.98   0.000     2.083346    4.116766
         U75 |  -1.273525   .4644557    -2.74   0.006    -2.183842   -.3632088
     U74HISP |  -1.925803    1.07186    -1.80   0.072     -4.02661    .1750032
       _cons |  -7.407524   2.445692    -3.03   0.002    -12.20099   -2.614056

-------------------------------------------------------------
      Percentiles      Smallest
 1%     2.36e-09       1.76e-12
 5%     8.39e-08       5.07e-12
10%     4.47e-07       1.14e-11       Obs                2675
25%     .0000107       1.14e-11       Sum of Wgt.        2675

50%     .0002558                      Mean           .0691589
                        Largest       Std. Dev.      .2074207
75%     .0071195       .9830988
90%      .129801       .9855413       Variance       .0430234
95%     .6394923       .9857676       Skewness       3.407447
99%     .9572224        .986626       Kurtosis       13.56404

******************************************************
******************************************************

**********************************************************
**********************************************************
Variable BLACK is not balanced in block 1
The balancing property is not satisfied
Try a different specification of the propensity score
  Inferior |
  of block |         TREAT
 of pscore |         0          1 |     Total
-----------+----------------------+----------
         0 |     2,265          7 |     2,272
       .05 |        98          2 |       100
        .1 |        56         10 |        66
        .2 |        33         14 |        47
        .4 |        22         24 |        46
        .6 |         7         33 |        40
        .8 |         9         95 |       104
-----------+----------------------+----------
     Total |     2,490        185 |     2,675

*******************************************
*******************************************
.
. **** All of the following use common support
.
. ****************************************************************************
. **** Note: The results in the first half of Table 25.6
. ****       erroneously added RE75SQ as a regressor.
. ****       This does not affect Table 25.5 (done correctly) or
. ****       stratification estimates (which used myscore from correct model).
. ****       But it does affect NN, radius and kernel estimates.
. ****       To enable comparison with the text we do analysis here
. ****       both with and without RE75SQ.
. ****       Even dropping RE75SQ the results continue to differ from DW02.
. ****                          Text         Corrected
. ****                          Table 25.6   Table 25.6   DW 2002
. ****       NN                   2385         1286         1202
. ****       Radius = 0.001      -7815        -7808         1187
. ****       Radius = 0.0001     -9333        -6401         1191
. ****       Radius = 0.00001    -2200        -1135         1198
. ****       Stratification       1497         1497
. ****       Kernel               1309         1342
. ****************************************************************************
.
. **** Row 1 Table 25.6: Nearest neighbor matching (random version)
. set seed 10101
. attnd RE78 TREAT $XDW02 RE75SQ, comsup boot reps($breps) dots logit
The program is searching the nearest neighbor of each treated unit.

This operation may take a while.
ATT estimation with Nearest Neighbor Matching method

(random draw version)
Analytical standard errors

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185          53    2385.430    1792.028       1.331
---------------------------------------------------------
Note: the numbers of treated and controls refer to actual
nearest neighbour matches

Bootstrapping of standard errors

command:    attnd RE78 TREAT AGE AGESQ EDUC EDUCSQ MARR NODEGREE BLACK HISP RE74 RE75 RE74SQ U74 U75 U74HISP RE75SQ , pscore() logit comsup
statistic:  attnd = r(attnd)
....................................................................................................
> ..................................................................................................
> ..
Number of obs =      2675
Replications  =       200

------------------------------------------------------------------------------
Variable     |  Reps  Observed       Bias  Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
       attnd |   200   2385.43  -859.5093   1094.969   226.1985   4544.661  (N)
             |                                        -937.0529   3515.425  (P)
             |                                         1202.547   4697.713  (BC)
------------------------------------------------------------------------------
Note: N = normal
      P = percentile
      BC = bias-corrected

Bootstrapped standard errors

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185          53    2385.430    1094.969       2.179
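
To make the nearest neighbor estimate concrete, the following is a minimal Python sketch of one-to-one matching on the propensity score with replacement. It is an illustration only and not the attnd.ado implementation: ties are broken by taking the first control found (attnd's random draw version breaks them at random), no common support restriction is applied, and the toy data are hypothetical.

```python
def att_nearest_neighbor(scores, treated, outcome):
    """ATT by matching each treated unit to the control with the closest
    propensity score (matching with replacement, first control wins ties)."""
    controls = [(p, y) for p, d, y in zip(scores, treated, outcome) if d == 0]
    diffs = []
    for p, d, y in zip(scores, treated, outcome):
        if d == 1:
            _, y_match = min(controls, key=lambda c: abs(c[0] - p))
            diffs.append(y - y_match)
    return sum(diffs) / len(diffs)

# Hypothetical toy data: three controls followed by two treated units.
scores = [0.10, 0.30, 0.70, 0.25, 0.80]
treated = [0, 0, 0, 1, 1]
outcome = [10, 20, 30, 25, 45]
att_nn = att_nearest_neighbor(scores, treated, outcome)
print(att_nn)
```

In the toy data only two of the three controls are ever used as matches, which is the same reason the log reports far fewer matched controls (53) than treated units (185).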

. set seed 10101
634
. attnd RE78 TREAT $XDW02, comsup boot reps($breps) dots logit


---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185          60    1285.782    3895.044       0.330

command:    attnd RE78 TREAT AGE AGESQ EDUC EDUCSQ MARR NODEGREE BLACK HISP RE74 RE75 RE74SQ U74 U75 U74HISP , pscore() logit comsup
statistic:  attnd = r(attnd)
....................................................................................................
> ..................................................................................................
> ..
Number of obs =      2675
Replications  =       200

------------------------------------------------------------------------------
Variable     |  Reps  Observed       Bias  Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
       attnd |   200  1285.782    319.006   1275.405  -1229.261   3800.825  (N)
             |                                        -1128.466   3835.567  (P)
             |                                        -2181.243   3294.797  (BC)
------------------------------------------------------------------------------
Note: N = normal
      P = percentile
      BC = bias-corrected

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185          60    1285.782    1275.405       1.008
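
The bootstrapped t statistic and the (N) interval above can be reproduced from the point estimate and the bootstrap standard error. The Python sketch below is a hedged check rather than the routine Stata runs: it assumes the normal-approximation interval uses a t critical value with 199 degrees of freedom (replications minus one), hard-coded here as approximately 1.97196. That assumption reproduces the logged bounds to within rounding.

```python
att = 1285.782        # observed ATT from the table above
boot_se = 1275.405    # bootstrap standard error from the table above
t_crit = 1.97196      # assumed: approx. 97.5th percentile of t with 199 df

t_stat = att / boot_se
ci_normal = (att - t_crit * boot_se, att + t_crit * boot_se)

print(round(t_stat, 3))
print(tuple(round(b, 2) for b in ci_normal))
```

The percentile (P) and bias-corrected (BC) intervals need the full set of bootstrap replicates and cannot be rebuilt from these two numbers.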

.
. **** Row 2 Table 25.6: Radius matching for Radius=0.001
. set seed 10101
. attr RE78 TREAT $XDW02 RE75SQ, comsup boot reps($breps) dots logit radius(0.001)
The program is searching for matches of treated units within radius.

ATT estimation with the Radius Matching method

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
       54         517   -7815.382    1118.181      -6.989
---------------------------------------------------------
Note: the numbers of treated and controls refer to actual
matches within radius

command:    attr RE78 TREAT AGE AGESQ EDUC EDUCSQ MARR NODEGREE BLACK HISP RE74 RE75 RE74SQ U74 U75 U74HISP RE75SQ , pscore() logit comsup radius(.001)
statistic:  attr = r(attr)
....................................................................................................
> ..................................................................................................
> ..
Number of obs =      2675
Replications  =       200

------------------------------------------------------------------------------
Variable     |  Reps  Observed       Bias  Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
        attr |   200 -7815.381   1345.983   3794.466   -15297.9  -332.8595  (N)
             |                                        -18163.96   936.3913  (P)
             |                                        -21184.98  -2839.753  (BC)
------------------------------------------------------------------------------
Note: N = normal
      P = percentile
      BC = bias-corrected

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
       54         517   -7815.381    3794.466      -2.060
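
Radius matching uses only treated units that have at least one control within the chosen radius, which is why just 54 of the 185 treated appear above. The following minimal Python sketch illustrates the idea under stated assumptions: each matched treated unit is compared with the simple average outcome of all controls within the radius, and unmatched treated units are dropped. It is not the attr.ado code, and the toy data are hypothetical.

```python
def att_radius(scores, treated, outcome, radius):
    """Return (ATT, number of matched treated units) for radius matching.
    Treated units with no control within `radius` are dropped."""
    controls = [(p, y) for p, d, y in zip(scores, treated, outcome) if d == 0]
    diffs = []
    for p, d, y in zip(scores, treated, outcome):
        if d != 1:
            continue
        matched = [yc for pc, yc in controls if abs(pc - p) <= radius]
        if matched:
            diffs.append(y - sum(matched) / len(matched))
    if not diffs:
        raise ValueError("no treated unit has a control within the radius")
    return sum(diffs) / len(diffs), len(diffs)

# Hypothetical toy data: three controls followed by two treated units.
scores = [0.10, 0.12, 0.50, 0.11, 0.90]
treated = [0, 0, 0, 1, 1]
outcome = [10, 14, 40, 20, 99]
att_r, n_matched = att_radius(scores, treated, outcome, radius=0.05)
print(att_r, n_matched)
```

A smaller radius discards more treated units, so the estimate can change sharply as the radius shrinks, as the rows for radius 0.001, 0.0001 and 0.00001 in the comparison table show.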

. set seed 10101
. attr RE78 TREAT $XDW02, comsup boot reps($breps) dots logit radius(0.001)


---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
       51         541   -7808.241    1146.418      -6.811

command:    attr RE78 TREAT AGE AGESQ EDUC EDUCSQ MARR NODEGREE BLACK HISP RE74 RE75 RE74SQ U74 U75 U74HISP , pscore() logit comsup radius(.001)
statistic:  attr = r(attr)
....................................................................................................
> ..................................................................................................
> ..
Number of obs =      2675
Replications  =       200

------------------------------------------------------------------------------
Variable     |  Reps  Observed       Bias  Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
        attr |   200 -7808.242   1022.016   3770.093   -15242.7  -373.7819  (N)
             |                                        -16697.45   1438.308  (P)
             |                                        -18942.21  -1204.325  (BC)
------------------------------------------------------------------------------
Note: N = normal
      P = percentile
      BC = bias-corrected

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
       51         541   -7808.242    3770.093      -2.071

.

. set seed 10101


---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
       24          92   -9333.120    2285.624      -4.083

command:
statistic:  attr = r(attr)
....................................................................................................
> ..................................................................................................
> ..
Number of obs =      2675
Replications  =       200

------------------------------------------------------------------------------
Variable     |  Reps  Observed       Bias  Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
        attr |   200  -9333.12   4076.044    5211.11   -19609.2   942.9621  (N)
             |                                        -19094.04   4604.865  (P)
             |                                        -22414.52  -4341.134  (BC)
------------------------------------------------------------------------------
Note: N = normal
      P = percentile
      BC = bias-corrected

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
       24          92   -9333.120    5211.110      -1.791

. set seed 10101


---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
       27          91   -6401.345    2054.218      -3.116

command:
statistic:  attr = r(attr)
....................................................................................................
> ..................................................................................................
> ..
Number of obs =      2675
Replications  =       200

------------------------------------------------------------------------------
Variable     |  Reps  Observed       Bias  Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
        attr |   200 -6401.345   310.4673    5618.88  -17481.53   4678.842  (N)
             |                                        -18778.71   4636.073  (P)
             |                                        -21404.97   3740.767  (BC)
------------------------------------------------------------------------------
Note: N = normal
      P = percentile
      BC = bias-corrected

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
       27          91   -6401.345    5618.880      -1.139

.
. set seed 10101


---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
       15          19   -2200.022    2986.211      -0.737

command:
statistic:  attr = r(attr)
....................................................................................................
> ..................................................................................................
> ..
Number of obs =      2675
Replications  =       200

------------------------------------------------------------------------------
Variable     |  Reps  Observed       Bias  Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
        attr |   200 -2200.022   626.9762    7009.51  -16022.47   11622.43  (N)
             |                                        -24355.12   8831.196  (P)
             |                                         -31741.1   4217.228  (BC)
------------------------------------------------------------------------------
Note: N = normal
      P = percentile
      BC = bias-corrected

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
       15          19   -2200.022    7009.510      -0.314

. set seed 10101



---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
       16          17   -1135.184    3189.367      -0.356

command:
statistic:  attr = r(attr)
....................................................................................................
> ..................................................................................................
> ..
Number of obs =      2675
Replications  =       200

------------------------------------------------------------------------------
Variable     |  Reps  Observed       Bias  Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
        attr |   199 -1135.184   -2079.93   7030.204  -14998.87    12728.5  (N)
             |                                         -23808.6     8048.6  (P)
             |                                        -16939.85   9102.585  (BC)
------------------------------------------------------------------------------
Note: N = normal
      P = percentile
      BC = bias-corrected

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
       16          17   -1135.184    7030.204      -0.161

.
. **** Row 5 Table 25.6: Stratification Matching
. set seed 10101
. atts RE78 TREAT, pscore(myscore) blockid(myblock) comsup boot reps($breps) dots
ATT estimation with the Stratification method

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185        1086    1497.484     920.688       1.626
---------------------------------------------------------

command:    atts RE78 TREAT , pscore(myscore) blockid(myblock) comsup
statistic:  atts = r(atts)
....................................................................................................
> ..................................................................................................
> ..
Number of obs =      2675
Replications  =       200

------------------------------------------------------------------------------
Variable     |  Reps  Observed       Bias  Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
        atts |   200  1497.484   91.22797    913.129  -303.1669   3298.134  (N)
             |                                        -16.69353    3509.36  (P)
             |                                        -64.37524   3306.115  (BC)
------------------------------------------------------------------------------
Note: N = normal
      P = percentile
      BC = bias-corrected

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185        1086    1497.484     913.129       1.640
---------------------------------------------------------
.
. **** Row 6 Table 25.6: Kernel Matching
. set seed 10101
. attk RE78 TREAT $XDW02 RE75SQ, comsup boot reps($breps) dots logit
The program is searching for matches of each treated unit.

ATT estimation with the Kernel Matching method

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185        1058    1309.217
---------------------------------------------------------
Note: Analytical standard errors cannot be computed. Use
the bootstrap option to get bootstrapped standard errors.

command:    attk RE78 TREAT AGE AGESQ EDUC EDUCSQ MARR NODEGREE BLACK HISP RE74 RE75 RE74SQ U74 U75 U74HISP RE75SQ , pscore() logit comsup bwidth(.06)
statistic:  attk = r(attk)
....................................................................................................
> ..................................................................................................
> ..
Number of obs =      2675
Replications  =       200

------------------------------------------------------------------------------
Variable     |  Reps  Observed       Bias  Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
        attk |   200  1309.217   45.93746   958.1801  -580.2722   3198.707  (N)
             |                                        -412.7856   3416.999  (P)
             |                                        -374.4567   3450.043  (BC)
------------------------------------------------------------------------------
Note: N = normal
      P = percentile
      BC = bias-corrected

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185        1058    1309.217     958.180       1.366
---------------------------------------------------------
. set seed 10101

. attk RE78 TREAT $XDW02, comsup boot reps($breps) dots logit

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185        1086    1342.016


command:
> 5 U74HISP , pscore() logit comsup bwidth(.06)
statistic:    attk       = r(attk)
....................................................................................................
> ..................................................................................................
> ..
Number of obs    =      2675
Replications     =       200

------------------------------------------------------------------------------
Variable     |  Reps  Observed      Bias Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
        attk |   200  1342.016  61.94744  933.8668  -499.5284   3183.561   (N)
             |                                      -378.5027   3354.131   (P)
             |                                      -405.7551   3349.118  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185        1086    1342.016     933.867       1.437
---------------------------------------------------------
.
. ********** (2) ANALYSIS for DW99 SPECIFICATION OF THE PROPENSITY SCORE **********
.
. * From DW99 Table 3 footnote e the propensity score uses the following regressors
. global XDW99 AGE AGESQ EDUC EDUCSQ MARR NODEGREE BLACK HISP RE74 RE75 RE74SQ RE75SQ U74BLACK
.
. * Note that CT Table 25.6 footnote b erroneously lists RE74*RE75 as regressor
. * but this program (correctly) did not include RE74*RE75
.
. **** Propensity score with just those observations with common support
.
. pscore TREAT $XDW99, pscore(myscore) comsup blockid(myblock) numblo($breps) level(0.005) logit
****************************************************
****************************************************

      TREAT |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |      2,490       93.08       93.08
          1 |        185        6.92      100.00
------------+-----------------------------------
      Total |      2,675      100.00

Logit estimates                                   Number of obs   =       2675
                                                  LR chi2(13)     =     935.44
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.6953

------------------------------------------------------------------------------
       TREAT |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         AGE |   .3305734   .1203353     2.75   0.006     .0947206    .5664262
       AGESQ |  -.0063429   .0018561    -3.42   0.001    -.0099808   -.0027049
        EDUC |   .8247711   .3534216     2.33   0.020     .1320775    1.517465
      EDUCSQ |  -.0483153   .0186057    -2.60   0.009    -.0847819   -.0118488
        MARR |  -1.884062   .2994614    -6.29   0.000    -2.470996   -1.297129
    NODEGREE |   .1299868   .4284278     0.30   0.762    -.7097163      .96969
       BLACK |   1.132961    .352088     3.22   0.001     .4428814    1.823041
        HISP |   1.962762   .5673735     3.46   0.001     .8507302    3.074793
        RE74 |  -.0001047   .0000355    -2.95   0.003    -.0001743   -.0000351
        RE75 |  -.0002172   .0000415    -5.23   0.000    -.0002986   -.0001357
      RE74SQ |   2.36e-09   6.57e-10     3.59   0.000     1.07e-09    3.65e-09
      RE75SQ |   1.58e-10   6.68e-10     0.24   0.813    -1.15e-09    1.47e-09
    U74BLACK |   2.137042   .4273667     5.00   0.000     1.299419    2.974665
       _cons |  -7.552458   2.451721    -3.08   0.002    -12.35774   -2.747173


-------------------------------------------------------------
      Percentiles      Smallest
 1%     .0006813       .0006526
 5%     .0008363       .0006581
10%     .0011416       .0006593       Obs                1331
25%     .0024351       .0006598       Sum of Wgt.        1331

50%     .0111854                      Mean           .1388772
                        Largest       Std. Dev.       .275571
75%     .0779976       .9744237
90%     .6200607       .9747552       Variance       .0759394
95%     .9494181       .9747918       Skewness        2.17177
99%      .970738       .9748754       Kurtosis       6.296349
******************************************************
******************************************************

**********************************************************
**********************************************************

  Inferior |
  of block |         TREAT
 of pscore |         0          1 |     Total
-----------+----------------------+----------
  .0006526 |       501          2 |       503
      .005 |       143          3 |       146
       .01 |        78          0 |        78
      .015 |        42          0 |        42
       .02 |        38          0 |        38
      .025 |        29          1 |        30
       .03 |        22          0 |        22
      .035 |        23          0 |        23
       .04 |        22          0 |        22
      .045 |        17          1 |        18
       .05 |        23          0 |        23
      .055 |        13          1 |        14
       .06 |        12          0 |        12
      .065 |         9          0 |         9
       .07 |        11          1 |        12
      .075 |         9          1 |        10
       .08 |         6          0 |         6
      .085 |         6          0 |         6
       .09 |         8          1 |         9
      .095 |         6          0 |         6
        .1 |         9          0 |         9
      .105 |         4          0 |         4
       .11 |         8          0 |         8
      .115 |         3          0 |         3
       .12 |         1          0 |         1
      .125 |         2          3 |         5
       .13 |         6          1 |         7
      .135 |         1          0 |         1
       .14 |         1          1 |         2
      .145 |         1          0 |         1
       .15 |         2          0 |         2
      .155 |         4          0 |         4
       .16 |         3          0 |         3
      .165 |         2          0 |         2
      .175 |         1          0 |         1
       .18 |         0          1 |         1
      .185 |         1          0 |         1
       .19 |         2          0 |         2
      .195 |         2          1 |         3
        .2 |         1          0 |         1
      .205 |         1          0 |         1
      .215 |         5          0 |         5
      .225 |         2          1 |         3
       .23 |         2          1 |         3
      .235 |         2          3 |         5
       .24 |         2          0 |         2
      .245 |         0          1 |         1
       .25 |         0          2 |         2
       .26 |         1          1 |         2
      .265 |         1          0 |         1
       .27 |         1          0 |         1
       .28 |         1          0 |         1
      .285 |         1          0 |         1
       .29 |         2          1 |         3
      .295 |         2          1 |         3
        .3 |         2          0 |         2
      .305 |         0          1 |         1
      .315 |         1          0 |         1
       .32 |         0          1 |         1
      .325 |         2          1 |         3
       .33 |         1          0 |         1
      .335 |         0          1 |         1
       .34 |         1          1 |         2
      .345 |         1          2 |         3
       .35 |         2          0 |         2
      .355 |         0          1 |         1
      .365 |         1          0 |         1
       .37 |         2          0 |         2
      .375 |         2          2 |         4
       .38 |         1          2 |         3
      .385 |         1          4 |         5
        .4 |         0          1 |         1
      .405 |         0          2 |         2
       .42 |         0          1 |         1
      .425 |         1          0 |         1
       .45 |         2          0 |         2
       .47 |         1          0 |         1
       .48 |         1          1 |         2
      .485 |         2          0 |         2
      .495 |         1          0 |         1
        .5 |         0          2 |         2
       .51 |         0          2 |         2
      .515 |         2          1 |         3
      .525 |         0          1 |         1
       .53 |         0          2 |         2
      .535 |         0          1 |         1
       .54 |         1          0 |         1
      .555 |         0          1 |         1
       .56 |         1          1 |         2
      .565 |         1          0 |         1
       .57 |         0          1 |         1
      .575 |         1          1 |         2
       .59 |         0          1 |         1
      .595 |         0          1 |         1
        .6 |         0          1 |         1
      .605 |         0          1 |         1
       .61 |         1          2 |         3
      .615 |         0          1 |         1
       .62 |         0          1 |         1
      .625 |         0          1 |         1
      .635 |         1          2 |         3
       .64 |         1          1 |         2
      .645 |         2          0 |         2
      .665 |         0          1 |         1
       .67 |         1          0 |         1
      .675 |         0          3 |         3
       .68 |         1          0 |         1
       .69 |         1          0 |         1
       .71 |         1          1 |         2
      .735 |         0          1 |         1
       .74 |         1          0 |         1
      .745 |         2          0 |         2
      .765 |         1          1 |         2
       .79 |         0          4 |         4
      .795 |         0          1 |         1
        .8 |         0          1 |         1
      .805 |         0          2 |         2
      .815 |         0          3 |         3
      .825 |         0          1 |         1
       .84 |         0          1 |         1
      .845 |         0          1 |         1
       .85 |         0          1 |         1
       .86 |         0          1 |         1
      .865 |         0          1 |         1
      .895 |         0          1 |         1
        .9 |         0          2 |         2
      .905 |         0          2 |         2
      .915 |         0          1 |         1
       .92 |         0          1 |         1
      .925 |         0          7 |         7
       .93 |         0          2 |         2
      .935 |         0          1 |         1
       .94 |         0          3 |         3
      .945 |         1          6 |         7
       .95 |         1         14 |        15
      .955 |         0         16 |        16
       .96 |         1          5 |         6
      .965 |         3         12 |        15
       .97 |         1         13 |        14
-----------+----------------------+----------
     Total |     1,146        185 |     1,331
*******************************************
*******************************************
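
The drop from 2,675 to 1,331 observations under the comsup option can be illustrated with a short sketch. It is written in Python rather than Stata and assumes that the common support region is the interval between the smallest and largest estimated propensity score among the treated; that rule, the helper name common_support and the toy scores are illustrative assumptions, not output of pscore.ado.

```python
def common_support(scores, treated):
    """Flag observations whose score lies within the treated units' score range.

    Hypothetical helper: assumes common support = [min, max] of treated scores.
    """
    treated_scores = [s for s, d in zip(scores, treated) if d == 1]
    lo, hi = min(treated_scores), max(treated_scores)
    return [lo <= s <= hi for s in scores]


# Toy data, not taken from the log
scores = [0.0001, 0.002, 0.30, 0.95, 0.0007, 0.97]
treated = [0, 0, 0, 0, 1, 1]

keep = common_support(scores, treated)
print(sum(keep), "of", len(keep), "observations kept")
```

Under this rule every treated unit is retained and only controls with scores outside the treated range are dropped, which is consistent with the Total row above still showing 185 treated.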
.
. **** For completeness do same with common support option NOT selected
.
. pscore TREAT $XDW99, pscore(myscore) blockid(myblock) numblo($breps) level(0.005) logit
****************************************************
****************************************************

      TREAT |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |      2,490       93.08       93.08
          1 |        185        6.92      100.00
------------+-----------------------------------
      Total |      2,675      100.00

Logit estimates                                   Number of obs   =       2675
                                                  LR chi2(13)     =     935.44
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.6953

------------------------------------------------------------------------------
       TREAT |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         AGE |   .3305734   .1203353     2.75   0.006     .0947206    .5664262
       AGESQ |  -.0063429   .0018561    -3.42   0.001    -.0099808   -.0027049
        EDUC |   .8247711   .3534216     2.33   0.020     .1320775    1.517465
      EDUCSQ |  -.0483153   .0186057    -2.60   0.009    -.0847819   -.0118488
        MARR |  -1.884062   .2994614    -6.29   0.000    -2.470996   -1.297129
    NODEGREE |   .1299868   .4284278     0.30   0.762    -.7097163      .96969
       BLACK |   1.132961    .352088     3.22   0.001     .4428814    1.823041
        HISP |   1.962762   .5673735     3.46   0.001     .8507302    3.074793
        RE74 |  -.0001047   .0000355    -2.95   0.003    -.0001743   -.0000351
        RE75 |  -.0002172   .0000415    -5.23   0.000    -.0002986   -.0001357
      RE74SQ |   2.36e-09   6.57e-10     3.59   0.000     1.07e-09    3.65e-09
      RE75SQ |   1.58e-10   6.68e-10     0.24   0.813    -1.15e-09    1.47e-09
    U74BLACK |   2.137042   .4273667     5.00   0.000     1.299419    2.974665
       _cons |  -7.552458   2.451721    -3.08   0.002    -12.35774   -2.747173

-------------------------------------------------------------
      Percentiles      Smallest
 1%     2.84e-08       4.49e-11
 5%     4.47e-07       4.88e-10
10%     2.07e-06       4.88e-10       Obs                2675
25%      .000034       4.95e-10       Sum of Wgt.        2675

50%     .0006388                      Mean           .0691589
                        Largest       Std. Dev.      .2063646
75%      .010941       .9744237
90%     .1336877       .9747552       Variance       .0425863
95%     .6200607       .9747918       Skewness       3.471137
99%     .9651648       .9748754       Kurtosis       14.05057
******************************************************
******************************************************

**********************************************************
**********************************************************
Variable BLACK is not balanced in block 1
  Inferior |
  of block |         TREAT
 of pscore |         0          1 |     Total
-----------+----------------------+----------
         0 |     1,845          2 |     1,847
      .005 |       143          3 |       146
       .01 |        78          0 |        78
      .015 |        42          0 |        42
       .02 |        38          0 |        38
      .025 |        29          1 |        30
       .03 |        22          0 |        22
      .035 |        23          0 |        23
       .04 |        22          0 |        22
      .045 |        17          1 |        18
       .05 |        23          0 |        23
      .055 |        13          1 |        14
       .06 |        12          0 |        12
      .065 |         9          0 |         9
       .07 |        11          1 |        12
      .075 |         9          1 |        10
       .08 |         6          0 |         6
      .085 |         6          0 |         6
       .09 |         8          1 |         9
      .095 |         6          0 |         6
        .1 |         9          0 |         9
      .105 |         4          0 |         4
       .11 |         8          0 |         8
      .115 |         3          0 |         3
       .12 |         1          0 |         1
      .125 |         2          3 |         5
       .13 |         6          1 |         7
      .135 |         1          0 |         1
       .14 |         1          1 |         2
      .145 |         1          0 |         1
       .15 |         2          0 |         2
      .155 |         4          0 |         4
       .16 |         3          0 |         3
      .165 |         2          0 |         2
      .175 |         1          0 |         1
       .18 |         0          1 |         1
      .185 |         1          0 |         1
       .19 |         2          0 |         2
      .195 |         2          1 |         3
        .2 |         1          0 |         1
      .205 |         1          0 |         1
      .215 |         5          0 |         5
      .225 |         2          1 |         3
       .23 |         2          1 |         3
      .235 |         2          3 |         5
       .24 |         2          0 |         2
      .245 |         0          1 |         1
       .25 |         0          2 |         2
       .26 |         1          1 |         2
      .265 |         1          0 |         1
       .27 |         1          0 |         1
       .28 |         1          0 |         1
      .285 |         1          0 |         1
       .29 |         2          1 |         3
      .295 |         2          1 |         3
        .3 |         2          0 |         2
      .305 |         0          1 |         1
      .315 |         1          0 |         1
       .32 |         0          1 |         1
      .325 |         2          1 |         3
       .33 |         1          0 |         1
      .335 |         0          1 |         1
       .34 |         1          1 |         2
      .345 |         1          2 |         3
       .35 |         2          0 |         2
      .355 |         0          1 |         1
      .365 |         1          0 |         1
       .37 |         2          0 |         2
      .375 |         2          2 |         4
       .38 |         1          2 |         3
      .385 |         1          4 |         5
        .4 |         0          1 |         1
      .405 |         0          2 |         2
       .42 |         0          1 |         1
      .425 |         1          0 |         1
       .45 |         2          0 |         2
       .47 |         1          0 |         1
       .48 |         1          1 |         2
      .485 |         2          0 |         2
      .495 |         1          0 |         1
        .5 |         0          2 |         2
       .51 |         0          2 |         2
      .515 |         2          1 |         3
      .525 |         0          1 |         1
       .53 |         0          2 |         2
      .535 |         0          1 |         1
       .54 |         1          0 |         1
      .555 |         0          1 |         1
       .56 |         1          1 |         2
      .565 |         1          0 |         1
       .57 |         0          1 |         1
      .575 |         1          1 |         2
       .59 |         0          1 |         1
      .595 |         0          1 |         1
        .6 |         0          1 |         1
      .605 |         0          1 |         1
       .61 |         1          2 |         3
      .615 |         0          1 |         1
       .62 |         0          1 |         1
      .625 |         0          1 |         1
      .635 |         1          2 |         3
       .64 |         1          1 |         2
      .645 |         2          0 |         2
      .665 |         0          1 |         1
       .67 |         1          0 |         1
      .675 |         0          3 |         3
       .68 |         1          0 |         1
       .69 |         1          0 |         1
       .71 |         1          1 |         2
      .735 |         0          1 |         1
       .74 |         1          0 |         1
      .745 |         2          0 |         2
      .765 |         1          1 |         2
       .79 |         0          4 |         4
      .795 |         0          1 |         1
        .8 |         0          1 |         1
      .805 |         0          2 |         2
      .815 |         0          3 |         3
      .825 |         0          1 |         1
       .84 |         0          1 |         1
      .845 |         0          1 |         1
       .85 |         0          1 |         1
       .86 |         0          1 |         1
      .865 |         0          1 |         1
      .895 |         0          1 |         1
        .9 |         0          2 |         2
      .905 |         0          2 |         2
      .915 |         0          1 |         1
       .92 |         0          1 |         1
      .925 |         0          7 |         7
       .93 |         0          2 |         2
      .935 |         0          1 |         1
       .94 |         0          3 |         3
      .945 |         1          6 |         7
       .95 |         1         14 |        15
      .955 |         0         16 |        16
       .96 |         1          5 |         6
      .965 |         3         12 |        15
       .97 |         1         13 |        14
-----------+----------------------+----------
     Total |     2,490        185 |     2,675
*******************************************
*******************************************
.
. **** All of the following use common support
.
. **** Row 7 Table 25.6: Nearest neighbor matching (random version)
. set seed 10101
. attnd RE78 TREAT $XDW99, comsup boot reps($breps) dots logit


---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185          57     560.287    2205.663       0.254


command:
HISP RE74 RE75 RE74SQ RE75S
> Q U74BLACK , pscore() logit comsup
statistic:    attnd      = r(attnd)
....................................................................................................
> ..................................................................................................
> ..
Number of obs    =      2675
Replications     =       200

------------------------------------------------------------------------------
Variable     |  Reps  Observed      Bias Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
       attnd |   200  560.2872   1104.87  1331.294  -2064.967   3185.542   (N)
             |                                      -785.5272   4190.844   (P)
             |                                      -2615.809   2016.239  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185          57     560.287    1331.294       0.421
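
As a check on how the two tables above fit together, the following Python sketch (not part of the Stata program) recomputes the t ratio as the ATT divided by the bootstrap standard error, and backs out the multiplier implied by the normal-based (N) interval. The variable names are illustrative. Reading the multiplier of about 1.972 as a t critical value with 199 degrees of freedom is an inference from the numbers, not something the log states.

```python
# Values copied from the bootstrap table for attnd above
att = 560.2872               # "Observed" column
boot_se = 1331.294           # bootstrap Std. Err.
ci_upper_normal = 3185.542   # upper end of the (N) interval

# t ratio reported in the final ATT table
t_stat = att / boot_se

# Multiplier such that att + multiplier * boot_se reproduces the (N) upper bound
implied_multiplier = (ci_upper_normal - att) / boot_se

print(round(t_stat, 3))
print(round(implied_multiplier, 3))
```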

.
. set seed 10101


---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
       57         583   -9358.228     997.561      -9.381


command:
HISP RE74 RE75 RE74SQ RE75SQ
> U74BLACK , pscore() logit comsup radius(.001)
statistic:    attr       = r(attr)
....................................................................................................
> ..................................................................................................
> ..
Number of obs    =      2675
Replications     =       200

------------------------------------------------------------------------------
Variable     |  Reps  Observed      Bias Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
        attr |   200 -9358.228  2589.204  3079.824  -15431.51  -3284.949   (N)
             |                                      -11328.39   901.8873   (P)
             |                                      -13053.95  -6956.288  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
       57         583   -9358.228    3079.824      -3.039

.
. set seed 10101


---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
       27          76   -7847.460    2066.697      -3.797


command:
statistic:    attr       = r(attr)
....................................................................................................
> ..................................................................................................
> ..
Number of obs    =      2675
Replications     =       200

------------------------------------------------------------------------------
Variable     |  Reps  Observed      Bias Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
        attr |   200  -7847.46  2920.804  4850.874  -17413.17   1718.251   (N)
             |                                      -13423.91   5223.634   (P)
             |                                      -15432.32   632.0693  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
       27          76   -7847.460    4850.874      -1.618

.
. set seed 10101


---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
       16          13     223.468    4551.850       0.049


command:
statistic:    attr       = r(attr)
....................................................................................................
> ..................................................................................................
> ..
Number of obs    =      2675
Replications     =       200

------------------------------------------------------------------------------
Variable     |  Reps  Observed      Bias Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
        attr |   199  223.4685 -1272.487  5608.927  -10837.43   11284.37   (N)
             |                                      -14600.21   8548.427   (P)
             |                                      -10778.17   11039.05  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
       16          13     223.468    5608.927       0.040

.
. **** Row 11 Table 25.6: Stratification Matching
. set seed 10101
. atts RE78 TREAT, pscore(myscore) blockid(myblock) comsup boot reps($breps) dots

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
       98        1233    1322.160
---------------------------------------------------------

command:      atts RE78 TREAT , pscore(myscore) blockid(myblock) comsup
statistic:    atts       = r(atts)
....................................................................................................
> ..................................................................................................
> ..
Number of obs    =      2675
Replications     =       200

------------------------------------------------------------------------------
Variable     |  Reps  Observed      Bias Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
        atts |   200   1322.16  -51.6285  1276.237  -1194.524   3838.844   (N)
             |                                      -1515.399   3960.787   (P)
             |                                      -1383.034   4034.298  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
       98        1233    1322.160    1276.237       1.036
---------------------------------------------------------
.
. **** Row 12 Table 25.6: Kernel Matching
. * pscore TREAT $XDW99, pscore(myscore) comsup blockid(myblock) numblo($breps) level(0.005) logit
. set seed 10101
. attk RE78 TREAT $XDW99, comsup boot reps($breps) dots logit


---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185        1146    1518.694

command:
> U74BLACK , pscore() logit comsup bwidth(.06)
statistic:    attk       = r(attk)
....................................................................................................
> ..................................................................................................
> ..
Number of obs    =      2675
Replications     =       200

------------------------------------------------------------------------------
Variable     |  Reps  Observed      Bias Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
        attk |   200  1518.694  130.8493  808.3386  -75.31444   3112.703   (N)
             |                                       212.6286   3165.292   (P)
             |                                       96.05106   2991.407  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185        1146    1518.694     808.339       1.879
---------------------------------------------------------
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section6\mma25p2matching.txt
log type: text
closed on: 26 May 2005, 11:15:53
------------------------------------------------------------------------------
log: c:\Imbook\bwebpage\Section6\mma25p3extra.txt
log type: text
opened on: 26 May 2005, 11:33:04
.
. ********** OVERVIEW OF MMA25P3EXTRA.DO **********
.
. * STATA Program
.
. * This program provides additional analysis and data not in the book
. * (1) Compare NSW experiment treated to NSW experiment controls
. * (2) Compare NSW experiment treated to CPS "controls"
. * [Same as text except "controls" are from CPS not PSID]
.
. * The program is based on
. * MMA25P2MATCHING.DO propensity score matching
.
. * To run this program you need STATA data files
. * nswre74_treated.dta NSW Treated sample
. * nswre74_control.dta NSW Control sample (not analyzed earlier)
. * propensity_cps.dta CPS Control sample (rather than PSID)
.
. * To run this program you need the Stata add-ons
. * pscore.ado, atts.ado, attr.ado, attnd.ado, attnw.ado
. * due to Sascha O. Becker and Andrea Ichino (2002)
. * "Estimation of average treatment effects based on propensity scores",
. * The Stata Journal, Vol.2, No.4, pp. 358-377.
.
. * This program uses version 2.02 May 13 2005 for Stata version 8
. * We earlier used version 1.29 October 8 2002 for Stata version 7
. * and obtained the same results
.
. * To speed up the program reduce breps: the number of bootstrap
. * replications used to obtain bootstrap standard errors
. * Bootstrap se's will differ from text as here seed is set to 10101
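
The role of breps and the seed in the comments above can be sketched as follows. This is a generic nonparametric bootstrap in Python, not the procedure inside the att*.ado programs: the helper bootstrap_se, the toy data and the use of the sample mean as the estimator are assumptions made for illustration, and Python's random number generator differs from Stata's, so the numbers will not match any in this log.

```python
import random
import statistics


def bootstrap_se(values, estimator, reps=200, seed=10101):
    """Standard deviation of an estimator over `reps` resamples drawn with replacement."""
    rng = random.Random(seed)   # a fixed seed makes the draws reproducible
    n = len(values)
    draws = []
    for _ in range(reps):
        sample = [values[rng.randrange(n)] for _ in range(n)]
        draws.append(estimator(sample))
    return statistics.stdev(draws)


# Toy data, not from the NSW/PSID/CPS files
data = [float(x) for x in range(1, 51)]

se_a = bootstrap_se(data, statistics.mean)               # seed 10101
se_b = bootstrap_se(data, statistics.mean)               # same seed, same result
se_c = bootstrap_se(data, statistics.mean, seed=12345)   # a different seed gives different draws

print(se_a == se_b)
```

Fewer replications run faster but give a noisier standard error; a different seed changes the resamples, which is why bootstrap standard errors need not reproduce those in the text.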
.
. ********** STATA SETUP **********
.
. set more off
. version 8
.
.

. * or DW02
.
. * nswre74_treated.dta N=185 NSW Treated sample only
. * nswre74_control.dta N=260 NSW Control sample only
. * propensity_cps.dta N=16177 NSW Treated + CPS Control sample (Full CPS or CPS-1)
.
. ********** (1) ANALYSIS: NSW TREATED VERSUS NSW CONTROLS **********
.
. * Read in NSW treated and control and combine
. use nswre74_treated.dta, clear
. append using nswre74_control.dta
.
. ** Summarize these data
. sum
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       treat |       445    .4157303     .4934022         0          1
         age |       445    25.37079     7.100282        17         55
         edu |       445    10.19551     1.792119         3         16
       black |       445    .8337079     .3727617         0          1
        hisp |       445    .0876404     .2830895         0          1
-------------+--------------------------------------------------------
     married |       445    .1685393     .3747658         0          1
    nodegree |       445    .7820225     .4133367         0          1
        re74 |       445    2102.265     5363.582         0   39570.68
        re75 |       445    1377.138     3150.961         0   25142.24
        re78 |       445    5300.764     6631.492         0   60307.93
-------------+--------------------------------------------------------
         u74 |       445    .2674157     .4431092         0          1
         u75 |       445    .3505618     .4776829         0          1
. bysort treat: sum
----------------------------------------------------------------------
-> treat = 0
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       treat |       260           0            0         0          0
         age |       260    25.05385     7.057745        17         55
         edu |       260    10.08846     1.614325         3         14
       black |       260    .8269231     .3790434         0          1
        hisp |       260    .1076923     .3105893         0          1
-------------+--------------------------------------------------------
     married |       260    .1538462     .3614971         0          1
    nodegree |       260    .8346154     .3722439         0          1
        re74 |       260    2107.027     5687.906         0   39570.68
        re75 |       260    1266.909     3102.982         0   23031.98
        re78 |       260    4554.801     5483.836         0   39483.53
-------------+--------------------------------------------------------
         u74 |       260         .25     .4338478         0          1
         u75 |       260    .3153846     .4655651         0          1
----------------------------------------------------------------------
-> treat = 1
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       treat |       185           1            0         1          1
         age |       185    25.81622     7.155019        17         48
         edu |       185    10.34595      2.01065         4         16
       black |       185    .8432432     .3645579         0          1
        hisp |       185    .0594595     .2371244         0          1
-------------+--------------------------------------------------------
     married |       185    .1891892     .3927217         0          1
    nodegree |       185    .7081081     .4558666         0          1
        re74 |       185    2095.574      4886.62         0   35040.07
        re75 |       185    1532.055     3219.251         0   25142.24
        re78 |       185    6349.144     7867.402         0   60307.93
-------------+--------------------------------------------------------
         u74 |       185    .2918919     .4558666         0          1
         u75 |       185          .4     .4912274         0          1
.
. outfile treat age edu black hisp married nodegree re74 re75 re78 u74 u75 /*
> */using nswre74_all.asc, replace
.
. ** Calculate the benchmark Treatment Effect
. ** Same as DW02 Tables 2 and 3 NSW row second last column
. ** and is the number given in CT page 894 second last line
.
. regress re78 treat
      Source |       SS       df       MS              Number of obs =     445
-------------+------------------------------           F(  1,   443) =    8.04
       Model |   348013183     1   348013183           Prob > F      =  0.0048
    Residual |  1.9178e+10   443  43290369.3           R-squared     =  0.0178
-------------+------------------------------           Adj R-squared =  0.0156
       Total |  1.9526e+10   444  43976681.9           Root MSE      =  6579.5

------------------------------------------------------------------------------
        re78 |      Coef.  Std. Err.        t   P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       treat |   1794.342   632.8534     2.84   0.005     550.5745     3038.11
       _cons |   4554.801   408.0459    11.16   0.000     3752.855    5356.747
------------------------------------------------------------------------------
.
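
The benchmark estimate above is a regression of re78 on a single binary regressor, so its slope equals the difference in mean re78 between treated and controls and its intercept equals the control mean. The Python sketch below (not part of the Stata program) checks this against the group means reported by bysort treat: sum, and then verifies the algebra on a small made-up sample. The helper ols_slope_intercept and the toy data are illustrative assumptions.

```python
# Group means of re78 copied from the "bysort treat: sum" output above
mean_control = 4554.801   # treat = 0, N = 260
mean_treated = 6349.144   # treat = 1, N = 185
diff_in_means = mean_treated - mean_control   # agrees with the treat coefficient up to rounding


def ols_slope_intercept(x, y):
    """Slope and intercept of a simple least-squares regression of y on x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, ybar - slope * xbar


# Toy sample (made up): control mean is 2, treated mean is 6
x = [0, 0, 0, 1, 1]
y = [1.0, 2.0, 3.0, 5.0, 7.0]
slope, intercept = ols_slope_intercept(x, y)

print(round(diff_in_means, 3))
print(slope, intercept)
```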
. ********** (2) ANALYSIS: NSW TREATED VERSUS CPS CONTROLS **********
.
. * This data set has NSW treated and full CPS controls
. use propensity_cps.dta, clear
.
. * Variables u74, u75 were evaluated wrongly in the original file
. * So make the following correction
. drop u74 u75
. gen u74=0
. replace u74=1 if re74==0
. gen u75=0
. replace u75=1 if re75==0
. gen age2=age*age
. gen age3=age2*age
. gen edu2=edu*edu
. gen edure74=edu*re74
. * Not sure whether this is needed
. * Does DW99 use edu*re74*age3 or separately edu*re74 and age3 ?
. gen edre74age3=edu*re74*age3
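
For readers working outside Stata, the recode and the generated regressors above can be sketched in Python as below. The function add_derived and the two example records are hypothetical; only the variable definitions (u74 = 1 when re74 is zero, u75 = 1 when re75 is zero, and the powers and interaction) are taken from the commands above.

```python
def add_derived(row):
    """Return a copy of row with the recoded and generated variables."""
    out = dict(row)
    out["u74"] = 1 if row["re74"] == 0 else 0   # zero earnings in 1974
    out["u75"] = 1 if row["re75"] == 0 else 0   # zero earnings in 1975
    out["age2"] = row["age"] ** 2
    out["age3"] = row["age"] ** 3
    out["edu2"] = row["edu"] ** 2
    out["edure74"] = row["edu"] * row["re74"]
    return out


# Two made-up records, not observations from propensity_cps.dta
rows = [
    {"age": 17, "edu": 10, "re74": 0.0, "re75": 0.0},
    {"age": 55, "edu": 18, "re74": 25862.32, "re75": 1200.0},
]
derived = [add_derived(r) for r in rows]

for r in derived:
    print(r["u74"], r["u75"], r["age2"], r["age3"], r["edu2"])
```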
.
. ** Summarize these data
. sum
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       treat |     16177     .011436     .1063292         0          1
         age |     16177    33.14051     11.03651        16         55
         edu |     16177    12.00828     2.868005         0         18
       black |     16177    .0823391     .2748892         0          1
        hisp |     16177    .0718922     .2583173         0          1
-------------+--------------------------------------------------------
     married |     16177    .7057551     .4557167         0          1
    nodegree |     16177    .3005502     .4585115         0          1
        re74 |     16177    13880.47     9613.115         0   35040.07
        re75 |     16177    13512.21     9313.207         0   25243.55
        re78 |     16177    14749.48     9670.996         0   60307.93
-------------+--------------------------------------------------------
         u74 |     16177    .1263522     .3322562         0          1
         u75 |     16177    .1149162     .3189307         0          1
        age2 |     16177     1220.09     783.4604       256       3025
        age3 |     16177    48988.49     45032.59      4096     166375
        edu2 |     16177    152.4238     67.06033         0        324
-------------+--------------------------------------------------------
     edure74 |     16177    169452.3     129585.8         0     490561
  edre74age3 |     16177    9.53e+09     1.21e+10         0   7.75e+10
. bysort treat: sum
----------------------------------------------------------------------
-> treat = 0
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       treat |     15992           0            0         0          0
         age |     15992    33.22524     11.04522        16         55
         edu |     15992    12.02751     2.870846         0         18
       black |     15992    .0735368     .2610237         0          1
        hisp |     15992     .072036     .2585556         0          1
-------------+--------------------------------------------------------
     married |     15992    .7117309     .4529712         0          1
    nodegree |     15992    .2958354     .4564316         0          1
        re74 |     15992     14016.8     9569.796         0   25862.32
        re75 |     15992     13650.8     9270.403         0   25243.55
        re78 |     15992    14846.66     9647.392         0   25564.67
-------------+--------------------------------------------------------
         u74 |     15992    .1196223     .3245295         0          1
         u75 |     15992    .1093047     .3120308         0          1
        age2 |     15992    1225.906     784.7382       256       3025
        age3 |     15992    49305.85     45139.01      4096     166375
        edu2 |     15992    152.9023     67.16633         0        324
-------------+--------------------------------------------------------
     edure74 |     15992    171147.6     129218.8         0   465521.8
  edre74age3 |     15992    9.64e+09     1.21e+10         0   7.75e+10
----------------------------------------------------------------------
-> treat = 1
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       treat |       185           1            0         1          1
         age |       185    25.81622     7.155019        17         48
         edu |       185    10.34595      2.01065         4         16
       black |       185    .8432432     .3645579         0          1
        hisp |       185    .0594595     .2371244         0          1
-------------+--------------------------------------------------------
     married |       185    .1891892     .3927217         0          1
    nodegree |       185    .7081081     .4558666         0          1
        re74 |       185    2095.574      4886.62         0   35040.07
        re75 |       185    1532.055     3219.251         0   25142.24
        re78 |       185    6349.144     7867.402         0   60307.93
-------------+--------------------------------------------------------
         u74 |       185    .7081081     .4558666         0          1
         u75 |       185          .6     .4912274         0          1
        age2 |       185    717.3946     431.2517       289       2304
        age3 |       185    21554.66     20964.71      4913     110592
        edu2 |       185    111.0595     39.30388        16        256
-------------+--------------------------------------------------------
     edure74 |       185    22898.73     57393.97         0     490561
  edre74age3 |       185    4.28e+08     1.24e+09         0   8.75e+09
.
. * This has data as original except for recode of u74 and u75
. outfile treat age edu black hisp married nodegree re74 re75 re78 u74 u75 /*
> */ using propensity_cps.asc, replace
.
. ** Number of replications to use in the bootstrap
. ** Ideally at least 400
. global breps 200
.
. *** (2A) CPS propensity score model from DW02 Table 2 footnote A
.
. global CPSDW02 age age2 age3 edu edu2 married nodegree black hisp re74 re75 u74 u75 edure74
.
. * With common support option
. pscore treat $CPSDW02, pscore(myscore) blockid(myblock) comsup numblo(5) level(0.005) logit
****************************************************
****************************************************
The treatment is treat

      treat |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |     15,992       98.86       98.86
          1 |        185        1.14      100.00
------------+-----------------------------------
      Total |     16,177      100.00

Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:
Iteration 6:
Iteration 7:
Iteration 8:

Logit estimates                                   Number of obs   =      16177
                                                  LR chi2(14)     =    1213.82
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.6003

------------------------------------------------------------------------------
       treat |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   2.425229   .3500652     6.93   0.000     1.739114    3.111344
        age2 |  -.0672395   .0111308    -6.04   0.000    -.0890555   -.0454234
        age3 |   .0005685   .0001113     5.11   0.000     .0003505    .0007866
         edu |   .9247848   .2500694     3.70   0.000     .4346577    1.414912
        edu2 |  -.0572021   .0136202    -4.20   0.000    -.0838972   -.0305071
     married |  -1.556471   .2517687    -6.18   0.000    -2.049929   -1.063014
    nodegree |   .9270591   .3254621     2.85   0.004     .2891651    1.564953
       black |   3.850668   .2662868    14.46   0.000     3.328755     4.37258
        hisp |   1.673885    .409913     4.08   0.000     .8704705      2.4773
        re74 |  -.0002203   .0001086    -2.03   0.043    -.0004332   -7.40e-06
        re75 |  -.0001969   .0000378    -5.21   0.000     -.000271   -.0001228
         u74 |   1.749522   .2897311     6.04   0.000      1.18166    2.317385
         u75 |     .00944    .257531     0.04   0.971    -.4953115    .5141915
     edure74 |   .0000222   9.08e-06     2.45   0.014     4.43e-06      .00004
       _cons |  -35.22098   3.797922    -9.27   0.000    -42.66477   -27.77719


-------------------------------------------------------------
      Percentiles      Smallest
 1%     .0010892       .0010614
 5%      .001221       .0010615
10%     .0013925       .0010625       Obs                4041
25%     .0021398       .0010632       Sum of Wgt.        4041

50%     .0053823                      Mean           .0452964
                        Largest       Std. Dev.      .1326324
75%     .0156111       .9356451
90%     .0856723         .93718       Variance       .0175914
95%      .282253       .9374608       Skewness       4.475994
99%      .822637       .9384554       Kurtosis       24.36564
******************************************************
******************************************************

**********************************************************
**********************************************************

  Inferior |
  of block |         treat
 of pscore |         0          1 |     Total
-----------+----------------------+----------
  .0010614 |     3,214         18 |     3,232
      .025 |       240          8 |       248
       .05 |       172         14 |       186
        .1 |        96         19 |       115
        .2 |        86         32 |       118
        .4 |        31         38 |        69
        .6 |         9         20 |        29
        .8 |         8         36 |        44
-----------+----------------------+----------
     Total |     3,856        185 |     4,041
*******************************************
*******************************************
.
. * Without common support option
. pscore treat $CPSDW02, pscore(myscore) blockid(myblock) numblo(5) level(0.005) logit
****************************************************
****************************************************

      treat |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |     15,992       98.86       98.86
          1 |        185        1.14      100.00
------------+-----------------------------------
      Total |     16,177      100.00

Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:


Logit estimates                                   Number of obs   =      16177
                                                  LR chi2(14)     =    1213.82
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.6003

------------------------------------------------------------------------------
       treat |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   2.425229   .3500652     6.93   0.000     1.739114    3.111344
        age2 |  -.0672395   .0111308    -6.04   0.000    -.0890555   -.0454234
        age3 |   .0005685   .0001113     5.11   0.000     .0003505    .0007866
         edu |   .9247848   .2500694     3.70   0.000     .4346577    1.414912
        edu2 |  -.0572021   .0136202    -4.20   0.000    -.0838972   -.0305071
     married |  -1.556471   .2517687    -6.18   0.000    -2.049929   -1.063014
    nodegree |   .9270591   .3254621     2.85   0.004     .2891651    1.564953
       black |   3.850668   .2662868    14.46   0.000     3.328755     4.37258
        hisp |   1.673885    .409913     4.08   0.000     .8704705      2.4773
        re74 |  -.0002203   .0001086    -2.03   0.043    -.0004332   -7.40e-06
        re75 |  -.0001969   .0000378    -5.21   0.000     -.000271   -.0001228
         u74 |   1.749522   .2897311     6.04   0.000      1.18166    2.317385
         u75 |     .00944    .257531     0.04   0.971    -.4953115    .5141915
     edure74 |   .0000222   9.08e-06     2.45   0.014     4.43e-06      .00004
       _cons |  -35.22098   3.797922    -9.27   0.000    -42.66477   -27.77719

-------------------------------------------------------------
      Percentiles      Smallest
 1%     5.92e-07       1.18e-09
 5%     1.72e-06       4.07e-09
10%     3.63e-06       4.24e-09       Obs               16177
25%     .0000196       1.55e-08       Sum of Wgt.       16177

50%     .0001247                      Mean            .011436
                        Largest       Std. Dev.      .0691037
75%     .0010579       .9356451
90%     .0073933         .93718       Variance       .0047753
95%     .0250635       .9374608       Skewness       9.281842
99%     .3620009       .9384554       Kurtosis       99.39697
******************************************************
******************************************************

**********************************************************
**********************************************************

  Inferior |
  of block |         treat
 of pscore |         0          1 |     Total
-----------+----------------------+----------
         0 |    11,635          0 |    11,635
  .0007813 |     1,056          2 |     1,058
  .0015625 |       932          5 |       937
   .003125 |       712          2 |       714
    .00625 |       709          2 |       711
     .0125 |       306          7 |       313
      .025 |       240          8 |       248
       .05 |       172         14 |       186
        .1 |        96         19 |       115
        .2 |        86         32 |       118
        .4 |        31         38 |        69
        .6 |         9         20 |        29
        .8 |         8         36 |        44
-----------+----------------------+----------
     Total |    15,992        185 |    16,177
*******************************************
*******************************************
.
. * Nearest neighbor matching (random version)
. attnd re78 treat $CPSDW02, comsup boot reps($breps) dots logit


---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185         155     730.380    1049.321       0.696
0.696


command:
attnd re78 treat age age2 age3 edu edu2 married nodegree black hisp re74 re75 u74
u75
> edure74 , pscore() logit comsup
statistic: attnd
= r(attnd)
....................................................................................................
> ..................................................................................................
> ..
Number of obs    =     16177
Replications     =       200

--------------------------------------------------------------------------------
Variable     |  Reps   Observed       Bias Std. Err.  [95% Conf. Interval]
-------------+------------------------------------------------------------------
       attnd |   200   730.3805   1280.829  941.0756   -1125.38   2586.141   (N)
             |                                         151.7753   3865.059   (P)
             |                                        -601.5495   1317.795  (BC)
--------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185         155     730.380     941.076       0.776
---------------------------------------------------------
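
A quick arithmetic check on the two nearest-neighbor tables above, sketched in Python rather than Stata. The helper name t_ratio is ours and is not part of the log. The reported t-statistics appear to be the ATT divided by the analytical standard error in the first table and by the bootstrap standard error in the second.

```python
def t_ratio(att: float, std_err: float) -> float:
    """Ratio of a point estimate to its standard error."""
    return att / std_err


# Values copied from the nearest-neighbor matching output above.
analytical_t = t_ratio(730.380, 1049.321)  # analytical standard error
bootstrap_t = t_ratio(730.380, 941.076)    # bootstrap standard error, 200 replications

print(f"{analytical_t:.3f} {bootstrap_t:.3f}")
```

Only the standard error changes between the two tables; the ATT point estimate is the same.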

.
. * Radius matching: Radius=0.0001
. attr re78 treat $CPSDW02, comsup boot reps($breps) dots logit radius(0.0001)


---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
       67        1027   -2935.932     888.041      -3.306
---------------------------------------------------------

command:      attr re78 treat age age2 age3 edu edu2 married nodegree black hisp re74 re75 u74 u75 edure74 , pscore() logit comsup radius(.0001)
statistic:    attr       = r(attr)
....................................................................................................
> ..................................................................................................
> ..
Number of obs    =     16177
Replications     =       200

--------------------------------------------------------------------------------
Variable     |  Reps   Observed       Bias Std. Err.  [95% Conf. Interval]
-------------+------------------------------------------------------------------
        attr |   200  -2935.932   472.0703  1332.096  -5562.767  -309.0973   (N)
             |                                        -5186.873   438.6902   (P)
             |                                        -5999.987  -950.2962  (BC)
--------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
       67        1027   -2935.932    1332.096      -2.204
---------------------------------------------------------

.
. * Kernel Matching
. attk re78 treat $CPSDW02, comsup boot reps($breps) dots logit


---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185        3856    1267.716
---------------------------------------------------------

command:      attk re78 treat age age2 age3 edu edu2 married nodegree black hisp re74 re75 u74 u75 edure74 , pscore() logit comsup bwidth(.06)
statistic:    attk       = r(attk)
....................................................................................................
> ..................................................................................................
> ..
Number of obs    =     16177
Replications     =       200

--------------------------------------------------------------------------------
Variable     |  Reps   Observed       Bias Std. Err.  [95% Conf. Interval]
-------------+------------------------------------------------------------------
        attk |   200   1267.716  -64.23519  720.5805  -153.2374   2688.669   (N)
             |                                        -211.0497   2559.206   (P)
             |                                        -136.5283   2594.417  (BC)
--------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185        3856    1267.716     720.580       1.759
---------------------------------------------------------
.
. * Stratification Matching
. atts re78 treat, pscore(myscore) blockid(myblock) comsup boot reps($breps) dots

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185        3856    1505.512     734.270       2.050
---------------------------------------------------------

command:      atts re78 treat , pscore(myscore) blockid(myblock) comsup
statistic:    atts       = r(atts)
....................................................................................................
> ..................................................................................................
> ..
Number of obs    =     16177
Replications     =       200

--------------------------------------------------------------------------------
Variable     |  Reps   Observed       Bias Std. Err.  [95% Conf. Interval]
-------------+------------------------------------------------------------------
        atts |   200   1505.512  -9.343635  665.1843   193.7979   2817.227   (N)
             |                                         251.7493   2958.461   (P)
             |                                         252.6815   2985.052  (BC)
--------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
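
The normal-based interval (N) in the stratification bootstrap table above can be probed with a small Python sketch; the function name implied_critical_value is hypothetical. It backs out the multiplier applied to the bootstrap standard error and compares it with the standard normal 97.5% quantile. The result, about 1.972, would be consistent with a Student-t critical value on 199 degrees of freedom (200 replications minus one), although the log does not state which distribution is used.

```python
from statistics import NormalDist


def implied_critical_value(estimate: float, std_err: float, upper: float) -> float:
    """Multiplier c such that upper = estimate + c * std_err."""
    return (upper - estimate) / std_err


# Observed ATT, bootstrap standard error, and upper (N) bound for atts, copied from the table above.
crit = implied_critical_value(1505.512, 665.1843, 2817.227)
z975 = NormalDist().inv_cdf(0.975)  # standard normal 97.5% quantile

print(f"implied: {crit:.4f}  normal: {z975:.4f}")
```

The implied multiplier slightly exceeds the normal quantile, which is why the reported interval is a little wider than estimate plus or minus 1.96 standard errors.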

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185        3856    1505.512     665.184       2.263
---------------------------------------------------------
.
. *** (2B) CPS propensity score model from DW99 Table 2 footnote A
.
. global CPSDW99 age age2 edu edu2 nodegree married black hisp re74 re75 u74 u75 edure74 age3
.
. pscore treat $CPSDW99, pscore(myscore) blockid(myblock) comsup numblo(5) level(0.005) logit
****************************************************
****************************************************

      treat |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |     15,992       98.86       98.86
          1 |        185        1.14      100.00
------------+-----------------------------------
      Total |     16,177      100.00

Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:
Iteration 6:
Iteration 7:
Iteration 8:

Logit estimates                                   Number of obs   =      16177
                                                  LR chi2(14)     =    1213.82
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.6003

------------------------------------------------------------------------------
       treat |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   2.425229   .3500652     6.93   0.000     1.739114    3.111344
        age2 |  -.0672395   .0111308    -6.04   0.000    -.0890555   -.0454234
         edu |   .9247848   .2500694     3.70   0.000     .4346577    1.414912
        edu2 |  -.0572021   .0136202    -4.20   0.000    -.0838972   -.0305071
    nodegree |   .9270591   .3254621     2.85   0.004     .2891651    1.564953
     married |  -1.556471   .2517687    -6.18   0.000    -2.049929   -1.063014
       black |   3.850668   .2662868    14.46   0.000     3.328755     4.37258
        hisp |   1.673885    .409913     4.08   0.000     .8704705      2.4773
        re74 |  -.0002203   .0001086    -2.03   0.043    -.0004332   -7.40e-06
        re75 |  -.0001969   .0000378    -5.21   0.000     -.000271   -.0001228
         u74 |   1.749522   .2897311     6.04   0.000      1.18166    2.317385
         u75 |     .00944    .257531     0.04   0.971    -.4953115    .5141915
     edure74 |   .0000222   9.08e-06     2.45   0.014     4.43e-06      .00004
        age3 |   .0005685   .0001113     5.11   0.000     .0003505    .0007866
       _cons |  -35.22098   3.797922    -9.27   0.000    -42.66477   -27.77719
------------------------------------------------------------------------------


-------------------------------------------------------------
      Percentiles      Smallest
 1%     .0010892       .0010614
 5%      .001221       .0010615
10%     .0013925       .0010625       Obs                4041
25%     .0021398       .0010632       Sum of Wgt.        4041

50%     .0053823                      Mean           .0452964
                        Largest       Std. Dev.      .1326324
75%     .0156111       .9356451
90%     .0856723         .93718       Variance       .0175914
95%      .282253       .9374608       Skewness       4.475994
99%      .822637       .9384554       Kurtosis       24.36564
******************************************************
******************************************************

**********************************************************
**********************************************************

  Inferior |
  of block |         treat
 of pscore |         0          1 |     Total
-----------+----------------------+----------
  .0010614 |     3,214         18 |     3,232
      .025 |       240          8 |       248
       .05 |       172         14 |       186
        .1 |        96         19 |       115
        .2 |        86         32 |       118
        .4 |        31         38 |        69
        .6 |         9         20 |        29
        .8 |         8         36 |        44
-----------+----------------------+----------
     Total |     3,856        185 |     4,041
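
The block table above can be cross-checked with a short Python sketch (the variable names are ours). It confirms that the block counts add up to the reported totals and computes the treated share within each block, which one would expect to rise with the propensity score.

```python
# (lower bound of block, controls, treated), copied from the common-support table above.
blocks = [
    (0.0010614, 3214, 18),
    (0.025, 240, 8),
    (0.05, 172, 14),
    (0.1, 96, 19),
    (0.2, 86, 32),
    (0.4, 31, 38),
    (0.6, 9, 20),
    (0.8, 8, 36),
]

n_controls = sum(c for _, c, _ in blocks)
n_treated = sum(t for _, _, t in blocks)

# Share of treated observations within each block.
treated_share = [t / (c + t) for _, c, t in blocks]

print(n_controls, n_treated, n_controls + n_treated)
for (lower, _, _), share in zip(blocks, treated_share):
    print(f"block from {lower:<9} treated share {share:.3f}")
```

In this table each block's treated share lies above the block's lower bound, as it should if the estimated score is well calibrated within blocks.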
*******************************************
*******************************************
.
. pscore treat $CPSDW99, pscore(myscore) blockid(myblock) numblo(5) level(0.005) logit
****************************************************
****************************************************

      treat |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |     15,992       98.86       98.86
          1 |        185        1.14      100.00
------------+-----------------------------------
      Total |     16,177      100.00

Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:
Iteration 6:
Iteration 7:
Iteration 8:

Logit estimates                                   Number of obs   =      16177
                                                  LR chi2(14)     =    1213.82
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.6003

------------------------------------------------------------------------------
       treat |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   2.425229   .3500652     6.93   0.000     1.739114    3.111344
        age2 |  -.0672395   .0111308    -6.04   0.000    -.0890555   -.0454234
         edu |   .9247848   .2500694     3.70   0.000     .4346577    1.414912
        edu2 |  -.0572021   .0136202    -4.20   0.000    -.0838972   -.0305071
    nodegree |   .9270591   .3254621     2.85   0.004     .2891651    1.564953
     married |  -1.556471   .2517687    -6.18   0.000    -2.049929   -1.063014
       black |   3.850668   .2662868    14.46   0.000     3.328755     4.37258
        hisp |   1.673885    .409913     4.08   0.000     .8704705      2.4773
        re74 |  -.0002203   .0001086    -2.03   0.043    -.0004332   -7.40e-06
        re75 |  -.0001969   .0000378    -5.21   0.000     -.000271   -.0001228
         u74 |   1.749522   .2897311     6.04   0.000      1.18166    2.317385
         u75 |     .00944    .257531     0.04   0.971    -.4953115    .5141915
     edure74 |   .0000222   9.08e-06     2.45   0.014     4.43e-06      .00004
        age3 |   .0005685   .0001113     5.11   0.000     .0003505    .0007866
       _cons |  -35.22098   3.797922    -9.27   0.000    -42.66477   -27.77719
------------------------------------------------------------------------------

-------------------------------------------------------------
      Percentiles      Smallest
 1%     5.92e-07       1.18e-09
 5%     1.72e-06       4.07e-09
10%     3.63e-06       4.24e-09       Obs               16177
25%     .0000196       1.55e-08       Sum of Wgt.       16177

50%     .0001247                      Mean            .011436
                        Largest       Std. Dev.      .0691037
75%     .0010579       .9356451
90%     .0073933         .93718       Variance       .0047753
95%     .0250635       .9374608       Skewness       9.281842
99%     .3620009       .9384554       Kurtosis       99.39697
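
One sanity check on the summary above, sketched in Python with names of our own choosing: for a logit with an intercept estimated by maximum likelihood, the mean fitted probability over the estimation sample equals the sample share of treated units. With 185 treated out of 16,177 observations, that share should match the reported mean of the propensity score.

```python
n_treated = 185   # treated units, from the tabulation of treat
n_obs = 16177     # observations used in the logit

treated_share = n_treated / n_obs

# Compare with the reported mean (.011436) and the tabulated percent (1.14).
print(f"{treated_share:.6f}")
print(f"{100 * treated_share:.2f}")
```

The equality holds only on the full estimation sample; the summaries restricted to the common support (4,041 or 5,354 observations) need not satisfy it.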
******************************************************
******************************************************

**********************************************************
**********************************************************

  Inferior |
  of block |         treat
 of pscore |         0          1 |     Total
-----------+----------------------+----------
         0 |    11,635          0 |    11,635
  .0007813 |     1,056          2 |     1,058
  .0015625 |       932          5 |       937
   .003125 |       712          2 |       714
    .00625 |       709          2 |       711
     .0125 |       306          7 |       313
      .025 |       240          8 |       248
       .05 |       172         14 |       186
        .1 |        96         19 |       115
        .2 |        86         32 |       118
        .4 |        31         38 |        69
        .6 |         9         20 |        29
        .8 |         8         36 |        44
-----------+----------------------+----------
     Total |    15,992        185 |    16,177
*******************************************
*******************************************
.
. attnd re78 treat $CPSDW99, comsup boot reps($breps) dots logit


---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185         155     730.380    1049.321       0.696
---------------------------------------------------------

command:      attnd re78 treat age age2 edu edu2 nodegree married black hisp re74 re75 u74 u75 edure74 age3 , pscore() logit comsup
statistic:    attnd      = r(attnd)
....................................................................................................
> ..................................................................................................
> ..
Number of obs    =     16177
Replications     =       200

--------------------------------------------------------------------------------
Variable     |  Reps   Observed       Bias Std. Err.  [95% Conf. Interval]
-------------+------------------------------------------------------------------
       attnd |   200   730.3805   1179.371  964.5437  -1171.658   2632.419   (N)
             |                                        -9.143144   3738.959   (P)
             |                                        -638.1188   1625.387  (BC)
--------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185         155     730.380     964.544       0.757
---------------------------------------------------------

.
. attr re78 treat $CPSDW99, comsup boot reps($breps) dots logit radius(0.0001)


---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
       67        1027   -2935.932     888.041      -3.306
---------------------------------------------------------

command:      attr re78 treat age age2 edu edu2 nodegree married black hisp re74 re75 u74 u75 edure74 age3 , pscore() logit comsup radius(.0001)
statistic:    attr       = r(attr)
....................................................................................................
> ..................................................................................................
> ..
Number of obs    =     16177
Replications     =       200

--------------------------------------------------------------------------------
Variable     |  Reps   Observed       Bias Std. Err.  [95% Conf. Interval]
-------------+------------------------------------------------------------------
        attr |   200  -2935.932   522.4813  1276.508   -5453.15  -418.7147   (N)
             |                                        -5239.598   302.9884   (P)
             |                                        -6023.029  -1232.031  (BC)
--------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
       67        1027   -2935.932    1276.508      -2.300
---------------------------------------------------------

.
. * Kernel Matching
. attk re78 treat $CPSDW99, comsup boot reps($breps) dots logit


---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185        3856    1267.716
---------------------------------------------------------

command:      attk re78 treat age age2 edu edu2 nodegree married black hisp re74 re75 u74 u75 edure74 age3 , pscore() logit comsup bwidth(.06)
statistic:    attk       = r(attk)
....................................................................................................
> ..................................................................................................
> ..
Number of obs    =     16177
Replications     =       200

--------------------------------------------------------------------------------
Variable     |  Reps   Observed       Bias Std. Err.  [95% Conf. Interval]
-------------+------------------------------------------------------------------
        attk |   200   1267.716  -57.76407  751.2898  -213.7948   2749.227   (N)
             |                                          -304.83   2488.355   (P)
             |                                        -314.1009   2459.423  (BC)
--------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185        3856    1267.716     751.290       1.687
---------------------------------------------------------
.

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185        3856    1505.512     734.270       2.050
---------------------------------------------------------

command:
statistic:    atts       = r(atts)
....................................................................................................
> ..................................................................................................
> ..
Number of obs    =     16177
Replications     =       200

--------------------------------------------------------------------------------
Variable     |  Reps   Observed       Bias Std. Err.  [95% Conf. Interval]
-------------+------------------------------------------------------------------
        atts |   200   1505.512   61.77066  741.7862    42.7422   2968.282   (N)
             |                                         245.6284   2880.622   (P)
             |                                          348.125   2849.896  (BC)
--------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185        3856    1505.512     741.786       2.030
---------------------------------------------------------
.
. *** (2C) CPS propensity score model from Becker-Ichino, 2002 (BI02)
.
. gen re742 = re74*re74
. gen re752 = re75*re75
. gen blacku74 = black*u74
. global CPSBI02 age age2 edu edu2 married black hisp re74 re75 re742 re752 blacku74
.
. pscore treat $CPSBI02, pscore(myscore) blockid(myblock) comsup numblo(5) level(0.005) logit
****************************************************
****************************************************

      treat |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |     15,992       98.86       98.86
          1 |        185        1.14      100.00
------------+-----------------------------------
      Total |     16,177      100.00

Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:
Iteration 5:
Iteration 6:
Iteration 7:
Iteration 8:

Logit estimates                                   Number of obs   =      16177
                                                  LR chi2(12)     =    1170.86
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.5790

------------------------------------------------------------------------------
       treat |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .7902073   .0940972     8.40   0.000     .6057803    .9746344
        age2 |  -.0128161   .0015894    -8.06   0.000    -.0159313   -.0097009
         edu |   .9953909   .2558663     3.89   0.000     .4939022     1.49688
        edu2 |  -.0636036   .0131378    -4.84   0.000    -.0893532   -.0378541
     married |  -1.534639   .2516679    -6.10   0.000    -2.027899   -1.041379
       black |   3.340175   .3032312    11.02   0.000     2.745853    3.934497
        hisp |   1.636367   .3971529     4.12   0.000     .8579614    2.414772
        re74 |  -.0001744   .0000626    -2.79   0.005    -.0002971   -.0000517
        re75 |   -.000168   .0000693    -2.42   0.015    -.0003039   -.0000322
       re742 |   8.06e-09   2.61e-09     3.09   0.002     2.95e-09    1.32e-08
       re752 |  -2.05e-09   3.97e-09    -0.52   0.605    -9.83e-09    5.73e-09
    blacku74 |   1.033264    .288037     3.59   0.000     .4687217    1.597806
       _cons |  -18.16269   1.865757    -9.73   0.000    -21.81951   -14.50588
------------------------------------------------------------------------------

-------------------------------------------------------------
      Percentiles      Smallest
 1%     .0006768       .0006558
 5%     .0007912        .000656
10%     .0009583       .0006562       Obs                5354
25%     .0016749       .0006566       Sum of Wgt.        5354

50%     .0040446                      Mean           .0343457
                        Largest       Std. Dev.      .1120884
75%     .0089357       .8905055
90%     .0495031        .898552       Variance       .0125638
95%     .1913766       .9023286       Skewness       4.931471
99%     .6773557       .9038652       Kurtosis       29.27201
******************************************************
******************************************************

**********************************************************
**********************************************************
Variable blacku74 is not balanced in block 3

  Inferior |
  of block |         treat
 of pscore |         0          1 |     Total
-----------+----------------------+----------
         0 |     4,230         13 |     4,243
     .0125 |       330          7 |       337
      .025 |       231          9 |       240
       .05 |       126         14 |       140
        .1 |       108         23 |       131
        .2 |        87         30 |       117
        .4 |        29         20 |        49
        .5 |        10         24 |        34
        .6 |        12         25 |        37
        .8 |         6         20 |        26
-----------+----------------------+----------
     Total |     5,169        185 |     5,354
*******************************************
*******************************************
.
. pscore treat $CPSBI02, pscore(myscore) blockid(myblock) numblo(5) level(0.005) logit
****************************************************
****************************************************

      treat |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |     15,992       98.86       98.86
          1 |        185        1.14      100.00
------------+-----------------------------------
      Total |     16,177      100.00

Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:

Iteration 4:
Iteration 5:
Iteration 6:
Iteration 7:
Iteration 8:

Logit estimates                                   Number of obs   =      16177
                                                  LR chi2(12)     =    1170.86
                                                  Prob > chi2     =     0.0000
                                                  Pseudo R2       =     0.5790

------------------------------------------------------------------------------
       treat |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .7902073   .0940972     8.40   0.000     .6057803    .9746344
        age2 |  -.0128161   .0015894    -8.06   0.000    -.0159313   -.0097009
         edu |   .9953909   .2558663     3.89   0.000     .4939022     1.49688
        edu2 |  -.0636036   .0131378    -4.84   0.000    -.0893532   -.0378541
     married |  -1.534639   .2516679    -6.10   0.000    -2.027899   -1.041379
       black |   3.340175   .3032312    11.02   0.000     2.745853    3.934497
        hisp |   1.636367   .3971529     4.12   0.000     .8579614    2.414772
        re74 |  -.0001744   .0000626    -2.79   0.005    -.0002971   -.0000517
        re75 |   -.000168   .0000693    -2.42   0.015    -.0003039   -.0000322
       re742 |   8.06e-09   2.61e-09     3.09   0.002     2.95e-09    1.32e-08
       re752 |  -2.05e-09   3.97e-09    -0.52   0.605    -9.83e-09    5.73e-09
    blacku74 |   1.033264    .288037     3.59   0.000     .4687217    1.597806
       _cons |  -18.16269   1.865757    -9.73   0.000    -21.81951   -14.50588
------------------------------------------------------------------------------

-------------------------------------------------------------
      Percentiles      Smallest
 1%     2.89e-08       1.94e-10
 5%     3.05e-07       1.94e-10
10%     1.20e-06       1.94e-10       Obs               16177
25%     .0000148       1.94e-10       Sum of Wgt.       16177

50%     .0001313                      Mean            .011436
                        Largest       Std. Dev.      .0664629
75%     .0016513       .8905055
90%     .0074369        .898552       Variance       .0044173
95%     .0234798       .9023286       Skewness       8.811019
99%     .3855562       .9038652       Kurtosis       89.82108
******************************************************
******************************************************

**********************************************************
**********************************************************
Variable blacku74 is not balanced in block 7

  Inferior |
  of block |         treat
 of pscore |         0          1 |     Total
-----------+----------------------+----------
         0 |    11,076          1 |    11,077
  .0007813 |       968          2 |       970
  .0015625 |     1,020          2 |     1,022
   .003125 |     1,185          3 |     1,188
    .00625 |       804          5 |       809
     .0125 |       330          7 |       337
      .025 |       231          9 |       240
       .05 |       126         14 |       140
        .1 |       108         23 |       131
        .2 |        87         30 |       117
        .4 |        29         20 |        49
        .5 |        10         24 |        34
        .6 |        12         25 |        37
        .8 |         6         20 |        26
-----------+----------------------+----------
     Total |    15,992        185 |    16,177
*******************************************
*******************************************
.
. attnd re78 treat $CPSBI02, comsup boot reps($breps) dots logit


---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185         147    1214.888     988.298       1.229
---------------------------------------------------------

command:      attnd re78 treat age age2 edu edu2 married black hisp re74 re75 re742 re752 blacku74 , pscore() logit comsup
statistic:    attnd      = r(attnd)
....................................................................................................
> ..................................................................................................
> ..
Number of obs    =     16177
Replications     =       200

--------------------------------------------------------------------------------
Variable     |  Reps   Observed       Bias Std. Err.  [95% Conf. Interval]
-------------+------------------------------------------------------------------
       attnd |   200   1214.888   379.5276  924.3417  -607.8733    3037.65   (N)
             |                                         -199.325   3378.257   (P)
             |                                        -1646.026   2654.964  (BC)
--------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185         147    1214.888     924.342       1.314
---------------------------------------------------------

.
. attr re78 treat $CPSBI02, comsup boot reps($breps) dots logit radius(0.0001)


---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
       65        1089   -3094.104     857.247      -3.609
---------------------------------------------------------

command:      attr re78 treat age age2 edu edu2 married black hisp re74 re75 re742 re752 blacku74 , pscore() logit comsup radius(.0001)
statistic:    attr       = r(attr)
....................................................................................................
> ..................................................................................................
> ..
Number of obs    =     16177
Replications     =       200

--------------------------------------------------------------------------------
Variable     |  Reps   Observed       Bias Std. Err.  [95% Conf. Interval]
-------------+------------------------------------------------------------------
        attr |   200  -3094.104   603.6858  1724.927  -6495.585   307.3775   (N)
             |                                        -5865.623   247.5659   (P)
             |                                        -8184.668  -474.5812  (BC)
--------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
       65        1089   -3094.104    1724.927      -1.794
---------------------------------------------------------

.
. * Kernel Matching
. attk re78 treat $CPSBI02, comsup boot reps($breps) dots logit


---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185        5169     881.520           .
---------------------------------------------------------

command:      attk re78 treat age age2 edu edu2 married black hisp re74 re75 re742 re752 blacku74 , pscore() logit comsup bwidth(.06)
statistic:    attk       = r(attk)
....................................................................................................
> ..................................................................................................
> ..
Number of obs    =     16177
Replications     =       200

--------------------------------------------------------------------------------
Variable     |  Reps   Observed       Bias Std. Err.  [95% Conf. Interval]
-------------+------------------------------------------------------------------
        attk |   200   881.5195   193.3904  741.3048  -580.3012    2343.34   (N)
             |                                        -375.8089   2373.732   (P)
             |                                        -776.3726   2117.355  (BC)
--------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185        5169     881.520     741.305       1.189
---------------------------------------------------------
.

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185        5169    1538.713
---------------------------------------------------------

command:
statistic:    atts       = r(atts)
....................................................................................................
> ..................................................................................................
> ..
Number of obs    =     16177
Replications     =       200

--------------------------------------------------------------------------------
Variable     |  Reps   Observed       Bias Std. Err.  [95% Conf. Interval]
-------------+------------------------------------------------------------------
        atts |   200   1538.713   18.76738  748.4438   62.81438   3014.612   (N)
             |                                         249.6562   3263.537   (P)
             |                                         225.0108   3230.658  (BC)
--------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected

---------------------------------------------------------
n. treat.   n. contr.         ATT   Std. Err.           t
---------------------------------------------------------
      185        5169    1538.713     748.444       2.056
---------------------------------------------------------
.
. ********** CLOSE OUTPUT **********
. log close
log: c:\Imbook\bwebpage\Section6\mma25p3extra.txt
log type: text
closed on: 26 May 2005, 13:26:49
----------------------------------------------------------------------------------------------------
BOOK FIGURES
Most of these figures are produced by Stata programs given at this website.
Page | Figure | Brief caption | File
  50 | 3.1    | Social experiment with random assignment | ch3-fig1.wmf
  89 | 4.1    | Quantile regression estimates of slope coefficient | ch4fig1qr.wmf
  90 | 4.2    | Quantile regression estimated lines | ch4fig2qr.wmf
 249 | 7.1    | Power of Wald chi-square test | ch7power.wmf
 253 | 7.2    | Density of Wald test statistic of zero slope coefficient | ch7montecarlo.wmf
 296 | 9.1    | Histogram for log wage | ch9hist.wmf
 296 | 9.2    | Kernel density estimates for log wage | ch9kd1.wmf
 297 | 9.3    | Nonparametric regression of log wage on education | ch9ksm1.wmf
 300 | 9.4    | Kernel density estimates using different kernels | ch9kdensu1.wmf
 309 | 9.5    | k-NN regression | ch9ksmma.wmf
 310 | 9.6    | Nonparametric regression using Lowess | ch9ksmlowess.wmf
 317 | 9.7    | Nonparametric estimate of derivative of y with respect to x | ch9kderiv.wmf
 368 | 11.1   | Bootstrap estimate of the density of t-test statistic | ch11boot.wmf
 411 | 12.1   | Halton sequence draws compared to pseudo-random draws |
 413 | 12.2   | Inverse transformation method for unit exponential draws | ch12fig2invtransform.wmf
 414 | 12.3   | Accept-reject method for random draws | ch12fig3envelope.wmf
 424 | 13.1   | Bayesian analysis for mean parameter of normal density | ch13_bayes1.wmf
 466 | 14.1   | Charter boat fishing: probit and logit predictions | ch14binary.wmf
 516 | 15.1   | Generalized random utility model | ch15-Gen-RUM2.wmf
 531 | 16.1   | Tobit regression example | ch16condmeans.wmf
 540 | 16.2   | Inverse Mills ratio as censoring point c increases | ch16millsratio.wmf
 575 | 17.1   | Strike duration: Kaplan-Meier survival function | kennanstrk.wmf
 585 | 17.2   | Weibull distribution: density, survivor, hazard and cumulative hazard functions | ch17weibull.wmf
 604 | 17.3   | Unemployment duration: Kaplan-Meier survival function | km_pt1.wmf
 605 | 17.4   | Unemployment duration: survival functions by unemployment insurance | km_pt2.wmf
 606 | 17.5   | Unemployment duration: Nelson-Aalen cumulative hazard function | na_pt1.wmf
 606 | 17.6   | Unemployment duration: cumulative hazard functions by unemployment insurance | na_pt2.wmf
 627 | 18.1   | Length-biased sampling under stock sampling | ch18lbias.wmf
 633 | 18.2   | Unemployment duration: exponential-gamma model generalized residuals | exp.wmf
 633 | 18.3   | Unemployment duration: Weibull model generalized residuals | exp_gamma.wmf
 635 | 18.4   | Unemployment duration: Weibull model generalized residuals | weibul16.wmf
 636 | 18.5   | Unemployment duration: Weibull-Inverse Gaussian model generalized residuals | weibul16_ig.wmf
 661 | 19.1   | Unemployment duration: Cox Competing Risks baseline survival functions | combined_bsf.wmf
 662 | 19.2   | Unemployment duration: Cox Competing Risks baseline cumulative hazards | combined_cbh.wmf
 712 | 21.1   | Hours and wages: pooled (overall) regression | ch21pantot.wmf
 713 | 21.2   | Hours and wages: between regression | ch21panbe.wmf
 713 | 21.3   | Hours and wages: within (fixed effects) regression | ch21panfe.wmf
 714 | 21.4   | Hours and wages: first differences regression | ch21panfd.wmf
 793 | 23.1   | Patents and R&D spending: pooled (overall) regression [with corrected labelling of axes] | ch23fig1.wmf
 880 | 25.1   | Regression-discontinuity design: example | ch25-fig1-rd.wmf
 883 | 25.2   | Treatment assignment in sharp and fuzzy RD designs. | ch25-fig2-rd.wmf
 892 | 25.3   | Training impact: earnings against propensity score by treatment status | ch25treatment.wmf
 924 | 27.1   | Missing data: examples of missing regressors | ch27fig1.wmf
Assign to
treatment
Yes
Eligible
subject
invited to
participate
Randomize
Agrees to
participate?
Assign to
control
No
Drop from
study
744
1
.8
.6
.4
.2
Upper 95% confidence band

Quantile slope coefficient
Lower 95% confidence band
OLS slope coefficient
Slope and confidence bands
Slope Estimates as Quantile Varies
.2
.4
.6
.8
15
Regression Lines as Quantile Varies

Actual Data
90th percentile
Median
10
10th percentile
Log Household Total Expenditure
Quantile
10
12
Log Household Medical Expenditure
745
.6
Test size = 0.10

Test size = 0.05
.4
Test size = 0.01
.2
Test Power
.8
Test Power as a function of the ncp
10
15
20
Noncentrality parameter lamda
.4
Monte Carlo Simulations of Wald Test

Monte Carlo
.2
.1
0
Density
.3
Standard Normal
-4
-2
Wald Test Statistic
746
.2
Density
.4
.6
Histogram for Log Wage
Log Hourly Wage
One-half plug-in
Plug-in
.2
.4
.6
Two times plug-in
Kernel density estimates
.8
Density Estimates as Bandwidth Varies
Log Hourly Wage
747
Bandwidth h=0.8
Bandwidth h=0.4
Bandwidth h=0.1
Actual data
Log Hourly Wage
Nonparametric Regression as Bandwidth Varies
10
15
20
Years of Schooling
.4
Epanechnikov (h=0.545)
Gaussian (h=0.246)
Quartic (h=0.646)
.2
Uniform (h=0.214)
Kernel density estimates
.6
Density Estimates as Kernel Varies
Log Hourly Wage
748
350
Actual Data
300
kNN (k=5)
Linear OLS
200
250
kNN (k=25)
150
Dependent variable y
k-Nearest Neighbours Regression as k Varies
20
40
60
80
100
Regressor x
Actual Data
300
Lowess (k=25)
200
250
OLS Cubic Regression
150
350
Lowess Nonparametric Regression
20
40
60
80
100
Regressor x
749
Nonparametric Derivative Estimation
From OLS Cubic Regression
-2
From Lowess (k=25)
20
40
60
80
100
Regressor x
.4
Bootstrap Density of 't-Statistic'

Bootstrap Estimate
.2
.1
0
Density
.3
Standard Normal
-4
-2
t-statistic from each bootstrap replication
750
.6
.4
0
.2
Cdf F(x)
.8
Inverse Transformation Method
Random variable x
Draw of 0.64 (vertical axis) yields x = 1.02 (horizontal axis).
.6
Accept-reject Method
Desired density f(x)
.4
.2
0
f(x) and kg(x)
Envelope kg(x)
10
Random variable x
751
.4
Bayes: Likelihood, Prior and Posterior

Likelihood N[10,2]
Prior N[5,3]
.2
0
.1
Density
.3
Posterior N[8,1.2]
10
15
Evaluation point
1.5
Predicted Probabilities Across Models

Actual Data (jittered)
Probit
.5
OLS
-.5
Predicted probability
Logit
-2
Log relative price (lnrelp)
752
Explanatory
variables
Disturbances
Disturbances
Latent
classes
Indicators
Latent
variables
Stated preference
indicators
Utilities
Observable
variable
Indicators
Unobservable
variable
Structural
relationship
Disturbances
Revealed preference
indicator y
-2000
2000
4000
Tobit: Censored and Truncated Means
Actual Latent Variable

Truncated Mean
Censored Mean
-4000
Different Conditional Means
Measurement
relationship
Uncensored Mean
Natural Logarithm of Wage
753
Inverse Mills ratio
N[0,1] Cdf
.5
1.5
N[0,1] Density
Inverse Mills, pdf and cdf
2.5
Inverse Mills Ratio as Cutoff Varies
-2
-1
Cutoff point c
.75
.5

Survival Function
.25
Survival Probability
Kaplan-Meier Survival Function Estimate
50
100
150
200
250
Strike duration in days
754
[Figure: Weibull Distribution. Four panels: Weibull density, Weibull survivor, Weibull hazard, Cumulative hazard. Horizontal axis in each panel: Duration time.]

[Figure: Overall Survival Function Estimate. Horizontal axis: Unemployment Duration in 2-week intervals. Series: Survival Estimate.]

[Figure: Survival Function Estimates by UI Status. Series: No UI (UI = 0), Received UI (UI = 1).]

[Figure: Overall Cumulative Hazard Estimate. Vertical axis: Cumulative Hazard. Series: Cumulative Hazard Estimate.]

[Figure: Cumulative Hazard Estimates by UI Status. Vertical axis: Cumulative Hazard. Series: No UI (UI = 0), Received UI (UI = 1).]

[Diagram: spells S1 to S9 shown relative to the Survey date and the 12-month survey period.]
[Figure: Exponential Model Residuals. Vertical axis: Cumulative Hazard. Horizontal axis: Generalized (Cox-Snell) Residual. Series: Cumulative Hazard, 45 degree line.]

[Figure: Exponential-Gamma Model Residuals. Vertical axis: Cumulative Hazard. Series: Cumulative Hazard, 45 degree line.]

[Figure: Weibull Model Residuals. Vertical axis: Cumulative Hazard. Series: Cumulative Hazard, 45 degree line.]

[Figure: Weibull-IG Model Residuals. Vertical axis: Cumulative Hazard. Series: Cumulative Hazard, 45 degree line.]

[Figure: Baseline Survival Functions. Vertical axis: Baseline Survival Probability. Series: Risk 1 (full-time job), Risk 2 (part-time job), Risk 3 (unknown job).]

[Figure: Baseline Cumulative Hazard Functions. Vertical axis: Baseline Cumulative Hazard. Series: Risk 1 (full-time job), Risk 2 (part-time job), Risk 3 (unknown job).]
[Figure: Pooled (Overall) Regression. Axis labels: Log annual hours, Log hourly wage. Series: Original data, Nonparametric fit, Linear fit.]

[Figure: Between Regression. Axis labels: Log annual hours, Log hourly wage. Series: Averages, Nonparametric fit, Linear fit.]

[Figure: Within (Fixed Effects) Regression. Axis labels: Log annual hours, Log hourly wage. Series: Deviations from average, Nonparametric fit, Linear fit.]

[Figure: First Differences Regression. Axis labels: Log annual hours, Log hourly wage. Series: First differences, Nonparametric fit, Linear fit.]

[Figure: Pooled (Overall) Regression. Axis labels: Log Patents, Log R&D Spending. Series: Original data, Nonparametric fit, Linear fit.]

[Figure: Regression Discontinuity Example. Vertical axis: Outcome y. Horizontal axis: Selection variable S. Series: Actual data, No treat (low), Treat (high).]

[Figure: Post-treatment Earnings against Propensity Score. Panels: Treated_sample, Comparison_sample (graphs by treatment status). Vertical axis: Real Earnings 1978. Horizontal axis: Propensity Score. Series: Original data, Nonparametric regression.]

[Figure: Sharp and Fuzzy RD Designs. Vertical axis: Propensity score Pr[D=1|S]. Horizontal axis: Selection variable S. Series: Sharp Design, Fuzzy design.]
BOOK CORRECTIONS - June 9, 2005, plus some but not all corrections added since then

Page         Date Posted  Correction or Addition

p. 85        2/18/2006    Bottom line should be "censored models (see Section 16.9.2)." [Jeff Smith, Michigan]
p. 68, 147   11/22/2005   Liebler should be spelt Leibler. [Joerg Stoye, NYU]
p. 89        3/30/2006    Third last line should be "q = 0.1, 0.5, and 0.9" and not "q = 0.1, 0.2, ..., 0.9". [James MacKinnon, Queen's]
p. 113       5/27/2005    Exercise 4-2 part (b) should be "Hence directly obtain a consistent estimate of the variance of _hat" (and not "Hence directly obtain the variance of y_bar").
p. 114       6/9/2005     Exercise 4-7 parts (d)-(f) need to be replaced. See mmaex04_7.pdf.
p. 164       6/9/2005     Exercise 5-1 is correct but the function is close to linear. A better example uses E[y|x]=exp(0+0.04x)/[1+exp(0+0.04x)].
p. 165       6/9/2005     Exercise 5-7 part (c) is ML estimation (delete the word NLS).
p. 168       3/3/2006     Second line after first displayed equation should be E[h(x)(y-g(x,β))] = 0 (and not E[h(x)(y-x'β)]). [Doug Miller, UC-Davis]
p. 178       3/3/2006     Last displayed equation. The first and third matrices are wrong and should be similar to G_hat in (6.21). For these matrices the two terms being summed over i should be x_i*x_i' and 3*utilde_i^2*x_i*x_i'. [Doug Miller, UC-Davis]
p. 189       3/6/2006     Theil's interpretation. Change "suppose that in the reduced form model" to "Suppose that we specify a first-stage model where". [Doug Miller, UC-Davis]
p. 190       3/6/2006     Basmann's interpretation. Change "OLS reduced form prediction" to "OLS first-stage predictions". [Doug Miller, UC-Davis]
p. 193       3/6/2006     Top line: change "because to regressors" to "because the regressors". [Doug Miller, UC-Davis]
p. 199       5/18/2005    In Table 6.4 the NL2SLS column is 0.969, 0.041, 0.84 (and not 0.960, 0.046, 0.85).
p. 214       3/28/2006    In the displayed equation for the 3SLS estimator the matrix OMEGA_hat should be SIGMA_hat. Same change two lines down and four lines down. SIGMA_hat = definition given for OMEGA_hat.
p. 220       5/27/2005    Exercise 6-1 part (a) should be (y - exp(x'β))^2 (and not (y - (x'β))^2). Exercise 6-1 part (d) should be E[x(y - exp(x'β))] = 0 (so add = 0).
p. 255       5/18/2005    Sample size was N=40 (and not N=30).
p. 255       5/18/2005    Five lines from bottom should be z = (0.817 - 1) / 0.376 = -0.487.
p. 256       5/18/2005    In section 7.8.3 the percentiles should be -1.89 and 1.80 (and not -2.62 and 1.83).
p. 278, 280  11/22/2005   Liebler should be spelt Leibler. [Joerg Stoye, NYU]
p. 414       5/18/2005    Figure 12.3 vertical axis label should be f(x) and kg(x) and legend should be kg(x) (and not g(x)).
p. 493       2/18/2006    First two lines should be "in the probability of fishing from a beach, and an increase of 0.119, 0.080, and 0.068, respectively, in the probability of fishing from a pier, a private boat, and a charter boat." [Jeff Smith, Michigan]
p. 501       3/22/2006    (15.17) and the line before should have a minus sign before the expected Hessian. [Frank Windmeijer, Bristol]
p. 505       3/22/2006    Fifth line should be "computer-intensive" not "computer-intesive". [Frank Windmeijer, Bristol]
p. 508       3/22/2006    Possible error in (15.31) needs to be checked.
p. 569       5/19/2005    Bibliographic note 16.3 should refer to Tobin (1958) (and not Tobit (1958)). [Kevin Hoover, UCD]
p. 793       4/7/2005     Figure 23.1 axes labels are reversed. Vertical axis is log(patents) and horizontal axis is log(R&D).
p. 839       4/10/2006    Second equality for SIGMA_c^-1 should not have the inverse at the end.
p. 839       4/10/2006    Formula for [I + aee']^(1/2) should finish with ee' and not Mee'.
p. 895       5/26/2005    Table 25.6 footnote b: drop RE74*RE75 from the list of regressors.
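
The corrected z-statistic listed for p. 255 can be checked directly, taking it to have the usual form (estimate minus hypothesized value) divided by the standard error. That labelling of 0.817, 1 and 0.376 is an assumption read off the arithmetic, and the function name is hypothetical:

```python
def z_statistic(estimate, hypothesized, std_error):
    """Return (estimate - hypothesized) / std_error."""
    if std_error <= 0:
        raise ValueError("std_error must be positive")
    return (estimate - hypothesized) / std_error


# Values from the p. 255 correction.
z = z_statistic(0.817, 1.0, 0.376)
print(round(z, 3))  # -0.487
```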