STATA TECHNICAL BULLETIN
May 2000   STB-55
A publication to promote communication among Stata users
Editor
H. Joseph Newton
Department of Statistics
Texas A & M University
College Station, Texas 77843
979-845-3142
979-845-3144 FAX
[email protected] EMAIL

Associate Editors
Nicholas J. Cox, University of Durham
Joanne M. Garrett, University of North Carolina
Marcello Pagano, Harvard School of Public Health
J. Patrick Royston, UK Medical Research Council
Jeroen Weesie, Utrecht University
Subscriptions are available from Stata Corporation, email
[email protected], telephone 979-696-4600 or 800-STATAPC,
fax 979-696-4601. Current subscription prices are posted at www.stata.com/bookstore/stb.html.
Previous Issues are available individually from StataCorp. See www.stata.com/bookstore/stbj.html for details.
Submissions to the STB, including submissions to the supporting files (programs, datasets, and help files), are on
a nonexclusive, free-use basis. In particular, the author grants to StataCorp the nonexclusive right to copyright and
distribute the material in accordance with the Copyright Statement below. The author also grants to StataCorp the right
to freely use the ideas, including communication of the ideas to other parties, even if the material is never published
in the STB. Submissions should be addressed to the Editor. Submission guidelines can be obtained from either the
editor or StataCorp.
Copyright Statement. The Stata Technical Bulletin (STB) and the contents of the supporting files (programs,
datasets, and help files) are copyright c by StataCorp. The contents of the supporting files (programs, datasets, and
help files), may be copied or reproduced by any means whatsoever, in whole or in part, as long as any copy or
reproduction includes attribution to both (1) the author and (2) the STB.
The insertions appearing in the STB may be copied or reproduced as printed copies, in whole or in part, as long
as any copy or reproduction includes attribution to both (1) the author and (2) the STB. Written permission must be
obtained from Stata Corporation if you wish to make electronic copies of the insertions.
Users of any of the software, ideas, data, or other materials published in the STB or the supporting files understand
that such use is made without warranty of any kind, either by the STB, the author, or Stata Corporation. In particular,
there is no warranty of fitness of purpose or merchantability, nor for special, incidental, or consequential damages such
as loss of profits. The purpose of the STB is to promote free communication among Stata users.
The Stata Technical Bulletin (ISSN 1097-8879) is published six times per year by Stata Corporation. Stata is a registered
trademark of Stata Corporation.
Contents of this issue                                                                              page
an72.    STB-49–STB-54 available in bound format                                                       2
ip9.1.   Update of the byvar command                                                                   2
sbe32.1. Errata for sbe32                                                                              2
sbe33.   Comparing several methods of measuring the same quantity                                      2
sbe34.   Loglinear modeling using iterative proportional fitting                                      10
sg135.   Test for autoregressive conditional heteroskedasticity in regression error distribution      13
sg136.   Tests for serial correlation in regression error distribution                                14
sg137.   Tests for heteroskedasticity in regression error distribution                                15
sg138.   Bootstrap inferences about measures of correlation                                           17
sg139.   Logistic regression when binary outcome is measured with uncertainty                         20
sg140.   The Gumbel quantile plot and a test for choice of extreme models                             23
sg141.   Treatment effects model                                                                      25
sg142.   Uniform layer effect models for the analysis of differences in two-way associations          33
snp15.   somersd—Confidence intervals for nonparametric statistics and their differences              47
zz10.    Cumulative index for STB-49–STB-54                                                           55
an72
STB-49–STB-54 available in bound format
Patricia Branton, Stata Corporation,
[email protected]
The ninth year of the Stata Technical Bulletin (issues 49–54) has been reprinted in a bound book called The Stata Technical
Bulletin Reprints, Volume 9. The volume of reprints is available from StataCorp for $25, plus shipping. Authors of inserts in
STB-49–STB-54 will automatically receive the book at no charge and need not order.
This book of reprints includes everything that appeared in issues 49–54 of the STB. As a consequence, you do not need
to purchase the reprints if you saved your STBs. However, many subscribers find the reprints useful since they are bound in a
convenient volume. Our primary reason for reprinting the STB, though, is to make it easier and cheaper for new users to obtain
back issues. For those not purchasing the Reprints, note that zz10 in this issue provides a cumulative index for the ninth year
of the original STBs.
You may order the Reprints Volumes online at www.stata.com/bookstore/stbr.html or use the enclosed order form.
ip9.1
Update of the byvar command
Patrick Royston, Imperial College School of Medicine, London, UK,
[email protected]
Abstract: The byvar command has been updated for Stata 6 and a few new features added.
Keywords: Stata commands.
The byvar command introduced in Royston (1995) has been updated for Stata 6 and a few new features added.
References
Royston, P. 1995. ip9: Repeat Stata command by variable(s). Stata Technical Bulletin 27: 3–5. Reprinted in Stata Technical Bulletin Reprints, vol. 5,
pp. 67–69.
sbe32.1
Errata for sbe32
López Vizcaíno, M. E., Santiago Pérez M. I., Abraira García L., Dirección Xeral de Saude Publica, Spain,
[email protected]
Abstract: Errors in the Methodology section of López Vizcaíno et al. (2000) are corrected.
Keywords: Outbreak, regression, threshold, public health surveillance.
In the process of editing López Vizcaíno et al. (2000), errors were introduced into the Methodology section. In the first
two equations in that section, the gi's should have been γi's, while the mi should have been μi. Finally, the sentence that begins
after the third displayed equation should say that the Poisson model corresponds to a parameter value of 1 rather than 0.
References
López Vizcaíno M. E., M. I. Santiago Pérez, and L. Abraira García. 2000. sbe32: Automated outbreak detection from public health surveillance data.
Stata Technical Bulletin 54: 23–25.
sbe33
Comparing several methods of measuring the same quantity
Paul Seed, GKT School of Medicine, King's College London, UK,
[email protected]
Abstract: New commands are given, based on the Bland–Altman approach to the analysis of studies comparing two or more
methods for measuring the same quantity. An extension to more than two methods is explained, with an associated command.
A new command, based on Pitman’s method, gives confidence intervals for the variance ratio of paired data. It is more
powerful than Stata’s sdtest, particularly for large correlations. For more than two methods, with no reference standard,
a new generalization of Bland–Altman methods is shown and compared with an approach based on factor analysis.
Keywords: Method comparison, Bland–Altman, variance ratio.
The problem
New techniques for taking clinical measurements are always being developed. How can we decide which is best? Sometimes
a new measurement technique is compared with an established “Gold Standard,” which may or may not be regarded as exact.
How good is the new technique? Alternatively, there may be several methods, all seen as imperfect. Which is best?
Stata Technical Bulletin
3
Some typical datasets
Consider blood iron where one might want to compare an established method (colorimetry) with two new clinical techniques:
inductively coupled plasma optical emission spectrometry (ICPOES) with 18 pairs of measurements
. use col_icp
. summarize
Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+-----------------------------------------------------
colorime |      20        19.3    4.878524        11         26
  icpoes |       0
     icp |      18    31.66667    10.07034        16         56
    mean |      18    25.44444    6.116249      13.5         40
    diff |      18   -12.44444    10.26257       -38         -5
and ICPOES following protein precipitation with trichloroacetic acid (TCA) with 52 pairs.
. use tca_col,clear
. summarize
Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+-----------------------------------------------------
colorime |      52    15.09615    5.668141         8         26
tca_ppt_ |      52    13.21154    5.413623         5         23
    mean |      52    14.15385    5.498251       6.5       24.5
    diff |      52    1.884615    1.395425        -1          5
The question is: how does the new test compare with the old?
The second example compares five eyesight tests carried out on 15 patients before and after operations for astigmatism.
We are interested in the percentage improvement in eyesight as measured by each of the five tests.
. use tan_part, clear
. summarize pct_*
Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+-----------------------------------------------------
   pct_1 |      15    20.19054    18.39144         0   60.40134
   pct_2 |      15    33.52114    35.62084  .4051345   114.6789
   pct_3 |      15    36.47013     57.8477 -13.49776   220.4319
   pct_4 |      15    15.95635    11.46139  .7299263   41.23711
   pct_5 |      15    24.67195    11.49144  9.999998   42.85715
One last example is tumor activity adjusted for partial volume and glucose uptake (the variable log_gp) and that adjusted
for partial volume alone (the variable log_p):
. use suv2, clear
. summarize log_g*
Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+-----------------------------------------------------
   log_p |      86    2.260328    .6361188  .2615947   4.098669
  log_gp |      86    2.183237    .7080229 -.8204135   3.929667
Simple methods that don’t work
I have found two methods in particular that don’t perform very well in such problems. First, very high correlations are
almost always found. The null hypothesis that there is no association is just not credible. The significance test tells us nothing
we don’t already know. Secondly, the F test, provided by Stata’s sdtest command, is not appropriate for paired data. Pitman’s
test (below) is more powerful, particularly with a large correlation. It is not safe to assume that the measure with the smaller
variance has the smaller component of error. While this is often the case, it might just be less sensitive to genuine variation.
Other methods to use with caution
Linear regression looks for any linear relationship: m2 = a + bm1 , whereas we are often interested in m2 = m1 , that is,
a = 0 and b = 1. The method also assumes that m1 is measured without error. This is scarcely likely. If m1 is measured with
error, the estimate of b is biased towards zero.
Paired t tests and confidence intervals for differences in means are useful as evidence of systematic bias, but measures
with large random error can have a nonsignificant t test, even when bias exists. We are mainly interested in the error in each
individual measurement. Bias is not as important as the absolute size of the likely difference.
Simple approaches that can be useful
The reference range for differences between individual measurements is defined as the mean plus or minus two standard
deviations. Approximately 95% of values will be between these limits. If two measures agree well, the reference range will be
very narrow. Note that the reference range is in the same units as the actual measurement. In Stata we can use
. gen diff = m2 - m1
. summarize diff
. local mdiff = r(mean)
. local lrr = `mdiff' - 2*r(Var)^.5
. local urr = `mdiff' + 2*r(Var)^.5
. display "Mean difference = `mdiff'"
. display "Reference Range = `lrr' to `urr'"
Bland–Altman plots
Bland and Altman (1983) introduced the idea of plotting the difference of paired variables versus their average, with
horizontal lines for the reference range for the difference. Any plots of the actual data are useful to show oddities. The plots
will show no trend if the variances of m1 and m2 are the same. A positive trend shows that the variance of e2 is larger than
that of e1. We can use
. gen av = (m1 + m2)/2
. graph diff av, xlab ylab yline(`mdiff', `lrr', `urr')
or use the command baplot included with this insert
. baplot m2 m1
Syntax for baplot command
baplot varname1 varname2 [if exp] [in range] [, format(str) avlab(str) difflab(str) graph options]
Options for baplot
format(str) sets the format for the results given.
avlab(str) gives a variable label to the average before plotting the graph.
difflab(str) gives a variable label to the difference before plotting the graph.
graph options are any of the options allowed with graph, twoway; see [G] graph options.
Examples
Consider comparing a new measure with a gold standard. For the blood-iron data, we can compare ICPOES with Colorimetry
giving the output below and the plot in Figure 1.
. use col_icp, clear
. summarize icp colorime
Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+-----------------------------------------------------
     icp |      18    31.66667    10.07034        16         56
colorime |      18    19.22222    5.105425        11         26
. baplot icp colorime , xlab(0,10,20,30,40) ylab(-40,-20,0,20,40) avlab("ICPOES vs
> Colorimetry")
Bland-Altman comparison of icp and colorime
Limits of agreement (Reference Range for difference): -8.081 to 32.970
Mean difference: 12.444 (CI 7.341 to 17.548)
Range : 13.500 to 40.000
Pitman's Test of difference in variance: r = 0.600, n = 18, p = 0.008
Figure 1. Comparing ICPOES and colorimetry for the blood-iron data.
We can compare TCA-precipitated ICPOES with colorimetry, giving the output below and the graph in Figure 2.
. use tca_col.dta, clear
. summarize colorime tca_ppt_
Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+-----------------------------------------------------
colorime |      52    15.09615    5.668141         8         26
tca_ppt_ |      52    13.21154    5.413623         5         23
. baplot tca_ppt_ colorime, xlab(0,10,20,30,40) ylab(-40,-20,0,20,40) avlab("Adjust
> ed ICPOES vs Colorimetry")
Bland-Altman comparison of tca_ppt_ and colorime
Limits of agreement (Reference Range for difference): -4.675 to 0.906
Mean difference: -1.885 (CI -2.273 to -1.496)
Range : 6.500 to 24.500
Pitman's Test of difference in variance: r = -0.184, n = 52, p = 0.207
Figure 2. Comparing TCA-precipitated ICPOES with colorimetry.
Pitman’s test of difference in variance
As mentioned earlier, if m1 and m2 have equal variance, the covariance (and hence the correlation) between their average
and their difference will be zero. Pitman’s test looks for a significant correlation between the difference and the average of m1
and m2 . If one exists, this is evidence that the variances are not the same. Because this uses the fact that the data are paired, it
can be much more powerful than the usual F test (consider paired and unpaired t tests).
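For reference, the statistic itself is easy to compute by hand from the correlation between the difference and the average. The
sketch below assumes two paired measures in hypothetical variables m1 and m2:

. * Pitman's test by hand: correlate the difference with the average
. gen diff = m1 - m2
. gen av = (m1 + m2)/2
. correlate diff av
. local r = r(rho)
. local n = r(N)
. * t statistic with n-2 degrees of freedom
. display "t = " `r'*sqrt(`n'-2)/sqrt(1-`r'^2) ", df = " `n'-2

With the blood-iron data (m1 = icp, m2 = colorime), this gives r = 0.600 and t = 3.0 on 16 degrees of freedom, in agreement
with the p = 0.008 reported by baplot in the example above.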
Pitman, quoted in Snedecor and Cochran (1967), extended this test to give a confidence interval for the variance ratio. This
can be obtained by using the new command sdpair included with this insert.
Syntax for the sdpair command
sdpair varname1 varname2 [weight] [if exp] [in range] [, format(str) level(#)]
fweights and aweights are allowed.
Options for sdpair
format(str) sets the format for the display of results.
level(#) specifies the confidence level, in percent, for confidence intervals. The default is level(95) or as set by set level.
Example of Pitman’s test:
Compare tumor activity adjusted for partial volume and glucose uptake with that for partial volume alone:
. sdtest log_p = log_gp
(output omitted )
P < F_obs = 0.1627        P < F_L + P > F_U = 0.3254        P > F_obs = 0.8373
. sdpair log_p log_gp
Pitman's variance ratio test between log_p and log_gp:
Ratio of Standard deviations = 0.8984
95% Confidence Interval 0.8365 to 0.9649
t = -2.986, df = 84, p = 0.004
Multiple Bland–Altman plots for comparing more than two methods
The command bamat produces a matrix of Bland–Altman plots for all possible pairs of methods. This is very useful for a
first comparison of methods, and may identify a method that is clearly inferior to the others. It is illustrated with the eyesight
data.
Syntax for bamat
bamat varlist [if exp] [in range] [, format(str) notable data avlab(str) difflab(str) obs(#) listwise title(str) graph options]
Options for bamat
format(str) sets the format for display of results.
notable suppresses display of results.
data lists data used in plotting each graph.
avlab(str) gives a variable label to the average before plotting the graph.
difflab(str) gives a variable label to the difference before plotting the graph.
obs(#) specifies the minimum number of nonmissing values per observation needed for a point to be plotted. The default value
is 2 (pairwise deletion).
listwise specifies listwise deletion of missing data. Default is pairwise. Only observations with no missing values are used.
title(str) adds a single title to the block of graphs.
graph options are any of the options allowed with graph, twoway; see [G] graph options.
Example of bamat
Once again we consider the eyesight data.
. use tan_part,clear
. bamat pct_*
Reference ranges for differences between two methods
Method 1  Method 2      Mean    [95% Reference Range]    Minimum    Maximum
----------------------------------------------------------------------------
pct_2     pct_1       13.331    -50.777      77.438      -31.678     88.525
pct_3     pct_1       16.280   -100.401     132.960      -42.994    195.588
pct_3     pct_2        2.949   -100.526     106.424      -44.137    162.944
pct_4     pct_1       -4.234    -47.425      38.956      -46.533     41.237
pct_4     pct_2      -17.565    -84.648      49.518      -98.993      5.375
pct_4     pct_3      -20.514   -126.212      85.184     -184.780     26.829
pct_5     pct_1        4.481    -32.680      41.643      -31.142     37.500
pct_5     pct_2       -8.849    -64.287      46.589      -71.822     21.817
pct_5     pct_3      -11.798   -115.825      92.229     -182.932     35.720
pct_5     pct_4        8.716    -15.297      32.729       -4.839     27.171
----------------------------------------------------------------------------
Range of x values is -6.546 to 139, range of y values is -195.6 to 200
Figure 3. Matrix of Bland–Altman plots for the eyesight data.
Modified Bland–Altman plots
We would like to modify Bland–Altman plots for use with more than two measures when there is no gold standard measure.
For example, if we have eight measures, there would be 28 Bland–Altman plots. We consider a modification that gives one
comparison per measure. The average is just the average of all the measures. We hope this is close to the truth. The difference
we use for the ith measure is the ith measure minus the average of the other measures. We work out a reference range as before.

If each measure is of the form mi = t + ei, with the errors independent and of equal variance, then the correlation between
the average and the difference will be zero. If, for some particularly useful method, mi has smaller than average variance, there
will be a negative trend.

This method has difficulties if the errors are correlated or the model breaks down in other ways; for example, if mi is a
linear function of the truth, that is, mi = ai + bi t + ei.
We can do this by brute force in Stata by
. egen av = rmean(m1-m5)
. egen mean1 = rmean(m2-m5)
. gen diff = m1 - mean1
. summ diff
. local mdiff = r(mean)
. local lrr = `mdiff' - 2*r(Var)^.5
. local urr = `mdiff' + 2*r(Var)^.5
. graph diff av, xlab ylab yline(`lrr', `mdiff', `urr')
or use the new command bagroup included with this insert.
Syntax for bagroup
bagroup varlist [if exp] [in range] [, format(str) rows(#) avlab(str) difflab(str) title(str) obs(#) listwise graph options]
Options for bagroup
format(str) sets the format for display of results.
rows(#) specifies the number of rows of graphs to be shown.
avlab(str) gives a variable label to the average before plotting the graph.
difflab(str) gives a variable label to the difference before plotting the graph.
title(str) adds a single title to the block of graphs.
obs(#) specifies the minimum number of nonmissing values per observation needed for a point to be plotted. The default value
is 2 (pairwise deletion).
listwise specifies listwise deletion of missing data. Default is pairwise. Only observations with no missing values are used.
graph options are any of the options allowed with graph, twoway; see [G] graph options.
Example of bagroup
For the eyesight data we obtain the results below and the plot in Figure 4.
. use tan_part, clear
. summarize pct_*
Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+-----------------------------------------------------
   pct_1 |      15    20.19054    18.39144         0   60.40134
   pct_2 |      15    33.52114    35.62084  .4051345   114.6789
   pct_3 |      15    36.47013     57.8477 -13.49776   220.4319
   pct_4 |      15    15.95635    11.46139  .7299263   41.23711
   pct_5 |      15    24.67195    11.49144  9.999998   42.85715
. bagroup pct_*
Comparisons with the average of the other measures
Variable |   Obs     Mean      SD   Difference     Reference Range
---------+---------------------------------------------------------
   pct_1 |    15    20.19   18.39        -7.46    -59.32 to  44.40
   pct_2 |    15    33.52   35.62         9.20    -46.78 to  65.17
   pct_3 |    15    36.47   57.85        12.89    -90.12 to 115.89
   pct_4 |    15    15.96   11.46       -12.76    -55.15 to  29.63
   pct_5 |    15    24.67   11.49        -1.86    -34.88 to  31.15
Figure 4. Modified Bland–Altman plots for the eyesight data.
Factor analysis
Principal component factor analysis finds linear combinations of the variables. The first accounts for the largest possible
proportion of the total variation. Later factors account for as much as possible of what is left. Correlations, not covariances, are
used. Effectively, each variable is standardized to have mean zero and variance one. This gives each the same importance in
determining the factors.
In a factor analysis, the first factor should be a good measure of the truth. If some methods are measuring the wrong thing,
their errors will be correlated. This confounder will tend to appear in secondary, orthogonal factors not in the main measure.
Correlations of each measure with the principal factor are a useful measure of which is most predictive. Significance tests are
not available.
Because the variables are first standardized, factor analysis is not affected by calibration problems of the form mi = ai + bi t + ei.
If there is a standard scale (as with the blood iron), this may be a problem. If not (as with the eyesight data), it may be a bonus.
As an example, consider the eyesight data.
. factor pct_*
(obs=15)
(principal factors; 2 factors retained)
Factor    Eigenvalue   Difference   Proportion   Cumulative
-------------------------------------------------------------
   1         2.24432      1.81878       0.9872       0.9872
   2         0.42554      0.48863       0.1872       1.1743
   3        -0.06309      0.08703      -0.0277       1.1466
   4        -0.15012      0.03304      -0.0660       1.0806
   5        -0.18316            .      -0.0806       1.0000
Factor Loadings
Variable |        1          2   Uniqueness
---------+----------------------------------
   pct_1 |  0.34981    0.39493      0.72166
   pct_2 |  0.81906    0.25653      0.26332
   pct_3 |  0.65937   -0.26935      0.49268
   pct_4 |  0.52325   -0.36159      0.59546
   pct_5 |  0.86169    0.02151      0.25702
. score pct_fac
(based on unrotated factors)
(1 scoring not used)
Scoring Coefficients
Variable |        1
---------+-----------
   pct_1 |  0.04699
   pct_2 |  0.34960
   pct_3 |  0.18434
   pct_4 |  0.12185
   pct_5 |  0.41533
. corr pct_*
(obs=15)
         |   pct_1    pct_2    pct_3    pct_4    pct_5  pct_fac
---------+------------------------------------------------------
   pct_1 |  1.0000
   pct_2 |  0.4424   1.0000
   pct_3 |  0.1321   0.4704   1.0000
   pct_4 |  0.0077   0.3370   0.5163   1.0000
   pct_5 |  0.2959   0.7727   0.5814   0.4527   1.0000
 pct_fac |  0.3803   0.8905   0.7169   0.5689   0.9369   1.0000
Modeling approaches
If we use the model mi = ai + bi t + ei, there are several possibilities, depending on the data. With repeated measures,
we could use errors-in-variables regression (Strike 1991, 1996). With data from more than two methods of measurement, either
restricted factor analysis (Dunn 1989) or multilevel modeling (Goldstein 1995) are possible. None of these are yet available in
Stata.
Conclusions
Bland–Altman plots are a simple, effective way of comparing two methods of measuring the same quantity. More obvious
methods, such as t tests, correlation, and regression can be seriously misleading.
The Stata command sdtest is not appropriate for comparisons of variances with paired data, while the new command
sdpair, based on Pitman’s method, is more powerful, and gives confidence intervals for the variance ratio.
Bland–Altman plots can be generalized to handle more than two methods, while factor analysis allows comparison of each
measure with a good estimate of the truth and is not affected by calibration problems.
References
Bland, J. M. and D. G. Altman. 1983. Measurement in medicine: The analysis of method comparison studies. Statistician 32: 307–317.
Dunn, G. 1989. Design and Analysis of Reliability Studies . London: Edward Arnold.
Goldstein, H. 1995. Multilevel Statistical Models . 2d ed. New York: Halstead.
Snedecor, G. W. and W. G. Cochran. 1967. Statistical Methods. 6th ed. Ames, IA: Iowa State University Press.
Strike, P. 1991. Statistical Methods in Laboratory Medicine . Oxford: Butterworth.
——. 1996. Measurement in Laboratory Medicine. A Primer on Control and Interpretation . Oxford: Butterworth.
sbe34
Loglinear modeling using iterative proportional fitting
Adrian Mander, MRC Biostatistics Unit, Cambridge, UK,
[email protected]
Abstract: Iterative proportional fitting is a procedure that calculates the expected frequencies within a contingency table. The
algorithm converges to maximum likelihood estimates even when the likelihood is badly behaved and is extremely fast
when the contingency table has a large number of cells.
Keywords: Loglinear modeling, contingency tables, constrained estimation.
Syntax
ipf [varlist] [weight], fit(string) [confile(filename) convars(varlist) save(filename) expect constr(string) nolog]
fweights are allowed.
Description
The iterative proportional fitting (IPF) algorithm is a simple method to calculate the expected counts of a hierarchical
loglinear model. The algorithm’s rate of convergence is first order. The more commonly used Newton–Raphson algorithm is
second order. However, each iteration of the IPF algorithm is quicker because Newton–Raphson inverts matrices. This makes the
IPF algorithm much quicker for contingency tables with numerous cells.
The IPF algorithm has the following steps:
1. Initial estimates of the expected frequencies are given. The initial estimates should have associations and interactions that
are less complex than the model being fitted. By default the initial frequencies are 1.
2. The estimates of the expected frequencies are successively adjusted by scaling factors so they match each marginal table.
3. The scaling continues until the log likelihood converges.
The algorithm always converges to the correct expected frequencies even when the likelihood is poorly behaved, for example,
when there are zero fitted counts.
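To make the scaling step concrete, one cycle for a 2 × 2 independence model can be carried out with Stata's matrix commands.
This is only a toy illustration of steps 1 and 2, not part of ipf itself, and the margins below (row totals 30 and 70, column
totals 40 and 60) are made-up numbers:

. * step 1: start from a table of 1s
. matrix E = (1,1 \ 1,1)
. * step 2: scale each row to its margin ...
. local r1 = E[1,1] + E[1,2]
. matrix E[1,1] = E[1,1]*30/`r1'
. matrix E[1,2] = E[1,2]*30/`r1'
. local r2 = E[2,1] + E[2,2]
. matrix E[2,1] = E[2,1]*70/`r2'
. matrix E[2,2] = E[2,2]*70/`r2'
. * ... then scale each column to its margin
. local c1 = E[1,1] + E[2,1]
. matrix E[1,1] = E[1,1]*40/`c1'
. matrix E[2,1] = E[2,1]*40/`c1'
. local c2 = E[1,2] + E[2,2]
. matrix E[1,2] = E[1,2]*60/`c2'
. matrix E[2,2] = E[2,2]*60/`c2'
. matrix list E

The result (first row 12 and 18, second row 28 and 42) already matches both margins; for models with interaction terms the row
and column scalings must be repeated until the log likelihood converges, which is step 3.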
The varlist defines the dimension of the contingency table that the Poisson likelihood is calculated over. If the varlist is
not specified, the variables in the fit option define the dimensions of the contingency table.
Options
fit(string) specifies the loglinear model. It requires special syntax of the form var1*var2+var3+var4. The term var1*var2
includes all the interactions between the two variables and also the main effects of var1 and var2. The main effects var3
and var4 are also included in the model but no interactions. This syntax is used in most books on loglinear modeling.
confile(filename) specifies a .dta file that contains initial values for the expected counts; the variable containing the frequencies
must be called Efreqold. Any missing values in this file will be replaced by 1. This option requires the use of the option
convars.
convars(varlist) specifies the variables in the file specified by confile, excluding Efreqold. This varlist may be a subset of
the variables in the model. All cells not given an initial expected frequency will have an initial value of 1.
save(filename) specifies that the expected frequencies, observed frequencies, and estimated probabilities for every cell be saved
in a .dta file.
expect specifies that the expected frequencies are displayed.
constr(string) specifies initial values for the expected frequencies. The syntax requires a condition in square brackets followed
by a value for the expected frequency. Hence [sex=="male"]2 sets all initial values for males to 2.
nolog suppresses display of the log likelihood at each iteration.
Examples
To illustrate the command, data have been taken from Agresti (1990, 308).
. use fish
. describe
Contains data from fish.dta
  obs:            56
 vars:             5                            23 Nov 1999 15:06
 size:         1,344 (99.8% of memory free)
-------------------------------------------------------------------------------
  1. lake      float   %9.0g       l
  2. gender    float   %9.0g       g
  3. size      float   %9.0g       s
  4. food      float   %12.0g      f
  5. freq      float   %9.0g
-------------------------------------------------------------------------------
Sorted by:
and we reconstruct the table on page 309 of Agresti (1990) via the IPF algorithm:
Table 1. Goodness of fit of models
Model                                                             G2          X2    df
----------------------------------------------------------------------------------------
(1)  food + lake*size*gender                               116.76114   106.49216    60
(2)  food*gender + lake*size*gender                        114.65707   101.24765    56
(3)  food*size + lake*size*gender                          101.61156   86.887138    56
(4)  food*lake + lake*size*gender                          73.565895   79.579025    48
(5)  food*lake + food*size + lake*size*gender              52.478477   58.016632    44
(6)  food*lake + food*size + food*gender + lake*size*gender  50.263695  52.566868   40
In Table 2, we collapse the information in Table 1 over gender.
Table 2. Goodness of fit of models for a table collapsed over gender
Model                                             G2          X2    df
------------------------------------------------------------------------
(7)   food + lake*size                      81.36248   73.059517    28
(8)   food*size + lake*size                66.212906    54.29039    24
(9)   food*lake + lake*size                38.167236   32.742958    16
(10)  food*lake + food*size + lake*size    17.079826   15.043343    12
The study is about the factors that influence the primary food choice of alligators. The response variable is the food, and
the choices are subclassified by size of alligator, gender of alligator, and one of four lakes the alligators are sampled from. There
were 219 alligators distributed over 80 possible cells. As the data are sparse, the likelihood-ratio test (G2) and the Pearson X2 test
are not reliable, but comparisons between nested models can be made using differences in G2. Let F = food, L = lake, G = gender,
and S = size, and use the shorthand G2[(F,LGS)|(FG,LGS)] = 2.1 and G2[(FS,FL,LGS)|(FG,FL,FS,LGS)] = 2.2 for the
comparisons of models (1) and (2) and of models (5) and (6), respectively. Both tests are based on 4 degrees of freedom, suggesting
that the table should be collapsed over gender. From the collapsed table, it is clear that both lake and size have effects on the
food choice of the alligator.
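The two comparisons can be verified directly from the G2 and df columns of Table 1, using Stata as a calculator (chi2tail
gives the upper-tail probability):

. display "G2[(1)|(2)] = " 116.76114 - 114.65707 "   P = " chi2tail(4, 116.76114 - 114.65707)
. display "G2[(5)|(6)] = " 52.478477 - 50.263695 "   P = " chi2tail(4, 52.478477 - 50.263695)

Both P-values are around 0.7, which is why the gender terms can safely be dropped.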
Constrained estimation
Constrained estimation can be implemented by selecting appropriate models and initial expected frequencies. This will be
illustrated using a case–control study. Let the variables E and D be exposure and disease (both variables are binary; cases and
exposed subjects are defined by D = 1 and E = 1, respectively). The command that fits a model of independence of disease and exposure is
. ipf [fw=freq], fit(E+D) exp
D   E        Efreq   Ofreq       prob
0   0    13.962963      16   .2585734
0   1    15.037037      13   .2784636
1   0    12.037037      10   .2229081
1   1    12.962963      15   .2400549
This model constrains the odds ratio to be 1. To constrain the odds ratio to equal 2 requires the initial expected frequency in
either the cell (0,0) or the cell (1,1) for (D,E) to equal 2. The simplest way to alter one cell’s initial expected frequency is by
using the constr option.
. ipf [fw=freq], fit( D + E) constr( [D==0 & E==0]2 ) exp
D   E        Efreq   Ofreq       prob
0   0    16.260628      16   .3011227
0   1    12.739385      13   .2359145
1   0    9.7393703      10   .1803587
1   1    15.260615      15    .282604
An alternative method uses convars and confile. First, create a file of initial values for the table and save it as constr.dta,
making sure that it is sorted on D and E. The ipf command will merge this file with the main dataset. Any cells that have no
initial frequency after the merge will not be constrained.
. list
        D   E   Efreqold
  1.    0   0          2
  2.    0   1          1
  3.    1   0          1
  4.    1   1          1
The model fit using the constraint file is shown below. Note that all the variables of the constraint file must be specified in
the convars option.
. ipf [fw=freq], fit( D + E) convars(D E) confile(constr) exp
D   E        Efreq   Ofreq       prob
0   0    16.260628      16   .3011227
0   1    12.739385      13   .2359145
1   0    9.7393703      10   .1803587
1   1    15.260615      15    .282604
Partial constraints in a marginal table
For illustration purposes, the variables D and E are extended to include one extra category each, labeled 2. The basic fit is
now given below.
. ipf [fw=freq], fit( D + E) exp
D   E        Efreq   Ofreq       prob
0   0     11.79661      14   .1999426
0   1    7.8644066       2    .133295
0   2    9.3389826      13   .1582879
1   0     8.949152       9   .1516806
1   1    5.9661016      11   .1011204
1   2    7.0847454       2   .1200804
2   0    3.2542372       1   .0551566
2   1    2.1694915       3    .036771
2   2    2.5762711       4   .0436656
The same constr.dta file as used previously gives the following output.
. ipf [fw=freq], fit( D + E) convars(D E) confile(constr) exp
D   E        Efreq   Ofreq       prob
0   0    11.611365      14   .1968022
0   1    4.3895254       2   .0743985
0   2           13      13   .2203383
1   0    11.388637       9   .1930272
1   1    8.6106501      11   .1459428
1   2            2       2   .0338982
2   0            1       1   .0169491
2   1            3       3   .0508473
2   2            4       4   .0677964
Observe that the initial values are missing for all cells except the top left 2 × 2 table. Hence this table is partially constrained to
have an odds ratio of 2 in the top left part of the table, but the rest of the table is unconstrained. Note that the partial constraints
are a subset of the marginal table defined by the varlist in the convars option; thus, in this example, the model being fit is
actually D*E with the partially constrained odds ratio of 2. If the constr.dta file contained only missing values, then this would
be equivalent to fitting the model D*E.
References
Agresti, A. 1990. Categorical Data Analysis. New York: John Wiley & Sons.
sg135
Test for autoregressive conditional heteroskedasticity in regression error distribution
Christopher F. Baum, Boston College,
[email protected]
Vince Wiggins, Stata Corporation,
[email protected]
Abstract: Implements Engle’s (1982) test for autoregressive conditional heteroskedasticity (ARCH) in a time-series linear regression
model.
Keywords: Conditional heteroskedasticity, ARCH, Engle.
Syntax
archlm [if exp] [in range] [, lags(numlist) nosample]
Description
Consider a regression of a time series of T values of a response yt on a regressor matrix X . The errors in this regression
model may be unconditionally homoskedastic and independently distributed, satisfying the assumptions for the application of
ordinary least squares estimation, but their distribution may exhibit autoregressive conditional heteroskedasticity (ARCH), as
defined by Engle (1982).
archlm computes Engle's Lagrange multiplier test for ARCH(p), that is, a test of the null hypothesis of no ARCH effects up to
and including order p, in a time-series model. See Davidson and MacKinnon (1993, 557).
This command is to be used after regress. The test is for use with time-series data; you must tsset your data before
using these tests. The command displays the test statistic, degrees of freedom and p-value, and places values in the return
array. Type return list to see such values.
Options
lags(numlist) specifies the lag order(s) to be tested by archlm. Test results will then be produced for each specified lag order
in numlist. By default, archlm will use p = 1, that is, a single lag.
nosample indicates that the test be performed on either all observations or all observations included in archlm’s if and in
conditions if specified. By default, archlm includes only observations from the estimation sample.
Remarks
The ARCH Lagrange multiplier test is a general test of the null hypothesis that the regression errors $\epsilon_t$ are not conditionally
heteroskedastic, versus the alternative that their distribution involves a pth-order ARCH process:

$$H_1:\ \sigma_t^2 = \gamma_0 + \gamma_1 \epsilon_{t-1}^2 + \gamma_2 \epsilon_{t-2}^2 + \cdots + \gamma_p \epsilon_{t-p}^2$$

Under the null hypothesis, all of the slope coefficients, $\gamma_1$ through $\gamma_p$, are zero. As Engle (1982) first showed, this hypothesis
may be tested by regressing the squares of the regression residuals on a constant and p lagged values of the squared residuals.
Under the null hypothesis, T times the centered $R^2$ from this regression will be distributed $\chi^2(p)$, where T is the sample size
and p is the number of lagged residual vectors included in the regression. If rejections are encountered, Stata's arch command
may be used to estimate variations of the ARCH model.
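The mechanics can be reproduced by hand. The sketch below does so for p = 1; y and x are hypothetical variables, the data
are assumed to be tsset, and archlm automates the bookkeeping and handles multiple lag orders:

. regress y x
. predict double e, resid
. generate double e2 = e^2
. * regress the squared residuals on one lag of themselves
. regress e2 L.e2
. * the LM statistic is T times the R-squared of this regression
. display "LM = " e(N)*e(r2) "   P-value = " chi2tail(1, e(N)*e(r2))

Here e(N) is the sample size of the auxiliary regression, which loses one observation to the lag.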
Examples
We access the Klein (1950) Model I data, used as an example in the discussion of Stata's reg3 command, via net-aware
Stata,
. do http://fmwww.bc.edu/RePEc/bocode/k/klein.do
. tsset year, yearly
. regress consump wagegovt
(output omitted )
. archlm, lags(1 2 3 4)
ARCH LM test statistic, order( 1):  5.542637   Chi-sq( 1)   P-value = .0186
ARCH LM test statistic, order( 2):  9.431075   Chi-sq( 2)   P-value = .009
ARCH LM test statistic, order( 3):  9.039037   Chi-sq( 3)   P-value = .0288
ARCH LM test statistic, order( 4):  8.787176   Chi-sq( 4)   P-value = .0666
Consumption is regressed on the government wage bill. The tests for ARCH(p) effects for orders 1, 2, 3 and 4 each reject the
null hypothesis of no ARCH effects at stronger than the 10% level of significance. As Davidson and MacKinnon stress (1993,
557), such a finding may or may not indicate the presence of conditional heteroskedasticity; it may also point to other forms of
misspecification.
References
Davidson, R. and J. MacKinnon. 1993. Estimation and Inference in Econometrics . New York: Oxford University Press.
Engle, R. 1982. Autoregressive conditional heteroskedasticity with estimates of the variance of United Kingdom inflation. Econometrica 50: 987–1007.
Klein, L. 1950. Economic fluctuations in the United States 1921–1941 . New York: John Wiley & Sons.
sg136
Tests for serial correlation in regression error distribution
Christopher F. Baum, Boston College,
[email protected]
Vince Wiggins, Stata Corporation,
[email protected]
Abstract: Implements Durbin’s (1970) h test and Breusch (1978) and Godfrey’s (1978) tests for autocorrelation in the disturbances.
Both tests are valid in the presence of stochastic regressors, including lagged dependent variables. The h test is strictly for
first-order autocorrelation whereas the Breusch–Godfrey test is applicable to autocorrelation or moving average of arbitrary
degree.
Keywords: Autocorrelation, moving average, Durbin, Breusch, Godfrey, stochastic regressor, lagged dependent variable.
Syntax
durbinh
bgtest [, lags(p)]
Both commands are to be used after regress; see [R] regress. Both tests are for use with time-series data. You must tsset
your data before using these tests; see [R] tsset.
Description
Consider a regression of a time series of T values of a response yt on a regressor matrix X , possibly including one or
more lagged values of the response variable. For ordinary least squares (OLS) to be the appropriate estimator, the error process
t should be independently and identically distributed. In the context of time-series data, serial correlation is often encountered,
violating the distributional assumptions on the error process. If lagged dependent variables are included in the regressor matrix,
alternative tests of those distributional assumptions are required.
durbinh computes a form of the Durbin h test (1970) for first-order serial correlation in a model containing a lagged
dependent variable among the regressors. In that context, the commonly applied Durbin–Watson test (see dwstat) is biased
toward acceptance of the null hypothesis of zero autocorrelation. The Durbin h test provides a consistent estimate of the
first-order autocorrelation coefficient $\rho$ in the AR(1) process $\epsilon_t = \rho \epsilon_{t-1} + \nu_t$ when the regressors include $y_{t-1}$. See Davidson
and MacKinnon (1993, 357–364) for details.
bgtest computes the Breusch–Godfrey Lagrange multiplier test (Breusch 1978, Godfrey 1978) for nonindependence in the
error distribution, conditional on the lag order p. The test’s null hypothesis of independence in the error distribution has “locally
equivalent” alternatives (Godfrey and Wickens 1982) of either AR(p) or MA(p): that is, a pth-order autoregressive or moving
average process. The test statistic, a $TR^2$ Lagrange multiplier measure, is distributed $\chi^2(p)$ under the null hypothesis. The test
is asymptotically equivalent to the Box–Pierce or Ljung–Box portmanteau tests (the Q statistic implemented in the wntestq
command) for p lags. Unlike either form of the Q statistic, the Breusch–Godfrey test is valid in the presence of stochastic
regressors such as lagged values of the dependent variable.
Both commands display the test statistic, degrees of freedom and p-value, and save results in r(); see [R] saved results.
Type return list to see such values.
The Breusch–Godfrey test for p = 1 is asymptotically equivalent to the Durbin h test. The Durbin h test statistic is presented
as a Student-t test with one degree of freedom.
Options
lags(p) specifies that an autoregressive or moving average process of order p for the regression errors is to be tested. This
option only applies to bgtest. bgtest by default will use only a single lag. A greater number of lagged values may be
included in the test via the lags option.
Stata Technical Bulletin
15
Remarks
The Breusch–Godfrey test is a general test of the null hypothesis that the regression errors $\epsilon_t$ are independently distributed,
versus the alternative that their distribution involves a pth-order process:

$$H_1:\ \epsilon_t = \mathrm{AR}(p) \quad \text{or} \quad \epsilon_t = \mathrm{MA}(p)$$

where AR(p) denotes the pth-order autoregressive process, and MA(p) denotes the pth-order moving average process. The test
statistic is computed from the regression of the least squares residuals $e_t$ on the full matrix of regressors, X, and p lags of the
residuals. Under the null hypothesis, T times the uncentered $R^2$ from this regression will be distributed $\chi^2(p)$, where T is the
sample size and p is the number of lagged residual vectors included in the regression. A rejection of the null hypothesis implies
that the errors are distributed as AR(p) or MA(p). The indeterminacy arises from the equivalence of the derivatives of these two
models when evaluated under the null hypothesis; in Godfrey and Wickens (1982) terms, they are locally equivalent alternatives
under the null hypothesis.
The Durbin h test is a special case of the Breusch–Godfrey test where p = 1. Textbook discussions of this test often provide
an alternative formula which can be problematic due to the square root of a potentially negative quantity. The Breusch–Godfrey
form of the test may always be computed, and is asymptotically equivalent.
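As with archlm in sg135, the mechanics can be reproduced by hand. The sketch below computes the p = 1 statistic; y and x
are hypothetical variables, the data are assumed to be tsset, and the centered R-squared is used as a close stand-in for the
uncentered version since the residuals have mean near zero:

. regress y x L.y
. predict double e, resid
. * regress the residuals on X and one lagged residual
. regress e x L.y L.e
. display "LM = " e(N)*e(r2) "   P-value = " chi2tail(1, e(N)*e(r2))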
Examples
We access the Klein (1950) Model I data used as an example in Stata’s discussion of the reg3 command via net-aware
Stata.
. do http://fmwww.bc.edu/RePEc/bocode/k/klein.do
. tsset year, yearly
. regress consump wagegovt L.consump
(output omitted )
. durbinh
Durbin-Watson h-statistic:     .7848839   t = 3.401193   P-value = .0037
. bgtest
Breusch-Godfrey LM statistic:  8.393221   Chi-sq( 1)     P-value = .0038
. bgtest, lags(2)
Breusch-Godfrey LM statistic:  7.866155   Chi-sq( 2)     P-value = .0196
Consumption is regressed on the government wage bill and lagged consumption.
The presence of the lagged dependent variable necessitates the use of the Durbin h or Breusch–Godfrey tests. Both tests
overwhelmingly reject the null hypothesis of independent errors, as does the Breusch–Godfrey test with two lags (an alternative
hypothesis of AR(2) or MA(2) in the error distribution).
References
Breusch, T. 1978. Testing for autocorrelation in dynamic linear models. Australian Economic Papers 17: 334–355.
Davidson, R. and J. MacKinnon. 1993. Estimation and Inference in Econometrics . New York: Oxford University Press.
Durbin, J. 1970. Testing for serial correlation in least-squares regression when some of the regressors are lagged dependent variables. Econometrica
38: 410–421.
Godfrey, L. 1978. Testing against general autoregressive and moving average error models when the regressors include lagged dependent variables.
Econometrica 46: 1293–1301.
Godfrey, L. and M. Wickens. 1982. Tests of misspecification using locally equivalent alternative models. In Evaluating the Reliability of Econometric
Models , eds. G. Chow and P. Corsi, 71–99. New York: John Wiley & Sons.
Klein, L. 1950. Economic fluctuations in the United States 1921–1941 . New York: John Wiley & Sons.
sg137
Tests for heteroskedasticity in regression error distribution
Christopher F. Baum, Boston College,
[email protected]
Nicholas J. Cox, University of Durham, UK,
[email protected]
Vince Wiggins, Stata Corporation,
[email protected]
Abstract: Implements commands to perform White’s (1980) general test for heteroskedasticity and Breusch and Pagan’s (1979)
LM test for heteroskedasticity with respect to a specified set of variables. Both tests are for linear regression models.
Keywords: Heteroskedasticity, heteroskedastic, White, Breusch–Pagan.
Syntax
whitetst [if exp] [in range] [, nosample]
bpagan varlist [if exp] [in range]
Description
Consider a regression of n values of a response on a regressor matrix
X including p nonconstant regressors.
whitetst computes the White (1980) general test for heteroskedasticity in the error distribution by regressing the squared
residuals on all distinct regressors, and their squares and cross-products. The test statistic, a Lagrange multiplier measure, is
distributed as $\chi^2(p)$ under the null hypothesis of homoskedasticity. See Greene (2000, 507–511).
bpagan computes the Breusch–Pagan (1979) Lagrange multiplier test for heteroskedasticity in the error distribution,
conditional on a set of variables which are presumed to influence the error variance. The test statistic, a Lagrange multiplier
measure, is distributed as $\chi^2(p)$ under the null hypothesis of homoskedasticity.
Both commands are to be used after regress. Both commands display the test statistic, degrees of freedom and p-value,
and return results in r(). Type return list to see such values.
The Breusch–Pagan test is asymptotically equivalent to White’s (1980) general test for heteroskedasticity performed by
whitetst if the same auxiliary variables are specified (for White’s test, all distinct regressors, and their squares and crossproducts). This test should not be confused with another Breusch–Pagan test implemented in Stata’s mvreg for the independence
of error vectors in a multivariate setting.
Options
nosample when specified with whitetst indicates that the test be performed on either all observations or all observations
included in whitetst’s if and in conditions if specified. By default, whitetst includes only observations from the
estimation sample.
Remarks
Both these tests are general tests of heteroskedasticity which allow the researcher to take advantage of the consistency of
the least squares point estimates of the coefficient vector, even in the presence of heteroskedasticity. This implies that the least
squares residuals may be used to construct a test to detect heteroskedastic behavior in the true disturbances.
The White test may be described as a general test of the null hypothesis

$$H_0:\ \sigma_i^2 = \sigma^2 \quad \text{for all } i$$

If the null hypothesis is satisfied, the appropriate covariance matrix for the least squares coefficients will be the conventional
estimator, which is based on the correct estimated covariance matrix of the least squares estimator

$$V = s^2 (X'X)^{-1}$$

If the null hypothesis is not appropriate, the correct covariance matrix will be

$$V = s^2 (X'X)^{-1} \left[ X' \Omega X \right] (X'X)^{-1}$$

where $\Omega$ is a diagonal matrix containing $\sigma_i^2$ on the diagonal. $V$ may be consistently estimated by

$$\widehat{V} = s^2 (X'X)^{-1} \left[ \sum_{i=1}^{n} e_i^2 x_i x_i' \right] (X'X)^{-1}$$

where $e_i$ are the least squares residuals and $x_i$ is the $i$th row of the regressor matrix. This is the variance estimated by regress
when the robust option is specified. The two estimates of the covariance matrix will differ if the null hypothesis is not supported
by the data. White's test takes advantage of this difference. It is computed as $nR^2$ in the regression of $e_i^2$, the squared residuals,
on a constant and all unique variables in $X \otimes X$. The statistic is asymptotically distributed as $\chi^2(p)$ where $p$ is the number of
nonconstant regressors in the equation.
Although the White test is extremely general, this is also its weakness. A rejection may reveal heteroskedasticity, but it may
also identify some form of misspecification, such as the exclusion of relevant variables from the equation. It is a nonconstructive
test, in that a rejection does not provide a suggested remedy.
Stata Technical Bulletin
17
The Breusch–Pagan test is a more specific test in which the null hypothesis may be specified as

$$H_0:\ \sigma_i^2 = \sigma^2 f(\alpha_0 + \alpha' z_i)$$

where $z_i$ is a set of independent variables. The model is homoskedastic if $\alpha = 0$. Like the White test, the test produces a
Lagrange multiplier statistic, one-half the explained sum of squares in the regression of $e_i^2 / (e'e/n)$ on $z_i$. Under the null
hypothesis, this statistic is asymptotically distributed as $\chi^2(p)$ where $p$ is the number of variables in $z$.
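The formula translates directly into a hand computation. The sketch below uses the auto data and the same z variables as the
example that follows:

. regress price mpg weight length
. predict double e, resid
. * scale the squared residuals by e'e/n
. generate double g = e^2/(e(rss)/e(N))
. generate gpm = 1/mpg
. regress g mpg gpm
. * the statistic is one-half the explained (model) sum of squares
. display "LM = " e(mss)/2 "   P-value = " chi2tail(2, e(mss)/2)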
Examples
With Stata’s auto data read in,
. regress price mpg weight length
. whitetst
White's general test statistic :  39.59324   Chi-sq( 9)   P-value = 9.0e-06
The nine degrees of freedom for this test statistic correspond to the three regressors, mpg, weight, length, their squares, and
their three unique crossproducts. The small p-value indicates that the null hypothesis of homoskedasticity is overwhelmingly
rejected.
. gen gpm=1/mpg
. regress price mpg weight length
. bpagan mpg gpm
Breusch-Pagan LM statistic:  6.75232   Chi-sq( 2)   P-value = .0342
The two degrees of freedom for the test statistic correspond to the two variables, mpg and gpm, given on the bpagan command.
The p-value indicates that the null hypothesis of homoskedasticity of the errors may be rejected at stronger than the 5% level
of significance.
Note on authorship
whitetst was authored by Baum and Cox; the code was much improved by the availability of rmcoll (documented
online in Stata updated after 28 September 1999). bpagan was authored by Baum and Wiggins.
References
Breusch, T. and A. Pagan. 1979. A simple test for heteroskedasticity and random coefficient variation. Econometrica 47: 1287–1294.
Greene, W. 2000. Econometric Analysis . 4th ed. Upper Saddle River, NJ: Prentice–Hall.
White, H. 1980. A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 48: 817–838.
sg138
Bootstrap inferences about measures of correlation
Dan J. Neal, Syracuse University,
[email protected]
Abstract: This insert presents bootcor, a program that allows researchers to compare the strength of correlation coefficients
in cases where Fisher r -to-z confidence intervals may be inaccurate. bootcor uses bootstrapping to compare Pearson’s
R, intraclass correlations, and concordance coefficients. Results allow the researcher to obtain confidence intervals for the
parameter estimates and a z -score and p-value for the difference of the correlations.
Keywords: Pearson’s
R, intraclass correlation, concordance coefficient, bootstrapping.
Syntax
bootcor var1 var2 var3 [var4] [if exp] [in range] [, reps(#) stat(pearson | icc | concord) level(#) saving(newfile)]
Introduction
Applied researchers are often interested in comparing the relative strength of association between different variables. The
standard approach used in these situations is to compute correlations, use the Fisher r -to-z transformation on two of the correlation
coefficients, and then compute a standard error for the difference of these transforms. A simple z -test is then used to infer whether
there is a difference between the two correlations. Additionally, confidence intervals can be constructed around the parameter
estimates for each correlation coefficient.
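For two correlations computed from independent samples, the standard calculation is simple enough to do by hand; the values
below (r1 = 0.65 and r2 = 0.50, each from a sample of 82) are hypothetical:

. * Fisher r-to-z transforms of the two correlations
. local z1 = 0.5*ln((1 + 0.65)/(1 - 0.65))
. local z2 = 0.5*ln((1 + 0.50)/(1 - 0.50))
. * standard error of the difference for independent samples
. local se = sqrt(1/(82 - 3) + 1/(82 - 3))
. local z = (`z1' - `z2')/`se'
. display "z = " `z' "   p = " 2*(1 - normprob(abs(`z')))

bootcor replaces the analytic standard error in this calculation with the bootstrap distribution of the difference of the z-transforms,
which is what makes it usable when the correlations are not independent.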
There are drawbacks to the Fisher r -to-z technique. One drawback is the assumption that the original data are distributed
bivariate normal. In applied research, this is rarely the case, and when the assumption of bivariate normality breaks down,
confidence intervals and inferences about correlations can be inaccurate. A second drawback, and one that is much more
problematic, is that the researcher often wants to compare correlations calculated from the same sample of observations, that is,
elements of a correlation matrix. Such coefficients are not independent of each other, and therefore formulas for the standard
error of the difference in z -transforms may not be readily available.
This insert presents bootcor, a program that uses bootstrapping (Efron and Tibshirani 1993) to make more accurate
inferences about the difference of correlation coefficients. bootcor creates a user-specified number of bootstrap resamples of
the dataset, and computes the two correlation coefficients being compared for each resample. These two correlation coefficients
are then r -to-z transformed (to improve the symmetry of the distributions) and a difference score is calculated. A z -test is used
on the distribution of difference scores.
bootcor can make inferences about Pearson product-moment correlation coefficients, intraclass correlation coefficients,
and concordance coefficients. The user can specify three or four variables. If three variables are selected, a comparison is made
between r(var1,var2) and r(var1,var3). If four variables are selected, a comparison is made between r(var1,var2) and
r(var3,var4).
Options
B to compute. The default value of B is 50. It
B be at least 1000 for adequate accuracy when estimating percentiles of sampling distributions.
reps(#) allows the user to specify how many bootstrap replications
recommended that
is
stat(pearson | icc | concord) specifies which measure of correlation should be used in the comparison. The Pearson
product-moment correlation coefficient (pearson) is the default setting. The user can also choose from two other measures
of agreement: the intraclass correlation coefficient (icc) or the concordance coefficient (concord). The user does not need
to have additional commands installed for computing intraclass or concordance coefficients.
level(#) allows the user to specify the level of confidence for the individual correlation coefficients. Level can range from 1
to 99.9. The default is 95.
saving(newfile) will export the bootstrap replications to a .dct file that the user can later analyze in more detail. Five variables
are saved, with each resample listed casewise: r_boot1 is r1, r_boot2 is r2, z_boot1 is the Fisher-transformed value of
r_boot1, z_boot2 is the Fisher-transformed value of r_boot2, and z_bootd is the difference of z_boot1 and z_boot2.
The user can load this file into Stata using the command
. infile using newfile.dct
Examples
The following examples are demonstrated on a subset of a dataset of alcohol-related measures in college students.
Data were collected at two times, within one week of each other. The dataset is called bootcor.dta and is provided on the
STB diskette.
. use bootcor.dta, clear
. describe
Contains data from bootcor.dta
  obs:            82
 vars:             8                            18 May 1999 23:35
 size:         1,476 (99.1% of memory free)
-------------------------------------------------------------------------------
  1. ads1      byte    %8.0g       Alcohol Dependence Time 1
  2. ads2      byte    %8.0g       Alcohol Dependence Time 2
  3. rapiy1    byte    %8.0g       Alcohol Related Problems in the Last Year Time 1
  4. rapim1    byte    %8.0g       Alcohol Related Problems in the Last Month Time 1
  5. rapiy2    byte    %8.0g       Alcohol Related Problems in the Last Year Time 2
  6. rapim2    byte    %8.0g       Alcohol Related Problems in the Last Month Time 2
  7. bac1      float   %9.0g       Peak BAC Time 1
  8. bac2      float   %9.0g       Peak BAC Time 2
-------------------------------------------------------------------------------
Sorted by:
. summarize
Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+-----------------------------------------------------
    ads1 |      82    6.719512    3.976084         0         18
    ads2 |      82     6.02439      3.6106         1         15
  rapiy1 |      82    5.987805    6.621129         0         37
  rapim1 |      82    1.256098    1.929696         0          9
  rapiy2 |      82    5.512195    6.070052         0         29
  rapim2 |      82    1.207317    2.083076         0          9
    bac1 |      82    .0771341    .0663231         0       .268
    bac2 |      82    .0782512    .0595403         0       .236
Comparing the test-retest reliabilities of measures
In this first analysis, it is of interest whether the intraclass test-retest correlation coefficients of the two measures of
alcohol-related problems are equal. In other words, is there any difference in the reliability of estimates of the number of alcohol
problems in the past month versus problems in the last year?
. bootcor rapiy1 rapiy2 rapim1 rapim2, r(1000) stat(icc) level(90)
Results of Bootstrap Comparison of Intraclass Correlation
------------------------------------------------------------------------
Bootstrap Replications: 1000                     Observations: 82
------------------------------------------------------------------------
Variables            Observed   Bootstrap Mean(R)   [     90% CI     ]
rapiy1 & rapiy2        0.901          0.900            0.865    0.926
rapim1 & rapim2        0.676          0.667            0.507    0.783
------------------------------------------------------------------------
Z-score of Fisher R-to-Z Difference: 4.671
P-Value: 0.000
------------------------------------------------------------------------
The results of this bootstrap comparison yield a highly significant result, with z = 4.671. We would reject the null hypothesis that these two assessments have the same test-retest reliability; it appears that people are reliably better at reporting alcohol-related problems over the past year than in the past month. Also of interest are the confidence intervals for the two parameter estimates. The 90% confidence intervals for the parameters rho(rapiy1,rapiy2) and rho(rapim1,rapim2) are listed above.
In the second analysis, the question of interest is whether the strength of the relationship between peak blood alcohol content
and alcohol-related problems is the same as peak blood alcohol content and alcohol dependence symptoms.
. bootcor bac1 rapim1 ads1, reps(1000) level(90)
(0 observations deleted)
Results of Bootstrap Comparison of Pearson's R
------------------------------------------------------------------------
Bootstrap Replications: 1000                     Observations: 82
------------------------------------------------------------------------
Variables            Observed   Bootstrap Mean(R)   [     90% CI     ]
bac1 & rapim1          0.652          0.655            0.504    0.766
bac1 & ads1            0.502          0.502            0.356    0.624
------------------------------------------------------------------------
Z-score of Fisher R-to-Z Difference: 1.577
P-Value: 0.115
------------------------------------------------------------------------
The results of this bootstrap comparison yield a nonsignificant result, with z = 1.577 and p = .115. The 90% confidence intervals for the parameters rho(bac1,rapim1) and rho(bac1,ads1) are listed above as well.
Saved Results
bootcor saves in r():
Scalars
    r(z)         observed z-value of the mean of the difference scores
    r(p)         probability of observing a z equal to or more extreme than observed
    r(corr1)     value of r1 as calculated from the dataset
    r(bcorr1)    observed mean of the bootstrap distribution of r1
    r(bcorr1l)   lower limit of the confidence interval of r1
    r(bcorr1u)   upper limit of the confidence interval of r1
    r(corr2)     value of r2 as calculated from the dataset
    r(bcorr2)    observed mean of the bootstrap distribution of r2
    r(bcorr2l)   lower limit of the confidence interval of r2
    r(bcorr2u)   upper limit of the confidence interval of r2
    r(bse1)      standard error of the bootstrap distribution of r1
    r(bse2)      standard error of the bootstrap distribution of r2
    r(bsed)      standard error of the bootstrap distribution of difference scores
Acknowledgment
The author thanks Elizabeth T. Miller for providing the dataset used in the examples.
References
Efron, B. and R. J. Tibshirani. 1993. An Introduction to the Bootstrap. New York: Chapman & Hall.
sg139
Logistic regression when binary outcome is measured with uncertainty
Mario Cleves, Stata Corporation,
[email protected]
Alberto Tosetto, S. Bortolo Hospital, Vicenza, Italy,
[email protected]
Abstract: Traditional logit or logistic regression assumes that the outcome variable is measured without error. In some studies,
however, the outcome variable is measured with imperfect sensitivity and specificity. It is known that the resulting
misclassification will lead to biased parameter point estimates and variances. In this insert we implement an EM algorithm
suggested by Magder and Hughes (1997) that produces unbiased estimates of parameters and their variances.
Keywords: Logit, logistic models, sensitivity, specificity, EM algorithm, measurement error.
Syntax
logitem depvar [indepvars] [if exp] [in range], sens(sensvar|#) spec(specvar|#)
    [level(#) robust nolog noor iterate(#) tolerance(#) ltolerance(#)]
Syntax for predict

predict [type] newvarname [if exp] [in range] [, statistic]

where statistic is

    p        probability of a positive outcome (the default)
    xb       xb, fitted values
    stdp     standard error of the prediction
    number   sequential number of the covariate pattern

Unstarred statistics are available both in and out of sample; type predict ... if e(sample) ... if wanted only for the estimation sample. Starred statistics are calculated only for the estimation sample even when if e(sample) is not specified.
Description
logitem uses an expectation-maximization (EM) algorithm to estimate a maximum-likelihood logit regression model when
the outcome variable is measured with an imperfect test of known sensitivity and specificity.
The method allows the sensitivity and specificity to vary across observations.
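For instance, in a hedged sketch with hypothetical names, a study in which each subject's test sensitivity is recorded in a variable sv while specificity is a known constant would be fit as

    . logitem y x1 x2, sens(sv) spec(.98)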
Options
sens(sensvar|#) specifies the value or the name of the sensitivity variable. Sensitivity should be between 0 and 1.
spec(specvar|#) specifies the value or the name of the specificity variable. Specificity should be between 0 and 1.
level(#) specifies the confidence level, in percent, for confidence intervals. The default is level(95) or as set by set level.
robust specifies that the Huber/White/sandwich estimator of variance is to be used in place of the traditional calculation.
nolog prevents logitem from showing the iteration log.
noor reports the estimated coefficients instead of odds ratios. This option affects how results are displayed, not how they are
estimated. noor may be specified at estimation or when redisplaying previously estimated results.
iterate(#), tolerance(#), and ltolerance(#) specify the definition of convergence.
iterate(16000) tolerance(1e-6) ltolerance(0) is the default.
Convergence is declared when

$$\mathrm{mreldif}(b_{i+1}, b_i) \le \texttt{tolerance()} \qquad \text{or} \qquad \mathrm{reldif}(\ln L(b_{i+1}), \ln L(b_i)) \le \texttt{ltolerance()}$$
for two consecutive EM steps. In addition, iteration stops when i = iterate(); in that case, results along with the message
“convergence not achieved” are presented. The return code is still set to 0.
Options for predict
p, the default, calculates the probability of a positive outcome.
xb calculates the linear prediction.
stdp calculates the standard error of the linear prediction.
number numbers the covariate patterns—observations with the same covariate pattern have the same number. Observations not
used in estimation have the prediction set to missing. The “first” covariate pattern is numbered 1, the second 2, and so on.
Remarks
Traditional logit or logistic regression assumes that the outcome variable is measured without error. In some studies, however,
the outcome variable is not measured perfectly. This can occur, for example, when using a diagnostic test having sensitivity
and/or specificity lower than 100%. The resulting misclassification can lead to bias in the estimated coefficients and related statistics (Copeland et al. 1977).
Magder and Hughes (1997) proposed an EM algorithm that incorporates the values of the sensitivity and specificity of
the classification test into the estimation of the logistic parameters. They showed that in the presence of misclassification,
their procedure produced unbiased estimates of both the coefficients and their variances. It is this EM algorithm that we have
implemented in logitem. Note that when sensitivity and specificity are both set to one, logitem and logistic produce the
same estimates.
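A quick way to verify this equivalence on the example data introduced below is to fit both models with perfect test accuracy:

    . logistic case pro
    . logitem case pro, sens(1) spec(1)    // reproduces the logistic estimates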
Examples
Tosetto et al. (1999) conducted a case–control study to determine the importance of the prothrombin gene allele G20210A as
a risk factor in venous thromboembolism (VTE). The study consisted of 116 VTE patients and 232 healthy individuals ascertained
randomly from a well defined population. For each subject in the study, they obtained information regarding previous diagnosis
of VTE using a survey tool with an estimated sensitivity of 71.3% and specificity of 98.9%.
Each subject in the study was also typed at the prothrombin locus. No homozygous carriers of the mutated allele (G20210A)
were found. Thirteen (3.7%) subjects were heterozygous for the mutation and the remaining 335 subjects did not have the
mutation.
In our data, case indicates whether the patient has been diagnosed with VTE, and pro whether the individual has the
mutation. Here are the results from logistic:
. logistic case pro

Logit estimates                                  Number of obs   =       348
                                                 LR chi2(1)      =      0.16
                                                 Prob > chi2     =    0.6926
Log likelihood = -221.42878                      Pseudo R2       =    0.0004
------------------------------------------------------------------------------
    case | Odds Ratio   Std. Err.       z     P>|z|     [95% Conf. Interval]
---------+--------------------------------------------------------------------
     pro |   1.261261    .7337818     0.399   0.690       .403264     3.94476
------------------------------------------------------------------------------
and those from logitem incorporating the sensitivity and specificity:
. logitem case pro, sens(.713) spec(.989) nolog

logistic regression when outcome is uncertain    Number of obs   =       348
                                                 LR chi2(1)      =      0.00
Log likelihood = -221.42878                      Prob > chi2     =    0.9998

------------------------------------------------------------------------------
         | Odds Ratio   Std. Err.       z     P>|z|     [95% Conf. Interval]
---------+--------------------------------------------------------------------
     pro |   1.355498    1.065479     0.387   0.699      .2904148    6.326728
------------------------------------------------------------------------------
Neither model provides evidence supporting the hypothesis of an association between the mutated allele and VTE. Note that
although the odds ratio reported by logitem is larger—further from the null—than that reported by standard logistic regression,
its standard error is larger, reflecting the added uncertainty about the outcome variable. This is a known property of this method;
namely, the EM algorithm typically produces larger odds ratios and larger variances.
Saved Results
logitem saves in e():
Scalars
    e(N)         number of observations
    e(ll)        log likelihood
    e(ll_0)      log likelihood, constant-only model
    e(df_m)      model degrees of freedom
    e(chi2)      chi-squared
    e(r2_p)      pseudo R-squared

Macros
    e(cmd)       logitem
    e(depvar)    name of dependent variable
    e(chi2type)  LR; type of model chi-squared test

Matrices
    e(b)         coefficient vector
    e(V)         variance–covariance matrix of the estimators

Functions
    e(sample)    marks estimation sample
Methods and Formulas
Let Yi = 1 if individual i truly has the outcome of interest (diseased) and 0 otherwise (nondiseased). Let Ti = 1 if individual
i is classified as having the outcome and 0 otherwise. Assume that Yi is the probability that the ith individual truly has the
condition being studied given the values of Ti and k 1 covariate vector Xi . Then if individual i is classified as having the
outcome (Ti = 1),
b
and if Ti = 0,
Yi =
b
Prob(Yi = 1jXi ; ) sensitivity
Prob(Yi = 1jXi ; ) sensitivity + Prob(Yi = 0jXi ; ) (1 ; speci city)
b
Prob(Yi = 1jXi ; ) (1 ; sensitivity)
Prob(Yi = 1jXi ; ) (1 ; sensitivity) + Prob(Yi = 0jXi ; ) speci city
Yi =
where
is a k
1 coefficient vector to be estimated, and
P
P
exp( kj=0 j Xij )
Prob(Yi = 1jXi ; ) =
1 + exp( kj=0 j Xij )
The EM algorithm begins by first setting $\beta$ to an arbitrary value and computing $\hat{Y}_i$ for each observation. This is the expectation step.
The data are then duplicated and each observation included twice, once with the outcome variable set to 1 and another with the outcome set to zero. A weighted logistic regression model is fitted with weights equal to $\hat{Y}_i$ if the outcome variable is 1 and $(1 - \hat{Y}_i)$ if it is zero. This constitutes the maximization step.

The new $\beta$'s obtained from the fitted logistic model are used to calculate new $\hat{Y}_i$'s and the process repeated until convergence is declared.
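The following do-file fragment is a minimal sketch of a single EM iteration under these formulas, assuming a 0/1 test result T, covariates x1 and x2, an observation identifier id, and constant sensitivity and specificity; the variable and macro names are illustrative, and this is not logitem's internal code.

    local sens = .713
    local spec = .989
    quietly logit T x1 x2                      // starting values for beta
    predict double p1, pr                      // Prob(Y=1 | X, beta)
    * E-step: posterior probability that the true outcome is 1
    generate double yhat = cond(T, ///
        p1*`sens' / (p1*`sens' + (1-p1)*(1-`spec')), ///
        p1*(1-`sens') / (p1*(1-`sens') + (1-p1)*`spec'))
    * M-step: duplicate the data and fit a weighted logit
    expand 2
    bysort id: generate byte y = (_n == 1)     // one copy with y=1, one with y=0
    generate double w = cond(y, yhat, 1 - yhat)
    logit y x1 x2 [iweight = w]                // new beta; repeat until convergence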
References
Copeland, K. T., H. Checkoway, A. J. McMichael, and R. H. Holbrook. 1977. Bias due to misclassification in the estimation of relative risk. American Journal of Epidemiology 105: 488–495.
Magder, L. S. and J. P. Hughes. 1997. Logistic regression when the outcome is measured with uncertainty. American Journal of Epidemiology 146:
195–203.
Tosetto, A., E. Missiaglia, M. Frezzato, and F. Rodeghiero. 1999. The VITA project: prothrombin G20210A mutation and venous thromboembolism in the general population. Thrombosis and Haemostasis 82: 1395–1398.
sg140
The Gumbel quantile plot and a test for choice of extreme models
Manuel G. Scotto, University of Lisbon,
[email protected]
Abstract: Some statistical tools for exploratory data analysis are presented. The Gumbel quantile plot is described as an informal
way to test if the Gumbel distribution provides a good fit for data. Furthermore, we include a method of statistical choice
among the three extreme value distributions.
Keywords: Generalized extreme value distribution, hypothesis testing, Gumbel quantile plot.
Syntax
gqpt varname [if exp] [in range]
Introduction
The main goal of this work is the statistical choice of extreme models. This choice is essential in applications where attention is focused on rarely occurring events, such as an annual maximal flood exceeding dykes, or a seasonal minimal temperature below the critical value for crop production. We restrict ourselves to the one-dimensional case and start with a discussion of the problem.
Let $X_1, \ldots, X_n$ be independent and identically distributed random variables with underlying marginal distribution given by

$$G(x; \lambda, \delta, \theta) = \exp\left\{-\left[1 + \theta\left(\frac{x - \lambda}{\delta}\right)\right]^{-1/\theta}\right\}, \qquad 1 + \theta\left(\frac{x - \lambda}{\delta}\right) > 0, \quad -\infty < \theta < \infty$$
which is the well-known generalized extreme value distribution (GEV). The parameters $\lambda$ and $\delta$ are the location and scale parameters, respectively, and $\theta$ is the shape parameter, which may be used to model a wide range of tail behaviors. There are three particular forms of $G$ corresponding to $\theta > 0$ (Fréchet distribution), $\theta < 0$ (Weibull distribution), and $\theta = 0$, interpreted as the limit as $\theta \to 0$ and widely called the Gumbel distribution. We use the Gumbel quantile plot (GQP) and the statistic first introduced by Gumbel and developed by Tiago de Oliveira and Gomes (1984), hereafter referred to as OG, for a quick statistical choice between the extreme models.
The quantile plot for the Gumbel distribution

Probability plotting papers are commonly used to assess, in an informal way, whether a sample comes from a particular distribution. For the Gumbel distribution, the distribution function is given by

$$\Lambda(x; \lambda, \delta) = \exp\left\{-\exp\left(-\frac{x - \lambda}{\delta}\right)\right\}, \qquad -\infty < x < \infty$$

which leads to so-called double logarithmic plotting. To this end, we first take the ordered sample $X_{1:n} \le \cdots \le X_{n:n}$ and plot $X_{i:n}$ versus $-\log(-\log(p_i))$, where $p_i = i/(n+1)$ is the classical plotting position. If the Gumbel distribution provides a good fit to our data, then the GQP should look roughly linear. Furthermore, both Fréchet and Weibull models can also be validated by
means of the GQP. If the plot has a downside concavity, we can assume a Fréchet model, whereas an upside concavity indicates a Weibull model. Finally, note that

$$-\log(-\log(p_i)) = -\frac{\lambda}{\delta} + \frac{X_{i:n}}{\delta}$$

Using linear regression, quick estimates for $\lambda$ and $\delta$ can be deduced from the slope and the intercept. Maximum likelihood estimators can be obtained by means of the gumbel command introduced in Scotto and Tobias (1998).
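As a minimal sketch of this construction (gqpt automates it; the variable name x is illustrative, and scatter is the current graph syntax):

    sort x                               // ordered sample X(1:n) <= ... <= X(n:n)
    generate double pi = _n / (_N + 1)   // classical plotting position
    generate double gq = -ln(-ln(pi))    // Gumbel reduced variate
    scatter gq x                         // roughly linear => Gumbel is plausible
    regress gq x                         // slope ~ 1/delta, intercept ~ -lambda/delta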
Statistical choice between the extreme models
Statistical choice among the extreme models gives a central and preeminent position to the Gumbel distribution, due to the simplicity of inferences associated with this distribution. We present a test for $H_0: \theta = 0$ in the GEV($\theta$) model. We consider the statistic

$$Q_n = \frac{X_{n:n} - X_{([n/2]+1):n}}{X_{([n/2]+1):n} - X_{1:n}}$$
which is location- and dispersion-parameter free. Under the validity of $H_0$, it was shown by OG that there exist $a_n > 0$ and $b_n$ such that $W_n = a_n(Q_n - b_n)$ converges in distribution to $\Lambda(\cdot)$. One choice is $a_n = \log\log n - \log\log 2$ and $b_n = (\log n + \log\log 2)/(\log\log n - \log\log 2)$. OG proposed a simple rule in order to decide among the extreme models: choose $b < 0 < a$ and decide for the Gumbel distribution when $b \le W_n \le a$, for the Fréchet distribution when $W_n > a$, and for the Weibull distribution when $W_n < b$. The values of $a$ and $b$ corresponding to the usual significance levels are given in the table below.

    alpha         b            a
    0.050     -1.561334    3.161461
    0.025     -1.719620    3.843121
    0.010     -1.893530    4.740459
    0.001     -2.222295    7.010001
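A minimal sketch of computing $Q_n$ and $W_n$ by hand for a variable x, under the choice of $a_n$ and $b_n$ given above (gqpt reports both statistics directly):

    sort x
    local n = _N
    local med = x[int(`n'/2) + 1]                     // median order statistic
    local Qn  = (x[`n'] - `med') / (`med' - x[1])
    local an  = ln(ln(`n')) - ln(ln(2))
    local bn  = (ln(`n') + ln(ln(2))) / (ln(ln(`n')) - ln(ln(2)))
    display "Qn = " `Qn' "   Wn = " `an'*(`Qn' - `bn')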
Example
We applied both the GQP and the statistical test described above to the dataset of annual maximum sea levels in Venice for the period 1931–1981 (Smith 1986).
. gqpt seal

Variable |    Delta        Lambda          Q            W
-----------------------------------------------------------------
    seal | 15.938767     96.122623     2.3608653     .41984981
-----------------------------------------------------------------

Values corresponding to the usual significance levels
-----------------------------------------------------
alpha          b            a
 .050      -1.561334    3.161461
 .025      -1.719620    3.841321
 .010      -1.893530    4.740459
 .001      -2.222951    7.010001
This gives rise to the graph in Figure 1.
[Figure 1 omitted: scatter of $-\ln(-\ln(p_i))$ versus $x_{(i:n)}$]
Figure 1. Gumbel quantile plot of annual maximum sea level in Venice, for the period 1931–1981.
References
Scotto, M. G. and A. Tobias. 1998. sg83: Parameter estimation for the Gumbel distribution. Stata Technical Bulletin 43: 32–35. Reprinted in Stata Technical Bulletin Reprints, vol. 8, pp. 133–137.
Smith, R. L. 1986. Extreme value theory based on the r largest annual events. Journal of Hydrology 86: 27–43.
Tiago de Oliveira, J. and M. I. Gomes. 1984. Two test statistics for choice of univariate extremes models. In Statistical Extremes and Applications, ed. J. Tiago de Oliveira. Dordrecht: D. Reidel.
sg141
Treatment effects model
Ronna Cong, Stata Corporation,
[email protected]
David M. Drukker, Stata Corporation,
[email protected]
Abstract: This article describes the new command treatreg and the treatment effects model that it estimates. treatreg
estimates a treatment effects model using either a two-step consistent estimator or full maximum-likelihood. The treatment
effects model considers the effect of an endogenously chosen binary treatment on another endogenous continuous variable,
conditional on two sets of independent variables. In addition to a verbal and mathematical description of the treatment
effects model and complete syntax diagram for the command, this article has several empirical examples which illustrate
how the command is used and how to interpret its output.
Keywords: Probit, endogenous treatment, simultaneous LDV models.
Syntax
Basic syntax
treatreg depvar [varlist], treat(depvars = varlists) [twostep]

Full syntax for maximum likelihood estimates only

treatreg depvar [varlist] [weight] [if exp] [in range], treat(depvars = varlists [, noconstant])
    [robust cluster(varname) hazard(newvarname) noconstant first noskip level(#)
    iterate(#) maximize_options]

Full syntax for two-step consistent estimates only

treatreg depvar [varlist] [if exp] [in range], twostep treat(depvars = varlists [, noconstant])
    [hazard(newvarname) noconstant first level(#)]
pweights, aweights, fweights, and iweights are allowed with maximum likelihood estimation; see [U] 14.1.6 weight. No weights are allowed if
twostep is specified.
treatreg shares the features of all estimation commands; see [U] 23 Estimation and post-estimation commands.
Syntax for predict

predict [type] newvarname [if exp] [in range] [, statistic]

where statistic is

    xb        xb, fitted values (the default)
    yctrt     E(yj | treatment = 1)
    ycntrt    E(yj | treatment = 0)
    ptrt      P(treatment = 1)
    xbtrt     linear prediction for treatment equation
    stdptrt   standard error of the linear prediction for treatment equation
    stdp      standard error of the prediction
    stdf      standard error of the forecast
Description
treatreg estimates a treatment effects model using either a two-step consistent estimator or full maximum likelihood.
The treatment effects model considers the effect of an endogenously chosen binary treatment on another endogenous continuous
variable, conditional on two sets of independent variables.
Options
treat(...) specifies the variables and options for the treatment equation. It is an integral part of specifying a treatment effects model and is not optional.
twostep specifies that two-step consistent estimates of the parameters, standard errors, and covariance matrix are to be produced.
robust specifies that the Huber/White/sandwich estimator of the variance is to be used in place of the conventional MLE
variance estimator. robust combined with cluster() further allows observations which are not independent within cluster
(although they must be independent between clusters).
If you specify pweights, robust is implied; see [U] 23.11 Obtaining robust variance estimates.
cluster(varname) specifies that the observations are independent across groups (clusters) but not necessarily independent within
groups. varname specifies to which group each observation belongs. cluster() affects the estimation of the variance–
covariance matrix and, thus, of the standard errors (VCE), but not the estimated coefficients. cluster() can be used with
pweights to produce estimates for unstratified cluster-sampled data.
cluster() implies robust; that is, specifying robust cluster() is equivalent to typing cluster() by itself.
hazard(newvarname) will create a new variable containing the hazard from the treatment equation. The hazard is computed
from the estimated parameters of the treatment equation.
noconstant suppresses the constant term (intercept) in the model. This option may be specified on the regression equation, the
treatment equation, or both.
first specifies that the first-step probit estimates of the treatment equation be displayed prior to estimation.
noskip specifies that a full maximum likelihood model with only a constant for the regression equation be estimated. This
model is not displayed but is used as the base model to compute a likelihood-ratio test for the model test statistic displayed
in the estimation header. By default, the overall model test statistic is an asymptotically equivalent Wald test of all the
parameters in the regression equation being zero (except the constant). For many models, this option can significantly
increase estimation time.
level(#) specifies the confidence level, in percent, for confidence intervals. The default is level(95) or as set by set level.
iterate(#) restricts the maximum number of iterations during optimization to the specified number; see [R] maximize.
iterate(0) produces two-step parameter estimates with standard errors computed from the inverse Hessian of the full
information matrix at the two-step solution for the parameters. As an alternative, the twostep option computes two-step
consistent estimates of the standard errors.
maximize options control the maximization process; see [R] maximize. You will seldom need to specify any of the maximize
options except for iterate(0) and possibly difficult. If the iteration log shows many “not concave” messages and it
is taking many iterations to converge, try the difficult option to see if that helps it to converge in fewer steps.
Options for predict
xb, the default, calculates the linear prediction xjb.
yctrt calculates the expected value of the dependent variable conditional on the presence of the treatment; E (yj j treatment = 1).
ycntrt calculates the expected value of the dependent variable conditional on the absence of the treatment; E (yj j treatment = 0).
ptrt calculates the probability of the presence of the treatment: P(treatment = 1) = Pr(wj gamma + uj > 0).
xbtrt calculates the linear prediction for the treatment equation.
stdptrt calculates the standard error of the linear prediction for the treatment equation.
stdp calculates the standard error of the prediction. It can be thought of as the standard error of the predicted expected value
or mean for the observation’s covariate pattern. This is also referred to as the standard error of the fitted value.
stdf calculates the standard error of the forecast. This is the standard error of the point prediction for a single observation. It
is commonly referred to as the standard error of the future or forecast value. By construction, the standard errors produced
by stdf are always larger than those produced by stdp; see [R] regress Methods and Formulas.
Remarks
The treatment effects model estimates the effect of an endogenous binary treatment, Z j , on a continuous, fully-observed
variable yj , conditional on the independent variables xj and wj . The primary interest is in the regression function
yj
=
xj
+ zj + j
where zj is an endogenous dummy variable indicating whether the treatment is assigned or not. The binary decision to obtain
the treatment zj is modeled as the outcome of an unobserved latent variable, zj . It is assumed that zj is a linear function of
the exogenous covariates wj and a random component uj . Specifically,
zj = wj
and the observed decision is
zj
=
1;
0;
+ uj
if zj > 0
otherwise
where and u are bivariate normal with mean zero and covariance matrix
1
There are many variations of this model in the literature. Maddala (1983) derives the maximum likelihood and two-step estimators of the version implemented here. Maddala (1983) also gives a brief review of several empirical applications of this model. Barnow et al. (1981) provide another useful derivation of this model; they concentrate on deriving the conditions in which the self-selection bias of the simple OLS estimator of the treatment effect, $\delta$, is nonzero and of a specific sign.
Example
We will illustrate treatreg using a subset of the Mroz data distributed with Berndt (1991). This dataset contains 753
observations on women’s labor supply. Our subsample is of 250 observations, with 150 market laborers and 100 nonmarket
laborers. Since 40% of the women in our sample chose not to enter the labor market, the simple treatment regression model
is not the correct model for these data. Ideally, we would like a model that accounts for the sample selection on entering the
labor force and the endogeneity of the college degree. Despite this misspecification, this dataset can be used to illustrate how
the treatreg command works.
. use labor, clear
. describe

Contains data from labor.dta
  obs:       250
 vars:        15
 size:    16,000 (98.4% of memory free)
-------------------------------------------------------------------------------
  1. lfp      float  %9.0g      1 if woman worked in 1975
  2. whrs     float  %9.0g      wife's hours of work
  3. kl6      float  %9.0g      # of children younger than 6
  4. k618     float  %9.0g      # of children between 6 and 18
  5. wa       float  %9.0g      wife's age
  6. we       float  %9.0g      wife's education attainment
  7. ww       float  %9.0g      wife's wage
  8. hhrs     float  %9.0g      husband's hours worked in 1975
  9. ha       float  %9.0g      husband's age
 10. he       float  %9.0g      husband's educational attainment
 11. hw       float  %9.0g      husband's wage
 12. faminc   float  %9.0g      family income
 13. wmed     float  %9.0g      wife's mother's educational attainment
 14. wfed     float  %9.0g      wife's father's educational attainment
 15. cit      float  %9.0g      1 if live in large city
-------------------------------------------------------------------------------
Sorted by:
. summarize

Variable |     Obs        Mean    Std. Dev.        Min        Max
---------+-------------------------------------------------------
     lfp |     250          .6    .4908807           0          1
    whrs |     250      799.84    915.6035           0       4950
     kl6 |     250        .236    .5112234           0          3
    k618 |     250       1.364    1.370774           0          8
      wa |     250       42.92    8.426483          30         60
      we |     250      12.352    2.164912           5         17
      ww |     250     2.27523     2.59775           0     14.631
    hhrs |     250    2234.832    600.6702         768       5010
      ha |     250      45.024    8.171322          30         60
      he |     250      12.536    3.106009           3         17
      hw |     250    7.494435    4.636192      1.0898     40.509
  faminc |     250    23062.54    12923.98        3305      91044
    wmed |     250       9.136    3.536031           0         17
    wfed |     250       8.608    3.751082           0         17
     cit |     250        .624    .4853517           0          1
We will assume that the wife went to college if her educational attainment was more than 12 years. Let wc be the dummy
variable indicating whether the individual went to college. With this definition, our sample contains the following distribution
of college education.
. gen wc = 0
. replace wc = 1 if we > 12
(69 real changes made)
. tab wc

         wc |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |        181       72.40       72.40
          1 |         69       27.60      100.00
------------+-----------------------------------
      Total |        250      100.00
We will model the wife’s wage as a function of her age, whether the family was living in a big city, and whether she went to
college. An ordinary least squares estimation produces the following results:
. regress ww wa cit wc

  Source |       SS       df       MS               Number of obs =     250
---------+------------------------------            F( 3,   246)  =    4.82
   Model | 93.2398568     3  31.0799523             Prob > F      =  0.0028
Residual | 1587.08776   246  6.45157627             R-squared     =  0.0555
---------+------------------------------            Adj R-squared =  0.0440
   Total | 1680.32762   249  6.74830369             Root MSE      =    2.54

------------------------------------------------------------------------------
      ww |      Coef.   Std. Err.       t     P>|t|     [95% Conf. Interval]
---------+--------------------------------------------------------------------
      wa |  -.0104985   .0192667    -0.545   0.586     -.0484472    .0274502
     cit |   .1278922   .3389058     0.377   0.706     -.5396351    .7954194
      wc |   1.332192   .3644344     3.656   0.000      .6143819    2.050001
   _cons |   2.278337   .8432385     2.702   0.007      .6174489    3.939225
------------------------------------------------------------------------------
Is 1.332 a consistent estimate of the marginal effect of a college education on wages? If individuals choose whether or not
to attend college and the error term of the model that gives rise to this choice is correlated with the error term in the wage
equation, then the answer is no. (See Barnow et al. 1981 for a good discussion of the existence and sign of selectivity bias.)
One might suspect that individuals with higher abilities, either innate or due to the circumstances of their birth, would be more
likely to go to college and to earn higher wages. Such ability is, of course, unobserved. Furthermore, if the error term in our
model for going to college is correlated with ability, and the error term in our wage equation is correlated with ability, then
the two terms should be positively correlated. These conditions make the problem of signing the selectivity bias equivalent to
an omitted-variable problem. In the case at hand, since we would anticipate the correlation between the omitted variable and a
college education to be positive, we suspect that OLS is biased upwards.
To account for the bias, we fit the treatment effects model. We model the wife’s college decision as a function of her
mother’s and her father’s educational attainment. Thus, we are interested in estimating the model
$$\text{ww} = \beta_0 + \beta_1\,\text{wa} + \beta_2\,\text{cit} + \delta\,\text{wc} + \epsilon$$
$$\text{wc}^* = \gamma_0 + \gamma_1\,\text{wmed} + \gamma_2\,\text{wfed} + u$$

where

$$\text{wc} = \begin{cases} 1, & \text{if } \text{wc}^* > 0 \text{, i.e., wife went to college} \\ 0, & \text{otherwise} \end{cases}$$

and where $\epsilon$ and $u$ have a bivariate normal distribution with covariance matrix

$$\begin{bmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & 1 \end{bmatrix}$$
The following output gives the maximum likelihood estimates of the parameters of this model.
. treatreg ww wa cit, treat(wc=wmed wfed)
Iteration 0:  log likelihood = -707.07237
Iteration 1:  log likelihood = -707.07215
Iteration 2:  log likelihood = -707.07215

Treatment effects model -- MLE                   Number of obs   =       250
                                                 Wald chi2(3)    =      4.11
Log likelihood = -707.07215                      Prob > chi2     =    0.2501

------------------------------------------------------------------------------
         |      Coef.   Std. Err.       z     P>|z|     [95% Conf. Interval]
---------+--------------------------------------------------------------------
ww       |
      wa |  -.0110424   .0199652    -0.553   0.580     -.0501735    .0280887
     cit |    .127636   .3361938     0.380   0.704     -.5312917    .7865638
      wc |   1.271327   .7412951     1.715   0.086     -.1815842    2.724239
   _cons |   2.318638   .9397573     2.467   0.014      .4767477    4.160529
---------+--------------------------------------------------------------------
wc       |
    wmed |   .1198055   .0320056     3.743   0.000      .0570757    .1825352
    wfed |   .0961886   .0290868     3.307   0.001      .0391795    .1531977
   _cons |  -2.631876   .3309128    -7.953   0.000     -3.280453   -1.983299
---------+--------------------------------------------------------------------
 /athrho |   .0178668   .1899898     0.094   0.925     -.3545063    .3902399
/lnsigma |   .9241584   .0447455    20.654   0.000      .8364588    1.011858
---------+--------------------------------------------------------------------
     rho |   .0178649   .1899291                       -.3403659     .371567
   sigma |   2.519747   .1127473                        2.308179    2.750707
  lambda |   .0450149   .4786442                       -.8931105    .9831404
------------------------------------------------------------------------------
LR test of indep. eqns. (rho = 0):   chi2(1) =     0.01   Prob > chi2 = 0.9251
------------------------------------------------------------------------------
In the input, we specified that the continuous dependent variable, ww (wife’s wage), is a linear function of cit and wa. Note
the syntax for the treatment variable. The treatment wc is not included in the first variable list; it is specified in the treat()
option. In this example, wmed and wfed are specified as the exogenous variables in the treatment equation.
The output has the form of many two-equation estimators in Stata. We note that our conjecture that the OLS estimate was
biased upwards is verified. But perhaps more interesting, the size of the bias is negligible and the likelihood-ratio test at the
bottom of the output indicates that we cannot reject the null hypothesis that the two error terms are uncorrelated. This result
might be due to several specification errors. We ignored the selectivity bias due to the endogeneity of entering the labor market.
We have also written both the wage equation and the college education equation in crude linear form, ignoring any higher power
terms or interactions.
The results for the two ancillary parameters require explanation. For numerical stability during optimization, treatreg does not directly estimate $\rho$ or $\sigma$. Instead, treatreg estimates the inverse hyperbolic tangent of $\rho$,

$$\operatorname{atanh}\rho = \frac{1}{2}\ln\left(\frac{1+\rho}{1-\rho}\right)$$

and $\ln\sigma$. Also, treatreg reports $\lambda = \rho\sigma$, along with an estimate of the standard error of the estimate and a confidence interval for it.
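The reported values can be recovered by inverting these transforms; for instance, using the estimates from the output above:

    . display tanh(.0178668)                  // rho    = .0178649
    . display exp(.9241584)                   // sigma  = 2.519747
    . display tanh(.0178668)*exp(.9241584)    // lambda = .0450149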
Technical Note
If each of the equations in the model had contained many regressors, the treatreg command could become quite long. An
alternate way of specifying our wage model would be to make use of Stata’s local macros. The following lines are an equivalent
way of estimating our model.
. local wageeq "ww wa cit"
. local trteq "wc=wmed wfed"
. treatreg `wageeq', treat(`trteq')
Example (continued)
Stata will also produce a two-step estimator of the model with the twostep option. Maximum likelihood estimation of the
parameters can be time-consuming with large datasets, and the two-step estimates may provide a good alternative in such cases.
Continuing with the women’s wage model, we can obtain the two-step estimates with consistent covariance estimates by typing
. treatreg ww wa cit, treat(wc=wmed wfed) twostep

Treatment effects model -- two-step estimates    Number of obs   =       250
                                                 Wald chi2(3)    =      3.67
                                                 Prob > chi2     =    0.2998

------------------------------------------------------------------------------
         |      Coef.   Std. Err.       z     P>|z|     [95% Conf. Interval]
---------+--------------------------------------------------------------------
ww       |
      wa |  -.0111623    .020152    -0.554   0.580     -.0506594    .0283348
     cit |   .1276102     .33619     0.380   0.704       -.53131    .7865305
      wc |   1.257995   .8007428     1.571   0.116     -.3114319    2.827422
   _cons |   2.327482   .9610271     2.422   0.015      .4439031     4.21106
---------+--------------------------------------------------------------------
wc       |
    wmed |   .1198888   .0319859     3.748   0.000      .0571976    .1825801
    wfed |   .0960764   .0290581     3.306   0.001      .0391236    .1530292
   _cons |  -2.631496   .3308344    -7.954   0.000     -3.279919   -1.983072
---------+--------------------------------------------------------------------
hazard   |
  lambda |   .0548738   .5283928     0.104   0.917     -.9807571    1.090505
------------------------------------------------------------------------------
     rho |    0.02178
   sigma |  2.5198211
  lambda |  .05487379   .5283928
------------------------------------------------------------------------------
The reported lambda ($\lambda$) is the parameter estimate on the hazard from the augmented regression. The augmented regression is derived in Maddala (1983) and presented in the Methods and Formulas section below.
The default statistic produced by predict after treatreg is the expected value of the dependent variable from the
underlying distribution of the regression model. For the case at hand this statistic is
$$\widehat{\text{ww}} = \beta_0 + \beta_1\,\text{wa} + \beta_2\,\text{cit} + \delta\,\text{wc}$$
Several other interesting aspects of the treatment effects model can be explored with predict. Continuing with our wage
model, the wife’s expected wage, conditional on attending college, can be obtained with the yctrt option. The wife’s expected
wages, conditional on not attending college, can be obtained with the ycntrt option. Thus, the difference in expected wages
between participants and nonparticipants is the difference between yctrt and ycntrt. For the case at hand, we have the
following calculation:
. predict wwctrt, yctrt
. predict wwcntrt, ycntrt
. gen diff = wwctrt - wwcntrt
. summarize diff

Variable |     Obs        Mean    Std. Dev.        Min        Max
---------+-------------------------------------------------------
    diff |     250    1.356912    .0134202     1.34558   1.420173
Technical Note
The difference in expected earnings between participants and nonparticipants is

$$E[y_i \mid z_i = 1] - E[y_i \mid z_i = 0] = \delta + \rho\sigma\,\frac{\phi_i}{\Phi_i(1 - \Phi_i)}$$

If the correlation between the error terms, $\rho$, is zero, then the problem reduces to one estimable by OLS and the difference is simply $\delta$. Since $\rho$ is positive in our example, we see that least squares overestimates the treatment effect.
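A hedged way to check this decomposition after the maximum likelihood fit is to rebuild the right-hand side by hand. The new variable names here are illustrative; normalden() and normal() are the current names of the standard normal density and distribution functions, evaluated at the treatment-equation linear prediction:

    . predict double xg, xbtrt
    . gen double diff2 = [ww]_b[wc] + e(lambda)*normalden(xg)/(normal(xg)*(1-normal(xg)))
    . summarize diff2           // should track diff computed above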
Saved Results
treatreg saves in e():
Scalars
    e(N)         number of observations
    e(k)         number of variables
    e(k_eq)      number of equations
    e(k_dv)      number of dependent variables
    e(df_m)      model degrees of freedom
    e(ll)        log likelihood
    e(N_clust)   number of clusters
    e(lambda)    lambda
    e(selambda)  standard error of lambda
    e(sigma)     sigma
    e(rho)       rho
    e(chi2)      chi-squared
    e(p)         p-value for chi-squared test
    e(chi2_c)    chi-squared for comparison test
    e(p_c)       p-value for comparison test
    e(ic)        number of iterations
    e(rc)        return code

Macros
    e(cmd)       treatreg
    e(depvar)    name(s) of dependent variable(s)
    e(title)     title in estimation output
    e(wtype)     weight type
    e(wexp)      weight expression
    e(clustvar)  name of cluster variable
    e(vcetype)   covariance estimation method
    e(method)    requested estimation method
    e(user)      name of likelihood-evaluator program
    e(opt)       type of optimization
    e(chi2type)  Wald or LR; type of model chi-squared test
    e(chi2_ct)   Wald or LR; type of model chi-squared test corresponding to e(chi2_c)
    e(hazard)    variable containing hazard
    e(predict)   program used to implement predict

Matrices
    e(b)         coefficient vector
    e(V)         variance–covariance matrix of the estimators

Functions
    e(sample)    marks estimation sample
Methods and Formulas
treatreg is implemented as an ado-file. Maddala (1983, 117–122) derives both the maximum likelihood and the two-step
estimator implemented here. Greene (2000, 933–934) also provides an introduction to the treatment effects model.
The primary regression equation of interest is

$$y_j = x_j\beta + \delta z_j + \epsilon_j$$

where $z_j$ is a binary decision variable. The binary variable is assumed to stem from an unobservable latent variable

$$z_j^* = w_j\gamma + u_j$$

The decision to obtain the treatment is made according to the rule

$$z_j = \begin{cases} 1, & \text{if } z_j^* > 0 \\ 0, & \text{otherwise} \end{cases}$$
where $\epsilon$ and $u$ are bivariate normal with mean zero and covariance matrix

$$\begin{bmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & 1 \end{bmatrix}$$
The likelihood function for this model is given in Maddala (1983, 122). Greene (2000, 180) discusses the standard method of reducing a bivariate normal to a function of a univariate normal and the correlation $\rho$. Combining the two yields the following log likelihood for observation $j$:

$$l_j = \begin{cases} \ln\Phi\left\{\dfrac{w_j\gamma + (y_j - x_j\beta)\rho/\sigma}{\sqrt{1 - \rho^2}}\right\} - \dfrac{1}{2}\left(\dfrac{y_j - x_j\beta}{\sigma}\right)^2 - \ln\left(\sqrt{2\pi}\,\sigma\right) & z_j = 1 \\[2ex] \ln\Phi\left\{\dfrac{-w_j\gamma - (y_j - x_j\beta)\rho/\sigma}{\sqrt{1 - \rho^2}}\right\} - \dfrac{1}{2}\left(\dfrac{y_j - x_j\beta}{\sigma}\right)^2 - \ln\left(\sqrt{2\pi}\,\sigma\right) & z_j = 0 \end{cases}$$

where $\Phi(\cdot)$ is the distribution function of the standard normal distribution.
In the maximum likelihood estimation, $\sigma$ and $\rho$ are not directly estimated. Directly estimated are $\ln\sigma$ and $\operatorname{atanh}\rho$, where

$$\operatorname{atanh}\rho = \frac{1}{2}\ln\left(\frac{1+\rho}{1-\rho}\right)$$

The standard error of $\lambda = \rho\sigma$ is approximated through the propagation of error (delta) method, which is given by

$$\operatorname{Var}(\lambda) \approx D\,\operatorname{Var}\!\left[(\operatorname{atanh}\rho,\ \ln\sigma)\right] D'$$

where $D$ is the Jacobian of $\lambda$ with respect to $\operatorname{atanh}\rho$ and $\ln\sigma$.
Maddala (1983, 120–122) also derives the two-step estimator. In the first stage, one obtains probit estimates of the treatment equation

$$\Pr(z_j = 1 \mid w_j) = \Phi(w_j\gamma)$$

From these estimates, the hazard $h_j$ for each observation $j$ is computed as

$$h_j = \begin{cases} \dfrac{\phi(w_j\hat{\gamma})}{\Phi(w_j\hat{\gamma})} & z_j = 1 \\[2ex] \dfrac{-\phi(w_j\hat{\gamma})}{1 - \Phi(w_j\hat{\gamma})} & z_j = 0 \end{cases}$$
where $\phi$ is the standard normal density function. We also define

$$d_j = h_j(h_j + \hat{\gamma}w_j)$$

Then

$$E[y_j \mid z_j] = x_j\beta + \delta z_j + \rho\sigma h_j$$
$$\operatorname{Var}[y_j \mid z_j] = \sigma^2\left(1 - \rho^2 d_j\right)$$

The two-step parameter estimates of $\beta$ and $\delta$ are obtained by augmenting the regression equation with the hazard $h$. Thus, the regressors become $[x\ z\ h]$, and we obtain the additional parameter estimate $\beta_h$ on the variable containing the hazard. A consistent estimate of the regression disturbance variance is obtained using the residuals from the augmented regression and the parameter estimate on the hazard:

$$\hat{\sigma}^2 = \frac{e'e + \beta_h^2 \sum_{j=1}^{N} d_j}{N}$$

The two-step estimate of $\rho$ is then

$$\hat{\rho} = \frac{\beta_h}{\hat{\sigma}}$$
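The first steps of this procedure can be sketched by hand on the example data (normalden() and normal() are the current function names; treatreg, twostep additionally computes the consistent standard errors derived next, which this sketch omits):

    . probit wc wmed wfed                // first stage: treatment equation
    . predict double xg, xb              // w_j * gamma-hat
    . gen double h = cond(wc, normalden(xg)/normal(xg), -normalden(xg)/(1-normal(xg)))
    . regress ww wa cit wc h             // augmented regression; coef. on h is beta_h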
We will now describe how the consistent estimates of the coefficient covariance matrix based on the augmented regression are derived. Let $A = [x\ z\ h]$ and let $D$ be a square diagonal matrix of rank $N$ with $(1 - \hat{\rho}^2 d_j)$ on the diagonal elements. Then

$$V_{\text{twostep}} = \hat{\sigma}^2 (A'A)^{-1}(A'DA + Q)(A'A)^{-1}$$

where

$$Q = \hat{\rho}^2 (A'DA)\, V_p\, (A'DA)$$

and $V_p$ is the variance–covariance estimate from the probit estimation of the treatment equation.
References
Barnow, B., G. Cain, and A. Goldberger. 1981. Issues in the analysis of selectivity bias. Evaluation Studies Review Annual 5. Beverly Hills: Sage
Publications.
Berndt, E. 1991. The Practice of Econometrics. New York: Addison–Wesley.
Greene, W. H. 2000. Econometric Analysis. 4th ed. Upper Saddle River, NJ: Prentice–Hall.
Maddala, G. S. 1983. Limited-dependent and Qualitative Variables in Econometrics. Cambridge, UK: Cambridge University Press.
sg142
Uniform layer effect models for the analysis of differences in two-way associations
Maurizio Pisati, University of Trento, Italy,
[email protected]
Abstract: Many relevant research questions pertain to how the association between two categorical variables (say R and C )
depends on the values taken on by a third categorical variable (say L). The uniform layer effect models illustrated in this
insert represent a particular way to tackle these questions. Specifically, they are a variety of the standard loglinear model
based on three assumptions: a) there is an association between variables R and C , b) the pattern of association between R
and C is constant across the categories of variable L, and c) the strength of association between R and C varies between
any pair of categories of L by a uniform amount. This insert focuses on two different specifications of the uniform layer
effect model: Yamaguchi’s additive model and Xie’s multiplicative model.
Keywords: Contingency table analysis, mobility table analysis, loglinear model, additive model, multiplicative model.
Overview
Many relevant research questions pertain to how the association between two categorical variables (say R and C ) depends
on the values taken on by a third categorical variable (say L). To tackle these questions, we first arrange the data into a three-way
contingency table, whose cell frequencies can be expressed in terms of the standard saturated loglinear model

$$\log(F_{ijk}) = \lambda + \lambda_i^R + \lambda_j^C + \lambda_k^L + \lambda_{ik}^{RL} + \lambda_{jk}^{CL} + \lambda_{ij}^{RC} + \lambda_{ijk}^{RCL}$$

where $\log(F_{ijk})$ denotes the natural logarithm of the expected frequency in cell $(i,j,k)$, $i$ indexes the $I$ categories of the row variable $R$, $j$ indexes the $J$ categories of the column variable $C$, $k$ indexes the $K$ categories of the layer variable $L$, and the parameters are subject to a standard set of constraints that make them identifiable (Powers and Xie 2000).

When dealing with the kind of questions mentioned above, the researcher typically focuses on the specification of both the two-way interaction term $\lambda_{ij}^{RC}$, which expresses the baseline pattern of association between variables $R$ and $C$, and the three-way interaction term $\lambda_{ijk}^{RCL}$, which expresses how the $R$ by $C$ association observed in each layer $k$ departs from that baseline pattern. There are many ways to specify $\lambda_{ij}^{RC}$ and $\lambda_{ijk}^{RCL}$, all of which can be seen as lying on a continuum whose extremes correspond on the one hand to the conditional independence model, which sets $\lambda_{ij}^{RC} = \lambda_{ijk}^{RCL} = 0$ for all combinations of $i$, $j$, and $k$, and on the other hand to the saturated model, which specifies the association between $R$ and $C$ conditional on $L$ using all the $(I-1)(J-1)K$ available degrees of freedom.
The uniform layer effect models illustrated in this insert represent a particular way to specify the interaction terms $\lambda_{ij}^{RC}$ and $\lambda_{ijk}^{RCL}$ (Goodman and Hout 1998). In their standard formulation, the models belonging to this category share three assumptions:

1. There is an association between variables $R$ and $C$, that is, $\lambda_{ij}^{RC} \ne 0$.

2. The pattern of association between variables $R$ and $C$, as represented by the fundamental set of (conditional) log-odds ratios

$$\log(\theta_{ij|k}) = \log(F_{ijk}) + \log(F_{(i+1)(j+1)k}) - \log(F_{(i+1)jk}) - \log(F_{i(j+1)k}), \qquad i = 1,\ldots,I-1;\ j = 1,\ldots,J-1$$

is constant across layers.

3. The strength of association between variables $R$ and $C$ varies between any pair of layers by a uniform amount.
This insert focuses on two different specifications of the uniform layer effect model: the additive model and the multiplicative model. The additive model has been proposed by Yamaguchi (1987) and can be formulated as

$$\log(F_{ijk}) = \lambda + \lambda_i^R + \lambda_j^C + \lambda_k^L + \lambda_{ik}^{RL} + \lambda_{jk}^{CL} + \lambda_{ij}^{RC} + ij\,\beta_k$$

where the $\lambda$ and $\beta$ parameters are subject to appropriate constraints that make them identifiable. As we can see, the additive model retains the two-way interaction term but replaces the three-way interaction term with the product $ij\,\beta_k$. This means that the conditional log-odds ratios pertaining to each layer $k$ take the parametric form

$$\log(\theta_{ij|k}) = \lambda_{ij}^{RC} + \lambda_{(i+1)(j+1)}^{RC} - \lambda_{(i+1)j}^{RC} - \lambda_{i(j+1)}^{RC} + \beta_k$$

Hence, in the additive model the $\beta$ parameters express the extent to which the strength of the association between variables $R$ and $C$ varies across layers. More precisely, the difference between any pair of $\beta$ parameters (say $\beta_k$ and $\beta_{k'}$) expresses in absolute terms how much the $R$ by $C$ association is uniformly stronger or weaker in layer $k$ than in layer $k'$ (Goodman and Hout 1998, 184). Formally,

$$\Delta_{k,k'} = \log(\theta_{ij|k}) - \log(\theta_{ij|k'}) = \beta_k - \beta_{k'}, \qquad i = 1,\ldots,I-1;\ j = 1,\ldots,J-1$$

It should be noted that because of the presence of the $ij$ product in the equation, the results produced by the additive model depend on the ordering of both the row and column categories (Goodman and Hout 1998, 184).
Sometimes the layers can be assigned exogenous scores that have a theoretical meaning (Yamaguchi 1987, 486). In such cases, the additive model can be reformulated as

$$\log(F_{ijk}) = \lambda + \lambda_i^R + \lambda_j^C + \lambda_k^L + \lambda_{ik}^{RL} + \lambda_{jk}^{CL} + \lambda_{ij}^{RC} + ij\sum_{v=1}^{V} S_{vk}\,\beta_v$$

where $v$ indexes the $V$ exogenous scores assigned to the layers, $S_{vk}$ denotes the value taken on by score $v$ in layer $k$, and $\beta_v$ denotes the linear effect exerted by score $v$ on the log-odds ratios. Thus, according to this version of the additive model, which I will refer to as the linear additive model, the difference between any pair of conditional log-odds ratios pertaining to layers $k$ and $k'$ is equal to

$$\Delta_{k,k'} = \log(\theta_{ij|k}) - \log(\theta_{ij|k'}) = \sum_{v=1}^{V}(S_{vk} - S_{vk'})\,\beta_v, \qquad i = 1,\ldots,I;\ j = 1,\ldots,J$$

It should be noted that in most cases, to ensure both identification and meaningfulness of the $\beta$ parameters, it is required that $V \le K - 2$.
The multiplicative model has been proposed by Xie (1992); see also Erikson and Goldthorpe (1992, 91–93). It can be formulated as

$$\log(F_{ijk}) = \lambda + \lambda_i^R + \lambda_j^C + \lambda_k^L + \lambda_{ik}^{RL} + \lambda_{jk}^{CL} + \psi_{ij}\phi_k$$

where the $\lambda$, $\psi$, and $\phi$ parameters are subject to appropriate constraints that make them identifiable. As we can see, the multiplicative model replaces both the two-way interaction term $\lambda_{ij}^{RC}$ and the three-way interaction term $\lambda_{ijk}^{RCL}$ with the product $\psi_{ij}\phi_k$, where $\psi_{ij}$ denotes cell-specific scores that express the baseline pattern of association between variables $R$ and $C$, and $\phi_k$ denotes layer-specific scores that express the strength of the $R$ by $C$ association in each layer. The $\psi$ and $\phi$ parameters can be seen as latent scores estimated from the data using iterative procedures (Xie 1992, 382; Goodman and Hout 1998, 181–182).

The formula for the multiplicative model implies that the conditional log-odds ratios pertaining to each layer $k$ take the parametric form

$$\log(\theta_{ij|k}) = \left(\psi_{ij} + \psi_{(i+1)(j+1)} - \psi_{(i+1)j} - \psi_{i(j+1)}\right)\phi_k$$

Consequently, in the multiplicative model the ratio between any pair of $\phi$ parameters (say $\phi_k$ and $\phi_{k'}$) expresses in relative terms how much the association between variables $R$ and $C$ is uniformly stronger or weaker in layer $k$ than in layer $k'$ (Goodman and Hout 1998, 185). Formally,

$$\Gamma_{k/k'} = \log(\theta_{ij|k})/\log(\theta_{ij|k'}) = \phi_k/\phi_{k'}, \qquad i = 1,\ldots,I-1;\ j = 1,\ldots,J-1$$

Both the additive and the multiplicative uniform layer effect models were originally devised to compare social mobility tables across countries or over time. However, both models can be applied to any research question where the $R$ by $C$ association is assumed to have the same pattern but possibly different strengths across layers (see Xie 1991, Goodman and Hout 1998).
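As a hedged illustration using the syntax documented below and the variables from Example 1, the two specifications correspond to calls like

    . unidiff obs, row(father) column(son) layer(country) effect(add) pattern(fi)
    . unidiff obs, row(father) column(son) layer(country) effect(mult) pattern(fi)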
Syntax
unidiff cellvar, row(rowvar) column(colvar) layer(layvar) effect(null|add|addlin|mult)
    pattern(fi|qpm|qs|cp|ua|re|ce|rce|hrce|own1|own2) [quasi
    design(varlist) scores(varlist) extra(varlist) refcat(#) constraints(numlist)
    lambda(rawlog|rawexp|stdlog|stdexp) shd(log|exp) saveexp(newvar)
    savelambda(newvar) nodetail nodisprc nodispextra]
Description
unidiff estimates the null, additive, linear additive, and multiplicative uniform layer effect models, displays relevant
goodness-of-fit statistics and parameter estimates, and optionally computes several ancillary quantities of interest. The dataset to
be analyzed must include at least four variables:
cellvar contains the observed cell frequencies that make up the three-way contingency table object of analysis.
rowvar indexes (and optionally labels) the I categories of the row variable.
colvar indexes (and optionally labels) the J categories of the column variable.
layvar indexes (and optionally labels) the K categories of the layer variable.
Note that unidiff drops without warning all variables starting with rc_.
Options
row(rowvar) is required. It specifies the name of the row variable.
column(colvar) is required. It specifies the name of the column variable.
layer(layvar) is required. It specifies the name of the layer variable.
effect(null|add|addlin|mult) is required. It specifies the type of uniform layer effect model to be estimated.

effect(null) estimates the null effect model, that is, a model that postulates constant pattern and strength of the R by C association across layers.

effect(add) estimates the additive model.

effect(addlin) estimates the linear additive model.

effect(mult) estimates the multiplicative model.
pattern(fi|qpm|qs|cp|ua|re|ce|rce|hrce|own1|own2) is required. It specifies the baseline pattern of association between variables $R$ and $C$, that is, the form taken by the two-way interaction term $\lambda_{ij}^{RC}$ or, in the case of the multiplicative model, by the $\psi_{ij}$ parameters. Some patterns are allowed only when $I = J$. For details on all these patterns of association, see Hout (1983).

pattern(fi) specifies the “full interaction” (saturated) pattern of association.

pattern(qpm) specifies the “quasi-perfect mobility” pattern of association. It is allowed only when $I = J$.

pattern(qs) specifies the “quasi-symmetry” pattern of association. It is allowed only when $I = J$.

pattern(cp) specifies the “crossing parameters” pattern of association. It is allowed only when $I = J$.

pattern(ua) specifies the “uniform association” pattern of association.

pattern(re) specifies the “row effects” pattern of association.

pattern(ce) specifies the “column effects” pattern of association.

pattern(rce) specifies the “row and column effects I” pattern of association.

pattern(hrce) specifies the “homogeneous row and column effects I” pattern of association. It is allowed only when $I = J$.

pattern(own1) specifies a user-defined pattern of association expressed by a single “topological,” that is, categorical, variable.

pattern(own2) specifies a user-defined pattern of association expressed by one or more quantitative variables.
quasi requests that the “quasi” version (i.e., with diagonal-specific parameters) of the selected pattern of the R by C association be applied. It is allowed only when $I = J$.

design(varlist) is required if pattern(own1) or pattern(own2) is specified. It specifies the list of variables that expresses the user-defined baseline pattern of association between variables $R$ and $C$.

extra(varlist) specifies a list of additional variables intended to express particular features of the model that lie outside its standard formulation. When this option is specified, the formulas for the additive, linear additive, and multiplicative uniform layer effect models reported above must be complemented with the term $\sum_{t=1}^{T}\beta_t x_{tijk}$, where $t$ indexes the $T$ “extra” variables included in the model, $x_{tijk}$ denotes the value taken on by “extra” variable $t$ in cell $(i,j,k)$, and $\beta_t$ denotes the parameter associated with the “extra” variable $t$.
scores(varlist) is required if effect(addlin) is specified. It specifies the list of variables that represent the $V$ exogenous scores assigned to the layers.

refcat(#) specifies the layer to be taken as the reference category in the estimation of the parameters $\beta_k$ or $\phi_k$. For identification purposes, the following constraints are imposed: $\beta_r = 0$ for the additive model and $\phi_r = 1$ for the multiplicative model, where $r$ is the index specified by refcat. By default, layer 1 is taken as the reference category.

constraints(numlist) specifies equality constraints to be imposed on the estimation of the parameters $\beta_k$ or $\phi_k$. Suppose we are analyzing a table with four layers and want to make $\beta_1 = \beta_2$. To impose this equality restriction we specify constraints(1 1 2 3).
lambda(rawlog|rawexp|stdlog|stdexp) displays in tabular form the total interaction effects estimated by the fitted model for each layer, that is, the equivalent of the sum $\lambda_{ij}^{RC} + \lambda_{ijk}^{RCL}$ for all combinations of $i$, $j$, and $k$.

lambda(rawlog) displays the raw effects in additive (logarithmic) form ($\lambda_{ij|k}$).

lambda(rawexp) displays the raw effects in multiplicative (exponential) form ($\exp(\lambda_{ij|k})$).

lambda(stdlog) displays the standardized effects in additive form ($\tilde{\lambda}_{ij|k}$). Standardization is achieved by “double-centering” the effects around their mean, so that within each row, column, and layer they sum to zero (see Goodman 1991, 1088).

lambda(stdexp) displays the standardized effects in multiplicative form ($\exp(\tilde{\lambda}_{ij|k})$).
shd(log|exp) displays in tabular form layer-specific structural shift parameters (with standard errors), structural distances (with standard errors), mean structural distances, and the overall structural effect computed according to the Sobel–Hout–Duncan approach to mobility table modeling (Sobel et al. 1985). These quantities are particularly relevant in the analysis of social mobility tables and can be computed only when $I = J$.

shd(log) displays all the above quantities in additive (logarithmic) form.

shd(exp) displays all the above quantities in multiplicative (exponential) form.
saveexp(newvar) creates newvar containing the expected cell frequencies under the fitted model.
savelambda(newvar) creates newvar containing the total interaction effects estimated by the fitted model. The effects are saved in standardized additive form ($\tilde{\lambda}_{ij|k}$).
nodetail suppresses the output describing the structure of the contingency table object of analysis and the specification of the
fitted model.
nodisprc suppresses the output of the table reporting the parameter estimates associated with the variables that express the R by C association pattern.
nodispextra suppresses the output of the table reporting the parameter estimates associated with the extra variables.
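As an aside, the expected frequencies saved by saveexp() make it easy to recompute fit measures by hand. A minimal sketch, assuming the Example 1 data and model below and a hypothetical new variable name Fhat; by the formula given in Methods and Formulas, the last line should reproduce the dissimilarity index DI shown in the goodness-of-fit table:

. unidiff obs, row(father) col(son) lay(country) effect(add) pattern(fi)
> extra(diag11-diag35) saveexp(Fhat)
. * absolute deviations |f - F| between observed and expected frequencies
. generate double absdev = abs(obs-Fhat)
. quietly summarize absdev
. * DI = 100 * (sum of |f - F|) / (2N), with N = 28887 total observations
. display r(sum)/(2*28887)*100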
Example 1
In this first example, I reanalyze the social mobility data used by Yamaguchi (1987) and Xie (1992) in their illustration of,
respectively, the additive and the multiplicative uniform layer effect models. It is a 5 × 5 × 3 contingency table that cross-classifies
father’s occupational class (the row variable), son’s occupational class (the column variable), and country (the layer variable).
The pattern of association between father’s class and son’s class is assumed to be constant across countries. The purpose of the
analysis is to detect any cross-national variation in the strength of the father-son association.
. use example1.dta, clear
. describe
Contains data from example1.dta
  obs:            75
 vars:             4
 size:           675 (97.5% of memory free)
-------------------------------------------------------------------------------
  1. obs       int    %8.0g             Observed cell frequencies
  2. country   byte   %13.0g   country  Country
  3. father    byte   %15.0g   class    Father's occupational class
  4. son       byte   %15.0g   class    Son's occupational class
-------------------------------------------------------------------------------
Sorted by:
. label list
country:
           1 United States
           2 Britain
           3 Japan
class:
           1 UpNonManual
           2 LowNonManual
           3 UpManual
           4 LowManual
           5 Farm
Let us start with the additive model. To reproduce Yamaguchi’s (1987) results, two assumptions must be taken into account. First, occupational classes are ordered hierarchically along a vertical status dimension ranging from upper nonmanual (highest) to farm (lowest). Second, models are applied to off-diagonal cells only, due to the particular meaning that diagonal cells have in mobility table analysis. This means that diagonal cells must be “blocked,” that is, their frequencies must be exactly reproduced by the models. To this aim, we must create, and include in the models as “extra” variables, 15 (= I × K) indicator variables, one for each diagonal cell of each layer.
. local COUNTRY "US GB JA"
. local i=1
. while `i'<=3 {
  2.   local ITEM : word `i' of `COUNTRY'
  3.   local j=1
  4.   while `j'<=5 {
  5.     generate diag`i'`j'=country==`i' & father==`j' & son==`j'
  6.     lab var diag`i'`j' "`ITEM': Immobility in class `j'"
  7.     local j=`j'+1
  8.   }
  9.   local i=`i'+1
 10. }
. describe
Contains data from example1.dta
  obs:            75
 vars:            19
 size:         5,175 (97.5% of memory free)
-------------------------------------------------------------------------------
  1. obs       int    %8.0g             Observed cell frequencies
  2. country   byte   %13.0g   country  Country
  3. father    byte   %15.0g   class    Father's occupational class
  4. son       byte   %15.0g   class    Son's occupational class
  5. diag11    float  %9.0g             US: Immobility in class 1
  6. diag12    float  %9.0g             US: Immobility in class 2
  7. diag13    float  %9.0g             US: Immobility in class 3
  8. diag14    float  %9.0g             US: Immobility in class 4
  9. diag15    float  %9.0g             US: Immobility in class 5
 10. diag21    float  %9.0g             GB: Immobility in class 1
 11. diag22    float  %9.0g             GB: Immobility in class 2
 12. diag23    float  %9.0g             GB: Immobility in class 3
 13. diag24    float  %9.0g             GB: Immobility in class 4
 14. diag25    float  %9.0g             GB: Immobility in class 5
 15. diag31    float  %9.0g             JA: Immobility in class 1
 16. diag32    float  %9.0g             JA: Immobility in class 2
 17. diag33    float  %9.0g             JA: Immobility in class 3
 18. diag34    float  %9.0g             JA: Immobility in class 4
 19. diag35    float  %9.0g             JA: Immobility in class 5
-------------------------------------------------------------------------------
Sorted by:
In his analysis, Yamaguchi (1987) tests several specifications of the pattern of association between father’s class and son’s class. For illustration purposes, I will focus on two of them: the “full interaction” pattern and the “homogeneous row and column effects” pattern. The additive model with full interaction pattern of the R by C association and “blocked” diagonal cells can be estimated by
. unidiff obs, row(father) col(son) lay(country) effect(add) pattern(fi)
> extra(diag11-diag35)
Analysis of differences in two-way associations

Table structure
-------------------------------------------------------------------------------
Name              Label                            N. of categories
-------------------------------------------------------------------------------
Row      father   Father's occupational class             5
Column   son      Son's occupational class                5
Layer    country  Country                                 3
-------------------------------------------------------------------------------

Model specification
-------------------------------------------------------------------------------
Layer effect:            additive
R-C association pattern: full interaction
Additional variables:    diag11 diag12 diag13 diag14 diag15 diag21
                         diag22 diag23 diag24 diag25 diag31 diag32
                         diag33 diag34 diag35
-------------------------------------------------------------------------------

Goodness-of-fit statistics
-------------------------------------------------------------------------------
Model                 N   df       X2    p      G2    p    rG2     BIC    DI
-------------------------------------------------------------------------------
Cond. indep.      28887   48   6659.6 0.00  5591.5 0.00    0.0  5098.5  16.0
Null effect       28887   22     36.2 0.03    36.2 0.03   99.4  -189.7   0.9
Additive effect   28887   20     30.7 0.06    30.7 0.06   99.5  -174.7   0.7
-------------------------------------------------------------------------------

Beta parameters
--------------+-----------------------------------
Country       |   estimate       s.e.    p-value
--------------+-----------------------------------
United States |     0.0000     0.0000     0.0000
Britain       |     0.0035     0.0147     0.8133
Japan         |    -0.0411     0.0180     0.0227
--------------+-----------------------------------

R-C association parameters
--------------------------------------------------------------------------
Variable   Label                          estimate       s.e.    p-value
--------------------------------------------------------------------------
rc_fi2     Full interaction: level 2       -3.1828     0.2629     0.0000
rc_fi3     Full interaction: level 3       -3.1345     0.2618     0.0000
rc_fi4     Full interaction: level 4       -3.0926     0.2598     0.0000
rc_fi5     Full interaction: level 5       -3.3612     0.2348     0.0000
rc_fi6     Full interaction: level 6       -3.2003     0.2598     0.0000
rc_fi7     Full interaction: level 7       -1.6144     0.2949     0.0000
rc_fi8     Full interaction: level 8       -2.3159     0.2541     0.0000
rc_fi9     Full interaction: level 9       -2.8696     0.2285     0.0000
rc_fi10    Full interaction: level 10      -3.0470     0.2584     0.0000
rc_fi11    Full interaction: level 11      -2.0611     0.2552     0.0000
rc_fi12    Full interaction: level 12      -1.6946     0.2556     0.0000
rc_fi13    Full interaction: level 13      -2.7080     0.2261     0.0000
rc_fi14    Full interaction: level 14      -2.9291     0.2545     0.0000
rc_fi15    Full interaction: level 15      -1.8084     0.2493     0.0000
rc_fi16    Full interaction: level 16      -1.3609     0.2445     0.0000
rc_fi17    Full interaction: level 17       0.0000     0.0000     0.0000
--------------------------------------------------------------------------

Extra variable parameters
--------------------------------------------------------------------------
Variable   Label                          estimate       s.e.    p-value
--------------------------------------------------------------------------
diag11     US: Immobility in class 1        3.8151     0.2524     0.0000
diag12     US: Immobility in class 2        0.0000     0.0000     0.0000
diag13     US: Immobility in class 3       -0.5863     0.1402     0.0000
diag14     US: Immobility in class 4        0.0235     0.0690     0.7334
diag15     US: Immobility in class 5        0.2605     0.2189     0.2340
diag21     GB: Immobility in class 1        4.0496     0.2775     0.0000
diag22     GB: Immobility in class 2        0.1899     0.1024     0.0636
diag23     GB: Immobility in class 3       -0.4477     0.1475     0.0024
diag24     GB: Immobility in class 4        0.0000     0.0000     0.0000
diag25     GB: Immobility in class 5        1.1400     0.2498     0.0000
diag31     JA: Immobility in class 1        4.1316     0.3223     0.0000
diag32     JA: Immobility in class 2        0.2619     0.1284     0.0414
diag33     JA: Immobility in class 3        0.0000     0.0000     0.0000
diag34     JA: Immobility in class 4       -0.0485     0.1693     0.7744
diag35     JA: Immobility in class 5        0.0000     0.0000     0.0000
--------------------------------------------------------------------------

Kappa indices
--------------+-------
Country       | Kappa
--------------+-------
United States |  0.55
Britain       |  0.70
Japan         |  0.51
--------------+-------
As we can see, the output consists of seven items:
A description of the structure of the contingency table object of analysis.
A description of the specification of the fitted model. The output of these first two items can be suppressed by specifying
the option nodetail.
A table reporting goodness-of-fit statistics for both the main model (in this case the additive model) and two benchmark
models: the conditional independence model and the null effect model (see above). In this table, N denotes the total number
of observations, df the residual degrees of freedom, X2 the Pearson chi-squared statistic (with corresponding p-value), G2
the likelihood-ratio chi-squared statistic (with corresponding p-value), rG2 the percent reduction in G2 compared to the
conditional independence model, BIC the Bayesian information criterion, and DI the dissimilarity index. For more details
on these measures, see the Methods and Formulas section below.
A table reporting the maximum likelihood estimates (with corresponding standard errors and p-values) of the $\beta$ parameters. Note that the sign of $\beta$ for Great Britain reported in Table 2 of Yamaguchi’s (1987) article is reversed.
A table reporting the maximum likelihood estimates (with corresponding standard errors and p-values) of the parameters
associated with the variables that express the R by C association pattern. The output of this table can be suppressed by
specifying the option nodisprc.
A table reporting the maximum likelihood estimates (with corresponding standard errors and p-values) of the parameters
associated with the extra variables. The output of this table can be suppressed by specifying the option nodispextra.
A table reporting kappa indices, which express in standardized form the strength of the R by C association within each
layer (Hout et al. 1995, 813; Goodman 1991, 1089). For more details on the kappa index, see the Methods and Formulas
section below.
Yamaguchi (1987) estimates a second version of this model which constrains the beta parameters for the United States and
Great Britain to be equal. To this aim, we use the option constraints as follows:
. unidiff obs, row(father) col(son) lay(country) effect(add) pattern(fi)
> extra(diag11-diag35) constraints(1 1 2) nodetail nodisprc nodispext
Analysis of differences in two-way associations

Goodness-of-fit statistics
-------------------------------------------------------------------------------
Model                 N   df       X2    p      G2    p    rG2     BIC    DI
-------------------------------------------------------------------------------
Cond. indep.      28887   48   6659.6 0.00  5591.5 0.00    0.0  5098.5  16.0
Null effect       28887   22     36.2 0.03    36.2 0.03   99.4  -189.7   0.9
Additive effect   28887   21     30.8 0.08    30.8 0.08   99.4  -184.9   0.7
-------------------------------------------------------------------------------

Beta parameters
--------------+-----------------------------------
Country       |   estimate       s.e.    p-value
--------------+-----------------------------------
United States |     0.0000     0.0000     0.0000
Britain       |     0.0000     0.0000     0.0000
Japan         |    -0.0417     0.0178     0.0192
--------------+-----------------------------------

Kappa indices
--------------+-------
Country       | Kappa
--------------+-------
United States |  0.55
Britain       |  0.70
Japan         |  0.51
--------------+-------
Let us now consider the “homogeneous row and column effects” specification of the pattern of association between father’s class and son’s class. To specify this pattern within unidiff, we type
. unidiff obs, row(father) col(son) lay(country) effect(add) pattern(hrce)
> extra(diag11-diag35) nodispext
Analysis of differences in two-way associations

Table structure
-------------------------------------------------------------------------------
Name              Label                            N. of categories
-------------------------------------------------------------------------------
Row      father   Father's occupational class             5
Column   son      Son's occupational class                5
Layer    country  Country                                 3
-------------------------------------------------------------------------------

Model specification
-------------------------------------------------------------------------------
Layer effect:            additive
R-C association pattern: homogeneous row & column effects I
Additional variables:    diag11 diag12 diag13 diag14 diag15 diag21
                         diag22 diag23 diag24 diag25 diag31 diag32
                         diag33 diag34 diag35
-------------------------------------------------------------------------------

Goodness-of-fit statistics
-------------------------------------------------------------------------------
Model                 N   df       X2    p      G2    p    rG2     BIC    DI
-------------------------------------------------------------------------------
Cond. indep.      28887   48   6659.6 0.00  5591.5 0.00    0.0  5098.5  16.0
Null effect       28887   29    125.5 0.00   107.0 0.00   98.1  -190.9   1.3
Additive effect   28887   27    117.2 0.00    98.4 0.00   98.2  -179.0   1.2
-------------------------------------------------------------------------------

Beta parameters
--------------+-----------------------------------
Country       |   estimate       s.e.    p-value
--------------+-----------------------------------
United States |     0.0000     0.0000     0.0000
Britain       |    -0.0025     0.0148     0.8657
Japan         |    -0.0524     0.0178     0.0034
--------------+-----------------------------------

R-C association parameters
--------------------------------------------------------------------------
Variable   Label                          estimate       s.e.    p-value
--------------------------------------------------------------------------
rc_rc2     Row-Column effect 2             0.1012     0.0329     0.0021
rc_rc3     Row-Column effect 3             0.2729     0.0250     0.0000
rc_rc4     Row-Column effect 4             0.3333     0.0258     0.0000
rc_rc5     Row-Column effect 5             0.3323     0.0227     0.0000
--------------------------------------------------------------------------

Kappa indices
--------------+-------
Country       | Kappa
--------------+-------
United States |  0.62
Britain       |  0.77
Japan         |  0.52
--------------+-------
The multiplicative version of the uniform layer effect model with full interaction pattern of the R by C association and “blocked” diagonal cells can be estimated by

. unidiff obs, row(father) col(son) lay(country) effect(mult) pattern(fi)
> extra(diag11-diag35) nodispext
Iteration 1:  deviance = 53.4755
Iteration 2:  deviance = 35.3281
Iteration 3:  deviance =  0.6868
Iteration 4:  deviance =  0.0206
Iteration 5:  deviance =  0.0009
Iteration 6:  deviance =  0.0000

Analysis of differences in two-way associations

Table structure
-------------------------------------------------------------------------------
Name              Label                            N. of categories
-------------------------------------------------------------------------------
Row      father   Father's occupational class             5
Column   son      Son's occupational class                5
Layer    country  Country                                 3
-------------------------------------------------------------------------------

Model specification
-------------------------------------------------------------------------------
Layer effect:            multiplicative
R-C association pattern: full interaction
Additional variables:    diag11 diag12 diag13 diag14 diag15 diag21
                         diag22 diag23 diag24 diag25 diag31 diag32
                         diag33 diag34 diag35
-------------------------------------------------------------------------------

Goodness-of-fit statistics
-------------------------------------------------------------------------------
Model                 N   df       X2    p      G2    p    rG2     BIC    DI
-------------------------------------------------------------------------------
Cond. indep.      28887   48   6659.6 0.00  5591.5 0.00    0.0  5098.5  16.0
Null effect       28887   22     36.2 0.03    36.2 0.03   99.4  -189.7   0.9
Multipl. effect   28887   20     30.7 0.06    30.9 0.06   99.4  -174.5   0.7
-------------------------------------------------------------------------------

Phi parameters (layer scores)
--------------+-----------------------------------
Country       |      Raw   Scaled 1   Scaled 2
--------------+-----------------------------------
United States |   1.7025     1.0000     0.6064
Britain       |   1.7703     1.0398     0.6305
Japan         |   1.3605     0.7991     0.4845
--------------+-----------------------------------

Psi parameters (R-C association scores)
-------------+----------------------------------------
Father's     |        Son's occupational class
occupational |
class        | UpNonM  LowNon  UpManu  LowMan    Farm
-------------+----------------------------------------
UpNonManual  |   0.00    0.00    0.00    0.00    0.00
LowNonManual |   0.00    0.59    0.51    0.54    0.37
UpManual     |   0.00    0.48    1.14    0.99    0.64
LowManual    |   0.00    0.57    1.14    1.35    0.73
Farm         |   0.00    0.64    1.29    1.55    2.93
-------------+----------------------------------------

Kappa indices
--------------+-------
Country       | Kappa
--------------+-------
United States |  0.55
Britain       |  0.70
Japan         |  0.51
--------------+-------
As we can see, the output includes two new items:
A table reporting the maximum likelihood estimates of the $\varphi$ parameters (layer scores). Three series of parameters are reported: raw estimates, estimates rescaled so that $\varphi_r = 1$, and estimates rescaled so that $\sum_{k=1}^{K}\varphi_k^2 = 1$ (see Xie 1992, 382).
A table reporting the maximum likelihood estimates of the $\psi$ parameters (R by C association scores).
Example 2
To illustrate the estimation of the linear additive uniform layer effect model, in this second example I make use of the
sixteen-country social mobility data originally assembled by Hazelrigg and Garnier (1976) and subsequently analyzed by several
researchers (Grusky and Hauser 1984, Xie 1992). It is a 3 × 3 × 16 contingency table that cross-classifies father’s occupational
class (the row variable), son’s occupational class (the column variable), and country (the layer variable). As in the previous
example, the pattern of association between father’s class and son’s class is assumed to be constant across countries. The purpose
of the analysis is to estimate the effect exerted by some country-level variables on the strength of that association. Following
Hauser and Grusky (1988), four variables have been selected: degree of economic development (measured as per capita energy
consumption in tons of coal), degree of social democracy (measured as percentage of seats in the national legislature held by
social democratic parties), a dummy variable indicating countries belonging to the Eastern block, and a dummy variable indicating
Asian countries (for details, see Hauser and Grusky 1988).
. use example2.dta, clear
. describe
Contains data from example2.dta
  obs:           144
 vars:             8                          27 Dec 1999 10:09
 size:         2,736 (98.5% of memory free)
-------------------------------------------------------------------------------
  1. obs       int    %9.0g             Observed cell frequencies
  2. father    byte   %9.0g    class    Father's occupational class
  3. son       byte   %9.0g    class    Son's occupational class
  4. country   byte   %13.0g   country  Country
  5. develop   float  %9.0g             Economic development index
  6. socdem    float  %9.0g             Social democracy index
  7. east      byte   %9.0g             Eastern block country
  8. asia      byte   %9.0g             Asian country
-------------------------------------------------------------------------------
Sorted by:
. label list
class:
           1 NonManual
           2 Manual
           3 Farm
country:
           1 Australia
           2 Belgium
           3 France
           4 Hungary
           5 Italy
           6 Japan
           7 Philippines
           8 Spain
           9 United States
          10 West Germany
          11 West Malaysia
          12 Yugoslavia
          13 Denmark
          14 Finland
          15 Norway
          16 Sweden
To begin, let us use unidiff to replicate Xie’s (1992) application of the multiplicative model to the sixteen-country social
mobility data:
. unidiff obs, row(father) col(son) lay(country) effect(mult) pattern(fi)
Iteration 1:  deviance = 378.0271
Iteration 2:  deviance =  67.2783
Iteration 3:  deviance =   6.9561
Iteration 4:  deviance =   0.5449
Iteration 5:  deviance =   0.0430
Iteration 6:  deviance =   0.0029
Iteration 7:  deviance =   0.0010
Iteration 8:  deviance =   0.0000

Analysis of differences in two-way associations

Table structure
-------------------------------------------------------------------------------
Name              Label                            N. of categories
-------------------------------------------------------------------------------
Row      father   Father's occupational class             3
Column   son      Son's occupational class                3
Layer    country  Country                                16
-------------------------------------------------------------------------------

Model specification
-------------------------------------------------------------------------------
Layer effect:            multiplicative
R-C association pattern: full interaction
Additional variables:    none
-------------------------------------------------------------------------------

Goodness-of-fit statistics
-------------------------------------------------------------------------------
Model                 N   df        X2    p       G2    p    rG2      BIC    DI
-------------------------------------------------------------------------------
Cond. indep.     113556   64   43389.7 0.00  42970.0 0.00    0.0  42225.0  25.6
Null effect      113556   60    1327.3 0.00   1328.8 0.00   96.9    630.4   3.7
Multipl. effect  113556   45     787.0 0.00    821.7 0.00   98.1    297.9   2.6
-------------------------------------------------------------------------------

Phi parameters (layer scores)
--------------+-----------------------------------
Country       |      Raw   Scaled 1   Scaled 2
--------------+-----------------------------------
Australia     |   6.7022     1.0000     0.2170
Belgium       |   9.1682     1.3679     0.2968
France        |   8.6107     1.2848     0.2788
Hungary       |   7.5955     1.1333     0.2459
Italy         |   9.2504     1.3802     0.2995
Japan         |   7.1213     1.0625     0.2306
Philippines   |   7.3397     1.0951     0.2376
Spain         |   9.1393     1.3636     0.2959
United States |   7.3490     1.0965     0.2379
West Germany  |   6.8584     1.0233     0.2220
West Malaysia |   6.1284     0.9144     0.1984
Yugoslavia    |   6.9040     1.0301     0.2235
Denmark       |   8.6761     1.2945     0.2809
Finland       |   6.9970     1.0440     0.2265
Norway        |   6.0621     0.9045     0.1963
Sweden        |   8.5000     1.2682     0.2752
--------------+-----------------------------------

Psi parameters (R-C association scores)
-------------+-------------------------
Father's     |    Son's occupational
occupational |          class
class        | NonMan  Manual    Farm
-------------+-------------------------
NonManual    |   0.00    0.00    0.00
Manual       |   0.00    0.23    0.14
Farm         |   0.00    0.21    0.48
-------------+-------------------------

Kappa indices
--------------+-------
Country       | Kappa
--------------+-------
Australia     |  0.61
Belgium       |  0.83
France        |  0.78
Hungary       |  0.69
Italy         |  0.84
Japan         |  0.64
Philippines   |  0.66
Spain         |  0.83
United States |  0.66
West Germany  |  0.62
West Malaysia |  0.55
Yugoslavia    |  0.62
Denmark       |  0.78
Finland       |  0.63
Norway        |  0.55
Sweden        |  0.77
--------------+-------
In his analysis, Xie (1992) explores the effect exerted by four country-level variables (partially different from ours) on the strength of the father-son association by computing zero-order correlation coefficients between those variables and the $\varphi$ parameters estimated by the multiplicative model. Alternatively, we can estimate the effect of country-level explanatory variables by means of the linear additive uniform layer effect model.
. unidiff obs, row(father) col(son) lay(country) effect(addlin) pattern(fi)
> scores(develop socdem east asia)
Analysis of differences in two-way associations

Table structure
-------------------------------------------------------------------------------
Name              Label                            N. of categories
-------------------------------------------------------------------------------
Row      father   Father's occupational class             3
Column   son      Son's occupational class                3
Layer    country  Country                                16
-------------------------------------------------------------------------------

Model specification
-------------------------------------------------------------------------------
Layer effect:            linear additive
Layer score variables:   develop socdem east asia
R-C association pattern: full interaction
Additional variables:    none
-------------------------------------------------------------------------------

Goodness-of-fit statistics
-------------------------------------------------------------------------------
Model                 N   df        X2    p       G2    p    rG2      BIC    DI
-------------------------------------------------------------------------------
Cond. indep.     113556   64   43389.7 0.00  42970.0 0.00    0.0  42225.0  25.6
Null effect      113556   60    1327.3 0.00   1328.8 0.00   96.9    630.4   3.7
Lin.add. effect  113556   56    1014.6 0.00   1005.2 0.00   97.7    353.4   3.2
-------------------------------------------------------------------------------

Beta parameters
--------------------------------------------------------------------------
Variable   Label                          estimate       s.e.    p-value
--------------------------------------------------------------------------
develop    Economic development index      -0.0058     0.0037     0.1165
socdem     Social democracy index          -0.0051     0.0005     0.0000
east       Eastern block country            0.2025     0.0329     0.0000
asia       Asian country                   -0.2327     0.0220     0.0000
--------------------------------------------------------------------------

Layer scores
--------------+---------------------------------------
Country       | develop   socdem     east     asia
--------------+---------------------------------------
Australia     |    4.80    39.50     0.00     0.00
Belgium       |    4.73    34.90     0.00     0.00
France        |    2.95    11.60     0.00     0.00
Hungary       |    2.81     0.00     1.00     0.00
Italy         |    1.79    18.60     0.00     0.00
Japan         |    1.78    35.70     0.00     1.00
Philippines   |    0.21     0.00     0.00     1.00
Spain         |    1.02     0.00     0.00     0.00
United States |    9.20     0.00     0.00     0.00
West Germany  |    4.23    37.40     0.00     0.00
West Malaysia |    0.36     8.20     0.00     1.00
Yugoslavia    |    1.19     0.00     1.00     0.00
Denmark       |    4.17    44.20     0.00     0.00
Finland       |    2.68    26.50     0.00     0.00
Norway        |    3.59    49.30     0.00     0.00
Sweden        |    4.51    48.30     0.00     0.00
--------------+---------------------------------------

R-C association parameters
--------------------------------------------------------------------------
Variable   Label                          estimate       s.e.    p-value
--------------------------------------------------------------------------
rc_fi2     Full interaction: level 2        1.9650     0.0224     0.0000
rc_fi3     Full interaction: level 3        1.4379     0.0472     0.0000
rc_fi4     Full interaction: level 4        1.8955     0.0286     0.0000
rc_fi5     Full interaction: level 5        4.2721     0.0530     0.0000
--------------------------------------------------------------------------

Kappa indices
--------------+-------
Country       | Kappa
--------------+-------
Australia     |  0.67
Belgium       |  0.68
France        |  0.75
Hungary       |  0.91
Italy         |  0.73
Japan         |  0.55
Philippines   |  0.65
Spain         |  0.79
United States |  0.79
West Germany  |  0.67
West Malaysia |  0.63
Yugoslavia    |  0.91
Denmark       |  0.65
Finland       |  0.71
Norway        |  0.64
Sweden        |  0.64
--------------+-------
As we can see from the table reporting the $\beta$ parameters, the strength of the association between father’s class and son’s class decreases as both economic development and social democracy increase. Moreover, the father-son association is, ceteris paribus, stronger in the countries belonging to the Eastern block and weaker in the Asian countries. It is important to stress that, given the poor fit of the model, these results should be considered only for pedagogic purposes.
Saved Results
unidiff saves in r():

Scalars
    r(N)         number of observations
    r(df)        residual degrees of freedom
    r(ncells)    number of cells
    r(nrow)      number of categories of row variable
    r(ncol)      number of categories of column variable
    r(nlay)      number of categories of layer variable
    r(X2)        X2
    r(X2_p)      p-value for X2
    r(G2)        G2
    r(G2_p)      p-value for G2
    r(rG2)       reduction in G2
    r(bic)       Bayesian information criterion
    r(di)        dissimilarity index

Macros
    r(cellvar)   name of variable containing the observed cell frequencies
    r(rowvar)    name of row variable
    r(colvar)    name of column variable
    r(layvar)    name of layer variable
    r(effect)    type of uniform layer effect
    r(pattern)   type of R by C association pattern
    r(design)    list of design variables
    r(extra)     list of extra variables
Methods and Formulas
Let fijk denote the observed frequency in cell (i; j; k ), Fijk denote the expected frequency in cell (i; j; k ) under the fitted
model, N denote the total number of observations, and df denote the residual degrees of freedom under the fitted model. The
Pearson chi-squared statistic is
X2 =
The likelihood-ratio chi-squared statistic is
G2 = 2
XXX
I
J K (f ; F )2
ijk
ijk
F
ijk
i=1 j =1 k=1
XXX
I
J K
i=1 j =1 k=1
fijk log(fijk =Fijk )
The percent reduction in $G^2$ is

$$\mathrm{rG}^2 = \left(1 - G^2_{M1}/G^2_{M0}\right) \times 100$$

where $G^2_{M1}$ denotes the likelihood-ratio chi-squared statistic associated with the fitted model, and $G^2_{M0}$ denotes the likelihood-ratio chi-squared statistic associated with the conditional independence model.

The Bayesian information criterion is

$$\mathrm{BIC} = G^2 - \mathrm{df}\,\log(N)$$

The dissimilarity index is

$$\mathrm{DI} = \sum_{i=1}^{I}\sum_{j=1}^{J}\sum_{k=1}^{K} \frac{|f_{ijk} - F_{ijk}|}{2N} \times 100$$
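As a quick arithmetic check, these formulas reproduce the additive-model row of the first goodness-of-fit table in Example 1 ($G^2 = 30.7$, df $= 20$, $N = 28887$, conditional-independence $G^2 = 5591.5$):

. display (1-30.7/5591.5)*100
. display 30.7-20*ln(28887)

These display approximately 99.45 and -174.72, matching the rounded values rG2 = 99.5 and BIC = -174.7 reported there (BIC uses the natural logarithm).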
The raw total interaction effects estimated with option lambda(rawlog) are

$$\lambda_{ij|k} = \begin{cases}
\lambda^{RC}_{ij} + \sum_{t=1}^{T}\gamma_t x_{tijk} & \text{for the null model}\\
\lambda^{RC}_{ij} + \psi_{ij}\beta_k + \sum_{t=1}^{T}\gamma_t x_{tijk} & \text{for the additive model}\\
\lambda^{RC}_{ij} + \psi_{ij}\sum_{v=1}^{V}\beta_v s_{vk} + \sum_{t=1}^{T}\gamma_t x_{tijk} & \text{for the linear additive model}\\
\psi_{ij}\varphi_k + \sum_{t=1}^{T}\gamma_t x_{tijk} & \text{for the multiplicative model}
\end{cases}$$

where the two-way terms $\lambda^{RC}_{ij}$ and $\psi_{ij}$ are parameterized according to the selected pattern of the R by C association.
The standardized total interaction effects estimated with option lambda(stdlog) satisfy the following conditions:

$$\sum_{i=1}^{I} \tilde\lambda_{ij|k} = 0, \qquad j = 1,\ldots,J,\; k = 1,\ldots,K$$
$$\sum_{j=1}^{J} \tilde\lambda_{ij|k} = 0, \qquad i = 1,\ldots,I,\; k = 1,\ldots,K$$

The layer-specific kappa indices are

$$\kappa_k = \sqrt{\sum_{i=1}^{I}\sum_{j=1}^{J} \frac{\tilde\lambda^2_{ij|k}}{IJ}}, \qquad k = 1,\ldots,K$$
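The kappa indices can also be recovered by hand from the standardized effects saved by savelambda(). A minimal sketch, assuming the Example 1 additive model and a hypothetical new variable name lam; since each layer contains $I \times J$ cells, the mean of the squared standardized effects within a layer equals their sum divided by $IJ$:

. unidiff obs, row(father) col(son) lay(country) effect(add) pattern(fi)
> extra(diag11-diag35) savelambda(lam)
. * square the standardized effects, average them within each layer,
. * and take the square root to obtain kappa for each layer
. generate lam2 = lam^2
. egen kappa = mean(lam2), by(country)
. replace kappa = sqrt(kappa)

Listing kappa by country should then match the Kappa indices (0.55, 0.70, and 0.51) reported in Example 1.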
The structural shift parameters estimated with option shd(log) are

$$\log(\alpha_{j|k}) = (\lambda^{C}_{j} + \lambda^{CL}_{jk}) - (\lambda^{R}_{j} + \lambda^{RL}_{jk}), \qquad j = 1,\ldots,J,\; k = 1,\ldots,K$$

where $\log(\alpha_{1|k}) = 0$ for identification purposes.

The structural distances estimated with option shd(log) are

$$\log(\alpha_{(j/j')|k}) = \log(\alpha_{j|k}) - \log(\alpha_{j'|k}), \qquad j, j' = 1,\ldots,J,\; k = 1,\ldots,K$$

The mean structural distances estimated with option shd(log) are

$$\log(\alpha_{(j/C)|k}) = \frac{\sum_{j'=1}^{J} \log(\alpha_{(j/j')|k})}{J-1}, \qquad j = 1,\ldots,J,\; k = 1,\ldots,K$$

The overall structural effects estimated with option shd(log) are

$$\log(\alpha_k) = \frac{\sum_{j=1}^{J}\sum_{j'=1}^{J} \left|\log(\alpha_{(j/j')|k})\right|}{J(J-1)}, \qquad k = 1,\ldots,K$$
References
Erikson, R. and J. H. Goldthorpe. 1992. The Constant Flux: A Study of Class Mobility in Industrial Societies. Oxford: Clarendon Press.
Goodman, L. 1991. Measures, models, and graphical displays in the analysis of cross-classified data. Journal of the American Statistical Association
86: 1085–1111.
Goodman, L. A. and M. Hout. 1998. Statistical methods and graphical displays for analyzing how the association between two qualitative variables differs among countries, among groups, or over time: a modified regression-type approach. In Sociological Methodology 1998, ed. A. Raftery, 175–230. Washington, DC: American Sociological Association.
Grusky, D. B. and R. M. Hauser. 1984. Comparative social mobility revisited: models of convergence and divergence in 16 countries. American
Sociological Review 49: 19–38.
Hauser, R. M. and D. B. Grusky. 1988. Cross-national variation in occupational distributions, relative mobility changes, and intergenerational shifts in
occupational distributions. American Sociological Review 53: 723–741.
Hazelrigg, L. E. and M. A. Garnier. 1976. Occupational mobility in industrial societies: a comparative analysis of differential access to occupational
ranks in seventeen countries. American Sociological Review 41: 498–511.
Hout, M. 1983. Mobility Tables. Thousand Oaks, CA: Sage Publications.
Hout, M., C. Brooks, and J. Manza. 1995. The democratic class struggle in the United States, 1948–1992. American Sociological Review 60: 805–828.
Powers, D. A. and Y. Xie. 2000. Statistical Methods for Categorical Data Analysis. San Diego: Academic Press.
Sobel, M. E., M. Hout, and O. D. Duncan. 1985. Exchange, structure, and symmetry in occupational mobility. American Journal of Sociology 91:
359–372.
Xie, Y. 1991. Model fertility schedules revisited: the log-multiplicative model approach. Social Science Research 20: 355–368.
——. 1992. The log-multiplicative layer effect model for comparing mobility tables. American Sociological Review 57: 380–395.
Yamaguchi, K. 1987. Models for comparing mobility tables: toward parsimony and substance. American Sociological Review 52: 482–494.
snp15
somersd—Confidence intervals for nonparametric statistics and their differences
Roger Newson, Guy’s, King’s and St Thomas’ School of Medicine, London, UK,
[email protected]
Abstract: Rank order or so-called nonparametric methods are in fact based on population parameters, which are zero under the null hypothesis. Two of these parameters are Kendall’s $\tau_a$ and Somers’ $D$, the parameter tested by a Wilcoxon rank-sum test. Confidence limits for these parameters are more informative than p-values alone, for three reasons. Firstly, confidence intervals show that a high p-value does not prove a null hypothesis. Secondly, for continuous data, Kendall’s $\tau_a$ can often be used to define robust confidence limits for Pearson’s correlation by Greiner’s relation. Thirdly, we can define confidence limits for differences between two Kendall’s $\tau_a$’s or Somers’ $D$’s, and these are informative, because a larger Kendall’s $\tau_a$ or Somers’ $D$ cannot be secondary to a smaller one. The program somersd calculates confidence intervals for Somers’ $D$ or Kendall’s $\tau_a$, using jackknife variances. There is a choice of transformations, including Fisher’s $z$, Daniels’ arcsine, Greiner’s $\rho$, and the $z$-transform of Greiner’s $\rho$. A cluster option is available. The estimation results are saved as for a model fit, so that differences can be estimated using lincom.
Keywords: Somers’ D, Kendall’s tau, rank correlation, rank-sum test, Wilcoxon test, confidence intervals, nonparametric methods.
Syntax

somersd varlist [weight] [if exp] [in range] [, cluster(varname) level(#) taua tdist
        transf(transformation name) ]

where transformation name is one of

        iden | z | asin | rho | zrho

fweights, iweights and pweights are allowed.
Description
somersd calculates the nonparametric statistics Somers’ $D$ (corresponding to rank-sum tests) and Kendall’s $\tau_a$, with confidence limits. Somers’ $D$ or $\tau_a$ is calculated for the first variable of varlist as a predictor of each of the other variables in varlist, with estimates and jackknife variances and confidence intervals output and saved in e() as if for the parameters of a model fit. It is possible to use lincom to output confidence limits for differences between the population Somers’ $D$ or Kendall’s $\tau_a$ values.
Options
cluster(varname) specifies the variable which defines sampling clusters. If cluster is defined, then the between-cluster Somers’ $D$ or $\tau_a$ is calculated, and the variances are calculated assuming that the data are sampled from a population of clusters, rather than a population of observations.
level(#) specifies the confidence level, in percent, for confidence intervals of the estimates. The default is level(95) or as
set by set level.
taua causes somersd to calculate Kendall’s $\tau_a$. If taua is absent, then somersd calculates Somers’ $D$.

tdist specifies that the estimates are assumed to have a $t$-distribution with $n-1$ degrees of freedom, where $n$ is the number of clusters if cluster is specified, or the number of observations if cluster is not specified.
transf(transformation name) specifies that the estimates are to be transformed, defining estimates for the transformed population value. iden (identity or untransformed) is the default. z specifies Fisher’s $z$ (the hyperbolic arctangent), asin specifies Daniels’ arcsine, rho specifies Greiner’s $\rho$ (Pearson correlation estimated using Greiner’s relation), and zrho specifies the $z$-transform of Greiner’s $\rho$.
If a varlist is supplied, then all options are allowed. If not, then somersd replays the previous somersd estimation (if available),
and the only option allowed is level(#).
Remarks

The population value of Kendall’s $\tau_a$ (Kendall 1970) is defined as

$$\tau_{XY} = E[\,\mathrm{sign}(X_1 - X_2)\,\mathrm{sign}(Y_1 - Y_2)\,] \eqno(1)$$

where $(X_1, Y_1)$ and $(X_2, Y_2)$ are bivariate random variables sampled independently from the same population, and $E[\cdot]$ denotes expectation. The population value of Somers’ $D$ (Somers 1962) is defined as

$$D_{YX} = \tau_{XY}/\tau_{XX} \eqno(2)$$

Therefore, $\tau_{XY}$ is the difference between two probabilities, namely the probability that the larger of the two $X$-values is associated with the larger of the two $Y$-values and the probability that the larger $X$-value is associated with the smaller $Y$-value. $D_{YX}$ is the difference between the two corresponding conditional probabilities, given that the two $X$-values are not equal. Kendall’s $\tau_a$ is the covariance between $\mathrm{sign}(X_1 - X_2)$ and $\mathrm{sign}(Y_1 - Y_2)$, whereas Somers’ $D$ is the regression coefficient of $\mathrm{sign}(Y_1 - Y_2)$ with respect to $\mathrm{sign}(X_1 - X_2)$. (The correlation coefficient between $\mathrm{sign}(X_1 - X_2)$ and $\mathrm{sign}(Y_1 - Y_2)$ is known as Kendall’s $\tau_b$, and is the geometric mean of $D_{YX}$ and $D_{XY}$.)
Given a sample of data points $(X_i, Y_i)$, we may estimate and test the population values of Kendall’s $\tau_a$ and Somers’ $D$ by the corresponding sample statistics $\hat\tau_{XY}$ and $\hat D_{YX}$. These are commonly known as nonparametric statistics, even though $\tau_{XY}$ and $D_{YX}$ are parameters. The two Wilcoxon rank-sum tests (see [R] signrank) both test hypotheses predicting $D_{YX} = 0$. The two-sample rank-sum test represents the case where $X$ is a binary variable indicating membership of one of two subpopulations. The matched-pairs rank-sum test represents the case where there are paired data $(W_{i1}, W_{i2})$, such that $X_i = \mathrm{sign}(W_{i1} - W_{i2})$ and $Y_i = |W_{i1} - W_{i2}|$. Kendall’s $\tau_a$ is usually tested on “continuous” data, using ktau (see [R] spearman).
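As a concrete illustration of these definitions, $\hat\tau_{XY}$ for unweighted, unclustered data can be computed by brute force over all ordered pairs of observations. A minimal sketch, assuming two variables x and y in memory (this is not how somersd itself performs the computation):

. * sample tau-a: average of sign(x_i - x_j)*sign(y_i - y_j) over the
. * _N*(_N-1) ordered pairs; pairs with i==j contribute zero
. scalar score = 0
. local i=1
. while `i'<=_N {
  2.   local j=1
  3.   while `j'<=_N {
  4.     scalar score = score+sign(x[`i']-x[`j'])*sign(y[`i']-y[`j'])
  5.     local j=`j'+1
  6.   }
  7.   local i=`i'+1
  8. }
. display "tau-a = " score/(_N*(_N-1))

With x = mpg and y = weight in the auto data, this should reproduce the -0.6857 reported by ktau in Example 2 below; replacing y by a copy of x gives $\hat\tau_{XX}$, the probability that two independently sampled x-values differ.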
There are several reasons for preferring confidence intervals to p-values alone:
1. Nonstatisticians often quote a nonsignificant result for a nonparametric test and argue as if they have “proved” a null
hypothesis, when a confidence interval would show a wide range of other hypotheses which also fit the data.
2. In the case of continuous bivariate data, there is a correspondence between Kendall’s $\tau_a$ and the more familiar Pearson’s correlation coefficient $\rho$, known as Greiner’s relation (Kendall 1970). This states that

$$\rho = \sin\left(\frac{\pi}{2}\,\tau_a\right) \eqno(3)$$

and holds if the joint distribution of $X$ and $Y$ is bivariate normal. Under this relation, Kendall’s $\tau_a$-values of 0, $\frac{1}{3}$, $\frac{1}{2}$ and 1 correspond to Pearson’s correlations of 0, $\frac{1}{2}$, $\frac{1}{\sqrt{2}}$ and 1, respectively. A similar correspondence is likely to hold in a wider range of continuous bivariate distributions (Kendall 1949, Newson 1987).

3. Kendall’s $\tau_a$ has the desirable property that a larger $\tau_a$ cannot be secondary to a smaller $\tau_a$, that is, if a positive $\tau_{XY}$ is caused entirely by a monotonic positive relationship of both variables with a third variable $W$, then $\tau_{WX}$ and $\tau_{WY}$ must both be greater than $\tau_{XY}$. If we can show that $\tau_{XY} - \tau_{WY} > 0$ (or, equivalently, that $D_{YX} - D_{YW} > 0$), then this implies that the correlation between $X$ and $Y$ is not caused entirely by the influence of $W$.
To understand the third point, assume that trivariate data points $(W_i, X_i, Y_i)$ are sampled independently from a common population, with discrete probability mass function $f_{W,X,Y}(\cdot,\cdot,\cdot)$ and marginal probability mass function $f_{W,X}(\cdot,\cdot)$. Define the conditional expectation

$$Z(w_1, x_1; w_2, x_2) = E[\,\mathrm{sign}(Y_2 - Y_1) \mid W_1 = w_1, X_1 = x_1, W_2 = w_2, X_2 = x_2\,] \eqno(4)$$

for any $w_1$ and $w_2$ in the range of $W$-values and any $x_1$ and $x_2$ in the range of $X$-values. If we state that the positive relationship between $X_i$ and $Y_i$ is caused entirely by a monotonic positive relationship between both variables and $W_i$, then that is equivalent to stating that

$$Z(w_1, x_1; w_2, x_2) \ge 0 \eqno(5)$$

whenever $w_1 \le w_2$ and $x_2 \le x_1$. However, the difference between the two $\tau_a$ coefficients is

$$\tau_{WY} - \tau_{XY} = 2\sum_{w}\sum_{x_2 < x_1} f_{W,X}(w, x_1)\,f_{W,X}(w, x_2)\,Z(w, x_1; w, x_2)$$
$$\qquad + 2\sum_{x}\sum_{w_1 < w_2} f_{W,X}(w_1, x)\,f_{W,X}(w_2, x)\,Z(w_1, x; w_2, x)$$
$$\qquad + 4\sum_{w_1 < w_2}\sum_{x_2 < x_1} f_{W,X}(w_1, x_1)\,f_{W,X}(w_2, x_2)\,Z(w_1, x_1; w_2, x_2) \eqno(6)$$

This difference must be nonnegative whenever the inequality (5) applies. In particular, if the distribution of the $W_i$ and $X_i$ is nearly continuous, then the difference (6) will be dominated by the third term, representing discordant $(W_i, X_i)$-pairs. The difference between $\tau_a$-values will then be determined by the ordering of the $Y$-values when the larger of two $W$-values is associated with the smaller of two $X$-values.
We now define the formulas for estimating $\tau_{XY}$, $D_{YX}$ and their differences. We assume the general case where the observations are clustered, which becomes the familiar unclustered case when there is one observation per cluster. Suppose there are $n$ clusters, and the $h$th cluster contains $m_h$ observations. Define $w_{hi}$, $X_{hi}$ and $Y_{hi}$ to be the importance weight, $X$-value and $Y$-value, respectively, for the $i$th observation of the $h$th cluster. (Like most estimation commands, somersd treats iweights and pweights as importance weights, and treats fweights as if they denoted a number of identical observations.) Define, for observations in different clusters ($h \ne j$),

$$v_{hijk} = w_{hi}\,w_{jk}, \qquad t^{(XY)}_{hijk} = w_{hi}\,w_{jk}\,\mathrm{sign}(X_{hi} - X_{jk})\,\mathrm{sign}(Y_{hi} - Y_{jk}) \eqno(7)$$

and set $v_{hijk} = t^{(XY)}_{hijk} = 0$ when $h = j$. We will use the usual dot-substitution notation to define (for instance)

$$v_{h\cdot j\cdot} = \sum_{i=1}^{m_h}\sum_{k=1}^{m_j} v_{hijk}, \quad t^{(XY)}_{h\cdot j\cdot} = \sum_{i=1}^{m_h}\sum_{k=1}^{m_j} t^{(XY)}_{hijk}, \quad v_{h\cdot\cdot\cdot} = \sum_{j=1}^{n} v_{h\cdot j\cdot}, \quad t^{(XY)}_{h\cdot\cdot\cdot} = \sum_{j=1}^{n} t^{(XY)}_{h\cdot j\cdot} \eqno(8)$$

and any other sums over any other indices. Given that the clusters are sampled independently from a common population of clusters, we can define

$$V = E[v_{h\cdot j\cdot}], \qquad T_{XY} = E\left[t^{(XY)}_{h\cdot j\cdot}\right] \eqno(9)$$

for all $h \ne j$. (In the terminology of Hoeffding (1948), these quantities are regular functionals of the cluster population distribution, and the expressions inside the square brackets are kernels of these regular functionals.) The quantities we really want to estimate are Kendall’s $\tau_a$ and Somers’ $D$, defined respectively by

$$\tau_{XY} = T_{XY}/V, \qquad D_{YX} = T_{XY}/T_{XX} = \tau_{XY}/\tau_{XX} \eqno(10)$$

(These are equal to the familiar formulas (1) and (2) if each cluster contains one observation with an importance weight of one.) To estimate these, we use the jackknife method of Arvesen (1969) on the regular functionals (9) and use appropriate Taylor polynomials. The functionals $V$ and $T_{XY}$ are estimated by the Hoeffding (1948) $U$-statistics

$$\hat V = \frac{v_{\cdot\cdot\cdot\cdot}}{n(n-1)}, \qquad \hat T_{XY} = \frac{t^{(XY)}_{\cdot\cdot\cdot\cdot}}{n(n-1)} \eqno(11)$$
and the respective jackknife pseudovalues corresponding to the $h$th cluster are given by

$$\psi^{(V)}_h = (n-1)^{-1}\,v_{\cdot\cdot\cdot\cdot} - (n-2)^{-1}\left[v_{\cdot\cdot\cdot\cdot} - 2v_{h\cdot\cdot\cdot}\right]$$
$$\psi^{(XY)}_h = (n-1)^{-1}\,t^{(XY)}_{\cdot\cdot\cdot\cdot} - (n-2)^{-1}\left[t^{(XY)}_{\cdot\cdot\cdot\cdot} - 2t^{(XY)}_{h\cdot\cdot\cdot}\right] \eqno(12)$$
somersd calculates correlation measures for a single variable $X$ with a set of $Y$-variates $(Y^{(1)}, \ldots, Y^{(p)})$. It calculates, in the first instance, the covariance matrix for $\hat V$, $\hat T_{XX}$, and $\hat T_{XY^{(i)}}$ for $1 \le i \le p$. This is done using the jackknife influence matrix $\Theta$, which has $n$ rows labeled by the cluster subscripts, and $p + 2$ columns labeled (in Stata fashion) by the names $V$, $X$, and $Y^{(i)}$ for $1 \le i \le p$. It is defined by

$$\Theta[h, V] = \psi^{(V)}_h - \hat V, \qquad \Theta[h, X] = \psi^{(XX)}_h - \hat T_{XX}, \qquad \Theta[h, Y^{(i)}] = \psi^{(XY^{(i)})}_h - \hat T_{XY^{(i)}} \eqno(13)$$

The jackknife covariance matrix is then equal to

$$\hat C = [n(n-1)]^{-1}\,\Theta'\Theta \eqno(14)$$
The estimates for Kendall’s $\tau_a$ and Somers’ $D$ are defined by

$$\hat\tau_{XY} = \hat T_{XY}/\hat V, \qquad \hat D_{YX} = \hat T_{XY}/\hat T_{XX} \eqno(15)$$

and the covariance matrices are defined using Taylor polynomials. In the case of Somers’ $D$, we define the $p \times (p+2)$ matrix of estimated derivatives $\hat\Gamma^{(D)}$, whose rows are labeled by $Y^{(1)}, \ldots, Y^{(p)}$, and whose columns are labeled by $V, X, Y^{(1)}, \ldots, Y^{(p)}$. This matrix is defined by

$$\hat\Gamma^{(D)}\left[Y^{(i)}, X\right] = \frac{\partial \hat D_{Y^{(i)}X}}{\partial \hat T_{XX}} = -\frac{\hat T_{XY^{(i)}}}{\hat T_{XX}^2}, \qquad \hat\Gamma^{(D)}\left[Y^{(i)}, Y^{(i)}\right] = \frac{\partial \hat D_{Y^{(i)}X}}{\partial \hat T_{XY^{(i)}}} = \frac{1}{\hat T_{XX}} \eqno(16)$$
all other entries being zero. In the case of Kendall’s $\tau_a$, we define a $(p+1) \times (p+2)$ matrix of estimated derivatives $\hat\Gamma^{(\tau)}$, whose rows are labeled by $X, Y^{(1)}, \ldots, Y^{(p)}$, and whose columns are labeled by $V, X, Y^{(1)}, \ldots, Y^{(p)}$. This matrix is defined by

$$\hat\Gamma^{(\tau)}[X, V] = \frac{\partial \hat\tau_{XX}}{\partial \hat V} = -\frac{\hat T_{XX}}{\hat V^2}, \qquad \hat\Gamma^{(\tau)}[X, X] = \frac{\partial \hat\tau_{XX}}{\partial \hat T_{XX}} = \frac{1}{\hat V},$$
$$\hat\Gamma^{(\tau)}\left[Y^{(i)}, V\right] = \frac{\partial \hat\tau_{XY^{(i)}}}{\partial \hat V} = -\frac{\hat T_{XY^{(i)}}}{\hat V^2}, \qquad \hat\Gamma^{(\tau)}\left[Y^{(i)}, Y^{(i)}\right] = \frac{\partial \hat\tau_{XY^{(i)}}}{\partial \hat T_{XY^{(i)}}} = \frac{1}{\hat V} \eqno(17)$$

all other entries again being zero. The estimated dispersion matrices of the Somers’ $D$ and $\tau_a$ estimates are therefore $\hat C^{(D)}$ and $\hat C^{(\tau)}$, respectively, defined by

$$\hat C^{(D)} = \hat\Gamma^{(D)}\,\hat C\,\hat\Gamma^{(D)\prime}, \qquad \hat C^{(\tau)} = \hat\Gamma^{(\tau)}\,\hat C\,\hat\Gamma^{(\tau)\prime} \eqno(18)$$
The transf option offers a choice of transformations. Since these are available both for Somers’ $D$ and for Kendall’s $\tau_a$, we will denote the original estimate as $\theta$ (which can stand for $D$ or $\tau$) and the transformed estimate as $\zeta = \zeta(\theta)$. They are summarized below, together with their derivatives $d\zeta/d\theta$ and their inverses $\theta(\zeta)$.

transf  Transform name       $\zeta(\theta)$                                                       $d\zeta/d\theta$                                                            $\theta(\zeta)$
iden    Untransformed        $\theta$                                                              1                                                                           $\zeta$
z       Fisher's $z$         $\mathrm{arctanh}(\theta) = \frac{1}{2}\log[(1+\theta)/(1-\theta)]$   $(1-\theta^2)^{-1}$                                                         $\tanh(\zeta) = [\exp(2\zeta)-1]/[\exp(2\zeta)+1]$
asin    Daniels' arcsine     $\arcsin(\theta)$                                                     $(1-\theta^2)^{-1/2}$                                                       $\sin(\zeta)$
rho     Greiner's $\rho$     $\sin(\frac{\pi}{2}\theta)$                                           $\frac{\pi}{2}\cos(\frac{\pi}{2}\theta)$                                    $\frac{2}{\pi}\arcsin(\zeta)$
zrho    Greiner's $\rho$     $\mathrm{arctanh}[\sin(\frac{\pi}{2}\theta)]$                         $\frac{\pi}{2}\cos(\frac{\pi}{2}\theta)[1-\sin(\frac{\pi}{2}\theta)^2]^{-1}$  $\frac{2}{\pi}\arcsin[\tanh(\zeta)]$
        ($z$-transformed)
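For example, Fisher’s $z$ and its inverse can be verified directly with display, using the untransformed and $z$-transformed mpg estimates from Example 1 below (.4571678 and .4937249):

. display 0.5*ln((1+.4571678)/(1-.4571678))
. display tanh(.4937249)

The first line returns approximately .4937 and the second approximately .4572, recovering each estimate from the other.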
If transf is specified, then somersd displays and saves the transformed estimates and their estimated covariance, instead of the untransformed versions. If $\hat C^{(\theta)}$ is the covariance matrix for the untransformed estimates given by (18), and $\hat\Gamma^{(\zeta)}$ is the diagonal matrix whose diagonal entries are the $d\zeta/d\theta$ estimates specified in the table, then the transformed parameter and its covariance matrix are

$$\hat\zeta = \zeta(\hat\theta), \qquad \hat C^{(\zeta)} = \hat\Gamma^{(\zeta)}\,\hat C^{(\theta)}\,\hat\Gamma^{(\zeta)\prime} \eqno(19)$$

Fisher’s $z$-transform was originally recommended for the Pearson correlation coefficient by Fisher (1921) (see also Gayen 1951), but Edwardes (1995) recommended it specifically for Somers’ $D$ on the basis of simulation studies. Daniels’ arcsine was suggested as a normalizing transform in Daniels and Kendall (1947). If transf(z) or transf(asin) is specified, then somersd prints asymmetric confidence intervals for the untransformed $D$ or $\tau_a$ values, calculated from symmetric confidence intervals for the transformed parameters using the inverse function $\theta(\zeta)$. (This feature corresponds to the eform option of other estimation commands.) Greiner’s $\rho$ (Kendall 1970) is based on the relation (3), and is designed to estimate the Pearson correlation coefficient corresponding to the measured $\tau_a$. If transf(zrho) is specified, somersd prints asymmetric confidence intervals for Greiner’s $\rho$, using the inverse $z$-transform on symmetric confidence intervals for the $z$-transformed Greiner’s $\rho$.
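As a numerical check of relation (3), Greiner’s $\rho$ for weight in Example 2 below can be recovered by hand from the corresponding $\tau_a$ estimate:

. display sin(_pi/2*(-.68567197))

This returns approximately -.8806, in agreement with the -.88056403 printed by the transf(zrho) output.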
Example 1
In the auto data, we compare US cars with foreign cars regarding weight and fuel efficiency. First, we use ranksum to
give significance tests without confidence intervals:
. ranksum mpg,by(foreign)

Two-sample Wilcoxon rank-sum (Mann-Whitney) test

     foreign |      obs    rank sum    expected
-------------+---------------------------------
    Domestic |       52      1688.5        1950
     Foreign |       22      1086.5         825
-------------+---------------------------------
    combined |       74        2775        2775

unadjusted variance     7150.00
adjustment for ties      -36.95
                       ----------
adjusted variance       7113.05

Ho: mpg(foreign==Domestic) = mpg(foreign==Foreign)
             z =  -3.101
    Prob > |z| =   0.0019
. ranksum weight,by(foreign)

Two-sample Wilcoxon rank-sum (Mann-Whitney) test

     foreign |      obs    rank sum    expected
-------------+---------------------------------
    Domestic |       52      2379.5        1950
     Foreign |       22       395.5         825
-------------+---------------------------------
    combined |       74        2775        2775

unadjusted variance     7150.00
adjustment for ties       -1.06
                       ----------
adjusted variance       7148.94

Ho: weight(foreign==Domestic) = weight(foreign==Foreign)
             z =   5.080
    Prob > |z| =   0.0000
We note that American cars are typically heavier and travel fewer miles per gallon than foreign cars. For confidence
intervals, we use somersd:
. somersd foreign mpg weight

Somers' D
Transformation: Untransformed
Valid observations: 74

------------------------------------------------------------------------------
         |            Jackknife
 foreign |      Coef.   Std. Err.       z     P>|z|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
     mpg |   .4571678     .135146     3.383   0.001       .1922866    .7220491
  weight |  -.7508741    .0832485    -9.020   0.000      -.9140383     -.58771
------------------------------------------------------------------------------
We see that, given a randomly-chosen foreign car and a randomly-chosen American car, the foreign car is 46% more likely to travel more miles per gallon than the American car than vice versa, with confidence limits from 19% to 72% more likely. However, being foreign seems to be more reliable as a negative predictor of weight than as a positive predictor of “fuel efficiency”. We can use lincom to define confidence limits for the difference:
. lincom -weight-mpg

 ( 1) - mpg - weight = 0.0

------------------------------------------------------------------------------
 foreign |      Coef.   Std. Err.       z     P>|z|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
     (1) |   .2937063    .0884397     3.321   0.001       .1203677    .4670449
------------------------------------------------------------------------------
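The point estimate is simply the difference between the magnitudes of the two Somers’ $D$ estimates above: $-(-.7508741) - .4571678 = .2937063$.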
The difference between Somers’ $D$-values is positive. This indicates that, if there are two cars, one heavier and consuming fewer gallons per mile, the other lighter and consuming more gallons per mile, then the second is more likely to be foreign. So maybe 1970’s American cars were not as wasteful as some people think, and were, if anything, more fuel-efficient for their weight than non-American cars at the time. Figure 1 illustrates this graphically. Data points are domestic cars (“D”) and foreign cars (“F”). A regression analysis could show the same thing, but Somers’ $D$ shows it in stronger terms, without contentious assumptions such as linearity. (On the other hand, a regression model is more informative if its assumptions are true, so the two methods are mutually complementary.)
[Figure 1 omitted: a scatterplot of Mileage (mpg) against Weight (lbs.), with points marked “D” for domestic and “F” for foreign cars.]

Figure 1. Applying somersd to the auto data.
The confidence intervals for such high values of Somers’ $D$ would probably be more reliable if we used the $z$-transform, recommended by Edwardes (1995). The results of this are as follows:
. somersd foreign mpg weight,tran(z)

Somers' D
Transformation: Fisher's z
Valid observations: 74

------------------------------------------------------------------------------
         |            Jackknife
 foreign |      Coef.   Std. Err.       z     P>|z|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
     mpg |   .4937249    .1708551     2.890   0.004       .1588551    .8285947
  weight |  -.9749561    .1908547    -5.108   0.000      -1.349024   -.6008878
------------------------------------------------------------------------------

95% CI for untransformed Somers' D
           Somers_D      Minimum      Maximum
mpg       .45716783    .15753219    .67972072
weight   -.75087413   -.87382282   -.53768098
. lincom -weight-mpg

 ( 1) - mpg - weight = 0.0

------------------------------------------------------------------------------
 foreign |      Coef.   Std. Err.       z     P>|z|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
     (1) |   .4812312    .1235452     3.895   0.000       .2390871    .7233753
------------------------------------------------------------------------------
Note that somersd gives not only symmetric confidence limits for the $z$-transformed Somers’ $D$ estimates, but also the more informative asymmetric confidence limits for the untransformed Somers’ $D$ estimates (corresponding to the eform option). The asymmetric confidence limits for the untransformed estimates are closer to zero than the symmetric confidence limits for the untransformed estimates in the previous output, and are probably more realistic. The output to lincom gives confidence limits for the difference between $z$-transformed Somers’ $D$ values. This difference is expressed in $z$-units, but must, of course, be in the same direction as the difference between untransformed Somers’ $D$ values. The conclusions are similar.
Example 2
In this example, we demonstrate Kendall’s $\tau_a$ by comparing weight (pounds) and displacement (cubic inches) as predictors of fuel efficiency (miles per gallon). We first use ktau to carry out significance tests with no confidence limits:
. ktau mpg weight

  Number of obs =      74
Kendall's tau-a = -0.6857
Kendall's tau-b = -0.7059
Kendall's score =   -1852
    SE of score = 213.605   (corrected for ties)

Test of Ho: mpg and weight independent
     Pr > |z| =        0.0000  (continuity corrected)

. ktau mpg displ

  Number of obs =      74
Kendall's tau-a = -0.5942
Kendall's tau-b = -0.6257
Kendall's score =   -1605
    SE of score = 212.850   (corrected for ties)

Test of Ho: mpg and displ independent
     Pr > |z| =        0.0000  (continuity corrected)
We then use somersd (with the taua option and the $z$-transform) to compute the same statistics with confidence limits. Note that somersd also outputs the $\tau_a$ of mpg with mpg, which is simply the probability that two independently sampled mpg-values are not equal.
. somersd mpg weight displ,taua tr(z)

Kendall's tau-a
Transformation: Fisher's z
Valid observations: 74

------------------------------------------------------------------------------
         |            Jackknife
     mpg |      Coef.   Std. Err.       z     P>|z|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
     mpg |   1.802426    .0748368    24.085   0.000       1.655748    1.949103
  weight |  -.8397412     .084022    -9.994   0.000      -1.004421   -.6750612
   displ |  -.6841711     .093055    -7.352   0.000      -.8665556   -.5017866
------------------------------------------------------------------------------

95% CI for untransformed Kendall's tau-a
              Tau_a      Minimum      Maximum
mpg       .94705665    .92964223    .96024957
weight   -.68567197   -.76344472   -.58829928
displ    -.59422436   -.69961991   -.46352103
We can use lincom to compare the two predictors and test whether smaller and heavier cars travel fewer miles per gallon
than larger and lighter cars. This seems to be the case, as weight is a more negative predictor of mpg than displ:
. lincom weight-displ

 ( 1)  weight - displ = 0.0

------------------------------------------------------------------------------
     mpg |      Coef.   Std. Err.       z     P>|z|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
     (1) |  -.1555701    .0742717    -2.095   0.036      -.3011399   -.0100003
------------------------------------------------------------------------------
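The point estimate is just the difference between the two $z$-transformed coefficients above: $-.8397412 - (-.6841711) = -.1555701$.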
We demonstrate the cluster option using the variable manuf, equal to the first word of make, and used in [U] 23.11
Obtaining robust variance estimates to denote manufacturer. This analysis assumes that we are sampling from the population
of car manufacturers rather than the population of car models. The results are as follows:
. somersd mpg weight displ,taua tr(z) cluster(manuf)

Kendall's tau-a
Transformation: Fisher's z
Valid observations: 74
Number of clusters: 23
                          (standard errors adjusted for clustering on manuf)

------------------------------------------------------------------------------
         |            Jackknife
     mpg |      Coef.   Std. Err.       z     P>|z|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
     mpg |    1.83398    .0821029    22.338   0.000       1.673061    1.994898
  weight |  -.8391083    .0917593    -9.145   0.000      -1.018953   -.6592633
   displ |   -.694607    .0976751    -7.111   0.000      -.8860467   -.5031674
------------------------------------------------------------------------------

95% CI for untransformed Kendall's tau-a
              Tau_a      Minimum      Maximum
mpg       .95021392    .93195521    .96366535
weight   -.68533644   -.76943983   -.57787293
displ    -.60093349   -.70943563   -.46460448
. lincom weight-displ

 ( 1)  weight - displ = 0.0

------------------------------------------------------------------------------
     mpg |      Coef.   Std. Err.       z     P>|z|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
     (1) |  -.1445012    .0801437    -1.803   0.071        -.30158    .0125775
------------------------------------------------------------------------------
Note that, in contrast to the case of most estimation commands, the cluster option affects the estimates as well as their
standard errors. This is because the clustered estimates are calculated only from between-cluster comparisons, in this case pairs
of car models from different manufacturers.
Suppose that we are writing for an audience more familiar with Pearson’s correlation than with Kendall’s $\tau_a$. To estimate the Pearson correlations corresponding to our $\tau_a$ coefficients, we use the zrho transform. The results are as follows:
. somersd mpg weight displ,taua tr(zrho)

Kendall's tau-a
Transformation: z-transform of Greiner's rho
Valid observations: 74

------------------------------------------------------------------------------
         |            Jackknife
     mpg |      Coef.   Std. Err.       z     P>|z|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
     mpg |   3.179521    .1458796    21.796   0.000       2.893602    3.465439
  weight |  -1.378273    .1475561    -9.341   0.000      -1.667478   -1.089069
   displ |  -1.108838     .158893    -6.979   0.000      -1.420262   -.7974132
------------------------------------------------------------------------------

95% CI for untransformed Greiner's rho
                Rho      Minimum      Maximum
mpg       .99654393    .99388566    .99804762
weight   -.88056403   -.93121746   -.79653796
displ    -.80365118   -.88965364   -.66258811
;
;
The 59% τ_a between displacement and fuel efficiency (from the unclustered output) is seen to correspond to a more
impressive 80% Pearson correlation. The estimated Greiner's ρ is probably less likely to be oversensitive to outliers than the
usual Pearson coefficient.
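The correspondence can be checked by hand. Greiner's relation gives the Pearson correlation of a bivariate normal population with a given τ_a, and the reported coefficient is its Fisher z-transform; the following is our own arithmetic for the displ row, not output from the insert:
\[
\rho \;=\; \sin\!\left(\frac{\pi \tau_a}{2}\right) \;=\; \sin\!\left(\frac{\pi(-0.595)}{2}\right) \;\approx\; -0.804,
\qquad
z(\rho) \;=\; \tfrac{1}{2}\ln\frac{1+\rho}{1-\rho} \;\approx\; -1.109,
\]
in agreement with the untransformed ρ of -.80365118 and the displ coefficient of -1.108838 in the output above.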
Saved Results

somersd saves in e():

Scalars
    e(N)            number of observations
    e(N_clust)      number of clusters
    e(df_r)         residual degrees of freedom (if tdist present)

Macros
    e(cmd)          somersd
    e(parmlab)      parameter label in output
    e(depvar)       name of X-variable
    e(vcetype)      covariance estimation method (Jackknife)
    e(transf)       transformation specified by transf
    e(param)        parameter (somersd or taua)
    e(tdist)        tdist if specified
    e(clustvar)     name of cluster variable
    e(wtype)        weight type
    e(tranlab)      transformation label in output

Matrices
    e(b)            coefficient vector
    e(V)            variance–covariance matrix of the estimators

Functions
    e(sample)       marks estimation sample

Note that (confusingly) e(depvar) is the X-variable, or predictor variable, in the conventional terminology for defining
Somers' D. somersd is also different from most estimation commands in that its results are not designed to be used by predict.
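Although predict is not supported, the saved results can be inspected and reused with standard Stata commands; for example (our own illustration, not part of the insert):

. * number of observations used in the estimation
. display e(N)
. * estimates and their jackknife variance-covariance matrix
. matrix list e(b)
. matrix list e(V)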
References
Arvesen, J. N. 1969. Jackknifing U-statistics. Annals of Mathematical Statistics 40: 2076–2100.
Daniels, H. E. and M. G. Kendall. 1947. The significance of rank correlation where parental correlation exists. Biometrika 34: 197–208.
Edwardes, M. D. deB. 1995. A confidence interval for Pr(X<Y) − Pr(X>Y) estimated from simple cluster samples. Biometrics 51: 571–578.
Fisher, R. A. 1921. On the “probable error” of a coefficient of correlation deduced from a small sample. Metron 1(4): 3–32.
Gayen, A. K. 1951. The frequency distribution of the product-moment correlation coefficient in random samples of any size drawn from non-normal
universes. Biometrika 38: 219–247.
Hoeffding, W. 1948. A class of statistics with asymptotically normal distribution. Annals of Mathematical Statistics 19: 293–325.
Kendall, M. G. 1949. Rank and product-moment correlation. Biometrika 36: 177–193.
——. 1970. Rank Correlation Methods. 4th ed. London: Griffin.
Newson, R. B. 1987. An analysis of cinematographic cell division data using U-statistics [DPhil dissertation]. Brighton, UK: Sussex University,
301–310.
Somers, R. H. 1962. A new asymmetric measure of association for ordinal variables. American Sociological Review 27: 799–811.
zz10  Cumulative index for STB-49–STB-54

[an] Announcements
    an69     STB-43–STB-48 available in bound format                            STB-49   2
    an70     Fall NetCourse schedule announced                                  STB-50   2
    an71     Spring NetCourse schedule announced                                STB-53   2
[stata] Stata Corporation updates
    stata53  censored option added to sts graph command                         STB-50  34
    stata54  Multiple curves plotted with stcurv command                        STB-54   2
    stata55  Search web for installable packages                                STB-54   4
[dm] Data Management
    dm45.1   Changing string variables to numeric: update                       STB-49   2
    dm45.2   Changing string variables to numeric: correction                   STB-52   2
    dm50.1   Update to defv                                                     STB-51   2
    dm56.1   Update to labedit                                                  STB-51   2
    dm61.1   Update to varxplor                                                 STB-51   2
    dm63.1   A new version of winshow for Stata 6                               STB-53   3
    dm65     A program for saving a model fit as a dataset                      STB-49   2
    dm66     Recoding variables using grouped values                            STB-49   6
    dm66.1   Stata 6 version of recoding variables using grouped values         STB-50   3
    dm66.2   Update of cut to Stata 6                                           STB-51   2
    dm67     Numbers of missing and present values                              STB-49   7
    dm68     Display of variables in blocks                                     STB-50   3
    dm69     Further new matrix commands                                        STB-50   5
    dm70     Extensions to generate, extended                                   STB-50   9
    dm71     Calculating the product of observations                            STB-51   3
    dm72     Alternative ranking procedures                                     STB-51   5
    dm72.1   Alternative ranking procedures: update                             STB-52   2
    dm73     Using categorical variables in Stata                               STB-52   2
    dm73.1   Contrasts for categorical variables: update                        STB-54   7
    dm74     Changing the order of variables in a dataset                       STB-52   8
    dm75     Safe and easy matched merging                                      STB-53   6
    dm76     ICD-9 diagnostic and procedure codes                               STB-54   8
    dm77     Removing duplicate observations in a dataset                       STB-54  16
[gr] Graphics
    gr34.2   Drawing Venn diagrams                                              STB-49   8
    gr34.3   An update to drawing Venn diagrams                                 STB-54  17
    gr36     An extension of for, useful for graphics commands                  STB-49   8
    gr37     Cumulative distribution function plots                             STB-49  10
    gr38     Enhancement to the hilite command                                  STB-50  17
    gr39     3D surface plots                                                   STB-51   7
    gr40     A simple contour plot                                              STB-51  10
    gr41     Distribution function plots                                        STB-51  12
    gr42     Quantile plots, generalized                                        STB-51  16
    gr43     Overlaying graphs                                                  STB-54  19
[ip] Instruction on Programming
    ip18.1   Update to resample                                                 STB-52   9
    ip28     Automatically sorting by subgroup                                  STB-50  20
    ip29     Metadata for user-written contributions to the Stata programming language        STB-52  10
    ip29.1   Metadata for user-written contributions to the Stata programming language: extensions   STB-54  21
[os] Operating system, hardware, & interprogram communication
    os17     Command-name registration at www.stata.com                         STB-51  19
[sbe] Biostatistics & Epidemiology
    sbe27    Assessing confounding effects in epidemiological studies           STB-49  12
    sbe28    Meta-analysis of p-values                                          STB-49  15
    sbe29    Generalized linear models: extensions to the binomial family       STB-50  21
    sbe30    Improved confidence intervals for odds ratios                      STB-51  24
    sbe31    Exact confidence intervals for odds ratios from case–control studies   STB-52  12
    sbe32    Automated outbreak detection from public health surveillance data  STB-54  23
[sg] General Statistics
    sg35.2   Robust tests for the equality of variances: update to Stata 6      STB-53  17
    sg64.1   Update to pwcorrs                                                  STB-49  17
    sg67.1   Update to univar                                                   STB-51  27
    sg81.1   Multivariable fractional polynomials: update                       STB-49  17
    sg81.2   Multivariable fractional polynomials: update                       STB-50  25
    sg84.2   Concordance correlation coefficient: update for Stata 6            STB-54  25
    sg97.1   Revision of outreg                                                 STB-49  23
    sg107.1  Generalized Lorenz curves and related graphs                       STB-49  23
    sg111    A modified likelihood-ratio test command                           STB-49  24
    sg112    Nonlinear regression models involving power or exponential functions of covariates          STB-49  25
    sg112.1  Nonlinear regression models involving power or exponential functions of covariates: update  STB-50  26
    sg113    Tabulation of modes                                                STB-50  26
    sg114    rglm - Robust variance estimates for generalized linear models     STB-50  27
    sg115    Bootstrap standard errors for indices of inequality                STB-51  28
    sg116    Hotdeck imputation                                                 STB-51  32
    sg116.1  Update to hotdeck imputation                                       STB-54  26
    sg117    Robust standard errors for the Foster–Greer–Thorbecke class of poverty indices   STB-51  34
    sg118    Partitions of Pearson's χ² for analyzing two-way tables that have ordered columns  STB-51  37
    sg119    Improved confidence intervals for binomial proportions             STB-52  16
    sg120    Receiver Operating Characteristic (ROC) analysis                   STB-52  19
    sg120.1  Two new options added to rocfit command                            STB-53  18
    sg120.2  Correction to roccomp command                                      STB-54  26
    sg121    Seemingly unrelated estimation and the cluster-adjusted sandwich estimator   STB-52  34
    sg122    Truncated regression                                               STB-52  47
    sg123    Hodges–Lehmann estimation of a shift in location between two populations     STB-52  52
    sg124    Interpreting logistic regression in all its forms                  STB-53  19
    sg125    Automatic estimation of interaction effects and their confidence intervals   STB-53  29
    sg126    Two-parameter log-gamma and log-inverse Gaussian models            STB-53  31
    sg127    Summary statistics for estimation sample                           STB-53  32
    sg128    Some programs for growth estimation in fisheries biology           STB-53  35
    sg129    Generalized linear latent and mixed models                         STB-53  47
    sg130    Box–Cox regression models                                          STB-54  27
    sg131    On the manipulability of Wald tests in Box–Cox regression models   STB-54  36
    sg132    Analysis of variance from summary statistics                       STB-54  42
    sg133    Sequential and drop one term likelihood-ratio tests                STB-54  46
    sg134    Model selection using the Akaike information criterion             STB-54  47
[ssa] Survival Analysis
    ssa13    Analysis of multiple failure-time data with Stata                  STB-49  30
[sts] Time-series, Econometrics
    sts14    Bivariate Granger causality test                                   STB-51  40
[sxd] Experimental Design
    sxd1.1   Update to random allocation of treatments to blocks                STB-50  36
    sxd1.2   Random allocation of treatments balanced in blocks: update         STB-54  49
[zz] Not elsewhere classified
    zz9      Cumulative index for STB-43–STB-48                                 STB-49  40
STB categories and insert codes

Inserts in the STB are presently categorized as follows:

General Categories:
    an    announcements                        ip    instruction on programming
    cc    communications & letters             os    operating system, hardware, &
    dm    data management                              interprogram communication
    dt    datasets                             qs    questions and suggestions
    gr    graphics                             tt    teaching
    in    instruction                          zz    not elsewhere classified

Statistical Categories:
    sbe   biostatistics & epidemiology         ssa   survival analysis
    sed   exploratory data analysis            ssi   simulation & random numbers
    sg    general statistics                   sss   social science & psychometrics
    smv   multivariate analysis                sts   time-series, econometrics
    snp   nonparametric methods                svy   survey sampling
    sqc   quality control                      sxd   experimental design
    sqv   analysis of qualitative variables    szz   not elsewhere classified
    srd   robust methods & statistical diagnostics
In addition, we have granted one other prefix, stata, to the manufacturers of Stata for their exclusive use.
Guidelines for authors
The Stata Technical Bulletin (STB) is a journal that is intended to provide a forum for Stata users of all disciplines and
levels of sophistication. The STB contains articles written by StataCorp, Stata users, and others.
Articles include new Stata commands (ado-files), programming tutorials, illustrations of data analysis techniques, discussions on teaching statistics, debates on appropriate statistical techniques, reports on other programs, and interesting datasets,
announcements, questions, and suggestions.
A submission to the STB consists of
1. An insert (article) describing the purpose of the submission. The STB is produced using plain TeX so submissions using
TeX (or LaTeX) are the easiest for the editor to handle, but any word processor is appropriate. If you are not using TeX and
your insert contains a significant amount of mathematics, please FAX (979–845–3144) a copy of the insert so we can see
the intended appearance of the text.
2. Any ado-files, .exe files, or other software that accompanies the submission.
3. A help file for each ado-file included in the submission. See any recent STB diskette for the structure of a help file. If you
have questions, fill in as much of the information as possible and we will take care of the details.
4. A do-file that replicates the examples in your text. Also include the datasets used in the example. This allows us to verify
that the software works as described and allows users to replicate the examples as a way of learning how to use the software.
5. Files containing the graphs to be included in the insert. If you have used STAGE to edit the graphs in your submission, be
sure to include the .gph files. Do not add titles (e.g., “Figure 1: ...”) to your graphs as we will have to strip them off.
The easiest way to submit an insert to the STB is to first create a single “archive file” (either a .zip file or a compressed
.tar file) containing all of the files associated with the submission, and then email it to the editor at [email protected],
either by first using uuencode if you are working on a Unix platform or by attaching it to an email message if your mailer
allows the sending of attachments. In Unix, for example, to email the current directory and all of its subdirectories:

    tar -cf - . | compress | uuencode xyzz.tar.Z > whatever
    mail [email protected] < whatever
International Stata Distributors

International Stata users may also order subscriptions to the Stata Technical Bulletin from our International Stata Distributors.

Company:           Applied Statistics & Systems Consultants
Address:           P.O. Box 1169, 17100 NAZERATH-ELLIT, Israel
Phone:             +972 (0)6 6100101
Fax:               +972 (0)6 6554254
Email:             [email protected]
Countries served:  Israel

Company:           Axon Technology Company Ltd
Address:           9F, No. 259, Sec. 2, Ho-Ping East Road, TAIPEI 106, Taiwan
Phone:             +886-(0)2-27045535
Fax:               +886-(0)2-27541785
Email:             [email protected]
Countries served:  Taiwan

Company:           Chips Electronics
Address:           Lokasari Plaza 1st Floor Room 82, Jalan Mangga Besar Raya No. 82, JAKARTA, Indonesia
Phone:             62 - 21 - 600 66 47
Fax:               62 - 21 - 600 66 47
Email:             [email protected]
Countries served:  Indonesia

Company:           Dittrich & Partner Consulting
Address:           Kieler Strasse 17, 5. floor, D-42697 Solingen, Germany
Phone:             +49 2 12 / 26 066 - 0
Fax:               +49 2 12 / 26 066 - 66
Email:             [email protected]
URL:               http://www.dpc.de
Countries served:  Germany, Austria, Italy

Company:           IEM
Address:           P.O. Box 2222, PRIMROSE 1416, South Africa
Phone:             +27-11-8286169
Fax:               +27-11-8221377
Email:             [email protected]
Countries served:  South Africa, Botswana, Lesotho, Namibia, Mozambique, Swaziland, Zimbabwe

Company:           MercoStat Consultores
Address:           9 de junio 1389, CP 11400 MONTEVIDEO, Uruguay
Phone:             598-2-613-7905
Fax:               Same
Email:             [email protected]
Countries served:  Uruguay, Argentina, Brazil, Paraguay

Company:           Metrika Consulting
Address:           Mosstorpsvagen 48, 183 30 Taby STOCKHOLM, Sweden
Phone:             +46-708-163128
Fax:               +46-8-7924747
Email:             [email protected]
URL:               http://www.metrika.se
Countries served:  Sweden, Baltic States, Denmark, Finland, Iceland, Norway

Company:           Ritme Informatique
Address:           34, boulevard Haussmann, 75009 Paris, France
Phone:             +33 (0)1 42 46 00 42
Fax:               +33 (0)1 42 46 00 33
Email:             [email protected]
URL:               http://www.ritme.com
Countries served:  France, Belgium, Luxembourg
Company:           Scientific Solutions S.A.
Address:           Avenue du Général Guisan, 5, CH-1009 Pully/Lausanne, Switzerland
Phone:             41 (0)21 711 15 20
Fax:               41 (0)21 711 15 21
Email:             [email protected]
Countries served:  Switzerland

Company:           Smit Consult
Address:           Doormanstraat 19, 5151 GM Drunen, Netherlands
Phone:             +31 416-378 125
Fax:               +31 416-378 385
Email:             [email protected]
URL:               http://www.smitconsult.nl
Countries served:  Netherlands

Company:           Survey Design & Analysis Services P/L
Address:           249 Eramosa Road West, Moorooduc VIC 3933, Australia
Phone:             +61 (0)3 5978 8329
Fax:               +61 (0)3 5978 8623
Email:             [email protected]
URL:               http://survey-design.com.au
Countries served:  Australia, New Zealand

Company:           Timberlake Consultants
Address:           Unit B3 Broomsleigh Bus. Park, Worsley Bridge Road, LONDON SE26 5BN, United Kingdom
Phone:             +44 (0)208 697 3377
Fax:               +44 (0)208 697 3388
Email:             [email protected]
URL:               http://www.timberlake.co.uk
Countries served:  United Kingdom, Eire

Company:           Timberlake Consulting S.L.
Address:           Calle Mendez Nunez, 1, 3, 41011 Sevilla, Spain
Phone:             +34 (9) 5 422 0648
Fax:               +34 (9) 5 422 0648
Email:             [email protected]
Countries served:  Spain

Company:           Timberlake Consultores, Lda.
Address:           Praceta Raúl Brandao, n 1, 1 E, 2720 ALFRAGIDE, Portugal
Phone:             +351 (0)1 471 73 47
Fax:               +351 (0)1 471 73 47
Email:             [email protected]
Countries served:  Portugal

Company:           Unidost A.S.
Address:           Rihtim Cad. Polat Han D:38, Kadikoy, 81320 ISTANBUL, Turkey
Phone:             +90 (216) 414 19 58
Fax:               +90 (216) 336 89 23
Email:             [email protected]
URL:               http://abone.turk.net/unidost
Countries served:  Turkey

Company:           Vishvas Marketing-Mix Services
Address:           C/O S. D. Wamorkar, “Prashant” Vishnu Nagar, Naupada, THANE - 400602, India
Phone:             +91-251-440087
Fax:               +91-22-5378552
Email:             [email protected]
Countries served:  India