
Bias Correction and Out-of-Sample Forecast Accuracy

International Journal of Forecasting, 2012


Munich Personal RePEc Archive

Bias Correction and Out-of-Sample Forecast Accuracy*

Hyeongwoo Kim† and Nazif Durmaz‡
Auburn University

May 2009

Online at https://mpra.ub.uni-muenchen.de/16780/
MPRA Paper No. 16780, posted 14 Aug 2009 06:06 UTC

Abstract

The least squares (LS) estimator suffers from significant downward bias in autoregressive models that include an intercept. By construction, the LS estimator yields the best in-sample fit among a class of linear estimators notwithstanding its bias. Then, why do we need to correct for the bias? To answer this question, we evaluate the usefulness of two popular bias correction methods, proposed by Hansen (1999) and So and Shin (1999), by comparing their out-of-sample forecast performances with that of the LS estimator. We find that the bias-corrected estimators overall outperform the LS estimator. In particular, Hansen's grid bootstrap estimator combined with a rolling window method performs the best.

Keywords: Small-Sample Bias, Grid Bootstrap, Recursive Mean Adjustment, Out-of-Sample Forecast, Diebold-Mariano Test

JEL Classification: C53

* Special thanks go to Henry Kinnucan and Henry Thompson for useful suggestions.
† Department of Economics, Auburn University, 415 W. Magnolia Ave., Auburn, AL 36849. Tel: 334-844-2928. Fax: 334-844-4615. Email: [email protected]
‡ Department of Agricultural Economics and Rural Sociology, Auburn University, Auburn, AL 36849. Tel: 334-844-1949. Fax: 334-844-5639. Email: [email protected]

1 Introduction

It is a well-known statistical fact that the least squares (LS) estimator for autoregressive (AR) processes suffers from serious downward bias in the persistence coefficient when the stochastic process includes a non-zero intercept and/or a deterministic time trend. The bias can be substantial especially when the stochastic process is highly persistent (Andrews, 1993).

Since the seminal work of Kendall (1954), an array of bias-correction methods has been put forward. To name a few, Andrews (1993) proposed a method to obtain the exactly median-unbiased estimator for an AR(1) process with Gaussian errors. Andrews and Chen (1994) extended the work of Andrews (1993) to obtain approximately median-unbiased estimators for higher-order AR(p) processes. Hansen (1999) developed a nonparametric bias correction method, the grid bootstrap (GT), which is robust to distributional assumptions. The GT method has been actively employed by many researchers, among others, Kim and Ogaki (2009), Steinsson (2008), Karanasos et al. (2006), and Murray and Papell (2002). An alternative approach has been proposed by So and Shin (1999), who developed the recursive mean adjustment (RMA) estimator, which belongs to a class of (approximately) mean-unbiased estimators. The RMA estimator is computationally convenient to implement yet powerful, and has been used in the work of Choi et al. (2008), Sul et al. (2005), Taylor (2002), and Cook (2002), for instance.

By construction, the LS estimator provides the best in-sample fit among the class of linear estimators notwithstanding its bias, since it is obtained by minimizing the sum of squared residuals. A natural question then arises: Why do we need to correct for the bias? We attempt to find an answer by comparing the out-of-sample forecast performances of the bias-correction methods with that of the LS estimator.

We apply the GT and the RMA approaches, along with the LS estimator, to quarterly commodity price indices for the period 1974.Q1 to 2008.Q3, obtained from the Commodity Research Bureau (CRB). We find that both bias correction methods overall outperform the LS estimator. In particular, Hansen's GT estimator combined with a rolling window method performs the best.

The organization of the paper is as follows. In Section 2, we explain the source of the bias and how each method corrects for it. We also briefly explain how we evaluate the relative forecast performances. Section 3 reports our major empirical findings, and Section 4 concludes.

2 Bias-Correction Methods

We start with a brief explanation of the source of the bias in the LS estimator for an autoregressive process. Consider the following AR(1) process,

    y_t = c + \rho y_{t-1} + \varepsilon_t,    (1)

where |ρ| < 1 and ε_t is a white noise process. Note that estimating ρ by LS is equivalent to estimating

    (y_t - \bar{y}) = \rho (y_{t-1} - \bar{y}) + \varepsilon_t,    (2)

where \bar{y} = T^{-1} \sum_{j=1}^{T} y_j. The LS estimator for ρ is unbiased only when E[ε_t | (y_{t-1} - ȳ)] = 0. This exogeneity assumption, however, is clearly violated, because ε_t is correlated with y_j for j = t, t+1, ..., T, and thus with ȳ. Therefore, the LS estimator for AR processes with an intercept creates the mean-bias. The bias has an analytical representation, and as Kendall (1954) shows, the LS estimator ρ̂_LS is biased downward. There is no analytical representation of the median-bias; Monte Carlo simulations, however, can easily demonstrate that the LS estimator produces significant median-bias for ρ when ρ gets close to unity (see Hansen, 1999).
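To make the downward bias concrete, the following small Monte Carlo sketch (our illustration, not part of the paper) fits the demeaned regression (2) by LS on artificial AR(1) samples with T = 100 and ρ = 0.95; both the mean and the median of the LS estimates fall well below the true value.

```python
import numpy as np

rng = np.random.default_rng(0)


def ls_rho(y):
    # LS estimate of rho from the demeaned regression (2)
    x = y[:-1] - y.mean()
    z = y[1:] - y.mean()
    return (x @ z) / (x @ x)


T, reps, rho, c = 100, 5000, 0.95, 1.0
estimates = np.empty(reps)
for r in range(reps):
    y = np.empty(T)
    y[0] = c / (1 - rho)                  # start at the unconditional mean
    eps = rng.standard_normal(T)
    for t in range(1, T):
        y[t] = c + rho * y[t - 1] + eps[t]
    estimates[r] = ls_rho(y)

print(f"true rho = {rho}")
print(f"mean LS estimate   = {estimates.mean():.3f}")    # well below 0.95
print(f"median LS estimate = {np.median(estimates):.3f}")
```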
When ε_t is serially correlated, it is convenient to express (1) as

    y_t = c + \rho y_{t-1} + \sum_{j=1}^{k} \beta_j \Delta y_{t-j} + u_t,    (3)

where u_t is a white noise process that generates ε_t. (When the stochastic process is of higher order than AR(1), exact bias-correction is not possible, because the bias becomes random due to the existence of nuisance parameters. For higher-order AR(p) models, the RMA and the GT methods yield approximately mean- and median-unbiased estimators, respectively.)

For Hansen's (1999) GT method, we define the following grid-t statistic,

    t_N(\rho_i) = \frac{\hat{\rho}_{LS} - \rho_i}{se(\hat{\rho}_{LS})},

where ρ̂_LS is the LS point estimate of ρ, se(ρ̂_LS) denotes the corresponding LS standard error, and ρ_i is one of M fine grid points in the neighborhood of ρ̂_LS. Implementing LS estimations for B bootstrap samples at each of the M grid points, we obtain the α% quantile function estimates,

    \hat{q}^{*}_{N,\alpha}(\rho_i) = \hat{q}^{*}_{N,\alpha}(\rho_i, \varphi(\rho_i)),

where φ denotes nuisance parameters, such as the βs, that are functions of ρ_i. After smoothing the quantile function estimates, the (approximately) median-unbiased estimate is obtained as

    \hat{\rho}_G = \rho_i \in \mathbb{R}, \ \text{s.t.} \ t_N(\rho_i) = \tilde{q}^{*}_{N,50\%}(\rho_i),

where \tilde{q}^{*}_{N,50\%}(\rho_i) is the smoothed 50% quantile function estimate obtained from \hat{q}^{*}_{N,\alpha}. (We used the Epanechnikov kernel K(u) = 3(1 - u^2)/4 \cdot I(|u| \leq 1), where I(·) is an indicator function.) To correct for the median-bias in the β_j estimates, we treat the other βs as well as ρ as nuisance parameters and follow the procedures described above.
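The sketch below illustrates the mechanics of the grid bootstrap for the AR(1) case. It is a simplified rendering under our own choices, namely a residual bootstrap with B = 399 replications, 30 grid points, and nearest-grid-point selection in place of Hansen's kernel-smoothed quantile functions; it is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(1)


def ls_fit(y):
    # LS fit of y_t = c + rho * y_{t-1} + e_t; returns (rho_hat, se(rho_hat))
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    beta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    resid = y[1:] - X @ beta
    s2 = resid @ resid / (len(resid) - 2)
    se_rho = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1], se_rho


def grid_bootstrap(y, n_grid=30, B=399):
    rho_ls, se = ls_fit(y)
    grid = np.linspace(rho_ls - 4 * se, rho_ls + 4 * se, n_grid)
    med_q = np.empty(n_grid)
    for i, r in enumerate(grid):
        # impose rho_i, re-estimate the intercept, and resample residuals
        c_i = np.mean(y[1:] - r * y[:-1])
        e_i = y[1:] - c_i - r * y[:-1]
        e_i = e_i - e_i.mean()
        t_boot = np.empty(B)
        for b in range(B):
            e_star = rng.choice(e_i, size=len(e_i), replace=True)
            y_star = np.empty(len(y))
            y_star[0] = y[0]
            for t in range(1, len(y)):
                y_star[t] = c_i + r * y_star[t - 1] + e_star[t - 1]
            rb, seb = ls_fit(y_star)
            t_boot[b] = (rb - r) / seb        # bootstrap analogue of t_N(rho_i)
        med_q[i] = np.quantile(t_boot, 0.50)  # 50% quantile function estimate
    # median-unbiased estimate: grid point where t_N(rho_i) meets the median curve
    t_n = (rho_ls - grid) / se
    return grid[np.argmin(np.abs(t_n - med_q))]
```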
So and Shin's (1999) RMA estimator demeans the variables using the partial mean instead of the global mean ȳ. Rather than implementing LS on (2), the RMA estimator is obtained by LS estimation of the following regression equation,

    (y_t - \bar{y}_{t-1}) = \rho (y_{t-1} - \bar{y}_{t-1}) + \eta_t,

where \bar{y}_{t-1} = (t-1)^{-1} \sum_{j=1}^{t-1} y_j and \eta_t = \varepsilon_t - (1-\rho)(t-1)^{-1} \sum_{j=1}^{t-1} y_j. Note that the error term η_t is independent of (y_{t-1} - ȳ_{t-1}), which results in bias reduction for the RMA estimator ρ̂_R. For a higher-order AR process such as (3), the RMA estimator can be obtained by treating the βs as nuisance parameters, as in Hansen's (1999) GT method.
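A minimal sketch of the RMA regression for the AR(1) case follows; the cumulative-sum shortcut for the partial means is our own. Because the estimator requires only a single LS pass, it avoids the simulation burden of the grid bootstrap, which is the computational convenience noted above.

```python
import numpy as np


def rma_rho(y):
    # RMA estimate: regress (y_t - ybar_{t-1}) on (y_{t-1} - ybar_{t-1})
    y = np.asarray(y, dtype=float)
    t = np.arange(1, len(y))                  # t - 1 = 1, ..., T - 1 observations used
    partial_mean = np.cumsum(y)[:-1] / t      # ybar_{t-1} = mean of y_1, ..., y_{t-1}
    lhs = y[1:] - partial_mean                # y_t - ybar_{t-1}
    rhs = y[:-1] - partial_mean               # y_{t-1} - ybar_{t-1}
    return (rhs @ lhs) / (rhs @ rhs)
```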
We use the conventional method proposed by Diebold and Mariano (1995) to evaluate the out-of-sample forecast accuracy of each bias-correction method relative to that of the LS estimator. Let y^1_{t+h|t} and y^2_{t+h|t} denote two competing (out-of-sample) h-step forecasts given the information set at time t. The forecast errors from the two models are

    \varepsilon^{1}_{t+h|t} = y_{t+h} - y^{1}_{t+h|t}, \quad \varepsilon^{2}_{t+h|t} = y_{t+h} - y^{2}_{t+h|t}.

For the Diebold-Mariano test, define the loss differential

    d_t = L(\varepsilon^{1}_{t+h|t}) - L(\varepsilon^{2}_{t+h|t}),

where L(ε^j_{t+h|t}), j = 1, 2, is a loss function. (One may use either the squared error loss function, (ε^j_{t+h|t})^2, or the absolute error loss function, |ε^j_{t+h|t}|.) To test the null of equal predictive accuracy, H_0: E d_t = 0, the Diebold-Mariano statistic (DM) is defined as

    DM = \frac{\bar{d}}{\sqrt{\widehat{Avar}(\bar{d})}},

where \bar{d} is the sample mean loss differential,

    \bar{d} = \frac{1}{T - T_0} \sum_{t=T_0+1}^{T} d_t,

and \widehat{Avar}(\bar{d}) is the asymptotic variance of \bar{d},

    \widehat{Avar}(\bar{d}) = \frac{1}{T - T_0} \sum_{j=-q}^{q} k(j, q) \hat{\Gamma}_j,

where k(·) denotes a kernel function with k(j, q) = 0 for |j| > q, and \hat{\Gamma}_j is the j-th autocovariance function estimate. (Following Andrews and Monahan (1992), we use the quadratic spectral kernel with automatic bandwidth selection for our analysis.) Under the null, DM asymptotically has the standard normal distribution.
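A compact sketch of the test under the squared error loss is given below. For simplicity it uses the truncated kernel with k(j, q) = 1 for |j| ≤ q = h - 1 rather than the quadratic spectral kernel with automatic bandwidth used in the paper. With d_t defined as the LS loss minus the competing loss, a positive statistic favors the bias-corrected estimator, consistent with the sign convention in Tables 2 and 3.

```python
import math
import numpy as np


def dm_test(e1, e2, h=1):
    # loss differential d_t under squared error loss
    d = np.asarray(e1) ** 2 - np.asarray(e2) ** 2
    n = len(d)
    dbar = d.mean()
    dc = d - dbar
    # long-run variance with the truncated kernel: weight 1 for |j| <= q = h - 1
    q = h - 1
    lrv = dc @ dc / n                          # Gamma_0
    for j in range(1, q + 1):
        lrv += 2.0 * (dc[j:] @ dc[:-j]) / n    # Gamma_j and Gamma_{-j}
    dm = dbar / math.sqrt(lrv / n)
    # two-sided p-value from the asymptotic standard normal distribution
    pval = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(dm) / math.sqrt(2.0))))
    return dm, pval
```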
3 Empirical Results

We use quarterly commodity price indices, the CRB Spot Index and its six sub-indices, obtained from the Commodity Research Bureau (CRB) for the period 1974 to 2008. (To reduce noise in the data, we converted the monthly raw data to quarterly frequency by taking end-of-period values. Alternatively, one may use quarterly averages; averaging time series data, however, creates time aggregation bias, as pointed out by Taylor, 2001.) We noticed a structural break in these series in 1973, the year of the demise of the Bretton Woods system (see Figure 1). Since our main objective is to evaluate the relative forecast performances of competing estimators, we use observations starting from 1974.Q1 instead of using a dummy variable for the Bretton Woods era.

Table 1 reports our estimates of the persistence parameter in (3). We find that both the RMA and the GT methods yield significant bias corrections. For example, the ρ estimate for the Spot Index increases from 0.950 (LS) to 0.969 (RMA) and 0.975 (GT). This is far from negligible, because the corresponding half-life estimates are 3.378, 5.503, and 6.844 years, respectively. Note also that the median-unbiased estimates by the GT are not restricted to be less than one, because the GT is based on the local-to-unity framework and allows even mildly explosive processes. (When the true data generating process is I(1), one may use AR models with differenced variables and then correct for biases. The median/mean bias of such models, however, tends to be small, because differenced variables often exhibit much weaker persistence. Since we are interested in evaluating the usefulness of bias-corrected estimators, we do not consider such models.)

We evaluate the out-of-sample forecasting ability of the three estimators, the LS, the RMA, and the GT, with two alternative forecasting methods, sketched in code below. First, we use the first 69 of the 139 observations to obtain h-step-ahead forecasts, and then keep forecasting recursively, adding one observation in each iteration, until we forecast the last observation. Second, we obtain h-step-ahead forecasts using the first 69 observations, then keep forecasting with a rolling window, adding and dropping one observation in each iteration, maintaining 69 observations, until we reach the end of the full sample.
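Both designs can be written as one loop, as in the sketch below; `estimate` and `forecast` are hypothetical placeholders for fitting (3) by the LS, RMA, or GT estimator and iterating the fitted model h steps ahead.

```python
import numpy as np


def out_of_sample_errors(y, h, estimate, forecast, window=69, rolling=False):
    # h-step-ahead forecast errors under the recursive or rolling-window scheme
    errors = []
    for end in range(window, len(y) - h + 1):
        start = end - window if rolling else 0     # rolling drops the oldest point
        params = estimate(y[start:end])            # re-estimate in every iteration
        y_hat = forecast(params, y[start:end], h)  # h-step-ahead point forecast
        errors.append(y[end + h - 1] - y_hat)      # realized value minus forecast
    return np.asarray(errors)
```

The RMSPE reported in Tables 2 and 3 is then `np.sqrt(np.mean(errors ** 2))` for each estimator and horizon.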
We report our results in Tables 2 and 3. Overall, we find that both bias-correction methods outperform the LS estimator, with the exception of the Textiles Sub-Index. No matter which method is employed, the ratios of root mean squared prediction errors (RMSPE), LS/RMA and LS/GT, are mostly greater than one, which implies higher prediction precision of these methods relative to the LS estimator. For example, 4-period (1-year) ahead out-of-sample forecasts for the Spot Index by the LS, RMA, and GT with the recursive method yield RMSPEs of 0.104, 0.099, and 0.102, respectively (see Table 2). Because the ratio LS/RMA (1.050) is greater than LS/GT (1.018), and both ratios are greater than 1, the RMA performs the best and the LS the worst in this case. The corresponding Diebold-Mariano statistic shows that the RMA outperforms the LS at the 5% significance level. The evidence of superior performance of the GT is weaker than for the RMA, because the corresponding p-value is 0.185, that is, significant only at the 20% significance level.

When we use the rolling window method for the 4-period-ahead Spot Index forecasts, the grid bootstrap works the best and the LS performs the worst. The GT is superior to the LS at the 1% significance level, while the RMA outperforms the LS at the 5% level.

Another interesting finding is that a long memory is not necessarily good, because forecast performance seems better with the rolling window method. It is easy to see that the RMSPEs for each estimator are much smaller when we employ the rolling window strategy rather than the recursive method. (We implemented the same analysis for the sample period 1974.Q1 to 2005.Q4 to see whether the recent persistent movements of the commodity indices significantly affected our results. We found very similar results.) In particular, Hansen's GT estimator combined with the rolling window method performs the best, because the associated RMSPEs are the smallest in the majority of cases.

4 Concluding Remarks

This paper evaluates the relative forecast performances of two bias-correction methods, the RMA and the GT, against the LS estimator without bias correction. When an intercept, or an intercept and a linear time trend, is included in an AR model, the LS estimator for the slope coefficient is downward-biased. Despite the bias, the LS estimator provides the best in-sample fit among a class of linear estimators. We attempt to find some justification for using these bias-correction methods by comparing their out-of-sample forecast accuracy with that of the LS estimator. Using the CRB Spot Index and its six sub-indices, we find that both methods overall outperform the LS estimator. In particular, Hansen's GT performs the best when it is combined with the rolling window strategy.

References

Andrews, D. W. K. (1993): "Exactly Median-Unbiased Estimation of First Order Autoregressive/Unit Root Models," Econometrica, 61, 139-165.

Andrews, D. W. K., and H.-Y. Chen (1994): "Approximately Median-Unbiased Estimation of Autoregressive Models," Journal of Business and Economic Statistics, 12, 187-204.

Andrews, D. W. K., and J. C. Monahan (1992): "An Improved Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimator," Econometrica, 60, 953-966.

Choi, C.-Y., N. C. Mark, and D. Sul (2008): "Bias Reduction in Dynamic Panel Data Models by Common Recursive Mean Adjustment," manuscript.

Cook, S. (2002): "Correcting Size Distortion of the Dickey-Fuller Test via Recursive Mean Adjustment," Statistics and Probability Letters, 60, 75-79.

Diebold, F. X., and R. S. Mariano (1995): "Comparing Predictive Accuracy," Journal of Business and Economic Statistics, 13, 253-263.

Hansen, B. E. (1999): "The Grid Bootstrap and the Autoregressive Model," Review of Economics and Statistics, 81, 594-607.

Karanasos, M., S. H. Sekioua, and N. Zeng (2006): "On the Order of Integration of Monthly US Ex-ante and Ex-post Real Interest Rates: New Evidence from over a Century of Data," Economics Letters, 90, 163-169.

Kendall, M. G. (1954): "Note on Bias in the Estimation of Autocorrelation," Biometrika, 41, 403-404.

Kim, H., and M. Ogaki (2009): "Purchasing Power Parity and the Taylor Rule," Ohio State University Department of Economics Working Paper No. 09-03.

Murray, C. J., and D. H. Papell (2002): "The Purchasing Power Parity Persistence Paradigm," Journal of International Economics, 56, 1-19.

Ng, S., and P. Perron (2001): "Lag Length Selection and the Construction of Unit Root Tests with Good Size and Power," Econometrica, 69, 1519-1554.

So, B. S., and D. W. Shin (1999): "Recursive Mean Adjustment in Time-Series Inferences," Statistics and Probability Letters, 43, 65-73.

Steinsson, J. (2008): "The Dynamic Behavior of the Real Exchange Rate in Sticky-Price Models," American Economic Review, 98, 519-533.

Sul, D., P. C. B. Phillips, and C.-Y. Choi (2005): "Prewhitening Bias in HAC Estimation," Oxford Bulletin of Economics and Statistics, 67, 517-546.

Taylor, A. M. (2001): "Potential Pitfalls for the Purchasing-Power-Parity Puzzle? Sampling and Specification Biases in Mean-Reversion Tests of the Law of One Price," Econometrica, 69, 473-498.

Taylor, R. (2002): "Regression-Based Unit Root Tests with Recursive Mean Adjustment for Seasonal and Nonseasonal Time Series," Journal of Business and Economic Statistics, 20, 269-281.

Table 1. Persistence Parameter Estimation Results

| Index | ρ_L | 95% CI | ρ_R | 95% CI | ρ_G | 95% CI |
|---|---|---|---|---|---|---|
| Spot | 0.950 | [0.856, 0.972] | 0.969 | [0.872, 0.985] | 0.975 | [0.910, 1.022] |
| Livestock | 0.933 | [0.770, 0.966] | 0.972 | [0.795, 0.986] | 0.990 | [0.875, 1.044] |
| Fats & Oil | 0.933 | [0.776, 0.965] | 0.951 | [0.800, 0.985] | 0.997 | [0.864, 1.049] |
| Foodstuff | 0.952 | [0.813, 0.976] | 0.977 | [0.836, 0.993] | 1.008 | [0.890, 1.049] |
| Raw Industrials | 0.940 | [0.847, 0.966] | 0.969 | [0.863, 0.979] | 0.955 | [0.907, 1.009] |
| Textiles | 0.917 | [0.807, 0.951] | 0.947 | [0.824, 0.967] | 0.932 | [0.874, 1.003] |
| Metals | 0.963 | [0.870, 0.981] | 0.974 | [0.887, 0.993] | 0.996 | [0.929, 1.024] |

| Index | HL_L | 95% CI | HL_R | 95% CI | HL_G | 95% CI |
|---|---|---|---|---|---|---|
| Spot | 3.378 | [1.114, 6.102] | 5.503 | [1.265, 11.47] | 6.844 | [1.837, ∞] |
| Livestock | 2.499 | [0.663, 5.010] | 6.102 | [0.755, 12.29] | 17.24 | [1.298, ∞] |
| Fats & Oil | 2.499 | [0.683, 4.864] | 3.449 | [0.777, 11.47] | 57.68 | [1.185, ∞] |
| Foodstuff | 3.523 | [0.837, 7.133] | 7.447 | [0.967, 24.70] | ∞ | [1.487, ∞] |
| Raw Industrials | 2.801 | [1.044, 5.010] | 5.503 | [1.176, 8.165] | 3.764 | [1.775, ∞] |
| Textiles | 2.000 | [0.808, 3.449] | 3.182 | [0.895, 5.164] | 2.461 | [1.287, ∞] |
| Metals | 4.596 | [1.244, 9.033] | 6.578 | [1.445, 24.70] | 43.24 | [2.353, ∞] |

Note: i) The number of lags (k) was chosen by the general-to-specific rule recommended by Ng and Perron (2001). ii) ρ_L, ρ_R, and ρ_G denote the least squares (LS), recursive mean adjustment (RMA, So and Shin 1999), and grid bootstrap (GT, Hansen 1999) estimates of the persistence parameter, respectively. iii) 95% confidence intervals (CI) were constructed by 10,000 nonparametric bootstrap simulations for the LS and RMA estimators, and by 10,000 nonparametric bootstrap simulations on 30 grid points over the neighborhood of the LS estimate for the GT estimator. iv) HL_L, HL_R, and HL_G denote the corresponding half-lives in years, calculated as (ln(0.5)/ln(ρ))/4.
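As a quick check of note iv), the half-lives reported above for the Spot Index follow directly from the point estimates:

```python
import numpy as np

# half-life in years for quarterly data: (ln 0.5 / ln rho) / 4, as in note iv)
for name, rho in [("LS", 0.950), ("RMA", 0.969), ("GT", 0.975)]:
    print(f"Spot, {name}: {np.log(0.5) / np.log(rho) / 4:.3f} years")
# prints 3.378, 5.503, and 6.844, matching the Spot row of Table 1
```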
Table 2. Recursive Out-of-Sample Forecast Results

| Index | h | RMSPE_L | RMSPE_R | RMSPE_G | LS/RMA | LS/GT | DM_R | DM_G |
|---|---|---|---|---|---|---|---|---|
| Spot | 1 | 0.045 | 0.044 | 0.045 | 1.031 | 1.004 | 1.183 (0.237) | 0.180 (0.857) |
| | 2 | 0.066 | 0.063 | 0.064 | 1.059 | 1.033 | 1.808 (0.071) | 1.310 (0.190) |
| | 3 | 0.084 | 0.078 | 0.081 | 1.065 | 1.029 | 2.555 (0.011) | 1.544 (0.122) |
| | 4 | 0.104 | 0.099 | 0.102 | 1.050 | 1.018 | 2.421 (0.015) | 1.324 (0.185) |
| | 6 | 0.141 | 0.138 | 0.139 | 1.026 | 1.012 | 1.456 (0.145) | 0.917 (0.359) |
| Livestock | 1 | 0.082 | 0.079 | 0.081 | 1.035 | 1.012 | 1.561 (0.119) | 1.182 (0.237) |
| | 2 | 0.118 | 0.110 | 0.115 | 1.066 | 1.025 | 2.598 (0.009) | 2.585 (0.010) |
| | 3 | 0.128 | 0.124 | 0.127 | 1.035 | 1.012 | 2.064 (0.039) | 1.450 (0.147) |
| | 4 | 0.144 | 0.138 | 0.142 | 1.039 | 1.011 | 2.683 (0.007) | 1.839 (0.066) |
| | 6 | 0.178 | 0.172 | 0.174 | 1.034 | 1.021 | 1.810 (0.070) | 2.027 (0.043) |
| Fats & Oil | 1 | 0.110 | 0.109 | 0.110 | 1.003 | 0.995 | 0.397 (0.692) | -0.360 (0.719) |
| | 2 | 0.159 | 0.157 | 0.156 | 1.013 | 1.018 | 1.712 (0.087) | 1.780 (0.075) |
| | 3 | 0.174 | 0.173 | 0.172 | 1.008 | 1.011 | 1.294 (0.196) | 1.543 (0.123) |
| | 4 | 0.193 | 0.192 | 0.192 | 1.001 | 1.003 | 0.230 (0.818) | 0.458 (0.647) |
| | 6 | 0.245 | 0.246 | 0.247 | 0.994 | 0.992 | -1.082 (0.279) | -1.608 (0.108) |
| Foodstuff | 1 | 0.063 | 0.062 | 0.062 | 1.027 | 1.029 | 1.521 (0.128) | 1.113 (0.266) |
| | 2 | 0.090 | 0.088 | 0.087 | 1.032 | 1.040 | 2.172 (0.030) | 3.458 (0.001) |
| | 3 | 0.105 | 0.103 | 0.103 | 1.017 | 1.022 | 1.532 (0.125) | 2.116 (0.034) |
| | 4 | 0.124 | 0.122 | 0.121 | 1.015 | 1.020 | 1.326 (0.185) | 1.864 (0.062) |
| | 6 | 0.157 | 0.156 | 0.156 | 1.003 | 1.004 | 0.299 (0.765) | 0.559 (0.576) |
| Raw Industrials | 1 | 0.049 | 0.047 | 0.048 | 1.028 | 1.009 | 1.053 (0.292) | 0.721 (0.471) |
| | 2 | 0.076 | 0.072 | 0.074 | 1.057 | 1.021 | 1.800 (0.072) | 1.444 (0.149) |
| | 3 | 0.097 | 0.092 | 0.095 | 1.056 | 1.023 | 2.639 (0.008) | 1.642 (0.101) |
| | 4 | 0.122 | 0.118 | 0.121 | 1.036 | 1.010 | 2.235 (0.025) | 0.963 (0.335) |
| | 6 | 0.162 | 0.157 | 0.159 | 1.030 | 1.015 | 1.980 (0.048) | 1.339 (0.181) |
| Textiles | 1 | 0.037 | 0.037 | 0.037 | 0.993 | 0.989 | -0.450 (0.653) | -0.935 (0.350) |
| | 2 | 0.056 | 0.056 | 0.056 | 0.997 | 0.999 | -0.115 (0.908) | -0.072 (0.943) |
| | 3 | 0.074 | 0.075 | 0.074 | 0.990 | 0.994 | -0.532 (0.595) | -0.448 (0.654) |
| | 4 | 0.089 | 0.091 | 0.090 | 0.978 | 0.985 | -1.776 (0.076) | -1.962 (0.050) |
| | 6 | 0.109 | 0.113 | 0.112 | 0.964 | 0.973 | -2.240 (0.025) | -2.417 (0.016) |
| Metals | 1 | 0.087 | 0.085 | 0.086 | 1.020 | 1.014 | 1.878 (0.060) | 0.612 (0.540) |
| | 2 | 0.139 | 0.135 | 0.134 | 1.031 | 1.034 | 2.296 (0.022) | 1.283 (0.199) |
| | 3 | 0.187 | 0.181 | 0.178 | 1.033 | 1.046 | 3.540 (0.000) | 3.078 (0.002) |
| | 4 | 0.226 | 0.223 | 0.221 | 1.016 | 1.024 | 2.565 (0.010) | 2.102 (0.036) |
| | 6 | 0.309 | 0.303 | 0.301 | 1.019 | 1.025 | 2.546 (0.011) | 2.458 (0.014) |

Note: i) Out-of-sample forecasting was implemented recursively, sequentially adding one observation from the 69 initial observations toward the 139 total observations. ii) The number of lags (k) was chosen by the general-to-specific rule recommended by Ng and Perron (2001). iii) h denotes the forecast horizon in quarters. iv) RMSPE_L, RMSPE_R, and RMSPE_G denote the root mean squared prediction errors (RMSPE) for the least squares (LS), recursive mean adjustment (RMA), and grid bootstrap (GT) estimators, respectively. v) LS/RMA and LS/GT are RMSPE_L/RMSPE_R and RMSPE_L/RMSPE_G, respectively. vi) DM_R and DM_G denote the Diebold-Mariano (1995) asymptotic test statistics for the estimator pairs LS-RMA and LS-GT. The null hypothesis is equal predictive accuracy. p-values from the asymptotic standard normal distribution are in parentheses.

Table 3. Rolling Window Out-of-Sample Forecast Results

| Index | h | RMSPE_L | RMSPE_R | RMSPE_G | LS/RMA | LS/GT | DM_R | DM_G |
|---|---|---|---|---|---|---|---|---|
| Spot | 1 | 0.045 | 0.044 | 0.044 | 1.006 | 1.010 | 0.328 (0.743) | 0.473 (0.636) |
| | 2 | 0.065 | 0.062 | 0.062 | 1.039 | 1.054 | 1.833 (0.067) | 1.778 (0.075) |
| | 3 | 0.079 | 0.076 | 0.074 | 1.046 | 1.066 | 2.296 (0.022) | 3.348 (0.001) |
| | 4 | 0.097 | 0.094 | 0.093 | 1.034 | 1.046 | 2.116 (0.034) | 3.267 (0.001) |
| | 6 | 0.134 | 0.130 | 0.129 | 1.032 | 1.043 | 1.633 (0.102) | 2.648 (0.008) |
| Livestock | 1 | 0.083 | 0.082 | 0.083 | 1.014 | 1.008 | 1.162 (0.245) | 0.303 (0.762) |
| | 2 | 0.119 | 0.115 | 0.112 | 1.030 | 1.058 | 2.046 (0.041) | 2.145 (0.032) |
| | 3 | 0.126 | 0.123 | 0.122 | 1.026 | 1.039 | 2.387 (0.017) | 1.683 (0.092) |
| | 4 | 0.140 | 0.138 | 0.135 | 1.020 | 1.036 | 1.531 (0.126) | 2.075 (0.038) |
| | 6 | 0.170 | 0.168 | 0.164 | 1.012 | 1.036 | 1.347 (0.178) | 3.026 (0.002) |
| Fats & Oil | 1 | 0.110 | 0.110 | 0.108 | 1.001 | 1.011 | 0.094 (0.925) | 0.461 (0.645) |
| | 2 | 0.158 | 0.158 | 0.153 | 1.005 | 1.037 | 0.433 (0.665) | 1.338 (0.181) |
| | 3 | 0.173 | 0.173 | 0.167 | 1.001 | 1.035 | 0.132 (0.895) | 1.892 (0.058) |
| | 4 | 0.192 | 0.193 | 0.188 | 0.994 | 1.018 | -0.821 (0.411) | 1.246 (0.213) |
| | 6 | 0.248 | 0.253 | 0.251 | 0.980 | 0.989 | -2.277 (0.023) | -1.354 (0.176) |
| Foodstuff | 1 | 0.062 | 0.062 | 0.061 | 1.010 | 1.016 | 0.945 (0.345) | 0.793 (0.428) |
| | 2 | 0.085 | 0.082 | 0.080 | 1.034 | 1.069 | 2.068 (0.039) | 2.226 (0.026) |
| | 3 | 0.100 | 0.098 | 0.095 | 1.018 | 1.057 | 1.483 (0.138) | 3.073 (0.002) |
| | 4 | 0.117 | 0.115 | 0.111 | 1.016 | 1.055 | 1.373 (0.170) | 2.824 (0.005) |
| | 6 | 0.153 | 0.152 | 0.148 | 1.007 | 1.032 | 0.656 (0.512) | 2.396 (0.017) |
| Raw Industrials | 1 | 0.048 | 0.048 | 0.047 | 1.007 | 1.021 | 0.388 (0.698) | 0.828 (0.408) |
| | 2 | 0.078 | 0.076 | 0.074 | 1.014 | 1.049 | 0.516 (0.606) | 1.417 (0.156) |
| | 3 | 0.093 | 0.093 | 0.090 | 1.004 | 1.035 | 0.201 (0.841) | 1.858 (0.063) |
| | 4 | 0.120 | 0.119 | 0.116 | 1.007 | 1.034 | 0.522 (0.601) | 2.206 (0.027) |
| | 6 | 0.159 | 0.159 | 0.156 | 1.000 | 1.020 | 0.016 (0.987) | 1.286 (0.198) |
| Textiles | 1 | 0.037 | 0.037 | 0.037 | 1.017 | 1.002 | 0.745 (0.457) | 0.213 (0.832) |
| | 2 | 0.058 | 0.056 | 0.057 | 1.029 | 1.009 | 1.203 (0.229) | 0.563 (0.573) |
| | 3 | 0.074 | 0.073 | 0.074 | 1.010 | 0.999 | 0.482 (0.629) | -0.066 (0.947) |
| | 4 | 0.087 | 0.088 | 0.088 | 0.990 | 0.991 | -1.004 (0.316) | -1.408 (0.159) |
| | 6 | 0.106 | 0.108 | 0.108 | 0.985 | 0.985 | -1.211 (0.226) | -1.633 (0.103) |
| Metals | 1 | 0.083 | 0.084 | 0.083 | 0.998 | 1.006 | -0.127 (0.899) | 0.282 (0.778) |
| | 2 | 0.133 | 0.134 | 0.132 | 0.997 | 1.014 | -0.111 (0.912) | 0.454 (0.650) |
| | 3 | 0.171 | 0.170 | 0.165 | 1.004 | 1.035 | 0.369 (0.712) | 2.439 (0.015) |
| | 4 | 0.215 | 0.215 | 0.210 | 1.003 | 1.028 | 0.440 (0.660) | 2.909 (0.004) |
| | 6 | 0.293 | 0.292 | 0.288 | 1.002 | 1.019 | 0.214 (0.831) | 1.471 (0.141) |

Note: i) Out-of-sample forecasting was implemented with a rolling window, sequentially adding one observation and dropping one observation in each iteration, maintaining 69 observations. ii) The number of lags (k) was chosen by the general-to-specific rule recommended by Ng and Perron (2001). iii) h denotes the forecast horizon in quarters. iv) RMSPE_L, RMSPE_R, and RMSPE_G denote the root mean squared prediction errors (RMSPE) for the least squares (LS), recursive mean adjustment (RMA), and grid bootstrap (GT) estimators, respectively. v) LS/RMA and LS/GT are RMSPE_L/RMSPE_R and RMSPE_L/RMSPE_G, respectively. vi) DM_R and DM_G denote the Diebold-Mariano (1995) asymptotic test statistics for the estimator pairs LS-RMA and LS-GT. The null hypothesis is equal predictive accuracy. p-values from the asymptotic standard normal distribution are in parentheses.

Figure 1. CRB Historical Data [figure not reproduced]