Some critical remarks on Zhang's gamma test for independence

Ingo Klein

2011

Zhang (2008) defines the quotient correlation coefficient to test for dependence and tail dependence of bivariate random samples. He shows that asymptotically the test statistics are gamma distributed. Therefore, he called the corresponding test gamma test. We want to investigate the speed of convergence by a simulation study. Zhang discusses a rank-based version of this gamma test that depends on random numbers drawn from a standard Fréchet distribution. We propose an alternative that does not depend on random numbers. We compare the size and the power of this alternative with the well-known t-test, the van der Waerden and the Spearman rank test. Zhang proposes his gamma test also for situations where the dependence is neither strictly increasing nor strictly decreasing. In contrast to this, we show that the quotient correlation coefficient can only measure monotone patterns of dependence.

Suggested citation: Klein, Ingo; Tinkl, Fabian (2011): Some critical remarks on Zhang's gamma test for independence. Working Paper, Diskussionspapier, No. 87/2010, Friedrich-Alexander-Universität Erlangen-Nürnberg, Lehrstuhl für Statistik und Ökonometrie, Nürnberg. Provided by EconStor (ZBW, Leibniz Information Centre for Economics) in cooperation with the Chair of Statistics and Econometrics, Friedrich-Alexander-University Erlangen-Nuremberg. This version is available at: http://hdl.handle.net/10419/52385
Lehrstuhl für Statistik und Ökonometrie, Diskussionspapier 87 / 2011

Ingo Klein (corresponding author), Fabian Tinkl
Department of Statistics and Econometrics, University of Erlangen-Nuremberg, Lange Gasse 20, D-90403 Nürnberg, Germany

Keywords and phrases: test on dependence, rank correlation test, Spearman's ρ, copula, Lehmann ordering

1 Introduction

Zhang (2008) proposes

$$Q^+ = \frac{\max\{X_i/Y_i\} + \max\{Y_i/X_i\} - 2}{\max\{X_i/Y_i\} \cdot \max\{Y_i/X_i\} - 1} \tag{1}$$

as a measure of dependence between two random variables $(X, Y)$ whose margins are standard Fréchet distributed. The maxima are taken over $i = 1, 2, \ldots, n$. This measure takes values in $[0, 1]$. For $X_i = Y_i$, $i = 1, 2, \ldots, n$, we get $Q^+ = 1$.
This is the only case with a strictly increasing relationship between $X_i$ and $Y_i$ if $X_i$ and $Y_i$ are both standard Fréchet distributed. If $-Y_i$ follows a standard Fréchet distribution, there is a strictly decreasing relationship between $X_i$ and $Y_i$ if $X_i = -Y_i$, $i = 1, 2, \ldots, n$. $Q^+$ is not suitable in this case. Instead,

$$Q^- = -\frac{\max\{X_i/(-Y_i)\} + \max\{-Y_i/X_i\} - 2}{\max\{X_i/(-Y_i)\} \cdot \max\{-Y_i/X_i\} - 1} \tag{2}$$

can be used as a measure of strictly decreasing dependence. With $X_i = -Y_i$ we get $Q^- = -1$. If $X_i$ and $Y_i$ are stochastically independent for $i = 1, 2, \ldots, n$, it follows $Q^+ = Q^- = 0$. Under the hypothesis of independence Zhang proves that the scaled test statistics $nQ^+$ and $-nQ^-$ are asymptotically $\Gamma(2, 1)$ distributed. This result holds not only for a bivariate sample $(X_1, Y_1), \ldots, (X_n, Y_n)$ of independent draws. It also holds if the sample fulfils some weak mixing properties.

If the marginal distributions $F_U$ and $F_V$ of $(U_i, V_i)$ are known but not standard Fréchet, the variables can be transformed by

$$X_i = \frac{1}{-\ln F_U(U_i)}, \quad Y_i = \frac{1}{-\ln F_V(V_i)}, \quad i = 1, 2, \ldots, n \tag{3}$$

or

$$X_i = \frac{1}{-\ln F_U(U_i)}, \quad Y_i = -\frac{1}{-\ln(1 - F_V(V_i))}, \quad i = 1, 2, \ldots, n. \tag{4}$$

For these transformed random variables we can compute $Q^+$ and $Q^-$, respectively. If $V_i = g(U_i)$ holds with $g$ strictly increasing (decreasing), we get $X_i = Y_i$ ($X_i = -Y_i$), $i = 1, 2, \ldots, n$. This means that the quotient correlation coefficient measures monotonicity between $U_i$ and $V_i$. At first glance this contradicts Zhang (2008), who discusses the special case $Y_i = X_i^2$ with $X_i \sim N(0, 1)$. In this case he shows that his gamma test indicates dependence. The common alternatives (like Pearson's or Spearman's correlation coefficient) take the value 0 and cannot indicate the dependence that obviously holds. We will discuss this property later.
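The transformation (3) and the coefficient (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `to_frechet` and `quotient_corr` are hypothetical, the toy data assume uniform margins on (0, 1) so that $F_U(u) = u$ and $F_V(v) = v$, and the value 1 is returned for the 0/0 case $X_i = Y_i$, following the convention stated in the text.

```python
import math


def to_frechet(p):
    """Map a probability p = F(u) in (0, 1) to the standard Frechet scale, as in (3)."""
    return -1.0 / math.log(p)


def quotient_corr(x, y):
    """Quotient correlation Q+ of (1) for two positive samples of equal length."""
    a = max(xi / yi for xi, yi in zip(x, y))  # max of X_i / Y_i
    b = max(yi / xi for xi, yi in zip(x, y))  # max of Y_i / X_i
    num = a + b - 2.0
    den = a * b - 1.0
    if den <= 0.0 and num <= 0.0:
        # Assumed convention: X_i = Y_i for all i gives 0/0, which is set to 1.
        return 1.0
    return num / den


# Hypothetical toy data with known uniform margins, so F_U(u) = u and F_V(v) = v.
u = [0.10, 0.35, 0.60, 0.90]
v = [0.20, 0.30, 0.75, 0.80]
x = [to_frechet(p) for p in u]
y = [to_frechet(p) for p in v]

print("Q+ for the toy data:", quotient_corr(x, y))
print("Q+ for X_i = Y_i:   ", quotient_corr(x, x))
```

Because any strictly increasing $g$ leaves $F_V(V_i) = F_U(U_i)$, the second call mirrors the case $V_i = g(U_i)$ discussed above.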
Summarizing, we get

$$Q^+ = \frac{\max\left\{\frac{\ln F_U(U_i)}{\ln F_V(V_i)}\right\} + \max\left\{\frac{\ln F_V(V_i)}{\ln F_U(U_i)}\right\} - 2}{\max\left\{\frac{\ln F_U(U_i)}{\ln F_V(V_i)}\right\} \cdot \max\left\{\frac{\ln F_V(V_i)}{\ln F_U(U_i)}\right\} - 1} \tag{5}$$

and

$$Q^- = -\frac{\max\left\{\frac{\ln F_U(U_i)}{\ln(1 - F_V(V_i))}\right\} + \max\left\{\frac{\ln(1 - F_V(V_i))}{\ln F_U(U_i)}\right\} - 2}{\max\left\{\frac{\ln F_U(U_i)}{\ln(1 - F_V(V_i))}\right\} \cdot \max\left\{\frac{\ln(1 - F_V(V_i))}{\ln F_U(U_i)}\right\} - 1} \tag{6}$$

as quotient correlation coefficients for known margins $F_U$ and $F_V$. If the marginal distributions $F_U$ and $F_V$ are unknown, we have to estimate them consistently. Zhang discusses several consistent estimators. Kick (2011), p. 12, adds a further proposal based on ranks. We will discuss this estimator in section 4.

Zhang also proposes a rank based version of his quotient correlation coefficient. His proposal depends on a sequence of realizations $z_1, \ldots, z_n$ of random variables $Z_1, \ldots, Z_n$ that are standard Fréchet distributed. Let $R_1, \ldots, R_n$ denote the ranks of $X_1, \ldots, X_n$ and $S_1, \ldots, S_n$ the ranks of $Y_1, \ldots, Y_n$, and let $z_{(1)} < \ldots < z_{(n)}$ be the realizations of the corresponding order statistics of $Z_1, \ldots, Z_n$. Then

$$\tilde{Q}^+ = \frac{\max\{z_{(R_i)}/z_{(S_i)}\} + \max\{z_{(S_i)}/z_{(R_i)}\} - 2}{\max\{z_{(R_i)}/z_{(S_i)}\} \cdot \max\{z_{(S_i)}/z_{(R_i)}\} - 1} \tag{7}$$

is the rank based version of the quotient correlation coefficient proposed by Zhang. This measure depends on the realizations of $Z_1, \ldots, Z_n$. To reduce this dependence Zhang proposes to draw several sequences of realizations, to calculate $\tilde{Q}^+$ for every sequence and to use the mean of all values of $\tilde{Q}^+$ as test statistic. This procedure does not affect the asymptotic properties.

Zhang does not discuss how fast the distributions of $Q^+$ and $\tilde{Q}^+$ converge to a $\Gamma(2, 1)$ distribution. We want to investigate this speed of convergence by a simulation study. Furthermore, we propose an alternative rank based version of the quotient correlation coefficient. This version depends neither on the distribution of the population nor on the realization of a sample of standard Fréchet distributed random variables.
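To make the dependence of (7) on the auxiliary Fréchet sample concrete, the following sketch recomputes $\tilde{Q}^+$ for the same data with different random seeds. It is a minimal illustration, not Zhang's code: the helper names (`ranks`, `frechet_order_stats`, `zhang_rank_q`, `averaged_rank_q`), the seeds and the normal toy data are assumptions made here, and ties are assumed to be absent.

```python
import math
import random


def ranks(values):
    """Ranks 1..n of the values (ties are assumed to be absent)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def frechet_order_stats(n, rng):
    """Sorted realizations z_(1) <= ... <= z_(n) of n standard Frechet draws."""
    draws = []
    while len(draws) < n:
        u = rng.random()
        if 0.0 < u < 1.0:  # guard against log(0)
            draws.append(-1.0 / math.log(u))
    return sorted(draws)


def quotient_corr(x, y):
    """Quotient correlation as in (1); the 0/0 case x == y is set to 1."""
    a = max(xi / yi for xi, yi in zip(x, y))
    b = max(yi / xi for xi, yi in zip(x, y))
    num, den = a + b - 2.0, a * b - 1.0
    if den <= 0.0 and num <= 0.0:
        return 1.0
    return num / den


def zhang_rank_q(x, y, rng):
    """Rank based coefficient (7) for one auxiliary Frechet sample drawn from rng."""
    z = frechet_order_stats(len(x), rng)
    zx = [z[r - 1] for r in ranks(x)]  # z_(R_i)
    zy = [z[s - 1] for s in ranks(y)]  # z_(S_i)
    return quotient_corr(zx, zy)


def averaged_rank_q(x, y, n_sequences, rng):
    """Mean of (7) over several auxiliary sequences, as proposed by Zhang."""
    return sum(zhang_rank_q(x, y, rng) for _ in range(n_sequences)) / n_sequences


# Hypothetical independent toy data of size n = 10.
data_rng = random.Random(1)
x = [data_rng.gauss(0.0, 1.0) for _ in range(10)]
y = [data_rng.gauss(0.0, 1.0) for _ in range(10)]

for seed in (11, 12, 13):
    print("seed", seed, "->", zhang_rank_q(x, y, random.Random(seed)))
print("mean over 10 sequences:", averaged_rank_q(x, y, 10, random.Random(99)))
```

The data are fixed while only the seed of the auxiliary sample changes, so any spread in the printed values comes from the Fréchet draws alone.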
Our aim is to show that Zhang's measure of dependence is not suitable for non-monotonic patterns of dependence. Although this measure does not depend on copulas explicitly, we will give some hints that it does not maintain Lehmann's ordering of dependence. Finally, we compare the power of the rank gamma test on dependence with the power of several of its competitors, especially the Spearman and the van der Waerden test. The critical values will be computed for the exact distribution as well as for the asymptotic distribution under the null.

2 Convergence speed for Zhang's quotient correlation

With $F_{Q^+}$ we denote the exact distribution of the test statistic based on $Q^+$ (that is, of $nQ^+$, see table 1). $\Gamma(\cdot; 2, 1)$ is the distribution function of a gamma distributed random variable with parameter vector $(2, 1)$. $F_{Q^+}^{-1}$ and $\Gamma^{-1}(\cdot; 2, 1)$ denote the corresponding quantile functions (inverse distribution functions). As a measure of distance between the exact and the asymptotic distribution we use the Kolmogorov-Smirnov distance

$$KS = \sup_{x \in \mathbb{R}} |F_{Q^+}(x) - \Gamma(x; 2, 1)|. \tag{8}$$

To check whether the test keeps its size $\alpha$ if we use the asymptotic distribution, we compare the critical values $F_{Q^+}^{-1}(1 - \alpha)$ and $\Gamma^{-1}(1 - \alpha; 2, 1)$, and the given test size $\alpha$ with $1 - F_{Q^+}(\Gamma^{-1}(1 - \alpha; 2, 1))$, for several sample sizes $n$. We discuss the case $\alpha = 0.05$ with $\Gamma^{-1}(0.95; 2, 1) = 4.744$.

    n      KS       F_{Q^+}^{-1}(1 - α)    1 - F_{Q^+}(Γ^{-1}(1 - α; 2, 1))
    3      0.049    8.739                  0.130
    5      0.026    5.546                  0.074
    10     0.014    4.672                  0.048
    25     0.006    4.697                  0.047
    50     0.005    4.688                  0.047
    100    0.008    4.741                  0.050

Table 1: Differences between the exact (simulated) and the asymptotic distribution of $nQ^+$ for some sample sizes $n$ and $\alpha = 0.05$.

The results presented in table 1 show that even for small sample sizes the distance between the true and the asymptotic distribution is relatively small. If the marginal distributions are known and the sample size exceeds 10, the test size of 5% will be kept.
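A simulation of the kind summarized in table 1 can be sketched as follows. The sketch assumes independent standard Fréchet margins, uses the closed form $\Gamma(x; 2, 1) = 1 - e^{-x}(1 + x)$, and takes far fewer repetitions than a serious study would. The function names, the seed and the number of repetitions are choices made here for illustration, not the authors' settings.

```python
import math
import random


def gamma21_cdf(x):
    """Distribution function of Gamma(2, 1): 1 - exp(-x) * (1 + x) for x > 0."""
    return 1.0 - math.exp(-x) * (1.0 + x) if x > 0 else 0.0


def frechet(rng):
    """One standard Frechet draw via -1 / ln(U) with U uniform on (0, 1)."""
    u = rng.random()
    while not 0.0 < u < 1.0:
        u = rng.random()
    return -1.0 / math.log(u)


def n_times_q(n, rng):
    """One draw of n * Q+ for independent standard Frechet X and Y (n >= 2)."""
    x = [frechet(rng) for _ in range(n)]
    y = [frechet(rng) for _ in range(n)]
    a = max(xi / yi for xi, yi in zip(x, y))
    b = max(yi / xi for xi, yi in zip(x, y))
    return n * (a + b - 2.0) / (a * b - 1.0)


def ks_distance(sample, cdf):
    """Kolmogorov-Smirnov distance (8) between an empirical sample and a cdf."""
    s = sorted(sample)
    m = len(s)
    return max(max((i + 1) / m - cdf(v), cdf(v) - i / m) for i, v in enumerate(s))


def summarize(n, reps, alpha=0.05, gamma_crit=4.744, seed=0):
    """Return (KS distance, simulated critical value, realized test size)."""
    rng = random.Random(seed)
    stats = sorted(n_times_q(n, rng) for _ in range(reps))
    ks = ks_distance(stats, gamma21_cdf)
    simulated_crit = stats[min(reps - 1, math.ceil((1 - alpha) * reps) - 1)]
    realized_size = sum(v > gamma_crit for v in stats) / reps
    return ks, simulated_crit, realized_size


print(summarize(n=25, reps=5000))
```

With only a few thousand repetitions the Monte Carlo error of the Kolmogorov-Smirnov distance is of the order of 0.01 to 0.02, so the third decimal of table 1 cannot be reproduced this way.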
3 Sensitivity of Zhang's rank based gamma test

Zhang's rank based gamma test depends on the realization of a sample from the standard Fréchet distribution. To exemplify how sensitive the gamma test is with respect to this initial sample, we draw 10 different samples of size $n = 10$ and calculate the Kolmogorov-Smirnov distance between the exact and the asymptotic gamma distribution, the exact critical value $F_{\tilde{Q}^+}^{-1}(1 - \alpha)$ and the realized test size $1 - F_{\tilde{Q}^+}(\Gamma^{-1}(1 - \alpha; 2, 1))$ of Zhang's rank based gamma test.

    KS       F_{Q̃^+}^{-1}(1 - α)    1 - F_{Q̃^+}(Γ^{-1}(1 - α; 2, 1))
    0.315    4.233                  0.023
    0.206    2.788                  0.001
    0.669    1.236                  0.000
    0.963    0.129                  0.007
    0.473    2.229                  0.002
    0.791    9.885                  0.753
    0.492    5.857                  0.150
    0.349    5.576                  0.103
    0.240    4.370                  0.032
    0.302    2.639                  0.001
    0.241    3.606                  0.008

Table 2: Dependence of the Kolmogorov-Smirnov distance, the critical value and the realized test size on the initial sample from a standard Fréchet distribution.

Zhang proposes to calculate the mean of the test statistics for all 10 samples. This strategy cannot be recommended if the value of the statistic varies extremely from sample to sample, as shown by the results in table 2. Following Kick (2011), p. 12, we will discuss an alternative rank based version of the gamma test which does not depend on an arbitrarily chosen initial sample.

4 Rank quotient correlation coefficient

We consider a random sample $(U_1, V_1), \ldots, (U_n, V_n)$ from a bivariate distribution with continuous margins $F_U$ and $F_V$. $R_i$ and $S_i$ denote the ranks of $U_i$ and $V_i$ in $(U_1, \ldots, U_n)$ and $(V_1, \ldots, V_n)$ for $i = 1, 2, \ldots, n$. As a nonparametric version of the quotient correlation we propose

$$Q_R^+ = \frac{\max\left\{\ln\frac{R_i}{n+1} \Big/ \ln\frac{S_i}{n+1}\right\} + \max\left\{\ln\frac{S_i}{n+1} \Big/ \ln\frac{R_i}{n+1}\right\} - 2}{\max\left\{\ln\frac{R_i}{n+1} \Big/ \ln\frac{S_i}{n+1}\right\} \cdot \max\left\{\ln\frac{S_i}{n+1} \Big/ \ln\frac{R_i}{n+1}\right\} - 1}. \tag{9}$$

$Q_R^+$ can only measure positive dependence between $U$ and $V$.
For negative dependence we have to define an alternative version

$$Q_R^- = -\frac{\max\left\{\ln\frac{R_i}{n+1} \Big/ \ln\frac{n-S_i+1}{n+1}\right\} + \max\left\{\ln\frac{n-S_i+1}{n+1} \Big/ \ln\frac{R_i}{n+1}\right\} - 2}{\max\left\{\ln\frac{R_i}{n+1} \Big/ \ln\frac{n-S_i+1}{n+1}\right\} \cdot \max\left\{\ln\frac{n-S_i+1}{n+1} \Big/ \ln\frac{R_i}{n+1}\right\} - 1}. \tag{10}$$

$Q_R^+ = 0$ and $Q_R^- = 0$ hold if $U$ and $V$ are independent. Perfect positive (negative) dependence between $U$ and $V$ leads to $Q_R^+ = 1$ ($Q_R^- = -1$). With a suitable rearrangement of $R_i$ and $S_i$ we get $R_i = i$, $i = 1, 2, \ldots, n$. $S_i$, $i = 1, 2, \ldots, n$, are now the ranks of $V_i$ after this rearrangement. With this simplification we get

$$Q_R^+ = \frac{\max\left\{\ln\frac{i}{n+1} \Big/ \ln\frac{S_i}{n+1}\right\} + \max\left\{\ln\frac{S_i}{n+1} \Big/ \ln\frac{i}{n+1}\right\} - 2}{\max\left\{\ln\frac{i}{n+1} \Big/ \ln\frac{S_i}{n+1}\right\} \cdot \max\left\{\ln\frac{S_i}{n+1} \Big/ \ln\frac{i}{n+1}\right\} - 1} \tag{11}$$

and

$$Q_R^- = -\frac{\max\left\{\ln\frac{i}{n+1} \Big/ \ln\frac{n-S_i+1}{n+1}\right\} + \max\left\{\ln\frac{n-S_i+1}{n+1} \Big/ \ln\frac{i}{n+1}\right\} - 2}{\max\left\{\ln\frac{i}{n+1} \Big/ \ln\frac{n-S_i+1}{n+1}\right\} \cdot \max\left\{\ln\frac{n-S_i+1}{n+1} \Big/ \ln\frac{i}{n+1}\right\} - 1}. \tag{12}$$

In this version $Q_R^+$ and $Q_R^-$ depend only on $(S_1, \ldots, S_n)$. Under the null of independence of $U$ and $V$, this vector of ranks is uniformly distributed over the space of permutations of $\{1, 2, \ldots, n\}$. Under this hypothesis the exact distributions of $Q_R^+$ and $Q_R^-$ can be calculated as the empirical distribution of all $n!$ values of the test statistics generated from all permutations. This procedure works only for small sample sizes. For larger sample sizes the distributions of $Q_R^+$ and $Q_R^-$ can be calculated approximately by simulation. Thus, we draw many elements from the set of permutations with equal probability, calculate the test statistic for each element and, based on these values, the empirical distribution. If the number of repetitions is sufficiently large, the exact distribution will be approximated very precisely. The number of repetitions is chosen to be 100000. In this case the exact and the simulated distribution almost coincide.

Zhang (2008) does not appear to consider this exact distribution. He derives the asymptotic distribution with the help of the asymptotic theory for maxima and minima.
He does not check for which sample sizes the nominal and the realized test sizes are almost identical.

We have to estimate the unknown marginal distributions $F_U$ and $F_V$ consistently. Let $\hat{F}_U$ and $\hat{F}_V$ be such consistent estimators. Zhang (2008) proves that the asymptotic distribution of $Q$ does not change if $X_i = -1/\ln \hat{F}_U(U_i)$ and $Y_i = -1/\ln \hat{F}_V(V_i)$ for $i = 1, 2, \ldots, n$ are inserted in (5). Following Pfeiffer (1989), one possibility to estimate $F_U$ and $F_V$ consistently is

$$\hat{F}_U(u) = \#\{U_i \mid U_i \le u\}/(n + 1) \quad \text{resp.} \quad \hat{F}_V(v) = \#\{V_i \mid V_i \le v\}/(n + 1)$$

for $u, v \in \mathbb{R}$. This is a modified version of the empirical distribution function. The modification prevents the estimator from attaining the extreme values 0 or 1, for which the transformation to the standard Fréchet distribution breaks down. With $(n + 1)\hat{F}_U(U_i) = R_i$ and $(n + 1)\hat{F}_V(V_i) = S_i$ there is a simple relationship between the modified empirical distribution functions and the ranks. Inserting the consistent estimators in (5) and (6) gives $Q_R^+$ for $Q^+$ and $Q_R^-$ for $Q^-$. Therefore, under the null hypothesis and for continuous margins $F_U$ and $F_V$,

$$nQ_R^+ \xrightarrow{d} Z, \quad Z \sim \Gamma(2, 1)$$

holds. The so-called antiranks $n - S_i + 1$ result from considering the random variable $-V$. Then, under the usual assumptions, we also get

$$-nQ_R^- \xrightarrow{d} Z, \quad Z \sim \Gamma(2, 1).$$

We call the tests on independence rank gamma tests if they are based on this asymptotic distribution.

To check whether the rank gamma test exceeds the test size $\alpha$, we compare the critical values $F_{Q_R^+}^{-1}(1 - \alpha)$ and $\Gamma^{-1}(1 - \alpha; 2, 1)$, and $\alpha$ with the realized test size $1 - F_{Q_R^+}(\Gamma^{-1}(1 - \alpha; 2, 1))$, for several sample sizes. We restrict the discussion to $\alpha = 0.05$, which gives the approximate critical value $\Gamma^{-1}(0.95; 2, 1) = 4.744$.
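The permutation argument above translates directly into code. The sketch below computes $Q_R^+$ from (11), enumerates all $n!$ permutations for a small $n$, and falls back to uniformly drawn permutations otherwise. The names `rank_quotient_corr`, `null_distribution` and `realized_size` are hypothetical, and the number of repetitions is kept far below the 100000 used in the text.

```python
import itertools
import math
import random


def rank_quotient_corr(s):
    """Q_R^+ of (11) for ranks s = (S_1, ..., S_n), with R_i = i after rearrangement."""
    n = len(s)
    logs = [math.log(k / (n + 1)) for k in range(1, n + 1)]  # ln(k / (n + 1)) < 0
    a = max(logs[i] / logs[s[i] - 1] for i in range(n))
    b = max(logs[s[i] - 1] / logs[i] for i in range(n))
    den = a * b - 1.0
    if den <= 0.0:
        # Only the identity permutation gives a = b = 1; the text sets Q_R^+ = 1 there.
        return 1.0
    return (a + b - 2.0) / den


def null_distribution(n, reps=None, seed=0):
    """Sorted values of n * Q_R^+ under independence.

    reps=None enumerates all n! permutations (small n only);
    otherwise reps permutations are drawn with equal probability.
    """
    base = list(range(1, n + 1))
    if reps is None:
        perms = itertools.permutations(base)
    else:
        rng = random.Random(seed)
        perms = (rng.sample(base, n) for _ in range(reps))
    return sorted(n * rank_quotient_corr(p) for p in perms)


def realized_size(values, gamma_crit=4.744):
    """Share of null values above the asymptotic Gamma(2, 1) critical value."""
    return sum(v > gamma_crit for v in values) / len(values)


exact_5 = null_distribution(5)                    # all 120 permutations
simulated_50 = null_distribution(50, reps=5000)   # Monte Carlo approximation
print("n = 5 :", realized_size(exact_5))
print("n = 50:", realized_size(simulated_50))
```

For $n = 5$ only the identity permutation yields $nQ_R^+ > 4.744$, so the enumerated size is $1/120 \approx 0.008$.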
    n        KS       F_{Q_R^+}^{-1}(1 - α)    1 - F_{Q_R^+}(Γ^{-1}(1 - α; 2, 1))
    5        0.280    3.445                    0.008
    10       0.260    4.047                    0.020
    50       0.245    4.410                    0.033
    100      0.243    4.423                    0.034
    200      0.240    4.440                    0.034
    500      0.238    4.447                    0.035
    1000     0.236    4.423                    0.033
    2000     0.237    4.449                    0.035
    5000     0.237    4.458                    0.036
    10000    0.235    4.440                    0.034

Table 3: Differences between the exact (simulated) and the asymptotic distribution of $nQ_R^+$ for some sample sizes $n$ and $\alpha = 0.05$.

As an important result we see that the convergence of $nQ_R^+$ towards its limiting distribution is very slow. The Kolmogorov-Smirnov distance is almost constant for sample sizes from 50 to 10000. This result contrasts with the well known property of alternative test procedures (like the Spearman test) that have a normal distribution as limiting distribution. For these tests sample sizes of around 20 are sufficient to guarantee that the exact and the asymptotic distribution are almost identical. As a positive property of the rank gamma test we can state that the test is conservative, as the realized test size is always smaller than 5%.

For the comparison of power between the rank gamma test and several of its competitors we need to specify some realistic alternatives of dependence. Before we can do this, we have to clarify what kind of dependence the quotient correlation coefficient is able to measure.

5 What kind of dependence does the rank quotient correlation coefficient measure?

As already mentioned in the introduction, the quotient correlation coefficient measures how monotone the relationship between two random variables $(X, Y)$ is, because the extreme values 1 or −1 will be achieved if $Y$ is a strictly increasing or decreasing function of $X$. In contrast to this measure, rank correlation coefficients are copula based and maintain the well known Lehmann ordering of dependence.
This ordering implies for two pairs of random variables $(X, Y)$ and $(X', Y')$ with bivariate distribution functions $F$ and $F'$ and corresponding copulas $C$ and $C'$ that $(X, Y)$ is less dependent than $(X', Y')$ if

$$C(u, v) \le C'(u, v), \quad u, v \in [0, 1]$$

holds. Up to now, this property has not been discussed for $Q^+$ or for $Q_R^+$. The problem is that these measures are sample based and not copula based. We do not know the corresponding functional of the population distribution. Therefore, we try to get an idea whether $Q^+$ maintains the Lehmann ordering by a simulation study.

For this purpose we consider the Farlie-Gumbel-Morgenstern (FGM) copula

$$C(u, v; \theta) = uv + \theta uv(1 - u)(1 - v), \quad u, v \in [0, 1] \tag{13}$$

with the special settings $\theta = 0.2$ and $\theta = 0.8$ for the parameter of dependence. For these settings we have

$$C(u, v; 0.2) \le C(u, v; 0.8), \quad u, v \in [0, 1].$$

We draw 100000 random samples of size $n$ from $C(u, v; 0.2)$ and $C(u, v; 0.8)$, compute $Q^+$ and Spearman's $\rho$, and count how often the values of $Q^+$ and $\rho$ are smaller for $\theta = 0.2$ than for $\theta = 0.8$. We get the following result:

    n       Q^+     ρ
    10      0.59    0.64
    100     0.64    0.93
    1000    0.65    1.00

Table 4: Proportion of 100000 repetitions such that the values of $Q^+$ or of Spearman's $\rho$ are smaller for $\theta = 0.2$ than for $\theta = 0.8$.

In about one third of all repetitions the rank quotient correlation coefficient cannot identify the true ordering of dependence, even if the sample size is 1000. In contrast, Spearman's $\rho$ identifies the Lehmann ordering in 93% of the repetitions already for the moderate sample size of $n = 100$. Of course, this is not a proof that the rank quotient correlation coefficient fails to maintain the Lehmann ordering. But it is a hint that there could be a problem in interpreting the way this coefficient measures dependence.

Finally, we want to answer the question whether the quotient correlation coefficient can measure non-monotonic dependence.
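The FGM experiment behind table 4 can be sketched as follows. Several points are assumptions of this sketch rather than statements of the paper: samples are drawn by conditional inversion (solving $v(1 + a(1 - v)) = w$ with $a = \theta(1 - 2u)$), the rank version $Q_R^+$ is used for the quotient correlation, and the number of repetitions is reduced to keep the run short. All function names are hypothetical.

```python
import math
import random


def fgm_sample(n, theta, rng):
    """n draws (u, v) from the FGM copula uv[1 + theta(1-u)(1-v)] by conditional inversion."""
    pairs = []
    for _ in range(n):
        u, w = rng.random(), rng.random()
        a = theta * (1.0 - 2.0 * u)
        # Root in [0, 1] of v * (1 + a * (1 - v)) = w, written in a numerically stable form.
        v = 2.0 * w / (1.0 + a + math.sqrt((1.0 + a) ** 2 - 4.0 * a * w))
        pairs.append((u, v))
    return pairs


def ranks(values):
    """Ranks 1..n of the values (ties are assumed to be absent)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(r, s):
    """Spearman's rho from two rank vectors without ties."""
    n = len(r)
    d2 = sum((ri - si) ** 2 for ri, si in zip(r, s))
    return 1.0 - 6.0 * d2 / ((n - 1) * n * (n + 1))


def rank_quotient_corr(r, s):
    """Rank quotient correlation Q_R^+ as in (9)."""
    n = len(r)
    lr = [math.log(ri / (n + 1)) for ri in r]
    ls = [math.log(si / (n + 1)) for si in s]
    a = max(p / q for p, q in zip(lr, ls))
    b = max(q / p for p, q in zip(lr, ls))
    den = a * b - 1.0
    return 1.0 if den <= 0.0 else (a + b - 2.0) / den


def proportion_ordered(n, reps, theta_lo=0.2, theta_hi=0.8, seed=0):
    """Share of repetitions in which the statistic is smaller for theta_lo than for theta_hi."""
    rng = random.Random(seed)
    q_hits = rho_hits = 0
    for _ in range(reps):
        stats = []
        for theta in (theta_lo, theta_hi):
            sample = fgm_sample(n, theta, rng)
            r = ranks([u for u, _ in sample])
            s = ranks([v for _, v in sample])
            stats.append((rank_quotient_corr(r, s), spearman_rho(r, s)))
        q_hits += stats[0][0] < stats[1][0]
        rho_hits += stats[0][1] < stats[1][1]
    return q_hits / reps, rho_hits / reps


print(proportion_ordered(n=100, reps=500))
```

The two returned proportions correspond to the $Q^+$ and $\rho$ columns of table 4, up to Monte Carlo error and the assumption about which version of the coefficient was used there.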
Zhang considers as an alternative hypothesis two dependent variables, with the first variable $U$ following a standard normal distribution and the second variable $V = U^2$ a $\chi^2(1)$ distribution. Zhang simulates the power of the gamma test in this special case. We now replicate Zhang's simulation study for a test size of 5% and different sample sizes. The critical value is again calculated with the exact and with the limiting distribution.

    n      exact    asympt.
    3      0.162    0.264
    5      0.120    0.138
    10     0.102    0.096
    25     0.667    0.653
    50     0.983    0.982
    100    1.000    1.000

Table 5: Exact and approximate power of the gamma test for $U_i = V_i^2$ with $V_i$ iid standard normal for $i = 1, 2, \ldots, n$.

The results in table 5 seem to confirm Zhang's assertion that the gamma test can discover non-monotonic dependence. The well known alternative tests, like the t-test based on Pearson's correlation coefficient or Spearman's $\rho$, indicate independence in the special case of $V = U^2$. At first glance, the gamma test seems to be suitable for more general alternatives. But if we use $Q^-$ to test on negative dependence, the null hypothesis of independence will also be rejected for $V = U^2$. In this case we get $Q^+ = -Q^-$ because the standard normal distribution is symmetric around 0, such that $F_V(v) = 1 - F_V(-v)$ holds for $v \in \mathbb{R}$. Thus, the gamma tests detect positive as well as negative dependence at the same time. Therefore, the gamma test is only suitable for monotone dependence, like its competitors the t-test and the Spearman test.

6 Power comparisons

6.1 Rank correlation test

Spearman's rank correlation coefficient $\rho$ based on the ranks $(R_i, S_i)$ of the bivariate sample $(U_1, V_1), \ldots, (U_n, V_n)$ is given by

$$\rho = \frac{\sum (R_i - \bar{R})(S_i - \bar{S})}{\sqrt{\sum (R_i - \bar{R})^2 \sum (S_i - \bar{S})^2}}.$$

For continuous margins and after some rearrangement we get $R_i = i$, $i = 1, 2, \ldots, n$. Finally, we get

$$\rho = 1 - \frac{6 \sum_{i=1}^n (i - S_i)^2}{(n - 1)n(n + 1)}$$

(see e.g. Büning & Trenkler (1994), p. 232ff.). The distribution of $\rho$ can be obtained via permutations of $\{1, 2, \ldots, n\}$ for small sample sizes, via simulation, or approximately by using the limiting distribution. For the limiting distribution it holds that

$$\frac{\rho}{\sqrt{\frac{1}{n-1}}} \xrightarrow{d} Z, \quad Z \sim N(0, 1)$$

under the null hypothesis of independence.

Spearman's $\rho$ is a special case of the linear rank statistic

$$\frac{\sum_{i=1}^n a(i)a(S_i)}{\sum_{i=1}^n a(i)^2}.$$

$a(\cdot)$ is the so-called score function with the property $\sum_{i=1}^n a(i) = 0$. The special choice

$$a(i) = i - \frac{n+1}{2}, \quad i = 1, 2, \ldots, n$$

(up to a constant factor, which cancels in the ratio) leads to Spearman's test. The corresponding scores will be called Wilcoxon scores. An alternative choice is

$$a(i) = E\left[\Phi^{-1}(U^{(i)})\right], \quad i = 1, 2, \ldots, n,$$

where $U^{(i)}$ is the $i$th order statistic of a random sample from a uniform distribution on $(0, 1)$. These scores will be called Terry-Hoeffding scores. Interchanging the expectation and the quantile function $\Phi^{-1}(\cdot)$ gives the van der Waerden scores

$$a(i) = \Phi^{-1}\left(\frac{i}{n+1}\right), \quad i = 1, 2, \ldots, n.$$

Under the null hypothesis of independence the limiting distribution of the general linear rank statistic is given by

$$\sqrt{n-1}\, \frac{\sum_{i=1}^n a(i)a(S_i)}{\sum_{i=1}^n a(i)^2} \xrightarrow{d} Z, \quad Z \sim N(0, 1)$$

(see Hájek & Šidák (1967), p. 167).

It is well known (e.g. from Hájek & Šidák (1967), p. 75) that linear rank correlation tests with suitably chosen scores are locally optimal in the set of all linear rank tests if the following alternative hypothesis is considered:

$$X_i = X_i^* + \Delta \cdot Z_i, \quad Y_i = Y_i^* + \Delta \cdot Z_i \quad (H_1: \Delta > 0)$$

resp.

$$Y_i = Y_i^* - \Delta \cdot Z_i \quad (H_1: \Delta > 0).$$

$X_i^*$, $Y_i^*$, $Z_i$ are independent random variables for $i = 1, 2, \ldots, n$. $\Delta$ determines the strength of the relationship between $X_i$ and $Y_i$. If the second moments of $X_i^*$, $Y_i^*$ and $Z_i$ exist, the Pearson correlation coefficient of $(X_i, Y_i)$ is given by

$$\frac{\Delta^2 Var(Z_i)}{\left((Var(X_i^*) + \Delta^2 Var(Z_i))(Var(Y_i^*) + \Delta^2 Var(Z_i))\right)^{1/2}}.$$

Then, the test statistic

$$\sum_{i=1}^n E\left[-\frac{f'}{f}\left(F^{-1}(U^{(R_i)})\right)\right] E\left[-\frac{f'}{f}\left(F^{-1}(U^{(Q_i)})\right)\right]$$

gives a locally optimal linear rank test under mild conditions concerning the marginal density $f$ of $X_i^*$ and $Y_i^*$.
In particular, the linear rank tests with Terry-Hoeffding scores and with Wilcoxon scores are locally optimal for the normal and the logistic distribution, respectively. Both tests are also asymptotically optimal in the class of all tests. Additionally, the linear rank test with van der Waerden scores is asymptotically optimal for the normal distribution (see Hájek & Šidák (1967), p. 254).

We want to compare the power of the rank gamma test with some rank correlation tests in situations where the rank correlation tests are locally or asymptotically optimal. As competitors we consider the Spearman test (based on $\rho$) and the linear rank test with van der Waerden scores. The Terry-Hoeffding scores are omitted because they are numerically tedious.

6.2 Comparison of power for the normal distribution (α = 0.05)

We discuss the power of the t test, the linear rank test with van der Waerden scores, the Spearman test and the rank gamma test. Under the null hypothesis, the t statistic satisfies

$$t = r\sqrt{\frac{n-2}{1-r^2}} \sim t(n-2).$$

The corresponding t test is optimal for one-sided alternatives and normal populations (see Büning & Trenkler (1994), p. 238). The test size is 5%. The corresponding critical values will be calculated for the exact and the asymptotic distribution.

    ∆      t         van der Waerden       Spearman test         rank gamma test
                     exact     asympt.     exact     asympt.     exact     asympt.
    n = 10
    0.1    0.0562    0.0502    0.0517      0.0508    0.0553      0.0487    0.0178
    0.5    0.1441    0.1358    0.1344      0.1261    0.1318      0.1130    0.0483
    1.0    0.4688    0.4092    0.4056      0.3802    0.4036      0.3302    0.1945
    2.0    0.9300    0.8749    0.8744      0.8621    0.8681      0.7857    0.6143
    3.0    0.9926    0.9736    0.9746      0.9697    0.9738      0.9289    0.8367
    n = 100
    0.1    0.0580    0.0596    0.0593      0.063     0.062       0.0556    0.0363
    0.5    0.6391    0.6312    0.6353      0.6130    0.6083      0.2574    0.2168
    1.0    1.0000    0.9999    0.9999      0.9997    0.9997      0.8363    0.8017
    2.0    1.0000    1.0000    1.0000      1.0000    1.0000      0.9987    0.9995
    n = 1000
    0.1    0.0909    0.1007    0.0933      0.0848    0.9040      0.0588    0.0422
    0.5    1.0000    1.0000    1.0000      1.0000    1.0000      0.4030    0.3430
    1.0    1.0000    1.0000    1.0000      1.0000    1.0000      0.9696    0.9607

Table 6: Comparison of power of the t test, the van der Waerden test, the Spearman test and the rank gamma test for different sample sizes $n$ and dependence parameters $\Delta$.

The results presented in table 6 confirm the theoretical considerations. The van der Waerden test as well as the Spearman test show a power that almost coincides with the power of the optimal t test, even for relatively small samples. This result remains true if the critical value is computed from the limiting distribution. It can be explained by the well known high speed of convergence of the distribution of linear rank tests to the normal distribution. In contrast to these tests, the rank gamma test has a substantially lower power for all sample sizes. Due to the low speed of convergence, its power differs noticeably between the exact and the approximate critical value.

It is worth mentioning that the exact power was calculated by simulation. Therefore, there are small deviations between the actual results and the results that should be expected for theoretical reasons. For example, the power of the van der Waerden test for $n = 100$ and $\Delta = 0.1$ is greater than the power of the optimal t-test.
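The two linear rank statistics compared in table 6 differ only in their scores, which a short sketch makes explicit. It assumes the rearrangement $R_i = i$, uses `statistics.NormalDist` from the Python standard library for $\Phi$ and $\Phi^{-1}$, and evaluates a one-sided asymptotic p-value from $\sqrt{n-1}$ times the statistic. The function names and the example permutation are illustrative choices.

```python
import math
from statistics import NormalDist


def wilcoxon_scores(n):
    """Centred scores a(i) = i - (n + 1) / 2, which lead to Spearman's rho."""
    return [i - (n + 1) / 2 for i in range(1, n + 1)]


def van_der_waerden_scores(n):
    """Scores a(i) = Phi^{-1}(i / (n + 1))."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf(i / (n + 1)) for i in range(1, n + 1)]


def linear_rank_statistic(s, scores):
    """sum a(i) a(S_i) / sum a(i)^2 for ranks S_1, ..., S_n (with R_i = i)."""
    num = sum(scores[i] * scores[s[i] - 1] for i in range(len(s)))
    den = sum(a * a for a in scores)
    return num / den


def asymptotic_p_value(stat, n):
    """One-sided p-value from sqrt(n - 1) * statistic ~ N(0, 1) under independence."""
    return 1.0 - NormalDist().cdf(math.sqrt(n - 1) * stat)


# Hypothetical ranks S_i of a sample of size 5 after sorting by the first variable.
s = [2, 1, 3, 5, 4]
rho = linear_rank_statistic(s, wilcoxon_scores(5))
vdw = linear_rank_statistic(s, van_der_waerden_scores(5))
print("Spearman:       ", rho, asymptotic_p_value(rho, 5))
print("van der Waerden:", vdw, asymptotic_p_value(vdw, 5))
```

For this permutation $\sum (i - S_i)^2 = 4$, so the closed form $1 - 6 \cdot 4/(4 \cdot 5 \cdot 6) = 0.8$ agrees with the Wilcoxon-score statistic.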
6.3 Comparison of power for the Farlie-Gumbel-Morgenstern copula (α = 0.05)

Assuming that the population is normally distributed ensures that the t test, the van der Waerden test and the test based on $\rho$ have superior power. Now, we consider the Farlie-Gumbel-Morgenstern copula (13) with dependence parameter $\theta \in [-1, 1]$. If $\theta > 0$ we get positive dependence in the sense of Lehmann. This means that

$$C(u, v) \ge uv, \quad u, v \in [0, 1]$$

holds. For different positive parameter values $\theta$ we draw 100000 bivariate samples from the FGM copula. Then we check how often the null hypothesis of independence is rejected. Because the results for the van der Waerden and the Spearman test are very similar, we restrict the discussion to the power of Spearman's test and the rank gamma test.

    θ      rank gamma test       Spearman test
           exact     asympt.     exact     asympt.
    n = 100
    0.1    0.0608    0.0443      0.0989    0.0972
    0.2    0.0809    0.0572      0.1764    0.1644
    0.4    0.1290    0.1072      0.3824    0.3812
    0.6    0.1972    0.1597      0.6562    0.6436
    0.8    0.3080    0.2515      0.8526    0.8576
    1.0    0.4201    0.3577      0.9659    0.9670
    n = 1000
    0.1    0.0703    0.0489      0.2756    0.2691
    0.2    0.0865    0.0626      0.6864    0.6392
    0.4    0.1405    0.1363      0.9864    0.9943
    0.6    0.2213    0.1827      1.0000    1.0000
    0.8    0.3435    0.2885      1.0000    1.0000
    1.0    0.4829    0.4233      1.0000    1.000

Table 7: Comparison of power for the Spearman test and the rank gamma test for different sample sizes $n$ and dependence parameters $\theta$.

Again, we see the familiar picture. The power of the rank gamma test increases slowly with sample size and strength of dependence. In contrast, the power of the Spearman test is already high for moderate sample sizes (e.g. $n = 100$) and moderate values of the dependence parameter.

6.4 Comparison of power for the Resnick case (α = 0.05)

Zhang (2008), p. 1020, discusses an example of an alternative going back to Resnick. This alternative models an extremely positive form of dependence.
Consider two random variables $U$ and $V$ depending on a uniform random variable $W$ and a standard normal variable $Z$, where $W$ and $Z$ are independent:

$$U = 1/W, \quad V = 1/(1 - W) + Z.$$

For this extreme situation one can expect the rank gamma test and the Spearman test to show high power.

    n       rank gamma test       Spearman test
            exact     asympt.     exact     asympt.
    10      0.6591    0.4799      0.8803    0.8803
    100     0.9042    0.8800      1.0000    1.0000
    1000    0.9551    0.9396      1.0000    1.0000

Table 8: Comparison of power for the Spearman test and the rank gamma test in the Resnick case of extreme positive dependence.

Now, the rank gamma test shows the expected high power. But the Spearman test again has superior properties. This holds especially for small sample sizes.

7 Summary

We identified some restrictions of the test on independence proposed by Zhang. Firstly, the quotient correlation coefficient can only measure the strength of monotone patterns of dependence. Whether this coefficient maintains the Lehmann ordering is still an open question. We got some hints that it does not. Secondly, the asymptotic theory of the gamma test is rather simple. But the speed of convergence is very slow, except in the unrealistic situation of known marginal distributions. Thirdly, Zhang's rank based gamma test strongly depends on the initial sample from the standard Fréchet distribution. We cannot recommend using this test. Fourthly, comparisons of power show that the rank gamma test is clearly inferior to the well known traditional linear rank tests, like the Spearman test, for several alternatives.

It remains to investigate the power of the test if we give up the assumption of independent and identically distributed random variables. The asymptotic theory of Zhang's gamma test still works for stochastic processes with special mixing conditions. To our knowledge, similar results for the asymptotic distribution of the linear rank test have not been established. Zhang modifies the gamma test to test for tail independence.
Again, investigating the power of this test in comparison to well known competitors, which have been discussed e.g. in Schmidt & Stadtmüller (2006) and Schmid & Schmidt (2007), remains an open task.

References

[1] Büning, H. & Trenkler, G. (1994). Nichtparametrische statistische Methoden. Berlin.
[2] Hájek, J. & Šidák, Z. (1967). Theory of Rank Tests. New York.
[3] Kick, K. (2011). Quotientenkorrelation und Finanzmarktdaten. Diplomarbeit. Universität Erlangen-Nürnberg.
[4] Pfeiffer, D. (1989). Einführung in die Extremwertstatistik. Stuttgart.
[5] Schmid, F. & Schmidt, R. (2007). Multivariate conditional versions of Spearman's rho and related measures of tail dependence. Journal of Multivariate Analysis 98, 1123-1140.
[6] Schmidt, R. & Stadtmüller, U. (2006). Nonparametric estimation of tail dependence. Scandinavian Journal of Statistics 33, 307-335.
[7] Zhang, Z. (2008). Quotient correlation: A sample based alternative to Pearson's correlation. Annals of Statistics 36, 1007-1030.
