3.6: General Hypothesis Tests
in the critical region (thus leading us to reject the NH) in no more than 5% of the repeated trials. In other words, we expect our rejection of the null hypothesis to be the wrong decision no more than 5 times in every 100 experiments.
Consider $n$ independent observations of a discrete random variable, grouped into $k$ classes with observed frequencies $o_i$ and expected frequencies $e_i$ predicted by the model under the NH. The test statistic

$$\chi^2 = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i}$$

has approximately a $\chi^2$ pdf with $\nu = k - 1 - m$ degrees of freedom. Here $m$ denotes the number of parameters (possibly zero) of the model discrete distribution which one needs to estimate before one can compute the expected frequencies, and $\nu$ is reduced by one further degree of freedom because of the constraint that $\sum_{i=1}^{k} e_i = n$. In other words, once we have computed the first $k-1$ expected frequencies, the $k$th value is uniquely determined by the sample size $n$. This $\chi^2$ goodness of fit test need not be restricted only to discrete random variables, since we can effectively produce discrete data from a sample drawn from a continuous pdf by binning the data. Indeed, as we remarked in Section 2.2.7, the Central Limit Theorem will ensure that such binned data are approximately normally distributed, which means that the sum of their squares will be approximately distributed as a $\chi^2$ random variable. The approximation to a $\chi^2$ pdf is very good provided $e_i \geq 10$, and is reasonable for $5 \leq e_i \leq 10$.

Example 1. A list of 1000 random digits (integers from 0 to 9) is generated by a computer. Can this list of digits be regarded as uniformly distributed? Suppose the integers appear in the list with the following frequencies:

r      0    1    2    3    4    5    6    7    8    9
o_r    106  88   97   101  92   103  96   112  114  91
Let our NH be that the digits are drawn from a uniform distribution. This means that each digit is expected to occur with equal frequency, i.e. $e_r = 100$ for all $r$. Thus:

$$\chi^2 = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i} = 7.00$$
Suppose we adopt a 5% level of significance. Since $k = 10$ and no parameters were estimated, the number of degrees of freedom is $\nu = 9$; hence the critical value is $\chi^2 = 16.9$ for a one-tailed test. Since $7.00 < 16.9$, at the 5% significance level we accept the NH that the digits are uniformly distributed.
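This calculation is easy to check numerically. The sketch below is a minimal illustration, assuming Python with NumPy and SciPy (which these notes do not otherwise use), and takes the digit frequencies from the table above; scipy.stats.chisquare returns the same statistic together with a p-value.

```python
# A minimal sketch (assuming Python with NumPy and SciPy) that reproduces
# the chi-squared goodness of fit test of Example 1.
import numpy as np
from scipy import stats

observed = np.array([106, 88, 97, 101, 92, 103, 96, 112, 114, 91])
expected = np.full(10, 100.0)        # e_r = 1000/10 = 100 under the NH

chi2 = np.sum((observed - expected) ** 2 / expected)
nu = len(observed) - 1               # k - 1 = 9 (no parameters estimated)
crit = stats.chi2.ppf(0.95, df=nu)   # 5% one-tailed critical value

print(f"chi2 = {chi2:.2f}, critical value = {crit:.2f}")  # 7.00 < 16.92: accept NH
# scipy.stats.chisquare returns the same statistic together with a p-value:
print(stats.chisquare(observed, expected))
```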
Example 2. The table below shows the number of nights during a 50 night observing run when $r$ hours of observing time were clouded out. Fit a Poisson distribution to these data for the pdf of $r$ and determine if the fit is acceptable at the 5% significance level.

r              0    1    2    3    4    >4
No. of nights  21   18   7    3    1    0
Of course one might ask whether a Poisson distribution is a sensible model for the pdf of $r$, since a Poisson RV is defined for any non-negative integer, whereas $r$ is clearly at most 12 hours. However, as we saw in Section 1.3.2, the shape of the Poisson pdf is sensitive to the value of the mean, $\mu$, and in particular for small values of $\mu$ the value of the pdf will be negligible for all but the first few integers, and so we neglect all larger integers as possible outcomes. Hence, in fitting a Poisson model
we also need to estimate the value of $\mu$. We take as our estimator of $\mu$ the sample mean, i.e.

$$\hat{\mu} = \frac{21 \times 0 + 18 \times 1 + 7 \times 2 + 3 \times 3 + 1 \times 4}{50} = 0.90$$
Substituting this value into the Poisson pdf we can compute the expected frequencies, $e_r = 50\,p(r; \hat{\mu})$, where

p(0; 0.90) = 0.4066     p(3; 0.90) = 0.0494
p(1; 0.90) = 0.3659     p(4; 0.90) = 0.0111
p(2; 0.90) = 0.1647     p(5; 0.90) = 0.0020
If we consider only five outcomes, i.e. $r \leq 4$, since the value of the pdf is negligible for $r > 4$, then the number of degrees of freedom is $\nu = k - 1 - m = 5 - 1 - 1 = 3$ (remember that we had to estimate the mean, $\mu$). The value of the test statistic is $\chi^2 = 0.68$, which is smaller than the critical value of 7.81 at the 5% significance level. Hence we accept the NH, i.e. the data are well fitted by a Poisson distribution.
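Again the fit is easy to reproduce numerically; the sketch below (same assumptions as before: NumPy and SciPy) shows in particular how the degrees of freedom are reduced by one for the estimated mean.

```python
# A minimal sketch (assuming NumPy and SciPy) of the Poisson fit of Example 2.
import numpy as np
from scipy import stats

r = np.arange(5)                        # outcomes r = 0..4; p(r) negligible beyond
observed = np.array([21, 18, 7, 3, 1])  # nights with r hours clouded out
n = observed.sum()                      # 50 nights in total

mu_hat = np.sum(r * observed) / n       # sample mean as estimator of mu: 0.90
expected = n * stats.poisson.pmf(r, mu_hat)

chi2 = np.sum((observed - expected) ** 2 / expected)
nu = len(r) - 1 - 1                     # k - 1 - m, with m = 1 estimated parameter
crit = stats.chi2.ppf(0.95, df=nu)
print(f"chi2 = {chi2:.2f}, critical value = {crit:.2f}")  # 0.68 < 7.81: accept NH
```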
A popular alternative goodness of fit test, which does not require binning the data, is the Kolmogorov-Smirnov (KS) test. Let $x_{(1)} \leq x_{(2)} \leq \cdots \leq x_{(n)}$ denote an ordered random sample, and define the sample cdf

$$S_n(x) = \begin{cases} 0, & x < x_{(1)} \\ i/n, & x_{(i)} \leq x < x_{(i+1)} \\ 1, & x \geq x_{(n)} \end{cases}$$
i.e. $S_n(x)$ is a step function which increments by $1/n$ at each sampled value of $x$. Let the model cdf be $P(x)$, corresponding to pdf $p(x)$, and let the null hypothesis be that our random sample is drawn from $p(x)$. The KS test statistic is

$$D_n = \max_x |P(x) - S_n(x)|$$

It is easy to show that $D_n$ always occurs at one of the sampled values of $x$. The remarkable fact about the KS test is that the distribution of $D_n$ under the null hypothesis is independent of the functional form of $P(x)$. In other words, whatever the form of the model cdf, $P(x)$, we can determine how likely it is that our actual sample data was drawn from the corresponding pdf. Critical values for the KS statistic are tabulated, or can be obtained e.g. from Numerical Recipes algorithms.
The KS test is an example of a robust, or nonparametric, test, since one can apply it with minimal assumptions about the parametric form of the underlying pdf. The price of this robustness is that the power of the KS test is lower than that of parametric tests. In other words, there is a higher probability of accepting a false null hypothesis that two samples are drawn from the same pdf, because we are making no assumptions about the parametric form of that pdf.
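As an illustration, the sketch below (assuming NumPy and SciPy; the sample itself is invented for the example) computes $D_n$ directly from its definition, using the fact that the maximum deviation occurs at a sampled value, and checks the result against scipy.stats.kstest.

```python
# A minimal sketch (assuming NumPy and SciPy; the sample is invented) of the
# one-sample KS test: compute D_n directly and compare with scipy.stats.kstest.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
x = np.sort(rng.normal(size=100))   # NH: sample drawn from a standard normal pdf

P = stats.norm.cdf(x)               # model cdf evaluated at the sampled values
i = np.arange(1, len(x) + 1)
# S_n steps from (i-1)/n to i/n at x_(i), so the maximum deviation from P(x)
# occurs at a sampled value, just before or just after a step:
Dn = np.max(np.maximum(i / len(x) - P, P - (i - 1) / len(x)))

stat, pvalue = stats.kstest(x, stats.norm.cdf)
print(f"D_n = {Dn:.4f} (scipy: {stat:.4f}), p-value = {pvalue:.3f}")
```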
We now consider testing hypotheses about the correlation coefficient, $\rho$, of two random variables $x$ and $y$. We estimate $\rho$ by the sample correlation coefficient, $\hat{\rho}$, defined by:

$$\hat{\rho} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\left[\sum (x_i - \bar{x})^2\right] \left[\sum (y_i - \bar{y})^2\right]}}$$
where, as usual, $\bar{x}$ and $\bar{y}$ denote the sample means of $x$ and $y$ respectively, and all sums are over $i = 1, \ldots, n$, for sample size $n$. $\hat{\rho}$ is also often denoted by $r$, and is referred to as Pearson's correlation coefficient. If $x$ and $y$ do have a bivariate normal pdf, then $\rho$ corresponds precisely to the parameter defined in Section 3.1. To test hypotheses about $\rho$ we need to know the sampling distribution of $\hat{\rho}$. We consider two special cases, both of which assume that $x$ and $y$ have a bivariate normal pdf.
(i): $\rho = 0$. In this case the statistic

$$t = \hat{\rho}\,\sqrt{\frac{n-2}{1-\hat{\rho}^2}}$$

has a Student's t distribution with $\nu = n - 2$ degrees of freedom. Hence, we can use $t$ to test the hypothesis that $x$ and $y$ are independent.
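A minimal sketch of this test (assuming NumPy and SciPy; the synthetic x and y data are invented purely for illustration):

```python
# A minimal sketch (assuming NumPy and SciPy; x and y are synthetic data,
# invented for illustration) of the t-test for the NH rho = 0.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(size=30)
y = 0.5 * x + rng.normal(size=30)    # deliberately correlated test data
n = len(x)

rho_hat = np.corrcoef(x, y)[0, 1]    # Pearson's sample correlation coefficient
t = rho_hat * np.sqrt((n - 2) / (1 - rho_hat**2))
crit = stats.t.ppf(0.975, df=n - 2)  # 5% two-tailed critical value

print(f"rho_hat = {rho_hat:.3f}, t = {t:.2f}; |t| > {crit:.2f} rejects independence")
```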
(ii): $\rho = \rho_0 \neq 0$. In this case, for large samples, the statistic

$$z = \frac{1}{2} \log_e \left( \frac{1 + \hat{\rho}}{1 - \hat{\rho}} \right)$$

is approximately normally distributed, with mean

$$\mu_z = \frac{1}{2} \log_e \left( \frac{1 + \rho_0}{1 - \rho_0} \right)$$

and variance

$$\sigma_z^2 = \frac{1}{n - 3}$$

Hence, we can use $z$ to test the NH that $\rho = \rho_0$.
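A minimal sketch of this large-sample test (assuming NumPy and SciPy; the helper names fisher_z and corr_ztest, and the numbers in the example call, are invented for illustration):

```python
# A minimal sketch (assuming NumPy and SciPy; the helper names and the numbers
# in the example call are invented) of the large-sample z-test for rho = rho_0.
import numpy as np
from scipy import stats

def fisher_z(r):
    """The transformation z = (1/2) log[(1 + r) / (1 - r)]."""
    return 0.5 * np.log((1.0 + r) / (1.0 - r))

def corr_ztest(rho_hat, rho_0, n):
    """Standardised statistic (z - mu_z) / sigma_z, approximately N(0, 1)."""
    sigma_z = 1.0 / np.sqrt(n - 3)
    return (fisher_z(rho_hat) - fisher_z(rho_0)) / sigma_z

# e.g. test the NH rho_0 = 0.5 given a sample correlation of 0.7 from n = 50 pairs:
z = corr_ztest(0.7, 0.5, 50)
p = 2 * (1 - stats.norm.cdf(abs(z)))  # two-tailed p-value
print(f"z = {z:.2f}, p-value = {p:.3f}")  # z = 2.18: reject the NH at the 5% level
```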