3.6: General Hypothesis Tests
in the critical region (thus leading us to reject the NH) in no more than 5% of the repeated trials. In other words, we expect our rejection of the null hypothesis to be the wrong decision no more than 5 times in every 100 experiments.
Consider $n$ independent observations of a discrete random variable, grouped into $k$ classes with observed frequencies $o_i$ and expected frequencies $e_i$ predicted by the model under the NH. The test statistic

$$\chi^2 = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i}$$

has approximately a $\chi^2$ pdf with $\nu = k - 1 - m$ degrees of freedom. Here $m$ denotes the number of parameters (possibly zero) of the model discrete distribution which one needs to estimate before one can compute the expected frequencies, and $\nu$ is reduced by one further degree of freedom because of the constraint that $\sum_{i=1}^{k} e_i = n$. In other words, once we have computed the first $k-1$ expected frequencies, the $k$th value is uniquely determined by the sample size $n$. This $\chi^2$ goodness of fit test need not be restricted only to discrete random variables, since we can effectively produce discrete data from a sample drawn from a continuous pdf by binning the data. Indeed, as we remarked in Section 2.2.7, the Central Limit Theorem will ensure that such binned data are approximately normally distributed, which means that the sum of their squares will be approximately distributed as a $\chi^2$ random variable. The approximation to a $\chi^2$ pdf is very good provided $e_i \geq 10$, and is reasonable for $5 \leq e_i \leq 10$.

Example 1. A list of 1000 random digits (integers from 0 to 9) is generated by a computer. Can this list of digits be regarded as uniformly distributed? Suppose the integers appear in the list with the following frequencies:

r      0    1    2    3    4    5    6    7    8    9
o_r    106  88   97   101  92   103  96   112  114  91
Let our NH be that the digits are drawn from a uniform distribution. This means that each digit is expected to occur with equal frequency, i.e. $e_r = 100$ for all $r$. Thus:

$$\chi^2 = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i} = 7.00$$
Suppose we adopt a 5% level of significance. Since $k = 10$ and no parameters were estimated, the number of degrees of freedom is $\nu = 9$; hence the critical value is $\chi^2 = 16.9$ for a one-tailed test. Since $7.00 < 16.9$, at the 5% significance level we accept the NH that the digits are uniformly distributed.
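This calculation is easy to check numerically. The sketch below is a minimal illustration, assuming Python with NumPy and SciPy (which these notes do not otherwise use), and takes the digit frequencies from the table above; scipy.stats.chisquare returns the same statistic together with a p-value.

```python
# A minimal sketch (assuming Python with NumPy and SciPy) that reproduces
# the chi-squared goodness of fit test of Example 1.
import numpy as np
from scipy import stats

observed = np.array([106, 88, 97, 101, 92, 103, 96, 112, 114, 91])
expected = np.full(10, 100.0)        # e_r = 1000/10 = 100 under the NH

chi2 = np.sum((observed - expected) ** 2 / expected)
nu = len(observed) - 1               # k - 1 = 9 (no parameters estimated)
crit = stats.chi2.ppf(0.95, df=nu)   # 5% one-tailed critical value

print(f"chi2 = {chi2:.2f}, critical value = {crit:.2f}")  # 7.00 < 16.92: accept NH
# scipy.stats.chisquare returns the same statistic together with a p-value:
print(stats.chisquare(observed, expected))
```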
Example 2. The table below shows the number of nights during a 50 night observing run when $r$ hours of observing time were clouded out. Fit a Poisson distribution to these data for the pdf of $r$ and determine if the fit is acceptable at the 5% significance level.

r              0    1    2    3    4    >4
No. of nights  21   18   7    3    1    0
Of course one might ask whether a Poisson distribution is a sensible model for the pdf of $r$, since a Poisson RV is defined for any non-negative integer, whereas $r$ is clearly at most 12 hours. However, as we saw in Section 1.3.2, the shape of the Poisson pdf is sensitive to the value of the mean, $\mu$, and in particular for small values of $\mu$ the value of the pdf will be negligible for all but the first few integers, and so we neglect all larger integers as possible outcomes. Hence, in fitting a Poisson model
we also need to estimate the value of $\mu$. We take as our estimator of $\mu$ the sample mean, i.e.

$$\hat{\mu} = \frac{21 \times 0 + 18 \times 1 + 7 \times 2 + 3 \times 3 + 1 \times 4}{50} = 0.90$$
Substituting this value into the Poisson pdf we can compute the expected frequencies, $e_r = 50\,p(r; \hat{\mu})$, where

p(0; 0.90) = 0.4066     p(3; 0.90) = 0.0494
p(1; 0.90) = 0.3659     p(4; 0.90) = 0.0111
p(2; 0.90) = 0.1647     p(5; 0.90) = 0.0020
If we consider only five outcomes, i.e. $r \leq 4$, since the value of the pdf is negligible for $r > 4$, then the number of degrees of freedom is $\nu = k - 1 - m = 5 - 1 - 1 = 3$ (remember that we had to estimate the mean, $\mu$). The value of the test statistic is $\chi^2 = 0.68$, which is smaller than the critical value of 7.81 at the 5% significance level. Hence we accept the NH, i.e. the data are well fitted by a Poisson distribution.
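Again the fit is easy to reproduce numerically; the sketch below (same assumptions as before: NumPy and SciPy) shows in particular how the degrees of freedom are reduced by one for the estimated mean.

```python
# A minimal sketch (assuming NumPy and SciPy) of the Poisson fit of Example 2.
import numpy as np
from scipy import stats

r = np.arange(5)                        # outcomes r = 0..4; p(r) negligible beyond
observed = np.array([21, 18, 7, 3, 1])  # nights with r hours clouded out
n = observed.sum()                      # 50 nights in total

mu_hat = np.sum(r * observed) / n       # sample mean as estimator of mu: 0.90
expected = n * stats.poisson.pmf(r, mu_hat)

chi2 = np.sum((observed - expected) ** 2 / expected)
nu = len(r) - 1 - 1                     # k - 1 - m, with m = 1 estimated parameter
crit = stats.chi2.ppf(0.95, df=nu)
print(f"chi2 = {chi2:.2f}, critical value = {crit:.2f}")  # 0.68 < 7.81: accept NH
```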
A popular alternative goodness of fit test, which does not require binning the data, is the Kolmogorov-Smirnov (KS) test. Let $x_{(1)} \leq x_{(2)} \leq \cdots \leq x_{(n)}$ denote an ordered random sample, and define the sample cdf

$$S_n(x) = \begin{cases} 0, & x < x_{(1)} \\ i/n, & x_{(i)} \leq x < x_{(i+1)} \\ 1, & x \geq x_{(n)} \end{cases}$$
i.e. $S_n(x)$ is a step function which increments by $1/n$ at each sampled value of $x$. Let the model cdf be $P(x)$, corresponding to pdf $p(x)$, and let the null hypothesis be that our random sample is drawn from $p(x)$. The KS test statistic is

$$D_n = \max_x |P(x) - S_n(x)|$$

It is easy to show that $D_n$ always occurs at one of the sampled values of $x$. The remarkable fact about the KS test is that the distribution of $D_n$ under the null hypothesis is independent of the functional form of $P(x)$. In other words, whatever the form of the model cdf, $P(x)$, we can determine how likely it is that our actual sample data was drawn from the corresponding pdf. Critical values for the KS statistic are tabulated, or can be obtained e.g. from Numerical Recipes algorithms.
The KS test is an example of a robust, or nonparametric, test, since one can apply it with minimal assumptions about the parametric form of the underlying pdf. The price of this robustness is that the power of the KS test is lower than that of parametric tests. In other words, there is a higher probability of accepting a false null hypothesis that two samples are drawn from the same pdf, because we are making no assumptions about the parametric form of that pdf.
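As an illustration, the sketch below (assuming NumPy and SciPy; the sample itself is invented for the example) computes $D_n$ directly from its definition, using the fact that the maximum deviation occurs at a sampled value, and checks the result against scipy.stats.kstest.

```python
# A minimal sketch (assuming NumPy and SciPy; the sample is invented) of the
# one-sample KS test: compute D_n directly and compare with scipy.stats.kstest.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
x = np.sort(rng.normal(size=100))   # NH: sample drawn from a standard normal pdf

P = stats.norm.cdf(x)               # model cdf evaluated at the sampled values
i = np.arange(1, len(x) + 1)
# S_n steps from (i-1)/n to i/n at x_(i), so the maximum deviation from P(x)
# occurs at a sampled value, just before or just after a step:
Dn = np.max(np.maximum(i / len(x) - P, P - (i - 1) / len(x)))

stat, pvalue = stats.kstest(x, stats.norm.cdf)
print(f"D_n = {Dn:.4f} (scipy: {stat:.4f}), p-value = {pvalue:.3f}")
```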
We now consider testing hypotheses about the correlation coefficient, $\rho$, of two random variables $x$ and $y$. We estimate $\rho$ by the sample correlation coefficient, $\hat{\rho}$, defined by:

$$\hat{\rho} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\left[\sum (x_i - \bar{x})^2\right] \left[\sum (y_i - \bar{y})^2\right]}}$$
where, as usual, $\bar{x}$ and $\bar{y}$ denote the sample means of $x$ and $y$ respectively, and all sums are over $i = 1, \ldots, n$, for sample size $n$. $\hat{\rho}$ is also often denoted by $r$, and is referred to as Pearson's correlation coefficient. If $x$ and $y$ do have a bivariate normal pdf, then $\rho$ corresponds precisely to the parameter defined in Section 3.1. To test hypotheses about $\rho$ we need to know the sampling distribution of $\hat{\rho}$. We consider two special cases, both of which assume that $x$ and $y$ have a bivariate normal pdf.
(i): $\rho = 0$. In this case the statistic

$$t = \hat{\rho}\,\sqrt{\frac{n-2}{1-\hat{\rho}^2}}$$

has a Student's t distribution with $\nu = n - 2$ degrees of freedom. Hence, we can use $t$ to test the hypothesis that $x$ and $y$ are independent.
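A minimal sketch of this test (assuming NumPy and SciPy; the synthetic x and y data are invented purely for illustration):

```python
# A minimal sketch (assuming NumPy and SciPy; x and y are synthetic data,
# invented for illustration) of the t-test for the NH rho = 0.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(size=30)
y = 0.5 * x + rng.normal(size=30)    # deliberately correlated test data
n = len(x)

rho_hat = np.corrcoef(x, y)[0, 1]    # Pearson's sample correlation coefficient
t = rho_hat * np.sqrt((n - 2) / (1 - rho_hat**2))
crit = stats.t.ppf(0.975, df=n - 2)  # 5% two-tailed critical value

print(f"rho_hat = {rho_hat:.3f}, t = {t:.2f}; |t| > {crit:.2f} rejects independence")
```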
(ii): $\rho = \rho_0 \neq 0$. In this case, for large samples, the statistic

$$z = \frac{1}{2} \log_e \left( \frac{1 + \hat{\rho}}{1 - \hat{\rho}} \right)$$

is approximately normally distributed, with mean

$$\mu_z = \frac{1}{2} \log_e \left( \frac{1 + \rho_0}{1 - \rho_0} \right)$$

and variance

$$\sigma_z^2 = \frac{1}{n - 3}$$

Hence, we can use $z$ to test the NH that $\rho = \rho_0$.
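A minimal sketch of this large-sample test (assuming NumPy and SciPy; the helper names fisher_z and corr_ztest, and the numbers in the example call, are invented for illustration):

```python
# A minimal sketch (assuming NumPy and SciPy; the helper names and the numbers
# in the example call are invented) of the large-sample z-test for rho = rho_0.
import numpy as np
from scipy import stats

def fisher_z(r):
    """The transformation z = (1/2) log[(1 + r) / (1 - r)]."""
    return 0.5 * np.log((1.0 + r) / (1.0 - r))

def corr_ztest(rho_hat, rho_0, n):
    """Standardised statistic (z - mu_z) / sigma_z, approximately N(0, 1)."""
    sigma_z = 1.0 / np.sqrt(n - 3)
    return (fisher_z(rho_hat) - fisher_z(rho_0)) / sigma_z

# e.g. test the NH rho_0 = 0.5 given a sample correlation of 0.7 from n = 50 pairs:
z = corr_ztest(0.7, 0.5, 50)
p = 2 * (1 - stats.norm.cdf(abs(z)))  # two-tailed p-value
print(f"z = {z:.2f}, p-value = {p:.3f}")  # z = 2.18: reject the NH at the 5% level
```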