Martin Luther University Halle-Wittenberg

Department of Economics
Chair of Econometrics
Prof. Dr. Christoph Wunder

Statistics II - Past Exam Questions

Problem 1

In a random sample of size n = 621, researchers find that 232 respondents never exercise or do sport.

1.1 Test whether the population proportion of individuals who never exercise or do sport is greater than one-third.
State the hypotheses, rejection rule, test statistic and conclusion (α = 0.05).

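• Hypotheses: $H_0: p \le \tfrac{1}{3}$ vs. $H_1: p > \tfrac{1}{3}$
• Rejection rule: reject $H_0$ if $Z > z_{\alpha} = z_{0.05} = 1.645$
• Empirical value of test statistic:

$$\hat{p} = \frac{232}{621} = 0.374$$

$$Z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}} = \frac{0.374 - 0.333}{\sqrt{\frac{0.333(1-0.333)}{621}}} = \frac{0.041}{0.0189} = 2.168$$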

• Conclusion: As the empirical value of the test statistic is larger than the critical value,
we reject the null hypothesis that the proportion of individuals who never exercise or do sport
is at most one-third.

1.2 Compute the p value for the test above.

• p value: $P(Z > 2.168 \mid H_0) = 1 - P(Z \le 2.168 \mid H_0) = 1 - 0.985 = 0.015$
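The test statistic and p value can be reproduced with a few lines of standard-library Python. This is a minimal sketch: the helper `normal_cdf` is an illustrative name, not part of the exam, and it evaluates the standard normal CDF through the error function. Without intermediate rounding of $\hat{p}$ and $p_0$ the statistic comes out near 2.13 and the p value near 0.017, so the decision at α = 0.05 is unchanged.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


n, successes, p0 = 621, 232, 1 / 3

p_hat = successes / n
# Standard error under H0 uses p0, not p_hat
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
p_value = 1.0 - normal_cdf(z)  # upper-tail test, H1: p > p0

print(f"p_hat = {p_hat:.4f}, Z = {z:.3f}, p value = {p_value:.4f}")
print("reject H0" if z > 1.645 else "do not reject H0")
```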

Problem 2

Let $X_1, X_2, \dots, X_n$ denote a random sample of size $n$ from a uniform population with lower bound $a = 0$ and upper
bound $b$. Show that the estimator $\hat{\theta} = 2\bar{X}$ is an unbiased estimator of $b$. Use $E(X_i) = \frac{a+b}{2} = \frac{b}{2}$ as the expected value of
the uniform random variable.

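$$
\begin{aligned}
E(\hat{\theta}) &= E(2\bar{X}) \\
&= E\left(2 \cdot \frac{1}{n} \sum_{i=1}^{n} X_i\right) \\
&= \frac{2}{n} \sum_{i=1}^{n} E(X_i) \\
&= \frac{2}{n} \sum_{i=1}^{n} \frac{b}{2} \\
&= \frac{2 \cdot n \cdot b}{2 \cdot n} \\
&= b
\end{aligned}
$$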

Hence, θ̂ is an unbiased estimator of b.
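The result can also be checked numerically. The sketch below is a Monte Carlo illustration, not a proof: it draws many samples from a uniform distribution on [0, b] and averages the estimates $2\bar{X}$. The values b = 5, n = 10, the number of replications and the seed are arbitrary choices made for this example.

```python
import random


def average_estimate(b: float, n: int, reps: int, seed: int = 0) -> float:
    """Average of the estimator 2 * sample mean over many simulated samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample_mean = sum(rng.uniform(0.0, b) for _ in range(n)) / n
        total += 2.0 * sample_mean
    return total / reps


avg = average_estimate(b=5.0, n=10, reps=20_000)
print(f"average of 2 * X_bar over simulations: {avg:.3f} (true b = 5)")
```

The average should land close to b, which is what unbiasedness predicts.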

Problem 3

A teacher believes that girls do better on her exams than boys do. A sample of 10 girls (n1 = 10) and 12 boys
(n2 = 12) yields $\bar{x}_1 = 7$, $\bar{x}_2 = 5.5$, $s_1^2 = 1$, $s_2^2 = 1.7$. Assume normality of both populations and equal variances.
Carry out an appropriate test for the teacher’s hypothesis. State the hypotheses, degrees of freedom, rejection rule,
empirical value of the test statistic and conclusion (α = 0.05).

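• Definition of random variables: $X_{1i}$ is the exam result of girl $i$, $i = 1, \dots, n_1$; $X_{2j}$ is the
exam result of boy $j$, $j = 1, \dots, n_2$.
• Hypotheses: $H_0: \mu_1 \le \mu_2$ vs. $H_1: \mu_1 > \mu_2$
• Degrees of freedom: $n_1 + n_2 - 2 = 10 + 12 - 2 = 20$
• Rejection rule: reject $H_0$ if $t > t_{n_1+n_2-2;\alpha} = t_{20;0.05} = 1.725$
• Empirical value of test statistic:

$$
\begin{aligned}
t &= \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\left(\frac{1}{n_1} + \frac{1}{n_2}\right) \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}} \\
&= \frac{7 - 5.5}{\sqrt{\left(\frac{1}{10} + \frac{1}{12}\right) \frac{(10-1)\cdot 1 + (12-1)\cdot 1.7}{10+12-2}}} \\
&= \frac{1.5}{\sqrt{\left(\frac{1}{10} + \frac{1}{12}\right) \frac{9 + 18.7}{20}}} \\
&= \frac{1.5}{\sqrt{0.254}} \\
&= 2.98
\end{aligned}
$$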

• Conclusion: As the empirical value of the test statistic is greater than the critical value, we
reject the null hypothesis at the 5% level. We conclude that girls do better on her exams than
boys do.
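The pooled-variance statistic is easy to get wrong by mixing up variances and standard deviations, so a short check may help. The sketch below takes the sample variances exactly as stated in the problem ($s_1^2 = 1$, $s_2^2 = 1.7$) and gives t ≈ 2.98; treating 1.7 as a standard deviation instead (variance 2.89) would give about 2.45. Either way the statistic exceeds the critical value. The function name `pooled_t` is illustrative, and the critical value is copied from a t table rather than computed.

```python
import math


def pooled_t(mean1, mean2, var1, var2, n1, n2):
    """Two-sample t statistic under the equal-variance assumption."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


t_stat, df = pooled_t(7.0, 5.5, 1.0, 1.7, 10, 12)
T_CRIT = 1.725  # t_{20;0.05}, taken from a t table
reject = t_stat > T_CRIT

print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject}")
```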

Problem 4

Consider an empirical investigation to assess the effectiveness of a training program designed to increase monthly
wages (measured in Euros). If the program is effective, the participants will exhibit higher monthly wages after the
program as compared to before the program. The table below shows data for a random sample of 10 participants.

participant   wage before program   wage after program
1             1080                  1200
2             990                   980
3             1050                  1080
4             1200                  1260
5             1110                  1350
6             750                   760
7             1830                  1770
8             1260                  1320
9             1140                  1230
10            1050                  1110

Do you believe that the program exerts a significant influence (α = 0.10)? Carry out a sign test. State hypotheses,
rejection rule, test statistic, and conclude.

• Hypotheses: H0 : p ≥ 0.5 vs. H1 : p < 0.5, where p = P(X > Y ) is the probability that X (wage before
program) is greater than Y (wage after program).
• Rejection rule: We reject H0 if P(V ≤ v|H0 , n) ≤ α.
• Test statistic:

participant   before (X)   after (Y)   difference (Y − X)   sign of X − Y
1             1080         1200        120                  -
2             990          980         -10                  +
3             1050         1080        30                   -
4             1200         1260        60                   -
5             1110         1350        240                  -
6             750          760         10                   -
7             1830         1770        -60                  +
8             1260         1320        60                   -
9             1140         1230        90                   -
10            1050         1110        60                   -

Let V be the number of + signs (cases with X > Y). Here v = 2, so P(V ≤ 2 | H0, n = 10) = 0.055.


• Conclusion: Since 0.055 < 0.10, we reject H0 at the 10% level. There is significant evidence that wages are higher after the program.
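The binomial probability used in the sign test can be computed directly from the data. The following sketch counts the cases in which the wage before the program exceeds the wage after it and evaluates the lower-tail probability of a Binomial(n, 0.5) variable. Dropping tied pairs is a common convention that is assumed here; it has no effect on this data set because there are no ties. The helper `binom_cdf` is an illustrative name.

```python
from math import comb


def binom_cdf(v: int, n: int, p: float = 0.5) -> float:
    """P(V <= v) for V ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(v + 1))


before = [1080, 990, 1050, 1200, 1110, 750, 1830, 1260, 1140, 1050]
after = [1200, 980, 1080, 1260, 1350, 760, 1770, 1320, 1230, 1110]

plus_signs = sum(x > y for x, y in zip(before, after))  # before > after
n_eff = sum(x != y for x, y in zip(before, after))      # ties dropped (assumption)
p_value = binom_cdf(plus_signs, n_eff)

print(f"V = {plus_signs}, n = {n_eff}, P(V <= v | H0) = {p_value:.4f}")
```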

Problem 5

A sample of size n = 13 from a normal population yields an average of 3 and a standard deviation of 2.

5.1 Calculate the 99% confidence interval for the population mean.

• Quantile: $t_{n-1,\alpha/2} = t_{12,0.005} = 3.055$
• Confidence interval:

$$\text{ll: } 3 - 3.055 \cdot \frac{2}{\sqrt{13}} = 1.305$$

$$\text{ul: } 3 + 3.055 \cdot \frac{2}{\sqrt{13}} = 4.695$$

CI: [1.305, 4.695]
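As a quick arithmetic check, the interval limits can be computed as follows. This is a sketch under the assumption that the t quantile is looked up in a table (the Python standard library has no t distribution), so 3.055 is entered as a constant; `t_interval` is an illustrative helper name.

```python
import math


def t_interval(mean: float, sd: float, n: int, t_quantile: float):
    """Two-sided confidence interval for a normal mean with unknown variance."""
    half_width = t_quantile * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


T_12_0005 = 3.055  # t_{12,0.005}, taken from a t table
lower, upper = t_interval(mean=3.0, sd=2.0, n=13, t_quantile=T_12_0005)

print(f"99% CI for the mean: [{lower:.3f}, {upper:.3f}]")
```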

5.2 Calculate the 90% confidence interval for the population variance. (5 points)

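• Quantiles: $\chi^2_{n-1,1-\alpha/2} = \chi^2_{12,0.95} = 5.226$, $\chi^2_{n-1,\alpha/2} = \chi^2_{12,0.05} = 21.026$

• Confidence interval:

$$\text{ll: } \frac{(n-1)\,s^2}{\chi^2_{n-1,\alpha/2}} = \frac{12}{21.026} \cdot 4 = 2.283$$

$$\text{ul: } \frac{(n-1)\,s^2}{\chi^2_{n-1,1-\alpha/2}} = \frac{12}{5.226} \cdot 4 = 9.185$$

CI: [2.283, 9.185]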
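The variance interval follows the same pattern, with the larger chi-square quantile producing the lower limit. In the sketch below both quantiles are table values entered as constants (an assumption of this example, since the standard library offers no chi-square quantile function). Evaluated without truncation, the limits are about 2.283 and 9.185, which agrees with the interval above up to rounding in the last digit.

```python
def variance_interval(sd: float, n: int, chi2_upper: float, chi2_lower: float):
    """Two-sided confidence interval for a normal variance.

    chi2_upper is the quantile with alpha/2 probability to its right,
    chi2_lower the quantile with 1 - alpha/2 probability to its right.
    """
    scaled = (n - 1) * sd**2
    return scaled / chi2_upper, scaled / chi2_lower


CHI2_12_005 = 21.026  # chi-square, 12 df, upper-tail area 0.05 (table value)
CHI2_12_095 = 5.226   # chi-square, 12 df, upper-tail area 0.95 (table value)

var_lower, var_upper = variance_interval(2.0, 13, CHI2_12_005, CHI2_12_095)
print(f"90% CI for the variance: [{var_lower:.3f}, {var_upper:.3f}]")
```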

Problem 6

The table below shows summary statistics on hourly wages (in euros) before and after participating in an on-the-
job training program for n = 10 workers.

            before (B)   after (A)   difference (D)
mean        27.2         29.4        2.2
std. dev.   3.5          4.6         3.8

Test whether the program had a positive effect on wages. Assume that the differences in wages before and after
the program are normally distributed. State the hypotheses, degrees of freedom, rejection rule, empirical value of
the test statistic, and conclusion (α = 0.10).

• Hypotheses: $H_0: \mu_D \le 0$ vs. $H_1: \mu_D > 0$
• Degrees of freedom: $n - 1 = 9$
• Rejection rule: we reject $H_0$ if $t > t_{n-1,\alpha} = t_{9,0.10} = 1.383$
• Empirical value of test statistic:

$$t = \frac{\bar{D}}{s_D/\sqrt{n}} = \frac{2.2}{3.8/\sqrt{10}} = 1.831$$

• Decision: As the test statistic is higher than the critical value, we reject the null hypothesis
that the mean difference is at most zero. The data indicate a positive effect of the program on wages.

Problem 7

The table below shows summary statistics of life satisfaction for men and women:

         mean   std. dev.   sample size
men      7.14   1.00        11
women    7.16   1.24        10

Assume that life satisfaction is normally distributed in the populations. Carry out a test for equality of the two
population variances. State the hypotheses, degrees of freedom, rejection rule, empirical value of the test statistic,
and conclude (α = 0.05).

• Hypotheses: $H_0: \sigma_1^2 = \sigma_2^2$ vs. $H_1: \sigma_1^2 \ne \sigma_2^2$, where $\sigma_1^2$ and $\sigma_2^2$ are the population variances for men and
women, respectively.
• Degrees of freedom: $n_1 - 1 = 11 - 1 = 10$, $n_2 - 1 = 10 - 1 = 9$
• Rejection rule: we reject $H_0$ if $F > F_{n_1-1;n_2-1;\alpha/2} = F_{10,9,0.025} = 3.96$ or if $F < \frac{1}{F_{n_2-1;n_1-1;\alpha/2}} = \frac{1}{F_{9,10,0.025}} = 0.26$

• Empirical value of test statistic:

$$F = \frac{s_1^2}{s_2^2} = \frac{1.00^2}{1.24^2} = 0.6504$$

• Decision: As the test statistic is in the acceptance area, we are unable to reject the null
hypothesis that the population variances are equal.
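The two-sided decision rule can be written out explicitly. In this sketch the two critical values are the ones quoted in the rejection rule above and are entered as constants, since they come from an F table; the variable names are illustrative.

```python
sd_men, sd_women = 1.00, 1.24

# Ratio of sample variances (men over women), as in the solution
f_stat = sd_men**2 / sd_women**2

F_UPPER = 3.96  # F_{10,9,0.025}, table value quoted in the rejection rule
F_LOWER = 0.26  # 1 / F_{9,10,0.025}, as quoted in the rejection rule

reject = f_stat > F_UPPER or f_stat < F_LOWER
print(f"F = {f_stat:.4f}, reject H0: {reject}")
```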

Problem 8

A business magazine claims that the average market value of all pharma companies in the US is less than 250
million euros. A random sample of 30 firms reveals a sample mean of 235 million euros and a standard deviation
of 85 million euros. Carry out an appropriate test for the magazine’s claim. State the hypotheses, degrees of
freedom, rejection rule, empirical value of the test statistic, and conclude (α = 0.10).

• Hypotheses: $H_0: \mu \ge 250$ vs. $H_1: \mu < 250$
• Degrees of freedom: $n - 1 = 30 - 1 = 29$
• Rejection rule: reject $H_0$ if $t < -t_{n-1;\alpha} = -t_{29;0.10} = -1.311$
• Empirical value of test statistic:

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{235 - 250}{85/\sqrt{30}} = -\frac{15}{15.5188} = -0.9666$$

• Decision: As the test statistic is not lower than the negative critical value, we cannot reject
the null hypothesis that the average market value of all pharma companies in the US is greater than
or equal to 250 million euros. The sample does not support the magazine's claim.
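For a lower-tail test the comparison runs against the negative critical value, which is a common place for sign mistakes. The sketch below recomputes the statistic from the summary figures; the critical value is a table value entered as a constant, and `one_sample_t` is an illustrative name.

```python
import math


def one_sample_t(sample_mean: float, mu0: float, sd: float, n: int) -> float:
    """t statistic for a test about a single population mean."""
    return (sample_mean - mu0) / (sd / math.sqrt(n))


t_stat = one_sample_t(sample_mean=235.0, mu0=250.0, sd=85.0, n=30)
T_CRIT = 1.311  # t_{29;0.10}, taken from a t table

reject = t_stat < -T_CRIT  # lower-tail test, H1: mu < 250
print(f"t = {t_stat:.4f}, reject H0: {reject}")
```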

Problem 9

Consider a TV debate between two political parties, left (l) and right (r). The order of speakers is as follows:

speaker           1   2   3   4   5   6   7   8   9   10
political party   l   l   l   r   r   r   r   r   l   l

Do you believe that the order in which speakers of the political parties make their appearance is random? Carry
out a runs test. Give the hypotheses, rejection rule, test statistic, and conclusions (α = 0.05).

• Hypotheses: H0 : order of speakers is random vs. H1 : order of speakers is not random


• Rejection rule: we reject $H_0$ at the 5% significance level if $P(R \le r \mid H_0, n_1 = 5, n_2 = 5) \le 0.025$ or
$P(R \ge r \mid H_0, n_1 = 5, n_2 = 5) \le 0.025$.
• Test statistic: number of runs $R = 3$
• Conclusions:

$$P(R \le 3 \mid H_0, n_1 = 5, n_2 = 5) = 0.040$$

$$P(R \ge 3 \mid H_0, n_1 = 5, n_2 = 5) = 1 - P(R \le 2 \mid H_0, n_1 = 5, n_2 = 5) = 1 - 0.008 = 0.992$$

• Since 0.040 > 0.025 and 0.992 > 0.025, one cannot reject the H0 that the order of speakers is
random.
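The tail probabilities come from the exact distribution of the number of runs, which in an exam would be read from a table. As a hedged illustration, the sketch below counts the runs in the speaker sequence and evaluates the standard combinatorial formulas for the run distribution with $n_1$ and $n_2$ elements of each type; the function names are illustrative.

```python
from math import comb


def runs_pmf(r: int, n1: int, n2: int) -> float:
    """P(R = r) under H0 for a sequence with n1 and n2 elements of two types."""
    if r < 2:
        return 0.0
    total = comb(n1 + n2, n1)
    k = r // 2
    if r % 2 == 0:  # r = 2k: k runs of each type, either type may start
        ways = 2 * comb(n1 - 1, k - 1) * comb(n2 - 1, k - 1)
    else:           # r = 2k + 1: one type has k + 1 runs, the other k
        ways = (comb(n1 - 1, k) * comb(n2 - 1, k - 1)
                + comb(n1 - 1, k - 1) * comb(n2 - 1, k))
    return ways / total


def count_runs(seq: str) -> int:
    """Number of maximal blocks of identical symbols."""
    return 1 + sum(a != b for a, b in zip(seq, seq[1:]))


order = "lllrrrrrll"
n1, n2 = order.count("l"), order.count("r")
r = count_runs(order)

p_lower = sum(runs_pmf(i, n1, n2) for i in range(2, r + 1))        # P(R <= r)
p_upper = sum(runs_pmf(i, n1, n2) for i in range(r, n1 + n2 + 1))  # P(R >= r)
print(f"R = {r}, P(R <= r) = {p_lower:.3f}, P(R >= r) = {p_upper:.3f}")
```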

Problem 10

Statisticians want to estimate the proportion of immigrants in Germany with an accuracy of ± 0.01 and 99%
confidence. The proportion is believed to be at most 0.25. Find the minimum sample size required.

$$
\begin{aligned}
n &= \frac{\hat{p}\,(1 - \hat{p})\, z_{0.005}^2}{B^2} \\
&= \frac{0.25 \cdot (1 - 0.25) \cdot 2.57^2}{0.01^2} \\
&= 12384.187
\end{aligned}
$$

The survey must include at least 12385 individuals.
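The required sample size is the formula's value rounded up to the next integer. A minimal sketch, using the same inputs as the solution (including the quantile 2.57 as used there), is shown below; `min_sample_size` is an illustrative name.

```python
import math


def min_sample_size(p: float, z: float, margin: float) -> int:
    """Smallest n for which the margin of error of a proportion is at most `margin`."""
    return math.ceil(p * (1 - p) * z**2 / margin**2)


n_min = min_sample_size(p=0.25, z=2.57, margin=0.01)
print(f"minimum sample size: {n_min}")
```

Using p = 0.25 is the conservative choice here because p(1 − p) increases in p on [0, 0.5], so the largest plausible proportion gives the largest required n.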

Problem 11

The table below records the test scores of a random sample of 16 students from school A, for semester 1 and
semester 2. In semester 1 the students were in large classrooms with a teacher-student ratio of 1:50. In semester
2 the students were split into smaller classrooms with a teacher-student ratio of 1:20. A researcher wants to test
whether the students’ test scores increase when they are assigned to smaller classroom sizes. Assume that test
scores of all the students in school A are normally distributed.

Student      1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16
Semester 1   67  49  42  37  52  72  84  56  43  91  67  28  70  29  42  51
Semester 2   69  52  50  38  72  74  82  62  49  89  64  33  77  28  50  55

11.1 Carry out an appropriate test at the significance level of 5% to test the researcher’s claim. State the hypotheses,
the test statistic (and its distribution), the rejection rule, the empirical value of the test statistic, and your test
decision.

Step 1: State the hypotheses

• $H_0: \mu_D = \mu_2 - \mu_1 \le 0$
• $H_1: \mu_D = \mu_2 - \mu_1 > 0$, where $\mu_1$ and $\mu_2$ denote the population mean test scores in semester 1 and 2,
respectively.

Step 2: Select test statistic


• $t = \dfrac{\bar{D} - \mu_D}{s_D/\sqrt{n}} \sim t_{n-1}$

Step 3: Define the rejection rule

• $\alpha = 1 - 0.95 = 0.05$
• $t_{\alpha=0.05,\,df=15} = 1.753$
• We reject $H_0$ in favour of $H_1$ at the 5% level if $t > t_{15;0.05} = 1.753$

Step 4: Calculate the empirical value of the test statistic

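• Let $D_i = x_{2i} - x_{1i}$ be the difference between the semester 2 score ($x_{2i}$) and the semester 1 score ($x_{1i}$) of student $i$.
• $\bar{D} = \frac{1}{n}\sum_{i=1}^{n} D_i = \frac{1}{16} \cdot 64 = 4$
• $s_D = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (D_i - \bar{D})^2} = \sqrt{\frac{470}{15}} = 5.5976$
• $t = \dfrac{4 - 0}{5.5976/\sqrt{16}} = 2.86$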

Step 5: Decision

• Since 2.86 > 1.753, we reject the null hypothesis in favour of the alternative hypothesis at the
5% significance level.
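The intermediate sums (64 and 470) are tedious to verify by hand, so the paired test can be recomputed from the raw scores. The sketch below uses Python's `statistics` module; the critical value is a table value entered as a constant.

```python
import math
import statistics

semester1 = [67, 49, 42, 37, 52, 72, 84, 56, 43, 91, 67, 28, 70, 29, 42, 51]
semester2 = [69, 52, 50, 38, 72, 74, 82, 62, 49, 89, 64, 33, 77, 28, 50, 55]

# Paired differences: semester 2 minus semester 1 for each student
diffs = [s2 - s1 for s1, s2 in zip(semester1, semester2)]
n = len(diffs)

d_bar = statistics.mean(diffs)
s_d = statistics.stdev(diffs)  # sample standard deviation (divisor n - 1)
t_stat = (d_bar - 0) / (s_d / math.sqrt(n))

T_CRIT = 1.753  # t_{15;0.05}, taken from a t table
print(f"D_bar = {d_bar}, s_D = {s_d:.4f}, t = {t_stat:.2f}, reject H0: {t_stat > T_CRIT}")
```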

11.2 The researcher has now obtained the test scores of a random sample of 20 students from school B for semester
2. School B maintains a teacher-student ratio of 1:50, which has not undergone any changes in recent
years. The researcher wants to test his claim against the new data he has obtained from school B. Assume
that test scores for all the students in school B are normally distributed and have a different variance to
school A. Carry out an appropriate test at the significance level of 5% to test the researcher's claim. State the
hypotheses, the test statistic, degrees of freedom, the rejection rule, the empirical value of the test statistic,
and your test decision. The test scores of the random sample of 20 students are as follows:

51, 52, 54, 61, 67, 48, 47, 71, 87, 48, 55, 52, 69, 67, 58, 71, 58, 32, 81, 91

Set up:

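• School A (Semester 2):

$$\bar{x}_A = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{1}{16}\sum_{i=1}^{16} x_i = \frac{944}{16} = 59$$

$$s_A^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x}_A)^2 = \frac{1}{15} \cdot 4686 = \frac{4686}{15} = 312.4$$

• School B

$$\bar{x}_B = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{1}{20}\sum_{i=1}^{20} x_i = \frac{1220}{20} = 61$$

$$s_B^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x}_B)^2 = \frac{1}{19} \cdot 4052 = \frac{4052}{19} = 213.2632$$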

Step 1: State the hypotheses (1p)

• $H_0: \mu_A - \mu_B \le (\mu_A - \mu_B)_0$
• $H_1: \mu_A - \mu_B > (\mu_A - \mu_B)_0$, with hypothesized difference $(\mu_A - \mu_B)_0 = 0$

Step 2: Select test statistic

• (σA and σB are unknown and unequal)


• Test statistic

$$t = \frac{(\bar{x}_A - \bar{x}_B) - (\mu_A - \mu_B)_0}{\sqrt{\frac{S_A^2}{n_A} + \frac{S_B^2}{n_B}}}$$

Step 3: Define the rejection rule

• $\alpha = 1 - 0.95 = 0.05$
• Degrees of freedom:

$$df = \frac{\left(\frac{S_A^2}{n_A} + \frac{S_B^2}{n_B}\right)^2}{\frac{(S_A^2/n_A)^2}{n_A - 1} + \frac{(S_B^2/n_B)^2}{n_B - 1}} = \frac{\left(\frac{312.4}{16} + \frac{213.2632}{20}\right)^2}{\frac{(312.4/16)^2}{15} + \frac{(213.2632/20)^2}{19}} = \frac{911.325}{25.415 + 5.984} = 29.02 \approx 29$$

• $t_{\alpha=0.05,\,df=29} = 1.699$

• We reject $H_0$ in favour of $H_1$ at the 5% level if $t > t_{29;0.05} = 1.699$

Step 4: Calculate the empirical value of the test statistic


$$t = \frac{(59 - 61) - 0}{\sqrt{\frac{312.4}{16} + \frac{213.2632}{20}}} = \frac{-2}{\sqrt{30.188}} = -0.364$$

Step 5: Decision

• We cannot reject the null hypothesis in favour of the alternative hypothesis at the 5%
significance level.
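The unequal-variance test involves several intermediate quantities, so recomputing them from the raw scores is a useful cross-check. The sketch below uses the `statistics` module for the sample means and variances and then applies the degrees-of-freedom formula from Step 3; the critical value is a table value entered as a constant.

```python
import math
import statistics

school_a = [69, 52, 50, 38, 72, 74, 82, 62, 49, 89, 64, 33, 77, 28, 50, 55]
school_b = [51, 52, 54, 61, 67, 48, 47, 71, 87, 48,
            55, 52, 69, 67, 58, 71, 58, 32, 81, 91]

n_a, n_b = len(school_a), len(school_b)
mean_a, mean_b = statistics.mean(school_a), statistics.mean(school_b)
var_a, var_b = statistics.variance(school_a), statistics.variance(school_b)

# Squared standard error contributions of each sample
va, vb = var_a / n_a, var_b / n_b

t_stat = (mean_a - mean_b - 0) / math.sqrt(va + vb)
df = (va + vb) ** 2 / (va**2 / (n_a - 1) + vb**2 / (n_b - 1))

T_CRIT = 1.699  # t_{29;0.05}, taken from a t table
print(f"t = {t_stat:.3f}, df = {df:.2f}, reject H0: {t_stat > T_CRIT}")
```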

11.3 Which test would you prefer to use and why? Explanations should be clear and contextualized.

• Prefer the paired observation comparison test conducted in problem 11.1.

• By comparing the test scores of the same student before and after being assigned to a smaller
classroom, we remove much of the extraneous variation caused by other factors that affect
test scores, such as the student's socio-economic status or ability. This is not
possible when comparing two independent random samples.
