TCD 2021


The data in the following table are summary statistics of the measurements of galactose binding
for three groups of patients: patients with Crohn’s disease, patients with ulcerative colitis, and
a control group. The aim of the study is to compare the three population means.

                      Group 1             Group 2                Group 3
                      (Crohn’s disease)   (Ulcerative colitis)   (Control)
Mean                  1910.2              2373.6                 2804.5
Standard deviation    515.7               727.1                  526.8
Group size            9                   13                     20

(a) Let 𝜎₁² denote the population variance of group 1, and 𝜎₂² denote the population variance
of group 2. Test the hypothesis that 𝐻₀ : 𝜎₁² = 𝜎₂². What are the assumptions for the test?
(12 marks)
(b) Assume that 𝜎₁ = 𝜎₂ = 𝜎. Let 𝜇₁ denote the population mean of group 1, and 𝜇₂ denote the
population mean of group 2. Test the hypothesis that 𝐻₀ : 𝜇₁ = 𝜇₂. What are the assumptions
for the test?
(14 marks)
(c) Construct the 95% confidence interval for the mean galactose binding of group 2.
(6 marks)
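
As a rough numerical sketch of the quantities behind parts (a) to (c), the snippet below computes the variance ratio, the pooled two-sample t statistic, and a t-based interval for Group 2 from the summary table. The variable names are illustrative, and `T_CRIT_12` is an assumed table value (the upper 2.5% point of the t distribution with 12 degrees of freedom); the F and t critical values for (a) and (b) still need to be read from the tables.

```python
import math

# Summary statistics copied from the table above.
n1, mean1, sd1 = 9, 1910.2, 515.7    # Group 1 (Crohn's disease)
n2, mean2, sd2 = 13, 2373.6, 727.1   # Group 2 (Ulcerative colitis)

# (a) Variance ratio with the larger sample variance on top;
#     its degrees of freedom are (n2 - 1, n1 - 1) = (12, 8).
f_ratio = sd2**2 / sd1**2

# (b) Pooled two-sample t statistic with n1 + n2 - 2 = 20 degrees of freedom.
pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
t_stat = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

# (c) 95% interval for the Group 2 mean, using Group 2's own standard deviation.
T_CRIT_12 = 2.179  # assumed table value: t(0.975) with 12 degrees of freedom
half_width = T_CRIT_12 * sd2 / math.sqrt(n2)
ci_group2 = (mean2 - half_width, mean2 + half_width)

print(f"F = {f_ratio:.3f}, t = {t_stat:.3f}, "
      f"CI = ({ci_group2[0]:.1f}, {ci_group2[1]:.1f})")
```
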

The analysis of variance table for the galactose binding measurements is given below.

Source of        Degrees of   Sum of        Mean        Variance    Probability
variation        freedom      squares       square      ratio (F)   (𝑝)
Between groups   2            5174310.0     2587155.0   7.34        0.002
Within groups    39           13743776.2    352404.5
Total            41           18918086.2

(d) Let {𝑥₁, 𝑥₂, …, 𝑥₉} denote the measurements of Group 1, {𝑦₁, 𝑦₂, …, 𝑦₁₃} the measurements
of Group 2, and {𝑧₁, 𝑧₂, …, 𝑧₂₀} the measurements of Group 3. Write down the equations
for calculating the between-group sum of squares and within-group sum of squares.
(3 marks)
(e) From the critical value tables on the page “ST-3 Selected critical values for the F
distribution”, what is the critical value for the significance level 0.05?
(4 marks)
(f) What can we conclude from the p-value?
(2 marks)
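
The ANOVA table can be cross-checked against the summary statistics. The sketch below uses one standard form of the sums of squares, written in terms of group sizes, means and standard deviations rather than the raw measurements: between-group SS = Σ nₖ (meanₖ − grand mean)² and within-group SS = Σ (nₖ − 1) sₖ². Variable names are illustrative, and small differences from the table are expected because the summary statistics are rounded.

```python
# (group size, mean, standard deviation) from the summary table.
groups = [(9, 1910.2, 515.7), (13, 2373.6, 727.1), (20, 2804.5, 526.8)]

n_total = sum(n for n, _, _ in groups)
grand_mean = sum(n * m for n, m, _ in groups) / n_total

# Between-group SS: squared deviations of group means from the grand mean,
# weighted by group size.
ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)

# Within-group SS: (n_k - 1) * s_k^2 summed over the groups.
ss_within = sum((n - 1) * s ** 2 for n, _, s in groups)

df_between = len(groups) - 1        # 2
df_within = n_total - len(groups)   # 39

f_from_summary = (ss_between / df_between) / (ss_within / df_within)

# The same ratio recomputed from the sums of squares printed in the table.
f_from_table = (5174310.0 / 2) / (13743776.2 / 39)

print(f"SSB = {ss_between:.1f}, SSW = {ss_within:.1f}")
print(f"F (summary) = {f_from_summary:.2f}, F (table) = {f_from_table:.2f}")
```
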
In a flu vaccination program, 92 participants were vaccinated, and 92 participants were placed in the
control group (non-vaccinated). The program kept track of the number of participants who had
flu-like symptoms and the causative agents. At the end of the winter, the following table was
constructed to illustrate the occurrence of flu among the participants.

                    Unvaccinated   Vaccinated   Total
Sick with flu       23             5            28
Sick with non-flu   8              10           18
Not sick            61             77           138
Total               92             92           184

Perform a test to answer the question: “Was there a difference in the occurrence of the three
outcomes between those vaccinated and unvaccinated in the population of interest?”
(a) Write down the null and alternative hypotheses and the underlying assumptions of the test.
(3 marks)
(b) Detail how you calculate each statistic, including the equations and the computed values.
(12 marks)
(c) Report the critical value and your conclusion.
(3 marks)
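
One way to organise the calculation is a Pearson chi-square test on the 3 × 2 table, sketched below under the assumption that this is the intended test. Expected counts are (row total × column total) / grand total, and the statistic sums (observed − expected)² / expected over all six cells. `CHI2_CRIT_DF2` is an assumed table value for the 0.05 level with 2 degrees of freedom; all names are illustrative.

```python
# Observed counts: rows are outcomes, columns are (unvaccinated, vaccinated).
observed = [
    [23, 5],    # sick with flu
    [8, 10],    # sick with non-flu
    [61, 77],   # not sick
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count in each cell under the null hypothesis.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Pearson chi-square statistic.
chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(col_totals) - 1)  # (3 - 1) * (2 - 1) = 2

CHI2_CRIT_DF2 = 5.991  # assumed table value: chi-square(0.95) with 2 df
reject_null = chi_sq > CHI2_CRIT_DF2

print(f"chi-square = {chi_sq:.3f} on {df} df, reject H0: {reject_null}")
```
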

A normal QQ plot for a sample of data, denoted by {𝑥₁, 𝑥₂, …, 𝑥ₙ}, is given below.

(a) Explain what sample quantiles and theoretical quantiles are.
(4 marks)
(b) The above QQ plot indicates that the data are not normally distributed: both the bottom end
and the upper end of the QQ plot deviate from the straight line. Explain what features the
data distribution has, compared to the normal distribution.
(4 marks)
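
To make the two kinds of quantile concrete, here is a minimal sketch of how the points of a normal QQ plot can be generated. The function name, the example data, and the plotting position (i − 0.5) / n are assumptions; statistical packages use several slightly different conventions.

```python
from statistics import NormalDist


def normal_qq_points(data):
    """Return (theoretical quantile, sample quantile) pairs for a normal QQ plot."""
    ordered = sorted(data)       # sample quantiles: the ordered observations
    n = len(ordered)
    std_normal = NormalDist()    # standard normal: mean 0, standard deviation 1
    # Theoretical quantiles: standard normal quantiles at the plotting
    # positions (i - 0.5) / n, an assumed convention.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


# Made-up values, used only to show the shape of the output.
example_points = normal_qq_points([2.1, -0.3, 0.8, 1.5, -1.2])
for theo, samp in example_points:
    print(f"{theo:7.3f}  {samp:5.1f}")
```

If the data were normally distributed, these points would fall close to a straight line.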
A study investigated how well a diagnostic test worked for detecting kidney disease in patients
with high blood pressure. The diagnostic test was applied to 137 patients -- 67 with known
kidney disease and 70 healthy patients. A positive test result implies the patient has kidney
disease, and a negative test result implies the patient does not have kidney disease. Here are
the results of the experiment:

          Positive   Negative   Total
Disease   44         23         67
Healthy   10         60         70
Total     54         83         137

According to the test results, answer the following questions.


(a) If a patient receives a positive test result, what is the estimated probability that the patient
actually has the disease?
(4 marks)
(b) What is the estimated probability that a healthy patient, tested twice independently, gets
two positive results?
(4 marks)
Medical records for a particular region reveal that of the 937 residents who died in a year, 212
died of causes related to diabetes, and 312 had at least one parent with diabetes. Of the 312
residents with at least one parent with diabetes, 102 died of causes related to diabetes.
(c) For the 937 residents who have died, what is the estimated probability that a resident died
of causes related to diabetes, given that neither of his/her parents has diabetes?
(4 marks)
(d) Explain the two types of error that may be made in hypothesis testing. How does the sample
size affect the probability of making each one, if a predetermined significance level of 0.05
is chosen?
(4 marks)
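
The three probability estimates in parts (a) to (c) reduce to ratios of counts. The sketch below shows one reading of each: the proportion of positive results that come from diseased patients, the squared false-positive proportion (using the stated independence of the two tests), and a conditional proportion obtained by subtracting the "parent with diabetes" counts from the totals. Variable names are illustrative.

```python
# (a) P(disease | positive): diseased patients among all positive results.
p_disease_given_positive = 44 / 54

# (b) P(two positives | healthy): false-positive proportion, squared
#     because the two tests are assumed independent.
false_positive_rate = 10 / 70
p_two_false_positives = false_positive_rate ** 2

# (c) P(died of diabetes | neither parent had diabetes).
deaths_total = 937
deaths_diabetes = 212
parent_diabetes = 312
parent_and_diabetes = 102

no_parent_diabetes = deaths_total - parent_diabetes            # 625
diabetes_no_parent = deaths_diabetes - parent_and_diabetes     # 110
p_diabetes_given_no_parent = diabetes_no_parent / no_parent_diabetes

print(f"(a) {p_disease_given_positive:.4f}")
print(f"(b) {p_two_false_positives:.4f}")
print(f"(c) {p_diabetes_given_no_parent:.4f}")
```
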
Researchers want to know the role of gender in mortality due to COVID-19. In one study,
clinical characteristics of 191 adult patients with COVID-19 were reported, where 54 patients
died during hospitalisation, and 137 were discharged. Details are given below:

         Died   Discharged   Total
Female   16     56           72
Male     38     81           119
Total    54     137          191

Test the hypothesis that “the mortality proportion is the same in male and female patients in
the population of interest”. Construct a confidence interval for the difference in the mortality
proportion between males and females in the population.
(a) Write down the null and alternative hypotheses and the underlying assumptions of the test.
(3 marks)
(b) Detail how you calculate each statistic, including the equations and the computed values.
(11 marks)
(c) Report the critical value and your conclusion.
(3 marks)
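
A possible layout for the computation, assuming a large-sample two-proportion z test at the 0.05 level and a 95% interval: the test statistic uses the pooled proportion in its standard error (appropriate under the null hypothesis), while the confidence interval uses the unpooled standard error. The names below are illustrative, and the critical value comes from the standard normal distribution via Python's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

deaths_male, n_male = 38, 119
deaths_female, n_female = 16, 72

p_male = deaths_male / n_male
p_female = deaths_female / n_female
diff = p_male - p_female

# Test statistic: pooled proportion under H0 (equal mortality proportions).
p_pooled = (deaths_male + deaths_female) / (n_male + n_female)
se_pooled = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_male + 1 / n_female))
z_stat = diff / se_pooled

# Two-sided critical value at the 0.05 level (about 1.96).
z_crit = NormalDist().inv_cdf(0.975)
reject_null = abs(z_stat) > z_crit

# 95% confidence interval for the difference, with the unpooled standard error.
se_unpooled = math.sqrt(
    p_male * (1 - p_male) / n_male + p_female * (1 - p_female) / n_female
)
ci_diff = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)

print(f"z = {z_stat:.3f}, critical value = {z_crit:.3f}, reject H0: {reject_null}")
print(f"95% CI for p_male - p_female: ({ci_diff[0]:.4f}, {ci_diff[1]:.4f})")
```
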
