Biostatistics and Research Methods Exercise 1

Some of the key concepts discussed include different types of study designs like cross-sectional, cohort and case-control studies. Sensitivity and specificity are also introduced in relation to evaluating diagnostic tests. The effects of outliers on measures like mean, median, standard deviation and IQR are demonstrated.

Cross-sectional, retrospective cohort, prospective cohort and case-control studies are some of the study designs discussed. Observational studies are described as events being observed as they occur with no active intervention by the researcher.

Sensitivity measures the proportion of actual positives correctly identified, while specificity measures the proportion of negatives correctly identified. They are used to evaluate the performance of diagnostic tests by comparing the test results to a gold standard.

1. Residents of three villages with three different types of water supply were asked to participate in a
survey to identify cholera carriers, and the proportion of carriers in each village was compared. The type of
study design used is: Cross-sectional study

2. Give true or false about observational studies

a. Events are observed as they occur, with no active intervention by the researcher: True
b. Comparison groups may differ with respect to factors related to the outcome: True
c. Subjects may be followed forward from exposure to outcome or backward from outcome to
exposure: True

3. Give true or false about retrospective cohort study

a. Allows investigation of many outcomes: True
b. Incidence rates can be calculated: True
c. Selection bias is eliminated: False (the way subjects are selected and followed up can still introduce selection bias)
d. Recall bias is minimal compared to a case-control study: True
e. Usually requires less time than a prospective cohort: True

4. Identify the type of study design for each of the following

a. Study of past mortality trends to predict the future mortality.
   Retrospective cohort
b. Study of the incidence of cancer in men who have quit smoking.
   Prospective cohort
c. Study of people who are not treated for a certain disease.
   Cross-sectional
d. Study of people who have homogeneous characteristics for a certain disease.
   Cohort
5. You are evaluating the impact of a certain drug on patients aged 75 and above admitted to Tikur
Anbessa Hospital (TAH).
a. What study design is most appropriate?
   Interventional, because the drug is an intervention and the aim is to evaluate the impact
   that results from it.
b. What is the target population?
   All patients aged 75 and above in TAH
c. What is the possible source population?
   All patients aged 75 and above admitted to TAH.
d. What possible sampling frame do you use?
   The list of all patients, or the record cards, of those aged 75 and above admitted to
   Tikur Anbessa Hospital.
e. How do you select your sample?
   By simple random sampling (SRS), because individuals are selected at random, which gives a
   more representative sample, and because it is the most common method that avoids the
   problems of a systematic approach.
f. If no previous study is available, what sample size is needed to estimate the impact in the target
population at a 95% confidence level and a margin of error of 0.05? This can be calculated using the following formula:

   n = (z^2 * P * Q) / D^2

Where: z = 1.96 at 95% confidence, P = 50% (0.5), Q = 1 - P = 0.5, D = 0.05
So n = (1.96^2 * 0.5 * 0.5) / (0.05)^2 = 0.9604 / 0.0025 = 384.16, i.e. about 384 (385 if rounded up to the next whole person)
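The sample size arithmetic in 5(f) can be checked with a few lines of Python. This is a minimal sketch, assuming the usual single-proportion formula n = z^2 * P * Q / D^2; the function name `sample_size_for_proportion` is made up for illustration and is not from the exercise.

```python
import math


def sample_size_for_proportion(p=0.5, margin=0.05, z=1.96):
    """Return the unrounded sample size n = z^2 * p * (1 - p) / margin^2."""
    q = 1.0 - p
    return (z ** 2) * p * q / (margin ** 2)


# Values from the exercise: z = 1.96, P = Q = 0.5, D = 0.05
n_raw = sample_size_for_proportion()
n_rounded_up = math.ceil(n_raw)

print(f"unrounded n = {n_raw:.2f}")
print(f"rounded up  = {n_rounded_up}")
```

The unrounded value is about 384.16. Whether it is reported as 384 or rounded up to 385 is a matter of convention.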
6. A study in a hospital showed that 30% of those who came for emergency reasons were aged 15 years or
less.
a. What is the variable of interest of the study: Age
b. What is the type of the variable: Categorical
c. What is the scale of measurement for the variable: Ordinal
d. Which descriptive measure is most appropriate for the data: Proportion (percentage)

7. The table below summarizes the pre-test and post-test scores (out of 50) of students attending a
biostatistics course, using the mean, median, standard deviation and inter-quartile range.

Exam type    Mean   Median   SD   IQR
Pre-test     10     22       12   35
Post-test    30     35       10   20

a. What type of variable is pre-test: Continuous / quantitative
b. What is the scale of measurement for the variable: Ratio
c. The mean of the difference between pre and post tests is 20: True (the mean of the differences equals the difference of the means, 30 - 10)
d. The median of the difference between pre and post tests is 13: False (13 is the difference of the medians; the median of the differences cannot be obtained from the table)
e. The SD of the difference between pre and post tests is -2: False (an SD cannot be negative)
f. The IQR of the difference between pre and post is 15: False (the IQR of the differences cannot be obtained by subtracting the two IQRs)
g. Which graphical display is most appropriate to compare pre and post tests:
   Frequency polygon / Box plot
8. What is the most appropriate graphical method to display the following data?
a. The distribution of diarrhea in an outbreak investigation:
   Frequency polygon or line graph
b. The weight of newborns in a health center:
   Histogram
c. The marital status of pregnant women attending ANC:
   Bar chart or pie chart
d. Treatment failure among TB patients:
   Pie chart or bar chart

9. The following table shows the frequency of diastolic blood pressure (DBP) of men aged 30-69, with a mean
DBP of 84 mmHg

DBP (mmHg)    Freq.   Relative freq. (%)   Cumulative relative freq. (%)
Below 65        60        4                     4
65-74          270       18                    22
75-84          540       36                    58
85-94          420       28                    86
95-104         150       10                    96
105-115         45        3                    99
Above 115       15        1                   100
Total         1500      100

a. Fill the relative and cumulative relative frequency columns in the table
b. People with a DBP of 95 mmHg and above are considered hypertensive. At what percentile does
hypertension start: the 86th percentile and above (86% of the men are below 95 mmHg, so 14% are hypertensive)
c. The frequency distribution for 800 women is almost similar, with a mean of 79 mmHg and the same SD as that for
men. Say True or False for the following and justify your answers
(i) The median DBP will be the same for both sexes: False
   The same SD does not imply the same median. The women's mean is 5 mmHg lower
   and the distribution has a similar shape, so their median is expected to be lower too.
(ii) The proportion of hypertensives is the same for both sexes: False
   With the same spread but a lower mean, a smaller proportion of women reach 95 mmHg.
(iii) The variability of DBP is lower for women: False
   To compare variability we should use the CV (coefficient of variation),
   which is the ratio of the SD to the mean. Men have a mean of 84 mmHg and women
   79 mmHg. Dividing the same SD by the smaller mean gives a larger CV, so the
   relative variability is lower for men, not for women.
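The table columns in 9(a), the hypertensive percentage in 9(b) and the CV comparison in 9(c)(iii) can be reproduced as follows. This is only a sketch: the exercise does not state the SD, so `assumed_sd = 10.0` is a placeholder chosen purely for illustration. Any common SD gives the same ordering of the two CVs.

```python
# Class frequencies for men, taken from the table in question 9
classes = ["Below 65", "65-74", "75-84", "85-94", "95-104", "105-115", "Above 115"]
freqs = [60, 270, 540, 420, 150, 45, 15]

total = sum(freqs)
relative = [100 * f / total for f in freqs]  # relative frequency in percent

# Running total of the relative frequencies
cumulative = []
running = 0.0
for r in relative:
    running += r
    cumulative.append(running)

# Hypertensive = 95 mmHg and above, i.e. the last three classes
hypertensive_pct = sum(relative[classes.index("95-104"):])

# Coefficient of variation; the SD value is an assumption (not given in the exercise)
assumed_sd = 10.0
cv_men = 100 * assumed_sd / 84
cv_women = 100 * assumed_sd / 79

for label, f, r, c in zip(classes, freqs, relative, cumulative):
    print(f"{label:<10} {f:>5} {r:>6.1f} {c:>6.1f}")
print(f"hypertensive: {hypertensive_pct:.0f}%")
print(f"CV men = {cv_men:.1f}%, CV women = {cv_women:.1f}%")
```

Because the SD is the same and the women's mean is smaller, the women's CV comes out larger whatever value is used for `assumed_sd`.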
10. The following table shows crude death rates (CDR) in four different villages

Village   Population   CDR (per 1,000)
A           50,000       10
B          100,000        8
C          250,000        8
D          250,000        4
a. Which of the villages has the largest number of deaths?
b. Can we say village D has a better survival compared to other villages? Justify your answer.
Answer
A. This question can be answered by using the following formula:

   CDR = (Total number of deaths per year / Mid-year population) x 1000

Rearranging, the total number of deaths per year in each village is:

   Total deaths per year = CDR x mid-year population / 1000

So, for A -> total no. of deaths per year = 10 x 50,000 / 1000 = 500
B -> total no. of deaths per year = 8 x 100,000 / 1000 = 800
C -> total no. of deaths per year = 8 x 250,000 / 1000 = 2000
D -> total no. of deaths per year = 4 x 250,000 / 1000 = 1000
As a result, village C has the largest number of deaths.
B. No, we cannot say that. Even though D's CDR of 4 is lower than in the other villages, the CDR is a
crude rate: it ignores differences in population structure (for example age distribution) between the
villages, which also differ greatly in size.
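The rearranged CDR formula used in part A can be applied to all four villages at once. A short sketch, assuming the CDR is expressed per 1,000 population as in the worked answer:

```python
# (population, CDR per 1,000) for each village, from the table in question 10
villages = {
    "A": (50_000, 10),
    "B": (100_000, 8),
    "C": (250_000, 8),
    "D": (250_000, 4),
}

# deaths per year = CDR * mid-year population / 1000
deaths = {name: cdr * pop / 1000 for name, (pop, cdr) in villages.items()}

# Village with the largest absolute number of deaths
largest = max(deaths, key=deaths.get)

for name, d in deaths.items():
    print(f"Village {name}: {d:.0f} deaths per year")
print(f"Largest number of deaths: village {largest}")
```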
11. An instructor gave a quiz with 3 questions each worth one point. 30% of the class scored 3 points, 50%
scored 2 points, 10% scored one point and the rest scored zero.
a. Calculate the mean, median and mode of the scores if there were 30 students in the class:
b. Is it possible to find out the three measures without being told the number of students? Justify
your answer.
Answer:
A. There are 30 students in total.

Students (%)   No. of students   Score
30              9                3
50             15                2
10              3                1
10              3                0

Therefore, Mean = (9*3 + 15*2 + 3*1 + 3*0)/30 = 60/30 = 2
Median = 2 (the 15th and 16th ordered scores are both 2)
Mode = 2 (the most frequent score)
B. Yes, it is possible.
The distribution of the scores is already given in percent, and the mean, median and mode
depend only on these proportions, not on the class size. For example,
Mean = 0.30*3 + 0.50*2 + 0.10*1 + 0.10*0 = 2.
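The point of part B, that the three measures follow from the percentages alone, can be illustrated without using the class size at all. This is a minimal sketch; the median rule used here (the smallest score whose cumulative percentage reaches 50) is a simplification that works for this data, where the cumulative percentage never lands exactly on 50.

```python
# Percentage of the class obtaining each score (question 11)
percent_by_score = {3: 30, 2: 50, 1: 10, 0: 10}

# Mean: weighted average of the scores, with percentages as weights
mean = sum(score * pct for score, pct in percent_by_score.items()) / 100

# Mode: the score with the largest percentage
mode = max(percent_by_score, key=percent_by_score.get)

# Median: smallest score at which the cumulative percentage reaches 50
median = None
cumulative = 0
for score in sorted(percent_by_score):
    cumulative += percent_by_score[score]
    if cumulative >= 50:
        median = score
        break

print(mean, median, mode)
```

All three come out as 2, matching the values obtained in part A from the counts for 30 students.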
12. The following are life expectancies of males in six European countries: 74, 77, 73, 75, 77 and 78. If the
observed value of 74 is mistakenly recorded as 740, what is the effect on each of the following: mean,
median, SD and IQR?
Answer:
It is easiest to answer by comparing each measure before and after the error.
Before 74 is mistakenly recorded as 740, the ordered values are: 73, 74, 75, 77, 77, 78
Median = (75+77)/2 = 76
Mean = (73+74+75+77+77+78)/6 = 75.7
Variance = sum (xi - mean)^2 / (n-1) = 3.87, so SD = 1.97
IQR = Q3 - Q1 = 77.25 - 73.75 = 3.5
After 74 is recorded as 740, the ordered values are: 73, 75, 77, 77, 78, 740
Mean = 186.7
Median = 77
Variance = sum (xi - mean)^2 / (n-1) = 73485.87, so SD = 271.08
IQR = Q3 - Q1 = 243.5 - 74.5 = 169
So the mean, median, SD and IQR all increase compared with the values above.
   The mean and SD are highly affected by the outlier and change drastically, while the median
   changes only slightly (76 to 77). The IQR is normally resistant to outliers, but it also
   increases here because, with only six observations, the value 740 enters the calculation of
   the upper quartile.
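The before/after comparison can be reproduced with Python's standard `statistics` module. This is a sketch under one assumption: the quartiles are computed with the "exclusive" method of `statistics.quantiles`, which places them at positions (n+1)/4 and 3(n+1)/4 and happens to reproduce the IQR values quoted in the answer. Other quartile conventions give different IQRs.

```python
import statistics


def summarize(values):
    """Return mean, median, sample SD and IQR for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),  # sample SD, divisor n - 1
        "iqr": q3 - q1,
    }


correct = [74, 77, 73, 75, 77, 78]
with_error = [740, 77, 73, 75, 77, 78]  # 74 mistakenly recorded as 740

before = summarize(correct)
after = summarize(with_error)

for key in ("mean", "median", "sd", "iqr"):
    print(f"{key:<6} before = {before[key]:8.2f}   after = {after[key]:8.2f}")
```

The median barely moves, while the mean and SD change by orders of magnitude.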

13. A physical examination and audiometric tests were given to 500 individuals with suspected ear
problems. The results are shown in the following tables.

Physical examination
Hearing problem   Present   Absent
Yes               240       40
No                60        160

Audiometric test
Hearing problem   Present   Absent
Yes               270       60
No                30        140

Compared to the physical examination, calculate the sensitivity and specificity of the audiometric test.

Before calculating, it helps to define the two terms. Sensitivity and specificity are statistical
measures of the performance of a binary classification test. Sensitivity (also called the true
positive rate) measures the proportion of actual positives that are correctly identified and is
complementary to the false negative rate. Specificity (also called the true negative rate) measures
the proportion of actual negatives that are correctly identified and is complementary to the false
positive rate.
Now we say that:
- True positive (A): Sick people correctly diagnosed as sick
- False positive (B): Healthy people incorrectly identified as sick
- True negative (C): Healthy people correctly identified as healthy
- False negative (D): Sick people incorrectly identified as healthy

For the physical examination: A = 240, B = 40, C = 160, D = 60
   Sensitivity = A / (A+D) = 240 / (240+60) = 80%
   Specificity = C / (B+C) = 160 / (40+160) = 80%

For the audiometric test: A = 270, B = 60, C = 140, D = 30
   Sensitivity = A / (A+D) = 270 / 300 = 90%
   Specificity = C / (B+C) = 140 / 200 = 70%

So, the audiometric test is more sensitive than the physical examination, and the physical examination is
more specific than the audiometric test.
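The two calculations can be written as one small function. This sketch takes the counts exactly as assigned in the worked answer (A = true positives, B = false positives, C = true negatives, D = false negatives); the function name `sensitivity_specificity` is illustrative.

```python
def sensitivity_specificity(tp, fp, tn, fn):
    """Return (sensitivity, specificity) from the four cell counts."""
    sensitivity = tp / (tp + fn)  # A / (A + D)
    specificity = tn / (tn + fp)  # C / (B + C)
    return sensitivity, specificity


# Counts as used in the worked answer
physical = sensitivity_specificity(tp=240, fp=40, tn=160, fn=60)
audiometric = sensitivity_specificity(tp=270, fp=60, tn=140, fn=30)

print(f"Physical exam:    sensitivity = {physical[0]:.0%}, specificity = {physical[1]:.0%}")
print(f"Audiometric test: sensitivity = {audiometric[0]:.0%}, specificity = {audiometric[1]:.0%}")
```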
