Daguay, Dionne R.
Sibal, Kristal Joyce M.
General Instructions: Submit the exam NLT 5:00PM on Wednesday (August 2, 2017) thru hard copy or
email ([email protected]). Kindly follow the format: Calibri 11, 8.5x13, .doc/x file.
I. Data Presentation
Choose the appropriate graph for each of the data set given below. Give a brief interpretation
of your graphs. (5 points each)
Table 2.1 Ten Leading Causes of Morbidity in the Philippines, 1991
Causes Rate Per 100,000 Population
Diarrheal Diseases 1,702.5
Bronchitis 1,518.5
Influenza 788.3
Pneumonia 469.2
Tuberculosis, all forms 210.0
Accidents 107.3
Diseases of the Heart 98.0
Malaria 73.6
Varicella 73.0
Measles 59.9
Source: Philippine Health Statistics, 1991
[Graph: Ten Leading Causes of Morbidity in the Philippines, Rate Per 100,000]
Interpretation: Four of the five leading causes of morbidity in the Philippines are respiratory
tract infections, likely because of the population congestion and tropical climate in our country.
Diarrheal disease is the leading cause of morbidity, also because of our climate, which makes our food easier to
Table 2.2 Occupation of 256 Patients with Allergic Contact Dermatitis Seen at the PGH
Dermatology Clinic, April 1982 to November 1986
Occupation No. of Patients %
Housekeeper 83 32.42
Student 40 15.63
Office Worker 35 13.67
Paramedic 18 7.03
Teacher 15 5.86
Dressmaker 15 5.86
Others 50 19.53
Total 256
Source: Gutierrez, G., et al., A Study of Allergic Contact Dermatitis at the PGH Dermatology
Clinic, Acta Medica Philippina, vol. 24, Series 2, no. 2, April-June 1988, pp. 61-65.
[Graph: percentage of patients by occupation (Housekeeper, Student, Office Worker, Paramedic, Teacher, Dressmaker, Others)]
Interpretation: Housekeepers accounted for the largest share of contact dermatitis cases in the years
observed (32.42%), plausibly because of job-related factors such as replacing bedding, dusting, dirt removal,
and frequent use of detergents for cleaning. They are followed by the non-specific "Others" group (19.53%),
students (15.63%), and office workers (13.67%). Fewer cases were observed among paramedics (7.03%),
teachers (5.86%), and dressmakers (5.86%).
Table 2.3 Distribution of Health Workers According to Type and Whether or Not They
Have Received Training on the Proper Way of Filling-up Forms
Type of Health Worker Without Training With Training
Midwife 37 10
Sanitarian 19 8
Nurse 8 8
Doctor 4 6
Total 68 32
Interpretation: In this chart, midwives and sanitarians without training on the proper way of filling up
forms outnumber those with training (37 vs. 10 and 19 vs. 8), possibly because these workers do not
necessarily need to fill up hospital or medical forms. Nurses are evenly split (8 and 8), while doctors
with training slightly outnumber those without (6 vs. 4), as these professions are the ones that use and
write these forms.
However, the raw counts are of limited use for comparing one type of health worker with another, because
the groups differ in size, with fewer nurses and doctors. They are better evaluated within each profession,
comparing trained and untrained workers of the same type.
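The within-profession comparison can be made concrete by converting each row of Table 2.3 into the share of that profession with training. The sketch below is only an illustration: the counts are copied from the table, while the names `table_2_3`, `percent_trained`, and `trained_share` are made up for this example.

```python
# Counts copied from Table 2.3: (without training, with training).
table_2_3 = {
    "Midwife": (37, 10),
    "Sanitarian": (19, 8),
    "Nurse": (8, 8),
    "Doctor": (4, 6),
}


def percent_trained(without_training, with_training):
    """Share of one profession that received training, in percent."""
    return 100 * with_training / (without_training + with_training)


# Hypothetical helper dictionary: profession -> percent trained.
trained_share = {job: percent_trained(*counts) for job, counts in table_2_3.items()}

for job, share in trained_share.items():
    print(f"{job:<11}{share:5.1f}% trained")
```

Expressed this way, the professions can be compared even though the groups are of different sizes.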
Table 2.4 Distribution of CPH Students by Degree Program
Degree Program Total Number of Students Numbers of MDs
MPH (Masters of public 80 55
MHA (Master of Health 40 34
MOH (Master of Occupational Health) 6 3
MSPH (Master of Science in Public Health) 20 4
Interpretation: The three degree programs MPH, MHA, and MSPH have a wider scope of study and so have more
enrolled students (80, 40, and 20). MOH, on the other hand, focuses on health and safety in the workplace
and has the smallest number of students (6). MDs make up most of the MPH students (55 of 80, or 68.75%)
and MHA students (34 of 40, or 85%) and half of the MOH students (3 of 6), while only a fifth of the MSPH
students (4 of 20) are MDs.
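The shares of MDs quoted for each program follow directly from Table 2.4. A minimal sketch of the arithmetic, with counts copied from the table and the names `table_2_4` and `md_share` invented for illustration:

```python
# (total number of students, number of MDs) copied from Table 2.4.
table_2_4 = {
    "MPH": (80, 55),
    "MHA": (40, 34),
    "MOH": (6, 3),
    "MSPH": (20, 4),
}

# Percent of each program's students who are MDs.
md_share = {program: 100 * mds / total for program, (total, mds) in table_2_4.items()}

for program, share in md_share.items():
    print(f"{program:<5}{share:6.2f}% MDs")
```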
Table 2.5 Post-Treatment Prevalence of Soil-Transmitted Helminthiasis of Both Sexes by
Age, San Narciso, Victoria, Mindoro Oriental (1982)
Age Group   No. of        Ascaris      Trichuris    Hookworm
            Examination   +    %       +    %       +    %
0-6         53            7    13.2    15   28.3    0    0.0
7-14        92            8    8.7     27   29.3    5    5.4
15+         163           11   6.7     57   35.0    8    4.9
Source: Cabrera, B.D. and Cruz, A.C. A Comparative Study on the Effect of Mass Treatment
of the Entire Community and Selective Treatment of Children on the Total prevalence of
Soil-Transmitted Helminthiasis in Two Communities, Mindoro, Philippines, Collected
Papers on the Control of Soil-Transmitted Helminthiasis. Vol.2
[Graph: number positive (+) and percent (%) for Ascaris, Trichuris, and Hookworm by age group (0-6, 7-14, 15+)]
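Interpretation: Based on our computed data, summing the Ascaris, Trichuris, and Hookworm positives in each
age group gives 41.5% (0-6), 43.5% (7-14), and 46.6% (15+) of those examined still infected after treatment.
The figures are similar across age groups, so age does not appear to be a factor in the efficacy of
treatment. Nearly one out of two people still has a worm infection, so the treatment was not sufficient to
eradicate the worms. Because a person carrying two kinds of worm is counted twice in these sums, they are
upper limits on the share of people infected.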
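The 41-47 percent range can be reproduced by adding the positives for the three helminths in each age group and dividing by the number examined. The sketch below assumes that this is how the figures were computed; the counts come from Table 2.5, and `table_2_5`, `summed_prevalence`, and `summed` are names invented for the example.

```python
# (number examined, Ascaris +, Trichuris +, Hookworm +) copied from Table 2.5.
table_2_5 = {
    "0-6": (53, 7, 15, 0),
    "7-14": (92, 8, 27, 5),
    "15+": (163, 11, 57, 8),
}


def summed_prevalence(examined, *positives):
    """Sum of positives across helminths as a percent of those examined.

    Assumption: nobody carries more than one kind of worm. With
    co-infections this sum overstates the share of people infected.
    """
    return 100 * sum(positives) / examined


summed = {age: summed_prevalence(*row) for age, row in table_2_5.items()}

for age, pct in summed.items():
    print(f"{age:<5}{pct:5.1f}%")
```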
II. Linear Regression (include the syntax and STATA output if necessary)
1. A pre-test is given to all students enrolled in Biostatistics 201 at the beginning of the
course in order to test the students' background in Basic Math (Algebra). The pre-test
scores and the final grades of 20 students who were enrolled in Biostatistics 201 in 1987
were recorded as follows:
a. Interpret the scatterplot for these data. (3 points)
b. Find the equation of the regression line to predict final grades from the
pre-test. (3 points)
c. How do you interpret the computed values of the intercept and the
regression coefficient? (3 points)
d. Using the derived regression equation, what is the expected final grade
of a student with a pre-test score of 25.0? (3 points)
e. Is the pre-score a significant predictor of a student's final grade in
Biostatistics? Support your answer. (3 points)
a. Interpretation: In the scatter plot above, many points lie far from the fitted line, so
the relationship between the pre-test and the final grade looks weak. The plot by itself
cannot establish whether the relationship is statistically significant.
Multiple R          0.572093
R Square            0.32729
Adjusted R Square   0.289917
Standard Error      1.035381
Observations        20
            df   SS         MS         F          Significance F
Regression  1    9.38811    9.38811    8.757445   0.008396
Residual    18   19.29627   1.072015
Total       19   28.68438
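The summary statistics can be cross-checked against the sums of squares in the ANOVA table. This is a minimal sketch using the standard simple-regression identities; only the sums of squares and the number of observations are taken from the output, and all variable names are illustrative.

```python
import math

# Values copied from the regression output.
n = 20
ss_regression = 9.38811
ss_residual = 19.29627
ss_total = 28.68438
df_regression = 1
df_residual = n - 2

# R Square is the share of total variation explained by the regression.
r_squared = ss_regression / ss_total
multiple_r = math.sqrt(r_squared)
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / df_residual

# The standard error is the square root of the residual mean square.
ms_residual = ss_residual / df_residual
standard_error = math.sqrt(ms_residual)

# F is the ratio of the regression mean square to the residual mean square.
f_statistic = (ss_regression / df_regression) / ms_residual

print(round(r_squared, 5), round(multiple_r, 6), round(adjusted_r_squared, 6))
print(round(standard_error, 6), round(f_statistic, 4))
```

The computed values agree with the reported Multiple R, R Square, Adjusted R Square, standard error, and F to the precision shown in the output.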
b. Y = mx + b, where Y is the final grade and x is the pre-test score
Y = -0.0574x + 3.958278
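c. The regression coefficient of -0.0574 means that each additional point on the pre-test is associated
with a predicted final grade that is lower by 0.0574. The intercept of 3.958278 is the predicted final
grade of a student with a pre-test score of 0. Based on the computations, the regression is significant,
so the pre-test is contributory to the final grade. However, R Square is 0.32729, so about two-thirds of
the variation in final grades is left to other factors that we should still consider.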
d. Y = -0.0574x + 3.958278
Y = -0.0574(25.0) + 3.958278
Y = 2.523278, or about 2.50
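The substitution can be written as a small function so that the prediction is easy to repeat for other pre-test scores. The coefficients are the ones reported in the regression equation; the function name `predict_final_grade` is hypothetical.

```python
# Coefficients taken from the fitted equation Y = -0.0574x + 3.958278.
SLOPE = -0.0574
INTERCEPT = 3.958278


def predict_final_grade(pretest_score):
    """Expected final grade for a given pre-test score."""
    return SLOPE * pretest_score + INTERCEPT


predicted = predict_final_grade(25.0)
print(round(predicted, 6))
```

Predictions are most trustworthy for pre-test scores within the range observed in the class.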
            df   SS         MS         F          Significance F
Regression  1    9.38811    9.38811    8.757445   0.008396
Residual    18   19.29627   1.072015
Total       19   28.68438
e. The pre-score is a significant predictor of the final grade based on the criterion that the
Significance F of 0.008396 (F = 8.757445) is less than 0.05.
III. Correlation (35 points)
Using exam.dta, investigate which subtests are associated with each other. By a line, show
the trend of the relationship existing between two subtests that exhibit the strongest
Hint: To determine the correlation between the different subtests use the syntax pwcorr.
. pwcorr awards read write math science socst, sig star (.05)
All p-values are greater than 0.05, which means that none of the subtests has a statistically
significant linear relationship with any other when paired. By this criterion, no pair of variables
is shown to be associated.
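For reference, each entry in a pairwise correlation table is a Pearson correlation coefficient computed on one pair of variables. The sketch below shows that calculation in plain Python. The scores are made-up illustration values, not data from exam.dta, and `pearson_r` is a hypothetical helper rather than a Stata function.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for five students (NOT taken from exam.dta).
read_scores = [47, 52, 57, 63, 68]
write_scores = [44, 52, 59, 57, 65]

r = pearson_r(read_scores, write_scores)
print(f"r = {r:.3f}")
```

A value of r near +1 or -1 indicates a strong linear relationship, and a value near 0 a weak one. The significance reported beside each coefficient tests whether r differs from 0.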
id | Coef. Std. Err. t P>|t| [95% Conf. Interval]
read | -.9399389 .5939055 -1.58 0.115 -2.111279 .2314016
write | -.0588975 .5952755 -0.10 0.921 -1.23294 1.115145
math | .3989122 .6282515 0.63 0.526 -.8401678 1.637992
science | 2.024179 .5580354 3.63 0.000 .9235838 3.124774
socst | .519796 .5031412 1.03 0.303 -.472533 1.512125
_cons | -.4929999 25.63078 -0.02 0.985 -51.04376 50.05776
id | Coef. Std. Err. t P>|t| [95% Conf. Interval]
read | -.9464122 .58879 -1.61 0.110 -2.107626 .2148018
math | .3840752 .6085414 0.63 0.529 -.8160926 1.584243
science | 2.013055 .545203 3.69 0.000 .9378036 3.088307
socst | .5040097 .4759601 1.06 0.291 -.4346807 1.4427
_cons | -1.078062 24.87594 -0.04 0.965 -50.1385 47.98238
Source | SS df MS Number of obs = 200
-------------+------------------------------ F( 3, 196) = 8.47
Model | 76478.9039 3 25492.968 Prob > F = 0.0000
Residual | 590171.096 196 3011.07702 R-squared = 0.1147
-------------+------------------------------ Adj R-squared = 0.1012
Total | 666650 199 3350 Root MSE = 54.873
id | Coef. Std. Err. t P>|t| [95% Conf. Interval]
read | -.8260443 .5561915 -1.49 0.139 -1.922932 .2708439
science | 2.133859 .509716 4.19 0.000 1.128627 3.139091
socst | .5632375 .4659004 1.21 0.228 -.3555839 1.482059
_cons | 3.487264 23.76449 0.15 0.883 -43.37966 50.35419
id | Coef. Std. Err. t P>|t| [95% Conf. Interval]
read | -.5048064 .4891828 -1.03 0.303 -1.469514 .4599007
science | 2.208286 .5065767 4.36 0.000 1.209277 3.207295
_cons | 12.36642 22.62748 0.55 0.585 -32.25677 56.98961
. regress id science
id | Coef. Std. Err. t P>|t| [95% Conf. Interval]
science | 1.878867 .3934046 4.78 0.000 1.103066 2.654668
_cons | 3.080741 20.76476 0.15 0.882 -37.86772 44.0292
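Each t value in these tables is the coefficient divided by its standard error. The sketch below checks this for the final model, using the coefficients and standard errors from the output of regress id science. The names `final_model` and `t_values` are invented for the example.

```python
# (coefficient, standard error) copied from the output of: regress id science
final_model = {
    "science": (1.878867, 0.3934046),
    "_cons": (3.080741, 20.76476),
}

# t statistic = coefficient / standard error
t_values = {term: coef / se for term, (coef, se) in final_model.items()}

for term, t in t_values.items():
    print(f"{term:<8} t = {t:.2f}")
```

The results match the reported t values of 4.78 for science and 0.15 for the constant.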
Among the subtests, science has the strongest relationship with id. Following the criterion, it has a
P value of 0.000.
Since 0.000 < 0.05, the relationship between science and id is significant. We can also
conclude that the students' performance averages from 40 to 60, with no one having marks that
are too low or too high.
The correlation shown in the plot is positive but not strong, because many points are
scattered away from the fitted line.
