Stat 101
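FORMULAS
Each entry gives the S/N and NAME OF FORMULA, then the form for GROUPED DATA and for UNGROUPED DATA.

1. Sturges' Rule
   Grouped data: $K = 1 + 3.322 \log N$, $W = \dfrac{R}{K}$
   where W = class width, R = range, N = number of observations and K = number of rows (classes).
   Ungrouped data: Nil

2a. Mean
   Grouped data: $\dfrac{\sum fx}{\sum f}$
   Ungrouped data: $\dfrac{\sum x}{n}$

2b. Assumed mean
   Grouped data: $A + \dfrac{h \sum fd}{\sum f}$, where $d = \dfrac{x - A}{h}$
   Ungrouped data: $A + \dfrac{\sum d}{n}$, where $d = x - A$

3. Harmonic mean
   Grouped data: $\dfrac{\sum f}{\sum \left( \frac{f}{x} \right)}$
   Ungrouped data: $\dfrac{n}{\sum \left( \frac{1}{x} \right)}$

4a. Geometric mean (1)
   Grouped data: $\sqrt[N]{x_1^{f_1} \times x_2^{f_2} \times \dots \times x_k^{f_k}}$, where $N = \sum f$
   Ungrouped data: $\sqrt[n]{x_1 \times x_2 \times \dots \times x_n}$

4b. Geometric mean (2)
   Grouped data: $\text{Antilog}\left( \dfrac{\sum f \log x}{\sum f} \right)$
   Ungrouped data: $\text{Antilog}\left( \dfrac{\sum \log x}{n} \right)$

5. Median
   Grouped data: $L_1 + \dfrac{h}{f}\left( \dfrac{N}{2} - cb \right)$
   where $L_1$ = lower class boundary of the median class, h = class width, f = frequency of that class and cb = cumulative frequency before that class.
   Ungrouped data: the middle number; if there are two middle values, sum them up and divide by 2.

6. First Quartile (Q1)
   Grouped data: $L_1 + \dfrac{h}{f}\left( \dfrac{N}{4} - cb \right)$
   Ungrouped data: Nil

7. Third Quartile (Q3)
   Grouped data: $L_1 + \dfrac{h}{f}\left( \dfrac{3N}{4} - cb \right)$
   Ungrouped data: Nil

8. Third Decile (D3)
   Grouped data: $L_1 + \dfrac{h}{f}\left( \dfrac{3N}{10} - cb \right)$
   Ungrouped data: Nil

9. 70th Percentile (P70)
   Grouped data: $L_1 + \dfrac{h}{f}\left( \dfrac{70N}{100} - cb \right)$
   Ungrouped data: Nil

10. Inter-quartile range
   Grouped data: $Q_3 - Q_1$
   Ungrouped data: Nil

11. Quartile deviation or semi-inter-quartile range
   Grouped data: $\dfrac{Q_3 - Q_1}{2}$
   Ungrouped data: Nil

12. Range
   Grouped data: $X_{max} - X_{min}$
   Ungrouped data: $X_{max} - X_{min}$

13. Coefficient of range
   Grouped data: $\dfrac{X_{max} - X_{min}}{X_{max} + X_{min}}$
   Ungrouped data: Nil

14. Coefficient of variation
   Grouped data: $\dfrac{S.D}{\text{mean}} \times 100$
   Ungrouped data: Nil

15. Mean Deviation
   Grouped data: $\dfrac{\sum f\,|x - \bar{x}|}{\sum f}$
   Ungrouped data: $\dfrac{\sum |x - \bar{x}|}{n}$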
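The ungrouped forms of formulas 2a, 3 and 4a can be checked numerically. The sketch below is only an illustration: it uses the values 2, 4, 8, 12, 16, 24 from objective question 16, and the variable names are chosen here for convenience.

```python
import math

# Values taken from objective question 16 of this material.
data = [2, 4, 8, 12, 16, 24]
n = len(data)

# 2a. Arithmetic mean: sum of x divided by n.
arithmetic_mean = sum(data) / n

# 4a. Geometric mean: n-th root of the PRODUCT of the values.
geometric_mean = math.prod(data) ** (1 / n)

# 3. Harmonic mean: n divided by the sum of reciprocals.
harmonic_mean = n / sum(1 / x for x in data)

print(round(arithmetic_mean, 2), round(geometric_mean, 2), round(harmonic_mean, 2))
```

For positive values that are not all equal, the three results always satisfy AM > GM > HM, which is the relationship asked about in objective question 19.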
Compiled by Aminu a.k.a Dr.Smoke
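29. P.M.F of a Binomial distribution
   Formula: $\Pr(X = x) = {}^{n}C_{x} \, p^{x} \, q^{n-x}$
   Ungrouped data: Nil

30. P.M.F of a Poisson distribution
   Formula: $P(x) = \dfrac{e^{-\lambda} \lambda^{x}}{x!}$, where $\lambda = np$
   Ungrouped data: Nil

31. P.D.F of Normal Distribution
   Formula: $P(x) = \dfrac{1}{\sqrt{2\pi} \cdot \delta} \, e^{-\frac{(x - \mu)^{2}}{2\delta^{2}}}$, with $Z = \dfrac{x - \mu}{\delta}$
   (here $\delta$ denotes the standard deviation)
   Ungrouped data: Nil

32. Spearman's Rank correlation
   Formula: $r = 1 - \dfrac{6 \sum D^{2}}{n(n^{2} - 1)}$, where $D = R_x - R_y$
   Ungrouped data: Nil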
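As a quick check of formulas 29 and 30, the sketch below evaluates the binomial and Poisson probability mass functions on two of the objective questions in this material (question 40: p = 0.4, n = 5; question 41: λ = 2.5, x = 3). The function names are illustrative only.

```python
import math

def binomial_pmf(x, n, p):
    """Pr(X = x) = nCx * p^x * q^(n-x), with q = 1 - p."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """P(x) = e^(-lambda) * lambda^x / x!"""
    return math.exp(-lam) * lam**x / math.factorial(x)

# Question 40: at least one of 5 students graduates, p = 0.4.
p_at_least_one = 1 - binomial_pmf(0, 5, 0.4)

# Question 41: exactly 3 calls in a minute when the average is 2.5.
p_three_calls = poisson_pmf(3, 2.5)

print(round(p_at_least_one, 4), round(p_three_calls, 4))
```

"At least one" is computed through the complement, 1 - Pr(X = 0), which avoids summing five separate terms.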
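34. Regression equation of Y on X
   Formula: $Y = a + bX$, with normal equations
   $\sum Y = na + b \sum X$
   $\sum XY = a \sum X + b \sum X^{2}$
   Ungrouped data: Nil

35. Regression equation of X on Y
   Formula: $X = a + bY$, with normal equations
   $\sum X = na + b \sum Y$
   $\sum XY = a \sum Y + b \sum Y^{2}$
   Ungrouped data: Nil

36. Coefficient of correlation between two lines of regression
   Formula: $r = \sqrt{b_{yx} \times b_{xy}}$
   Ungrouped data: Nil

37. Standard deviation of lines of regression
   Formula: $\delta_y = \dfrac{b_{yx} \times \delta_x}{r}$
   Ungrouped data: Nil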
Use the following information to answer questions 14-15. Eight coins were tossed together and the
number of heads (X) resulting was noted. The operation was repeated 256 times and the frequency
distribution of the number of heads is given below:
X 0 1 2 3 4 5 6 7 8
F 1 9 26 59 72 52 29 7 1
Take the median number of heads to be 4.
14. What is the coefficient of mean deviation about the median
(a) 0.887 (b) 0.222 (c) 0.111 (d) 0.132
15. What is the standard deviation of the experiment?
(a) 3.97 (b) 1.41 (c) 1.98 (d) 2.38
16. What is the geometric mean of: 2, 4, 8, 12, 16, 24?
(a) 11.0 (b) 8.2 (c) 9.2 (d) 10.0
17. Which of the following measures condenses a huge, unwieldy set of numerical data into a single
value that is representative of the entire distribution?
(a) Skewness (b) Kurtosis (c) Dispersion (d) Averages
18. Which of the following is not a measure of location?
(a) Mean (b) Mode (c) Variance (d) Percentile
19. Which of the following correctly defines the relationship between the Arithmetic mean (AM),
Geometric mean (GM) and Harmonic mean (HM)?
(a) AM> GM> HM (b) AM< GM <HM (c) AM = GM= HM (d) GM > AM <HM
20. Which of the following measures gives an idea about the shape of the curve of a frequency
distribution?
(a) Mean (b) Skewness (c) Kurtosis (d) Variance
21. A bag contains 20 tickets marked with numbers 1 to 20. One ticket is drawn at random. Find the
probability that it will be a multiple of 2 or 5
(a) 0.6 (b) 0.7 (c) 0.5 (d) 0.2
22. In a single throw of two dice, what is the probability of getting a total of 8?
(a) 5/36 (b) 31/36 (c) 5/18 (d) 2/18
Use the following information to answer questions 23-25. The data below show the length of life of
wholesale grocers in Abuja.
Length of life (years) 0-5 5-10 10-15 15-25 25 and above Total
Percentage of wholesalers 65 16 9 5 5 100
23. During the period studied, what is the probability that an entrant to this profession will fail within
five years? (a) 0.65 (b) 0.05 (c) 0.16 (d) 0.09
24. What is the probability that an entrant will survive at least 25 years?
(a) 0.1 (b) 0.06 (c) 0.09 (d) 0.05
25. How many years would he have to survive to be among the 10 percent longest survivors?
(a) At most 15 years (b) At least 16 years (c) At most 5 years (d) At least 15 years
26. A problem in statistics is given to three students A, B and C whose chances of solving it are 1/3, 1/4 and
1/5 respectively. Find the probability that the problem will be solved if they all try independently.
(a) 2/3 (b) 4/5 (c) 3/5 (d) 3/4
40. The probability that a student will graduate is 0.4, what is the probability that out of 5 students at
least one will graduate?
(a) 0.9222 (b) 0.0777 (c) 0.2592 (d) 0.0103
41. Between the hours of 2 and 4P.M, the average number of phone calls per minute coming into the
switch board of a company is 2.5 , find the probability that during one particular minute there will
be exactly 3 calls.
(a) 0.082 (b) 0.2341 (c) 0.2138 (d) 0.6065
42. If X is a Poisson variate with parameter 1. Find P(3<X<5)
(a) 0.1952 (b) 0.0153 (c) 0.1789 (d) 0.0243
43. If 3% of electric bulbs manufactured by a company are defective, find the probability that in a
sample of 100 bulbs, exactly five bulbs are defective.
(a) 0.1008 (b) 0.0081 (c) 0.2131 (d) 0.0634
44. Which of the following is not a property of a binomial experiment?
(a) The experiment consist of a sequence of “n” identical trials (b) Each outcome can be referred to
as a success or a failure (c) The trials are independent (d) The probability of success can change
from one trial to the next.
45. Which of the following is correct about the value of correlation coefficient (r)?
(a) -2 < r < 2 (b) 1 < r < 3 (c) 0 < r < -2 (d) -1 < r < 1
46. Given n = 10, $\sum X = 195$, $\sum X^{2} = 4485$, $\sum Y = 149$, $\sum Y^{2} = 2682$, $\sum XY = 3446$. Obtain the value of Karl
Pearson's correlation coefficient.
(a) 0.67 (b) 0.96 (c) 0.89 (d) 0.79
47. The line of regression of y on x is:
(a) The line which gives the best estimate for the value of x for any specified value of y
(b) The line which gives the best estimate for the value of y for any specified value of x
(c) The line which divides the value of y and x (d) None of these
48. On each of 30 items, two measurements are made, and the following summations were given: $\sum X = 15$,
$\sum X^{2} = 61$, $\sum Y = -6$, $\sum Y^{2} = 90$, $\sum XY = 56$. Calculate the slope of the regression line of y on x.
(a) 0.8561 (b) 2.1650 (c) 1.1028 (d) 1.2143
49. Given two lines of regression equations: y=0.6332x + 0.619, and x = 1.5104y – 0.6235. Obtain the
correlation coefficient between x and y. (a) 0.6720 (b) 0.5239 (c) 0.9779 (d) 0.9527
50. The lines of regression of a bivariate population are: 8x – 10y + 66 = 0; 40x – 18y = 214. The variance
of x is 25. Find the standard deviation of y.
(a) 4 (b) 5 (c) 7 (d) 9
b. Define a Poisson probability mass function with parameter $\theta$. Prove that its mean is equal to its
variance.
QUESTION TWO
a. The weekly wages of 2000 workers are normally distributed around a mean of 140 and standard
deviation of 10. Estimate the number of workers whose weekly wages will be:
(i) Between 120 and 130 (ii) More than 170 (iii) less than 165.
b. The following table gives the ages and blood pressure of 7 women
Age (X) 56 42 36 47 49 42 60
Blood Pressure (Y) 147 125 118 128 145 140 155
Determine the least square regression equation of y on x.
USE THE FORMULAS IN THIS MATERIAL TO ATTEMPT THE OBJECTIVES, AND COMPARE YOUR ANSWERS
WITH MINE.
CONTACT ME FOR MORE EXPLANATION: 08111652798
SOLUTIONS
2016/2017 OBJ ANSWERS
1. C 2. A 3. A 4. B 5. B 6. D 7.D
8. B 9. C 10.C 11.C 12.C 13.A 14.B
15. C 16. B 17.D 18. C 19.A 20.B 21.B
22. A 23.A 24.D 25.A 26.C 27.B 28.B
29. B 30.A 31.A 32.A 33.D 34.D 35.C
36. A 37.D 38.B 39.A 40.A 41.C 42.B
43. A 44.D 45.D 46.B 47.A 48. C 49. C
50. A
(b) $P(x) = \dfrac{e^{-\lambda} \lambda^{x}}{x!}$ for x = 0, 1, 2, 3, ...
(b)
S/N     X     Y     X2      XY
1       56    147   3136    8232
2       42    125   1764    5250
3       36    118   1296    4248
4       47    128   2209    6016
5       49    145   2401    7105
6       42    140   1764    5880
7       60    155   3600    9300
Total   332   958   16170   46031
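The column totals and the resulting least squares line of Y on X can be cross-checked with a few lines of code. This is a sketch under the assumption that the ages and blood pressures are exactly those listed in Question Two (b); the slope and intercept come from solving the two normal equations of formula 34.

```python
# Ages (X) and blood pressures (Y) of the 7 women in Question Two (b).
ages = [56, 42, 36, 47, 49, 42, 60]
pressures = [147, 125, 118, 128, 145, 140, 155]

n = len(ages)
sum_x = sum(ages)
sum_y = sum(pressures)
sum_x2 = sum(x * x for x in ages)
sum_xy = sum(x * y for x, y in zip(ages, pressures))

# Solving  sum_y = n*a + b*sum_x  and  sum_xy = a*sum_x + b*sum_x2  for a and b.
slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x**2)
intercept = (sum_y - slope * sum_x) / n

print(sum_x, sum_y, sum_x2, sum_xy)
print(f"Y = {intercept:.2f} + {slope:.2f}X")
```

Recomputing every product in the XY column is a cheap way to catch a slip in one row before it propagates into the slope.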
UNIVERSITY OF ABUJA
FACULTY OF SCIENCE
DEPARTMENT OF STATISTICS
1. Which of the following veterans perceived statistics as the quantitative data affected to a marked
extent by multiplicity of causes (a) Yule and Kendall (b) Webster (c) Bowley (d)Secrist
2. The following points may be termed as preliminaries to data collection except (a) objectives of the
enquiry (b) source of information (c) tabulation (d) type of enquiry
3. The frequency curve may be regarded as a limiting form for which of the following graphs? (a)
Histogram (b) Frequency polygon (c) Ogive (d) Line graph
4. Through which of the following graphs can the median of a distribution be obtained? (a) Bar chart (b)
Ogive (c) Frequency curve (d) Histogram
Use this information to answer questions 5-9. A company launched a sales campaign and appointed
100 sales girls for this purpose. At the end of the period, the sales results were analyzed and the
following information was obtained:
Sales                  75-80  80-85  85-90  90-95  95-100  100-105  105-110  110-115
Relative frequencies   0.09   0.12   0.15   0.11   0.20    0.20     0.11     0.02
5. What is the mean sale of the company? (a) 94.25 (b) 94.35 (c) 92.25 (d) 92.35
6. What is the coefficient of variation of the sales? (a) 50.0% (b) 60.0% (c) 50.5% (d) 60.5%
7. What is the third quartile of the sales? (a) 75 (b) 70 (c) 102 (d) 103
8. What is the 90th percentile of the sales? (a) 90 (b) 50 (c) 106.36 (d) 102.46
9. The relative measure of variability based on percentile is? (a) 0.12 (b) 0.14 (c) 0.23 (d) 0.34
10. The arithmetic means of runs scored by three batsmen X, Y and Z in a series of 10 innings are 50, 48
and 12 respectively. The standard deviations of the runs are 15, 12 and 2 respectively. Who is the
most consistent of the three? (a) Z (b) X (c) Y (d) None of these
11. Calculate the square root of the mode of the following distribution: 2, 4, 7, 5, 4, 2, 4, 2, 9, 6, 4, 2, 9
(a) 2 (b) 3 (c) 4 (d) 9
12. Which of the following measures is regarded as the measure of variation of a distribution? (a)
Dispersion (b) Averages (c) Skewness (d) Kurtosis
13. From the following data, find Karl Pearson's coefficient of skewness:
Measurement 10 11 12 13 14 15
Frequency 2 4 10 8 5 1
(a) 0.2485 (b) 0.3478 (c) 0.6286 (d) 0.7290
14. The mean for a symmetrical distribution is 50.6, what are the values of median and mode? (a) 25.3
and 20.4 (b) 101.2 and 105.7 (c) 50.9 and 60.7 (d) 50.6 and 50.6
15. A bag contains 20 tickets, marked with number 1 to 20, one ticket is drawn at random, find the
probability that it will be a multiple of 3 or 5? (a) 0.45 (b) 0.60 (c) 0.53 (d) 0.40
16. A bag contains 8 white and 3 red balls. If two balls are drawn at random without replacement, find
the probability that one is of each color. (a) 20/55 (b) 8/11 (c) 3/55 (d) 24/55
17. In a single throw of two dice, what is the probability of getting a total of 8? (a) 3/36 (b) 6/36 (c) 7/36
(d) 5/36
Use the following information to answer question 18-19. Assume that a factory has two machines,
past record shows that machine I produces 30% of the items of output and machine II produces
70% of the items, 5% of the items produced by machine I were defective and only 1% produced by
machine II are defective.
18. If a defective item is drawn at random, what is the probability that it was produced by machine I (a)
0.542 (b) 0.682 (c) 0.976 (d) 0.318
19. If a defective item is drawn at random what is the probability that it was produced by machine II?
(a) 0.682 (b) 0.542 (c) 0.318 (d) 0.976
20. Let A and B be two possible events. The event that happening of any one of them excludes the
happening of others in the same experiment is known as (a) Mutually exclusive (b) Independent (c)
Equally likely (d) All of these
21. Let P(A) be the probability of an event A, if P(A)= 1 which of the following is correct ? (a) A is not
certain (b) A is equally likely (c) A is certain (d) None of these
22. A random variable X has the following probability distribution
X 0 1 2 3
P(x) 1/8 3/8 3/8 1/8
Compute the expectation of X (a) 1.3 (b) 1.2 (c) 1.5 (d) 1.6
23. Which of the following represents a probability distribution?
(a) P(x) 0.2, 0.35, 0.12, 0.40, -0.07 (b) P(x) 0.2, 0.25, 0.10, 0.14, 0.49 (c) P(x) 0.2, 0.25, 0.10, 0.15,
0.30, (d) P(x) 0.2, 0.40, 0.02, 0.14, 0.07
Use the following information to answer questions 24-25
A random variable X has the following probability distribution
X 0 1 2 3 4
P(x) 0.12 K2/2 0.5 0.24 0.12
24. What is the value of constant K (a) 0.04 (b) 0.4 (c) 0.2 (d) 0.02
25. Obtain the value of Var(K2X) (a) 0.02 (b) 0.03 (c) 0.01 (d) 0.04
26. Which of the following is a property of binomial distribution? (a) The number of trial is indefinitely
large (b) The number of trials are independent (c) The probability of success is very small (d) The
probability of success can change from one trial to the next
27. The average number of customer who appears at a counter of a certain bank per minute is 2, find
the probability that during a given minute, three or more customers appear? (a) 0.3235 (b) 0.1353
(c) 0.3245 (d) 0.1355
28. If 5% of the electric bulbs manufactured by a company are defective, find the probability that in a
sample of 100 bulbs, exactly 5 bulbs are defective (a) 0.1734 (b) 0.1008 (c) 0.1755 (d) 0.1006
29. The probability that a student will graduate is 40%, determine the probability that out of 5 students,
none will graduate (a) 0.0102 (b) 0.2592 (c) 0.0788 (d) 0.9222
30. Which of these is not a property of a normal distribution? (a) Mean ≠ Median ≠ Mode (b) The curve
is symmetrical (c) The graph is the famous bell-shaped curve (d) The total area under the normal curve is 1
31. The ranks according to two attributes in a sample are given below:
R1 1 2 3 4 5
R2 5 4 3 2 1
What is the rank correlation between them? (a) 0 (b) -1 (c) 1 (d) None of the above
32. For a particular product, the sales (y) and the advertisement expenditure (x) for 10 years provide the
results: $\sum X = 15$, $\sum X^{2} = 250$, $\sum Y = 110$, $\sum Y^{2} = 3200$, $\sum XY = 400$. Find the regression line of y on x. (a)
y = 1.033x + 9.4505 (b) y = -1.033x - 9.4505 (c) y = 1.033x - 9.4505 (d) y = -1.033x + 9.4505
33. A numerical value used as a summary measure for a sample such as sample mean, is known as a (a)
population parameter (b) Sample parameter (c) sample statistic (d) population mean
34. µ is an example of a? (a) population parameter (b) sample statistic (c) population variance (d) mode
35. The sum of the percent frequencies for all classes will always equal (a) one (b) the number of
classes (c) the number of items in the study (d) 100
36. In a five number summary, which of the following is not used for data summarization? (a) the
smallest value (b) the largest value (c) the median (d) the mean
Data set 1
The following data shows the number of hours worked by 200 statistics students
No. of hours 0-9 10-19 20-29 30-39
Frequency 40 50 70 40
37. Refer to data set 1, the class width of this distribution is (a) 9 (b) 10 (c) 11 (d) varies
38. Refer to data set 1, the number of students working 19 hours or less is (a) 40 (b) 50 (c) 90 (d) cannot
be determined without the original data
39. Refer to data set 1, the relative frequency of students working 9 hours or less is (a) 0.2 (b) 0.45 (c) 10
(d) cannot be determined from the information given
40. The difference between the largest and the smallest data values is the (a) variance (b) inter-quartile
range (c) range (d) coefficient of variation
41. Which of the following is not a measure of central location? (a) mean (b) median (c) variance (d)
mode
42. The sum of deviation of the individual data element from their mean is? (a) always greater than
zero (b) always less than zero (c) sometimes greater than and sometimes less than zero depending
on the data elements (d) always equal to zero
43. A tabular summary of a set of data showing the fraction of the total number of items in several
classes is (a) frequency distribution (b) relative frequency distribution (c) frequency (d) cumulative
frequency distribution
Data set 3
A researcher has collected the following sample data. The mean of the sample is 5
3 5 12 3 2
44. Refer to data set 3, the variance is?
(a) 80 (b) 4.062 (c) 13.2 (d) 16.5
45. Refer to data set 3, the standard deviation is?
(a) 8.944 (b) 4.062 (c) 13.2 (d) None
46. Refer to data set 3, The coefficient of variation is
(a) 72.66% (b) 81.24% (c) 264% (d) 330%
47. Refer to data set 3, the range is?
(a) 1 (b) 2 (c) 10 (d) 12
48. Refer to data set 3, the inter-quartile range is
(a) 1 (b) 2 (c) 10 (d) 12
49. Which of the following is not a measure of dispersion?
(a) The range (b) The 50th percentile (c) The inter-quartile range (d) The variance
50. In computing descriptive statistics from grouped data
(a) Data values are treated as if they occur at the midpoint of a class
(b) The grouped data result is more accurate than the ungrouped data
(c) The grouped data computations are used only when population is being analyzed
(d) All of the above answers are correct
(iii) Find the probability that at least one person plays sport, given that not more than three
people play sport
SOLUTIONS
SECTION A
1. A 2.C 3.B 4.B 5.A 6.10.12
7. C 8.C 9.B 10.C 11.A 12.A
13.A 14.D 15.C 16.D 17.D 18.B
19.C 20.B 21.C 22.C 23.C 24.C
25.A 26.B 27.A 28.C 29.C 30.A
31.C 32.A 33.A 34.A 35.D 36.C
37.B 38.B 39.A 40.C 41.C 42.D
43.A 44.C 45.D 46.A 47.C 48.3
49.A 50.C
SOLUTIONS TO SECTION B
QUESTION ONE
S/N     CLASS INTERVAL   FREQ   C.F   MID POINT (x)   LOG (X)   F LOG(X)   LOWER C.B
1       70 – 74          40     40    72              1.857     74.28      69.5
2       75 – 79          45     85    77              1.886     84.87      74.5
3       80 – 84          50     135   82              1.914     95.70      79.5
4       85 – 89          60     195   87              1.940     116.40     84.5
5       90 – 94          70     265   92              1.964     137.48     89.5
6       95 – 99          80     345   97              1.987     158.96     94.5
7       100 – 104        100    445   102             2.009     200.90     99.5
TOTAL                    445                                    868.59

(i) G.M = Antilog $\left( \dfrac{\sum f \log x}{\sum f} \right)$ = Antilog $\left( \dfrac{868.59}{445} \right)$ = Antilog (1.952) ≈ 89.5
As a check, the geometric mean must lie between the smallest and largest mid points (72 and 102).
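The grouped geometric mean (formula 4b) for the table above can be sketched in code as follows. The mid points and frequencies are copied from the table; logarithms are taken to base 10 at full precision, so the result can differ slightly from a hand calculation using three-decimal logs.

```python
import math

# Mid points (x) and frequencies (f) from the table above.
midpoints = [72, 77, 82, 87, 92, 97, 102]
freqs = [40, 45, 50, 60, 70, 80, 100]

total_f = sum(freqs)
sum_f_log_x = sum(f * math.log10(x) for x, f in zip(midpoints, freqs))

# G.M = Antilog(sum(f log x) / sum(f)); the antilog is 10 raised to that power.
geometric_mean = 10 ** (sum_f_log_x / total_f)

print(round(sum_f_log_x, 2), round(geometric_mean, 2))
```

Any geometric mean of these data has to fall between 72 and 102, which is a useful sanity check on the hand computation.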
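(ii) Relative measure of variability based on deciles is given by $\dfrac{D_9 - D_1}{D_9 + D_1}$

$D_9 = L_1 + \dfrac{h}{f}\left( \dfrac{9N}{10} - C_b \right)$; to locate the D9 class we will have $\dfrac{9N}{10} = \dfrac{9 \times 445}{10} = 400.5$

$D_1 = L_1 + \dfrac{h}{f}\left( \dfrac{1N}{10} - C_b \right)$; to locate the D1 class we will have $\dfrac{1N}{10} = \dfrac{1 \times 445}{10} = 44.5$

Therefore the relative measure of variability is $\dfrac{D_9 - D_1}{D_9 + D_1}$, with $D_9$ and $D_1$ evaluated from the two formulas above.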
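The decile formula can be applied mechanically once the decile class has been located. The sketch below assumes the lower class boundaries and frequencies of the Question One table (class width h = 5); `grouped_quantile` is a helper name invented here for illustration, not something from the original material.

```python
def grouped_quantile(lower_bounds, freqs, width, position):
    """Return L1 + (h/f) * (position - cb) for the class containing `position`."""
    cum_before = 0  # cb: cumulative frequency before the current class
    for lower, f in zip(lower_bounds, freqs):
        if cum_before + f >= position:
            return lower + (width / f) * (position - cum_before)
        cum_before += f
    raise ValueError("position exceeds the total frequency")

# Lower class boundaries and frequencies from the Question One table.
lower_bounds = [69.5, 74.5, 79.5, 84.5, 89.5, 94.5, 99.5]
freqs = [40, 45, 50, 60, 70, 80, 100]
n = sum(freqs)

d1 = grouped_quantile(lower_bounds, freqs, 5, 1 * n / 10)  # position 44.5
d9 = grouped_quantile(lower_bounds, freqs, 5, 9 * n / 10)  # position 400.5
relative_variability = (d9 - d1) / (d9 + d1)

print(round(d1, 3), round(d9, 3), round(relative_variability, 4))
```

The same helper covers the median, quartiles and percentiles of formulas 5 to 9 by passing N/2, N/4, 3N/4 or 70N/100 as the position.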
= $np - np^{2}$
= np(1-p) (by factorization)
Variance = npq (q=1-p)
QUESTION TWO
S/N X Y XY X2 Y2
1 2450 1370 3356500 6002500 1876900
2 2480 1350 3348000 6150400 1822500
3 2540 1400 3556000 6451600 1960000
4 2420 1330 3218600 5856400 1768900
5 2350 1270 2984500 5522500 1612900
6 2290 1210 2770900 5244100 1464100
7 2400 1330 3192000 5760000 1768900
8 2460 1350 3321000 6051600 1822500
TOT 19390 10610 25747500 47039100 14096700
$r = \dfrac{n \sum xy - (\sum x)(\sum y)}{\sqrt{[n \sum x^{2} - (\sum x)^{2}] \times [n \sum y^{2} - (\sum y)^{2}]}}$

where n = 8, $\sum x$ = 19390, $\sum y$ = 10610, $\sum x^{2}$ = 47039100, $\sum y^{2}$ = 14096700, $\sum xy$ = 25747500.

Therefore,
$r = \dfrac{8(25747500) - (19390)(10610)}{\sqrt{[8(47039100) - (19390)^{2}][8(14096700) - (10610)^{2}]}}$

$r = \dfrac{205980000 - 205727900}{\sqrt{[340700][201500]}} = \dfrac{252100}{262013.45} = 0.962$

The correlation is strongly positive.
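The correlation coefficient can be reproduced directly from the eight (X, Y) pairs in the table. This is a minimal sketch using only the standard library; the variable names are chosen here for readability.

```python
import math

# The eight (X, Y) pairs from the Question Two table.
xs = [2450, 2480, 2540, 2420, 2350, 2290, 2400, 2460]
ys = [1370, 1350, 1400, 1330, 1270, 1210, 1330, 1350]

n = len(xs)
sum_x, sum_y = sum(xs), sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)
sum_y2 = sum(y * y for y in ys)

numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x**2) * (n * sum_y2 - sum_y**2))
r = numerator / denominator

print(numerator, round(denominator, 2), round(r, 3))
```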
ii. where n = 8, $\sum x$ = 19390, $\sum y$ = 10610, $\sum x^{2}$ = 47039100, $\sum y^{2}$ = 14096700, $\sum xy$ = 25747500.
For the regression equation of Y on X, the equations below are used.
$\sum Y = na + b \sum X$ .................................................... equation 1
$\sum XY = a \sum X + b \sum X^{2}$ .......................................... equation 2
252100/340700 = 340700b/340700
b = 0.74
Substitute b as 0.74 in equation (a)
10610 = 8a + 19390b
10610 = 8a + 19390(0.74)
10610 = 8a + 14348.6 (collecting like terms)
10610 – 14348.6 = 8a
-3738.6 = 8a (dividing both sides by the coefficient of a)
-3738.6/8 = 8a/8
a = -467.3
Therefore the regression equation of Y on X is given by
Y = -467.3 + 0.74X
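The two normal equations can also be solved in closed form from the column totals. The sketch below uses the totals quoted in this solution (n = 8, ∑x = 19390, ∑y = 10610, ∑x² = 47039100, ∑xy = 25747500). It keeps the slope unrounded, so the intercept comes out near -467.2 rather than the -467.3 obtained above with b rounded to 0.74.

```python
# Column totals quoted in the solution to Question Two.
n = 8
sum_x, sum_y = 19390, 10610
sum_x2, sum_xy = 47039100, 25747500

# Eliminating a from
#   sum_y  = n*a     + b*sum_x    (equation 1)
#   sum_xy = a*sum_x + b*sum_x2   (equation 2)
# gives the slope b, and equation 1 then gives the intercept a.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x**2)
a = (sum_y - b * sum_x) / n

print(f"Y = {a:.1f} + {b:.2f}X")
```

Rounding the slope before substituting it magnifies the error in the intercept, because b is multiplied by a sum as large as 19390.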
Therefore Pr (X ≤ 3) = 0.786
(iii) Pr (1≤X≤ 3) = Pr(X=1) + Pr(X=2) + Pr(X=3)
Pr (1≤X≤ 3) = 0.138 + 0.299 + 0.324
Therefore Pr (1≤X≤ 3) = 0.761