IAL Statistics Revision Worksheet Month 6


BATCH – JAN 2022

S1 REVISION WORKSHEET
MONTH 06: JULY 2021
Syllabus: Chapter 2, 3, 4, 5

1. The table shows the frequency distribution for the number of petals in the flowers of a group of celandines.

Number of petals   Frequency (f)
5                  8
6                  57
7                  29
8                  3
9                  1

(a) Work out how many celandines were in the group. (98)
(b) Write down the modal number of petals. (6)
(c) Calculate the mean number of petals. (6.31)
(d) Calculate the median number of petals. (6)
(e) If you saw a celandine, write down how many petals you would expect it to have. (6)
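
As a quick check of the bracketed answers to question 1, the frequency table can be evaluated with a few lines of Python. This is only a sketch; the variable names are illustrative and not part of the worksheet.

```python
from statistics import median

# Frequency table from question 1: number of petals -> frequency
petals = {5: 8, 6: 57, 7: 29, 8: 3, 9: 1}

n = sum(petals.values())                          # (a) size of the group
mode = max(petals, key=petals.get)                # (b) value with the largest frequency
mean = sum(x * f for x, f in petals.items()) / n  # (c) sum(fx) / sum(f)

# (d) Expand the table into a sorted list and take the middle value(s)
values = [x for x, f in sorted(petals.items()) for _ in range(f)]
med = median(values)

print(n, mode, round(mean, 2), med)
```

The mode and the median agree here, which is why 6 is the natural answer to part (e).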
2. The grouped frequency distribution shown below gives the speed of service of the top 50 performers in Men’s
professional tennis in 1992.
Service speed (m.p.h.)   90-94   95-99   100-104   105-109   110-114   115-119   120-124   125-129
Frequency                2       7       9         14        9         4         3         2

Calculate
(a) Modal class (105-109)
(b) Mean, variance and standard deviation using the coding $y = \frac{x - 112}{5}$
($\mu = 107.5$, $\sigma^2 = 69.25$, $\sigma = 8.32$)
(c) Q1, Q2, Q3, P25, P90 (101.4; 107; 112.56; 101.4; 119.5)
(d) Range, IQR and 50% to 90% inter percentile range (39; 11.16; 12.5)
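
The coding method and the interpolated percentiles in question 2 can be sketched in Python as follows. The class boundaries 89.5, 94.5, ... are an assumption (speeds recorded to the nearest m.p.h.), and the helper `percentile` is a hypothetical name introduced only for this illustration.

```python
from math import sqrt

# Class midpoints and frequencies from question 2 (classes 90-94, 95-99, ...)
midpoints = [92, 97, 102, 107, 112, 117, 122, 127]
freqs = [2, 7, 9, 14, 9, 4, 3, 2]
n = sum(freqs)

# Code each midpoint with y = (x - 112) / 5
ys = [(x - 112) / 5 for x in midpoints]

mean_y = sum(f * y for f, y in zip(freqs, ys)) / n
var_y = sum(f * y * y for f, y in zip(freqs, ys)) / n - mean_y ** 2

# Undo the coding: x = 5y + 112, so the mean is scaled and shifted,
# while the variance is only scaled (by 5 squared)
mean_x = 5 * mean_y + 112
var_x = 25 * var_y
sd_x = sqrt(var_x)

def percentile(p):
    """Estimate the p-th percentile by linear interpolation.

    Assumes class boundaries at 89.5, 94.5, ... with class width 5.
    """
    target = p / 100 * n
    cumulative = 0
    lower = 89.5
    for f in freqs:
        if cumulative + f >= target:
            return lower + (target - cumulative) / f * 5
        cumulative += f
        lower += 5
    return lower

print(mean_x, var_x, round(sd_x, 2))
print([round(percentile(p), 2) for p in (25, 50, 75, 90)])
```

The interquartile range and the 50% to 90% interpercentile range in part (d) are then just differences of these percentile estimates.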

3. The following table summarises the distances, to the nearest km, that 134 examiners travelled to attend a meeting
in London.
Distance (km)   Number of examiners
41 − 45         4
46 − 50         19
51 − 60         53
61 − 70         37
71 − 90         15
91 − 150        6

(a) Give a reason to justify the use of a histogram to represent these data.
(b) Write down the underlying feature associated with each of the bars in a histogram.
(c) Represent the data in a histogram.
The mid-point of each class is represented by x and the corresponding frequency by f. Calculations then give the
following values
$\sum fx = 8379.5$ and $\sum fx^2 = 557489.75$
(d) Calculate an estimate of the mean and an estimate of the standard deviation for these data. (62.5, 15.8)

One coefficient of skewness is given by

$$\frac{Q_3 - 2Q_2 + Q_1}{Q_3 - Q_1}$$
(e) Evaluate this coefficient and comment on the skewness of these data.
(f) Give another justification of your comment in part (e).
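
Parts (d) and (e) of question 3 can be checked with the sketch below. It assumes class boundaries at 40.5, 45.5, ... (distances rounded to the nearest km) and estimates the quartiles by linear interpolation; the helper `quantile` is a hypothetical name. The worksheet gives no answer for part (e), so the printed coefficient is only what this sketch produces under those assumptions.

```python
from math import sqrt

# Question 3: the class 41-45 is taken to cover 40.5 <= distance < 45.5, and so on
boundaries = [40.5, 45.5, 50.5, 60.5, 70.5, 90.5, 150.5]
freqs = [4, 19, 53, 37, 15, 6]
n = sum(freqs)

# (d) Mean and standard deviation from the given summary statistics
sum_fx, sum_fx2 = 8379.5, 557489.75
mean = sum_fx / n
sd = sqrt(sum_fx2 / n - mean ** 2)

def quantile(fraction):
    """Linear interpolation inside the class that contains the target rank."""
    target = fraction * n
    cumulative = 0
    for lower, upper, f in zip(boundaries, boundaries[1:], freqs):
        if cumulative + f >= target:
            return lower + (target - cumulative) / f * (upper - lower)
        cumulative += f
    return boundaries[-1]

q1, q2, q3 = quantile(0.25), quantile(0.5), quantile(0.75)

# (e) Quartile coefficient of skewness; a positive value indicates positive skew
skew = (q3 - 2 * q2 + q1) / (q3 - q1)

print(round(mean, 1), round(sd, 1), round(skew, 3))
```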

4. The figure shows a histogram for the variable t, which represents the time taken, in minutes, by a group of people
to swim 500m.

(a) Copy and complete the frequency table for t.


t 5-10 10-14 14-18 18-25 25-40
Frequency 10 16 24

(b) Estimate the number of people who took longer than 20 minutes to swim 500m. [40]
(c) Find an estimate of the mean time taken. [18.9]
(d) Find an estimate for the standard deviation of t. [7.26]
(e) Find an estimate for the median and quartiles for t. [𝑄1 = 14, 𝑄2 = 18, 𝑄3 = 23]
One measure of skewness is found using $\frac{3(\text{mean} - \text{median})}{\text{standard deviation}}$

(f) Evaluate this measure and describe the skewness of these data. [0.372]
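
Taking the bracketed answers to parts (c), (d) and (e) of question 4 as inputs, the measure in part (f) can be evaluated directly. A minimal sketch:

```python
# Bracketed worksheet answers for question 4, parts (c), (d) and (e)
mean, sd, median = 18.9, 7.26, 18

# Measure of skewness: 3(mean - median) / standard deviation
skewness = 3 * (mean - median) / sd

print(round(skewness, 3))
```

A positive value, as here, corresponds to positive skew.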

5. A travel agent sells holidays from his shop. The prices, in £, of 15 holidays sold on a particular day are shown
below.
299 1050 2315 999 485
350 169 1015 650 830
99 2100 689 550 475
For these data, find

(a) the mean and the standard deviation, (805, 621)


(b) the median and the inter-quartile range. (650, 665)

An outlier is an observation that falls either more than 1.5 × (inter-quartile range) above the upper quartile or
more than 1.5 × (inter-quartile range) below the lower quartile.

(c) Determine if any of the prices are outliers. (2100 and 2315)

The travel agent also sells holidays from a website on the Internet. On the same day, he recorded the price, £x,
of each of 20 holidays sold on the website. The cheapest holiday sold was £98, the most expensive was £2400
and the quartiles of these data were £305, £1379 and £1805. There were no outliers.

(d) On graph paper, and using the same scale, draw box plots for the holidays sold in the shop and the holidays
sold on the website.
(e) Compare and contrast sales from the shop and sales from the website.
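
Parts (a) to (c) of question 5 can be reproduced with the sketch below. The quartile convention in `quantile` (round up to the next ordered value, or average two neighbours when the position is a whole number) is an assumption chosen because it reproduces the bracketed answers; other conventions give slightly different quartiles.

```python
from math import ceil, sqrt

prices = sorted([299, 1050, 2315, 999, 485,
                 350, 169, 1015, 650, 830,
                 99, 2100, 689, 550, 475])
n = len(prices)

mean = sum(prices) / n
# Standard deviation in the form sqrt(sum(x^2)/n - mean^2)
sd = sqrt(sum(x * x for x in prices) / n - mean ** 2)

def quantile(fraction):
    """Assumed convention: if fraction * n is a whole number, average that
    ordered value and the next one; otherwise round up to the next position."""
    position = fraction * n
    if position == int(position):
        k = int(position)
        return (prices[k - 1] + prices[k]) / 2
    return prices[ceil(position) - 1]

q1, q2, q3 = quantile(0.25), quantile(0.5), quantile(0.75)
iqr = q3 - q1

# Outlier fences as defined in the question
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in prices if x < lower_fence or x > upper_fence]

print(mean, round(sd), q2, iqr, outliers)
```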

6. Given $P(C \mid D) = 0.5$, $P(C \mid D') = 0.4$ and $P(D) = 0.6$, find:

a. $P(C)$ [0.46]   b. $P(D \mid C)$ [0.652]   c. $P(D' \mid C)$ [0.348]   d. $P(D' \mid C')$ [$\frac{4}{9}$]
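
The bracketed answers to question 6 are consistent with reading the givens as P(C|D) = 0.5, P(C|D') = 0.4 and P(D) = 0.6. Under that assumed reading, a short sketch of the calculation is:

```python
# Assumed reading of the givens: P(C|D) = 0.5, P(C|D') = 0.4, P(D) = 0.6
p_d = 0.6
p_c_given_d = 0.5
p_c_given_not_d = 0.4

# Joint probabilities from the multiplication rule
p_c_and_d = p_c_given_d * p_d
p_c_and_not_d = p_c_given_not_d * (1 - p_d)

# a. Law of total probability
p_c = p_c_and_d + p_c_and_not_d

# b, c. Conditional probabilities given C
p_d_given_c = p_c_and_d / p_c
p_not_d_given_c = p_c_and_not_d / p_c

# d. P(D'|C') = P(C' and D') / P(C')
p_not_c_and_not_d = (1 - p_c_given_not_d) * (1 - p_d)
p_not_d_given_not_c = p_not_c_and_not_d / (1 - p_c)

print(round(p_c, 2), round(p_d_given_c, 3),
      round(p_not_d_given_c, 3), round(p_not_d_given_not_c, 3))
```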

7. The events A and B are such that $P(A) = \frac{2}{5}$, $P(B) = \frac{1}{2}$ and $P(A \mid B') = \frac{4}{5}$.
(a) Find
(i) $P(A \cap B')$ ($\frac{2}{5}$)
(ii) $P(A \cap B)$ (0)
(iii) $P(A \cup B)$ ($\frac{9}{10}$)
(iv) $P(A \mid B)$ (0)

(b) State, with a reason, whether or not A and B are


(i) mutually exclusive,
(ii) independent.

8. A die is biased in such a way that a score of 5 is twice as likely as each of the other scores, and the other scores are all equally likely. This
biased die is rolled together with a fair cubical die. Find the probability that the product of the two scores is
at most 25. ($\frac{19}{21}$)
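
Question 8 can be checked by listing all 36 score pairs. The sketch below assumes the biased die gives a 5 with probability 2/7 and each other score with probability 1/7, and uses exact fractions.

```python
from fractions import Fraction

# Biased die: a 5 is twice as likely as each other score, so the
# weights 1, 1, 1, 1, 2, 1 are divided by their total of 7
biased = {score: Fraction(2 if score == 5 else 1, 7) for score in range(1, 7)}
fair = {score: Fraction(1, 6) for score in range(1, 7)}

# Add up the probability of every pair whose product is at most 25
p_at_most_25 = sum(
    pb * pf
    for b, pb in biased.items()
    for f, pf in fair.items()
    if b * f <= 25
)

print(p_at_most_25)
```

Only three pairs exceed 25 (5 × 6, 6 × 5 and 6 × 6), so working with the complement is the quicker route by hand.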

9. 𝐴 and 𝐵 are two events such that P(𝐴|𝐵) = 0.1, P(𝐴|𝐵′) = 0.6 and P(𝐵) = 0.3.
(a) Represent the two events on a Venn diagram
(b) Represent the two events on a tree diagram
(c) Hence, or otherwise, calculate :
i. P(𝐴 ∩ 𝐵) (0.03)
ii. P(𝐴) (0.45)
iii. P(𝐵|𝐴′). (0.49)

10. Three companies operate a bus service along a busy main road. Amber buses run 50% of the service and 2% of
their buses are more than 5 minutes late. Blunder buses run 30% of the service and 10% of their buses are more
than 5 minutes late. Clipper buses run the remainder of the service and only 1% of their buses are more than 5 minutes
late.
Jean is waiting for a bus on the main road.
(a) Find the probability that the first bus to arrive is an Amber bus that is more than 5 minutes late (0.01)

Let A, B and C denote the events that Jean catches an Amber bus, a Blunder bus and a Clipper bus
respectively. Let L denote the event that Jean catches a bus that is more than 5 minutes late.
(b) Draw a Venn diagram to represent the events A, B, C and L with their associated probabilities
(c) Find the probability that Jean catches a bus that is more than 5 minutes late. (0.042)
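
Parts (a) and (c) of question 10 follow from the multiplication rule and the law of total probability. A minimal sketch, with dictionary names chosen only for illustration:

```python
# Share of the service and proportion of late buses for each company
share = {"Amber": 0.50, "Blunder": 0.30, "Clipper": 0.20}
late_rate = {"Amber": 0.02, "Blunder": 0.10, "Clipper": 0.01}

# (a) P(Amber and late) = P(Amber) * P(late | Amber)
p_amber_late = share["Amber"] * late_rate["Amber"]

# (c) Law of total probability over the three companies
p_late = sum(share[c] * late_rate[c] for c in share)

print(round(p_amber_late, 3), round(p_late, 3))
```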

11. A computer game has three levels and one of the objectives of every level is to collect a diamond. The probability
of a randomly chosen player collecting a diamond on the first level is $\frac{4}{5}$, the second level is $\frac{2}{3}$ and the third level is $\frac{1}{2}$.
The events are independent.
a) Draw a tree diagram to represent collecting diamonds on the three levels of the game.
Find the probability that a randomly chosen player
b) collects all three diamonds, [$\frac{4}{15}$]
c) collects only one diamond. [$\frac{7}{30}$]
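
The tree diagram in question 11 has eight branches, and each branch probability is a product of three factors. The sketch below enumerates them with exact fractions; `outcome_probability` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product

# Probability of collecting the diamond on levels 1, 2 and 3
p = [Fraction(4, 5), Fraction(2, 3), Fraction(1, 2)]

def outcome_probability(outcome):
    """Multiply along one branch of the tree; True means 'diamond collected'."""
    result = Fraction(1)
    for collected, p_level in zip(outcome, p):
        result *= p_level if collected else 1 - p_level
    return result

branches = list(product([True, False], repeat=3))

p_all_three = outcome_probability((True, True, True))
p_exactly_one = sum(outcome_probability(b) for b in branches if sum(b) == 1)

print(p_all_three, p_exactly_one)
```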

12. The security passes for a certain company are colored yellow or white and are provided with either a clip to go on
a pocket or a chain to be worn around the neck. The probability that a pass has a clip is $\frac{6}{10}$, and $\frac{2}{3}$ of the white
passes and $\frac{4}{7}$ of the yellow passes are fitted with clips. A member of the company is stopped on his way into
work and his pass is checked. Find the probability that
a) the pass is yellow (0.7)
b) the pass is yellow with a chain. (0.3)

13. The accountant of a company monitors the number of items produced per month by the company, together with
the total cost of production. The following table shows the data collected for a random sample of 12 months.

Number of items (x) (1000s)    21  39  48  24  72  75  15  35  62  81   12  56
Production cost (y) (£1000)    40  58  67  45  89  96  37  53  83  102  35  75

a) Explain which variable is the explanatory variable and which is the response variable.
b) Calculate the values of $S_{xx}$, $S_{yy}$ and $S_{xy}$, and hence find the value of the p.m.c.c. (r).
c) Explain why this value of r would support the fitting of a regression equation of y on x.
d) Interpret the value of the p.m.c.c. (r).
e) Find the regression equation of y on x.
f) If x and y are coded as $x = p + 10$ and $y = c + 30$, write down the regression equation of c on p and the
p.m.c.c. between p and c.
g) Estimate the value of c if $p = 18$ and comment on the validity of the estimate.
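
The calculations in question 13 can be sketched in Python as below. The worksheet gives no answers for this question, so the printed values are only what this sketch produces; the names `a_coded` and `c_at_18` are illustrative.

```python
from math import sqrt

x = [21, 39, 48, 24, 72, 75, 15, 35, 62, 81, 12, 56]     # items (1000s)
y = [40, 58, 67, 45, 89, 96, 37, 53, 83, 102, 35, 75]    # cost (£1000)
n = len(x)

# b) Summary statistics and the product moment correlation coefficient
sxx = sum(v * v for v in x) - sum(x) ** 2 / n
syy = sum(v * v for v in y) - sum(y) ** 2 / n
sxy = sum(u * v for u, v in zip(x, y)) - sum(x) * sum(y) / n
r = sxy / sqrt(sxx * syy)

# e) Regression line of y on x: y = a + b x
b = sxy / sxx
a = sum(y) / n - b * sum(x) / n

# f) With x = p + 10 and y = c + 30, substitute into y = a + b x:
#    c + 30 = a + b (p + 10), so c = (a + 10 b - 30) + b p.
#    The p.m.c.c. is unchanged by adding or subtracting constants.
a_coded = a + 10 * b - 30

# g) Estimate of c at p = 18, which corresponds to x = 28
#    (inside the observed range of x, 12 to 81)
c_at_18 = a_coded + b * 18

print(sxx, syy, sxy, round(r, 4))
print(round(a, 3), round(b, 3), round(a_coded, 3), round(c_at_18, 2))
```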

14. A long distance lorry driver recorded the distance travelled, m miles, and the amount of fuel used, f litres,
each day. Summarised below are data from the driver’s records for a random sample of 8 days.
The data are coded such that x = m – 250 and y = f – 100.
$\sum x = 130$, $\sum y = 48$, $\sum xy = 8880$, $S_{xx} = 20487.5$

a. Find the equation of the regression line of y on x in the form y = a + bx. [𝑦 = −0.425 + 0.395𝑥]
b. Hence find the equation of the regression line of f on m. [𝑓 = 0.825 + 0.395𝑚]
c. Predict the amount of fuel used on a journey of 235 miles. [93.65 litres]
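
Question 14 can be worked from the summary statistics alone. The sketch below keeps a and b unrounded, which gives an intercept for f on m of about 0.73; the bracketed 0.825 is what results if b is rounded to 0.395 before the coding is undone. The prediction for 235 miles agrees with the bracketed answer to within 0.01.

```python
n = 8
sum_x, sum_y, sum_xy, sxx = 130, 48, 8880, 20487.5

# a. Regression line of y on x: y = a + b x
sxy = sum_xy - sum_x * sum_y / n
b = sxy / sxx
a = sum_y / n - b * sum_x / n

# b. Undo the coding x = m - 250, y = f - 100:
#    f - 100 = a + b (m - 250), so f = (a + 100 - 250 b) + b m
intercept_f = a + 100 - 250 * b

# c. Prediction for a 235-mile journey
f_235 = intercept_f + b * 235

print(round(a, 3), round(b, 3), round(intercept_f, 3), round(f_235, 2))
```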

➢ Required Formulae (these formulae will not be given in the test):

1. Variance:
   $\sigma^2 = \frac{\sum x^2}{n} - \mu^2$, where $\mu = \frac{\sum x}{n}$, for Type 1
   $\sigma^2 = \frac{\sum fx^2}{\sum f} - \mu^2$, where $\mu = \frac{\sum fx}{\sum f}$, for Type 2 & 3
   Note: For Type 3, x is the midpoint of the class.

2. Standard deviation: $\sigma = \sqrt{\text{Variance}}$

3. An outlier is any value which is
   • greater than (upper quartile + 1.5 × interquartile range)
   and / or
   • less than (lower quartile − 1.5 × interquartile range)

4. $S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}$, $S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n}$, $S_{xy} = \sum xy - \frac{\sum x \sum y}{n}$, $r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$

5. a. Regression equation of y on x:
      $y = a + bx$, where $b = \frac{S_{xy}}{S_{xx}}$ and $a = \bar{y} - b\bar{x}$
   b. Regression equation of x on y:
      $x = a + by$, where $b = \frac{S_{xy}}{S_{yy}}$ and $a = \bar{x} - b\bar{y}$
   where $\bar{y} = \frac{\sum y}{n}$ and $\bar{x} = \frac{\sum x}{n}$
   c. Regression equation of h on s:
      $h = a + bs$, where $b = \frac{S_{hs}}{S_{ss}}$ and $a = \bar{h} - b\bar{s}$,
      with $\bar{h} = \frac{\sum h}{n}$ and $\bar{s} = \frac{\sum s}{n}$

➢ Venn Diagram Related Problem (Notation based) :

Relations :
1. $P(A \cup B)$ = Probability of event A or B or both.

   $P(A \cup B) = P(A) + P(B) - P(A \cap B)$


2. We can use these shortcuts (we don’t need to shade Venn Diagram).
a) $P(A' \cap B') = 1 - P(A \cup B)$
b) $P(A' \cup B') = 1 - P(A \cap B)$
c) $P(A' \cup B) = 1 - P(A \cap B')$
3. $P(A') = P(\text{not } A) = 1 - P(A)$ or $P(B') = P(\text{not } B) = 1 - P(B)$
