TD1vf Pres


Université Internationale de Rabat Academic Year 2024–2025

School of Aerospace & Automotive Engineering Calculus III

Tutorial 1
Univariate and Bivariate Descriptive Statistics

Questions. Among these statements, specify which are true and which are
false :
1. A variable is a characteristic being studied.
2. The task of descriptive statistics is to collect data.
3. The task of descriptive statistics is to present data in the form of tables,
graphs, and statistical indicators.
4. The values of variables are also called modalities.
5. For a qualitative variable, each statistical individual can have only one
modality.

School of Aerospace & Automotive Engineering (1/41) UIR/ Pr. M. Fihri


6. In order to perform statistical analyses, a quantitative variable may sometimes
be transformed into a qualitative variable.



Solution –
1) True
2) False
3) True
4) True
5) True
6) True



Exercise 1. Give the nature of each of the following variables :
(i) Satisfaction level regarding a summer stay at a vacation center (Very
high ; High ; Moderate ; Low ; None).
(ii) A student’s semester average at the School of Pharmacy.
(iii) The time spent by a second-year student reading or watching the news.
(iv) The section (A, B, C, ...) of the biology class in which a student is
enrolled.
(v) The number of phone calls received by a student in a day.



Solution –
(i) Satisfaction level regarding a summer stay at a vacation center (Very
high ; High ; Moderate ; Low ; None) :
This is an ordinal qualitative variable since there is an established
hierarchy among the satisfaction levels.
(ii) A student’s semester average at the School of Pharmacy :
This is a continuous quantitative variable, taking values in the interval
[0, 20] ⊂ R+.
(iii) The time spent by a second-year student reading or watching the
news :
This is a continuous quantitative variable, taking values in the interval
[0, t], t > 0.
(iv) The section (A, B, C, ...) of the biology class in which a student is
enrolled :
This is a nominal qualitative variable.
(v) The number of phone calls received by a student in a day :
This is a discrete quantitative variable, taking values in {0, 1, 2, . . . , n},
with n ∈ N∗.



Exercise 2. A microprocessor manufacturing company is testing a new
technology. It counted the number of defects on 40000 components and found
the following results :
Number of defects xi   ni   Ni      fi   Fi
2                           5500
3                           12000
4                           22000
5                           30000
6                           36000
7                           40000
Σ                           −            −
The columns are defined as follows :
– ni : frequency ;
– Ni : cumulative frequency ;
– fi : relative frequency ;
– Fi : cumulative relative frequency.
1) Specify the studied variable and its nature (denoted as X).
2) Copy and complete the table above.
3) What is the modal number of defects for this new technology ?
4) What is the median number of defects for this new technology ?
5) What is the mean number of defects for this new technology ?
6) Draw the cumulative frequency curve.
7) Another independent database was created to collect information on the
number of defects of this new technology (denoted as Y ), with the following
indicators : ȳ = 2 ; V(Y ) = 3.
Given that V(X) ≃ 2.28, calculate the mean and variance of W = X +2Y .



Solution –
1. The studied variable is X, representing the number of defects. It
is a quantitative discrete variable.
2. Completed table :
xi ni Ni fi Fi
2 5500 5500 0.14 0.14
3 6500 12000 0.16 0.30
4 10000 22000 0.25 0.55
5 8000 30000 0.20 0.75
6 6000 36000 0.15 0.90
7 4000 40000 0.10 1.00
Σ 40000 − 1 −
3. The modal number of defects is 4.
4. The median number of defects is also 4 : n × 1/2 = 20000, and the first
cumulative frequency to reach 20000 is 22000 (at xi = 4), so Me = 4.
5. The mean number of defects is x̄ ≃ 4.36.
6. Draw the cumulative frequency curve (increasing step function).
7. For W = X + 2Y , the two databases being independent, the covariance
term vanishes and the calculations are as follows :
w̄ = x̄ + 2ȳ = 4.36 + 2 × 2 = 8.36
V(W ) = V(X) + 4V(Y ) = 2.28 + 4 × 3 = 14.28
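The steps of this solution can be checked numerically. The following is a minimal sketch, assuming the numbers in the statement are the cumulative counts Ni; the variable names are illustrative and not part of the original sheet.

```python
# Sketch: rebuild the Exercise 2 table from the cumulative counts Ni.
values = [2, 3, 4, 5, 6, 7]
cum_counts = [5500, 12000, 22000, 30000, 36000, 40000]  # Ni from the statement

n = cum_counts[-1]
# ni is the difference between consecutive cumulative counts.
counts = [cum_counts[0]] + [b - a for a, b in zip(cum_counts, cum_counts[1:])]
rel_freqs = [c / n for c in counts]

# Mode: value with the largest count.
mode = values[counts.index(max(counts))]
# Median of a discrete series: first value whose Ni reaches n/2
# (assumes no Ni equals n/2 exactly, which holds here).
median = next(x for x, N in zip(values, cum_counts) if N >= n / 2)

mean = sum(x * c for x, c in zip(values, counts)) / n
variance = sum(c * x**2 for x, c in zip(values, counts)) / n - mean**2

# W = X + 2Y with mean(Y) = 2 and V(Y) = 3; covariance taken as zero
# because the two databases are described as independent.
mean_w = mean + 2 * 2
var_w = variance + 4 * 3

print(counts, mode, median, round(mean, 2), round(variance, 2))
print(round(mean_w, 2), round(var_w, 2))
```

Computing with unrounded values gives the same two-decimal results as the solution above.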



Exercise 3. A survey asked 300 middle school students about their favorite
fruit among the six most commonly consumed fruits in Morocco : banana,
nectarine, orange, peach, pear, and apple. The following results were obtained :
Fruit Banana Nectarine Orange Peach Pear Apple
Count (frequency) 72 33 30 36 45 84
1) Identify the variable and specify its nature.
2) Construct the relative frequency table.
3) Is there a mode ? If yes, identify it.
4) Create two graphical representations of this variable.



Solution –
1. The variable is "favorite fruit", observed on 300 middle school students.
This is a nominal qualitative variable.
2. Relative frequency table :
Fruit       Count ni   Relative frequency fi
Banana      72         0.24
Nectarine   33         0.11
Orange      30         0.10
Peach       36         0.12
Pear        45         0.15
Apple       84         0.28
Total       300        1
3. The mode is "apple", which has the highest count (and therefore the
highest relative frequency).
4. (i) Create a bar chart of the counts (or relative frequencies).
(ii) Create a pie chart : αi = fi × 360 is the number of degrees
measuring the sector for modality i.
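The relative frequencies and pie-chart angles can be computed as in the sketch below. It only uses the counts from the statement; the dictionary and variable names are illustrative assumptions.

```python
# Sketch: relative frequencies, mode and pie-chart angles for Exercise 3.
counts = {
    "Banana": 72, "Nectarine": 33, "Orange": 30,
    "Peach": 36, "Pear": 45, "Apple": 84,
}

n = sum(counts.values())
rel_freq = {fruit: c / n for fruit, c in counts.items()}
# Sector angle in degrees: alpha_i = f_i * 360.
angles = {fruit: f * 360 for fruit, f in rel_freq.items()}
# Mode of a qualitative variable: modality with the largest count.
mode = max(counts, key=counts.get)

for fruit in counts:
    print(f"{fruit:10s} {counts[fruit]:4d} {rel_freq[fruit]:.2f} {angles[fruit]:6.1f}")
print("mode:", mode)
```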
Exercise 4. The following data represents electricity costs (in DH) during
the month of March for a sample of 50 small apartments in a large city :
80 90 95 96 102 108 109 111 114 116
119 123 127 128 129 130 130 135 137 139
141 143 144 147 148 149 149 150 151 153
154 157 158 163 165 166 167 168 171 172
175 178 183 185 187 191 197 202 206 220
1) Calculate the quartiles Q1, Q2, and Q3 of this sample.
2) Group the data into equal-width classes, then construct the frequency
and cumulative frequency table.
3) Plot the histogram and frequency polygon.



Solution –
1) Using the procedure covered in class on the ordered data :
– Calculation of Q1 : np = 50 × 0.25 = 12.5, which is not an integer,
so :
\[ Q_1 = x_{(\lceil 12.5 \rceil)} = x_{(13)} = 127 \]
– Calculation of Me = Q2 : np = 50 × 0.5 = 25, which is an integer,
so :
\[ Me = \frac{x_{(np)} + x_{(np+1)}}{2} = \frac{x_{(25)} + x_{(26)}}{2} = \frac{148 + 149}{2} = 148.5 \]
– Calculation of Q3 : np = 50 × 0.75 = 37.5, which is not an integer,
so :
\[ Q_3 = x_{(\lceil 37.5 \rceil)} = x_{(38)} = 168 \]
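The quartile rule used above can be sketched as a small function. This is an illustration of the order-statistic convention stated in the solution (average two values when np is an integer, otherwise take the value at rank ⌈np⌉); the function name `quantile` is a hypothetical helper, and other textbooks or libraries use different conventions.

```python
import math

# Electricity costs (in DH) from the statement, already sorted.
data = [
    80, 90, 95, 96, 102, 108, 109, 111, 114, 116,
    119, 123, 127, 128, 129, 130, 130, 135, 137, 139,
    141, 143, 144, 147, 148, 149, 149, 150, 151, 153,
    154, 157, 158, 163, 165, 166, 167, 168, 171, 172,
    175, 178, 183, 185, 187, 191, 197, 202, 206, 220,
]

def quantile(sorted_data, p):
    """Quantile of order p (0 < p < 1) with the rule used in the solution."""
    n = len(sorted_data)
    pos = n * p
    if pos == int(pos):
        k = int(pos)
        # np is an integer: average x(np) and x(np+1) (1-based ranks).
        return (sorted_data[k - 1] + sorted_data[k]) / 2
    # Otherwise take the value at rank ceil(np).
    return sorted_data[math.ceil(pos) - 1]

q1 = quantile(data, 0.25)
me = quantile(data, 0.50)
q3 = quantile(data, 0.75)
print(q1, me, q3)
```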


2) The Sturges’ formula gives the number of classes :

k = 1 + 3.33 log10 N ≃ 1 + 3.33 log10(50) ≃ 7 classes.


The range is calculated as : e = xmax − xmin = 220 − 80 = 140.
The class width (amplitude) is calculated as : A = e/k = 140/7 = 20.
The resulting table (the last class is closed so that it contains xmax = 220) is :
Class        ni   fi     Fi
[80, 100[    4    0.08   0.08
[100, 120[   7    0.14   0.22
[120, 140[   9    0.18   0.40
[140, 160[   13   0.26   0.66
[160, 180[   9    0.18   0.84
[180, 200[   5    0.10   0.94
[200, 220]   3    0.06   1
Total        50   1      −
3) Plot the histogram, followed by the frequency polygon (on the same
figure).



Exercise 5. In a survey conducted among students at UIR, the investigator
recorded the time (in minutes) taken by each respondent to reach the university.
The following table summarizes the observed times :
Class [21; 22[ [22; 23[ [23; 24[ [24; 26[ [26; 30[
Count (freq) ni 50 90 70 60 40
1) Construct the cumulative relative frequency table.
2) Represent the data using a histogram and plot the relative frequency
polygon.
3) Calculate the mean and variance.
4) Calculate the quartiles.
5) Is this distribution symmetric or skewed ?



Solution –
1) The following frequency table is obtained :
Classes   ni    fi     F(x)   fic    ci     fi ci    ni (ci − x̄)²
[21,22[   50    0.16   0.16   0.16   21.5   3.44     255.38
[22,23[   90    0.29   0.45   0.29   22.5   6.525    142.884
[23,24[   70    0.23   0.68   0.23   23.5   5.405    4.732
[24,26[   60    0.19   0.87   0.09   25     4.75     92.256
[26,30[   40    0.13   1      0.03   28     3.64     719.104
Total     310   1      -      -      -      23.76    1214.356
Here ni is the count, fi the relative frequency, F(x) the cumulative
relative frequency, ci the class center, and fic the corrected frequency
for class i, defined as :
\[ f_i^c = f_i \times \frac{a_0}{a_i}. \]
– a0 is the base amplitude (here 1).
– ai is the amplitude of class i.
– The amplitude of class [xi, xi+1[ is ai = xi+1 − xi.
2) Histogram and frequency polygon (see figure below) :



Figure 1 – Histogram and frequency polygon of the distribution

3) The mean is given by :
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{5} n_i c_i = \sum_{i=1}^{5} f_i c_i = 23.76 \text{ min}, \]
and the variance by :
\[ s^2 = \frac{1}{n} \sum_{i=1}^{5} n_i (c_i - \bar{x})^2 = \frac{1214.356}{310} \simeq 3.92 \text{ min}^2. \]
4) To calculate the quartiles, use the following formula for j = 1, 2, 3 :
\[ Q_j = x_i + (x_{i+1} - x_i) \times \frac{F(Q_j) - F(x_i)}{F(x_{i+1}) - F(x_i)}, \quad Q_j \in [x_i, x_{i+1}[. \]
In this case :
\[ Q_1 = 22 + (23 - 22) \times \frac{0.25 - 0.16}{0.45 - 0.16} \simeq 22.31, \]
\[ Q_2 = 23 + (24 - 23) \times \frac{0.5 - 0.45}{0.68 - 0.45} \simeq 23.22, \]
\[ Q_3 = 24 + (26 - 24) \times \frac{0.75 - 0.68}{0.87 - 0.68} \simeq 24.75. \]
5) The Pearson skewness coefficient is :
\[ \gamma_3 = \frac{3(\bar{x} - Q_2)}{s} = \frac{3(23.76 - 23.22)}{\sqrt{3.92}} \simeq 0.82. \]
Since γ3 is positive, the distribution is skewed to the right.
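The grouped-data calculations of this exercise can be sketched as follows. The code works with unrounded frequencies, so a second decimal may differ slightly from the hand calculation (for instance Q2 comes out near 23.21 rather than 23.22); the helper `grouped_quantile` is an illustrative name, not something defined in the course.

```python
import math

bounds = [21, 22, 23, 24, 26, 30]   # class limits from the statement
counts = [50, 90, 70, 60, 40]       # ni

n = sum(counts)
centers = [(a + b) / 2 for a, b in zip(bounds, bounds[1:])]
freqs = [c / n for c in counts]

# Cumulative relative frequency at each class limit: F(21) = 0, ..., F(30) = 1.
cum = [0.0]
for f in freqs:
    cum.append(cum[-1] + f)

mean = sum(f * c for f, c in zip(freqs, centers))
variance = sum(f * (c - mean) ** 2 for f, c in zip(freqs, centers))

def grouped_quantile(p):
    """Linear interpolation inside the class whose cumulative range contains p."""
    for i in range(len(counts)):
        if cum[i] <= p < cum[i + 1]:
            lo, hi = bounds[i], bounds[i + 1]
            return lo + (hi - lo) * (p - cum[i]) / (cum[i + 1] - cum[i])
    return bounds[-1]

q1, q2, q3 = (grouped_quantile(p) for p in (0.25, 0.5, 0.75))
# Pearson skewness coefficient as used in the solution: 3 (mean - median) / s.
skew = 3 * (mean - q2) / math.sqrt(variance)

print(round(mean, 2), round(variance, 2))
print(round(q1, 2), round(q2, 2), round(q3, 2), round(skew, 2))
```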
Exercise 6. A budget study yielded the following results :

Budget             [800, 1000[   [1000, 1400[   [1400, 1600[   [1600, y[   [y, 2400[   [2400, x[
Cumulative Freq.   0.08          0.18           0.34           0.64        0.73        1
PART 1 : Some data is missing.
1) Calculate the missing upper bound x, given that the range of the series
is 3200 euros.
2) Calculate the missing bound y in the following two cases :
a) The average budget is 1995 euros.
b) The median budget is 1920 euros.
PART 2 : Now, consider the missing bound y to be 2000 euros.
3) Provide a graphical representation of the budget distribution.
4) Calculate the average and median budget.
5) Find the counts (frequencies) ni corresponding to each budget class, as
well as the total count (size) n, given that :
\[ \sum_{i=1}^{k} n_i c_i^2 = 4741200000 \quad \text{and} \quad V(X) = 604044, \]
where k is the number of classes.



Solution –
PART 1 : Some data is missing.
1. The missing upper bound x = xmax is calculated as follows, knowing
that the range e of the series is 3200 euros :
The range e = xmax − xmin gives :

3200 = xmax − 800 =⇒ x = xmax = 4000.


2. The missing bound y is calculated in the following two cases :
(a) The average budget is 1995 euros :
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{k} n_i c_i = \sum_{i=1}^{k} f_i c_i \]
First, we calculate the frequencies from the cumulative frequencies
in the previous table :
Classes   [800, 1000[   [1000, 1400[   [1400, 1600[   [1600, y[   [y, 2400[   [2400, 4000[
Fi        0.08          0.18           0.34           0.64        0.73        1
fi        0.08          0.1            0.16           0.3         0.09        0.27
Therefore :
\[ \bar{x} = \sum_{i=1}^{k} f_i c_i = 1995 \]
That is :
\[ 0.08 \times 900 + 0.1 \times 1200 + 0.16 \times 1500 + 0.3 \times \frac{1600 + y}{2} + 0.09 \times \frac{y + 2400}{2} + 0.27 \times 3200 = 1995, \]
which simplifies to 1644 + 0.195 y = 1995.
Solving gives : y = 1800.
(b) The median budget is 1920 euros :
We use linear interpolation on the interval [1600, y[ :
\[ \frac{1920 - 1600}{y - 1600} = \frac{0.5 - 0.34}{0.64 - 0.34}. \]
Solving gives : y = 2200.
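Both values of y in Part 1 can be checked with a short sketch. It relies on the fact that the mean is an affine function of y, so two evaluations give its slope and intercept; the function name `mean_budget` is a hypothetical helper introduced only for this illustration.

```python
# Relative frequencies deduced from the cumulative frequencies of the statement.
freqs = [0.08, 0.10, 0.16, 0.30, 0.09, 0.27]

def mean_budget(y):
    """Mean of the grouped series when the unknown class limit equals y."""
    centers = [900, 1200, 1500, (1600 + y) / 2, (y + 2400) / 2, 3200]
    return sum(f * c for f, c in zip(freqs, centers))

# Case (a): mean = 1995. The mean is affine in y: mean = intercept + slope * y.
intercept = mean_budget(0)
slope = mean_budget(1) - intercept
y_from_mean = (1995 - intercept) / slope

# Case (b): median = 1920, from
# (1920 - 1600) / (y - 1600) = (0.5 - 0.34) / (0.64 - 0.34).
y_from_median = 1600 + (1920 - 1600) * (0.64 - 0.34) / (0.5 - 0.34)

print(round(y_from_mean), round(y_from_median))
```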
PART 2 : Now, consider the missing bound y to be 2000 euros.
3. Provide a graphical representation of the budget distribution.
Plot the histogram : To provide the correct graphical representation
of the distribution, we first correct the frequencies since the class
widths are not equal :
\[ f_i^c = \frac{f_i}{a_i} \times \alpha \]
where α is the scale factor, equal to the value of the smallest or most
common class width. Here, we use α = 400.
Classes   [800, 1000[   [1000, 1400[   [1400, 1600[   [1600, 2000[   [2000, 2400[   [2400, 4000[
ai        200           400            200            400            400            1600
fi        0.08          0.1            0.16           0.3            0.09           0.27
fic       0.16          0.1            0.32           0.3            0.09           0.0675
4. Calculate the average and median budget.
\[ \bar{x} = \sum_{i=1}^{k} f_i c_i = 0.08 \times 900 + 0.1 \times 1200 + 0.16 \times 1500 + 0.3 \times 1800 + 0.09 \times 2200 + 0.27 \times 3200 = 2034 \]
The median budget falls in the interval [1600, 2000[. Using linear
interpolation :
\[ \frac{Me - 1600}{2000 - 1600} = \frac{0.5 - 0.34}{0.64 - 0.34}. \]
Solving gives : Me ≃ 1813.
5. Find the counts (frequencies) ni corresponding to each budget class
as well as the total count (size) n, given that :
\[ \sum_{i=1}^{k} n_i c_i^2 = 4741200000 \quad \text{and} \quad V(X) = 604044, \]
with x̄ = 2034 from the previous question.
Using :
\[ V(X) = \frac{1}{n} \sum_{i=1}^{k} n_i (c_i - \bar{x})^2 = \frac{1}{n} \sum_{i=1}^{k} n_i c_i^2 - \bar{x}^2, \]
we get :
\[ 604044 = \frac{1}{n} \times 4741200000 - (2034)^2 \implies n = 1000. \]
To calculate the class counts, apply the formula : ni = fi × n, which
gives 80, 100, 160, 300, 90 and 270.
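The computation of n and of the class counts can be verified with the sketch below, which rearranges the variance identity V(X) = (1/n) Σ ni ci² − x̄². Variable names are illustrative.

```python
# Given quantities from the statement and the previous question.
sum_n_c2 = 4_741_200_000   # sum of ni * ci^2
variance = 604_044         # V(X)
mean = 2034                # mean budget when y = 2000

# V = sum_n_c2 / n - mean^2  =>  n = sum_n_c2 / (V + mean^2)
n = sum_n_c2 / (variance + mean**2)

freqs = [0.08, 0.10, 0.16, 0.30, 0.09, 0.27]
# Class counts ni = fi * n, rounded to absorb floating-point noise.
counts = [round(f * n) for f in freqs]

print(n, counts)
```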



Exercise 7. We consider the statistics (for ”electricity costs (in DH)”
during the month of March for a sample of 50 small apartments in a large
city) given in Exercise 4.
1) Construct the box plot.
2) Interpret the results.
Solution –
Box plot parameters : (see Exercise 4)
– Q1 = 127
– Me = 148.5
– Q3 = 168
– IQR = Q3 − Q1 = 168 − 127 = 41
– a = max(Q1 − 1.5 × IQR, xmin) = max(65.5, 80) = 80
– b = min(Q3 + 1.5 × IQR, xmax) = min(229.5, 220) = 220
This distribution shows no outliers.
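The box plot parameters can be reproduced as follows. This is a sketch assuming the usual 1.5 × IQR fences, as in the parameters listed above, with Q1 and Q3 taken from Exercise 4.

```python
data = [
    80, 90, 95, 96, 102, 108, 109, 111, 114, 116,
    119, 123, 127, 128, 129, 130, 130, 135, 137, 139,
    141, 143, 144, 147, 148, 149, 149, 150, 151, 153,
    154, 157, 158, 163, 165, 166, 167, 168, 171, 172,
    175, 178, 183, 185, 187, 191, 197, 202, 206, 220,
]

q1, q3 = 127, 168            # quartiles found in Exercise 4
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whisker ends: the fences, clipped to the observed minimum and maximum.
a = max(lower_fence, min(data))
b = min(upper_fence, max(data))

# Observations outside the fences would be flagged as outliers.
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(iqr, lower_fence, upper_fence, a, b, outliers)
```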



Exercise 8. Consider the following table that provides the distribution of
the pair (X, Y ).
X \ Y        0    1
[0.5, 1.5[   21   8
[1.5, 2.5[   23   15
[2.5, 3.5[   10   23
1) What are the marginal distributions of X and Y ?
2) Calculate the marginal means and variances of X and Y .
3) Calculate the marginal coefficient of variation of Y . Interpret.
4) Are X and Y independent ?
5) Calculate the mean and variance of the variable Z = 0.165X + 0.13Y .



Solution –
1. The marginal distribution of X is given in the following table :
X Count (freq ni)
[0.5, 1.5[ 29
[1.5, 2.5[ 38
[2.5, 3.5[ 33
Σ 100
The marginal distribution of Y is given in the following table :
Y Count (freq ni)
0 54
1 46
Σ 100
2. We find :
\[ \bar{x} = \frac{1}{100} \sum_{i=1}^{3} n_{i\cdot} c_i = \frac{29 \times 1 + 38 \times 2 + 33 \times 3}{100} = 2.04, \]
\[ \bar{y} = \frac{1}{100} \sum_{j=1}^{2} n_{\cdot j} y_j = \frac{54 \times 0 + 46 \times 1}{100} = 0.46, \]
\[ V(X) = s_x^2 = \left( \frac{1}{100} \sum_{i=1}^{3} n_{i\cdot} c_i^2 \right) - \bar{x}^2 = 4.78 - 2.04^2 = 0.6184, \]
\[ V(Y) = s_y^2 = \left( \frac{1}{100} \sum_{j=1}^{2} n_{\cdot j} y_j^2 \right) - \bar{y}^2 = 0.46 - 0.46^2 = 0.2484. \]


3. The coefficient of variation for Y is :
\[ CV_Y = \frac{s_y}{\bar{y}} = \frac{\sqrt{0.2484}}{0.46} = 1.083473 \simeq 108\%. \]
The distribution of Y is heterogeneous.
4. The variables X and Y are independent if and only if :
\[ n_{ij} = \frac{n_{i\cdot} \times n_{\cdot j}}{n}, \quad \forall i = 1, 2, 3 \text{ and } j = 1, 2. \]
X \ Y        0    1    Σ
[0.5, 1.5[   21   8    29
[1.5, 2.5[   23   15   38
[2.5, 3.5[   10   23   33
Σ            54   46   100
However, we have a counterexample :
\[ n_{21} = 23 \neq \frac{n_{2\cdot} \times n_{\cdot 1}}{n} = \frac{38 \times 54}{100} = 20.52, \]
so the variables X and Y are dependent.
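5. The mean of Z = 0.165X + 0.13Y is 0.165 × 2.04 + 0.13 × 0.46 ≃ 0.3964,
and its variance is :
\[ V(Z) = V(0.165X + 0.13Y) = 0.165^2 V(X) + 0.13^2 V(Y) + 2 \times 0.165 \times 0.13 \times \mathrm{cov}(X, Y), \]
with the covariance between X and Y :
\[ s_{xy} = \mathrm{cov}(X, Y) = \left( \frac{1}{100} \sum_{i=1}^{3} \sum_{j=1}^{2} n_{ij} c_i y_j \right) - \bar{x} \times \bar{y} = 0.1316. \]
This gives V(Z) ≃ 0.0267.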
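The marginal and joint calculations of this exercise can be sketched from the contingency table. The code uses class centers 1, 2, 3 for X, as in the solution; all names are illustrative.

```python
centers = [1, 2, 3]          # centers of [0.5,1.5[, [1.5,2.5[, [2.5,3.5[
y_vals = [0, 1]
table = [[21, 8],
         [23, 15],
         [10, 23]]           # n_ij, rows = classes of X, columns = values of Y

n = sum(sum(row) for row in table)
row_tot = [sum(row) for row in table]            # marginal counts of X
col_tot = [sum(col) for col in zip(*table)]      # marginal counts of Y

mean_x = sum(r * c for r, c in zip(row_tot, centers)) / n
mean_y = sum(t * y for t, y in zip(col_tot, y_vals)) / n
var_x = sum(r * c**2 for r, c in zip(row_tot, centers)) / n - mean_x**2
var_y = sum(t * y**2 for t, y in zip(col_tot, y_vals)) / n - mean_y**2

cov = sum(
    table[i][j] * centers[i] * y_vals[j]
    for i in range(len(centers))
    for j in range(len(y_vals))
) / n - mean_x * mean_y

# Independence requires n_ij * n == n_i. * n_.j for every cell.
independent = all(
    table[i][j] * n == row_tot[i] * col_tot[j]
    for i in range(len(centers))
    for j in range(len(y_vals))
)

# Z = a X + b Y
a, b = 0.165, 0.13
mean_z = a * mean_x + b * mean_y
var_z = a**2 * var_x + b**2 * var_y + 2 * a * b * cov

print(row_tot, col_tot, independent)
print(round(cov, 4), round(mean_z, 4), round(var_z, 4))
```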



Exercise 9. A bank manager wants to know if there is a relationship between
the annual income X and the amount of money Y allocated to savings. For
a sample of 10 families, he obtained the following results (in 10^4 DH) :
X 12 15 13 10 10 14 16 18 16 14
Y 0.2 1.2 1.0 0.7 0.3 1.0 1.6 1.4 1.2 0.7
1) Calculate the means x̄, ȳ and the standard deviations sx, sy .
2) Calculate the correlation coefficient and interpret the result.
3) Find the expression of the regression line of Y on X.
4) Estimate the amount of money allocated to savings by a family with an
income of 110,000 DH.
Solution –
1. We find :
\[ \bar{x} = \frac{1}{10} \sum_{i=1}^{10} x_i = \frac{138}{10} = 13.8, \]
\[ \bar{y} = \frac{1}{10} \sum_{i=1}^{10} y_i = \frac{9.3}{10} = 0.93, \]
\[ s_x^2 = \frac{1}{10} \sum_{i=1}^{10} (x_i - \bar{x})^2 = 6.16, \quad \text{so } s_x \simeq 2.48, \]
\[ s_y^2 = \frac{1}{10} \sum_{i=1}^{10} (y_i - \bar{y})^2 = 0.1861, \quad \text{so } s_y \simeq 0.43. \]
2. The correlation coefficient is ρ = sxy / (sx sy), where
\[ \mathrm{Cov}(X, Y) = s_{xy} = \frac{1}{10} \sum_{i=1}^{10} (x_i - \bar{x})(y_i - \bar{y}) = 0.886. \]
Thus, ρ ≃ 0.8275, indicating a strong positive relationship between Y
and X.
3. Since ρ > 0.8, we can determine the regression line of Y on X.
The equation of the regression line of Y on X is given by :
y = ax + b,
where
\[ a = \frac{s_{xy}}{s_x^2} \quad \text{and} \quad b = \bar{y} - a\bar{x}. \]
This gives :
a ≃ 0.1438 and b ≃ −1.0549.
Therefore, the regression line is :
y = 0.1438 × x − 1.0549.
4. If x = 11 (i.e., an income of 110,000 DH), then :
y ≃ 0.5273,
which means approximately 5273 DH is allocated to savings for this
family.
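The correlation and regression calculations of Exercise 9 can be sketched in plain Python. The code uses the population formulas (division by n) as in the solution; variable names are illustrative.

```python
import math

# Income X and savings Y, both in 10^4 DH, from the statement.
x = [12, 15, 13, 10, 10, 14, 16, 18, 16, 14]
y = [0.2, 1.2, 1.0, 0.7, 0.3, 1.0, 1.6, 1.4, 1.2, 0.7]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Variances and covariance with division by n, as in the solution.
var_x = sum((xi - mean_x) ** 2 for xi in x) / n
var_y = sum((yi - mean_y) ** 2 for yi in y) / n
cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n

rho = cov / math.sqrt(var_x * var_y)

# Regression line of Y on X: y = a x + b.
a = cov / var_x
b = mean_y - a * mean_x

# Predicted savings (in 10^4 DH) for an income of 11 x 10^4 DH.
predicted = a * 11 + b

print(round(rho, 4), round(a, 4), round(b, 4), round(predicted, 4))
```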

Additional Exercises

Exercise 10. Viewers are invited to rate a show by sending a message
containing one of the letters A, B, C, or D, representing respectively "very
good show," "good show," "bad show," and "very bad show." Below are the
ratings from 32 viewers :
B, B, A, C, A, D, A, A, B, C, D, D, C, A, B, B, C, A, D, C, A, A, B, A, C,
D, B, B, C, D, B, A



1) Describe the variable.
2) Create a frequency and relative frequency distribution table.
3) Plot a suitable graphical representation.
Exercise 11. The call durations, in minutes and rounded to the nearest
whole number, for 22 calls recorded at a call center are listed in the table
below :
10 12 14 14 15 15 16 16 17 17 17
18 18 18 19 19 20 20 21 22 23 24
1) Create the frequency and cumulative frequency table using five classes.
2) Construct the histogram for this statistical series.
3) Plot the cumulative frequency curve.
4) What value do 25% of the observed durations fall below ?
5) Calculate the mean and the standard deviation for this distribution.
6) Calculate the coefficient of variation. Is this statistical distribution
homogeneous ?
Solution –
1) The common class width for these 5 classes is given by
\[ a \simeq \frac{e}{k} = \frac{24 - 10}{5} = 2.8 \quad \text{(which can be rounded to 3)}, \]
where k = 1 + 3.322 log10 22 = 1 + 3.322 × 1.342 = 5.459, rounded to
5, and then we construct the classes. We get the following frequency
table :

Class      ni   fi     Fi
[10, 13[   2    0.09   0.09
[13, 16[   4    0.18   0.27
[16, 19[   8    0.36   0.63
[19, 22[   5    0.23   0.86
[22, 25[   3    0.14   1
Total      22   1      −
2) You can plot either the frequency histogram or the relative frequency
histogram :



Figure 2 – Frequency Histogram

3) The key points for the cumulative frequency curve are :


Absc 10 13 16 19 22 25
Ord (in %) 0 9 27 63 86 100
The cumulative distribution function (cumulative frequency curve) is
represented as follows :



Figure 3 – Cumulative Frequency Curve

4) Since the cumulative frequency function F is increasing, we have
0.09 ≤ 0.25 ≤ 0.27
Thus, F (13) ≤ F (x) ≤ F (16), so 13 ≤ x ≤ 16, where x is the sought
value. Using linear interpolation :
\[ \frac{x - 13}{16 - 13} = \frac{0.25 - 0.09}{0.27 - 0.09} \implies x = 13 + (16 - 13) \times \frac{0.16}{0.18} \approx 15.67 \text{ minutes} \]
So, 25% of the durations are less than 15.67 minutes.
5) The mean duration is given by :
\[ \bar{x} = \frac{1}{22} \sum_{i=1}^{22} x_i = \frac{1}{22}(10 + 12 + \dots + 23 + 24) = \frac{385}{22} = 17.5 \text{ minutes}. \]
The variance is
\[ s^2 = \frac{1}{22} \sum_{i=1}^{22} x_i^2 - (17.5)^2, \quad \text{where } \sum_{i=1}^{22} x_i^2 = 6989. \]
Thus s² ≈ 11.43 minutes².
The standard deviation is s = √11.43 ≈ 3.38 minutes.
6) The coefficient of variation, denoted by CV, is defined as :
\[ CV = \frac{s}{\bar{x}}. \]
In our case, CV = 3.38 / 17.5 ≈ 0.1931 = 19.31%.
This statistical distribution is homogeneous.
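The mean, variance, standard deviation and coefficient of variation of the raw call durations can be checked with this short sketch (population formulas, as in the solution; names are illustrative).

```python
import math

# Call durations in minutes, from the statement.
durations = [10, 12, 14, 14, 15, 15, 16, 16, 17, 17, 17,
             18, 18, 18, 19, 19, 20, 20, 21, 22, 23, 24]

n = len(durations)
mean = sum(durations) / n
sum_sq = sum(d**2 for d in durations)

# Variance as mean of squares minus square of the mean.
variance = sum_sq / n - mean**2
std = math.sqrt(variance)
cv = std / mean

print(mean, sum_sq, round(variance, 2), round(std, 2), round(cv, 3))
```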

Exercise 12. A car travels 200 kilometers at 50 km/h, then 100 kilometers
at 100 km/h.
What is its average speed over the entire trip ?
Solution –
The average speed will be the ratio of the total distance traveled to the
total travel time. The total distance is 200 + 100 = 300 kilometers.
\[ \bar{x}_H = \frac{300}{\frac{200}{50} + \frac{100}{100}} = 60 \text{ km/h} \]
So :
– 200 km at 50 km/h takes 4 hours ;
– 100 km at 100 km/h takes 1 hour.
Thus, the total time for 300 kilometers is 5 hours, and the average
speed is 60 km/h.
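The same result can be obtained with a few lines of code. The sketch below computes the distance-weighted harmonic mean of the speeds and, for contrast, the simple arithmetic mean, which does not give the average speed here.

```python
distances = [200, 100]   # km
speeds = [50, 100]       # km/h

# Time spent on each leg, then average speed = total distance / total time.
total_time = sum(d / v for d, v in zip(distances, speeds))
average_speed = sum(distances) / total_time

# The plain arithmetic mean of the two speeds overestimates the average speed.
arithmetic_mean = sum(speeds) / len(speeds)

print(total_time, average_speed, arithmetic_mean)
```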
Exercise 13. In order to evaluate the relationship between grain density
and yield, a series of trials were conducted on different plots of a cereal crop.
The experiment gave the following results :
xi 150 250 350 450
zi 57.06 60.73 62.73 63.48
yi
where xi denotes the number of grains sown per square meter, and zi
denotes the yield per hectare.
1) Calculate the values of yi = ln(64 − zi) for i = 1, 2, 3, 4.
2) Calculate the linear correlation coefficient between x and y. Interpret
the result.
3) Determine the equation of the regression line of y on x. From this,
deduce an expression of z as a function of x.
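The three questions of Exercise 13 follow a fixed recipe (transform, correlate, regress, invert), which can be sketched as below. This is only an illustration of the method under the population formulas used elsewhere in this sheet, not the author's worked solution; the function `z_of_x` is a hypothetical name.

```python
import math

x = [150, 250, 350, 450]            # grains sown per square meter
z = [57.06, 60.73, 62.73, 63.48]    # yield per hectare

# 1) Transformed variable y_i = ln(64 - z_i).
y = [math.log(64 - zi) for zi in z]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
var_x = sum((xi - mean_x) ** 2 for xi in x) / n
var_y = sum((yi - mean_y) ** 2 for yi in y) / n
cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n

# 2) Linear correlation coefficient between x and y.
rho = cov / math.sqrt(var_x * var_y)

# 3) Regression line of y on x: y = a x + b.
a = cov / var_x
b = mean_y - a * mean_x

def z_of_x(x_value):
    """Invert y = ln(64 - z): z = 64 - exp(a x + b)."""
    return 64 - math.exp(a * x_value + b)

print([round(yi, 4) for yi in y], round(rho, 4), a, b)
```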
Exercise 14. A survey conducted among 100 women aged 40 was carried
out to assess the impact of reducing family allowances. The survey gave the
following results :
Number of children (xi) Number of women (ni)
0 10
1 20
2 20
3 30
4 20
1) Characterize the distribution (population size, individual, modalities,
variable type).
2) Draw the corresponding diagram.
3) Define and plot the cumulative frequency curve.
4) Find the proportion of women with fewer than 4 children.
5) Find the proportion of women with at least 2 children.
6) Calculate the mean and standard deviation of this distribution.
Solution –
1. Population : ”40-year-old women” ; Size : n = 100 ; Individual : ”a
40-year-old woman” ; Modalities : ”0, 1, 2, 3, 4” ; Variable : ”Number
of children” ; Type : quantitative discrete.
2. Draw either the bar chart of the counts (frequencies) or the bar chart
of the relative frequencies.
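3. The cumulative frequency function (distribution function) is defined
as follows :
\[ F(x) = \begin{cases} 0 & \text{if } x < 0 \\ 0.1 & \text{if } 0 \le x < 1 \\ 0.3 & \text{if } 1 \le x < 2 \\ 0.5 & \text{if } 2 \le x < 3 \\ 0.8 & \text{if } 3 \le x < 4 \\ 1 & \text{if } x \ge 4 \end{cases} \]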



Number of children (xi)   Number of women (ni)   R. Freq. fi   Cumul. R. Freq.
0                         10                     0.1           0.1
1                         20                     0.2           0.3
2                         20                     0.2           0.5
3                         30                     0.3           0.8
4                         20                     0.2           1
4. The proportion of women with fewer than 4 children is 0.8.
5. The proportion of women with at least 2 children is 0.2 + 0.3 + 0.2 =
0.7.
6. Mean x̄ = 2.3, variance V (x) = 1.61, standard deviation σX ≈ 1.27.
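The answers to questions 4 to 6 can be checked with the sketch below, which builds the step distribution function from the counts. The function name `cdf` is illustrative.

```python
import math

values = [0, 1, 2, 3, 4]          # number of children
counts = [10, 20, 20, 30, 20]     # number of women

n = sum(counts)

def cdf(x):
    """Step distribution function F(x): proportion of women with at most x children."""
    return sum(c for v, c in zip(values, counts) if v <= x) / n

prop_fewer_than_4 = cdf(3)        # fewer than 4 children means at most 3
prop_at_least_2 = 1 - cdf(1)      # complement of "at most 1 child"

mean = sum(v * c for v, c in zip(values, counts)) / n
variance = sum(c * v**2 for v, c in zip(values, counts)) / n - mean**2
std = math.sqrt(variance)

print(prop_fewer_than_4, round(prop_at_least_2, 2))
print(mean, round(variance, 2), round(std, 2))
```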

