Beamer Mémoire-1
Probabilities-Statistics
Dr.HEBCHI Chaima
October 22, 2023
Introduction
The term "statistics" refers both to a collection of numerical data
(observational data) on a specific subject and to the activities
involved in gathering, processing, and interpreting these data.
"Descriptive statistics" consists of ordering, classifying, and
representing observed data in a suitable format.
Course Objectives
Review important concepts and vocabulary.
Learn how to describe and represent a set of data effectively.
Study statistical distributions involving two variables.
Formulate conclusions based on the analysis of the studied
population.
Statistical population:
The statistical population is the set of elements that we intend to
study. This set is denoted Ω.
Example: the Algerian population, all the companies in a region, a
set of geographical sites...
Statistical individual (or statistical unit):
A statistical individual ω is an element of the population
under consideration.
Example: a person, a company, a geographical site...
Character (statistical variable):
The character is the property that is observed on the individuals of a
population. Each individual can generally be described by one or more
characters; a character is a mapping, denoted X, that assigns a value
to each individual.
Example: for people: gender, age, salary, ...;
for companies: number of employees, sector of activity, etc.
3/33 Dr.HEBCHI Chaima Proba-stat
Representation of a series
Basic descriptive statistics vocabulary
Measures of Central Tendency
Bivariate descriptive statistics
Measures of Central Tendency
Remark
\[ \sum_{i} f_i = 1 \]
[Figure: frequency chart; vertical axis: frequencies ("Effectifs"), 2 to 8; horizontal axis: values, 8 to 16]
Note
Arithmetic mean:
k: number of categories;
n: total number of data points in the dataset.
The weighted arithmetic mean (weighted by the frequencies $n_i$) is:
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{k} n_i x_i \]
Example:
Class        n_i   m_i (midpoint)   n_i m_i
[5 − 15[       3        10             30
[15 − 25[     12        20            240
[25 − 35[      2        30             60
[35 − 45[      9        40            360
[45 − 55[      1        50             50
Total         27                      740
\[ \bar{x} = \frac{740}{27} \approx 27.41 \]
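The calculation in this example can be checked with a short Python sketch. It assumes, as the table does, that each class is represented by its midpoint; the function name `grouped_mean` and the tuple layout are illustrative choices, not part of the course material.

```python
# Classes [lower, upper[ with their frequencies n_i, taken from the table above.
classes = [(5, 15, 3), (15, 25, 12), (25, 35, 2), (35, 45, 9), (45, 55, 1)]


def grouped_mean(classes):
    """Weighted arithmetic mean, using each class midpoint m_i as its value."""
    n = sum(freq for _, _, freq in classes)
    weighted_sum = sum(freq * (lower + upper) / 2 for lower, upper, freq in classes)
    return weighted_sum / n


mean = grouped_mean(classes)
print(round(mean, 2))  # 27.41
```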
Mode:
Case of discrete variables
Definition: the mode is the modality $x_i$ with the highest
frequency $n_i$ (equivalently, the highest relative frequency $f_i$) in the
data.
Example: Consider the series of numbers
{5, 5, 5, 5, 10, 10, 15, 7, 7, 8, 13, 13, 13}. The most frequent value is
5 (it occurs four times), so Mode = 5.
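As a quick check of the discrete case, the following Python sketch counts occurrences with the standard-library `collections.Counter`. The helper name `modes` is hypothetical; it returns a list because a series may have several values tied for the highest frequency.

```python
from collections import Counter

series = [5, 5, 5, 5, 10, 10, 15, 7, 7, 8, 13, 13, 13]


def modes(values):
    """Return every value that reaches the highest frequency, in sorted order."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


print(modes(series))  # [5]
```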
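Case of continuous variables
Definition: the modal class of a continuous statistical
series is any class with the highest frequency. When the data are
grouped into classes of equal width, the mode is calculated with the
following formula:
\[ \text{Mode} = l_1 + \frac{f_1 - f_0}{(f_1 - f_0) + (f_1 - f_2)} \times i \]
where:
$l_1$ = lower limit of the modal class;
$f_1$ = frequency of the modal class;
$f_0$ = frequency of the class preceding the modal class;
$f_2$ = frequency of the class following the modal class;
$i$ = width of the modal class interval.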
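The formula can be sketched in Python as follows. The data are borrowed from the frequency table of the arithmetic mean example (classes of equal width 10), which is an assumption made for illustration. The function name `grouped_mode` is hypothetical, and treating a missing neighbouring class as having frequency 0 is a convention chosen here, not something the course states.

```python
# Classes [lower, upper[ with their frequencies, in increasing order.
classes = [(5, 15, 3), (15, 25, 12), (25, 35, 2), (35, 45, 9), (45, 55, 1)]


def grouped_mode(classes):
    """Mode for equal-width classes: l1 + (f1 - f0) / ((f1 - f0) + (f1 - f2)) * i."""
    freqs = [freq for _, _, freq in classes]
    k = freqs.index(max(freqs))  # index of the (first) modal class
    lower, upper, f1 = classes[k]
    # Assumed convention: a missing neighbour counts as frequency 0.
    f0 = freqs[k - 1] if k > 0 else 0
    f2 = freqs[k + 1] if k < len(freqs) - 1 else 0
    width = upper - lower
    return lower + (f1 - f0) / ((f1 - f0) + (f1 - f2)) * width


mode = grouped_mode(classes)
print(round(mode, 2))  # 19.74
```

Here the modal class is [15 − 25[ with f1 = 12, f0 = 3 and f2 = 2, so the mode is 15 + (9 / 19) × 10. The sketch does not handle the degenerate case where all three frequencies are equal, which would make the denominator zero.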
Median (data grouped into classes):
\[ Q_2 = l_1 + \frac{n/2 - c.f}{f} \times i \]
where:
$l_1$ = lower limit of the median class;
$n$ = total number of data;
$c.f$ = cumulative frequency of the class preceding the median class;
$f$ = frequency of the median class;
$i$ = width of the median class interval.
Example: Calculation of the median when values are grouped into
classes with equal class width
Here, n/2 = 14. This means that the 14th observation
is located in the class [20 − 30[.
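The median formula can be sketched in Python as shown below. The frequency table behind the example above is not available here, so the sketch reuses the table from the arithmetic mean example instead; that substitution is an assumption made purely for illustration, and the function name `grouped_median` is hypothetical.

```python
# Classes [lower, upper[ with their frequencies, in increasing order
# (table from the arithmetic mean example, not from the median example).
classes = [(5, 15, 3), (15, 25, 12), (25, 35, 2), (35, 45, 9), (45, 55, 1)]


def grouped_median(classes):
    """Median for grouped data: l1 + (n/2 - c.f) / f * i."""
    n = sum(freq for _, _, freq in classes)
    cumulative = 0  # c.f: cumulative frequency of the classes already passed
    for lower, upper, freq in classes:
        if cumulative + freq >= n / 2:  # this class contains the n/2-th observation
            return lower + (n / 2 - cumulative) / freq * (upper - lower)
        cumulative += freq
    return None  # only reached for an empty table


median = grouped_median(classes)
print(median)  # 23.75
```

With these data n = 27, so n/2 = 13.5; the cumulative frequencies are 3 and then 15, so the median class is [15 − 25[ and Q2 = 15 + (13.5 − 3) / 12 × 10 = 23.75.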
The variance:
Let X be a statistical character that takes k modalities $x_i$,
with associated frequencies $n_i$. The variance $\sigma^2(X)$ of this
series is written as follows:
\[ \sigma^2(X) = \frac{1}{n} \sum_{i=1}^{k} n_i (x_i - \bar{X})^2 \]
Properties:
Let $a \in \mathbb{R}$, $b \in \mathbb{R}$ and $Y = aX + b$. Then:
1) $\sigma^2(Y) = \sigma^2(aX + b) = a^2 \sigma^2(X)$
2) $\sigma(Y) = \sigma(aX + b) = |a| \sigma(X)$
3) $\sigma^2(X) = \frac{1}{n} \sum_{i} n_i x_i^2 - (\bar{X})^2$
4) $\sigma^2(X) = \overline{X^2} - (\bar{X})^2$
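These properties can be checked numerically with a short Python sketch. The values and frequencies reuse the class midpoints and frequencies from the arithmetic mean example, and a = -2, b = 7 are arbitrary constants chosen for illustration; all function names are hypothetical.

```python
import math

values = [10, 20, 30, 40, 50]  # modalities x_i (class midpoints, assumed data)
freqs = [3, 12, 2, 9, 1]       # frequencies n_i


def weighted_mean(xs, ns):
    """Arithmetic mean weighted by the frequencies."""
    return sum(n_i * x for x, n_i in zip(xs, ns)) / sum(ns)


def variance(xs, ns):
    """Definition: (1/n) * sum of n_i * (x_i - mean)^2."""
    m = weighted_mean(xs, ns)
    return sum(n_i * (x - m) ** 2 for x, n_i in zip(xs, ns)) / sum(ns)


def variance_shortcut(xs, ns):
    """Properties 3 and 4: mean of the squares minus the square of the mean."""
    mean_of_squares = sum(n_i * x * x for x, n_i in zip(xs, ns)) / sum(ns)
    return mean_of_squares - weighted_mean(xs, ns) ** 2


a, b = -2.0, 7.0
transformed = [a * x + b for x in values]  # Y = aX + b

var_x = variance(values, freqs)
var_y = variance(transformed, freqs)

# Properties 3 and 4: both ways of computing the variance agree.
assert math.isclose(var_x, variance_shortcut(values, freqs))
# Property 1: the shift b has no effect, the scale a enters squared.
assert math.isclose(var_y, a ** 2 * var_x)
# Property 2: the standard deviation scales by |a|.
assert math.isclose(math.sqrt(var_y), abs(a) * math.sqrt(var_x))
```

The shortcut form subtracts two large, nearly equal numbers when the mean is large relative to the spread, so the definition is usually the numerically safer choice in floating-point arithmetic.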