Statistics
ACCRA GHANA
BSM 301
STATISTICS
LECTURE NOTES
COMPILED
BY
INTRODUCTION TO STATISTICS - Concept, Definitions And Relations
SKEWNESS / KURTOSIS
Poisson Distribution
Binomial Distribution
Geometric and Hypergeometric Distribution
Exponential Distribution
Gamma Distribution
INTRODUCTION TO STATISTICS
There are two main branches of statistics namely descriptive and inferential.
Descriptive Statistics: It utilizes numerical and graphical methods to look for patterns
in a data set, summarizes and presents the information in a convenient form useful in
making decisions. The idea of descriptive statistics is to describe a data set.
Descriptive statistics include both numerical measures, like mean and median and
graphical displays, like pie-charts.
Inferential Statistics: Its main aim is to draw conclusions about a population based
on a sample of data from that population. One commonly used inferential technique is
hypothesis testing.
A Statistical Hypothesis is an educated guess about the relationship between two or
more variables. For instance, an educational leader may have a lingering question:
does a graduate of RMU have a better chance of securing a job compared to graduates
of other Universities in Ghana?
The hypothesis would be that graduates of RMU would have a better chance of
securing jobs since the graduates are of superior abilities. The processes for running
the test are executed, once the hypothesis is formed. In forming statistical hypotheses,
the variables are either dependent or independent.
Dependent Variables are variables which represent the effects that are being tested.
Independent Variables are variables which represent the inputs to the dependent
variable or the variable that could be manipulated to check if it is the cause. In the
above example where an educational leader seeks to find out whether it is graduates of
RMU or others who have greater chances of securing jobs, the dependent variable is
whether a graduate is able to secure a job. The independent variable is the university
the graduate completed, whether RMU or another.
A statistical test is then carried out on the data collected on graduates of RMU and of
other universities who secured jobs, to find out whether the educational leader’s
hypothesis is supported.
5 Generalize the result to your population and draw conclusions
SAMPLING
PROBABILITY AND NON – PROBABILITY SAMPLING TECHNIQUES
Probability sampling involves using random selection so that each unit in the population
has a known / equal chance of being selected. Probability sampling keeps sampling
error low and samples are seen to be representative.
Non – probability sampling does not involve random selection so some units in the
population may have had a higher chance of being selected.
Generalizability refers to being able to use sample results as if they applied to the
whole population.
Sampling Error is the difference between a result obtained from a sample and the
corresponding result for the population.
Sampling Frame is a list of all units in the population from which a sample could be
selected.
Sampling Fraction is the number required for the sample divided by the number in the
total sampling frame, expressed as a fraction or percentage.
Convenience: sampling chosen for ease rather than through random sampling. Used in
pilot studies or short term projects where there is insufficient time to construct a
probability sample. The results cannot be generalised to the population.
Quota: used in market research and opinion polling. The sample is chosen to include a
certain proportion of particular variables (gender, age group). There is no random
sampling stage; the choice of respondent is up to the interviewer, provided the quota is
accurate.
Snowball: an initial group of respondents relevant to the research topic is contacted,
and this group is then used to contact others for the research.
There is no sampling frame, the selection is not random, and it is sometimes difficult to
pre-define the population (e.g. contributors of creative ideas in a company). This
technique is mostly used in qualitative approaches.
Purposive: One’s own judgment is used in selecting a sample and used with small
populations within qualitative research, especially case studies or grounded theory. This
approach cannot yield any statistical inference about the population.
Systematic: the sample is chosen directly from the sampling frame at a fixed interval,
without drawing a random number for every unit. With a sampling proportion of, for
example, 1 in 10, start with a randomly chosen item among the first 10 in the list, then
choose every 10th name until the sample is complete.
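The systematic procedure can be sketched in a few lines of Python. This is only an illustration: the frame of 100 names, the function name `systematic_sample` and the seed are assumptions, not part of the notes.

```python
import random

def systematic_sample(frame, k, seed=None):
    """Pick a random start among the first k units, then take every k-th unit."""
    rng = random.Random(seed)
    start = rng.randrange(k)          # random starting index in 0..k-1
    return frame[start::k]

# Hypothetical sampling frame of 100 names.
frame = [f"unit{i:03d}" for i in range(1, 101)]
sample = systematic_sample(frame, k=10, seed=1)
print(len(sample))                    # a 1-in-10 sample of a 100-unit frame
print(sample[:3])
```

Only the starting point is random; every later unit is fixed by the interval, which is why no further random numbers are needed.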
We often wonder how large a sample should be. There is no single right answer. It is
more important to look at the absolute size of a sample than its size relative to the
total population. Suppose 10% of a population is taken to be a fine sample. Then the
sample size for a population of 1 million is 100 thousand, for 10,000 it is 1,000, and
for 100 it is 10.
A sample of 10 could be quite unrepresentative of the total population. So relative
sample size is not important but absolute size is. The larger the sample size, the more
likely the sample is to represent the population and the lower the sampling error. The
larger the absolute size of a sample, the more closely the distribution of its mean will
approximate the normal distribution.
For a statistical analysis on data, the minimum size of sample for any one category of
the data should be 30, as this is most likely to offer reasonable chance of normal
distribution. If the sample frame is 30 or less, it is prudent to include the whole frame,
rather than sampling.
Margin of Error: the expected margin of error is affected by absolute size of sample
within a population. 5% margin of error (95% certainty) is the maximum normally
appropriate for rigorous research. There is diminishing need for higher samples at the
high population end of the table (the figures to achieve 95% certainty for a population of
1m is the same as for a population of 10 m).
Variation in population
If population is highly varied the sample size will need to be larger than if the population
is less varied.
Example: A continuous random variable X has a probability density function, f(x), given by
$$f(x) = \begin{cases} 2x & \text{for } 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$
Find (a) the mean, µ (b) the variance, σ², of this distribution.
A random sample of 100 observations is taken from this distribution, and the mean, $\bar{x}$,
is found. Write down the distribution of $\bar{x}$.
Solution
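(a) $\mu = \int_0^1 x f(x)\,dx = \int_0^1 x(2x)\,dx = \int_0^1 2x^2\,dx = \left[\tfrac{2}{3}x^3\right]_0^1 = \tfrac{2}{3}$
(b) $\sigma^2 = \int_0^1 x^2 f(x)\,dx - \mu^2 = \int_0^1 x^2(2x)\,dx - \mu^2 = \int_0^1 2x^3\,dx - \mu^2 = \left[\tfrac{2}{4}x^4\right]_0^1 - \left(\tfrac{2}{3}\right)^2 = \tfrac{1}{2} - \tfrac{4}{9} = \tfrac{1}{18}$
By the Central Limit Theorem, the distribution of $\bar{x}$ is approximately
$N\left(\mu, \tfrac{\sigma^2}{n}\right) = N\left(\tfrac{2}{3}, \tfrac{1/18}{100}\right) = N\left(\tfrac{2}{3}, \tfrac{1}{1800}\right)$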
VARIABLES
Variables are classified as qualitative or quantitative. Qualitative variables are
variables that could be placed into distinct groups in accordance with some
characteristics, with each element belonging to only one category. Different types of
data fall into four categories: interval (quantifiable), ordinal, nominal and dichotomous
(the last three are referred to as categorical).
Interval variables
This is the highest form of measurement and the easiest to manipulate and analyze. There
is a fixed, consistent interval (space) between successive values. Examples are answers
involving age, income and weight. There is an even more precise form of this variable
known as the ratio variable.
Ordinal variables
These can be rank-ordered but the space between the values is not equal across the
range. For example, asking for ages: 1-5, 6-10, 11-15 and 16 and over. The last category
changes the entire set into ordinal and constrains what we can do with the data.
Nominal variables
This cannot be rank-ordered at all. An example could be to offer alternative answers in
a multiple choice question such as ‘sometimes’, ‘occasionally’ and ‘often’.
Dichotomous variables
This answer can fall into only one of two categories and is treated as a special kind of
nominal variable. For example YES / NO, MALE / FEMALE, TRUE / FALSE.
SOURCES OF DATA
Data is the observed, measured and the recorded values pertaining to a variable.
It is a collection of raw facts and could be measurements, observations, words etc.
Data could include marks of students in statistics exams, ages of people.
The types of data or variables are qualitative and quantitative.
Qualitative variables cannot be measured on a natural numerical scale and could only
be classified into categories. For instance, gender and degree of satisfaction.
DATA ORGANISATION
Qualitative data is often presented with bar charts, pie charts and frequency polygons,
while quantitative data comes with scatter diagrams, stem-and-leaf plots, histograms,
cumulative frequency curves (ogives), and frequency polygons or bar graphs (component
/ multiple).
SIMPLE BAR CHART
The pupils in a class are classified according to their favourite soft drinks, as in the table
below.
Fanta Coke Sprite Lemon Malt
8 10 15 6 12
Use a simple bar chart to illustrate the above
[Figure: simple bar chart of favourite soft drinks (Fanta, Coke, Sprite, Lemon, Malt)]
        Science  Arts  Business  Home Economics
Female  15       35    20        35
Total   50       70    60        40
[Figure: component bar chart (Female, Male) for Science, Arts, Business and Home Economics]
Below is the distribution of males and females in five cities in Africa (in thousands)
Gender Accra Freetown Monrovia Abuja Pretoria
Males 200 100 50 40 150
Females 300 150 60 60 200
Illustrate this using multiple bar graphs.
[Figure: multiple bar graph of males and females (in thousands) in the five cities]
Below is the distribution of students in RMU statistics class. Illustrate on a pie chart.
Programs Number of students
BME 45
BEE 18
BCE 8
[Figure: pie chart "STUDENTS IN RMU": BME 63%, BEE 26%, BCE 11%]
A stem-and-leaf plot represents discrete quantitative data in a way that can be used to
study the shape of a frequency distribution as well as the range of the values. The plot
could easily be used to recreate the data.
STEM | LEAF
1 | 0 1 2 3 5 6 6 7 8
2 | 1 2 4 5 5 7 7 8
3 | 2 3 3 5 6 7 7 7 8 9
4 | 0 1 2 2 4 4 5 6 6 6 8
5 | 0 0 1 1 3 4 5 6 6 8 9
SCATTER DIAGRAM
The dependent variable is on the y-axis while the independent variable is on the x-axis.
We look out for correlations on the scatter diagram. There could also be a line of best fit
using the “eye ball fitting method”.
Example: The following are the number of minutes it takes 8 typists to finish a piece of
secretarial work on Monday and on Friday. Construct a scatter diagram using the data
set and indicate a line of best fit using the eyeball fitting method.
Typist 1 2 3 4 5 6 7 8
Monday(x) 9 8 10 13 11 15 13 12
Friday (y) 8 12 11 15 11 14 16 15
[Figure: scatter diagram of Friday times (y) against Monday times (x), both axes from 0 to 20]
HISTOGRAM
When the class intervals are of different widths, the heights of the bars are
proportional to the frequency density:
$\text{frequency density} = \frac{\text{class frequency}}{\text{class width}} \times k$, where k = height scale factor
Example: Represent the data set below with a histogram.
Height (cm) 140-144 145-149 150-159 160-164 165-174
Frequency 4 5 10 10 8
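The bar heights for this example can be worked out as below. This is a sketch that assumes k = 1 and that heights are recorded to the nearest cm, so a class such as 140-144 runs from 139.5 to 144.5 and has width 5.

```python
# Class limits and frequencies from the height example.
classes = [(140, 144), (145, 149), (150, 159), (160, 164), (165, 174)]
freqs = [4, 5, 10, 10, 8]

# Assumption: class boundaries lie 0.5 below the lower limit and 0.5 above the upper limit.
widths = [(hi + 0.5) - (lo - 0.5) for lo, hi in classes]
densities = [f / w for f, w in zip(freqs, widths)]   # k = 1 assumed

for (lo, hi), w, d in zip(classes, widths, densities):
    print(f"{lo}-{hi}: width {w:g}, frequency density {d:g}")
```

The two classes with frequency 10 get different bar heights (1 and 2) because the class 150-159 is twice as wide as 160-164.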
THE MEAN
Ungrouped Data;
Suppose we have n observations, x1, x2, x3, . . . , xn. The mean or mean value is defined
as $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
Example
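The set of numbers x², 3, 3x - 4, 7, 9, where x is a positive integer, has a mean of 8.6.
Find x.
Solution
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{x^2 + 3 + 3x - 4 + 7 + 9}{5} = 8.6 \Rightarrow x^2 + 3x - 28 = 0 \Rightarrow (x + 7)(x - 4) = 0 \Rightarrow x = 4$ (since x is positive)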
x 1 2 3 4 5
f 2 3 4 5 6
Solution
x f fx
1 2 2
2 3 6
3 4 12
4 5 20
5 6 30
sum 20 70
$\bar{x} = \frac{\sum fx}{\sum f} = \frac{70}{20} = 3.5$
Solution
$\bar{x} = \frac{\sum fx}{\sum f} = \frac{1826}{62} \approx 29.5$
Example: Find the weighted mean of three test results, 80, 85, and 75, where the first
test counts 20%, the second 30% and the third counts 50 %
Solution
Weighted mean, $\bar{x}_w = \frac{\sum wx}{\sum w} = \frac{20(80) + 30(85) + 50(75)}{20 + 30 + 50} = \frac{7900}{100} = 79$
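The weighted mean and the mean of a frequency table use the same computation, with the frequencies acting as weights. A minimal sketch (the function name `weighted_mean` is an assumption for illustration):

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Three test results weighted 20%, 30% and 50%.
test_mean = weighted_mean([80, 85, 75], [20, 30, 50])

# Frequency table x = 1..5 with f = 2, 3, 4, 5, 6: the frequencies are the weights.
table_mean = weighted_mean([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])

print(test_mean, table_mean)
```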
Simplifying the Calculation of the Mean
Suppose we want to determine the mean of the set of numbers 507, 508,498, 502,497.
A direct method gives 502.4.
We could make these numbers smaller by subtracting 500 from each number, yielding
7, 8, -2, 2, -3. These added give 12 and their mean is 12/5 = 2.4. To find the mean of
the original values, add 500 to 2.4 to give 502.4.
$\bar{x} = \frac{\sum (x - a)}{n} + a$
The heights, y cm, of a sample of 90 students are summarized by the equation
Σ(y − 200) = 280. Find the mean height of a student.
Solution
$\bar{y} = \frac{\sum (y - a)}{n} + a = \frac{\sum (y - 200)}{90} + 200 = \frac{280}{90} + 200 = 203.1$
It is unique, that is there exists only one for any data set.
It is more representative since every unit is considered.
Extreme values affect the mean (hence use for data without outliers).
Applied quite often in hypothesis testing.
Used for interval / ratio (not skewed) data.
THE MEDIAN
Ungrouped data;
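Odd number of observations: the median is the value in the $\left(\frac{n+1}{2}\right)$th position.
Even number of observations: the median is the average of the values in the $\left(\frac{n}{2}\right)$th and $\left(\frac{n+2}{2}\right)$th positions.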
Grouped Data
Median $= L + h\left[\frac{\frac{\sum f}{2} - c}{f}\right]$
L = lower class boundary of the median class, $\sum f$ = n = total frequency, f = frequency
of the median class, h = class interval, c = cumulative frequency up to the class
preceding the median class.
It is unique
Not affected by extreme values /outliers, hence preferred.
Cannot be found for nominal data; preferred for ordinal data.
Not all units are involved in its calculation.
Used for interval /ratio (skewed) data.
MODE
It is the value or class with the highest frequency. A bimodal data set has two modes
while a multimodal data set has more than 2 modes.
Grouped Frequency
Mode $= L + h\left[\frac{a}{a + b}\right]$
L = lower class boundary of modal class, h = class interval, a = difference between
modal frequency and frequency above it, b = difference between modal frequency and
frequency below it.
PERCENTILES
Arrange the data in ascending order and find the position of the pth percentile as
(p/100)(n + 1), to the nearest whole number.
QUARTILES
First quartile, Q1, is the 25th percentile; second quartile, Q2, is the 50th percentile
(median); and third quartile, Q3, is the 75th percentile.
MEASURES OF DISPERSION
The Range;
For 2, 2, 3, 3, 4, 4, range = 4 – 2 = 2.
MEAN DEVIATION
Example The data set 25, 26, 27, 28, and 29 are scores in a mid-semester exam. Find
the mean deviation.
Solution
$\bar{X} = \frac{25 + 26 + 27 + 28 + 29}{5} = 27$
x |𝑥 -𝑋̅|
25 2
26 1
27 0
28 1
29 2
sum 6
Mean Deviation = 1/N ∑[|𝑥 − 𝑋̅ |] = 6/5 = 1.2
On the average, the test scores deviated by 1.2 marks from the mean mark
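The table above can be reproduced with a short computation; this is just a check of the worked example, using the same five scores.

```python
scores = [25, 26, 27, 28, 29]

mean = sum(scores) / len(scores)
# Mean deviation: average absolute distance of each score from the mean.
mean_deviation = sum(abs(x - mean) for x in scores) / len(scores)

print(mean, mean_deviation)
```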
Sample Variance, $S^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{X})^2}{N - 1}$
Sample Standard Deviation, $S = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{X})^2}{N - 1}}$
Simplifying the Calculation of the Variance
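$\sigma^2 = \frac{\sum (x - a)^2}{n} - \left(\frac{\sum (x - a)}{n}\right)^2$, where $\frac{\sum (x - a)}{n} = \bar{x} - a$
Let Σ(x − 200) = 280 and Σ(x − 200)² = 9000 represent information on the heights of
90 people. Then $\sigma^2 = \frac{9000}{90} - \left(\frac{280}{90}\right)^2 = 100 - 9.68 = 90.32$ and therefore the standard
deviation is 9.5.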
NB: The standard deviation remains same for both the original and new values.
For a grouped frequency distribution, it is assumed that all the values in a class are
concentrated at the mid-point of that class. This results in a grouping error, which is
corrected by Sheppard’s correction, C²/12, where C = common class interval size.
Corrected Variance $= S^2 - \frac{C^2}{12}$ and Corrected Standard Deviation $= \sqrt{\text{Corrected Variance}}$
Standard deviation shows how data deviates from the central value (the mean).
For standard deviations, 𝜎1 and 𝜎2, for samples, N1 and N2, if 𝜎1 > 𝜎2, then sample N1 is
more spread than N2.
Example
A manufacturer of electrical gadgets has two devices A and B. The devices have respective mean
life spans of 578 days and 825 days with corresponding standard deviations of 80 days and 150
days. Which device is preferred?
Solution
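Coefficient of Variation = (standard deviation / mean) x 100
Coefficient of Variation of A = 80/578 x 100 = 13.84%.
Coefficient of Variation of B = 150/825 x 100 = 18.18%.
A is preferred because its life spans are more closely spread around the mean (lower
coefficient of variation).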
Example
Calculate the standard deviation for 143, 155, 167, 171, 181, and 191.
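One way to check a hand calculation for this example is Python's standard `statistics` module. The sketch reports both the sample standard deviation (divisor N − 1, as in the formula above) and the population value (divisor N), since the question does not say which is intended.

```python
import statistics

data = [143, 155, 167, 171, 181, 191]

mean = statistics.mean(data)
sample_sd = statistics.stdev(data)        # divides the squared deviations by N - 1
population_sd = statistics.pstdev(data)   # divides the squared deviations by N

print(mean, round(sample_sd, 2), round(population_sd, 2))
```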
SKEWNESS
Skewness shows the degree of departure from symmetry of a distribution. Data which is
not normal or symmetrical is skewed positively or negatively.
Skewness, $\gamma_1 = \frac{3(\text{Mean} - \text{Median})}{\text{Standard Deviation}}$
Sample skewness $= \frac{\sum (X - \bar{X})^3}{(N - 1) S^3}$
Population skewness $= \frac{\sum (X - \bar{X})^3}{N S^3}$
Example
The mean, median and mode are all located on the line of symmetry
Positive Skewness
Negative Skewness
KURTOSIS
Sample kurtosis $= \frac{\sum (X - \bar{X})^4}{(N - 1) S^4}$
Population kurtosis $= \frac{\sum (X - \bar{X})^4}{N S^4}$
A Platykurtic distribution has a lower peak than the normal distribution and lighter
tails. It has negative excess kurtosis (kurtosis below 3, the value for the normal
distribution), meaning the data points are less concentrated around the mode, which lies
at the middle. The graph looks a little flat with a gentle slope.
A Mesokurtic distribution has zero excess kurtosis (kurtosis equal to 3), like the normal
distribution.
A Leptokurtic distribution has a higher peak than the normal distribution and heavier
tails. It has positive excess kurtosis (kurtosis above 3), meaning the data points are
gathered closer to the mode of the distribution, thereby making the peak of the graph
pointed with a steep slope.
CORRELATION
Correlation is a statistical technique which shows whether variables are related and to
what extent. For a positive correlation, as the values of one variable increase, the
values of the other variable also increase, and vice versa; for example, voltage supplied
and current generated. There is a direct relationship. For a negative correlation, as the
values of the first variable increase, the values of the second variable decrease, that
is, there is an inverse relationship; for example, supply of a product and its price.
Zero correlation implies no linear relationship; for example, skin colour and intelligence.
Example
Calculate Pearson’s Product Moment Correlation Coefficient and indicate its significance.
Age(x) 20 21 22 23 24 25
Mark (y) 25 30 35 37 28 29
Solution
x y xy x2 y2
20 25 500 400 625
21 30 630 441 900
22 35 770 484 1225
23 37 851 529 1369
24 28 672 576 784
25 29 725 625 841
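Sum 135 184 4148 3055 5744
$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} = \frac{6(4148) - 135(184)}{\sqrt{[6(3055) - 18225][6(5744) - 33856]}} = \frac{48}{\sqrt{105 \times 608}} \approx 0.19 \approx 0.2$
This indicates a weak positive correlation between age and mark.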
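The column sums and the value of r can be verified with a few lines of Python. The variable names are chosen here for illustration only.

```python
from math import sqrt

x = [20, 21, 22, 23, 24, 25]   # ages
y = [25, 30, 35, 37, 28, 29]   # marks
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

# Pearson's product moment correlation coefficient from the raw sums.
r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
print(sum_x, sum_y, sum_xy, sum_x2, sum_y2, round(r, 2))
```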
Spearman’s rank correlation coefficient is
$\rho = 1 - \frac{6\sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$
where d = difference between the ranks of each pair of values and n = number of paired
values.
For two or more values that are the same (tied), the average rank should be used. E.g.
if, after ranking down to the 5th position, the next two values are equal (8 and 8),
they would occupy the 6th and 7th positions, so add these rank values together
(6 + 7 = 13) and divide by the number of tied values (13/2 = 6.5). Both values get rank
6.5 and the next value has rank 8.
Example
Calculate the spearman’s rank correlation for the data and explain the value.
Age (x) 25 26 28 32 35 40
Mark (y) 60 65 77 89 72 87
Solution
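A sketch of the calculation for this data set follows. The helper `ranks` is an assumption written for this illustration; it assigns rank 1 to the smallest value and gives tied values their average rank, as described above (there are no ties in this example).

```python
def ranks(values):
    """Rank from smallest (1) upwards; tied values share the average rank."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

age = [25, 26, 28, 32, 35, 40]
mark = [60, 65, 77, 89, 72, 87]

rank_x, rank_y = ranks(age), ranks(mark)
d_squared = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
n = len(age)
rho = 1 - 6 * d_squared / (n * (n ** 2 - 1))

print(rank_y, d_squared, round(rho, 2))
```

The sum of squared rank differences is 10, giving a coefficient of about 0.71, which points to a fairly strong positive association between age and mark.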
REGRESSION ANALYSIS
A regression line is a straight line that describes how a dependent variable changes
with respect to an independent variable. The line is used to explain the change in a
dependent variable in terms of an independent variable and also to predict the values of
the dependent variable for a given value of the independent variable.
The best line is the line that minimizes the sum of the squared vertical distances
between the observed points and the line. The least squares regression line is the line
y = a + bx; y is the predicted response for any predictor x; a is the y-intercept and b
is the slope. There are simple and multiple linear regressions. Multiple linear
regression has two or more independent variables against one dependent variable.
Example
X 11 12 13 14 15 16 17
y 10 13 16 10 18 11 18
Fit a regression line equation to the data above and predict the value of x when y is 15.
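A minimal least squares sketch for this data is given below. It fits y on x with the usual formulas b = Sxy / Sxx and a = ȳ − b·x̄, then, as an assumption about what the question intends, inverts the fitted line to find the x that corresponds to y = 15.

```python
x = [11, 12, 13, 14, 15, 16, 17]
y = [10, 13, 16, 10, 18, 11, 18]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products about the means.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b = s_xy / s_xx            # slope
a = y_bar - b * x_bar      # y-intercept

# Assumption: read x off the fitted line y = a + bx at y = 15.
x_at_15 = (15 - a) / b
print(round(a, 3), round(b, 3), round(x_at_15, 2))
```

This gives the line y ≈ 2.714 + 0.786x, and y = 15 corresponds to x ≈ 15.64.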
Example
X 5 7 9 10 12 15
y 12 16 20 22 26 32
PROBABILITY
Introduction To Probability
Sample Space is the collection of all the simple events for a statistical experiment,
denoted S.
There exist two simple and important rules to be observed while assigning probabilities
to simple events:
An event is a single outcome or a group of outcomes of an experiment.
The empty set and the sample space are both events.
For mutually exclusive events A and B, P (A ∪ B) = P (A) + P (B).
Let n (A) be the number of outcomes in an event A of an experiment whose sample space
has n (S) equally likely outcomes. Then P (A) = n (A) / n (S).
Example
Let S = {1, 2, 3, 4, 5, 6} be the sample space when a die with n (S) = 6 is tossed once;
the event that the number showing is a factor of 6, E = {1, 2, 3, 6}, with n (E) = 4, has
probability, P (E) =n (E)/ n(S) = 4/6 = 2/3.
Example
Let two coins be tossed once or one coin be tossed two times. The sample space is S =
{HH, HT, TH, TT}, with n (S) = 4. The event of obtaining no head is E = {TT}, with
n (E) = 1 and hence P (E) = ¼.
The event that there is at least one tail is E = {HT, TH, TT} with n (E) = 3 and P (E) = ¾.
When a die is tossed twice or two dice are tossed once, the space is : {(1,1), (1,2), (1,3),
(1,4), (1,5), (1,6), (2,1), (2,2), (2,3), (2,4), (2,5), (2,6), (3,1), (3,2), (3,3), (3,4), (3,5), (3,6),
(4,1), (4,2), (4,3), (4,4), (4,5), (4,6), (5,1), (5,2), (5,3), (5,4), (5,5), (5,6), (6,1), (6,2), (6,3),
(6,4), (6,5), (6,6)} with n (S) = 36. The event that the sum of the values is 1 is E = { }
with P (E) = 0/36 = 0 and the event that the sum of the values is less than 13 is the entire
sample space, a sure event with P (E) = 36/36 =1
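The two-dice sample space is small enough to enumerate directly. The sketch below builds it and applies P (E) = n (E) / n (S); the helper `prob` and the extra event "sum equals 7" are illustrative additions.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes of two dice.
space = list(product(range(1, 7), repeat=2))

def prob(event):
    """P(E) = n(E) / n(S) for equally likely outcomes."""
    favourable = [outcome for outcome in space if event(outcome)]
    return Fraction(len(favourable), len(space))

p_sum_is_1 = prob(lambda o: sum(o) == 1)     # impossible event
p_sum_lt_13 = prob(lambda o: sum(o) < 13)    # sure event
p_sum_is_7 = prob(lambda o: sum(o) == 7)     # illustrative extra event

print(len(space), p_sum_is_1, p_sum_lt_13, p_sum_is_7)
```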
RELATIVE FREQUENCY DEFINITION OF PROBABILITY
Ages 6 7 8 9 10
Frequency 10 20 30 40 50
The probability that a child chosen at random is 9 years old is
P (9 years) = n (9 years) / n (F) = 40/150 = 4/15, where n (F) = 150 is the total frequency.
The probability that a child chosen at random is at most 9 years old is
P (at most 9) = n (at most 9) / n (F) = (10 + 20 + 30 + 40) / 150 = 100/150 = 2/3.
1) 0 ≤ P (E) ≤ 1
2) P (A ∪ B) = P (A) + P (B) – P (A ∩ B) and for mutually exclusive events, P (A ∩ B) = 0,
so P (A ∪ B) = P (A) + P (B)
3) P (A) + P (A’) = 1, so P (A’) = 1 – P (A)
Proofs
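1. ∅ ⊆ E ⊆ S
P (∅) ≤ P (E) ≤ P (S)
0 ≤ P (E) ≤ 1
2. Let x = n (A ∩ B) be the number of outcomes common to A and B. Then
n (A ∪ B) = [n (A) – x] + x + [n (B) – x]
n (A ∪ B) = n (A) + n (B) – x
n (A ∪ B) = n (A) + n (B) – n (A ∩ B)
Dividing through by n (S):
n (A ∪ B) / n (S) = n (A) / n (S) + n (B) / n (S) – n (A ∩ B) / n (S)
P (A ∪ B) = P (A) + P (B) – P (A ∩ B)
P (A ∩ B) = 0 ⇒ P (A ∪ B) = P (A) + P (B)
3. A ∪ A’ = S and A ∩ A’ = ∅, so P (A) + P (A’) = P (S) = 1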
Example The probability that a boy with a catapult hits target A is 2/3 and that he hits
target B is ¾. Given the probability of hitting both targets to be 1/2, find the probability
that he
a) hits at least one of the targets b) does not hit any.
Solution;
P (A) = 2/3, P (A’) =1/3, P (B) = ¾, P (B’) = ¼. P (A ∩ B) = ½.
a) P (at least one) = P (either A or B or both) = P (A ∪ B)
= P (A) + P (B) – P (A ∩ B) = 2/3 + 3/4 - 1/2 = 11/12
b) P (does not hit any) = 1 – P (A ∪ B) = 1 – 11/12 = 1/12
PROBABILITY OF INDEPENDENT EVENTS
Two or more events are independent if the probability of one of them is not affected by
knowing whether or not the other (s) have occurred. Events A and B are independent if
P (A ∩ B) = P (A) P (B). Equivalent to this is the condition P (A / B) = P (A) that is the
probability of A is the same as the conditional probability of A given B.
Example
A red and a black die are thrown. Let R be the event that the red die shows 6 and B that
the black die shows 6; then P (R ∩ B) = P (R) P (B) = 1/6 x 1/6 = 1/36.
The Inclusion – Exclusion Formula for three events is, P(A ∪ B ∪ C) = P (A) + P (B) +
P (C) –P (A ∩ B) – P (A ∩C) – P (B ∩ C)+ P(𝐴 ∩ 𝐵 ∩ 𝐶)
Example
A bowl contains 13 red and 7 white identical balls. A ball is selected at random from the bowl; find the
probability of selecting
Solution
d) P (R and W or W and R) = P (R ∩ W) + P (W ∩ R)
Example
The probability that A hits a target is ¾ and that B hits is 2/3 and that of C is 3/5. Given
that they fire together, find the probability that
a) they all missed the target b) exactly one hits the target
c) at least one shot hits d) A hits given that exactly one hit is recorded
Solution
P (A) = ¾ P (A’) = ¼, P (B) = 2/3 P (B’) = 1/3 P (C) = 3/5 P (C’) = 2/5
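With these probabilities, the four parts can be computed by listing the eight hit/miss patterns. The sketch below assumes the three shots are independent (the example does not say so explicitly) and uses exact fractions; the helper `prob` is an illustrative name.

```python
from fractions import Fraction
from itertools import product

p_hit = {"A": Fraction(3, 4), "B": Fraction(2, 3), "C": Fraction(3, 5)}

def prob(outcome):
    """Probability of one hit/miss pattern, assuming independent shots."""
    result = Fraction(1)
    for name, hit in outcome.items():
        result *= p_hit[name] if hit else 1 - p_hit[name]
    return result

# All 8 hit/miss patterns, e.g. {"A": True, "B": False, "C": False}.
outcomes = [dict(zip(p_hit, hits)) for hits in product([True, False], repeat=3)]

all_miss = sum(prob(o) for o in outcomes if sum(o.values()) == 0)
exactly_one = sum(prob(o) for o in outcomes if sum(o.values()) == 1)
at_least_one = 1 - all_miss
a_given_exactly_one = (
    sum(prob(o) for o in outcomes if sum(o.values()) == 1 and o["A"]) / exactly_one
)

print(all_miss, exactly_one, at_least_one, a_given_exactly_one)
```

Under the independence assumption this gives 1/30, 13/60, 29/30 and 6/13 for parts a) to d).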
Conditional probability is the probability of some event A, given that event B occurs. For
any two events A and B with P (B) > 0, the conditional probability of event A given that
B has occurred is given as
P (A/B) = P (A ∩ B) / P (B)
Example
Suppose that at a Goil Filling Station, 60% of drivers check their oil levels, 40% check
tyre pressure and 10% check both oil levels and tyre pressure. Suppose also that a
driver is selected at random without bias. What is the probability that a driver checked
his tyre pressure given that he had checked his oil levels?
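Solution
Let O be the event that oil levels are checked and T the event that tyre pressure is
checked. The probability that tyre pressure is checked given that oil levels are checked
is P (T/O) = P (T ∩ O) / P (O) = 0.1 / 0.6 = 1/6
Again, the probability that oil levels are checked given that tyre pressure is checked is
given by P (O/T) = P (O ∩ T) / P (T) = 0.1 / 0.4 = 1/4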
Example
Students of a school were selected for an alcohol test. The table is a distribution of
results.
Solution
a) P (positive / alcoholic) = P (positive and alcoholic) / P (alcoholic) = $\frac{7/99}{77/99} = \frac{7}{99} \times \frac{99}{77} = \frac{7}{77} = \frac{1}{11}$
TOTAL PROBABILITY
If event A could be realized only when one of the mutually exclusive events B1, B2, B3,
B4, . . . , Bn occurs, then the probability of event A is
P (A) = P (B1) P (A/B1) + P (B2) P (A/B2) + . . . + P (Bn) P (A/Bn)
Example
In a used car garage, 45% of the cars are manufactured in the U.S.A. and 20% of these are
compact, 25% are manufactured in Europe and 30% of these are compact, and finally 30% are
manufactured in Japan and 70% of these are compact.
a) If a car is selected at random from the garage, find the probability that it is compact.
b) Given that the car is compact, find the probability that it was manufactured in
Europe.
Solution
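One possible working of both parts is sketched below with exact fractions: part a) is the total probability over the three origins, and part b) divides the Europe term by that total. The dictionary names are illustrative.

```python
from fractions import Fraction

# P(origin) and P(compact | origin) from the example.
origin = {
    "USA": Fraction(45, 100),
    "Europe": Fraction(25, 100),
    "Japan": Fraction(30, 100),
}
compact_given = {
    "USA": Fraction(20, 100),
    "Europe": Fraction(30, 100),
    "Japan": Fraction(70, 100),
}

# a) Total probability: sum of P(origin) * P(compact | origin).
p_compact = sum(origin[k] * compact_given[k] for k in origin)

# b) P(Europe | compact) = P(Europe) * P(compact | Europe) / P(compact).
p_europe_given_compact = origin["Europe"] * compact_given["Europe"] / p_compact

print(p_compact, p_europe_given_compact)
```

That is, P (compact) = 0.09 + 0.075 + 0.21 = 0.375 and P (Europe / compact) = 0.075 / 0.375 = 0.2.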
Example
During a day’s production, X produces 1440 cards, Y produces 864 cards and Z does 576 cards.
The probability of X producing a defective card is 0.02, that of Y is 0.1 and that of Z is 0.05.
Find the probability that at the end of the day, one card selected at random will be defective.
Solution
BAYES’ THEOREM
Example
Suppose that Bob can decide to go to work by one of three modes of transportation,
car, bus, or commuter train. Because of high traffic, if he decides to go by car, there is a
50% chance he will be late. If he goes by bus, which has special reserved lanes but is
sometimes overcrowded, the probability of being late is only 20%. The commuter train is
almost never late, with a probability of only 1%, but is more expensive than the bus.
(a) Suppose that Bob is late one day, and his boss wishes to estimate the probability
that he drove to work that day by car. Since he does not know which mode of
transportation Bob usually uses, he gives a prior probability of 1/3 to each of the three
possibilities. What is the boss’ estimate of the probability that Bob drove to work?
(b) Suppose that a coworker of Bob’s knows that he almost always takes the commuter
train to work, never takes the bus, but sometimes, 10% of the time, takes the car. What
is the coworker’s probability that Bob drove to work that day, given that he was late?
Solution
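Both parts apply Bayes' theorem with the same likelihoods of being late but different prior probabilities. A sketch follows; the function name `posterior` is illustrative, and the coworker's prior of 90% for the train is an assumption inferred from "never takes the bus" and "10% of the time takes the car".

```python
from fractions import Fraction

def posterior(prior, likelihood, hypothesis):
    """Bayes' theorem: P(H | E) = P(H) P(E | H) / sum over all h of P(h) P(E | h)."""
    evidence = sum(prior[h] * likelihood[h] for h in prior)
    return prior[hypothesis] * likelihood[hypothesis] / evidence

# P(late | mode of transport) from the example.
late = {"car": Fraction(50, 100), "bus": Fraction(20, 100), "train": Fraction(1, 100)}

# (a) The boss gives each mode a prior of 1/3.
boss_prior = {mode: Fraction(1, 3) for mode in late}
# (b) The coworker's prior: car 10%, bus 0%, train the remaining 90% (assumed).
coworker_prior = {"car": Fraction(10, 100), "bus": Fraction(0), "train": Fraction(90, 100)}

p_boss = posterior(boss_prior, late, "car")
p_coworker = posterior(coworker_prior, late, "car")
print(p_boss, float(p_boss), p_coworker, float(p_coworker))
```

The boss's estimate is 50/71 (about 0.70) and the coworker's is 50/59 (about 0.85).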
Example
In Orange County, 51% of the adults are males and the other 49% are females. One
adult is randomly selected for a survey involving credit card usage. It is later learned
that the selected survey subject was smoking a cigar. Also, 9.5% of males smoke
cigars, whereas 1.7% of females smoke cigars. Find the probability that the selected
subject is a male.
Solution
Example
Three boxes A, B and C, contain red and black balls. Box A contains 2 red and 3 black
balls, box B contains 1 red and 4 black balls and box C contains 3 red balls and 1 black ball.
We choose randomly a box, and from this box we choose randomly one of the balls.
Assume that the drawn ball is red. Find the probability that the ball comes from box A.
Solution
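P (A) = P (B) = P (C) = 1/3, P (R|A) = 2/5, P (R|B) = 1/5, P (R|C) = ¾
$P(A|R) = \frac{P(A)P(R|A)}{P(A)P(R|A) + P(B)P(R|B) + P(C)P(R|C)} = \frac{\frac{1}{3} \times \frac{2}{5}}{\frac{1}{3}\left(\frac{2}{5} + \frac{1}{5} + \frac{3}{4}\right)} = \frac{8}{27} \approx 0.3$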
Example
Two boxes A1 and A2 contain w1 white and b1 black balls and w2 white and b2 black balls
respectively. We draw at random one ball from each one of the boxes and then at
random one of the two balls. Find the probability that this ball is white.
Solution
Let Ai denote the event that a ball comes from box i and let A denote the event that the
ball is white. Since we choose 1 ball from each box, we get
P (Ai) = ½ , i = 1, 2
$P(A|A_i) = \frac{w_i}{w_i + b_i}$, i = 1, 2
$P(A) = P(A_1)P(A|A_1) + P(A_2)P(A|A_2) = \frac{1}{2} \cdot \frac{w_1}{w_1 + b_1} + \frac{1}{2} \cdot \frac{w_2}{w_2 + b_2}$
Example
An information channel can transmit 0s and 1s, though some errors may occur. One
expects that a sent 0 is changed with the probability 1/5 to a 1, and that a sent 1 is
changed with the probability 1/6 to a 0. It is also given that on average 2/3 of all
signals are 0s.
a) Assuming that we receive a 0, what is the probability that a 0 was sent?
b) Assuming that we receive a 1, what is the probability that a 1 was sent?
Solution
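a) Ai = {i sent}, i = 0, 1; A = {0 received}
$P(0 \text{ sent} \mid 0 \text{ received}) = \frac{P(0 \text{ sent})P(0 \text{ received} \mid 0 \text{ sent})}{P(0 \text{ sent})P(0 \text{ received} \mid 0 \text{ sent}) + P(1 \text{ sent})P(0 \text{ received} \mid 1 \text{ sent})} = \frac{\frac{2}{3} \times \frac{4}{5}}{\frac{2}{3} \times \frac{4}{5} + \frac{1}{3} \times \frac{1}{6}} = \frac{48}{53}$
b) Ai = {i sent}, i = 0, 1; A = {1 received}
$P(1 \text{ sent} \mid 1 \text{ received}) = \frac{P(1 \text{ sent})P(1 \text{ received} \mid 1 \text{ sent})}{P(1 \text{ sent})P(1 \text{ received} \mid 1 \text{ sent}) + P(0 \text{ sent})P(1 \text{ received} \mid 0 \text{ sent})} = \frac{\frac{1}{3} \times \frac{5}{6}}{\frac{1}{3} \times \frac{5}{6} + \frac{2}{3} \times \frac{1}{5}} = \frac{25}{37}$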
Example
A factory buys 1000 light bulbs of type A and 500 bulbs of type B, which are somewhat
more expensive. For a randomly chosen bulb of type A, the probability that it lasts
longer than 2 months is 0.6. For a randomly chosen bulb of type B, the probability that
it lasts longer than 2 months is 0.9. By mistake all bulbs are mixed together.
a) A bulb is chosen at random from the 1500 bulbs. Find the probability that this
bulb will last for longer than 2 months.
b) If a bulb lasts for more than 2 months, what is the probability that it is of type A?
Solution
a) P (the bulb lasts more than 2 months) = $\frac{1000}{1500} \times 0.6 + \frac{500}{1500} \times 0.9 = 0.4 + 0.3 = 0.7$
b) P (type A / lasts more than 2 months) = 0.4 / 0.7 = 4/7
Example
An aircraft emergency locator transmitter (ELT) is a device designed to transmit a signal
in the case of a crash. The Altigauge Manufacturing Company makes 80% of the ELTs,
the Bryant Company makes 15% of them, and the Chartair Company makes the other
5%. The ELTs made by Altigauge have a 4% rate of defects, the Bryant ELTs have a
6% rate of defects, and the Chartair ELTs have a 9% rate of defects.
If a randomly selected ELT is then tested and is found to be defective, find the
probability that it was made by the Altigauge Manufacturing Company.
Solution
P(A) = 0.80 P(B) = 0.15 P(C) = 0.05 P(D|A) = 0.04 P(D|B) = 0.06 P(D|C) = 0.09
E.g. 7! = 7 x 6 x 5 x 4 x 3 x 2 x 1
Note: 0! = 1! = 1
Example
a) It is a multiple of 3 or 8? (b) It is a multiple of 2 or 3?
Solution
MULTIPLICATION THEOREM
Suppose that event D1 could result in any one of n (D1) outcomes, and for
each outcome of the event D1, there are n (D2) outcomes of event D2; then
together there will be n (D1) x n (D2) outcomes for the two events.
n (D) = n (D1) x n (D2)
Example
One has 20 pairs of jeans and 16 shirts, in how many ways could the person
combine these clothes if he wears a pair of jeans and a shirt at a time?
Solution
n (D) = n (D1) x n (D2) = 20 x 16 = 320 ways.
PERMUTATIONS
Permutation is the different arrangements of a given number of things by
considering some or all at a time.
In permutations, ‘order’ is the watchword. In general, the number of
permutations of n distinct things taking them all at a time is
nPn = n! / (n – n)! = n! / 0! = n! / 1 = n!
E.g. 4P4 = 4! / (4 – 4)! = 4! / 0! = 4 x 3 x 2 x 1 = 24
Again, the number of permutations of n distinct things taking r at a time is
nPr = n! / (n – r)!
E.g. 6P3 = 6! / (6 – 3)! = 6! / 3! = 6 x 5 x 4 = 120
Example
How many ways can gold, silver, and bronze medals be awarded for a race run by 8
people?
Solution.
Using the permutation formula we find P (8, 3) = 8! / (8 − 3)! = 8 x 7 x 6 = 336 ways.
Example
How many five-digit zip codes can be made where all digits are unique? The possible
digits are the numbers 0 through 9.
Solution. P (10, 5) = 10! / (10 − 5)! = 30,240 zip codes.
Example
Solution
Example
How many different number plates for cars could be made if each number plate
contains four (4) of the digits from 0 – 9 followed by a letter A – Z, and prefixed with GT,
assuming that
Solution
0 – 9 gives n = 10, r = 4
a) No repetition; 10P4 = 10! / (10 – 4)! = 10! / 6! = 10 x 9 x 8 x 7 = 5040
If 0 is not used, we’ve 260000 – 26 = 259974 plates
Example
In how many ways can the letters of the word ‘STATISTICS’ be arranged?
Solution
The word has n = 10 letters: S appears 3 times, T 3 times, A once, I twice and C once.
Number of arrangements = 10! / (3! 3! 1! 2! 1!) = 10! / (3! 3! 2!) = 50400 ways
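The same count can be obtained for any word by dividing n! by the factorial of each letter's frequency. A small sketch (the function name `arrangements` is illustrative):

```python
from collections import Counter
from math import factorial

def arrangements(word):
    """Number of distinct arrangements of the letters of word: n! / (n1! n2! ...)."""
    total = factorial(len(word))
    for count in Counter(word).values():
        total //= factorial(count)   # divide out the repeats of each letter
    return total

print(arrangements("STATISTICS"))
```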
COMBINATIONS
The number of ways of selecting r items from a set of n distinct objects without regard to
any order is referred to as combinations.
Example
How many ways are there to select a committee to develop a discrete mathematics
course at a school if the committee is to consist of 3 faculty members from the
Mathematics department and 4 from the computer science department, if there are 9
faculty members of the math department and 11 of the CS department?
Solution.
There are C(9, 3) · C(11, 4) = 9! / (3! (9 – 3)!) x 11! / (4! (11 – 4)!) = 84 x 330 = 27,720 ways.
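The committee count can be reproduced with math.comb, which returns the number of unordered selections. This is only a check of the arithmetic above; the variable names are illustrative.

```python
from math import comb

maths_choices = comb(9, 3)   # 3 of the 9 mathematics faculty
cs_choices = comb(11, 4)     # 4 of the 11 computer science faculty

# The two selections are made independently, so the counts multiply.
committees = maths_choices * cs_choices

print(maths_choices, cs_choices, committees)  # 84 330 27720
```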
Example
How many combinations are there in 6 distinct things taking 4 at a time?
NB: nCk = nC(n – k)
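A small sketch, assuming the six things are simply labelled A to F, lists the selections directly and checks the symmetry rule in the NB for n = 6.

```python
from itertools import combinations
from math import comb

things = "ABCDEF"  # six distinct things; the labels are arbitrary

# Every unordered selection of 4 of the 6 things.
selections = list(combinations(things, 4))
print(len(selections))  # 15

# Choosing k things to keep is the same as choosing n - k things to leave out.
symmetry_holds = all(comb(6, k) == comb(6, 6 - k) for k in range(7))
print(symmetry_holds)  # True
```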
Example
In how many different ways can 4 of 13 teachers be selected to assist with the
preparation of examinations?
Solution
n = 13, r = 4, hence 13C4 ways
13C4 = 13! / (4! (13 – 4)!) = 13! / (4! 9!) = (13 x 12 x 11 x 10) / 24 = 715
Example
A statistics lecturer sets 7 questions in an end of semester exam and students were
asked to attempt any 4 of them. Find the number of ways of selecting these questions.
Solution
n = 7, r = 4, hence 7C4 ways
7C4 = 7! / (4! (7 – 4)!) = 7! / (4! 3!) = (7 x 6 x 5) / 6 = 35 ways
Example
In how many ways could a committee comprising 7 men and 6 women be formed
from a group of 9 men and 8 women?
Solution
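The working for this example is not shown in these notes. One way to obtain it, sketched here on the assumption that the men and the women are chosen independently, is to multiply the two combination counts as in the earlier committee example.

```python
from math import comb

men_ways = comb(9, 7)     # choose 7 of the 9 men
women_ways = comb(8, 6)   # choose 6 of the 8 women

# Independent choices multiply (fundamental principle of counting).
committee_ways = men_ways * women_ways

print(men_ways, women_ways, committee_ways)  # 36 28 1008
```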
Example
A box contains 12 red, 6 white and 10 blue balls. If three balls are drawn at random
simultaneously, find the probability that
Solution
a) P(all red) = (ways of selecting 3 of the 12 red balls) / (ways of selecting 3 out of the 12 + 6 + 10 = 28 balls)
= 12C3 / 28C3 = (12! / (3! 9!)) / (28! / (3! 25!)) = (2 x 11 x 10) / (28 x 9 x 13) = 55/819
b) P(2 red and 1 white) = (ways of selecting 2 of the 12 red and 1 of the 6 white) / 28C3
= (12C2 x 6C1) / 28C3 = (66 x 6) / 3276 = 11/91
c) P(at least 1 red) = 1 – P(no red) = 1 – 16C3 / 28C3
= 1 – (16! / (3! 13!)) / (28! / (3! 25!)) = 1 – 20/117 = 97/117
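The three probabilities can be checked exactly with Python's Fraction type, which keeps each ratio in lowest terms. The constant names are illustrative.

```python
from fractions import Fraction
from math import comb

RED, WHITE, BLUE = 12, 6, 10
total_draws = comb(RED + WHITE + BLUE, 3)  # 28C3 equally likely draws

p_all_red = Fraction(comb(RED, 3), total_draws)
p_two_red_one_white = Fraction(comb(RED, 2) * comb(WHITE, 1), total_draws)
# "At least one red" is the complement of "all three from the 16 non-red balls".
p_at_least_one_red = 1 - Fraction(comb(WHITE + BLUE, 3), total_draws)

print(p_all_red, p_two_red_one_white, p_at_least_one_red)  # 55/819 11/91 97/117
```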
Example
In a group of 6 boys and 4 girls, four are to be selected. In how many ways can they be
selected such that at least one boy should be there?
Solution
Ways = (6C4) + (6C3 x 4C1) + (6C2 x 4C2) + (6C1 x 4C3) = 15 + 80 + 90 + 24 = 209
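The same total can be reached through the complement: count every group of four and subtract the single group made up of girls only. The sketch below compares the two routes.

```python
from math import comb

# Direct count: 1, 2, 3 or 4 boys, with the remaining places filled by girls.
direct = sum(comb(6, boys) * comb(4, 4 - boys) for boys in range(1, 5))

# Complement: all groups of 4 from the 10 children, minus the all-girl group.
complement = comb(10, 4) - comb(4, 4)

print(direct, complement)  # 209 209
```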
Example
There are 6 periods in each working day of a school. In how many ways can one
organize 5 subjects such that each subject is allowed at least one period?
Solution
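In 6 periods, 5 subjects can be organized in 6P5 ways and the remaining 1 period can be
given to any of the 5 subjects in 5P1 ways. This counts every timetable twice, because
swapping the two periods of the repeated subject gives the same timetable, so the number
of ways is (6P5 x 5P1) / 2! = (720 x 5) / 2 = 1800.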
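A brute-force count is a useful check on an argument of this kind. The sketch below assigns one of the 5 subjects to each of the 6 periods and keeps only the timetables in which every subject appears. It gives 1800, which is half of 6P5 x 5P1 = 3600, because that product counts each timetable twice.

```python
from itertools import product

SUBJECTS = 5
PERIODS = 6

# Each timetable assigns one subject (numbered 0-4) to each of the 6 periods.
timetables = sum(
    1
    for day in product(range(SUBJECTS), repeat=PERIODS)
    if len(set(day)) == SUBJECTS  # every subject gets at least one period
)

print(timetables)  # 1800
```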
Example
Given a class of 12 girls and 10 boys.
(a) In how many ways can a committee of 5 consisting of 3 girls and 2 boys be chosen?
(b) What is the probability that a committee of five, chosen at random from the class, consists of
3 girls and 2 boys?
(c) How many of the possible committees of 5 have no boys?(i.e. consists only of girls)
(d) What is the probability that a committee of five, chosen at random from the class, consists
only of girls?
Solution
(a) First note that the order of the children in the committee does not matter. From 12 girls we
can choose C(12, 3) different groups of three girls. From the 10 boys we can choose C(10, 2)
different groups. Thus, by the Fundamental Principle of Counting the total number of committees
is
C(12, 3) x C(10, 2) = 12! / (3! 9!) x 10! / (2! 8!) = (12 x 11 x 10) / (3 x 2 x 1) x (10 x 9) / (2 x 1) = 220 x 45 = 9900
(b) The total number of committees of 5 is C(22, 5) = 26,334. Using part
(a), we find the probability that a committee of five will consist of 3 girls and
2 boys to be
C(12, 3) x C(10, 2) / C(22, 5) = 9900 / 26334 ≈ 0.376
(c) The number of ways to choose 5 girls from the 12 girls in the class is
C(10, 0) x C(12, 5) = C(12, 5) = (12 x 11 x 10 x 9 x 8) / (5 x 4 x 3 x 2 x 1) = 792
(d) The probability that a committee of five consists only of girls is
C(12, 5) / C(22, 5) = 792 / 26334 ≈ 0.03
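Parts (a) to (d) can be recomputed in a few lines. This is a sketch for checking the arithmetic, with the probabilities rounded to three decimal places.

```python
from fractions import Fraction
from math import comb

GIRLS, BOYS, SIZE = 12, 10, 5

all_committees = comb(GIRLS + BOYS, SIZE)              # C(22, 5)
three_girls_two_boys = comb(GIRLS, 3) * comb(BOYS, 2)  # part (a)
girls_only = comb(GIRLS, 5)                            # part (c)

p_three_girls_two_boys = Fraction(three_girls_two_boys, all_committees)  # part (b)
p_girls_only = Fraction(girls_only, all_committees)                      # part (d)

print(three_girls_two_boys, girls_only, all_committees)
print(round(float(p_three_girls_two_boys), 3), round(float(p_girls_only), 3))
```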
1 POISSON DISTRIBUTION
Situations occur when the variable under consideration is the number of occurrences of a
particular event in a given interval of time or space. Examples include the number of cars
passing a point on a road in an hour and the number of phone calls a person receives in a day.
The distribution used to model these scenarios is the Poisson probability distribution, defined by
P(X = x) = e^(–λ) λ^x / x!, x = 0, 1, 2, 3, . . . where λ is the mean number of occurrences in the interval.
Example
The number of particles emitted per second by a radioactive source has a Poisson distribution
with mean 5. Calculate the probabilities of
(a) 0 (b) 1 (c) 2 (d) 3 or more emissions in a time interval of 1 second
Solution
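X ~ Po(5), so P(X = x) = e^(–λ) λ^x / x! = e^(–5) 5^x / x!
(a) P(X = 0) = e^(–5) 5^0 / 0! ≈ 0.007
(b) P(X = 1) = e^(–5) 5^1 / 1! ≈ 0.034
(c) P(X = 2) = e^(–5) 5^2 / 2! ≈ 0.084
(d) P(X ≥ 3) = 1 – [P(X = 0) + P(X = 1) + P(X = 2)] ≈ 1 – 0.125 = 0.875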
Mean = µ = E(X) = λ
Variance = σ² = Var(X) = λ
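The Poisson probabilities for the radioactive source, and the fact that the mean and variance both equal λ, can be checked numerically. In this sketch poisson_pmf is an illustrative helper, and the infinite sums for the mean and variance are cut off at x = 99, where the remaining terms are negligible for λ = 5.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for a Poisson variable with mean lam."""
    return exp(-lam) * lam ** x / factorial(x)


lam = 5
p0, p1, p2 = (poisson_pmf(x, lam) for x in range(3))
p_three_or_more = 1 - (p0 + p1 + p2)

# Mean and variance from the definition, truncating the sums at x = 99.
xs = range(100)
mean = sum(x * poisson_pmf(x, lam) for x in xs)
variance = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in xs)

print(round(p0, 4), round(p1, 4), round(p2, 4), round(p_three_or_more, 4))
print(round(mean, 6), round(variance, 6))
```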
2 BINOMIAL DISTRIBUTION
A single trial has exactly two mutually exclusive outcomes, success (with probability p) and
failure (with probability q = 1 – p). A fixed number of trials, n, takes place, with each trial
being independent of the outcome of all the other trials.
Let X, the random variable, represent the number of successes in the n trials of an experiment.
Then X has a probability distribution given by
P(X = x) = nCx p^x q^(n – x), x = 0, 1, 2, . . . , n
and we write X ~ B(n, p).
E.g. A card is selected at random from a standard pack of 52 playing cards. The suit of the card
is recorded and the card is replaced. This process is repeated to give a total of 16 selections,
and on each occasion the card is replaced in the pack before another selection is made.
Calculate the probability that
(a) exactly five hearts occur in the 16 selections (b) at least three hearts occur
Solution
P(heart) = P(success) = 1/4 (hearts are one of the four suits) and P(failure) = 1 – 1/4 = 3/4,
hence X ~ B(16, 1/4)
(a) P(X = 5) = 16C5 (1/4)^5 (3/4)^11 ≈ 0.18
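The binomial probabilities for the card example can be evaluated directly. Part (b) is not worked in these notes; the sketch assumes the usual complement approach, P(X ≥ 3) = 1 – P(X ≤ 2). The helper name binomial_pmf is illustrative.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 16, 0.25  # 16 selections, P(heart) = 1/4

p_exactly_five = binomial_pmf(5, n, p)
# At least three hearts = 1 - P(0, 1 or 2 hearts).
p_at_least_three = 1 - sum(binomial_pmf(x, n, p) for x in range(3))

print(round(p_exactly_five, 3), round(p_at_least_three, 3))
```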
3 Geometric Distribution
Consider an experiment with only two outcomes, success, with probability p, and failure, with
probability q, where p + q = 1. Let X be the number of independent trials needed to obtain the
first success. The probability mass function (pmf) of this discrete probability distribution is
P(X = x) = p q^(x – 1) for x = 1, 2, 3, . . .
E(X) = µ = 1/p and Var(X) = q/p² = (1 – p)/p²
Example
In a certain production process it is known that, on the average, 1 in every 100 items is
defective. What is the probability that the fifth item inspected is the first defective item found?
Solution
Here p = 1/100 = 0.01 and q = 0.99, so P(X = 5) = p q^4 = (0.01)(0.99)^4 ≈ 0.0096
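A minimal sketch of the geometric calculation for this example, taking p = 0.01 and counting x from 1 (the trial on which the first success occurs). The mean and variance are approximated by truncating the infinite sums at 5000 terms, an arbitrary cut-off beyond which the terms are negligible for this p.

```python
def geometric_pmf(x: int, p: float) -> float:
    """P(first success occurs on trial x), for x = 1, 2, 3, ..."""
    return p * (1 - p) ** (x - 1)


p = 0.01  # 1 defective item in every 100

p_fifth_is_first_defective = geometric_pmf(5, p)

# Truncated sums; they should be close to 1/p = 100 and q/p^2 = 9900.
terms = range(1, 5001)
mean = sum(x * geometric_pmf(x, p) for x in terms)
variance = sum((x - mean) ** 2 * geometric_pmf(x, p) for x in terms)

print(round(p_fifth_is_first_defective, 4))
print(round(mean, 3), round(variance, 1))
```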
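4 Hypergeometric Distribution
Sampling for this distribution is carried out without replacement and hence the repeated trials
are not independent. Consider a population of size N composed of two categories, R items of
one kind (good) and N – R of the other (defective). A sample of n items is drawn and X is the
number of sampled items that belong to the category of size R.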
P(X = x) = p(x) = C(R, x) C(N – R, n – x) / C(N, n)
x = 0, 1, 2, . . . , n with 0 ≤ x ≤ R and 0 ≤ n – x ≤ N – R
E(X) = µ = nR/N and Var(X) = ((N – n)/(N – 1)) . n . (R/N)(1 – R/N)
Example
A class in Statistics has 25 students, 15 males and 10 females. A committee of 5 students is to
be selected. What is the probability mass function for the number, X, of females on the
committee?
Calculate the mean and standard deviation.
Solution
P(X = x) = p(x) = C(10, x) C(15, 5 – x) / C(25, 5), x = 0, 1, 2, 3, 4, 5
x      0      1      2      3      4      5
p(x)   0.057  0.257  0.385  0.237  0.059  0.005
µ = (5 x 10) / 25 = 2
Var(X) = ((25 – 5)/(25 – 1)) . 5 . (10/25)(1 – 10/25) = 1
Standard deviation = √1 = 1
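The probability table, mean and standard deviation for the committee example can be recomputed from the pmf. In the sketch below hypergeometric_pmf is an illustrative helper with N = 25 students, R = 10 females and a sample of n = 5; the mean and variance are taken from the definition rather than from the closed formulas, so they act as an independent check.

```python
from math import comb, sqrt


def hypergeometric_pmf(x: int, N: int, R: int, n: int) -> float:
    """P(X = x): x items from the R-category in a sample of n drawn without replacement."""
    return comb(R, x) * comb(N - R, n - x) / comb(N, n)


N, R, n = 25, 10, 5
pmf = {x: hypergeometric_pmf(x, N, R, n) for x in range(n + 1)}

mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

for x, p in pmf.items():
    print(x, round(p, 3))
print(round(mean, 6), round(variance, 6), round(sqrt(variance), 6))
```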
The mean, µ, also called the expectation or expected value of a random variable, is the
long-run average value of the variable over a large number of repetitions of the experiment.
For a discrete random variable, E(X) = µ = ∑ x_i p_i
Example
Find the expected values of the random variables X, Y and W which have the following
probability distributions
(i)
x          0    1    2    3    4
P(X = x)   1/8  3/8  1/8  1/4  1/8
(ii)
y          -2    -1    0    1     2    3
P(Y = y)   0.15  0.25  0.3  0.05  0.2  0.05
(iii)
w          1  2  3  4  5  6  7
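The expected values for (i) and (ii) follow from E(X) = ∑ x_i p_i. The sketch below uses exact fractions to avoid rounding; expected_value is an illustrative helper. The probabilities for W are not given in these notes, so (iii) is left out.

```python
from fractions import Fraction


def expected_value(values, probabilities):
    """Sum of value x probability over a discrete distribution."""
    return sum(v * p for v, p in zip(values, probabilities))


x_values = [0, 1, 2, 3, 4]
x_probs = [Fraction(1, 8), Fraction(3, 8), Fraction(1, 8), Fraction(1, 4), Fraction(1, 8)]

y_values = [-2, -1, 0, 1, 2, 3]
y_probs = [Fraction(s) for s in ("0.15", "0.25", "0.3", "0.05", "0.2", "0.05")]

e_x = expected_value(x_values, x_probs)
e_y = expected_value(y_values, y_probs)

print(e_x, e_y)  # 15/8 1/20
```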
Example
Calculate the standard deviation of the random variable W above.
Solution
E(X) = np = 100(0.04) = 4 and Var(X) = npq = 100(0.04)(0.96) = 3.84
The normal distribution N(µ, σ²) has probability density function
f(x) = (1 / (σ√(2π))) e^(–(x – µ)² / (2σ²)) for all real values of x.
We standardize variables so that one normal distribution table can be used for all normal
distributions. The standardized value, Z, is obtained from the value of the variable X as
Z = (X – µ) / σ
so that Z has mean 0 and variance 1. This transformation allows you to change a statement
about a N(µ, σ²) distribution into an equivalent statement about a N(0, 1) distribution.
For example, with µ = 205 and σ = 20:
P(X ≤ 230) = P(Z ≤ (230 – 205) / 20) = P(Z ≤ 1.25) = Φ(1.25) = 0.894
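The table value Φ(1.25) can be reproduced without tables. This sketch takes µ = 205 and σ = 20 from the calculation above, writes the standard normal cdf in terms of the error function, and cross-checks it against statistics.NormalDist from the standard library (Python 3.8 or later).

```python
from math import erf, sqrt
from statistics import NormalDist


def standard_normal_cdf(z: float) -> float:
    """Phi(z) = P(Z <= z) for Z ~ N(0, 1)."""
    return 0.5 * (1 + erf(z / sqrt(2)))


mu, sigma = 205, 20
z = (230 - mu) / sigma            # standardized value
p = standard_normal_cdf(z)        # P(Z <= 1.25)

# Same probability computed on the original N(205, 20^2) scale.
p_direct = NormalDist(mu, sigma).cdf(230)

print(z, round(p, 3), round(p_direct, 3))  # 1.25 0.894 0.894
```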
2 Exponential distributions
The cumulative distribution function is F(x) = 1 – e^(–λx) for x ≥ 0
E(X) = µ = 1/λ and Var(X) = 1/λ²
A device contains two electrical components, A and B. The lifespans of A and B are both
exponentially distributed with expected values of five years and ten years respectively. The
device works as long as both components work. What is the expected lifespan of the device?
Solution
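A: E(X) = 1/λ_A = 5, so λ_A = 1/5
B: E(X) = 1/λ_B = 10, so λ_B = 1/10
The device fails as soon as either component fails, so its lifespan is the smaller of the two
lifespans. Assuming the components fail independently,
P(both still work at time x) = e^(–x/5) e^(–x/10) = e^(–3x/10),
so the lifespan of the device is exponentially distributed with λ = 1/5 + 1/10 = 3/10.
A and B both work: E(X) = 1/λ = 10/3 ≈ 3.3 years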
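Because the device stops when the first component fails, its lifespan is the minimum of the two component lifespans. For independent exponential lifespans the rates add, so the expected lifespan is 1 / (1/5 + 1/10) = 10/3 years. The sketch below assumes independence and checks this value with a seeded simulation; the number of trials and the seed are arbitrary choices.

```python
import random

mean_a, mean_b = 5.0, 10.0                  # expected lifespans in years
rate_a, rate_b = 1 / mean_a, 1 / mean_b     # lambda = 1 / mean

# Minimum of independent exponentials is exponential with the summed rate.
expected_device_life = 1 / (rate_a + rate_b)

# Simulation check: the device lasts until the first component fails.
rng = random.Random(0)
trials = 200_000
simulated_device_life = sum(
    min(rng.expovariate(rate_a), rng.expovariate(rate_b)) for _ in range(trials)
) / trials

print(round(expected_device_life, 3), round(simulated_device_life, 2))
```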
3 Gamma distribution
Γ(n) = (n – 1)! for a positive integer n
Γ(1/2) = √π
Letting α = 1, we have
f_X(x) = λ e^(–λx) for x > 0, and 0 otherwise
which is the exponential distribution.
E(X) = α/λ and Var(X) = α/λ²
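The gamma function facts above can be checked with math.gamma. The density used in this sketch, f(x) = λ^α x^(α – 1) e^(–λx) / Γ(α) for x > 0, is the standard rate form of the gamma pdf; it is an assumption here, since the general formula does not appear in these notes, but it is consistent with E(X) = α/λ and with the α = 1 case shown.

```python
from math import exp, factorial, gamma, isclose, pi, sqrt


def gamma_pdf(x: float, alpha: float, lam: float) -> float:
    """Gamma density in rate form (assumed parametrization), zero for x <= 0."""
    if x <= 0:
        return 0.0
    return lam ** alpha * x ** (alpha - 1) * exp(-lam * x) / gamma(alpha)


# Gamma(n) = (n - 1)! for positive integers n.
factorial_rule_holds = all(isclose(gamma(n), factorial(n - 1)) for n in range(1, 8))

# Gamma(1/2) = sqrt(pi).
half_value = gamma(0.5)

# With alpha = 1 the density reduces to the exponential density lam * e^(-lam * x).
lam, x = 0.5, 2.0
reduces_to_exponential = isclose(gamma_pdf(x, 1, lam), lam * exp(-lam * x))

print(factorial_rule_holds, isclose(half_value, sqrt(pi)), reduces_to_exponential)
```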