Unit-I PROBABILITY AND DISTRIBUTION
Introduction
A random experiment is defined as an experiment in which all the possible outcomes are known in advance and no personal bias is exercised.
Throwing an unbiased coin is a random experiment, since either of the two faces, head or tail, may come up. Similarly, throwing an unbiased die is a random experiment, since any of the six faces (1, 2, 3, 4, 5 or 6) may come up.
An event is a subset of the sample space, and cases are its members, i.e. subsets consisting of a single member.
Equally likely cases (events): Cases are said to be equally likely when there is no reason to expect any one of them in preference to any other.
Mutually exclusive events: Two events are said to be mutually exclusive when the occurrence of one of them precludes the occurrence of the other.
Independent events: Two events are said to be independent if the occurrence of one event does not affect the occurrence of the other.
Exhaustive Cases: A set of cases is said to be exhaustive if it includes all possible outcomes of a
trial.
Favourable cases: The cases which entail the happening of an event are said to be favourable
to an event.
Odds in favour of or against the trial: If an experiment can succeed in 𝑚 ways and fail in 𝑛
ways, each of these ways being equally likely, then the odds are 𝑚 to 𝑛 in favour or 𝑛 to 𝑚
against the trial.
Odds in favour of an event = (probability of happening)/(probability of non-happening).
Odds against an event = (probability of non-happening)/(probability of happening).
Definition:
Let an event A happen in m ways and fail in n ways, all ways being equally likely to occur. Then the probability of the happening of event A is defined as
P(A) = (number of favourable cases)/(total number of mutually exclusive and equally likely cases) = m/(m + n) = p (say),
and the probability of its failing is P(A̅) = n/(m + n) = q (say).
Thus P(A) + P(A̅) = p + q = m/(m + n) + n/(m + n) = 1.
From this it is noted that P(A) = p satisfies 0 ≤ p ≤ 1. The event A̅ is called the complementary event, and P(A̅) = q also satisfies 0 ≤ q ≤ 1.
Conditional probability: The probability of an event A, when event B has already occurred is
known as conditional probability of event A and denoted as P(A/B). This is in the case when
events A and B are dependent.
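These counting definitions can be checked mechanically. The short Python sketch below is our own illustration (the two dice events are chosen by us for the example): it computes a classical probability as favourable/total cases and a conditional probability by restricting the sample space.

```python
from fractions import Fraction
from itertools import product

# Sample space for two fair dice: 36 equally likely cases.
space = list(product(range(1, 7), repeat=2))

def prob(event):
    """Classical probability: favourable cases / total cases."""
    fav = [w for w in space if event(w)]
    return Fraction(len(fav), len(space))

A = lambda w: sum(w) == 7          # event A: the sum is 7
B = lambda w: w[0] % 2 == 0        # event B: the first die is even

# P(A/B): restrict the sample space to the cases where B occurs.
cases_B = [w for w in space if B(w)]
p_A_given_B = Fraction(sum(1 for w in cases_B if A(w)), len(cases_B))

print(prob(A))        # 1/6
print(p_A_given_B)    # 3/18 = 1/6 (here A and B happen to be independent)
```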
Example1. The chance of an event happening is the square of the chance of a second event but
the odds against the first are the cube of the odds against the second. Find the chance of each.
Solution: Let p and p′ be the chances of happening of the two events; then p = p′².
Odds against the first event = (1 − p)/p; odds against the second event = (1 − p′)/p′.
By hypothesis, (1 − p)/p = [(1 − p′)/p′]³, i.e. (1 − p′²)/p′² = (1 − p′)³/p′³.
Cancelling (1 − p′) and multiplying through by p′³, we get p′(1 + p′) = (1 − p′)², i.e. p′ + p′² = 1 − 2p′ + p′², so 3p′ = 1.
Hence p′ = 1/3 and p = p′² = 1/9.
Example2. A sample space consists of n outcomes w₁, w₂, …, w_n, each outcome being twice as probable as the one preceding it. If A_k = {w₁, w₂, …, w_k}, find P(A_k).
Solution: Let P(w₁) = p₁, so that P(w_i) = 2^(i−1) p₁ and
p₁ + p₂ + ⋯ + p_n = 1.
Thus p₁(1 + 2 + 2² + ⋯ + 2^(n−1)) = 1, which implies p₁·(1 − 2^n)/(1 − 2) = 1, i.e.
p₁ = 1/(2^n − 1). Therefore
P(A_k) = Σ_(i=1)^(k) P(w_i) = p₁ + p₂ + ⋯ + p_k = p₁(1 + 2 + 2² + ⋯ + 2^(k−1)) = p₁·(1 − 2^k)/(1 − 2) = (2^k − 1)/(2^n − 1).
Example3. The sum of two positive quantities is equal to 2n. Find the chance that the product of the two quantities is not less than 3/4 times their greatest product.
Solution: Let the two parts be x and 2n − x, so that the product is y = x(2n − x).
For the maximum or minimum, putting dy/dx = 0 gives 2n − 2x = 0, which implies x = n; the greatest product is therefore n².
The product is not less than 3/4 of the greatest product when x(2n − x) ≥ (3/4)n², i.e. x² − 2nx + (3/4)n² ≤ 0, i.e. (x − n/2)(x − 3n/2) ≤ 0,
which implies n/2 ≤ x ≤ 3n/2.
Therefore, the favourable cases correspond to an interval of length 3n/2 − n/2 = n, while x can range over (0, 2n), an interval of length 2n.
Hence the required probability = n/2n = 1/2.
Addition theorem of probability
Statement: The probability of the happening of any one of several mutually exclusive events is the sum of the probabilities of the separate events, i.e.
P(A₁ + A₂ + ⋯ + A_n) = P(A₁) + P(A₂) + ⋯ + P(A_n).
Proof: Let N be the number of cases which are equally likely, mutually exclusive and exhaustive. Out of these, let m₁ cases be favourable to A₁, m₂ cases to A₂, …, m_n cases to A_n, so that
P(A₁) = m₁/N, P(A₂) = m₂/N, …, P(A_n) = m_n/N.
Since A₁, A₂, …, A_n are mutually exclusive, the cases m₁, m₂, …, m_n are quite distinct and non-overlapping. Hence the number of cases favourable to the happening of any one of the events is m₁ + m₂ + ⋯ + m_n, and
P(A₁ + A₂ + ⋯ + A_n) = (m₁ + m₂ + ⋯ + m_n)/N = m₁/N + m₂/N + ⋯ + m_n/N = P(A₁) + P(A₂) + ⋯ + P(A_n).
Note:- 1. When two events 𝐴 and 𝐵 are not mutually exclusive, then there will be some
outcomes, or cases which favour both 𝐴 and 𝐵 together and suppose this happens in 𝑚𝑘 ways
( this is included in both 𝑚1 and 𝑚2 favourable to both 𝐴 and 𝐵 respectively ). Thus the total
number of cases favouring either 𝐴 or 𝐵 or both is 𝑚1 + 𝑚2 − 𝑚𝑘 . Hence the probability of
occurrence of 𝐴 or 𝐵 or both is given by
P(A + B) = (m₁ + m₂ − m_k)/N = m₁/N + m₂/N − m_k/N = P(A) + P(B) − P(AB),
where P(AB) represents the probability of both A and B happening together. It may be noted that when A and B are mutually exclusive, P(AB) = 0.
2. When three events A, B and C are not mutually exclusive, then
P(A + B + C) = P(A) + P(B) + P(C) − P(AB) − P(BC) − P(CA) + P(ABC).
Multiplication theorem of probability
Statement: The probability of the simultaneous occurrence of two independent events is the product of their separate probabilities, i.e. P(AB) = P(A)·P(B).
Proof: Let the two independent events be 𝐴 and 𝐵. Let the event 𝐴 succeed in 𝑚1 ways and fail
in 𝑛1 ways, and the event 𝐵 succeed in 𝑚2 ways and fail in 𝑛2 ways, all the ways in both the
events 𝐴 and 𝐵 being equally likely. Now there are (𝑚1 + 𝑛1 ) ways in event 𝐴 and (𝑚2 + 𝑛2 )
ways in event 𝐵. Each of the (𝑚1 + 𝑛1 ) ways can be associated with each of (𝑚2 + 𝑛2 ) ways.
Thus in their simultaneous happening, the total number of ways are (𝑚1 + 𝑛1 ) × (𝑚2 + 𝑛2 ).
Out of this total number of ways, there are (m₁·m₂) ways in which both the events succeed. Hence
P(AB) = m₁m₂/[(m₁ + n₁)(m₂ + n₂)] = [m₁/(m₁ + n₁)]·[m₂/(m₂ + n₂)] = P(A)·P(B).
Note:- If two events 𝐴 and 𝐵 are dependent, then 𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐴). 𝑃(𝐵/A), where 𝑃(𝐵/A) is
known as conditional probability of event 𝐵 when event 𝐴 has already occurred.
Repeated trials
If the probability that an event happens in a single trial is p and the probability that it fails is q, then p + q = 1. We would like to know the probability of its happening exactly r times in n trials.
By the multiplication theorem (as the trials are independent), the probability of exactly r consecutive successes followed by (n − r) failures is p^r q^(n−r).
Obviously the probability for r successes and (n − r) failures remains the same in whatever order the events happen. The total probability for r successes and (n − r) failures, irrespective of the order of their occurrence, is found by counting the number of possible orders in which this particular event can happen. This number is simply the number of permutations of n things taken all at a time, of which r are alike (each a success with probability p) and (n − r) are alike (each a failure with probability q).
Therefore, the possible number of ways = n!/(r!(n − r)!).
Hence the number of times p^r q^(n−r) is to be added is n!/(r!(n − r)!) = C(n, r), so that the probability of exactly r successes in n trials is P(r) = C(n, r) p^r q^(n−r).
Cor.1. The probability that the event happens at least r times in n trials is
nC_r p^r q^(n−r) + nC_(r+1) p^(r+1) q^(n−r−1) + nC_(r+2) p^(r+2) q^(n−r−2) + ⋯ + nC_n p^n, since the event can happen r, r + 1, r + 2, …, or n times.
Cor.2. The probability that the event happens at least once in 𝑛 trials is (1 − 𝑞 𝑛 ) = 1 −
probability of zero success.
Since the total probability is 1, and the probability of zero success is 𝑛𝐶0 𝑝0 𝑞 𝑛−0 = 𝑞 𝑛 .
Subtracting 𝑞 𝑛 from 1 we get the probability for at least one success.
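As a numerical companion to these corollaries, the sketch below (our illustration) evaluates P(exactly r), P(at least r) and P(at least one) directly from P(r) = C(n, r) p^r q^(n−r).

```python
from math import comb

def binom_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials."""
    q = 1 - p
    return comb(n, r) * p**r * q**(n - r)

def at_least(r, n, p):
    """Cor.1: sum the pmf from r up to n."""
    return sum(binom_pmf(k, n, p) for k in range(r, n + 1))

n, p = 8, 1/6                      # e.g. eight throws of a die, success = a six
print(binom_pmf(7, n, p))          # exactly seven sixes: 40/6**8
print(at_least(7, n, p))           # at least seven: 41/6**8
print(1 - (1 - p)**n)              # Cor.2: at least one success = 1 - q**n
```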
Example1. A die is thrown 8 times. Find the probability that a 6 will show (i) exactly seven times, (ii) at least seven times, (iii) at least once.
Solution: The chance that a 6 shows in a single throw is p = 1/6, and the chance that it fails is q = 1 − 1/6 = 5/6.
(i) The chance of exactly 7 successes in 8 trials is C(8,7) p⁷ q^(8−7) = C(8,7)(1/6)⁷(5/6) = 40/6⁸.
(ii) The chance of at least 7 successes is P(r = 7) + P(r = 8), where r denotes the number of successes,
= C(8,7)(1/6)⁷(5/6) + C(8,8)(1/6)⁸(5/6)⁰ = 41/6⁸.
(iii) The chance of at least one success = 1 − (5/6)⁸.
Example2. In a given race, the odds in favour of four horses 𝐴, 𝐵, 𝐶, 𝐷 are 1:3, 1:4, 1:5, 1:6
respectively. Assuming that a dead heat is impossible; find the chance that one of them wins
the race.
Solution: Since a dead heat (in which all four horses cover the same distance in the same time) is not possible, the events are mutually exclusive.
Odds in favour of A are 1:3, therefore p₁ = 1/(1 + 3) = 1/4.
Similarly p₂ = 1/(1 + 4) = 1/5, p₃ = 1/(1 + 5) = 1/6, p₄ = 1/(1 + 6) = 1/7.
Then the probability that one of them wins the race is p = p₁ + p₂ + p₃ + p₄ = 1/4 + 1/5 + 1/6 + 1/7 = 319/420.
Example3. A committee of 12 students consists of 3 representatives from first year, 4 from
second year and 5 from third year classes. Out of 12 members three are to be removed by
drawing lots. What is the chance that
(i) the three students belong to three different classes,
(ii) two belong to one class and the third to a different class,
(iii) all the three belong to the same class?
Solution: The total number of ways of choosing 3 students out of 12 is C(12,3) = 220.
(i) The number of ways of choosing 1 student from each of the three groups is C(3,1)·C(4,1)·C(5,1) = 3·4·5 = 60.
Therefore, the required probability = 60/220 = 3/11.
(ii) 2 from first year and 1 from the others: C(3,2)·C(9,1) = 3·9 = 27.
2 from second year and 1 from the others: C(4,2)·C(8,1) = 6·8 = 48.
2 from third year and 1 from the others: C(5,2)·C(7,1) = 10·7 = 70.
Total number of favourable ways = 27 + 48 + 70 = 145, so the required probability = 145/220 = 29/44.
(iii) All three from first year: C(3,3) = 1; all three from second year: C(4,3) = 4; all three from third year: C(5,3) = 10. Total number of ways = 1 + 4 + 10 = 15.
The required probability = 15/220 = 3/44.
Example 4. An apparatus contains 6 electronic tubes. It will not work unless all tubes are
working. If the probability of failure of each tube is 0.05, what is the probability of failure of
apparatus?
Solution: Probability that a tube works = 1 − 0.05 = 0.95.
Probability that the apparatus works = probability that all 6 tubes work = (0.95)⁶ ≈ 0.735.
Therefore, probability of failure of the apparatus = 1 − (0.95)⁶ ≈ 1 − 0.735 = 0.265.
Bayes' theorem
Statement: If an event E can occur only in combination with one of the mutually exclusive events E₁, E₂, …, E_n, then
P(E_k/E) = P(E_k)P(E/E_k) / [Σ_(i=1)^(n) P(E_i)P(E/E_i)], k = 1, 2, …, n.
Proof: Since the event E can occur only with the events E₁, E₂, …, E_n, the possible forms in which E can occur are EE₁, EE₂, …, EE_n.
These forms are mutually exclusive, as the events E₁, E₂, …, E_n are mutually exclusive. Therefore, by the addition theorem,
P(E) = P(EE₁) + P(EE₂) + ⋯ + P(EE_n) = Σ_(i=1)^(n) P(E_i)P(E/E_i).
Also P(EE_k) = P(E_k)P(E/E_k) = P(E)P(E_k/E), whence
P(E_k/E) = P(E_k)P(E/E_k) / Σ_(i=1)^(n) P(E_i)P(E/E_i).
Example1. In a bolt factory, machines 𝐴, 𝐵 and 𝐶 manufacture respectively 25%, 35% and 40%
of the total. Of their output 5%, 4% and 2% are defective bolts. A bolt is drawn at random from
the product and is found to be defective. What is the probability that it was manufactured by
machine 𝐵?
Solution: Let E₁, E₂ and E₃ denote the events that a bolt drawn at random is manufactured by the machines A, B and C respectively, and let E denote the event of its being defective. Then
P(E₁) = 25% = 1/4, P(E₂) = 35% = 7/20, P(E₃) = 40% = 2/5.
The probability of drawing a defective bolt manufactured by machine A is P(E/E₁) = 5% = 1/20.
Similarly, P(E/E₂) = 4% = 1/25 and P(E/E₃) = 2% = 1/50.
By Bayes' theorem, the probability that the defective bolt was manufactured by machine B is
P(E₂/E) = P(E₂)P(E/E₂) / [P(E₁)P(E/E₁) + P(E₂)P(E/E₂) + P(E₃)P(E/E₃)]
= (7/20 × 1/25) / (1/4 × 1/20 + 7/20 × 1/25 + 2/5 × 1/50) = (7/500)/(69/2000) = 28/69.
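The arithmetic of such problems is easy to mechanise. The following sketch (our illustration, using the bolt-factory numbers above) computes all the posteriors P(E_k/E) at once.

```python
from fractions import Fraction as F

# Priors P(E_k): share of production by machines A, B, C.
prior = {"A": F(25, 100), "B": F(35, 100), "C": F(40, 100)}
# Likelihoods P(E/E_k): probability a bolt from that machine is defective.
like  = {"A": F(5, 100),  "B": F(4, 100),  "C": F(2, 100)}

total = sum(prior[k] * like[k] for k in prior)      # P(E), the total probability
posterior = {k: prior[k] * like[k] / total for k in prior}

print(posterior["B"])   # 28/69, agreeing with the worked example
```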
Example2. Three urns contain respectively 1 white, 2 black and 3 red balls; 2 white, 1 black and 1 red balls; and 4 white, 5 black and 3 red balls. One urn is chosen at random and two balls are drawn from it. They happen to be white and red. What is the probability that they come from urns I, II and III?
Solution: Let E₁: urn I is chosen; E₂: urn II is chosen; E₃: urn III is chosen; and let A be the event that the two balls drawn are white and red. Then P(E₁) = P(E₂) = P(E₃) = 1/3, and
P(A/E₁) = P(a white and a red ball are drawn from urn I) = [C(1,1)×C(3,1)]/C(6,2) = 3/15 = 1/5.
Similarly, P(A/E₂) = [C(2,1)×C(1,1)]/C(4,2) = 2/6 = 1/3 and P(A/E₃) = [C(4,1)×C(3,1)]/C(12,2) = 12/66 = 2/11.
By Bayes' theorem, P(E₁/A) = (1/5)/(1/5 + 1/3 + 2/11) = (33/165)/(118/165) = 33/118.
Similarly, P(E₂/A) = 55/118 and P(E₃/A) = 30/118 = 15/59.
Example3. 𝐴 and 𝐵 take turns in throwing two dice, the first to throw 10 being awarded the
prize, show that if 𝐴 has the first throw, their chances of winning are in the ratio 12:11.
Solution: The number 10 can be thrown in three ways: (6,4), (4,6), (5,5). Therefore,
the probability of throwing 10 is p = 3/36 = 1/12,
and the probability of failure is q = 1 − p = 1 − 1/12 = 11/12.
If A is to win, he should throw 10 in either the first, the third, the fifth, … throw.
The respective probabilities are 1/12; (11/12)² × 1/12; (11/12)⁴ × 1/12; …
Therefore, A's total chance of winning = 1/12 + (11/12)² × 1/12 + (11/12)⁴ × 1/12 + ⋯
= (1/12)/(1 − (11/12)²) = 12/23 (using the sum of an infinite G.P.).
B can win in either the second, fourth, sixth, … throw.
Similarly, B's total chance of winning = (11/12) × 1/12 + (11/12)³ × 1/12 + (11/12)⁵ × 1/12 + ⋯
= [(11/12) × (1/12)]/(1 − (11/12)²) = 11/23.
Hence A's chance to B's chance = 12/23 : 11/23 = 12 : 11.
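The geometric-series argument can be confirmed numerically; the sketch below (ours) sums A's winning chances term by term and checks the closed form and the 12 : 11 ratio.

```python
p, q = 1/12, 11/12        # chance of throwing 10, and of failing, per turn

# A wins on throws 1, 3, 5, ...: p, q^2 p, q^4 p, ...  (an infinite G.P.)
a_chance = p / (1 - q**2)            # = 12/23
b_chance = q * p / (1 - q**2)        # = 11/23

# Term-by-term check of the infinite series (200 terms are plenty).
a_series = sum(q**(2 * k) * p for k in range(200))
print(a_chance, b_chance, a_chance / b_chance)   # ratio 12/11
print(abs(a_series - a_chance) < 1e-12)          # True
```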
Random variable
If an experiment is conducted under identical conditions, the values so obtained may not be identical. Observations are always taken about a factor or character under study, which can take different values; this factor or character is termed a variable. The observations may be the number of certain objects or items, or their measurements. These observations vary even though the experiment is conducted under identical conditions. Hence we have a set of outcomes of a
random experiment. A rule that assigns a real number to each outcome is called a random
variable. The rule is nothing but a function of the variable, say, 𝑋 that assigns a unique value to
each outcome of the random experiment. It is clear that there is a value for each outcome,
which it takes with certain probability. Thus when a variable 𝑋 takes the value 𝑥𝑖 with
probability 𝑝𝑖 (𝑖 = 1,2,3, … , 𝑛), then 𝑋 is called random variable or stochastic variable or a
variate.
A random variable X which can take only a finite number of values in an interval of the domain is called a discrete random variable. For example, if we throw a pair of dice and note the sum which turns up, the sum must be an integer between 2 and 12; thus this discrete random variable takes finitely many integer values x with 2 ≤ x ≤ 12. On the other hand, a random variable X which can take every value in the domain, i.e. whose range R is an interval, is called a continuous random variable. In this case the random variable can take infinitely many values in the domain or interval. For a continuous random variable, probability is assigned to the interval (x − dx/2, x + dx/2), and not to the single point X = x.
Note that the probability of any single 𝑥, a value of 𝑋, is zero i.e. 𝑃(𝑋 = 𝑥) = 0.
For example, the height of students in a country lies between 100 cms and 200 cms. The continuous random variable is
X(x) = {x : 100 ≤ x ≤ 200}.
As another example, if the maximum life of electric bulbs is 2000 hours, the continuous random variable is
X(x) = {x : 0 ≤ x ≤ 2000}.
Discrete probability distribution
The values x₁, x₂, …, x_n of the discrete random variable X with their respective probabilities
𝑝1 , 𝑝2 , … , 𝑝𝑛 constitute a probability distribution which is called discrete probability distribution
of the discrete random variable 𝑋. It may be noted that 𝑝1 + 𝑝2 + … + 𝑝𝑛 = 1.
In a throw of a pair of dice, the sum X is a discrete random variable taking integer values between 2 and 12, with the probabilities P(X) given as:
X = x_i : 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
P(x_i) : 1/36, 2/36, 3/36, 4/36, 5/36, 6/36, 5/36, 4/36, 3/36, 2/36, 1/36
In the case of a continuous random variable, the variate X can take infinitely many values, so it is not possible to attach a finite probability to each possible value of the variate and still have the sum of these probabilities equal to unity. In this case we associate probabilities with intervals. Let the probability of the continuous random variable X falling in the interval (x − dx/2, x + dx/2) be f(x)dx, where f(x) is a continuous function of x called the probability density function. The curve y = f(x) is called the probability density curve.
Therefore, the probability of a continuous random variable X lying in a given small interval is
P(x − dx/2 ≤ X ≤ x + dx/2) = f(x)dx.
The probability that the variate X falls in the interval (a, b) is given by
P(a ≤ X ≤ b) = ∫_a^b f(x)dx.
If X lies only inside the interval (α, β), then ∫_α^β f(x)dx = 1.
Note:-
1. P(a ≤ X ≤ b) = ∫_a^b f(x)dx = the area between the curve y = f(x), the x-axis and the ordinates x = a and x = b, as shown in the figure.
2. P(−∞ ≤ X ≤ ∞) = ∫_(−∞)^(∞) f(x)dx = the area between the curve y = f(x), the x-axis and the ordinates x = −∞ and x = ∞, which is unity.
3. P(X ≤ x) = ∫_(−∞)^(x) f(x)dx = F(x) (say), where F(x) is called the probability distribution function or cumulative distribution function. Thus F(x) gives the probability of the variable X taking values up to x. In particular
F(∞) = 1 and P(a ≤ X ≤ b) = F(b) − F(a).
OR
If, corresponding to the exhaustive and mutually exclusive cases that may arise from an experiment, a variate x assumes n values x_i (i = 1, 2, …, n) with the probabilities p_i (i = 1, 2, …, n), then the assemblage of the values x_i with their probabilities p_i defines the probability distribution of the variate x. Since the number x is associated with the outcome of a random experiment, it is called a random variable or stochastic variable or, more commonly, a variate.
Most of the concepts discussed with the frequency distributions apply equally well to
distribution functions. Thus the mean value 𝑥̅ of the discrete distribution function is given by
x̄ = (Σ p_i x_i)/(Σ p_i) = Σ p_i x_i, because for all the mutually exclusive and exhaustive cases Σ p_i = 1.
OR
If a real variable 𝑋 be associated with the outcome of a random experiment, then since the
values which 𝑋 takes depend on chance, it is called random variable or a stochastic variable or
simply a variate.
For example, if a random experiment E consists of tossing a pair of dice, the sum X of the two numbers which turn up takes a value among 2, 3, 4, …, 12 depending on chance. Then X is a random variable: a function whose values are real numbers and depend on chance.
If a random variable takes a finite set of values, it is called a discrete variate. On the other
hand, if it assumes an infinite number of uncountable values, it is called a continuous variate.
Suppose a discrete variate X is the outcome of some experiment. If the probability that X takes the value x_i is p_i, then the collection of pairs (x_i, p_i), with p_i ≥ 0 and Σ p_i = 1, is called the discrete probability distribution of X.
For example, the discrete probability distribution for X, the sum of the numbers which turn up on tossing a pair of dice, is given by the following table:
X = x_i : 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
p(x_i) : 1/36, 2/36, 3/36, 4/36, 5/36, 6/36, 5/36, 4/36, 3/36, 2/36, 1/36
Distribution function:
Example. A discrete random variable X has the following probability distribution:
X = x_i : 0, 1, 2, 3, 4, 5, 6
p(x_i) : k, 3k, 5k, 7k, 9k, 11k, 13k
(i) Find k, and evaluate P(X < 4), P(X ≥ 5) and P(3 < X ≤ 6).
(ii) What will be the minimum value of k so that P(X ≤ 2) > 0.3?
Solution: (i) Since Σ p(x_i) = 1, we have k + 3k + 5k + 7k + 9k + 11k + 13k = 49k = 1, so k = 1/49.
Therefore, P(X < 4) = k + 3k + 5k + 7k = 16k = 16/49,
P(X ≥ 5) = 11k + 13k = 24k = 24/49,
P(3 < X ≤ 6) = 9k + 11k + 13k = 33k = 33/49.
(ii) P(X ≤ 2) = k + 3k + 5k = 9k > 0.3 requires k > 1/30, so the minimum value of k is 1/30.
Continuous probability distribution
When a variate X takes every value in an interval, it gives rise to a continuous distribution of X.
The distributions defined by the variates like heights and weights are continuous distributions.
A major conceptual difference, however, exists between discrete and continuous probabilities. When thinking in discrete terms, the probability associated with a single event is meaningful. With continuous events, however, where the number of possible events is infinitely large, the probability that a specific event will occur is practically zero. For this reason, continuous probability statements must be worded somewhat differently from discrete ones: instead of finding the probability that x equals some value, we find the probability of x falling in a small interval.
Thus the probability distribution of a continuous variate x is defined by a function f(x) such that the probability of the variate x falling in the small interval (x − dx/2, x + dx/2) is f(x)dx. Symbolically, P(x − dx/2 ≤ x ≤ x + dx/2) = f(x)dx. Then f(x) is called the probability density function, and the continuous curve y = f(x) is called the probability curve.
The range of the variable may be finite or infinite. Even when the range is finite, it is convenient to regard it as infinite by taking the density function to be zero outside the given range. Thus if f(x) = φ(x) is the density function for the variate x in the interval (a, b), it can be written as
f(x) = 0 for x < a; f(x) = φ(x) for a ≤ x ≤ b; f(x) = 0 for x > b.
The density function f(x) is always positive, and ∫_(−∞)^(∞) f(x)dx = 1 (i.e. the total area between the probability curve and the x-axis is unity, which corresponds to the requirement that the total probability of the happening of an event is unity).
Distribution function
If F(x) = P(X ≤ x) = ∫_(−∞)^(x) f(x)dx, then F(x) is defined as the cumulative distribution function, or simply the distribution function, of the continuous variate X. It is the probability that the value of the variate X will be ≤ x.
(ii) F(−∞) = 0
(iii) F(∞) = 1
(iv) P(a ≤ x ≤ b) = ∫_a^b f(x)dx = ∫_(−∞)^(b) f(x)dx − ∫_(−∞)^(a) f(x)dx = F(b) − F(a).
Example. A function f(x) is defined by f(x) = e^(−x) for x ≥ 0, and f(x) = 0 for x < 0.
(i) Is f(x) a probability density function?
(ii) If so, determine the probability that the variate having this density will fall in the interval (1, 2).
Solution: (i) f(x) ≥ 0 for every x, and ∫_(−∞)^(∞) f(x)dx = ∫_0^(∞) e^(−x)dx = [−e^(−x)]_0^(∞) = 1.
Hence the function f(x) satisfies the requirements for a density function.
(ii) Required probability = P(1 ≤ x ≤ 2) = ∫_1^2 e^(−x)dx = e^(−1) − e^(−2) = 0.368 − 0.135 = 0.233.
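Both the unit-total-area requirement and the interval probability can be verified with the closed-form antiderivative of e^(−x); a minimal sketch of ours:

```python
from math import exp

# For f(x) = exp(-x) on x >= 0, the antiderivative of f is -exp(-x).
def F(x):
    """Distribution function: integral of exp(-t) from 0 to x."""
    return 1 - exp(-x)

print(F(10**6))          # total area: approaches 1, so f is a valid density
print(F(2) - F(1))       # P(1 <= x <= 2) = e^-1 - e^-2 ≈ 0.2325
```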
Mean and variance of random variables
(1) Let a discrete random variable X assume the values x₁, x₂, x₃, …, x_n with probabilities p₁, p₂, p₃, …, p_n; then the mean or expected value is defined as
E(X) = μ = Σ_(i=1)^(n) p_i x_i.
The variance is defined as σ² = Σ_(i=1)^(n) p_i(x_i − μ)² = Σ p(x − μ)² = Σ p(x² + μ² − 2μx)
= Σ px² + μ² Σ p − 2μ Σ px = Σ px² + μ² − 2μ·μ = Σ px² − μ².
(2) Let a continuous random variable X have probability f(x)dx in the interval x − dx/2 ≤ X ≤ x + dx/2. If the variate X lies only in the interval a ≤ X ≤ b, the expected or mean value is
E(X) = μ = [∫_a^b x f(x)dx]/[∫_a^b f(x)dx] = ∫_a^b x f(x)dx, because ∫_a^b f(x)dx = 1.
The variance in this case is defined as σ² = ∫_a^b (x − μ)² f(x)dx
= ∫_a^b x² f(x)dx + μ² − 2μ·μ = ∫_a^b x² f(x)dx − μ².
Example 1. A pair of coins is tossed. What are the expected value and variance of the number of heads?
Solution: In tossing two coins, the probability distribution of the number of heads X is:
X : 0, 1, 2
P(X): 1/4, 1/2, 1/4
Expected (mean) value = E(X) = μ = Σ px = (1/4)(0) + (1/2)(1) + (1/4)(2) = 1.
Variance = σ² = Σ px² − μ² = (1/4)(0)² + (1/2)(1)² + (1/4)(2)² − (1)² = 3/2 − 1 = 1/2.
Example 2. A pair of dice is thrown together; find the expected value and variance for sum of
numbers.
Solution: Let X denote the sum of the numbers on the pair of dice; the probability distribution is
X = x_i : 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
P(X = x_i): 1/36, 2/36, 3/36, 4/36, 5/36, 6/36, 5/36, 4/36, 3/36, 2/36, 1/36
Mean: μ = Σ px = (1/36)(2) + (2/36)(3) + ⋯ + (1/36)(12) = 252/36 = 7.
Variance: σ² = Σ px² − μ² = (1/36)(2)² + (2/36)(3)² + (3/36)(4)² + (4/36)(5)² + (5/36)(6)² + (6/36)(7)² + (5/36)(8)² + (4/36)(9)² + (3/36)(10)² + (2/36)(11)² + (1/36)(12)² − (7)²
= 1974/36 − 49 = 329/6 − 49 = 35/6.
Standard deviation = σ = √(35/6).
Example 3. A bag contains 8 items of which 2 are defective. A man selects 3 items at random.
Find the expected number of defective items he has drawn. Also find variance.
Solution: The number of defective items drawn can be 0, 1 or 2; thus X = 0, 1, 2.
Now p₁ = P(X = 0) = C(2,0)×C(6,3)/C(8,3) = 20/56,
p₂ = P(X = 1) = C(2,1)×C(6,2)/C(8,3) = 30/56,
p₃ = P(X = 2) = C(2,2)×C(6,1)/C(8,3) = 6/56.
X : 0, 1, 2
P(X): 20/56, 30/56, 6/56
Hence the expected number of defective items drawn is
E(X) = μ = Σ px = p₁x₁ + p₂x₂ + p₃x₃ = (20/56)(0) + (30/56)(1) + (6/56)(2) = 42/56 = 3/4.
Variance = σ² = Σ px² − μ² = (20/56)(0)² + (30/56)(1)² + (6/56)(2)² − (3/4)² = 54/56 − 9/16 = 45/112.
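The same expectation and variance can be computed directly from the combinatorial probabilities; the sketch below (ours) redoes this example with exact fractions.

```python
from fractions import Fraction as F
from math import comb

N, D, n = 8, 2, 3     # items in the bag, defectives among them, items drawn

# P(X = k): probability of k defectives in the sample of n.
pmf = {k: F(comb(D, k) * comb(N - D, n - k), comb(N, n)) for k in range(D + 1)}

mean = sum(k * pk for k, pk in pmf.items())
var  = sum(k**2 * pk for k, pk in pmf.items()) - mean**2

print(pmf)    # Fractions reduce automatically: 20/56 -> 5/14, etc.
print(mean)   # 3/4
print(var)    # 45/112
```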
Example 4. The continuous random variable 𝑋 lies only inside the interval (0,2) and its density
function is given as: 𝑓(𝑥) = 𝑘𝑥(2 − 𝑥); 0 ≤ 𝑥 ≤ 2, find the expected value and variance.
Solution: Since x lies only inside the interval (0, 2), we have ∫_0^2 f(x)dx = 1; therefore
∫_0^2 kx(2 − x)dx = 1, i.e. k[x² − x³/3]_0^2 = k(4 − 8/3) = 4k/3 = 1, which implies k = 3/4.
Now E(X) = μ = ∫_0^2 x f(x)dx = ∫_0^2 x·(3/4)x(2 − x)dx = 1.
Variance = σ² = ∫_0^2 x² f(x)dx − μ² = ∫_0^2 x²·(3/4)x(2 − x)dx − (1)² = 6/5 − 1 = 1/5.
Example 5. The life in hours of a certain kind of radio tube has the probability density function
f(x) = 100/x² for x ≥ 100, and f(x) = 0 for x < 100.
Find
(i) the distribution function F(x),
(ii) the probability that the life of the tube is less than 150 hours,
(iii) the probability that the life of the tube is more than 150 hours.
Solution: (i) For x ≥ 100, F(x) = P(X ≤ x) = ∫_(100)^(x) (100/t²)dt = 1 − 100/x; and F(x) = 0 for x < 100.
(ii) The probability that the tube will have a life of less than 150 hours is
F(150) = P(X ≤ 150) = 1 − 100/150 = 1/3.
(iii) The probability that the tube will have a life of more than 150 hours is 1 − 1/3 = 2/3.
Example 6. Find the moment generating function of the exponential distribution
f(x) = (1/c)e^(−x/c), 0 ≤ x < ∞, c > 0. Hence find its mean and S.D.
Solution: M₀(t) = E(e^(tX)) = ∫_0^(∞) e^(tx)·(1/c)e^(−x/c) dx = (1/c) ∫_0^(∞) e^((t − 1/c)x) dx
= (1/c)·[e^((t − 1/c)x)/(t − 1/c)]_0^(∞) = (1/c)·(0 − 1)/(t − 1/c), valid for t < 1/c,
= 1/(1 − ct) = (1 − ct)^(−1) = 1 + ct + (ct)² + (ct)³ + ⋯
Therefore, μ′₁ = [d/dt M₀(t)]_(t=0) = [c + 2c²t + 3c³t² + ⋯]_(t=0) = c,
μ′₂ = [d²/dt² M₀(t)]_(t=0) = [2c² + 6c³t + ⋯]_(t=0) = 2c².
Hence mean = μ′₁ = c and variance = μ′₂ − (μ′₁)² = 2c² − c² = c², so that S.D. = c.
BINOMIAL DISTRIBUTION
Let an experiment consisting of 𝑛 trials be performed and let the occurrence of an event in any
trial be called a success and its non-occurrence a failure. Let 𝑝 be the probability of success and
𝑞 be the probability of the failure in a single trial, where 𝑞 = 1 − 𝑝, so that 𝑝 + 𝑞 = 1.
Let us assume that the trials are independent and that the probability of success is the same in each trial. In n trials, the probability of the event happening r times and failing (n − r) times in any specified order is p^r q^(n−r) (by the multiplication theorem of probability). But the total number of orders in which the event can happen exactly r times in n trials is C(n, r), and these C(n, r) ways are equally likely, mutually exclusive and exhaustive. Hence the probability of exactly r successes in n trials is
P(X = r) = P(r) = C(n, r) p^r q^(n−r), r = 0, 1, 2, …, n,
where P(X = r) or P(r) is the probability distribution of the random variable X, the number of successes. Giving different values to r, i.e. putting r = 0, 1, 2, …, n, we get the corresponding
probabilities 𝐶(𝑛, 0)𝑝0 𝑞 𝑛 , 𝐶(𝑛, 1)𝑝1 𝑞 𝑛−1 , 𝐶(𝑛, 2)𝑝2 𝑞 𝑛−2 , … . , 𝐶(𝑛, 𝑛)𝑝𝑛 𝑞 0 , which are the
different terms in the Binomial expansion of (𝑞 + 𝑝)𝑛 .
As a result of it, the distribution 𝑃(𝑟) = 𝐶(𝑛, 𝑟)𝑝𝑟 𝑞 𝑛−𝑟 is called Binomial probability
distribution. The two independent constants, 𝑛 and 𝑝 in the distribution are called the
parameters of the distribution.
Again, if the experiment (each consisting of n trials) is repeated N times, the frequency function of the Binomial distribution is given by
f(r) = N·P(r) = N·C(n, r) p^r q^(n−r).
The expected frequencies of 0, 1, 2, 3, …, n successes in the above set of experiments are the successive terms in the Binomial expansion of N(q + p)^n, where p + q = 1; this is also called the Binomial frequency distribution.
2. It depends on the parameters 𝑝 or 𝑞, the probability of success or failure and 𝑛 (the number
of trials). The parameter 𝑛 is always a positive integer.
4. The statistics of the Binomial distribution are mean = np, variance = npq and standard deviation = √(npq).
5. The mode of the binomial distribution is equal to that value of 𝑋 which has the largest
frequency.
6. The shape and location of a Binomial distribution changes as 𝑝 changes for a given 𝑛 or 𝑛
changes for a given 𝑝.
(i) Mean:
X : 0, 1, 2, …, n
P(X): C(n,0)p⁰q^n, C(n,1)p¹q^(n−1), C(n,2)p²q^(n−2), …, C(n,n)p^n q⁰
Therefore,
μ = C(n,0)p⁰q^n × 0 + C(n,1)p¹q^(n−1) × 1 + C(n,2)p²q^(n−2) × 2 + ⋯ + C(n,n)p^n q⁰ × n
= 0 + npq^(n−1) + n(n−1)p²q^(n−2) + ⋯ + np^n
= np[q^(n−1) + (n−1)pq^(n−2) + ((n−1)(n−2)/2)p²q^(n−3) + ⋯ + p^(n−1)] = np(q + p)^(n−1) = np.
Mean = μ = np.
(ii) Variance = Σ px² − μ² = Σ_(r=0)^(n) r²·C(n,r)p^r q^(n−r) − (np)².
Now Σ r²·C(n,r)p^r q^(n−r) = Σ [r(r−1) + r]·C(n,r)p^r q^(n−r) = Σ r(r−1)·C(n,r)p^r q^(n−r) + np, and
Σ r(r−1)·C(n,r)p^r q^(n−r) = n(n−1)p²[q^(n−2) + C(n−2,1)pq^(n−3) + ⋯ + p^(n−2)] = n(n−1)p²(q + p)^(n−2) = n(n−1)p² = n²p² − np².
Therefore Variance = n²p² − np² + np − n²p² = np − np² = np(1 − p) = npq, and standard deviation = √(npq).
By the Binomial distribution, P(r) = C(n,r)p^r q^(n−r) and P(r+1) = C(n,r+1)p^(r+1)q^(n−r−1).
Dividing, P(r+1)/P(r) = [(n − r)/(r + 1)]·(p/q).
Therefore, P(r+1) = [(n − r)p/((r + 1)q)]·P(r).
This is the recurrence formula for the Binomial distribution.
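The recurrence builds the whole Binomial table from P(0) = q^n with one multiplication per step; a minimal sketch:

```python
def binomial_pmf(n, p):
    """Build P(0..n) from P(r+1) = ((n-r)p / ((r+1)q)) P(r), with P(0) = q^n."""
    q = 1 - p
    probs = [q**n]
    for r in range(n):
        probs.append(probs[-1] * (n - r) * p / ((r + 1) * q))
    return probs

probs = binomial_pmf(5, 4/5)
print(probs)        # same values as C(5,r) (4/5)^r (1/5)^(5-r)
print(sum(probs))   # 1.0 (up to rounding)
```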
Example 1. Find the probability that in five tosses of a fair die a 6 appears (i) exactly twice, (ii) at least twice.
Solution: The probability of a 6 in a single toss is p = 1/6, and the probability of a non-6 face is q = 5/6.
(i) The probability of a 6 appearing exactly twice in 5 tosses is C(5,2)(1/6)²(5/6)³ = 10 × (1/36) × (125/216) = 625/3888.
(ii) The probability of a 6 appearing at least twice is P(2) + P(3) + P(4) + P(5)
= 1 − P(0) − P(1) = 1 − C(5,0)(1/6)⁰(5/6)⁵ − C(5,1)(1/6)¹(5/6)⁴ = 763/3888.
Example 2. Six dice are thrown 729 times. How many times do you expect at least three dice to
show a five or a six?
Solution: Here p = 2/6 = 1/3 (a five or a six), q = 1 − p = 2/3, n = 6 and N = 729.
The expected number of times at least three dice show a five or a six is
N × P(r ≥ 3) = 729[1 − P(0) − P(1) − P(2)]
= 729[1 − C(6,0)(2/3)⁶ − C(6,1)(1/3)(2/3)⁵ − C(6,2)(1/3)²(2/3)⁴]
= 729[1 − 64/729 − 192/729 − 240/729] = 729 − 64 − 192 − 240 = 233.
Example 3. Eight coins are tossed at a time for 256 times. Number of heads are observed at
each throw and is recorded as tabulated below. Find the expected frequencies by Binomial
distribution. Compare the theoretical and experimental values of mean and standard deviation.
Solution: The probability of a head (success) in a single trial is p = 1/2, so q = 1 − p = 1/2, and the expected frequency of r heads is 256 × C(8,r)(1/2)^r(1/2)^(8−r) = C(8,r), since (1/2)⁸ × 256 = 1.
Expected frequency of 0 heads = C(8,0) = 1
Expected frequency of 1 head = C(8,1) = 8
Expected frequency of 2 heads = C(8,2) = 28
Expected frequency of 3 heads = C(8,3) = 56
Expected frequency of 4 heads = C(8,4) = 70
Expected frequency of 5 heads = C(8,5) = 56
Expected frequency of 6 heads = C(8,6) = 28
Expected frequency of 7 heads = C(8,7) = 8
Expected frequency of 8 heads = C(8,8) = 1
The theoretical mean is np = 8 × 1/2 = 4 and the theoretical standard deviation is √(npq) = √2 ≈ 1.41.
Comparison table
Example 4. If the sum of the mean and the variance of a Binomial distribution of 5 trials is 4.8,
find the distribution.
Solution: Let the required Binomial distribution be 𝐶(𝑛, 𝑟)𝑝𝑟 𝑞 𝑛−𝑟 where 𝑛 = number of trials =
5.
Mean + variance = np + npq = 5p(1 + q) = 4.8, which implies 5(1 − q)(1 + q) = 4.8, i.e. 1 − q² = 0.96, so q² = 0.04 and
q = 1/5; therefore p = 1 − 1/5 = 4/5.
Hence the required Binomial distribution is C(5, r)(4/5)^r(1/5)^(5−r).
POISSON DISTRIBUTION
The Poisson distribution is a discrete probability distribution which has the following
characteristics:
(i) It is the limiting form of Binomial distribution as 𝑛 becomes infinitely large i.e. 𝑛 → ∞ and 𝑝,
the constant probability of success for each trial becomes indefinitely small i.e. 𝑝 → 0 in such a
manner that 𝑛𝑝 = 𝑚 remains a finite number.
(ii) It consists of a single parameter 𝑚 only. The entire distribution can be obtained once 𝑚 is
known.
Derivation: Starting from the Binomial law with np = m, i.e. p = m/n,
P(r) = C(n,r)p^r q^(n−r) = [n(n−1)(n−2)⋯(n−(r−1))/r!]·(m/n)^r·(1 − m/n)^(n−r)
= (1 − 1/n)(1 − 2/n)⋯(1 − (r−1)/n)·(m^r/r!)·(1 − m/n)^n·(1 − m/n)^(−r).
Now lim_(n→∞)(1 − 1/n)(1 − 2/n)⋯(1 − (r−1)/n) = 1, lim_(n→∞)(1 − m/n)^(−r) = 1 and lim_(n→∞)(1 − m/n)^n = e^(−m).
Therefore P(r) = e^(−m) m^r/r!, r = 0, 1, 2, …
This is the probability of r successes for the Poisson distribution. For r = 0, 1, 2, 3, … we get the probabilities of 0, 1, 2, 3, … successes as
P(0) = e^(−m), P(1) = m e^(−m), P(2) = e^(−m)m²/2!, P(3) = e^(−m)m³/3!, and so on.
The total probability is unity:
Proof: Σ P(r) = P(0) + P(1) + P(2) + ⋯ = e^(−m) + m e^(−m) + e^(−m)m²/2! + e^(−m)m³/3! + ⋯
= e^(−m)(1 + m + m²/2! + m³/3! + ⋯) = e^(−m)·e^m = 1.
Mean (μ) = Σ_(r=0)^(∞) r·P(r) = 0 + 1·P(1) + 2·P(2) + 3·P(3) + ⋯
= 1·(e^(−m)m/1!) + 2·(e^(−m)m²/2!) + 3·(e^(−m)m³/3!) + ⋯
= m e^(−m)(1 + m + m²/2! + m³/3! + ⋯) = m e^(−m) e^m = m.
For the variance, Σ r²P(r) = Σ [r(r−1) + r]P(r)
= m² e^(−m)(1 + m + m²/2! + m³/3! + ⋯) + m = m² e^(−m) e^m + m = m² + m,
so that σ² = Σ r²P(r) − μ² = m² + m − m² = m.
Hence for the Poisson distribution mean = variance = m.
The moments about the mean are μ₁ = 0, μ₂ = m, μ₃ = m, μ₄ = m + 3m².
Therefore β₁ = μ₃²/μ₂³ = m²/m³ = 1/m and β₂ = μ₄/μ₂² = (m + 3m²)/m² = 3 + 1/m.
Recurrence formula: We have P(r) = e^(−m)m^r/r! and P(r+1) = e^(−m)m^(r+1)/(r+1)!, therefore
P(r+1)/P(r) = [e^(−m)m^(r+1)/(r+1)!] × [r!/(e^(−m)m^r)] = m/(r+1),
which implies P(r+1) = [m/(r+1)]·P(r), r = 0, 1, 2, 3, …
This is the required recurrence formula for the Poisson distribution. With this formula we can find P(1), P(2), P(3), … if P(0) is given.
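A minimal sketch of this recurrence, starting from P(0) = e^(−m); with m = 2 the numbers match Example 3 further below up to rounding.

```python
from math import exp

def poisson_pmf(m, r_max):
    """Build P(0..r_max) from P(r+1) = m/(r+1) * P(r), with P(0) = e^-m."""
    probs = [exp(-m)]
    for r in range(r_max):
        probs.append(probs[-1] * m / (r + 1))
    return probs

probs = poisson_pmf(2, 4)
print([round(p, 4) for p in probs])   # [0.1353, 0.2707, 0.2707, 0.1804, 0.0902]
print(1 - sum(probs[:4]))             # P(X >= 4) ≈ 0.1429 (0.1431 with table rounding)
```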
Example 1. Suppose a book of 585 pages contains 43 typographical errors. If these errors are
randomly distributed throughout the book, what is the probability that 10 pages, selected at
random, will be free from errors? ( Use 𝑒 −0.735 = 0.4795 )
Solution: Here p = 43/585 = 0.0735 (the average number of errors per page) and n = 10, therefore m = np = 10 × 0.0735 = 0.735.
Clearly p is very small and n is large, so it is a case of the Poisson distribution.
Then P(X = r) = e^(−m)m^r/r! = e^(−0.735)(0.735)^r/r!.
Therefore, P(no error) = P(X = 0) = e^(−0.735)(0.735)⁰/0! = e^(−0.735) = 0.4795.
Example 2. In a certain factory turning out razor blades, there is a small chance of 0.002 for any
blade to be defective. The blades are supplied in packets of 10. Calculate the approximate
number of packets containing no defective, one defective and two defective blades in a
consignment of 10000 packets. ( Given 𝑒 −0.02 = 0.9802 )
Solution: Here p = 0.002 and n = 10, so m = np = 10 × 0.002 = 0.02, and N = 10000.
Number of packets containing no defective blade = N·P(0) = 10000 × e^(−0.02)(0.02)⁰/0! = 10000 × 0.9802 = 9802.
Number of packets containing one defective blade = N·P(1) = 10000 × e^(−0.02)(0.02)¹/1! = 10000 × 0.9802 × 0.02 = 196.04, i.e. about 196.
Number of packets containing two defective blades = N·P(2) = 10000 × e^(−0.02)(0.02)²/2! = 10000 × 0.9802 × 0.0002 = 1.96, i.e. about 2.
Example3. If the variance of a Poisson distribution is 2, find the probabilities for 𝑟 = 1,2,3,4
from the recurrence relation of the Poisson distribution. Also find 𝑃(𝑋 ≥ 4).(𝑒 −2 = 0.1353)
Solution: Here variance = m = 2, and P(0) = e^(−2)(2)⁰/0! = e^(−2) = 0.1353.
We know that P(r+1) = [m/(r+1)]P(r) = [2/(r+1)]P(r).   (1)
Therefore P(1) = 2 P(0) = 2 × 0.1353 = 0.2706,
P(2) = (2/2)P(1) = 0.2706, P(3) = (2/3)P(2) = (2/3) × 0.2706 = 0.1804,
P(4) = (2/4)P(3) = (1/2) × 0.1804 = 0.0902.
Also P(X ≥ 4) = 1 − [P(0) + P(1) + P(2) + P(3)] = 1 − (0.1353 + 0.2706 + 0.2706 + 0.1804) = 1 − 0.8569 = 0.1431.
Example4. Fit a Poisson distribution to the following data and calculate the theoretical frequencies:
x : 0, 1, 2, 3, 4
f : 192, 100, 24, 3, 1
Solution: P(r) = e^(−m)m^r/r!, where
m = mean of the distribution = (0×192 + 1×100 + 2×24 + 3×3 + 4×1)/(192 + 100 + 24 + 3 + 1) = 161/320 ≈ 0.5.
P(0) = e^(−0.5)(0.5)⁰/0! = 0.6065, therefore theoretical frequency f = 320 × 0.6065 ≈ 194.
P(1) = e^(−0.5)(0.5)¹/1! = 0.30325, therefore theoretical frequency f = 320 × 0.30325 ≈ 97.
P(2) = e^(−0.5)(0.5)²/2! = 0.07581, therefore theoretical frequency f = 320 × 0.07581 ≈ 24.
P(3) = e^(−0.5)(0.5)³/3! = 0.0126, therefore theoretical frequency f = 320 × 0.0126 ≈ 4.
P(4) = e^(−0.5)(0.5)⁴/4! = 0.0016, therefore theoretical frequency f = 320 × 0.0016 = 0.512, i.e. about 1.
Hence the theoretical frequencies are:
x : 0, 1, 2, 3, 4
f : 194, 97, 24, 4, 1
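The fitting procedure — estimate m by the sample mean, then scale the Poisson probabilities by the total frequency — takes only a few lines; the sketch below (ours) reproduces the theoretical frequencies.

```python
from math import exp, factorial

x = [0, 1, 2, 3, 4]
f = [192, 100, 24, 3, 1]             # observed frequencies

N = sum(f)
m = sum(xi * fi for xi, fi in zip(x, f)) / N     # sample mean = 161/320
m = round(m, 1)                                  # the worked example uses m = 0.5

theoretical = [round(N * exp(-m) * m**xi / factorial(xi)) for xi in x]
print(theoretical)    # [194, 97, 24, 4, 1]
```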
NORMAL DISTRIBUTION
The normal distribution is the most popular and commonly used distribution. It was discovered by De Moivre in 1733, about 20 years after Bernoulli gave the Binomial distribution. It is a limiting case of the Binomial distribution when neither p nor q is too small and n, the number of trials, becomes infinitely large, i.e. n → ∞. In fact, any quantity whose variation depends on random causes tends to be distributed according to the normal distribution. In the Binomial and Poisson distributions the variable X assumes values 0, 1, 2, …, so these are discrete distributions; cases where the variable can assume any value in an interval, as with heights and weights, are classified under the continuous variate.
The continuous random variable 𝑥 is said to have a normal distribution, if its probability density
function is defined as:
f(x) = k·e^(−(1/2)((x−a)/b)²) for −∞ < x < ∞, where k is a constant and a and b are two parameters.
Now P(−∞ < x < ∞) = ∫_(−∞)^(∞) f(x)dx = 1.
Therefore 1 = k ∫_(−∞)^(∞) e^(−(1/2)((x−a)/b)²) dx.   (1)
Taking y = (x − a)/(√2 b), so that dy = dx/(√2 b), i.e. dx = √2 b dy, we get
1 = √2 kb ∫_(−∞)^(∞) e^(−y²) dy = 2√2 kb ∫_0^(∞) e^(−y²) dy.   (2)
Now taking y² = t, so that y = √t and dy = dt/(2√t), we get from (2)
1 = 2√2 kb ∫_0^(∞) e^(−t)·(1/(2√t)) dt = √2 kb ∫_0^(∞) e^(−t) t^(−1/2) dt = √2 kb·Γ(1/2) = √2 kb√π.
Therefore k = 1/(b√(2π)).
Thus f(x) = (1/(b√(2π))) e^(−(1/2)((x−a)/b)²) for −∞ < x < ∞.
Mean: μ = ∫_(−∞)^(∞) x f(x)dx. Putting (x − a)/b = z, which implies x = a + bz and dx = b dz,
μ = (1/√(2π)) ∫_(−∞)^(∞) (a + bz) e^(−z²/2) dz = (a/√(2π)) ∫_(−∞)^(∞) e^(−z²/2) dz + (b/√(2π)) ∫_(−∞)^(∞) z e^(−z²/2) dz
= a + (b/√(2π))[−e^(−z²/2)]_(−∞)^(∞) = a + 0 = a, because ∫_(−∞)^(∞) e^(−z²/2) dz = √(2π).
Hence the parameter a is the mean μ of the distribution.
Variance: σ² = ∫_(−∞)^(∞) (x − μ)² f(x)dx. Putting y = (1/2)((x − μ)/b)², so that dx = (b/(√2 √y)) dy, we get
σ² = (2b²/√π) ∫_0^(∞) y^(1/2) e^(−y) dy = (2b²/√π) Γ(3/2), because Γ(n) = ∫_0^(∞) e^(−x) x^(n−1) dx,
and σ² = (2b²/√π)·(1/2)Γ(1/2) = (2b²/√π)·(√π/2) = b². Therefore the standard deviation σ = b.
Hence the normal probability density function is
f(x) = (1/(σ√(2π))) e^(−(1/2)((x−μ)/σ)²), −∞ < x < ∞.
The curve y = f(x) = (1/(σ√(2π))) e^(−(1/2)((x−μ)/σ)²), −∞ < x < ∞, where μ and σ are the mean and the standard deviation respectively, is shown in the figure.
The shaded area under the curve from x = a to x = b gives the probability of the variable x lying between the values a and b. Thus
P(a ≤ x ≤ b) = (1/(σ√(2π))) ∫_a^b e^(−(1/2)((x−μ)/σ)²) dx.
If the total frequency is N, the normal frequency distribution is given by N·(1/(σ√(2π))) e^(−(1/2)((x−μ)/σ)²),
and the frequency of the variable x between a and b is N·(1/(σ√(2π))) ∫_a^b e^(−(1/2)((x−μ)/σ)²) dx.
Notes.
2. Total area under the curve 𝑦 = 𝑓(𝑥), above x-axis, from −∞ to ∞ is unity.
3. The area of the normal probability curve between 𝜇 − 𝜎 and 𝜇 + 𝜎 is 68.27% or 0.6827.
4. The area of the normal probability curve between 𝜇 − 2𝜎 and 𝜇 + 2𝜎 is 95.45% or 0.9545.
5. The area of the normal probability curve between 𝜇 − 3𝜎 and 𝜇 + 3𝜎 is 99.73% or 0.9973.
6. By using the transformation z = (x − μ)/σ we get the standard normal curve, and the total area under this curve is unity. The line z = 0 divides the whole area into two equal parts: the area to the left of z = 0 is 0.5 and the area to the right of z = 0 is also 0.5. The area between the ordinates z = 0 and z = z₁ can be read from the standard table.
7. The mean deviation from the mean = (1/(σ√(2π))) ∫_(−∞)^(∞) |x − μ| e^(−(1/2)((x−μ)/σ)²) dx.
Putting (x − μ)/σ = z, which implies x = μ + σz and dx = σ dz, we get
mean deviation = (1/(σ√(2π))) ∫_(−∞)^(∞) |σz| e^(−z²/2) σ dz = (σ/√(2π)) ∫_(−∞)^(∞) |z| e^(−z²/2) dz
= (σ/√(2π)) ∫_(−∞)^(0) (−z) e^(−z²/2) dz + (σ/√(2π)) ∫_0^(∞) z e^(−z²/2) dz
= (σ/√(2π))(0 + 1) + (σ/√(2π))(0 + 1) = σ√(2/π) = σ√(2/3.1416) = 0.80σ (approx.) = (4/5)σ.
Therefore, the mean deviation from the mean in the normal distribution is approximately (4/5)σ.
In other words, the standard deviation σ is approximately 5/4 times the mean deviation from the mean.
3. β₁, the moment coefficient of skewness, = (μ₃)²/(μ₂)³ = 0, since μ₃ = 0.
4. β₂, the moment coefficient of kurtosis, = μ₄/(μ₂)² = 3σ⁴/σ⁴ = 3.
Example1. In a normal distribution if 𝜇 = 50 and 𝜎 = 10, find (i) 𝑃(50 ≤ 𝑥 ≤ 80) (ii) 𝑃(60 ≤
𝑥 ≤ 70)
Solution: The standard normal variate is z = (x − μ)/σ = (x − 50)/10.
(i) z = (50 − 50)/10 = 0 when x = 50, and z = (80 − 50)/10 = 3 when x = 80.
Therefore P(50 ≤ x ≤ 80) = P(0 ≤ z ≤ 3) = 0.4987 (from the normal table).
(ii) z = (60 − 50)/10 = 1 when x = 60, and z = (70 − 50)/10 = 2 when x = 70.
Therefore P(60 ≤ x ≤ 70) = P(1 ≤ z ≤ 2) = P(0 ≤ z ≤ 2) − P(0 ≤ z ≤ 1) = 0.4772 − 0.3413 = 0.1359.
Example2. In a normal distribution 31% of items are under 45 and 8% are over 64. Find the
mean and standard deviation of the distribution.
Solution: Let 𝜇 be the mean and 𝜎 be the standard deviation of the distribution.
The standard normal variate is z = (x − μ)/σ.
At x = 64, z₁ = (64 − μ)/σ. Since 8% of the items are over 64, the area to the right of z₁ is 0.08, so the area between z = 0 and z = z₁ is 0.42. From the table, z₁ = 1.405; therefore
(64 − μ)/σ = 1.405.   (1)
Similarly, at x = 45, |z₂| = |(45 − μ)/σ|. Since 31% of the items are under 45, the area between z₂ and 0 is 0.19. The value of the normal variate corresponding to area 0.19 is 0.495, and since 45 lies below the mean,
(μ − 45)/σ = 0.495.   (2)
Adding (1) and (2): 19/σ = 1.9, so σ = 10, and then μ = 45 + 4.95 = 49.95 ≈ 50.
Hence the mean is approximately 50 and the standard deviation is 10.
Example3. A sample of 100 dry battery cells tested to find the length of life produced the
following results: 𝑋̅ = 12 ℎ𝑜𝑢𝑟𝑠, 𝜎 = 3 ℎ𝑜𝑢𝑟𝑠. Assuming the data to be normally distributed,
what %age of battery cells are expected to have life
(a) more than 15 hours. (b) less than 6 hours (c) between 10 and 14 hours.
Solution: Let X denote the length of life of a dry battery cell, and let z = (X − X̄)/σ = (X − 12)/3.
(a) When X = 15, z = (15 − 12)/3 = 1.
Therefore P(X > 15) = P(z > 1) = area to the right of z = 1, which is 0.5 − 0.3413 = 0.1587.
Therefore, the percentage of battery cells having life more than 15 hours = 0.1587 × 100 = 15.87%.
(b) When X = 6, z = (6 − 12)/3 = −2.
Therefore P(X < 6) = P(z < −2) = area to the left of z = −2, which is 0.5 − 0.4772 = 0.0228.
Therefore, the percentage of battery cells having life less than 6 hours = 0.0228 × 100 = 2.28%.
(c) When X = 10, z = (10 − 12)/3 = −0.67, and when X = 14, z = (14 − 12)/3 = 0.67.
Therefore P(10 < X < 14) = P(−0.67 < z < 0.67) = 2 × P(0 < z < 0.67)
= 2 × 0.2487 = 0.4974.
Therefore, %age of battery cells having life span between 10 hours and 14 hours is 49.74%.
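Normal-curve areas, read above from tables, can also be computed with the error function, since Φ(z) = (1/2)(1 + erf(z/√2)). A sketch of ours redoing Example 3:

```python
from math import erf, sqrt

def Phi(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma = 12, 3
z = lambda x: (x - mu) / sigma

print(1 - Phi(z(15)))            # P(X > 15)    ≈ 0.1587
print(Phi(z(6)))                 # P(X < 6)     ≈ 0.0228
print(Phi(z(14)) - Phi(z(10)))   # P(10<X<14)   ≈ 0.4950 (tables round z to 0.67)
```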
Example4. The probability density function of a normal random variable is given as
f(x) = C e^(−(1/50)(9x² − 30x)), −∞ < x < ∞.
Find the constant C, the mean and the variance of the random variable. Find also the upper 5% value of the random variable.
Solution: P(−∞ < x < ∞) = 1 = the area under the curve y = f(x) above the x-axis.
Therefore 1 = ∫_(−∞)^(∞) f(x)dx = ∫_(−∞)^(∞) C e^(−(1/50)(9x² − 30x)) dx.   (1)
Now −(1/50)(9x² − 30x) = −(9/50)(x² − (10/3)x) = −(9/50)[(x − 5/3)² − 25/9] = −(9/50)(x − 5/3)² + 1/2
= −(1/2)((x − 5/3)/(5/3))² + 1/2.
Therefore 1 = C e^(1/2) ∫_(−∞)^(∞) e^(−(1/2)((x − 5/3)/(5/3))²) dx.
Comparing with 1 = (1/(σ√(2π))) ∫_(−∞)^(∞) e^(−(1/2)((x−μ)/σ)²) dx, we get
μ = 5/3 ≈ 1.667, σ = 5/3 and C = 1/(√e·√(2π)·σ) ≈ 0.145.
For the upper 5% value, the standard normal variate z = (x − μ)/σ = (x − 5/3)/(5/3) must cut off 5% of the area on its right, i.e. 95% of the area lies below it; from the table z = 1.645, so
1.645 = (x − 5/3)/(5/3), which gives x = (5/3)(1 + 1.645) ≈ 4.41.
3
Exponential distribution
A random variable X is said to have an exponential distribution with parameter γ > 0 if its probability density function f(x) is defined as
f(x) = γe^(−γx) for x ≥ 0, and f(x) = 0 for x < 0.
The cumulative distribution function is then
F(x) = ∫_0^x γe^(−γt)dt = 1 − e^(−γx) for x ≥ 0, and F(x) = 0 for x < 0.
Mean: μ = E(X) = ∫_0^(∞) x γe^(−γx) dx = γ[x·e^(−γx)/(−γ) − e^(−γx)/γ²]_0^(∞) = 1/γ.
Variance: σ² = E(X²) − [E(X)]² = ∫_(−∞)^(0) x²·0 dx + ∫_0^(∞) x² γe^(−γx) dx − (1/γ)² = 2/γ² − 1/γ² = 1/γ².
Example1. A power supply unit for a computer component is assumed to follow an exponential distribution with a mean life of 1200 hours. What is the probability that the component will (i) fail in the first 300 hours, (ii) survive more than 1500 hours, (iii) last between 120 hours and 1500 hours?
Solution: Since the mean of the exponential distribution is 1/γ, we have 1200 = 1/γ, which implies γ = 1/1200.
(i) The probability that the component will fail in the first 300 hours is
P(X ≤ x) = ∫_0^x γe^(−γt)dt = γ[e^(−γt)/(−γ)]_0^x = 1 − e^(−γx);
with x = 300, P(X ≤ 300) = 1 − e^(−300/1200) = 1 − e^(−0.25) ≈ 0.2212.
(ii) The probability that the component will survive more than 1500 hours is
P(X ≥ x) = ∫_x^(∞) γe^(−γt)dt = γ[e^(−γt)/(−γ)]_x^(∞) = e^(−γx);
with x = 1500, P(X ≥ 1500) = e^(−1500/1200) = e^(−1.25) ≈ 0.2865.
(iii) The probability that the component will last between 120 hours and 1500 hours is
P(120 ≤ X ≤ 1500) = e^(−120/1200) − e^(−1500/1200) = e^(−0.1) − e^(−1.25) ≈ 0.9048 − 0.2865 = 0.6183.
Example2. A random variable has an exponential distribution with probability density function
given by
f(x) = 3e^(−3x) for x > 0, and f(x) = 0 for x ≤ 0.
What is the probability that x is not less than 4? Find the mean and standard deviation, and show that the coefficient of variation is 1.
Solution: P(x is not less than 4) = P(x ≥ 4) = ∫_4^(∞) 3e^(−3x) dx = [−e^(−3x)]_4^(∞) = e^(−12).
Mean: μ = E(X) = ∫_0^(∞) x f(x)dx = 3∫_0^(∞) x e^(−3x) dx = 1/3.
Variance: σ² = E(X²) − [E(X)]² = ∫_0^(∞) x²·3e^(−3x) dx − (1/3)² = 2/9 − 1/9 = 1/9.
Standard deviation: σ = 1/3.
Therefore, coefficient of variation = (standard deviation)/(mean) = (1/3)/(1/3) = 1.
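As a cross-check of these exponential moments (mean 1/γ, variance 1/γ²), the sketch below (ours) integrates the density numerically with a simple midpoint rule.

```python
from math import exp

gamma = 3.0
f = lambda x: gamma * exp(-gamma * x)      # density 3 e^{-3x}, x > 0

# Crude numerical integration on [0, 20]; the tail beyond is negligible.
def integrate(g, a=0.0, b=20.0, n=200000):
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

mean = integrate(lambda x: x * f(x))                  # -> 1/3
var  = integrate(lambda x: x * x * f(x)) - mean**2    # -> 1/9
print(mean, var, var**0.5 / mean)                     # coefficient of variation -> 1
```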
Gamma Distribution
A continuous random variable X is said to have a gamma distribution with parameters α and γ if its probability density function is defined as
f(x) = α^γ e^(−αx) x^(γ−1)/Γ(γ) for x ≥ 0 (α, γ > 0), and f(x) = 0 for x < 0.
Total probability: ∫_(−∞)^(∞) f(x)dx = ∫_(−∞)^(0) 0 dx + ∫_0^(∞) [α^γ e^(−αx) x^(γ−1)/Γ(γ)] dx.
Putting αx = y, so that dx = (1/α)dy, the integral becomes
(1/Γ(γ)) ∫_0^(∞) y^(γ−1) e^(−y) dy = Γ(γ)/Γ(γ) = 1.
Mean: μ = E(X) = ∫_(−∞)^(∞) x f(x)dx = ∫_0^(∞) x·[α^γ e^(−αx) x^(γ−1)/Γ(γ)] dx
= (α^γ/Γ(γ)) ∫_0^(∞) e^(−αx) x^γ dx = (α^γ/Γ(γ))·Γ(γ+1)/α^(γ+1) = γ/α.
Variance: σ² = E(X²) − [E(X)]² = ∫_0^(∞) x²·[α^γ e^(−αx) x^(γ−1)/Γ(γ)] dx − (γ/α)²
= (α^γ/Γ(γ)) ∫_0^(∞) x^(γ+1) e^(−αx) dx − (γ/α)².
Putting αx = t, so that dx = (1/α)dt, we get
σ² = (α^γ/(α^(γ+2) Γ(γ))) ∫_0^(∞) t^(γ+1) e^(−t) dt − γ²/α² = Γ(γ+2)/(α² Γ(γ)) − γ²/α² = [(γ+1)γ − γ²]/α² = γ/α².
Moment generating function: M₀(t) = E(e^(tX)) = (α^γ/Γ(γ)) ∫_0^(∞) e^(−(α−t)x) x^(γ−1) dx.
Putting (α − t)x = y, so that dx = dy/(α − t), we get
M₀(t) = (α^γ/Γ(γ)) ∫_0^(∞) (y/(α − t))^(γ−1) e^(−y) dy/(α − t)
= (α^γ/((α − t)^γ Γ(γ))) ∫_0^(∞) y^(γ−1) e^(−y) dy = α^γ Γ(γ)/((α − t)^γ Γ(γ)) = (α/(α − t))^γ, t < α,
OR
M₀(t) = (1 − t/α)^(−γ), t < α.
Geometric Distribution
If 𝑝 be the probability of success and 𝑘 be the number of failures preceding the first success
then the geometric distribution is
𝑝(𝑘) = 𝑞 𝑘 𝑝 , 𝑘 = 0, 1, 2, … . . & 𝑞 = 1 − 𝑝
Obviously
Σ_(k=0)^(∞) p(k) = p Σ_(k=0)^(∞) q^k = p(1 + q + q² + q³ + ⋯) = p(1 − q)^(−1) = p/(1 − q) = p/p = 1.
Mean: μ = Σ_(k=0)^(∞) k p(k) = Σ_(k=0)^(∞) k·q^k p = 0 + 1·qp + 2·q²p + 3·q³p + ⋯
= qp(1 + 2q + 3q² + ⋯) = qp(1 − q)^(−2) = qp/p² = q/p.
Variance: σ² = Σ_(k=0)^(∞) k² p(k) − μ² = [0 + 1²·qp + 2²·q²p + 3²·q³p + ⋯] − (q/p)²
= qp(1 + q)(1 − q)^(−3) − q²/p² = q(1 + q)/p² − q²/p² = q/p².
Moment generating function: M₀(t) = E(e^(tk)) = Σ_(k=0)^(∞) e^(tk) q^k p = p Σ_(k=0)^(∞) (qe^t)^k
= p(1 + qe^t + (qe^t)² + (qe^t)³ + ⋯) = p(1 − qe^t)^(−1) = p/(1 − qe^t), for qe^t < 1.
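The series results μ = q/p and σ² = q/p² can be checked by truncating the sums; a small sketch of ours:

```python
p = 0.25
q = 1 - p

# Truncate the geometric series; q^k decays fast, so 500 terms are plenty.
pmf = [q**k * p for k in range(500)]

mean = sum(k * pk for k, pk in enumerate(pmf))
var  = sum(k * k * pk for k, pk in enumerate(pmf)) - mean**2

print(mean, q / p)       # both 3.0
print(var, q / p**2)     # both 12.0
```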
Weibull distribution
A continuous random variable 𝑋 has a Weibull distribution if its probability density function is
defined as
f(x) = (α/c) x^(α−1) e^(−x^α/c), x > 0, α > 0, c > 0, and f(x) = 0 otherwise.
Uniform distribution: A random variable X is uniformly distributed over (a, b) if its density is f(x) = 1/(b − a) for a < x < b and 0 otherwise; its mean is μ = (a + b)/2 and
Var(σ²) = ∫_a^b x²·(1/(b − a)) dx − μ² = (b − a)²/12.
Example1. A random variable X has a uniform distribution over (−3, 3). Find k for which P(X > k) = 1/3. Also evaluate P(X < 2) and P(|X − 2| < 2).
Solution: Here f(x) = 1/(3 − (−3)) = 1/6 for −3 < x < 3, and 0 otherwise. Therefore,
P(X > k) = 1 − P(X ≤ k) = 1 − ∫_(−3)^(k) f(x)dx = 1 − ∫_(−3)^(k) (1/6)dx = 1 − (k + 3)/6 = 1/2 − k/6.
But we are given P(X > k) = 1/3. Therefore,
1/2 − k/6 = 1/3, which implies k = 1.
Now
P(X < 2) = ∫_(−3)^(2) f(x)dx = ∫_(−3)^(2) (1/6)dx = 5/6.
And
P(|X − 2| < 2) = P(−2 < X − 2 < 2) = P(0 < X < 4) = ∫_0^3 f(x)dx = ∫_0^3 (1/6)dx = 1/2,
since f(x) = 0 for x > 3.
Example2. A die is cast until a 6 appears. What is the probability that it must be cast more than 5 times?
Solution: The probability of getting a 6 in a single cast is p = 1/6, so q = 1 − 1/6 = 5/6.
The number of casts X needed to obtain the first 6 follows the geometric law P(X = x) = (5/6)^(x−1)(1/6), x = 1, 2, 3, …
Therefore P(X > 5) = 1 − Σ_(x=1)^(5) (5/6)^(x−1)·(1/6) = 1 − (1/6)[1 + 5/6 + (5/6)² + (5/6)³ + (5/6)⁴] = (5/6)⁵.
Negative binomial distribution
This distribution gives the probability that an event occurs for the kth time on the rth trial (r ≥ k). If p is the probability of occurrence of the event in a single trial, then
P(k, r) = C(r − 1, k − 1) p^k q^(r−k).
It contains two parameters p (0 < p < 1) and k (a positive integer). If k = 1, the negative binomial distribution reduces to the geometric distribution.
Hypergeometric distribution
Suppose a bag contains m white and n black balls. If r balls are drawn one at a time (without replacement), then the probability that k of them will be white is
P(k) = C(m, k)·C(n, r − k)/C(m + n, r), k = 0, 1, 2, …, r, r ≤ m, r ≤ n.
Moments
1. The rth moment of a variable x about the mean x̄, usually denoted by μ_r, is given by
μ_r = (1/N) Σ f_i (x_i − x̄)^r, Σ f_i = N.
2. The rth moment of a variable x about any point a, usually denoted by μ′_r, is given by
μ′_r = (1/N) Σ f_i (x_i − a)^r, Σ f_i = N.
For the moments about the mean:
If r = 0, μ₀ = (1/N) Σ f_i (x_i − x̄)⁰ = (1/N) Σ f_i = N/N = 1.
If r = 1, μ₁ = (1/N) Σ f_i (x_i − x̄)¹ = (1/N) Σ f_i x_i − x̄·(1/N) Σ f_i = x̄ − x̄ = 0.
If r = 2, μ₂ = (1/N) Σ f_i (x_i − x̄)² = (1/N) Σ f_i x_i² − 2x̄·(1/N) Σ f_i x_i + x̄²·(1/N) Σ f_i
= (1/N) Σ f_i x_i² − 2x̄·x̄ + x̄² = (1/N) Σ f_i x_i² − x̄² = (1/N) Σ f_i x_i² − [(1/N) Σ f_i x_i]² = σ² (the variance).
If r = 3, μ₃ = (1/N) Σ f_i (x_i − x̄)³.
If r = 4, μ₄ = (1/N) Σ f_i (x_i − x̄)⁴.
For the moments about any point a:
If r = 0, μ′₀ = (1/N) Σ f_i (x_i − a)⁰ = (1/N) Σ f_i = N/N = 1.
If r = 1, μ′₁ = (1/N) Σ f_i (x_i − a)¹ = (1/N) Σ f_i x_i − a·(1/N) Σ f_i = x̄ − a.
If r = 2, μ′₂ = (1/N) Σ f_i (x_i − a)².
If r = 3, μ′₃ = (1/N) Σ f_i (x_i − a)³.
If r = 4, μ′₄ = (1/N) Σ f_i (x_i − a)⁴.
Moments about origin
v_r = (1/N) Σ f_i x_i^r, r = 0, 1, 2, …, where N = Σ f_i.
If r = 0, v₀ = (1/N) Σ f_i = 1.
If r = 1, v₁ = (1/N) Σ f_i x_i = x̄.
If r = 2, v₂ = (1/N) Σ f_i x_i².
If r = 3, v₃ = (1/N) Σ f_i x_i³.
If r = 4, v₄ = (1/N) Σ f_i x_i⁴.
We have
μ_r = (1/N) Σ f_i (x_i − x̄)^r = (1/N) Σ f_i ([x_i − a] − [x̄ − a])^r = (1/N) Σ f_i ([x_i − a] − μ′₁)^r.
Expanding by the binomial theorem,
μ_r = μ′_r − C(r,1) μ′_(r−1) μ′₁ + C(r,2) μ′_(r−2) (μ′₁)² − ⋯ + (−1)^r (μ′₁)^r.
Putting r = 2, 3, 4, …, we get
μ₂ = μ′₂ − 2(μ′₁)² + (μ′₁)² = μ′₂ − (μ′₁)², because μ′₀ = 1,
μ₃ = μ′₃ − 3μ′₂μ′₁ + 3(μ′₁)³ − (μ′₁)³ = μ′₃ − 3μ′₂μ′₁ + 2(μ′₁)³,
μ₄ = μ′₄ − 4μ′₃μ′₁ + 6μ′₂(μ′₁)² − 3(μ′₁)⁴.
Conversely,
μ′_r = (1/N) Σ f_i ([x_i − x̄] + [x̄ − a])^r
= (1/N) Σ [f_i (x_i − x̄)^r + C(r,1) f_i (x_i − x̄)^(r−1)(x̄ − a) + C(r,2) f_i (x_i − x̄)^(r−2)(x̄ − a)² + ⋯ + C(r, r−1) f_i (x_i − x̄)(x̄ − a)^(r−1) + f_i (x̄ − a)^r].
Putting r = 1, 2, 3, 4, …, we get
If r = 1, μ′₁ = x̄ − a.
If r = 2, μ′₂ = μ₂ + (μ′₁)².
If r = 3, μ′₃ = μ₃ + 3μ₂μ′₁ + (μ′₁)³.
If r = 4, μ′₄ = μ₄ + 4μ₃μ′₁ + 6μ₂(μ′₁)² + 4μ₁(μ′₁)³ + (μ′₁)⁴ = μ₄ + 4μ₃μ′₁ + 6μ₂(μ′₁)² + (μ′₁)⁴, because μ₁ = 0.
Similarly, v_r = (1/N) Σ f_i (x_i − a + a)^r = (1/N) Σ f_i [(x_i − a)^r + C(r,1)(x_i − a)^(r−1) a + ⋯ + a^r].
On taking a = x̄, we get
If r = 1, v₁ = μ₁ + μ₀x̄ = x̄, because μ₁ = 0, μ₀ = 1.
If r = 2, v₂ = μ₂ + C(2,1)μ₁x̄ + μ₀x̄² = μ₂ + x̄².
If r = 3, v₃ = μ₃ + C(3,1)μ₂x̄ + C(3,2)μ₁x̄² + μ₀x̄³ = μ₃ + 3μ₂x̄ + x̄³.
If r = 4, v₄ = μ₄ + C(4,1)μ₃x̄ + C(4,2)μ₂x̄² + C(4,3)μ₁x̄³ + μ₀x̄⁴ = μ₄ + 4μ₃x̄ + 6μ₂x̄² + x̄⁴.
Hence, v₁ = x̄; v₂ = μ₂ + x̄²; v₃ = μ₃ + 3μ₂x̄ + x̄³; v₄ = μ₄ + 4μ₃x̄ + 6μ₂x̄² + x̄⁴.
Moment generating function:
(1) The moment generating function (m.g.f.) is a function that generates moments. For a discrete probability distribution it is defined as
M_a(t) = Σ p_i e^(t(x_i − a)),   (1)
where M_a(t) is the m.g.f. of the discrete probability distribution of x about the point a, and is a function of the parameter t.
Expanding the exponential,
M_a(t) = Σ p_i [1 + t(x_i − a) + (t²/2!)(x_i − a)² + ⋯ + (t^r/r!)(x_i − a)^r + ⋯]
= Σ p_i + t Σ p_i (x_i − a) + (t²/2!) Σ p_i (x_i − a)² + ⋯ + (t^r/r!) Σ p_i (x_i − a)^r + ⋯   (2)
OR
M_a(t) = 1 + t μ′₁ + (t²/2!) μ′₂ + ⋯ + (t^r/r!) μ′_r + ⋯   (3)
From equation (3) we notice that μ′_r is the coefficient of t^r/r! in the expansion of M_a(t). For this reason the function M_a(t) is called the moment generating function (m.g.f.).
Alternately, 𝜇𝑟′ can also be obtained by differentiating 𝑀𝑎 (𝑡) 𝑟 times with respect to 𝑡 and
putting 𝑡 = 0 in the differentiated result i.e.
[d^r/dt^r (M_a(t))]_(t=0) = μ′_r.   (4)
Thus the moment about any point 𝑥 = 𝑎 can either be computed from equation (3) or more
easily from formula (4).
Also, M_a(t) = Σ p_i e^(t(x_i − a)) = e^(−at) Σ p_i e^(tx_i) = e^(−at) M₀(t).   (5)
Equation (5) shows that the moment generating function about the point a is e^(−at) times the m.g.f. about the origin.
Note:- (1) The m.g.f. of the sum of two independent variables is the product of their m.g.fs.
(2) If f(x) is the density function of a continuous variable X, then the moment generating function of this continuous probability distribution about x = a is defined as
M_a(t) = ∫_(−∞)^(∞) e^(t(x−a)) f(x)dx.
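Formula (4) — differentiate the m.g.f. r times and put t = 0 — can be carried out symbolically. The sketch below (ours, assuming the sympy library is available) applies it to the exponential m.g.f. (1 − ct)^(−1) derived in Example 6.

```python
import sympy as sp

t, c = sp.symbols('t c', positive=True)
M = (1 - c*t)**-1                       # m.g.f. of f(x) = (1/c) e^{-x/c}

mu1 = sp.diff(M, t, 1).subs(t, 0)       # first moment about the origin
mu2 = sp.diff(M, t, 2).subs(t, 0)       # second moment about the origin

print(mu1)              # c
print(mu2)              # 2*c**2
print(mu2 - mu1**2)     # variance c**2
```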
Example1. Find the moment generating function of the discrete binomial distribution given by
𝑓(𝑥) = 𝐶(𝑛, 𝑥)𝑝 𝑥 𝑞 𝑛−𝑥 . Also, find the first and second moment about the mean and standard
deviation.
Solution: The moment generating function about the origin is
M₀(t) = Σ e^(tx) f(x) = Σ e^(tx) C(n,x) p^x q^(n−x) = Σ C(n,x)(pe^t)^x q^(n−x)
= q^n + C(n,1)q^(n−1)(pe^t) + C(n,2)q^(n−2)(pe^t)² + ⋯ + (pe^t)^n = (q + pe^t)^n.
Therefore v₁ = [d/dt M₀(t)]_(t=0) = [n(q + pe^t)^(n−1) pe^t]_(t=0) = n(q + p)^(n−1) p = np, because q + p = 1,
and v₂ = [d²/dt² M₀(t)]_(t=0) = [d/dt{n(q + pe^t)^(n−1) pe^t}]_(t=0) = [n(n−1)(q + pe^t)^(n−2)(pe^t)² + n(q + pe^t)^(n−1) pe^t]_(t=0) = n(n−1)p² + np.
Hence mean = x̄ = v₁ = np, the first moment about the mean is μ₁ = 0, and the second moment about the mean is
μ₂ = v₂ − v₁² = n(n−1)p² + np − n²p² = np(1 − p) = npq, so that standard deviation = √(npq).
Example2. Find the moment generating function of the discrete (Poisson) distribution given by
f(x) = e^(−m) m^x/x!.
Also find the first and second moments about the mean and the variance.
Solution: Here f(x) = e^(−m) m^x/x!.
The moment generating function about the origin is
M₀(t) = Σ e^(tx)·e^(−m) m^x/x! = e^(−m) Σ (me^t)^x/x! = e^(−m)·e^(me^t) = e^(m(e^t − 1)).
Therefore v₁ = [d/dt M₀(t)]_(t=0) = [me^t e^(m(e^t − 1))]_(t=0) = m e^(m(1−1)) = m,
and v₂ = [d²/dt² M₀(t)]_(t=0) = [m²e^(2t) e^(m(e^t − 1)) + me^t e^(m(e^t − 1))]_(t=0) = e^(m(1−1)) m² + e^(m(1−1)) m = m² + m.
Hence mean = x̄ = v₁ = m, μ₁ = 0, and variance = μ₂ = v₂ − v₁² = m² + m − m² = m.
Example3. Find the moment generating function of the continuous exponential distribution f(x) = (1/c)e^(−x/c), 0 ≤ x < ∞, c > 0. Hence find its mean and variance.
Solution: M₀(t) = ∫_0^(∞) e^(tx)·(1/c)e^(−x/c) dx = (1/c)·[e^((t − 1/c)x)/(t − 1/c)]_0^(∞) = (1/c)·(0 − 1)/(t − 1/c), for t < 1/c,
= 1/(1 − ct) = (1 − ct)^(−1) = 1 + ct + (ct)² + (ct)³ + ⋯
Therefore mean = x̄ = v₁ = [d/dt M₀(t)]_(t=0) = [c + 2c²t + 3c³t² + ⋯]_(t=0) = c,
and v₂ = [d²/dt² M₀(t)]_(t=0) = [d/dt{c + 2c²t + 3c³t² + ⋯}]_(t=0) = [2c² + 6c³t + ⋯]_(t=0) = 2c².
Hence variance = v₂ − v₁² = 2c² − c² = c².
Example4. Find the moment generating function of the continuous normal distribution given by
f(x) = (1/(σ√(2π))) e^(−(1/2)((x−μ)/σ)²), −∞ < x < ∞.
Solution: The moment generating function about the origin is
M₀(t) = ∫_(−∞)^(∞) e^(tx) f(x)dx = ∫_(−∞)^(∞) e^(tx)·(1/(σ√(2π))) e^(−(1/2)((x−μ)/σ)²) dx.   (1)
Putting (x − μ)/σ = z, so that dx = σdz, in (1) we get
M₀(t) = (1/(σ√(2π))) ∫_(−∞)^(∞) e^(t(σz+μ)) e^(−z²/2)(σdz) = (e^(μt)/√(2π)) ∫_(−∞)^(∞) e^(tσz − z²/2) dz
= (e^(μt + σ²t²/2)/√(2π)) ∫_(−∞)^(∞) e^(−(1/2)(z² − 2tσz + σ²t²)) dz
= (e^(μt + σ²t²/2)/√(2π)) ∫_(−∞)^(∞) e^(−(1/2)(z − tσ)²) dz
= (e^(μt + σ²t²/2)/√(2π)) × √(2π), because ∫_(−∞)^(∞) e^(−(1/2)(z − tσ)²) dz = √(2π),
= e^(μt + σ²t²/2).