2 1
A P A B
P( B) P( A B) 3 4 5
P
B P ( B ) P ( B ) 2 8
3
2. Write down the axioms of probability.
i) $P(E) \ge 0$ ii) $P(S) = 1$ iii) $P\left(\bigcup_{i=1}^{n} E_i\right) = \sum_{i=1}^{n} P(E_i)$ if the $E_i$'s are mutually exclusive events
$E(X) = 1$,
$E[X(X-1)] = 4 \Rightarrow E[X^2 - X] = 4 \Rightarrow E[X^2] - E[X] = 4 \Rightarrow E[X^2] - 1 = 4 \Rightarrow E[X^2] = 5$
$\mathrm{Var}(X) = E[X^2] - [E(X)]^2 = 5 - 1 = 4$
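$k \int_0^{\infty} x^2 e^{-x}\,dx = 1$
$k \left[ -x^2 e^{-x} - 2x e^{-x} - 2e^{-x} \right]_0^{\infty} = 1$
$k[0 + 2] = 1 \Rightarrow 2k = 1 \Rightarrow k = \frac{1}{2}$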
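8. If a random variable X has the p.d.f. $f(x) = \begin{cases} \frac{1}{2}(x+1), & -1 < x < 1 \\ 0, & \text{otherwise} \end{cases}$, find the mean and variance of X.
Mean $= \mu_1' = \int_{-1}^{1} x f(x)\,dx = \frac{1}{2}\int_{-1}^{1} x(x+1)\,dx = \frac{1}{2}\int_{-1}^{1} (x^2 + x)\,dx = \frac{1}{2}\left[\frac{x^3}{3} + \frac{x^2}{2}\right]_{-1}^{1} = \frac{1}{2}\left[\frac{1}{3} + \frac{1}{2} + \frac{1}{3} - \frac{1}{2}\right] = \frac{1}{3}$
$\mu_2' = \int_{-1}^{1} x^2 f(x)\,dx = \frac{1}{2}\int_{-1}^{1} (x^3 + x^2)\,dx = \frac{1}{2}\left[\frac{x^4}{4} + \frac{x^3}{3}\right]_{-1}^{1} = \frac{1}{2}\left[\frac{1}{4} + \frac{1}{3} - \frac{1}{4} + \frac{1}{3}\right] = \frac{1}{3}$
Variance $= \mu_2' - (\mu_1')^2 = \frac{1}{3} - \frac{1}{9} = \frac{3}{9} - \frac{1}{9} = \frac{2}{9}$.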
9. Let $M_X(t) = \frac{1}{1-t}$, such that $|t| < 1$, be the mgf of the r.v. X. Find the mgf of Y = 2X + 1.
$M_Y(t) = M_{2X+1}(t) = e^{t} M_X(2t)$ [since $M_{aX+b}(t) = e^{bt} M_X(at)$]
$= e^{t} \cdot \frac{1}{1-2t} = \frac{e^{t}}{1-2t}$
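10. If the random variable has the moment generating function $M_X(t) = \frac{3}{3-t}$, compute $E[X^2]$. (May/June 2016)
$M_X(t) = \frac{3}{3-t} = \frac{3}{3\left(1 - \frac{t}{3}\right)} = \left(1 - \frac{t}{3}\right)^{-1}$
$= 1 + \frac{t}{3} + \left(\frac{t}{3}\right)^2 + \left(\frac{t}{3}\right)^3 + \dots = 1 + \frac{1}{3}\cdot\frac{t}{1!} + \frac{2}{9}\cdot\frac{t^2}{2!} + \frac{2}{9}\cdot\frac{t^3}{3!} + \dots$
$E[X^r] = \mu_r' =$ coefficient of $\frac{t^r}{r!}$ in $M_X(t)$
$E[X^2] = \mu_2' =$ coefficient of $\frac{t^2}{2!}$ in $M_X(t) = \frac{2}{9}$.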
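11. A random variable X has density function given by $f(x) = \begin{cases} 2e^{-2x}, & x \ge 0 \\ 0, & x < 0 \end{cases}$. Find the m.g.f. of X.
$M_X(t) = E(e^{tX}) = \int_0^{\infty} e^{tx} f(x)\,dx = \int_0^{\infty} e^{tx}\, 2e^{-2x}\,dx = 2\int_0^{\infty} e^{-(2-t)x}\,dx$
$= 2\left[\frac{e^{-(2-t)x}}{-(2-t)}\right]_0^{\infty} = \frac{-2}{2-t}\left[e^{-\infty} - e^{0}\right] = \frac{-2}{2-t}[0 - 1] = \frac{2}{2-t}$, $t < 2$.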
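12. A continuous RV X has the pdf $f(x) = \frac{x^2 e^{-x}}{2}$, $x \ge 0$. Find the rth moment of X about the origin.
$\mu_r' = E[X^r] = \int_0^{\infty} x^r f(x)\,dx = \int_0^{\infty} x^r \frac{x^2 e^{-x}}{2}\,dx = \frac{1}{2}\int_0^{\infty} x^{r+2} e^{-x}\,dx = \frac{1}{2}\int_0^{\infty} x^{(r+3)-1} e^{-x}\,dx$
$= \frac{1}{2}\Gamma(r+3)$ [since $\Gamma(n) = \int_0^{\infty} x^{n-1} e^{-x}\,dx$]
$= \frac{1}{2}(r+2)!$ [if n is a positive integer, $\Gamma(n) = (n-1)!$]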
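13. For a Binomial distribution with mean 6 and standard deviation $\sqrt{2}$, find the first two terms of the
distribution. (May/June 2014)
$np = 6$ and $\sqrt{npq} = \sqrt{2} \Rightarrow npq = 2 \Rightarrow 6q = 2 \Rightarrow q = \frac{1}{3}$, $p = \frac{2}{3}$; $n \cdot \frac{2}{3} = 6 \Rightarrow n = 9$
$P(X = x) = nC_x\, p^x q^{n-x} = 9C_x \left(\frac{2}{3}\right)^x \left(\frac{1}{3}\right)^{9-x}$
$P(X = 0) = 9C_0 \left(\frac{2}{3}\right)^0 \left(\frac{1}{3}\right)^9 = \left(\frac{1}{3}\right)^9$
$P(X = 1) = 9C_1 \left(\frac{2}{3}\right)^1 \left(\frac{1}{3}\right)^8 = 9 \cdot \frac{2}{3} \cdot \frac{1}{3^8} = \frac{2}{3^7}$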
14. Define Binomial Distribution. What are its mean and variance? (April/May 2017)
The probability of x successes in n trials is given by $P(X = x) = nC_x\, p^x q^{n-x}$, $x = 0, 1, 2, \dots, n$, where $q = 1 - p$.
Mean = np and variance = npq
15. One percent of jobs arriving at a computer system need to wait until weekends for scheduling, owing to
core-size limitations. Find the probability that among a sample of 200 jobs there are no jobs that have
to wait until weekends.
Let X be the random variable denoting the number of jobs that have to wait.
p = 1% = 0.01, n = 200, $\lambda = np = (200)(0.01) = 2$
By the Poisson distribution, $P(X = x) = \frac{e^{-\lambda}\lambda^x}{x!}$, $x = 0, 1, 2, \dots$
Probability that there are no jobs that have to wait until weekends $= P(X = 0) = \frac{e^{-2}\, 2^0}{0!} = e^{-2} = 0.1353$
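The Poisson value above approximates an underlying binomial count with n = 200 and p = 0.01. A minimal sketch comparing the two, where the helper names `poisson_pmf` and `binomial_pmf` are illustrative and not part of the original solution:

```python
from math import comb, exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for a Poisson variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for a binomial variable with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 200, 0.01
approx = poisson_pmf(0, n * p)   # Poisson approximation with lambda = np = 2
exact = binomial_pmf(0, n, p)    # exact binomial probability of zero waiting jobs
print(f"Poisson: {approx:.4f}, binomial: {exact:.4f}")
```

The two values agree to about two decimal places, which is why the Poisson form is used when n is large and p is small.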
16. A quality control inspector rejects 40% of a certain product. Find the probability that the first
acceptable product is the third one inspected.
Probability of rejection = q = 0.4
Probability of acceptance = p = 0.6
By the geometric distribution, $P(X = x) = q^{x-1} p$, $x = 1, 2, 3, \dots$
$P(X = 3) = q^{3-1} p = (0.4)^2 (0.6) = 0.096$
17. If the probability that a target is destroyed on any one shot is 0.5, find the probability that it would be
destroyed on 6th attempt. (Nov / Dec 2013)
Given that the probability that a target is destroyed on any one shot is 0.5,
$p = 0.5$, $q = 1 - p = 1 - 0.5 = 0.5$
By the geometric distribution, $P(X = x) = q^{x-1} p$, $x = 1, 2, 3, \dots$; $P(X = 6) = (0.5)^{6-1}(0.5) = (0.5)^6 = 0.0156$
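Both geometric-distribution answers (first acceptable product on the third inspection, target destroyed on the sixth attempt) use the same pmf. A small sketch, with `geometric_pmf` as an illustrative helper name:

```python
def geometric_pmf(x: int, p: float) -> float:
    """P(first success occurs on trial x), x = 1, 2, 3, ..., success probability p."""
    if x < 1:
        raise ValueError("x must be a positive integer")
    return (1 - p) ** (x - 1) * p


# First acceptable product is the third one inspected (acceptance probability 0.6).
p_third = geometric_pmf(3, 0.6)
# Target destroyed on the 6th attempt (hit probability 0.5).
p_sixth = geometric_pmf(6, 0.5)
print(round(p_third, 4), round(p_sixth, 4))
```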
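18. If X is a uniformly distributed R.V. with mean 1 and variance $\frac{4}{3}$, find P(X < 0).
Mean $= \frac{a+b}{2} = 1 \Rightarrow a + b = 2$ --------------(1)
Variance $= \frac{(b-a)^2}{12} = \frac{4}{3} \Rightarrow (b-a)^2 = 16 \Rightarrow b - a = 4$ ---------------(2)
(1) + (2): 2b = 6, b = 3
(1) – (2): 2a = -2, a = -1
$f(x) = \begin{cases} \frac{1}{b-a}, & a < x < b \\ 0, & \text{otherwise} \end{cases} \Rightarrow f(x) = \begin{cases} \frac{1}{4}, & -1 < x < 3 \\ 0, & \text{otherwise} \end{cases}$
$P(X < 0) = \int_{-1}^{0} f(x)\,dx = \int_{-1}^{0} \frac{1}{4}\,dx = \frac{1}{4}[x]_{-1}^{0} = \frac{1}{4}$.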
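19. Suppose the length of life of an appliance has an exponential distribution with mean 10 years. What is
the probability that the average lifetime of a random sample of the appliances is at least 10.5?
Mean of the exponential distribution $= E(X) = \frac{1}{\lambda} = 10 \Rightarrow \lambda = \frac{1}{10}$
$f(x) = \lambda e^{-\lambda x}$, $x > 0 \Rightarrow f(x) = \frac{1}{10} e^{-x/10}$, $x > 0$
$P(X \ge 10.5) = \int_{10.5}^{\infty} f(x)\,dx = \int_{10.5}^{\infty} \frac{1}{10} e^{-x/10}\,dx = \left[-e^{-x/10}\right]_{10.5}^{\infty} = e^{-1.05} = 0.3499$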
$\sum_{x=0}^{7} P(X = x) = 1$
$K + 2K + 2K + 3K + K^2 + 2K^2 + 7K^2 + K = 1$
$10K^2 + 9K - 1 = 0 \Rightarrow K = \frac{1}{10}$ or $K = -1$ (here $K = -1$ is impossible, since $P(X = x) \ge 0$)
$\therefore K = \frac{1}{10}$
The probability mass function is
X = x       0    1      2      3      4      5       6       7
P(X = x)    0    1/10   2/10   2/10   3/10   1/100   2/100   17/100
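(ii) $P(X < 6) = P(X = 0) + P(X = 1) + \dots + P(X = 5) = \frac{1}{10} + \frac{2}{10} + \frac{2}{10} + \frac{3}{10} + \frac{1}{100} = \frac{81}{100}$
$P(X \ge 6) = 1 - P(X < 6) = 1 - \frac{81}{100} = \frac{19}{100}$
$P(0 < X < 5) = P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) = K + 2K + 2K + 3K = 8K = \frac{8}{10} = \frac{4}{5}$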
(iii) The distribution function of X is given by $F_X(x) = P(X \le x)$:
X = x   P(X = x)   F_X(x) = P(X <= x)                                       Range
0       0          0                                                        x < 1
1       1/10       0 + 1/10 = 1/10                                          1 <= x < 2
2       2/10       0 + 1/10 + 2/10 = 3/10                                   2 <= x < 3
3       2/10       0 + 1/10 + 2/10 + 2/10 = 5/10                            3 <= x < 4
4       3/10       0 + 1/10 + 2/10 + 2/10 + 3/10 = 8/10 = 4/5               4 <= x < 5
5       1/100      0 + 1/10 + 2/10 + 2/10 + 3/10 + 1/100 = 81/100           5 <= x < 6
6       2/100      81/100 + 2/100 = 83/100                                  6 <= x < 7
7       17/100     83/100 + 17/100 = 100/100 = 1                            x >= 7
(vi) To find the minimum value of C if $P(X \le C) > \frac{1}{2}$
X = x   P(X = x)   P(X <= x)          Compared with 1/2
0       0          0                  < 1/2
1       1/10       1/10               < 1/2
2       2/10       3/10               < 1/2
3       2/10       5/10               = 1/2
4       3/10       8/10 = 4/5         > 1/2
5       1/100      81/100             > 1/2
6       2/100      83/100             > 1/2
7       17/100     100/100 = 1        > 1/2
The minimum value of C is 4.
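The table arithmetic for this probability mass function can be checked with exact fractions. A minimal sketch, assuming the pmf values 0, K, 2K, 2K, 3K, K², 2K², 7K² + K for x = 0, ..., 7 that appear in the normalisation equation; the variable names are illustrative:

```python
from fractions import Fraction
from itertools import accumulate

K = Fraction(1, 10)  # admissible root of 10K^2 + 9K - 1 = 0

# pmf for x = 0, 1, ..., 7 as used in the normalisation equation
pmf = [0 * K, K, 2 * K, 2 * K, 3 * K, K**2, 2 * K**2, 7 * K**2 + K]
assert sum(pmf) == 1  # total probability must be 1

cdf = list(accumulate(pmf))               # F(x) = P(X <= x)
p_less_than_6 = sum(pmf[:6])              # P(X < 6)
p_at_least_6 = 1 - p_less_than_6          # P(X >= 6)
p_between = sum(pmf[1:5])                 # P(0 < X < 5)

# Smallest C with P(X <= C) strictly greater than 1/2
min_c = next(c for c, F in enumerate(cdf) if F > Fraction(1, 2))
print(p_less_than_6, p_at_least_6, p_between, min_c)
```

Using `Fraction` avoids rounding, so the boundary case P(X ≤ 3) = 1/2 is correctly treated as not exceeding 1/2.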
ii) Trains arrive at a station at 15 minutes interval starting at 4 a.m. If a passenger arrives at a
station at a time that is uniformly distributed between 9.00 a.m. and 9.30 a.m., find the
probability that he has to wait for the train for (i) less than 6 minutes (ii) more than 10 minutes.
(May/June 2014)
Solution:
Let X denote the number of minutes past 9.00 a.m. at which the passenger arrives at the station, up to 9.30 a.m.
$X \sim U[0, 30]$, $f(x) = \frac{1}{30}$, $0 < x < 30$
(i) P(he has to wait for the train for less than 6 minutes)
$= P(9 < X < 15) + P(24 < X < 30) = \int_{9}^{15} \frac{1}{30}\,dx + \int_{24}^{30} \frac{1}{30}\,dx = \frac{1}{30}\left([x]_{9}^{15} + [x]_{24}^{30}\right) = \frac{12}{30} = 0.4$
(ii) P(he has to wait for the train for more than 10 minutes)
$= P(0 < X < 5) + P(15 < X < 20) = \int_{0}^{5} \frac{1}{30}\,dx + \int_{15}^{20} \frac{1}{30}\,dx = \frac{1}{30}\left([x]_{0}^{5} + [x]_{15}^{20}\right) = \frac{10}{30} = 0.3333$
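iii) Find the moment generating function of a Poisson distribution. Hence find the mean and variance.
(APR / MAY’ 19)
Solution:
$P(X = x) = \begin{cases} \frac{e^{-\lambda}\lambda^x}{x!}, & x = 0, 1, 2, \dots;\ \lambda > 0 \\ 0, & \text{otherwise} \end{cases}$
M.G.F. $= M_X(t) = E(e^{tX}) = \sum_{x=0}^{\infty} e^{tx} \frac{e^{-\lambda}\lambda^x}{x!} = e^{-\lambda} \sum_{x=0}^{\infty} \frac{(\lambda e^t)^x}{x!}$
$= e^{-\lambda}\left[1 + \frac{\lambda e^t}{1!} + \frac{(\lambda e^t)^2}{2!} + \dots\right] = e^{-\lambda} e^{\lambda e^t}$
$M_X(t) = e^{\lambda(e^t - 1)}$
Mean $= E(X) = \left[\frac{d}{dt} M_X(t)\right]_{t=0} = \left[\frac{d}{dt} e^{\lambda(e^t - 1)}\right]_{t=0} = \left[\lambda e^t e^{\lambda(e^t - 1)}\right]_{t=0} = \lambda$
Variance $= \mathrm{Var}(X) = E(X^2) - [E(X)]^2$
where $E(X^2) = \left[\frac{d}{dt} M_X'(t)\right]_{t=0} = \left[\frac{d}{dt}\left(\lambda e^t e^{\lambda(e^t - 1)}\right)\right]_{t=0} = \left[\lambda e^t \cdot \lambda e^t e^{\lambda(e^t - 1)} + \lambda e^{\lambda(e^t - 1)} e^t\right]_{t=0} = \lambda^2 + \lambda = \lambda(\lambda + 1)$
Variance $= \mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \lambda(\lambda + 1) - \lambda^2 = \lambda$.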
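The mean and variance obtained from the Poisson m.g.f. can be sanity-checked numerically by differentiating $M_X(t) = e^{\lambda(e^t - 1)}$ at t = 0 with central differences. This is only a rough sketch; the step size `h`, the sample value `lam = 2.5`, and the function names are assumptions chosen for illustration:

```python
from math import exp


def poisson_mgf(t: float, lam: float) -> float:
    """M_X(t) = exp(lam * (e^t - 1)) for a Poisson variable with mean lam."""
    return exp(lam * (exp(t) - 1.0))


def first_two_moments(mgf, h: float = 1e-4):
    """Approximate M'(0) and M''(0) with central finite differences."""
    m0, m_plus, m_minus = mgf(0.0), mgf(h), mgf(-h)
    first = (m_plus - m_minus) / (2 * h)           # ~ E(X)
    second = (m_plus - 2 * m0 + m_minus) / h**2    # ~ E(X^2)
    return first, second


lam = 2.5  # arbitrary illustrative rate
mean, second_moment = first_two_moments(lambda t: poisson_mgf(t, lam))
variance = second_moment - mean**2
print(f"mean ~ {mean:.4f}, variance ~ {variance:.4f}")
```

Both printed values should be close to the chosen rate, matching the result that the mean and variance of a Poisson distribution are both λ.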
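2. i) A continuous random variable X has the p.d.f. $f(x) = kx^3 e^{-x}$, $x \ge 0$. Find the rth order moment of X
about the origin. Hence find the m.g.f., mean and variance of X. (Nov/Dec 2015)
Solution:
Since $\int_0^{\infty} kx^3 e^{-x}\,dx = 1$,
$k\left[-x^3 e^{-x} - 3x^2 e^{-x} - 6x e^{-x} - 6e^{-x}\right]_0^{\infty} = 1 \Rightarrow k[(0) - (-6)] = 1 \Rightarrow 6k = 1 \Rightarrow k = \frac{1}{6}$.
$\mu_r' = E(X^r) = \int_0^{\infty} x^r f(x)\,dx = \frac{1}{6}\int_0^{\infty} x^r x^3 e^{-x}\,dx = \frac{1}{6}\int_0^{\infty} x^{r+3} e^{-x}\,dx$
$= \frac{1}{6}\int_0^{\infty} e^{-x} x^{(r+4)-1}\,dx = \frac{1}{6}\Gamma(r+4) = \frac{(r+3)!}{6}$
[since $\Gamma(n) = \int_0^{\infty} e^{-x} x^{n-1}\,dx$, $n > 0$; here $n = r + 4$ and $\Gamma(n) = (n-1)!$]
Putting r = 1, $E(X) = \mu_1' = \frac{4!}{6} = \frac{24}{6} = 4$
r = 2, $E(X^2) = \mu_2' = \frac{5!}{6} = \frac{120}{6} = 20$
Mean $= E(X) = \mu_1' = 4$; Variance $= E(X^2) - [E(X)]^2 = \mu_2' - (\mu_1')^2 = 20 - 4^2 = 20 - 16 = 4$
To find the M.G.F.:
$M_X(t) = E(e^{tX}) = \int_0^{\infty} e^{tx} f(x)\,dx = \frac{1}{6}\int_0^{\infty} x^3 e^{tx} e^{-x}\,dx = \frac{1}{6}\int_0^{\infty} x^3 e^{-(1-t)x}\,dx$
$= \frac{1}{6}\left[-\left(\frac{x^3}{(1-t)} + \frac{3x^2}{(1-t)^2} + \frac{6x}{(1-t)^3} + \frac{6}{(1-t)^4}\right) e^{-(1-t)x}\right]_0^{\infty}$
$= \frac{1}{6}\left[(0) + \frac{6}{(1-t)^4}\right]$
$M_X(t) = \frac{1}{(1-t)^4}$, $t < 1$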
ii) a) A machine manufacturing bolts is known to produce 5% defective. In a random sample of 15
bolts, what is the probability that there are (1) exactly 3 defective bolts, and (2) not more than
3 defective bolts? (NOV/DEC 2018)
Solution:
Given n = 15, p = 0.05, q = 0.95
By the Binomial distribution, $P(X = x) = nC_x\, p^x q^{n-x} = 15C_x (0.05)^x (0.95)^{15-x}$
(1) P(exactly 3 defective bolts) $= P(X = 3) = 15C_3 (0.05)^3 (0.95)^{15-3} = 0.0307$
(2) P(not more than 3 defective bolts) $= P(X \le 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)$
$= 15C_0 (0.05)^0 (0.95)^{15} + 15C_1 (0.05)^1 (0.95)^{14} + 15C_2 (0.05)^2 (0.95)^{13} + 15C_3 (0.05)^3 (0.95)^{12}$
$= 0.994$
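The two binomial probabilities for the bolt sample can be reproduced directly from the pmf. A short sketch using only the standard library; `binomial_pmf` is an illustrative helper name:

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for n independent trials with success (defect) probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 15, 0.05
p_exactly_3 = binomial_pmf(3, n, p)
p_at_most_3 = sum(binomial_pmf(x, n, p) for x in range(4))  # x = 0, 1, 2, 3
print(f"P(X = 3) = {p_exactly_3:.4f}, P(X <= 3) = {p_at_most_3:.4f}")
```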
b) Six coins are tossed 6400 times. Using the Poisson distribution, what is the approximate
probability of getting six heads 10 times?
Solution:
Probability of getting six heads in one toss of six coins is $p = \left(\frac{1}{2}\right)^6 = \frac{1}{64}$, so $\lambda = np = 6400 \left(\frac{1}{2}\right)^6 = 100$
Let X be the number of times six heads are obtained. $P(X = 10) = \frac{e^{-100}(100)^{10}}{10!} = 1.025 \times 10^{-30}$
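3. i) A random variable X has the probability mass function $f(x) = \frac{1}{2^x}$, x = 1, 2, 3, ...
Find its (i) M.G.F (ii) Mean (iii) Variance.
Solution:
M.G.F. $= M_X(t) = E(e^{tX}) = \sum_{x=1}^{\infty} e^{tx} f(x) = \sum_{x=1}^{\infty} e^{tx} \frac{1}{2^x} = \sum_{x=1}^{\infty} \left(\frac{e^t}{2}\right)^x$
$= \frac{e^t}{2} + \left(\frac{e^t}{2}\right)^2 + \left(\frac{e^t}{2}\right)^3 + \left(\frac{e^t}{2}\right)^4 + \dots$
$= \frac{e^t}{2}\left[1 + \frac{e^t}{2} + \left(\frac{e^t}{2}\right)^2 + \left(\frac{e^t}{2}\right)^3 + \dots\right]$
$= \frac{e^t}{2}\left[1 - \frac{e^t}{2}\right]^{-1}$
M.G.F. $= M_X(t) = \frac{e^t}{2 - e^t}$ …………….. (1)
Mean $= E(X) = \left[\frac{d}{dt} M_X(t)\right]_{t=0} = \left[\frac{d}{dt} \frac{e^t}{2 - e^t}\right]_{t=0} = \left[\frac{(2 - e^t)e^t + e^t(e^t)}{(2 - e^t)^2}\right]_{t=0} = 2$
Variance $= \mathrm{Var}(X) = E(X^2) - [E(X)]^2$
where $E(X^2) = \left[\frac{d}{dt} M_X'(t)\right]_{t=0} = \left[\frac{d}{dt} \frac{2e^t}{(2 - e^t)^2}\right]_{t=0} = \left[\frac{(2 - e^t)^2\, 2e^t + 2e^t \cdot 2(2 - e^t)(e^t)}{(2 - e^t)^4}\right]_{t=0} = 6$
Variance $= \mathrm{Var}(X) = E(X^2) - [E(X)]^2 = 6 - 4 = 2$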
ii) A component has an exponential time to failure distribution with mean of 10,000 hours.
(i) The component has already been in operation for its mean life. What is the
probability that it will fail by 15,000 hours?
(ii) At 15,000 hours the component is still in operation. What is the probability
that it will operate for another 5000 hours? (Nov/Dec 2015)
Solution:
Let X be the random variable denoting the time to failure of the component, following an exponential
distribution with mean 10,000 hours: $\frac{1}{\lambda} = 10{,}000 \Rightarrow \lambda = \frac{1}{10{,}000}$
The p.d.f. of X is $f(x) = \begin{cases} \frac{1}{10{,}000} e^{-x/10{,}000}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$
(i) Probability that the component will fail by 15,000 hours given that it has already been in
operation for its mean life $= P(X < 15{,}000 / X > 10{,}000) = \frac{P(10{,}000 < X < 15{,}000)}{P(X > 10{,}000)}$ ……(1)
$P(10{,}000 < X < 15{,}000) = \int_{10{,}000}^{15{,}000} \frac{1}{10{,}000} e^{-x/10{,}000}\,dx = \left[-e^{-x/10{,}000}\right]_{10{,}000}^{15{,}000} = e^{-1} - e^{-1.5}$ ……(2)
$P(X > 10{,}000) = \int_{10{,}000}^{\infty} \frac{1}{10{,}000} e^{-x/10{,}000}\,dx = \left[-e^{-x/10{,}000}\right]_{10{,}000}^{\infty} = e^{-1}$ ……(3)
Substituting (2) and (3) in (1):
$P(X < 15{,}000 / X > 10{,}000) = \frac{e^{-1} - e^{-1.5}}{e^{-1}} = \frac{0.3679 - 0.2231}{0.3679} = 0.3936$.
(ii) Probability that the component will operate for another 5000 hours given that
it is in operation at 15,000 hours $= P(X > 20{,}000 / X > 15{,}000)$
$= P(X > 5000)$ [by the memoryless property]
$= \int_{5000}^{\infty} f(x)\,dx = \int_{5000}^{\infty} \frac{1}{10{,}000} e^{-x/10{,}000}\,dx = \left[-e^{-x/10{,}000}\right]_{5000}^{\infty} = e^{-0.5} = 0.6065$
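Both conditional probabilities follow from the exponential survival function $P(X > x) = e^{-x/\text{mean}}$. A brief sketch under that assumption; `survival` is an illustrative helper name, and the second computation deliberately uses the ratio form rather than the memoryless shortcut so the two can be compared:

```python
from math import exp


def survival(x: float, mean: float) -> float:
    """P(X > x) for an exponential lifetime with the given mean."""
    return exp(-x / mean)


mean_life = 10_000.0

# (i) P(X < 15000 | X > 10000) = [S(10000) - S(15000)] / S(10000)
p_fail_by_15000 = (survival(10_000, mean_life) - survival(15_000, mean_life)) / survival(10_000, mean_life)

# (ii) P(X > 20000 | X > 15000) = S(20000) / S(15000); memorylessness says this equals S(5000)
p_another_5000 = survival(20_000, mean_life) / survival(15_000, mean_life)

print(f"{p_fail_by_15000:.4f} {p_another_5000:.4f} {survival(5_000, mean_life):.4f}")
```

Working without intermediate rounding gives a value for (i) that can differ in the fourth decimal place from a hand calculation based on rounded table values.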
iii) An electrical firm manufactures light bulbs that have a life, before burn-out, that is normally
distributed with mean equal to 800 hours and a standard deviation of 40 hours. Find the
probability that a bulb burns between 778 and 834 hours. (April/May 2018)
Solution:
Given $X \sim N(\mu, \sigma)$ where $\mu = 800$, $\sigma = 40$, and $z = \frac{x - \mu}{\sigma} = \frac{x - 800}{40}$
The standard normal z values corresponding to $x_1 = 778$, $x_2 = 834$ are
$z_1 = \frac{778 - 800}{40} = -0.55$ and $z_2 = \frac{834 - 800}{40} = 0.85$
$P(778 < X < 834) = P(-0.55 < Z < 0.85)$
$= P(0 < Z < 0.55) + P(0 < Z < 0.85) = 0.2088 + 0.3023 = 0.5111$
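The normal-table lookup can be cross-checked with the standard library's `statistics.NormalDist`. A minimal sketch; the variable names are illustrative:

```python
from statistics import NormalDist

# Bulb life in hours: normal with mean 800 and standard deviation 40
bulb_life = NormalDist(mu=800, sigma=40)

z1 = (778 - 800) / 40   # standardised lower limit
z2 = (834 - 800) / 40   # standardised upper limit

# P(778 < X < 834) as a difference of cumulative probabilities
p_between = bulb_life.cdf(834) - bulb_life.cdf(778)
print(f"z1 = {z1:.2f}, z2 = {z2:.2f}, P = {p_between:.4f}")
```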
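$\int_{-\infty}^{\infty} f(x)\,dx = 1 \Rightarrow \int_0^1 f(x)\,dx + \int_1^2 f(x)\,dx + \int_2^3 f(x)\,dx = 1$
$\int_0^1 ax\,dx + \int_1^2 a\,dx + \int_2^3 (3a - ax)\,dx = 1$
$a\left[\frac{x^2}{2}\right]_0^1 + a[x]_1^2 + a\left[3x - \frac{x^2}{2}\right]_2^3 = 1 \Rightarrow \frac{a}{2} + a + \frac{a}{2} = 1 \Rightarrow 2a = 1 \Rightarrow a = \frac{1}{2}$
(ii) CDF
If $0 \le x \le 1$:
$F(x) = \int_0^x f(x)\,dx = \int_0^x \frac{x}{2}\,dx = \left[\frac{x^2}{4}\right]_0^x = \frac{x^2}{4}$
If $1 \le x \le 2$:
$F(x) = \int_0^1 \frac{x}{2}\,dx + \int_1^x \frac{1}{2}\,dx = \left[\frac{x^2}{4}\right]_0^1 + \left[\frac{x}{2}\right]_1^x = \frac{1}{4} + \frac{x}{2} - \frac{1}{2} = \frac{x}{2} - \frac{1}{4}$
If $2 \le x \le 3$:
$F(x) = \int_0^1 \frac{x}{2}\,dx + \int_1^2 \frac{1}{2}\,dx + \int_2^x \left(\frac{3}{2} - \frac{x}{2}\right)dx = \left[\frac{x^2}{4}\right]_0^1 + \left[\frac{x}{2}\right]_1^2 + \left[\frac{3x}{2} - \frac{x^2}{4}\right]_2^x = \frac{3x}{2} - \frac{x^2}{4} - \frac{5}{4}$
$F_X(x) = \begin{cases} 0, & x < 0 \\ \frac{x^2}{4}, & 0 \le x \le 1 \\ \frac{x}{2} - \frac{1}{4}, & 1 \le x \le 2 \\ \frac{3x}{2} - \frac{x^2}{4} - \frac{5}{4}, & 2 \le x \le 3 \\ 1, & x > 3 \end{cases}$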
$\frac{0.94 \times 0.02}{0.94 \times 0.02 + 0.05 \times 0.98} = \frac{0.0188}{0.0678} = 0.27729$
5. i) The average percentage of marks of candidates in an examination is 42 with a standard deviation
of 10. If the minimum mark for pass is 50% and 1000 candidates appear for the examination,
how many candidates can be expected to get the pass mark? If it is required, that double the
number of the candidates should pass, what should be the minimum mark for pass?
(Nov/Dec 2015)
Solution:
Let X denote the marks of the candidates; then $X \sim N(42, 10^2)$.
Let $z = \frac{X - 42}{10}$. $P(X \ge 50) = P(z \ge 0.8) = 0.5 - P(0 < z < 0.8) = 0.5 - 0.2881 = 0.2119$
If 1000 students write the test, $1000\, P(X \ge 50) \approx 212$ students would pass the examination.
If double that number should pass, then the number of passes should be 424.
We have to find $z_1$ such that $P(z \ge z_1) = 0.424$
$P(0 < z < z_1) = 0.5 - 0.424 = 0.076$
From tables, $z_1 = 0.19$. Since $z_1 = \frac{x_1 - 42}{10}$, $x_1 = 42 + 10 z_1 = 42 + 1.9 = 43.9$
The pass mark should be 44 nearly.
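For a mark distribution $N(42, 10^2)$, the cut-off that lets a fraction 0.424 of candidates pass is the point $x_1$ with $P(X \ge x_1) = 0.424$. A small sketch using `statistics.NormalDist` from the standard library; the variable names are illustrative:

```python
from statistics import NormalDist

marks = NormalDist(mu=42, sigma=10)

# Expected number of passes out of 1000 when the pass mark is 50
p_pass = 1 - marks.cdf(50)
expected_passes = 1000 * p_pass

# Doubling the (rounded) number of passes fixes the target pass fraction
target_fraction = 2 * round(expected_passes) / 1000

# Cut-off x1 with P(X >= x1) = target_fraction, i.e. cdf(x1) = 1 - target_fraction
cutoff = marks.inv_cdf(1 - target_fraction)
print(round(expected_passes), target_fraction, round(cutoff, 1))
```

Because more candidates must pass, the cut-off has to lie below the original pass mark of 50 and, with a pass fraction under one half, above the mean of 42.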
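ii) Derive the MGF, mean and variance of the Geometric distribution and also state and prove its
special property. (May/June 16)
Solution:
$P(X = x) = p q^{x-1}$, $x = 1, 2, 3, \dots$
Moment Generating Function
$M_X(t) = E(e^{tX}) = \sum_{x=1}^{\infty} e^{tx} p(x) = \sum_{x=1}^{\infty} e^{tx} q^{x-1} p = p\left[e^t + e^{2t} q + e^{3t} q^2 + \dots\right]$
$= pe^t\left[1 + qe^t + (qe^t)^2 + \dots\right] = pe^t\left[1 - qe^t\right]^{-1} = \frac{pe^t}{1 - qe^t}$
$\mu_1' = M_X'(0) = \left[\frac{d}{dt}\frac{pe^t}{1 - qe^t}\right]_{t=0} = \left[\frac{pe^t}{(1 - qe^t)^2}\right]_{t=0} = \frac{p}{(1 - q)^2} = \frac{p}{p^2} = \frac{1}{p}$
$\mu_2' = M_X''(0) = \left[\frac{d^2}{dt^2}\frac{pe^t}{1 - qe^t}\right]_{t=0} = \left[\frac{pe^t(1 + qe^t)}{(1 - qe^t)^3}\right]_{t=0} = \frac{p(1 + q)}{p^3} = \frac{1 + q}{p^2}$
Mean $= \mu_1' = \frac{1}{p}$
Variance $= \mu_2' - (\mu_1')^2 = \frac{1 + q}{p^2} - \frac{1}{p^2} = \frac{q}{p^2}$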
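Memoryless property of geometric distribution.
Statement:
If X is a random variable with geometric distribution, then X lacks memory, in the sense that
$P(X > s + t / X > s) = P(X > t)$ for all $s, t > 0$.
Proof:
The p.m.f. of the geometric random variable X is $P(X = x) = q^{x-1} p$, $x = 1, 2, 3, \dots$
$P(X > s + t / X > s) = \frac{P(X > s + t \text{ and } X > s)}{P(X > s)} = \frac{P(X > s + t)}{P(X > s)}$ ……(1)
$P(X > t) = \sum_{x=t+1}^{\infty} q^{x-1} p = q^t p + q^{t+1} p + q^{t+2} p + \dots = q^t p\left[1 + q + q^2 + q^3 + \dots\right] = q^t p (1 - q)^{-1} = q^t$
Hence $P(X > s + t) = q^{s+t}$ and $P(X > s) = q^s$, so from (1), $P(X > s + t / X > s) = \frac{q^{s+t}}{q^s} = q^t = P(X > t)$.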
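Mean $= \mu_1' = \left[\frac{d}{dt} M_X(t)\right]_{t=0} = \left[\frac{\lambda}{(\lambda - t)^2}\right]_{t=0} = \frac{1}{\lambda}$
$\mu_2' = \left[\frac{d^2}{dt^2} M_X(t)\right]_{t=0} = \left[\frac{2\lambda}{(\lambda - t)^3}\right]_{t=0} = \frac{2}{\lambda^2}$
Variance $= \mu_2' - (\mu_1')^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}$.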
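MEMORYLESS PROPERTY
Statement: If X is exponentially distributed with parameter $\lambda$, then for any two positive numbers
s and t, $P(X > s + t / X > s) = P(X > t)$, $s, t > 0$.
Proof:
The p.d.f. of X is $f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$
$P(X > k) = \int_k^{\infty} \lambda e^{-\lambda x}\,dx = \left[-e^{-\lambda x}\right]_k^{\infty} = e^{-\lambda k}$
$P(X > s + t / X > s) = \frac{P(X > s + t \text{ and } X > s)}{P(X > s)} = \frac{P(X > s + t)}{P(X > s)} = \frac{e^{-\lambda(s+t)}}{e^{-\lambda s}} = e^{-\lambda t} = P(X > t)$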
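6. i) Let X be a uniformly distributed R.V. over [-5, 5]. Determine (May/June 2016)
(1) P(X ≤ 2) (2) P(|X| > 2) (3) Cumulative distribution function of X
(4) Var(X).
Solution:
The R.V. $X \sim U[-5, 5]$.
The p.d.f. is $f(x) = \begin{cases} \frac{1}{10}, & -5 \le x \le 5 \\ 0, & \text{otherwise} \end{cases}$
(1) $P(X \le 2) = \int_{-5}^{2} f(x)\,dx = \int_{-5}^{2} \frac{1}{10}\,dx = \frac{1}{10}[x]_{-5}^{2} = \frac{1}{10}[2 + 5] = \frac{7}{10}$
(2) $P(|X| > 2) = 1 - P(|X| \le 2) = 1 - P(-2 \le X \le 2)$
$P(-2 \le X \le 2) = \int_{-2}^{2} f(x)\,dx = \int_{-2}^{2} \frac{1}{10}\,dx = \frac{1}{10}[x]_{-2}^{2} = \frac{1}{10}[2 + 2] = \frac{4}{10}$
$P(|X| > 2) = 1 - \frac{4}{10} = \frac{6}{10} = \frac{3}{5}$
(3) If $-5 \le x \le 5$:
$F(x) = \int_{-5}^{x} f(x)\,dx = \int_{-5}^{x} \frac{1}{10}\,dx = \frac{1}{10}[x]_{-5}^{x} = \frac{x + 5}{10}$
If $x > 5$:
$F(x) = \int_{-5}^{5} f(x)\,dx + \int_{5}^{x} f(x)\,dx = \int_{-5}^{5} \frac{1}{10}\,dx + 0 = \frac{1}{10}[x]_{-5}^{5} = \frac{1}{10}[5 + 5] = 1$
$F(x) = \begin{cases} 0, & x < -5 \\ \frac{x + 5}{10}, & -5 \le x \le 5 \\ 1, & x > 5 \end{cases}$
(4) $\mathrm{Var}(X) = \frac{(b - a)^2}{12} = \frac{(5 - (-5))^2}{12} = \frac{100}{12} = \frac{25}{3}$.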
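ii) Let $P(X = x) = \frac{3}{4}\left(\frac{1}{4}\right)^{x-1}$, $x = 1, 2, 3, \dots$ be the probability mass function of the R.V. X.
Compute (1) P(X > 4) (2) P(X > 4 / X > 2) (3) E(X) (4) Var(X). (May/June 2016)
Solution:
(1) $P(X > 4) = P(X = 5) + P(X = 6) + P(X = 7) + \dots = \sum_{x=5}^{\infty} \frac{3}{4}\left(\frac{1}{4}\right)^{x-1}$
$= \frac{3}{4}\left[\left(\frac{1}{4}\right)^4 + \left(\frac{1}{4}\right)^5 + \left(\frac{1}{4}\right)^6 + \dots\right] = \frac{3}{4}\left(\frac{1}{4}\right)^4\left[1 + \frac{1}{4} + \left(\frac{1}{4}\right)^2 + \dots\right]$
$= \frac{3}{4}\left(\frac{1}{4}\right)^4\left[1 - \frac{1}{4}\right]^{-1} = \frac{3}{4}\left(\frac{1}{4}\right)^4 \cdot \frac{4}{3} = \left(\frac{1}{4}\right)^4 = \frac{1}{256}$
(2) $P(X > 4 / X > 2) = P(X > 2)$ [by the memoryless property]
$= P(X = 3) + P(X = 4) + P(X = 5) + \dots = \sum_{x=3}^{\infty} \frac{3}{4}\left(\frac{1}{4}\right)^{x-1}$
$= \frac{3}{4}\left(\frac{1}{4}\right)^2\left[1 + \frac{1}{4} + \left(\frac{1}{4}\right)^2 + \dots\right] = \frac{3}{4}\left(\frac{1}{4}\right)^2\left[1 - \frac{1}{4}\right]^{-1} = \left(\frac{1}{4}\right)^2 = \frac{1}{16}$
(3) $E(X) = \frac{1}{p} = \frac{1}{3/4} = \frac{4}{3}$
(4) $\mathrm{Var}(X) = \frac{q}{p^2} = \frac{1/4}{(3/4)^2} = \frac{1}{4} \cdot \frac{16}{9} = \frac{4}{9}$.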
iii) There are two boxes B1 and B2. B1 contains two red balls and one green ball. B2 contains one red
ball and two green balls.
(1) A ball is drawn from one of the boxes randomly. It is found to be red. What is the
probability that it is from B1?
(2) Two balls are drawn randomly from one of the boxes without replacement. One is red and
the other is green. What is the probability that they came from B1?
(3) A ball is drawn from one of the boxes is green. What is the prob. that it came from B2?
(4) A ball is drawn from one of the boxes is white what is the prob. that it came from B2?
Solution:
Let $B_1$ and $B_2$ be the events that the boxes B1 and B2 respectively are selected. $P(B_1) = \frac{1}{2}$; $P(B_2) = \frac{1}{2}$
1) Let A be the event that a red ball is selected. $P(A/B_1) = \frac{2}{3}$; $P(A/B_2) = \frac{1}{3}$
P(ball is from B1, given it is red) $= P(B_1/A) = \frac{P(B_1)P(A/B_1)}{P(B_1)P(A/B_1) + P(B_2)P(A/B_2)} = \frac{\frac{1}{2} \cdot \frac{2}{3}}{\frac{1}{2} \cdot \frac{2}{3} + \frac{1}{2} \cdot \frac{1}{3}} = \frac{2}{3}$
2) Let C be the event that a red ball and a green ball are selected.
$P(C/B_1) = \frac{2C_1 \cdot 1C_1}{3C_2} = \frac{2}{3}$; $P(C/B_2) = \frac{1C_1 \cdot 2C_1}{3C_2} = \frac{2}{3}$
P(B1 was chosen, given a red ball and a green ball were selected) $= P(B_1/C) = \frac{P(B_1)P(C/B_1)}{P(B_1)P(C/B_1) + P(B_2)P(C/B_2)} = \frac{\frac{1}{2} \cdot \frac{2}{3}}{\frac{1}{2} \cdot \frac{2}{3} + \frac{1}{2} \cdot \frac{2}{3}} = \frac{1}{2}$
3) Let D be the event that a green ball is selected. $P(D/B_1) = \frac{1}{3}$; $P(D/B_2) = \frac{2}{3}$
P(ball is from B2, given it is green) $= P(B_2/D) = \frac{P(B_2)P(D/B_2)}{P(B_1)P(D/B_1) + P(B_2)P(D/B_2)} = \frac{\frac{1}{2} \cdot \frac{2}{3}}{\frac{1}{2} \cdot \frac{1}{3} + \frac{1}{2} \cdot \frac{2}{3}} = \frac{2}{3}$
4) Let E be the event that a white ball is selected.
Neither of the given boxes contains a white ball; hence the probability is 0.
iv) Four boxes A, B, C, D contain fuses. The boxes contain 5000,3000,2000 and 1000 fuses
respectively. The percentages of fuses in boxes which are defective are 3%, 2%, 1% and 0.5%
respectively. One fuse is selected at random arbitrarily from one of the boxes. It is found to be
defective fuse. Find the probability that it has come from box D. (APR / MAY’ 19)
Solution:
Let $E_1, E_2, E_3, E_4$ be the events that boxes A, B, C, D respectively are selected.
$P(E_1) = P(E_2) = P(E_3) = P(E_4) = \frac{1}{4}$
Let D be the event that a defective fuse is selected.
Given $P(D/E_1) = 0.03$, $P(D/E_2) = 0.02$, $P(D/E_3) = 0.01$, $P(D/E_4) = 0.005$.
By Bayes' theorem,
$P(E_4/D) = \frac{P(E_4)P(D/E_4)}{P(E_1)P(D/E_1) + P(E_2)P(D/E_2) + P(E_3)P(D/E_3) + P(E_4)P(D/E_4)}$
$= \frac{\frac{1}{4}(0.005)}{\frac{1}{4}(0.03) + \frac{1}{4}(0.02) + \frac{1}{4}(0.01) + \frac{1}{4}(0.005)} = \frac{0.005}{0.065} = 0.07692$
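Bayes' theorem with equally likely boxes reduces to normalising the products of prior and likelihood. A compact sketch; the function name `posterior` is illustrative:

```python
def posterior(priors, likelihoods):
    """Return P(E_i | D) for each hypothesis E_i via Bayes' theorem."""
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(E_i) * P(D | E_i)
    total = sum(joint)                                    # P(D), by total probability
    return [j / total for j in joint]


priors = [0.25, 0.25, 0.25, 0.25]            # boxes A, B, C, D chosen at random
defect_rates = [0.03, 0.02, 0.01, 0.005]     # P(defective | box)
post = posterior(priors, defect_rates)
print([round(p, 5) for p in post])           # last entry is P(box D | defective)
```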
UNIT – II TWO DIMENSIONAL RANDOM VARIABLES
PART – A
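1. Given the joint probability density function of X and Y as $f(x, y) = \begin{cases} \frac{1}{6}, & 0 < x < 2,\ 0 < y < 3 \\ 0, & \text{otherwise} \end{cases}$, determine
the marginal density functions.
The marginal density function of X is $f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^3 \frac{1}{6}\,dy = \frac{1}{6}[y]_0^3 = \frac{1}{2}$, $0 < x < 2$
The marginal density function of Y is $f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_0^2 \frac{1}{6}\,dx = \frac{1}{6}[x]_0^2 = \frac{1}{3}$, $0 < y < 3$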
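2. Find the value of k, if the joint density function of (X, Y) is given by
$f(x, y) = \begin{cases} k(1 - x)(1 - y), & 0 < x < 4,\ 1 < y < 5 \\ 0, & \text{otherwise} \end{cases}$
Given the joint pdf of (X, Y) is f(x, y) = k(1 – x)(1 – y), 0 < x < 4, 1 < y < 5
$\int\!\!\int f(x, y)\,dx\,dy = 1 \Rightarrow k\int_1^5\!\!\int_0^4 (1 - x)(1 - y)\,dx\,dy = 1$
$k\int_1^5 \left[x - \frac{x^2}{2} - yx + y\frac{x^2}{2}\right]_0^4 dy = 1$
$k\int_1^5 (-4 + 4y)\,dy = 1 \Rightarrow k\left[-4y + 4\frac{y^2}{2}\right]_1^5 = 1 \Rightarrow k(30 + 2) = 1 \Rightarrow 32k = 1 \Rightarrow k = \frac{1}{32}$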
3. The joint probability density function of the bivariate random variable
(X, Y) is given by $f(x, y) = \begin{cases} 4xy, & 0 < x < 1,\ 0 < y < 1 \\ 0, & \text{elsewhere} \end{cases}$. Find P(X + Y < 1).
$P(X + Y < 1) = \int_0^1\!\!\int_0^{1-x} 4xy\,dy\,dx = \int_0^1 4x\left[\frac{y^2}{2}\right]_0^{1-x} dx$
$= 2\int_0^1 x(1 - x)^2\,dx = 2\int_0^1 x(1 - 2x + x^2)\,dx$
$= 2\int_0^1 (x - 2x^2 + x^3)\,dx = 2\left[\frac{x^2}{2} - \frac{2x^3}{3} + \frac{x^4}{4}\right]_0^1 = 2\left[\frac{1}{2} - \frac{2}{3} + \frac{1}{4}\right] = \frac{1}{6}$
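4. If $f(x, y) = \begin{cases} 8xy, & 0 < x < 1,\ 0 < y < x \\ 0, & \text{elsewhere} \end{cases}$ is the joint probability density
function of X and Y, find f(y/x).
$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^x 8xy\,dy = 8x\left[\frac{y^2}{2}\right]_0^x = 4x^3$, $0 < x < 1$
$f(y/x) = \frac{f(x, y)}{f_X(x)} = \frac{8xy}{4x^3} = \frac{2y}{x^2}$, $0 < y < x$, $0 < x < 1$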
5. The regression equations of X on Y and Y on X are respectively 5x – y = 22 and 64x – 45y = 24. Find the
mean values of X and Y.
Regression lines pass through the mean values of X and Y. Solving the two equations we get the
mean values.
Let 5x – y = 22 --------------(1)
64x – 45y = 24 --------------(2)
Multiply equation (1) by 45 and subtract equation (2):
225x – 45y = 990
– 64x + 45y = – 24
__________________
161x = 966 ⇒ x = 6
Substituting in equation (1):
5(6) – y = 22 ⇒ y = 8. Hence the mean value of X = 6 and the mean value of Y = 8.
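6. The joint p.d.f. of the R.V. (X, Y) is given as $f(x, y) = \begin{cases} \frac{1}{x}, & 0 < y < x < 1 \\ 0, & \text{elsewhere} \end{cases}$. Find the marginal p.d.f. of Y.
The marginal pdf of Y is $f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_y^1 \frac{1}{x}\,dx = [\log x]_y^1 = \log 1 - \log y = -\log y$, $0 < y < 1$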
7. The following table gives the joint probability distribution of X and Y. Find the marginal distribution
functions of X and Y.
Y \ X    1     2     3
1        0.1   0.1   0.2
2        0.2   0.3   0.1
Adding the row and column totals:
Y \ X    1     2     3     p(y)
1        0.1   0.1   0.2   0.4
2        0.2   0.3   0.1   0.6
p(x)     0.3   0.4   0.3   1
The marginal distribution of X is
X       1     2     3
p(x)    0.3   0.4   0.3
The marginal distribution of Y is
Y       1     2
p(y)    0.4   0.6
8. Let X and Y be two independent R.Vs with Var(X) = 9 and Var(Y) = 3. Find Var(4X – 2Y + 6)
Var(4X – 2Y + 6) = 16 Var(X) + 4 Var(Y) = 16(9) + 4(3) = 156
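9. The joint pdf of a two dimensional random variable (X, Y) is given by $f(x, y) = kxe^{-y}$, $0 \le x \le 2$, $y > 0$.
Find the value of k.
Given that f(x, y) is the pdf of (X, Y),
f(x, y) ≥ 0 for all x, y, and
$\int\!\!\int f(x, y)\,dx\,dy = 1 \Rightarrow \int_0^{\infty}\!\!\int_0^2 kxe^{-y}\,dx\,dy = 1 \Rightarrow k\int_0^2 x\,dx \cdot \int_0^{\infty} e^{-y}\,dy = 1 \Rightarrow k\left[\frac{x^2}{2}\right]_0^2 \cdot \left[-e^{-y}\right]_0^{\infty} = 1$
$k(2)(1) = 1 \Rightarrow k = \frac{1}{2}$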
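10. If the joint cumulative distribution function of X and Y is given by $F(x, y) = (1 - e^{-x})(1 - e^{-y})$, $x > 0$, $y > 0$,
find P(1 < X < 2, 1 < Y < 2).
The joint pdf is $f(x, y) = \frac{\partial^2 F}{\partial x\,\partial y} = \frac{\partial^2}{\partial x\,\partial y}\left[(1 - e^{-x})(1 - e^{-y})\right] = \frac{\partial}{\partial x}\left[(1 - e^{-x})\, e^{-y}\right] = e^{-x} e^{-y} = e^{-(x+y)}$, $x > 0$, $y > 0$
$P(1 < X < 2, 1 < Y < 2) = \int_1^2\!\!\int_1^2 f(x, y)\,dx\,dy = \int_1^2\!\!\int_1^2 e^{-(x+y)}\,dx\,dy = \int_1^2 e^{-x}\,dx \cdot \int_1^2 e^{-y}\,dy$
$= \left(e^{-1} - e^{-2}\right)^2 = \left(\frac{e - 1}{e^2}\right)^2$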
14. Determine the value of the constant c if the joint probability mass function of two discrete random variables X
and Y is given by p(x, y) = cxy, x = 1, 2, 3 and y = 1, 2, 3.
Y \ X    1     2     3     p(y)
1        c     2c    3c    6c
2        2c    4c    6c    12c
3        3c    6c    9c    18c
p(x)     6c    12c   18c   36c
Since p(x, y) is the joint pmf of X and Y,
p(x, y) ≥ 0 for all x, y, and
$\sum_m \sum_n p(x, y) = 1 \Rightarrow 36c = 1 \Rightarrow c = \frac{1}{36}$
Let the auxillary random variable be V = Y. The transformation functions are u = x + y, v = y, y > 0 v >
0 and 0 < x < 2 0 < u – v < 2 v < u < v + 2
19. The joint probability mass function of the discrete random variable (X, Y) is given by the table
Y \ X    2       4
1        1/10    1.5/10
3        2/10    3/10
5        1/10    1.5/10
Find the conditional probability P(X = 2 / Y = 3).
Y \ X    2       4        PY(y)
1        1/10    1.5/10   2.5/10
3        2/10    3/10     5/10
5        1/10    1.5/10   2.5/10
PX(x)    4/10    6/10     1
$P(X = 2 / Y = 3) = \frac{P(X = 2, Y = 3)}{P_Y(3)} = \frac{2/10}{5/10} = \frac{2}{5}$
20. The two lines of regression are 4x – 5y + 33 = 0 and 20x – 9y = 107. Calculate the coefficient of
correlation between X and Y.
4x – 5y + 33 = 0 ----------- (1) 20x – 9y = 107 ----------- (2)
Let (1) be the regression line of Y on X and let (2) be the regression line of X on Y.
$y = \frac{4}{5}x + \frac{33}{5} \Rightarrow b_1 = \frac{4}{5}$
$x = \frac{9}{20}y + \frac{107}{20} \Rightarrow b_2 = \frac{9}{20}$; $r = \sqrt{b_1 b_2} = \sqrt{\frac{4}{5} \cdot \frac{9}{20}} = \sqrt{\frac{9}{25}} = \frac{3}{5} = 0.6 < 1$
PART – B
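1. (i) The joint pdf of the random variables X and Y is defined as $f(x, y) = \begin{cases} 25e^{-5y}, & 0 < x < 0.2,\ y > 0 \\ 0, & \text{elsewhere} \end{cases}$. (a)
Find the marginal PDFs of X and Y (b) cov(X, Y)
Solution:
The marginal PDF of X is
$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^{\infty} 25e^{-5y}\,dy = 25\left[\frac{e^{-5y}}{-5}\right]_0^{\infty} = 5$, $0 < x < 0.2$
The marginal PDF of Y is $f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_0^{0.2} 25e^{-5y}\,dx = 25e^{-5y}[x]_0^{0.2} = 5e^{-5y}$, $y > 0$
$E(X) = \int_0^{0.2} x f_X(x)\,dx = \int_0^{0.2} 5x\,dx = 5\left[\frac{x^2}{2}\right]_0^{0.2} = 5(0.02) = 0.1$
$E(Y) = \int_0^{\infty} y f_Y(y)\,dy = \int_0^{\infty} 5y e^{-5y}\,dy = 5\left[-\frac{y e^{-5y}}{5} - \frac{e^{-5y}}{25}\right]_0^{\infty} = 5 \cdot \frac{1}{25} = 0.2$
$E(XY) = \int_0^{\infty}\!\!\int_0^{0.2} xy\,(25e^{-5y})\,dx\,dy = 25\int_0^{\infty} y e^{-5y}\,dy \cdot \int_0^{0.2} x\,dx = 25 \cdot \frac{1}{25} \cdot 0.02 = 0.02$
Cov(X, Y) = E(XY) – E(X)E(Y) = 0.02 – (0.1)(0.2) = 0, as expected, since $f(x, y) = f_X(x) f_Y(y)$ and X and Y are therefore independent.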
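(ii). Find the constant k such that $f(x, y) = \begin{cases} k(x + 1)e^{-y}, & 0 < x < 1,\ y > 0 \\ 0, & \text{otherwise} \end{cases}$ is a joint p.d.f. of the
continuous random variable (X, Y). Are X and Y independent R.Vs? Explain.
Solution:
To find k: Given that f(x, y) is the pdf of (X, Y),
f(x, y) ≥ 0 for all x, y and $\int\!\!\int f(x, y)\,dx\,dy = 1$
$\int_0^{\infty}\!\!\int_0^1 k(x + 1)e^{-y}\,dx\,dy = 1$
$k\int_0^1 (x + 1)\,dx \cdot \int_0^{\infty} e^{-y}\,dy = 1$
$k\left[\frac{x^2}{2} + x\right]_0^1 \cdot \left[-e^{-y}\right]_0^{\infty} = 1$
$k \cdot \frac{3}{2} \cdot (1) = 1 \Rightarrow k = \frac{2}{3}$
$f(x, y) = \begin{cases} \frac{2}{3}(x + 1)e^{-y}, & 0 < x < 1,\ y > 0 \\ 0, & \text{otherwise} \end{cases}$
The marginal PDF of X is
$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^{\infty} \frac{2}{3}(x + 1)e^{-y}\,dy = \frac{2}{3}(x + 1)\left[-e^{-y}\right]_0^{\infty} = \frac{2}{3}(x + 1)(1) = \frac{2}{3}(x + 1)$, $0 < x < 1$
The marginal PDF of Y is
$f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_0^1 \frac{2}{3}(x + 1)e^{-y}\,dx = \frac{2}{3}e^{-y}\left[\frac{x^2}{2} + x\right]_0^1 = \frac{2}{3}e^{-y} \cdot \frac{3}{2} = e^{-y}$, $y > 0$
Consider $f_X(x) \cdot f_Y(y) = \frac{2}{3}(x + 1) \cdot e^{-y} = f(x, y)$
X and Y are independent.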
2. (i). Let the joint p.d.f. of the R.V. (X, Y) be given as $f(x, y) = \begin{cases} 4xy, & 0 < x < 1,\ 0 < y < 1 \\ 0, & \text{elsewhere} \end{cases}$. Find the marginal
densities of X and Y and the conditional density of X given Y = y. (April/May 2018)
Solution:
The marginal density function of X is
$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^1 4xy\,dy = 4x\left[\frac{y^2}{2}\right]_0^1 = 2x$, $0 < x < 1$
The marginal density function of Y is
$f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_0^1 4xy\,dx = 4y\left[\frac{x^2}{2}\right]_0^1 = 2y$, $0 < y < 1$
The conditional density function of X given Y = y is
$f(x/y) = \frac{f(x, y)}{f_Y(y)} = \frac{4xy}{2y} = 2x$, $0 < x < 1$
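(ii). The joint density function of two random variables X and Y is given by
$f(x, y) = \begin{cases} \frac{6}{7}\left(x^2 + \frac{xy}{2}\right), & 0 < x < 1,\ 0 < y < 2 \\ 0, & \text{elsewhere} \end{cases}$.
(a) Compute the marginal p.d.f. of X and Y (b) Find E(X) and E(Y) (c) $P\left(X < \frac{1}{2}, Y > \frac{1}{2}\right)$
Solution:
The marginal pdf of X is
$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^2 \frac{6}{7}\left(x^2 + \frac{xy}{2}\right)dy = \frac{6}{7}\left[x^2 y + \frac{x}{2} \cdot \frac{y^2}{2}\right]_0^2 = \frac{6}{7}\left(2x^2 + x\right)$, $0 < x < 1$
The marginal pdf of Y is
$f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_0^1 \frac{6}{7}\left(x^2 + \frac{xy}{2}\right)dx = \frac{6}{7}\left[\frac{x^3}{3} + \frac{x^2 y}{4}\right]_0^1 = \frac{6}{7}\left(\frac{1}{3} + \frac{y}{4}\right)$, $0 < y < 2$
$E(X) = \int_0^1 x f_X(x)\,dx = \frac{6}{7}\int_0^1 \left(2x^3 + x^2\right)dx = \frac{6}{7}\left[\frac{2x^4}{4} + \frac{x^3}{3}\right]_0^1 = \frac{6}{7}\left(\frac{1}{2} + \frac{1}{3}\right) = \frac{5}{7}$
$E(Y) = \int_0^2 y f_Y(y)\,dy = \frac{6}{7}\int_0^2 \left(\frac{y}{3} + \frac{y^2}{4}\right)dy = \frac{6}{7}\left[\frac{y^2}{6} + \frac{y^3}{12}\right]_0^2 = \frac{6}{7}\left(\frac{2}{3} + \frac{2}{3}\right) = \frac{8}{7}$
$P\left(X < \frac{1}{2}, Y > \frac{1}{2}\right) = \int_0^{1/2}\!\!\int_{1/2}^{2} f(x, y)\,dy\,dx = \frac{6}{7}\int_0^{1/2}\left[x^2 y + \frac{x y^2}{4}\right]_{1/2}^{2} dx$
$= \frac{6}{7}\int_0^{1/2}\left(\frac{3x^2}{2} + \frac{15x}{16}\right)dx = \frac{6}{7}\left[\frac{x^3}{2} + \frac{15x^2}{32}\right]_0^{1/2}$
$= \frac{6}{7}\left(\frac{1}{16} + \frac{15}{128}\right) = \frac{6}{7} \cdot \frac{23}{128} = \frac{69}{448}$.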
3. (i). If X, Y and Z are uncorrelated random variables with zero means and standard deviations 5, 12 and
9 respectively and if U = X + Y, V = Y + Z, find the correlation coefficient between U and V.
Solution:
Given E(X) = E(Y) = E(Z) = 0
Now, E(U) = E(X + Y) = E(X) + E(Y) = 0 and E(V) = E(Y + Z) = E(Y) + E(Z) = 0
$E(U^2) = E((X + Y)^2) = E(X^2 + Y^2 + 2XY) = E(X^2) + E(Y^2) + 2E(XY) = 25 + 144 + 0 = 169$
$E(V^2) = E((Y + Z)^2) = E(Y^2 + Z^2 + 2YZ) = E(Y^2) + E(Z^2) + 2E(YZ) = 144 + 81 + 0 = 225$
e v ( u 1) e v ( u 1) 1 1
fU ( u) f ( u, v )dv v e v ( u 1)
dv v . 1. 2
0 , u0
( u 1) ( u 1) 0 ( u 1) ( u 1) 2
2
0
(iii). The lifetime of a certain brand of an electric bulb may be considered as a random variable with
mean 1200 h and standard deviation 250 h. Find the probability, using the central limit theorem, that the
average lifetime of 60 bulbs exceeds 1250 hours. (NOV/DEC 2018)
Solution:
Let $X_i$ (i = 1, 2, ..., 60) denote the lifetimes of the bulbs.
Here $\mu = 1200$, $\sigma^2 = 250^2$
Let $\bar{X}$ denote the average lifetime of the 60 bulbs.
By the central limit theorem, $\bar{X}$ follows $N\left(\mu, \frac{\sigma^2}{n}\right)$. Let $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
$P(\bar{X} > 1250) = P\left(Z > \frac{1250 - 1200}{250/\sqrt{60}}\right) = P(Z > 1.55) = 0.0606$
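4. (i). Obtain the equation of the regression line of Y on X from the following data.
X   3   5   6   8   9   11
Y   2   3   4   6   5   8
Solution:
X      Y      U = X – 6   V = Y – 6   U^2   V^2   UV
3      2      –3          –4          9     16    12
5      3      –1          –3          1     9     3
6      4      0           –2          0     4     0
8      6      2           0           4     0     0
9      5      3           –1          9     1     –3
11     8      5           2           25    4     10
Total         6           –8          48    34    22
$\bar{U} = \frac{\sum U}{n} = \frac{6}{6} = 1$, $\bar{V} = \frac{\sum V}{n} = \frac{-8}{6} = -1.33$
$\sigma_U^2 = \frac{\sum U^2}{n} - \bar{U}^2 = \frac{48}{6} - 1 = 7$, $\sigma_U = 2.646$
$\sigma_V^2 = \frac{\sum V^2}{n} - \bar{V}^2 = \frac{34}{6} - (-1.33)^2 = 3.898$, $\sigma_V = 1.974$
$\mathrm{Cov}(U, V) = \frac{\sum UV}{n} - \bar{U}\,\bar{V} = \frac{22}{6} - (1)(-1.33) = 4.997$
$r_{UV} = \frac{\mathrm{Cov}(U, V)}{\sigma_U \sigma_V} = \frac{4.997}{(2.646)(1.974)} = 0.957$
$r_{XY} = r_{UV} = 0.957$
$\bar{X} = \bar{U} + 6 = 1 + 6 = 7$
$\bar{Y} = \bar{V} + 6 = -1.33 + 6 = 4.67$
$\sigma_X = \sigma_U = 2.646$
$\sigma_Y = \sigma_V = 1.974$
The regression line of Y on X is
$Y - \bar{Y} = \frac{r \sigma_Y}{\sigma_X}\left(X - \bar{X}\right)$
$Y - 4.67 = \frac{(0.957)(1.974)}{2.646}(X - 7)$
$Y = 0.714X - 0.328$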
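The regression line of Y on X can also be computed directly from the raw data, without the change of origin, using slope = Cov(X, Y) / Var(X) and intercept = mean(Y) − slope · mean(X). A minimal sketch with population (divide-by-n) moments; the variable names are illustrative:

```python
from math import sqrt

xs = [3, 5, 6, 8, 9, 11]
ys = [2, 3, 4, 6, 5, 8]
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Population moments (divide by n), matching the hand computation convention
cov_xy = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y
var_x = sum(x * x for x in xs) / n - mean_x**2
var_y = sum(y * y for y in ys) / n - mean_y**2

r = cov_xy / sqrt(var_x * var_y)        # correlation coefficient
slope = cov_xy / var_x                  # regression coefficient of Y on X
intercept = mean_y - slope * mean_x

print(f"r = {r:.3f}, Y = {slope:.3f} X + ({intercept:.3f})")
```

Small differences from a hand calculation in the third decimal place come from rounding the mean of V to -1.33 before squaring.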
(ii) X and Y are two random variables having the joint probability mass function
f(x, y) = k(3x + 5y); x = 1, 2, 3; y = 0, 1, 2. Find the marginal distributions and the conditional
distribution of X, P(X = xi / Y = 2), P(X ≤ 2 / Y ≤ 1). (April/May 2019)
Solution:
Y \ X    1      2      3      PY(y)
0        3k     6k     9k     18k
1        8k     11k    14k    33k
2        13k    16k    19k    48k
PX(x)    24k    33k    42k    99k
To find k:
We know that total probability = 1
$\sum\sum P(x, y) = 1 \Rightarrow 99k = 1$
k = 1/99
The marginal distribution of X is
X       1       2       3
p(x)    24/99   33/99   42/99
The marginal distribution of Y is
Y       0       1       2
p(y)    18/99   33/99   48/99
5. (i) In a partially destroyed laboratory record only the lines of regression and the variance of X are
available. The regression equations are 8x – 10y + 66 = 0 and 40x – 18y = 214 and the variance of X = 9. Find
(a) the correlation coefficient between X and Y (b) the mean values of X and Y (c) the variance of Y.
Solution:
Given 8x – 10y = –66 ……(1)
40x – 18y = 214 ……(2)
Let (1) be the regression line of y on x and (2) be the regression line of x on y.
10y = 8x + 66 ⇒ $y = \frac{8}{10}x + \frac{66}{10}$, so the regression coefficient of y on x is $b_1 = \frac{8}{10} = \frac{4}{5}$
40x = 18y + 214 ⇒ $x = \frac{18}{40}y + \frac{214}{40}$, so the regression coefficient of x on y is $b_2 = \frac{18}{40} = \frac{9}{20}$
$b_1 b_2 = \frac{4}{5} \cdot \frac{9}{20} = \frac{9}{25} < 1$
Let r be the correlation coefficient between x and y.
$r = \sqrt{b_1 b_2} = \sqrt{\frac{9}{25}} = \frac{3}{5} = 0.6$ [since both regression coefficients are positive, r is positive]
Let $(\bar{x}, \bar{y})$ be the point of intersection of the two regression lines.
Solving (1) and (2) we get $\bar{x}$, $\bar{y}$:
5 × (1): 40x – 50y = – 330
40x – 18y = 214
Subtracting: – 32y = – 544
y = 17
Now, 8x – 10y = – 66 ⇒ 8x – 10(17) = – 66 ⇒ 8x = 170 – 66 ⇒ 8x = 104 ⇒ x = 13
$(\bar{x}, \bar{y})$ = (13, 17) is the mean of X and Y.
We know $\frac{b_1}{b_2} = \frac{\sigma_Y^2}{\sigma_X^2} \Rightarrow \sigma_Y^2 = \frac{b_1}{b_2}\sigma_X^2 = \frac{4/5}{9/20}(9) = 16$. The variance of Y is 16.
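(ii) The joint probability density function of a two dimensional random variable (X, Y) is given by
$f(x, y) = xy^2 + \frac{x^2}{8}$, $0 \le x \le 2$, $0 \le y \le 1$. Compute (i) P(X > 1), (ii) P(Y < ½), (iii) P(X < Y) (iv) Are X and
Y independent?
Solution:
The marginal pdf of X is
$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^1 \left(xy^2 + \frac{x^2}{8}\right)dy = \left[\frac{xy^3}{3} + \frac{x^2 y}{8}\right]_0^1 = \frac{x}{3} + \frac{x^2}{8} = \frac{x}{24}(8 + 3x)$, $0 \le x \le 2$
The marginal pdf of Y is
$f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_0^2 \left(xy^2 + \frac{x^2}{8}\right)dx = \left[\frac{x^2 y^2}{2} + \frac{x^3}{24}\right]_0^2 = 2y^2 + \frac{1}{3}$, $0 \le y \le 1$
$P(X > 1) = \int_1^2 f_X(x)\,dx = \int_1^2 \left(\frac{x}{3} + \frac{x^2}{8}\right)dx = \left[\frac{x^2}{6} + \frac{x^3}{24}\right]_1^2 = \left(\frac{4}{6} + \frac{8}{24}\right) - \left(\frac{1}{6} + \frac{1}{24}\right) = \frac{3}{6} + \frac{7}{24} = \frac{19}{24}$
$P\left(Y < \frac{1}{2}\right) = \int_0^{1/2} f_Y(y)\,dy = \int_0^{1/2} \left(2y^2 + \frac{1}{3}\right)dy = \left[\frac{2y^3}{3} + \frac{y}{3}\right]_0^{1/2}$
$= \frac{2}{3} \cdot \frac{1}{8} + \frac{1}{3} \cdot \frac{1}{2} = \frac{3}{12} = \frac{1}{4}$
$P(X < Y) = \int_0^1\!\!\int_0^y \left(xy^2 + \frac{x^2}{8}\right)dx\,dy = \int_0^1 \left[\frac{x^2 y^2}{2} + \frac{x^3}{24}\right]_0^y dy$
$= \int_0^1 \left(\frac{y^4}{2} + \frac{y^3}{24}\right)dy = \left[\frac{y^5}{10} + \frac{y^4}{96}\right]_0^1 = \frac{1}{10} + \frac{1}{96} = \frac{53}{480}$
Consider $f_X(x) \cdot f_Y(y) = \frac{x}{24}(8 + 3x) \cdot \left(2y^2 + \frac{1}{3}\right) \ne f(x, y)$. X and Y are not independent.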
UNIT III - TESTING OF HYPOTHESIS
PART – A
1. Define Population, Sample and Sample Size.
The group of individuals under study is called population. The population may be finite or infinite.
A finite subset of statistical individuals in a population is called Sample. The number of individuals in a
sample is called Sample Size (n).
2. Define Parameter and Statistic. (APRIL / MAY ‘15)
The statistical constants of a population, namely the mean µ and variance $\sigma^2$, are usually referred to as
parameters. Statistical measures computed from sample observations alone, i.e. the mean $\bar{x}$ and variance $s^2$,
are usually referred to as statistics.
3. List out the applications of t –distribution. (NOV / DEC ‘13)
To test the significant difference between the means of two independent samples.
To test the significant difference between the means of two dependent samples or paired
observation.
To test the significance of the mean of a random sample.
To test the significance of an observed correlation coefficient.
4. Mention the Properties of t – distribution.
The variable of the t – distribution ranges from $-\infty$ to $\infty$.
The t – distribution is symmetrical and has a mean zero.
The variance of the t – distribution is greater than one, but approaches one as the number of degrees
of freedom and therefore the sample size become large.
5. What is Standard Error? (APRIL / MAY ‘11) (APRIL / MAY ‘17)
The standard deviation of the sampling distribution of a statistic is known as its Standard error.
6. State any two properties of the $\chi^2$ distribution.
(i)Chi – square curve is always positively skewed
(ii)Chi – square values increase with the increase in degrees of freedom
7. Explain the various uses of Chi-square test. (APRIL / MAY ‘14) (NOV / DEC ‘14)
Test of goodness of fit, test of independence of attributes, test of homogeneity, and test for a specified value of
the standard deviation.
8. What are the assumptions on which F-test is based?
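Test Statistic: $Z = \frac{p_1 - p_2}{\sqrt{PQ\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
The sample proportions:
$p_1 = \frac{800}{1000} = 0.80$, $p_2 = \frac{800}{1200} = 0.67$, $P = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = 0.7273$ and $Q = 1 - P = 0.2727$
$Z = \frac{p_1 - p_2}{\sqrt{PQ\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = 6.9905$
Table value: $Z_{\alpha} = 1.645$
Conclusion: The calculated value is greater than the table value, hence we reject the null hypothesis.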
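The pooled two-proportion test statistic can be reproduced from the counts 800 out of 1000 and 800 out of 1200 used in the computation. A short sketch; `two_proportion_z` is an illustrative helper name, and the result differs slightly in the third decimal from a hand calculation that rounds $p_2$ to 0.67:

```python
from math import sqrt


def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> float:
    """Z statistic for H0: p1 = p2 using the pooled proportion P."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)            # P = (n1 p1 + n2 p2) / (n1 + n2)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_error


z = two_proportion_z(800, 1000, 800, 1200)
critical = 1.645                               # one-tailed table value at the 5% level
print(f"Z = {z:.3f}, reject H0: {z > critical}")
```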
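2(i) A random sample of 10 boys had the following I.Q.'s: 70, 120, 110, 101, 88, 83, 95, 98, 107, 100. Do these data
support the assumption of a population mean I.Q. of 100? Find a reasonable range in which most of the
mean I.Q. values of samples of 10 boys lie. (APRIL / MAY ’15)
Solution:
Hypothesis:
$H_0: \mu = 100$
$H_1: \mu \ne 100$
Level of significance: $\alpha = 0.05$
Test statistic: $t = \frac{\bar{x} - \mu}{S/\sqrt{n - 1}}$, where $s^2 = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2$
Analysis:
X      70     120     110     101     88     83     95     98     107     100      $\sum X = 972$
X^2    4900   14400   12100   10201   7744   6889   9025   9604   11449   10000    $\sum X^2 = 96312$
$\bar{x} = \frac{972}{10} = 97.2$
$s^2 = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2 = \frac{96312}{10} - \left(\frac{972}{10}\right)^2 = 9631.2 - 9447.84 = 183.36 \Rightarrow S = 13.54$
$t = \frac{\bar{x} - \mu}{S/\sqrt{n - 1}} = \frac{97.2 - 100}{13.54/\sqrt{10 - 1}} = \frac{-2.8}{4.5133} = -0.6204$, so $|t| = 0.6204$
Table value: $t_{\alpha, n-1} = t_{5\%, 10-1} = t_{0.05, 9} = 2.262$
Conclusion: The table value is greater than the calculated value; hence we accept the null hypothesis and
conclude that the data are consistent with the assumption of mean I.Q. of 100 in the population.
Reasonable range in which most of the mean I.Q. values of samples of 10 boys lie:
$\left|\frac{\bar{x} - \mu}{S/\sqrt{n - 1}}\right| \le 2.262 \Rightarrow \left|\frac{\bar{x} - 100}{13.54/\sqrt{10 - 1}}\right| \le 2.262 \Rightarrow \left|\frac{\bar{x} - 100}{4.5133}\right| \le 2.262$
$-2.262 \le \frac{\bar{x} - 100}{4.5133} \le 2.262 \Rightarrow 100 - 10.21 \le \bar{x} \le 100 + 10.21 \Rightarrow 89.79 \le \bar{x} \le 110.21$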
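The one-sample t statistic for the I.Q. data can be recomputed from the raw observations. A minimal sketch following the convention used here (sample variance with divisor n and standard error $s/\sqrt{n-1}$); the critical value 2.262 is taken from the t-table and the variable names are illustrative:

```python
from math import sqrt

iq = [70, 120, 110, 101, 88, 83, 95, 98, 107, 100]
n = len(iq)
mu0 = 100                      # hypothesised population mean
t_critical = 2.262             # t-table value for 9 degrees of freedom at the 5% level

mean = sum(iq) / n
s2 = sum(x * x for x in iq) / n - mean**2      # variance with divisor n
std_error = sqrt(s2) / sqrt(n - 1)

t = (mean - mu0) / std_error
margin = t_critical * std_error                # half-width of the range around mu0
print(f"t = {t:.4f}, |t| < {t_critical}: {abs(t) < t_critical}")
print(f"range for the sample mean: {mu0 - margin:.2f} to {mu0 + margin:.2f}")
```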
(ii) The table below gives the number of aircraft accidents that occurred during the various days of the
week. Test whether the accidents are uniformly distributed over the week.
Days Mon Tue Wed Thurs Fri Sat
No. of accidents 14 18 12 11 15 14
Solution:
We want to test whether the accidents are uniformly distributed. So we apply 2 -test.
Null Hypothesis H0: The accidents are uniformly distributed over the 6 days. (Monday to Saturday)
Alternative Hypothesis H1: The accidents are not uniformly distributed.
Under H0, the expected frequency for each day = $\dfrac{84}{6} = 14$.
The test statistic is $\chi^2 = \sum \dfrac{(O - E)^2}{E}$.
O      E      O - E    (O - E)²    (O - E)²/E
14     14      0        0          0.000
18     14      4       16          1.143
12     14     -2        4          0.286
11     14     -3        9          0.643
15     14      1        1          0.071
14     14      0        0          0.000
84     84                          $\chi^2$ = 2.143
Number of degrees of freedom v = n - 1 = 6 - 1 = 5
For v = 5 degrees of freedom, the table value of $\chi^2$ at 5% level is $\chi^2_{0.05} = 11.07$.
Conclusion: Since the calculated value of $\chi^2$ (2.143) < the table value $\chi^2_{0.05}$ (11.07), H0 is accepted at 5% level of significance.
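The goodness-of-fit statistic for the accident data can be reproduced with a few lines of Python. This is only a sketch; the helper name `chi_square_gof` is illustrative and not part of any library.

```python
def chi_square_gof(observed, expected):
    """Pearson statistic: sum of (O - E)^2 / E over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

accidents = [14, 18, 12, 11, 15, 14]            # Mon .. Sat
# Uniform hypothesis: every day gets the same expected count, 84 / 6 = 14
expected = [sum(accidents) / len(accidents)] * len(accidents)
chi2 = chi_square_gof(accidents, expected)
dof = len(accidents) - 1
print(round(chi2, 3), dof)   # 2.143 5
```

The same helper can be reused for any goodness-of-fit problem once the expected frequencies have been worked out under H0.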
Sample 2 16 16 20 27 26 25 21
Can we conclude that the two samples are drawn from the same population? Test at 5% level of
significance. (APRIL / MAY ’19)
Solution:
Let $S_1^2, S_2^2$ be the sample variances and let $\sigma_1^2, \sigma_2^2$ be the variances of the two populations. We have to test the significance of the difference between the variances of the two samples, so we apply the F-test of equality of variances.
F test
Null Hypothesis H0: $\sigma_1^2 = \sigma_2^2$ (Variances are equal)
To find $S_1^2$ and $S_2^2$:
Sample 1              Sample 2
X       X²            Y       Y²
17      289           16      256
27      729           16      256
18      324           20      400
25      625           27      729
27      729           26      676
29      841           25      625
13      169           21      441
17      289
$\sum X = 173$   $\sum X^2 = 3995$   $\sum Y = 151$   $\sum Y^2 = 3383$
$n_1 = 8$, $\bar{X} = \dfrac{\sum X}{n_1} = \dfrac{173}{8} = 21.625$
$n_2 = 7$, $\bar{Y} = \dfrac{\sum Y}{n_2} = \dfrac{151}{7} = 21.57$
$S_1^2 = \dfrac{\sum (X - \bar{X})^2}{n_1 - 1} = \dfrac{\sum X^2 - (\sum X)^2/n_1}{n_1 - 1} = \dfrac{3995 - (173)^2/8}{7} = 36.27$
$S_2^2 = \dfrac{\sum (Y - \bar{Y})^2}{n_2 - 1} = \dfrac{\sum Y^2 - (\sum Y)^2/n_2}{n_2 - 1} = \dfrac{3383 - (151)^2/7}{6} = 20.95$
These are the unbiased estimates of the population variances (divisor n - 1).
$F = \dfrac{S_1^2}{S_2^2} = \dfrac{36.27}{20.95} = 1.73$
Number of degrees of freedom $(v_1, v_2) = (n_1 - 1, n_2 - 1) = (7, 6)$
For $(v_1, v_2) = (7, 6)$, the table value of F at 5% level is $F_{0.05} = 4.21$
Since the calculated value of F < the table value $F_{0.05}$, H0 is accepted at 5% level of significance.
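t test
Null Hypothesis H0: $\mu_1 = \mu_2$
Alternative Hypothesis H1: $\mu_1 \neq \mu_2$
The test statistic is $t = \dfrac{\bar{x} - \bar{y}}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$,
where $s^2 = \dfrac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{n_1 + n_2 - 2}$, with $\sum (X - \bar{X})^2 = 3995 - \dfrac{(173)^2}{8} = 253.875$ and $\sum (Y - \bar{Y})^2 = 3383 - \dfrac{(151)^2}{7} = 125.714$.
$s^2 = \dfrac{253.875 + 125.714}{8 + 7 - 2} = 29.20$, so $s = 5.40$
$t = \dfrac{21.625 - 21.571}{5.40\sqrt{\dfrac{1}{8} + \dfrac{1}{7}}} = \dfrac{0.054}{(5.40)(0.518)} = 0.019$
$v = n_1 + n_2 - 2 = 13$ degrees of freedom at 5% level of significance
$t_{0.05}(13) = 2.16$
Since $|t| < t_{0.05}(13)$, H0 is accepted.
Conclusion: Neither the variances nor the means differ significantly, so the two samples may be regarded as drawn from the same population.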
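As a cross-check, both statistics can be recomputed directly from the raw values in the table (Sample 1 is taken from the X column). The sketch below uses the unbiased variance estimates (divisor n − 1) for F and the pooled standard deviation for t; the helper name is illustrative.

```python
from math import sqrt

def sum_sq_dev(xs):
    """Sum of squared deviations about the sample mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)

sample1 = [17, 27, 18, 25, 27, 29, 13, 17]
sample2 = [16, 16, 20, 27, 26, 25, 21]
n1, n2 = len(sample1), len(sample2)

# Unbiased variance estimates (divisor n - 1)
var1 = sum_sq_dev(sample1) / (n1 - 1)
var2 = sum_sq_dev(sample2) / (n2 - 1)
F = max(var1, var2) / min(var1, var2)        # larger estimate in the numerator

# Pooled standard deviation and t statistic for the equality of means
pooled_sd = sqrt((sum_sq_dev(sample1) + sum_sq_dev(sample2)) / (n1 + n2 - 2))
t = (sum(sample1) / n1 - sum(sample2) / n2) / (pooled_sd * sqrt(1 / n1 + 1 / n2))
print(round(var1, 2), round(var2, 2), round(F, 2), round(t, 3))
```

Both statistics are far below their 5% table values (4.21 and 2.16), so the decision to accept H0 in each test does not hinge on rounding.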
3(i) The following data represent the biological values of protein from cow’s milk and buffalo’s milk:
Cow’s milk 1.82 2.02 1.88 1.61 1.81 1.54
Buffalo’s milk 2.00 1.83 1.86 2.03 2.19 1.88
Examine whether the average values of protein in the two samples significantly differ at 5% level.
Solution:
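$n = 6$
$\bar{x}_1 = \dfrac{1}{6}(10.68) = 1.78$, $s_1^2 = \dfrac{1}{6}(19.167) - (1.78)^2 = 0.0261$
$\bar{x}_2 = \dfrac{1}{6}(11.79) = 1.965$, $s_2^2 = \dfrac{1}{6}(23.26) - (1.965)^2 = 0.0154$
$H_0: \bar{x}_1 = \bar{x}_2$ and $H_1: \bar{x}_1 \neq \bar{x}_2$
As the two samples are independent and of the same size n, the test statistic is given by $t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2 + s_2^2}{n - 1}}}$
$t = \dfrac{1.78 - 1.965}{\sqrt{\dfrac{0.0261 + 0.0154}{5}}} = -2.03$ and $v = 2n - 2 = 10$
Two tailed test is to be used. LOS is 5%.
From the table, $t_{5\%}(v = 10) = 2.23$
Since $|t| = 2.03 < 2.23$, $H_0$ is accepted, i.e. the difference between the mean protein values of the two varieties of milk is not significant at 5% level.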
(ii) The following data are collected on two characters. Based on this can you say that there is no relation
between smoking and literacy. ( APRIL / MAY ’17 )
Smokers Non Smokers
Literates 83 57
Illiterates 45 68
Solution:
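Null Hypothesis H0: There is no relation between smoking and literacy.
Alternative Hypothesis H1: There is a relation between smoking and literacy.
Level of significance: 5% or 0.05
Degrees of freedom = (r-1)(s-1) = (2-1)(2-1) = 1
For a 2 × 2 table with cell frequencies a, b (first row) and c, d (second row),
$\chi^2 = \dfrac{(ad - bc)^2 (a + b + c + d)}{(a + b)(c + d)(a + c)(b + d)}$
$\chi^2 = \dfrac{[(83)(68) - (45)(57)]^2 (253)}{(83 + 45)(57 + 68)(83 + 57)(45 + 68)} = 9.48$
Table value of $\chi^2$ for 1 degree of freedom at 5% level = 3.841
Conclusion: Since $\chi^2$ = 9.48 > 3.841, we reject $H_0$ at 5% level of significance, i.e. smoking and literacy are related.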
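The shortcut formula for a 2 × 2 contingency table is easy to verify numerically. The sketch below applies it to the smoking and literacy data without any continuity correction; the function name is illustrative.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square for the 2x2 table [[a, b], [c, d]] (no continuity correction)."""
    n = a + b + c + d
    return (a * d - b * c) ** 2 * n / ((a + b) * (c + d) * (a + c) * (b + d))

# Rows: literates, illiterates; columns: smokers, non-smokers
chi2 = chi_square_2x2(83, 57, 45, 68)
print(round(chi2, 2))
```

The result exceeds the 5% table value 3.841 for 1 degree of freedom by a wide margin, so the rejection of H0 is not sensitive to rounding in the last digit.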
4(i) A manufacturer of electric bulbs, by some process, find the standard deviation of the lamps to be
100hrs.He wants to change the process if the new process results in even smaller variation in the life of
lamps. In adopting the new process a sample of 150 bulbs gave the standard deviation of 95 hrs. Is the
manufacturer justified in changing the process.
Solution:
Given $\alpha = 0.05$, $\sigma = 100$, $S = 95$, $n = 150$
The parameter of interest is $\sigma$.
Null Hypothesis H0: $\sigma = S$
Alternative Hypothesis H1: $\sigma \neq S$
Level of significance: 5% or 0.05
$z = \dfrac{s - \sigma}{\sigma/\sqrt{2n}} = \dfrac{95 - 100}{100/\sqrt{300}} = -0.866$
i.e., $|z| = 0.866 < 1.96$
Conclusion: The calculated value is less than the table value; hence we accept the null hypothesis. So the manufacturer finds no justification in changing the process on this evidence alone.
(ii) The theory predicts that the proportion of beans in the four groups A,B,C,D should be 9:3:3:1. In an
experiment among 1600 beans, the numbers in the four groups were 882, 313, 287, and 118. Does the
experimental result support the theory ? (MAY/JUNE ’14) (APR/MAY ’15)
Solution:
Ho : The experimental data support the theory
Based on Ho, the expected numbers of beans in the four groups are as follows
Observed frequency (O)    Expected frequency (E)    (O - E)²    (O - E)²/E
882                       900                       324         0.360
313                       300                       169         0.563
287                       300                       169         0.563
118                       100                       324         3.240
Total                                                           4.726
$\chi^2 = \sum \dfrac{(O - E)^2}{E} = 4.726$
Calculated value of $\chi^2$ = 4.726
Tabulated value of $\chi^2$ for 4 - 1 = 3 degrees of freedom is 7.81 at 5% level of significance. Since the calculated value < the tabulated value, we accept the null hypothesis, i.e. the experimental data support the theory.
(iii) Two independent samples of eight and seven items respectively had the following values of the
variable: (MAY / JUNE ’16 )
Sample 1 9 11 13 11 15 9 12 14
Sample 2 10 12 10 14 9 8 10
Do the two estimates of population variance differ significantly at 5% level of significance?
Let $S_1^2, S_2^2$ be the sample variances and let $\sigma_1^2, \sigma_2^2$ be the variances of the two populations. We have to test the significance of the difference between the variances of the two samples, so we apply the F-test of equality of variances.
Null Hypothesis H0: $\sigma_1^2 = \sigma_2^2$ (Variances are equal)
To find $S_1^2$ and $S_2^2$:
Sample I              Sample II
X1      X1²           X2      X2²
9       81            10      100
11      121           12      144
13      169           10      100
11      121           14      196
15      225           9       81
9       81            8       64
12      144           10      100
14      196
94      1138          73      785
$n_1 = 8$, $n_2 = 7$, $\sum x_1 = 94$, $\sum x_1^2 = 1138$, $\sum x_2 = 73$, $\sum x_2^2 = 785$
$\bar{x}_1 = \dfrac{\sum x_1}{n_1} = \dfrac{94}{8} = 11.75$, $\bar{x}_2 = \dfrac{\sum x_2}{n_2} = \dfrac{73}{7} = 10.43$
$s_1^2 = \dfrac{\sum x_1^2}{n_1} - \bar{x}_1^2 = \dfrac{1138}{8} - (11.75)^2 = 4.19$
$s_2^2 = \dfrac{\sum x_2^2}{n_2} - \bar{x}_2^2 = \dfrac{785}{7} - \left(\dfrac{73}{7}\right)^2 = 3.39$
$S_1^2 = \dfrac{n_1 s_1^2}{n_1 - 1} = \dfrac{8 \times 4.19}{7} = 4.79$, $S_2^2 = \dfrac{n_2 s_2^2}{n_2 - 1} = \dfrac{7 \times 3.39}{6} = 3.96$
Since $S_1^2 > S_2^2$, the test statistic is $F = \dfrac{S_1^2}{S_2^2} = \dfrac{4.79}{3.96} = 1.21$
Number of degrees of freedom $(v_1, v_2) = (n_1 - 1, n_2 - 1) = (7, 6)$
For $(v_1, v_2) = (7, 6)$, the table value of F at 5% level is $F_{0.05} = 4.21$
Conclusion: Since the calculated value of F < the table value $F_{0.05}$, H0 is accepted at 5% level of significance. The two samples are drawn from populations with the same variance.
5(i) Test significance of the difference between the means of the samples, drawn from two normal
populations with the same SD using the following data: (APRIL / MAY ’15) (NOV / DEC ’13)
Size Mean Standard Deviation
Sample I 100 61 4
Sample II 200 63 6
Solution:
$H_0: \bar{x}_1 = \bar{x}_2$ or $\mu_1 = \mu_2$
$H_1: \bar{x}_1 \neq \bar{x}_2$ or $\mu_1 \neq \mu_2$
Two tailed test is to be used.
The test statistic is $z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_2} + \dfrac{s_2^2}{n_1}}} = \dfrac{61 - 63}{\sqrt{\dfrac{4^2}{200} + \dfrac{6^2}{100}}} = -3.02$
Tabulated value is 1.96 at 5% level of significance.
Since $|z| = 3.02 > z_\alpha = 1.96$, the difference between $\bar{x}_1$ and $\bar{x}_2$ is significant at 5% level of significance, i.e. H0 is rejected and H1 is accepted. Therefore, the two normal populations from which the samples are drawn may not have the same mean, though they may have the same S.D.
(ii) A sample of size 13 gave an estimated population variance of 3.0, while another sample of size 15 gave an estimate of 2.5. Could both samples be from populations with the same variance? (APRIL/MAY ’17 )
Solution:
Null Hypothesis H0: $\sigma_1^2 = \sigma_2^2$ (Variances are equal)
(iv) Fit a Poisson’s distribution to the following data and the goodness of fit. Test at 5% level of significance.
(APRIL / MAY ’19)
x 0 1 2 3 4 5
f 142 156 69 27 5 1
Solution:
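Null Hypothesis H0: The Poisson distribution fits the given data.
Alternative Hypothesis H1: The Poisson distribution does not fit the given data.
Mean $\bar{X} = \dfrac{\sum f_i x_i}{\sum f_i} = 1$, so $\lambda = \bar{X} = 1$
By the Poisson distribution, $P(X = x) = \dfrac{e^{-\lambda} \lambda^x}{x!}$
Now, expected frequency $= \sum f \cdot \dfrac{e^{-\lambda} \lambda^x}{x!}$, where $\sum f = 400$
x       0      1      2      3      4      5
O_i     142    156    69     27     5      1      (last two classes pooled: 5 + 1 = 6)
E_i     147    147    74     25     6      1      (last two classes pooled: 6 + 1 = 7)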
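The fitting step can be sketched in Python as follows: the sample mean is used as the estimate of the Poisson parameter, and each expected frequency is the total frequency multiplied by the Poisson probability. Variable names are illustrative.

```python
from math import exp, factorial

x_values = [0, 1, 2, 3, 4, 5]
observed = [142, 156, 69, 27, 5, 1]

total = sum(observed)                                            # N = 400
lam = sum(x * f for x, f in zip(x_values, observed)) / total     # sample mean = lambda

# Expected frequency for each x: N * e^(-lambda) * lambda^x / x!
expected = [total * exp(-lam) * lam ** x / factorial(x) for x in x_values]
print(lam, [round(e) for e in expected])   # 1.0 [147, 147, 74, 25, 6, 1]
```

Classes with small expected frequencies (here x = 4 and x = 5) are pooled before the chi-square statistic is computed, which is why the table combines them.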
16. Write down the ANOVA table for Latin Square Design.
Source of Variation     Sum of Squares    Degrees of freedom    Mean Square                  F-Ratio
Between Columns         SSC               n - 1                 MSC = SSC/(n - 1)            FC = MSC/MSE
Between Rows            SSR               n - 1                 MSR = SSR/(n - 1)            FR = MSR/MSE
Between Treatments      SSK               n - 1                 MSK = SSK/(n - 1)            FK = MSK/MSE
Error (or) Residual     SSE               (n - 1)(n - 2)        MSE = SSE/[(n - 1)(n - 2)]
PART B
1. Analyse the variance in the following latin square of yields (in kgs) of paddy where A, B, C, D denote the
different methods of cultivation.
D 122 A 121 C 123 B 122
B 124 C 123 A 122 D 125
A 120 B 119 D 120 C 121
C 122 D 123 B 121 A 122
Examine whether the different methods of cultivation have given significantly different yields.
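Taking 120 as the origin, the grand total is T = 30 with N = 16 observations, so C.F. = $\dfrac{(30)^2}{16} = 56.25$.
$TSS = \sum_i \sum_j X_{ij}^2 - C.F. = 92 - \dfrac{(30)^2}{16} = 35.75$
$SSR = \dfrac{\sum T_{i*}^2}{n} - C.F. = 81 - \dfrac{(30)^2}{16} = 24.75$
$SSC = \dfrac{\sum T_{*j}^2}{n} - C.F. = 59 - \dfrac{(30)^2}{16} = 2.75$
$SSL = \dfrac{\sum T_k^2}{n} - C.F. = 60.5 - \dfrac{(30)^2}{16} = 4.25$, where $T_k$ is the total for letter k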
SSE = TSS – SSC – SSR-SSL = 35.75 – 24.75 – 2.75 – 4.25 = 4
ANOVA Table
Source of Variation    Sum of Squares    Degrees of freedom      Mean Square    F-Ratio       Table F (5% level)
Between Rows           SSR = 24.75       n - 1 = 3               MSR = 8.25     FR = 12.31    FR(3, 6) = 4.76
Between Columns        SSC = 2.75        n - 1 = 3               MSC = 0.92     FC = 1.37     FC(3, 6) = 4.76
Between Letters        SSL = 4.25        n - 1 = 3               MSL = 1.42     FL = 2.12     FL(3, 6) = 4.76
Residual               SSE = 4           (n - 1)(n - 2) = 6      MSE = 0.67
Total                  35.75
Conclusion :
Cal FC< Tab FC , Cal FL< Tab FL and Cal FR> Tab FR
There is significant difference between the rows, no significant difference between the letters and no
significant difference between the columns.
There is no significant difference between the different methods of cultivation.
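The sums of squares for this Latin square can be recomputed from the layout with the short Python sketch below. It assumes a change of origin to 120 (which leaves every sum of squares unchanged); the data structure and names are illustrative only.

```python
# Layout of the 4 x 4 Latin square: (letter, yield in kg) for each cell
square = [
    [("D", 122), ("A", 121), ("C", 123), ("B", 122)],
    [("B", 124), ("C", 123), ("A", 122), ("D", 125)],
    [("A", 120), ("B", 119), ("D", 120), ("C", 121)],
    [("C", 122), ("D", 123), ("B", 121), ("A", 122)],
]
n = len(square)
origin = 120   # assumed change of origin; sums of squares are unaffected by it

coded = [[y - origin for _, y in row] for row in square]
grand = sum(map(sum, coded))
cf = grand ** 2 / (n * n)                       # correction factor T^2 / N

tss = sum(v * v for row in coded for v in row) - cf
ssr = sum(sum(row) ** 2 for row in coded) / n - cf          # between rows
ssc = sum(sum(col) ** 2 for col in zip(*coded)) / n - cf    # between columns

letter_totals = {}
for row in square:
    for letter, y in row:
        letter_totals[letter] = letter_totals.get(letter, 0) + (y - origin)
ssl = sum(t * t for t in letter_totals.values()) / n - cf   # between letters

sse = tss - ssr - ssc - ssl
mse = sse / ((n - 1) * (n - 2))
f_rows, f_cols, f_letters = (ss / (n - 1) / mse for ss in (ssr, ssc, ssl))
print(tss, ssr, ssc, ssl, sse)   # 35.75 24.75 2.75 4.25 4.0
```

The F-ratios computed without intermediate rounding differ slightly in the second decimal from those obtained with MSE rounded to 0.67, but the comparison with the table value 4.76 is unaffected.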
2. Analyze the 2² factorial experiment for the following table.
Treatment    Replications
             I       II      III     IV
(1)          12      12.3    11.8    11.6
a            12.8    12.6    13.7    14
b            11.5    11.9    12.6    11.8
ab           14.2    14.5    14.4    15
SOLUTION:
Null hypothesis: All the mean effects are equal.
Let A and B be the two factors. Let n= number of replications=4
Subtract 12 from each observation:
Treatment    Replications
             I       II      III     IV
(1)          0       0.3     -0.2    -0.4
a            0.8     0.6     1.7     2
b            -0.5    -0.1    0.6     -0.2
ab           2.2     2.5     2.4     3
Let us find the SS for the table:
Treatment            I       II      III     IV      Row Total R_i    R_i²
(1)                  0       0.3     -0.2    -0.4    -0.3             0.09
a                    0.8     0.6     1.7     2       5.1              26.01
b                    -0.5    -0.1    0.6     -0.2    -0.2             0.04
ab                   2.2     2.5     2.4     3       10.1             102.01
Column Total C_j     2.5     3.3     4.5     4.4     T = 14.7
C_j²                 6.25    10.89   20.25   19.36
T = 14.7; Correction factor = $\dfrac{T^2}{N}$ = 13.5
TSS = 21.19, SSC = 0.688, SSR = 18.54, SSE = 1.96
Source of Variation    Sum of Squares    Degrees of freedom    Mean Square    F-Ratio         Table F
b                      S_B = 1.63        1                     MSB = 1.63     F_B = 7.409     10.56
3. Four varieties A,B,C,D of a fertilizer are tested in a randomized block design with 4 replication. The plot
yields in pounds are as follows:
Column / Row 1 2 3 4
1 A(12) D(20) C(16) B(10)
2 D(18) A(14) B(11) C(14)
3 B(12) C(15) D(19) A(13)
4 C(16) B(11) A(15) D(20)
Analyse the experimental yield. ( MAY / JUNE ’16 )
Solution:
H0: There is no significant difference between the fertilizers and replication
H1 : Significant difference between the fertilizers and replication
Variety    Block 1 (X1)    Block 2 (X2)    Block 3 (X3)    Block 4 (X4)    Total (varieties)    X1²    X2²    X3²    X4²
A          12              14              15              13              54                   144    196    225    169
B          12              11              11              10              44                   144    121    121    100
C          16              15              16              14              61                   256    225    256    196
D          18              20              19              20              77                   324    400    361    400
Total      58              60              61              57              236                  868    942    963    865
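N = 16; T = Grand Total = 236
Correction Factor = $\dfrac{(\text{Grand total})^2}{\text{Total No. of Observations}} = \dfrac{(236)^2}{16} = 3481$
$TSS = \sum_i \sum_j X_{ij}^2 - C.F. = 868 + 942 + 963 + 865 - 3481 = 157$
SSC (between blocks) $= \dfrac{\sum T_{*j}^2}{h} - C.F. = \dfrac{58^2 + 60^2 + 61^2 + 57^2}{4} - 3481 = 2.5$
SSR (between varieties) $= \dfrac{\sum T_{i*}^2}{k} - C.F. = \dfrac{54^2 + 44^2 + 61^2 + 77^2}{4} - 3481 = 144.5$
SSE = TSS - SSC - SSR = 157 - 2.5 - 144.5 = 10
Source of Variation    Sum of Squares    Degrees of freedom       Mean Square     F-Ratio        Table F (5% level)
Between varieties      SSR = 144.5       h - 1 = 3                MSR = 48.17     FR = 43.35     F5%(3, 9) = 3.86
Between blocks         SSC = 2.5         k - 1 = 3                MSC = 0.83      FC = 0.75      F5%(3, 9) = 3.86
Residual               SSE = 10          (h - 1)(k - 1) = 9       MSE = 1.11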
Conclusion: Cal FC < Tab FC and Cal FR > Tab FR. Therefore the null hypothesis is rejected for the varieties: the four varieties differ significantly in yield, while there is no significant difference between the blocks.
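The two-way classification used for a randomized block design can be checked by recomputing every sum of squares from the plot yields. The following Python sketch does this for the variety by block table; the dictionary layout and names are illustrative.

```python
# Plot yields: rows are varieties A-D, columns are blocks 1-4
yields = {
    "A": [12, 14, 15, 13],
    "B": [12, 11, 11, 10],
    "C": [16, 15, 16, 14],
    "D": [18, 20, 19, 20],
}
rows = list(yields.values())
h, k = len(rows), len(rows[0])            # h varieties, k blocks
grand = sum(map(sum, rows))
cf = grand ** 2 / (h * k)                 # correction factor

tss = sum(v * v for r in rows for v in r) - cf
ss_var = sum(sum(r) ** 2 for r in rows) / k - cf           # between varieties
ss_blk = sum(sum(c) ** 2 for c in zip(*rows)) / h - cf     # between blocks
sse = tss - ss_var - ss_blk

mse = sse / ((h - 1) * (k - 1))
f_var = ss_var / (h - 1) / mse
f_blk = ss_blk / (k - 1) / mse
print(tss, ss_var, ss_blk, sse, round(f_var, 2), round(f_blk, 2))
```

Each F-ratio is then compared with the 5% table value of F for (3, 9) degrees of freedom.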
4. The following is a Latin square of a design when 4 varieties of seeds are being tested. Set up the analysis of variance table and state your conclusion. You may carry out suitable change of origin and scale. (APRIL / MAY ‘17)
A 105 B 95 C 125 D 115
C 115 D 125 A 105 B 105
D 115 C 95 B 105 A 115
B 95 A 135 D 95 C 115
Solution:
H0 : Four varieties are similar
H1 : Four varieties are not similar
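Let us take 100 as origin and divide by 5 to simplify the calculation.
Row      X1    X2    X3    X4    Total    X1²    X2²    X3²    X4²
Y1       1     -1    5     3     8        1      1      25     9
Y2       3     5     1     1     10       9      25     1      1
Y3       3     -1    1     3     6        9      1      1      9
Y4       -1    7     -1    3     8        1      49     1      9
Total    6     10    6     10    32       20     76     28     28
N = Total No. of Observations = 16; T = Grand Total = 32
Correction Factor = $\dfrac{(\text{Grand total})^2}{\text{Total No. of Observations}} = \dfrac{(32)^2}{16} = 64$
$TSS = \sum X_1^2 + \sum X_2^2 + \sum X_3^2 + \sum X_4^2 - C.F. = 20 + 76 + 28 + 28 - 64 = 88$
$SSC = \dfrac{(\sum X_1)^2}{N_1} + \dfrac{(\sum X_2)^2}{N_1} + \dfrac{(\sum X_3)^2}{N_1} + \dfrac{(\sum X_4)^2}{N_1} - C.F. = \dfrac{(6)^2}{4} + \dfrac{(10)^2}{4} + \dfrac{(6)^2}{4} + \dfrac{(10)^2}{4} - 64 = 4$
$SSR = \dfrac{(\sum Y_1)^2}{N_2} + \dfrac{(\sum Y_2)^2}{N_2} + \dfrac{(\sum Y_3)^2}{N_2} + \dfrac{(\sum Y_4)^2}{N_2} - C.F. = \dfrac{(8)^2}{4} + \dfrac{(10)^2}{4} + \dfrac{(6)^2}{4} + \dfrac{(8)^2}{4} - 64 = 2$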
To find SSK:
Treatment    1     2     3     4     Total
A            1     1     3     7     12
B            -1    1     1     -1    0
C            5     3     -1    3     10
D            3     5     3     -1    10
$SSK = \dfrac{(12)^2}{4} + \dfrac{(0)^2}{4} + \dfrac{(10)^2}{4} + \dfrac{(10)^2}{4} - 64 = 22$
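ANOVA Table
Source of Variation     Sum of Squares    Degrees of freedom      Mean Square                           F-Ratio
Between Columns         SSC = 4           n - 1 = 3               MSC = SSC/(n - 1) = 1.33              FC = MSE/MSC = 7.52
Between Rows            SSR = 2           n - 1 = 3               MSR = SSR/(n - 1) = 0.67              FR = MSE/MSR = 14.9
Between Treatments      SSK = 22          n - 1 = 3               MSK = SSK/(n - 1) = 7.33              FK = MSE/MSK = 1.36
Error (or) Residual     SSE = 60          (n - 1)(n - 2) = 6      MSE = SSE/[(n - 1)(n - 2)] = 10
Since MSE exceeds each of the other mean squares, each F-ratio is taken as MSE divided by that mean square, with (6, 3) degrees of freedom.
Table value of F for (6, 3) degrees of freedom at 5% level = 8.94
Since FK = 1.36 < 8.94, there is no significant difference between the treatments (varieties).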
5. As part of the investigation of the collapse of the roof of a building, a testing laboratory is given all the
available bolts that connected the steel structure at 3 different positions on the roof. The forces
required to shear each of these bolts ( coded values) are as follows: (APR / MAY ’19)
Position 1 : 90 82 79 98 83 91
Position 2 : 105 89 93 104 89 95 86
Position 3 : 83 89 80 94
Perform an analysis of variance to test at the 0.05 level of significance whether the differences
among the sample means at the 3 positions are significant.
Solution:
H0: There is no significant difference between the sample means at the three positions.
H1 : Significant difference between the sample means at the three positions.
We shift the origin to 89.
X1       X2     X3     Total    X1²    X2²    X3²
1        16     -6     11       1      256    36
-7       0      0      -7       49     0      0
-10      4      -9     -15      100    16     81
9        15     5      29       81     225    25
-6       0      -      -6       36     0      -
2        6      -      8        4      36     -
-        -3     -      -3       -      9      -
Total:
-11      38     -10    17       271    542    142
N = Total No. of Observations = 17; T = Grand Total = 17
Correction Factor = $\dfrac{(\text{Grand total})^2}{\text{Total No. of Observations}} = \dfrac{(17)^2}{17} = 17$
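The remaining steps of a one-way classification (between-positions and within-positions sums of squares, then the F-ratio) can be sketched in Python as below. The shift of origin to 89 is inferred from the coded table, and all names are illustrative; this is one way the computation might be completed, not a reproduction of the original working.

```python
# Shear forces (coded values) at the three roof positions
groups = [
    [90, 82, 79, 98, 83, 91],
    [105, 89, 93, 104, 89, 95, 86],
    [83, 89, 80, 94],
]
origin = 89    # assumed shift of origin; it does not change any sum of squares
coded = [[x - origin for x in g] for g in groups]

n_total = sum(len(g) for g in coded)
grand = sum(map(sum, coded))
cf = grand ** 2 / n_total                               # correction factor

tss = sum(x * x for g in coded for x in g) - cf
ss_between = sum(sum(g) ** 2 / len(g) for g in coded) - cf
ss_within = tss - ss_between

df_between, df_within = len(coded) - 1, n_total - len(coded)
f_ratio = (ss_between / df_between) / (ss_within / df_within)
print(cf, tss, round(ss_between, 2), round(ss_within, 2), round(f_ratio, 2))
```

The resulting F-ratio would then be compared with the 5% table value of F for (2, 14) degrees of freedom.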
PART B
1. The following data give the average life in hours and range in hours of 12 samples each of 5 lamps. Construct the $\bar{X}$-chart and R-chart, and comment on the state of control. (APR / MAY ’19)
Sample No. 1 2 3 4 5 6 7 8 9 10 11 12
Mean X i 120 127 152 157 160 134 137 123 140 144 120 127
Range Ri 30 44 60 34 38 35 45 62 39 50 35 41
Solution:
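$\bar{\bar{X}} = \dfrac{1}{N}\sum \bar{X}_i = \dfrac{1}{12}(120 + 127 + 152 + 157 + 160 + 134 + 137 + 123 + 140 + 144 + 120 + 127) = 136.75$
$\bar{R} = \dfrac{1}{N}\sum R_i = \dfrac{1}{12}(30 + 44 + 60 + 34 + 38 + 35 + 45 + 62 + 39 + 50 + 35 + 41) = 42.75$
From the table of control chart constants for sample size n = 5, we have $A_2 = 0.577$, $D_3 = 0$ and $D_4 = 2.115$.
i) Control limits for $\bar{X}$ chart:
CL (central line) $= \bar{\bar{X}} = 136.75$; $LCL = \bar{\bar{X}} - A_2\bar{R} = 136.75 - (0.577)(42.75) = 112.1$
$UCL = \bar{\bar{X}} + A_2\bar{R} = 136.75 + (0.577)(42.75) = 161.4$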
Conclusion:
Since all the sample points lie within the LCL and UCL lines, the process is under control according to X chart
ii) Control limits for R-Chart:
$CL = \bar{R} = 42.75$; $LCL = D_3\bar{R} = 0$; $UCL = D_4\bar{R} = (2.115)(42.75) = 90.41$
Conclusion:
Since all the sample ranges fall within the control limits, the process is under statistical control according to the R chart.
2. The Values of sample mean X and sample standard deviation S for 15 samples, each of size 4, drawn
from a production process are given below. Draw the appropriate control charts for the process
average and process variability. Comment on the state of control.
Sample No. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Mean 15 10 12.5 13 12.5 13 13.5 11.5 13.5 13 14.5 9.5 12 10.5 11.5
S.D 3.1 2.4 3.6 2.3 5.2 5.4 6.2 4.3 3.4 4.1 3.9 5.1 4.7 3.3 3.3
Solution:
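$\bar{\bar{X}} = \dfrac{\sum \bar{X}_i}{N} = \dfrac{185.5}{15} = 12.36$; $\bar{s} = \dfrac{\sum s_i}{N} = \dfrac{60.3}{15} = 4.02$
i) Control limits for $\bar{X}$ chart:
From the table of control chart constants, for sample size n = 4, we have $A_1 = 1.880$, $B_3 = 0$ and $B_4 = 2.266$.
$CL = \bar{\bar{X}} = 12.36$; $LCL = \bar{\bar{X}} - A_1\sqrt{\dfrac{n}{n-1}}\,\bar{s} = 12.36 - (1.880)\sqrt{\dfrac{4}{3}}\,(4.02) = 3.63$
$UCL = \bar{\bar{X}} + A_1\sqrt{\dfrac{n}{n-1}}\,\bar{s} = 12.36 + (1.880)\sqrt{\dfrac{4}{3}}\,(4.02) = 21.09$
ii) Control limits for s chart:
$CL = \bar{s} = 4.02$; $LCL = B_3\bar{s} = 0$; $UCL = B_4\bar{s} = (2.266)(4.02) = 9.11$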
Conclusion:
Even before drawing the control chart, we observe that the given sample mean values lie between 3.63 and 21.09
and that the given S.D values fall within 0 and 9.11. Hence the process is under control with respect to average and
variability.
3a) 15 tape recorders were examined for quality control test. The number of defects in each tape
recorder is recorded below. Draw the appropriate control chart and comment on the state of control.
Unit No.(i) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
No. of defects (c) 2 4 3 1 1 2 5 3 6 7 3 1 4 2 1
Solution:
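The number of defects per sample containing only one item is given, so the c-chart is appropriate.
$\bar{c} = \dfrac{\sum c_i}{N} = \dfrac{2 + 4 + 3 + \cdots + 2 + 1}{15} = \dfrac{45}{15} = 3$
$CL = \bar{c} = 3$; $LCL = \bar{c} - 3\sqrt{\bar{c}} = 3 - 3\sqrt{3} = -2.20$; LCL = 0 (since LCL cannot be negative)
$UCL = \bar{c} + 3\sqrt{\bar{c}} = 3 + 3\sqrt{3} = 8.20$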
Since all the sample points lie within the LCL and UCL lines, the process is under control.
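The c-chart limits and the check that every point lies inside them can be sketched in Python as follows (names are illustrative):

```python
from math import sqrt

defects = [2, 4, 3, 1, 1, 2, 5, 3, 6, 7, 3, 1, 4, 2, 1]

c_bar = sum(defects) / len(defects)
ucl = c_bar + 3 * sqrt(c_bar)
lcl = max(0.0, c_bar - 3 * sqrt(c_bar))     # a negative limit is replaced by 0

# Unit numbers (1-based) whose defect count falls outside the limits
out_of_control = [i + 1 for i, c in enumerate(defects) if not lcl <= c <= ucl]
print(c_bar, round(lcl, 2), round(ucl, 2), out_of_control)   # 3.0 0.0 8.2 []
```

An empty list of out-of-control units corresponds to the conclusion that the process is under control.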
3b) Construct a control chart for defectives for the following data: (APRIL / MAY ’15 ) (APR / MAY ’19)
Sample No: 1 2 3 4 5 6 7 8 9 10
No. inspected: 90 65 85 70 80 80 70 95 90 75
No. of defectives: 9 7 3 2 9 5 3 9 6 7
Solution:
We note that the size of the sample varies from sample to sample.
We can construct a p-chart, provided $0.75\,\bar{n} < n_i < 1.25\,\bar{n}$ for all i.
Here $\bar{n} = \dfrac{1}{N}\sum n_i = \dfrac{1}{10}(90 + 65 + \cdots + 90 + 75) = \dfrac{1}{10}(800) = 80$
Hence $0.75\,\bar{n} = 60$ and $1.25\,\bar{n} = 100$.
The values of $n_i$ lie between 60 and 100. Hence the p-chart can be drawn by the method given below.
Now $\bar{p} = \dfrac{\text{Total no. of defectives}}{\text{Total no. of items inspected}} = \dfrac{60}{800} = 0.075$
Hence for the p-chart to be constructed,
$CL = \bar{p} = 0.075$
$LCL = \bar{p} - 3\sqrt{\dfrac{\bar{p}(1 - \bar{p})}{\bar{n}}} = 0.075 - 3\sqrt{\dfrac{0.075 \times 0.925}{80}} = -0.013$
Since LCL cannot be negative, it is taken as 0.
$UCL = \bar{p} + 3\sqrt{\dfrac{\bar{p}(1 - \bar{p})}{\bar{n}}} = 0.075 + 3\sqrt{\dfrac{0.075 \times 0.925}{80}} = 0.163$
The values of $p_i$ for the various samples are 0.100, 0.108, 0.035, 0.029, 0.113, 0.063, 0.043, 0.095, 0.067, 0.093.
Since all the sample points lie within the control lines, the process is under control.
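The average-sample-size method for a p-chart can be illustrated with a small Python sketch. It first checks the 25% condition on the sample sizes, then computes the limits and the fraction defective for each sample; names are illustrative.

```python
from math import sqrt

inspected = [90, 65, 85, 70, 80, 80, 70, 95, 90, 75]
defectives = [9, 7, 3, 2, 9, 5, 3, 9, 6, 7]

n_bar = sum(inspected) / len(inspected)            # average sample size
# The average-n approach is used only when every n_i is within 25% of n_bar
assert all(0.75 * n_bar < n < 1.25 * n_bar for n in inspected)

p_bar = sum(defectives) / sum(inspected)
sigma = sqrt(p_bar * (1 - p_bar) / n_bar)
lcl, ucl = max(0.0, p_bar - 3 * sigma), p_bar + 3 * sigma   # negative LCL becomes 0

fractions = [d / n for d, n in zip(defectives, inspected)]
in_control = all(lcl <= p <= ucl for p in fractions)
print(p_bar, lcl, round(ucl, 3), in_control)
```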
4. The data given below are the number of defectives in 10 samples of 100 items each. Construct a p-chart
and an np-chart and comment on the results. ( MAY / JUNE ’14) (APRIL / MAY ‘17)
Sample No. 1 2 3 4 5 6 7 8 9 10
No. of defectives 6 16 7 3 8 12 7 11 11 4
Solution:
Sample size is constant for all samples, n=100.
Total no. of defectives = 6 + 16+7+3+8+12+7+11+11+4= 85
Total no. Inspected= 10 x 100 = 1000
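Average fraction defective $= \bar{p} = \dfrac{\text{Total no. of defectives}}{\text{Total no. of items inspected}} = \dfrac{85}{1000} = 0.085$
For p-chart:
$LCL = \bar{p} - 3\sqrt{\dfrac{\bar{p}(1 - \bar{p})}{n}} = 0.085 - 3\sqrt{\dfrac{(0.085)(0.915)}{100}} = 0.0013$
$UCL = \bar{p} + 3\sqrt{\dfrac{\bar{p}(1 - \bar{p})}{n}} = 0.085 + 3\sqrt{\dfrac{(0.085)(0.915)}{100}} = 0.1687$
$CL = \bar{p} = 0.085$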
Conclusion:
All these values are less than UCL=0.1687 and greater than LCL=0.0013. In the control chart, all sample points lie
within the control limits. Hence, the process is under statistical control.
For np-chart:
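$UCL = n\bar{p} + 3\sqrt{n\bar{p}(1 - \bar{p})} = n\left[\bar{p} + 3\sqrt{\dfrac{\bar{p}(1 - \bar{p})}{n}}\right] = 100 \times 0.1687 = 16.87$
$CL = n\bar{p} = 100 \times 0.085 = 8.5$
$LCL = n\bar{p} - 3\sqrt{n\bar{p}(1 - \bar{p})} = n\left[\bar{p} - 3\sqrt{\dfrac{\bar{p}(1 - \bar{p})}{n}}\right] = 100 \times 0.0013 = 0.13$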
Conclusion:
All the values of number of defectives in the table lie between 16.87 and 0.13. Hence, the process is under control
even in np-chart.
5. The following data give the measurements of 10 samples each of size 5 in the production process taken
in an interval of 2 hours. Calculate the sample means and ranges and draw the control charts for mean
and range. (MAY / JUNE ‘16) (NOV / DEC ’18)
Sample No.                   1     2     3     4     5     6     7     8     9     10
Observed measurements X      49    50    50    48    47    52    49    55    53    54
                             55    51    53    53    49    55    49    55    50    54
                             54    53    48    51    50    47    49    50    54    52
                             49    46    52    50    44    56    53    53    47    54
                             53    50    47    53    45    50    45    57    51    56
Solution:
$\bar{\bar{X}} = \dfrac{1}{N}\sum \bar{X}_i = \dfrac{1}{10}(52 + 50 + 50 + 51 + 47 + 52 + 49 + 54 + 51 + 54) = 51.0$
$\bar{R} = \dfrac{1}{N}\sum R_i = \dfrac{1}{10}(6 + 7 + 6 + 5 + 6 + 9 + 8 + 7 + 7 + 4) = 6.5$
From the table of control chart constants for sample size n = 5, we have $A_2 = 0.577$, $D_3 = 0$ and $D_4 = 2.115$.
i) Control limits for $\bar{X}$ chart:
CL (central line) $= \bar{\bar{X}} = 51.0$; $LCL = \bar{\bar{X}} - A_2\bar{R} = 51.0 - (0.577)(6.5) = 47.2495$
$UCL = \bar{\bar{X}} + A_2\bar{R} = 51.0 + (0.577)(6.5) = 54.7505$
MEAN CHART
Conclusion: Since the 5th sample mean (47) falls below the LCL, the process is out of statistical control according to the $\bar{X}$ chart.
ii) Control limits for R-chart: $CL = \bar{R} = 6.5$; $LCL = D_3\bar{R} = 0$; $UCL = D_4\bar{R} = (2.115)(6.5) = 13.7475$
RANGE CHART
Conclusion:
Since all the sample ranges fall within the control limits, the process is under statistical control according to the R chart.
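Starting from the raw measurements, the sample means, ranges, control limits and out-of-control points can all be obtained with the Python sketch below. The constants A2, D3 and D4 are the tabulated values for n = 5 quoted in the solution; variable names are illustrative.

```python
# Rows are the five readings taken in each sample; columns are samples 1..10
readings = [
    [49, 50, 50, 48, 47, 52, 49, 55, 53, 54],
    [55, 51, 53, 53, 49, 55, 49, 55, 50, 54],
    [54, 53, 48, 51, 50, 47, 49, 50, 54, 52],
    [49, 46, 52, 50, 44, 56, 53, 53, 47, 54],
    [53, 50, 47, 53, 45, 50, 45, 57, 51, 56],
]
A2, D3, D4 = 0.577, 0.0, 2.115      # control chart constants for n = 5

samples = list(zip(*readings))                      # one tuple per sample
means = [sum(s) / len(s) for s in samples]
ranges = [max(s) - min(s) for s in samples]

x_bar_bar = sum(means) / len(means)
r_bar = sum(ranges) / len(ranges)
x_lcl, x_ucl = x_bar_bar - A2 * r_bar, x_bar_bar + A2 * r_bar
r_lcl, r_ucl = D3 * r_bar, D4 * r_bar

# Sample numbers (1-based) falling outside each pair of limits
bad_means = [i + 1 for i, m in enumerate(means) if not x_lcl <= m <= x_ucl]
bad_ranges = [i + 1 for i, r in enumerate(ranges) if not r_lcl <= r <= r_ucl]
print(means, ranges, bad_means, bad_ranges)
```

Only sample 5 is flagged on the mean chart, and no sample is flagged on the range chart, in agreement with the two conclusions above.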