Unit 9 Part 2
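The correlation coefficient is independent of change of origin and scale, i.e., if $U = \frac{X - a}{h}$ and $V = \frac{Y - b}{k}$ with $h, k > 0$, then $r(U, V) = r(X, Y)$.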
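Coro: If $X$ and $Y$ are random variables and $a, b, c, d$ are any numbers, provided only that $a \neq 0$ and $c \neq 0$, then $r(aX + b, cY + d) = \frac{ac}{|ac|}\, r(X, Y)$.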
If $X$ and $Y$ are independent then $r(X, Y) = 0$. The converse is not true, i.e., if $r(X, Y) = 0$ then $X$ and $Y$ need not be independent.
In other words, two uncorrelated random variables need not be independent.
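A small numerical check of this point, as a sketch in Python. The distribution used here ($X$ uniform on $\{-1, 0, 1\}$ with $Y = X^2$) is a common illustrative choice and is an assumption, not an example taken from these notes.

```python
from fractions import Fraction

# Assumed example: X takes the values -1, 0, 1 with equal probability and Y = X**2.
third = Fraction(1, 3)
joint = {(-1, 1): third, (0, 0): third, (1, 1): third}  # P(X = x, Y = y)

def expect(f):
    """Expected value of f(X, Y) under the joint distribution."""
    return sum(p * f(x, y) for (x, y), p in joint.items())

# Cov(X, Y) = E[XY] - E[X] E[Y]
cov = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)

# Marginal and joint probabilities needed for the independence check.
p_x0 = sum(p for (x, _), p in joint.items() if x == 0)
p_y0 = sum(p for (_, y), p in joint.items() if y == 0)
p_x0_y0 = joint.get((0, 0), Fraction(0))

print("Cov(X, Y) =", cov)                 # zero covariance means r = 0
print("P(X=0, Y=0) =", p_x0_y0)
print("P(X=0) * P(Y=0) =", p_x0 * p_y0)   # differs from the joint probability
```

The covariance is zero, yet the joint probability does not factor into the product of the marginals, so the two variables are uncorrelated but not independent.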
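i) The sign of $r$ is determined by the sign of $\mathrm{Cov}(X, Y)$, since $\sigma_X$ and $\sigma_Y$ are positive.
Therefore, if $\mathrm{Cov}(X, Y) > 0$ then $r > 0$, and if $\mathrm{Cov}(X, Y) < 0$ then $r < 0$.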
ii) From the above two theorems, we see that the correlation coefficient is a measure of
the degree of linearity between $X$ and $Y$. Values of $r$ close to $\pm 1$ indicate high linearity,
while values of $r$ near zero indicate lack of linearity. If $r$ is negative, $Y$ tends to decrease as $X$ increases.
Result: Karl Pearson’s correlation coefficient is also called product moment correlation coefficient.
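The product moment form of the coefficient can be sketched as follows. The function name `pearson_r` and the data are illustrative assumptions. The last two checks illustrate the linearity remark and the standard rescaling property $r(aX + b, cY + d) = \frac{ac}{|ac|} r(X, Y)$.

```python
import math

def pearson_r(x, y):
    """Product moment correlation: Cov(x, y) / (sd(x) * sd(y)).

    The common factor 1/n cancels, so sums of deviations are enough.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]          # illustrative data
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)

# Exactly linear data with a negative slope gives r = -1.
r_line = pearson_r(x, [3 - 2 * a for a in x])

# Rescaling with a = -3, c = 2: |r| is unchanged and the sign follows a*c < 0.
r_flip = pearson_r([-3 * a + 1 for a in x], [2 * b + 7 for b in y])

print(round(r, 4), round(r_line, 4), round(r_flip, 4))
```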
Prepared by D.THIRUMARAN, M.Sc., B.Ed., - 8015461606, C.Mutlur 20
Unit – 9 – Statistics Correlation & Regression
Positive correlation
Two variables are said to be positively correlated if for an increase in the value of one
variable there is also an increase in the value of the other variable or for a decrease in the value of
one variable there is also a decrease in the value of the other variable.
Negative correlation
Two variables are said to be negatively correlated if for an increase in the value of one
variable there is a decrease in the value of the other variable.
No correlation
Two variables are said to be uncorrelated if the change in the value of one variable has no
connection with the change in the value of the other variable.
Simple correlation
The correlation between two variables is called simple correlation.
Multiple correlation
The correlation in the case of more than two variables is called multiple correlation.
Scatter diagram
Scatter diagram gives a rough idea of the correlation between two variables
1) The formula for correlation coefficient holds only if there is a linear correlation between the
variables. That is the relationship between the variables is linear.
2) Correlation theory does not establish a causal relationship between the variables. It does not
suggest that the variations in y are caused by variations in x or vice versa. A high correlation
between variables x and y may describe any of the following situations:
a) Variation in y is caused by variation in x.
b) Variation in x is caused by variation in y.
c) x and y are jointly dependent, for example on a third variable.
d) The correlation between x and y may be due to chance.
The probable error of the correlation coefficient is an amount which if added to and
subtracted from the mean correlation coefficient, produces amounts within which the chances are
even that a coefficient of correlation from a series selected at random will fall.
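As a sketch of how this is used, the formula usually quoted for the probable error is $\text{P.E.}(r) = 0.6745\,\frac{1 - r^2}{\sqrt{n}}$, where $n$ is the number of pairs. The function name and the values of $r$ and $n$ below are illustrative assumptions.

```python
import math

def probable_error(r, n):
    """Probable error of r computed from n pairs: 0.6745 * (1 - r**2) / sqrt(n)."""
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)

r, n = 0.6, 64                      # illustrative values
pe = probable_error(r, n)

# Limits r - P.E. and r + P.E. described in the definition above.
lower, upper = r - pe, r + pe
print(round(pe, 5), round(lower, 5), round(upper, 5))
```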
Rank Correlation
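The rank correlation coefficient when there are $n$ ranks in each variable is given by the
formula (due to Spearman)
$\rho = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$,
where $d$ is the difference between the two ranks of the same individual.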
Tie ranks:
When the values of the variables $x$ and $y$ are given, we can rank the values in each of the
variables and determine Spearman's rank correlation coefficient. If two or more observations
have the same value, we assign to each of them the mean of the ranks they jointly occupy.
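The ranking rule and Spearman's formula $\rho = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$ can be sketched as below. The helper names, the data, and the convention that rank 1 goes to the largest value are assumptions made for illustration. The formula is applied here to data without ties.

```python
def mean_ranks(values):
    """Rank 1 goes to the largest value; tied values share the mean of their positions."""
    order = sorted(values, reverse=True)
    ranks = []
    for v in values:
        first = order.index(v) + 1      # first position occupied by v
        count = order.count(v)          # number of observations tied at v
        ranks.append(first + (count - 1) / 2)
    return ranks

def spearman_rho(x, y):
    """Spearman's formula 1 - 6*sum(d^2) / (n*(n^2 - 1)) on the ranks of x and y."""
    rx, ry = mean_ranks(x), mean_ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Tied values 60 and 60 occupy positions 2 and 3, so each gets rank 2.5.
tied = mean_ranks([70, 60, 60, 50])

# Illustrative untied data: the rank differences are 0, 1, -1, 0, 0.
rho = spearman_rho([10, 20, 30, 40, 50], [12, 25, 22, 41, 60])
print(tied, round(rho, 4))
```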
Multiple Correlation
Result: If then
Partial Correlation
If we want to study the effect of one variable on another variable after eliminating the
effects of all other variables, the measure of relationship between the two variables is called the partial correlation.
In other words a partial correlation coefficient measures the relationship between any two
variables when the other variables connected with those variables are kept constant.
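For three variables, the first-order partial correlation is usually written $r_{12.3} = \frac{r_{12} - r_{13} r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}}$. The sketch below evaluates it. The subscript notation, the function name and the numerical values are assumptions made for illustration.

```python
import math

def partial_r12_3(r12, r13, r23):
    """Correlation between variables 1 and 2 with variable 3 held constant."""
    return (r12 - r13 * r23) / math.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))

# Illustrative simple correlations between the three pairs of variables.
r12, r13, r23 = 0.8, 0.6, 0.5
r12_3 = partial_r12_3(r12, r13, r23)
print(round(r12_3, 4))
```

In this example the partial coefficient is smaller than $r_{12}$, since part of the association between the first two variables is accounted for by the third.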
1. and
2. If then
PG – TRB – Questions
a) b) c) d)
6. If $r$ is the correlation coefficient of independent variables X and Y then [2006-07]
a) b) c) d)
7. If X and Y are independent variables, then [2011-12]
a) X and Y are uncorrelated b) X and Y are positively correlated
c) X and Y are negatively correlated d) correlation coefficient cannot be defined
8. The coefficient of correlation between X and Y is 0.6 and their covariance is 4.8. If the
variance of X is 9 then the standard deviation of Y is: [2012-13]
a) b) c) d)
9. The partial correlation coefficient between $X_1$ and $X_2$ in the case of three
variables $X_1$, $X_2$, $X_3$ is [2015]
a) b) c) d)
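Question 8 above can be checked directly from the definition $r = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$. The working below is a sketch of that calculation only. It does not say which lettered option it corresponds to, because the option values are not reproduced above.

```latex
\sigma_X = \sqrt{9} = 3, \qquad
\sigma_Y = \frac{\operatorname{Cov}(X, Y)}{r\,\sigma_X}
         = \frac{4.8}{0.6 \times 3}
         = \frac{8}{3} \approx 2.67
```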
Regression analysis is a mathematical measure of the average relationship between two or
more variables in terms of the original units of the data.
The equation of the regression line of $y$ on $x$ is $y - \bar{y} = r \frac{\sigma_y}{\sigma_x} (x - \bar{x})$.
3. The modulus value of the arithmetic mean of the regression coefficients is not less than the
modulus value of the correlation coefficient $r$.
4. Regression coefficients are independent of the change of origin but not of scale.
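Properties 3 and 4 can be checked numerically. In the sketch below the regression coefficients are taken as $b_{yx} = r \frac{\sigma_y}{\sigma_x}$ and $b_{xy} = r \frac{\sigma_x}{\sigma_y}$. The symbols, the function name and the data are assumptions made for illustration.

```python
import math

def regression_coefficients(x, y):
    """Return (b_yx, b_xy, r) computed from sums of deviations about the means."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sxx, sxy / syy, sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]          # illustrative data
y = [2, 4, 5, 4, 5]
b_yx, b_xy, r = regression_coefficients(x, y)

# Property 3: |arithmetic mean of the regression coefficients| >= |r|.
am = (b_yx + b_xy) / 2

# Property 4: shifting x (change of origin) leaves b_yx unchanged,
# while doubling x (change of scale) halves it.
b_shift, _, _ = regression_coefficients([a + 10 for a in x], y)
b_scale, _, _ = regression_coefficients([2 * a for a in x], y)

print(round(b_yx, 4), round(b_xy, 4), round(r, 4), round(am, 4))
print(round(b_shift, 4), round(b_scale, 4))
```

The same run also shows that the product of the two regression coefficients equals $r^2$, which is why $r$ is their geometric mean in absolute value.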
PG – TRB – Questions
1. If the sum of the squares of the rank differences of a pairs of values is 80, the
correlation coefficient between them is [2001]
a) b) c) d) none of these
2. One of the two regression lines is [2002-03]
a) b)
c) d)
3. The regression coefficient of Y on X is and the regression coefficient of X on Y is .
The correlation coefficient is
a) b) c) d)
4. The correlation coefficient is [2003-04]
a) harmonic mean between two regression coefficients
b) geometric mean between two regression coefficients
c) arithmetic mean between two regression coefficients
d) product of the two regression coefficients
5. If $r$ is the correlation coefficient between two variables $x$ and $y$ and if $b_{yx}$ and $b_{xy}$ are
the two regression coefficients then [2004-05]
a) b) c) d)
6. If $r$ is the correlation coefficient of the random variables X and Y, the regression
coefficient of the Y on X line is [2005-06]
a) b) c) d)
7. If one regression coefficient is greater than unity then the other must be [2005-06]
a) greater than the first one b) equal to unity
c) less than unity d) equal to zero
8. The equations of two regression lines are 5x-y=22 and 64x-45y=24. The mean values
of X and Y are respectively [2005-06]
a) 3 and 4 b) 4 and 3 c) 6 and 8 d) 8 and 6
9. If $r$ is the correlation coefficient of the random variables X and Y, the regression
coefficient of the Y on X line is [2006-07]
a) b) c) d)
10. The equation of two regression lines are and . The mean
values of and are given by [2006-07]
a) 3 and 4 b) 4 and 3 c) 6 and 8 d) 8 and 6
11. The regression coefficients are [2011-12]
a) dependent on change of origin and scale
b) independent of change of origin and scale
c) dependent on change of origin and scale
d) independent of change of origin but not of scale
12. Consider the function then [2011-12]
a) regression of y on x and x on y are linear
b) regression of y on x and x on y are non-linear
c) regression of y on x is linear, regression of x on y is non-linear
d) regression of y on x is non-linear, but regression of x on y is linear
13. If the equation of two regression lines are and then
is [2012-13]
a) b) c) d)
14. If the regression lines coincide then which of the following is false? [2012-13]
a) b) c) d)
15. Finding the relation between two random variables is called [2015]
a) regression b) analysis c) sample tests d) binomial
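Questions that ask for the mean values from two regression lines (such as question 8 above, with lines 5x - y = 22 and 64x - 45y = 24) rest on the fact that both regression lines pass through the point $(\bar{x}, \bar{y})$, so the means are the point of intersection. A minimal sketch of that calculation follows. The helper name `intersect` is an assumption.

```python
from fractions import Fraction

def intersect(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("lines are parallel or coincident")
    x = Fraction(c1 * b2 - c2 * b1, det)
    y = Fraction(a1 * c2 - a2 * c1, det)
    return x, y

# Lines from question 8: 5x - y = 22 and 64x - 45y = 24.
mean_x, mean_y = intersect(5, -1, 22, 64, -45, 24)
print(mean_x, mean_y)
```

Substituting back confirms the result: 5(6) - 8 = 22 and 64(6) - 45(8) = 24.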
Curve Fitting
1. Fitting a straight line
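Suppose $(x_i, y_i)$, $i = 1, 2, \dots, n$, are $n$ pairs of values of $(x, y)$. Let us assume
$y = ax + b$ as a line of best fit for this data. The parameters $a, b$ are determined using the principle of
least squares.
They are given by the normal equations
$\sum y = a \sum x + nb$
$\sum xy = a \sum x^2 + b \sum x$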
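A minimal sketch of the straight-line fit, assuming the line is written as $y = ax + b$ so that the normal equations are $\sum y = a \sum x + nb$ and $\sum xy = a \sum x^2 + b \sum x$. The function name and the data are illustrative assumptions.

```python
def fit_line(x, y):
    """Least-squares line y = a*x + b obtained by solving the two normal equations."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    sxy = sum(u * v for u, v in zip(x, y))
    # Normal equations:  sy = a*sx + n*b   and   sxy = a*sxx + b*sx
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values are equal, so the slope is undefined")
    a = (n * sxy - sx * sy) / det
    b = (sy * sxx - sx * sxy) / det
    return a, b

x = [1, 2, 3, 4, 5]          # illustrative data
y = [2, 4, 5, 4, 5]
a, b = fit_line(x, y)
print(round(a, 4), round(b, 4))
```

For this data the sums are $\sum x = 15$, $\sum y = 20$, $\sum x^2 = 55$ and $\sum xy = 66$, which gives the fitted line $y = 0.6x + 2.2$.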
Let ; ; ;
The normal equations are
Let ; ; ;
The normal equations are
Let ; ; ;
The normal equations are