Unit 9 Part 2


Unit – 9 – Statistics Correlation & Regression

Correlation (or) Correlation coefficient

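Let $X$ and $Y$ be two random variables. The correlation coefficient, denoted by $r(X, Y)$ or simply $r$, is defined by

$$r(X, Y) = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \, \sigma_Y}, \quad \text{if } \sigma_X \neq 0 \text{ and } \sigma_Y \neq 0,$$

i.e.,

$$r(X, Y) = \frac{E[(X - E(X))(Y - E(Y))]}{\sqrt{\operatorname{Var}(X)\,\operatorname{Var}(Y)}}, \quad \text{if } \operatorname{Var}(X) \neq 0 \text{ and } \operatorname{Var}(Y) \neq 0.$$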
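
As a minimal sketch of the definition, the following Python function computes the correlation coefficient of paired observations as the covariance divided by the product of the standard deviations. The name `pearson_r` and the sample data are illustrative assumptions, not part of the original notes; population moments (division by n) are assumed.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient r = Cov(x, y) / (sd_x * sd_y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    if var_x == 0 or var_y == 0:
        # The definition requires both standard deviations to be non-zero.
        raise ValueError("r is undefined when a standard deviation is zero")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical sample data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
```

The guard on zero variance mirrors the condition attached to the definition: the ratio is meaningless when either variable is constant.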

Numerical value of the correlation coefficient

The coefficient of correlation lies between -1 and +1, inclusive of those values: $-1 \le r \le 1$.

i) When $r$ is positive, the variables $X$ and $Y$ increase or decrease together.
ii) $r = +1$ implies that there is a perfect positive correlation between the variables $X$ and $Y$.
iii) When $r$ is negative, the variables $X$ and $Y$ move in opposite directions.
iv) When $r = -1$, there is a perfect negative correlation.
v) When $r = 0$, the two variables are uncorrelated.

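Theo: $r$ assumes all values between -1 and +1, i.e., $-1 \le r \le 1$.

Theo: If $r = \pm 1$ then there exists a linear relation between $X$ and $Y$.

Theo: The correlation coefficient is independent of (unaffected by) change of origin and scale.

Coro: If $X$ and $Y$ are random variables and $a, b, c, d$ are any numbers, provided only that $a \neq 0$ and $c \neq 0$, then $r(aX + b, cY + d) = \frac{ac}{|ac|}\, r(X, Y)$.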

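Theo: If $r$ is the correlation coefficient between $X$ and $Y$, and if $U = \frac{X - a}{h}$, $V = \frac{Y - b}{k}$, where $a, b, h, k$ are constants with $h > 0$, $k > 0$, then $r(U, V) = r(X, Y)$.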
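
The effect of a change of origin and scale can be checked numerically. The sketch below assumes linear transformations aX + b and cY + d with a and c non-zero, and compares the correlation before and after; the constants and data are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient of paired observations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b, c, d = -3.0, 7.0, 2.0, -1.0  # hypothetical constants, a != 0 and c != 0

r_original = pearson_r(xs, ys)
r_transformed = pearson_r([a * x + b for x in xs], [c * y + d for y in ys])

# Only the sign of a*c can change the coefficient; its magnitude is unaffected.
sign = (a * c) / abs(a * c)
```

Here a and c have opposite signs, so the transformed coefficient is the negative of the original one; with scale factors of the same sign it would be unchanged.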

Theo:

Theo:

Theo: Two independent variables are uncorrelated,

i.e., if $X$ and $Y$ are independent then $\operatorname{Cov}(X, Y) = 0$ and hence $r = 0$.

The converse is not true, i.e., if $r = 0$ then $X$ and $Y$ need not be independent.
In other words, two uncorrelated random variables need not be independent.
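
A standard counterexample for the converse, sketched here under the assumption that X takes the values -1, 0, 1 with equal probability and Y = X^2 (this example is not taken from the original notes): the covariance is zero although Y is completely determined by X.

```python
from fractions import Fraction

# X takes -1, 0, 1 with probability 1/3 each; Y = X**2 depends on X completely.
support = [-1, 0, 1]
p = Fraction(1, 3)

e_x = sum(p * x for x in support)            # E(X)
e_y = sum(p * x**2 for x in support)         # E(Y) = E(X^2)
e_xy = sum(p * x * x**2 for x in support)    # E(XY) = E(X^3)
cov_xy = e_xy - e_x * e_y                    # Cov(X, Y)

# Independence would require P(X=0, Y=0) == P(X=0) * P(Y=0).
p_joint = p          # X = 0 forces Y = 0, so the joint probability is 1/3
p_product = p * p    # P(X=0) * P(Y=0) = 1/3 * 1/3
```

The covariance vanishes, so the variables are uncorrelated, yet the joint probability differs from the product of the marginals, so they are not independent.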

Theo: Let $X$ and $Y$ be random variables connected by the relation $Y = aX + b$ with $a \neq 0$; then $r = \pm 1$.

Note:
i) The sign of $r$ is determined by the sign of $a$.
Therefore, if $a > 0$ then $r = +1$, and if $a < 0$ then $r = -1$.

ii) From the above two theorems, we see that the correlation coefficient is a measure of
the degree of linearity between $X$ and $Y$. Values of $r$ close to $\pm 1$ indicate high linearity,
while values of $r$ near zero indicate lack of linearity. If $r$ is negative, $Y$ tends to decrease as
$X$ increases.

Result: Karl Pearson’s correlation coefficient is also called product moment correlation coefficient.
__________________________________________________________________________________
Prepared by D.THIRUMARAN, M.Sc., B.Ed., - 8015461606, C.Mutlur 20

Positive correlation

Two variables are said to be positively correlated if for an increase in the value of one
variable there is also an increase in the value of the other variable or for a decrease in the value of
one variable there is also a decrease in the value of the other variable.

That is, the two variables change in the same direction.

Negative correlation

Two variables are said to be negatively correlated if for an increase in the value of one
variable there is a decrease in the value of the other variable.

That is, the two variables change in opposite directions.

No correlation

Two variables are said to be uncorrelated if a change in the value of one variable has no
connection with a change in the value of the other variable.

Simple correlation

The correlation between two variables is called simple correlation.

Multiple correlation

The correlation in the case of more than two variables is called multiple correlation.

Scatter diagram

A scatter diagram gives a rough idea of the correlation between two variables.

It gives no exact numerical measure of the degree of relationship between the variables.

Limitations

1) The formula for the correlation coefficient holds only if there is a linear correlation between the
variables, that is, the relationship between the variables is linear.
2) Correlation theory does not establish a causal relationship between the variables. It does not
suggest that the variations in y are caused by variations in x or vice versa. A high correlation
between variables x and y may describe any of the following situations:
a) Variation in y is caused by variation in x.
b) Variation in x is caused by variation in y.
c) x and y are jointly dependent on a third variable.
d) The correlation between x and y may be due to chance.

Probable Error of Correlation Coefficient

The probable error of the correlation coefficient is an amount which if added to and
subtracted from the mean correlation coefficient, produces amounts within which the chances are
even that a coefficient of correlation from a series selected at random will fall.

If $r$ is the correlation coefficient in a sample of $n$ pairs of observations, then

i) its standard error (S.E.) is given by S.E. $= \frac{1 - r^2}{\sqrt{n}}$

ii) its probable error (P.E.) is given by P.E. $= 0.6745 \times \frac{1 - r^2}{\sqrt{n}}$
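
A short numerical sketch of these two quantities, assuming the usual textbook formulas S.E. = (1 - r^2)/sqrt(n) and P.E. = 0.6745 x S.E.; the function names and the sample values of r and n are hypothetical.

```python
import math

def standard_error(r, n):
    """Standard error of r for a sample of n pairs: (1 - r^2) / sqrt(n)."""
    return (1 - r**2) / math.sqrt(n)

def probable_error(r, n):
    """Probable error of r: 0.6745 times the standard error."""
    return 0.6745 * standard_error(r, n)

r, n = 0.6, 64  # hypothetical sample correlation and sample size
se = standard_error(r, n)   # (1 - 0.36) / 8
pe = probable_error(r, n)
```

The interval from r - P.E. to r + P.E. is the band referred to in the definition above.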


Rank Correlation

The rank correlation coefficient when there are $n$ ranks in each variable is given by the
formula (due to Spearman)

$$\rho = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$

where $d$ is the difference between the ranks of corresponding pairs of $x$ and $y$, and $n$ is the number of
observations.

Result: The limits for the rank correlation coefficient are $-1 \le \rho \le 1$.

Tie ranks:

When the values of the variables $x$ and $y$ are given, we can rank the values in each of the
variables and determine Spearman’s rank correlation coefficient. If two or more observations
have the same rank, we assign to them the mean rank.
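
The ranking step and Spearman's formula 1 - 6*sum(d^2) / (n(n^2 - 1)) can be sketched as follows. The helper names are hypothetical, and rank 1 is assumed to go to the largest value. Many textbooks add a correction term to sum(d^2) when ties occur; that correction is not stated in these notes and is omitted here, so the result for tied data is only approximate.

```python
def mean_ranks(values):
    """Rank values (1 = largest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend 'end' over the run of values equal to the one at 'pos'.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2  # mean of ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman's rank correlation from rank differences (no tie correction)."""
    rx, ry = mean_ranks(xs), mean_ranks(ys)
    n = len(xs)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * sum_d2 / (n * (n**2 - 1))

tied = mean_ranks([10, 20, 20, 5])                          # two values tie for ranks 1 and 2
reversed_order = spearman_rho([1, 2, 3, 4, 5], [5, 4, 3, 2, 1])
```

In the tied example the two equal values each receive rank 1.5, the mean of ranks 1 and 2.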

Multiple Correlation

If $x_1$ is the dependent variable, and $x_2$ and $x_3$ are independent variables, we denote the
multiple correlation by $R_{1.23}$. Thus in multiple correlation, we study the compound effects of a
group of variables upon a single variable which is not included in the group.

Multiple Correlation coefficient

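The multiple correlation coefficient for three variables is denoted by $R_{1.23}$ and is given by

$$R_{1.23} = \sqrt{\frac{r_{12}^2 + r_{13}^2 - 2\, r_{12}\, r_{13}\, r_{23}}{1 - r_{23}^2}}$$

where $r_{ij}$ is the simple correlation coefficient between $x_i$ and $x_j$.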
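
For three variables, the standard textbook expression for the multiple correlation coefficient in terms of the simple correlations is R_{1.23}^2 = (r12^2 + r13^2 - 2 r12 r13 r23) / (1 - r23^2). The sketch below evaluates it; the function name and the input values are hypothetical.

```python
import math

def multiple_correlation(r12, r13, r23):
    """R_{1.23}: correlation of x1 with the pair (x2, x3) taken together."""
    r_squared = (r12**2 + r13**2 - 2 * r12 * r13 * r23) / (1 - r23**2)
    return math.sqrt(r_squared)

# Hypothetical simple correlation coefficients.
R = multiple_correlation(0.6, 0.5, 0.4)

# When x2 and x3 are uncorrelated, R^2 reduces to r12^2 + r13^2.
R_uncorrelated = multiple_correlation(0.3, 0.4, 0.0)
```

In this example R is at least as large as either of the simple correlations r12 and r13.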

Result: If then

Partial Correlation

If we want to study the effect of one variable $x_1$ on another variable $x_2$ after eliminating the
effects of all other variables, the measure of relationship between $x_1$ and $x_2$ is called the partial
correlation.

In other words a partial correlation coefficient measures the relationship between any two
variables when the other variables connected with those variables are kept constant.

Partial Correlation Coefficient

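The partial correlation coefficient between $x_1$ and $x_2$, with $x_3$ held constant, is denoted by $r_{12.3}$ and is given by

$$r_{12.3} = \frac{r_{12} - r_{13}\, r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}}$$

Similarly,

$$r_{13.2} = \frac{r_{13} - r_{12}\, r_{23}}{\sqrt{(1 - r_{12}^2)(1 - r_{23}^2)}}, \qquad r_{23.1} = \frac{r_{23} - r_{12}\, r_{13}}{\sqrt{(1 - r_{12}^2)(1 - r_{13}^2)}}$$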
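
The partial correlation coefficients for three variables follow one pattern, r_{ab.c} = (r_ab - r_ac r_bc) / sqrt((1 - r_ac^2)(1 - r_bc^2)), which the sketch below implements once and reuses. The function name and the numerical inputs are hypothetical.

```python
import math

def partial_correlation(r_ab, r_ac, r_bc):
    """Correlation between a and b with the third variable c held constant."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac**2) * (1 - r_bc**2))

# Hypothetical simple correlation coefficients.
r12, r13, r23 = 0.6, 0.5, 0.4

r12_3 = partial_correlation(r12, r13, r23)  # x1 and x2, x3 held constant
r13_2 = partial_correlation(r13, r12, r23)  # x1 and x3, x2 held constant
r23_1 = partial_correlation(r23, r12, r13)  # x2 and x3, x1 held constant

# Special case: all three simple correlations equal to the same value rho.
rho = 0.5
equal_case = partial_correlation(rho, rho, rho)  # simplifies to rho / (1 + rho)
```

Passing the arguments in a different order is all that is needed to obtain the three coefficients, since the formula is symmetric in form.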

Result: If each is equal to

Relation between Multiple and Partial Correlation Coefficient

Result

1. and
and
2. If then
3.

4. If then lies between

PG – TRB – Questions

1. Limit for the coefficient of correlation is [2001]


a) b) c) d)
2. Correlation coefficient between x and y lies in the interval [2003-04]
a) b) c) d)
3. If r is the correlation coefficient between two variables X and Y then [2004-05]
a) b) c) d)
4. The coefficient of correlation between the two variables X and Y is 0.6. Their
covariance is 3.6 and variance of X is 4. The variance of Y is [2005-06]
a) 3.5 b) 4.5 c) 5.5 d) 9
5. If , and are the standard deviations of , and respectively then the
correlation coefficient between and is given by [2006-07]

a) b) c) d)
6. If r is the correlation coefficient of independent variables X and Y then [2006-07]
a) b) c) d)
7. If X and Y are independent variables, then [2011-12]
a) X and Y are uncorrelated b) X and Y are positively correlated
c) X and Y are negatively correlated d) correlation coefficient cannot be defined
8. The coefficient of correlation between X and Y is 0.6 and their covariance is 4.8. If the
variance of X is 9 then the standard deviation of Y is: [2012-13]
a) b) c) d)
9. The partial correlation coefficient between and in the case of three
variables , , is [2015]
a) b) c) d)


Regression
Regression analysis is a mathematical measure of the average relationship between two or
more variables in terms of the original units of the data.
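The equation of the regression line of $y$ on $x$ is

$$y - \bar{y} = r \frac{\sigma_y}{\sigma_x} (x - \bar{x})$$

The equation of the regression line of $x$ on $y$ is

$$x - \bar{x} = r \frac{\sigma_x}{\sigma_y} (y - \bar{y})$$

Note:

1. The slope $b_{yx} = r \frac{\sigma_y}{\sigma_x}$ is called the regression coefficient of $y$ on $x$.

2. The slope $b_{xy} = r \frac{\sigma_x}{\sigma_y}$ is called the regression coefficient of $x$ on $y$.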
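
A minimal sketch of both regression lines, assuming the regression coefficients are computed as Cov(x, y)/Var(x) for y on x and Cov(x, y)/Var(y) for x on y, and that each line passes through the point of means. Function names and data are hypothetical.

```python
def regression_lines(xs, ys):
    """Means and regression coefficients for the lines of y on x and x on y."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "b_yx": sxy / sxx,  # regression coefficient of y on x
        "b_xy": sxy / syy,  # regression coefficient of x on y
    }

# Hypothetical sample data.
fit = regression_lines([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])

def predict_y(x):
    """Line of y on x: y - mean_y = b_yx * (x - mean_x)."""
    return fit["mean_y"] + fit["b_yx"] * (x - fit["mean_x"])

def predict_x(y):
    """Line of x on y: x - mean_x = b_xy * (y - mean_y)."""
    return fit["mean_x"] + fit["b_xy"] * (y - fit["mean_y"])
```

The line of y on x is the one to use for estimating y from a given x, and the line of x on y for the reverse; the two lines intersect at the point of means.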

Properties of Regression Coefficients


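1. The correlation coefficient is the geometric mean between the two regression coefficients: $r = \pm\sqrt{b_{yx}\, b_{xy}}$, where $b_{yx}$ and $b_{xy}$ are the regression coefficients of $y$ on $x$ and of $x$ on $y$.

i) If the two regression coefficients are positive then $r$ is positive.

ii) If the two regression coefficients are negative then $r$ is negative.

2. If any one of the regression coefficients is greater than unity, then the other regression coefficient is less than unity.

3. The modulus value of the arithmetic mean of the regression coefficients is not less than the
modulus value of the correlation coefficient $r$: $\left| \frac{b_{yx} + b_{xy}}{2} \right| \ge |r|$.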

4. Regression coefficients are independent of the change of origin but not of scale.
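
These properties can be checked on a small data set. The sketch below assumes the regression coefficient of y on x is Cov(x, y)/Var(x); the data, the shift amounts, and the scale factors h and k are hypothetical.

```python
import math

def regression_coefficient(xs, ys):
    """Regression coefficient of y on x: Cov(x, y) / Var(x)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b_yx = regression_coefficient(xs, ys)  # y on x
b_xy = regression_coefficient(ys, xs)  # x on y

# Property 1: r is the geometric mean, carrying the common sign of the coefficients.
r = math.copysign(math.sqrt(b_yx * b_xy), b_yx)

# Property 3: |arithmetic mean| is not less than |r|.
arithmetic_mean = (b_yx + b_xy) / 2

# Property 4: a change of origin leaves b_yx unchanged, a change of scale does not.
shifted = regression_coefficient([x - 3 for x in xs], [y - 4 for y in ys])
h, k = 2.0, 5.0  # hypothetical scale factors for x and y
scaled = regression_coefficient([x / h for x in xs], [y / k for y in ys])
```

With u = x/h and v = y/k, the regression coefficient of v on u works out to (h/k) times the original one, which is why scale matters while origin does not.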

Angle between the two regression lines

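The angle $\theta$ between the two regression lines is given by

$$\tan\theta = \frac{1 - r^2}{|r|} \cdot \frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2}$$

Note:

1. If $r = \pm 1$ then $\theta = 0$ and the two regression lines coincide.

2. If $r = 0$ then $\theta = \frac{\pi}{2}$.
That is, if the variables are uncorrelated, then the regression lines are perpendicular to each other.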
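
The two limiting cases can be reproduced numerically. The sketch assumes the usual formula for the acute angle, tan(theta) = ((1 - r^2)/|r|) * sd_x*sd_y / (sd_x^2 + sd_y^2); the function name and inputs are hypothetical.

```python
import math

def angle_between_regression_lines(r, sd_x, sd_y):
    """Acute angle (in radians) between the two regression lines."""
    if r == 0:
        # Uncorrelated variables: the lines are perpendicular.
        return math.pi / 2
    tan_theta = (1 - r**2) / abs(r) * (sd_x * sd_y) / (sd_x**2 + sd_y**2)
    return math.atan(tan_theta)

coincident = angle_between_regression_lines(1.0, 2.0, 3.0)     # r = +1
perpendicular = angle_between_regression_lines(0.0, 2.0, 3.0)  # r = 0
in_between = angle_between_regression_lines(0.5, 1.0, 1.0)     # tan(theta) = 0.75
```

As |r| grows from 0 to 1, the angle shrinks from a right angle to zero, so a narrow angle between the lines signals strong correlation.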


PG – TRB – Questions
1. If the sum of the squares of the rank differences of a pairs of values is 80, the
correlation coefficient between them is [2001]
a) b) c) d) none of these
2. One of the two regression lines is [2002-03]
a) b)
c) d)
3. The regression coefficient of Y on X is and the regression coefficient of X on Y is .
The correlation coefficient is
a) b) c) d)
4. The correlation coefficient is [2003-04]
a) harmonic mean between two regression coefficients
b) geometric mean between two regression coefficients
c) arithmetic mean between two regression coefficients
d) product of the two regression coefficients
5. If is the correlation coefficient between two variables and and if and are
two regression coefficients then [2004-05]
a) b) c) d)
6. If r is the correlation coefficient of the random variables X and Y, the regression
coefficient of the Y on X line is [2005-06]
a) b) c) d)
7. If one regression coefficient is greater than unity then the other must be [2005-06]
a) greater than the first one b) equal to unity
c) less than unity d) equal to zero
8. The equations of two regression lines are 5x-y=22 and 64x-45y=24. The mean values
of X and Y are respectively [2005-06]
a) 3 and 4 b) 4 and 3 c) 6 and 8 d) 8 an
9. If r is the correlation coefficient of the random variables X and Y, the regression
coefficient of the Y on X line is [2006-07]
a) b) c) d)
10. The equation of two regression lines are and . The mean
values of and are given by [2006-07]
a) 3 and 4 b) 4 and 3 c) 6 and 8 d) 8 and 6
11. The regression coefficients are [2011-12]
a) dependent on change of origin and scale
b) independent of change of origin and scale
c) dependent on change of origin and scale
d) independent of change of origin but not of scale
12. Consider the function then [2011-12]
a) regression of y on x and x on y are linear
b) regression of y on x and x on y are non-linear
c) regression of y on x is linear, regression of x on y is non-linear
d) regression of y on x is non-linear, but regression of x on y is linear
13. If the equation of two regression lines are and then
is [2012-13]
a) b) c) d)
14. If the regression lines coincide then which of the following is false? [2012-13]
a) b) c) d)

15. Finding the relation between two random variables is called [2015]
a) regression b) analysis c) sample tests d) binomial
Curve Fitting
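1. Fitting a straight line
Suppose $(x_i, y_i)$, $i = 1, 2, \ldots, n$, are $n$ pairs of values of $(x, y)$. Let us assume
$y = ax + b$ as the line of best fit for this data. The parameters $a, b$ are determined using the principle of
least squares.
They are given by the normal equations

$$\sum y = a \sum x + n b$$
$$\sum xy = a \sum x^2 + b \sum x$$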
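
A minimal sketch of the straight-line fit, assuming the line is written y = a*x + b so that the normal equations are sum(y) = a*sum(x) + n*b and sum(xy) = a*sum(x^2) + b*sum(x). The function name and data are hypothetical; the 2x2 system is solved by Cramer's rule.

```python
def fit_line(xs, ys):
    """Least-squares line y = a*x + b from the two normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx          # zero only if all x values are equal
    a = (n * sxy - sx * sy) / det    # slope
    b = (sy * sxx - sx * sxy) / det  # intercept
    return a, b

# Hypothetical sample data.
a, b = fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
```

For this data the fitted line is y = 0.6x + 2.2, which coincides with the regression line of y on x.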

2. Fitting a second degree polynomial


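Suppose $(x_i, y_i)$, $i = 1, 2, \ldots, n$, are $n$ pairs of values of $(x, y)$. Let us assume
$y = ax^2 + bx + c$ as a second degree polynomial of best fit for this data. The parameters $a, b, c$ are
determined using the principle of least squares.
They are given by the normal equations

$$\sum y = a \sum x^2 + b \sum x + n c$$
$$\sum xy = a \sum x^3 + b \sum x^2 + c \sum x$$
$$\sum x^2 y = a \sum x^4 + b \sum x^3 + c \sum x^2$$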
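
The second degree fit leads to three normal equations in three unknowns. The sketch below assumes the form y = a*x^2 + b*x + c and solves the system by Cramer's rule; the helper names and the data are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_parabola(xs, ys):
    """Least-squares parabola y = a*x^2 + b*x + c from the normal equations."""
    n = len(xs)
    s1 = sum(xs)
    s2 = sum(x**2 for x in xs)
    s3 = sum(x**3 for x in xs)
    s4 = sum(x**4 for x in xs)
    t0 = sum(ys)
    t1 = sum(x * y for x, y in zip(xs, ys))
    t2 = sum(x**2 * y for x, y in zip(xs, ys))
    # Rows: sum(y), sum(xy), sum(x^2 y); columns: coefficients of a, b, c.
    matrix = [[s2, s1, n],
              [s3, s2, s1],
              [s4, s3, s2]]
    rhs = [t0, t1, t2]
    d = det3(matrix)
    solution = []
    for col in range(3):
        # Cramer's rule: replace one column by the right-hand side.
        replaced = [row[:] for row in matrix]
        for i in range(3):
            replaced[i][col] = rhs[i]
        solution.append(det3(replaced) / d)
    return tuple(solution)

# Hypothetical data lying exactly on y = 2x^2 - 3x + 1.
a, b, c = fit_parabola([0, 1, 2, 3, 4], [1, 0, 3, 10, 21])
```

Because the sample points lie exactly on a parabola, the fit recovers its coefficients; with noisy data the same equations give the least-squares compromise.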

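3. Fitting a curve of the form $y = ax^b$

Let $Y = \log y$; $A = \log a$; $X = \log x$;
Then, $Y = A + bX$
The normal equations are

$$\sum Y = nA + b \sum X$$
$$\sum XY = A \sum X + b \sum X^2$$

4. Fitting a curve of the form $y = ab^x$

Let $Y = \log y$; $A = \log a$; $B = \log b$;
Then, $Y = A + Bx$
The normal equations are

$$\sum Y = nA + B \sum x$$
$$\sum xY = A \sum x + B \sum x^2$$

5. Fitting a curve of the form $y = ae^{bx}$

Let $Y = \log y$; $A = \log a$; $B = b \log e$;
Then, $Y = A + Bx$
The normal equations are

$$\sum Y = nA + B \sum x$$
$$\sum xY = A \sum x + B \sum x^2$$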
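
Curves that become linear after taking logarithms can be fitted by reusing the straight-line normal equations on the transformed data. The sketch below assumes three common forms, y = a*x^b, y = a*b^x and y = a*e^(b*x), and uses natural logarithms; the specific forms, function names and data are assumptions rather than a transcription of the notes. Note that this minimises squared errors in log y, not in y itself.

```python
import math

def fit_line(xs, ys):
    """Least-squares line ys = slope * xs + intercept via the normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx
    return (n * sxy - sx * sy) / det, (sy * sxx - sx * sxy) / det

def fit_power(xs, ys):
    """y = a * x**b: log y = log a + b * log x."""
    slope, intercept = fit_line([math.log(x) for x in xs],
                                [math.log(y) for y in ys])
    return math.exp(intercept), slope            # (a, b)

def fit_base_power(xs, ys):
    """y = a * b**x: log y = log a + x * log b."""
    slope, intercept = fit_line(xs, [math.log(y) for y in ys])
    return math.exp(intercept), math.exp(slope)  # (a, b)

def fit_exponential(xs, ys):
    """y = a * e**(b*x): ln y = ln a + b * x."""
    slope, intercept = fit_line(xs, [math.log(y) for y in ys])
    return math.exp(intercept), slope            # (a, b)

# Hypothetical data generated exactly from each model.
xs = [1, 2, 3, 4]
power_a, power_b = fit_power(xs, [3 * x**2 for x in xs])
base_a, base_b = fit_base_power(xs, [2 * 3**x for x in xs])
exp_a, exp_b = fit_exponential(xs, [5 * math.exp(0.5 * x) for x in xs])
```

All x and y values must be positive wherever a logarithm is taken, which limits these transformations to strictly positive data.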
