Correlation Notes


STATISTICS FOR ECONOMICS
CHAPTER 7
CORRELATION

Correlation
Lesson Objectives:
✓Explain the meaning of correlation
✓Discuss the importance/significance of correlation
✓Distinguish between, and illustrate, linear and non-linear correlation, and simple and
multiple correlation.

CORRELATION
➢Correlation is a statistical tool/method/technique that measures the quantitative relationship
(direction and intensity) between variables. The measure it produces is the correlation coefficient.
➢Examples include the relationship between the price of a good and the quantity demanded, household
income and household expenditure, study time and test scores, and the level of employment
and output.
➢It should be noted that a relationship may be found between two variables even though no
meaningful explanation can be given for it. Such a relationship may arise by coincidence or
through a hidden third variable, and is known as a spurious relation/correlation.
➢Spurious correlation is a situation where a relationship is found between two variables
because of either coincidence or the presence of a third unseen factor, when in reality
there is no meaningful causal relation between the two variables.
➢For example, a correlation between ice cream sales and drowning during summer is a
spurious correlation, as ice cream sales do not cause drowning. The relationship exists
because of a hidden third factor (high temperature).
Importance/significance of correlation
➢Formation of laws and concepts:
✓The study of correlation shows the direction and degree of relationship between
variables. This has helped in the formulation of various laws and concepts in
economics, e.g., the law of demand, the law of supply, Okun's law, etc.
➢Cause and effect relationship:
✓Correlation serves as a starting point for the study of a cause-and-effect relationship
between variables, which can be used for prediction.
➢Business decisions:
✓Correlation analysis facilitates business decisions because the trend path of one variable
may suggest the expected changes in the other. A business owner may thus plan future
decisions accordingly.
➢Policy formation:
✓Correlation analysis also helps the government in formulating policies. For example, the
government may find a negative correlation between the tax rate and tax revenue; this will
help it implement appropriate policies to achieve an objective.
Not all correlations are due to cause and effect. Comment.
✓A causal relation between two events exists if the occurrence of
the first causes the other. The first event is called the cause and
the second event is called the effect.
✓A correlation between two variables does not necessarily imply
causation, as the correlation may arise by coincidence or as a
result of a third hidden variable/factor acting upon both. For
example, a correlation between “drowning” and “ice cream sales”
might exist as a result of a third factor, “high temperature”,
which causes more people to buy ice cream and more people to
go swimming, which may increase drowning.
✓On the other hand, if there is a causal relationship between two
variables, they will generally be related, although a non-linear
causal relationship need not show up as a linear correlation.
Linear vs Nonlinear correlation

➢Linear correlation refers to a situation in which two variables change in a fixed
proportion (by constant amounts). Hence, when the pairs of values are plotted on a
graph, they form a straight line or lie along a best-fit straight line.
➢Non-linear correlation refers to a situation in which two variables do not change in a
fixed proportion. When the pairs of values are plotted on a graph, they do not form a
straight line; instead, they form a curve or lie along a best-fit curve.
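The distinction can be illustrated with a minimal sketch. The two rules below (Y = 3X + 2 and Y = X squared) are hypothetical examples chosen for illustration, not data from these notes. The sketch compares the change in Y for each one-unit rise in X.

```python
def successive_changes(values):
    """Return the change from each value to the next one."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


x = [1, 2, 3, 4, 5]
linear_y = [3 * xi + 2 for xi in x]  # hypothetical linear rule
curved_y = [xi ** 2 for xi in x]     # hypothetical non-linear rule

linear_changes = successive_changes(linear_y)
curved_changes = successive_changes(curved_y)

print(linear_changes)  # constant changes: the points lie on a straight line
print(curved_changes)  # changing steps: the points lie on a curve
```

A constant list of changes corresponds to the straight-line case, while a list of unequal changes corresponds to the curve case.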

Simple and Multiple correlation

✓Simple correlation is the study of the relationship between two
variables only.
✓Multiple correlation refers to the study of the relationship among
more than two variables simultaneously. E.g., whether or not
sales value is related to expenditure on advertising and to price.
✓When more than two variables are involved and the relationship
between only two of them is studied, treating the other variables
as constant, the correlation is known as partial correlation. E.g.,
assessing whether or not sales value is related to expenditure on
advertising when the effect of price is controlled.
Exercise
➢1. Define correlation.
➢2. The correlation appearing between two variables, namely ‘drowning’ and ‘ice cream
sales’, may be referred to as ______
Answer: Spurious correlation
➢3. How is the study of correlation useful?
➢4. When two variables change in constant proportion it is called:
✓a. partial correlation b. non-linear correlation
✓c. constant correlation d. linear correlation
➢5. For every $5 increase in the price of a gallon of jet fuel, the cost of an LA-NYC flight
increases by about $1000. Construct a table showing the price of a gallon of jet fuel and the
cost of an LA-NYC flight at any four price levels. Does this illustrate linear or non-linear
correlation?
➢6. Simple correlation involves:
✓a. one variable only b. more than one variable
✓c. two variables only d. more than two variables
➢7. How is simple correlation different from multiple correlation?
Lesson Objectives:
✓Differentiate between positive and negative correlation
✓Outline the various degrees of correlation
✓State the various methods of estimating correlation coefficient
✓Interpret/comment on the scatter diagram.

Positive and negative correlation
➢Correlation between variables may be either positive or negative.
➢Positive correlation
✓A positive correlation is said to exist when two variables move in
the same direction, that is, when one increases the other also
increases, and when one decreases the other also decreases.
➢Negative correlation
✓A negative correlation is said to exist when two variables move in
opposite directions, that is, when one increases the other
decreases, and when one decreases the other increases.

Exercise

✓1. When two variables change in the same direction, such a
correlation is called
✓ a. Positive b. Negative c. No correlation d. All of these
✓2. When two variables change in opposite directions, such
a correlation is called
✓ a. Positive b. Negative c. No correlation d. All of these
✓3. Give two examples each of variables exhibiting positive
correlation, negative correlation and no correlation.

Degree of correlation
➢Correlation is of various degrees, depending on the value of the
correlation coefficient.
Perfect correlation:
✓It is when two variables change in the same proportion, either
positively or negatively. Plotting such a relation forms a straight line
with all the points lying on it.
▪ Thus, we have:
▪ perfect positive correlation, with a correlation coefficient of +1
▪ perfect negative correlation, with a correlation coefficient of -1
Absence of correlation:
✓It is when there is no linear relationship between two variables. The
correlation coefficient is 0.

High correlation:
✓It is when the correlation of two series is close to +1 or -1. The
coefficient of correlation lies between 0.75 and 1 for high positive
correlation, and between -0.75 and -1 for high negative correlation.
Moderate correlation:
✓It is when the correlation of two series is neither large nor small.
The coefficient of correlation lies between 0.25 and 0.75 for
moderate positive correlation, and between -0.25 and -0.75 for
moderate negative correlation.
Low correlation:
✓It is when the correlation of two series/variables is very small. The
coefficient of correlation lies between 0 and 0.25 for low positive
correlation, and between 0 and -0.25 for low negative correlation.
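The bands above can be turned into a small lookup, sketched here under one assumption: the notes do not say which band the boundary values 0.25 and 0.75 belong to, so this sketch places each boundary in the higher band. The function name `degree_of_correlation` is hypothetical.

```python
def degree_of_correlation(r: float) -> str:
    """Label a correlation coefficient using the bands in these notes.

    Assumption: a boundary value (0.25 or 0.75 in size) is placed in the
    higher band, because the notes leave the boundaries unspecified.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    if r == 0:
        return "no correlation"
    sign = "positive" if r > 0 else "negative"
    size = abs(r)
    if size == 1:
        return f"perfect {sign}"
    if size >= 0.75:
        return f"high {sign}"
    if size >= 0.25:
        return f"moderate {sign}"
    return f"low {sign}"


labels = [degree_of_correlation(r) for r in (0.9, -0.5, 0.1, -1, 0)]
print(labels)
```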
Exercise
✓1. Coefficient of correlation always lies between
✓ a. 0 & 1 b. -1 & 0 c. -1 & 1 d. None of these
✓2. When the coefficient of correlation lies between 0.25 and
0.75, it is called
✓ a. Low correlation b. Moderate c. High d. Perfect
✓ 3. When the coefficient of correlation lies between 0.75 and 1,
it is called
✓ a. Low correlation b. Moderate c. High d. Perfect
✓ 4. When the coefficient of correlation lies between 0 and 0.25,
it is called
✓ a. Low correlation b. Moderate c. High d. Perfect

➢5. If the correlation coefficient (rxy) between two variables is
zero, it implies that:
✓A. the two variables are independent
✓B. the two variables do not have negative correlation
✓C. the two variables are not linearly related
✓D. all of the above


Methods of estimating correlation coefficient
✓Scatter diagram (scatter plot)
✓Karl Pearson’s coefficient of correlation (product moment
correlation coefficient)
✓Spearman’s rank correlation coefficient

1. Scatter diagram (scatter plot)
✓A scatter diagram is a graphical representation that gives an
estimate of the degree and direction of correlation between two
series/variables.
✓In interpreting the scatter diagram, we look at the general
pattern of the points, whether they slope upward from left to
right or downward from left to right, to determine the direction
of the relation.
✓We also look at how close the points are to a fitted line, to
estimate the degree of the relation.

Activity

✓1). How would you describe the correlation between the variables
X and Y if the points of the scatter diagram tend to cluster
about a straight line sloping upward?
Answer: High positive correlation
✓2). How would you describe the correlation between the variables
X and Y if the points of the scatter diagram tend to cluster
about a straight line sloping downward?
Answer: High negative correlation

Activity
Interpret/comment on the following scatter plots.
[Figure: scatter plots of Y against X. Panel A: strong/high positive relationship. Panel B: weak/low positive relationship. Panel C: strong/high negative relationship. Panel D: weak negative relationship. Panel E: perfect negative correlation. Panel F: no relationship/correlation.]
Activity
✓The table shows the correlations for the four graphs below.
Match each graph to the correlation coefficient.
[Table and graphs not reproduced. Graphs shown: Graph D, Graph A, Graph B, Graph C.]

Lesson Objectives:
✓Calculate and interpret Karl Pearson’s correlation coefficient
(product moment correlation coefficient) using the direct method

2. Pearson’s coefficient of correlation (product moment
correlation coefficient or simple correlation coefficient)
✓Unlike the scatter diagram, which only gives an estimate of
the correlation between two variables, Pearson’s coefficient
of correlation gives a precise numerical value for the linear
relationship between two variables.
✓It thus measures the strength and direction of the linear
relationship between two variables.
✓Pearson’s correlation coefficient is denoted by r.
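✓For the direct method, it is calculated using the formula

$$r_{XY} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \times \sum (Y - \bar{Y})^2}}$$

where $\bar{X}$ and $\bar{Y}$ are the means of series X and series Y.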
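The direct method can be sketched as follows. The function name `pearson_direct` and the five data pairs are hypothetical, chosen only so the arithmetic is easy to follow by hand (both means are whole numbers).

```python
from math import sqrt


def pearson_direct(x, y):
    """Pearson's r from deviations about the actual means (direct method)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dev_x = [xi - mean_x for xi in x]  # X minus mean of X
    dev_y = [yi - mean_y for yi in y]  # Y minus mean of Y
    numerator = sum(a * b for a, b in zip(dev_x, dev_y))
    denominator = sqrt(sum(a * a for a in dev_x) * sum(b * b for b in dev_y))
    return numerator / denominator


# Hypothetical data: mean of X is 3 and mean of Y is 4.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
r = pearson_direct(X, Y)
print(round(r, 4))
```

For this data the sum of the products of deviations is 6, and the sums of squared deviations are 10 and 6, so r = 6 / sqrt(60), roughly 0.77, which the notes would class as a high positive correlation.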
Application Exercises – part B of worksheet 3, pg. 42
Lesson Objectives:
✓Calculate and interpret the Pearson’s coefficient of correlation
using the short-cut method

Calculating the Pearson’s coefficient of correlation (product moment
correlation coefficient) using the short-cut method
✓The Pearson’s correlation coefficient is calculated by the short-cut
method using the following formula:

$$r_{XY} = \frac{\sum dx\,dy - \dfrac{(\sum dx)(\sum dy)}{N}}{\sqrt{\sum (dx)^2 - \dfrac{(\sum dx)^2}{N}} \times \sqrt{\sum (dy)^2 - \dfrac{(\sum dy)^2}{N}}}$$

Where,
▪ dx = X - Ax,
▪ dy = Y - Ay,
▪ N is the total number of items,
▪ Ax and Ay are the assumed means of series X and series Y respectively.
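A minimal sketch of the short-cut method follows. The function name `pearson_shortcut`, the data, and the assumed means (2 for X, 4 for Y) are hypothetical. Any assumed means give the same r; they are chosen only to keep the deviations small.

```python
from math import sqrt


def pearson_shortcut(x, y, assumed_x, assumed_y):
    """Pearson's r from deviations about assumed means (short-cut method)."""
    n = len(x)
    dx = [xi - assumed_x for xi in x]  # dx = X - Ax
    dy = [yi - assumed_y for yi in y]  # dy = Y - Ay
    sum_dx, sum_dy = sum(dx), sum(dy)
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    numerator = sum_dxdy - (sum_dx * sum_dy) / n
    denominator = sqrt(sum_dx2 - sum_dx ** 2 / n) * sqrt(sum_dy2 - sum_dy ** 2 / n)
    return numerator / denominator


X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
r = pearson_shortcut(X, Y, assumed_x=2, assumed_y=4)
r_other_origin = pearson_shortcut(X, Y, assumed_x=0, assumed_y=10)
print(round(r, 4), round(r_other_origin, 4))
```

With Ax = 2 and Ay = 4, the sums are Σdx = 5, Σdy = 0, Σdxdy = 6, Σdx² = 15 and Σdy² = 6, so r = 6 / (sqrt(10) × sqrt(6)), the same value the direct method gives for this data.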
Application Exercises - worksheet 4, pg. 43
Lesson Objectives:
✓Calculate and interpret the Pearson’s coefficient of correlation
using the step-deviation method

Calculating the Pearson’s coefficient of correlation (product moment
correlation coefficient) using the Step-deviation method
✓The Pearson’s correlation coefficient is calculated by the step-deviation
method using the following formula:

$$r_{XY} = \frac{\sum dx'\,dy' - \dfrac{(\sum dx')(\sum dy')}{N}}{\sqrt{\sum (dx')^2 - \dfrac{(\sum dx')^2}{N}} \times \sqrt{\sum (dy')^2 - \dfrac{(\sum dy')^2}{N}}}$$

▪ Where dx = X - Ax, dy = Y - Ay, N is the total number of items, and Ax and Ay are
the assumed means of series X and series Y respectively,
▪ dx’ = dx/cx and dy’ = dy/cy, where cx and cy are the common factors
of series X and series Y respectively.
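The step-deviation method can be sketched in the same way. The function name `pearson_step_deviation`, the data, the assumed means (30 and 400) and the common factors (10 and 100) are all hypothetical values picked so that the step deviations become small whole numbers.

```python
from math import sqrt


def pearson_step_deviation(x, y, assumed_x, assumed_y, factor_x, factor_y):
    """Pearson's r from deviations divided by a common factor."""
    n = len(x)
    dx = [(xi - assumed_x) / factor_x for xi in x]  # dx' = (X - Ax) / cx
    dy = [(yi - assumed_y) / factor_y for yi in y]  # dy' = (Y - Ay) / cy
    sum_dx, sum_dy = sum(dx), sum(dy)
    numerator = sum(a * b for a, b in zip(dx, dy)) - (sum_dx * sum_dy) / n
    spread_x = sum(a * a for a in dx) - sum_dx ** 2 / n
    spread_y = sum(b * b for b in dy) - sum_dy ** 2 / n
    return numerator / (sqrt(spread_x) * sqrt(spread_y))


X = [10, 20, 30, 40, 50]
Y = [200, 400, 500, 400, 500]
r = pearson_step_deviation(X, Y, assumed_x=30, assumed_y=400,
                           factor_x=10, factor_y=100)
print(round(r, 4))
```

Here the step deviations are -2, -1, 0, 1, 2 for X and -2, 0, 1, 0, 1 for Y, so the arithmetic is as light as for single-digit data even though the original values are large.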
Application Exercises worksheet 5, pg. 43

Properties of correlation coefficient
✓r has no unit.
✓The value of the correlation coefficient lies between -1 and +1, that is,
-1 ≤ r ≤ +1.
✓A negative value for r indicates an inverse relationship (that is, the two
variables move in opposite directions), while a positive value for r indicates a
direct relationship (that is, the two variables move in the same direction).
✓If r = 0, the two variables are uncorrelated: there is no linear relationship
between them. However, other types of relationship may exist.
✓If r = 1 or r = -1, the correlation is perfect and there is an exact linear
relationship.
✓The value of r is unaffected by a change of origin or a change of scale.
✓A high value of r (close to +1 or -1) indicates a strong linear relation, and a
low value of r (close to 0) indicates a weak linear relation.
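The change-of-origin and change-of-scale property can be checked numerically. The sketch below uses hypothetical data and hypothetical shifts and scale factors; it subtracts a constant from each series and then divides or multiplies by a positive constant.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r computed from deviations about the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dev_x = [xi - mean_x for xi in x]
    dev_y = [yi - mean_y for yi in y]
    numerator = sum(a * b for a, b in zip(dev_x, dev_y))
    return numerator / sqrt(sum(a * a for a in dev_x) * sum(b * b for b in dev_y))


X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
U = [(xi - 3) / 2 for xi in X]   # change of origin (3) and scale (divide by 2)
V = [(yi - 4) * 10 for yi in Y]  # change of origin (4) and scale (multiply by 10)

r_original = pearson_r(X, Y)
r_transformed = pearson_r(U, V)
print(round(r_original, 4), round(r_transformed, 4))
```

The sketch uses positive scale factors only. Scaling one series by a negative number would keep the size of r but reverse its sign.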
Methods of estimating correlation coefficient
3. Spearman’s Rank correlation coefficient
✓There are some variables whose quantitative measurement is not
possible. These variables are known as qualitative variables, e.g.
beauty, bravery, wisdom, etc.
✓In situations where attributes cannot be expressed in numbers or
quantitative terms, their relative merit can be determined on the
basis of their order of preference or ranking.
✓In such situations, Spearman’s rank difference method is best
suited.
✓It is calculated using the formula

$$r_s = 1 - \frac{6 \sum D^2}{N^3 - N}$$

✓Where D = difference between the ranks, N = number of pairs
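The rank-difference formula can be sketched as below for the case with no repeated ranks. The function name `spearman_rank` and the two sets of ranks (imagined as two judges ranking five contestants) are hypothetical.

```python
def spearman_rank(ranks_x, ranks_y):
    """Spearman's rank correlation when no ranks are repeated."""
    n = len(ranks_x)
    # D is the difference between the two ranks given to the same item.
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks_x, ranks_y))
    return 1 - (6 * sum_d2) / (n ** 3 - n)


judge_1 = [1, 2, 3, 4, 5]  # hypothetical ranks from the first judge
judge_2 = [2, 1, 4, 3, 5]  # hypothetical ranks from the second judge
r_s = spearman_rank(judge_1, judge_2)
print(round(r_s, 4))
```

For these ranks the differences are -1, 1, -1, 1 and 0, so ΣD² = 4 and, with N = 5, the coefficient is 1 - 24/120 = 0.8.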
Application Exercises Worksheet 6, pg.44

3. Spearman’s Rank correlation coefficient in a situation of ties
(when ranks are repeated, that is, when values of items in the series are
equal)
✓Sometimes, two or more items in a series have equal ranks.
✓In such situations, the average of the two or more ranks is assigned to
each item.
✓The following formula is then applied in such cases of repeated ranks:

$$r_s = 1 - \frac{6\left[\sum D^2 + \frac{1}{12}(m_1^3 - m_1) + \frac{1}{12}(m_2^3 - m_2) + \dots\right]}{N^3 - N}$$

✓Where m = the number of times a rank is repeated, $\frac{1}{12}(m^3 - m)$ is the
correction factor, and the number of times it is added corresponds to the number of
sets of repeated ranks.
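The tied-rank procedure can be sketched as follows. Two points are assumptions: the notes do not say whether rank 1 goes to the largest or the smallest value, so this sketch gives rank 1 to the largest, and the function names and scores are hypothetical.

```python
from collections import Counter


def average_ranks(values):
    """Rank values (1 = largest, an assumption); ties share their mean rank."""
    positions = {}
    for position, value in enumerate(sorted(values, reverse=True), start=1):
        positions.setdefault(value, []).append(position)
    return [sum(positions[v]) / len(positions[v]) for v in values]


def tie_correction(values):
    """Sum of (m^3 - m) / 12 over every set of m repeated values."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)


def spearman_with_ties(x, y):
    """Spearman's rank correlation with the correction for repeated ranks."""
    n = len(x)
    ranks_x, ranks_y = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_x, ranks_y))
    correction = tie_correction(x) + tie_correction(y)
    return 1 - 6 * (sum_d2 + correction) / (n ** 3 - n)


scores_x = [50, 40, 40, 30, 20]  # hypothetical scores; 40 appears twice
scores_y = [9, 8, 7, 6, 5]       # hypothetical scores with no ties
ranks_x = average_ranks(scores_x)
r_s = spearman_with_ties(scores_x, scores_y)
print(ranks_x, round(r_s, 4))
```

The two items scoring 40 would occupy ranks 2 and 3, so each receives 2.5. Here ΣD² = 0.5 and the single correction term is (2³ - 2)/12 = 0.5, giving 1 - 6(1)/120 = 0.95.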
Application Exercises Worksheet 6, pg.44
