Lecture


Lesson 1 Correlation Analysis

Sample Exercise:
Plot the following points in the rectangular coordinate system.

1. (-3, 2)        6. (3, 3)
2. (1, -5)        7. (4, -4)
3. (-3, -5)       8. (3, 5)
4. (-2, 4)        9. (1, -3)
5. (-5, 0)       10. (0, 5)
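
If you want to check your plot, the following is a minimal sketch that draws the ten points on a text grid. The function name `ascii_plot` and the grid symbols are only illustrative choices, and the grid is assumed to run from -5 to 5 on both axes.

```python
# The ten points from the sample exercise.
points = {
    (-3, 2), (1, -5), (-3, -5), (-2, 4), (-5, 0),
    (3, 3), (4, -4), (3, 5), (1, -3), (0, 5),
}

def ascii_plot(points, lo=-5, hi=5):
    """Return a text grid: '*' marks a point, '+' the origin,
    '-' the x-axis, '|' the y-axis, and '.' an empty position."""
    rows = []
    for y in range(hi, lo - 1, -1):          # top row holds the largest y
        row = []
        for x in range(lo, hi + 1):          # left column holds the smallest x
            if (x, y) in points:
                row.append("*")
            elif x == 0 and y == 0:
                row.append("+")
            elif y == 0:
                row.append("-")
            elif x == 0:
                row.append("|")
            else:
                row.append(".")
        rows.append(" ".join(row))
    return "\n".join(rows)

print(ascii_plot(points))
```

Each point (x, y) lands x columns to the right of the vertical axis (left if x is negative) and y rows above the horizontal axis (below if y is negative).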

Bivariate Data
Data in statistics are sometimes classified according to how many variables are in a particular study. When you
conduct a study that looks at a single variable, that study involves univariate data. For example, you study a
group of students to find out their average grade.
Bivariate data arise when you study two variables. These variables are compared to find the
relationship between them. For example, age might be one variable and weight might be another variable.
Another example is temperature and ice cream sales.
Using correlation analysis, we can find out the relationship between the variables in a bivariate data set. Many
business, marketing, and social science questions and problems can be addressed using bivariate data sets. For
instance, is there a link between child obesity and family income? This is where correlation analysis is helpful.

Correlation analysis is a method of statistical evaluation used to study the strength of a relationship
between two numerically measured, continuous variables (e.g., height and weight). This type of analysis
is useful when a researcher wants to establish whether there are possible connections between variables.
A scatterplot, or scatter diagram, is a type of mathematical diagram that uses Cartesian coordinates to display
values for two variables in a set of data. The independent variable is plotted along the horizontal axis (x) and the
dependent variable is plotted along the vertical axis (y). A scatterplot provides a visual representation of the
correlation, or relationship, between the two variables. It shows the direction and strength of the relationship
between the variables.
All correlations have two properties: direction and strength.

• Positive correlation: Both variables move in the same direction. In other words, as one variable
increases, the other variable also increases. As one variable decreases, the other variable also
decreases. An upward trend in the points indicates a positive correlation.

Examples: IQ vs. academic performance; salary vs. job satisfaction

• Negative correlation: The variables move in opposite directions. As one variable increases, the other
variable decreases. As one variable decreases, the other variable increases. A downward trend in the points
indicates a negative correlation.

Examples: academic performance vs. number of hours watching TV; stress vs. job performance

• Zero or no correlation: There is no apparent relationship between the two variables.

Examples: shoe size vs. salary; socio-economic status vs. grades
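
The three directions above can be sketched in code. This is only an illustration, not a formal test: it assumes direction can be read from the sign of the summed products of deviations from the means, and the function name `direction` and all data values are made up for the example.

```python
def direction(xs, ys):
    """Classify the direction of a relationship from the sign of
    the sum of (x - mean_x) * (y - mean_y) over all data pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if s > 0:
        return "positive"   # points trend upward
    if s < 0:
        return "negative"   # points trend downward
    return "zero"           # no linear trend (exactly zero is rare in real data)

# Hypothetical data, invented for illustration only.
temperature = [20, 25, 30, 35]
ice_cream_sales = [100, 150, 210, 260]
hours_tv = [1, 2, 3, 4]
grades = [90, 85, 80, 70]

print(direction(temperature, ice_cream_sales))
print(direction(hours_tv, grades))
```

With real data the sum is almost never exactly zero, so a value close to zero is usually read as no apparent relationship.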

NORHAN A. SARIP 1
The strength of a correlation is determined by the size of its numerical value, regardless of sign. It may be
perfect, very high, moderately high, moderately low, very low, or zero.

The diagram above shows some examples of scatter plots and correlations.

Creating Scatterplot in Spreadsheet or Excel

You can create a scatterplot from your data using Excel. Here are the steps:

• Select the worksheet range that contains the data.
• On the Insert tab, click the XY (Scatter) chart command button.
• Select the chart subtype that doesn't include any lines.
• Confirm the chart data organization.
• Annotate the chart, if appropriate. Add those little flourishes to your chart that will make it more
attractive and readable. For example, you can use the Chart Title and Axis Titles buttons to annotate the
chart with a title and with descriptions of the axes used in the chart.
• If you want to add a trendline, click the Add Chart Element menu's Trendline command button.
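
To see what a straight trendline represents, here is a minimal sketch of a least-squares line fit. It assumes the linear trendline type only (Excel offers other types as well), and the function name `linear_trendline` and the data are made up for illustration.

```python
def linear_trendline(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # undefined if all x values are equal
    intercept = mean_y - slope * mean_x    # the line passes through (mean_x, mean_y)
    return slope, intercept

# Hypothetical data lying exactly on the line y = 2x + 1.
slope, intercept = linear_trendline([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
```

A positive slope goes with an upward trend in the points and a negative slope with a downward trend.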

Lesson 2 Pearson Product-Moment Correlation

Pearson Correlation Coefficient

The most common coefficient of correlation is known as the Pearson product-moment correlation coefficient,
or Pearson’s r. It is a measure of the linear correlation (dependence) between two variables X and Y, giving a
value between +1 and −1. It was developed by Karl Pearson from a related idea introduced by Francis Galton
in the 1880s.

When conducting a statistical test between two variables, it is a good idea to compute the Pearson correlation
coefficient to determine how strong the relationship between the two variables is. If the coefficient
is negative, the variables are negatively correlated: as one value increases, the other decreases. If the
coefficient is positive, the variables are positively correlated: both values increase or decrease together.
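
For reference, Pearson's r can be computed directly from its usual definition: the sum of the products of the deviations from the means, divided by the square root of the product of the two sums of squared deviations. The sketch below follows that definition. The function name `pearson_r` and the data are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from its definition:
    r = sum((x - mean_x)(y - mean_y)) / sqrt(sum((x - mean_x)^2) * sum((y - mean_y)^2))"""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_r(x, y)
print(round(r, 2))  # 0.8
```

Here the sum of products is 8 and both sums of squares are 10, so r = 8 / 10 = 0.8, a positive correlation.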

To determine the strength of the computed r, use its absolute value |r|:

If r = 0                   no association or correlation
If 0 < |r| < 0.25          very low correlation
If 0.25 < |r| < 0.50       moderately low correlation
If 0.50 < |r| < 0.75       moderately high correlation
If 0.75 < |r| < 1          very high or strong correlation
If r = ±1                  perfect correlation
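
The scale above can be written as a small lookup function. This is a sketch under one assumption: the scale does not say where the exact boundary values 0.25, 0.50, and 0.75 belong, so they are placed in the higher band here. The function name `strength` is illustrative.

```python
def strength(r):
    """Return the verbal strength label for a correlation coefficient r."""
    a = abs(r)                      # strength ignores the sign of r
    if a > 1:
        raise ValueError("r must lie between -1 and +1")
    if a == 0:
        return "no association or correlation"
    if a == 1:
        return "perfect correlation"
    # Assumption: boundary values (0.25, 0.50, 0.75) fall in the higher band.
    if a < 0.25:
        return "very low correlation"
    if a < 0.50:
        return "moderately low correlation"
    if a < 0.75:
        return "moderately high correlation"
    return "very high or strong correlation"

print(strength(0.76))   # very high or strong correlation
print(strength(-0.30))  # moderately low correlation
```

The sign of r gives the direction and the label gives the strength, so r = -0.30 is a moderately low negative correlation.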

Correlation Coefficient Software

Most spreadsheet editors, such as Excel, Google Sheets, and OpenOffice, can compute correlations for you.
The illustration below shows an example:

In Excel, click on an empty cell where you want the correlation coefficient to appear. Then enter
the following formula.

=PEARSON(array1, array2)

Simply replace 'array1' with the range of cells containing the first variable and replace 'array2' with the
range of cells containing the second variable.

For the example above, the Pearson correlation coefficient (r) is 0.76.
