Chapter 5 Sta404


STA 404 : STATISTICS FOR BUSINESS AND SOCIAL SCIENCES CHAPTER 5

CHAPTER 5: BIVARIATE ANALYSIS

5.1 CORRELATION

1. Correlation is a statistical measure that indicates the extent to which two or
more variables fluctuate together.

2. It measures the linear relationship between two variables.

3. Correlation is measured by a correlation coefficient.

Correlation Coefficient

Pearson Product-Moment, r
Spearman Rank, r_s

5.1.1 SCATTER DIAGRAM

1. It is a plot of 2 variables.

2. It is a graphical method that helps determine whether a linear relationship
exists between two variables.

Example 1: The table below shows the advertising expenditures and sales volumes
for a corporation during 10 randomly selected months. Plot the scatter diagram
to see the nature of the relationship.

Month    Advertising expenditure    Sales volume
         (RM'000), x                (RM'000), y
1        1.2                        101
2        0.8                        92
3        1.0                        110
4        1.3                        120
5        0.7                        90
6        0.8                        82
7        1.0                        93
8        0.6                        75
9        0.9                        91
10       1.1                        105

LECTURER: U. H LAU 1

3. Various trends / patterns of scatter plots:

(a) No relationship, r = 0
(b) Curvilinear relationship, r = 0
(c) Positive linear relationship, r = 0.8
(d) Negative linear relationship, r = -0.6
(e) Perfect positive linear relationship, r = 1
(f) Perfect negative linear relationship, r = -1


5.1.2 PEARSON PRODUCT MOMENT CORRELATION COEFFICIENT

1. It is used to analyze the linear relationship between two quantitative
variables.

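2. Pearson Product-Moment Correlation Coefficient:

$$ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} $$

or

$$ r = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sqrt{\left[\sum x^2 - \dfrac{\left(\sum x\right)^2}{n}\right]\left[\sum y^2 - \dfrac{\left(\sum y\right)^2}{n}\right]}} $$

$$ -1 \le r \le 1 $$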

3. The sign of the coefficient indicates the direction of the relationship
(positive or negative).

Positive sign → Positive relationship
(the two variables move in harmony)

Negative sign → Negative relationship
(the two variables move in opposite or inverse directions)

4. The magnitude of the coefficient indicates the strength of the relationship
(strong or weak). The strength increases as r moves toward -1 or 1 and decreases
as r moves toward 0.


|r|            Strength of relationship
< 0.20         Almost negligible correlation
0.20 – 0.40    Weak correlation
0.40 – 0.70    Moderate correlation
0.70 – 0.90    Strong correlation
> 0.90         Very strong correlation

Example 2: Find the Pearson Product-Moment Correlation Coefficient, r, given

Σx = 11150, Σy = 1130, Σx² = 7247500,
Σy² = 71100, Σxy = 708750, n = 20

and interpret the answer.

x = Monthly salary
y = Entertainment expenditure
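
The calculation in Example 2 can be checked by substituting the given totals into the first formula for r. The following is a minimal Python sketch; the function name `pearson_r_from_sums` and its parameter names are illustrative and not part of the notes.

```python
import math

def pearson_r_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Pearson r computed from summary totals (computational formula)."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator

# Totals given in Example 2 (x = monthly salary, y = entertainment expenditure)
r = pearson_r_from_sums(n=20, sum_x=11150, sum_y=1130,
                        sum_x2=7247500, sum_y2=71100, sum_xy=708750)
print(round(r, 3))  # 0.911
```

Since r is positive and |r| > 0.90, the strength table above classifies this as a very strong positive correlation between monthly salary and entertainment expenditure.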

5.2 SIMPLE LINEAR REGRESSION

5.2.1 Dependent & Independent Variable

1. Independent Variable (predictor)
– set by the experimenter
– does not depend on other variables

2. Dependent Variable
– depends on other variables

3. The two variables are tied together by a mathematical model:

Y = a + bX    where a = the y-intercept
                    b = the slope

Example 3:

Identify the independent variable (x) and dependent variable (y).

(a) The carbon monoxide level in the blood is to be related to the
number of cigarettes a person smokes per day.
(b) A market analyst wishes to relate local advertising expenses with
product sales in several areas of the state.
(c) Age of respondent and blood pressure measurement.
(d) Hours of study and statistics test score.


5.2.2 SIMPLE LINEAR REGRESSION EQUATION

1. The linear regression equation takes the following form:

y = a + bx    where x = independent variable
                    y = dependent variable
                    a = y-intercept
                    b = slope = Δy / Δx

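2. Using the Least Squares Method, the regression coefficients are determined
as follows:

$$ b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}, \qquad a = \frac{\sum y}{n} - b\,\frac{\sum x}{n} = \bar{y} - b\bar{x} $$

or

$$ b = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sum x^2 - \dfrac{\left(\sum x\right)^2}{n}}, \qquad a = \frac{\sum y}{n} - b\,\frac{\sum x}{n} = \bar{y} - b\bar{x} $$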

Example 4: By using the data in Example 1,

Σx = 9.4, Σx² = 9.28, Σxy = 924.8,
Σy = 959, Σy² = 93 569, n = 10

Find the least squares regression line.
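
The least squares formulas can be applied directly to the raw data of Example 1. The sketch below is one way to do it in plain Python; the variable names are illustrative, and the totals it computes agree with those quoted in Example 4.

```python
# Data from Example 1:
# x = advertising expenditure (RM'000), y = sales volume (RM'000)
x = [1.2, 0.8, 1.0, 1.3, 0.7, 0.8, 1.0, 0.6, 0.9, 1.1]
y = [101, 92, 110, 120, 90, 82, 93, 75, 91, 105]

n = len(x)
sum_x = sum(x)                                  # 9.4
sum_y = sum(y)                                  # 959
sum_x2 = sum(xi ** 2 for xi in x)               # 9.28
sum_xy = sum(xi * yi for xi, yi in zip(x, y))   # 924.8

# Least squares slope and intercept
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * sum_x / n

print(f"y = {a:.2f} + {b:.2f}x")  # y = 46.49 + 52.57x
```

The slope is computed first because the intercept formula a = ȳ - b·x̄ depends on it.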


3. Interpretation of regression coefficients

y-intercept, a

It gives the value of the dependent variable (y) when the independent
variable (x) is zero.

→ When x = 0, then y = a.

The y-intercept, a, can always be interpreted mathematically, but the
interpretation is not necessarily logical in the context of the data.

Slope, b

b = (change in y) / (change in x) = Δy / Δx

It tells how much the dependent variable (y) changes when the
independent variable (x) changes by 1 unit.

→ When x changes by 1 unit, y changes by b units.

i) If b is positive, each additional unit of x is associated with an
increase in y.
ii) If b is negative, each additional unit of x is associated with a
decrease in y.


Example 5:

Interpret the regression coefficients in the following regression equation:

y = 13.919 + 0.07638x    where x = monthly income (RM)
                               y = monthly entertainment expenditure (RM)

Example 6:

Referring to the following regression line, interpret the intercept and slope
coefficients.

y = 46.49 + 52.57x    where x = advertising expenditure (RM'000)
                            y = sales volume (RM'000)

5.2.3 PREDICTION

1. By using the regression line,

ŷ = a + bx    where ŷ = estimated, predicted or forecast y.

Example 7:

By using the regression line below, predict the monthly entertainment
expenditure if your monthly income is RM1000.

y = 13.919 + 0.07638x    where x = monthly income (RM)
                               y = monthly entertainment expenditure (RM)

Example 8:

By using the regression line below, predict the sales volume if the
corporation has budgeted RM10,000 for advertising in a month.

y = 46.49 + 52.57x    where x = advertising expenditure (RM'000)
                            y = sales volume (RM'000)
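
Prediction is a matter of substituting x into the fitted line, taking care that x is in the same units as the regression equation. The sketch below evaluates Examples 7 and 8; the helper `predict` is a hypothetical name introduced only for illustration.

```python
def predict(a, b, x):
    """Return the predicted value y-hat = a + b*x."""
    return a + b * x

# Example 7: monthly income of RM1000 (x is measured in RM)
entertainment = predict(13.919, 0.07638, 1000)

# Example 8: RM10,000 of advertising corresponds to x = 10,
# because x is measured in RM'000
sales = predict(46.49, 52.57, 10)

print(round(entertainment, 2))  # 90.3, i.e. about RM90.30
print(round(sales, 2))          # 572.19, i.e. about RM572,190
```

Note that x = 10 in Example 8 lies well outside the range of advertising expenditures observed in Example 1 (0.6 to 1.3), so this forecast is an extrapolation and should be treated with caution.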


5.3 COEFFICIENT OF DETERMINATION

1. The coefficient of determination, r², measures how well the x-variable helps
in predicting y. It is the proportion of the variation in y that is explained by
x, usually stated as a percentage. The closer the value is to 1, the better.

Example 9:

Referring to Example 2,

r = 0.911
r² = 0.829 (using the unrounded value of r; squaring the rounded 0.911 gives 0.830)

This implies that 82.9% of the variation in entertainment expenditure (y) among
respondents is explained by their monthly income (x).
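
To avoid rounding error, r² can be computed directly from the Example 2 totals rather than by squaring a rounded r. The following is a small Python sketch under that approach; the variable names are illustrative.

```python
# Totals from Example 2
n, sum_x, sum_y = 20, 11150, 1130
sum_x2, sum_y2, sum_xy = 7247500, 71100, 708750

numerator = n * sum_xy - sum_x * sum_y
ss_x = n * sum_x2 - sum_x ** 2
ss_y = n * sum_y2 - sum_y ** 2

# r squared is the square of the numerator over the product under the root
r_squared = numerator ** 2 / (ss_x * ss_y)

print(f"{r_squared:.3f}")         # 0.829
print(f"{r_squared * 100:.1f}%")  # 82.9%
```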


Example 10: (SPSS output)

A car dealer wants to find the relationship between the odometer reading (in
miles) and the selling price (in $) of used cars. A random sample of 100 cars is
selected, and the data recorded. Perform the analysis based on the following
SPSS output.

Model Summary(b)

Model    R          R Square    Adjusted R Square    Std. Error of the Estimate
1        .806(a)    .650        .647                 151.56875
a. Predictors: (Constant), odometer
b. Dependent Variable: price

Coefficients(a)

                   Unstandardized Coefficients    Standardized Coefficients
Model              B           Std. Error         Beta                         t          Sig.
1   (Constant)     6533.383    84.512                                          77.307     .000
    odometer       -.031       .002               -.806                        -13.495    .000
a. Dependent Variable: price

a) Find the regression equation that relates the two variables.
b) Interpret the regression coefficients.
c) Determine and interpret the coefficient of determination.
f) Predict the selling price of a used car if its odometer reading is 35 000 miles.
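
The regression equation for Example 10 is read from the B column of the Coefficients table, and the coefficient of determination from the R Square column of the Model Summary. The sketch below shows how those values might be used; the names are illustrative, and because the slope is printed to only three decimal places, the prediction is approximate.

```python
# Values read from the SPSS output in Example 10
intercept = 6533.383   # B for (Constant)
slope = -0.031         # B for odometer (rounded in the output)
r_square = 0.650       # R Square from the Model Summary

def predicted_price(odometer_miles):
    """Predicted selling price ($) for a given odometer reading (miles)."""
    return intercept + slope * odometer_miles

price_35000 = predicted_price(35000)
print(round(price_35000, 2))   # 5448.38
print(f"{r_square:.1%}")       # 65.0%
```

Read this way, each additional mile on the odometer is associated with a decrease of about $0.031 in selling price, and about 65.0% of the variation in price is explained by the odometer reading.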


Example 11: (SPSS output )

An automobile manufacturer would like to investigate how the fuel consumption
of a car changes as its speed increases. The sample data collected are
displayed below.
Speed (km/h) Fuel (lit/100km)
60 5.90
70 6.30
80 7.40
90 7.57
100 8.27
110 9.03
120 9.57
130 10.79
140 11.77
150 12.83
160 13.5
170 13.95
180 14.32
190 15.25
200 16.27

The regression analysis produces the following SPSS output:
Model Summary(b)

Model    R          R Square    Adjusted R Square    Std. Error of the Estimate
1        .996(a)    .992        .991                 .31508
a. Predictors: (Constant), Speed (km/h)
b. Dependent Variable: Fuel Consumption (lit/100km)

Coefficients(a)

                     Unstandardized Coefficients    Standardized Coefficients
Model                B        Std. Error            Beta                         t         Sig.
1   (Constant)       1.010    .258                                               3.914     .002
    Speed (km/h)     X        .002                  .996                         40.192    .000
a. Dependent Variable: Fuel Consumption (lit/100km)

a) Find the value of X.


b) Show that the Pearson correlation coefficient is 0.996.
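
Both parts of Example 11 can be checked from the raw data with the least squares and Pearson formulas of this chapter. The sketch below assumes that X stands for the unstandardized coefficient B of Speed, which is an interpretation of the table layout rather than something the notes state; the variable names are illustrative.

```python
import math

# Data from Example 11
speed = list(range(60, 201, 10))   # km/h: 60, 70, ..., 200
fuel = [5.90, 6.30, 7.40, 7.57, 8.27, 9.03, 9.57, 10.79,
        11.77, 12.83, 13.5, 13.95, 14.32, 15.25, 16.27]   # lit/100km

n = len(speed)
sum_x, sum_y = sum(speed), sum(fuel)
sum_x2 = sum(x ** 2 for x in speed)
sum_y2 = sum(y ** 2 for y in fuel)
sum_xy = sum(x * y for x, y in zip(speed, fuel))

numerator = n * sum_xy - sum_x * sum_y
ss_x = n * sum_x2 - sum_x ** 2
ss_y = n * sum_y2 - sum_y ** 2

slope = numerator / ss_x                     # least squares slope, b
intercept = sum_y / n - slope * sum_x / n    # least squares intercept, a
r = numerator / math.sqrt(ss_x * ss_y)       # Pearson correlation coefficient

print(f"slope = {slope:.5f}")          # slope = 0.07568
print(f"intercept = {intercept:.3f}")  # intercept = 1.010
print(f"r = {r:.3f}")                  # r = 0.996
```

The computed intercept matches the (Constant) entry of 1.010 and r matches the R value of .996 in the Model Summary, which supports reading X as the slope.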
