Multiple Linear Regression


In the multiple linear regression model there are at least two independent variables. A multiple linear regression
model with two independent variables looks like:

Y = b0 + b1 X1 + b2 X2 + U
The model has three parameters, b0, b1 and b2, that are to be estimated; U is the random error term. One of the crucial
assumptions for estimating a multiple regression is that X1 and X2 must not be perfectly correlated, either positively
or negatively. If the correlation coefficient between X1 and X2 is either +1 or -1, the
model cannot be estimated. This is called the problem of perfect multicollinearity.
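
To see why perfect multicollinearity blocks estimation, the sketch below builds a small made-up data set (the values of x1 and x2 are hypothetical, not taken from these notes) in which X2 is an exact multiple of X1. The design matrix then has fewer independent columns than parameters, so b1 and b2 cannot be separated.

```python
import numpy as np

# Hypothetical data: x2 is an exact multiple of x1, so corr(x1, x2) = +1.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = 2.0 * x1

# Design matrix with an intercept column: [1, X1, X2].
X = np.column_stack([np.ones_like(x1), x1, x2])

corr = np.corrcoef(x1, x2)[0, 1]
rank = np.linalg.matrix_rank(X)

print(f"correlation between X1 and X2 = {corr:.3f}")
print(f"rank of design matrix = {rank}, parameters to estimate = {X.shape[1]}")
# A rank below the number of parameters means the normal equations
# have no unique solution, so b0, b1, b2 cannot be estimated.
```

Any pair of the form X2 = c * X1 + d gives the same result; the factor 2 here is arbitrary.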

The following table gives the data on the quantity demanded, price and income of a commodity for the period 1996 to
2005.

Year    Demand (Y)    Price (X)    Income (I)
1996    100           5            1000
1997    75            7            600
1998    80            6            1200
1999    70            6            500
2000    50            8            300
2001    65            7            400
2002    90            5            1300
2003    100           4            1100
2004    110           3            1300
2005    60            9            300

INTERPRETATION:
i. The value of R² is 0.894, indicating that 89.4 per cent of the variation in demand is explained
by price and income.
ii. The estimated regression equation, as obtained in the table, may be written as:
Y = 111.692 - 7.188 X + 0.0143 I
Where Y = Demand
X = Price
I = Income
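
The fit can be checked directly from the demand table. The sketch below is one way to do it with ordinary least squares in NumPy; the notes do not say which software produced the original output, so the variable names and the choice of `numpy.linalg.lstsq` are assumptions.

```python
import numpy as np

# Data from the demand table (1996 to 2005).
demand = np.array([100, 75, 80, 70, 50, 65, 90, 100, 110, 60], dtype=float)
price = np.array([5, 7, 6, 6, 8, 7, 5, 4, 3, 9], dtype=float)
income = np.array([1000, 600, 1200, 500, 300, 400, 1300, 1100, 1300, 300],
                  dtype=float)

# Design matrix with an intercept column: Y = b0 + b1*X + b2*I + U.
X = np.column_stack([np.ones_like(price), price, income])
coef, *_ = np.linalg.lstsq(X, demand, rcond=None)
b0, b1, b2 = coef

# R^2 = 1 - residual sum of squares / total sum of squares.
fitted = X @ coef
ss_total = np.sum((demand - demand.mean()) ** 2)
ss_resid = np.sum((demand - fitted) ** 2)
r_squared = 1.0 - ss_resid / ss_total

print(f"Y = {b0:.3f} + ({b1:.3f}) X + ({b2:.4f}) I")
print(f"R^2 = {r_squared:.3f}")
```

The negative price coefficient and positive income coefficient match the usual reading of a demand function: demand falls as price rises and rises with income.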
Solution:
a). The regression model is Y = β0 + β1 X1 + β2 X2 + β3 X3 + β4 X4 + β5 X5
H0: β1 = β2 = β3 = β4 = β5 = 0, i.e. Sincerity, Excitement, Competence,
Sophistication and Ruggedness do not affect Customer brand engagement.
H1: At least one of β1, β2, β3, β4, β5 is not equal to 0, i.e. at least one of Sincerity,
Excitement, Competence, Sophistication and Ruggedness does affect Customer brand
engagement.
The p-value for the regression is 0.038, which is less than 0.05, hence we reject the null
hypothesis in favour of the alternative hypothesis, indicating that the regression model is
statistically significant.
b). Dependent variable: Customer brand engagement.
Independent variables: Sincerity, Excitement, Competence, Sophistication,
Ruggedness.
c). Excitement is the strongest predictor of the dependent variable, because it has
the largest absolute value among the standardized coefficients.

d). R² = 0.863, indicating that 86.3 % of the variance of the dependent variable
(Customer brand engagement) can be predicted from the independent variables
(Sincerity, Excitement, Competence, Sophistication, Ruggedness).

NOTE: If the R-squared value is not given, find it using the formula
R² = Regression sum of squares / Total sum of squares

e). Y (Customer brand engagement) = 13.318 − 0.226 X1 + 0.319 X2 − 0.052 X3 + 0.059 X4 + 0.127 X5,
where X1 is Sincerity, X2 is Excitement, X3 is Competence, X4 is Sophistication and X5 is Ruggedness.

f). The p-value of Sincerity is 0.046, which is less than 0.05, so Sincerity is a significant
predictor of Customer brand engagement.
The p-value of Excitement is 0.025, which is less than 0.05, so Excitement is a
significant predictor of Customer brand engagement.
The p-value of Competence is 0.679, which is greater than 0.05, so Competence is not a
significant predictor of Customer brand engagement.
The p-value of Sophistication is 0.722, which is greater than 0.05, so Sophistication
is not a significant predictor of Customer brand engagement.
The p-value of Ruggedness is 0.499, which is greater than 0.05, so Ruggedness is not a
significant predictor of Customer brand engagement.
g). Sample size = 177 + 1 = 178.
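
The decision rule used in parts (f) and (g) can be written out as a short sketch. The p-values are the ones quoted in the solution; the 0.05 level is the one the solution compares against. Treating 177 as the total degrees of freedom from the ANOVA table is an assumption, since that table is not reproduced in these notes.

```python
# Significance level used throughout the solution.
ALPHA = 0.05

# Coefficient p-values as quoted in part (f).
p_values = {
    "Sincerity": 0.046,
    "Excitement": 0.025,
    "Competence": 0.679,
    "Sophistication": 0.722,
    "Ruggedness": 0.499,
}

# A predictor is significant when its p-value is below ALPHA.
significant = {name: p < ALPHA for name, p in p_values.items()}
for name, p in p_values.items():
    verdict = "significant" if significant[name] else "not significant"
    print(f"{name}: p = {p:.3f} -> {verdict}")

# Assumption: 177 is the total degrees of freedom (n - 1) in the ANOVA table.
total_df = 177
n_obs = total_df + 1
print(f"sample size = {n_obs}")
```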
Ans).

a). There are 4 independent variables, i.e., X1, X2, X3 and X4.

b). The number of observations is 29 + 1 = 30.

c).The regression equation is:

Y ( Sales ) = 20.273 + 2.6250 X1 + 2.880 X2 + 4.282 X3 – 0.8076 X4

d). Y ( Sales ) = 20.273 + 2.6250 X1 + 2.880 X2 + 4.282 X3 – 0.8076 X4

Y (Sales) = 20.273 + 2.6250 (100) + 2.880 (150) + 4.282 (50) – 0.8076 (10) = 920.797.

e). The p-value in the first table is 0.000, which is less than 0.05. Hence we reject the null hypothesis.

Therefore the regression model is statistically significant.

f). i). The p-value of X1 is 0.1337, which is greater than 0.05, so the impact of X1 on sales is insignificant.

ii). The p-value of X2 is 0.3594, which is greater than 0.05, so the impact of X2 on sales is insignificant.

iii). The p-value of X3 is 0.0121, which is less than 0.05, indicating that X3 significantly influences sales.

iv). The p-value of X4 is 0.0110, which is less than 0.05, indicating that X4 significantly influences sales.
g). Multiple coefficient of determination = Regression sum of squares / Total sum of squares = 22975.2376 / 32057.00 = 0.716699

This means 71.67 % of the variance of sales can be predicted from the variables X1, X2, X3, X4.
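
The arithmetic in parts (d) and (g) can be reproduced with a few lines. This is only a sketch: the coefficients, the inputs X1 = 100, X2 = 150, X3 = 50, X4 = 10 and the two sums of squares are copied from the answer above, and the variable names are illustrative.

```python
# Part (d): plug the given values into the fitted equation.
intercept = 20.273
coefficients = [2.6250, 2.880, 4.282, -0.8076]   # X1, X2, X3, X4
x_values = [100, 150, 50, 10]

# The intercept must be included in the sum along with the four slope terms.
predicted_sales = intercept + sum(b * x for b, x in zip(coefficients, x_values))
print(f"predicted sales = {predicted_sales:.3f}")

# Part (g): R^2 = regression sum of squares / total sum of squares.
ss_regression = 22975.2376
ss_total = 32057.00
r_squared = ss_regression / ss_total
print(f"R^2 = {r_squared:.6f} ({100 * r_squared:.2f} % of variance explained)")
```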
NOTE:

a). The relative importance of the independent variables can be judged from the absolute
values of the standardized regression coefficients. If the absolute value of the standardized coefficient of X2 is larger
than the others, then X2 is relatively more important than the other variables.

b). Y ( Sales ) = 20.273 + 2.6250 X1 + 2.880 X2 + 4.282 X3 – 0.8076 X4

This regression equation indicates that X4 is negatively related to sales. If X4 goes up by one unit, sales go down
by 0.8076 units, keeping X1, X2 and X3 constant. Similarly, X1 is positively related to sales: if X1 goes up by one unit,
sales go up by 2.6250 units, keeping the other variables constant. X2 and X3 can be analyzed in the same way; both are positively related to sales.
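
Note (a) relies on standardized coefficients, which the sales example does not list. As an illustration only, the sketch below computes them for the demand data given earlier, using the common definition (unstandardized coefficient times the standard deviation of the predictor, divided by the standard deviation of the dependent variable). Whether a particular software package reports exactly this quantity is an assumption.

```python
import numpy as np

# Demand data from the table earlier in these notes.
demand = np.array([100, 75, 80, 70, 50, 65, 90, 100, 110, 60], dtype=float)
price = np.array([5, 7, 6, 6, 8, 7, 5, 4, 3, 9], dtype=float)
income = np.array([1000, 600, 1200, 500, 300, 400, 1300, 1100, 1300, 300],
                  dtype=float)

# Unstandardized coefficients by ordinary least squares.
X = np.column_stack([np.ones_like(price), price, income])
coef, *_ = np.linalg.lstsq(X, demand, rcond=None)
_, b_price, b_income = coef

# Standardized coefficient = b_j * sd(X_j) / sd(Y), using sample SDs.
sd_y = demand.std(ddof=1)
beta_price = b_price * price.std(ddof=1) / sd_y
beta_income = b_income * income.std(ddof=1) / sd_y

print(f"standardized coefficient, price:  {beta_price:.3f}")
print(f"standardized coefficient, income: {beta_income:.3f}")

# The predictor with the larger absolute value is relatively more important.
more_important = "price" if abs(beta_price) > abs(beta_income) else "income"
print(f"relatively more important predictor: {more_important}")
```

Standardizing removes the units, which is why price and income can be compared here even though income is measured on a much larger scale.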

QUESTION:

1. In a study to predict the sale price of a residential property (dollars), data are taken on 20 randomly selected
properties. The potential predictors in the study are appraised land value (dollars), appraised value of
improvements (dollars) and area of property living space (square feet), and a 0.05 significance level is
chosen for hypothesis testing. The analysis is carried out using Excel and is as shown below:

Based on above information answer the following questions


a) List the independent and dependent variables.
b) Frame the necessary hypotheses.
c) Comment on the value and strength of the model.
d) Frame the equation of regression.
e) Is the hypothesis in (b) accepted or rejected (explain)?
f) Find the number of observations.
g) Find the multiple coefficient of determination and interpret it.

QUESTION:
2. Data were collected on 200 high school students and their scores on various tests, including science, math,
reading and social studies. The variable female is a dichotomous variable coded 1 if the student was female and
0 if male.
Based on above information answer the following questions
a) List the independent and dependent variables.
b) Frame the necessary hypotheses.
c) Comment on the value and strength of the model.
d) Frame the equation of regression.
e) Is the hypothesis in (b) accepted or rejected (explain)?

Theory

3. What is regression? Explain the difference between simple and multiple regression (at least 4
points).
