Practical 1-3 SPSS


Name: RAGINI

Roll No: 21026765023


Group: A

Practical 1

Aim: To carry out simple linear regression on the basis of given data.

Problem: The following data give the house price in lakhs (Y) and the area in square yards (X) of
properties of a realty firm. Fit a simple linear regression model to the data and carry out the analysis.
Y X Y X Y X Y X
186 175 182 167 162 156 179 160
180 168 162 160 192 180 170 149
160 154 169 165 185 167 170 160
186 166 176 167 163 157 165 148
163 162 180 175 185 167 165 154
172 152 157 157 170 157 169 171
192 179 170 172 176 168 171 165
170 163 186 181 176 167 192 175
174 172 180 166 160 145 176 161
191 170 188 181 167 156 168 162
182 170 153 148 157 153 169 162
178 147 179 169 180 162 184 176
181 165 175 170 172 156 171 160
168 162 165 157 184 174 161 158
162 154 156 162 185 160 185 175
188 166 185 174 165 152 184 174
168 167 172 168 181 175 179 168
183 174 166 162 170 169 184 177
188 175 179 159 161 149 175 158
166 164 181 155 188 176 173 161
180 163 176 171 181 165 164 146
176 163 170 159 156 143 181 168
185 171 165 164 161 158 187 178
169 161 183 175 152 141 181 170

Theory:
Simple Linear Regression:
Model: y = β0 + β1x + ε,
where the intercept β0 and the slope β1 are unknown constants, called parameters, which are to be
estimated by the method of least squares, and ε is a random error component.
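
For reference (a standard least-squares result, not part of the SPSS output), the estimates minimize the residual sum of squares and have the closed form:

\[
\hat\beta_1 = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2},
\qquad
\hat\beta_0 = \bar{y} - \hat\beta_1\,\bar{x}.
\]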

Assumptions:
1. There is a linear relationship between the response (y) and regressor (x).
2. The errors are assumed to be Normally distributed with mean 0 and unknown variance σ2,
i.e., εi ~ N(0, σ2) for all i.
3. The error terms are uncorrelated, which implies the absence of autocorrelation, i.e.,
Cov(εi, εj) = 0 for all i ≠ j.
4. There is no multicollinearity among the regressors, and the errors are homoscedastic (constant variance).
The model along with the above assumptions is known as Classical Linear Regression Model (CLRM).
Coefficient of determination (R2): It tells us the proportion (or percentage) of the variation in the
response that is explained by the regressor x. The value of R2 lies between 0 and 1, and values of R2
close to 1 imply that most of the variability in y is explained by the regression model. Note that R2
never decreases (and usually increases) when new regressor variables are added.

The Adjusted R2 corrects for the number of regressors: it reflects the percentage of variation
explained after penalizing the model for regressors that do not genuinely improve the fit.
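
In symbols (standard definitions; n observations, p regressors):

\[
R^2 = \frac{SS_{\mathrm{Reg}}}{SS_{\mathrm{Total}}} = 1 - \frac{SS_{\mathrm{Res}}}{SS_{\mathrm{Total}}},
\qquad
R^2_{\mathrm{adj}} = 1 - (1-R^2)\,\frac{n-1}{n-p-1}.
\]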

Analysis of Variance (ANOVA): It is based on partitioning the total variability in the response
variable in order to draw inferences about the significance of the regression.

Hypothesis: To test whether the regressor is significant,
i.e., to test H0: β1 = 0 against H1: β1 ≠ 0.

Test criteria: If the p-value < 0.05, we reject H0 at the 5% level of significance and conclude, on the
basis of the given data, that the regressor is statistically significant.

Steps:
Analyze → Regression → Linear → Dependent: House_price → Independent(s): Area →
Statistics → Estimates, Model fit, Descriptives → Continue → OK
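
Equivalently, the same fit can be reproduced outside SPSS. Below is a minimal Python sketch using statsmodels; the variable names are illustrative, and only the first five of the 96 (Y, X) pairs from the problem table are keyed in for brevity:

```python
# A minimal sketch (not the SPSS procedure itself) of the same OLS fit in Python.
import statsmodels.api as sm

house_price = [186, 180, 160, 186, 163]   # Y: first 5 of the 96 values
area        = [175, 168, 154, 166, 162]   # X: first 5 of the 96 values

X = sm.add_constant(area)                 # prepend the intercept column for beta_0
model = sm.OLS(house_price, X).fit()      # least-squares estimates of beta_0, beta_1
print(model.summary())                    # R-squared, ANOVA F, coefficients, t, Sig.
```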

Output:

Table 1.1: Case Summariesa

Case     Y         X        Case     Y         X
1 186.00 175.00 22 176.00 163.00
2 180.00 168.00 23 185.00 171.00
3 160.00 154.00 24 169.00 161.00
4 186.00 166.00 25 182.00 167.00
5 163.00 162.00 26 162.00 160.00
6 172.00 152.00 27 169.00 165.00
7 192.00 179.00 28 176.00 167.00
8 170.00 163.00 29 180.00 175.00
9 174.00 172.00 30 157.00 157.00
10 191.00 170.00 31 170.00 172.00
11 182.00 170.00 32 186.00 181.00
12 178.00 147.00 33 180.00 166.00
13 181.00 165.00 34 188.00 181.00
14 168.00 162.00 35 153.00 148.00
15 162.00 154.00 36 179.00 169.00
16 188.00 166.00 37 175.00 170.00
17 168.00 167.00 38 165.00 157.00
18 183.00 174.00 39 156.00 162.00
19 188.00 175.00 40 185.00 174.00
20 166.00 164.00 41 172.00 168.00
21 180.00 163.00 42 166.00 162.00
43 179.00 159.00 71 161.00 158.00
44 181.00 155.00 72 152.00 141.00
45 176.00 171.00 73 179.00 160.00
46 170.00 159.00 74 170.00 149.00
47 165.00 164.00 75 170.00 160.00
48 183.00 175.00 76 165.00 148.00
49 162.00 156.00 77 165.00 154.00
50 192.00 180.00 78 169.00 171.00
51 185.00 167.00 79 171.00 165.00
52 163.00 157.00 80 192.00 175.00
53 185.00 167.00 81 176.00 161.00
54 170.00 157.00 82 168.00 162.00
55 176.00 168.00 83 169.00 162.00
56 176.00 167.00 84 184.00 176.00
57 160.00 145.00 85 171.00 160.00
58 167.00 156.00 86 161.00 158.00
59 157.00 153.00 87 185.00 175.00
60 180.00 162.00 88 184.00 174.00
61 172.00 156.00 89 179.00 168.00
62 184.00 174.00 90 184.00 177.00
63 185.00 160.00 91 175.00 158.00
64 165.00 152.00 92 173.00 161.00
65 181.00 175.00 93 164.00 146.00
66 170.00 169.00 94 181.00 168.00
67 161.00 149.00 95 187.00 178.00
68 188.00 176.00 96 181.00 170.00
69 181.00 165.00
70 156.00 143.00
Total N 96
a. Limited to first 100 cases.

Table 1.2: Descriptive Statistics


Mean Std. Deviation N
House_price 174.3229 9.96044 96
Area 163.9167 9.15213 96
Table 1.3: Correlations

                                  House_price    Area
Pearson Correlation  House_price     1.000       .765
                     Area             .765      1.000
Sig. (1-tailed)      House_price       .         .000
                     Area             .000        .
N                    House_price      96          96
                     Area             96          96
Table 1.4: Model Summary

Model     R      R Square   Adjusted R Square   Std. Error of the Estimate
1        .765a     .585           .580                  6.45354
a. Predictors: (Constant), Area

Table 1.5: ANOVAb


Model Sum of Squares df Mean Square F Sig.
1 Regression 5510.058 1 5510.058 132.300 .000a
Residual 3914.932 94 41.648
Total 9424.990 95
a. Predictors: (Constant), Area
b. Dependent Variable: House_price

Table 1.6: Coefficientsa

                    Unstandardized Coefficients   Standardized Coefficients
Model                 B          Std. Error         Beta                        t        Sig.
1   (Constant)      37.922       11.877                                       3.193     .002
    Area              .832         .072              .765                    11.502     .000
a. Dependent Variable: House_price

Conclusion:
1. From Table 1.2 (descriptive statistics), the mean and standard deviation of House_price are
174.32 lakhs and 9.960 lakhs respectively, while the mean and standard deviation of Area are
163.92 square yards and 9.152 square yards respectively.

2. From Table 1.3, the correlation between House_price and Area is 0.765 with
p-value = .000 < 0.05. So, we reject the null hypothesis of zero correlation, implying that there
is a significant correlation between House_price and Area.

3. From Table 1.4, the coefficient of determination R2 = 0.585, which implies that the
regression model explains 58.5% of the total variation in House_price. Also, Adjusted R2 =
0.580 ≈ R2, so the model is a reasonably good fit (see the arithmetic check after these conclusions).

4. From Table 1.5 (ANOVA), the p-value is reported as .000, which is less than 0.05. So, we reject
H0 at the 5% level of significance, implying that the overall regression is significant.

5. From Table 1.6, the estimated coefficients are β̂0 = 37.922 and β̂1 = 0.832, so the
fitted regression model is: House_price = 37.922 + 0.832 × Area.
The p-value for testing the significance of β1 is .000 < 0.05, so we reject the null
hypothesis at the 5% level of significance, implying that the regressor (Area) is significant.
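
As a quick arithmetic check on Tables 1.4-1.6 (all figures are taken from the output above, so small rounding differences are expected):

\[
R^2 = \frac{SS_{\mathrm{Reg}}}{SS_{\mathrm{Total}}} = \frac{5510.058}{9424.990} \approx 0.585,
\qquad
F = \frac{MS_{\mathrm{Reg}}}{MS_{\mathrm{Res}}} = \frac{5510.058}{41.648} \approx 132.30 \approx t^2 = (11.502)^2.
\]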
Name: RAGINI
Roll No: 21026765023
Group: A

Practical 2

Aim: To fit a multiple linear regression model on the basis of given data.

Problem: A recent survey of clerical employees of a large financial organization included questions
related to employee satisfaction with their supervisors. There was a question designed to measure
the overall performance of a supervisor as well as questions that were related to specific activities
involving interaction between employee and supervisor. An exploratory study was conducted to try
to explain the relationship between specific supervisor activities and overall satisfaction with
supervisors as perceived by employees. Y = Overall rating of job being done by supervisor
X1 = Handles employee complaints X2 = Doesn’t allow special privileges
X3 = Opportunity to learn new things X4 = Raises based on performance
X5 = Too critical of poor performance X6 = Rate of advancing to better jobs

Sl. No.   Y    X1    X2    X3    X4    X5    X6    (variables as defined above)
1 43 51 30 39 61 92 45
2 63 64 51 54 63 73 47
3 71 70 68 69 76 86 48
4 61 63 45 47 54 84 35
5 81 78 56 66 71 83 47
6 43 55 49 44 54 49 34
7 58 67 42 56 66 68 35
8 71 75 50 55 70 66 41
9 72 82 72 67 71 83 31
10 67 61 45 47 62 80 41
11 64 53 53 58 58 67 34
12 67 60 47 39 59 74 41
13 69 62 57 42 55 63 25
14 68 83 83 45 59 77 35
15 77 77 54 72 79 77 46
16 81 90 50 72 60 54 36
17 74 85 64 69 79 79 63
18 65 60 65 75 55 80 60
19 65 70 46 57 75 85 46
20 50 58 68 54 64 78 52
21 50 40 33 34 43 64 33
22 64 61 52 62 66 80 41
23 53 66 52 50 63 80 37
24 40 37 42 58 50 57 49
25 63 54 42 48 66 75 33
26 66 77 66 63 88 76 72
27 78 75 58 74 80 78 49
28 48 57 44 45 51 83 38
29 85 85 71 71 77 74 55
30 82 82 39 59 64 78 39
N         30    30    30    30    30    30    30
Theory:
Multiple Linear Regression:

Model: y = β0 + β1x1 + β2x2 + … + βpxp + ε,
where the intercept β0 and the slope coefficients βi, i = 1, 2, …, p, are unknown constants, called
parameters, which are to be estimated by the method of least squares, and ε is a random error component.

Assumptions:
1. There is a linear relationship between the response and regressors.
2. The errors are assumed to be Normally distributed with mean 0 and unknown variance σ2,
i.e., εi ~ N(0, σ2) for all i.
3. The error terms are uncorrelated, i.e., there is no autocorrelation:
Cov(εi, εj) = 0 for all i ≠ j.
4. There is no multicollinearity among the regressors, and the errors are homoscedastic (constant variance).

Coefficient of determination (R2): It tells us the proportion (or percentage) of the variation in the
response that is explained by the regressors. The value of R2 lies between 0 and 1, and values close
to 1 imply that most of the variability in y is explained by the regression model. Note that R2 never
decreases when new regressor variables are added.

The Adjusted R2 corrects for the number of regressors: it increases only when an added regressor
genuinely improves the fit (see the formulas in the first practical).

Analysis of Variance (ANOVA): It is based on partitioning the total variability in the response
variable in order to draw inferences about the significance of the regression.

Hypothesis: To test the overall significance of the regression,
i.e., to test H0: β1 = β2 = … = βp = 0 against H1: at least one βj ≠ 0.

Test criteria: If the p-value < 0.05, we reject H0 at the 5% level of significance and conclude, on the
basis of the given data, that the regression is statistically significant.

Steps:
Analyze → Regression → Linear → Dependent: Y → Independent(s): X1, X2, X3, X4, X5, X6 →
Statistics → Estimates, Model fit, Descriptives → Continue → OK
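
As with the first practical, the same model can be fitted outside SPSS. A minimal Python sketch with pandas and statsmodels (illustrative only; the DataFrame would hold all 30 rows of the problem table, of which two are shown):

```python
# A minimal sketch of the multiple linear regression in Python.
import pandas as pd
import statsmodels.api as sm

cols = ["Y", "X1", "X2", "X3", "X4", "X5", "X6"]
rows = [
    [43, 51, 30, 39, 61, 92, 45],   # case 1
    [63, 64, 51, 54, 63, 73, 47],   # case 2 (remaining 28 cases omitted here)
]
df = pd.DataFrame(rows, columns=cols)

X = sm.add_constant(df[["X1", "X2", "X3", "X4", "X5", "X6"]])
model = sm.OLS(df["Y"], X).fit()
print(model.summary())   # overall F test plus a t test and p-value per coefficient
```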

Output:

Table 2.1: Case Summariesa


Y X1 X2 X3 X4 X5 X6
1 43.00 51.00 30.00 39.00 61.00 92.00 45.00
2 63.00 64.00 51.00 54.00 63.00 73.00 47.00
3 71.00 70.00 68.00 69.00 76.00 86.00 48.00
4 61.00 63.00 45.00 47.00 54.00 84.00 35.00
5 81.00 78.00 56.00 66.00 71.00 83.00 47.00
6 43.00 55.00 49.00 44.00 54.00 49.00 34.00
7 58.00 67.00 42.00 56.00 66.00 68.00 35.00
8 71.00 75.00 50.00 55.00 70.00 66.00 41.00
9 72.00 82.00 72.00 67.00 71.00 83.00 31.00
10 67.00 61.00 45.00 47.00 62.00 80.00 41.00
11 64.00 53.00 53.00 58.00 58.00 67.00 34.00
12 67.00 60.00 47.00 39.00 59.00 74.00 41.00
13 69.00 62.00 57.00 42.00 55.00 63.00 25.00
14 68.00 83.00 83.00 45.00 59.00 77.00 35.00
15 77.00 77.00 54.00 72.00 79.00 77.00 46.00
16 81.00 90.00 50.00 72.00 60.00 54.00 36.00
17 74.00 85.00 64.00 69.00 79.00 79.00 63.00
18 65.00 60.00 65.00 75.00 55.00 80.00 60.00
19 65.00 70.00 46.00 57.00 75.00 85.00 46.00
20 50.00 58.00 68.00 54.00 64.00 78.00 52.00
21 50.00 40.00 33.00 34.00 43.00 64.00 33.00
22 64.00 61.00 52.00 62.00 66.00 80.00 41.00
23 53.00 66.00 52.00 50.00 63.00 80.00 37.00
24 40.00 37.00 42.00 58.00 50.00 57.00 49.00
25 63.00 54.00 42.00 48.00 66.00 75.00 33.00
26 66.00 77.00 66.00 63.00 88.00 76.00 72.00
27 78.00 75.00 58.00 74.00 80.00 78.00 49.00
28 48.00 57.00 44.00 45.00 51.00 83.00 38.00
29 85.00 85.00 71.00 71.00 77.00 74.00 55.00
30 82.00 82.00 39.00 59.00 64.00 78.00 39.00
Total N 30 30 30 30 30 30 30
a. Limited to first 100 cases.

Table 2.2: Descriptive Statistics


Mean Std. Deviation N
Y 64.6333 12.17256 30
X1 66.6000 13.31476 30
X2 53.1333 12.23543 30
X3 56.3667 11.73701 30
X4 64.6333 10.39723 30
X5 74.7667 9.89491 30
X6 42.9333 10.28871 30
Table 2.3: Correlations

                         Y       X1      X2      X3      X4      X5      X6
Pearson       Y        1.000    .825    .426    .624    .590    .156    .155
Correlation   X1        .825   1.000    .558    .597    .669    .188    .225
              X2        .426    .558   1.000    .493    .445    .147    .343
              X3        .624    .597    .493   1.000    .640    .116    .532
              X4        .590    .669    .445    .640   1.000    .377    .574
              X5        .156    .188    .147    .116    .377   1.000    .283
              X6        .155    .225    .343    .532    .574    .283   1.000
Sig.          Y           .     .000    .009    .000    .000    .205    .207
(1-tailed)    X1        .000      .     .001    .000    .000    .160    .116
              X2        .009    .001      .     .003    .007    .219    .032
              X3        .000    .000    .003      .     .000    .271    .001
              X4        .000    .000    .007    .000      .     .020    .000
              X5        .205    .160    .219    .271    .020      .     .065
              X6        .207    .116    .032    .001    .000    .065      .
N             30 for every pair of variables

Table 2.4: Model Summary

Model     R      R Square   Adjusted R Square   Std. Error of the Estimate
1        .856a     .733           .663                  7.06799
a. Predictors: (Constant), X6, X1, X5, X2, X3, X4

Table 2.5: ANOVAb


Model            Sum of Squares   df   Mean Square      F       Sig.
1   Regression      3147.966       6     524.661      10.502    .000a
    Residual        1149.000      23      49.957
    Total           4296.967      29
a. Predictors: (Constant), X6, X1, X5, X2, X3, X4
b. Dependent Variable: Y
Table 2.6: Coefficientsa

                    Unstandardized Coefficients   Standardized Coefficients
Model                 B          Std. Error         Beta                        t        Sig.
1   (Constant)      10.787       11.589                                        .931     .362
    X1                .613         .161              .671                     3.809     .001
    X2               -.073         .136             -.073                     -.538     .596
    X3                .320         .169              .309                     1.901     .070
    X4                .082         .221              .070                      .369     .715
    X5                .038         .147              .031                      .261     .796
    X6               -.217         .178             -.183                    -1.218     .236
a. Dependent Variable: Y

Conclusion:
1. From Table 2.2 (descriptive statistics), we obtain the mean and standard deviation of each
variable; for example, the mean overall rating Y is 64.63 with standard deviation 12.17.

2. From Table 2.3 (correlations), we obtain the correlation of the response with each predictor
as well as the pairwise correlations between the predictors.

Using the p-values from the table, there is a significant correlation for all pairs
except (Y, X5); (Y, X6); (X1, X5); (X2, X5); (X3, X5); (X1, X6) and (X5, X6), since the p-values for
these pairs exceed 0.05, so for them we fail to reject the null hypothesis of zero correlation at the
5% level of significance.

In particular, the predictors X1, X2, X3 and X4 are each significantly correlated with the response
Y, as is evident from their correlation coefficients and p-values.

3. From Table 2.4, R2 = 0.733, implying the regression model explains 73.3% of the
total variation in the response, so the model fits reasonably well.
Also, Adjusted R2 = 0.663; the drop from R2 reflects the penalty for carrying predictors that do
not genuinely improve the fit (see the arithmetic check after these conclusions).

4. From Table 2.5 (ANOVA), the p-value for testing the null hypothesis β1 = β2 = … = β6 = 0
is reported as .000, which is less than the level of significance α = 0.05. So, we reject the null
hypothesis at the 5% level of significance and conclude that the overall regression is significant.

5. From Table 2.6, the estimated coefficients are β̂0 = 10.787, β̂1 = 0.613, β̂2 = -0.073,
β̂3 = 0.320, β̂4 = 0.082, β̂5 = 0.038 and β̂6 = -0.217, and the fitted regression equation
is: Y = 10.787 + 0.613 X1 - 0.073 X2 + 0.320 X3 + 0.082 X4 + 0.038 X5 - 0.217 X6

The p-value for testing the significance of β1 is 0.001 < 0.05, so we reject the null
hypothesis at the 5% level of significance, implying that the regressor X1 (handles employee
complaints) is statistically significant. However, the p-values for testing βi = 0, i = 2, 3, …, 6,
are all greater than 0.05, implying that, with the other predictors in the model, X2, X3, X4, X5
and X6 do not individually have a significant effect on the response.
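
A quick arithmetic check on Tables 2.4-2.5, using the adjusted-R2 formula from the first practical with n = 30 and p = 6 (all figures from the output above):

\[
R^2 = \frac{3147.966}{4296.967} \approx 0.733,
\qquad
R^2_{\mathrm{adj}} = 1 - (1-0.733)\,\frac{29}{23} \approx 0.663,
\qquad
F = \frac{524.661}{49.957} \approx 10.502.
\]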
Name: RAGINI
Roll No: 21026765023
Group: A

Practical 3

Aim: To test the significance of the regression coefficients and to check for the presence of
autocorrelation and heteroscedasticity.

Problem: The following data give the income and consumption expenditure of 30 families in a
locality. Assuming that consumption is linearly related to income, propose a model and test for the
significance of the regression. Also, compute the coefficient of determination and discuss its
importance as a measure of model adequacy. Check for the presence of serial correlation and heteroscedasticity.

Income    Consumption    Income    Consumption

80 55 180 115
110 65 225 140
185 70 220 120
110 80 240 145
120 79 185 130
115 84 220 152
130 98 210 144
140 95 245 175
125 90 260 180
90 75 190 135
105 74 205 140
160 110 265 178
150 113 270 191
165 125 230 137
145 108 250 189

Theory:
Simple Linear Regression:
Model: y = β0 + β1x + ε,
where the intercept β0 and the slope β1 are unknown constants, called parameters, which are to be
estimated by the method of least squares, and ε is a random error component.

Assumptions:
1. There is a linear relationship between the response (y) and regressor (x).
2. The errors are assumed to be Normally distributed with mean 0 and unknown variance σ2,
i.e., εi ~ N(0, σ2) for all i.
3. The error terms are uncorrelated, which implies the absence of autocorrelation, i.e.,
Cov(εi, εj) = 0 for all i ≠ j.
4. There is no multicollinearity among the regressors, and the errors are homoscedastic (constant variance).
The model along with the above assumptions is known as Classical Linear Regression Model (CLRM).

Coefficient of determination (R2): It tells us the proportion (or percentage) of the variation in the
response that is explained by the regressor x. The value of R2 lies between 0 and 1, and values close
to 1 imply that most of the variability in y is explained by the regression model. Note that R2 never
decreases when new regressor variables are added.
The Adjusted R2 corrects for the number of regressors: it increases only when an added regressor
genuinely improves the fit (see the formulas in the first practical).

Analysis of Variance (ANOVA): It is based on partitioning the total variability in the response
variable in order to draw inferences about the significance of the regression.

Hypothesis: To test whether the regressor is significant,
i.e., to test H0: β1 = 0 against H1: β1 ≠ 0.

Test criteria: If the p-value < 0.05, we reject H0 at the 5% level of significance and conclude, on the
basis of the given data, that the regressor is statistically significant.

Autocorrelation: It is a characteristic of data in which the random errors of a regression model are
correlated across observations, i.e., the error series is correlated with a lagged copy of itself. For
time-series data, autocorrelation measures the degree of similarity between a given time series and
a lagged version of itself over successive time intervals.

Durbin-Watson test: The Durbin-Watson statistic is used to detect the presence of autocorrelation
at lag 1 in the residuals from a regression analysis. It is computed from the residuals e1, e2, …, en as
d = Σ(e_t − e_{t−1})² / Σe_t², and it always lies between 0 and 4.

Hypothesis:
H0: Errors are serially uncorrelated
H1: Errors follow a first order autoregressive process

Test Criteria: A value of the Durbin-Watson statistic close to 2 indicates no first-order
autocorrelation; values well below 2 (toward 0) suggest positive autocorrelation, and values well
above 2 (toward 4) suggest negative autocorrelation. As a rough rule of thumb, we reject the null
hypothesis when the statistic falls below about 1 or above about 3, and fail to reject H0 when it is
close to 2.

Heteroscedasticity: It refers to the circumstance in which there is a systematic change in the
spread of the residuals over the range of measured values. It violates the assumption of the OLS
regression technique that all errors are drawn from a population with constant variance
(homoscedasticity).

Steps:
1. Analyze → Regression → Linear → Dependent: Consumption → Independent(s): Income →
Statistics → Estimates, Model fit, Descriptives, Durbin-Watson → Continue → Save →
Residuals: Unstandardized → Continue → OK
2. Transform → Compute Variable → Target Variable: Absolute_Residuals → Numeric
Expression: ABS(RES_1) → OK
3. Analyze → Correlate → Bivariate → Variables: Income, Absolute_Residuals → Correlation
Coefficients: Pearson → OK
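
The three SPSS steps can also be reproduced in Python. A minimal sketch (illustrative variable names; only the first five of the 30 pairs from the problem table are keyed in):

```python
# A minimal sketch of steps 1-3: fit the regression, compute the Durbin-Watson
# statistic, then correlate the regressor with the absolute residuals.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson
from scipy import stats

income      = np.array([80, 110, 185, 110, 120])   # first 5 of the 30 values
consumption = np.array([55, 65, 70, 80, 79])

model = sm.OLS(consumption, sm.add_constant(income)).fit()
print("Durbin-Watson:", durbin_watson(model.resid))   # ~2 means no lag-1 autocorrelation

# Heteroscedasticity check used in this practical (a Glejser-type idea):
# Pearson correlation between the regressor and |residuals|.
r, p = stats.pearsonr(income, np.abs(model.resid))
print("corr(Income, |residual|) =", round(r, 3), ", p-value =", round(p, 3))
```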
Output:

Table 3.1: Case Summariesa


          Income    Consumption    Unstandardized Residual    Absolute_Residuals
1 80.00 55.00 -2.91697 2.92
2 110.00 65.00 -11.93739 11.94
3 185.00 70.00 -54.48844 54.49
4 110.00 80.00 3.06261 3.06
5 120.00 79.00 -4.27753 4.28
6 115.00 84.00 3.89254 3.89
7 130.00 98.00 8.38233 8.38
8 140.00 95.00 -.95781 .96
9 125.00 90.00 3.55240 3.55
10 90.00 75.00 10.74289 10.74
11 105.00 74.00 .23268 .23
12 160.00 110.00 1.36191 1.36
13 150.00 113.00 10.70205 10.70
14 165.00 125.00 13.19184 13.19
15 145.00 108.00 8.87212 8.87
16 180.00 115.00 -6.31837 6.32
17 225.00 140.00 -9.84900 9.85
18 220.00 120.00 -26.67893 26.68
19 240.00 145.00 -14.35921 14.36
20 185.00 130.00 5.51156 5.51
21 220.00 152.00 5.32107 5.32
22 210.00 144.00 3.66121 3.66
23 245.00 175.00 12.47072 12.47
24 260.00 180.00 7.96051 7.96
25 190.00 135.00 7.34149 7.34
26 205.00 140.00 2.83128 2.83
27 265.00 178.00 2.79044 2.79
28 270.00 191.00 12.62038 12.62
29 230.00 137.00 -16.01907 16.02
30 250.00 189.00 23.30065 23.30
Total N 30 30 30 30
a. Limited to first 100 cases.

Table 3.2: Descriptive Statistics

               Mean       Std. Deviation    N
Consumption   119.7333      39.06134        30
Income        177.5000      57.20125        30

Table 3.3: Correlations


Consumption Income
Pearson Correlation Consumption 1.000 .928
Income .928 1.000
Sig. (1-tailed) Consumption . .000
Income .000 .
N Consumption 30 30
Income 30 30

Table 3.4: Model Summaryb

Model     R      R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
1        .928a     .862          .857                  14.76674                  1.537
a. Predictors: (Constant), Income
b. Dependent Variable: Consumption

Table 3.5: ANOVAb


Model Sum of Squares df Mean Square F Sig.
1 Regression 38142.280 1 38142.280 174.919 .000a
Residual 6105.587 28 218.057
Total 44247.867 29
a. Predictors: (Constant), Income
b. Dependent Variable: Consumption

Table 3.6: Coefficientsa

                    Unstandardized Coefficients   Standardized Coefficients
Model                 B          Std. Error         Beta                        t        Sig.
1   (Constant)       7.196        8.926                                        .806     .427
    Income            .634         .048              .928                    13.226     .000
a. Dependent Variable: Consumption

Table 3.7: Residuals Statisticsa


Minimum Maximum Mean Std. Deviation N
Predicted Value 57.9170 178.3796 119.7333 36.26639 30
Residual -54.48844 23.30065 .00000 14.50991 30
Std. Predicted Value -1.705 1.617 .000 1.000 30
Std. Residual -3.690 1.578 .000 .983 30
a. Dependent Variable: Consumption
Table 3.8: Correlations

                                           Income   Absolute_Residuals
Income               Pearson Correlation      1          .275
                     Sig. (2-tailed)                     .141
                     N                       30           30
Absolute_Residuals   Pearson Correlation    .275           1
                     Sig. (2-tailed)        .141
                     N                       30           30

Conclusion:
1. The residuals and their absolute values have been tabulated in Table 3.1.

2. From Table 3.2 (descriptive statistics), the mean and standard deviation of Consumption are
119.73 and 39.06 respectively, and those of Income are 177.50 and 57.20 respectively.

3. From Table 3.3, the correlation between Consumption and Income is 0.928 with
p-value = .000 < 0.05. So, we reject the null hypothesis of zero correlation, implying that there
is a significant correlation between Consumption and Income.

4. From Table 3.4, the coefficient of determination R2 = 0.862, which implies that the
regression model explains 86.2% of the total variation in Consumption. Also, Adjusted R2 =
0.857 ≈ R2, so the model is a good fit.

5. From Table 3.4, the Durbin-Watson statistic is 1.537, which is reasonably close to 2. Therefore,
we fail to reject the null hypothesis that the errors are serially uncorrelated, implying that
first-order autocorrelation is absent from the data.

6. From Table 3.5 (ANOVA), the p-value is reported as .000, which is less than 0.05. So, we reject
H0 at the 5% level of significance, implying that the overall regression is significant.

7. From Table 3.6, the estimated coefficients are β̂0 = 7.196 and β̂1 = 0.634, so the
fitted regression model is: Consumption = 7.196 + 0.634 × Income.
The p-value for testing the significance of β1 is .000 < 0.05, so we reject the null
hypothesis at the 5% level of significance, implying that the regressor (Income) is significant.

8. Table 3.7 (residuals statistics) summarizes the minimum, maximum, mean and standard
deviation of the predicted values, residuals, standardized predicted values and standardized
residuals.

9. From Table 3.8, the Pearson correlation between the regressor (Income) and the absolute
residuals is 0.275, and the p-value for testing the null hypothesis of homoscedasticity (no
relationship between the regressor and the absolute residuals) is 0.141 > 0.05. So, we fail to
reject the null hypothesis at the 5% level of significance, implying that the errors have a
constant variance (see the check after these conclusions).
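
As a check on the p-value in Table 3.8 (r = 0.275, n = 30), the usual t statistic for testing a correlation coefficient is

\[
t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} = \frac{0.275\sqrt{28}}{\sqrt{1-0.275^2}} \approx 1.51,
\]

which, referred to a t distribution with 28 degrees of freedom, gives a two-tailed p-value of about 0.141, matching the table.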
