Tutorial 6 Linear Regression and Correlation


TUTORIAL 6 LINEAR REGRESSIONS & CORRELATIONS

1. The following table shows the amount of water, x, in cm³, applied to seven similar
plots on an experimental farm. It also shows the yield of hay, y, in tonnes per acre.

Amount of water (x)   30     45     60     75     90     105    120
Yield of hay (y)      4.85   5.20   5.76   6.60   7.35   7.95   7.77

a. Find the equation of the regression line of y on x in the form y = a + bx.
b. Calculate the correlation coefficient for the data.
c. What would you predict the yield to be for x = 28 and for x = 150? Comment on
the reliability of each of your predicted yields.
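
A hand calculation for Question 1 can be checked numerically. The Python sketch below is one possible check, not a model answer: it applies the usual least-squares formulas b = Sxy/Sxx and a = ȳ - b·x̄ to the table above. The names `least_squares`, `water` and `hay` are illustrative and do not come from the tutorial.

```python
# Data copied from the Question 1 table.
water = [30, 45, 60, 75, 90, 105, 120]            # x, amount of water
hay = [4.85, 5.20, 5.76, 6.60, 7.35, 7.95, 7.77]  # y, yield of hay


def least_squares(x, y):
    """Return (a, b, r) for the line y = a + b*x and the correlation r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    r = sxy / (sxx * syy) ** 0.5
    return a, b, r


a, b, r = least_squares(water, hay)
print(f"y = {a:.4f} + {b:.4f}x, r = {r:.4f}")

# Flag whether each requested x lies inside the observed range of x.
for x0 in (28, 150):
    inside = min(water) <= x0 <= max(water)
    label = "inside observed range" if inside else "outside observed range"
    print(f"x = {x0}: predicted y = {a + b * x0:.4f} ({label})")
```

The range check is included because part (c) asks about reliability: a prediction for an x outside the observed values relies on the linear trend continuing beyond the data.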

2. Two people, X and Y were asked to give marks out of 20 for seven brands of fish
finger. The results are recorded in the table.

Brand      A    B    C    D    E    F    G
X’s mark   8    10   18   2    1    4    15
Y’s mark   5    14   12   9    4    1    19

a. Find the equation of the regression line of y on x in the form y = a + bx.
b. Calculate the correlation coefficient for the data.
c. Test for a linear relationship between x and y at the α = 0.05 significance level.

3. A mother monitored the growth of her baby and recorded the length h cm and weight y
kg at various stages in the baby’s development. The new variable x = h³/10000 was
calculated and the values of x and y are given in the table below.

x   12.5   19.5   25.0   31.4   55.1    68.1    88.5
y   4.43   4.88   6.31   7.18   10.63   13.60   17.95

a. Plot a scatter diagram to illustrate the data and comment on whether a linear
relationship is likely to provide a suitable model for the relationship between
y and x.
b. Obtain the regression line of y on x.
c. Estimate the weight of the baby when it was 75 cm long.
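
Part (c) needs one extra step before the regression line can be used: the length must first be converted to the new variable x = h³/10000. The Python sketch below shows that step as a rough check, assuming the table values are read as x = 12.5, 19.5, 25.0, 31.4, 55.1, 68.1, 88.5 and y = 4.43, 4.88, 6.31, 7.18, 10.63, 13.60, 17.95. The variable names are illustrative only.

```python
# x and y as read from the Question 3 table, where x = h**3 / 10000.
x = [12.5, 19.5, 25.0, 31.4, 55.1, 68.1, 88.5]
y = [4.43, 4.88, 6.31, 7.18, 10.63, 13.60, 17.95]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope and intercept for the line y = a + b*x.
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
b = sxy / sxx
a = mean_y - b * mean_x

# Convert the length to the transformed variable before predicting.
h = 75                   # length in cm
x_new = h ** 3 / 10000   # transformed variable
weight = a + b * x_new   # estimated weight in kg

print(f"y = {a:.4f} + {b:.4f}x")
print(f"h = {h} cm gives x = {x_new:.4f} and estimated weight {weight:.4f} kg")
```

Substituting h = 75 directly into the line instead of x would give a very different value, so the transformation is the step to watch.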

4. A car manufacturer is testing the braking distance, y metres, for different speeds,
x km/h, when the brakes were applied.

Speed of car (x km/h)         20    50    70    90    110   130
Braking distance (y metres)   25    50    85    155   235   350

a. Plot a scatter diagram.
b. Calculate the equation of the regression line of y on x and draw the line on your
scatter diagram.
c. Use your regression equation to predict values of y when x = 100 and x = 150.
Comment, with reasons, on the likely accuracy of these predictions.
d. Discuss briefly whether the regression line provides a good model or whether
there is a better way of modelling the relationship between y and x.

5. Values of x and y for a set of bivariate data are given in the following table.

x   0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9
y   1.97   1.94   1.89   1.82   1.73   1.62   1.49   1.34   1.17

a. Plot a scatter diagram to illustrate the data.


b. Calculate the product moment correlation coefficient for this data and state what
its value tells you about the relationship between x and y.
c. State which of the following best indicates the relationship between x and y.
i. The product moment correlation coefficient
ii. The scatter diagram
Give a reason for your answer.

6. An old film is treated with a chemical in order to improve the contrast. Preliminary
tests on 9 samples drawn from a segment of the film produced the following results.

Sample   A    B     C    D     E    F     G    H     I
x        1    1.5   2    2.5   3    3.5   4    4.5   5
y        49   60    66   62    72   64    89   90    96

The quantity x is a measure of the amount of chemical applied, and y is the contrast
index, which takes values between 0 and 100.

a. Plot a scatter diagram to illustrate the data.


b. It is subsequently discovered that one of the samples of film was damaged and
produced an incorrect result. State which sample you think this was.

In all subsequent calculations this incorrect sample is ignored. The remaining data can
be summarized as follows:

∑x = 23.5, ∑y = 584, ∑x² = 83.75, ∑y² = 44622, ∑xy = 1883, n = 8

c. Calculate the product moment correlation coefficient.
d. State, with a reason, whether it is sensible to conclude from your answer in part (c)
that x and y are linearly related.
e. The line of regression of y on x has equation y = a + bx. Calculate the values of
a and b.
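
Parts (c) and (e) work from summary totals rather than raw data. The Python sketch below shows one way to check such a calculation, using Sxx = ∑x² - (∑x)²/n, Syy = ∑y² - (∑y)²/n and Sxy = ∑xy - (∑x)(∑y)/n. It assumes the printed sum of squares of y is 44622, and the variable names are illustrative.

```python
from math import sqrt

# Summary totals quoted in Question 6 (sum of y squared read as 44622).
n = 8
sum_x, sum_y = 23.5, 584
sum_x2, sum_y2 = 83.75, 44622
sum_xy = 1883

# Corrected sums of squares and products.
sxx = sum_x2 - sum_x ** 2 / n
syy = sum_y2 - sum_y ** 2 / n
sxy = sum_xy - sum_x * sum_y / n

r = sxy / sqrt(sxx * syy)         # product moment correlation coefficient
b = sxy / sxx                     # slope of the regression line of y on x
a = sum_y / n - b * sum_x / n     # intercept: mean of y minus b times mean of x

print(f"Sxx = {sxx}, Syy = {syy}, Sxy = {sxy}")
print(f"r = {r:.4f}, a = {a:.3f}, b = {b:.3f}")
```

Working through Sxx, Syy and Sxy first keeps the arithmetic for r and for b consistent, since both reuse Sxy and Sxx.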

7. Before hiring new employees, the personnel director for a company decides to do a
regression analysis of the company’s current salary structure. She believes that an
employee’s salary is related to the number of years of work experience (YEARS) and
to the number of years of post-high school education (POSTHSED). The following
Excel output is produced from the sample data she has gathered:

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.785
R Square            0.886
Adjusted R Square   0.884
Standard Error      3164
Observations        194

ANOVA
             df    SS            MS           F       Significance F
Regression   2     14792118272   7396059136   738.9   0
Residual     191   1912102400    10011007
Total        192   16704221184

             Coefficients   Standard Error   t Stat   P-value
Intercept    29436.2        581.3            50.4     0
POSTHSED     1306.1         255.3            5.12     0
YEARS        832.63         44.49            18.71    0

a. What is the dependent (response) variable?
b. What are the independent (explanatory) variables?
c. Write down the estimated regression equation.
d. Predict the salary of an employee with no work experience and no post-high school
education.
e. Predict the salary of an employee with 6 years of work experience and 4 years of
post-high school education.
f. Interpret each coefficient in the given equation.
g. What is the value of the standard error of estimate? Interpret this value.
h. What is the value of the coefficient of multiple determination? Interpret this value.
i. What can you conclude from the given ANOVA table at α = 0.05?
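
The quantities in the Excel output for Question 7 are linked by simple arithmetic, which can be used to check a reading of the table. The Python sketch below is a rough illustration, assuming the ANOVA entries are read as SS(Regression) = 14792118272, SS(Total) = 16704221184, MS(Regression) = 7396059136 and MS(Residual) = 10011007. The function `predict_salary` is a hypothetical helper, not part of the Excel output.

```python
# Coefficients as read from the Excel output.
INTERCEPT = 29436.2
COEF_POSTHSED = 1306.1
COEF_YEARS = 832.63


def predict_salary(years, posthsed):
    """Predicted salary for given YEARS and POSTHSED (hypothetical helper)."""
    return INTERCEPT + COEF_POSTHSED * posthsed + COEF_YEARS * years


# ANOVA entries as read from the table.
ss_regression = 14792118272
ss_total = 16704221184
ms_regression = 7396059136
ms_residual = 10011007

r_squared = ss_regression / ss_total      # coefficient of multiple determination
f_statistic = ms_regression / ms_residual # F = MS(Regression) / MS(Residual)

print(f"No experience, no post-high school education: {predict_salary(0, 0):.2f}")
print(f"6 years experience, 4 years education: {predict_salary(6, 4):.2f}")
print(f"R squared = {r_squared:.3f}, F = {f_statistic:.1f}")
```

Setting both explanatory variables to zero returns the intercept, which is the link between parts (d) and (f).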

8. A manufacturer found that a significant relationship exists among the number of hours
an assembly line employee works per shift x1, the total number of items produced x2,
and the number of defective items produced y. The multiple regression equation is
ŷ = 9.6 + 2.2 x1 - 1.08 x2.

a. Predict the number of defective items produced by an employee who has worked
9 hours and produced 24 items.
b. Interpret each coefficient in the given equation.

9. A researcher has determined that a significant relationship exists among an employee’s
age x1, grade point average x2, and income y. The multiple regression equation is
ŷ = -34127 + 132 x1 + 20805 x2.

a. Predict the income of a person who is 32 years old and has a GPA of 3.4.
b. Interpret each coefficient in the given equation.

ANSWERS

1. a. ŷ = 3.67 + 0.04x
b. r = 0.9766
c. y = 4.7235, y = 9.3275, the prediction is reliable

3. b. ŷ = 1.6858 + 0.1772x
c. x = 42.187, y = 9.1616 kg

5. b. r = -0.9753, strong negative linear correlation
c. product moment correlation
