WEEK 8 Regression Analysis
Self-Learning Kit
Statistics & Probability
Quarter 4 - Week 8
ALVIN M. TAMPOS
Writer
First Edition, 2020
Republic Act 8293, section 176 states that: No copyright shall subsist in any work of the Government
of the Philippines. However, prior approval of the government agency or office wherein the work is created
shall be necessary for exploitation of such work for profit. Such agency or office may, among other things,
impose as a condition the payment of royalties.
Borrowed materials (i.e., songs, stories, poems, pictures, photos, brand names, trademarks, etc.)
included in this Self-Learning Kit are owned by their respective copyright holders. Every effort has been exerted
to locate and seek permission to use these materials from their respective copyright owners. The publisher
and authors do not represent nor claim ownership over them.
Note to the Learner
This Self-Learning Kit is prepared for you to learn the specified competencies
based on the Most Essential Learning Competencies (MELC) for Statistics and
Probability, Quarter 4, Week 8. It is designed in a simplified structure to help you easily
understand the lesson for the week. It contains the following parts:
Lesson Title: Regression Analysis
Learning Competency: Solves problems involving regression analysis.
MELC Code: M11/12SP-IVj-2
I Have Known
A. Directions: Answer the following exercises. Choose the letter of the correct
answer.
6. Two variables X and Y are related by the line Y = 3X − 5. Solve for the value of Y
given the following values of X:
A. X = 6
B. X = 10
C. X = 36
I Can Connect
In the last lesson, we learned that when the trend line is drawn, some of the points
lie on the line while others fall below or above it. In other words, the points in the
scatterplot regress with reference to the line. If the sum of the squared vertical (y)
distances of the points from this line is the least possible, then we call this line the
regression line, or the line that “best fits” the scatterplot. The regression line is the
same as the trend line.
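The y-intercept a and the slope b of the regression line are computed as follows:

$$a = \frac{(\sum Y)(\sum X^2) - (\sum X)(\sum XY)}{n(\sum X^2) - (\sum X)^2}$$

$$b = \frac{n(\sum XY) - (\sum X)(\sum Y)}{n(\sum X^2) - (\sum X)^2}$$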
The regression line Y' = bX + a is also called the prediction equation because
we use it to predict Y when X is known. Since only the y distances were considered
in the analysis, the line cannot be used to predict X from Y.
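As a rough illustration of these two formulas, the Python sketch below computes a and b directly from the sums. The function name `regression_coefficients` and the three sample points are made up for this example and are not part of the lesson data.

```python
def regression_coefficients(xs, ys):
    """Return (a, b) for the regression line Y' = bX + a."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    denom = n * sum_x2 - sum_x ** 2  # denominator shared by a and b
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denom
    b = (n * sum_xy - sum_x * sum_y) / denom
    return a, b


# Hypothetical points that lie exactly on the line Y = 2X + 1.
a, b = regression_coefficients([1, 2, 3], [3, 5, 7])
print(a, b)  # 1.0 2.0
```

Because the sample points lie exactly on a line, the formulas recover its intercept and slope. With real data the points scatter around the line, and a and b describe the best-fitting line instead.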
I Can Learn
The following data show the grades of selected Grade 11 Senior High School
students in General Mathematics and Earth and Life Science.
a. Test if there is a significant relationship between the two variables at the 95%
level of confidence.
b. Predict the Earth and Life Science grade of a student with a grade of 87 in
General Mathematics.
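Steps and Solution
1. Identify the dependent and independent variables.
Here, the independent variable (X) is the grade in General Mathematics, while the
dependent variable (Y) is the grade in Earth & Life Science, because we want to
predict the Earth & Life Science grade from the General Mathematics grade.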
2. Compute the correlation coefficient (r).
Let us put the data in columns, find ∑X, ∑Y, ∑X², ∑Y², and ∑XY, and substitute
them in the formula:

$$r = \frac{n\sum XY - \sum X \cdot \sum Y}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}}$$

X      Y      X²       Y²       XY
85     86     7225     7396     7310
87     85     7569     7225     7395
80     81     6400     6561     6480
79     80     6241     6400     6320
88     87     7744     7569     7656
89     88     7921     7744     7832
∑X = 508    ∑Y = 507    ∑X² = 43100    ∑Y² = 42895    ∑XY = 42993

$$r = \frac{6(42993) - (508)(507)}{\sqrt{[6(43100) - (508)^2][6(42895) - (507)^2]}}$$

$$r = 0.97$$
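The column sums and the value of r can be double-checked with a short Python sketch. The variable names are chosen for this example only. The data are the X and Y columns of the table.

```python
import math

x = [85, 87, 80, 79, 88, 89]  # X column of the table
y = [86, 85, 81, 80, 87, 88]  # Y column of the table
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_x2 = sum(v * v for v in x)
sum_y2 = sum(v * v for v in y)
sum_xy = sum(p * q for p, q in zip(x, y))

# Correlation coefficient from the formula in step 2.
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(sum_x, sum_y, sum_x2, sum_y2, sum_xy)  # 508 507 43100 42895 42993
print(round(r, 2))  # 0.97
```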
3. Test the significance of r using the formula:

$$t = r\sqrt{\frac{n-2}{1-r^2}}$$

Here n = 6 and r = 0.97:

$$t = 0.97\sqrt{\frac{6-2}{1-(0.97)^2}} = 7.98$$
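The same substitution can be sketched in Python. The helper name `t_statistic` is an assumption made for this example.

```python
import math


def t_statistic(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2)), with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


t = t_statistic(0.97, 6)  # r rounded to two decimal places, as in the lesson
print(round(t, 2))  # 7.98
```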
4. Compare the computed t-value to the critical t-value.
Using df = n − 2 = 6 − 2 = 4 and α = 0.05 for a two-tailed test, we find from the
table that the critical value of t is 2.776.
5. Make a decision.
Since the computed t = 7.98 is greater than the critical value t = 2.776, we reject
the null hypothesis. So, there is a significant relationship between the two variables.
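The decision rule in steps 4 and 5 amounts to a single comparison. A minimal sketch follows, with a hypothetical helper `is_significant` and the critical value taken from the t-table as quoted in the lesson.

```python
def is_significant(t_computed, t_critical):
    """Reject the null hypothesis when |t| exceeds the critical value."""
    return abs(t_computed) > t_critical


# Lesson values: computed t = 7.98; critical t = 2.776
# (df = 4, alpha = 0.05, two-tailed).
reject_null = is_significant(7.98, 2.776)
print(reject_null)  # True
```

The absolute value is used because the test is two-tailed: a large negative t would also lead to rejecting the null hypothesis.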
6. Summarize the results.
It appears that there is a significant relationship between the grades in General
Mathematics and Earth & Life Science. Thus, we will proceed to regression analysis.
7. Compute the values of a and b in the regression equation Y' = bX + a.
Using the values obtained in step 2, we have the following:

$$a = \frac{(\sum Y)(\sum X^2) - (\sum X)(\sum XY)}{n(\sum X^2) - (\sum X)^2} = \frac{(507)(43100) - (508)(42993)}{6(43100) - (508)^2} = 21$$

$$b = \frac{n(\sum XY) - (\sum X)(\sum Y)}{n(\sum X^2) - (\sum X)^2} = \frac{(6)(42993) - (508)(507)}{6(43100) - (508)^2} = 0.75$$
8. Form the regression equation.
Substitute the values of a and b in the equation:

$$Y' = bX + a$$

$$Y' = 0.75X + 21$$
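With the regression equation in hand, part (b) of the problem can be answered by substituting X = 87. The sketch below recomputes a and b from the sums in step 2 and then evaluates the equation. The function name `predict` is chosen for this example only.

```python
n = 6
sum_x, sum_y = 508, 507
sum_x2, sum_xy = 43100, 42993

denom = n * sum_x2 - sum_x ** 2
a = (sum_y * sum_x2 - sum_x * sum_xy) / denom  # y-intercept
b = (n * sum_xy - sum_x * sum_y) / denom       # slope


def predict(x):
    """Predicted Y' for a given X using Y' = bX + a."""
    return b * x + a


print(a, b)         # 21.0 0.75
print(predict(87))  # 86.25
```

Under this equation, a student with a grade of 87 in General Mathematics has a predicted Earth & Life Science grade of about 86.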
Step 1: Open Microsoft Excel and paste the given data. See image below.
Step 3: After clicking Data Analysis, this window will pop up. Scroll down and select
Regression, then click OK.
Step 4: After clicking OK, this window will appear. Supply the Input Y Range with your
dependent variable and the Input X Range with your independent variable. See image
below.
How to input the X and Y ranges: Click the blank field after Input Y Range, then
select the range of your Y values, in this case C2:C7.
Next, click the blank field after Input X Range, then select the range of your X
values, in this case B2:B7.
Step 5: Check Confidence Level and enter the percentage. Then click OK.
This window will then appear.
Compare the computed t-value to the critical t-value. Using df = n − 2 = 6 − 2 = 4 and
α = 0.05 for a two-tailed test, we find from the table that the critical value of t is
2.776. Since the computed t = 7.864 is greater than the critical t = 2.776, we reject
the null hypothesis. So, there is a significant relationship between the two variables.
Another way to identify whether there is a significant relationship between the two
variables is to check the Significance F value. Since the Significance F value (0.00)
is less than the 0.05 level of significance (alpha), we can say that there is a
significant relationship between the two variables.
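For a regression with a single independent variable, the F statistic equals the square of the t statistic, and Significance F is the two-tailed p-value of that t. The sketch below reproduces this using the closed-form CDF of Student's t distribution, which in this form holds only for 4 degrees of freedom. The function names are assumptions for this example, and the closed form stands in for whatever routine Excel uses internally.

```python
import math


def t_cdf_df4(t):
    """CDF of Student's t distribution for exactly 4 degrees of freedom."""
    u = 1 + t * t / 4
    return 0.5 + 0.375 * (t / math.sqrt(u)) * (1 - (t * t / u) / 12)


def two_tailed_p_df4(t):
    """Two-tailed p-value for a t statistic with 4 degrees of freedom."""
    return 2 * (1 - t_cdf_df4(abs(t)))


t = 7.864        # computed t quoted in the lesson
f_stat = t ** 2  # F statistic when there is one independent variable
p_value = two_tailed_p_df4(t)

print(round(f_stat, 2))    # 61.84
print(round(p_value, 4))   # 0.0014
print(p_value < 0.05)      # True
```

A p-value of about 0.0014 displays as 0.00 when rounded to two decimal places, which is consistent with the Significance F value quoted above.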
c. On the upper right of the graph, click the plus (+) sign to display the other
options. Choose Trendline.
d. Go to More Options.
e. Lastly, check Display Equation on chart. The equation shown is the equation of
the regression line.
Note: There is a small difference between the value of t computed using the formula
and the value from Excel. This is because the value of r in the manual computation
was already rounded to two decimal places.
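The effect of rounding can be checked with a short Python sketch that computes t twice, once with r = 0.97 and once with the unrounded r from the table sums. The helper name `t_statistic` is an assumption for this example.

```python
import math


def t_statistic(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2))."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


n = 6
# Unrounded r from the sums in the worked example.
r_exact = (6 * 42993 - 508 * 507) / math.sqrt(
    (6 * 43100 - 508 ** 2) * (6 * 42895 - 507 ** 2)
)

print(round(t_statistic(0.97, n), 2))     # 7.98, the manual result
print(round(t_statistic(r_exact, n), 3))  # 7.864, matching Excel
```

Because 1 − r² is small when r is close to 1, even a small rounding of r changes the denominator noticeably, which explains the gap between 7.98 and 7.864.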
I Can Try
1. The following data pertains to the heights of fathers and their eldest sons in
inches. If there is a significant relationship between the two variables, predict
the height of the son if the height of his father is 78 inches. Use 0.05 as the level
of significance.
66 68
63 66
68 70
70 72
60 65
58 60
a. Determine whether there is a significant relationship between the two variables
using Excel.
b. Solve for a and b, if applicable.
c. Predict the height of the son if the height of his father is 68 inches.
I Can Assess
Directions: Read and understand the questions below, then answer them. Show your
solution on a separate piece of paper.
1. Given the data for the grades and the number of hours spent studying per day,
use 0.05 as the level of significance and solve for the following:
a. Find the value of r.
b. Test if there is a significant relationship between the two variables using
Microsoft Excel. (Proceed to c and d, if applicable.)
c. Solve for a and b.
d. Predict the grade of the student if the number of hours spent studying
per day is 4 hours.
Hour/s spent Studying Grades
2 77
2 80
3 85
4 88
4 86
3 82
I Can Do More
Additional Activity:
The data show the weights of mothers and their eldest daughters in kilograms.
If there is a significant relationship between the two variables, predict the weight
of the daughter if the weight of her mother is 60 kg. Use 0.05 as the level of
significance.
a. Find the value of r.
b. Test if there is a significant relationship between the two variables using
Microsoft Excel. (Proceed to c and d, if applicable.)
c. Solve for a and b.
d. Predict the weight of the daughter if the weight of her mother is 60 kg.