Bio-L8- Correlation and Regression Analysis
Scatter Gram :
Consider a list of pairs of numerical values representing variables 𝑥 and 𝑦.
The scatter gram of the data is simply a picture of the pairs of values as
points in the coordinate plane 𝑅². The picture sometimes indicates a
relationship between the points, as illustrated in the following examples:
Correlation Coefficient :
Pearson defined 𝑟 so that it has a minimum possible value of −1 and a maximum
possible value of +1. When the sample points lie exactly on a line sloping down to
the right, we say there is perfect negative correlation: 𝑟 = −1. When the sample
points lie exactly on a line sloping up to the right, we say there is perfect
positive correlation: 𝑟 = +1. When the points show no tendency to lie along a
straight line, we say there is no correlation: 𝑟 = 0.
If 𝑟 is near +1 or −1, we say we have high correlation. If 𝑟 is near zero, we say
that we have low correlation.
$$r = \frac{\sum_{i=1}^{n}(x_i-\bar{X})(y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{X})^2 \, \sum_{i=1}^{n}(y_i-\bar{Y})^2}}$$
or
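$$r = \frac{\sum_{i=1}^{n} x_i y_i - \left(\sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i\right)/n}{\sqrt{\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right]\left[\sum_{i=1}^{n} y_i^2 - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}\right]}}$$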
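The two forms of 𝑟 are algebraically equivalent. A minimal Python sketch, assuming the data arrive as two equal-length lists, can be used to check this. The function names and the small data sets are made up for illustration and are not part of the lecture.

```python
import math

def pearson_r_deviation(xs, ys):
    """Pearson r from deviations about the sample means (first form)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - x_bar) ** 2 for x in xs)
                    * sum((y - y_bar) ** 2 for y in ys))
    return num / den

def pearson_r_sums(xs, ys):
    """Pearson r from raw sums (second, computational form)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    num = sxy - sx * sy / n
    den = math.sqrt((sxx - sx ** 2 / n) * (syy - sy ** 2 / n))
    return num / den

# Illustrative data only (not from the lecture).
up = ([1, 2, 3, 4], [3, 5, 7, 9])    # points exactly on a line sloping up
down = ([1, 2, 3, 4], [9, 7, 5, 3])  # points exactly on a line sloping down

print(pearson_r_deviation(*up), pearson_r_sums(*up))
print(pearson_r_deviation(*down), pearson_r_sums(*down))
```

Up to floating-point rounding, the first pair should give the perfect positive correlation described above and the second pair the perfect negative one.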
Regression Line :
A regression line is a straight line that describes how a response variable
y changes as an explanatory variable x changes. We often use a
regression line to predict the value of y for a given value of x.
$$y = a + bx$$
$$b = \frac{\sum_{i=1}^{n} x_i y_i - n\bar{X}\bar{Y}}{\sum_{i=1}^{n} x_i^2 - n\bar{X}^2}$$
$$a = \bar{Y} - b\bar{X}$$
$$\sum_{i=1}^{n} x_i y_i = 711, \quad \sum_{i=1}^{n} x_i = 47, \quad \bar{X} = 47/6, \quad \sum_{i=1}^{n} y_i = 79, \quad \bar{Y} = 79/6, \quad \sum_{i=1}^{n} x_i^2 = 423$$
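As a rough check, the slope and intercept formulas can be applied directly to the sums quoted above. The sketch below assumes 𝑛 = 6, which is inferred from the mean 47/6 rather than stated in the text, and uses exact fractions to avoid rounding. The variable names are illustrative.

```python
from fractions import Fraction

# Summary sums quoted in the text; n = 6 is an inference from X-bar = 47/6.
n = 6
sum_xy = 711
sum_x = 47
sum_y = 79
sum_x2 = 423

x_bar = Fraction(sum_x, n)
y_bar = Fraction(sum_y, n)

# b = (sum x_i y_i - n * X-bar * Y-bar) / (sum x_i^2 - n * X-bar^2)
b = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
# a = Y-bar - b * X-bar
a = y_bar - b * x_bar

print("b =", b, "which is about", float(b))
print("a =", a)
```

Under that assumption the sums give 𝑏 = 79/47 (about 1.68) and 𝑎 = 0, so the fitted line passes through the origin.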
Example 1: Use the method of least squares to fit a straight line to the accompanying data points.
Give the estimates of 𝛽0 and 𝛽1. Plot the points and sketch the fitted least-squares line. The observed
data values are given in the following table.

𝑥    −1    0    2   −2    5    6    8   11   12   −3
𝑦    −5   −4    2   −7    6    9   13   21   20   −9
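$$S_{xx} = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} = 408 - \frac{(38)^2}{10} = 263.6$$
$$S_{xy} = \sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n} = 709 - \frac{(38)(46)}{10} = 534.2$$
$$\bar{x} = 3.8, \quad \bar{y} = 4.6$$
Therefore,
$$\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} = \frac{534.2}{263.6} = 2.0266$$
and
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 4.6 - (2.0266)(3.8) = -3.1011$$
Hence, the least-squares line for these data is
$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i = -3.1011 + 2.0266\,x_i$$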
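The arithmetic in Example 1 can be reproduced with a short Python sketch that works from the data table. The function name `fit_line` is hypothetical. Note that carrying full precision gives an intercept of about −3.1009, while the worked solution gets −3.1011 because it rounds the slope to 2.0266 before multiplying.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope via S_xx and S_xy."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    b1 = s_xy / s_xx                 # slope estimate
    b0 = sum_y / n - b1 * sum_x / n  # intercept estimate
    return b0, b1, s_xx, s_xy

# Data from the Example 1 table.
x = [-1, 0, 2, -2, 5, 6, 8, 11, 12, -3]
y = [-5, -4, 2, -7, 6, 9, 13, 21, 20, -9]

b0, b1, s_xx, s_xy = fit_line(x, y)
print(f"S_xx = {s_xx:.1f}, S_xy = {s_xy:.1f}")
print(f"slope = {b1:.4f}, intercept = {b0:.4f}")
```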
EXAMPLE: Fit a least-squares line for the following data. Also find the trend values
(ŷ) and show that Σ(𝑦 − ŷ) = 0.

𝑥    1    2    3    4    5
𝑦    2    5    3    8    7
H.W
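As a hint for the last part, here is a short sketch of why the residuals of any least-squares line sum to zero. It uses only the intercept formula from the Regression Line section, with the trend values written as ŷ_i = a + b x_i:

$$\sum_{i=1}^{n}(y_i - \hat{y}_i) = \sum_{i=1}^{n}(y_i - a - b x_i) = n\bar{Y} - na - bn\bar{X} = n\left(\bar{Y} - a - b\bar{X}\right) = 0$$

The last step follows because a = Ȳ − bX̄, so the bracket vanishes regardless of the data.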