Chapter No 11 (Simple Linear Regression)
Definitions
Pearson product moment correlation coefficient: A numerical measure of the strength of the linear
relationship between two variables is called the Pearson product moment correlation coefficient, also known as the total
correlation or the coefficient of simple correlation. It is denoted by r.
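\[ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}} \]
\[ r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - (\sum X)^2\right]\left[n\sum Y^2 - (\sum Y)^2\right]}} \]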
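\[ r = \pm\sqrt{b_{YX} \times b_{XY}} \]
where r takes the common sign of the two regression coefficients \(b_{YX}\) and \(b_{XY}\).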
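The three expressions for r can be checked against each other numerically. The sketch below is a minimal illustration, assuming a small made-up data set (the `hours` and `marks` lists and all function names are hypothetical, not from the notes). The third function takes the sign of r from the slope, since a square root on its own cannot be negative.

```python
import math

def pearson_r_deviation(x, y):
    """r from deviations about the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_yy = sum((yi - mean_y) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)

def pearson_r_raw_sums(x, y):
    """r from raw sums (the computational formula)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    sum_y2 = sum(yi * yi for yi in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

def pearson_r_from_slopes(x, y):
    """r as the signed geometric mean of the two regression coefficients."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_yy = sum((yi - mean_y) ** 2 for yi in y)
    b_yx = s_xy / s_xx  # slope of the regression of Y on X
    b_xy = s_xy / s_yy  # slope of the regression of X on Y
    return math.copysign(math.sqrt(b_yx * b_xy), b_yx)

# Hypothetical data: hours of study (X) and marks scored (Y).
hours = [1, 2, 3, 4, 5]
marks = [2, 4, 5, 4, 5]

print(round(pearson_r_deviation(hours, marks), 4))    # 0.7746
print(round(pearson_r_raw_sums(hours, marks), 4))     # 0.7746
print(round(pearson_r_from_slopes(hours, marks), 4))  # 0.7746
```

All three agree because they are algebraic rearrangements of the same quantity; the raw-sums form is usually the most convenient for hand calculation.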
Scatter Diagram: The graphical representation of a set of n pairs of bivariate data is called a scatter plot
or scatter diagram. In a scatter diagram, the independent variable X is plotted along the x-axis and the dependent
variable Y along the y-axis. A scatter diagram may show a positive linear relationship, a negative linear relationship, no
relationship, or a curvilinear relationship.
Regression: The dependence of one dependent variable on one or more independent variables is called
regression.
Simple regression: The dependence of one dependent variable on one independent variable is called
simple regression.
Multiple regression: The dependence of one dependent variable on two or more independent variables is called
multiple regression.
Coefficient of determination: The coefficient of determination is denoted by \(R^2\). It is a statistical measure
that indicates the proportion of the variance in the dependent variable that is predictable from the
independent variable. In simple words, it tells us how well the independent variable explains the variability of
the dependent variable. The coefficient of determination lies between 0 and 1.
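\[ R^2 = 1 - \frac{SSE}{SST} \]
where \(SSE = \sum (Y - \hat{Y})^2\) is the sum of squared errors about the regression line and \(SST = \sum (Y - \bar{Y})^2\) is the total sum of squares about the mean.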
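As a rough numerical illustration, the sketch below computes the coefficient of determination for a fitted least squares line, taking SSE as the sum of squared deviations of Y from the fitted values and SST as the sum of squared deviations of Y from its mean (the usual definitions). The data and function names are hypothetical.

```python
def fit_least_squares(x, y):
    """Return intercept a and slope b of the least squares line Y-hat = a + bX."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    b = s_xy / s_xx
    a = mean_y - b * mean_x
    return a, b

def r_squared(x, y):
    """Coefficient of determination, 1 - SSE/SST, for the least squares line."""
    a, b = fit_least_squares(x, y)
    mean_y = sum(y) / len(y)
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))  # unexplained variation
    sst = sum((yi - mean_y) ** 2 for yi in y)                    # total variation
    return 1 - sse / sst

# Hypothetical data: hours of study (X) and marks scored (Y).
hours = [1, 2, 3, 4, 5]
marks = [2, 4, 5, 4, 5]

print(round(r_squared(hours, marks), 4))  # 0.6
```

For this made-up sample, about 60% of the variation in marks is explained by hours of study.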
Properties of the least squares regression line:
• The least squares regression line always passes through the point (\(\bar{X}, \bar{Y}\)), the means of the data.
• The sum of the deviations of the observed values Y from the least squares regression line is always
equal to 0, that is, \(\sum (Y - \hat{Y}) = 0\).
• The sum of the squares of the deviations of the observed values Y from the least squares regression
line is a minimum, that is, \(\sum (Y - \hat{Y})^2 = \text{minimum}\).
• The least squares regression line obtained from a random sample is the line of best fit because a
and b are unbiased estimates of the parameters \(\alpha\) and \(\beta\).
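The first three properties can be checked numerically on a small sample. The following is a minimal sketch, assuming hypothetical data and helper names; the fourth property (unbiasedness) concerns repeated sampling and cannot be verified from a single sample.

```python
def fit_least_squares(x, y):
    """Return intercept a and slope b of the least squares line Y-hat = a + bX."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    b = s_xy / s_xx
    a = mean_y - b * mean_x
    return a, b

def sum_squared_errors(x, y, a, b):
    """Sum of squared deviations of Y from the line a + bX."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical data: hours of study (X) and marks scored (Y).
hours = [1, 2, 3, 4, 5]
marks = [2, 4, 5, 4, 5]

a, b = fit_least_squares(hours, marks)
mean_x = sum(hours) / len(hours)
mean_y = sum(marks) / len(marks)

# Property 1: the fitted value at X-bar equals Y-bar.
passes_through_means = abs((a + b * mean_x) - mean_y) < 1e-9

# Property 2: the residuals sum to zero (up to floating-point rounding).
residual_sum = sum(yi - (a + b * xi) for xi, yi in zip(hours, marks))

# Property 3: nudging the intercept or slope in any direction increases the SSE.
best_sse = sum_squared_errors(hours, marks, a, b)
nearby_sse = [
    sum_squared_errors(hours, marks, a + da, b + db)
    for da in (-0.1, 0.0, 0.1)
    for db in (-0.1, 0.0, 0.1)
    if (da, db) != (0.0, 0.0)
]

print(passes_through_means)                      # True
print(abs(residual_sum) < 1e-9)                  # True
print(all(sse > best_sse for sse in nearby_sse))  # True
```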
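Standard deviation of regression or standard error of estimate: The standard deviation of the observed values Y about the least squares regression line, computed with n − 2 degrees of freedom:
\[ s_{YX} = \sqrt{\frac{\sum (Y - \hat{Y})^2}{n - 2}} \]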
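A short sketch of this calculation follows, using deviations of Y from the fitted values of the least squares line and dividing by n − 2. The data and function names are hypothetical.

```python
import math

def standard_error_of_estimate(x, y):
    """Standard deviation of Y about the least squares line, with n - 2 degrees of freedom."""
    n = len(x)
    if n <= 2:
        raise ValueError("at least three pairs are needed for n - 2 degrees of freedom")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    b = s_xy / s_xx          # slope
    a = mean_y - b * mean_x  # intercept
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(sse / (n - 2))

# Hypothetical data: hours of study (X) and marks scored (Y).
hours = [1, 2, 3, 4, 5]
marks = [2, 4, 5, 4, 5]

print(round(standard_error_of_estimate(hours, marks), 4))  # 0.8944
```

The divisor is n − 2 rather than n because two quantities, the intercept and the slope, are estimated from the same data.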
Regression line: The line or curve around which the points cluster is called the regression line.
Dependent variable: The dependent variable, denoted by Y, is the variable that is being
explained or predicted. It is also called the regressand, the explained variable, the predictand, or the response. For example,
the marks scored in a test.
Independent variable: The independent variable, denoted by X, is the variable used to make
predictions. It is also called the regressor, the explanatory variable, the predictor, or the regression variable. For example, the number of
hours of study.
Simple linear regression: If the simple regression describes the dependence of the expected value of the
dependent variable as a linear function of the independent variable, then the regression is called simple
linear regression.
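In symbols, the definition can be sketched as follows, with \(\alpha\) and \(\beta\) denoting the population parameters and a and b their least squares estimates from a sample. The conditional-expectation notation \(E(Y \mid X)\) is an assumption introduced here for illustration; the formulas for a and b are the standard least squares results.

```latex
E(Y \mid X) = \alpha + \beta X, \qquad
\hat{Y} = a + bX, \qquad
b = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2}, \qquad
a = \bar{Y} - b\bar{X}
```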
Simple linear regression coefficient: The simple linear regression coefficient is the change in the
expected value of the dependent variable for a one-unit increase in the independent (non-random)
variable.
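The interpretation of the coefficient as a per-unit change can be seen directly from a fitted line. The sketch below is a minimal illustration with hypothetical data and names: it fits the least squares line and compares the slope b with the change in the predicted value when X increases by one unit.

```python
def fit_least_squares(x, y):
    """Return intercept a and slope b of the least squares line Y-hat = a + bX."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    b = s_xy / s_xx
    a = mean_y - b * mean_x
    return a, b

# Hypothetical data: hours of study (X) and marks scored (Y).
hours = [1, 2, 3, 4, 5]
marks = [2, 4, 5, 4, 5]

a, b = fit_least_squares(hours, marks)

def predict(x_value):
    """Predicted (expected) Y for a given X on the fitted line."""
    return a + b * x_value

# Increasing X by one unit changes the predicted Y by exactly the slope b.
increase = predict(4) - predict(3)

print(round(b, 4), round(increase, 4))  # 0.6 0.6
```

In this made-up sample, each additional hour of study is associated with an expected gain of 0.6 marks.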