Group 3


SPECIFICATION OF REGRESSION MODELS & ESTIMATION OF PARAMETERS

Group presentation by:
Gurliv 26037
Guneet 26030
Yashvi 26028
Samarjot 26004
Sunita 26025
REGRESSION ANALYSIS

Regression analysis is a statistical method used to examine the relationship between one dependent variable (usually denoted as Y) and one or more independent variables (usually denoted as X1, X2, ..., Xn). The purpose of regression analysis is to understand how changes in the independent variables are associated with changes in the dependent variable.

Importance in Statistical Modelling:
Relationship Identification
Control
Prediction
Variable Selection
Hypothesis Testing
Forecasting

TYPES OF REGRESSION MODELS

Linear regression model
A linear regression model is a specific form of regression analysis where the relationship between the dependent variable and one or more independent variables is assumed to be linear.
Mathematical expression:
Y = β0 + β1X1 + β2X2 + ... + βnXn + ε
where:
Y is the dependent variable.
X1, X2, ..., Xn are independent variables.
β0 is the intercept.
β1, β2, ..., βn are coefficients.
ε is the error term, capturing the difference between observed and predicted Y values.

Nonlinear regression model
A nonlinear regression model is a type of regression analysis where the relationship between the dependent variable and the independent variables is not assumed to be linear. It allows for more flexible functional forms, such as exponential, logarithmic, polynomial, or other nonlinear relationships.
Mathematical expression:
Y = f(X1, X2, ..., Xn, β) + ε
where:
Y is the dependent variable.
X1, X2, ..., Xn are independent variables.
β are the parameters of the nonlinear function f.
f is a nonlinear function of the independent variables and parameters.
ε is the error term, capturing the difference between observed and predicted Y values.
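As a sketch of the linear model above, the coefficients β0, β1, β2 can be estimated by least squares directly with NumPy; the data below are simulated, and the coefficient values are illustrative assumptions:

```python
import numpy as np

# Simulate data where Y depends linearly on two predictors plus noise.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))                 # X1, X2
beta_true = np.array([1.5, 2.0, -0.5])      # beta0, beta1, beta2 (assumed)
Y = beta_true[0] + X @ beta_true[1:] + rng.normal(scale=0.1, size=n)

# Design matrix with a column of ones for the intercept beta0.
A = np.column_stack([np.ones(n), X])
beta_hat, *_ = np.linalg.lstsq(A, Y, rcond=None)
print(np.round(beta_hat, 2))   # close to [1.5, 2.0, -0.5]
```

With 200 observations and small noise, the estimates land very close to the values used to generate the data.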
SPECIFICATION OF REGRESSION MODELS

In regression analysis, specification refers to the crucial step of defining the structure of your model. This involves deciding which variables are independent (explanatory) and which is the dependent (response) variable.
1. CHOOSING THE INDEPENDENT VARIABLES
These are the factors you believe influence the dependent variable. Selecting them involves a mix of theoretical underpinnings, experience, and common sense.

2. FUNCTIONAL FORM
This refers to the mathematical relationship between the independent and dependent variables.

3. STOCHASTIC TERM
This term represents the error in your model: basically, the unexplained variance. Ideally, this term should be random and normally distributed.


Choosing the right specification is vital because it directly affects the accuracy and reliability of your regression analysis. If you include irrelevant variables, omit important ones, or choose an incorrect functional form, your results might be misleading.
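One way to see why specification matters: omitting a relevant variable biases the coefficients that remain. A minimal simulation, with all data and coefficient values hypothetical:

```python
import numpy as np

# Y depends on two correlated predictors; dropping X2 biases beta1.
rng = np.random.default_rng(1)
n = 5000
X1 = rng.normal(size=n)
X2 = 0.8 * X1 + rng.normal(scale=0.6, size=n)   # X2 correlated with X1
Y = 1.0 + 2.0 * X1 + 3.0 * X2 + rng.normal(size=n)

def ols(A, y):
    """Closed-form OLS coefficients for design matrix A."""
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

full = ols(np.column_stack([np.ones(n), X1, X2]), Y)   # correct specification
omitted = ols(np.column_stack([np.ones(n), X1]), Y)    # X2 left out

print(full[1])     # near the true value 2.0
print(omitted[1])  # biased upward: absorbs X2's effect (about 2 + 3 * 0.8)
```

The misspecified model attributes part of X2's effect to X1, because X1 is correlated with the omitted variable.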
PARAMETER ESTIMATION

Parameter estimation is the process of using sample data to estimate the parameters of a statistical model that describes a population. Parameters are numerical characteristics that define the population distribution or the relationship between variables.
PARAMETER ESTIMATION METHODS

1. Ordinary least squares estimation (OLS)
2. Maximum likelihood estimation (MLE)
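A minimal sketch of OLS in closed form, beta = (X'X)^(-1) X'y, using simulated data. Under normally distributed errors, the maximum likelihood estimate of the coefficients coincides with OLS, while the MLE of the error variance is the mean squared residual (dividing by n rather than n - k):

```python
import numpy as np

# Simulated data for a one-predictor linear model (values are illustrative).
rng = np.random.default_rng(2)
n = 300
x = rng.uniform(0, 10, size=n)
y = 4.0 + 0.7 * x + rng.normal(scale=1.0, size=n)

X = np.column_stack([np.ones(n), x])           # design matrix
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)   # (X'X)^-1 X'y

# Normal-errors MLE: same beta; sigma^2 estimated as the mean
# squared residual (MLE divides by n, not n - k).
resid = y - X @ beta_ols
sigma2_mle = np.mean(resid**2)
print(beta_ols, sigma2_mle)
```

Both estimates recover values close to the intercept 4.0, slope 0.7, and error variance 1.0 used to generate the data.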
INTERPRETATION OF ESTIMATED PARAMETERS

The interpretation of estimated parameters depends on the context of the statistical model being used. Here's how parameters are typically interpreted:

1. Linear Regression:
Intercept (β̂0): Represents the estimated value of the dependent variable when all independent variables are zero. In some cases, this may not have a meaningful interpretation, especially if the independent variables cannot logically be zero.
Coefficients (β̂1, β̂2, ...): Represent the estimated change in the dependent variable for a one-unit change in the corresponding independent variable, holding all other variables constant.
2. Logistic Regression:
Coefficients: Represent the estimated log-odds of the occurrence of the event of interest for a one-unit change in the corresponding independent variable, holding all other variables constant. Exponentiating these coefficients gives the odds ratio interpretation.
3. Poisson Regression:
Coefficients: Represent the estimated log-relative rate of occurrence of events for a one-unit change in the corresponding independent variable, holding all other variables constant.
4. Exponential Distribution:
Parameter (λ): Represents the estimated rate at which events occur. It indicates the average number of events occurring per unit of time or space.
5. Normal Distribution:
Mean (μ) and Standard Deviation (σ): Represent the estimated location and spread of the data, respectively. The mean is the center of the distribution, while the standard deviation measures the dispersion or variability of the data.
6. Generalized Linear Models (GLMs):
The interpretation of parameters in GLMs depends on the specific link function used. The coefficients typically represent the estimated change in the response variable for a one-unit change in the corresponding predictor variable, adjusted for other variables in the model.
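The odds-ratio interpretation for logistic regression can be checked numerically: exponentiating a coefficient gives exactly the factor by which the odds change for a one-unit increase in the predictor. The coefficient values below are hypothetical:

```python
import numpy as np

# Suppose a fitted logistic model has beta1 = 0.69 for predictor X1.
beta1 = 0.69
odds_ratio = np.exp(beta1)   # about 2: odds double per unit of X1

# Verify against the probabilities directly: odds(p) = p / (1 - p).
def prob(x, beta0=-1.0, beta1=0.69):
    """P(event | x) under a logistic model with hypothetical betas."""
    return 1.0 / (1.0 + np.exp(-(beta0 + beta1 * x)))

def odds(p):
    return p / (1 - p)

ratio = odds(prob(1.0)) / odds(prob(0.0))
print(round(odds_ratio, 3), round(ratio, 3))   # the two agree
```

The agreement is exact regardless of the intercept, which is why coefficients (not probabilities) have this clean multiplicative reading.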
Robust regression methods
Regression analysis is a powerful tool for modeling relationships between variables.
However, traditional regression techniques can be sensitive to outliers and violations
of assumptions such as normality and homoscedasticity. Robust regression methods
offer solutions to these challenges by providing more reliable estimates even in the
presence of outliers and non-normal data distributions. Two commonly used robust
regression methods are Huber regression and M-estimation.

1. Huber Regression
2. M-Estimation
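A minimal sketch of Huber M-estimation via iteratively reweighted least squares (IRLS), on simulated data with injected outliers; the tuning constant delta = 1.0 and all data values are illustrative choices:

```python
import numpy as np

# Simulated line y = 1 + 2x with a few gross outliers added.
rng = np.random.default_rng(3)
n = 100
x = rng.uniform(-3, 3, size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.3, size=n)
y[:5] += 15.0                      # inject outliers

X = np.column_stack([np.ones(n), x])

def huber_fit(X, y, delta=1.0, iters=50):
    """Huber M-estimate via IRLS, starting from the OLS solution."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(iters):
        r = y - X @ beta
        # Huber weights: 1 inside |r| <= delta, delta/|r| outside,
        # which downweights large residuals instead of squaring them.
        w = np.where(np.abs(r) <= delta,
                     1.0, delta / np.maximum(np.abs(r), 1e-12))
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)[0]
    return beta

beta_huber = huber_fit(X, y)
print(beta_huber)   # stays close to the true (1, 2) despite the outliers
```

An ordinary least squares fit on the same data is pulled toward the outliers, while the Huber weights cap their influence.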


Nonparametric Regression
In many real-world scenarios, the relationship between variables
may not adhere to a specific parametric form assumed by traditional
regression models. Nonparametric regression methods offer flexible
alternatives that can capture complex relationships without
imposing strict assumptions.
1. Kernel Regression
Kernel regression is a nonparametric technique that estimates the conditional expectation of a response variable given predictor variables. It uses a kernel function to assign weights to neighboring data points, with closer points receiving higher weights. By incorporating information from nearby data points, kernel regression provides a smooth estimate of the underlying relationship between variables. Common kernel functions include Gaussian, Epanechnikov, and uniform kernels.

2. Locally Weighted Regression
Locally weighted regression, also known as LOESS (locally estimated scatterplot smoothing), is another nonparametric regression method that emphasizes local relationships between variables. Unlike global regression models that fit a single function to the entire dataset, locally weighted regression fits a separate regression model at each data point, with nearby points receiving higher weights. This approach allows locally weighted regression to adapt to changes in the underlying data structure, making it particularly effective for modeling nonlinear relationships and handling heteroscedasticity.
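A minimal kernel-regression sketch (the Nadaraya-Watson estimator) with a Gaussian kernel; the bandwidth h = 0.3 and the sine target function are illustrative assumptions:

```python
import numpy as np

# Simulated noisy observations of a smooth nonlinear function.
rng = np.random.default_rng(4)
n = 300
x = np.sort(rng.uniform(0, 2 * np.pi, size=n))
y = np.sin(x) + rng.normal(scale=0.2, size=n)

def kernel_regress(x_query, x, y, h=0.3):
    """Weighted average of y, with Gaussian kernel weights in x:
    closer points get higher weight, controlled by bandwidth h."""
    w = np.exp(-0.5 * ((x_query[:, None] - x[None, :]) / h) ** 2)
    return (w @ y) / w.sum(axis=1)

grid = np.linspace(0.5, 2 * np.pi - 0.5, 50)   # avoid boundary bias
y_hat = kernel_regress(grid, x, y)
print(np.max(np.abs(y_hat - np.sin(grid))))    # small fit error
```

Smaller bandwidths track the data more closely but are noisier; larger bandwidths smooth more but blur sharp features, which is the usual bias-variance trade-off in nonparametric regression.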
