Chapter 2

2.7 Properties of the OLS Estimator


If assumptions 1--4 hold, then the estimators ˆα and ˆβ determined by OLS will
have a number of desirable properties, and are known as Best Linear Unbiased
Estimators (BLUE).
● 'Estimator' -- ˆα and ˆβ are estimators of the true values of α and β.
● 'Linear' -- ˆα and ˆβ are linear estimators; that is, the formulae for ˆα
and ˆβ are linear combinations of the random variables (in this case, y).
● 'Unbiased' -- on average, the actual values of ˆα and ˆβ will be equal to their
true values.
● 'Best' -- the OLS estimator ˆβ has minimum variance among the
class of linear unbiased estimators.
• The Gauss--Markov theorem proves that the OLS estimator is best by
examining an arbitrary alternative linear unbiased estimator and showing
that in all cases it must have a variance no smaller than that of the OLS
estimator.
Under assumptions 1--4 listed above, the OLS estimator can be shown to have
the desirable properties that it is consistent, unbiased and efficient.
Unbiasedness and efficiency have already been discussed above, and
consistency is an additional desirable property. These three characteristics will
now be discussed in turn.
1. Consistency
• The least squares estimators ˆα and ˆβ are consistent.
• Algebraic statement of ˆβ's consistency:

  lim(T→∞) Pr[ |ˆβ − β| > δ ] = 0, for any δ > 0   (Eq. 2.17)

• Consistency implies:
✓ As the sample size (T) tends to infinity, the probability that ˆβ lies more
than δ away from its true value tends to zero.
✓ In other words, as the number of observations increases, the estimator
converges to its true value.
• Consistency is an asymptotic property:
✓ It holds in the limit, as the sample size approaches infinity.
• Assumptions for consistency:
✓ The assumptions E(xt ut) = 0 and var(ut) = σ² < ∞ are sufficient to derive
the consistency of the OLS estimator.
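A minimal Monte Carlo sketch (not from the original notes) can illustrate
consistency; the data-generating process, sample sizes and seed below are
illustrative assumptions only:

```python
# Consistency sketch: under an assumed DGP with true beta = 0.5, the OLS
# slope estimate settles ever closer to 0.5 as the sample size T grows.
import numpy as np

rng = np.random.default_rng(42)
alpha, beta = 1.0, 0.5

for T in (25, 100, 1_000, 10_000):
    x = rng.uniform(0, 10, T)
    u = rng.normal(0, 2, T)          # disturbances with finite variance
    y = alpha + beta * x + u
    beta_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    print(f"T = {T:6d}: beta_hat = {beta_hat:.4f}")
```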
2. Unbiasedness
• The least squares estimators ˆα and ˆβ are unbiased.
✓ Unbiasedness implies:
E(ˆα) = α (Eq. 2.18)
E(ˆβ) = β (Eq. 2.19)
✓ On average, the estimated coefficients are equal to their true values.
✓ No systematic overestimation or underestimation of the true coefficients.
• Assumption for unbiasedness:
✓ It requires the assumption that cov(ut, xt) = 0.
• Unbiasedness is a strong condition:
✓ It holds for all sample sizes, both small and large.
• Unbiasedness vs. Consistency:
✓ Unbiasedness is stronger than consistency because it holds for all sample
sizes, not just for large samples.
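The distinction can be seen in a small simulation sketch (assumed DGP with
true β = 0.5; all numbers illustrative): even with a fixed, small sample
size, the estimates average out to the true value.

```python
# Unbiasedness sketch: average beta_hat over many independent samples of a
# small, fixed size T; the mean is close to the true beta = 0.5.
import numpy as np

rng = np.random.default_rng(0)
alpha, beta, T, reps = 1.0, 0.5, 20, 50_000

estimates = np.empty(reps)
for r in range(reps):
    x = rng.uniform(0, 10, T)
    u = rng.normal(0, 2, T)
    y = alpha + beta * x + u
    estimates[r] = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)

print(f"mean of beta_hat over {reps:,} samples: {estimates.mean():.4f}")
```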
3. Efficiency
✓ An estimator ˆβ of a parameter β is efficient if it is unbiased and no other
unbiased estimator has a smaller variance.
• Efficiency implies:
✓ It minimizes the probability of being far from the true value of β.
✓ The estimator is considered the "best" when it comes to minimizing
uncertainty in estimation.
• Efficiency among linear unbiased estimators:
✓ It minimizes uncertainty within the class of linear unbiased estimators.
• Technical statement of efficiency:
✓ An efficient estimator's probability distribution is narrowly dispersed around
the true value.
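A simulation sketch can make the Gauss--Markov claim concrete. The
"endpoints" slope estimator below is a hypothetical linear unbiased
competitor, and the DGP is an illustrative assumption:

```python
# Efficiency sketch: compare the sampling variance of OLS with that of a
# hypothetical linear unbiased alternative (the line through the first and
# last data points). Both are unbiased; OLS has the smaller variance.
import numpy as np

rng = np.random.default_rng(1)
alpha, beta, T, reps = 1.0, 0.5, 50, 20_000
x = np.linspace(0, 10, T)                        # fixed regressor values

ols = np.empty(reps)
endpoints = np.empty(reps)
for r in range(reps):
    y = alpha + beta * x + rng.normal(0, 2, T)
    ols[r] = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    endpoints[r] = (y[-1] - y[0]) / (x[-1] - x[0])

print(f"var(OLS slope)      = {ols.var():.5f}")
print(f"var(endpoint slope) = {endpoints.var():.5f}")  # noticeably larger
```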
_________End_________

2.8 Precision and standard errors


• Specificity to the Sample:
-Regression estimates ˆα and ˆβ are specific to the sample used for their
estimation.
-Different samples from the population will lead to different data points (xt
and yt) and, consequently, different OLS estimates.
• Desirable measure: Reliability and Precision
-It is desirable to assess the reliability or precision of the estimators (ˆα
and ˆβ).
• Confidence in estimates:
-Knowing the reliability helps determine whether there is confidence in the
estimates.
• Variability across samples:
-Understanding the sampling variability helps gauge how much the
estimates (ˆα and ˆβ) are likely to vary from one sample to another within
the given population.
• Standard Error:
-The standard error provides a measure of the precision of the estimates.
• Valid estimators:
-Given assumptions 1--4, valid estimators of the standard errors can be
calculated as

  SE(ˆα) = s √[ Σxt² / ( T Σ(xt − x̄)² ) ]
  SE(ˆβ) = s √[ 1 / Σ(xt − x̄)² ]

✓ where s is the estimated standard deviation of the residuals (see below).


✓ It is worth noting that the standard errors give only a general indication of the
likely accuracy of the regression parameters.
✓ They do not show how accurate a particular set of coefficient estimates is.
✓ If the standard errors are small, it shows that the coefficients are likely to be
precise on average, not how precise they are for this particular sample.
✓ Thus standard errors give a measure of the degree of uncertainty in the
estimated coefficient values.
✓ It can be seen that they are a function of the actual observations on the
explanatory variable, x, the sample size, T, and another term, s.
✓ The last of these is an estimate of the variance of the disturbance term.
✓ The actual variance of the disturbance term is usually denoted by σ².
✓ How can an estimate of σ² be obtained?
i) Estimating the variance of the error term (σ²)
✓ From elementary statistics, the variance of a random variable ut is given by

  var(ut) = E[ (ut − E(ut))² ]   (2.22)

✓ Assumption 1 of the CLRM was that the expected or average value of the
errors is zero. Under this assumption, (2.22) above reduces to

  var(ut) = E[ ut² ]   (2.23)
✓ So what is required is an estimate of the average value of ut², which could
be calculated as

  ˆσ² = (1/T) Σ ut²   (2.24)

✓ Unfortunately, (2.24) is not workable, since ut is a series of population
disturbances, which is not observable.
✓ Thus the sample counterpart to ut, which is ˆut, is used:

  ˆσ² = (1/T) Σ ˆut²   (2.25)
✓ But this estimator is a biased estimator of σ². An unbiased estimator, s²,
would be given by the following equation instead of the previous one:

  s² = Σ ˆut² / (T − 2)   (2.26)
✓ s is also known as the standard error of the regression or the standard error
of the estimate.
✓ It is sometimes used as a broad measure of the fit of the regression equation.
✓ Everything else being equal, the smaller this quantity is, the closer is the fit of
the line to the actual data
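The pieces above fit together as in the following sketch (the data are made
up purely for illustration):

```python
# Sketch: OLS "by hand" on made-up data, then s^2 = sum(u_hat^2)/(T - 2),
# s, and the standard errors SE(alpha_hat) and SE(beta_hat).
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([1.8, 2.1, 2.9, 3.3, 4.2, 4.4, 5.1, 5.3])
T = len(y)

beta_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
alpha_hat = y.mean() - beta_hat * x.mean()

u_hat = y - (alpha_hat + beta_hat * x)       # residuals
s2 = np.sum(u_hat ** 2) / (T - 2)            # unbiased estimate of sigma^2
s = np.sqrt(s2)                              # standard error of the regression

se_alpha = s * np.sqrt(np.sum(x ** 2) / (T * np.sum((x - x.mean()) ** 2)))
se_beta = s * np.sqrt(1.0 / np.sum((x - x.mean()) ** 2))

print(f"alpha_hat = {alpha_hat:.4f}  (SE = {se_alpha:.4f})")
print(f"beta_hat  = {beta_hat:.4f}  (SE = {se_beta:.4f})")
```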
ii) Some comments on the standard error estimators
(1) Larger sample size (T) → smaller coefficient standard errors.
✓ T affects SE(ˆα) explicitly and SE(ˆβ) implicitly.
✓ More information from a larger sample leads to increased confidence in the
estimates.
(2) Both SE(ˆα) and SE(ˆβ) depend on s² (or s), the estimate of the error
variance.
✓ Larger s² → more dispersed residuals → greater uncertainty in the model.
(3) The sum of squares Σ(xt − x̄)² appears in both formulae:
✓ Larger sum of squares → smaller coefficient variances.
✓ Figure 2.7: small Σ(xt − x̄)² → difficult to determine the line's position.
✓ Figure 2.8: large Σ(xt − x̄)² → more confidence in the estimates.
(4) The term Σxt² affects only the intercept standard error, not the slope
standard error.
✓ Σxt² measures how far the points are from the y-axis.
✓ Figure 2.9: points far from the y-axis → difficult to estimate the intercept
accurately.
✓ Figure 2.10: points closer to the y-axis → easier to determine where the line
crosses the y-axis.
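Comments (1) and (3) can be checked numerically with a short sketch
(assumed DGP; the specific grids of x values are illustrative):

```python
# SE(beta_hat) falls both when T grows and when the x values are more
# spread out, exactly as the formula s * sqrt(1 / sum((x - xbar)^2)) implies.
import numpy as np

def se_beta(x, sigma_u=2.0, seed=0):
    rng = np.random.default_rng(seed)
    y = 1.0 + 0.5 * x + rng.normal(0, sigma_u, len(x))
    b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    a = y.mean() - b * x.mean()
    u_hat = y - (a + b * x)
    s = np.sqrt(np.sum(u_hat ** 2) / (len(x) - 2))
    return s * np.sqrt(1.0 / np.sum((x - x.mean()) ** 2))

print(se_beta(np.linspace(0, 10, 20)))    # baseline
print(se_beta(np.linspace(0, 10, 200)))   # larger T  -> smaller SE
print(se_beta(np.linspace(0, 100, 20)))   # wider x   -> smaller SE
```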
_________end_________
✓ ˆβ = 0.5091 is a single (point) estimate of the unknown population
parameter, β.
✓ As stated above, the reliability of the point estimate is measured by the
coefficient’s standard error.
✓ Sample coefficients and their standard errors are used to make inferences
about population parameters.
✓ Example: The estimate of the slope coefficient is ˆβ = 0.5091, but it is
expected to vary from one sample to another.
✓ Hypothesis testing:
- Hypothesis testing helps answer questions about the plausibility of
population parameters based on sample estimates.
_________end_________
❖ Hypothesis testing: some concepts
Hypothesis Testing Framework:
✓ Hypothesis testing involves two hypotheses that go together.
✓ Null Hypothesis (H0 or HN):
-The null hypothesis is the statement or statistical hypothesis being tested.
✓ Alternative Hypothesis (H1 or HA):
-The alternative hypothesis represents the remaining outcomes of interest.
✓ Both hypotheses are essential for hypothesis testing to compare and assess
the evidence in favor of one over the other.
✓ Example of Hypothesis Testing:
• Hypothesis: Testing the true value of β = 0.5 using the regression results
above.
• Null Hypothesis (H0):
H0: β = 0.5
The null hypothesis states that the true, but unknown value of β is 0.5.
• Alternative Hypothesis (H1):
H1: β ≠ 0.5
The alternative hypothesis represents the remaining outcomes where β is not
equal to 0.5.
• Two-Sided Test:
This is a two-sided test since the alternative hypothesis includes both
possibilities: β < 0.5 and β > 0.5.

✓ Example of One-Sided Hypothesis Testing:

• Hypothesis: Testing the true value of β = 0.5 using the regression results
above, with prior information suggesting β > 0.5.
• Null Hypothesis (H0):
H0: β = 0.5
The null hypothesis states that the true, but unknown value of β is 0.5.
• Alternative Hypothesis (H1):
H1: β > 0.5
The one-sided alternative hypothesis suggests that β is more than 0.5.
• One-Sided Test:
• This is a one-sided test because the alternative hypothesis only considers the
possibility of β being greater than 0.5, and β < 0.5 is no longer of interest in
this context.
✓ Two Ways to Conduct a Hypothesis Test:
1. Test of Significance Approach:
• The test of significance approach involves statistical comparison of the
estimated coefficient value and its value under the null hypothesis.
• If the estimated value is significantly different from the hypothesized value,
the null hypothesis is likely to be rejected.
2.Confidence Interval Approach:
• The confidence interval approach also compares the estimated coefficient
value with its value under the null hypothesis.
• If the value under the null hypothesis falls within the confidence interval, the
null hypothesis is less likely to be rejected.
✓ Comparison of Estimated and Hypothesized Values:
• In general terms, if the estimated value is far from the hypothesized value,
the null hypothesis is more likely to be rejected.
• Conversely, if the value under the null hypothesis and the estimated value
are close to each other, the null hypothesis is less likely to be rejected.
Example:
• Consider the estimated value ˆβ = 0.5091 from above.
• A null hypothesis that the true value of β is 5 is more likely to be rejected
than a null hypothesis that the true value of β is 0.5,
- because the estimated value is far from 5 but relatively close to 0.5.
What is required now is a statistical decision rule that will permit the formal
testing of such hypotheses.
▪ The probability distribution of the least squares estimators
✓ In order to test hypotheses, assumption 5 of the CLRM must be used,
- namely that ut ∼ N(0, σ²), i.e. that the error term is normally distributed.
✓ The normal distribution is a convenient one to use for it involves only two
parameters (its mean and variance).
✓ This makes the algebra involved in statistical inference considerably simpler
than it otherwise would have been.
✓ Since yt depends partially on ut,
- it can be stated that if ut is normally distributed, yt will also be normally
distributed.
✓ Least squares estimators (ˆβ) are linear combinations of random variables
(yt).
✓ ˆβ = Σ wtyt, where the wt are effectively weights in the regression equation.
✓ The weighted sum of normal random variables (yt) is also normally
distributed.
✓ As a result, the coefficient estimates (ˆα and ˆβ) will also follow a normal
distribution. Thus,

  ˆα ∼ N(α, var(ˆα))   and   ˆβ ∼ N(β, var(ˆβ))
✓ Will the coefficient estimates still follow a normal distribution if the errors do
not follow a normal distribution?
- Well, briefly, the answer is usually ‘yes’,
-provided that the other assumptions of the CLRM hold, and the sample
size is sufficiently large.
✓ Standard normal variables can be constructed from ˆα and ˆβ by subtracting
the mean and dividing by the square root of the variance:

  (ˆα − α)/√var(ˆα) ∼ N(0, 1)   and   (ˆβ − β)/√var(ˆβ) ∼ N(0, 1)
The square roots of the coefficient variances are the standard errors.
✓ Unfortunately, the true variances of the coefficient estimators under the
PRF are never known;
- all that is available are their sample counterparts,
- the calculated standard errors of the coefficient estimates, SE(ˆα) and
SE(ˆβ).
✓ Replacing the true values of the standard errors with the sample estimated
versions induces another source of uncertainty,
- and also means that the standardised statistics follow a t-distribution with T
− 2 degrees of freedom (defined below) rather than a normal distribution, so

  (ˆα − α)/SE(ˆα) ∼ t(T−2)   and   (ˆβ − β)/SE(ˆβ) ∼ t(T−2)
✓ This result is not formally proved here.


▪ A note on the t and the normal distributions

✓ A normal variate can be scaled to have zero mean and unit variance by
subtracting its mean and dividing by its standard deviation.
✓ There is a specific relationship between the t- and the standard normal
distribution, and the t-distribution has another parameter, its degrees of
freedom.
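The relationship is easy to verify numerically; a quick sketch using
scipy.stats (purely illustrative):

```python
# The 5% two-sided critical value of the t-distribution exceeds the N(0, 1)
# value of about 1.96, but converges to it as the degrees of freedom grow.
from scipy import stats

for df in (3, 10, 30, 120):
    print(f"df = {df:3d}: t critical value = {stats.t.ppf(0.975, df):.3f}")
print(f"normal   : z critical value = {stats.norm.ppf(0.975):.3f}")
```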
✓ There are broadly two approaches to testing hypotheses under regression
analysis:
a) the test of significance approach
b) and the confidence interval approach.

a) Conducting a test of significance

(1) Estimate ˆα, ˆβ and SE(ˆα), SE(ˆβ) in the usual way.


(2) Calculate the test statistic. This is given by the formula

  test statistic = (ˆβ − β*) / SE(ˆβ)

✓ where β* is the value of β under the null hypothesis.
✓ The null hypothesis is H0: β = β* and the alternative hypothesis is
H1: β ≠ β* (for a two-sided test).
(3) A tabulated distribution with which to compare the estimated test statistic
is required.
- Test statistics derived in this way can be shown to follow a t-distribution
with T − 2 degrees of freedom.
(4) Choose a ‘significance level’, often denoted α (not the same as the regression
intercept coefficient).
- It is conventional to use a significance level of 5%.

(5) Given a significance level, a rejection region and non-rejection region can be
determined.
-A 5% significance level means that 5% of the total distribution (area under the
curve) will be in the rejection region.
✓ Two-Sided Test Rejection Region:
-In a two-sided test, the 5% rejection region is split equally between the two
tails of the distribution.
✓ One-Sided Test Rejection Region:
-In a one-sided test, the 5% rejection region is located solely in one tail of the
distribution.
-Figure 2.14 shows the rejection region for a test with the alternative of the 'less
than' form.
-Figure 2.15 shows the rejection region for a test with the alternative of the
'greater than' form.
(6) Use the t-tables to obtain a critical value or values with which to compare the
test statistic.
-The critical value will be that value of x that puts 5% into the rejection region.
(7) Finally perform the test.
-If the test statistic lies in the rejection region then reject the null hypothesis
(H0), else do not reject H0.

✓ In Step 2, the estimated value of β is compared to the value under the null
hypothesis.
✓ The difference is "normalised" or scaled by the standard error of the
coefficient estimate.
✓ The standard error measures the confidence in the coefficient estimate from
the first stage.
✓ A small standard error yields a larger test statistic than a large standard
error would, for a given difference between the estimated and hypothesised
values.
✓ A small standard error implies that even a small difference between the
estimated and hypothesised values can lead to rejecting the null hypothesis.
✓ Dividing by the standard error ensures that, under the five CLRM
assumptions, the test statistic follows a tabulated distribution.
✓ This distribution is used to determine the critical values and p-values for
hypothesis testing.
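The seven steps can be put together in a short worked sketch using
ˆβ = 0.5091 from the notes; the standard error (0.2561) and sample size
(T = 22) are assumptions made here purely for illustration:

```python
# Test of significance sketch for H0: beta = 0.5 vs H1: beta != 0.5 at the
# 5% level. beta_hat is from the notes; SE and T are illustrative.
from scipy import stats

beta_hat, beta_star = 0.5091, 0.5
se_beta, T = 0.2561, 22

t_stat = (beta_hat - beta_star) / se_beta    # step (2): normalised difference
t_crit = stats.t.ppf(0.975, T - 2)           # steps (4)-(6): 5%, two-sided

print(f"test statistic = {t_stat:.4f}, critical values = +/-{t_crit:.4f}")
print("Reject H0" if abs(t_stat) > t_crit else "Do not reject H0")  # step (7)
```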
b) The confidence interval approach to hypothesis testing

❖ The test of significance and confidence interval approaches always give
the same conclusion
✓ Under the test of significance approach, the null hypothesis that β = β* will
not be rejected if the test statistic lies within the non-rejection region,
✓ i.e. if the following condition holds (where ±tcrit are the critical values
from the t-distribution):

  −tcrit ≤ (ˆβ − β*) / SE(ˆβ) ≤ +tcrit

✓ Rearranging, the null hypothesis would not be rejected if

  ˆβ − tcrit · SE(ˆβ) ≤ β* ≤ ˆβ + tcrit · SE(ˆβ)
✓ But this is just the rule for non-rejection under the confidence interval
approach.
✓ So it will always be the case that, for a given significance level, the test of
significance and confidence interval approaches will provide the same
conclusion by construction.
✓ One testing approach is simply an algebraic rearrangement of the other.
