Outline of Econometrics Topics

Understand what “linear in parameters” means, i.e., many nonlinear functions (intrinsically linear) can
be estimated with linear regression analysis.

Be able to put regression coefficients into words (being careful to list each independent variable being
held constant) when the variables are in either level form or log form. Understand the formulas for the
regression slope and standard errors in the two-variable case (for Mid-Term only from an Excel
spreadsheet) and in multiple regression.
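
The two-variable formulas can be checked by hand. The following is a minimal Python sketch, assuming made-up data and a hypothetical helper named simple_ols (neither comes from the course materials). It computes the slope as the sum of cross products over the sum of squares of x, and the slope's standard error as sigmahat divided by the square root of that sum of squares.

```python
import math

def simple_ols(x, y):
    """Two-variable OLS. Returns (intercept, slope, se_slope, sigma_hat)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of squared deviations of x and sum of cross products
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residuals and the standard error of estimate (n - 2 degrees of freedom)
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    sigma_hat = math.sqrt(sum(u ** 2 for u in resid) / (n - 2))
    se_slope = sigma_hat / math.sqrt(sxx)
    return intercept, slope, se_slope, sigma_hat

# Made-up illustrative data (an assumption, not course data)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1, se_b1, s = simple_ols(x, y)
t_stat = b1 / se_b1  # t statistic for H0: slope = 0
```

The same steps map directly onto spreadsheet columns: deviations from means, their squares and cross products, then the residuals.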

Be able to interpret regressions that involve quadratic terms (i.e., x^2 terms).
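
As a worked illustration, with a quadratic term the estimated effect of x on y depends on the level of x. The symbols b_0, b_1 and b_2 below are generic labels assumed for this sketch, not notation taken from the course.

```latex
\hat{y} = b_0 + b_1 x + b_2 x^2
\qquad\Rightarrow\qquad
\frac{\partial \hat{y}}{\partial x} = b_1 + 2 b_2 x ,
\qquad
x^{*} = -\frac{b_1}{2 b_2}
```

Here x^* is the turning point, the value of x at which the estimated effect changes sign.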

Be able to put into words the idea of a coefficient in multiple regression based on partialling out.

Understand assumptions made when ordinary least squares estimation is carried out both written in the
form of individual observations and in matrix form.

Understand how the regression coefficients can be thought of as random variables.

Understand the role of the t distribution for hypothesis testing and confidence interval estimation both
in two variable and in multiple regression.

Understand the Analysis of Variance for Regression table, including the decomposition of the sum of
squares (formulas), degrees of freedom (formulas) and mean squares (variances) formulas. Understand
how to calculate the R^2, the standard error of estimate (sigmahat), the F statistic, the adjusted R^2 (R
bar squared) and the standard deviation of the dependent variable from that table. Put into words the
R^2 and state what the standard error of estimate is an estimate of. Understand when the adjusted
R^2 could be used.
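
The quantities read off the table can be sketched in Python as follows. The function name anova_summary, the argument names, and the numbers are all assumptions made for illustration. Here ss_model is the sum of squares explained by the regression, ss_resid is the residual sum of squares, n is the number of observations and k is the number of slope coefficients.

```python
import math

def anova_summary(ss_model, ss_resid, n, k):
    """Summary statistics derived from an ANOVA-for-regression table."""
    ss_total = ss_model + ss_resid           # decomposition of the sum of squares
    df_model, df_resid, df_total = k, n - k - 1, n - 1
    ms_model = ss_model / df_model           # mean square for the regression
    ms_resid = ss_resid / df_resid           # mean square error (sigmahat squared)
    ms_total = ss_total / df_total           # sample variance of y
    return {
        "r2": ss_model / ss_total,
        "sigma_hat": math.sqrt(ms_resid),
        "f_stat": ms_model / ms_resid,
        "adj_r2": 1 - ms_resid / ms_total,
        "sd_y": math.sqrt(ms_total),
    }

# Made-up table entries: explained SS 3.6, residual SS 2.4, n = 5, k = 1
table = anova_summary(3.6, 2.4, 5, 1)
```

Note that the adjusted R^2 compares two variances (mean squares), whereas the R^2 compares two sums of squares.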

Understand the hypothesis tested with the F statistic generated from the Analysis of Variance for
Regression table.

Understand that the R^2 from a regression could be calculated as the square of the simple correlation
between the dependent variable (y) and the predicted value of the dependent variable (yhat).
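
A small numerical check of this relationship, as a Python sketch on made-up data (the data and the helper name corr are assumptions). The fitted line used below is the OLS fit for these five points.

```python
import math

def corr(a, b):
    """Simple (Pearson) correlation between two equal-length lists."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    sab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    saa = sum((ai - a_bar) ** 2 for ai in a)
    sbb = sum((bi - b_bar) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)

# Made-up data; yhat = 2.2 + 0.6 x is the OLS fit for these points
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
yhat = [2.2 + 0.6 * xi for xi in x]

y_bar = sum(y) / len(y)
ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, yhat))
ss_total = sum((yi - y_bar) ** 2 for yi in y)

r2_from_sums = 1 - ss_resid / ss_total   # usual definition
r2_from_corr = corr(y, yhat) ** 2        # squared correlation of y and yhat
```

The two numbers agree because the fit comes from OLS with an intercept; for an arbitrary fitted line they need not.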

Understand how to find an R^2 from a function with the dependent variable in log form that can be
compared to a function with the dependent variable in level form.

Understand how to think about the residuals (uhats) as that part of the dependent variable not
explained by the linear relationship with the independent variables.

Understand how the standard error of both the mean and individual prediction is calculated in both two
variable regression and multiple regression (formulas). Compute and put into words a confidence interval
for both a mean prediction and an individual prediction.
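
For the two-variable case, the textbook formulas can be written as below. This is a sketch in generic notation assumed here: x_0 is the value of x at which the prediction is made, and \hat{\sigma} is the standard error of estimate.

```latex
\begin{aligned}
\operatorname{se}(\text{mean prediction})
  &= \hat{\sigma}\sqrt{\frac{1}{n} + \frac{(x_0-\bar{x})^2}{\sum_i (x_i-\bar{x})^2}} \\
\operatorname{se}(\text{individual prediction})
  &= \hat{\sigma}\sqrt{1 + \frac{1}{n} + \frac{(x_0-\bar{x})^2}{\sum_i (x_i-\bar{x})^2}} \\
\text{confidence interval:}\quad
  & \hat{y}_0 \pm t_{n-2,\,\alpha/2}\cdot \operatorname{se}
\end{aligned}
```

The extra 1 under the square root reflects the variance of the individual error term, which is why the interval for an individual prediction is always wider than the interval for the mean.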

Understand how to compute omitted variable bias based on excluded independent variables that should
have been included.
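
For the simplest case of one included and one omitted variable, the calculation can be sketched as follows. The notation is assumed for illustration: tildes mark estimates from the regressions that leave x_2 out, and expectations are conditional on the observed x values.

```latex
\begin{aligned}
\text{True model:}\quad & y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u \\
\text{Auxiliary regression of } x_2 \text{ on } x_1\text{:}\quad
  & \tilde{x}_2 = \tilde{\delta}_0 + \tilde{\delta}_1 x_1 \\
\text{Short regression slope:}\quad
  & E(\tilde{\beta}_1) = \beta_1 + \beta_2 \tilde{\delta}_1 \\
\text{Bias:}\quad
  & E(\tilde{\beta}_1) - \beta_1 = \beta_2 \tilde{\delta}_1
\end{aligned}
```

The sign of the bias is the sign of \beta_2 times the sign of the relationship between x_1 and x_2; the bias is zero if either is zero.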

Understand alternative formulas for calculating the variances of regression coefficients in multiple
regression including the calculation and understanding of a VIF (variance inflation factor). This can help
understand the issue of multicollinearity.
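
One common way to write the variance of a slope coefficient is sigma^2 / (SST_j (1 - R_j^2)), where SST_j is the total variation in x_j and R_j^2 comes from regressing x_j on the other independent variables; the VIF is 1 / (1 - R_j^2). The Python sketch below covers only the special case of exactly two regressors, where R_j^2 equals their squared simple correlation. The function name and data are assumptions.

```python
def vif_two_regressors(x1, x2):
    """VIF when the model has exactly two independent variables.

    With two regressors, R_j^2 from regressing one on the other equals
    their squared simple correlation, so both share the same VIF.
    """
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    r2_j = s12 ** 2 / (s11 * s22)
    return 1 / (1 - r2_j)

# Made-up regressors with simple correlation 0.8
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
vif = vif_two_regressors(x1, x2)
```

A VIF of about 2.8 here means the variance of each slope is about 2.8 times what it would be if the two regressors were uncorrelated.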

Be able to read Mata output for calculation and interpretation of: betahat, variance covariance matrix of
betahat, analysis of variance for regression table, standard error of prediction, and betahat as a linear
combination of the dependent variable data. (Mid-term Exam only)

Understand how to test hypotheses involving multiple restrictions on regression slope parameters (F test).
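
A standard form of this F statistic is sketched below. The notation is assumed here: SSR denotes the sum of squared residuals, the subscripts r and ur mark the restricted and unrestricted models, q is the number of restrictions, and k is the number of slope parameters in the unrestricted model.

```latex
F = \frac{(SSR_{r} - SSR_{ur})/q}{SSR_{ur}/(n-k-1)}
\;\sim\; F_{q,\;n-k-1} \quad \text{under } H_0
```

A large F means that imposing the restrictions raises the residual sum of squares by more than chance alone would suggest.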

Be able to put into words the idea of an estimator being unbiased; being consistent.

Understand how to estimate and interpret “beta coefficients” where change is measured in standard
deviation units.
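
One way to obtain a beta coefficient from an ordinary slope is to rescale it by the ratio of standard deviations. A minimal Python sketch, with a hypothetical function name and made-up data:

```python
import statistics

def beta_coefficient(b_j, x_j, y):
    """Standardized slope: change in y, in standard deviations of y,
    for a one standard deviation change in x_j."""
    return b_j * statistics.stdev(x_j) / statistics.stdev(y)

# Made-up data; 0.6 is the OLS slope of y on x for these points
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
beta = beta_coefficient(0.6, x, y)
```

Equivalently, the same number results from regressing the standardized (z-score) version of y on the standardized versions of the independent variables.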

Understand how to interpret regressions that involve categorical (dummy) variables both for two
category variables (e.g., male, female), or multiple category variables (e.g., full, associate, assistant,

Understand how to compute and interpret interactive variables between two numeric variables,
between two categorical (dummy) variables, and between a numeric (non-categorical) and a categorical
variable.

Understand how to compute percent change when the dependent variable is in log form and
independent variable is in level form for either a unit change in the independent variable (useful with
categorical variables) or instantaneous change (approximate).
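
The two calculations can be compared directly. In this Python sketch the function names and the coefficient value 0.20 are assumptions chosen for illustration; b is the coefficient on a level-form variable in a regression whose dependent variable is in log form.

```python
import math

def exact_pct_change(b):
    """Percent change in y for a one-unit change in x: 100 * (exp(b) - 1)."""
    return 100 * (math.exp(b) - 1)

def approx_pct_change(b):
    """Instantaneous (approximate) percent change: 100 * b."""
    return 100 * b

b = 0.20                      # hypothetical coefficient, e.g. on a dummy variable
exact = exact_pct_change(b)   # about 22.1 percent
approx = approx_pct_change(b) # 20 percent
```

The approximation is close for small coefficients and drifts further from the exact figure as the coefficient grows in absolute value.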

Understand how to test the equality of regression coefficients between two groups (e.g., male, female) or
two time periods. This F test is called the Chow test.
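
A common form of the Chow statistic is sketched below. The notation is assumed here: SSR_P is the sum of squared residuals from the pooled regression, SSR_1 and SSR_2 come from the separate regressions for each group or period, k is the number of slope parameters, and n is the total number of observations.

```latex
F = \frac{\left[SSR_{P} - (SSR_{1} + SSR_{2})\right]/(k+1)}
         {(SSR_{1} + SSR_{2})/\left[n - 2(k+1)\right]}
```

The pooled regression is the restricted model (one set of k+1 coefficients for everyone), and the two separate regressions together form the unrestricted model.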

Understand the assumption of homoskedasticity and the problems that occur if this assumption does
not hold (heteroskedasticity), i.e., OLS is still unbiased but is no longer the most efficient estimation
technique, and the standard errors and F statistics are generated by incorrect formulas. Understand
the concept of heteroskedasticity-robust standard errors and how to compute them in Stata.
Understand how to test for heteroskedasticity with the Breusch-Pagan test and the White test and the
difference between these two tests. Understand how to compute feasible generalized least squares
estimates using weighted least squares estimation, as well as weighted least squares when the weight
is known.
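
The Breusch-Pagan test and the known-weight case can be sketched in generic notation (assumed here, not taken from the course). The test regresses the squared OLS residuals on the independent variables; weighted least squares with a known variance function h(x) divides every term of the equation by the square root of h.

```latex
\begin{aligned}
\text{Breusch-Pagan auxiliary regression:}\quad
  & \hat{u}^2 = \delta_0 + \delta_1 x_1 + \dots + \delta_k x_k + v \\
\text{LM statistic:}\quad
  & LM = n \cdot R^2_{\hat{u}^2} \;\sim\; \chi^2_k \quad \text{under } H_0 \\
\text{Known weight, } \operatorname{Var}(u \mid x) = \sigma^2 h(x)\text{:}\quad
  & \frac{y}{\sqrt{h}} = \beta_0 \frac{1}{\sqrt{h}} + \beta_1 \frac{x_1}{\sqrt{h}} + \dots
    + \beta_k \frac{x_k}{\sqrt{h}} + \frac{u}{\sqrt{h}}
\end{aligned}
```

The White test uses the same idea but adds the squares and cross products of the independent variables to the auxiliary regression, so it can detect forms of heteroskedasticity that are not linear in the x's.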

Understand the assumption that is violated when there is autocorrelation. Understand the assumption
made to proceed with testing for autocorrelation, i.e., the first-order autoregressive assumption
u_t = ρ u_{t-1} + ε_t, and test for autocorrelation under that assumption using both a t test (with and
without assuming strict exogeneity) and the Durbin-Watson test. Understand that OLS is still unbiased
with autocorrelation but incorrect formulas are used to compute the standard errors of the regression
coefficients and the F test. Understand how to compute a feasible generalized least squares estimate
using generalized difference variables (i.e., using the Prais-Winsten iterative technique for estimating ρ).
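
The pieces of this procedure can be sketched in Python. The function names and the residual values are assumptions. Two simplifications are made: ρ is estimated from a regression of the residual on its own lag without an intercept, and the generalized differences drop the first observation, whereas Prais-Winsten keeps it after rescaling by the square root of (1 - ρ^2).

```python
def durbin_watson(resid):
    """DW = sum of squared first differences of residuals / sum of squared residuals."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(u ** 2 for u in resid)
    return num / den

def rho_hat(resid):
    """Slope from regressing u_t on u_{t-1} (no intercept, a simplification)."""
    num = sum(resid[t] * resid[t - 1] for t in range(1, len(resid)))
    den = sum(resid[t - 1] ** 2 for t in range(1, len(resid)))
    return num / den

def generalized_difference(series, rho):
    """z_t - rho * z_{t-1} for t = 2..n (first observation dropped)."""
    return [series[t] - rho * series[t - 1] for t in range(1, len(series))]

# Made-up residuals that alternate in sign (strong negative autocorrelation)
resid = [1.0, -1.0, 1.0, -1.0]
dw = durbin_watson(resid)
rho = rho_hat(resid)
y_star = generalized_difference([1.0, 2.0, 3.0], 0.5)
```

In large samples DW is approximately 2(1 - ρ), so values near 2 suggest no first-order autocorrelation, values near 0 suggest positive autocorrelation, and values near 4 suggest negative autocorrelation.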

Understand how to interpret an index number and how to change the base period of an index number.
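
Changing the base period amounts to dividing every value by the value in the new base period and multiplying by 100. A minimal Python sketch with a hypothetical function name and made-up index values:

```python
def rebase(index, base_pos):
    """Re-express an index so the period at position base_pos equals 100."""
    base_value = index[base_pos]
    return [100 * v / base_value for v in index]

# Made-up index with the first period as the base (= 100)
price_index = [100.0, 110.0, 121.0]
rebased = rebase(price_index, 1)   # make the second period the base
```

Rebasing changes the level of every entry but leaves the percentage change between any two periods unchanged.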

Understand that including a time trend variable in a regression is equivalent to detrending each variable
(dependent and all independent variables) and running a regression on the detrended variables. Thus,
be able to interpret regression coefficients when a trend variable is included as an independent variable.

Understand how to set up seasonal dummy variables and how to interpret the results of a regression
with seasonal dummy variables.

Understand how to interpret regression coefficients that involve distributed lags (dynamic models).

Understand how to interpret regression coefficients when cross section data is pooled across two time
periods. This could involve a time dummy variable and interactive variables with that time dummy
variable. Know that the time dummy variable interacted with a numeric variable gives the change in
the slope for the year of the dummy variable compared to the base year (the year captured by the
constant).

Understand the idea of a natural experiment when some change occurs that impacts only some of the
observations in a cross section analysis and data is available for a time period prior to that change and
after that change. Be able to put in words the differences-in-differences ideas.
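
With group averages in hand, the differences-in-differences estimate is simple arithmetic. A Python sketch, where the function name and the four averages are made up for illustration:

```python
def diff_in_diff(treat_before, treat_after, control_before, control_after):
    """Change in the treated group minus change in the control group."""
    return (treat_after - treat_before) - (control_after - control_before)

# Hypothetical average outcomes for each group and period
estimate = diff_in_diff(
    treat_before=10.0, treat_after=14.0,
    control_before=9.0, control_after=10.5,
)
```

The control group's change (1.5 here) stands in for what would have happened to the treated group without the change, so only the remaining 2.5 is attributed to the change itself. In a regression with a treatment dummy, a time dummy and their interaction, this number is the coefficient on the interaction.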

Understand how to carry out and interpret a two period panel data analysis.

Understand how to carry out and interpret a linear probability model (where the dependent variable is
made up of 1’s and 0’s and the slope coefficients can be interpreted as a change in the probability of the
outcome measured by the dependent variable). Understand the idea of the “percent correctly
predicted” overall, for both the case when the dependent variable is a 1 and when the dependent
variable is a 0 and how a weighted average of the last two numbers might be best.
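
The three percentages can be sketched in Python as below. The function name, the 0.5 cutoff, and the data are assumptions made for illustration; an observation is predicted to be a 1 when its fitted probability is at least the cutoff.

```python
def percent_correct(y, p_hat, cutoff=0.5):
    """Percent correctly predicted: overall, among y = 1, and among y = 0."""
    pred = [1 if p >= cutoff else 0 for p in p_hat]
    hits = [int(yi == pi) for yi, pi in zip(y, pred)]
    hits_ones = [h for yi, h in zip(y, hits) if yi == 1]
    hits_zeros = [h for yi, h in zip(y, hits) if yi == 0]
    return {
        "overall": 100 * sum(hits) / len(hits),
        "when_y_is_1": 100 * sum(hits_ones) / len(hits_ones),
        "when_y_is_0": 100 * sum(hits_zeros) / len(hits_zeros),
    }

# Made-up outcomes and fitted probabilities
y = [1, 1, 1, 0, 0, 0, 0, 0]
p_hat = [0.9, 0.6, 0.4, 0.2, 0.7, 0.1, 0.3, 0.45]
result = percent_correct(y, p_hat)
```

The overall figure is itself a weighted average of the other two, with weights equal to the sample shares of 1's and 0's, so it can look good even when one outcome is predicted poorly.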

Understand the logistic and normal distributions and the cumulative logistic and cumulative normal
distribution. Understand how a latent variable model could end up with computing probabilities from
the cumulative (S shaped) functions. Be able to put into words the ideas behind maximum likelihood
estimation. Understand how to use the probit and logit commands in Stata. Understand how the
coefficients in a probit or logit model estimate a function where the predicted values are z values that
must then be translated into a probability with the cumulative function. Understand why those
coefficients need to be scaled in order to have numbers that can be interpreted as a change in
probability and how to interpret those scaled values. Understand that one must use the “margins”
command to obtain the numbers that measure the change in a probability as the average of the
slopes across the observations. That command uses a scaling factor computed as the mean of the
normal or logistic density evaluated at each predicted z value. This is the average partial effect
calculation. Understand how to interpret the “percent correctly predicted” following a probit or logit
analysis.
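
The scaling step can be sketched in Python. This is an illustration under stated assumptions, not a reproduction of what Stata does internally: the function names and values are made up, and the calculation applies to a continuous independent variable. The scale factor is the mean of the density (normal for probit, logistic for logit) evaluated at each observation's predicted z value.

```python
import math

def normal_pdf(z):
    """Standard normal density, used for probit."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def logistic_pdf(z):
    """Standard logistic density, used for logit.
    The density is symmetric, so abs() is used to avoid overflow."""
    e = math.exp(-abs(z))
    return e / (1 + e) ** 2

def average_partial_effect(beta_j, z_values, pdf):
    """Coefficient times the mean density over all predicted z values."""
    scale = sum(pdf(z) for z in z_values) / len(z_values)
    return beta_j * scale

# Hypothetical predicted z values and a hypothetical coefficient of 2.0
z_hat = [0.0, 0.0]
ape_logit = average_partial_effect(2.0, z_hat, logistic_pdf)
ape_probit = average_partial_effect(2.0, z_hat, normal_pdf)
```

Because the logistic density peaks at 0.25 and the normal density at about 0.40, raw logit coefficients are typically larger than probit coefficients even when the implied changes in probability are similar.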

