Using Stata for Principles of Econometrics

This book is a supplement to Principles of Econometrics, 5th Edition by R. Carter Hill, William E. Griffiths and Guay C. Lim (Wiley, 2018), hereinafter POE5. This book is not a substitute for the textbook, nor is it a standalone computer manual. It is a companion to the textbook, showing how to perform the examples in the textbook using Stata Release 15. This book will be useful to students taking econometrics, as well as their instructors, and others who wish to use Stata for econometric analysis.

Chapter 1   Introducing Stata
Key Terms
1.1 Starting Stata
1.2 The opening display
1.3 Exiting Stata
1.4 Stata data files for POE5

1.4.1 A working directory

1.5 Opening Stata data files

1.5.1 Using the toolbar
1.5.2 The use command
1.5.3 Using files on the Internet
1.5.4 Locating book files on the internet

1.6 The variables window

1.6.1 Using the data utility for a single label

1.7 Describing data and obtaining summary statistics
1.8 The Stata help system

1.8.1 Using keyword search
1.8.2 Opening a dialog box
1.8.3 Complete documentation in Stata manuals
1.8.4 Advice
1.8.5 Stata videos on YouTube
1.8.6 Statalist
1.8.7 Not elsewhere classified

1.9 Stata command syntax

1.9.1 Syntax of summarize
1.9.2 Learning syntax using the review window

1.10 Saving your work

1.10.1 Copying and pasting
1.10.2 Using a log file

1.11 Using the data browser
1.12 Using Stata graphics

1.12.1 Histograms
1.12.2 Scatter diagrams

1.13 Using Stata Do-files
1.14 Creating and managing variables

1.14.1 Creating (generating) new variables
1.14.2 Using the expression builder
1.14.3 Dropping or keeping variables and observations
1.14.4 Using arithmetic operators
1.14.5 Using Stata math functions

1.15 Using Stata density functions

1.15.1 Cumulative distribution functions
1.15.2 Inverse cumulative distribution functions

1.16 Using and displaying scalars

1.16.1 Example of standard normal cdf
1.16.2 Example of t-distribution tail-cdf
1.16.3 Example computing percentile of the standard normal
1.16.4 Example computing percentile of the t-distribution

1.17 A scalar dialog box
1.18 Using factor variables

Chapter 1 Do-file

 

Chapter 2   Simple linear regression
Key Terms
2.1 The food expenditure data

2.1.1 Starting a new problem
2.1.2 Starting a log file
2.1.3 Opening a Stata data file
2.1.4 Browsing and listing the data

2.2 Computing summary statistics
2.3 Creating a scatter diagram

2.3.1 Enhancing the plot

2.4 Regression

2.4.1 Fitted values and residuals
2.4.2 Plotting the fitted regression line

2.5 Using Stata to obtain predicted values

2.5.1 Using saved coefficients
2.5.2 Using lincom
2.5.3 Using the margins command
2.5.4 Using incomplete observations
2.5.5 Computing an elasticity

2.6 OLS estimator variances and covariance

2.6.1 Estimating the variance of the error term
2.6.2 Viewing the estimated variances and covariance
2.6.3 Saving the Stata data file

2.7 Estimating nonlinear relationships

2.7.1 A quadratic model
2.7.2 A log-linear model

2.8 Regression with indicator variables
Appendix 2A Average marginal effects

2A.1 Elasticity in a linear relationship
2A.2 Elasticity in a quadratic relationship
2A.3 Slope in a log-linear model

Appendix 2B Simulation experiment

2B.1 Fixed x’s
2B.2 Random x’s
Chapter 2 Do-file

 

Chapter 3   Interval Estimation and Hypothesis Testing
Key Terms
3.1 Interval estimates

3.1.1 Critical values from the t-distribution
3.1.2 Creating an interval estimate
3.1.3 Creating an interval estimate using lincom

3.2 Hypothesis tests

3.2.1 Right-tail test of significance
3.2.2 Right-tail test of an economic hypothesis
3.2.3 Left-tail test of an economic hypothesis
3.2.4 Two-tail test of an economic hypothesis
3.2.5 Two-tail test of significance

3.3 p-values

3.3.1 p-value of a right-tail test
3.3.2 p-value of a left-tail test
3.3.3 p-value for a two-tail test
3.3.4 p-values in Stata output
3.3.5 Testing and estimating linear combinations of parameters

Appendix 3A Graphical tools
Appendix 3B Monte Carlo simulation

3B.1 Fixed x’s
3B.2 Random x’s

Chapter 3 Do-file

 

Chapter 4   Prediction, Goodness-of-Fit and Modeling Issues
Key Terms
4.1 Least squares prediction

4.1.1 Editing the data
4.1.2 Estimate the regression and obtain postestimation results
4.1.3 Creating the prediction interval
4.1.4 Using margins to create the prediction interval

4.2 Measuring goodness-of-fit

4.2.1 Correlations and R2

4.3 The effects of scaling and transforming the data

4.3.1 Reporting the regression results
4.3.2 The linear-log functional form
4.3.3 Plotting the fitted linear-log model
4.3.4 Editing graphs

4.4 Analyzing the residuals

4.4.1 Residual plots
4.4.2 The Jarque-Bera test
4.4.3 Chi-square distribution critical values
4.4.4 Chi-square distribution p-values
4.4.5 Identifying unusual observations

4.5 Polynomial models

4.5.1 Estimating and checking the linear relationship
4.5.2 Estimating and checking a cubic equation
4.5.3 Estimating a log-linear yield growth model

4.6 Estimating a log-linear wage equation

4.6.1 The log-linear model
4.6.2 Calculating wage predictions
4.6.3 Constructing wage plots
4.6.4 Generalized R2
4.6.5 Prediction intervals in the log-linear model
4.6.6 Prediction intervals in the log-linear model using margins

4.7 A log-log model
Chapter 4 Do-file

 

Chapter 5   Multiple Linear Regression
Key Terms
5.1 The Hamburger Chain Model
5.2 Least squares prediction

5.2.1 Least squares procedure
5.2.2 Least squares prediction
5.2.3 Rescaling the variables
5.2.4 Estimating the error variance
5.2.5 Measuring the goodness-of-fit
5.2.6 Frisch-Waugh-Lovell

5.3 Least Squares Precision
5.4 Confidence intervals

5.4.1 Changing the confidence level
5.4.2 Linear combination of parameters

5.5 Hypothesis tests

5.5.1 Two-sided t-tests
5.5.2 One-sided t-tests
5.5.3 Testing a linear combination of parameters

5.6 Interaction Variables

5.6.1 Polynomial regressors
5.6.2 Using factor variables for interactions
5.6.3 Interactions with other variables
5.6.4 Log-wages and quadratic interactions
5.6.5 Optimal level of advertising
5.6.7 Maximizing wages via experience

Appendix 5B.1 Nonlinear functions of a single parameter
Appendix 5B.2 Nonlinear functions of two parameters
Appendix 5C.1 Least squares estimation with chi-square errors
Appendix 5C.2 Monte Carlo simulation of the delta method
Appendix 5D Bootstrapping
Chapter 5 Do-file

 

Chapter 6   Further Inference in the Multiple Regression Model
Key Terms
6.1 Testing joint hypotheses: The F-test

6.1.1 Testing the significance of the model
6.1.2 Relationship between t- and F-tests
6.1.3 More general F-tests
6.1.4 Large sample tests
6.1.5 Nonlinear hypothesis tests

6.2 Stata programs
6.3 Nonsample information
6.4 Model specification

6.4.1 Omitted variables
6.4.2 Irrelevant variables
6.4.3 Choosing the model
6.4.4 RESET test for functional form
6.4.5 RESET program
6.4.6 Control variables
6.4.7 Prediction-forecast error variance
6.4.8 Prediction-model selection and RMSE

6.5 Poor data, collinearity, and insignificance

6.5.1 Variance inflation factors
6.5.2 Influential observations

Chapter 6 Do-file

 

Chapter 7   Using Indicator Variables
Key Terms
7.1 Indicator variables

7.1.1 Creating indicator variables
7.1.2 Estimating an indicator variable regression
7.1.3 Testing the significance of the indicator variables
7.1.4 Further calculations
7.1.5 Computing average marginal effects

7.2 Applying indicator variables

7.2.1 Interactions between qualitative factors
7.2.2 Adding regional indicators
7.2.3 Testing the equivalence of two regressions
7.2.4 Estimating separate regressions
7.2.5 Indicator variables in log-linear models

7.3 The linear probability model
7.4 Treatment effects
7.5 Differences-in-Differences estimation

Chapter 7 Do-file

 

Chapter 8   Heteroskedasticity
Key Terms
8.1 The nature of heteroskedasticity
8.2 Heteroskedastic-consistent standard errors
8.3 The generalized least squares estimator

8.3.1 Feasible GLS: a more general case
8.3.2 Feasible GLS with a heteroskedastic partition

8.4 Detecting heteroskedasticity

8.4.1 The Goldfeld-Quandt test using partitioned data
8.4.2 The Goldfeld-Quandt test in the food expenditure model
8.4.3 Lagrange multiplier tests

8.5 Heteroskedasticity in the linear probability model
Appendix 8D Alternative robust sandwich estimators
Appendix 8E Monte Carlo evidence
Chapter 8 Do-file

 

Chapter 9   Regression with Time-Series Data: Stationary Variables
Key Terms
9.1 Introduction

9.1.1 Defining time-series in Stata
9.1.2 Time-series plots
9.1.3 Stata’s lag and difference operators

9.2 Correlogram
9.3 The AR(2) model
9.4 Autoregressive distributed lag models

9.4.1 Forecasts and forecast intervals
9.4.2 Model selection
9.4.3 Granger causality

9.5 Serial correlation in errors

9.5.1 Detecting autocorrelation in residuals
9.5.2 Okun’s Law
9.5.3 HAC standard errors
9.5.4 Nonlinear least squares
9.5.5 Feasible GLS

9.6 The consumption function
9.7 Multipliers for an IDL model
9.8 Durbin-Watson Test
Chapter 9 Do-file

 

Chapter 10   Endogenous Regressors and Moment Based Estimation
Key Terms
10.1 Least squares estimation of a wage equation
10.2 Two-stage least squares
10.3 IV estimation with surplus instruments

10.3.1 Illustrating partial correlations

10.4 The Hausman test for endogeneity
10.5 Testing the validity of surplus instruments
10.6 Testing for weak instruments
10.7 Calculating the Cragg-Donald F-statistic
10.8 Illustrations using simulated data
10.9 A simulation experiment
Chapter 10 Do-file

 

Chapter 11   Simultaneous Equations Models
Key Terms
11.1 Truffle supply and demand
11.2 Estimating the reduced form equations
11.3 2SLS estimates of truffle demand
11.4 2SLS estimates of truffle supply
11.5 Supply and demand of fish
11.6 Reduced forms for fish price and quantity
11.7 2SLS estimates of fish demand
11.8 2SLS alternatives
11.9 Monte Carlo simulation
Chapter 11 Do-file

 

Chapter 12   Regression with Time-Series Data: Nonstationary Variables
Key Terms
12.1 Stationary and nonstationary data

12.1.1 Review: generating dates in Stata
12.1.2 Extracting dates
12.1.3 Graphing the data
12.1.4 Summary statistics using subsamples
12.1.5 Correlogram

12.2 Deterministic trends
12.3 Spurious regressions
12.4 Unit root tests for stationarity

12.4.1 Is GDP trend stationary?
12.4.2 Is wheat yield stationary?

12.5 Integration and cointegration

12.5.1 Order of integration
12.5.2 Engle-Granger test
12.5.3 The error correction model
12.5.4 Regression with no cointegration

Chapter 12 Do-file

 

Chapter 13   Vector Error Correction and Vector Autoregressive Models
Key Terms
13.1 VEC and VAR models
13.2 Estimating a VEC model
13.3 Estimating a VAR
13.4 Impulse responses and variance decompositions
Chapter 13 Do-file

 

Chapter 14   Time-Varying Volatility and ARCH Models
Key Terms
14.1 ARCH model and time-varying volatility
14.2 Simulating ARCH
14.3 Testing, estimating, and forecasting
14.4 Extensions

14.4.1 GARCH
14.4.2 Threshold GARCH
14.4.3 GARCH-in-mean

Chapter 14 Do-file

 

Chapter 15   Panel Data Models
Key Terms
15.1 A microeconomic panel
15.2 The fixed effects estimator

15.2.1 The difference estimator: T = 2
15.2.2 The within estimator: T = 2
15.2.3 The within estimator: T = 3
15.2.4 The fixed-effects estimator: xtreg
15.2.5 The least squares dummy variable estimator
15.2.6 Testing for fixed effects

15.3 Panel data regression error assumptions

15.3.1 The OLS estimation with cluster-robust standard errors
15.3.2 Fixed-effects estimation with cluster-robust standard errors
15.3.3 Random-effects estimation of a production function
15.3.4 Random-effects estimation of a wage equation
15.3.5 Testing for random-effects
15.3.6 The Hausman contrast test for the production function
15.3.7 The Hausman contrast test for the wage equation
15.3.8 A regression based Hausman test for the production function
15.3.9 A regression based Hausman test for the wage equation
15.3.10 The Hausman-Taylor estimator
Chapter 15 Do-file

 

Chapter 16   Qualitative and Limited Dependent Variable Models
Key Terms
16.1 Models with binary dependent variables

16.1.1 The linear probability model
16.1.2 Probit: a small example
16.1.3 Probit: the transportation data
16.1.4 Marginal effects
16.1.5 Probit marginal effects: details
16.1.6 Standard error of average marginal effect

16.2 The logit model for binary choice

16.2.1 Wald tests
16.2.2 Likelihood ratio tests
16.2.3 Binary choice models with a continuous endogenous variable

16.3 Multinomial logit
16.4 Conditional logit

16.4.1 Estimation using asclogit

16.5 Ordered choice models
16.6 Models for count data
16.7 Censored data models
16.8 Selection bias
Chapter 16 Do-file

 

Appendix A   Review of Math Essentials
Key Terms
A.1 Stata math and logical operators
A.2 Math functions
A.3 Extensions to generate
A.4 The calculator
A.5 Scientific notation
A.6 Numerical derivatives and integrals
Appendix A Do-file

 

Appendix B   Review of Probability
B.1 Stata probability functions
B.2 Binomial distribution
B.3 Poisson distribution
B.4 Normal distribution

B.4.1 Normal density plots
B.4.2 Normal probability calculations

B.5 Chi-square distribution

B.5.1 Plotting the chi-square density
B.5.2 Chi-square probability calculations
B.5.3 The non-central chi-square pdf

B.6 Student’s t-distribution

B.6.1 Plot of standard normal and t(3)
B.6.2 t-distribution probabilities
B.6.3 Graphing tail probabilities
B.6.4 The non-central t-distribution

B.7 F-distribution

B.7.1 Plotting the F-density
B.7.2 F-distribution probability calculations
B.7.3 The non-central F-distribution

B.8 The log-normal distribution
B.9 Random numbers

B.9.1 Using inversion method
B.9.2 Creating uniform random numbers

Appendix B Do-file

 

Appendix C   Review of Statistical Inference
Key Terms
C.1 Examining the hip data

C.1.1 Constructing a histogram
C.1.2 Obtaining summary statistics
C.1.3 Estimating the population mean

C.2 Using simulated data values
C.3 The central limit theorem
C.4 Estimating population moments
C.5 Interval estimation

C.5.1 Computing confidence intervals
C.5.2 Using simulated data
C.5.3 Using the hip data

C.6 Testing the mean of a normal population

C.6.1 Right-tail test
C.6.2 Two-tail test

C.7 Testing the variance of a normal population
C.8 Testing the equality of two normal population means

C.8.1 Population variances are equal
C.8.2 Population variances are unequal

C.9 Testing the equality of two normal population variances
C.10 Testing normality
C.11 Maximum likelihood estimation

C.11.1 Testing a population proportion
C.11.2 Likelihood ratio test
C.11.3 Wald test
C.11.4 Lagrange multiplier test

C.12 Least squares
C.13 Kernel density estimator
Appendix C Do-file

Author: Lee C. Adkins and R. Carter Hill
Edition: Fifth Edition
ISBN-13: 978-1-119-46324-5
©Copyright: 2018 Wiley
