Environmental Econometrics Using Stata

Environmental Econometrics Using Stata is written for applied researchers who want to understand the basic theory of modern statistical methods and how to use them. It is also well suited for teaching: each chapter is motivated with real data and ends with a set of exercises. The book is inherently interdisciplinary, because the questions posed by environmental issues are relevant to researchers in the physical sciences, economics, sociology, political science, and public health, among other fields.

Each chapter begins with a real dataset and research question. The authors then provide a gentle introduction to the statistical method and demonstrate how to use it to answer the research question. They discuss the assumptions about the data and the model, demonstrate the Stata commands used to fit the model and check the model assumptions, and interpret the results. The book's workflow mirrors the one required to present results to an academic audience.

The book is of interest not only for its exposition of the topics but also for its breadth. It presents estimators for continuous, binary, and ordered outcomes in cross-sectional data; univariate and multivariate time series with stationary and nonstationary data; linear and dynamic panel data; and spatial models and fractional integration. The range of methods is not arbitrary; it is a function of the questions posed by environmental data and reflects the challenges that researchers from different disciplines face in answering a wide range of questions with modern statistical methods.

List of figures
List of tables
Preface
Acknowledgments
Notation and typography

1 Introduction

1.1 Features of the data

1.1.1 Periodicity
1.1.2 Nonlinearity
1.1.3 Structural breaks and nonstationarity
1.1.4 Time-varying volatility
1.2 Types of data

2 Linear regression models
2.1 Air pollution in Santiago, Chile
2.2 Linear regression and OLS estimation
2.3 Interpreting and assessing the regression model

2.3.1 Goodness of fit
Tests of significance
2.3.2 Residual diagnostics
Homoskedasticity
Serial independence
Normality

2.4 Estimating standard errors

3 Beyond ordinary least squares
3.1 Distribution of particulate matter
3.2 Properties of estimators

Consistency
Asymptotic normality
Asymptotic efficiency

3.3 Maximum likelihood and the linear model
3.4 Hypothesis testing

Likelihood-ratio test
Wald test
LM test

3.5 Method-of-moments estimators and the linear model
3.6 Testing for exogeneity

4 Introducing dynamics
4.2 Specifying and fitting dynamic time-series models

AR models
Moving-average models
ARMA models

4.3 Exploring the properties of dynamic models
4.4 ARMA models for load-weighted electricity price
4.5 Seasonal ARMA models

5 Multivariate time-series models
5.1 CO2 emissions and growth
5.2 The VARMA model
5.3 The VAR model
5.4 Analyzing the dynamics of a VAR

5.4.1 Granger causality testing
5.4.2 Impulse–responses
Vector moving-average form
Orthogonalized impulses
5.4.3 Forecast-error variance decomposition

5.5 SVARs

5.5.1 Short-run restrictions
5.5.2 Long-run restrictions

6 Testing for nonstationarity
6.1 Per capita CO2 emissions
6.2 Unit roots
6.3 First-generation unit-root tests

6.3.1 Dickey–Fuller tests
6.3.2 Phillips–Perron tests

6.4 Second-generation unit-root tests

6.4.1 KPSS test
6.4.2 Elliott–Rothenberg–Stock DFGLS test

6.5 Structural breaks

6.5.1 Known breakpoint
6.5.2 Single-break unit-root tests
6.5.3 Double-break unit-root tests

7 Modeling nonstationary variables
7.2 Illustrating equilibrium relationships
7.3 The VECM
7.4 Fitting VECMs

7.4.1 Single-equation methods
7.4.2 System estimation

7.5 Testing for cointegration
7.6 Cointegration and structural breaks

8 Forecasting
8.1 Forecasting wind speed
8.2 Introductory terminology
8.3 Recursive forecasting in time-series models

8.3.1 Single-equation forecasts
8.3.2 Multiple-equation forecasts
8.3.3 Properties of recursive forecasts

8.4 Forecast evaluation
8.5 Daily forecasts of wind speed for Santiago
8.6 Forecasting with logarithmic dependent variables

8.6.1 Staying in the linear regression framework
8.6.2 Generalized linear models

9 Structural time-series models
9.1 Sea level and global temperature
9.2 The Kalman filter
9.3 Vector autoregressive moving-average models in state-space form
9.4 Unobserved component time-series models

9.4.1 Trends
9.4.2 Seasonals
9.4.3 Cycles

9.5 A bivariate model of sea level and global temperature

10 Nonlinear time-series models
10.1 Sunspot data
10.2 Testing
10.3 Bilinear time-series models
10.4 Threshold autoregressive models
10.5 Smooth transition models
10.6 Markov switching models

11 Modeling time-varying variance
11.1 Evaluating environmental risk
11.2 The generalized autoregressive conditional heteroskedasticity model
11.3 Alternative distributional assumptions
11.4 Asymmetries
11.5 Motivating multivariate volatility models
11.6 Multivariate volatility models

11.6.1 The vech model
11.6.2 The dynamic conditional correlation model

12 Longitudinal data models
12.1 The pollution haven hypothesis
12.2 Data organization

12.2.1 Wide and long forms of panel data
12.2.2 Reshaping the data

12.3 The pooled model
12.4 Fixed effects and random effects

12.4.1 Individual FEs
12.4.2 Two-way FE
12.4.3 REs
12.4.4 The Hausman test in a panel context
12.4.5 Correlated RE

12.5 Dynamic panel-data models

13 Spatial models
13.1 Regulatory compliance
13.2 The spatial weighting matrix

13.2.1 Specifications
Distance weights
Contiguity weights
13.2.2 Construction

13.3 Exploratory data analysis
13.4 Spatial models

Spatial lag model
Spatial error model

13.5 Fitting spatial models by maximum likelihood

Spatial lag model
Spatial error model

13.6 Estimating spillover effects
13.7 Model selection

14 Discrete dependent variables
14.1 Humpback whales
14.2 The data
14.3 Binary dependent variables

14.3.1 Linear probability model
14.3.2 Binomial logit and probit models

14.4 Ordered dependent variables
14.5 Censored dependent variables

15 Fractional integration
15.1 Mean sea levels and global temperature
15.2 Autocorrelations and long memory
15.3 Testing for long memory
15.4 Estimating d in the frequency domain
15.5 Maximum likelihood estimation of the ARFIMA model
15.6 Fractional cointegration

A Using Stata

A.1 File management

A.1.2 Organization of do-, ado-, and data files
A.1.3 Editing Stata do- and ado-files

A.2 Basic data management

A.2.1 Data types
A.2.2 Getting your data into Stata
Handling text files
The import delimited command
Importing data from other package formats
A.2.3 Other data issues
Protecting the data in memory
Missing data handling
Recoding missing values: the mvdecode and mvencode commands
A.2.4 String-to-numeric conversion and vice versa

A.3 General programming hints

Variable names
Observation numbering: _n and _N
The varlist
The numlist
The if exp and in range qualifiers
Local macros
Global macros
Scalars
Matrices
Looping
The generate command
The egen command
Computation for by-groups

A.4 A smorgasbord of important topics

Date and time handling
Time-series operators

A.5 Factor variables and operators
A.6 Circular variables

References

Author index

Subject index