# Negative Binomial Regression

Negative Binomial Regression, Second Edition, by Joseph M. Hilbe, reviews the negative binomial model and its variations. Negative binomial regression, a recently popular alternative to Poisson regression, is used to account for overdispersion, which is common in real-world applications with count responses.
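To make the idea of overdispersion concrete, here is a minimal sketch that is not taken from the book. It assumes an intercept-only Poisson model, so the fitted mean is just the sample mean, and computes the Pearson dispersion statistic; values well above 1 suggest more variability than a Poisson model allows. The function names and the sample counts are hypothetical. The second function states the standard NB2 variance, μ + αμ², which reduces to the Poisson variance μ when α = 0.

```python
from statistics import mean


def pearson_dispersion(counts):
    """Pearson chi-square divided by residual degrees of freedom.

    Assumes an intercept-only Poisson model, so every fitted value
    equals the sample mean and one parameter is estimated.
    """
    mu = mean(counts)
    n = len(counts)
    pearson_chi2 = sum((y - mu) ** 2 / mu for y in counts)
    return pearson_chi2 / (n - 1)


def nb2_variance(mu, alpha):
    """NB2 variance function: mu + alpha * mu^2 (Poisson when alpha == 0)."""
    return mu + alpha * mu ** 2


# Hypothetical counts with the same mean (2.5) but very different spread.
steady_counts = [2, 3, 1, 4, 2, 3, 2, 3]
bursty_counts = [0, 0, 1, 0, 12, 0, 7, 0]

if __name__ == "__main__":
    print("steady dispersion:", pearson_dispersion(steady_counts))
    print("bursty dispersion:", pearson_dispersion(bursty_counts))
    print("NB2 variance at mu=2.5, alpha=1:", nb2_variance(2.5, 1.0))
```

A dispersion statistic far above 1, as for the second sample, is the kind of evidence that motivates moving from a Poisson to a negative binomial model, although apparent overdispersion can also come from a misspecified model.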

Negative Binomial Regression covers count response models, their estimation methods, and the algorithms used to fit them. Hilbe details the problem of overdispersion and ways to handle it. The book emphasizes the application of negative binomial models to various research problems involving overdispersed count data. Much of the book is devoted to discussing model-selection techniques, the interpretation of results, regression diagnostics, and methods of assessing goodness of fit.

Hilbe uses Stata throughout the book to present examples. He describes various extensions of the negative binomial model: those that handle excess zeros, censored and truncated data, panel and longitudinal data, and data from sample selection.

Negative Binomial Regression is aimed at statisticians, econometricians, and practicing researchers who analyze count-response data. The book is written for a reader with a general background in maximum likelihood estimation and generalized linear models, but Hilbe includes enough mathematical detail to satisfy the more theoretically minded reader.

This second edition includes added material on finite-mixture models; quantile-count models; bivariate negative binomial models; and various methods of handling endogeneity, including the generalized method of moments.

Preface to the second edition

1. Introduction
1.1 What is a negative binomial model?
1.2 A brief history of the negative binomial
1.3 Overview of the book

2. The concept of risk
2.1 Risk and 2 × 2 tables
2.2 Risk and 2 × k tables
2.3 Risk ratio confidence intervals
2.4 Risk difference
2.5 The relationship of risk to odds ratios
2.6 Marginal probabilities: joint and conditional

3. Overview of count response models
3.1 Varieties of count response model
3.2 Estimation
3.3 Fit considerations

4. Methods of estimation
4.1 Derivation of the IRLS algorithm

4.1.1 Solving for ∂L or U: the gradient
4.1.2 Solving for ∂²L
4.1.3 The IRLS fitting algorithm

4.2 Newton–Raphson algorithms

4.2.1 Derivation of the Newton–Raphson
4.2.2 GLM with OIM
4.2.3 Parameterizing from μ to xβ
4.2.4 Maximum likelihood estimators

5. Assessment of count models
5.1 Residuals for count response models
5.2 Model fit tests

5.2.2 Information criteria fit tests

5.3 Validation models

6. Poisson regression

6.1 Derivation of the Poisson model

6.1.1 Derivation of the Poisson from the binomial distribution
6.1.2 Derivation of the Poisson model

6.2 Synthetic Poisson models

6.2.1 Construction of synthetic models
6.2.2 Changing response and predictor values
6.2.3 Changing multivariable predictor values

6.3 Example: Poisson model

6.3.1 Coefficient parameterization
6.3.2 Incidence rate ratio parameterization

6.4 Predicted counts
6.5 Effects plots
6.6 Marginal effects, elasticities, and discrete change

6.6.1 Marginal effects for Poisson and negative binomial effects models
6.6.2 Discrete change for Poisson and negative binomial models

6.7 Parameterization as a rate model

6.7.1 Exposure in time and area
6.7.2 Synthetic Poisson with offset
6.7.3 Example

7. Overdispersion
7.1 What is overdispersion?
7.2 Handling apparent overdispersion

7.2.1 Creation of a simulated base Poisson model
7.2.2 Delete a predictor
7.2.3 Outliers in data
7.2.4 Creation of interaction
7.2.5 Testing the predictor scale

7.3 Methods of handling real overdispersion

7.3.1 Scaling of standard errors / quasi-Poisson
7.3.2 Quasi-likelihood variance multipliers
7.3.3 Robust variance estimators
7.3.4 Bootstrapped and jackknifed standard errors

7.4 Tests of overdispersion

7.4.1 Score and Lagrange multiplier tests
7.4.2 Boundary likelihood ratio test
7.4.3 R²p and R²pd tests for Poisson and negative binomial models

7.5 Negative binomial overdispersion

8. Negative binomial regression
8.1 Varieties of negative binomial
8.2 Derivation of the negative binomial

8.2.1 Poisson–gamma mixture model
8.2.2 Derivation of the GLM negative binomial

8.3 Negative binomial distributions
8.4 Negative binomial algorithms

8.4.1 NB-C: canonical negative binomial
8.4.2 NB2: expected information matrix
8.4.3 NB2: observed information matrix
8.4.4 NB2: R maximum likelihood function

9. Negative binomial regression: modeling
9.1 Poisson versus negative binomial
9.2 Synthetic negative binomial
9.3 Marginal effects and discrete change
9.4 Binomial versus count models
9.5 Examples: negative binomial regression

Example 1: Modeling number of marital affairs
Example 2: Heart procedures
Example 3: Titanic survival data
Example 4: Health reform data

10. Alternative variance parameterizations

10.1 Geometric regression: NB α = 1

10.1.1 Derivation of the geometric
10.1.2 Synthetic geometric models
10.1.3 Using the geometric model
10.1.4 The canonical geometric model

10.2 NB1: The linear negative binomial model

10.2.1 NB1 as QL-Poisson
10.2.2 Derivation of NB1
10.2.3 Modeling with NB1
10.2.4 NB1: R maximum likelihood function

10.3 NB-C: Canonical negative binomial regression

10.3.1 NB-C overview and formulae
10.3.2 Synthetic NB-C models
10.3.3 NB-C models

10.4 NB-H: Heterogeneous negative binomial regression
10.5 The NB-P model: generalized negative binomial
10.6 Generalized Waring regression
10.7 Bivariate negative binomial
10.8 Generalized Poisson regression
10.9 Poisson inverse Gaussian regression (PIG)
10.10 Other count models

11. Problems with zero counts
11.1 Zero-truncated count models
11.2 Hurdle models

11.2.1 Theory and formulae for hurdle models
11.2.2 Synthetic hurdle models
11.2.3 Applications
11.2.4 Marginal effects

11.3 Zero-inflated negative binomial models

11.3.1 Overview of ZIP/ZINB models
11.3.2 ZINB algorithms
11.3.3 Applications
11.3.4 Zero-altered negative binomial
11.3.5 Tests of comparative fit
11.3.6 ZINB marginal effects

11.4 Comparison of models

12. Censored and truncated count models

12.1 Censored and truncated models — econometric parameterization

12.1.1 Truncation
12.1.2 Censored models

12.2 Censored Poisson and NB2 models — survival parameterization

13. Handling endogeneity and latent class models

13.1 Finite mixture models

13.1.1 Basics of finite mixture modeling
13.1.2 Synthetic finite mixture models

13.2 Dealing with endogeneity and latent class models

13.2.1 Problems related to endogeneity
13.2.2 Two-stage instrumental variables approach
13.2.3 Generalized method of moments (GMM)
13.2.4 NB2 with an endogenous multinomial treatment variable
13.2.5 Endogeneity resulting from measurement error

13.3 Sample selection and stratification

13.3.1 Negative binomial with endogenous stratification
13.3.2 Sample selection models
13.3.3 Endogenous switching models

13.4 Quantile count models

14. Count panel models
14.1 Overview of count panel models
14.2 Generalized estimating equations: negative binomial

14.2.1 The GEE algorithm
14.2.2 GEE correlation structures
14.2.3 Negative binomial GEE models
14.2.4 GEE goodness-of-fit
14.2.5 GEE marginal effects

14.3 Unconditional fixed-effects negative binomial model
14.4 Conditional fixed-effects negative binomial model
14.5 Random-effects negative binomial
14.6 Mixed-effects negative binomial models

14.6.1 Random-intercept negative binomial models
14.6.2 Non-parametric random-intercept negative binomial
14.6.3 Random-coefficient negative binomial models

14.7 Multilevel models

15. Bayesian negative binomial models
15.1 Bayesian versus frequentist methodology
15.2 The logic of Bayesian regression estimation
15.3 Applications

Appendix A: Constructing and interpreting interaction terms

Appendix B: Data sets, commands, functions 