*Multilevel and Longitudinal Modeling Using Stata, Fourth Edition*, by Sophia Rabe-Hesketh and Anders Skrondal, is a complete resource for learning to model data in which observations are grouped—whether those groups are formed by a nesting structure, such as children nested in classrooms, or formed by repeated observations on the same individuals. This text introduces random-effects models, fixed-effects models, mixed-effects models, marginal models, dynamic models, and growth-curve models, all of which account for the grouped nature of these types of data. As Rabe-Hesketh and Skrondal introduce each model, they explain when the model is useful, its assumptions, how to fit and evaluate the model using Stata, and how to interpret the results. With this comprehensive coverage, researchers who need to apply multilevel models will find this book to be the perfect companion. It is also the ideal text for courses in multilevel modeling because it provides examples from a variety of disciplines as well as end-of-chapter exercises that allow students to practice newly learned material.

The book comprises two volumes. Volume I focuses on linear models for continuous outcomes, while volume II focuses on generalized linear models for binary, ordinal, count, and other types of outcomes.

Volume I begins with a review of linear regression and then builds on this review to introduce two-level models, the simplest extensions of linear regression to models for multilevel and longitudinal/panel data. Rabe-Hesketh and Skrondal introduce the random-intercept model without covariates, developing the model from principles and thereby familiarizing the reader with terminology, summarizing and relating the widely used estimating strategies, and providing historical perspective. Once the authors have established the foundation, they smoothly generalize to random-intercept models with covariates and then to a discussion of the various estimators (between, within, and random effects). The authors also discuss models with random coefficients. The text then turns to models specifically designed for longitudinal and panel data—dynamic models, marginal models, and growth-curve models. The last portion of volume I covers models with more than two levels and models with crossed random effects.

The foundation and in-depth coverage of linear-model principles provided in volume I allow for a straightforward transition to generalized linear models for noncontinuous outcomes, which are described in volume II. This second volume begins with chapters introducing multilevel and longitudinal models for binary, ordinal, nominal, and count data. Focus then turns to survival analysis, introducing multilevel models for both discrete-time survival data and continuous-time survival data. The volume concludes by extending the two-level generalized linear models introduced in previous chapters to models with three or more levels and to models with crossed random effects.

In both volumes, readers will find extensive applications of multilevel and longitudinal models. Using many datasets that appeal to a broad audience, Rabe-Hesketh and Skrondal provide worked examples in each chapter. They also show the breadth of Stata’s commands for fitting the models discussed. They demonstrate Stata’s **xt** suite of commands (**xtreg**, **xtlogit**, **xtpoisson**, etc.), which is designed for two-level random-intercept models for longitudinal/panel data. They demonstrate the **me** suite of commands (**mixed**, **melogit**, **mepoisson**, etc.), which is designed for multilevel models, including those with random coefficients and those with three or more levels. In volume II, they discuss **gllamm**, a community-contributed Stata command developed by Rabe-Hesketh and Skrondal that can fit many latent-variable models, of which the generalized linear mixed-effects model is a special case. The types of models fit by the **xt** commands, the **me** commands, and **gllamm** sometimes overlap; when this happens, the authors highlight the differences in syntax, data organization, and output for the commands. The authors also point out the strengths and weaknesses of these commands, based on considerations such as computational speed, accuracy, available predictions, and available postestimation statistics.

The fourth edition of *Multilevel and Longitudinal Modeling Using Stata* has been thoroughly revised and updated. In it, you will find new material on Kenward–Roger degrees-of-freedom adjustments for small sample sizes, difference-in-differences estimation for natural experiments, instrumental-variables estimation to account for level-one endogeneity, and Bayesian estimation for crossed-effects models. In addition, you will find new discussions of **meologit**, **cmxtmixlogit**, **mestreg**, **menbreg**, and other commands introduced in Stata since the third edition of the book.

In summary, *Multilevel and Longitudinal Modeling Using Stata, Fourth Edition* is the most complete, up-to-date depiction of Stata’s capacity for fitting models to multilevel and longitudinal data. Readers will also find thorough explanations of the methods and practical advice for using these techniques. This text is a great introduction for researchers and students wanting to learn about these powerful data analysis tools.

© Copyright 1996–2023 StataCorp LLC

**List of tables**

**List of figures**

**List of displays**

**V Models for categorical responses**

10.2 Single-level logit and probit regression models for dichotomous responses

10.2.1 Generalized linear model formulation

Estimation using logit

Estimation using glm

10.2.2 Latent-response formulation

Probit regression

Estimation using probit

10.3 Which treatment is best for toenail infection?

10.4 Longitudinal data structure

10.5 Proportions and fitted population-averaged or marginal probabilities

10.6 Random-intercept logistic regression

10.6.1 Model specification

Two-stage formulation

10.6.2 Model assumptions

10.6.3 Estimation

Using melogit

Using gllamm

10.7 Subject-specific or conditional versus population-averaged or marginal relationships

10.8 Measures of dependence and heterogeneity

10.8.2 Median odds ratio

10.8.3 Measures of association for observed responses at median fixed part of the model

10.9 Inference for random-intercept logistic models

10.9.2 Tests of variance components

10.10 Maximum likelihood estimation

10.10.2 Some speed and accuracy considerations

Starting values

Using melogit and gllamm for collapsible data

Spherical quadrature in gllamm

10.11 Assigning values to random effects

10.11.2 Empirical Bayes prediction

10.11.3 Empirical Bayes modal prediction

10.12 Different kinds of predicted probabilities

10.12.2 Predicted subject-specific probabilities

Predictions for the subjects in the sample: Posterior mean probabilities

10.13 Other approaches to clustered dichotomous data

10.13.1 Conditional logistic regression

10.13.2 Generalized estimating equations (GEE)

10.14 Summary and further reading

10.15 Exercises

11.2 Single-level cumulative models for ordinal responses

11.2.2 Latent-response formulation

11.2.3 Proportional odds

11.2.4 Identification

11.3 Are antipsychotic drugs effective for patients with schizophrenia?

11.4 Longitudinal data structure and graphs

11.4.2 Plotting cumulative proportions

11.4.3 Plotting cumulative sample logits and transforming the time scale

11.5 Single-level proportional-odds model

11.5.1 Model specification

11.6 Random-intercept proportional-odds model

11.6.1 Model specification

Estimation using gllamm

11.6.2 Measures of dependence and heterogeneity

Median odds ratio

11.7 Random-coefficient proportional-odds model

11.7.1 Model specification

Estimation using gllamm

11.8 Different kinds of predicted probabilities

11.8.2 Predicted subject-specific probabilities: Posterior mean

11.9 Do experts differ in their grading of student essays?

11.10 A random-intercept probit model with grader bias

11.10.1 Model specification

11.11 Including grader-specific measurement-error variances

11.11.1 Model specification

11.12 Including grader-specific thresholds

11.12.1 Model specification

11.13 Other link functions

Continuation-ratio logit model

Adjacent-category logit model

Baseline-category logit and stereotype models

11.14 Summary and further reading

11.15 Exercises

12.2 Single-level models for nominal responses

12.2.1 Multinomial logit models

Estimation using mlogit

12.2.2 Conditional logit models with alternative-specific covariates

Estimation using clogit

Estimation using cmclogit

12.2.3 Conditional logit models with alternative- and unit-specific covariates

Estimation using cmclogit

12.3 Independence from irrelevant alternatives

12.4 Utility-maximization formulation

12.5 Does marketing affect choice of yogurt?

12.6 Single-level conditional logit models

12.6.1 Conditional logit models with alternative-specific intercepts

Estimation using cmclogit

12.7 Multilevel conditional logit models

12.7.1 Preference heterogeneity: Brand-specific random intercepts

Estimation using gllamm

12.7.2 Response heterogeneity: Marketing variables with random coefficients

Estimation using gllamm

12.7.3 Preference and response heterogeneity

Estimation using gllamm

12.8 Prediction of marginal choice probabilities

12.9 Prediction of random effects and household-specific choice probabilities

12.10 Summary and further reading

12.11 Exercises

**VI Models for counts**

13.2 What are counts?

13.2.2 Counts as aggregated event-history data

13.3 Single-level Poisson models for counts

13.4 Did the German healthcare reform reduce the number of doctor visits?

13.5 Longitudinal data structure

13.6 Single-level Poisson regression

13.6.1 Model specification

Estimation using glm

13.7 Random-intercept Poisson regression

13.7.2 Measures of dependence and heterogeneity

13.7.3 Estimation

Using mepoisson

Using gllamm

13.8 Random-coefficient Poisson regression

13.8.1 Model specification

Estimation using gllamm

13.9 Overdispersion in single-level models

13.9.1 Normally distributed random intercept

13.9.2 Negative binomial models

Constant dispersion or NB1

13.9.3 Quasilikelihood

13.10 Level-1 overdispersion in two-level models

13.10.1 Random-intercept Poisson model with robust standard errors

13.10.2 Three-level random-intercept model

13.10.3 Negative binomial models with random intercepts

13.10.4 The HHG model

13.11 Other approaches to two-level count data

13.11.1 Conditional Poisson regression

Estimation using Poisson regression with dummy variables for clusters

13.11.2 Conditional negative binomial regression

13.11.3 Generalized estimating equations

13.12 Marginal and conditional effects when responses are MAR

13.13 Which Scottish counties have a high risk of lip cancer?

13.14 Standardized mortality ratios

13.15 Random-intercept Poisson regression

13.15.1 Model specification

13.15.2 Prediction of standardized mortality ratios

13.16 Nonparametric maximum likelihood estimation

13.16.1 Specification

13.16.2 Prediction

13.17 Summary and further reading

13.18 Exercises

**VII Models for survival or duration data**

**Introduction to models for survival or duration data (part VII)**

14.2 Single-level models for discrete-time survival data

14.2.1 Discrete-time hazard and discrete-time survival

14.2.2 Data expansion for discrete-time survival analysis

14.2.3 Estimation via regression models for dichotomous responses

14.2.4 Including time-constant covariates

14.2.5 Including time-varying covariates

14.2.6 Multiple absorbing events and competing risks

14.2.7 Handling left-truncated data

14.3 How does mother’s birth history affect child mortality?

14.4 Data expansion

14.5 Proportional hazards and interval-censoring

14.6 Complementary log–log models

14.6.1 Marginal baseline hazard

14.6.2 Including covariates

14.7 Random-intercept complementary log–log model

14.7.1 Model specification

14.9 Summary and further reading

14.10 Exercises

15.2 What makes marriages fail?

15.3 Hazards and survival

15.4 Proportional hazards models

15.4.1 Piecewise exponential model

Estimation using poisson

15.4.2 Cox regression model

15.4.3 Cox regression via Poisson regression for expanded data

15.4.4 Approximate Cox regression: Poisson regression, smooth baseline hazard

15.5 Accelerated failure-time models

15.5.1 Log-normal model

Estimation using stintreg

15.6 Time-varying covariates

15.7 Does nitrate reduce the risk of angina pectoris?

15.8 Marginal modeling

15.8.1 Cox regression with occasion-specific dummy variables

15.8.2 Cox regression with occasion-specific baseline hazards

15.8.3 Approximate Cox regression

15.9 Multilevel proportional hazards models

15.9.1 Cox regression with gamma shared frailty

15.9.2 Approximate Cox regression with log-normal shared frailty

15.9.3 Approximate Cox regression with normal random intercept and coefficient

15.10 Multilevel accelerated failure-time models

15.10.1 Log-normal model with gamma shared frailty

15.10.2 Log-normal model with log-normal shared frailty

15.10.3 Log-normal model with normal random intercept and random coefficient

15.11 Fixed-effects approach

15.11.1 Stratified Cox regression with subject-specific baseline hazards

15.12 Different approaches to recurrent-event data

15.12.2 Counting process risk interval

15.12.3 Gap-time risk interval

15.13 Summary and further reading

15.14 Exercises

**VIII Models with nested and crossed random effects**

16.2 Did the Guatemalan-immunization campaign work?

16.3 A three-level random-intercept logistic regression model

16.3.2 Measures of dependence and heterogeneity

Types of median odds ratios

16.3.3 Three-stage formulation

16.3.4 Estimation

Using gllamm

16.4 A three-level random-coefficient logistic regression model

16.4.1 Estimation

Using gllamm

16.5 Prediction of random effects

16.5.2 Empirical Bayes modal prediction

16.6 Different kinds of predicted probabilities

16.6.2 Predicted median or conditional probabilities

16.6.3 Predicted posterior mean probabilities: Existing clusters

16.7 Do salamanders from different populations mate successfully?

16.8 Crossed random-effects logistic regression

16.8.2 Approximate maximum likelihood estimation

16.8.3 Bayesian estimation

Priors for the salamander data

Estimation using bayes: melogit

16.8.4 Estimates compared

16.8.5 Fully Bayesian versus empirical Bayesian inference for random effects

16.9 Summary and further reading

16.10 Exercises

**A Syntax for gllamm, eq, and gllapred: The bare essentials**

**B Syntax for gllamm**

**C Syntax for gllapred**

**D Syntax for gllasim**

**References**

**Author index**

**Subject index**
