Multilevel and Longitudinal Modeling Using Stata, Fourth Edition, by Sophia Rabe-Hesketh and Anders Skrondal, is a complete resource for learning to model data in which observations are grouped—whether those groups are formed by a nesting structure, such as children nested in classrooms, or formed by repeated observations on the same individuals. This text introduces random-effects models, fixed-effects models, mixed-effects models, marginal models, dynamic models, and growth-curve models, all of which account for the grouped nature of these types of data. As Rabe-Hesketh and Skrondal introduce each model, they explain when the model is useful, its assumptions, how to fit and evaluate the model using Stata, and how to interpret the results. With this comprehensive coverage, researchers who need to apply multilevel models will find this book to be the perfect companion. It is also the ideal text for courses in multilevel modeling because it provides examples from a variety of disciplines as well as end-of-chapter exercises that allow students to practice newly learned material.
The book comprises two volumes. Volume I focuses on linear models for continuous outcomes, while volume II focuses on generalized linear models for binary, ordinal, count, and other types of outcomes.
Volume I begins with a review of linear regression and then builds on this review to introduce two-level models, the simplest extensions of linear regression to models for multilevel and longitudinal/panel data. Rabe-Hesketh and Skrondal introduce the random-intercept model without covariates, developing the model from principles and thereby familiarizing the reader with terminology, summarizing and relating the widely used estimating strategies, and providing historical perspective. Once the authors have established the foundation, they smoothly generalize to random-intercept models with covariates and then to a discussion of the various estimators (between, within, and random effects). The authors also discuss models with random coefficients. The text then turns to models specifically designed for longitudinal and panel data—dynamic models, marginal models, and growth-curve models. The last portion of volume I covers models with more than two levels and models with crossed random effects.
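To make the distinction between the between and within estimators mentioned above concrete, here is a minimal sketch. It is not taken from the book, which works in Stata; Python is used only for illustration. The function names (`ols_slope`, `between_within`) and the toy data are assumptions made up for this example. The between estimator regresses cluster means of the response on cluster means of the covariate, while the within estimator regresses deviations from the cluster means on each other.

```python
from statistics import mean


def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    xbar, ybar = mean(x), mean(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx


def between_within(clusters):
    """Return (between, within) slope estimates.

    clusters: dict mapping a cluster id to a list of (x, y) pairs.
    """
    xbars, ybars, xdev, ydev = [], [], [], []
    for obs in clusters.values():
        xb = mean(x for x, _ in obs)
        yb = mean(y for _, y in obs)
        xbars.append(xb)
        ybars.append(yb)
        # Deviations from the cluster means feed the within estimator.
        xdev.extend(x - xb for x, _ in obs)
        ydev.extend(y - yb for _, y in obs)
    # Cluster means feed the between estimator.
    return ols_slope(xbars, ybars), ols_slope(xdev, ydev)


# Hypothetical toy data: y rises with x inside every cluster,
# but clusters with larger mean x have smaller mean y.
toy = {
    "A": [(1.0, 5.0), (2.0, 6.0), (3.0, 7.0)],
    "B": [(3.0, 3.0), (4.0, 4.0), (5.0, 5.0)],
    "C": [(5.0, 1.0), (6.0, 2.0), (7.0, 3.0)],
}

between, within = between_within(toy)
print(between, within)  # -1.0 1.0
```

In this constructed example the two estimators have opposite signs, which shows why the choice of estimator matters when cluster-level and within-cluster relationships differ.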
The foundation and in-depth coverage of linear-model principles provided in volume I allow for a straightforward transition to generalized linear models for noncontinuous outcomes, which are described in volume II. This second volume begins with chapters introducing multilevel and longitudinal models for binary, ordinal, nominal, and count data. Focus then turns to survival analysis, introducing multilevel models for both discrete-time survival data and continuous-time survival data. The volume concludes by extending the two-level generalized linear models introduced in previous chapters to models with three or more levels and to models with crossed random effects.
In both volumes, readers will find extensive applications of multilevel and longitudinal models. Using many datasets that appeal to a broad audience, Rabe-Hesketh and Skrondal provide worked examples in each chapter. They also show the breadth of Stata’s commands for fitting the models discussed. They demonstrate Stata’s xt suite of commands (xtreg, xtlogit, xtpoisson, etc.), which is designed for two-level random-intercept models for longitudinal/panel data. They demonstrate the me suite of commands (mixed, melogit, mepoisson, etc.), which is designed for multilevel models, including those with random coefficients and those with three or more levels. In volume II, they discuss gllamm, a community-contributed Stata command developed by Rabe-Hesketh and Skrondal that can fit many latent-variable models, of which the generalized linear mixed-effects model is a special case. The types of models fit by the xt commands, the me commands, and gllamm sometimes overlap; when this happens, the authors highlight the differences in syntax, data organization, and output for the commands. The authors also point out the strengths and weaknesses of these commands, based on considerations such as computational speed, accuracy, available predictions, and available postestimation statistics.
The fourth edition of Multilevel and Longitudinal Modeling Using Stata has been thoroughly revised and updated. In it, you will find new material on Kenward–Roger degrees-of-freedom adjustments for small sample sizes, difference-in-differences estimation for natural experiments, instrumental-variables estimation to account for level-one endogeneity, and Bayesian estimation for crossed-effects models. In addition, you will find new discussions of meologit, cmxtmixlogit, mestreg, menbreg, and other commands introduced in Stata since the third edition of the book.
In summary, Multilevel and Longitudinal Modeling Using Stata, Fourth Edition is the most complete, up-to-date depiction of Stata’s capacity for fitting models to multilevel and longitudinal data. Readers will also find thorough explanations of the methods and practical advice for using these techniques. This text is a great introduction for researchers and students wanting to learn about these powerful data analysis tools.
© Copyright 1996–2023 StataCorp LLC
10.2 Single-level logit and probit regression models for dichotomous responses
10.2.1 Generalized linear model formulation
Estimation using logit
Estimation using glm
10.2.2 Latent-response formulation
Probit regression
Estimation using probit
10.3 Which treatment is best for toenail infection?
10.4 Longitudinal data structure
10.5 Proportions and fitted population-averaged or marginal probabilities
10.6 Random-intercept logistic regression
10.6.1 Model specification
Two-stage formulation
10.6.2 Model assumptions
10.6.3 Estimation
Using melogit
Using gllamm
10.7 Subject-specific or conditional versus population-averaged or marginal relationships
10.8 Measures of dependence and heterogeneity
10.8.2 Median odds ratio
10.8.3 Measures of association for observed responses at median fixed part of the model
10.9 Inference for random-intercept logistic models
10.9.2 Tests of variance components
10.10 Maximum likelihood estimation
10.10.2 Some speed and accuracy considerations
Starting values
Using melogit and gllamm for collapsible data
Spherical quadrature in gllamm
10.11 Assigning values to random effects
10.11.2 Empirical Bayes prediction
10.11.3 Empirical Bayes modal prediction
10.12 Different kinds of predicted probabilities
10.12.2 Predicted subject-specific probabilities
Predictions for the subjects in the sample: Posterior mean probabilities
10.13 Other approaches to clustered dichotomous data
10.13.1 Conditional logistic regression
10.13.2 Generalized estimating equations (GEE)
10.14 Summary and further reading
10.15 Exercises
11.2 Single-level cumulative models for ordinal responses
11.2.2 Latent-response formulation
11.2.3 Proportional odds
11.2.4 Identification
11.3 Are antipsychotic drugs effective for patients with schizophrenia?
11.4 Longitudinal data structure and graphs
11.4.2 Plotting cumulative proportions
11.4.3 Plotting cumulative sample logits and transforming the time scale
11.5 Single-level proportional-odds model
11.5.1 Model specification
11.6 Random-intercept proportional-odds model
11.6.1 Model specification
Estimation using gllamm
11.6.2 Measures of dependence and heterogeneity
Median odds ratio
11.7 Random-coefficient proportional-odds model
11.7.1 Model specification
Estimation using gllamm
11.8 Different kinds of predicted probabilities
11.8.2 Predicted subject-specific probabilities: Posterior mean
11.9 Do experts differ in their grading of student essays?
11.10 A random-intercept probit model with grader bias
11.10.1 Model specification
11.11 Including grader-specific measurement-error variances
11.11.1 Model specification
11.12 Including grader-specific thresholds
11.12.1 Model specification
11.13 Other link functions
Continuation-ratio logit model
Adjacent-category logit model
Baseline-category logit and stereotype models
11.14 Summary and further reading
11.15 Exercises
12.2 Single-level models for nominal responses
12.2.1 Multinomial logit models
Estimation using mlogit
12.2.2 Conditional logit models with alternative-specific covariates
Estimation using clogit
Estimation using cmclogit
12.2.3 Conditional logit models with alternative- and unit-specific covariates
Estimation using cmclogit
12.3 Independence from irrelevant alternatives
12.4 Utility-maximization formulation
12.5 Does marketing affect choice of yogurt?
12.6 Single-level conditional logit models
12.6.1 Conditional logit models with alternative-specific intercepts
Estimation using cmclogit
12.7 Multilevel conditional logit models
12.7.1 Preference heterogeneity: Brand-specific random intercepts
Estimation using gllamm
12.7.2 Response heterogeneity: Marketing variables with random coefficients
Estimation using gllamm
12.7.3 Preference and response heterogeneity
Estimation using gllamm
12.8 Prediction of marginal choice probabilities
12.9 Prediction of random effects and household-specific choice probabilities
12.10 Summary and further reading
12.11 Exercises
13.2 What are counts?
13.2.2 Counts as aggregated event-history data
13.3 Single-level Poisson models for counts
13.4 Did the German healthcare reform reduce the number of doctor visits?
13.5 Longitudinal data structure
13.6 Single-level Poisson regression
13.6.1 Model specification
Estimation using glm
13.7 Random-intercept Poisson regression
13.7.2 Measures of dependence and heterogeneity
13.7.3 Estimation
Using mepoisson
Using gllamm
13.8 Random-coefficient Poisson regression
13.8.1 Model specification
Estimation using gllamm
13.9 Overdispersion in single-level models
13.9.1 Normally distributed random intercept
13.9.2 Negative binomial models
Constant dispersion or NB1
13.9.3 Quasilikelihood
13.10 Level-1 overdispersion in two-level models
13.10.1 Random-intercept Poisson model with robust standard errors
13.10.2 Three-level random-intercept model
13.10.3 Negative binomial models with random intercepts
13.10.4 The HHG model
13.11 Other approaches to two-level count data
13.11.1 Conditional Poisson regression
Estimation using Poisson regression with dummy variables for clusters
13.11.2 Conditional negative binomial regression
13.11.3 Generalized estimating equations
13.12 Marginal and conditional effects when responses are MAR
13.13 Which Scottish counties have a high risk of lip cancer?
13.14 Standardized mortality ratios
13.15 Random-intercept Poisson regression
13.15.1 Model specification
13.15.2 Prediction of standardized mortality ratios
13.16 Nonparametric maximum likelihood estimation
13.16.1 Specification
13.16.2 Prediction
13.17 Summary and further reading
13.18 Exercises
14.2 Single-level models for discrete-time survival data
14.2.1 Discrete-time hazard and discrete-time survival
14.2.2 Data expansion for discrete-time survival analysis
14.2.3 Estimation via regression models for dichotomous responses
14.2.4 Including time-constant covariates
14.2.5 Including time-varying covariates
14.2.6 Multiple absorbing events and competing risks
14.2.7 Handling left-truncated data
14.3 How does mother’s birth history affect child mortality?
14.4 Data expansion
14.5 Proportional hazards and interval-censoring
14.6 Complementary log–log models
14.6.1 Marginal baseline hazard
14.6.2 Including covariates
14.7 Random-intercept complementary log–log model
14.7.1 Model specification
14.9 Summary and further reading
14.10 Exercises
15.2 What makes marriages fail?
15.3 Hazards and survival
15.4 Proportional hazards models
15.4.1 Piecewise exponential model
Estimation using poisson
15.4.2 Cox regression model
15.4.3 Cox regression via Poisson regression for expanded data
15.4.4 Approximate Cox regression: Poisson regression, smooth baseline hazard
15.5 Accelerated failure-time models
15.5.1 Log-normal model
Estimation using stintreg
15.6 Time-varying covariates
15.7 Does nitrate reduce the risk of angina pectoris?
15.8 Marginal modeling
15.8.1 Cox regression with occasion-specific dummy variables
15.8.2 Cox regression with occasion-specific baseline hazards
15.8.3 Approximate Cox regression
15.9 Multilevel proportional hazards models
15.9.1 Cox regression with gamma shared frailty
15.9.2 Approximate Cox regression with log-normal shared frailty
15.9.3 Approximate Cox regression with normal random intercept and coefficient
15.10 Multilevel accelerated failure-time models
15.10.1 Log-normal model with gamma shared frailty
15.10.2 Log-normal model with log-normal shared frailty
15.10.3 Log-normal model with normal random intercept and random coefficient
15.11 Fixed-effects approach
15.11.1 Stratified Cox regression with subject-specific baseline hazards
15.12 Different approaches to recurrent-event data
15.12.2 Counting process risk interval
15.12.3 Gap-time risk interval
15.13 Summary and further reading
15.14 Exercises
16.2 Did the Guatemalan-immunization campaign work?
16.3 A three-level random-intercept logistic regression model
16.3.2 Measures of dependence and heterogeneity
Types of median odds ratios
16.3.3 Three-stage formulation
16.3.4 Estimation
Using gllamm
16.4 A three-level random-coefficient logistic regression model
16.4.1 Estimation
Using gllamm
16.5 Prediction of random effects
16.5.2 Empirical Bayes modal prediction
16.6 Different kinds of predicted probabilities
16.6.2 Predicted median or conditional probabilities
16.6.3 Predicted posterior mean probabilities: Existing clusters
16.7 Do salamanders from different populations mate successfully?
16.8 Crossed random-effects logistic regression
16.8.2 Approximate maximum likelihood estimation
16.8.3 Bayesian estimation
Priors for the salamander data
Estimation using bayes: melogit
16.8.4 Estimates compared
16.8.5 Fully Bayesian versus empirical Bayesian inference for random effects
16.9 Summary and further reading
16.10 Exercises
Author index (PDF)
Subject index (PDF)