Bayesian Analysis with Stata

Bayesian Analysis with Stata is a complete guide to using Stata for Bayesian analysis. It contains just enough theoretical and foundational material to be useful to all levels of users interested in Bayesian statistics, from neophytes to aficionados.

 

The book is careful to introduce concepts and coding tools incrementally so that there are no steep patches or discontinuities in the learning curve. The content helps the user see exactly what computations are done for simple standard models and shows how those computations are implemented. Understanding these computations is important because Bayesian analysis lends itself to custom or very complex models, and users must be able to code these themselves. A particular strength of Bayesian Analysis with Stata is that it works through the computational methods three times: first using Stata’s ado-code, then using Mata, and finally using Stata to run the MCMC chains with WinBUGS or OpenBUGS. This repetition reinforces the material while making all three approaches accessible and clear. Once the computations and underlying methods have been explained, the book moves on to the more complex models that users will want to fit, providing examples and advice on how to implement them. In doing so, it covers advanced topics while still teaching the basics of Bayesian analysis, which is quite an achievement.

 

Bayesian Analysis with Stata presents all the material using real datasets rather than simulated ones, and many of the exercises also use real datasets. There is also a chapter on validating code for users who like to learn by simulating data from known models and then checking that those models are recovered. This gives users experience in running and assessing Bayesian models and teaches them to be careful when doing so.
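
To see the simulate-and-recover idea in miniature, the short Mata sketch below draws binomial data with a known success probability and checks that the posterior mean under a flat Beta(1, 1) prior lands close to the value used to generate the data. The variable names and numbers are invented for illustration and are not code or examples from the book.

    mata:
    // Illustrative simulate-and-recover check (made-up values, not the book's code):
    // draw binomial data with a known p, then confirm that the conjugate Beta
    // posterior mean is close to that known value.
    p_true = 0.3
    n      = 1000
    x      = sum(runiform(n, 1) :< p_true)      // simulated number of successes
    a      = 1                                  // flat Beta(1,1) prior
    b      = 1
    postmean = (a + x)/(a + b + n)              // conjugate Beta posterior mean
    printf("true p = %4.2f, posterior mean = %5.3f\n", p_true, postmean)
    end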

 

The book starts by discussing the principles of Bayesian analysis and by explaining the thought process underlying it. It then builds from the ground up, showing users how to write evaluators for posteriors in simple models and how to speed them up using algebraic simplification.
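
To give a flavor of what such an evaluator might look like, here is a minimal Mata sketch of a log posterior for a binomial likelihood with a Beta(a, b) prior. The function name logpost() and its arguments are invented for this illustration rather than taken from the book, and the algebraic simplification consists of dropping the constants that do not involve the parameter.

    mata:
    // Illustrative log posterior for a binomial likelihood with a Beta(a,b) prior.
    // Constants that do not involve p are dropped, because MCMC needs the
    // posterior only up to a constant of proportionality.
    real scalar logpost(real scalar p, real scalar x, real scalar n, real scalar a, real scalar b)
    {
        if (p <= 0 | p >= 1) return(.)          // outside the support of p
        return((x + a - 1)*ln(p) + (n - x + b - 1)*ln(1 - p))
    }
    end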

 

Of course, this type of evaluation is useful only in very simple models, so the book then addresses the MCMC methods used throughout the Bayesian world. Once again, it starts from the fundamentals, beginning with the Metropolis–Hastings algorithm and moving on to Gibbs sampling. Because Gibbs samplers are much quicker but require draws from full conditional distributions that are often nonstandard, the book thoroughly explains the specialty tools of Griddy sampling, slice sampling, and adaptive rejection sampling.
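
To make the first of these methods concrete, the sketch below is a bare-bones random-walk Metropolis–Hastings sampler in Mata for a single binomial probability with a Beta(a, b) prior. The function mh_binomial() and its arguments are invented for this illustration; they are not the mhs or mcmcrun commands that the book itself develops.

    mata:
    // A minimal random-walk Metropolis-Hastings sampler for a binomial probability p
    // with a Beta(a,b) prior (an illustrative sketch, not the book's code).
    real colvector mh_binomial(real scalar iters, real scalar sd, real scalar x,
        real scalar n, real scalar a, real scalar b)
    {
        real colvector chain
        real scalar    cur, prop, lcur, lprop, i

        chain = J(iters, 1, .)
        cur   = (x + 0.5)/(n + 1)                         // crude starting value
        lcur  = (x+a-1)*ln(cur) + (n-x+b-1)*ln(1-cur)     // log posterior, constants dropped
        for (i = 1; i <= iters; i++) {
            prop = cur + sd*rnormal(1, 1, 0, 1)           // symmetric normal proposal
            if (prop > 0 & prop < 1) {
                lprop = (x+a-1)*ln(prop) + (n-x+b-1)*ln(1-prop)
                if (ln(runiform(1, 1)) < lprop - lcur) {  // Metropolis-Hastings acceptance rule
                    cur  = prop
                    lcur = lprop
                }
            }
            chain[i] = cur                                // store the current draw
        }
        return(chain)
    }
    end

A call such as chain = mh_binomial(10000, 0.1, 8, 20, 1, 1), again with made-up numbers, would return 10,000 correlated draws from the posterior of p; scaling the proposal standard deviation, which the book discusses in section 3.5, controls how efficiently such a sampler explores the posterior.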

 

After discussing the computational tools, the book changes its focus to the MCMC assessment techniques needed for a proper Bayesian analysis; these include assessing convergence and avoiding the problems that can arise from slowly mixing chains. This is where burn-in is treated and where thinning and centering are introduced as ways to improve performance.
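
As a small illustration of the mechanics, the sketch below reuses the hypothetical mh_binomial() sampler defined above, discards an initial block of draws as burn-in, and thins what remains before summarizing. The run length, burn-in length, and thinning interval are arbitrary values chosen for the illustration, not recommendations from the book.

    mata:
    // Illustrative burn-in and thinning, reusing the mh_binomial() sketch above.
    // The run length, burn-in, and thinning interval are arbitrary choices.
    chain  = mh_binomial(11000, 0.1, 8, 20, 1, 1)
    burnin = 1000
    thin   = 10
    kept   = chain[|burnin + 1 \ rows(chain)|]              // drop the burn-in draws
    kept   = select(kept, mod(1::rows(kept), thin) :== 0)   // keep every 10th draw
    mean(kept), sqrt(variance(kept))                        // posterior mean and sd
    end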

 

The book then returns its focus to computation. First, it shows users how to use Mata in place of Stata’s ado-code; second, it demonstrates how to pass data and models to WinBUGS or OpenBUGS and retrieve their output. Using Mata speeds up evaluation, and using WinBUGS or OpenBUGS speeds it up further while opening a toolbox of built-in samplers and distributions that reduces the amount of custom Stata programming needed for complex models. This material is easy for the book to introduce and explain because the conceptual and computational groundwork has already been laid.

 

The book finishes with detailed chapters on model checking and selection, followed by a series of case studies that introduce extra modeling techniques and give advice on specialized Stata code. These chapters are very useful because they allow the book to be a self-contained introduction to Bayesian analysis while providing additional information on models that are normally beyond a basic introduction.

 


List of figures
List of tables
Preface
Acknowledgments

 

1 The problem of priors
1.1 Case study 1: An early phase vaccine trial
1.2 Bayesian calculations
1.3 Benefits of a Bayesian analysis
1.4 Selecting a good prior
1.5 Starting points
1.6 Exercises

 

2 Evaluating the posterior
2.1 Introduction
2.2 Case study 1: The vaccine trial revisited
2.3 Marginal and conditional distributions
2.4 Case study 2: Blood pressure and age
2.5 Case study 2: BP and age continued
2.6 General log posteriors
2.7 Adding distributions to logdensity
2.8 Changing parameterization
2.9 Starting points
2.10 Exercises

 

3 Metropolis–Hastings
3.1 Introduction
3.2 The MH algorithm in Stata
3.3 The mhs commands
3.4 Case study 3: Polyp counts
3.5 Scaling the proposal distribution
3.6 The mcmcrun command
3.7 Multiparameter models
3.8 Case study 3: Polyp counts continued
3.9 Highly correlated parameters

3.9.1 Centering
3.9.2 Block updating

3.10 Case study 3: Polyp counts yet again
3.11 Starting points
3.12 Exercises

 

4 Gibbs sampling
4.1 Introduction
4.2 Case study 4: A regression model for pain scores
4.3 Conjugate priors
4.4 Gibbs sampling with nonstandard distributions

4.4.1 Griddy sampling
4.4.2 Slice sampling
4.4.3 Adaptive rejection

4.5 The gbs commands
4.6 Case study 4 continued: Laplace regression
4.7 Starting points
4.8 Exercises

 

5 Assessing convergence
5.1 Introduction
5.2 Detecting early drift
5.3 Detecting too short a run

5.3.1 Thinning the chain

5.4 Running multiple chains
5.5 Convergence of functions of the parameters
5.6 Case study 5: Beta-blocker trials
5.7 Further reading
5.8 Exercises

 

6 Validating the Stata code and summarizing the results
6.1 Introduction
6.2 Case study 6: Ordinal regression
6.3 Validating the software
6.4 Numerical summaries
6.5 Graphical summaries
6.6 Further reading
6.7 Exercises

 

7 Bayesian analysis with Mata
7.1 Introduction
7.2 The basics of Mata
7.3 Case study 6: Revisited
7.4 Case study 7: Germination of broomrape

7.4.1 Tuning the proposal distributions
7.4.2 Using conditional distributions
7.4.3 More efficient computation
7.4.4 Hierarchical centering
7.4.5 Gibbs sampling
7.4.6 Slice, Griddy, and ARMS sampling
7.4.7 Timings
7.4.8 Adding new densities to logdensity()

7.5 Further reading
7.6 Exercises

 

8 Using WinBUGS for model fitting
8.1 Introduction
8.2 Installing the software

8.2.1 Installing OpenBUGS
8.2.2 Installing WinBUGS

8.3 Preparing a WinBUGS analysis

8.3.1 The model file
8.3.2 The data file
8.3.3 The initial values file
8.3.4 The script file
8.3.5 Running the script
8.3.6 Reading the results into Stata
8.3.7 Inspecting the log file
8.3.8 Reading WinBUGS data files

8.4 Case study 8: Growth of sea cows

8.4.1 WinBUGS or OpenBUGS

8.5 Case study 9: Jawbone size

8.5.1 Overrelaxation
8.5.2 Changing the seed for the random-number generator

8.6 Advanced features of WinBUGS

8.6.1 Missing data
8.6.2 Censoring and truncation
8.6.3 Nonstandard likelihoods
8.6.4 Nonstandard priors
8.6.5 The cut() function

8.7 GeoBUGS
8.8 Programming a series of Bayesian analyses
8.9 OpenBUGS under Linux
8.10 Debugging WinBUGS
8.11 Starting points
8.12 Exercises

 

9 Model checking
9.1 Introduction
9.2 Bayesian residual analysis
9.3 The mcmccheck command
9.4 Case study 10: Models for Salmonella assays

9.4.1 Generating the predictions in WinBUGS
9.4.2 Plotting the predictive distributions
9.4.3 Residual plots
9.4.4 Empirical probability plots
9.4.5 A summary plot

9.5 Residual checking with Stata
9.6 Residual checking with Mata
9.7 Further reading
9.8 Exercises

 

10 Model selection
10.1 Introduction
10.2 Case study 11: Choosing a genetic model

10.2.1 Plausible models
10.2.2 Bayes factors

10.3 Calculating a BF
10.4 Calculating the BFs for the NTD case study
10.5 Robustness of the BF
10.6 Model averaging
10.7 Information criteria
10.8 DIC for the genetic models
10.9 Starting points
10.10 Exercises

 

11 Further case studies
11.1 Introduction
11.2 Case study 12: Modeling cancer incidence
11.3 Case study 13: Creatinine clearance
11.4 Case study 14: Microarray experiment
11.5 Case study 15: Recurrent asthma attacks
11.6 Exercises

 

12 Writing Stata programs for specific Bayesian analysis
12.1 Introduction
12.2 The Bayesian lasso
12.3 The Gibbs sampler
12.4 The Mata code
12.5 A Stata ado-file
12.6 Testing the code
12.7 Case study 16: Diabetes data
12.8 Extensions to the Bayesian lasso program
12.9 Exercises

 

A Standard distributions 

References

Author index

Subject index

 

© Copyright 1996–2023 StataCorp LLC

Author: John Thompson
ISBN-13: 978-1-59718-141-9
©Copyright: 2014
