Psychological Statistics and Psychometrics Using Stata

Psychological Statistics and Psychometrics Using Stata by Scott Baldwin is a complete and concise resource for students and researchers in the behavioral sciences.


Baldwin’s primary goal in this book is to help readers become competent users of statistics. To that end, he first introduces basic statistical methods such as regression, t tests, and ANOVA. He focuses on explaining the models, how they can be used with different types of variables, and how to interpret the results. After building this foundation, Baldwin covers more advanced statistical techniques, including power and sample-size calculations, multilevel modeling, and structural equation modeling. This book also discusses measurement concepts that are crucial in psychometrics. For instance, Baldwin explores how reliability and validity can be understood and evaluated using exploratory and confirmatory factor analysis. Baldwin includes dozens of worked examples using real data to illustrate the theory and concepts.


In addition to teaching statistical topics, this book helps readers become proficient Stata users. Baldwin teaches Stata basics ranging from navigating the interface to using features for data management, descriptive statistics, and graphics. He emphasizes the need for reproducibility in data analysis; therefore, he is careful to explain how version control and do-files can be used to ensure that results are reproducible. As each statistical concept is introduced, the corresponding commands for fitting and interpreting models are demonstrated. Beyond this, readers learn how to run simulations in Stata to help them better understand the models they are fitting and other statistical concepts.


This book is an excellent textbook for graduate-level courses in psychometrics. It is also an ideal reference for psychometricians and other social scientists who are new to Stata.


© Copyright 1996–2023 StataCorp LLC

List of figures
List of tables
Notation and Typography


Getting oriented to Stata


1 Introduction
1.1 Structure of the book
1.2 Benefits of Stata
1.3 Scientific context


2 Introduction to Stata
2.1 Point-and-click versus writing commands
2.2 The Stata interface
2.3 Getting data in Stata
2.4 Viewing and describing data

2.4.1 list, in, and if

2.5 Creating new variables

2.5.1 Missing data
2.5.2 Labels

2.6 Summarizing data

2.6.1 summarize
2.6.2 table and tabulate

2.7 Graphing data

2.7.1 Histograms
2.7.2 Box plots
2.7.3 Scatterplots

2.8 Reproducible analysis

2.8.1 Do-files
2.8.2 Log files
2.8.3 Project Manager
2.8.4 Workflow

2.9 Getting help

2.9.1 Help documents
2.9.2 PDF documentation

2.10 Extending Stata

2.10.1 Statistical Software Components
2.10.2 Writing your own programs


Understanding relationships between variables


3 Regression with continuous predictors
3.1 Data
3.2 Exploration

3.2.1 Demonstration
Simulation program

3.3 Bivariate regression

3.3.1 Lines
3.3.2 Regression equation
3.3.3 Estimation
3.3.4 Interpretation
3.3.5 Residuals and predicted values
3.3.6 Partitioning variance
3.3.7 Confidence intervals
3.3.8 Null hypothesis significance testing
3.3.9 Additional methods for understanding models
Using predicted scores to understand model implications
Composite contrasts

3.4 Conclusions


4 Regression with categorical and continuous predictors
4.1 Data for this chapter
4.2 Why categorical predictors need special care
4.3 Dummy coding

4.3.1 Example: Incorrect use of categorical variable

4.4 Multiple predictors

4.4.1 Interpretation
Model fit
4.4.2 Unique variance

4.5 Interactions

4.5.1 Categorical by continuous interactions
Dichotomous by continuous interactions
Polytomous by continuous interactions
Joint test for interactions with polytomous variables
4.5.2 Continuous by continuous interactions

4.6 Summary


5 t tests and one-way ANOVA
5.1 Data
5.2 Comparing two means

5.2.1 t test
5.2.2 Effect size

5.3 Comparing three or more means

5.3.1 Analysis of variance
5.3.2 Multiple comparisons
Planned comparisons
Direct adjustment for multiple comparisons

5.4 Summary


6 Factorial ANOVA
6.1 Data for this chapter
6.2 Factorial design with two factors

6.2.1 Examining and visualizing the data
6.2.2 Main effects
Testing the null hypothesis
6.2.3 Interactions
6.2.4 Partitioning the variance
6.2.5 2 x 2 source table
6.2.6 Using anova to estimate a factorial ANOVA
6.2.7 Simple effects
6.2.8 Effect size

6.3 Factorial design with three factors

6.3.1 Examining and visualizing the data
6.3.2 Marginal means
6.3.3 Main effects and interactions
6.3.4 Three-way interaction
6.3.5 Fitting the model with anova
6.3.6 Interpreting the interaction
6.3.7 A note about effect size

6.4 Conclusion


7 Repeated-measures models
7.1 Data for this chapter
7.2 Basic model
7.3 Using mixed to fit a repeated-measures model

7.3.1 Covariance structures
Compound symmetry (exchangeable)
First-order autoregressive
7.3.2 Degrees of freedom
7.3.3 Pairwise comparisons

7.4 Models with multiple factors
7.5 Estimating heteroskedastic residuals
7.6 Summary


8 Planning studies: Power and sample-size calculations

8.1 Foundational ideas

8.1.1 Null and alternative distributions
8.1.2 Simulating draws out of the null and alternative distributions

8.2 Computing power manually
8.3 Stata’s commands

8.3.1 Two-sample z test
8.3.2 Two-sample t test
8.3.3 Correlation
8.3.4 One-way ANOVA
8.3.5 Factorial ANOVA

8.4 The central importance of power

8.4.1 Type M and S errors
Type S errors
Type M errors

8.5 Summary


9 Multilevel models for cross-sectional data
9.1 Data used in this chapter
9.2 Why clustered data structures matter

9.2.1 Statistical issues
9.2.2 Conceptual issues

9.3 Basics of a multilevel model

9.3.1 Partitioning sources of variance
9.3.2 Random intercepts
9.3.3 Estimating random intercepts
9.3.4 Intraclass correlations
9.3.5 Estimating cluster means
Comparing pooled and unpooled means
9.3.6 Adding a predictor

9.4 Between-clusters and within-cluster relationships

9.4.1 Partitioning variance in the predictor
9.4.2 Total- versus level-specific relationships
9.4.3 Exploring the between-clusters and within-cluster relationships
9.4.4 Estimating the between-clusters and within-cluster effects

9.5 Random slopes
9.6 Summary


10 Multilevel models for longitudinal data
10.1 Data used in this chapter
10.2 Basic growth model

10.2.1 Multilevel model

10.3 Adding a level-2 predictor
10.4 Adding a level-1 predictor
10.5 Summary


Psychometrics through the lens of factor analysis


11 Factor analysis: Reliability
11.1 What you will learn in this chapter
11.2 Example data
11.3 Common versus unique variance
11.4 One-factor model

11.4.1 Parts of a path model
11.4.2 Where do the latent variables come from?

11.5 Prediction equation
11.6 Using sem to estimate CFA models
11.7 Model fit

11.7.1 Computing χ²

11.8 Obtaining σ²C and σ²U

11.8.1 Computing R² for an item
11.8.2 Computing σ²C and σ²U for all items
11.8.3 Computing reliability—ω
11.8.4 Bootstrapping the standard error and 95% confidence interval for ω

11.9 Comparing ω with α

11.9.1 Evaluating the assumption of tau-equivalence
11.9.2 Parallel items

11.10 Correlated residuals
11.11 Summary


12 Factor analysis: Factorial validity
12.1 Data for this chapter
12.2 Exploratory factor analysis

12.2.1 Common factor model
12.2.2 Extraction methods
12.2.3 Interpreting loadings
12.2.4 Eigenvalues
12.2.5 Communality and uniqueness
12.2.6 Factor analysis versus principal-component analysis
12.2.7 Choosing factors and rotation
How many factors should we extract?
Eigenvalue-greater-than-one rule
Scree plots
Parallel analysis
Orthogonal rotation—varimax
Oblique rotation—promax

12.3 Confirmatory factor analysis

12.3.1 EFA versus CFA
12.3.2 Estimating a CFA with sem
12.3.3 Mean structure versus variance structure
12.3.4 Identifying models
Imposing constraints for identification
How much information is needed to identify a model?
12.3.5 Refitting the model with constrained latent variables
12.3.6 Standardized solutions
12.3.7 Global fit
A summary and a caution
12.3.8 Refining models further
12.3.9 Parallel items

12.4 Summary


13 Measurement invariance
13.1 Data
13.2 Measurement invariance
13.3 Measurement invariance across groups

13.3.1 Configural invariance
13.3.2 Metric invariance
13.3.3 Scalar invariance
13.3.4 Residual invariance
13.3.5 Using the comparative fit index to evaluate invariance

13.4 Structural invariance

13.4.1 Invariant factor variances
13.4.2 Invariant factor means

13.5 Measurement invariance across time

13.5.1 Configural invariance
Effects coding for identification
Effects-coding constraints in Stata
13.5.2 Metric invariance
13.5.3 Scalar invariance
13.5.4 Residual invariance

13.6 Structural invariance
13.7 Summary


Author index
Subject index



Author: Scott Baldwin
ISBN-13: 978-1-59718-303-1
©Copyright: 2019 Stata Press
e-Book version available
