Estimate mediation effects, analyze the relationship between an unobserved latent concept such as depression and the observed variables that measure depression, model a system with many endogenous variables and correlated errors, or fit a model with complex relationships among both latent and observed variables. Fit models with continuous, binary, count, ordinal, fractional, and survival outcomes. Even fit multilevel models with groups of correlated observations such as children within the same schools. Evaluate model fit. Compute indirect and total effects. Fit models by drawing a path diagram or using the straightforward command syntax.
Learn about structural equation modeling (SEM).
Model specification
- Use the SEM Builder or command language
- SEM Builder uses standard path diagrams
- Command language is a natural variation on path diagrams
- Group estimation is as easy as adding group(sex). Also easily add or relax constraints—ginvariant(mcoef) constrains all coefficients in the measurement model to be equal across groups. Or add paths for some groups but not others.
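As a rough sketch of the command language, the following simulates a small dataset and fits a one-factor measurement model by group. The variable names (`x1` to `x4`, `sex`), the latent variable name `Trait`, and the simulated loadings are illustrative assumptions, not part of any shipped dataset.

```stata
* Simulate hypothetical data: four indicators of one latent trait, two groups
clear
set seed 12345
set obs 1000
generate byte sex = 1 + (runiform() < 0.5)    // group codes 1 and 2
generate double score = rnormal()
forvalues i = 1/4 {
    generate double x`i' = 0.8*score + rnormal()
}
drop score

* The path "Trait -> x1 x2 x3 x4" mirrors the arrows of a path diagram.
* Capitalized names that are not variables in the data are taken as latent.
sem (Trait -> x1 x2 x3 x4), group(sex) ginvariant(mcoef)

* Tests of group-invariance constraints that could be relaxed or added
estat ginvariant
```

Here `ginvariant(mcoef)` holds the measurement coefficients equal across the two groups while leaving other parameter classes free to differ.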
SEM Builder
- Drag, drop, and connect to create path diagrams
- Estimate models from path diagrams
- Display results on the path diagram
- Save and modify diagrams
- Tools to create measurement and regression components
- Set constant and equality constraints by clicking
- Complete control of how your diagrams look
Watch the Using the SEM Builder in Stata tutorial.
Classes of models for linear SEM
- Linear regression
- Multivariate regression
- Path analysis
- Mediation analysis
- Measurement models
- Confirmatory factor analysis (CFA)
- Multiple indicators and multiple causes (MIMIC) models
- Latent growth curve models
- Hierarchical CFA
- Correlated uniqueness models
- Arbitrary structural equation models
Additional classes of models for generalized SEM
- Generalized linear models
- Item response theory models
- Measurement models with binary, count, and ordinal measurements
- Multilevel CFA models
- Multilevel mixed-effects models
- Latent growth curve models with generalized-linear responses
- Multilevel mediation models
- Latent class analysis (LCA)
- Selection models
  - with random intercepts and slopes
  - with binary, count, and ordinal outcomes
- Endogenous treatment-effect models
- Any multilevel structural equation models with generalized-linear responses
- Latent predictors of survival outcomes
- Path models, growth curve models, and more
- Weibull, exponential, lognormal, loglogistic, or gamma models
- Survival outcomes with other outcomes
Linear and generalized-linear responses
- Models for continuous, binary, count, ordinal, and nominal outcomes
- Thirteen distribution families
  - Gaussian
  - Bernoulli
  - Binomial
  - Poisson
  - Negative binomial
  - Ordinal
  - Multinomial
  - Beta
  - Exponential
  - Gamma
  - Lognormal
  - Loglogistic
  - Weibull
- Five links
  - Identity
  - Log
  - Logit
  - Probit
  - Cloglog
- Support for common regression models: linear, logistic, probit, ordered logit, ordered probit, Poisson, multinomial logistic, tobit, interval measurements, and more
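A family and link are chosen per equation in `gsem`. The sketch below uses the `auto` dataset that ships with Stata; treating `rep78` as a count is an assumption made purely to show the syntax, not a modeling recommendation.

```stata
sysuse auto, clear

* Logistic regression written as a generalized SEM path:
* Bernoulli family with a logit link
gsem (foreign <- mpg weight, family(bernoulli) link(logit))

* Redisplay the coefficients in exponentiated form (odds ratios here)
estat eform

* A different family and link for a nonnegative integer outcome
gsem (rep78 <- mpg, family(poisson) link(log))
```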
Multilevel models
- Two-, three-, and higher-level structural equation models
- Multilevel mixed-effects models
- Random intercepts and random slopes
- Crossed and nested random effects
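A two-level random-intercept model might look like the following sketch. The data are simulated, and the names `school`, `x`, and `y` as well as the effect sizes are hypothetical.

```stata
* Simulate 50 hypothetical schools with 20 students each
clear
set seed 2024
set obs 50
generate int school = _n
generate double u = rnormal(0, 0.7)           // school-level effect
expand 20
generate double x = rnormal()
generate byte y = runiform() < invlogit(-0.5 + 0.8*x + u)
drop u

* M1[school] is a latent variable that varies at the school level,
* that is, a random intercept; logit is shorthand for
* family(bernoulli) link(logit)
gsem (y <- x M1[school]), logit
```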
Estimation methods for linear SEM
- ML—maximum likelihood
- MLMV—maximum likelihood for missing values; sometimes called FIML
- ADF—asymptotic distribution free, meaning GMM (generalized method of moments) using ADF weighting matrix
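The estimation method is selected with the `method()` option. This sketch contrasts default ML with MLMV on the shipped `auto` dataset, in which `rep78` has some missing values; the regression itself is only a placeholder model.

```stata
sysuse auto, clear

* Default ML drops observations with any missing value in the model
sem (price <- mpg rep78)
display "Observations used by ML:   " e(N)

* MLMV keeps observations that are only partially observed,
* assuming joint normality and missingness at random
sem (price <- mpg rep78), method(mlmv)
display "Observations used by MLMV: " e(N)
```

Specifying `method(adf)` instead follows the same pattern.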
Estimation methods for generalized SEM
- Maximum likelihood
- Mean-variance or mode-curvature adaptive Gauss–Hermite quadrature
- Nonadaptive Gauss–Hermite quadrature
- Laplace approximation
Standard-error methods
- OIM—observed information matrix
- EIM—expected information matrix
- OPG—outer product of gradients
- Satorra–Bentler estimator
- Robust—distribution-free linearized estimator
- Cluster–robust—robust adjusting for correlation within groups of observations
- Bootstrap—nonparametric bootstrap and clustered bootstrap
- Jackknife—delete-one, delete-n, and clustered jackknife
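Standard-error methods are requested through `vce()`. The following is a minimal sketch on the shipped `auto` dataset; the model, the number of bootstrap replications, and the seed are arbitrary choices for illustration.

```stata
sysuse auto, clear

* Robust (linearized) standard errors
sem (mpg <- weight) (price <- mpg), vce(robust)

* Satorra-Bentler adjusted standard errors and scaled test statistic
sem (mpg <- weight) (price <- mpg), vce(sbentler)

* Nonparametric bootstrap with a small number of replications
sem (mpg <- weight) (price <- mpg), vce(bootstrap, reps(50) seed(1))
```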
Survey support for linear SEM and generalized SEM
- Sampling weights and stage-level weights
- Stratification and poststratification
- Clustered sampling at one or more levels
Postestimation Selector
- View and run all postestimation features for your command
- Automatically updated as estimation commands are run
Summary statistics data (SSD)
- Fit linear SEMs on observed or summary (SSD) data
- Fit models on covariances or correlations and optionally variances and means
- SSD may be group specific
- Easily create and manage SSDs
- Build SSDs from original (raw) data for distribution or publication
- Automatic corruption/error checking and repairing
- Electronic signatures
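Entering summary statistics by hand can be sketched as follows. The variable names, the number of observations, and the covariance values are made up for illustration.

```stata
clear all

* Declare the variables, then enter the sample size and the
* lower triangle of the covariance matrix, one row per backslash
ssd init x1 x2 x3
ssd set observations 200
ssd set covariances 1 \ .5 1 \ .4 .3 1

* Review what has been entered
ssd describe

* Fit a one-factor model directly on the summary statistics
sem (F -> x1 x2 x3)
```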
Starting values
- Automatic
- May specify for some or all parameters
- Grid search available
- May fit one model, subset or superset, and use fitted values for another model
Identification
- Automatic normalization (anchoring) constraints provide scale for latent variables; may be overridden
Reliability
- May specify fraction of variance not due to measurement error
Direct and indirect effects for linear SEM
- Confidence intervals
- Unstandardized or standardized units
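For a simple path model, direct, indirect, and total effects can be requested after estimation. This sketch uses the shipped `auto` dataset, and the choice of `mpg` as an intermediate variable between `weight` and `price` is only an illustrative assumption.

```stata
sysuse auto, clear

* weight affects price directly and indirectly through mpg
sem (mpg <- weight) (price <- mpg weight)

* Direct, indirect, and total effects with standard errors
* and confidence intervals
estat teffects

* The same decomposition, adding effects in standardized units
estat teffects, standardized
```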
Overall goodness-of-fit statistics for linear SEM
- Model vs. saturated
- Baseline vs. saturated
- RMSEA, root mean squared error of approximation
- AIC, Akaike's information criterion
- BIC, Bayesian information criterion
- CFI, comparative fit index
- TLI, Tucker–Lewis index, a.k.a. nonnormed fit index
- SRMR, standardized root mean squared residual
- CD, coefficient of determination
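The overall fit statistics listed above are reported by `estat gof`. The model in this sketch deliberately omits the direct path from `weight` to `price` so that it has one degree of freedom to test; it is a toy example on the shipped `auto` dataset.

```stata
sysuse auto, clear

* Overidentified path model: no direct path weight -> price
sem (mpg <- weight) (price <- mpg)

* Chi-squared tests, RMSEA, AIC/BIC, CFI/TLI, SRMR, and CD
estat gof, stats(all)
```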
Equation-level goodness-of-fit statistics for linear SEM
- R-squared
- Equation-level variance decomposition
- Bentler–Raykov squared multiple-correlation coefficient
Group-level goodness-of-fit statistics for linear SEM
- SRMR
- CD
- Model vs. saturated chi-squared contribution
Residual analysis for linear SEM
- Mean residuals
- Variance and covariance residuals
- Raw, normalized, and standardized values available
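Residuals of the fitted means and covariances are available from `estat residuals`. A minimal sketch, again using a toy overidentified model on the shipped `auto` dataset:

```stata
sysuse auto, clear
sem (mpg <- weight) (price <- mpg)

* Raw residuals are shown by default; the options add
* normalized and standardized versions
estat residuals, normalized standardized
```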
Parameter tests
- Modification indices
- Wald tests
- Score tests
- Likelihood-ratio tests
- Easy to specify single or joint custom tests for omitted paths, included paths, and relaxing constraints
- Linear and nonlinear tests of estimated parameters
- Tests may be specified in standardized or unstandardized parameter units
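Several of these tests can be combined in one session. The sketch below, on the shipped `auto` dataset, checks whether the omitted path from `weight` to `price` should be added; the stored-result names `restricted` and `full` are arbitrary.

```stata
sysuse auto, clear

* Restricted model: no direct path weight -> price
sem (mpg <- weight) (price <- mpg)
estimates store restricted

* Modification indices for omitted paths and covariances
estat mindices

* Wald test of a single included path
test _b[price:mpg] = 0

* Likelihood-ratio test for adding the omitted path
sem (mpg <- weight) (price <- mpg weight)
estimates store full
lrtest full restricted
```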
Group-level parameter tests
- Group invariance by parameter class or user specified
Linear and nonlinear combinations of estimated parameters
- Confidence intervals
- Unstandardized or standardized units
Assess nonrecursive system stability
Predictions for linear SEM
- Observed endogenous variables
- Latent endogenous variables
- Latent variables (factor scores)
- Equation-level first derivatives
- In- and out-of-sample prediction; may estimate on one sample and form predictions in another
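Predictions after `sem` are obtained with `predict`. This sketch simulates hypothetical indicators `x1` to `x4` of a latent variable named `Trait`; the new variable names `f_hat` and `xhat` are likewise arbitrary.

```stata
* Simulate four hypothetical indicators of one latent trait
clear
set seed 6789
set obs 500
generate double score = rnormal()
forvalues i = 1/4 {
    generate double x`i' = 0.8*score + rnormal()
}
drop score

sem (Trait -> x1 x2 x3 x4)

* Factor scores for the latent variable
predict f_hat, latent(Trait)

* Linear predictions of the observed endogenous variables
predict xhat*, xb

summarize f_hat xhat*
```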
Predictions for generalized SEM
- Means of observed endogenous variables—probabilities for 0/1 outcomes, mean counts, etc.
- Linear predictions of observed endogenous variables
- Latent variables using empirical Bayes means and modes
- Standard errors of empirical Bayes means and modes
- Observed endogenous variables with and without predictions of latent variables
- Density function
- Distribution function
- Survivor function
- Predict observed endogenous variables marginally with respect to latent variables
- User-defined nonlinear predictions
Results
- May be used with postestimation features
- May be saved to disk for restoration and use later
- Displayed in standardized or unstandardized units
- Optionally display results in Bentler–Weeks form
- Optionally display results in exponentiated form as odds ratios, incidence rate ratios, and relative risk ratios
- All results accessible for community-contributed programs
Factor variables with generalized SEM
- Automatically create indicators based on categorical variables
- Include polynomial terms
- Perform contrasts of categories/levels
Marginal analysis
- Estimated marginal means
- Marginal and partial effects
- Average marginal and partial effects
- Least-squares means
- Predictive margins
- Adjusted predictions, means, and effects
- Works with multiple outcomes simultaneously
- Integrates over latent variables
- Contrasts of margins
- Pairwise comparisons of margins
- Profile plots
- Graphs of margins and marginal effects
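Marginal analysis after `gsem` uses the standard `margins` command. A minimal sketch on the shipped `auto` dataset, with an arbitrary grid of `mpg` values:

```stata
sysuse auto, clear
gsem (foreign <- mpg weight, family(bernoulli) link(logit))

* Average marginal effects on the predicted probability
margins, dydx(mpg weight)

* Predictive margins over a grid of mpg values, then a profile plot
margins, at(mpg = (15(5)35))
marginsplot
```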
Contrasts for generalized SEM
- Analysis of main effects, simple effects, interaction effects, partial interaction effects, and nested effects
- Comparisons against reference groups, of adjacent levels, or against the grand mean
- Orthogonal polynomials
- Helmert contrasts
- Custom contrasts
- ANOVA-style tests
- Contrasts of nonlinear responses
- Multiple-comparison adjustments
- Balanced and unbalanced data
- Contrasts of means, intercepts, and slopes
- Graphs of contrasts
- Interaction plots
Pairwise comparisons for generalized SEM
- Compare estimated means, intercepts, and slopes
- Compare marginal means, intercepts, and slopes
- Balanced and unbalanced data
- Nonlinear responses
- Multiple-comparison adjustments: Bonferroni, Šidák, Scheffé, Tukey HSD, Duncan, and Student–Newman–Keuls adjustments
- Group comparisons that are significant
- Graphs of pairwise comparisons
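With a factor variable in a `gsem` equation, contrasts and pairwise comparisons can be sketched as follows. The model on the shipped `auto` dataset is a toy example, and Bonferroni is just one of the available adjustments.

```stata
sysuse auto, clear

* Gaussian family with identity link is the gsem default;
* i.rep78 creates indicators for the categories of rep78
gsem (mpg <- i.rep78 weight)

* Joint test of the rep78 main effect
contrast rep78

* All pairwise comparisons with a Bonferroni adjustment
pwcompare rep78, effects mcompare(bonferroni)
```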
Explore more about SEM in Stata.
Additional resources
- Structural Equation Modeling Reference Manual
- Discovering Structural Equation Modeling Using Stata, Revised Edition by Alan C. Acock
- In the spotlight: SEM for economists (and others who think they don't care)
- In the spotlight: Path diagram for multinomial logit with random effects
- In the spotlight: Meet Stata's new xtmlogit
- Structural equation modeling using Stata training course
- Structural equation modeling (SEM) flyer
See New in Stata 18 to learn about what was added in Stata 18.