R lavaan

lavaan is a framework for fitting structural equation models (SEMs).


Installation

install.packages('lavaan')


Example

library(lavaan)
library(lavaanPlot)  # separate package that provides lavaanPlot()

"
Y =~ y1 + y2 + y3
X =~ x1 + x2 + x3
Z =~ z1 + z2 + z3
Y ~ X + Z
" |>
  sem(data = df, estimator = "ML", std.lv = TRUE) |>
  lavaanPlot(coefs = TRUE)

Note that in the output, a variance whose term is prefixed by a dot (like .Y) is a residual variance.
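
The example above assumes a data frame df with columns y1 through z3. As a minimal runnable sketch of the same pattern, the following assumes the HolzingerSwineford1939 dataset bundled with lavaan as a stand-in for df; the latent names visual, textual, and speed are illustrative only.

```r
library(lavaan)

# Stand-in data: HolzingerSwineford1939 ships with lavaan (columns x1..x9).
mod <- "
visual  =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed   =~ x7 + x8 + x9
speed ~ visual + textual
"

fit <- sem(mod, data = HolzingerSwineford1939, estimator = "ML", std.lv = TRUE)

# In the Variances section of the summary, endogenous terms are printed
# with a leading dot; those rows are residual variances.
summary(fit)
```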

Model Specification

A model is specified as a string in a domain-specific syntax.

mod <- '
# Measurement paths
X =~ x1 + x2 + x3
Y =~ y1 + y2 + y3
Z =~ z1 + z2 + z3

# Regression paths
X ~ Y

# Declare covariance structure
x1 ~~ x2 + x3            #shorthand for `x1 ~~ x2` and `x1 ~~ x3`
x2 ~~ x3

# Constrain coefficient
Z ~ 1*Y

# Constrain intercept
Y ~ 0

# Constrain covariance
X ~~ 0*x1 + 0*x2 + 0*x3  #shorthand for `X ~~ 0*x1`, `X ~~ 0*x2`, and `X ~~ 0*x3`
'

Paths, like regressions in R, are indicated with a tilde (~).

Measurement paths use the compound =~ symbol, and should be read as 'X is measured by x1, x2, and x3'. All latent variables must be declared like this, with the name of the variable on the left-hand side. Latent variables that appear only on the right-hand side (i.e., strictly exogenous latent variables with no =~ declaration of their own) are not supported.

Covariances are indicated with a double tilde (~~). A variance is written as the covariance of a variable with itself, as in x1 ~~ x1.
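
To see how the three operators end up in a fitted model, here is a small sketch. It assumes the HolzingerSwineford1939 dataset bundled with lavaan, and the latent names are made up for illustration.

```r
library(lavaan)

mod <- "
# Measurement paths
visual  =~ x1 + x2 + x3
textual =~ x4 + x5 + x6

# Regression path
textual ~ visual

# Residual covariance between two indicators
x1 ~~ x2

# Variance of an indicator, written as a covariance with itself
x3 ~~ x3
"

fit <- sem(mod, data = HolzingerSwineford1939)

# The parameter table records the operator of each path in the `op` column.
est <- parameterEstimates(fit)
table(est$op)
```

Filtering est on the op column is a convenient way to pull out only the loadings, only the regressions, or only the (co)variances.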

Estimators

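The estimator is chosen with the estimator option. Available estimators include "ML" (maximum likelihood, the default), "GLS" (generalized least squares), "WLS" (weighted least squares), "DWLS" (diagonally weighted least squares), and "ULS" (unweighted least squares), along with robust variants such as "MLM", "MLR", and "WLSMV".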

To fit a model on incomplete data, pass the missing="ML" option, which uses full information maximum likelihood rather than dropping incomplete rows. Note that "FIML" is an alias for "ML", and some documents prefer that naming.
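
A short sketch of the difference this option makes, assuming the bundled HolzingerSwineford1939 dataset with some values of x1 deliberately set to NA (the missingness here is artificial and only for illustration):

```r
library(lavaan)

# Assumption: knock out 30 values to mimic incomplete data.
dat <- HolzingerSwineford1939
set.seed(1)
dat$x1[sample(nrow(dat), 30)] <- NA

mod <- "
visual  =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
"

fit_listwise <- cfa(mod, data = dat)                  # default: incomplete rows dropped
fit_fiml     <- cfa(mod, data = dat, missing = "ML")  # incomplete rows retained

# Number of observations each fit actually used
lavInspect(fit_listwise, "nobs")
lavInspect(fit_fiml, "nobs")
```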

Constraints

To constrain a loading/coefficient in a path, prefix a variable on the right-hand side with a constant and an asterisk (*), as in Z ~ 1*Y. (Co)variances are constrained in the same way, as in Y ~~ 1*Y.

There is a shorthand for declaring all latent variables to be orthogonal (uncorrelated): pass orthogonal = TRUE to sem().
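
Both kinds of constraint can be checked on a fitted model. The sketch below assumes the bundled HolzingerSwineford1939 dataset and illustrative latent names; it is not meant as a substantively sensible model.

```r
library(lavaan)

# Fix a regression coefficient to 1 by prefixing the predictor with 1*.
mod_fixed <- "
visual  =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
textual ~ 1*visual
"
fit_fixed <- sem(mod_fixed, data = HolzingerSwineford1939)
est <- parameterEstimates(fit_fixed)
est[est$op == "~", c("lhs", "op", "rhs", "est")]

# Force all latent variables to be uncorrelated.
mod_cfa <- "
visual  =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed   =~ x7 + x8 + x9
"
fit_orth <- sem(mod_cfa, data = HolzingerSwineford1939, orthogonal = TRUE)

# Model-implied correlations among the latent variables
lavInspect(fit_orth, "cor.lv")
```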

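Intercepts are specified with a syntax similar to path declarations, using the constant 1 as the predictor: Y ~ 1 estimates the intercept of Y, and Y ~ 0*1 fixes it to zero. The concept is that the coefficient on a constant is the expected value of the variable when every other predictor is zero.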

Default Behaviors

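Regarding (co)variances, note the following default behaviors of sem(): residual variances are estimated freely, exogenous latent variables are allowed to covary, and the scale of each latent variable is set by fixing the loading of its first indicator to 1 unless std.lv = TRUE is passed, in which case its variance is fixed to 1 instead.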

By default, intercepts are not estimated. To force estimation, declare an intercept without a constraint or use the meanstructure=TRUE option. Note, however, that the missing="ML" option flips this default, because means are used in the incomplete data procedure.
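
One way to confirm these defaults is to count intercept rows in the parameter table, where they carry the ~1 operator. This is a sketch assuming the bundled HolzingerSwineford1939 dataset and illustrative latent names.

```r
library(lavaan)

mod <- "
visual  =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
"
dat <- HolzingerSwineford1939

fit_default <- cfa(mod, data = dat)
fit_means   <- cfa(mod, data = dat, meanstructure = TRUE)
fit_fiml    <- cfa(mod, data = dat, missing = "ML")

# Count the intercept rows (operator `~1`) in each parameter table.
count_intercepts <- function(fit) sum(parameterEstimates(fit)$op == "~1")

count_intercepts(fit_default)
count_intercepts(fit_means)
count_intercepts(fit_fiml)
```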


Tips

Growth Curves

Specify a growth curve model by fixing the loadings of a latent intercept and a latent slope on the repeated measures:

mod <- '
intercept =~ 1*t1 + 1*t2 + 1*t3 + 1*t4
slope     =~ 0*t1 + 1*t2 + 2*t3 + 3*t4
'
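
A runnable sketch of fitting such a model, assuming the Demo.growth dataset bundled with lavaan (repeated measures t1 through t4) and lavaan's growth() wrapper. The latent definitions use the =~ operator, and the short names i and s stand for the intercept and slope factors.

```r
library(lavaan)

mod <- "
i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4
s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4
"

# growth() estimates the latent means and fixes the observed intercepts to zero.
fit <- growth(mod, data = Demo.growth)

est <- parameterEstimates(fit)

# Estimated mean starting level (i) and mean change per time step (s)
est[est$op == "~1" & est$lhs %in% c("i", "s"), c("lhs", "est")]
```

The slope loadings 0, 1, 2, 3 encode equally spaced measurement occasions; unequal spacing would use different constants.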

Multilevel

There is limited support for multilevel models. Try:

mod <- '
# Within cluster
level: 1
       x ~ x1 + x2 + x3 + Y

# Across clusters
level: 2
       Y =~ y1 + y2 + y3
'

mod.fit <- sem(model = mod, data = df, estimator = "ML", cluster = "clusterid")
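
As a hedged, runnable sketch of a two-level model, the following assumes the Demo.twolevel dataset bundled with lavaan, whose cluster identifier column is named cluster. The factor names fw and fb are illustrative, and the structure follows the two-level example in the lavaan tutorial rather than the model above.

```r
library(lavaan)

mod <- "
# Within cluster
level: 1
    fw =~ y1 + y2 + y3
    fw ~ x1 + x2 + x3

# Between clusters
level: 2
    fb =~ y1 + y2 + y3
    fb ~ w1 + w2
"

fit <- sem(model = mod, data = Demo.twolevel, cluster = "cluster")
summary(fit)
```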



R/Lavaan (last edited 2025-12-18 17:02:24 by DominicRicottone)