Differences between revisions 1 and 4 (spanning 3 versions)

R lavaan

lavaan is an R package for fitting structural equation models (SEM).

Installation

install.packages('lavaan')

Example

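library(lavaan)      # provides sem()
library(lavaanPlot)  # provides lavaanPlot()

# df is assumed to be a data frame with columns y1..y3, x1..x3, and z1..z3
"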
Y =~ y1 + y2 + y3
X =~ x1 + x2 + x3
Z =~ z1 + z2 + z3
Y ~ X + Z
" |>
  sem(data = df, estimator = "ML", std.lv = TRUE) |>
  lavaanPlot(coefs = TRUE)

Note that a variance reported for a term prefixed by a dot (like .Y) is a residual (error) variance.
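For a variant that runs without a user-supplied df, the sketch below swaps in the HolzingerSwineford1939 dataset bundled with lavaan. That dataset and the names visual, textual, and speed are assumptions standing in for X, Z, and Y above. It prints a text summary instead of a plot, so the dot-prefixed residual variances are visible.

```r
library(lavaan)

# Stand-in model on the bundled HolzingerSwineford1939 data (an assumption;
# visual/textual/speed play the roles of X/Z/Y in the example above).
model <- "
visual  =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed   =~ x7 + x8 + x9
speed ~ visual + textual
"

fit <- sem(model, data = HolzingerSwineford1939, estimator = "ML", std.lv = TRUE)

# Text summary; in the Variances block, dot-prefixed names are residual variances.
summary(fit)

# The same estimates as a data frame, keeping only the (co)variance rows.
est <- parameterEstimates(fit)
est[est$op == "~~", c("lhs", "op", "rhs", "est")]
```

With std.lv = TRUE the (residual) variances of the latent variables are fixed to 1, so the row for speed ~~ speed should show an estimate of 1 with no standard error.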

Model Specification

A model is specified as a string written in lavaan's own model syntax.

mod <- '
# Measurement paths
X =~ x1 + x2 + x3
Y =~ y1 + y2 + y3
Z =~ z1 + z2 + z3

# Regression paths
X ~ Y

# Constrain coefficients
Z ~ 1*Y

# Constrain intercepts
Y ~ 0

# Declare (co)variances
X ~~ X
x1 ~~ x2 + x3            #shorthand for `x1 ~~ x2` and `x1 ~~ x3`
x2 ~~ x3

# Constrain covariances
X ~~ 0*x1 + 0*x2 + 0*x3  #shorthand for `X ~~ 0*x1`, `X ~~ 0*x2`, and `X ~~ 0*x3`

# Constrain variances
Y ~~ 1*Y
'

Paths, like regressions in R, are indicated with a tilde (~).

Measurement paths use the compound =~ operator, which should be read as ' X is measured by x1, x2, and x3 '. Every latent variable must be declared this way, with the name of the variable on the left-hand side. A latent variable that appears only on right-hand sides (i.e., a strictly exogenous latent variable) is not supported.

Covariances are indicated with a double tilde (~~). A variance is declared as the covariance of a variable with itself, as in X ~~ X.
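To check how a model string is read, one option is to parse it with lavaanify(), which needs no data. The sketch below uses a small made-up model and lists the operator recorded for each declared parameter.

```r
library(lavaan)

# A small illustrative model (variable names are hypothetical).
model <- "
X =~ x1 + x2 + x3   # measurement paths
Y =~ y1 + y2 + y3
Y ~ X               # regression path
x1 ~~ x2            # covariance
X ~~ X              # variance
"

# lavaanify() turns model syntax into a parameter table, one row per parameter.
tab <- lavaanify(model)
tab[, c("lhs", "op", "rhs")]

# Count the declared parameters by operator.
table(tab$op)
```

Each measurement path becomes one row per indicator, so the two =~ lines above produce six rows.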

Constraints

To constrain a loading/coefficient in a path, insert a constant and an asterisk (*) next to a variable on the right-hand side, as in Z ~ 1*Y. (Co)variances are constrained in the same way, as in Y ~~ 1*Y.
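A quick way to confirm that a constraint was understood is to look at the parsed parameter table. In this sketch (variable names are illustrative), fixed parameters are the rows where free is 0, and ustart holds the fixed value.

```r
library(lavaan)

model <- "
Y =~ y1 + y2 + y3
Z =~ z1 + z2 + z3
Z ~ 1*Y     # coefficient fixed to 1
Y ~~ 1*Y    # variance fixed to 1
"

# Parse only; no data is required to inspect the constraints.
tab <- lavaanify(model)

# free == 0 marks a fixed parameter; ustart is the value it is fixed to.
tab[tab$free == 0, c("lhs", "op", "rhs", "free", "ustart")]
```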

There is a shorthand for declaring all latent variables to be orthogonal/independent: try sem(orthogonal = TRUE).
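As a sketch of the orthogonal shorthand, the following fits a three-factor measurement model to the HolzingerSwineford1939 data bundled with lavaan (the dataset and factor names are assumptions, not part of the examples above) and then lists the covariances among the latent variables.

```r
library(lavaan)

model <- "
visual  =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed   =~ x7 + x8 + x9
"

# orthogonal = TRUE fixes every covariance among latent variables to zero.
fit <- sem(model, data = HolzingerSwineford1939, orthogonal = TRUE)

est <- parameterEstimates(fit)
latents <- c("visual", "textual", "speed")

# Covariance rows between two different latent variables.
lv_cov <- est[est$op == "~~" & est$lhs %in% latents & est$lhs != est$rhs,
              c("lhs", "op", "rhs", "est")]
lv_cov
```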

Intercepts are constrained with a syntax similar to path declarations, as in Y ~ 0. The idea is that regressing a variable on a constant gives its expected value.

Default Behaviors

Regarding (co)variances, note the following default behaviors:

All latent variable (co)variances are allowed to vary freely, but are not necessarily estimated. To force estimation of one, declare it without a constraint.
All observed variable (co)variances are calculated from the data and then constrained to those values.
When a (co)variance declaration or constraint is made on an outcome (endogenous) variable, it is automatically interpreted as applying to the residual (co)variance.
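One way to see which parameters lavaan added on its own is the user column of the fitted parameter table. The sketch below assumes the bundled HolzingerSwineford1939 data and illustrative factor names. Rows with user equal to 0 were not written in the model string.

```r
library(lavaan)

model <- "
visual  =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
textual ~ visual
"

fit <- sem(model, data = HolzingerSwineford1939)

# parTable() returns the full parameter table of the fitted model.
pt <- parTable(fit)

# user == 1: declared in the model string; user == 0: added by default.
pt[pt$user == 0, c("lhs", "op", "rhs", "free")]
```

Because textual is an outcome here, the textual ~~ textual row in this listing is its residual variance.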

Tips

Growth Curves

Specify the model like:

mod <- '
intercept =~ 1*t1 + 1*t2 + 1*t3 + 1*t4
slope     =~ 0*t1 + 1*t2 + 2*t3 + 3*t4
'
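A minimal sketch of fitting such a model, assuming the Demo.growth dataset bundled with lavaan (which has columns t1 to t4) and the growth() wrapper. The short names i and s stand for the intercept and slope factors, and both are declared with the =~ operator.

```r
library(lavaan)

# i = latent intercept, s = latent slope (short names chosen for this sketch).
mod <- "
i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4
s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4
"

# growth() fits the model with the latent means estimated.
fit <- growth(mod, data = Demo.growth)

summary(fit)
```

The fixed loadings 0, 1, 2, 3 on the slope factor encode equally spaced time points.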

Multilevel

There is limited support for multilevel models. Try:

mod <- '
# Within cluster
level: 1
       x ~ x1 + x2 + x3 + Y

# Across clusters
level: 2
       Y =~ y1 + y2 + y3
'

mod.fit <- sem(model = mod, data = df, estimator = "ML", cluster = "clusterid")
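For a two-level model that can be run as is, the sketch below assumes the Demo.twolevel dataset bundled with lavaan, whose cluster identifier is the column named cluster. The factor names fw (within) and fb (between) are illustrative.

```r
library(lavaan)

mod <- "
# Within cluster
level: 1
    fw =~ y1 + y2 + y3
    fw ~ x1 + x2 + x3

# Across clusters
level: 2
    fb =~ y1 + y2 + y3
    fb ~ w1 + w2
"

# cluster names the column in the data that identifies each cluster.
mod.fit <- sem(model = mod, data = Demo.twolevel, cluster = "cluster")

summary(mod.fit)
```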

CategoryRicottone

- Revision 1 as of 2025-04-03 21:11:27
  Size: 1262
  Editor: DominicRicottone
  Comment: Initial commit
+ Revision 4 as of 2025-11-03 01:36:30
  Size: 3101
  Editor: DominicRicottone
  Comment: Link
 Line 3:
-'''lavann''' is a [[Statistics/StructuralEquationModeling|SEM]] fitting software.
+'''lavaan''' is a framework for fitting a [[Statistics/StructuralEquationModeling|SEM]].
 Line 21:
-== Usage ==
+== Example ==
 Line 30:
-  sem(data = data, std.lv = TRUE) |>
+  sem(data = df, estimator = "ML", std.lv = TRUE) |>
 Line 34:
-This displays:
 * chi-squared test statistic for the model
 * regression coefficients for the measurement model(s), including Z statistics, under the '''Latent Variables''' header.
 * regression coefficients for the structural model, including Z statistics, under the '''Regression''' header
 * covariances among latent variables, including Z statistics
 * variances of observed variables, including Z statistics
   * Note that certain variances are forced to be 1 by assumption; in this case the variances of latent variables (i.e., `X` and `Z`) and the variance of the outcome variable's errors  (i.e. `.Y`; the leading dot indicates a variance).
+Note that [[Statistics/Variance|variances]] of a term prefixed by a dot (like `.Y`) are [[Statistics/Residuals|error residuals]].
-Line 42:
+Line 36:
-lavaan estimates latent variances whereas [[Stata/Gsem|gsem]] fits data to a model using maximum likelihood. While regression coefficients can be comparable, the methods are fundamentally not.
+=== Model Specification ===

A model must be specified in a custom syntax.

{{{
mod <- '
# Measurement paths
X =~ x1 + x2 + x3
Y =~ y1 + y2 + y3
Z =~ z1 + z2 + z3

# Regression paths
X ~ Y

# Constrain coefficients
Z ~ 1*Y

# Constrain intercepts
Y ~ 0

# Declare (co)variances
X ~~ X
x1 ~~ x2 + x3            #shorthand for `x1 ~~ x2` and `x1 ~~ x3`
x2 ~~ x3

# Constrain covariances
X ~~ 0*x1 + 0*x2 + 0*x3  #shorthand for `X ~~ 0*x1`, `X ~~ 0*x2`, and `X ~~ 0*x3`

# Constrain variances
Y ~~ 1*Y
'
}}}

Paths, like regressions in R, are indicated with a tilde (`~`).

Measurement paths use the compound `=~` symbol, and should be read as ' ''X'' is measured by ''x1'', ''x2'', and ''x3'' '. All latent variables must be declared like this, with the name of the variable on the left-hand side. Right-hand side only (i.e., strictly exogenous latent variables) are not supported.

Covariances are indicated with double tilde (`~~`). A variance is indicated by covariance of a variable with itself.



=== Constraints ===

To constrain a loading/coefficient in a path, insert a constant and an asterisk (`*`) next to a variable on the right-hand side, as in `Z ~ 1*Y`. (Co)variances are constrained in the same way, as in `Y ~~ 1*Y`.

There is a shorthand for declaring all latent variables to be orthogonal/independent: try `sem(orthogonal = TRUE)`.

Intercepts are constrained using a syntax similar to path declarations. The concept is that the constant measurement of a variable is its expected value.



=== Default Behaviors ===

Regarding (co)variances, note the following default behaviors:
 * All latent variable (co)variances are allowed to vary freely, but are not necessarily estimated. To force estimation, declare it without a constraint.
 * All observed variable (co)variances are calculated and then constrained as given.
 * When a (co)variance declaration or constraint is made on an outcome/endogenous variable, it is automatically interpreted as the residual variance.

----



== Tips ==

=== Growth Curves ===

Specify the model like:

{{{
mod <- '
intercept =~ 1*t1 + 1*t2 + 1*t3 + 1*t4
slope     =~ 0*t1 + 1*t2 + 2*t3 + 3*t4
'
}}}



=== Multilevel ===

There is limited support for multilevel models. Try:

{{{
mod <- '
# Within cluster
level: 1
       x ~ x1 + x2 + x3 + Y

# Across clusters
level: 2
       Y =~ y1 + y2 + y3
'

mod.fit <- sem(model = mod, data = df, estimator = "ML", cluster = "clusterid")
}}}

Diff for "R/Lavaan"