Stata SEM Covariance Structure
Covariance structures are a major component of fitting an SEM. -sem- and -gsem- differ in several subtle ways in how they handle them.
Default and Implicit Behaviors
The variances of and covariances among observed exogenous variables are calculated from the data. With -sem-, these (co)variances can be constrained; -gsem-, however, disallows this.
The variances of and covariances among latent exogenous variables are freely estimated. They can be constrained, although -gsem- disallows setting variances to 0. (Setting covariances to 0 is still supported.) Note, however, that the interactive builder in -sem- mode has a different default: covariances between latent exogenous variables are 0.
Errors are generally counted as latent exogenous variables and largely follow the behavior above. The important difference is that errors are assumed to be independent, i.e. covariances between errors are 0 by default. In addition, -gsem- can neither freely estimate nor constrain the covariance of an error whose term corresponds to a generalized response with family Gaussian and link log, or with family Gaussian and link identity with censoring.
Lastly, covariances between observed and latent exogenous variables are handled differently by -sem- and -gsem-. The former freely estimates them and allows constraints. The latter does not estimate them at all. Note once again that the interactive builder in -sem- mode has a different default: covariances between observed and latent exogenous variables are 0.
There is a Statalist thread suggesting that, in fact, -gsem- assumes all exogenous variables to be independent if there is a mixture of observed and latent exogenous variables.
Fixing Parameters
To specify that two variables have non-zero covariance, use the cov option. This is mainly useful for error variables, since the default behavior is to assume independence.
... (x1 <- X) (x2 <- X) (x3 <- X), cov(e.x1*e.x2)
The option can be repeated to indicate that other variables also covary. For example, add cov(e.x2*e.x3) to the above.
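As a minimal sketch of the error-covariance syntax above, the following simulates four indicators of one factor, with x1 and x2 sharing an extra disturbance, and then frees the covariance between their errors. The simulated data, the seed, and the helper variables f and u are assumptions made for illustration; only the form of the -sem- call comes from the text.

```stata
* Simulated data (an assumption for illustration, not from the text)
clear
set seed 12345
set obs 500
* f is the common factor; u is a disturbance shared only by x1 and x2
generate double f = rnormal()
generate double u = rnormal()
generate double x1 = f + 0.5*u + rnormal()
generate double x2 = 0.8*f + 0.5*u + rnormal()
generate double x3 = 0.7*f + rnormal()
generate double x4 = 0.9*f + rnormal()

* Errors are independent by default, so the covariance
* between e.x1 and e.x2 has to be requested explicitly
sem (x1 <- X) (x2 <- X) (x3 <- X) (x4 <- X), cov(e.x1*e.x2)
```

A fourth indicator is used here because a one-factor model with only three indicators is just-identified and has no degrees of freedom to spare for an error covariance.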
On the other hand, to specify that two variables have zero covariance, try:
... cov(Y*X@0)
-gsem- disallows constraining the covariances among observed exogenous variables, as well as the covariances between observed and latent exogenous variables.
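The zero-covariance constraint can be compared against the -sem- default with a sketch like the one below. It assumes simulated data with two independent factors; the seed and the variable names fx, fy, x1-x3, and y1-y3 are hypothetical and chosen only for illustration.

```stata
* Simulated data with two independent factors (an assumption for illustration)
clear
set seed 2024
set obs 500
generate double fx = rnormal()
generate double fy = rnormal()
forvalues i = 1/3 {
    generate double x`i' = fx + rnormal()
    generate double y`i' = fy + rnormal()
}

* Default in -sem-: the covariance between latent exogenous X and Y is estimated
sem (x1 x2 x3 <- X) (y1 y2 y3 <- Y)
estimates store free

* Same model with the covariance between X and Y fixed at 0
sem (x1 x2 x3 <- X) (y1 y2 y3 <- Y), cov(X*Y@0)
estimates store fixed

* Likelihood-ratio test of the single constraint
lrtest free fixed
```

The two fits differ by exactly one parameter, so the likelihood-ratio test has one degree of freedom.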
Structure
Sometimes the simplest way to declare covariances is with a single systematic statement. The alternative covstructure option takes two arguments: a variable list and a structure name.
Valid structure names are:
- unstructured = all variances and covariances unrestricted
- diagonal = variances unrestricted, zero covariance
- exchangeable = all variances equal, all covariances equal
- identity = equal variance, zero covariance
- zero = zero (co)variance
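For three variables, the structures listed above correspond to the following covariance matrix patterns. The symbols are generic notation introduced here for exposition, not Stata output: distinct subscripts mark free parameters, and repeated symbols mark parameters constrained to be equal.

```latex
\begin{align*}
\text{unstructured:}\quad
  &\begin{pmatrix}
     \sigma_{11} & \sigma_{21} & \sigma_{31} \\
     \sigma_{21} & \sigma_{22} & \sigma_{32} \\
     \sigma_{31} & \sigma_{32} & \sigma_{33}
   \end{pmatrix}
&\text{diagonal:}\quad
  &\begin{pmatrix}
     \sigma_{11} & 0 & 0 \\
     0 & \sigma_{22} & 0 \\
     0 & 0 & \sigma_{33}
   \end{pmatrix} \\[1ex]
\text{exchangeable:}\quad
  &\begin{pmatrix}
     \sigma^2 & \tau & \tau \\
     \tau & \sigma^2 & \tau \\
     \tau & \tau & \sigma^2
   \end{pmatrix}
&\text{identity:}\quad
  &\begin{pmatrix}
     \sigma^2 & 0 & 0 \\
     0 & \sigma^2 & 0 \\
     0 & 0 & \sigma^2
   \end{pmatrix} \\[1ex]
\text{zero:}\quad
  &\begin{pmatrix}
     0 & 0 & 0 \\
     0 & 0 & 0 \\
     0 & 0 & 0
   \end{pmatrix}
\end{align*}
```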
A few keywords can be used in place of a variable list in this option:
- _Ex = all exogenous variables (disallowed in -gsem-)
- _OEx = all observed exogenous variables (disallowed in -gsem-)
- _LEx = all latent exogenous variables
- e._En = all error variables
- e._OEn = all error variables associated with observed endogenous variables
- e._LEn = all error variables associated with latent endogenous variables
..., covstructure(_OEx, unstructured)
This option, like cov, can be repeated.
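A sketch of the option used twice in one -sem- call follows. It assumes simulated data in which the two factors are independent and all error variances are equal, so both structures are plausible; the seed and all variable names are hypothetical.

```stata
* Simulated data (an assumption for illustration, not from the text)
clear
set seed 777
set obs 500
generate double fx = rnormal()
generate double fy = rnormal()
forvalues i = 1/3 {
    generate double x`i' = fx + rnormal()
    generate double y`i' = fy + rnormal()
}

* First covstructure(): latent exogenous variables get free variances
* and zero covariances (diagonal)
* Second covstructure(): errors of the observed endogenous variables get
* one common variance and zero covariances (identity)
sem (x1 x2 x3 <- X) (y1 y2 y3 <- Y), ///
    covstructure(_LEx, diagonal) covstructure(e._OEn, identity)
```

With only two latent exogenous variables, covstructure(_LEx, diagonal) imposes the same restriction as cov(X*Y@0); the keyword form becomes more economical as the number of variables grows.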
