Gsem: Multiple mediation model with categorical mediators

Eva Boonaert

Join Date: Oct 2019

Posts: 26
#1

Gsem: Multiple mediation model with categorical mediators

24 Feb 2023, 07:10

Dear Statalist members

I would like to specify a multiple mediation model using "gsem" with one contintuous outcome variable (y: income), two binary mediators (m1 and m2: fertilizers and pesticides) and one continuous treatment variable (x: certification). The aim is to analyze the relative importance of both mediators by comparing their standardized indirect effect.

Since both mediators are highly correlated, I would like to allow the two mediators to covary in the mediation model. However, it does not seem possible to covary two binary mediators in "gsem":

Code:

*************************** Generate example dataset ***************************** cap log close clear clear matrix set more off set obs 5000 *generate certification variable gen x = 5 * uniform() *generate idiosyncratic error term gen e1 = invnorm(uniform()) sum e1 replace e1 = (e1 - r(mean)) / r(sd) //standard normal distribution gen e2 = invnorm(uniform()) sum e2 replace e2 = (e2 - r(mean)) / r(sd) //standard normal distribution gen e3 = invnorm(uniform()) sum e3 replace e3 = e3 - r(mean) //normal distribution *generate mediator 1 (fertilizers) gen m1 = -2 + 2*x + e1 replace m1 = m1 > 0 tab m1 *generate mediator 2 (pesticides) gen m2 = -2 + 3*x + e2 replace m2 = m2 > 0 tab m2 *generate income variable gen y = 1*x + 2*m1 + 3*m2 + e3 *"true" average marginal effect gen trx1 = normalden(-2 + 2*x) * 2 sum trx1, d //0,20 gen trx2 = normalden(-2 + 3*x) * 3 sum trx2, d //0,19 ***************************** Mediation analysis ********************************* *estimation via "sem" sem (y <- x m1 m2) /// (m1 <- x) /// (m2 <- x), cov((e.m1)*(e.m2)) method(ml) covstructure(_Ex,diag) *estimation via "gsem" gsem (y <- x m1 m2) /// (m1 <- x, family(bernoulli) link(probit)) /// (m2 <- x, family(bernoulli) link(probit)), cov((e.m1)*(e.m2)) method(ml) margins, dydx(*) nose

The "sem" command allows correlation between the mediators, but is invalid since we need nonlinear models for the paths between the treatment and the binary mediators. The related "gsem" command gets the error: "invalid covariance specification; e.m1 does not identify an error for an uncensored gaussian response with the identity link". As suggested in this post, this is probably because the nonlinear models for the two binary mediators inherently have no error term (logit(p) = XB whereas for OLS: y = XB + e)).

I have two questions:

1) Does anyone have a suggestion on how to allow for correlation between binary mediators in "gsem" in Stata? I have tried different solutions. First, I tried to make the mediators covary by adding common latent variables, as suggested in this post in another context. The approach seems to work with a simple dataset like this, but in my own case (N>50,000) it took several days and did not converge.

Code:

************************** Latent variable approach ****************************** sem (y <- x m1 m2) /// (m1 <- x) /// (m2 <- x), cov((e.m1)*(e.m2)) method(ml) covstructure(_Ex,diag) sem (y <- x m1 m2) /// (m1 <- x A@1) /// (m2 <- x A@1), method(ml) covstructure(_Ex,diag)

Second, I was thinking to use a bayesmh estimation in Stata, but I can't figure out how to specify this for a fixed effects panel data estimator in my own case of N>50,000. Any suggestions for this?
One last solution I tried was to use the "lavaan" package in R which is able to let the binary mediators covary, although the warnings indicate that the model is probably not identified. Anyone have an idea how "lavaan" allows the mediators to covariate when this is not possible in Stata?

2) If you could estimate the multiple mediation model, the effects of the treatment variable on the mediators are on another scale (namely beta/sigma) than the effect of the mediators on the outcome variable (beta). Is it even possible to calculate an indirect effect when both effects are on a different scale? If so, how do you interpret this?

Thanks in advance for any suggestions

Kind regards
Eva
Tags: None

Eva Boonaert

Join Date: Oct 2019
Posts: 26

03 Apr 2023, 04:27

If someone would be interested in the answer, I contacted Stata's technical support and they provided me with the following solution:

Stata's -gsem- command does not allow the error terms for the generalized distributed endogenous variables to covary. You could, however, consider using the community-contributed command -cmp- (ssc install cmp and ssc install ghk2 in order to run cmp). Here is a quick illustrative example:

Code:

cscript
 
set obs 5000
set seed 16
gen x = 5*uniform()
gen e1 = invnorm(uniform())
summarize e1
replace e1 = (e1 - r(mean)) / r(sd)
gen e2 = invnorm(uniform())
summarize e2
replace e2 = (e2 - r(mean))
gen e3 = invnorm(uniform())
summarize e3
replace e3 = e3 - r(mean)
gen m1 = -2 + 2*x + e1
replace m1 = m1 > 0
gen m2 = -2 + 3*x + e2
replace m2 = m2 > 0
gen y = 1*x + 2*m1 + 3*m2 + e3
gen trx1 = normalden(-2 + 2*x) * 2
sum trx1, d
gen trx2 = normalden(-2 + 3*x) * 3
sum trx2, d
 
sem (y <- x m1 m2) ///
     (m1 <- x) ///
     (m2 <- x), cov((e.m1)*(e.m2)) method(ml) covstructure(_Ex,diag)
                
gsem (y <- x m1 m2) ///
     (m1 <- x, family(bernoulli) link(probit)) ///
     (m2 <- x, family(bernoulli) link(probit)),  ///
                method(ml)
                
cmp setup          
 
// matching -gsem- results with independent covariance structure
 
cmp (y = x m1 m2) (m1 = x) (m2 = x) ///
               , indicator($cmp_cont $cmp_probit $cmp_probit) ///
               covariance(independent)
              
cmp (y = x m1 m2) (m1 = x) (m2 = x) ///
               , indicator($cmp_cont $cmp_probit $cmp_probit) ///
               covariance(unstructured)

The first -cmp- command reproduces the results from running -gsem-, while the second -cmp- command changes the -covariance()- option to allow the covariances of the error terms to covary. You should read more details about other supported covariance structure to use in -covariance()- in the help file of the command -cmp-.

Kind regards
Eva

Announcement

Gsem: Multiple mediation model with categorical mediators

Comment