Dear Statalist members
I would like to specify a multiple mediation model using "gsem" with one contintuous outcome variable (y: income), two binary mediators (m1 and m2: fertilizers and pesticides) and one continuous treatment variable (x: certification). The aim is to analyze the relative importance of both mediators by comparing their standardized indirect effect.
Since both mediators are highly correlated, I would like to allow the two mediators to covary in the mediation model. However, it does not seem possible to covary two binary mediators in "gsem":
The "sem" command allows correlation between the mediators, but is invalid since we need nonlinear models for the paths between the treatment and the binary mediators. The related "gsem" command gets the error: "invalid covariance specification; e.m1 does not identify an error for an uncensored gaussian response with the identity link". As suggested in this post, this is probably because the nonlinear models for the two binary mediators inherently have no error term (logit(p) = XB whereas for OLS: y = XB + e)).
I have two questions:
1) Does anyone have a suggestion on how to allow for correlation between binary mediators in "gsem" in Stata? I have tried different solutions. First, I tried to make the mediators covary by adding common latent variables, as suggested in this post in another context. The approach seems to work with a simple dataset like this, but in my own case (N>50,000) it took several days and did not converge.
Second, I was thinking to use a bayesmh estimation in Stata, but I can't figure out how to specify this for a fixed effects panel data estimator in my own case of N>50,000. Any suggestions for this?
One last solution I tried was to use the "lavaan" package in R which is able to let the binary mediators covary, although the warnings indicate that the model is probably not identified. Anyone have an idea how "lavaan" allows the mediators to covariate when this is not possible in Stata?
2) If you could estimate the multiple mediation model, the effects of the treatment variable on the mediators are on another scale (namely beta/sigma) than the effect of the mediators on the outcome variable (beta). Is it even possible to calculate an indirect effect when both effects are on a different scale? If so, how do you interpret this?
Thanks in advance for any suggestions
Kind regards
Eva
I would like to specify a multiple mediation model using "gsem" with one contintuous outcome variable (y: income), two binary mediators (m1 and m2: fertilizers and pesticides) and one continuous treatment variable (x: certification). The aim is to analyze the relative importance of both mediators by comparing their standardized indirect effect.
Since both mediators are highly correlated, I would like to allow the two mediators to covary in the mediation model. However, it does not seem possible to covary two binary mediators in "gsem":
Code:
*************************** Generate example dataset ***************************** cap log close clear clear matrix set more off set obs 5000 *generate certification variable gen x = 5 * uniform() *generate idiosyncratic error term gen e1 = invnorm(uniform()) sum e1 replace e1 = (e1 - r(mean)) / r(sd) //standard normal distribution gen e2 = invnorm(uniform()) sum e2 replace e2 = (e2 - r(mean)) / r(sd) //standard normal distribution gen e3 = invnorm(uniform()) sum e3 replace e3 = e3 - r(mean) //normal distribution *generate mediator 1 (fertilizers) gen m1 = -2 + 2*x + e1 replace m1 = m1 > 0 tab m1 *generate mediator 2 (pesticides) gen m2 = -2 + 3*x + e2 replace m2 = m2 > 0 tab m2 *generate income variable gen y = 1*x + 2*m1 + 3*m2 + e3 *"true" average marginal effect gen trx1 = normalden(-2 + 2*x) * 2 sum trx1, d //0,20 gen trx2 = normalden(-2 + 3*x) * 3 sum trx2, d //0,19 ***************************** Mediation analysis ********************************* *estimation via "sem" sem (y <- x m1 m2) /// (m1 <- x) /// (m2 <- x), cov((e.m1)*(e.m2)) method(ml) covstructure(_Ex,diag) *estimation via "gsem" gsem (y <- x m1 m2) /// (m1 <- x, family(bernoulli) link(probit)) /// (m2 <- x, family(bernoulli) link(probit)), cov((e.m1)*(e.m2)) method(ml) margins, dydx(*) nose
I have two questions:
1) Does anyone have a suggestion on how to allow for correlation between binary mediators in "gsem" in Stata? I have tried different solutions. First, I tried to make the mediators covary by adding common latent variables, as suggested in this post in another context. The approach seems to work with a simple dataset like this, but in my own case (N>50,000) it took several days and did not converge.
Code:
************************** Latent variable approach ****************************** sem (y <- x m1 m2) /// (m1 <- x) /// (m2 <- x), cov((e.m1)*(e.m2)) method(ml) covstructure(_Ex,diag) sem (y <- x m1 m2) /// (m1 <- x A@1) /// (m2 <- x A@1), method(ml) covstructure(_Ex,diag)
One last solution I tried was to use the "lavaan" package in R which is able to let the binary mediators covary, although the warnings indicate that the model is probably not identified. Anyone have an idea how "lavaan" allows the mediators to covariate when this is not possible in Stata?
2) If you could estimate the multiple mediation model, the effects of the treatment variable on the mediators are on another scale (namely beta/sigma) than the effect of the mediators on the outcome variable (beta). Is it even possible to calculate an indirect effect when both effects are on a different scale? If so, how do you interpret this?
Thanks in advance for any suggestions
Kind regards
Eva
Comment