
  • Specify an interaction between a group of dummy variables and a continuous variable in SEM

    Hello,

    I am running the model below with -sem- in Stata 13 and would like to add an interaction between the group of dummies "cm2-cm14" and the continuous variable "gestage". From what I can see, -sem- doesn't accept the i. notation (i.e. i.cm) that I would use in a -regress- command. Any help would be greatly appreciated!

    Best wishes,
    Emily

    sem (bweightcor <- bage_d gestage bsex mheightavg primip noedu foodinsec cm2-cm14 vg2-vg9) if bweightcor~=. & bage_d<=7, nocapslatent method (mlmv)

  • #2
    something like this:

    Code:
    foreach v of varlist cm2-cm14 {
        gen gestageX`v' = gestage*`v'
    }
    sem (bweightcor <- bage_d gestage gestageX* cm2-cm14 bsex mheightavg primip noedu ///
        foodinsec vg2-vg9) if !missing(bweightcor) & bage_d <= 7, nocapslatent method(mlmv)


    • #3
      Thank you so much for your help!


      • #4
        Hi Clyde,

        Could I trouble you with a follow up question?

        After -sem-, I am running -lincom- as a postestimation command to compare the marginal mean birthweight for each month with the overall sample mean birthweight, holding all other covariates at their mean values (e.g. 0.34*foodinsec uses the mean of food insecurity in the overall sample).

        I would like the code below to give me the contrast between the marginal mean birthweight in month 2 ("c2m2"), with all other covariates at their mean values, and the overall sample mean. The output seems to make sense when I run it without the interaction terms. However, when I add the interaction terms (the gestageX terms in the -lincom- command below), the marginal means no longer make sense. Am I putting the interaction terms into the -lincom- statement incorrectly?


        sem (bweightcor <- bage_d gestage bsex mheightavg primip noedu foodinsec c2m2-c2m7 vg2-vg9 gestageX*) if bweightcor~=. & bage_d<=7, nocapslatent method (mlmv)
        sum blengthavg //summarise the overall mean. This command should be run simultaneously with the following two commands below
        loc overall_mean = r(mean) // get the overall mean in a macro
        lincom 39.56*gestage + 149.92*mheightavg + 0.50*bsex + 0.28*primip+ 0.66*noedu + 2.96*bage_d + 0.34*foodinsec + .14*vg2 + .10*vg3+ .16*vg4+ .13*vg5 + .02*vg6 +.06*vg7 + .15*vg8 + .09*vg9+ 10.61*gestageXc2m2+ 6.66*gestageXc2m3+ 6.47*gestageXc2m4+ 2.00*gestageXc2m5+ 4.22*gestageXc2m6+ 3.13*gestageXc2m7+ _cons +c2m2 - `overall_mean'



        Thank you so much!

        Emily


        • #5
          First there is the -lincom- command itself. I'm guessing that the variables c2m2-c2m7 are indicator variables for months two through seven, respectively. I make that inference because you have singled out c2m2 among them for inclusion of its coefficient in the -lincom- command, so you are effectively setting the others to zero and c2m2 itself to 1. And that is fine. But then, among the gestageX coefficients you must likewise omit all of them except gestageXc2m2 from your -lincom- command. After all, when, e.g., c2m3 = 0 (which is a constraint you are trying to impose here), then gestageXc2m3 must also be 0, etc.
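
          To make the argument above concrete, here is a sketch of the prediction being evaluated. The symbols are shorthand introduced only for this illustration: m_j is the indicator c2mj, g is the value of gestage at which the prediction is evaluated, x-bar_k are the means of the covariates (gestage and the vg dummies included), and beta, gamma and delta are the corresponding -sem- coefficients.

          ```latex
          \begin{align*}
          \widehat{E}[\text{bweightcor}]
            &= \beta_0 + \sum_k \beta_k \bar{x}_k
               + \sum_{j=2}^{7} \gamma_j\, m_j
               + \sum_{j=2}^{7} \delta_j\, (g \cdot m_j) \\
            &= \beta_0 + \sum_k \beta_k \bar{x}_k + \gamma_2 + \delta_2\, g
            \qquad \text{when } m_2 = 1,\ m_j = 0 \text{ for } j \neq 2 .
          \end{align*}
          ```

          Every product g * m_j with j other than 2 is zero, so only the gestageXc2m2 term survives alongside c2m2.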

          There is possibly one other issue. In #1 you wanted to create interaction variables for all 13 month indicators (named cm2 through cm14 there). Perhaps you modified the code in #2 to create them only for c2m2 through c2m7. But if you didn't modify it, then you created 13 interaction variables, not just 6. And in your -sem- command, you included gestageX* among the predictors, which means that the interaction variables for months 8 through 14 are in your model, but the indicators for months 8 through 14 themselves are not. That would be a mis-specified model in the first place.


          • #6
            And, on additional reflection, one other thing. In the -lincom- command, the multiplier applied to the coefficient of gestageXc2m2 needs to be the average value of gestage in the estimation sample, not the average value of gestageXc2m2.
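
            Putting this together with the point in #5, the -lincom- command might look something like the sketch below. It reuses the covariate means quoted in #4 and assumes that 39.56 is the mean of gestage in the estimation sample and that the local macro overall_mean is still defined from the -summarize- step; neither has been checked against the actual data.

            ```stata
            * Sketch only: covariate means are copied from #4, not recomputed.
            * Assumes `overall_mean' was defined in the same run (see #4).
            * Only gestageXc2m2 is kept among the interactions, and it is
            * multiplied by the assumed mean of gestage (39.56), not by the
            * mean of gestageXc2m2.
            lincom 39.56*gestage + 149.92*mheightavg + 0.50*bsex + 0.28*primip ///
                + 0.66*noedu + 2.96*bage_d + 0.34*foodinsec ///
                + .14*vg2 + .10*vg3 + .16*vg4 + .13*vg5 ///
                + .02*vg6 + .06*vg7 + .15*vg8 + .09*vg9 ///
                + 39.56*gestageXc2m2 + _cons + c2m2 - `overall_mean'
            ```

            For another month, swap in that month's indicator and its gestageX term, keeping the same multiplier on the interaction.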

            This kind of complexity is the reason I encourage people to use factor variable notation and -margins- rather than trying to calculate margins and effects "by hand." Now, -sem-, for whatever reason, does not support factor variable notation, so that option is not open to you with this approach. Which raises the question why you are using -sem- instead of -regress- for this. I can see two possible reasons. Perhaps this is just part of a more complicated model with multiple equations and maybe some latent variables as well. Or perhaps you have a lot of missing values and you want to handle that with full-information maximum likelihood estimation. If these reasons apply, then I wouldn't change the approach. But if they don't, you should seriously consider switching over to factor-variable notation, -regress-, and -margins-.
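
            If neither of those reasons applies, the factor-variable route could be sketched roughly as follows. The variable month is hypothetical: it stands for a single categorical variable coding the same months as the c2m* indicators and does not appear anywhere in this thread. Note that -regress- drops observations with missing covariates rather than handling them by full-information maximum likelihood.

            ```stata
            * Hypothetical: -month- is one categorical variable replacing c2m2-c2m7.
            * i.month##c.gestage adds the month main effects, gestage, and
            * their interaction in one step.
            regress bweightcor i.month##c.gestage bage_d bsex mheightavg ///
                primip noedu foodinsec vg2-vg9 if bage_d <= 7

            * Predicted mean birthweight for each month, with the other
            * covariates held at their means.
            margins month, atmeans
            ```

            Here -margins- keeps the interaction terms consistent with the month being evaluated, which is the bookkeeping that has to be done by hand in the -lincom- approach.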


            • #7
              Thank you so much! I really appreciate your help! I ended up collapsing some of my months, which is why there are fewer dummy variables, and you are correct that I have a lot of missing data in the covariates, so I am using full-information maximum likelihood within -sem- to handle this. The error I was making was using the average value of gestageXc2m2 rather than the average value of gestational age. Hopefully the postestimation values will now make sense! Thank you again and Happy New Year!
