
  • How to include interactions between an endogenous variable and exogenous variables in GSEM

    Hello all,

    I am using GSEM to estimate the effect of X, which is endogenous, on Y, with an instrumental variable IV and a bunch of other exogenous variables (Zs).
    The problem is that X is interacted with two of the Zs (let's call them Z1 and Z2), and I'm not sure how to deal with this.


    I've tried the code below, but Stata says that I can't include interaction terms between latent variables.


    . gsem (Y <- X Z1 Z2 X#Z1 X#Z2 Z3 Z4 Z5, Oprobit) (X <- IV), vce(cluster Z5)


    Y, X, Z1, and Z2 are all ordinal variables and IV is continuous.

    Since I've never used the gsem command, I'm not sure about the details. Any help will be greatly appreciated!

    Best,
    MJ

  • #2
    Edit: My dataset is cross-sectional and all variables are observed.



    • #3
      Please provide the exact code you used and show us what Stata returned as output. I don't think 'Oprobit' is a valid option; it should be lowercase 'oprobit'. See FAQ section 12 for how to make useful posts. As for your code, I think the problem is that you are not using Stata's factor-variable notation. If you are treating Z1 and Z2 as continuous (assuming X is continuous), the code should be:

      Code:
      gsem (Y <- X Z1 Z2 c.X#c.Z1 c.X#c.Z2 Z3 Z4 Z5, oprobit) (X <- IV)
      If you are treating Z1 and Z2 as categorical:

      Code:
      gsem (Y <- X i.Z1 i.Z2 c.X#i.Z1 c.X#i.Z2 Z3 Z4 Z5, oprobit) (X <- IV)
      Roman



      • #4
        Dear Roman,

        I apologize. This is my first time posting here and I didn't know the rules.

        Here's the actual code I ran:

        Code:
        gsem (V29 <- i.V43 i.V32 i.INCOME_Q i.V43#i.V32 i.V43#i.INCOME_Q i.SEX AGE i.MARITAL i.DEGREE i.SUB_KNOW_Q i.OBJ_KNOW_Q i.V38
        >  i.ASIA i.NOR_AM i.CT_AM i.W_EUR i.E_EUR i.N_EUR i.S_EUR, oprobit) (i.V43 <- TEMP_ABS), vce(cluster REGION)

        So all the main variables are ordinal, including the endogenous variable.
        And here's the message I got:


        Code:
        note: Latent variable V29 was specified with option family(ordinal),but family(gaussian) is the only option allowed.  Assuming
              family(gaussian) for V29.
        note: Latent variable V29 was specified with option link(probit),but link(identity) is the only option allowed.  Assuming
              (identity) for V29.
        interactions between latent variables are not allowed
        r(198);



        • #5
          Not all variables are being treated as ordinal categorical in your code: AGE is being treated as continuous.

          Is your Stata up to date? Type update all in the Stata Command window or in a do-file. Showing an example of your data would be helpful for everyone; please see FAQ section 12 for how to post example data. If your Stata is up to date, try fitting a simpler model, adding one variable at a time, to see where the problem occurs. Also check your variable names. I will be away and am not sure whether I will be able to look at this again, but I can run the following example, which replicates your model structure with a few ordinal variables, without any trouble:


          Code:
          set obs 300
          
          gen o = floor((4-1+1)*runiform()+1) //generate ordinal outcome
          
          //Generate some ordinal categorical variable
          
          gen x= floor((3-1+1)*runiform()+1)
          gen z1 = floor((4-1+1)*runiform()+1)
          gen z2 = floor((3-1+1)*runiform()+1)
          gen t = floor((4-1+1)*runiform()+1)
          
          lis o x z1 z2 t in 1/10, clean //data example
          
                 o   x   z1   z2   t  
            1.   3   1    1    2   3  
            2.   4   3    1    1   2  
            3.   3   3    2    2   1  
            4.   3   2    3    1   3  
            5.   3   1    3    1   1  
            6.   4   3    4    3   1  
            7.   4   2    1    1   1  
            8.   2   3    4    1   2  
            9.   3   2    2    3   2  
           10.   3   1    4    2   4  
          
          
          /*Run the model*/
          
          gsem (o <- i.x i.z1 i.z2 i.x#i.z1 i.x#i.z2, oprobit) (i.x <- t)
          
          /*Output:*/
          *****************************************************
          
          Iteration 0:   log likelihood = -739.97586  
          Iteration 1:   log likelihood = -732.38946  
          Iteration 2:   log likelihood = -732.38777  
          Iteration 3:   log likelihood = -732.38777  
          
          Generalized structural equation model           Number of obs     =        300
          
          Response       : o
          Family         : ordinal
          Link           : probit
          
          Response       : x
          Base outcome   : 1
          Family         : multinomial
          Link           : logit
          
          Log likelihood = -732.38777
          
          ------------------------------------------------------------------------------
                       |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
          o <-         |
                       |
                     x |
                    2  |   .2237729   .4013629     0.56   0.577     -.562884     1.01043
                    3  |   .6640192   .4130772     1.61   0.108    -.1455973    1.473636
                       |
                    z1 |
                    2  |  -.0389763   .3200235    -0.12   0.903    -.6662108    .5882581
                    3  |   .3964114   .3393525     1.17   0.243    -.2687073     1.06153
                    4  |   .1178644   .3305149     0.36   0.721    -.5299329    .7656617
                       |
                    z2 |
                    2  |   .2070285   .2622743     0.79   0.430    -.3070197    .7210768
                    3  |   .2599257   .2744904     0.95   0.344    -.2780656     .797917
                       |
                  x#z1 |
                  2 2  |   .0813855   .4534593     0.18   0.858    -.8073785    .9701494
                  2 3  |  -.9779816   .4642522    -2.11   0.035    -1.887899    -.068064
                  2 4  |    -.05667   .4531824    -0.13   0.900    -.9448911    .8315511
                  3 2  |  -.4603238   .4729382    -0.97   0.330    -1.387266    .4666181
                  3 3  |  -.7826331   .4878125    -1.60   0.109    -1.738728    .1734618
                  3 4  |  -.5461176   .4646152    -1.18   0.240    -1.456747    .3645113
                       |
                  x#z2 |
                  2 2  |  -.2701369   .3806659    -0.71   0.478    -1.016228    .4759545
                  2 3  |  -.0952382   .3745396    -0.25   0.799    -.8293224    .6388459
                  3 2  |  -.1082127   .3865593    -0.28   0.780     -.865855    .6494295
                  3 3  |  -.2724948    .386346    -0.71   0.481    -1.029719    .4847294
          -------------+----------------------------------------------------------------
          1.x          |  (base outcome)
          -------------+----------------------------------------------------------------
          2.x <-       |
                     t |   .0376513   .1265049     0.30   0.766    -.2102938    .2855963
                 _cons |  -.1088241   .3299806    -0.33   0.742    -.7555742     .537926
          -------------+----------------------------------------------------------------
          3.x <-       |
                     t |   .1336069    .125077     1.07   0.285    -.1115395    .3787533
                 _cons |  -.3028232    .333219    -0.91   0.363    -.9559205    .3502741
          -------------+----------------------------------------------------------------
          o            |
                 /cut1 |  -.6918708   .2947713    -2.35   0.019    -1.269612   -.1141297
                 /cut2 |   .0475338   .2918675     0.16   0.871     -.524516    .6195835
                 /cut3 |    .755419   .2940588     2.57   0.010     .1790743    1.331764
          ------------------------------------------------------------------------------
          Last edited by Roman Mostazir; 18 Apr 2016, 21:25. Reason: Typo corrected
          Roman



          • #6
            Thank you so much for suggesting the update and replicating the code.
            I'm afraid I can't post my data here since they are not public, but here is what my main variables look like. TEMP_ABS is a continuous variable holding the absolute values of temperature anomalies:


            Code:
            . tab V29
            
            Q12a Protect environment: pay |
                       much higher prices |      Freq.     Percent        Cum.
            ------------------------------+-----------------------------------
                             Very willing |      2,028        4.65        4.65
                           Fairly willing |     12,094       27.72       32.36
            Neither willing nor unwilling |     10,178       23.33       55.69
                         Fairly unwilling |     10,419       23.88       79.57
                           Very unwilling |      8,916       20.43      100.00
            ------------------------------+-----------------------------------
                                    Total |     43,635      100.00
            
            . tab V43
            
                 Q14e A rise in world's temperature |
                           caused by climate change |      Freq.     Percent        Cum.
            ----------------------------------------+-----------------------------------
            Extremely dangerous for the environment |     11,823       27.62       27.62
                                     Very dangerous |     15,523       36.27       63.89
                                 Somewhat dangerous |     11,231       26.24       90.13
                                 Not very dangerous |      3,492        8.16       98.29
            Not dangerous at all for the environmen |        731        1.71      100.00
            ----------------------------------------+-----------------------------------
                                              Total |     42,800      100.00
            
            . tab V32
            
                      Q13a To do about |
            environment: too difficult |      Freq.     Percent        Cum.
            ---------------------------+-----------------------------------
                        Agree strongly |      3,897        8.82        8.82
                                 Agree |     11,972       27.09       35.91
            Neither agree nor disagree |      7,347       16.63       52.54
                              Disagree |     16,254       36.79       89.33
                     Disagree strongly |      4,716       10.67      100.00
            ---------------------------+-----------------------------------
                                 Total |     44,186      100.00
            
            . tab INCOME_Q
            
               INCOME_Q |      Freq.     Percent        Cum.
            ------------+-----------------------------------
                      1 |      2,457       14.05       14.05
                      2 |      2,294       13.12       27.16
                      3 |      1,474        8.43       35.59
                      4 |      2,012       11.50       47.10
                      5 |      1,242        7.10       54.20
                      6 |      2,136       12.21       66.41
                      7 |      1,601        9.15       75.56
                      8 |      1,426        8.15       83.72
                      9 |      1,611        9.21       92.93
                     10 |      1,237        7.07      100.00
            ------------+-----------------------------------
                  Total |     17,490      100.00

            I updated my Stata and ran the same code again, but I still get the same error message as before.



            • #7
              MJ KIM see:

              Code:
              help sem_and_gsem_syntax_options
              
              //  nocapslatent              do not treat capitalized Names as latent
              I suspect that is the easiest candidate cause to rule out for your previous example, which resulted in:

              Code:
              note: Latent variable V29 was specified with option family(ordinal),but family(gaussian) is the only option allowed. Assuming
                   family(gaussian) for V29.
              note: Latent variable V29 was specified with option link(probit),but link(identity) is the only option allowed. Assuming
                   (identity) for V29.
              interactions between latent variables are not allowed
              r(198);



              • #8
                wbuchanan nailed it. That's the culprit. Either rename the variables (e.g., V29) to lowercase or use the option 'nocapslatent':

                Code:
                 
                 gsem (V29 <- i.V43 i.V32 i.INCOME_Q i.V43#i.V32 i.V43#i.INCOME_Q ///
                      i.SEX AGE i.MARITAL i.DEGREE i.SUB_KNOW_Q i.OBJ_KNOW_Q i.V38 ///
                      i.ASIA i.NOR_AM i.CT_AM i.W_EUR i.E_EUR i.N_EUR i.S_EUR, oprobit) ///
                      (i.V43 <- TEMP_ABS), vce(cluster REGION) nocapslatent
                Regarding the dataset, we do not ask for the whole dataset but rather a small sub-sample (perhaps 10 to 20 rows) that we can work with. That saves time for everyone. An easy way to do this is to install the dataex program (ssc install dataex) and read its help file on how to provide data examples.
                Roman
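
The two suggestions in this post, sharing a small data example and renaming the variables to lowercase, could look like the minimal sketch below. It assumes the variable names from post #4 and that no two names collide once lowercased; the choice of variables and the 1/20 range passed to dataex are only illustrative. The `///` line continuations work in a do-file, not when typed line by line in the Command window.

```stata
* Sketch only: variable names are taken from post #4.

* 1. Share a small data example (illustrative choice of variables and rows)
ssc install dataex
dataex V29 V43 V32 INCOME_Q TEMP_ABS in 1/20

* 2. Alternative to nocapslatent: lowercase every variable name,
*    then refer to the lowercase names in the model
rename *, lower

gsem (v29 <- i.v43 i.v32 i.income_q i.v43#i.v32 i.v43#i.income_q ///
     i.sex age i.marital i.degree i.sub_know_q i.obj_know_q i.v38 ///
     i.asia i.nor_am i.ct_am i.w_eur i.e_eur i.n_eur i.s_eur, oprobit) ///
     (i.v43 <- temp_abs), vce(cluster region)
```

Renaming changes the dataset in memory, so it may be preferable to keep the original names and add nocapslatent if other do-files depend on them.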



                • #9
                  MJ KIM, depending on your comfort level, you could also simulate some data with properties similar to those of the data you are working with (particularly if some of those properties are responsible for the issue at hand).
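
A simulated stand-in for the private data might be sketched as follows. Everything here is an assumption made for illustration: the seed, the sample size, the uniform five-level draws, and the use of abs(rnormal()) as a substitute for the temperature-anomaly instrument. Only the capitalized variable names and the model structure are borrowed from post #4.

```stata
* Sketch only: all values are simulated; names mimic those in post #4.
clear
set seed 12345
set obs 1000

* Five-level ordinal variables with capitalized names, as in the real data
gen V29 = floor(5*runiform()) + 1
gen V43 = floor(5*runiform()) + 1
gen V32 = floor(5*runiform()) + 1

* Continuous, non-negative stand-in for the instrument
gen TEMP_ABS = abs(rnormal())

* Same model structure as post #4, reduced to the key terms
gsem (V29 <- i.V43 i.V32 i.V43#i.V32, oprobit) (i.V43 <- TEMP_ABS), nocapslatent
```

If the diagnosis in this thread is right, dropping nocapslatent from the last line should bring back the "interactions between latent variables are not allowed" error, which makes the example something others can run and debug without access to the real data.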
