
  • Interaction Term between independent variable and lagged dependent variable in xtabond

    Hi everybody

    I am setting up a dynamic model using the xtabond command. I need to add an interaction between the lagged dependent variable and other variables, as attached.
    I tried:

    c.L1.depvariable#c.indvariable
    c.L1.depvariable*c.indvariable
    L1.depvariable*indvariable

    but they do not work.

    Can someone please help me with the syntax?

  • #2
    I forgot to mention: all my variables are continuous (numeric)!



    • #3
      This may be a data-specific issue for you.

      c.L1.depvariable#c.indvariable
      Some of your variables may be categorical even though you are marking them as continuous, so check this. As for the syntax, you have it right!

      I interact the first lag of the dependent variable "n" with the variable "w" below:




      Code:
      . webuse abdata
      
      . xtabond2 n L.n L2.n w c.L.n#c.w L.w L(0/2).(k ys) yr*, gmm(L.n) iv(w L.w L(0/2).(k ys) yr*) nolevel robust
      Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.
      yr1976 dropped due to collinearity
      yr1977 dropped due to collinearity
      yr1978 dropped due to collinearity
      Warning: Two-step estimated covariance matrix of moments is singular.
        Using a generalized inverse to calculate robust weighting matrix for Hansen test.
        Difference-in-Sargan/Hansen statistics may be negative.
      
      Dynamic panel-data estimation, one-step difference GMM
      ------------------------------------------------------------------------------
      Group variable: id                              Number of obs      =       611
      Time variable : year                            Number of groups   =       140
      Number of instruments = 41                      Obs per group: min =         4
      Wald chi2(17) =   1356.64                                      avg =      4.36
      Prob > chi2   =     0.000                                      max =         6
      ------------------------------------------------------------------------------
                   |               Robust
                 n |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
                 n |
               L1. |   1.177182   .2705461     4.35   0.000     .6469215    1.707443
               L2. |  -.0948643   .0543456    -1.75   0.081    -.2013797     .011651
                   |
                 w |  -.4584522   .2685266    -1.71   0.088    -.9847547    .0678503
                   |
          cL.n#c.w |  -.1808065   .0712044    -2.54   0.011    -.3203645   -.0412485
                   |
                 w |
               L1. |   .3786299   .1703421     2.22   0.026     .0447655    .7124943
                   |
                 k |
               --. |   .3779039   .0595079     6.35   0.000     .2612706    .4945373
               L1. |     -.0186   .0749666    -0.25   0.804    -.1655319    .1283319
               L2. |   .0027866   .0335273     0.08   0.934    -.0629256    .0684988
                   |
                ys |
               --. |   .5721959    .193091     2.96   0.003     .1937444    .9506474
               L1. |  -.7372516    .272785    -2.70   0.007      -1.2719   -.2026027
               L2. |    .114775   .1407182     0.82   0.415    -.1610277    .3905776
                   |
            yr1979 |   .0030766   .0120867     0.25   0.799    -.0206129    .0267662
            yr1980 |   .0149288   .0197241     0.76   0.449    -.0237297    .0535873
            yr1981 |  -.0211098   .0316787    -0.67   0.505    -.0831989    .0409793
            yr1982 |  -.0370468   .0324049    -1.14   0.253    -.1005593    .0264657
            yr1983 |  -.0332645   .0325828    -1.02   0.307    -.0971256    .0305966
            yr1984 |  -.0230104   .0375276    -0.61   0.540    -.0965632    .0505424
      ------------------------------------------------------------------------------
      Instruments for first differences equation
        Standard
          D.(w L.w k L.k L2.k ys L.ys L2.ys yr1976 yr1977 yr1978 yr1979 yr1980
          yr1981 yr1982 yr1983 yr1984)
        GMM-type (missing=0, separate instruments for each period unless collapsed)
          L(1/8).L.n
      ------------------------------------------------------------------------------
      Arellano-Bond test for AR(1) in first differences: z =  -3.03  Pr > z =  0.002
      Arellano-Bond test for AR(2) in first differences: z =  -0.34  Pr > z =  0.737
      ------------------------------------------------------------------------------
      Sargan test of overid. restrictions: chi2(24)   =  56.24  Prob > chi2 =  0.000
        (Not robust, but not weakened by many instruments.)
      Hansen test of overid. restrictions: chi2(24)   =  25.24  Prob > chi2 =  0.393
        (Robust, but weakened by many instruments.)
      
      Difference-in-Hansen tests of exogeneity of instrument subsets:
        iv(w L.w k L.k L2.k ys L.ys L2.ys yr1976 yr1977 yr1978 yr1979 yr1980 yr1981 yr1982 yr1983 yr1984)
          Hansen test excluding group:     chi2(10)   =  11.54  Prob > chi2 =  0.317
          Difference (null H = exogenous): chi2(14)   =  13.71  Prob > chi2 =  0.472

      ADDED NOTE: Apologies, I had not noticed that you specified xtabond and not xtabond2, which is what I illustrate above. You are correct: it seems that this was an issue back when xtabond was introduced, and I do not know whether it was ever resolved.

      http://www.stata.com/statalist/archi.../msg00522.html

      However, you now have xtabond2, which addresses this. My recommendation is that you use it for your estimation. If you insist on xtabond, you can revert to the old-fashioned way of creating interactions by hand, using a loop if you have many variables.
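
      A minimal sketch of that old-fashioned approach, using the same abdata example as above. The names Ln_x_w, Ln_x_k and Ln_x_ys are made up for illustration, and the variables in the loop are picked only as examples. Note that entering the product as a plain regressor means xtabond treats it as strictly exogenous; whether it belongs in pre() or endogenous() instead is a modeling decision this sketch does not settle.

      ```stata
      webuse abdata, clear

      * Build the interaction by hand. abdata is already xtset (id year),
      * so the lag operator L. can be used inside generate.
      * Ln_x_w is a hypothetical name chosen for this sketch.
      generate double Ln_x_w = L.n * w

      * With several variables, a loop saves typing (k and ys are only examples)
      foreach v of varlist k ys {
          generate double Ln_x_`v' = L.n * `v'
      }

      * The hand-made product enters xtabond like any other regressor
      xtabond n w Ln_x_w, lags(2) vce(robust)
      ```

      The first observation of each panel has no lag, so the product is missing there, exactly as L.n itself would be.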
      Last edited by Andrew Musau; 12 Mar 2016, 18:35.



      • #4
        I already checked the post you mentioned, but since this is my very first time using interaction terms, I am not very confident and did not understand it very well.
        About the "old-fashioned way": I do not have too many variables, so is it right to just multiply the lagged dependent variable by my non-lagged independent variables, whether they're continuous or categorical? Is the procedure the same in both cases?
        Thank you



        • #5
          About the "old-fashioned way": I do not have too many variables, so is it right to just multiply the lagged dependent variable by my non-lagged independent variables, whether they're continuous or categorical? Is the procedure the same in both cases?
          Generally yes: an interaction is nothing more than the product of two variables. With continuous variables it is straightforward; you just multiply. With categorical variables, however, you will have several interaction terms, depending on the number of categories. Here is a simple example to illustrate:

          Code:
          . webuse lbw
          (Hosmer & Lemeshow data)
          
          . des
          
          Contains data from http://www.stata-press.com/data/r11/lbw.dta
            obs:           189                          Hosmer & Lemeshow data
           vars:            11                          15 Jan 2009 05:01
           size:         3,402 (99.9% of memory free)
          -------------------------------------------------------------------------------------------------------------------------
                        storage  display     value
          variable name   type   format      label      variable label
          -------------------------------------------------------------------------------------------------------------------------
          id              int    %8.0g                  identification code
          low             byte   %8.0g                  birthweight<2500g
          age             byte   %8.0g                  age of mother
          lwt             int    %8.0g                  weight at last menstrual period
          race            byte   %8.0g       race       race
          smoke           byte   %8.0g                  smoked during pregnancy
          ptl             byte   %8.0g                  premature labor history (count)
          ht              byte   %8.0g                  has history of hypertension
          ui              byte   %8.0g                  presence, uterine irritability
          ftv             byte   %8.0g                  number of visits to physician during 1st trimester
          bwt             int    %8.0g                  birthweight (grams)
          -------------------------------------------------------------------------------------------------------------------------
          Sorted by:  
          
          
          . label list race
          race:
                     1 white
                     2 black
                     3 other
          
          .
          In this dataset, we have the categorical variable race with 3 categories: white, black, and other. Here is a model with interactions:

          Code:
          . regress  low age lwt i.race smoke smoke#i.race
          note: 1.smoke#3.race omitted because of collinearity
          
                Source |       SS       df       MS              Number of obs =     189
          -------------+------------------------------           F(  7,   181) =    3.03
                 Model |   4.2571373     7  .608162472           Prob > F      =  0.0049
              Residual |  36.3248733   181  .200689908           R-squared     =  0.1049
          -------------+------------------------------           Adj R-squared =  0.0703
                 Total |  40.5820106   188  .215861758           Root MSE      =  .44798
          
          ------------------------------------------------------------------------------
                   low |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                   age |  -.0033655   .0066344    -0.51   0.613    -.0164563    .0097252
                   lwt |  -.0021554   .0011473    -1.88   0.062    -.0044192    .0001085
                       |
                  race |
                    2  |   .2239505   .1380235     1.62   0.106    -.0483915    .4962925
                    3  |   .2180009   .0955033     2.28   0.024     .0295578     .406444
                       |
                 smoke |   .0639134   .1428389     0.45   0.655    -.2179302    .3457569
                       |
            smoke#race |
                  1 1  |    .172793   .1718378     1.01   0.316      -.16627     .511856
                  1 2  |   .2228583   .2323809     0.96   0.339    -.2356657    .6813823
                  1 3  |  (omitted)
                       |
                 _cons |   .4777417   .2242785     2.13   0.035      .035205    .9202784
          ------------------------------------------------------------------------------
          Manually, I would first have to create three variables representing the categories of race, then create the interactions as follows:


          Code:
          . tab race, gen(Race)
          
                 race |      Freq.     Percent        Cum.
          ------------+-----------------------------------
                white |         96       50.79       50.79
                black |         26       13.76       64.55
                other |         67       35.45      100.00
          ------------+-----------------------------------
                Total |        189      100.00
          This creates 3 variables: Race1, Race2, and Race3. Then I just multiply these by the variable "smoke" to create the interactions, and run the regression (omitting the third interaction, as above):

          Code:
          . gen smokexRace1= smoke*Race1
          
          . gen smokexRace2= smoke*Race2
          
          . gen smokexRace3= smoke*Race3
          
          . regress  low age lwt  Race2 Race3 smoke  smokexRace1 smokexRace2
          
                Source |       SS       df       MS              Number of obs =     189
          -------------+------------------------------           F(  7,   181) =    3.03
                 Model |   4.2571373     7  .608162472           Prob > F      =  0.0049
              Residual |  36.3248733   181  .200689908           R-squared     =  0.1049
          -------------+------------------------------           Adj R-squared =  0.0703
                 Total |  40.5820106   188  .215861758           Root MSE      =  .44798
          
          ------------------------------------------------------------------------------
                   low |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                   age |  -.0033655   .0066344    -0.51   0.613    -.0164563    .0097252
                   lwt |  -.0021554   .0011473    -1.88   0.062    -.0044192    .0001085
                 Race2 |   .2239505   .1380235     1.62   0.106    -.0483915    .4962925
                 Race3 |   .2180009   .0955033     2.28   0.024     .0295578     .406444
                 smoke |   .0639134   .1428389     0.45   0.655    -.2179302    .3457569
           smokexRace1 |    .172793   .1718378     1.01   0.316      -.16627     .511856
           smokexRace2 |   .2228583   .2323809     0.96   0.339    -.2356657    .6813823
                 _cons |   .4777417   .2242785     2.13   0.035      .035205    .9202784
          ------------------------------------------------------------------------------


          So just a few more steps with categorical variables.
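
          For completeness, here is a small sketch of the continuous case in the same lbw data, where the product is a single variable. The name agexlwt is made up for this illustration, and interacting age with lwt is only an example, not a suggested model.

          ```stata
          webuse lbw, clear

          * Factor-variable notation for a continuous-by-continuous interaction
          regress low age lwt c.age#c.lwt
          scalar b_fv = _b[c.age#c.lwt]

          * The same interaction built by hand (agexlwt is a hypothetical name)
          generate double agexlwt = age * lwt
          regress low age lwt agexlwt

          * The two interaction coefficients should agree up to rounding
          assert reldif(b_fv, _b[agexlwt]) < 1e-6
          ```

          With continuous variables there is only one product to create, so no dummy variables are needed first.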




          • #6
            Note that this same question was raised on Stack Overflow at http://stackoverflow.com/questions/3...-variable-in-x

            Olivia, the Statalist FAQ, linked to from the top of the page as well as from the Advice on Posting link on the page you used to create your post, requests that you state when you have raised the same question in other forums. Before you start your next topic, you should review the FAQ, especially sections 9-12 on how best to pose your question. It is particularly helpful to copy commands and output from your Stata Results window and paste them into your Statalist post using CODE delimiters, as described in section 12 of the FAQ. Saying "it didn't work" is unhelpful and leads to misdirected effort, as in Andrew's work in post #3, where your initial post made it easy to misunderstand what the precise problem was.
