Interaction Term between independent variable and lagged dependent variable in xtabond

Olivia Carter

Join Date: Jan 2016

Posts: 8
#1

12 Mar 2016, 11:52

Hi everybody

I am setting up a dynamic panel model using the xtabond command. I need to add interactions between the lagged dependent variable and other variables, as attached.
I tried:

c.L1.depvariable#c.indvariable
c.L1.depvariable*c.indvariable
L1.depvariable*indvariable

but they do not work.

Can someone please help me with the syntax?

Olivia Carter

Join Date: Jan 2016

Posts: 8
#2

12 Mar 2016, 13:37

I forgot to mention: all my variables are continuous and numeric!

Andrew Musau

Join Date: Oct 2014
Posts: 10225

12 Mar 2016, 18:05

This may be a data-specific issue on your end.

c.L1.depvariable#c.indvariable

Some of your variables may be categorical while you are marking them as continuous with the c. prefix, so check this. As for the syntax itself, the line above is correct!

I interact the first lag of the dependent variable "n" with the variable "w" below:

Code:

. webuse abdata

. xtabond2 n L.n L2.n w c.L.n#c.w L.w L(0/2).(k ys) yr*, gmm(L.n) iv(w L.w L(0/2).(k ys) yr*) nolevel robust
Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.
yr1976 dropped due to collinearity
yr1977 dropped due to collinearity
yr1978 dropped due to collinearity
Warning: Two-step estimated covariance matrix of moments is singular.
  Using a generalized inverse to calculate robust weighting matrix for Hansen test.
  Difference-in-Sargan/Hansen statistics may be negative.

Dynamic panel-data estimation, one-step difference GMM
------------------------------------------------------------------------------
Group variable: id                              Number of obs      =       611
Time variable : year                            Number of groups   =       140
Number of instruments = 41                      Obs per group: min =         4
Wald chi2(17) =   1356.64                                      avg =      4.36
Prob > chi2   =     0.000                                      max =         6
------------------------------------------------------------------------------
             |               Robust
           n |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           n |
         L1. |   1.177182   .2705461     4.35   0.000     .6469215    1.707443
         L2. |  -.0948643   .0543456    -1.75   0.081    -.2013797     .011651
             |
           w |  -.4584522   .2685266    -1.71   0.088    -.9847547    .0678503
             |
    cL.n#c.w |  -.1808065   .0712044    -2.54   0.011    -.3203645   -.0412485
             |
           w |
         L1. |   .3786299   .1703421     2.22   0.026     .0447655    .7124943
             |
           k |
         --. |   .3779039   .0595079     6.35   0.000     .2612706    .4945373
         L1. |     -.0186   .0749666    -0.25   0.804    -.1655319    .1283319
         L2. |   .0027866   .0335273     0.08   0.934    -.0629256    .0684988
             |
          ys |
         --. |   .5721959    .193091     2.96   0.003     .1937444    .9506474
         L1. |  -.7372516    .272785    -2.70   0.007      -1.2719   -.2026027
         L2. |    .114775   .1407182     0.82   0.415    -.1610277    .3905776
             |
      yr1979 |   .0030766   .0120867     0.25   0.799    -.0206129    .0267662
      yr1980 |   .0149288   .0197241     0.76   0.449    -.0237297    .0535873
      yr1981 |  -.0211098   .0316787    -0.67   0.505    -.0831989    .0409793
      yr1982 |  -.0370468   .0324049    -1.14   0.253    -.1005593    .0264657
      yr1983 |  -.0332645   .0325828    -1.02   0.307    -.0971256    .0305966
      yr1984 |  -.0230104   .0375276    -0.61   0.540    -.0965632    .0505424
------------------------------------------------------------------------------
Instruments for first differences equation
  Standard
    D.(w L.w k L.k L2.k ys L.ys L2.ys yr1976 yr1977 yr1978 yr1979 yr1980
    yr1981 yr1982 yr1983 yr1984)
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    L(1/8).L.n
------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z =  -3.03  Pr > z =  0.002
Arellano-Bond test for AR(2) in first differences: z =  -0.34  Pr > z =  0.737
------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(24)   =  56.24  Prob > chi2 =  0.000
  (Not robust, but not weakened by many instruments.)
Hansen test of overid. restrictions: chi2(24)   =  25.24  Prob > chi2 =  0.393
  (Robust, but weakened by many instruments.)

Difference-in-Hansen tests of exogeneity of instrument subsets:
  iv(w L.w k L.k L2.k ys L.ys L2.ys yr1976 yr1977 yr1978 yr1979 yr1980 yr1981 yr1982 yr1983 yr1984)
    Hansen test excluding group:     chi2(10)   =  11.54  Prob > chi2 =  0.317
    Difference (null H = exogenous): chi2(14)   =  13.71  Prob > chi2 =  0.472

ADDED NOTE: Apologies, I had not noticed that you specified xtabond and not xtabond2, which I use in the illustration above. You are correct: it seems this was already an issue back when xtabond was introduced, and I do not know whether it was ever resolved.

http://www.stata.com/statalist/archi.../msg00522.html

However, you now have xtabond2 (a user-written command; install it with ssc install xtabond2), which addresses this. My recommendation is that you use it for your estimation. If you insist on xtabond, you can revert to the old-fashioned way of creating interactions: generate each product variable yourself, using a loop if you have many variables.
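For the xtabond route, a minimal sketch of the manual approach might look like the following. It uses the abdata example from above; the variable names Ln_x_w, Ln_x_k, and Ln_x_ys are made up for illustration, and entering the product through pre() (i.e. treating it as predetermined because it contains L.n) is an assumption that you would need to justify for your own model.

```stata
* Sketch only: build the interaction by hand, then pass it to xtabond.
webuse abdata, clear
xtset id year                     // abdata ships xtset; repeated here for clarity

* Product of the lagged dependent variable and a continuous regressor.
* Ln_x_w is a hypothetical name chosen for this example.
generate double Ln_x_w = L.n * w

* Assumption: the product is treated as predetermined via pre(),
* since it involves L.n. Adjust to your own identification argument.
xtabond n w, lags(1) pre(Ln_x_w) vce(robust)

* With several regressors, a loop saves typing.
foreach v of varlist k ys {
    generate double Ln_x_`v' = L.n * `v'
}
```

The products are stored as double so that they match L.n * w exactly rather than being rounded to float.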

Last edited by Andrew Musau; 12 Mar 2016, 18:35.


Olivia Carter

Join Date: Jan 2016

Posts: 8
#4

13 Mar 2016, 07:52

I had already checked the post you mentioned, but since this is the very first time I am using interaction terms, I did not understand it very well and am not confident about it.
About the "old-fashioned way": I do not have too many variables, so is it right to just multiply the lagged dependent variable by my non-lagged independent variables, whether they are continuous or categorical? Is the procedure the same in both cases?
Thank you

Andrew Musau

Join Date: Oct 2014
Posts: 10225

13 Mar 2016, 12:13

About the "old-fashioned way": I do not have too many variables so, is it right to just multiply the lagged dependent variable with my not lagged independent variables, whether they're continuous or categorical? Is the procedure the same in both cases?

Generally yes: an interaction is nothing more than the product of two variables. With continuous variables it is straightforward, you just multiply. With a categorical variable, however, you get several interaction terms, one for each category's indicator variable, and one of them is dropped because of collinearity. Here is a simple example that illustrates:

Code:

. webuse lbw
(Hosmer & Lemeshow data)

. des

Contains data from http://www.stata-press.com/data/r11/lbw.dta
  obs:           189                          Hosmer & Lemeshow data
 vars:            11                          15 Jan 2009 05:01
 size:         3,402 (99.9% of memory free)
-------------------------------------------------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------------------------------------------------
id              int    %8.0g                  identification code
low             byte   %8.0g                  birthweight<2500g
age             byte   %8.0g                  age of mother
lwt             int    %8.0g                  weight at last menstrual period
race            byte   %8.0g       race       race
smoke           byte   %8.0g                  smoked during pregnancy
ptl             byte   %8.0g                  premature labor history (count)
ht              byte   %8.0g                  has history of hypertension
ui              byte   %8.0g                  presence, uterine irritability
ftv             byte   %8.0g                  number of visits to physician during 1st trimester
bwt             int    %8.0g                  birthweight (grams)
-------------------------------------------------------------------------------------------------------------------------
Sorted by:  


. label list race
race:
           1 white
           2 black
           3 other

.

In this dataset, we have the categorical variable race with 3 categories: white, black, and other. Here is a model with interactions between smoke and race:

Code:

. regress  low age lwt i.race smoke smoke#i.race
note: 1.smoke#3.race omitted because of collinearity

      Source |       SS       df       MS              Number of obs =     189
-------------+------------------------------           F(  7,   181) =    3.03
       Model |   4.2571373     7  .608162472           Prob > F      =  0.0049
    Residual |  36.3248733   181  .200689908           R-squared     =  0.1049
-------------+------------------------------           Adj R-squared =  0.0703
       Total |  40.5820106   188  .215861758           Root MSE      =  .44798

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0033655   .0066344    -0.51   0.613    -.0164563    .0097252
         lwt |  -.0021554   .0011473    -1.88   0.062    -.0044192    .0001085
             |
        race |
          2  |   .2239505   .1380235     1.62   0.106    -.0483915    .4962925
          3  |   .2180009   .0955033     2.28   0.024     .0295578     .406444
             |
       smoke |   .0639134   .1428389     0.45   0.655    -.2179302    .3457569
             |
  smoke#race |
        1 1  |    .172793   .1718378     1.01   0.316      -.16627     .511856
        1 2  |   .2228583   .2323809     0.96   0.339    -.2356657    .6813823
        1 3  |  (omitted)
             |
       _cons |   .4777417   .2242785     2.13   0.035      .035205    .9202784
------------------------------------------------------------------------------

To do this manually, I first have to create three indicator variables representing the categories of race, then create the interactions as follows:

Code:

. tab race, gen(Race)

       race |      Freq.     Percent        Cum.
------------+-----------------------------------
      white |         96       50.79       50.79
      black |         26       13.76       64.55
      other |         67       35.45      100.00
------------+-----------------------------------
      Total |        189      100.00

This creates 3 indicator variables: Race1, Race2, and Race3. Then I just multiply these by the variable "smoke" to create the interactions, and run the regression (leaving out the third interaction, as above):

Code:

. gen smokexRace1= smoke*Race1

. gen smokexRace2= smoke*Race2

. gen smokexRace3= smoke*Race3

. regress  low age lwt  Race2 Race3 smoke  smokexRace1 smokexRace2

      Source |       SS       df       MS              Number of obs =     189
-------------+------------------------------           F(  7,   181) =    3.03
       Model |   4.2571373     7  .608162472           Prob > F      =  0.0049
    Residual |  36.3248733   181  .200689908           R-squared     =  0.1049
-------------+------------------------------           Adj R-squared =  0.0703
       Total |  40.5820106   188  .215861758           Root MSE      =  .44798

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0033655   .0066344    -0.51   0.613    -.0164563    .0097252
         lwt |  -.0021554   .0011473    -1.88   0.062    -.0044192    .0001085
       Race2 |   .2239505   .1380235     1.62   0.106    -.0483915    .4962925
       Race3 |   .2180009   .0955033     2.28   0.024     .0295578     .406444
       smoke |   .0639134   .1428389     0.45   0.655    -.2179302    .3457569
 smokexRace1 |    .172793   .1718378     1.01   0.316      -.16627     .511856
 smokexRace2 |   .2228583   .2323809     0.96   0.339    -.2356657    .6813823
       _cons |   .4777417   .2242785     2.13   0.035      .035205    .9202784
------------------------------------------------------------------------------

So categorical variables just take a few more steps.
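If the categorical variable has many levels, the same products can be generated in a loop instead of one line per category. The sketch below is one possible way to do it, not the only one: it skips the tabulate step and builds each indicator on the fly, and the names smoke_x_race1 to smoke_x_race3 are made up for this example. It assumes race has no missing values (true for lbw); with missing values, (race == `r') evaluates to 0 rather than missing, which you would need to handle.

```stata
* Sketch only: loop over the levels of a categorical variable.
webuse lbw, clear

* Collect the distinct values of race (1, 2, 3) in a local macro
levelsof race, local(levels)

* One product per level: smoke times an on-the-fly indicator for that level
foreach r of local levels {
    generate byte smoke_x_race`r' = smoke * (race == `r')
}

* Same specification as the manual regression above, leaving out level 3
regress low age lwt i.race smoke smoke_x_race1 smoke_x_race2
```

For the xtabond case, the same loop would apply with the lagged dependent variable in place of smoke, after the data have been xtset.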

William Lisowski

Join Date: Dec 2014

Posts: 10150
#6

13 Mar 2016, 15:19

Note that this same question was raised on Stack Overflow at http://stackoverflow.com/questions/3...-variable-in-x

Olivia, the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post, requests that you state when you have raised the same question in other forums. Before you start your next topic, you should review the FAQ, and especially sections 9-12 on how to best pose your question. It's particularly helpful to copy commands and output from your Stata Results window and paste them into your Statalist post using CODE delimiters, as described in section 12 of the FAQ. Saying "it didn't work" is unhelpful and leads to misdirected effort, as in Andrew's work in post #3, where your initial post made it easy to misunderstand what the precise problem was.