
  • xtabond2 syntax that excludes a few years of observations

    Dear all,

    I am trying to apply system GMM to a panel dataset of firms covering the 2006-2014 period. However, my interest lies in the 2008-2014 period, so I run the following command:

    Code:
    xtabond2 y L.Y x y z i.year if ser==0 & year>2008, /*
    */ robust twostep cluster(id) small /*
    */ gmm(L.Y, coll lag(1 2)) /*
    */ iv(x y z yr2-yr7)
    Afterwards, just out of curiosity, I dropped the 2006 and 2007 observations and then ran the command with slightly different syntax, expecting the same results:


    Code:
    xtabond2 y L.Y x y z i.year if ser==0, /*
    */ robust twostep cluster(id) small /*
    */ gmm(L.Y, coll lag(1 2)) /*
    */ iv(x y z yr2-yr7)
    However, the results differ!

    Does anyone know why?


  • #2
    Your first command uses a strict inequality, but the second does not. That is, the first uses 2009-2014, whereas the second (after dropping 2006 and 2007) uses 2008-2014.

    Compare the two conditions:
    Code:
    if year > 2008     // keeps 2009-2014
    if year >= 2008    // keeps 2008-2014



    • #3
      Dear Jesse,
      the results still differ.

      Here are the results before deleting 2006, 2007, and 2008 from the data, with xtabond2 restricted to the subset through an if condition. This time I explicitly exclude three years. The model diagnostics, as well as the basic description of the model, are below:

      Code:
      xtabond2.... if year!=2006 & year!=2007 & year!=2008
      
      Dynamic panel-data estimation, two-step system GMM
      ------------------------------------------------------------------------------
      Group variable: id                              Number of obs      =     26961
      Time variable : year                            Number of groups   =      6191
      Number of instruments = 27                      Obs per group: min =         1
      F(25, 6190)   =    797.53                                      avg =      4.35
      Prob > F      =     0.000                                      max =         6
                                           (Std. Err. adjusted for clustering on id)
      ------------------------------------------------------------------------------
      
      
      ------------------------------------------------------------------------------
      Arellano-Bond test for AR(1) in first differences: z = -17.53  Pr > z =  0.000
      Arellano-Bond test for AR(2) in first differences: z =  -0.41  Pr > z =  0.679
      ------------------------------------------------------------------------------
      Sargan test of overid. restrictions: chi2(1)    = 368.27  Prob > chi2 =  0.000
        (Not robust, but not weakened by many instruments.)
      Hansen test of overid. restrictions: chi2(1)    =  65.51  Prob > chi2 =  0.000
        (Robust, but weakened by many instruments.)
      Below are the results AFTER I drop all observations up to and including 2008 from the data:

      Code:
      drop if year<=2008
      xtabond2... 
      Dynamic panel-data estimation, two-step system GMM
      ------------------------------------------------------------------------------
      Group variable: id                              Number of obs      =     22060
      Time variable : year                            Number of groups   =      5994
      Number of instruments = 23                      Obs per group: min =         1
      F(22, 5993)   =    830.66                                      avg =      3.68
      Prob > F      =     0.000                                      max =         5
                                           (Std. Err. adjusted for clustering on id)
      ------------------------------------------------------------------------------
      
      
      ------------------------------------------------------------------------------
      Arellano-Bond test for AR(1) in first differences: z = -11.69  Pr > z =  0.000
      Arellano-Bond test for AR(2) in first differences: z =   3.80  Pr > z =  0.000
      ------------------------------------------------------------------------------
      Sargan test of overid. restrictions: chi2(0)    =  29.14  Prob > chi2 =      .
        (Not robust, but not weakened by many instruments.)
      Hansen test of overid. restrictions: chi2(0)    =   7.32  Prob > chi2 =      .
        (Robust, but weakened by many instruments.)
      As you can see, the numbers of groups and observations differ a lot. Could it be that Stata takes the difference between the 2009 and 2008 observations in the first case, even though I omit the 2008 observations? I am really confused.



      • #4
        I noticed that if I also exclude 2009 in the first case (xtabond2 ... year!=2006 & year!=2007 & year!=2008 & year!=2009), I get results more similar to those from dataset 2, which does not contain the 2006-2008 observations.
        Code:
        case 3 
         xtabond2.... if year!=2006 & year!=2007 & year!=2008 & year!=2009 
        Dynamic panel-data estimation, two-step system GMM
        ------------------------------------------------------------------------------
        Group variable: id                              Number of obs      =     22060
        Time variable : year                            Number of groups   =      5994
        Number of instruments = 25                      Obs per group: min =         1
        F(25, 5993)   =   1050.55                                      avg =      3.68
        Prob > F      =     0.000                                      max =         5
                                             (Std. Err. adjusted for clustering on id)
        ------------------------------------------------------------------------------
                     |              Corrected
                TFP1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
        
        
        ------------------------------------------------------------------------------
        Arellano-Bond test for AR(1) in first differences: z =  -2.36  Pr > z =  0.018
        Arellano-Bond test for AR(2) in first differences: z =  -0.42  Pr > z =  0.674
        ------------------------------------------------------------------------------
        Sargan test of overid. restrictions: chi2(-1)   =  79.81  Prob > chi2 =      .
          (Not robust, but not weakened by many instruments.)
        Hansen test of overid. restrictions: chi2(-1)   =  31.67  Prob > chi2 =      .
          (Robust, but weakened by many instruments.)
        When I exclude 2009 from the dataset containing the 2006-2014 observations, the bottom part of the output more closely resembles the results obtained with the dataset containing only 2009-2014. This is strange too. Could it be that Stata takes the last year excluded in the "if" condition as the base year and drops all prior years?
        The diagnostics, however, still differ from case 2.



        • #5
          I think the difference is due to the lag operator. If you use xtabond2 y L.y ... if year > 2008, then the 2008 values of y still exist in the data, so L.y is defined for 2009 and the 2009 observations can enter the regression. If you explicitly drop if year <= 2008, then L.y no longer exists for 2009, so the 2009 observations are lost as well.

          Example:

          Code:
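          clear
          set obs 10
          gen t = runiformint(0,1)    // random period: 0 or 1
          gen y = runiform()
          bysort t: gen i = _n        // unit id, numbered within each period
          xtset i t
          reg y L.y                   // L.y is defined only for t == 1 rows with a matching t == 0 row
          reg y L.y if t == 1         // same sample: the t == 0 rows still supply the lags
          drop if t == 0
          reg y L.y                   // no usable observations: the lags are gone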
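
          The same point can be checked on a balanced toy panel that mirrors the 2006-2014 setup. This is only a sketch: it uses regress rather than xtabond2, and the firm count, seed, and variable names (id, year, y) are assumptions made up for the illustration. It compares the estimation sample size when the early years are excluded with if against the size after they are physically dropped.

          ```stata
          clear
          set seed 12345                  // arbitrary seed, for reproducibility only
          set obs 5                       // hypothetical: 5 firms
          gen id = _n
          expand 9                        // 9 rows per firm
          bysort id: gen year = 2005 + _n // years 2006-2014
          xtset id year
          gen y = runiform()

          * Case A: restrict with if. The 2008 rows stay in memory,
          * so L.y is defined for 2009 and 2009 enters the sample.
          quietly regress y L.y if year > 2008
          display "if year > 2008:    N = " e(N)   // 6 years x 5 firms = 30

          * Case B: drop the early years first. L.y is now missing
          * for 2009, so the sample starts in 2010.
          preserve
          drop if year <= 2008
          quietly regress y L.y
          display "after drop:        N = " e(N)   // 5 years x 5 firms = 25
          restore
          ```

          This matches the pattern in the output posted above: a maximum of 6 observations per group with the if condition, against 5 after dropping, and 5 again once 2009 is also excluded through if.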
