Dear all,
I have been trying to estimate an employment equation with a system GMM estimator, using data from the World Input-Output Database, that is, industry-level data.
The equation follows the common form:
Number of employees = constant + L1.Num of employees + L1.wage + L1.Capital + L1.ValueAdded + error term
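Written out to match the command in the code below, the equation is roughly the following. This is only a sketch: the symbols are my own shorthand, all variables are in logs (as the ln_ prefixes indicate), the year dummies from i.year appear as tau_t, and splitting the error term into a group effect mu_i and an idiosyncratic part is the usual setup assumed here rather than something shown in the output.

```latex
\ln EMPE_{it} = \alpha
  + \rho \, \ln EMPE_{i,t-1}
  + \beta_1 \, \ln W_{i,t-1}
  + \beta_2 \, \ln K_{i,t-1}
  + \beta_3 \, \ln VA_{i,t-1}
  + \tau_t + \mu_i + \varepsilon_{it}
```

Here i indexes country-industry groups and t indexes years; W stands for the wage variable (ln_P_L_EMP in the code), K for capital, and VA for value added.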
I've tried to familiarize myself with GMM using the -webuse abdata- dataset, which allows estimation of an equation similar to the one above and which is also proposed in Roodman (2009). Since I wanted to estimate an employment equation, I thought this approach would also work with different data. However, after running the regression, the Hansen test of overidentifying restrictions indicates that the instruments are not valid (see code below). I've run different variations of the model (changing the lags or the set of endogenous variables), yet the Hansen test always rejects the validity of my instruments. So I am confused why the typical example for GMM (employment data) does not seem to work with industry-level data.
Can someone offer an explanation?
Thanks! Thomas
Code:
xtabond2 ln_EMPE l.ln_EMPE l.ln_P_L_EMP l.ln_K l.ln_VA i.year, gmm(l.ln_EMPE l.ln_P_L_EMP l.ln_K l.ln_VA, lag(3 5)) iv(i.year, equation(level)) twostep robust artest(3)
Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.
Warning: Two-step estimated covariance matrix of moments is singular.
Using a generalized inverse to calculate optimal weighting matrix for two-step estimation.
Difference-in-Sargan/Hansen statistics may be negative.
Dynamic panel-data estimation, two-step system GMM
------------------------------------------------------------------------------
Group variable: ctry_indus~y Number of obs = 30700
Time variable : year Number of groups = 2196
Number of instruments = 178 Obs per group: min = 3
Wald chi2(19) = 215606.58 avg = 13.98
Prob > chi2 = 0.000 max = 14
------------------------------------------------------------------------------
| Corrected
ln_EMPE | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ln_EMPE |
L1. | .9823159 .0147167 66.75 0.000 .9534718 1.01116
|
ln_P_L_EMP |
L1. | -.0401773 .0115665 -3.47 0.001 -.0628471 -.0175074
|
ln_K |
L1. | .0000743 .0080639 0.01 0.993 -.0157306 .0158793
|
ln_VA |
L1. | .0170818 .0139796 1.22 0.222 -.0103178 .0444814
|
year |
2000 | 0 (empty)
2001 | 0 (omitted)
2002 | .0002993 .003692 0.08 0.935 -.0069369 .0075354
2003 | -.0012104 .0030508 -0.40 0.692 -.0071899 .0047691
2004 | .0031081 .0038363 0.81 0.418 -.0044109 .010627
2005 | .0085499 .0047215 1.81 0.070 -.000704 .0178037
2006 | .0140131 .0045831 3.06 0.002 .0050305 .0229958
2007 | .0170023 .0046812 3.63 0.000 .0078273 .0261773
2008 | .006865 .0055576 1.24 0.217 -.0040278 .0177577
2009 | -.0336384 .0058694 -5.73 0.000 -.0451422 -.0221346
2010 | -.0129387 .004947 -2.62 0.009 -.0226347 -.0032426
2011 | .0077797 .0045261 1.72 0.086 -.0010913 .0166506
2012 | -.0006394 .0052275 -0.12 0.903 -.0108851 .0096062
2013 | .0009348 .0046395 0.20 0.840 -.0081585 .0100281
2014 | .0124441 .0049524 2.51 0.012 .0027377 .0221506
|
_cons | .0714095 .0232882 3.07 0.002 .0257656 .1170535
------------------------------------------------------------------------------
Instruments for first differences equation
GMM-type (missing=0, separate instruments for each period unless collapsed)
L(3/5).(L.ln_EMPE L.ln_P_L_EMP L.ln_K L.ln_VA)
Instruments for levels equation
Standard
2000b.year 2001.year 2002.year 2003.year 2004.year 2005.year 2006.year
2007.year 2008.year 2009.year 2010.year 2011.year 2012.year 2013.year
2014.year
_cons
GMM-type (missing=0, separate instruments for each period unless collapsed)
DL2.(L.ln_EMPE L.ln_P_L_EMP L.ln_K L.ln_VA)
------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z = -10.11 Pr > z = 0.000
Arellano-Bond test for AR(2) in first differences: z = 0.37 Pr > z = 0.711
Arellano-Bond test for AR(3) in first differences: z = 0.69 Pr > z = 0.488
------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(158) = 833.81 Prob > chi2 = 0.000
(Not robust, but not weakened by many instruments.)
Hansen test of overid. restrictions: chi2(158) = 345.94 Prob > chi2 = 0.000
(Robust, but weakened by many instruments.)
Difference-in-Hansen tests of exogeneity of instrument subsets:
GMM instruments for levels
Hansen test excluding group: chi2(114) = 228.93 Prob > chi2 = 0.000
Difference (null H = exogenous): chi2(44) = 117.01 Prob > chi2 = 0.000
iv(2000b.year 2001.year 2002.year 2003.year 2004.year 2005.year 2006.year 2007.year 2008.year 2009.year 2010.year 2011.year 2012.year 2013.year 2014.year, eq(level))
Hansen test excluding group: chi2(145) = 309.69 Prob > chi2 = 0.000
Difference (null H = exogenous): chi2(13) = 36.24 Prob > chi2 = 0.001
