  • Sebastian Kripfganz
    replied
    I am afraid the last xtdpdgmm update (version 2.3.11) was premature and did more harm than good. I "fixed" a bug that actually wasn't one. I apologize for this mishap.

    With the now available latest version 2.4.0, the correct computations have been restored for estat serial and estat hausman. Furthermore, a minor bug in option auxiliary, which was introduced in version 2.3.10, has been fixed.

    As a major new feature, this latest version can now compute the continuously-updating GMM estimator as an alternative to the two-step and iterated GMM estimators. Simply specify the new option cugmm. The CU-GMM estimator updates the weighting matrix simultaneously with the coefficient estimates while minimizing the objective function. This is in contrast to the iterated GMM estimator (of which the two-step estimator is a special case), which iterates back and forth between updating the coefficient estimates and the weighting matrix. As a technical comment: the CU-GMM objective function generally does not have a unique minimum, so the estimator can be sensitive to the choice of initial values. By default, xtdpdgmm uses the two-stage least squares estimates, ignoring any nonlinear moment conditions, as starting values for the numerical CU-GMM optimization. This seems to work fine.

    The following example illustrates the CU-GMM estimator, and how the xtdpdgmm results can be replicated with ivreg2 (up to minor differences due to the numerical optimization):
    Code:
    . webuse abdata
    
    . xtdpdgmm L(0/1).n w k, gmm(L.n w k, l(1 4) c m(d)) iv(L.n w k, d) cu nofooter
    
    Generalized method of moments estimation
    
    Fitting full model:
    
    Continuously updating:
    Iteration 0:   f(b) =  .22189289  
    Iteration 1:   f(b) =  .08073713  
    Iteration 2:   f(b) =  .07655265  
    Iteration 3:   f(b) =  .07646044  
    Iteration 4:   f(b) =  .07645679  
    Iteration 5:   f(b) =  .07645673  
    
    Group variable: id                           Number of obs         =       891
    Time variable: year                          Number of groups      =       140
    
    Moment conditions:     linear =      16      Obs per group:    min =         6
                        nonlinear =       0                        avg =  6.364286
                            total =      16                        max =         8
    
    ------------------------------------------------------------------------------
               n | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
               n |
             L1. |   .4342625   .1106959     3.92   0.000     .2173024    .6512225
                 |
               w |  -2.153388   .3702817    -5.82   0.000    -2.879126   -1.427649
               k |  -.0054155   .1221615    -0.04   0.965    -.2448477    .2340166
           _cons |   7.284639   1.123693     6.48   0.000     5.082241    9.487037
    ------------------------------------------------------------------------------
    
    . predict iv*, iv
     1, model(diff):
       L1.L.n L2.L.n L3.L.n L4.L.n L1.w L2.w L3.w L4.w L1.k L2.k L3.k L4.k
     2, model(level):
       D.L.n D.w D.k
     3, model(level):
       _cons
    
    . ivreg2 n (L.n w k = iv*), cue cluster(id) nofooter
    Iteration 0:   f(p) =  31.065005  (not concave)
    Iteration 1:   f(p) =  27.307398  (not concave)
    Iteration 2:   f(p) =  26.543788  (not concave)
    Iteration 3:   f(p) =  25.047573  (not concave)
    Iteration 4:   f(p) =  24.521102  (not concave)
    Iteration 5:   f(p) =  24.107293  (not concave)
    Iteration 6:   f(p) =  23.931765  (not concave)
    Iteration 7:   f(p) =  23.746613  (not concave)
    Iteration 8:   f(p) =  23.636564  
    Iteration 9:   f(p) =  23.304181  (not concave)
    Iteration 10:  f(p) =  23.241277  (not concave)
    Iteration 11:  f(p) =  23.178503  (not concave)
    Iteration 12:  f(p) =  23.125314  (not concave)
    Iteration 13:  f(p) =  23.074408  
    Iteration 14:  f(p) =  19.278726  
    Iteration 15:  f(p) =  12.160385  (not concave)
    Iteration 16:  f(p) =  11.700402  
    Iteration 17:  f(p) =   11.03222  (not concave)
    Iteration 18:  f(p) =  10.950583  (not concave)
    Iteration 19:  f(p) =  10.907663  
    Iteration 20:  f(p) =  10.800048  
    Iteration 21:  f(p) =  10.704051  
    Iteration 22:  f(p) =  10.703945  
    Iteration 23:  f(p) =  10.703942  
    Iteration 24:  f(p) =  10.703942  
    
    CUE estimation
    --------------
    
    Estimates efficient for arbitrary heteroskedasticity and clustering on id
    Statistics robust to heteroskedasticity and clustering on id
    
    Number of clusters (id) =          140                Number of obs =      891
                                                          F(  3,   139) =    83.84
                                                          Prob > F      =   0.0000
    Total (centered) SS     =  1601.042507                Centered R2   =   0.5099
    Total (uncentered) SS   =  2564.249196                Uncentered R2 =   0.6940
    Residual SS             =  784.7107633                Root MSE      =    .9385
    
    ------------------------------------------------------------------------------
                 |               Robust
               n | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
               n |
             L1. |   .4342987   .1003318     4.33   0.000     .2376521    .6309453
                 |
               w |  -2.153233   .2986292    -7.21   0.000    -2.738535    -1.56793
               k |  -.0053816   .1162739    -0.05   0.963    -.2332742    .2225111
           _cons |   7.284114   .8901409     8.18   0.000     5.539469    9.028758
    ------------------------------------------------------------------------------
    To update to the new version, type the following in Stata's command window:
    Code:
    net install xtdpdgmm, from(http://www.kripfganz.de/stata) replace
    Disclaimer: I have extensively tested this new version. However, due to the complexity of the command, the variety of options, and the lack of alternative software to compare the results for some advanced options, I cannot guarantee that the implementation is error-free. Please let me know if you spot any irregularities.

  • Sebastian Kripfganz
    replied
    If either X1 or X2 is endogenous, then it usually makes sense to assume that their interaction X3 is endogenous as well. You can then just treat it the same way as any other endogenous variable.
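
    For illustration, a minimal sketch of what this can look like with xtdpdgmm, assuming hypothetical variables y, x1, x2 and their interaction x3 (the lag range and the collapse choice are purely illustrative, not part of this reply):
    Code:
    * hypothetical endogenous regressors x1, x2 and their interaction x3
    generate x3 = x1 * x2
    
    * treat the endogenous interaction like any other endogenous regressor,
    * e.g. instrument it with lags 2 and deeper in the first-differenced model
    xtdpdgmm L(0/1).y x1 x2 x3, gmm(L.y x1 x2 x3, lag(2 4) collapse model(diff)) twostep vce(robust)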

  • Sarah Magd
    replied
    ############################################
    # Interaction variables with xtdpdgmm
    ############################################
    Can you please write a post about how we should instrument the interaction variable X3 (X1*X2) in the following two cases?
    Case 1: X1 and X2 are both endogenous
    Case 2: X1 is endogenous whereas X2 is predetermined

  • Sebastian Kripfganz
    replied
    In the case of system GMM, yes.

  • Sarah Magd
    replied
    Thanks, Prof. Sebastian Kripfganz
    In this case, do you mean that I should ignore the 1-step weighting matrix chi2 test and report the 2-step weighting matrix chi2 only?

  • Sebastian Kripfganz
    replied
    As the note says, both 1-step tests are asymptotically invalid because the one-step weighting matrix of the system GMM estimator is not optimal. That is especially true for the first of the two reported tests (1-step moment functions, 1-step weighting matrix). You should just ignore that test.

  • Sarah Magd
    replied
    Sebastian Kripfganz
    Dear Prof. Kripfganz,
    I have a question regarding "estat overid" after running the two-step and one-step system GMM. I run the same specification (i.e., with the same number of lags) and get different results for the overidentification test. For the one-step estimation, I never obtain an insignificant p-value for the test based on 1-step moment functions and the 1-step weighting matrix, even if I change the specification of the model. Please find the details of the output below. My sample has 27 cross-section units and 13 years.
    What does this conflicting result mean?

    ##########################
    # estat overid after the one-step estimation
    ##########################
    Sargan-Hansen test of the overidentifying restrictions
    H0: overidentifying restrictions are valid

    1-step moment functions, 1-step weighting matrix    chi2(21) = 122.9740 *
                                                        Prob > chi2 = 0.0000

    1-step moment functions, 2-step weighting matrix    chi2(21) = 27.9882 *
                                                        Prob > chi2 = 0.1405

    note: * asymptotically invalid if the one-step weighting matrix is not optimal
    ####################################
    # estat overid after the two-step estimation
    ####################################
    Sargan-Hansen test of the overidentifying restrictions
    H0: overidentifying restrictions are valid

    2-step moment functions, 2-step weighting matrix chi2(21) = 27.3389
    Prob > chi2 = 0.1599

    2-step moment functions, 3-step weighting matrix chi2(21) = 27.9939
    Prob > chi2 = 0.1403


  • Sebastian Kripfganz
    replied
    Addendum: For a simple example with industry dummies (not using the two-stage approach), see slide 86 of my 2019 London Stata Conference presentation:

  • Neyati Ahuja
    replied
    Thanks a lot.
    I will refer to the above paper and try to implement the xtseqreg command.

    Thank you.

  • Sebastian Kripfganz
    replied
    1. You can include "global" variables (invariant over cross sections) as long as you do not also include time dummies for each year. If those global variables do not account sufficiently for common time effects, then this might result in biased estimates of all coefficients.

    2. There is no simple way of analyzing the impact of global variables in the presence of time effects. (In theory, the same two-stage approach as in 3. below could be applied. In practice, due to the necessary correction of the standard errors, this is more complicated because the respective Stata command was not designed to do this for time effects.)

    3. In principle, the same issues as above apply to time-invariant variables in the presence of firm-fixed effects. We discuss potential solutions in the following paper: You can add appropriate instruments which are assumed to be uncorrelated with the firm-fixed effects, e.g. by assuming that the time-invariant variables themselves are uncorrelated with the fixed effects. This can be quite a strong assumption; if violated, it could bias all of the coefficient estimates. (With industry dummies, this should be fine. The firm-fixed effects would then need to be interpreted as firm-specific differences which are not due to being in a specific industry. You can then simply add the industry dummy as a regressor and a standard instrument for the level model, as in the sketch below.) To gain some robustness for the time-varying regressors, you could apply a two-stage procedure (also discussed in the above paper). The latter can be implemented with my xtseqreg command.
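
    A minimal sketch of that industry-dummy setup, assuming a hypothetical industry dummy ind and placeholder regressors y and x (none of these names, nor the lag choices, are from the original reply):
    Code:
    * industry dummy as a regressor with a standard instrument for the level model
    xtdpdgmm L(0/1).y x ind, gmm(L.y x, lag(2 4) collapse model(diff)) iv(ind, model(level)) twostep vce(robust)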
    Last edited by Sebastian Kripfganz; 02 Jun 2022, 04:30.

  • Neyati Ahuja
    replied
    Hello Prof. Sebastian

    I have a query about the variables that can be included in a dynamic panel data model.
    In my study I am working with firm-level data (N = 1,400 companies, T = 11 years). Besides the firm-level independent variables, I am also interested in analysing the impact of macroeconomic variables (country-specific, i.e. cross-section invariant but time-variant) on my dependent variable, which is firm performance.
    1. Can these time-variant, cross-section-invariant variables, e.g. GDP, be incorporated in the dynamic panel model along with my time-varying, cross-section-varying variables?
    2. If the answer to 1. is no, is there a way to analyse the impact of these cross-section-invariant variables?
    3. Can we incorporate cross-section-variant, time-invariant variables in the dynamic panel model, e.g. the industry to which a firm belongs, using a dummy variable? If yes, please elaborate.

    Regards

  • Sebastian Kripfganz
    replied
    Yes, option ue of predict gives you the usual residuals.
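
    For instance, following the thread's earlier abdata example (the instrument choices here are purely illustrative):
    Code:
    webuse abdata
    xtdpdgmm L(0/1).n w k, gmm(L.n w k, lag(2 4) collapse model(diff)) twostep vce(robust)
    predict res, ue    // usual residuals y - xb (unit-specific effect plus idiosyncratic error)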

  • Sarah Magd
    replied
    Thanks, Prof Sebastian Kripfganz,

    - Yes, when I run two different panel unit root tests, they give contradictory results for the stationarity of ln_GDPc.

    - I want to make sure that when I use "predict residual, ue" after estimating the model with xtdpdgmm, this gives the residuals. Am I right?
    Last edited by Sarah Magd; 01 Jun 2022, 07:25.

  • Sebastian Kripfganz
    replied
    There is an issue at the conceptual level: it does not really make sense to model the effect of a nonstationary variable on a stationary variable (if you believe the stationarity tests). Such a model would be unbalanced in terms of the integration orders. I recommend the following:

    Case 3: Regress REN on ln_GDPc and L.ln_GDPc. By adding the lag of ln_GDPc, the model in Case 2 becomes nested in Case 3. Case 2 is essentially the same model as Case 3, but with the restriction that the coefficients of ln_GDPc and L.ln_GDPc are identical with opposite signs. If you estimate Case 3 and find that the estimates support this restriction, you can go for the simplified Case 2. Otherwise, I recommend sticking to Case 3.

    Case 4: Equivalently, regress REN on ln_GDPc and Δln_GDPc. If the coefficient of ln_GDPc is zero, then again Cases 2, 3, and 4 are equivalent.
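
    A sketch of that nesting check, using the post's variable names REN and ln_GDPc (the instruments in gmm() are placeholders, not a recommendation):
    Code:
    * Case 3: unrestricted model with the level and its lag
    xtdpdgmm L(0/1).REN ln_GDPc L.ln_GDPc, gmm(L.REN ln_GDPc, lag(2 4) collapse model(diff)) twostep vce(robust)
    
    * Case 2 imposes equal coefficients with opposite signs;
    * a non-rejection supports the simplified Case 2
    test _b[ln_GDPc] + _b[L.ln_GDPc] = 0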

    With a small time horizon, you do not need to worry about stationarity for the reasons it causes trouble in time-series models (nonstandard distributions, etc.). However, as mentioned in my previous post, nonstationarity can lead to invalid instruments for the level model and weak instruments in general.

    Sometimes, it helps to include a deterministic time trend in the regression (and, importantly, in the unit-root tests). Otherwise, the unit-root tests might mistake the deterministic time trend for a unit root.
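
    For example, with Stata's built-in Fisher-type panel unit-root test (the lag length here is purely illustrative):
    Code:
    * include a deterministic time trend so the test does not mistake it for a unit root
    xtunitroot fisher ln_GDPc, dfuller trend lags(1)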

  • Sarah Magd
    replied
    Thanks very much, Prof. Sebastian Kripfganz for your reply,
    I just realized that I described my model incorrectly. However, I think your answer in #406 could guide us if our dependent variable is nonstationary in level.

    I am regressing renewable energy consumption (REN) on the log of GDP per capita (ln_GDPc), using the two-step system GMM estimator for a sample of 26 countries over the period 2005 to 2017. The panel unit-root analysis shows that ln_GDPc is stationary in first differences, whereas REN is stationary in levels. (So the independent variable in the model, ln_GDPc, is nonstationary in levels.)

    This leads to two cases:
    - Case 1: If I regress REN on ln_GDPc in levels: the coefficient of ln_GDPc is 0.3 and statistically significant.
    - Case 2: If I regress REN on the first difference of ln_GDPc (Δln_GDPc): the coefficient of Δln_GDPc becomes 0.9 and is statistically significant.

    In this context, are the two cases equivalent to each other? Should I care about stationarity, given the small time period I have?
