  • XTDPDBC: new Stata command for bias-corrected estimation of linear dynamic panel data models

    Dear Statalisters,

    Linear dynamic panel data models are commonly estimated by GMM (for example with my command xtdpdgmm). Yet, when all regressors (besides the lagged dependent variable) are strictly exogenous, more efficient alternatives are available. Besides maximum likelihood estimation (for example with my command xtdpdqml), an estimator that directly corrects the dynamic panel data bias (a.k.a. Nickell bias) of the conventional fixed-effects (FE) estimator can be quite attractive because it typically retains the small variance of the FE estimator compared to GMM estimators.

    My new command, xtdpdbc, implements the bias-corrected method of moments estimator described by Breitung, Kripfganz, and Hayakawa (2021). It analytically corrects the first-order condition of the FE estimator, which leads to a set of nonlinear moment conditions that can be solved with conventional numerical methods (Gauss-Newton). Another advantage of this procedure is that a formula of the asymptotic variance-covariance matrix for the calculation of standard errors is readily available, unlike the bias-corrected estimator by Kiviet (1995) that is implemented in the community-contributed xtlsdvc command by Bruno (2005).
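    The Gauss-Newton idea can be illustrated in miniature. The following toy Mata snippet applies Newton updates to a single made-up nonlinear moment condition m(b) = 0; it is purely illustrative and is not xtdpdbc's actual code, where the moment conditions are multivariate:
    Code:
    mata:
    // toy nonlinear moment condition m(b) = b^3 - b - 1 = 0,
    // solved by Newton updates b <- b - m(b)/m'(b)
    b = 1.5
    for (i = 1; i <= 20; i++) {
        m = b^3 - b - 1
        g = 3*b^2 - 1
        b = b - m/g
        if (abs(m) < 1e-12) break
    }
    b    // converges to the real root, approximately 1.3247
    end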

    Yet another advantage is that the estimator can accommodate higher-order lags of the dependent variable. Moreover, the moment conditions can be adjusted to create a random-effects (RE) version of the estimator, assuming that all (or some) of the exogenous regressors are uncorrelated with the unobserved group-specific effects. This RE version is not yet implemented in xtdpdbc, but will be added in due course.

    It turns out that under the FE assumption the bias-corrected method of moments estimator is equivalent to the adjusted profile likelihood estimator of Dhaene and Jochmans (2016). Furthermore, if there is only a single lag of the dependent variable, it is also equivalent to the bias-corrected estimator of Bun and Carree (2005).

    It should be noted that due to the nonlinearity of the bias-corrected moment functions, the estimator in general has multiple solutions and the numerical algorithm may not always converge to the correct one. The correct solution is characterized by a negativity condition on the gradient, i.e. all eigenvalues of the gradient should be negative. In the current version of xtdpdbc, the command will display a note if the gradient has positive eigenvalues. In that case, the estimation should be repeated with different starting values (using the from() option) until the correct solution is found. Starting values for the coefficient of the lagged dependent variable should typically be varied over the interval [0, 1]. Starting values for the exogenous regressors do not matter much.
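    A grid search over starting values can be scripted along the following lines. This is only a hypothetical sketch: the exact argument format of the from() option is documented in the help file, and the simple scalar argument used here is an assumption:
    Code:
    * vary the starting value of the first autoregressive coefficient over [0, 1]
    forvalues i = 0/10 {
        local rho = `i'/10
        display as text "starting value for L.lwage: `rho'"
        capture noisily xtdpdbc lwage wks south smsa ms exp exp2 occ ind union, ///
            lags(2) from(`rho')
        * accept the solution once no note about positive eigenvalues is displayed
    }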

    In some cases, the numerical algorithm might not converge because the criterion function is almost flat. It might then help to simplify the optimization problem by concentrating out the coefficients of the exogenous regressors with the concentration option. If that does not help, formal convergence can sometimes still be achieved by specifying the nonrtolerance option, although the results obtained this way might not be very robust.

    Last but not least, the command also supports unbalanced panel data.

    To install the command, type the following in Stata's command window:
    Code:
    net install xtdpdbc, from(http://www.kripfganz.de/stata/)
    Please see the help file for the fairly standard command syntax and the available options:
    Code:
    help xtdpdbc
    Here is an example with second-order autoregressive dynamics:
    Code:
    . webuse psidextract
    
    . xtdpdbc lwage wks south smsa ms exp exp2 occ ind union, lags(2)
    
    Bias-corrected estimation
    Iteration 0:   f(b) =  .00415219  
    Iteration 1:   f(b) =  7.766e-06  
    Iteration 2:   f(b) =  2.040e-09  
    Iteration 3:   f(b) =  2.132e-16  
    
    Group variable: id                           Number of obs         =      2975
    Time variable: t                             Number of groups      =       595
    
                                                 Obs per group:    min =         5
                                                                   avg =         5
                                                                   max =         5
    
    ------------------------------------------------------------------------------
           lwage | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
           lwage |
             L1. |   .2777891   .0713708     3.89   0.000     .1379049    .4176733
             L2. |   .0777857   .0411693     1.89   0.059    -.0029045     .158476
                 |
             wks |  -.0000815   .0014887    -0.05   0.956    -.0029992    .0028363
           south |   .0828634   .0950579     0.87   0.383    -.1034466    .2691735
            smsa |  -.0304335   .0293295    -1.04   0.299    -.0879182    .0270513
              ms |  -.0096381   .0294365    -0.33   0.743    -.0673326    .0480565
             exp |     .06042    .012486     4.84   0.000     .0359478    .0848921
            exp2 |  -.0002095   .0001089    -1.92   0.054    -.0004229    3.86e-06
             occ |   -.029654   .0222952    -1.33   0.183    -.0733517    .0140437
             ind |   .0189437    .025248     0.75   0.453    -.0305414    .0684289
           union |  -.0044655    .030205    -0.15   0.882    -.0636661    .0547351
           _cons |   3.283092   .5078034     6.47   0.000     2.287815    4.278368
    ------------------------------------------------------------------------------
    Any comments and suggestions are welcome.

    https://twitter.com/Kripfganz

  • #2
    Sebastian,

    Thanks for creating another new Stata command that I believe will gain wide interest and use.

    I have tried using xtdpdbc with a new project of mine that investigates the political determinants of various aspects of human rights in the developing world. However, I have run into a problem, one that I have encountered with other projects whenever I include a lagged dependent variable in my models, regardless of what methodology I use (e.g., xtreg, xtdpdgmm, xtdpdqml, and now, xtdpdbc). The problem is that adding one or more LDVs causes the coefficients of the independent variables of interest to decline to implausibly low levels (compared with what I get without an LDV and what I expect from theory).

    I see from p. 91 of your 2019 London Stata Conference presentation explaining xtdpdgmm that you recommend adding one or more lags of the independent variables. Although that recommendation is made for a specific reason (possible correlation between instrument lags and the error term), does it also hold for more general reasons when using xtdpdbc?

    I note that this journal article recommends trying one or more lags of independent variables when using LDVs to solve the problem I am describing: Wilkins, Arjun S. 2018. To Lag or Not to Lag?: Re-Evaluating the Use of Lagged Dependent Variables in Regression Analysis. Political Science Research and Methods 6(2): 393-411.

    I tried adding a first lag to each of my independent variables of interest and doing so brings the coefficients of the unlagged variables up to plausible levels. However, I am concerned about collinearity between the lagged and unlagged variables when I do this because the signs of the coefficients for the lagged variables flip to negative and I get very high VIF scores when I test for multicollinearity. I show below the regression results I get without and with lags for the independent variables and the VIF results.

    Code:
     . xtdpdbc y x1 x2 x3 if y_l1~=. & y_l2~=. & x1_l1~=. & x2_l1~=. & x3_l1~=. & year>1979, fe lags(2)
    
    Bias-corrected estimation
    Iteration 0:   f(b) =  .00031196  
    Iteration 1:   f(b) =  5.466e-07  
    Iteration 2:   f(b) =  5.337e-12  
    Iteration 3:   f(b) =  5.415e-22  
    
    Group variable: ccode                        Number of obs         =      5088
    Time variable: year                          Number of groups      =       144
    
                                                 Obs per group:    min =         5
                                                                   avg =  35.33333
                                                                   max =        39
    
    ------------------------------------------------------------------------------
               y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
               y |
             L1. |   .9120774   .0230568    39.56   0.000     .8668869    .9572678
             L2. |  -.0380304    .017669    -2.15   0.031     -.072661   -.0033999
                 |
              x1 |   .0170865   .0051434     3.32   0.001     .0070056    .0271674
              x2 |   .0552508   .0160587     3.44   0.001     .0237763    .0867252
              x3 |   .0622974   .0196529     3.17   0.002     .0237785    .1008163
           _cons |  -.0005421    .010707    -0.05   0.960    -.0215275    .0204432
    ------------------------------------------------------------------------------
    
    . xtdpdbc y x1 x1_l1 x2 x2_l1 x3 x3_l1 if y_l1~=. & y_l2~=. & x1_l1~=. & x2_l1~=. & x3_l1~=. & year>1979, fe lags(2)
    
    Bias-corrected estimation
    Iteration 0:   f(b) =  .00031894  
    Iteration 1:   f(b) =  4.215e-08  
    Iteration 2:   f(b) =  3.750e-13  
    
    Group variable: ccode                        Number of obs         =      5088
    Time variable: year                          Number of groups      =       144
    
                                                 Obs per group:    min =         5
                                                                   avg =  35.33333
                                                                   max =        39
    
    ------------------------------------------------------------------------------
               y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
               y |
             L1. |   .9382101   .0209673    44.75   0.000      .897115    .9793052
             L2. |  -.0031981   .0184993    -0.17   0.863     -.039456    .0330599
                 |
              x1 |   .0460023   .0128886     3.57   0.000     .0207411    .0712635
           x1_l1 |  -.0371751   .0117679    -3.16   0.002    -.0602397   -.0141104
              x2 |   .2712156   .0439953     6.16   0.000     .1849864    .3574447
           x2_l1 |  -.2536297   .0419765    -6.04   0.000    -.3359021   -.1713573
              x3 |   .3128306    .056701     5.52   0.000     .2016987    .4239625
           x3_l1 |  -.2910625   .0517423    -5.63   0.000    -.3924755   -.1896496
           _cons |    .013027   .0058651     2.22   0.026     .0015317    .0245223
    ------------------------------------------------------------------------------
    
    
    
     . collin y_l1 y_l2 x1 x1_l1 x2 x2_l1 x3 x3_l1 if year>1979
    (obs=5,088)
    
      Collinearity Diagnostics
    
                            SQRT                   R-
      Variable      VIF     VIF    Tolerance    Squared
    ----------------------------------------------------
          y_l1     38.99    6.24    0.0256      0.9744
          y_l2     37.66    6.14    0.0266      0.9734
            x1     14.56    3.82    0.0687      0.9313
         x1_l1     14.62    3.82    0.0684      0.9316
            x2     31.44    5.61    0.0318      0.9682
         x2_l1     31.51    5.61    0.0317      0.9683
            x3     29.84    5.46    0.0335      0.9665
         x3_l1     29.84    5.46    0.0335      0.9665
    ----------------------------------------------------
      Mean VIF     28.56
    Based on correspondence I had with Arjun Wilkins, I calculated AIC and BIC scores for the models without and with lagged independent variables. I had to do this using xtreg, fe, because xtdpdbc would not produce results when I ran estat ic. I show the results below. As can be seen, the AIC and BIC scores become substantially better when I add the lagged independent variables. As I understand it, if the AIC and BIC scores improve with the lagged variables present, this indicates a better model fit, thereby lending confidence that the higher coefficients of the unlagged variables are real and not mere artifacts of multicollinearity. But I'm not certain of this and not yet confident that I can rely on the coefficient results I obtain after adding the lags.


    Code:
      qui xtreg y y_l1 y_l2 x1 x2 x3  if y_l1~=. & y_l2~=. & x1_l1~=. & x2_l1~=. & x3_l1~=. & year>1979, fe 
    
    . estat ic
    
    Akaike's information criterion and Bayesian information criterion
    
    -----------------------------------------------------------------------------
           Model |          N   ll(null)  ll(model)      df        AIC        BIC
    -------------+---------------------------------------------------------------
               . |      5,088   3782.729   9240.922       6  -18469.84  -18430.64
    -----------------------------------------------------------------------------
    Note: BIC uses N = number of observations. See [R] BIC note.
    
     . qui xtreg y y_l1 y_l2 x1 x1_l1 x2 x2_l1 x3 x3_l1 if y_l1~=. & y_l2~=. & x1_l1~=. & x2_l1~=. & x3_l1~=. & year>1979, fe
    
    . estat ic
    
    Akaike's information criterion and Bayesian information criterion
    
    -----------------------------------------------------------------------------
           Model |          N   ll(null)  ll(model)      df        AIC        BIC
    -------------+---------------------------------------------------------------
               . |      5,088   3782.729   9569.201       9   -19120.4  -19061.59
    -----------------------------------------------------------------------------
    Note: BIC uses N = number of observations. See [R] BIC note.
    I would appreciate any advice you, or anyone else on Statalist, can provide to me.

    Joe



    • #3
      Originally posted by Joseph L. Staats:
      I have tried using xtdpdbc with a new project of mine that investigates the political determinants of various aspects of human rights in the developing world. However, I have run into a problem, one that I have encountered with other projects whenever I include a lagged dependent variable in my models, regardless of what methodology I use (e.g., xtreg, xtdpdgmm, xtdpdqml, and now, xtdpdbc). The problem is that adding one or more LDVs causes the coefficients of the independent variables of interest to decline to implausibly low levels (compared with what I get without an LDV and what I expect from theory).
      The coefficients have a different interpretation in static models (without a lagged dependent variable) than in dynamic models (with at least one lag of the dependent variable). In static models, they could be interpreted as long-run coefficients (although their estimates might suffer from omitted-variable bias due to the omitted LDV). In dynamic models, they would be interpreted as short-run coefficients. The corresponding long-run coefficients (which are typically larger than the short-run coefficients) can be obtained as follows:
      Code:
      nlcom (_b[x] + _b[L.x]) / (1 - _b[L.y] - _b[L2.y])
      Originally posted by Joseph L. Staats:
      I see from p. 91 of your 2019 London Stata Conference presentation explaining xtdpdgmm that you recommend adding one or more lags of the independent variables. Although that recommendation is made for a specific reason (possible correlation between instrument lags and the error term), does it also hold for more general reasons when using xtdpdbc?
      For xtdpdbc, absence of serial correlation in the error term is just as important as it is for xtdpdgmm. Otherwise, the lagged dependent variable would be correlated with the idiosyncratic error term, and the bias correction does not correct for that. Adding further lags of the dependent variable (and of the independent variables) aims to obtain a model that is "dynamically complete". With GMM, we might still be able to find valid instruments when the model is not dynamically complete, but the bias-correction approach depends critically on this assumption.

      Originally posted by Joseph L. Staats:
      I note that this journal article recommends trying one or more lags of independent variables when using LDVs to solve the problem I am describing: Wilkins, Arjun S. 2018. To Lag or Not to Lag?: Re-Evaluating the Use of Lagged Dependent Variables in Regression Analysis. Political Science Research and Methods 6(2): 393-411.
      Including lags of the independent variables serves the same purpose as adding further lags of the dependent variable: to obtain a dynamically complete model; in other words, to proxy for the serial correlation in the idiosyncratic error term.

      Originally posted by Joseph L. Staats:
      I tried adding a first lag to each of my independent variables of interest and doing so brings the coefficients of the unlagged variables up to plausible levels. However, I am concerned about collinearity between the lagged and unlagged variables when I do this because the signs of the coefficients for the lagged variables flip to negative and I get very high VIF scores when I test for multicollinearity. I show below the regression results I get without and with lags for the independent variables and the VIF results.
      Notice that the sum of the coefficients of the unlagged and lagged independent variables is of similar magnitude to the corresponding coefficient in the model without the added lags. Adding lags allows for richer short-run dynamics: you might have a stronger contemporaneous effect and an offsetting delayed effect, but the combined effect is still similar to the case where you allow only for a contemporaneous effect. The key quantity of interest often remains the long-run effect, as outlined above.
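      For example, with the variable names from the second regression in post #2, the combined short-run effect of x1 can be computed directly after estimation (a sketch; lincom works with any stored estimation results):
      Code:
      * combined contemporaneous and one-period-delayed short-run effect of x1
      lincom _b[x1] + _b[x1_l1]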

      Originally posted by Joseph L. Staats:
      Based on correspondence I had with Arjun Wilkins, I calculated AIC and BIC scores for the models without and with lagged independent variables. I had to do this using xtreg, fe, because xtdpdbc would not produce results when I ran estat ic. I show the results below. As can be seen, the AIC and BIC scores become substantially better when I add the lagged independent variables. As I understand it, if the AIC and BIC scores improve with the lagged variables present, this indicates a better model fit, thereby lending confidence that the higher coefficients of the unlagged variables are real and not mere artifacts of multicollinearity. But I'm not certain of this and not yet confident that I can rely on the coefficient results I obtain after adding the lags.
      Leaving aside the complications of obtaining AIC/BIC criteria after xtdpdbc, these criteria can indeed be useful for deciding on the inclusion of lagged regressors; this has a long-standing tradition in time-series econometrics. Similar criteria are available after GMM estimation (see estat mmsc after xtdpdgmm). Regarding the collinearity, you can look at it from a different perspective: you can obtain an equivalent regression by including D.x instead of L.x. This is merely a different parameterization of the same model, but clearly there would be no concern about collinearity between D.x and x. Note that in this parameterization you would calculate the long-run coefficient based only on the coefficient of x, not D.x:
      Code:
      nlcom _b[x] / (1 - _b[L.y] - _b[L2.y])
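      As a sketch with the variables from post #2, the reparameterized regression and the corresponding long-run coefficient would look as follows (whether xtdpdbc accepts the D. operator directly in the varlist is an assumption; alternatively, create the first differences manually):
      Code:
      * equivalent model: first differences instead of lags of the regressors;
      * the coefficient of each undifferenced x now equals the sum of the two
      * short-run coefficients from the lag specification
      xtdpdbc y x1 D.x1 x2 D.x2 x3 D.x3 if year > 1979, fe lags(2)
      nlcom _b[x1] / (1 - _b[L.y] - _b[L2.y])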



      • #4
        Thanks so much for responding in such a clear and comprehensive way. I am now much better informed about using LDVs and calculating short- and long-run effects, and I feel much more confident going forward with my current project.

        In working with xtdpdbc, I notice that it (as well as xtdpdqml) doesn't allow use of factor-variable operators, whereas xtdpdgmm does. I often use factor-variable operators for handling interactions and creating higher-order variables, especially when using the margins command for various purposes. Is adding this feature to xtdpdbc at all feasible?

        Thanks again.



        • #5
          Factor variables are a programmer's nightmare, which is why I usually do not bother with them in the early stages of a new package. Yet, motivated by your request, I have now implemented support for factor variables in xtdpdbc. The latest update to version 1.1.0 also adds the new option small for a small-sample degrees-of-freedom correction of the standard errors and the reporting of small-sample test statistics.

          Code:
          adoupdate xtdpdbc, update
          Please let me know if you observe any unexpected behavior or error messages when using the command with factor variables.



          • #6
            I figured it probably wasn't that simple to add factor variables and applaud your willingness to do so for xtdpdbc. I just tried the new feature to create and then graph higher-order variables using margins and marginsplot and everything worked as it should. I'll let you know if I encounter any problems, but I don't expect I will.
