
  • Question about using xtabond2 to simultaneously capture short and long-run relationship between an explanatory and an outcome variable

    Dear Stata community,

    I would like to use GMM (e.g. xtabond2 or a similar command) to assess the short-run and long-run impact of inequality on growth simultaneously. The outcome variable is growth, and there are two explanatory variables of interest. The first is the Gini coefficient at the start of each 5-year period. The second is the first observed Gini coefficient for each country (this variable is called initial_gini). The code I ran is below:

    xtabond2 growth L.(gini initial_gini lgdp syrm syrf pi) _Iper*, gmm(L.(syrm syrf pi gini lgdp)) iv(_Iper*) two small nol robust

    The code runs, but the second explanatory variable is omitted. Is there a way to prevent it from being omitted? I presume the problem is that, for each country, initial_gini takes the same value in every observation. Is there a way to do this using either xtabond2 or a similar command? More broadly, any advice on how to use two explanatory variables, one capturing the short-run impact and the other the long-run impact, would be greatly appreciated.

    The warning that Stata gives me reads: Warning: "Two-step estimated covariance matrix of moments is singular. Using a generalized inverse to calculate optimal weighting matrix for two-step estimation. Difference-in-Sargan/Hansen statistics may be negative."

    Thank you

  • #2
    initial_gini is a time-invariant regressor. It is removed from the model when applying the first-difference transformation. Its coefficient can only be identified from the model in levels, but you have to make strong (over-)identifying assumptions. For example, you could assume that initial_gini is uncorrelated with the unobserved country-specific effects (which is probably implausible) and use it as an instrument for itself. Alternatively, you need to find other instruments.
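
    The point about the first-difference transformation can be checked on a toy panel. The sketch below is only an illustration in Python, not Stata and not xtabond2 itself; the two countries and all values are made up, and the helper name first_difference is hypothetical.

```python
# Hypothetical toy panel: 2 countries, 3 periods each (all values are made up).
# initial_gini repeats the first observed gini, so it is constant within country.
panel = {
    "A": {"gini": [40.0, 42.0, 41.0], "initial_gini": [40.0, 40.0, 40.0]},
    "B": {"gini": [30.0, 29.0, 33.0], "initial_gini": [30.0, 30.0, 30.0]},
}


def first_difference(series):
    """Return x_t - x_{t-1} for every period after the first."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


# Apply the first-difference transformation within each country.
diffs = {
    country: {name: first_difference(values) for name, values in variables.items()}
    for country, variables in panel.items()
}

for country, variables in diffs.items():
    print(country, variables)
```

    The differenced initial_gini is zero in every row, so the differenced equation carries no information about its coefficient, while the time-varying gini keeps its variation.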

    Note: The first differences of the time-varying regressors that are usually used as instruments for the level model in a system-GMM estimation are not helpful for identifying the coefficient of initial_gini. By assumption, these first differences are uncorrelated with the unobserved time-invariant effects (and effectively with any time-invariant variable). Even though Stata would produce a number for the coefficient estimate of initial_gini, this estimate is not reliable at all.
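
    A small simulation can make the instrument-relevance problem concrete. This is a minimal sketch under an assumed data-generating process (a regressor x that starts at a time-invariant value z and then accumulates independent shocks); it is not the poster's data and does not reproduce what xtabond2 computes. All names and parameter values are hypothetical.

```python
import random

random.seed(12345)  # fixed seed so the run is reproducible


def correlation(xs, ys):
    """Pearson correlation of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


N_UNITS, N_PERIODS = 2000, 6  # assumed panel dimensions

levels, z_for_levels = [], []  # pooled x_it with the matching z_i
diffs, z_for_diffs = [], []    # pooled first differences of x with the matching z_i

for _ in range(N_UNITS):
    z = random.gauss(0, 1)  # time-invariant variable (plays the role of initial_gini)
    x = z                   # the level of x inherits z ...
    unit_levels = []
    for _ in range(N_PERIODS):
        x += random.gauss(0, 1)  # ... and then accumulates shocks independent of z
        unit_levels.append(x)
    for value in unit_levels:
        levels.append(value)
        z_for_levels.append(z)
    for prev, curr in zip(unit_levels, unit_levels[1:]):
        diffs.append(curr - prev)
        z_for_diffs.append(z)

corr_level = correlation(levels, z_for_levels)
corr_diff = correlation(diffs, z_for_diffs)

print("corr(x, z)  =", round(corr_level, 3))
print("corr(Dx, z) =", round(corr_diff, 3))
```

    Under these assumptions the levels of x are clearly correlated with z, while the first differences are close to uncorrelated with it. An instrument that is unrelated to the variable it is meant to identify carries no identifying information, which is why the reported coefficient on a time-invariant regressor cannot be trusted in this setting.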

    Further background information:
    Kripfganz and Schwarz (2015). Estimation of linear dynamic panel data models with time-invariant regressors. ECB Working Paper 1838, European Central Bank.
    https://twitter.com/Kripfganz



    • #3
      Hi Sebastian Kripfganz,

      This makes perfect sense. I will take a more thorough read through the paper you sent. Will it provide insight on how I can assess inequality's short-run and long-run impact on growth simultaneously in one model? Or do you have any suggestions as to how I can do that?

      Thank you!



      • #4
        The paper does not address your particular question regarding the long-run and short-run effects of inequality. I recommend searching the related literature for articles that address a similar question.


        • #5
          Sebastian Kripfganz,

          Hello, I'm Jane. I'm encountering some issues with the xtabond2 command and would appreciate your help. I'm using fixed effects (FE) and generalized method of moments (GMM) estimation methods for a gravity model. My question is about the treatment of time-invariant data, such as distance, in both estimation techniques.

          Should time-invariant data, like distance, be omitted from both FE and GMM estimation results? I've noticed that while the distance variable is omitted in the FE command, it's not excluded in the GMM command automatically. Below are the commands I've been using:

          >>FE command: xtreg logrealagritotal L.(logrealagritotal) logimmi1 logdist logecondist logexrate i.year, fe vce(robust)

          >>GMM command: xtabond2 logrealagritotal L.logrealagritotal logimmi1 logdist logecondist logexrate i.year, gmm(L.(logrealagritotal logimmi1)) iv(logdist logecondist logexrate i.year, equation(diff)) robust h(2) nocons artests(3) twostep small


          Should I exclude the logdist variable manually in the GMM command before I run it?

          Thank you in advance for your assistance.



          • #6
            If distance is not a key variable of interest, then it would indeed be better to remove it from the model specification. Otherwise, the system GMM estimator will yield potentially spurious estimates for this coefficient, which might be misleading.

            If you are actually interested in the effect of this variable, you would need to pursue a more careful modeling approach as outlined in my research paper with Claudia Schwarz (see the reference in #2).



            • #7
              Thank you so much for the insightful answer!

              Best regards.
