  • #16
    Originally posted by Sebastian Kripfganz View Post
    Unbalanced panel data should not be a problem. If Stata takes a long time to compute the results with the large data set, it is just because of the total number of observations, not because the data set is unbalanced.

    With your reduced data set, there are only 3 consecutive time periods used in the estimation. Note that you effectively lose 2 time periods because of the lags of the dependent variable. To compute an AR(2) test statistic, you need more than those 3 effective time periods.

    Dear Dr. Kripfganz

    Many thanks indeed for your valuable comment and suggestion.

    As you mentioned, “Note that you effectively lose 2 time periods because of the lags of the dependent variable. To compute an AR(2) test statistic, you need more than those 3 effective time periods”.
    I took your advice into consideration. In the balanced panel (smaller number of observations), I filtered the data so that each company has 6 consecutive years (lustrums), which works well given that my model uses the second lag of the dependent variable. I have several scenarios with different lags of the independent variables. I implemented controls for year (dummies called “dano”) and company size (Micro, Small, Medium). For simplicity of analysis, I exported the results to a Word table to compare the scenarios (the table includes the Hansen test, Sargan test, AR(1), and AR(2)).

    Dr. Kripfganz, could you kindly advise on how to decide (is there a protocol?) which scenario(s) are best?

    Thank you very much again,

    Kind regards,

    Paul.
    Attached Files



    • #17
      You can use the Hansen test and difference-in-Hansen tests, as well as the Andrews-Lu model and moment selection criteria, to discriminate between models. Please have a look at the section on "Model selection" in my presentation at the 2019 London Stata Conference and the paper by Kiviet (2020) that is referenced in the presentation:
      https://www.kripfganz.de/stata/



      • #18
        Dear Dr. Kripfganz

        Thank you again for your valued comments.

        I tried to follow the recommendation on model selection from your 2019 London Stata Conference presentation (slide 90).
        However, I ran into a discrepancy.
        Below are the models implemented with xtdpdgmm and xtabond2.

        First, I tried another angle and used the command xtdpdgmm, as suggested in the presentation.

        set more off
        xtdpdgmm Wroa1 l.Wroa1 L2.Wroa1 Winvperiod1 l.Wpayperiod1 Warperiod1 l.Wcurrasstotasset1 Micro Small Medium, ///
        twostep iv(GDPgrowth) gmm(l.Winvperiod1 l.Wpayperiod1 l.Warperiod1, lag(1 2)) vce(robust) teffects
        estat overid
        estat serial, ar(1/3)
        estimate store xtdpdgmmctrYS1
        estat mmsc xtdpdgmmctrYS1

        Second, so far I have been working with xtabond2, and my understanding is that xtabond2 and xtdpdgmm should produce the
        same results with equivalent coding. I tried to replicate the equations, but the results were not the same.
        On slide 6 “Equivalent system-GMM implementations in Stata” I noticed the following structural command syntax:

        xtabond2 L(0/1).n w k, gmm(L.n w k, lag(1 3)) h(2) two
        xtdpdgmm L(0/1).n w k, gmm(L.n w k, l(1 3) m(d)) gmm(L.n w k, d l(0 0)) w(ind) two

        Maybe the models did not yield the same results because the highlighted terms are not familiar to me, or because I needed to specify additional options in the code.
        I would appreciate your kind help with this.

        set more off
        global danolist dano*
        xtabond2 Wroa1 l.Wroa1 L2.Wroa1 Winvperiod1 l.Wpayperiod1 Warperiod1 l.Wcurrasstotasset1 Micro Small Medium $danolist, ///
        twostep ivstyle(GDPgrowth) gmmstyle(l.Winvperiod1 l.Wpayperiod1 l.Warperiod1, lag(1 2)) robust small orthogonal
        estimate store GMMctrYS1

        On the xtabond2 estimations: to obtain the Akaike (AIC) and Bayesian (BIC) information criteria, I used estat ic, but Stata replied: “likelihood information not found in last estimation results”. So I calculated them manually using:
        AIC = n*ln(SSR) + 2*k
        BIC = n*ln(SSR) + k*ln(n)
        where:
        n is the sample size and k is the number of estimated parameters.

        SSR was estimated using:
        predict e
        gen e2 = e^2
        total e2

        The model with the lowest AIC/BIC is selected.
        (I don’t know whether this calculation is correct.)
        NOTE: All GMM estimations include time and industry dummies.
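        The manual calculation above can be sketched as follows (a sketch in Python rather than Stata, assuming the Gaussian-likelihood form of the criteria; note that n*ln(SSR) and the more common n*ln(SSR/n) differ only by the constant n*ln(n), so with the same n across models they rank models identically):

```python
import math

def aic_bic(ssr, n, k):
    """Manual AIC/BIC from the sum of squared residuals (Gaussian form).

    ssr : sum of squared residuals (e.g. from `total e2` in Stata)
    n   : number of observations used in the estimation
    k   : number of estimated parameters
    """
    aic = n * math.log(ssr / n) + 2 * k
    bic = n * math.log(ssr / n) + k * math.log(n)
    return aic, bic

# Hypothetical comparison of two scenarios estimated on the same sample;
# the numbers are made up purely for illustration.
aic1, bic1 = aic_bic(ssr=120.0, n=500, k=10)
aic2, bic2 = aic_bic(ssr=118.0, n=500, k=14)
# BIC penalizes the 4 extra parameters more heavily than AIC does,
# so the two criteria can disagree on which scenario is preferred.
```

        As the comment notes, both criteria trade fit (SSR) against parsimony (k); they only make sense for comparing models estimated on the same sample.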

        Dr. Kripfganz, could you kindly advise on how to resolve this conundrum?

        Kind regards,

        Paul



        • #19
          Originally posted by Paul Jameson View Post
          xtdpdgmm Wroa1 l.Wroa1 L2.Wroa1 Winvperiod1 l.Wpayperiod1 Warperiod1 l.Wcurrasstotasset1 Micro Small Medium, ///
          twostep iv(GDPgrowth) gmm(l.Winvperiod1 l.Wpayperiod1 l.Warperiod1, lag(1 2)) vce(robust) teffects
          You did not specify the (sub-)option model(). By default, all instruments refer to the level model. It is unlikely that this is what you intend to do.

          Originally posted by Paul Jameson View Post
          global danolist dano*
          xtabond2 Wroa1 l.Wroa1 L2.Wroa1 Winvperiod1 l.Wpayperiod1 Warperiod1 l.Wcurrasstotasset1 Micro Small Medium $danolist, ///
          twostep ivstyle(GDPgrowth) gmmstyle(l.Winvperiod1 l.Wpayperiod1 l.Warperiod1, lag(1 2)) robust small orthogonal
          The ivstyle() option without the suboption equation() does not produce separate instruments for the transformed model and the level model. Unless you know what that option is doing and you indeed intend to do this, do not use the ivstyle() option without the suboption equation(). The ivstyle() option without the equation() suboption cannot be replicated with xtdpdgmm. As a general recommendation, always specify the suboption equation() (or model() for xtdpdgmm) explicitly to make sure that you are really specifying the desired model.
          Also note that the orthogonal option produces (in most cases) incorrect estimates due to a bug in xtabond2.

          Originally posted by Paul Jameson View Post
          On the xtabond2 estimations: to obtain the Akaike (AIC) and Bayesian (BIC) information criteria, I used estat ic, but Stata replied: “likelihood information not found in last estimation results”. So I calculated them manually using:
          AIC = n*ln(SSR) + 2*k
          BIC = n*ln(SSR) + k*ln(n)
          where:
          n is the sample size and k is the number of estimated parameters.

          SSR was estimated using:
          predict e
          gen e2 = e^2
          total e2

          The model with the lowest AIC/BIC is selected.
          (I don’t know whether this calculation is correct.)
          The MMSC-AIC and MMSC-BIC reported by estat mmsc after xtdpdgmm are not the conventional AIC / BIC. The conventional criteria just focus on the number of estimated coefficients. The MMSC take the number of moment conditions into account. xtabond2 does not calculate these criteria.
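          As a sketch of the difference (in Python for illustration; the formulas follow Andrews and Lu (2001), and the exact scaling used by xtdpdgmm's estat mmsc may differ):

```python
import math

def mmsc(j_stat, n_obs, n_moments, n_params):
    """Andrews-Lu model and moment selection criteria (sketch).

    j_stat    : Hansen J (overidentification) statistic
    n_obs     : number of observations
    n_moments : number of moment conditions (instruments)
    n_params  : number of estimated parameters
    """
    df = n_moments - n_params          # degrees of overidentification
    mmsc_aic = j_stat - 2 * df
    mmsc_bic = j_stat - df * math.log(n_obs)
    return mmsc_aic, mmsc_bic

# Hypothetical example: two specifications with the same coefficients but
# different instrument counts; lower criterion values are preferred.
a1, b1 = mmsc(j_stat=15.2, n_obs=800, n_moments=20, n_params=8)
a2, b2 = mmsc(j_stat=14.8, n_obs=800, n_moments=30, n_params=8)
```

          Unlike the conventional AIC/BIC, the penalty here depends on the number of overidentifying restrictions (moments minus parameters), not just on the number of coefficients, which is why the two kinds of criteria are not interchangeable.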
          https://www.kripfganz.de/stata/



          • #20
            Originally posted by Sebastian Kripfganz View Post
            The hope would be that the lagged dependent variables take care of this serial correlation. If the model passes the Arellano-Bond serial correlation test, you should be fine.

            If model(diff) lagrange(1 4) is valid for the first gmmiv() set, then model(level) diff lagrange(0 0) would be valid for the second gmmiv() set.
            If you suspect remaining serial error correlation, then you need to adjust both lag ranges, e.g. model(diff) lagrange(2 4) for the first set and model(level) diff lagrange(1 1) for the second set.
            Hi, Sebastian. I'm returning to this paper many months later, and I have a question (I hope this will catch your attention!).

            In the final version, I used
            model(diff) lagrange(2 2) for the first set of instruments and
            model(level) diff lagrange(1 1) for the second set of instruments.

            Then, in the Arellano-Bond test, I got

            Code:
            Arellano-Bond test for autocorrelation of the first-differenced residuals
            H0: no autocorrelation of order 1:     z =   -0.3523   Prob > |z|  =    0.8146
            H0: no autocorrelation of order 2:     z =   -3.1657   Prob > |z|  =    0.0014
            H0: no autocorrelation of order 3:     z =   -1.6235   Prob > |z|  =    0.1055
            My understanding is that these results mean I passed the Arellano-Bond test for serial correlation: because I shifted the instruments back one lag, I am now interested in whether there is autocorrelation at order 3, not at order 2. I have autocorrelation at order 2 but none at order 3, so I pass the test.

            Is that correct? Or do we always have to not reject the null for order 2 when using the A-B test?
            Thanks for any insights you might have!
            Last edited by Sandy Lovejoy; 04 Apr 2021, 00:18.



            • #21
              Based on a strict interpretation of the test results, your instruments might be valid. However, the AR(3) test result is not too comforting, and the non-rejection of the AR(1) test is an indication of an unusual serial-correlation pattern. It is usually a better approach to construct a model that passes the Arellano-Bond tests in the usual way (rejecting AR(1) but not rejecting any higher-order tests). If that is not possible for whatever reason, you might get away with your approach of using deeper lags, although these can become relatively weak instruments.
              https://www.kripfganz.de/stata/
