
  • Help to reduce instrument count with xtabond2 command

    Dear all,
    I need help reducing the instrument count in my dynamic panel model, estimated by difference and system GMM. Could anyone point out my mistake in the code below, or suggest how to reduce the instrument count with the xtabond2 command?

    I am using xtabond2 in Stata 16. I have 14 independent variables in total, covering the years 1976-2017, and 78 groups. All variables are logged.

    I plan to run the following regressions:
    a) Regression 1: 13 independent variables, covering 1976-1990
    b) Regression 2: 13+1 independent variables, covering 1991-2005 (because the institutional variable is only available from 1996)
    c) Regression 3: 13+1 independent variables, covering 2006-2017


    I have run regression 1 with xtabond2 using the following command.


    . xtabond2 lngrate_mx l.lngrate_mx $endo2 $pre6 yr1976-yr1984 if year<=1990, gmm(l.lngrate_mx l2.$endo2 l1.$pre6, collapse) iv(yr1976-yr1990) noleveleq nodiffsargan robust
    Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.
    Warning: Number of instruments may be large relative to number of observations.
    Warning: Two-step estimated covariance matrix of moments is singular.
    Using a generalized inverse to calculate robust weighting matrix for Hansen test.

    Dynamic panel-data estimation, one-step difference GMM
    ------------------------------------------------------------------------------
    Group variable: cnt Number of obs = 189
    Time variable : year Number of groups = 39
    Number of instruments = 178 Obs per group: min = 0
    Wald chi2(18) = 97.74 avg = 4.85
    Prob > chi2 = 0.000 max = 15
    ------------------------------------------------------------------------------
    | Robust
    lngrate_mx | Coef. Std. Err. z P>|z| [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    lngrate_mx |
    L1. | -.2979105 .0732073 -4.07 0.000 -.4413942 -.1544268
    |
    lnfdif_mx | -.0496543 .0345899 -1.44 0.151 -.1174493 .0181408
    lnaid | -.0077373 .1264407 -0.06 0.951 -.2555565 .2400819
    lninftel_mx | -.4036854 .4649351 -0.87 0.385 -1.314941 .5075707
    lntrd_mx | .9904507 .6384156 1.55 0.121 -.2608209 2.241722
    lnexreal_mx | -.217682 .2607278 -0.83 0.404 -.7286991 .2933352
    lngcf | -.0503376 .5228976 -0.10 0.923 -1.075198 .9745228
    lninfgdp | -.1821023 .1134625 -1.60 0.109 -.4044847 .0402802
    lnprenr | -1.694952 1.023869 -1.66 0.098 -3.701697 .3117942
    yr1976 | .0499593 .2039131 0.25 0.806 -.3497029 .4496216
    yr1977 | -.1087436 .2819799 -0.39 0.700 -.6614141 .4439269
    yr1978 | .1015337 .2823504 0.36 0.719 -.4518629 .6549304
    yr1979 | -.2403875 .2630821 -0.91 0.361 -.756019 .275244
    yr1980 | -.0665753 .3839448 -0.17 0.862 -.8190933 .6859428
    yr1981 | .1880516 .2475811 0.76 0.448 -.2971984 .6733016
    yr1982 | -.3936692 .2194516 -1.79 0.073 -.8237863 .036448
    yr1983 | -.862448 .2664371 -3.24 0.001 -1.384655 -.3402408
    yr1984 | -.6206123 .2557127 -2.43 0.015 -1.1218 -.1194245
    ------------------------------------------------------------------------------
    Instruments for first differences equation
    Standard
    D.(yr1976 yr1977 yr1978 yr1979 yr1980 yr1981 yr1982 yr1983 yr1984 yr1985
    yr1986 yr1987 yr1988 yr1989 yr1990)
    GMM-type (missing=0, separate instruments for each period unless collapsed)
    L(1/.).(L.lngrate_mx L2.lnfdif_mx lnaid L.lninftel_mx lntrd_mx lnexreal_mx
    lngcf lninfgdp lnprenr) collapsed
    ------------------------------------------------------------------------------
    Arellano-Bond test for AR(1) in first differences: z = -1.47 Pr > z = 0.141
    Arellano-Bond test for AR(2) in first differences: z = -1.75 Pr > z = 0.079
    ------------------------------------------------------------------------------
    Sargan test of overid. restrictions: chi2(160) = 229.69 Prob > chi2 = 0.000
    (Not robust, but not weakened by many instruments.)
    Hansen test of overid. restrictions: chi2(160) = 17.73 Prob > chi2 = 1.000
    (Robust, but can be weakened by many instruments.)




    ****Here $endo2 = lnfdif_mx lnaid, and $pre6 = lngcf lntrd_mx lnexreal_mx lninftel_mx lninfgdp lnprenr.
    ****The other variables are 4 financial development variables and 1 institutional index.
    ***$endo2 and l.lngrate_mx are treated as endogenous; $pre6, the 4 financial development variables, and the institutional index are treated as predetermined.
    The regression returns results with 39 groups and an instrument count of 178.

    My questions are:
    1. How can I restrict the instrument count to below the number of groups, so that the Hansen test of overidentifying restrictions becomes informative (its p-value is always 1.000, although the Arellano-Bond AR(2) test looks fine)? I face the same problem with the xtdpd and xtdpdsys commands.

    2. How can I ensure that the same groups are used in every regression, for consistency and comparability of results? It seems that once the instrument count is fixed, this can be managed for both difference and system GMM.
    Regards
    Habibul Hasan












    Last edited by Habibul Hasan; 01 Oct 2019, 02:30.

  • #2
    Regarding ways to restrict the number of instruments, please have a look at my presentation at this year's London Stata Conference, in particular slides 18 and following:
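    As a minimal sketch, the two standard devices are collapsing the GMM-style instrument matrix and curtailing the lag depth; applied to the command from #1, a hypothetical variant might look as follows (the laglimits(1 3) range is purely illustrative, not a recommendation):

    ```stata
    * Collapse the GMM-style instrument matrix and curtail the lag depth;
    * both devices sharply reduce the instrument count.
    xtabond2 lngrate_mx l.lngrate_mx $endo2 $pre6 yr1976-yr1984 if year<=1990, ///
        gmm(l.lngrate_mx l2.$endo2 l1.$pre6, laglimits(1 3) collapse)          ///
        iv(yr1976-yr1990) noleveleq robust
    ```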
    Further information on dynamic panel model estimation:
    https://www.kripfganz.de/stata/



    • #3
      Hi Dr. Kripfganz

      Thank you for your paper. I have read it and that of Kiviet (2019) thoroughly, in particular the 10 steps in Kiviet (2019).

      Applying xtdpdgmm with the collapsing and curtailing options, the instrument count is now below N, and the model passes the overidentification test with acceptable p-values. But some points are still confusing to me. Would you kindly help to clarify them?

      a) What is the decision rule, in terms of p-values, for dropping higher-order lags of the regressors from the instrument set?

      b) In the output of the incremental Hansen test, there are two columns of p-values (Excluding ... Difference). Which p-values should I consider, and what is the decision rule, in terms of p-values, for updating/modifying the instrumentation?

      According to Kiviet (2019), the suggestion is that if any instrument set has p > 0.10, or the coefficient of an additional regressor (squared term/interaction term) has p > 0.02, then the instrument construction is OK. Did I pick up the correct interpretation?

      c) What is the limit for the Sargan test p-values to be considered?


      Regarding my research, would you advise a bit?
      My variables are: growth rate (dependent);

      FDI inflow (variable of interest), aid inflow, infrastructure (telephone lines per 100 people),
      domestic investment (gross capital formation), inflation (GDP deflator), trade openness,
      real effective exchange rate, human capital (gross enrolment in primary school).

      Additional variables are: a) financial development variables (domestic credit to the private sector, banks' credit to the private sector, commercial banks' assets to commercial & central bank assets, liquid liabilities of the financial system); b) an institutional variable. All are sourced from the World Bank.
      I have taken the log of all the variables.

      Do you think it is the right approach to take the log of all these variables?


      Thanks
      Habibul Hasan



      • #4
        I am afraid I cannot give you particular p-value thresholds beyond the recommendations made in Kiviet's paper.

        For the incremental Hansen test, you are mostly interested in the "Difference" statistic. However, for this test to be valid it is important as a prerequisite that the "Excluding" statistic is not statistically significant, i.e. that all the other instruments that are not tested by the "Difference" statistic are valid.
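        In xtdpdgmm, this table can be requested after estimation; a sketch, assuming the model was fitted with the overid option:

        ```stata
        * Incremental (difference-in-Hansen) overidentification tests,
        * with the "Excluding" and "Difference" columns discussed above:
        estat overid, difference
        ```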

        I believe that in principle you can take logs of all of these variables. The related literature in your field of research should guide you on that matter. Keep in mind that the coefficients are interpreted as elasticities after you have taken logs.
        https://www.kripfganz.de/stata/



        • #5
          Hi Dr. Kripfganz
          Thank you for your kind suggestions.

          I am following your guidelines and the 10 steps in Kiviet (2019).
          According to step 5 of Kiviet's paper, I found FDI to be exogenous, aid predetermined, and infrastructure, trade openness, and domestic investment endogenous.
          But as a cautious approach, and based on the literature, I took FDI and aid as endogenous. Is that logical?

          The xtdpdgmm command magically reduces the instrument count and passes all the required tests. However, including time dummies via the teffects option increases the instrument count.
          I wonder, is it mandatory to include time dummies in difference/system GMM analysis?

          ****Would you kindly check whether the under-identification test is available in the xtdpdgmm package? I tried it with your dataset and command, but it returns an error message.

          Thanks
          Habibul



          • #6
            It is often a good idea to be cautious and to follow the common practice in the literature, although stronger exogeneity assumptions might help with the identification by reducing the risk of running into weak-instruments problems.

            Not sure what you mean by the statement that xtdpdgmm was magically reducing the instrument count.

            The option teffects adds dummy variables for the time periods. Each additional dummy variable also comes with an additional instrument which is why the instruments count increases. It is generally recommended to add time dummies. You should check the literature that is related to your specific application.

            As far as I am aware, Mark Schaffer's underid command is not yet available.
            https://www.kripfganz.de/stata/



            • #7
              Hi Dr. Kripfganz

              Very sorry for bothering you again and again; I get little support elsewhere on the xtdpdgmm command, and I keep running into different problems while applying it to my model, so I seek your kind support.

              I have all three types of variables (endogenous, predetermined, and exogenous) in my model. How do I distinguish between them when implementing the model with the xtdpdgmm command (difference and system GMM)? For example, I have written the following command for two-step system GMM, treating all the variables as endogenous:

              xtdpdgmm l(0/2).lngrate_mx lnfdif_mx lnaid lninftel_mx lnprenr lntrd_mx lngcf lnfd lnexreal_mx lninfgdp lninst, ///
                  collapse model(diff) gmm(lngrate_mx, lag(1 2)) ///
                  gmm(lnfdif_mx, lag(2 3)) ///
                  gmm(lnaid, lag(2 2)) ///
                  gmm(lninftel_mx lnprenr lntrd_mx lninfgdp lnexreal_mx lngcf lnfd lninst, lag(2 2)) ///
                  gmm(lngrate_mx, lag(1 2) diff model(level)) ///
                  gmm(lnfdif_mx, lag(2 2) diff model(level)) ///
                  gmm(lnaid, lag(2 2) diff model(level)) ///
                  gmm(lninftel_mx lnprenr lntrd_mx lninfgdp lnexreal_mx lngcf lnfd lninst, lag(2 2) diff model(level)) ///
                  teffects two vce(r) overid nolog nofoot


              ***But if I want to treat some variables as predetermined (say, lnprenr) and some as exogenous (say, lngcf), how can I reflect that in the above xtdpdgmm code?

              ***Secondly, in the post-estimation results:

              a) Some of the '2-step moment functions, 3-step weighting matrix' tests fail (p-value < 0.05; null hypothesis rejected);
              b) Some (usually 1 or 2) of the incremental Hansen tests also fail (p-value < 0.05; null hypothesis rejected);
              c) The AR(3) null of no autocorrelation is also rejected in some models.

              Are these usual, or a major problem with the formulation of the instruments?

              Kind Regards
              Habibul
              Last edited by Habibul Hasan; 24 Nov 2019, 19:32.



              • #8
                Slide 11 of my London Stata Conference presentation tells you which lags are valid under the usual assumptions.

                First of all, note that - by construction - gmm(lngrate_mx , lag(1 2)) creates an invalid instrument because the first lag of the dependent variable cannot be a valid instrument in the first-differenced model. You can only use lags 2 onwards, i.e. the first entry in the lag() option must be 2 or a higher value.

                For endogenous regressors, you can use lags 2 onwards in the first-differenced model. Then, for predetermined regressors, you can simply add the first lag. For strictly exogenous regressors, you can further add the contemporaneous term.

                For the level model, you can find the valid lags on slide 31 of my presentation. These are the first lag of the first-differenced dependent variable, and similarly the first lag for other endogenous variables. For predetermined or strictly exogenous variables, you could also use contemporaneous first differences in the level model.
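                The rules above can be sketched schematically; x_end, x_pre, and x_exo are placeholder names for an endogenous, a predetermined, and a strictly exogenous regressor, and the upper lag limit of 4 is purely illustrative:

                ```stata
                xtdpdgmm L(0/1).y x_end x_pre x_exo, model(diff) collapse two vce(robust) overid ///
                    gmm(y,     lag(2 4)) /// dependent variable: lags 2 onwards in the diff model
                    gmm(x_end, lag(2 4)) /// endogenous regressor: lags 2 onwards
                    gmm(x_pre, lag(1 4)) /// predetermined: the first lag becomes valid
                    gmm(x_exo, lag(0 4)) /// strictly exogenous: the contemporaneous term too
                    gmm(y x_end, lag(1 1) diff model(level)) /// level model: first lag of first differences
                    gmm(x_pre x_exo, lag(0 0) diff model(level)) // level model: contemporaneous first differences
                ```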

                As an example of how to test for a predetermined versus an endogenous variable, have a look at slides 108 and 109.

                Furthermore, all of these results mentioned in (a)-(c) are potential cause for concern that you usually should try to address. Incremental Hansen tests could indicate which of the instruments might be invalid, provided that all other instruments are valid.
                https://www.kripfganz.de/stata/



                • #9
                  Hi Dr. Kripfganz

                  Once again I am writing to you for some suggestions.

                  a) Regarding model selection after running different specifications with the xtdpdgmm command: I got 4 models that pass the required tests (AR tests, overidentification test, and incremental Hansen tests). For these models I ran the Andrews and Lu (2001) model and moment selection criteria (MMSC); models with lower values of the criteria should be preferred. I got the following results, where no single model has the lowest value on all of the AIC, BIC, and HQIC criteria.

                  estat mmsc fod1 fod2 fod3 fod4 fod5

                  Andrews-Lu model and moment selection criteria

                  Model | ngroups J nmom npar MMSC-AIC MMSC-BIC MMSC-HQIC
                  -------------+----------------------------------------------------------------------------------------------
                  . | 63 12.4543 54 35 -25.5457 -66.2653 -42.1011
                  fod1 | 63 12.4543 54 35 -25.5457 -66.2653 -42.1011
                  fod2 | 62 10.7125 54 36 -25.2875 -63.5759 -40.8308
                  fod3 | 62 13.4618 54 37 -20.5382 -56.6995 -35.2180
                  fod4 | 61 6.0796 54 39 -23.9204 -55.5835 -36.7536
                  fod5 | 62 5.6983 54 38 -26.3017 -60.3358 -40.1179

                  Questions: Based on the above results, which model should be chosen for further analysis, and what does the first row of the results (the row beginning with a dot) represent?


                  b) After selecting the model, the next step involves classifying the variables as endogenous, predetermined, or exogenous. Slide 106 of your presentation suggests the following:
                  "Separately for all regressors classified as endogenous, add the extra instruments that become valid if the regressors were predetermined (unless theory clearly indicates that a variable should be endogenous), check the corresponding incremental overidentification tests, and treat the variable with the highest acceptable p-value of the incremental overidentification tests as predetermined."


                  i) For this step, in your example (slide 108), treating a variable previously used as endogenous as predetermined requires adding an extra gmm(0 0) option and checking whether the p-value of the incremental overidentification test supports treating that variable as predetermined. You have added the same gmm(0 0) option when checking whether a variable previously treated as predetermined is actually exogenous.

                  ii) In the Kiviet (2019) paper that you reference in your presentation, the author suggests adding an extra gmm(1 1) to check whether an endogenous variable is actually predetermined, and an extra gmm(0 0) to check whether a predetermined variable is actually exogenous. It is suggested that if the p-value of the incremental Hansen test for that extra instrument is greater than 0.5 (in both cases), the variable previously treated as endogenous can now be treated as predetermined, and the one previously treated as predetermined can now be treated as exogenous (steps 5 & 6 in the paper).


                  Question: Would you kindly suggest which approach is consistent to follow for this test of adding an extra instrument: your option of adding gmm(0 0) in both cases, or the one used by Kiviet (2019)?

                  Question: What is the decision rule for the highest acceptable p-value of the incremental overidentification tests? I am using the xtdpdgmm command, but Kiviet (2019) formulated his steps using the xtabond2 command, and there seem to be some differences in the model assumptions of the two commands.

                  Question: If a variable previously treated as endogenous/predetermined is found to be predetermined/exogenous after adding the extra gmm(0 0) option:
                  i) Should I remove that variable's initial instrument structure (used before identifying its type) and treat it in the revised model with the new instrument structure appropriate for a predetermined variable,
                  or
                  ii) Should I keep the initial instrument structure and simply add the extra instrument gmm(0 0) for that variable in the revised model?

                  My heartiest thanks for your continuous support.

                  Kind Regards
                  Habibul Hasan
                  ***Kindly note that I am strictly following your xtdpdgmm command for my analysis, because it has solved many of the problems I faced with my dataset.

                  Last edited by Habibul Hasan; 30 Jun 2020, 07:06.



                  • #10
                    a) The first row with the dot refers to the last estimates still in memory. This is apparently model fod1 given that the numbers are identical. Note that you cannot compare models with different numbers of groups (first column). You need to restrict the models to all have the same number of groups for a meaningful comparison.

                    b) It depends on the model transformation:
                    (i) The extra moment condition added with gmm(w, lag(0 0) model(fod)) is valid (in the FOD model) when variable w is predetermined. If w is strictly exogenous, this moment condition remains valid. On slide 112 you can see that strict exogeneity of a variable k allows to further add the moment condition gmm(k, lag (0 0) model(mdev)) for the MDEV model.
                    (ii) Kiviet considers the DIFF model. There, lag 1 would be valid for a predetermined variable and lag 0 becomes valid for a strictly exogenous variable.
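                    As a sketch of (i) in xtdpdgmm syntax (w is the placeholder variable from the slides; the lag depth of 3 is illustrative):

                    ```stata
                    * Baseline treats w as endogenous; the last gmm() adds the lag-0 FOD
                    * moment condition that is valid if w is in fact predetermined.
                    xtdpdgmm L(0/1).n w, model(fod) collapse two vce(robust) overid ///
                        gmm(n, lag(1 3)) gmm(w, lag(1 3)) ///
                        gmm(w, lag(0 0))
                    * Inspect the incremental test for the extra condition:
                    estat overid, difference
                    ```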

                    What are the different assumptions that you think distinguish xtdpdgmm from xtabond2? There should not be any.

                    Not sure I understand your last question. The difference between an endogenous and a predetermined variable lies in the extra moment condition that becomes available in the latter case. Everything else remains unchanged.
                    https://www.kripfganz.de/stata/



                    • #11
                      Thank you for your patient reply.

                      By the last question, I was inquiring about the following:
                      • Slide 108 says "Treating 'w' as predetermined with collapsed instruments adds one more moment condition". Accordingly, on slide 109, the incremental Hansen test supports treating w as predetermined rather than endogenous.
                      • According to slide 110, "Skipping some intermediate steps, we arrive at a model with w and k (as well as the interaction terms) treated as predetermined". The model is:
                      xtdpdgmm L(0/2).n L(0/2).w k L(0/3).ys c.w#c.w c.w#c.k, model(fod) collapse gmm(n, lag(1 .)) gmm(w, lag(1 .)) gmm(k, lag(1 .)) gmm(ys, lag(1 .)) gmm(c.w#c.w, lag(1 .)) gmm(c.w#c.k, lag(1 .)) gmm(w k c.w#c.w c.w#c.k, lag(0 0)) teffects two vce(r) overid
                      • In the above model, both gmm(1 .) and gmm(0 .) moment conditions are used for w, k, and their interaction terms. Once w and k are decided to be predetermined, shouldn't only gmm(0 .) be used for w, k, and their interaction terms?
                      • Q: If endogenous variables such as w and k are found to be predetermined, is it correct that the revised model should use gmm(0 0) for them, dropping the instrument gmm(1 .) that was used in the initial model when treating them as endogenous?
                      • Q: To check the exogeneity of a variable, one needs to add the further moment condition gmm(.., lag(0 0) model(mdev)) for the model in mean deviations (while using the FOD model in system GMM). Is it the rule of thumb to use the model in mean deviations only?
                      • Q: For selecting models with estat mmsc, I understand that the number of groups should be the same. But in cases where it changes by +1/-1 or +2/-2, and the summary statistics of the variables are very close, is the comparison still acceptable (as in the case of my models fod1-fod5)?
                      Last edited by Habibul Hasan; 02 Jul 2020, 04:15.



                      • #12
                        Note that the part in red has lag(0 0), not lag(0 .). The combination of lag(0 0) with lag(1 .) from the previous step gives lag(0 .) as desired. Do not drop the lag(1 .) part from the model. Of course, you could simply combine the two sets of instruments into a single set with lag(0 .), which however would no longer produce difference-in-Hansen test statistics for the extra moment conditions imposed by lag(0 0).
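                        Schematically, the two parameterizations look as follows (fragments of the full command from slide 110, with w as the placeholder variable):

                        ```stata
                        * Separate sets: retains a difference-in-Hansen test for the lag-0 condition
                        gmm(w, lag(1 .)) gmm(w, lag(0 0))
                        * Combined set: identical instruments, but no separate incremental test
                        gmm(w, lag(0 .))
                        ```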

                        Instead of using the extra moment condition for model(mdev), you could also use a one-period lead for model(fod), i.e. gmm(..., lag(-1 -1) model(fod)). It is just my personal preference to use the model(mdev) moment condition because this resembles the moment condition for a conventional fixed-effects approach in which all regressors are strictly exogenous and the model is transformed into mean deviations to remove the fixed effects.
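                        The two equivalent ways of adding the strict-exogeneity condition described above, with k as a placeholder variable:

                        ```stata
                        gmm(k, lag(0 0) model(mdev))  // preferred: mean-deviations moment condition
                        gmm(k, lag(-1 -1) model(fod)) // equivalent: one-period lead in the FOD model
                        ```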

                        It is really not possible to compare the models even if the number of groups differs only slightly. As an argument in favor of a model with more groups you could use that it retains more observations and therefore might yield more efficient estimates, but this has nothing to do with the MMSC.
                        https://www.kripfganz.de/stata/



                        • #13
                          Hi Dr. Kripfganz

                          Thank you for your kind support and patience throughout the communication.
                          Some more questions for your kind support

                          Step 1)
                          Initially, considering all variables as endogenous, I have the following two-step system GMM model (FOD equation and level equation).


                          My initial model:
                          Model 1

                          xtdpdgmm l(0/1).lngrate_mx lnfdif_mx lnaid lngcf lnexreal_mx lninfgdp l(0/1).lntrd_mx l(0/1).lninftel_mx lnprenr lnfdifsq c.lnfdif_mx#c.lnprenr if year>=1976 & year<=1996, ///
                              collapse model(fod) gmm(lngrate_mx, lag(2 2)) gmm(lnfdif_mx lnfdifsq, lag(1 1)) gmm(lnaid, lag(1 1)) ///
                              gmm(c.lnfdif_mx#c.lnprenr, lag(1 1)) gmm(lngcf lninftel_mx lntrd_mx lnprenr lninfgdp lnexreal_mx, lag(1 1)) ///
                              gmm(lngrate_mx, lag(1 1) diff model(level)) gmm(lnfdif_mx lnfdifsq, lag(1 1) diff model(level)) ///
                              gmm(lnaid, lag(1 1) diff model(level)) gmm(c.lnfdif_mx#c.lnprenr, lag(1 1) diff model(level)) ///
                              gmm(lngcf lninftel_mx lntrd_mx lnprenr lninfgdp lnexreal_mx, lag(1 1) diff model(level)) ///
                              teffects vce(r) overid two nofoot nolog nocons


                          Step 2)
                          Later on, I added an extra instrument gmm(…, lag(0 0)) one by one for each variable, to check whether a regressor initially treated as endogenous is actually predetermined, and checked the incremental Hansen (IH) test result for that extra instrument.

                          I found 3 of my variables (lnfdif_mx lngcf lnprenr) to be predetermined, and added the extra instrument gmm(lnfdif_mx lngcf lnprenr, lag(0 0)) in the revised model.


                          Revised Model:
                          Model 2

                          xtdpdgmm l(0/1).lngrate_mx lnfdif_mx lnaid lngcf lnexreal_mx lninfgdp l(0/1).lntrd_mx l(0/1).lninftel_mx lnprenr lnfdifsq c.lnfdif_mx#c.lnprenr if year>=1976 & year<=1996, ///
                              collapse model(fod) gmm(lngrate_mx, lag(2 2)) gmm(lnfdif_mx lnfdifsq, lag(1 1)) gmm(lnaid, lag(1 1)) ///
                              gmm(c.lnfdif_mx#c.lnprenr, lag(1 1)) gmm(lngcf lninftel_mx lntrd_mx lnprenr lninfgdp lnexreal_mx, lag(1 1)) ///
                              gmm(lngrate_mx, lag(1 1) diff model(level)) gmm(lnfdif_mx lnfdifsq, lag(1 1) diff model(level)) ///
                              gmm(lnaid, lag(1 1) diff model(level)) gmm(c.lnfdif_mx#c.lnprenr, lag(1 1) diff model(level)) ///
                              gmm(lngcf lninftel_mx lntrd_mx lnprenr lninfgdp lnexreal_mx, lag(1 1) diff model(level)) ///
                              gmm(lnfdif_mx lngcf lnprenr, lag(0 0)) teffects vce(r) overid two nofoot nolog nocons



                          Step 3)
                          Now, I added an extra instrument, gmm(…, lag(0 0) model(md)), one by one for each of the variables (lnfdif_mx lngcf lnprenr) from model 2, to check whether a regressor treated as predetermined is actually exogenous, and checked the incremental Hansen test result for the extra instrument.

                          Based on the IH test p-values for (lnfdif_mx lngcf lnprenr), only lngcf is found to be exogenous. So I formulated the final model as below.


                          Revised Model:
                          Final Model


                          xtdpdgmm l(0/1).lngrate_mx lnfdif_mx lnaid lngcf lnexreal_mx lninfgdp l(0/1).(lntrd_mx lninftel_mx lnprenr) lnfdifsq c.lnfdif_mx#c.lnprenr if year>=1976 & year<=1996, ///
                              collapse model(fod) gmm(lngrate_mx, lag(1 1)) gmm(lnfdif_mx lnfdifsq, lag(1 1)) gmm(lnaid, lag(1 1)) ///
                              gmm(c.lnfdif_mx#c.lnprenr, lag(1 1)) gmm(lngcf lninftel_mx lntrd_mx lnprenr lninfgdp lnexreal_mx, lag(1 1)) ///
                              gmm(lngrate_mx, lag(1 1) diff model(level)) gmm(lnfdif_mx lnfdifsq, lag(1 1) diff model(level)) ///
                              gmm(lnaid, lag(1 1) diff model(level)) gmm(c.lnfdif_mx#c.lnprenr, lag(1 1) diff model(level)) ///
                              gmm(lngcf lninftel_mx lntrd_mx lnprenr lninfgdp lnexreal_mx, lag(1 1) diff model(level)) ///
                              gmm(lnfdif_mx lngcf lnprenr, lag(0 0)) gmm(lngcf, lag(0 0) model(md)) teffects vce(r) overid nofoot two nolog nocons


                          Q. Would you kindly check whether the lag structure for my models is correct?

                          From your examples on slides 109 and 113, I gathered that you might have considered a range of 0.30-0.50 for deciding whether a regressor initially treated as endogenous is actually predetermined, or whether a regressor treated as predetermined is actually exogenous [as in the last part of section 2.5, "P-values as beacons: opportunities and shortcomings", of Kiviet, 2019].

                          Q. Could you please give your opinion, based on the following IH test p-values (a, b & c), on whether a regressor initially treated as predetermined is actually exogenous:
                          a) 0.5192
                          b) 0.0014
                          c) 0.4033

                          Apologies for any inconvenience.

                          Kind Regards
                          Habibul Hasan



                          • #14
                            For model(fod), you can use gmm(lngrate_mx, lag(1 1)) instead of gmm(lngrate_mx, lag(2 2)). Unless your sample size is very small, you could also use further lags of the endogenous variables in all steps, e.g. gmm(lngrate_mx, lag(1 3)) gmm(lnfdif_mx lnfdifsq, lag(1 3)) gmm(lnaid, lag(1 3)) gmm(c.lnfdif_mx#c.lnprenr, lag(1 3)) gmm(lngcf lninftel_mx lntrd_mx lnprenr lninfgdp lnexreal_mx, lag(1 3)). The upper lag limit of 3 is just an example. It is often advisable not to use too many lags because at some point they become weak instruments. Estimates based on too few lags, on the other side, might suffer from low efficiency. The extra instruments in steps 2 and 3 look good.

                            I would not use a strict cut-off value for the p-values. It is often a judgement depending on how severe the consequence of a misclassification would be. In general, both a) and c) appear reasonably large to not reject the null hypothesis. b) clearly must be rejected.
                            https://www.kripfganz.de/stata/



                            • #15
                              Thank you once again

                              While running the models, another question popped up.

                              On slide 103 and onwards, the interaction and squared terms of the variables w and k are included as regressors in levels (e.g. c.w#c.w c.w#c.k), along with their instruments:
                              xtdpdgmm L(0/2).n L(0/2).w k L(0/3).ys c.w#c.w c.w#c.k, model(fod) collapse gmm(n, lag(1 .)) gmm(w, lag(1 .)) gmm(k, lag(1 .)) gmm(ys, lag(1 .)) gmm(c.w#c.w, lag(1 .)) gmm(c.w#c.k, lag(1 .)) teffects two vce(r) overid

                              Can these regressors (c.w#c.w and c.w#c.k) cause a multicollinearity problem?

                              Kind Regards
                              Habibul Hasan

