
  • statistical tests and choice of options with xtabond2

    Hi,

    I am currently investigating the impact of macroprudential policies on credit growth, using Stata 13.0. The idea is to reproduce the model used by Cerutti et al. (2015), ''The Use and Effectiveness of Macroprudential Policies: New Evidence'', but with a different data set.

    In their paper, they used the ''xtdpd'' command to obtain the Arellano-Bond GMM estimator. They used a dynamic panel data model because past credit growth affects current credit growth.
    I tried this command first, but because GMM is very new to me, I switched to the xtabond2 command by Roodman (2009), which is less complex. To replicate the model used by Cerutti et al. (2015), I chose the difference GMM estimator and treated all variables as endogenous.

    My understanding is that :

    - The number of instruments has to be lower than the number of groups. I have N = 21 and T = 56 (14 years, 2000 to 2014, at quarterly frequency), with several missing values. I know that xtabond2 is not really recommended when N is small and T is large, but with the ''collapse'' option and only lag(2 3) for each instrument, I obtain 18 instruments, so maybe it can work. In comparison, Cerutti et al. (2015) have N = 119 and T = 13.

    - Because all my variables are endogenous, I cannot use the first lag of a variable as an instrument.

    - Twostep method is preferred over the onestep method.
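
    The instrument count in the first point can be checked by hand. As a rough sketch: with the ''collapse'' option, each variable listed in gmmstyle() contributes one instrument column per lag distance, so lag(2 3) gives two columns per variable. The scalar names below are made up for illustration and are not part of xtabond2.

```stata
* Sketch only: scalar names are illustrative, not xtabond2 results.
scalar n_vars  = 9                  // variables listed in gmmstyle()
scalar n_lags  = 3 - 2 + 1          // lag(2 3) covers lag distances 2 and 3
scalar n_instr = n_vars * n_lags    // collapsed: one column per variable per lag
display "collapsed instrument count = " n_instr
```

    This gives 18, which agrees with the "Number of instruments = 18" line that xtabond2 reports for the nine-regressor specification. With eight regressors the same arithmetic gives 16.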


    There are 5 macroprudential measures in the model, coded as follows: if a measure is tightened (loosened), the associated value is ''1'' (''-1''). If there is no change, the value is 0. If a measure is tightened (loosened) twice over the years, the value becomes ''2'' (''-2'') after the second tightening (loosening). These are the variables with the "c_" prefix.
    In addition to the macroprudential variables, the other independent variables are the lagged dependent variable, real GDP growth in the previous quarter, a dummy variable that captures the presence of a banking crisis during the previous quarter, and a variable that captures the impact of the interest rate in the previous quarter.
    So all independent variables are lagged.
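
    The coding of the "c_" variables described above amounts to a running sum of policy changes within each country. A minimal sketch follows; the toy data, the time index t, and the per-quarter change variable d_sscb (+1 for a tightening, -1 for a loosening, 0 otherwise) are assumptions for illustration, not the actual data set.

```stata
* Hypothetical toy panel: two countries, four quarters each.
* d_sscb is an assumed name for the per-quarter policy change.
clear
input num t d_sscb
1 1  0
1 2  1
1 3  0
1 4  1
2 1  0
2 2 -1
2 3  0
2 4  0
end

* Running sum within country: the index moves by one at each
* tightening or loosening and keeps its level when nothing changes.
bysort num (t): generate c_sscb = sum(d_sscb)
list, sepby(num) noobs
```

    In this toy example, country 1 reaches 2 after its second tightening, and country 2 stays at -1 after its single loosening.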

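    This is the model:

    $$\text{CreditGrowth}_{i,t} = \alpha\,\text{CreditGrowth}_{i,t-1} + \beta\,\text{C\_macroprudentialmeasures}_{i,t-1} + \gamma\,\text{GDPGrowth}_{i,t-1} + \delta\,\text{BankingCrisis}_{i,t-1} + \theta\,\text{InterestRate}_{i,t-1} + \text{CountryFixedEffect}_i + e_{i,t}$$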

    And this is an example of what I tried:
    • first attempt
    Code:
    xtabond2 credit_growth L1.credit_growth L1.c_sscb L1.c_cap_req L1.c_ltv_cap L1.c_rr L1.c_exposition
    >  L1.interest_rate L1.GDP_growth L1.banking_crisis, gmmstyle(L1.credit_growth L1.c_sscb L1.c_cap_req
    >  L1.c_ltv_cap L1.c_rr L1.c_exposition L1.interest_rate L1.GDP_growth L1.banking_crisis, lag(2 3)
    > collapse) noleveleq twostep
    Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.
    
    Dynamic panel-data estimation, two-step difference GMM
    ------------------------------------------------------------------------------
    Group variable: num                             Number of obs      =      1007
    Time variable : q_date                          Number of groups   =        21
    Number of instruments = 18                      Obs per group: min =        37
    Wald chi2(9)  =     74.81                                      avg =     47.95
    Prob > chi2   =     0.000                                      max =        58
    --------------------------------------------------------------------------------
     credit_growth |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ---------------+----------------------------------------------------------------
     credit_growth |
               L1. |  -.0390548   .1130852    -0.35   0.730    -.2606977    .1825881
                   |
            c_sscb |
               L1. |   -4.62242   .9516904    -4.86   0.000    -6.487699   -2.757141
                   |
         c_cap_req |
               L1. |  -1.407323    1.02413    -1.37   0.169     -3.41458    .5999351
                   |
         c_ltv_cap |
               L1. |   4.428849   3.137135     1.41   0.158    -1.719824    10.57752
                   |
              c_rr |
               L1. |  -1.116012   1.679616    -0.66   0.506    -4.407999    2.175975
                   |
      c_exposition |
               L1. |   .0300822   1.116648     0.03   0.979    -2.158508    2.218673
                   |
     interest_rate |
               L1. |   .0787794   .2102005     0.37   0.708     -.333206    .4907648
                   |
        GDP_growth |
               L1. |  -.2289244   .2287425    -1.00   0.317    -.6772514    .2194026
                   |
    banking_crisis |
               L1. |   1.426424   1.563614     0.91   0.362    -1.638203    4.491051
    --------------------------------------------------------------------------------
    Warning: Uncorrected two-step standard errors are unreliable.

    Instruments for first differences equation
      GMM-type (missing=0, separate instruments for each period unless collapsed)
        L(2/3).(L.credit_growth L.interest_rate L.c_sscb L.c_cap_req
        L.c_ltv_cap L.c_rr L.c_exposition L.GDP_growth
        L.banking_crisis) collapsed
    ------------------------------------------------------------------------------
    Arellano-Bond test for AR(1) in first differences: z =  -2.34  Pr > z =  0.019
    Arellano-Bond test for AR(2) in first differences: z =   0.29  Pr > z =  0.775
    ------------------------------------------------------------------------------
    Sargan test of overid. restrictions: chi2(9)    =  26.54  Prob > chi2 =  0.002
      (Not robust, but not weakened by many instruments.)
    Hansen test of overid. restrictions: chi2(9)    =  11.35  Prob > chi2 =  0.253
      (Robust, but weakened by many instruments.)
    
    
    .
    end of do-file
    And these are the post-estimation results (Postestimation → Manage estimation results → Table of estimation results):

    Code:
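    ------------------------------
        Variable |    active
    -------------+----------------
    credit_gro~h |
             L1. | -.03905478
                 |
          c_sscb |
             L1. | -4.6224202***
                 |
       c_cap_req |
             L1. | -1.4073225
                 |
       c_ltv_cap |
             L1. |   4.428849
                 |
            c_rr |
             L1. | -1.1160123
                 |
    c_exposition |
             L1. |  .03008222
                 |
    interest_r~e |
             L1. |  .07877938
                 |
      GDP_growth |
             L1. |  -.2289244
                 |
    banking_cr~s |
             L1. |  1.4264237
    ------------------------------
    legend: * p<.1; ** p<.05; *** p<.01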
    If I'm not mistaken, the model passes both the AR test and the Hansen test (which is more appropriate than the Sargan test because of the "twostep" option), but I'm not sure about the latter because of this note: ''(Robust, but weakened by many instruments.)''
    The problem is that some of these variables, like banking_crisis, are supposed to have a large negative impact on credit growth.
    • second attempt
    With the addition of the "robust" option, no variables are significant.


    Code:
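    ------------------------------
        Variable |    active
    -------------+----------------
    credit_gro~h |
             L1. | -.03905478
                 |
          c_sscb |
             L1. | -4.6224202
                 |
       c_cap_req |
             L1. | -1.4073225
                 |
       c_ltv_cap |
             L1. |   4.428849
                 |
            c_rr |
             L1. | -1.1160123
                 |
    c_exposition |
             L1. |  .03008222
                 |
    interest_r~e |
             L1. |  .07877938
                 |
      GDP_growth |
             L1. |  -.2289244
                 |
    banking_cr~s |
             L1. |  1.4264237
    ------------------------------
    legend: * p<.1; ** p<.05; *** p<.01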


    Because these macroprudential measures are very new, I have a lot of ''0'' values in my data, so Stata dropped many of them. I had to aggregate several variables to solve this issue, which is why I have only 5 macroprudential variables.
    In addition, some coefficients have the wrong sign and are volatile: when I delete one non-significant variable, some coefficients change completely, both in sign and in magnitude (see below).
    • third attempt:
    Here is an attempt which is similar to the first attempt, except that I deleted one non-significant variable, banking_crisis:

    Code:
    xtabond2 credit_growth L1.credit_growth L1.c_sscb L1.c_cap_req L1.c_ltv_cap L1.c_rr L1.c_exposition
    > L1.interest_rate L1.GDP_growth, gmmstyle(L1.credit_growth L1.c_sscb L1.c_cap_req L1.c_ltv_cap L1.c_rr
    >  L1.c_exposition L1.interest_rate L1.GDP_growth, lag(2 3) collapse) noleveleq twostep
    Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.
    
    Dynamic panel-data estimation, two-step difference GMM
    ------------------------------------------------------------------------------
    Group variable: num                             Number of obs      =      1180
    Time variable : q_date                          Number of groups   =        21
    Number of instruments = 16                      Obs per group: min =        41
    Wald chi2(8)  =      8.45                                      avg =     56.19
    Prob > chi2   =     0.391                                      max =        58
    -------------------------------------------------------------------------------
    credit_growth |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    --------------+----------------------------------------------------------------
    credit_growth |
              L1. |  -.2786686   .2411136    -1.16   0.248    -.7512425    .1939053
                  |
           c_sscb |
              L1. |  -2.977535   3.035509    -0.98   0.327    -8.927022    2.971952
                  |
        c_cap_req |
              L1. |  -.2339573   1.101806    -0.21   0.832    -2.393458    1.925544
                  |
        c_ltv_cap |
              L1. |   5.151175   7.673965     0.67   0.502     -9.88952    20.19187
                  |
             c_rr |
              L1. |  -1.594947   2.977368    -0.54   0.592    -7.430481    4.240588
                  |
     c_exposition |
              L1. |   -.824738   3.071016    -0.27   0.788    -6.843818    5.194342
                  |
    interest_rate |
              L1. |   .1985294   .3771818     0.53   0.599    -.5407333    .9377921
                  |
       GDP_growth |
              L1. |  -.0554081   .3740414    -0.15   0.882    -.7885158    .6776997
    -------------------------------------------------------------------------------
    Warning: Uncorrected two-step standard errors are unreliable.
    
    Instruments for first differences equation
      GMM-type (missing=0, separate instruments for each period unless collapsed)
        L(2/3).(L.credit_growth L.c_sscb L.c_cap_req L.c_ltv_cap L.c_rr
        L.c_exposition L.interest_rate L.GDP_growth) collapsed
    ------------------------------------------------------------------------------
    Arellano-Bond test for AR(1) in first differences: z =  -0.77  Pr > z =  0.440
    Arellano-Bond test for AR(2) in first differences: z =  -1.13  Pr > z =  0.257
    ------------------------------------------------------------------------------
    Sargan test of overid. restrictions: chi2(8)    =  18.29  Prob > chi2 =  0.019
      (Not robust, but not weakened by many instruments.)
    Hansen test of overid. restrictions: chi2(8)    =  10.57  Prob > chi2 =  0.227
      (Robust, but weakened by many instruments.)
    
    
    .
    end of do-file
    
    . estimates table, star(.1 .05 .01) style(oneline)
    
    ------------------------------
        Variable |    active      
    -------------+----------------
    credit_gro~h |
             L1. | -.27866862    
                 |
          c_sscb |
             L1. |  -2.977535    
                 |
       c_cap_req |
             L1. | -.23395735    
                 |
       c_ltv_cap |
             L1. |   5.151175    
                 |
            c_rr |
             L1. | -1.5949469    
                 |
    c_exposition |
             L1. | -.82473802    
                 |
    interest_r~e |
             L1. |  .19852941    
                 |
      GDP_growth |
             L1. | -.05540809    
    ------------------------------
    legend: * p<.1; ** p<.05; *** p<.01
    I read that this issue may come from multicollinearity. So I ran a multicollinearity diagnostic, but because the VIF is not available after xtabond2, I used ''regress'' instead of ''xtabond2''. Here are the results:

    Code:
    regress credit_growth L1.c_sscb L1.c_cap_req L1.c_ltv_cap L1.c_rr L1.c_exposition L1.interest_rate
    >  L1.GDP_growth L1.banking_crisis
    estat vif
    
          Variable |       VIF       1/VIF
    ---------------+----------------------
    banking_crisis |
               L1. |      1.25    0.803007
         c_cap_req |
               L1. |      1.24    0.807731
      c_exposition |
               L1. |      1.20    0.831029
            c_sscb |
               L1. |      1.18    0.849848
     interest_rate |
               L1. |      1.14    0.879999
              c_rr |
               L1. |      1.14    0.881049
        GDP_growth |
               L1. |      1.11    0.903970
         c_ltv_cap |
               L1. |      1.02    0.976797
    ---------------+----------------------
          Mean VIF |      1.16
    1) Is this a good way to perform statistical tests when using GMM? I have the same question for the heteroskedasticity test.

    2) Apparently there is no multicollinearity (the VIF is very low for each variable). What am I supposed to do to obtain strong coefficients?

    3) If several regressions with different lags pass the Sargan/Hansen test and the AR test, what criterion is used to choose the best one? The one with the highest number of instruments?

    4) With fixed-effects models, some variables are highly significant, as they are when I use xtabond2 without the ''robust'' option. When I add this option, no variables are significant. My understanding is that the ''robust'' option allows for heteroskedasticity. In this particular case, is heteroskedasticity an assumption, or do I have to test for it? (As with my first question, I can't test for it after xtabond2.)

    5) Finally, I usually go to Statistics → Postestimation → Manage estimation results → Table of estimation results to obtain the significance of coefficients with stars. Is it possible to get that directly in the regression results?

    I'm a little lost with the choice of options, especially the ''orthogonal'' option; I don't know whether I can use it.

    Here is a sample of my data:
    Code:
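    * Example generated by -dataex-. To install: ssc install dataex
    clear
    * The original -input- line is missing from the post. The name "quarter" for the
    * first column and all storage types below are assumptions.
    input str6 quarter float credit_growth byte(c_sscb c_cap_req c_ltv_cap c_rr c_exposition) float(interest_rate GDP_growth) byte banking_crisis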
    "2000q1"          0 0 0 0 -1 1   3.5423    .93956 0
    "2000q2"   1.322812 0 0 0 -1 1    4.263  1.013441 0
    "2000q3"   .6646443 0 0 0 -1 1   4.7376  -.141768 0
    "2000q4"  1.7269744 0 0 0 -1 1 5.024167   .076452 0
    "2001q1"   .4631569 0 0 0 -1 1 4.745033  1.634062 0
    "2001q2"    .224378 0 0 0 -1 1 4.590766   .085889 0
    "2001q3"  1.0374751 0 0 0 -1 1 4.267833  -.289637 0
    "2001q4"  1.0920715 0 0 0 -1 1   3.4435   .118342 0
    "2002q1"  .24551354 0 0 0 -1 1 3.362233  -.308947 0
    "2002q2"   .8305011 0 0 0 -1 1    3.446   .258706 0
    "2002q3"  1.6563503 0 0 0 -1 1 3.357333   .473059 0
    "2002q4"    .241526 0 0 0 -1 1   3.1088  -.214017 0
    "2003q1"  1.4132304 0 0 0 -1 1   2.6831 -1.211775 0
    "2003q2"  1.1803162 0 0 0 -1 1   2.3619   .021715 0
    "2003q3" -.19664878 0 0 0 -1 1 2.139233   .499232 0
    "2003q4"   .3575432 0 0 0 -1 1 2.149633   .367172 0
    "2004q1"   .7381726 0 0 0 -1 1 2.062967  -.024217 0
    "2004q2"  -.3454666 0 0 0 -1 1 2.082467   .344379 0
    "2004q3"   .4714422 0 0 0 -1 1   2.1163  -.182331 0
    "2004q4"  .35406515 0 0 0 -1 1   2.1636   .096709 0
    end
    label var GDP_growth "taux_de_croissance_PIB_reel"
    So, it's a very long post with a lot of questions, and I think I forgot some, so the title is not very accurate. Sorry for that, but I did a lot of research, especially on this forum.
    Thank you in advance for your answers.


    References :

    Cerutti, E., S. Claessens and L. Laeven. 2015. “The Use and Effectiveness of Macroprudential Policies: New Evidence.” IMF Working Paper WP/15/61.
    Cerutti, E., R. Correa, E. Fiorentino, and E. Segalla. 2017. “Changes in Prudential Policy Instruments—A New Cross-Country Database.” International Journal of Central Banking 13 (S1).
    Roodman, D. 2009. "How to do xtabond2: An introduction to difference and system GMM in Stata." Stata Journal 9(1): 86-136.
    Roodman, D. 2009. "A note on the theme of too many instruments." Oxford Bulletin of Economics and Statistics 71: 135-158.
    Mileva, E. 2007. "Using Arellano-Bond Dynamic Panel GMM Estimators in Stata: Tutorial with Examples using Stata 9.0 (xtabond and xtabond2)." Economics Department, Fordham University, July 9, 2007.

  • #2
    Your "small N, large T" setup is not just a problem when it comes to instrument proliferation (which you solve by using the collapse option). More importantly, the consistency of the estimator and the distributional properties of the test statistics are obtained under "large N, small T" asymptotics. In your case, you are running into several problems that cannot really be solved within the GMM context given the dimensions of your data set:
    • The twostep estimator essentially clusters the moment conditions at the group level. With your relatively small number of clusters, the estimates of the optimal weighting matrix become very imprecise and essentially unreliable.
    • The same argument applies to robust standard errors.
    • The Hansen test relies on an optimal weighting matrix. Hence, the same concerns apply again.
    • The Sargan test relies on the absence of serial correlation in the idiosyncratic error term and homoskedasticity. This is a strong assumption but the only assumption you can work with in the context of your data set. Also, it is not valid in the context of the system GMM estimator, in case you were planning to extend the analysis in that direction.
    To sum up, you should not use the twostep and the robust options. You should use the collapse and the noleveleq options (as you have done).
    https://www.kripfganz.de/stata/



    • #3
      Thank you for your answer, Sebastian.

      First, I do not plan to use the system GMM estimator.

      Here are the new results:

      Code:
       xtabond2 credit_growth L1.credit_growth L1.c_sscb L1.c_cap_req L1.c_ltv_cap L1.c_rr L1.c_exposition
      > L1.interest_rate L1.GDP_growth L1.banking_crisis, gmmstyle(L1.credit_growth L1.c_sscb L1.c_cap_req
      > L1.c_ltv_cap L1.c_rr L1.c_exposition L1.interest_rate L1.GDP_growth L1.banking_crisis,
      > lag(2 3) collapse) noleveleq
      Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.
      
      Dynamic panel-data estimation, one-step difference GMM
      ------------------------------------------------------------------------------
      Group variable: num                             Number of obs      =      1007
      Time variable : q_date                          Number of groups   =        21
      Number of instruments = 18                      Obs per group: min =        37
      Wald chi2(9)  =      9.56                                      avg =     47.95
      Prob > chi2   =     0.387                                      max =        58
      --------------------------------------------------------------------------------
       credit_growth |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      ---------------+----------------------------------------------------------------
       credit_growth |
                 L1. |   .0777834    .229739     0.34   0.735    -.3724967    .5280635
                     |
              c_sscb |
                 L1. |   -3.67845   1.668837    -2.20   0.028     -6.94931   -.4075894
                     |
           c_cap_req |
                 L1. |  -1.253732   1.231352    -1.02   0.309    -3.667138    1.159673
                     |
           c_ltv_cap |
                 L1. |   2.848392   2.720483     1.05   0.295    -2.483657    8.180442
                     |
                c_rr |
                 L1. |   .5770481   1.645909     0.35   0.726    -2.648874     3.80297
                     |
        c_exposition |
                 L1. |    .193002   1.398424     0.14   0.890    -2.547859    2.933863
                     |
       interest_rate |
                 L1. |  -.1184319   .2067701    -0.57   0.567    -.5236939    .2868301
                     |
          GDP_growth |
                 L1. |  -.1579743   .4710919    -0.34   0.737    -1.081297    .7653487
                     |
      banking_crisis |
                 L1. |     1.5616   1.431815     1.09   0.275    -1.244706    4.367906
      --------------------------------------------------------------------------------
      Instruments for first differences equation
        GMM-type (missing=0, separate instruments for each period unless collapsed)
          L(2/3).(L.credit_growth L.c_sscb L.c_cap_req L.c_ltv_cap L.c_rr
          L.c_exposition L.interest_rate L.GDP_growth L.banking_crisis) collapsed
      ------------------------------------------------------------------------------
      Arellano-Bond test for AR(1) in first differences: z =  -2.05  Pr > z =  0.040
      Arellano-Bond test for AR(2) in first differences: z =   0.90  Pr > z =  0.371
      ------------------------------------------------------------------------------
      Sargan test of overid. restrictions: chi2(9)    =  26.54  Prob > chi2 =  0.002
        (Not robust, but not weakened by many instruments.)
      
      
      . 
      end of do-file
      
      . estimates table, star(.1 .05 .01) style(oneline)
      
      ------------------------------
          Variable |    active      
      -------------+----------------
      credit_gro~h |
               L1. |  .07778339     
                   |
            c_sscb |
               L1. | -3.6784495**   
                   |
         c_cap_req |
               L1. | -1.2537323     
                   |
         c_ltv_cap |
               L1. |  2.8483922     
                   |
              c_rr |
               L1. |   .5770481     
                   |
      c_exposition |
               L1. |  .19300195     
                   |
      interest_r~e |
               L1. | -.11843188     
                   |
        GDP_growth |
               L1. | -.15797434     
                   |
      banking_cr~s |
               L1. |     1.5616     
      ------------------------------
      legend: * p<.1; ** p<.05; *** p<.01
      .
      According to the Sargan test, the model is not valid. In addition, it is strange that only one variable has an impact on the dependent variable. As I said previously, L.credit_growth and L.banking_crisis are supposed to have a big influence. Maybe something is wrong with the data; I don't know what to do. Should I give up the GMM estimator? Should I reduce "T" by using annual frequency instead?

      If I understand correctly, I can't run any tests. So the way I obtained the VIF is not valid?
      Finally, is it possible to get the stars (i.e., the significance of coefficients) directly, instead of having to use the postestimation menu?

      Thanks,
      Raphaël




      • #4
        Reducing T certainly does not help. The small N is the limiting factor.

        You might indeed have to rethink your whole model. Starting with the assumption that all variables are endogenous is quite challenging already. Is there any meaningful interpretation of the coefficients in such a model? I am not familiar with the particular applied literature, so cannot comment further in that regard.

        I cannot say anything about the VIF, sorry.
        https://www.kripfganz.de/stata/



        • #5
          Thank you again for your answer.

          In fact, N refers to European countries, but because I have difficulty finding all the data (searching for quarterly data is quite challenging), I have only N = 21. At best, N = 27 (Cyprus excluded).
          I guess it doesn't change much, but annual data are easier to find.

          I started to think about which variables are endogenous or exogenous, and then I read this in Cerutti et al. (2015): "The estimates are determined using Arellano-Bond GMM treating the instrument and the control variables of credit growth, GDPgrowth, the crisis dummy, and the policy rate as endogeneous."
          In addition to that, macroprudential measures are used to avoid excessive or insufficient credit growth.
          When the credit growth is excessive, these measures are tightened (+1 in the database), and they are loosened (-1 in the database) when the credit growth is insufficient.
          So we expect:
          • a positive impact of L.credit_growth, because there is some persistence in credit developments;
          • a negative impact of the macroprudential measures;
          • a positive impact of L.real_GDP, because there is more credit when economic conditions are good;
          • a negative impact of L.banking_crisis, for the same reasons as before;
          • a negative impact of L.interest_rate, because the higher the interest rate, the more expensive credit becomes.



          • #6
            I'm still stuck, and I don't think a fixed-effects model would be a good alternative.



            • #7
              Hello everyone,

              I know that bumping is unpopular here, but I really don't know how I can improve my questions. Sorry for that.

              So I bump this thread one last time.
