  • Number of zero and ppmlhdfe

    Hello everyone, I’m estimating the effects of fiscal consolidation episodes, measured as a percentage of GDP, on bilateral inward flows to developing economies. My dependent variable contains approximately 90% zeros in the sample.
    When I run the regression excluding the zeros, I obtain a positive coefficient that is statistically significant at the 10% level. When I include the zeros, however, the coefficient turns negative and the p-value becomes very high. I'm using origin-year and country-pair fixed effects. I cluster at the destination-year level because clustering at the country-pair level prevents coefficient estimation.
    I would like to understand what might be causing these discrepancies.



    . ppmlhdfe in_Flow_per_r Fisc_r $control_jt_MR $control_ijt, vce(cl da) absorb(fepr fpt) nolog d keepsingletons sep(fe)
    warning: keeping singleton groups will keep fixed effects that cause separation
    warning: dependent variable takes very low values after standardizing (9.6738e-12)
    Converged in 20 iterations and 97 HDFE sub-iterations (tol = 1.0e-08)

    HDFE PPML regression No. of obs = 87,087
    Absorbing 2 HDFE groups Residual df = 956
    Statistics robust to heteroskedasticity Wald chi2(15) = 84.27
    Deviance = 1268.88905 Prob > chi2 = 0.0000
    Log pseudolikelihood = -1708.470875 Pseudo R2 = 0.7994

    Number of clusters (da) = 957
    (Std. err. adjusted for 957 clusters in da)
    -------------------------------------------------------------------------------
    | Robust
    in_Flow_per_r | Coefficient std. err. z P>|z| [95% conf. interval]
    --------------+----------------------------------------------------------------
    Fisc_r | -.0108548 .0337795 -0.32 0.748 -.0770614 .0553519
    fin_dev_r | -.0030926 .0116539 -0.27 0.791 -.0259337 .0197486
    inflation_r | .0218521 .0184559 1.18 0.236 -.0143209 .058025
    access_elec_r | -.0365178 .0113473 -3.22 0.001 -.0587581 -.0142774
    res_rents_r | -.0474482 .0127503 -3.72 0.000 -.0724383 -.0224582
    gross_debt_r | -.0113401 .0057079 -1.99 0.047 -.0225274 -.0001529
    gdp_growth_r | .0699072 .0186284 3.75 0.000 .0333963 .1064182
    remit_gdp_r | .0520429 .0353487 1.47 0.141 -.0172392 .121325
    log_GDP_r | 2.392849 1.144094 2.09 0.036 .1504654 4.635233
    Inst_qlt | -.9502582 .4980707 -1.91 0.056 -1.926459 .0259424
    CIT_r | .0392391 .0214041 1.83 0.067 -.0027121 .0811903
    MR | 3.948637 1.475418 2.68 0.007 1.056872 6.840402
    BIT | .707005 .3555214 1.99 0.047 .010196 1.403814
    RTA | .5260533 .3545719 1.48 0.138 -.1688948 1.221001
    InstDist | -.2619662 .3371992 -0.78 0.437 -.9228645 .3989322
    _cons | -89.65802 30.7708 -2.91 0.004 -149.9677 -29.34835
    -------------------------------------------------------------------------------

    Absorbed degrees of freedom:
    -----------------------------------------------------+
    Absorbed FE | Categories - Redundant = Num. Coefs |
    -------------+---------------------------------------|
    fepr | 8281 0 8281 |
    fpt | 1196 92 1104 |
    -----------------------------------------------------+

  • #2
    Dear Koko DIBLONI,

    There is something strange about what is going on here. First, you say that you are including origin-year and country pairs fixed effects. Why don't you also include destination-year fixed effects? Is it because you have a single destination? Also, if you are including pair fixed effects, how come distance does not drop? Also, do not use the keepsingletons and sep(fe) options. Finally, what do you mean when you say that you cluster at the destination-year level because clustering at the country-pair level prevents coefficient estimation?

    Best wishes,

    Joao



    • #3

      Dear Joao Santos Silva, thank you for your assistance.

      Regarding your first question, I do not use destination-year fixed effects because my variable of interest is defined at the destination-year level. Including such fixed effects would absorb the variation I aim to analyze.

      Secondly, the variable in question is not geographical distance but rather institutional distance, which varies over time.

      Also, when I do not include the options keepsingletons and sep(fe) in my regression, my sample size drops dramatically from 126,126 to 17,839 observations. Is this reduction normal?

      Finally, when I cluster at the country-pair level, I notice that in the small table at the bottom (Absorbed degrees of freedom), the number of coefficients (Num. Coefs) for fepr appears as 0 *. Could you clarify what this means?



      1) Regression 1: clustering at the country-pair level


      . ppmlhdfe in_Flow_per_r Fisc_r $control_jt $control_ijt MR , vce(cl fepr) absorb(fepr fpt) nolog
      (dropped 69248 observations that are either singletons or separated by a fixed effect)
      warning: dependent variable takes very low values after standardizing (4.3809e-12)
      $$ Stopping (no negative residuals); separation found in 0 observations (1 iterations and 25 subiterations)
      Converged in 16 iterations and 81 HDFE sub-iterations (tol = 1.0e-08)

      HDFE PPML regression No. of obs = 17,839
      Absorbing 2 HDFE groups Residual df = 1,722
      Statistics robust to heteroskedasticity Wald chi2(15) = 51.09
      Deviance = 1268.889047 Prob > chi2 = 0.0000
      Log pseudolikelihood = -1708.470874 Pseudo R2 = 0.7356

      Number of clusters (fepr) = 1,723
      (Std. err. adjusted for 1,723 clusters in fepr)
      -------------------------------------------------------------------------------
      | Robust
      in_Flow_per_r | Coefficient std. err. z P>|z| [95% conf. interval]
      --------------+----------------------------------------------------------------
      Fisc_r | -.0108548 .034523 -0.31 0.753 -.0785187 .0568092
      fin_dev_r | -.0030926 .0138351 -0.22 0.823 -.0302089 .0240237
      inflation_r | .0218521 .021187 1.03 0.302 -.0196737 .0633778
      access_elec_r | -.0365178 .0145321 -2.51 0.012 -.0650001 -.0080354
      res_rents_r | -.0474482 .0196069 -2.42 0.016 -.085877 -.0090195
      gross_debt_r | -.0113401 .0078073 -1.45 0.146 -.0266422 .0039619
      gdp_growth_r | .0699072 .0262637 2.66 0.008 .0184314 .121383
      remit_gdp_r | .0520429 .0445919 1.17 0.243 -.0353557 .1394415
      log_GDP_r | 2.392849 1.270034 1.88 0.060 -.0963719 4.88207
      Inst_qlt | -.9502582 .6139648 -1.55 0.122 -2.153607 .2530908
      CIT_r | .0392391 .0236642 1.66 0.097 -.007142 .0856201
      BIT | .707005 .3951512 1.79 0.074 -.067477 1.481487
      RTA | .5260533 .3956898 1.33 0.184 -.2494845 1.301591
      InstDist | -.2619662 .4400135 -0.60 0.552 -1.124377 .6004444
      MR | 3.948637 2.22782 1.77 0.076 -.4178101 8.315084
      _cons | -89.65802 36.30691 -2.47 0.014 -160.8183 -18.49779
      -------------------------------------------------------------------------------

      Absorbed degrees of freedom:
      -----------------------------------------------------+
      Absorbed FE | Categories - Redundant = Num. Coefs |
      -------------+---------------------------------------|
      fepr | 1723 1723 0 *|
      fpt | 1025 1 1024 |
      -----------------------------------------------------+



      2) Regression 2: clustering at the destination-year level


      . ppmlhdfe in_Flow_per_r Fisc_r $control_jt $control_ijt MR , vce(cl da) absorb(fepr fpt) nolog
      (dropped 69248 observations that are either singletons or separated by a fixed effect)
      warning: dependent variable takes very low values after standardizing (4.3809e-12)
      $$ Stopping (no negative residuals); separation found in 0 observations (1 iterations and 25 subiterations)
      Converged in 16 iterations and 81 HDFE sub-iterations (tol = 1.0e-08)

      HDFE PPML regression No. of obs = 17,839
      Absorbing 2 HDFE groups Residual df = 956
      Statistics robust to heteroskedasticity Wald chi2(15) = 84.27
      Deviance = 1268.889047 Prob > chi2 = 0.0000
      Log pseudolikelihood = -1708.470874 Pseudo R2 = 0.7356

      Number of clusters (da) = 957
      (Std. err. adjusted for 957 clusters in da)
      -------------------------------------------------------------------------------
      | Robust
      in_Flow_per_r | Coefficient std. err. z P>|z| [95% conf. interval]
      --------------+----------------------------------------------------------------
      Fisc_r | -.0108548 .0337795 -0.32 0.748 -.0770614 .0553519
      fin_dev_r | -.0030926 .0116539 -0.27 0.791 -.0259337 .0197486
      inflation_r | .0218521 .0184559 1.18 0.236 -.0143209 .058025
      access_elec_r | -.0365178 .0113473 -3.22 0.001 -.0587581 -.0142774
      res_rents_r | -.0474482 .0127503 -3.72 0.000 -.0724383 -.0224582
      gross_debt_r | -.0113401 .0057079 -1.99 0.047 -.0225274 -.0001529
      gdp_growth_r | .0699072 .0186284 3.75 0.000 .0333963 .1064182
      remit_gdp_r | .0520429 .0353487 1.47 0.141 -.0172392 .121325
      log_GDP_r | 2.392849 1.144094 2.09 0.036 .1504654 4.635233
      Inst_qlt | -.9502582 .4980707 -1.91 0.056 -1.926459 .0259424
      CIT_r | .0392391 .0214041 1.83 0.067 -.0027121 .0811903
      BIT | .707005 .3555214 1.99 0.047 .010196 1.403814
      RTA | .5260533 .3545719 1.48 0.138 -.1688948 1.221001
      InstDist | -.2619662 .3371992 -0.78 0.437 -.9228645 .3989322
      MR | 3.948637 1.475418 2.68 0.007 1.056872 6.840402
      _cons | -89.65802 30.7708 -2.91 0.004 -149.9677 -29.34835
      -------------------------------------------------------------------------------

      Absorbed degrees of freedom:
      -----------------------------------------------------+
      Absorbed FE | Categories - Redundant = Num. Coefs |
      -------------+---------------------------------------|
      fepr | 1723 0 1723 |
      fpt | 1025 92 933 |
      -----------------------------------------------------+








      • #4
        The first table did not display properly, so I’m reposting it here

        . ppmlhdfe in_Flow_per_r Fisc_r $control_jt $control_ijt MR , vce(cl fepr) absorb(fepr fpt) nolog
        (dropped 69248 observations that are either singletons or separated by a fixed effect)
        warning: dependent variable takes very low values after standardizing (4.3809e-12)
        $$ Stopping (no negative residuals); separation found in 0 observations (1 iterations and 25 subiterations)
        Converged in 16 iterations and 81 HDFE sub-iterations (tol = 1.0e-08)

        HDFE PPML regression No. of obs = 17,839
        Absorbing 2 HDFE groups Residual df = 1,722
        Statistics robust to heteroskedasticity Wald chi2(15) = 51.09
        Deviance = 1268.889047 Prob > chi2 = 0.0000
        Log pseudolikelihood = -1708.470874 Pseudo R2 = 0.7356

        Number of clusters (fepr) = 1,723
        (Std. err. adjusted for 1,723 clusters in fepr)

        -------------------------------------------------------------------------------
        | Robust
        in_Flow_per_r | Coefficient std. err. z P>|z| [95% conf. interval]
        --------------+----------------------------------------------------------------
        Fisc_r | -.0108548 .034523 -0.31 0.753 -.0785187 .0568092
        fin_dev_r | -.0030926 .0138351 -0.22 0.823 -.0302089 .0240237
        inflation_r | .0218521 .021187 1.03 0.302 -.0196737 .0633778
        access_elec_r | -.0365178 .0145321 -2.51 0.012 -.0650001 -.0080354
        res_rents_r | -.0474482 .0196069 -2.42 0.016 -.085877 -.0090195
        gross_debt_r | -.0113401 .0078073 -1.45 0.146 -.0266422 .0039619
        gdp_growth_r | .0699072 .0262637 2.66 0.008 .0184314 .121383
        remit_gdp_r | .0520429 .0445919 1.17 0.243 -.0353557 .1394415
        log_GDP_r | 2.392849 1.270034 1.88 0.060 -.0963719 4.88207
        Inst_qlt | -.9502582 .6139648 -1.55 0.122 -2.153607 .2530908
        CIT_r | .0392391 .0236642 1.66 0.097 -.007142 .0856201
        BIT | .707005 .3951512 1.79 0.074 -.067477 1.481487
        RTA | .5260533 .3956898 1.33 0.184 -.2494845 1.301591
        InstDist | -.2619662 .4400135 -0.60 0.552 -1.124377 .6004444
        MR | 3.948637 2.22782 1.77 0.076 -.4178101 8.315084
        _cons | -89.65802 36.30691 -2.47 0.014 -160.8183 -18.49779
        -------------------------------------------------------------------------------

        Absorbed degrees of freedom:
        -----------------------------------------------------+
        Absorbed FE | Categories - Redundant = Num. Coefs |
        -------------+---------------------------------------|
        fepr | 1723 1723 0 *|
        fpt | 1025 1 1024 |
        -----------------------------------------------------+

        * = FE nested within cluster; treated as redundant for DoF computation



        • #5
          Dear Koko DIBLONI,

          Thanks for the clarification.

          1. That is a problem. A "structural" gravity equation needs both importer-year and exporter-year fixed effects. If you cannot use both, you need to be very careful in interpreting your results.
          2. Ah! OK.
          3. That is normal; the observations that are dropped do not contribute to the estimates (notice that the estimates are the same in both cases). However, keeping the singletons makes the standard errors unreliable.
          4. You can ignore that (there is a note at the bottom explaining it) and cluster by pair, or distance (distance may be better).
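
          Points 3 and 4 can be checked with a short comparison. The sketch below reuses only the variable names, globals, and options already shown in this thread; the stored-estimate names keep_all and dflt are arbitrary labels chosen for illustration, and no output is shown because none was run here. If point 3 holds, the two coefficient columns should coincide while the observation counts and standard errors may differ.

          ```stata
          * Run from a do-file (/// continues a line).
          * Fit 1: keep singletons and skip all separation checks except fe.
          ppmlhdfe in_Flow_per_r Fisc_r $control_jt $control_ijt MR, ///
              absorb(fepr fpt) vce(cluster fepr) nolog keepsingletons sep(fe)
          estimates store keep_all

          * Fit 2: default handling, so singletons and separated
          * observations are dropped before estimation.
          ppmlhdfe in_Flow_per_r Fisc_r $control_jt $control_ijt MR, ///
              absorb(fepr fpt) vce(cluster fepr) nolog
          estimates store dflt

          * Side-by-side coefficients, standard errors, and sample sizes.
          estimates table keep_all dflt, b(%10.6f) se(%10.6f) stats(N)
          ```

          The second fit is the specification to report, since it avoids the unreliable standard errors mentioned in point 3.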

          Best wishes,

          Joao



          • #6
            Dear Joao Santos Silva

            Thank you very much for your helpful insights and guidance. I truly appreciate your support.

            Best regards,
            Koko



            • #7
              Dear Joao Santos Silva

              I would be very grateful if you could kindly explain why clustering at the distance level is considered preferable.

              Additionally, since I am not including all fixed effects in my model, does that mean I cannot identify the causal effect?

              These clarifications would be extremely helpful to me.

              Best regards,
              Koko



              • #8
                Dear Koko DIBLONI,

                If you cluster by distance, pairs (A,B) and (B,A) will be in the same cluster, as they should. If you cluster by country-pair, they will be in different clusters.
                If you do not include all the fixed effects, you cannot call your estimates "structural"; I would never call them causal.
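
                If the dataset has no distance variable to cluster on, one possible alternative is to build an undirected pair identifier so that (A,B) and (B,A) share a cluster. This is a minimal sketch, not the method used in the thread: origin_id and dest_id are hypothetical numeric country codes, and id_lo, id_hi, and pair_sym are illustrative names. It only matters if both directions of a pair actually appear in the sample.

                ```stata
                * origin_id and dest_id are ASSUMED numeric country identifiers;
                * replace them with the actual variables in the dataset.
                generate long id_lo = min(origin_id, dest_id)
                generate long id_hi = max(origin_id, dest_id)

                * One group per unordered pair: (A,B) and (B,A) get the same value.
                egen long pair_sym = group(id_lo id_hi)

                * Same specification as in #3, clustered on the undirected pair.
                ppmlhdfe in_Flow_per_r Fisc_r $control_jt $control_ijt MR, ///
                    absorb(fepr fpt) vce(cluster pair_sym) nolog
                ```

                With bilateral distance available, vce(cluster dist), where dist is whatever the distance variable is called, would achieve the same grouping as long as no two different pairs share exactly the same distance.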

                Best wishes,

                Joao



                • #9
                  Dear Joao Santos Silva,

                  Thank you,

                  Best regards,
                  Koko

