  • Non-Convergence in PPML Gravity Model

    Hi Stata users,

    I'm trying to run a gravity model on a panel of 43 countries and 15 years using the World Input-Output Database (WIOD), but I'm having some issues with convergence. When I estimate models by sector, with a number of standard gravity variables (e.g. distance, common language, etc.), the model converges for some sectors but not for all. My preference for estimating PPML is to use the glm command with the Poisson family specified (its syntax is similar to reg, which I use for estimating in logarithms). I've also run the model with the ppml_panel_sg and poi2hdfe commands and found the same result.
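    For concreteness, a sketch of the kind of glm-based PPML call described above (the variable names — trade, ln_dist, contig, comlang, exp, imp, year — are hypothetical placeholders, not from the thread):

    ```stata
    * Sketch of PPML via glm with fixed effects entered as factor variables
    * (all variable names here are hypothetical placeholders)
    glm trade ln_dist contig comlang i.exp#i.year i.imp#i.year, ///
        family(poisson) vce(robust)
    ```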

    I'm aware of the issues identified by Joao Santos Silva and Silvana Tenreyro (https://www.sciencedirect.com/scienc...832?via%3Dihub): non-convergence can be caused by complete separation of the variables. I've tried using the test they specify to identify and remove the separated problem variables, but it suggests that none of the variables are separated (collinear for tradeflow>0). The same test is run by the ppml_panel_sg command; it didn't identify any non-existence issues either, but the estimation still failed to converge.
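    One simple way to run that kind of collinearity check on the positive-flow subsample is Stata's built-in _rmcoll (a sketch only, with placeholder variable names — this is not the full Santos Silva–Tenreyro diagnostic):

    ```stata
    * Sketch: check for regressors that become collinear once the sample is
    * restricted to positive trade flows (placeholder variable names)
    _rmcoll ln_dist contig comlang if tradeflow > 0
    ```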

    Does anyone have experience with a model that still fails to converge? I've looked for other posts here and for academic papers on the problem but have had no joy. Alternatively, are there situations where the proposed test for separation doesn't identify the issue?

    I use Stata 15 SE on a Windows computer.

    Many thanks for your help!
    Elliot

  • #2
    Dear Elliot Delahaye,

    That is very strange; please try the command ppmlhdfe and do let us know what happens.
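    A minimal sketch of installing and running ppmlhdfe for a specification like the one described (the exporter-year/importer-year fixed effects and the variable names are assumptions for illustration, not from the thread):

    ```stata
    * Sketch: install ppmlhdfe and its dependencies, then estimate
    * (exp, imp, year, trade, ln_dist, etc. are hypothetical names)
    ssc install ftools
    ssc install reghdfe
    ssc install ppmlhdfe
    ppmlhdfe trade ln_dist contig comlang, absorb(exp#year imp#year) vce(robust)
    ```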

    Best wishes,

    Joao



    • #3
      Thanks for the reply Joao Santos Silva,

      Apologies for my slight delay, my Stata is on a secure server and so it took a while to install the new command!

      That solves the problem perfectly, thank you! I hadn't noticed that the new command had been released. It gives a warning that the dependent variable has very small values after normalisation (7.6466e-16), but that was also the case for other sectors that had worked with the glm command, so I don't know if that's normal or something to worry about. The "min(eta)" turns red, and it also says that epsilon is "below tolerance". Not being familiar with the command yet, I'm not sure what that means?

      Can I ask a quick follow-up: what is the difference between the standard errors computed by the different commands? They all give different standard errors and t-stats. I assume the difference for the glm command (using factor variables for the FEs) is that it doesn't adjust the d.f. for the FEs, but the estimates also differ between ppmlhdfe and ppml_panel_sg, which I assume both use the within-FE variation and so the same d.f. adjustment. Both report robust std. errors, I believe; is it just that one is more accurate? The difference can amount to a t-stat of over 20.

      Elliot



      • #4
        Dear Elliot Delahaye,

        I am not the author of those commands and therefore I do not know the reason for this; maybe Tom Zylkin can help?

        Best wishes,

        Joao



        • #5
          Hi Elliot,
          What options are you specifying for the standard errors in each case? If, for example, you are specifying clustered standard errors, there may be minor numerical differences (or in the case of ppmlhdfe there could be a difference because it does a better job of calculating the number of observations that are neither perfectly predicted nor singletons). However, there would obviously be a difference if one command is reporting robust standard errors whereas another is reporting clustered standard errors.
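          For example, one way to put the commands on the same footing is to request the same variance estimator explicitly in each (the pair identifier and variable names here are purely illustrative):

          ```stata
          * Illustrative only: request the same variance estimator explicitly
          egen pair_id = group(exp imp)    // hypothetical country-pair identifier
          ppmlhdfe trade ln_dist, absorb(exp#year imp#year) vce(robust)
          ppmlhdfe trade ln_dist, absorb(exp#year imp#year) vce(cluster pair_id)
          ```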

          By the way, I'm pleased to hear ppmlhdfe solved your problem. That was one of the big things we invested in when we made this command. If you're interested, we (Sergio Correia, Paulo Guimaraes, and I) wrote a paper that explains our methods and extends some of the earlier results from the Santos Silva and Tenreyro paper you mention. There is also a shorter version here.

          Regards,
          Tom

          PS: don't worry about the warning.



          • #6
            Thanks Joao and hi Tom,

            In both cases, I'm using the default robust std. errors (the help files don't indicate that these are clustered). Does that mean that the only remaining explanation for the difference is greater accuracy?

            Thanks for the further detail on your papers! I'm finding that some of my models drop a country because it is a singleton or separated by a FE. Is this something I can improve with changes to the solver algorithm, or is it best just to accept the omission?

            Thanks,
            Elliot



            • #7
              Hi Elliot,

               For ppml_panel_sg, check the notes below the results table; these should indicate whether your SEs are clustered.

              I would not worry about the singletons or about the perfectly predicted observations. These should not be affecting identification of any of the variables you care about.

              Regards,
              Tom



              • #8
                Hi,
                Can I use the PPML model if my dependent variable is an index (a relative trade share) instead of simple bilateral trade flow data? Also, it is not a gravity model.

