  • PPML estimation with firm-level data

    Hello everybody,

    I am estimating a PPML model using an annual Dutch firm-level dataset spanning 6 years to identify the role of so-called experience and spillover effects.
    The unit of observation is therefore a (Dutch) firm-export country pair.

    I have read up on PPML estimation, but I could not find much information on how to use the ppml command with firm-level data.
    The literature concentrates mainly on sector- or country-level data.
    My main question is whether PPML as implemented in Stata (ppml) is suited to firm-level data as well.
    I know that some authors argue that PPML is not ideal when zero trade flows are frequent, which is the case in my dataset (as I use firm-level data). See, for example, Burger, M., van Oort, F., & Linders, G.J. (2009), On the Specification of the Gravity Model of Trade: Zeros, Excess Zeros and Zero-inflated Estimation, Spatial Economic Analysis, 4:2, 167-190, DOI: 10.1080/17421770902834327. Link to this article: http://dx.doi.org/10.1080/17421770902834327

    My current estimation command includes country, sector and postal code fixed effects, each interacted with time, and is specified as follows:
    ppml trade ldist lgdp lgdppc [other control variables] countryyear_fe* sectoryear_fe* postalcodeyear_fe*, cluster(country)

    Thank you in advance for your input.





  • #2
    Dear Ruben,

    There is no problem at all with using -ppml- with firm data. Also, the validity of ppml does not depend on the percentage of zeros.
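
To illustrate the point about zeros, here is a minimal simulation sketch in Python (not Stata, and not the ppml command itself). It assumes a single regressor, an exponential conditional mean with made-up parameters, and roughly 60% zeros in the outcome. `simulate` and `fit_ppml` are hypothetical helper names, and the Newton-Raphson loop is just one simple way to solve the Poisson first-order conditions.

```python
import math
import random

random.seed(12345)

TRUE_B0, TRUE_B1 = 0.0, 0.5   # assumed "true" parameters, for illustration only
ZERO_SHARE = 0.6              # assumed probability of a zero trade flow


def simulate(n):
    """Draw (x, y) with E[y|x] = exp(b0 + b1*x) and about 60% zeros in y."""
    xs, ys = [], []
    for _ in range(n):
        x = random.gauss(0.0, 1.0)
        mu = math.exp(TRUE_B0 + TRUE_B1 * x)
        if random.random() < ZERO_SHARE:
            y = 0.0
        else:
            # Positive flows are scaled up so that the overall mean is still mu.
            y = random.expovariate((1.0 - ZERO_SHARE) / mu)
        xs.append(x)
        ys.append(y)
    return xs, ys


def fit_ppml(xs, ys, iterations=25):
    """Poisson pseudo-ML for E[y|x] = exp(b0 + b1*x) via Newton-Raphson."""
    b0 = math.log(sum(ys) / len(ys))  # start at the log of the sample mean
    b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            mu = math.exp(b0 + b1 * x)
            g0 += y - mu            # score with respect to b0
            g1 += (y - mu) * x      # score with respect to b1
            h00 += mu               # entries of the (negative) Hessian
            h01 += mu * x
            h11 += mu * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


xs, ys = simulate(20_000)
b0_hat, b1_hat = fit_ppml(xs, ys)
zero_share = sum(1 for y in ys if y == 0.0) / len(ys)
print(f"share of zeros: {zero_share:.2f}, slope estimate: {b1_hat:.3f}")
```

Because the conditional mean is correctly specified in this setup, the slope estimate should land close to the assumed value of 0.5 despite the large share of zeros.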

    Best regards,

    Joao



    • #3
      Dear Joao,

      Thank you for your prompt reply.
      I am happy to hear that ppml can be applied to firm-level data without problems. Still, do you believe the zero-inflated model proposed by Burger et al. (2009) improves the estimation (or is at least useful for comparison purposes)? Also, what is your opinion on the remarks that ppml estimation may suffer from overdispersion?

      Out of curiosity, how do you feel about the spatial filtering variant of ppml proposed by Tamás Krisztin & Manfred M. Fischer in 'The Gravity Model for International Trade: Specification and Estimation Issues'? Published in Spatial Economic Analysis, 10:4: http://dx.doi.org/10.1080/17421772.2015.1076575.

      That is, I would like to compare the ppml estimates with other model specifications as well. For now, I am thinking of the two-step Heckman sample selection correction as a comparison.

      Thank you again for your advice.
      Last edited by ruben vandenhengel; 25 Apr 2016, 14:54.



      • #4
        Dear Ruben,

        Overdispersion is meaningless outside the count data context: if you change the scale of your dependent variable, you change the relation between the variance and the mean. This is why the NegBin model is sensitive to the scale of the data, and I believe zero-inflated models suffer from the same problem. Therefore, anyone worried about overdispersion in this context simply does not understand the nature of the problem they are dealing with. There are other reasons that make these alternative approaches inadequate, but I do not think I need to go into them in detail.
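
The scale argument can be checked with a few lines of Python. This is only a sketch on made-up data: the trade values, the 60% zero share and the helper name `dispersion_ratio` are all illustrative assumptions. It compares the variance-to-mean ratio (which would be 1 for a Poisson count) for the same flows measured in euros and in thousands of euros.

```python
import random
import statistics

random.seed(0)

# Hypothetical trade values in euros: many zeros plus skewed positive flows.
trade_eur = [
    0.0 if random.random() < 0.6 else random.expovariate(1 / 50_000)
    for _ in range(10_000)
]
# The same flows expressed in thousands of euros.
trade_thousand_eur = [value / 1_000 for value in trade_eur]


def dispersion_ratio(values):
    """Variance-to-mean ratio of a sample."""
    return statistics.pvariance(values) / statistics.fmean(values)


ratio_eur = dispersion_ratio(trade_eur)
ratio_thousand = dispersion_ratio(trade_thousand_eur)
print(ratio_eur, ratio_thousand, ratio_eur / ratio_thousand)
```

Dividing the outcome by 1,000 divides the variance by 1,000,000 but the mean only by 1,000, so the ratio changes by a factor of 1,000. The measured "overdispersion" therefore depends on the unit of measurement rather than on any property of the data-generating process.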

        I was not aware of the Krisztin & Fischer paper that you mention, but I had a quick look and I am not convinced their main claim is correct. Anyway, I'll have to read the paper more carefully before being able to comment in detail.

        Best regards,

        Joao



        • #5
          Dear Joao,

          Thank you again for your reply. I concur that some alternative models are open to critique, e.g. for scale dependence.
          When I review the literature, it seems ppml is currently favored by many. Still, if I wanted to include an alternative estimator to compare the ppml results with, which model would you recommend? For now, I lean towards the two-step Heckman sample selection correction.

          If I may, I have one theoretical question on the ppml specification. Looking at equations (14) and (15) in your paper 'The Log of Gravity', it seems to me that theoretically this model cannot generate zero trade flows, since you take the exponential of the sum of the explanatory variables times their coefficients. Firstly, is my interpretation correct? If so, why is this not a problem when estimating the model?

          Best,
          Ruben



          • #6
            Dear Ruben,

            If you want to consider an alternative, go for the gamma model. The sample selection model relies on normality and especially on homoskedasticity; both will be badly violated with trade data.
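
For reference, the first-order conditions of the two pseudo-maximum likelihood estimators can be written side by side. The notation is an assumption made here for illustration: y_i is the trade flow, x_i the row vector of regressors, and the conditional mean is exp(x_i beta).

```latex
% Poisson PML: every observation gets the same weight
\sum_{i=1}^{n} \left[ y_i - \exp(x_i \hat{\beta}) \right] x_i = 0

% Gamma PML: residuals are divided by the fitted mean
\sum_{i=1}^{n} \left[ y_i - \exp(x_i \tilde{\beta}) \right] \exp(-x_i \tilde{\beta}) \, x_i = 0
```

Both sets of conditions are well defined when y_i = 0, and both estimators only require the conditional mean to be correctly specified. They differ in weighting: the gamma version gives relatively more weight to observations with a small fitted mean.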

            About the zeros: what is specified as an exponential function is the conditional expectation, and this indeed cannot be zero. However, this does not imply that y cannot be zero. Just think of the logit or probit: these functions cannot equal 0 or 1, but binary data can only be 0 or 1. The same goes for count data: y can be zero, but the mean is generally specified as an exponential function.

            Another question is whether we can generate non-count data with zeros and an exponential mean. Indeed we can, see here:

            Santos Silva, J.M.C. and Tenreyro, Silvana (2011), Further Simulation Evidence on the Performance of the Poisson Pseudo-Maximum Likelihood Estimator, Economics Letters, 112(2), pp. 220-222.
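
One way to see that such data can exist is to build each trade flow as the sum of a random number of positive shipments. The Python sketch below is only loosely in the spirit of that kind of simulation: the Poisson count, the chi-square(1) shipment sizes, the parameter values and the helper name `draw_poisson` are illustrative assumptions, not the design used in the paper.

```python
import math
import random

random.seed(2011)

B0, B1 = -0.5, 0.5  # assumed parameters of the exponential conditional mean


def draw_poisson(lam):
    """Knuth's multiplication method; adequate for the small means used here."""
    threshold = math.exp(-lam)
    count, product = 0, random.random()
    while product > threshold:
        count += 1
        product *= random.random()
    return count


mus, ys = [], []
for _ in range(20_000):
    x = random.gauss(0.0, 1.0)
    mu = math.exp(B0 + B1 * x)       # conditional mean, always positive
    shipments = draw_poisson(mu)     # number of shipments, can be zero
    # Each shipment is a chi-square(1) draw, which has mean 1,
    # so E[y | x] = E[shipments | x] * 1 = mu.
    y = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(shipments))
    mus.append(mu)
    ys.append(y)

zero_share = sum(1 for y in ys if y == 0) / len(ys)
mean_y = sum(ys) / len(ys)
mean_mu = sum(mus) / len(mus)
print(f"share of zeros: {zero_share:.2f}")
print(f"mean of y: {mean_y:.3f}, mean of exp(xb): {mean_mu:.3f}")
```

The outcome is continuous when positive, exactly zero whenever no shipment occurs, and its conditional mean is exp(xb), which is strictly positive for every observation.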

            Best regards,

            Joao



            • #7
              Dear Joao,

              Thank you again for your comments. I will definitely run the gamma model as a comparison. I see in the paper you referred to this seems to be the second-best model in terms of error terms reported.

              One final question is on the R-squared of PPML estimation with firm-level data. I have reviewed the literature but could find barely any papers employing PPML on firm-level data. As in the ones I did find, my R-squared is rather low compared to other estimators and to the R-squared reported for country-level estimations. Is this usually the case for PPML estimation at the firm level?

              Best,
              Ruben



              • #8
                Dear Ruben,

                It is not surprising that with firm-level data the R2 is lower; firms are more difficult to predict than countries. Anyway, the R2 is not very important...
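
A small Python sketch of why this happens, under stated assumptions: the R2 is taken to be the squared correlation between observed and fitted values (check the ppml help file for the exact definition it reports), the fitted values are set to the true conditional mean, and the two outcome series are made up. `squared_correlation` is a hypothetical helper.

```python
import math
import random

random.seed(8)


def squared_correlation(a, b):
    """Squared Pearson correlation between two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov * cov / (var_a * var_b)


# True conditional means, shared by both outcome series.
mu = [math.exp(0.5 * random.gauss(0.0, 1.0)) for _ in range(10_000)]

# "Aggregate-like" outcome: mild multiplicative noise with mean 1.
y_smooth = [m * random.uniform(0.5, 1.5) for m in mu]

# "Firm-like" outcome: 90% zeros, positive values scaled so the mean is still m.
y_lumpy = [
    0.0 if random.random() < 0.9 else random.expovariate(0.1 / m)
    for m in mu
]

r2_smooth = squared_correlation(y_smooth, mu)
r2_lumpy = squared_correlation(y_lumpy, mu)
print(f"R2 with mild noise: {r2_smooth:.3f}, R2 with lumpy data: {r2_lumpy:.3f}")
```

Both series have exactly the same conditional mean, so the model for the mean is equally "right" in both cases. The lumpy series simply has far more unexplained variation around that mean, which pushes the R2 down.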

                Best regards,

                Joao
