  • Interpreting estimated parameters using ppmlhdfe

    Dear All, I am new to this (count) model/method/command (please search for ppmlhdfe and install it). The key paper is "Fast Poisson estimation with high-dimensional fixed effects", Stata Journal, 20(1), 95-115 (2020, by Correia, Guimaraes, and Zylkin). My simple question is how to interpret the coefficients estimated by ppmlhdfe. Consider the following example (using the accompanying data "citations_example.dta"):
    Code:
    use citations_example, clear
    ppmlhdfe cit nbaut, absorb(issn type jel2 pubyear)
    The result is as follows (the dependent variable "cit" is the number of citations, and the key explanatory variable "nbaut" is the number of authors of an article):
    Code:
    . use citations_example, clear
    
    . ppmlhdfe cit nbaut, absorb(issn type jel2 pubyear)
    Iteration 1:   deviance = 2.6721e+06  eps = .         iters = 6    tol = 1.0e-04  min(eta) =  -3.58  P   
    Iteration 2:   deviance = 2.4118e+06  eps = 1.08e-01  iters = 5    tol = 1.0e-04  min(eta) =  -4.71      
    Iteration 3:   deviance = 2.3984e+06  eps = 5.57e-03  iters = 4    tol = 1.0e-04  min(eta) =  -5.78      
    Iteration 4:   deviance = 2.3982e+06  eps = 9.09e-05  iters = 3    tol = 1.0e-04  min(eta) =  -6.45      
    Iteration 5:   deviance = 2.3982e+06  eps = 6.40e-06  iters = 3    tol = 1.0e-05  min(eta) =  -7.27      
    Iteration 6:   deviance = 2.3982e+06  eps = 9.18e-07  iters = 3    tol = 1.0e-06  min(eta) =  -7.99      
    Iteration 7:   deviance = 2.3982e+06  eps = 1.33e-07  iters = 3    tol = 1.0e-07  min(eta) =  -8.39   S  
    Iteration 8:   deviance = 2.3982e+06  eps = 5.93e-09  iters = 2    tol = 1.0e-07  min(eta) =  -8.51   S  
    Iteration 9:   deviance = 2.3982e+06  eps = 1.82e-11  iters = 2    tol = 1.0e-08  min(eta) =  -8.51   S  
    Iteration 10:  deviance = 2.3982e+06  eps = 1.38e-16  iters = 3    tol = 1.0e-09  min(eta) =  -8.51   S O
    ------------------------------------------------------------------------------------------------------------
    (legend: p: exact partial-out   s: exact solver   h: step-halving   o: epsilon below tolerance)
    Converged in 10 iterations and 34 HDFE sub-iterations (tol = 1.0e-08)
    
    HDFE PPML regression                              No. of obs      =    1083701
    Absorbing 4 HDFE groups                           Residual df     =    1083389
                                                      Wald chi2(1)    =    3789.26
    Deviance             =  2398153.725               Prob > chi2     =     0.0000
    Log pseudolikelihood = -1714495.906               Pseudo R2       =     0.2031
    ------------------------------------------------------------------------------
                 |               Robust
             cit | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
           nbaut |   .1896705   .0030812    61.56   0.000     .1836314    .1957096
           _cons |   .0544407   .0058841     9.25   0.000     .0429081    .0659733
    ------------------------------------------------------------------------------
    
    Absorbed degrees of freedom:
    -----------------------------------------------------+
     Absorbed FE | Categories  - Redundant  = Num. Coefs |
    -------------+---------------------------------------|
            issn |       170           0         170     |
            type |         4           1           3     |
            jel2 |       124           1         123    ?|
         pubyear |        16           1          15    ?|
    -----------------------------------------------------+
    ? = number of redundant parameters may be higher
    Could someone kindly tell me what the coefficient .1896705 means? In addition, suppose that I have a dummy variable "male" with an estimated coefficient of -0.12; how do I interpret it in this case? Thanks.
    Ho-Chuan (River) Huang
    Stata 19.0, MP(4)

  • #2
    Dear River Huang,

    The interpretation of the coefficients in a Poisson regression is exactly as in a linear model where the dependent variable is in logs. So, if the regressor is also in logs, the coefficient is an elasticity; otherwise it is a semi-elasticity. In your case, the coefficient means that increasing the number of authors by 1 increases the expected number of citations by 100*(exp(0.19) - 1)% = 20.9%. Likewise, a coefficient of -0.12 on the male dummy means that the expected number of citations changes by 100*(exp(-0.12) - 1)% = -11.3%, that is, it is 11.3% lower when the dummy is one. Naturally, this is all ceteris paribus.
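
    The arithmetic above can be checked directly in Stata. This is only a minimal sketch: it assumes ppmlhdfe is installed and citations_example.dta is in the working directory, and the label pct_nbaut is a made-up name used for illustration. The -0.12 value is the hypothetical dummy coefficient from the question, not an estimate from this dataset.

    ```stata
    * Re-run the model from post #1 so that _b[nbaut] is available
    use citations_example, clear
    ppmlhdfe cit nbaut, absorb(issn type jel2 pubyear)

    * Percentage change in expected citations for one additional author
    display 100*(exp(_b[nbaut]) - 1)

    * Same quantity, with a delta-method standard error and confidence interval
    nlcom (pct_nbaut: 100*(exp(_b[nbaut]) - 1))

    * Hypothetical dummy coefficient of -0.12 (typed in by hand, not estimated)
    display 100*(exp(-0.12) - 1)
    ```

    The two display lines should give roughly the 20.9 and -11.3 figures quoted above. nlcom is used here because the percentage effect is a nonlinear function of the coefficient, so its standard error cannot be read off the regression table.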

    Best wishes,

    Joao


    • #3
      Dear Joao, Thanks a lot. The interpretation is clear and really helpful.
      Ho-Chuan (River) Huang
      Stata 19.0, MP(4)


      • #4
        Hi Joao Santos Silva, I also have ppmlhdfe results. But what is the "exp" in the formula you mention?


        • #5
          See
          Code:
          help exp()


          • #6
            Dear Chiel Enk,

            That is the exponential function.

            Best wishes,

            Joao


            • #7
              Hi Joao Santos Silva. I was wondering about the interpretation when the dependent variable is a rate, for example the rate of students who fail the grade.


              • #8
                Dear Ander Tami,

                Please tell us more about your model.

                Best wishes,

                Joao


                • #9
                  Dear @Joao Santos Silva
                  I am using ppmlhdfe.
                  I am estimating a difference-in-differences (DID) model. The outcome variable is the rate of students who fail the grade. I am using ppmlhdfe because there are many zeros in this variable. The “Treatment” variable is 1 if the school received a treatment, and “Post” is a dummy variable that indicates whether the year is after the start of treatment. Therefore, the explanatory variable of interest is the interaction “Treatment X Post”. Let’s assume that the coefficient I get is 0.1686. How should I interpret this coefficient?
                  Thank you very much


                  • #10

                    Dear Ander Tami,

                    Please see if this paper helps:

                    https://www.degruyter.com/document/d...2016-0011/html

                    Best wishes,

                    Joao


                    • #11
                      Thank you very much, Joao Santos Silva
