
  • Interpreting PPMLHDFE Regression Outputs, Including Pseudo R2

    Hello Stata Community,

    I'm currently working with a PPMLHDFE regression model and have some questions about interpreting the results, specifically regarding the Pseudo R2 statistic and other outputs.

    1. Pseudo R2: Firstly, why is it termed 'Pseudo' R2 in this context? Does it differ significantly from the traditional R2 used in linear regressions?

    2. Comparison with Simple Regression: Is the Pseudo R2 comparable to the correlation result in a simple regression model, or are there additional factors to consider?

    3. Log Pseudolikelihood: I'm also trying to grasp the meaning of the 'Log Pseudolikelihood' result. How should this be interpreted in the context of a PPMLHDFE model?

    4. Deviance and Wald Chi2: Furthermore, I would appreciate insights into the 'Deviance' result and the 'WALD CHI2(8)' statistic. What do these indicate about the model and its fit?

    Thank you for your time and assistance.

    Best regards,

    George

  • #2
    Dear George Mane,

    1. Yes, it differs significantly and is essentially meaningless, especially if your dependent variable is not a count, so avoid it.
    2. If you want a measure of goodness of fit, get the fitted values (be careful because you need the option d when you estimate and the option xbd when you predict) and compute y_hat as the exponential of that. Check that the sums of y and y_hat are the same and, if so, compute the correlation between the two. The square of that is comparable to the R2 of a linear model, but does not have the usual interpretation as the proportion of explained variance. Note, however, that all these measures of goodness-of-fit are generally of very little interest.
    3. That is just the value of the objective function at the estimates; it does not have a particular interpretation.
    4. I would just ignore those.
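
The arithmetic in point 2 can be sketched outside Stata as follows. This is only a minimal illustration in Python, not ppmlhdfe output: the function name `goodness_of_fit` and the toy data are hypothetical. The toy linear predictor corresponds to a Poisson model with group fixed effects only, where each fitted value equals its group's mean, so the sums of y and y_hat should agree.

```python
import math

def goodness_of_fit(y, xbd):
    """Return (sums_match, squared_correlation) for y and exp(xbd).

    y   : observed outcomes
    xbd : full linear predictor (x*b plus the absorbed fixed effects)
    """
    # Fitted values are the exponential of the linear predictor.
    y_hat = [math.exp(v) for v in xbd]

    # Sanity check from the post: sum of y should equal sum of y_hat.
    sums_match = math.isclose(sum(y), sum(y_hat), rel_tol=1e-6)

    # Squared Pearson correlation between y and y_hat.
    n = len(y)
    mean_y = sum(y) / n
    mean_f = sum(y_hat) / n
    cov = sum((a - mean_y) * (b - mean_f) for a, b in zip(y, y_hat))
    var_y = sum((a - mean_y) ** 2 for a in y)
    var_f = sum((b - mean_f) ** 2 for b in y_hat)
    return sums_match, cov ** 2 / (var_y * var_f)

# Hypothetical toy data: two groups with means 2 and 6.
y = [1.0, 3.0, 4.0, 8.0]
xbd = [math.log(2), math.log(2), math.log(6), math.log(6)]

sums_match, r2_like = goodness_of_fit(y, xbd)
print(sums_match, round(r2_like, 3))  # prints: True 0.615
```

If the sums do not match, the fitted values were probably built without the fixed effects, which is the situation the warning about the d and xbd options is meant to prevent.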

    Best wishes,

    Joao



    • #3
      Dear Joao,

      Thank you for your detailed and insightful response. Your explanation, especially about the limited relevance of the Pseudo R2 in this context, was particularly useful. I appreciate the advice on alternative ways to measure goodness of fit and the cautionary note on the use of fitted values.

      Thank you once again for taking the time to clarify these points. If, in the future, you have further insights into the usefulness of any of the metrics behind the PPMLHDFE command, they are more than welcome.

      Best wishes,

      George



      • #4
        Dear Joao Santos Silva,

        Thank you once again for your assistance. I have a couple of further queries regarding the interpretation of the PPMLHDFE model's results.

        Firstly, are the significance levels (p-values) interpreted in the same manner as in ordinary least squares regression?
        Additionally, I'm curious about the Robust Standard Errors – would they be interpreted similarly as well?

        My initial assumption is that there wouldn't be significant differences, but I wanted to confirm if there are any nuances I should be aware of.

        Best regards,

        George



        • #5
          Dear George Mane,

          By default, ppmlhdfe produces robust standard errors; if you want, you can cluster as well. Interpretation of standard errors, p-values, etc., is just like in OLS.

          Best wishes,

          Joao



          • #6
            Dear Joao,

            Thank you so much. This has been very helpful.

            Best regards,

            George
