  • High R-square in ppml

    Dear Statalist community,

    I would like to clarify a few things about using ppml on gravity data.

    1. I am estimating a gravity model using patent data. I have a panel gravity dataset with 100,000 observations. My dependent variable is bilateral patent counts. The independent variables are GDP, economic globalization, and education for both the origin and destination countries. I have included year_*, origin_*, and dest_* dummies as fixed effects and clustered the standard errors by (dist). My R-square is close to 0.92, and with additional variables it increases to 0.97. This seems extremely high; should I worry about it?

    2. I have not declared the dataset as a panel using xtset countrypair year. Is this needed for ppml estimation?

    3. Is there a way to get the adjusted R-square after ppml?

    4. Instead of year, origin, and destination fixed effects, I have also tried year_* and origin_destination_* fixed effects (this created 5049 dummies), as well as origin_year_* and destination_year_* fixed effects. However, ppml then takes an extremely long time to estimate. Is this expected?

    5. I'm interested in doing some mediation and moderation analysis on the gravity data, but I am not sure whether ppml can be adapted for this. Is there a method for mediation and moderation analysis on gravity data when the dependent variable has a large share of zeros?
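
    For concreteness, the specifications described in points 1 and 4 might be sketched roughly as follows. This is only an illustration: the dependent and independent variable names (patents, lngdp_o, lngdp_d, glob_o, glob_d, educ_o, educ_d) are hypothetical placeholders, while the dummy stubs and the cluster variable (dist) are the ones mentioned above.

    ```stata
    * Sketch only: regressor names are hypothetical placeholders,
    * not the actual variable names in the dataset.

    * Point 1: year, origin, and destination dummies, clustered on dist
    ppml patents lngdp_o lngdp_d glob_o glob_d educ_o educ_d ///
        year_* origin_* dest_*, cluster(dist)

    * Point 4: year dummies plus origin-destination pair dummies
    * (the variant that produced 5049 dummies)
    ppml patents lngdp_o lngdp_d glob_o glob_d educ_o educ_d ///
        year_* origin_destination_*, cluster(dist)
    ```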

    I would appreciate your advice.


  • #2
    This topic was posted earlier in the General Forum, which is the appropriate location for these questions, since they seem to have little to do with the Mata language. Replies should be posted to the topic at that location.