  • Random Effects Negative Binomial vs. Fixed Effects Poisson

    Dear Researchers,

    I have a dataset of social media posts created by individuals (i.e. authors) in the first six months of 2022, comprising both author-related and post-related variables.
    Posts can be shared over the social network, so I intend to examine the effect of author's gender on the number of shares each post receives.
    An author may have multiple social media posts created in this time period, so I tried panel models.

    To estimate the impact of gender (the variable "Female"), I have primarily used a random effects negative binomial count model with author's id as the panel variable but without a time variable (as the repeated observations are not measured at the same point in time). "Female" was statistically significant in this model.

    However, given the unobserved heterogeneity of authors and potential author-related omitted variables, I wanted to try fixed effect count models instead of a random effects count model.
    As suggested by several StataListers, I employed a Poisson Fixed Effects model, setting author's id as the panel variable and with no time variable.
    "Female" was NOT statistically significant this time.
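
    For reference, the two specifications described above would look roughly like the following. This is only a sketch: the variable names id, shares and female are assumed here for illustration and may differ from the names in the actual data set.
    Code:
    * Declare the panel structure with the author as panel variable and no time variable
    xtset id
    * Random-effects negative binomial model
    xtnbreg shares i.female, re
    * Fixed-effects (conditional) Poisson model; a regressor that is constant
    * within id is not identified here and would be expected to be omitted
    xtpoisson shares i.female, fe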

    As far as I know, in any fixed-effects model, time-invariant variables should not have identifiable effects. As a result, I expected author's gender ("Female") to be excluded from the regression results as it is time-invariant. However, my FE Poisson model results do include "Female".

    1. Could you please help me to understand why the variable "Female" is present in the Poisson FE regression results? Is this because I have not set a time variable along with the panel variable in xtset?
    2. How can I choose between RE Negative Binomial and FE Poisson when they do not provide qualitatively similar estimates?

    Thanks a lot for your guidance!
    Last edited by Priyanga Gunarathne; 18 Oct 2023, 10:15.

  • #2
    "As far as I know, in any fixed-effects model, time-invariant variables should not have identifiable effects. As a result, I expected author's gender ("Female") to be excluded from the regression results as it is time-invariant. However, my FE Poisson model results do include "Female"."
    Your understanding about fixed-effects models and time-invariant variables is correct. The inescapable conclusion is that in your data set, the sex variable is not time-invariant within author.

    This can arise in two ways. By far the likeliest is that you simply have errors in your data. The other possibility is that your data set includes one or more transgender people who transitioned (at least with respect to their declared gender) during the time they were under observation in your study.

    You can identify the non-time-invariant gender IDs as follows:
    Code:
    * Within each id, sort by gender and flag ids whose first and last values differ
    by id (gender), sort: gen byte time_variant = gender[1] != gender[_N]
    browse if time_variant
    Then you can decide how to manage the problem. It may be possible to fix some of the inconsistencies by assigning each affected author a presumptively correct time-invariant gender. You might need to exclude some or all of the time-variant authors if their gender is impossible to determine.

    Then go back to your random-effects model with the corrected data. It is true that your random-effects estimates may be inconsistent due to correlation between the error term and the model predictors. That is a limitation inherent to using those models with observational data. You can try to mitigate the problem by including additional covariates that might absorb that correlation, though you can never be certain you have finished the job.

    Another thought that occurs to me is that if you really believe in a Poisson or negative binomial model, you can aggregate your data up to one observation per author, totaling the number of shares of all the posts as the outcome variable and using a count of the number of posts as an exposure variable. If the model is correct, that approach will be valid, and analyzing just one observation per author will enable you to avoid the problems associated with panel data.
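
    The aggregation idea could be sketched roughly as follows. The variable name shares is an assumption (it does not appear in the thread), id and gender follow the code above, and the sketch presumes gender has already been made constant within id.
    Code:
    * Work on a temporary copy so the post-level data are restored afterwards
    preserve
    * One row per author: total shares, number of posts, and the (now constant) gender
    collapse (sum) total_shares = shares (count) n_posts = shares (first) gender, by(id)
    * Negative binomial model with the number of posts as the exposure
    nbreg total_shares i.gender, exposure(n_posts)
    restore
    Note that (count) tallies only posts with a non-missing shares value, which matches what (sum) adds up. Replacing nbreg with poisson gives the Poisson version of the same model.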


    • #3
      I agree with Clyde. One other possibility with your random-effects model is to group-mean center the time-varying variables and/or include the group means for these variables. This is sometimes referred to as the Mundlak approach, and it ensures that the within-group effects are consistent with those from a fixed-effects model. The between-group effects are still subject to correlations between the group-level error and the group-level predictors, though.
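
      A minimal sketch of the group-means variant, assuming hypothetical variable names: id for the author, shares for the outcome, female for gender, and post_length standing in for any time-varying (post-level) covariate. None of these names come from the original data set.
      Code:
      * Author-level mean of the time-varying covariate
      bysort id: egen mean_post_length = mean(post_length)
      xtset id
      * Random-effects model with both the covariate and its author-level mean
      xtnbreg shares i.female post_length mean_post_length, re
      In this setup the coefficient on post_length is read as the within-author effect, while female remains a between-author comparison.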


      • #4
        My deepest gratitude to you Clyde and Erik! I checked my data and found 2 authors with inconsistent gender values!
        After fixing it, my problems are solved!
        Thanks a lot for your wisdom!
