  • Choosing Between reghdfe and Mixed Model for Nested Ratings

    I’m working with a dataset of 4,455 individuals, each of whom rated a random selection of 10 out of 40 possible foods. For each food, participants rated:
    • Perceived healthfulness (0–10)
    • Perceived processing (0–10)
    I want to examine the within-person association between perceived processing and perceived healthfulness.

    Since each person rated multiple foods and each food was rated by many people, observations are clustered both by person_id and food_id.

    I initially considered a mixed-effects model with random intercepts for both person_id and food_id. However, I’m concerned about the assumption that random effects (particularly for person_id) are uncorrelated with the predictors—in this case, perceived processing. I suspect there are person-level traits (e.g., age) that may influence both processing and healthfulness ratings.
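    For reference, the crossed random-intercepts model I had in mind would be something along these lines (a sketch only, using the variables described above; I have not committed to particular options):

    * Sketch of the crossed random-intercepts specification:
    * food_id (40 levels) entered under _all: with R. notation,
    * person_id as an ordinary random intercept
    mixed healthiness_food i.quartile_processing || _all: R.food_id || person_id:, reml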

    Given this, would it be more appropriate to estimate a fixed-effects model using
    reghdfe to absorb person_id and cluster standard errors at the food_id level?

    Here's the model I’m considering:
    reghdfe healthiness_food i.quartile_processing, absorb(person_id) vce(cluster food_id)


    My goals are to:
    1. Estimate the within-person association between perceived processing and healthfulness.
    2. Account for clustering at the food level.
    Is this specification appropriate for these goals? Would there be a better approach? Are there any limitations I should be aware of?
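    In case it is relevant to the limitations question, a variant I could also try clusters on both dimensions (again only a sketch; reghdfe accepts more than one cluster variable in vce()):

    * Sketch: two-way clustering by person and food
    reghdfe healthiness_food i.quartile_processing, absorb(person_id) vce(cluster person_id food_id)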
    Thank you for your guidance.



  • #2
    Originally posted by Anna Claire Tucker
    I want to examine the within-person association between perceived processing and perceived healthfulness.


    Given this, would it be more appropriate to estimate a fixed-effects model using
    reghdfe to absorb person_id and cluster standard errors at the food_id level?

    Here's the model I’m considering:
    reghdfe healthiness_food i.quartile_processing, absorb(person_id) vce(cluster food_id)
    That all looks correct. And beyond the critical reason that you highlight:

    However, I’m concerned about the assumption that random effects (particularly for person_id) are uncorrelated with the predictors—in this case, perceived processing. I suspect there are person-level traits (e.g., age) that may influence both processing and healthfulness ratings.
    Winkelmann and Winkelmann (1998) also argue that inter-personal comparisons of scores are problematic, given that individuals tend to anchor their scales at different levels. Thus, a score of 7 for one person may be very different from a score of 7 for another, especially if their perception of the midpoint of the scale differs. Below is the full argument from page 3.

    While such subjective variables (which measure what people say rather than what they do) have usually been treated with suspicion by economists, they have been used occasionally in the past. Freeman (1978) and Ackerlof et al. (1988) are examples for studies using job satisfaction, while Easterlin (1974, 1995) and Blanchflower (1996) are examples of studies based on life satisfaction responses. The measurement issues are the same for job and life satisfaction. A particular concern is that individuals ‘anchor’ their scale at different levels, rendering interpersonal comparisons of responses meaningless. This problem bears a close resemblance to the issue of cardinal versus ordinal utility. Any statistic that is calculated from a cross-section of individuals, for instance an average satisfaction, requires cardinality of the measurement scale. It is clear that, from a statistical perspective, this problem is closely related to the problem of unobserved individual specific effects. Hence, anchoring causes the estimator to be biased as long as it is not random but correlated with explanatory variables. Panel data help if the metric used by individuals is time-invariant. The important benefit of panel data is that such data allow us to make inferences based on intra- rather than interpersonal comparisons of satisfaction. Of course, the limitation to intra-individual variation avoids not only potential biases caused by anchoring, but also biases caused by other unobserved individual specific factors.

    Reference
    Winkelmann, L., & Winkelmann, R. (1998). Why are the Unemployed so Unhappy? Evidence from Panel Data. Economica, 65(257), 1–15.
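    As a practical check, you could fit both specifications and compare the coefficient on processing; a large divergence would suggest that the person effects are indeed correlated with the predictor. A sketch only, using your variable names:

    * Fixed-effects (within-person) estimate
    reghdfe healthiness_food i.quartile_processing, absorb(person_id) vce(cluster food_id)
    estimates store fe

    * Crossed random-intercepts estimate for comparison
    mixed healthiness_food i.quartile_processing || _all: R.food_id || person_id:
    estimates store re

    * Side-by-side comparison of the processing coefficients
    estimates table fe re, b se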





    • #3
      Thank you so much for this, Andrew. So if I am not mistaken, the excerpt from Winkelmann & Winkelmann supports the use of fixed effects over a mixed model with random effects for person_id, because fixed effects better address anchoring bias by fully controlling for differences in individuals' rating tendencies?



      • #4
        Correct, that is the suggestion.



        • #5
          Great, thanks!
