Choosing Between reghdfe and Mixed Model for Nested Ratings

Anna Claire Tucker

Join Date: May 2025

Posts: 3
#1

27 May 2025, 09:21

I’m working with a dataset of 4,455 individuals, each of whom rated a random selection of 10 out of 40 possible foods. For each food, participants rated:
Perceived healthfulness (0–10)
Perceived processing (0–10)

I want to examine the within-person association between perceived processing and perceived healthfulness.

Since each person rated multiple foods and each food was rated by many people, observations are clustered both by person_id and food_id.

I initially considered a mixed-effects model with random intercepts for both person_id and food_id. However, I’m concerned about the assumption that random effects (particularly for person_id) are uncorrelated with the predictors—in this case, perceived processing. I suspect there are person-level traits (e.g., age) that may influence both processing and healthfulness ratings.

Given this, would it be more appropriate to estimate a fixed-effects model using reghdfe to absorb person_id and cluster standard errors at the food_id level?

Here's the model I’m considering:
reghdfe healthiness_food i.quartile_processing, absorb(person_id) vce(cluster food_id)

My goals are to:
Estimate the within-person association between perceived processing and healthfulness.
Account for clustering at the food level.

Is this specification appropriate for these goals? Would there be a better approach? Are there any limitations I should be aware of?
Thank you for your guidance.
Andrew Musau

Join Date: Oct 2014

Posts: 10285
#2

27 May 2025, 17:27

Originally posted by Anna Claire Tucker

I want to examine the within-person association between perceived processing and perceived healthfulness.

Given this, would it be more appropriate to estimate a fixed-effects model using reghdfe to absorb person_id and cluster standard errors at the food_id level?

Here's the model I’m considering:
reghdfe healthiness_food i.quartile_processing, absorb(person_id) vce(cluster food_id)

That all looks correct. In addition to the critical reason that you highlight:

However, I’m concerned about the assumption that random effects (particularly for person_id) are uncorrelated with the predictors—in this case, perceived processing. I suspect there are person-level traits (e.g., age) that may influence both processing and healthfulness ratings.

Winkelmann and Winkelmann (1998) argue that interpersonal comparisons of scores are problematic, given that individuals tend to anchor their scales at different levels. Thus, a score of 7 for one person may mean something very different from a score of 7 for another, especially if their perceptions of the midpoint of the scale differ. Below is the full argument from page 3.

While such subjective variables (which measure what people say rather than
what they do) have usually been treated with suspicion by economists, they
have been used occasionally in the past. Freeman (1978) and Akerlof et al.
(1988) are examples for studies using job satisfaction, while Easterlin
(1974, 1995) and Blanchflower (1996) are examples of studies based on life
satisfaction responses. The measurement issues are the same for job and life
satisfaction. A particular concern is that individuals ‘anchor’ their scale at
different levels, rendering interpersonal comparisons of responses meaningless.
This problem bears a close resemblance to the issue of cardinal versus ordinal
utility. Any statistic that is calculated from a cross-section of individuals, for
instance an average satisfaction, requires cardinality of the measurement scale.
It is clear that, from a statistical perspective, this problem is closely related
to the problem of unobserved individual specific effects. Hence, anchoring
causes the estimator to be biased as long as it is not random but correlated
with explanatory variables. Panel data help if the metric used by individuals is
time-invariant. The important benefit of panel data is that such data allow us
to make inferences based on intra- rather than interpersonal comparisons of
satisfaction. Of course, the limitation to intra-individual variation avoids not
only potential biases caused by anchoring, but also biases caused by other
unobserved individual specific factors.
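
To make the intra-individual comparison concrete, here is a minimal sketch of the within-person transformation. It assumes a single linear processing term rather than the quartile indicators used in the thread, and the symbols are my own notation, not from the paper: h_{ij} is person i's healthfulness rating of food j, p_{ij} the processing rating, and \alpha_i a person-specific anchor.

```latex
h_{ij} = \alpha_i + \beta\, p_{ij} + \varepsilon_{ij}
\quad\Longrightarrow\quad
h_{ij} - \bar{h}_i = \beta\,(p_{ij} - \bar{p}_i) + (\varepsilon_{ij} - \bar{\varepsilon}_i)
```

Subtracting each person's own mean removes \alpha_i, so \beta is identified only from how a person's ratings vary across the foods they rated. Absorbing person_id achieves the same thing, which is why a person's anchor may be correlated with their processing ratings without biasing the estimate.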

Reference
Winkelmann, L., & Winkelmann, R. (1998). Why are the Unemployed so Unhappy? Evidence from Panel Data. Economica, 65(257), 1–15.
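
For anyone wanting to compare the two approaches discussed in this thread side by side, a minimal sketch follows. It uses the variable names from post #1 and assumes reghdfe (and its dependency ftools) has been installed from SSC. Specification (2), which also absorbs food intercepts, is an optional variant that was not proposed in the thread.

```stata
* One-time setup for the community-contributed commands (uncomment to run)
* ssc install ftools
* ssc install reghdfe

* (1) Within-person estimator from post #1:
*     person intercepts absorbed, standard errors clustered by food
reghdfe healthiness_food i.quartile_processing, absorb(person_id) vce(cluster food_id)

* (2) Optional variant (an assumption, not from the thread):
*     also absorb food intercepts
reghdfe healthiness_food i.quartile_processing, absorb(person_id food_id) vce(cluster food_id)

* (3) Mixed-model alternative with crossed random intercepts, for comparison;
*     the factor with fewer levels (food_id) goes in the R. term
mixed healthiness_food i.quartile_processing || _all: R.food_id || person_id:
```

If the coefficients from (1) and (3) differ noticeably, that would be consistent with the person effects being correlated with perceived processing, which is the concern raised in post #1. With only 40 foods, the cluster-robust standard errors rest on a fairly small number of clusters.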
Anna Claire Tucker

Join Date: May 2025

Posts: 3
#3

28 May 2025, 06:14

Thank you so much for this, Andrew. So if I am not mistaken, the excerpt from Winkelmann & Winkelmann supports the use of fixed effects over a mixed model with random effects for person_id, because fixed effects better address anchoring bias by fully controlling for differences in individuals' rating tendencies?
Andrew Musau

Join Date: Oct 2014

Posts: 10285
#4

28 May 2025, 06:48

Correct, that is the suggestion.
Anna Claire Tucker

Join Date: May 2025

Posts: 3
#5

28 May 2025, 08:00

Great, thanks!