I’m working with a dataset of 4,455 individuals, each of whom rated a random selection of 10 out of 40 possible foods. For each food, participants rated:
- Perceived healthfulness (0–10)
- Perceived processing (0–10)
Since each person rated multiple foods and each food was rated by many people, observations are clustered both by person_id and food_id.
I initially considered a mixed-effects model with random intercepts for both person_id and food_id. However, I’m concerned about the assumption that random effects (particularly for person_id) are uncorrelated with the predictors—in this case, perceived processing. I suspect there are person-level traits (e.g., age) that may influence both processing and healthfulness ratings.
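For concreteness, the crossed random-intercepts model I initially considered could be written roughly as follows. This is only a sketch: it reuses the variable names from my reghdfe command (healthiness_food, quartile_processing, person_id, food_id), and placing food_id in the _all: R. term is my assumption, chosen because it has far fewer levels (40) than person_id (4,455).

```stata
* Sketch of the crossed random-effects alternative (not the model I am proposing).
* Random intercepts for food_id (40 levels) and person_id (4,455 levels).
* _all: R.food_id treats the whole sample as one group with a random
* intercept per food; person_id: adds a random intercept per person.
mixed healthiness_food i.quartile_processing || _all: R.food_id || person_id:
```

This specification relies on the person-level random intercept being uncorrelated with perceived processing, which is the assumption I am unsure about.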
Given this, would it be more appropriate to estimate a fixed-effects model using reghdfe to absorb person_id and cluster standard errors at the food_id level?
Here's the model I’m considering:
reghdfe healthiness_food i.quartile_processing, absorb(person_id) vce(cluster food_id)
My goals are to:
- Estimate the within-person association between perceived processing and healthfulness.
- Account for clustering at the food level.
Thank you for your guidance.