I am using Stata/MP 18.5 on Mac to examine the socioeconomic determinants of health using a longitudinal dataset spanning 13 waves (approximately 80,000 individuals per wave before data cleaning).
Dependent Variable: self-reported health (excellent; very good; good; fair; poor) – ordinal
Key Independent Variables: Educational Attainment (Degree; A-Levels; GCSEs; No Qualifications) and Neighbourhood Deprivation (IMD quintiles 1-5) – both are categorical
Other Variables: demographic covariates (e.g. age, sex, ethnicity, marital status) and socio-economic covariates (e.g. income, occupation)
Panel Structure: individual-level repeated observations over time (13 waves of data per individual)
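A minimal sketch of how this panel structure might be declared in Stata. The identifier pidp is the one used in the gsem example later in this post; the wave variable name (wave) is an assumption made for illustration.

```stata
* Assumed variable names: pidp = person identifier, wave = survey wave (1-13)
xtset pidp wave

* Summarise participation patterns across waves (shows attrition and gaps)
xtdescribe
```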
I am looking for a suitable model that meets the following criteria:
- Models an ordinal dependent variable
- Supports panel data structures (i.e. incorporates random intercepts to adjust for individual-level variation)
- Does not assume proportional odds, as my data violates this assumption (confirmed via the Brant test)
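For context, a Brant test of this kind is typically run after a pooled ordered logit, along the lines of the sketch below. The variable names (srh, educ, imd, age, sex) are hypothetical, and srh is assumed to be coded 1 (excellent) to 5 (poor). Note that a test run this way treats all person-wave observations as independent.

```stata
* Hypothetical names: srh = self-reported health, 1 (excellent) to 5 (poor);
* educ, imd, sex = categorical covariates; age = continuous.
* brant is user-written (part of SPost13); locate it with: search spost13_ado

* Pooled ordered logit (ignores the panel structure)
ologit srh i.educ i.imd c.age i.sex

* Brant test of the parallel-lines assumption, with per-variable results
brant, detail
```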
I have considered using a partial proportional odds model (e.g. gologit2), but I understand that it does not natively support panel data, so it fails the second criterion.
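As a rough illustration of what the pooled version would look like, here is a sketch with hypothetical variable names. It assumes a recent version of gologit2 that accepts factor-variable notation. Clustering on the person identifier only adjusts the standard errors for repeated observations; it does not add a random intercept.

```stata
* gologit2 is user-written: ssc install gologit2
* Hypothetical names: srh coded 1 (excellent) to 5 (poor); pidp = person id.
* autofit relaxes the parallel-lines constraint only for variables that
* appear to violate it; cluster(pidp) gives cluster-robust standard errors.
gologit2 srh i.educ i.imd c.age i.sex, autofit cluster(pidp)
```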
Potential Solutions Considered:
- Using gsem to fit a generalised ordered logit model with mixed effects, relaxing the proportional odds assumption while keeping a random intercept for each individual. E.g. "gsem (y <- x1 x2 x3 M1[pidp], nocons), family(ordinal) link(logit)".
- Dichotomising the outcome variable: recoding self-reported health (excellent, very good, good, fair, poor) into a binary outcome ("good health" = excellent, very good, good; "poor health" = fair, poor) so that a standard panel logistic regression model such as "xtlogit, re" can be used. Because the outcome would no longer be ordinal, proportional odds would not need to be assumed.
Questions:
- Is there an appropriate alternative model in Stata that can handle ordinal dependent variables in panel data while relaxing the proportional odds assumption?
- Is the gsem approach a viable solution, or are there better implementations?
- Would dichotomisation be a reasonable compromise, or are there preferable ways to handle non-proportional odds in this context?
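To make the two options concrete, here is a minimal sketch under assumed variable names (srh, educ, imd, age, sex, wave are hypothetical; pidp is the person identifier). One caveat: family(ordinal) link(logit) in gsem estimates a single set of coefficients across all cutpoints, so the gsem call as sketched is a random-intercept proportional odds model and does not by itself relax the assumption.

```stata
* Hypothetical names: srh coded 1 (excellent) to 5 (poor); pidp = person id;
* wave = survey wave.

* Option 1: random-intercept ordered logit via gsem.
* ologit is shorthand for family(ordinal) link(logit).
gsem (srh <- i.educ i.imd c.age i.sex M1[pidp]), ologit

* The same random-intercept model using the dedicated command
meologit srh i.educ i.imd c.age i.sex || pidp:

* Option 2: dichotomise (fair/poor = 1), then fit a random-effects logit
generate byte poorhealth = inlist(srh, 4, 5) if !missing(srh)
xtset pidp wave
xtlogit poorhealth i.educ i.imd c.age i.sex, re
```

With roughly 80,000 individuals over 13 waves, it may be worth trying each command on a random subsample of individuals first to check that it converges.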