Hi everyone,
I'm working with survey data in Stata MP 15.1 and am hoping for advice or suggestions from anyone who has attempted to combine multiple imputation (MI) with the Oaxaca-Blinder decomposition.
Key details:
- Stata version: MP 15.1
- oaxaca package: Version 4.1.1
- mi package: Version 1.3.3
- Data type: Complex survey data (with probability weights)
- Outcome variable: Binary
- Main objective: Incorporate MI results into a decomposition analysis
What I’ve tried / considered:
- I am considering running oaxaca separately on each imputed dataset in a loop, then exporting the results (e.g., explained and unexplained components, standard errors) for manual pooling.
- I’ve also come across a 2013 post on Statalist suggesting the use of:
Code:
mi estimate, cmdok: oaxaca ...
- However, I’m not sure whether this approach produces valid results, since oaxaca performs more than just regression and may not fully align with Rubin’s rules or MI assumptions.
- I’ve reviewed the literature and noted that some studies applying the Oaxaca-Blinder decomposition handle missing data with listwise deletion or missing-indicator methods, but these approaches have known limitations (e.g., Newman, 2014).
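For clarity, by "Rubin's rules" I mean the standard combining formulas applied separately to each scalar decomposition component (e.g., the explained part). The notation is my own: $\hat{Q}_m$ is the estimate and $U_m$ its squared standard error from imputation $m = 1, \dots, M$. Whether these formulas are appropriate for oaxaca's components is exactly what I am unsure about.

```latex
\begin{align}
\bar{Q} &= \frac{1}{M} \sum_{m=1}^{M} \hat{Q}_m
  && \text{(pooled point estimate)} \\
\bar{U} &= \frac{1}{M} \sum_{m=1}^{M} U_m
  && \text{(within-imputation variance)} \\
B &= \frac{1}{M-1} \sum_{m=1}^{M} \bigl(\hat{Q}_m - \bar{Q}\bigr)^2
  && \text{(between-imputation variance)} \\
T &= \bar{U} + \Bigl(1 + \frac{1}{M}\Bigr) B
  && \text{(total variance; pooled SE is } \sqrt{T}\text{)}
\end{align}
```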
Newman, D. A. (2014). Missing Data: Five Practical Guidelines. Organizational Research Methods, 17(4), 372–411. https://doi.org/10.1177/1094428114548590
My questions:
- Has anyone attempted manual pooling of oaxaca output across imputed datasets? If so, how did you handle combining the explained and unexplained components?
- Is there a practical way to extract decomposition estimates and standard errors from each imputed dataset (e.g., using estimates store) and apply Rubin’s rules manually?
- Is there a better-suited user-written command or workaround for performing Oaxaca-Blinder decomposition with multiply imputed data?
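To make the first two questions concrete, this is roughly the loop-and-pool workflow I have in mind. It is an untested sketch, not a validated procedure. The variable names (y, x1, x2, group, wt) are placeholders rather than my actual variables, and the logit and pooled options are just one possible specification. It assumes the data are already mi set and that oaxaca returns the same set of coefficients in every imputation. Each element of e(b) is pooled separately, and the degrees-of-freedom adjustment is ignored.

```stata
* Sketch only: y, x1, x2, group, and wt are hypothetical placeholder names.
* Assumes the data in memory are already mi set with M >= 2 imputations.

quietly mi query
local M = r(M)                      // number of imputations

capture matrix drop Qall            // clear results from any earlier run
capture matrix drop Uall

forvalues m = 1/`M' {
    preserve
    mi extract `m', clear           // load imputation m as ordinary data
    quietly oaxaca y x1 x2 [pw = wt], by(group) logit pooled
    * Stack point estimates and sampling variances, one row per imputation
    matrix Qall = nullmat(Qall) \ e(b)
    matrix Uall = nullmat(Uall) \ vecdiag(e(V))
    restore
}

* Keep the coefficient labels so the pooled results can be relabelled
local names : colfullnames Qall

mata:
Q    = st_matrix("Qall")            // M x k point estimates
U    = st_matrix("Uall")            // M x k sampling variances
M    = rows(Q)
Qbar = mean(Q)                      // pooled point estimates
Ubar = mean(U)                      // within-imputation variance
B    = diagonal(variance(Q))'       // between-imputation variance (divisor M-1)
T    = Ubar + (1 + 1/M) * B         // total variance
st_matrix("pooled_b",  Qbar)
st_matrix("pooled_se", sqrt(T))
end

matrix colnames pooled_b  = `names'
matrix colnames pooled_se = `names'
matrix list pooled_b
matrix list pooled_se
```

If this is sound, I would compare the pooled explained and unexplained components against the output of mi estimate, cmdok: oaxaca ... run with the same specification, since the two should agree if oaxaca's e(b) and e(V) are what mi estimate combines.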