Which method produces the best estimate of the race difference in annual household income (equivalized for household composition): margins or predict?
I use OLS to regress logged, equivalized annual household income on age, sex, education, region, urban residence, and race for a sample of black and non-black households.
CODE reg lnehhinc age i.sex i.educ i.region i.urban i.black
I’ve tried the following post-estimation commands:
margins black
This generates one estimate for non-blacks and one for blacks. Both are on the log scale and need to be exponentiated. For example, if I exponentiate the black margin, I get CODE di exp(10.31384) = $30,146
Alternatively, I can use predict
CODE predict yhat if black==1 & e(sample)
sum yhat, detail
These commands generate, among other things, a mean and a median of the predicted values, both of which again need to be exponentiated. For the median: CODE di exp(10.35673) = $31,468;
for the mean: CODE di exp(10.37806) = $32,147
Ultimately, my goal is to compute the ratio of black/non-black household income. Which result should I use? Or, put differently, what are the strengths and weaknesses of the three different results? Thanks!
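One way to frame the comparison, as a sketch rather than a definitive answer: the notation below is assumed for illustration and does not come from the Stata output. Write y_i for equivalized household income, x_i for the non-race covariates (including the constant), b_i for the black indicator, and delta for its coefficient. With no interactions in the model, the two margins differ by exactly the estimated coefficient, so the ratio of the exponentiated margins is exp(delta-hat). Exponentiating a mean of logs gives a geometric mean, which by Jensen's inequality is no larger than the arithmetic mean.

```latex
% Assumed notation (not in the original post):
%   y_i = equivalized household income, x_i = non-race covariates incl. constant,
%   b_i = 1 if black, \delta = coefficient on 1.black,
%   \hat m_1, \hat m_0 = the margins for black and non-black on the log scale.
\begin{align*}
\ln y_i &= x_i'\beta + \delta b_i + \varepsilon_i \\
\hat m_1 - \hat m_0
  &= \frac{1}{n}\sum_{i=1}^{n}\bigl(x_i'\hat\beta + \hat\delta\bigr)
   - \frac{1}{n}\sum_{i=1}^{n} x_i'\hat\beta
   = \hat\delta \\
\frac{\exp(\hat m_1)}{\exp(\hat m_0)} &= \exp(\hat\delta) \\
\exp\bigl(\mathrm{E}[\ln y]\bigr) &\le \mathrm{E}[y]
  \qquad \text{(Jensen's inequality)}
\end{align*}
```

Read this way, the margins figure is a covariate-adjusted geometric mean. The mean of yhat over black households reflects their own covariate values; for unweighted OLS with a constant and the race dummy it equals the mean of lnehhinc among black households in the estimation sample, so its exponential is an unadjusted geometric mean. The median of yhat is the median of the fitted values, not the median of income. None of the three estimates arithmetic mean income without a retransformation adjustment.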