
  • Heckman Post-Estimation Question

    Hello Forum Members:

    Using the 2012 American Community Survey, I am running Stata 16.1 to analyze the hourly earnings difference between black and non-black males. I control for some of their personal characteristics (age, education, etc.) and for the difference in the two groups’ rates of gainful employment. Net of these controls, I seek an estimate of black hourly earnings as a percentage of non-black hourly earnings. Ideally, I would also like estimates of black earnings and of non-black earnings (in dollars), adjusted for the differences in the two groups’ characteristics and in their propensity to hold jobs. (I realize there are several ways to calculate these estimates and am interested in experts’ recommendations on the pros and cons of different approaches.)

    Two other relevant details: my dependent variable is logged earnings because earnings are skewed, and I use Heckman’s MLE because my data are weighted. Because the race variable appears in both the selection equation and the earnings equation, I adjust the race coefficient for selection using the method recommended by Sigelman and Zeng (2000). (I got the requisite Stata code from a Statalist response that I can no longer find.) It is my understanding that this result, when exponentiated, gives black men’s earnings as a proportion of non-black men’s adjusted, unlogged earnings. If I am misinformed on this point, please let me know!
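
    For readers following along, the adjustment that the code in this post appears to implement can be written out. This is a sketch under the standard Heckman setup; the notation ($\beta$, $\gamma$, $\lambda_i$, $\delta_i$) is introduced here for illustration and is not taken from the post. In the code, `selxbpr` plays the role of $z_i\gamma$, `testpr` of $\lambda_i$, and `Dpr` of $\delta_i$.

```latex
\begin{align*}
E[\ln w_i \mid \text{employed}] &= x_i\beta + \rho\sigma\,\lambda_i,
  \qquad \lambda_i = \frac{\phi(z_i\gamma)}{\Phi(z_i\gamma)},\\
\delta_i &= \lambda_i\,(\lambda_i + z_i\gamma),\\
\frac{\partial E[\ln w_i \mid \text{employed}]}{\partial x_{ik}}
  &= \beta_k - \gamma_k\,\rho\sigma\,\delta_i .
\end{align*}
```

    Because $\delta_i$ depends on each person's selection index, the adjusted coefficient differs across observations. For a 0/1 variable such as `black`, the derivative is an approximation to the discrete change.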

    Code:
      quietly heckman lnhearn age i.educ i.region i.urban i.black [fw=fwt], select(age i.educ disabled i.urban i.region i.black)
    * to obtain the "black" coefficient taking into account that "black" appears in the earnings equation and the selection equation

    Code:
     
    predict selxbpr, xbs
    gen testpr = normalden(selxbpr)/normal(selxbpr)
    gen Dpr=testpr*(testpr+selxbpr)
    gen b_black= [lnhearn]_b[1.black] - ([select]_b[1.black]*e(rho)*e(sigma)*Dpr)
    * to convert b_black from log wage to numerical wage
    Code:
     di exp(b_black)
    * result is 0.91285314, which I interpret as: if all blacks worked, they would earn 91.3% of non-blacks per hour
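
    One detail worth checking: `b_black` is created with `gen`, so it is a variable with one value per observation, and `di exp(b_black)` displays the value for the first observation only. A minimal sketch of one way to summarize it across the sample follows. It assumes the variable names from the code above, that the `heckman` results are still in memory, and that averaging over employed men (those with nonmissing `lnhearn`) is the quantity of interest; that last choice is an assumption, not something the post specifies.

```stata
* b_black varies by observation because Dpr does.
* Average it over employed men in the estimation sample, using the same frequency weights.
quietly summarize b_black if e(sample) & !missing(lnhearn) [fw=fwt]

* Weighted mean of the adjusted coefficient, and the ratio it implies.
display "mean adjusted coefficient: " r(mean)
display "implied earnings ratio:    " exp(r(mean))
```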

    I would appreciate suggestions regarding the most appropriate post-estimation command to generate estimates of black earnings and of non-black earnings (in unlogged dollars) that control for group differences on the independent variables and for group differences in gainful employment.
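
    One possible starting point, offered as a sketch rather than a recommendation, is `margins` with the `expression()` option run immediately after the `heckman` command above. It assumes the estimation results are still in memory and that `black` enters the model as a factor variable, as it does in the code above. Note that exponentiating a predicted log wage does not give the mean wage in dollars unless a retransformation correction is applied, so these figures are closer to geometric means.

```stata
* Predicted hourly earnings by race, averaged over the other covariates.
* exp(xb) back-transforms the linear prediction from the earnings equation.
margins black, expression(exp(predict(xb)))

* The same back-transformation applied to the prediction conditional on being employed.
margins black, expression(exp(predict(ycond)))
```

    The first command describes potential earnings for everyone in the sample; the second describes earnings among those observed working.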

    Many many thanks,
    Suzanne Model
