  • Cell Mean Imputation

    Good day all

    I am using a stacked cross-sectional dataset, the South African Post-Apartheid Labour Market Series (PALMS), covering the years 1993 to 2017.
    The monthly earnings variable has a lot of missing data, so I have been asked to work out how to do a cell mean imputation for item non-response on earnings. As I understand it, this means calculating the mean of earnings within each cell, where a cell is everyone with the same education level (coded as less than 12 years of schooling, exactly 12 years, or more than 12 years) and the same population group, and then assigning that cell mean to those in the cell whose earnings are missing.

    I do not know how to go about doing this. I understand that I could loop over the respective years, but I am at a loss as to how to calculate the cell means and impute them.
    Would anyone be able to help?

    Regards

  • #2
    The task at hand is:
    For the cell mean imputation you should assign all the employed who have missing earnings the mean of earnings of their education level-race-year cell. For education you should use only 3 categories: less than matric, matric, and more than matric. For race, assign those with “other” to the white group, so there are 4 groups. There are 22 years with earnings data. This means there are 3 x 4 x 22 = 264 cells.
    Hint: you will need to use a forvalues and probably a foreach loop with a sum [aweight] to get the cell means.
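
To make the steps in the task concrete, here is a minimal, language-neutral sketch of the logic written in Python rather than Stata. It is not the PALMS code or the required solution. The field names (`year`, `educ_years`, `race`, `employed`, `earnings`, `weight`) and the helper functions are hypothetical, it assumes matric corresponds to exactly 12 years of schooling, and it assumes the cell means are weighted by a survey weight and computed from employed respondents with observed earnings only.

```python
from collections import defaultdict


def educ3(years):
    """Collapse years of schooling into the three categories in the task.

    Assumption: matric corresponds to exactly 12 years of schooling.
    """
    if years < 12:
        return "less than matric"
    if years == 12:
        return "matric"
    return "more than matric"


def race4(race):
    """Assign the "other" group to the white group, leaving four groups."""
    return "white" if race == "other" else race


def cell_of(row):
    """A cell is the combination of year, education category and race group."""
    return (row["year"], educ3(row["educ_years"]), race4(row["race"]))


def impute_cell_means(rows):
    """Return copies of rows with missing earnings of the employed imputed.

    Cell means are weighted: sum(weight * earnings) / sum(weight), taken over
    employed respondents whose earnings are observed (an assumption here).
    """
    weighted_sums = defaultdict(float)
    weight_totals = defaultdict(float)
    for row in rows:
        if row["employed"] and row["earnings"] is not None:
            cell = cell_of(row)
            weighted_sums[cell] += row["weight"] * row["earnings"]
            weight_totals[cell] += row["weight"]

    result = []
    for row in rows:
        new_row = dict(row)
        new_row["imputed"] = False
        cell = cell_of(row)
        # Only the employed with missing earnings are imputed, and only when
        # the cell has at least one observed earner to take a mean from.
        if (row["employed"] and row["earnings"] is None
                and weight_totals.get(cell, 0.0) > 0):
            new_row["earnings"] = weighted_sums[cell] / weight_totals[cell]
            new_row["imputed"] = True
        result.append(new_row)
    return result


# Made-up illustrative records, not PALMS data.
sample = [
    {"year": 1993, "educ_years": 10, "race": "african", "employed": True,
     "earnings": 1000.0, "weight": 1.0},
    {"year": 1993, "educ_years": 8, "race": "african", "employed": True,
     "earnings": 2000.0, "weight": 3.0},
    {"year": 1993, "educ_years": 11, "race": "african", "employed": True,
     "earnings": None, "weight": 2.0},
    {"year": 1993, "educ_years": 12, "race": "white", "employed": True,
     "earnings": 5000.0, "weight": 2.0},
    {"year": 1993, "educ_years": 12, "race": "other", "employed": True,
     "earnings": None, "weight": 1.0},
    {"year": 1993, "educ_years": 12, "race": "white", "employed": False,
     "earnings": None, "weight": 1.0},
    {"year": 1994, "educ_years": 15, "race": "indian", "employed": True,
     "earnings": None, "weight": 1.0},
]

imputed = impute_cell_means(sample)
n_cells = 3 * 4 * 22  # education categories x race groups x years
```

In this sketch the third record receives the weighted mean of the first two, the "other" respondent is imputed from the white matric cell, and the last two records are left missing because one is not employed and the other has no observed earner in its cell. How empty cells should be handled is not specified in the task, so leaving them missing is only one possible choice.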


    • #3
      Sophie:
      The Statalist advice for this kind of query is at https://www.statalist.org/forums/help#adviceextras, #4.
      Kind regards,
      Carlo
      (Stata 19.0)
