Cell Mean Imputation

Sophie Gebers

Join Date: May 2019

Posts: 22
#1

03 Oct 2019, 03:50

Good day all

I am using a stacked cross-sectional dataset, the South African Post-Apartheid Labour Market Series (PALMS), which covers the years 1993 to 2017.
There is a lot of missing data for the monthly earnings variable, so I have been asked to work out how to do a cell mean imputation for item non-response on earnings. As I understand it, this is done by calculating the mean of earnings within each cell, where a cell is everyone who has the same education level (coded into three categories: fewer than 12 years of schooling, exactly 12 years, or more than 12 years) and belongs to the same population group, and then assigning that cell mean to those in the same cell whose earnings are missing.

I do not know how to go about doing this. I understand that I could use a loop over the respective years, but I am at a loss as to how to calculate the cell means and impute them.
Would anyone be able to help?

Regards
Sophie Gebers

Join Date: May 2019

Posts: 22
#2

03 Oct 2019, 03:53

The task at hand is to:
For the cell mean imputation you should assign all the employed who have missing earnings the mean of earnings of their education level-race-year cell. For education you should only use 3 education categories: less than matric, matric, and greater than matric. For race, assign those with “other” to the white group, so that there are 4 groups. There are 22 years with earnings data. This means there are 3 x 4 x 22 = 264 cells.
Hint: you will need to use a forvalues and probably a foreach loop with a sum [aweight] to get the cell means.
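
For illustration, here is a minimal sketch of how the hint might translate into Stata. It is not a confirmed solution. All variable names (earnings, educ3, race4, year, employed, wgt, earn_imp, imputed) are hypothetical placeholders rather than actual PALMS variable names, and it assumes educ3 is already coded 1 to 3 and race4 is coded 1 to 4 with "other" recoded to white.

```stata
* Hypothetical variable names (assumptions, not taken from PALMS):
*   earnings  monthly earnings, missing where there is item non-response
*   educ3     1 = less than matric, 2 = matric, 3 = more than matric
*   race4     1 to 4, with "other" already recoded to the white group
*   year      survey year
*   employed  1 if employed
*   wgt       sampling weight

* Start from the observed earnings; only missing values get filled in
generate double earn_imp = earnings

* Loop over the years actually present, so gaps in the series are skipped
levelsof year, local(years)
foreach y of local years {
    forvalues e = 1/3 {
        forvalues r = 1/4 {
            local cell "year == `y' & educ3 == `e' & race4 == `r' & employed == 1"

            * Weighted mean of observed earnings in this cell
            quietly summarize earnings [aweight = wgt] if `cell'

            * Impute only if the cell has at least one observed value
            if r(N) > 0 {
                quietly replace earn_imp = r(mean) if `cell' & missing(earnings)
            }
        }
    }
}

* Flag the imputed observations so they can be identified later
generate byte imputed = missing(earnings) & !missing(earn_imp)
```

Keeping the original earnings variable untouched and writing to a new variable makes it easy to compare results with and without the imputed values. Cells with no observed earnings at all are left missing rather than filled.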
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17704
#3

03 Oct 2019, 09:00

Sophie:
Statalist's advice for this kind of query is given at: https://www.statalist.org/forums/help#adviceextras, #4.

Kind regards,
Carlo
(Stata 19.0)