Hi all,
I'm currently working with my country's national household survey, and the income variables often have missing values. I've decided to impute them with the hot deck method and found the hotdeck command in Stata. To make the question easier to explain, here is the command's syntax:
hotdeck [varlist] [using] [if exp] [in exp] , [ by(varlist) store impute(varlist) noise keep(varlist) command(command) parms(varlist) seed(#)
infiles(filename filename ...) ]
where varlist is the variable or variables I'd like to impute.
I have a question about how to use the command. Specifically, I'm not sure whether I should run it once on all the variables I want to impute or separately for each variable. If I've understood the ado file correctly, a unit with a missing value in at least one of the variables in varlist is treated as a missing observation, so the command imputes every variable in varlist for that unit, even those for which it has observed values. My first thought was therefore to run the command variable by variable. But after reading more about the method, I'm no longer sure which approach is right.
I would be extremely grateful for your guidance.
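In case it helps to make the two options concrete, here is a minimal Python sketch of how I understand them. It is not the Stata command and not taken from the ado file: the function names, the toy variables (wage, rent) and the data are all made up for illustration, and the "joint" version only mirrors my reading of the behaviour described above.

```python
import random

def hot_deck_joint(rows, varlist, rng):
    """Joint version (my reading of the behaviour, an assumption):
    a row counts as missing if ANY variable in varlist is missing,
    and ALL varlist values are then copied from one complete donor."""
    donors = [r for r in rows if all(r[v] is not None for v in varlist)]
    out = []
    for r in rows:
        new = dict(r)
        if any(r[v] is None for v in varlist):
            donor = rng.choice(donors)
            for v in varlist:
                new[v] = donor[v]  # observed values get overwritten too
        out.append(new)
    return out

def hot_deck_by_variable(rows, varlist, rng):
    """Variable-by-variable version: only the missing cells are filled,
    each from a donor that is observed on that particular variable."""
    out = [dict(r) for r in rows]
    for v in varlist:
        donor_values = [r[v] for r in rows if r[v] is not None]
        for new in out:
            if new[v] is None:
                new[v] = rng.choice(donor_values)
    return out

# Toy data: unit 3 has an observed wage but a missing rent.
rows = [
    {"id": 1, "wage": 100, "rent": 10},
    {"id": 2, "wage": 200, "rent": 20},
    {"id": 3, "wage": 300, "rent": None},
]

rng = random.Random(12345)
joint = hot_deck_joint(rows, ["wage", "rent"], rng)
separate = hot_deck_by_variable(rows, ["wage", "rent"], rng)

print(joint[2])     # wage and rent both come from the same donor
print(separate[2])  # wage stays 300, only rent is imputed
```

In this sketch the joint version keeps each imputed wage and rent pair consistent, because both values come from the same donor, but it throws away unit 3's observed wage. The variable-by-variable version keeps every observed value, but with more than one missing variable the imputed values for a unit may come from different donors. That is the trade-off I'm unsure about.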