I am constructing a panel dataset based on the survey data for the years 2010-2013 (four consecutive years). As is usually the case with household survey data, there is an issue of attrition, i.e. some households drop out from the survey from year to year. I need to figure out whether these households are missing at random.
My idea is to create a dummy equal to 1 in 2011 if a household is present in 2010 but missing in 2011 (and 0 otherwise), and likewise for 2012 and 2013. Then, for each of these years (2011, 2012, 2013), I want to run a logit/probit regression of this dummy on the set of covariates that I would like to control for in my study. The household identifier is "hhid", and the time variable is "year".
Does anyone have a precise idea of how this should be coded in Stata? I know it is not complicated, but I just cannot wrap my head around it.
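To make the idea concrete, here is a rough sketch of what I have in mind. It assumes the data are in long format with one row per household-year actually observed, so a household that drops out simply has no row for that year. Because of that, the dummy is stored on the row of the last wave in which the household was seen (year t) rather than on the missing year itself, and the covariates are the values measured in year t. The names attrit_next, x1, x2 and x3 are placeholders, not variables from my data.

```stata
* Assumption: long format, one row per (hhid, year) actually observed, 2010-2013.
* attrit_next, x1, x2, x3 are hypothetical names used for illustration only.

* attrit_next = 1 if the household is observed in year t but has no row in t+1.
* It is left missing for 2013, since there is no later wave to compare with.
bysort hhid (year): generate byte attrit_next = (year[_n+1] != year + 1) if year < 2013

* Wave-by-wave probit of dropping out in t+1 on covariates measured in t.
forvalues t = 2010/2012 {
    local t1 = `t' + 1
    display as text "Attrition between `t' and `t1'"
    probit attrit_next x1 x2 x3 if year == `t'
}
```

With this coding, a household that skips a year and later returns is counted as an attriter in the year before the gap. Replacing probit with logit in the loop gives the logit version.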