Hello,
I am trying to calculate the percentage of missing observations per variable by dividing the number of missing observations by the number of expected observations for that variable. Because the denominator varies across most variables (due to skip patterns in the survey), I have created, for each variable, one variable that stores the number of missing observations and another that stores the number of expected observations. So, for example, a variable named var1 has corresponding var1_m (for missing) and var1_exp (for expected) variables, var2 has corresponding var2_m and var2_exp variables, etc. (I realize this is probably not the best way to do this, as it creates a ton of extra variables that hold the same value in every observation. I didn't know how to store single numbers, though; feedback on that is welcome as well!) I've gotten that far, but am now having trouble creating the percentage variable. To create it, I would like to use a foreach loop like this:
foreach var of varlist _all {
    gen `var'_per = `var'_m/`var'_exp
}
This doesn't work, as the `var' names now include the suffixes _m and _exp, so Stata reads this as variables var2_m_m and var2_exp_exp, which don't exist. Is there a way to create the percentage variable in a loop such that the beginning of the variable before the _m and _exp can be used to match on which variables are supposed to be included in the division equation, and would also determine what precedes _per in creating the new percentage variable?
Thanks for any help you can provide!
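One possible approach, offered as a minimal untested sketch rather than a definitive answer: loop over only the variables ending in _m, strip that suffix to recover the base name, and build the other names from it. The local macro name stem is just an illustrative choice, and the sketch assumes that every base variable follows the var1 / var1_m / var1_exp naming pattern described above and that no other variables happen to end in _m.

```stata
* Loop over the *_m variables only, so each base variable is handled once
foreach v of varlist *_m {
    * Drop the trailing "_m" (2 characters) to recover the base name, e.g. var1
    local stem = substr("`v'", 1, length("`v'") - 2)

    * Skip any base name without a matching _exp variable (assumed naming pattern)
    capture confirm variable `stem'_exp, exact
    if _rc == 0 {
        * Same ratio as in the original loop; multiply by 100 for a percentage
        gen `stem'_per = `stem'_m / `stem'_exp
    }
}
```

On storing single numbers: Stata's local macros and scalars hold one value without adding a column to the dataset. For instance, count if missing(var1) leaves the count in r(N), which can be copied into a local or a scalar instead of a new variable.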