Hello everyone,
This is my first time posting on the forum. I have read the FAQ, but I apologize in advance for any unintended mistakes.
I am using Stata/SE 14.0 for Mac on macOS High Sierra. I am working with a dataset consisting of roughly 15,000 observations. It comes from a birth cohort study in which I am studying the effect of sexual orientation on health outcomes. Because of high missingness, I am running a multiple imputation model before the analyses. My question arose while I was running regression models for every variable I intend to impute, in order to decide whether to use linear regression or predictive mean matching (pmm) in the imputation model (effectively, I am checking the distribution of the residuals from dry runs of my imputation model).
My diagnostic regressions look like this:
Code:
regress depression9 depression10 depression11 depression12 sexuality i.sex i.nonwhite i.ses edumum
predict rep2, residuals
pnorm rep2
qnorm rep2
regress depression10 depression9 depression11 depression12 sexuality i.sex i.nonwhite i.ses edumum
predict rep2, residuals
pnorm rep2
qnorm rep2
regress depression11 depression9 depression10 depression12 sexuality i.sex i.nonwhite i.ses edumum
predict rep2, residuals
pnorm rep2
qnorm rep2
regress depression12 depression9 depression10 depression11 sexuality i.sex i.nonwhite i.ses edumum
predict rep2, residuals
pnorm rep2
qnorm rep2
I have simplified my code to make it easier to read without losing the main point. The actual list of imputed variables is much longer than this and is hard to read, so I would like to use a loop. However, unlike what "foreach" normally does, I do not want the full variable list in the macro to be repeated identically every time the loop runs. The variable list should change slightly on each iteration: one of the variables has to serve as the dependent variable (and hence must NOT be among the independent variables). What I envisioned is something like this:
Code:
loc imputed "depression9 depression10 depression11 depression12"
loc regular "sexuality i.sex i.nonwhite i.ses edumum"
foreach x of loc imputed {
    regress `x' `"`imputed' - `x'"' `regular'
}
This is evidently incorrect, and I was shown the following error message:
Code:
"depression9 depression10 depression11 depression12 - depression invalid name
In short, I am wondering whether there is a way for a foreach loop to omit one variable from the macro at a time as it runs.
Thank you very much.
Best regards,
Kai-Yuan
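One possible way to get the behaviour described above is Stata's macro list subtraction (`local newlist : list A - B`, where A and B are macro names). The following is only a sketch, not a tested solution for this dataset: it assumes the variables in the post exist as named, and the residual and graph names (res_*, pn_*, qn_*) are made up for illustration.

```stata
* Sketch only: res_*, pn_* and qn_* are illustrative names, not from the original post.
local imputed "depression9 depression10 depression11 depression12"
local regular "sexuality i.sex i.nonwhite i.ses edumum"

foreach x of local imputed {
    * Macro list subtraction: every element of imputed except the current one
    local others : list imputed - x

    regress `x' `others' `regular'

    * One residual variable per outcome, so predict never meets an existing name
    predict double res_`x', residuals

    * Named graphs, so later plots do not replace earlier ones
    pnorm res_`x', name(pn_`x', replace)
    qnorm res_`x', name(qn_`x', replace)
}
```

Note that the list operator takes macro names (imputed and x), not their expanded contents, which is why no quotes or backticks appear on that line.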