new variable from mi impute

Jessica Di Cocco

Join Date: Mar 2019

Posts: 4
#1

23 Jan 2020, 04:23

Dear all,

I'm struggling to generate a new variable after a multiple imputation. I'm new to MI, but before asking I read several threads on this issue. As far as I understand, the command I should use is -mi passive-, but I'm not sure about this.

I have an individual-level variable recording which of 13 parties each respondent voted for. However, I have missing values for survey respondents who:
1) have not voted;
2) have answered that they don't remember or refused to answer.

Considering my research design, I have decided to impute these missing values.

The model is the following

noisily mi impute mlogit parties = ivar [indepvars], augment force add(60)

I now need to store the results in a new variable, which I will use to assign each respondent a specific score that I derived. From the Stata help material, I could not work out exactly how to generate this new variable containing the results from -mi impute mlogit-.
I have tried the command

mi passive: egen miparties = mean(parties)

but I get an error message saying that the variable _1_parties is not found (and indeed, it is not there!)

variable _1_parties not found
st_varrename(): 3500 invalid Stata variable name
u_mi_wide_swapvars(): - function returned error
<istmt>: - function returned error

I thus imagine that I missed some step in between or my procedure is not correct. Can you help me?

Thank you in advance for your support,

J.
daniel klein

Join Date: Mar 2014

Posts: 3885
#2

23 Jan 2020, 04:55

Before diving into technical problems, I would like to point out two more substantive issues.

Although I do not have a background in political sciences, I would sharply distinguish non-voters from those who have voted but cannot (or are not willing to) remember the party. For the former, there is probably no "true" value to impute. Therefore, I would treat non-voters as a separate category and either exclude them from the sample or add an extra (14th) category in your voting variable. I would not try to impute those values.

Because you are imputing parties with a multinomial logit model, I do not see how a mean value (as indicated by egen ... = mean(parties)) would make any sense at all. If you want the mean over the imputed values, stop right here. That is not how MI works.

Concerning the more technical aspects: First, drop the force option! Also, forget about ever hearing that there is such a thing as a force option and never specify it again. I have not yet encountered a situation where this would do what you want.

If parties is to become your outcome/dependent variable in the substantive analyses, make sure that your imputation model is (much) richer than the substantive model. Otherwise, you are probably better off just using the complete cases.

Concerning the error message: you should not see this error message. Either something is wrong with your installation of Stata (type: update all) or you have messed with mi settings in a way you should not. I do not see any indication of the latter in your code but then again you are not really showing us all the code (for example, you must mi set your data, otherwise the mi impute command would not even work).
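For reference, a minimal sketch of the setup that has to precede the imputation. It assumes the wide style (the error above comes from u_mi_wide_swapvars()) and uses two hypothetical predictors, x1 and x2, that do not appear in the original post; the seed is arbitrary.

```stata
* x1 and x2 are hypothetical placeholders for the imputation-model predictors
mi set wide                      // declare the data as mi data (wide style assumed)
mi register imputed parties      // parties is the variable to be imputed
mi impute mlogit parties x1 x2, add(60) augment rseed(12345)   // note: no force option
mi describe                      // check the style, the number of imputations, and the registered variables
```

If mi describe does not report 60 imputations with parties registered as imputed, the problem most likely lies in the steps before mi passive.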

Best
Daniel

Last edited by daniel klein; 23 Jan 2020, 04:58.
Jessica Di Cocco

Join Date: Mar 2019

Posts: 4
#3

24 Jan 2020, 01:47

Dear Daniel, thank you for your reply. I appreciated it and your suggestions were very useful to me.

I totally agree with you that voters and non-voters should be treated as different categories. Still, in my case, there is a reasoning behind the decision to include them into a single group. I will be a bit more explicit on my model so that one can better understand the reason behind my choice.

In the final model, my dependent variable is economic insecurity modelled as binary (1=economically insecure).
Among the independent variables, I do not have the party that each respondent voted for but an index (or score), which I derived previously and use as a proxy for political radicalism. Given that non-voting respondents can also be radical, I wanted to test what happens when they are included in the analysis. However, I can't infer their "score" without knowing their closest "probable" party.
For this reason, I was looking for a way to generate a new variable from my 60 estimates obtained with MI. I could impute the score itself, but since it is a continuous measure ranging from 0 to 1, I'm not sure that would be a better option.
Mine is, of course, a first attempt that needs to be checked more carefully, also from a theoretical and normative point of view. Besides this attempt, I'm also considering a second option: reducing the selection bias (voting/non-voting) via a Heckman two-step model and then proceeding with the final model as described.

Thank you also for your technical advice. I will definitely drop the force option and better check what happened with my data.

All the best,

J.
daniel klein

Join Date: Mar 2014

Posts: 3885
#4

24 Jan 2020, 03:13

Jessica

Thanks for the clarification. Concerning the theoretical arguments, I just wanted to make sure that you did not overlook something.

I would like to comment on two more statements just to make sure that I am getting this right and you do not get this wrong.

Originally posted by Jessica Di Cocco

I was looking for a way to generate a new variable from my 60 estimates obtained with MI.

The word "from" irritates me (but I am obviously not a native speaker). It seems to imply that you are trying to combine the 60 imputed values into a single value. Do not do that! If you do, you are throwing away the variance between the imputed values, which is what MI is really all about. In fact, the only reason to impute a value more than once is to capture the uncertainty associated with the imputation. That uncertainty must be accurately reflected in the analyses as well, to get the standard errors right. Pooling those values into a single one before you analyze the data is just as if you had imputed only a single value in the first place. Instead, create one new variable in each of the 60 completed datasets. For that, you would probably use mi passive. Then, analyze your data using mi estimate.
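A minimal sketch of that workflow, assuming the data are already mi set and parties has been imputed. The variable names radical, insecure, and x1, and the score values 0.10 and 0.85, are hypothetical placeholders and not taken from this thread.

```stata
* Build the score within each completed dataset (all values below are made up)
mi passive: generate radical = 0.10 if parties == 1
mi passive: replace  radical = 0.85 if parties == 2
* ... one replace line per remaining party code ...

* Fit the substantive model in every completed dataset and pool with Rubin's rules
mi estimate: logit insecure radical x1
```

Because radical is derived separately in each completed dataset, it varies across imputations for respondents whose party was imputed, and mi estimate carries that variation into the standard errors.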

Originally posted by Jessica Di Cocco

I could impute the score, but since it is a continuous measure ranging from 0 to 1 I'm not sure that it can be a better option.

You could probably impute the score using pmm. However, for summative scores, it has been suggested to impute the single items, then create the score from that. So imputing the components appears to be the better choice. However, your code only imputes one variable and I still do not get why you are using mlogit if parties is a continuous score; I probably misunderstood something here.
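Should the score be imputed directly, a predictive mean matching call might look like the sketch below. It assumes the data are already mi set; score, x1, and x2 are hypothetical variable names, and knn(5) and the seed are illustrative choices only.

```stata
* score, x1, and x2 are hypothetical; knn(5) is an arbitrary number of nearest neighbours
mi register imputed score
mi impute pmm score x1 x2, add(60) knn(5) rseed(12345)
```

Since pmm draws imputed values from observed donors, the imputed scores stay within the observed 0 to 1 range.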

Best
Daniel
Jessica Di Cocco

Join Date: Mar 2019

Posts: 4
#5

27 Jan 2020, 02:40

Dear Daniel,

Thank you immensely for your comments. I really appreciated them. I will take into account your advice and work in the direction you suggested.

All the best,

Jessica
