MI dataset: alternatives to commands that are not supported under mi estimate

Sunil Sampath

Join Date: Oct 2015

Posts: 15
#1

27 Apr 2016, 04:52

Dear Statalist community,

I have a multiply imputed data set.

1) I want to compare ‘joint count’ (which is not normally distributed) between 2 independent groups. In the complete-case analysis I used Stata’s ranksum command for this. I want to perform a similar, valid hypothesis test in the MI data, but ranksum does not seem to be supported by mi estimate. I think I cannot use regression (which is supported by mi estimate) since ‘joint count’ is not normally distributed.

2) I want to compare the proportion of male/female between 2 independent groups. In the complete-case analysis I used the chi-square test. Again, as above, the chi-square test is not supported by mi estimate. Is there a valid way to combine the p-values of the chi-square test across all the imputed datasets?

3) I have built a logistic regression model in the MI data set. I wanted to test the adequacy of the model using the Hosmer-Lemeshow test, but estat gof is not supported after mi estimate. I also want to obtain the area under the ROC curve, but the lroc command is not supported either. Are there other commands that can achieve the same in an MI data set?

I have tried the cmdok option that the Stata manual suggests for commands not supported by mi estimate, but haven’t had any luck.
I am grateful for any advice.
Kind regards

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17730
#2

27 Apr 2016, 04:55

Sunil:

"I think I cannot use regression (which is supported in mi estimate) since ‘joint count’ is not normally distributed."

As far as your question #1 is concerned, if by regression you mean OLS, the normality requirement applies to the distribution of the residuals only, not to the outcome itself.
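
A minimal sketch of what this could look like in Stata, assuming the data are already mi set and that the two groups are coded in a binary variable (here called response, the name Sunil uses later in the thread; substitute your own grouping variable):

```stata
* Compare mean joint count between the two groups in the MI data.
* response is assumed to be the 0/1 grouping variable.
mi estimate: regress joint_count i.response
```

The coefficient on 1.response is the pooled difference in mean joint count between the groups, and its p-value tests whether that difference is zero. This compares means rather than rank distributions, so it is not a like-for-like replacement for ranksum.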

Kind regards,
Carlo
(Stata 19.0)

Tim Morris

Join Date: Apr 2014

Posts: 92
#3

27 Apr 2016, 07:20

Hi Sunil, on 1) and 2):

To understand the problem you need to consider what MI aims to do. mi estimate is an implementation of Rubin's rules, which combine the estimates of population parameters and their variances across imputed datasets, but cannot combine sample statistics that depend on n, such as p-values. When you specify the cmdok option, mi estimate runs the estimation command and looks for e(b) and e(V); if it cannot locate them it returns an error. Because ranksum does not estimate any population parameter, it does not return e(b) or e(V), and so you get an error from mi estimate.
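
For reference, Rubin's rules for a single parameter can be written as follows. The notation is an assumption of this sketch rather than something from the post: $\hat{Q}_i$ and $U_i$ are the point estimate and its variance (squared standard error) from imputed dataset $i = 1, \dots, m$.

```latex
\bar{Q} = \frac{1}{m}\sum_{i=1}^{m}\hat{Q}_i, \qquad
W = \frac{1}{m}\sum_{i=1}^{m} U_i, \qquad
B = \frac{1}{m-1}\sum_{i=1}^{m}\bigl(\hat{Q}_i-\bar{Q}\bigr)^2, \qquad
T = W + \Bigl(1+\frac{1}{m}\Bigr)B
```

The pooled estimate is $\bar{Q}$ with total variance $T$, the sum of the within-imputation variance $W$ and an inflated between-imputation variance $B$. Every ingredient is an estimate or its variance, which is why a command that returns only a test statistic and a p-value gives mi estimate nothing to combine.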

A question for you: was it joint count that you imputed, and if so, how? Using mi impute regress? If so, you are making assumptions about normality of the residuals in the imputation model. It sounds like you are uncomfortable with this, and it may introduce problems which feed through to your MI analysis. Conversely, if you are comfortable with using a certain model for imputation, why not also use it for your analysis? (Perhaps you have used mi impute pmm, which makes weaker assumptions.)
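
As a rough sketch of the mi impute pmm alternative mentioned above, assuming a dataset that has not yet been mi set or imputed, and assuming gender and response are fully observed numeric variables. The covariates, the number of imputations, the number of nearest neighbours and the seed are illustrative choices, not recommendations:

```stata
* Declare the MI structure and the variable with missing values.
mi set wide
mi register imputed joint_count

* Predictive mean matching: each missing value is filled with an observed
* joint count drawn from the knn() cases with the closest predicted means.
mi impute pmm joint_count i.gender i.response, add(20) knn(5) rseed(2016)
```

Because the imputed values are always observed joint counts, the imputations stay within the range and shape of the observed data, which is the sense in which pmm leans less on normality.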

Hope that helps to understand the first two problems. Tim

Sunil Sampath

Join Date: Oct 2015

Posts: 15
#4

27 Apr 2016, 08:09

Hi Carlo and Tim,

Many thanks for taking the time to respond to my query. I am still relatively new to stats and Stata, so please excuse my naivety.
Yes, joint_count is one of the variables that I imputed using a regression model. This variable has some degree of correlation with other variables in my data set, so I am actually comfortable with this. So, I guess then it should be fine to use this for my analysis as well.

May I please check: in my MI dataset I have a binary response variable (response). To answer the question of whether joint_count and gender differ between the responders and non-responders, suppose I simply ran

mi estimate: logistic response joint_count
&
mi estimate: logistic response gender

I understand that a logistic regression model is meant to determine whether the independent variable can predict the dependent variable. But would the p-values obtained from the logistic models specified above still answer the question of whether joint_count and gender are significantly different between the 2 response categories? Am I right?
Thank you
Kind regards

Tim Morris

Join Date: Apr 2014

Posts: 92
#5

27 Apr 2016, 08:29

Hi Sunil,

No problem. If you are new to MI there are plenty of potential pitfalls. I'd suggest reading the excellent tutorial by Ian White, Patrick Royston and Angela Wood before putting too much faith in results you get out of mi estimate.

If joint count and response are both binary variables then your mi estimate commands above should run ok, but what you wrote earlier implies joint count is not binary, and you have imputed it using linear regression, in which case the logistic command will return an error.

Tim

Sunil Sampath

Join Date: Oct 2015

Posts: 15
#6

27 Apr 2016, 08:36

Hi Tim
Thank you for the link to the tutorial.
Sorry I didn't make it clear. Joint_count is a continuous variable and gender is a categorical variable. My dependent variable, response, is binary. Joint_count was imputed using regression, but since the dependent variable is binary, I have used the logistic command. Thanks again.
Sunil

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17730
#7

27 Apr 2016, 10:24

Sunil:

"I understand that a logistic regression model is meant to determine if the independent variable can predict the dependent variable."

Basically, regression machinery (whether logistic or otherwise) investigates whether, other things being equal, a given predictor can explain the variation in the dependent variable.
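
To make "other things being equal" concrete, both predictors could be entered in one model so that each coefficient is adjusted for the other. A sketch using the variable names from earlier in the thread, assuming gender is stored as a numeric categorical variable:

```stata
* Pooled logistic regression across imputations, reported as odds ratios.
* i.gender treats gender as categorical; joint_count enters as continuous.
mi estimate, or: logistic response joint_count i.gender
```

Each p-value then refers to the association of that predictor with response while holding the other predictor fixed, which differs from the two separate single-predictor models.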

Tim suggested an excellent tutorial indeed.
My own first approach to MI was helped by the following article: http://www.ncbi.nlm.nih.gov/pubmed/12720255

Last edited by Carlo Lazzaro; 27 Apr 2016, 10:27.

Kind regards,
Carlo
(Stata 19.0)

Sunil Sampath

Join Date: Oct 2015

Posts: 15
#8

28 Apr 2016, 06:31

Thanks Carlo, for the explanation and article.
BW