Stata doesn't recognize long format after multiple imputations by chained equations

John Havens

Join Date: Apr 2016
Posts: 35

27 Apr 2018, 22:18

Dear Statalists,

I have a dataset containing 2222 observations. I am working with three variables: patid (ID), depression, and month. I created 50 imputations with the data in wide format (-mi set wide-). After imputing, I reshaped the data to long using -mi reshape long depression, i(patid) j(month)-. Then I generated a passive binary variable called depressioncat using -mi passive: generate-.
However, after -mi xtset patid month-, I ran -mi estimate: proportion-, but Stata does not seem to recognize that the data are in long format. It reported 10494 observations.

Code:

. mi xtset patid month
       panel variable:  patid (strongly balanced)
        time variable:  month, 0 to 12, but with gaps
                delta:  1 unit

.
end of do-file

. do "C:\Users\tranh2\AppData\Local\Temp\STD09000000.tmp"

. mi estimate: proportion depressioncat

Multiple-imputation estimates      Imputations     =        50
Proportion estimation              Number of obs   =     10494
                                   Average RVI     =    0.2996
                                   Largest FMI     =    0.2322
                                   Complete DF     =     10493
DF adjustment:   Small sample      DF:     min     =    827.55
                                           avg     =    827.55
Within VCE type:     Analytic              max     =    827.55

---------------------------------------------------------------
              | Proportion   Std. Err.     [95% Conf. Interval]
--------------+------------------------------------------------
            0 |   .8546122   .0039228      .8469124    .8623119
            1 |   .1453878   .0039228      .1376881    .1530876
---------------------------------------------------------------

I checked with -mi set-, and the data are still reported as wide:

Code:

. mi set
data mi set wide, M = 50
last mi update 28apr2018 00:06:04, approximately 6 minutes ago

However, when I browse the data, they are in long format (each subject has 5 observations for 5 time points). This is part of my data with 5 imputations:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input long patid byte month float(depressioncat _1_depressioncat _2_depressioncat _3_depressioncat _4_depressioncat _5_depression)
10000  0 0 0 0 0 0  0
10000  1 0 0 0 0 0  0
10000  3 0 0 0 0 0  0
10000  6 0 0 0 0 0  0
10000 12 0 0 0 0 0  0
10007  0 0 0 0 0 0  7
10007  1 0 0 0 0 0  3
10007  3 0 0 0 0 0  1
10007  6 0 0 0 0 0  1
10007 12 0 0 0 0 0  0
10012  0 0 0 0 0 0  6
10012  1 . 0 0 0 0  3
10012  3 . 0 0 0 0  2
10012  6 . 0 0 0 1  0
10012 12 . 0 0 0 0  0
10013  0 1 1 1 1 1 15
10013  1 . 1 1 0 1  6
10013  3 . 0 0 0 1  0
10013  6 . 0 0 0 1  3
10013 12 . 0 0 0 0  0
10020  0 1 1 1 1 1 12
10020  1 . 0 0 0 0  0
10020  3 . 0 0 0 0  0
10020  6 0 0 0 0 0  0
10020 12 0 0 0 0 0  0
10102  0 0 0 0 0 0  7
10102  1 0 0 0 0 0  3
10102  3 0 0 0 0 0  0
10102  6 0 0 0 0 0  0
10102 12 . 0 0 0 0  2
10103  0 0 0 0 0 0  9
10103  1 . 1 0 0 0 15
10103  3 0 0 0 0 0  2
10103  6 1 1 1 1 1  3
10103 12 0 0 0 0 0  1
10104  0 0 0 0 0 0  1
10104  1 0 0 0 0 0  7
10104  3 1 1 1 1 1  3
10104  6 0 0 0 0 0  0
10104 12 1 1 1 1 1  3
10105  0 0 0 0 0 0  0
10105  1 0 0 0 0 0  1
10105  3 0 0 0 0 0  0
10105  6 0 0 0 0 0  0
10105 12 0 0 0 0 0  0
10106  0 0 0 0 0 0  8
10106  1 0 0 0 0 0  1
10106  3 0 0 0 0 0  1
10106  6 . . . . .  .
10106 12 . . . . .  .
10111  0 0 0 0 0 0  6
10111  1 0 0 0 0 0  8
10111  3 0 0 0 0 0  1
10111  6 0 0 0 0 0  0
10111 12 . 0 0 0 0  0
10130  0 1 1 1 1 1 12
10130  1 1 1 1 1 1 19
10130  3 1 1 1 1 1  3
10130  6 0 0 0 0 0  2
10130 12 0 0 0 0 0  0
10132  0 0 0 0 0 0  4
10132  1 0 0 0 0 0  2
10132  3 . 0 0 0 0  0
10132  6 0 0 0 0 0  0
10132 12 0 0 0 0 0  0
10134  0 0 0 0 0 0  0
10134  1 0 0 0 0 0  0
10134  3 . 0 0 0 0  0
10134  6 . 0 0 0 0  0
10134 12 0 0 0 0 0  0
10137  0 1 1 1 1 1 15
10137  1 0 0 0 0 0  5
10137  3 0 0 0 0 0  0
10137  6 0 0 0 0 0  2
10137 12 1 1 1 1 1  3
10140  0 0 0 0 0 0  0
10140  1 . 0 0 0 0  3
10140  3 . 0 0 0 0  2
10140  6 . 0 0 0 0  1
10140 12 . 0 0 0 0  2
10156  0 1 1 1 1 1 23
10156  1 . 1 1 1 1 16
10156  3 . 1 1 1 1  1
10156  6 . 1 1 1 1  6
10156 12 . 1 1 1 0  6
10158  0 1 1 1 1 1 16
10158  1 0 0 0 0 0  2
10158  3 0 0 0 0 0  2
10158  6 . 0 0 1 0  3
10158 12 0 0 0 0 0  0
10159  0 . . . . .  .
10159  1 . . . . .  .
10159  3 1 1 1 1 1  4
10159  6 0 0 0 0 0  2
10159 12 . . . . .  .
10161  0 1 1 1 1 1 11
10161  1 . 0 0 0 0 10
10161  3 0 0 0 0 0  0
10161  6 0 0 0 0 0  0
10161 12 0 0 0 0 0  0
end

I don't want to -mi reshape long- again, since the data are in long format now and the earlier reshape from wide to long took 20 hours.
Thanks for your help!

Stata MP 13 User

Clyde Schechter

Join Date: Apr 2014

Posts: 30065
#2

28 Apr 2018, 12:22

I think you meant to -mi convert- to a long style instead of -mi reshape long-, and that you are confusing the various long-format MI styles with the long shape (or layout) of an unimputed data set. In fact, as I look at this data, it doesn't look to me like there is anything to -reshape- into long layout. I'm actually surprised that you were able to -mi reshape- it; I would have expected an error message telling you that the data is already long. Rather, you might want to -mi convert- it to a long multiple-imputation style (choose mlong, flong, or flongsep).
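As a rough sketch of that suggestion, and assuming the data in memory are the -mi set wide- data from post #1 (with patid, month, and depressioncat as named there), the conversion might look like the following. The choice of flong here is only an illustration; mlong would work the same way, and flongsep additionally requires a name for the separate files.

```stata
* Sketch only: assumes the mi set wide data from post #1 are in memory
mi query                      // reports the current mi style and M
mi describe                   // lists the imputed and passive variables

* Change the mi storage style; the panel (long) layout of the rows is not reshaped
mi convert flong, clear       // -clear- permits converting data not saved since last change

* Re-declare the panel structure and re-run the estimate
mi xtset patid month
mi estimate: proportion depressioncat
```

Note that -mi convert- changes only how the imputations are stored (the style), not the layout of patients by months, so it should not require the lengthy -mi reshape- step again. Whether the reported number of observations changes after conversion is not something this thread confirms.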
John Havens

Join Date: Apr 2016

Posts: 35
#3

01 May 2018, 11:44

Thanks, Clyde!

Stata MP 13 User