
  • Error message when introducing factor variables in stcox on imputed data

    Dear Statalist,

    I'm running stcox on multiply imputed data. I have used the same model and syntax on non-imputed data, and it works fine. But when I try to fit the model using mi estimate, I get the following error message:

    Code:
    mi estimate, hr: stcox i.HADS_A_gli c.PartAg_gli if Sex==0
    
    HADS_A_gli:  factor variables may not contain noninteger values
    an error occurred when mi estimate executed stcox on m=1
    r(452);
    It works fine when I run HADS_A_gli as a continuous variable. I also tried to specify the reference group using ib1.HADS_A_gli, without luck. Does anyone know why it's not possible to run stcox with a factor variable here?



    The imputation output looks like this:

    Code:
    . mi impute mvn HADS_A_gli HADS_D_gli Sex PartAg_gli SmoStatPackYrs_gli_missing SES_gli physact_gli alcoh
    > ol_gli COPDcat_gli CVD_gli cancer_gli chrondisADL_gli diabetes_gli musc_skel_gli if (glicopd_HUNT==1 | 
    > glicopd_HUNT==2), add(10) rseed (53421)
    note: variables Sex PartAg_gli COPDcat_gli CVD_gli cancer_gli chrondisADL_gli diabetes_gli
          musc_skel_gli contain no soft missing (.) values; imputing nothing
    
    Performing EM optimization:
      observed log likelihood = -6277.4486 at iteration 13
    
    Performing MCMC data augmentation ... 
    
    Multivariate imputation                     Imputations =       10
    Multivariate normal regression                    added =       10
    Imputed: m=1 through m=10                       updated =        0
    
    Prior: uniform                               Iterations =     1000
                                                    burn-in =      100
                                                    between =      100
    
    ------------------------------------------------------------------
                       |               Observations per m             
                       |----------------------------------------------
              Variable |   Complete   Incomplete   Imputed |     Total
    -------------------+-----------------------------------+----------
            HADS_A_gli |       1962          586       586 |      2548
            HADS_D_gli |       2169          379       379 |      2548
                   Sex |       2548            0         0 |      2548
            PartAg_gli |       2548            0         0 |      2548
        SmoS~i_missing |       2434          114       114 |      2548
               SES_gli |       2365          183       183 |      2548
           physact_gli |       2532           16        16 |      2548
           alcohol_gli |       2383          165       165 |      2548
           COPDcat_gli |       2548            0         0 |      2548
               CVD_gli |       2548            0         0 |      2548
            cancer_gli |       2548            0         0 |      2548
        chrondisADL_~i |       2548            0         0 |      2548
          diabetes_gli |       2548            0         0 |      2548
         musc_skel_gli |       2548            0         0 |      2548
    ------------------------------------------------------------------
    (complete + incomplete = total; imputed is the minimum across m
     of the number of filled-in observations.)
    
    . 
    end of do-file
    Best regards,
    Sigrid

  • #2
    Sigrid:
    I would check whether the imputation model created one or more non-integer values in -HADS_A_gli-.
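
    One way to run that check, as a minimal sketch: it assumes the data are still mi set with the mvn imputations in place, and it looks only at the first imputation (m=1), where the error was reported.

    Code:
    * count observations in m=1 whose HADS_A_gli is not a whole number
    mi xeq 1: count if HADS_A_gli != floor(HADS_A_gli) & !missing(HADS_A_gli)

    * inspect the range of HADS_A_gli in m=1 (imputed values may also be negative)
    mi xeq 1: summarize HADS_A_gli

    A nonzero count would explain r(452), since factor-variable notation accepts only nonnegative integer values.
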
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Carlo is right. With mvn (multivariate normal) as the imputation method, the imputed values will almost certainly be non-integer. If you want to impute categorical variables, you should use a chained equations approach (mi impute chained) with logit, ologit, or mlogit models.
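
      A chained-equations call along these lines might look as follows. This is only a sketch, not tested on your data. It assumes you start from the data before the mvn imputations were added, that the incomplete variables are already registered as imputed, that HADS_A_gli, HADS_D_gli, SES_gli, physact_gli, and alcohol_gli are ordered categories (hence ologit), and that SmoStatPackYrs_gli_missing is continuous (hence regress). Swap in logit or mlogit where a variable is binary or unordered.

      Code:
      * run from a do-file: /// continues a command onto the next line
      * assumed methods: ologit for ordered categories, regress for continuous
      mi impute chained                                                    ///
          (ologit)  HADS_A_gli HADS_D_gli SES_gli physact_gli alcohol_gli  ///
          (regress) SmoStatPackYrs_gli_missing                             ///
          = Sex PartAg_gli COPDcat_gli CVD_gli cancer_gli chrondisADL_gli  ///
            diabetes_gli musc_skel_gli                                     ///
          if (glicopd_HUNT==1 | glicopd_HUNT==2), add(10) rseed(53421)

      * the original analysis model, unchanged
      mi estimate, hr: stcox i.HADS_A_gli c.PartAg_gli if Sex==0

      Because ologit imputes only values that occur in the observed data, HADS_A_gli stays integer-valued and the i. prefix should then be accepted.
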

      Also, since you want to use a Cox model, you should probably include an estimate of the cumulative baseline hazard in the imputation model (White and Royston 2009). I have no personal experience with this, so I cannot advise further.
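
      For what it is worth, one common reading of that recommendation is to add the event indicator and the Nelson-Aalen estimate of the cumulative hazard as predictors in the imputation model. An untested sketch, assuming the data have been stset (so that _d exists), that no imputations have been added yet, and with H0_na as a made-up variable name:

      Code:
      * Nelson-Aalen cumulative hazard, computed from the stset data
      * (H0_na is a hypothetical name chosen for this example)
      sts generate H0_na = na

      * then list _d and H0_na among the complete predictors on the
      * right-hand side of the = in the mi impute chained call
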

      Best
      Daniel


      White, I. R., and Royston, P. 2009. Imputing missing covariate values for the Cox model. Statistics in Medicine, 28(15), 1982-1998.



      • #4
        Thank you for your helpful input, Carlo and Daniel! I will try to solve it using your advice, and post an update if I manage to get through with it.
