  • Data shape for melogit, no observations error

    I'm working on a mixed effects multiple logistic regression model looking at risk factors for developing delirium (CAMICU) while in the hospital. My data is in long format. When I run a univariate model with data from daily repeated measures, such as whether or not they had been on a ventilator in the last day, it seems to work well...
    Code:
    . melogit CAMICU vent || record_id: , or
    
    Fitting fixed-effects model:
    
    Iteration 0:   log likelihood = -232.18903  
    Iteration 1:   log likelihood =  -231.6529  
    Iteration 2:   log likelihood = -231.65249  
    Iteration 3:   log likelihood = -231.65249  
    
    Refining starting values:
    
    Grid node 0:   log likelihood = -206.01746
    
    Fitting full model:
    
    Iteration 0:   log likelihood = -206.01746  
    Iteration 1:   log likelihood = -190.12013  
    Iteration 2:   log likelihood = -187.78663  
    Iteration 3:   log likelihood = -187.40216  
    Iteration 4:   log likelihood =   -187.382  
    Iteration 5:   log likelihood = -187.38115  
    Iteration 6:   log likelihood = -187.38102  
    Iteration 7:   log likelihood =   -187.381  
    Iteration 8:   log likelihood = -187.38099  
    
    Mixed-effects logistic regression               Number of obs     =        394
    Group variable: record_id                       Number of groups  =        140
    
                                                    Obs per group:
                                                                  min =          1
                                                                  avg =        2.8
                                                                  max =          7
    
    Integration method: mvaghermite                 Integration pts.  =          7
    
                                                    Wald chi2(1)      =      24.45
    Log likelihood = -187.38099                     Prob > chi2       =     0.0000
    ------------------------------------------------------------------------------
          CAMICU | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
            vent |   22.30988   14.00888     4.94   0.000     6.516413    76.38109
           _cons |   .2028326   .0824132    -3.93   0.000     .0914712    .4497704
    -------------+----------------------------------------------------------------
    record_id    |
       var(_cons)|   11.45676   4.427319                      5.371875    24.43418
    ------------------------------------------------------------------------------
    Note: Estimates are transformed only in the first equation to odds ratios.
    Note: _cons estimates baseline odds (conditional on zero random effects).
    LR test vs. logistic model: chibar2(01) = 88.54       Prob >= chibar2 = 0.0000
    
    .
    However, when I use a baseline variable that is recorded only once per patient, such as age or BMI, I get a "no observations" error. I've tried this with 10 other variables that have observations on the same row and get the same error.
    Code:
    . melogit CAMICU age || record_id:
    no observations
    r(2000);
    The baseline data is on a separate row from the daily outcome variable (CAMICU) (dataex below). Is this a problem? And if it is, how would I go about reshaping just this line?

    Thank you,
    Tom


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte age float(bmi CAMICU) byte vent
    73  25.30864 . .
     .         . 1 1
     .         . 0 0
     .         . 0 0
     .         . . 0
     .         . . .
    90 25.536703 . .
     .         . 0 0
     .         . 0 0
     .         . 0 0
     .         . 0 0
     .         . . 0
     .         . . .
    86 28.981144 . .
     .         . 0 0
     .         . 0 0
     .         . . 0
     .         . . .
    77 28.305996 . .
     .         . . 0
     .         . 1 0
     .         . 0 0
     .         . 0 0
     .         . 0 0
     .         . 0 0
     .         . 0 0
     .         . . .
    64 16.917233 . .
     .         . . 0
     .         . 0 0
     .         . . .
    52   39.5102 . .
     .         . 1 1
     .         . . 1
     .         . . 1
     .         . . 1
     .         . . 1
     .         . 1 1
     .         . 1 1
     .         . . .
    58 29.737045 . .
     .         . 0 0
     .         . 0 0
     .         . . .
    42  46.74515 . .
     .         . 0 0
     .         . 0 0
     .         . 0 0
     .         . . .
    72 30.299204 . .
     .         . 1 1
     .         . . 0
     .         . . 0
     .         . . 0
     .         . . 0
     .         . . .
    68 29.407597 . .
     .         . 0 0
     .         . 0 0
     .         . . .
    80 22.773186 . .
     .         . . 0
     .         . 0 0
     .         . . 0
     .         . . 0
     .         . . 0
     .         . 1 0
     .         . 1 0
     .         . . .
    62  41.50597 . .
     .         . . 1
     .         . . 1
     .         . 0 1
     .         . . 1
     .         . . 1
     .         . . 0
     .         . . 0
     .         . . .
    79         . . .
     .         . . 1
     .         . . 1
     .         . . 1
     .         . 1 1
     .         . . 1
     .         . . 1
     .         . 1 1
     .         . . .
    61   22.0741 . .
     .         . . 0
     .         . . .
    52  27.33564 . .
     .         . 0 0
     .         . 0 0
     .         . . 0
     .         . 0 0
     .         . 0 0
     .         . . 0
     .         . 0 0
     .         . . .
    49  32.52595 . .
    end


  • #2
    The simplest solution would be this:
    Code:
    egen age_bsl = min(age), by(record_id)
    egen bmi_bsl = min(bmi), by(record_id)
    melogit CAMICU age_bsl || record_id:
    This is assuming age and bmi are consistent within a record_id.
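
    To see why the original command stops with r(2000), and to check the consistency assumption before relying on min(), something like the sketch below may help. It uses a small made-up data set in the same layout as the example; the record_id values are invented for illustration, since the -dataex- output does not include that variable.

    ```stata
    * Toy data in the same layout as the example (record_id values are made up)
    clear
    input byte record_id byte age float bmi byte(CAMICU vent)
    1 73 25.3 . .
    1  .    . 1 1
    1  .    . 0 0
    2 90 25.5 . .
    2  .    . 0 0
    2  .    . 0 0
    end

    * No row has both the outcome and the baseline covariate nonmissing,
    * so -melogit CAMICU age- is left with zero usable observations
    count if !missing(CAMICU, age)
    assert r(N) == 0

    * Check the assumption: at most one distinct nonmissing value per record_id
    foreach v of varlist age bmi {
        egen `v'_min = min(`v'), by(record_id)
        egen `v'_max = max(`v'), by(record_id)
        assert `v'_min == `v'_max
        drop `v'_min `v'_max
    }
    ```

    If either -assert- in the loop fails, some record_id has conflicting baseline values, and taking the minimum would silently pick one of them.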



    • #3
      thank you Daniel!



      • #4
        Let me make a general comment. The example data looks like it was imported from a spreadsheet in which each individual is represented by a group of rows. The first row is a "header" containing the person's age and BMI, followed by a varying number of rows containing the values of CAMICU and vent. It looks more or less like a PowerPoint slide with two levels of indentation. This is a nice organization for a spreadsheet; it is visually clean and easy for humans to understand.

        But Stata is not a spreadsheet, and it does not perceive things the way humans do. This kind of organization in a Stata data set usually ends badly, and O.P.'s experience is an example of that. The transformations in #2 mostly rectify the problem. Better still would be also to remove the "header" rows. They serve no positive purpose in a Stata data set, and, although harmless in the regression context, they complicate matters for other types of analysis. This requires a slightly different approach to what is in #2:
        Code:
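        * Flag header rows: any row that carries a baseline value
        gen byte header = !missing(age) | !missing(bmi)
        * Carry baseline values down, but never into the next patient's header row
        replace age = age[_n-1] if missing(age) & !header
        replace bmi = bmi[_n-1] if missing(bmi) & !header
        drop if header
        drop header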
        Note: If in the full data set the "header" rows contain other variables besides age and bmi, those too should be -replace-d in the same way as age and bmi. If there are a large number of such variables, use a loop.
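
        As a rough sketch of what such a loop might look like: the local macro baseline_vars is a hypothetical name, and the toy data are made up to mimic the layout of the example. The sketch assumes a row is a "header" whenever any of the listed baseline variables is nonmissing on it, and that the rows are still in their original order.

        ```stata
        * Toy data: two patients, the second with a missing bmi on the header row
        clear
        input byte age float bmi byte(CAMICU vent)
        73 25.3 . .
         .    . 1 1
         .    . 0 0
        90    . . .
         .    . 0 0
         .    . 0 0
        end

        * Hypothetical list of baseline variables; extend as needed
        local baseline_vars age bmi

        * A row is a header if it carries any baseline value
        gen byte header = 0
        foreach v of local baseline_vars {
            replace header = 1 if !missing(`v')
        }

        * Carry each baseline value down to the rows beneath its header
        foreach v of local baseline_vars {
            replace `v' = `v'[_n-1] if missing(`v') & !header
        }

        drop if header
        drop header
        ```

        Flagging the header rows before carrying anything forward keeps a missing baseline value on one patient's header from being filled in from the previous patient.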

        Now your data set is prepared not just for your -melogit- but almost all other Stata commands you will need to use. And, in the future, you will make your life easier if you prepare your data sets in this "rectangular" layout as a matter of routine.
