Data shape for melogit, no observations error

Tom Lawson

Join Date: Jun 2022
Posts: 13

08 Feb 2023, 08:31

I'm working on a mixed effects multiple logistic regression model looking at risk factors for developing delirium (CAMICU) while in the hospital. My data is in long format. When I run a univariate model with data from daily repeated measures, such as whether or not they had been on a ventilator in the last day, it seems to work well...

Code:

. melogit CAMICU vent || record_id: , or

Fitting fixed-effects model:

Iteration 0:   log likelihood = -232.18903  
Iteration 1:   log likelihood =  -231.6529  
Iteration 2:   log likelihood = -231.65249  
Iteration 3:   log likelihood = -231.65249  

Refining starting values:

Grid node 0:   log likelihood = -206.01746

Fitting full model:

Iteration 0:   log likelihood = -206.01746  
Iteration 1:   log likelihood = -190.12013  
Iteration 2:   log likelihood = -187.78663  
Iteration 3:   log likelihood = -187.40216  
Iteration 4:   log likelihood =   -187.382  
Iteration 5:   log likelihood = -187.38115  
Iteration 6:   log likelihood = -187.38102  
Iteration 7:   log likelihood =   -187.381  
Iteration 8:   log likelihood = -187.38099  

Mixed-effects logistic regression               Number of obs     =        394
Group variable: record_id                       Number of groups  =        140

                                                Obs per group:
                                                              min =          1
                                                              avg =        2.8
                                                              max =          7

Integration method: mvaghermite                 Integration pts.  =          7

                                                Wald chi2(1)      =      24.45
Log likelihood = -187.38099                     Prob > chi2       =     0.0000
------------------------------------------------------------------------------
      CAMICU | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        vent |   22.30988   14.00888     4.94   0.000     6.516413    76.38109
       _cons |   .2028326   .0824132    -3.93   0.000     .0914712    .4497704
-------------+----------------------------------------------------------------
record_id    |
   var(_cons)|   11.45676   4.427319                      5.371875    24.43418
------------------------------------------------------------------------------
Note: Estimates are transformed only in the first equation to odds ratios.
Note: _cons estimates baseline odds (conditional on zero random effects).
LR test vs. logistic model: chibar2(01) = 88.54       Prob >= chibar2 = 0.0000

.

However, when I use a baseline variable that is recorded only once per patient, such as age or BMI, I get a "no observations" error. I've tried this with 10 other variables that have observations on this row and get the same error.

Code:

 .  melogit CAMICU age || record_id:
no observations
r(2000);

The baseline data is on a separate row from the daily outcome variable (CAMICU) (dataex below). Is this a problem? And if it is, how would I go about reshaping just this line?

Thank you,
Tom

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input byte age float(bmi CAMICU) byte vent
73  25.30864 . .
 .         . 1 1
 .         . 0 0
 .         . 0 0
 .         . . 0
 .         . . .
90 25.536703 . .
 .         . 0 0
 .         . 0 0
 .         . 0 0
 .         . 0 0
 .         . . 0
 .         . . .
86 28.981144 . .
 .         . 0 0
 .         . 0 0
 .         . . 0
 .         . . .
77 28.305996 . .
 .         . . 0
 .         . 1 0
 .         . 0 0
 .         . 0 0
 .         . 0 0
 .         . 0 0
 .         . 0 0
 .         . . .
64 16.917233 . .
 .         . . 0
 .         . 0 0
 .         . . .
52   39.5102 . .
 .         . 1 1
 .         . . 1
 .         . . 1
 .         . . 1
 .         . . 1
 .         . 1 1
 .         . 1 1
 .         . . .
58 29.737045 . .
 .         . 0 0
 .         . 0 0
 .         . . .
42  46.74515 . .
 .         . 0 0
 .         . 0 0
 .         . 0 0
 .         . . .
72 30.299204 . .
 .         . 1 1
 .         . . 0
 .         . . 0
 .         . . 0
 .         . . 0
 .         . . .
68 29.407597 . .
 .         . 0 0
 .         . 0 0
 .         . . .
80 22.773186 . .
 .         . . 0
 .         . 0 0
 .         . . 0
 .         . . 0
 .         . . 0
 .         . 1 0
 .         . 1 0
 .         . . .
62  41.50597 . .
 .         . . 1
 .         . . 1
 .         . 0 1
 .         . . 1
 .         . . 1
 .         . . 0
 .         . . 0
 .         . . .
79         . . .
 .         . . 1
 .         . . 1
 .         . . 1
 .         . 1 1
 .         . . 1
 .         . . 1
 .         . 1 1
 .         . . .
61   22.0741 . .
 .         . . 0
 .         . . .
52  27.33564 . .
 .         . 0 0
 .         . 0 0
 .         . . 0
 .         . 0 0
 .         . 0 0
 .         . . 0
 .         . 0 0
 .         . . .
49  32.52595 . .
end


Daniel Shin

Join Date: Mar 2020

Posts: 146
#2

08 Feb 2023, 09:45

The simplest solution would be this:

Code:

egen age_bsl = min(age), by(record_id)
egen bmi_bsl = min(bmi), by(record_id)
melogit CAMICU age_bsl || record_id:

This is assuming age and bmi are consistent within a record_id.
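To see why the original command stops with r(2000), and to check the assumption above before relying on it, something like the following sketch may help. It uses a small made-up extract in the shape of the -dataex- output; the record_id values and the lo_/hi_ helper variable names are assumptions for illustration only.

```stata
clear
input byte(record_id age) float bmi byte(CAMICU vent)
1 73  25.30864 . .
1  .         . 1 1
1  .         . 0 0
2 90 25.536703 . .
2  .         . 0 0
2  .         . 0 0
end

* melogit drops every row with a missing outcome or covariate, so count
* the rows where CAMICU and age are both observed (none in this layout)
count if !missing(CAMICU, age)

* Check that age and bmi take a single value within each record_id
* (lo_* and hi_* are hypothetical helper names)
foreach v of varlist age bmi {
    egen double lo_`v' = min(`v'), by(record_id)
    egen double hi_`v' = max(`v'), by(record_id)
    assert lo_`v' == hi_`v'
}
```

If the -assert- fails for some variable, the values differ within a patient and taking the minimum would silently pick one of them.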
Tom Lawson

Join Date: Jun 2022

Posts: 13
#3

09 Feb 2023, 18:06

Thank you, Daniel!
Clyde Schechter

Join Date: Apr 2014

Posts: 30357
#4

09 Feb 2023, 19:23

Let me make a general comment. The example data looks like it was imported from a spreadsheet, one in which each individual is represented by a group of rows. The first row is a "header" containing the person's age and BMI, and then there are varying numbers of rows containing the values of CAMICU and vent. It looks more or less like a PowerPoint slide with two levels of indentation. This is a nice organization for a spreadsheet; it is visually clean and easy for humans to understand.

But Stata is not a spreadsheet, and it does not perceive things the way humans do. This kind of organization in a Stata data set usually ends badly, and O.P.'s experience is an example of that. The transformations in #2 mostly rectify the problem. Better still would be also to remove the "header" rows. They serve no positive purpose in a Stata data set, and, although harmless in the regression context, they complicate matters for other types of analysis. This requires a slightly different approach to what is in #2:

Code:

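* Flag the "header" rows (assumes age is never missing on a header row;
* some header rows in the example have age but no bmi)
gen byte header = !missing(age)
* Carry the header values down to that person's daily rows, in current sort order
replace age = age[_n-1] if !header
replace bmi = bmi[_n-1] if !header
drop if header
drop header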

Note: If in the full data set the "header" rows contain other variables besides age and bmi, those too should be -replace-d in the same way as age and bmi. If there are a large number of such variables, use a loop.
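A loop along those lines might look like the sketch below. It is only an illustration on made-up rows: sex stands in for any additional header-row variable and is not in the original data, and header rows are identified here by a nonmissing age, which is an assumption about the full data set.

```stata
clear
input byte age float bmi byte(sex CAMICU vent)
73  25.30864 1 . .
 .         . . 1 1
 .         . . 0 0
90 25.536703 0 . .
 .         . . 0 0
end

* Flag header rows (assumes age is observed on every header row)
gen byte header = !missing(age)

* Carry each header-row variable down to the daily rows below it;
* sex is a hypothetical extra variable added for this example
foreach v of varlist age bmi sex {
    replace `v' = `v'[_n-1] if !header
}

drop if header
drop header
list, noobs
```

The varlist in the -foreach- line is the only part that needs to change when more header-row variables are involved.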

Now your data set is prepared not just for your -melogit- but almost all other Stata commands you will need to use. And, in the future, you will make your life easier if you prepare your data sets in this "rectangular" layout as a matter of routine.
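A few quick checks can confirm that a data set really is in that layout before modeling. This is a minimal sketch on made-up rows (the record_id values are assumed), not a complete validation.

```stata
clear
input byte(record_id age) float bmi byte(CAMICU vent)
1 73  25.30864 1 1
1 73  25.30864 0 0
2 90 25.536703 0 0
2 90 25.536703 0 0
end

* Every daily row should now carry the person-level value
assert !missing(age)

* Person-level variables should not vary within a person
bysort record_id (age): assert age[1] == age[_N]

* Number of rows a model using CAMICU, age and vent could actually use
count if !missing(CAMICU, age, vent)
```

If the final count is much smaller than the number of rows, some variable is still missing on most rows and the estimation sample will shrink accordingly.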