  • Mixed effects model and crossed random effects


    Hi,


    I am analyzing a large dataset on surgeries. We want to explore the risk of a surgical complication depending on whether the surgery was performed during the day or at night. The model will be adjusted for sex, age, comorbidity (grpci), and case urgency.
    We have missing data on the ASA score (physical status score) and are therefore using multiple imputation, using all variables in the regression for estimation.

    We believe that the coefficients vary by hospital as well as by patient. We therefore wish to use a mixed effects model after the imputation of ASA. A given patient might undergo surgery at different hospitals, so we do not want patients nested within hospitals.

    So I guess what we want is a crossed random effects model. Am I right? If so, I would like to include a random effect for “Id” and another random effect for “Hospital”. However, the code takes days to run on my new MacBook Pro.

    Our code is as follows:

    Code:
    ** Imputation part (only missingness on ASA)
    mi set mlong
    mi register imputed ASA
    mi impute mlogit ASA Complication NightTime Sex Age grpci CaseUrgency Hospital, add(20) noisily augment
    
    ** Estimation part
    mi estimate, or: meqrlogit Complication i.NightTime i.ASA i.Sex Age i.grpci i.CaseUrgency || Hospital: || Id:
    Is this the correct syntax?




    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long Id int Age byte(Sex Hospital grpci Complication ASA NightTime) float CaseUrgency
      1 88 0 58 2 0 1 1 0
      3 40 1 58 0 0 0 0 0
      4 58 0 36 0 0 0 0 0
     13 42 1 58 0 0 0 0 0
     13 42 1 58 0 0 1 0 0
     13 42 1 58 0 0 1 0 0
     17 20 0 55 0 0 0 1 0
     21 56 0 79 1 0 1 1 0
     21 57 0 79 2 0 1 1 0
     22 65 1 30 2 0 1 1 0
     22 65 1 30 2 0 1 1 1
     25 77 1 25 2 0 0 0 0
     25 77 1 25 2 0 0 1 0
     26 55 0 36 2 0 1 0 0
     27 91 0 30 2 0 2 0 1
     30 77 1 29 2 0 1 0 0
     32 62 0 57 1 0 . 1 1
     35 57 0 70 0 0 0 0 0
     41 54 1 30 0 0 2 0 1
     45 31 0 29 0 0 2 1 1
     46 50 1  1 1 0 0 0 1
     57 37 0 56 0 0 . 0 1
     62 45 0 29 0 0 0 1 0
     64 40 0 43 0 0 0 1 0
     71 71 1 58 2 0 . 1 0
     71 71 1 58 2 0 1 1 0
     72 34 0 79 0 0 1 0 0
     73 21 0  1 0 0 0 1 0
     74 33 0 29 0 0 0 0 0
     74 33 0 14 0 0 0 1 0
     77 43 0  1 0 0 0 0 0
     77 43 0  1 0 0 0 0 0
     77 43 0  1 2 0 0 0 0
     85 32 0 14 0 0 0 1 0
     86 39 0 14 0 0 0 1 1
     90 74 0  9 0 0 1 1 0
     92 43 0 58 0 0 0 0 0
     93 29 0 17 0 0 0 1 0
     98 44 1  1 0 0 . 0 0
    100 29 0 58 0 0 0 1 0
    103 70 1 41 2 0 0 1 0
    106 78 1 14 2 0 2 1 0
    108 31 1 14 0 0 0 1 0
    109 57 0 58 1 0 0 1 0
    110 75 1  9 2 0 2 0 0
    110 75 1  9 2 0 1 1 0
    112 19 0  1 0 0 0 1 0
    116 90 0  9 2 0 1 0 0
    123 76 1  9 0 0 0 1 0
    124 64 0 14 0 0 0 0 0
    125 71 0 14 1 0 0 0 0
    127 53 0 56 0 0 0 1 0
    132 25 1  7 0 0 0 1 0
    135 55 0 62 0 0 0 0 0
    136 81 1 58 0 0 0 0 0
    144 83 1 12 1 0 1 1 0
    146 48 0 53 0 0 0 0 0
    149 45 0 11 0 0 0 0 0
    158 53 1 39 0 0 0 0 0
    161 24 1 34 0 0 0 1 0
    169 34 0 65 0 0 0 0 0
    174 20 1 34 0 0 0 1 0
    174 20 1 34 0 0 0 1 0
    179 39 0  9 0 0 0 1 0
    185 34 1 55 0 0 0 0 1
    188 73 1 57 0 0 0 1 0
    188 73 1 57 0 0 0 1 0
    188 74 1 80 0 0 0 1 0
    189 29 0 61 0 0 0 1 0
    189 29 0 61 0 0 0 0 0
    189 31 0 61 0 0 0 0 0
    190 26 0 72 0 0 0 1 0
    196 49 0 14 0 0 0 1 0
    197 70 0 14 0 0 0 1 0
    198 27 1 56 0 0 0 0 0
    206 49 0 44 0 0 0 0 0
    208 72 0 30 2 0 1 1 1
    209 44 1 56 0 0 0 1 0
    210 36 1 32 1 0 0 1 0
    210 37 1 72 1 0 0 0 0
    210 38 1 58 1 0 1 0 0
    211 74 1 58 2 0 1 0 0
    213 74 0  9 1 0 0 1 0
    228 88 0  9 0 0 2 1 0
    229 23 1 65 0 0 0 0 0
    230 49 1 29 0 0 0 0 0
    232 63 0 57 1 0 . 1 1
    239 52 0 57 0 0 0 1 0
    247 28 1 45 0 0 0 0 0
    250 76 1 14 2 0 1 0 0
    250 76 1 14 2 0 1 1 0
    255 80 0 58 0 0 2 0 1
    261 64 0  1 0 0 0 1 1
    261 64 0  1 0 0 2 1 0
    267 94 0 14 2 0 1 1 0
    268 56 1 29 0 0 0 1 0
    270 70 0 30 0 0 1 1 0
    275 62 1 56 0 0 0 0 0
    277 74 0 58 2 0 1 0 0
    278 32 0 14 0 0 0 0 1
    end
    label values Sex KON
    label def KON 0 "0. Female", modify
    label def KON 1 "1. Male", modify
    label values grpci grpci
    label def grpci 0 "0. Charlson Index Score 0", modify
    label def grpci 1 "1. Charlson Index Score 1", modify
    label def grpci 2 "2. Charlson Index Score >1", modify
    label values ASA ASA_4cat
    label def ASA_4cat 0 "ASA Score 1-2", modify
    label def ASA_4cat 1 "ASA Score 3", modify
    label def ASA_4cat 2 "ASA Score 4", modify

  • #2
    You might want to try a linear model (-mixed-) as a faster alternative. In any case, this is not a fully crossed model. You need:

    Code:
    mi estimate, or: melogit Complication i.NightTime i.ASA i.Sex Age i.grpci i.CaseUrgency || _all: R.Hospital || _all: R.Id
    Best wishes

    (Stata 16.1 MP)



    • #3
      Originally posted by Felix Bittmann View Post
      You might want to try a linear model (-mixed-) as a faster alternative. In any case, this is not a fully crossed model. You need:

      Code:
      mi estimate, or: melogit Complication i.NightTime i.ASA i.Sex Age i.grpci i.CaseUrgency || _all: R.Hospital || _all: R.Id
      Thank you for the clarification. So if I understand the syntax above correctly, "_all" specifies that all variables are crossed in both "Id" and "Hospital"?

      What if only a few variables vary within "Id" and "Hospital", respectively? Would the following be a correct specification?

      Code:
      mi estimate, or: melogit Complication i.NightTime i.ASA i.Sex Age i.grpci i.CaseUrgency || Hospital: NightTime CaseUrgency || Id: Sex Age grpci ASA



      • #4
        I think you are confusing random slopes with the level at which a variable varies. By coding
        Code:
        Hospital: NightTime CaseUrgency || Id: Sex Age grpci ASA
        you are specifying random slopes at these levels. For example, you are saying that the sex variable has a different effect for each Id, and that CaseUrgency has a different effect at each Hospital. Now, perhaps the latter makes sense and is what you intend. But the former does not. In fact, estimating this model will probably fail to converge because it will likely not be possible to disentangle the residual error term from the individual level random slopes of variables which are defined and vary at the individual level (Sex and Age; I don't know what grpci and ASA are, so this may or may not apply to them.)

        In any case, the important general point here is that when you list variables in the random portion of the model at a given level, you are telling Stata to calculate random slopes for those variables at that level. This notation should not be used to try to tell Stata at what level the variables are defined (vary). There is no need to specify that information in the command at all. And, most important, it is invalid to specify a random slope for a variable at the same level at which it is defined.
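
        For example, a specification along those lines, with random intercepts for both crossed factors and a random slope only for CaseUrgency at the Hospital level, would be (a sketch; note that with very many patients the R.Id term is demanding of memory):

        Code:
        * random intercepts for the crossed factors Id and Hospital, plus a
        * random slope for CaseUrgency at the Hospital level only
        melogit Complication i.NightTime i.ASA i.Sex Age i.grpci i.CaseUrgency ///
            || _all: R.Id || Hospital: CaseUrgency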



        • #5
          In addition to Clyde's excellent advice, here are some more pointers regarding crossed effects:

          https://blog.stata.com/2010/12/22/in...ffects-models/

          https://www.stata-press.com/books/mu...odeling-stata/

          While one can get really complex with these models, you should start simple. Especially since the computation times are already very long in your case, you might want to start with a simple baseline model. Then you can add higher-level covariates and random slopes if necessary.
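
          For example, a minimal baseline with random intercepts only could look like this (a sketch):

          Code:
          mi estimate, or: melogit Complication || _all: R.Hospital || _all: R.Id
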
          Best wishes

          (Stata 16.1 MP)



          • #6
            Thank you Clyde and Felix for your answers. Much appreciated.
            OK, this was indeed complex. Moreover, I did not give a clear explanation of my concern.

            I have patients (Id) that are undergoing surgery. I want to examine the risk of complications (Complication). My concerns are:
            1. Some patients undergo several surgeries during the study period, while others undergo only one. A form of repeated measurements, one could argue.
            2. I suspect that there are different risks of complications at different hospitals.
            3. Patients undergoing several surgeries might not have all of them at the same hospital. One surgery could very well be performed at hospital A, the next at hospital B, the third at hospital G, and so on...
            I want to take these matters into account. My thought was therefore to use random effects in a crossed random effects model. I might be wrong here; please correct me if this is the wrong way to go.

            So a more correct model would look like this (?):
            Code:
            mi estimate, or: melogit Complication i.NightTime i.ASA i.Sex Age i.grpci i.CaseUrgency || _all: R.Hospital || _all: R.Id
            If I have understood the links provided and my reading on the subject correctly, this specifies that both Hospital and Id will have random effects, but that they are not nested in each other in any way. Correct?
            Does this make sense or am I still lost…?



            • #7
              The code in comment #6 is the same code Felix provided. The problem I have with it is that it takes forever to run. Is there a way to make it faster? My dataset is roughly 500k surgeries. I read about the option
              Code:
              intmethod(pclaplace)
              But I do not understand the consequences of this.
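
              For context, this is how I understand it would be specified, with the option placed after the final random-effects equation (a sketch; I am assuming my Stata version supports -intmethod(pclaplace)- with -melogit-):

              Code:
              mi estimate, or: melogit Complication i.NightTime i.ASA i.Sex Age ///
                  i.grpci i.CaseUrgency || _all: R.Hospital || _all: R.Id, ///
                  intmethod(pclaplace)
              From what I have read, Laplace-type approximations are faster than adaptive quadrature but can be less accurate, particularly for the variance-component estimates.
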
              Last edited by Jesper Eriksson; 20 Jan 2025, 11:56.



              • #8
                These are all excellent questions, and to get a competent answer you need to ask health researchers or look at published articles in the relevant journals of your field. For example, while time and hospitals can be treated as levels, one could also argue that these are fixed effects. It also depends: are these factors just nuisances or main research aspects? And, as you see, the practical questions are always relevant. Even if you have a great model, if it takes weeks to compute, is it helpful? And imputation does not make things easier or faster...

                If you have a binary outcome variable, you can try this simpler model first:

                Code:
                mi estimate, dots: mixed Complication i.NightTime i.ASA i.Sex Age i.grpci i.CaseUrgency, vce(cluster Id) || Hospital:
                Here we handle the Id aspect with clustered SEs and focus on the hospitals.
                Best wishes

                (Stata 16.1 MP)



                • #9
                  Originally posted by Felix Bittmann View Post
                  These are all excellent questions, and to get a competent answer you need to ask health researchers or look at published articles in the relevant journals of your field. For example, while time and hospitals can be treated as levels, one could also argue that these are fixed effects. It also depends: are these factors just nuisances or main research aspects? And, as you see, the practical questions are always relevant. Even if you have a great model, if it takes weeks to compute, is it helpful? And imputation does not make things easier or faster...

                  If you have a binary outcome variable, you can try this simpler model first:

                  Code:
                  mi estimate, dots: mixed Complication i.NightTime i.ASA i.Sex Age i.grpci i.CaseUrgency, vce(cluster Id) || Hospital:
                  Here we handle the Id aspect with clustered SEs and focus on the hospitals.
                  Thank you again for the advice. I will try using clustered SEs when I get to my computer tomorrow morning. Hopefully that will take a little less time. I am, as always, grateful for this forum.



                  • #10
                    How many different hospitals are in the data? I would imagine that the number of patients in your data set is very large, but the number of hospitals might just be modest. If the number is relatively small, I would keep the Id: random effect but have Hospital as a fixed effect. That way you still capture both sources of variation in the modeling. But having Hospital as a fixed effect instead of a random intercept level will be faster, and it will automatically be crossed with Id:. It will also provide more useful results if part of your research question involves contrasting various hospitals.
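
                    That suggestion would look something like this (a sketch, using the covariates from #1; untested at your data's scale):

                    Code:
                    mi estimate, or: melogit Complication i.NightTime i.ASA i.Sex Age ///
                        i.grpci i.CaseUrgency i.Hospital || Id: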

                    Another way to speed things up a bit and keep the crossed random effects:
                    Code:
                    ... mixed ... || _all: R.Hospital || Id:
                    This fits exactly the same model as the version with -|| _all: R.Hospital || _all: R.Id-, but is both faster and less demanding of memory. If the number of Ids is very large, and much greater than the number of Hospitals, the speedup here will be quite substantial.
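
                    Spelled out with your covariates, that would be (a sketch; I use -melogit- as in the earlier posts):

                    Code:
                    mi estimate, or: melogit Complication i.NightTime i.ASA i.Sex Age ///
                        i.grpci i.CaseUrgency || _all: R.Hospital || Id: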

                    Another possibility is to eliminate the repeated-patient level by changing the underlying data set: randomly select and retain only one surgery per patient. But don't do this if you think it likely that the "frequent fliers" have, systematically, a different level of complication risk from the occasionally operated upon. Or, closer to the original data set, for patients operated on at multiple hospitals, select and retain surgeries from only one hospital per patient. This approach would probably be OK if "frequent fliers" have a different level of risk, but would be problematic if patients who use multiple hospitals have a systematically different risk.
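
                    The random selection itself is straightforward (a sketch, to be run on the pre-imputation data; the seed is arbitrary):

                    Code:
                    * keep one randomly chosen surgery per patient
                    set seed 12345
                    generate double u = runiform()
                    bysort Id (u): keep if _n == 1
                    drop u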

                    Finally, I have to say that execution times measured in days, as you described in #1, are probably to be expected in a three-level model with N = 500,000 and multiple imputation on top of that! And it will be longer with crossed effects than it would be were the effects nested. I have had simpler models than this take 2 weeks to complete estimation. As Felix Bittmann says, you have to decide whether it is worth investing that much compute time in this question, or whether you need to simplify the project in some way that may be more approximate. But if the results are not needed urgently, and none of these speed-ups are appropriate or work out, consider letting the thing run till it's done.



                    • #11
                      Clyde brings up quite relevant aspects here. It would be really interesting to see: how many hospitals are in the dataset? How many observations are there per ID? You can test this with an empty model like this:

                      Code:
                      mi estimate: mixed Complication || Hospital:
                      mi estimate: mixed Complication || Id:
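
                      You can also get these counts descriptively, without fitting any model (a sketch, run on the original data before -mi set-):

                      Code:
                      codebook Hospital Id, compact    // distinct hospitals and patients
                      duplicates report Id             // distribution of surgeries per patient
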
                      Best wishes

                      (Stata 16.1 MP)



                      • #12
                        Originally posted by Clyde Schechter View Post
                        How many different hospitals are in the data? I would imagine that the number of patients in your data set is very large, but the number of hospitals might just be modest. If the number is relatively small, I would keep the Id: random effect but have Hospital as a fixed effect. That way you still capture both sources of variation in the modeling. But having Hospital as a fixed effect instead of a random intercept level will be faster, and it will automatically be crossed with Id:. It will also provide more useful results if part of your research question involves contrasting various hospitals.

                        Another way to speed things up a bit and keep the crossed random effects:
                        Code:
                        ... mixed ... || _all: R.Hospital || Id:
                        This fits exactly the same model as the version with -|| _all: R.Hospital || _all: R.Id-, but is both faster and less demanding of memory. If the number of Ids is very large, and much greater than the number of Hospitals, the speedup here will be quite substantial.

                        Another possibility is to eliminate the repeated-patient level by changing the underlying data set: randomly select and retain only one surgery per patient. But don't do this if you think it likely that the "frequent fliers" have, systematically, a different level of complication risk from the occasionally operated upon. Or, closer to the original data set, for patients operated on at multiple hospitals, select and retain surgeries from only one hospital per patient. This approach would probably be OK if "frequent fliers" have a different level of risk, but would be problematic if patients who use multiple hospitals have a systematically different risk.

                        Finally, I have to say that execution times measured in days, as you described in #1, are probably to be expected in a three-level model with N = 500,000 and multiple imputation on top of that! And it will be longer with crossed effects than it would be were the effects nested. I have had simpler models than this take 2 weeks to complete estimation. As Felix Bittmann says, you have to decide whether it is worth investing that much compute time in this question, or whether you need to simplify the project in some way that may be more approximate. But if the results are not needed urgently, and none of these speed-ups are appropriate or work out, consider letting the thing run till it's done.
                        There are 79 hospitals, 994k surgeries, and 366k patients. My plan was to use multiple imputation with 20 imputations on missing ASA (none of the other variables have missingness). I will try...

                        Code:
                        ... mixed ... || _all: R.Hospital || Id:
                        ... and see if the speedup works. Eliminating all but one surgery is not an appealing choice, since we believe that, as you correctly presumed, frequent fliers have a systematically different level of complication risk compared to patients with only one or a few surgeries. We do not think that we can capture and model this extra risk using the variables in the model. Your other suggestion, keeping only one hospital per patient, is more appealing. However, we know that some hospitals (the larger university hospitals) take on more complicated surgeries and patients. We are concerned that a patient who undergoes surgery at a small hospital and experiences a complication is then scheduled for his/her next surgery at a larger hospital (since the surgery is now viewed as more complicated).

                        I will try your code suggestion (for a few days) and see if I get a result. Thanks a lot for your help!



                        • #13
                          Originally posted by Felix Bittmann View Post
                          These are all excellent questions, and to get a competent answer you need to ask health researchers or look at published articles in the relevant journals of your field. For example, while time and hospitals can be treated as levels, one could also argue that these are fixed effects. It also depends: are these factors just nuisances or main research aspects? And, as you see, the practical questions are always relevant. Even if you have a great model, if it takes weeks to compute, is it helpful? And imputation does not make things easier or faster...

                          If you have a binary outcome variable, you can try this simpler model first:

                          Code:
                          mi estimate, dots: mixed Complication i.NightTime i.ASA i.Sex Age i.grpci i.CaseUrgency, vce(cluster Id) || Hospital:
                          Here we handle the Id aspect with clustered SEs and focus on the hospitals.
                          Thank you Felix! However, when I try your code I get the following error message:

                          Code:
                          Imputations (20):
                            
                          highest-level groups are not nested within LopNr
                          an error occurred when mi estimate executed mixed on m=1
                          r(459);
                          Is the reason for the error that the code says observations are non-independent at the Id level [vce(cluster Id)] but at the same time says they are independent at other levels, resulting in a conflict (for lack of a better word) with the random effect on Hospital? Or is the error due to something else?



                          • #14
                            Yes, I think it means that you have a crossed-effects dataset. Maybe Clyde's suggestion from #10 works better. You can first try it without covariates or with only a few imputations to get a first impression:

                            Code:
                            mi estimate, nimp(5) dots: ...
                            Best wishes

                            (Stata 16.1 MP)



                            • #15
                              Originally posted by Felix Bittmann View Post
                              Clyde brings up quite relevant aspects here. It would be really interesting to see: how many hospitals are in the dataset? How many observations are there per ID? You can test this with an empty model like this:

                              Code:
                              mi estimate: mixed Complication || Hospital:
                              mi estimate: mixed Complication || Id:
                              Sorry for the late reply, Felix. Here are the results for the above code:

                              Code:
                              . mixed Complication || Hospital:
                              
                              Performing EM optimization: 
                              
                              Performing gradient-based optimization: 
                              
                              Iteration 0:   log likelihood =  1054376.7  
                              Iteration 1:   log likelihood =  1054376.7  
                              
                              Computing standard errors:
                              
                              Mixed-effects ML regression                     Number of obs     =    442,911
                              Group variable: Hospital                        Number of groups  =         78
                              
                                                                              Obs per group:
                                                                                            min =          1
                                                                                            avg =    5,678.3
                                                                                            max =     22,361
                              
                                                                              Wald chi2(0)      =          .
                              Log likelihood =  1054376.7                     Prob > chi2       =          .
                              
                              ------------------------------------------------------------------------------
                              Complication |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                              -------------+----------------------------------------------------------------
                                     _cons |   .0005243   .0000503    10.43   0.000     .0004257    .0006228
                              ------------------------------------------------------------------------------
                              
                              ------------------------------------------------------------------------------
                                Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
                              -----------------------------+------------------------------------------------
                              Hospital: Identity           |
                                                var(_cons) |   6.61e-08   2.45e-08      3.19e-08    1.37e-07
                              -----------------------------+------------------------------------------------
                                             var(Residual) |   .0005009   1.06e-06      .0004988     .000503
                              ------------------------------------------------------------------------------
                              LR test vs. linear model: chibar2(01) = 22.68         Prob >= chibar2 = 0.0000
                              Code:
                              . mixed Complication || Id:
                              
                              Performing EM optimization: 
                              
                              Performing gradient-based optimization: 
                              
                              Iteration 0:   log likelihood =  1050268.8  
                              Iteration 1:   log likelihood =  1054363.5  
                              Iteration 2:   log likelihood =  1054365.4  
                              Iteration 3:   log likelihood =  1054365.4  
                              
                              Computing standard errors:
                              
                              Mixed-effects ML regression                     Number of obs     =    442,911
                              Group variable: Id                              Number of groups  =    365,960
                              
                                                                              Obs per group:
                                                                                            min =          1
                                                                                            avg =        1.2
                                                                                            max =        155
                              
                                                                              Wald chi2(0)      =          .
                              Log likelihood =  1054365.4                     Prob > chi2       =          .
                              
                              ------------------------------------------------------------------------------
                              Complication |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                              -------------+----------------------------------------------------------------
                                     _cons |   .0005012   .0000336    14.90   0.000     .0004353    .0005671
                              ------------------------------------------------------------------------------
                              
                              ------------------------------------------------------------------------------
                                Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
                              -----------------------------+------------------------------------------------
                              Id: Identity                 |
                                                var(_cons) |   3.19e-11   2.67e-12      2.71e-11    3.76e-11
                              -----------------------------+------------------------------------------------
                                             var(Residual) |    .000501   1.06e-06      .0004989    .0005031
                              ------------------------------------------------------------------------------
                              LR test vs. linear model: chibar2(01) = 0.00          Prob >= chibar2 = 1.0000
                              If I may ask your opinion on a way forward, I would be grateful. Is there a way to evaluate whether a random effect on Id has a significant impact on model fit? If not, maybe I can just skip using Id as a random effect, which would make the model much easier to compute. I have tried:

                              Code:
                              logit Complication i.NightTime i.ASA i.Sex Age i.grpci i.CaseUrgency, or 
                              
                              Iteration 0:   log likelihood = -1712.3685  
                              Iteration 1:   log likelihood = -1642.3642  
                              Iteration 2:   log likelihood = -1578.7786  
                              Iteration 3:   log likelihood = -1564.5523  
                              Iteration 4:   log likelihood = -1564.4664  
                              Iteration 5:   log likelihood = -1564.4664  
                              
                              Logistic regression                             Number of obs     =    415,331
                                                                              LR chi2(9)        =     295.80
                                                                              Prob > chi2       =     0.0000
                              Log likelihood = -1564.4664                     Pseudo R2         =     0.0864
                              
                              ---------------------------------------------------------------------------------------------
                                             Complication | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
                              ----------------------------+----------------------------------------------------------------
                                              1.NightTime |   1.118775   .1622073     0.77   0.439     .8420349    1.486468
                                                          |
                                                      ASA |
                                             ASA Score 3  |   1.315612   .2758496     1.31   0.191     .8722742    1.984279
                                             ASA Score 4  |   3.410797   .8047113     5.20   0.000     2.147986    5.416021
                                             ASA Score 5  |    9.90411   4.168505     5.45   0.000     4.340666    22.59824
                                                          |
                                                      Sex |
                                                  1. Man  |   .8238151   .1203674    -1.33   0.185     .6186726     1.09698
                                                      Age |   1.040259   .0060355     6.80   0.000     1.028496    1.052156
                                                          |
                                                    grpci |
                               1. Charlson Index Score 1  |   2.309881   .5464024     3.54   0.000     1.452907    3.672327
                              2. Charlson Index Score >1  |   2.579663   .5503176     4.44   0.000     1.698153    3.918765
                                                          |
                                            1.CaseUrgency |   2.425043   .4727564     4.54   0.000      1.65493    3.553523
                                                    _cons |   .0000109   4.82e-06   -25.72   0.000     4.54e-06    .0000259
                              ---------------------------------------------------------------------------------------------
                              Note: _cons estimates baseline odds.
                              
                              . est store ModelNonRE
                              
                              . 
                              . xtset Id 
                                     panel variable:  Id (unbalanced)
                              
                              . xtlogit Complication i.NightTime i.ASA i.Sex Age i.grpci i.CaseUrgency, re or 
                              
                              Fitting comparison model:
                              
                              Iteration 0:   log likelihood = -1712.3685  
                              Iteration 1:   log likelihood = -1642.3642  
                              Iteration 2:   log likelihood = -1578.7786  
                              Iteration 3:   log likelihood = -1564.5523  
                              Iteration 4:   log likelihood = -1564.4664  
                              Iteration 5:   log likelihood = -1564.4664  
                              
                              Fitting full model:
                              
                              tau =  0.0     log likelihood = -1564.4664
                              tau =  0.1     log likelihood = -1564.7726
                              
                              Iteration 0:   log likelihood = -1564.7726  
                              Iteration 1:   log likelihood = -1564.4655  
                              Iteration 2:   log likelihood = -1564.4651  (not concave)
                              Iteration 3:   log likelihood = -1564.4651  (not concave)
                              Iteration 4:   log likelihood =  -1564.465  
                              Iteration 5:   log likelihood =  -1564.465  
                              Iteration 6:   log likelihood = -1564.4646  
                              Iteration 7:   log likelihood = -1564.4646  
                              
                              Random-effects logistic regression              Number of obs     =    415,331
                              Group variable: Id                              Number of groups  =    347,144
                              
                              Random effects u_i ~ Gaussian                   Obs per group:
                                                                                            min =          1
                                                                                            avg =        1.2
                                                                                            max =        153
                              
                              Integration method: mvaghermite                 Integration pts.  =         12
                              
                                                                              Wald chi2(9)      =     264.90
                              Log likelihood  = -1564.4646                    Prob > chi2       =     0.0000
                              
                              ---------------------------------------------------------------------------------------------
                                             Complication | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
                              ----------------------------+----------------------------------------------------------------
                                              1.NightTime |   1.118822    .162268     0.77   0.439     .8419907    1.486671
                                                          |
                                                      ASA |
                                             ASA Score 3  |   1.315553   .2758664     1.31   0.191     .8721968    1.984275
                                             ASA Score 4  |   3.412183   .8052329     5.20   0.000     2.148618    5.418827
                                             ASA Score 5  |   9.915997   4.180516     5.44   0.000     4.339859    22.65672
                                                          |
                                                      Sex |
                                                  1. Man  |   .8237489   .1204135    -1.33   0.185     .6185409    1.097037
                                                      Age |   1.040273   .0060377     6.80   0.000     1.028506    1.052174
                                                          |
                                                    grpci |
                               1. Charlson Index Score 1  |   2.310142   .5465725     3.54   0.000     1.452938     3.67308
                              2. Charlson Index Score >1  |   2.580193   .5505217     4.44   0.000     1.698384     3.91984
                                                          |
                                            1.CaseUrgency |   2.426088   .4732588     4.54   0.000     1.655244    3.555913
                                                    _cons |   9.93e-06   5.60e-06   -20.42   0.000     3.29e-06      .00003
                              ----------------------------+----------------------------------------------------------------
                                                 /lnsig2u |  -1.738451   3.906298                     -9.394654    5.917753
                              ----------------------------+----------------------------------------------------------------
                                                  sigma_u |   .4192762   .8189089                      .0091196     19.2763
                                                      rho |   .0507241   .1880928                      .0000253    .9912239
                              ---------------------------------------------------------------------------------------------
                              Note: Estimates are transformed only in the first equation.
                              Note: _cons estimates baseline odds (conditional on zero random effects).
                              LR test of rho=0: chibar2(01) = 3.5e-03                Prob >= chibar2 = 0.476
                              
                              . est store ModelRE 
                              
                              . lrtest  ModelRE ModelNonRE, force 
                              
                              Likelihood-ratio test                                 LR chi2(1)  =      0.00
                              (Assumption: ModelNonRE nested in ModelRE)            Prob > chi2 =    0.9526
                              
                              . estimates table ModelRE ModelNonRE, stats(aic bic)
                              
                              ----------------------------------------
                                  Variable |  ModelRE     ModelNonRE  
                              -------------+--------------------------
                              Complication |
                                 NightTime |
                                        1  |  .11227651    .11223443  
                                           |
                                       ASA |
                              ASA Score 3  |  .27425678    .27430199  
                              ASA Score 4  |  1.2273522     1.226946  
                              ASA Score 5  |  2.2941493    2.2929499  
                                           |
                                       Sex |
                                   1. Man  | -.19388951   -.19380922  
                                           |
                                       Age |  .03948276    .03946957  
                                           |
                                     grpci |
                              1. Charls..  |  .83730909    .83719583  
                              2. Charls..  |  .94786403    .94765882  
                                           |
                               CaseUrgency |
                                        1  |   .8862801    .88584916  
                                           |
                                     _cons | -11.519956   -11.431126  
                              -------------+--------------------------
                                  /lnsig2u | -1.7384508               
                              -------------+--------------------------
                              Statistics   |                          
                                       aic |  3150.9292    3148.9327  
                                       bic |  3271.2343     3258.301  
                              ----------------------------------------
                              The estimates are very similar regardless of whether I use a random effect on Id or just a standard logistic regression.
                              But this requires me to use the "force" option with -lrtest-, since I use different estimators. I was also thinking of comparing AIC/BIC to support not including Id as a random effect, since it does not seem to improve model fit. Do the results above give evidence to simplify the procedure and use the following code instead:
                              Code:
                              . xtset Hospital 
                                     panel variable:  Hospital (unbalanced)
                              
                              . xtlogit Complication i.NightTime i.ASA i.Sex Age i.grpci i.CaseUrgency, re or
                              
                              Fitting comparison model:
                              
                              Iteration 0:   log likelihood = -1712.3685  
                              Iteration 1:   log likelihood = -1642.3642  
                              Iteration 2:   log likelihood = -1578.7786  
                              Iteration 3:   log likelihood = -1564.5523  
                              Iteration 4:   log likelihood = -1564.4664  
                              Iteration 5:   log likelihood = -1564.4664  
                              
                              Fitting full model:
                              
                              tau =  0.0     log likelihood = -1564.4664
                              tau =  0.1     log likelihood = -1559.1837
                              tau =  0.2     log likelihood = -1558.6451
                              tau =  0.3     log likelihood = -1559.9916
                              
                              Iteration 0:   log likelihood = -1558.6464  
                              Iteration 1:   log likelihood = -1558.2669  
                              Iteration 2:   log likelihood = -1558.2649  
                              Iteration 3:   log likelihood = -1558.2649  
                              
                              Random-effects logistic regression              Number of obs     =    415,331
                              Group variable: Hospital                        Number of groups  =         78
                              
                              Random effects u_i ~ Gaussian                   Obs per group:
                                                                                            min =          1
                                                                                            avg =    5,324.8
                                                                                            max =     21,118
                              
                              Integration method: mvaghermite                 Integration pts.  =         12
                              
                                                                              Wald chi2(9)      =     259.30
                              Log likelihood  = -1558.2649                    Prob > chi2       =     0.0000
                              
                              ---------------------------------------------------------------------------------------------
                                             Complication | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
                              ----------------------------+----------------------------------------------------------------
                                              1.NightTime |   1.124424   .1634261     0.81   0.420     .8456966    1.495015
                                                          |
                                                      ASA |
                                             ASA Score 3  |   1.322321   .2787068     1.33   0.185     .8748396     1.99869
                                             ASA Score 4  |   3.380232   .8060138     5.11   0.000     2.118254    5.394049
                                             ASA Score 5  |   9.514072   4.030539     5.32   0.000     4.147283    21.82575
                                                          |
                                                      Sex |
                                                  1. Man  |    .827965   .1210764    -1.29   0.197      .621638    1.102774
                                                      Age |   1.039377   .0060664     6.62   0.000     1.027555    1.051336
                                                          |
                                                    grpci |
                               1. Charlson Index Score 1  |   2.376717    .563484     3.65   0.000      1.49338     3.78255
                              2. Charlson Index Score >1  |   2.625058   .5615729     4.51   0.000      1.72601    3.992405
                                                          |
                                            1.CaseUrgency |   2.506501   .5006844     4.60   0.000     1.694481    3.707653
                                                    _cons |   .0000105   4.74e-06   -25.39   0.000     4.33e-06    .0000254
                              ----------------------------+----------------------------------------------------------------
                                                 /lnsig2u |  -1.500606   .4843402                     -2.449895   -.5513163
                              ----------------------------+----------------------------------------------------------------
                                                  sigma_u |   .4722235   .1143584                      .2937731    .7590724
                                                      rho |   .0634796    .028794                      .0255623    .1490383
                              ---------------------------------------------------------------------------------------------
                              Note: Estimates are transformed only in the first equation.
                              Note: _cons estimates baseline odds (conditional on zero random effects).
                              LR test of rho=0: chibar2(01) = 12.40                  Prob >= chibar2 = 0.000
                              Does this seem like a reasonable approach?

                              This does not allow me to use multiple imputation, since the likelihood varies between the imputed datasets, i.e. I do not get "one" likelihood to compare against. Thoughts on this?



