
  • ologit in favour of parsimoniousness despite violated parallel lines?

    Dear Statalists,

    I could use your input on the following 😊

    My case: I am testing the influence of a factor variable (4 different countries) on an ordinal outcome variable (text complexity, scale 1-6). Since the parallel lines assumption is violated for some categories of the factor variable, I use gologit2 to fit a partial proportional odds (PPO) model. This gives me the odds ratios for the predictor's influence at each cutoff point, which is fine. However, it is rather detailed for the hypothesis I am testing, which suggests that text complexity increases with the country (category 1 in the factor variable should be lowest, 4 highest). What I can say based on the PPO model is that the effect varies at each cutoff point (which makes sense). A proportional odds (PO) model fitted with ologit, in turn, shows an overarching block pattern (two low countries vs. two higher countries) but not the increasing trend as hypothesized.
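
    For readers less familiar with the two models, the difference can be written out as follows. This is a sketch in the parameterization that gologit2 uses, with the reference category 1 omitted and the constraint pattern taken from the autofit output further down (parallel lines imposed for 4.csystem only); the symbols are my own notation.

    ```latex
    % Proportional odds (ologit): one coefficient per country, shared by all cutoffs
    \operatorname{logit} P(Y > j \mid c)
      = \alpha_j + \beta_2\,\mathbf{1}[c=2] + \beta_3\,\mathbf{1}[c=3] + \beta_4\,\mathbf{1}[c=4],
      \qquad j = 1,\dots,5

    % Partial proportional odds (gologit2, autofit): countries 2 and 3 get
    % cutoff-specific coefficients, country 4 keeps a single coefficient
    \operatorname{logit} P(Y > j \mid c)
      = \alpha_j + \beta_{2j}\,\mathbf{1}[c=2] + \beta_{3j}\,\mathbf{1}[c=3] + \beta_4\,\mathbf{1}[c=4],
      \qquad j = 1,\dots,5
    ```

    Note that ologit reports cutpoints with the opposite sign convention, so /cut1 to /cut5 correspond to $-\alpha_j$ rather than $\alpha_j$.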

    My question: I have read that I could still use ologit in favour of parsimony (justified by BIC, which is indeed lower for ologit than for gologit2), but I'm not sure whether ignoring the violated parallel lines assumption is a good way to go. Do you have experience with whether it is "okay" or common to do this? Or maybe other ideas to make the interpretation less detailed? I was thinking of collapsing the scale values, so that I have fewer cutoff points ...
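
    For reference, the BIC comparison and the idea of collapsing the scale could be set up roughly as follows. This is only a sketch: it assumes gologit2 is installed (ssc install gologit2), that icomplexity is coded 1 to 6, and the names po, ppo and icomplex3 as well as the three-level grouping are hypothetical choices for illustration.

    ```stata
    * Fit both models and store the results
    ologit icomplexity i.csystem
    estimates store po

    gologit2 icomplexity i.csystem, autofit lrforce
    estimates store ppo

    * Side-by-side table of log likelihood, df, AIC and BIC
    estimates stats po ppo

    * Hypothetical coarsening of the 1-6 scale into three levels;
    * this grouping is an illustration, not a recommendation
    recode icomplexity (1 2 = 1) (3 4 = 2) (5 6 = 3), generate(icomplex3)
    gologit2 icomplex3 i.csystem, autofit lrforce or
    ```

    Collapsing reduces the number of cutoff points from five to two, but it does not by itself make the parallel lines assumption hold, so the autofit step is repeated on the coarsened variable.
    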

    I'm looking forward to your opinions on this and am trying to copy the gologit2 and ologit models below (first time, so I hope this works).

    Thanks,
    Julia

    Code:
    ologit icomplexity i.csystem, or
    
    Iteration 0:   log likelihood = -5620.6987  
    Iteration 1:   log likelihood = -5537.1818  
    Iteration 2:   log likelihood = -5537.0217  
    Iteration 3:   log likelihood = -5537.0217  
    
    Ordered logistic regression                     Number of obs     =      4,563
                                                    LR chi2(3)        =     167.35
                                                    Prob > chi2       =     0.0000
    Log likelihood = -5537.0217                     Pseudo R2         =     0.0149
    
    ------------------------------------------------------------------------------
     icomplexity | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
         csystem |
              2  |   .9913936    .080441    -0.11   0.915     .8456296    1.162283
              3  |   1.925266   .1530373     8.24   0.000     1.647517    2.249841
              4  |   2.157683   .1677328     9.89   0.000     1.852752    2.512799
    -------------+----------------------------------------------------------------
           /cut1 |   .3362666   .0567362                      .2250656    .4474675
           /cut2 |   1.378303   .0602158                      1.260282    1.496323
           /cut3 |   3.011779   .0782728                      2.858367     3.16519
           /cut4 |   4.851709   .1477801                      4.562065    5.141352
           /cut5 |   5.997357    .248598                      5.510114      6.4846
    ------------------------------------------------------------------------------
    Note: Estimates are transformed only in the first equation.
    Code:
    gologit2 icomplexity i.csystem, autofit lrforce or
    
    ------------------------------------------------------------------------------
    Testing parallel lines assumption using the .05 level of significance...
    
    Step  1:  Constraints for parallel lines imposed for 4.csystem (P Value = 0.8842)
    Step  2:  Constraints for parallel lines are not imposed for
              2.csystem (P Value = 0.00000)
              3.csystem (P Value = 0.00000)
    
    Wald test of parallel lines assumption for the final model:
    
     ( 1)  [1]4.csystem - [2]4.csystem = 0
     ( 2)  [1]4.csystem - [3]4.csystem = 0
     ( 3)  [1]4.csystem - [4]4.csystem = 0
     ( 4)  [1]4.csystem - [5]4.csystem = 0
    
               chi2(  4) =    1.16
             Prob > chi2 =    0.8842
    
    An insignificant test statistic indicates that the final model
    does not violate the proportional odds/ parallel lines assumption
    
    If you re-estimate this exact same model with gologit2, instead
    of autofit you can save time by using the parameter
    
    pl(1b.csystem 4.csystem)
    
    ------------------------------------------------------------------------------
    
    Generalized Ordered Logit Estimates             Number of obs     =      4,563
                                                    LR chi2(11)       =     232.51
                                                    Prob > chi2       =     0.0000
    Log likelihood =  -5504.443                     Pseudo R2         =     0.0207
    
     ( 1)  [1]4.csystem - [2]4.csystem = 0
     ( 2)  [2]4.csystem - [3]4.csystem = 0
     ( 3)  [3]4.csystem - [4]4.csystem = 0
     ( 4)  [4]4.csystem - [5]4.csystem = 0
    ------------------------------------------------------------------------------
     icomplexity | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    1            |
         csystem |
              2  |   .8776573    .072918    -1.57   0.116     .7457701    1.032868
              3  |    1.78562     .15009     6.90   0.000     1.514403     2.10541
              4  |   2.270685   .1801171    10.34   0.000     1.943736    2.652629
                 |
           _cons |    .745114   .0430006    -5.10   0.000     .6654261     .834345
    -------------+----------------------------------------------------------------
    2            |
         csystem |
              1  |          1   4.40e-17    -4.16   0.000            1           1
              2  |   1.289039   .1231981     2.66   0.008     1.068842    1.554599
              3  |   2.089253   .1922505     8.01   0.000     1.744474    2.502174
              4  |   2.270685   .1801171    10.34   0.000     1.943736    2.652629
                 |
           _cons |   .2302401     .01509   -22.41   0.000      .202485    .2617996
    -------------+----------------------------------------------------------------
    3            |
         csystem |
              1  |          1   1.20e-17    -8.01   0.000            1           1
              2  |    2.26293   .3548734     5.21   0.000     1.664123    3.077207
              3  |    3.49951   .5104081     8.59   0.000      2.62941    4.657534
              4  |   2.270685   .1801171    10.34   0.000     1.943736    2.652629
                 |
           _cons |   .0332747   .0035651   -31.76   0.000     .0269722      .04105
    -------------+----------------------------------------------------------------
    4            |
         csystem |
              2  |   3.739563   1.489331     3.31   0.001     1.713241    8.162503
              3  |   7.827675   2.759696     5.84   0.000      3.92226    15.62173
              4  |   2.270685   .1801171    10.34   0.000     1.943736    2.652629
                 |
           _cons |   .0032357   .0009507   -19.51   0.000     .0018192    .0057552
    -------------+----------------------------------------------------------------
    5            |
         csystem |
              2  |    5.32339   3.901655     2.28   0.023     1.265668    22.39014
              3  |   10.30695   6.901483     3.48   0.000     2.774406    38.29045
              4  |   2.270685   .1801171    10.34   0.000     1.943736    2.652629
                 |
           _cons |   .0008055   .0004672   -12.28   0.000     .0002585    .0025103
    ------------------------------------------------------------------------------
    Note: _cons estimates baseline odds.

  • #2
    I have argued before that one should not take those tests too seriously when doing model selection. They are a starting point, but nothing more. The solution is not to replace one one-number summary (the p-value) with another one-number summary (BIC, AIC, etc.). Instead, you should always look at the models themselves and judge whether they are substantively different. A quick glance at your two models makes me conclude that they are different, and your description seems to concur, so unfortunately you will have to live with the gologit2 model.
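
    One way to look at the models side by side is to compare predicted probabilities per country with the observed distribution. The following is a minimal sketch, not a definitive recipe: it assumes six observed outcome categories, and it assumes an installed gologit2 version that supports margins and the outcome() predict option. The constraint option pl(1b.csystem 4.csystem) is copied from the autofit output in the first post.

    ```stata
    * Observed share of each complexity level within each country
    tabulate csystem icomplexity, row nofreq

    * Predicted probabilities under the proportional odds model
    quietly ologit icomplexity i.csystem
    forvalues k = 1/6 {
        margins csystem, predict(outcome(#`k'))
    }

    * Predicted probabilities under the partial proportional odds model
    quietly gologit2 icomplexity i.csystem, pl(1b.csystem 4.csystem)
    forvalues k = 1/6 {
        margins csystem, predict(outcome(#`k'))
    }
    ```

    If the ologit probabilities depart noticeably from the observed shares while the gologit2 probabilities track them, that is the kind of substantive difference meant above.
    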
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------


    • #3
      Thanks, Maarten, that's very helpful! As a mostly 'qualitative' researcher, I'm very grateful for the statistical guidance you guys provide :-)
