
  • My ordinal regression results in very large odds ratios

    Hey there,
    I'm working with data from my patients (n=150).
    I have almost 30 variables and an ordinal outcome with 4 categories (liver steatosis). I also have 2 other similar outcomes.

    I don't know why, when I run an ordinal regression with backward elimination, I get ORs as large as 8000!

    Does anybody know what I am doing wrong?

    Here is a sample of my data:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte(Gender Married Employment University_degree Age_group BMI_cat) double(ALK_p INR) byte(Emotionaleater Sono_Grade Bx_Steatohepatitis)
    1 2 1 1 3 5   175 1.01 2 2 1
    1 . . . 1 6   217    1 . 3 0
    1 . . . 1 6     .    . . 0 0
    1 2 1 1 1 6     .    1 1 1 0
    1 . . . 2 6   181    1 . 2 1
    1 2 1 1 2 6   197  1.2 2 2 2
    1 1 1 2 1 6    92    1 1 0 1
    1 2 1 1 3 6   210    1 2 2 1
    2 2 2 1 2 4   203    . 1 0 1
    1 . . . 3 5   144    . . 2 0
    1 . . . 2 6   170    . . 2 0
    1 . . . 2 4   141    1 . 1 0
    2 1 2 1 1 6   209  1.1 1 1 1
    1 2 1 1 3 6   124    1 2 0 1
    2 1 2 1 1 6   202    1 1 3 1
    1 2 1 1 3 6   171    1 2 2 2
    1 2 1 1 2 6   217    1 2 1 0
    1 1 1 1 1 6   176    1 2 1 0
    1 . . . 2 6    77  1.1 . 2 0
    2 . . . 3 6   178  1.3 . 2 1
    1 2 2 1 2 5   150    1 2 3 1
    1 1 2 2 2 4    17    1 2 1 1
    1 . . . 1 6   163    1 . 3 1
    1 . . . 3 6   158  1.2 . 3 .
    1 . . . 2 5   261    1 . 1 0
    1 2 1 1 1 4   142 1.24 2 2 1
    2 . . . 1 6     . 1.13 . 2 0
    1 2 1 1 1 6   188    . 2 0 0
    1 2 1 1 3 6   151    1 2 1 1
    1 . . . 2 4   274 1.31 . 0 1
    1 . . . 1 5   219    . . 0 .
    1 2 1 1 2 6   144    1 2 0 1
    1 . . . 1 6   224    1 . 2 0
    2 1 2 1 2 6   194  1.1 1 2 1
    1 . . . 4 5   177    . . 1 .
    1 . . . 1 6   156 1.05 . 0 0
    2 1 2 1 1 6   266 1.23 1 3 1
    1 1 1 1 1 6     .    . 2 0 .
    1 2 1 1 3 6   211    1 2 3 1
    1 2 2 2 4 5   129   .1 2 2 1
    2 1 2 1 2 5   128    1 2 3 0
    2 . . . 1 5   137    . . 2 1
    1 . . . 2 6   146  1.5 . 1 .
    1 2 1 1 1 6   228    1 2 2 1
    2 2 2 1 4 6   305    1 1 2 0
    2 . . . 2 6   252  1.4 . 3 .
    1 2 1 1 2 5   122    1 1 2 1
    2 2 2 1 2 6     . 1.08 2 3 .
    2 2 2 1 2 6     .    1 1 2 1
    2 2 2 1 3 5   171    1 2 0 1
    2 2 2 1 3 6   133    . 1 3 .
    1 2 1 1 4 5   181    1 1 3 1
    1 2 1 1 3 4     .    . 2 3 .
    1 2 1 1 4 6     .    . 1 3 .
    2 2 2 1 2 4     .  1.1 2 2 0
    2 2 2 1 2 6   245    1 2 3 1
    1 1 2 1 1 4   121    1 2 2 1
    2 2 2 1 3 6     . 1.11 1 3 1
    1 2 1 1 3 5    54  .86 2 2 1
    1 2 2 2 3 5   246    . 1 2 .
    1 1 2 1 1 6   134    1 2 1 .
    2 2 2 1 1 6    93    1 2 3 1
    1 1 2 1 1 5   171    1 2 2 1
    1 2 1 2 2 5    66    . 2 2 .
    2 2 2 2 2 5   106    1 1 2 .
    2 1 2 1 2 5   245  1.1 2 3 .
    2 1 2 1 2 6   179    1 1 3 .
    2 1 2 1 2 5     .    1 1 2 .
    1 1 2 1 1 5   124    1 2 1 1
    1 1 1 1 1 5   150    . 1 1 1
    1 1 1 1 4 6   250    1 1 2 .
    1 2 1 1 4 6   190    1 1 2 2
    1 1 1 1 1 6   246    1 2 3 .
    1 2 1 1 4 6     .    1 1 1 0
    1 1 1 1 2 6   136    . 2 2 .
    2 2 2 1 2 5    83    1 2 2 .
    2 2 2 1 2 6   177 1.17 2 0 1
    1 2 1 1 3 5   194    1 2 2 1
    1 2 2 1 2 6     .    1 2 3 1
    1 2 2 2 2 6   185    1 1 1 0
    1 2 2 1 1 5   176    1 2 2 0
    1 2 1 1 2 6   240    1 2 2 1
    1 2 1 1 1 6   127    1 2 1 1
    2 2 2 1 3 6   181    1 1 3 0
    1 2 1 1 1 5 252.5    1 2 2 0
    2 1 2 1 2 6   142  1.1 2 1 .
    1 1 1 2 1 5   197    1 1 0 .
    2 1 2 1 2 6   254    1 1 2 0
    1 1 1 1 4 6   179  1.1 2 1 .
    1 2 1 1 3 6   249    1 2 2 .
    1 2 2 1 2 6   131    1 2 2 .
    1 2 1 1 4 6   316  1.1 2 0 .
    1 2 1 1 4 6   179    1 2 2 2
    1 2 1 1 4 5   231    1 2 2 1
    1 1 2 1 2 6   154    1 2 0 .
    1 2 1 2 2 6    18    . 1 1 1
    1 2 2 2 2 6   137    1 1 1 .
    2 2 2 1 3 6   189    1 2 2 .
    1 1 1 1 2 6   194    1 2 2 .
    1 1 2 1 2 6   176    1 1 1 .
    end
    label values Gender labels0
    label def labels0 1 "Female", modify
    label def labels0 2 "Male", modify
    label values Married labels1
    label values University_degree labels1
    label def labels1 1 "No", modify
    label def labels1 2 "Yes", modify
    label values Employment labels2
    label def labels2 1 "No", modify
    label def labels2 2 "Yes", modify
    label values Age_group labels3
    label def labels3 1 "19-30", modify
    label def labels3 2 "31-40", modify
    label def labels3 3 "41-50", modify
    label def labels3 4 ">50", modify
    label values BMI_cat labels4
    label def labels4 4 "Obesity class I", modify
    label def labels4 5 "Obesity class II", modify
    label def labels4 6 "Obesity class III", modify
    label values Emotionaleater labels14
    label def labels14 1 "No", modify
    label def labels14 2 "Yes", modify
    label values Bx_Steatohepatitis labels19
    label def labels19 0 "Minimal", modify
    label def labels19 1 "Mild", modify
    label def labels19 2 "Moderate", modify

    This is my code:
    Code:
    stepwise, pr(0.3) : ologit Bx_Steatohepatitis i.Age_group i.BMI_cat Gender Married Employment University_degree Smoke Alcohol T2DM Hypo_Thyroid HLP HTN CVD Sweeteater Volumeeater Emotionaleater Snackernibbling Cancer_F HTN_F T2DM_F i.Sono_Grade SGOT SGPT ALK_p, or


    And this is my result:
    Bx_Steatohepatitis   Odds ratio   Std. err.       z    P>z   [95% conf. interval]
    Age_group
      31-40                3.997381    4.217089    1.31  0.189    .5055795    31.60542
      >50                  50.54159    87.94102    2.25  0.024    1.669479     1530.09
    2.Sono_Grade            2.85757    2.601264    1.15  0.249    .4798913    17.01574
    CVD                    71.35374    172.0147    1.77  0.077    .6329919     8043.32
    ALK_p                  .9761676    .0107919   -2.18  0.029    .9552434      .99755
    Gender                 16.21916    29.71174    1.52  0.128    .4474235    587.9463
    Married                .0912319    .1329141   -1.64  0.100    .0052486    1.585796
    Employment             .0104316     .019311   -2.46  0.014    .0002771    .3927363
    University_degree      .0146786    .0255267   -2.43  0.015    .0004858    .4435643
    Smoke                  .0063704    .0115658   -2.78  0.005    .0001814    .2236534
    SGPT                   1.036996    .0231966    1.62  0.104    .9925136    1.083472
    T2DM                    .134202    .2045136   -1.32  0.188    .0067701    2.660268
    HTN_F                  16.57867    19.52767    2.38  0.017    1.647922    166.7871
    HLP                     .039629    .0843666   -1.52  0.129    .0006108    2.571254
    HTN                     30.4704     40.1952    2.59  0.010    2.296207    404.3388

  • #2
    First, I recommend you read https://www.stata.com/support/faqs/s...sion-problems/ for a compelling exposition of why stepwise regression is a bogus procedure that should not be used.

    That said, even in a single regression, you are taking a modest sample of 150 and chopping it up into a large number of groups: 4 outcome levels * 4 age groups * 3 (or more?) BMI categories * 2 sexes * ... Even in the optimistic scenario where everything spreads into these bins as evenly as possible, the bins are going to be very small. In that situation, logistic regression coefficients are strongly biased away from zero: both inflated positive coefficients and large negative ones are seen, which on the odds-ratio scale means values far above 1 or very close to 0. You are simply asking too much of the available data.
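
    The bin arithmetic above, plus a quick look at how thin the data really are, can be sketched as follows. This is only an illustration: it uses variables from the -dataex- excerpt in #1, `nmiss` is a hypothetical helper name, and the `//` and `///` comment styles assume the lines are run from a do-file. Estimation commands drop any observation with a missing value on any model variable, so the usable sample may be well below 150.

    ```stata
    * Cells implied by just four of the variables, and the best-case cell size
    display 4*4*3*2          // 96 cells
    display 150/(4*4*3*2)    // about 1.56 observations per cell

    * Complete cases on a subset of the model variables (nmiss is a hypothetical name)
    egen nmiss = rowmiss(Bx_Steatohepatitis Age_group BMI_cat Gender Married ///
        Employment University_degree Sono_Grade ALK_p)
    count if nmiss == 0

    * Outcome distribution and one cross-tabulation among complete cases
    tabulate Bx_Steatohepatitis if nmiss == 0
    tabulate Age_group Bx_Steatohepatitis if nmiss == 0
    ```

    Empty or near-empty cells in the cross-tabulation are the kind of sparseness that tends to produce extreme odds ratios with very wide confidence intervals.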

    For more reasonable results, use a simpler model with fewer predictors. Choose them thoughtfully based on prior research and reasonable causal models of the data generating process, rather than trying to throw in "the kitchen sink." Or, get a much larger data set that can accommodate fine-grained subdivision.
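
    As a rough sketch of what a smaller, pre-specified model might look like in Stata: the three predictors below are placeholders taken from the variable list in #1, not a recommendation, and the real choice should come from subject-matter knowledge.

    ```stata
    * Pre-specified model with no stepwise selection; predictors are placeholders
    ologit Bx_Steatohepatitis i.Age_group i.Gender c.ALK_p, or

    * Number of observations actually used in the fit
    display e(N)
    ```

    Comparing `e(N)` with the nominal n=150 shows how much of the sample survives the missing values in the chosen predictors.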
