  • Polynomial expressions in logistic regression

    Hi, in my dataset, the proportion of individuals who have undergone cataract surgery rises steeply after the age of 80, so I thought I'd see whether adding age² improved the model fit.

    I was taught that with OLS regression you'd include age*age in the model (along with age) and see if the F statistic decreased. When I try this with logistic regression, the LR chi-square doesn't change. The same happens when I include log(age), except that the z statistics become non-significant.

    Am I going about this the right way, or have I lost the plot? Also, does it make any difference? When I present my results, I'll say that "the proportion of individuals who had previous cataract increased steeply after the age of 80." Does the type of fit add anything to this?

    I am constantly in awe of the statistical knowledge dispensed on these pages and, as ever, I am grateful for any advice you can give. Thanks, Ali

    Ali Poostchi
    Ophthalmology Registrar
    Queens Medical Centre, Nottingham

  • #2
    Fitting a logistic with a quadratic predictor can certainly make sense. In ecology that is a standard model known, some say misleadingly, as Gaussian logit. The name arises because the entire curve, pushed through the logit link, is a bell-shape, although possibly only some of the bell is visible within the range of the data.
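
    To see where the name comes from, here is a sketch in notation that is mine rather than the thread's: x is the predictor, p(x) the probability of the outcome, and the square term has a negative coefficient. Completing the square shows that the odds are proportional to a Gaussian curve in x, peaking at u.

```latex
\operatorname{logit} p(x) = \log\frac{p(x)}{1 - p(x)}
  = \beta_0 + \beta_1 x + \beta_2 x^2, \qquad \beta_2 < 0

% Complete the square with u = -\beta_1 / (2\beta_2) and t^2 = -1 / (2\beta_2):
\frac{p(x)}{1 - p(x)}
  = \exp\!\left(\beta_0 - \frac{\beta_1^2}{4\beta_2}\right)
    \exp\!\left(-\frac{(x - u)^2}{2 t^2}\right)
```

    If u lies outside the observed range of x, only one flank of the bell is visible and the fitted curve looks monotonic within the data.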

    You don't give any numbers here, but you describe no turning point, and one would guess confidently, if perhaps rashly, that the relationship really is monotonic. That doesn't rule out a square term adding a little extra power, but the linear and square terms would be fighting each other for market share.

    If this were my problem I would recast age as (age - 75), say. There probably won't be enormous numerical problems if you don't, but intercepts etc. will be a little easier to interpret.
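
    As a rough illustration of what the recasting does, here is a minimal sketch using hypothetical ages, one per year from 60 to 90 (not the poster's data). After centring, the intercept is the log odds at age 75, and as a side effect the linear and square terms are far less correlated with each other. The helper `pearson` is written out for the example only.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical ages, one per year from 60 to 90; an assumption for illustration.
ages = list(range(60, 91))

# Correlation between the linear and square terms, uncentred.
raw_corr = pearson(ages, [a ** 2 for a in ages])

# The same after recasting age as (age - 75).
centred = [a - 75 for a in ages]
centred_corr = pearson(centred, [c ** 2 for c in centred])

print(f"corr(age, age^2)             = {raw_corr:.4f}")
print(f"corr(age - 75, (age - 75)^2) = {centred_corr:.4f}")
```

    The centred correlation vanishes here only because the made-up ages are symmetric about 75; with real data it would be reduced rather than removed.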

    I can't see an obvious case for using log(age). age and log(age) will be close to perfectly correlated over a narrow range of age.
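
    The point about a narrow range is easy to check numerically. This is a small sketch on hypothetical ages from 60 to 90, chosen only for illustration; the correlation in the real data will depend on the actual age range.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical ages, one per year from 60 to 90; an assumption for illustration.
ages = list(range(60, 91))
log_ages = [math.log(a) for a in ages]

r = pearson(ages, log_ages)
print(f"corr(age, log(age)) over 60-90 = {r:.4f}")
```

    With the two predictors this close to collinear, swapping one for the other can hardly change the fit, and entering both leaves their individual coefficients poorly determined.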

    I recommend plotting residuals from a model in age alone versus age to see if there is any structure there. Some would commend splines, but here you seem very short on data points. Do you have patients' ages in years? I wouldn't aggregate to 5-year groups if you do.
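
    A minimal sketch of that residual check follows, in plain Python and on simulated data. Everything about the data is an assumption made for illustration: 40 hypothetical patients at each single year of age from 60 to 90, with outcomes drawn from a curve that has a small square term. The function `fit_logit_age` is a hand-rolled Newton-Raphson fit written for this example, not a library routine. The residual at each age is the observed proportion minus the probability fitted by the model in age alone.

```python
import math
import random

def inv_logit(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit_age(xs, ys, iters=25):
    """Newton-Raphson fit of logit P(y = 1) = b0 + b1 * x; returns (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(iters):
        ps = [inv_logit(b0 + b1 * x) for x in xs]
        # Score vector: gradient of the log-likelihood.
        g0 = sum(y - p for y, p in zip(ys, ps))
        g1 = sum(x * (y - p) for x, y, p in zip(xs, ys, ps))
        # Information matrix (2 x 2), built from weights p * (1 - p).
        ws = [p * (1 - p) for p in ps]
        h00 = sum(ws)
        h01 = sum(w * x for w, x in zip(ws, xs))
        h11 = sum(w * x * x for w, x in zip(ws, xs))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system explicitly.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Simulated data, NOT the poster's: the true curve includes a square term.
random.seed(1)
ages = [a for a in range(60, 91) for _ in range(40)]
xs = [a - 75 for a in ages]  # age recast as (age - 75)
ys = [1 if random.random() < inv_logit(-3 + 0.12 * x + 0.006 * x * x) else 0
      for x in xs]

b0, b1 = fit_logit_age(xs, ys)

# Residual per single year of age: observed proportion minus fitted probability.
resid_by_age = {}
for age in range(60, 91):
    obs = [y for a, y in zip(ages, ys) if a == age]
    resid_by_age[age] = sum(obs) / len(obs) - inv_logit(b0 + b1 * (age - 75))

for age, res in resid_by_age.items():
    print(age, f"{res:+.3f}")
```

    Residuals that sit on one side of zero at both ends of the age range and on the other side in the middle would suggest curvature that the model in age alone is missing; a patternless scatter would suggest that the square term has little to add.
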
    Last edited by Nick Cox; 07 Jan 2016, 10:31.
