
  • Clogit regression - Missing standard errors on interactions between categorical variables

    Dear Stata-listers,

    I need some help with an issue that arises when using the clogit command to estimate a conditional choice model.
    The problem is that for some interactions between categorical variables, Stata reports only the coefficients, while the standard errors (and the CIs and p-values) are missing.

    I am using the clogit command in Stata 13, on a Windows system.

    I am trying to estimate a conditional choice model for the choice of hospital, where the key covariates of interest are a vector of hospital quality metrics.
    There are several hospital types, and the market shares of the different hospital types change dramatically across the years of my sample.
    This implies that, in order to avoid bias in my coefficient of interest on quality, I need to control for the time-varying market shares of the different hospital types.
    As such, my specification includes an interaction term between years and hospital types.

    The final specification is of this kind:
    utility_ij = alpha * year * provider_type_j + beta * year * quality_j + error_ij, where i is the individual and j is the hospital

    Now, some of the alpha coefficients are estimated, but their standard errors are not, as you can see from the output below. Here years are represented by the variable periods and the provider type by the variable provider_type5.

    clogit chosen ( ib0.periods )#( ib1.provider_type5 ) c.distance##c.distance c.readmission c.revision c.death , ///
    group(cips) difficult technique( nr 15 bfgs 15 bhhh 20 ) iterate(200)

    Conditional (fixed-effects) logistic regression   Number of obs   =   15325440
                                                      LR chi2(70)     = 2420129.18
                                                      Prob > chi2     =     0.0000
    Log likelihood = -527430.29                       Pseudo R2       =     0.6964

                                                         (Std. Err. adjusted for clustering on cips)
    ------------------------------------------------------------------------------------------------
                            chosen |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
    -------------------------------+----------------------------------------------------------------
            periods#provider_type5 |
                               0 2 |  -.6661329    .007592   -87.74   0.000    -.6810129   -.6512529
                               1 1 |    1.12236   .0080939   138.67   0.000     1.106497    1.138224
                               1 2 |   .4550258          .        .       .            .           .
                               2 1 |   1.413986   .0099321   142.36   0.000     1.394519    1.433453
                               2 2 |   .8656113          .        .       .            .           .
                               3 1 |   3.057734   .0099184   308.29   0.000     3.038295    3.077174
                               3 2 |   2.375108          .        .       .            .           .
                                   |
                          distance |  -.2274371   .0012455  -182.60   0.000    -.2298783   -.2249959
                                   |
             c.distance#c.distance |   .0005577   4.47e-06   124.66   0.000     .0005489    .0005664
                                   |
                       readmission |   .0582533   .0033604    17.34   0.000     .0516671    .0648395
                          revision |  -.1091014   .0083197   -13.11   0.000    -.1254078    -.092795
                             death |   -14.8262   1.785682    -8.30   0.000    -18.32608   -11.32633


    I should also mention that I checked:
    1) that all the categories in both categorical variables are adequately populated;
    2) that there does not seem to be any straightforward collinearity between these two variables.
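
    For readers who want to run checks of this kind, a minimal sketch follows. It assumes that chosen is coded 0/1 and that cips identifies each choice set; has_type2 and first_row are hypothetical helper variables introduced only for illustration.

    ```stata
    * 1) Cell counts among the chosen alternatives: look for empty or sparse cells
    tabulate periods provider_type5 if chosen == 1

    * 2) Flag choice sets (cips) that contain at least one type-2 hospital
    *    has_type2 and first_row are hypothetical names, not from the post
    bysort cips: egen byte has_type2 = max(provider_type5 == 2)
    egen byte first_row = tag(cips)

    * Count choice sets with and without a type-2 alternative, by period
    tabulate periods has_type2 if first_row
    ```

    Sparse cells in the first table, or many choice sets without a type-2 alternative in the second, would point towards limited within-group variation for the affected interaction terms.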

    Do you have any other idea why the estimation is failing to provide standard errors?

    Thanks a lot in advance,
    Giuseppe

  • #2
    Just a guess, but could it be because the likelihood function is flat (or nearly so) or discontinuous in the neighborhood of the maximum likelihood estimates? It'd be the same reason as why you felt the need to use the difficult technique( nr 15 bfgs 15 bhhh 20 ) iterate(200) options.

    The affected estimates are all interaction terms involving the second hospital type (provider_type5 == 2). There might not be enough data (variation between values of cips) involving that kind of hospital. (Is there a reason why you didn't include the main effects for the period × type-of-hospital interaction term?)
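
    To make the main-effects question concrete, here is a hedged sketch of the same model written with the full-factorial operator. It assumes that periods does not vary within a cips group, in which case clogit would be expected to omit the periods main effects for lack of within-group variation. The maximization options from the original command are left out for brevity.

    ```stata
    * Same covariates as the posted command, but ## adds the main effects
    * of periods and provider_type5 alongside their interaction
    clogit chosen ib0.periods##ib1.provider_type5 c.distance##c.distance ///
        c.readmission c.revision c.death, group(cips)
    ```

    Comparing which terms Stata flags as omitted here with the posted output may help show whether the missing standard errors reflect an identification problem rather than a purely numerical one.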

    You might want to consider centering and re-scaling your continuous variables, especially if you're going to include the quadratic polynomial term for distance (0.0006 ± 0.000004) alongside death (-15 ± 2). (I don't know what death measures as a continuous variable in a fixed-effects model with a regression coefficient of -15 on the log-odds scale, but I assume that you're going to be able to interpret results for it after centering and re-scaling its values just as well as you can now.)
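
    A minimal sketch of the centering and re-scaling, assuming the four continuous regressors are the ones in the posted command. The _z suffix is a hypothetical naming choice, and standardizing to mean 0 and SD 1 is only one of several reasonable scalings.

    ```stata
    * Center and rescale each continuous regressor to mean 0 and SD 1
    * (the _z suffix is a hypothetical naming choice, not from the thread)
    foreach v of varlist distance readmission revision death {
        quietly summarize `v'
        generate double `v'_z = (`v' - r(mean)) / r(sd)
    }

    * Refit with the standardized versions of the continuous regressors
    clogit chosen ib0.periods#ib1.provider_type5 c.distance_z##c.distance_z ///
        c.readmission_z c.revision_z c.death_z, group(cips)
    ```

    After this transformation the coefficients on the continuous regressors are expressed per standard deviation rather than per original unit.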

    I'm trying to remember how often I've encountered log-likelihood values of -500 000.



    • #3
      Hi Joseph,

      Thanks a lot for your reply.
      Indeed, I reported only part of the regression coefficients in the pasted output, so as not to confuse readers and to avoid taking up too much space (the regression output is quite long: as you guessed, it also includes the year*quality interactions).

      On your suggestion: out of frustration, last night I had the same idea of re-centering the variables, and it works!
      Not only that, but the model now converges in just 5 iterations with the NR algorithm, whereas before it took 99 iterations and switched between algorithms in order to converge.

      I am still a bit puzzled about why it did not work with the time*hospital_type interactions.
      I am not convinced by the flat-likelihood hypothesis, but I think the one about discontinuity makes a lot of sense.
      Indeed, there are substantial changes in quality and market shares by hospital type, and this might have hindered the estimation of the categorical interactions.
      Also, the second hospital type is the second largest category, with 221k individuals out of a sample of 526k patients.
      It might be that variation between cips in this category is low, but I would be a bit surprised.
      I assume that the clogit model works better with continuous regressors than with lots of categorical ones, and this seems to be the case, since re-centering finally let the model estimate fine.

      Thanks again for your reply and your very valuable comments.

      Best wishes,
      Giuseppe
