
  • Ordinal or negative binomial regression

    I have an outcome variable that represents the number of aggravating clinical occurrences. For example, if someone is obese, with high triglycerides and high blood pressure, then the variable takes the value 3. I first thought to analyze this variable as a count variable using Poisson regression, but I found out that it does not follow the Poisson distribution. The histogram of this variable looks very much like a normal distribution; however, with only seven possible values (0-6) it is not appropriate to use linear regression. Then I thought that each level of the variable is worse than the previous one (higher burden to health), so ordinal regression may be suitable. What would be the best regression to use with this variable as the dependent (outcome) variable?

  • #2
    Ionas:
    I would consider -ologit-.
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      I think you don't want to treat this as ordinal. For one, you'd be assuming that any two people with three conditions each, even three different conditions, are at the same level of health. If you don't know the disaggregated categories then I suppose this is inevitable. Even more of a problem: is it clear that someone with conditions x1, x2, and x3 is in worse health than someone with x4 and x5? It depends on what these categories are. Collapsing the outcomes into a count has the potential to lose some important information.

      If you want to treat it as a count, you should recognize the upper bound. The problem with Poisson regression is NOT that the distribution does not look Poisson, because Poisson regression is completely robust provided the mean is correctly specified. The problem is the exponential mean function, which is not bounded above, and therefore is not the best choice for your problem.

      My recommendation is to use binomial regression. You can find a discussion in Chapter 18 of my MIT Press book. It enforces an upper bound. Also, like Poisson regression, this is a fully robust quasi-MLE. Only the mean needs to be correctly specified. That's why you should use robust standard errors.
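
      To make the contrast between the two mean functions concrete, here is a sketch in notation that is not in the post itself (x is the row vector of covariates, beta the coefficient vector, and the upper bound is taken to be 6 as in the original question):

      ```latex
      \[
      \text{Poisson: } \mathrm{E}(y \mid \mathbf{x}) = \exp(\mathbf{x}\boldsymbol{\beta}) \in (0, \infty)
      \]
      \[
      \text{Binomial, 6 trials, logit link: } \mathrm{E}(y \mid \mathbf{x})
        = 6 \cdot \frac{\exp(\mathbf{x}\boldsymbol{\beta})}{1 + \exp(\mathbf{x}\boldsymbol{\beta})} \in (0, 6)
      \]
      ```

      The second mean can never exceed 6, whatever the covariate values, while the first can. In both cases only this conditional mean has to be right for the quasi-MLE to be consistent.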

      BTW, I think you mean to say the histogram shows symmetry; it cannot show normality because the variable clearly is not normally distributed.

      Stata commands, where x2 is assumed discrete and x1 is continuous:

      Code:
      glm y x1 i.x2 ... xk, fam(bin 6) link(logit) vce(robust)
      margins, dydx(*)
      JW
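
      For anyone who wants to try the commands before running them on real data, the following is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the sample size, the seed, the coefficient values, the three-level coding of x2, and the variable name muhat are all hypothetical and do not come from the original question.

      ```stata
      * Simulated data only: all numbers below are made up for illustration
      clear
      set seed 12345
      set obs 500
      generate x1 = rnormal()              // continuous covariate
      generate x2 = runiformint(0, 2)      // discrete covariate with levels 0, 1, 2

      * Outcome bounded between 0 and 6: binomial with 6 trials
      generate y = rbinomial(6, invlogit(-0.5 + 0.4*x1 + 0.3*(x2 == 1) - 0.2*(x2 == 2)))

      * Binomial quasi-MLE with robust standard errors, as suggested above
      glm y x1 i.x2, family(binomial 6) link(logit) vce(robust)

      * Fitted means are expected counts; they should all lie between 0 and 6
      predict muhat, mu
      summarize muhat

      * Average partial effects on the expected count
      margins, dydx(*)
      ```

      The summarize step is there only to confirm that the fitted means respect the upper bound, which is the reason for preferring this model to Poisson regression here.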
