
  • xtreg versus xtlogit dropped observations


    For the code below, xtlogit drops 132 observations because the outcome does not vary within patient for those observations. With xtreg, however, nothing is dropped. Can anyone explain why those 132 observations are not also dropped by xtreg?

    PS: This code uses one of Stata's built-in datasets, so it should be easy to replicate.

    sysuse bplong, clear
    gen great_median = bp > 150
    xtset patient
    xtlogit great_median i.when i.sex, fe // drops 132 observations
    xtreg great_median i.when i.sex, fe // nothing drops

  • #2
    In a logistic regression like this, when the dependent variable is always 0 within a patient, or always 1 within a patient, the maximum likelihood estimate of the coefficient of that patient's fixed effect is infinite in magnitude. The fixed effect for that patient becomes a perfect predictor of the outcome. (Perfect prediction is also known as complete separation.)

    In a simple linear regression, perfect prediction does not "blow up" the estimation process, so there is no need to drop anything. In this situation, the estimated fixed effect for that patient is simply a finite constant (the patient's mean outcome net of the covariate contribution), which may be uninteresting, but is not an obstacle to estimation.

    Note that different considerations apply when an independent variable is constant within person. In that case, it is removed in either logistic or simple linear regression because of collinearity issues (which is what happens with the sex variable in your example).
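
    As a rough check of the point above, the patients whose outcome never varies can be counted directly. This is only a sketch: the variable names ymin, ymax, flat and first are made up for illustration and are not part of the original code.

    ```stata
    * Sketch: flag patients whose binary outcome is constant (all 0 or all 1)
    sysuse bplong, clear
    gen great_median = bp > 150

    * Within-patient minimum and maximum of the outcome (hypothetical names)
    egen byte ymin = min(great_median), by(patient)
    egen byte ymax = max(great_median), by(patient)
    gen byte flat = ymin == ymax

    * Tag one observation per patient so patients can be counted
    egen byte first = tag(patient)

    count if flat            // observations belonging to constant-outcome patients
    count if flat & first    // number of such patients
    ```

    These two counts should correspond to the observations and groups that xtlogit, fe reports as dropped, while xtreg, fe keeps them in the estimation sample.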



    • #3
      Student (please see FAQ #6 on the preferred registration requirements on this forum. Thanks):
      the only way for your -xtreg- model to keep -i.sex- is to change its specification from -fe- to -re-, as the latter also returns coefficient estimates for time-invariant predictors:
      Code:
      xtreg great_median i.when i.sex, re
      On how to compare -fe- and -re- specifications, you may want to take a look at -help hausman-.
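
      A minimal sketch of that comparison, assuming the same data and model as in the original post. The stored-estimate names fe and re are arbitrary labels chosen here, and the test may print warnings because -i.sex- is omitted from the -fe- fit:

      ```stata
      * Sketch: Hausman comparison of the -fe- and -re- specifications
      sysuse bplong, clear
      gen great_median = bp > 150
      xtset patient

      * Consistent (fixed-effects) estimates first
      xtreg great_median i.when i.sex, fe
      estimates store fe

      * Efficient (random-effects) estimates second
      xtreg great_median i.when i.sex, re
      estimates store re

      * Compare the coefficients common to both models
      hausman fe re
      ```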
      Kind regards,
      Carlo
      (StataNow 18.5)



      • #4
        Hello, "Student",

        I really didn't get your point. Once you have created the binary variable "great_median", you are not supposed to apply a linear regression, as you tried with "xtreg" (please see the drop-down menu path: Statistics > Longitudinal/panel data > Linear models > Linear regression [FE, RE, PA, BE]), but a logistic regression, as you did with xtlogit.

        Actually, when I reproduce your xtlogit command, the output also includes a note explaining why the observations are dropped:

        Code:
        . xtlogit great_median i.when i.sex, fe
        note: multiple positive outcomes within groups encountered.
        note: 66 groups (132 obs) dropped because of all positive or
              all negative outcomes.
        note: 1.sex omitted because of no within-group variance.
        Best,

        Marcos


        • #5
          I just wish to underline something worth mentioning:

          If we apply "xtlogit" with "re" instead of "fe",

          Code:
          . xtlogit great_median i.when i.sex, re
          We have no dropped observations, despite still being under the logistic "umbrella".

          Best,

          Marcos
          Last edited by Marcos Almeida; 17 Jan 2015, 11:47.


          • #6
            Now I guess I can theoretically envisage what might be happening with "xtlogit, fe" but not with "xtlogit, re".

            When we apply xtlogit with the "fe" option, it becomes a "conditional fixed-effects model".

            And, according to the Stata manual (http://www.stata.com/manuals13/xt.pdf),

            "In general, including panel-specific dummies to control for fixed effects in nonlinear models results in inconsistent estimates. For some nonlinear models, the fixed-effect term can be removed from the likelihood function by conditioning on a sufficient statistic. For example, the conditional fixed-effect logit model conditions on the number of positive outcomes within each panel".
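
            To make the conditioning argument concrete, here is a short worked case with two observations per patient, which is the layout of bplong (before and after). The symbols \alpha_i (patient effect), x_{it} (covariates) and \beta (coefficients) are notation assumed here, not taken from the manual. It is a sketch of the standard argument, not a full derivation.

            ```latex
            % Logit model with a patient-specific effect \alpha_i
            \Pr(y_{it} = 1) = \frac{\exp(\alpha_i + x_{it}\beta)}{1 + \exp(\alpha_i + x_{it}\beta)}, \qquad t = 1, 2

            % Conditioning on exactly one positive outcome, \alpha_i cancels:
            \Pr(y_{i1} = 0,\, y_{i2} = 1 \mid y_{i1} + y_{i2} = 1)
              = \frac{\exp(x_{i2}\beta)}{\exp(x_{i1}\beta) + \exp(x_{i2}\beta)}

            % With zero or two positive outcomes, the conditional probability is 1
            % for every value of \beta, so the patient adds nothing to the likelihood:
            \Pr(y_{i1} = 0,\, y_{i2} = 0 \mid y_{i1} + y_{i2} = 0) = 1, \qquad
            \Pr(y_{i1} = 1,\, y_{i2} = 1 \mid y_{i1} + y_{i2} = 2) = 1
            ```

            Patients whose outcomes are all 0 or all 1 therefore carry no information about \beta in the conditional likelihood, which is consistent with their being dropped.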

            By the way, the examples on pages 233 to 235 are very similar to the ones presented here. I do not mean the comparison made by "Student" (linear versus logistic regression), but the one that came up later in the discussion ("xtlogit, re" versus "xtlogit, fe"): both are logistic regressions, with the same variables and the same data set and format.

            In short, dropped observations under the "fe" option seem to be part and parcel of conditional fixed-effects logit models.

            If still in doubt, please apply "clogit" (conditional logistic regression) without the "or" option to get practically the same coefficient for "when", as well as the same dropped observations:

            Code:
            . clogit great_median i.when i.sex, group(patient)
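
            One way to see the two fits side by side is to store both sets of estimates and tabulate them. This is only a sketch; the names xt_fe and cl are arbitrary labels chosen here.

            ```stata
            * Sketch: compare -xtlogit, fe- with -clogit- on the same model
            sysuse bplong, clear
            gen great_median = bp > 150
            xtset patient

            * Conditional fixed-effects logit via xtlogit
            xtlogit great_median i.when i.sex, fe
            estimates store xt_fe

            * Conditional logistic regression grouped by patient
            clogit great_median i.when i.sex, group(patient)
            estimates store cl

            * Coefficients and standard errors side by side
            estimates table xt_fe cl, b(%9.4f) se
            ```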
            Best,

            Marcos
            Last edited by Marcos Almeida; 17 Jan 2015, 12:47.