xtreg versus xtlogit dropped observations

Student

Join Date: Jan 2015

Posts: 6
#1

16 Jan 2015, 11:39

For the code below, xtlogit drops 132 observations because the outcome does not vary within patient for those observations. With xtreg, however, nothing is dropped. Can anyone explain why those 132 observations are not also dropped by xtreg?

PS: This code uses one of Stata's built-in datasets, so it should be easy to replicate.

sysuse bplong, clear
gen great_median = bp > 150
xtset patient
xtlogit great_median i.when i.sex, fe // drops 132 observations
xtreg great_median i.when i.sex, fe // nothing drops
Clyde Schechter

Join Date: Apr 2014

Posts: 30118
#2

16 Jan 2015, 13:01

In a logistic regression like this, when the dependent variable is always 0 within a patient, or always 1 within a patient, the maximum likelihood estimate of the coefficient of that patient's fixed effect is infinite in magnitude. The fixed effect for that patient becomes a perfect predictor of the outcome. (Perfect prediction is also known as complete separation.)
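To make the "infinite in magnitude" point concrete, here is a short sketch of the argument. The notation is an assumption introduced only for illustration: $\alpha_i$ is patient $i$'s fixed effect, $x_{it}$ the covariates, $\beta$ their coefficients, and $\Lambda$ the logistic function.

```latex
% Assumed notation (not from the thread): alpha_i, x_{it}, beta, Lambda
\Pr(y_{it} = 1) = \Lambda(\alpha_i + x_{it}\beta),
\qquad \Lambda(z) = \frac{e^{z}}{1 + e^{z}}

% Patient i with y_{it} = 1 for every t contributes
\ell_i(\alpha_i) = \sum_{t} \log \Lambda(\alpha_i + x_{it}\beta),
% which is strictly increasing in alpha_i and approaches its supremum 0
% only as alpha_i -> +infinity.

% Patient i with y_{it} = 0 for every t contributes
\ell_i(\alpha_i) = \sum_{t} \log\bigl(1 - \Lambda(\alpha_i + x_{it}\beta)\bigr),
% which is strictly decreasing in alpha_i and approaches 0
% only as alpha_i -> -infinity.
```

In both cases no finite value of $\alpha_i$ maximizes the likelihood, which is the sense in which the estimate "blows up".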

In a simple linear regression, perfect prediction does not "blow up" the estimating process, so there is no need to drop anything. In fact, in this situation, that patient's fixed effect is simply a finite number and the patient's outcome never deviates from its own mean, which may be uninteresting, but is not an obstacle to estimation.

Note that different considerations apply when an independent variable is constant within person. In that case, it is removed in either logistic or simple linear regression because of collinearity issues (which is what happens with the sex variable in your example).
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#3

17 Jan 2015, 08:52

Student (please see FAQ #6 on the preferred registration requirements on this forum, thanks);
the only way for your -xtreg- model to keep -i.sex- is to change its specification from -fe- to -re-, as the latter also returns coefficient estimates for time-invariant predictors:

Code:

xtreg great_median i.when i.sex, re

On how to compare -fe- and -re- specifications, you may want to take a look at -help hausman-

Kind regards,
Carlo
(Stata 19.0)
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#4

17 Jan 2015, 10:29

Hello, "Student",

I really didn't get your point. Once you have created the binary variable "great_median", you are not supposed to fit a linear regression, as you tried with "xtreg" (please see the drop-down menu path: Statistics > Longitudinal/panel data > Linear models > Linear regression (FE, RE, PA, BE)), but a logistic regression, as you did with xtlogit.

Actually, when I reproduce your xtlogit command, the output also explains the reason for the dropped observations:

Code:

. xtlogit great_median i.when i.sex, fe
note: multiple positive outcomes within groups encountered.
note: 66 groups (132 obs) dropped because of all positive or all negative outcomes.
note: 1.sex omitted because of no within-group variance.
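The counts in that note can be checked directly. The following is a minimal sketch; the variable names n_pos, n_obs, no_variation and first are made up for illustration and are not part of the original code.

```stata
sysuse bplong, clear
generate byte great_median = bp > 150

* Number of positive outcomes and number of observations per patient
* (n_pos and n_obs are hypothetical helper names)
bysort patient: egen n_pos = total(great_median)
bysort patient: generate n_obs = _N

* A patient has no within-group variation if all outcomes are 0 or all are 1
generate byte no_variation = (n_pos == 0) | (n_pos == n_obs)

* Observations belonging to such patients
count if no_variation

* Number of such patients (tag one observation per patient)
egen byte first = tag(patient)
count if first & no_variation
```

The two counts should correspond to the observations and groups reported as dropped in the xtlogit note.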

Best,

Marcos

Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#5

17 Jan 2015, 11:45

I just wish to underline something worth mentioning:

If we apply "xtlogit" with "re" instead of "fe",

Code:

. xtlogit great_median i.when i.sex, re

No observations are dropped, even though we are still under the logistic "umbrella".

Best,

Marcos

Last edited by Marcos Almeida; 17 Jan 2015, 11:47.

Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#6

17 Jan 2015, 12:35

Now I guess I can see, theoretically, why this happens with "xtlogit, fe" but not with "xtlogit, re".

When we apply xtlogit with the "fe" option, it fits a "conditional fixed-effects model".

And, according to the Stata manual (http://www.stata.com/manuals13/xt.pdf),

"In general, including panel-specific dummies to control for fixed effects in nonlinear models results in inconsistent estimates. For some nonlinear models, the fixed-effect term can be removed from the likelihood function by conditioning on a sufficient statistic. For example, the conditional fixed-effect logit model conditions on the number of positive outcomes within each panel".

By the way, the examples on pages 233 to 235 are very similar to the ones presented here. I do not mean the comparison first posted by "Student" (linear versus logistic regression), but the one that came up later in the discussion ("xtlogit, re" versus "xtlogit, fe"): both are logistic regressions, with the same variables, on the same data set and in the same format.

In short, dropping observations under the "fe" option seems to be part and parcel of the conditional fixed-effects logit model.
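A short worked version of the conditioning argument may help, for a patient with two observations. The notation is an assumption added for illustration: $\alpha_i$ is the patient's fixed effect, $x_{it}$ the covariates at occasion $t$, and $\beta$ their coefficients.

```latex
% Assumed notation (not from the thread or the manual): alpha_i, x_{it}, beta
% One positive outcome out of two: alpha_i cancels from the conditional probability
\Pr(y_{i1} = 0,\ y_{i2} = 1 \mid y_{i1} + y_{i2} = 1)
  = \frac{e^{\alpha_i + x_{i2}\beta}}{e^{\alpha_i + x_{i1}\beta} + e^{\alpha_i + x_{i2}\beta}}
  = \frac{e^{x_{i2}\beta}}{e^{x_{i1}\beta} + e^{x_{i2}\beta}}

% All negative or all positive outcomes: only one arrangement is possible
\Pr(y_{i1} = 0,\ y_{i2} = 0 \mid y_{i1} + y_{i2} = 0) = 1,
\qquad
\Pr(y_{i1} = 1,\ y_{i2} = 1 \mid y_{i1} + y_{i2} = 2) = 1
```

A patient whose conditional probability is 1 adds $\log 1 = 0$ to the conditional log likelihood whatever the value of $\beta$, so that patient carries no information about $\beta$ and is dropped.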

If still in doubt, please apply "clogit" (conditional logistic regression) without the "or" option to get practically the same coefficient for "when", as well as the same dropped observations:

Code:

. clogit great_median i.when i.sex, group(patient)
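To see the two sets of estimates side by side, something along these lines could be used. This is only a sketch: the stored-estimate names xt_fe and cl are made up for illustration, and i.sex is left out because it is omitted in both models anyway.

```stata
sysuse bplong, clear
generate byte great_median = bp > 150
xtset patient

* Conditional fixed-effects logit via xtlogit (xt_fe is a hypothetical name)
quietly xtlogit great_median i.when, fe
estimates store xt_fe

* The same model via clogit, grouping on patient (cl is a hypothetical name)
quietly clogit great_median i.when, group(patient)
estimates store cl

* Coefficients, standard errors and number of observations used
estimates table xt_fe cl, b(%9.4f) se(%9.4f) stats(N)
```

The coefficient on "when" and the number of observations used should agree across the two columns.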

Best,

Marcos

Last edited by Marcos Almeida; 17 Jan 2015, 12:47.

