logit with fixed effects - almost all observations dropped

Katharina Meyer

Join Date: Nov 2014

Posts: 22
#1

20 Nov 2014, 05:11

Hey everyone,

I have a problem with a logit regression analysis. This is the first time I have worked with Stata, so this may be quite an easy question.

My research question is which variables affect students' preferences for small or large companies after their graduation. I have a panel dataset; the survey was conducted 5 times.

I identified several attributes that describe small (UG_W = 0) and large (UG_W = 1) companies. These attributes are career possibilities, social relationships and compensation.
My regression looks as follows:

. xtlogit UG_W career_pos_W social_rel_W high_compensation_W, fe

When I run the regression, almost all of my observations are dropped and I am left with 31 observations, which is obviously not enough.
Stata says:

note: multiple positive outcomes within groups encountered.
note: 130 groups (301 obs) dropped because of all positive or all negative outcomes.

What exactly does this mean? How can I solve this problem? What options do I have?

Thank you!
daniel klein

Join Date: Mar 2014

Posts: 3861
#2

20 Nov 2014, 05:30

Technically, Stata is saying that 130 of your panel units (students, I guess), accounting for 301 observations, prefer either a small company at every occasion/repeated measurement (interview, I would guess) or a large company at every occasion. In a fixed-effects logistic regression model, you cannot use observations from panel units that have no variation on y (i.e. your left-hand side variable/outcome/response/dependent variable ...).
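
To see how many students actually change their answer across the surveys, something like the following might help. It is only a sketch: -idstudent- is an assumed name for the student identifier, and -min_y-, -max_y-, -varies- and -first- are made-up helper variables.

```stata
* Sketch only: idstudent is an assumed name for the panel identifier
bysort idstudent: egen min_y = min(UG_W)
bysort idstudent: egen max_y = max(UG_W)

* 1 if the student gave both answers at some point, 0 if the answer never changes
generate byte varies = (min_y != max_y)

* count each student once
egen first = tag(idstudent)
tabulate varies if first
```

Only students with varies == 1 can contribute to -xtlogit, fe-. The counts may differ somewhat from the note Stata prints, because -xtlogit- first drops observations with missing values on the right-hand side variables.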

If the problem is not due to coding errors, your options seem to be a random-effects model, or a linear probability model. But you may want to tell us a little bit more about your data, i.e. what are the panel units, what are the occasions, etc.

Best
Daniel
ben earnhart

Join Date: May 2014

Posts: 1027
#3

20 Nov 2014, 05:34

You may be stuck, unable to run the models you want. Fixed-effects models depend on there being variation within each higher-level unit of analysis. If there is no variation within a company's observations (assuming that's your level two), it can't be used in the model. Realistically, when you think about it, not a whole lot of companies would exhibit variation on UG_W, since small companies tend to stay small, and large companies tend to stay large.

For that matter, I think you have your causality backwards: are you really trying to predict company size? Or are you trying to predict things like career possibilities and social relationships with company size?

*===========added after other comments by others=========
Ah, now I understand your unit of analysis. Well, most students seem to have stuck with a preference for large or small companies, thus no variation. So it's the same issue I and others describe, but my comment about reversing the causality is now moot. Still can't run the model without within-student variation.

Last edited by ben earnhart; 20 Nov 2014, 06:19.
Katharina Meyer

Join Date: Nov 2014

Posts: 22
#4

20 Nov 2014, 05:46

The question the students had to answer in every survey was: "Which size of company would you prefer to work for after your graduation?" Therefore it is about the students' preferred company size.
Based on the literature, I linked the attributes to company size, and in the next step I want to analyse whether the students link these attributes with the "right" company size.
daniel klein

Join Date: Mar 2014

Posts: 3861
#5

20 Nov 2014, 05:51

Katharina Meyer wrote: "Based on the literature, I linked the attributes to company size, and in the next step I want to analyse whether the students link these attributes with the 'right' company size."

Sorry, I do not fully follow this. The students were only asked to state which company size they would prefer? How can you "link" this to attributes the students did not rate?

Best
Daniel
Katharina Meyer

Join Date: Nov 2014

Posts: 22
#6

20 Nov 2014, 05:58

They did rate the attributes, but separately from company size. So I know which attributes are important to them and I know which company size they prefer. I want to know if my assumptions about the link between the attributes and company size are supported.
daniel klein

Join Date: Mar 2014

Posts: 3861
#7

20 Nov 2014, 06:09

OK, so picking on your example, "career_pos_W" is the students' rating of how important career possibilities are? In that case your model seems adequate. Unfortunately, my explanation still holds. If there is no coding error, you cannot use the fixed-effects logit model.

Best
Daniel
Katharina Meyer

Join Date: Nov 2014

Posts: 22
#8

20 Nov 2014, 06:26

I am sure that there is no coding error.
So I will use a random-effects model.

logit UG_W career_pos_W social_rel_W high_compensation_W

And as there is no variation in y, it is unnecessary to add variables like age or sex.
daniel klein

Join Date: Mar 2014

Posts: 3861
#9

20 Nov 2014, 06:44

Katharina Meyer wrote: "And as there is no variation in y, it is unnecessary to add variables like age or sex."

Exactly the opposite. As (almost) all the variation comes from between-student comparisons, you want to make sure to control for all the factors that vary between students (like age and sex).

Btw., also note that -logit- corresponds to a pooled model. The RE model would be -xtlogit, re-, and if you use the former, you want to make sure to correct the standard errors for clustering within students.
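
In case it helps, the two alternatives might look roughly like this. Again only a sketch: -idstudent- and -wave- are assumed names for the student and survey identifiers, and -age- and -sex- stand in for whatever between-student controls are in the data (with -sex- assumed to be stored as a numeric 0/1 variable).

```stata
* Sketch only: idstudent, wave, age and sex are assumed variable names
xtset idstudent wave

* pooled logit, standard errors clustered on student
logit UG_W career_pos_W social_rel_W high_compensation_W age i.sex, vce(cluster idstudent)

* random-effects logit
xtlogit UG_W career_pos_W social_rel_W high_compensation_W age i.sex, re
```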

Best
Daniel

Last edited by daniel klein; 20 Nov 2014, 06:47.
Richard Williams

Join Date: Apr 2014

Posts: 5008
#10

20 Nov 2014, 07:19

Paul Allison has an excellent and inexpensive book on fixed effects regression models:

http://www.amazon.com/Effects-Regres...dp/0761924973/

He has good discussions of the merits of fixed effects vs random effects.

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam
Katharina Meyer

Join Date: Nov 2014

Posts: 22
#11

21 Nov 2014, 02:48

So if I wanted to use a pooled model, and since there is no variation in y, would it be a reasonable alternative to convert the panel data to a cross-sectional dataset and run a regression on that?

How would I test for autocorrelation with a logit model? "estat dwatson"?
Katharina Meyer

Join Date: Nov 2014

Posts: 22
#12

21 Nov 2014, 03:08

How can I convert panel data to cross-sectional data?
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17713
#13

21 Nov 2014, 03:13

Katharina:
why do you prefer a pooled -logit- model to -xtlogit, re-, as Daniel suggested?

Kind regards,
Carlo
(Stata 19.0)
Katharina Meyer

Join Date: Nov 2014

Posts: 22
#14

21 Nov 2014, 03:38

After giving my question/hypothesis some thought, I realized that I simply want to know whether the students link the attributes to the "right" company size. At this point (as this is only one of many questions) I am not interested in how it changes over time, but in whether the theoretical links are supported by the data.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17713
#15

21 Nov 2014, 04:04

Katharina:
admittedly, I do not follow you in full.
You seem to have repeated measurements (5 times) on the same sample of students with a binary dependent variable; hence a panel model would be the right choice.
As within variation among students is almost null (i.e., most students' answers do not vary across the 5 measurements), you will run out of luck with a -xtlogit, fe- specification; hence -xtlogit, re- or a pooled -logit, vce(cluster idstudent)- would be worth trying.
However, the latter choice may give results that differ from the -xtlogit, re- specification.
You can assess whether this holds for your model by looking at the result of the likelihood-ratio test that appears as a footnote of the -xtlogit, re- output table.
You may want to take a look at -help xtlogit- and the related entry in the Stata 13.1 .pdf manual.
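
For what it is worth, the test can also be pulled from the stored results after estimation. A sketch, assuming -idstudent- identifies the students:

```stata
* Sketch only: idstudent is an assumed name for the panel identifier
xtset idstudent
xtlogit UG_W career_pos_W social_rel_W high_compensation_W, re

* the last line of the output table reports the LR test of rho = 0
display "rho           = " e(rho)
display "LR chi2 (rho) = " e(chi2_c)
```

If the test does not reject rho = 0, the panel-level variance component is unimportant and the random-effects estimates should be close to the pooled -logit- ones.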

Kind regards,
Carlo
(Stata 19.0)