Logit with panel data

Marcel Campion

Join Date: Feb 2017

Posts: 30
#1

Logit with panel data

04 May 2017, 11:23

Hello Stata users,

I am running a logit model with panel data (T=2, N=2256). Since coefficient estimates from a logit model are hard to interpret, I am reporting marginal effects instead. I want to take advantage of the panel dimension of my data by using fixed effects to control for time-invariant individual characteristics. As I understand it, a conditional FE logit model with individual fixed effects cannot provide marginal effects, because the estimation procedure does not yield estimates of the individual effects c_i (the c_i are wiped out by the conditioning).
One potential solution might be to use a random effects model, but the strict exogeneity assumption and the assumption of zero correlation between the individual effect and the covariates are, in my opinion, too strong for my study.

However, Professor Santos Silva has written a Stata command (aextlogit) that estimates the average semi-elasticity with respect to a specific covariate. When I implement this approach I lose 3/4 of my observations because their outcomes are all positive or all negative. (By the way, I would be grateful if anyone could explain what an average semi-elasticity is.)

My first question is thus: does losing so many observations bias my estimates? My second question is: what other methods can I use to obtain marginal effects from a fixed effects logit estimation? There are many discussions on this topic, but in my opinion many of us misunderstand what the beta coefficients from a logit or conditional fixed effects logit model represent, and interpret them wrongly.

Thank you for stopping by.

Marcel Campion.
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2174
#2

04 May 2017, 13:20

Marcel: I have two suggestions. First, use a linear model estimated by fixed effects. This often gives a good approximation to the average marginal effect from a nonlinear model. I have a paper that covers a special case here.
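A minimal sketch of the first suggestion, run on simulated data so that it is self-contained. The data-generating process, the seed, and the single time-varying regressor x1 are assumptions made for illustration only; with real data only the last two commands would be needed.

```stata
* Simulated two-period panel (all names and parameter values are illustrative)
clear all
set seed 12345
set obs 500
generate id = _n
generate c = rnormal()                 // unobserved, time-invariant heterogeneity
expand 2                               // two rows per unit (T = 2)
bysort id: generate year = _n
generate d2 = (year == 2)              // second-period dummy
generate x1 = 0.5*c + rnormal()        // regressor correlated with c
generate u = runiform()
generate y = (0.8*x1 + 0.3*d2 + c + ln(u/(1 - u)) > 0)   // logistic error

xtset id year
* Linear probability model with unit fixed effects, SEs clustered by unit
xtreg y x1 d2, fe vce(cluster id)
```

The coefficient on x1 is then read directly as an approximate average marginal effect on the probability that y = 1.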

Then I would use a probit correlated random effects approach. The Mundlak version usually works well, but Chamberlain can also be used. You just need to generate the time averages of all time-varying variables (except the time period dummy). Once you've done that, the average marginal effects follow easily. Pooled estimation works well, and seems to lose little in terms of efficiency. I talk about this approach in my book, "Econometric Analysis of Cross Section and Panel Data," 2e, 2010, MIT Press.

Generic code, where x1, ..., xK change over time, d2 is the second period dummy, and z1, ..., zM don't change:

Code:

xtset id year
egen x1bar = mean(x1), by(id)
...
egen xKbar = mean(xK), by(id)
probit y x1 x2 ... xK z1 z2 ... zM x1bar ... xKbar d2, cluster(id)
margins, dydx(x1 ... xK)
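For readers who want to run the generic code end to end, here is a sketch on simulated data with one time-varying regressor (x1) and one time-constant regressor (z1). The data-generating process and all parameter values are assumptions chosen so that the Mundlak device holds; they are not taken from the thread.

```stata
* Simulated two-period panel (illustrative names and values)
clear all
set seed 2017
set obs 1000
generate id = _n
generate c  = rnormal()                // unobserved heterogeneity
generate z1 = rnormal()                // time-constant covariate
expand 2
bysort id: generate year = _n
generate d2 = (year == 2)
generate x1 = 0.5*c + rnormal()        // time-varying, correlated with c
generate y  = (0.8*x1 + 0.4*z1 + 0.3*d2 + c + rnormal() > 0)

xtset id year
* Mundlak device: time average of each time-varying regressor
egen x1bar = mean(x1), by(id)

* Pooled CRE probit with cluster-robust standard errors
probit y x1 z1 x1bar d2, vce(cluster id)

* Average marginal effect of the time-varying regressor
margins, dydx(x1)
```

Here vce(cluster id) is used in place of cluster(id); both request standard errors clustered by unit.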
Maarten Buis

Join Date: Mar 2014

Posts: 3459
#3

04 May 2017, 13:43

Jeff: should the means computed in x*bar be restricted to the observations used in the model, i.e. those that are not excluded due to missing values on some other variable?

One way to do that would be:

Code:

// we are not interested in this model
// we just want to find out which observations will be used
qui : probit y x1 x2 ... xK z1 z2 ... zM d2

// store which observations will be used in variable touse (read: to use)
gen touse = e(sample)

// do the computations on only those observations that will be used in the model
egen x1bar = mean(x1) if touse, by(id)
...
egen xKbar = mean(xK) if touse, by(id)

// continue as in #2

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2174
#4

05 May 2017, 08:02

Maarten: Definitely! Thanks for that. I assumed from the description of the data that the panel is balanced, but I see it is not explicitly stated that it is.

I have a paper where I consider cases where one might want the mean and the variance of the heterogeneity to depend on the number of time periods observed for each unit, in which case one can add dummy variables for the number of time periods, and perhaps even use -hetprobit-.
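A rough sketch of the "dummies for the number of time periods" idea, on a simulated unbalanced panel. The variable name Ti, the attrition rule, and the data-generating process are all hypothetical; this only lets the mean of the heterogeneity shift with the number of observed periods and is not a reproduction of the paper's full specification.

```stata
* Simulated unbalanced panel with up to three periods (illustrative only)
clear all
set seed 4242
set obs 800
generate id = _n
generate c = rnormal()
expand 3
bysort id: generate year = _n
generate x1 = 0.5*c + rnormal()
generate y  = (0.8*x1 + c + rnormal() > 0)
drop if runiform() < 0.2               // random attrition makes the panel unbalanced

* Number of periods actually observed for each unit
bysort id: generate Ti = _N

* Time averages over the observed periods only
egen x1bar = mean(x1), by(id)

* CRE probit with period dummies and dummies for the number of periods observed
probit y x1 x1bar i.year i.Ti, vce(cluster id)
margins, dydx(x1)
```

Letting the variance depend on the number of periods would mean replacing probit with hetprobit and listing the Ti dummies in its het() option; that step is not run here.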
Joao Santos Silva

Join Date: Apr 2014

Posts: 3015
#5

05 May 2017, 13:27

Dear Marcel,

To answer your questions:

1 - The average semi-elasticity is exactly that: the sample average of the individual semi-elasticities (if you do not know what a semi-elasticity is, please check a textbook).

2 - Dropping those observations does not cause bias; they are dropped because they contain no information about the parameters of interest.

3 - I am not aware of any other method.
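For reference, points 1 and 2 can be written out for the logit model with individual effects. The notation below (p_it, ASE_j) is introduced here for illustration and is not taken from the thread or from the aextlogit documentation; it assumes the standard model Pr(y_it = 1 | x_it, c_i) = Λ(x_it β + c_i).

```latex
% Assumed model
p_{it} \equiv \Pr(y_{it}=1 \mid x_{it}, c_i) = \Lambda(x_{it}\beta + c_i),
\qquad \Lambda(a) = \frac{e^{a}}{1+e^{a}}

% Point 1: semi-elasticity with respect to covariate j, and its sample average
\frac{\partial \ln p_{it}}{\partial x_{itj}} = \beta_j\,(1 - p_{it}),
\qquad
\mathrm{ASE}_j = \frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T}\beta_j\,(1 - p_{it})

% Point 2: conditional likelihood contributions with T = 2
\Pr(y_{i2}=1 \mid y_{i1}+y_{i2}=1,\; x_{i1}, x_{i2}, c_i)
  = \Lambda\big((x_{i2}-x_{i1})\beta\big)

\Pr(y_{i1}=y_{i2}=0 \mid y_{i1}+y_{i2}=0) = 1,
\qquad
\Pr(y_{i1}=y_{i2}=1 \mid y_{i1}+y_{i2}=2) = 1
```

A semi-elasticity of 0.05 means that a one-unit increase in x_j raises the probability by roughly 5 percent (not 5 percentage points). The last line shows why units whose outcome never changes drop out: once we condition on the sum of the outcomes, their contribution to the conditional likelihood equals 1 whatever the value of β.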

Best wishes,

Joao
Marcel Campion

Join Date: Feb 2017

Posts: 30
#6

09 May 2017, 13:45

Dear Joao, Jeff and Maarten,

Thank you for your contributions. It is actually working very well (I will soon post a recap of what I find with the two strategies so people can form their own opinion).
However, I have a further question: is it possible to instrument one endogenous variable?
Marcel Campion

Join Date: Feb 2017

Posts: 30
#7

09 May 2017, 15:57

At the moment I am thinking of a strategy based on the prediction from an OLS regression.
My endogenous variable is x2 in the model described by Jeff above.
First, predict x2:
reg x2 h1 h2 h3
predict X2, xb

Then introduce X2 into the model:
probit y x1 X2 x3 z1 z2 z3 x1bar X2bar x3bar ... d2, cluster(hid)

I think it should yield unbiased estimates, unless I misunderstand something.

Kind regards,
Marcel
Joao Santos Silva

Join Date: Apr 2014

Posts: 3015
#8

10 May 2017, 13:10

Dear Marcel,

Your probit equation is an example of a "forbidden regression"; so that won't work.

Best wishes,

Joao
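The approach in #7 replaces x2 by its fitted value inside a nonlinear model, and its first stage also leaves out the other exogenous regressors. One commonly used alternative, which nobody in this thread proposed, is a control-function probit: keep the actual x2 and add the first-stage residual as an extra regressor. The sketch below is cross-sectional and uses simulated data; the instrument h1, the parameter values, and the joint normality of the errors are assumptions for illustration. In the panel setting the time averages would presumably enter both steps, which is not shown.

```stata
* Simulated data with one endogenous regressor x2 (illustrative only)
clear all
set seed 777
set obs 2000
generate h1 = rnormal()                        // instrument, excluded from the y equation
generate x1 = rnormal()                        // exogenous regressor
generate v  = rnormal()                        // first-stage error
generate u  = 0.6*v + sqrt(1 - 0.36)*rnormal() // correlated with v, so x2 is endogenous
generate x2 = 0.8*h1 + 0.5*x1 + v
generate y  = (0.5*x1 + 1.0*x2 + u > 0)

* Step 1: first stage with ALL exogenous variables, not only the instrument
regress x2 h1 x1
predict double vhat, residuals

* Step 2: probit with the actual x2 plus the first-stage residual
probit y x1 x2 vhat, vce(robust)
test vhat                                      // Wald test of exogeneity of x2
margins, dydx(x2)                              // averages over the estimated residuals
```

The second-step standard errors ignore the fact that vhat is estimated, so in practice both steps would be bootstrapped together. For a continuous endogenous regressor, Stata's built-in ivprobit is another option.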
Panika Jain

Join Date: Mar 2019

Posts: 8
#9

14 Mar 2019, 03:21

Jeff: you used meap94_98 to explain how to deal with unbalanced panel data. Could you tell me where I can get this data file, or explain how to handle unbalanced datasets with Stata commands?