Panel Data and Fixed Effects

Stephen Colbrook

Join Date: Aug 2022

Posts: 3
#1

23 Aug 2022, 09:48

Hi everyone!

Apologies if this is a pretty basic query, but I'm struggling to decide which regression technique to apply to my data and wondered if anyone could help. I have panel data, covering 1984 to 1993, on every AIDS-related law passed by the fifty U.S. states, along with various independent variables (legislative professionalism, HIV/AIDS caseload, partisanship, etc.) for each year. I'm trying to determine the relationships between these independent variables and the likelihood that a state would pass an AIDS-related law over the entire period. Because this is count data, I was under the impression that I needed to use Poisson regression, but I've seen a few academic articles that dismiss this approach when accounting for state and year fixed effects (for example, one article simply states 'I estimate OLS regressions, which I prefer to negative binomial count models because of the inclusion of state and year fixed effects').

Is anyone able to give me any guidance on this?

Thanks so much for your help.
Clyde Schechter

Join Date: Apr 2014

Posts: 30355
#2

23 Aug 2022, 10:02

Well, without seeing the full article you are quoting from (and perhaps even if I did see it) it isn't completely clear what the author meant. But here's my best guess. The fixed-effects Poisson and negative binomial models are conditional fixed effects models. That is, you cannot get estimates of the state and year effect levels from them--instead, the likelihood function is calculated conditional on them, and they are not estimable in that model.

So if your research goals require that you estimate the state and year effects themselves, then the usual -xtpoisson, fe- and -xtnbreg, fe- will not do the job. That said, in my opinion, the most important thing about selecting a model is that it reflect, as much as possible, our best understanding of the data generating process and provide a reasonable fit to the data. Now, if your count outcome numbers include values that are small, it is unlikely that a linear model will represent that process well, and you are likely to end up with a model that predicts negative counts for some realistic values of the explanatory variables. However, if your count outcomes are nicely distanced from zero and if they don't range over too many orders of magnitude, a linear model is likely to be very useful, and can easily work as well as or better than Poisson or negative binomial.
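To make the comparison concrete, here is a minimal Stata sketch of the two approaches. The variable names (state, year, laws, profession, caseload, dem_share) are hypothetical placeholders rather than anything from the actual dataset, and state is assumed to be a numeric identifier.

```stata
* Hypothetical variable names; substitute your own.
* state is assumed to be a numeric panel identifier.
xtset state year

* Conditional fixed-effects Poisson: the state effects are conditioned
* out of the likelihood, so no state-level estimates are reported.
* Year effects enter here as ordinary indicator variables (an assumption
* about the specification, not the only way to handle them).
xtpoisson laws profession caseload dem_share i.year, fe vce(robust)

* Linear fixed-effects model on the same variables.
xtreg laws profession caseload dem_share i.year, fe vce(cluster state)

* Fitted values including the estimated state effect; count how many
* fall below zero, which an actual count can never do.
predict double laws_hat, xbu
count if laws_hat < 0
```

If the last command reports more than a handful of negative fitted values, that suggests the linear model is straining against the lower bound at zero.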
Stephen Colbrook

Join Date: Aug 2022

Posts: 3
#3

23 Aug 2022, 10:22

Thanks very much for such a prompt response. It sounds like my data would fit a linear model.

On a side note, would it be correct to log the values of the independent variables, so that the regression coefficients read as percentage change in the number of laws passed?
Clyde Schechter

Join Date: Apr 2014

Posts: 30355
#4

23 Aug 2022, 10:37

No. Logging the outcome variable and leaving the independent variables unlogged would give you a model in which the coefficients, multiplied by 100, would be approximately equal to the percentage difference in outcome associated with a unit change in an explanatory variable. Logging the independent variables does not do that: rather, it gives you a model in which a given percentage change in the explanatory variable is associated with a certain additive change in the outcome variable.
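In symbols, a sketch of the two cases, with X a generic explanatory variable and a, b generic coefficients (notation introduced here purely for illustration):

```latex
% Logged outcome, unlogged X: a one-unit change in X multiplies Y by e^{b}
\[
\log Y = a + bX
\quad\Longrightarrow\quad
\%\Delta Y = 100\,(e^{b} - 1) \approx 100\,b \quad \text{for small } b
\]

% Unlogged outcome, logged X: a 1\% increase in X shifts Y additively
\[
Y = a + b \log X
\quad\Longrightarrow\quad
\Delta Y = b \log(1.01) \approx \frac{b}{100}
\]
```

The approximation in the first case degrades as b grows: at b = 0.5, for example, the exact figure is about 65%, not 50%.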

That said, remember that log Y and Y are not linearly related to each other. If Y only varies over 1 order of magnitude (or maybe a bit more), then it may be close enough to linear that regressing log Y instead of Y does not violently destroy the model's representation of the real world data generating process, so you can get away with it. But if Y varies over several orders of magnitude, then you really are changing things quite a bit, and the Y model and the log Y model can't both be right: in that case you have to choose the model that properly describes what is going on in the real world. If the linear model fits the data well but the log Y model doesn't, then you shouldn't use the log Y model. Bear in mind that in that case, the reality is that a unit change in an explanatory variable is not actually associated with a fixed percentage change in the outcomes, so that "percentage change in the number of laws passed" you want to estimate doesn't even exist.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17851
#5

23 Aug 2022, 11:10

Stephen:
as an aside to Clyde's excellent advice, much depends on how you're going to disseminate the results of your research.
If you're dealing with an academic dissertation, bring these issues up with your supervisor.
If you're going to draft a paper to be submitted to a technical journal in your research field, gather some information about the editorial board (and possible reviewers) beforehand.
I would not take for granted that law journals have reviewers for panel data regression.

Kind regards,
Carlo
(Stata 19.0)
Stephen Colbrook

Join Date: Aug 2022

Posts: 3
#6

24 Aug 2022, 06:08

Thanks both for your help.

Having done some more thinking, would it not just be better to do a negative binomial regression with generalized estimating equations?
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17851
#7

24 Aug 2022, 07:49

Stephen:
you could also consider -xtnbreg-.
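For reference, a minimal sketch of how the two options might be written, using hypothetical variable names (state, year, laws, profession, caseload, dem_share) that are placeholders only:

```stata
* Hypothetical variable names, as placeholders only.
xtset state year

* Population-averaged negative binomial via GEE. There are no state
* fixed effects here: the coefficients describe average effects across
* states. xtgee treats the negative binomial dispersion parameter as
* fixed (default 1) rather than estimating it.
xtgee laws profession caseload dem_share i.year, ///
    family(nbinomial) link(log) corr(exchangeable) vce(robust)

* Conditional fixed-effects negative binomial.
xtnbreg laws profession caseload dem_share i.year, fe
```

The exchangeable working correlation is just one choice made for the sketch; the two commands answer somewhat different questions, so their coefficients are not directly comparable.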

Kind regards,
Carlo
(Stata 19.0)