Finding the drivers of within-variation in a panel dataset

Kwasi Tabiri

Join Date: Apr 2019

Posts: 25
#1

03 Nov 2023, 21:48

Hi Statalist,

In panel data models such as

Code:

y_it = B*x_it + error_it

we usually decide which model to adopt based on the source of variation in x: between-variation (differences across individuals) or within-variation (changes within individuals over time).

My question is: in the case where the effect is driven by within-variation over time (and so we run a fixed effects model), is there a way to find out what drives this within-variation?

For example, say y is household bargaining power, and x is a measure of the extent of informality in women's employment. Suppose we observe that reduced informality of employment leads to higher bargaining power within the household. How do we then determine whether the within-variation over time in informality is driven by factors like higher education or changes in the kind of job that women do?

Thanks for your time!
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17726
#2

04 Nov 2023, 04:50

Kwasi:
the first step I would take is to compare the -fe- and -re- specifications.
Assuming that you're dealing with an N>T panel dataset with a continuous regressand, run -xtreg,fe- and -xtreg,re- and compare them via -hausman- or the community-contributed module -xtoverid-.
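A minimal sketch of that comparison, run on simulated data so that it is self-contained. The variable names id, year, y and x, and the unobserved effect a, are placeholders for illustration and are not taken from the thread:

```stata
* Simulated placeholder panel: 200 units (id), 5 periods (year)
clear
set seed 12345
set obs 200
generate id = _n
generate a = rnormal()              // unobserved unit effect, constant within id
expand 5
bysort id: generate year = _n
generate x = 0.5*a + rnormal()      // x is correlated with a by construction
generate y = 1 + 2*x + a + rnormal()

xtset id year
xtreg y x, fe
estimates store fe
xtreg y x, re
estimates store re
hausman fe re                       // consistent estimator (fe) first, efficient (re) second
```

Because x is built to be correlated with the unit effect here, the test should tend to reject -re-; with real data the outcome is not known in advance.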
If this reply is not helpful, please provide further details. Thanks.

Kind regards,
Carlo
(Stata 19.0)
Kwasi Tabiri

Join Date: Apr 2019

Posts: 25
#3

04 Nov 2023, 17:10

Thanks for your reply, Carlo. I probably did not frame the question as well as I should have.
My goal is not to determine whether to use a fixed effects or random effects model. Rather, I want to know what factors are driving changes in the regressor over time.
Would running another fixed effects model, with the regressor as the regressand, give me what I am looking for? Or is there a way to extract that information from the original regression?
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2189
#4

04 Nov 2023, 22:14

Kwasi: I think you're on the right track with using the regressor as the dependent variable and trying to explain it with other variables. The problem arises when those other variables arguably belong in the original model themselves.
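A rough sketch of that idea on simulated data, assuming a single time-varying candidate driver. The names id, year, x, w1 and a are placeholders for illustration, not variables from the thread:

```stata
* Simulated placeholder panel: 200 households (id), 5 years (year)
clear
set seed 2023
set obs 200
generate id = _n
generate a = rnormal()              // unobserved household effect
expand 5
bysort id: generate year = _n
generate w1 = 0.5*a + rnormal()     // time-varying candidate driver of x
generate x  = 0.8*w1 + a + rnormal()

* Fixed effects regression with the original regressor x as the outcome
xtset id year
xtreg x w1 i.year, fe vce(cluster id)
```

Only time-varying candidates can be used this way, since anything constant within a household is absorbed by the fixed effects.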

How about computing the within variance of x for each family? Then you can relate those variances to time-constant factors, such as education (which often doesn't change over time). This would be a cross-sectional regression. I guess you would use the egen command to compute the family-specific standard deviation of your x variable. Something like this:

Code:

egen sd_x = sd(x), by(id)    // id: placeholder name for the family identifier
reg sd_x z1 ... zk if year == 1
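The same two steps can be sketched end to end on simulated data. Here id, year, x and z1 are placeholder names, and the way x is generated is purely an assumption made so that the example runs on its own:

```stata
* Simulated placeholder panel: 200 families (id), 5 years (year)
clear
set seed 12345
set obs 200
generate id = _n
generate z1 = rnormal()                  // time-constant family characteristic
expand 5
bysort id: generate year = _n
generate x = rnormal(0, exp(0.3*z1))     // within-family spread of x depends on z1

* Family-specific standard deviation of x (constant within id)
bysort id: egen sd_x = sd(x)

* Cross-sectional regression: one observation per family
regress sd_x z1 if year == 1, vce(robust)
```

Restricting to a single year keeps one row per family, so the regression does not count the same family-level standard deviation several times.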

BTW, I think we generally choose fixed effects unless it's too imprecise to work with. The key is whether the x are correlated with the unobserved heterogeneity. When that's the case, we opt for FE -- even if x doesn't have as much time-variation as we'd like.
Kwasi Tabiri

Join Date: Apr 2019

Posts: 25
#5

05 Nov 2023, 17:41

Thank you so much for your help, Jeff! This looks like just what I need. I really appreciate it.