Extrapolation in fixed-effects models with time dummies

Henning Hinkers

Join Date: Jul 2022
Posts: 6

04 Jul 2022, 06:32

Hello everyone!

I opened a thread on this project a few days ago; the answers there, along with some colleagues, already helped me, so thanks for that.

Now a new question has come up.

Again, we are studying the mental health trajectories of immigrants and natives in Germany over time, using unbalanced panel data. We estimate within-person changes with two separate fixed-effects models, one for immigrants and one for natives, with mental health (mh) as the outcome and time dummies (2020, 2018, 2016, etc.; the survey runs every two years) as the exposure, controlling for age. We want to visualize the results in a combined coefplot.

For that purpose, we created a continuous impact function.

Code:

gen ycov_index = syear - 2020

Then we defined a dummy impact function for the years 2004 to 2020.

Code:

recode ycov_index              ///
       (min/-16 = 0 "2004")    ///
       (-14     = 1 "2006")    ///
       (-12     = 2 "2008")    ///
       (-10     = 3 "2010")    ///
       (-8      = 4 "2012")    ///
       (-6      = 5 "2014")    ///
       (-4      = 6 "2016")    ///
       (-2      = 7 "2018")    ///
       (0       = 8 "2020")    ///
       , gen(time_dummies)

Then we estimated the two fixed-effects models.

Code:

xtset pid syear
xtreg mh i.time_dummies c.age##c.age if immigrant == 0, fe vce(robust)
margins time_dummies, post
est store A

xtreg mh i.time_dummies c.age##c.age if immigrant == 1, fe vce(robust)
margins time_dummies, post
est store B

Last, we create the plot.

Code:

coefplot ///
    (A, label(Natives) offset(-0.05)                                       ///
        msymbol(O) mcolor(black) msize(medlarge)                           ///
        lwidth(medthick) lcolor(black) ciopts(recast(rcap) lcolor(black))) ///
    (B, label(Immigrants) offset(0.05)                                     ///
        msymbol(S) mcolor(gs9)                                             ///
        lwidth(medthick) lcolor(gs9) ciopts(recast(rcap) lcolor(gs9)))     ///
    , title("Comparison of natives and immigrants")                        ///
      vertical recast(connected) ytitle("")                                ///
      ylabel(48.5(0.5)51.5, labsize(small) angle(0))                       ///
      xlabel(, labsize(vsmall) angle(0))                                   ///
      legend(size(small)) xtitle("Years before Covid-19", size(small))     ///
      xscale(titlegap(2))                                                  ///
      graphregion(color(white) fcolor(white) icolor(white))

We see a significant drop in mental health in 2020 for natives (but not for immigrants), but this could also be part of a periodic up and down: mental health was just as low for natives in 2012 and 2004. Our idea is to take the years 2004-2018, calculate expected values (extrapolations) for 2020, and compare them with the actual 2020 estimates from our fixed-effects models. What would be ways to do that?

Thanks in advance,
Henning

Andrew Musau

Join Date: Oct 2014

Posts: 10214
#2

04 Jul 2022, 07:09

What sample sizes are we talking about here?
Henning Hinkers

Join Date: Jul 2022

Posts: 6
#3

04 Jul 2022, 07:17

Originally posted by Andrew Musau

What sample sizes are we talking about here?

After all restrictions, the fixed-effects model for natives has 172,951 observations (47,145 individuals) and the model for immigrants has 43,159 observations (19,712 individuals).
In both cases, data are highly unbalanced.

Andrew Musau

Join Date: Oct 2014
Posts: 10214

04 Jul 2022, 07:46

The fitted values can always be computed out-of-sample. The prediction including the fixed effects can only be computed for in-sample observations.

Code:

webuse grunfeld, clear
xtset company year
xtreg invest mvalue kstock if time<=10, fe
predict investhat, xb
g invest2= cond(time<=10, invest, investhat)
set scheme s1mono
tw (line invest year if co==1) (line invest2 year if co==1, lp(dash)), leg(order (1 "Actual" 2 "Predicted"))

[Attached graph (Graph.png): actual and predicted invest by year for company 1]
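
The difference between the two kinds of prediction can be checked directly. The following is only a sketch on the same Grunfeld data; the variable names fit_xb and fit_xbu are made up for illustration.

Code:

* Sketch only: same example data and estimation sample as above
webuse grunfeld, clear
xtset company year
xtreg invest mvalue kstock if time<=10, fe

* Linear prediction without the fixed effects: available for every
* observation with non-missing regressors, in or out of sample
predict double fit_xb, xb

* Prediction including the estimated fixed effect u_i: Stata computes
* this for the estimation sample only
predict double fit_xbu, xbu

* Count non-missing predictions of each kind, by period
tabstat fit_xb fit_xbu, by(time) statistics(n)

If the documented behavior holds, fit_xbu should have no non-missing values for time > 10, while fit_xb should be filled in for all periods.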


Henning Hinkers

Join Date: Jul 2022

Posts: 6
#5

04 Jul 2022, 11:03

Originally posted by Andrew Musau

The fitted values can always be computed out-of-sample. The prediction including the fixed effects can only be computed for in-sample observations.

Thanks a lot for your example! That also works for me, but I can't translate it to my own case, especially with the time dummies.
Henning Hinkers

Join Date: Jul 2022

Posts: 6
#6

04 Jul 2022, 11:45

I also took the variable from which I generated time_dummies (ycov_index, which runs from -18 to 0) and tried the following:

Code:

xtreg mh c.ycov_index c.age##c.age if immigrant == 1, fe vce(robust)
margins, at(ycov_index = (-18 (2) (2))

But the values are "not estimable". Why is that?
Andrew Musau

Join Date: Oct 2014

Posts: 10214
#7

04 Jul 2022, 18:14

I do not understand why you are manually creating the time dummies.

Code:

xtset pid syear, delta(2)
xtreg mh i.syear c.age##c.age if !immigrant, fe vce(robust)
margins i.syear, post

Originally posted by Henning Hinkers

Thanks a lot for your example! That works also for me, but I can't get this translated to my example, especially with the time-dummies.

You need to be precise. What exactly have you typed and what is the Stata output?

Last edited by Andrew Musau; 04 Jul 2022, 18:23.
Henning Hinkers

Join Date: Jul 2022

Posts: 6
#8

05 Jul 2022, 07:46

Originally posted by Andrew Musau

I do not understand why you are manually creating the time dummies.

Thanks for your patience! The reason is that I want to estimate the effect of 2020 as an event on mental health (i.e., effects of events, dummy impact function).

I would prefer to do that with margins, so this was my idea:

Code:

gen ycov_index = syear-2020
drop if syear==2020

... since I have data for 2020 but want a prediction for that year, in order to then estimate the effect of the event. Further:

Code:

recode ycov_index        ///
       (-18 = 0 "2002")  ///
       (-16 = 1 "2004")  ///
       (-14 = 2 "2006")  ///
       (-12 = 3 "2008")  ///
       (-10 = 4 "2010")  ///
       (-8  = 5 "2012")  ///
       (-6  = 6 "2014")  ///
       (-4  = 7 "2016")  ///
       (-2  = 8 "2018")  ///
       , gen(time_dummies)

xtreg mcs i.time_dummies c.age##c.age if immigrant == 0, fe vce(robust)
margins i.time_dummies, at(time_dummies=(0(1)9)) noestimcheck

So that the value for time_dummies = 9 would give me a prediction of 2020.

But it gives me:

Code:

at values for factor time_dummies do not sum to 1

Thx!
Andrew Musau

Join Date: Oct 2014

Posts: 10214
#9

05 Jul 2022, 08:22

You cannot get marginal effects for a factor level not present in the estimation sample. The best that you can do is to include a time trend and then predict future years based on the coefficient on the trend. If you suspect that the trend is not linear, you can experiment with other specifications.

Code:

xtset pid syear, delta(2)
xtreg mh c.syear c.age##c.age if !immigrant & syear<=2018, fe vce(robust)
predict mh_native, xb
gen mh2= cond(!immigrant & syear<=2018, mh, mh_native)
set scheme s1color
tw (line mh syear if !immigrant) (line mh_native syear if !immigrant, lp(dash)), xline(2018) leg(order (1 "Actual" 2 "Predicted"))
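
As one possible example of a non-linear specification, here is a sketch with a quadratic trend. It uses the Grunfeld data as a stand-in because the data in this thread are not public. The cutoff time <= 15 and the variable name invest_xb are arbitrary choices for illustration, not part of the original analysis.

Code:

* Sketch only: Grunfeld data stand in for the panel discussed above
webuse grunfeld, clear
xtset company year

* Fit a quadratic time trend on the first 15 periods only
xtreg invest c.time##c.time mvalue kstock if time <= 15, fe vce(robust)

* Linear prediction (xb, without the unit-specific effects) for all
* periods, including the held-out ones
predict double invest_xb, xb

* Compare mean actual and mean extrapolated values in the held-out periods
tabstat invest invest_xb if time > 15, by(time) statistics(mean)

Because xb leaves out the unit-specific effects, the comparison is more meaningful for period averages than for individual units.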

Last edited by Andrew Musau; 05 Jul 2022, 08:26.
