
  • Extrapolation in fixed-effects models with time dummies

    Hello everyone!

    I opened a thread on my project a few days ago, and the answers there, along with some colleagues, have already helped me, so thanks for that.

    There is now a new question.

    Again, we're researching the mental health trajectories of immigrants and natives in Germany over time, using unbalanced panel data. We estimate within-person changes with two separate fixed-effects models, one for immigrants and one for natives, with mental health (mh) as the outcome and time dummies (2020, 2018, 2016, etc.; the survey runs every two years) as the exposure, controlling for age. We want to visualize the results in a combined coefplot.

    For that purpose, we created a continuous impact function.

    Code:
    gen ycov_index = syear - 2020
    Then we defined a dummy impact function for the years 2004 to 2020.

    Code:
    recode ycov_index              ///
           (min/-16 = 0 "2004")    ///
           (-14     = 1 "2006")    ///
           (-12     = 2 "2008")    ///
           (-10     = 3 "2010")    ///
           (-8      = 4 "2012")    ///
           (-6      = 5 "2014")    ///
           (-4      = 6 "2016")    ///
           (-2      = 7 "2018")    ///
           (0       = 8 "2020")    ///
           , gen(time_dummies)

    And then we estimated the two fixed-effect models.

    Code:
    xtset pid syear
    xtreg mh i.time_dummies c.age##c.age if immigrant == 0, fe vce(robust)
    margins time_dummies, post
    est store A
    
    xtreg mh i.time_dummies c.age##c.age if immigrant == 1, fe vce(robust)
    margins time_dummies, post
    est store B
    Last, we create the plot.

    Code:
    coefplot ///
        (A, label(Natives) offset(-0.05) ///
            msymbol(O) mcolor(black) msize(medlarge) ///
            lwidth(medthick) lcolor(black) ciopts(recast(rcap) lcolor(black))) ///
        (B, label(Immigrants) offset(0.05) ///
            msymbol(S) mcolor(gs9) ///
            lwidth(medthick) lcolor(gs9) ciopts(recast(rcap) lcolor(gs9))) ///
        , title("Comparison of natives and immigrants") ///
          vertical recast(connected) ytitle("") ///
          ylabel(48.5(0.5)51.5, labsize(small) angle(0)) ///
          xlabel(, labsize(vsmall) angle(0)) ///
          legend(size(small)) xtitle("Years before Covid-19", size(small)) ///
          xscale(titlegap(2)) ///
          graphregion(color(white) fcolor(white) icolor(white))

    We see a significant drop in mental health in 2020 for natives (but not for immigrants), but we have also seen that this could be part of a periodic up and down in mental health: it was just as low for natives in 2012 and 2004. Our idea now is to take the years 2004-2018, calculate expected values (extrapolations) for 2020, and compare them with the actual 2020 estimate from our fixed-effects models. What could be ways to do that?

    Thanks in advance,
    Henning

  • #2
    What sample sizes are we talking about here?


    • #3
      Originally posted by Andrew Musau
      What sample sizes are we talking about here?
      After all restrictions, the fixed-effects model for natives has 172,951 observations (47,145 individuals) and the model for immigrants has 43,159 observations (19,712 individuals).
      In both cases, the data are highly unbalanced.


      • #4
        The fitted values can always be computed out-of-sample. The prediction including the fixed effects can only be computed for in-sample observations.

        Code:
        webuse grunfeld, clear
        xtset company year
        xtreg invest mvalue kstock if time<=10, fe
        predict investhat, xb
        g invest2= cond(time<=10, invest, investhat)
        set scheme s1mono
        tw (line invest year if co==1) (line invest2 year if co==1, lp(dash)), leg(order (1 "Actual" 2 "Predicted"))
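
        The distinction between the two kinds of prediction can be checked directly. The following is a minimal sketch on the same grunfeld example, not part of the original reply; xb_hat and xbu_hat are hypothetical variable names. It assumes, as stated above, that predict with the xbu option is restricted to the estimation sample.

        ```stata
        * Sketch only: xb_hat and xbu_hat are illustrative names.
        webuse grunfeld, clear
        xtset company year

        * Estimate on the first 10 periods only.
        xtreg invest mvalue kstock if time <= 10, fe

        * Linear prediction without the fixed effect: in and out of sample.
        predict double xb_hat, xb

        * Linear prediction plus the company effect u_i.
        predict double xbu_hat, xbu

        * Compare how many held-out rows (time > 10) received each prediction.
        count if time > 10 & !missing(xb_hat)
        count if time > 10 & !missing(xbu_hat)
        ```

        If xbu is indeed limited to the estimation sample, the second count should be zero while the first covers every held-out row.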
        [Attached image: Graph.png]


        • #5
          Originally posted by Andrew Musau
          The fitted values can always be computed out-of-sample. The prediction including the fixed effects can only be computed for in-sample observations.
          Thanks a lot for your example! That also works for me, but I can't translate it to my own case, especially with the time dummies.


          • #6
            I have now also taken ycov_index, the variable from which I generated time_dummies (it runs from -18 to 0), and tried the following:

            Code:
            xtreg mh c.ycov_index c.age##c.age if immigrant == 1, fe vce(robust)
            margins, at(ycov_index = (-18(2)2))
            But the values are "not estimable". Why is that?


            • #7
              I do not understand why you are manually creating the time dummies.

              Code:
              xtset pid syear, delta(2)
              xtreg mh i.syear c.age##c.age if !immigrant, fe vce(robust)
              margins i.syear, post

              Originally posted by Henning Hinkers

              Thanks a lot for your example! That also works for me, but I can't translate it to my own case, especially with the time dummies.
              You need to be precise. What exactly have you typed and what is the Stata output?
              Last edited by Andrew Musau; 04 Jul 2022, 18:23.


              • #8
                Originally posted by Andrew Musau
                I do not understand why you are manually creating the time dummies.
                Thanks for your patience! The reason is that I want to estimate the effect of 2020 as an event on mental health (effects of events, dummy impact function).

                I would prefer to do that with margins, so this was my idea:

                Code:
                gen ycov_index = syear-2020
                
                drop if syear==2020
                ... since I do have data for 2020, but I want a prediction for that year so that I can then estimate the effect of the event. Further:

                Code:
                recode ycov_index          ///
                       (-18 = 0 "2002")    ///
                       (-16 = 1 "2004")    ///
                       (-14 = 2 "2006")    ///
                       (-12 = 3 "2008")    ///
                       (-10 = 4 "2010")    ///
                       (-8  = 5 "2012")    ///
                       (-6  = 6 "2014")    ///
                       (-4  = 7 "2016")    ///
                       (-2  = 8 "2018")    ///
                       , gen(time_dummies)
                
                xtreg mcs i.time_dummies c.age##c.age if immigrant == 0, fe vce(robust)
                margins i.time_dummies, at(time_dummies=(0(1)9)) noestimcheck
                The idea is that the value for time_dummies = 9 would give me a prediction for 2020.

                But Stata returns:

                Code:
                at values for factor time_dummies do not sum to 1
                Thx!


                • #9
                  You cannot get marginal effects for a factor level not present in the estimation sample. The best that you can do is to include a time trend and then predict future years based on the coefficient on the trend. If you suspect that the trend is not linear, you can experiment with other specifications.

                  Code:
                  xtset pid syear, delta(2)
                  xtreg mh c.syear c.age##c.age if !immigrant & syear<=2018, fe vce(robust)
                  predict mh_native, xb
                  gen mh2= cond(!immigrant & syear<=2018, mh, mh_native)
                  set scheme s1color
                  tw (line mh syear if !immigrant) (line mh_native syear if !immigrant, lp(dash)), xline(2018) leg(order (1 "Actual" 2 "Predicted"))
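
                  One way the comparison asked for in the first post might be set up on top of this trend model is sketched below. It is an illustration under assumptions, not a definitive method: mh, syear, age, immigrant and pid are the thread's variables, while xb_hat, u_hat, mh_pred and gap2020 are hypothetical names. Because the person effects are only estimated in sample, the sketch copies each person's effect to their 2020 row before forming the expected value.

                  ```stata
                  * Sketch only: xb_hat, u_hat, mh_pred and gap2020 are illustrative names.
                  xtset pid syear, delta(2)

                  * Fit the trend model on natives up to 2018 only.
                  xtreg mh c.syear c.age##c.age if !immigrant & syear <= 2018, fe vce(robust)

                  * Linear prediction: available for every native row, including 2020.
                  predict double xb_hat if !immigrant, xb

                  * Person effects are only available for the estimation sample ...
                  predict double u_hat if e(sample), u

                  * ... so copy each person's time-constant effect to their other rows.
                  * Sorting on u_hat places missing values last within pid.
                  bysort pid (u_hat): replace u_hat = u_hat[1] if missing(u_hat) & !immigrant

                  * Expected 2020 value from the pre-2020 trend vs. the observed value.
                  generate double mh_pred = xb_hat + u_hat
                  generate double gap2020 = mh - mh_pred if syear == 2020 & !immigrant
                  summarize gap2020
                  ```

                  Natives first observed in 2020 have no estimated person effect, so gap2020 stays missing for them. The summary also ignores the estimation uncertainty in the predictions, so it is a descriptive check rather than a test. Note too that with person fixed effects, age and syear move one-for-one within a person, so Stata may omit one of the linear terms; the prediction then rests on the combined slope.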
                  Last edited by Andrew Musau; 05 Jul 2022, 08:26.
