  • Panel: trend mixed with one time dummy vs. full set of time dummies

    Hi,

    I'm currently estimating the following model: y_it = x_it B + O_t + e_it

    with T=7

    The parameter of the time trend is of interest to me.

    In my first estimation I decomposed the time effect into time dummies. Every dummy is significant, but the coefficients for 2008 and 2011 are much higher, so the dummies do not completely follow a trend.

    In the second estimation I used a time trend, and the B are now far from those of my first estimation.

    I know that there is in fact a similar shock during the years 2008 and 2011.

    In my third estimation I used a time trend and one dummy for 2008 or 2011. Now my B are very close to those of the first estimation, and the dummy is significant and has the expected sign.

    Is my third estimation alright? And could I use a Hausman test between the first and third regressions to assess whether the difference in B is significant? What should I do, given that I cluster the standard errors and the Hausman test does not allow that?
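
Since a pure trend model is a restricted version of the year-dummy model, one possible alternative to a Hausman test that does work with clustered standard errors is a Wald test of the restrictions after the first regression. The sketch below is only an illustration: the variable names (y, x1, x2, years, indid) and the choice of 2007 as the omitted base year are assumptions, and the test concerns the shape of the time effects rather than the difference in B directly.

```stata
* Sketch only. Assumed names: y, x1, x2, years (2007-2013), indid.
* Assumption: 2007 is the omitted base year of i.years.
xtset indid years
xtreg y x1 x2 i.years, fe vce(cluster indid)

* A linear trend with slope d implies year effects d, 2d, ..., 6d
* relative to the base year, i.e. five linear restrictions on the
* six year coefficients. -test- uses the cluster-robust VCE.
test (_b[2009.years] = 2*_b[2008.years]) ///
     (_b[2010.years] = 3*_b[2008.years]) ///
     (_b[2011.years] = 4*_b[2008.years]) ///
     (_b[2012.years] = 5*_b[2008.years]) ///
     (_b[2013.years] = 6*_b[2008.years])
```

A rejection here would suggest that the trend-only specification restricts the time effects too much; it would not by itself say how much the B are affected.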

    Thanks!

    Jerome

  • #2
    Jerome:
    as per FAQ, please post what you typed and what Stata gave you back. Thanks.
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      . xtreg y x1 x2 i.years , fe cluster(indid)

      Fixed-effects (within) regression               Number of obs      =    956445
      Group variable: indid                           Number of groups   =    136635

      R-sq:  within  = 0.0215                         Obs per group: min =         7
             between = 0.0054                                        avg =       7.0
             overall = 0.0020                                        max =         7

                                                      F(8,136634)        =   2063.61
      corr(u_i, Xb)  = -0.3289                        Prob > F           =    0.0000

                                   (Std. Err. adjusted for 136635 clusters in indid)
      ------------------------------------------------------------------------------
                   |               Robust
                 y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
                x1 |   .0040815   .0000851    47.97   0.000     .0039147    .0042483
                x2 |   -.000029   5.74e-07   -50.47   0.000    -.0000301   -.0000278
                   |
             years |
             2008  |    .109485   .0014934    73.31   0.000     .1065579     .112412
             2009  |   .0762434   .0015167    50.27   0.000     .0732708    .0792161
             2010  |   .0613979   .0015669    39.18   0.000     .0583268    .0644689
             2011  |   .2647616    .002558   103.50   0.000      .259748    .2697752
             2012  |   .1871743   .0026023    71.93   0.000     .1820739    .1922748
             2013  |   .2361567   .0027295    86.52   0.000     .2308069    .2415064
                   |
             _cons |   .1363798   .0031749    42.96   0.000     .1301569    .1426026
      -------------+----------------------------------------------------------------
           sigma_u |   .2445819
           sigma_e |  .42651714
               rho |  .24746009   (fraction of variance due to u_i)
      ------------------------------------------------------------------------------

      . xtreg y x1 x2 trend, fe cluster(indid)

      Fixed-effects (within) regression               Number of obs      =    956445
      Group variable: indid                           Number of groups   =    136635

      R-sq:  within  = 0.0087                         Obs per group: min =         7
             between = 0.0058                                        avg =       7.0
             overall = 0.0035                                        max =         7

                                                      F(3,136634)        =   1503.64
      corr(u_i, Xb)  = -0.0605                        Prob > F           =    0.0000

                                   (Std. Err. adjusted for 136635 clusters in indid)
      ------------------------------------------------------------------------------
                   |               Robust
                 y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
                x1 |   .0014556   .0000731    19.92   0.000     .0013123    .0015988
                x2 |  -8.91e-06   4.66e-07   -19.11   0.000    -9.82e-06   -8.00e-06
             trend |   .0233577   .0004121    56.68   0.000       .02255    .0241654
             _cons |    .175378   .0033544    52.28   0.000     .1688034    .1819525
      -------------+----------------------------------------------------------------
           sigma_u |  .22624306
           sigma_e |  .42929863
               rho |  .21736546   (fraction of variance due to u_i)
      ------------------------------------------------------------------------------

      . xtreg y x1 x2 trend et ey, fe cluster(indid)

      Fixed-effects (within) regression               Number of obs      =    956445
      Group variable: indid                           Number of groups   =    136635

      R-sq:  within  = 0.0192                         Obs per group: min =         7
             between = 0.0054                                        avg =       7.0
             overall = 0.0026                                        max =         7

                                                      F(5,136634)        =   2897.09
      corr(u_i, Xb)  = -0.2595                        Prob > F           =    0.0000

                                   (Std. Err. adjusted for 136635 clusters in indid)
      ------------------------------------------------------------------------------
                   |               Robust
                 y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
                x1 |    .003388   .0000794    42.68   0.000     .0032324    .0035435
                x2 |  -.0000237   5.25e-07   -45.02   0.000    -.0000247   -.0000226
             trend |   .0346596   .0004731    73.27   0.000     .0337324    .0355867
                et |   .1033682   .0011771    87.82   0.000     .1010612    .1056753
                ey |   .0159174   .0018851     8.44   0.000     .0122226    .0196122
             _cons |   .0967299   .0034172    28.31   0.000     .0900324    .1034275
      -------------+----------------------------------------------------------------
           sigma_u |   .2380697
           sigma_e |  .42701454
               rho |  .23712457   (fraction of variance due to u_i)
      ------------------------------------------------------------------------------

      Actually, I have two dummies: et (2008 - 2011) and ey (2012-2013).

      Using model 2 alone would be good, but I don't think it is a good specification because the B change too much compared with model 1. Model 3 is much closer, but its B are probably still significantly different. I only care about the trend, and having the beta much nearer to those of model 1 seems more robust to me... (Also, I don't really care about the specific trend coefficient in model 2 versus model 3, so I am not pushing for model 3.)
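
One way to put a number on "model 3 is closer to model 1" while keeping the clustered standard errors might be to test, after model 1, the restrictions that model 3 places on the year effects. The sketch below rests on assumptions that are not confirmed in the thread: that trend increases by 1 per year, that 2007 is the base year, that et equals 1 in 2008 and 2011 only, and that ey equals 1 in 2012 and 2013 only. With a different coding of et or ey the restrictions would change.

```stata
* Sketch only; the coding of trend, et and ey below is an assumption.
* With trend slope d, et effect a and ey effect b, model 3 implies
* these year effects relative to 2007:
*   2008: d + a    2009: 2d       2010: 3d
*   2011: 4d + a   2012: 5d + b   2013: 6d + b
* Eliminating d, a and b leaves three linear restrictions.
xtreg y x1 x2 i.years, fe vce(cluster indid)

test (2*_b[2010.years] = 3*_b[2009.years])                     ///
     (2*_b[2011.years] - 2*_b[2008.years] = 3*_b[2009.years])  ///
     (2*_b[2013.years] - 2*_b[2012.years] = _b[2009.years])
```

With this many observations the test may well reject even small deviations, so the size of the change in the B between models 1 and 3 is probably worth reporting alongside the p-value.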

      Thanks


      • #4
        Jerome:
        some remarks about your model:
        - all the R-squared values are very low; are you sure that -fe- is the way to go?
        - models 1 and 2 are different because the predictors are different. Hence, I would base any comparison between them on the literature in your research field;
        - the previous remark holds for your last statement, too: you need to provide your audience with some theoretical underpinning to justify which model is "more robust" (and, by the way, "more robust" to what?)
        Kind regards,
        Carlo
        (Stata 19.0)


        • #5
          Hi, thanks for the answer,

          The R-squared values are very low because of the kind of dependent variable, which is mainly a binary variable with a lot of 0s. Unfortunately I cannot use random effect in a nonlinear model because of the bias due to incidental parameters, and there is correlation between the unobserved heterogeneity. Therefore the linear probability model is my best bet, but the R-squared has little chance of being very informative.

          What I mean by more robust is that using time fixed effects places no restriction on the relationship between the time effects and the x_it; this is why time fixed effects are usually preferred to a trend or other specifications. Thus, if I use a trend and dummies consistent with the theory and I find similar B, it tells me that I'm probably modeling the time variable correctly, or at least the part that is correlated with the x_it, no?

          In any case, I wonder whether, from a statistical point of view, using a time trend together with other year-specific dummy variables is a blunder, as long as we stay far from collinearity. It is probably a stupid question, because I have searched a lot for an answer with no success...
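
On the collinearity point, it may help to write the two specifications side by side. The notation below is an assumption for illustration: u_i is the individual fixed effect, theta_s are the year effects relative to a 2007 base year, tau_t is the trend variable, and et and ey are taken to be zero in 2007.

```latex
\begin{align*}
\text{Model 1:}\quad
  & y_{it} = x_{it}\beta + \sum_{s=2008}^{2013} \theta_s\,\mathbf{1}[t=s] + u_i + e_{it} \\
\text{Model 3:}\quad
  & y_{it} = x_{it}\beta + \delta\,\tau_t + a\,\mathrm{et}_t + b\,\mathrm{ey}_t + u_i + e_{it} \\
\text{Implied restriction:}\quad
  & \theta_s = \delta\,(\tau_s - \tau_{2007}) + a\,\mathrm{et}_s + b\,\mathrm{ey}_s,
    \qquad s = 2008,\dots,2013
\end{align*}
```

Under these assumptions the trend and the two dummies are functions of the year only, so model 3 describes six year effects with three parameters and is a restricted version of model 1. Perfect collinearity would arise only if the time-only regressors were linearly dependent among themselves and the constant, for example a trend plus a full set of six year dummies.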

          Thanks a lot!

          Jerome


          • #6
            Jerome:
            Thanks for providing more details.
            I'm not very familiar with the linear probability model; however, I fail to see whether you intend to include both the trend and the time dummies as predictors, or something else. Personally, I would choose one of those two alternatives.
            Kind regards,
            Carlo
            (Stata 19.0)


            • #7
              Well, imagine there is a trend but also a shock in two years. Using only the trend might be wrong because of the shock, and if you expect the shock to be correlated with the x_it, time fixed effects would indeed seem better; but the relevant parameter to evaluate is the trend.
              Earlier I made a mistake. To correct it, I meant:

              'Unfortunately I cannot use fixed effects in a nonlinear model because of the bias due to incidental parameters, and there is correlation between the unobserved heterogeneity, so random effects would be wrong.'


              • #8
                Jerome:
                thanks for providing (much) more details.
                If the literature in your research field supports that approach (i.e., trend + year dummies), well, follow that road.
                Kind regards,
                Carlo
                (Stata 19.0)
