  • Issue on yearly dummies (panel data)

    Hello everybody!

    I would like to ask about an issue I have with dummy variables (in panel data).

    Specifically, what is the difference between " yr* " and " i.year " ?

    To be more specific: I have panel data with N=500 and T=20 (from 1994 to 2013).

    I want to analyze the impact of yearly dummies and income on consumption.

    I start with a pooled OLS regression:

    “ reg consumption income yr* ” (which is the same as “ reg consumption income yr1 yr2 yr3 … yr20 ”).

    However, when I carry out:

    “ reg consumption income i.year ”

    I obtain different results. Moreover, in this case the results table shows a variable “year” whose categories start from 2000 instead of 1994.

    The same thing if I carry out a FE regression:

    “ xtreg consumption income yr* ”
    and
    “ xtreg consumption income i.year ”

    In this case too, the two commands give different results.

    So what is the difference between yr* and i.year? Why aren't they the same? Both should capture the effect of (yearly) time on consumption.

    So why do I obtain different results?

    Could someone please kindly help me to understand this? I would be really grateful.

    Thank you very much!

    K

  • #2
    Kodi,

    First, remember that -xtreg- without the -fe- option defaults to a random effects regression.

    About the factors, maybe it's because i.year omits the base year? If you use ibn.year instead, you would get all years, although one of them will be dropped due to collinearity.
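The two points above can be tried on a toy panel. This is only a sketch: the data are simulated (not the poster's), and the variable names simply mirror the question.

```stata
* Simulated toy panel (an assumption for illustration): 10 units, 5 years
clear
set seed 2718
set obs 50
generate id = ceil(_n/5)
bysort id: generate year = 1993 + _n
bysort id: generate u = rnormal() if _n == 1
bysort id: replace u = u[1]
generate income = rnormal(50, 10)
generate consumption = 5 + 0.6*income + 0.3*(year - 1993) + u + rnormal()
xtset id year

* No option given: xtreg fits the random-effects model
xtreg consumption income i.year
display "`e(model)'"

* Fixed effects has to be requested explicitly
xtreg consumption income i.year, fe
display "`e(model)'"

* ibn.year requests no base level; one year is still dropped
* because of collinearity
xtreg consumption income ibn.year, fe
```

The stored result e(model) records which estimator was used, which is a quick way to confirm what a given xtreg call actually fitted.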

    Best,
    S


    • #3
      Dear Sergio,

      thank you for your reply.

      Yes, I forgot the fe option; I meant to write "xtreg consumption income yr*, fe".

      However, I did not get an answer to my question, i.e. what is the difference between yr* and i.year?

      Moreover, even when using ibn.year, I only get one more year in the table, while the other 5 years are still missing.


      • #4
        Kodi,
        The fact that you are not obtaining all the year dummies as you do in the other case might indicate that the yr* variables you have and the dummies created by i.year are not the same. Have you explored your data to see how they correlate?
        Perhaps something like:
        tabstat yr*, by(year)
        will give you some idea.
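A slightly more explicit version of that check is sketched below. It assumes the dummies are named yr1 to yr20 and correspond to 1994 to 2013 in order; the toy data are generated only so that the snippet runs on its own.

```stata
* Toy data (an assumption for illustration): two observations per year
clear
set seed 101
set obs 40
generate year = 1994 + mod(_n - 1, 20)
quietly tabulate year, generate(yr)      // creates yr1 ... yr20

* See exactly which variables the wildcard yr* expands to
describe yr*

* Each yr`k' should be 1 exactly when year == 1993 + k
forvalues k = 1/20 {
    quietly count if yr`k' != (year == 1993 + `k')
    display "yr`k': " r(N) " mismatches"
}

* After any estimation command, list the years actually used
generate y = rnormal()
quietly regress y i.year
tabulate year if e(sample)
```

If the last table starts later than the first year in the data, the earlier years dropped out of the estimation sample, for example because of missing values in the other variables.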
        HTH
        F


        • #5
          Kodi: Sergio already answered your question (and Fernando as well); perhaps you just did not understand the responses. Conveniently, I have some panel data that I generated previously for illustration, which can help you understand the difference.


          Code:
          clear 
          input y x1 x2 x3 x4 id year
          78 19  45 44  15   1 1
          23 19  17 47  72   1 2
          10 19  32 62  65   1 3
          34 19  11 21  20   1 4
          77 19  42 23  100  1 5
          91 55  12 13  14   2 1
          62 55  27 37  47   2 2
          33 55  13 14  15   2 3
          16 55  58 68  78   2 4
          99 55  80 90  70   2 5
          20 51  18 62  82   3 1
          38 39  39 11  63   3 2
          40 87  46 93  90   3 3
          56 03  64 80  28   3 4
          73 200 88 103  36  3 5
          115 70  85 18  85  4 1
          49 51  67 22 76    4 2
          57 28  49 26  96   4 3
          74 32  31 41  77   4 4
          110 16  12 60  80  4 5
          24 112  26 20  26  5 1
          111 123 81 82  37  5 2
          64 45  59 39  49   5 3
          39 72  31 29  92   5 4
          79 80  16 77  107  5 5
          37 47  19 89  12   6 1
          23 61  38 45  22   6 2
          32 83  82 83  66   6 3
          120 115 91 116 108 6 4
          7 150  54 93  72   6 5
          92 28  30 41  90   7 1
          100 28 40 96 102   7 2
          108 28  50 29  59  7 3
          116 28  60 42  76  7 4
          128 28  70 80  94  7 5
          39 7  55 103  106  8 1
          51 50  27 98  62   8 2
          73 61  19 81  74   8 3
          94 86  112 99  53  8 4
          103 99  67 102  10 8 5
          89 80  105 54  69  9 1
          62 90  97 108  62  9 2
          13 100  102 92  39 9 3
          100 110 81 66  85  9 4
          115 120 92 50  67  9 5
          40 37  75 19  14  10 1
          92 65  87 5  34   10 2
          56 72  92 15  40  10 3
          119 128  23 21 63 10 4
          84 80  82 67  29  10 5
          end
          We can now generate year dummies, as you have:


          Code:
          tab year, gen(year)
          So we have year1 to year5 in this case.

          Code:
           
             year |      Freq.     Percent        Cum.
          ------------+-----------------------------------
                    1 |         10       20.00       20.00
                    2 |         10       20.00       40.00
                    3 |         10       20.00       60.00
                    4 |         10       20.00       80.00
                    5 |         10       20.00      100.00
          ------------+-----------------------------------
                Total |         50      100.00
          .
          Now what's the difference between i.year and year*? As you recall, with dummy variables you always have to drop one dummy so that you do not fall into the dummy variable trap. With i.year, Stata omits the first year (the base level) and runs the regression:


          Code:
          xtset id year
          xtreg y x* i.year,fe
          Output

          Code:
          . 
          
          Fixed-effects (within) regression               Number of obs      =        50
          Group variable: id                              Number of groups   =        10
          
          R-sq:  within  = 0.2926                         Obs per group: min =         5
                 between = 0.0136                                        avg =       5.0
                 overall = 0.1888                                        max =         5
          
                                                          F(8,32)            =      1.65
          corr(u_i, Xb)  = -0.1009                        Prob > F           =    0.1485
          
          ------------------------------------------------------------------------------
                     y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                    x1 |   .1443385   .1560705     0.92   0.362    -.1735668    .4622437
                    x2 |   .2559484   .1982288     1.29   0.206    -.1478303    .6597272
                    x3 |  -.0618487    .214448    -0.29   0.775     -.498665    .3749677
                    x4 |   .0251512   .1788148     0.14   0.889    -.3390827    .3893851
                       |
                  year |
                    2  |   -3.37898   13.61836    -0.25   0.806    -31.11866     24.3607
                    3  |  -16.59534   13.63284    -1.22   0.232    -44.36454    11.17386
                    4  |   10.21783   14.08045     0.73   0.473    -18.46312    38.89877
                    5  |   18.03578   15.46857     1.17   0.252    -13.47266    49.54422
                       |
                 _cons |   44.74023   17.97087     2.49   0.018     8.134777    81.34569
          -------------+----------------------------------------------------------------
               sigma_u |  20.884301
               sigma_e |  30.034388
                   rho |   .3259214   (fraction of variance due to u_i)
          ------------------------------------------------------------------------------
          F test that all u_i=0:     F(9, 32) =     1.83               Prob > F = 0.1017
          If you proceed manually using year*, then you are telling Stata to use all year variables, i.e. year1, year2,..., yearN. However, Stata will automatically omit one year dummy to avoid the dummy variable trap. Which dummy is omitted may not be exactly the same as the one under i.year, and in this case, you may get different coefficients for the dummies (not your regressors). So make sure that the omitted variable using year* is exactly the same as the one under i.year if you want identical coefficients for the dummies.
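The link between the two sets of dummy coefficients can be written out. This is a sketch in notation that is not in the original post: $\delta^{(b)}_t$ is the coefficient on the year-$t$ dummy when year $b$ is the omitted one, $\alpha^{(b)}$ is the reported constant, and $\beta$ collects the coefficients on the regressors.

```latex
\begin{aligned}
y_{it} &= \alpha^{(b)} + x_{it}'\beta + \sum_{t' \neq b} \delta^{(b)}_{t'} D_{t'} + u_i + \varepsilon_{it} \\
\delta^{(b)}_t &= \delta^{(1)}_t - \delta^{(1)}_b, \qquad \delta^{(1)}_1 = 0 \\
\alpha^{(b)} &= \alpha^{(1)} + \delta^{(1)}_b
\end{aligned}
```

The fitted values and $\beta$ do not depend on $b$; only the reference point moves. With the numbers from the outputs in this post, omitting year 2 instead of year 1 gives $\delta^{(2)}_3 = -16.59534 - (-3.37898) = -13.21636$ and $\alpha^{(2)} = 44.74023 + (-3.37898) = 41.36125$.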


          Code:
          * rename the original variable so that year* matches only year1-year5
          rename year yr
          xtreg y x* year*, fe
          Output

          Code:
          note: year1 omitted because of collinearity
          
          Fixed-effects (within) regression               Number of obs      =        50
          Group variable: id                              Number of groups   =        10
          
          R-sq:  within  = 0.2926                         Obs per group: min =         5
                 between = 0.0136                                        avg =       5.0
                 overall = 0.1888                                        max =         5
          
                                                          F(8,32)            =      1.65
          corr(u_i, Xb)  = -0.1009                        Prob > F           =    0.1485
          
          ------------------------------------------------------------------------------
                     y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                    x1 |   .1443385   .1560705     0.92   0.362    -.1735668    .4622437
                    x2 |   .2559484   .1982288     1.29   0.206    -.1478303    .6597272
                    x3 |  -.0618487    .214448    -0.29   0.775     -.498665    .3749677
                    x4 |   .0251512   .1788148     0.14   0.889    -.3390827    .3893851
                 year1 |  (omitted)
                 year2 |   -3.37898   13.61836    -0.25   0.806    -31.11866     24.3607
                 year3 |  -16.59534   13.63284    -1.22   0.232    -44.36454    11.17386
                 year4 |   10.21783   14.08045     0.73   0.473    -18.46312    38.89877
                 year5 |   18.03578   15.46857     1.17   0.252    -13.47266    49.54422
                 _cons |   44.74023   17.97087     2.49   0.018     8.134777    81.34569
          -------------+----------------------------------------------------------------
               sigma_u |  20.884301
               sigma_e |  30.034388
                   rho |   .3259214   (fraction of variance due to u_i)
          ------------------------------------------------------------------------------
          F test that all u_i=0:     F(9, 32) =     1.83               Prob > F = 0.1017

          In this case, year1 is automatically omitted (as under i.year), so the estimates are identical. Let us omit year2 instead:



          Code:
          xtreg y x* year1 year3 year4 year5, fe

          Code:
          Fixed-effects (within) regression               Number of obs      =        50
          Group variable: id                              Number of groups   =        10
          
          R-sq:  within  = 0.2926                         Obs per group: min =         5
                 between = 0.0136                                        avg =       5.0
                 overall = 0.1888                                        max =         5
          
                                                          F(8,32)            =      1.65
          corr(u_i, Xb)  = -0.1009                        Prob > F           =    0.1485
          
          ------------------------------------------------------------------------------
                     y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                    x1 |   .1443385   .1560705     0.92   0.362    -.1735668    .4622437
                    x2 |   .2559484   .1982288     1.29   0.206    -.1478303    .6597272
                    x3 |  -.0618487    .214448    -0.29   0.775     -.498665    .3749677
                    x4 |   .0251512   .1788148     0.14   0.889    -.3390827    .3893851
                 year1 |    3.37898   13.61836     0.25   0.806     -24.3607    31.11866
                 year3 |  -13.21636   13.45437    -0.98   0.333    -40.62202     14.1893
                 year4 |   13.59681   13.61522     1.00   0.325    -14.13649    41.33011
                 year5 |   21.41476   14.52492     1.47   0.150    -8.171545    51.00106
                 _cons |   41.36125   19.82699     2.09   0.045     .9749953    81.74751
          -------------+----------------------------------------------------------------
               sigma_u |  20.884301
               sigma_e |  30.034388
                   rho |   .3259214   (fraction of variance due to u_i)
          ------------------------------------------------------------------------------
          F test that all u_i=0:     F(9, 32) =     1.83               Prob > F = 0.1017
          Coefficients for the dummies and intercept are different, but coefficient estimates of the regressors still stay the same. Now the question is, why would you be interested in the intercept and dummy coefficients in the first place? They are not meaningful in fixed effects regressions.
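The same comparison can be made without building the dummies by hand, because factor-variable notation lets you pick the base level. The sketch below uses simulated data (an assumption, not the panel above) and the ib2. prefix to make year 2 the base.

```stata
* Simulated toy panel (an assumption for illustration): 10 units, 5 years
clear
set seed 12345
set obs 50
generate id = ceil(_n/5)
bysort id: generate year = _n
generate x = rnormal()
generate y = 1 + 0.5*x + 2*year + rnormal()
xtset id year

* Default base level: the first year
xtreg y x i.year, fe
scalar bx_base1  = _b[x]
scalar d32_base1 = _b[3.year] - _b[2.year]
lincom 3.year - 2.year          // year 3 relative to year 2

* Same model with year 2 as the base level
xtreg y x ib2.year, fe
display _b[3.year] - d32_base1  // should be zero up to rounding
display _b[x] - bx_base1        // should be zero up to rounding
```

Differences between year coefficients, such as the one reported by lincom, do not depend on which year is omitted, so they are the quantities to compare across specifications.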




          • #6
            Thank you Andrew!

            You are a king!


            • #7
              Thank you also Fernando and Sergio!
