
  • #16
    Carl:
    replicating your code (with some tweaks to match the way the predictors are named in the -nlswork.dta- file), we have:
    Code:
    . use "https://www.stata-press.com/data/r17/nlswork.dta"
    (National Longitudinal Survey of Young Women, 14-24 years old in 1968)
    
    
    . reg ln_wage union grade tenure i.year, robust
    
    Linear regression                               Number of obs     =     19,008
                                                    F(14, 18993)      =     616.93
                                                    Prob > F          =     0.0000
                                                    R-squared         =     0.3039
                                                    Root MSE          =     .38985
    
    ------------------------------------------------------------------------------
                 |               Robust
         ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
           union |   .1514804     .00672    22.54   0.000     .1383087    .1646522
           grade |   .0790582   .0012815    61.69   0.000     .0765464      .08157
          tenure |   .0300961   .0007711    39.03   0.000     .0285846    .0316075
                 |
            year |
             71  |   .0274538   .0161976     1.69   0.090    -.0042949    .0592025
             72  |   .0292378   .0152852     1.91   0.056    -.0007225    .0591981
             73  |   .0166856   .0159472     1.05   0.295    -.0145722    .0479435
             77  |  -.0133802   .0144448    -0.93   0.354    -.0416934     .014933
             78  |   .0456363   .0152336     3.00   0.003     .0157771    .0754955
             80  |   .0036626   .0151359     0.24   0.809    -.0260052    .0333304
             82  |  -.0103459   .0148349    -0.70   0.486    -.0394237    .0187319
             83  |   .0099641   .0156993     0.63   0.526    -.0208079     .040736
             85  |   .0412243   .0154251     2.67   0.008     .0109898    .0714588
             87  |   .0410003   .0155188     2.64   0.008     .0105822    .0714185
             88  |    .048031   .0163018     2.95   0.003      .016078    .0799841
                 |
           _cons |   .5723746   .0195216    29.32   0.000     .5341106    .6106386
    ------------------------------------------------------------------------------
    
    . xtreg ln_wage union grade tenure i.year, fe robust
    note: grade omitted because of collinearity.
    
    Fixed-effects (within) regression               Number of obs     =     19,008
    Group variable: idcode                          Number of groups  =      4,132
    
    R-squared:                                      Obs per group:
         Within  = 0.1282                                         min =          1
         Between = 0.1610                                         avg =        4.6
         Overall = 0.1340                                         max =         12
    
                                                    F(13,4131)        =      97.42
    corr(u_i, Xb) = 0.1429                          Prob > F          =     0.0000
    
                                 (Std. err. adjusted for 4,132 clusters in idcode)
    ------------------------------------------------------------------------------
                 |               Robust
         ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
           union |   .1004918    .009703    10.36   0.000     .0814687    .1195149
           grade |          0  (omitted)
          tenure |   .0172592   .0011662    14.80   0.000     .0149727    .0195456
                 |
            year |
             71  |   .0257666   .0106191     2.43   0.015     .0049475    .0465858
             72  |   .0286456   .0120102     2.39   0.017     .0050991    .0521921
             73  |   .0279679   .0132397     2.11   0.035     .0020109    .0539249
             77  |   .0556208   .0144188     3.86   0.000     .0273522    .0838894
             78  |   .0936785   .0149516     6.27   0.000     .0643652    .1229918
             80  |   .0773018   .0154508     5.00   0.000     .0470099    .1075937
             82  |   .0906583   .0156842     5.78   0.000     .0599089    .1214077
             83  |   .1130978   .0160829     7.03   0.000     .0815667    .1446289
             85  |   .1470453   .0164796     8.92   0.000     .1147363    .1793542
             87  |    .166594   .0175077     9.52   0.000     .1322694    .2009186
             88  |   .1921114     .01866    10.30   0.000     .1555279     .228695
                 |
           _cons |   1.566306   .0121762   128.64   0.000     1.542434    1.590178
    -------------+----------------------------------------------------------------
         sigma_u |   .4055671
         sigma_e |  .25625658
             rho |  .71467812   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    
    .
    Some comments on the above follow:
    1) OLS code: it treats all observations as independent. Standard errors (SEs) take only heteroskedasticity into account. No demeaning is applied.
    2) -xtreg,fe- code: it takes the panel structure of your dataset into account. SEs take both heteroskedasticity and serial correlation into account (while -robust- and -vce(cluster idcode)- can be used interchangeably with -xtreg-, this does not hold for -regress-). Demeaning is applied; therefore, the panel mean of a time-invariant variable (such as -race-, or -grade- in the output above) equals the variable itself, the subtraction yields 0, and no coefficient is returned.
    Kind regards,
    Carlo
    (Stata 19.0)
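
The two points in this post can be checked by hand. The following is a minimal sketch rather than part of the original reply: it assumes the same -nlswork.dta- file, and -insample-, -m_grade-, -dm_grade- and the scalar names are hypothetical helpers introduced only for illustration.

```stata
* Sketch only: helper names below are hypothetical, not from the thread.
use "https://www.stata-press.com/data/r17/nlswork.dta", clear
xtset idcode

* Under -xtreg, fe-, vce(robust) and vce(cluster idcode) should coincide.
xtreg ln_wage union grade tenure i.year, fe vce(robust)
scalar se_rob = _se[union]
xtreg ln_wage union grade tenure i.year, fe vce(cluster idcode)
scalar se_clu = _se[union]
display "xtreg,fe SE(union): robust = " se_rob "  cluster = " se_clu

* Under -regress-, clustering on idcode changes the standard errors.
regress ln_wage union grade tenure i.year, vce(robust)
regress ln_wage union grade tenure i.year, vce(cluster idcode)

* Demean -grade- by hand within the estimation sample.
generate byte insample = e(sample)
bysort idcode: egen double m_grade = mean(grade) if insample
generate double dm_grade = grade - m_grade if insample

* If -grade- is time-invariant within person, dm_grade is 0 everywhere,
* which is why -xtreg, fe- cannot estimate its coefficient.
summarize dm_grade
```

Comparing the coefficient tables of the two -regress- calls shows how much serial correlation within -idcode- matters for the SEs once the panel structure is acknowledged.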



    • #17
      Hi Carlo, thanks. I just found the following on page 437 of "Introductory Econometrics, 6th Edition" by Wooldridge:


      "When we include a full set of year dummies—that is, year dummies for all years but the first—

      we cannot estimate the effect of any variable whose change across time is constant. An example is years of experience in a panel data set where each person works in every year, so that experience always increases by one in each year, for every person in the sample. The presence of a_i accounts for differences across people in their years of experience in the initial time period. But then the effect of a one-year increase in experience cannot be distinguished from the aggregate time effects (because experience increases by the same amount for everyone)."

      Could this be the explanation for why one of the variables is removed in the FE estimator? Is there multicollinearity when all (8) variables are kept in the model, so that one variable is removed to get around it? (Wooldridge used all year dummy variables, which is why experience drops out for him and 1987 for me.)
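
The mechanism in the Wooldridge passage can be reproduced with a constructed variable. This is a minimal sketch under stated assumptions: it uses the -nlswork.dta- file from post #16 rather than the data discussed in this post, and -exp_lin-, -m_exp-, -m_year- and -gap- are hypothetical names. -exp_lin- is built to rise one-for-one with calendar time, which may not be exactly how -exp- behaves in the original data.

```stata
* Sketch only: exp_lin, m_exp, m_year and gap are hypothetical names.
use "https://www.stata-press.com/data/r17/nlswork.dta", clear
xtset idcode

* Years elapsed since each person's first survey year.
bysort idcode (year): generate double exp_lin = year - year[1]

* After within-person demeaning, exp_lin is identical to demeaned year,
* and year is an exact linear combination of the year dummies.
bysort idcode: egen double m_exp  = mean(exp_lin)
bysort idcode: egen double m_year = mean(year)
generate double gap = (exp_lin - m_exp) - (year - m_year)
summarize gap

* With person fixed effects and a full set of year dummies, Stata should
* report one more term omitted because of collinearity.
xtreg ln_wage union tenure exp_lin i.year, fe vce(robust)
```

Which term Stata drops (the constructed variable or one of the year dummies) depends on how it resolves the collinearity, so the omitted term may differ from the 1987 dummy discussed in this thread.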



      • #18
        Carl:
        you had two different omissions in your first -xtreg,fe- code:
        1)-edu-, as it was time-invariant;
        2) 1987 in -i.year- was omitted to protect your analysis from the so-called "dummy trap" (https://en.wikipedia.org/wiki/Dummy_...le_(statistics)). BTW: this omission was decided by Stata, but you can manage it using the -ib#.- prefix from -fvvarlist- notation.
        Kind regards,
        Carlo
        (Stata 19.0)
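
The -ib#.- prefix mentioned in this post lets you choose the reference year yourself. A minimal sketch, assuming the -nlswork.dta- file from post #16 (not the data discussed here); the base year 88 and the scalar names are picked only for illustration.

```stata
* Sketch only: scalar names are hypothetical; 88 is an arbitrary base year.
use "https://www.stata-press.com/data/r17/nlswork.dta", clear
xtset idcode

* Default: Stata uses the lowest year level as the base category.
xtreg ln_wage union tenure i.year, fe vce(robust)
scalar r2_default = e(r2_w)

* ib88. makes 88 the base category instead.
xtreg ln_wage union tenure ib88.year, fe vce(robust)
scalar r2_ib88 = e(r2_w)

* ib(last). picks the highest level without hard-coding its value.
xtreg ln_wage union tenure ib(last).year, fe vce(robust)

* Changing the base only re-expresses the year effects; the fit is unchanged.
display "within R2: default = " r2_default "  ib88 = " r2_ib88
```

Whichever year is chosen as the base, one level still has to be left out to avoid the dummy trap; the prefix only controls which one.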



        • #19
          Originally posted by Carlo Lazzaro
          Carl:
          you had two different omissions in your first -xtreg,fe- code:
          1)-edu-, as it was time-invariant;
          2) 1987 in -i.year- was omitted to protect your analysis from the so-called "dummy trap" (https://en.wikipedia.org/wiki/Dummy_...le_(statistics)). BTW: this omission was decided by Stata, but you can manage it using the -ib#.- prefix from -fvvarlist- notation.
          Thanks for your answer. I get all your points, but the year 1980 was already removed to prevent the dummy trap. And if 1987 had been removed because of the dummy variable trap, shouldn't it also have been removed in a "normal" regression? I don't think it was. Don't you think that Wooldridge's explanation would be logical?
          Sorry to be so pushy, but I still don't quite get why two year dummies were removed.



          • #20
            Carl:
            as far as the omission of 1987 in your second code (ie, when -exp- was plugged into the right-hand side of your regression equation) is concerned, Jeff's explanation makes sense, as 1987 was in all likelihood perfectly collinear with -exp-.
            Therefore, in your second code, the omission of -1980- was meant to avoid the dummy trap, whereas the omission of -1987- was due to perfect collinearity with -exp-.
            Kind regards,
            Carlo
            (Stata 19.0)
