Why the different bootstrap errors in panel data random effects?

Alfonso Sánchez-Peñalver

31 Oct 2019, 09:33

Hi,

In help xt_vce_options I found the following recommendation:

When working with panel-data models, we strongly encourage you to use the vce(bootstrap) or vce(jackknife) options instead of the corresponding prefix command.

Of course, this piqued my curiosity, and I wondered whether the recommendation exists because of the clustered nature of the data, to avoid mistakes when using the prefix, or whether there is something more to it. So I decided to try it out. With a fixed-effects estimation, I found no difference between the two methods:

Code:

. clear all

. set more off

. webuse nlswork
(National Longitudinal Survey.  Young Women 14-26 years of age in 1968)

. local xv "c.age##c.age c.ttl_exp##c.ttl_exp south"

. xtset idcode year
       panel variable:  idcode (unbalanced)
        time variable:  year, 68 to 88, but with gaps
                delta:  1 unit

. xtreg ln_w `xv', fe vce(boot, reps(50) seed(1234))
(running xtreg on estimation sample)

Bootstrap replications (50)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
..................................................    50

Fixed-effects (within) regression               Number of obs     =     28,502
Group variable: idcode                          Number of groups  =      4,710

R-sq:                                           Obs per group:
     within  = 0.1546                                         min =          1
     between = 0.2856                                         avg =        6.1
     overall = 0.2149                                         max =         15

                                                Wald chi2(5)      =    1521.76
corr(u_i, Xb)  = 0.1348                         Prob > chi2       =     0.0000

                                     (Replications based on 4,710 clusters in idcode)
-------------------------------------------------------------------------------------
                    |   Observed   Bootstrap                         Normal-based
            ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------------+----------------------------------------------------------------
                age |   .0291285   .0055949     5.21   0.000     .0181626    .0400943
                    |
        c.age#c.age |  -.0006749   .0000913    -7.39   0.000    -.0008539    -.000496
                    |
            ttl_exp |   .0617062   .0035824    17.22   0.000     .0546848    .0687275
                    |
c.ttl_exp#c.ttl_exp |   -.000893   .0001529    -5.84   0.000    -.0011927   -.0005933
                    |
              south |  -.0684464   .0200641    -3.41   0.001    -.1077714   -.0291214
              _cons |   1.126962   .0780397    14.44   0.000     .9740066    1.279917
--------------------+----------------------------------------------------------------
            sigma_u |  .36581516
            sigma_e |  .29463102
                rho |  .60654417   (fraction of variance due to u_i)
-------------------------------------------------------------------------------------

. bs, reps(50) cl(idcode) id(cid) group(year) seed(1234): xtreg ln_w `xv', fe
(running xtreg on estimation sample)

Bootstrap replications (50)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
..................................................    50

Fixed-effects (within) regression               Number of obs     =     28,502
Group variable: idcode                          Number of groups  =      4,710

R-sq:                                           Obs per group:
     within  = 0.1546                                         min =          1
     between = 0.2856                                         avg =        6.1
     overall = 0.2149                                         max =         15

                                                Wald chi2(5)      =    1521.76
corr(u_i, Xb)  = 0.1348                         Prob > chi2       =     0.0000

                                     (Replications based on 4,710 clusters in idcode)
-------------------------------------------------------------------------------------
                    |   Observed   Bootstrap                         Normal-based
            ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------------+----------------------------------------------------------------
                age |   .0291285   .0055949     5.21   0.000     .0181626    .0400943
                    |
        c.age#c.age |  -.0006749   .0000913    -7.39   0.000    -.0008539    -.000496
                    |
            ttl_exp |   .0617062   .0035824    17.22   0.000     .0546848    .0687275
                    |
c.ttl_exp#c.ttl_exp |   -.000893   .0001529    -5.84   0.000    -.0011927   -.0005933
                    |
              south |  -.0684464   .0200641    -3.41   0.001    -.1077714   -.0291214
              _cons |   1.126962   .0780397    14.44   0.000     .9740066    1.279917
--------------------+----------------------------------------------------------------
            sigma_u |  .36581516
            sigma_e |  .29463102
                rho |  .60654417   (fraction of variance due to u_i)
-------------------------------------------------------------------------------------

Fitting random effects, however, presents a different picture:

Code:

. xtreg ln_w `xv', re vce(boot, reps(50) seed(1234))
(running xtreg on estimation sample)

Bootstrap replications (50)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
..................................................    50

Random-effects GLS regression                   Number of obs     =     28,502
Group variable: idcode                          Number of groups  =      4,710

R-sq:                                           Obs per group:
     within  = 0.1538                                         min =          1
     between = 0.2971                                         avg =        6.1
     overall = 0.2249                                         max =         15

                                                Wald chi2(5)      =    2034.78
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

                                     (Replications based on 4,710 clusters in idcode)
-------------------------------------------------------------------------------------
                    |   Observed   Bootstrap                         Normal-based
            ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------------+----------------------------------------------------------------
                age |   .0325329   .0050847     6.40   0.000     .0225671    .0424987
                    |
        c.age#c.age |  -.0007202   .0000839    -8.58   0.000    -.0008847   -.0005557
                    |
            ttl_exp |   .0639336   .0027724    23.06   0.000     .0584998    .0693674
                    |
c.ttl_exp#c.ttl_exp |   -.000943   .0001341    -7.03   0.000    -.0012059   -.0006801
                    |
              south |  -.1253318   .0116613   -10.75   0.000    -.1481877    -.102476
              _cons |    1.08762   .0695168    15.65   0.000     .9513691     1.22387
--------------------+----------------------------------------------------------------
            sigma_u |  .31293049
            sigma_e |  .29463102
                rho |  .53009223   (fraction of variance due to u_i)
-------------------------------------------------------------------------------------

. bs, reps(50) cl(idcode) id(cid) group(year) seed(1234): xtreg ln_w `xv', re
(running xtreg on estimation sample)

Bootstrap replications (50)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
..................................................    50

Random-effects GLS regression                   Number of obs     =     28,502
Group variable: idcode                          Number of groups  =      4,710

R-sq:                                           Obs per group:
     within  = 0.1538                                         min =          1
     between = 0.2971                                         avg =        6.1
     overall = 0.2249                                         max =         15

                                                Wald chi2(5)      =    1979.20
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

                                     (Replications based on 4,710 clusters in idcode)
-------------------------------------------------------------------------------------
                    |   Observed   Bootstrap                         Normal-based
            ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------------+----------------------------------------------------------------
                age |   .0325329   .0052315     6.22   0.000     .0222793    .0427865
                    |
        c.age#c.age |  -.0007202   .0000858    -8.39   0.000    -.0008884    -.000552
                    |
            ttl_exp |   .0639336    .002964    21.57   0.000     .0581244    .0697429
                    |
c.ttl_exp#c.ttl_exp |   -.000943   .0001408    -6.70   0.000     -.001219    -.000667
                    |
              south |  -.1253318     .01277    -9.81   0.000    -.1503606   -.1003031
              _cons |    1.08762   .0719991    15.11   0.000     .9465039    1.228735
--------------------+----------------------------------------------------------------
            sigma_u |  .31293049
            sigma_e |  .29463102
                rho |  .53009223   (fraction of variance due to u_i)
-------------------------------------------------------------------------------------

The question, then, is why? It seems odd that the two methods give identical bootstrap standard errors with fixed effects but different ones with random effects (for example, .0050847 with vce(boot) versus .0052315 with the prefix for age). This is unfortunate, because there may be applications where the statistic we want to bootstrap is not simply the standard errors of the coefficients but some post-estimation quantity based on them, and that requires the prefix command. In any case, can someone explain why the results differ?
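To make the post-estimation case concrete, here is a minimal sketch of bootstrapping a derived statistic with the prefix command. It is meant to be run as a do-file. The program name turnpoint_re, the variable newid, and the statistic itself (the age at which the quadratic age profile peaks) are illustrative assumptions and do not come from the runs above. The sketch also copies idcode into newid and declares newid as the panel variable, so that a cluster drawn more than once enters each replication as a separate panel. Whether this setup reproduces the vce(boot) standard errors has not been checked here.

```stata
* Sketch only: bootstrap a post-estimation statistic with the prefix command.
* Assumed names (hypothetical): turnpoint_re, newid, r(peak).
clear all
webuse nlswork

* Hypothetical wrapper: fit the RE model and return the age at which the
* quadratic age profile peaks, -b_age / (2 * b_age^2).
program define turnpoint_re, rclass
    xtreg ln_wage c.age##c.age c.ttl_exp##c.ttl_exp south, re
    return scalar peak = -_b[age] / (2 * _b[c.age#c.age])
end

* Copy the panel ID so that bootstrap can overwrite it in each replication,
* giving every resampled cluster its own panel identifier.
generate long newid = idcode
xtset newid year

* Resample whole panels (clusters of idcode); idcluster() refreshes newid.
bootstrap peak = r(peak), reps(50) seed(1234) ///
    cluster(idcode) idcluster(newid): turnpoint_re
```

Here r(peak) is only a stand-in for whatever post-estimation quantity is of interest. The choice to xtset on newid rather than idcode is the one design decision that differs from the prefix calls above, which left idcode as the panel variable.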

Thanks!!!

Alfonso Sanchez-Penalver
