Bootstrapping with dummies

Alexander Marx

Join Date: Jul 2019
Posts: 17

30 May 2020, 02:52

Dear Stata Forum,
I have data with a small number of clusters but a large number of individuals within each cluster.
I have four household surveys covering 4 countries and 26 regions (pooled cross-section).

To overcome the overstated standard errors I'd like to bootstrap my results.
However, a regional dummy variable is causing me trouble.
I already know that sometimes a region is included in a bootstrap sample and sometimes left out.
That is alright since I am not interested in the coefficients of the dummy variable.

I am using Stata 16.

Below you can see my data:

Code:

* Stata 16
input double rem_edu float(handling_edu1 ln_exp_total ln_remittance) double(rem_freq oecd femmigrant childprim childsec childtert elderlyageabove62 womenabove15) float(hh_edu head_gender) byte(hagriland urban) float(incomepoverty school) byte region
                  0 .03949346  6.050091  4.544163  5 0 1 0 0 0 1 1 2 0 1 1 .5724485 .9207876 11
                  0 .03949346   6.18593  5.642776  3 1 1 0 0 0 1 1 1 0 1 1 .5724485 .9207876 11
   .800000011920929 .03949346  6.777756  6.510276 10 0 0 0 0 0 0 4 3 1 1 1 .5724485 .9207876 11
                  1 .03949346  6.255384  6.510276  5 0 1 2 0 0 0 1 1 1 1 1 .5724485 .9207876 11
                  0 .03949346  5.062107  5.370842  5 1 1 3 2 0 0 2 1 1 1 1 .5724485 .9207876 11
                  0 .03949346  5.837912  7.818609 16 1 1 2 0 0 0 3 3 1 0 1 .5724485 .9207876 11
                .75 .03949346  7.687905  8.685028  3 1 0 2 1 0 0 3 3 0 0 1 .5724485 .9207876 11
                  0 .03949346  6.927889  5.163203  4 1 1 0 0 0 1 3 2 0 1 1 .5724485 .9207876 11
                  0         0  4.999174  5.999451  2 1 1 1 0 0 0 1 2 0 1 0 .6428978 .7905115 12
                  0         0  4.515176         .  3 1 1 0 0 0 0 1 2 0 1 1 .6428978 .7905115 12
  .4000000059604645         0  8.181123  7.203424  2 1 1 0 0 0 0 3 2 0 1 0 .6428978 .7905115 12
  .5882353186607361         0  6.092485  7.916373 12 0 0 3 0 0 0 3 1 1 1 0 .6767302        1 12
 .44117647409439087         0  4.240481  6.181772 12 0 0 5 2 0 0 1 1 1 1 0 .6767302        1 12
end

And this is the regression command I use:

Code:

xtset country
bootstrap, nodrop cluster(country) idcluster(countryb): xtreg rem_edu handling_edu1 ln_exp_total ln_remittance rem_freq oecd femmigrant childprim childsec childtert elderlyageabove62 womenabove15 i.hh_edu head_gender i.hagriland i.urban incomepoverty school i.region, re vce(cluster country)

Is there any solution to the dummy problem?

Kind regards
Alexander

Last edited by Alexander Marx; 30 May 2020, 03:01.

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17691
#2

30 May 2020, 03:51

Alexander:
can't you simply bootstrap your standard errors (see the SE options under -xtreg-)?

Kind regards,
Carlo
(Stata 19.0)

Alexander Marx

Join Date: Jul 2019
Posts: 17

30 May 2020, 04:11

You mean by

Code:

xtreg y x, re vce(boot)

?

It doesn't give me standard errors:

Code:

Random-effects GLS regression                   Number of obs     =      2,268
Group variable: country                         Number of groups  =          4

R-sq:                                           Obs per group:
     within  = 0.0805                                         min =        332
     between = 0.9977                                         avg =      567.0
     overall = 0.2180                                         max =        818

                                                Wald chi2(0)      =          .
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =          .

                                       (Replications based on 4 clusters in country)
------------------------------------------------------------------------------------
                   |   Observed   Bootstrap                         Normal-based
           rem_edu |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------------+----------------------------------------------------------------
     handling_edu0 |  -.0427359          .        .       .            .           .
      ln_exp_total |   .0197525          .        .       .            .           .
     ln_remittance |   .0023672          .        .       .            .           .


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17691
#4

30 May 2020, 07:07

Alexander:
can you please share via -dataex- an excerpt of your data that mirrors exactly your last -xtreg- code?
In your previous example -country- was not included.

Kind regards,
Carlo
(Stata 19.0)

Alexander Marx

Join Date: Jul 2019
Posts: 17

30 May 2020, 08:15

I am sorry for my mistake,
here is the data including the country variable:

Code:

input float country double rem_edu float(handling_edu1 ln_exp_total ln_remittance) double(rem_freq oecd femmigrant elderlyageabove62 womenabove15) float(hh_edu head_gender) byte(hagriland urban) float(incomepoverty school) byte region
6 .5882353186607361 0 6.092485 7.916373 12 0 0 0 3 1 1 1 0 .6767302 1 12
6 .44117647409439087 0 4.240481 6.181772 12 0 0 0 1 1 1 1 0 .6767302 1 12
6 0 0 3.257499 4.544163 3 0 1 1 2 2 0 1 0 .6767302 1 12
15 0 .09039909 6.283722 7.56499 6 1 1 0 6 1 0 0 1 .6026112 .238213 19
15 0 .09039909 4.4023714 . 1 1 0 0 2 1 1 0 1 .6026112 .238213 19
15 0 .09039909 5.271214 . 10 1 1 0 3 1 0 0 1 .6026112 .238213 19
15 .062339331954717636 .09039909 5.506657 8.006822 11 1 0 1 5 1 0 1 1 .6026112 .238213 19
15 .04739336669445038 .09039909 4.201944 7.325038 24 1 0 0 3 1 0 0 1 .6026112 .238213 19
18 .7627118644067796 .04118309 4.2753835 6.311872 4 0 1 0 1 3 0 1 1 .3454125 .9154598 1
18 0 .04118309 3.675561 7.353326 2 1 0 0 2 2 0 0 1 .3454125 .9154598 1
18 0 .04118309 3.433923 6.311872 3 0 0 0 1 2 1 0 1 .3454125 .9154598 1
14 0 .0625 3.010644 3.9633024 3 0 1 1 1 1 0 1 0 .75 1 15
14 .18181818181818182 .0625 6.479462 . 0 0 1 0 4 1 0 0 0 .75 1 15
14 0 .0625 1.3308897 2.7105396 2 0 0 0 1 1 1 1 0 .75 1 15
end

This is the regression I performed:

Code:

xtreg rem_edu handling_edu1 ln_exp_total ln_remittance rem_freq oecd femmigrant childprim childsec childtert elderlyageabove62 womenabove15 i.hh_edu head_gender i.hagriland i.urban incomepoverty school road i.region, re vce(boot)

And here is the Stata output:

Code:

Random-effects GLS regression                   Number of obs     =      2,268
Group variable: country                         Number of groups  =          4

R-sq:                                           Obs per group:
     within  = 0.0817                                         min =        332
     between = 0.9982                                         avg =      567.0
     overall = 0.2192                                         max =        818

                                                Wald chi2(0)      =          .
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =          .

                                            (Replications based on 4 clusters in country)
-----------------------------------------------------------------------------------------
                        |   Observed   Bootstrap                         Normal-based
                rem_edu |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
------------------------+----------------------------------------------------------------
          handling_edu1 |  -.1360502          .        .       .            .           .
           ln_exp_total |    .019857          .        .       .            .           .
          ln_remittance |   .0020914          .        .       .            .           .
               rem_freq |  -.0011525          .        .       .            .           .
                   oecd |  -.0100997          .        .       .            .           .
             femmigrant |   .0115505          .        .       .            .           .
              childprim |   .0059648          .        .       .            .           .
               childsec |   .0248458          .        .       .            .           .
              childtert |   .0284721          .        .       .            .           .
      elderlyageabove62 |  -.0032485          .        .       .            .           .
           womenabove15 |   .0002082          .        .       .            .           .

Last edited by Alexander Marx; 30 May 2020, 08:25.


Carlo Lazzaro

Join Date: Apr 2014
Posts: 17691

30 May 2020, 08:36

Alexander:
setting aside that the -child*- predictors were not included in your excerpt, your dataset seems to suffer from quasi-extreme multicollinearity.
This suspicion is also supported by the sky-rocketing R-sq between.
My advice is to consider a more parsimonious model.
Please find below what I got after running -xtreg- on your data excerpt:

Code:

. xtreg rem_edu handling_edu1 ln_exp_total ln_remittance rem_freq oecd femmigrant elderlyageabove62 womenabove15 i.hh_edu head_gender i
> .hagriland i.urban incomepoverty school i.region, re vce(cluster country)
note: 3.hh_edu omitted because of collinearity
note: head_gender omitted because of collinearity
note: 1.hagriland omitted because of collinearity
note: 1.urban omitted because of collinearity
note: incomepoverty omitted because of collinearity
note: school omitted because of collinearity
note: 12.region omitted because of collinearity
note: 15.region omitted because of collinearity
note: 19.region omitted because of collinearity
insufficient observations
r(2001);

. xtreg rem_edu handling_edu1 ln_exp_total ln_remittance rem_freq oecd femmigrant elderlyageabove62 womenabove15 i.hh_edu head_gender i
> .hagriland i.urban incomepoverty school i.region, re vce(boot)
note: head_gender omitted because of collinearity
note: 1.hagriland omitted because of collinearity
note: 1.urban omitted because of collinearity
note: incomepoverty omitted because of collinearity
note: school omitted because of collinearity
note: 12.region omitted because of collinearity
note: 15.region omitted because of collinearity
note: 19.region omitted because of collinearity
(running xtreg on estimation sample)
insufficient observations
an error occurred when bootstrap executed xtreg
r(2001);
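
As a rough illustration of what a more parsimonious specification could look like, here is a minimal sketch. It assumes the variable names from your excerpt. The choice of which predictors to keep, the number of replications and the seed are purely illustrative assumptions, not a recommendation, and with only 4 clusters the standard errors may still come out unreliable or missing.

```stata
* Sketch only: variable names come from the data excerpt in this thread;
* the reduced predictor list, reps(200) and the seed are illustrative assumptions.
xtset country

* Cross-tabulate region against country to see how the regional
* dummies are nested within the panel (cluster) variable
tabulate region country

* Reduced model without the regional dummies, cluster-robust SEs
xtreg rem_edu ln_exp_total ln_remittance rem_freq, re vce(cluster country)

* Same reduced model with bootstrapped SEs
xtreg rem_edu ln_exp_total ln_remittance rem_freq, re vce(bootstrap, reps(200) seed(12345))
```

Predictors can then be added back one block at a time, watching for the "omitted because of collinearity" notes.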

Kind regards,
Carlo
(Stata 19.0)
