  • boottest does faster wild bootstrap for didregress, xtdidregress

    Stata 17 introduces a convenient framework for double- and triple-differences treatment-effects estimation, embodied in the didregress and xtdidregress commands. It offers three methods for correcting the p values and confidence intervals, by far the slowest of which is the wild cluster bootstrap.

    But I've just made boottest work after these commands and it is doing the same bootstrap about 200 times faster, making the test much more practical.

    If no hypothesis is stated in the boottest command line, it defaults to testing that the treatment effect is 0.

    Example:

    Code:
    webuse smallg
    didregress (outcome x i.b) (treated), group(county) time(year) wildbootstrap(rseed(123) errorweight(webb))
    boottest, seed(123) weight(webb) nograph reps(1000) // same test, ~200 times faster
    Install with:
    Code:
    ssc install boottest, replace
    Difference in differences (DID) is one of the most respected tools to estimate the average treatment effect on the treated (ATET).

  • #2
    Dear @David Roodman,
    I tried your example code, but the t value, p value, and confidence intervals differ between the two methods. How can I get the same results?
    Code:
    . webuse smallg,clear
    (Simulated data with a small number of groups)
    
    . didregress (outcome x i.b) (treated), group(county) time(year) wildbootstrap(rseed(1
    > 23) errorweight(webb))
    computing 1000 replications
    
    Finding p-value
    .................................................. 50%
    ................................................. 100%
    Confidence interval lower bound
    ..........................
    Confidence interval upper bound
    ......
    
    Number of groups and treatment time
    
    Time variable: year
    Control:       treated = 0
    Treatment:     treated = 1
    -----------------------------------
                 |   Control  Treatment
    -------------+---------------------
    Group        |
          county |         4          2
    -------------+---------------------
    Time         |
         Minimum |      2011       2013
         Maximum |      2011       2013
    -----------------------------------
    
    DID with wild-cluster bootstrap inference             Number of obs   = 10,000
                                                          No. of clusters =      6
                                                          Replications    =  1,000
    Data type:    Repeated cross-sectional
    Error weight: webb
    
    ---------------------------------------------------------------------------
                  outcome | Coefficient     t    P>|t|  [95.10% conf. interval]
    ----------------------+----------------------------------------------------
    ATET                  |
         treated          |
    (Treated vs Untreated)|  -.9394987  -10.63   0.020    -1.248532   -.5621484
    ---------------------------------------------------------------------------
    Note: 95.10% confidence interval is wider than requested.
    Note: ATET estimate adjusted for covariates, group effects, and time effects.
    
    . boottest, seed(123) weight(webb) nograph reps(1000) 
    
    Note: The bootstrap usually performs best when the confidence level (here, 95%)
          times the number of replications plus 1 (1000+1=1001) is an integer.
    
    Wild bootstrap-t, null imposed, 1000 replications, Wald test, bootstrap clustering by 
    > county, Webb weights:
      r1vs0.treated
    
                                t(5) =   -10.6262
                            Prob>|t| =     0.0340
    
    95% confidence set for null hypothesis expression: [-1.247, -.5168]
    Best regards.

    Raymond Zhang
    Stata 17.0,MP

    Comment


    • #3
      Like most bootstraps, the process uses randomness. So I wouldn't expect a perfect match, just a close one, and that's what I see in the above results. They should match better and better as you increase the number of draws from 1000 to 10,000 etc. But for didregress, that would take a very long time...

      Actually I believe the t values do match exactly, as they should. didregress just displays the t value with more rounding.
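
      To illustrate, here is a sketch of rerunning the post #1 test with more draws; reps(9999) also makes the confidence level times the number of replications plus 1 (0.95 × 10,000) an integer, per boottest's note:
      Code:
      webuse smallg
      didregress (outcome x i.b) (treated), group(county) time(year)
      boottest, seed(123) weight(webb) nograph reps(9999)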

      Comment


      • #4
        Dear Raymond,

        I agree with David. Also, we are working on speeding up our implementation of the wild bootstrap.

        Comment


        • #5
          Dear David Roodman, thank you for your reply. I will try increasing the number of draws.
          Best regards.

          Raymond Zhang
          Stata 17.0,MP

          Comment


          • #6
            Originally posted by David Roodman
            Like most bootstraps, the process uses randomness. So I wouldn't expect a perfect match, just a close one, and that's what I see in the above results. They should match better and better as you increase the number of draws from 1000 to 10,000 etc. But for didregress, that would take a very long time...

            Actually I believe the t values do match exactly, as they should. didregress just displays the t value with more rounding.
            Dear David Roodman, I also tried didregress and boottest on my data, but Stata displayed an error. Here are my code and results:
            Code:
            didregress (y  $control) (D), group(pro_id) time(year) wildbootstrap(rseed(123))
            .............................
            
            ------------------------------------------------------------------
                     fdi | Coefficient     t    P>|t|     [95% conf. interval]
            -------------+----------------------------------------------------
            ATET         |
                       D |
                (1 vs 0) |    29.3818    2.56   0.062     .8773544   68.98671
            ------------------------------------------------------------------
            Note: ATET estimate adjusted for covariates, group effects, and time effects.
            
            
            boottest D, seed(123) nograph reps(1000)  
            
            r1vs0.D invalid name
            r(198);
            My Stata version is 17.0. I also don't know what r1vs0.D is; D is the dummy variable of interest. Could you please give me some advice? I would appreciate it. Thank you!
            Last edited by Benjamin Liu; 31 Aug 2021, 09:42.

            Comment


            • #7
              Notice that in the posted example the boottest command line looks like "boottest, ...". There is no variable name in it.
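
              For the code in post #6, a minimal sketch of the corrected call would therefore drop the variable name (this reuses that post's names and assumes the global $control is still defined):
              Code:
              didregress (y $control) (D), group(pro_id) time(year) wildbootstrap(rseed(123))
              boottest, seed(123) nograph reps(1000)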

              Comment


              • #8
                Update: the August 9 Stata update has greatly sped up the wild bootstrap implementation in didregress and xtdidregress.

                Comment


                • #9
                  Hi David, thanks! Is there any update on wildbootstrap option for xtreg?
                  Last edited by Eskindir Loha Shumbullo; 08 Oct 2021, 07:51.

                  Comment


                  • #10
                    @Eskindir Loha Shumbullo I don't understand what you are asking. I don't think xtreg has ever offered a wild bootstrap. boottest will perform it after "xtreg, fe", but that is not new...

                    Comment


                    • #11
                      David Roodman Thanks a lot for your response. Here is what i did and got:

                      xtset hospital

                      xtreg y x1 x2 x1*x2 x3 x4 x5 x6 x7, fe cluster(District)

                      boottest x1, reps(9999) nonull

                      This gave:

                      Wild bootstrap-t, null not imposed, 9999 replications, Wald test, bootstrap clustering by District, Rademacher weights:
                      x1

                      t(14) = .
                      Prob>|t| = .

                      95% confidence set for null hypothesis expression: [., .]
                      (A confidence interval could not be bounded. Try widening the search range with the gridmin() and gridmax() options.)

                      It worked well without the fe framework.
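
                      As that message suggests, one sketch of widening the confidence-set search range (the bounds below are illustrative, not values from the original model):
                      Code:
                      boottest x1, reps(9999) nonull gridmin(-10) gridmax(10)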

                      Comment


                      • #12
                        If, per the guidelines, you post a reproducible example, then I can engage more with it. I don't know what to do with this example.

                        Here's a reproducible example I just ran on my computer:

                        Code:
                        . webuse abdata
                        
                        . xtreg n w k, fe cluster(ind)
                        
                        Fixed-effects (within) regression               Number of obs      =      1031
                        Group variable: id                              Number of groups   =       140
                        
                        R-sq:  Within  = 0.5704                         Obs per group: min =         7
                               Between = 0.8466                                        avg =       7.4
                               Overall = 0.8341                                        max =         9
                        
                                                                        F(2,8)             =     84.50
                        corr(u_i, Xb)  = 0.4352                         Prob > F           =    0.0000
                        
                                                            (Std. err. adjusted for 9 clusters in ind)
                        ------------------------------------------------------------------------------
                                     |               Robust
                                   n | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                        -------------+----------------------------------------------------------------
                                   w |   -.367774   .1451567    -2.53   0.035     -.702506    -.033042
                                   k |   .6403675   .0534576    11.98   0.000     .5170941    .7636408
                               _cons |   2.494684    .452064     5.52   0.001     1.452222    3.537145
                        -------------+----------------------------------------------------------------
                             sigma_u |  .58883268
                             sigma_e |   .1372825
                                 rho |  .94844636   (fraction of variance due to u_i)
                        ------------------------------------------------------------------------------
                        
                        . boottest w
                        
                        Warning: with 9 bootstrap clusters, the number of replications, 999, exceeds the universe of Rademacher draws, 2^9 = 512. Sampling each once.
                        Consider Webb weights instead, using weight(webb).
                        
                        Wild bootstrap-t, null imposed, 512 replications, Wald test, bootstrap clustering by ind, Rademacher weights:
                          w
                        
                                                    t(8) =    -2.5349
                                                Prob>|t| =     0.0586
                        
                        95% confidence set for null hypothesis expression: [-.7159, .02128]
                        
                        .
                        Last edited by David Roodman; 10 Oct 2021, 15:08.

                        Comment


                        • #13
                          David Roodman What you posted helped me search for other problems in my specification, and I found that the variable I used for boottest was one the model had omitted because of collinearity. Thank you so much! Now I got:


                          Wild bootstrap-t, null not imposed, 9999 replications, Wald test, bootstrap clustering by District, Rademacher weights:
                          x2

                          t(14) = -3.4191
                          Prob>|t| = 0.0152

                          95% confidence set for null hypothesis expression: [-.07742, -.01158]

                          Comment
