Mean treatment-control differences using p-values obtained from bootstrap with replications clustered

Morgane Monjour

Join Date: Feb 2023

Posts: 8
#1


14 Feb 2023, 02:22

Dear Stata users,

I would like to test whether the difference between the control-group mean and the treatment-group mean of a certain variable is significant, with p-values obtained from a clustered bootstrap (replications drawn by cluster). I think I have to use the "bootstrap : ..." prefix, but I am having trouble writing code that works.

Thank you for your help,

Best regards

Morgane
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17854
#2

14 Feb 2023, 03:44

Morgane:
please familiarize yourself with the FAQ recommendations on posting more effectively, so as to increase your chances of getting helpful replies. Thanks.
I guess that you're interested in performing a bootstrapped -ttest- (if that is the case, please see the examples in the -bootstrap- entry, Stata .pdf manual).
The -bootstrap- command offers the -cluster()- option (with -idcluster()- to create unique identifiers for the resampled clusters), which might be what you're looking for.

Kind regards,
Carlo
(Stata 19.0)
Morgane Monjour

Join Date: Feb 2023

Posts: 8
#3

14 Feb 2023, 05:49

Thank you very much for your answer.

When I run the command "bootstrap, rep(1000) : ttest ..." I get the error "expression list required".

I do not really understand what I should put in the exp_list right after "bootstrap" when I just want to test the equality of means between two unpaired groups.

Best regards,

Morgane

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17854
#4

14 Feb 2023, 07:53

Morgane:
see the example below (the -cluster()- option is not invoked because there are only 2 clusters, -foreign-):

Code:

. use "C:\Program Files\Stata17\ado\base\a\auto.dta"
(1978 automobile data)

. ttest price, by(foreign) unpaired

Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. err.   Std. dev.   [95% conf. interval]
---------+--------------------------------------------------------------------
Domestic |      52    6072.423    429.4911    3097.104    5210.184    6934.662
 Foreign |      22    6384.682    558.9942    2621.915     5222.19    7547.174
---------+--------------------------------------------------------------------
Combined |      74    6165.257    342.8719    2949.496    5481.914      6848.6
---------+--------------------------------------------------------------------
    diff |           -312.2587    754.4488               -1816.225    1191.708
------------------------------------------------------------------------------
    diff = mean(Domestic) - mean(Foreign)                         t =  -0.4139
H0: diff = 0                                     Degrees of freedom =       72

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.3401         Pr(|T| > |t|) = 0.6802          Pr(T > t) = 0.6599

. return list

scalars:
              r(level) =  95
                 r(sd) =  2949.495884768919
               r(sd_2) =  2621.915083190759
               r(sd_1) =  3097.104279086425
                 r(se) =  754.4488373823767
                r(p_u) =  .6599074558726947
                r(p_l) =  .3400925441273052
                  r(p) =  .6801850882546103
                  r(t) =  -.4138898832983144
               r(df_t) =  72
               r(mu_2) =  6384.681818181818
                r(N_2) =  22
               r(mu_1) =  6072.423076923077
                r(N_1) =  52


. bootstrap (r(mu_2)-r(mu_1)), reps(200) : ttest price, by(foreign) unpaired
(running ttest on estimation sample)

warning: ttest does not set e(sample), so no observations will be excluded from the resampling because of missing values or other
         reasons. To exclude observations, press Break, save the data, drop any observations that are to be excluded, and rerun
         bootstrap.

Bootstrap replications (200)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
..................................................    50
..................................................   100
..................................................   150
..................................................   200

Bootstrap results                                          Number of obs =  74
                                                           Replications  = 200

      Command: ttest price, by(foreign) unpaired
        _bs_1: r(mu_2)-r(mu_1)

------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
             | coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       _bs_1 |   312.2587   742.6891     0.42   0.674    -1143.385    1767.903
------------------------------------------------------------------------------

.
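
A minimal sketch of the clustered variant, in case it helps. It assumes the -nlswork- example dataset (not part of the example above), with -idcode- standing in for the cluster identifier and -union- for the two-group indicator; substitute your own variables. Because -ttest- is r-class, -bootstrap- needs an explicit expression built from the r() results, and -cluster()- makes it resample whole clusters rather than single observations. Run it from a do-file, since it uses /// line continuation:

```stata
* Illustrative data only: nlswork, idcode, and union are stand-ins (assumptions)
webuse nlswork, clear

* Resample whole idcode clusters; the statistic is the difference in group means
bootstrap diff = (r(mu_2) - r(mu_1)), reps(200) seed(12345) cluster(idcode): ///
    ttest ln_wage, by(union)

* Percentile-based confidence interval from the same replications
estat bootstrap, percentile
```

The P>|z| column of the -bootstrap- output is then the normal-based p-value for the difference in means, with a standard error that reflects resampling by cluster.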

Kind regards,
Carlo
(Stata 19.0)


George Ford

Join Date: Aug 2014

Posts: 3337
#5

14 Feb 2023, 07:56

If you're interested in covariate balance, then don't use p-values. Even the most trivial difference is significant in large samples (see Imbens/Wooldridge in JEP). Use standardized differences instead (like a t-test, but without the sample-size adjustment). There's a user-written command, -covbal-, that does a nice job.
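
For illustration, a standardized difference can also be computed by hand. This is only a sketch on the auto data, and it assumes one common definition (difference in means divided by the square root of the average of the two group variances); other definitions exist, and -covbal- may report it differently:

```stata
* Illustrative only: price and foreign from the auto data
sysuse auto, clear

* Mean and variance in the first group (foreign == 0)
quietly summarize price if foreign == 0
scalar m0 = r(mean)
scalar v0 = r(Var)

* Mean and variance in the second group (foreign == 1)
quietly summarize price if foreign == 1
scalar m1 = r(mean)
scalar v1 = r(Var)

* Assumed definition: difference in means over the pooled standard deviation
scalar stdiff = (m1 - m0) / sqrt((v1 + v0) / 2)
display "Standardized difference: " %6.3f stdiff
```

Unlike the t statistic, this quantity does not grow mechanically with the sample size, which is why it is preferred for judging balance.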
Morgane Monjour

Join Date: Feb 2023

Posts: 8
#6

15 Feb 2023, 06:08

Thank you very much for your help and for the code. It helps me a lot.

Best regards,

Morgane