  • Testing mean treatment-control differences using p-values from a clustered bootstrap

    Dear Stata users,


    I would like to check whether the difference between the mean of a certain variable in my control group and its mean in my treatment group is significant. I would like to obtain p-values from a bootstrap with clustered replications. I think that I have to use the command "bootstrap : ..." but I am having trouble writing code that works.

    Thank you for your help,

    Best regards

    Morgane



  • #2
    Morgane:
    please familiarize yourself with the FAQ recommendations on posting more effectively, so as to increase your chances of getting helpful replies. Thanks.
    I guess that you're interested in performing a bootstrapped -ttest- (if that is the case, please see the examples in the -bootstrap- entry, Stata .pdf manual).
    The -bootstrap- command offers the -cluster()- and -idcluster()- options, which might be what you're looking for.
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Thank you very much for your answer.

      When I run the command "bootstrap, rep(1000) : ttest ..." I get the error "expression list required".

      I did not really understand what I should put in the exp_list just after "bootstrap" when I just want to test the equality of means between two unpaired groups.

      Best regards,

      Morgane



      • #4
        Morgane:
        see the example below (the -cluster()- option is not invoked because -foreign- would give only 2 clusters):
        Code:
        . use "C:\Program Files\Stata17\ado\base\a\auto.dta"
        (1978 automobile data)
        
        . ttest price, by(foreign) unpaired
        
        Two-sample t test with equal variances
        ------------------------------------------------------------------------------
           Group |     Obs        Mean    Std. err.   Std. dev.   [95% conf. interval]
        ---------+--------------------------------------------------------------------
        Domestic |      52    6072.423    429.4911    3097.104    5210.184    6934.662
         Foreign |      22    6384.682    558.9942    2621.915     5222.19    7547.174
        ---------+--------------------------------------------------------------------
        Combined |      74    6165.257    342.8719    2949.496    5481.914      6848.6
        ---------+--------------------------------------------------------------------
            diff |           -312.2587    754.4488               -1816.225    1191.708
        ------------------------------------------------------------------------------
            diff = mean(Domestic) - mean(Foreign)                         t =  -0.4139
        H0: diff = 0                                     Degrees of freedom =       72
        
            Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
         Pr(T < t) = 0.3401         Pr(|T| > |t|) = 0.6802          Pr(T > t) = 0.6599
        
        . return list
        
        scalars:
                      r(level) =  95
                         r(sd) =  2949.495884768919
                       r(sd_2) =  2621.915083190759
                       r(sd_1) =  3097.104279086425
                         r(se) =  754.4488373823767
                        r(p_u) =  .6599074558726947
                        r(p_l) =  .3400925441273052
                          r(p) =  .6801850882546103
                          r(t) =  -.4138898832983144
                       r(df_t) =  72
                       r(mu_2) =  6384.681818181818
                        r(N_2) =  22
                       r(mu_1) =  6072.423076923077
                        r(N_1) =  52
        
        
        . bootstrap (r(mu_2)-r(mu_1)), reps(200) : ttest price, by(foreign) unpaired
        (running ttest on estimation sample)
        
        warning: ttest does not set e(sample), so no observations will be excluded from the resampling because of missing values or other
                 reasons. To exclude observations, press Break, save the data, drop any observations that are to be excluded, and rerun
                 bootstrap.
        
        Bootstrap replications (200)
        ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
        ..................................................    50
        ..................................................   100
        ..................................................   150
        ..................................................   200
        
        Bootstrap results                                          Number of obs =  74
                                                                   Replications  = 200
        
              Command: ttest price, by(foreign) unpaired
                _bs_1: r(mu_2)-r(mu_1)
        
        ------------------------------------------------------------------------------
                     |   Observed   Bootstrap                         Normal-based
                     | coefficient  std. err.      z    P>|z|     [95% conf. interval]
        -------------+----------------------------------------------------------------
               _bs_1 |   312.2587   742.6891     0.42   0.674    -1143.385    1767.903
        ------------------------------------------------------------------------------
        
        .
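        For a clustered version, a minimal sketch might look like the following. It assumes a cluster identifier is available; since auto.dta has none, cluster_id below is a made-up variable created purely for illustration, and the seed and the number of replications are arbitrary choices. With -cluster()-, whole clusters are resampled instead of individual observations.

        ```stata
        * Sketch for a do-file: cluster bootstrap of a difference in means
        sysuse auto, clear

        * Hypothetical cluster identifier (10 artificial clusters, illustration only)
        generate cluster_id = mod(_n, 10) + 1

        * Resample whole clusters; "diff" names the bootstrapped statistic
        bootstrap diff = (r(mu_2) - r(mu_1)), reps(1000) seed(12345) ///
            cluster(cluster_id) nodots: ttest price, by(foreign) unpaired

        * Normal-based, percentile, and bias-corrected confidence intervals
        estat bootstrap, all
        ```

        -idcluster()- is only needed when the bootstrapped command requires unique cluster identifiers within each resample (panel estimators, for instance); -ttest- does not, so it is left out here.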
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          If you're interested in covariate balance, then don't use p-values: even the most trivial difference is significant in large samples (see Imbens and Wooldridge, 2009, in the Journal of Economic Literature). Use standardized differences instead (which are like a t-statistic but without the sample-size adjustment). There's a user-written command, covbal, that does a nice job.
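
          As a rough illustration of what a standardized difference is, the sketch below computes one by hand on auto.dta, treating -foreign- as a stand-in for the treatment indicator (an assumption made only for this example). It divides the difference in means by the square root of the average of the two group variances; some authors divide by the square root of the sum instead, so check which definition a given command reports.

          ```stata
          * Sketch: standardized difference in means, computed by hand
          sysuse auto, clear

          * Mean and variance in the "treated" group (foreign == 1, illustration only)
          quietly summarize price if foreign == 1
          scalar m1 = r(mean)
          scalar v1 = r(Var)

          * Mean and variance in the "control" group (foreign == 0)
          quietly summarize price if foreign == 0
          scalar m0 = r(mean)
          scalar v0 = r(Var)

          * Difference in means scaled by the pooled standard deviation
          scalar std_diff = (m1 - m0) / sqrt((v1 + v0) / 2)
          display "Standardized difference: " %6.3f std_diff
          ```

          Unlike the t statistic, this quantity does not grow mechanically with the number of observations, which is why it is better suited to judging balance.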



          • #6
            Thank you very much for your help and for the code. It helps me a lot.

            Best regards,

            Morgane
