2 sample z test appropriate for testing between the 2 means after bootstrapping a 2 part model

Karl Yesler

Join Date: Jul 2014

Posts: 103
#1

26 Feb 2015, 18:40

Hi Statalist,

I'd like to get some help on calculating the p-value testing the difference between 2 means after bootstrapping.

Background: I cannot simply use vce(bootstrap) because I'm not just testing the significance of a coefficient. I'm using a 2 part model with recycled predictions to estimate a treatment's effect on healthcare costs (first part: logit of any cost; second part: glm with log link and gamma family among those with cost). I then predict the counterfactual estimate of not receiving treatment, combine the 1st and 2nd parts of the model, and compare the factual estimate to the counterfactual.

Therefore I used -bsample- inside a -forvalues iter=1(1)1000- loop to obtain 1000 bootstrap estimates of the factual and counterfactual means.
According to Cameron and Trivedi, Microeconometrics Using Stata (2010), p. 423, the standard deviation of the bootstrap estimates is the standard error of the estimator, which makes sense to me.
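
For reference, the rule quoted above can be written out as follows. This is only a sketch in my own notation (not taken from the book): \(B\) is the number of replicates (1000 here), \(\hat{\theta}^{*}_{b}\) is the estimate from replicate \(b\), and \(\bar{\theta}^{*}\) is the average over replicates.

\[
\widehat{\mathrm{se}}_{\text{boot}}(\hat{\theta}) = \sqrt{\frac{1}{B-1}\sum_{b=1}^{B}\left(\hat{\theta}^{*}_{b}-\bar{\theta}^{*}\right)^{2}},
\qquad
\bar{\theta}^{*} = \frac{1}{B}\sum_{b=1}^{B}\hat{\theta}^{*}_{b}.
\]

This is what r(sd) from -summarize- returns when it is run on a variable holding the \(B\) replicate estimates.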

Given the above (which I am fairly certain is correct) my question is: is the 2 sample z test the appropriate test for testing the difference between the 2 means?

To make this more concrete I provide my code for the 2 sample z test:

Code:

***z-test for 1000 replicates of bsample bootstrap of factual (mean_mu) and counterfactual (mean_mu0) estimates
foreach var of varlist mean_mu mean_mu0 {
    sum `var'
    gen mean_`var' = r(mean)
    gen sd_`var' = r(sd)
}
gen zstat_2sample = (mean_mean_mu - mean_mean_mu0) / sqrt(sd_mean_mu^2 + sd_mean_mu0^2)
* two-sided p-value: the upper-tail probability is multiplied by 2, not 1.96
gen pv_2sample = 2*(1 - normal(abs(zstat_2sample)))

Thanks!

I am using Stata SE x64 ver 13.1 with Win 7 x64 and with 8 GB of ram.
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#2

27 Feb 2015, 18:17

The two-sample z test is valid only if the two means are independent, but they are not: they are computed from the same data.
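
In symbols (a sketch; the notation \(\hat{\mu}_1\) for the factual mean and \(\hat{\mu}_0\) for the counterfactual mean is introduced here for illustration only):

\[
\operatorname{Var}(\hat{\mu}_1 - \hat{\mu}_0) = \operatorname{Var}(\hat{\mu}_1) + \operatorname{Var}(\hat{\mu}_0) - 2\operatorname{Cov}(\hat{\mu}_1, \hat{\mu}_0).
\]

The two-sample z statistic divides by \(\sqrt{\operatorname{Var}(\hat{\mu}_1) + \operatorname{Var}(\hat{\mu}_0)}\), which is the right standard error only when the covariance term is zero. Because both means are predictions from the same fitted models on the same observations, that covariance cannot be assumed to be zero.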

Steve Samuels
Statistical Consulting

Stata 14.2
Karl Yesler

Join Date: Jul 2014

Posts: 103
#3

05 Mar 2015, 11:58

Steve,

Thanks for your input. What would you suggest as the appropriate test for the difference between the factual and counterfactual?

I am using Stata SE x64 ver 13.1 with Win 7 x64 and with 8 GB of ram.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#4

07 Mar 2015, 01:18

Karl:
Did you take a look at -search propensity-?

Kind regards,
Carlo
(Stata 19.0)
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#5

09 Mar 2015, 16:16

Two issues here:
1. Write a program that ultimately puts out the two means or their difference; then bootstrap their difference to get a CI. (You won't need bsample.) Since you show us no code, I can only refer you to an example in which I estimated the standard error for an ATT that took into account variability due to estimating a propensity score: http://www.stata.com/statalist/archi.../msg01213.html. (By the way, it's now accepted that one should ignore that variability.)

2. Doing a hypothesis test about the difference is not so simple: to calculate p-values, you need to bootstrap under the null hypothesis of no difference. If \(Y_i\) is the factual estimate for observation \(i\) and \(Z_i\) is the counterfactual estimate, let \(\bar{Y}\) and \(\bar{Z}\) be the sample means. Then bootstrap the difference of \(Y_{i}'= Y_i - \bar{Y}\) and \(Z_i'= Z_i - \bar{Z}\). \(Y_{i}'\) and \(Z_i' \) have the same mean (zero). See, e.g.: http://www.biostat.umn.edu/~will/647.../Handout21.pdf
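
The two points above could be sketched along the following lines. This is a minimal illustration under stated assumptions, not the code from the linked example. The variable names cost, treat, x1 and x2, the program name tpm_diff, the file name tpm_bs and the seed are all hypothetical, and the counterfactual is assumed to mean "treat set to 0 for everyone". The p-value at the end re-centres the bootstrap differences at the observed difference, which is a shift-based stand-in for the centring described in point 2; the two coincide only when the individual predictions are held fixed.

```stata
* Sketch only: cost, treat, x1, x2 are hypothetical variable names
capture program drop tpm_diff
program define tpm_diff, rclass
    version 13.1
    tempvar anycost treat_obs p_f p_cf m_f m_cf y_f y_cf
    tempname mu mu0

    quietly {
        generate byte `anycost' = (cost > 0) if !missing(cost)
        generate byte `treat_obs' = treat

        * Part 1: logit of any cost; predict as observed, then with treat = 0
        logit `anycost' treat x1 x2
        predict double `p_f', pr
        replace treat = 0
        predict double `p_cf', pr
        replace treat = `treat_obs'

        * Part 2: gamma GLM with log link among those with positive cost
        glm cost treat x1 x2 if cost > 0, family(gamma) link(log)
        predict double `m_f', mu
        replace treat = 0
        predict double `m_cf', mu
        replace treat = `treat_obs'

        * Combine the two parts: Pr(any cost) * E(cost | cost > 0)
        generate double `y_f'  = `p_f'  * `m_f'
        generate double `y_cf' = `p_cf' * `m_cf'

        summarize `y_f', meanonly
        scalar `mu' = r(mean)
        summarize `y_cf', meanonly
        scalar `mu0' = r(mean)
    }

    return scalar mu   = `mu'
    return scalar mu0  = `mu0'
    return scalar diff = `mu' - `mu0'
end

* Point 1: bootstrap the difference itself, refitting both parts in every replicate
bootstrap diff=r(diff) mu=r(mu) mu0=r(mu0), reps(1000) seed(12345) ///
    saving(tpm_bs, replace): tpm_diff
estat bootstrap, percentile

* Point 2 (shift-based approximation): centre the replicate differences at the
* observed difference so they have mean roughly zero, then see how often a
* centred difference is at least as extreme as the observed one
scalar d_obs = _b[diff]
preserve
use tpm_bs, clear
quietly count if abs(diff - scalar(d_obs)) >= abs(scalar(d_obs))
display "null-centred bootstrap p-value = " r(N)/_N
restore
```

Because the whole two-part model is refitted inside tpm_diff, the dependence between the factual and counterfactual means is carried into the bootstrap distribution of diff automatically, which is why no separate z test on the two means is needed.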

Last edited by Steve Samuels; 09 Mar 2015, 16:21.

Steve Samuels
Statistical Consulting

Stata 14.2