Hi Statalist,
I'd like some help calculating the p-value for a test of the difference between two means after bootstrapping.
Background: I cannot simply use vce(bootstrap) because I am not just testing the significance of a coefficient. I'm using a two-part model with recycled predictions to estimate a treatment's effect on healthcare costs (first part: a logit for any cost; second part: a GLM with log link and gamma family among those with cost). I then predict the counterfactual estimate of not receiving treatment, combine the first and second parts of the model, and compare the factual estimate to the counterfactual.
Therefore I used -bsample- inside a -forvalues iter=1(1)1000- loop to produce 1000 bootstrap estimates of the factual and counterfactual means.
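A minimal sketch of the kind of loop described above, for readers who want to see its shape. It is not the original code: the variable names cost, treat, age and female, the seed, and the use of -postfile- to collect results are all assumptions made for illustration. "Factual" is taken here to mean predictions at the observed treatment values.

```stata
* Sketch only: cost, treat, age, female are hypothetical variable names
tempname memhold
tempfile bsresults
postfile `memhold' iter mean_mu mean_mu0 using `bsresults'
set seed 12345                          // arbitrary seed, for reproducibility

forvalues iter = 1(1)1000 {
    preserve
    bsample                             // resample _N observations with replacement

    * Part 1: logit for any cost
    generate byte anycost = cost > 0
    generate treat_obs = treat          // keep the observed treatment
    quietly logit anycost treat age female
    predict double p1, pr               // factual: observed treatment
    quietly replace treat = 0
    predict double p0, pr               // counterfactual: no treatment
    quietly replace treat = treat_obs

    * Part 2: gamma GLM with log link among those with cost
    quietly glm cost treat age female if cost > 0, family(gamma) link(log)
    predict double c1, mu               // factual
    quietly replace treat = 0
    predict double c0, mu               // counterfactual

    * Combine the two parts and store the replicate means
    generate double mu  = p1 * c1
    generate double mu0 = p0 * c0
    summarize mu, meanonly
    local m1 = r(mean)
    summarize mu0, meanonly
    post `memhold' (`iter') (`m1') (r(mean))
    restore
}
postclose `memhold'
use `bsresults', clear                  // one row per replicate: mean_mu, mean_mu0
```

After the loop, the data in memory hold one observation per replicate, which is the layout the z-test code further down assumes.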
According to Cameron and Trivedi, Microeconometrics Using Stata (2010), p. 423, the standard deviation of the bootstrap estimates is the standard error of the estimator, which makes sense to me.
Given the above (which I am fairly certain is correct), my question is: is the two-sample z test the appropriate test for the difference between the two means?
To make this more concrete, here is my code for the two-sample z test:
Code:
* z-test for 1000 bsample bootstrap replicates of the factual (mean_mu)
* and counterfactual (mean_mu0) estimates
foreach var of varlist mean_mu mean_mu0 {
    summarize `var'
    gen mean_`var' = r(mean)    // mean of the bootstrap estimates
    gen sd_`var' = r(sd)        // SD of the bootstrap estimates = bootstrap SE
}
gen zstat_2sample = (mean_mean_mu - mean_mean_mu0) / sqrt(sd_mean_mu^2 + sd_mean_mu0^2)
* two-sided p-value: the multiplier is 2, not the critical value 1.96
gen pv_2sample = 2*(1 - normal(abs(zstat_2sample)))
Thanks!