Hello everyone,
I am trying to conduct a permutation test for inference in a difference-in-differences design with multiple periods. I have 19 clusters (provinces) that received treatment in different years. There are 7 years in my sample, and the treatment reaches one province first, a second province later, and then the rest of the country. Once a province is treated, it remains treated in all subsequent periods.
My plan is to shuffle the treatment assignment across provinces subject to two restrictions: the number of treated units in each year must stay fixed, and once a province is treated it remains treated in all following years. I could not manage to implement this with the Stata command "permute", so I wrote my own program:
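gen N = _n
gen beta = .
reg Y treatment i.year i.prov // this is the "original beta"
replace beta = _b[treatment] if N==1001 // requires at least 1001 observations in the data
forvalues i = 1(1)1000 {
    generate a = runiform() // one random draw per observation
    gen N2 = _n
    replace a = . if _n>19 // keep 19 draws; N2 runs from 1 to 19 on these rows
    sort a
    gen a1 = N2 if _n==1 // random number in 1..19: province treated first
    gen a2 = N2 if _n==2 // a different number in 1..19: province treated second
    egen b = mean(a1) // spread the two draws to every row
    egen c = mean(a2)
    replace treatment = 0 // overwrites the actual treatment; prov must be coded 1 to 19
    replace treatment = 1 if year>=3 & prov == b
    replace treatment = 1 if year>=5 & prov == c
    replace treatment = 1 if year>=6 & prov != b & prov != c
    qui: reg Y treatment i.year i.prov
    replace beta = _b[treatment] if N==`i'
    drop a a1 a2 b c N2
}
sum beta if N==1001
gen bigbeta = (beta>=`r(mean)') // this is because my original "beta" is >0
sum bigbeta if (N>=1 & N<=1000) // share of placebo betas at least as large as the original
global p1 = round(`r(mean)', 0.001)
di $p1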
The problem with my current program is that it produces a different p-value every time I run it. I do not think this would be solved efficiently by simply increasing the number of replications. The issue, I believe, is that there are only 19*18=342 ways to shuffle the treatment (19 choices for the province treated first, times 18 for the province treated second), and each run draws a different random sample of those combinations, with repeats. Instead, I would like to run exactly 342 repetitions, each one different from the others, so that I have exactly one version of each of the 342 possible placebos. I would then compute my p-value as the number of betas at least as large as my original beta, divided by 342. Is there an existing command that would do this for me? Does anyone have any ideas on how to update my program to do this?
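To make the question concrete, here is a minimal, untested sketch of the exhaustive version I have in mind. It loops over every ordered pair of distinct provinces (treated first, treated second) instead of drawing them at random, and stores one beta per pair with postfile. It assumes prov is coded 1 to 19, that the cutoffs are years 3, 5 and 6 as in my program above, and that the actual treatment variable is still intact when it runs. The names placebo, first, second, b_obs and bigbeta are only illustrative.

```stata
* Sketch only: assumes Y, treatment, year and prov are in memory,
* with prov coded 1..19 and rollout cutoffs at years 3, 5 and 6.
preserve                                  // original data come back when the do-file ends
quietly regress Y treatment i.year i.prov
scalar b_obs = _b[treatment]              // the "original beta"

tempname memhold
tempfile results
postfile `memhold' first second beta using `results'

tempvar placebo
quietly generate byte `placebo' = 0
forvalues f = 1/19 {
    forvalues s = 1/19 {
        if `f' != `s' {
            * province f treated from year 3, province s from year 5, everyone from year 6
            quietly replace `placebo' = (year >= 6)
            quietly replace `placebo' = 1 if prov == `f' & year >= 3
            quietly replace `placebo' = 1 if prov == `s' & year >= 5
            quietly regress Y `placebo' i.year i.prov
            post `memhold' (`f') (`s') (_b[`placebo'])
        }
    }
}
postclose `memhold'

use `results', clear                      // one row per placebo assignment
count
assert r(N) == 19*18                      // every ordered pair appears exactly once
generate byte bigbeta = (beta >= scalar(b_obs))
summarize bigbeta
display "exact one-sided p-value = " %5.3f r(mean)
```

Because the loop is deterministic, the p-value should be the same on every run. The actual assignment is one of the enumerated pairs, so the p-value can never be exactly zero.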
Thanks a lot in advance
MLY