I want to randomly assign a treatment to my control group. I have state, year, and the treatment variable. In the treatment group, each state is treated in only one year, so in the control group I want each state to receive at most one treatment across the entire period.
I have all 50 states, from 2012 to 2020. All treatments start in 2016 or later, and my treatment group has the following distribution (only 1 state is treated in 2016, 3 states in 2017, and so on):
count | year |
1 | 2016 |
3 | 2017 |
7 | 2018 |
9 | 2019 |
9 | 2020 |
Here is a data example with two states in the treatment group and two states in the control group:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str20 state float(year cyear dif)
"Alabama"  2018    .  0
"Alabama"  2013    .  0
"Alabama"  2012    .  0
"Alabama"  2014    .  0
"Alabama"  2017    .  0
"Alabama"  2020    .  0
"Alabama"  2019    .  0
"Alabama"  2015    .  0
"Alabama"  2016    .  0
"Alabama"  2021    .  0
"Alaska"   2018    .  0
"Alaska"   2012    .  0
"Alaska"   2020    .  0
"Alaska"   2015    .  0
"Alaska"   2013    .  0
"Alaska"   2019    .  0
"Alaska"   2017    .  0
"Alaska"   2016    .  0
"Alaska"   2021    .  0
"Alaska"   2014    .  0
"Arizona"  2017 2018 -1
"Arizona"  2021 2018  3
"Arizona"  2016 2018 -2
"Arizona"  2012 2018 -6
"Arizona"  2015 2018 -3
"Arizona"  2014 2018 -4
"Arizona"  2019 2018  1
"Arizona"  2013 2018 -5
"Arizona"  2020 2018  2
"Arizona"  2018 2018  0
"Arkansas" 2012 2019 -7
"Arkansas" 2016 2019 -3
"Arkansas" 2020 2019  1
"Arkansas" 2013 2019 -6
"Arkansas" 2019 2019  0
"Arkansas" 2017 2019 -2
"Arkansas" 2015 2019 -4
"Arkansas" 2021 2019  2
"Arkansas" 2014 2019 -5
"Arkansas" 2018 2019 -1
end
cyear is the year in which the state got the treatment (missing for control states), and dif is the number of years relative to the treatment year (year minus cyear; 0 for control states). I tried the following, but the random assignment does not limit each state to a single treatment: in the end, a state can get more than one treatment.
Code:
gen ran=uniform()
gen treatment =.
sort year ran
by year: replace treatment =(_n <= 1) if (year == 2016) & t ==.
by year: replace treatment =(_n <= 3) if (year == 2017) & t ==.
by year: replace treatment =(_n <= 7) if (year == 2018) & t ==.
by year: replace treatment =(_n <= 9) if (year == 2019) & t ==.
by year: replace treatment =(_n <= 5) if (year == 2020) & t ==.
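The attempt above draws a separate random number for every state-year row and ranks within each year, so the same state can be picked in several years. One possible fix, offered only as a sketch, is to draw a single random number per control state, rank the control states once, and hand out the years by rank. The sketch assumes the example data are in memory and that control states are exactly those with missing cyear. The names tag, draw_order, and placebo_year and the seed are made up for illustration. The counts 1, 3, 7, 9, 5 are taken from the attempt above (it uses 5 for 2020, while the table lists 9); they need at least 25 control states, and with fewer the later slots simply stay empty.

```stata
set seed 2016                        // arbitrary seed, only for reproducibility
sort state year                      // fix the row order so the draws are reproducible

* Tag one row per control state (assumption: cyear is missing for controls)
egen byte tag = tag(state) if missing(cyear)

* One random draw per control state, then rank the control states by it
gen double ran = runiform() if tag == 1
sort ran state                       // rows with missing ran sort last
gen long draw_order = _n if tag == 1

* Copy the rank to every row of the same state (missing sorts last within state)
bysort state (draw_order): replace draw_order = draw_order[1]

* Hand out years by rank: 1 state in 2016, 3 in 2017, 7 in 2018, 9 in 2019, 5 in 2020
gen int placebo_year = .
replace placebo_year = 2016 if draw_order == 1
replace placebo_year = 2017 if inrange(draw_order, 2, 4)
replace placebo_year = 2018 if inrange(draw_order, 5, 11)
replace placebo_year = 2019 if inrange(draw_order, 12, 20)
replace placebo_year = 2020 if inrange(draw_order, 21, 25)

* 1 in the assigned year, 0 in all other control rows, missing for treated states
gen byte treatment = (year == placebo_year) if missing(cyear)
drop tag ran draw_order
```

Because each control state gets exactly one rank, it can fall into at most one year's slot, which is the restriction the by year: approach could not enforce. With the four-state example, Alabama and Alaska take ranks 1 and 2, so one is assigned 2016 and the other 2017.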