Hi everyone,
I am a student and I have just joined this forum, so I apologise in advance if this question is a little dim.
I am trying to estimate corruption in the healthcare sector during COVID. My data (given below) has the following variables:
Deceased: whether the individual died because of COVID
i_COVID19: whether the individual contracted COVID
respiratory: whether the individual was hospitalized for a respiratory disease after 2016
i_Month and i_Year: the month and year of hospitalization, for both i_COVID19 and respiratory cases
i_costhospitalbed: the cost of a hospital bed, for both i_COVID19 and respiratory cases
ICUBed: a dummy for whether the individual required an ICU bed or just an isolation bed
i_HospitalSector: 1 if treatment was received in the public sector, 2 if it was received in the private sector
Bed_Bribery_overpay and Bed_Clout_Connections: dummies for how the hospital bed was allocated to the individual, through bribery or through personal connections respectively
I tried a difference-in-differences analysis to compare the cost of COVID hospitalization with that of other respiratory diseases. Most of the corruption took place during the 2nd wave of COVID, so I generated a variable period, which is 1 for pre-COVID, 2 for the 1st wave in 2020 and 3 for the 2nd wave in 2021. I also generated covorresp, a dummy that is 1 if the individual was hospitalized for COVID and 0 if for a respiratory disease.
The goal is to estimate the impact of corruption during the 2nd wave. For that:
Firstly, I would like to compare the costs of respiratory diseases and COVID, as this will help me remove any pre-existing corruption.
Secondly, comparing the 2nd wave with the 1st wave will help remove any change in price caused by the change in demand from the spike in COVID cases. This makes for a very complicated model and I am not sure how to proceed.
Lastly, I think the parallel trends assumption and SUTVA would be violated because COVID increased the cost of all medical services. I am trying to use a conditional parallel trends assumption that controls for the trend in 2020 and 2021 due to COVID.
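To make the target concrete, this is how I would write the comparison from the first two points. The notation is my own and only a sketch: \bar{Y}_{g,t} stands for the mean of i_costhospitalbed for group g (COVID or other respiratory) in period t (wave 1 or wave 2). Reading the result as the effect of corruption would still rest on the trend assumptions discussed above.

```latex
\[
\hat{\delta}
  = \left( \bar{Y}_{\mathrm{COVID},\,\mathrm{wave\,2}} - \bar{Y}_{\mathrm{COVID},\,\mathrm{wave\,1}} \right)
  - \left( \bar{Y}_{\mathrm{resp},\,\mathrm{wave\,2}} - \bar{Y}_{\mathrm{resp},\,\mathrm{wave\,1}} \right)
\]
```

The first bracket is the change in COVID bed costs between the waves, and the second bracket is the same change for other respiratory diseases, which I am hoping absorbs the general price increase.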
I have been stuck on this for a while now, so any help would be appreciated. If you have any other recommendations or suggestions for me, I am all ears.
Here is my data:
Code:
input byte(Deceased i_COVID19 respiratory i_Month) int i_Year long i_costhospitalbed byte(ICUBed i_HospitalSector Bed_Bribery_overpay Bed_Clout_Connections) float(period DID covorresp)
0 0 . 1 2016 10000 . 2 . . 1 0 .
0 0 . 1 2017 4500 . 2 . . 1 0 .
0 0 . 1 2018 6000 . 2 . . 1 0 .
0 0 . 1 2019 10000 . 2 . . 1 0 .
1 0 1 2 2018 5000 . 2 . . 1 0 0
0 0 . 2 2018 5000 . 2 . . 1 0 .
0 0 . 2 2018 7000 . 2 . . 1 0 .
1 0 1 3 2019 5000 . 2 . . 1 0 0
1 0 . 3 2016 6500 0 2 . . 1 0 .
0 0 . 3 2017 30000 . 2 . . 1 0 .
0 0 . 3 2018 15000 . 2 . . 1 0 .
1 0 1 4 2018 10000 0 2 . . 1 0 0
0 0 . 4 2018 3000 . 2 . . 1 0 .
1 0 . 4 2019 4100 . 2 . . 1 0 .
1 0 . 5 2017 6000 0 2 . . 1 0 .
1 0 . 5 2017 5000 0 2 . . 1 0 .
1 0 . 5 2018 2500 . 1 . . 1 0 .
1 0 1 6 2019 8000 0 2 . . 1 0 0
1 0 1 6 2019 6000 . 2 . . 1 0 0
1 0 1 6 2019 80000 . 2 . . 1 0 0
0 0 1 6 2018 2000 . 1 . . 1 0 0
1 0 1 6 2018 3000 . 2 . . 1 0 0
0 0 1 6 2017 6000 . 2 . . 1 0 0
1 0 . 6 2019 1500 . 2 . . 1 0 .
0 0 . 6 2018 5000 . 2 . . 1 0 .
1 0 . 6 2018 4000 . 2 . . 1 0 .
1 0 . 6 2019 6000 . 2 . . 1 0 .
0 0 . 6 2019 12000 . 2 . . 1 0 .
0 0 . 6 2017 3000 . 2 . . 1 0 .
1 0 1 7 2018 7500 0 2 . . 1 0 0
1 0 1 7 2017 2500 . 2 . . 1 0 0
1 0 1 7 2018 6000 . 2 . . 1 0 0
1 0 1 7 2016 7000 0 2 . . 1 0 0
0 0 . 7 2019 100000 . 2 . . 1 0 .
1 0 . 7 2018 6500 0 2 . . 1 0 .
1 0 . 7 2017 4000 . 2 . . 1 0 .
0 0 . 7 2018 4000 . 2 . . 1 0 .
1 0 1 8 2018 120000 . 2 . . 1 0 0
1 0 1 8 2019 120000 . 2 . . 1 0 0
1 0 1 8 2017 8500 0 2 . . 1 0 0
1 0 . 8 2019 5000 . 2 . . 1 0 .
0 0 . 8 2019 5000 . 2 . . 1 0 .
1 0 . 8 2019 10000 . 2 . . 1 0 .
1 0 . 8 2018 5000 0 2 . . 1 0 .
1 0 1 9 2016 1500 . 2 . . 1 0 0
1 0 1 9 2019 100000 . 2 . . 1 0 0
0 0 . 9 2019 200000 . 2 . . 1 0 .
0 0 . 9 2017 15000 . 2 . . 1 0 .
0 0 . 10 2017 10000 . 2 . . 1 0 .
0 0 . 10 2016 9000 . 2 . . 1 0 .
1 0 . 10 2018 5500 0 2 . . 1 0 .
1 0 1 11 2016 3000 . 2 . . 1 0 0
1 0 1 11 2016 8000 0 2 . . 1 0 0
1 1 . 11 2019 . 0 . 0 0 2 0 1
1 1 . 11 2019 . 0 . 0 0 2 0 1
1 0 . 11 2017 6000 0 2 . . 1 0 .
0 0 . 11 2019 3000 . 2 . . 1 0 .
0 0 . 11 2019 12000 . 2 . . 1 0 .
0 0 . 11 2019 2000 . 2 . . 1 0 .
0 0 . 11 2017 1500 . 2 . . 1 0 .
1 0 1 12 2018 20000 . 2 . . 1 0 0
1 0 1 12 2018 7500 0 2 . . 1 0 0
1 1 . 12 2019 . 0 2 0 0 2 0 1
1 0 . 12 2019 2500 0 2 . . 1 0 .
1 0 . 12 2016 5500 0 2 . . 1 0 .
1 0 . 12 2018 5000 . 2 . . 1 0 .
1 0 . 12 2016 3000 . 2 . . 1 0 .
1 0 . 12 2017 5500 . 2 . . 1 0 .
1 1 . 1 2020 35000 1 2 0 0 2 0 1
1 1 . 1 2020 . 0 2 0 0 2 0 1
1 1 . 1 2020 . 0 1 0 0 2 0 1
1 1 . 1 2020 . 0 2 0 0 2 0 1
0 0 . 1 2020 50000 . 2 . . 2 0 .
1 1 . 2 2020 . 1 . 0 0 2 0 1
1 1 . 2 2020 . 0 2 0 0 2 0 1
1 1 . 2 2020 . 0 2 0 1 2 0 1
1 1 . 2 2020 20000 0 2 0 1 2 0 1
0 1 . 2 2020 9000 0 2 0 0 2 0 1
1 1 . 2 2020 . 0 2 1 0 2 0 1
1 1 . 2 2020 . 1 . 0 0 2 0 1
1 0 1 3 2020 4000 . 2 . . 2 0 0
1 1 . 3 2020 . 1 2 0 0 2 0 1
1 1 . 3 2020 5000 1 2 0 0 2 0 1
1 1 . 3 2020 . 1 2 1 0 2 0 1
1 1 . 3 2020 . 0 2 0 0 2 0 1
1 1 . 3 2020 . 0 2 0 0 2 0 1
0 1 . 3 2020 . 0 0 0 0 2 0 1
1 1 . 3 2020 . 0 2 1 0 2 0 1
1 1 . 3 2020 . 0 2 0 0 2 0 1
0 1 . 3 2020 . 0 0 0 0 2 0 1
1 1 . 3 2020 . 0 2 1 0 2 0 1
1 0 . 3 2020 20000 . 2 . . 2 0 .
1 0 1 4 2020 5000 . 2 . . 2 0 0
1 0 1 4 2020 140000 . 2 . . 2 0 0
0 1 . 4 2020 11000 1 2 0 0 2 0 1
0 1 . 4 2020 . 0 0 0 0 2 0 1
1 1 . 4 2020 . 0 . 0 0 2 0 1
1 1 . 4 2020 . 0 2 0 0 2 0 1
1 1 . 4 2020 . 1 2 0 0 2 0 1
1 1 . 4 2020 . 1 2 0 1 2 0 1
end
I tried using the following code so far:
Code:
// Treatment: COVID; control: other respiratory diseases
gen covorresp = .
replace covorresp = 1 if i_COVID19 == 1
replace covorresp = 0 if respiratory == 1

// Three time periods: 1 = pre-COVID, 2 = wave 1, 3 = wave 2
gen period = .
replace period = 1 if inlist(i_Year, 2016, 2017, 2018)
replace period = 1 if i_Year == 2019 & i_COVID19 == 0
replace period = 2 if i_Year == 2019 & i_COVID19 == 1
replace period = 2 if i_Year == 2020
replace period = 3 if inlist(i_Year, 2021, 2022)

// DID indicator: COVID hospitalization during wave 2
gen DID = 0
replace DID = 1 if period == 3 & covorresp == 1

// Bed cost on group, period dummies and the wave-2 COVID indicator, with robust SEs
reg i_costhospitalbed covorresp i.period DID, robust
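To check what the coefficient on DID actually measures, here is a minimal sketch on made-up toy numbers. This is not my survey data: the variable cost and every value in it are hypothetical. It assumes there are no COVID cases in period 1 and that every other group-by-period cell is non-empty, so the model has five coefficients for five cells and should reproduce the cell means.

```stata
* Hypothetical toy data (NOT the survey data): two observations per cell
clear
input byte(covorresp period) long cost
0 1 5000
0 1 7000
0 2 8000
0 2 10000
0 3 9000
0 3 13000
1 2 20000
1 2 30000
1 3 60000
1 3 80000
end

* Same construction of the DID indicator as in my code
gen byte DID = (period == 3 & covorresp == 1)

* Same specification as in my code, on the toy outcome
regress cost covorresp i.period DID, vce(robust)

* Cell means, to compare the double difference by hand with _b[DID]
tabulate covorresp period, summarize(cost) means
```

With these toy numbers the cell means are 6000, 9000 and 11000 for respiratory cases and 25000 and 70000 for COVID cases, so the double difference is (70000 - 25000) - (11000 - 9000) = 43000, and the coefficient on DID should match that up to rounding. If that is right, my regression compares wave 2 with wave 1 only, and period 1 just pins down the respiratory baseline.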
Thank you so much for your time. I really appreciate it.