Hello Stata community!
I have a question about CSDID, specifically how to specify gvar with repeated cross-sectional data. I am using Stata 18.
I would like to evaluate the impact of a cash transfer programme on poverty. I have yearly repeated cross-sectional data from 2004 to 2019. Treatment (the cash transfer) is assigned at the household level and started in 2009; the programme has been rolled out continuously since then. Several districts in the country are geographically prioritised, but not all households within a prioritised district are treated. Also, not all districts appear in every year of the repeated cross-sectional household survey.
I have created a treatment variable that takes the value 0 for all households surveyed before the programme started and for non-beneficiary households afterwards, and the survey year for households that are beneficiaries (see below, example for 2004-2014):
Code:
           |                                  treatment
      Year |         0       2009       2010       2011       2012       2013       2014 |     Total
-----------+-----------------------------------------------------------------------------+----------
      2004 |     3,698          0          0          0          0          0          0 |     3,698
      2005 |     1,896          0          0          0          0          0          0 |     1,896
      2006 |     2,435          0          0          0          0          0          0 |     2,435
      2007 |     2,041          0          0          0          0          0          0 |     2,041
      2008 |     1,986          0          0          0          0          0          0 |     1,986
      2009 |     1,842        119          0          0          0          0          0 |     1,961
      2010 |     1,781          0        174          0          0          0          0 |     1,955
      2011 |     1,662          0          0         95          0          0          0 |     1,757
      2012 |     1,633          0          0          0        221          0          0 |     1,854
      2013 |     1,741          0          0          0          0        176          0 |     1,917
      2014 |     1,609          0          0          0          0          0        219 |     1,828
-----------+-----------------------------------------------------------------------------+----------
     Total |    22,324        119        174         95        221        176        219 |    23,328
However, with this specification, the following command
Code:
csdid poor urban head_years_schooling head_literate head_female head_regionbirth head_age language, gvar(treatment) notyet time(year)
gives me 0 observations and all results are omitted.
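A possible diagnostic, offered only as a sketch and assuming the variables are named `year` and `treatment` as in the command above: csdid compares each gvar cohort before and after its first treatment period, so each cohort needs to be observed in several survey years. The check below counts the distinct survey years in which each cohort appears. With the coding tabulated above, each treated cohort would show up in one survey year only.

```stata
* Count the distinct survey years in which each gvar cohort is observed
preserve
contract treatment year                  // one row per (cohort, year) cell
bysort treatment: gen n_years = _N       // rows per cohort = distinct years
bysort treatment: keep if _n == 1        // keep one summary row per cohort
list treatment n_years, noobs
restore
```

A cohort with `n_years` equal to 1 has no pre-treatment or later post-treatment observations of its own to compare against.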
I have two questions:
1) I understand that this coding is not correct: for households first treated in 2009, the treatment variable should not be 0 in, e.g., 2010-2014, right? However, with repeated cross-sectional data, how can I identify which households remain in the treatment group from year to year?
2) Is it a problem that the programme was rolled out in specific districts, but not all households in those districts were treated each year? Eligibility was based on a set of household demographic and living-standard measures.
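To make the gvar question concrete, here is one possible coding, sketched under strong assumptions rather than offered as the correct answer. It defines cohorts at the district level, so that gvar is constant within a district across survey years. The variable `district` is a hypothetical district identifier that is not named in the post, and `gvar_district` is a name introduced only for illustration. Because not all districts appear in every survey year, the first year a beneficiary is observed in a district may be later than the true rollout year. The estimate would also describe district-level exposure rather than household-level receipt.

```stata
* ASSUMPTION: district is a hypothetical district identifier (not in the post)
* First survey year in which any sampled household in the district is a beneficiary
egen gvar_district = min(cond(treatment > 0 & !missing(treatment), year, .)), by(district)

* csdid treats gvar == 0 as never treated
replace gvar_district = 0 if missing(gvar_district)

* Each cohort should now appear in several survey years
tabulate year gvar_district

* Repeated cross-sections: no ivar() is specified
csdid poor urban head_years_schooling head_literate head_female ///
    head_regionbirth head_age language, ///
    time(year) gvar(gvar_district) notyet
```

Whether district-level cohorts are appropriate depends on how eligibility within districts was determined, which is what question 2 asks about.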
I would appreciate any help regarding the coding of my gvar and how to implement csdid. Thank you very much in advance!
Teresa