Hello,
I have data on state policies for 50 US states. I am interested in 1998-2017. Currently, my data provides a set of "policy begin dates" and "policy end dates" for each state. I would like to extract the years represented within each of these date ranges to create state-year data so that each year represents the policy active for the most months out of that year. Is there a way to A) extract years represented between two date variables? OR B) if you see a solution the coding approach I attempted below?
Currently, my data looks like this:
I would like it to look like this where the "year" represents the policy that was active for the most months out of the year.
So far, I have extracted the day, month, and year from each policy begin and end date variable using the following code.
I then used the following loop to create a flag called "monthsflag_YRXX" to count the months a policy was active in each year.
I then used the "policy begin date" to merge the data with a separate file of state-years only for 1998-2017, so that I would have a state observation for each year. This almost worked but I encounter issues when there are multiple observations per state-year. I would like to assign the policy that was active for the months out of year to the respective year. Here is example of data with the flag for 2006 looks like this:
In the above case, the first row is policies that were active in NC in 2006 for 6 months. The second two rows of policies were active for only 2 months each. However, you will notice this creates duplicate years for some states and does not captures instances where the most active policy is tied to a year in a date range from the observation prior. Based on the policy data I have, I would like to tell Stata to use the first line of policies as the policy for NC in 2006 since it was active for the most months. I would like to ignore the number of days and I am not concerned about the few months of missing data (as you will notice in this case from July to September of 2006). However, I cannot determine how to tell Stata to do so.
Thank you for your time and knowledge.
KaLeigh
I have data on state policies for 50 US states. I am interested in 1998-2017. Currently, my data provides a set of "policy begin dates" and "policy end dates" for each state. I would like to extract the years represented within each of these date ranges to create state-year data so that each year represents the policy active for the most months out of that year. Is there a way to A) extract years represented between two date variables? OR B) if you see a solution the coding approach I attempted below?
Currently, my data looks like this:
Code:
state policydatebegin policydateend Alabama 5/1/1998 3/31/2000 Alabama 4/1/2000 1/31/2004 Alabama 2/1/2004 9/30/2006 Alabama 1/1/2007 1/31/2008 Alabama 2/1/2008 9/30/2008 Alabama 10/1/2008 12/31/2013 Alabama 1/1/2014 9/30/2020 ...
Code:
State Year Alabama 1998 Alabama 1999 Alabama 2000 Alabama 2001 Alabama 2002 Alabama 2003 Alabama 2004 Alabama 2005 Alabama 2006 ... Alabama 2014 Alabama 2015 Alabama 2016
Code:
*Create day, month, year variables for policy begin date gen month_begin=month(policydatebegin) gen year_begin=year(policydatebegin) gen day_begin=day(policydatebegin)
Code:
forvalues i=1998/2017 {
gen monthsflag_`i' = .
replace monthsflag_`i' = 12-month_begin if (`i'==year_begin & year_end!=year_begin)
replace monthsflag_`i' = month_end if `i'==year_end
replace monthsflag_`i' = 12 if (`i'<year_end & `i'>year_begin)
replace monthsflag_`i' = (month_end-month_begin) if `i'==year_begin & `i'==year_end
}
I then used the "policy begin date" to merge the data with a separate file of state-years only for 1998-2017, so that I would have a state observation for each year. This almost worked but I encounter issues when there are multiple observations per state-year. I would like to assign the policy that was active for the months out of year to the respective year. Here is example of data with the flag for 2006 looks like this:
Code:
state policydatebegin policydateend monthsflag_2006 year(as start year) North Carolina 5/1/2004 6/30/2006 6 2004 North Carolina 2005 North Carolina 10/1/2006 12/31/2007 2 2006 North Carolina 7/1/2006 9/30/2006 2 2006 North Carolina 2007 North Carolina 12/1/2008 7/31/2009 2008
Thank you for your time and knowledge.
KaLeigh

Comment