Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Generate a time dummy pre - and post -treatment in panel data

    Hello everyone,
    I have a panel dataset covering US states and counties for a number of years and quarters. For each county I have also the list of banks operating there, in the specific time period. Therefore a county is repeated many times. There is a variable 'treated" indicating with 1 whereas in a certain county in a certain year there was a specific event (say an election). I want to create two time dummies. The first one taking value of 1 in each of the three years before that event, the other one taking value one for the three years following the event. The dataset looks like this:
    input str20 state str33 county double date float(treated year) byte quarter
    "TEXAS" "Harris County" 20080331 0 2008 3
    "TEXAS" "Harris County" 20080331 0 2008 3
    "TEXAS" "Harris County" 20080331 0 2008 3
    "TEXAS" "Harris County" 20080331 0 2008 3
    "LOUISIANA" "East Baton Rouge Parish" 20080331 0 2008 3
    "TEXAS" "Harris County" 20080331 0 2008 3
    "LOUISIANA" "Jefferson Parish" 20080331 0 2008 3
    "LOUISIANA" "Tangipahoa Parish" 20120331 0 2012 3
    "ALABAMA" "Baldwin County" 20040331 0 2004 3
    "TEXAS" "Harris County" 20080331 0 2008 3
    "TEXAS" "Harris County" 20080331 0 2008 3
    "TEXAS" "Harris County" 20080331 0 2008 3
    "VIRGINIA" "Chesterfield County" 20030331 0 2003 3
    "TEXAS" "Harris County" 20050331 0 2005 3
    "TEXAS" "Harris County" 20080331 0 2008 3
    "TEXAS" "Harris County" 20080331 0 2008 3
    "TEXAS" "Harris County" 20080331 0 2008 3
    "TEXAS" "Harris County" 19980331 0 1998 3
    "TEXAS" "Harris County" 20080331 0 2008 3
    "MISSISSIPPI" "Desha County" 20050331 0 2005 3
    "VIRGINIA" "King and Queen County" 19990331 0 1999 3
    "TEXAS" "Harris County" 20080331 0 2008 3
    "LOUISIANA" "East Baton Rouge Parish" 20120331 0 2012 3
    "TEXAS" "Harris County" 20050331 0 2005 3
    "TEXAS" "Washington County" 20080331 0 2008 3
    "LOUISIANA" "Harrison County" 20080331 0 2008 3
    "TEXAS" "Calhoun County" 19980331 0 1998 3
    "TEXAS" "Harris County" 20080331 0 2008 3
    "TEXAS" "Harris County" 20080331 0 2008 3
    "VIRGINIA" "Southampton County" 19990331 0 1999 3
    "NORTH CAROLINA" "Wilson County" 20110331 0 2011 3
    "TEXAS" "Harris County" 20080331 0 2008 3
    "VIRGINIA" "Fairfax County" 20030331 0 2003 3
    "TEXAS" "Harris County" 20080331 0 2008 3
    "FLORIDA" "Collier County" 20080331 0 2008 3
    "TEXAS" "Harris County" 20080331 0 2008 3
    "LOUISIANA" "La Salle Parish" 20080331 0 2008 3
    "TEXAS" "Harris County" 20080331 0 2008 3
    "ALABAMA" "Baldwin County" 20040331 0 2004 3
    "LOUISIANA" "Jefferson Parish" 20080331 0 2008 3
    "LOUISIANA" "East Baton Rouge Parish" 20080331 0 2008 3
    "TEXAS" "Harris County" 20080331 0 2008 3
    "TEXAS" "Harris County" 20080331 0 2008 3
    "LOUISIANA" "Iberia Parish" 20120331 0 2012 3
    "TEXAS" "Harris County" 20080331 0 2008 3
    "TEXAS" "Harris County" 20050331 0 2005 3
    "ALABAMA" "Baldwin County" 20040331 0 2004 3
    "MARYLAND" "Howard County" 20030331 0 2003 3
    "LOUISIANA" "Iberia Parish" 20080331 1 2008 3
    "LOUISIANA" "Tangipahoa Parish" 20080331 1 2008 3
    "FLORIDA" "Palm Beach County" 20040331 1 2004 3
    "FLORIDA" "Highlands County" 20040331 1 2004 3
    "FLORIDA" "Walton County" 19980331 1 1998 3
    "NORTH CAROLINA" "Stanly County" 19990331 1 1999 3
    "LOUISIANA" "Rapides Parish" 20080331 1 2008 3
    "FLORIDA" "Hendry County" 20040331 1 2004 3
    "LOUISIANA" "Rapides Parish" 20080331 1 2008 3
    "NORTH CAROLINA" "Pitt County" 20110331 1 2011 3
    "LOUISIANA" "Tangipahoa Parish" 19980331 1 1998 3
    "LOUISIANA" "Rapides Parish" 20080331 1 2008 3
    "VIRGINIA" "Nottoway County" 20030331 1 2003 3
    "LOUISIANA" "Concordia Parish" 20080331 1 2008 3
    "LOUISIANA" "Rapides Parish" 20080331 1 2008 3
    "LOUISIANA" "Acadia Parish" 20080331 1 2008 3
    "NORTH CAROLINA" "Beaufort County" 19960331 1 1996 3
    "NORTH CAROLINA" "Wilson County" 19960331 1 1996 3
    "LOUISIANA" "Rapides Parish" 20080331 1 2008 3
    "LOUISIANA" "West Baton Rouge Parish" 20080331 1 2008 3
    "NORTH CAROLINA" "Pitt County" 19960331 1 1996 3
    "NORTH CAROLINA" "Moore County" 19960331 1 1996 3
    "TEXAS" "Montgomery County" 20080331 1 2008 3
    "LOUISIANA" "Rapides Parish" 20080331 1 2008 3
    "NORTH CAROLINA" "Johnston County" 19960331 1 1996 3
    "TEXAS" "Brazoria County" 19980331 1 1998 3
    "LOUISIANA" "Rapides Parish" 20080331 1 2008 3
    "NORTH CAROLINA" "Lenoir County" 19960331 1 1996 3
    "NORTH CAROLINA" "Nash County" 19960331 1 1996 3
    "LOUISIANA" "Acadia Parish" 20080331 1 2008 3
    "LOUISIANA" "Tangipahoa Parish" 20080331 1 2008 3
    "MISSISSIPPI" "Hinds County" 20050331 1 2005 3
    "FLORIDA" "Volusia County" 20040331 1 2004 3
    "LOUISIANA" "Iberia Parish" 20080331 1 2008 3
    "FLORIDA" "Palm Beach County" 20040331 1 2004 3
    "NORTH CAROLINA" "Lenoir County" 20110331 1 2011 3
    "NORTH CAROLINA" "Wilson County" 19990331 1 1999 3
    "NORTH CAROLINA" "Halifax County" 19990331 1 1999 3
    "MISSISSIPPI" "Tangipahoa Parish" 20080331 1 2008 3
    "LOUISIANA" "Rapides Parish" 20080331 1 2008 3
    "ALABAMA" "Covington County" 20040331 1 2004 3
    "FLORIDA" "Jackson County" 20040331 1 2004 3
    "LOUISIANA" "Ascension Parish" 20080331 1 2008 3
    "VIRGINIA" "Northampton County" 20030331 1 2003 3
    "FLORIDA" "Broward County" 20040331 1 2004 3
    "FLORIDA" "DeSoto County" 20040331 1 2004 3
    "LOUISIANA" "Rapides Parish" 20080331 1 2008 3
    "LOUISIANA" "Lafourche Parish" 20080331 1 2008 3
    "NORTH CAROLINA" "Wilson County" 19990331 1 1999 3
    "FLORIDA" "Hillsborough County" 20040331 1 2004 3
    "NORTH CAROLINA" "Wilson County" 19990331 1 1999 3
    "NORTH CAROLINA" "Cumberland County" 19960331 1 1996 3
    end

    Any help is much appreciated!


  • #2
    I don't understand what you want.

    Which variable gives the year in which the event occurred? And where did you get that "date" variable from: it's not a Stata internal format date variable. It's a number which human eyes can interpret as a date, but will cause you no end of trouble if you try to use it as a date in Stata. In any case, the 4 high-order digits of "date" always match the value of the year variable, so it isn't clear what you mean when you refer to observations three years before or three years after the event, because each observation has only one year mentioned regardless of which variable you look at.

    Finally, you also need to clarify whether "the three years before the event" means the year of the event itself and the two preceding years, or means the three preceding years but not the year of the event itself. Similarly for after the event.

    Comment


    • #3
      Dear Clyde,
      Thank you for your answer and sorry if I did not make myself clear. The variable that gives the year is called "year" in the dataset. I extracted it from the variable "date", which is not coming from STATA but from Call Reports of the Federal Reserve. I did not sort the obs. in the right way but each county has data from 1994 up to 2017. I just wanted to rank them based on the "treatment". Regarding the dummy, I was meaning the three preceding years, without the year of the event. The same for the other one. Below I have extracted another piece of the dataset:
      The dummies should be related only to the year itself. As an example below, if there is an event for Kent county in 2003 third quarter, the dummy should take value one for the the three years preceding 2003 for all the dataset not only for Kent county. The same for the other dummy.
      Thank you very much for your help.



      input long fips str33 county str3 state_code double date float year byte quarter float treated
      10001 "Kent County" " DE" 20030630 2003 2 0
      10001 "Kent County" " DE" 20030630 2003 2 0
      10001 "Kent County" " DE" 20030630 2003 2 0
      10001 "Kent County" " DE" 20030930 2003 3 1
      10001 "Kent County" " DE" 20030930 2003 3 1
      10001 "Kent County" " DE" 20030930 2003 3 1
      10001 "Kent County" " DE" 20030930 2003 3 1
      10001 "Kent County" " DE" 20030930 2003 3 1
      10001 "Kent County" " DE" 20030930 2003 3 1
      10001 "Kent County" " DE" 20030930 2003 3 1
      10001 "Kent County" " DE" 20030930 2003 3 1
      10001 "Kent County" " DE" 20030930 2003 3 1
      10001 "Kent County" " DE" 20031231 2003 4 0
      10001 "Kent County" " DE" 20031231 2003 4 0
      10001 "Kent County" " DE" 20031231 2003 4 0
      end

      Comment


      • #4
        I'm still confused. In fact, I think I'm even more confused. Your example data has only observations for year 2003. And treated is 1 in Kent County in 2003. So does this mean that you want your new indicator ("dummy") to take on the value 1 for any observation in the entire data set where year (the very same variable) is 2000, 2001, or 2002, and 0 elsewhere?

        Your data set also confuses me because every observation appears to be duplicated numerous times. Is that a data management error? Or do these observations, in the real data set, differ on other variables you have not shown?

        I have modified your data example to include years before and after 2003 to illustrate the code. I have also eliminated all the duplicate observations (although that will not affect the code, I'm pretty sure). If my example data is too dissimilar to your actual data, or this isn't what you wanted, please post back with a clearer example. Be sure your example data includes multiple counties and multiple years, and, perhaps it would make sense to also include some observations with different years where treated == 1.

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input long fips str33 county str3 state_code double date float year byte quarter float treated
        10001 "Kent County" " DE" 20030630 2000 2 0
        10001 "Kent County" " DE" 20030930 2000 3 0
        10001 "Kent County" " DE" 20031231 2000 4 0
        10001 "Kent County" " DE" 20030630 2001 2 0
        10001 "Kent County" " DE" 20030930 2001 3 0
        10001 "Kent County" " DE" 20031231 2001 4 0
        10001 "Kent County" " DE" 20030630 2002 2 0
        10001 "Kent County" " DE" 20030930 2002 3 0
        10001 "Kent County" " DE" 20031231 2002 4 0
        10001 "Kent County" " DE" 20030630 2003 2 0
        10001 "Kent County" " DE" 20030930 2003 3 1
        10001 "Kent County" " DE" 20031231 2003 4 0
        10001 "Kent County" " DE" 20030630 2004 2 0
        10001 "Kent County" " DE" 20030930 2004 3 0
        10001 "Kent County" " DE" 20031231 2004 4 0
        10001 "Kent County" " DE" 20030630 2005 2 0
        10001 "Kent County" " DE" 20030930 2005 3 0
        10001 "Kent County" " DE" 20031231 2005 4 0
        end
        
        preserve
        keep if treated == 1
        keep year
        gen byte past_3_years = 1
        duplicates drop
        tempfile treated_years
        save `treated_years'
        
        restore
        rangejoin year 1 3 using `treated_years'
        replace past_3_years = 0 if missing(past_3_years)
        drop year_U
        -rangejoin- is written by Robert Picard and is available from SSC.

        The code for next_3_years is analogous.

        Comment


        • #5
          Dear Clyde,
          Sorry for the late reply. The code you wrote worked pretty fine. There were duplicate observations because there are also banks in the dataset for each county as I mentioned in the previous post.
          Thank you very much!
          Kind regards!

          Comment

          Working...
          X