
  • Identify first time dummy variable = 1 across observations in a panel data set

    Hello! This is my first time asking a question here, so please excuse me if it is phrased incorrectly or is difficult to understand.

    I am working with a panel dataset that has 22 years of observations at the school district/year level. School districts are identified by the variable "Code," and there is a time variable "Year." I also have a treatment dummy variable, "Treat," that is 1 when a school district has a wind project within its boundaries and 0 otherwise. Treat changes from 0 to 1 in the first year that the wind project comes online. However, many districts do not have any wind projects, so Treat = 0 across all 22 years of observations for them. Additionally, wind projects came online at different times in different districts. For example, in district 1 a wind project came online in 2005, while in district 5 a project came online in 2017.

    I need to create a new variable "firstYear" that is equal to the first year that a project came online, so for district 1 it would be 2005 and for district 5 it would be 2017, etc. I have tried to use the foreach command to execute this, but I am getting stuck. The code I have tried so far is:

    gen firstYear = . *Create firstYear as all missing because I only need these values for districts with wind projects

    foreach row of Treat {
    replace firstYear = `row' Year if `row' Treat == 1
    * Add a column for firstYear and
    continue if `row' Treat == 0
    }

    This returns a syntax error, which I believe is due to my inexperience with the foreach command. However, I am also unsure whether the logic I used is correct. I would appreciate help with fixing both the syntax and the logic (if there are other errors).

    Additionally, if there is a better way to do this, please let me know!

    Thank you!

  • #2
    Maybe this is easier:

    Code:
    gen treatyear = Year if Treat==1
    egen firstYear = min(treatyear), by(Code)
    drop treatyear
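
To see what these three lines do, here is a minimal sketch on a made-up panel. The district codes, years, and Treat values below are invented for illustration and are not the poster's data. It assumes the variable names Code, Year, and Treat from the question, and checks that never-treated districts end up with firstYear missing.

```stata
clear
* Hypothetical toy panel: districts 1 and 5 are treated, district 2 never is
input Code Year Treat
1 2004 0
1 2005 1
1 2006 1
2 2004 0
2 2005 0
2 2006 0
5 2016 0
5 2017 1
5 2018 1
end

* Year is copied only into treated rows; all other rows are missing
gen treatyear = Year if Treat == 1
* egen's min() ignores missings, so it returns the earliest treated year per district
egen firstYear = min(treatyear), by(Code)
drop treatyear

* Checks: first treated year for districts 1 and 5, missing for district 2
assert firstYear == 2005 if Code == 1
assert missing(firstYear) if Code == 2
assert firstYear == 2017 if Code == 5

list, sepby(Code)
```

The result is constant within each district, so it can be used directly to build event-time variables such as Year - firstYear.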



    • #3
      Yes, that worked perfectly! Thank you so much!



      • #4
        Another way to do it

        Code:
         
        egen firstYear = min(cond(Treat == 1, Year, .)), by(Code)
        For more discussion see https://www.stata.com/support/faqs/d...t-occurrences/ and Section 9 of https://www.stata-journal.com/articl...article=dm0055
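
As a rough illustration of why the one-line version works, the sketch below materializes the cond() expression as its own variable before taking the minimum. The data and the helper variable name candidate are hypothetical, and the variable names are capitalized as in the original question (Stata variable names are case-sensitive).

```stata
clear
* Hypothetical toy panel: district 1 is treated from 2005, district 2 never is
input Code Year Treat
1 2004 0
1 2005 1
1 2006 1
2 2004 0
2 2005 0
end

* cond(Treat == 1, Year, .) is Year in treated rows and missing elsewhere
gen candidate = cond(Treat == 1, Year, .)

* The same expression can be passed straight to egen's min(),
* which skips the missings within each district
egen firstYear = min(cond(Treat == 1, Year, .)), by(Code)

assert candidate == Year if Treat == 1
assert missing(candidate) if Treat == 0
assert firstYear == 2005 if Code == 1
assert missing(firstYear) if Code == 2

list, sepby(Code)
```

Passing the expression directly to min() simply avoids creating and dropping the temporary variable.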



        • #5
          Nick's method is way more elegant! Time for me to update my scripts!



          • #6
            Section 10 of the paper cited has a related method:

            Code:
            egen firstYear = min(Year / (Treat == 1)), by(Code)
            or even, for a (0, 1) variable,
            Code:
            egen firstYear = min(Year / Treat), by(Code)
            but either is perhaps best regarded as a Stata joke, and perhaps not even funny.
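
The trick relies on Stata returning missing, not an error, when a number is divided by zero. A small sketch on invented data, assuming Treat takes only the values 0 and 1 (the variable name ratio is hypothetical):

```stata
clear
* Hypothetical toy panel
input Code Year Treat
1 2004 0
1 2005 1
2 2004 0
end

* Division by zero yields missing in Stata; division by 1 leaves Year unchanged
gen ratio = Year / Treat

assert missing(ratio) if Treat == 0
assert ratio == Year if Treat == 1

list
```

Because min() skips missings, taking the minimum of this ratio by district gives the first treated year, which is why the one-liner works despite looking alarming.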



            • #7
              One does not simply... divide by zero.
