  • Creating dummy variables before and after event by firm with loop

    Hi,

    I would like to create 14 dummy variables (BC_dum_*) at the firm level (firm identifier: permno) using a loop: one for each of the 5 periods before the event, one for the event period itself, and one for each of the 8 periods after it. The event is marked by another dummy variable, BC, which equals 1 if a law was passed in that year. So if a law was passed in 1989, I want a dummy variable equal to 1 in 1984 (period -5), and so on. The code I currently have is:
    Code:
    gen BC_dum_6=1 if BC==1
    forvalues i = 1/5{
    gen BC_dum_-`i' = 0
    bysort permno: replace BC_dum_`i' = 1 if BC[_n-`i']==BC & BC==1
    }
    forvalues i = 7/14{
    gen BC_dum_`i' = 0
    bysort permno: replace BC_dum_`i' = 1 if BC[_n+`i']==BC & BC==1
    }
    This does run; however, BC_dum_-5, for example, is set to 1 for multiple years and not just for 1984 as in the example above. I have also tried:
    Code:
    gen BC_dum_6=1 if BC==1
    forvalues i = 1/5{
    gen BC_dum_`i' = 0
    bysort permno: replace BC_dum_-`i' = 1 if BC[_n-`i']==BC & BC==1
    }
    forvalues i = 7/14{
    gen BC_dum_`i' = 0
    bysort permno: replace BC_dum_`i' = 1 if BC[_n+`i']==BC & BC==1
    }
    But with that I get the error message
    invalid syntax
    r(198);
    I feel (or hope) that I am not far off doing it correctly, so any help would be appreciated!

    Thanks

  • #2
    I'd appreciate a data example.

    I think your syntax error is, or should be, a stray minus sign or hyphen in a variable name.

    Code:
     
     BC_dum_-`i'
    I am at a loss to explain the report that this works.
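
    A minimal sketch of the point about the hyphen, using made-up data: Stata variable names may contain only letters, digits, and underscores, so a name with a minus sign in it cannot be created. The name BC_dum_minus1 is just one possible legal alternative, chosen here for illustration.

    ```stata
    clear
    set obs 3
    gen BC = 1

    * A minus sign is not a legal character in a variable name, so this
    * command is expected to fail; capture lets the do-file continue
    capture noisily gen BC_dum_-1 = 0
    display "return code: " _rc

    * Spelling out the sign gives a legal name
    gen BC_dum_minus1 = 0
    describe BC_dum_minus1
    ```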



    • #3
      Hi Nick, thank you for your reply. Here is an example of the data:

      permno  datadate   year  BC  incorpn
      1003    31-Dec-82  1982  0   DE
      1003    31-Dec-83  1983  0   DE
      1003    31-Dec-84  1984  0   DE
      1003    31-Jan-86  1985  0   DE
      1003    31-Jan-87  1986  0   DE
      1003    31-Jan-88  1987  0   DE
      1003    31-Jan-89  1988  1   DE
      1003    31-Jan-90  1989  1   DE

      Also, I noticed two things. First, the BC dummy variable takes the value of one in every year once the BC law has been passed, so I've adjusted my code for that. Second, my code was quite confusing and I copied the wrong thing over; apologies for that. I should obviously be more careful to make sure everything is correct before asking for help, but I was working quite late and mistakes can happen. I have managed it now, and the code I have looks like this:

      Code:
      bysort gvkey: gen BCyear=year if sum(BC)==1
      replace BCyear=0 if BCyear==.
      bysort gvkey: egen BCyear2=max(BCyear)
      gen BC_dum_0=1 if BCyear==year
      forvalues i = 1/5{
      gen BC_dum_minus`i' = 0
      bysort gvkey: replace BC_dum_minus`i' = 1 if year==BCyear2-`i'
      }
      
      forvalues i = 1/8{
      gen BC_dum_plus`i' = 0
      bysort gvkey: replace BC_dum_plus`i' = 1 if year==BCyear2+`i'
      }

      If you have any advice on how I could improve the code, especially the first three lines as I presume there is a more efficient way of doing that, I would be grateful.

      Thanks!



      • #4
        Thanks for your data example, but it isn't easy to work with. Do please study all the details at https://www.statalist.org/forums/help#stata including the key comment about date variables.


        Your code seems to reduce as follows. I can't see a variable gvkey in your data example but I will guess wildly that it is equivalent to permno.

        Code:
        bysort gvkey: egen BCyear = min(cond(BC == 1, year, .)) 
        
        gen BC_dum_0 = BCyear == year
        
        forvalues i = 1/5 {
            gen BC_dum_minus`i' = year == (BCyear - `i') 
        }
        
        forvalues i = 1/8 {
            gen BC_dum_plus`i' = year == (BCyear + `i') 
        }
        Noting that

        1. (0, 1) indicators are much more useful than (missing, 1) indicators. There is a tutorial review at https://www.stata-journal.com/articl...article=dm0099

        2. Once you have determined the BC year separately by gvkey, then the grouping is immaterial to later calculations.

        3. https://www.stata-journal.com/articl...article=dm0055 covers the technique behind the first statement. See Sections 9 and 10.
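
        As a rough check of the code above, here is a sketch that runs it on the eight observations posted in #3, grouping on permno (assumed equivalent to gvkey, as guessed above). The assert lines encode what the posted data imply: the first year with BC == 1 is 1988, so period -5 is 1983.

        ```stata
        clear
        input permno year BC
        1003 1982 0
        1003 1983 0
        1003 1984 0
        1003 1985 0
        1003 1986 0
        1003 1987 0
        1003 1988 1
        1003 1989 1
        end

        * cond() returns year where BC == 1 and missing otherwise;
        * egen's min() ignores missings, leaving the first BC year per firm
        bysort permno: egen BCyear = min(cond(BC == 1, year, .))

        gen BC_dum_0 = BCyear == year

        forvalues i = 1/5 {
            gen BC_dum_minus`i' = year == (BCyear - `i')
        }

        list year BC BCyear BC_dum_0 BC_dum_minus1 BC_dum_minus5, noobs

        * Checks against the posted data
        assert BCyear == 1988
        assert BC_dum_0 == (year == 1988)
        assert BC_dum_minus5 == (year == 1983)
        ```

        For a firm in which BC is never 1, BCyear would be missing and every indicator would be 0, because a nonmissing year never equals missing.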







        • #5
          That worked like a charm, thanks a lot Nick! And you were correct about gvkey being equivalent to permno (permno had some missing values), apologies again for the confusion.
          Also thanks for the explanations and links, and I have read the details regarding comment/question etiquette!



          • #6
            Hi Nick and James,

            My question is very similar to James's, but I have an additional issue: instead of at most one law being passed per firm over the whole time frame, different laws can be passed in different years. The "min" specification in the egen command keeps only the earliest event year, so it is too restrictive here. More specifically, my dataset is as follows:

            Each unique observation is a country-year (I have 10,143 observations: 69 years (1952-2020) x 147 countries). I want to study the effect of IMF agreements on some country-level variables. For that, I'd like to create a dummy "pre" for up to 5 years before the year in which an agreement was signed, and a dummy "post" for up to 5 years after it. It would be fine to have a dummy for each pre/post year as in James's case, but I don't necessarily need that: a single dummy indicating whether the year falls within the 5-year pre or post window is enough. An example of my dataset is the following:

            CountryCode_year  CountryCode_ISO  year  arrangement_d  id_arrangement
            AFG1965           AFG              1965  1              1
            AFG1966           AFG              1966  1              2
            AFG1967           AFG              1967  0              .
            AFG1968           AFG              1968  1              3
            AFG1969           AFG              1969  1              4
            AFG1970           AFG              1970  0              .
            AFG1971           AFG              1971  0              .
            AFG1972           AFG              1972  0              .
            AFG1973           AFG              1973  1              5
            AFG1974           AFG              1974  0              .
            AFG1975           AFG              1975  1              6

            In this specific example, because of the arrangement in 1965, I'd like the years 1960 to 1964 to have "pre" = 1 and the years 1966 to 1970 to have "post" = 1, and similarly for the years preceding and following the arrangement in 1966, and so on. I am aware that, since other arrangements follow closely, some years will have both "pre" and "post" equal to 1, but this is not a big problem; I'll figure out later what to do with these cases.

            I would really appreciate it if you could help me here - thanks a lot in advance!



            • #7
              Sorry, the format of the data example changed when the question was submitted, and it is now pretty hard to read. For example, the first line is: CountryCode_year = AFG1965; CountryCode_ISO = AFG; year = 1965; arrangement_d = 1 (or 0 if there wasn't any arrangement); id_arrangement = 1 (each arrangement has a different number, and the value is missing for a country-year without an arrangement).
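
              One possible way to build the "pre" and "post" windows described in #6 is sketched below. This is not an answer given in the thread. It assumes the variable names from the example, one observation per country-year, and a numeric panel identifier (here a hypothetical cid created with egen group()). It uses time-series lead and lag operators after xtset, so gaps in years are handled by year value rather than by row position.

              ```stata
              clear
              input str3 CountryCode_ISO year arrangement_d
              AFG 1965 1
              AFG 1966 1
              AFG 1967 0
              AFG 1968 1
              AFG 1969 1
              AFG 1970 0
              AFG 1971 0
              AFG 1972 0
              AFG 1973 1
              AFG 1974 0
              AFG 1975 1
              end

              * xtset needs a numeric panel identifier (cid is a made-up name)
              egen long cid = group(CountryCode_ISO)
              xtset cid year

              gen byte pre  = 0
              gen byte post = 0

              forvalues k = 1/5 {
                  * F`k'. looks k years ahead, L`k'. looks k years back; outside the
                  * observed years they return missing, and missing == 1 is false
                  replace pre  = 1 if F`k'.arrangement_d == 1
                  replace post = 1 if L`k'.arrangement_d == 1
              }

              list CountryCode_ISO year arrangement_d pre post, noobs

              * In this excerpt only 1975 has no arrangement in the next 5 years,
              * and only 1965 has none in the previous 5 years
              assert pre  == (year != 1975)
              assert post == (year != 1965)
              ```

              Because the excerpt starts in 1965, the years 1960 to 1964 do not appear here; in the full 1952-2020 panel they would get pre = 1 from the 1965 arrangement. An arrangement year itself is flagged only if it falls inside the window of another arrangement.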
