  • Codes for Pre-and-Post firm years

    Dear Stata Community,
    I am very new to econometrics. I wish to examine the impact of a variable X on a dummy variable for accounting fraud. If a company committed accounting fraud in, say, 2014, that year is coded as the fraud year (fraud_year = 1, and 0 otherwise). This way I can examine the impact of X on Y (the fraud dummy). I need your help with the following:

    If I wish to create a dummy variable for the pre and post periods around the fraud event (fraud_year), how do I do that? I am confused because every company has a different fraud year. This would help me investigate changes in other variables (for instance, dividends) in the post-fraud period. I wish to specify exactly 3 (or 2) years before and after the event, excluding the event year.

    Your help is highly appreciated.

    Thanks in advance


    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int(gvkey fyear) float dividends byte fraud_year
    1004 1994 7.65 0
    1004 1995 7.676 0
    1004 1996 7.976 0
    1004 1997 9.118 0
    1004 1998 9.375 0
    1004 1999 9.218 1
    1004 2000 9.157 0
    1004 2001 4.43 0
    1004 2002 .797 0
    1004 2003 0 0
    1004 2004 0 0
    1004 2005 0 0
    1004 2006 0 0
    1004 2007 0 0
    1013 1993 0 0
    1013 1994 0 0
    1013 1995 0 0
    1013 1996 0 0
    1013 1997 0 0
    1013 1998 0 0
    1013 1999 0 0
    1013 2000 0 0
    1013 2001 0 0
    1013 2002 0 0
    1013 2003 0 0
    1013 2004 0 0
    1013 2005 0 1
    1013 2006 0 0
    1013 2007 0 0
    1013 2008 0 0
    1034 1993 3.873 0
    1034 1994 3.893 0
    1034 1995 3.914 0
    1034 1996 3.928 0
    1034 1997 4.198 0
    1034 1998 4.651 0
    1034 1999 5.061 0
    1034 2000 6.526 0
    1034 2001 7.541 0
    1034 2002 9.235 0
    1034 2003 9.32 1
    1034 2004 9.404 0
    1034 2005 9.481 0
    1034 2006 7.384 0
    1034 2007 0 0
    1045 1993 60 0
    1045 1994 56 0
    1045 1995 5 0
    1045 1996 0 0
    1045 1997 0 0
    1045 1998 0 0
    1045 1999 0 0
    1045 2000 0 1
    1045 2001 0 0
    1045 2002 0 0
    1045 2003 0 0
    1045 2004 0 0
    1045 2005 0 0
    1045 2006 0 0
    1045 2007 0 0
    1045 2008 0 0
    1055 1993 0 0
    1055 1994 0 0
    1055 1995 0 0
    1055 1996 0 0
    1056 1999 0 0
    1056 2000 0 0
    1056 2001 0 0
    1056 2002 0 0
    1056 2003 0 0
    1056 2004 0 0
    1056 2005 0 1
    1056 2006 0 0
    1072 2000 24.463 0
    1072 2001 26.201 0
    1072 2002 26.146 0
    1072 2003 26.048 0
    1072 2004 26.022 0
    1072 2005 25.862 1
    1072 2006 25.819 0
    1072 2007 27.466 0
    1076 1998 .837 0
    1076 1999 .8 0
    1076 2000 .792 0
    1076 2001 .797 0
    1076 2002 .833 0
    1076 2003 1.09 0
    1076 2004 1.954 0
    1076 2005 2.693 0
    1076 2006 3.021 1
    1076 2007 3.307 0
    1078 1993 562.344 0
    1078 1994 615.271 0
    1078 1995 666.406 0
    1078 1996 748.659 0
    1078 1997 825.138 0
    1078 1998 917.611 0
    1078 1999 1038.895 0
    1078 2000 1176.694 0
    1078 2001 1303.534 0
    end


  • #2
    Code:
    clear
    input int(gvkey fyear) float dividends byte fraud_year
    1004 1994 7.65 0
    1004 1995 7.676 0
    1004 1996 7.976 0
    1004 1997 9.118 0
    1004 1998 9.375 0
    1004 1999 9.218 1
    1004 2000 9.157 0
    1004 2001 4.43 0
    1004 2002 .797 0
    1004 2003 0 0
    1004 2004 0 0
    1004 2005 0 0
    1004 2006 0 0
    1004 2007 0 0
    1013 1993 0 0
    1013 1994 0 0
    1013 1995 0 0
    1013 1996 0 0
    1013 1997 0 0
    1013 1998 0 0
    1013 1999 0 0
    1013 2000 0 0
    1013 2001 0 0
    1013 2002 0 0
    1013 2003 0 0
    1013 2004 0 0
    1013 2005 0 1
    1013 2006 0 0
    1013 2007 0 0
    1013 2008 0 0
    1034 1993 3.873 0
    1034 1994 3.893 0
    1034 1995 3.914 0
    1034 1996 3.928 0
    1034 1997 4.198 0
    1034 1998 4.651 0
    1034 1999 5.061 0
    1034 2000 6.526 0
    1034 2001 7.541 0
    1034 2002 9.235 0
    1034 2003 9.32 1
    1034 2004 9.404 0
    1034 2005 9.481 0
    1034 2006 7.384 0
    1034 2007 0 0
    1045 1993 60 0
    1045 1994 56 0
    1045 1995 5 0
    1045 1996 0 0
    1045 1997 0 0
    1045 1998 0 0
    1045 1999 0 0
    1045 2000 0 1
    1045 2001 0 0
    1045 2002 0 0
    1045 2003 0 0
    1045 2004 0 0
    1045 2005 0 0
    1045 2006 0 0
    1045 2007 0 0
    1045 2008 0 0
    1055 1993 0 0
    1055 1994 0 0
    1055 1995 0 0
    1055 1996 0 0
    1056 1999 0 0
    1056 2000 0 0
    1056 2001 0 0
    1056 2002 0 0
    1056 2003 0 0
    1056 2004 0 0
    1056 2005 0 1
    1056 2006 0 0
    1072 2000 24.463 0
    1072 2001 26.201 0
    1072 2002 26.146 0
    1072 2003 26.048 0
    1072 2004 26.022 0
    1072 2005 25.862 1
    1072 2006 25.819 0
    1072 2007 27.466 0
    1076 1998 .837 0
    1076 1999 .8 0
    1076 2000 .792 0
    1076 2001 .797 0
    1076 2002 .833 0
    1076 2003 1.09 0
    1076 2004 1.954 0
    1076 2005 2.693 0
    1076 2006 3.021 1
    1076 2007 3.307 0
    1078 1993 562.344 0
    1078 1994 615.271 0
    1078 1995 666.406 0
    1078 1996 748.659 0
    1078 1997 825.138 0
    1078 1998 917.611 0
    1078 1999 1038.895 0
    1078 2000 1176.694 0
    1078 2001 1303.534 0
    end
    
    
    bysort gvkey (fyear): gen new = sum(fraud_year)
    replace new = 2 if fraud_year == 0 & new == 1
    
    
    * 2-year window
    gen window = .
    bysort gvkey (fyear): replace window = 1 if fraud_year[_n+1] == 1 | fraud_year[_n+2] == 1
    bysort gvkey (fyear): replace window = 2 if fraud_year[_n-1] == 1 | fraud_year[_n-2] == 1
    
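    list gvkey fyear fraud_year new window in 1/100, sepby(gvkey)

    This gives you a new variable, new, coded 0 in pre-fraud years, 1 in the fraud year and 2 in post-fraud years. The variable window is 1 in the two years before the fraud year, 2 in the two years after it, and missing otherwise (including the fraud year itself).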

    Edit: added information on how to include a window.
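
One caveat on the window code: `fraud_year[_n+1]` and `fraud_year[_n-1]` refer to neighbouring rows, not neighbouring calendar years, so a firm with a missing year in its panel can have the wrong row tagged. A minimal sketch of a gap-safe variant, assuming one row per gvkey and fyear, uses `xtset` with the lead and lag operators `F.` and `L.`. The toy data (one firm, with 2005 missing) and the name window_ts are made up for illustration:

```stata
* Toy panel, hypothetical: one firm, fraud in 2003, year 2005 missing
clear
input int(gvkey fyear) byte fraud_year
1 2000 0
1 2001 0
1 2002 0
1 2003 1
1 2004 0
1 2006 0
1 2007 0
end

* Declare the panel so that F. and L. step through calendar years
xtset gvkey fyear

gen byte window_ts = .
* pre window: the fraud year is one or two calendar years ahead
replace window_ts = 1 if F1.fraud_year == 1 | F2.fraud_year == 1
* post window: the fraud year is one or two calendar years back
replace window_ts = 2 if L1.fraud_year == 1 | L2.fraud_year == 1

list, sepby(gvkey)
```

In this toy panel the row-based version would tag 2006 as a post year because it sits two rows after 2003, whereas the operator-based version leaves it missing because it is three calendar years later.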
    Last edited by Felix Bittmann; 17 Sep 2021, 04:52.
    Best wishes

    Stata 18.0 MP | ORCID | Google Scholar



    • #3
      Dear Felix Bittmann , The code worked great for the two years window. Thank you so much.



      • #4
        Dear Felix Bittmann, following up on the previous question, may I ask a few more? I need to do the following:
        1. If a firm has committed fraud in consecutive years, I wish to keep only the first year and then create a 2-year window around it. If data are missing for any part of the window, I wish to exclude that firm. How do I do that?
        2. I wish to create a control sample of non-fraudulent firms matched on variables such as firm size and industry. The selection of the control group is based on the fraud year of the fraudulent firm.
        The purpose of these questions is to compare the dividend behaviour of fraudulent and non-fraudulent firms. Please correct me if I am misunderstanding something.
        I highly appreciate your time and consideration.
        Last edited by Miral Ayan; 18 Sep 2021, 03:26.



        • #5
          Hi Miral Ayan,

          I think that the larger your data set, the more data-cleaning problems you are likely to run into. My suggestion does not cover every case that could arise in your data, but you may want to try the following code:

          Code:
          bys gvkey (fyear): egen fraud_year_count = total(fraud_year)
          drop if fraud_year_count == 0 // drop firms that never have a fraud year
          
          bys gvkey (fyear): gen dropcase = (fraud_year[_n]==1 & fraud_year[_n-1]==1)
          drop if dropcase == 1 // within a run of consecutive fraud years, keep only the first
          drop dropcase fraud_year_count
          
          * re-generate fraud_year_count after drop in the step above
          bys gvkey (fyear): egen fraud_year_count = total(fraud_year)
          
          * counting number of window periods
          bys gvkey: egen yeartag1 = max(cond(fraud_year==1, fyear, 0))     // latest fraud year
          bys gvkey: egen yeartag2 = min(cond(fraud_year==1, fyear, 9999))  // earliest fraud year; equals yeartag1 if the firm has only one fraud year
          
          * total() sums the 0/1 indicator, so it counts non-fraud years; count() would count every nonmissing row
          bys gvkey (fyear): egen casetag11 = total(fraud_year==0) if fyear > yeartag1  // non-fraud years after the latest fraud year
          bys gvkey (fyear): egen casetag12 = total(fraud_year==0) if fyear < yeartag1  // non-fraud years before the latest fraud year
          gen casetag1 = cond(mi(casetag12), casetag11, casetag12) // combine the two counts into one variable
          
          bys gvkey (fyear): egen casetag21 = total(fraud_year==0) if fyear > yeartag2 // non-fraud years after the earliest fraud year
          bys gvkey (fyear): egen casetag22 = total(fraud_year==0) if fyear < yeartag2 // non-fraud years before the earliest fraud year
          gen casetag2 = cond(mi(casetag22), casetag21, casetag22)
          
          * the counts above show how many window years are available on each side, so you can choose the window length accordingly
          
          ** Drop firms with a single fraud year and fewer than 2 years on either side, or whose first or last observation is a fraud year
          bys gvkey (fyear): gen dropcase2 = (fraud_year_count==1 & casetag1<2) | fraud_year[_N]==1 | fraud_year[1]==1
          bys gvkey: egen dropcase3 = max(dropcase2)
          drop dropcase2
          drop if dropcase3==1
          
          ** For firms with more than one fraud year, I think it is better to inspect each case individually
          Good luck!
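
A more compact route to the same goal can be sketched in event time, assuming at most one row per gvkey and fyear: take the first fraud year per firm as the event year, measure every row relative to it, and keep only firms that have all four window years t-2, t-1, t+1 and t+2. The names first_fraud, rel, n_win and post and the toy data are illustrative and not part of the code above. Later fraud years of a consecutive run stay in the data and simply fall into the post window; whether that is appropriate is a judgment call.

```stata
* Toy panel, hypothetical: firm 1 has consecutive fraud years and a full window,
* firm 2 lacks year t-2, firm 3 never commits fraud
clear
input int(gvkey fyear) byte fraud_year
1 2000 0
1 2001 0
1 2002 0
1 2003 1
1 2004 1
1 2005 0
1 2006 0
2 2000 0
2 2001 1
2 2002 0
2 2003 0
2 2004 0
3 2000 0
3 2001 0
3 2002 0
3 2003 0
end

* First fraud year per firm; missing for firms that never commit fraud
bysort gvkey: egen first_fraud = min(cond(fraud_year == 1, fyear, .))

* Event time: 0 in the first fraud year, negative before, positive after
gen rel = fyear - first_fraud

* Number of observed window years among t-2, t-1, t+1, t+2
bysort gvkey: egen n_win = total(inrange(rel, -2, 2) & rel != 0)

* Keep only firms with the complete window; this also drops never-fraud firms,
* so skip this step if those firms are needed later as controls
drop if n_win < 4

* 0 = pre window, 1 = post window, missing outside the window and in the event year
gen byte post = rel > 0 if inrange(rel, -2, 2) & rel != 0

list gvkey fyear fraud_year rel post, sepby(gvkey)
```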



          • #6
            Hi cu dao huy, thank you so much for your time. Much appreciated



            • #7
              Dear Felix Bittmann, I have used your code, and it created a two-year window around the event while leaving the fraud year blank. How can I graph this pre and post window around the event date to show whether a variable (for instance, dividends) has changed from pre to post? Pre should include the previous two years (t-2, t-1) and post should include (t+1, t+2), where t is the event (fraud) year. I would be grateful for your help with the code.

              Thanks in advance



              • #8
                There are many ways to graph this. A simple option is a bar graph of the mean of your outcome variable (for example, dividends) within each category of window:

                Code:
                graph bar outcome, over(window)
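
If the year-by-year pattern matters, one alternative is to plot mean dividends against event time. The sketch below is only illustrative: it reuses ten rows from the example data in post #1 (firms 1004 and 1034), and the names event_year and rel are assumptions introduced here.

```stata
* Rows taken from the example data: two firms, five years each around the fraud year
clear
input int(gvkey fyear) float dividends byte fraud_year
1004 1997 9.118 0
1004 1998 9.375 0
1004 1999 9.218 1
1004 2000 9.157 0
1004 2001 4.43 0
1034 2001 7.541 0
1034 2002 9.235 0
1034 2003 9.32 1
1034 2004 9.404 0
1034 2005 9.481 0
end

* Event year per firm and years relative to it
bysort gvkey: egen event_year = min(cond(fraud_year == 1, fyear, .))
gen rel = fyear - event_year

* Keep t-2, t-1, t+1, t+2 and average dividends across firms within each relative year
keep if inrange(rel, -2, 2) & rel != 0
collapse (mean) dividends, by(rel)

* Connected line plot with a vertical line marking the (excluded) fraud year
twoway connected dividends rel, xline(0) xlabel(-2 -1 1 2) ///
    xtitle("Years relative to fraud year") ytitle("Mean dividends")
```

Note that `collapse` replaces the data in memory, so wrap the block in `preserve` and `restore` if the firm-level data are needed afterwards.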
                Best wishes

