Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to create Initial Audit variable with Auditor Change and Audit Fee data

    Hi all,

    for my research note, I am investigating the effect of initial audits on audit fees.
    I generated with Auditor Change data from Audit Analytics a sample of initial audits for 6011 companies for the years 2005-2009 and cleaned this data in such a way that I have 1 auditor change for every company.
    For my research, I need to compare the audit fees of companies with initial audits to companies that didn't change their auditor in the period 2005-2009.
    To investigate this I need an indicator variable which is 1 for an initial audit and 0 otherwise.
    Therefore, I also downloaded a sample with Audit Fees data from Audit Analytics with audit fee data for companies in the same period 2005-2009.
    Both samples have the same company identifier CIK and have the same year variable fyear.
    For my other variables, I will later merge this sample with a Compustat sample and therefore I already created the company identifier CIK and the year variable fyear.

    But, now I don't know how I can put these companies with initial audits of the first sample in the second sample.
    In the end, I want a list of CIK codes and audit fees and the variable INITIAL which is sometimes 1 if the company had an initial audit in that year and 0 otherwise, illustrated as follows:

    CIK fyear INITIAL Fee
    1 2005 0
    1 2006 0
    1 2007 1
    1 2008 0
    1 2009 0
    2 2005 0
    2 2006 1
    2 2007 0
    2 2008 0
    2 2009 0
    3 2005 1
    3 2006 0
    3 2007 0
    3 2008 0
    3 2009 0
    4 2005 0
    4 2006 1
    4 2007 0
    4 2008 0
    4 2009 0
    5 2005 0
    5 2006 0
    5 2007 0
    5 2008 1
    5 2009 0

    My current auditor change sample looks as follows:

    [CODE]
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double(dismiss_key engaged_auditor_key) long cik double(latest_fees_fye latest_fees_sum) float fyear
    6 598 1961 20181231 12500 2007
    4 7 2034 20180630 2028000 2005
    2 1687 2098 20181231 328000 2008
    7 2 3116 20181231 2543000 2007
    3 358 3197 20181231 1654340 2005
    358 7 3197 20181231 1654340 2008
    2 3 3521 20191031 369800 2009
    459 10 3545 20180930 405000 2007
    1662 1687 3570 20181231 6663330 2006
    1687 2 3570 20181231 6663330 2007
    1 3 3673 20101231 3715620 2007
    6 7 3952 20121231 75000 2005
    1662 1687 3982 20101231 813661 2006
    3 8 4164 20051231 504000 2006
    8 1829 4187 20181231 128680 2005
    2 1929 4317 20060331 148000 2007
    498 2 5117 20120831 348285 2005
    2 10 5117 20120831 348285 2006
    428 142 5138 20130731 32850 2006
    1 3 5337 20101231 206000 2007
    3 2 5352 20090331 28600 2005
    1 2356 5768 20160331 400000 2005
    4 3 6314 20130630 560000 2008
    1 2 6383 20061231 1282000 2006
    90 823 6951 20181028 7776000 2005
    1 6 7332 20181231 2724590 2007
    6 517 7426 20120630 9000 2005
    517 686 7426 20120630 9000 2009
    10 27 7623 20181130 112193 2006
    10 348 7951 20120930 29000 2005
    2 6 8146 20190131 224500 2005
    3 7 8177 20181231 452000 2006
    1416 2248 8328 20181231 227657 2008
    331 240 8497 20111231 201525 2006
    54 686 8855 20131231 97000 2005

    Does anyone know how I can do this? I would be really really happy to receive an answer.

    Kind regards

    Floor Muijres
    Last edited by Floor Muijres; 07 May 2020, 14:26.

  • #2
    Hi all,

    I already solved the problem.
    I first created a variable INITIAL for the Auditor Change sample with the following code:

    sort cik fyear
    by cik: gen INITIAL = cik

    Then, I opened the Audit fee sample and merged this sample with the Auditor changes sample with the code:

    merge 1:1 cik fyear using "Auditor changes.dta"

    Now I have a dataset that looks like this:

    CIK fyear INITIAL Fee
    1 2005
    1 2006
    1 2007 1
    1 2008
    1 2009
    2 2005
    2 2006 2
    2 2007
    2 2008
    2 2009
    3 2005 3
    3 2006
    3 2007
    3 2008
    3 2009
    4 2005
    4 2006 4
    4 2007
    4 2008
    4 2009
    5 2005
    5 2006
    5 2007
    5 2008 5
    5 2009

    Now I have for the firms with an initial audit the CIK code under INITIAL for the corresponding fyear and for the other company-year observations I have a blank space.
    Does anyone know how I can create a dummy variable Initial_audit with this information?
    I would be very happy to receive an answer

    Kind regards,

    Floor



    Comment


    • #3
      Does anyone know how I can create a dummy variable Initial_audit with this information?
      It's not clear to me what you're asking, but for what it's worth here's two of my interpretations/guesses:

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str1 cik str4 fyear str1 initialfee
      "1" "2005" "" 
      "1" "2006" "" 
      "1" "2007" "1"
      "1" "2008" "" 
      "1" "2009" "" 
      "2" "2005" "" 
      "2" "2006" "2"
      "2" "2007" "" 
      "2" "2008" "" 
      "2" "2009" "" 
      "3" "2005" "3"
      "3" "2006" "" 
      "3" "2007" "" 
      "3" "2008" "" 
      "3" "2009" "" 
      "4" "2005" "" 
      "4" "2006" "4"
      "4" "2007" "" 
      "4" "2008" "" 
      "4" "2009" "" 
      "5" "2005" "" 
      "5" "2006" "" 
      "5" "2007" "" 
      "5" "2008" "5"
      "5" "2009" "" 
      end
      
      
      // dummy variable for initial year 
      gen Initial_audit = (initialfee != "")
      
      
      // dummy variable for years after/including initial fee 
      clonevar initialfee_fillin = initialfee
      bys cik (fyear): replace initialfee_fillin = initialfee_fillin[_n-1] if initialfee_fillin == ""
      gen Initial_audit_alt = (initialfee_fillin != "")
      drop initialfee_fillin
      
      list, noobs
      
      +----------------------------------------------+
        | cik   fyear   initia~e   Initi~it   Initi~lt |
        |----------------------------------------------|
        |   1    2005                     0          0 |
        |   1    2006                     0          0 |
        |   1    2007          1          1          1 |
        |   1    2008                     0          1 |
        |   1    2009                     0          1 |
        |----------------------------------------------|
        |   2    2005                     0          0 |
        |   2    2006          2          1          1 |
        |   2    2007                     0          1 |
        |   2    2008                     0          1 |
        |   2    2009                     0          1 |
        |----------------------------------------------|
        |   3    2005          3          1          1 |
        |   3    2006                     0          1 |
        |   3    2007                     0          1 |
        |   3    2008                     0          1 |
        |   3    2009                     0          1 |
        |----------------------------------------------|
        |   4    2005                     0          0 |
        |   4    2006          4          1          1 |
        |   4    2007                     0          1 |
        |   4    2008                     0          1 |
        |   4    2009                     0          1 |
        |----------------------------------------------|
        |   5    2005                     0          0 |
        |   5    2006                     0          0 |
        |   5    2007                     0          0 |
        |   5    2008          5          1          1 |
        |   5    2009                     0          1 |
        +----------------------------------------------+

      Comment


      • #4
        Hi,

        thank you very much for your solution.
        The first code was exactly what I needed!

        Kind regards,

        Floor

        Comment

        Working...
        X