Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to create Industry specialist variable based on SIC code

    Hi all,

    For my research, I need to know for every company (CIK) in my dataset whether this company is audited by the industry specialist or not.
    I'm new to Stata and I want to create an industry specialist variable called "specialist" which equals 1 if the auditor (auditor_fkey) has a market share of 25% or bigger compared to other audit firms in the specific industry, and 0 otherwise.
    I want to calculate this market share based on the total audit fees of the audit firm in the industry, and the industries are based on their SIC code.
    So, for every company (CIK) I want to know whether the auditor of this company (auditor_fkey) is the industry specialist (specialist) or not.

    Does anyone know how to do this? Thank you in advance.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double fyear str10 cik str4 sic double(auditor_fkey audit_fees) byte chg_auditor
    2005 "0000000020" "3823"     6   244000 0
    2006 "0000000020" "3823"     6   574000 0
    2007 "0000000020" "3823"     6   669000 0
    2008 "0000000020" "3823"     6   680000 0
    2005 "0000001800" "3845"     3 11875000 0
    2006 "0000001800" "3845"     3 13966000 0
    2007 "0000001800" "3845"     3 15409000 0
    2008 "0000001800" "3845"     3 16000000 0
    2009 "0000001800" "3845"     3 14739000 0
    2005 "0000002098" "3420"     2   183000 0
    2006 "0000002098" "3420"     2   220000 0
    2007 "0000002098" "3420"     2   231000 0
    2008 "0000002098" "3420"  1687   150000 1
    2009 "0000002098" "3420"     2        0 1
    2005 "0000002178" "5172"     3   338100 0
    2006 "0000002178" "5172"     3   425534 0
    2007 "0000002178" "5172"     3   544747 0
    2008 "0000002178" "5172"     3   634253 0
    2009 "0000002178" "5172"     3   846528 0
    2005 "0000002186" "3663"     7   150500 0
    2006 "0000002186" "3663"     7   141000 0
    2007 "0000002186" "3663"     7   181500 0
    2008 "0000002186" "3663"     7   170000 0
    2009 "0000002186" "3663"     7   183750 0
    2005 "0000002488" "3674"     2  9300000 0
    2006 "0000002488" "3674"     2  4500000 0
    2007 "0000002488" "3674"     2  5100000 0
    2008 "0000002488" "3674"     2  4400000 0
    2009 "0000002488" "3674"     2  4000000 0
    2005 "0000002809" "1040"     2   627104 0
    2006 "0000002809" "1040"     2  1150200 0
    2007 "0000002809" "1040"     2  1357620 0
    2008 "0000002809" "1040"     2  1715190 0
    2009 "0000002809" "1040"     2  1536260 0
    2005 "0000003116" "2834"     7   332100 0
    2006 "0000003116" "2834"     7   526500 0
    2007 "0000003116" "2834"     7   550600 0
    2008 "0000003116" "2834"     2   805000 1
    2009 "0000003116" "2834"     2   624000 0
    2005 "0000003197" "3564"     3    57742 0
    2006 "0000003197" "3564"   358   180432 1
    2007 "0000003197" "3564"   358   426652 0
    2008 "0000003197" "3564"     7   385000 1
    2009 "0000003197" "3564"     7   415218 0
    2005 "0000003453" "4400"     3  1557150 0
    2006 "0000003453" "4400"     3  1548750 0
    2007 "0000003453" "4400"     3  1617180 0
    2008 "0000003453" "4400"     3  1605000 0
    2009 "0000003453" "4400"     3  1556000 0
    2005 "0000003570" "1311"  1662   401360 0
    2006 "0000003570" "1311"  1662   533664 0
    2007 "0000003570" "1311"     2   795525 1
    2008 "0000003570" "1311"     2  1490870 0
    2009 "0000003570" "1311"     2  1693800 0
    2005 "0000003673" "4911"     1  5005360 0
    2006 "0000003673" "4911"     1  4270670 0
    2007 "0000003673" "4911"     1  3619960 0
    2008 "0000003673" "4911"     3  3184620 1
    2009 "0000003673" "4911"     3  3571550 0
    2005 "0000003982" "1381"  1662   325542 0
    2006 "0000003982" "1381"  1687   629778 1
    2007 "0000003982" "1381"  1687  1236000 0
    2008 "0000003982" "1381"  1687  1038460 0
    2009 "0000003982" "1381"  1687   863096 0
    2005 "0000004187" "4953"     8    44500 0
    2006 "0000004187" "4953"  1829    93975 1
    2007 "0000004187" "4953"  1829    77500 0
    2008 "0000004187" "4953"  1829   107500 0
    2009 "0000004187" "4953" 10547   108950 1
    2005 "0000004447" "1311"     2  7395000 0
    2006 "0000004447" "1311"     2  8288000 0
    2007 "0000004447" "1311"     2  9228000 0
    2008 "0000004447" "1311"     2  9496000 0
    2009 "0000004447" "1311"     2  9713000 0
    2005 "0000004904" "4911"     3 12890000 0
    2006 "0000004904" "4911"     3 12644000 0
    2007 "0000004904" "4911"     3 11747000 0
    2008 "0000004904" "4911"     3 11762000 0
    2009 "0000004904" "4911"     3 11411000 0
    2005 "0000005187" "2834"     1 10600000 0
    2006 "0000005187" "2834"     1 10700000 0
    2007 "0000005187" "2834"     1 11000000 0
    2008 "0000005187" "2834"     1 12500000 0
    2005 "0000005981" "2870"     7   474000 0
    2006 "0000005981" "2870"     7   503000 0
    2007 "0000005981" "2870"     7   621000 0
    2008 "0000005981" "2870"     7   605000 0
    2009 "0000005981" "2870"     7   506000 0
    2005 "0000006176" "3460"     3   654520 0
    2006 "0000006176" "3460"     3   616798 0
    2007 "0000006176" "3460"     3   598458 0
    2008 "0000006176" "3460"     3   692250 0
    2009 "0000006176" "3460"     3   697329 0
    2005 "0000006201" "4512"     2  2275000 0
    2006 "0000006201" "4512"     2  2578000 0
    2007 "0000006201" "4512"     2  2489000 0
    2008 "0000006201" "4512"     2  2578000 0
    2009 "0000006201" "4512"     2  2570000 0
    2005 "0000006292" "7371"     3   507279 0
    2006 "0000006292" "7371"     3   456037 0
    end




  • #2
    I want to calculate this market share based on the total audit fees of the audit firm in the industry, and the industries are based on their SIC code.
    Assuming the share is calculated by SICs for the entire sample and not by fyear;

    Code:
    egen share = pc(audit_fees), by(sic)
    gen specialist = (share >= 25)

    Comment


    • #3
      Thank you for your answer.

      I tried this code, and it does identify some specialists but not in the way I need it.
      After all, I need to have for every industry (every sic code) 1 audit firm (auditor_fkey) who is the specialist in that specif industry.
      So I think I need to calculate the market share based on the audit fees for every industry separately, and then identify the specialist with the second code.
      And then with the indicator variable for the specialist, check whether the firm (CIK) is audited by the specialist (specialist = 1) or not.
      But I don't know how to calculate this.

      So, if you or anyone else knows how to do this I would be very happy.


      Comment


      • #4
        Justin's code should give you each auditors percentage of the fees charged in that specific industry by year. It essentially does both your steps in one command. I think you just want to aggregate the shares by SIC code so something like"\

        Code:
        egen share = pc(audit_fees), by(sic)
        egen totpct=total(share), by(auditor_fkey sic)
        gen specialist = (totpct >= 25)
        However, I originally thought you would want to do this by years. However, most of your years only have one observation per SIC code. So I must assume you want this done across the entire sample.

        Comment

        Working...
        X