Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • auditor tenure

    Hi All

    I kindly request your help below.

    My data consists of 'clientid' (client), 'auditorid' (auditor), year and 'tenure' (number of years auditor has audited client).

    1. I would like to create new_var1 as shown below. New_var1 is such that for each client, if the last observation of tenure for an auditor is greater than or equal to 5, new_var receives an indicator 1, otherwise 0

    2. new_var2 is such that, for each client, if an auditor has new_var1 =1, then the next auditor(if there is) will have an indicator of 1, otherwise 0 as shown below

    3. new_var3 is such that, for each client, if the last observation of tenure for that auditor is less than 5 and there is a new auditor the following year for that same client, new_var3 receives an indicator 1, otherwise 0 as shown below

    4.new_var4 is such that for each client, if an auditor has new_var3 =1, then the next auditor (if there is) will have an indicator of 1, otherwise 0 as shown below.


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte clientid int(auditorid year) byte(tenure new_var1 new_var2 new_var3 new_var4)
     2  778 2009 1 0 0 0 0
     2  778 2010 2 0 0 0 0
     2  778 2011 3 0 0 0 0
     2  778 2012 4 0 0 0 0
     2  778 2013 5 0 0 0 0
     2  778 2014 6 1 0 0 0
     2  628 2015 1 0 1 0 0
     2  628 2016 2 0 0 0 0
     2  628 2017 3 0 0 0 0
     3  801 2013 1 0 0 0 0
     3  801 2014 2 0 0 0 0
     3  801 2015 3 0 0 0 0
     3  801 2016 4 0 0 0 0
     3  801 2017 5 1 0 0 0
     4  888 2010 1 0 0 1 0
     4 1073 2011 1 0 0 1 1
     4  875 2012 1 0 0 0 1
     4  875 2013 2 0 0 0 0
     4  875 2014 3 0 0 0 0
     4  875 2015 4 0 0 0 0
     4  875 2016 5 0 0 0 0
     4  875 2017 6 1 0 0 0
     5  810 2010 1 0 0 0 0
     5  810 2011 2 0 0 0 0
     5  810 2012 3 0 0 1 0
     5  581 2013 1 0 0 0 1
     5  581 2014 2 0 0 1 0
     5  618 2015 1 0 0 0 1
     5  618 2016 2 0 0 0 0
     5  618 2017 3 0 0 0 0
     6  971 2011 1 0 0 0 0
     6  971 2012 2 0 0 0 0
     6  971 2013 3 0 0 0 0
     6  971 2014 4 0 0 0 0
     6  971 2015 5 0 0 0 0
     6  971 2016 6 1 0 0 0
     6  254 2017 1 0 1 0 0
     8  638 2010 1 0 0 0 0
     8  638 2011 2 0 0 0 0
     8  638 2012 3 0 0 0 0
     8  638 2013 4 0 0 0 0
     8  638 2014 5 0 0 0 0
     8  638 2015 6 0 0 0 0
     8  638 2016 7 1 0 0 0
    10  495 2010 1 0 0 0 0
    10  495 2011 2 0 0 0 0
    10  495 2012 3 0 0 0 0
    10  495 2013 4 0 0 0 0
    10  495 2014 5 0 0 0 0
    10  495 2015 6 0 0 0 0
    10  495 2016 7 1 0 0 0
    13  156 2012 1 0 0 1 0
    13  475 2013 1 0 0 0 1
    13  475 2014 2 0 0 0 0
    13  475 2015 3 0 0 0 0
    13  475 2016 4 0 0 0 0
    13  475 2017 5 1 0 0 0
    14  898 2012 1 0 0 0 0
    14  898 2013 2 0 0 0 0
    14  898 2014 3 0 0 0 0
    14  898 2015 4 0 0 0 0
    14  898 2016 5 1 0 0 0
    14  482 2017 1 0 1 0 0
    15  337 2011 1 0 0 1 0
    15  400 2012 1 0 0 0 1
    17  978 2009 1 0 0 0 0
    17  978 2010 2 0 0 0 0
    17  978 2011 3 0 0 0 0
    17  978 2012 4 0 0 0 0
    17  978 2013 5 1 0 0 0
    17  699 2014 1 0 1 0 0
    17  699 2015 2 0 0 0 0
    17  699 2016 3 0 0 0 0
    17  699 2017 4 0 0 0 0
    18   45 2010 1 0 0 0 0
    18   45 2011 2 0 0 0 0
    18   45 2012 3 0 0 0 0
    18   45 2013 4 0 0 0 0
    18   45 2014 5 1 0 0 0
    18  139 2015 1 0 1 0 0
    18  139 2016 2 0 0 1 0
    18  773 2017 1 0 0 0 1
    19  743 2011 1 0 0 0 0
    19  743 2012 2 0 0 0 0
    19  743 2013 3 0 0 0 0
    19  743 2014 4 0 0 0 0
    19  743 2015 5 1 0 0 0
    19  805 2016 1 0 1 0 0
    19  805 2017 2 0 0 0 0
    20  116 2011 1 0 0 0 0
    20  116 2012 2 0 0 0 0
    20  116 2013 3 0 0 0 0
    20  116 2014 4 0 0 0 0
    20  116 2015 5 1 0 0 0
    20 1038 2016 1 0 1 0 0
    20 1038 2017 2 0 0 0 0
    end




  • #2
    Probably not the most efficient code, but it does what you want:
    Code:
    sort clientid auditorid year
    by clientid auditorid: gen n=_n
    by clientid auditorid: egen maxn=max(n)
    gen new_var1=1 if tenure>=5 & n==maxn
    recode new_var1 (.=0)
    
    xtset clientid year
    gen new_var2=l.new_var1
    recode new_var2 (.=0)
    
    gen aux=1 if tenure<5 & n==maxn
    recode aux (.=0)
    by clientid: gen new_var3=aux if auditorid!=f.auditorid
    recode new_var3 (.=0)
    by clientid: gen n2=_n
    by clientid: egen maxn2=max(n2)
    recode new_var3 (1=0) if n2==maxn2
    
    gen new_var4=1 if l.new_var3==1 & auditorid!=l.auditorid & clientid==l.clientid
    recode new_var4 (.=0)
    
    drop n n2 maxn maxn2 aux

    Comment


    • #3
      Hi Salvatore, this is great coding and i am most grateful. Thanks a million.


      One thing; new_var4 doesn't really give the required results. However when i follow your thought for generating new_var2 and repeat same for new_var4, I get the required results.

      Comment


      • #4
        Just a generic comment on the very helpful code of Salvatore Lattanzio : Pairs of statements like

        Code:
        gen new_var4=1 if l.new_var3==1 & auditorid!=l.auditorid & clientid==l.clientid
        recode new_var4 (.=0)
        can be collapsed to single statements like

        Code:
        gen new_var4 = l.new_var3==1 & auditorid!=l.auditorid & clientid==l.clientid
        The reason is that true-or-false statements are automatically evaluated as 1 if true and 0 if false. For more see https://www.stata.com/support/faqs/d...rue-and-false/ and for yet more see https://journals.sagepub.com/doi/abs...36867X19830921

        As another example, I think that

        Code:
        sort clientid auditorid year
        by clientid auditorid: gen n=_n
        by clientid auditorid: egen maxn=max(n)
        gen new_var1=1 if tenure>=5 & n==maxn
        recode new_var1 (.=0)
        can be rewritten as

        Code:
        bysort clientid auditorid (year): gen new_var1 =  tenure>=5 & _n==_N
        Last edited by Nick Cox; 23 Feb 2020, 02:52.

        Comment


        • #5
          Thank you Nick Cox . having the luxury of both ideas goes a long way in helping my coding. very appreciative of Salvatore Lattanzio and your immense assistance.

          Comment


          • #6
            Hi Nick and Salvatore. Unfortunately I would like to revisit this code again. Particularly for newvar 3 and newvar4. See example below

            . dataex clientid partnerid year tenure_ep max_tenure n2 maxn newvar3 newvar4 if clientid==40

            ----------------------- copy starting from the next line -----------------------
            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input float(clientid partnerid) int year float(tenure_ep max_tenure n2 maxn2 newvar3 newvar4)
            40 413 2013 1 4 1 5 0 0
            40 413 2014 2 4 2 5 0 0
            40 415 2015 1 1 3 5 1 0
            40 413 2016 3 4 4 5 0 1
            40 413 2017 4 4 5 5 0 0
            end
            ------------------ copy up to and including the previous line ------------------



            You realise that auditor 413 was changed in 2015 and therefore newvar3 should be 1 in 2014 and newvar4 also 1 in 2015. Again this same auditor was brought back in 2016. I think given that the auditor is changed and brought back, the coding seem not to reflect this. This repeats itself for a couple of my observations when an auditor is taken off and then brought back for a particular client. Any help pls ...

            NB: Pls note such practice is allowed; meaning data is correct.

            Comment


            • #7
              Any further assistance will be greatly appreciated.
              Last edited by Kwadwo Boateng; 10 Jul 2020, 18:20.

              Comment

              Working...
              X