Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Counting the occurrence of “directors name”

    Hello respected experts,

    I am counting the number of times the same director has served on a board, irrespective of id and year. I need just counting the same name if it appears in a column from a specific fixed start time, like from 2007.

    I use this formula in excel =COUNTIF($C$2:C3,C3) to it. But due to thousands and thousands of observation excel is unable to do it or its very slow.
    id year name count
    4 2007 Mao Baodi 1
    4 2007 Xiao Ziren 1
    4 2007 Pan Lingman 1
    4 2008 Pan Lingman 2
    4 2008 Mao Baodi 2
    4 2008 Xiao Ziren 2
    4 2009 Mao Baodi 3
    4 2009 Pan Lingman 3
    4 2009 Li Liangzhi 1
    4 2009 Pan Xiuling 1
    4 2009 Xiao Ziren 3
    4 2009 Wang Qi 1
    4 2010 Pan Xiuling 2
    4 2010 Li Liangzhi 2
    4 2010 Wang Qi 2
    4 2011 Wang Qi 3
    4 2011 Li Liangzhi 3
    4 2011 Pan Xiuling 3
    4 2012 Li Liangzhi 4
    4 2012 Wang Qi 4
    4 2012 Pan Xiuling 4
    Thanks and Regards,

  • #2
    Tauseef:
    welcome to this forum.
    You may want to try:
    Code:
    .  input id year str20 name count
    
                id       year                  name      count
      1. 4 2007 Mao_Baodi 1
      2. 4 2007 Xiao_Ziren 1
      3. 4 2007 Pan_Lingman 1
      4. 4 2008 Pan_Lingman 2
      5. 4 2008 Mao_Baodi 2
      6. 4 2008 Xiao_Ziren 2
      7. 4 2009 Mao_Baodi 3
      8. 4 2009 Pan_Lingman 3
      9. 4 2009 Li_Liangzhi 1
     10. 4 2009 Pan_Xiuling 1
     11. 4 2009 Xiao_Ziren 3
     12. 4 2009 Wang_Qi 1
     13. 4 2010 Pan_Xiuling 2
     14. 4 2010 Li_Liangzhi 2
     15. 4 2010 Wang_Qi 2
     16. 4 2011 Wang_Qi 3
     17. 4 2011 Li_Liangzhi 3
     18. 4 2011 Pan_Xiuling 3
     19. 4 2012 Li_Liangzhi 4
     20. 4 2012 Wang_Qi 4
     21. 4 2012 Pan_Xiuling 4
     22. end
    
    
    . sort name year
    
    . duplicates tag name if year>=2007,gen(new_count)
    
    Duplicates in terms of name
    
    . replace new_count = new_count +1
    (21 real changes made)
    
    . list
    
         +--------------------------------------------+
         | id   year          name   count   new_co~t |
         |--------------------------------------------|
      1. |  4   2009   Li_Liangzhi       1          4 |
      2. |  4   2010   Li_Liangzhi       2          4 |
      3. |  4   2011   Li_Liangzhi       3          4 |
      4. |  4   2012   Li_Liangzhi       4          4 |
      5. |  4   2007     Mao_Baodi       1          3 |
         |--------------------------------------------|
      6. |  4   2008     Mao_Baodi       2          3 |
      7. |  4   2009     Mao_Baodi       3          3 |
      8. |  4   2007   Pan_Lingman       1          3 |
      9. |  4   2008   Pan_Lingman       2          3 |
     10. |  4   2009   Pan_Lingman       3          3 |
         |--------------------------------------------|
     11. |  4   2009   Pan_Xiuling       1          4 |
     12. |  4   2010   Pan_Xiuling       2          4 |
     13. |  4   2011   Pan_Xiuling       3          4 |
     14. |  4   2012   Pan_Xiuling       4          4 |
     15. |  4   2009       Wang_Qi       1          4 |
         |--------------------------------------------|
     16. |  4   2010       Wang_Qi       2          4 |
     17. |  4   2011       Wang_Qi       3          4 |
     18. |  4   2012       Wang_Qi       4          4 |
     19. |  4   2007    Xiao_Ziren       1          3 |
     20. |  4   2008    Xiao_Ziren       2          3 |
         |--------------------------------------------|
     21. |  4   2009    Xiao_Ziren       3          3 |
         +--------------------------------------------+
    
    .
    Kind regards,
    Carlo
    (Stata 19.0)

    Comment


    • #3
      Thank you, Sir Carlo Lazzaro,

      Thank so much for your reply however, the provided codes only counts the duplicates. While I need the results just like the “count” variable. I mean if the director’s name appears for the first time the Stata should give value 1, next time if the same name appears Stata should give the vale 2 to the same name. I need results just like the forth variable named "count". The results of the provided codes which are given in the next column (new_count) do not produce similar results as the "count" variable produced in Excel. I hope I have explained the situation correctly now. Thanks, again
      id year name count new_count
      4 2009 Li Liangzhi 1 4
      4 2010 Li Liangzhi 2 4
      4 2011 Li Liangzhi 3 4
      4 2012 Li Liangzhi 4 4
      4 2007 Mao Baodi 1 3
      4 2008 Mao Baodi 2 3
      4 2009 Mao Baodi 3 3
      4 2007 Pan Lingman 1 3
      4 2008 Pan Lingman 2 3
      4 2009 Pan Lingman 3 3
      4 2009 Pan Xiuling 1 4
      4 2010 Pan Xiuling 2 4
      4 2011 Pan Xiuling 3 4
      4 2012 Pan Xiuling 4 4
      4 2009 Wang Qi 1 4
      4 2010 Wang Qi 2 4
      4 2011 Wang Qi 3 4
      4 2012 Wang Qi 4 4
      4 2007 Xiao Ziren 1 3
      4 2008 Xiao Ziren 2 3
      4 2009 Xiao Ziren 3 3


      Comment


      • #4
        I believe this does what you want:
        Code:
        bysort name (year): gen wanted = sum(name == name)
        sort year

        Comment


        • #5
          Code:
          bysort name (year) : gen wanted = _n

          Comment


          • #6
            Thank you so much @ Nick Cox and @Wouter Wakker it works. And many many congratulations @ Nick Cox on achieving 20, 000 posts. You are an asset for us. Stay blessed.

            Comment


            • #7
              Originally posted by tauseef ali View Post
              Thank you so much @ Nick Cox and @Wouter Wakker it works. And many many congratulations @ Nick Cox on achieving 20, 000 posts. You are an asset for us. Stay blessed.
              Thank you so much for your help. Now I have to a new issue and I want you expertise. Suppose there are two people on the board (i.e., LI and Mao) in two companies (i.e., companies having IDs of 4 and 8). Now, I need to count the directors experience with respect to year irrespective of the fact if the same director serves on the board in the same year in other company. For more information you can check the wanted variable.

              Code:
              * Example generated by -dataex-. To install: ssc install dataex
              clear
              input byte id int year str11 name float wanted
              4 2007 "Li"  1
              8 2007 "Li"  1
              8 2008 "Li"  2
              4 2008 "Li"  2
              4 2009 "Li"  3
              8 2009 "Li"  3
              4 2007 "Mao" 1
              8 2007 "Mao" 1
              4 2008 "Mao" 2
              8 2008 "Mao" 2
              8 2009 "Mao" 3
              4 2009 "Mao" 3
              end

              Comment


              • #8

                Code:
                bysort name (year) : gen wanted = year - year[1]  + 1

                Comment


                • #9
                  Originally posted by Nick Cox View Post
                  Code:
                  bysort name (year) : gen wanted = year - year[1] + 1
                  Thank you so much for your feedback. The provided codes are perfect.

                  Comment


                  • #10
                    Originally posted by Nick Cox View Post
                    Code:
                    bysort name (year) : gen wanted = year - year[1] + 1
                    Dear Stata Experts
                    I need the Stata code for the following data. I need a wanted variable having a dummy value equal to 1 if the “employeeexperience” is greater than 2, and if an employ having personal ID (i.e., 1001, 1002, 1003 provided in column 3) has worked in his initial career for consecutive 2 years provide that the “CEOdummy” is 1 else 0. In simple words, I need a wanted variable equal to 1 if “employeeexperience” is more than 2 years, and if he has consecutive 2 years of experience in his initial career with a CEOdummy == 1 else 0.
                    Please receive the data as bellow;
                    Code:
                    * Example generated by -dataex-. To install: ssc install dataex
                    clear
                    input float(id year personalID CEOdummy employeeexperience wanted)
                    4 2001 1002 1 1 .
                    4 2001 1001 1 1 .
                    4 2002 1001 0 2 .
                    4 2002 1003 0 2 .
                    4 2003 1003 0 3 0
                    4 2003 1001 0 3 1
                    4 2004 1001 1 4 1
                    4 2004 1002 1 4 1
                    8 2001 1001 0 1 .
                    8 2001 1003 0 1 .
                    8 2002 1001 1 2 .
                    8 2002 1002 1 2 .
                    8 2003 1001 0 3 1
                    8 2003 1002 0 3 1
                    8 2004 1003 1 4 0
                    8 2004 1001 1 4 1
                    end

                    Comment

                    Working...
                    X