  • egen, group

    Dear All, Suppose that I have this data set (the original question is here),
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int(A B) str9 C float(ID1 ID2)
    1 2010 "艾一" 2 1
    1 2011 "张三" 1 2
    1 2012 "张三" 1 2
    2 2010 "李四" 3 3
    2 2011 "李四" 3 3
    3 2012 "车八" 6 4
    3 2013 "王五" 5 5
    3 2014 "李白" 4 6
    end
    The raw data have three variables, A, B, and C (names). If I use
    Code:
    egen ID1=group(A C)
    I obtain the variable ID1. However, the desired outcome is ID2: the order of the names in C is kept unchanged, and the group number increases with the current sort order. Any suggestions? Thanks.
    Ho-Chuan (River) Huang
    Stata 19.0, MP(4)

  • #2
    Code:
    clear
    input int(A B) str9 C float(ID1 ID2)
    1 2010 "艾一" 2 1
    1 2011 "张三" 1 2
    1 2012 "张三" 1 2
    2 2010 "李四" 3 3
    2 2011 "李四" 3 3
    3 2012 "车八" 6 4
    3 2013 "王五" 5 5
    3 2014 "李白" 4 6
    end
    
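    * Running count of name changes in the current sort order:
    * C2 goes up by 1 whenever C differs from the previous observation
    gen C2 = sum(C != C[_n-1])
    
    * Group on A as well, so that the same name in two different values of A is split
    egen wanted = group(A C2)
    
    list, sepby(wanted)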
    
         +-------------------------------------------+
         | A      B      C   ID1   ID2   C2   wanted |
         |-------------------------------------------|
      1. | 1   2010   艾一     2     1    1        1 |
         |-------------------------------------------|
      2. | 1   2011   张三     1     2    2        2 |
      3. | 1   2012   张三     1     2    2        2 |
         |-------------------------------------------|
      4. | 2   2010   李四     3     3    3        3 |
      5. | 2   2011   李四     3     3    3        3 |
         |-------------------------------------------|
      6. | 3   2012   车八     6     4    4        4 |
         |-------------------------------------------|
      7. | 3   2013   王五     5     5    5        5 |
         |-------------------------------------------|
      8. | 3   2014   李白     4     6    6        6 |
         +-------------------------------------------+



    • #3
      Dear Nick, Thank you for this helpful suggestion. Just out of curiosity, will C2 always be equal to wanted?
      Last edited by River Huang; 31 Oct 2021, 03:59.
      Ho-Chuan (River) Huang
      Stata 19.0, MP(4)



      • #4
        It works for the example! What I understood to be wanted is a grouping defined by a name changing within the current sort order; I don't read Chinese and so don't know what more the OP said.



        • #5
          The OP says that A is the company ID, B is the year, and C is the name of an employee in each company. The OP wants to generate a unique ID for each company-name pair, with the IDs ordered by company ID and year, which is what Nick's solution does.
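
          A caveat, as a rough sketch only: the running-count approach in #2 starts a new group every time the name changes, so a name that leaves and later returns within the same company would get two IDs. If one ID per company-name pair is wanted even then, one possible variant is to number the pairs by the first year in which each appears. The data set below is hypothetical (not from the thread), and the names first and pairid are made up for illustration.
          Code:
          clear
          input int(A B) str9 C
          1 2010 "张三"
          1 2011 "艾一"
          1 2012 "张三"
          2 2010 "李四"
          end

          * First year in which each (A, C) pair appears
          bysort A C (B): gen first = B[1]

          * Number the pairs by company, then first year, then name
          egen pairid = group(A first C)

          sort A B
          list, sepby(A)
          With these four observations pairid is 1, 2, 1, 3, whereas the running count from #2 would give 1, 2, 3, 4.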

          To River: C2 differs from wanted when, for example, the first person in company 2 is named "张三" instead of "李四"; that is why grouping on both A and C2 is necessary.
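
          A small sketch of that case, using a shortened, modified copy of the example data in which company 2's employee is renamed "张三" (an assumption made purely for illustration):
          Code:
          clear
          input int(A B) str9 C
          1 2010 "艾一"
          1 2011 "张三"
          1 2012 "张三"
          2 2010 "张三"
          2 2011 "张三"
          end

          * The name does not change between observations 3 and 4, so C2 stays at 2
          gen C2 = sum(C != C[_n-1])

          * Adding A to the grouping separates company 1's 张三 from company 2's
          egen wanted = group(A C2)

          list, sepby(wanted)
          Here C2 is 1, 2, 2, 2, 2 while wanted is 1, 2, 2, 3, 3, so the two variables disagree for company 2.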



          • #6
            Hi, @Nick Cox and @Fei Wang, Thanks for these additional comments.
            Ho-Chuan (River) Huang
            Stata 19.0, MP(4)
