
  • generate new variable using egen with sum/count

    Hello everyone, hope you all have a good day ahead.

    I have one question. I have a data set that consists of ID, Year, and Dummy_rich (shown as rich below).

    Year ID rich
    2005 1101 0
    2006 1101 0
    2007 1101 1
    2008 1101 0
    2009 1101 0
    2010 1101 1
    2011 1101 1
    2012 1101 0
    2013 1101 1
    2005 1102 0
    2006 1102 0
    2007 1102 0
    2008 1102 0
    2009 1102 1
    2010 1102 1
    2011 1102 0
    2012 1102 1
    2013 1102 2
    2005 1103 1
    2006 1103 0
    2007 1103 1
    2008 1103 1
    2009 1103 0
    2010 1103 3
    2011 1103 1

    I want to create another variable that adds up rich over the years within each ID, but not as a single total. I don't know what this kind of sum is called. This is the result I want:

    Year ID rich sum_rich
    2005 1101 0 0
    2006 1101 0 0
    2007 1101 1 1
    2008 1101 0 1
    2009 1101 0 1
    2010 1101 1 2
    2011 1101 1 3
    2012 1101 0 3
    2013 1101 1 4
    2005 1102 0 0
    2006 1102 0 0
    2007 1102 0 0
    2008 1102 0 0
    2009 1102 1 1
    2010 1102 1 2
    2011 1102 0 2
    2012 1102 1 3
    2013 1102 2 5
    2005 1103 1 1
    2006 1103 0 1
    2007 1103 1 2
    2008 1103 1 3
    2009 1103 0 3
    2010 1103 3 6
    2011 1103 1 7

    Could you help me find out what this kind of sum is called and which syntax I could use in Stata? Thank you so much.

    FYI, I have already tried several commands, such as egen count, egen sum, egen total, gen _n, and so on.

  • #2


    I would call that a cumulative or running sum, and it is what the function sum() provides.

    Code:
    help sum()
    Code:
    clear 
    input Year ID rich 
    2005 1101 0
    2006 1101 0
    2007 1101 1
    2008 1101 0
    2009 1101 0
    2010 1101 1
    2011 1101 1
    2012 1101 0
    2013 1101 1
    2005 1102 0
    2006 1102 0
    2007 1102 0
    2008 1102 0
    2009 1102 1
    2010 1102 1
    2011 1102 0
    2012 1102 1
    2013 1102 2
    2005 1103 1
    2006 1103 0
    2007 1103 1
    2008 1103 1
    2009 1103 0
    2010 1103 3
    2011 1103 1
    end 
    
    bysort ID (Year) : gen wanted = sum(rich) 
    
    list, sepby(ID) 
    
         +-----------------------------+
         | Year     ID   rich   wanted |
         |-----------------------------|
      1. | 2005   1101      0        0 |
      2. | 2006   1101      0        0 |
      3. | 2007   1101      1        1 |
      4. | 2008   1101      0        1 |
      5. | 2009   1101      0        1 |
      6. | 2010   1101      1        2 |
      7. | 2011   1101      1        3 |
      8. | 2012   1101      0        3 |
      9. | 2013   1101      1        4 |
         |-----------------------------|
     10. | 2005   1102      0        0 |
     11. | 2006   1102      0        0 |
     12. | 2007   1102      0        0 |
     13. | 2008   1102      0        0 |
     14. | 2009   1102      1        1 |
     15. | 2010   1102      1        2 |
     16. | 2011   1102      0        2 |
     17. | 2012   1102      1        3 |
     18. | 2013   1102      2        5 |
         |-----------------------------|
     19. | 2005   1103      1        1 |
     20. | 2006   1103      0        1 |
     21. | 2007   1103      1        2 |
     22. | 2008   1103      1        3 |
     23. | 2009   1103      0        3 |
     24. | 2010   1103      3        6 |
     25. | 2011   1103      1        7 |
         +-----------------------------+
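
    For contrast with the egen attempts mentioned in #1, here is a minimal sketch, assuming the example data above are already in memory. The variable names running_rich and total_rich are made up for illustration. The egen function total() returns one total per group, repeated on every observation, whereas sum() used with generate accumulates observation by observation. Note that sum() treats missing values as zero.

    ```stata
    * Assumes Year, ID and rich from the example above are in memory.
    * running_rich and total_rich are illustrative names only.

    * sum() with generate: running (cumulative) sum within ID, in Year order
    bysort ID (Year): gen running_rich = sum(rich)

    * total() with egen: a single total per ID, repeated on every row
    bysort ID: egen total_rich = total(rich)

    * In the last year of each ID the two should agree
    bysort ID (Year): assert running_rich == total_rich if _n == _N

    list, sepby(ID)
    ```

    The (Year) in parentheses sorts within ID without creating separate groups for each year, so the running sum follows time order within each ID.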
