Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Calculating sum of variable for groups based on specified time-frames

    Hi all,

    I am using Stata 14.2 on Windows and am currently working with some financial data.

    I want to generate a variable, displaying the sum of the return on assets from a period t to t+2 for each company, specified by the ds_code in a way that the generated value is represented as new variable in the row of year t.

    I first only aggregated the roa across a company using, but the code does not specify the time-frames I am interested in:

    egen roa3y = sum(roa), by(ds_code)

    I am struggling to include the condition to only calculate the sum of the years t=0 to t=2 referring to the year of the observation. My data looks somewhat like this:

    ds_code year roa
    "130062" 2004 .1783172
    "130062" 2005 6.661182
    "130062" 2006 2.339188
    "130062" 2007 .16718867
    "130062" 2008 .12236715
    "130088" 2004 .22295973
    "130088" 2005 .04354694
    "130088" 2006 .3157642
    "130088" 2007 .11244648
    "130088" 2008 5.147346

    So in e.g. in row 1, I want to generate a new variable roa3y that is the sum of the roa of firm "130062" from years 2004, 2005, and 2006 (9.1786872 in this case).

    Really appreciate any help, you may be able to provide!

  • #2
    Try this:

    Code:
    . bysort ds_code (year): gen roa3 = roa + roa[_n+1] + roa[_n+2]
    (4 missing values generated)
    
    . list
    
         +--------------------------------------+
         | ds_code   year        roa       roa3 |
         |--------------------------------------|
      1. |  130062   2004   .1783172   9.178687 |
      2. |  130062   2005   6.661182   9.167559 |
      3. |  130062   2006   2.339188   2.628744 |
      4. |  130062   2007   .1671887          . |
      5. |  130062   2008   .1223672          . |
         |--------------------------------------|
      6. |  130088   2004   .2229597   .5822709 |
      7. |  130088   2005   .0435469   .4717576 |
      8. |  130088   2006   .3157642   5.575557 |
      9. |  130088   2007   .1124465          . |
     10. |  130088   2008   5.147346          . |
         +--------------------------------------+

    Comment


    • #3
      That works out perfectly. Thanks a lot!

      Comment


      • #4
        The code of Joro Kolev is excellent if there are no gaps in the data. There aren't any in the example but rangestat (SSC) would cope better if there were and it provides another solution in any case.

        Code:
        clear 
        input str6 ds_code year roa
        "130062" 2004 .1783172
        "130062" 2005 6.661182
        "130062" 2006 2.339188
        "130062" 2007 .16718867
        "130062" 2008 .12236715
        "130088" 2004 .22295973
        "130088" 2005 .04354694
        "130088" 2006 .3157642
        "130088" 2007 .11244648
        "130088" 2008 5.147346
        end 
        
        rangestat (count) roa (sum) roa, int(year 0 2) by(ds_code) 
        
        
        list, sepby(ds_code) 
        
            +--------------------------------------------------+
             | ds_code   year        roa   roa_co~t     roa_sum |
             |--------------------------------------------------|
          1. |  130062   2004   .1783172          3   9.1786872 |
          2. |  130062   2005   6.661182          3   9.1675587 |
          3. |  130062   2006   2.339188          3   2.6287439 |
          4. |  130062   2007   .1671887          2   .28955583 |
          5. |  130062   2008   .1223672          1   .12236715 |
             |--------------------------------------------------|
          6. |  130088   2004   .2229597          3   .58227086 |
          7. |  130088   2005   .0435469          3   .47175761 |
          8. |  130088   2006   .3157642          3   5.5755567 |
          9. |  130088   2007   .1124465          2   5.2597925 |
         10. |  130088   2008   5.147346          1    5.147346 |
             +--------------------------------------------------+

        Comment

        Working...
        X