Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Discounted cumulative sum of preceding values (unbalanced panel with many missing observations)

    Dear STATALIST,


    I am interested in creating the discounted cumulative sum of preceding values of a variable. I have the following unbalanced panel data with many missing observations. I filled in the gaps between the available calendar years, and this is why the current data set has complete years regardless of missing observations. Year and time variables are the same. I am trying to generate the discounted cumulative sum using the variable (performance).
    firm year time performance
    102 1995 1 13474868
    102 1996 2 26222656
    102 1997 3 .
    102 1998 4 74088704
    102 1999 5 .
    102 2000 6 69155080
    102 2001 7 39108876
    109 2006 1 1787580.5
    109 2007 2 .
    109 2008 3 9137924
    The equation for the discounted sum is as following:

    Click image for larger version

Name:	Capture.JPG
Views:	1
Size:	12.3 KB
ID:	1445406

    P1 is performance of the focal firm i at time 1 (first firm-year observation). T is the given time of the focal firm i.

    For example, for the 5th observation (firm 102, time 5), the discounted sum from 1995 to 1998 should be calculated as 13474868 * (1/5-1) + 26222656 * (1/5-2) + 74088704 * ( 1/5-4). Therefore, I will get the following discounted sum values based on this equation. The discounted sum of any first observation of each firm should be zero since there is no preceding value. Any missing observation will be excluded in the equation.
    firm time discounted sum
    102 1 0
    102 2 13474868 * (1/2-1)
    102 3 13474868 * (1/3-1) + 26222656 * (1/3-2)
    102 4 13474868 * (1/4-1) + 26222656 * (1/4-2)
    102 5 13474868 * (1/5-1) + 26222656 * (1/5-2) + 74088704 * ( 1/5-4)
    102 6 13474868 * (1/6-1) + 26222656 * (1/6-2) + 74088704 * ( 1/6-4)
    102 7 13474868 * (1/7-1) + 26222656 * (1/7-2) + 74088704 * ( 1/7-4) + 69155080 * (1/7-6)
    109 1 0
    109 2 1787580.5 * (1/2-1)
    109 3 1787580.5 * (1/3-1)
    I would like to know if there is a way to generate this variable with loops (or any simpler way).
    Any suggestions or advice would be greatly appreciated. Thank you in advance for your help.


    Best regards,


    Anna
    Last edited by Anna Pak; 23 May 2018, 03:10.

  • #2
    You did not have to fill in the gaps to do this and you do not need the time variable, you can use year. The performance variable is weighted using the difference in time between the current observation and the past observation. Say you want to calculate the sum up to year 2000 and the first observation is in 1995, the weight to apply can be calculated directly from the year variable as follows:
    Code:
    clear
    input float year
    1995
    1996
    1997
    1998
    1999
    2000
    end
    
    gen weight = year[_N] - year
    drop if inlist(year, 1997,1999)
    gen weight2 = year[_N] - year
    list
    and the results as the same even though weight2 was calculated after dropping 1997 and 1999
    Code:
    . list
    
         +-------------------------+
         | year   weight   weight2 |
         |-------------------------|
      1. | 1995        5         5 |
      2. | 1996        4         4 |
      3. | 1998        2         2 |
      4. | 2000        0         0 |
         +-------------------------+
    You can perform the desired calculation using rangerun (from SSC). For each observation, rangerun will run the dcumsum program on a data subset that includes all observations up to the current observation (within each firm). All new variables created by dcumsum are considered results and the values for the last observations in memory at the time the dcumsum program terminates are returned.
    Code:
    clear all
    input int(firm year) double performance
    102 1995  13474868
    102 1996  26222656
    102 1998  74088704
    102 2000  69155080
    102 2001  39108876
    109 2006 1787580.5
    109 2008   9137924
    end
    xtset firm year
    
    program dcumsum
        gen weight = (year[_N] - year)
        gen wanted = sum(performance/weight)
    end
    rangerun dcumsum, interval(year . 0) by(firm)
    
    format %12.0g wanted
    list, sepby(firm)
    and the results
    Code:
    . list, sepby(firm)
    
         +----------------------------------------------+
         | firm   year   perform~e   weight      wanted |
         |----------------------------------------------|
      1. |  102   1995    13474868        0           0 |
      2. |  102   1996    26222656        0    13474868 |
      3. |  102   1998    74088704        0    17602950 |
      4. |  102   2000    69155080        0    46294988 |
      5. |  102   2001    39108876        0   101341656 |
         |----------------------------------------------|
      6. |  109   2006   1787580.5        0           0 |
      7. |  109   2008     9137924        0   893790.25 |
         +----------------------------------------------+
    
    .
    Here's code to spot check results for firm 102 in 2001:
    Code:
    gen w = (2001 - year)
    gen pw = performance / w
    gen xpw = sum(pw)
    format %12.0g xpw
    list if firm == 102
    and the results
    Code:
    . list if firm == 102
    
         +-------------------------------------------------------------------------+
         | firm   year   perfor~e   weight      wanted   w         pw          xpw |
         |-------------------------------------------------------------------------|
      1. |  102   1995   13474868        0           0   6    2245811   2245811.25 |
      2. |  102   1996   26222656        0    13474868   5    5244531      7490342 |
      3. |  102   1998   74088704        0    17602950   3   2.47e+07     32186576 |
      4. |  102   2000   69155080        0    46294988   1   6.92e+07    101341656 |
      5. |  102   2001   39108876        0   101341656   0          .    101341656 |
         +-------------------------------------------------------------------------+
    
    .

    Comment


    • #3
      Dear Robert,

      Thank you so much! Your codes are very helpful.

      As for the missing observations, if I do not fill in the gaps, a lot of information will be wasted (my data has so many missing observations). For example, three observations (firm 102, year 1997, 1999 & firm 109 and year 2007) are missing in the current sample data set, but the discounted sum values of such observations can still be generated for the missing years.

      Thank you again for your help.


      Best regards,

      Anna

      Comment

      Working...
      X