  • Problem with cumulating sum over group.

    Hello everyone,

    I assume that what I have here is actually a very simple problem, but I somehow can't manage to find the right solution.

    I work with a panel dataset that includes the wealth distribution at the household level. What you see here is an excerpt of my data, where
    _1_net_wealth is the net wealth of each household
    ranknw1 ranks the households by net wealth in descending order
    pct_percent1 gives the percentile of each household, based on net wealth
    weight is the weight each household observation represents
    sumweight is the sum of weights within each percentile. So 24492.785 is the sum of all weights in the 99th percentile.

    My problem:
    I want to create a new variable based on sumweight that accumulates the values in descending order of percentile. In the excerpt below, the sumweight of the 98th percentile should be added to that of the 99th percentile, and so on.

    I tried "bysort pct_percent1(sumweight ranknw1): gen newvar = sum(sumweight)" but that doesn't seem to be the right way.

    Any ideas?

    Thank you for your help
    Moritz

    Code:
    ranknw1 pct_percent1 weight _1_net_wealth sumweight
    395 99 591.08234 1439000 24492.785
    396 99 5544.0381 1438500 24492.785
    397 99  452.9566 1435180 24492.785
    398 99 4311.9194 1434000 24492.785
    399 99 15280.917 1430200 24492.785
    400 98 1164.6069 1428700  26808.31
    401 98 8693.4824 1428500  26808.31
    402 98 2705.4902 1421300  26808.31
    403 98 1163.8893 1419100  26808.31
    404 98 2102.8994 1416000  26808.31
    405 98 1226.6825 1415700  26808.31
    end
    Last edited by Moritz Huth; 03 Mar 2023, 02:17.

  • #2
    This should get you started, assuming the data were sorted in descending order of percentile:

    Code:
    egen uniq_perc = tag(pct_percent1weight)
    gen running_sum = sum(sumweight) if uniq_perc
    replace running_sum = running_sum[_n-1] if missing(running_sum)
    drop uniq_perc
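
A variant that does not depend on how the data happen to be sorted might look like the sketch below. It rebuilds the excerpt from the first post, sorts explicitly, and adds sumweight once per percentile. The names first_in_pct and cum_weight are made up for illustration, and pct_percent1 is assumed to be the percentile variable as described in the original post.

```stata
* Sketch only: rebuild the posted excerpt so the example runs on its own.
clear
input ranknw1 pct_percent1 weight _1_net_wealth sumweight
395 99 591.08234 1439000 24492.785
396 99 5544.0381 1438500 24492.785
397 99  452.9566 1435180 24492.785
398 99 4311.9194 1434000 24492.785
399 99 15280.917 1430200 24492.785
400 98 1164.6069 1428700  26808.31
401 98 8693.4824 1428500  26808.31
402 98 2705.4902 1421300  26808.31
403 98 1163.8893 1419100  26808.31
404 98 2102.8994 1416000  26808.31
405 98 1226.6825 1415700  26808.31
end

* Highest percentile first; rank order within each percentile.
gsort -pct_percent1 ranknw1

* Hypothetical helper: 1 on the first row of each percentile, 0 otherwise.
gen byte first_in_pct = pct_percent1 != pct_percent1[_n-1]

* The running sum picks up sumweight once per percentile and stays
* constant on the remaining rows of that percentile.
gen double cum_weight = sum(sumweight * first_in_pct)
drop first_in_pct

list pct_percent1 sumweight cum_weight, sepby(pct_percent1)
```

On this excerpt, cum_weight should equal sumweight for the 99th percentile and roughly 51301.1 (24492.785 + 26808.31) for the 98th.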



    • #3
      This worked out perfectly. Just had to drop "weight" in the brackets of your first line of code.

      Thanks a lot!
