  • Creating variables referencing values of other variables.

    Each observation of the dataset I'm working with is a person with a variable age. I created a variable, i, the value of which is the total number of people with that age. For example, if there are 56 people with age 5 then everyone with age 5 has 56 for the variable i. I did this with the following line of code:
    egen i = total(age == age), by(age)
    but now I want to make a variable, d, the value of which is (i for age) - (i for age+1). For example, if there are 50 people with age 6 then all observations of age 5 will have 6 as their 'd' value. Any ideas on how to accomplish this?

  • #2

    Code:
    gen agePLUS1 = age + 1
    will ensure that everyone with age 5 has value 6 on the new variable, everyone with age 42 has value 43, and so forth. That desire doesn't seem to follow from your frequency calculation, which could be just

    Code:
    bysort age : gen wanted = _N
    or

    Code:
    bysort age : egen wanted = total(1)
    as the repeated evaluation of age == age can be pre-empted, as the result is predictably and inevitably 1 in every observation.
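A quick way to check that equivalence is to compute the count both ways and assert that they agree. This is only a sketch: the nlswork dataset is used because it appears later in this thread, and the variable names i and wanted are simply those from #1 and #2.

```stata
* Sketch: the three ways of counting observations per age should agree.
webuse nlswork, clear

* The original approach from #1: age == age is 1 in every observation
egen i = total(age == age), by(age)

* The simpler alternative from #2
bysort age : gen wanted = _N

* Stops with an error if the two ever differ
assert i == wanted
```

If the assert passes silently, the two variables are identical in every observation.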



    • #3
      I am still struggling with what is wanted here. Just in case it's something more unusual than I imagined -- namely that you want to populate observations with a certain age with the frequency of those one year older -- here is some technique using rangestat from SSC:


      Code:
      . webuse nlswork
      (National Longitudinal Survey of Young Women, 14-24 years old in 1968)
      
      . tab age
      
           Age in |
          current |
             year |      Freq.     Percent        Cum.
      ------------+-----------------------------------
               14 |          3        0.01        0.01
               15 |          1        0.00        0.01
               16 |         27        0.09        0.11
               17 |        104        0.36        0.47
               18 |        557        1.95        2.43
               19 |        972        3.41        5.84
               20 |      1,141        4.00        9.84
               21 |      1,317        4.62       14.46
               22 |      1,458        5.11       19.57
               23 |      1,604        5.63       25.20
               24 |      1,636        5.74       30.94
               25 |      1,566        5.49       36.43
               26 |      1,414        4.96       41.39
               27 |      1,317        4.62       46.01
               28 |      1,250        4.38       50.39
               29 |      1,251        4.39       54.78
               30 |      1,161        4.07       58.85
               31 |      1,202        4.22       63.07
               32 |      1,112        3.90       66.97
               33 |      1,245        4.37       71.34
               34 |      1,168        4.10       75.43
               35 |      1,264        4.43       79.87
               36 |      1,111        3.90       83.76
               37 |        956        3.35       87.12
               38 |        868        3.04       90.16
               39 |        770        2.70       92.86
               40 |        577        2.02       94.89
               41 |        519        1.82       96.71
               42 |        346        1.21       97.92
               43 |        297        1.04       98.96
               44 |        215        0.75       99.72
               45 |         79        0.28       99.99
               46 |          2        0.01      100.00
      ------------+-----------------------------------
            Total |     28,510      100.00
      
      .  rangestat (count) agep1count=idcode, int(age 1 1)
      
      .  rangestat (count) agecount=idcode, int(age 0 0)
      
      . bysort age : gen freq = _N
      
      . tabdisp age, c(freq agecount agep1count)
      
      -------------------------------------------------------------
      Age in    |
      current   |
      year      |            freq  count of idcode  count of idcode
      ----------+--------------------------------------------------
             14 |               3                3                1
             15 |               1                1               27
             16 |              27               27              104
             17 |             104              104              557
             18 |             557              557              972
             19 |             972              972             1141
             20 |            1141             1141             1317
             21 |            1317             1317             1458
             22 |            1458             1458             1604
             23 |            1604             1604             1636
             24 |            1636             1636             1566
             25 |            1566             1566             1414
             26 |            1414             1414             1317
             27 |            1317             1317             1250
             28 |            1250             1250             1251
             29 |            1251             1251             1161
             30 |            1161             1161             1202
             31 |            1202             1202             1112
             32 |            1112             1112             1245
             33 |            1245             1245             1168
             34 |            1168             1168             1264
             35 |            1264             1264             1111
             36 |            1111             1111              956
             37 |             956              956              868
             38 |             868              868              770
             39 |             770              770              577
             40 |             577              577              519
             41 |             519              519              346
             42 |             346              346              297
             43 |             297              297              215
             44 |             215              215               79
             45 |              79               79                2
             46 |               2                2                 
              . |              24                                  
      -------------------------------------------------------------
      As second author of rangestat I should not post admiring my own work, except that almost all the credit belongs to the first author Robert Picard. I am often gobsmacked at how often problems yield to one line of rangestat and it would never have existed without Robert. In this case that's still true even if the exotic variable isn't what is wanted.

      There was a Sidney Harris cartoon with two white-coated scientists looking at a pile of dust with a caption something like "No one wanted desiccated elephant, but it is a fine technical achievement".
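If the unusual reading is indeed what is wanted, the difference d described in #1 could be built from the variables above. What follows is a sketch under assumptions, not a definitive answer: it assumes rangestat has been installed from SSC, and the last step (treating "nobody one year older" as a count of zero) is a guess about the desired behaviour that may not suit every dataset.

```stata
* Sketch: d = (frequency of own age) - (frequency of age + 1)
* Assumes rangestat is installed: ssc install rangestat
webuse nlswork, clear

* Frequency of each observation's own age
bysort age : gen freq = _N

* Frequency of the age one year older, as in the log above
rangestat (count) agep1count = idcode, interval(age 1 1)

* d is missing wherever agep1count is missing
gen d = freq - agep1count

* Assumption: where age is known but nobody is one year older, count that as zero
replace d = freq if missing(agep1count) & !missing(age)

tabdisp age, c(freq agep1count d)
```

Without the replace line, d stays missing for the oldest age in the data (46 here), which may be the safer default.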



      • #4
        Thank you Nick! rangestat was the command I was looking for to accomplish this; it worked just as I hoped it would.
