
  • Calculate age from other occurrences and attribute it to specific observation

    Hi,

    I'm working with consecutive censuses and can follow the same individuals through several decades. However, age is not always reported (missing from the census, unreadable, etc.), and in those cases a zero is recorded instead of a missing value (don't be mad, I know... moreover, newborns also show an age of 0... no comment). But age has probably been reported in a previous or a subsequent census. How can I use that information to infer age when it is 0 (when applicable)?

    Also, age is not always consistent, so (t-1 + 10) and (t+1 - 10) may yield different results. In my experience, the reported age difference between consecutive censuses mostly ranges between 8 and 12 years, so no matter which census year is used in the calculation, the inferred age should be in the ballpark. In the example below, how do I determine which census to use in the calculation?

    Finally, individuals are part of dyads (last variable) and may be present in more than one dyad. Not sure it is relevant to the calculation, but agediff is a dyad characteristic that will need to be updated afterwards.

    I'm adding a few questions that may help figure out all the possible cases:
    - A newborn will be coded as 0. What if in the next census the individual is also of age 0 (instead of 9-10)? Should the calculation start from the last occurrence to the first?
    - What if it's the last occurrence that is 0?

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long ego int census byte(ego_age agediff) long dyad
    708884 1881 24 23 11415600
    708884 1891  0  0 11415600
    708884 1901 40 20 11415600
    708884 1911 57 25 11415600
    739865 1881  1 23 11415600
    739865 1891  0  0 11415600
    739865 1901 20 20 11415600
    739865 1911 32 25 11415600
    end
    Thanks

    EDIT: data is coming from the censuses in SQL tables. Since individuals have only one occurrence by census (compared to multiple kin relationships in dyadic format), maybe I should figure how to recode age in SQL so that age remains consistent through all dyads.
    Last edited by Jean-Sebastien Bournival; 29 Oct 2022, 10:13.

  • #2
    I would handle this using the ipolate command to linearly interpolate age. To your example data I have added a second dyad, cloned from the first but changed so that the first observation of the second individual has age 0 rather than age 1.
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long ego int census byte(ego_age agediff) long dyad
    708884 1881 24 23 11415600
    708884 1891  0  0 11415600
    708884 1901 40 20 11415600
    708884 1911 57 25 11415600
    739865 1881  1 23 11415600
    739865 1891  0  0 11415600
    739865 1901 20 20 11415600
    739865 1911 32 25 11415600
    800004 1881 24  0 20000000
    800004 1891  0  0 20000000
    800004 1901 40 20 20000000
    800004 1911 57 25 20000000
    800005 1881  0  0 20000000
    800005 1891 10  0 20000000
    800005 1901 20 20 20000000
    800005 1911 32 25 20000000
    end
    // c_age is ego_age with 0 replaced by missing
    generate c_age = ego_age if ego_age>0
    bysort ego (census): ipolate c_age census, generate(i_age) epolate
    order i_age, after(ego_age)
    drop c_age
    // now take care of dyads
    bysort dyad census: generate i_diff = abs(i_age[1]-i_age[2]) if _N==2, after(agediff)
    sort ego census
    list, sepby(ego) noobs
    Code:
    . list, sepby(ego) noobs
    
      +-----------------------------------------------------------------+
      |    ego   census   ego_age   i_age   agediff   i_diff       dyad |
      |-----------------------------------------------------------------|
      | 708884     1881        24      24        23       23   11415600 |
      | 708884     1891         0      32         0     21.5   11415600 |
      | 708884     1901        40      40        20       20   11415600 |
      | 708884     1911        57      57        25       25   11415600 |
      |-----------------------------------------------------------------|
      | 739865     1881         1       1        23       23   11415600 |
      | 739865     1891         0    10.5         0     21.5   11415600 |
      | 739865     1901        20      20        20       20   11415600 |
      | 739865     1911        32      32        25       25   11415600 |
      |-----------------------------------------------------------------|
      | 800004     1881        24      24         0       24   20000000 |
      | 800004     1891         0      32         0       22   20000000 |
      | 800004     1901        40      40        20       20   20000000 |
      | 800004     1911        57      57        25       25   20000000 |
      |-----------------------------------------------------------------|
      | 800005     1881         0       0         0       24   20000000 |
      | 800005     1891        10      10         0       22   20000000 |
      | 800005     1901        20      20        20       20   20000000 |
      | 800005     1911        32      32        25       25   20000000 |
      +-----------------------------------------------------------------+
    You will probably want to round the fractional interpolated ages up or down to a whole number of years; that's left as an exercise. You will probably also want to change any negative interpolated value to 0. In both cases this should be done before calculating the dyadic age difference, of course.
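
    As a rough sketch of that exercise (not necessarily how the author would do it), the rounding and clamping could look like the block below. The third individual (ego 900001, dyad 30000000) is hypothetical and was added only so that extrapolation produces a negative age; half-year values are rounded by Stata's round() function, which may or may not be the rounding rule you want.

    ```stata
    clear
    input long ego int census byte ego_age long dyad
    708884 1881 24 11415600
    708884 1891  0 11415600
    708884 1901 40 11415600
    739865 1881  1 11415600
    739865 1891  0 11415600
    739865 1901 20 11415600
    900001 1881  0 30000000
    900001 1891  5 30000000
    900001 1901 15 30000000
    end
    // treat 0 as missing, then interpolate/extrapolate within each individual
    generate c_age = ego_age if ego_age > 0
    bysort ego (census): ipolate c_age census, generate(i_age) epolate
    drop c_age
    // round to whole years; a missing value stays missing
    replace i_age = round(i_age)
    // set negative extrapolated ages to 0 (hypothetical ego 900001 in 1881)
    replace i_age = 0 if i_age < 0 & !missing(i_age)
    // only now compute the dyadic age difference, for dyads observed with two members
    bysort dyad census (ego): generate i_diff = abs(i_age[1] - i_age[2]) if _N == 2
    list, sepby(ego) noobs
    ```

    Note that max(0, i_age) is avoided on purpose: Stata's max() ignores missing arguments, so it would silently turn any missing age into 0.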
    Last edited by William Lisowski; 29 Oct 2022, 11:07.



    • #3
      William,

      Thanks! I wasn't aware of that command, which seems more robust and systematic. Since the code handles age before age difference, I assume that age will be consistent throughout all my datasets (organized by type of dyad) for each individual. This is really helpful.

      However, I get "options not allowed" when running the dyad's bysort line. I'm running version 14.2.

      EDIT: forget it, runs perfectly in version 15.0. Thanks.
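
      For anyone who has to stay on 14.2, a possible workaround is sketched below. It assumes the error comes from the after() option on generate, which the thread suggests is only accepted from version 15 on; the variable is created without that option and then moved with order, which #2 already uses for i_age. The small dataset is a cut-down copy of the listing in #2 with i_age already filled in.

      ```stata
      clear
      input long ego int census double i_age byte agediff long dyad
      708884 1881 24   23 11415600
      708884 1891 32    0 11415600
      739865 1881  1   23 11415600
      739865 1891 10.5  0 11415600
      end
      // create the dyadic difference without the after() option
      bysort dyad census (ego): generate i_diff = abs(i_age[1] - i_age[2]) if _N == 2
      // then place it next to agediff in a separate step
      order i_diff, after(agediff)
      list, sepby(ego) noobs
      ```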
      Last edited by Jean-Sebastien Bournival; 29 Oct 2022, 12:44.
