  • Generating continuous variable indicating child's age.

    Hello Statalist,

    My dataset includes variables indicating:
    • Birth year of child (q7004y)
    • Survey year (round)
    • Child gender (q7003)
    I would like to generate a continuous variable that gives the age of the first boy born (regardless of whether he was the first child, second, third, etc.), with 0 indicating that no boy was born.

    I am unsure where to start with coding this in Stata.

    Below is an example of my data (note that missing values mean no child was born, so no birth year/gender was recorded; indid and hhid are the women's individual and household IDs):

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long(indid hhid) int(q7004y1 q7004y2 q7004y3 q7004y4 q7004y5 q7004y6 q7004y7 q7004y8 q7004y9 q7004y10 q7004y11 q7004y12) byte(q7003_1 q7003_2 q7003_3 q7003_4 q7003_5 q7003_6 q7003_7 q7003_8 q7003_9 q7003_10 q7003_11 q7003_12) int round
    601000101 6010001    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601000201 6010002    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601000301 6010003    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601000401 6010004 1979 1980 1981 1983 1985 1986 1988 1991 1994 . . . 2 1 1 1 2 2 1 2 2 . . . 2006
    601000501 6010005    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601000601 6010006    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601000701 6010007    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601000801 6010008    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601000901 6010009    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601001001 6010010    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601001101 6010011    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601001201 6010012    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601001301 6010013    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601001401 6010014    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601001501 6010015    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601001601 6010016    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601001701 6010017    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601001801 6010018    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601001901 6010019    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601002001 6010020    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601002101 6010021    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601002201 6010022    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601002301 6010023    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601002401 6010024    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601002501 6010025    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601002601 6010026    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601002701 6010027    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601002801 6010028    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601002901 6010029    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601003001 6010030    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601003101 6010031    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601003201 6010032    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601003301 6010033    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601003401 6010034    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601003501 6010035    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601003601 6010036    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601003701 6010037    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601003801 6010038    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601003901 6010039    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601004001 6010040    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601004101 6010041    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601004201 6010042    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601004301 6010043    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601004401 6010044    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601004501 6010045    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601004601 6010046    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601004701 6010047    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601004801 6010048    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601004901 6010049    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601005001 6010050    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601005101 6010051    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601005201 6010052    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601005301 6010053    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601005401 6010054    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601005501 6010055    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601005601 6010056    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601005701 6010057    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601005801 6010058    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601005901 6010059    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601006001 6010060 1977 1978 1986    .    .    .    .    .    . . . . 1 2 1 . . . . . . . . . 2006
    601006101 6010061    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601006201 6010062    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601006301 6010063    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601006401 6010064    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601006501 6010065    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601006601 6010066    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601006701 6010067    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601006801 6010068    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601006901 6010069 1983 1984 1987    .    .    .    .    .    . . . . 1 2 1 . . . . . . . . . 2006
    601007001 6010070    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601007101 6010071    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601007201 6010072    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601007301 6010073    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601007401 6010074    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601007501 6010075    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601007601 6010076    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601007701 6010077    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601007801 6010078    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601007901 6010079    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601008001 6010080    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601008101 6010081    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601008201 6010082    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601008301 6010083    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601008401 6010084    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601008501 6010085    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601008601 6010086    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601008701 6010087    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601008801 6010088    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601008901 6010089    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601009001 6010090    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601009101 6010091    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601009201 6010092    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601009301 6010093    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601009401 6010094    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601009501 6010095    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601009601 6010096    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601009701 6010097    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601009801 6010098    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601009901 6010099    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    601010001 6010100    .    .    .    .    .    .    .    .    . . . . . . . . . . . . . . . . 2006
    end
    label values q7003_1 bygrl
    label values q7003_2 bygrl
    label values q7003_3 bygrl
    label values q7003_4 bygrl
    label values q7003_5 bygrl
    label values q7003_6 bygrl
    label values q7003_7 bygrl
    label values q7003_8 bygrl
    label values q7003_9 bygrl
    label values q7003_10 bygrl
    label values q7003_11 bygrl
    label values q7003_12 bygrl
    label def bygrl 1 "boy", modify
    label def bygrl 2 "girl", modify

  • #2
    Like most Stata data management problems, this is much easier in long layout than in wide.

    Code:
    reshape long q7004y q7003_, i(indid hhid) j(child_num)
    gen child_age = round - q7004y
    by hhid, sort: egen age_oldest_boy = max(cond(q7003_ == "boy":bygrl, child_age, .))
    At this point you will have two new variables. One gives the age of each child, and the other gives the age of the oldest boy child in this hhid.

    You indicated that if there is no boy, you would like the age of the oldest boy to be given as 0. The code above does not do that. It's probably not a good idea to do that: all your subsequent calculations with the age variable will have to make special provisions for 0, and if you forget to do that at some point, you will get wrong answers for that and everything after it. When something doesn't exist, it's usually better to represent it with system missing (.), which is what the code above does, or with one of Stata's extended missing values (.a-.z).
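
    As a minimal sketch of the extended-missing alternative, assuming the data are still in long layout after the code above and the optional 0 replacement has not been run: the choice of .n and the label name ageboy are illustrative assumptions, not part of the original suggestion.

    ```stata
    * Assumption: .n is an arbitrary pick from .a-.z; ageboy is a hypothetical label name
    replace age_oldest_boy = .n if age_oldest_boy == .
    label define ageboy .n "no son"
    label values age_oldest_boy ageboy
    * Extended missing values are still excluded from calculations
    summarize age_oldest_boy
    ```

    The labelled code documents why the value is absent, while commands such as -summarize- keep treating it as missing.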

    The code above also leaves the data in long layout, which will probably make whatever else you do with this data easier.

    If, however, you have good reasons for representing no boy child as 0 age or for going back to wide layout, you can do those with:

    Code:
    // OPTIONAL BUT PROBABLY BAD IDEAS:
    replace age_oldest_boy = 0 if missing(age_oldest_boy)
    reshape wide q7004y q7003_ child_age, i(indid hhid) j(child_num)
    Note: I notice, to my surprise, that there is only one individual per household in this data. That's weird. So be advised: the code above calculates the age of the oldest boy in the household. If your real data contain multiple people per household, you will still get a single value for the entire household. If you want each individual's oldest boy's age, change -by hhid, sort:- above to -by hhid indid, sort:-.
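
    A sketch of that per-individual variant, assuming the data are still in long layout and neither optional step above has been run. The name age_oldest_boy_ind is hypothetical, chosen only so it does not clash with the existing age_oldest_boy.

    ```stata
    * Per-woman version: group by household and individual rather than household alone
    by hhid indid, sort: egen age_oldest_boy_ind = max(cond(q7003_ == "boy":bygrl, child_age, .))
    * With one woman per household, as in the example data, the two versions should agree
    assert age_oldest_boy_ind == age_oldest_boy
    ```

    If the -assert- fails in the real data, some household contains more than one woman and the two variables genuinely differ.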
    Last edited by Clyde Schechter; 08 Feb 2021, 11:34.


    • #3
      Thank you for the code and explanation, Clyde. I have been able to use it to create the variable I need.

      If I may ask, the purpose of my study is to differentiate between mothers without sons and mothers with sons (and how old those sons are) - I am not sure what you mean by making special provisions for 0?

      As a side note, I have limited my sample to the wives in each family so there should only be one woman per household (the mother).


      • #4
        "If I may ask, the purpose of my study is to differentiate between mothers without sons and mothers with sons (and how old those sons are) - I am not sure what you mean by making special provisions for 0?"
        What I mean is, for example, if you need to calculate the mean age of the oldest sons, -summ age_oldest_boy- would give incorrect answers because it would include a bunch of 0's that are not real. You would have to instead code it as -summ age_oldest_boy if age_oldest_boy > 0-. You would have to remember to add -if age_oldest_boy > 0- to any command that made any calculation with the age_oldest_boy variable. And experience suggests that sooner or later you will forget to do that somewhere along the line.
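
        For comparison, a minimal sketch of the same calculation when the no-son case is left as missing. It assumes long layout and that age_oldest_boy has not been replaced with 0; hh_tag is a hypothetical variable name.

        ```stata
        * egen's tag() flags exactly one row per hhid, so each household counts once
        egen byte hh_tag = tag(hhid)
        * Missing values drop out automatically, so no -if age_oldest_boy > 0- is needed
        summarize age_oldest_boy if hh_tag
        ```

        Restricting to the tagged row matters only because the long layout repeats each household's value on every child row, which would otherwise inflate the reported number of observations.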

        If the purpose of your research is to compare mothers without sons to mothers with sons, the best way to do that is to set up a variable that indicates the presence or absence of any son. With the data in long layout, do this:

        Code:
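        * egen's max() takes the maximum over all rows within hhid; with -gen-, max() would only evaluate row by row
        by hhid, sort: egen byte has_son = max(q7003_ == "boy":bygrl)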
        This variable will take on the value 1 in any hhid where there is a son, and 0 otherwise. This variable can then be used in any regression models, or in the -by()- option of commands that use such an option, etc.


        • #5
          Thank you for explaining, Clyde. Noted - will defer to your advice.
