  • Linking adult and child data in household surveys

    Hi everyone

    I am working with Brazilian household survey data to analyse child nutrition in households where an adult receives a social assistance transfer from the state (Programme Bolsa Familia, PBF). My current dataset is a merged file of just under 165,000 individuals aged 0 to 100+. I know which adults receive the transfers, but I need to link these adults to their children so that I can merge children and carers into a child-focused dataset. In that dataset, each observation would be a child, with variables for the child's anthropometric data as well as carer-level variables such as the carer's education level and labour income. I was helped with a similar issue in a previous post on the forum (https://www.statalist.org/forums/for...-child-records). In that post, however, I had a carer ID for each child benefitting from social assistance; in this dataset that variable does not exist, and I need to create it.

    A portion of the data is shown below. 'pid' is the unique identifier for individuals; 'id_UC' is the household identifier; 'idade_anos' is age in years; 'vare24' is the amount of PBF income received by the adult; 'PBF' equals one if the adult receives any PBF income according to vare24; 'cod_sexo' is sex; and 'nPBF', which I created, is the number of adult PBF recipients in the household (id_UC), which matters where a household contains multiple resident families.

    In household 606 there are three residents, aged 7, 44, and 27. The 27-year-old is the PBF claimant and, because PBF is for children, the beneficiary is clearly the 7-year-old (pid 918606). In household 608 there are two possible child beneficiaries, pids 921608 and 920608, cared for by what could be their 60-year-old grandmother. I need to create a new variable, for example 'id_carer', that shows for each child (a person aged 18 or younger) the pid of the PBF-receiving adult. So for children 921608 and 920608, 'id_carer' would need to be 1402608 (the pid of their grandmother), and for child 918606 it would need to be 1398606. How can I do this?

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str11 pid float id_UC int idade_anos float(vare24 PBF) byte cod_sexo float nPBF
    "918606"  606  7       . 0 1 1
    "1397606" 606 44       . 0 1 1
    "1398606" 606 27 1397.76 1 2 1
    "1401607" 607 17       . 0 1 1
    "919607"  607  5       . 0 1 1
    "1400607" 607 24       . 0 1 1
    "1399607" 607 40  183.04 1 2 1
    "921608"  608  1       . 0 2 1
    "920608"  608  9       . 0 1 1
    "1402608" 608 60 1397.76 1 2 1
    "1404609" 609 57       . 0 2 0
    "1403609" 609 44       . 0 1 0
    "922609"  609  1       . 0 2 0
    "923610"  610  5       . 0 2 1
    "1405610" 610 51       . 0 1 1
    "1406610" 610 44     687 1 2 1
    end
    I am using Stata SE 16.1. As always, I greatly value any guidance.

    Regards,
    Zoheb Khan
    CEBRAP Sao Paulo/ CSDA Johannesburg

  • #2
    Code:
    . bysort id_UC (pid): gen which = _n if PBF & nPBF == 1
    (12 missing values generated)
    
    . egen which2 = mean(which) , by(id_UC)
    (3 missing values generated)
    
    . bysort id_UC (pid): gen id_carer = pid[which2] if idade_anos <= 18
    (10 missing values generated)
    
    . list pid id_UC idade_anos PBF nPBF which which2 id_carer, sepby(id_UC)
    
         +---------------------------------------------------------------------+
         |     pid   id_UC   idade_~s   PBF   nPBF   which   which2   id_carer |
         |---------------------------------------------------------------------|
      1. | 1397606     606         44     0      1       .        2            |
      2. | 1398606     606         27     1      1       2        2            |
      3. |  918606     606          7     0      1       .        2    1398606 |
         |---------------------------------------------------------------------|
      4. | 1399607     607         40     1      1       1        1            |
      5. | 1400607     607         24     0      1       .        1            |
      6. | 1401607     607         17     0      1       .        1    1399607 |
      7. |  919607     607          5     0      1       .        1    1399607 |
         |---------------------------------------------------------------------|
      8. | 1402608     608         60     1      1       1        1            |
      9. |  920608     608          9     0      1       .        1    1402608 |
     10. |  921608     608          1     0      1       .        1    1402608 |
         |---------------------------------------------------------------------|
     11. | 1403609     609         44     0      0       .        .            |
     12. | 1404609     609         57     0      0       .        .            |
     13. |  922609     609          1     0      0       .        .            |
         |---------------------------------------------------------------------|
     14. | 1405610     610         51     0      1       .        2            |
     15. | 1406610     610         44     1      1       2        2            |
     16. |  923610     610          5     0      1       .        2    1406610 |
         +---------------------------------------------------------------------+
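
    The code above finds the recipient's position within the household sort order and then indexes pid by that position. A minimal alternative sketch, assuming the same variables as in the -dataex- example, copies the recipient's pid into a helper variable and spreads it over the household instead. The names carer_tmp and unlinked_child are hypothetical, and the data are just households 606, 608 and 609 from post #1.

```stata
* Subset of the -dataex- example from post #1 (households 606, 608, 609)
clear
input str11 pid float id_UC int idade_anos float(vare24 PBF) byte cod_sexo float nPBF
"918606"  606  7       . 0 1 1
"1397606" 606 44       . 0 1 1
"1398606" 606 27 1397.76 1 2 1
"921608"  608  1       . 0 2 1
"920608"  608  9       . 0 1 1
"1402608" 608 60 1397.76 1 2 1
"1404609" 609 57       . 0 2 0
"1403609" 609 44       . 0 1 0
"922609"  609  1       . 0 2 0
end

* carer_tmp (hypothetical name): the recipient's own pid, empty for everyone else.
* As in the code above, only households with exactly one recipient are handled.
gen str11 carer_tmp = pid if PBF == 1 & nPBF == 1

* Empty strings sort first, so within each household the recipient's pid,
* if there is one, sits in the last observation (_N).
bysort id_UC (carer_tmp): gen str11 id_carer = carer_tmp[_N] if idade_anos <= 18
drop carer_tmp

* unlinked_child (hypothetical name): children left without a carer, either
* because nobody in the household receives PBF or because nPBF > 1.
gen byte unlinked_child = idade_anos <= 18 & id_carer == ""

list pid id_UC idade_anos PBF nPBF id_carer unlinked_child, sepby(id_UC)
```

    For these three households this should give the same id_carer values as the listing above. Children in households with more than one recipient stay unlinked under both approaches and would need a separate rule; the unlinked_child flag is only meant to make those cases easy to count.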
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------



  • #3
    Thanks Maarten, this is some deft and elegant code! It's worked well.
