
  • Help needed: linking maternal education to children in household survey data

    Hi everyone,

    I'm new to this forum and would really appreciate your help. I looked for an answer here and elsewhere on the web but couldn't find a case exactly like mine. I found similar examples where people had used the egen command (egen mother_edu=total(something here), by(hhcode)), but I don't see how that could be used in my case, since I don't have an exact rule for who the mother in the household is.

    Below is a fictional example of the kind of data I have. It is household survey data where every observation is uniquely identified by an ID code and every household by a household code. I also know the member code (the last two digits of the ID code) of each respondent in a given household, but the codes don't follow any pattern: mothers, fathers and children can have any member code in a given household. I don't have direct information about who the father and mother in the family are; I only know each respondent's relationship to the household head (who can be either female or male): spouse, child, grandchild, etc.

    I would like to add for every observation a variable "mother_edu" that links the mother's years of education ('edu') to her child/children if the information is available, and "." if not. To help with this, I know the member code of the respondent's mother (if she is present in the household). Is there a way to add a variable for the mother's idcode for every observation (when available) and proceed from there somehow? Can someone please help me and give an example of how this could be done in the easiest way? Thank you in advance.

    idcode hhcode hh_member_code relation_to_hh mother_in_hh mothers_hh_member_code edu
    1010101 10101 1 Household head No . 15
    1010102 10101 2 Spouse No . 13
    1010103 10101 3 Child Yes 2 14
    1010201 10102 1 Household head No . 12
    1010202 10102 2 Spouse No . 10
    1010301 10103 1 Spouse No . 11
    1010302 10103 2 Household head No . 12
    1010303 10103 3 Child Yes 1 14

    Best,
    Emmi Hentilä
    Last edited by Emmi Hentilä; 15 Apr 2015, 05:51.

  • #2
    The data is better presented in a [CODE] block as discussed in the FAQ. I've done that here, after expanding the tabs to spaces, to make the structure clearer to the reader.
    Code:
    idcode   hhcode  hh_member_code  relation_to_hh  mother_in_hh  mothers_hh_member_code   edu
    1010101  10101   1               Household head  No            .                        15
    1010102  10101   2               Spouse          No            .                        13
    1010103  10101   3               Child           Yes           2                        14
    1010201  10102   1               Household head  No            .                        12
    1010202  10102   2               Spouse          No            .                        10
    1010301  10103   1               Spouse          No            .                        11
    1010302  10103   2               Household head  No            .                        12
    1010303  10103   3               Child           Yes           1                        14



    • #3
      Here's one approach.
      Code:
      clear 
      input ///
      idcode   hhcode  hh_member_code  str30 relation_to_hh   str3 mother_in_hh  mothers_hh_member_code   edu
      1010101  10101   1               "Household head"  "No "         .                        15
      1010102  10101   2               "Spouse        "  "No "         .                        13
      1010103  10101   3               "Child         "  "Yes"         2                        14
      1010201  10102   1               "Household head"  "No "         .                        12
      1010202  10102   2               "Spouse        "  "No "         .                        10
      1010301  10103   1               "Spouse        "  "No "         .                        11
      1010302  10103   2               "Household head"  "No "         .                        12
      1010303  10103   3               "Child         "  "Yes"         1                        14
      end
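      save familyfile, replace
      * build the mother's idcode; it is missing when no mother code is recorded
      generate momcode = hhcode*100+mothers_hh_member_code
      * set each person's own idcode and edu aside, then match on the mother's idcode
      rename idcode save_idcode
      rename edu save_edu
      rename momcode idcode
      joinby idcode using familyfile, unmatched(master) 
      drop _merge
      * idcode and edu now describe the mother; restore the person's own values
      rename idcode mothers_idcode
      rename edu mothers_edu
      rename save_edu edu
      rename save_idcode idcode
      sort idcode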
      list idcode relation_to_hh mother_in_hh mothers_hh_member_code ///
           mothers_idcode edu mothers_edu, noobs sepby(hhcode)
      Code:
        +----------------------------------------------------------------------------+
        |  idcode   relation_to_hh   mother~h   mo~_code   mo~dcode   edu   mother~u |
        |----------------------------------------------------------------------------|
        | 1010101   Household head        No           .          .    15          . |
        | 1010102   Spouse                No           .          .    13          . |
        | 1010103   Child                 Yes          2    1010102    14         13 |
        |----------------------------------------------------------------------------|
        | 1010201   Household head        No           .          .    12          . |
        | 1010202   Spouse                No           .          .    10          . |
        |----------------------------------------------------------------------------|
        | 1010301   Spouse                No           .          .    11          . |
        | 1010302   Household head        No           .          .    12          . |
        | 1010303   Child                 Yes          1    1010301    14         11 |
        +----------------------------------------------------------------------------+



      • #4
        Thanks, William. [BTW, it wasn't 100% simple to paste into the data window because of the two-word entries in the relationship variable.]

        Emmi: try something like the following

        Code:
        . de, full
        
        Contains data
          obs:             8                         
         vars:             9                         
         size:           272                         
        ----------------------------------------------------------------------------------------------------------------------------------------------------
                      storage   display    value
        variable name   type    format     label      variable label
        ----------------------------------------------------------------------------------------------------------------------------------------------------
        pid             long    %10.0g               
        hhid            int     %8.0g                
        hhpersno        byte    %8.0g                
        rel2hh          str14   %14s                 
        muminhh         str3    %9s                  
        mum_hhpersno    byte    %8.0g                
        edu             byte    %8.0g                
        mum_num         float   %9.0g                
        mumed           float   %9.0g                
        ----------------------------------------------------------------------------------------------------------------------------------------------------
        Sorted by: hhid  hhpersno
        
        
        . li, noobs sepby(hhid)
        
          +------------------------------------------------------------------------+
          |     pid    hhid   hhpersno           rel2hh   muminhh   mum_hh~o   edu |
          |------------------------------------------------------------------------|
          | 1010101   10101          1   Household head        No          .    15 |
          | 1010102   10101          2           Spouse        No          .    13 |
          | 1010103   10101          3            Child       Yes          2    14 |
          |------------------------------------------------------------------------|
          | 1010201   10102          1   Household head        No          .    12 |
          | 1010202   10102          2           Spouse        No          .    10 |
          |------------------------------------------------------------------------|
          | 1010301   10103          1           Spouse        No          .    11 |
          | 1010302   10103          2   Household head        No          .    12 |
          | 1010303   10103          3            Child       Yes          1    14 |
          +------------------------------------------------------------------------+
        
        . bys hhid (hhpersno): egen mum_num = min(mum_hhpersno)
        (2 missing values generated)
        
        . bys hhid (hhpersno): ge mumed = edu[mum_num] if muminhh == "Yes"
        (6 missing values generated)
        
        . li, noobs sepby(hhid)
        
          +------------------------------------------------------------------------------------------+
          |     pid    hhid   hhpersno           rel2hh   muminhh   mum_hh~o   edu   mum_num   mumed |
          |------------------------------------------------------------------------------------------|
          | 1010101   10101          1   Household head        No          .    15         2       . |
          | 1010102   10101          2           Spouse        No          .    13         2       . |
          | 1010103   10101          3            Child       Yes          2    14         2      13 |
          |------------------------------------------------------------------------------------------|
          | 1010201   10102          1   Household head        No          .    12         .       . |
          | 1010202   10102          2           Spouse        No          .    10         .       . |
          |------------------------------------------------------------------------------------------|
          | 1010301   10103          1           Spouse        No          .    11         1       . |
          | 1010302   10103          2   Household head        No          .    12         1       . |
          | 1010303   10103          3            Child       Yes          1    14         1      11 |
          +------------------------------------------------------------------------------------------+
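        Note the use of observation indexing (the entry within square brackets) in the generation of the mumed variable. Under by:, the subscript counts observations within each household, so this relies on hhpersno running 1, 2, ... without gaps in every household, and on there being at most one mother code per household (mum_num is the household minimum).
        No doubt there are other ways of doing this.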
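
        A variant of the indexing idea, offered only as a sketch. It subscripts with each child's own mum_hhpersno instead of the household minimum, so a household with more than one mother should still work. It assumes that hhpersno equals the observation number within household, and the assert stops the run if that is not true. The variable name mumed2 and the extra grandchild row (pid 1010104) are made up for illustration and are not part of Emmi's data.

        ```stata
        clear
        input long pid int hhid byte hhpersno str3 muminhh byte mum_hhpersno byte edu
        1010101 10101 1 "No"  . 15
        1010102 10101 2 "No"  . 13
        1010103 10101 3 "Yes" 2 14
        1010104 10101 4 "Yes" 3 5
        1010301 10103 1 "No"  . 11
        1010302 10103 2 "No"  . 12
        1010303 10103 3 "Yes" 1 14
        end

        * subscripting by person number is only valid if person number
        * equals the observation number within household, so check first
        bysort hhid (hhpersno): assert hhpersno == _n

        * each child looks up the row of its own mother within the household
        by hhid: generate mumed2 = edu[mum_hhpersno] if muminhh == "Yes"

        list, noobs sepby(hhid)
        ```

        If the assert fails (gaps or duplicates in the person numbers), a merge or joinby on household and person number is the safer route.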



        • #5
          I imagine that there could be multi-generational households with more than one mother in the household. In that case, a simple merge would be a good approach.

          Code:
          clear 
          input idcode hhcode hhmember str30 relation_to_hh str3 mother_in_hh mothercode edu
          1010101  10101   1 "Household head"  "No " . 15
          1010102  10101   2 "Spouse        "  "No " . 13
          1010103  10101   3 "Child         "  "Yes" 2 14
          1010201  10102   1 "Household head"  "No " . 12
          1010202  10102   2 "Spouse        "  "No " . 10
          1010301  10103   1 "Spouse        "  "No " . 11
          1010302  10103   2 "Household head"  "No " . 12
          1010303  10103   3 "Child         "  "Yes" 1 14
          end
          
          * verify assumptions about the data and save a copy
          isid hhcode hhmember, sort
          tempfile main
          save "`main'"
          
          * reduce to the household member's education data
          keep hhcode hhmember edu
          
          * rename to match with mother codes
          rename hhmember mothercode
          rename edu mother_edu
          
          * use merge to attach the mother's education
          merge 1:m hhcode mothercode using "`main'", keep(match using) nogen
          
          isid hhcode hhmember, sort
          list, sepby(hhcode) noobs
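
          One way to check the result of a merge like the one above, as a sketch rather than a required step: keep _merge instead of using nogen and assert that every recorded mother code found its mother. The data are a trimmed copy of the example, and the three asserts encode my assumptions about what a clean file looks like.

          ```stata
          clear
          input long idcode long hhcode byte hhmember str3 mother_in_hh byte mothercode byte edu
          1010101 10101 1 "No"  . 15
          1010102 10101 2 "No"  . 13
          1010103 10101 3 "Yes" 2 14
          1010301 10103 1 "No"  . 11
          1010302 10103 2 "No"  . 12
          1010303 10103 3 "Yes" 1 14
          end
          isid hhcode hhmember, sort
          tempfile main
          save "`main'"

          keep hhcode hhmember edu
          rename hhmember mothercode
          rename edu mother_edu

          * same merge, but _merge is kept so the outcome can be inspected
          merge 1:m hhcode mothercode using "`main'", keep(match using)

          * a child with a recorded mother code must have found that mother
          assert _merge == 3 if !missing(mothercode)
          * nobody without a mother code should have picked up a value
          assert missing(mother_edu) if missing(mothercode)
          * the Yes/No flag should agree with the presence of a code
          assert (mother_in_hh == "Yes") == !missing(mothercode)
          drop _merge
          ```

          A failure of the first assert would point to a mother code that does not correspond to any member of that household.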



          • #6
            I don't have time to look at this again right now, but to get further assistance, I suggest that you post a selection of your data here, in exactly the format it really has (as you've now told us), including some tricky examples such as households with multiple generations. When posting, please use CODE delimiters as explained in the FAQ (accessed via the advanced editor key A). I would first define a new variable, select, that you will use to pick individuals/households for this illustration. (Aim to post 25-50 lines of data at most.) Then run describe varlist, fullnames on the variables used in the illustration (varlist), followed by list varlist if select, noobs nolabel. Post exactly what you type into Stata and exactly what you get back. All this is to make it as easy as possible for readers to read your data into their own Stata and "play" with your problem.
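
            The steps just described might look like the following sketch. The data are the fictional example from the first post, the choice of households in inlist() is arbitrary, and the local macro name vars is my own.

            ```stata
            clear
            input long idcode long hhcode byte hh_member_code str30 relation_to_hh str3 mother_in_hh byte mothers_hh_member_code byte edu
            1010101 10101 1 "Household head" "No"  . 15
            1010102 10101 2 "Spouse"         "No"  . 13
            1010103 10101 3 "Child"          "Yes" 2 14
            1010201 10102 1 "Household head" "No"  . 12
            1010202 10102 2 "Spouse"         "No"  . 10
            1010301 10103 1 "Spouse"         "No"  . 11
            1010302 10103 2 "Household head" "No"  . 12
            1010303 10103 3 "Child"          "Yes" 1 14
            end

            * flag the households to show (placeholder choice of household codes)
            generate byte select = inlist(hhcode, 10101, 10103)

            * variables used in the illustration
            local vars idcode hhcode hh_member_code relation_to_hh ///
                mother_in_hh mothers_hh_member_code edu

            describe `vars', fullnames
            list `vars' if select, noobs nolabel sepby(hhcode)
            ```

            Because of the local macro, the lines need to be run together from a do-file rather than one at a time.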



            • #7
              Emmi - The intent of my example was to point you toward using joinby as a way of adding an additional variable to your existing data, and I thought a worked-through example using joinby would be more helpful than just saying something like "see help joinby".

              Given my example, and the Stata documentation, it is simple to modify my example to use joinby hhcode hh_member_code ... where hh_member_code has been replaced by mothers_hh_member_code before the joinby and restored afterward, in the way that idcode was replaced and restored in my example. Without a better understanding of your data, though, I'm reluctant to provide a second example.
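
              For readers following along, here is one possible reading of that modification, as a sketch on the fictional data only. Unlike the original example, it trims the using file to the key variables plus edu (renamed mothers_edu) and stores it in a tempfile; the names members and save_member_code are made up here.

              ```stata
              clear
              input long idcode long hhcode byte hh_member_code str3 mother_in_hh byte mothers_hh_member_code byte edu
              1010101 10101 1 "No"  . 15
              1010102 10101 2 "No"  . 13
              1010103 10101 3 "Yes" 2 14
              1010201 10102 1 "No"  . 12
              1010202 10102 2 "No"  . 10
              1010301 10103 1 "No"  . 11
              1010302 10103 2 "No"  . 12
              1010303 10103 3 "Yes" 1 14
              end

              * lookup file: one row per household member with that member's education
              tempfile members
              preserve
              keep hhcode hh_member_code edu
              rename edu mothers_edu
              save "`members'"
              restore

              * temporarily let hh_member_code hold the mother's code
              rename hh_member_code save_member_code
              rename mothers_hh_member_code hh_member_code
              joinby hhcode hh_member_code using "`members'", unmatched(master)
              drop _merge

              * restore the original variable names
              rename hh_member_code mothers_hh_member_code
              rename save_member_code hh_member_code

              sort idcode
              list, noobs sepby(hhcode)
              ```

              Matching on hhcode and the member code avoids having to construct the mother's idcode, so it does not depend on how idcode is composed.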
