  • Help with connecting variables based on id #'s

    Hi All,

    I am working with a small dataset of approximately 500 observations in Stata 15.1. I am interested in the relationship between parents' and children's education and am stuck on what should be a relatively simple part of the analysis. In this dataset, children are identified by a combination of the parent's id (numerical) and the child's id (which is the # of children each parent has, from 0-7). I would like to determine how many kids have parents who graduated from college. I have already created a new categorical variable for parent's education, but I can't figure out how to connect that information to each of the parent's kids. If anyone could suggest a way of identifying which parents the kids belong to, I can figure out the rest. Any suggestions are greatly appreciated!!!

    tab child_id
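       child_id |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              1 |        212       50.84       50.84
              2 |        117       28.06       78.90
              3 |         50       11.99       90.89
              4 |         23        5.52       96.40
              5 |         10        2.40       98.80
              6 |          3        0.72       99.52
              7 |          2        0.48      100.00
    ------------+-----------------------------------
          Total |        417      100.00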


    Parent's ID is reported as 1, 2, 3, 4, 5......

    Paternal Education is reported as Elementary, Jr. High, H.S. Diploma, College, and More than College

  • #2
    Hi Alma and welcome to Statalist!

    So, it would be *really* helpful if, instead of giving us your tabulation, you shared 30-40 obs from your data using Stata's dataex command. If you're not familiar with dataex (and most Stata users aren't), I created a YouTube tutorial here. (I made it too long, so feel free to watch at 2x speed, and you may only need the first 6 minutes.)

    You may also want to take a look at these posts here, here, and here

    1) If you don't have a household_id of some kind, I would look at creating one.

    2) I would also create a unique person_id for everyone as well (that way you are not relying on parent_id==5 and child_id==2).

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte(hhid parent_id is_male child_id parents_educ)
    1 1 1 . 4
    1 2 0 . 5
    1 . 0 1 .
    1 . 0 2 .
    1 . 0 3 .
    1 . 0 4 .
    2 1 1 . 4
    2 2 0 . 3
    2 . 1 1 .
    2 . 1 2 .
    2 . 0 3 .
    2 . 1 4 .
    3 1 1 . 2
    3 2 0 . 2
    3 . 1 1 .
    3 . 1 2 .
    3 . 0 3 .
    4 1 1 . 3
    4 2 0 . 4
    5 1 0 . 3
    5 . 1 1 .
    5 . 1 2 .
    end
    
    ------------------ copy up to and including the previous line ------------------
    
    . list, sepby(hhid) abbrev(14)
    
         +------------------------------------------------------+
         | hhid   parent_id   is_male   child_id   parents_educ |
         |------------------------------------------------------|
      1. |    1           1         1          .              4 |
      2. |    1           2         0          .              5 |
      3. |    1           .         0          1              . |
      4. |    1           .         0          2              . |
      5. |    1           .         0          3              . |
      6. |    1           .         0          4              . |
         |------------------------------------------------------|
      7. |    2           1         1          .              4 |
      8. |    2           2         0          .              3 |
      9. |    2           .         1          1              . |
     10. |    2           .         1          2              . |
     11. |    2           .         0          3              . |
     12. |    2           .         1          4              . |
         |------------------------------------------------------|
     13. |    3           1         1          .              2 |
     14. |    3           2         0          .              2 |
     15. |    3           .         1          1              . |
     16. |    3           .         1          2              . |
     17. |    3           .         0          3              . |
         |------------------------------------------------------|
     18. |    4           1         1          .              3 |
     19. |    4           2         0          .              4 |
         |------------------------------------------------------|
     20. |    5           1         0          .              3 |
     21. |    5           .         1          1              . |
     22. |    5           .         1          2              . |
         +------------------------------------------------------+
    
    * NOTE: I put parents_educ as 1==Elementary, 2==Jr. High, 3==H.S. Diploma, 4==College, and 5==More than College
    gen id = _n
    bysort hhid (id): gen n = _n
    bysort hhid (id): gen highest_ed = max(parents_educ[1], parents_educ[2])  // setting highest educ level of mom or dad (whichever is higher)
    // Note: above only works if parents are in 1st two positions for the family
    gen is_child = (child_id!=.)  // just creating a child indicator (may not be necessary)
    order id n, after(hhid)  // moving variables around 
    gen parent_college = (is_child==1 & highest_ed>=4) & highest_ed!=.  // 1 if either parent went to college
    
    . list, sepby(hhid) abbrev(14) noobs
    
      +--------------------------------------------------------------------------------------------------------+
      | hhid   id   n   parent_id   is_male   child_id   parents_educ   highest_ed   is_child   parent_college |
      |--------------------------------------------------------------------------------------------------------|
      |    1    1   1           1         1          .              4            5          0                0 |
      |    1    2   2           2         0          .              5            5          0                0 |
      |    1    3   3           .         0          1              .            5          1                1 |
      |    1    4   4           .         0          2              .            5          1                1 |
      |    1    5   5           .         0          3              .            5          1                1 |
      |    1    6   6           .         0          4              .            5          1                1 |
      |--------------------------------------------------------------------------------------------------------|
      |    2    7   1           1         1          .              4            4          0                0 |
      |    2    8   2           2         0          .              3            4          0                0 |
      |    2    9   3           .         1          1              .            4          1                1 |
      |    2   10   4           .         1          2              .            4          1                1 |
      |    2   11   5           .         0          3              .            4          1                1 |
      |    2   12   6           .         1          4              .            4          1                1 |
      |--------------------------------------------------------------------------------------------------------|
      |    3   13   1           1         1          .              2            2          0                0 |
      |    3   14   2           2         0          .              2            2          0                0 |
      |    3   15   3           .         1          1              .            2          1                0 |
      |    3   16   4           .         1          2              .            2          1                0 |
      |    3   17   5           .         0          3              .            2          1                0 |
      |--------------------------------------------------------------------------------------------------------|
      |    4   18   1           1         1          .              3            4          0                0 |
      |    4   19   2           2         0          .              4            4          0                0 |
      |--------------------------------------------------------------------------------------------------------|
      |    5   20   1           1         0          .              3            3          0                0 |
      |    5   21   2           .         1          1              .            3          1                0 |
      |    5   22   3           .         1          2              .            3          1                0 |
      +--------------------------------------------------------------------------------------------------------+
    
    . tabulate parent_college if is_child==1
    
    parent_coll |
            ege |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |          5       38.46       38.46
              1 |          8       61.54      100.00
    ------------+-----------------------------------
          Total |         13      100.00
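
    As the note in the code says, indexing parents_educ[1] and parents_educ[2] only works when the parents occupy the first two rows of each household. A minimal sketch of an order-independent variant, assuming the example data above are still in memory (the names highest_ed_alt and parent_college_alt are made up for illustration):

    ```stata
    * Household-level maximum of parents_educ, regardless of row order.
    * egen's max() ignores missing values, so the children's missing
    * parents_educ does not affect the result.
    bysort hhid: egen highest_ed_alt = max(parents_educ)

    * 1 for children whose most-educated parent has College (4) or more.
    * "< ." is true only for nonmissing values.
    gen byte parent_college_alt = (child_id < . & highest_ed_alt >= 4 & highest_ed_alt < .)

    tabulate parent_college_alt if child_id < .
    ```

    On the example data this should give the same split as the tabulation above (8 of the 13 children have a parent with College or more).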



    Last edited by David Benson; 19 Feb 2019, 03:32.


    • #3
      As highlighted by David, the best approach to entice an insightful reply is presenting data to work on. Please read the FAQ. There you'll find the recommendation to use code delimiters as well as - dataex - for sharing data.

      That being said, I gather there is a problem with the way the variables were built. For example, the description of "the child's id (which is the #of children each parent has, from 0-7)" is unclear.

      If we do have the parent IDs as well as the child IDs, we may "concatenate" them into a single identifier. For this, we can use the - egen - command with its concat() function, which returns a string variable and converts numeric variables to string as needed.
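
      A minimal sketch of that idea, using made-up data with one row per child (the values and the names str_id and person_id are assumptions for illustration, not the poster's actual data):

      ```stata
      * Toy data: each child is keyed by parent_id and child_id.
      clear
      input int parent_id byte child_id
      1 1
      1 2
      2 1
      3 1
      3 2
      3 3
      end

      * Option 1: a string key such as "1_2"
      egen str_id = concat(parent_id child_id), punct("_")

      * Option 2: a numeric identifier, one value per distinct parent/child pair
      egen long person_id = group(parent_id child_id)

      * Stops with an error if the pair does not uniquely identify observations
      isid parent_id child_id

      list, sepby(parent_id) noobs
      ```

      Either identifier can then be used for merging or for by-group calculations. The numeric group() version is usually easier to work with, while the string version stays readable.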
      Last edited by Marcos Almeida; 19 Feb 2019, 03:49.
      Best regards,

      Marcos
