  • generating parents education variable

    Hi everybody,
    I have a dataset that includes the variables "family id", "highest diploma", "gender", and "relationship in the family" (1 = first in household [the main provider, who can be either male or female], 2 = the partner of 1, and 3 = a child of 1).

    How can I generate "Father's highest diploma" and "Mother's highest diploma" variables from these data?
    Thank you,
    Alon

  • #2
    The solution to this will involve creating a key variable that links each parent to child, creating a file with the parents in your data file, and then doing a -merge-. I can't figure out how to tell you to create the key variable based on the information in your description. To provide such information, please post some example data, using the -dataex- command as described in the StataList FAQ. Some description of your "relationship..." variable also would be helpful, as at least I don't quite know how it translates into parent vs. child.


    • #3
      Thank you very much Mike.
      I couldn't use -dataex- because my dataset is held in a government office.
      Regarding the relationship variable: each family has a 1 (the main provider), his or her partner is 2, and their children are 3s. I have a family id to connect these numbers within families.


      • #4
        You can use the structure of your data and create fake values for your variables. I'd presume that would not violate any office rules.


        • #5
          A thought occurs to me: the community-contributed command -shufflevar- might help in creating fake data that could be used with -dataex-. This would preserve the privacy of any actual person in the data set. You could try something like this:

          Code:
          ssc install shufflevar
          set seed 1234 // your favorite random number
          shufflevar ThisVar ThatVar SomeOtherVar
          dataex .....


          • #6
            Thank you very much Mike!

            I hope I have presented it well. After manipulating the data, I have a sample of eight families, and for each child (Relation_in_fam==3) I want to have the father's diploma and the mother's diploma. Gender is coded 1=male.
            The head of the family (Relation_in_fam==1) is the father in most cases, but in some families the mother is the head and the father is her partner (Relation_in_fam==2).

            Code:
            * Example generated by -dataex-. 
            clear
            input float(famID_shuffled Relation_in_fam gender diploma)
                 1 1 1 0
                 1 2 0 3
                 1 3 0 2
                 1 3 0 5
                 1 3 1 0
                 2 1 1 4
                 2 2 0 5
                 2 3 0 1
                 2 3 0 6
                 2 3 1 6
                 3 1 1 3
                 3 2 0 7
                 3 3 1 0
                 3 3 1 0
                 4 1 0 2
                 4 2 1 2
                 4 3 1 3
                 4 3 1 0
                 5 1 1 6
                 5 2 0 0
                 5 3 0 3
                 5 3 1 2
                 5 3 1 4
                 5 3 1 1
                 5 3 1 0
                 6 1 0 0
                 6 2 1 3
                 6 3 0 3
                 6 3 0 3
                 7 1 1 4
                 7 2 0 2
                 7 3 1 0
                 7 3 0 4
                 7 3 1 2
                 7 3 1 5
                 7 3 1 5
                 7 3 1 0
                 8 1 1 0
                 8 3 0 3
                 8 3 1 3
                 8 3 1 1
                 8 3 1 0
                 8 3 1 2
            end


            • #7
              I noticed that your example has no families with both mother and father diplomas, so that case was not tested.

              Anyway, I think the following works (check for yourself). It involves a common (but perhaps "hackish") Stata trick: dividing by zero yields a missing value, which Stata treats as larger than any nonmissing number, so the minimum within a family picks up only the rows where the denominator is 1.
              Code:
              // Easier variable names for me to type and think
              rename Relation_in_fam relat
              rename famID_shuffled famID
              gen byte child = relat == 3
              gen byte mother = (relat == 1) & (gender == 0)
              gen byte father = (relat == 1) & (gender == 1)
              //
              // No merge needed.
              egen FaDip = min(diploma/father), by(famID)
              egen MoDip = min(diploma/mother), by(famID)
              recode MoDip FaDip (nonmissing = .) if !child
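
              To see why the division trick works, here is a minimal sketch on a made-up three-person family (the values are illustrative only and are not taken from the example in #6). The variable ratio is a hypothetical helper added just to make the intermediate step visible.

              ```stata
              * Toy data, made up for illustration: a father, a mother, and one child
              clear
              input byte(famID relat gender diploma)
              1 1 1 4
              1 2 0 5
              1 3 0 1
              end

              * Same indicator as above: 1 for the father, 0 for everyone else
              gen byte father = (relat == 1) & (gender == 1)

              * diploma/1 is the diploma itself; diploma/0 evaluates to missing
              gen ratio = diploma/father
              list famID relat diploma father ratio

              * egen's min() skips missing values, so only the father's diploma survives
              egen FaDip = min(ratio), by(famID)
              list famID relat diploma FaDip
              ```

              If a family has no row with father == 1, every ratio in that family is missing and FaDip comes out missing as well, which is usually what you want.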


              • #8
                Thank you Mike it works perfectly!

                Code:
                // just added the option for number 2 in the family.
                
                gen byte mother = inlist(relat, 1, 2) & (gender == 0)
                gen byte father = inlist(relat, 1, 2) & (gender == 1)
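
                For comparison, the -merge- route suggested in #2 could look roughly like the sketch below. It is only a sketch under stated assumptions: the toy data are made up, the variable names follow the renames in #7, each family is assumed to have at most one parent of each gender (otherwise -reshape- will stop with an error), and MoDip and FaDip are assumed not to exist yet.

                ```stata
                * Toy data, made up for illustration: family 2 has a mother as head and no father
                clear
                input byte(famID relat gender diploma)
                1 1 1 4
                1 2 0 5
                1 3 0 1
                2 1 0 2
                2 3 1 3
                end

                * Build a one-row-per-family file holding the parents' diplomas
                preserve
                keep if inlist(relat, 1, 2)
                keep famID gender diploma
                reshape wide diploma, i(famID) j(gender)    // creates diploma0 and diploma1
                rename (diploma0 diploma1) (MoDip FaDip)    // gender 0 = female, 1 = male
                tempfile parents
                save `parents'
                restore

                * Attach the parents' diplomas to every family member, then keep them for children only
                merge m:1 famID using `parents', nogenerate
                replace MoDip = . if relat != 3
                replace FaDip = . if relat != 3
                ```

                On data like mine this should give the same answer as the -egen- approach; the -egen- version is shorter, while the -merge- version makes the family key explicit.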
