  • Creating a new variable differentiated by gender

    Hi Statalist.

    I want to be able to test if there is a difference in the effect of level of education by gender. Here's my draft code.
    Code:
    gen male_educ = 1 if (edhigh1 == 9 | p_edhigh1 == 9) & (hgsex == 1 | p_hgsex == 1) // up to year 11 "11 years"
    replace male_educ = 2 if (edhigh1 == 8 | p_edhigh1 == 8) & (hgsex == 1 | p_hgsex == 1) // year 12 "12 years"
    replace male_educ = 3 if (edhigh1 == 5 | p_edhigh1 == 5) & (hgsex == 1 | p_hgsex == 1) // cert 3, cert 4 "13 years"
    replace male_educ = 4 if (edhigh1 == 4 | p_edhigh1 == 4) & (hgsex == 1 | p_hgsex == 1) // adv dip, diploma "14 years"
    replace male_educ = 5 if (edhigh1 == 3 | p_edhigh1 == 3) & (hgsex == 1 | p_hgsex == 1) // bachelor, honours "18-19 years"
    replace male_educ = 6 if (edhigh1 == 2 | p_edhigh1 == 2) & (hgsex == 1 | p_hgsex == 1) // grad diploma, grad cert "19-20 years"
    replace male_educ = 7 if (edhigh1 == 1 | p_edhigh1 == 1) & (hgsex == 1 | p_hgsex == 1) // masters, doctorate "20-24 years"
    I then repeat the same code for females:
    Code:
    gen fem_educ = 1 if (edhigh1 == 9 | p_edhigh1 == 9) & (hgsex == 2 | p_hgsex == 2) // up to year 11 "11 years"
    replace fem_educ = 2 if (edhigh1 == 8 | p_edhigh1 == 8) & (hgsex == 2 | p_hgsex == 2) // year 12 "12 years"
    replace fem_educ = 3 if (edhigh1 == 5 | p_edhigh1 == 5) & (hgsex == 2 | p_hgsex == 2) // cert 3, cert 4 "13 years"
    replace fem_educ = 4 if (edhigh1 == 4 | p_edhigh1 == 4) & (hgsex == 2 | p_hgsex == 2) // adv dip, diploma "14 years"
    replace fem_educ = 5 if (edhigh1 == 3 | p_edhigh1 == 3) & (hgsex == 2 | p_hgsex == 2) // bachelor, honours "18-19 years"
    replace fem_educ = 6 if (edhigh1 == 2 | p_edhigh1 == 2) & (hgsex == 2 | p_hgsex == 2) // grad diploma, grad cert "19-20 years"
    replace fem_educ = 7 if (edhigh1 == 1 | p_edhigh1 == 1) & (hgsex == 2 | p_hgsex == 2) // masters, doctorate "20-24 years"
    Sample data:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long(id p_id) byte(wave edhigh1 p_edhigh1  hgsex p_hgsex)
    101  102  1 5 9 1 2
    101  102  2 5 9 1 2
    101  102  3 5 9 1 2
    101  102  4 5 9 1 2
    103  104  1 9 5 2 1
    103  104  2 9 5 2 1
    103  104  3 9 5 2 1
    103  104  4 9 5 2 1
    106 142 11 5 5 2 1
    106 142 12 5 5 2 1
    106 142 13 5 5 2 1
    106 142 14 5 5 2 1
    106 142 15 5 5 2 1
    106 142 16 5 5 2 1
    106 142 17 5 5 2 1
    106 142 18 5 5 2 1
    110 163 12 1 3 1 2
    110 163 13 1 3 1 2
    110 163 14 1 3 1 2
    110 163 15 1 3 1 2
    110 163 16 1 3 1 2
    110 163 17 1 3 1 2
    110 163 18 1 3 1 2
    111  231  6 9 4 2 1
    111  231  7 9 4 2 1
    111  231  8 9 4 2 1
    111  231  9 9 4 2 1
    end
    I would appreciate help correcting/improving this code.

    (New variable based on edhigh1 - tabulated below):
    [Image: level_educ.png]



    N.B. I'm using Stata 15.1 with panel data. Variables are recorded separately for the respondent and their partner; the "p_" prefix marks the partner's value.
    Last edited by Chris Boulis; 31 Jul 2020, 03:52.
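
    A note on the draft above: a condition such as (hgsex == 1 | p_hgsex == 1) is true for every couple that has a male member, so it cannot tell the male's education apart from the female's. When & and | are mixed, it also helps to remember that Stata evaluates & before |. What follows is only a minimal sketch of one alternative, assuming hgsex == 1 means male and hgsex == 2 means female (as the draft implies). The names male_code and fem_code are hypothetical, and the three rows are taken from the sample data.

```stata
* Sketch only: male_code and fem_code are hypothetical names, and
* hgsex == 1 (male) / 2 (female) is assumed from the draft code.
clear
input long(id p_id) byte(wave edhigh1 p_edhigh1 hgsex p_hgsex)
101 102  1 5 9 1 2
103 104  1 9 5 2 1
110 163 12 1 3 1 2
end

* cond(x, a, b) returns a if x is true (nonzero) and b otherwise.
* Take the education code of whichever member of the couple is male,
* and likewise for the female member; leave missing if neither matches.
gen byte male_code = cond(hgsex == 1, edhigh1, cond(p_hgsex == 1, p_edhigh1, .))
gen byte fem_code  = cond(hgsex == 2, edhigh1, cond(p_hgsex == 2, p_edhigh1, .))

list id hgsex edhigh1 p_edhigh1 male_code fem_code, noobs
```

    Each row then carries both partners' codes side by side, so a within-couple comparison can be made on the same observation.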

  • #2
    Chris:
    my gut-feeling is:
    Code:
    xtologit education i.gender
    Kind regards,
    Carlo
    (Stata 18.0 SE)



    • #3
      Hi Carlo Lazzaro. Thanks for your reply. Actually, both education and gender are control variables in a survival analysis I'm doing. I want to test whether a difference in the years (level) of education completed by the female and the male in a couple (that is, male ed > female ed or female ed > male ed) affects the stability of their relationship. I thought I'd need to create a dummy to identify this. What do you think? Kind regards, Chris



      • #4
        If I understood correctly, adding the interaction term and then using -margins- would provide the information.
        Best regards,

        Marcos
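
        As a rough illustration of that suggestion, here is a minimal sketch on simulated data, not on the poster's dataset. The variables female, educ3, t and failed are hypothetical, and the Cox model is only an assumption about the kind of survival analysis in use.

```stata
* Sketch only: simulated data; female, educ3, t and failed are hypothetical names.
clear
set seed 2020
set obs 600
gen byte female = runiform() < 0.5
gen byte educ3  = 1 + floor(3 * runiform())      // 1, 2 or 3
* Exponential durations whose hazard depends on both variables
gen double t = -ln(runiform()) / exp(0.4*female - 0.3*(educ3 == 3))
gen byte failed = runiform() < 0.8               // roughly 20% censored

stset t, failure(failed)

* ## enters both main effects and their interaction
stcox i.female##i.educ3

* Joint test of the interaction terms
testparm i.female#i.educ3

* Linear prediction (log relative hazard) for each gender-by-education cell
margins female#educ3, predict(xb)
```

        The -testparm- line asks whether the education effect differs by gender at all; -margins- then shows where the cells differ.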



        • #5
          Hi Marcos Almeida. Thank you for your help. I've not used -margins- yet, but I am reading up on it now. That said, I'm not sure it will achieve what I want, and if not, how should I proceed as per #1? For example, where the levels of education differ within a couple, I want to know whether the effect of completed education on relationship stability differs when the wife's level of education is greater than the husband's, and vice versa. Will -margins- give me such results?



          • #6
            Chris:
            I would consider the -group()- function of -egen- to create a categorical variable that combines -gender- and -education level-.
            Kind regards,
            Carlo
            (Stata 18.0 SE)
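
            A minimal sketch of that idea on toy data; gender, educ and gender_educ are hypothetical names:

```stata
* Sketch only: gender, educ and gender_educ are hypothetical names.
clear
input byte(gender educ)
1 1
1 2
2 1
2 2
1 2
end

* group() numbers each distinct gender-by-education combination 1, 2, 3, ...
* in sorted order and leaves the result missing if either variable is missing.
egen gender_educ = group(gender educ), label

tabulate gender_educ
```

            The resulting variable can enter a model as i.gender_educ, which is equivalent to a fully interacted gender-by-education specification.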



            • #7
              About margins and interactions, maybe this link will be helpful as a starting point: https://stats.idre.ucla.edu/stata/fa...nova-stata-11/
              Best regards,

              Marcos



              • #8
                It's difficult for us to help because your hypothesis is unclear. Is your hypothesis that "highly educated" couples survive longer than "non-highly" educated couples? If so, you can create a dummy equal to 1 for however you define highly educated and 0 otherwise. Is your hypothesis that a couple with male education exceeding female education will survive longer? If so, you can create a dummy for that as well. Of course, you can define variables continuously as well.



                • #9
                  Thank you Carlo Lazzaro, Marcos Almeida and Chris Boudreaux. I appreciate your responses. As Chris asked, I'll clarify the question I'm trying to answer: what impact do the levels of education completed by the two partners have on their survival as a couple (relationship stability)? Based on the literature, I would expect survival rates to rise as a couple's education increases. I might also expect a similar outcome where the male's level of education is higher than the female's. Finally, lower survival rates may occur when the female's education is higher than the male's. Hence my desire to create a suitable variable, as I was trying to do in #1, but I'm not sure whether that is coded correctly or whether it is the best way forward. I appreciate your thoughts/advice. Kind regards, Chris.



                  • #10
                    Chris:
                    Hoping that what follows may be of some help, I was thinking about creating a three-level categorical predictor (0=female education equals male education level; 1=female education>male education level; 2=female education<male education level).
                    Kind regards,
                    Carlo
                    (Stata 18.0 SE)



                    • #11
                      Hi Carlo Lazzaro. Yes that sounds more like what I was thinking. I'll try to code that and post back. Kind regards, Chris



                      • #12
                        I have a problem. Because my panel data are organised by respondent and partner (not by male and female), when I create the categorical variable as proposed in #1 or #10 I end up with data about male and female education, but the male and female are not in the same couple. Therefore, to convert my source education variables "edhigh1" (respondent education) and "p_edhigh1" (partner education) as in #1, I would need to create 4 sets of categorical variables: (1) male_respondent_educ, (2) male_partner_educ, (3) female_respondent_educ, (4) female_partner_educ. At that point I could code something like Carlo Lazzaro suggested to determine the effect of differences between males and females in couples.

                        Suggestions on a more accurate/concise way of coding this are appreciated. Kind regards, Chris



                        • #13
                          I think I may have found a way to apply Carlo Lazzaro's suggestion in #10 and achieve my goals:
                          Code:
                          gen educ = 0 if edhigh1 == p_edhigh1 // both in the couple have the same level of education
                          replace educ = 1 if (hgsex == 1 & p_hgsex == 2) & (edhigh1 > p_edhigh1) | (p_hgsex == 1 & hgsex == 2) & (p_edhigh1 > edhigh1)
                          // in a couple, male's educ > female's educ
                          replace educ = 2 if (hgsex == 1 & p_hgsex == 2) & (edhigh1 < p_edhigh1) | (p_hgsex == 1 & hgsex == 2) & (p_edhigh1 < edhigh1)
                          // in a couple, female's educ > male's educ
                          Thoughts/comments appreciated.



                          • #14
                            Chris:
                            I was actually thinking of a similar approach.
                            Just select a small subsample of observations in your dataset (via -list-) and see if your approach does the desired trick.
                            Kind regards,
                            Carlo
                            (Stata 18.0 SE)



                            • #15
                              Hi Carlo Lazzaro. Thank you for your comments. I tested my code with a small sub-sample and found a minor issue with missings, so I made a couple of adjustments, as follows:
                              Code:
                              gen educ = .
                              replace educ = 1 if edhigh1 == p_edhigh1 & (edhigh1 < . & p_edhigh1 < .)  // both respondent and their partner have same level of education
                              replace educ = 2 if ((hgsex == 1 & edhigh1 > p_edhigh1) | (p_hgsex == 1 & p_edhigh1 > edhigh1)) & (edhigh1 < . & p_edhigh1 < .)  // male ed > fem ed
                              replace educ = 3 if ((hgsex == 2 & edhigh1 > p_edhigh1) | (p_hgsex == 2 & p_edhigh1 > edhigh1)) & (edhigh1 < . & p_edhigh1 < .)  // fem ed > male ed
                              Thoughts/comments appreciated. Kind regards, Chris
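
                              Two things may be worth checking when comparing raw edhigh1 codes. First, Stata treats missing values as larger than any number, so a > comparison needs its non-missing check applied to every branch of an | condition, not just the last one. Second, going by the comments in #1, the edhigh1 codes run from 1 (masters, doctorate) down to 9 (up to year 11), so a larger code appears to mean less education. A minimal sketch under those assumptions, also assuming hgsex == 1 is male and 2 is female; r_rank, p_rank, male_rank, fem_rank and educ_gap are hypothetical names:

```stata
* Sketch only. Assumptions: hgsex == 1 is male and 2 is female; edhigh1 codes
* follow the comments in #1 (1 = masters/doctorate ... 9 = up to year 11).
* All new variable names are hypothetical.
clear
input byte(edhigh1 p_edhigh1 hgsex p_hgsex)
5 9 1 2
9 5 2 1
5 5 2 1
1 3 1 2
3 8 2 1
9 . 2 1
end

* Reverse the codes so that a larger rank means more education
recode edhigh1   (9=1) (8=2) (5=3) (4=4) (3=5) (2=6) (1=7) (else=.), gen(r_rank)
recode p_edhigh1 (9=1) (8=2) (5=3) (4=4) (3=5) (2=6) (1=7) (else=.), gen(p_rank)

* Education rank of the male and of the female member of each couple
gen male_rank = cond(hgsex == 1, r_rank, cond(p_hgsex == 1, p_rank, .))
gen fem_rank  = cond(hgsex == 2, r_rank, cond(p_hgsex == 2, p_rank, .))

* 1 = same level, 2 = male higher, 3 = female higher; missing if either is unknown
gen byte educ_gap = .
replace educ_gap = 1 if male_rank == fem_rank & !missing(male_rank, fem_rank)
replace educ_gap = 2 if male_rank >  fem_rank & !missing(male_rank, fem_rank)
replace educ_gap = 3 if male_rank <  fem_rank & !missing(male_rank, fem_rank)

label define educ_gap 1 "Same level" 2 "Male higher" 3 "Female higher"
label values educ_gap educ_gap
list, noobs
```

                              Separating "who is male" from "who has more education" keeps each -replace- condition to a single comparison, which makes a check via -list- on a small subsample easier to read.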
