  • Creating a fraction variable based on sex after running "reshape long" command

    Dear All,

    Here is my data. It is in wide form, and I transformed it to long form using the following code:

    reshape long BORD_ B4_, i(CASEID) j(_childnumber) string
    destring _childnumber, ignore("_") replace
    drop if missing(BORD_)
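To make the effect of this reshape concrete, here is a minimal sketch on hypothetical toy data. The two mothers "A" and "B" are made up, and the names childnumber, bord and sex are illustrative assumptions rather than names from the real file.

```stata
* Hypothetical toy data: two mothers, at most two births each
clear
input str4 CASEID byte(BORD_01 BORD_02 B4_01 B4_02)
"A" 2 1 2 1
"B" 1 . 1 .
end

* One row per mother-child pair; the suffixes "01", "02" go into childnumber
reshape long BORD_ B4_, i(CASEID) j(childnumber) string
destring childnumber, replace

* Drop the slots that never held a child
drop if missing(BORD_)

* Shorter names for the long-format variables (illustrative)
rename (BORD_ B4_) (bord sex)
sort CASEID bord
list, sepby(CASEID)
```

Mother "A" ends up with two rows (birth orders 1 and 2) and mother "B" with one.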

    Let me explain the variables in the original (wide) data, before the "reshape long" command is run:

    CASEID= case identification number.
    BORD= the birth order of the children (BORD_01 refers to the youngest / last-born child, and BORD_02 to the child born just before BORD_01.)
    B4= the sex of the children, where 1 stands for male and 2 stands for female (B4_01 is the sex of the youngest / last-born child, and B4_02 is the sex of the child born just before B4_01.)
    malebirths = total number of male births
    femalebirths = total number of female births
    totalbirths = row total of malebirths and femalebirths (please note that totalbirths can be 1. Some households have no male children, and in others all births are male or all are female. There is no fixed pattern.)

    I would like to generate a variable, called male fraction, which should be the fraction of older siblings of child i that are male, conditional on birth order. As far as I know, such a variable cannot be created in wide form, which is why I ran the reshape code.
    After running this code, I can see the sex of the children (siblings living in the same household) by birth order. Yet I could not create a variable (male fraction) conditional on each child's birth order. How can I do that after running the above "reshape long" command? I am stuck at that point and would appreciate any help.

    Thank you so much in advance.


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str15 CASEID double(BORD_01 BORD_02 BORD_03 B4_01 B4_02 B4_03) float totalbirths byte(femalebirths malebirths) float bfduration2
    "    01010004 02" 1 . . 2 . . 1 1 0 30
    "    01010007 02" 1 . . 2 . . 1 1 0 12
    "    01010011 02" 1 . . 1 . . 1 0 1  8
    "    01010021 02" 1 . . 1 . . 1 0 1 18
    "    01050003 02" 2 1 . 2 2 . 2 2 0 19
    "    01060002 02" 4 3 2 1 1 2 4 1 3 18
    "    01060005 02" 5 4 3 2 1 2 5 3 2 18
    "    01060006 03" 2 1 . 2 1 . 2 1 1  3
    "    01060007 06" 4 3 2 2 2 1 4 3 1 10
    "    01060013 02" 2 1 . 2 1 . 2 1 1  2
    "    01060017 02" 3 2 1 2 2 1 3 2 1 36
    "    01060020 02" 1 . . 2 . . 1 1 0 18
    "    01070002 02" 2 1 . 1 1 . 2 0 2 24
    "    01070003 02" 2 1 . 1 1 . 2 0 2 24
    "    01070005 02" 4 3 2 1 2 1 4 1 3  8
    "    01070008 02" 3 2 1 1 1 1 3 0 3  1
    "    01070009 02" 2 1 . 2 1 . 2 1 1 20
    "    01070020 02" 2 1 . 2 1 . 2 1 1 10
    "    01080005 02" 1 . . 1 . . 1 0 1  7
    "    01080008 02" 1 . . 2 . . 1 1 0 24
    "    01090009 02" 1 . . 2 . . 1 1 0  6
    "    01090020 03" 2 1 . 2 1 . 2 1 1 14
    "    01100017 02" 4 3 2 1 2 2 4 3 1 27
    "    01110019 02" 2 1 . 1 1 . 2 0 2 26
    "    01120004 02" 2 1 . 2 1 . 2 1 1 18
    "    01120020 02" 5 4 3 2 1 1 5 1 4  9
    "    01140021 02" 2 1 . 2 1 . 2 1 1 15
    "    01170008 02" 1 . . 2 . . 1 1 0 21
    "    01170009 02" 1 . . 2 . . 1 1 0 13
    "    01180002 02" 1 . . 2 . . 1 1 0 13
    "    01180017 02" 1 . . 2 . . 1 1 0 35
    "    01190004 02" 1 . . 2 . . 1 1 0  6
    "    01190008 02" 1 . . 2 . . 1 1 0  3
    "    01190018 01" 1 . . 2 . . 1 1 0  2
    "    01200001 02" 2 1 . 1 1 . 2 0 2  7
    "    01200002 02" 1 . . 2 . . 1 1 0 21
    "    01200008 02" 2 1 . 2 2 . 2 2 0  8
    "    01210004 02" 1 . . 1 . . 1 0 1  1
    "    01210005 02" 3 2 1 2 1 2 3 2 1  5
    "    01210007 02" 2 1 . 1 2 . 2 1 1 12
    "    01210010 02" 2 1 . 2 2 . 2 2 0 12
    "    01210014 02" 2 1 . 2 2 . 2 2 0 18
    "    01230016 02" 1 . . 2 . . 1 1 0 10
    "    01230017 02" 2 1 . 1 2 . 2 1 1 36
    "    01240015 01" 1 . . 1 . . 1 0 1 18
    "    01240019 01" 1 . . 2 . . 1 1 0  4
    "    01250004 02" 2 1 . 2 2 . 2 2 0  3
    "    01250006 02" 2 1 . 2 1 . 2 1 1 10
    "    01250011 02" 2 1 . 1 2 . 2 1 1  5
    "    01250021 02" 3 2 1 1 1 1 3 0 3 24
    "    01260006 02" 2 1 . 2 1 . 2 1 1 46
    "    01270006 02" 1 . . 1 . . 1 0 1  2
    "    01270008 02" 3 2 1 2 2 2 3 3 0 24
    "    01280004 02" 4 3 2 1 1 2 4 2 2 20
    "    01280005 02" 2 1 . 1 2 . 2 1 1 22
    "    01280019 01" 1 . . 1 . . 1 0 1  3
    "    01280020 02" 1 . . 1 . . 1 0 1  6
    "    01300006 02" 2 1 . 2 2 . 2 2 0  4
    "    01300009 02" 3 2 1 2 2 2 3 3 0 24
    "    01300010 02" 2 1 . 1 1 . 2 0 2 18
    "    01300015 02" 1 . . 2 . . 1 1 0  5
    "    01310014 02" 1 . . 1 . . 1 0 1  5
    "    01310015 02" 3 2 1 2 2 1 3 2 1  1
    "    01310019 04" 1 . . 1 . . 1 0 1 12
    "    01320011 02" 2 1 . 1 1 . 2 0 2 24
    "    01330003 02" 1 . . 2 . . 1 1 0 11
    "    01330014 02" 2 1 . 2 1 . 2 1 1  8
    "    01330016 02" 4 3 2 1 1 1 4 0 4 20
    "    01340006 02" 3 2 1 2 2 1 3 2 1 24
    "    01340009 02" 2 1 . 2 2 . 2 2 0  6
    "    01340010 02" 2 1 . 1 2 . 2 1 1 18
    "    01340011 02" 2 1 . 2 1 . 2 1 1 24
    "    01340012 02" 3 2 1 2 1 1 3 1 2 26
    "    01360001 02" 1 . . 1 . . 1 0 1 18
    "    01360002 02" 4 3 2 2 2 1 4 3 1  8
    "    01360007 02" 2 1 . 1 1 . 2 0 2 18
    "    01360009 02" 1 . . 2 . . 1 1 0  2
    "    01360011 02" 1 . . 2 . . 1 1 0 10
    "    01360021 02" 2 1 . 1 2 . 2 1 1  6
    "    01370016 02" 3 2 1 2 2 1 3 2 1  2
    "    01380008 02" 2 1 . 1 1 . 2 0 2 18
    "    01380009 02" 2 1 . 1 2 . 2 1 1 24
    "    01380012 01" 3 2 1 1 1 2 3 1 2  7
    "    01380021 01" 3 2 1 1 1 2 3 1 2  4
    "    01390012 02" 4 3 2 2 2 2 4 4 0 20
    "    01400021 02" 3 2 1 1 2 2 3 2 1 26
    "    01430012 02" 2 1 . 1 1 . 2 0 2 24
    "    01430014 04" 1 . . 1 . . 1 0 1  6
    "    01440009 02" 1 . . 1 . . 1 0 1  2
    "    01440015 02" 3 2 1 2 2 2 3 3 0 24
    "    01440016 02" 4 3 2 1 1 1 4 1 3  1
    "    01440017 02" 3 2 1 1 2 2 3 2 1  7
    "    01450021 02" 2 1 . 1 2 . 2 1 1 29
    "    01460002 02" 2 1 . 1 2 . 2 1 1 12
    "    01460003 02" 2 1 . 2 1 . 2 1 1 27
    "    01460005 02" 3 2 1 2 2 1 3 2 1 24
    "    01460009 02" 1 . . 2 . . 1 1 0 24
    "    01460015 02" 3 2 1 1 2 2 3 2 1 30
    "    01480011 02" 2 1 . 2 2 . 2 2 0  6
    "    01490005 04" 2 1 . 1 2 . 2 1 1  1
    end
    label values B4_01 B4_01
    label def B4_01 1 "Male", modify
    label def B4_01 2 "Female", modify
    label values B4_02 B4_02
    label def B4_02 1 "Male", modify
    label def B4_02 2 "Female", modify
    label values B4_03 B4_03
    label def B4_03 1 "Male", modify
    label def B4_03 2 "Female", modify
    Last edited by Cansu Oymak; 04 May 2021, 12:52.

  • #2
    I would like to generate a variable, called male fraction, which should rely on the children's birth order.
    Unless I am misunderstanding you, this was asked at https://www.statalist.org/forums/for...ies-for-gender and answered there with:
    Code:
    by BORD, sort: egen male_fraction = mean(B4 == 1)
    If that code is not giving you what you want, please explain in greater detail what you are looking for (and specifically explain how what that code gives you differs from what you want.)
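The one-liner works because `B4 == 1` evaluates to 1 or 0, so its mean within a group is the share of males in that group. A minimal sketch on hypothetical long-format toy data follows; the names family, bord and sex are assumptions for illustration. One caveat worth checking in real data: a missing sex value makes `sex == 1` evaluate to 0 rather than missing, so the second command guards against that.

```stata
* Hypothetical long-format toy data: one row per child
clear
input byte(family bord sex)
1 1 1
1 2 2
2 1 2
2 2 2
3 1 1
end

* Share of boys at each birth order, pooled across families
by bord, sort: egen male_fraction = mean(sex == 1)

* Same, but children with missing sex are left out of the mean
by bord, sort: egen male_fraction2 = mean(cond(missing(sex), ., sex == 1))

list, sepby(bord)
```

Note that this groups by birth order across all families, so it is not a within-family measure.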



    • #3
      Dear Clyde Schechter,

      I am really thankful for your kind and quick response.

      I am now speaking only about the wide form, not the long form. In the wide form, the variable male fraction should be the fraction of older siblings of child i that are male, conditional on birth order. As you can see, I would like to concentrate on the older siblings of the youngest child. For example, assume that my household consists of 5 children and I am the youngest child. I am female. I have 4 older siblings and two of them are male. Then the male fraction is 2/4 = 0.50 (as in my first household, with id 10101). Normally, this variable (male fraction) could be generated using just the sex variables, yet I could not figure out the code.

      I hope this clarifies it. I really appreciate your time and effort.


      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input int id byte bord1 str1(bord2 bord3 bord4 bord5) byte sex1 str1(sex2 sex3 sex4 sex5) double male_fraction byte(totalbirths malebirths femalebirths)
      10101 5 "4" "3" "2" "1" 1 "2" "1" "1" "2" .5 5 4 1
      10102 3 "2" "1" "." "." 1 "1" "1" "." "."  1 3 3 0
      10103 1 "." "." "." "." 2 "." "." "." "."  0 1 0 1
      10104 1 "." "." "." "." 1 "." "." "." "."  0 1 1 0
      end




      • #4
        I see, you want a family-specific male fraction that excludes the last-born.

        You've introduced some new complications here, as your new example data has most (but not all) of the bord and sex variables as strings. That's unworkable. So the first thing I'd do is fix that.

        Then:

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input int id byte bord1 str1(bord2 bord3 bord4 bord5) byte sex1 str1(sex2 sex3 sex4 sex5) double male_fraction byte(totalbirths malebirths femalebirths)
        10101 5 "4" "3" "2" "1" 1 "2" "1" "1" "2" .5 5 4 1
        10102 3 "2" "1" "." "." 1 "1" "1" "." "."  1 3 3 0
        10103 1 "." "." "." "." 2 "." "." "." "."  0 1 0 1
        10104 1 "." "." "." "." 1 "." "." "." "."  0 1 1 0
        end
        
        destring bord* sex*, replace
        reshape long bord sex, i(id)
        drop if missing(bord, sex)
        by id (bord), sort: egen wanted = mean(cond(_n < _N, sex == 1, .))
        I cannot think of any reasonable code for doing this in wide layout.
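As a variation on the code above, here is a hedged sketch that reaches the same family-level value with `generate` and running sums instead of `_n` and `_N` inside `egen` (the egen documentation advises against explicit subscripting there). It also yields the older-sibling male fraction for every child, not only the last-born. The variable names older_boys, n_older, older_male_frac and wanted2 are hypothetical, and the sketch assumes that sex is never missing and that bord runs 1, 2, ... without gaps within each id.

```stata
* Long-format toy data taken from the first three example families
clear
input int id byte(bord sex)
10101 5 1
10101 4 2
10101 3 1
10101 2 1
10101 1 2
10102 3 1
10102 2 1
10102 1 1
10103 1 2
end

* Running count of boys among earlier-born siblings (bord 1 = first-born)
bysort id (bord): gen older_boys = sum(sex == 1) - (sex == 1)

* Number of siblings born before this child
by id: gen n_older = _n - 1

* Missing for first-borns, who have no older siblings
gen older_male_frac = older_boys / n_older if n_older > 0

* Family-level value: the last-born child's older-sibling fraction
by id: gen wanted2 = older_male_frac[_N]

list, sepby(id)
```

For family 10101 this gives 0.5, matching the male_fraction in the example data. A one-child family gets a missing value, since there are no older siblings to average over.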






        • #5
          Yes, this is exactly what I wanted! Maybe I did not explain it clearly enough in my previous posts; any errors there are mine. I just wanted to create a variable that captures "son-biased breastfeeding duration for children", and I thought that the male fraction of older siblings could be a good one. Your example worked properly, and I get the correct ratios conditional on the birth order (excluding the last-born). Since yesterday you have put significant time and effort into solving my problem. I am very thankful for that.
