  • Compute the female proportion in each district from long-format data in Stata


    I have a long-format dataset with 6 variables:
    did -- district id
    sid -- school id
    id -- student id
    age -- student age
    count -- bonus count
    male -- student gender (1 = male; 0 = female)
    Now, I want to compute the female proportion for each district.
    Thank you for your Stata code.


    * Example generated by -dataex-. For more info, type help dataex.
    clear
    input int did long sid float(id age count male)
    1 11 111 42 20 1
    1 11 111 30 20 1
    1 11 111 42 10 1
    1 11 111 34 10 1
    1 11 111 30 10 1
    1 11 111 33 10 1
    1 12 112 39 20 0
    1 12 112 40 20 0
    1 12 112 43 20 0
    1 12 112 33 20 0
    1 13 113 40 10 1
    1 13 113 41 20 1
    1 13 113 38 20 1
    1 14 114 32 10 0
    1 14 114 35 10 0
    1 14 114 32 20 0
    1 14 114 37 10 0
    1 15 115 31 10 0
    1 15 115 29 10 0
    1 15 115 36 10 0
    1 15 115 37 20 0
    1 16 116 31 20 1
    1 16 116 36 20 1
    1 16 116 38 10 1
    1 16 116 27 10 1
    1 16 116 35 20 1
    1 17 117 39 10 0
    1 17 117 44 20 0
    1 17 117 41 10 0
    1 17 117 34 20 0
    1 17 117 45 20 0
    1 17 117 28 10 0
    2 21 211 48 10 0
    2 21 211 38 10 0
    2 21 211 31 20 0
    2 21 211 34 10 0
    2 22 222 42 10 1
    2 22 222 37 20 1
    2 22 222 40 20 1
    2 22 222 44 10 1
    2 22 222 42 20 1
    2 23 222 37 10 1
    2 23 223 43 20 1
    2 23 223 47 10 1
    2 23 223 35 20 1
    2 24 224 44 20 0
    2 24 224 40 10 0
    2 24 224 45 10 0
    2 24 224 36 20 0
    2 24 224 39 10 0
    2 25 225 41 20 1
    2 25 225 34 20 1
    2 25 225 39 20 1
    2 25 225 36 10 1
    2 25 225 38 20 1
    2 26 226 33 10 0
    2 26 226 46 10 0
    2 26 226 45 20 0
    2 26 226 35 10 0
    2 26 226 41 10 0
    2 27 227 33 20 1
    2 27 227 43 10 1
    2 27 227 46 20 1
    2 27 227 32 20 1
    3 31 331 35 20 0
    3 31 331 42 10 0
    3 32 332 73 10 1
    3 32 332 29 20 1
    end
    Last edited by smith Jason; 07 May 2022, 17:37.

  • #2
    Now, I want to compute female proportions for each district.
    ^ This is a little vague, and I am not sure exactly what you are asking for. For example, do you just want to see the percentages, make a new dataset, or merge the percentages into the data? If you just want to see them, they can be obtained with:

    Code:
    bysort did: tabulate male
    If something else is needed, please clarify.


    • #3
      I want to create a new variable, male_p, holding the percentage of males by district.
      Thanks!


      • #4
        Originally posted by smith Jason
        I want to create a new variable, male_p, holding the percentage of males by district.
        Thanks!
        Since 1 is male and 0 is female, the mean of male is the proportion of males, expressed as a fraction.

        Code:
        egen prop_male = mean(male), by(did)
        If you need this as a percentage, follow with a generate that multiplies it by 100.
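
        A minimal sketch of that follow-up step, assuming the example data from #1 are in memory. The name pct_male is made up for illustration. Note that mean() averages over rows, so in long-format data a student with several rows is counted several times.

        ```stata
        * Assumes the dataex example from #1 has been loaded.
        * Proportion of male rows within each district (as in the line above;
        * skip this line if prop_male already exists).
        egen prop_male = mean(male), by(did)

        * Same quantity expressed in percent; pct_male is a hypothetical name.
        generate pct_male = 100 * prop_male
        ```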


        • #5
          Originally posted by Ken Chui

          Since 1 is male and 0 is female, the mean of male is the proportion of males, expressed as a fraction.

          Code:
          egen prop_male = mean(male), by(did)
          If you need this as a percentage, follow with a generate that multiplies it by 100.

          In fact, the data are in long format, so each student appears in several rows and should be counted only once. For district 1, male_prop = 3/7 = 42.86%.
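
          A toy illustration of why the row-level mean and the student-level proportion can differ in long format. The four-row dataset and the names prop_rows, first, and prop_students are made up for this sketch; they are not part of the thread's data.

          ```stata
          clear
          * Hypothetical toy data: one male student with 3 rows, one female with 1 row.
          input int did float(id male)
          1 101 1
          1 101 1
          1 101 1
          1 102 0
          end

          * Row-level mean: the male student is counted three times (3/4).
          egen prop_rows = mean(male), by(did)

          * Student-level mean: tag one row per student within district,
          * then average male over the tagged rows only (1/2).
          egen byte first = tag(did id)
          egen prop_students = mean(cond(first, male, .)), by(did)

          list, sepby(did)
          ```

          Here prop_rows is 0.75 while prop_students is 0.5, which is the same kind of gap as between the row-level mean and 3/7 in district 1.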


          • #6
            My code is as follows, and it works now:

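            * tag one row per student, then count students in each district
            egen tag_id = tag(id)
            egen NUM = total(tag_id), by(did)

            * tag one row per male student, then count male students in each district
            egen tagid = tag(id) if male==1
            egen num = total(tagid), by(did)

            * share of male students in each district
            gen male_prop=num/NUM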

            LOL. It was a good opportunity for me to learn, grow, and think things through in practice.


            • #7
              Originally posted by smith Jason
              My code is as follows, and it works now:

              egen tag_id = tag(id)
              egen NUM = total(tag_id), by(did)

              egen tagid = tag(id) if male==1
              egen num = total(tagid), by(did)

              gen male_prop=num/NUM

              LOL. It was a good opportunity for me to learn, grow, and think things through in practice.
              One tag is sufficient; there is no need for a second one:

              Code:
              egen tag_id = tag(id)
              egen NUM = total(tag_id), by(did)
              egen num = total(tag_id * male), by(did)
              gen male_prop=num/NUM
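
              Since the thread title asks for the female proportion, a possible extension is sketched below. It assumes the example data from #1 are in memory; the names one_per_student, n_students, n_male, male_share, female_share, and one_per_district are made up here to avoid clashing with the variables above. Tagging on did and id together, rather than on id alone, is a precaution in case student ids repeat across districts; in the example data both give the same result.

              ```stata
              * Assumes the dataex example from #1 has been loaded.
              * Tag one row per student within each district.
              egen byte one_per_student = tag(did id)

              * Students and male students per district, counting each student once.
              egen n_students = total(one_per_student), by(did)
              egen n_male = total(one_per_student * male), by(did)

              * Male share, and the female share as its complement (male is coded 0/1).
              generate male_share = n_male / n_students
              generate female_share = 1 - male_share

              * Show one line per district.
              egen byte one_per_district = tag(did)
              list did n_students male_share female_share if one_per_district, noobs
              ```

              For district 1 this gives male_share = 3/7, matching the figure in #5.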


              • #8
                Thank you!
