  • Generating percentage variables for dummies

    Hello All,

    I am working with a dataset that has a group ID (dist) and a dummy variable (hh23). I want to construct a variable holding the percentage of people in each group who reported a value of 1, and fill it in for every observation in that group. For example, if 18 out of 562 reported 1, then I want the new variable to equal 0.032, or 3.2 percent.

    I tried the following two options, but I am getting the wrong output: I did some manual checks and the numbers did not match up.

    Option 1:
    Code:
    bysort dist: egen pcdum = pc(dum)
    Option 2:
    Code:
    egen pcdum = mean(100* (dum ==1)), by(dist)
    I think Option 2 is best suited to my issue. Please let me know whether this is the best way to go about it.


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte state int dist byte hh23 float dum
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 1 1
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 1 1
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 1 1
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 1 1
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    1 101 2 0
    end
    label values state labels0
    label def labels0 1 "jammu & kashmir", modify
    label values dist labels1
    label def labels1 101 "kupwara", modify
    label values hh23 labels250
    label def labels250 1 "yes", modify
    label def labels250 2 "no", modify
    Last edited by Lorien Nair; 14 Sep 2022, 06:28.

  • #2
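    Option 1 is not what you want. Within each group, pc() returns each observation's value of dum as a percentage of the group total of dum, so every observation with dum == 1 gets the percentage it represents of all observations that have dum == 1. As I understand you, that is not what you are after. Option 2 gives what I think you want.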
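
    A minimal sketch of Option 2 and a way to check it, assuming dist and dum are as in the -dataex- example above. The names share_dum, pct_dum, pct_dum_nm, and tagdist are illustrative only and do not come from the original post. The third line is an assumption about how you might want to treat missing values of dum, which the question does not mention.

```stata
* Share of observations with dum == 1 within each dist, on a 0-1 scale
egen share_dum = mean(dum == 1), by(dist)

* The same quantity on a 0-100 scale (equivalent to Option 2)
egen pct_dum = mean(100 * (dum == 1)), by(dist)

* Assumption: if dum can be missing, (dum == 1) evaluates to 0 for those
* observations and they stay in the denominator. This variant excludes them.
egen pct_dum_nm = mean(cond(missing(dum), ., 100 * (dum == 1))), by(dist)

* Manual check: list one observation per dist and compare with row percentages
egen tagdist = tag(dist)
list dist share_dum pct_dum pct_dum_nm if tagdist
tabulate dist dum, row
```

    In the example data, 4 of the 100 observations in dist 101 have dum == 1, so pct_dum should be 4 and share_dum should be 0.04. Option 1 would instead give 25 to each of those four observations and 0 to the rest.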
