Generating percentage variables for dummies

Lorien Nair

Join Date: May 2019
Posts: 115


14 Sep 2022, 06:10

Hello All,

I am working with a dataset that has a group ID (dist) and a dummy variable (hh23). I want to construct a variable holding the percentage of people in each group who reported a value of 1, and fill that in for the whole group. For example, if 18 out of 562 reported 1, then I want the new variable to equal 0.032 (or 3.2 percent).

I tried the following two options, but I am getting the wrong output: I did some manual checks and the numbers did not match up.

Option 1:

Code:

bysort dist: egen pcdum = pc(dum)

Option 2:

Code:

egen pcdum = mean(100* (dum ==1)), by(dist)

I think Option 2 is best suited to my issue. Let me know if this is the best way to go about this.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input byte state int dist byte hh23 float dum
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 1 1
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 1 1
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 1 1
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 1 1
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
1 101 2 0
end
label values state labels0
label def labels0 1 "jammu & kashmir", modify
label values dist labels1
label def labels1 101 "kupwara", modify
label values hh23 labels250
label def labels250 1 "yes", modify
label def labels250 2 "no", modify

Last edited by Lorien Nair; 14 Sep 2022, 06:28.


Rich Goldstein

Join Date: Mar 2014

Posts: 4485
#2

14 Sep 2022, 07:19

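Option 1 is not what you want. Within each district, pc() returns each observation's value of dum as a percentage of the district total of dum, so every observation with dum == 1 gets 100 divided by the number of 1s in its district, and every observation with dum == 0 gets 0. As I understand you, that is not what you are after. Option 2 gives what I think you want: the percentage of observations in each district with dum == 1, repeated for every observation in that district.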
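To see the difference between the two options, here is a minimal sketch on made-up data. The eight observations and the variable names pc1 and pc2 are illustrative assumptions, not taken from the dataset above.

```stata
* Toy data (hypothetical): two districts, four observations each
clear
input int dist float dum
101 0
101 0
101 0
101 1
102 1
102 1
102 0
102 0
end

* Option 1: each observation's dum as a percentage of the district total of dum
bysort dist: egen pc1 = pc(dum)

* Option 2: percentage of observations in the district with dum == 1
egen pc2 = mean(100 * (dum == 1)), by(dist)

* Compare the two results side by side
list dist dum pc1 pc2, sepby(dist)
```

In district 101, pc1 is 100 for the single observation with dum == 1 and 0 for the rest, while pc2 is 25 for all four observations. In district 102, pc1 is 50 for each of the two observations with dum == 1 and 0 otherwise, while pc2 is 50 throughout. Note that (dum == 1) evaluates to 0 when dum is missing, so such observations stay in the denominator. If dum is strictly 0, 1, or missing, mean(100 * dum) would leave the missing observations out instead.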