Summing over part of the observations and creating a dummy

Mansha Mahajan

Join Date: May 2021

Posts: 14
#1


09 Nov 2021, 08:29

Hello everyone,

I have a dataset with three variables: Firm ID, Area_Code (the areas the firm serves, in ascending order), and Proportion, which gives the proportion of customers that the firm (a hospital) serves in each area code. For each firm, I want to create a dummy variable that equals 1 for the first few area codes whose proportions sum to at least 75% and 0 for the remaining area codes. What's an easy way to do this?

I would appreciate any suggestions here!

Happy to clarify if my question is not clear.

Thank you in advance for the help!

alejoforero

Join Date: Sep 2014
Posts: 50

09 Nov 2021, 08:56

Assuming no missing values and no duplicates:

Code:

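* Sort areas within each firm from largest to smallest proportion
gsort firmID -proportion
by firmID: gen rank = _n
* Running total of proportion within firm
by firmID: gen cumsum = sum(proportion)
* Rank of the first area at which the running total reaches 75%
by firmID: egen firstfirm = min(cond(cumsum >= .75, rank, .))
* 1 for areas up to and including that rank, 0 otherwise
gen dummy = rank <= firstfirm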
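
Note that the code above ranks areas by descending proportion. If the area codes should instead stay in ascending order, as described in the question, a minimal sketch might look like the following. The toy values and the variable name area_code are made up for illustration; firmID and proportion follow the names used above. The idea is to flag an observation whenever the running total before it is still below 75%, so the area code that crosses the threshold is included.

```stata
* Toy data (made-up values, for illustration only)
clear
input firmID area_code proportion
1 101 0.50
1 102 0.30
1 103 0.20
2 201 0.25
2 202 0.25
2 203 0.25
2 204 0.25
end

* Keep area codes in ascending order within each firm
sort firmID area_code

* Running total of proportion within firm
by firmID: generate double cumsum = sum(proportion)

* Flag a row if the total before it is still below 75%,
* so the row that crosses the threshold is included
generate byte dummy = (cumsum - proportion) < 0.75

list, sepby(firmID)
```

With these toy values, firm 1 should get dummy = 1 for area codes 101 and 102, and firm 2 for 201 through 203. Proportions that land very close to the 75% threshold may be affected by floating-point precision, so it is worth checking a few borderline firms by hand.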


Mansha Mahajan

Join Date: May 2021

Posts: 14
#3

09 Nov 2021, 13:31

This worked beautifully! Thanks so much.
