
  • Summing over part of the observations and creating a dummy

    Hello everyone,

    I have a dataset with three variables: Firm ID, the areas the firm serves given by Area_Code (in ascending order), and Proportion, which is the proportion of customers that the firm (a hospital) serves in each area code. For each firm, I want to create a dummy variable that equals 1 for the first few area codes whose proportions cumulatively sum to at least 75%, and 0 for the remaining area codes. What's an easy way to do this?

    I would appreciate any suggestions here!

    Happy to clarify if my question is not clear.

    Thank you in advance for the help!

  • #2
    Assuming no missing values and no duplicates:

    Code:
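    * Within each firm, sort area codes from largest to smallest proportion
    gsort firmID -proportion
    by firmID: g rank=_n
    by firmID: g cumsum=sum(proportion)
    * Rank at which the running total first reaches 75% ("at least", hence >=)
    by firmID: egen firstfirm=min(cond(cumsum>=.75,rank,.))
    g dummy=rank<=firstfirm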
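
    To see what these steps produce, here is a small worked sketch on made-up data. The six observations and the area_code values are purely illustrative assumptions, not taken from the original dataset; the variable names firmID and proportion follow the code above, and the threshold is written as >= to match "at least 75%".

    ```stata
    clear
    * Hypothetical toy data: two firms, three area codes each
    input firmID area_code proportion
    1 101 .50
    1 102 .30
    1 103 .20
    2 201 .80
    2 202 .15
    2 203 .05
    end

    * Largest proportions first within each firm
    gsort firmID -proportion
    by firmID: generate rank = _n
    * Running total of proportion within firm
    by firmID: generate cumsum = sum(proportion)
    * Smallest rank at which the running total reaches 75%
    by firmID: egen firstfirm = min(cond(cumsum >= .75, rank, .))
    generate dummy = rank <= firstfirm

    list firmID area_code proportion cumsum dummy, sepby(firmID)
    ```

    With this toy data, firm 1 needs its two largest areas (.50 + .30) to pass 75%, so dummy is 1, 1, 0; firm 2 passes 75% with its largest area alone, so dummy is 1, 0, 0. Note that if a firm's proportions never reach the threshold, firstfirm is missing and dummy is 1 for all of its rows, because missing values compare as larger than any number in Stata.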



    • #3
      This worked beautifully! Thanks so much

