  • max over a category but storing results over all categories

    Hi all,

    I have data of the following form. h1paid and h2paid are the variables I am trying to generate: h1paid indicates whether any of the helpers in helper_category 1 is paid, and h2paid indicates whether any of the helpers in helper_category 2 is paid.

    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float id byte(helper_category helper_paid) float(h1paid h2paid)
    1 1 1 1 2
    1 1 1 1 2
    1 2 0 1 2
    1 2 0 1 2
    end

    The way I currently achieve this is:

    bys id: egen byte h1paid_int = max(helper_paid) if helper_categ ==1
    bys id: egen byte h1paid = max(h1paid_int)

    The problem is that when I run only the first line, the maximum is taken over the category I want, but the result is also stored only in that category's observations. What I want is to take the maximum over one category but store the result in all observations of the id. In the end I want to keep only one observation per id, with the summary variables h1paid and h2paid. Is there a better way to do this? Thanks!

    Best,
    Angela

  • #2
    Code:
    by id, sort: egen h1_paid = ///
        max(cond(helper_categ == 1, helper_paid, .))
    by id, sort: egen h2_paid = ///
        max(cond(helper_categ == 2, helper_paid, .))
    collapse (first) h?_paid, by(id)
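
    As a rough illustration of why this works, here is a minimal sketch that reuses the example data from post #1, with the desired-result columns left out and the variable name written in full as helper_category (an assumption about the intended variable). cond() yields missing outside the chosen category, egen's max() ignores missing values, and because there is no if qualifier the group maximum is written to every observation of the id.

    ```stata
    * Sketch only: example data from post #1, without the h1paid/h2paid columns
    clear
    input float id byte(helper_category helper_paid)
    1 1 1
    1 1 1
    1 2 0
    1 2 0
    end

    * cond() passes helper_paid through for the chosen category and returns
    * missing (.) otherwise; max() skips missings, and with no -if- qualifier
    * the per-id maximum is stored in all observations of that id.
    by id, sort: egen byte h1_paid = max(cond(helper_category == 1, helper_paid, .))
    by id, sort: egen byte h2_paid = max(cond(helper_category == 2, helper_paid, .))

    * Inspect the result before collapsing to one observation per id
    list, sepby(id)
    ```

    With these four observations, h1_paid should be 1 and h2_paid should be 0 in every row. An id with no helpers at all in a category would get a missing value for that category's variable, since every argument to max() would then be missing.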

    • #3
      For a write-up of the technique Clyde Schechter used,
      see Section 9 of https://journals.sagepub.com/doi/pdf...867X1101100210

      For most users, the paper as a whole will mix what they already know with perhaps some new trickery.

      • #4
        Thanks Clyde and Nick. I will read the write-up; thanks for sharing it. I am sure I (and many others) will find it very useful.
