  • Peer effects: generate variable if other observation has value

    I am doing a peer effects analysis based on geography. I have an id for each unit and the ids of its 4 closest units. Some fraction of the units were exposed to a treatment, and I would like to know whether any of a unit's closest neighbors were exposed to it. For example:


    Code:
    ssc install dataex
    
    input long id long neighbor1 long neighbor2 long neighbor3 long neighbor4 float treated
    1 9 10 2 3 1
    2 10 1 3 4 0
    3 1 2 4 5 0
    4 2 3 5 6 0
    5 3 4 6 7 0
    6 4 5 7 8 0
    7 5 6 8 9 0
    8 6 7 9 10 0
    9 7 8 10 1 0
    10 8 9 1 2 0
    end
    So the goal is to define a new dummy variable equal to 1 if the unit itself or at least one of its neighbors has treated == 1.

    The desired output would be:

    Code:
    ssc install dataex
    
    input long id long neighbor1 long neighbor2 long neighbor3 long neighbor4 float treated float exposed
    1 9 10 2 3 1 1
    2 10 1 3 4 0 1
    3 1 2 4 5 0 1
    4 2 3 5 6 0 0
    5 3 4 6 7 0 0
    6 4 5 7 8 0 0
    7 5 6 8 9 0 0
    8 6 7 9 10 0 0
    9 7 8 10 1 0 1
    10 8 9 1 2 0 1
    end
    I could imagine writing a loop, but I fear it would be computationally expensive because I have many observations. I would appreciate any suggestions.

  • #2
    It can be done quite efficiently in four lines of Mata:

    Code:
    mata:
        treated = st_data(., "treated")
        ns = st_data(., "neighbor*")
        answer = rowmax((treated, treated[ns[., 1]], treated[ns[., 2]], treated[ns[., 3]], treated[ns[., 4]]))
        st_store(., st_addvar("byte", "answer"), answer)
    end
    And the result is identical to what you get in the "exposed" variable.

    Note that I'm assuming, as in your example, that id==1 on the first observation, id==2 on the second, etc.

    If that's not the case, it's a bit trickier: you might want to reshape the dataset to long format (renaming "id" to neighbor0 first), and then run sort and "egen group()".
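
For the case where id does not match the observation number, here is a minimal sketch of a different route from the reshape and "egen group()" approach mentioned above: a lookup vector in Mata that maps each id to its row. The dataset is hypothetical (the example ids multiplied by 10, entered in reverse order), and the names pos, rowof and exposed are introduced only for illustration. It assumes ids are unique positive integers, that max(id) is small enough to index a vector, and that every neighbor id also appears as an id in the data.

```stata
clear
input long id long neighbor1 long neighbor2 long neighbor3 long neighbor4 float treated
100 80 90 10 20 0
90 70 80 100 10 0
80 60 70 90 100 0
70 50 60 80 90 0
60 40 50 70 80 0
50 30 40 60 70 0
40 20 30 50 60 0
30 10 20 40 50 0
20 100 10 30 40 0
10 90 100 20 30 1
end

mata:
    id      = st_data(., "id")
    treated = st_data(., "treated")
    ns      = st_data(., ("neighbor1", "neighbor2", "neighbor3", "neighbor4"))
    // pos[k] holds the row number of the observation whose id is k
    pos = J(max(id), 1, .)
    pos[id] = 1::rows(id)
    // translate each neighbor id into the row where that neighbor is stored
    rowof = (pos[ns[., 1]], pos[ns[., 2]], pos[ns[., 3]], pos[ns[., 4]])
    // 1 if the unit itself or any of its four neighbors is treated
    exposed = rowmax((treated, treated[rowof[., 1]], treated[rowof[., 2]], treated[rowof[., 3]], treated[rowof[., 4]]))
    st_store(., st_addvar("byte", "exposed"), exposed)
end

list id treated exposed
```

If ids are very large or sparse, the lookup vector becomes wasteful, and the reshape route is probably the better choice.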


    • #3
      Sergio, thank you so much, this is fantastic! One quick follow-up: what if I would like to count the number of neighbors that were exposed to the treatment? My exposure to Mata is limited, but it appears that I can't simply switch out rowmax() for rowtotal().

      Thank you so much for your help!


      • #4
        You can; it's just that in Mata it's called rowsum() instead of rowtotal().

        Best,
        S
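
A minimal sketch of that swap, using the example data from the first post and the same assumption as in #2 (id equals the observation number). The variable name n_treated is only illustrative, and the unit's own treated value is left out here on the assumption that only neighbors should be counted.

```stata
clear
input long id long neighbor1 long neighbor2 long neighbor3 long neighbor4 float treated
1 9 10 2 3 1
2 10 1 3 4 0
3 1 2 4 5 0
4 2 3 5 6 0
5 3 4 6 7 0
6 4 5 7 8 0
7 5 6 8 9 0
8 6 7 9 10 0
9 7 8 10 1 0
10 8 9 1 2 0
end

mata:
    treated = st_data(., "treated")
    ns = st_data(., ("neighbor1", "neighbor2", "neighbor3", "neighbor4"))
    // sum the treated flags of the four neighbors (the unit itself is excluded)
    n_treated = rowsum((treated[ns[., 1]], treated[ns[., 2]], treated[ns[., 3]], treated[ns[., 4]]))
    st_store(., st_addvar("byte", "n_treated"), n_treated)
end

list id treated n_treated
```

To count the unit itself as well, add treated as a first column inside rowsum(), as in the rowmax() version.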
