  • Similarity-weighted sum of a variable excluding self, by group and year

    Hi, I have been trying my best to figure out this problem, but have had no luck. Any suggestions would be greatly appreciated!


    I have a panel dataset of multiple individuals (i), multiple groups (g), and multiple years (t). Let's say my key variable is score.

    I am trying to operationalize the external influence of peers on an individual in the same group in year t. Thus, I want to generate the weighted-average score of the individual's peers in group (g) in year (t).

    This has not been easy, for two reasons:
    (1) I want to derive a "similarity" weight from the dataset. Even within the same group, individuals are quite heterogeneous. Therefore, I generated the score rank within each group and year as follows:
    Code:
    bysort group year: egen rank_p=rank(-score), unique
    If the "gap" between two ranks is small, the two individuals are likely to be very similar. Thus, I want to use 1/(the absolute difference between the two ranks) as the weight that multiplies the peer's score.

    For an individual, there would be multiple peers in group g in year t; some are more similar to the individual and others less so. In other words, the score of a more similar peer A is likely to have more impact on individual i than the score of a less similar peer B, because the difference between A's rank and i's rank is smaller than the gap between B's rank and i's rank. Thus, what I eventually want to derive is as follows:
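    The attached image is not reproduced in this text. Reconstructed from the description and the code in this post (so the notation is an assumption: r is the within-group rank of score, and j indexes the peers of i in group g and year t), the quantity appears to be:

    ```latex
    \[
    \text{peerinfluence}_{igt}
      = \frac{\displaystyle\sum_{j \in g,\; j \neq i} \frac{1}{\lvert r_{igt} - r_{jgt} \rvert}\,\text{score}_{jgt}}
             {\displaystyle\sum_{j \in g,\; j \neq i} \frac{1}{\lvert r_{igt} - r_{jgt} \rvert}}
    \]
    ```

    Because the ranks are generated with the unique option, no two peers share a rank, so the denominator |r_igt - r_jgt| is at least 1 for every j other than i.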
    [Attached image: Picture2.png]



    (2) Since what I want to operationalize is the impact of peers, I want to exclude the individual's own score from the sum.


    My code is as follows:

    Code:
    gen numerator=.
    gen denominator=.
    foreach x in year{
        foreach y in group{
            foreach i in individual{
                by group year: replace numerator = sum(1/(abs(rank[`i']-rank))*score)
                by group year: replace denominator = sum(1/(abs(rank[`i']-rank)))
            }
        }
    }
    gen peerinfluence = numerator/denominator
    Ideally, the code should exclude the individual's own score from the sum and ignore any missing values. Please share your wisdom with me. Thank you!
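    For reference, two things in the loop above probably do not behave as intended: -foreach x in year- iterates once over the literal word "year" rather than over the values of the variable, and sum() inside -replace- is a running sum, not a group total. The following is only a minimal sketch of one possible approach, not a tested solution for the actual data. It assumes the variable names used in this post (group, year, score) and the rank variable rank_p created by the -egen- line earlier; w and peerinfluence are illustrative names.

    ```stata
    * Assumes rank_p already exists, created as in the post with:
    *   bysort group year: egen rank_p = rank(-score), unique

    gen double w = .
    gen double peerinfluence = .

    quietly {
        forvalues i = 1/`=_N' {
            * Weight = 1/|rank gap| for peers in the same group and year as
            * observation i; observation i itself gets a missing weight, as
            * does any peer whose rank (and hence score) is missing.
            replace w = cond(group == group[`i'] & year == year[`i'] & _n != `i', ///
                1/abs(rank_p - rank_p[`i']), .)

            * The aweighted mean is sum(w*score)/sum(w) over the peers;
            * observations with a missing weight or score are left out.
            summarize score [aweight = w], meanonly
            replace peerinfluence = r(mean) in `i'
        }
    }
    drop w
    ```

    This loops over every observation, so run time grows roughly with the square of the number of observations and it may be slow in a large panel. An individual with no usable peers in its group and year, or with a missing score of its own, should end up with a missing peerinfluence.
    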

  • #2
    Please use the -dataex- command and provide example data. If you are running version 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    • #3
      Clyde Schechter Thank you for your suggestion, Clyde! Let me include example data using dataex shortly.
