Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Calculating averages for all possible pairwise combinations

    Hi All,

    I have data that resembles the following:


    ---------------------- copy starting from the next line -----------------------
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(I J Trade Year)
    1 2 123 1990
    1 3  21 1990
    1 4   2 1990
    1 5 123 1990
    2 1 123 1990
    2 3 123 1990
    2 4  12 1990
    2 5 123 1990
    3 1 123 1990
    3 2 213 1900
    3 4 123 1990
    4 1 123 1990
    4 2 123 1990
    4 3 123 1990
    1 2 123 1991
    1 3 123 1991
    2 1 123 1991
    2 4 123 1991
    3 1 213 1991
    3 2 123 1991
    end


    In the above, each row corresponds to a particular country pair (I,J) and their bilateral trade, for each year. For instance, row 1 corresponds to 1's trade with 2 for the year 1990. What I want to obtain is the average of all possible pairwise combinations of trade for each year.The following is what I have in mind:

    Code:
     by year I (J), sort: egen M1=mean(Trade)
    by year (I), sort: egen M2=mean(M1)

    I felt that this would do the trick, but I think it may be an issue, because there will be a lot of duplicates for M1, which will arise from the fact that each I can potentially arise numerous times in the sample, by year. One way to do this by brute force would be to temporarily save the dataset, drop all the variables except I and Year, drop duplicates and then calculate M2. Is there any other way of doing this?


    Thanks,
    CS

  • #2
    It is difficult to tell what you mean by

    the average of all possible pairwise combinations of trade for each year.
    At first glance, I don't understand why
    Code:
    by year, sort: egen M = mean(Trade)
    is not what you want.

    Using your pretend data to test code will be difficult, because so many of the values of Trade are the same. It is impossible to know whether that is so because Trade for 1,2 in 1990 needs to be the same as Trade for 2,1 in 1990.

    Below I have posted a modified version of your data with different numbers for each value of Trade. If there are reasons this is not possible, then please modify it accordingly and re-post. And in any event, explain to us what the results should be for several combinations of I,J - and be sure to include at both directions, such as 1,2 and 2,1.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(I J Trade Year)
    1 2 179 1990
    1 3 137 1990
    1 4  70 1990
    1 5  15 1990
    2 1 180 1990
    2 3  37 1990
    2 4 166 1990
    2 5 285 1990
    3 1 105 1990
    3 2 300 1900
    3 4 190 1990
    4 1 201 1990
    4 2  62 1990
    4 3 233 1990
    1 2  35 1991
    1 3 174 1991
    2 1  24 1991
    2 4 255 1991
    3 1  69 1991
    3 2 251 1991
    end

    Comment

    Working...
    X