dyadic dataset: problems with summations

Thomas Wimmer

Join Date: Apr 2016

Posts: 10
#1

dyadic dataset: problems with summations

11 Apr 2016, 13:40

Dear Statalist community,

I have a panel dataset with the following variables:
id year country_a country_b FTA

id uniquely identifies my dyad (pair of country_a and country_b). I have 150 countries, thus 150*(150-1)/2 = 11175 unique dyads (pairings) and therefore 11175 observations per year. This means, the observation country pair ij exists, but country pair ji does not exist in my dataset. For example, country pair AUS USA is included, but USA AUS not (because AUS comes first in alphabetical order).
FTA is a dummy variable, that equals 1 if country_a has a free trade agreement with country_b.

Now I want to generate a new variable that stores the sum of all FTAs that country_a has with other countries than country_b. I tried the following code which is obviously wrong:

Code:

bysort country_a year: egen newvar1 = total(FTA) gen newvar2=newvar1-FTA

How can I get (for each id and each year) the sum of all FTAs that country_a has with other countries than country_b? I would also need a variable with the sum of FTAs that do not include the countries of the country pair in that observation, again per id and year?
Could you please help me out here?

Thank you,
Thomas

Last edited by Thomas Wimmer; 11 Apr 2016, 14:08.
Tags: None
Nick Cox

Join Date: Mar 2014

Posts: 35754
#2

11 Apr 2016, 13:47

Why do you say your code is obviously wrong? It looks good to me. when recast as

Code:

bysort country_a year: egen newvar1 = total(FTA) gen newvar2 = newvar1 - FTA
Comment
Thomas Wimmer

Join Date: Apr 2016

Posts: 10
#3

11 Apr 2016, 13:58

Hey Nick,

thank you for your quick reply. Of course I meant the code you displayed, I'm very sorry.

But still, this code is not correct due to the dyadic structure with unique dyads:
For example, my first three countries are "ALB", "ARE" and "ARG".
Suppose, "ALB" and "ARE" have an FTA. This code would return 1 for the dyad ALB-ARE, but 0 for the dyad ARE-ARG, although ARE has an FTA with ALB.
This is because the dyad ARE-ALB does not exist, only ALB-ARE exists.

Last edited by Thomas Wimmer; 11 Apr 2016, 14:04.
Comment
Thomas Wimmer

Join Date: Apr 2016

Posts: 10
#4

11 Apr 2016, 14:30

I think for my solution I would need another variable that sums the FTAs when the particular country of row country_a appears in row country_b. My desired variable (sum of all FTAs that country_a has with other countries than country_b) should then be the sum of this variable and newvar2.

But unfortunately I don't know how to do that?
Comment

Robert Picard

Join Date: Mar 2014
Posts: 1536

11 Apr 2016, 14:40

You can append the data to itself after interchanging the country names. That will restore all pairwise combinations.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(id year) str3(country_a country_b) float FTA
1 2000 "AUS" "USA" 1
2 2000 "AUS" "CAN" 1
3 2000 "CAN" "USA" 1
end
save "a_less_b.dta", replace

rename (country_a country_b) (country_b country_a) 
append using "a_less_b.dta"
isid year country_a country_b, sort
list, sepby(country_a)

. list, sepby(country_a)

     +---------------------------------------+
     | id   year   countr~b   countr~a   FTA |
     |---------------------------------------|
  1. |  2   2000        CAN        AUS     1 |
  2. |  1   2000        USA        AUS     1 |
     |---------------------------------------|
  3. |  2   2000        AUS        CAN     1 |
  4. |  3   2000        USA        CAN     1 |
     |---------------------------------------|
  5. |  1   2000        AUS        USA     1 |
  6. |  3   2000        CAN        USA     1 |
     +---------------------------------------+

Comment

Thomas Wimmer

Join Date: Apr 2016

Posts: 10
#6

11 Apr 2016, 15:44

Thank you, Robert, you helped me alot! I was able to get the sum (for each id and each year) of all FTAs that country_a has with other countries than country_b. (and vice versa)

But I still don't know how to get the sum of all FTAs that do not include any of the two countries of the country pair of a particular observation, again per id and year?
My code so far:

Code:

bysort year: egen newvar5 = total(FTA)

This gives me the sum of all FTAs per year (respective the sum times two in the appended data set), but how can I subtract the number of FTAs where any of the two countries of the country pair of that observation is involved?

EDIT: Problem solved, I obviously had a mental block.
I just divided newvar5 by 2 and subtracted the sums I constructed before.

Last edited by Thomas Wimmer; 11 Apr 2016, 16:12.
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30147
#7

11 Apr 2016, 16:10

I think you just need to:

Code:

replace newvar5 = newvar5 - FTA
Comment

Thomas Wimmer

Join Date: Apr 2016
Posts: 10

11 Apr 2016, 16:17

Thank you, Clyde, but this Code does not give me the correct solution, since it reduces the (double) sum of all FTA's per year just by 1 at maximum.

But I think I found the solution:

Code:

bysort year isoa: egen newvar1 = total(ftacu)
gen newvar2=newvar1-ftacu

rename newvar2 ftaowni

bysort year isob: egen newvar3 = total(ftacu)
gen newvar4=newvar3-ftacu

rename newvar4 ftaownj
sort isoa isob year

bysort year: egen newvar5 = total(ftacu)
gen newvar6 = newvar5/2
gen newvar7 = newvar6-ftaowni-ftaownj

Announcement

dyadic dataset: problems with summations

Comment

Comment

Comment

Comment

Comment

Comment

Comment