Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • dyadic dataset: problems with summations

    Dear Statalist community,

    I have a panel dataset with the following variables:
    id year country_a country_b FTA

    id uniquely identifies my dyad (pair of country_a and country_b). I have 150 countries, thus 150*(150-1)/2 = 11175 unique dyads (pairings) and therefore 11175 observations per year. This means, the observation country pair ij exists, but country pair ji does not exist in my dataset. For example, country pair AUS USA is included, but USA AUS not (because AUS comes first in alphabetical order).
    FTA is a dummy variable, that equals 1 if country_a has a free trade agreement with country_b.

    Now I want to generate a new variable that stores the sum of all FTAs that country_a has with other countries than country_b. I tried the following code which is obviously wrong:
    Code:
    bysort country_a year: egen newvar1 = total(FTA)
    gen newvar2=newvar1-FTA
    How can I get (for each id and each year) the sum of all FTAs that country_a has with other countries than country_b? I would also need a variable with the sum of FTAs that do not include the countries of the country pair in that observation, again per id and year?
    Could you please help me out here?

    Thank you,
    Thomas
    Last edited by Thomas Wimmer; 11 Apr 2016, 14:08.

  • #2
    Why do you say your code is obviously wrong? It looks good to me. when recast as

    Code:
    bysort country_a year: egen newvar1 = total(FTA)
    gen newvar2 = newvar1 - FTA

    Comment


    • #3
      Hey Nick,

      thank you for your quick reply. Of course I meant the code you displayed, I'm very sorry.

      But still, this code is not correct due to the dyadic structure with unique dyads:
      For example, my first three countries are "ALB", "ARE" and "ARG".
      Suppose, "ALB" and "ARE" have an FTA. This code would return 1 for the dyad ALB-ARE, but 0 for the dyad ARE-ARG, although ARE has an FTA with ALB.
      This is because the dyad ARE-ALB does not exist, only ALB-ARE exists.
      Last edited by Thomas Wimmer; 11 Apr 2016, 14:04.

      Comment


      • #4
        I think for my solution I would need another variable that sums the FTAs when the particular country of row country_a appears in row country_b. My desired variable (sum of all FTAs that country_a has with other countries than country_b) should then be the sum of this variable and newvar2.

        But unfortunately I don't know how to do that?

        Comment


        • #5
          You can append the data to itself after interchanging the country names. That will restore all pairwise combinations.

          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input float(id year) str3(country_a country_b) float FTA
          1 2000 "AUS" "USA" 1
          2 2000 "AUS" "CAN" 1
          3 2000 "CAN" "USA" 1
          end
          save "a_less_b.dta", replace
          
          rename (country_a country_b) (country_b country_a) 
          append using "a_less_b.dta"
          isid year country_a country_b, sort
          list, sepby(country_a)
          
          . list, sepby(country_a)
          
               +---------------------------------------+
               | id   year   countr~b   countr~a   FTA |
               |---------------------------------------|
            1. |  2   2000        CAN        AUS     1 |
            2. |  1   2000        USA        AUS     1 |
               |---------------------------------------|
            3. |  2   2000        AUS        CAN     1 |
            4. |  3   2000        USA        CAN     1 |
               |---------------------------------------|
            5. |  1   2000        AUS        USA     1 |
            6. |  3   2000        CAN        USA     1 |
               +---------------------------------------+

          Comment


          • #6
            Thank you, Robert, you helped me alot! I was able to get the sum (for each id and each year) of all FTAs that country_a has with other countries than country_b. (and vice versa)

            But I still don't know how to get the sum of all FTAs that do not include any of the two countries of the country pair of a particular observation, again per id and year?
            My code so far:
            Code:
            bysort year: egen newvar5 = total(FTA)
            This gives me the sum of all FTAs per year (respective the sum times two in the appended data set), but how can I subtract the number of FTAs where any of the two countries of the country pair of that observation is involved?

            EDIT: Problem solved, I obviously had a mental block.
            I just divided newvar5 by 2 and subtracted the sums I constructed before.
            Last edited by Thomas Wimmer; 11 Apr 2016, 16:12.

            Comment


            • #7
              I think you just need to:

              Code:
              replace newvar5 = newvar5 - FTA

              Comment


              • #8
                Thank you, Clyde, but this Code does not give me the correct solution, since it reduces the (double) sum of all FTA's per year just by 1 at maximum.

                But I think I found the solution:

                Code:
                bysort year isoa: egen newvar1 = total(ftacu)
                gen newvar2=newvar1-ftacu
                
                rename newvar2 ftaowni
                
                bysort year isob: egen newvar3 = total(ftacu)
                gen newvar4=newvar3-ftacu
                
                rename newvar4 ftaownj
                sort isoa isob year
                
                bysort year: egen newvar5 = total(ftacu)
                gen newvar6 = newvar5/2
                gen newvar7 = newvar6-ftaowni-ftaownj

                Comment

                Working...
                X