Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to generate a variable that is the sum of another variable's observations subject to a year and id

    I want to generate a variable that is the sum of "trade-share of GDP" in a given year for a given country. I have data for 4 years, and I have a variable that is the "trade value" for each country pair.
    year importer exporter tradevalue GDPimport
    1970 CAN USA 2356 65748
    1970 CAN AUS 567 65748
    1970 CAN GBR 789 65748
    1971 IND CHN 125 46783
    1971 IND JAP 1256
    46783
    1971 JAP BEN 34
    67890
    1972 JAP BRZ 1345 67890
    1972 EGY JOR 234 3456
    1972 EGY POL 78
    3456
    Ignoring that I need to convert tradevalue and GDPimport to the same dollar amount (so both in billions of $), I want to create a column/variable that is the sum of the tradevalue for a country in a given year. So the new column would be like this:
    year importer exporter tradevalue GDPimport tradesum
    1970 CAN USA 2356 65748 3712‬
    1970 CAN AUS 567 65748 3712‬
    1970 CAN GBR 789 65748 3712‬
    1971 IND CHN 125 46783 1381
    1971 IND JAP 1256
    46783
    1381
    1971 JAP BEN 34
    67890
    1379
    1972 JAP BRZ 1345 67890 1379
    1972 CAN
    USA
    2689 70019 3367
    1972 CAN AUS 678 70019 3367
    So tradevalue is 3712 (567+789+125) for CAN in 1970, but is 3367 (2689+678) in 1972. So basically, I want to sum tradevalue by year and importer, so that the value repeats for importer in the year 1970. As you can see if have CAN twice here, once for 1970 and 1972. In my real data, I have every country in the world as both importer and exporter, so I don't want to simply add tradevalue by importer, as this would add for all four years. And since I have every country in the world, I cannot select each country.

    I'm hoping to take that Tradesum value and then just divide by the GDPimport to get the trade share of GDP. But first I need to figure out how to get that tradesum variable to exist. Any help would be greatly appreciated.


  • #2
    Thanks for the data example, which is helpful, but note this form based on what you would get with dataex (FAQ Advice '#12).

    Code:
    clear
    input year    str3 (importer    exporter) tradevalue    GDPimport
    1970    CAN    USA    2356    65748
    1970    CAN    AUS    567    65748
    1970    CAN    GBR    789    65748
    1971    IND    CHN    125    46783
    1971    IND    JAP    1256    46783
    1971    JAP    BEN    34    67890
    1972    JAP    BRZ    1345    67890
    1972    EGY    JOR    234    3456
    1972    EGY    POL    78    3456
    end
    
    egen wanted = total(tradevalue), by(importer year)
    
    list, sepby(importer year)
    
         +-----------------------------------------------------------+
         | year   importer   exporter   tradev~e   GDPimp~t   wanted |
         |-----------------------------------------------------------|
      1. | 1970        CAN        USA       2356      65748     3712 |
      2. | 1970        CAN        AUS        567      65748     3712 |
      3. | 1970        CAN        GBR        789      65748     3712 |
         |-----------------------------------------------------------|
      4. | 1971        IND        CHN        125      46783     1381 |
      5. | 1971        IND        JAP       1256      46783     1381 |
         |-----------------------------------------------------------|
      6. | 1971        JAP        BEN         34      67890       34 |
         |-----------------------------------------------------------|
      7. | 1972        JAP        BRZ       1345      67890     1345 |
         |-----------------------------------------------------------|
      8. | 1972        EGY        JOR        234       3456      312 |
      9. | 1972        EGY        POL         78       3456      312 |
         +-----------------------------------------------------------+
    
    .
    I can't quite reproduce your calculations, but I think you have lumped together values for Japan in different years,
    Last edited by Nick Cox; 28 Apr 2020, 11:24.

    Comment


    • #3
      I just realized I can use the egen command to do this.

      I can egen tradesum = sum(tradevalue), by (year, importer)

      Please ignore my above message. I can seem to delete it.

      Comment


      • #4
        Thank you Nick Cox. Yes you are correct. And you were very prompt and thorough in your response. Faster than I was in figuring out on my own.

        Comment


        • #5
          You have the same GDPIMPORT for Japan in 1971 and 1972. That's probably a transcription error, which in turn explains the puzzle above. No matter; the principle is what you seek.

          Comment


          • #6
            sum() and total() are names for essentially the same egen function. sum() has been undocumented since Stata 9, so I guess you were reading something from way back, but it still works fine.

            Comment

            Working...
            X