Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • data management bilateral pair data for gravity model

    I am dealing with bilateral pair data for the gravity model. The structure of my data is like
    Sl.No. exporter importer year export
    1 IND IND 2000 500
    2 IND IND 2001 600
    3 IND AFG 2000 100
    4 IND AFG 2001 200
    5 IND USA 2000 500
    6 IND USA 2001 700
    7 AFG AFG 2000 100
    8 AFG AFG 2001 150
    9 AFG IND 2000 20
    10 AFG IND 2001 40
    11 AFG USA 2000 15
    12 AFG USA 2001 20
    13 USA USA 2000 5000
    14 USA USA 2001 7000
    15 USA IND 2000 600
    16 USA IND 2001 800
    17 USA AFG 2000 50
    18 USA AFG 2001 80
    I need to identify the largest trading countries from the data, for which first I need to prepare country-wise total export (for example, total export for IND is the sum of rows 1 to 6) and then country-wise total import (for IND it is sum of row 1+2+9+10+15+16) finally I need to derive total trade (sum of export and import) for each country. Kindly suggest a way out. Thanks

  • #2
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte slno str4(exporter importer) int(year export)
     1 "IND " "IND " 2000  500
     2 "IND " "IND " 2001  600
     3 "IND " "AFG " 2000  100
     4 "IND " "AFG " 2001  200
     5 "IND " "USA " 2000  500
     6 "IND " "USA " 2001  700
     7 "AFG " "AFG " 2000  100
     8 "AFG " "AFG " 2001  150
     9 "AFG " "IND " 2000   20
    10 "AFG " "IND " 2001   40
    11 "AFG " "USA " 2000   15
    12 "AFG " "USA " 2001   20
    13 "USA " "USA " 2000 5000
    14 "USA " "USA " 2001 7000
    15 "USA " "IND " 2000  600
    16 "USA " "IND " 2001  800
    17 "USA " "AFG " 2000   50
    18 "USA " "AFG " 2001   80
    end
    
    by exporter, sort: egen total_export = total(export)
    by importer, sort: egen total_import = total(export)
    In the future, when showing data examples, please use the -dataex- command to do so, as I have done here. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    Comment


    • #3
      Dear Clyde Schechter afetr running
      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input byte slno str4(exporter importer) int(year export) float(total_export total_import)
       7 "AFG " "AFG " 2000  100   345   680
       8 "AFG " "AFG " 2001  150   345   680
       9 "AFG " "IND " 2000   20   345  2560
      10 "AFG " "IND " 2001   40   345  2560
      11 "AFG " "USA " 2000   15   345 13235
      12 "AFG " "USA " 2001   20   345 13235
       3 "IND " "AFG " 2000  100  2600   680
       4 "IND " "AFG " 2001  200  2600   680
       1 "IND " "IND " 2000  500  2600  2560
       2 "IND " "IND " 2001  600  2600  2560
       5 "IND " "USA " 2000  500  2600 13235
       6 "IND " "USA " 2001  700  2600 13235
      17 "USA " "AFG " 2000   50 13530   680
      18 "USA " "AFG " 2001   80 13530   680
      15 "USA " "IND " 2000  600 13530  2560
      16 "USA " "IND " 2001  800 13530  2560
      13 "USA " "USA " 2000 5000 13530 13235
      14 "USA " "USA " 2001 7000 13530 13235
      end
      after this I need 'total trade' that is sum of total_export and total_import for each country (for example, for AFG it is 345+680, and this same total for all rows where AFG is an exporter). kindly suggest me codes for doing it. thanks
      Last edited by Tariq Masood; 23 Dec 2023, 11:51.

      Comment


      • #4
        See -help generate-, -help exp-.

        Comment


        • #5
          Dear Clyde Schechter and Andrew Musau I am still struggling with this. now let me make it more clear. suppose my data is
          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input str4(exporter importer) int(year export)
          "AFG " "IND " 2001  40
          "AFG " "IND " 2000  20
          "AFG " "USA " 2001  20
          "AFG " "USA " 2000  15
          "IND " "AFG " 2000 100
          "IND " "AFG " 2001 200
          "IND " "USA " 2000 500
          "IND " "USA " 2001 700
          "USA " "AFG " 2001  80
          "USA " "AFG " 2000  50
          "USA " "IND " 2001 800
          "USA " "IND " 2000 600
          end
          after running
          by exporter, sort: egen total_export = total(export)
          by importer, sort: egen total_import = total(export)

          I am getting following output
          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input str4(exporter importer) int(year export) float(total_export total_import)
          "AFG " "IND " 2001  40   95 1460
          "AFG " "IND " 2000  20   95 1460
          "AFG " "USA " 2001  20   95 1235
          "AFG " "USA " 2000  15   95 1235
          "IND " "AFG " 2000 100 1500  430
          "IND " "AFG " 2001 200 1500  430
          "IND " "USA " 2000 500 1500 1235
          "IND " "USA " 2001 700 1500 1235
          "USA " "AFG " 2001  80 1530  430
          "USA " "AFG " 2000  50 1530  430
          "USA " "IND " 2001 800 1530 1460
          "USA " "IND " 2000 600 1530 1460
          end
          now I am trying to find sum of export and import variables for each country (like for AFG it will be 95+430 and it must be same for all rows where AFG is an exporter, similarly for IND it will be 1500+1460...).

          Comment


          • #6
            Sorry, I misunderstood what you wanted when I wrote #4. This will give you what you need:
            Code:
            frame put importer total_import, into(imports)
            frame imports {
                duplicates drop
            }
            
            frlink m:1 exporter, frame(imports importer)
            gen wanted = total_export + frval(imports, total_import)

            Comment


            • #7
              Thanks, dear Clyde Schechter. It works perfectly.

              Comment


              • #8
                Dear Clyde Schechter by running these commands, I have to identify the top 150 countries in terms of total trade value; the results I am getting are absolutely fine. I want to ask if there are better ways of doing this:

                * Identify top traders
                by exporter_iso3, sort: egen total_export = total(trade)
                by importer_iso3, sort: egen total_import = total(trade)

                frame put importer_iso3 total_import, into(imports)
                frame imports {
                duplicates drop
                }
                frlink m:1 exporter_iso3, frame(imports importer_iso3)
                gen total_exporter = total_export + frval(imports, total_import)

                frame put exporter_iso3 total_export, into(exports)
                frame exports {
                duplicates drop
                }
                frlink m:1 importer_iso3, frame(exports exporter_iso3)
                gen total_importer = total_import + frval(exports, total_export)

                gen total1= -total_exporter
                egen rank_exporter = group(total1)
                gen total2 =-total_importer
                egen rank_importer = group(total2)
                keep if rank_exporter <= 150
                keep if rank_importer<=150

                Comment


                • #9
                  I don't think the results you are getting are correct.

                  The first problem lies with the -keep if- commands at the end. Once you -keep if rank_exporter <= 150-, the next -keep if rank_importer <= 150- can only retain those observations that survived the rank_exporter -keep- command. In other words, the results you are getting at the end are those ranked among the top 150 in both imports and exports. And there probably won't be 150 of them. Maybe that's what you want, and, if so, that's fine. But if it isn't, then you need to get rid of those two -keep if- commands and instead -keep if rank_exporter <= 150 | rank_importer <= 150-.

                  Also, the use of the -egen, group()- function to rank the countries is a bit dicey. If you read the documentation for that function, you will see that it assigns consecutive positive integers to distinct values of the variable that is the argument of -group()-. Now, as it is currently implemented, it does that by first sorting on that variable and assigning 1 to the lowest value, 2 to the next lowest, etc. But the documentation does not promise that it will do it that way. And in the next version of Stata, or even in the next update of the current version, that behavior could change and it would break your code. You should instead use the -egen, rank()- function, which explicitly provides rank orders.

                  I should also point out that in your code the "top 150" are actually the 150 with the lowest total trade. Is that what you intended?
                  Last edited by Clyde Schechter; 29 Dec 2023, 09:21.

                  Comment


                  • #10
                    Thanks for pointing out these issues. However, I rechecked it, and the output generated after running these codes looks fine.

                    Comment

                    Working...
                    X