Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to merge datasets according to two variables (in this case Year and PS)

    Hello, I.am once again asking the question which is somewhat similar to my previous post where Clyde Schechter suggested me some useful codes. But while I am trying to merge both the datasets in this case, resulting values remains unchanged. Let me show the dataset in a comprehensible manner. There are some countries missing in datatset1 and some other countries missing in datatset2. Moreover, this extra variable PS(ProductSector) which contains 97 sectors for each year makes it more complex. Please suggest how to merge both the datasets effectively according to Year and PS (Product Sector).

    DATASET1

    Code:
    Year (PS     Argentina     Australia      Brazil     Canada      India)
    1996  1    .005412923   .009761705   .021289896   .16551557             .
    1996  2    .000595907 .00012810821   .001922998   .07750466  5.583215e-06
    1996  3   .0000435198   .002957913  .0003875762   .11520422   .0000263602
    1996  4    .007955423   .007539196     .0306171   .13337472  .00006167813
    1996  5    .015250857    .03253398   .017567292   .07675288    .002637858
    1996  6    .003493172   .004160443   .002276731    .3606265    .002714305
    1996  7   .0012534298    .01326288   .006882724    .3898989   .0010493217
    1996  8    .001726718   .010663555   .015049414   .17263103    .011813587
    1996  9  .00031747445   .017899036    .00567275    .6247506   .0005122751
    1996 10    .001720109 .00010939194    .01637734   .01904136 2.5047459e-07
    1996 11   .0019767391   .006426158   .018081691   .15395814  .00002265323
    1996 12    .005288029   .009656722   .008722636   .02489923  .00021144748
    1996 13     .02211273    .03087043    .04993448   .06307018    .017548036
    1996 14   .0023164314  .0028714926   .029666135   .04277401    .003228264
    1996 15    .002928364  .0037859546   .004927469   .11647154    .018047603
    1996 16      .0010688    .02471223  .0022396517   .26705992 .000018143983
    1996 17   .0042029526    .02684881    .01169383   .28744093    .001654093
    1996 18    .006509231   .016741086    .01101217    .4494169  .00002282757
    1996 19    .007220794   .010665032   .014054524    .5118221    .021851575
    1996 20    .006030358   .009497675   .010786434    .2486253  8.543576e-06
    1996 21     .01541473    .02148049    .02358405   .23319013     .02686504
    1996 22    .003537237     .0240996     .0771867   .14182587   .0003387502
    1996 23   .0043448806   .004895512    .00292151   .13918817   .0004737046
    1996 24   .0001914133   .004732565   .004171709 .0041859327  .00019484505
    1996 25    .009544675   .013911826    .01171264   .19207713    .009171514
    1996 26 .000035927405  .0022411484   .001800989     .340304    .007865751
    1996 27    .006805457    .01407332    .05945707   .15685534    .005803204
    1996 28    .008936951    .02996735   .033603705   .16530456    .010543867
    1996 29     .02182444    .02336423   .068946555   .10555278    .020688225
    1996 30    .010402986    .03109907   .021174217   .16109854   .0011136503
    1996 31     .05619381    .09411073    .04671761   .07256504    .032817595
    1996 32    .015798293    .02190727   .036375914   .25390556    .004155384
    1996 33      .0121169    .03077062    .01512533       .2295    .000420718
    1996 34    .024961686    .02957504    .02725441   .28369874     .00557252
    1996 35    .010916688   .026904374   .031362236   .18759064     .00275323
    1996 36    .008250637    .06514726  .0027768505    .3643231   .0018722123
    1996 37    .009746832    .06935137    .04313864   .13898335    .006733587
    1996 38     .01953846    .02547904    .02789035   .17492287    .006317493
    1996 39     .01486332   .020432265   .032280784   .21445473    .004851667
    1996 40    .010485123    .02513504    .03872368    .3231668    .004159508
    1996 41    .000741592  .0005948011  .0046388335   .04236137    .006883563
    1996 42    .003616494   .017351704   .008854772   .09712918   .0004639031
    1996 43  .00028222022 .00017748294  .0001082553    .1929896   .0004114978
    1996 44   .0006247138   .008362416  .0003743146   .15039323   .0004774384
    1996 45    .005849908    .04461202   .009488012   .46924525    .009415307
    1996 46    .005950562   .017074218    .13993771   .30654415  .00022869285
    1996 47    .010362303  .0038930916    .02044946   .06017964     .02042743
    1996 48    .008346957   .023530606   .028377613    .2439417   .0041921143
    1996 49    .004334345    .05278573    .01874489    .3842102    .004505226
    1996 50    .003346217    .17712426   .029707076   .05699746    .008109384
    1996 51    .001375542  .0011008654   .003505378    .2128257    .069497414
    1996 52   .0018810314   .005920296    .02729212   .09789398   .0011306499
    1996 53     .01520171    .02767783   .011378683   .16282643    .016648134
    1996 54   .0081143575   .013419732    .03958957   .20863837    .002983462
    1996 55    .010861279    .01916884   .036523197    .2079764     .01144998
    1996 56    .009178382    .02927494    .02348635   .27284735     .00148111
    1996 56    .009178382    .02927494    .02348635   .27284735     .00148111
    1996 57    .011667924   .008853852   .013208172    .3939994   .0011297581
    1996 58   .0022094462    .01418644   .017264405   .11621238   .0019864219
    1996 59    .005949356   .031456996    .01271354   .30637875    .004374433
    1996 60   .0042629507   .031964306   .010241723    .3864517   .0005371182
    1996 61   .0021944598    .00795275   .006357198   .11126396  .00004792333
    1996 62   .0017026288   .005853608   .008596722   .07815123  .00003734342
    1996 63    .006739212   .021930637   .012327651   .26582518    .011437964
    1996 64   .0022092115   .014485977   .012609435   .09123532    .003489175
    1996 65   .0040906947    .03426809   .014045555   .15078725 .000026429125
    1996 66      .0116533   .017159987   .033730853   .25533256    .005257613
    1996 67    .012765294    .03968953    .10190888   .10167006   .0011619121
    1996 68    .007788125     .0301459   .020502046   .26162264    .003366434
    1996 69    .007807021   .019924775   .025516523   .22380675    .004117709
    1996 70    .005009144   .015347328   .015893767    .3357402    .003225538
    1996 71   .0001720959  .0032916865   .000853719    .1035111    .021531414
    1996 72    .002340086   .006054145   .006893443   .28956455     .02979339
    1996 73     .00961367    .02074197   .016504394     .269932    .005004544
    1996 74   .0012119224   .005603096    .00813138   .23736276     .02952065
    1996 75    .002558758    .02284907   .015675021    .2012739    .007635446
    1996 76    .014487525   .007768621   .037160512    .2972752    .002575617
    1996 77   .0017969486   .001542431   .005292908    .3167117    .019546947
    1996 78   .0038820275   .005485375   .005309915    .3172928     .04272653
    1996 79    .003086012  .0023170013     .0055437    .3244107   .0040726964
    1996 80    .003747034   .021897884    .02311055   .13220201    .010059632
    1996 81    .014616858    .04435155     .0392373   .25329092    .003792971
    1996 82   .0045793853   .018496266   .016613312    .4348763   .0011816168
    1996 83     .01063628   .035657413   .026205974   .20101996     .00606537
    1996 84    .007793189   .018534303    .02210317   .13647923    .002996999
    1996 85   .0030212793   .034630064   .010585407    .6801313     .01029792
    1996 86   .0038835446   .024646277   .008183343   .58850527   .0005256764
    1996 87    .007030264    .05594212   .017181918   .07557124    .018360004
    1996 88    .017291458    .12408047   .008539211   .12826459     .06224852
    1996 89    .007503517   .033528782    .02468535   .12356672    .004937335
    1996 90   .0033941504     .0293272   .007566995    .1337502    .002706017
    1996 91    .005415997    .03545486    .03559451   .08404399  .00026050673
    1996 92    .005103546    .07348242  .0019221066    .2865976   .0001513357
    1996 93    .014554962   .012512697   .013311894    .4298844   .0007868752
    1996 94     .00369776   .031297006   .010347588   .17580058   .0009833977
    1996 95    .014225993   .037229884   .020797146   .18246177   .0014945463
    1996 96  .00008626976    .00721618 .00033961135  .027917065  .00025453427
    1997  1    .005766634    .01490115    .02315075    .1534583             .
    1997  2   .0007933686 .00020810924   .003241023   .08485008             .
    1997  3 .000068952075  .0014792576  .0004966396   .13099173 .000028935674
    end
    DATASET2

    Code:
    Year (Australia Brazil        Canada      India)
    1989  1.238932  6.318592e-13  1.337232  6.010523
    1990 1.2781003 4.4035364e-10  1.310087  6.703764
    1991 1.2677894 1.3399688e-08   1.30362  9.515412
    1992  1.318393 1.5185393e-06 1.3547595 11.766012
    1993 1.4072763   .0005860761  1.430694 14.296855
    1994 1.3007553     .21582076 1.4784745 15.804987
    1995 1.3056957      .4811425 1.4763154  17.51456
    1996  1.233018     .59265333 1.4473275 20.262354
    1997  1.273254      .6641367 1.4594097 21.745016
    1998 1.4940015      .7265455  1.555095  27.54811
    1999 1.4446623     1.1652954 1.5504937  29.44548
    2000  1.624464     1.2169383 1.5403112 30.923504
    2001 1.8489385     1.6239994 1.6014005  32.76907
    2002 1.7842977     2.1548603 1.6343795 34.658844
    2003 1.5015388      2.547112 1.4651107  33.71234
    2004   1.31984      2.513441 1.3499476 33.143703
    2005   1.26242     2.1621165  1.242377 32.520264
    2006   1.28434     1.9499514 1.1495528 34.242355
    2007 1.1499054      1.758715 1.0808864 32.320175
    2008 1.1527679     1.6857247  1.058693 35.482944
    2009 1.2662634     1.9347336 1.1400007  43.93196
    2010 1.0901595     1.7592267 1.0301127  45.72581
    2011  .9708448     1.7292553  .9869114  49.24996
    2012  .9642726     2.0848985   .991586  60.39237
    2013 1.0442442     2.4091325  1.016815 72.489685
    2014 1.1278877      2.750871 1.0935117  79.23273
    2015 1.3721024     4.2357297 1.2785072   87.2698
    2016 1.3868723      4.773291  1.327514  94.73785
    2017 1.3427736      4.419472 1.2930088  92.89115
    2018  1.370266      5.120229 1.2887018  98.98334
    2019  1.469829      5.631801 1.3212835 103.83585
    2020 1.4790536      7.504243 1.3287678 115.07817
    end
    Last edited by Aishwary Trivedi; 30 Nov 2021, 11:46.

  • #2
    It isn't at all clear what you want the result to look like. The numbers in the variables whose names are country names: how are they related to each other in the two data sets? Are they the same measure, disaggregated by sector in data set 1, and aggregated up to country level in data set 2? Or are they measures of something different? What's going on in this data?

    Comment


    • #3
      Here's an example that may start you in a useful direction, in the absence of a clear explanation of what it is that you want the results to be like.
      Code:
      // set up small example data - 2 years 3 PS
      clear
      input int (Year PS) double (Argentina     Australia      Brazil     Canada      India)
      1996  1    .005412923   .009761705   .021289896   .16551557             .
      1996  2    .000595907 .00012810821   .001922998   .07750466  5.583215e-06
      1996  3   .0000435198   .002957913  .0003875762   .11520422   .0000263602
      1997  1    .005766634    .01490115    .02315075    .1534583             .
      1997  2   .0007933686 .00020810924   .003241023   .08485008             .
      1997  3 .000068952075  .0014792576  .0004966396   .13099173 .000028935674
      end
      save ds1, replace
      clear
      input int Year double (Australia Brazil        Canada      India)
      1996  1.233018     .59265333 1.4473275 20.262354
      1997  1.273254      .6641367 1.4594097 21.745016
      end
      save ds2, replace
      
      // work starts here
      use ds2, clear
      rename (Australia - India) (v2=)
      reshape long v2, i(Year)  j(Country) string
      tempfile use
      save `use'
      
      use ds1, clear
      rename (Argentina - India) (v1=)
      reshape long v1, i(Year PS) j(Country) string
      
      merge m:1 Country Year using `use'
      sort Country Year PS
      list, sepby(Country)
      Code:
      . list, sepby(Country)
      
           +-----------------------------------------------------------------+
           | Year   PS     Country          v1          v2            _merge |
           |-----------------------------------------------------------------|
        1. | 1996    1   Argentina   .00541292           .   Master only (1) |
        2. | 1996    2   Argentina   .00059591           .   Master only (1) |
        3. | 1996    3   Argentina   .00004352           .   Master only (1) |
        4. | 1997    1   Argentina   .00576663           .   Master only (1) |
        5. | 1997    2   Argentina   .00079337           .   Master only (1) |
        6. | 1997    3   Argentina   .00006895           .   Master only (1) |
           |-----------------------------------------------------------------|
        7. | 1996    1   Australia   .00976171    1.233018       Matched (3) |
        8. | 1996    2   Australia   .00012811    1.233018       Matched (3) |
        9. | 1996    3   Australia   .00295791    1.233018       Matched (3) |
       10. | 1997    1   Australia   .01490115    1.273254       Matched (3) |
       11. | 1997    2   Australia   .00020811    1.273254       Matched (3) |
       12. | 1997    3   Australia   .00147926    1.273254       Matched (3) |
           |-----------------------------------------------------------------|
       13. | 1996    1      Brazil    .0212899   .59265333       Matched (3) |
       14. | 1996    2      Brazil     .001923   .59265333       Matched (3) |
       15. | 1996    3      Brazil   .00038758   .59265333       Matched (3) |
       16. | 1997    1      Brazil   .02315075    .6641367       Matched (3) |
       17. | 1997    2      Brazil   .00324102    .6641367       Matched (3) |
       18. | 1997    3      Brazil   .00049664    .6641367       Matched (3) |
           |-----------------------------------------------------------------|
       19. | 1996    1      Canada   .16551557   1.4473275       Matched (3) |
       20. | 1996    2      Canada   .07750466   1.4473275       Matched (3) |
       21. | 1996    3      Canada   .11520422   1.4473275       Matched (3) |
       22. | 1997    1      Canada    .1534583   1.4594097       Matched (3) |
       23. | 1997    2      Canada   .08485008   1.4594097       Matched (3) |
       24. | 1997    3      Canada   .13099173   1.4594097       Matched (3) |
           |-----------------------------------------------------------------|
       25. | 1996    1       India           .   20.262354       Matched (3) |
       26. | 1996    2       India   5.583e-06   20.262354       Matched (3) |
       27. | 1996    3       India   .00002636   20.262354       Matched (3) |
       28. | 1997    1       India           .   21.745016       Matched (3) |
       29. | 1997    2       India           .   21.745016       Matched (3) |
       30. | 1997    3       India   .00002894   21.745016       Matched (3) |
           +-----------------------------------------------------------------+
      Note that this approach solves another of your problems posted here several times, where you want to somehow merge two datasets similar to your dataset1 and dataset2 and simultaneously calculate the product of corresponding variables in the two datasets. Following this path, in the dataset created by merge, you can
      Code:
      generate product = v1*v2
      to achieve the desired result. If you insist, you can then reshape the results back into a wide layout, but that is usually a poor idea. The experienced users here generally agree that, with few exceptions, Stata makes it much more straightforward to accomplish complex analyses using a long layout of your data rather than a wide layout of the same data.

      Finally, the purpose to the dataex command is to produce example data that can be copied and pasted into a do-file editor to read the data in. Please in the future copy and paste the dataex output as given - don't edit it to look handsome.

      Comment


      • #4
        Clyde Schechter Dataset 1 has the data of percentage share of imports to the US by all the countries for all the years as per each Product Sector. Dataset 2 contains the real exchange rate values with respect to dollar. I need to multiply the values of both the datasets in order to get the final output (exchange rate index).

        Comment


        • #5
          William Lisowski thanks so much for this detailed explanation. I will try these codes and get back to you. Also thanks so much for being patient with me. I am still learning how to use Statalist effectively. I will keep your suggestions regarding dataex in mind.

          Comment


          • #6
            William Lisowski Codes for ds2 worked very well but for ds1 - I am facing some issue.

            Code:
            variable id does not uniquely identify the observations
                Your data are currently wide. You are performing a reshape long. You specified i(Year Productsector) and
                j(Country). In the current wide form, variable Year Productsector should uniquely identify the observations.
                Remember this picture:
            
                     long                                wide
                    +---------------+                   +------------------+
                    | i   j   a   b |                   | i   a1 a2  b1 b2 |
                    |---------------| <--- reshape ---> |------------------|
                    | 1   1   1   2 |                   | 1   1   3   2  4 |
                    | 1   2   3   4 |                   | 2   5   7   6  8 |
                    | 2   1   5   6 |                   +------------------+
                    | 2   2   7   8 |
                    +---------------+
                Type reshape error for a list of the problem observations.
            r(9);

            Comment


            • #7
              Above query is resolved. Thanks so much.

              Comment

              Working...
              X