Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • looking for advices about filling gaps

    Hi Stata Users,

    I'm sorry if my question may seem trivial.
    Anyway, I have a dataset on trade. I have the value of export for every product (hs6) between all countries. I paste an example of the dataset (from dataex command).


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str6 hs6 str3 i str32 i_country str3 j str32 j_country float v
    "630900" "004" "Afghanistan" "008" "Albania"    1.7082754
    "610469" "004" "Afghanistan" "008" "Albania"        2.734
    "401519" "004" "Afghanistan" "008" "Albania"        2.952
    "091099" "004" "Afghanistan" "012" "Algeria"        1.185
    "080620" "004" "Afghanistan" "012" "Algeria"       29.948
    "330210" "004" "Afghanistan" "012" "Algeria"        1.311
    "701339" "004" "Afghanistan" "012" "Algeria"        8.825
    "732399" "004" "Afghanistan" "012" "Algeria"        7.405
    "880330" "004" "Afghanistan" "031" "Azerbaijan"    88.768
    "722220" "004" "Afghanistan" "031" "Azerbaijan"    56.282
    "842123" "004" "Afghanistan" "032" "Argentina"      1.926
    "730729" "004" "Afghanistan" "032" "Argentina"      2.018
    "392410" "004" "Afghanistan" "032" "Argentina"   3.486448
    "710310" "004" "Afghanistan" "032" "Argentina"  3.9497726
    "852990" "004" "Afghanistan" "032" "Argentina"    389.019
    "721691" "004" "Afghanistan" "032" "Argentina"  1.8223627
    "841490" "004" "Afghanistan" "032" "Argentina"     23.839
    "392630" "004" "Afghanistan" "032" "Argentina"      1.295
    "870840" "004" "Afghanistan" "032" "Argentina"      5.233
    "844390" "004" "Afghanistan" "032" "Argentina"      4.228
    "853400" "004" "Afghanistan" "032" "Argentina"       5.53
    "854140" "004" "Afghanistan" "032" "Argentina"     2.0545
    "853110" "004" "Afghanistan" "032" "Argentina"       4.32
    "851829" "004" "Afghanistan" "032" "Argentina"      1.689
    "842199" "004" "Afghanistan" "032" "Argentina"      6.997
    "870850" "004" "Afghanistan" "032" "Argentina"       1.18
    "220290" "004" "Afghanistan" "036" "Australia"  2.1695094
    "847160" "004" "Afghanistan" "036" "Australia"     25.087
    "711311" "004" "Afghanistan" "036" "Australia"       3.86
    "080290" "004" "Afghanistan" "036" "Australia"      29.41
    "901920" "004" "Afghanistan" "036" "Australia"      2.179
    "847130" "004" "Afghanistan" "036" "Australia"      1.869
    "950390" "004" "Afghanistan" "036" "Australia"       1.93
    "852510" "004" "Afghanistan" "036" "Australia"      3.494
    "970600" "004" "Afghanistan" "036" "Australia"      7.652
    "081340" "004" "Afghanistan" "036" "Australia"  10.573498
    "871492" "004" "Afghanistan" "036" "Australia"      2.617
    "820830" "004" "Afghanistan" "036" "Australia"      2.826
    "200819" "004" "Afghanistan" "036" "Australia"   1.371079
    "630520" "004" "Afghanistan" "036" "Australia"    4.48296
    "200540" "004" "Afghanistan" "036" "Australia"    2.08091
    "852439" "004" "Afghanistan" "036" "Australia"      1.977
    "401693" "004" "Afghanistan" "036" "Australia"      1.356
    "848310" "004" "Afghanistan" "036" "Australia"      1.514
    "570110" "004" "Afghanistan" "036" "Australia"  151.71866
    "620419" "004" "Afghanistan" "036" "Australia"      1.268
    "847141" "004" "Afghanistan" "036" "Australia"      2.295
    "950631" "004" "Afghanistan" "036" "Australia"      1.112
    "852520" "004" "Afghanistan" "036" "Australia"      1.004
    "841381" "004" "Afghanistan" "036" "Australia"     12.504
    "871499" "004" "Afghanistan" "036" "Australia"      1.261
    "852691" "004" "Afghanistan" "036" "Australia"     15.149
    "620349" "004" "Afghanistan" "036" "Australia"      2.601
    "401039" "004" "Afghanistan" "036" "Australia"      1.375
    "630392" "004" "Afghanistan" "036" "Australia"      1.145
    "170490" "004" "Afghanistan" "036" "Australia"   2.605907
    "847490" "004" "Afghanistan" "036" "Australia"      1.074
    "711319" "004" "Afghanistan" "036" "Australia"      2.759
    "040310" "004" "Afghanistan" "036" "Australia"  12.707836
    "200590" "004" "Afghanistan" "036" "Australia"  1.3917105
    "850780" "004" "Afghanistan" "036" "Australia"     16.052
    "902190" "004" "Afghanistan" "036" "Australia"      1.957
    "841121" "004" "Afghanistan" "036" "Australia"   706.6636
    "851790" "004" "Afghanistan" "036" "Australia"      4.368
    "732690" "004" "Afghanistan" "036" "Australia"      1.827
    "620899" "004" "Afghanistan" "036" "Australia"        1.5
    "071310" "004" "Afghanistan" "036" "Australia"  1.4498367
    "761519" "004" "Afghanistan" "036" "Australia"       3.92
    "820559" "004" "Afghanistan" "036" "Australia"      2.468
    "080212" "004" "Afghanistan" "036" "Australia"   9.693563
    "900319" "004" "Afghanistan" "036" "Australia"      1.131
    "870893" "004" "Afghanistan" "036" "Australia"      2.064
    "570500" "004" "Afghanistan" "036" "Australia"      7.167
    "080420" "004" "Afghanistan" "036" "Australia"   1.224237
    "830249" "004" "Afghanistan" "036" "Australia"      3.401
    "730799" "004" "Afghanistan" "036" "Australia"      1.997
    "731100" "004" "Afghanistan" "036" "Australia"  1.3793677
    "900220" "004" "Afghanistan" "036" "Australia"      1.965
    "902300" "004" "Afghanistan" "036" "Australia"      5.192
    "851750" "004" "Afghanistan" "036" "Australia"      5.467
    "200899" "004" "Afghanistan" "036" "Australia"   2.963331
    "900691" "004" "Afghanistan" "036" "Australia"  1.9593333
    "848390" "004" "Afghanistan" "036" "Australia"      2.526
    "080620" "004" "Afghanistan" "036" "Australia"  570.50684
    "851890" "004" "Afghanistan" "036" "Australia"      1.648
    "847150" "004" "Afghanistan" "036" "Australia"     1.6865
    "970300" "004" "Afghanistan" "036" "Australia"     96.373
    "081310" "004" "Afghanistan" "036" "Australia"   21.79087
    "847330" "004" "Afghanistan" "036" "Australia"      1.175
    "071320" "004" "Afghanistan" "036" "Australia"   1.611071
    "210690" "004" "Afghanistan" "036" "Australia"  1.6370057
    "848180" "004" "Afghanistan" "036" "Australia"      1.848
    "570210" "004" "Afghanistan" "036" "Australia"     11.202
    "121120" "004" "Afghanistan" "036" "Australia"      1.725
    "940190" "004" "Afghanistan" "036" "Australia"      3.712
    "121299" "004" "Afghanistan" "036" "Australia"   1.352353
    "570190" "004" "Afghanistan" "036" "Australia"      2.605
    "902139" "004" "Afghanistan" "036" "Australia"      1.626
    "091020" "004" "Afghanistan" "036" "Australia"     39.377
    "090910" "004" "Afghanistan" "036" "Australia"  3.2772045
    end

    This is only an example considering Afghanistan as origin country, but I have, as already said, the value (v) of export for all the countries and it is a mean of three years, so there is no time variable, substantially, in the dataset.

    The problem is that, obviously, not all the countries (origin, i*) export all the products towards all the destinations (j*).

    I am wondering how can fill these gaps to have for all my origin country (i*) and for all the destinations (j*) observations for all the products (hs6).

    I have already tried the tsset to implement tsfill, full (obviously the variable storage type) but STATA returns me the message "repeated time values within panel r(451)".

    Has anyone got suggestions for choosing the right variables with tsset, for example, or another way to solve my problem?

    Any comment or advice would be very appreciated.

    Thank you

  • #2
    Welcome to Statalist, and thank you for the clearly presented sample data.

    I will start by saying that I question why you feel you need to fill in the gaps. But assuming you do, the fillin command is what you want, for the data as it is shown. Starting with your sample data in memory, we have the following code.
    Code:
    fillin i j hs6
    bysort i (_fillin): replace i_country = i_country[1]
    bysort j (_fillin): replace j_country = j_country[1]
    replace v = 0 if _fillin
    drop _fillin
    sort i j hs6
    tab j_country i_country
    Code:
    . tab j_country i_country
    
                          | i_country
                j_country | Afghani.. |     Total
    ----------------------+-----------+----------
                  Albania |        99 |        99 
                  Algeria |        99 |        99 
                Argentina |        99 |        99 
                Australia |        99 |        99 
               Azerbaijan |        99 |        99 
    ----------------------+-----------+----------
                    Total |       495 |       495

    Comment


    • #3
      The problem is that, obviously, not all the countries (origin, i*) export all the products towards all the destinations (j*).

      I am wondering how can fill these gaps to have for all my origin country (i*) and for all the destinations (j*) observations for all the products (hs6).
      This is not a case of missing data. Obviously, if no trade took place between a pair of countries for a given product, then the value of exports or imports is 0. Simple as that. If your analysis is on trade, then values where (trade=0) are not useful, and therefore it makes no sense to include them in your dataset. Perhaps you can explain what you aim to do with this data to elicit more useful responses.
      Last edited by Andrew Musau; 20 Apr 2018, 07:33.

      Comment


      • #4
        First of all, I would like to thank you for your advices.

        I want to fill the gaps because I would like to implement a network analysis and I want to build a matrix of distances in which the distances are the market shares.

        I hope this helps

        Comment

        Working...
        X