Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Creation of a distances matrix (Helpman1998)

    Good afternoon to all,
    I have a set of 765 locations geo coordinates, I am trying to build a matrix that will measure the distance across all the locations in couples where in the principal diagonal report measures the distance between the same location (A to A, B to B, ... , XYZ to XYZ) reported as 1 and in all other intersections the distance between locaction in row and location in column.
    The result would be something like:
    Toronto Cape Town Rome Buenos Aires
    Toronto 1 km km km
    Capetown km 1 km km
    Rome km km 1 km
    Buenos aires km km km 1
    I have already tried the cross function but I get back just a newvar of all the possible pair distances.
    I also tried to use the geodist function, but I think I have to implement a nested loop which would perform operations for any location of origin and any location of destination, with the result of distances in the intersection between them (as just drafted before).
    Any hint to resolve this problem, I can't get the way, and I think you ave already realized I'm a newbie here and to exploiting Stata too.

    Thank in avance for your help.

  • #2
    Here, you show what you want the end result to look like but not how your initial dataset looks like. You need to have pairwise distances, then it's just a matter of reshaping. The values in the diagonal should zero.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str3(origin destination) float distance
    "A" "A"  0
    "A" "B" 20
    "A" "C" 30
    "B" "A" 20
    "B" "B"  0
    "B" "C" 40
    "C" "A" 30
    "C" "B" 40
    "C" "C"  0
    end
    
    reshape wide distance, i(origin) j(destination) string
    rename distance* *
    Res.:

    Code:
    . rename distance* *
    
    . l
    
         +-----------------------+
         | origin    A    B    C |
         |-----------------------|
      1. |      A    0   20   30 |
      2. |      B   20    0   40 |
      3. |      C   30   40    0 |
         +-----------------------+
    See

    Code:
    help mkmat
    help ds
    on how to convert this to a matrix and how to retrieve the row and column names.
    Last edited by Andrew Musau; 15 Nov 2021, 09:52.

    Comment


    • #3
      Hi
      Attached Files

      Comment


      • #4
        Hi Andrew, thanks for your advice, I've not realized this proceed yet. I was very scared about the size of observation calculate in the first attempt.
        It could be really a good solution to implement geodist formula for any location in a foreach loop and then using reshape command to transform it in a row and then perform tha to all the 765 observation. I will try and I will be back here ASAP. Actually I'm yet scared about how to implement that. I will need a moment to conceptualise it in my mind.
        Thanks for now.

        D.

        Comment


        • #5
          Hi Nicola Tommasi it seems very useful, I'm already working on it.
          Thanks!

          Comment


          • #6
            765 is not big at all. Stata SE/ MP can handle that easy. The following is illustrative and uses geodist from SSC by Robert Picard.

            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input float id str20 City float(Latitude Longitude)
            1 "Drammen"              59.75  10.2
            1 "Fredrikstad-sarpsbor" 59.24 10.94
            1 "Kristiansand"         58.15  7.99
            1 "Oslo"                 59.91 10.75
            1 "Porsgrunn-skien"      59.15  9.66
            1 "Stavanger"            58.97  5.71
            1 "Trondheim"            63.44  10.4
            end
            
            
            *CREATE PAIRWISE COMBINATIONS BY ID
            preserve
            rename (City Latitude Longitude) (City2 Latitude2 Longitude2)
            tempfile City2
            save `City2'
            restore
            joinby id using `City2'
            geodist Latitude Longitude Latitude2 Longitude2, gen(distance)
            gen origin= strtoname(City)
            gen destination= strtoname(City2)
            keep origin destination distance
            sort origin destination
            replace distance=1 if origin==destination
            reshape wide distance, i(origin) j(destination) string
            rename distance* *
            ds origin, not
            local names= "`r(varlist)'"
            mkmat `names', mat(wanted)
            mat colnames wanted= `names'
            mat rownames wanted= `names'
            mat l wanted
            Res.:

            Code:
            . mat l wanted
            
            symmetric wanted[7,7]
                               Drammen  Fredriksta~r  Kristiansand          Oslo  Porsgrunn_~n     Stavanger     Trondheim
                 Drammen             1
            Fredriksta~r      70.60559             1
            Kristiansand     218.94267     209.72986             1
                    Oslo     35.627153     75.409469     252.06399             1
            Porsgrunn_~n     73.525154     73.834644     147.66627     104.75033             1
               Stavanger     269.68021     301.10683     161.09167     304.44826     227.48386             1
               Trondheim     411.34534      468.9118     603.76359     393.81813     479.68484     557.93652             1
            
            .
            Last edited by Andrew Musau; 15 Nov 2021, 10:23.

            Comment


            • #7
              Thanks Andrew Musau, yours seems an useful code too. Well, when I spoke about size it was because original code perfomed only one column with all distance rows (it was over 500 thousand ocurrences!). Yes! Of course I have a lot to learn using Stata and programming.
              Actually, in the meantime, I performed the challenge starting from Nicola Tommasi hint, with some modifications, and I succed in my plan.
              Code:
              import excel "NUTS3areas.xlsx", sheet("LonLat") firstrow
              rename (Longdec Latdec) (p1_x p1_y)
              gen id = _n
              summarize id
              forvalues i=1/`r(max)'  {
                  local xc=p1_x in `i'
                  local yc=p1_y in `i'
                  gen xc_id`i' = `xc'
                  gen yc_id`i' = `yc'
                  
                  geodist p1_x p1_y xc_id`i' yc_id`i', generate (linear_d`i') ellipsoid(6378388,6356863.019)
                      replace linear_d`i'=1 in `i'
                      drop xc_id`i' yc_id`i'
                      
                  
              }
              
              sum id
              
              drop p1_x p1_y Longrad Latrad id
              Tomorrow I want to perform a similar task (here is 9pm) and I will try your code (same numper of origin points, but different and much less point destination points).
              But in general the very goodnews are that I am enjoying all that stuff.
              Thanks to all. Grazie mille
              Last edited by Davide Spano; 15 Nov 2021, 12:48.

              Comment

              Working...
              X