  • Select a variable name by value and put it as the value of a new variable

    Hi!
    I'm wondering if there is a function (probably to be nested in a loop) that would allow me to select a variable name from a list of 32 and assign it as the value of a new variable.
    In particular, I have a 764 x 32 matrix, and for every observation I need to:
    1. generate a variable holding the minimum value across the 32 variables (the dataset contains more variables, such as name, address, etc., which I don't want to include in the selection);
    2. generate a variable holding the name of the variable that contains the minimum found in the previous point.
    I hope someone can give me some hints (I suspect this is not the main purpose of software like Stata, but I find it very useful for managing data).

    Thanks.

  • #2
    I would just write a loop. Note that in the event of ties this picks the first minimum seen. If you want to pick the last seen, change the inequality to fit (<= instead of <).

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(var1 var2 var3)
    1 -1   42
    2 -2   66
    3 -3 -999
    end
    
    gen min = var1
    gen whichmin = "var1"
    
    foreach v of var var2-var3 {
          replace whichmin = "`v'" if `v' < min
          replace min = `v' if `v' < min
    }
    
    
    list
    
         +--------------------------------------+
         | var1   var2   var3    min   whichmin |
         |--------------------------------------|
      1. |    1     -1     42     -1       var2 |
      2. |    2     -2     66     -2       var2 |
      3. |    3     -3   -999   -999       var3 |
         +--------------------------------------+
    See https://www.stata-journal.com/articl...ticle=pr0046_1 for more on this precise problem.

    Naturally egen gives you the row minimum directly. The reason for there not being a canned function is, I surmise, the possible existence of ties.
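
    For reference, a minimal sketch of the egen route on the same toy data. The variable name rowminimum is made up for illustration; egen only returns the minimum itself, so the name of the variable holding it still needs a loop like the one above.

```stata
* Sketch only: rowminimum is a hypothetical variable name
clear
input float(var1 var2 var3)
1 -1   42
2 -2   66
3 -3 -999
end

* rowmin() returns the smallest value across the listed variables in each row
egen rowminimum = rowmin(var1-var3)
list
```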


    • #3
      Thanks Nick Cox
      Do you know whether institutional login is available for The Stata Journal?
      And yes, I need to learn a lot about loops. I actually find it very difficult to cope with code syntax. Sometimes the problem is also how to phrase the query correctly when searching the knowledge base. I hope to improve with practice.

      Davide


      • #4
        It is possible that your institution has a subscription to the Stata Journal, but I can’t comment on that. My earlier paper on rowwise calculations and another 2020 paper on loops in general are publicly accessible.


        • #5
          Well, you were right: I got access through my institution (I only just discovered it).
          Actually, when I opened the link you posted there was only the option to buy, so while looking into subscriptions I decided to try an alternative. This way is the best. Thanks!


          • #6
            Hi Nick Cox
            I have been reading the papers you suggested (including the one you wrote in 2006) and watching video tutorials on the web.
            The code I wrote works only up to a certain point. I used the egen command (as a last resort, to try that too) to find the minimum value of each row, but when I tried to replace the content of the variable "whichmin" with the name of the variable holding the minimum value, the command changed the content for only some of the observations (this happened several times, and not always in the same way). Do you know of any plausible reason for that?
            Here is an extract of my results table:
            [Attached image: Immagine.png]


            And this is the line of code inside the loop that is driving me quite mad...
            Code:
            egen dist = rmin(LondonSP-DortmundHBF)
            gen node = "node"
            
            foreach N of var LondonSP-DortmundHBF {
            
                replace node = "`N'" if `N' < dist
            
                }
            Could you please help me shed some light on this?
            Thanks D.

            PS: I'm sorry, but I can work on this only in my spare time, so I'm aware I'm not efficient at all...


            • #7
              You're looking for values less than the minimum. The code should not find any. You need to find for which node the distance is equal to the minimum.


              • #8
                OK! That was my thought too (I also tried "<=" and got the same results), but I'm sorry, I can't find the right syntax to put in there. I feel really dumb!
                I tried "=" and "==", but I keep getting stuck on the r(198) "invalid syntax" message.


                • #9
                  Code:
                   
                   replace node = "`N'" if `N' == dist


                  • #10
                    Something still doesn't work at all.

                    Here is the code. Can you spot what is wrong, and where?

                    Code:
                    clear all
                    
                    cd ""
                    
                    import excel using "file001.xlsx", firstrow
                    *tab
                    drop Adress Suburb City Postalcode State
                    drop if Longitude==.
                    gen Stat_id=_n
                    save "Nodes.dta", replace
                    clear
                    
                    use "file002.dta"
                    cross using "file003.dta"
                    rename (Longitude Latitude) (Xcor Ycor)
                    
                    geodist Lon Lat Xcor Ycor, generate (distance)
                    drop Xcor Ycor Lon Lat TrainStation
                    
                    reshape wide distance@, i(Code) j(Stat_id)
                    
                    rename (distance1 distance2 distance3 distance4 distance5 distance6 distance7 distance8 distance9 ///
                    distance10 distance11 distance12 distance13 distance14 distance15 distance16 distance17 distance18 distance19 ///
                    distance20 distance21 distance22 distance23 distance24 distance25 distance26 distance27 distance28 distance29 ///
                    distance30 distance31 distance32) ///
                    (LondonSP Ebbsfleet Ashford FrethunTGV Lille_EU ParisN CDG_TGV Disney_TGV BordeauxSJ ///
                    MarseilleSC AixTGV Avignon_TGV ValenceTGV Lyon_StX Chambery Albertville Moutiers AimeLP ///
                    BourgStM BruxMidi AntwerpenC LiegeG RotterdamC Schipol_Int AmsterdamC Aix_HBF KolnHBF ///
                    DusseldorfHBF DusseldorfInt DuisburgHBF EssenHBF DortmundHBF)
                    
                    order County_name, after(Code)
                    
                    egen dist = rmin(LondonSP-DortmundHBF)
                    gen node = "node"
                    order node dist, after (County_name)
                    
                    foreach N of var LondonSP-DortmundHBF {
                        
                        
                        replace node = "`N'" if `N' == dist
                        }
                    
                    
                    *save "file004.dta", replace
                    
                    *export excel "file004.xlsx", firstrow(var) keepcellfmt replace


                    • #11
                      Debugging this is really hard for me without the datasets, and you can't win there: I really don't want to be playing with so many datasets. One guess is a mismatch between variable types, so that you are comparing a float and a double.
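
                      To illustrate the guess, here is a minimal sketch on made-up data, not a reconstruction of the datasets in this thread. It assumes the distances are stored as doubles and relies on egen creating a float by default. The names a, b, dist_f, dist_d, node_f and node_d are all hypothetical.

```stata
* Sketch only: a, b, dist_f, dist_d, node_f, node_d are hypothetical names
clear
input double(a b)
0.1 0.3
0.7 0.2
end

* egen stores its result as a float unless a type is given
egen dist_f = rowmin(a b)
* same row minimum, kept in double precision
egen double dist_d = rowmin(a b)

gen node_f = ""
gen node_d = ""
foreach N of var a b {
    * double compared with float: 0.1 and 0.2 are rounded differently, so no match
    replace node_f = "`N'" if `N' == dist_f
    * double compared with double: exact match
    replace node_d = "`N'" if `N' == dist_d
}
list
```

                      In this sketch node_f stays empty while node_d picks up the right variable name. Another option, under the same assumptions, is to keep the float minimum and compare it with float(`N') inside the loop.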


                      • #12
                        Hi Nick Cox, thanks a lot for your hint. In the end I just changed the type of the variable that records the loop output, and everything worked properly.
                        What a teacher of mine said in primary school is true: the computer is a fast idiot. It needs everything to be explained. :D
                        So now I'm heading on to something more complex: switching from geodist to georoute, and then combining the data with Dijkstra's or Floyd's algorithm.
