Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Randomly generating new list of room mates and room numbers from an existing list

    Hello,

    I have a data set with 3 columns that has pairs of roommates with their current room number. I want to generate a new data set such that each person gets a new room mate and both of them get a new room (with no repetition of any kind). Any suggestions on how I can do this?
    Thanks in advance for the help.

    Best,
    Janaki








  • #2
    I don't understand your datastructure. Can you give us an example? See the section on dataex in the FAQ
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------

    Comment


    • #3
      This is very much subject to what you mean by "random" - which you don't mention in the statement of the problem, only in the title of the topic. The following has a certain amount of randomness and meets your other criteria. But if you mean that each person has an equal chance of being matched with every other person who is not currently their roommate, this will not meet your needs.
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input float room str1(res1 res2)
      101 "A" "Q"
      102 "B" "R"
      103 "C" "S"
      104 "D" "T"
      105 "E" "U"
      106 "F" "V"
      107 "G" "W"
      108 "H" "X"
      109 "I" "Y"
      110 "J" "Z"
      end
      
      set seed 42
      scalar d1 = runiformint(1,9)
      scalar d2 = runiformint(1,8)
      scalar d2 = d2 + (d2>=d1)
      generate s1 = mod(_n+d1-1,_N)+1
      generate s2 = mod(_n+d2-1,_N)+1
      generate str1 new1 = res1[s1]
      generate str2 new2 = res2[s2]
      list, clean noobs
      Code:
      . list, clean noobs
      
          room   res1   res2   s1   s2   new1   new2  
           101      A      Q    4   10      D      Z  
           102      B      R    5    1      E      Q  
           103      C      S    6    2      F      R  
           104      D      T    7    3      G      S  
           105      E      U    8    4      H      T  
           106      F      V    9    5      I      U  
           107      G      W   10    6      J      V  
           108      H      X    1    7      A      W  
           109      I      Y    2    8      B      X  
           110      J      Z    3    9      C      Y

      Comment


      • #4
        I wonder if there is always a solution to this problem if one stipulates "random." If one randomly picks res1 without replacement to assign to a room, and tries to do the same for res2, is it possible that one will run out of choices for res2 because all the allowable rooms for res2 have already been given (by chance) to someone else? This seems similar to problems with greedy matching. Anyway, this has the sound of a problem that has been investigated in combinatorics, and it would be interesting if someone could say something about known algorithms or proofs. I'm not that person <grin>.

        Comment


        • #5
          Mike has made concrete what I was grasping at.

          This is not a problem in statistics, and while "real programmers can write do-loops in any language" (an old FORTRAN joke, sorry) it's not clear that Stata has the best set of tools to handle this, or that Statalist has anyone with the expertise necessary to give good - as opposed to ad hoc - advice.

          I personally think there is always a solution for 3 or more existing pairs of roommates. The problem is that if you try to find a solution by creating the assignments sequentially - building up your assignments a room at a time with some element of randomness - you run the risk of ending up with, say, rooms 109 and 110 and residents I, J, Y, and Z remaining to be assigned, and for that there is no solution, as Mike worried, so you have to back out of the dead end a ways and try again.

          The answer may be to iteratively do purely random assignments - randomly order the list of residents and assign the first two to room 101, the next two to room 102, ... . Then confirm for each pair that the two residents did not previously room together, and neither resident was in the room to which they are now assigned. If we're luck, all the conditions are confirmed and we're done. If not, re-randomize and try again. My gut feeling is that the set of acceptable solutions is fairly dense within the set of all solutions, for a reasonably large number of rooms, so this may not loop for too long.

          The systematic approach would be a recursive program that accepts a list of N rooms and 2N residents, creates a list of all N rooms times (2N-2 choose 1)x(2N-4 choose 1) [I think] acceptable pairs for each room, marks the first entry on the list as an assignment, then calls itself with lists of the remaining N-1 rooms and 2N-2 residents. If the program cannot find an assignment, it returns a failure code, and the calling program tries the next entry on its list, and so forth. A fun program to write in a recursive language. Probably could get it down to a single line of APL.

          Neither of these approaches are anything I intend to take further, however.
          Last edited by William Lisowski; 20 Feb 2019, 11:37.

          Comment


          • #6
            Following up:
            1) I recall now that I encountered a similar problem to create a list of matched pairs for a series of gift exchanges among N family members. In that application, excluding husband/wife matches made a lot of solutions impossible, but nevertheless, iteratively making random assignments until an acceptable one occurred worked fine in Stata. In the current situation, the N is presumably larger, but the chance of hitting an impossible solution is smaller. Another approach I would consider here is to make random assignments, per Bill's suggestion, but then among the relatively few assignments that are not acceptable, make simple swaps until they're ok. If this swapping took too many iterations, one could re-randomize.

            2) I'd note for the record that both Stata and Mata do recursion just fine, in my experience.

            Code:
            // The hello world case of recursion
            cap prog drop factorial
            prog factorial, rclass
             args N
             if (`N' == 0) {
                return scalar nfact = 1
             }
             else {
                local nm1 = `N' - 1
                factorial `nm1'
                return scalar nfact = r(nfact) * `N'
             }
             end
             // for example
            fact 5

            Comment


            • #7
              This is incredible! Thank you so much every one for your responses.

              Comment


              • #8
                William Lisowski I was vague in my usage of the term 'random'. I used your codes and they gave me exactly what I needed. Thanks so much!

                Comment

                Working...
                X