  • Matching observations with a perfect match

    Hello everyone,

    My question is basic, but I cannot find an appropriate command for my problem:

    The observations in my sample can be assigned to a treatment group (var1 = 1) and a control group (var1 = 0). I would like to assign exactly one observation of the control group to each observation of the treatment group based on var2 (both must have the same value of var2). If there is more than one possible matching partner, the matching observation from the control group should be chosen randomly (e.g. No. 3 and No. 4 could both be matched with No. 1).

    No. var1 var2
    1 1 2
    2 1 3
    3 0 2
    4 0 2
    5 0 3
    6 0 5

    -> I would like to match 1) obs. No. 1 with either No. 3 or No. 4 and 2) obs. No. 2 with No. 5.

    It would be perfect to either get a sample with only the matched observations (e.g. No. 1,2,3,5) or a new variable that indicates whether an observation is matched or not.

    I tried to use the "cross" command, but my dataset is too large to form all possible combinations. Using "merge" is also difficult because I cannot randomly choose between observations with the same value of var2. Are there any other possibilities or commands?

    Thank you very much in advance for your help!

  • #2
    Two suggestions: (1) I think, but am not sure, that the "outof" option of the user-written command -vmatch- will do this; use -search- to find and install -vmatch-. (2) Split your data into two datasets and merge them, using some procedure to select just one match per case (e.g., assign random numbers within sets, use -egen- with the tag() function, or use -duplicates tag-).
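
    A minimal sketch of suggestion (2), assuming the variable names id, tcgroup, and matchvar (stand-ins for the original No., var1, and var2) and a hypothetical helper variable seq. Each observation gets a random sequence number within its group and matchvar value, and treated and control observations are then merged 1:1 on matchvar and seq. This avoids forming all pairwise combinations and uses each control at most once; treated observations without a partner stay in the data with matched == 0.

    ```stata
    * Sketch only: id, tcgroup, matchvar, seq, id_control, matched are illustrative names
    clear
    set seed 12345
    input id tcgroup matchvar
    1 1 2
    2 1 3
    3 0 2
    4 0 2
    5 0 3
    6 0 5
    end

    * random order within each (group, matchvar) cell
    gen double u = runiform()
    bysort tcgroup matchvar (u): gen seq = _n
    drop u

    * save the controls, with their id renamed, to a temporary file
    preserve
    keep if tcgroup == 0
    rename id id_control
    keep id_control matchvar seq
    tempfile controls
    save "`controls'"
    restore

    * each treated observation picks up the control with the same matchvar and seq
    keep if tcgroup == 1
    merge 1:1 matchvar seq using "`controls'", keep(master match)
    gen byte matched = _merge == 3
    drop _merge
    list id matchvar id_control matched, noobs
    ```

    With the example data, No. 1 is paired with either No. 3 or No. 4, depending on the seed, and No. 2 is paired with No. 5.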


    • #3
      Rich's suggestions look good. Another approach that may be fruitful for you is -joinby var2-. This is like -cross-, except that only observations that agree on the value of var2 will be paired with each other. Then you can make random selections from there.


      • #4
        Is this what you are looking for?

        Code:
        set seed 12345
        
        clear
        input id tcgroup matchvar
        1 1 2
        2 1 3
        3 0 2
        4 0 2
        5 0 3
        6 0 5
        end
        
        tempfile main
        save "`main'"
        
        * form all pairwise combinations within groups that have the same matchvar value
        rename (id tcgroup) (id0 tcgroup0)
        joinby matchvar using "`main'"
        
        * drop non-relevant combinations
        drop if tcgroup == 1
        keep if tcgroup0 == 1
        
        * possibilities
        sort matchvar id0 tcgroup0
        list, sepby(matchvar) noobs
        
        * random pick
        gen mixitup = runiform()
        bysort matchvar id0 (mixitup): gen pick = _n == 1
        list, sepby(matchvar) noobs
        
        * do it again
        replace mixitup = runiform()
        bysort matchvar id0 (mixitup): replace pick = _n == 1
        list, sepby(matchvar) noobs
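
        The original question also asked for a variable that indicates whether an observation is matched. One possible way to get there from the pairs above is sketched below; the names pair, role, picked, and matched are hypothetical and not part of the code above. The sketch repeats the pairing step so that it runs on its own, keeps one randomly picked control per treated observation, stacks the ids of both sides, and merges them back onto the original data. Note that each treated observation draws its control independently here, so if several treated observations share a matchvar value, the same control could be picked more than once.

        ```stata
        * Sketch only: pair, role, picked, matched are illustrative names
        clear
        set seed 12345
        input id tcgroup matchvar
        1 1 2
        2 1 3
        3 0 2
        4 0 2
        5 0 3
        6 0 5
        end
        tempfile main
        save "`main'"

        * same pairing and random pick as in the code above
        rename (id tcgroup) (id0 tcgroup0)
        joinby matchvar using "`main'"
        drop if tcgroup == 1
        keep if tcgroup0 == 1
        gen double mixitup = runiform()
        bysort matchvar id0 (mixitup): keep if _n == 1

        * stack treated ids (id1) and picked control ids (id2) into one id list
        keep id0 id
        rename (id0 id) (id1 id2)
        gen long pair = _n
        reshape long id, i(pair) j(role)
        keep id
        duplicates drop
        tempfile picked
        save "`picked'"

        * flag those ids in the original data
        use "`main'", clear
        merge 1:1 id using "`picked'"
        gen byte matched = _merge == 3
        drop _merge
        list, noobs
        ```

        Adding -keep if matched- at the end would leave a sample with only the matched observations.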


        • #5
          Originally posted by Robert Picard
          Is this what you are looking for?

          Works perfectly! Thank you very much!
