
  • One-to-Many (1:n) propensity score matching without replacement, using psmatch2

    Dear all,

    This is my first ever post here, so please bear with me...

    I have been looking for a solution to this problem for quite a while, and I think I have come up with one. I want to do 1:n propensity score matching (with n being flexible up to a certain number) without replacement. However, psmatch2 only allows 1:n matching with replacement. Note that this is not a discussion of the advantages or disadvantages of either method.

    I've combined advice on similar topics from a number of users in the do-file below. It's a bit clunky, but I think it does what I want it to do:
    - 1:3 (in this example) propensity score matching on a previously predicted propensity score [pscore], without replacement
    - The output mirrors that of psmatch2, so pstest or similar can be used
    - Matches are identifiable through the variable [pair], allowing for conditional logistic regression or other analysis

    Could people try this on their own data to check that I haven't made a mess of it?
    Also, if this does work, then maybe it can help somebody in a similar situation...

    I am looking forward to your feedback, please be kind.

    Johannes




    **** trial of 1:3 matching without replacement, using repeated 1:1 matching with psmatch2 ****
    *** run your own prediction model, save your propensity score under variable [pscore]


    * create copy of propensity score
    sum pscore
    gen pscore_original=pscore
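
    * optional sketch: compute the 0.20*SD caliper instead of typing it by hand
    * (the local macro name [cal] is an illustrative assumption, not part of psmatch2;
    *  local macros only persist if the do-file is run as a whole, not line by line)
    quietly summarize pscore_original
    local cal = 0.2*r(sd)
    display "caliper (0.20*SD of pscore) = " %6.4f `cal'
    * `cal' could then be used in place of 0.024 in the caliper() option of each round below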

    * random order
    set seed 1000
    gen x=runiform()
    sort x


    *** Round 1
    ** nearest neighbour 1:1 matching with caliper 0.20*SD, adjust for your own data from 'sum pscore' results above
    psmatch2 [your intervention], pscore(pscore) caliper(0.024) noreplacement descending

    ** remove matched controls by changing propensity score to 91 (future rounds will be 92, 93 etc)
    replace pscore=91 if _treated==0 & _weight==1

    ** keep ID of matched control by generating new n1 and ID variable (new variable without underscore so it doesn't get overwritten)
    gen n1=_n1
    gen id=_id

    ** generate paired ID for later analysis
    gen pair = _id if pscore==91
    replace pair = _n1 if _treated==1
    bysort pair: egen paircount = count(pair)
    replace pair=. if paircount!=2
    drop paircount



    *** Round 2
    sort x
    ** nearest neighbour 1:1 matching with caliper 0.20*SD
    psmatch2 [your intervention], pscore(pscore) caliper(0.024) noreplacement descending

    ** remove matched controls by changing propensity score to 92
    replace pscore=92 if _treated==0 & _weight==1

    ** keep ID of matched control by generating new _n2 variable
    gen _n2=_n1


    gen pair2 = _id if pscore==92
    replace pair2 = _n2 if _treated==1
    gsort pair2 _treated
    replace pair=pair[_n+1] if pair==. & pair2!=.
    bysort pair: egen paircount = count(pair)
    drop pair2 paircount


    *** Round 3
    sort x
    ** nearest neighbour 1:1 matching with caliper 0.20*SD
    psmatch2 [your intervention], pscore(pscore) caliper(0.024) noreplacement descending

    ** remove matched controls by changing propensity score to 93
    replace pscore=93 if _treated==0 & _weight==1

    ** keep ID of matched control by generating new _n3 variable
    gen _n3=_n1

    gen pair2 = _id if pscore==93
    replace pair2 = _n3 if _treated==1
    gsort pair2 _treated
    replace pair=pair[_n+1] if pair==. & pair2!=.
    bysort pair: egen paircount = count(pair)
    drop pair2


    **** Tidy up and recreate psmatch2 1:3 output

    *create 1:3 match descriptor for all matched
    gen one_to_n=(paircount-1)
    replace one_to_n=. if one_to_n==-1
    drop paircount
    sort _treated
    by _treated: tab one_to_n

    * reconstruct matches to original ID numbers
    gsort pair pscore
    replace _n2=id[_n+2] if pscore[_n+2]==92 & _n2!=.
    replace _n3=id[_n+3] if pscore[_n+3]==93 & _n3!=.
    drop _n1
    rename n1 _n1


    ** recreate output from 1:n matching with psmatch2
    replace _id=id
    replace _weight=1 if _treated==1 & _n1!=.
    replace _weight=1 if _treated==0 & one_to_n==1
    replace _weight=0.5 if _treated==0 & one_to_n==2
    replace _weight=1/3 if _treated==0 & one_to_n==3
    replace _nn=0 if _treated==0
    replace _nn=0 if _treated==1 & _n1==.
    replace _nn=1 if _treated==1 & _n1!=. & _n2==.
    replace _nn=2 if _treated==1 & _n1!=. & _n2!=. & _n3==.
    replace _nn=3 if _treated==1 & _n1!=. & _n2!=. & _n3!=.
    replace _support=1 if _treated==1 & _weight==1
    replace _pscore=pscore_original
    order _pscore _treated _support _weight _id _n1 _n2 _n3 _nn, after(x)
    sort pair

    **** check if this worked by using pstest
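
    * possible sanity checks, a sketch only: [age sex] and [outcome] are placeholder
    * names for your own covariates and outcome, not variables created above

    * each matched set should contain exactly one treated unit and 1 to 3 controls
    * (assert stops with an error if a check fails, which would point to a problem in the pairing)
    bysort pair: egen n_treated = total(_treated) if pair<.
    bysort pair: gen setsize = _N if pair<.
    assert n_treated==1 if pair<.
    assert inrange(setsize,2,4) if pair<.
    drop n_treated setsize

    * covariate balance before and after matching
    pstest age sex, both

    * conditional logistic regression on the matched sets
    clogit outcome _treated, group(pair)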






  • #2
    Hello Johannes,

    Thanks so much for posting this. I am trying to solve the same issue with my propensity score model, so it is exciting to see that you've found a solution. I'll post more results once I have finished the model. I have tried your code above; however, some of my matched sets have a one_to_n of only 1 or 2. Here's the tabulation:

    . tab one_to_n

       one_to_n |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              1 |         38        0.43        0.43
              2 |        153        1.71        2.14
              3 |      8,732       97.86      100.00
    ------------+-----------------------------------
          Total |      8,923      100.00

    I assume this is because more matches couldn't be found and they were left at 1 and 2. Is that correct, and did the same thing happen with your data? Thanks very much.



    • #3
      I realise this is an old post, but I wanted to thank the author for posting it and sharing the code.
      I had been searching online for a long time for a way to do a 1:5 match in Stata.

      Finally got it right with the above program.
