
  • How to use results of psmatch2 in regression?

    Hi all,

    I have a rather unbalanced panel, which is why I apply nearest-neighbor matching before running the regression. In particular, I i) estimate the probit model

    Code:
    probit treated $covariates if event == -1
    ii) predict the propensity score
    Code:
    predict double ps
    iii) I use 10 nearest neighbor matching
    Code:
    psmatch2 treated if event == -1, outcome(hh_income) pscore(ps) neighbor(10) ai(10)
    bysort treated : tab _weight
    and now I would like to use these results in my regression. I read that one can either keep the matched sample, i.e.
    Code:
    keep if (_weight != . & treated == 0) | treated == 1
    and then run the regression or alternatively, one can use the weights in the regression, i.e.
    Code:
    gen ms_help = _weight
    bysort pid: egen ms = mean(ms_help)
    
    reghdfe working treated ... [w=ms], absorb(...) vce(...)
    Regarding this, I have two questions:
    1) What is the difference between keeping the matched sample and using the weights in the regression?
    2) If I want to use
    Code:
    [w=ms]
    , do I have to use analytical weights or pweights?

    Best,
    Kathrin



  • #2
    It is not correct to simply keep the matched sample: _weight takes different values even within the matched sample, so weights are still needed in a matched-sample regression. Instead, run the regression with the weights directly. Observations with missing weights are excluded automatically. You can just use analytical weights.
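
    A minimal sketch of the point about missing weights, on made-up toy data (the variable names y and ms and all values are assumptions for illustration, not the poster's data): rows whose weight is missing should drop out of the estimation sample without any keep statement.
    Code:
    ```stata
    * Toy data: ms is missing for two unmatched controls
    clear
    input y treated ms
    1 1 1
    0 1 1
    1 0 .4
    0 0 .6
    1 0 .
    0 0 .
    end
    * Analytic weights; rows with missing ms are excluded automatically
    regress y treated [aw = ms]
    * Number of observations actually used in the estimation
    display e(N)
    ```
    Comparing e(N) with _N shows how many rows the weights removed.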



    • #3
      That makes sense! Thanks a lot for the quick answer.
      Is it okay to do it with the mean (like I did above)?



      • #4
        Matching is between subjects. Once a subject obtains its weight in period -1, that weight should remain constant for the subject in the following periods. Your code seems correct on this point.
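
        A small illustration of this with invented data (w_match is a hypothetical stand-in for _weight, and all values are made up): because egen's mean() ignores missing values, the single period -1 weight is copied to every period of the same pid, while a unit that was never matched keeps a missing weight.
        Code:
        ```stata
        * Toy panel: three units, weight observed only in period -1
        clear
        input pid event treated w_match
        1 -1 1 1
        1  0 1 .
        1  1 1 .
        2 -1 0 .3
        2  0 0 .
        2  1 0 .
        3 -1 0 .
        3  0 0 .
        3  1 0 .
        end
        * Spread each unit's period -1 weight to all of its periods
        bysort pid: egen double ms = mean(w_match)
        * Inspect the result unit by unit
        list, sepby(pid)
        ```
        In this toy panel, pid 3 stands for an unmatched control, so its ms stays missing in every period.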



        • #5
          Cool, thanks!



          • #6
            Hi,

            I have a follow-up question on this. I am doing a similar propensity score matching with multiple neighbours. I know that one cannot use [fweight=weight] in the regression, since the weights are likely to be fractional when multiple neighbours are used. I see that the original poster used a workaround:
            Code:
            gen ms_help = _weight
            bysort pid: egen ms = mean(ms_help)
            However, I wonder what "pid" stands for here, so that I can better understand how to tackle my issue. At what level is this ID specified?

