  • Propensity score matching with nearest neighbor

    Hello everyone,

    I am trying to use a matching approach in my study. I am using the following Stata code:
    1. kmatch ps treated x1 x2 x3 x4, att ematch(state county) wgen(weight) nn(#) caliper(0.03)
    In my data, I have about 65,000 treated observations and 2.5 million controls. Since the ratio of controls to treated observations is very high, I'm a bit confused about what number of neighbors (nn) I should use, e.g., 1, 2, 5, 10, or even 100.
    Is there any restriction on the choice of nn, or any guidance on how to decide the number of nearest neighbors in this kind of setting?

    Thanks in advance.

  • #2
    This is a choice between bias and variance. The fewer neighbors you use, the lower the bias, because each treated observation is compared only with its closest controls, but the higher the variance. Adding neighbors lowers the variance, but the additional matches might be bad. Given your large sample size, you can test various options, like 2, 5, 10, or 15. Using more matches than that might not be beneficial for your data. In any case, I would perform robustness checks, for example, kernel matching (by leaving the nn() option out) or entropy balancing with the PS option (kmatch eb (...), comsup). For each model, make sure to test the balancing after the matching (kmatch summarize). This can help you decide which model gives you the best results.
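
    As a rough sketch of that workflow, the loop below reruns the command from the first post for a few neighbor counts and prints the balance table after each run, then repeats the check for kernel matching. It assumes the user-written kmatch package is installed and reuses the variable names and options from the original command as they were posted. The candidate values in the numlist are only illustrative, and wgen() is left out to sidestep a possible name clash when the same weight variable would be created on every pass.

    ```stata
    * Sketch only: variable names and options are taken from the first post.
    * Assumes kmatch is installed (ssc install kmatch).
    foreach k of numlist 1 2 5 10 15 {
        display as text _newline "Nearest neighbors: `k'"
        * Same specification as in the first post, varying only nn()
        kmatch ps treated x1 x2 x3 x4, att ematch(state county) nn(`k') caliper(0.03)
        * Covariate balance before and after matching for this run
        kmatch summarize
    }

    * Robustness check: leaving nn() out requests kernel matching instead
    kmatch ps treated x1 x2 x3 x4, att ematch(state county)
    kmatch summarize
    ```

    Comparing the balance tables across runs shows at which point extra neighbors start to worsen balance.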
    Best wishes
