Propensity score matching with nearest neighbor

Md Sumon Ali

Join Date: Apr 2025

Posts: 4
#1

19 Aug 2025, 18:29

Hello everyone,

I am trying to use a matching approach in my study, with the following Stata code:
kmatch ps treated x1 x2 x3 x4, att ematch(state county) wgen(weight) nn(#) caliper(0.03)

In my data, I have about 65,000 treated observations and 2.5 million controls. Since the ratio of controls to treated observations is very high (roughly 38 to 1), I'm a bit confused about what number of neighbors (nn) I should use, e.g., 1, 2, 5, 10, or even 100.
Is there any restriction on the choice of nn, or any guidance on how to decide the number of nearest neighbors in this kind of setting?

Thanks in advance.
Felix Bittmann

Join Date: Aug 2018

Posts: 742
#2

19 Aug 2025, 23:46

This is a trade-off between bias and variance. The fewer neighbors you use, the closer the matches and the lower the bias, but the higher the variance; adding neighbors reduces the variance, but the additional matches are progressively worse and can introduce bias. Given your large sample size, you can test various options, like 2, 5, 10, or 15. Using more matches than that might not be beneficial for your data. In any case, I would perform robustness checks, for example, kernel matching (by leaving the nn() option out) or entropy balancing with the PS option (kmatch eb (...), comsup). For each model, make sure to check the balance after matching (kmatch summarize). This can help you decide which model gives you the best results.
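
A minimal sketch of such a comparison, reusing the command from post #1. The variable names (treated, x1-x4, state, county) and the ematch() and caliper(0.03) settings are taken from that post as-is. wgen(weight) is left out here only because the weight variable would already exist on the second pass of the loop. The list of neighbor counts is an illustration, not a recommendation.

```stata
* Sketch: compare covariate balance across several neighbor counts.
* Command and options are copied from post #1; only nn() varies.
foreach k of numlist 1 2 5 10 15 {
    display as text _newline "Nearest-neighbor matching with nn(`k')"
    kmatch ps treated x1 x2 x3 x4, att ematch(state county) nn(`k') caliper(0.03)
    kmatch summarize    // balance table for this specification
}

* Robustness check: kernel matching (the same command without nn()).
kmatch ps treated x1 x2 x3 x4, att ematch(state county)
kmatch summarize
```

With 2.5 million controls each run may take a while, so it can be worth trying the loop on a random subsample first.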

Best wishes

Stata 18.0 MP | ORCID | Google Scholar