Propensity score matching

Diego Malo

Join Date: Jun 2019

Posts: 140
#1

Propensity score matching

15 Jun 2021, 01:40

Good morning,

I would like to evaluate the effect of a shock (my treatment, call it T) on a variable (Y), with X (a binary variable) as my covariate, using propensity score matching. I would like to do the following:

1) Estimate the propensity score using a Logit model
2) Apply a matching algorithm (kernel matching) using the differences in the propensity score.

I have been looking and have found two ways to do this in Stata: psmatch2 and kmatch.

With psmatch2, I would do it this way: psmatch2 T, outcome(Y) pscore(X) kerneltype(uniform) logit

With kmatch, I would do it this way: kmatch ps T X (I am not sure where to put the outcome with this command)

Am I doing this correctly with psmatch2?

On the other hand, I would like to ask (this is probably more a theoretical question than a question about using Stata) whether it is possible to add time fixed effects (year) and country fixed effects to this strategy. If so, do I have to include them in the pscore?

Best,

Diego.
Felix Bittmann

Join Date: Aug 2018

Posts: 714
#2

15 Jun 2021, 01:46

In kmatch the syntax would be like

Code:

kmatch ps treatment xvars (outcome)
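As a minimal sketch on simulated data, the two commands from #1 could be written as below. The variables x1 and x2 and the data-generating process are made up for illustration. In psmatch2 the covariates go directly after the treatment variable, and the kernel option has to be specified to request kernel matching (kerneltype() only selects the kernel). The pscore() option expects a variable that already holds an estimated propensity score, so a covariate should not go there. As far as I know, kmatch ps estimates the propensity score with a logit by default.

```stata
* Requires community-contributed packages (one-time install):
*   ssc install psmatch2
*   ssc install kmatch
*   ssc install moremata      // kmatch depends on it
clear
set seed 12345
set obs 2000

* Simulated data; x1, x2 and all coefficients are hypothetical
generate byte   x1 = runiform() < 0.5
generate double x2 = rnormal()
generate byte   T  = runiform() < invlogit(-0.5 + 0.8*x1 + 0.5*x2)
generate double Y  = 1 + 2*T + x1 + 0.5*x2 + rnormal()

* psmatch2: logit propensity score, kernel matching with a uniform kernel
psmatch2 T x1 x2, outcome(Y) kernel kerneltype(uniform) logit

* kmatch: covariates after the treatment, outcome in parentheses, ATT requested
kmatch ps T x1 x2 (Y), att
```

The two commands choose their bandwidths differently, so the estimates will generally not be identical even with the same kernel.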

To include FEs, you can use exact matching I assume, like

Code:

kmatch ps treatment xvars (outcome), ematch(country)

Best wishes

Stata 18.0 MP | ORCID | Google Scholar
Diego Malo

Join Date: Jun 2019

Posts: 140
#3

15 Jun 2021, 01:54

Dear Felix Bittmann

Thank you for your answer.

If I also want to include year fixed effects, should I write kmatch ps treatment xvars (outcome), ematch(country year), or use i.country and i.year?

On the other hand, I understand exact matching as matching two units with the same pscore. Is that what you mean by exact matching when I add ematch? From what I have seen, ematch can only be used with kmatch ra instead of kmatch ps.

Best,

Diego.

Last edited by Diego Malo; 15 Jun 2021, 02:01.
Felix Bittmann

Join Date: Aug 2018

Posts: 714
#4

15 Jun 2021, 02:09

What ematch does is restrict matching to within a given group, which creates an exact match on that variable. For example, if you enter country, people are only matched to people from the same country. I think this is as close to FE as you can get. If you enter country as a regular xvar, it is absorbed into the PS, so the matches are potentially not exact any more. Make sure to test balancing afterwards using kmatch sum. And yes, you can include as many variables in ematch as you like, but for this to work well the sample must be large and diverse enough (so small countries might be a problem).
And as far as I know, ematch should work fine with the regular kmatch ps.
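A rough sketch of exact matching on both country and year, again on simulated data (the five countries, four years, covariate x1 and all coefficients are hypothetical). Before matching, it tabulates the treatment within each country-year cell, since a cell that lacks treated or untreated units cannot contribute exact matches.

```stata
* Assumes kmatch and moremata are installed from SSC
clear
set seed 2021
set obs 6000

* Simulated data; all names and coefficients are made up for illustration
generate int    country = ceil(5*runiform())          // 5 hypothetical countries
generate int    year    = 2014 + ceil(4*runiform())   // years 2015 to 2018
generate double x1      = rnormal()
generate byte   T       = runiform() < invlogit(0.5*x1 + 0.2*(country - 3))
generate double Y       = 1 + 2*T + x1 + 0.3*country + 0.1*(year - 2015) + rnormal()

* Count treated and untreated units in each country-year cell
egen cell = group(country year)
tabulate cell T

* Kernel matching on the propensity score, only within the same country and year
kmatch ps T x1 (Y), att ematch(country year)
```

With real data, sparse cells in the tabulation are the first place to look if many treated units end up unmatched.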

Best wishes

Diego Malo

Join Date: Jun 2019

Posts: 140
#5

15 Jun 2021, 02:21

Thank you for your answer Felix Bittmann . It helps me a lot. I appreciate it!

I will check the balancing property using the pscore command. If my intention is to add FE for country and year, should I check the balancing property with those covariates or just with my initial binary variable X?

Thank you again,

Best regards,

Diego.
Felix Bittmann

Join Date: Aug 2018

Posts: 714
#6

15 Jun 2021, 02:23

The command will always check all variables in the models (and as long as the algorithm converges, the exact vars are always perfectly matched).

Best wishes

Diego Malo

Join Date: Jun 2019

Posts: 140
#7

15 Jun 2021, 02:31

But is it possible to check the balancing property with kmatch? I was thinking of checking it like this: pscore treatment xvars, where xvars is just my binary covariate.
Felix Bittmann

Join Date: Aug 2018

Posts: 714
#8

15 Jun 2021, 02:32

Sure, simply run

Code:

kmatch sum

after the matching command.
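For illustration, a short sketch of the whole sequence on simulated data (the variables country and x1 and the coefficients are hypothetical). The balance table is requested right after the matching command, because kmatch summarize works on the results that kmatch left in memory.

```stata
* Assumes kmatch and moremata are installed from SSC
clear
set seed 42
set obs 3000

* Simulated data; all names and coefficients are made up for illustration
generate int    country = ceil(4*runiform())
generate double x1      = rnormal()
generate byte   T       = runiform() < invlogit(0.6*x1)
generate double Y       = 1 + 2*T + x1 + rnormal()

* Matching step: kernel matching on the propensity score within country
kmatch ps T x1 (Y), att ematch(country)

* Balance check: compares the covariates in the raw and the matched samples
kmatch summarize
```

In the output, the thing to look for is whether the standardized differences between treated and untreated units shrink toward zero in the matched sample.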

Best wishes

Matthew Williams

Join Date: Feb 2021

Posts: 196
#9

15 Jun 2021, 03:11

Originally posted by Felix Bittmann

Sure, simply run

Code:

kmatch sum

after the matching command.

Hi Felix,

To follow up on the PSM issues in this thread, I would like to ask a question about the PSM procedure for repeated cross-sectional data. In a panel setup with two periods, say t1 and t2, I understand that the covariates should be taken from t1 and the outcome from t2. But my setup is repeated cross-sectional data for two years, say T1 and T2. For example, I want to estimate the effect of unemployment on individuals' health, where Y is an indicator of poor health (=1), U indicates unemployment status (=1), and X denotes a set of covariates. My question is how I should choose X. I don't know whether I should take X from T1 (which does not seem to make sense, since this is not panel data) or just pool the two years and use X from the pooled data, regardless of whether X belongs to T1 or T2.

Thank you.
Felix Bittmann

Join Date: Aug 2018

Posts: 714
#10

15 Jun 2021, 04:15

I would argue that if t1 captures your pre-treatment status, you should match on these variables and then use the results (or the matching weights) when computing the outcome in t2. However, I have never tried this myself, and I would recommend looking in the literature for examples or further advice.

Best wishes
