-psmatch2- graph for propensity score matching

Navid Asgari

Join Date: Jul 2025

Posts: 30
#1

24 Mar 2015, 14:52

Hi,

I have been trying different Stata commands for difference-in-differences estimation. Many commands help you get the work done, but they do not offer much in terms of diagnostics and graphs.

For example, the user-written command -diff- uses -psmatch2- (also user-written) for kernel matching. After running -diff- you can use -psgraph-, a post-estimation command of -psmatch2-, and you will get a graph like the following:

There are a few issues with this graph: the vertical axis has no unit, and it is not what a PSM graph should look like. A PSM graph should show two things: 1) the propensity scores of treatment-group observations versus control-group observations before matching, and 2) the same comparison after matching.

An example of such a graph is:

The second picture is copied from the following webpage: http://sacemaquarterly.com/methodolo...egression.html

Is there any way to get such a neat graph using the current commands and options of Stata?

Thanks,
Navid
Bert Jung

Join Date: Apr 2014

Posts: 16
#2

25 Mar 2015, 20:15

Not sure if that's what you're asking but take a look at the output that -psmatch2- leaves behind after estimation. For instance, by default -psmatch2- saves the propensity score in a variable called _pscore. You can use that variable to create your own -twoway- plot.

Code:

sysuse auto, clear
psmatch2 foreign mpg, out(price)
twoway (kdensity _pscore if _treated==1) (kdensity _pscore if _treated==0, lpattern(dash)), legend( label( 1 "treated") label( 2 "control" ) ) xtitle("propensity score")

You can create two such plots and stack them using -graph combine-.

Bert
Navid Asgari

Join Date: Jul 2025

Posts: 30
#3

26 Mar 2015, 10:45

Hi Bert,

Thanks for the suggestion. The code plots _pscore for treatment and control groups. This is essentially the comparison of _pscores before matching:

But one then also needs to look at the graph comparing the _pscore of the treatment and control groups after matching.
Using the output of the code you wrote, the next step would be to look at the variable "_n1", which indicates exactly which member of the control group is matched with each member of the treatment group. The graph will look like the following:

Is this the correct way?

-diff- uses -psmatch2-, but does not produce any of the outputs that -psmatch2- produces. These outputs are necessary for drawing the second graph. Am I right?

Thanks,
Navid
Richard Hofler

Join Date: Apr 2014

Posts: 12
#4

28 Mar 2015, 09:38

Navid,

Your post #3 contains two graphs. Will you please show the code that you used to generate the 2nd graph?

Thanks,
Richard
Navid Asgari

Join Date: Jul 2025

Posts: 30
#5

28 Mar 2015, 14:43

Hi Richard,

The second graph shows the propensity scores of the treated group and of the untreated (i.e., control) observations that are matched with the treated group.
So the first step is to identify which untreated observations are matched and which are not:

gen match=_n1
replace match=_id if match==.
duplicates tag match, gen(dup)
twoway (kdensity _pscore if _treated==1) (kdensity _pscore if _treated==0 & dup>0, lpattern(dash)), legend( label( 1 "treated") label( 2 "control" ) ) xtitle("propensity score")

Navid
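
For anyone following along, a possibly simpler way to flag the matched controls is sketched below. It assumes that -psmatch2- leaves _weight missing for controls that were never used as a match, which is how its help file describes the variable; the name matched_control is made up for this illustration.

```stata
sysuse auto, clear
* psmatch2 is user-written (ssc install psmatch2)
psmatch2 foreign mpg, out(price)

* assumption: unmatched controls have _weight == . after psmatch2
gen byte matched_control = (_treated == 0 & _weight < .)

twoway (kdensity _pscore if _treated == 1) ///
       (kdensity _pscore if matched_control == 1, lpattern(dash)), ///
       legend(label(1 "treated") label(2 "matched control")) ///
       xtitle("propensity score")
```

With default one-to-one nearest-neighbor matching this should select the same controls as the dup>0 condition above, without creating the match and dup variables.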
Richard Hofler

Join Date: Apr 2014

Posts: 12
#6

29 Mar 2015, 19:03

Hi Navid,

Thank you.

Richard
Richard Hofler

Join Date: Apr 2014

Posts: 12
#7

29 Mar 2015, 19:32

Hi All,

In case anyone viewing this thread is interested, here's the code to do what Navid did, with a few additions to display both graphs side by side with the y-axes on a common scale.

sysuse auto, clear
psmatch2 foreign mpg, out(price)

// compare _pscores before matching & save graph to disk
twoway (kdensity _pscore if _treated==1) (kdensity _pscore if _treated==0, ///
lpattern(dash)), legend( label( 1 "treated") label( 2 "control" ) ) ///
xtitle("propensity scores BEFORE matching") saving(before, replace)

// compare _pscores *after* matching & save graph to disk
gen match=_n1
replace match=_id if match==.
duplicates tag match, gen(dup)
twoway (kdensity _pscore if _treated==1) (kdensity _pscore if _treated==0 ///
& dup>0, lpattern(dash)), legend( label( 1 "treated") label( 2 "control" )) ///
xtitle("propensity scores AFTER matching") saving(after, replace)

// combine these two graphs that were saved to disk
// put both graphs on y axes with common scales
graph combine before.gph after.gph, ycommon

Richard
Navid Asgari

Join Date: Jul 2025

Posts: 30
#8

31 Mar 2015, 09:25

Hi Richard,

Do you know what one should do to produce a graph similar to the second one (after matching) when kernel matching is used?

Thanks,
Navid
David Radwin

Join Date: Mar 2014

Posts: 368
#9

02 Apr 2015, 12:49

I recommend using grc1leg by Vince Wiggins (findit grc1leg) instead of graph combine.

David Radwin
Senior Researcher, California Competes
californiacompetes.org
Pronouns: He/Him
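
A minimal sketch of what that could look like, reusing the before/after code from post #7. It assumes grc1leg is installed, that it accepts -graph combine- options such as ycommon, and that its legendfrom() option names the graph whose legend is kept; the graphs are held in memory under the names before and after rather than saved to disk.

```stata
sysuse auto, clear
psmatch2 foreign mpg, out(price)

* before matching: keep the graph in memory under the name "before"
twoway (kdensity _pscore if _treated==1) ///
       (kdensity _pscore if _treated==0, lpattern(dash)), ///
       legend(label(1 "treated") label(2 "control")) ///
       xtitle("propensity scores BEFORE matching") name(before, replace)

* after matching: same steps as post #7, graph named "after"
gen match = _n1
replace match = _id if match == .
duplicates tag match, gen(dup)
twoway (kdensity _pscore if _treated==1) ///
       (kdensity _pscore if _treated==0 & dup>0, lpattern(dash)), ///
       legend(label(1 "treated") label(2 "control")) ///
       xtitle("propensity scores AFTER matching") name(after, replace)

* combine with a single shared legend taken from the first graph
grc1leg before after, ycommon legendfrom(before)
```

Because both panels use the same legend entries, a single shared legend avoids repeating it under each panel.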
Richard Hofler

Join Date: Apr 2014

Posts: 12
#10

04 Apr 2015, 06:54

Hi Navid,

Sorry, I don't know how to do that.

Richard
Foruhar Moayeri

Join Date: Jun 2015

Posts: 1
#11

04 Jun 2015, 21:01

Dear Navid

I just found your posts. Did you try the following syntax from the help file of "pstest" command?

pstest varname [if exp] [in range], density|box both [treated(varname) mweight(varname) support(varname) outlier title(string) saving(filename[, replace]) atu ]

with _pscore as varname.

Foruhar
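
As a rough illustration of the syntax quoted above, and assuming the installed version of pstest still accepts the density and both options in this form, a call after kernel matching might look like this:

```stata
sysuse auto, clear
* kernel matching, so _n1 and _id are not needed
psmatch2 foreign mpg, out(price) kernel

* density plots of the propensity score; "both" is taken from the
* syntax quoted above and is assumed to request before and after matching
pstest _pscore, density both
```

By default pstest picks up _treated, _weight and _support from the preceding -psmatch2- run, so no extra options should be needed here.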
Are Magnus

Join Date: Oct 2015

Posts: 11
#12

12 Mar 2016, 12:02

Hi all,

This is a very helpful thread!

Does anyone know how to make the after-matching graph when kernel or radius matching is used, either based on the code above or differently?

Thanks very much for any help!
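
One possible approach, offered as a sketch rather than a definitive answer: after kernel matching there is no single matched control per treated observation, but -psmatch2- stores in _weight the overall weight given to each control. Assuming that reading of _weight is right, the after-matching density for controls can be drawn as a weighted kernel density:

```stata
sysuse auto, clear
psmatch2 foreign mpg, out(price) kernel

* assumption: for controls, _weight holds the total kernel weight
* received across all treated observations (missing if never used)
twoway (kdensity _pscore if _treated==1) ///
       (kdensity _pscore if _treated==0 [aweight=_weight], lpattern(dash)), ///
       legend(label(1 "treated") label(2 "weighted control")) ///
       xtitle("propensity scores AFTER kernel matching")
```

The same weighted plot should carry over to radius matching, since _weight is created there as well.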
Alhassane Bah

Join Date: Mar 2016

Posts: 4
#13

16 Mar 2016, 02:30

Hi Navid and Richard,
I would highly appreciate it if you could help me find out where the variables "_n1" and "_id" in the commands below come from. I am trying to draw the AFTER MATCHING graph, but Stata always returns the error "_n1 not found". Thank you in advance for your help.

// compare _pscores *after* matching & save graph to disk
gen match=_n1
replace match=_id if match==.
duplicates tag match, gen(dup)
twoway (kdensity _pscore if _treated==1) (kdensity _pscore if _treated==0 ///
& dup>0, lpattern(dash)), legend( label( 1 "treated") label( 2 "control" )) ///
xtitle("propensity scores AFTER matching") saving(after, replace)

David Radwin

Join Date: Mar 2014
Posts: 368

#14

16 Mar 2016, 10:53

Alhassane,

psmatch2 does not create the variables _n1 or _id in your case because those are specific to nearest-neighbor matching; they are not created with kernel matching or radius matching. _id is the ID that psmatch2 assigns to each observation, and _n1 is the ID of its nearest neighbor after matching. See this simple example comparing the three methods and the variables they create:

Code:

. sysuse nlsw88, clear
(NLSW, 1988 extract)

. psmatch2 married grade collgrad

Probit regression                               Number of obs     =      2,244
                                                LR chi2(2)        =       0.43
                                                Prob > chi2       =     0.8051
Log likelihood = -1463.2462                     Pseudo R2         =     0.0001

------------------------------------------------------------------------------
     married |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       grade |   .0108616   .0177766     0.61   0.541    -.0239799    .0457031
    collgrad |  -.0360274    .105997    -0.34   0.734    -.2437776    .1717228
       _cons |   .2305452   .2149452     1.07   0.283    -.1907396    .6518299
------------------------------------------------------------------------------
There are observations with identical propensity score values.
The sort order of the data could affect your results.
Make sure that the sort order is random before calling psmatch2.

. describe _*

              storage   display    value
variable name   type    format     label      variable label
-----------------------------------------------------------------------------------------------------------------
_pscore         double  %10.0g                psmatch2: Propensity Score
_treated        byte    %9.0g      _treated   psmatch2: Treatment assignment
_support        byte    %11.0g     _support   psmatch2: Common support
_weight         double  %10.0g                psmatch2: weight of matched controls
_id             int     %9.0g                 psmatch2: Identifier (ID)
_n1             int     %8.0g                 psmatch2: ID of nearest neighbor nr. 1
_nn             float   %9.0g                 psmatch2: # matched neighbors
_pdif           double  %10.0g                psmatch2: abs(pscore - pscore[nearest neighbor])

. sysuse nlsw88, clear
(NLSW, 1988 extract)

. psmatch2 married grade collgrad, kernel

Probit regression                               Number of obs     =      2,244
                                                LR chi2(2)        =       0.43
                                                Prob > chi2       =     0.8051
Log likelihood = -1463.2462                     Pseudo R2         =     0.0001

------------------------------------------------------------------------------
     married |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       grade |   .0108616   .0177766     0.61   0.541    -.0239799    .0457031
    collgrad |  -.0360274    .105997    -0.34   0.734    -.2437776    .1717228
       _cons |   .2305452   .2149452     1.07   0.283    -.1907396    .6518299
------------------------------------------------------------------------------

. describe _*

              storage   display    value
variable name   type    format     label      variable label
-----------------------------------------------------------------------------------------------------------------
_pscore         double  %10.0g                psmatch2: Propensity Score
_treated        byte    %9.0g      _treated   psmatch2: Treatment assignment
_support        byte    %11.0g     _support   psmatch2: Common support
_weight         double  %10.0g                psmatch2: weight of matched controls

. sysuse nlsw88, clear
(NLSW, 1988 extract)

. psmatch2 married grade collgrad, radius

Probit regression                               Number of obs     =      2,244
                                                LR chi2(2)        =       0.43
                                                Prob > chi2       =     0.8051
Log likelihood = -1463.2462                     Pseudo R2         =     0.0001

------------------------------------------------------------------------------
     married |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       grade |   .0108616   .0177766     0.61   0.541    -.0239799    .0457031
    collgrad |  -.0360274    .105997    -0.34   0.734    -.2437776    .1717228
       _cons |   .2305452   .2149452     1.07   0.283    -.1907396    .6518299
------------------------------------------------------------------------------

. describe _*

              storage   display    value
variable name   type    format     label      variable label
-----------------------------------------------------------------------------------------------------------------
_pscore         double  %10.0g                psmatch2: Propensity Score
_treated        byte    %9.0g      _treated   psmatch2: Treatment assignment
_support        byte    %11.0g     _support   psmatch2: Common support
_weight         double  %10.0g                psmatch2: weight of matched controls

David Radwin
Senior Researcher, California Competes
californiacompetes.org
Pronouns: He/Him
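
Building on the explanation above, here is a small sketch of how the after-matching code could guard against the "_n1 not found" error by checking for the variables first. The message text is only illustrative.

```stata
sysuse nlsw88, clear
psmatch2 married grade collgrad, kernel

* _n1 and _id are only expected after nearest-neighbor matching,
* so confirm they exist before using them
capture confirm variable _n1 _id
if _rc {
    display as text "_n1/_id not found: matching was not nearest neighbor"
}
else {
    gen match = _n1
    replace match = _id if match == .
    duplicates tag match, gen(dup)
}
```

Run after kernel or radius matching, this takes the first branch and prints the message instead of stopping with an error.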


Alhassane Bah

Join Date: Mar 2016

Posts: 4
#15

17 Mar 2016, 07:36

Thank you so much, Sir, for your clarification. But do you have any idea why my graphs keep coming out flipped like this? I mean that the BEFORE graph looks better matched than the AFTER graph. Should I just change the titles, or is something wrong with my balancing?
Thank you so much again David.
Attached Files

1 Photo

Last edited by Alhassane Bah; 17 Mar 2016, 07:41.
