psmatch2--selecting parameter for ai() option to calculate Abadie-Imbens standard errors

David Radwin

Join Date: Mar 2014

Posts: 369
#1

04 Feb 2015, 16:36

A recent update to psmatch2 (Leuven & Sianesi, available from SSC) added an option, ai(), which the help file describes as follows:

ai(integer) calculate the heteroskedasticity-consistent analytical standard errors proposed by Abadie and Imbens (2006) by specifying the number of neighbors to be used to calculate the conditional variance (their formula (14)). With option altvariance one can specify to use the estimator of Abadie et al. (2004) instead.

For propensity score matching, should the ai() value be the same number as the number of nearest neighbors used for matching, e.g., ai(1) for 1:1 matching? If not, how does one select a value for ai()?

References:
Abadie, A., Drukker, D., Herr, J. L., & Imbens, G. W. (2004). "Implementing matching estimators for average treatment effects in Stata", Stata Journal 4, 290-311.
Abadie, A., & Imbens, G. W. (2006). "Large sample properties of matching estimators for average treatment effects", Econometrica 74(1), 235-267.

David Radwin
Senior Researcher, California Competes
californiacompetes.org
Pronouns: He/Him
Melissa Garrido

Join Date: Apr 2014

Posts: 75
#2

05 Feb 2015, 10:01

Hi David,
I believe you are correct - the ai() value should be the number of matches used for each treated individual.
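
A minimal sketch of that rule, assuming the cattaneo2 example dataset (loaded with webuse) and an illustrative covariate list; neighbor() sets the number of matches per treated individual and ai() is given the same number. This only illustrates the suggestion above and is not a claim about what the psmatch2 authors recommend.

```stata
* Sketch only: dataset and covariates are illustrative assumptions.
webuse cattaneo2, clear

* 1:1 nearest-neighbor matching on the propensity score, so ai(1)
psmatch2 mbsmoke mmarried c.mage##c.mage fbaby medu, out(bweight) logit neighbor(1) ai(1)

* Two nearest neighbors per treated individual, so ai(2)
psmatch2 mbsmoke mmarried c.mage##c.mage fbaby medu, out(bweight) logit neighbor(2) ai(2)
```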
Hope this helps,
Melissa
David Radwin

Join Date: Mar 2014

Posts: 369
#3

05 Feb 2015, 14:51

Thank you Melissa!

David

Sam Huang

Join Date: Jan 2015

Posts: 20
#4

18 May 2016, 02:35

Originally posted by Melissa Garrido

Hi David,
I believe you are correct - the ai() value should be the number of matches used for each treated individual.
Hope this helps,
Melissa

Hi Melissa

Late to this, but how would this work if the -ties- option for psmatch2 is chosen, so that each treated individual may be matched to more than one control with the same (tied) propensity score? For nearest-neighbour (1) matching with the -ties- option, would it still be ai(1)?
Daniel Colombo

Join Date: Mar 2016

Posts: 6
#5

05 Jul 2016, 13:18

Dear all,

I just read this and other posts, and I thought one of you could help me with a simple question on the subject. I've recently learned about the ai(#) option in psmatch2, and I'm trying to understand how to use it.

Before the command update, I used to bootstrap the standard errors to calculate the p-value or statistical significance of the ATT estimate. I understand that, with the ai(#) option, this is no longer necessary or appropriate when using nearest-neighbor or Mahalanobis matching.

My question is: how do I obtain the p-value or statistical significance now? Is it a straightforward interpretation of the t-stat? I.e., if |t-stat| > 1.96, do I have statistical significance at the 0.05 level (assuming N is sufficiently large)?

I've read Abadie and Imbens (2006).

I appreciate any thoughts and comments in advance.

All the best,
Daniel Colombo

Sebastian Geiger

Join Date: Oct 2015
Posts: 124

06 Jul 2016, 08:41

Hello,

Stata 13 (or higher) offers its own built-in propensity score matching command: -teffects psmatch-

According to its help file, -teffects psmatch- computes robust standard errors using the procedure derived by Abadie and Imbens (2006, 2011, 2012). I'm not sure whether -psmatch2- uses the same approach, since its help file cites only the 2006 paper. I tried to replicate the results from -teffects psmatch- with -psmatch2-. While I was able to reproduce the point estimate, the standard errors are not identical, even after trying multiple values for the -ai()- option. The help file of -teffects psmatch- says "teffects psmatch uses two matches in estimating the robust standard errors". I therefore expected ai(2) to make -psmatch2- reproduce the standard errors from -teffects psmatch-. However, it does not.

Since -psmatch2- still displays "Note: S.E. does not take into account that the propensity score is estimated.", I suppose that -teffects psmatch- takes this into account while -psmatch2- does not, and that this explains the difference. In Abadie and Imbens (2012), the authors propose a method to estimate the correct standard error.

Code:

. webuse cattaneo2, clear

. psmatch2 mbsmoke mmarried c.mage##c.mage fbaby medu, out(bweight) logit ties

Logistic regression                             Number of obs     =      4,642
                                                LR chi2(5)        =     375.00
                                                Prob > chi2       =     0.0000
Log likelihood = -2043.2504                     Pseudo R2         =     0.0841

-------------------------------------------------------------------------------
      mbsmoke |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
     mmarried |  -1.145706   .0918962   -12.47   0.000     -1.32582    -.965593
         mage |    .321518   .0638472     5.04   0.000     .1963798    .4466563
              |
c.mage#c.mage |  -.0060368   .0011849    -5.09   0.000    -.0083592   -.0037144
              |
        fbaby |  -.3864258   .0880445    -4.39   0.000    -.5589898   -.2138618
         medu |  -.1420833   .0173215    -8.20   0.000    -.1760328   -.1081338
        _cons |  -2.950915   .8102504    -3.64   0.000    -4.538976   -1.362853
-------------------------------------------------------------------------------
There are observations with identical propensity score values.
The sort order of the data could affect your results.
Make sure that the sort order is random before calling psmatch2.
----------------------------------------------------------------------------------------
        Variable     Sample |    Treated     Controls   Difference         S.E.   T-stat
----------------------------+-----------------------------------------------------------
         bweight  Unmatched | 3137.65972   3412.91159  -275.251871   21.4528037   -12.83
                        ATT | 3137.65972   3374.44447  -236.784751   26.0535546    -9.09
----------------------------+-----------------------------------------------------------
Note: S.E. does not take into account that the propensity score is estimated.

           | psmatch2:
 psmatch2: |   Common
 Treatment |  support
assignment | On suppor |     Total
-----------+-----------+----------
 Untreated |     3,778 |     3,778
   Treated |       864 |       864
-----------+-----------+----------
     Total |     4,642 |     4,642


. teffects psmatch (bweight) (mbsmoke mmarried c.mage##c.mage fbaby medu), atet

Treatment-effects estimation                   Number of obs      =      4,642
Estimator      : propensity-score matching     Matches: requested =          1
Outcome model  : matching                                     min =          1
Treatment model: logit                                        max =         74
----------------------------------------------------------------------------------------
                       |              AI Robust
               bweight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------------+----------------------------------------------------------------
ATET                   |
               mbsmoke |
(smoker vs nonsmoker)  |  -236.7848   26.57789    -8.91   0.000    -288.8765    -184.693
----------------------------------------------------------------------------------------
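
The comparison across several ai() values mentioned above could be sketched as a loop like the one below. It assumes that psmatch2 leaves the ATT and its standard error in r(att) and r(seatt) after each run; check the saved results listed in the psmatch2 help file before relying on those names.

```stata
* Sketch: print the psmatch2 ATT standard error for ai(1) through ai(4).
* Assumption: psmatch2 returns the ATT in r(att) and its S.E. in r(seatt).
webuse cattaneo2, clear
forvalues k = 1/4 {
    quietly psmatch2 mbsmoke mmarried c.mage##c.mage fbaby medu, out(bweight) logit ties ai(`k')
    display "ai(`k'): ATT = " %9.3f r(att) "   S.E. = " %9.4f r(seatt)
}
```

The printed standard errors can then be set side by side with the AI robust standard error reported by -teffects psmatch-.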

Last edited by Sebastian Geiger; 06 Jul 2016, 08:47.

Daniel Colombo

Join Date: Mar 2016
Posts: 6

08 Jul 2016, 08:45

Dear Sebastian,

Thank you for your reply. I can see you put some effort into helping me, and I appreciate it.

Unfortunately, I will be running the regressions in a data room where the computers only have Stata 12 installed. So ‘teffects’ is not an option, and ‘psmatch2’ will have to do.

Running some tests at home with Stata 13, I tried to replicate the ‘teffects’ results in ‘psmatch2’ and had not been successful so far. You managed it (at least for the coefficient) by including the ‘ties’ option, something I hadn’t tried before. I’m looking into it, so thank you for that.

I have two further comments:

1) Using your suggested code, I tested whether the two commands give the same results with two-neighbor matching. In this case not even the coefficients are the same (see results below). So I really don’t know what differs between the two estimations.

Code:

. webuse cattaneo2.dta, clear
(Excerpt from Cattaneo (2010) Journal of Econometrics 155: 138-154)

. teffects psmatch (bweight) (mbsmoke mmarried c.mage##c.mage fbaby medu), atet nn(2)

Treatment-effects estimation                    Number of obs      =      4642
Estimator      : propensity-score matching      Matches: requested =         2
Outcome model  : matching                                      min =         2
Treatment model: logit                                         max =        74
----------------------------------------------------------------------------------------
                       |              AI Robust
               bweight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------------+----------------------------------------------------------------
ATET                   |
               mbsmoke |
(smoker vs nonsmoker)  |  -230.8884   25.24801    -9.14   0.000    -280.3736   -181.4032
----------------------------------------------------------------------------------------

. psmatch2 mbsmoke mmarried c.mage##c.mage fbaby medu, out(bweight) logit ties n(2)

Logistic regression                               Number of obs   =       4642
                                                  LR chi2(5)      =     375.00
                                                  Prob > chi2     =     0.0000
Log likelihood = -2043.2504                       Pseudo R2       =     0.0841

-------------------------------------------------------------------------------
      mbsmoke |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
     mmarried |  -1.145706   .0918962   -12.47   0.000     -1.32582    -.965593
         mage |    .321518   .0638472     5.04   0.000     .1963798    .4466563
              |
c.mage#c.mage |  -.0060368   .0011849    -5.09   0.000    -.0083592   -.0037144
              |
        fbaby |  -.3864258   .0880445    -4.39   0.000    -.5589898   -.2138618
         medu |  -.1420833   .0173215    -8.20   0.000    -.1760328   -.1081338
        _cons |  -2.950915   .8102504    -3.64   0.000    -4.538976   -1.362853
-------------------------------------------------------------------------------
There are observations with identical propensity score values.
The sort order of the data could affect your results.
Make sure that the sort order is random before calling psmatch2.
----------------------------------------------------------------------------------------
        Variable     Sample |    Treated     Controls   Difference         S.E.   T-stat
----------------------------+-----------------------------------------------------------
         bweight  Unmatched | 3137.65972   3412.91159  -275.251871   21.4528037   -12.83
                        ATT | 3137.65972     3368.866  -231.206281   25.6247909    -9.02
----------------------------+-----------------------------------------------------------
Note: S.E. does not take into account that the propensity score is estimated.

           | psmatch2:
 psmatch2: |   Common
 Treatment |  support
assignment | On suppor |     Total
-----------+-----------+----------
 Untreated |     3,778 |     3,778 
   Treated |       864 |       864 
-----------+-----------+----------
     Total |     4,642 |     4,642

2) You mentioned that ‘psmatch2’ displays the message "Note: S.E. does not take into account that the propensity score is estimated.". But note that, if you use the ai(#) option, the note is replaced by "Sample S.E." (see below). So I’m not sure that is the issue.

Code:

. psmatch2 mbsmoke mmarried c.mage##c.mage fbaby medu, out(bweight) logit ties n(2) ai(2)

Logistic regression                               Number of obs   =       4642
                                                  LR chi2(5)      =     375.00
                                                  Prob > chi2     =     0.0000
Log likelihood = -2043.2504                       Pseudo R2       =     0.0841

-------------------------------------------------------------------------------
      mbsmoke |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
     mmarried |  -1.145706   .0918962   -12.47   0.000     -1.32582    -.965593
         mage |    .321518   .0638472     5.04   0.000     .1963798    .4466563
              |
c.mage#c.mage |  -.0060368   .0011849    -5.09   0.000    -.0083592   -.0037144
              |
        fbaby |  -.3864258   .0880445    -4.39   0.000    -.5589898   -.2138618
         medu |  -.1420833   .0173215    -8.20   0.000    -.1760328   -.1081338
        _cons |  -2.950915   .8102504    -3.64   0.000    -4.538976   -1.362853
-------------------------------------------------------------------------------
There are observations with identical propensity score values.
The sort order of the data could affect your results.
Make sure that the sort order is random before calling psmatch2.
----------------------------------------------------------------------------------------
        Variable     Sample |    Treated     Controls   Difference         S.E.   T-stat
----------------------------+-----------------------------------------------------------
         bweight  Unmatched | 3137.65972   3412.91159  -275.251871   21.4528037   -12.83
                        ATT | 3137.65972     3368.866  -231.206281   24.5847243    -9.40
----------------------------+-----------------------------------------------------------
Note: Sample S.E.

           | psmatch2:
 psmatch2: |   Common
 Treatment |  support
assignment | On suppor |     Total
-----------+-----------+----------
 Untreated |     3,778 |     3,778 
   Treated |       864 |       864 
-----------+-----------+----------
     Total |     4,642 |     4,642

Once again, thanks for your help.
Daniel Colombo

Lizy Dhoro

Join Date: Apr 2016

Posts: 12
#8

18 Oct 2016, 00:05

Hi all

Daniel asked this question earlier, and I am having the same problem using psmatch2. How do I obtain the p-value or statistical significance of the ATT? Is it a straightforward interpretation of the t-stat? I.e., if |t-stat| > 1.96, do I have statistical significance at the 0.05 level (assuming N is sufficiently large)?

Netsai
Sebastian Geiger

Join Date: Oct 2015

Posts: 124
#9

18 Oct 2016, 04:00

Lizy,

Your idea is correct. As always, the t-statistic can be transformed into a p-value. However, the t-statistic is built from the -psmatch2- standard errors, which are incorrect according to the argument made by Abadie and Imbens. From my personal experience, though, the difference is small if your sample is relatively large. You may repeat the analysis with -teffects psmatch- and compare the results to those obtained from -psmatch2-.
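
As a minimal sketch of that transformation, the lines below compute a two-sided p-value from the large-sample normal approximation. The estimate and standard error are copied from the 1:1 psmatch2 ATT row posted earlier in this thread; the scalar names are illustrative.

```stata
* Two-sided p-value from an ATT estimate and its standard error,
* using the standard normal approximation (appropriate for large N).
scalar att = -236.784751      // ATT difference from the psmatch2 table above
scalar se  = 26.0535546       // its reported S.E.
scalar z   = att / se         // matches the T-stat column after rounding
scalar p   = 2 * normal(-abs(z))
display "z = " %6.2f z "   two-sided p = " %9.3g p
```

The same two lines for z and p apply to any other estimate and standard error, whichever command produced them.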
Tim Anderson

Join Date: Aug 2017

Posts: 28
#10

23 Nov 2018, 15:07

Hello all,

I would like to revive an earlier part of this thread that seemed unsettled: whether the psmatch2 ai() option calculates standard errors in line with the 2012 Abadie and Imbens manuscript, as the teffects psmatch program does.

I am planning to run propensity score matching using nearest-neighbor caliper matching without replacement. Unfortunately, I cannot use teffects psmatch, as it does not allow matching without replacement, but I would like to ensure that the standard errors generated by this approach are appropriate.

Best,
Tim