
  • psmatch2--selecting parameter for ai() option to calculate Abadie-Imbens standard errors

    A recent update to psmatch2 (Leuven & Sianesi, available from SSC) adds an option ai(), described as follows:

    ai(integer) calculate the heteroskedasticity-consistent analytical standard errors proposed by Abadie and Imbens (2006) by specifying the number of neighbors to be used to calculate the conditional variance (their formula (14)). With option altvariance one can specify to use the estimator of Abadie et al. (2004) instead.

    For propensity score matching, should the ai() value be the same number as the number of nearest neighbors used for matching, e.g., ai(1) for 1:1 matching? If not, how does one select a value for ai()?


    References:
    Abadie, A., Drukker, D., Herr, J. L., & Imbens, G. W. (2004). "Implementing matching estimators for average treatment effects in Stata", Stata Journal 4(3), 290-311.
    Abadie, A., & Imbens, G. W. (2006). "Large sample properties of matching estimators for average treatment effects", Econometrica 74(1), 235-267.
    David Radwin
    Senior Researcher, California Competes
    californiacompetes.org
    Pronouns: He/Him

  • #2
    Hi David,
    I believe you are correct - the ai() value should be the number of matches used for each treated individual.
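    As a minimal sketch of what that looks like for 1:1 matching, assuming psmatch2 is installed from SSC and using the cattaneo2 example data purely for illustration (the variable list, the seed, and the helper variable u are illustrative choices, not part of the original question):

    ```stata
    * Assumes psmatch2 is installed: ssc install psmatch2
    webuse cattaneo2, clear

    * psmatch2 warns that sort order can matter when propensity scores tie,
    * so put the data in a random but reproducible order first
    set seed 12345
    generate double u = runiform()
    sort u

    * 1:1 nearest-neighbor matching on the propensity score, with ai()
    * set equal to the number of matches per treated observation
    psmatch2 mbsmoke mmarried c.mage##c.mage fbaby medu, out(bweight) logit neighbor(1) ai(1)
    ```

    With 2 neighbors, the same reasoning would give neighbor(2) ai(2).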
    Hope this helps,
    Melissa



    • #3
      Thank you Melissa!

      David



      • #4
        Originally posted by Melissa Garrido:
        "Hi David,
        I believe you are correct - the ai() value should be the number of matches used for each treated individual.
        Hope this helps,
        Melissa"

        Hi Melissa,

        Late to this, but how would this work if the -ties- option of psmatch2 is chosen, so that a treated individual may be matched to more than one control with the same (tied) propensity score? For nearest-neighbour matching with one neighbour and the -ties- option, would it still be ai(1)?



        • #5
          Dear all,

          I just read this and other posts, and I thought one of you could help me with a simple question on the subject. I've recently learned about the ai(#) option in psmatch2, and I'm trying to understand how to use it.

          Before the command update, I bootstrapped the standard errors to obtain the p-value or statistical significance of the ATT coefficient. I understand that, with the ai(#) option, this is no longer necessary or appropriate if I'm using nearest-neighbor or Mahalanobis matching.

          My question is: how do I obtain the p-value or statistical significance now? Is it a straightforward interpretation of the t-stat? I.e., if |t-stat| > 1.96, do I have statistical significance at the 0.05 level (assuming N is sufficiently large)?

          I've read Abadie and Imbens (2006).

          I appreciate any thoughts and comments in advance.

          All the best,
          Daniel Colombo



          • #6
            Hello,

            Stata 13 (or higher) offers its own propensity score matching command: -teffects psmatch-

            As described in the help file of -teffects psmatch-, it uses robust standard errors by implementing the procedure derived by Abadie and Imbens (2006, 2011, 2012). I'm not sure whether -psmatch2- uses the same approach, since its help file cites only the 2006 paper. I tried to replicate the results of -teffects psmatch- with -psmatch2-. I was able to reproduce the coefficient, but the standard errors are not identical, even after trying multiple values for the -ai()- option. The help file of -teffects psmatch- says "teffects psmatch uses two matches in estimating the robust standard errors", so I expected ai(2) to make -psmatch2- reproduce the -teffects psmatch- standard error. However, it does not.

            Since -psmatch2- still displays "Note: S.E. does not take into account that the propensity score is estimated.", I suppose that -teffects psmatch- takes the estimation of the propensity score into account while -psmatch2- does not, and that this explains the difference. Abadie and Imbens (2012) propose a method to estimate the correct standard error.
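            For anyone who wants to repeat the comparison across several ai() values, a rough sketch follows. It assumes psmatch2 is installed from SSC and that it returns the ATT and its standard error in r(att) and r(seatt); run -return list- after psmatch2 to confirm the names in your version. The range 1/4 is an arbitrary illustrative choice.

            ```stata
            * Assumes psmatch2 is installed: ssc install psmatch2
            webuse cattaneo2, clear

            * Re-run the same match with different ai() values and list the
            * ATT and its standard error side by side
            forvalues k = 1/4 {
                quietly psmatch2 mbsmoke mmarried c.mage##c.mage fbaby medu, out(bweight) logit ties ai(`k')
                display "ai(`k'): ATT = " %10.4f r(att) "   S.E. = " %10.4f r(seatt)
            }
            ```

            The ATT should not change across iterations, since ai() only affects the variance calculation.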

            Code:
            webuse cattaneo2, clear
            
            . psmatch2 mbsmoke mmarried c.mage##c.mage fbaby medu, out(bweight) logit ties
            
            Logistic regression                             Number of obs     =      4,642
                                                            LR chi2(5)        =     375.00
                                                            Prob > chi2       =     0.0000
            Log likelihood = -2043.2504                     Pseudo R2         =     0.0841
            
            -------------------------------------------------------------------------------
                  mbsmoke |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
            --------------+----------------------------------------------------------------
                 mmarried |  -1.145706   .0918962   -12.47   0.000     -1.32582    -.965593
                     mage |    .321518   .0638472     5.04   0.000     .1963798    .4466563
                          |
            c.mage#c.mage |  -.0060368   .0011849    -5.09   0.000    -.0083592   -.0037144
                          |
                    fbaby |  -.3864258   .0880445    -4.39   0.000    -.5589898   -.2138618
                     medu |  -.1420833   .0173215    -8.20   0.000    -.1760328   -.1081338
                    _cons |  -2.950915   .8102504    -3.64   0.000    -4.538976   -1.362853
            -------------------------------------------------------------------------------
            There are observations with identical propensity score values.
            The sort order of the data could affect your results.
            Make sure that the sort order is random before calling psmatch2.
            ----------------------------------------------------------------------------------------
                    Variable     Sample |    Treated     Controls   Difference         S.E.   T-stat
            ----------------------------+-----------------------------------------------------------
                     bweight  Unmatched | 3137.65972   3412.91159  -275.251871   21.4528037   -12.83
                                    ATT | 3137.65972   3374.44447  -236.784751   26.0535546    -9.09
            ----------------------------+-----------------------------------------------------------
            Note: S.E. does not take into account that the propensity score is estimated.
            
                       | psmatch2:
             psmatch2: |   Common
             Treatment |  support
            assignment | On suppor |     Total
            -----------+-----------+----------
             Untreated |     3,778 |     3,778
               Treated |       864 |       864
            -----------+-----------+----------
                 Total |     4,642 |     4,642
            
            
            . teffects psmatch (bweight) (mbsmoke mmarried c.mage##c.mage fbaby medu), atet
            
            Treatment-effects estimation                   Number of obs      =      4,642
            Estimator      : propensity-score matching     Matches: requested =          1
            Outcome model  : matching                                     min =          1
            Treatment model: logit                                        max =         74
            ----------------------------------------------------------------------------------------
                                   |              AI Robust
                           bweight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
            -----------------------+----------------------------------------------------------------
            ATET                   |
                           mbsmoke |
            (smoker vs nonsmoker)  |  -236.7848   26.57789    -8.91   0.000    -288.8765    -184.693
            ----------------------------------------------------------------------------------------
            Last edited by Sebastian Geiger; 06 Jul 2016, 08:47.



            • #7
              Dear Sebastian,

              Thank you for your reply. I can see you put some effort to help me, and I appreciate it.

              Unfortunately, I will be running the regressions in a data room where the computers only have Stata 12 installed. So ‘teffects’ is not an option, and ‘psmatch2’ will have to do.

              Running some tests at home with Stata 13, I tried to replicate the ‘teffects’ results in ‘psmatch2’ and had not succeeded so far. You managed it (at least for the coefficient) by including the ‘ties’ option, something I hadn't tried before. I'm looking into it, so thank you for that.

              I have two further comments:

              1) Using your suggested code, I tested whether the results of both commands would be the same with two-neighbor matching. In this case not even the coefficients are the same (see results below). So I really don't know what differs between the two estimations.

              Code:
              . webuse cattaneo2.dta, clear
              (Excerpt from Cattaneo (2010) Journal of Econometrics 155: 138-154)
              
              . teffects psmatch (bweight) (mbsmoke mmarried c.mage##c.mage fbaby medu), atet nn(2)
              
              Treatment-effects estimation                    Number of obs      =      4642
              Estimator      : propensity-score matching      Matches: requested =         2
              Outcome model  : matching                                      min =         2
              Treatment model: logit                                         max =        74
              ----------------------------------------------------------------------------------------
                                     |              AI Robust
                             bweight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
              -----------------------+----------------------------------------------------------------
              ATET                   |
                             mbsmoke |
              (smoker vs nonsmoker)  |  -230.8884   25.24801    -9.14   0.000    -280.3736   -181.4032
              ----------------------------------------------------------------------------------------
              
              . psmatch2 mbsmoke mmarried c.mage##c.mage fbaby medu, out(bweight) logit ties n(2)
              
              Logistic regression                               Number of obs   =       4642
                                                                LR chi2(5)      =     375.00
                                                                Prob > chi2     =     0.0000
              Log likelihood = -2043.2504                       Pseudo R2       =     0.0841
              
              -------------------------------------------------------------------------------
                    mbsmoke |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
              --------------+----------------------------------------------------------------
                   mmarried |  -1.145706   .0918962   -12.47   0.000     -1.32582    -.965593
                       mage |    .321518   .0638472     5.04   0.000     .1963798    .4466563
                            |
              c.mage#c.mage |  -.0060368   .0011849    -5.09   0.000    -.0083592   -.0037144
                            |
                      fbaby |  -.3864258   .0880445    -4.39   0.000    -.5589898   -.2138618
                       medu |  -.1420833   .0173215    -8.20   0.000    -.1760328   -.1081338
                      _cons |  -2.950915   .8102504    -3.64   0.000    -4.538976   -1.362853
              -------------------------------------------------------------------------------
              There are observations with identical propensity score values.
              The sort order of the data could affect your results.
              Make sure that the sort order is random before calling psmatch2.
              ----------------------------------------------------------------------------------------
                      Variable     Sample |    Treated     Controls   Difference         S.E.   T-stat
              ----------------------------+-----------------------------------------------------------
                       bweight  Unmatched | 3137.65972   3412.91159  -275.251871   21.4528037   -12.83
                                      ATT | 3137.65972     3368.866  -231.206281   25.6247909    -9.02
              ----------------------------+-----------------------------------------------------------
              Note: S.E. does not take into account that the propensity score is estimated.
              
                         | psmatch2:
               psmatch2: |   Common
               Treatment |  support
              assignment | On suppor |     Total
              -----------+-----------+----------
               Untreated |     3,778 |     3,778 
                 Treated |       864 |       864 
              -----------+-----------+----------
                   Total |     4,642 |     4,642

              2) You mentioned that ‘psmatch2’ displays the message "Note: S.E. does not take into account that the propensity score is estimated.". But note that, if you use the ai(#) option, the note is replaced by "Note: Sample S.E." (see below). So I'm not sure that is the issue.

              Code:
              . psmatch2 mbsmoke mmarried c.mage##c.mage fbaby medu, out(bweight) logit ties n(2) ai(2)
              
              Logistic regression                               Number of obs   =       4642
                                                                LR chi2(5)      =     375.00
                                                                Prob > chi2     =     0.0000
              Log likelihood = -2043.2504                       Pseudo R2       =     0.0841
              
              -------------------------------------------------------------------------------
                    mbsmoke |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
              --------------+----------------------------------------------------------------
                   mmarried |  -1.145706   .0918962   -12.47   0.000     -1.32582    -.965593
                       mage |    .321518   .0638472     5.04   0.000     .1963798    .4466563
                            |
              c.mage#c.mage |  -.0060368   .0011849    -5.09   0.000    -.0083592   -.0037144
                            |
                      fbaby |  -.3864258   .0880445    -4.39   0.000    -.5589898   -.2138618
                       medu |  -.1420833   .0173215    -8.20   0.000    -.1760328   -.1081338
                      _cons |  -2.950915   .8102504    -3.64   0.000    -4.538976   -1.362853
              -------------------------------------------------------------------------------
              There are observations with identical propensity score values.
              The sort order of the data could affect your results.
              Make sure that the sort order is random before calling psmatch2.
              ----------------------------------------------------------------------------------------
                      Variable     Sample |    Treated     Controls   Difference         S.E.   T-stat
              ----------------------------+-----------------------------------------------------------
                       bweight  Unmatched | 3137.65972   3412.91159  -275.251871   21.4528037   -12.83
                                      ATT | 3137.65972     3368.866  -231.206281   24.5847243    -9.40
              ----------------------------+-----------------------------------------------------------
              Note: Sample S.E.
              
                         | psmatch2:
               psmatch2: |   Common
               Treatment |  support
              assignment | On suppor |     Total
              -----------+-----------+----------
               Untreated |     3,778 |     3,778 
                 Treated |       864 |       864 
              -----------+-----------+----------
                   Total |     4,642 |     4,642
              Once again, thanks for your help.
              Daniel Colombo



              • #8
                Hi all

                Daniel asked this question, and I am having the same problem using psmatch2. How do I obtain the p-value or statistical significance of the ATT? Is it a straightforward interpretation of the t-stat? I.e., if |t-stat| > 1.96, do I have statistical significance at the 0.05 level (assuming N is sufficiently large)?

                Netsai



                • #9
                  Lizy,

                  Your idea is correct. As always, the t-statistic can be converted into a p-value. However, the t-statistic is computed from the standard errors reported by -psmatch2-, and those are incorrect according to the argument made by Abadie and Imbens. In my experience, though, the difference is small if your sample is relatively large. You may repeat the analysis with -teffects psmatch- and compare the results to those obtained from -psmatch2-.
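
                  As a small illustration of the conversion, assuming a large-sample normal approximation, the sketch below plugs in the ATT and S.E. from the one-neighbor -psmatch2- output earlier in this thread. The scalar names are arbitrary.

                  ```stata
                  * ATT and S.E. copied from the psmatch2 output posted above
                  scalar att = -236.784751
                  scalar se  = 26.0535546

                  * test statistic and two-sided p-value under the normal approximation;
                  * 2*normal(-|z|) avoids the rounding problem of 1 - normal(|z|) for large |z|
                  scalar z = scalar(att)/scalar(se)
                  scalar p = 2*normal(-abs(scalar(z)))

                  display "z = " %6.2f scalar(z) "   two-sided p = " %12.4e scalar(p)
                  ```

                  Any |z| above 1.96 corresponds to p < 0.05 under this approximation, so the conclusion is only as reliable as the standard error that goes into z.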



                  • #10
                    Hello all,

                    I would like to revive an earlier part of this thread that seemed unsettled: whether the psmatch2 ai() option calculates standard errors based on the 2012 Abadie and Imbens manuscript, as the teffects psmatch command does.

                    I am planning to run propensity score matching using nearest-neighbor caliper matching without replacement. Unfortunately, I cannot use teffects psmatch, as teffects does not allow matching without replacement, but I would like to ensure that the standard errors generated by this approach are appropriate.

                    Best,
                    Tim
