  • teffects aipw and the aequations option - what equations are being shown?

    Hi everyone -

    I have been exploring the teffects routines and have hit a stumbling block when using the aipw estimator. My impression is that aipw should follow the recipe in Jeff Wooldridge (2010, Chapter 21, pp. 930-33) for estimating treatment effects. That is, one begins by estimating a selection-into-treatment equation and then uses the propensity scores from that model to form inverse probability weights. For outcomes, one then uses the weights in estimating outcome equations for each treatment.

    What I can't figure out is why teffects aipw seems to be reporting the unweighted outcome regressions. That is, suppose I use teffects, aequations as follows:

    webuse cattaneo2
    teffects aipw (bweight prenatal1 mage) (mbsmoke mmarried medu, logit), aequations
    This gives the results:
    Iteration 0:   EE criterion =  2.105e-23  
    Iteration 1:   EE criterion =  3.723e-26  
    Treatment-effects estimation                    Number of obs     =      4,642
    Estimator      : augmented IPW
    Outcome model  : linear by ML
    Treatment model: logit
                           |               Robust
                   bweight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ATE                    |
                   mbsmoke |
    (smoker vs nonsmoker)  |  -236.3734   24.27813    -9.74   0.000    -283.9577   -188.7892
    POmean                 |
                   mbsmoke |
                nonsmoker  |   3401.113   9.606971   354.03   0.000     3382.284    3419.943
    OME0                   |
                 prenatal1 |   95.11727   26.80305     3.55   0.000     42.58425    147.6503
                      mage |   9.737189   1.824372     5.34   0.000     6.161485    13.31289
                     _cons |   3073.201   48.65752    63.16   0.000     2977.834    3168.568
    OME1                   |
                 prenatal1 |   64.61752   39.69749     1.63   0.104    -13.18813    142.4232
                      mage |  -4.962403   3.850123    -1.29   0.197     -12.5085    2.583699
                     _cons |   3217.973   93.57647    34.39   0.000     3034.566    3401.379
    TME1                   |
                  mmarried |  -.9757148   .0842798   -11.58   0.000      -1.1409   -.8105294
                      medu |  -.1352031    .015786    -8.56   0.000     -.166143   -.1042631
                     _cons |   .7946998   .1881658     4.22   0.000     .4259017    1.163498
    To replicate what teffects is doing, I would first run a logit:

    logit mbsmoke mmarried medu
    with output:

    Iteration 0:   log likelihood = -2230.7484  
    Iteration 1:   log likelihood = -2081.3446  
    Iteration 2:   log likelihood = -2074.1003  
    Iteration 3:   log likelihood = -2074.0937  
    Iteration 4:   log likelihood = -2074.0937  
    Logistic regression                             Number of obs     =      4,642
                                                    LR chi2(2)        =     313.31
                                                    Prob > chi2       =     0.0000
    Log likelihood = -2074.0937                     Pseudo R2         =     0.0702
         mbsmoke |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        mmarried |  -.9757148   .0823457   -11.85   0.000    -1.137109   -.8143202
            medu |  -.1352031   .0160605    -8.42   0.000    -.1666811    -.103725
           _cons |   .7946998    .188828     4.21   0.000     .4246037    1.164796
    So far, so good - this replicates the TME1 equation from teffects, aequations.

    But now, when I generate weights and run both the unweighted and the weighted regressions for, say, the smoker outcome equation (OME1), I get:

    predict ps, p
    gen w = 1/ps*mbsmoke + 1/(1-ps)*(1-mbsmoke)
    reg bweight prenatal1 mage if mbsmoke
    reg bweight prenatal1 mage [pweight=w] if mbsmoke
    The results for the first regression are:

          Source |       SS           df       MS      Number of obs   =       864
    -------------+----------------------------------   F(2, 861)       =      1.79
           Model |  1125691.15         2  562845.575   Prob > F        =    0.1672
        Residual |   270374985       861  314024.373   R-squared       =    0.0041
    -------------+----------------------------------   Adj R-squared   =    0.0018
           Total |   271500676       863  314601.015   Root MSE        =    560.38
         bweight |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
       prenatal1 |   64.61752    41.8932     1.54   0.123    -17.60723    146.8423
            mage |  -4.962403    3.65751    -1.36   0.175    -12.14108    2.216276
           _cons |   3217.973   93.36708    34.47   0.000     3034.719    3401.226
    and for the second regression are:

    (sum of wgt is   4.4684e+03)
    Linear regression                               Number of obs     =        864
                                                    F(2, 861)         =       1.48
                                                    Prob > F          =     0.2288
                                                    R-squared         =     0.0032
                                                    Root MSE          =     568.85
                 |               Robust
         bweight |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
       prenatal1 |    64.6658   41.73664     1.55   0.122    -17.25167    146.5833
            mage |  -4.316677   4.538464    -0.95   0.342    -13.22442    4.591072
           _cons |   3228.563   112.0674    28.81   0.000     3008.606     3448.52
    Oddly (to me anyways) it looks as though teffects, aequations is presenting the unweighted regression under OME1 above, whereas one would think it would report the weighted regression. Have I missed something here about how teffects aipw works, or is this just a reporting convention, or what?

    Matthew J. Baker

    Hi Matthew,

    The -teffects aipw- command implements the augmented inverse-probability-weighted estimator. From your description, I gather that what you expect -teffects aipw- to do is actually what is implemented in -teffects ipw- or -teffects ipwra-. Neither the default maximum likelihood estimator nor the nonlinear least squares estimator of the -teffects aipw- outcome model applies the inverse-probability weights to the regression functions. Rather, the weights are used when computing the potential-outcome means. To illustrate, here is an example where we replicate the point estimates from a -teffects aipw- specification:

    . * Example data:
    . webuse cattaneo2
    (Excerpt from Cattaneo (2010) Journal of Econometrics 155: 138-154)
    . * Example model:
    . teffects aipw (lbweight mmarried alcohol foreign, probit)       ///
    >               (mbsmoke foreign alcohol mage medu fage fedu),    ///
    >               pom aequations
    Iteration 0:   EE criterion =  3.201e-17  
    Iteration 1:   EE criterion =  4.726e-31  
    Treatment-effects estimation                    Number of obs     =      4,642
    Estimator      : augmented IPW
    Outcome model  : probit by ML
    Treatment model: logit
                 |               Robust
        lbweight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    POmeans      |
         mbsmoke |
      nonsmoker  |   .0516139   .0038098    13.55   0.000     .0441468     .059081
         smoker  |   .1083725   .0127418     8.51   0.000     .0833989     .133346
    OME0         |
        mmarried |  -.4785571    .072375    -6.61   0.000    -.6204095   -.3367048
         alcohol |  -.0200134   .2399808    -0.08   0.934    -.4903672    .4503404
         foreign |   .2710554   .1317215     2.06   0.040      .012886    .5292248
           _cons |  -1.352288   .0584441   -23.14   0.000    -1.466837    -1.23774
    OME1         |
        mmarried |  -.1088327   .1142113    -0.95   0.341    -.3326826    .1150173
         alcohol |   .2646055   .1776871     1.49   0.136    -.0836548    .6128658
         foreign |   -.130139    .385808    -0.34   0.736    -.8863087    .6260307
           _cons |  -1.202088   .0795964   -15.10   0.000    -1.358094   -1.046082
    TME1         |
         foreign |  -1.234034   .2596241    -4.75   0.000    -1.742888   -.7251799
         alcohol |    1.64208   .1796468     9.14   0.000     1.289978    1.994181
            mage |  -.0134554   .0092447    -1.46   0.146    -.0315746    .0046639
            medu |  -.1379639   .0204123    -6.76   0.000    -.1779713   -.0979565
            fage |  -.0016902   .0061346    -0.28   0.783    -.0137138    .0103334
            fedu |  -.0748969    .013219    -5.67   0.000    -.1008058   -.0489881
           _cons |   1.475494   .2314186     6.38   0.000     1.021922    1.929066
    . * Treatment assignment model:
    . logit mbsmoke foreign alcohol mage medu fage fedu
    Iteration 0:   log likelihood = -2230.7484  
    Iteration 1:   log likelihood = -2068.1689  
    Iteration 2:   log likelihood = -2055.2825  
    Iteration 3:   log likelihood = -2055.1754  
    Iteration 4:   log likelihood = -2055.1753  
    Logistic regression                             Number of obs     =      4,642
                                                    LR chi2(6)        =     351.15
                                                    Prob > chi2       =     0.0000
    Log likelihood = -2055.1753                     Pseudo R2         =     0.0787
         mbsmoke |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
         foreign |  -1.234031   .2454669    -5.03   0.000    -1.715137   -.7529249
         alcohol |    1.64208   .1779028     9.23   0.000     1.293397    1.990763
            mage |  -.0134554    .008719    -1.54   0.123    -.0305443    .0036336
            medu |  -.1379639   .0193929    -7.11   0.000    -.1759732   -.0999545
            fage |  -.0016902   .0054073    -0.31   0.755    -.0122884     .008908
            fedu |  -.0748969     .01301    -5.76   0.000     -.100396   -.0493978
           _cons |   1.475494   .2323914     6.35   0.000     1.020015    1.930973
    . predict double ps, pr
    . * IPWs:
    . gen double ipw0 = 0.mbsmoke/(1-ps)
    . gen double ipw1 = 1.mbsmoke/ps
    . * Outcome models & POMs:
    . probit lbweight mmarried alcohol foreign if mbsmoke==0
    Iteration 0:   log likelihood = -738.46462  
    Iteration 1:   log likelihood = -716.41468  
    Iteration 2:   log likelihood =  -715.9869  
    Iteration 3:   log likelihood = -715.98678  
    Iteration 4:   log likelihood = -715.98678  
    Probit regression                               Number of obs     =      3,778
                                                    LR chi2(3)        =      44.96
                                                    Prob > chi2       =     0.0000
    Log likelihood = -715.98678                     Pseudo R2         =     0.0304
        lbweight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        mmarried |  -.4785571   .0729884    -6.56   0.000    -.6216117   -.3355025
         alcohol |  -.0200134   .2540246    -0.08   0.937    -.5178926    .4778658
         foreign |   .2710554   .1298778     2.09   0.037     .0164997    .5256112
           _cons |  -1.352288   .0585397   -23.10   0.000    -1.467024   -1.237553
    . predict double pom0, pr
    . replace pom0 = pom0 + ipw0*(lbweight-pom0)
    (3,778 real changes made)
    . probit lbweight mmarried alcohol foreign if mbsmoke==1
    Iteration 0:   log likelihood = -299.30561  
    Iteration 1:   log likelihood = -297.62538  
    Iteration 2:   log likelihood = -297.61954  
    Iteration 3:   log likelihood = -297.61954  
    Probit regression                               Number of obs     =        864
                                                    LR chi2(3)        =       3.37
                                                    Prob > chi2       =     0.3377
    Log likelihood = -297.61954                     Pseudo R2         =     0.0056
        lbweight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        mmarried |  -.1088327   .1148622    -0.95   0.343    -.3339584    .1162931
         alcohol |   .2646055   .1802264     1.47   0.142    -.0886317    .6178426
         foreign |   -.130139   .3773742    -0.34   0.730    -.8697788    .6095008
           _cons |  -1.202088   .0810425   -14.83   0.000    -1.360928   -1.043247
    . predict double pom1, pr
    . replace pom1 = pom1 + ipw1*(lbweight-pom1)
    (864 real changes made)
    . * Results:
    . sum pom0 pom1
        Variable |        Obs        Mean    Std. Dev.       Min        Max
            pom0 |      4,642    .0516139    .2557622  -.4657595    4.06777
            pom1 |      4,642    .1083725    .7846416  -1.613126   14.20573
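
    The hand computation above can be summarized in one formula. This is a sketch in notation that is not part of the original post: m_t(x_i) stands for the predicted outcome from the unweighted model fitted on arm t (the -predict ..., pr- step), p_t(z_i) for the estimated probability of receiving treatment t (ps for smokers, 1-ps for nonsmokers), and T_i for mbsmoke.

    ```latex
    % AIPW potential-outcome mean for treatment level t, as replicated by
    % "replace pom_t = pom_t + ipw_t*(lbweight - pom_t)" followed by "sum"
    \hat{\mu}_t \;=\; \frac{1}{N}\sum_{i=1}^{N}
      \left[\, \hat{m}_t(x_i)
        \;+\; \frac{\mathbf{1}\{T_i = t\}}{\hat{p}_t(z_i)}
              \bigl(y_i - \hat{m}_t(x_i)\bigr) \right],
      \qquad t \in \{0, 1\}

    % The ATE is the difference of the two potential-outcome means
    \widehat{\mathrm{ATE}} \;=\; \hat{\mu}_1 - \hat{\mu}_0
    ```

    The weights enter only through the correction term: for observations outside arm t the indicator is zero and the contribution is just the model prediction, which is why the OME0 and OME1 coefficients match the unweighted fits.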
      Joerg Luedicke (StataCorp)
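
      For comparison, the weighted outcome regressions from the original question can be set next to -teffects ipwra-, which is described above as the estimator that applies the weights to the regression functions. The following is a minimal sketch, assuming the specification from the first post; ps_chk and w_chk are illustrative variable names, not part of the thread.

      ```stata
      * Sketch: IPWRA is expected to weight the outcome regressions
      webuse cattaneo2, clear
      teffects ipwra (bweight prenatal1 mage) (mbsmoke mmarried medu, logit), aequations

      * Hand-built inverse-probability weights (illustrative names)
      quietly logit mbsmoke mmarried medu
      predict double ps_chk, pr
      gen double w_chk = cond(mbsmoke == 1, 1/ps_chk, 1/(1 - ps_chk))

      * Weighted regression for smokers, to compare with OME1 from ipwra
      regress bweight prenatal1 mage [pweight = w_chk] if mbsmoke == 1
      ```

      If IPWRA behaves as described, the OME1 point estimates should line up with the weighted regression. The standard errors would be expected to differ, since teffects estimates the treatment and outcome models jointly.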

      Aha, thanks for clearing that up! That's a big help and you were correct - I thought that the aipw was doing something else. If I may ask a follow-up question, is there a reason why teffects aipw does not produce ATETs (treatment effects on treated)? From your above example, it seems like one could just add in

      An example:

      webuse cattaneo2, clear
      teffects aipw (lbweight mmarried alcohol foreign, probit)       ///
                    (mbsmoke foreign alcohol mage medu fage fedu), atet
      option atet is not allowed


        The augmented inverse-probability-weighted estimator implements an estimating function that is derived specifically for the ATE. Estimating the ATET would require deriving a different function. This differs from the IPW and IPWRA estimators, which use the same estimating functions for both the ATE and the ATET. I am not even sure whether anybody has ever derived an AIPW estimator for the ATET.
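
        Since the IPW and IPWRA estimators cover both the ATE and the ATET, one way to get an ATET for this example is to switch estimators. This is a minimal do-file sketch, assuming the model specifications from the example above carry over unchanged:

        ```stata
        webuse cattaneo2, clear

        * IPWRA with the same outcome and treatment models, requesting the ATET
        teffects ipwra (lbweight mmarried alcohol foreign, probit)    ///
                       (mbsmoke foreign alcohol mage medu fage fedu), atet

        * IPW uses only the treatment model; the outcome enters without covariates
        teffects ipw (lbweight) (mbsmoke foreign alcohol mage medu fage fedu), atet
        ```

        These are different estimators from AIPW, so the estimates need not agree with an AIPW-based ATE.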

