Can not replicate what teffects ipw is doing

Peter Thomsen

Join Date: Dec 2018

Posts: 2
#1

03 Dec 2018, 08:50

Hi,

I want to compute the ATE and ATET using inverse probability weighting (IPW).

So I use a Stata example:

use http://www.stata-press.com/data/r15/cattaneo2
teffects ipw (bweight) (mbsmoke mmarried c.mage##c.mage fbaby medu, probit)

This gives me an ATE of -230.6886

Now I want to replicate what Stata is doing for me:

qui probit mbsmoke mmarried c.mage##c.mage fbaby medu
predict ps_score
gen a=((mbsmoke-ps_score)*bweight)/(ps_score*(1-ps_score))
tabstat a, stats(mean)

This gives me an ATE of -313.0509.

So I am not quite sure what exactly teffects ipw is doing. I can see that it gives me the same propensity score when I use the postestimation command predict, pr, but I cannot obtain the IPW weights to see what is going wrong.

Hope someone can help

Peter

Weiwen Ng

Join Date: Jun 2015
Posts: 1241

03 Dec 2018, 09:10

Peter, I can show you how to replicate the point estimate of the ATE. I went over this in a class within the last month, so I'm confident about this. You have to calculate weights for:

a) the inverse of the probability of treatment among the treated

b) the inverse of the probability of not being treated among the untreated (i.e. 1 minus p(treatment))

Then, I regressed birthweight on smoking, using my new IPT weights as probability weights.

Code:

use http://www.stata-press.com/data/r15/cattaneo2
teffects ipw (bweight) (mbsmoke mmarried c.mage##c.mage fbaby medu, probit)
Treatment-effects estimation                    Number of obs     =      4,642
Estimator      : inverse-probability weights
Outcome model  : weighted mean
Treatment model: probit
----------------------------------------------------------------------------------------
                       |               Robust
               bweight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------------+----------------------------------------------------------------
ATE                    |
               mbsmoke |
(smoker vs nonsmoker)  |  -230.6886   25.81524    -8.94   0.000    -281.2856   -180.0917
-----------------------+----------------------------------------------------------------
POmean                 |
               mbsmoke |
            nonsmoker  |   3403.463   9.571369   355.59   0.000     3384.703    3422.222
----------------------------------------------------------------------------------------

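qui probit mbsmoke mmarried c.mage##c.mage fbaby medu
* predicted probability of smoking from the probit
predict prob
* IPT weights: 1/p for smokers, 1/(1-p) for nonsmokers
gen wt = 1 / prob if mbsmoke == 1
replace wt = 1 / (1 - prob) if mbsmoke == 0
regress bweight mbsmoke [pw = wt]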

------------------------------------------------------------------------------
             |               Robust
     bweight |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     mbsmoke |  -230.6886   25.94182    -8.89   0.000    -281.5469   -179.8303
       _cons |   3403.463   9.616992   353.90   0.000     3384.609    3422.317
------------------------------------------------------------------------------

Because the aforementioned class was an epidemiology class, we mainly dealt with binary outcomes and binary confounders. I'm also not sure what you're doing in your command to generate a. Nonetheless, the weighted regression gives the same point estimate as -teffects ipw-. Sharp-eyed readers will note that the standard error is slightly different in my model. Here it's a bit larger, but I'm not sure if that is always the case, and the difference isn't material. Adding the -vce(robust)- option to the regress command changes nothing, because -regress- already reports robust standard errors when probability weights are specified. I'm not sure what the source of the remaining difference is; one likely contributor is that -regress- treats the weights as fixed, whereas -teffects- accounts for the fact that the propensity scores are estimated.
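To make the equivalence concrete, here is a minimal sketch that computes the same quantities as weighted group means instead of going through -regress-. The names ps, w, mu1 and mu0 are illustrative choices, not anything -teffects- exposes. A weighted regression on a single binary regressor is just a difference in weighted group means, so the two displayed values should agree with the _cons and mbsmoke coefficients from the weighted regression.

```stata
* Sketch only: ps, w, mu1 and mu0 are illustrative names.
use http://www.stata-press.com/data/r15/cattaneo2, clear
quietly probit mbsmoke mmarried c.mage##c.mage fbaby medu
predict double ps, pr

* Inverse-probability-of-treatment weights
generate double w = cond(mbsmoke == 1, 1/ps, 1/(1 - ps))

* Weighted mean of bweight within each group: sum(w*y)/sum(w)
quietly summarize bweight [aweight = w] if mbsmoke == 1
scalar mu1 = r(mean)
quietly summarize bweight [aweight = w] if mbsmoke == 0
scalar mu0 = r(mean)

display "Weighted mean, nonsmokers: " mu0
display "Difference in weighted means: " mu1 - mu0
```

Dividing by the sum of the weights within each group, rather than by the full sample size, is what makes these normalized weighted means.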

In the case of a continuous confounder, I can provide a generalization of the method above if people are interested later on. I would have to hunt for the correct data example.

Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.

Peter Thomsen

Join Date: Dec 2018

Posts: 2
#3

03 Dec 2018, 10:43

Thank you. I am not sure how to use dataex.

What I try to do with

gen a=((mbsmoke-ps_score)*bweight)/(ps_score*(1-ps_score))
tabstat a, stats(mean)

is to compute the ATE IPW estimate manually. However, I still don't know why this does not give the same result as using the teffects command.
Weiwen Ng

Join Date: Jun 2015

Posts: 1241
#4

04 Dec 2018, 10:56

Originally posted by Peter Thomsen

Thank you. I am not sure how to use dataex.

What I try to do with

gen a=((mbsmoke-ps_score)*bweight)/(ps_score*(1-ps_score))
tabstat a, stats(mean)

is to compute the ATE IPW estimate manually. However, I still don't know why this does not give the same result as using the teffects command.

To be honest, I can't follow the math or thought process in this code.

The dependent variable, bweight, is continuous. You are interested in the association of smoking, a binary variable, with birth weight. You estimated inverse probability of treatment (IPT) weights via probit.

I'm pretty sure that in this case, -teffects- is fitting a linear model to bweight using mbsmoke as the sole independent variable, but applying the IPT weights. I can't tell what your code does, but it does not do that.
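For anyone trying to reconcile the two numbers, one possible reading of the manual formula is sketched below. The notation is assumed for illustration: D_i is mbsmoke, Y_i is bweight, \hat{p}_i is the predicted probability of smoking from the probit, and N is the sample size. The last line is the estimate from the weighted regression of bweight on mbsmoke, not a statement about the internals of -teffects-.

```latex
% Notation (assumed for this sketch): D_i = mbsmoke, Y_i = bweight,
% \hat{p}_i = predicted probability of smoking, N = sample size.
\begin{align*}
a_i &= \frac{(D_i - \hat{p}_i)\,Y_i}{\hat{p}_i(1 - \hat{p}_i)}
     = \frac{D_i Y_i}{\hat{p}_i} - \frac{(1 - D_i)\,Y_i}{1 - \hat{p}_i}, \\
\frac{1}{N}\sum_{i=1}^{N} a_i
    &= \frac{1}{N}\sum_{i=1}^{N}\frac{D_i Y_i}{\hat{p}_i}
     - \frac{1}{N}\sum_{i=1}^{N}\frac{(1 - D_i)\,Y_i}{1 - \hat{p}_i}
     && \text{(weights not normalized)}, \\
\hat{\tau}_{\text{reg}}
    &= \frac{\sum_i D_i Y_i/\hat{p}_i}{\sum_i D_i/\hat{p}_i}
     - \frac{\sum_i (1 - D_i)\,Y_i/(1 - \hat{p}_i)}{\sum_i (1 - D_i)/(1 - \hat{p}_i)}
     && \text{(weights normalized within group)}.
\end{align*}
```

The two expressions coincide only when \sum_i D_i/\hat{p}_i and \sum_i (1 - D_i)/(1 - \hat{p}_i) both equal N. That holds in expectation but generally not in a given sample, which may account for the gap between -313.0509 and -230.6886.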

Do note that the IPT weights are 1 / p(mbsmoke) for smokers, and 1 / (1 - p(mbsmoke)) for those who were non-smokers.
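Since the original question also asked about the ATET, here is a hedged sketch of the analogous weighting. It assumes the usual textbook ATET weights (1 for the treated, p/(1 - p) for the untreated). That these match what -teffects ipw, atet- uses is an assumption, not something confirmed in this thread, so the sketch runs the built-in command alongside for comparison. The names ps, w_atet and atet_reg are illustrative.

```stata
* Sketch only: ps, w_atet and atet_reg are illustrative names, and the
* weight formula is the textbook ATET weighting (assumed, not confirmed here).
use http://www.stata-press.com/data/r15/cattaneo2, clear
quietly probit mbsmoke mmarried c.mage##c.mage fbaby medu
predict double ps, pr

* Treated units count once; untreated units are reweighted by the odds of treatment
generate double w_atet = cond(mbsmoke == 1, 1, ps/(1 - ps))

regress bweight mbsmoke [pw = w_atet]
scalar atet_reg = _b[mbsmoke]

* Built-in estimate for comparison
teffects ipw (bweight) (mbsmoke mmarried c.mage##c.mage fbaby medu, probit), atet
```

If the weighting assumption is right, the mbsmoke coefficient from -regress- should line up with the ATET point estimate reported by -teffects-. The standard errors may differ somewhat, as they do in the ATE case.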

My signature line about -dataex- is a general guideline to help posters here help others - it can be difficult to do so if we don't know what your data look like. You demonstrated your problem using a stock Stata dataset that we all can access and that I've played around with as part of learning -teffects-, so this request is irrelevant for your purposes.
