
  • Cannot replicate what teffects ipw is doing

    Hi,

    I want to compute the ATE and ATET using inverse probability weighting (IPW).

    So I use a Stata example:

    use http://www.stata-press.com/data/r15/cattaneo2
    teffects ipw (bweight) (mbsmoke mmarried c.mage##c.mage fbaby medu, probit)

    This gives me an ATE of -230.6886

    Now I want to replicate what Stata is doing for me:

    qui probit mbsmoke mmarried c.mage##c.mage fbaby medu
    predict ps_score
    gen a=((mbsmoke-ps_score)*bweight)/(ps_score*(1-ps_score))
    tabstat a, stats(mean)

    This gives me an ATE of -313.0509.

    So I am not sure exactly what teffects ipw is doing. I can see that it gives me the same propensity score when I use the postestimation command predict, pr, but I cannot obtain the IPW weights to see what is going wrong.

    Hope someone can help

    Peter

  • #2
    Peter, I can show you how to replicate the point estimate of the ATE. I went over this in a class within the last month, so I'm confident about this. You have to calculate weights for:

    a) the inverse of the probability of treatment among the treated

    b) the inverse of the probability of not being treated among the untreated (i.e. 1 minus p(treatment))

    Then, I regressed birthweight on smoking, using my new IPT weights as probability weights.
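
    As a rough sketch of why the weighted regression does the job (the notation is an assumption added for illustration, not something from the class: T_i stands for mbsmoke, Y_i for bweight, and \hat p_i for the fitted probit probability of smoking), the weight and the resulting coefficient on mbsmoke can be written as:

    ```latex
    w_i = \frac{T_i}{\hat p_i} + \frac{1 - T_i}{1 - \hat p_i},
    \qquad
    \hat\beta_{\text{mbsmoke}}
      = \frac{\sum_i w_i T_i Y_i}{\sum_i w_i T_i}
      - \frac{\sum_i w_i (1 - T_i) Y_i}{\sum_i w_i (1 - T_i)}
    ```

    That is, with a single binary regressor the weighted least-squares slope is just the difference between the two weighted group means, each divided by the sum of the weights in its own group.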

    Code:
    use http://www.stata-press.com/data/r15/cattaneo2
    teffects ipw (bweight) (mbsmoke mmarried c.mage##c.mage fbaby medu, probit)
    Treatment-effects estimation                    Number of obs     =      4,642
    Estimator      : inverse-probability weights
    Outcome model  : weighted mean
    Treatment model: probit
    ----------------------------------------------------------------------------------------
                           |               Robust
                   bweight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -----------------------+----------------------------------------------------------------
    ATE                    |
                   mbsmoke |
    (smoker vs nonsmoker)  |  -230.6886   25.81524    -8.94   0.000    -281.2856   -180.0917
    -----------------------+----------------------------------------------------------------
    POmean                 |
                   mbsmoke |
                nonsmoker  |   3403.463   9.571369   355.59   0.000     3384.703    3422.222
    ----------------------------------------------------------------------------------------
    
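    qui probit mbsmoke mmarried c.mage##c.mage fbaby medu
    * predicted probability of smoking (the propensity score)
    predict prob, pr
    * IPT weights: 1/p for smokers, 1/(1 - p) for nonsmokers
    gen wt = 1 / prob if mbsmoke == 1
    replace wt = 1 / (1 - prob) if mbsmoke == 0
    regress bweight mbsmoke [pw = wt]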
    
    ------------------------------------------------------------------------------
                 |               Robust
         bweight |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
         mbsmoke |  -230.6886   25.94182    -8.89   0.000    -281.5469   -179.8303
           _cons |   3403.463   9.616992   353.90   0.000     3384.609    3422.317
    ------------------------------------------------------------------------------
    Because the aforementioned class was an epidemiology class, we mainly dealt with binary outcomes and binary confounders. I'm also not sure what your command to generate a is doing. Nonetheless, the weighted regression gives the same point estimate as -teffects ipw-. Sharp-eyed readers will note that the standard error in my model is slightly different. Here it's slightly larger, but I'm not sure if it always is, and the difference isn't material in this example. Adding the -vce(robust)- option to the regress command changes nothing. I'm not sure what the source of the difference is.

    In the case of a continuous confounder, I can provide a generalization of the method above later on if people are interested. I would have to hunt for a suitable data example.
    Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

    When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



    • #3
      Thank you. I am not sure how to use dataex.

      What I try to do with

      gen a=((mbsmoke-ps_score)*bweight)/(ps_score*(1-ps_score))
      tabstat a, stats(mean)

      is to compute the ATE IPW estimate manually. However, I still don't know why this does not give the same result as using the teffects command.



      • #4
        Originally posted by Peter Thomsen
        Thank you. I am not sure how to use dataex.

        What I try to do with

        gen a=((mbsmoke-ps_score)*bweight)/(ps_score*(1-ps_score))
        tabstat a, stats(mean)

        is to compute the ATE IPW estimate manually. However, I still don't know why this does not give the same result as using the teffects command.

        To be honest, I can't follow the math or thought process in this code.

        The dependent variable, bweight, is continuous. You are interested in the association of smoking, a binary variable, with birth weight. You estimated inverse probability of treatment (IPT) weights via probit.

        I'm pretty sure that in this case, -teffects- is fitting a linear model to bweight using mbsmoke as the sole independent variable, but applying the IPT weights. I can't tell what your code does, but it does not do that.

        Do note that the IPT weights are 1 / p(mbsmoke) for smokers, and 1 / (1 - p(mbsmoke)) for those who were non-smokers.
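
        One possible reading of the gen a line, offered as a sketch rather than a definitive answer: since (T - p)/(p(1 - p)) = T/p - (1 - T)/(1 - p), the mean of a is an IPW difference in which each weighted sum is divided by the full sample size N rather than by the sum of the weights in its own group. A regression with probability weights divides each group's weighted sum by that group's sum of weights, so the two need not agree in a finite sample. The variable and scalar names below (ps, ht, w, m1, m0) are made up for illustration.

        ```stata
        use http://www.stata-press.com/data/r15/cattaneo2, clear
        quietly probit mbsmoke mmarried c.mage##c.mage fbaby medu
        predict double ps, pr

        * Unnormalized version: the mean of the expression from post #1,
        * which divides both weighted sums by N
        generate double ht = (mbsmoke - ps)*bweight/(ps*(1 - ps))
        summarize ht, meanonly
        display "Unnormalized ATE: " r(mean)

        * Normalized version: weighted mean of bweight in each group,
        * each divided by that group's own sum of weights
        generate double w = cond(mbsmoke == 1, 1/ps, 1/(1 - ps))
        summarize bweight [aweight = w] if mbsmoke == 1, meanonly
        scalar m1 = r(mean)
        summarize bweight [aweight = w] if mbsmoke == 0, meanonly
        scalar m0 = r(mean)
        display "Normalized ATE:   " m1 - m0
        ```

        Going by the numbers already reported in this thread, the first display should land close to -313.05 and the second close to -230.69, which would suggest the gap comes from the normalization of the weights rather than from the propensity scores.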

        My signature line about -dataex- is a general guideline to help posters here help others - it can be difficult to do so if we don't know what your data look like. You demonstrated your problem using a stock Stata dataset that we all can access and that I've played around with as part of learning -teffects-, so this request is irrelevant for your purposes.