Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Treatment Effects Model with binary treatment and continuous outcome (Gamma distribution)

    Dear STATALIST:

    I'm trying to identify a Stata command to estimate a treatment effects model with a binary treatment and a continuous outcome of interest (healthcare costs). The latter needs to be estimated using a Gamma regression. The command mtreatreg allows users to specify a Gamma regression to estimate the main equation but it was developed to fit a multinomial treatment (not binary). The command treatreg does not accommodate Gamma estimation for the main equation, but fits binary treatments. I wonder if there is an option within either of these commands or an alternative one that I could use.

    Thanks so much for your help.
    Best,
    Marina
    Last edited by Marina Soley; 22 Feb 2016, 21:05.

  • #2
    Welcome to Statalist! Please read the FAQ for tips on how to post most effectively. E.g. you should tell us where user-written programs come from -- mtreatreg is the example here. I do not know about these models but I note that binary logit models might be seen as a special case of multinomial logit in some contexts. So, I ask if that is the case here. (You could also check with the authors of the program and associated article(s).) Otherwise, for gamma regression, you could look at gammafit on SSC and related programs also there such as gb2fit and gb2lfit.

    Comment


    • #3
      Marina could use one of Stata's stteffects commands for her purposes. Although stteffects is designed for survival time outcomes, she could simply stset her outcome variable without specifying a censoring variable. Here is an example, but first we need some data:

      Code:
      * Data for (two-parameter) gamma treatment effects model:
      clear
      set seed 123
      set obs 10000
      gen double u0 = runiform()
      gen double u1 = runiform()
      
      * Gamma parameters:
      local s0 = 1/exp(2*-.8)
      local s1 = 1/exp(2*-.8)
      
      * Covariates:
      gen double x1 = runiform()
      gen double x2 = rbeta(2,2)
      gen double x3 = rnormal(1,0.5)
      gen double x4 = runiform()
      gen double x5 = rnormal(1,1)
      
      * Treatment model:
      local xb_t (0.5 + 0.8*x1 + 0.8*x2 + 0.4*x3)
      gen double xb = 1/(1+exp(-(`xb_t')))
      gen byte treat = rbinomial(1,xb)
      
      * Outcome:
      gen double xb0 = -1.5 + 0.50*(x1 + x2 + x3 + x4 + x5)
      gen double xb1 = -1.0 + 0.90*(x1 + x2 + x3 + x4 + x5)
      gen double y0 = invgammap(`s0', u0)*exp(xb0)/`s0'
      gen double y1 = invgammap(`s1', u1)*exp(xb1)/`s1'
      gen double y = treat*y1 + (1-treat)*y0
      
      * True potential outcomes:
      gen double pom0 = exp(xb0)
      gen double pom1 = exp(xb1)
      The true average treatment effect here, i.e. the difference in true potential outcome means, is around ATE = 14:

      Code:
      . sum pom0
      
          Variable |        Obs        Mean    Std. Dev.       Min        Max
      -------------+---------------------------------------------------------
              pom0 |     10,000    1.542234     1.06075   .0957184   21.97223
      
      . sum pom1
      
          Variable |        Obs        Mean    Std. Dev.       Min        Max
      -------------+---------------------------------------------------------
              pom1 |     10,000    15.75347    28.53738   .0801848   1424.538
      We know declare our outcome variable y as a survival outcome:

      Code:
      stset y
      If Marina has a good idea about the model specification for her outcomes, she could possibly use a simple regression adjustment estimator that is specified with stteffects ra:

      Code:
      . stteffects ra (x1 x2 x3 x4 x5, gamma) (treat)
      
               failure _d:  1 (meaning all fail)
         analysis time _t:  y
      
      Iteration 0:   EE criterion =  1.889e-25  
      Iteration 1:   EE criterion =  8.043e-28  
      
      Survival treatment-effects estimation           Number of obs     =     10,000
      Estimator      : regression adjustment
      Outcome model  : gamma
      Treatment model: none
      Censoring model: none
      ------------------------------------------------------------------------------
                   |               Robust
                _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
      ATE          |
             treat |
         (1 vs 0)  |   14.04016   .2967501    47.31   0.000     13.45854    14.62178
      -------------+----------------------------------------------------------------
      POmean       |
             treat |
                0  |   1.537638   .0233885    65.74   0.000     1.491798    1.583479
      ------------------------------------------------------------------------------
      In addition to the estimated average treatment effect and potential outcome mean of the controls, we could also display the gamma model parameters:

      Code:
      . stteffects, aequations
      
      Survival treatment-effects estimation           Number of obs     =     10,000
      Estimator      : regression adjustment
      Outcome model  : gamma
      Treatment model: none
      Censoring model: none
      ------------------------------------------------------------------------------
                   |               Robust
                _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
      ATE          |
             treat |
         (1 vs 0)  |   14.04016   .2967501    47.31   0.000     13.45854    14.62178
      -------------+----------------------------------------------------------------
      POmean       |
             treat |
                0  |   1.537638   .0233885    65.74   0.000     1.491798    1.583479
      -------------+----------------------------------------------------------------
      OME0         |
                x1 |   .5526123   .0367705    15.03   0.000     .4805435    .6246811
                x2 |   .5079774   .0470875    10.79   0.000     .4156876    .6002672
                x3 |   .5192854   .0206559    25.14   0.000     .4788006    .5597701
                x4 |   .4696835   .0371511    12.64   0.000     .3968687    .5424983
                x5 |   .5019569   .0103495    48.50   0.000     .4816723    .5222414
             _cons |  -1.544242   .0400991   -38.51   0.000    -1.622835   -1.465649
      -------------+----------------------------------------------------------------
      OME0_lnshape |
             _cons |  -.8329523   .0167625   -49.69   0.000    -.8658061   -.8000984
      -------------+----------------------------------------------------------------
      OME1         |
                x1 |   .8987762   .0169651    52.98   0.000     .8655252    .9320272
                x2 |   .8910722   .0219357    40.62   0.000      .848079    .9340653
                x3 |   .8920407   .0097731    91.28   0.000     .8728857    .9111956
                x4 |   .9378418   .0169833    55.22   0.000      .904555    .9711285
                x5 |   .8938635   .0048658   183.70   0.000     .8843266    .9034004
             _cons |  -1.005742   .0202977   -49.55   0.000    -1.045525   -.9659594
      -------------+----------------------------------------------------------------
      OME1_lnshape |
             _cons |  -.8033248   .0074892  -107.26   0.000    -.8180034   -.7886462
      ------------------------------------------------------------------------------
      We can see that all parameters are close to our specified population parameters. Marina could also fit a doubly-robust model:

      Code:
      . stteffects ipwra (x4, gamma) (treat x1 x2 x3, logit)
      
               failure _d:  1 (meaning all fail)
         analysis time _t:  y
      
      Iteration 0:   EE criterion =  3.753e-16  
      Iteration 1:   EE criterion =  4.416e-29  
      
      Survival treatment-effects estimation           Number of obs     =     10,000
      Estimator      : IPW regression adjustment
      Outcome model  : gamma
      Treatment model: logit
      Censoring model: none
      ------------------------------------------------------------------------------
                   |               Robust
                _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
      ATE          |
             treat |
         (1 vs 0)  |   14.08699   .3628709    38.82   0.000     13.37578    14.79821
      -------------+----------------------------------------------------------------
      POmean       |
             treat |
                0  |    1.54512   .0356692    43.32   0.000      1.47521     1.61503
      ------------------------------------------------------------------------------
      For more information about Stata's stteffects commands, I suggest to have a look at help stteffects and the [TE] Treatment Effects manual.

      Best,
      Joerg

      Comment


      • #4
        Thank you so much Stephen and Joerg. Joerg, I will try your suggestion. I really appreciate your help!

        Comment


        • #5
          Dear Joerg: I will use a doubly-robust model as I want to specify both the treatment and the outcome models and obtain the corresponding parameter estimates for each. I have one last question: How can I obtain the individual level predicted probability of treatment (i.e. probability of treatment=yes) and the predicted outcome (predicted costs per patient in my case)? Thanks again for your help!

          Comment


          • #6
            Stata's stteffects has a number of postestimation commands, including predict. The predicted probability of receiving the treatment can be calculated by using predict with the ps option. If active treatment is coded 1, then this would be something like:
            Code:
            predict ps, ps tlevel(1)
            The predicted outcome can be calculated by specifying predict with the cmean option, for example:
            Code:
            predict mu, cmean
            I suggest to have a look at help stteffects postestimation for further information about predict and other postestimation commands for stteffects.

            Best,
            Joerg

            Comment


            • #7
              Dear Joerg: When I run the code that you're suggesting, I get an error message: "outcome-model variable list cannot include the treatment variable". So, the treatment variable (in our case, having all tests or not) cannot be included as an independent variable in the outcome model (cost model in our case)? Thanks so much for all your help!

              Comment


              • #8
                It looks like you have included the treatment group indicator in your outcome model specification. However, the only place you should put the treatment indicator here is as the outcome of the treatment assignment model, as in the examples I showed. The basic idea of the inverse-probability weighted regression adjustment estimator is to calculate the inverse-probability-of-treatment weights by means of the treatment assignment model, and then use those weights for the regression adjustment part which consists of the outcome model being fit conditional on treatment group, weighted by the inverse of the probability of receiving the respective treatment. Thus, the treatment indicator is not used as a predictor variable in the outcome model.

                Comment


                • #9
                  Dear Joerg: Thanks so much for your reply, I helps a lot! One last question: I'm having troubling with model convergence. Do you have any recommendations? Thanks again!

                  Comment


                  • #10
                    I would start by trying to fit the constituent models separately. That is, first fit the treatment assignment model and calculate the inverse-probability weights from the results of this fit, then use the weights with the regression adjustment models fitted separately to data of controls and treated. Make sure to use the same analysis sample as you used with stteffects. If one or more of these models do not converge then have a close look at the particular model and the data that is used to fit the model. Sometimes identification issues arise from limitations of a particular dataset so run a lot of checks such as inspecting descriptive statistics and cross-tabulations of your variables in order to ensure that the parameters that you want to estimate are actually identified given the data. If all three models should converge, then carefully inspect the results of all three models including point estimates and standard errors. If, for example, you observe missing standard errors for one or more of your parameters, then this would also point to some identification problem.

                    As an example, say we use the synthetic dataset from #3 and have the following stteffects specification:
                    Code:
                    stteffects ipwra (x1 x2 x3 x4, gamma) (treat x1 x2 x3, logit)
                    In order to fit the three constituent models outside of stteffects, we can use Stata's logit command to fit the treatment model (if the treatment model was specified as a probit or heteroskedastic probit model, we would use probit or hetprobit accordingly), and then glm with gamma family and log link.
                    Code:
                    * Fitting treatment assignment model:
                    logit treat x1 x2 x3
                    
                    * Calculating inverse-probability weights:
                    predict double ps, pr
                    gen double ipw = 1.treat/ps + 0.treat/(1-ps)
                    
                    * Fitting outcome model for controls:
                    glm y x1 x2 x3 x4 if treat == 0 [pw=ipw], family(gamma) link(log)
                    
                    * Fitting outcome model for treated:
                    glm y x1 x2 x3 x4 if treat == 1 [pw=ipw], family(gamma) link(log)
                    Notice that, once you determined the cause of the convergence problems and if you are able to fix the issue, you should go back and fit your model with stteffects. While we can replicate the correct point estimates for all parameters including potential outcome means and average treatment effect by fitting separate models, the standard errors will be incorrect because we did not take into account that the weights are estimated.

                    Best,
                    Joerg

                    Comment


                    • #11
                      Dear Joerg: Thanks so much for your reply. I tried the approach you suggested and it turns out that the three constituent models converge, but when I put them together in stteffects the resulting framework doesn't converge. I wonder if the default link for the gamma distribution in stteffects is the log. Thanks again.

                      Comment


                      • #12
                        Marina: if possible, please send your dataset and code to [email protected] and we will take a look at this.

                        Joerg

                        Comment


                        • #13
                          Dear Joerg: Thanks for your reply. Unfortunately, that's not possible because I'm using data from the Veterans Administration. I wonder if the default link for the gamma distribution in stteffects is the log... thanks so much for all your help!

                          Comment


                          • #14
                            Yes, using GLM language we could say that we essentially use a log link function for the gamma model in stteffects, although we don't have link functions in the GLM sense in this context. Have you tried to use only a few predictor variables in your stteffects specification? I would recommend trying that and then add variables one at a time. This could provide some hints regarding convergence/identification issues inasmuch as these issues are related to one or more of your variables. Also, you may want to consider posting some of your code and output here so I or others may be able to spot something.

                            Joerg

                            Comment


                            • #15
                              Dear Joerg:

                              Sorry about bothering you again. Finally the model run correctly. Yet, I have two questions regarding a post-estimation command:
                              1. I need to estimate for each subject the expected costs (costs are our outcome of interest) if she was treated and also if she was not. I've reviewed testteffects post-estimation information. Is the code predict mu, cmean tlevel (1,2) correct in this case?
                              2. I also need to estimate the same as in point 1 if one independent variable that appears both in the treatment equation and the main equation was one unit higher. Do you know how I could do that? I really appreciate all your help.
                              Thanks,
                              Marina
                              Last edited by Marina Soley; 29 Mar 2016, 20:03.

                              Comment

                              Working...
                              X