
  • Propensity score analysis for complex survey data using PSCORE, PSMATCH2, etc

    Hello,

    In reply to my previous post (http://www.statalist.org/forums/foru...urvey-data-set), I was told that TEFFECTS is not set up for use with survey data.

    Hence I am posting this question with a similar request.

    There have been a few questions about this on Statalist in the past, but there was never a concrete answer.

    http://www.stata.com/statalist/archi.../msg00011.html
    http://www.stata.com/statalist/archi.../msg01239.html

    Can someone point to an example (with actual Stata steps/commands) where propensity score analysis is applied to a survey data set?

    Thank you,
    Sincerely,
    Anwar

  • #2
    Hi Anwar,

    Please see this recent thread on Statalist: http://www.statalist.org/forums/foru...pensity-scores

    The DuGoff article referenced on that thread is accompanied by an online appendix with some Stata commands, though they do the bulk of their propensity score estimation in R.

    Hope this helps,
    Melissa



    • #3
      Hi Melissa,

      Thanks for the information. I have reviewed the post by Frank Lopresti, your answer, and the paper. Indeed, the authors calculate most of the propensity scores in R and then pass them to Stata for analysis. I was hoping that Frank Lopresti was able to get Stata code for this and, if so, that he could post it for the benefit of Statalist members.

      Thank you,
      Anwar







      • #4
        You can estimate a propensity score weight through the traditional Stata commands (-teffects- or the user-written -pscore- and -psmatch2-), which are outlined elsewhere (See http://onlinelibrary.wiley.com/doi/1...12182/suppinfo as well as the help files for each command).

        DuGoff et al. (HSR 2014 Feb;49(1):284-303) recommend including the survey weight as a covariate in the propensity score model. (They argue that the propensity score model itself does not need to be weighted.) Once you have calculated a propensity score weight, they recommend multiplying that weight by the survey weight. You would then run an outcome model that is svyset by this new weight variable.

        Here is a stylized (untested) example using user-written commands. Assume "svyweight" is the survey weight variable. Constructed variables (those not in your dataset) are mypscore, myblock, iptwt, sumofweights, norm_weights, and newweight.

        1. Create propensity score, including svyweight as one of the covariates.

        pscore treatment covariate1 covariate2 … covariate# svyweight, pscore(mypscore) blockid(myblock) logit

        2. Assess propensity score's balance across treatment and comparison groups (not shown).

        3. Weight treatment and comparison groups by the propensity score (using covariates chosen from Steps 1 and 2).

        qui dr outcome1 treatment covariate1 covariate2 … covariate# svyweight, genvars
        egen sumofweights = total(iptwt)
        gen norm_weights = iptwt/sumofweights

        4. Multiply propensity score weight by survey weight.

        gen newweight = norm_weights*svyweight

        5. Run outcome model, svyset by new weight variable.

        This step is covered in Appendix B of DuGoff et al.
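
        As a rough sketch of steps 1 and 3 through 5 using only built-in commands, the propensity score and the inverse-probability-of-treatment weight can also be computed by hand rather than with -dr-. This is untested and rests on assumptions that are not in the thread: psu and stratum are hypothetical names for the design variables, treatment is coded 0/1, only two covariates are shown, and the weight targets the average treatment effect.

        ```stata
        * Hypothetical names: outcome, treatment (0/1), covariate1, covariate2,
        * svyweight, psu, stratum. Replace with the variables in your data.

        * Step 1: unweighted propensity score model, survey weight as a covariate
        logit treatment covariate1 covariate2 svyweight
        predict double mypscore if e(sample), pr

        * Step 3: inverse-probability-of-treatment weight (ATE form; an assumption)
        generate double iptw = cond(treatment == 1, 1/mypscore, 1/(1 - mypscore)) if e(sample)

        * Step 4: multiply the propensity score weight by the survey weight
        generate double newweight = iptw * svyweight

        * Step 5: declare the design with the combined weight, then fit the outcome model
        svyset psu [pweight = newweight], strata(stratum)
        svy: logit outcome i.treatment
        ```

        Dividing the weight by its total, as in step 3 above, rescales every observation by the same constant, so it should not change the point estimates from a pweighted model.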

        Hope this helps,
        Melissa
        Last edited by Melissa Garrido; 22 Aug 2014, 08:21. Reason: Posted before finished typing



        • #5
          Hi Melissa,

          Thank you very much for taking the time to explain the steps. As I am not an expert in Stata, it will take some time for me to go through them!
          Thank you again very much for your time.

          With propensity score methodology being used so frequently, especially in the medical literature, it would be great if any Stata experts could write a user-written command for propensity score analysis of complex survey data.

          Sincerely,
          Anwar



          • #6
            Hello Melissa,
            I have a survey sample and I am using a subpopulation of the data set for my analysis. With this in mind, can you please suggest how I would run steps 1 through 5? Would the following steps be appropriate?

            Step-1: I create a variable (0/1) for the subpopulation of interest: subpop (which is ==1 for the subpopulation of interest)

            Step-2-A: pscore treatment covariate1 covariate2 … covariate# svyweight if subpop==1, pscore(mypscore) blockid(myblock) logit

            Step-2-B: Assess propensity score's balance across treatment and comparison groups

            Step-3:

            qui dr outcome1 treatment covariate1 covariate2 … covariate# svyweight if subpop==1, genvars

            egen sumofweights = total(iptwt) if subpop==1

            gen norm_weights = iptwt/sumofweights if subpop==1

            Step-4: Multiply propensity score weight by survey weight.

            gen newweight = norm_weights*svyweight if subpop==1

            Step-5 : Run outcome model, svyset by new weight variable per Appendix B of DuGoff et al.

            Thank you,
            Sincerely,
            Anwar



            • #7
              Hi Anwar,
              If you are using a subpopulation of your data, you may want to use the -subpop- option that is part of Stata's survey commands to get appropriate variance calculations. You may want to see http://www.stata.com/support/faqs/st...-zero-weights/
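
              A minimal, untested sketch of the -subpop()- approach, assuming the design has already been declared with svyset and that subpop is a 0/1 indicator (outcome and treatment are placeholder names):

              ```stata
              * Assumes svyset has already been run with the combined weight.
              * Keep all observations in memory: do not drop cases or restrict with "if".
              svy, subpop(subpop): logit outcome i.treatment
              ```

              Restricting the estimation with an if qualifier instead of subpop() treats the subpopulation as though it were the whole sample, which can give incorrect standard errors.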
              Hope this helps,
              Melissa



              • #8
                Dear Melissa (or anyone else who can help!)

                I'm using PS in an unusual way perhaps, as it's not to measure treatment effect but to observe the effect of "being a case" (about 250 patients on a population-based register) on various outcomes to assess cost from a healthcare provider perspective, eg length of hospital stay, use of free medications, number of surgical procedures. The comparison group (about 1200 non-cases) are participants from a randomly selected population survey (multi-stage cluster design). So here's my first question: I should account for survey design by using survey weights, right? But is it OK that only the non-cases were "sampled" per se – so I have given each “case” a “sample weight”=1, is that correct?

                Secondly, I have found the Stata code Melissa gave above very helpful but I'm finding it difficult to do step 5 (Run outcome model, svyset by new weight variable) as when I use pbalchk (which I think I need as it’s IPTW) it does not allow svy: as the prefix. So I get the error message "pbalchk is not supported by svy with vce(linearized); see help svy estimation for a list of Stata estimation commands that are supported by svy r(322);”.

                Any thoughts or advice much appreciated.

                warm regards
                Angie

                ps here is the code I used:

                qui dr outcome case covariate1 … covariate#, genvars
                egen sumofweights = total(iptwt)
                gen norm_weights = iptwt/sumofweights
                gen newweight = norm_weights*svyweight
                svyset PSU [pweight=newweight] , strata(region)

                svy: pbalchk outcome case covariate1 … covariate#, wt(newweight) graph



                • #9
                  Hi Angie,
                  For your first question, the answer is 'it depends'. If you want to generalize your results to the survey sample from which your comparison group was drawn, you could make a case for ignoring the survey weights. However, if you want to generalize to the larger population from which the survey sample and comparison group were selected, then you should account for the survey design.

                  If you decide to use survey weights, to check the balance of your propensity score (which is the purpose of the user-written -pbalchk- ), you'll likely have to calculate each covariate's means/standard deviations across the treatment and comparison groups individually. Others may be able to help you with this step. If you re-post this as a new thread, since this is a new question, you'll likely get more informative answers from a greater range of Statalist users.
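
                  One possible way to do that covariate-by-covariate check with built-in commands is sketched below. It is untested and assumes the design has already been svyset with the combined weight; case, covariate1, and covariate2 are placeholder names.

                  ```stata
                  * Weighted mean of each covariate within cases and non-cases
                  svy: mean covariate1 covariate2, over(case)

                  * Standard deviations for the same groups, based on the results above
                  estat sd

                  * Design-based test of whether a covariate's weighted mean differs by group
                  svy: regress covariate1 i.case
                  ```

                  The regression step would need to be repeated for each covariate in turn.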

                  Hope this helps,

                  Melissa



                  • #10
                    Thanks so much Melissa - yes I do want to generalise to the population. However after reading your response a lightbulb went off and I realise my error now - I was trying to use the svy command for the balance check rather than in the next step, i.e. when estimating the outcome! If I still have problems after this, I will post to a new thread, sorry about that.

                    Thanks once again - much appreciated
                    Angie



                    • #11
                      I've also been using PSM to estimate the treatment effect but now want to incorporate the complex survey design. I've followed these methods but am confused about interpretation. Normally, when I complete PSM, I get a coefficient that is the difference in probability of a binary outcome in percentage points between my treatment and control groups. However, when I follow this method, I get normal odds ratios. What is the proper interpretation of them? The difference in odds between the treated and control group?

                      My code (per Melissa's recommendation):
                      pscore treatment $controls svyweight if subpopfem==1, pscore(mypscore1) blockid(myblock) logit

                      qui dr outcome treatment $controls surveyweight if subpop==1, genvars

                      egen sumofweights=total(iptwt)
                      gen norm_weights=iptwt/sumofweights

                      gen newweight=norm_weights*svyweight

                      svyset psuscid [weight=newweight], strata(region)


                      Then my outcome analysis using svy:

                      svy, subpop(subpop): logit outcome treatment $controls, or

                      Thank you in advance.



                      • #12
                        Did you get an answer Jacki?



                        • #13
                          Hi Melissa,

                          Your guidance is very helpful, thank you. I am trying to implement the proposed code for step #2 but am having trouble with the "dr" command:

                          qui dr outcome1 treatment covariate1 covariate2 … covariate# svyweight, genvars

                          I am using STATA 15, and dr is not recognized and I can't seem to find a package I need to install when I search for it. Apologies if I am missing something very obvious but your input would be greatly appreciated!



                          • #14
                            Originally posted by Aybuke Koyuncu
                            I am using STATA 15, and dr is not recognized and I can't seem to find a package I need to install when I search for it.
                            Code:
                            . findit dr
                            
                            . net sj 8-3 st0149
                            
                            -----------------------------------------------------------------------------------------------------------
                            package st0149 from http://www.stata-journal.com/software/sj8-3
                            -----------------------------------------------------------------------------------------------------------
                            
                            TITLE
                                  SJ8-3 st0149.  Implementing Double Robust Estimators of...
                            
                            DESCRIPTION/AUTHOR(S)
                                  Implementing Double Robust Estimators of Causal Effects
                                  by Mark Lunt, arc Epidemiology Unit,
                                       The University of Manchester, UK
                                     Richard Emsley, Biostatistics,
                                       Health Methodology Research Group,
                                       The University of Manchester, UK
                                  Support:  [email protected]
                                  After installation, type help dr
                            
                            INSTALLATION FILES                               (type net install st0149)
                                  st0149/dr.ado
                                  st0149/dr.hlp
                            
                            ANCILLARY FILES                                  (type net get st0149)
                                  st0149/dr_example.dta
                                  st0149/sjpaper.do
                            -----------------------------------------------------------------------------------------------------------
                            
                            . net install st0149.pkg
                            David Radwin
                            Senior Researcher, California Competes
                            californiacompetes.org
                            Pronouns: He/Him



                            • #15
                              Hi,

                              The above command is very useful. What about 3-year pooled cross-sectional survey data, where each wave has its own sampling weights? How can we apply this to PSM for this kind of data?

                              Kind regards,
                              Ayesha.
