Propensity score analysis for complex survey data using PSCORE, PSMATCH2, etc

anwar dudekula

Join Date: Apr 2014

Posts: 61
#1


21 Aug 2014, 11:34

Hello,

In response to my previous post (http://www.statalist.org/forums/foru...urvey-data-set), I was told that TEFFECTS is not set up for use with survey data.

Hence I am posting this question with a similar request.

There have been a few questions about this on Statalist in the past, but there was never a concrete answer.

http://www.stata.com/statalist/archi.../msg00011.html
http://www.stata.com/statalist/archi.../msg01239.html

Can someone point to an example (with actual Stata steps/commands) where propensity score analysis is used in the analysis of a survey data set?

Thank you,
Sincerely,
Anwar
Melissa Garrido

Join Date: Apr 2014

Posts: 75
#2

21 Aug 2014, 13:55

Hi Anwar,

Please see this recent thread on Statalist: http://www.statalist.org/forums/foru...pensity-scores

The DuGoff article referenced on that thread is accompanied by an online appendix with some Stata commands, though they do the bulk of their propensity score estimation in R.

Hope this helps,
Melissa
anwar dudekula

Join Date: Apr 2014

Posts: 61
#3

21 Aug 2014, 18:09

Hi Melissa,

Thanks for the information. I have reviewed the post by Frank Lopresti and your answer, and I have reviewed the paper. Indeed, the authors calculate most of the propensity scores in an R package and pass them on to Stata for analysis. I was hoping that Frank Lopresti was able to get Stata code for it and, if so, that he could post it for the benefit of Statalist members.

Thank you,
Anwar
Melissa Garrido

Join Date: Apr 2014

Posts: 75
#4

22 Aug 2014, 08:11

You can estimate a propensity score weight through the traditional Stata commands (-teffects- or the user-written -pscore- and -psmatch2-), which are outlined elsewhere (See http://onlinelibrary.wiley.com/doi/1...12182/suppinfo as well as the help files for each command).

DuGoff et al. (HSR 2014 Feb;49(1):284-303) recommend including the survey weight as a covariate in the propensity score model. (They argue that the propensity score model itself does not need to be weighted.) Once you have calculated a propensity score weight, they recommend multiplying that weight by the survey weight. You would then run an outcome model that is svyset by this new weight variable.

Here is a stylized (untested) example using user-written commands. Assume "svyweight" is the survey weight variable. The variables mypscore, myblock, iptwt, sumofweights, norm_weights, and newweight are constructed (not in your dataset).

1. Create propensity score, including svyweight as one of the covariates.

pscore treatment covariate1 covariate2 … covariate# svyweight, pscore(mypscore) blockid(myblock) logit

2. Assess propensity score's balance across treatment and comparison groups (not shown).

3. Weight treatment and comparison groups by the propensity score (using covariates chosen from Steps 1 and 2).

qui dr outcome1 treatment covariate1 … covariate# svyweight, genvars
egen sumofweights = total(iptwt)
gen norm_weights = iptwt/sumofweights

4. Multiply propensity score weight by survey weight.

gen newweight = norm_weights*svyweight

5. Run outcome model, svyset by new weight variable.

This step is covered in Appendix B of DuGoff et al.
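
As a self-contained illustration of steps 1 through 5, here is a minimal sketch on simulated data. It is a stand-in under stated assumptions, not the DuGoff et al. appendix code: it swaps the user-written -pscore- and -dr- for official -logit- and -predict-, builds an inverse-probability-of-treatment weight by hand (named iptw here, which may not match how -dr- defines iptwt), and every variable name and the simulated design (region as strata, psu as cluster) is hypothetical.

```stata
* Sketch only: simulated data, hypothetical variable names throughout.
clear
set seed 12345
set obs 2000
gen region    = ceil(4*runiform())                  // stratum (1-4)
gen psu       = 100*region + ceil(10*runiform())    // cluster id, unique within stratum
gen svyweight = 100 + 400*runiform()                // stand-in survey weight
gen x1        = rnormal()
gen x2        = runiform() < 0.4
gen treatment = runiform() < invlogit(-0.5 + 0.6*x1 + 0.4*x2)
gen outcome1  = 1 + 0.5*treatment + 0.8*x1 + 0.3*x2 + rnormal()

* Step 1: unweighted propensity score model with the survey weight as a covariate
logit treatment x1 x2 svyweight
predict mypscore, pr

* Step 2: balance assessment would go here (not shown)

* Step 3: inverse-probability-of-treatment weight, normalized as in the steps above
gen iptw = cond(treatment == 1, 1/mypscore, 1/(1 - mypscore))
egen sumofweights = total(iptw)
gen norm_weights = iptw/sumofweights

* Step 4: multiply the propensity score weight by the survey weight
gen newweight = norm_weights*svyweight

* Step 5: declare the design with the combined weight and fit the outcome model
svyset psu [pweight = newweight], strata(region)
svy: regress outcome1 i.treatment
```

In this sketch the coefficient on 1.treatment is the weighted difference in mean outcome between the two groups, with standard errors that reflect the declared strata and clusters.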

Hope this helps,
Melissa

Last edited by Melissa Garrido; 22 Aug 2014, 08:21. Reason: Posted before finished typing
anwar dudekula

Join Date: Apr 2014

Posts: 61
#5

22 Aug 2014, 10:45

Hi Melissa,

Thank you very much for taking the time to explain the steps. As I am not an expert in Stata, it will take some time for me to go through them!
Thank you again very much for your time.

With propensity score methodology being used so frequently, especially in the medical literature, it would be great if any Stata experts could write a user-written command for propensity score analysis of complex survey data.

Sincerely,
Anwar
anwar dudekula

Join Date: Apr 2014

Posts: 61
#6

02 Sep 2014, 10:21

Hello Melissa,
I have a survey sample and I am using a subpopulation of the data set for my analysis. With this in mind, can you please suggest how I would run steps 1 through 5? Would the following steps be appropriate?

Step-1: I create a variable (0/1) for the subpopulation of interest: subpop (which is ==1 for the subpopulation of interest)

Step-2-A: pscore treatment covariate1 covariate2 … covariate# svyweight if subpop==1, pscore(mypscore) blockid(myblock) logit

Step-2-B: Assess propensity score's balance across treatment and comparison groups

Step-3:

qui dr outcome1 treatment covariate1 … covariate# svyweight if subpop==1, genvars

egen sumofweights = total(iptwt) if subpop==1

gen norm_weights = iptwt/sumofweights if subpop==1

Step-4: Multiply propensity score weight by survey weight.

gen newweight = norm_weights*svyweight if subpop==1

Step-5 : Run outcome model, svyset by new weight variable per Appendix B of DuGoff et al.

Thank you,
Sincerely,
Anwar
Melissa Garrido

Join Date: Apr 2014

Posts: 75
#7

02 Sep 2014, 12:58

Hi Anwar,
If you are using a subpopulation of your data, you may want to use the -subpop- option that is part of Stata's survey commands to get appropriate variance calculations. You may want to see http://www.stata.com/support/faqs/st...-zero-weights/
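
To make the -subpop- suggestion concrete, here is a minimal sketch on simulated data. All variable names (psu, region, newweight, subpop, treatment, outcome1) are hypothetical, and newweight is just a random stand-in for the combined propensity score and survey weight. The point it illustrates is that the full sample stays in memory for variance estimation, while estimates are reported only for observations with subpop==1.

```stata
* Sketch only: simulated data, hypothetical variable names throughout.
clear
set seed 2014
set obs 1000
gen region    = ceil(4*runiform())                  // stratum
gen psu       = 100*region + ceil(10*runiform())    // cluster id, unique within stratum
gen newweight = 1 + 4*runiform()                    // stand-in combined weight
gen subpop    = runiform() < 0.5                    // 1 = subpopulation of interest
gen treatment = runiform() < 0.4
gen outcome1  = 1 + 0.5*treatment + rnormal()

* Declare the design once, using all observations
svyset psu [pweight = newweight], strata(region)

* Estimate within the subpopulation without dropping the other observations
svy, subpop(subpop): regress outcome1 i.treatment
```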
Hope this helps,
Melissa
Angie Rose

Join Date: May 2015

Posts: 2
#8

07 May 2015, 13:17

Dear Melissa (or anyone else who can help!)

I'm using PS in a perhaps unusual way: not to measure a treatment effect, but to observe the effect of "being a case" (about 250 patients on a population-based register) on various outcomes, in order to assess cost from a healthcare provider perspective, e.g. length of hospital stay, use of free medications, number of surgical procedures. The comparison group (about 1200 non-cases) are participants from a randomly selected population survey (multi-stage cluster design). So here's my first question: I should account for survey design by using survey weights, right? But is it OK that only the non-cases were "sampled" per se? I have given each "case" a "sample weight" of 1; is that correct?

Secondly, I have found the Stata code Melissa gave above very helpful but I'm finding it difficult to do step 5 (Run outcome model, svyset by new weight variable) as when I use pbalchk (which I think I need as it’s IPTW) it does not allow svy: as the prefix. So I get the error message "pbalchk is not supported by svy with vce(linearized); see help svy estimation for a list of Stata estimation commands that are supported by svy r(322);”.

Any thoughts or advice much appreciated.

warm regards
Angie

P.S. Here is the code I used:

qui dr outcome case covariate1 … covariate# , genvars
egen sumofweights = total(iptwt)
gen norm_weights = iptwt/sumofweights
gen newweight = norm_weights*svyweight
svyset PSU [pweight=newweight] , strata(region)

svy: pbalchk outcome case covariate1 … covariate#, wt(newweight) graph
Melissa Garrido

Join Date: Apr 2014

Posts: 75
#9

08 May 2015, 09:37

Hi Angie,
For your first question, the answer is 'it depends'. If you want to generalize your results to the survey sample from which your comparison group was drawn, you could make a case for ignoring the survey weights. However, if you want to generalize to the larger population from which the survey sample and comparison group were selected, then you should account for the survey design.

If you decide to use survey weights, to check the balance of your propensity score (which is the purpose of the user-written -pbalchk- ), you'll likely have to calculate each covariate's means/standard deviations across the treatment and comparison groups individually. Others may be able to help you with this step. If you re-post this as a new thread, since this is a new question, you'll likely get more informative answers from a greater range of Statalist users.
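
As one possible way to do that by hand, here is a minimal sketch that computes weighted means and variances of a covariate in each group and reports a standardized difference. It is an assumption on my part rather than what -pbalchk- does internally; the variable names are hypothetical, the data are simulated, and newweight is a random stand-in for the combined weight.

```stata
* Sketch only: simulated data, hypothetical variable names throughout.
clear
set seed 2015
set obs 1000
gen x1        = rnormal()
gen x2        = runiform() < 0.4
gen treatment = runiform() < invlogit(0.5*x1)
gen newweight = 1 + 4*runiform()                    // stand-in combined weight

foreach v of varlist x1 x2 {
    * Weighted mean and variance in the treatment group
    quietly summarize `v' [aweight = newweight] if treatment == 1
    scalar mean_t = r(mean)
    scalar var_t  = r(Var)
    * Weighted mean and variance in the comparison group
    quietly summarize `v' [aweight = newweight] if treatment == 0
    scalar mean_c = r(mean)
    scalar var_c  = r(Var)
    * Standardized difference: mean difference over the pooled SD
    display "`v': weighted std. diff = " %6.3f (mean_t - mean_c)/sqrt((var_t + var_c)/2)
}
```

Looping over the full covariate list gives one standardized difference per covariate, which can be compared before and after weighting.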

Hope this helps,

Melissa
Angie Rose

Join Date: May 2015

Posts: 2
#10

08 May 2015, 12:16

Thanks so much Melissa - yes I do want to generalise to the population. However after reading your response a lightbulb went off and I realise my error now - I was trying to use the svy command for the balance check rather than in the next step, i.e. when estimating the outcome! If I still have problems after this, I will post to a new thread, sorry about that.

Thanks once again - much appreciated
Angie
Jacki STrenio

Join Date: Oct 2017

Posts: 3
#11

19 Oct 2017, 15:30

I've also been using PSM to estimate the treatment effect but now want to incorporate the complex survey design. I've followed these methods but am confused about interpretation. Normally, when I complete PSM, I get a coefficient that is the difference in probability of a binary outcome in percentage points between my treatment and control groups. However, when I follow this method, I get normal odds ratios. What is the proper interpretation of them? The difference in odds between the treated and control group?

My code (per Melissa's recommendation):
pscore treatment $controls svyweight if subpopfem==1, pscore(mypscore1) blockid(myblock) logit

qui dr outcome treatment $controls surveyweight if subpop==1, genvars

egen sumofweights=total(iptwt)
gen norm_weights=iptwt/sumofweights

gen newweight=norm_weights*svyweight

svyset psuscid [weight=newweight], strata(region)

Then my outcome analysis using svy:

svy, subpop(subpop): logit outcome treatment $controls, or

Thank you in advance.
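
On the interpretation question: with the or option, -logit- reports ratios of odds, not differences in probability. A percentage-point difference can usually be recovered from the same model with -margins-. The sketch below is illustrative only, on simulated data with hypothetical variable names (newweight is a random stand-in for the combined weight); treatment is entered as i.treatment so that -margins- reports the discrete change in predicted probability.

```stata
* Sketch only: simulated data, hypothetical variable names throughout.
clear
set seed 2017
set obs 2000
gen region    = ceil(4*runiform())                  // stratum
gen psu       = 100*region + ceil(10*runiform())    // cluster id, unique within stratum
gen newweight = 1 + 4*runiform()                    // stand-in combined weight
gen subpop    = runiform() < 0.5
gen x1        = rnormal()
gen treatment = runiform() < invlogit(0.5*x1)
gen outcome   = runiform() < invlogit(-0.5 + 0.7*treatment + 0.5*x1)

svyset psu [pweight = newweight], strata(region)

* Odds ratios: multiplicative change in the odds of the outcome
svy, subpop(subpop): logit outcome i.treatment x1, or

* Average difference in predicted probability, treated vs. comparison
margins, dydx(treatment) subpop(subpop) vce(unconditional)
```

The -margins- result is on the probability scale, so multiplying it by 100 gives a difference in percentage points.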
Jimmy Floyd

Join Date: May 2018

Posts: 11
#12

21 Apr 2019, 13:34

Did you get an answer, Jacki?
Aybuke Koyuncu

Join Date: May 2019

Posts: 2
#13

13 Aug 2019, 22:11

Hi Melissa,

Your guidance is very helpful, thank you. I am trying to implement the proposed code for step #2 but am having trouble with the "dr" command:

qui dr outcome1 treatment covariate 1… covariate # svyweight, genvars

I am using STATA 15, and dr is not recognized and I can't seem to find a package I need to install when I search for it. Apologies if I am missing something very obvious but your input would be greatly appreciated!

David Radwin

Join Date: Mar 2014
Posts: 369

#14

14 Aug 2019, 10:39

Originally posted by Aybuke Koyuncu:

I am using STATA 15, and dr is not recognized and I can't seem to find a package I need to install when I search for it.

Code:

. findit dr

. net sj 8-3 st0149

-----------------------------------------------------------------------------------------------------------
package st0149 from http://www.stata-journal.com/software/sj8-3
-----------------------------------------------------------------------------------------------------------

TITLE
      SJ8-3 st0149.  Implementing Double Robust Estimators of...

DESCRIPTION/AUTHOR(S)
      Implementing Double Robust Estimators of Causal Effects
      by Mark Lunt, arc Epidemiology Unit,
           The University of Manchester, UK
         Richard Emsley, Biostatistics,
           Health Methodology Research Group,
           The University of Manchester, UK
      Support:  [email protected]
      After installation, type help dr

INSTALLATION FILES                               (type net install st0149)
      st0149/dr.ado
      st0149/dr.hlp

ANCILLARY FILES                                  (type net get st0149)
      st0149/dr_example.dta
      st0149/sjpaper.do
-----------------------------------------------------------------------------------------------------------

. net install st0149.pkg

David Radwin
Senior Researcher, California Competes
californiacompetes.org
Pronouns: He/Him


Ayesha Bukar

Join Date: Apr 2018

Posts: 15
#15

19 Sep 2020, 13:07

Hi,

The above command is very useful. How about the case of 3-year pooled cross-sectional survey data, where each wave has its own sampling weights? How can we apply PSM to this kind of data?

Kind regards,
Ayesha.