accounting for missing treatment with propensity score adjustment

David Biery

Join Date: Jul 2018

Posts: 1
#1

08 Jul 2018, 11:14

Hi,

I am working on a project in which I am trying to assess the impact of smoking cessation on long-term mortality rates in a specific patient population. My issue is that, while I have complete data on whether or not the patients were smokers initially (i.e. 1 if yes, 0 if no), I don't have complete data on whether they quit smoking within a given time frame. I have roughly 1100 smokers, but cessation data (i.e. 1 if they quit, 0 if they did not quit) for only roughly 800 of them. To be clear, I have long-term mortality data (censored) for the full 1100.

Up until now, I have been using propensity score adjustment. My syntax has been of the form:

logit quitteryn [varlist = variables associated with the outcome of death]
predict quitterps
stcox quitteryn quitterps

(I have been led to believe that this is the correct methodology).

I have been trying to determine how best to account for those patients for whom I do not have treatment data (i.e. quitteryn == .). However, I'll be honest that I'm a bit lost, as this is my first time encountering this type of analysis. It would appear that IPW would provide a solution, but I am uncertain whether I need two layers of IPW: one that accounts for differences in the type of treatment (i.e. quitteryn == 1 vs quitteryn == 0) and another that accounts for the existence of missing data (i.e. quitteryn != . vs quitteryn == .). Also, if I do need to do this, I really have no idea what the syntax should be.

I've seen IPW used in conjunction with stteffects and seem to have been able to make that code work. However, as my primary results are in the form of a hazard ratio, I would love to be able to incorporate the correction into a Cox model (i.e. stcox). Hazard ratios are simply the reporting standard being used.
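For illustration only, the two-layer weighting described above could be sketched roughly as follows. The names followup, died, and x1 x2 x3 are hypothetical placeholders (only quitteryn comes from the post), the data are assumed to contain only the baseline smokers, and the sketch assumes that missingness of quitteryn depends only on the measured covariates.

```stata
* Hypothetical names: followup (time), died (event), x1 x2 x3 (covariates)
* Assumes the dataset is restricted to the ~1100 baseline smokers

* Layer 1: probability that cessation status was recorded
generate byte observed = !missing(quitteryn)
logit observed x1 x2 x3
predict double p_obs, pr

* Layer 2: probability of quitting, among those with recorded status
logit quitteryn x1 x2 x3 if observed
predict double p_quit if observed, pr

* Combined weight: inverse probability of being observed times
* inverse probability of the treatment actually received
generate double w_ipw = (1/p_obs) * cond(quitteryn == 1, 1/p_quit, 1/(1 - p_quit)) if observed

* Weighted Cox model; pweights are declared in stset, not in stcox
stset followup if observed [pweight = w_ipw], failure(died)
stcox quitteryn
```

With pweights the reported standard errors are robust, but they do not account for the weights themselves having been estimated; bootstrapping the whole sequence is one way to address that.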

Any assistance or explanation (preferably along with Stata syntax) would be much appreciated! Thanks in advance.

~ Dave
Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#2

09 Jul 2018, 11:09

You'll increase your chances of a useful answer by following the FAQ on asking questions - provide Stata code in code delimiters, readable Stata output, and sample data using dataex.

There is a massive literature on missing data. Your technique is certainly not what you want - you're using predicted values even where you have actual values. You could look at gsem's approaches to missing data, or at multiple imputation.
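As a rough sketch of the multiple imputation route, something like the following might serve as a starting point. The names followup, died, and x1 x2 x3 are hypothetical placeholders, the covariates are assumed to be complete, and the number of imputations and the seed are arbitrary. Including the event indicator and the Nelson-Aalen cumulative hazard in the imputation model is a commonly suggested choice for survival outcomes, not something taken from this thread.

```stata
* Hypothetical names: followup (time), died (event), x1 x2 x3 (covariates)
* Assumes the dataset is restricted to the baseline smokers

mi set wide
mi stset followup, failure(died)

* Nelson-Aalen cumulative hazard, used as a predictor in the imputation model
sts generate na_haz = na

* Declare the incompletely observed treatment indicator and impute it
mi register imputed quitteryn
mi impute logit quitteryn x1 x2 x3 died na_haz, add(20) rseed(12345)

* Fit the Cox model in each imputed dataset and combine the results;
* the hr option reports hazard ratios rather than coefficients
mi estimate, hr: stcox quitteryn x1 x2 x3
```

This relies on the cessation data being missing at random given the variables in the imputation model.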
Chinmay Sharma

Join Date: Nov 2015

Posts: 351
#3

09 Jul 2018, 11:33

1) I would recommend first examining the nature of the missingness: data can be missing in several ways, and the appropriate solution differs between them. For instance, data can be MCAR (missing completely at random), MAR (missing at random), or MNAR (missing not at random).
2) You may also have observations of the "incomplete spell" type, i.e. censored spells. Duration/hazard models can naturally incorporate such spells.
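A first look at the missingness pattern suggested in point 1 might go along these lines. This is only a sketch: followup, died, and x1 x2 x3 are hypothetical placeholder names, and the data are assumed to contain only the baseline smokers.

```stata
* Hypothetical names: followup (time), died (event), x1 x2 x3 (covariates)

* Flag patients whose cessation status is missing
generate byte miss_quit = missing(quitteryn)
tabulate miss_quit

* Do the measured covariates predict missingness?
logit miss_quit x1 x2 x3

* Does survival differ between patients with and without cessation data?
stset followup, failure(died)
sts test miss_quit
```

If covariates or survival are associated with miss_quit, that is evidence against MCAR. Checks of this kind cannot distinguish MAR from MNAR, since that depends on the unobserved values themselves.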