Hi,
I am working on a project in which I am trying to assess the impact of smoking cessation on long-term mortality in a specific patient population. My issue is that, while I have complete data on whether the patients were smokers initially (1 if yes, 0 if no), I don't have complete data on whether they quit smoking within a given time frame. I have roughly 1100 smokers, and I have cessation data (1 if they quit, 0 if they did not) for roughly 800 of them. To be clear, I have long-term mortality data (censored) for all 1100.
Up until now, I have been using propensity score adjustment. My syntax has been of the form:
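* propensity model for quitting
logit quitteryn [varlist = variables associated with the outcome of death]
* predicted probability of quitting (the propensity score)
predict quitterps, pr
* Cox model adjusted for the propensity score (data already stset)
stcox quitteryn quitterps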
(I have been led to believe that this is the correct methodology).
I have been trying to determine how best to account for those patients for whom I do not have treatment data (i.e. quitteryn == .). However, I'll be honest that I'm a bit lost, as this is my first time encountering this type of analysis. It would appear that inverse probability weighting (IPW) would provide a solution, but I am uncertain whether I need two layers of IPW: one that accounts for differences between the treatment groups (quitteryn == 1 vs. quitteryn == 0) and another that accounts for the missing data (quitteryn ~= . vs. quitteryn == .). Also, if I do need to do this, I really have no idea what the syntax should be.
I've seen IPW used in conjunction with stteffects and seem to have been able to make that code work. However, as my primary results are in the form of a hazard ratio, I would love to be able to incorporate the correction into a Cox model (i.e. stcox). Hazard ratios are simply the reporting standard being used.
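In case it helps to frame the question, here is a rough, untested sketch of what I imagine the two-weight version feeding into stcox might look like. The names smoker, time, died, age, and female are placeholders rather than my actual variables, and multiplying the two weights together is only my assumption about how the layers would combine, not something I have confirmed.

```stata
* Placeholder covariate list (hypothetical names)
local covars age female

* Layer 1: probability that cessation status is observed, among smokers
generate byte observed = !missing(quitteryn) if smoker == 1
logit observed `covars' if smoker == 1
predict double p_obs if smoker == 1, pr
generate double w_miss = 1/p_obs if observed == 1

* Layer 2: probability of quitting, among smokers with observed status
logit quitteryn `covars' if observed == 1
predict double p_quit if observed == 1, pr
generate double w_trt = cond(quitteryn == 1, 1/p_quit, 1/(1 - p_quit)) if observed == 1

* Combined weight (assumption: the product of the two layers)
generate double w_ipw = w_miss * w_trt

* pweights are declared in stset; stcox then reports robust standard errors
stset time [pweight = w_ipw], failure(died)
stcox quitteryn
```

As far as I understand, the robust standard errors here would treat the weights as fixed rather than estimated, so I am also unsure whether bootstrapping the whole procedure would be needed.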
Any assistance or explanation (preferably along with Stata syntax) would be much appreciated! Thanks in advance.
~ Dave