Propensity Score Matching for Case Control Study

Anindit Chhibber

Join Date: Jun 2018
Posts: 37

09 May 2019, 07:32

I am analyzing data for a case-control study of the association between use of the drug 'ACE1mo' and Pneumonia. I would like to match cases with controls at the time of their admission on age, gender, and CCI, and then look back in time at their exposure to 'ACE1mo'.

I am used to matching 'treated patients' with 'untreated patients' using the psmatch2 command and looking at the incidence of disease in a retrospective or prospective study design, but not to matching 'case patients' with 'control patients'. It would be a great help if someone could provide any insight as to whether the syntax below makes sense.

I will present the syntax I used for this study alongside what I usually use for a retrospective/prospective study, in which patients are instead matched on treatment status.

1) Create the propensity score using logistic regression: logistic Pneumonia Age Gender CCI
(For a prospective/retrospective study I usually use logistic ACE1mo Age Gender CCI to model the probability of an individual falling into the treated arm vs. the untreated arm.)

2) Matching process with replacement: psmatch2 Pneumonia, pscore(pscore) n(2) cal(0.20)
(For a prospective/retrospective study I usually use psmatch2 ACE1mo, pscore(pscore) n(2) cal(0.20).)

3) Effect estimation: psmatch2 Pneumonia Age Gender CCI, outcome(ACE1mo) logit
(For a prospective/retrospective study I usually use psmatch2 ACE1mo Age Gender CCI, outcome(Pneumonia) logit.)

Thanks!
Anindit Chhibber

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input long MRN byte Gender double Age byte(CCI Pneumonia ACE1mo)
 2949 0 70.579055  5 0 0
 2949 0 70.584531  5 0 0
 2949 0 70.587269  5 0 0
 2949 0 70.590007  5 0 0
 2972 1 74.360027  2 0 1
 2972 1 74.932238  3 0 0
 3897 1 69.021218  9 0 0
 3897 1 69.023956  9 0 0
 3897 1 69.631759 11 0 0
 3921 0 66.184805  2 0 0
 3921 0 69.420945  2 0 0
 3921 0 69.423682  2 0 0
 3921 0 69.437372  2 0 0
 6783 1     65.22 10 0 0
 6783 1     65.26 10 0 0
 6783 1     65.28 10 0 0
 6783 1     65.33 10 0 0
 6783 1     65.44 10 0 0
 6783 1     65.74 10 0 0
 6783 1     65.77 10 0 0
 6783 1     65.91 10 0 0
 9776 1 74.220397  2 0 0
 9776 1 74.223135  2 0 0
 9776 1 74.225873  2 0 0
 9776 1 74.228611  2 0 0
 9776 1 74.231348  2 0 0
 9776 1 74.234086  2 0 0
 9776 1 74.247775  2 0 0
 9776 1 74.250513  2 0 0
 9776 1 74.253251  2 0 0
10173 0     85.37  7 0 0
10173 0 86.384668  7 0 0
10173 0 86.436687  7 0 0
10173 0 86.603696  7 0 0
10173 0 87.460643  7 0 0
10173 0  87.59206  7 0 0
10173 0 88.591376  7 0 0
10173 0 89.059548 10 0 0
10173 0 89.062286 10 0 0
10173 0 89.065024 10 0 0
10173 0 89.067762 10 0 0
14837 1  71.12115  0 0 1
15107 1 67.069131  4 1 0
15107 1 68.147844  5 0 0
25692 1 67.665982  9 0 0
25692 1 67.674196  9 0 0
25692 1 67.685147  9 0 0
25940 0     93.57  3 0 0
25940 0 97.711157  3 0 0
25940 0 97.716632  3 0 0
26161 0 68.670773  5 0 0
26161 0 68.673511  5 0 1
26278 1 68.290212  2 0 0
32086 1 80.657084  4 0 0
32706 0     65.44  4 0 0
32706 0 68.588638  4 0 0
32946 0 84.180698  . 0 0
32946 0 84.183436  2 0 0
32946 0 84.186174  2 0 1
32946 0 84.188912  2 0 1
32946 0  84.19165  2 0 1
32946 0 84.194387  2 0 1
32946 0 84.199863  2 0 1
33100 1 67.512663  3 0 0
33100 1 67.512663  3 1 0
33100 1 67.627652  3 0 0
33100 1  67.63039  4 0 0
33100 1 67.633128  4 0 0
33100 1 67.635866  4 0 0
33100 1 67.638604  4 0 0
33100 1 67.641342  4 0 0
33100 1 67.644079  4 0 0
33100 1 67.646817  4 0 0
33100 1 67.649555  4 0 0
33100 1 67.655031  4 0 0
33100 1 67.657769  4 0 0
33100 1 67.660507  4 0 0
33100 1 67.663244  4 0 0
33100 1 67.665982  4 0 0
33100 1  67.66872  4 0 0
33100 1 67.671458  4 0 0
33100 1 67.674196  4 0 0
33100 1 67.676934  4 0 0
33100 1 67.685147  4 0 0
33100 1 67.687885  4 0 0
33100 1 67.698836  4 0 0
33589 1 67.353867  . 0 0
33589 1 67.356605  7 0 0
33589 1 67.359343  7 0 1
33589 1 67.362081  7 0 1
36426 0      88.2  4 0 0
36426 0     88.41  4 0 0
36426 0     88.42  4 0 0
39875 1     69.85  5 0 0
43067 0 75.014374  . 0 0
43067 0 75.017112  5 0 0
43216 1 80.065708  1 0 0
44859 0 77.114305  2 0 0
44859 0 77.117043  2 0 0
47134 0 66.086242  0 0 0
end


David Radwin

Join Date: Mar 2014

Posts: 369
#2

09 May 2019, 13:01

I can't evaluate the whole set of syntax, but one aspect you might change is the caliper. You specified a caliper of 0.2, but the standard rule-of-thumb recommendation is to base it on the distribution of propensity scores, such as one-quarter of a standard deviation of the propensity score (citation below). The code would resemble this:

Code:

quietly summarize pscore // calculate the standard deviation
local caliper = `r(sd)'/4 // calculate the caliper and save to local macro
psmatch2 ACE1mo, pscore(pscore) n(2) cal(`caliper')

Stuart, E.A., and Rubin, D.B. (2008). Best Practices in Quasi-Experimental Designs: Matching Methods for Causal Inference. In J.W. Osborne (Ed.), Best Practices in Quantitative Methods (pp. 155–176). Thousand Oaks, CA: Sage.

David Radwin
Senior Researcher, California Competes
californiacompetes.org
Pronouns: He/Him
Anindit Chhibber

Join Date: Jun 2018

Posts: 37
#3

09 May 2019, 17:30

Thanks, David, for the suggestion; it absolutely makes sense. I will adjust the caliper.

In my example I do not want to match subjects across treatment groups. Instead, I want to match 'subjects that have Pneumonia' with 'subjects that do not have Pneumonia' on age, gender, and CCI, and then look at their ACE use in the past 1 month.

Would the following syntax be able to calculate that effect?

psmatch2 Pneumonia Age Gender CCI, outcome(ACE1mo) logit

Please notice that the outcome and the treatment have swapped places in my example.

Thanks!
Mike Lacy

Join Date: Apr 2014

Posts: 2421
#4

09 May 2019, 17:49

To my knowledge, you can't validly use -psmatch2- to estimate the effect, although I should think you can use it to do the matching. Estimates of effect in case-control designs must be unaffected by the sampling fractions on the response variable. I think you would want to estimate a conditional logit model, with a variable identifying the case-control pairs derived from -psmatch2- as the group variable in -clogit y, group(pairid)-.

I would use -calipmatch- (available from ssc) to create matched pairs from a logit or probit based propensity score. I like the simplicity of the syntax of -calipmatch-, as well as its features. Then, I'd do the -clogit- per my description.
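
A minimal sketch of that workflow might look like the following. It assumes the data have first been reduced to one row per patient, that -calipmatch- is installed from SSC, and that its option names are as I recall them from the help file (check -help calipmatch-). The variable name matchset and the choice of up to 2 controls per case are hypothetical, and the caliper of one-quarter of a standard deviation follows the rule of thumb given in #2.

```stata
* Propensity of being a case, given the matching covariates
logit Pneumonia Age i.Gender CCI
predict double pscore, pr

* Caliper: one-quarter of the standard deviation of the propensity score
quietly summarize pscore
local caliper = r(sd)/4

* Match up to 2 controls to each case within the caliper
* (matchset is a hypothetical name for the matched-set identifier)
calipmatch if !missing(pscore), generate(matchset) casevar(Pneumonia) ///
    maxmatches(2) calipermatch(pscore) caliperwidth(`caliper')

* Conditional logistic regression within matched sets, reported as odds ratios
clogit Pneumonia i.ACE1mo, group(matchset) or
```

Observations left unmatched have a missing matchset and are dropped by -clogit-, so the estimate uses only the matched sets.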
David Radwin

Join Date: Mar 2014

Posts: 369
#5

09 May 2019, 17:53

I'm sorry that I can't answer that question.

I will caution you that even if it is the right syntax, and even if this comparison makes sense logically, you will want to assess and confirm the quality of your match before looking at the results. Among other things, you want to look at measures of covariate balance (ideally including covariates and higher order terms and interactions not in your model) and overlap (also called common support). The Stuart and Rubin article cited above is a good start for these sorts of diagnostics.
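
As a rough sketch of those diagnostics, -pstest- and -psgraph- (both installed with -psmatch2-) can be run after the matching step. The caliper value here is only a placeholder, and higher-order terms or interactions would have to be generated as variables and added to the -pstest- varlist by hand.

```stata
* Match on the estimated propensity score, restricted to common support
psmatch2 Pneumonia Age Gender CCI, logit neighbor(2) caliper(0.05) common

* Covariate balance before and after matching (standardized bias, t-tests)
pstest Age Gender CCI, both

* Propensity score histogram by group, to inspect overlap
psgraph
```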

David Radwin
Senior Researcher, California Competes
californiacompetes.org
Pronouns: He/Him
Anindit Chhibber

Join Date: Jun 2018

Posts: 37
#6

09 May 2019, 23:46

David, thanks for the suggestion. I have already assessed the quality of matching and everything looks okay to me in that sense. As you mentioned, I have looked at both covariate balance and common support. I just wanted to make sure that the syntax mentioned above will indeed answer the question I posed previously.

Thanks for your help!
Anindit Chhibber

Join Date: Jun 2018

Posts: 37
#7

09 May 2019, 23:50

Mike, thanks for your reply. I will try to repeat my analysis as per your suggestion. I have also used teffects psmatch for the same analysis, with the same results as psmatch2. I hope I am at least right in thinking that teffects psmatch would be able to calculate the said treatment effects. Syntax is below:

teffects psmatch (ACE1mo) (Pneumonia Age Gender CCI), nn(2) atet

Thanks!

Last edited by Anindit Chhibber; 10 May 2019, 00:41.
Mike Lacy

Join Date: Apr 2014

Posts: 2421
#8

11 May 2019, 08:03

I think the -teffects- command you propose would be invalid, as it treats ACE1mo as the outcome variable. My understanding is that it is well-known in epidemiology that examining differences in mean exposure across a disease outcome is invalid as a way to estimate the effect of a continuous exposure. I think this would be a worse error in a case-control study. Perhaps one of the epidemiologists here will comment.
Anindit Chhibber

Join Date: Jun 2018

Posts: 37
#9

11 May 2019, 15:37

Thanks for the input, Mike. It makes sense that it would be an error to compare mean differences in exposure across disease outcome to estimate the effect of the exposure, but the exposure in this case is binary, like the outcome.

Keeping that in mind, do you think I would get different answers if I:

1) Do the matching based on your suggestion and then run a simple logistic regression with Pneumonia as the outcome, ACE1mo as the exposure, and a weight variable that captures the matches I made earlier.
(Conclusion: Being exposed to ACE decreases/increases the odds of getting Pneumonia by ....)

or

2) Use the -teffects- command as mentioned earlier, with ACE1mo as the outcome.
(Conclusion: The probability of being exposed to ACE is x% lower/higher in patients suffering from Pneumonia than in their counterparts.)

Thanks!
Anindit Chhibber

Join Date: Jun 2018

Posts: 37
#10

11 May 2019, 23:17

Mike and David, thanks for your input and useful suggestions. I was able to match my sample using the psmatch2 command and then run a logistic regression to estimate the treatment effects. Syntax is below:

Matching Pneumonia patients with healthy controls: psmatch2 Pneumonia Gender Age CCI , logit
Running logistic regression using the weight generated by psmatch2 to match controls and cases: logistic ACE1mo Gender Age CCI Pneumonia [fweight=_weight]

Also, to answer the last question I posed, I got the same answers for both. Thanks again for your help!
ALKEBSEE RADWAN

Join Date: Mar 2019

Posts: 240
#11

12 May 2019, 00:14

Hi everyone. I need to know the basic syntax of psmatch2, please,
because I am confused about where I should put the treatment variable, the dependent variable, and the independent variables.
The YouTube tutorials I have watched name the variables differently.
Please, I need the original form.
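
For reference, the -psmatch2- help file gives the general form as psmatch2 depvar [indepvars] [if] [in] [, options]. Here depvar is the 0/1 treatment indicator, indepvars are the covariates used to estimate the propensity score, and the outcome variable goes in the outcome() option. A minimal sketch, using hypothetical variable names (treat, y, x1, x2, x3) and an arbitrary caliper:

```stata
* General pattern: psmatch2 treatment covariates, outcome(outcome) [options]
*   treat    = 0/1 treatment indicator (the "depvar" of psmatch2)
*   x1 x2 x3 = covariates entering the propensity score model
*   y        = outcome whose treated/control difference is reported
psmatch2 treat x1 x2 x3, outcome(y) logit neighbor(1) caliper(0.05) common
```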