Question on matching in a nested case control study

Natalie Malek

Join Date: Jul 2019

Posts: 18
#16

02 Jul 2019, 21:35

I have one question about the line:
drop if (evtimecase >= evtimectl) // control member is not in risk set

This excludes case-control pairs in which the case's time to event is greater than or equal to the control's time to event. If the entry date is not the same for all participants, wouldn't it also be necessary to exclude case-control pairs in which the case's age at the time of event is greater than the control's age at the time of event, death, or censoring?
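
To make the question concrete, here is a minimal Stata sketch of the kind of extra check being asked about. It assumes a hypothetical variable entryctl (the control's entry time, measured on the same time scale as evtimecase and evtimectl); that variable and the toy data are illustrative only and are not part of the code posted earlier in the thread. How ties are treated in the second drop is also an assumption.

```stata
* Hypothetical toy pairs; entryctl is an assumed variable, not from the thread
clear
input idcase idctl evtimecase entryctl evtimectl
1 11 5 0 8
1 12 5 6 9
1 13 5 2 5
1 14 5 1 4
end

drop if (evtimecase >= evtimectl)  // control member is not in risk set (line from the thread)
drop if (entryctl >= evtimecase)   // assumed extra check: control not yet under observation

list idcase idctl                  // only the pair with idctl 11 remains
```

With staggered entry, the second drop removes a control who only came under observation after the case's event time, which the first drop alone would keep.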
Rima Saliba

Join Date: Jul 2019

Posts: 4
#17

21 Jun 2022, 20:24

Mike, thank you for a very helpful matching without replacement algorithm!

I ran it on a dataset including 97 cases and a large number of controls (even though, I admit, I am not very familiar with the syntax involved in each command).

My question pertains to the variable "first", which you used initially to count the cases.

When I tab "first" after these commands, I get the expected N of cases, 97:

by idcase: gen byte first = (_n ==1) // just to count cases
qui count if first ==1
di r(N) " event cases that have a potential match after considering event time"

However, when I tab "first" after the last 2 commands (below), the N of "first" drops to 77, even though each idcase has at least 1 potential unique control in the retained dataset.
So the record with "first = 1" has been dropped for some of the matched case-control pairs.
Is such an outcome expected?

by idcase: gen NCtl = _N
tab NCtl if first, missing

       NCtl |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |          2        2.60        2.60
          2 |         17       22.08       24.68
          3 |         33       42.86       67.53
          4 |         25       32.47      100.00
------------+-----------------------------------
      Total |         77      100.00

Skipping the very last command in the algorithm, I was able to identify the N of controls for each idcase in the file using these commands:

by idcase: gen NCtl = _N
by idcase: gen nCtl=_n
tab NCtl if nCtl==1

       NCtl |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |         12       12.37       12.37
          2 |         22       22.68       35.05
          3 |         38       39.18       74.23
          4 |         25       25.77      100.00
------------+-----------------------------------
      Total |         97      100.00

I would greatly appreciate your feedback.
Thank you!
Rima Saliba

Join Date: Jul 2019

Posts: 4
#18

21 Jun 2022, 20:28

Mike, thank you for a very helpful matching without replacement algorithm!

I ran it on a dataset including 97 cases and a large number of controls (even though, I admit, I am not very familiar with the syntax involved in each command).

My question pertains to the variable "first", which you used initially to count the cases.

When I tab "first" after these commands, I get the expected N of cases, 97:

by idcase: gen byte first = (_n ==1) // just to count cases
qui count if first ==1
di r(N) " event cases that have a potential match after considering event time"

However, when I tab "first" after the last 2 commands (below), the N of "first" drops to 77, even though each idcase has at least 1 potential unique control in the retained dataset.
So the record with "first = 1" has been dropped for some of the matched case-control pairs.
Is such an outcome expected?

by idcase: gen NCtl = _N
tab NCtl if first, missing

Skipping the very last command in the algorithm, I was able to identify the N of controls for each idcase in the file using these commands:

by idcase: gen NCtl = _N
by idcase: gen nCtl=_n
tab NCtl if nCtl==1

I would greatly appreciate your feedback.
Thank you!
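
One explanation consistent with the counts described in this post (97 before, 77 after) is that "first" was generated before later commands dropped rows, so any idcase whose flagged row was dropped no longer has a row with first == 1. The following is a minimal sketch on hypothetical toy data; the drop line merely stands in for the matching step and is not the thread's algorithm. It shows the effect and one way to re-derive the flag on the retained rows.

```stata
* Hypothetical toy data: 3 cases, each with 3 candidate controls
clear
input idcase idctl
1 11
1 12
1 13
2 21
2 22
2 23
3 31
3 32
3 33
end

bysort idcase (idctl): gen byte first = (_n == 1)  // flag created BEFORE rows are dropped
count if first                                     // 3 cases

* Stand-in for the matching step: drops pairs, including case 2's flagged row
drop if idctl == 21 | idctl == 33

count if first                                     // 2: case 2 lost its flagged row
scalar stale = r(N)

* Re-derive the flag on the rows that were retained
drop first
bysort idcase (idctl): gen byte first = (_n == 1)
by idcase: gen NCtl = _N
count if first                                     // 3 again
tab NCtl if first, missing
```

Regenerating the counter after the drops is the same idea as the nCtl workaround shown in the post, which also recovers all 97 cases.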
Mike Lacy

Join Date: Apr 2014

Posts: 2416
#19

30 Jun 2022, 15:18

Some years later, but for the record: The code I posted at #10 in this thread above doesn't work and may well represent a bad approach even if it could be tweaked to work, so I'd like to un-recommend it for anyone who lands here in the future. I'm going to follow up on a generalized version of this question in a separate posting.