Comparing matched cases and controls with a cluster variable

Joe Tuckles

Join Date: Jul 2018

Posts: 180
#1


23 Sep 2019, 16:52

Hi, I have matched cases and controls, and now every variable appears twice, for example BMI and BMI_ctrl. How do I do a paired t-test to see if the cases have a higher BMI than the controls? The lack of a single variable that defines cases and controls has thrown me. I also need to control for a cluster variable.

Thanks
Tags: None
Mike Lacy

Join Date: Apr 2014

Posts: 2413
#2

23 Sep 2019, 20:11

I wouldn't recommend a t-test here. You're presumably interested in how BMI affects the risk of being a case. The t-test would treat the data as though BMI were conditioned on case status, which I presume does not make clinical sense.

The most common approach here would be conditional logistic regression, with the match id supplied in the group() option of the -clogit- command. If you have a reasonably small number of values for the "cluster" variable (location of case), you can simply use it as a covariate in the conditional logit model. Given that you didn't match on the cluster (right?), I think things get messier if you have a large number of values for the cluster variable.

For conditional logistic regression and most purposes in Stata, you need to have your data in the so-called "long" format (see -help reshape-), that is, with an observation for each subject, case or control, with an id to indicate the matched group to which each observation belongs.
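
As a rough illustration of the long layout and the -clogit- call described above, here is a minimal sketch on made-up data. The variable names pair_id, cc, and case and all of the values are assumptions for illustration only; they are not taken from the poster's data set.

```stata
* Toy wide data: one row per matched pair (all values invented)
clear
input pair_id BMI_case BMI_ctrl
1 31.2 27.5
2 24.8 26.1
3 29.0 25.3
4 33.4 30.0
5 26.7 27.9
end

* Wide -> long: one row per subject; cc holds the suffix "_case" or "_ctrl"
reshape long BMI, i(pair_id) j(cc) string

* Build the 0/1 outcome that -clogit- needs (hypothetical name: case)
gen byte case = (cc == "_case")

* Conditional logistic regression, conditioning on the matched pair
clogit case BMI, group(pair_id) or
```

With a small number of clusters, a term such as i.cluster (cluster being a hypothetical variable name) could be added to the covariate list. Any covariate that is constant within every matched group drops out of a conditional logit model.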
Comment
Joe Tuckles

Join Date: Jul 2018

Posts: 180
#3

23 Sep 2019, 20:16

Thanks for your response. I was hoping to just do a simple test of whether BMI differs between cases and controls, rather than looking at risk prediction. Is this feasible? I just want to keep it simple at this stage. If a t-test isn't feasible for this purpose, then I will look into doing a conditional logistic regression.

I did not match on the cluster, and I think I do have a large number of values for the cluster variable. Would running a tab on it be the way to show you this?

Last edited by Joe Tuckles; 23 Sep 2019, 20:19.
Comment
Joe Tuckles

Join Date: Jul 2018

Posts: 180
#4

23 Sep 2019, 22:46

(So basically I just want something that shows the mean BMI of the cases and the mean BMI of the controls, so I can see which one is higher, and then a test of whether the difference is statistically significant, while controlling for a cluster variable with what appears to be a large number of values.)
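
For reference, the unadjusted comparison described here can be sketched on the wide layout, where each row is one matched pair. The data below are invented and pair_id is a hypothetical name. This sketch does not account for the cluster variable at all, and #2 explains why a t-test may not be the right tool for a case-control design.

```stata
* Toy wide data: one row per matched pair (all values invented)
clear
input pair_id BMI BMI_ctrl
1 31.2 27.5
2 24.8 26.1
3 29.0 25.3
4 33.4 30.0
5 26.7 27.9
end

* Means of case BMI and control BMI
summarize BMI BMI_ctrl

* Paired t-test of case BMI against control BMI within each pair
ttest BMI == BMI_ctrl
```

The -ttest- output reports both the two-sided p-value and the one-sided ones; the line for Ha: mean(diff) > 0 corresponds to cases having the higher BMI.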

Last edited by Joe Tuckles; 23 Sep 2019, 22:52.
Comment

Joe Tuckles

Join Date: Jul 2018

Posts: 180
#5

24 Sep 2019, 00:25

Could you please advise whether I have reshaped correctly for the clogit? I am also unsure how to write the clogit command itself now.

Code:

. ds *_ctrl, not
uniqueid    variables etc

. local vbles `r(varlist)'

.
. rename (`vbles') =_case

. gen long obs_num = _n

. clonevar group_id = uniqueid_case

. reshape long  `vbles', i(obs_num) j(cc) string
(note: j = _case _ctrl)
(note: cdgender_ctrl not found)
(note: age_dm_ctrl not found)
(note: age_dm_U_ctrl not found)

Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                     2083   ->    4166
Number of variables                  45   ->      26
j variable (2 values)                     ->   cc
xij variables:
            uniqueid_case uniqueid_ctrl   ->   uniqueid
     etc
-----------------------------------------------------------------------------

.
. drop obs_num

. duplicates drop if cc == "_case"

Comment

Mike Lacy

Join Date: Apr 2014

Posts: 2413
#6

25 Sep 2019, 12:26

It's hard for me to be sure whether the reshape was right without seeing a small before/after example data set posted here, for which -dataex-, as described in the FAQ, is the tool of choice.
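
A minimal sketch of what such a -dataex- call might look like, run on invented wide data. The variable names follow those mentioned earlier in the thread, but the values are made up, and the choice of the first three rows is arbitrary.

```stata
* Toy stand-in for the wide data before reshaping (values invented)
clear
input uniqueid uniqueid_ctrl BMI BMI_ctrl
101 201 31.2 27.5
102 202 24.8 26.1
103 203 29.0 25.3
end

* Print a copy-and-paste data example for the first few rows
dataex uniqueid uniqueid_ctrl BMI BMI_ctrl in 1/3
```

Running the same kind of command once before and once after the -reshape- gives the before/after pair asked for. On older Stata installations -dataex- may first need to be installed with -ssc install dataex-.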
Comment
