Analyzing matched 1:4 case control data with continuous body weights

Sarah Adam

Join Date: Nov 2017

Posts: 2
#1

19 Nov 2017, 18:36

Good evening,

I hope you are all well. I am new to Stata and have a quick question about a project I am working on.
I have a list of patients with body weight measured before and after surgery. I have matched each of these patients, on age and gender, to 4 control patients undergoing a non-surgical weight loss intervention.
I would like to see whether the weight change from before to after the intervention (surgery or diet) differed between the cases and their matched controls.

It would be very helpful to get some suggestions on how to tackle this!
Thank you in advance.

Sincerely,
Sarah Adam
MS Biology
Towson University
Clyde Schechter

Join Date: Apr 2014

Posts: 30147
#2

19 Nov 2017, 19:20

So the first step would be to choose your outcome variable. It sounds like you are primarily interested in the difference in weight, after minus before surgery, although what you write is not 100% clear on that point. You might then perform a mixed-effects regression of the weight difference on an indicator for surgery vs non-surgical intervention, and a random effect at the matched quintuple level (each surgical patient together with the 4 matched controls). You would be well advised to also include relevant covariates in the fixed effects--you don't say what other information is available in your data that might be useful here.
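
As a rough sketch of that first model, assuming one row per person and hypothetical variable names (wt_before, wt_after, surgery coded 0/1, set_id identifying each case together with its 4 controls, and age and gender standing in for whatever covariates are available), the code might look something like this:

```stata
* Hypothetical variable names; adapt them to the real data.
generate wt_change = wt_after - wt_before

* Fixed effects: treatment indicator plus example covariates.
* Random intercept: one per matched set (case + 4 controls).
mixed wt_change i.surgery c.age i.gender || set_id:
```

In this sketch, the coefficient on 1.surgery estimates the covariate-adjusted difference in mean weight change between the surgical and non-surgical patients.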

There are other approaches, however. If your before-surgery weights are self-reported or obtained with a different apparatus or under different conditions from those obtained after surgery, using the after-surgery weight itself as the outcome and having the pre-surgery weight as a covariate might be a better approach.
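
A minimal sketch of that alternative, again with hypothetical variable names (wt_before, wt_after, surgery, set_id, age, gender) and one row per person, would move the baseline weight to the right-hand side:

```stata
* Hypothetical variable names; adapt them to the real data.
* Outcome is the follow-up weight itself; baseline weight enters
* as a covariate rather than being subtracted off.
mixed wt_after i.surgery c.wt_before c.age i.gender || set_id:
```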

If the time interval between the before-weight and the after-weight is not the same (at least approximately) among all the patients, then this has to be somehow built into the model as well. And do you have only a single before and a single after weight for each patient, or do you have a series of observations in each condition?

So your description leaves some important questions for you to ponder before settling on an analysis plan, and certainly before writing any code. There is a book available from the Stata bookstore by Twisk on longitudinal data analysis that you might find helpful in pondering these issues. The book is somewhat old at this point, and the code shown in the examples might be somewhat outdated, but the discussion of the different approaches to working with longitudinal data is still quite current.

Let me say that this is not my idea of a beginner-level problem in Stata and you would be well advised to get acquainted with Stata before undertaking it. I would suggest starting with the Getting Started with Stata volume of the PDF documentation that is installed along with Stata. Then read the User's Guide volume. Those two will familiarize you with the basic commands that are commonly used in Stata data management and analysis. You won't remember all the details, but you will at least know what basic commands are available, and when actually working with Stata you will know which help files and PDF documentation chapters to look up the details in when you encounter problems. Since you are likely to end up using a mixed-effects model here, you will want to read the -mixed- chapter of the [ME] volume. It includes some very helpful worked examples.

After you have thought through some of the analysis issues I raised here (and you may encounter more subtleties in the course of doing that much), re-posting with a more focused question would likely get you a helpful and timely response. When you get to the point of writing code, if you need help with that, be sure to use the -dataex- command to show an example of your data, and show what code you have tried up to that point, as well as the output or error messages you have gotten from Stata with that code.
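
For illustration, a -dataex- call might look like the following. The variable names are placeholders, not taken from your data; list whichever variables are relevant to your question.

```stata
* dataex is part of official Stata from 15.1 and fully updated 14.2;
* on older versions, install it once with: ssc install dataex
* Placeholder variable names; show the first 20 observations.
dataex id surgery wt_before wt_after in 1/20
```

Pasting the resulting output into your post lets others recreate the example data exactly.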
Sarah Adam

Join Date: Nov 2017

Posts: 2
#3

20 Nov 2017, 07:17

Thank you so much for the detailed input!
As recommended, here is more detailed background on the study:

1. All the data from the participants come from the EHR (not self-reported) and were collected with standardized instruments at all time points.
2. The participants had their weight measured at baseline, 1 month, 3 months, 6 months and 1 year.
3. Covariates of interest are: age at intervention, gender, clinical co-morbidities at baseline and weight at baseline.
4. Each patient undergoing surgery is matched with 4 patients who had a diet-based therapy. The patients were matched on age, gender and weight at baseline.
5. Instruments were the same at all time points.
6. While the dates for assessment are fairly fixed (0, 1 month, 3 months, 6 months, 1 year), there is a bit of variability in the dataset.
7. All weights are single measurements at each visit.

As suggested, I will review the documentation and readings above before I delve further into writing code!
I am also attaching a short excerpt of the data. Variables with the suffix "_case" belong to the participants who underwent surgery.

clear
input float(id3 study) byte age double height float(gender wt0 wt30 wt90 wt180 wt360 id3_case) byte age_case double height_case float(wt0_case wt30_case wt90_case wt180_case wt360_case)
43 2 43 67.5 2 188.6 174.3 169.4 . 179.8 4 50 1.6 225.75354 . . . 198
48 2 50 75 1 340.4 323.2 317.8 305.2 291.2 1 46 1.778 285.7472 . . . .
53 2 55 64 2 353 332.8 318.8 299.6 289.1 35 56 1.6 282.19193 258 224 . .
54 2 48 61 2 156.7 151 137 . . 13 50 1.524 189.4551 187 . . .
59 2 68 63 2 185.2 174.6 172.6 161.4 174.8 33 63 1.626 233.15015 213 199 . .
60 2 45 61 2 199.4 195.8 194 . . 21 42 1.727 236.7128 220 209 192 .
61 2 43 64 2 257.4 238.2 . . . 34 49 1.702 229.90913 211 193 . .
63 2 66 64 2 177.2 . . . . 37 70 1.676 185.7821 173 160 . .
64 2 61 65 2 275.8 273 275.4 . . 11 69 1.6 231.3974 221 214 . 189
65 2 36 68.5 1 213.2 193 179 . 192.8 24 31 1.75 236.3082 214 198 184 .
66 2 31 65 2 293.2 . . . . 31 34 1.6 225.75354 199.5 184 . .
68 2 49 67.5 2 281 257.6 242.2 233.4 208.6 21 42 1.727 236.7128 220 209 192 .
69 2 68 70.5 1 267.4 253.5 238.2 223.9 . 32 69 1.676 247.7095 219 215 . .
72 2 38 68 2 262.4 240.6 . . . 14 40 1.6 214.46587 196.4 184 176 169
end

Again thank you!!

Best,
Sarah Adam
Clyde Schechter

Join Date: Apr 2014

Posts: 30147
#4

20 Nov 2017, 10:44

One other thing to think about in terms of data management. The example data you show seems to have one control and one case per observation. What about the other 3 controls? Are there actually more variables in the real data corresponding to the age, height and weights observed on the other controls? Or are there separate observations for the other controls, each carrying the same information for their case variables?

In any case, you will ultimately want to -reshape- this data to a fully long layout, with one observation per person per time point, containing variables for the individual's ID number, an indicator of whether the person is a surgical case or a non-surgical control, the ID number of the corresponding case (= the person's own ID number for the case itself), the individual's age, gender and height, the observation time (0, 30, 90, 180, or 360) and the weight of that person at that time. All of the different analytic approaches will be most easily done with the data arranged that way.
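
A minimal sketch of one way to get there, assuming the second of those possibilities (one row per control, with the matched case's values repeated in the *_case variables, as in the example in #3). The names person_id, set_id, surgery, pid and day are made up for illustration, and the sketch should be run as a single do-file:

```stata
* Sketch only; check every step against the real data.

* Controls: one row per control, tagged with the matched case's ID.
preserve
keep id3 age height gender wt0 wt30 wt90 wt180 wt360 id3_case
rename (id3 id3_case) (person_id set_id)
generate byte surgery = 0
tempfile controls
save `controls'
restore

* Cases: one row per case. gender has no *_case version, so it is taken
* from a matched control (assumes matching on gender was exact).
keep id3_case age_case height_case gender wt*_case
duplicates drop id3_case, force
rename *_case *
rename id3 person_id
generate set_id = person_id
generate byte surgery = 1

* Stack the two groups. Case and control IDs may overlap, so build a
* person identifier that is unique across both.
append using `controls'
egen long pid = group(surgery person_id)

* Note: in the posted example, height appears to be recorded in different
* units for controls and cases; harmonize it before using it.

* Wide to long: wt0 ... wt360 become wt, indexed by day.
reshape long wt, i(pid) j(day)
```

The -duplicates drop- step assumes that repeated rows for the same case carry identical case values, and the -reshape- assumes each control appears in only one row; both are worth verifying first.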