Question about interaction between groups

michael megaly

Join Date: Mar 2021

Posts: 13
#1


20 Apr 2024, 17:07

Beginner's question here. I have a dataset with the change in a continuous variable (LVEF) after a procedure (PCI). The procedure was done on one of two vessels (RCA or LCX; the variable rca takes the values 0 and 1). I have a pre-procedure LVEF variable (lvef) and a follow-up EF variable (fu_lvef). I assessed the change in EF in the overall sample and in each group separately. I need a p-value for the interaction between the two groups (does the vessel really make a difference?).

I tried multiple ways, but I am not sure which one is correct.
I tried a simple regression:
reg fu_lvef rca

I also tried an interaction term, taking the p-value for that term:
reg fu_lvef i.rca##c.lvef

Still not satisfied. I feel some outcomes that should be significant are not, and vice versa in other variables.
Any recommendations?
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#2

20 Apr 2024, 17:44

Still not satisfied I feel some outcomes that should be significant are not and vice versa in other variables.

It is not science to shop for a different model to produce results that you want to see. It's fine to say you're surprised at the results, but the critique of the model needs to be based on things that could be said without having seen the results, or even seen any of the data. Evidently it's too late for that now. That said, I do believe that neither of the models you estimated is an effective way to estimate the effect of the intervention, nor whether it matters which artery is used for it.

Here's a different model, probably what I would have used. First, I would reorganize the data set. Rather than a single observation per patient with pre- and post-procedural variables, I would make a data set where each patient has two observations: one pre and one post. It would contain an LVEF variable (which takes the value of your current lvef variable in the patient's pre- observation and the value of your current fu_lvef variable in the post- observation), another variable, pci, coded 0 for pre and 1 for post, and your rca variable (which will take the same value in both observations of the same patient). Then I would fit a model like this:

Code:

mixed LVEF i.rca##i.pci || patient_id:
margins rca, dydx(pci)

Since many procedure outcomes are also sensitive to the person performing the procedure, you might need to expand the -mixed- command to include a random or fixed effect representing the cardiologist as well: -mixed LVEF i.rca##i.pci || cardiologist: || patient_id:-. Or, if there is only a small number of cardiologists in your study, -mixed LVEF i.rca##i.pci i.cardiologist || patient_id:-.

The -margins- output will show you the pre-post procedure difference in LVEF separately for the RCA and LCX approaches. And you can estimate the difference (RCA minus LCX) between those two effects just by looking at the -mixed- output in the row 1.rca#1.pci.
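If a single p-value for the interaction is wanted, one option is a Wald test of that coefficient right after the model. This is only a sketch: it assumes the long-format data set described above is already in memory, with the variable names LVEF, rca, pci and patient_id used in this post.

```stata
* Assumes the long-format data (two observations per patient) are in memory
mixed LVEF i.rca##i.pci || patient_id:

* Pre-post change in LVEF within each vessel group
margins rca, dydx(pci)

* Wald test of the interaction coefficient (difference in change, RCA minus LCX)
test 1.rca#1.pci
```

With a single interaction coefficient, this test should agree with the z test reported in the 1.rca#1.pci row of the -mixed- table.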

Now, there are a number of other issues that you need to think about and modify this approach accordingly. Was the RCA vs LCX path assigned by randomization? If not, then your results may be sharply biased by other attributes of both the patient and the cardiologist. And to the extent possible, you should include those as covariates in your analysis. You may be limited in the extent to which you can do that due to availability of data or too small a sample size to support an analysis with many variables. But do your best to deal with these if this is observational data.
michael megaly

Join Date: Mar 2021

Posts: 13
#3

21 Apr 2024, 08:00

Thanks for your answer, Clyde. I tried to run the mixed model, but I keep running into this error:
"could not calculate numerical derivatives -- discontinuous region with missing values encountered"
I made sure there are no missing values in the variables. I sorted the dataset by patient ID, but I still encounter the error.
Any ideas?
Thanks a lot for your help
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#4

21 Apr 2024, 10:32

The missing values referred to in that error message are not missing values in your data set. They are discontinuities in the log-likelihood function where the attempt to calculate its derivative returns a missing value. This sort of thing can happen with multi-level models and there isn't much one can do about it. Nevertheless, it is unusual to see it happen with a model as simple as this. Which makes me wonder if there is something else odd about your data. Please post back with example data, using the -dataex- command to do so. And also show the exact command you ran which produced this problem (even if it is identical to what I suggested in #2). I'll try to troubleshoot.

If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
michael megaly

Join Date: Mar 2021

Posts: 13
#5

21 Apr 2024, 15:24

This is the data example
dataex lvef fu_ef_echo code3 rca
40 61 "1036" 1
32 30 "1069" 1
20 13 "107" 0
35 30 "1070" 1
34 56 "1074" 0
35 38 "118" 0
35 50 "128" 1
20 25 "136" 0
35 38 "146" 1
20 22 "147" 1
22 38 "170" 1
39 40 "171" 1
34 44 "18" 1
30 20 "185" 0
40 25 "189" 1
40 65 "195" 1
25 35 "197" 1
37 28 "2" 1
35 45 "203" 0
40 45 "214" 1
19 15 "217" 1
40 55 "223" 0
40 26 "225" 1
35 43 "238" 1
40 61 "243" 0
27 25 "25" 1
31 31 "26" 1
31 26 "28" 0
20 45 "281" 1
27 51 "311" 0
12 35 "33" 1
25 27 "331" 1
40 59 "352" 1
22 16 "369" 1
32 35 "398" 1
30 47 "400" 0
40 68 "428" 1
40 32 "43" 1
40 55 "44" 1
15 55 "444" 1
40 37 "454" 0
31 28 "46" 0
40 60 "469" 1
40 68 "48" 1
35 39 "481" 1
40 67 "489" 1
40 40 "494" 1
40 29 "53" 0
31 61 "537" 0
29 61 "553" 1
37 50 "573" 1
21 25 "576" 0
39 41 "577" 0
40 34 "590" 1
35 16 "593" 0
40 55 "602" 1
40 32 "607" 0
30 33 "612" 0
25 26 "619" 0
40 58 "622" 1
20 43 "624" 1
21 30 "627" 1
23 26 "63" 1
40 60 "630" 1
30 53 "637" 1
20 33 "641" 1
21 29 "648" 1
20 25 "654" 0
25 40 "657" 1
25 20 "681" 1
40 64 "684" 1
30 36 "685" 1
21 25 "691" 0
25 45 "697" 1
15 35 "704" 1
40 50 "705" 0
35 52 "713" 1
40 40 "719" 1
10 25 "729" 0
35 45 "73" 1
20 15 "731" 1
29 50 "735" 1
22 50 "737" 1
35 23 "753" 1
30 60 "756" 0
40 54 "766" 0
35 45 "771" 1
35 55 "776" 1
25 10 "780" 0
21 27 "781" 1
32 20 "783" 1
40 57 "785" 1
15 25 "789" 1
30 17 "809" 0
40 39 "827" 1
25 25 "839" 0
25 50 "858" 1
40 35 "878" 1
10 22 "88" 1
25 45 "890" 1

and this is the code I use
mixed fu_ef_echo i.rca || code3:

Thank you
michael megaly

Join Date: Mar 2021

Posts: 13
#6

25 Apr 2024, 12:01

Any help if possible?
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#7

25 Apr 2024, 12:33

I thought I responded to this shortly after you posted #5. I must have forgotten to hit "Post Reply" at the end. Sorry about that.

The problem is that you have only one observation per code in this layout, so you cannot apply -mixed- to it. You seem to have ignored everything I said in #2 about using a different model with two observations per participant. To that, I will add that I notice in your data that the variance in LVEF is about 4 times as large at follow-up as it is at baseline. Is a different measurement technique being used, perhaps one with greater measurement error? Or is there some other explanation for this? In any case, this large a discrepancy in variance requires modifying the model a bit to account for it.

Code:

rename lvef lvef0
rename fu_ef_echo lvef1
reshape long lvef, i(code3) j(time)
label define time 0 "Baseline" 1 "Follow-Up"
label values time time
mixed lvef i.time##i.rca || code3:, residuals(, by(time))
margins rca, dydx(time)

The output of the -margins- command shows you the change in LVEF between baseline and follow-up in each of the approaches. The difference between the approaches is found in the time#rca row of the -mixed- output.
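The two observations about the posted layout (one row per code, and a much larger spread at follow-up) can be checked before reshaping. A minimal sketch follows, using only the first six rows of the example data from #5 so that it runs on its own; the storage types in the -input- line are assumptions, and on the real data you would skip the -clear- and -input- steps.

```stata
* A few rows copied from the example data in #5, one row per patient (wide layout)
clear
input byte(lvef fu_ef_echo) str4 code3 byte rca
40 61 "1036" 1
32 30 "1069" 1
20 13 "107"  0
35 30 "1070" 1
34 56 "1074" 0
35 38 "118"  0
end

* isid exits with an error unless code3 uniquely identifies the rows,
* so a silent pass confirms there is only one observation per code
isid code3

* Compare the spread of LVEF at baseline and at follow-up
tabstat lvef fu_ef_echo, statistics(n mean sd variance)
```

If -isid- passes silently, the data are still in the one-row-per-patient layout and need the -reshape- shown above before -mixed- can use code3 as a grouping level.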

Notice that had you shown example data from the start, you would have had an answer to your question 5 days ago. In the future, always show example data when asking for help with code.