Time-varying covariate and interaction with time and quadratic time

Julia Lopez

Join Date: Oct 2016

Posts: 11
#1

19 Feb 2017, 16:31

Hello,

I am looking at a continuous outcome (CD4 cell counts) and want to know the impact of a treatment program (coded as 1) versus control (coded as 0). The complication is that treatment status could have varied over time: for example, a person could have been in the program one year and not in another, and so on. To make things more complicated, I am using a quadratic model because it fits the data better. My "time" variable is 0, 1, 2, 3, 4 because this data is for over 5 years.

How would I interpret the time-varying treatment program status when it is interacted with linear growth (trtgrp*time) and curvature, aka quadratic growth (trtgrp*time2), as well as trtgrp without any "time" interaction? I can't seem to find a lot of information about this level of interpretation. Thank you!
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#2

19 Feb 2017, 18:58

Well, the most important thing to understand and remember is that neither the linear nor the quadratic terms have any meaning by themselves. They must always be treated jointly. Any statement about either term in isolation is necessarily incomplete, at best, or misleading, at worst.

So, at the most primitive level, if you seek a statistical significance test of the effect of treatment it would be the joint test of the terms trtgrp, trtgrp*time, and trtgrp*time2 (in your notation--we'll get to a better syntax below.)

But conceptually, what you are doing in fitting this quadratic model is fitting two separate parabolas to the data, one through the treatment group's results and one through the control's. Now, although when viewed in their entirety all parabolas have the same shape, within limited ranges (such as time = 0 through 5) there is a huge variety. They can look almost like straight lines, or they can be U-shaped, or they can exhibit deceleration of growth or acceleration of growth. And they can be rightside-up or upside-down. And it is possible for the two groups' growth curves to have completely different shapes from each other, chosen among these varieties. So a single significance test is a grossly impoverished representation of what is going on.

So, a full understanding of a quadratic model like this effectively requires a graphical representation. You can do this with what you have started, but it is a lot of work and it is tedious and error-prone. It is much easier and safer to do this using the -margins- and -marginsplot- commands, which, in turn, means you have to go back and rerun your regression using Stata's factor variable notation. So before proceeding, read -help fvvarlist- and the associated manual section. You will see that your regression will look something like this in factor variable notation:

Code:

regress cd4_count i.trtgrp##c.time##c.time // AND POSSIBLY SOME COVARIATES

You can follow that up with

Code:

margins trtgrp, at(time = (0(1)4))
marginsplot

The graph produced by -marginsplot- may not be entirely to your taste. But -marginsplot- can be run with almost any of the options available with -graph twoway-, so you can customize the graph to your liking.
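
As a rough illustration of that kind of customization, something like the following should work right after the -margins- command above. The title and axis labels are placeholders I made up, not anything taken from your data, and the -///- line continuations assume you are running this from a do-file.

```stata
* Sketch only: assumes the -margins- command above has just been run.
* noci suppresses the confidence-interval bars; all titles are placeholders.
marginsplot, noci ///
    title("Predicted CD4 count by treatment status") ///
    xtitle("Time (years since baseline)") ///
    ytitle("Predicted CD4 count")
```

Dropping -noci- brings the confidence intervals back, which is usually worth looking at before deciding how to present the figure.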

So the shortest version of my response is: don't interpret your model, visualize it! If you then want to compose 1,000 words to describe the picture, feel free.

Added: I promised you above the syntax for a joint significance test. Here it is:

Code:

test 1.trtgrp 1.trtgrp#c.time 1.trtgrp#c.time#c.time
Julia Lopez

Join Date: Oct 2016

Posts: 11
#3

20 Feb 2017, 08:07

Hello Clyde,

Thank you for the insight and information. I am doing a 2-level mixed model, so my understanding from all the readings is that I should be able to interpret the linear and quadratic terms. But, what I have read mostly about is when the interaction is with a time-invariant predictor and not with a time-varying predictor. This is where I am trying to get a handle on. So, in this case, I know that time, quadratic time and trtgrp are random effects. And, I have data for each person (variable "id") at each yearly time point. Any additional insight on this aspect?
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#4

20 Feb 2017, 08:47

But, what I have read mostly about is when the interaction is with a time-invariant predictor and not with a time-varying predictor. This is where I am trying to get a handle on.

I think that, if anything, the graphical approach to understanding these models is even more helpful in the context of a growth model. I'm not sure what aspect of interpreting your model would remain unclear to you after you do that. Can you pose a more specific question?

So, in this case, I know that time, quadratic time and trtgrp are random effects.

What? That doesn't make any sense to me. Do you mean that your model includes random slopes for time, time squared and trtgrp? (Perhaps you should show your code instead of trying to describe your model in words.) That wouldn't really affect anything here. The approach outlined in #2 will show the average trajectories of CD4 counts in each treatment group. The random slopes part of the output will give you a sense of how much individual patients deviate from these mean trajectories. Evidently, the larger the variances of the random slopes, the more individuals depart from the mean trajectories. That might be important information in its own right. But the overall conclusions about treatment effects (or, if this is an observational study, treatment group differences) wouldn't change.
Julia Lopez

Join Date: Oct 2016

Posts: 11
#5

20 Feb 2017, 09:31

Hi Clyde,

Sure here is what I am thinking would be the syntax (I generated new variables for trtgrp*time and trtgrp*time2 as you will see below):

mixed cd4 time time2 trtgrp trtgrpXtime trtgrpXtime2 || id: time time2 trtgrp, cov(un)

Results: trtgrp = -20 ; time*trtgrp = 18; time2*trtgrp = -2 (all significant).

My take:

Those who initially, in 2009, had treatment had, on average, 20 cells/mm³ fewer than those with no treatment. (This would make sense because those in the treatment group have higher need.)

Further, those with treatment contact indicated an increased change in CD4 cell counts, over the years, as compared with times with no treatment contact.

However, this trend reaches a maximum, and the magnitude of the effect of treatment contact on CD4 cell counts decreases (aka decelerated growth).

Last edited by Julia Lopez; 20 Feb 2017, 10:00.
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#6

20 Feb 2017, 10:02

So, your linear coefficient is 18, and the quadratic coefficient is -2. The vertex of this parabola occurs when time = -linear/(2*quadratic) = -18/(-4) = 4.5. So yes, on average in the treatment group, the CD4 counts increase through times 0 through 4, then peak and begin to decline thereafter.
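
For what it's worth, the arithmetic can be checked in Stata itself. The sketch below assumes the -mixed- model from #5 is the most recent estimation, with the hand-generated variables trtgrpXtime and trtgrpXtime2. One caveat: since 18 and -2 are the interaction coefficients, 4.5 is strictly the turning point of the treatment-minus-control difference. The last -nlcom- adds in the main time and time2 coefficients to locate the vertex of the treatment condition's own mean curve.

```stata
* Vertex of a parabola a + b*t + c*t^2 is at t = -b/(2*c)
display -18/(2*(-2))

* Same calculation from the stored estimates, with a delta-method standard error
* (assumes the model in #5 has just been fit)
nlcom -_b[trtgrpXtime]/(2*_b[trtgrpXtime2])

* Vertex of the mean trajectory under treatment itself:
* linear term = time + trtgrpXtime, quadratic term = time2 + trtgrpXtime2
nlcom -(_b[time] + _b[trtgrpXtime])/(2*(_b[time2] + _b[trtgrpXtime2]))
```

The -display- line reproduces the 4.5 by hand; the -nlcom- versions have the advantage of also giving a confidence interval for where the turning point falls.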
Julia Lopez

Join Date: Oct 2016

Posts: 11
#7

20 Feb 2017, 10:06

Great! So, I was on the right track in regards to interpretation. If I wanted to graph this...could I still use margins? I tried it, but it said that factor "trtgrp" not found in list of covariates...?
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#8

20 Feb 2017, 10:18

You didn't use factor variable notation in your regression so you can't use -margins- here. This is a somewhat difficult case, because factor variable notation is not supported in the random slopes specification of -mixed- (which is the current name for -xtmixed-.) So you have to kind of fake it.

Code:

gen time2 = time*time // PRESUMABLY YOU HAVE ALREADY DONE THIS
mixed cd4 i.trtgrp##c.time##c.time || id: time time2 trtgrp, cov(un)

After running that, then

Code:

margins trtgrp, at(time = (0(1)4)) // OR IS IT (0(1)5) ?
marginsplot

will run.

I'm a little bit confused about your data. Does your time variable run from 0 to 4 or 0 to 5? It makes a bit of a difference in this particular situation. Your fitted growth parabola in the treatment group has its vertex at time = 4.5. If your data runs only 0 to 4, then the vertex is outside the range of the data, which means you will observe only a decelerating increase in CD4 counts over time, but the peak will not be reached. If your time variable runs all the way to 5, then you will see the turnaround between 4 and 5.
Julia Lopez

Join Date: Oct 2016

Posts: 11
#9

20 Feb 2017, 10:33

Hi Clyde,

It is actually through 5, sorry for the confusion. So, I can actually see the turnaround which is pretty interesting! I had done this through Excel, but great to use STATA too. Question, if I have some covariates to add to the model...I was planning on rerunning it. So, to get the graph...would I just add the covariates to the above syntax?
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#10

20 Feb 2017, 11:23

Well, it depends on how you want to adjust the covariates. To include them in the regression, you would just add them to the bottom level list of regression variables in the -mixed- command. For -margins- you have some options. If you want your results adjusted to the observed distribution of the covariates (which is what is most commonly used), you make no mention of them at all in -margins-: that is its default behavior. If you want to fix the covariates to specific values you can do that by adding mention of them to the -at()- option. For example, if you want everything adjusted to age = 20, your -at()- specification would be -at(time = (0(1)5) age = 20)-. You can put as many of those variables into the -at()- specification as you like. If you would like to adjust the results to all of the covariates being at their mean values, then you can just add the -atmeans- option to your -margins- command. (You can also set some covariates to specific values and adjust the rest to their means by combining, e.g. -at(time = (0(1)5) age = 20 sex = 1) atmeans-.) You don't modify the -marginsplot- command at all for this: it just follows whatever you do in -margins-.
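
Laid out side by side, the options described above look roughly like this. This is only a sketch: it assumes the -mixed- model has just been refit in factor variable notation, and that age and sex (the example names used above, not necessarily your variable names) are among its covariates.

```stata
* 1. Default: results averaged over the observed distribution of the covariates
margins trtgrp, at(time = (0(1)5))

* 2. One covariate fixed at a chosen value
margins trtgrp, at(time = (0(1)5) age = 20)

* 3. All other covariates set to their means
margins trtgrp, at(time = (0(1)5)) atmeans

* 4. Some covariates fixed, the rest at their means
margins trtgrp, at(time = (0(1)5) age = 20 sex = 1) atmeans

* marginsplot graphs whichever -margins- result is the most recent one
marginsplot
```

In practice you would run just one of the four -margins- variants, followed by -marginsplot-.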
Julia Lopez

Join Date: Oct 2016

Posts: 11
#11

20 Feb 2017, 11:28

Hi Clyde,

This was my plan...

mixed cd4 time time2 trtgrp trtgrpXtime trtgrpXtime2 race_cat gender_cat yearshiv iov age || id: time time2 trtgrp, cov(un)

So, ultimately, when running this mixed model and following it by:

margins trtgrp, at(time = (0(1)5))
marginsplot

The covariates are already being taken into account in the regression and, thus, the margins syntax does not need to be altered. Unless, I am interested in some other aspects as you provide. Correct?

Last edited by Julia Lopez; 20 Feb 2017, 11:39.
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#12

20 Feb 2017, 11:37

The covariates are already being taken into account in the regression and, thus, the margins syntax does not need to be altered. Unless, I am interested in some other aspects as you provide. Correct?

Correct.

But your -margins- command is not going to work. mcm_dico is not a variable in your model, and -margins- will not know what you are trying to do. For that matter, I don't know what you're trying to do with this.
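
One more thing to watch: the -mixed- command as written in #11 uses the hand-made interaction variables, so -margins trtgrp- would run into the same "factor not found" problem as in #7. A sketch of the factor variable equivalent follows. I am guessing from the names that race_cat and gender_cat are categorical, and that yearshiv, iov and age can be entered as continuous; adjust the i. prefixes if that guess is wrong.

```stata
* Assumption: race_cat and gender_cat are categorical; yearshiv, iov, age are continuous
* time2 = time*time must already exist for the random slopes specification
mixed cd4 i.trtgrp##c.time##c.time i.race_cat i.gender_cat yearshiv iov age ///
    || id: time time2 trtgrp, cov(un)

* Covariates are averaged over their observed distribution by default
margins trtgrp, at(time = (0(1)5))
marginsplot
```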
Julia Lopez

Join Date: Oct 2016

Posts: 11
#13

20 Feb 2017, 11:53

Sorry- just a writing error. Thanks so much for the assistance!
Julia Lopez

Join Date: Oct 2016

Posts: 11
#14

20 Feb 2017, 12:17

Although, I have one other question. What if I wanted to look at another time-varying predictor? For example, clinic adherence (var is clnadh). A person could have been adherent one year and not the next and, thus, it is time-varying. Ultimately...how does this play out in the syntax? I hypothesize that times in treatment and being adherent to the doc appointments would have a better health outcome than times when STILL in treatment but not adherent. Conceptually, it is interesting but hard to see where all the interactions would be in the syntax.
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#15

20 Feb 2017, 12:34

You'd have to include interaction terms for clnadh with trtgrp, trtgrpXtime and trtgrpXtime2. In factor variable notation this is easy to do:

Code:

mixed cd4_count i.clnadh##i.trtgrp##c.time##c.time /*other covariates*/ || id: time time2 trtgrp, cov(un)

Note: I assumed clnadh is a dichotomous variable (or at least categorical). If it's a continuous measure, change i.clnadh to c.clnadh. Understanding the model becomes a bit harder. If clnadh is dichotomous, the simplest approach is probably:

Code:

margins clnadh#trtgrp, at(time = (0(1)5))
marginsplot

This would give you four quadratic growth curves, one for each combination of trtgrp and clnadh.

If clnadh is continuous, then you would have to specify some interesting values of it in the -at()- option instead.

Code:

margins trtgrp, at(time=(0(1)5) clnadh=(whatever))
marginsplot

In this case the margins plot will be kind of messy, with several different growth curves. You might also need the -xdimension()- option to get the variable you actually want on the x-axis: it isn't immediately clear whether time or clnadh would be more appropriate in this circumstance.
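
To make the continuous case a little more concrete, here is one way it might look. The clnadh values 0.5, 0.75 and 1 are purely hypothetical (imagining clnadh as a proportion of appointments kept), and the sketch assumes the model was fit with c.clnadh in place of i.clnadh.

```stata
* Hypothetical clnadh values; substitute ones that are meaningful for your data
margins trtgrp, at(time = (0(1)5) clnadh = (0.5 0.75 1))

* Put time on the x-axis, draw one panel per treatment status,
* and drop the CI bars to reduce clutter
marginsplot, xdimension(time) by(trtgrp) noci
```

With time on the x-axis, each panel shows one growth curve per chosen value of clnadh, which is probably the easier picture to read if the growth trajectories are the main interest.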