Hi everyone,
I am working on a longitudinal neuroimaging dataset and I would appreciate some advice regarding the most appropriate analytical strategy.
I have cortical thickness measurements from 68 different brain regions for each subject.
The data are in long format, where each row represents:
subject (idd)
cortical area (area_id)
timepoint
cortical thickness (a)
Each subject was measured at two timepoints only: baseline and follow-up.
The follow-up interval varies across subjects and is recorded in the variable "Months_between_MRI".
At baseline, Months_between_MRI is always 0.
The main clinical predictors are:
"sclerosi" (presence of hippocampal sclerosis)
"resistant" (drug resistance)
"toniclonic" (presence of tonic-clonic seizures)
Additional covariates:
Age
Gender
ICV
Epilepsy_duration
A data subset with only 7 of the 68 brain areas follows:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float idd int Months_between_MRI byte area_id double a byte(Age Gender) double(Epilepsy_duration Onset_years) float(sclerosi resistant toniclonic)
1  0 1  2.45 30 0  2 28 0 0 1
1  0 2 2.406 30 0  2 28 0 0 1
1  0 3 2.544 30 0  2 28 0 0 1
1  0 4 1.987 30 0  2 28 0 0 1
1  0 5 2.917 30 0  2 28 0 0 1
1  0 6 2.667 30 0  2 28 0 0 1
1 44 1 2.362 33 0  5 28 0 0 1
1 44 2 2.408 33 0  5 28 0 0 1
1 44 3 2.521 33 0  5 28 0 0 1
1 44 4  1.95 33 0  5 28 0 0 1
1 44 5 2.677 33 0  5 28 0 0 1
1 44 6  2.59 33 0  5 28 0 0 1
2  0 1 2.253 36 0 15 21 1 1 1
2  0 2 1.969 36 0 15 21 1 1 1
2  0 3 2.105 36 0 15 21 1 1 1
2  0 4 1.668 36 0 15 21 1 1 1
2  0 5 1.083 36 0 15 21 1 1 1
2  0 6 1.683 36 0 15 21 1 1 1
2 72 1 2.414 42 0 21 21 1 1 1
2 72 2 2.708 42 0 21 21 1 1 1
2 72 3 2.163 42 0 21 21 1 1 1
2 72 4  2.06 42 0 21 21 1 1 1
2 72 5 1.344 42 0 21 21 1 1 1
2 72 6 1.872 42 0 21 21 1 1 1
3  0 1  2.08 34 0 16 18 0 1 1
3  0 2 2.637 34 0 16 18 0 1 1
3  0 3 2.274 34 0 16 18 0 1 1
3  0 4 1.796 34 0 16 18 0 1 1
3  0 5  .944 34 0 16 18 0 1 1
3  0 6   1.3 34 0 16 18 0 1 1
3 38 1 2.039 37 0 19 18 0 1 1
3 38 2 2.516 37 0 19 18 0 1 1
3 38 3 2.092 37 0 19 18 0 1 1
3 38 4 2.457 37 0 19 18 0 1 1
3 38 5  .982 37 0 19 18 0 1 1
3 38 6 2.094 37 0 19 18 0 1 1
4  0 1 2.035 44 0  2 42 0 0 0
4  0 2 1.936 44 0  2 42 0 0 0
4  0 3 2.137 44 0  2 42 0 0 0
4  0 4 1.374 44 0  2 42 0 0 0
4  0 5   .83 44 0  2 42 0 0 0
4  0 6 1.563 44 0  2 42 0 0 0
4 36 1 1.786 47 0  5 42 0 0 0
4 36 2 1.805 47 0  5 42 0 0 0
4 36 3 1.993 47 0  5 42 0 0 0
4 36 4 2.314 47 0  5 42 0 0 0
4 36 5 1.433 47 0  5 42 0 0 0
4 36 6 1.628 47 0  5 42 0 0 0
5  0 1 2.002 49 0 15 34 0 1 1
5  0 2 1.783 49 0 15 34 0 1 1
5  0 3 2.322 49 0 15 34 0 1 1
5  0 4 1.403 49 0 15 34 0 1 1
5  0 5  .945 49 0 15 34 0 1 1
5  0 6  1.45 49 0 15 34 0 1 1
5 18 1 2.468 51 0 17 34 0 1 1
5 18 2  1.99 51 0 17 34 0 1 1
5 18 3 1.985 51 0 17 34 0 1 1
5 18 4 2.149 51 0 17 34 0 1 1
5 18 5  .942 51 0 17 34 0 1 1
5 18 6 1.599 51 0 17 34 0 1 1
6  0 1 2.484 25 0  5 20 0 0 1
6  0 2 3.064 25 0  5 20 0 0 1
6  0 3 2.445 25 0  5 20 0 0 1
6  0 4 1.973 25 0  5 20 0 0 1
6  0 5 3.265 25 0  5 20 0 0 1
6  0 6  2.84 25 0  5 20 0 0 1
6 66 1 2.072 31 0 11 20 0 0 1
6 66 2 2.731 31 0 11 20 0 0 1
6 66 3 1.656 31 0 11 20 0 0 1
6 66 4 2.286 31 0 11 20 0 0 1
6 66 5 1.459 31 0 11 20 0 0 1
6 66 6  1.88 31 0 11 20 0 0 1
7  0 1  2.11 24 0 11 13 0 1 1
7  0 2 2.386 24 0 11 13 0 1 1
7  0 3 2.184 24 0 11 13 0 1 1
7  0 4  1.28 24 0 11 13 0 1 1
7  0 5  .905 24 0 11 13 0 1 1
7  0 6 1.741 24 0 11 13 0 1 1
7 34 1 2.135 27 0 14 13 0 1 1
7 34 2 2.514 27 0 14 13 0 1 1
7 34 3 1.857 27 0 14 13 0 1 1
7 34 4 2.569 27 0 14 13 0 1 1
7 34 5   1.4 27 0 14 13 0 1 1
7 34 6 1.964 27 0 14 13 0 1 1
8  0 1 2.523 15 1  0 15 0 0 0
8  0 2 3.226 15 1  0 15 0 0 0
8  0 3 2.501 15 1  0 15 0 0 0
8  0 4 2.098 15 1  0 15 0 0 0
8  0 5 3.217 15 1  0 15 0 0 0
8  0 6 2.716 15 1  0 15 0 0 0
8 19 1 2.299 16 1  1 15 0 0 0
8 19 2 2.448 16 1  1 15 0 0 0
8 19 3 2.307 16 1  1 15 0 0 0
8 19 4 2.347 16 1  1 15 0 0 0
8 19 5 2.289 16 1  1 15 0 0 0
8 19 6 2.462 16 1  1 15 0 0 0
9  0 1 1.985 32 0 21 11 1 1 1
9  0 2 2.497 32 0 21 11 1 1 1
9  0 3 1.735 32 0 21 11 1 1 1
9  0 4 1.562 32 0 21 11 1 1 1
end
The cortical areas have very different absolute thickness scales, so using raw thickness values makes interpretation difficult across regions.
Initially, I tried z-scoring thickness within each area
and fitting longitudinal mixed models such as:
Code:
mixed z_area i.sclerosi##c.Months_between_MRI i.resistant##c.Months_between_MRI i.toniclonic##c.Months_between_MRI Age Gender ICV Epilepsy_duration || idd: || area_id:
And then:
Code:
margins sclerosi#resistant#toniclonic, ///
at(Months_between_MRI=(0 30 60 120))
marginsplot, xdimension(Months_between_MRI)
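For reference, a minimal sketch of how the z_area outcome used in the model above could be constructed. It assumes the standardization pools all subjects and both timepoints within each area; standardizing against baseline scans only would be a different, equally plausible choice. The helper variable names mean_a and sd_a are hypothetical.

```stata
* Mean and SD of thickness within each cortical area
* (assumption: all subjects and both timepoints are pooled)
bysort area_id: egen double mean_a = mean(a)
by area_id: egen double sd_a = sd(a)

* Area-specific z-score of cortical thickness
generate double z_area = (a - mean_a) / sd_a
```

With this construction a one-unit change in z_area corresponds to one within-area standard deviation of thickness, which is part of why the coefficients are harder to read than millimetres or percentages.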
However, interpretation became less intuitive, plotting trajectories was difficult, and I am unsure whether this is the best strategy given that I only have two timepoints.
Current idea
I am now considering computing the percentage change from baseline, pct_change = 100*(followup - baseline)/baseline,
and then analyzing it using only the follow-up observation.
This would produce one observation per subject-area combination representing the rate of cortical thinning.
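The computation described above could be sketched in Stata roughly as follows, starting from the long-format data. This is only an illustration: it assumes the baseline row is the one with Months_between_MRI == 0, and baseline_area is created here as the baseline thickness of the same subject-area pair.

```stata
* Sort each subject-area pair so the baseline scan comes first, then copy
* its thickness to every row of the pair (left missing if no baseline row)
bysort idd area_id (Months_between_MRI): generate double baseline_area = a[1] if Months_between_MRI[1] == 0

* Percentage change from baseline, defined on follow-up rows only
generate double pct_change = 100 * (a - baseline_area) / baseline_area if Months_between_MRI > 0

* Keep one row per subject-area pair: drops baseline rows and any pair
* that has no usable baseline or follow-up measurement
keep if Months_between_MRI > 0 & !missing(pct_change)
```

As a check against the example data, subject 1 in area 1 goes from 2.45 to 2.362, so pct_change = 100*(2.362 - 2.45)/2.45, which is about -3.59. Months_between_MRI stays on the retained follow-up rows, so the variable follow-up interval remains available to the model.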
Then fitting something like:
Code:
mixed pct_change i.sclerosi i.resistant i.toniclonic c.baseline_area Age Gender ICV Epilepsy_duration || area_id:
Does this approach seem statistically reasonable given only two timepoints, a variable follow-up duration, and many cortical regions per subject?
In the end, which analytical strategy would you suggest? My main aim is to identify which subjects, in terms of hippocampal sclerosis, drug resistance, and presence of tonic-clonic seizures, show the greatest cortical thickness reduction over time.
Thanks for your time!
Gianfranco
