  • Graphing longitudinal data with gaps

    Hi all, I'm trying to graph mean utility scores collected over time across a maximum of four surveys. The dataset has been xtset with patient ID as the panel variable and survey response as the time variable. I am currently using
    Code:
    egen mean = mean(utility), by(survey_response long_covid)
    to get the mean utility score by survey response for each long covid group (0=no, 1=yes). This is a snapshot of what the (dummy) data look like: patient ID, survey response, exposure dummy and, finally, utility score.
    Code:
    input double(patient_id survey_response) float(long_covid utility)
     15 2 0   .9865437
     21 1 0    .975916
     38 1 0   .7132339
     41 1 0   .5275346
     42 1 0   .9831794
     49 1 0   .6651402
     51 1 0  .52302635
     52 1 0   .6507832
     52 2 0   .8427859
     57 1 0 -.07101776
     57 2 0   .5238919
     57 3 0  .57239664
     62 3 0   .9750341
     62 2 0   .9750341
     62 1 0   .9750341
     62 4 1   .9750341
     63 2 0   .4941897
     63 1 0   .6041356
     65 2 0    .987404
     68 1 0 -.16829173
     70 1 0    .353614
     72 1 1 -.06054749
     75 1 0   .9796785
     76 2 0  .10147964
     77 1 0   .9751386
     77 2 0   .9751386
     82 1 0    .648912
     84 1 0   .7297567
     94 1 0   .9783098
     94 2 0   .9783098
     94 3 0   .9783098
     94 4 0   .9783098
     97 3 1   .9755546
    100 2 1   .9879804
    100 3 1   .9879804
    100 1 0   .9879804
    102 3 0  -.3323491
    102 1 0   .2568085
    104 2 0 -.04498811
    104 1 0   .3425053
    108 1 1  .27844697
    110 2 0   .9834457
    116 4 0    .987404
    116 1 0    .987404
    118 1 0   .5605557
    119 1 1  .08590872
    end
    The issue I have is that some participants (identified by patient_id) have completed 2 surveys, some 3, and some the maximum of 4, and combining them using
    Code:
    twoway (tsline mean if long_covid==0) (tsline mean if long_covid==1)
    produces a line that appears to show a downward gradient, but this more likely reflects loss to follow-up. My question is therefore: how would you change the graph to show one line of mean scores for participants who completed only 2 surveys, and further lines for those with 3 and 4 responses respectively? I'm hoping this will show that the current gradient arises simply because the most affected complete all surveys, whereas the least affected may only do 2. I am using Stata MP v. 16.1 within a secure research environment, so user-written commands are unfortunately not allowed. Many thanks for any help! Ollie
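One possible way to draw the graph described above, using only official Stata commands, is sketched here. It is an illustration under assumptions rather than a definitive answer: the variable names n_surveys, mean_u and tag are hypothetical, each row is assumed to be one completed survey, and rows with missing utility are assumed not to count as a response. Because long_covid can change within a patient across surveys in the dummy data, the panels below group observations by the long_covid value recorded at each survey, as the original egen call does.

```stata
* Sketch only: n_surveys, mean_u and tag are illustrative names, not from the post.

* Number of surveys each patient completed
* (assumes one row per survey; rows with missing utility are not counted)
bysort patient_id: egen n_surveys = total(!missing(utility))

* Mean utility at each survey, within completion group and long covid status
egen mean_u = mean(utility), by(survey_response n_surveys long_covid)

* Flag one row per plotted point so duplicate points are not drawn
egen byte tag = tag(survey_response n_surveys long_covid)

* One line per completion group, one panel per long covid status;
* sort makes each line run in survey order
twoway (connected mean_u survey_response if tag & n_surveys == 2, sort) ///
       (connected mean_u survey_response if tag & n_surveys == 3, sort) ///
       (connected mean_u survey_response if tag & n_surveys == 4, sort), ///
       by(long_covid) ///
       legend(order(1 "2 surveys" 2 "3 surveys" 3 "4 surveys")) ///
       xlabel(1(1)4) xtitle("Survey response") ytitle("Mean utility")
```

Participants with a single response are left out of the graph in this sketch. Running `tabulate n_surveys survey_response` beforehand shows how many observations sit behind each plotted mean, which helps judge how much of the apparent gradient could be due to loss to follow-up.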