  • calculating annual change

    Dear all,

    I have data with repeated measures (at 2-year intervals) and I am trying to calculate the adjusted annual change in systolic blood pressure in 2 groups using a linear regression model.

    The data look roughly like this:
    Code:
    clear
    input long id int(systolic_bp diastolic_bp) float years_from_baseline double baseline_age float(baseline_sbp bmi baseline_bmi baseline_dbp gr)
    1 118 69   2 60.8 127 35.9   34 77 1
    1 114 64 3.9 60.8 127 35.9   35 77 1
    1 108 62 7.2 60.8 127 35.9   32 77 1
    2 128 74   2   71 127   33 30.9 87 2
    2 119 78 3.7   71 127   31 30.9 87 2
    2 100 60 5.7   71 127   28 30.9 87 2
    end


    Command:
    mixed systolic_bp i.gr##c.years_from_baseline bmi || id: years_from_baseline, var reml cov(unstr)


    My question is: is this the right way to calculate annual change (with measures at 2-year intervals)?


    I would really appreciate it if anyone could help (or direct) me.

    Thank you

  • #2
    In terms of your question, the fact that your data are spaced at 2 or more years does not preclude estimating an annual rate of change. Assuming that there is a linear trend in outcome, this is fine.
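
    As a minimal sketch of how the annual change could be read off such a model, assuming the specification and variable names from #1 (the -margins- step is an illustration, not something the original poster ran):

    ```stata
    * Refit the model from #1; the variance scale is -mixed-'s default.
    mixed systolic_bp i.gr##c.years_from_baseline bmi || id: years_from_baseline, reml covariance(unstructured)

    * Estimated change in systolic_bp per additional year, separately for each group.
    * The gr#c.years_from_baseline coefficient in the -mixed- output is the
    * between-group difference in these annual slopes.
    margins gr, dydx(years_from_baseline)
    ```

    Because years_from_baseline is measured in years, each slope is a per-year change regardless of how far apart the visits are, provided the linear trend holds.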

    The real question is whether a linear trend in outcome is a reasonable assumption. I notice that even in this small data sample, you have one observation taken 7.2 years from baseline. I think it is questionable whether one can reasonably expect a linear trend in blood pressure to be sustained over that long a period of time. If somebody's blood pressure is rising (or falling) over time due to lifestyle changes, or medications, one expects a change for a while, but eventually settling into a new equilibrium. To me, 7.2 years sounds like a very long time for a sustained trend. But really you should explore the data graphically first. Try -xtset id- and run some graphs like -xtline systolic_bp years_from_baseline- to see if long-term linearity looks reasonable or not. If it doesn't you would either restrict your analysis to a short enough time period that linearity is appropriate, or move to a more complicated model.
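
    One possible way to draw such exploratory graphs is sketched below. It uses -twoway line- rather than -xtline-, on the assumption that years_from_baseline is not integer-valued and so may not be accepted as an -xtset- time variable; variable names are those from #1.

    ```stata
    * Spaghetti plot: one trajectory per participant, one panel per group.
    * After sorting by id and time, connect(ascending) breaks the line
    * whenever the x value decreases, which normally happens at each new id.
    sort id years_from_baseline
    twoway line systolic_bp years_from_baseline, connect(ascending) by(gr)
    ```

    Note that two consecutive participants would be joined if the second one's first visit is later than the first one's last visit, so this is only a quick visual check.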

    Finally, I assume you have, in reality, a much larger data set to work with.


    • #3
      Thank you so much, Prof. Schechter. As suggested, I've tried -xtset gr- and -xtline systolic_bp years_from_baseline-. The data set is large: almost all participants had more than 2-3 visits, and some have more than 15 visits. Because of too many missing values, I decided to keep only those visits (every 2-year visit) where the majority of participants' data are available. I am not sure whether this approach is right or not. I am also looking at the annual change in glomerular filtration rate (GFR) in 4 groups.

      The graph (-xtline gfr years_from_baseline-) looks as below.
      [Image: graph.png]


      In this case, is it reasonable to use linear regression to calculate the annual change in GFR?

      Thank you.

      • #4
        It's hard to see what is going on in those graphs because the values at 0 on the horizontal axis are of a different order of magnitude, and so the rest of the graph gets squashed to the point where you can't see what's happening. I would re-do the graphs, restricting them to years_from_baseline > 0 to get a better look.
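
        A rough sketch of that restriction, assuming the outcome variable is named gfr as in the -xtline- command quoted in #3 and the group variable is gr as in #1:

        ```stata
        * Compare the scale of the outcome at baseline with the later visits.
        summarize gfr if years_from_baseline == 0, detail
        summarize gfr if years_from_baseline > 0, detail

        * Redraw the trajectories without the baseline observations.
        sort id years_from_baseline
        twoway line gfr years_from_baseline if years_from_baseline > 0, connect(ascending) by(gr)
        ```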


        • #5
          Thank you, Prof. Schechter. As suggested, I've restricted the graph to years_from_baseline > 0, and it now looks as below.

          [Image: graph1.png]


          • #6
            OK, these graphs don't look like they have a whole lot going on in the way of trends, linear or otherwise. It also appears that the data are very sparse after 6 years, so your analysis will not be much influenced by the late results. While I would have expected that the duration of time trends in blood pressure would be only 2 or 3 years, the data do not seem to show any turning point there, nor elsewhere. So I think you are OK. Based on the graphs, I will be surprised if you find any substantial trends either way. But my concern about a linear model being a serious mis-specification is apparently misplaced.

            I do have a concern about your data. Going back to the graphs shown in #3, something seems terribly wrong with the data at baseline. In groups 1 and 2 you show data points with blood pressure measurements in the 600-800 range. Now, you don't specify the units, but given that the rest of the data is all in the 50-120 range and given that the commonest unit used for BP measurements is mmHg, those values at time 0 cannot be right. Cardiac muscle is physically incapable of generating a pressure that high, and even if it could, the damage that would do in the brain is surely incompatible with life. You need to clean up that part of the data.

            (And, as I always say, data errors are like cockroaches--there is never just one. When you find a data error, you need to review the entire data management process that led to it, because there may be problems with other observations as well.)


            • #7
              Thank you so much, Prof. Schechter, and I am really sorry for confusing you. The graphs above show glomerular filtration rate (GFR); my primary interest is the annual change in GFR, but I am also looking at other measurements such as blood pressure.



              • #8
                Dear prof.Schechter,

                Could you please look at graphs #1 and #2 below?

                I will try to explain from the beginning. I have relatively large survival data with a 4-year median follow-up period. Based on Var1 and Var2, all participants were categorized into 4 groups. Var1 had <10% missing values during follow-up, while Var2 had more than 20%. The regression model for calculating the annual change in Var1 was adjusted for Var2 and other variables during follow-up. Because of the many missing values in Var2, I decided to keep only those visits (every 2-year visit) where the majority of participants' data are available. Results from these data are shown in graph #1 below.

                In graph #2 I've included all visits (every 4 months to 1 year), and Var2 and other covariates have >10-20% missing values.

                In this case, what would you suggest? Which one will produce more correct results?

                Graph #1
                [Image: graph1.png]

                Graph #2
                [Image: garph2.png]

                Thank you.
                Oyun


                • #9
                  I had suggested graphs like these earlier to deal with a concern about linearity. That is not the issue here. Here your concern is data missingness, and I don't think these graphs are all that helpful for this purpose. The overall appearance of both graphs does suggest that the thinned out data follow the same general trends and appearance as the denser data. But I'm not sure we can conclude much from this, because even if the thinned data looks representative of the whole data set on this predictor variable, it doesn't tell us how they relate to the survival outcome you want to model. And the latter is more important.

                  There are no easy, satisfying solutions to the problem of missing data in general. The following link to a paper by Paul Allison, https://www.unc.edu/~nielsen/soci709/cdocs/allison.pdf, outlines the theoretical and practical issues associated with some approaches to missing data. It is a long, and not particularly easy, read. But I don't have a good way to distill it down to a Forum post, and I cannot pretend to understand your particular data well enough to say how these principles and methods would apply to them. I suggest you step back from your project and study at least the first three chapters of Allison's paper and see if that helps you approach this question.
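
                  Before reading further, it may help to quantify the missingness itself. The following is only a sketch; Var1 and Var2 are the placeholder names used in #8, not actual variable names in the data set.

                  ```stata
                  * Count missing values in each variable.
                  misstable summarize Var1 Var2

                  * Show which combinations of variables tend to be missing together,
                  * reported as frequencies rather than percentages.
                  misstable patterns Var1 Var2, frequency
                  ```

                  This describes how much is missing and in what pattern; it does not by itself say which of the approaches discussed by Allison is appropriate.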


                  • #10
                    Thank you so much, Prof. Schechter, for your valuable comments. As suggested, I will take the time to read the paper.
