  • #16
    Do you interpret as a 1 unit difference over 10 years in x is associated with 1.36 unit difference in y at 10 year follow up?
    First, the content in #15 contradicts itself: you indicate that Y is an index ranging from 0 to 1, and then in the following sentence you say it is a protein value. Since a 1.36 unit difference in Y with a unit difference in X is not possible if Y itself is confined to the 0 to 1 range, that would make interpreting these results rather dicey (it would imply that a unit change in X is simply not possible, or else that the model is highly non-linear in X and therefore the results not interpretable at all).

    I'm going to assume that you meant that X ranges from 0 to 1 and Y is a protein value with a potential range from 0 to some large number. Since X itself is restricted to the 0 to 1 interval, a unit difference can only happen as the difference between exactly 0 and exactly 1. So I would interpret this result as: given a pair of patients, one with X = 0 and the other with X = 1 at the same time point (which could be at 0, 5 or 10 years), the expected difference in Y is 1.36 units.

    If you are looking for an interpretation based on the change in value of X over time within the same person, you are doing the wrong analysis. For a within patient analysis, you should be using a fixed-effects regression, not random effects. Unfortunately, the -xtreg, fe- command that would do that is not supported by -mi estimate-. While one can force the use of -mi estimate- with an unsupported command by using the -cmdok- option, in general when -mi estimate- doesn't support a command it is because the command does not meet the statistical requirements for multiple imputation analysis to produce valid results. So I'm not sure where you go from here in this case.

    Correction: -mi estimate- does support -xtreg, fe-. So I recommend you go that route if you are looking for effect of a within-patient change over time.



    • #17
      I apologise, yes you are correct in your assumption in paragraph 2. Ok, I see that's an important difference. I attempted to run the xtreg command and these were the results.

      Code:
      mi estimate: xtreg y x education_level_bl bmi_BL smoke_status_BL i.time, i(Patient_ID) fe
      
      Multiple-imputation estimates                   Imputations       =         20
      Fixed-effects (within) regression               Number of obs     =      1,984
      
      Group variable: Patient_ID                      Number of groups  =      1,018
                                                      Obs per group:
                                                                    min =          1
                                                                    avg =        1.9
                                                                    max =          3
                                                      Average RVI       =     0.0000
                                                      Largest FMI       =     0.0000
                                                      Complete DF       =        963
      DF adjustment:   Small sample                   DF:     min       =     961.01
                                                              avg       =     961.01
                                                              max       =     961.01
      Model F test:       Equal FMI                   F(   3,  961.0)   =     303.73
      Within VCE type: Conventional                   Prob > F          =     0.0000
      
      ------------------------------------------------------------------------------------
                       y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------------+----------------------------------------------------------------
                       x |   .4976953   .2060433     2.42   0.016     .0933487    .9020419
      education_level_bl |          0  (omitted)
                  bmi_BL |          0  (omitted)
         smoke_status_BL |          0  (omitted)
                         |
                    time |
                      5  |   .5771964   .0297903    19.38   0.000     .5187349    .6356579
                     10  |   .8554192   .0458764    18.65   0.000     .7653897    .9454487
                         |
                   _cons |   2.146974   .0394742    54.39   0.000     2.069508    2.224439
      -------------------+----------------------------------------------------------------
                 sigma_u |  .68773736
                 sigma_e |  .47417033
                     rho |  .67780023   (fraction of variance due to u_i)
      ------------------------------------------------------------------------------------
      Note: sigma_u and sigma_e are combined in the original metric.



      • #18
        You not only attempted to run it, you ran it successfully. If I am correct in believing that the *_BL variables represent baseline values, those are omitted from within-person models because they do not change within-person and therefore carry no information for the value of y. If you think that the *_BL variables might have important effects on the y:x relationship (your earlier models explicitly preclude that possibility, but perhaps you did not intend that), then that must be modeled by interacting them with x. If you're going in this direction, you might want to also interact x with the time variables, since the time span over which x changes might well affect the y:x relationship. If you want to go that route, the code looks like this:
        Code:
        mi estimate: xtreg y c.x#(c.education_level_bl c.bmi_BL i.smoke_status_BL i.time) i.time, i(Patient_ID) fe
        When you run this, the "main" terms for the *_BL variables will still drop, but the interactions will be retained, which is what you want. The time variables will not drop, and the interactions with x will be added.

        I have assumed in the above code that education_level_bl and bmi_BL are continuous variables, but smoke_status_BL is discrete. If that is not the case, change the c. and i. prefixes accordingly. It's important to get that right.

        If that's not the direction you are going, we can just interpret the results you have. First we notice that the passage of time is, itself, associated with changing values of y. All else equal, y increases by about 0.58 between years 0 and 5, and by about 0.86 between years 0 and 10. On top of that, if we look at people who start at x = 0 and go to x = 1 sometime later, the mean difference in y between those time periods will be about another 0.5.
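        To make the arithmetic in that reading concrete, here is a minimal sketch in plain Python (not Stata) that adds the time coefficient and the x coefficient from the output in #17. The helper name and the dictionary are illustrative only, and the sketch assumes the purely additive model fitted there; it gives point estimates with no standard errors.

```python
# Coefficients copied from the -mi estimate: xtreg, fe- output in #17.
B_X = 0.4976953                          # coefficient of x
B_TIME = {5: 0.5771964, 10: 0.8554192}   # follow-up years, relative to baseline


def expected_change_in_y(years, change_in_x):
    """Expected within-person change in y from baseline to `years`,
    given a within-person change in x of `change_in_x` (hypothetical helper)."""
    return B_TIME[years] + B_X * change_in_x


for years in (5, 10):
    same_x = expected_change_in_y(years, 0.0)
    x_up_1 = expected_change_in_y(years, 1.0)
    print(f"year {years}: x unchanged {same_x:.2f}, x from 0 to 1 {x_up_1:.2f}")
```

        The "x unchanged" figures are the time coefficients alone; moving x from 0 to 1 adds roughly another 0.5 on top of them.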



        • #19
          Thank you. Yes, that is correct regarding the _BL variables. I wonder, if I run the code as you suggest, how I should then interpret the results. Would that again be with all else equal, y increases by about 0.55 between years 0 and 5, and by about 0.60 between years 0 and 10? How does one interpret the other estimates with the interactions?

          Code:
          mi estimate: xtreg y c.x#(i.education_level_bl c.bmi_BL i.smoke_status_BL i.time) i.time, 
          > i(Patient_ID) fe
          
          Multiple-imputation estimates                   Imputations       =         20
          Fixed-effects (within) regression               Number of obs     =      1,984
          
          Group variable: Patient_ID                      Number of groups  =      1,018
                                                          Obs per group:
                                                                        min =          1
                                                                        avg =        1.9
                                                                        max =          3
                                                          Average RVI       =     0.0022
                                                          Largest FMI       =     0.0169
                                                          Complete DF       =        953
          DF adjustment:   Small sample                   DF:     min       =     922.20
                                                                  avg       =     939.77
                                                                  max       =     950.92
          Model F test:       Equal FMI                   F(  13,  951.0)   =      73.76
          Within VCE type: Conventional                   Prob > F          =     0.0000
          
          -------------------------------------------------------------------------------------------
                                  y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          --------------------------+----------------------------------------------------------------
             education_level_bl#c.x |
                 Elementary school  |  -1.378386    1.25679    -1.10   0.273    -3.844875    1.088103
          Nine-year compulsory s..  |  -.1556424   1.448169    -0.11   0.914    -2.997676    2.686391
           Junior secondary school  |  -.6930909   1.235276    -0.56   0.575    -3.117348    1.731166
               Vocational training  |  -.3835031   1.242827    -0.31   0.758     -2.82257    2.055564
            Upper secondary school  |  -1.487734   1.334446    -1.11   0.265    -4.106589    1.131121
             University/highschool  |  -.8440084   1.300975    -0.65   0.517    -3.397193    1.709176
                                    |
                       c.x#c.bmi_BL |    .097991   .0435003     2.25   0.025     .0126199     .183362
                                    |
                smoke_status_BL#c.x |
                        Non-smoker  |  -.6808096   .3862928    -1.76   0.078    -1.438898     .077279
                    Current smoker  |  -.4613976   .5523961    -0.84   0.404    -1.545459    .6226639
                   Previous smoker  |          0  (omitted)
                                    |
                           time#c.x |
                                 1  |  -1.243472   .3554801    -3.50   0.000    -1.941088   -.5458553
                                 5  |  -.9185097   .3017359    -3.04   0.002    -1.510655   -.3263644
                                10  |          0  (omitted)
                                    |
                               time |
                                 5  |   .5457145   .0608311     8.97   0.000     .4263357    .6650932
                                10  |   .5890092   .0868322     6.78   0.000     .4186044    .7594141
                                    |
                              _cons |   2.269578   .0610364    37.18   0.000     2.149796     2.38936
          --------------------------+----------------------------------------------------------------
                            sigma_u |  .72026621
                            sigma_e |  .46922543
                                rho |   .7020492   (fraction of variance due to u_i)
          -------------------------------------------------------------------------------------------
          Note: sigma_u and sigma_e are combined in the original metric.



          • #20
            Would that again be with all else equal, y increases by about 0.55 between years 0 and 5, and by about 0.60 between years 0 and 10?
            No. Because there are also c.x#i.time interaction terms in the model, the effects of passage of time depend on the value of x, and the increases of 0.55 and 0.60 apply only when x = 0. For any other value of x, the effects of passage of time are a bit more complicated. I did not anticipate that Stata would use a different base category for time in the i.time term than it used in the interaction. Were it not for the multiple imputation, we could nevertheless interpret these results readily enough with the -margins- command, but -margins- does not run after -mi estimate-. Fortunately, we have Daniel Klein's -mimrgns- command from SSC to work with. Nevertheless, to simplify matters, I suggest a slight modification to the way you rerun the regression. Then pick a series of interesting values of x at which to evaluate the marginal effects of the passage of time. For the purposes of illustration below I will use x = 0 by 0.1 to 1.

            Code:
            mi estimate: xtreg y c.x##(i.education_level_bl c.bmi_BL i.smoke_status_BL ib1.time), fe i(Patient_ID) // N.B. USING ##, NOT # THIS TIME
            mimrgns, dydx(time) at(x = (0(0.1)1))
            Run these and you will get a table showing you the marginal effects of the passage of time at each of your time points combined with each of the selected values of x. (In the output of -mi estimate: xtreg...- the coefficients of the *_BL variables will, again, be omitted. That is appropriate and you should not be concerned when you see it.)

            How does one interpret the other estimates with the interactions?
            For the other interactions, you are interested in how the effect of x itself is modified by the baseline variables. Remember, in these models there is no such thing as "the" marginal effect of x: there is a different marginal effect of x for each value of the variable(s) with which it is interacted. Those you can get with:
            Code:
            mimrgns education_level_bl smoke_status_BL, dydx(x)
            for the discrete baseline education and smoking variables. For the continuous variable BMI, you will once again need to select values of BMI at which to evaluate the marginal effect of x. Again, for illustration I will use BMI = 15, 20, 25, 30, 35, 40 (values that span the usual clinical categories of body mass index).

            Code:
            mimrgns, dydx(x) at(bmi_BL = (15(5)40))
            Bear in mind, also, that because there are several different interaction terms, their combined effects also apply. That is, for example, the marginal effect of x at a given education level will also depend on the baseline smoking category and baseline BMI. In principle, you can have -mimrgns- calculate the marginal effects of x at all interesting combinations of those variables, but the volume of results you will generate will be too large and complicated for the minds of mere mortals to really grasp. The results that -mimrgns- produces for one variable at a time, as shown above, are actually averaged over the distribution of the other variables in the model, so they are, in a sense, the "typical" marginal effects of x conditional on those particular baseline variables. In fact, these are appropriately called average marginal effects adjusted to the distribution of other variables. I think they are few enough and simple enough that a comprehensible presentation of findings can be based upon them.
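            As a rough illustration of why there is no single marginal effect of x, the sketch below (plain Python, not Stata) adds up the interaction coefficients from the output in #19 for a few covariate profiles. The profiles and the helper name are made up for illustration. This gives the effect of x conditional on one specific profile, not the averaged quantity that -mimrgns- reports, and it carries no standard errors.

```python
# Interaction coefficients copied from the output in #19
# (that model has no main term for x, so these sum to the effect of x).
EDU_X = {
    "Elementary school": -1.378386,
    "Nine-year compulsory school": -0.1556424,
    "Junior secondary school": -0.6930909,
    "Vocational training": -0.3835031,
    "Upper secondary school": -1.487734,
    "University/highschool": -0.8440084,
}
BMI_X = 0.097991
SMOKE_X = {"Non-smoker": -0.6808096, "Current smoker": -0.4613976,
           "Previous smoker": 0.0}          # omitted level
TIME_X = {1: -1.243472, 5: -0.9185097, 10: 0.0}  # 10 is the omitted level


def effect_of_x(education, bmi, smoking, time):
    """Effect of a one-unit change in x for one profile (hypothetical helper)."""
    return EDU_X[education] + BMI_X * bmi + SMOKE_X[smoking] + TIME_X[time]


# Illustrative profiles only; they are not taken from the data.
profiles = [
    ("Elementary school", 25, "Non-smoker", 1),
    ("Elementary school", 35, "Non-smoker", 1),
    ("University/highschool", 25, "Previous smoker", 10),
]
for profile in profiles:
    print(profile, round(effect_of_x(*profile), 3))
```

            Changing any one element of the profile changes the effect of x, which is why averaging over the other variables, one variable at a time, is the more digestible summary.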




            • #21
              Ok so I have run the new code and have the following results. If I understand you correctly the interpretation is: After 5 years, when x is at whichever value between 0 and 1 (Whichever we choose) we see there is a 0.54 or 0.87 (whichever we choose) increase in y? and that this is the marginal effect of time at each level of x or 5 or 10 years?


              Code:
              . mi estimate: xtreg y c.x##(i.education_level_bl c.bmi_BL i.smoke_status_BL ib1.time), fe i
              > (Patient_ID) // N.B. USING ##, NOT # THIS TIME
              
              Multiple-imputation estimates                   Imputations       =         20
              Fixed-effects (within) regression               Number of obs     =      1,984
              
              Group variable: Patient_ID                      Number of groups  =      1,018
                                                              Obs per group:
                                                                            min =          1
                                                                            avg =        1.9
                                                                            max =          3
                                                              Average RVI       =     0.0022
                                                              Largest FMI       =     0.0169
                                                              Complete DF       =        953
              DF adjustment:   Small sample                   DF:     min       =     922.20
                                                                      avg       =     946.21
                                                                      max       =     950.76
              Model F test:       Equal FMI                   F(  13,  951.0)   =      73.76
              Within VCE type: Conventional                   Prob > F          =     0.0000
              
              -------------------------------------------------------------------------------------------
                                      y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
              --------------------------+----------------------------------------------------------------
                                      x |  -3.302667   1.229513    -2.69   0.007    -5.715617   -.8897175
                                        |
                     education_level_bl |
              Nine-year compulsory s..  |          0  (omitted)
               Junior secondary school  |          0  (omitted)
                   Vocational training  |          0  (omitted)
                Upper secondary school  |          0  (omitted)
                 University/highschool  |          0  (omitted)
                                        |
                                 bmi_BL |          0  (omitted)
                                        |
                        smoke_status_BL |
                        Current smoker  |          0  (omitted)
                       Previous smoker  |          0  (omitted)
                                        |
                                   time |
                                     5  |   .5457145   .0608311     8.97   0.000     .4263357    .6650932
                                    10  |   .5890092   .0868322     6.78   0.000     .4186044    .7594141
                                        |
                 education_level_bl#c.x |
              Nine-year compulsory s..  |   1.222744   .7984418     1.53   0.126    -.3441697    2.789657
               Junior secondary school  |    .685295   .4510549     1.52   0.129    -.1998839    1.570474
                   Vocational training  |   .9948828   .4250264     2.34   0.019     .1607826    1.828983
                Upper secondary school  |  -.1093483   .7922222    -0.14   0.890    -1.664055    1.445359
                 University/highschool  |   .5343776   .6281508     0.85   0.395    -.6983451      1.7671
                                        |
                           c.x#c.bmi_BL |    .097991   .0435003     2.25   0.025     .0126199     .183362
                                        |
                    smoke_status_BL#c.x |
                        Current smoker  |    .219412   .4820576     0.46   0.649    -.7266119    1.165436
                       Previous smoker  |   .6808096   .3862928     1.76   0.078     -.077279    1.438898
                                        |
                               time#c.x |
                                     5  |   .3249621    .295196     1.10   0.271    -.2543491    .9042734
                                    10  |   1.243472   .3554801     3.50   0.000     .5458553    1.941088
                                        |
                                  _cons |   2.269578   .0610364    37.18   0.000     2.149796     2.38936
              --------------------------+----------------------------------------------------------------
                                sigma_u |  .72026621
                                sigma_e |  .46922543
                                    rho |   .7020492   (fraction of variance due to u_i)
              -------------------------------------------------------------------------------------------
              Note: sigma_u and sigma_e are combined in the original metric.
              
              . mimrgns, dydx(time) at(x = (0(0.1)1))
              note: option predict() not specified; predict(xb) assumed
              
              Multiple-imputation estimates                   Imputations       =         20
              Average marginal effects                        Number of obs     =      1,984
                                                              Average RVI       =     0.0002
                                                              Largest FMI       =     0.0006
                                                              Complete DF       =        953
              DF adjustment:   Small sample                   DF:     min       =     950.38
                                                                      avg       =     950.57
              Within VCE type: Delta-method                           max       =     950.95
              
              Expression   : Linear prediction, predict(xb)
              dy/dx w.r.t. : 5.time 10.time
              
              1._at        : x               =           0
              
              2._at        : x               =          .1
              
              3._at        : x               =          .2
              
              4._at        : x               =          .3
              
              5._at        : x               =          .4
              
              6._at        : x               =          .5
              
              7._at        : x               =          .6
              
              8._at        : x               =          .7
              
              9._at        : x               =          .8
              
              10._at       : x               =          .9
              
              11._at       : x               =           1
              
              ------------------------------------------------------------------------------
                           |      dy/dx   Std. Err.      t    P>|t|     [95% Conf. Interval]
              -------------+----------------------------------------------------------------
              5.time       |
                       _at |
                        1  |   .5457145   .0608311     8.97   0.000     .4263357    .6650932
                        2  |   .5782107   .0382441    15.12   0.000     .5031581    .6532632
                        3  |   .6107069   .0311065    19.63   0.000     .5496616    .6717521
                        4  |   .6432031   .0470683    13.67   0.000     .5508332    .7355729
                        5  |   .6756993    .072153     9.36   0.000     .5341017    .8172969
                        6  |   .7081955    .099697     7.10   0.000     .5125438    .9038473
                        7  |   .7406917    .128124     5.78   0.000     .4892531    .9921303
                        8  |   .7731879   .1569549     4.93   0.000     .4651698    1.081206
                        9  |   .8056842   .1860019     4.33   0.000     .4406623    1.170706
                       10  |   .8381804   .2151776     3.90   0.000     .4159022    1.260458
                       11  |   .8706766   .2444359     3.56   0.000     .3909801    1.350373
              -------------+----------------------------------------------------------------
              10.time      |
                       _at |
                        1  |   .5890092   .0868322     6.78   0.000     .4186044    .7594141
                        2  |   .7133564   .0604169    11.81   0.000     .5947906    .8319222
                        3  |   .8377036   .0478319    17.51   0.000     .7438353    .9315719
                        4  |   .9620508   .0587614    16.37   0.000     .8467337    1.077368
                        5  |   1.086398   .0845295    12.85   0.000     .9205119    1.252284
                        6  |   1.210745   .1156066    10.47   0.000     .9838713    1.437619
                        7  |   1.335092   .1487006     8.98   0.000     1.043273    1.626912
                        8  |   1.459439   .1827188     7.99   0.000      1.10086    1.818018
                        9  |   1.583787   .2172275     7.29   0.000     1.157486    2.010088
                       10  |   1.708134   .2520253     6.78   0.000     1.213543    2.202724
                       11  |   1.832481    .287007     6.38   0.000      1.26924    2.395722
              ------------------------------------------------------------------------------
              Note: dy/dx for factor levels is the discrete change from the base level.



              • #22
                If I understand you correctly the interpretation is: After 5 years, when x is at whichever value between 0 and 1 (Whichever we choose) we see there is a 0.54 or 0.87 (whichever we choose) increase in y? and that this is the marginal effect of time at each level of x or 5 or 10 years?
                No, not correct.

                The average marginal effect of time = 5 on y, relative to the baseline value of time, is 0.55 if x = 0, 0.58 if x = 0.1, 0.61 if x = 0.2, and so on up to 0.87 if x = 1. And the average marginal effect of time = 10 on y, again relative to the baseline value of time, is 0.59 if x = 0, 0.71 if x = 0.1, 0.84 if x = 0.2, and so on up to 1.83 if x = 1.
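                These figures follow directly from the regression coefficients in #21: because the model is linear, the effect of each time point is its main coefficient plus its time#c.x interaction coefficient multiplied by x. A minimal sketch in plain Python (not Stata), with an illustrative helper name, which should reproduce the point estimates in the -mimrgns- table but not their standard errors:

```python
# Coefficients copied from the -mi estimate: xtreg- output in #21
# (time = 1 is the base level because of ib1.time).
B_TIME = {5: 0.5457145, 10: 0.5890092}     # main effects of time
B_TIME_X = {5: 0.3249621, 10: 1.243472}    # time#c.x interactions


def time_effect(time, x):
    """Effect on y of being at `time` rather than baseline, at a given x
    (hypothetical helper)."""
    return B_TIME[time] + B_TIME_X[time] * x


for tenths in range(0, 11):
    x = tenths / 10
    print(f"x = {x:.1f}: 5.time {time_effect(5, x):.4f}, "
          f"10.time {time_effect(10, x):.4f}")
```

                So each step of 0.1 in x raises the 5-year effect by about 0.03 and the 10-year effect by about 0.12.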



                • #23
                  Hello Schechter! I am new here. I am looking for a solution. I am trying to find out the difference between two individual continuous variables. How can I do that? When I am trying it by Mann-Whitney or Spearman's rank, STATA says no observation!



                  • #24
                    #23 is wildly off topic for this thread. It is easy to mistake Statalist threads for a dialog between a questioner and a responder. But they are more than that. Other people follow along, and still others come later and search for help on particular topics. So as to avoid wasting people's time sorting through irrelevant material, or making it impossible to find important advice because it is buried in a thread with an unrelated topic, it is important to keep threads on topic.

                    Please repost your question as a New Topic, not in another pre-existing thread. Before doing that, though, I urge you to read the Forum FAQ for excellent advice that will improve your chances of getting a timely and helpful response. Among the things you will find there:
                    1. When asking for help with code, always show the exact code you have tried, and whatever response you got from Stata. Do that by copy/pasting from your log file or the Results window into your post, and surround it with code delimiters (explained in the FAQ) for best readability.
                    2. When asking for help with code, show example data that those who want to help you can use to test their solutions. The only truly helpful way to show example data is by using the -dataex- command (also explained in the FAQ).
                    3. Except when you are responding to someone who responded to you, it is better not to address your question to any particular person. There are many people in the Forum who can answer most questions, and you gain nothing by discouraging them from attending to your query.



                    • #25
                      Dear Clyde, thank you for the clarity on the interpretation. Extremely helpful. Another thought I had: is it possible to assess the change in a binary variable over these time periods rather than a continuous one? For example, x could be coded as type 2 diabetes (no/yes), to assess whether changes in clinically diagnosed disease state affect y (our protein outcome).



                      • #26
                        Yes. Your example data does not contain a diabetes indicator variable. But assuming you have one, let's call it dm for purposes of discussion, coded 0 for no diabetes and 1 for yes diabetes, you just replace c.x by i.dm in the regression, and you change the -mimrgns- command to -mimrgns dm, dydx(time)-.



                        • #27
                          I wonder what would happen if, for example, we flipped the analysis round and used the binary diabetes variable (0 no, 1 yes) as our outcome variable. How would one interpret the results below? x_FS represents the outcome (diabetes) and y_FGF23 represents the exposure (protein level).

                          Code:
                           mi estimate: xtreg x_FS y_FGF23 bmi_change vitD_change i.education_level_bl i.smoke_change
                          >  i.time, i(Patient_ID) fe
                          
                          Multiple-imputation estimates                   Imputations       =         20
                          Fixed-effects (within) regression               Number of obs     =      1,925
                          
                          Group variable: Patient_ID                      Number of groups  =      1,011
                                                                          Obs per group:
                                                                                        min =          1
                                                                                        avg =        1.9
                                                                                        max =          3
                                                                          Average RVI       =     0.0061
                                                                          Largest FMI       =     0.0414
                                                                          Complete DF       =        906
                          DF adjustment:   Small sample                   DF:     min       =     804.53
                                                                                  avg       =     853.20
                                                                                  max       =     903.94
                          Model F test:       Equal FMI                   F(   8,  903.7)   =      43.44
                          Within VCE type: Conventional                   Prob > F          =     0.0000
                          
                          -------------------------------------------------------------------------------------------
                                               x_FS |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                          --------------------------+----------------------------------------------------------------
                                            y_FGF23 |   .0512858   .0246146     2.08   0.037     .0029774    .0995942
                                         bmi_change |   .0052032   .0077215     0.67   0.501    -.0099511    .0203576
                                        vitD_change |   .0009813   .0006715     1.46   0.144    -.0003367    .0022993
                                                    |
                                 education_level_bl |
                          Nine-year compulsory s..  |          0  (omitted)
                           Junior secondary school  |          0  (omitted)
                               Vocational training  |          0  (omitted)
                            Upper secondary school  |          0  (omitted)
                             University/highschool  |          0  (omitted)
                                                    |
                                       smoke_change |
                                                 1  |  -.0957292   .1014285    -0.94   0.346    -.2948249    .1033664
                                                 2  |  -.3107969   .1899033    -1.64   0.102    -.6835601    .0619662
                                                 3  |  -.1593208   .1143057    -1.39   0.164    -.3836817      .06504
                                                    |
                                               time |
                                                 5  |    .055652   .1050633     0.53   0.596    -.1505742    .2618782
                                                10  |    .283865   .1088539     2.61   0.009     .0702006    .4975293
                                                    |
                                              _cons |   1.013544   .2504921     4.05   0.000     .5219203    1.505168
                          --------------------------+----------------------------------------------------------------
                                            sigma_u |    .417668
                                            sigma_e |  .35204883
                                                rho |  .58463595   (fraction of variance due to u_i)
                          -------------------------------------------------------------------------------------------
                          Note: sigma_u and sigma_e are combined in the original metric.



                          • #28
                            This is a linear probability model. It suggests that a within-patient change of 1 unit in the y_FGF23 variable is associated with a 0.05 (i.e. 5 percentage points) increase in probability of diabetes, 95% CI 0 to 0.1 (i.e. 0 to 10 percentage points). (Rounding to 2 decimal places.)
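                            For readers who want the conversion spelled out, here is a minimal sketch in plain Python (not Stata) that turns the y_FGF23 coefficient and its confidence limits from the output in #27 into percentage points. The helper name is illustrative only.

```python
# Coefficient and 95% confidence limits for y_FGF23, copied from the output in #27.
coef, ci_low, ci_high = 0.0512858, 0.0029774, 0.0995942


def to_percentage_points(probability_difference):
    """Convert a difference in probability to percentage points (hypothetical helper)."""
    return 100 * probability_difference


print(f"{to_percentage_points(coef):.1f} percentage points "
      f"(95% CI {to_percentage_points(ci_low):.1f} to "
      f"{to_percentage_points(ci_high):.1f})")
```

                            At one decimal place the lower limit stays just above zero, which is consistent with the reported p-value of 0.037.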



                            • #29
                              I see. Maybe you noticed in the previous model that some of the confounders were different. I now have the possibility to create new confounding variables which are the change in said confounding variable, for example smoking status, BL, 5y, 10y follow up, BMI, BL, 5y and 10y follow up etc. Would you recommend to use the change of the confounding variables in the model or stick with the baseline ones...



                              • #30
                                Would you recommend to use the change of the confounding variables in the model or stick with the baseline ones...
                                I have no idea. I don't know what this protein is. I don't know what its connection to diabetes is, or can reasonably be supposed to be. (Even if I knew what the protein is, as an epidemiologist whose last brush with biochemistry was about 50 years ago, I probably wouldn't know what to make of it anyway.) So there is no way for me to know whether the baseline or concurrent values are more relevant to the diabetes outcome. That is neither a Stata question nor a statistical question. It is a biology question. If you do not feel comfortable answering the question yourself, you need to ask somebody who understands the biology of this protein and diabetes better.

