  • Interaction between two continuous variables

    Hi everybody,
    I'm using Stata 14 and am working on a biomarker (logLeptin) that may predict the outcome GDM_group (0/1). There seems to be a significant interaction between logLeptin and bmi; see the following:


    Code:
    . logit GDM_group c.logLeptin##c.bmi i.Rygning1 Moderens_alder Paritet3 Etnicitet_group H_jdecm 
    
    Iteration 0:   log likelihood = -445.72321  
    Iteration 1:   log likelihood = -406.28578  
    Iteration 2:   log likelihood = -392.54287  
    Iteration 3:   log likelihood = -392.01395  
    Iteration 4:   log likelihood = -392.00983  
    Iteration 5:   log likelihood = -392.00983  
    
    Logistic regression                             Number of obs     =      2,590
                                                    LR chi2(8)        =     107.43
                                                    Prob > chi2       =     0.0000
    Log likelihood = -392.00983                     Pseudo R2         =     0.1205
    
    -----------------------------------------------------------------------------------
            GDM_group |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ------------------+----------------------------------------------------------------
            logLeptin |   3.706393   .9813917     3.78   0.000     1.782901    5.629886
                  bmi |   1.292402   .3026315     4.27   0.000     .6992556    1.885549
                      |
    c.logLeptin#c.bmi |   -.114701   .0294434    -3.90   0.000     -.172409    -.056993
                      |
             Rygning1 |
              rygere  |  -.0437373   .2633501    -0.17   0.868     -.559894    .4724194
       Moderens_alder |   .0707637   .0207224     3.41   0.001     .0301485    .1113789
             Paritet3 |  -.3005723   .1178892    -2.55   0.011    -.5316309   -.0695136
      Etnicitet_group |   .1521907   .4433702     0.34   0.731    -.7167988     1.02118
              H_jdecm |  -.0388302   .0164453    -2.36   0.018    -.0710624   -.0065981
                _cons |  -39.67156   10.44042    -3.80   0.000    -60.13441   -19.20872
    -----------------------------------------------------------------------------------
    I'm just in doubt whether the interaction is linear. How do I visually inspect the interaction?

    Regards Ida

  • #2
    Chuck Huber has a nice Youtube video on how to do this (marginsplot for 2 continuous variables)

    https://www.youtube.com/watch?v=QFROtui_OyM
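
    For reference, a minimal sketch of that approach applied to the model in #1. This is not taken from the video; the grid for logLeptin (5th to 95th percentile, in 10 steps) and the bmi values 22, 27, 32 and 37 are illustrative assumptions and should be adapted to the data.

    ```stata
    * Sketch only: the logLeptin grid and the bmi values are assumptions.
    logit GDM_group c.logLeptin##c.bmi i.Rygning1 Moderens_alder Paritet3 Etnicitet_group H_jdecm

    * 5th and 95th percentiles of logLeptin in the estimation sample
    quietly summarize logLeptin if e(sample), detail
    local lo   = r(p5)
    local hi   = r(p95)
    local step = (`hi' - `lo') / 10

    * Predicted probability of GDM over a grid of logLeptin, at fixed bmi values
    margins, at(logLeptin=(`lo'(`step')`hi') bmi=(22 27 32 37))

    * One line per bmi value, with logLeptin on the x-axis
    marginsplot, xdimension(logLeptin) noci
    ```

    If the lines for the different bmi values fan out or cross, that is the interaction showing up on the probability scale.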
    Last edited by Andrew Musau; 12 Apr 2017, 04:23.


    • #3
      Thank you, I'll try that.
      I'm in doubt whether I have produced the right interaction term. I'm trying to understand whether the level of logLeptin differs between 3 BMI groups (BMIGroup2) when the outcome is "GDM_group".
      I haven't succeeded with the BMI-group interaction term, so I started with BMI as a continuous variable (bmi) (the first post). But this doesn't tell me whether there is a difference between the BMI groups.
      Which of the two commands below is correct? How does the interpretation of the output differ?

      Code:
      . logit GDM_group c.logLeptin##i.BMIGroup2 i.Rygning1 Moderens_alder Paritet3 H_jdecm Etnicitet_group
      
      Iteration 0:   log likelihood = -445.72321  
      Iteration 1:   log likelihood =  -416.9304  
      Iteration 2:   log likelihood = -390.82548  
      Iteration 3:   log likelihood = -390.34473  
      Iteration 4:   log likelihood = -390.34283  
      Iteration 5:   log likelihood = -390.34283  
      
      Logistic regression                             Number of obs     =      2,590
                                                      LR chi2(10)       =     110.76
                                                      Prob > chi2       =     0.0000
      Log likelihood = -390.34283                     Pseudo R2         =     0.1242
      
      ---------------------------------------------------------------------------------------
                  GDM_group |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      ----------------------+----------------------------------------------------------------
                  logLeptin |   .5768597   .5677957     1.02   0.310    -.5359993    1.689719
                            |
                  BMIGroup2 |
                  30-34.99  |   5.662876   6.266205     0.90   0.366    -6.618659    17.94441
                    >34.99  |   15.96497   6.565828     2.43   0.015     3.096183    28.83375
                            |
      BMIGroup2#c.logLeptin |
                  30-34.99  |  -.3902378   .6402594    -0.61   0.542    -1.645123    .8646475
                    >34.99  |  -1.319169   .6646646    -1.98   0.047    -2.621888   -.0164504
                            |
                   Rygning1 |
                    rygere  |  -.0446384   .2631313    -0.17   0.865    -.5603662    .4710894
             Moderens_alder |   .0701757   .0208238     3.37   0.001     .0293618    .1109896
                   Paritet3 |  -.2978443   .1176235    -2.53   0.011    -.5283821   -.0673066
                    H_jdecm |  -.0388391   .0166489    -2.33   0.020    -.0714703   -.0062079
            Etnicitet_group |   .1383932   .4459385     0.31   0.756    -.7356301    1.012417
                      _cons |  -5.479419   6.203266    -0.88   0.377     -17.6376    6.678759
      ---------------------------------------------------------------------------------------
      Code:
      . logit GDM_group bmi logLeptin c.logLeptin#i.BMIGroup2 i.Rygning1 Moderens_alder Paritet3 H_jdecm Etnicitet_group
      
      Iteration 0:   log likelihood = -445.72321  
      Iteration 1:   log likelihood = -407.38398  
      Iteration 2:   log likelihood =  -394.2196  
      Iteration 3:   log likelihood = -394.01601  
      Iteration 4:   log likelihood = -394.01584  
      Iteration 5:   log likelihood = -394.01584  
      
      Logistic regression                             Number of obs     =      2,590
                                                      LR chi2(9)        =     103.41
                                                      Prob > chi2       =     0.0000
      Log likelihood = -394.01584                     Pseudo R2         =     0.1160
      
      ---------------------------------------------------------------------------------------
                  GDM_group |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      ----------------------+----------------------------------------------------------------
                        bmi |   .0154577   .0449262     0.34   0.731     -.072596    .1035115
                  logLeptin |  -.2692089    .230709    -1.17   0.243    -.7213903    .1829725
                            |
      BMIGroup2#c.logLeptin |
                  30-34.99  |   .2109428   .0597448     3.53   0.000     .0938452    .3280405
                    >34.99  |   .2482535   .0841808     2.95   0.003     .0832622    .4132448
                            |
                   Rygning1 |
                    rygere  |   -.025985   .2611249    -0.10   0.921    -.5377805    .4858105
             Moderens_alder |   .0718426   .0207042     3.47   0.001     .0312631     .112422
                   Paritet3 |  -.2981849   .1175493    -2.54   0.011    -.5285773   -.0677925
                    H_jdecm |  -.0380979   .0166331    -2.29   0.022    -.0706982   -.0054977
            Etnicitet_group |    .192406   .4410804     0.44   0.663    -.6720957    1.056908
                      _cons |   1.993322   3.731009     0.53   0.593    -5.319322    9.305966
      ---------------------------------------------------------------------------------------
      Regards Ida


      • #4
        I'm trying to understand whether the level of logLeptin differs between 3 BMI groups (BMIGroup2) when the outcome is "GDM_group"
        The first thing I would do is graph the data, which gives some idea of whether there are differences in means, medians, etc.


        Code:
        *\\Are mean levels different: visual
        graph bar logLeptin if GDM_group==1, over(BMIGroup2)
        
        
        *\\Are median levels different: visual
        graph box logLeptin if GDM_group==1, over(BMIGroup2)
        
        *\\Formal tests of differences in median levels: pairwise comparisons
        
        *I am assuming that BMIGroup2 with 3 levels is coded 1, 2, and 3
        *Difference between BMIGroup2 1 & 2 and BMIGroup2 2 & 3, etc.
        
        median logLeptin if GDM_group==1 & inlist(BMIGroup2, 1, 2), by(BMIGroup2)
        median logLeptin if GDM_group==1 & inlist(BMIGroup2, 2, 3), by(BMIGroup2)
        
        *Should be consistent with
        
        ranksum logLeptin if GDM_group==1 & inlist(BMIGroup2, 1, 2), by(BMIGroup2)
        ranksum logLeptin if GDM_group==1 & inlist(BMIGroup2, 2, 3), by(BMIGroup2)
        However, note that the descriptive analysis does not account for other factors that may lead to differences in logLeptin levels between groups. You do this by specifying a regression model. Here is my summary of your models:

        1) Your outcome is GDM_group. The implication is that you are trying to predict the probability that an individual is in this group.
        2) In the first model, you suspect that BMIGroup2 can predict GDM_group, and there is an interaction between BMIGroup2 and levels of logLeptin.
        3) In the second model, you suspect that bmi can predict GDM_group, but you include the interaction between logLeptin and BMIGroup2 while leaving out the main effects of BMIGroup2.
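
        Written out as linear predictors, the difference between the two models may be easier to see. This is only a sketch of the two specifications in #3: L stands for logLeptin, G for BMIGroup2 with group 1 as the base level, and x'θ for the remaining covariates. The symbols are my own notation, not Stata output.

        ```latex
        \begin{align*}
        \text{Model 1:}\quad \operatorname{logit} P(\text{GDM}=1)
          &= \beta_0 + \beta_1 L + \sum_{k=2}^{3} \gamma_k \,\mathbf{1}[G=k]
             + \sum_{k=2}^{3} \delta_k \, L \,\mathbf{1}[G=k] + x'\theta \\
        \text{Model 2:}\quad \operatorname{logit} P(\text{GDM}=1)
          &= \beta_0 + \alpha\,\text{bmi} + \beta_1 L
             + \sum_{k=2}^{3} \delta_k \, L \,\mathbf{1}[G=k] + x'\theta
        \end{align*}
        ```

        Model 2 has no γ_k terms, so at L = 0 the three groups can differ only through the linear bmi term, and the δ_k have to absorb any level difference between the groups as well as any slope difference.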

        Before I can comment further, how did you generate the variable BMIGroup2? Did you take the variable bmi and categorize it by age group?
        Last edited by Andrew Musau; 13 Apr 2017, 06:48.


        • #5
          Thank you Andrew, I'm getting closer.
          I want to account for all other known factors that may cause GDM_group (ikkeGDM (0)/GDM(1)), in a logistic regression model.
          The variable BMIGroup2 is a categorical variable, a recode of bmi (a continuous variable) into normal weight "18.5-24.99", moderately obese "30-34.99" and severely obese ">35":
          Code:
          . codebook BMIGroup2
          
          -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
          BMIGroup2                                                                                                                                    RECODE of BMIGroup (RECODE of bmi)
          -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
          
                            type:  numeric (double)
                           label:  BMI2
          
                           range:  [1,3]                        units:  1
                   unique values:  3                        missing .:  0/2,590
          
                      tabulation:  Freq.   Numeric  Label
                                   1,298         1  18.5-24.99
                                     870         2  30-34.99
                                     422         3  >34.99
          I've modified your graph suggestions a little bit, trying to illustrate what I want to achieve.

          My first question is: does logLeptin predict GDM_group (which is easy enough with a logit command including other predictors)? The tricky part is to build a model that answers the question: is the level of logLeptin different between the BMI groups, and does this affect logLeptin's ability to predict GDM_group? Here I'm not sure how to use the interaction term. It seems that there is an interaction between BMIGroup2 and logLeptin, which differs across the three groups and may also affect the prediction of GDM_group:

          Code:
          graph bar (mean) logLeptin, over(BMIGroup2) over(GDM_group)
          [Figure: grafbar gdm.png]
          Code:
          graph box logLeptin, over(GDM_group) over(BMIGroup2)
          [Figure: boxplot.png]
          Regards Ida


          • #6
            The variable BMIGroup2 is a categorical variable, a recode of bmi (a continuous variable) into normal weight "18.5-24.99", moderately obese "30-34.99" and severely obese ">35":
            It is not a good idea to categorize a continuous variable for the sole purpose of exploring differences between intervals of that variable: 1) you are throwing away valuable information; and 2) margins allows you to achieve what you want without categorizing. So, stay with the model:

            Code:
            logit GDM_group c.logLeptin##c.bmi i.Rygning1 Moderens_alder Paritet3 Etnicitet_group H_jdecm
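
            To see what the interaction in this model implies on the log-odds scale, the coefficients reported in #1 can be plugged in. This is a back-of-the-envelope sketch rather than new output, and the BMI values 22 and 37 are illustrative choices, not values taken from the thread.

            ```latex
            \begin{align*}
            \frac{\partial \operatorname{logit} P(\text{GDM}=1)}{\partial\,\text{logLeptin}}
              &= 3.706 - 0.1147 \times \text{bmi} \\
            \text{at bmi} = 22: \quad & 3.706 - 0.1147 \times 22 \approx 1.18 \\
            \text{at bmi} = 37: \quad & 3.706 - 0.1147 \times 37 \approx -0.54 \\
            \text{zero at:} \quad & \text{bmi} = 3.706 / 0.1147 \approx 32.3
            \end{align*}
            ```

            So the log-odds slope of logLeptin is positive at lower BMI and shrinks as BMI rises, changing sign at a BMI of roughly 32. Whether the same pattern holds on the probability scale has to be checked separately.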
            My first question is: does logLeptin predict GDM_group (which is easy enough with a logit command including other predictors)? The tricky part is to build a model that answers the question: is the level of logLeptin different between the BMI groups, and does this affect logLeptin's ability to predict GDM_group? Here I'm not sure how to use the interaction term. It seems that there is an interaction between BMIGroup2 and logLeptin, which differs across the three groups and may also affect the prediction of GDM_group:
            The way that you have specified your interaction is fine. The only caveat is that a significant coefficient on an interaction term does not imply that the difference in differences on the probability scale is significant, so you have to calculate the latter. The following link from the UCLA website, which uses Stata code, will guide you on how to proceed:


            http://stats.idre.ucla.edu/stata/sem...ic-regression/
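
            As a rough sketch of what that calculation can look like for this model (this is not the code from the UCLA page, and the bmi values are illustrative assumptions):

            ```stata
            * Sketch only: the bmi values below are illustrative assumptions.
            logit GDM_group c.logLeptin##c.bmi i.Rygning1 Moderens_alder Paritet3 Etnicitet_group H_jdecm

            * Average marginal effect of logLeptin on Pr(GDM) at several bmi values
            margins, dydx(logLeptin) at(bmi=(22 27 32 37))
            marginsplot

            * Test whether that marginal effect differs between bmi = 22 and bmi = 37
            margins, dydx(logLeptin) at(bmi=(22 37)) contrast(atcontrast(r))
            ```

            The first margins call reports the effect of logLeptin in probability units at each bmi value; the contrast in the last line is the probability-scale counterpart of the interaction coefficient for that particular pair of bmi values.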
