  • Testing for Effect Modification

    Dear all,

    I am studying the association between green space and glycated hemoglobin (HbA1c), and I want to test whether there is effect modification by SEX and by living in a rural/urban area (URB_TYPE).
    I am using the svy prefix in my regressions because these are survey data.

    Code for SEX:
    Code:
    svy linearized : glm HBA1C i.NDVI300mean_tercis_f##i.SEX if diabetes_diag_med1==0 & PREG2==0, family(gaussian) link(log) eform
    testparm i.NDVI300mean_tercis_f#i.SEX
    Results:
    Code:
    . svy linearized : glm HBA1C ib(1).NDVI300mean_tercis_f#SEX if diabetes_diag_med1==0 & PR
    > EG2==0, family(gaussian) link(log) eform
    (running glm on estimation sample)
    
    Survey: Generalized linear models
    
    Number of strata = 14                              Number of obs   =     4,320
    Number of PSUs   = 49                              Population size = 5,938,824
                                                       Design df       =        35
    
    ---------------------------------------------------------------------------------------
                          |             Linearized
                    HBA1C |     exp(b)   std. err.      t    P>|t|     [95% conf. interval]
    ----------------------+----------------------------------------------------------------
     NDVI300mean_tercis_f#|
                      SEX |
                  1#Male  |   1.005928   .0069968     0.85   0.401     .9918231    1.020233
                2#Female  |   1.001994   .0091402     0.22   0.828     .9836091    1.020722
                  2#Male  |   .9982043   .0081972    -0.22   0.828     .9817012    1.014985
                3#Female  |   .9981192   .0084287    -0.22   0.825     .9811539    1.015378
                  3#Male  |   .9970581   .0056531    -0.52   0.607     .9856476    1.008601
                          |
                    _cons |   5.340516   .0294836   303.46   0.000     5.280995    5.400707
    ---------------------------------------------------------------------------------------
    Note: Variance scaled to handle strata with a single sampling unit.
    
    . 
    end of do-file
    
    . do "C:\Users\danie\AppData\Local\Temp\STD7e94_000000.tmp"
    
    . testparm i.NDVI300mean_tercis_f#i.SEX
    
    Adjusted Wald test
    
     ( 1)  [HBA1C]1b.NDVI300mean_tercis_f#1.SEX = 0
     ( 2)  [HBA1C]2.NDVI300mean_tercis_f#0b.SEX = 0
     ( 3)  [HBA1C]2.NDVI300mean_tercis_f#1.SEX = 0
     ( 4)  [HBA1C]3.NDVI300mean_tercis_f#0b.SEX = 0
     ( 5)  [HBA1C]3.NDVI300mean_tercis_f#1.SEX = 0
    
           F(  5,    31) =    0.37
                Prob > F =    0.8636
    Code for URB_TYPE:
    Code:
    svy linearized : glm HBA1C ib(1).NDVI300mean_tercis_f#URB_TYPE if diabetes_diag_med1==0 & PREG2==0, family(gaussian) link(log) eform
    
    *Testing overall interaction effect
    testparm i.NDVI300mean_tercis_f#i.URB_TYPE
    Results:
    Code:
    . svy linearized : glm HBA1C ib(1).NDVI300mean_tercis_f#URB_TYPE if diabetes_diag_med1==0
    >  & PREG2==0, family(gaussian) link(log) eform
    (running glm on estimation sample)
    
    Survey: Generalized linear models
    
    Number of strata = 14                              Number of obs   =     4,320
    Number of PSUs   = 49                              Population size = 5,938,824
                                                       Design df       =        35
    
    ---------------------------------------------------------------------------------------
                          |             Linearized
                    HBA1C |     exp(b)   std. err.      t    P>|t|     [95% conf. interval]
    ----------------------+----------------------------------------------------------------
     NDVI300mean_tercis_f#|
                 URB_TYPE |
                 1#Urban  |   1.002829   .0208283     0.14   0.893     .9614241    1.046017
                 2#Rural  |   1.003726   .0176537     0.21   0.834      .968519    1.040212
                 2#Urban  |   .9980727   .0210786    -0.09   0.928     .9561852    1.041795
                 3#Rural  |    .994309   .0204284    -0.28   0.783     .9536901    1.036658
                 3#Urban  |    1.00056   .0210184     0.03   0.979     .9587869    1.044152
                          |
                    _cons |   5.341458   .1065414    84.00   0.000     5.129488    5.562187
    ---------------------------------------------------------------------------------------
    Note: Variance scaled to handle strata with a single sampling unit.
    
    . 
    end of do-file
    
    . do "C:\Users\danie\AppData\Local\Temp\STD7e94_000000.tmp"
    
    . testparm i.NDVI300mean_tercis_f#i.URB_TYPE
    
    Adjusted Wald test
    
     ( 1)  [HBA1C]1b.NDVI300mean_tercis_f#2.URB_TYPE = 0
     ( 2)  [HBA1C]2.NDVI300mean_tercis_f#1b.URB_TYPE = 0
     ( 3)  [HBA1C]2.NDVI300mean_tercis_f#2.URB_TYPE = 0
     ( 4)  [HBA1C]3.NDVI300mean_tercis_f#1b.URB_TYPE = 0
     ( 5)  [HBA1C]3.NDVI300mean_tercis_f#2.URB_TYPE = 0
    
           F(  5,    31) =    2.70
                Prob > F =    0.0388

    Am I using the correct code and methodology to test for effect modification?
    How do I interpret the results of the overall interaction p-value?
    It seems to be the case that SEX is not an effect modifier and URB_TYPE is but I am not sure if this is the correct way to go about it.

  • #2
    I don't see anything wrong with the code as an implementation of these tests of effect modification. I am assuming that your -svyset-ing of the data set was correctly done.
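
    One detail worth checking against the output: the SEX command that actually ran, and the URB_TYPE command as posted, use a single # rather than ##. That form fits the six tercile-by-group cells without separate main effects, so the five coefficients are contrasts of each cell with the base cell, and the 5-df testparm asks whether any cell differs from the base, not whether the NDVI association differs between groups. Below is a minimal sketch of the ## version, using only variable names from the post. Replacing the if qualifier with subpop() is a suggestion rather than something taken from the thread; it assumes the excluded respondents belong to the same survey design and keeps them in the variance calculation.

    ```stata
    * Show the survey design settings that the svy prefix will use
    svyset

    * ## expands to both main effects plus their interaction
    * (the /// line continuation works in a do-file, not at the command line)
    svy linearized, subpop(if diabetes_diag_med1==0 & PREG2==0): ///
        glm HBA1C i.NDVI300mean_tercis_f##i.SEX, ///
        family(gaussian) link(log) eform

    * Only the interaction terms are tested now: (3-1)*(2-1) = 2 constraints
    testparm i.NDVI300mean_tercis_f#i.SEX
    ```

    With this specification testparm should report 2 numerator degrees of freedom instead of 5. The same pattern applies to URB_TYPE.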

    "It seems to be the case that SEX is not an effect modifier and URB_TYPE is but I am not sure if this is the correct way to go about it."
    This is a typical example of the pathological reification of statistical significance. P-values are inherently continuous, and statistical significance (at whatever level is chosen) is an arbitrary dichotomization imposed on them. It is always wrong to conclude that a statistically significant result means "effect" and a statistically insignificant result means "no effect," even though that practice is widely taught. It would be fair to conclude that the data are less consistent with the absence of an effect modification by URB_TYPE than they are with the absence of an effect modification by SEX. Otherwise put, the data are much more surprising if there is no effect modification by URB_TYPE than they are if there is no effect modification by SEX.

    I would suggest that you look at this not so much from a statistical significance perspective as from a practical perspective. The figures shown in the exp(b) column are, by the terms of your model, multipliers relative to the mean HbA1c in the base group (NDVI300mean_tercis_f = 1 with SEX = 0 or URB_TYPE = 1, respectively); with a Gaussian family and log link these are ratios of arithmetic means. These multipliers range from 0.994 to 1.006 across the two models (rounded to 3 decimal places). Your base HbA1c in each model is 5.341 to 3 decimal places. So the most extreme group means predicted by these models range from 5.309 to 5.373, against a baseline of 5.341. Given the clinical meaning of HbA1c, this would have to be considered negligible. If policies were being considered, and issues of disparate impact by sex or urban/rural location were of concern, I cannot imagine anybody considering differences of this magnitude important enough to worry about, regardless of whether they are "statistically significant." What you have here is a sample size that is large enough, in some cases, to make a meaninglessly small difference in outcome "statistically significant."
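
    The arithmetic behind that range can be reproduced directly. The sketch below only multiplies the rounded figures quoted from the two output tables; the scalar names are made up for illustration.

    ```stata
    * Rounded baseline HbA1c (_cons) and the most extreme exp(b) multipliers
    scalar base    = 5.341
    scalar mult_lo = 0.994
    scalar mult_hi = 1.006

    * Implied HbA1c at each extreme
    display %5.3f scalar(base)*scalar(mult_lo)
    display %5.3f scalar(base)*scalar(mult_hi)

    * Largest departure from baseline, in HbA1c units
    display %5.3f scalar(base)*(scalar(mult_hi) - 1)
    ```

    This displays 5.309, 5.373 and 0.032, so no group sits more than about 0.03 HbA1c units from the baseline.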

    Beyond that, there is also the crucial question of whether there are other variables that may confound the associations under study that ought to be added to the modeling. This is a substantive question that I lack the expertise to advise you on.
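
    If confounders are added, the interaction can be tested in the adjusted model in the same way. The sketch below is only illustrative: age and education are hypothetical placeholder names, since the thread does not list the actual covariates, and the subpop() option is again a suggestion in place of the if qualifier.

    ```stata
    * age and education are HYPOTHETICAL placeholders; substitute the
    * confounders actually used in the final model
    svy linearized, subpop(if diabetes_diag_med1==0 & PREG2==0): ///
        glm HBA1C i.NDVI300mean_tercis_f##i.URB_TYPE c.age i.education, ///
        family(gaussian) link(log) eform

    * Adjusted test of the interaction terms alone
    testparm i.NDVI300mean_tercis_f#i.URB_TYPE
    ```

    Which variables belong in the model remains a substantive decision, not something the code can settle.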

    • #3
      Thank you, Clyde!
      I do have other variables that I controlled for in my final model, but I did not add them to this analysis because I wanted to focus on the effect modification by each of these 2 variables separately.
      Maybe that is not the correct way to run this analysis, and I should add the confounders too.
