
  • Using margins, test, and lincom to test hypothesis that two predictive values are equal

    Hi,
    I'm trying to do something that should be easy but I'm not certain I am doing it/interpreting the output the correct way.

    My basic model is a multilevel (mixed) model, and I am interested in the independent variables AC, CP, and their interaction. Specifically, I am predicting that CP will be a significant predictor of DV when AC is low, but that CP will become irrelevant when AC is high.

    Thus, I run the following:

    Code:
    mixed DV X Y AC##CP || Country: || ParticipantID:
    
    (deleted since question is about the next step)
    
     margins, at(AC=(1 6) CP=(40 70)) post
    
    
    
    Predictive margins                              Number of obs     =      7,243
    
    Expression   : Linear prediction, fixed portion, predict()
    
    1._at        : AC              =           1
                   CP              =          40
    
    2._at        : AC              =           1
                   CP              =          70
    
    3._at        : AC              =           6
                   CP              =          40
    
    4._at        : AC              =           6
                   CP              =          70
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             _at |
              1  |  -.0126381   .1316229    -0.10   0.924    -.2706142    .2453381
              2  |   .0034672   .1128805     0.03   0.975    -.2177744    .2247088
              3  |  -.0421888   .1166925    -0.36   0.718    -.2709019    .1865243
              4  |  -.0267793   .1119971    -0.24   0.811    -.2462896    .1927309
    ------------------------------------------------------------------------------
    
    
     test 3._at=4._at
    
     ( 1)  3._at - 4._at = 0
    
               chi2(  1) =    0.01
             Prob > chi2 =    0.9072
    
    
    . lincom 3._at - 4._at
    
     ( 1)  3._at - 4._at = 0
    
    ------------------------------------------------------------------------------
                 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             (1) |  -.0154094   .1321867    -0.12   0.907    -.2744906    .2436717
    ------------------------------------------------------------------------------
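    Just to check my own arithmetic: since the fixed-portion prediction is linear, the lincom contrast should simply be the difference of the two predicted margins above. A quick hand check (in Python rather than Stata, with the margin values copied from the output):

    ```python
    # Verify that the lincom contrast 3._at - 4._at equals the
    # difference of the two predicted margins shown above.
    margin_3 = -0.0421888  # predicted margin at AC=6, CP=40
    margin_4 = -0.0267793  # predicted margin at AC=6, CP=70

    contrast = margin_3 - margin_4
    print(contrast)  # agrees with the lincom coefficient -.0154094 up to display rounding
    ```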
    
    
    
    
    In case it is relevant:
    
    sum AC CP
    
        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
              AC |     18,887    4.062194    1.136002          1          6
              CP |     23,302    57.39825    15.30319         25         88

    So my questions:

    First, am I correct in interpreting the test and lincom results to say that the probability that the predictive values for 3._at and 4._at are not equal to one another is (1-.907=.093)? In other words, if we consider the null hypothesis to be that 3._at is not equal to 4._at, then the p-value of the test would be .093?

    Second, is there a better way to test the hypothesis that the importance of the CP interactive term declines to zero with an increase in AC?

  • #2
    First, am I correct in interpreting the test and lincom results to say that the probability that the predictive values for 3._at and 4._at are not equal to one another is (1-.907=.093)? In other words, if we consider the null hypothesis to be that 3._at is not equal to 4._at, then the p-value of the test would be .093?
    No, this is a fallacy based on a very serious and profound, but unfortunately extremely widespread, misunderstanding of what a p-value is. It is perhaps the most important reason that many people want to do away with p-values altogether. The p-value is not the probability of the null hypothesis. So if you were to negate the null hypothesis (which isn't usually possible anyway, since a null hypothesis in the standard framework of hypothesis testing must be a point hypothesis, whereas the alternative hypothesis typically is not), the p-value would not transform to 1 - the original p-value.
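    One way to convince yourself of this: simulate tests in which the null hypothesis is true by construction. The p-values come out uniformly distributed on (0, 1), so no individual p-value tells you the probability that the null is true. A quick sketch (in Python, but any language would do; the setup is hypothetical, not from your data):

    ```python
    import math
    import random

    # Simulate many two-sample z-tests in which the null hypothesis
    # (equal population means) is TRUE by construction.
    random.seed(12345)
    n, reps = 50, 2000
    pvals = []
    for _ in range(reps):
        a = [random.gauss(0, 1) for _ in range(n)]
        b = [random.gauss(0, 1) for _ in range(n)]
        diff = sum(a) / n - sum(b) / n
        se = math.sqrt(2 / n)                 # known sd = 1 in both groups
        z = diff / se
        pvals.append(math.erfc(abs(z) / math.sqrt(2)))  # two-sided p-value

    # Under a true null, p-values are uniform: about 5% fall below
    # 0.05, and about 50% fall below 0.50 -- a p of 0.907 carries no
    # information about the probability that the null is true.
    print(sum(p < 0.05 for p in pvals) / reps)
    print(sum(p < 0.50 for p in pvals) / reps)
    ```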

    is there a better way to test the hypothesis that the importance of the CP interactive term declines to zero with an increase in AC?
    Importance is not a statistical concept. It is a value judgment, and there are no statistical tests for the importance of anything. I know I'm being pedantic here, but sloppy use of language leads to sloppy thinking, which leads to bad results. When working with statistics it is important to use clear and correct language, or you will be easily led astray. I imagine that what you mean by "importance" in this context is something like "has a large marginal effect." I think the best way to look at that would be
    Code:
    margins, dydx(CP) at(AC = (1 7))
    Then you will see what the marginal effects of CP actually are at AC = 1 and at AC = 7. And you can then see whether the marginal effect declines as AC changes from 1 to 7, and whether the value at AC = 7 is small enough to consider unimportant.

    If, in addition, you want a statistical test of whether the marginal effects of CP are the same at AC = 1 and AC = 7, you can just add the -pwcompare- option to the code I showed.
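    Because the prediction is linear in CP, these dydx() numbers are just slopes between predicted margins, so you can also recover them by hand from the margins you already posted. For example (illustrated in Python; the margin values are copied from your output in #1):

    ```python
    # Marginal effect of CP at each AC level, recovered as the slope
    # between two predicted margins that differ only in CP.
    # (Exact here because the fixed-portion prediction is linear in CP.)
    m = {  # (AC, CP) -> predicted margin, copied from the output in #1
        (1, 40): -0.0126381, (1, 70): 0.0034672,
        (6, 40): -0.0421888, (6, 70): -0.0267793,
    }

    dydx_cp_at_ac1 = (m[(1, 70)] - m[(1, 40)]) / (70 - 40)  # CP slope, low AC
    dydx_cp_at_ac6 = (m[(6, 70)] - m[(6, 40)]) / (70 - 40)  # CP slope, high AC
    print(dydx_cp_at_ac1)
    print(dydx_cp_at_ac6)
    ```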



    • #3
      I'm so glad I asked. Thanks much for the quick response!



      • #4
      Thanks again for your earlier suggestion. I've been reading the -help- files, Statalist, and general Google searches to make sure I understand what I am getting from the margins, dydx output. I'd really appreciate your input to make sure my interpretation is correct. I am trying to interpret a three-way interaction as follows:

        I am most interested in the variables Type (which has 4 categories), TR, and CP (the latter two treated as continuous). There are several other controls, as indicated below.

        Code:
        sum Tol Type TR CP  MagnitudeEncoded BEC  POR  POD  AC  SEP  SES HE UNC COR COI if InSample==1
        
        
            Variable |        Obs        Mean    Std. Dev.       Min        Max
        -------------+---------------------------------------------------------
                 Tol |     10,301    2.273372    1.152523          1          5
                Type |     10,363    2.106436    1.074237          0          3
                  TR |     10,460    3.034927    1.368629          1          6
                  CP |     10,454    54.77597    14.17571         25         88
        MagnitudeE~d |     10,460    2.498375    .8056695          1          4
        -------------+---------------------------------------------------------
                 BEC |     10,460    4.902693    .9465475          2          6
                 POR |     10,460    4.371096    1.445577          1          7
                 POD |     10,460    3.679924    1.530393          1          7
                  AC |     10,460    4.083381    1.143655          1          6
                 SEP |     10,460    5.439277    1.075069          1          7
        -------------+---------------------------------------------------------
                 SES |     10,460    5.316189    1.200822          1          7
                  HE |     10,460    4.545475     .931144          1          6
                 UNC |     10,460    4.703569     1.02625          1          6
                 COR |     10,460    3.551052    1.207317          1          6
                 COI |     10,460    4.155242    1.201037          1          6
        The model is as follows:

        Code:
        
        . mixed Tol i.Type##c.TR##c.CP i.MagnitudeEncoded BEC POR POD AC SEP SES HE UNC COR COI if InSample==1 || EncodedCountry: || ParticipantIDEncoded:
        
        Performing EM optimization: 
        
        Performing gradient-based optimization: 
        
        Iteration 0:   log likelihood = -13313.144  
        Iteration 1:   log likelihood =  -13313.02  
        Iteration 2:   log likelihood =  -13313.02  
        
        Computing standard errors:
        
        Mixed-effects ML regression                     Number of obs     =     10,198
        
        -------------------------------------------------------------
                        |     No. of       Observations per Group
         Group Variable |     Groups    Minimum    Average    Maximum
        ----------------+--------------------------------------------
           EncodedCou~y |         28         23      364.2      1,365
           Participan~d |        300         20       34.0         53
        -------------------------------------------------------------
        
                                                        Wald chi2(28)     =    4692.98
        Log likelihood =  -13313.02                     Prob > chi2       =     0.0000
        
        ----------------------------------------------------------------------------------
                     Tol |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -----------------+----------------------------------------------------------------
                    Type |
                      1  |  -1.526078   .3561183    -4.29   0.000    -2.224057    -.828099
                      2  |  -1.885474   .3408261    -5.53   0.000    -2.553481   -1.217467
                      3  |  -3.161564   .3001002   -10.54   0.000     -3.74975   -2.573379
                         |
                      TR |  -.4234731   .1084189    -3.91   0.000    -.6359703   -.2109759
                         |
               Type#c.TR |
                      1  |   .2187888   .0975347     2.24   0.025     .0276243    .4099534
                      2  |   .3276292   .0931722     3.52   0.000     .1450151    .5102432
                      3  |    .547909    .081605     6.71   0.000     .3879662    .7078519
                         |
                      CP |  -.0111969   .0066881    -1.67   0.094    -.0243053    .0019115
                         |
               Type#c.CP |
                      1  |   .0028216   .0061508     0.46   0.646    -.0092338    .0148771
                      2  |  -.0005084   .0058805    -0.09   0.931    -.0120339    .0110171
                      3  |    .011457    .005189     2.21   0.027     .0012868    .0216272
                         |
               c.TR#c.CP |   .0050417   .0019139     2.63   0.008     .0012906    .0087928
                         |
          Type#c.TR#c.CP |
                      1  |  -.0015755   .0017684    -0.89   0.373    -.0050416    .0018906
                      2  |  -.0019844   .0016886    -1.18   0.240     -.005294    .0013251
                      3  |  -.0052852   .0014818    -3.57   0.000    -.0081896   -.0023809
                         |
        MagnitudeEncoded |
                   auto  |   .0939683   .0314394     2.99   0.003     .0323482    .1555884
                  lunch  |   .1387946   .0337469     4.11   0.000      .072652    .2049372
                 resort  |   .1065348   .0421312     2.53   0.011     .0239591    .1891104
                         |
                     BEC |    .005184   .0403485     0.13   0.898    -.0738976    .0842656
                     POR |   .0157649   .0327055     0.48   0.630    -.0483368    .0798665
                     POD |  -.0086285   .0274652    -0.31   0.753    -.0624592    .0452022
                      AC |   .0498852   .0377419     1.32   0.186    -.0240876    .1238581
                     SEP |  -.0772753   .0315448    -2.45   0.014     -.139102   -.0154486
                     SES |  -.0426371   .0300888    -1.42   0.156    -.1016101    .0163358
                      HE |   .0202024   .0340978     0.59   0.554    -.0466281    .0870329
                     UNC |  -.1709456   .0352716    -4.85   0.000    -.2400766   -.1018146
                     COR |  -.0088246   .0333322    -0.26   0.791    -.0741545    .0565053
                     COI |   .0679164   .0274547     2.47   0.013     .0141062    .1217266
                   _cons |   5.394517   .4129063    13.06   0.000     4.585235    6.203798
        ----------------------------------------------------------------------------------
        
        ------------------------------------------------------------------------------
          Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
        -----------------------------+------------------------------------------------
        EncodedCou~y: Identity       |
                          var(_cons) |   .0015045   .0044222      4.74e-06    .4779327
        -----------------------------+------------------------------------------------
        Participan~d: Identity       |
                          var(_cons) |   .1771494   .0168071      .1470895    .2133524
        -----------------------------+------------------------------------------------
                       var(Residual) |   .7472566    .010622      .7267252    .7683681
        ------------------------------------------------------------------------------
        LR test vs. linear model: chi2(2) = 1538.45               Prob > chi2 = 0.0000
        
        Note: LR test is conservative and provided only for reference.
        
        .

        Given the significance of Type#c.TR#c.CP (which is related to my hypothesis), I want to make sure I am interpreting the margins output correctly. It reads as follows:


        Code:
        . margins, dydx(Type) at(CP = (20 70) TR = (1 5)) pwcompare(effects) mcompare(bonferroni)
        
        Pairwise comparisons of average marginal effects
        
        Expression   : Linear prediction, fixed portion, predict()
        dy/dx w.r.t. : 1.Type 2.Type 3.Type
        
        1._at        : TR              =           1
                       CP              =          20
        
        2._at        : TR              =           1
                       CP              =          70
        
        3._at        : TR              =           5
                       CP              =          20
        
        4._at        : TR              =           5
                       CP              =          70
        
        ---------------------------
                     |    Number of
                     |  Comparisons
        -------------+-------------
                       (base outcome)
        -------------+-------------
        1.Type       |
                 _at |            6
        -------------+-------------
        2.Type       |
                 _at |            6
        -------------+-------------
        3.Type       |
                 _at |            6
        ---------------------------
        
        ------------------------------------------------------------------------------
                     |   Contrast Delta-method    Bonferroni           Bonferroni
                     |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
        0.Type       |  (base outcome)
        -------------+----------------------------------------------------------------
        1.Type       |
                 _at |
             2 vs 1  |   .0623063   .2286854     0.27   1.000    -.5410246    .6656372
             3 vs 1  |   .7491164   .2556569     2.93   0.020     .0746278    1.423605
             4 vs 1  |   .4963255    .190045     2.61   0.054    -.0050622    .9977131
             3 vs 2  |   .6868102   .1390899     4.94   0.000     .3198551    1.053765
             4 vs 2  |   .4340192   .1528912     2.84   0.027     .0306529    .8373855
             4 vs 3  |   -.252791   .1988126    -1.27   1.000    -.7773097    .2717278
        -------------+----------------------------------------------------------------
        2.Type       |
                 _at |
             2 vs 1  |  -.1246423   .2187871    -0.57   1.000    -.7018589    .4525743
             3 vs 1  |   1.151762   .2443436     4.71   0.000     .5071202    1.796403
             4 vs 1  |   .6302314   .1823294     3.46   0.003     .1491995    1.111263
             3 vs 2  |   1.276404   .1333158     9.57   0.000     .9246823    1.628125
             4 vs 2  |   .7548736   .1462628     5.16   0.000     .3689949    1.140752
             4 vs 3  |  -.5215302     .19001    -2.74   0.036    -1.022825    -.020235
        -------------+----------------------------------------------------------------
        3.Type       |
                 _at |
             2 vs 1  |   .3085871   .1933554     1.60   0.663    -.2015341    .8187083
             3 vs 1  |   1.768816   .2137268     8.28   0.000      1.20495    2.332682
             4 vs 1  |   1.020354   .1606193     6.35   0.000     .5965989    1.444109
             3 vs 2  |   1.460229   .1156437    12.63   0.000     1.155131    1.765327
             4 vs 2  |   .7117668    .128336     5.55   0.000     .3731833     1.05035
             4 vs 3  |  -.7484624   .1657432    -4.52   0.000    -1.185736   -.3111891
        ------------------------------------------------------------------------------
        Note: dy/dx for factor levels is the discrete change from the base level.
        Considering only the last line of the "Type=3" segment of output (bolded) and also the marginsplot graph attached, is the interpretation:
        If we hold TR at the level of 5 ("high TR"), a change from condition Type=0 to Type=3 associates with significantly greater change (in the negative direction) in the dependent variable (p<.000) when CP=70 ("high CP") than when CP=20 ("low CP")?
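        To make sure I follow the arithmetic, I also reconstructed the 4 vs 3 contrast by hand: the discrete change from Type=0 to Type=3 at a given (TR, CP) combines the Type=3 main effect with its two- and three-way interaction terms, and the contrast is that quantity at CP=70 minus the same quantity at CP=20, with TR held at 5. A quick check (in Python, with coefficients copied from the regression table above):

        ```python
        # Discrete change in prediction from Type=0 to Type=3 at given TR, CP,
        # built from the fixed-effect coefficients posted above.
        b_type3       = -3.161564   # 3.Type
        b_type3_tr    =  0.547909   # 3.Type#c.TR
        b_type3_cp    =  0.011457   # 3.Type#c.CP
        b_type3_tr_cp = -0.0052852  # 3.Type#c.TR#c.CP

        def dydx_type3(tr, cp):
            return (b_type3 + b_type3_tr * tr
                    + b_type3_cp * cp + b_type3_tr_cp * tr * cp)

        # "4 vs 3": effect at (TR=5, CP=70) minus effect at (TR=5, CP=20)
        contrast = dydx_type3(5, 70) - dydx_type3(5, 20)
        print(contrast)  # matches the -.7484624 shown by margins, up to rounding
        ```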

        Thanks much for any help you can provide!


        Attached Files



        • #5
          You cannot say p < .000, as p-values can never be negative. The p-value was displayed as 0.000 only because it was rounded to 3 decimal places, so you can say p < 0.0005 if you like.

          In all other respects, your interpretation is correct.

          Of course, if this were my project, I would be emphasizing the actual difference in marginal effects and the confidence interval rather than the p-value. (A large number of my posts on Statalist trumpet my support for the American Statistical Association's recommendation that the concept of statistical significance be abandoned. See https://www.tandfonline.com/doi/full...5.2019.1583913 for the "executive summary" and https://www.tandfonline.com/toc/utas20/73/sup1 for all 43 supporting articles. Or https://www.nature.com/articles/d41586-019-00857-9 for the tl;dr.)
          Last edited by Clyde Schechter; 01 Oct 2021, 13:56.



          • #6
            That's great-- thank you! I will definitely add language about whether the actual differences are theoretically meaningful.
