
  • Using margins, test, and lincom to test hypothesis that two predictive values are equal

    Hi,
    I'm trying to do something that should be easy but I'm not certain I am doing it/interpreting the output the correct way.

    My basic model is a multilevel (mixed) model, and I am interested in the independent variables AC, CP, and their interaction. Specifically, I am predicting that CP will be a significant predictor of DV when AC is low, but that CP will become irrelevant when AC is high.

    Thus, I run the following:

    Code:
    mixed DV X Y AC##CP || Country: || ParticipantID:
    
    (deleted since question is about the next step)
    
     margins, at(AC=(1 6) CP=(40 70)) post
    
    
    
    Predictive margins                              Number of obs     =      7,243
    
    Expression   : Linear prediction, fixed portion, predict()
    
    1._at        : AC              =           1
                   CP              =          40
    
    2._at        : AC              =           1
                   CP              =          70
    
    3._at        : AC              =           6
                   CP              =          40
    
    4._at        : AC              =           6
                   CP              =          70
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             _at |
              1  |  -.0126381   .1316229    -0.10   0.924    -.2706142    .2453381
              2  |   .0034672   .1128805     0.03   0.975    -.2177744    .2247088
              3  |  -.0421888   .1166925    -0.36   0.718    -.2709019    .1865243
              4  |  -.0267793   .1119971    -0.24   0.811    -.2462896    .1927309
    ------------------------------------------------------------------------------
    
    
     test 3._at=4._at
    
     ( 1)  3._at - 4._at = 0
    
               chi2(  1) =    0.01
             Prob > chi2 =    0.9072
    
    
    . lincom 3._at - 4._at
    
     ( 1)  3._at - 4._at = 0
    
    ------------------------------------------------------------------------------
                 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             (1) |  -.0154094   .1321867    -0.12   0.907    -.2744906    .2436717
    ------------------------------------------------------------------------------
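    Just to check my own arithmetic: since the fixed-portion prediction is linear, the lincom contrast should simply be the difference of the two predicted margins above. A quick hand check (in Python rather than Stata, with the margin values copied from the output):

    ```python
    # Verify that the lincom contrast 3._at - 4._at equals the
    # difference of the two predicted margins shown above.
    margin_3 = -0.0421888  # predicted margin at AC=6, CP=40
    margin_4 = -0.0267793  # predicted margin at AC=6, CP=70

    contrast = margin_3 - margin_4
    print(contrast)  # agrees with the lincom coefficient -.0154094 up to display rounding
    ```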
    
    
    
    
    In case it is relevant:
    
    sum AC CP
    
        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
              AC |     18,887    4.062194    1.136002          1          6
              CP |     23,302    57.39825    15.30319         25         88

    So my questions:

    First, am I correct in interpreting the test and lincom results to say that the probability that the predictive values for 3._at and 4._at are not equal to one another is (1-.907=.093)? In other words, if we consider the null hypothesis to be that 3._at is not equal to 4._at, then the p-value of the test would be .093?

    Second, is there a better way to test the hypothesis that the importance of the CP interactive term declines to zero with an increase in AC?

  • #2
    First, am I correct in interpreting the test and lincom results to say that the probability that the predictive values for 3._at and 4._at are not equal to one another is (1-.907=.093)? In other words, if we consider the null hypothesis to be that 3._at is not equal to 4._at, then the p-value of the test would be .093?
    No, this is a fallacy based on a very serious and profound, but unfortunately extremely widespread, misunderstanding of what a p-value is. It is perhaps the most important reason that many people want to do away with p-values altogether. The p-value is not the probability of the null hypothesis. So if you were to negate the null hypothesis (which isn't usually possible anyway, since a null hypothesis in the standard framework of hypothesis testing must be a point hypothesis, whereas the alternative hypothesis typically is not), the p-value would not transform to 1 - the original p-value.
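    One way to convince yourself of this: simulate tests in which the null hypothesis is true by construction. The p-values come out uniformly distributed on (0, 1), so no individual p-value tells you the probability that the null is true. A quick sketch (in Python, but any language would do; the setup is hypothetical, not from your data):

    ```python
    import math
    import random

    # Simulate many two-sample z-tests in which the null hypothesis
    # (equal population means) is TRUE by construction.
    random.seed(12345)
    n, reps = 50, 2000
    pvals = []
    for _ in range(reps):
        a = [random.gauss(0, 1) for _ in range(n)]
        b = [random.gauss(0, 1) for _ in range(n)]
        diff = sum(a) / n - sum(b) / n
        se = math.sqrt(2 / n)                 # known sd = 1 in both groups
        z = diff / se
        pvals.append(math.erfc(abs(z) / math.sqrt(2)))  # two-sided p-value

    # Under a true null, p-values are uniform: about 5% fall below
    # 0.05, and about 50% fall below 0.50 -- a p of 0.907 carries no
    # information about the probability that the null is true.
    print(sum(p < 0.05 for p in pvals) / reps)
    print(sum(p < 0.50 for p in pvals) / reps)
    ```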

    is there a better way to test the hypothesis that the importance of the CP interactive term declines to zero with an increase in AC?
    Importance is not a statistical concept. It is a value judgment, and there are no statistical tests for the importance of anything. I know I'm being pedantic here, but sloppy use of language leads to sloppy thinking, which leads to bad results. When working with statistics it is important to use clear and correct language, or you will be easily led astray. I imagine that what you mean by "importance" in this context is something like "has a large marginal effect." I think the best way to look at that would be
    Code:
    margins, dydx(CP) at(AC = (1 7))
    Then you will see what the marginal effects of CP actually are at AC = 1 and at AC = 7. And you can then see whether the marginal effect declines as AC changes from 1 to 7, and whether the value at AC = 7 is small enough to consider unimportant.

    If, in addition, you want a statistical test of whether the marginal effects of CP are the same at AC = 1 and AC = 7, you can just add the -pwcompare- option to the code I showed.
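    Because the prediction is linear in CP, these dydx() numbers are just slopes between predicted margins, so you can also recover them by hand from the margins you already posted. For example (illustrated in Python; the margin values are copied from your output in #1):

    ```python
    # Marginal effect of CP at each AC level, recovered as the slope
    # between two predicted margins that differ only in CP.
    # (Exact here because the fixed-portion prediction is linear in CP.)
    m = {  # (AC, CP) -> predicted margin, copied from the output in #1
        (1, 40): -0.0126381, (1, 70): 0.0034672,
        (6, 40): -0.0421888, (6, 70): -0.0267793,
    }

    dydx_cp_at_ac1 = (m[(1, 70)] - m[(1, 40)]) / (70 - 40)  # CP slope, low AC
    dydx_cp_at_ac6 = (m[(6, 70)] - m[(6, 40)]) / (70 - 40)  # CP slope, high AC
    print(dydx_cp_at_ac1)
    print(dydx_cp_at_ac6)
    ```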



    • #3
      I'm so glad I asked. Thanks much for the quick response!



      • #4
      Thanks again for your earlier suggestion. I've been reading the -help- files, Statalist, and general Google searches to make sure I understand what I am getting from the margins, dydx output. I'd really appreciate your input to make sure my interpretation is correct. I am trying to interpret a three-way interaction as follows:

        I am most interested in the variables Type (which has 4 categories), TR, and CP (the latter two treated as continuous). There are several other controls, as indicated below.

        Code:
        sum Tol Type TR CP  MagnitudeEncoded BEC  POR  POD  AC  SEP  SES HE UNC COR COI if InSample==1
        
        
            Variable |        Obs        Mean    Std. Dev.       Min        Max
        -------------+---------------------------------------------------------
                 Tol |     10,301    2.273372    1.152523          1          5
                Type |     10,363    2.106436    1.074237          0          3
                  TR |     10,460    3.034927    1.368629          1          6
                  CP |     10,454    54.77597    14.17571         25         88
        MagnitudeE~d |     10,460    2.498375    .8056695          1          4
        -------------+---------------------------------------------------------
                 BEC |     10,460    4.902693    .9465475          2          6
                 POR |     10,460    4.371096    1.445577          1          7
                 POD |     10,460    3.679924    1.530393          1          7
                  AC |     10,460    4.083381    1.143655          1          6
                 SEP |     10,460    5.439277    1.075069          1          7
        -------------+---------------------------------------------------------
                 SES |     10,460    5.316189    1.200822          1          7
                  HE |     10,460    4.545475     .931144          1          6
                 UNC |     10,460    4.703569     1.02625          1          6
                 COR |     10,460    3.551052    1.207317          1          6
                 COI |     10,460    4.155242    1.201037          1          6
        The model is as follows:

        Code:
        
        . mixed Tol i.Type##c.TR##c.CP i.MagnitudeEncoded BEC POR POD AC SEP SES HE UNC COR COI if InSample==1 || EncodedCountry: || ParticipantIDEncoded:
        
        Performing EM optimization: 
        
        Performing gradient-based optimization: 
        
        Iteration 0:   log likelihood = -13313.144  
        Iteration 1:   log likelihood =  -13313.02  
        Iteration 2:   log likelihood =  -13313.02  
        
        Computing standard errors:
        
        Mixed-effects ML regression                     Number of obs     =     10,198
        
        -------------------------------------------------------------
                        |     No. of       Observations per Group
         Group Variable |     Groups    Minimum    Average    Maximum
        ----------------+--------------------------------------------
           EncodedCou~y |         28         23      364.2      1,365
           Participan~d |        300         20       34.0         53
        -------------------------------------------------------------
        
                                                        Wald chi2(28)     =    4692.98
        Log likelihood =  -13313.02                     Prob > chi2       =     0.0000
        
        ----------------------------------------------------------------------------------
                     Tol |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -----------------+----------------------------------------------------------------
                    Type |
                      1  |  -1.526078   .3561183    -4.29   0.000    -2.224057    -.828099
                      2  |  -1.885474   .3408261    -5.53   0.000    -2.553481   -1.217467
                      3  |  -3.161564   .3001002   -10.54   0.000     -3.74975   -2.573379
                         |
                      TR |  -.4234731   .1084189    -3.91   0.000    -.6359703   -.2109759
                         |
               Type#c.TR |
                      1  |   .2187888   .0975347     2.24   0.025     .0276243    .4099534
                      2  |   .3276292   .0931722     3.52   0.000     .1450151    .5102432
                      3  |    .547909    .081605     6.71   0.000     .3879662    .7078519
                         |
                      CP |  -.0111969   .0066881    -1.67   0.094    -.0243053    .0019115
                         |
               Type#c.CP |
                      1  |   .0028216   .0061508     0.46   0.646    -.0092338    .0148771
                      2  |  -.0005084   .0058805    -0.09   0.931    -.0120339    .0110171
                      3  |    .011457    .005189     2.21   0.027     .0012868    .0216272
                         |
               c.TR#c.CP |   .0050417   .0019139     2.63   0.008     .0012906    .0087928
                         |
          Type#c.TR#c.CP |
                      1  |  -.0015755   .0017684    -0.89   0.373    -.0050416    .0018906
                      2  |  -.0019844   .0016886    -1.18   0.240     -.005294    .0013251
                      3  |  -.0052852   .0014818    -3.57   0.000    -.0081896   -.0023809
                         |
        MagnitudeEncoded |
                   auto  |   .0939683   .0314394     2.99   0.003     .0323482    .1555884
                  lunch  |   .1387946   .0337469     4.11   0.000      .072652    .2049372
                 resort  |   .1065348   .0421312     2.53   0.011     .0239591    .1891104
                         |
                     BEC |    .005184   .0403485     0.13   0.898    -.0738976    .0842656
                     POR |   .0157649   .0327055     0.48   0.630    -.0483368    .0798665
                     POD |  -.0086285   .0274652    -0.31   0.753    -.0624592    .0452022
                      AC |   .0498852   .0377419     1.32   0.186    -.0240876    .1238581
                     SEP |  -.0772753   .0315448    -2.45   0.014     -.139102   -.0154486
                     SES |  -.0426371   .0300888    -1.42   0.156    -.1016101    .0163358
                      HE |   .0202024   .0340978     0.59   0.554    -.0466281    .0870329
                     UNC |  -.1709456   .0352716    -4.85   0.000    -.2400766   -.1018146
                     COR |  -.0088246   .0333322    -0.26   0.791    -.0741545    .0565053
                     COI |   .0679164   .0274547     2.47   0.013     .0141062    .1217266
                   _cons |   5.394517   .4129063    13.06   0.000     4.585235    6.203798
        ----------------------------------------------------------------------------------
        
        ------------------------------------------------------------------------------
          Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
        -----------------------------+------------------------------------------------
        EncodedCou~y: Identity       |
                          var(_cons) |   .0015045   .0044222      4.74e-06    .4779327
        -----------------------------+------------------------------------------------
        Participan~d: Identity       |
                          var(_cons) |   .1771494   .0168071      .1470895    .2133524
        -----------------------------+------------------------------------------------
                       var(Residual) |   .7472566    .010622      .7267252    .7683681
        ------------------------------------------------------------------------------
        LR test vs. linear model: chi2(2) = 1538.45               Prob > chi2 = 0.0000
        
        Note: LR test is conservative and provided only for reference.
        
        .

        Given the significance of Type#c.TR#c.CP (which is related to my hypothesis), I want to make sure I am interpreting the margins output correctly. It reads as follows:


        Code:
        . margins, dydx(Type) at(CP = (20 70) TR = (1 5)) pwcompare(effects) mcompare(bonferroni)
        
        Pairwise comparisons of average marginal effects
        
        Expression   : Linear prediction, fixed portion, predict()
        dy/dx w.r.t. : 1.Type 2.Type 3.Type
        
        1._at        : TR              =           1
                       CP              =          20
        
        2._at        : TR              =           1
                       CP              =          70
        
        3._at        : TR              =           5
                       CP              =          20
        
        4._at        : TR              =           5
                       CP              =          70
        
        ---------------------------
                     |    Number of
                     |  Comparisons
        -------------+-------------
                       (base outcome)
        -------------+-------------
        1.Type       |
                 _at |            6
        -------------+-------------
        2.Type       |
                 _at |            6
        -------------+-------------
        3.Type       |
                 _at |            6
        ---------------------------
        
        ------------------------------------------------------------------------------
                     |   Contrast Delta-method    Bonferroni           Bonferroni
                     |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
        0.Type       |  (base outcome)
        -------------+----------------------------------------------------------------
        1.Type       |
                 _at |
             2 vs 1  |   .0623063   .2286854     0.27   1.000    -.5410246    .6656372
             3 vs 1  |   .7491164   .2556569     2.93   0.020     .0746278    1.423605
             4 vs 1  |   .4963255    .190045     2.61   0.054    -.0050622    .9977131
             3 vs 2  |   .6868102   .1390899     4.94   0.000     .3198551    1.053765
             4 vs 2  |   .4340192   .1528912     2.84   0.027     .0306529    .8373855
             4 vs 3  |   -.252791   .1988126    -1.27   1.000    -.7773097    .2717278
        -------------+----------------------------------------------------------------
        2.Type       |
                 _at |
             2 vs 1  |  -.1246423   .2187871    -0.57   1.000    -.7018589    .4525743
             3 vs 1  |   1.151762   .2443436     4.71   0.000     .5071202    1.796403
             4 vs 1  |   .6302314   .1823294     3.46   0.003     .1491995    1.111263
             3 vs 2  |   1.276404   .1333158     9.57   0.000     .9246823    1.628125
             4 vs 2  |   .7548736   .1462628     5.16   0.000     .3689949    1.140752
             4 vs 3  |  -.5215302     .19001    -2.74   0.036    -1.022825    -.020235
        -------------+----------------------------------------------------------------
        3.Type       |
                 _at |
             2 vs 1  |   .3085871   .1933554     1.60   0.663    -.2015341    .8187083
             3 vs 1  |   1.768816   .2137268     8.28   0.000      1.20495    2.332682
             4 vs 1  |   1.020354   .1606193     6.35   0.000     .5965989    1.444109
             3 vs 2  |   1.460229   .1156437    12.63   0.000     1.155131    1.765327
             4 vs 2  |   .7117668    .128336     5.55   0.000     .3731833     1.05035
             4 vs 3  |  -.7484624   .1657432    -4.52   0.000    -1.185736   -.3111891
        ------------------------------------------------------------------------------
        Note: dy/dx for factor levels is the discrete change from the base level.
        Considering only the last line of the "Type=3" segment of output (bolded) and also the marginsplot graph attached, is the interpretation:
        If we hold TR at the level of 5 ("high TR"), a change from condition Type=0 to Type=3 associates with significantly greater change (in the negative direction) in the dependent variable (p<.000) when CP=70 ("high CP") than when CP=20 ("low CP")?
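        To make sure I follow the arithmetic, I also reconstructed the 4 vs 3 contrast by hand: the discrete change from Type=0 to Type=3 at a given (TR, CP) combines the Type=3 main effect with its two- and three-way interaction terms, and the contrast is that quantity at CP=70 minus the same quantity at CP=20, with TR held at 5. A quick check (in Python, with coefficients copied from the regression table above):

        ```python
        # Discrete change in prediction from Type=0 to Type=3 at given TR, CP,
        # built from the fixed-effect coefficients posted above.
        b_type3       = -3.161564   # 3.Type
        b_type3_tr    =  0.547909   # 3.Type#c.TR
        b_type3_cp    =  0.011457   # 3.Type#c.CP
        b_type3_tr_cp = -0.0052852  # 3.Type#c.TR#c.CP

        def dydx_type3(tr, cp):
            return (b_type3 + b_type3_tr * tr
                    + b_type3_cp * cp + b_type3_tr_cp * tr * cp)

        # "4 vs 3": effect at (TR=5, CP=70) minus effect at (TR=5, CP=20)
        contrast = dydx_type3(5, 70) - dydx_type3(5, 20)
        print(contrast)  # matches the -.7484624 shown by margins, up to rounding
        ```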

        Thanks much for any help you can provide!


        Attached Files



        • #5
          You cannot say p < .000, as p-values can never be negative. The p-value was displayed as 0.000 only because it was rounded to 3 decimal places, so you can say p < 0.0005 if you like.

          In all other respects, your interpretation is correct.

          Of course, if this were my project, I would be emphasizing the actual difference in marginal effects and the confidence interval rather than the p-value. (A large number of my posts on Statalist trumpet my support for the American Statistical Association's recommendation that the concept of statistical significance be abandoned. See https://www.tandfonline.com/doi/full...5.2019.1583913 for the "executive summary" and https://www.tandfonline.com/toc/utas20/73/sup1 for all 43 supporting articles. Or https://www.nature.com/articles/d41586-019-00857-9 for the tl;dr.)
          Last edited by Clyde Schechter; 01 Oct 2021, 13:56.



          • #6
            That's great-- thank you! I will definitely add language about whether the actual differences are theoretically meaningful.
