
  • Discrepancy between predicted probabilities and average marginal effects in negative binomial regression

    Hello,

    This is my first time posting here, so apologies for any potential mistakes.

    Short background: I examine how characteristics of citations made by a patent (i.e. backward citations) influence the number of citations that this patent subsequently receives (i.e. forward citations). This approach is similar to examining how the references cited by a journal article influence the number of times this article is subsequently cited.

    Analyses: I am testing the potential U-shaped impact of an independent variable (i.e. the ‘time gap’ of backward cites) on a dependent variable (i.e. the number of ‘forward citations’) by running a negative binomial regression model. The dependent variable is a count variable (named fw4) with the following distribution:

    Code:
     sum fw4, detail
     
            Citations received
    -------------------------------------------------------------
          Percentiles      Smallest
     1%            0              0
     5%            0              0
    10%            0              0       Obs              21,117
    25%            0              0       Sum of Wgt.      21,117
     
    50%            1                      Mean            2.48165
                            Largest       Std. Dev.       4.22759
    75%            3             52
    90%            7             70       Variance       17.87252
    95%           10             81       Skewness       4.294934
    99%           20             85       Kurtosis       37.63707
    The independent variable looks like this:
    Code:
     sum timgap, detail
     
                           Gap
    -------------------------------------------------------------
          Percentiles      Smallest
     1%            1              0
     5%            1              0
    10%            1              0       Obs              21,117
    25%            1              0       Sum of Wgt.      21,117
     
    50%            1                      Mean           1.724582
                            Largest       Std. Dev.      1.765048
    75%            2             28
    90%            3             38       Variance       3.115394
    95%            4           45.5       Skewness       7.354439
    99%          9.5             58       Kurtosis       113.7121

    A sample of the data is as follows:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double(fw4 count) float(reusebeginofyear medianage timgap) byte teamsize float(patentgrant uniqueoffice shareint recbreadth pat_finalid) int pat_priy long pat_doc
    1  8     7.875    5 1.5 1 1 1         0   .768595  28 1967 24645379
    0  8     7.375    3   1 3 1 3         0  .9469435   4 2007 39709822
    0 14 12.285714    4   1 2 1 3  .4285714         0  18 2003 34701225
    1 10       9.9  3.5   1 4 1 5        .2 .43055555   1 2006 38535557
    0 10       6.5    9   4 2 0 3        .2  .9921035  39 2007 39710045
    3  8     1.375    2   1 3 1 2         0   .607438  17 2004 35060906
    0 10       2.8    5   1 2 1 1        .2  .2520661   5 2004 35183594
    3  9 20.333334   12   1 5 0 1 .11111111  .9238535 121 2004 36615448
    1  8       .25  8.5 8.5 3 0 1         0   .838843 112 2000 18843534
    3 13  4.923077    2   2 4 1 5  .3846154 .52076125   2 2002 32510676
    3  8      .375 12.5  11 2 1 1         0  .9746667  23 1999 17011054
    1  8      7.25    5 1.5 4 0 4         0  .9153979   1 2007 40381446
    2  8       3.5    2   1 3 1 2         0  .9693205 116 2003 34102979
    0  8     11.25  8.5   1 2 0 1         0   .934375  10 1995  7771531
    1  8      .625    5 1.5 1 1 1         0      .505 118 2000 18685685
    3  9  40.22222    9   1 3 1 3 .11111111  .9566575  17 2003 34394218
    0  8    19.625    4   1 1 1 4         0   .607438   1 2002 32473712
    3  9 12.444445   13   1 5 1 4 .11111111  .9823909  21 1997 17434323
    1  8      4.75    7 1.5 1 0 1         0       .46   1 2004 35183815
    6  8     6.625    6   1 3 1 2         0         0  10 2000  7628904
    2  8        14 14.5   3 4 1 5         0  .9738292  81 2003 33427484
    end
    Important to note here is that:

    pat_finalid denotes the firm that applied for the patent
    pat_doc denotes the unique patent identification number
    pat_priy denotes the year of application of the patent


    I excluded the coefficients of the dummies for pat_finalid and pat_priy to save space.

    When I run the negative binomial regression I obtain the following:

    Code:
     nbreg fw4 count shareint recb teamsize uniqueo patentgrant reuseb medianage timgap c.timgap#c.timgap i.pat_p i.pat_f, robust
     
    Fitting Poisson model:
     
    Iteration 0:   log pseudolikelihood = -47246.278 
    Iteration 1:   log pseudolikelihood = -46222.733 
    Iteration 2:   log pseudolikelihood = -46205.947 
    Iteration 3:   log pseudolikelihood = -46205.874 
    Iteration 4:   log pseudolikelihood = -46205.874 
     
    Fitting constant-only model:
     
    Iteration 0:   log pseudolikelihood = -44087.015 
    Iteration 1:   log pseudolikelihood = -43010.087 
    Iteration 2:   log pseudolikelihood = -43002.914 
    Iteration 3:   log pseudolikelihood = -43002.914 
     
    Fitting full model:
     
    Iteration 0:   log pseudolikelihood = -39978.186 
    Iteration 1:   log pseudolikelihood = -38095.319 
    Iteration 2:   log pseudolikelihood = -37879.505 
    Iteration 3:   log pseudolikelihood = -37875.829 
    Iteration 4:   log pseudolikelihood = -37875.829 
     
    Negative binomial regression                    Number of obs     =     21,117
                                                    Wald chi2(196)    =   11631.79
    Dispersion           = mean                     Prob > chi2       =     0.0000
    Log pseudolikelihood = -37875.829               Pseudo R2         =     0.1192
     
    -----------------------------------------------------------------------------------
                      |               Robust
                  fw4 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ------------------+----------------------------------------------------------------
                count |   .0156753   .0010517    14.90   0.000      .013614    .0177367
             shareint |  -.3507485   .0451294    -7.77   0.000    -.4392006   -.2622965
           recbreadth |   .1487739   .0277731     5.36   0.000     .0943396    .2032082
             teamsize |   .0261769   .0046877     5.58   0.000     .0169891    .0353647
         uniqueoffice |    .109674    .004895    22.41   0.000       .10008     .119268
          patentgrant |   .2145301   .0243185     8.82   0.000     .1668666    .2621935
     reusebeginofyear |   .0082285   .0015958     5.16   0.000     .0051009    .0113562
            medianage |  -.0090597   .0027831    -3.26   0.001    -.0145146   -.0036049
               timgap |  -.0974457   .0127799    -7.62   0.000    -.1224938   -.0723975
                      |
    c.timgap#c.timgap |   .0028285    .000721     3.92   0.000     .0014152    .0042417
                      |
                _cons |   .8661081   .6416052     1.35   0.177    -.3914151    2.123631
    ------------------+----------------------------------------------------------------
             /lnalpha |  -.2696179   .0207389                     -.3102654   -.2289704
    ------------------+----------------------------------------------------------------
                alpha |   .7636713   .0158377                      .7332523    .7953521
    -----------------------------------------------------------------------------------
    Subsequently, the predictive margins at different values of ‘time gap’ look like this:
    Code:
     margins, at(timgap=(0(1)58))
     
    Predictive margins                              Number of obs     =     21,117
    Model VCE    : Robust
     
    Expression   : Predicted number of events, predict()
     
    1._at        : timgap          =           0
     
    2._at        : timgap          =           1
     
    3._at        : timgap          =           2
     
    4._at        : timgap          =           3
     
    5._at        : timgap          =           4
     
    6._at        : timgap          =           5
     
    7._at        : timgap          =           6
     
    8._at        : timgap          =           7
     
    9._at        : timgap          =           8
     
    10._at       : timgap          =           9
     
    11._at       : timgap          =          10
     
    12._at       : timgap          =          11
     
    13._at       : timgap          =          12
     
    14._at       : timgap          =          13
     
    15._at       : timgap          =          14
     
    16._at       : timgap          =          15
     
    17._at       : timgap          =          16
     
    18._at       : timgap          =          17
     
    19._at       : timgap          =          18
     
    20._at       : timgap          =          19
     
    21._at       : timgap          =          20
     
    22._at       : timgap          =          21
     
    23._at       : timgap          =          22
     
    24._at       : timgap          =          23
     
    25._at       : timgap          =          24
     
    26._at       : timgap          =          25
     
    27._at       : timgap          =          26
     
    28._at       : timgap          =          27
     
    29._at       : timgap          =          28
     
    30._at       : timgap          =          29
     
    31._at       : timgap          =          30
     
    32._at       : timgap          =          31
     
    33._at       : timgap          =          32
     
    34._at       : timgap          =          33
     
    35._at       : timgap          =          34
     
    36._at       : timgap          =          35
     
    37._at       : timgap          =          36
     
    38._at       : timgap          =          37
     
    39._at       : timgap          =          38
     
    40._at       : timgap          =          39
     
    41._at       : timgap          =          40
     
    42._at       : timgap          =          41
     
    43._at       : timgap          =          42
     
    44._at       : timgap          =          43
     
    45._at       : timgap          =          44
     
    46._at       : timgap          =          45
     
    47._at       : timgap          =          46
     
    48._at       : timgap          =          47
     
    49._at       : timgap          =          48
     
    50._at       : timgap          =          49
     
    51._at       : timgap          =          50
     
    52._at       : timgap          =          51
     
    53._at       : timgap          =          52
     
    54._at       : timgap          =          53
     
    55._at       : timgap          =          54
     
    56._at       : timgap          =          55
     
    57._at       : timgap          =          56
     
    58._at       : timgap          =          57
     
    59._at       : timgap          =          58
     
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             _at |
              1  |   2.907145   .0561051    51.82   0.000     2.797181    3.017109
              2  |   2.644691   .0295555    89.48   0.000     2.586764    2.702619
              3  |    2.41958   .0303876    79.62   0.000     2.360022    2.479139
              4  |   2.226188   .0445835    49.93   0.000     2.138806     2.31357
              5  |   2.059873   .0581709    35.41   0.000      1.94586    2.173886
              6  |   1.916795   .0691658    27.71   0.000     1.781233    2.052358
              7  |   1.793775   .0778237    23.05   0.000     1.641243    1.946306
              8  |   1.688172   .0847155    19.93   0.000     1.522133    1.854212
              9  |     1.5978   .0904177    17.67   0.000     1.420585    1.775016
             10  |   1.520845   .0954568    15.93   0.000     1.333753    1.707937
             11  |   1.455808   .1003065    14.51   0.000     1.259211    1.652405
             12  |   1.401458   .1053954    13.30   0.000     1.194887    1.608029
             13  |   1.356791   .1111151    12.21   0.000     1.139009    1.574573
             14  |   1.320999   .1178276    11.21   0.000     1.090061    1.551937
             15  |   1.293447   .1258707    10.28   0.000     1.046745     1.54015
             16  |   1.273655   .1355661     9.40   0.000     1.007951     1.53936
             17  |   1.261281   .1472277     8.57   0.000     .9727198    1.549842
             18  |   1.256112   .1611748     7.79   0.000     .9402152    1.572009
             19  |   1.258061   .1777473     7.08   0.000     .9096829    1.606439
             20  |   1.267161    .197323     6.42   0.000     .8804155    1.653907
             21  |   1.283568   .2203371     5.83   0.000     .8517152    1.715421
             22  |   1.307563   .2473036     5.29   0.000     .8228568    1.792269
             23  |   1.339563   .2788378     4.80   0.000     .7930509    1.886075
             24  |   1.380131   .3156826     4.37   0.000     .7614046    1.998858
             25  |   1.429995    .358739     3.99   0.000     .7268791     2.13311
             26  |   1.490065   .4091018     3.64   0.000     .6882402     2.29189
             27  |   1.561467   .4681041     3.34   0.001     .6439999    2.478934
             28  |   1.645573   .5373717     3.06   0.002      .592344    2.698802
             29  |   1.744047    .618891     2.82   0.005     .5310434    2.957052
             30  |   1.858901   .7150945     2.60   0.009     .4573412     3.26046
             31  |   1.992558    .828968     2.40   0.016     .3678102    3.617305
             32  |   2.147941   .9641859     2.23   0.026     .2581713    4.037711
             33  |   2.328577   1.125284     2.07   0.039     .1230608    4.534093
             34  |   2.538724   1.317878     1.93   0.054    -.0442682    5.121717
             35  |   2.783539   1.548941     1.80   0.072    -.2523303    5.819408
             36  |   3.069275   1.827167     1.68   0.093    -.5119064    6.650457
             37  |   3.403542   2.163423     1.57   0.116    -.8366882    7.643773
             38  |   3.795625   2.571345     1.48   0.140    -1.244119    8.835368
             39  |   4.256887   3.068105     1.39   0.165    -1.756489    10.27026
             40  |   4.801288   3.675405     1.31   0.191    -2.402373    12.00495
             41  |   5.446031   4.420767     1.23   0.218    -3.218514    14.11058
             42  |   6.212399   5.339235     1.16   0.245     -4.25231    16.67711
             43  |   7.126811   6.475586     1.10   0.271    -5.565105    19.81873
             44  |   8.222199   7.887259     1.04   0.297    -7.236545    23.68094
             45  |    9.53976   9.648215     0.99   0.323    -9.370393    28.44991
             46  |   11.13124   11.85406     0.94   0.348    -12.10229    34.36478
             47  |   13.06191   14.62887     0.89   0.372    -15.61014    41.73396
             48  |    15.4144   18.13428     0.85   0.395    -20.12814    50.95694
             49  |   18.29377   22.58176     0.81   0.418    -25.96567    62.55321
             50  |   21.83416   28.24902     0.77   0.440    -33.53289    77.20122
             51  |   26.20756   35.50228     0.74   0.460    -43.37562    95.79075
             52  |   31.63542   44.82649     0.71   0.480    -56.22288    119.4937
             53  |   38.40407   56.86643     0.68   0.499    -73.05209    149.8602
             54  |    46.8854   72.48303     0.65   0.518    -95.17872    188.9495
             55  |   57.56452   92.83061     0.62   0.535    -124.3801    239.5092
             56  |   71.07697   119.4635     0.59   0.552    -163.0672    305.2211
             57  |   88.25913   154.4835     0.57   0.568     -214.523    391.0413
             58  |   110.2166   200.7452     0.55   0.583    -283.2368    503.6701
             59  |   138.4177   262.1421     0.53   0.597    -375.3715    652.2068
    ------------------------------------------------------------------------------
    I subsequently rely on utest from SSC in Stata 14.2. This allows testing for the statistical significance of the inflection point of a curvilinear relationship, as well as estimating the statistical significance and sign of the slope on both sides of the inflection point.

    The utest command returns the following:

    Code:
     generate timgapsq = timgap*timgap
    quietly nbreg fw4 count shareint recb teamsize uniqueo patentgrant reuseb medianage timgap timgapsq i.pat_p i.pat_f, robust
    . utest timgap timgapsq, prefix(fw4) fieller
    (87 missing values generated)
     
    Specification: f(x)=x^2
    Extreme point:   17.2259
     
    Test:
         H1: U shape
     vs. H0: Monotone or Inverse U shape
     
    -------------------------------------------------
                     |   Lower bound      Upper bound
    -----------------+-------------------------------
    Interval         |           0               58
    Slope            |   -.0974457         .2306562
    t-value          |    -7.62492         3.125554
    P>|t|            |    1.27e-14         .0008886
    -------------------------------------------------
     
    Overall test of presence of a U shape:
         t-value =      3.13
         P>|t|   =   .000889
     
    95% Fieller interval for extreme point: [13.191711; 27.997666]
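    As a quick sanity check on the -utest- output, the endpoint slopes and the extreme point it reports can be reproduced directly from the two timgap coefficients in the -nbreg- output above. A minimal sketch (in Python rather than Stata; the t-values and the Fieller interval additionally require the coefficient covariance matrix, which is not shown here):

```python
# Slope of the quadratic b1*x + b2*x^2 at the data bounds, and the point
# where that slope changes sign, using the nbreg coefficients reported above.
b1 = -0.0974457   # coefficient on timgap
b2 = 0.0028285    # coefficient on c.timgap#c.timgap

def slope(x):
    """Derivative of b1*x + b2*x^2 with respect to x."""
    return b1 + 2 * b2 * x

extreme_point = -b1 / (2 * b2)   # where the slope crosses zero

print(slope(0))        # slope at the lower bound (timgap = 0): -0.0974457
print(slope(58))       # slope at the upper bound (timgap = 58): ~0.2307
print(extreme_point)   # ~17.226, matching utest's "Extreme point: 17.2259"
```

    Up to rounding of the reported coefficients, these match the -utest- table: slopes of -.0974457 and .2306562 at the interval bounds, and an extreme point of 17.2259.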
    However, I am a bit confused about why the average marginal effects tell a different story, showing that the effects beyond a timgap value of 13 are not statistically significant (p > 0.05):
    Code:
     quietly nbreg fw4 count shareint recb teamsize uniqueo patentgrant reuseb medianage timgap c.timgap#c.timgap i.pat_p i.pat_f, robust
    margins, dydx(timgap) at(timgap=(0(1)58))
     
    Average marginal effects                        Number of obs     =     21,117
    Model VCE    : Robust
     
    Expression   : Predicted number of events, predict()
    dy/dx w.r.t. : timgap
     
    1._at        : timgap          =           0
     
    2._at        : timgap          =           1
     
    3._at        : timgap          =           2
     
    4._at        : timgap          =           3
     
    5._at        : timgap          =           4
     
    6._at        : timgap          =           5
     
    7._at        : timgap          =           6
     
    8._at        : timgap          =           7
     
    9._at        : timgap          =           8
     
    10._at       : timgap          =           9
     
    11._at       : timgap          =          10
     
    12._at       : timgap          =          11
     
    13._at       : timgap          =          12
     
    14._at       : timgap          =          13
     
    15._at       : timgap          =          14
     
    16._at       : timgap          =          15
     
    17._at       : timgap          =          16
     
    18._at       : timgap          =          17
     
    19._at       : timgap          =          18
     
    20._at       : timgap          =          19
     
    21._at       : timgap          =          20
     
    22._at       : timgap          =          21
     
    23._at       : timgap          =          22
     
    24._at       : timgap          =          23
     
    25._at       : timgap          =          24
     
    26._at       : timgap          =          25
     
    27._at       : timgap          =          26
     
    28._at       : timgap          =          27
     
    29._at       : timgap          =          28
     
    30._at       : timgap          =          29
     
    31._at       : timgap          =          30
     
    32._at       : timgap          =          31
     
    33._at       : timgap          =          32
     
    34._at       : timgap          =          33
     
    35._at       : timgap          =          34
     
    36._at       : timgap          =          35
     
    37._at       : timgap          =          36
     
    38._at       : timgap          =          37
     
    39._at       : timgap          =          38
     
    40._at       : timgap          =          39
     
    41._at       : timgap          =          40
     
    42._at       : timgap          =          41
     
    43._at       : timgap          =          42
     
    44._at       : timgap          =          43
     
    45._at       : timgap          =          44
     
    46._at       : timgap          =          45
     
    47._at       : timgap          =          46
     
    48._at       : timgap          =          47
     
    49._at       : timgap          =          48
     
    50._at       : timgap          =          49
     
    51._at       : timgap          =          50
     
    52._at       : timgap          =          51
     
    53._at       : timgap          =          52
     
    54._at       : timgap          =          53
     
    55._at       : timgap          =          54
     
    56._at       : timgap          =          55
     
    57._at       : timgap          =          56
     
    58._at       : timgap          =          57
     
    59._at       : timgap          =          58
     
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    timgap       |
             _at |
              1  |  -.2832887    .041849    -6.77   0.000    -.3653113   -.2012661
              2  |  -.2427529    .031913    -7.61   0.000    -.3053013   -.1802045
              3  |  -.2084028   .0243079    -8.57   0.000    -.2560454   -.1607603
              4  |  -.1791522   .0185715    -9.65   0.000    -.2155518   -.1427527
              5  |  -.1541155    .014378   -10.72   0.000    -.1822958   -.1259352
              6  |  -.1325676   .0115064   -11.52   0.000    -.1551197   -.1100154
              7  |  -.1139121   .0097904   -11.64   0.000    -.1331008   -.0947233
              8  |   -.097656   .0090425   -10.80   0.000     -.115379    -.079933
              9  |  -.0833896   .0090162    -9.25   0.000     -.101061   -.0657182
             10  |    -.07077   .0094602    -7.48   0.000    -.0893116   -.0522283
             11  |  -.0595082   .0101912    -5.84   0.000    -.0794825   -.0395339
             12  |  -.0493586   .0111085    -4.44   0.000     -.071131   -.0275862
             13  |  -.0401102   .0121726    -3.30   0.001     -.063968   -.0162524
             14  |  -.0315793   .0133801    -2.36   0.018    -.0578037   -.0053548
             15  |  -.0236037   .0147491    -1.60   0.110    -.0525114     .005304
             16  |  -.0160375   .0163108    -0.98   0.325    -.0480062    .0159311
             17  |  -.0087468   .0181059    -0.48   0.629    -.0442336    .0267401
             18  |  -.0016052   .0201834    -0.08   0.937    -.0411639    .0379535
             19  |   .0055091   .0226016     0.24   0.807    -.0387893    .0498075
             20  |   .0127172   .0254295     0.50   0.617    -.0371237    .0625581
             21  |   .0201429   .0287488     0.70   0.484    -.0362037    .0764895
             22  |   .0279162   .0326574     0.85   0.393    -.0360912    .0919236
             23  |   .0361772   .0372734     0.97   0.332    -.0368773    .1092317
             24  |   .0450802   .0427395     1.05   0.292    -.0386876    .1288479
             25  |   .0547983   .0492295     1.11   0.266    -.0416899    .1512864
             26  |   .0655294   .0569563     1.15   0.250    -.0461029    .1771617
             27  |   .0775026   .0661807     1.17   0.242    -.0522093    .2072144
             28  |    .090986   .0772244     1.18   0.239    -.0603711    .2423431
             29  |   .1062968   .0904849     1.17   0.240    -.0710504    .2836439
             30  |   .1238125   .1064554     1.16   0.245    -.0848362    .3324613
             31  |   .1439865   .1257496     1.15   0.252    -.1024781    .3904512
             32  |   .1673656   .1491339     1.12   0.262    -.1249315    .4596627
             33  |   .1946132   .1775686     1.10   0.273    -.1534149    .5426413
             34  |   .2265379   .2122607     1.07   0.286    -.1894855    .6425612
             35  |   .2641297   .2547327     1.04   0.300    -.2351372    .7633966
             36  |   .3086058   .3069114     1.01   0.315    -.2929295    .9101412
             37  |   .3614689   .3712442     0.97   0.330    -.3661563    1.089094
             38  |   .4245811   .4508497     0.94   0.346    -.4590681     1.30823
             39  |   .5002591   .5497171     0.91   0.363    -.5771665    1.577685
             40  |   .5913963   .6729661     0.88   0.380     -.727593    1.910386
             41  |   .7016201   .8271925     0.85   0.396    -.9196473    2.322888
             42  |   .8354954   1.020924     0.82   0.413    -1.165478    2.836469
             43  |   .9987892   1.265225     0.79   0.430    -1.481006    3.478585
             44  |   1.198815   1.574511     0.76   0.446    -1.887169    4.284799
             45  |   1.444884   1.967625     0.73   0.463    -2.411591    5.301359
             46  |   1.748898   2.469302     0.71   0.479    -3.090845     6.58864
             47  |   2.126127   3.112124     0.68   0.494    -3.973524    8.225778
             48  |   2.596246   3.939187     0.66   0.510    -5.124419    10.31691
             49  |   3.184705   5.007713     0.64   0.525    -6.630232    12.99964
             50  |   3.924555   6.393976     0.61   0.539    -8.607407    16.45652
             51  |   4.858902   8.200058     0.59   0.553    -11.21292    20.93072
             52  |   6.044189   10.56313     0.57   0.567    -14.65916    26.74754
             53  |    7.55464   13.66824     0.55   0.580    -19.23462     34.3439
             54  |   9.488271   17.76609     0.53   0.593    -25.33264    44.30918
             55  |   11.97506   23.19769     0.52   0.606    -33.49158     57.4417
             56  |   15.18811   30.42881     0.50   0.618    -44.45125    74.82747
             57  |   19.35896   40.09829     0.48   0.629    -59.23224    97.95016
             58  |   24.79866   53.08607     0.47   0.640    -79.24813    128.8454
             59  |   31.92688    70.6093     0.45   0.651    -106.4648    170.3186
    ------------------------------------------------------------------------------
    I am not sure how to reconcile these results. On the one hand, the predicted probabilities (and utest) show evidence of a U-shaped relationship between gap and cites received, but then the average marginal effects return non-significant effects beyond a value of 13.

    I also ran an additional check using OLS regression with a log-transformed dependent variable, which produced results in line with a U-shaped relationship between ‘time gap’ and ‘forward cites’ (i.e. statistically significant linear and quadratic coefficients, predictive margins, and average marginal effects). I don’t report these results for the sake of brevity.

    So, in summary, my question is: How can I explain the discrepancy between the predicted probabilities and the average marginal effects in my negative binomial regression?

    I would hugely appreciate any help you could give me with this problem.


  • #2
    Disclaimer: I am not familiar with the -utest- command; it is not part of official Stata. To help other readers understand the problem, you are asked to explain in your post what it is and where it comes from.

    Be that as it may, there is no contradiction or inconsistency here. You are taking the marginal effect of a variable that appears as a quadratic in your model. So your model is log(E(fw4)) = b0 + b1*timgap + b2*timgap^2 (+ other terms that we will ignore here for simplicity). The marginal effect at each value of timgap is the derivative of the outcome with respect to timgap at that value. Again, for simplicity, let's just ignore the details of the log link implicit in the -nbreg- model. In effect, then, the marginal effect will be a function of b1 + 2*b2*timgap. In particular, if the coefficient of timgap^2, b2, is positive (as it is in your output), then as timgap increases, so will b1 + 2*b2*timgap. And the "function of" that I'm glossing over is, in fact, a monotone increasing function, so the marginal effect will also increase.

    Now, as it happens, at the smallest values of timgap the marginal effect is negative. But, inevitably, with increasing timgap values, it grows towards zero and then past zero into positive, and ultimately large positive, territory. Certainly you will understand that for values of the marginal effect that turn out close to zero, the marginal effect will not be "statistically significant." By the time you get to timgap = 15, you are already close enough to zero that, given all the things in your model and data that determine the standard errors, that marginal effect is not statistically significant.

    Now, you may wonder: OK, but then as we get into large positive territory, why don't we get statistically significant marginal effects again? If -.28 (timgap = 0) is far enough away from zero to be statistically significant, why isn't +31.9 (timgap = 58)? Well, look at the standard errors out there. They are huge! So the definition of "close enough" to zero is also changing with timgap. Why is that? Well, in this case, it has to do with the distribution of timgap in your data. The distribution is highly right-skewed, with a huge spike at timgap = 1, and the data get very sparse very quickly. The number of observations with values of timgap in large positive territory is quite small, so there is very little information in your data about how fw4 behaves in that region. Hence the large standard errors.

    So, putting it all together: in a quadratic model you are guaranteed that there is some domain of timgap values where the marginal effect is close to zero, and therefore not "statistically significant." As it happens, that region occurs at about the same point where your data start to get sparse. So once you are out of that domain and into a region of timgap values that would yield a large positive marginal effect, you get the expected large estimates of the marginal effect, but they are now based on so few observations that their standard errors are getting even larger!
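    The mechanics described above can be illustrated numerically. Under the log link, holding the other covariates fixed, the marginal effect of timgap is exp(linear predictor) * (b1 + 2*b2*timgap): a strictly positive factor times the quadratic's slope, so it is negative below the turning point -b1/(2*b2), crosses zero there, and then grows rapidly. A minimal sketch, where the constant c standing in for the other terms of the linear predictor is hypothetical and only the two timgap coefficients come from the output above:

```python
import math

b1 = -0.0974457   # coefficient on timgap (from the nbreg output)
b2 = 0.0028285    # coefficient on c.timgap#c.timgap (from the nbreg output)
c = 0.9           # hypothetical stand-in for all other terms in the linear predictor

def marginal_effect(x):
    """d E[y] / d x under the log link: exp(linear predictor) * (b1 + 2*b2*x)."""
    return math.exp(c + b1 * x + b2 * x**2) * (b1 + 2 * b2 * x)

turning_point = -b1 / (2 * b2)   # ~17.23: where the effect changes sign

for x in (0, 10, 17, 18, 30, 58):
    print(x, marginal_effect(x))
# Negative for x below ~17.23, zero at the turning point, then positive and
# growing fast -- the same sign pattern as the margins, dydx() table.
```

    The point estimates follow this pattern mechanically; whether any of them is "statistically significant" then depends on the standard errors, which blow up where the data are sparse.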

    Comment


    • #3
      Thank you so much for your detailed answer, Clyde. You have provided useful insights into how the marginal effect can be understood in a quadratic relationship. More information regarding the -utest- command can be found in the paper from which it originated: Lind, J. T., & Mehlum, H. (2010). With or Without U? The Appropriate Test for a U-Shaped Relationship. Oxford Bulletin of Economics and Statistics, 72(1), 109-118. In this paper, the authors confirm the applicability of the test for models with a limited dependent variable.

      Your explanation makes a lot of sense (especially why the marginal effects are non-significant around the inflection point). The 'time gap' variable indeed has an interesting distribution, with many low values and a few high values. Hence, the large standard errors for high values of 'time gap' are indeed to be expected.

      I am still, however, a bit puzzled about how to interpret the different result obtained from estimating the predicted probabilities at different values of time gap. What can I make of these findings? I originally interpreted these results as suggesting a U-shaped relationship between 'time gap' and 'forward cites' (with a note of caution that the right side of the curve had very few observations). But now, with the estimation of average marginal effects, I am not so sure anymore. To probe further into this issue, I tried winsorizing the 'time gap' variable (at the 1st and 99th percentiles), bringing the extreme values closer to the rest, which yielded statistically significant average marginal effects at the highest value of 'time gap' (which was then 9.5):

      Code:
       winsor2 timgap, replace
      nbreg fw4 count shareint recb teamsize uniqueo patentgrant reuseb medianage timgap c.timgap#c.timgap i.pat_p i.pat_f, robust
       
      Fitting Poisson model:
       
      Iteration 0:   log pseudolikelihood = -47167.258
      Iteration 1:   log pseudolikelihood = -46155.222
      Iteration 2:   log pseudolikelihood = -46138.868
      Iteration 3:   log pseudolikelihood = -46138.798
      Iteration 4:   log pseudolikelihood = -46138.798
       
      Fitting constant-only model:
       
      Iteration 0:   log pseudolikelihood = -44087.015
      Iteration 1:   log pseudolikelihood = -43010.087
      Iteration 2:   log pseudolikelihood = -43002.914
      Iteration 3:   log pseudolikelihood = -43002.914
       
      Fitting full model:
       
      Iteration 0:   log pseudolikelihood = -39965.595
      Iteration 1:   log pseudolikelihood = -38078.008
      Iteration 2:   log pseudolikelihood = -37860.884
      Iteration 3:   log pseudolikelihood = -37857.152
      Iteration 4:   log pseudolikelihood = -37857.151
       
      Negative binomial regression                    Number of obs     =     21,117
                                                      Wald chi2(196)    =   11693.42
      Dispersion           = mean                     Prob > chi2       =     0.0000
      Log pseudolikelihood = -37857.151               Pseudo R2         =     0.1197
       
      -----------------------------------------------------------------------------------
                        |               Robust
                    fw4 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      ------------------+----------------------------------------------------------------
                  count |   .0155177   .0010477    14.81   0.000     .0134643    .0175711
               shareint |  -.3526079   .0451422    -7.81   0.000    -.4410851   -.2641308
             recbreadth |   .1528689   .0277972     5.50   0.000     .0983874    .2073504
               teamsize |   .0254887    .004691     5.43   0.000     .0162945    .0346828
           uniqueoffice |   .1082865   .0048876    22.16   0.000     .0987069     .117866
            patentgrant |   .2127039    .024299     8.75   0.000     .1650787    .2603291
       reusebeginofyear |    .007068   .0015846     4.46   0.000     .0039623    .0101738
              medianage |  -.0078716   .0027462    -2.87   0.004    -.0132541   -.0024891
                 timgap |  -.2113695    .024239    -8.72   0.000    -.2588771   -.1638619
                        |
      c.timgap#c.timgap |   .0163477   .0027463     5.95   0.000      .010965    .0217305
                        |
                  _cons |    1.04521   .6460649     1.62   0.106    -.2210539    2.311474
      ------------------+----------------------------------------------------------------
               /lnalpha |  -.2738808   .0208282                     -.3147034   -.2330582
      ------------------+----------------------------------------------------------------
                  alpha |   .7604227   .0158383                      .7300054    .7921075
      -----------------------------------------------------------------------------------
       
      
      . margins, dydx(timgap) at(timgap=(1(0.5)9.5))
      
      Average marginal effects                        Number of obs     =     21,117
      Model VCE    : Robust
      
      Expression   : Predicted number of events, predict()
      dy/dx w.r.t. : timgap
      
      1._at        : timgap          =           1
      
      2._at        : timgap          =         1.5
      
      3._at        : timgap          =           2
      
      4._at        : timgap          =         2.5
      
      5._at        : timgap          =           3
      
      6._at        : timgap          =         3.5
      
      7._at        : timgap          =           4
      
      8._at        : timgap          =         4.5
      
      9._at        : timgap          =           5
      
      10._at       : timgap          =         5.5
      
      11._at       : timgap          =           6
      
      12._at       : timgap          =         6.5
      
      13._at       : timgap          =           7
      
      14._at       : timgap          =         7.5
      
      15._at       : timgap          =           8
      
      16._at       : timgap          =         8.5
      
      17._at       : timgap          =           9
      
      18._at       : timgap          =         9.5
      
      ------------------------------------------------------------------------------
                   |            Delta-method
                   |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
      timgap       |
               _at |
                1  |  -.4824764   .0547153    -8.82   0.000    -.5897163   -.3752364
                2  |  -.4025129   .0405828    -9.92   0.000    -.4820538   -.3229721
                3  |  -.3351246   .0300182   -11.16   0.000    -.3939591     -.27629
                4  |  -.2777804   .0223262   -12.44   0.000    -.3215389   -.2340218
                5  |  -.2284464   .0171239   -13.34   0.000    -.2620086   -.1948842
                6  |  -.1854715   .0142619   -13.00   0.000    -.2134243   -.1575187
                7  |   -.147499   .0135582   -10.88   0.000    -.1740725   -.1209254
                8  |  -.1133968   .0145274    -7.81   0.000    -.1418699   -.0849237
                9  |  -.0822027   .0165829    -4.96   0.000    -.1147045   -.0497008
               10  |  -.0530786   .0193566    -2.74   0.006    -.0910169   -.0151402
               11  |  -.0252738   .0227253    -1.11   0.266    -.0698144    .0192669
               12  |   .0019077   .0267203     0.07   0.943    -.0504631    .0542785
               13  |   .0291361   .0314651     0.93   0.354    -.0325344    .0908066
               14  |   .0570836   .0371487     1.54   0.124    -.0157264    .1298937
               15  |   .0864522   .0440207     1.96   0.050     .0001733    .1727311
               16  |   .1180028   .0523994     2.25   0.024     .0153017    .2207038
               17  |   .1525883   .0626885     2.43   0.015     .0297212    .2754555
               18  |   .1911923   .0754005     2.54   0.011       .04341    .3389747
      ------------------------------------------------------------------------------
      Last edited by Holmer Kok; 04 Jul 2017, 11:27. Reason: Some issues with the formating of the output were corrected.

      Comment


      • #4
        While awaiting Clyde's definitive word, let me share my (sometimes misguided) opinion.

        I originally interpreted these results as providing suggestions of a U-shaped relationship between 'time gap' and 'forward cites' (with a note of caution that the right side of the curve had very few observations).
        That is correct. Changing your data arbitrarily ("winsorizing") to increase statistical significance is indefensible. You do not have outliers, where winsorizing might be a defensible, albeit primitive, technique to deal with the issue; you have thin data that do not support assertions of a statistically significant difference from zero, at a particular and arbitrary level of significance, at certain values of timgap. But that does not mean that you do not have a U-shaped relationship.

        Meanwhile, here is a link to an earlier discussion by Clyde about a similar situation.

        https://www.statalist.org/forums/for...a-squared-term

        Comment


        • #5
          Originally posted by William Lisowski View Post

          That is correct. Changing your data arbitrarily ("winsorizing") to increase statistical significance is indefensible. You do not have outliers, where it might be a defensible, albeit primitive, technique to deal with the issue; you have thin data that do not support assertions of a statistically significant difference from zero at a particular, arbitrary level of significance at certain values of timgap. But that does not mean that you do not have a u-shaped distribution.
          Yes, I fully agree. I was not planning, in any way, to use the results in which timgap is winsorized, since I might end up losing very interesting observations (plus, choosing the 99th percentile, as you said, is indeed arbitrary). I only included these results to obtain a better understanding of what might drive the difference between the predicted probabilities and average marginal effects in my sample.

          Originally posted by William Lisowski View Post
          Meanwhile, here is a link to an earlier discussion by Clyde about a similar situation.

          https://www.statalist.org/forums/for...a-squared-term
          Thank you for this link. I had come across it earlier, and following Clyde's advice in that post, my results (based on the plot of predicted probabilities) would support a U-shaped impact of timgap on forward cites in my sample as well. My question, however, pertained to the different result indicated by estimating average marginal effects.

          Comment


          • #6
            I fully agree with William that winsorizing this data is not a good idea, and I have nothing more "definitive" to say about it than he has already said.

            I guess, in the end, I might not really understand your question. Your confidence in your quadratic model is shaken by something (the lack of statistical significance?) about the marginal effects at large values of timegap. I don't see where that is coming from. The marginal effects are completely consistent with a quadratic model: they behave exactly as I would expect. The statistical significance of the marginal effects has nothing to do with it. (In fact, in quadratic models generally, unless your research goals include an explicit question about the value of the marginal effect of the quadratic variable at some particular value, it is best not even to look at the p-values associated with them, as they tell you nothing useful and can easily cause confusion.) The pattern of p-values here is purely an artifact of the sparsity of your data at the upper range of timegap. And in any case, I would never judge the appropriateness of a quadratic model based on the statistical significance of the marginal effects. If I were to rely on the statistical significance of anything at all for this purpose, it would be the result of -test timegap timegap#timegap-, the test of the joint null hypothesis that there is no timegap effect at either the linear or the quadratic level. But really, I wouldn't rely on that either. My concerns would be, rather:

            1. Do the data fit the quadratic model reasonably well: how does a scatterplot of observed and predicted values look?

            2. Is the vertex of the parabola implied by the model located well within the range of the observed data? In your case, the vertex is at 0.097457/(2*0.0028285) = 17.2, approximately. That's clearly located centrally in your data. So the model suggests a turning point that is really in your data range, and not outside or near an edge, supporting a quadratic model.

            3. Is a quadratic relationship plausible in terms of theory?
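            The vertex arithmetic in point 2 is just -b1/(2*b2); a quick numeric check using the coefficients quoted there (a sketch, nothing model-specific):

            Code:
```python
b1 = -0.097457   # coefficient on timgap (linear term)
b2 = 0.0028285   # coefficient on c.timgap#c.timgap (quadratic term)

vertex = -b1 / (2 * b2)  # turning point of the parabola
print(round(vertex, 1))  # 17.2
```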

            I think this may be another case of getting confused by focusing on p-values, which are bizarre non-linear transforms of sample-size sensitive functions of the data, and which are appraised by comparing them to a distribution that is almost always conditional on a straw-man hypothesis and applying an arbitrary threshold. I always tell my students to first understand everything else about their model outputs, and not even to look at the p-values. Then, after you have really understood your model fully, if you have nothing better to do, and a lot of time to kill, you might look at the p-values. The lesson of doing that is usually finding that p-values have nothing helpful to add.

            Comment


            • #7
              Originally posted by Clyde Schechter View Post
              I guess, in the end, I might not really understand your question. You are shaken in the confidence of your quadratic model because of something (?lack of statistical significance) about the marginal effects at large values of timegap. I don't get where that is coming from. The marginal effects are completely consistent with a quadratic model: they behave exactly as I would expect. The statistical significance of the marginal effects has nothing to do with it. (In fact, in quadratic models generally, unless your research goals include an explicit question about the value of the marginal effect of the quadratic variable at some particular value, it is actually best not to even look at the p-values associated with them as they really tell you nothing about anything useful and can easily cause confusion.) The pattern of p-values here is purely an artifact of the sparsity of your data at the upper range of timegap. And in any case, I would never judge the appropriateness of a quadratic model based on the statistical significance of the marginal effects.
              I apologize for creating confusion with my question, and I thank you once more for taking ample time to look into my issue.

              Originally posted by Clyde Schechter View Post
              If I were to rely on the statistical significance of anything at all for this purpose, it would be the result of -test timegap timegap#timegap-, the test of the joint null hypothesis that there is no timegap effect at the linear or quadratic level. But really, I wouldn't rely on that either.
              Even though you question the usefulness of this test, I ran it anyway and it produced the following:
              Code:
              . quietly nbreg fw4 medianage count shareint reuseb recb patentgrant uniqueo teamsize timgap c.timgap#c.timgap i.pat_f i.pat_p, robust
              
              . 
              . test timgap timgap#timgap
              
               ( 1)  [fw4]timgap = 0
               ( 2)  [fw4]c.timgap#c.timgap = 0
              
                         chi2(  2) =   71.52
                       Prob > chi2 =    0.0000
              Regarding the examination of average marginal effects: I started looking into the average marginal effects of my time gap measure based on several prior studies that raised issues about interpreting the sign, magnitude, and statistical significance of coefficients in non-linear models (such as negative binomial regressions). For example, Wiersema and Bowen (2009, p. 682) state that "in an LDV model, an explanatory variable’s estimated coefficient can rarely be used to infer the true nature of the relationship between the explanatory variable and the dependent variable." They go on to state that "for LDV models the focus of analysis is on the value and statistical significance of an explanatory variable’s marginal effect, which requires analysis beyond simply estimating one’s model" (Wiersema and Bowen, 2009, p. 682). These concerns are corroborated by several other studies, hence my reduced confidence about the actual shape of the relationship between time gap and forward cites. I was hoping you could give me some clarification regarding these potential issues. I should note that the U-shaped impact of time gap on forward cites is both theoretically plausible and interesting, hence my interest in confirming its robustness and statistical soundness.

              Reference: Wiersema, M. F., & Bowen, H. P. (2009). The use of limited dependent variable techniques in strategy research: Issues and methods. Strategic Management Journal, 30(6), 679-692.

              Comment


              • #8
                I definitely agree that looking at model coefficients in non-linear models can give a misleading impression of what is going on and that it is important to look instead at marginal effects. Just as I discourage looking at model coefficients through the lens of statistical significance, I discourage looking at marginal effects that way.

                In your situation, the marginal effects you are calculating are precisely what you would expect from a quadratic model. The difficulty is that the standard errors are large (confidence intervals are wide) when timegap is large. That, as we have already seen, is due to the sparsity of your data in that area. So the overall picture here points not to a problem with your model but with your data.

                So, to me, the important "test" of this model is how the predicted outcomes match up with the observed outcomes. (Better still, if you have held out some data, how well do the predicted outcomes match up with the observed ones in the held-out sample.)

                If you then want to also ask how the marginal effects are working out (which is not a question of verifying your model, it is a question of understanding what your model tells you about the domain of your data), your results show you that for large values of timegap, we don't really know those marginal effects--they are estimated with extremely poor precision, and so cannot be relied on very much to describe the relationship between timegap and fw4. But changing your model would not improve that situation: even if an oracle told you the true data generating process and you fit that to your data, because you have so few observations with large values of timegap, you will face the same problem of very imprecise estimates of the marginal effect of timegap at large values.

                Comment


                • #9
                  Originally posted by Clyde Schechter View Post
                  I definitely agree that looking at model coefficients in non-linear models can give a misleading impression of what is going on and that it is important to look instead at marginal effects. Just as I discourage looking at model coefficients through the lens of statistical significance, I discourage looking at marginal effects that way.

                  In your situation, the marginal effects you are calculating are precisely what you would expect from a quadratic model. The difficulty is that the standard errors are large (confidence intervals are wide) when timegap is large. That, as we have already seen, is due to the sparsity of your data in that area. So the overall picture here points not to a problem with your model but with your data.

                  So, to me, the important "test" of this model is how the predicted outcomes match up with the observed outcomes. (Better still, if you have held out some data, how well do the predicted outcomes match up with the observed ones in the held-out sample.)

                  If you then want to also ask how the marginal effects are working out (which is not a question of verifying your model, it is a question of understanding what your model tells you about the domain of your data), your results show you that for large values of timegap, we don't really know those marginal effects--they are estimated with extremely poor precision, and so cannot be relied on very much to describe the relationship between timegap and fw4. But changing your model would not improve that situation: even if an oracle told you the true data generating process and you fit that to your data, because you have so few observations with large values of timegap, you will face the same problem of very imprecise estimates of the marginal effect of timegap at large values.
                  Thank you for your swift reply.

                  If I may ask, why are the standard errors of the predicted probabilities smaller than those of the average marginal effects? Are the former not affected as much by the sparsity of the data in the right region of the curve of time gap? If this is the case, do you know of any book or article that might explain this in more detail?

                  Also, I unfortunately do not have a held-out sample, as I am running the analysis on the largest sample I could gather.

                  The scatterplot between observed and predicted values is as follows.

                  Code:
                  quietly nbreg fw4 medianage count shareint reuseb recb patentgrant uniqueo teamsize timgap c.timgap#c.timgap i.pat_f i.pat_p, robust
                  ovfplot
                  [Image attachment: time gap and fw4.png — scatterplot of observed vs. predicted values]
                  I have also computed AIC and BIC values for the models with and without time gap and time gap squared, to get some sense of which model fits better. A rule of thumb I learned is that an improvement exceeding 10 in AIC is normally taken as an indication of substantially improved fit.

                  Code:
                  . eststo: quietly nbreg fw4 medianage count shareint reuseb recb patentgrant uniqueo teamsize i.pat_f i.pat_p, robust
                  (est1 stored)
                  
                  .
                  . eststo: quietly nbreg fw4 medianage count shareint reuseb recb patentgrant uniqueo teamsize timgap i.pat_f i.pat_p, robust
                  (est2 stored)
                  
                  .
                  . eststo: quietly nbreg fw4 medianage count shareint reuseb recb patentgrant uniqueo teamsize timgap c.timgap#c.timgap i.pat_f i.pat_p, robust
                  (est3 stored)
                  
                  .
                  . esttab est*, aic bic drop(*pat_finalid *pat_priy) varwidth(20)
                  
                  --------------------------------------------------------------------
                                                (1)             (2)             (3)  
                                                fw4             fw4             fw4  
                  --------------------------------------------------------------------
                  fw4                                                                
                  medianage                 -0.0209***      -0.0101***     -0.00906**
                                            (-8.36)         (-3.62)         (-3.26)  
                  
                  count                      0.0166***       0.0158***       0.0157***
                                            (15.64)         (15.00)         (14.90)  
                  
                  shareint                   -0.340***       -0.345***       -0.351***
                                            (-7.51)         (-7.65)         (-7.77)  
                  
                  reusebeginofyear           0.0122***      0.00896***      0.00823***
                                             (7.66)          (5.59)          (5.16)  
                  
                  recbreadth                  0.146***        0.146***        0.149***
                                             (5.25)          (5.27)          (5.36)  
                  
                  patentgrant                 0.220***        0.216***        0.215***
                                             (9.04)          (8.86)          (8.82)  
                  
                  uniqueoffice                0.113***        0.111***        0.110***
                                            (22.97)         (22.57)         (22.41)  
                  
                  teamsize                   0.0264***       0.0264***       0.0262***
                                             (5.64)          (5.62)          (5.58)  
                  
                  timgap                                    -0.0578***      -0.0974***
                                                            (-6.81)         (-7.62)  
                  
                  c.timgap#c.timgap                                         0.00283***
                                                                             (3.92)  
                  
                  _cons                       0.698           0.800           0.866  
                                             (1.11)          (1.25)          (1.35)  
                  --------------------------------------------------------------------
                  lnalpha                                                            
                  _cons                      -0.257***       -0.267***       -0.270***
                                           (-12.55)        (-12.89)        (-13.00)  
                  --------------------------------------------------------------------
                  N                           21117           21117           21117  
                  AIC                       76248.4         76175.8         76147.7  
                  BIC                       77808.1         77743.5         77723.3  
                  --------------------------------------------------------------------
                  t statistics in parentheses
                  * p<0.05, ** p<0.01, *** p<0.001
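                  Applying that rule of thumb to the reported AIC values is simple arithmetic (a quick check using only the figures in the table above):

                  Code:
```python
# AIC values from the esttab output above
aic_base      = 76248.4  # model without timgap
aic_linear    = 76175.8  # adds timgap
aic_quadratic = 76147.7  # adds timgap squared

print(round(aic_base - aic_linear, 1))       # 72.6 > 10: adding timgap helps
print(round(aic_linear - aic_quadratic, 1))  # 28.1 > 10: the squared term helps too
```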
                  Last edited by Holmer Kok; 05 Jul 2017, 11:12.

                  Comment


                  • #10
                    If I may ask, why are the standard errors of the predicted probabilities smaller than those of the average marginal effects?
                    What predicted probabilities? This is an -nbreg- model: the outcomes are counts, not probabilities. If you're talking about the standard errors of the predicted counts, they're not small at all. Relative to the size of the counts/marginal effects, they look about the same. (Take a look at the z-scores: they are actually remarkably similar.)

                    Comment


                    • #11
                      Originally posted by Clyde Schechter View Post
                      What predicted probabilities? This is an -nbreg- model: the outcomes are counts, not probabilities. If you're talking about the standard errors of the predicted counts, they're not small at all. Relative to the size of the counts/marginal effects, they look about the same. (Take a look at the z-scores: they are actually remarkably similar.)
                      Yes, I meant predicted number of events throughout this post. My bad! I had been looking into logistic models recently for another study, hence the mix-up.

                      If I look at the z-score at time gap = 25, it shows as 3.64 in the predicted counts, but as 1.15 in the average marginal effects. I want to understand where this difference comes from, and what it means for the conclusions that I can draw about how time gap behaves in that region. As you mentioned in an earlier comment, the marginal effects for high values of time gap are not precise (due to the small number of observations in that region), but then the predicted counts should also not be, right? I have also added the scatterplot of observed and predicted values in the previous comment, as well as some information regarding AIC and BIC values which I thought might be useful.

                      Comment


                      • #12
                        I guess I don't understand where your expectations are coming from here. Overall, the z-scores for the predicted counts are similar to the z-scores for the marginal effects. You can't expect them to match exactly: they're different things and they're estimated differently. It is also possible, in some situations, to have very different z-scores for these things, having to do more with the model than with the data. When you have, as here, a data set with rich data in one range and sparse data in another, you can expect to see qualitatively what you have: small standard errors where the data are rich and large standard errors where they are sparse.

                        As for what you can conclude about the marginal effects of timegap for large values, it is as you and I have both said: we can say very little about them--they are very imprecisely estimated, we don't really know much about them. And, in this case, alternative models are unlikely to fare any better in the face of the distribution of timegap. I don't think you can say anything more than that.

                        The scatterplot looks fairly typical for a negative binomial model until you get out to about 30 predicted events. After that point there is an obvious bias towards over-prediction, and the dispersion of the observed values is decreasing rather than increasing (though this may be an artifact of the sparsity of data points out in this region altogether). The over-prediction at large values suggests that a model that grows less rapidly than quadratic might better describe this data. But, again, given the sparseness of the large-timegap data, I think it is unlikely you can materially improve the model, and while you might find one that looks a bit better, I doubt you would be able to convince anybody that the improvement was more than chance or would hold up in a replication with new data. So I think you can say that your model works reasonably well for small values of timegap (which correspond to small predicted numbers of outcomes) but may break down at larger values. Again, I don't think you are likely to do appreciably better with a different model given the sparse data at large values of timegap. The data just don't provide very much information about what's going on out there.

                        The AIC and BIC values, as conventionally interpreted (by the rule of thumb of 10 you cited) do suggest that the model fit improves by adding timegap and improves further by adding timegap squared. At least the improvement in fit is enough to warrant including an extra variable, given how model parsimony is rewarded by these statistics.

                        Comment


                        • #13
                          Originally posted by Clyde Schechter View Post
                          I guess I don't understand where your expectations are coming from here. Overall, the z-scores for the predicted counts are similar to the z-scores for the marginal effects. You can't expect them to match exactly: they're different things and they're estimated differently. It is also possible, in some situations, to have very different z-scores for these things, having to do more with the model than with the data. When you have, as here, a data set with rich data in one range and sparse data in another, you can expect to see qualitatively what you have: small standard errors where the data are rich and large standard errors where they are sparse.
                          Just for my own clarity (and, I think, that of other Stata users as well), could you briefly explain what hypothesis is tested when we examine the average marginal effects of a variable over a certain range of values, versus the predicted counts over that range? I saw elsewhere (https://www.statalist.org/forums/for...s-significance) that this discussion had already been initiated, and that you noted that this is often model-specific. However, I was unsure how to apply the conclusions from that discussion to my own analysis, especially because opinions on the usefulness of average marginal effects were quite diverse.



                          • #14
                            So, for concreteness, let's take the predicted counts and the average marginal effect when timegap = 5.

                            For the predicted counts, we are testing the hypothesis that the expected value of fw4, conditional on timegap = 5, and adjusted for all other model variables, is zero. I think it is pretty obvious that this is a straw-man null hypothesis in almost any realistic situation. So we see in your output that the expected value of fw4 is estimated to be 1.92, with a standard error of 0.07 (95% CI 1.78-2.05). If we want to take the null hypothesis seriously, for some bizarre reason, we can also look at the p-value, which is shown as 0.000 (and really means < 0.0005). Clearly that null hypothesis is roundly rejected. But coming back to reality, it is perfectly reasonable to point out that the predicted expected value of 1.92 comes with a range of uncertainty around it, represented in some sense by the confidence interval. In particular, we can say that the data are also reasonably consistent with the expected value being anywhere between 1.78 and 2.05. If we are interested in predicting expected values, this gives us a good picture of what our best guess is, and how precise it is.

                            Now let's turn to the marginal effect. That's a completely different animal. The marginal effect of timegap is the expected change in fw4 associated with a unit change in timegap, starting from timegap = 5. Put another way, it's how much we expect the values of fw4 associated with timegap = 4 or 6 to differ from the values of fw4 associated with timegap = 5. It is about differences between predicted values; it is not about predicted values themselves. Here the null hypothesis that the difference is zero may well be of interest, not a straw man. To say that the marginal effect is zero is to say that fw4 is more or less independent of timegap in the vicinity of timegap = 5. Now, as it happens, in this case, we see that the predicted marginal effect is -0.13, with a standard error of 0.01 (95% CI -0.16 to -0.11). Testing that null hypothesis we see the p-value listed as 0.000 (again, really < 0.0005), a resounding rejection. So we can reject the notion that fw4 and timegap are independent in the vicinity of timegap = 5. For a more informative view, we can go back to the marginal effect itself and its confidence interval. It tells us that, on average, the values of fw4 when timegap = 6 are expected to be about 0.13 lower than the average values when timegap = 5. And on the other side, the expected values of fw4 when timegap = 4 are about 0.13 higher than the average values when timegap = 5. So, along with the confidence interval, this tells us something about how strong an association exists between timegap and fw4 when timegap is in the neighborhood of 5. So even though the hypothesis test here is not meaningless, it still is far less informative than a focus on the marginal effect itself and its precision.
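The distinction can be put in symbols. With a log link and a quadratic in timegap, the conditional mean is mu(t) = exp(b0 + b1*t + b2*t^2); the predicted count at t = 5 is the level mu(5), while the marginal effect is the slope dmu/dt = mu(t)*(b1 + 2*b2*t). A minimal sketch with made-up coefficients (b0, b1, b2 below are NOT the fitted values from this model):

```python
import math

# Hypothetical coefficients for mu(t) = exp(b0 + b1*t + b2*t**2);
# illustrative only, not the estimates from the thread's regression.
b0, b1, b2 = 0.9, -0.05, -0.004

def mu(t):
    """Predicted expected count at timegap = t (what -margins, at()- reports)."""
    return math.exp(b0 + b1 * t + b2 * t * t)

def marginal_effect(t):
    """d mu / d t, the slope (what -margins, dydx() at()- reports)."""
    return mu(t) * (b1 + 2 * b2 * t)

t = 5
print(f"predicted count at t={t}: {mu(t):.3f}")              # a level
print(f"marginal effect at t={t}: {marginal_effect(t):.3f}")  # a slope

# The marginal effect approximates the difference between predicted counts
# at adjacent values of timegap, which is the point made above:
print(f"mu(6) - mu(5) = {mu(6) - mu(5):.3f}")
```

The two quantities answer different questions: mu(5) could be large and precisely estimated while the slope at t = 5 is indistinguishable from zero, or vice versa, which is why their z-statistics need not track each other.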

                            To summarize: predicted values (margins) are the expected value of the outcome variable in the model. Unless your research goals include an explicit question about whether the expected value at some value of the predictors is zero, it is pointless to even look at the p-values: they test a meaningless, straw-man null hypothesis. Marginal effects, however, tell you about the direction and magnitude of the association between the predictor and the outcome. And while focusing on the marginal effect and its confidence interval is the more informative view for almost any purpose, at least here the hypothesis test does manage to rise above the level of being a complete waste of time and pixels.
