  • Margins after probit vs LPM

    Hi All,

    According to this Stata blog post and this tutorial video, I expected the marginal effects reported by margins after a probit to be very similar to the coefficients of an LPM. That is usually the case. For example, here the results for race, age, and wage are almost identical between margins and the regression:
    Code:
    sysuse nlsw88.dta
    probit married race age wage
    margins, dydx(*) atmeans
    reg married race age wage
    But that is not true for my data. The marginal effects from margins are quite different from the coefficients I get from the regression (LPM). Do you know why this happens?

    Below is a small sample of my data. The difference in this sample is smaller than what I get with my whole sample. But, even in this smaller sample the difference is quite large between the coefficient of the k variable in the regression (0.12) and the coefficient of k in the margins output (0.05).

    Thanks


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(Nexp_D k year section HH i j)
    0 0 2007 3 0 2.3025851 15.377805
    1 0 2009 2 0 3.3322046 14.426433
    0 0 2011 2 0 3.6888795  15.27889
    0 0 2013 2 0 4.0943446 15.508067
    0 0 2015 2 0 3.8066626 16.109676
    1 0 2007 1 0  2.484907 16.468065
    0 0 2009 1 0  2.397895 16.441336
    0 0 2011 1 0 3.4011974 16.403145
    0 0 2013 1 0  2.564949 14.482695
    1 0 2015 1 0 2.3025851   14.2186
    0 0 2007 7 1  2.833213  13.87392
    0 0 2009 1 0 4.4426513 15.736116
    0 0 2011 1 0  4.941642 16.793394
    1 0 2013 1 0 3.6888795  15.63886
    0 0 2015 1 0 3.4011974  16.43577
    0 0 2007 3 1  1.609438 12.012426
    0 0 2009 4 1  1.609438 12.224598
    0 0 2007 2 0  2.890372 13.648723
    1 0 2009 2 0 2.3025851  14.34033
    0 0 2011 2 0  3.912023 15.402943
    0 0 2013 2 0 3.8066626 14.957743
    0 0 2015 2 0 3.8066626 15.998454
    1 0 2007 4 0  2.890372 13.088835
    0 0 2007 2 1  .6931472  10.81215
    0 0 2009 3 1         0 11.661812
    0 0 2011 3 1         0 13.362721
    1 0 2013 1 1   1.94591 13.853677
    0 0 2015 1 1 1.7917595  14.82856
    0 0 2007 2 1 1.3862944 11.984256
    0 0 2009 2 1 1.3862944 12.346992
    0 0 2007 3 0  3.912023 14.435515
    0 0 2009 2 0 4.1108737 14.762905
    0 0 2011 2 0  3.218876 14.865374
    0 0 2013 2 0 2.1972246  13.85997
    1 0 2015 3 0 2.1972246 13.860202
    0 0 2007 2 0  3.555348  12.84799
    0 0 2009 2 0  2.484907  12.99542
    0 0 2011 2 0  2.890372 14.570724
    0 0 2013 2 0  2.772589 13.725273
    0 0 2015 2 0  2.484907   14.5059
    0 0 2007 2 1   1.94591 13.077674
    0 1 2007 2 0 4.4998097  15.26651
    0 1 2009 2 0 4.1743875  15.89154
    0 0 2011 2 0 4.6051702  17.01238
    0 0 2007 1 0 3.6888795  14.13269
    0 0 2009 1 0  2.397895  14.47045
    0 0 2011 1 0  5.010635 15.402943
    0 0 2013 1 0  4.787492 15.205634
    0 0 2015 1 0 4.6051702  15.58972
    0 0 2007 3 1  .6931472 10.115307
    0 0 2009 2 1  .6931472  10.40905
    0 0 2011 3 1  .6931472  11.49092
    0 0 2007 1 1 1.7917595 12.425272
    0 0 2007 1 0 1.7917595 11.823416
    0 0 2009 1 1  .6931472  9.478772
    0 0 2007 1 1 2.6390574 13.958892
    0 0 2009 1 1  2.995732 12.387365
    0 0 2011 1 0 3.3322046 13.793505
    0 0 2013 1 0  2.484907 14.324818
    0 0 2015 1 0  3.178054 14.352424
    0 0 2007 2 1  2.484907 14.316677
    0 0 2007 2 0  3.218876 13.649912
    0 0 2009 2 0  3.135494 13.856635
    0 0 2011 2 0   2.70805 13.807982
    0 0 2013 2 0  3.218876 14.718597
    0 0 2015 2 0  2.890372 14.850896
    0 0 2007 1 0  4.787492 17.186691
    0 0 2009 1 0  5.298317 16.968035
    0 0 2011 1 0  2.833213  13.29921
    0 0 2007 1 0 3.6888795 15.449242
    0 0 2009 1 0 3.6888795  15.05237
    0 0 2007 2 0 2.0794415   12.6079
    0 0 2009 2 0 2.6390574  12.89164
    0 0 2011 2 0 2.6390574  12.81932
    0 0 2013 2 0 2.1972246 14.651577
    0 0 2015 2 0 2.6390574 15.702392
    0 0 2007 1 1 1.0986123 10.768788
    0 0 2009 1 1         0 11.074025
    0 0 2007 1 0  1.609438 11.894644
    0 0 2009 1 1  1.609438 11.596215
    0 0 2013 1 1 1.3862944  13.88471
    0 0 2015 1 1 2.0794415 13.595247
    0 0 2007 1 1   1.94591 12.648416
    0 0 2007 1 0  3.433987   14.1687
    1 0 2009 3 0 4.0943446 16.339886
    0 1 2013 3 0 3.4011974 14.452973
    1 1 2015 1 0  2.564949 15.215858
    0 0 2007 1 1 4.0430512  14.33154
    0 0 2009 1 1  2.995732 14.883858
    0 0 2011 1 1  3.583519 14.726324
    0 0 2013 3 1  3.135494  14.89548
    0 0 2015 3 1 1.3862944 14.117247
    0 0 2007 3 1  2.484907 13.727225
    0 0 2009 3 1  2.484907   14.1287
    0 0 2007 1 1  3.218876   13.2164
    0 0 2009 3 1  2.995732 13.570766
    0 0 2007 3 1 1.7917595 11.201496
    0 0 2009 1 1  2.772589 12.683463
    0 0 2011 1 1  1.609438 12.184067
    end
    
    probit Nexp_D k year section HH i j
    margins, dydx(*) atmeans
    reg Nexp_D k year section HH i j

  • #2
    Here is the output:

    Code:
    . margins, dydx(*) atmeans
    
    Conditional marginal effects                    Number of obs     =         99
    Model VCE    : OIM
    
    Expression   : Pr(Nexp_D), predict()
    dy/dx w.r.t. : k year section HH i j
    at           : k               =     .040404 (mean)
                   year            =    2010.131 (mean)
                   section         =    1.808081 (mean)
                   HH              =    .3737374 (mean)
                   i               =    2.748622 (mean)
                   j               =     13.9989 (mean)
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
               k |   .0518372     .09571     0.54   0.588    -.1357509    .2394254
            year |   .0025828   .0082636     0.31   0.755    -.0136136    .0187793
         section |   .0220275   .0250538     0.88   0.379    -.0270769     .071132
              HH |  -.1395704   .0704393    -1.98   0.048     -.277629   -.0015119
               i |  -.0728711   .0357892    -2.04   0.042    -.1430166   -.0027256
               j |   .0420548   .0247855     1.70   0.090    -.0065239    .0906335
    ------------------------------------------------------------------------------
    
    . reg Nexp_D k year section HH i j
    
          Source |       SS           df       MS      Number of obs   =        99
    -------------+----------------------------------   F(6, 92)        =      1.61
           Model |  .852163038         6  .142027173   Prob > F        =    0.1544
        Residual |  8.13773595        92  .088453652   R-squared       =    0.0948
    -------------+----------------------------------   Adj R-squared   =    0.0358
           Total |  8.98989899        98  .091733663   Root MSE        =    .29741
    
    ------------------------------------------------------------------------------
          Nexp_D |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
               k |   .1178658   .1548691     0.76   0.449    -.1897176    .4254493
            year |    .000571   .0121142     0.05   0.963    -.0234888    .0246308
         section |   .0151021    .032287     0.47   0.641    -.0490227     .079227
              HH |  -.1412624   .0807356    -1.75   0.084    -.3016103    .0190856
               i |  -.0942334   .0456378    -2.06   0.042    -.1848741   -.0035928
               j |    .054302    .033056     1.64   0.104    -.0113502    .1199541
           _cons |   -1.52712   24.21439    -0.06   0.950    -49.61899    46.56475
    ------------------------------------------------------------------------------
    At least in this example, the SEs for k, year, and section are very large and the t values are small.

    For HH, i, and j, the SEs are smaller and the t values are bigger. Their marginal effects are pretty similar to the coefficients from the LPM.

    I don't have a formal answer to your question, but my guess is that discrepancies will be greater when standard errors are bigger and effects are less statistically significant.
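
    One further check, offered as a sketch rather than a diagnosis: with atmeans, margins evaluates the marginal effects at the sample means of the covariates, whereas without it margins reports average marginal effects, which are often the quantity compared with LPM coefficients. The commands below use only the variables from the posted example. Whether the gap for k narrows this way, here or in the full sample, is not something the output above can settle.

    ```stata
    * Fit the probit from the original post
    probit Nexp_D k year section HH i j

    * Marginal effects at the means (what the original post reports)
    margins, dydx(*) atmeans

    * Average marginal effects: the derivative is averaged over all observations
    margins, dydx(*)

    * Linear probability model for comparison
    regress Nexp_D k year section HH i j
    ```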

    Last edited by Richard Williams; 12 Dec 2017, 02:48.
    -------------------------------------------
    Richard Williams, Notre Dame Dept of Sociology
    StataNow Version: 19.5 MP (2 processor)

    WWW: https://www3.nd.edu/~rwilliam



    • #3
      Monica: Further to Richard's comment, I think the specific issue in your case is the joint structure of Nexp_D and k and the fact that both are positive only rarely. Specifically, note that k is positive in only 4 of the 99 observations, and Nexp_D is positive in only one of those 4. Such a structure will generally result in fragile estimates (if the model is estimable at all). For instance, see what happens when you delete observation n=87 and rerun your LPM and probit models.
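
      The check suggested above can be sketched as follows. It assumes the dataex example from #1 is loaded in its original order, so that observation 87 is the only case with both k and Nexp_D equal to 1. The `if _n != 87` qualifier is just one way to exclude that row without dropping it from memory.

      ```stata
      * Cross-tabulate the outcome against k to see how few cases identify the k effect
      tabulate Nexp_D k

      * Confirm which row is being excluded
      list Nexp_D k in 87

      * Refit both models without observation 87
      probit Nexp_D k year section HH i j if _n != 87
      regress Nexp_D k year section HH i j if _n != 87
      ```

      With that case excluded, Nexp_D is 0 for every remaining observation with k equal to 1, so probit is likely to drop k as a perfect predictor while the LPM still returns a coefficient.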
