  • margins, predictnl and nlcom

    Dear All,

    I estimate a logit model and then want to compute the (average) marginal effects of the regressors on the probability of a positive outcome. Since my dataset is quite big (roughly 450,000 observations), margins takes ages. Therefore, I tried to work around this problem by using predictnl to compute the required marginal effects "manually". Now consider the following example:

    Code:
    webuse bangladesh
    
    logit c_use urban age children
    
    margins, dydx(urban age children)
    The result of the calculation via margins is the following:

    Code:
    Average marginal effects                                 Number of obs = 1,934
    Model VCE: OIM
    
    Expression: Pr(c_use), predict()
    dy/dx wrt:  urban age children
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
           urban |   .1784138   .0221528     8.05   0.000     .1349952    .2218325
             age |  -.0063953   .0016695    -3.83   0.000    -.0096674   -.0031232
        children |   .0853374   .0117828     7.24   0.000     .0622435    .1084314
    ------------------------------------------------------------------------------
    Instead, if I use predictnl, I use the following code:

    Code:
    egen mage=mean(age)
    egen mchildren=mean(children)
    egen murban=mean(urban)
    
    predictnl marurban =_b[urban]*(1/(1+exp(-(_b[_cons]+_b[urban]*murban*_b[age]*mage+_b[children]*mchildren))))*(1-(1/(1+exp(-(_b[_cons]+_b[urban]*murban*_b[age]*mage+_b[children]*mchildren))))), se(se1)
    Then if I summarize both marurban and se1 I get:

    Code:
      Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
        marurban |      1,934    .1756926           0   .1756926   .1756926
             se1 |      1,934    .0214322           0   .0214322   .0214322
    Both the (average) marginal effect and the standard errors are slightly different from those obtained using margins. I tried to use nlcom instead:

    Code:
    nlcom (marurban: _b[urban]*(1/(1+exp(-(_b[_cons]+_b[urban]*murban*_b[age]*mage+_b[children]*mchildren))))*(1-(1/(1+exp(-(_b[_cons]+_b[urban]*murban*_b[age]*mage+_b[children]*mchildren))))))
    The result is reported below:

    Code:
      Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
        marurban |      1,934    .1756926           0   .1756926   .1756926
             se1 |      1,934    .0214322           0   .0214322   .0214322
    So, both predictnl and nlcom produce the same results, but they differ from those obtained using margins. Regarding the standard errors, I know that margins uses the delta method to compute them, so I understand that they could be different. But why is the marginal effect different from the one obtained with margins? In my understanding, its magnitude should be the same. Is there any way I can get identical results using either predictnl or nlcom?

    Thanks in advance for your help.

    Dario
    Last edited by Dario Maimone Ansaldo Patti; 01 Dec 2021, 16:44.

  • #2
    Your -predictnl- and -nlcom- commands are predicting a different statistic from what -margins- is calculating. You are calculating the marginal effects at the means of the variables, whereas -margins- is calculating the average marginal effects. Given the non-linearity of the logistic model, those are different things. It is possible to write code that calculates what -margins- calculates, but basically you would just be hand-coding the -margins- command. Yes, you could save a little bit of time due to some overhead that -margins- requires to figure out exactly what you want, whereas you could just write the commands needed specifically for that. But in the end, you would not save a noticeable amount of time. The calculation of the standard errors of the marginal effects is especially time consuming. If you don't need those, adding the -noci- option to -margins- will speed things up considerably.
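
    To make the distinction concrete, here is a minimal sketch on the example data from #1. The names p and ame_urban are illustrative only, and nothing here has been timed on a large data set. The -atmeans- option asks -margins- for the marginal effect at the means, which is the kind of quantity the hand-written expression in #1 targets. Note that the expression there also has a * where the linear index needs a + between the urban and age terms, so it may not reproduce the -atmeans- result exactly.

    Code:
    webuse bangladesh, clear
    logit c_use urban age children
    
    * average marginal effect (the -margins- default)
    margins, dydx(urban)
    
    * marginal effect at the means of the regressors
    margins, dydx(urban) atmeans
    
    * point estimate of the average marginal effect by hand:
    * average the observation-level derivative b*p*(1-p)
    predict double p, pr
    generate double ame_urban = _b[urban]*p*(1-p)
    summarize ame_urban
    The mean reported by -summarize- is the hand-computed counterpart of the dy/dx for urban from the first -margins- call; it carries no standard error.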

    Another possibility is to pull a more reasonably sized subset at random from your 450,000 observations and do the analysis on that subset. Unless you are trying to estimate some very rare events or extremely small marginal effects, a data set of 450,000 is overkill for this kind of analysis anyway. If going to a smaller subset isn't suitable for your research goals, then I can't offer you any better advice than to be patient. Let it run overnight. If you must be present while it runs, read a book or do something else to distract you so you are not waiting for your watched pot to boil.
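
    A rough sketch of both suggestions, shown on the example data from #1. The seed and the 10 percent sampling fraction are arbitrary assumptions, not recommendations. As far as the -margins- documentation goes, the option that actually skips the standard-error computation appears to be -nose-, while -noci- only suppresses the display of the confidence interval, so the last line uses -nose-.

    Code:
    webuse bangladesh, clear
    
    * random 10% subsample, restored afterwards
    set seed 12345
    preserve
    sample 10
    logit c_use urban age children
    margins, dydx(urban age children)
    restore
    
    * full sample, point estimates only (no standard errors)
    logit c_use urban age children
    margins, dydx(urban age children) nose
    Because -sample- drops observations from memory, wrapping it in -preserve- and -restore- keeps the full data set available for the second estimation.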


    • #3
      Clyde Schechter, thanks a lot for your explanation. Useful, as usual. I think you are right: the best I can do is to be patient and wait. I will try the noci option, as I do not need the standard errors. I just noticed that I can replicate the exact result from margins if I generate the estimated probabilities (say p) after logit and then use:

      Code:
      predictnl marurban=p*(1-p)*_b[urban]
      Again, the standard errors are still different, given the different ways in which they are calculated. Anyway, thanks again for your help.
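
      For reference, a sketch of the same replication without the separate -predict- step, assuming the xb() function of -predictnl- is used to rebuild the probability; me_urban is an illustrative name. The mean of the observation-level effects is the average marginal effect. Any se() requested from -predictnl- would be an observation-level standard error, not the standard error of the average, which is one reason it does not match -margins-.

      Code:
      webuse bangladesh, clear
      quietly logit c_use urban age children
      
      * observation-level marginal effect of urban: b*p*(1-p)
      predictnl double me_urban = _b[urban]*invlogit(xb())*(1-invlogit(xb()))
      
      * its mean is the average marginal effect
      summarize me_urban
      
      * compare with -margins-
      margins, dydx(urban)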
