  • Predict outcomes after xtnbreg

    I want to better understand what is being predicted under fixed versus random effects after a regression.
    I run a regression like the one below. The DV is a severely skewed count variable whose standard deviation is much larger than its mean, so a negative binomial model seems appropriate (and is the norm in my field for this kind of data). I have multiple firms with multiple observations per year over a 20-year period (unbalanced).
    Code:
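    * Convert the string firm identifier to a numeric panel variable
    encode firm_id, gen(fid)
    * Declare the panel variable (xtset supersedes the older iis)
    xtset fid
    * Conditional fixed-effects negative binomial
    xtnbreg DV c.z1##c.z2 $xvarlist, fe
    predict fe_dv, nu0
    * Random-effects negative binomial
    xtnbreg DV c.z1##c.z2 $xvarlist, re
    predict re_dv, nu0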
    Now, the values of fe_dv are much smaller than those of re_dv: they differ by a factor of about 10 [9.08, 10.18], and they also differ within firms (for the fixed effects). Fixed effects are common practice for this kind of data in my field, but the predictions I get from them are much further away from the observed DV. That said, the correlation between fe_dv and re_dv is 100%, which is a bit odd given that they are not simple linear combinations of one another.
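
    A minimal sketch for inspecting the two predictions side by side, assuming fe_dv and re_dv exist as created above. The variable ratio_re_fe is a hypothetical helper introduced only for illustration.

    ```stata
    * Assumes fe_dv and re_dv were created by the predict calls above
    generate double ratio_re_fe = re_dv / fe_dv    // hypothetical helper variable
    summarize DV fe_dv re_dv ratio_re_fe           // compare the scales with the observed counts
    correlate fe_dv re_dv                          // the correlation mentioned above
    tabstat ratio_re_fe, by(fid) statistics(min max)   // how much the ratio varies within each firm
    ```

    The tabstat line prints one row per firm, so with many firms the output is long but shows directly whether the ratio is constant within a firm.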

    Because the predictions from the fixed effects model are generally about ten times smaller than those from the random effects model, I am struggling with the right interpretation of the following margins commands:

    Code:
    xtnbreg DV c.z1##c.z2 $xvarlist, fe
    margins, dydx(z1) at(z2 = (0 5.6(8.6)23)) predict(nu0)
    margins, dydx(z1) at(z2 = (0 5.6(8.6)23))
    margins determines the average marginal effect of a change in z1 on DV at distinct values of z2.
    I am struggling to understand the difference between the two versions of the margins command. I know the former uses the predicted number of events and the latter the linear prediction, but if that is correct, what does it mean that the values from both margins commands are nearly identical (they only start to differ at the third or fourth decimal place, see below)?
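
    The two margins calls differentiate different functions of the same linear index. The sketch below assumes that nu0 equals $\exp(x_{it}\beta)$ with the effect set to zero and no offset; this is an assumption here and worth checking against the xtnbreg postestimation documentation. Only the z1 and z2 terms are written out, and $\beta_1$ and $\beta_{12}$ are labels introduced here for the coefficients on z1 and c.z1#c.z2.

    ```latex
    \begin{align*}
    \text{linear prediction:}\quad
      & \frac{\partial (x_{it}\beta)}{\partial z_1} = \beta_1 + \beta_{12} z_2 \\
    \text{predict(nu0):}\quad
      & \frac{\partial \exp(x_{it}\beta)}{\partial z_1}
        = \exp(x_{it}\beta)\,\bigl(\beta_1 + \beta_{12} z_2\bigr)
    \end{align*}
    ```

    Averaged over the sample at a fixed z2, the second expression is the first multiplied by the sample mean of $\exp(x_{it}\beta)$ evaluated at that z2. Under this assumption, the two sets of marginal effects will be close whenever that mean is near 1.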

    If I want to check the economic significance of the values of dy/dx, I have to compare them to a relevant reference value. Now what is the most sensible one to use? Is it the actual DV? Is it the predicted responses fe_dv, or something else?

    And given that z1 is a continuous variable (between 0 and 7), could the following interpretation hold true?
    Say dy/dx at z2 = min equals 0.01 and dy/dx at z2 = mean equals 0.026.
    A one-unit change in z1, from 0 to 1, increases the number of events in the DV by 0.01 if z2 is at its minimum value and by 0.026 if z2 is at its mean value.
    I assume this is the interpretation after the predict(nu0) option. Is this correct, and how does it differ after the second margins command I ran above? I thought it should differ substantially, but according to the marginal effects I get, there is almost no difference whatsoever.

    For the sake of clarity, here is some output from the actual regression:

    Code:
     xtnbreg fwd10 c.log_dom##c.t_s_d##c.p_pria_usew p_pria_timew  f_emp f_acap f_dar f_search f_inv_prod_5y  p_inv log_other t_k_s team_t_1st col_difpairs team_m_patcount_5y p_classes dif_cpc p_cpc_1 p_pria p_claims priordum ts i.p_appy i.p_gry if compustat == 1, fe
    note: 3 groups (3 obs) dropped because of only one obs per group
    
    Conditional FE negative binomial regression     Number of obs     =     40,138
    Group variable: fid                             Number of groups  =        105
    
                                                    Obs per group:
                                                                  min =          2
                                                                  avg =      382.3
                                                                  max =      8,484
    
                                                    Wald chi2(38)     =    6522.35
    Log likelihood  = -128445.56                    Prob > chi2       =     0.0000
    
    -------------------------------------------------------------------------------------------------------------
                                          fwd10 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    --------------------------------------------+----------------------------------------------------------------
                                      log_dom_t |   .0244366   .0048438     5.04   0.000     .0149429    .0339302
                                t_s_degree_cent |   .0031362   .0014045     2.23   0.026     .0003834     .005889
                                                |
                  c.log_dom_t#c.t_s_degree_cent |  -.0012754   .0003445    -3.70   0.000    -.0019506   -.0006003
                                                |
                                    p_pria_usew |  -.0012637   .0115459    -0.11   0.913    -.0238932    .0213658
                                                |
                      c.log_dom_t#c.p_pria_usew |   .0090225   .0036865     2.45   0.014     .0017972    .0162478
                                                |
                c.t_s_degree_cent#c.p_pria_usew |   .0030458   .0013597     2.24   0.025     .0003808    .0057108
                                                |
    c.log_dom_t#c.t_s_degree_cent#c.p_pria_usew |   -.000679   .0003071    -2.21   0.027    -.0012809   -.0000771
    
    margins, dydx(log_dom_t) at(t_s_d = (0 5.66 14.4 23) p_pria_usew = (-1.5 -.9 0 .9 1.8 3.6)) predict(nu0)
    Average marginal effects                        Number of obs     =     40,138
    Model VCE    : OIM
    
    Expression   : Predicted number of events (assuming u_i=0), predict(nu0)
    dy/dx w.r.t. : log_dom_t
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    log_dom_t    |
             _at |
              1  |   .0107409   .0074239     1.45   0.148    -.0038096    .0252914
              2  |   .0162449   .0060048     2.71   0.007     .0044756    .0280141
              3  |   .0247225   .0049323     5.01   0.000     .0150554    .0343896
              4  |   .0334764   .0058737     5.70   0.000     .0219641    .0449887
              5  |   .0425183   .0082944     5.13   0.000     .0262616    .0587751
              6  |   .0615165   .0148566     4.14   0.000      .032398     .090635
              7  |   .0092054   .0064945     1.42   0.156    -.0035236    .0219344
              8  |    .012431   .0053276     2.33   0.020      .001989    .0228729
              9  |   .0174624   .0045355     3.85   0.000      .008573    .0263517
             10  |   .0227338   .0054916     4.14   0.000     .0119704    .0334972
             11  |    .028255   .0077184     3.66   0.000     .0131272    .0433829
             12  |   .0400876    .013751     2.92   0.004     .0131362    .0670391
             13  |   .0068984   .0081304     0.85   0.396    -.0090368    .0228337
             14  |   .0066254   .0066156     1.00   0.317     -.006341    .0195918
             15  |    .006182   .0055015     1.12   0.261    -.0046007    .0169648
             16  |    .005696    .006815     0.84   0.403    -.0076612    .0190531
             17  |   .0051646   .0099822     0.52   0.605    -.0144002    .0247293
             18  |   .0039548   .0187767     0.21   0.833    -.0328468    .0407564
             19  |   .0047026   .0117443     0.40   0.689    -.0183158     .027721
             20  |   .0010094   .0095372     0.11   0.916    -.0176831     .019702
             21  |   -.005011   .0077405    -0.65   0.517    -.0201822    .0101602
             22  |  -.0116496   .0098395    -1.18   0.236    -.0309347    .0076355
             23  |  -.0189558   .0152036    -1.25   0.212    -.0487543    .0108428
             24  |  -.0357875   .0308857    -1.16   0.247    -.0963223    .0247473
    ------------------------------------------------------------------------------
    
    margins, dydx(log_dom_t) at(t_s_d = (0 5.66 14.4 23) p_pria_usew = (-1.5 -.9 0 .9 1.8 3.6))
    
    Average marginal effects                        Number of obs     =     40,138
    Model VCE    : OIM
    
    Expression   : Linear prediction, predict()
    dy/dx w.r.t. : log_dom_t
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    log_dom_t    |
             _at |
              1  |   .0109029   .0075135     1.45   0.147    -.0038233     .025629
              2  |   .0163163   .0059931     2.72   0.006       .00457    .0280627
              3  |   .0244366   .0048438     5.04   0.000     .0149429    .0339302
              4  |   .0325568   .0057465     5.67   0.000     .0212938    .0438197
              5  |    .040677   .0080373     5.06   0.000     .0249242    .0564298
              6  |   .0569174   .0139211     4.09   0.000     .0296326    .0842022
              7  |   .0094488   .0066705     1.42   0.157    -.0036251    .0225228
              8  |   .0125563   .0053762     2.34   0.020     .0020191    .0230935
              9  |   .0172176   .0044673     3.85   0.000     .0084619    .0259732
             10  |   .0218788   .0053233     4.11   0.000     .0114453    .0323122
             11  |     .02654   .0073517     3.61   0.000      .012131     .040949
             12  |   .0358624    .012548     2.86   0.004     .0112688     .060456
             13  |   .0072036   .0085747     0.84   0.401    -.0096025    .0240097
             14  |   .0067502   .0067921     0.99   0.320     -.006562    .0200624
             15  |   .0060702   .0054292     1.12   0.264    -.0045708    .0167111
             16  |   .0053901   .0064915     0.83   0.406     -.007333    .0181133
             17  |   .0047101   .0091719     0.51   0.608    -.0132665    .0226867
             18  |     .00335   .0160016     0.21   0.834    -.0280126    .0347126
             19  |   .0049943   .0125842     0.40   0.691    -.0196703    .0296589
             20  |   .0010371   .0098148     0.11   0.916    -.0181995    .0202738
             21  |  -.0048986   .0075191    -0.65   0.515    -.0196357    .0098385
             22  |  -.0108344   .0089803    -1.21   0.228    -.0284355    .0067667
             23  |  -.0167702   .0129869    -1.29   0.197    -.0422241    .0086838
             24  |  -.0286417   .0231603    -1.24   0.216    -.0740351    .0167517
    ------------------------------------------------------------------------------
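
    As a rough check on the second table, the linear-prediction dy/dx can be rebuilt by hand from the coefficients reported above. The scalar names b1, b12, b13 and b123 are hypothetical labels introduced only for this sketch. The at() combinations are assumed to be ordered with p_pria_usew varying fastest, which is consistent with row 3 equalling the log_dom_t coefficient.

    ```stata
    * Coefficients copied from the xtnbreg output above
    scalar b1   =  .0244366    // log_dom_t
    scalar b12  = -.0012754    // c.log_dom_t#c.t_s_degree_cent
    scalar b13  =  .0090225    // c.log_dom_t#c.p_pria_usew
    scalar b123 = -.000679     // c.log_dom_t#c.t_s_degree_cent#c.p_pria_usew

    * _at 1: t_s_d = 0, p_pria_usew = -1.5
    display b1 + b12*0 + b13*(-1.5) + b123*0*(-1.5)

    * _at 7: t_s_d = 5.66, p_pria_usew = -1.5
    display b1 + b12*5.66 + b13*(-1.5) + b123*5.66*(-1.5)

    * Ratio of the nu0-based to the linear-prediction dy/dx at _at 3
    display .0247225 / .0244366
    ```

    The first two results should agree with rows 1 and 7 of the linear-prediction table up to rounding (about .0109 and .0094), and the ratio comes out at roughly 1.01.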
    Thanks for your time!
    Simon

  • #2
    I have the same issue. Did you figure out a way to get predicted outcomes? In my case, I would like to get predicted outcomes for each observation instead of just at some levels of interest.



    • #3
      Tracy: I’d recommend avoiding xtnbreg, especially with the FE option. Poisson regression is much better for predicting the mean. What are your N and T dimensions?
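
      A minimal sketch of what a fixed-effects Poisson fit with per-observation predictions could look like, reusing the placeholder names from the first post (DV, z1, z2, $xvarlist, fid). The vce(robust) option and the variable name pois_nu0 are assumptions added for illustration and are not stated anywhere in this thread.

      ```stata
      * Sketch only: DV, z1, z2, $xvarlist and fid are the placeholders from the first post
      xtset fid
      xtpoisson DV c.z1##c.z2 $xvarlist, fe vce(robust)   // vce(robust) is an assumption
      predict double pois_nu0, nu0    // predicted count assuming u_i = 0, one value per observation
      summarize DV pois_nu0           // compare the scale with the observed counts
      ```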
