Hello,
To showcase the issue, this post uses only a subsample of my data and a simplified regression. In my real regression I have more than 6,000 observations and about 30 variables in the ZINB model, with two variables in the inflate() part. I need margins, dydx(*) to see the effects of the variables on the predicted number of events. It would be really helpful if you could let me know why the marginal effects reported by margins, dydx(*) are of such an unreasonably large magnitude and how I could deal with this issue. Thank you.
I am trying to use margins, dydx(*) after zinb, but it yields implausibly large results.
Code:
. zinb sum_l2 mud_sqft_per_capita office_sqft_per_capita retail_sqft_per_capita,inflate(mud_sqft_per_capita owned) vce(robust)
Fitting constant-only model:
Iteration 0: log pseudolikelihood = -1935.8032 (not concave)
Iteration 1: log pseudolikelihood = -1398.7715
Iteration 2: log pseudolikelihood = -1339.4624
Iteration 3: log pseudolikelihood = -1327.9422
Iteration 4: log pseudolikelihood = -1326.7051
Iteration 5: log pseudolikelihood = -1326.6635
Iteration 6: log pseudolikelihood = -1326.6634
Fitting full model:
Iteration 0: log pseudolikelihood = -1326.6634
Iteration 1: log pseudolikelihood = -1310.2141
Iteration 2: log pseudolikelihood = -1301.1221
Iteration 3: log pseudolikelihood = -1291.9257
Iteration 4: log pseudolikelihood = -1290.279
Iteration 5: log pseudolikelihood = -1290.2517
Iteration 6: log pseudolikelihood = -1290.2517
Zero-inflated negative binomial regression Number of obs = 1,594
Nonzero obs = 258
Zero obs = 1,336
Inflation model = logit Wald chi2(3) = 19.52
Log pseudolikelihood = -1290.252 Prob > chi2 = 0.0002
----------------------------------------------------------------------------------------
| Robust
sum_l2 | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-----------------------+----------------------------------------------------------------
sum_l2 |
mud_sqft_per_capita | -.0350013 .674072 -0.05 0.959 -1.356158 1.286155
office_sqft_per_capita | 3.080568 1.015168 3.03 0.002 1.090875 5.070261
retail_sqft_per_capita | 1.813879 .5685284 3.19 0.001 .6995834 2.928174
_cons | .3769259 .2642011 1.43 0.154 -.1408988 .8947507
-----------------------+----------------------------------------------------------------
inflate |
mud_sqft_per_capita | -11.19935 1.662089 -6.74 0.000 -14.45699 -7.94172
owned | -.5364868 .5588103 -0.96 0.337 -1.631735 .5587612
_cons | 1.413014 .4707848 3.00 0.003 .4902923 2.335735
-----------------------+----------------------------------------------------------------
/lnalpha | 1.650809 .1652354 9.99 0.000 1.326953 1.974664
-----------------------+----------------------------------------------------------------
alpha | 5.211192 .8610735 3.76954 7.204199
----------------------------------------------------------------------------------------
. margins, dydx(*)
Average marginal effects Number of obs = 1,594
Model VCE : Robust
Expression : Predicted number of events, predict()
dy/dx w.r.t. : mud_sqft_per_capita office_sqft_per_capita retail_sqft_per_capita owned
----------------------------------------------------------------------------------------
| Delta-method
| dy/dx Std. Err. z P>|z| [95% Conf. Interval]
-----------------------+----------------------------------------------------------------
mud_sqft_per_capita | 2.16e+08 1.85e+09 0.12 0.907 -3.41e+09 3.84e+09
office_sqft_per_capita | 7.35e+09 6.57e+10 0.11 0.911 -1.21e+11 1.36e+11
retail_sqft_per_capita | 4.33e+09 3.74e+10 0.12 0.908 -6.91e+10 7.77e+10
owned | 1.44e+07 1.23e+08 0.12 0.907 -2.27e+08 2.56e+08
----------------------------------------------------------------------------------------
You can see that the average marginal effects reported by margins, dydx(*) are enormous, on the order of 1e+07 to 1e+09. However, if I reduce the number of variables, it works fine.
Code:
. zinb sum_l2 mud_sqft_per_capita,inflate(mud_sqft_per_capita owned) vce(robust)
Fitting constant-only model:
Iteration 0: log pseudolikelihood = -1935.8032 (not concave)
Iteration 1: log pseudolikelihood = -1398.7715
Iteration 2: log pseudolikelihood = -1339.4624
Iteration 3: log pseudolikelihood = -1327.9422
Iteration 4: log pseudolikelihood = -1326.7051
Iteration 5: log pseudolikelihood = -1326.6635
Iteration 6: log pseudolikelihood = -1326.6634
Fitting full model:
Iteration 0: log pseudolikelihood = -1326.6634
Iteration 1: log pseudolikelihood = -1318.5842
Iteration 2: log pseudolikelihood = -1317.6977
Iteration 3: log pseudolikelihood = -1317.6921
Iteration 4: log pseudolikelihood = -1317.6921
Zero-inflated negative binomial regression Number of obs = 1,594
Nonzero obs = 258
Zero obs = 1,336
Inflation model = logit Wald chi2(1) = 6.17
Log pseudolikelihood = -1317.692 Prob > chi2 = 0.0130
-------------------------------------------------------------------------------------
| Robust
sum_l2 | Coef. Std. Err. z P>|z| [95% Conf. Interval]
--------------------+----------------------------------------------------------------
sum_l2 |
mud_sqft_per_capita | 2.266131 .9126316 2.48 0.013 .4774059 4.054856
_cons | .5442042 .2637812 2.06 0.039 .0272026 1.061206
--------------------+----------------------------------------------------------------
inflate |
mud_sqft_per_capita | -9.984647 1.612426 -6.19 0.000 -13.14494 -6.824349
owned | -.6016932 .5475992 -1.10 0.272 -1.674968 .4715815
_cons | 1.391733 .4705785 2.96 0.003 .469416 2.31405
--------------------+----------------------------------------------------------------
/lnalpha | 1.802471 .169724 10.62 0.000 1.469818 2.135123
--------------------+----------------------------------------------------------------
alpha | 6.064612 1.02931 4.348442 8.45809
-------------------------------------------------------------------------------------
. margins, dydx(*)
Average marginal effects Number of obs = 1,594
Model VCE : Robust
Expression : Predicted number of events, predict()
dy/dx w.r.t. : mud_sqft_per_capita owned
-------------------------------------------------------------------------------------
| Delta-method
| dy/dx Std. Err. z P>|z| [95% Conf. Interval]
--------------------+----------------------------------------------------------------
mud_sqft_per_capita | 6.844191 1.356783 5.04 0.000 4.184946 9.503436
owned | .2546788 .2433755 1.05 0.295 -.2223285 .731686
-------------------------------------------------------------------------------------
You can see that the marginal effects reported by margins, dydx(*) are now of reasonable magnitude.
I wanted to share my data using dataex, but the dataset is large and exceeds the maximum number of characters. The complete dataset is here: https://drive.google.com/file/d/1cRK...ew?usp=sharing
The following lists the first 100 observations:
Code:
. list in 1/100
+-----------------------------------------------------+
| sum_l2 mud_sq~a office~a retail~a owned |
|-----------------------------------------------------|
1. | 0 .0898069 0 0 .57094592 |
2. | 0 .0035725 0 0 .80785125 |
3. | 1 .0953043 .0008397 0 .34204793 |
4. | 1 0 0 .0742383 .76895308 |
5. | 0 .093042 0 0 .66618496 |
|-----------------------------------------------------|
6. | 0 .068348 0 0 .40298507 |
7. | 0 0 0 0 .81931466 |
8. | 1 .0758335 .0026269 0 .56521738 |
9. | 0 .1456166 .0043934 0 .30172414 |
10. | 0 0 0 0 .4860681 |
|-----------------------------------------------------|
11. | 0 0 0 0 .80697054 |
12. | 0 .0292876 0 0 .44064388 |
13. | 0 0 0 0 .89447236 |
14. | 0 .0066272 0 0 .73400676 |
15. | 0 0 0 0 .85809314 |
|-----------------------------------------------------|
16. | 0 .0046185 .0025944 .0053238 .62865949 |
17. | 0 .0038844 .0026464 0 .6130268 |
18. | 0 .0124808 .0001208 0 .66139239 |
19. | 1 .0056831 .0072896 0 .55831265 |
20. | 0 .013271 0 0 .45341614 |
|-----------------------------------------------------|
21. | 0 .0128877 .0007828 0 .63519311 |
22. | 0 0 0 0 .71408248 |
23. | 2 .026289 0 0 .76016682 |
24. | 0 0 0 0 .89873415 |
25. | 0 0 0 0 .91246682 |
|-----------------------------------------------------|
26. | 0 0 0 0 .71515149 |
27. | 0 .1051657 .000997 0 .24758843 |
28. | 0 .0929398 0 0 .21495327 |
29. | 0 .0373987 0 .0087623 .75645757 |
30. | 1 0 0 .0029182 .65680474 |
|-----------------------------------------------------|
31. | 0 0 0 0 .87826085 |
32. | 0 0 0 0 .90302265 |
33. | 0 0 0 0 .90434784 |
34. | 0 0 0 0 .93874425 |
35. | 0 .2111307 0 0 .05439331 |
|-----------------------------------------------------|
36. | 4 .0051789 0 0 .72692305 |
37. | 0 0 0 0 .92931032 |
38. | 1 0 .0105478 .173447 .84762865 |
39. | 0 0 0 0 .9598214 |
40. | 0 0 0 0 .78552973 |
|-----------------------------------------------------|
41. | 0 .0086437 0 0 .62358278 |
42. | 0 0 0 0 .9366197 |
43. | 0 .0283563 0 0 .81930691 |
44. | 0 0 0 .1779139 .95726496 |
45. | 0 0 .0013528 0 .97846156 |
|-----------------------------------------------------|
46. | 0 .0162672 .0212537 0 .79591835 |
47. | 0 0 0 .0839658 .84812623 |
48. | 0 .070872 .0101204 0 .6792717 |
49. | 0 0 0 .1069245 .69617707 |
50. | 0 0 0 0 .85393256 |
|-----------------------------------------------------|
51. | 0 .0505848 .0426975 .022869 .45894736 |
52. | 0 0 .0099472 .0426096 .68670309 |
53. | 1 .2311722 0 .002651 .06952965 |
54. | 1 0 0 .0411989 .93725491 |
55. | 0 .0880984 .0184813 .0271888 .42549923 |
|-----------------------------------------------------|
56. | 0 0 1.496133 0 .85321099 |
57. | 2 .282277 0 .0589427 .14414415 |
58. | 0 0 .0006414 0 .6652602 |
59. | 13 .2656941 .1226111 .5887356 .35841957 |
60. | 2 .0916511 .0135128 0 .31596452 |
|-----------------------------------------------------|
61. | 0 0 0 0 .85137618 |
62. | 2 .2368419 .0040762 0 .05069124 |
63. | 0 0 0 .0500728 .82165605 |
64. | 24 0 0 0 0 |
65. | 1 .2184411 .0055871 0 .30309987 |
|-----------------------------------------------------|
66. | 0 .1965653 .0323297 .0081131 .19964664 |
67. | 0 .1154042 0 0 .37626776 |
68. | 0 .1270162 .027308 .0632858 .54897958 |
69. | 1 .0138945 0 0 .63136458 |
70. | 0 0 0 0 .88686132 |
|-----------------------------------------------------|
71. | 0 0 0 0 .84666669 |
72. | 5 0 0 .0035174 .83356071 |
73. | 0 .0378089 .0081194 .0117317 .63396782 |
74. | 1 .1965736 .007549 0 .14772727 |
75. | 0 0 0 0 .82617587 |
|-----------------------------------------------------|
76. | 0 0 0 0 .78370786 |
77. | 0 0 0 0 .77880186 |
78. | 0 0 0 0 1 |
79. | 0 .0094124 0 0 .67676765 |
80. | 0 .1000855 .0007475 0 .47757256 |
|-----------------------------------------------------|
81. | 0 0 0 .0061988 .77049178 |
82. | 0 0 0 0 .88834953 |
83. | 0 0 0 0 .81171548 |
84. | 0 0 0 0 .77253217 |
85. | 2 .3587315 .1451557 .0146335 .10459184 |
|-----------------------------------------------------|
86. | 0 .0802478 .0154736 .0465494 .2591241 |
87. | 0 .1259107 0 0 .2429022 |
88. | 0 0 .0146921 .4493715 .82840234 |
89. | 0 0 0 .0142334 .9254902 |
90. | 0 .0522075 0 0 .37804878 |
|-----------------------------------------------------|
91. | 0 .0767034 0 0 .64864862 |
92. | 0 0 .0177341 0 .84394902 |
93. | 0 0 0 0 .73134327 |
94. | 0 0 .0004407 0 .70403588 |
95. | 0 .050782 0 0 .48205128 |
|-----------------------------------------------------|
96. | 0 .0881407 0 0 .41935483 |
97. | 0 .0318145 .0050043 .0561125 .57430339 |
98. | 3 .1423303 .0013297 0 .29166666 |
99. | 0 .0023224 .0111253 .1082391 .60655737 |
100. | 1 .2379044 0 .0040009 .12743823 |
+-----------------------------------------------------+
.
I also noticed that if I reduce the number of observations, margins, dydx(*) again produces marginal effects of reasonable magnitude:
Code:
. drop if !mod(_n,2)
(797 observations deleted)
.
. zinb sum_l2 mud_sqft_per_capita office_sqft_per_capita retail_sqft_per_capita,inflate(mud_sqft_per_capita owned) vce(robust)
Fitting constant-only model:
Iteration 0: log pseudolikelihood = -961.15273 (not concave)
Iteration 1: log pseudolikelihood = -687.27718
Iteration 2: log pseudolikelihood = -656.77257
Iteration 3: log pseudolikelihood = -650.22461
Iteration 4: log pseudolikelihood = -649.32756
Iteration 5: log pseudolikelihood = -649.28166
Iteration 6: log pseudolikelihood = -649.28131
Iteration 7: log pseudolikelihood = -649.28131
Fitting full model:
Iteration 0: log pseudolikelihood = -649.28131
Iteration 1: log pseudolikelihood = -640.25357
Iteration 2: log pseudolikelihood = -638.13366
Iteration 3: log pseudolikelihood = -638.08243
Iteration 4: log pseudolikelihood = -638.08236
Iteration 5: log pseudolikelihood = -638.08236
Zero-inflated negative binomial regression Number of obs = 797
Nonzero obs = 125
Zero obs = 672
Inflation model = logit Wald chi2(3) = 24.62
Log pseudolikelihood = -638.0824 Prob > chi2 = 0.0000
----------------------------------------------------------------------------------------
| Robust
sum_l2 | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-----------------------+----------------------------------------------------------------
sum_l2 |
mud_sqft_per_capita | -.8490054 .70847 -1.20 0.231 -2.237581 .5395703
office_sqft_per_capita | 3.356185 1.067579 3.14 0.002 1.263769 5.448601
retail_sqft_per_capita | 2.473402 .8377714 2.95 0.003 .8313999 4.115403
_cons | .8226928 .3300019 2.49 0.013 .1759011 1.469485
-----------------------+----------------------------------------------------------------
inflate |
mud_sqft_per_capita | -10.9113 2.320612 -4.70 0.000 -15.45962 -6.362985
owned | -.312362 .694893 -0.45 0.653 -1.674327 1.049603
_cons | 1.66365 .5588054 2.98 0.003 .5684117 2.758889
-----------------------+----------------------------------------------------------------
/lnalpha | 1.539509 .2407484 6.39 0.000 1.067651 2.011367
-----------------------+----------------------------------------------------------------
alpha | 4.6623 1.122442 2.908538 7.473528
----------------------------------------------------------------------------------------
. margins, dydx(*)
Average marginal effects Number of obs = 797
Model VCE : Robust
Expression : Predicted number of events, predict()
dy/dx w.r.t. : mud_sqft_per_capita office_sqft_per_capita retail_sqft_per_capita owned
----------------------------------------------------------------------------------------
| Delta-method
| dy/dx Std. Err. z P>|z| [95% Conf. Interval]
-----------------------+----------------------------------------------------------------
mud_sqft_per_capita | 4.90122 1.353754 3.62 0.000 2.24791 7.55453
office_sqft_per_capita | 4.170577 2.461296 1.69 0.090 -.6534743 8.994629
retail_sqft_per_capita | 3.073583 1.065935 2.88 0.004 .9843888 5.162777
owned | .1705116 .3809638 0.45 0.654 -.5761638 .917187
----------------------------------------------------------------------------------------
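One check I could run, offered as a sketch rather than a diagnosis: the quantity that margins averages here is the predicted number of events, which in a ZINB model is (1 - Pr(always zero)) * exp(xb). A handful of observations with very large covariate values could therefore dominate the average. In the listing above, for example, observation 56 has office_sqft_per_capita = 1.496133, while most listed values are below 0.2. The commands below assume they are run right after the full-sample zinb fit (1,594 observations), before any rows are dropped; nhat is a hypothetical variable name, and whether extreme predictions are actually the cause is an assumption to be verified.

```stata
* Sketch only: run after the full-sample zinb fit, before dropping observations.
* nhat is a hypothetical name for the predicted number of events.
predict double nhat, n

* Look for a few extremely large predicted counts
summarize nhat, detail

* Inspect the observations with the largest predictions
gsort -nhat
list sum_l2 mud_sqft_per_capita office_sqft_per_capita retail_sqft_per_capita owned nhat in 1/10

* Check the covariates for extreme values, since they enter through exp(xb)
summarize office_sqft_per_capita retail_sqft_per_capita, detail

* Marginal effects evaluated at the covariate means rather than averaged over observations
margins, dydx(*) atmeans
```

If summarize shows a maximum of nhat that is many orders of magnitude above its median, the average marginal effects would be driven by those few rows, and comparing them with the atmeans results would show how much.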
