  • Understanding ZINB & Post Estimation better

    I have a dataset with 3,810 zero observations and 906 nonzero observations.

    Code:
    . zinb dp_ mig2gross_2016 popden per_vacrent medrent, inflate(mig2gross_2016 popden per_vac
    > rent medrent)  zip
    
    ....
    
    
    Zero-inflated negative binomial regression      Number of obs     =      4,716
                                                    Nonzero obs       =        906
                                                    Zero obs          =      3,810
    
    Inflation model = logit                         LR chi2(4)        =     674.94
    Log likelihood  = -7027.052                     Prob > chi2       =     0.0000
    
    --------------------------------------------------------------------------------
               dp_ |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ---------------+----------------------------------------------------------------
    dp_            |
    mig2gross_2016 |   .0013656   .0000791    17.26   0.000     .0012106    .0015207
            popden |   .0000187   6.47e-06     2.89   0.004     5.99e-06    .0000313
       per_vacrent |   .1064657   .0395465     2.69   0.007     .0289561    .1839754
           medrent |   .0004327   .0001698     2.55   0.011     .0000999    .0007656
             _cons |   3.867626   .2184873    17.70   0.000     3.439399    4.295853
    ---------------+----------------------------------------------------------------
    inflate        |
    mig2gross_2016 |  -.0206951   .0012508   -16.54   0.000    -.0231467   -.0182435
            popden |  -5.82e-06   .0000493    -0.12   0.906    -.0001025    .0000908
       per_vacrent |  -.1528612    .045326    -3.37   0.001    -.2416986   -.0640238
           medrent |  -.0034237   .0002269   -15.09   0.000    -.0038685    -.002979
             _cons |    5.86385   .2429657    24.13   0.000     5.387646    6.340054
    ---------------+----------------------------------------------------------------
          /lnalpha |   .1620231   .0510639     3.17   0.002     .0619397    .2621065
    ---------------+----------------------------------------------------------------
             alpha |   1.175887   .0600454                      1.063898    1.299665
    --------------------------------------------------------------------------------
    Likelihood-ratio test of alpha=0: chibar2(01) =  3.0e+05 Pr>=chibar2 =  0.0000
    
    
    
    
    . margins, dydx(*)
    
    Average marginal effects                        Number of obs     =      4,716
    Model VCE    : OIM
    
    Expression   : Predicted number of events, predict()
    dy/dx w.r.t. : mig2gross_2016 popden per_vacrent medrent
    
    --------------------------------------------------------------------------------
                   |            Delta-method
                   |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ---------------+----------------------------------------------------------------
    mig2gross_2016 |   1.787321   .8142064     2.20   0.028     .1915056    3.383136
            popden |   .0226598   .0108408     2.09   0.037     .0014123    .0439074
       per_vacrent |   129.9724   70.19302     1.85   0.064    -7.603427    267.5481
           medrent |    .546259   .3278685     1.67   0.096    -.0963515    1.188869
    --------------------------------------------------------------------------------
    I have three questions:
    First, I understand that the coefficient is the increase in the log of the expected count as a function of the predictor variables, but I can barely tell what the impact of that is just by looking at it. I get that you can exponentiate the coefficients and interpret them that way, so mig2gross_2016's exponentiated coefficient is 1.001, and a one-unit increase in mig2gross_2016 is then a 1.001 increase in dp_. Is that a correct interpretation, and is there a command so that Stata produces the exponentiated coefficients, or do I have to do that by hand?


    Second: I want to look at mig2gross at different levels (0, 1, 100, 1000), because it also has a lot of zeros, and I want to see the marginal change at those different levels. But I can only get margins to work after zinb as margins, dydx(*). Any suggestions for the right code, and am I interpreting margins correctly?

    Third: I have dummy variables for state (10 categories) that I originally wanted to treat as multiple levels, but I haven't found a way to do that for ZINB. Is it better to just include them in the equation like:

    Code:
     zinb dp_ mig2gross_2016 popden per_vacrent medrent i.state_n, inflate(mig2gross_2016 popden per_vacrent medrent)
    because that just iterates forever; any suggestions on that?

  • #2
    Hi Nora,

    Regarding your first question: since you used the -margins- command to get the marginal effect of mig2gross_2016, why do you need to exponentiate anything? You are ready to interpret that result directly, i.e. on average, increasing mig2gross_2016 by one unit increases the expected rate of dp_ by 1.79, with the other variables held constant. Another way to interpret the model is in terms of factor changes in E(y|x), where y is your outcome and x is a set of covariates, using the -irr- option.
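
    If you do want the exponentiated coefficients themselves, you don't have to compute them by hand. A minimal sketch (as far as I know, reporting options such as -irr- can also be specified when replaying results, so no refit is needed):

    Code:
    * refit, reporting incidence rate ratios in the count equation
    zinb dp_ mig2gross_2016 popden per_vacrent medrent, inflate(mig2gross_2016 popden per_vacrent medrent) irr
    * or simply replay the model you already fit
    zinb, irr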

    For your second question, you may want to try this command after -zinb- regression
    Code:
    margins, at(mig2gross=(0 1 100 1000)) atmeans
    On your third question, you may want to try the -difficult- option with -zinb-; that may help. However, I am wondering why you didn't put i.state_n into the -inflate()- part. Is there a reason behind that?
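
    For example (a sketch only; -difficult- is a standard maximization option that uses a different stepping algorithm in nonconcave regions, and it may or may not help here):

    Code:
    zinb dp_ mig2gross_2016 popden per_vacrent medrent i.state_n, inflate(mig2gross_2016 popden per_vacrent medrent i.state_n) difficult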

    Since your data contain lots of zeros, hurdle models could also be a good option to consider relative to -zinb-.

    DL



    • #3
      Thanks DL! Your feedback is super helpful!

      Two follow-up questions:

      Do you know how I would compare / determine if a hurdle model would be better than a ZINB? I did the ZIP option so I know it is better than that, but I'm not sure how to compare it to a hurdle model.

      Also, if the margins command as I have listed above shows that increasing mig2gross_2016 by one unit increases the expected rate of dp_ by 1.79, why is that different from the exponentiated coefficient of 1.001?



      • #4
        Originally posted by Nora Romeo View Post
        Thanks DL! Your feedback is super helpful!

        Two follow-up questions:

        Do you know how I would compare / determine if a hurdle model would be better than a ZINB? I did the ZIP option so I know it is better than that, but I'm not sure how to compare it to a hurdle model.

        Also, if the margins command as I have listed above shows that increasing mig2gross_2016 by one unit increases the expected rate of dp_ by 1.79, why is that different from the exponentiated coefficient of 1.001?
        If I can add to Dung's explanation: I believe a zero-inflated model simultaneously models the probability that an observation belongs to a latent class that only produces Y = 0, and the probability that it belongs to a latent class with a negative binomial or Poisson response function. (NB: that second class can produce occasional zeroes as well! You're just assuming that some respondents are structural zeroes, i.e. that they will always produce a zero.)

        Nora could have added the irr option to the original command to report the negative binomial coefficients as incidence rate ratios, which are intuitive. However, I'm pretty sure that this is an incidence rate conditional on membership in the class with the negative binomial response, i.e. on not being in the structural-zero class. Maybe margins is the better tool to produce estimates that a broad range of people can understand.

        Dung said this about the output from margins:

        i.e. on average, increasing mig2gross_2016 by one unit increases the expected rate of dp_ by 1.79
        It might be clearer to say that a one-unit increase in mig2gross increases the expected count of dp_ by 1.79 units. Rate usually means the number of events in a specified population per unit time, and it's often standardized by some count, e.g. number of deaths per 10,000 person-years. This may depend on what dp_ actually is.

        If you use the irr option, it appears to exponentiate only the coefficients in the negative binomial or Poisson part of the model. I'm open to correction if I'm wrong, but I think that by exponentiating the coefficients in the inflate part of the model, which models the probability of being in the structural-zero class, you would get odds ratios. That part of the ZINB model is basically a logistic model.
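
        If you wanted those odds ratios, something like this should work after -zinb- (a sketch; inflate is the equation name shown in the output above, and -lincom-'s or option simply exponentiates the estimate):

        Code:
        * odds ratio (with CI) for structural-zero membership per unit of mig2gross_2016
        lincom [inflate]mig2gross_2016, or
        * or by hand from the stored coefficient
        display exp(_b[inflate:mig2gross_2016])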

        I have no experience with hurdle models, so I won't comment about those.
        Last edited by Weiwen Ng; 15 Apr 2020, 09:00.
        Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

        When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



        • #5
          Thank you Weiwen Ng for the detailed explanation.

          To Nora
          Originally posted by Nora Romeo View Post
          Do you know how I would compare / determine if a hurdle model would be better than a ZINB? I did the ZIP option so I know it is better than that, but I'm not sure how to compare it to a hurdle model.
          There are several ways to get what you want. One of the most straightforward, I think, is to compare the AIC, BIC, and log likelihood produced by the two models, and it is not difficult to obtain those information criteria. In addition, because hurdle models and -zinb- are non-nested, you can use a Vuong test to compare the two models.
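
          For example, -estat ic- reports AIC and BIC after each estimation command. Stata has no one-line count hurdle command, but a common way to assemble one is a logit for any-vs-no events plus a zero-truncated negative binomial (-tnbreg-) for the positive counts. A rough sketch (the any_dp variable and the by-hand AIC/BIC arithmetic are illustrative, not output from this thread):

          Code:
          * information criteria for the ZINB
          zinb dp_ mig2gross_2016 popden per_vacrent medrent, inflate(mig2gross_2016 popden per_vacrent medrent)
          estat ic

          * hurdle part 1: logit for zero vs. positive counts
          gen byte any_dp = (dp_ > 0) if !missing(dp_)
          logit any_dp mig2gross_2016 popden per_vacrent medrent
          scalar ll1 = e(ll)
          scalar k1  = e(rank)

          * hurdle part 2: zero-truncated negative binomial for the positive counts
          tnbreg dp_ mig2gross_2016 popden per_vacrent medrent if dp_ > 0
          scalar ll2 = e(ll)
          scalar k2  = e(rank)

          * hurdle log likelihood, AIC, and BIC (N = 4,716 from the output above)
          display "ll  = " ll1 + ll2
          display "AIC = " -2*(ll1 + ll2) + 2*(k1 + k2)
          display "BIC = " -2*(ll1 + ll2) + ln(4716)*(k1 + k2)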

          Hope that helps



          • #6
            Okay, let me make sure I'm understanding this correctly. From the code:

            Code:
            . zinb dp_ mig2gross_2015 popden per_vacrent medrent, inflate(mig2gross_2015 popden per_v
            > acrent medrent) irr
            
            Fitting constant-only model:
            
            Iteration 0:   log likelihood = -11750.491  (not concave)
            Iteration 1:   log likelihood = -8595.9891  
            Iteration 2:   log likelihood = -8041.0837  
            Iteration 3:   log likelihood = -7589.8118  
            Iteration 4:   log likelihood = -7421.3286  
            Iteration 5:   log likelihood = -7364.3067  
            Iteration 6:   log likelihood = -7351.4326  
            Iteration 7:   log likelihood = -7351.0042  
            Iteration 8:   log likelihood = -7351.0038  
            
            Fitting full model:
            
            Iteration 0:   log likelihood = -7351.0038  
            Iteration 1:   log likelihood = -7191.7073  
            Iteration 2:   log likelihood = -7071.2588  
            Iteration 3:   log likelihood = -7038.9217  
            Iteration 4:   log likelihood = -7038.0647  
            Iteration 5:   log likelihood = -7038.0609  
            Iteration 6:   log likelihood = -7038.0609  
            
            Zero-inflated negative binomial regression      Number of obs     =      4,716
                                                            Nonzero obs       =        906
                                                            Zero obs          =      3,810
            
            Inflation model = logit                         LR chi2(4)        =     625.89
            Log likelihood  = -7038.061                     Prob > chi2       =     0.0000
            
            --------------------------------------------------------------------------------
                       dp_ |        IRR   Std. Err.      z    P>|z|     [95% Conf. Interval]
            ---------------+----------------------------------------------------------------
            dp_            |
            mig2gross_2015 |   1.001338   .0000822    16.29   0.000     1.001177    1.001499
                    popden |   1.000014   6.76e-06     2.07   0.039     1.000001    1.000027
               per_vacrent |   1.116446   .0469465     2.62   0.009     1.028122    1.212357
                   medrent |   1.000351   .0001748     2.01   0.045     1.000008    1.000694
                     _cons |   54.40318   12.33194    17.63   0.000     34.88804     84.8344
            ---------------+----------------------------------------------------------------
            inflate        |
            mig2gross_2015 |  -.0221026    .001496   -14.77   0.000    -.0250346   -.0191705
                    popden |  -.0000427    .000045    -0.95   0.342    -.0001308    .0000454
               per_vacrent |  -.1326346   .0484116    -2.74   0.006    -.2275196   -.0377496
                   medrent |  -.0033009   .0002284   -14.45   0.000    -.0037485   -.0028533
                     _cons |   5.766372   .2458235    23.46   0.000     5.284567    6.248177
            ---------------+----------------------------------------------------------------
                  /lnalpha |   .2148961   .0541666     3.97   0.000     .1087315    .3210607
            ---------------+----------------------------------------------------------------
                     alpha |   1.239733   .0671521                      1.114863    1.378589
            --------------------------------------------------------------------------------
            Note: Estimates are transformed only in the first equation.
            Note: _cons estimates baseline incidence rate.
            
            . margins, at(mig2gross=(0 1 100 1000)) atmeans'
            option ' not allowed
            r(198);
            
            . margins, at(mig2gross=(0 1 100 1000)) atmeans
            
            Adjusted predictions                            Number of obs     =      4,716
            Model VCE    : OIM
            
            Expression   : Predicted number of events, predict()
            
            1._at        : mig2gro~2015    =           0
                           popden          =    676.7591 (mean)
                           per_vacrent     =    1.812826 (mean)
                           medrent         =    816.6234 (mean)
            
            2._at        : mig2gro~2015    =           1
                           popden          =    676.7591 (mean)
                           per_vacrent     =    1.812826 (mean)
                           medrent         =    816.6234 (mean)
            
            3._at        : mig2gro~2015    =         100
                           popden          =    676.7591 (mean)
                           per_vacrent     =    1.812826 (mean)
                           medrent         =    816.6234 (mean)
            
            4._at        : mig2gro~2015    =        1000
                           popden          =    676.7591 (mean)
                           per_vacrent     =    1.812826 (mean)
                           medrent         =    816.6234 (mean)
            
            ------------------------------------------------------------------------------
                         |            Delta-method
                         |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                     _at |
                      1  |   5.112411   .5241168     9.75   0.000     4.085162    6.139661
                      2  |   5.226974   .5321676     9.82   0.000     4.183944    6.270003
                      3  |    36.3785   3.654438     9.95   0.000     29.21594    43.54107
                      4  |   340.1172   25.48888    13.34   0.000     290.1599    390.0745
            ------------------------------------------------------------------------------

            My interpretation is:
            1. With the IRR option, I am getting the exponentiated coefficients, so a one-unit increase in mig2gross_2016 is a 1.001 increase in dp_.
            2. Using margins at different values of mig2gross_2016, we see a smaller impact on dp_ at 1 than at 100, and at 100 than at 1000 (I think I still need some help explaining this a little better).

            Thanks again everyone, you have all been insanely helpful!



            • #7
              Originally posted by Nora Romeo View Post
              ...

              Code:
              ...

              My interpretation is:
              1. With the IRR option, I am getting the exponentiated coefficients, so a one-unit increase in mig2gross_2016 is a 1.001 increase in dp_.
              ...
              I think that it is something like this: a one-unit increase in mig2gross_2016 is associated with an IRR of 1.001, i.e. it multiplies the expected count by 1.001, conditional on being in the non-structural-zero group. I am open to correction if I'm wrong, but I'm pretty sure that the quote above is not correct.

              2. Using margins at different values of mig2gross_2016, we see a smaller impact on dp_ at 1 than at 100, and at 100 than at 1000 (I think I still need some help explaining this a little better)
              Again, I'm pretty sure that you're just showing the predicted count of dp_ at those 4 values of mig2gross, holding all covariates at the sample means. Those values are pretty widely separated!

              Note: I think that omitting the atmeans option is also acceptable. If you omit it, all the other covariates will be left at their observed values.
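
              Something like this sketch contrasts the two (the first call predicts for a single observation with every covariate at its mean; the second averages predictions over the sample as observed):

              Code:
              margins, at(mig2gross_2015=(0 1 100 1000)) atmeans
              margins, at(mig2gross_2015=(0 1 100 1000))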



              • #8
                Guys, thanks so much this has been amazing!



                • #9
                  Originally posted by Nora Romeo View Post
                  Guys, thanks so much this has been amazing!
                  You're welcome. Before I forget, here is a very good explanation of the margins command by Richard Williams, who frequently posts here. The nice thing about margins is that it converts everything into 'natural' units, i.e. probabilities for any probability model and counts for any count model; you aren't left to wonder how big an odds ratio really is. It also has some very nice automated plotting capability (marginsplot) after you estimate the margins.
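
                  For instance, after any of the -margins- calls above, the plot is one more line (a sketch):

                  Code:
                  margins, at(mig2gross_2015=(0 1 100 1000))
                  marginsplot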



                  • #10
                    That is a great resource, thanks!

                    I just wanted to check my understanding with someone else. So I used the code below (I trimmed out some of the results):

                    Code:
                    zinb dp_ mig2gross_2016 theme1 theme2 theme3 theme4 i.time_n, inflate(theme1 theme2 theme3 theme4 i.time_n i.state_n)  irr
                    
                    ....
                    
                    . margins time_n#state_n
                    numerical derivatives are approximate
                    flat or discontinuous region encountered
                    numerical derivatives are approximate
                    flat or discontinuous region encountered
                    numerical derivatives are approximate
                    flat or discontinuous region encountered
                    numerical derivatives are approximate
                    flat or discontinuous region encountered
                    
                    Predictive margins                              Number of obs     =      4,716
                    Model VCE    : OIM
                    
                    Expression   : Predicted number of events, predict()
                    
                    
                    ....
                    
                    
                    marginsplot, noci scheme(s1mono) legend(position(1) ring(0))
                    [Attachment: Graph.png - a marginsplot of the predicted counts by time_n and state_n]

                    So the graph I ended up with is above. My interpretation is that there are more events (counts) in the average destination county in the second month after the event, and that while different states influence the likelihood of having more or fewer events, they all follow roughly the same trajectory in terms of how time influences counts of movements. Does that sound right?

                    And would you say that this starts to address some of the aspects that I would be showing in a multilevel model, where I'd have different levels for time and state?

                    Thanks again everyone



                    • #11
                      Originally posted by Nora Romeo View Post
                      That is a great resource, thanks!

                      ...
                      So the graph I ended up with is above. My interpretation is that there are more events (counts) in the average destination county in the second month after the event, and that while different states influence the likelihood of having more or fewer events, they all follow roughly the same trajectory in terms of how time influences counts of movements. Does that sound right?

                      And would you say that this starts to address some of the aspects that I would be showing in a multilevel model, where I'd have different levels for time and state?
                      Your statistical model is actually assuming that every state follows the same trend over time, and the way the graph looks is an inevitable consequence of that assumption. Every state gets the same multiplicative bump in its incidence rate in October 2017. State 3 looks like it doesn't, but that's likely because it gets very few events.

                      Earlier, I think you said that when you included state as a fixed effect, the model iterated forever, i.e. it failed to converge, and I think we failed to really dig into that. In the raw data, how many events does state 3 get? If it gets exactly zero events, then the MLE for its beta is negative infinity, which might cause convergence trouble. I also see that the margins output you posted says something about a flat or discontinuous region; I'm not exactly sure what that means, but it would cause me some concern. In practice, Stata might estimate the parameter at something like -15 (see the toy example below).
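
                      To check the raw data, -tabstat- will show each state's event total. And here is a minimal sketch of the all-zeros phenomenon on simulated data (hypothetical numbers; the exact value Stata stops at can vary by version and estimator):

                      Code:
                      * events contributed by each state in the raw data
                      tabstat dp_, by(state_n) statistics(sum count mean)

                      * toy illustration (run in a fresh session; -clear- drops the data in memory):
                      * the MLE for a category with no events is negative infinity
                      clear
                      set obs 3000
                      set seed 2020
                      gen byte state = ceil(_n/1000)    // three "states" of 1,000 obs each
                      gen y = rpoisson(2)               // counts with mean 2
                      replace y = 0 if state == 3       // state 3 never has an event
                      poisson y i.state                 // 3.state's estimate drifts far negative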

                      I can't really comment on multilevel models. I don't believe there is a stock Stata command to fit ZINB models with random effects. You can fit a ZINB model in the generalized structural equation modeling command (-gsem-), and you can include random effects in gsem. However, I haven't tried to fit that sort of model before.
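
                      As an aside, for a plain (non-inflated) negative binomial there is a stock mixed-effects command, which might at least serve as a point of comparison even though it does not model the excess zeros (a sketch):

                      Code:
                      * mixed-effects negative binomial with a random intercept for state (NOT zero-inflated)
                      menbreg dp_ mig2gross_2016 popden per_vacrent medrent || state_n: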



                      • #12
                        Okay, thank you so much. One of the states was entirely populated by zeros because I had messed up the data, so I really appreciate you pointing that out. The model now runs with both state and time as part of the ZINB.

                        My question is, and apologies if I'm interpreting this incorrectly: if we can use marginal effects to estimate the mean of dp_ at different states and times, doesn't that give some indication of their varied effects on the outcome? Thanks!

                        Code:
                        . zinb dp_ mig2gross_2016 theme1 theme2 theme3 theme4 i.time_n i.state_n, inflate(theme1 th
                        > eme2 theme3 theme4 i.time_n i.state_n) zip irr 
                        
                        Fitting zip model:
                        
                        Iteration 0:   log likelihood = -383890.35  
                        Iteration 1:   log likelihood = -166014.75  
                        Iteration 2:   log likelihood = -80389.065  
                        Iteration 3:   log likelihood = -73689.535  
                        Iteration 4:   log likelihood = -73656.566  
                        Iteration 5:   log likelihood = -73656.553  
                        Iteration 6:   log likelihood = -73656.553  
                        
                        Fitting constant-only model:
                        
                        Iteration 0:   log likelihood = -12052.075  (not concave)
                        Iteration 1:   log likelihood = -8798.9175  
                        Iteration 2:   log likelihood = -8035.7076  
                        Iteration 3:   log likelihood = -7775.0397  
                        Iteration 4:   log likelihood = -7766.3645  
                        Iteration 5:   log likelihood = -7766.3222  
                        Iteration 6:   log likelihood = -7766.3222  
                        
                        Fitting full model:
                        
                        Iteration 0:   log likelihood = -7766.3222  (not concave)
                        Iteration 1:   log likelihood = -7578.7132  (not concave)
                        Iteration 2:   log likelihood = -7395.2882  (not concave)
                        Iteration 3:   log likelihood = -7291.3798  
                        Iteration 4:   log likelihood = -7069.6876  
                        Iteration 5:   log likelihood = -7010.7578  
                        Iteration 6:   log likelihood = -7008.8553  
                        Iteration 7:   log likelihood = -7008.8512  
                        Iteration 8:   log likelihood = -7008.8512  
                        
                        Zero-inflated negative binomial regression      Number of obs     =      4,716
                                                                        Nonzero obs       =        960
                                                                        Zero obs          =      3,756
                        
                        Inflation model = logit                         LR chi2(19)       =    1514.94
                        Log likelihood  = -7008.851                     Prob > chi2       =     0.0000
                        
                        --------------------------------------------------------------------------------
                                   dp_ |        IRR   Std. Err.      z    P>|z|     [95% Conf. Interval]
                        ---------------+----------------------------------------------------------------
                        dp_            |
                        mig2gross_2016 |   1.001053   .0000559    18.84   0.000     1.000943    1.001162
                                theme1 |   .8276043   .1616495    -0.97   0.333     .5643702    1.213616
                                theme2 |    1.24371   .2050724     1.32   0.186     .9002568    1.718192
                                theme3 |   26.25199   5.268545    16.28   0.000     17.71469     38.9037
                                theme4 |   .8976557   .1200949    -0.81   0.420     .6906053    1.166782
                                       |
                                time_n |
                               201710  |   1.266187   .0896764     3.33   0.001     1.102078    1.454733
                               201711  |   .8812726   .0640859    -1.74   0.082     .7642072    1.016271
                               201712  |   .8009891   .0595075    -2.99   0.003     .6924504    .9265408
                               201801  |   .5220999   .0414913    -8.18   0.000     .4467953    .6100966
                               201802  |   .3539045   .0303242   -12.12   0.000     .2991927     .418621
                                       |
                               state_n |
                                            2  |   3.028161   .4480677     7.49   0.000     2.265841    4.046958
                                            3  |   4.390404   .4837528    13.43   0.000     3.537656    5.448706
                                            4  |   1.980917   .2848017     4.75   0.000     1.494468    2.625704
                                            5  |   3.771753    .530341     9.44   0.000     2.863235    4.968549
                                            6  |   2.865479   .3175786     9.50   0.000     2.305999    3.560698
                                            7  |   3.316452   .3717148    10.70   0.000     2.662376    4.131219
                                            8  |    3.66804   .4694675    10.15   0.000     2.854236    4.713876
                                            9  |   1.566243   .1716763     4.09   0.000     1.263452    1.941599
                                           10  |   1.215705   .1457821     1.63   0.103     .9610722    1.537803
                                       |
                                 _cons |   3.840601   .7682528     6.73   0.000     2.594953    5.684194
                        ---------------+----------------------------------------------------------------
                        inflate        |
                                theme1 |    4.59789   .3600309    12.77   0.000     3.892242    5.303537
                                theme2 |   .0173825   .2732878     0.06   0.949    -.5182517    .5530168
                                theme3 |   -9.28391   .4102645   -22.63   0.000    -10.08801   -8.479806
                                theme4 |  -.9647324   .2490078    -3.87   0.000    -1.452779   -.4766862
                                       |
                                time_n |
                               201710  |   -.622303   .1694814    -3.67   0.000    -.9544805   -.2901255
                               201711  |  -.3098944   .1720491    -1.80   0.072    -.6471043    .0273156
                               201712  |  -.0764986   .1744496    -0.44   0.661    -.4184136    .2654164
                               201801  |   .5445079   .1830231     2.98   0.003     .1857893    .9032265
                               201802  |   1.085149   .1932292     5.62   0.000      .706427    1.463871
                                       |
                               state_n |
                                            2  |  -4.289617   .4817643    -8.90   0.000    -5.233857   -3.345376
                                            3  |  -3.497795   .2252531   -15.53   0.000    -3.939283   -3.056307
                                            4  |  -.9105593   .2831871    -3.22   0.001    -1.465596   -.3555227
                                            5  |  -2.967156     .34299    -8.65   0.000    -3.639404   -2.294908
                                            6  |  -3.177949     .30785   -10.32   0.000    -3.781324   -2.574574
                                            7  |  -3.407486   .2709197   -12.58   0.000    -3.938478   -2.876493
                                            8  |  -3.873965   .2776563   -13.95   0.000    -4.418162   -3.329769
                                            9  |   .4535625   .1926122     2.35   0.019     .0760496    .8310755
                                           10  |  -.4886507    .228459    -2.14   0.032    -.9364221   -.0408793
                                       |
                                 _cons |   8.066146   .4101508    19.67   0.000     7.262265    8.870027
                        ---------------+----------------------------------------------------------------
                              /lnalpha |  -.8113674   .0439079   -18.48   0.000    -.8974252   -.7253096
                        ---------------+----------------------------------------------------------------
                                 alpha |   .4442502   .0195061                      .4076179    .4841747



                        • #13
                          Originally posted by Nora Romeo View Post
                          Okay, thank you so much. One of the states was entirely populated by zeros because I had messed up the data, so I really appreciate you pointing that out. The model now runs with both state and time as part of the ZINB.

                          My question is, and apologies if I'm interpreting this incorrectly: if we can use marginal effects to estimate the mean of dp_ at different states and times, doesn't that give some indication of their varied effects on the outcome? Thanks!

                          ...
                          Glad you uncovered a data error!

                          To the second question, let me re-state things.

                          The model you are estimating predicts the mean count (aka expected count) of dp_.

                          In the model, you are assuming that each state has the same proportional increase in the expected count at time = 2 (and 3, 4, etc.). So each state's underlying mean count gets multiplied by the same factor (the IRR reported for 201710).
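
                          As a concrete illustration with the 201710 IRR of about 1.266 from your output (the baseline count of 5 is just an assumed round number):

                          Code:
                          * a state expecting 5 events at baseline expects about 6.33 events in 201710
                          display 5 * 1.266187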

                          In that sense, the outcome varies. You're right.

                          Alternatively, each state could get a different effect on its incidence rate in each time period. If a random-effects version of this model exists, then that model inherently assumes that each state's trend is a weighted average of the global mean trend and the state's own trend (states with fewer observations will look more like the mean trend). Alternatively, if you interacted state with time, you might see different trends - I'm not suggesting you do that, because I'm not sure you have the data to support it, and it's a lot of interaction terms!
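
                          For reference only, state-specific time trends would look something like the sketch below; with 6 time periods and 10 states that is 45 extra interaction terms, which is why I doubt your data would support it:

                          Code:
                          * fully interacted state-by-time trends in the count equation
                          zinb dp_ mig2gross_2016 theme1 theme2 theme3 theme4 i.time_n##i.state_n, inflate(theme1 theme2 theme3 theme4 i.time_n i.state_n) irr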

                          Maybe some of the misunderstanding is just complicated semantics plus the fact that I (and many others) don't always write precisely on the internet.

