  • Understanding ZINB & Post Estimation better

    I have a dataset with 3,810 zero observations and 906 nonzero observations.

    Code:
    . zinb dp_ mig2gross_2016 popden per_vacrent medrent, inflate(mig2gross_2016 popden per_vac
    > rent medrent)  zip
    
    ....
    
    
    Zero-inflated negative binomial regression      Number of obs     =      4,716
                                                    Nonzero obs       =        906
                                                    Zero obs          =      3,810
    
    Inflation model = logit                         LR chi2(4)        =     674.94
    Log likelihood  = -7027.052                     Prob > chi2       =     0.0000
    
    --------------------------------------------------------------------------------
               dp_ |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ---------------+----------------------------------------------------------------
    dp_            |
    mig2gross_2016 |   .0013656   .0000791    17.26   0.000     .0012106    .0015207
            popden |   .0000187   6.47e-06     2.89   0.004     5.99e-06    .0000313
       per_vacrent |   .1064657   .0395465     2.69   0.007     .0289561    .1839754
           medrent |   .0004327   .0001698     2.55   0.011     .0000999    .0007656
             _cons |   3.867626   .2184873    17.70   0.000     3.439399    4.295853
    ---------------+----------------------------------------------------------------
    inflate        |
    mig2gross_2016 |  -.0206951   .0012508   -16.54   0.000    -.0231467   -.0182435
            popden |  -5.82e-06   .0000493    -0.12   0.906    -.0001025    .0000908
       per_vacrent |  -.1528612    .045326    -3.37   0.001    -.2416986   -.0640238
           medrent |  -.0034237   .0002269   -15.09   0.000    -.0038685    -.002979
             _cons |    5.86385   .2429657    24.13   0.000     5.387646    6.340054
    ---------------+----------------------------------------------------------------
          /lnalpha |   .1620231   .0510639     3.17   0.002     .0619397    .2621065
    ---------------+----------------------------------------------------------------
             alpha |   1.175887   .0600454                      1.063898    1.299665
    --------------------------------------------------------------------------------
    Likelihood-ratio test of alpha=0: chibar2(01) =  3.0e+05 Pr>=chibar2 =  0.0000
    
    
    
    
    . margins, dydx(*)
    
    Average marginal effects                        Number of obs     =      4,716
    Model VCE    : OIM
    
    Expression   : Predicted number of events, predict()
    dy/dx w.r.t. : mig2gross_2016 popden per_vacrent medrent
    
    --------------------------------------------------------------------------------
                   |            Delta-method
                   |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ---------------+----------------------------------------------------------------
    mig2gross_2016 |   1.787321   .8142064     2.20   0.028     .1915056    3.383136
            popden |   .0226598   .0108408     2.09   0.037     .0014123    .0439074
       per_vacrent |   129.9724   70.19302     1.85   0.064    -7.603427    267.5481
           medrent |    .546259   .3278685     1.67   0.096    -.0963515    1.188869
    --------------------------------------------------------------------------------
    I have three questions:
    First, I understand that the coefficient is the increase in the log of the expected count as a function of the predictor variables, but I can barely tell what the impact of that is just by looking at it. I get that you can exponentiate the coefficients and interpret them that way, so mig2gross_2016's exponentiated coefficient is 1.001, and a one-unit increase in mig2gross_2016 is then a 1.001 increase in dp_. Is that a correct interpretation, and is there a command so that Stata produces the exponentiated coefficients, or do I have to do that by hand?


    Second: I want to look at mig2gross at different levels (0, 1, 100, 1000), because it also has a lot of zeros, and I want to see the marginal change at those different levels. But I can only get margins to work after zinb as margins, dydx(*). Any suggestions for the right code, and am I interpreting margins correctly?

    Third: I have dummy variables for state (10 categories) that I originally wanted to treat as multiple levels, but I haven't found a way to do that for ZINB. Is it better to just include them in the equation like:

    Code:
     zinb dp_ mig2gross_2016 popden per_vacrent medrent i.state_n, inflate(mig2gross_2016 popden per_vacrent medrent)
    because that just iterates forever; any suggestions on that?

  • #2
    Hi Nora,

    Regarding your first question: since you used the -margins- command to get the marginal effect of mig2gross_2016, why do you need to exponentiate anything? You are ready to interpret that result directly, i.e. on average, increasing mig2gross_2016 by one unit increases the expected rate of dp_ by 1.79, with the other variables held constant. Another way to interpret the model is in terms of factor changes in E(y|x), where y is your outcome and x is a set of covariates, using the -irr- option.
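
    If you do want the exponentiated coefficients themselves, you don't have to compute them by hand. A minimal sketch (as far as I know, reporting options such as -irr- can also be specified when replaying results, so no refit is needed):

    Code:
    * refit, reporting incidence rate ratios in the count equation
    zinb dp_ mig2gross_2016 popden per_vacrent medrent, inflate(mig2gross_2016 popden per_vacrent medrent) irr
    * or simply replay the model you already fit
    zinb, irr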

    For your second question, you may want to try this command after -zinb- regression
    Code:
    margins, at(mig2gross=(0 1 100 1000)) atmeans
    On your third question, you may want to try the -difficult- option with -zinb-; that may help. However, I am wondering why you didn't put i.state_n into the -inflate()- part. Is there a reason behind that?
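
    For example (a sketch only; -difficult- is a standard maximization option that uses a different stepping algorithm in nonconcave regions, and it may or may not help here):

    Code:
    zinb dp_ mig2gross_2016 popden per_vacrent medrent i.state_n, inflate(mig2gross_2016 popden per_vacrent medrent i.state_n) difficult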

    Since your data contain lots of zeros, hurdle models could also be a good option to consider relative to -zinb-.

    DL



    • #3
      Thanks DL! Your feedback is super helpful!

      Two follow-up questions:

      Do you know how I would compare / determine if a hurdle model would be better than a ZINB? I did the ZIP option so I know it is better than that, but I'm not sure how to compare it to a hurdle model.

      Also, if the margins command as I have listed above shows that increasing mig2gross_2016 by one unit increases the expected rate of dp_ by 1.79, why is that different from the exponentiated coefficient of 1.001?



      • #4
        Originally posted by Nora Romeo View Post
        Thanks DL! Your feedback is super helpful!

        Two follow-up questions:

        Do you know how I would compare / determine if a hurdle model would be better than a ZINB? I did the ZIP option so I know it is better than that, but I'm not sure how to compare it to a hurdle model.

        Also, if the margins command as I have listed above shows that increasing mig2gross_2016 by one unit increases the expected rate of dp_ by 1.79, why is that different from the exponentiated coefficient of 1.001?
        If I can add to Dung's explanation: I believe a zero-inflated model simultaneously models the probability that an observation belongs to a latent class that only produces Y = 0, and the probability that it belongs to a latent class with a negative binomial or Poisson response function. (NB: that second class can produce occasional zeroes as well! You're just assuming that some respondents are structural zeroes, i.e. that they will always produce a zero.)

        Nora could have added the irr option to the original command to report the negative binomial coefficients as incidence rate ratios, which are intuitive. However, I'm pretty sure that this is an incidence rate conditional on membership in the class with the negative binomial response, i.e. on not being in the structural-zero class. Maybe margins is the better tool to produce estimates that a broad range of people can understand.

        Dung said this about the output from margins:

        i.e. on average, increasing mig2gross_2016 by one unit increases the expected rate of dp_ by 1.79
        It might be clearer to say that a one-unit increase in mig2gross increases the expected count of dp_ by 1.79 units. Rate usually means the number of events in a specified population per unit time, and it's often standardized by some count, e.g. number of deaths per 10,000 person-years. This may depend on what dp_ actually is.

        If you use the irr option, it appears to exponentiate only the coefficients in the negative binomial or Poisson part of the model. I'm open to correction if I'm wrong, but I think that by exponentiating the coefficients in the inflate part of the model, which models the probability of being in the structural-zero class, you would get odds ratios. That part of the ZINB model is basically a logistic model.
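
        If you wanted those odds ratios, something like this should work after -zinb- (a sketch; inflate is the equation name shown in the output above, and -lincom-'s or option simply exponentiates the estimate):

        Code:
        * odds ratio (with CI) for structural-zero membership per unit of mig2gross_2016
        lincom [inflate]mig2gross_2016, or
        * or by hand from the stored coefficient
        display exp(_b[inflate:mig2gross_2016])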

        I have no experience with hurdle models, so I won't comment about those.
        Last edited by Weiwen Ng; 15 Apr 2020, 09:00.
        Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

        When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



        • #5
          Thank you Weiwen Ng for the detailed explanation.

          To Nora
          Originally posted by Nora Romeo View Post
          Do you know how I would compare / determine if a hurdle model would be better than a ZINB? I did the ZIP option so I know it is better than that, but I'm not sure how to compare it to a hurdle model.
          There are several ways to get what you want. One of the most straightforward, I think, is to compare the AIC, BIC, and log likelihood produced by the two models, and it is not difficult to obtain those information criteria. In addition, because hurdle models and -zinb- are non-nested, you can use a Vuong test to compare the two models.
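
          For example, -estat ic- reports AIC and BIC after each estimation command. Stata has no one-line count hurdle command, but a common way to assemble one is a logit for any-vs-no events plus a zero-truncated negative binomial (-tnbreg-) for the positive counts. A rough sketch (the any_dp variable and the by-hand AIC/BIC arithmetic are illustrative, not output from this thread):

          Code:
          * information criteria for the ZINB
          zinb dp_ mig2gross_2016 popden per_vacrent medrent, inflate(mig2gross_2016 popden per_vacrent medrent)
          estat ic

          * hurdle part 1: logit for zero vs. positive counts
          gen byte any_dp = (dp_ > 0) if !missing(dp_)
          logit any_dp mig2gross_2016 popden per_vacrent medrent
          scalar ll1 = e(ll)
          scalar k1  = e(rank)

          * hurdle part 2: zero-truncated negative binomial for the positive counts
          tnbreg dp_ mig2gross_2016 popden per_vacrent medrent if dp_ > 0
          scalar ll2 = e(ll)
          scalar k2  = e(rank)

          * hurdle log likelihood, AIC, and BIC (N = 4,716 from the output above)
          display "ll  = " ll1 + ll2
          display "AIC = " -2*(ll1 + ll2) + 2*(k1 + k2)
          display "BIC = " -2*(ll1 + ll2) + ln(4716)*(k1 + k2)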

          Hope that helps



          • #6
            Okay, let me make sure I'm understanding this correctly. From the code:

            Code:
            . zinb dp_ mig2gross_2015 popden per_vacrent medrent, inflate(mig2gross_2015 popden per_v
            > acrent medrent) irr
            
            Fitting constant-only model:
            
            Iteration 0:   log likelihood = -11750.491  (not concave)
            Iteration 1:   log likelihood = -8595.9891  
            Iteration 2:   log likelihood = -8041.0837  
            Iteration 3:   log likelihood = -7589.8118  
            Iteration 4:   log likelihood = -7421.3286  
            Iteration 5:   log likelihood = -7364.3067  
            Iteration 6:   log likelihood = -7351.4326  
            Iteration 7:   log likelihood = -7351.0042  
            Iteration 8:   log likelihood = -7351.0038  
            
            Fitting full model:
            
            Iteration 0:   log likelihood = -7351.0038  
            Iteration 1:   log likelihood = -7191.7073  
            Iteration 2:   log likelihood = -7071.2588  
            Iteration 3:   log likelihood = -7038.9217  
            Iteration 4:   log likelihood = -7038.0647  
            Iteration 5:   log likelihood = -7038.0609  
            Iteration 6:   log likelihood = -7038.0609  
            
            Zero-inflated negative binomial regression      Number of obs     =      4,716
                                                            Nonzero obs       =        906
                                                            Zero obs          =      3,810
            
            Inflation model = logit                         LR chi2(4)        =     625.89
            Log likelihood  = -7038.061                     Prob > chi2       =     0.0000
            
            --------------------------------------------------------------------------------
                       dp_ |        IRR   Std. Err.      z    P>|z|     [95% Conf. Interval]
            ---------------+----------------------------------------------------------------
            dp_            |
            mig2gross_2015 |   1.001338   .0000822    16.29   0.000     1.001177    1.001499
                    popden |   1.000014   6.76e-06     2.07   0.039     1.000001    1.000027
               per_vacrent |   1.116446   .0469465     2.62   0.009     1.028122    1.212357
                   medrent |   1.000351   .0001748     2.01   0.045     1.000008    1.000694
                     _cons |   54.40318   12.33194    17.63   0.000     34.88804     84.8344
            ---------------+----------------------------------------------------------------
            inflate        |
            mig2gross_2015 |  -.0221026    .001496   -14.77   0.000    -.0250346   -.0191705
                    popden |  -.0000427    .000045    -0.95   0.342    -.0001308    .0000454
               per_vacrent |  -.1326346   .0484116    -2.74   0.006    -.2275196   -.0377496
                   medrent |  -.0033009   .0002284   -14.45   0.000    -.0037485   -.0028533
                     _cons |   5.766372   .2458235    23.46   0.000     5.284567    6.248177
            ---------------+----------------------------------------------------------------
                  /lnalpha |   .2148961   .0541666     3.97   0.000     .1087315    .3210607
            ---------------+----------------------------------------------------------------
                     alpha |   1.239733   .0671521                      1.114863    1.378589
            --------------------------------------------------------------------------------
            Note: Estimates are transformed only in the first equation.
            Note: _cons estimates baseline incidence rate.
            
            . margins, at(mig2gross=(0 1 100 1000)) atmeans'
            option ' not allowed
            r(198);
            
            . margins, at(mig2gross=(0 1 100 1000)) atmeans
            
            Adjusted predictions                            Number of obs     =      4,716
            Model VCE    : OIM
            
            Expression   : Predicted number of events, predict()
            
            1._at        : mig2gro~2015    =           0
                           popden          =    676.7591 (mean)
                           per_vacrent     =    1.812826 (mean)
                           medrent         =    816.6234 (mean)
            
            2._at        : mig2gro~2015    =           1
                           popden          =    676.7591 (mean)
                           per_vacrent     =    1.812826 (mean)
                           medrent         =    816.6234 (mean)
            
            3._at        : mig2gro~2015    =         100
                           popden          =    676.7591 (mean)
                           per_vacrent     =    1.812826 (mean)
                           medrent         =    816.6234 (mean)
            
            4._at        : mig2gro~2015    =        1000
                           popden          =    676.7591 (mean)
                           per_vacrent     =    1.812826 (mean)
                           medrent         =    816.6234 (mean)
            
            ------------------------------------------------------------------------------
                         |            Delta-method
                         |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                     _at |
                      1  |   5.112411   .5241168     9.75   0.000     4.085162    6.139661
                      2  |   5.226974   .5321676     9.82   0.000     4.183944    6.270003
                      3  |    36.3785   3.654438     9.95   0.000     29.21594    43.54107
                      4  |   340.1172   25.48888    13.34   0.000     290.1599    390.0745
            ------------------------------------------------------------------------------

            My interpretation is:
            1. With the IRR option, I am getting the exponentiated coefficients, so a one-unit increase in mig2gross_2016 is a 1.001 increase in dp_.
            2. Using margins at different values of mig2gross_2016, we see a smaller impact on dp_ at 1 than at 100, and at 100 than at 1000 (I think I still need some help explaining this a little better).

            Thanks again everyone, you have all been insanely helpful!



            • #7
              Originally posted by Nora Romeo View Post
              ...

              Code:
              ...

              My interpretation is:
              1. With the IRR option, I am getting the exponentiated coefficients, so a one-unit increase in mig2gross_2016 is a 1.001 increase in dp_.
              ...
              I think that it is something like this: a one-unit increase in mig2gross_2016 is associated with an IRR of 1.001, i.e. it multiplies the expected count by 1.001, conditional on being in the non-structural-zero group. I am open to correction if I'm wrong, but I'm pretty sure that the quote above is not correct.

              2. Using margins at different values of mig2gross_2016, we see a smaller impact on dp_ at 1 than at 100, and at 100 than at 1000 (I think I still need some help explaining this a little better)
              Again, I'm pretty sure that you're just showing the predicted count of dp_ at those 4 values of mig2gross, holding all covariates at the sample means. Those values are pretty widely separated!

              Note: I think that omitting the atmeans option is also acceptable. If you omit it, all the other covariates will be left at their observed values.
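
              Something like this sketch contrasts the two (the first call predicts for a single observation with every covariate at its mean; the second averages predictions over the sample as observed):

              Code:
              margins, at(mig2gross_2015=(0 1 100 1000)) atmeans
              margins, at(mig2gross_2015=(0 1 100 1000))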



              • #8
                Guys, thanks so much this has been amazing!



                • #9
                  Originally posted by Nora Romeo View Post
                  Guys, thanks so much this has been amazing!
                  You're welcome. Before I forget, here is a very good explanation of the margins command by Richard Williams, who frequently posts here. The nice thing about margins is that it converts everything into 'natural' units, i.e. probabilities for any probability model and counts for any count model; you aren't left to wonder how big an odds ratio really is. It also has some very nice automated plotting capability (marginsplot) after you estimate the margins.
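
                  For instance, after any of the -margins- calls above, the plot is one more line (a sketch):

                  Code:
                  margins, at(mig2gross_2015=(0 1 100 1000))
                  marginsplot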



                  • #10
                    That is a great resource, thanks!

                    I just wanted to check my understanding with someone else. So I used the code below (I trimmed out some of the results):

                    Code:
                    zinb dp_ mig2gross_2016 theme1 theme2 theme3 theme4 i.time_n, inflate(theme1 theme2 theme3 theme4 i.time_n i.state_n)  irr
                    
                    ....
                    
                    . margins time_n#state_n
                    numerical derivatives are approximate
                    flat or discontinuous region encountered
                    numerical derivatives are approximate
                    flat or discontinuous region encountered
                    numerical derivatives are approximate
                    flat or discontinuous region encountered
                    numerical derivatives are approximate
                    flat or discontinuous region encountered
                    
                    Predictive margins                              Number of obs     =      4,716
                    Model VCE    : OIM
                    
                    Expression   : Predicted number of events, predict()
                    
                    
                    ....
                    
                    
                    marginsplot, noci scheme(s1mono) legend(position(1) ring(0))
                    [Attachment: Graph.png - a marginsplot of the predicted counts by time_n and state_n]

                    So the graph I ended up with is above. My interpretation is that there are more events (counts) in the average destination county in the second month after the event, and that while different states influence the likelihood of having more or fewer events, they all follow roughly the same trajectory in terms of how time influences counts of movements. Does that sound right?

                    And would you say that this starts to address some of the aspects that I would be showing in a multilevel model, where I'd have different levels for time and state?

                    Thanks again everyone



                    • #11
                      Originally posted by Nora Romeo View Post
                      That is a great resource, thanks!

                      ...
                      So the graph I ended up with is above. My interpretation is that there are more events (counts) in the average destination county in the second month after the event, and that while different states influence the likelihood of having more or fewer events, they all follow roughly the same trajectory in terms of how time influences counts of movements. Does that sound right?

                      And would you say that this starts to address some of the aspects that I would be showing in a multilevel model, where I'd have different levels for time and state?
                      Your statistical model is actually assuming that every state follows the same trend over time, and the way the graph looks is an inevitable consequence of that assumption. Every state gets the same multiplicative bump in its incidence rate in October 2017. State 3 looks like it doesn't, but that's likely because it gets very few events.

                      Earlier, I think you said that when you included state as a fixed effect, the model iterated forever, i.e. it failed to converge, and I think we failed to really dig into that. In the raw data, how many events does state 3 get? If it gets exactly zero events, then the MLE for its beta is negative infinity, which might cause convergence trouble. I also see that the margins output you posted says something about a flat or discontinuous region; I'm not exactly sure what that means, but it would cause me some concern. In practice, Stata might estimate the parameter at something like -15 (see the toy example below).
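
                      To check the raw data, -tabstat- will show each state's event total. And here is a minimal sketch of the all-zeros phenomenon on simulated data (hypothetical numbers; the exact value Stata stops at can vary by version and estimator):

                      Code:
                      * events contributed by each state in the raw data
                      tabstat dp_, by(state_n) statistics(sum count mean)

                      * toy illustration (run in a fresh session; -clear- drops the data in memory):
                      * the MLE for a category with no events is negative infinity
                      clear
                      set obs 3000
                      set seed 2020
                      gen byte state = ceil(_n/1000)    // three "states" of 1,000 obs each
                      gen y = rpoisson(2)               // counts with mean 2
                      replace y = 0 if state == 3       // state 3 never has an event
                      poisson y i.state                 // 3.state's estimate drifts far negative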

                      I can't really comment on multilevel models. I don't believe there is a stock Stata command to fit ZINB models with random effects. You can fit a ZINB model in the generalized structural equation modeling command (-gsem-), and you can include random effects in gsem. However, I haven't tried to fit that sort of model before.
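
                      As an aside, for a plain (non-inflated) negative binomial there is a stock mixed-effects command, which might at least serve as a point of comparison even though it does not model the excess zeros (a sketch):

                      Code:
                      * mixed-effects negative binomial with a random intercept for state (NOT zero-inflated)
                      menbreg dp_ mig2gross_2016 popden per_vacrent medrent || state_n: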



                      • #12
                        Okay, thank you so much. One of the states was entirely populated by zeros because I had messed up the data, so I really appreciate you pointing that out. The model now runs with both state and time as part of the ZINB.

                        My question is, and apologies if I'm interpreting this incorrectly: if we can use marginal effects to estimate the mean of dp_ at different states and times, doesn't that give some indication of their varied effects on the outcome? Thanks!

                        Code:
                        . zinb dp_ mig2gross_2016 theme1 theme2 theme3 theme4 i.time_n i.state_n, inflate(theme1 th
                        > eme2 theme3 theme4 i.time_n i.state_n) zip irr 
                        
                        Fitting zip model:
                        
                        Iteration 0:   log likelihood = -383890.35  
                        Iteration 1:   log likelihood = -166014.75  
                        Iteration 2:   log likelihood = -80389.065  
                        Iteration 3:   log likelihood = -73689.535  
                        Iteration 4:   log likelihood = -73656.566  
                        Iteration 5:   log likelihood = -73656.553  
                        Iteration 6:   log likelihood = -73656.553  
                        
                        Fitting constant-only model:
                        
                        Iteration 0:   log likelihood = -12052.075  (not concave)
                        Iteration 1:   log likelihood = -8798.9175  
                        Iteration 2:   log likelihood = -8035.7076  
                        Iteration 3:   log likelihood = -7775.0397  
                        Iteration 4:   log likelihood = -7766.3645  
                        Iteration 5:   log likelihood = -7766.3222  
                        Iteration 6:   log likelihood = -7766.3222  
                        
                        Fitting full model:
                        
                        Iteration 0:   log likelihood = -7766.3222  (not concave)
                        Iteration 1:   log likelihood = -7578.7132  (not concave)
                        Iteration 2:   log likelihood = -7395.2882  (not concave)
                        Iteration 3:   log likelihood = -7291.3798  
                        Iteration 4:   log likelihood = -7069.6876  
                        Iteration 5:   log likelihood = -7010.7578  
                        Iteration 6:   log likelihood = -7008.8553  
                        Iteration 7:   log likelihood = -7008.8512  
                        Iteration 8:   log likelihood = -7008.8512  
                        
                        Zero-inflated negative binomial regression      Number of obs     =      4,716
                                                                        Nonzero obs       =        960
                                                                        Zero obs          =      3,756
                        
                        Inflation model = logit                         LR chi2(19)       =    1514.94
                        Log likelihood  = -7008.851                     Prob > chi2       =     0.0000
                        
                        --------------------------------------------------------------------------------
                                   dp_ |        IRR   Std. Err.      z    P>|z|     [95% Conf. Interval]
                        ---------------+----------------------------------------------------------------
                        dp_            |
                        mig2gross_2016 |   1.001053   .0000559    18.84   0.000     1.000943    1.001162
                                theme1 |   .8276043   .1616495    -0.97   0.333     .5643702    1.213616
                                theme2 |    1.24371   .2050724     1.32   0.186     .9002568    1.718192
                                theme3 |   26.25199   5.268545    16.28   0.000     17.71469     38.9037
                                theme4 |   .8976557   .1200949    -0.81   0.420     .6906053    1.166782
                                       |
                                time_n |
                               201710  |   1.266187   .0896764     3.33   0.001     1.102078    1.454733
                               201711  |   .8812726   .0640859    -1.74   0.082     .7642072    1.016271
                               201712  |   .8009891   .0595075    -2.99   0.003     .6924504    .9265408
                               201801  |   .5220999   .0414913    -8.18   0.000     .4467953    .6100966
                               201802  |   .3539045   .0303242   -12.12   0.000     .2991927     .418621
                                       |
                               state_n |
                                            2  |   3.028161   .4480677     7.49   0.000     2.265841    4.046958
                                            3  |   4.390404   .4837528    13.43   0.000     3.537656    5.448706
                                            4  |   1.980917   .2848017     4.75   0.000     1.494468    2.625704
                                            5  |   3.771753    .530341     9.44   0.000     2.863235    4.968549
                                            6  |   2.865479   .3175786     9.50   0.000     2.305999    3.560698
                                            7  |   3.316452   .3717148    10.70   0.000     2.662376    4.131219
                                            8  |    3.66804   .4694675    10.15   0.000     2.854236    4.713876
                                            9  |   1.566243   .1716763     4.09   0.000     1.263452    1.941599
                                           10  |   1.215705   .1457821     1.63   0.103     .9610722    1.537803
                                       |
                                 _cons |   3.840601   .7682528     6.73   0.000     2.594953    5.684194
                        ---------------+----------------------------------------------------------------
                        inflate        |
                                theme1 |    4.59789   .3600309    12.77   0.000     3.892242    5.303537
                                theme2 |   .0173825   .2732878     0.06   0.949    -.5182517    .5530168
                                theme3 |   -9.28391   .4102645   -22.63   0.000    -10.08801   -8.479806
                                theme4 |  -.9647324   .2490078    -3.87   0.000    -1.452779   -.4766862
                                       |
                                time_n |
                               201710  |   -.622303   .1694814    -3.67   0.000    -.9544805   -.2901255
                               201711  |  -.3098944   .1720491    -1.80   0.072    -.6471043    .0273156
                               201712  |  -.0764986   .1744496    -0.44   0.661    -.4184136    .2654164
                               201801  |   .5445079   .1830231     2.98   0.003     .1857893    .9032265
                               201802  |   1.085149   .1932292     5.62   0.000      .706427    1.463871
                                       |
                               state_n |
                                            2  |  -4.289617   .4817643    -8.90   0.000    -5.233857   -3.345376
                                            3  |  -3.497795   .2252531   -15.53   0.000    -3.939283   -3.056307
                                            4  |  -.9105593   .2831871    -3.22   0.001    -1.465596   -.3555227
                                            5  |  -2.967156     .34299    -8.65   0.000    -3.639404   -2.294908
                                            6  |  -3.177949     .30785   -10.32   0.000    -3.781324   -2.574574
                                            7  |  -3.407486   .2709197   -12.58   0.000    -3.938478   -2.876493
                                            8  |  -3.873965   .2776563   -13.95   0.000    -4.418162   -3.329769
                                            9  |   .4535625   .1926122     2.35   0.019     .0760496    .8310755
                                           10  |  -.4886507    .228459    -2.14   0.032    -.9364221   -.0408793
                                       |
                                 _cons |   8.066146   .4101508    19.67   0.000     7.262265    8.870027
                        ---------------+----------------------------------------------------------------
                              /lnalpha |  -.8113674   .0439079   -18.48   0.000    -.8974252   -.7253096
                        ---------------+----------------------------------------------------------------
                                 alpha |   .4442502   .0195061                      .4076179    .4841747



                        • #13
                          Originally posted by Nora Romeo View Post
                          Okay, thank you so much. One of the states was entirely populated by zeros because I had messed up the data, so I really appreciate you pointing that out. The model now runs with both state and time as part of the ZINB.

                          My question is, and apologies if I'm interpreting this incorrectly: if we can use marginal effects to estimate the mean of dp_ at different states and times, doesn't that give some indication of their varied effects on the outcome? Thanks!

                          ...
                          Glad you uncovered a data error!

                          To the second question, let me re-state things.

                          The model you are estimating predicts the mean count (aka expected count) of dp_.

                          In the model, you are assuming that each state has the same proportional increase in the expected count at time = 2 (and 3, 4, etc.). So each state's underlying mean count gets multiplied by the same factor (the IRR reported for 201710).
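
                          As a concrete illustration with the 201710 IRR of about 1.266 from your output (the baseline count of 5 is just an assumed round number):

                          Code:
                          * a state expecting 5 events at baseline expects about 6.33 events in 201710
                          display 5 * 1.266187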

                          In that sense, the outcome varies. You're right.

                          Alternatively, each state could get a different effect on its incidence rate in each time period. If a random-effects version of this model exists, then that model inherently assumes that each state's trend is a weighted average of the global mean trend and the state's own trend (states with fewer observations will look more like the mean trend). Alternatively, if you interacted state with time, you might see different trends - I'm not suggesting you do that, because I'm not sure you have the data to support it, and it's a lot of interaction terms!
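
                          For reference only, state-specific time trends would look something like the sketch below; with 6 time periods and 10 states that is 45 extra interaction terms, which is why I doubt your data would support it:

                          Code:
                          * fully interacted state-by-time trends in the count equation
                          zinb dp_ mig2gross_2016 theme1 theme2 theme3 theme4 i.time_n##i.state_n, inflate(theme1 theme2 theme3 theme4 i.time_n i.state_n) irr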

                          Maybe some of the misunderstanding is just complicated semantics plus the fact that I (and many others) don't always write precisely on the internet.

