  • Testing overdispersion in Negative Binomial

    Hi all,

    Is there a way to test for overdispersion in a panel negative binomial model? I know that the fixed effects negative binomial model estimated by xtnbreg should eliminate the overdispersion parameter delta_i (as described in the help file), so I guess that neither the postestimation commands nor the estimated parameters can provide a test for overdispersion in this case.

    Thank you,

    Federico

  • #2
    Dear Federico Nutarelli

    Be careful with the NBFE estimator, see:

    Guimarães, Paulo, 2008. "The fixed effects negative binomial model revisited," Economics Letters, 99(1), 63-66.

    Best wishes,

    Joao


    • #3
      Thank you very much Joao Santos Silva


      • #4
        I agree with Joao. I’d say it even more strongly: you should use the FE Poisson approach and never use FENB. The former is entirely robust, the latter is fragile to violations of assumptions.


        • #5
          Dear Professor Jeff Wooldridge,

          many thanks for your appreciated and useful reply.
          The problem is that my data are overdispersed. I read some of your interesting past posts and also Cameron and Trivedi's book on the topic. It seems that FE Negative Binomial is also not able to eliminate the fixed effects. If I may take some of your time, I would like to ask whether it is possible to correct for overdispersion with FE Poisson.

          Many thanks,

          Federico


          • #6
            Federico: A few points.

            1. Because the FE Poisson estimator is fully robust to any kind of variance-mean relationship, there is no need to "correct" for overdispersion with FEP. You do need to compute robust standard errors. Fully robust means that the conditional mean needs to be correct, and that's all.
            2. You can't tell by looking at the raw data, or even the data conditional on x, whether overdispersion holds in the sense relevant for panel data. You would have to observe the heterogeneity (or estimate it, which is difficult with small T). It seems likely that some units are underdispersed and some overdispersed, once you control for what I call c_i along with x_i. Being overdispersed across the entire population is not the same as being overdispersed for each unit.
            3. What would you do if you concluded you have overdispersion? There are, perhaps, more efficient method of moments estimators, as I discuss in my 1999 Journal of Econometrics paper. But you should not use FENB. The estimator is very fragile to violation of a set of assumptions that are too strong.
            4. One way to emphasize point 3: if the FENB assumptions hold, FEP is consistent. If the FEP assumptions hold, FENB is inconsistent.

            JW
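
A minimal sketch of point 1, using simulated data. The variable names id, year, x, and y and all parameter values are made up for illustration and do not come from this thread. The counts are deliberately overdispersed relative to Poisson, yet FE Poisson with robust standard errors should give a slope estimate close to the true value of 0.5, because only the conditional mean needs to be correct.

```stata
* Simulated panel: 500 units observed for 6 periods (all values hypothetical)
clear
set seed 12345
set obs 500
generate id = _n
generate c = rnormal()                 // unit heterogeneity c_i
expand 6
bysort id: generate year = _n
generate x = rnormal() + 0.5*c         // regressor correlated with c_i

* Conditional mean exp(0.5*x + c); the gamma factor has mean 1 and
* adds overdispersion without changing the conditional mean
generate mu = exp(0.5*x + c)
generate y = rpoisson(mu * rgamma(2, 0.5))

* FE Poisson with standard errors robust to any variance-mean relationship
xtset id year
xtpoisson y x, fe vce(robust)
```

With xtpoisson, fe, the vce(robust) option clusters on the panel variable, so the standard errors also allow for serial correlation within units.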


            • #7
              Jeff Wooldridge
              Dear Professor,

              thank you for the complete and very clear explanations. I will proceed as you suggest.

              Many thanks,

              Federico


              • #8
                Dear Professors Joao Santos Silva and Jeff Wooldridge,


                I found your replies and explanations really interesting and instructive! Actually, I have a similar issue and I would like to be sure that using FE Poisson (estimated with ppmlhdfe command) I'm obtaining robust and consistent estimates. In what follows I give more details about my research. I would be extremely grateful to have any suggestions from you. Thanks in advance!

                I'm interested in estimating the effect of a treatment, measured by a binary variable, on the number of tourists visiting a municipality. I have a balanced sample of countries (origin country of the tourists), Italian municipalities (destination of the travel) and year-months. In particular, I have a sample of 46 countries, 390 municipalities over the period 2000-2017, for a total of 3,875,040 observations.

                I'm estimating the following equation:

                ppmlhdfe log_n_visits treat , absorb(country_mun year_month) cluster(municipality) d
                margins, dydx(treat) post

                where log_n_visits measures the number of tourists from country C visiting municipality M in year-month T (in logs, i.e., log(visits + 1)), and treat is the treatment dummy, which takes the value one for treated countries and municipalities from a particular point in time onwards; that point in time varies across countries (i.e., some countries receive the treatment before other treated countries). Finally, I include FE for the intersection of country and municipality (country_mun) and of year and month (year_month), and I cluster SE at the municipality level. Notice that when I estimate this equation, the number of observations falls to 551,448 because ppmlhdfe drops 3,323,592 observations that are either singletons or separated by a fixed effect. The FE comprise 2,553 country_mun groups and 216 year_month periods, and I have 350 municipality clusters.

                I'm using survey data from the Bank of Italy that are representative of the tourism flow at the national level. However, by construction, the outcome variable has many zeros and I'm worried about the issue of overdispersion when I estimate FE Poisson. I include summary statistics of the outcome (linear) both on the whole sample and on the e(sample) generated by FE Poisson (ppmlhdfe), respectively:

                [Images: sum1.PNG and sum2.PNG (summary statistics of the outcome on the whole sample and on e(sample))]

                These are my questions:
                1. Is overdispersion a concern when using FE Poisson (ppmlhdfe)? In particular, I am wondering whether your statement "FE Poisson estimator is fully robust to any kind of variance-mean relationship" implies that FE Poisson (ppmlhdfe) also provides consistent and robust estimates in my setting. If so, I would be grateful if you could suggest some references that I can study and cite to support the use of ppmlhdfe instead of a Negative Binomial model in the presence of FE.
                2. Given the large number of zeros in my dataset, should I try to implement a zero-inflated Poisson model with FE? If so, is there a way to implement it in Stata?
                3. Is it preferable to use the count of visits in levels instead of its log transformation? Or is using the log transformation fine?
                Thank you very much for your time and your help!

                Best,
                Samuel





                • #9
                  Samuel: You should apply Poisson FE directly to visits. Do not use log(visits + 1)! And you have to be careful using margins because those depend on the estimated fixed effects. The coefficients themselves are usually of interest: multiplied by 100, they give percentage effects on the mean visits given a one unit increase in x.
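
The percentage interpretation can be made concrete with a short sketch. It assumes the poster's data are in memory and that n_visits (a hypothetical name, not used in the thread) holds the raw count of visits. Multiplying the coefficient by 100 is the usual approximation; for a dummy such as treat, the exact proportional change in the conditional mean is exp(b) - 1, and the two diverge as the coefficient grows.

```stata
* Assumes n_visits (hypothetical name) is the raw count, not log(visits + 1).
* ppmlhdfe is user-written: ssc install ppmlhdfe (it also needs reghdfe and ftools)
ppmlhdfe n_visits treat, absorb(country_mun year_month) vce(cluster municipality)

* Approximate percentage effect on mean visits: 100 times the coefficient
display 100 * _b[treat]

* Exact percentage effect of switching treat from 0 to 1,
* with a delta-method standard error
nlcom 100 * (exp(_b[treat]) - 1)
```

Neither quantity depends on the estimated fixed effects, which is why the coefficients are easier to work with here than the output of margins.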


                  • #10
                    Dear Professor Jeff Wooldridge,

                    thank you very much for your useful reply! So, do you think that I can trust FE Poisson because "FE Poisson estimator is fully robust to any kind of variance-mean relationship"? Should I be concerned about the large number of zeros? Finally, could you suggest some references, please?

                    Sorry if I'm bothering you with so many questions. Thank you very much in advance for your time; your suggestions are really valuable to me!

                    I wish you all the best in this hard time at the global level!
                    Best,
                    Samuel


                    • #11
                      Dear Samuel Nocito,

                      The reference you need is:

                      Wooldridge, J.M., (1999) “Distribution-Free Estimation of Some Nonlinear Panel Data Models,” Journal of Econometrics 90, 77–97.

                      Best wishes and stay safe,

                      Joao


                      • #12
                        Dear Professor Joao Santos Silva,

                        thank you very much for your reply and the suggested reference, I really appreciate it!

                        Best wishes, and you stay safe too,
                        Samuel


                        • #13
                          Hi all,

                          Thanks for all the info provided in this thread, learned a lot!

                          I am facing a problem very similar to Samuel Nocito's. I work with panel data (N=8091 municipalities, T=10 years) and my dependent variable is a count variable with many, many zeros. The dependent variable is the number of hospitalizations (for specific diseases) in a municipality.

                          I would like to run a (high-dimensional) FE zero-inflated Poisson regression (municipality and year fixed-effects) and I am looking for ways to implement this in Stata.

                          Is there a direct way or a two-step procedure to implement this? Our assumption is that we observe zero hospitalizations in small municipalities (low population counts).

                          Many thanks in advance and all the best,
                          Dijana






                          • #14
                            Dear Dijana,

                            A zero inflated model will be adequate if for some municipalities it is not possible to have hospitalizations due to a specific disease (for example, because there is no hospital in the municipality). If that is not the case, you can just use Poisson regression with FE, possibly controlling for the population in each municipality (but the FE are likely to be enough to account for differences in population).

                            Best wishes,

                            Joao
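
A minimal sketch of the suggestion above, assuming the data are in memory. The variable names hosp, x, population, municipality, and year are hypothetical (the thread does not name them), and x stands in for whatever regressors are of interest.

```stata
* Poisson regression with municipality and year fixed effects;
* standard errors clustered by municipality
* ppmlhdfe is user-written: ssc install ppmlhdfe
ppmlhdfe hosp x, absorb(municipality year) vce(cluster municipality)

* Optional variant: also control for (log) population, which matters only
* to the extent that population varies over time within a municipality
generate ln_pop = ln(population)
ppmlhdfe hosp x ln_pop, absorb(municipality year) vce(cluster municipality)
```

Municipalities whose count is zero in every year carry no information for the fixed effects estimator, so ppmlhdfe will report them as dropped.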


                            • #15
                              Dear Joao,

                              Many thanks for your quick and helpful reply.
                              I guess I will then just go for a Poisson regression with FE.

                              Best,
                              Dijana
