
  • Negative Binomial Fixed Effects Model with Panel Data (Several Options)

    Dear Statalists,

    I am currently struggling with a Stata issue regarding negative binomial panel regression with fixed effects. Guidance on this topic is very limited online, which is why I decided to post the issue here.

    I have a panel data set of 50 US states over 10 years. The dependent variable is the number of new firms (new_firms) per state and year; covariates include bank concentration, the number of VC firms and, as a control, GDP growth (all per state and year). I also want to add state and year fixed effects. As new_firms is a count with overdispersion, I chose a negative binomial regression. Overall, I want to find out how the covariates affect the creation of new firms.

    From my reading of the Statalist forums, there seem to be several ways to estimate this model.

    Option 1: xtnbreg, fe such as:

    Code:
    xtset state year
    xtnbreg new_firms bank_concentration vc_firms GDP_growth i.year, fe

    Option 2: nbreg with dummy vectors

    Code:
    xtset state year
    nbreg new_firms bank_concentration vc_firms GDP_growth i.state i.year
    My questions:
    1. Are these commands and models correct in general?
    2. I am not sure whether I need a conditional or an unconditional model. I found out that this may play a role, but I am not even sure which option delivers the unconditional and which the conditional model. I searched for a while and did not find any helpful resource to solve this by myself. Would you suggest using an unconditional model? How would I specify this in Stata?
    3. I also found a study by Allison, stating "This paper demonstrates that the conditional negative binomial model for panel data, proposed by Hausman, Hall, and Griliches (1984), is not a true fixed-effects method. This method, which has been implemented in both Stata and LIMDEP, does not in fact control for all stable covariates." This may somehow be overcome by using nbreg and creating the dummy vectors manually, but I am not sure if this is actually the case. Does this change anything regarding the questions above?
    I highly appreciate any help regarding these questions.

    Best regards,
    Korhan





  • #2
    Dear Korhan,

    Please see the discussion here (especially #4):

    https://www.statalist.org/forums/for...binomial-model

    Best wishes,

    Joao



    • #3
      Dear Joao,

      thank you for the quick hint. I missed this thread, possibly because it is very recent. As I understand it, you would suggest that the following is the best option?

      Code:
      xtset state year
      xtpoisson new_firms bank_concentration vc_firms GDP_growth i.year, fe
      If that option is not feasible within the scope of my work (I could only mention it in a "Limitations" section), could you say which of the options mentioned above would be the better choice?

      I would also highly appreciate it if you could say something regarding question two, namely whether a conditional or an unconditional regression suits my setup better. I cannot find anything regarding these specifications and find it very puzzling.

      Thank you very much already!

      Best regards,
      Korhan



      • #4
        Dear Korhan,

        I would not trust any of the results based on the NB model and that is why I would use a Poisson based model. With Poisson, the conditional and unconditional approach give exactly the same results.
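        A minimal sketch of how this equivalence could be checked in Stata, reusing the variable names from #1. The stored-estimate names fe_cond and fe_uncond are only illustrative, and the slope coefficients should agree up to numerical tolerance:

        ```stata
        xtset state year

        * Conditional approach: the state effects are conditioned out of the likelihood
        xtpoisson new_firms bank_concentration vc_firms GDP_growth i.year, fe
        estimates store fe_cond

        * Unconditional approach: the state effects are estimated as dummy coefficients
        poisson new_firms bank_concentration vc_firms GDP_growth i.year i.state
        estimates store fe_uncond

        * Put the slope coefficients side by side
        estimates table fe_cond fe_uncond, keep(bank_concentration vc_firms GDP_growth) b(%9.4f)
        ```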

        All the best,

        Joao



        • #5
          Thanks Joao, I have now looked into this area and the other threads more deeply, and I am convinced that I should use a Poisson FE model. Do you happen to know a paper that discusses this problem, i.e. that Poisson FE is a suitable model and that overdispersion is not an issue? For instance, you stated in the thread mentioned above:

          Probably the literature uses the other approaches that you mentioned because of frequent misconceptions about overdispersion and zero-inflation. For example you say that Poisson regression would not be appropriate because the variance is larger than the mean. There are two problems with your statement: 1) to have overdispersion you need the conditional variance to be larger than the conditional mean, so you cannot conclude that Poisson regression is not appropriate just because the variance is larger than the mean; 2) even if indeed there is overdispersion, that is not a serious problem unless you want to compute probabilities of particular counts; if you just want to estimate the conditional mean, overdispersion is irrelevant.
          I am asking because I would like to address this issue in my work. Have you perhaps written papers on this yourself, or do you know a main source? That would be of great help.
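          Point 1 of the quoted passage can be made concrete with the law of total variance. This is only a sketch: y stands for the count, x for the regressors, and the second line assumes the Poisson variance condition holds exactly, purely for illustration.

          ```latex
          \begin{align*}
          \operatorname{Var}(y)
            &= \operatorname{E}\big[\operatorname{Var}(y \mid x)\big]
             + \operatorname{Var}\big(\operatorname{E}[y \mid x]\big) \\
            &= \operatorname{E}(y) + \operatorname{Var}\big(\operatorname{E}[y \mid x]\big)
             \quad \text{if } \operatorname{Var}(y \mid x) = \operatorname{E}[y \mid x].
          \end{align*}
          ```

          So the unconditional variance exceeds the unconditional mean whenever the conditional mean varies with x, even when there is no overdispersion in the conditional sense.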

          Best regards



          • #6
            Sorry for the double post, but in the Allison and Waterman (2002) paper (https://statisticalhorizons.com/wp-c...n.Waterman.pdf) I found a passage on p. 264 that states:

            Code:
            A good alternative is to do conventional negative binomial regressions with direct
            estimation of the fixed effects rather than conditioning them out of the likelihood.
            Greene (2001) has demonstrated the computational feasibility of this approach,
            even with large sample sizes. Simulation results strongly suggest that this estimation
            method does not suffer from incidental parameters bias, and has much better sampling
            properties than the fixed-effects Poisson estimator.
            In this light, I am wondering whether a Poisson FE makes sense in my case, although I am sure you have good reasons. I also read that with a simple Poisson FE the "estimated standard errors may be downwardly biased", and this was the case in their analysis (p. 250). They suggest correcting the standard errors.

            Now I am especially wondering which of the models can be estimated in Stata and provides results that I can present in my work without major doubts, or that I can at least comment on.

            - Is an unconditional negative binomial FE (nbreg with dummy vectors) dominated by a Poisson FE?
            - How can I estimate a Poisson FE without the issue of downwardly biased standard errors?
            - Are there tests with which I can check which of the alternatives provides better results?

            All the best
            Korhan



            • #7
              Dear Korhan,

              Let me see if I can address all your concerns.

              1) For references, I recommend either the books by Cameron and Trivedi (they have one on count data) or the book by Wooldridge; his paper on the FE Poisson regression is also a key reference.

              2) I am not aware of any proof that the NB model does not suffer from the incidental parameters problem (IPP); simulation results cannot be used to prove this.

              3) You can use clustered standard errors with the FE Poisson regression; these are valid.

              4) I do not think we need a test here because there is really only one estimator that we know does not suffer from the IPP in this context.
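              For point 3, a minimal sketch using the variable names from #1. It assumes a Stata version in which xtpoisson, fe accepts vce(robust) and treats it as clustering on the panel variable; please confirm this in help xtpoisson for your version.

              ```stata
              xtset state year

              * FE Poisson with robust standard errors
              * (assumption: vce(robust) here clusters on the panel variable, state)
              xtpoisson new_firms bank_concentration vc_firms GDP_growth i.year, fe vce(robust)
              ```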

              All the best,

              Joao



              • #8
                Dear Joao,

                thank you for your patience and your great help. I have already been able to make huge steps. To address your replies:

                1) Thanks for the hint! I assume this is Wooldridge (1999) "Distribution-free estimation of some nonlinear panel data models"? To give something back to the forum and to interested readers: https://scholar.google.de/scholar?cl...=de&as_sdt=0,5 lists some abstracts of the paper.

                3) So this would be something like:

                Code:
                xtpoisson yvariable xvar controlvar1 controlvar2 i.year, robust i(state) fe
                ?

                I have another question in this regard, although it may be of a more basic nature. In linear regressions it is sometimes useful to take the log (ln) of the dependent or explanatory variables to change their distribution when they are not normally distributed. Is this also the case for Poisson regression? More concretely, does it make sense (and in which cases) to take the ln of the y or x variables (or of the controls)?

                Best regards
                Korhan



                • #9
                  Dear Korhan,

                  1) Yes, that's the one!
                  3) To be honest, I do not know if that would do the clustering; please check the manual.
                  4) Logging the variables on the RHS is fine, but do not log the dependent variable; that would defeat the purpose of using Poisson regression.
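                  For point 4, a short sketch of logging a regressor while keeping the count in levels. The choice of vc_firms is only an example taken from #1, and ln_vc_firms is a hypothetical new variable name.

                  ```stata
                  xtset state year

                  * Log a right-hand-side variable; ln() returns missing for values <= 0
                  generate ln_vc_firms = ln(vc_firms)

                  * The dependent variable stays a count in levels
                  xtpoisson new_firms ln_vc_firms bank_concentration GDP_growth i.year, fe vce(robust)
                  ```

                  Because the conditional mean is exponential in the regressors, the coefficient on ln_vc_firms can be read as an elasticity of the expected count with respect to vc_firms.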

                  All the best,

                  Joao



                  • #10
                    Dear Statalists,

                    I'm currently writing my M.Sc thesis in International Business & Management in the Netherlands and have some issues identifying an appropriate regression. I'm basically totally new to Stata and most of the theory behind it which makes it quite hard for me.
                    Anyways, here is my situation:

                    DV: number of patents (called Innovativeness)
                    IVs: (1) number of unique alliance partners (called Partners), (2) total number of alliances (called Alliances), (3) ratio (no. of partners / no. of alliances; called APS-Index)
                    CVs: firm age, total assets, R&D intensity (R&D expenses divided by total sales), debt-equity ratio, portfolio size (as number of alliances).

                    I came up with three hypotheses, each predicting the impact of one of the IVs on the DV. For all three hypotheses I predicted an inverted U-shaped relationship between the IV and the DV.
                    My dataset consists of panel data with a time lag of three years (L.3). For example, I want to see what the impact of the total number of a firm's alliances in 1995 is on innovativeness in 1998. The DV is therefore lagged by three years. I'm looking at data between 1995 and 2005 (for the IVs) and between 1998 and 2008 (for the DV), respectively. Partners and Alliances are integers, whereas APS-Index can only take positive decimal numbers (e.g. 1,57).

                    The mean of my DV is 895.5969 and the Std Dev. is 1168.711.

                    So now my questions are:
                    1. As I have panel data and the mean<Std. Dev., should I use Poisson or rather a negative binomial regression? Some former students in their theses argued that if the mean< Std. Dev. one should use the NBREG instead of POISSON.

                    2. How do I know if I should use a random-effects or a fixed-effects model?

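                    3. As I proposed an inverted U-shape for all three hypotheses, do my CVs also have to be squared or do they simply enter linearly? For instance (variable names adjusted to valid Stata names, with squares written in factor-variable notation),
                    xtnbreg Patents c.APS##c.APS Firm_Age Total_Assets RD_Intensity Debt_Equity_Ratio Portfolio_Size vs.
                    xtnbreg Patents c.APS##c.APS c.Firm_Age##c.Firm_Age c.Total_Assets##c.Total_Assets c.RD_Intensity##c.RD_Intensity c.Debt_Equity_Ratio##c.Debt_Equity_Ratio c.Portfolio_Size##c.Portfolio_Size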

                    4. Finally, how do I know if my model is good/valid? Which are the indicators?

                    I would really appreciate your help as I could not find any big help in books and from other students.

                    Best regards,

                    Marcel



                    • #11
                      Dear Marcel,

                      Just use Poisson FE as it is the only approach that is reasonably robust in this context.

                      Best wishes,

                      Joao



                      • #12
                        Dear Joao,

                        Thank you for your quick answer! When running a regression with a hypothesized inverted U-shape, do the control variables have to be squared?

                        Best regards,

                        Marcel



                        • #13
                          I do not think so, but that is an empirical issue; include them and see if they are needed.
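                          One way to "include them and see" could look like the sketch below. The variable names are hypothetical, valid-Stata versions of those in #10 (names cannot contain & or -), firm is an assumed panel identifier, only two controls are squared to keep the example short, and the three-year lag is left out for brevity.

                          ```stata
                          xtset firm year

                          * c.x##c.x adds both x and its square
                          xtpoisson Patents c.APS##c.APS c.Firm_Age##c.Firm_Age ///
                              c.Total_Assets##c.Total_Assets RD_Intensity, fe vce(robust)

                          * Joint test of the squared control terms
                          test c.Firm_Age#c.Firm_Age c.Total_Assets#c.Total_Assets

                          * Implied turning point of APS in the linear index
                          nlcom -_b[APS] / (2 * _b[c.APS#c.APS])
                          ```

                          If the joint test does not reject, the squared controls can arguably be dropped. An inverted U in APS requires a positive coefficient on APS and a negative one on its square, with the turning point inside the observed range of APS.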

                          Best wishes,

                          Joao



                          • #14
                            Hi,
                            I am having similar problems to Korhan and Marcel. I have a count variable as my dependent variable and I am trying to decide between poisson and negative binomial both with fixed effects. The LR test of alpha=0, after running a negative binomial regression, suggests the negative binomial is the model to use. However, looking at some of your posts you have indicated that poisson fe is the only robust approach. Why is this?



                            • #15
                              Dear Anthony ODowd,

                              The NB model with FE is not a true fixed effects model and it is valid only under very restrictive distributional assumptions. In contrast, Poisson regression with FE is valid under very general conditions.
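                              As a sketch of what "very general conditions" refers to, the result in Wooldridge (1999), cited earlier in the thread, needs essentially only a correctly specified conditional mean with strictly exogenous regressors. The notation is an assumption for illustration: y_it is the count, x_it the regressors, and alpha_i the fixed effect.

                              ```latex
                              \[
                              \operatorname{E}\big[y_{it} \mid x_{i1}, \dots, x_{iT}, \alpha_i\big]
                                = \alpha_i \exp\big(x_{it}'\beta\big),
                              \qquad
                              \operatorname{Var}\big(y_{it} \mid x_{i1}, \dots, x_{iT}, \alpha_i\big)
                                \ \text{left unrestricted.}
                              \]
                              ```

                              Overdispersion and serial correlation are therefore allowed, which is why robust (clustered) standard errors are used for inference.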

                              Best wishes,

                              Joao

