  • xtpoisson/xtnbreg, zero-truncation and overdispersion

    Dear Statalists,


    I am dealing with a panel dataset in which I want to estimate the territories' patenting activity on the basis of some performance indicators. I have a balanced panel dataset of 32 towns for 14 years and two outcomes of interest (SIPO and EPO). The following are the summary statistics of the outcomes. I also attach the kernel density of the variables.
    Var        Mean    St. Dev.   Min    Max
    SIPO   420.0725   612.1378      2   5670
    EPO    2.162637   4.281392      0     35
    Here are my questions:
    1. The variable EPO shows a fairly clear negative binomial shape, so I am almost sure it makes sense to use a panel negative binomial model (xtnbreg) to analyse the phenomenon. Is the overdispersion sufficient to support my choice?
    2. I have more doubts about the SIPO variable. It still shows a Poisson-like distribution, and it seems to me to be overdispersed given the magnitude of the standard deviation, but it does not contain any zeros. Given this, I am not sure whether I should consider a zero-truncated version. In my case, the value zero could occur, but it does not actually occur in my data. Furthermore, I could not find any Stata command that fits a zero-truncated negative binomial panel regression. Is it correct to stick with xtnbreg in this case as well? If a zero-truncated negative binomial should be chosen, is there any Stata command to fit a panel version?
    3. I observed that the within variability of both my outcomes is larger than the between variability. Apart from considering a RE version, does this fact have any bearing on the choice of the model (negative binomial instead of Poisson)?

    Code:
    xtsum SIPO EPO

    Variable         |      Mean   Std. Dev.       Min        Max |    Observations
    -----------------+--------------------------------------------+----------------
    SIPO_P~l overall |  420.0725   612.1378          2       5670 |     N =     455
             between |             447.4541   65.71429   1963.714 |     n =      33
             within  |             461.8728  -1285.642   4436.644 | T-bar = 13.7879
                     |                                            |
    EPO_Pa~l overall |  2.162637   4.281392          0         35 |     N =     455
             between |             2.335975          0   9.928571 |     n =      33
             within  |             3.607863  -7.765934   27.23407 | T-bar = 13.7879
    Thank you to anyone who will help!
    Chiara

  • #2
    Your count data show some clear overdispersion, making the negative binomial a good choice. Of course the overdispersion will typically be reduced once you introduce covariates. The random effect introduced at the panel level may also reduce overdispersion.

    1. This is an empirical question. You can always compare the log-likelihoods of the negative binomial and Poisson to see how much difference this makes.

    2. I would not worry about the lack of zeroes: with a mean as large as 420, the probability of a zero in either model is very small.

    3. I don't think the ratio of between to within variance is relevant to your choice; it just tells you there is still variation within territories over time.
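
    As a rough sketch of the comparison suggested in point 1, something along these lines could be run. The names town, year, x1 and x2 are placeholders that do not appear in the original post (the panel identifier, the time variable and two performance indicators), and EPO stands for whatever the outcome variable is actually called in the dataset.

    ```stata
    * Sketch only: town, year, x1 and x2 are assumed names, not taken from the post
    xtset town year

    * Random-effects Poisson
    xtpoisson EPO x1 x2, re
    estimates store pois

    * Random-effects negative binomial
    xtnbreg EPO x1 x2, re
    estimates store nb

    * Log-likelihoods, AIC and BIC of the two fits side by side
    estimates stats pois nb
    ```

    estimates stats is used here rather than a formal likelihood-ratio test because the two random-effects models specify the panel effect differently, so treating one as a restricted version of the other may not be justified. A clearly higher log-likelihood and lower AIC/BIC for the negative binomial would point the same way as the summary statistics.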



    • #3
      Dear German, thank you for your answer.
