Panel Negative Binomial Model

Ng Yongwen

Join Date: Feb 2018

Posts: 10
#1

17 Feb 2018, 03:24

Hi everyone!

I am currently working on my dissertation and would like to seek your advice on the panel negative binomial random-effects model.

I am trying to investigate the effect of bilateral income asymmetry and economic growth differences on the initiation of trade disputes. The dependent variable is the number of trade disputes between countries i and j in a given year (it is 0 if the country pair has no trade dispute that year). The control variables are bilateral trade GDP, GDP per capita, real exchange rate, democracy values, MFN tariff rates, distance, export shares, and dummy variables (presence of an FTA, common language, etc.).

The data is unbalanced, n is 7723 and T is 13.

I tried running xtnbreg depvar indepvars i.year i.country_i i.country_j, re irr; however, the model does not converge. Can anyone advise me why the model fails to converge whenever I add the dummy variables for countries i and j?

I would appreciate any help with this matter, as I am really clueless about how to solve this issue.

Regards,
Kenneth
Joao Santos Silva

Join Date: Apr 2014

Posts: 3006
#2

17 Feb 2018, 10:51

Dear Kenneth,

The most likely cause is that the maximum likelihood estimator of that model does not exist for your data. That problem exists for other count data models and I have discussed that in the following papers (the problem is well known in the case of the logit and probit):

Santos Silva, J.M.C. and Tenreyro, Silvana (2010), On the Existence of the Maximum Likelihood Estimates in Poisson Regression, Economics Letters, 107(2), pp. 310-312.
Santos Silva, J.M.C. and Tenreyro, Silvana (2011), poisson: Some Convergence Issues, Stata Journal, 11(2), pp. 207-212.

I suggest that instead of the RE NegBin model you use Poisson regression with FE, which is much more robust.

Best wishes,

Joao
Ng Yongwen

Join Date: Feb 2018

Posts: 10
#3

17 Feb 2018, 19:21

Dear Mr Silva,

Thanks for replying.

I believe that my dataset suffers from overdispersion: the variance of the dependent variable is higher than its mean, so from what I understand the Poisson model is no longer suitable.

Most of the relevant literature uses the negative binomial model or the zero-inflated Poisson model because of overdispersion and "excessive zeros".

I actually tried the FE NegBin model; however, a lot of the observations were dropped due to zero outcomes. I have thought of using zero-inflated Poisson (ZIP); however, I have not been able to find a Stata command for ZIP specifically for panel data.

Also, would it be suitable to use PPML? My professor has recommended PPML; however, I have my reservations, since none of the related literature that I have read has adopted it.

Regards,
Kenneth
Joao Santos Silva

Join Date: Apr 2014

Posts: 3006
#4

18 Feb 2018, 15:14

Dear Kenneth,

I agree with your professor: PPML with FE is the way to go. That is what I recommended initially: Poisson regression with FE.

Probably the literature uses the other approaches that you mentioned because of frequent misconceptions about overdispersion and zero-inflation. For example, you say that Poisson regression would not be appropriate because the variance is larger than the mean. There are two problems with your statement: 1) to have overdispersion you need the conditional variance to be larger than the conditional mean, so you cannot conclude that Poisson regression is not appropriate just because the unconditional variance is larger than the unconditional mean; 2) even if there is indeed overdispersion, that is not a serious problem unless you want to compute probabilities of particular counts; if you just want to estimate the conditional mean, overdispersion is irrelevant.
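Point 1) can be illustrated with the law of total variance. This is only a sketch: y denotes the count, x the regressors, and \mu(x) = E[y | x] the conditional mean (notation introduced here for illustration).

```latex
\operatorname{Var}(y) = E\big[\operatorname{Var}(y \mid x)\big] + \operatorname{Var}\big(E[y \mid x]\big)

% Under a Poisson model with no overdispersion, Var(y | x) = E[y | x] = \mu(x), so
\operatorname{Var}(y) = E[\mu(x)] + \operatorname{Var}\big(\mu(x)\big)
                      = E[y] + \operatorname{Var}\big(\mu(x)\big) \;\ge\; E[y]
```

So the unconditional variance exceeds the unconditional mean whenever \mu(x) varies with the regressors, even when the conditional variance equals the conditional mean exactly.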

The important thing is that the only robust count data model for panel data is PPML with FE (which is Poisson regression with FE and robust standard errors); that is the one I would recommend.
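In Stata, one way this could look is sketched below. It assumes pair-level fixed effects; td, id, year and the regressor names are placeholders in the style of the variables used elsewhere in this thread, and the regressor list is shortened.

```stata
* Sketch only: pair-level FE Poisson with robust standard errors
* id = country-pair identifier, year = time variable (placeholder names)
xtset id year
xtpoisson td incomeasy eg_i eg_j fta i.year, fe vce(robust)
```

With pair fixed effects, regressors that do not vary over time within a pair (distance or a common border, for instance) cannot be identified and will be dropped.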

Best wishes,

Joao

Last edited by Joao Santos Silva; 18 Feb 2018, 15:18.
Ng Yongwen

Join Date: Feb 2018

Posts: 10
#5

19 Feb 2018, 08:15

Dear Mr Silva,

The following are my commands:

egen id=group(pair)
xtset id year
qui tab year, gen(dyear)
qui tab iso3d_i, gen(diso3d_i)
qui tab iso3d_j, gen(diso3d_j)

When I ran the command

ppml td logbilatr exshare_ij exshare_ji eg_i eg_j incomeasy demoi demoj border logdist loggdp_i loggdp_j wto_i wto_j mfn_i mfn_j logreer_i logreer_j fta ctyidummy ctyjdummy yeardummy

a lot of the observations and regressors (mainly country dummy variables) were dropped. Stata also reported that "the model appears to overfit some observations with td=0".

Do you know what is wrong? Is there something wrong with the specification of the model?

If need be, I can email you the log file of the results and an Excel spreadsheet of my dataset.

Regards,
Kenneth

Last edited by Ng Yongwen; 19 Feb 2018, 08:31.
Ng Yongwen

Join Date: Feb 2018

Posts: 10
#6

19 Feb 2018, 09:27

I have also tried the syntax xi, noomit: ppml, but the same problem still exists.

Regards,
Kenneth
Joao Santos Silva

Join Date: Apr 2014

Posts: 3006
#7

19 Feb 2018, 15:29

That many dummies and observations are dropped is natural; that is explained in the references above.

To solve the over-fitting problem, please try to create the dummies without omitting the base category; see here:

http://personal.lse.ac.uk/tenreyro/Pisch.do

Joao
Ng Yongwen

Join Date: Feb 2018

Posts: 10
#8

19 Feb 2018, 17:44

I tried replicating the syntax from the link above (iso3d_i and iso3d_j are the country codes for countries i and j):

xi, prefix(_D) noomit i.iso3d_i i.iso3d_j i.year

ppml td logbilatr exshare_ij exshare_ji eg_i eg_j incomeasy demo_i demo_j border logdist loggdp_i loggdp_j wto_i wto_j mfn_i mfn_j logreer_i logreer_j fta _D*

Unfortunately, the problem of overfitting still exists.

I only started using Stata recently, so please pardon me if I ask elementary questions.

Regards,
Kenneth

Last edited by Ng Yongwen; 19 Feb 2018, 17:51.
Joao Santos Silva

Join Date: Apr 2014

Posts: 3006
#9

21 Feb 2018, 13:29

Please try to run it with the option "noconstant".
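A rough sketch of how the two steps might fit together, reusing the variable names from the posts above with a shortened regressor list. It assumes a version of ppml that accepts the noconstant option.

```stata
* Sketch only: full sets of dummies (no base category omitted), no constant
xi, prefix(_D) noomit i.iso3d_i i.iso3d_j i.year
ppml td incomeasy eg_i eg_j fta _D*, noconstant
```

Each full set of dummies sums to one and is therefore collinear with a constant, which is why the constant is left out here.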

All the best,

Joao
Ng Yongwen

Join Date: Feb 2018

Posts: 10
#10

22 Feb 2018, 09:09

Would the syntax be ppml depvar indepvars, noconstant?

Stata keeps reporting that the option is not allowed.

Did I key it in incorrectly?

I also could not figure out why a massive number of observations are dropped when I run an FE negative binomial or FE Poisson model.
Those observations were dropped because of "only one obs per group" and "all zero outcomes", and I don't quite get the intuition.

Regards,
Kenneth

Last edited by Ng Yongwen; 22 Feb 2018, 09:17.
Joao Santos Silva

Join Date: Apr 2014

Posts: 3006
#11

22 Feb 2018, 11:46

Kenneth,

Please install the latest version of "ppml" from SSC.
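For reference, a user-written command is usually updated from SSC along these lines (a sketch; the second line just reports which copy of the command Stata finds on its search path):

```stata
* Install or update ppml from SSC, then check the installed copy
ssc install ppml, replace
which ppml
```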

Those observations are dropped because they contain no information about the parameters of interest. Please check a good textbook on count data.
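The intuition, roughly, is that the fixed-effects estimators condition on the total count within each group. A group whose outcomes are all zero, or which has a single observation, has its outcomes fully determined by that total, so it adds nothing to the likelihood. A sketch for counting such groups, assuming td is the outcome and id the pair identifier as in post #5 (the new variable names are made up for illustration):

```stata
* Sketch only: count pairs that carry no information in an FE count model
* total number of disputes within each pair
bysort id: egen td_total = total(td)
* number of observations per pair
bysort id: gen n_obs = _N
* flag exactly one row per pair so that each pair is counted once
egen byte pair_tag = tag(id)

* pairs whose outcomes are all zero
count if pair_tag & td_total == 0
* pairs observed only once
count if pair_tag & n_obs == 1
```

These counts use all rows in memory, so they may differ slightly from the estimation log if some rows have missing regressors.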

Best wishes,

Joao
Ng Yongwen

Join Date: Feb 2018

Posts: 10
#12

24 Feb 2018, 17:36

Sorry for the late reply.

I have installed the new version of ppml and included the noconstant option in the syntax.

However, the overfitting problem still exists. Is the problem due to the specification of the model?

Regards,
Kenneth
Joao Santos Silva

Join Date: Apr 2014

Posts: 3006
#13

25 Feb 2018, 01:58

Kenneth

Please send me the data and the code you are using so that I can check it. I'll then post my findings here.

All the best,

Joao
Joao Santos Silva

Join Date: Apr 2014

Posts: 3006
#14

28 Feb 2018, 13:08

Dear Kenneth,

Thank you for sharing the data. I had a look at it and I believe you can ignore the warning: your dependent variable has 97.94% of zeros, so the model fits them very, very well.
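The share of zeros can be checked with something like the following sketch, where td is the dependent variable used earlier in the thread:

```stata
* Sketch only: percentage of non-missing observations with td equal to zero
quietly count if !missing(td)
local n_nonmiss = r(N)
quietly count if td == 0
display "Share of zeros in td (%): " %5.2f (100*r(N)/`n_nonmiss')
```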

Best wishes,

Joao
Ng Yongwen

Join Date: Feb 2018

Posts: 10
#15

28 Feb 2018, 22:40

Dear Prof Santos,

If I can go ahead and ignore the warning, should I include the noconstant option or do without it?

I also noticed that the coefficients are extremely large in magnitude, with some taking positive or negative values of around 20 or 30. Is this a sign that ppml is not suitable?

Regards,
Kenneth