I examine the number of co-publications over the last 31 years by six countries that are members of a regional organization. I'm particularly interested in similarities and differences in the countries' co-authorship patterns, especially in those variables that reflect the relation to the co-authors' countries, such as trade volume or geographical distance. For this purpose, I've decided to apply a population-averaged negative binomial model
Code:
xtnbreg, pa difficult vce(robust)
following the field-specific literature's recommendation for overdispersed data.
I'm currently at an impasse, however, because including any of these "pair variables" (or virtual proximity variables) results in
Code:
no convergence
Now this issue is certainly not new, and I have found helpful explanations and advice in previous forum threads [1,2,3,4,5], but I'm not sure I understand all of it properly and I'm uncertain which of them apply to my specific case. I have summarized the potential issues, their recommended remedies and what I've tried so far to give you a better overview of my current understanding. Please point out anything I got wrong.
1. There may be high collinearity between the independent variables [5]. Some year dummies at the end of the time period are dropped due to collinearity, but checking correlations with
Code:
pwcorr
shows relatively weak correlations between the pair variables (<0.4), while they are higher between the variables that reflect the domestic dimension (where convergence is achieved).
Code:
no convergence
is also an issue if I exclude the year dummies, so I don't think collinearity is the cause.
2. It is possible that the maximum likelihood estimator for my model "does not exist" for my data [1]. This seems plausible, as my data do have a large number of 0 values in the dependent variable and in the "pair" independent variables. A potential remedy would be to start with a Poisson regression and plug the estimates into the negative binomial regression [2]. I wasn't sure which model I should use, so I've just run a population-averaged Poisson regression
Code:
xtpoisson, pa
but it results in
Code:
no convergence
as well. I assume using these estimates probably won't help either, right?
3. When using an interaction, the model including the interaction may not be identified by the data [4]. I'm not using interactions, so that specific issue should not apply here.
4. My model is insufficient and I should try something different.
I've tried the -difficult- option to change the stepping during the iteration [4], but to no avail. I have also tried
Code:
xtpoisson, r fe
as a fallback option [3], but with the same result. What I haven't thoroughly tried so far is another maximization technique, as I lack a proper understanding of the particularities of the different techniques.
I should mention that I have also had issues with an empty Wald chi² statistic, which I attribute to a scaling problem, as I was able to fix it by re-scaling the problematic pair variables.
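As a sketch of what the checks under items 1 and 2 and the re-scaling look like with the variables from my regression: the new variable name rtot_trade_bn and the divisor 1e9 are only illustrative assumptions, and I use rtot_trade here merely as an example of a pair variable.

```stata
* Item 1: pairwise correlations among the regressors, with significance levels
pwcorr rtot_trade gdp_pc tertenrol_epol trade_percgdp mobcell100 colotrad langcom, sig

* Item 2: number of observations with a zero outcome or a zero pair variable
count if collab_weight == 0
count if rtot_trade == 0

* Re-scaling: express trade volume in billions (assumed divisor) so that its
* coefficient is no longer of order 1e-09
generate double rtot_trade_bn = rtot_trade / 1e9
summarize rtot_trade rtot_trade_bn
```

The re-scaled variable would then replace rtot_trade in the regression; nothing else in the specification changes.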
Do you have some recommendations on possible next steps?
I've attached an example of the regression and copied the output below.
Code:
. xtnbreg collab_weight rtot_trade gdp_pc tertenrol_epol trade_percgdp mobcell100 colotrad langcom i.year, pa difficult vce(robust)
note: 2015.year omitted because of collinearity
note: 2016.year omitted because of collinearity
note: 2017.year omitted because of collinearity
note: 2018.year omitted because of collinearity
Iteration 1: tolerance = .31055186
Iteration 2: tolerance = .07928357
Iteration 3: tolerance = .08383659
Iteration 4: tolerance = .04340788
Iteration 5: tolerance = .22333106
Iteration 6: tolerance = .20362621
Iteration 7: tolerance = .52943928
Iteration 8: tolerance = .75065012
Iteration 9: tolerance = .08511348
Iteration 10: tolerance = .10210511
Iteration 11: tolerance = .06861638
Iteration 12: tolerance = .04175124
Iteration 13: tolerance = .57174736
Iteration 14: tolerance = .83240469
Iteration 15: tolerance = .13318422
Iteration 16: tolerance = .07155872
Iteration 17: tolerance = .07823117
Iteration 18: tolerance = .07148965
Iteration 19: tolerance = .03836049
Iteration 20: tolerance = .5207436
Iteration 21: tolerance = .77286907
Iteration 22: tolerance = .08953199
Iteration 23: tolerance = .09903163
Iteration 24: tolerance = .07404575
Iteration 25: tolerance = .03569057
Iteration 26: tolerance = .37159328
Iteration 27: tolerance = .54720453
Iteration 28: tolerance = .14373912
Iteration 29: tolerance = .03682585
Iteration 30: tolerance = .38592322
Iteration 31: tolerance = .57121807
Iteration 32: tolerance = .14028013
Iteration 33: tolerance = .04100945
Iteration 34: tolerance = .22680789
Iteration 35: tolerance = .25970474
Iteration 36: tolerance = .09572691
Iteration 37: tolerance = .11652831
Iteration 38: tolerance = .66254531
Iteration 39: tolerance = .77250927
Iteration 40: tolerance = .25295589
Iteration 41: tolerance = .05215284
Iteration 42: tolerance = .04514844
Iteration 43: tolerance = .0596035
Iteration 44: tolerance = .07818158
Iteration 45: tolerance = .07041597
Iteration 46: tolerance = .03923707
Iteration 47: tolerance = .55936823
Iteration 48: tolerance = .81969253
Iteration 49: tolerance = .12112588
Iteration 50: tolerance = .07921637
Iteration 51: tolerance = .08010801
Iteration 52: tolerance = .06113941
Iteration 53: tolerance = .08624978
Iteration 54: tolerance = .63494462
Iteration 55: tolerance = .85733655
Iteration 56: tolerance = .21802994
Iteration 57: tolerance = .04431795
Iteration 58: tolerance = .05111883
Iteration 59: tolerance = .07124833
Iteration 60: tolerance = .07989025
Iteration 61: tolerance = .04704872
Iteration 62: tolerance = .2078943
Iteration 63: tolerance = .13467526
Iteration 64: tolerance = .82616843
Iteration 65: tolerance = .57358112
Iteration 66: tolerance = .31724385
Iteration 67: tolerance = .39922756
Iteration 68: tolerance = .18022887
Iteration 69: tolerance = .10210485
Iteration 70: tolerance = .07338091
Iteration 71: tolerance = .06084837
Iteration 72: tolerance = .05441135
Iteration 73: tolerance = .05120285
Iteration 74: tolerance = .07191157
Iteration 75: tolerance = .07959718
Iteration 76: tolerance = .0462729
Iteration 77: tolerance = .21305515
Iteration 78: tolerance = .15071179
Iteration 79: tolerance = .89146622
Iteration 80: tolerance = .6153406
Iteration 81: tolerance = .19964479
Iteration 82: tolerance = .05179655
Iteration 83: tolerance = .32744225
Iteration 84: tolerance = .55548529
Iteration 85: tolerance = .30313567
Iteration 86: tolerance = .12112433
Iteration 87: tolerance = .07585909
Iteration 88: tolerance = .06068375
Iteration 89: tolerance = .05399522
Iteration 90: tolerance = .05320866
Iteration 91: tolerance = .07358059
Iteration 92: tolerance = .07842722
Iteration 93: tolerance = .04223677
Iteration 94: tolerance = .22502611
Iteration 95: tolerance = .23854644
Iteration 96: tolerance = .19215449
Iteration 97: tolerance = .23957639
Iteration 98: tolerance = .20948912
Iteration 99: tolerance = .26024878
Iteration 100: tolerance = .10269846
GEE population-averaged model                   Number of obs     =      3,161
Group variable:                     target      Number of groups  =        109
Link:                                  log      Obs per group:
Family:             negative binomial(k=1)                    min =         29
Correlation:                  exchangeable                    avg =       29.0
                                                              max =         29
                                                Wald chi2(31)     =   10794.89
Scale parameter:                         1      Prob > chi2       =     0.0000

                                 (Std. Err. adjusted for clustering on target)
--------------------------------------------------------------------------------
               |             Semirobust
 collab_weight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
    rtot_trade |   2.60e-09   6.37e-09     0.41   0.683    -9.88e-09    1.51e-08
        gdp_pc |   .0000971   .0000247     3.94   0.000     .0000487    .0001454
tertenrol_epol |   2.22e-06   4.01e-07     5.53   0.000     1.43e-06    3.01e-06
 trade_percgdp |  -.0183561   .0116593    -1.57   0.115    -.0412079    .0044958
    mobcell100 |   .0073132   .0011522     6.35   0.000     .0050549    .0095715
      colotrad |   1.594459   .5851681     2.72   0.006     .4475504    2.741367
       langcom |   1.509419   .6002954     2.51   0.012     .3328614    2.685976
---------------+----------------------------------------------------------------
          year |
          1991 |   .2856466   .3194495     0.89   0.371    -.3404629    .9117561
          1992 |  -.0231025   .2237049    -0.10   0.918     -.461556     .415351
          1993 |   .4922213   .1874601     2.63   0.009     .1248063    .8596362
          1994 |   .4741249   .2524005     1.88   0.060     -.020571    .9688208
          1995 |   .4691006   .2529398     1.85   0.064    -.0266523    .9648535
          1996 |   .4894541   .1945354     2.52   0.012     .1081717    .8707365
          1997 |    .927893   .2780302     3.34   0.001     .3829638    1.472822
          1998 |   1.250811   .2103598     5.95   0.000     .8385137    1.663109
          1999 |   1.082485   .2243396     4.83   0.000     .6427869    1.522182
          2000 |     .96179   .2667751     3.61   0.000     .4389204     1.48466
          2001 |   .9312558   .1905119     4.89   0.000     .5578593    1.304652
          2002 |   .8492289   .1903188     4.46   0.000     .4762109    1.222247
          2003 |   .9112329   .2544881     3.58   0.000     .4124454     1.41002
          2004 |   .6635822   .2677673     2.48   0.013      .138768    1.188396
          2005 |   .2132622   .2759962     0.77   0.440    -.3276804    .7542048
          2006 |   .2055204   .3177095     0.65   0.518    -.4171788    .8282196
          2007 |   .0070632   .3180742     0.02   0.982    -.6163507    .6304771
          2008 |  -.4258888   .2031861    -2.10   0.036    -.8241263   -.0276514
          2009 |  -.1525135   .2025328    -0.75   0.451    -.5494704    .2444435
          2010 |  -.2249248    .181555    -1.24   0.215     -.580766    .1309164
          2011 |  -.2851735   .1338686    -2.13   0.033    -.5475511   -.0227958
          2012 |    -.54935   .0686513    -8.00   0.000    -.6839041   -.4147959
          2013 |  -.6197236   .0624535    -9.92   0.000    -.7421302    -.497317
          2014 |  -.4995681   .0395978   -12.62   0.000    -.5771784   -.4219578
          2015 |          0  (omitted)
          2016 |          0  (omitted)
          2017 |          0  (omitted)
          2018 |          0  (omitted)
               |
         _cons |  -.7398958   .8294814    -0.89   0.372    -2.365649    .8858579
--------------------------------------------------------------------------------
convergence not achieved
r(430);
[1] https://www.statalist.org/forums/for...binomial-model
[2] https://www.statalist.org/forums/for...sson-estimates
[3] https://www.statalist.org/forums/for...-fixed-effects
[4] https://www.statalist.org/forums/for...ial-regression
[5] https://www.stata.com/statalist/arch.../msg00288.html
