order of independent variables in panel negative binomial regression

Geraldine Doolan

Join Date: Oct 2023
Posts: 4


03 Oct 2023, 07:58

Hello,

I am using Stata 18 on Windows 10.

I am running a negative binomial regression on a panel dataset, and the model does not always converge when I change the order of the independent variables.

Does the order of the independent variables matter in a count data model?

The data is from a travel-cost contingent behaviour survey. I have 422 observations from 211 individuals. The dependent variable is the number of trips to a recreational site, reported once for the current period and once for a hypothetical future period. The variance of the dependent variable is greater than the mean, hence the negative binomial form.

This is the code

Code:

xtset _index periods
xtnbreg trips_cb tc_1 tcs periods dog_walking age female third_level if trips < 300 & trips > 0 & distance_1 < 31, nolog

trips_cb - annual trips, integer
tc_1 - travel cost to the recreational site
tcs - travel cost to the substitute site
periods - dummy variable indicating 1 for contingent/hypothetical future visits and 0 for current visits
dog_walking - dummy = 1 if the person is walking their dog
age - continuous
female - binary
third_level - dummy for 3rd level education.

(I am limiting the observations used in the model to those that fit with the theory underpinning travel cost models.) There are 24 ways of arranging the four variables dog_walking age female third_level. Although I think they are all the same regression, the model does not always converge.

E.g., some output:

Code:

. xtnbreg trips_cb tc_1 tcs periods dog_walking third_level female age if trips < 300 & trips > 0 &  distance_1 < 31, nolog
convergence not achieved

Random-effects negative binomial regression          Number of obs    =    422
Group variable: _index                               Number of groups =    211

Random effects u_i ~ Beta                            Obs per group:
                                                                  min =      2
                                                                  avg =    2.0
                                                                  max =      2

                                                     Wald chi2(7)     =  71.50
Log likelihood = -1736.7896                          Prob > chi2      = 0.0000

------------------------------------------------------------------------------
    trips_cb | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        tc_1 |  -.2335243   .0337367    -6.92   0.000    -.2996469   -.1674016
         tcs |   .0499543   .0217355     2.30   0.022     .0073536    .0925551
     periods |  -.0281283   .0122978    -2.29   0.022    -.0522316    -.004025
 dog_walking |   .3941446   .1736135     2.27   0.023     .0538684    .7344208
 third_level |   -.227061   .1725194    -1.32   0.188    -.5651929    .1110708
      female |  -.0678351   .1517262    -0.45   0.655     -.365213    .2295428
         age |   .0117346   .0053414     2.20   0.028     .0012656    .0222036
       _cons |   17.67167   128.6094     0.14   0.891    -234.3981    269.7415
-------------+----------------------------------------------------------------
       /ln_r |   13.47252   128.6087                      -238.596     265.541
       /ln_s |  -.1777657    .086382                     -.3470713     -.00846
-------------+----------------------------------------------------------------
           r |   709641.7   9.13e+07                      2.4e-104    2.1e+115
           s |   .8371386   .0723137                      .7067549    .9915757
------------------------------------------------------------------------------
LR test vs. pooled: chibar2(01) = 773.67               Prob >= chibar2 = 0.000
convergence not achieved
r(430);

. xtnbreg trips_cb tc_1 tcs periods female age dog_walking third_level if trips < 300 & trips > 0 &  distance_1 < 31, nolog

Random-effects negative binomial regression          Number of obs    =    422
Group variable: _index                               Number of groups =    211

Random effects u_i ~ Beta                            Obs per group:
                                                                  min =      2
                                                                  avg =    2.0
                                                                  max =      2

                                                     Wald chi2(7)     =  71.47
Log likelihood = -1736.7895                          Prob > chi2      = 0.0000

------------------------------------------------------------------------------
    trips_cb | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        tc_1 |  -.2335191   .0337442    -6.92   0.000    -.2996565   -.1673817
         tcs |   .0499605   .0217411     2.30   0.022     .0073488    .0925722
     periods |  -.0281259   .0122978    -2.29   0.022    -.0522291   -.0040226
      female |  -.0677467   .1517559    -0.45   0.655    -.3651828    .2296895
         age |   .0117311   .0053425     2.20   0.028       .00126    .0222022
 dog_walking |   .3942596   .1736496     2.27   0.023     .0539127    .7346065
 third_level |  -.2272131   .1725566    -1.32   0.188    -.5654177    .1109916
       _cons |   18.59189   173.4156     0.11   0.915    -321.2964    358.4802
-------------+----------------------------------------------------------------
       /ln_r |   14.39218   173.4152                     -325.4954    354.2797
       /ln_s |  -.1779298    .086387                     -.3472453   -.0086143
-------------+----------------------------------------------------------------
           r |    1780095   3.09e+08                      4.4e-142    7.3e+153
           s |   .8370012   .0723061                       .706632    .9914227
------------------------------------------------------------------------------
LR test vs. pooled: chibar2(01) = 773.67               Prob >= chibar2 = 0.000


Andrew Musau

Join Date: Oct 2014

Posts: 10213
#2

03 Oct 2023, 09:43

It should not matter how you order the RHS variables. But convergence problems are common in Neg Bin regressions and there is no strong reason to prefer this estimator to Poisson (more to follow). You can use the estimates from the converged model as starting values for the model with convergence problems.

Code:

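* Fit the ordering that converges and save its coefficient vector
xtnbreg trips_cb tc_1 tcs periods female age dog_walking third_level if trips < 300 & trips > 0 & distance_1 < 31, nolog
mat b = e(b)
* Refit the problematic ordering, using the saved estimates as starting values
xtnbreg trips_cb tc_1 tcs periods dog_walking third_level female age if trips < 300 & trips > 0 & distance_1 < 31, nolog from(b, skip)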

"The variance of the dependent variable is greater than the mean, hence the negative binomial form."

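While the usual Poisson MLE standard errors are wrong if the data are overdispersed, you can correct for this by clustering the standard errors on the panel unit (here, _index). In this way, the estimator is fully robust: the Poisson coefficients remain consistent whether or not the data are overdispersed, provided the conditional mean is correctly specified.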
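A minimal sketch of that suggestion, reusing the variable names and sample restriction from the original post. The choice of pooled poisson (rather than, say, xtpoisson) is an assumption made for illustration, not something settled in this thread.

```stata
* Pooled Poisson with standard errors clustered on the individual (_index).
* Assumption: same regressors and sample restriction as the xtnbreg call in post #1.
* Note: the /// line continuation only works in a do-file, not at the command line.
poisson trips_cb tc_1 tcs periods dog_walking age female third_level ///
    if trips < 300 & trips > 0 & distance_1 < 31, vce(cluster _index) nolog
```

Because the same individual appears in both periods, clustering on _index allows the two observations per person to be correlated as well as overdispersed.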

Last edited by Andrew Musau; 03 Oct 2023, 09:49.
George Ford

Join Date: Aug 2014

Posts: 3152
#3

03 Oct 2023, 09:51

The coefficients and z statistics are essentially the same in both runs. It's just the ancillary estimates (_cons, /ln_r and r) that differ, and that's likely due to non-convergence.

I agree with Andrew--use poisson with robust errors.
Geraldine Doolan

Join Date: Oct 2023

Posts: 4
#4

04 Oct 2023, 04:02

Thanks very much, George and Andrew. Using poisson with vce(robust) did solve the problem of non-convergence.

Usually the standard in the travel cost modelling literature is to use a negative binomial when the trips count data is overdispersed. I'm wondering why that's the case when poisson with robust errors works as well.
John Mullahy

Join Date: Dec 2016

Posts: 752
#5

04 Oct 2023, 07:10

A common confusion with negative binomial regression models is that "overdispersion relative to a Poisson probability model" means Var(y|x)>E(y|x) not Var(y)>E(y). The sample descriptive statistics will show the latter but not the former. (It may be helpful to recall the decomposition Var(y)=Var_xE(y|x)+E_xVar(y|x).)
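To make the decomposition concrete, suppose (purely as an illustrative assumption) that y is exactly Poisson conditional on x, with conditional mean \mu(x), so that there is no conditional overdispersion at all. Then:

```latex
\begin{align*}
\operatorname{Var}(y)
  &= \operatorname{Var}_x\!\big[E(y \mid x)\big] + E_x\!\big[\operatorname{Var}(y \mid x)\big] \\
  &= \operatorname{Var}_x\!\big[\mu(x)\big] + E_x\!\big[\mu(x)\big] \\
  &= \operatorname{Var}_x\!\big[\mu(x)\big] + E(y) \;\ge\; E(y),
\end{align*}
```

with strict inequality whenever \mu(x) actually varies with x. So the sample variance of y can exceed its sample mean even when the conditional distribution is equidispersed.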

In my experience it is often the case that nonconvergence arises when Var(y|x)<E(y|x), which can happen even when Var(y)>E(y) but which cannot in general be accommodated by a negative binomial specification.
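One rough way to look at conditional (rather than marginal) dispersion is the Pearson dispersion statistic from a Poisson fit. The sketch below assumes the regressors and sample restriction from post #1. The variable names mu and pearson are hypothetical and must not already exist in the dataset.

```stata
* Assumption: pooled Poisson with the regressors from post #1 (run from a do-file).
quietly poisson trips_cb tc_1 tcs periods dog_walking age female third_level ///
    if trips < 300 & trips > 0 & distance_1 < 31, vce(cluster _index)

* Fitted conditional means for the estimation sample
predict double mu if e(sample), n

* Pearson contributions (y - mu)^2 / mu
generate double pearson = (trips_cb - mu)^2 / mu if e(sample)
quietly summarize pearson

* Sum of contributions divided by residual degrees of freedom
display "Pearson dispersion = " r(sum) / (e(N) - e(rank))
```

A value near 1 is consistent with Var(y|x) close to E(y|x), a value well above 1 points to conditional overdispersion, and a value below 1 points to underdispersion. The statistic is only as good as the conditional mean specification, so treat it as a diagnostic rather than a formal test.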
George Ford

Join Date: Aug 2014

Posts: 3152
#6

04 Oct 2023, 12:34

Similar discussion here.

HTML Code:

https://www.statalist.org/forums/forum/general-stata-discussion/general/1587040-why-do-poisson-and-negative-binomial-regressions-yield-the-same-result

I think people use nbreg mainly because they learned just enough about overdispersion to point them that way. A few papers get published, it becomes standard, and then you have to use it to get past journal referees.
Geraldine Doolan

Join Date: Oct 2023

Posts: 4
#7

05 Oct 2023, 04:41

Thanks everyone. I will probably end up having to use nbreg for journal submission as you say, but knowing the issues with it relative to the poisson is really useful.
