Margins of tnbreg command outputting predicted counts lower than 1

Christopher Kaufmann

Join Date: Dec 2014
Posts: 9


05 May 2015, 15:02

Hello StataList,

I am running a zero-truncated negative binomial regression model. I want to examine whether the number of prescriptions for benzodiazepines (BZDs) has increased over time among physicians who prescribe BZDs. Since I am only looking at prescribers, I will not have any 0s in the count variable (which is why I am using the tnbreg command):

Code:

. tab BZD_count if BZD_doctor == 1

      (sum) |
 BZD_count  |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |      3,274       45.87       45.87
          2 |      1,525       21.37       67.24
          3 |        811       11.36       78.60
          4 |        496        6.95       85.55
          5 |        317        4.44       90.00
          6 |        190        2.66       92.66
          7 |        136        1.91       94.56
          8 |         98        1.37       95.94
          9 |         65        0.91       96.85
         10 |         50        0.70       97.55
         11 |         35        0.49       98.04
         12 |         38        0.53       98.57
         13 |         19        0.27       98.84
         14 |         22        0.31       99.15
         15 |         12        0.17       99.31
         16 |          8        0.11       99.43
         17 |          8        0.11       99.54
         18 |          5        0.07       99.61
         19 |          4        0.06       99.66
         20 |          7        0.10       99.76
         21 |          7        0.10       99.86
         22 |          2        0.03       99.89
         23 |          1        0.01       99.90
         25 |          1        0.01       99.92
         26 |          1        0.01       99.93
         28 |          1        0.01       99.94
         29 |          1        0.01       99.96
         30 |          1        0.01       99.97
         32 |          1        0.01       99.99
         39 |          1        0.01      100.00
------------+-----------------------------------
      Total |      7,137      100.00

The specific syntax for estimating this regression is:

Code:

tnbreg BZD_count YEAR if BZD_doctor == 1, exp(numpatients) irr

I would like to summarize the predicted number of prescriptions for BZDs among BZD prescribing physicians across years of my study. To do this, I am using the margins command:

Code:

margins, at(YEAR=(2001(1)2009))

This results in predicted counts being below 1 (see output below). Since none of the doctors in my subsample have values less than 1, why would this be so? Thank you so much for your assistance.

Code:

Adjusted predictions                              Number of obs   =       7137
Model VCE    : OIM

Expression   : Predicted number of events, predict()
1._at        : YEAR    =        2001
2._at        : YEAR    =        2002
3._at        : YEAR    =        2003
4._at        : YEAR    =        2004
5._at        : YEAR    =        2005
6._at        : YEAR    =        2006
7._at        : YEAR    =        2007
8._at        : YEAR    =        2008
9._at        : YEAR    =        2009

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         _at |
       2001  |   .4842182   .0943278     5.13   0.000      .299339    .6690974
       2002  |   .4863701   .0940732     5.17   0.000       .30199    .6707502
       2003  |   .4885315    .093977     5.20   0.000       .30434     .672723
       2004  |   .4907026   .0940439     5.22   0.000     .3063799    .6750253
       2005  |   .4928833   .0942781     5.23   0.000     .3081015     .677665
       2006  |   .4950737   .0946828     5.23   0.000     .3094987    .6806486
       2007  |   .4972738   .0952604     5.22   0.000     .3105669    .6839807
       2008  |   .4994837   .0960122     5.20   0.000     .3113033    .6876641
       2009  |   .5017034   .0969388     5.18   0.000     .3117069    .6916999
------------------------------------------------------------------------------


Clyde Schechter

Join Date: Apr 2014

Posts: 30119
#2

05 May 2015, 16:26

First, a disclaimer. I'm not familiar with -tnbreg-. But just on general Stata principles, I think that -margins- is giving you the expected number of BZD prescriptions per prescriber in each year. You have 9 different years there, and it looks like most of your prescribers issued only 1 or 2 prescriptions in the entire data set, though there are clearly some who were much more "active." So it doesn't surprise me that the expected number of prescriptions per prescriber in any given year is below 1.
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2175
#3

05 May 2015, 22:03

I think the problem is that a truncated analysis, when computing predicted values or partial effects, typically treats the population of interest as the entire population. In particular, a truncated negbin treats the relevant population as all physicians, including those who do not prescribe BZDs. But you are not interested in this population, if I understand your statement. You want a count distribution over the values {1,2,3,...}. Now, a truncated negbin serves that purpose, but you must force it to compute the expected value of y, the variable that takes on {1,2,...}. I suspect that Stata is computing the expected value of y*, the variable that has an underlying negbin distribution over {0,1,2,...}. There should be an option that forces calculation of the predicted values for y rather than y*. JW
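
A rough numeric sketch of the y versus y* distinction, assuming the usual mean-dispersion negative binomial, where P(y* = 0) = (1 + alpha*mu)^(-1/alpha) and the zero-truncated mean is E[y | y > 0] = mu / (1 - P(y* = 0)). The value of mu is taken loosely from the margins output in post #1; the value of alpha is purely hypothetical, since the thread does not report the estimated dispersion.

Code:

* mu: untruncated mean E[y*], roughly the margin reported in post #1
scalar mu = 0.49
* alpha: HYPOTHETICAL dispersion value, for illustration only
scalar alpha = 2
* probability of a zero under the untruncated negative binomial
scalar p0 = (1 + alpha*mu)^(-1/alpha)
* conditional mean E[y | y > 0]; this cannot fall below 1
display mu / (1 - p0)

Because the untruncated mean is divided by a probability smaller than 1, the conditional mean is always larger than the untruncated mean, which is why the default predictions can sit below 1 even though every observed count is at least 1.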
Dimitriy V. Masterov

Join Date: Mar 2014

Posts: 609
#4

05 May 2015, 23:39

I believe you want the conditional mean E[y_i | y_i >t_i], which can be obtained like this:

Code:

webuse rod93
tnbreg deaths i.cohort, exposure(exposure) ll(9)
margins, predict(cm) at(cohort = (1(1)3))

Take a look at tnbreg postestimation margins entry to learn more.
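
Applied to the model in post #1, the same idea might look like the sketch below. It assumes the variable names from the original post and the default truncation point of 0, and it has not been run against the original data.

Code:

* refit the model from post #1 (truncation point defaults to 0)
tnbreg BZD_count YEAR if BZD_doctor == 1, exp(numpatients) irr
* conditional mean E[y | y > 0] at each study year
margins, predict(cm) at(YEAR=(2001(1)2009))

With predict(cm), the margins are expected counts among prescribers, so they should come out at 1 or above rather than around 0.5.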
Christopher Kaufmann

Join Date: Dec 2014

Posts: 9
#5

06 May 2015, 13:28

Thank you, Clyde, Jeff, and Dimitriy! This is exactly what I needed, and I am now getting numbers that make sense. It is good to know that the margins command makes predictions based on the whole sample rather than the stratified sample. It's a very powerful command, and I'm glad I've been learning about it. I appreciate your help!