Tobit model

Rochelle Zhang

Join Date: Jul 2025

Posts: 0
#1

Tobit model

09 May 2016, 09:59

Dear Statalisters,

I have never used a Tobit model and would like some advice. Thank you!

The dependent variable of my regression model has a lower bound of zero. It measures the number of patents a firm was granted in year t, scaled by total dollars of research and development expense, i.e., it measures how efficiently a firm uses R&D investment to generate patents. Many sample firms have zero patents in year t. I saw another paper with the same dependent variable use a Tobit model.

I found the UCLA Stata help page on Tobit,
http://www.ats.ucla.edu/stat/stata/dae/tobit.htm
but I am not sure whether I need to set an upper bound, as that page does.

Below are the summary statistics of my dependent variable:

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
        xvar |     51,036    .5973732    2.196713          0   21.91509

What are the common practices for setting up a Tobit model in Stata?

Thanks,
Rochelle
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#2

09 May 2016, 10:22

If there is no upper limit, there is no need to specify one.

That said, what you describe does not sound suitable for a tobit model. In a tobit model, the data are censored. For example, your outcome might be a measurement that has a lower limit of detection. This comes up fairly often in clinical studies, where the concentration of something in a specimen is reported as 0 whenever it is lower than, say, 0.3, because the assay cannot distinguish concentrations lower than 0.3. In that case, a tobit analysis could be appropriate and you would specify -ll(0.3)-. Stata would take that to mean that any value of the outcome variable that is lower than 0.3 is not precisely defined: it is some value known to be < 0.3, but not known exactly.
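
As an illustration of the detection-limit case, here is a minimal sketch on simulated data. The variable names (conc, dose, age) and all numbers are made up for the example and have nothing to do with the patent data in this thread.

```stata
* Simulated data; every variable name here is hypothetical.
clear
set seed 12345
set obs 500
generate dose = runiform()
generate age  = rnormal(50, 10)
* Latent (true) concentration, which the assay cannot always resolve
generate conc_true = 0.2 + 0.5*dose + 0.005*(age - 50) + rnormal(0, 0.3)
* The lab reports 0 whenever the true value is below the 0.3 detection limit
generate conc = cond(conc_true < 0.3, 0, conc_true)
* Left-censored tobit: values of conc at or below 0.3 are treated as censored
tobit conc dose age, ll(0.3)
```

No -ul()- option appears because nothing in this example is censored from above.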

Your situation is a lower bound of zero patents. If I understand it, it is not the case that these observations represent firms that really do have patents, but the number of them is too small to ascertain precisely, so you use 0 as a shorthand for "too small to pin down." If I have that right, these zeroes do not represent censoring of the data. They are actual zero outcomes.

My guess is that you would be better off analyzing these data with a count-outcome model such as Poisson or nbreg. It doesn't sound like a -tobit- to me.
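
A rough sketch of what those two count models could look like, again on simulated data. The names patents, lnrd, and size are assumptions made for the example, not variables from the original data set.

```stata
* Simulated firm-level data; all names and coefficients are hypothetical.
clear
set seed 2016
set obs 1000
generate lnrd = rnormal(2, 1)
generate size = rnormal()
* Patent counts, including many genuine zeros
generate patents = rpoisson(exp(-1 + 0.5*lnrd + 0.3*size))
* Poisson regression with robust standard errors
poisson patents lnrd size, vce(robust)
* Negative binomial regression, which allows for overdispersion
nbreg patents lnrd size
```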
Rochelle Zhang

Join Date: Jul 2025

Posts: 0
#3

09 May 2016, 12:24

Dear Clyde,

Thanks again for your detailed answer! It is always so helpful.

Your assessment is correct:

"If I understand it, it is not the case that these observations represent firms that really do have patents, but the number of them is too small to ascertain precisely, so you use 0 as a shorthand for 'too small to pin down.' If I have that right, these zeroes do not represent censoring of the data."

One question about the Poisson model: my dependent variable is the count of patents divided by R&D expense, which is a ratio with zero as its lower bound. Would it be okay to use Poisson? My understanding is that the dependent variable in a Poisson model should be a count variable.

Rochelle
Rich Goldstein

Join Date: Mar 2014

Posts: 4464
#4

09 May 2016, 14:06

there are no problems when some counts are zero - that is perfectly legitimate (in fact, some people have too many zeros and need to use a special model - Stata has several such special models, including -zip- (zero-inflated Poisson))
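
For reference, a minimal sketch of a zero-inflated Poisson fit on simulated data with excess zeros. The variables (size, never, patents) and the choice of what goes into -inflate()- are assumptions for illustration only.

```stata
* Simulated data with excess zeros; all names are hypothetical.
clear
set seed 42
set obs 2000
generate size = rnormal()
* Some firms never patent (structural zeros); the rest follow a Poisson process
generate never   = runiform() < invlogit(-0.5 - 0.8*size)
generate patents = cond(never, 0, rpoisson(exp(0.5 + 0.4*size)))
* inflate() lists the predictors of being an "always zero" observation
zip patents size, inflate(size)
```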
Rochelle Zhang

Join Date: Jul 2025

Posts: 0
#5

09 May 2016, 20:55

Thanks, Rich!

If the dependent variable is the raw count deflated by another continuous variable, can I still use a Poisson model?
Joao Santos Silva

Join Date: Apr 2014

Posts: 3011
#6

10 May 2016, 01:20

That is not a problem. Poisson regression can be used with continuous data; see for example here.
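
A minimal sketch of that idea, assuming a non-negative continuous outcome built as a ratio. The data are simulated and the names (rd, patents, ratio, size) are hypothetical. Robust standard errors are requested because the Poisson variance assumption is not expected to hold for a non-count outcome.

```stata
* Simulated data; all names and coefficients are hypothetical.
clear
set seed 7
set obs 1000
generate size    = rnormal()
generate rd      = exp(rnormal(2, 1))
generate patents = rpoisson(rd*exp(-2 + 0.4*size))
* Continuous, non-negative outcome with many zeros
generate ratio   = patents/rd
* Stata may print a note about the noncount dependent variable
poisson ratio size, vce(robust)
```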

Best regards,

Joao
Rich Goldstein

Join Date: Mar 2014

Posts: 4464
#7

10 May 2016, 06:21

Joao is correct - however, your reference to "deflated" makes me nervous - are you using a ratio variable? if yes, you might want to do a search on the use of ratio variables - you will note that some people on this list, including me, are wary about many (but not all) uses of ratio variables
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#8

10 May 2016, 08:10

If you have both the actual count of patents and the continuous variable that is used as a denominator, I would use the count as the outcome variable in a Poisson regression and specify the continuous denominator variable in the -exposure()- option. If ultimately you need to report your effects on the rate scale, -margins- with the -expression()- option can calculate those for you after the regression itself.
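
A sketch of how that might look, on simulated data with hypothetical variable names (patents, rd, size). It assumes rd is strictly positive, since -exposure()- works with its logarithm; firms with zero R&D expense would need separate handling.

```stata
* Simulated data; all names and coefficients are hypothetical.
clear
set seed 99
set obs 1000
generate size    = rnormal()
generate rd      = exp(rnormal(2, 1))
generate patents = rpoisson(rd*exp(-2 + 0.4*size))
* Count outcome, with R&D expense entering as the exposure
poisson patents size, exposure(rd) vce(robust)
* Average predicted rate (patents per unit of rd) at chosen values of size
margins, at(size = (-1 0 1)) expression(predict(ir))
```

Here predict(ir) is the incidence rate, i.e. the predicted count per unit of exposure, so the -margins- output is on the rate scale rather than the count scale.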