  • Tobit model

    Dear Statalisters,

    I have never used a Tobit model and would appreciate some advice. Thank you!

    The dependent variable of my regression model has a lower bound of zero. It measures the number of patents a firm was granted in year t scaled by total dollars of research and development expense, i.e., how efficiently a firm uses R&D investment to generate patents. Many sample firms have zero patents in year t. I saw another paper with the same dependent variable use a Tobit model.

    I found the UCLA Stata help page on Tobit,
    http://www.ats.ucla.edu/stat/stata/dae/tobit.htm
    but I am not sure whether I need to set an upper bound as that page does.

    Below are the summary statistics of my dependent variable:
        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
            xvar |     51,036    .5973732    2.196713          0   21.91509


    What are the common practices for setting up a Tobit model in Stata?


    Thanks,
    Rochelle

  • #2
    If there is no upper limit, there is no need to specify one.

    That said, what you describe does not sound suitable for a tobit model. In a tobit model, the data are censored. For example, your outcome might be a measurement that has a lower limit of detection. This comes up fairly often in clinical studies, where the concentration of something in a specimen is reported as 0 whenever it is lower than, say, 0.3 because this particular assay cannot distinguish concentrations lower than 0.3. In that case, a tobit analysis could be appropriate and you would specify -ll(0.3)-. Stata would take that to mean that any value of the outcome variable that is lower than 0.3 is not actually precisely defined: it is some value known to be < 0.3, but not known exactly.
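
    As a minimal sketch of the assay example above (the variable names conc, dose, and age are hypothetical and do not come from this thread), the call might look like this:

    Code:
     * conc is reported as 0 whenever the true concentration is below the detection limit of 0.3
     * ll(0.3) tells Stata to treat every observation at or below 0.3 as left-censored
     tobit conc dose age, ll(0.3)

    No -ul()- option is given, so no upper censoring limit is assumed.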

    Your situation is a lower bound of zero patents. If I understand it, it is not the case that these observations represent firms that really do have patents, but the number of them is too small to ascertain precisely, so you use 0 as a shorthand for "too small to pin down." If I have that right, these zeroes do not represent censoring of the data. They are actual zero outcomes.

    My guess is that you would be better off analyzing these data with a count-outcome model such as Poisson or nbreg. It doesn't sound like a -tobit- to me.



    • #3
      Dear Clyde,

      Thanks again for your detailed answer! It is always so helpful.

      Your assessment is correct.

      Quote:
       If I understand it, it is not the case that these observations represent firms that really do have patents, but the number of them is too small to ascertain precisely, so you use 0 as a shorthand for "too small to pin down." If I have that right, these zeroes do not represent censoring of the data.

      One question about the Poisson model: my dependent variable is the count of patents divided by R&D expense, which is a ratio with zero as its lower bound. Would it be okay to use Poisson? My understanding is that Poisson requires the dependent variable to be a count.

      Rochelle



      • #4
        There is no problem when some counts are zero; that is perfectly legitimate. In fact, some people have too many zeros and need to use a special model. Stata has several such models, including -zip- (zero-inflated Poisson).
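
        A rough sketch of how the ordinary and zero-inflated fits might be set up, assuming a raw count variable patents and two covariates size and leverage (all three names are hypothetical, as is the choice of covariates in -inflate()-):

        Code:
         * ordinary Poisson regression on the raw patent count; zeros are allowed
         poisson patents size leverage
         * zero-inflated Poisson: inflate() lists the variables that predict excess zeros
         zip patents size leverage, inflate(size leverage)

        Whether the zero-inflated version is warranted depends on the data; this only illustrates the syntax.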



        • #5
          Thanks Rich!

          If the dependent variable is the raw count deflated by another continuous variable, can I still use a Poisson model?



          • #6
            That is not a problem. Poisson regression can be used with continuous data; see, for example, here.

            Best regards,

            Joao



            • #7
              Joao is correct. However, your reference to "deflated" makes me nervous. Are you using a ratio variable? If so, you might want to search for discussions on the use of ratio variables. You will note that some people on this list, including me, are wary of many (but not all) uses of ratio variables.



              • #8
                If you have both the actual count of patents and the continuous variable that is used as a denominator, I would use the count as the outcome variable in a Poisson regression and specify the continuous denominator variable in the -exposure()- option. If ultimately you need to report your effects on the rate scale, -margins- with the -expression()- option can calculate those for you after the regression itself.
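
                A minimal sketch of that approach, assuming the raw count is stored in patents and the denominator in rd_expense, with size and leverage as covariates (all four names are hypothetical). The -vce(robust)- option is an added assumption rather than part of the advice above, and -exposure()- needs strictly positive R&D expense because it enters the model as a logarithm:

                Code:
                 * model the raw count; exposure() adds ln(rd_expense) with its coefficient fixed at 1
                 poisson patents size leverage, exposure(rd_expense) vce(robust)
                 * average marginal effect of size on the rate scale (patents per R&D dollar)
                 * predict(ir) is the incidence rate exp(xb), i.e. the expected count when exposure is 1
                 margins, dydx(size) expression(predict(ir))

                Because the exposure coefficient is constrained to 1, the coefficients describe the log of expected patents per dollar of R&D, which is the quantity the ratio variable was meant to capture.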
