  • Continuous data with lots of zeros

    Hello

    I want to analyse a dependent variable (donations) as a function of some variables.
    The donations data have lots of zeros because most people donate nothing. Other than zeros, donations can range from 1 to a very high number (continuous data).

    What would be the best way to model the above? An inverse hyperbolic sine transform? If so, how should the coefficients be interpreted? Or are there specific models for zero-inflated data that are not counts?

    Thanks,
    Umar

  • #2
    Dear Umar Boodoo,

    You have corner-solution data. A convenient way to deal with this kind of data is to use Poisson regression with robust standard errors; this works even if the data are not counts.
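
    As a minimal sketch of that suggestion, assuming hypothetical variable names (donation for the outcome, income and age for the covariates; none of these come from the original question), the model could be fitted along these lines:

    Code:
    * donation, income and age are placeholder names, not from the original post
    poisson donation income age, vce(robust)
    * average marginal effects on the expected donation, in the units of donation
    margins, dydx(*)

    The model is for the conditional mean, E(donation | x) = exp(xb), so each coefficient is approximately the proportional change in the expected donation for a one-unit change in the covariate. The zeros enter the estimation as ordinary observations and need no transformation.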

    Best wishes,

    Joao

    • #3
      Another approach is using a two-part model to simultaneously model (a) the probability of a donation >0 and (b) the amount of the donation, conditional on it being >0.

      The introduction to two-part modeling in Stata is in this freely-available Stata Journal article

      https://www.stata-journal.com/articl...article=st0368

      and the latest update to the associated software can be found at the link in the output of

      Code:
      search st0368_2
      (Disclaimer: one of the authors was a friend and colleague with whom I worked in the 1970's when he first applied two-part modeling to health care utilization and expenditures.)
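
      For illustration, a two-part fit might look like the sketch below. The variable names (donation, income, age) are hypothetical, and the logit first part paired with a gamma GLM with log link in the second part is just one common choice, not a recommendation made in this thread.

      Code:
      * requires the user-written twopm command (see search st0368_2)
      * donation, income and age are placeholder names
      twopm donation income age, firstpart(logit) secondpart(glm, family(gamma) link(log))
      * marginal effects combining both parts
      margins, dydx(*)

      The overall expected donation is Pr(donation > 0 | x) times E(donation | donation > 0, x). As I understand the article, that combined quantity is what margins works with by default after twopm.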





      • #4
        Thanks, Joao and William. Very helpful (I have tried both). I have seen some public policy papers run a probit or logit first and then a tobit regression. Would that be an acceptable approach as well?

        • #5
          I would stay away from the Tobit, and there is no reason to use it in the second stage.

          Best wishes,

          Joao
