  • Tobit Model

    Hi all,


    Can we not use a dependent variable when it is censored? I got this answer from one of my friends:
    The fact that the data do not extend over "all" values does not necessarily suggest the Tobit model. The Tobit model applies only if the underlying relationship covers a range of values on y and x, but you only observe pairs of values (y, x) over a sub-range.

    What does it mean? Please help.

  • #2
    This is elementary statistics. Read a standard text. Don't rely on hearsay from your friends. If you have numbers above or below some specific number coded as that number (e.g., can't buy less than 0), then tobit. If you lose or don't observe y for some values, you're into sample selection.
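
    As a rough illustration of what "then tobit" means for values coded at a limit, here is a minimal sketch of a tobit log-likelihood with one regressor and left-censoring. The function names, the single-regressor form, and the default limit of 0 are assumptions made for this example; this is not how Stata's -tobit- is implemented.

```python
import math

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_loglik(y, x, intercept, slope, sigma, lower=0.0):
    """Log-likelihood of a one-regressor tobit model, left-censored at `lower`.

    Hypothetical helper for illustration only.
    """
    total = 0.0
    for yi, xi in zip(y, x):
        mu = intercept + slope * xi
        if yi <= lower:
            # Censored: we only know the latent value is at or below the limit.
            total += math.log(normal_cdf((lower - mu) / sigma))
        else:
            # Uncensored: ordinary normal density of the observed value.
            total += math.log(normal_pdf((yi - mu) / sigma) / sigma)
    return total
```

    Observations recorded at the limit contribute a probability (the chance of being at or below the limit), while the rest contribute a density, which is what separates this from ordinary regression on the recorded values.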



    • #3
      Hi

      Could you please explain that again, more slowly?



      • #4
        So when the observable range of a variable does not extend to the limits of its potential range, you may be dealing with censoring or with truncation. They are different. Censored variables may be dealt with using a -tobit- model, or using survival analysis techniques when the data is right censored (censored from above).

        So what is censoring? Censoring is when the true value of the variable is not observed, or not even observable, but it is known to be at least (right censoring) or at most (left censoring) a certain value. Perhaps some examples will make it simpler.

        You are measuring the concentration of some solute in a solution. The instrument doing the measurement is not able to accurately detect amounts below, say, 1 mg/dl, and any concentrations below that amount are instead reported as 1 mg/dl. In that case, the concentration measurement is left-censored at 1 mg/dl.

        You are observing people who have been treated for a cancer and measuring time until recurrence of the cancer. Some patients will die of other causes before a recurrence occurs. Some patients will simply drop out of your study for various reasons. Then for those patients you know that the time to recurrence is greater than or equal to the time they remained under observation in your study, but you don't know anything beyond that. These are right censored times.
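
        A small sketch of how such right-censored follow-up times are commonly recorded: an observed time plus a flag saying whether the recurrence was actually seen. The function name and the patient values (in months) are made up for illustration.

```python
def observe(recurrence_time, dropout_time):
    """Return (observed_time, recurred) for one patient.

    recurrence_time is the hypothetical true time to recurrence, which is
    unknown in practice for censored patients; dropout_time is when the
    patient left observation (death from another cause, withdrawal, ...).
    """
    if recurrence_time <= dropout_time:
        return recurrence_time, True   # recurrence was observed
    return dropout_time, False         # right censored: recurrence comes later

# (true recurrence time, time under observation), invented values
patients = [(14.0, 60.0), (72.0, 36.0), (30.0, 30.0)]
records = [observe(r, d) for r, d in patients]
print(records)  # [(14.0, True), (36.0, False), (30.0, True)]
```

        The second patient's record says only that the recurrence time is at least 36 months, which is exactly the "greater than or equal to" information described above.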

        You are working with a publicly released data set from a large population survey. In order to prevent data users from possibly identifying participants with unusual variable combinations, some extreme values are not reported. For example, it is common in this situation for age to be "top-coded" at 80. This means that anyone over the age of 80 is simply reported as being age 80. This is another example of right-censored data. You know the person is 80 or more, but you don't know the specific value.
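
        The detection-limit and top-coding examples can be mimicked in a few lines. This is only a sketch: the helper names and the simulated ages are assumptions for illustration, not real survey data.

```python
import random

def left_censor(values, limit):
    """Report any value below `limit` as `limit` (e.g. a detection limit)."""
    return [max(v, limit) for v in values]

def right_censor(values, limit):
    """Report any value above `limit` as `limit` (e.g. top-coding age at 80)."""
    return [min(v, limit) for v in values]

random.seed(1)
true_ages = [random.randint(18, 100) for _ in range(1000)]  # hypothetical ages
reported_ages = right_censor(true_ages, 80)

# Every respondent is still in the data set; only values above 80 are masked.
# Note that a reported 80 mixes genuine 80-year-olds with everyone older.
n_at_limit = sum(1 for a in reported_ages if a == 80)
print(len(reported_ages), n_at_limit)
```

        The number of observations is unchanged by censoring; what is lost is the exact value for the cases at the limit.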

        Censoring is different from truncation. Truncation arises when selection into the data set directly or indirectly excludes observations above (right truncated) or below (left truncated) some threshold. There is a procedure in Stata, -truncreg-, that is sometimes appropriate for modeling truncated data. Here are some examples of truncation:

        1. Only adults are eligible to participate in your study. The age distribution is then left-truncated at age 18. If senior citizens (age >65) are also excluded, then the age distribution is right-truncated at 65.

        2. You are studying a population of people who have been selected based on passing a certain examination. The exam score distribution is left truncated at the passing score.

        Note that with truncation, you actually know the values of the truncated variables for the entities in your study: the truncation arises from selection criteria that exclude part of the natural range of that variable. By contrast, censoring entails having incomplete information: the true value is known to be beyond some limit, but the exact value is not known.
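
        To make the contrast concrete, here is a minimal simulation that applies truncation and censoring to the same made-up exam scores. The distribution, the pass mark of 50, and the variable names are all assumptions chosen for illustration.

```python
import random
import statistics

random.seed(2)
# Hypothetical exam scores for a whole population of candidates.
scores = [random.gauss(60, 15) for _ in range(5000)]
PASS_MARK = 50  # assumed passing score

# Truncation: candidates below the pass mark never enter the data set,
# but every score that is present is the true score.
truncated = [s for s in scores if s >= PASS_MARK]

# Censoring: everyone stays in the data set, but scores below the limit
# are recorded as the limit itself.
censored = [max(s, PASS_MARK) for s in scores]

print(len(scores), len(truncated), len(censored))
print(statistics.mean(scores), statistics.mean(truncated), statistics.mean(censored))
```

        Truncation shrinks the sample and leaves the remaining values exact; censoring keeps the sample size and blurs the values at the limit. Both push the naive sample mean above the population mean here, which is why neither kind of data should simply be run through ordinary regression.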



        • #5
          Clyde, many thanks for such a lovely description. My dependent variable ranges from 0% to 50% in one model and from 50% to 100% in another. Do I have to observe the full range of X or Y before using a tobit model? As stated in this URL



          • #6
            If you carefully reread #4 you will realize that your question cannot be answered. It depends on why the dependent variable ranges from 0 to 50% in one and 50 to 100% in the other.

            Also, the link you posted is the home page of the Bauer institution; there is no relevant content on that page.
