  • probit/logit in truncated data with only 1s

    Dear all,

    I am trying to regress being a (blood) donor on education with control variables. The dependent variable is a dummy, equal to 1 if a donor and 0 otherwise. I would usually use probit or logit in Stata. However, my data has only donors and no information about non-donors (the value of the dependent variable equals 1 for all observations in the data). The data is thus truncated, as any information (dep var or ind var or control vars) about non-donors is completely missing. Stata has truncreg, but it does not apply to the binary/dummy dependent variable.

    I would appreciate any help.

    Thanks,
    Kang

  • #2
    So you want a regression method that applies when Y is a constant?



    • #3
      Technically, the average of Y would be 1. If Y varies, I would run truncreg in Stata. Since Y equals 0 or 1 in my model and I have only 1s in my data, I don't think I can run truncreg and am asking if there is any other solution?



      • #4
        If your data contains only donors, you cannot achieve any kind of analysis, regression-based or otherwise, that identifies attributes that distinguish (whether causally or not) donors from non-donors. The data simply do not have any information about that. All you can do with this data is provide descriptive statistics of the characteristics of your donor sample. And for all you know, everything you find when doing that would also be true of the non-donors. Or not.



        • #5
          Thank you, George and Clyde, for your response.
          That's what I thought, but I was hoping to find some way to handle my issue.



          • #6
            It is somewhat more than having data on only donors. Even if so, I would run truncreg as long as Y varies. However, in my data, Y = 1 for all observations.



            • #7
              If you have a separate dataset containing explanatory/exogenous/right-hand-side/x-variables for non-donors, then you can look at a case-control design: https://en.wikipedia.org/wiki/Case%E...3control_study
              ---------------------------------
              Maarten L. Buis
              University of Konstanz
              Department of history and sociology
              box 40
              78457 Konstanz
              Germany
              http://www.maartenbuis.nl
              ---------------------------------
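A minimal sketch of the case-control idea, in Python rather than Stata, using simulated data. Everything here is an assumption made for illustration: the single covariate `x`, the true coefficients, the population size, and the helper `fit_logit`. It pools all simulated donors (cases) with an equally sized sample of non-donors (controls) and fits an ordinary logit. Under this kind of outcome-based sampling the slope is estimated consistently, while the intercept is shifted by the sampling fractions and cannot be read as a population log-odds.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(xs, ys, iterations=15):
    """Newton-Raphson for the logit model P(y=1) = sigmoid(b0 + b1 * x)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p            # gradient w.r.t. intercept
            g1 += (y - p) * x      # gradient w.r.t. slope
            h00 += w               # entries of the negative Hessian
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical population: donor status depends on one covariate x.
rng = random.Random(12345)
TRUE_B0, TRUE_B1 = -2.0, 0.8
population = []
for _ in range(20000):
    x = rng.gauss(0.0, 1.0)
    y = 1 if rng.random() < sigmoid(TRUE_B0 + TRUE_B1 * x) else 0
    population.append((x, y))

# The agency file: donors only. The separate file: a sample of non-donors.
donor_x = [x for x, y in population if y == 1]
nondonor_x = [x for x, y in population if y == 0]
control_x = rng.sample(nondonor_x, len(donor_x))

# Pool cases and controls and fit an ordinary logit.
xs = donor_x + control_x
ys = [1] * len(donor_x) + [0] * len(control_x)
b0_hat, b1_hat = fit_logit(xs, ys)

print("slope estimate:", round(b1_hat, 3), "(true value 0.8)")
print("intercept estimate:", round(b0_hat, 3), "(not comparable to -2.0)")
```

The slope is the quantity of interest in the original question (the association between education and donating). Recovering the intercept, and hence predicted probabilities, would additionally require the population share of donors.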



              • #8
                Regression is based on variance. Your Y has no variance.
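To make the point about variance concrete, here is a minimal Python sketch, assuming an intercept-only logit and a hypothetical sample of 100 donors (the function name and sample size are illustrative only). When every outcome equals 1, the log-likelihood keeps increasing as the intercept grows, so there is no finite maximum likelihood estimate for a logit or probit to converge to.

```python
import math


def logit_loglik_all_ones(intercept, n):
    """Log-likelihood of an intercept-only logit when all n outcomes equal 1.

    Each observation contributes log(sigmoid(intercept)) = -log(1 + exp(-intercept)).
    """
    return n * -math.log1p(math.exp(-intercept))


# Evaluate the log-likelihood at increasingly large intercepts.
intercepts = (0.0, 2.0, 5.0, 10.0)
values = [logit_loglik_all_ones(b, n=100) for b in intercepts]

for b, ll in zip(intercepts, values):
    print(f"intercept = {b:5.1f}  log-likelihood = {ll:.4f}")
```

The values rise toward zero without ever reaching it, which is why estimation fails rather than returning a large but finite coefficient.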



                • #9
                  Do you have any idea as to why y does not vary? Do you have any external information? Could you make a statement like

                  “Studies of population A have shown as much as 20% of the population has y = 0. However, in our study of population B, we could not find a single case where y = 0.”
                  -------------------------------------------
                  Richard Williams, Notre Dame Dept of Sociology
                  StataNow Version: 19.5 MP (2 processor)

                  EMAIL: [email protected]
                  WWW: https://www3.nd.edu/~rwilliam



                  • #10
                    I am just using data compiled by an agency. That is, the agency has collected information about donors, and the dataset has a large amount of useful information about donors.
                    Yes, I think that I can find some studies on the % of donors in the population. However, it is not clear how such studies would be useful for my analysis.



                    • #11
                      As I see it, all you can do is collect data from a census microsample, ideally from the same area(s) as the data you have, and compare the population means to the means in your data. This is imperfect, since you don't have a clean control group, but it is at least doable (e.g., you might find that donors are, on average, older and more likely to be White than the population as a whole). But we already know all this (e.g., https://pubmed.ncbi.nlm.nih.gov/21805814/ ). If you have some measure of repeat donorship, then you may be able to get something from that.

                      Even if you have time of day of donorship, something interesting might pop up. Does age vary by time of day? Blood banks have to schedule events, so this could be helpful in that regard.
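The census comparison can be pushed one step further if an external estimate of the overall share of donors is available, as discussed earlier in the thread. A hedged sketch in Python, with entirely made-up numbers: by Bayes' rule, P(donor | group) = P(group | donor) × P(donor) / P(group). The group labels, the shares, and the 5% overall donor rate below are hypothetical, and the calculation only makes sense if the donor file and the census describe the same population and measure education the same way.

```python
def donation_rate_by_group(share_among_donors, share_in_population, overall_donor_rate):
    """Apply Bayes' rule group by group.

    share_among_donors:  P(group | donor), from the donor-only file
    share_in_population: P(group), from a census microsample
    overall_donor_rate:  P(donor), from an external study
    """
    rates = {}
    for group, share in share_among_donors.items():
        rates[group] = share * overall_donor_rate / share_in_population[group]
    return rates


# Hypothetical inputs, for illustration only.
donors = {"no_degree": 0.40, "degree": 0.60}
census = {"no_degree": 0.70, "degree": 0.30}
rates = donation_rate_by_group(donors, census, overall_donor_rate=0.05)

for group, rate in rates.items():
    print(f"{group}: implied donation rate = {rate:.3f}")
```

This gives descriptive rates by group, not a regression with controls, and it inherits any error in the external donor rate.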



                      • #12
                        Thanks for all suggestions and ideas. My intention was to see if there is any regression method for truncated data with a binary dependent variable and missing 0s, because we already know how to analyze truncated data with a continuous dependent variable (or a discrete dependent variable as long as it varies).



                        • #13
                          I like George's ideas. If possible, looking at the amount of time between donations could be very interesting. Some may come back after 8 weeks, some may come back after 6 months, some may not come back at all, at least not during the time span of the study. If I were a blood center I'd be interested in knowing how I can speed up the pace of returning donors.

                          Even if you don't have multiple records for each donor, a Q like "how often do you donate blood" could be a good DV.
                          -------------------------------------------
                          Richard Williams, Notre Dame Dept of Sociology
                          StataNow Version: 19.5 MP (2 processor)

                          EMAIL: [email protected]
                          WWW: https://www3.nd.edu/~rwilliam



                          • #14
                            The data does not always permit what we want, but may permit something.



                            • #15
                              A zero-truncated count model might be tempting too, data permitting. You only have people who donated at least once (i.e., you don't have nondonors), but do you have how many times they donated over, say, a year? You can donate every 8 weeks, so over the course of a year you can donate 6 or even 7 times.

                              Places always try to get new donors, but my guess is that repeat donors are what they rely on the most.
                              -------------------------------------------
                              Richard Williams, Notre Dame Dept of Sociology
                              StataNow Version: 19.5 MP (2 processor)

                              EMAIL: [email protected]
                              WWW: https://www3.nd.edu/~rwilliam
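A rough sketch of the zero-truncated count idea, in Python, for the simplest case with no covariates. It assumes donation counts follow a Poisson distribution observed only when the count is at least 1; the counts below and the function names are hypothetical. In that model the observed mean equals λ / (1 − e^(−λ)), and the maximum likelihood estimate of λ is the value that matches this expression to the sample mean. In Stata, `tpoisson` is the command intended for this kind of model with covariates.

```python
import math


def ztp_mean(lam):
    """Mean of a Poisson(lam) variable conditional on being at least 1."""
    return lam / -math.expm1(-lam)


def fit_ztp_lambda(counts, tol=1e-10):
    """Intercept-only MLE for a zero-truncated Poisson, found by bisection."""
    mean = sum(counts) / len(counts)
    if mean <= 1.0:
        raise ValueError("sample mean must exceed 1 for a finite estimate")
    # ztp_mean(lam) exceeds lam and increases in lam, so the root is in (0, mean).
    lo, hi = 1e-9, mean
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if ztp_mean(mid) < mean:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical yearly donation counts for six donors (no zeros by construction).
counts = [1, 1, 1, 2, 2, 3]
sample_mean = sum(counts) / len(counts)
lam_hat = fit_ztp_lambda(counts)

print("sample mean of observed counts:", round(sample_mean, 4))
print("estimated untruncated Poisson rate:", round(lam_hat, 4))
```

The estimated rate is below the observed mean because the observed sample excludes everyone with zero donations.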
