  • Two Step Binary Choice Regression Model: What type of regression model to use to avoid sample selection bias and autocorrelation?

    Hi,
    I have a two-step model (see below): everyone answers the question in the first step, and only some people answer the second step, depending on their response in the first.

    Step 1: Did you demand credit (1) or not (0)?
    Step 2: Were you able to obtain credit (1) or not (0)? (Only people who demanded credit (1) answer this)

    One way is to run two separate probit models, with the second estimated only for people who answered 1 in the first step. But one can argue that the second probit model might suffer from sample selection bias and autocorrelation. Can someone recommend which type of model would be most appropriate in this case, other than the Heckman approach? Finding an exclusion variable here is nearly impossible. Thanks.


  • #2
    Hi Fozan
    I do not think you will have an autocorrelation problem, since you do not appear to have time-series data. You will, as you describe, have a sample selection problem.
    Under a normality assumption, you can use heckprobit to estimate such a model. See "help heckprobit" for an example.
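    As a minimal sketch only, assuming hypothetical variable names (demand for step 1, obtain for step 2, covariates x1 and x2, and z1 as a variable assumed to affect demand but not approval), the call could look like this:

    ```stata
    * Hypothetical variable names; none of them come from the original post.
    * obtain : 1 if credit was obtained, 0 if not (step 2)
    * demand : 1 if credit was demanded, 0 if not (step 1)
    * z1     : assumed to enter only the selection (demand) equation
    heckprobit obtain x1 x2, select(demand = x1 x2 z1)
    ```

    The output includes an estimate of rho, the correlation between the errors of the two equations, together with a test of rho = 0. If z1 is dropped, the command can still run, but rho is then identified only through the functional form.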
    HTH
    Fernando


    • #3
      Hi Fernando,

      I am reluctant to use heckprob because it is extremely difficult to come up with a selection variable (one that explains first-step credit demand but not second-step credit exclusion), especially given the dataset that I am using. So I am thinking more along the lines of some kind of simultaneous equation model. Do you think a "sequential logit model" or a "nested logit model" would be a good fit in this case? Thanks


      • #4
        Hi Fozan
        You are correct: just as with the instrumental variable approach, the "heckman" family of estimators also requires an instrument that affects selection but not the outcome. I think, however, that any model you estimate will require that.
        While sequential logit is certainly an option, keep in mind that it would be similar to a heckprobit without an instrument, since the identification will come from the nonlinearities.
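        To make the point about nonlinearities concrete, here is the heckprobit log likelihood written out. The notation is my own assumption, not from the thread: d_i is the demand indicator, y_i the obtained-credit indicator, x_i and z_i the covariates of the outcome and selection equations, and rho the correlation between the two normal errors.

        ```latex
        \ln L =
          \sum_{d_i = 1,\; y_i = 1} \ln \Phi_2\!\left(x_i\beta,\; z_i\gamma;\; \rho\right)
          + \sum_{d_i = 1,\; y_i = 0} \ln \Phi_2\!\left(-x_i\beta,\; z_i\gamma;\; -\rho\right)
          + \sum_{d_i = 0} \ln\!\left[1 - \Phi\!\left(z_i\gamma\right)\right]

        % With \rho = 0 the bivariate normal CDF factors,
        % \Phi_2(a, b; 0) = \Phi(a)\,\Phi(b), so
        \ln L\big|_{\rho = 0} =
          \sum_{d_i = 1} \left[ y_i \ln \Phi(x_i\beta) + (1 - y_i) \ln \Phi(-x_i\beta) \right]
          + \sum_{i} \left[ d_i \ln \Phi(z_i\gamma) + (1 - d_i) \ln\!\left(1 - \Phi(z_i\gamma)\right) \right]
        ```

        With rho = 0 this is just two separate probits. When z_i contains the same variables as x_i (no exclusion restriction), rho is separated from beta only by the curvature of the bivariate normal, which is why such estimates tend to be fragile.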
        Bottom line: why don't you try both heckprobit and sequential logit? I think that, at the end of the day, they will provide very similar results.
        Fernando


        • #5
          Hi Fozan,

          as far as I understand, you have something like a tree. At the top is the question of whether an individual asks for credit or not. Then those who asked for credit are asked whether they obtained it or not. A sequential logit could help, but maybe a nested logit model could as well. Have you thought about this? I believe that the outcome at the second stage (having/not having credit) is correlated with the choice at the first stage. Hence, I suspect that there may be correlation in the errors.
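          As a rough sketch of the tree, assuming hypothetical variable names (demand, obtain, x1, x2) and a plain sequential logit with no parameters shared across steps and no unobserved heterogeneity, the likelihood factors by step, so each step can be fit as an ordinary logit:

          ```stata
          * Hypothetical variable names; a sketch, not a definitive specification.
          * Step 1: everyone, demand credit (1) or not (0)
          logit demand x1 x2
          predict p_demand, pr

          * Step 2: only those who demanded credit, obtained it (1) or not (0)
          logit obtain x1 x2 if demand == 1
          predict p_obtain_cond, pr

          * Unconditional probability of demanding and obtaining credit
          generate p_obtain = p_demand * p_obtain_cond
          ```

          Note that this treats the two steps as independent given the covariates, so it does not model the error correlation mentioned above.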

          You can get a quick idea of the differences between sequential and nested logit here:

          https://data.princeton.edu/wws509/stata/c6s4

          and here:

          https://www.bauer.uh.edu/rsusmel/phd/ec1-20.pdf

          Dario


          • #6
            Thanks, Fernando and Dario, for your valuable input. I just tried the sequential logit model and it seems to work. I am going to look a bit more into nested logit (I am not sure nested logit is the right kind of model in this case, as individuals don't really get to choose between the two options in step 2).
