
  • heckman first step vs probit

    While running -heckman- with the -twostep- option, I was getting the "convergence not achieved" message. Curiously, when I ran the first step "manually" I didn't hit the error. A bit of investigation revealed the likely culprit to be two sets of "dummy" predictors: one combination of the 2 variables had only 1 observation. But this made me realize that I don't fully understand what the -heckman- command is doing in the first step. So here's my main question: under what conditions does -heckman- not converge, while -probit- does?

    Here's some data you can test with. I run the estimation using -heckman- and "manually". The first step involves 2 sets of dummy predictors (i.x1 and i.x2). The first time through the loop you can see the betas match, as I expected. The second time through the loop I create the situation I was describing above. This causes the heckman (but not the probit) to complain.


    Code:
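    * make up some data
    clear
    set obs 1000
    g x1 = int(runiform(1,5))    // categorical predictor, levels 1-4
    g x2 = int(runiform(1,5))    // categorical predictor, levels 1-4
    g d = (runiform()>0.5)       // selection indicator
    g y =  runiform()            // outcome
    
    qui forv z = 1/2 {
        noi di _newline(2) "z=`z'"
        preserve
            est clear
            * second pass only: create an x1 level that has a single observation
            if `z'==2 replace x1 = 0 if _n==1
            noi ta x1 x2, m                         /* when `z'==2, notice the "single" observation */
            * first step by hand: probit, then the inverse Mills ratio
            probit d i.x1 i.x2
            predict probitxb, xb
            ge IMR = normalden(probitxb) / normal(probitxb)
            * second step: built-in two-step estimator vs. OLS on the selected sample
            eststo use_command: heckman      y    i.x1          , sel(d= i.x1 i.x2  ) twostep first
            eststo do_manually: reg         y     i.x1  IMR     if d==1
            noi estout ,     cells(    b(fmt(5))      se )        starlevels(* 0.10 ** 0.05 *** 0.01     )
        restore
    }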
    Two minor follow-up questions:
    1. What other adjustments does -heckman- make to the standard errors that I haven't made here? (Clearly I'm missing something.)
    2. Is it possible to display only the "main" estimates from the second stage of -heckman-? Sometimes I don't want to see the estimates for the first stage. Is there an -estout- option for this?

  • #2
    bump!



    • #3
      Hi Brian,
      This is a very interesting puzzle!
      If you look into the two regressions, -probit- complains as well, but it does so "quietly", fixing the problem before you even see what's going on.
      So what's happening?
      1. The convergence problem has to do with perfect prediction.
      Whenever x1==0, the model can predict failure perfectly.
      You will see this if you run the probit first. It fixes the problem by dropping the offending observation (now you have 999 obs instead of 1000).
      2. But -heckman- fails!
      -probit- would drop the observation, but -heckman- doesn't, or can't, because the outcome y is not missing when x1==0. So it is trying to keep all 1000 observations in the sample in its internal probit procedure.
      This causes non-convergence in the first step.

      Unfortunately, you would only know this if you look at the data carefully, which (I would guess) is why the -heckman- two-step is having problems.
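A minimal sketch of how one might look for the kind of perfect prediction described above, using the same simulated data as the original post. The seed and the helper variables d_min, d_max and no_variation are assumptions made up for illustration; they are not part of the thread's code.

```stata
* rebuild the simulated data from post #1 (the seed is an arbitrary assumption)
clear
set seed 12345
set obs 1000
generate x1 = int(runiform(1,5))
generate x2 = int(runiform(1,5))
generate d  = (runiform() > 0.5)
replace x1 = 0 in 1                 // the single-observation level

* flag levels of x1 within which d never varies (hypothetical helper variables)
bysort x1: egen d_min = min(d)
bysort x1: egen d_max = max(d)
generate byte no_variation = (d_min == d_max)
tabulate x1 no_variation

* compare the estimation sample probit keeps with the full data
probit d i.x1 i.x2
display "probit used " e(N) " of " _N " observations"
```

Any level flagged by no_variation is a candidate for perfect prediction, and the last line shows whether -probit- quietly used fewer observations than are in memory.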

      Regarding your minor questions:
      The SEs are corrected using a "longish" formula. You can get the exact formula in the manual (look at the two-step material in Methods and formulas for -heckman-).
      For esttab/estout, you can use "drop(eqname: )".
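A short sketch of the "drop(eqname: )" idea, assuming the -estout- package is installed (ssc install estout) and that the selection equation takes its name from the dependent variable given in select(), which is d here. That equation name is an assumption worth checking with -matrix list e(b)- before relying on it.

```stata
* simulated data as in post #1 (the seed is an arbitrary assumption)
clear
set seed 12345
set obs 1000
generate x1 = int(runiform(1,5))
generate x2 = int(runiform(1,5))
generate d  = (runiform() > 0.5)
generate y  = runiform()

eststo clear
eststo use_command: heckman y i.x1, select(d = i.x1 i.x2) twostep

* inspect the equation names in the coefficient vector before dropping any
matrix list e(b)

* drop every coefficient in the selection equation; note the trailing colon
estout use_command, cells(b(fmt(5)) se) drop(d:)
```

The trailing colon is what tells -estout- that d is an equation name rather than a coefficient name.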

      HTH



      • #4
        It seems like someone made a judgement: when there's perfect prediction, -probit- should still display estimates while -heckman- should not. I'm not saying that judgement is wrong, but it's not clear to me why that's the case. Are the heckman estimates more flawed than the probit ones in cases where perfect predictors exist? Can anyone help me understand?
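One way to probe this, sketched under the assumption that the explanation in #3 is right, is to restrict -heckman- to the observations that -probit- actually used and compare the return codes. The seed and the marker variable in_probit are made up for illustration, and whether the restricted run converges is something to check rather than a documented guarantee.

```stata
* simulated data as in post #1, including the single x1==0 observation
clear
set seed 12345
set obs 1000
generate x1 = int(runiform(1,5))
generate x2 = int(runiform(1,5))
generate d  = (runiform() > 0.5)
generate y  = runiform()
replace x1 = 0 in 1

* mark the observations probit keeps (hypothetical marker variable)
probit d i.x1 i.x2
generate byte in_probit = e(sample)

* capture keeps a convergence error from stopping the do-file; _rc records it
capture noisily heckman y i.x1 if in_probit, select(d = i.x1 i.x2) twostep
display "return code on probit's sample: " _rc

capture noisily heckman y i.x1, select(d = i.x1 i.x2) twostep
display "return code on all observations: " _rc
```

A return code of 0 means the command finished without error; a nonzero code on the second run but not the first would be consistent with the dropped observation being the source of the trouble.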

        Thanks for the "drop(eqname: )" tip. I had been forgetting the colon.



        • #5
          I think that is a question for technical services.
          And who knows, perhaps a question for the original heckman programmer.



          • #6
            Thanks Fernando. I'll ask them.
