  • Is it mandatory to select DV in Heckman?

    Dear Stata users,

    I have a case study in which I am trying to estimate the Heckman selection model in Stata. The commands in the Stata manual specify the dependent variable once, but the two-step dialog box asks me to select a dependent variable separately for the selection equation, and I am wondering whether that is mandatory. I am also confused by the ML method: if I put the same DV in the selection equation, it says it has been multiplied. In any case, I simply want to estimate a Heckman model with some additional independent variables in the selection equation. Could anyone help me with the right commands in Stata?

    Cheers,
    Azreen.

  • #2
    In the menu, you can generally ignore the check-box for the second dependent variable -- that's only useful if you have a dummy variable indicating missing data instead of actually having missing values on the dependent variable. What *is* important is that you have one or more independent variables in the selection model that are not in the substantive model.

    See -help heckman- and the examples provided there.

    Code:
    webuse womenwk, clear
    
    sum
    
    capture drop noWage
    * despite its name, noWage is 1 when wage is observed and 0 when it is missing
    gen noWage=1
    replace noWage=0 if wage==.
    
    *==========did not indicate second dependent variable
    heckman wage educ age, select(married children educ age)
    
    *==========if I used the check-box for the second dependent variable
    heckman wage educ age, select(noWage = married children educ age)
    I do find it strange that the estimates differ slightly.

    BTW, maybe this will help clear up that mysterious check-box:

    Code:
    *====filling in missing values, but have dummy to keep track of what should be missing
    replace wage=0 if wage==.
    *========this works
    heckman wage educ age, select(noWage = married children educ age)
    *========this barfs (no missing wages are left, so nothing marks the unselected cases)
    heckman wage educ age, select(married children educ age)
    Last edited by ben earnhart; 04 Nov 2014, 00:43.


    • #3
      ps. Why did they do it this odd way? I think it makes sense in a twisted way. If you have some cases that are missing due to your model (e.g., people not in the labor force) and others who simply refused to answer, then this seems to accommodate the situation.


      • #4
        The case I am currently working with is a cross-sectional dataset on the distribution of government funding across 483 counties. Some counties were funded (positive values) and some were not (0 values). There are no missing values: each county is either funded (positive) or not funded (0). Because some counties were funded and some were not, I want to estimate a Heckman selection model. The dependent variable is total funding, with various independent variables defined. If I ignore the selection dependent variable, I get an error saying that the model would go to OLS regression. But if I use the same dependent variable in the selection equation, with one added independent variable, I get some results. This is what happens when I use the two-step option. In general, I simply want to estimate the Heckman model (not necessarily two-step or ML) while keeping the same dependent variable in both equations. Can anyone tell me why this happens when I ignore the selection dependent variable check-box?

        Cheers,
        Azreen.


        • #5
          Run a summary. If the dependent variable (total funding) is zero for the unfunded counties, you have two options. Either set the zeros to missing (replace funding=. if funding==0) and run it with approach 1, with no depvar in the selection equation; or generate a new indicator variable as I showed you and use that as the depvar in the selection equation.
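
          A minimal sketch of both options, using the womenwk example data as a stand-in for the funding data (wage plays the role of total funding; wage_m and observed are made-up names for illustration):

          Code:
          webuse womenwk, clear
          * mimic the funding data: unobserved outcomes are coded 0 instead of missing
          replace wage = 0 if missing(wage)

          * option 1: turn the zeros back into missing values, no depvar in select()
          generate wage_m = wage
          replace wage_m = . if wage == 0
          heckman wage_m educ age, select(married children educ age)

          * option 2: keep the zeros and flag the observed cases (1 = observed, 0 = not)
          generate byte observed = wage > 0
          heckman wage educ age, select(observed = married children educ age)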


          • #6
            Just two questions in this case: 1) What is the interpretation of this two-equation Heckman estimation? 2) The above solution doesn't seem to work in the Heckman probit model. Is there any other way to avoid the selection dependent variable in the heckprob model?

            Cheers,
            Azreen.


            • #7
              Based on -help heckprob-, it has (mostly) the same options and the same basic syntax as regular -heckman-. You can either have missing values for missing data, in which case do not check the box for depvar and do not use the select(depvar = a b c) form; simply use select(a b c). Or you can have potentially valid values for missing data (e.g. 0 wages), in which case you need to create a variable flagging the missing values, check the box for depvar, and in syntax use select(depvar = a b c).
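
              A minimal sketch of the two forms with -heckprob-, assuming the school example data used in the Stata manual (private is the binary outcome and vote the selection indicator; private_m is a made-up name for illustration):

              Code:
              webuse school, clear

              * selection indicator given explicitly: select(depvar = a b c)
              heckprob private years logptax, select(vote = years loginc logptax)

              * outcome set to missing for the unselected cases: select(a b c)
              generate private_m = private if vote == 1
              heckprob private_m years logptax, select(years loginc logptax)
              Note that the outcome in -heckprob- is binary (0 versus any other nonmissing value), unlike the continuous outcome in -heckman-.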

              Interpretation I could help with, but is beyond the scope of this forum. This might sound rude, but if you don't know what a Heckman model is doing, you shouldn't be using it.
              Last edited by ben earnhart; 05 Nov 2014, 19:54.


              • #8
                BTW -- I apologize for refusing to help you interpret. But seriously, it is beyond the scope of this forum. If you get it to run, and get odd results, feel free to post them. When doing so, hit the "A" right above where you type in the forum, then a menu will open up. Choose the "#" button, which will give you code tags, put your output in the code tags. Then people can maybe see where you went wrong or right, or volunteer interpretations.


                • #9
                  Thanks Ben. But in my case, when I have attempted your second option (valid values for missing i.e. 0), I am getting the message [CODE]
                  [Fitting probit model:

                  outcome does not vary; remember:
                  0 = negative outcome,
                  all other nonmissing values = positive outcome
                  /CODE]. Can you please tell me why this is happening?

                  Cheers,
                  Azreen.


                  • #10
                    1) Run crosstabs of your "depvar" (the indicator of missingness) and your true dependent variable. While you're at it, run summarize on both.
                    2) post those results, plus the full output (commands and results) from your heckprob.

                    Make sure you put them in code tags so they're readable.


                    • #11
                      BTW --- when running your tab command for the two variables, include the "missing" option. That is "tab truedepvar depvar, mi"
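
                      A minimal sketch of that check on the womenwk example data. Because wage is continuous, it is first collapsed to a small indicator so the table stays readable (observed and wage_cat are made-up names for illustration):

                      Code:
                      webuse womenwk, clear
                      generate byte observed = !missing(wage)

                      * wage_cat: 0 = zero, 1 = nonzero, . = missing
                      generate byte wage_cat = (wage != 0) if !missing(wage)
                      tabulate wage_cat observed, missing
                      summarize wage observed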


                      • #12
                        Thanks Ben. First of all, when I run the heckprob model, I got the following:[CODE][Fitting probit model:

                        outcome does not vary; remember:
                        0 = negative outcome,
                        all other nonmissing values = positive outcome
                        r(2000);

                        /CODE]
                        Then I ran the command tab drrspending_total newdrr, mi, which shows that my newdrr (selection DV) is 1 in 272 cases (got funding) and 0 in 211 cases (no funding) out of 483 counties. Then I ran the summarize command. The tabulate command gave me a list by observation, which was not readable when I tried to put it in code tags. Can you tell me why heckprob is not giving me any results, given this information?

                        Cheers,
                        Azreen.


                        • #13
                          Please provide the exact command(s) you issued with the results. Somehow, the code tags didn't work for you... strange web browser? Anyway, try something like:

                          Code:
                            
                           heckman wage educ age, select(married children educ age)
                          since zero is not missing (refused) but truncated (got no funding). You don't need newdrr, and it might be messing you up.
                          Last edited by ben earnhart; 06 Nov 2014, 20:19.
