
  • Heckman selection models

    I need to use the Heckman selection model to study the pattern of private tutoring expenditure. I tried estimating it with the command
    heckman private_tut_exp course_fee e19 e23 e27 e28 e29 , select(private_tutoring = gender age_cat general_educationlevel technical_educationlevel d18 sector region e7 type_of_institution e15 mothers_educ b4 b7 b8 b9 e33 e8) twostep

    Stata reports that there are no observations, although the data set does contain observations. Could anyone help me with this?

    Thank you



  • #2
    This often happens when there is no base (comparison) group, though I am not sure that is the case here.

    Could you give us an extract of your data and tell us what the e and d variables mean?



    • #3
      It is unit-level data for India for the year 2019.
      Dependent variables: private tutoring expenditure, and the decision whether or not to take private tuition.
      Independent variables: the e's indicate whether the student receives the other items of expenditure (course fees, books, etc.) free or at subsidized rates, d18 is the disability variable, and the others are control variables.
      I would also like to know how to tell Stata to use a logit regression for the binary variable.



      • #4
        I suspect that you haven't left one category of the e's or d's out of the specification. It is hard to say without more details, but that's my assumption.
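
        A minimal sketch of how the base category could be handled, assuming that variables such as gender, region and sector are categorical with integer codes (the thread does not confirm this). Factor-variable notation (the i. prefix) builds the indicators and omits one base level automatically, so no full set of dummies enters the model:

        ```stata
        * Assumption: gender, region and sector are integer-coded categorical variables
        * Inspect the categories first, including any missing codes
        tabulate region, missing
        tabulate sector, missing

        * i. creates one indicator per level and drops a base level automatically
        probit private_tutoring i.gender i.region i.sector
        ```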

        Regarding your second remark, I am pretty sure the heckman command uses probit for the selection equation. Try the option
        Code:
        first
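
        As a sketch only, this is how the option could be added to the command from post #1. The variable names are copied from that post, and the locals xvars and zvars are just shorthand introduced here. The separate probit and logit lines illustrate a stand-alone binary model for the tutoring decision; they are not part of heckman:

        ```stata
        * Assumption: variable names are taken from post #1; run from a do-file (/// continues a line)
        local xvars course_fee e19 e23 e27 e28 e29
        local zvars gender age_cat general_educationlevel technical_educationlevel ///
            d18 sector region e7 type_of_institution e15 mothers_educ b4 b7 b8 b9 e33 e8

        * The first option also displays the first-step probit of the selection equation
        heckman private_tut_exp `xvars', select(private_tutoring = `zvars') twostep first

        * Stand-alone binary models for the decision to take private tutoring
        probit private_tutoring `zvars'
        logit  private_tutoring `zvars'
        ```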



        • #5
          You could also start with some basic summary statistics, for example:

          Code:
          sum private_tut_exp course_fee e19 e23 e27 e28 e29
          sum private_tut_exp course_fee e19 e23 e27 e28 e29 if private_tutoring 
          sum private_tutoring  gender age_cat general_educationlevel technical_educationlevel d18 sector region e7 type_of_institution e15 mothers_educ b4 b7 b8 b9 e33 e8
          and see which variables have missing values.
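
          A possible way to make the same check more directly, again assuming the variable names from post #1. The names nmiss_sel and nmiss_out are hypothetical helper variables created only for this check. If either count comes back as zero, that would be consistent with the "no observations" message:

          ```stata
          * Assumption: variable names are taken from post #1; run from a do-file
          local xvars course_fee e19 e23 e27 e28 e29
          local zvars gender age_cat general_educationlevel technical_educationlevel ///
              d18 sector region e7 type_of_institution e15 mothers_educ b4 b7 b8 b9 e33 e8

          * List the variables that contain missing values, with counts
          misstable summarize private_tut_exp private_tutoring `xvars' `zvars'

          * Number of missing values per observation in the selection equation
          egen byte nmiss_sel = rowmiss(private_tutoring `zvars')
          count if nmiss_sel == 0

          * Complete cases for the outcome equation among those taking tutoring
          egen byte nmiss_out = rowmiss(private_tut_exp `xvars')
          count if nmiss_sel == 0 & nmiss_out == 0 & private_tutoring == 1
          ```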
          F



          • #6
            Thank you



            • #7
              Hi

              heckman private_coaching i.gender i.age_cat i.general_educationlevel i.technical_educationlevel i.religion i.social_group i.free_education i.scholarship i.middaymeals course_fee books_stationary_uniform_charges transport_charges other_exp i.type_of_institution, select( private_tutoring = i.gender i.age_cat i.general_educationlevel i.technical_educationlevel i.religion i.social_group) twostep

              I have just renamed the d's and e's to make them easier to understand.

              This is the command I used. For the binary (selection) equation, the p-values and standard errors are all blank, and the coefficient values are also absurd.



              • #8
                It would be a lot easier for us to help if you could give us an extract of your data using
                Code:
                dataex
                and then report the results using code delimiters.
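
                For instance, a short extract could be produced along these lines. The choice of variables and the 20-row range are only illustrative, and the names are those used earlier in the thread:

                ```stata
                * If dataex is not found on an older installation, it is available from SSC:
                * ssc install dataex
                * Assumption: variable names as in posts #1 and #7; first 20 observations only
                dataex private_tut_exp private_tutoring course_fee gender age_cat in 1/20
                ```

                The output can then be pasted into the post as it is, so that others can recreate the example data.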



                • #9
                  This kind of result usually means that you are dealing with either near-perfect collinearity or near-perfect prediction.
                  It would probably be better to model everything step by step, in other words:
                  Estimate your probit model first,
                  obtain the inverse Mills ratio,
                  estimate the corrected OLS.

                  This will help in two ways:
                  1) it will show you what is going on behind the heckman command
                  2) it will take you through the process step by step, helping you catch possible errors.

                  Pay particular attention to the signs of the probit coefficients, and check whether any of your covariates is strongly associated with selection.
                  It also seems you do not have an IV for heckman (a variable that appears only in the selection equation), so you may be identifying everything through nonlinearities, which may be nonexistent in your specification.
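
                  A rough sketch of the manual steps, assuming the variable names from post #7 and a shortened covariate list for readability. The names zhat and imr are hypothetical. The standard errors from the last regression are not corrected for the estimated Mills ratio, so this is for diagnosis only and not a replacement for heckman:

                  ```stata
                  * Assumption: variable names are taken from post #7; run from a do-file
                  * Step 1: probit for the selection equation
                  probit private_tutoring i.gender i.age_cat i.general_educationlevel ///
                      i.technical_educationlevel i.religion i.social_group

                  * Step 2: inverse Mills ratio from the probit linear index
                  predict double zhat, xb
                  generate double imr = normalden(zhat) / normal(zhat)

                  * Step 3: OLS on the selected sample with the inverse Mills ratio added
                  regress private_coaching course_fee books_stationary_uniform_charges ///
                      transport_charges other_exp i.type_of_institution imr ///
                      if private_tutoring == 1
                  ```

                  If the probit in step 1 already shows blank standard errors or dropped categories, the problem lies in the selection equation and not in the outcome equation.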
                  F



                  • #10
                    [Attached image: dataex.JPG]

                    Is this what you are asking?

