  • Calculating Margins after Heckman

    Hi all,

    I fitted my model with -heckman-, using both the maximum likelihood estimator (MLE) and the two-step estimator. I have a few questions.

    For MLE:
    (1) Do the coefficients in the outcome equation represent the estimated marginal effects of the regressors in the underlying regression equation (as mentioned in the Stata -heckman- manual)?

    margins, dydx(varlist)

    The average marginal effects I got are exactly the same as the outcome-equation coefficients.

    (2) I also want to calculate the average marginal effects for the probit (selection) model ONLY, as the coefficients in the probit model are just the estimates that maximize the likelihood function.

    margins, dydx(varlist) predict(psel) noestimcheck

    I use the -noestimcheck- option because my independent variables are all factor variables; without it, -margins- shows "(not estimable)" in the table.

    Could anyone check what's wrong with my code? And what should I use for the two-step estimator?

    Thanks in advance.





  • #2
    Hi Xiaojin,

    Code:
    margins, dydx(*)
    will always reproduce the coefficients of the value (outcome) equation after heckman. Because you don't specify a predict() option, margins assumes that you want the marginal effects on the linear prediction of the value equation, i.e. xb is the default. There is nothing wrong with that, but if it is not what you want to estimate, you should say so by setting an appropriate predict() option. See heckman postestimation for the available options and which one matches what you want.
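
    A minimal sketch of the point above. It assumes Stata's shipped example dataset womenwk and its variables (wage, educ, age, married, children), which are for illustration only and are not the poster's data:

    Code:
    * Illustration only: womenwk is an example dataset, not the data in this thread
    webuse womenwk, clear
    heckman wage educ age, select(married children educ age)

    * Default: effects on the linear prediction of the outcome equation
    margins, dydx(*)
    * The same request with the default spelled out
    margins, dydx(*) predict(xb)
    * Effects on the probability of being observed (selection equation)
    margins, dydx(*) predict(psel)

    The first two calls should return the outcome-equation coefficients; the third changes the quantity being differentiated, so its results differ.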

    With respect to the second part of your question, you don't show the results from the command, so how do we know if anything went wrong?
    Alfonso Sanchez-Penalver



    • #3
      Hi Alfonso, thank you for your reply.
      Here is my code, followed by the results.

      Code:
       global demandOG "LnAnnualQoz LnAnnualPoz i.(household_income_R household_size_R age_and_presence_of_children_R female_head_age_R female_head_employment female_head_education_R female_head male_head race hispanic_origin) c.male_head#(male_head_age_R male_head_employment male_head_education_R)"
      
       global selectequation "prob_buy_OG=i.(household_income_R household_size_R age_and_presence_of_children_R female_head_age_R female_head_employment female_head_education_R female_head male_head race hispanic_origin) c.male_head#(male_head_age_R male_head_employment male_head_education_R)"
       
       *maximum likelihood estimator (MLE)
       heckman  $demandOG, select($selectequation)
      *selection equation (probit model) results
      Code:
      prob_buy_OG                       |
                     household_income_R |
                       $25,000-$49,999  |   .0653964    .022277     2.94   0.003     .0217343    .1090585
                       $50,000-$69,999  |    .116738   .0247397     4.72   0.000      .068249     .165227
                             >=$70,000  |   .1523812   .0238739     6.38   0.000     .1055893    .1991731
                                        |
                       household_size_R |
                                     2  |  -.0438284    .026309    -1.67   0.096    -.0953931    .0077363
                                     3  |  -.0946765   .0318624    -2.97   0.003    -.1571256   -.0322273
                                     4  |  -.0941101   .0373169    -2.52   0.012    -.1672499   -.0209703
                                     5  |  -.1065278   .0459023    -2.32   0.020    -.1964947    -.016561
                                    6+  |  -.0684814   .0538542    -1.27   0.204    -.1740336    .0370709
      *Using option predict(psel) ----Pr(yj observed)
      Code:
      margins, dydx(i.(household_income_R household_size_R)) predict(psel) atmean noestimcheck
      Code:
                         |            Delta-method
                         |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------------+----------------------------------------------------------------
      household_income_R |
        $25,000-$49,999  |   .0054737   .0018201     3.01   0.003     .0019063    .0090412
        $50,000-$69,999  |   .0102292    .002127     4.81   0.000     .0060602    .0143981
              >=$70,000  |   .0137816   .0020602     6.69   0.000     .0097436    .0178196
                         |
        household_size_R |
                      2  |  -.0042898   .0026268    -1.63   0.102    -.0094382    .0008586
                      3  |  -.0088822   .0030359    -2.93   0.003    -.0148324    -.002932
                      4  |  -.0088332   .0034977    -2.53   0.012    -.0156887   -.0019778
                      5  |  -.0098956   .0041431    -2.39   0.017    -.0180159   -.0017752
                     6+  |  -.0065667   .0050148    -1.31   0.190    -.0163955    .0032622
      ------------------------------------------------------------------------------------
      Note: dy/dx for factor levels is the discrete change from the base level.
      *using option -predict(xbsel)- linear prediction for selection equation
      Code:
       margins, dydx(i.(household_income_R household_size_R)) predict(xbsel) atmean noestimcheck
      Code:
      --------------------------------------------------------------------------------
                         |            Delta-method
                         |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------------+----------------------------------------------------------------
      household_income_R |
        $25,000-$49,999  |   .0653964    .022277     2.94   0.003     .0217343    .1090585
        $50,000-$69,999  |    .116738   .0247397     4.72   0.000      .068249     .165227
              >=$70,000  |   .1523812   .0238739     6.38   0.000     .1055893    .1991731
                         |
        household_size_R |
                      2  |  -.0438284    .026309    -1.67   0.096    -.0953931    .0077363
                      3  |  -.0946765   .0318624    -2.97   0.003    -.1571256   -.0322273
                      4  |  -.0941101   .0373169    -2.52   0.012    -.1672499   -.0209703
                      5  |  -.1065278   .0459023    -2.32   0.020    -.1964947    -.016561
                     6+  |  -.0684814   .0538542    -1.27   0.204    -.1740336    .0370709
      ------------------------------------------------------------------------------------
      Note: dy/dx for factor levels is the discrete change from the base level.
      These are the same as the regression coefficients.

      So I guess predict(psel), i.e. Pr(yj observed), gives me the marginal effects for the probit model?

      Thanks.

      Last edited by Xiaojin Wang; 03 Nov 2015, 10:53.



      • #4
        Yes, that is right. xbsel gives you the linear prediction of the probit side of the model, and psel is the same as if you used expression(normal(predict(xbsel))). Notice that it is not really a simple probit model any more, because its coefficients are estimated jointly with the value equation.
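
        The equivalence can be checked with a short sketch. The womenwk example dataset and its variable names are assumptions for illustration, not the model in this thread:

        Code:
        * Illustration only: example dataset shipped with Stata
        webuse womenwk, clear
        heckman wage educ age, select(married children educ age)

        * Effects on Pr(y observed) using the built-in statistic
        margins, dydx(*) predict(psel)
        * The same quantity built by hand from the selection index
        margins, dydx(*) expression(normal(predict(xbsel)))

        Both calls should report the same effects and standard errors.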
        Alfonso Sanchez-Penalver



        • #5
          Yes, you are right: I got the same results using expression(normal(predict(xbsel))). The probit part is solved.

          Now to the outcome equation in the Heckman two-step model. Some variables appear in both stages, and some papers (Saha et al., 1997; Vance, 2009) point out that for such common regressors the marginal effect (ME) has two parts, one from each stage. Do you have any idea how to calculate those MEs in Stata?
          Thanks.



          Saha, A., O. Capps, and P.J. Byrne. 1997. "Calculating marginal effects in dichotomous-continuous models." Applied Economics Letters 4:181-185.
          Vance, C. 2009. "Marginal effects and significance testing with Heckman's sample selection model: a methodological note." Applied Economics Letters 16:1415-1419.
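
          For reference, a sketch of what the two parts look like for a continuous regressor that appears in both equations. In the standard Heckman setup, E(y|y observed) = xb + rho*sigma*lambda(zg), with lambda(a) = normalden(a)/normal(a), so the derivative with respect to regressor k is b_k - g_k*rho*sigma*lambda(zg)*(zg + lambda(zg)). The womenwk example dataset, its variable names, and the new variables zg, imr and me_educ are assumptions for illustration; for factor variables margins reports discrete changes rather than this derivative.

          Code:
          * Illustration only: example dataset, not the data in this thread
          webuse womenwk, clear
          heckman wage educ age, select(married children educ age)

          * Selection index zg and the inverse Mills ratio lambda(zg)
          predict double zg, xbsel
          generate double imr = normalden(zg)/normal(zg)

          * Part 1: outcome coefficient. Part 2: effect through the selection term.
          * e(lambda) stores rho*sigma after heckman
          generate double me_educ = _b[wage:educ] - _b[select:educ]*e(lambda)*imr*(zg + imr)
          summarize me_educ if e(sample)

          * margins on ycond should give the same average, plus a standard error
          margins, dydx(educ) predict(ycond)

          The mean of me_educ and the margins estimate should agree up to numerical precision, since margins differentiates the same ycond prediction.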



          • #6
            After fitting the Heckman selection model by MLE, I tried to calculate marginal effects:

            Code:
            global demandOG "LnAnnualQoz LnAnnualPoz i.(household_income_R household_size_R age_and_presence_of_children_R female_head_age_R female_head_employment female_head_education_R female_head male_head race hispanic_origin) c.male_head#(male_head_age_R male_head_employment male_head_education_R)"
            
             global selectequation "prob_buy_OG=i.(household_income_R household_size_R age_and_presence_of_children_R female_head_age_R female_head_employment female_head_education_R female_head male_head race hispanic_origin) c.male_head#(male_head_age_R male_head_employment male_head_education_R)"
             
             *maximum likelihood estimator (MLE)
             heckman  $demandOG, select($selectequation)

            Code:
            Heckman selection model                         Number of obs      =    114624
            (regression model with sample selection)        Censored obs       =    109272
                                                            Uncensored obs     =      5352
            
                                                            Wald chi2(34)      =   1037.20
            Log likelihood =  -28363.2                      Prob > chi2        =    0.0000
            
            ---------------------------------------------------------------------------------------------------
                                              |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
            ----------------------------------+----------------------------------------------------------------
            LnAnnualQoz                       |
                                  LnAnnualPoz |   -.637085   .0216838   -29.38   0.000    -.6795846   -.5945855
                                              |
                           household_income_R |
                             $25,000-$49,999  |   .0267746   .0545942     0.49   0.624     -.080228    .1337773
                             $50,000-$69,999  |   .0970547   .0603385     1.61   0.108    -.0212066     .215316
                                   >=$70,000  |   .1830852   .0583721     3.14   0.002     .0686781    .2974923
                                              |
                             household_size_R |
                                           2  |  -.0980171   .0630345    -1.55   0.120    -.2215623    .0255282
                                           3  |  -.2097413   .0771144    -2.72   0.007    -.3608828   -.0585998
                                           4  |  -.1738518   .0901595    -1.93   0.054    -.3505612    .0028576
                                           5  |  -.1819024   .1103865    -1.65   0.099     -.398256    .0344512
                                          6+  |  -.1766631   .1287826    -1.37   0.170    -.4290724    .0757461
            *yexpected E(yj^* ), yj taken to be 0 where unobserved
            Code:
             
            .  margins, dydx(i.(household_income_R household_size_R)) predict(yexpected) atmean noestimcheck
            
            Conditional marginal effects                      Number of obs   =     114382
            Model VCE    : OIM
            
            Expression   : E(LnAnnualQoz*|Pr(prob_buy_OG)), predict(yexpected)
            dy/dx w.r.t. : 2.household_income_R 3.household_income_R 4.household_income_R 2.household_size_R
                           3.household_size_R 4.household_size_R 5.household_size_R 6.household_size_R
            at           : LnAnnualPoz     =   -1.129177 (mean)
            ...
                               |            Delta-method
                               |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
            -------------------+----------------------------------------------------------------
            household_income_R |
              $25,000-$49,999  |   .0157423   .0063953     2.46   0.014     .0032077    .0282769
              $50,000-$69,999  |   .0313566   .0074496     4.21   0.000     .0167558    .0459575
                    >=$70,000  |   .0448618   .0072553     6.18   0.000     .0306417    .0590819
                               |
              household_size_R |
                            2  |  -.0160376   .0092398    -1.74   0.083    -.0341473     .002072
                            3  |  -.0329158   .0106204    -3.10   0.002    -.0537314   -.0121002
                            4  |  -.0313451   .0122668    -2.56   0.011    -.0553875   -.0073027
                            5  |  -.0345034   .0145118    -2.38   0.017    -.0629461   -.0060608
                           6+  |  -.0254712   .0173801    -1.47   0.143    -.0595356    .0085932
            ------------------------------------------------------------------------------------
            Note: dy/dx for factor levels is the discrete change from the base level.
            *And with ycond----E(yj|yj observed)

            Code:
             margins, dydx(i.(household_income_R household_size_R)) predict(ycond) atmean noestimcheck
            
            Conditional marginal effects                      Number of obs   =     114382
            Model VCE    : OIM
            
            Expression   : E(LnAnnualQoz|Zg>0), predict(ycond)
            dy/dx w.r.t. : 2.household_income_R 3.household_income_R 4.household_income_R 2.household_size_R
                           3.household_size_R 4.household_size_R 5.household_size_R 6.household_size_R
            at           : LnAnnualPoz     =   -1.129177 (mean)
            ...
            ----------------------------------------------------------------------------------
                               |            Delta-method
                               |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
            -------------------+----------------------------------------------------------------
            household_income_R |
              $25,000-$49,999  |  -.0700707   .0427054    -1.64   0.101    -.1537716    .0136303
              $50,000-$69,999  |  -.0754621   .0470243    -1.60   0.109     -.167628    .0167038
                    >=$70,000  |  -.0417726   .0455505    -0.92   0.359      -.13105    .0475048
                               |
              household_size_R |
                            2  |  -.0335965   .0489085    -0.69   0.492    -.1294554    .0622623
                            3  |  -.0702855   .0601236    -1.17   0.242    -.1881255    .0475545
                            4  |  -.0352336   .0701963    -0.50   0.616    -.1728159    .1023488
                            5  |  -.0249131   .0857886    -0.29   0.772    -.1930557    .1432295
                           6+  |  -.0759021      .0996    -0.76   0.446    -.2711144    .1193103
            ------------------------------------------------------------------------------------
            Note: dy/dx for factor levels is the discrete change from the base level.
            They are very different. Does anyone know how to interpret these marginal effects? Thanks.





            • #7
              The partial effects on yexpected are the partial effects on the censored mean of the dependent variable, where the value is taken to be 0 when unobserved. The partial effects on ycond are the partial effects on the truncated mean, i.e. the mean only for those who actually have an observed value. The censored mean equals the probability of being observed times the truncated mean, so its partial effects combine changes in both pieces. I hope this clarifies your doubts.
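
              A small sketch of that relationship. The womenwk example dataset and the new variable names (p_obs, y_trunc, y_cens, y_check) are assumptions for illustration:

              Code:
              * Illustration only: example dataset, not the data in this thread
              webuse womenwk, clear
              heckman wage educ age, select(married children educ age)

              * Probability of being observed, truncated mean, censored mean
              predict double p_obs, psel
              predict double y_trunc, ycond
              predict double y_cens, yexpected

              * Censored mean rebuilt as probability times truncated mean
              generate double y_check = p_obs*y_trunc
              summarize y_cens y_check

              * The same identity carried through to the marginal effects
              margins, dydx(educ) predict(yexpected)
              margins, dydx(educ) expression(predict(psel)*predict(ycond))

              y_cens and y_check should match, and so should the two margins results. This is why the effects on yexpected are much smaller in magnitude than those on ycond when few observations are uncensored.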
              Last edited by Alfonso Sánchez-Peñalver; 03 Nov 2015, 14:23.
              Alfonso Sanchez-Penalver



              • #8
                It surely does. Thanks, Alfonso.



                • #9
                  Hi.

                  I fitted my model by maximum likelihood (MLE) with -heckman-.
                  The dependent variable (y) in the observation equations is log GVA per worker, and the selection variable is the type of enterprise (ml=1 or ml=0).

                  I am trying to estimate average marginal effects for three equations: the selection equation for ml=1, the observation equation for ml=1, and the observation equation for ml=0.

                  I have two questions.

                  First, I tried to obtain the average marginal effects for the observation equation for ml=1 and for the selection equation for ml=1 with the following steps:

                  Step 1: heckman y x, select(ml=x z)
                  Step 2: For the average marginal effects of the observation equation for ml=1: margins, dydx(*) predict(ycond) subpop(if ml==1)

                  Question 1: What is the command to obtain average marginal effects for the selection equation?

                  Second, to obtain the average marginal effects of the observation equation for ml=0, I followed the steps below.

                  Step 3: gen nml=1-ml
                  Step 4: heckman y x, select(nml=x z)
                  Step 5: margins, dydx(*) predict(ycond) subpop(if nml==1)

                  Question 2: Is this the right procedure to obtain average marginal effects for the observation equation for subsample ml==0?

                  Could anyone tell me if I am going in the right direction for my stated purpose?

                  Thanks in advance.
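
                  On Question 1, one possibility that follows from the earlier replies in this thread is to ask margins for effects on the selection probability with predict(psel). A sketch, using the placeholder names y, x, z and ml from the steps above:

                  Code:
                  * y, x, z and ml are the placeholder names from Step 1
                  heckman y x, select(ml = x z)
                  * Average marginal effects on Pr(ml = 1) over the estimation sample
                  margins, dydx(*) predict(psel)

                  This only addresses the selection equation; it says nothing about whether the re-coded model in Steps 3 to 5 is appropriate.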
