
  • Bootstrapping marginal effects in probit model

    Dear Stata members, it is nice to join you.

    I have the following results in Stata. I wanted to bootstrap the marginal effects for my model and took the steps below. However, I am not sure that the final command (. margins, dydx(*)) really uses the bootstrapped samples. Can you help?


    . probit immig HHSIZE agesq child lnmine marrge EDU EMP sharepast east khangai central UB

    Iteration 0: log likelihood = -5918.8745
    Iteration 1: log likelihood = -5372.6804
    Iteration 2: log likelihood = -5348.3457
    Iteration 3: log likelihood = -5348.2508
    Iteration 4: log likelihood = -5348.2508

    Probit regression Number of obs = 24635
    LR chi2(12) = 1141.25
    Prob > chi2 = 0.0000
    Log likelihood = -5348.2508 Pseudo R2 = 0.0964

    ------------------------------------------------------------------------------
    immig | Coef. Std. Err. z P>|z| [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    HHSIZE | -.1591801 .0105724 -15.06 0.000 -.1799015 -.1384586
    agesq | -.0003139 .0000206 -15.26 0.000 -.0003542 -.0002736
    child | .0394447 .0170369 2.32 0.021 .006053 .0728364
    lnmine | .022301 .0057576 3.87 0.000 .0110163 .0335856
    marrge | .2952087 .0332481 8.88 0.000 .2300437 .3603738
    EDU | .2187549 .0282795 7.74 0.000 .1633281 .2741818
    EMP | -.1023748 .029243 -3.50 0.000 -.15969 -.0450597
    sharepast | -1.328398 .1331313 -9.98 0.000 -1.58933 -1.067465
    east | .1624754 .0628192 2.59 0.010 .0393521 .2855987
    khangai | .2025574 .0522529 3.88 0.000 .1001437 .3049711
    central | -.1739832 .0576205 -3.02 0.003 -.2869174 -.0610491
    UB | .2516309 .0518075 4.86 0.000 .15009 .3531718
    _cons | -.8136612 .0686039 -11.86 0.000 -.9481223 -.6792001
    ------------------------------------------------------------------------------

    . margins, dydx(*)

    Average marginal effects Number of obs = 24635
    Model VCE : OIM

    Expression : Pr(immig), predict()
    dy/dx w.r.t. : HHSIZE agesq child lnmine marrge EDU EMP sharepast east khangai central UB

    ------------------------------------------------------------------------------
    | Delta-method
    | dy/dx Std. Err. z P>|z| [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    HHSIZE | -.0184734 .00125 -14.78 0.000 -.0209233 -.0160235
    agesq | -.0000364 2.43e-06 -15.00 0.000 -.0000412 -.0000317
    child | .0045777 .0019779 2.31 0.021 .000701 .0084544
    lnmine | .0025881 .0006689 3.87 0.000 .0012772 .003899
    marrge | .03426 .0038809 8.83 0.000 .0266536 .0418664
    EDU | .0253873 .0032944 7.71 0.000 .0189304 .0318441
    EMP | -.0118809 .0033961 -3.50 0.000 -.0185371 -.0052248
    sharepast | -.1541651 .0155968 -9.88 0.000 -.1847342 -.123596
    east | .0188558 .0072931 2.59 0.010 .0045617 .03315
    khangai | .0235075 .0060732 3.87 0.000 .0116042 .0354108
    central | -.0201913 .0066923 -3.02 0.003 -.033308 -.0070747
    UB | .0292026 .0060213 4.85 0.000 .017401 .0410042
    ------------------------------------------------------------------------------

    . probit immig HHSIZE agesq child lnmine marrge EDU EMP sharepast east khangai central UB, vce(bootstrap, reps(500))
    (running probit on estimation sample)

    Bootstrap replications (500)
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
    .................................................. 50
    .................................................. 100
    .................................................. 150
    .................................................. 200
    .................................................. 250
    .................................................. 300
    .................................................. 350
    .................................................. 400
    .................................................. 450
    .................................................. 500





    Probit regression Number of obs = 24635
    Replications = 500
    Wald chi2(12) = 751.94
    Prob > chi2 = 0.0000
    Log likelihood = -5348.2508 Pseudo R2 = 0.0964

    ------------------------------------------------------------------------------
    | Observed Bootstrap Normal-based
    immig | Coef. Std. Err. z P>|z| [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    HHSIZE | -.1591801 .0126795 -12.55 0.000 -.1840314 -.1343288
    agesq | -.0003139 .0000214 -14.65 0.000 -.0003559 -.0002719
    child | .0394447 .0196725 2.01 0.045 .0008872 .0780022
    lnmine | .022301 .0061473 3.63 0.000 .0102525 .0343495
    marrge | .2952087 .0366813 8.05 0.000 .2233147 .3671028
    EDU | .2187549 .0279829 7.82 0.000 .1639094 .2736005
    EMP | -.1023748 .0280807 -3.65 0.000 -.1574119 -.0473377
    sharepast | -1.328398 .1318705 -10.07 0.000 -1.586859 -1.069936
    east | .1624754 .0665041 2.44 0.015 .0321298 .292821
    khangai | .2025574 .0489888 4.13 0.000 .1065411 .2985737
    central | -.1739832 .0597339 -2.91 0.004 -.2910596 -.0569068
    UB | .2516309 .0510033 4.93 0.000 .1516661 .3515956
    _cons | -.8136612 .0713887 -11.40 0.000 -.9535805 -.6737419
    ------------------------------------------------------------------------------

    . margins, dydx(*)

    Average marginal effects Number of obs = 24635
    Model VCE : Bootstrap

    Expression : Pr(immig), predict()
    dy/dx w.r.t. : HHSIZE agesq child lnmine marrge EDU EMP sharepast east khangai central UB

    ------------------------------------------------------------------------------
    | Delta-method
    | dy/dx Std. Err. z P>|z| [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    HHSIZE | -.0184734 .0014711 -12.56 0.000 -.0213568 -.01559
    agesq | -.0000364 2.49e-06 -14.65 0.000 -.0000413 -.0000316
    child | .0045777 .0022822 2.01 0.045 .0001046 .0090508
    lnmine | .0025881 .0007119 3.64 0.000 .0011928 .0039834
    marrge | .03426 .0042176 8.12 0.000 .0259937 .0425263
    EDU | .0253873 .0033093 7.67 0.000 .0189011 .0318734
    EMP | -.0118809 .0032721 -3.63 0.000 -.0182941 -.0054678
    sharepast | -.1541651 .0151716 -10.16 0.000 -.1839008 -.1244294
    east | .0188558 .0077445 2.43 0.015 .0036769 .0340347
    khangai | .0235075 .005675 4.14 0.000 .0123848 .0346302
    central | -.0201913 .0069341 -2.91 0.004 -.033782 -.0066007
    UB | .0292026 .0059231 4.93 0.000 .0175935 .0408117
    ------------------------------------------------------------------------------

    .

  • #2
    It looks to me like they are referring to it. First, there's no reason it shouldn't. Second, the output even says

    Code:
    Average marginal effects Number of obs = 24635
    Model VCE : Bootstrap
    and the standard errors in the -margins- output are different from what they were after the original (non-bootstrap) probit regression.

    So is there some reason you think you're not getting what you asked for?
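
A minimal sketch of that comparison, assuming the nlsw88 example dataset shipped with Stata rather than the original poster's data (the variables union, grade, and south, the 50 replications, and the seed are illustrative choices only). The marginal effects themselves should be identical in both runs; only the standard errors change, because -margins- applies the delta method to whichever VCE the preceding -probit- stored.

```stata
sysuse nlsw88, clear

* Analytic (OIM) standard errors for the coefficients and the margins
probit union grade south
margins, dydx(*)

* Bootstrap VCE for the coefficients; -margins- then reuses that VCE
probit union grade south, vce(bootstrap, reps(50) seed(1234))
display "`e(vce)'"      // reports which VCE the model stored
margins, dydx(*)
```

The "Model VCE" line in the second -margins- header is the quickest check that the bootstrap VCE was picked up.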



    • #3
      Thank you, Clyde. Actually, I wanted to estimate bootstrapped standard errors for a -dprobit- model. How do I do that? The bootstrap command does not work with -dprobit-.



      • #4
        Amarjargal:
        why stick with the old-fashioned -dprobit- when -margins- can do it (better)?
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          Hey Carlo, thank you. I do not mind using -margins-, of course. I thought -dprobit- and -margins- were the same. Anyway, how do I find bootstrapped standard errors with -margins-? It looks like I have to write a macro here.



          • #6
            If all you care about are the average marginal effects, then why not just estimate a linear probability model? The average marginal effect gives you an estimate of how much the expected proportion increases for a unit change in x. That is exactly what a linear probability model gives you. The difference is that the average marginal effect first estimates a probit and then summarizes that model to get the marginal effects, while the linear probability model computes those effects directly from the data. So there are fewer ways for your model to be wrong if you estimate the linear probability model. Bootstrapping is then simple:

            Code:
            sysuse nlsw88
            
            // average marginal effect
            probit union grade i.race
            margins, dydx(*)
            
            // linear probability model
            regress union grade i.race, vce(robust)
            
            // bootstrap the effects
            regress union grade i.race, vce(bootstrap, reps(100))
            ---------------------------------
            Maarten L. Buis
            University of Konstanz
            Department of history and sociology
            box 40
            78457 Konstanz
            Germany
            http://www.maartenbuis.nl
            ---------------------------------
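
The comparison in #6 can be written out as follows. This is a sketch for regressors entered as continuous variables; the symbols $\beta$, $\gamma$, $\Phi$, and $\phi$ are notation introduced here, not taken from the thread.

```latex
% Probit: Pr(y_i = 1 | x_i) = \Phi(x_i'\beta), with \phi the standard normal density.
% Average marginal effect of regressor j:
\[
  \widehat{\mathrm{AME}}_j
  = \frac{1}{n}\sum_{i=1}^{n} \phi\!\left(x_i'\hat\beta\right)\hat\beta_j
  = \hat\beta_j \cdot \frac{1}{n}\sum_{i=1}^{n} \phi\!\left(x_i'\hat\beta\right)
\]
% Linear probability model: E(y_i | x_i) = x_i'\gamma, so the effect is the coefficient itself:
\[
  \frac{\partial\, E(y_i \mid x_i)}{\partial x_{ij}} = \gamma_j
\]
```

In the output from the first post, every dy/dx is about 0.116 times its probit coefficient (for example, -.0184734 / -.1591801 for HHSIZE), which is what the common scaling factor in the first expression implies when all regressors are entered without factor-variable notation.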



            • #7
              So you want to bootstrap the margins themselves, not apply margins to the bootstrap VCE from probit, is that it? OK. I'm not enough of a bootstrap expert to know if it makes sense to do that or not. But here's how you can do it:

              Code:
              capture program drop my_margins
              program define my_margins, eclass
                  probit immig HHSIZE agesq child lnmine marrge EDU EMP sharepast east khangai central UB
                  margins, dydx(*) post
                  exit
              end
              
              bootstrap _b, reps(100) seed(1234): my_margins
              Notes:
              1. Not tested.
              2. Evidently substitute your desired number of reps and random number seed.
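
For readers without the original data, the same idea can be sketched on the nlsw88 example dataset. This is an untested illustration under stated assumptions: the program name ame_probit, the variables union, grade, and south, the 50 replications, and the seed are all hypothetical choices, not part of the original post.

```stata
sysuse nlsw88, clear

capture program drop ame_probit
program define ame_probit, eclass
    * Refit the probit on whatever (resampled) data are in memory
    probit union grade south
    * -post- moves the average marginal effects into e(b),
    * so that -bootstrap- can collect them as _b
    margins, dydx(*) post
end

* Resample observations, rerun the program, and collect the AMEs each time
bootstrap _b, reps(50) seed(1234): ame_probit

* Percentile confidence intervals from the bootstrap distribution
estat bootstrap, percentile
```

Here the probit is re-estimated in every replication, so the reported standard errors come from the spread of the marginal effects across resamples rather than from the delta method.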

              Also, in Stata, we refer to named blocks of code that can be invoked by other code as programs, not macros. The term macro has a very different meaning in Stata.

              Added later: Crossed in cyberspace with #6. I think Maarten's approach makes more sense.
              Last edited by Clyde Schechter; 17 Mar 2016, 08:27.



              • #8
                Dear Maarten, I got your point, thank you. I did not know that -regress var_names, vce(robust)- is similar to -dprobit-.

                This is my first time writing a program in Stata, using Clyde's code. Thank you, Clyde.

                One more thing I need to clarify. My sample size is 24635. At first, I simply computed marginal effects for the probit model and presented the results at a conference. However, one participant suggested that I bootstrap the results because of large-sample issues. I also saw a 2016 paper with a similar sample size that used bootstrapping. Can you give me some comments on why a very large sample might need bootstrapping?

                Thanks in advance. I find myself very lucky to join this group.


                Amaraa



                • #9
                  Large sample sizes are not a problem, so they don't need any special treatment. The p-values and confidence intervals mean exactly what they should mean.

                  However, you do need to be aware that pretty much all effects will be statistically significant. That does not mean that those effects are substantively important; you have so much information in your data that you can easily detect substantively irrelevant deviations from an effect of 0. Bootstrap won't help you with that. Instead, what you need to do is just look at the effects and decide whether you find that size of effect substantively relevant. What is sometimes helpful is to compare the effects with an effect that is well known and considered to be large.
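
A small simulation can illustrate the point about significance in large samples. It is only a sketch: the sample size of one million, the true slope of 0.01, and the seed are made-up values chosen for illustration.

```stata
clear
set seed 1234
set obs 1000000

* x explains almost none of the variation in y: the true slope is only 0.01
generate x = rnormal()
generate y = 0.01*x + rnormal()

* With n this large the standard error is roughly 1/sqrt(n) = 0.001,
* so even this tiny slope will usually sit many standard errors from zero
regress y x
display "z = " _b[x]/_se[x]
```

The estimate can be clearly distinguishable from zero while still being too small to matter substantively, which is a judgment the p-value cannot make.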
                  ---------------------------------
                  Maarten L. Buis
                  University of Konstanz
                  Department of history and sociology
                  box 40
                  78457 Konstanz
                  Germany
                  http://www.maartenbuis.nl
                  ---------------------------------



                  • #10
                    Thank you, Maarten. Actually, all my coefficients are statistically significant, and for some reason I was hoping that bootstrapping would reassure me that those coefficients really are statistically significant.

                    Amaraa



                    • #11
                      Amarjargal:
                      with such a huge sample size, -bootstrap- cannot fail to confirm their statistical significance.
                      As Maarten highlighted, the issue is whether their statistical significance is also relevant in the empirical world.
                      Kind regards,
                      Carlo
                      (Stata 19.0)

