Bootstrapping marginal effects in probit model

Amarjargal Amartuvshin

Join Date: Mar 2016

Posts: 14
#1

15 Mar 2016, 19:02

Dear Stata members, it is nice to join you, fellows.

I have the following results in Stata. I wanted to bootstrap the marginal effects for my model and followed the steps below. But I am not sure whether the -margins, dydx(*)- command is really using the bootstrapped samples here (see the last command: . margins, dydx(*)). Could you help me, fellows?

. probit immig HHSIZE agesq child lnmine marrge EDU EMP sharepast east khangai central UB

Iteration 0: log likelihood = -5918.8745
Iteration 1: log likelihood = -5372.6804
Iteration 2: log likelihood = -5348.3457
Iteration 3: log likelihood = -5348.2508
Iteration 4: log likelihood = -5348.2508

Probit regression Number of obs = 24635
LR chi2(12) = 1141.25
Prob > chi2 = 0.0000
Log likelihood = -5348.2508 Pseudo R2 = 0.0964

------------------------------------------------------------------------------
immig | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
HHSIZE | -.1591801 .0105724 -15.06 0.000 -.1799015 -.1384586
agesq | -.0003139 .0000206 -15.26 0.000 -.0003542 -.0002736
child | .0394447 .0170369 2.32 0.021 .006053 .0728364
lnmine | .022301 .0057576 3.87 0.000 .0110163 .0335856
marrge | .2952087 .0332481 8.88 0.000 .2300437 .3603738
EDU | .2187549 .0282795 7.74 0.000 .1633281 .2741818
EMP | -.1023748 .029243 -3.50 0.000 -.15969 -.0450597
sharepast | -1.328398 .1331313 -9.98 0.000 -1.58933 -1.067465
east | .1624754 .0628192 2.59 0.010 .0393521 .2855987
khangai | .2025574 .0522529 3.88 0.000 .1001437 .3049711
central | -.1739832 .0576205 -3.02 0.003 -.2869174 -.0610491
UB | .2516309 .0518075 4.86 0.000 .15009 .3531718
_cons | -.8136612 .0686039 -11.86 0.000 -.9481223 -.6792001
------------------------------------------------------------------------------

. margins, dydx(*)

Average marginal effects Number of obs = 24635
Model VCE : OIM

Expression : Pr(immig), predict()
dy/dx w.r.t. : HHSIZE agesq child lnmine marrge EDU EMP sharepast east khangai central UB

------------------------------------------------------------------------------
| Delta-method
| dy/dx Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
HHSIZE | -.0184734 .00125 -14.78 0.000 -.0209233 -.0160235
agesq | -.0000364 2.43e-06 -15.00 0.000 -.0000412 -.0000317
child | .0045777 .0019779 2.31 0.021 .000701 .0084544
lnmine | .0025881 .0006689 3.87 0.000 .0012772 .003899
marrge | .03426 .0038809 8.83 0.000 .0266536 .0418664
EDU | .0253873 .0032944 7.71 0.000 .0189304 .0318441
EMP | -.0118809 .0033961 -3.50 0.000 -.0185371 -.0052248
sharepast | -.1541651 .0155968 -9.88 0.000 -.1847342 -.123596
east | .0188558 .0072931 2.59 0.010 .0045617 .03315
khangai | .0235075 .0060732 3.87 0.000 .0116042 .0354108
central | -.0201913 .0066923 -3.02 0.003 -.033308 -.0070747
UB | .0292026 .0060213 4.85 0.000 .017401 .0410042
------------------------------------------------------------------------------

. probit immig HHSIZE agesq child lnmine marrge EDU EMP sharepast east khangai central UB, vce(bootstrap, reps(500))
(running probit on estimation sample)

Bootstrap replications (500)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
.................................................. 50
.................................................. 100
.................................................. 150
.................................................. 200
.................................................. 250
.................................................. 300
.................................................. 350
.................................................. 400
.................................................. 450
.................................................. 500

Probit regression Number of obs = 24635
Replications = 500
Wald chi2(12) = 751.94
Prob > chi2 = 0.0000
Log likelihood = -5348.2508 Pseudo R2 = 0.0964

------------------------------------------------------------------------------
| Observed Bootstrap Normal-based
immig | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
HHSIZE | -.1591801 .0126795 -12.55 0.000 -.1840314 -.1343288
agesq | -.0003139 .0000214 -14.65 0.000 -.0003559 -.0002719
child | .0394447 .0196725 2.01 0.045 .0008872 .0780022
lnmine | .022301 .0061473 3.63 0.000 .0102525 .0343495
marrge | .2952087 .0366813 8.05 0.000 .2233147 .3671028
EDU | .2187549 .0279829 7.82 0.000 .1639094 .2736005
EMP | -.1023748 .0280807 -3.65 0.000 -.1574119 -.0473377
sharepast | -1.328398 .1318705 -10.07 0.000 -1.586859 -1.069936
east | .1624754 .0665041 2.44 0.015 .0321298 .292821
khangai | .2025574 .0489888 4.13 0.000 .1065411 .2985737
central | -.1739832 .0597339 -2.91 0.004 -.2910596 -.0569068
UB | .2516309 .0510033 4.93 0.000 .1516661 .3515956
_cons | -.8136612 .0713887 -11.40 0.000 -.9535805 -.6737419
------------------------------------------------------------------------------

. margins, dydx(*)

Average marginal effects Number of obs = 24635
Model VCE : Bootstrap

Expression : Pr(immig), predict()
dy/dx w.r.t. : HHSIZE agesq child lnmine marrge EDU EMP sharepast east khangai central UB

------------------------------------------------------------------------------
| Delta-method
| dy/dx Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
HHSIZE | -.0184734 .0014711 -12.56 0.000 -.0213568 -.01559
agesq | -.0000364 2.49e-06 -14.65 0.000 -.0000413 -.0000316
child | .0045777 .0022822 2.01 0.045 .0001046 .0090508
lnmine | .0025881 .0007119 3.64 0.000 .0011928 .0039834
marrge | .03426 .0042176 8.12 0.000 .0259937 .0425263
EDU | .0253873 .0033093 7.67 0.000 .0189011 .0318734
EMP | -.0118809 .0032721 -3.63 0.000 -.0182941 -.0054678
sharepast | -.1541651 .0151716 -10.16 0.000 -.1839008 -.1244294
east | .0188558 .0077445 2.43 0.015 .0036769 .0340347
khangai | .0235075 .005675 4.14 0.000 .0123848 .0346302
central | -.0201913 .0069341 -2.91 0.004 -.033782 -.0066007
UB | .0292026 .0059231 4.93 0.000 .0175935 .0408117
------------------------------------------------------------------------------

Clyde Schechter

Join Date: Apr 2014

Posts: 30080
#2

15 Mar 2016, 19:35

It looks to me like -margins- is using the bootstrap results. First, there's no reason it shouldn't. Second, the output even says

Code:

Average marginal effects                   Number of obs   =      24635
Model VCE    : Bootstrap

and the standard errors in the -margins- output are different from what they were after the original (non-bootstrap) probit regression.

So is there some reason you think you're not getting what you asked for?
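One way to check this directly is to look at what the estimation command stored. The following is only a minimal sketch: it assumes the nlsw88 example data that ships with Stata instead of the data in #1, and the covariates, number of replications, and seed are arbitrary placeholders.

```stata
sysuse nlsw88, clear

// probit with bootstrap standard errors for the coefficients
probit union grade ttl_exp, vce(bootstrap, reps(50) seed(1234))

// the stored results record which VCE the model was fitted with
display "`e(vce)'"

// -margins- takes that VCE and applies the delta method to it,
// so the point estimates stay the same and only the standard errors change
margins, dydx(*)
```

This matches what the output in #1 shows: the dy/dx column is identical in both -margins- tables, and only the standard errors differ.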
Amarjargal Amartuvshin

Join Date: Mar 2016

Posts: 14
#3

15 Mar 2016, 21:06

Thank you Clyde. Actually, I wanted to estimate bootstrapped standard errors for a -dprobit- model. How do I do that? The bootstrap option does not work with -dprobit-.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17706
#4

15 Mar 2016, 23:44

Amarjargal:
why stick with the old-fashioned -dprobit- when -margins- can do it (better)?

Kind regards,
Carlo
(Stata 19.0)
Amarjargal Amartuvshin

Join Date: Mar 2016

Posts: 14
#5

17 Mar 2016, 00:02

Hey Carlo, thank you. I do not mind -margins- of course. I thought -dprobit- and -margins- were the same. Anyway, how do I find bootstrapped standard errors with -margins-? It looks like I have to write a macro here.
Maarten Buis

Join Date: Mar 2014

Posts: 3449
#6

17 Mar 2016, 08:16

If all you care about are the average marginal effects, then why not just estimate a linear probability model? The average marginal effect gives you an estimate of how much the expected proportion increases for a unit change in x. That is exactly what a linear probability model gives you. The difference is that the average marginal effect first estimates a probit and then summarizes that model to get the marginal effects, while the linear probability model computes those effects directly from the data. So there are fewer ways for your model to be wrong if you estimate the linear probability model. Bootstrapping is then simple:

Code:

sysuse nlsw88

// average marginal effect
probit union grade i.race
margins, dydx(*)

// linear probability model
regress union grade i.race, vce(robust)

// bootstrap the effects
regress union grade i.race, vce(bootstrap, reps(100))
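
To make concrete what "summarizes that model" means, the average marginal effect of a continuous covariate in a probit can be reproduced by hand as the sample mean of normalden(xb) times the coefficient. This is only an illustrative sketch on the nlsw88 example data; the model with grade and ttl_exp and the names xb, me_grade, ame_margins and ame_hand are made up for the example.

```stata
sysuse nlsw88, clear

probit union grade ttl_exp

// average marginal effect of grade as reported by -margins-
margins, dydx(grade)
matrix M = r(b)
scalar ame_margins = M[1,1]

// the same quantity by hand: mean of normalden(xb)*b over the estimation sample
predict double xb if e(sample), xb
generate double me_grade = normalden(xb) * _b[grade]
summarize me_grade, meanonly
scalar ame_hand = r(mean)

display "AME from -margins-: " ame_margins
display "AME by hand:        " ame_hand

// linear probability model coefficient, for comparison
regress union grade ttl_exp, vce(robust)
display "LPM coefficient:    " _b[grade]
```

The first two numbers should agree up to numerical precision; the linear probability model coefficient is a separate estimate of the same kind of quantity and need not be identical.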

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------
Clyde Schechter

Join Date: Apr 2014

Posts: 30080
#7

17 Mar 2016, 08:25

So you want to bootstrap the margins themselves, not apply margins to the bootstrap VCE from probit, is that it? OK. I'm not enough of a bootstrap expert to know if it makes sense to do that or not. But here's how you can do it:

Code:

capture program drop my_margins
program define my_margins, eclass
    probit immig HHSIZE agesq child lnmine marrge EDU EMP sharepast east khangai central UB
    margins, dydx(*) post
    exit
end

bootstrap _b, reps(100) seed(1234): my_margins

Notes:
1. Not tested.
2. Evidently substitute your desired number of reps and random number seed.
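
For anyone who wants to try the same pattern on data they can load, here is a sketch on the nlsw88 example data. It is not the model from #1: the covariates, the 100 replications, and the seed are arbitrary, and dropping incomplete cases up front is an extra assumption made so that every resample is drawn only from usable observations.

```stata
sysuse nlsw88, clear

// assumption: restrict to complete cases before resampling
keep if !missing(union, grade, ttl_exp)

capture program drop my_margins
program define my_margins, eclass
    // refit the probit and recompute the average marginal effects
    // on whatever data are currently in memory (the bootstrap resample)
    quietly probit union grade ttl_exp
    quietly margins, dydx(*) post
end

// bootstrap the posted average marginal effects
bootstrap _b, reps(100) seed(1234): my_margins

// percentile confidence intervals from the same replications
estat bootstrap, percentile
```

Here the standard errors come from the spread of the average marginal effects across resamples, instead of from applying the delta method to a bootstrapped coefficient VCE as in #1.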

Also, in Stata, we refer to named blocks of code that can be invoked by other code as programs, not macros. The term macro has a very different meaning in Stata.

Added later: Crossed in cyberspace with #6. I think Maarten's approach makes more sense.

Last edited by Clyde Schechter; 17 Mar 2016, 08:27.
Amarjargal Amartuvshin

Join Date: Mar 2016

Posts: 14
#8

18 Mar 2016, 03:12

Dear Maarten, I got your point, thank you. I did not know that -regress var_names, vce(robust)- is similar to -dprobit-.

This is my first time writing a program in Stata, using Clyde's code. Thank you Clyde.

One more thing I need to clarify. My sample size is 24635. At first, I simply computed marginal effects for the probit model and presented the results at a conference. However, one participant suggested that I bootstrap the results because of large-sample issues. I also saw a 2016 paper with a similar sample size that used bootstrapping. Can you give me some comments on why a very large sample might need bootstrapping?

Thanks in advance. I find myself very lucky to join this group.

Amaraa
Maarten Buis

Join Date: Mar 2014

Posts: 3449
#9

18 Mar 2016, 04:38

Large sample sizes are not a problem, so they don't need any special treatment. The p-values and confidence intervals mean exactly what they should mean.

However, you do need to be aware that pretty much all effects will be statistically significant. That does not mean that those effects are substantively important; you have so much information in your data that you can easily detect substantively irrelevant deviations from an effect of 0. Bootstrap won't help you with that. Instead, what you need to do is just look at the effects and decide whether you find that size of effect substantively relevant. What is sometimes helpful is to compare the effects with an effect that is well known and considered to be large.
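
One possible way to look at the size of an effect is to put the average marginal effect next to the baseline probability of the outcome. This is only a sketch of that idea on the nlsw88 example data; the model and the scalar names ame and base are invented for the illustration, and what counts as a large effect remains a substantive judgment.

```stata
sysuse nlsw88, clear

probit union grade i.race

// average marginal effect of one more year of schooling
margins, dydx(grade)
matrix M = r(b)
scalar ame = M[1,1]

// baseline: share of union members in the estimation sample
summarize union if e(sample), meanonly
scalar base = r(mean)

display "AME of grade:         " %9.4f ame
display "baseline Pr(union):   " %9.4f base
display "AME relative to base: " %9.4f ame/base
```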

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------
Amarjargal Amartuvshin

Join Date: Mar 2016

Posts: 14
#10

19 Mar 2016, 04:09

Thank you Maarten. Actually, all my coefficients are statistically significant, and for some reason I was hoping that bootstrapping would reassure me that those coefficients really are statistically significant.

Amaraa
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17706
#11

19 Mar 2016, 08:01

Amarjargal:
with such a huge sample size, -bootstrap- cannot fail to confirm their statistical significance.
As Maarten highlighted, the issue is whether their statistical significance is also relevant in the empirical world.

Kind regards,
Carlo
(Stata 19.0)