Very big bootstrap standard errors for non linear combinations of parameters

Giorgia Estefani

Join Date: Mar 2019

Posts: 17
#1

07 Jan 2023, 01:25

Good morning, everybody. I have the following problem. I need to compute a nonlinear combination of two previously estimated regression parameters, along with its bootstrap standard error. I am running the following program, that apparently works fine. But the bootstrapped standard errors are implausibly (in my view) big. The problem gets worse if I increase the number of bootstrap replications and does not depend on the seed. It also gets worse if I increase the number at the exponent. Any insights? Many thanks in advance! G

*****This is the program:

use data.dta, clear

program define bootstr, rclass

    * Fit each model on its own subsample and store the results
    reg y1 x if sample1==3
    est store pred1

    reg y2 x if sample2==3
    est store pred2

    * Combine the two fits so both coefficients can be referenced together
    suest pred1 pred2, r

    local beta1=[pred1_mean]x
    local beta2=[pred2_mean]x

    display `beta1'
    display `beta2'

    * Note: this second suest call repeats the one above
    suest pred1 pred2, r
    return scalar comb = (([pred2_mean]x / [pred1_mean]x))^15

end

bootstr
bootstrap comb=r(comb), reps(1000) seed(123): bootstr

****The output is:

. bootstr

      Source |       SS           df       MS      Number of obs   =       700
-------------+----------------------------------   F(1, 742)       =    113.77
       Model |  89138.2261         1  89138.2261   Prob > F        =    0.0000
    Residual |  581349.365       742  783.489711   R-squared       =    0.1329
-------------+----------------------------------   Adj R-squared   =    0.1318
       Total |  670487.591       743   902.40591   Root MSE        =    27.991

------------------------------------------------------------------------------
          y1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .4987301   .0467574    10.67   0.000     .4069376    .5905226
       _cons |   24.23444   3.847356     6.30   0.000     16.68144    31.78744
------------------------------------------------------------------------------

      Source |       SS           df       MS      Number of obs   =       159
-------------+----------------------------------   F(1, 157)       =     10.73
       Model |  7734.32668         1  7734.32668   Prob > F        =    0.0013
    Residual |  113151.422       157  720.709693   R-squared       =    0.0640
-------------+----------------------------------   Adj R-squared   =    0.0580
       Total |  120885.748       158  765.099674   Root MSE        =    26.846

------------------------------------------------------------------------------
          y2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .3954244    .120707     3.28   0.001     .1570053    .6338435
       _cons |   32.63701   10.45339     3.12   0.002     11.98959    53.28442
------------------------------------------------------------------------------

Simultaneous results for pred1, pred2

                                                Number of obs     =        700

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
pred1_mean   |
           x |   .4987301   .0520144     9.59   0.000     .3967838    .6006764
       _cons |   24.23444   4.390566     5.52   0.000     15.62909    32.83979
-------------+----------------------------------------------------------------
pred1_lnvar  |
       _cons |   6.663758   .0529112   125.94   0.000     6.560054    6.767462
-------------+----------------------------------------------------------------
pred2_mean   |
           x |   .3954244   .1272472     3.11   0.002     .1460244    .6448244
       _cons |   32.63701   11.05678     2.95   0.003     10.96611    54.30791
-------------+----------------------------------------------------------------
pred2_lnvar  |
       _cons |   6.580236   .1209233    54.42   0.000     6.343231    6.817242
------------------------------------------------------------------------------
.49873011
.39542437

Simultaneous results for pred1, pred2

                                                Number of obs     =        700

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
pred1_mean   |
           x |   .4987301   .0520144     9.59   0.000     .3967838    .6006764
       _cons |   24.23444   4.390566     5.52   0.000     15.62909    32.83979
-------------+----------------------------------------------------------------
pred1_lnvar  |
       _cons |   6.663758   .0529112   125.94   0.000     6.560054    6.767462
-------------+----------------------------------------------------------------
pred2_mean   |
           x |   .3954244   .1272472     3.11   0.002     .1460244    .6448244
       _cons |   32.63701   11.05678     2.95   0.003     10.96611    54.30791
-------------+----------------------------------------------------------------
pred2_lnvar  |
       _cons |   6.580236   .1209233    54.42   0.000     6.343231    6.817242
------------------------------------------------------------------------------

. bootstrap comb=r(comb), reps(1000) seed(123): bootstr
(running bootstr on estimation sample)

Bootstrap replications (1000)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
.................................................. 50
.................................................. 100
.................................................. 150
.................................................. 200
.................................................. 250
.................................................. 300
.................................................. 350
.................................................. 400
.................................................. 450
.................................................. 500
.................................................. 550
.................................................. 600
.................................................. 650
.................................................. 700
.................................................. 750
.................................................. 800
.................................................. 850
.................................................. 900
.................................................. 950
.................................................. 1000

Bootstrap results                               Number of obs     =        700
                                                Replications      =      1,000

      command:  bootstr
         comb:  r(comb)

------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        comb |   .0307587   42.81222     0.00   0.999    -83.87965    83.94117
------------------------------------------------------------------------------

Last edited by Giorgia Estefani; 07 Jan 2023, 01:31.
Joseph Coveney

Join Date: Apr 2014

Posts: 4402
#2

07 Jan 2023, 03:54

Originally posted by Giorgia Estefani:

"I am running the following program, that apparently works fine."

Really?

If I read the regression table for your first model correctly, you have 743 total degrees of freedom (742 in the denominator of the model F statistic) with only 700 observations. How does that happen?

"But the bootstrapped standard errors are implausibly (in my view) big. The problem gets worse if I increase the number of bootstrap replications and does not depend on the seed. It also gets worse if I increase the number at the exponent. Any insights?"

Ratios have a distribution that is long-tailed and you're taking that and exaggerating it by an exponent of 15. You can see this both in the percentile bootstrap results and in summary statistics of individual replicates that can be recovered with the saving() option.

. version 17.0

. clear *

. // seedem
. set seed 159190306

. quietly set obs 700

. generate double x = runiform(0, 100)

. generate double y1 = rnormal(24.23444 + 0.4987301 * x, sqrt(783.489711))

. generate double y2 = rnormal(32.63701 + 0.3954244 * x, sqrt(720.709693))

. generate byte sample1 = 3

. generate byte sample2 = 3 * (_n <= 159)

. *
. * Begin here
. *
. program define bootEm, rclass
      version 17.0
      syntax , [nl]

      quietly regress y1 c.x if sample1 == 3
      estimates store pred1

      quietly regress y2 c.x if sample2 == 3
      estimates store pred2

      suest pred1 pred2, vce(robust)

      if "`nl'" != "" nlcom (rat15: (([pred2_mean]x / [pred1_mean]x))^15 ) // , noheader
      else {
          return scalar pred1 = [pred1_mean]x
          return scalar pred2 = [pred2_mean]x
          return scalar rat15 = (([pred2_mean]x / [pred1_mean]x))^15
      }
      estimates drop _all
  end

. bootEm , nl

Simultaneous results for pred1, pred2                      Number of obs = 700

------------------------------------------------------------------------------
             |               Robust
             | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
pred1_mean   |
           x |   .4552695   .0358197    12.71   0.000     .3850642    .5254749
       _cons |   27.07837   1.953795    13.86   0.000       23.249    30.90774
-------------+----------------------------------------------------------------
pred1_lnvar  |
       _cons |   6.627308    .054963   120.58   0.000     6.519582    6.735033
-------------+----------------------------------------------------------------
pred2_mean   |
           x |    .351735   .0725013     4.85   0.000     .2096351    .4938348
       _cons |    38.3869   4.049579     9.48   0.000     30.44988    46.32393
-------------+----------------------------------------------------------------
pred2_lnvar  |
       _cons |   6.654728   .1004442    66.25   0.000     6.457861    6.851595
------------------------------------------------------------------------------

       rat15: (([pred2_mean]x / [pred1_mean]x))^15

------------------------------------------------------------------------------
             | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       rat15 |   .0208547   .0683028     0.31   0.760    -.1130164    .1547259
------------------------------------------------------------------------------

. tempfile bs

. bootstrap ///
>     rat15 = r(rat15), ///
>         reps(400) nodots saving(`bs'): bootEm

Bootstrap results                                          Number of obs = 700
                                                           Replications  = 400

      Command: bootEm
        rat15: r(rat15)

------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
             | coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       rat15 |   .0208547   2.966973     0.01   0.994    -5.794305    5.836014
------------------------------------------------------------------------------

. estat bootstrap, bc percentile

Bootstrap results                               Number of obs     =        700
                                                Replications      =        400

      Command: bootEm
        rat15: r(rat15)

------------------------------------------------------------------------------
             |    Observed               Bootstrap
             | coefficient       Bias    std. err.  [95% conf. interval]
-------------+----------------------------------------------------------------
       rat15 |   .02085475    .803001   2.9669725    5.22e-06   10.90865   (P)
             |                                       4.66e-06   8.525005  (BC)
------------------------------------------------------------------------------
Key:  P: Percentile
     BC: Bias-corrected

. quietly use `bs', clear

. summarize rat15

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
       rat15 |        400    .8238558    2.966973   1.15e-07   29.38264

. exit

end of do-file
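
The long right tail can also be examined through the percentiles of the saved replicates and their logarithms. What follows is a minimal sketch and not part of the original post. It assumes the replicates were written to a permanent file (the name bs_reps is hypothetical) rather than to the tempfile used above, which is gone once the do-file ends.

```stata
* Hypothetical file name; assumes an earlier run along the lines of
*   bootstrap rat15 = r(rat15), reps(400) saving(bs_reps, replace): bootEm
use bs_reps, clear

* Percentiles, skewness and kurtosis of the replicates on the original scale
summarize rat15, detail

* On the log scale the fifteenth power is only a rescaling of the log ratio,
* so the replicates should look much less extreme there.
* ln() yields missing for any replicate that is zero or negative.
generate double ln_rat15 = ln(rat15)
summarize ln_rat15, detail
histogram ln_rat15
```

If the median of rat15 sits near the observed coefficient while the mean and standard deviation are driven by a few very large replicates, that is the long tail described above.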
Giorgia Estefani

Join Date: Mar 2019

Posts: 17
#3

07 Jan 2023, 08:35

Yes, sorry. I made a copy-and-paste error from Stata. This is the correct output:

. bootstr

      Source |       SS           df       MS      Number of obs   =       744
-------------+----------------------------------   F(1, 742)       =    113.77
       Model |  89138.2261         1  89138.2261   Prob > F        =    0.0000
    Residual |  581349.365       742  783.489711   R-squared       =    0.1329
-------------+----------------------------------   Adj R-squared   =    0.1318
       Total |  670487.591       743   902.40591   Root MSE        =    27.991

------------------------------------------------------------------------------
          y1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .4987301   .0467574    10.67   0.000     .4069376    .5905226
       _cons |   24.23444   3.847356     6.30   0.000     16.68144    31.78744
------------------------------------------------------------------------------

      Source |       SS           df       MS      Number of obs   =       221
-------------+----------------------------------   F(1, 219)       =     18.20
       Model |  12100.8976         1  12100.8976   Prob > F        =    0.0000
    Residual |   145581.03       219  664.753562   R-squared       =    0.0767
-------------+----------------------------------   Adj R-squared   =    0.0725
       Total |  157681.928       220  716.736035   Root MSE        =    25.783

------------------------------------------------------------------------------
          y2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .4164597     .09761     4.27   0.000     .2240844     .608835
       _cons |   30.31819    8.50991     3.56   0.000     13.54639       47.09
------------------------------------------------------------------------------

Simultaneous results for pred1, pred2

                                                Number of obs     =        806

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
pred1_mean   |
           x |   .4987301   .0520117     9.59   0.000     .3967891    .6006711
       _cons |   24.23444   4.390339     5.52   0.000     15.62954    32.83935
-------------+----------------------------------------------------------------
pred1_lnvar  |
       _cons |   6.663758   .0529085   125.95   0.000     6.560059    6.767457
-------------+----------------------------------------------------------------
pred2_mean   |
           x |   .4164597   .1041227     4.00   0.000      .212383    .6205364
       _cons |   30.31819   9.163427     3.31   0.001     12.35821    48.27818
-------------+----------------------------------------------------------------
pred2_lnvar  |
       _cons |   6.499416   .1043294    62.30   0.000     6.294935    6.703898
------------------------------------------------------------------------------
.49873011
.41645969

Simultaneous results for pred1, pred2

                                                Number of obs     =        806

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
pred1_mean   |
           x |   .4987301   .0520117     9.59   0.000     .3967891    .6006711
       _cons |   24.23444   4.390339     5.52   0.000     15.62954    32.83935
-------------+----------------------------------------------------------------
pred1_lnvar  |
       _cons |   6.663758   .0529085   125.95   0.000     6.560059    6.767457
-------------+----------------------------------------------------------------
pred2_mean   |
           x |   .4164597   .1041227     4.00   0.000      .212383    .6205364
       _cons |   30.31819   9.163427     3.31   0.001     12.35821    48.27818
-------------+----------------------------------------------------------------
pred2_lnvar  |
       _cons |   6.499416   .1043294    62.30   0.000     6.294935    6.703898
------------------------------------------------------------------------------

. bootstrap comb=r(comb), reps(1000) seed(123): bootstr
(running bootstr on estimation sample)

Bootstrap replications (1000)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
.................................................. 50
.................................................. 100
.................................................. 150
.................................................. 200
.................................................. 250
.................................................. 300
.................................................. 350
.................................................. 400
.................................................. 450
.................................................. 500
.................................................. 550
.................................................. 600
.................................................. 650
.................................................. 700
.................................................. 750
.................................................. 800
.................................................. 850
.................................................. 900
.................................................. 950
.................................................. 1000

Bootstrap results                               Number of obs     =        806
                                                Replications      =      1,000

      command:  bootstr
         comb:  r(comb)

------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        comb |   .0669284   317.6453     0.00   1.000    -622.5064    622.6403
------------------------------------------------------------------------------
Giorgia Estefani

Join Date: Mar 2019

Posts: 17
#4

07 Jan 2023, 08:37

Many thanks for the intuition! Do you have any suggestions about how I can solve the problem?
Joseph Coveney

Join Date: Apr 2014

Posts: 4402
#5

08 Jan 2023, 02:06

Originally posted by Giorgia Estefani:

"Do you have any suggestions about how I can solve the problem?"

Did you notice the usage of nlcom after suest?

Your code fits two linear regression models, each with a single continuous predictor, to two outcome variables in two partially overlapping sets of observations, and then takes the fifteenth power of the ratio of the regression coefficients. (The usage of suest here is unnecessary, because you don't use its adjusted standard errors as nlcom would; instead you bootstrap the exponentiated ratio itself.) I don't know what your research objective is or what question comb and its standard error are supposed to answer. What you're doing baffles me; I haven't seen anything like it before. But if you're unhappy with the bootstrap estimate of the sampling distribution of the fifteenth power of this ratio, then you might want to look into an alternative tack.
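
For reference, nlcom obtains its standard error by the delta method, using the joint covariance matrix that suest provides. The derivation below is a sketch added for illustration; the symbols b_1, b_2 and g are notation introduced here and do not appear in the original posts.

```latex
% Notation introduced here (an assumption for illustration):
%   b_1 = [pred1_mean]x,  b_2 = [pred2_mean]x,  g = (b_2 / b_1)^{15}
\[
\frac{\partial g}{\partial b_2} = \frac{15\,g}{b_2},
\qquad
\frac{\partial g}{\partial b_1} = -\frac{15\,g}{b_1}
\]
\[
\widehat{\mathrm{SE}}(g) \approx 15\,|g|
\sqrt{\frac{\widehat{\mathrm{Var}}(b_1)}{b_1^{2}}
    + \frac{\widehat{\mathrm{Var}}(b_2)}{b_2^{2}}
    - \frac{2\,\widehat{\mathrm{Cov}}(b_1,b_2)}{b_1 b_2}}
\]
```

With the simulated data in post #2 (b_1 = .4552695 with standard error .0358197, b_2 = .351735 with standard error .0725013, g = .0208547) and the covariance term ignored, this gives roughly 15 × .0208547 × sqrt(.0787² + .2061²) ≈ .069, close to the .0683028 that nlcom reports there. The factor of 15 shows why a larger exponent inflates the standard error relative to the estimate. The delta method is a linear approximation, which may be one reason its result differs so much from the bootstrap standard error of 2.966973 in the same example.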
Giorgia Estefani

Join Date: Mar 2019

Posts: 17
#6

08 Jan 2023, 09:44

Thanks. Yes, I also did it with nlcom, which gives the same results.
Joseph Coveney

Join Date: Apr 2014

Posts: 4402
#7

09 Jan 2023, 02:41

You're welcome. OK, so it appears that there's nothing about comb that makes it unsuitable for bootstrapping, and there's no problem to be solved as far as estimating its standard error is concerned. It looks as if your best bet is to look into that alternative tack and take a different approach to answering your research question.
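
The thread does not say what the alternative approach would be. Purely as an illustration, and as an assumption rather than anything recommended in the posts above, one option is to bootstrap the logarithm of the ratio and transform the limits back afterwards. The program name bootLnRatio and the result name lnrat are hypothetical.

```stata
* Sketch only. bootLnRatio and lnrat are hypothetical names; y1, y2, x,
* sample1 and sample2 are the variables from the original post.
capture program drop bootLnRatio
program define bootLnRatio, rclass
    tempname b1 b2
    quietly regress y1 x if sample1 == 3
    scalar `b1' = _b[x]
    quietly regress y2 x if sample2 == 3
    scalar `b2' = _b[x]
    * ln() returns missing when the ratio is zero or negative
    return scalar lnrat = ln(`b2' / `b1')
end

bootstrap lnrat = r(lnrat), reps(1000) seed(123): bootLnRatio

* Point estimate and normal-based 95% limits computed on the log scale,
* then mapped back to the fifteenth power of the ratio
display exp(15 * _b[lnrat])
display exp(15 * (_b[lnrat] - invnormal(0.975) * _se[lnrat]))
display exp(15 * (_b[lnrat] + invnormal(0.975) * _se[lnrat]))
```

Because the limits are formed before exponentiating, they stay positive and are asymmetric around the point estimate. Whether an interval of this kind answers the research question is a separate matter that the thread leaves open.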