  • Very big bootstrap standard errors for nonlinear combinations of parameters

    Good morning, everybody. I have the following problem. I need to compute the nonlinear combination of two previously estimated regression parameters and its bootstrap standard errors. I am running the following program, that apparently works fine. But the bootstrapped standard errors are implausibly (in my view) big. The problem gets worse if I increase the number of bootstrap replications and does not depend on the seed. It also gets worse if I increase the number at the exponent. Any insights? Many thanks in advance! G

    *****This is the program:

    use data.dta, clear

    program define bootstr, rclass

    reg y1 x if sample1==3
    est store pred1

    reg y2 x if sample2==3
    est store pred2

    suest pred1 pred2, r

    local beta1=[pred1_mean]x
    local beta2=[pred2_mean]x

    display `beta1'
    display `beta2'

    suest pred1 pred2, r
    return scalar comb = (([pred2_mean]x / [pred1_mean]x))^15

    end

    bootstr
    bootstrap comb=r(comb), reps(1000) seed(123): bootstr

    ****The output is:

    . bootstr

    Source | SS df MS Number of obs = 700
    -------------+---------------------------------- F(1, 742) = 113.77
    Model | 89138.2261 1 89138.2261 Prob > F = 0.0000
    Residual | 581349.365 742 783.489711 R-squared = 0.1329
    -------------+---------------------------------- Adj R-squared = 0.1318
    Total | 670487.591 743 902.40591 Root MSE = 27.991

    ------------------------------------------------------------------------------
    y1 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    x | .4987301 .0467574 10.67 0.000 .4069376 .5905226
    _cons | 24.23444 3.847356 6.30 0.000 16.68144 31.78744
    ------------------------------------------------------------------------------

    Source | SS df MS Number of obs = 159
    -------------+---------------------------------- F(1, 157) = 10.73
    Model | 7734.32668 1 7734.32668 Prob > F = 0.0013
    Residual | 113151.422 157 720.709693 R-squared = 0.0640
    -------------+---------------------------------- Adj R-squared = 0.0580
    Total | 120885.748 158 765.099674 Root MSE = 26.846

    ------------------------------------------------------------------------------
    y2 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    x | .3954244 .120707 3.28 0.001 .1570053 .6338435
    _cons | 32.63701 10.45339 3.12 0.002 11.98959 53.28442
    ------------------------------------------------------------------------------

    Simultaneous results for pred1, pred2

    Number of obs = 700

    ------------------------------------------------------------------------------
    | Robust
    | Coef. Std. Err. z P>|z| [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    pred1_mean |
    x | .4987301 .0520144 9.59 0.000 .3967838 .6006764
    _cons | 24.23444 4.390566 5.52 0.000 15.62909 32.83979
    -------------+----------------------------------------------------------------
    pred1_lnvar |
    _cons | 6.663758 .0529112 125.94 0.000 6.560054 6.767462
    -------------+----------------------------------------------------------------
    pred2_mean |
    x | .3954244 .1272472 3.11 0.002 .1460244 .6448244
    _cons | 32.63701 11.05678 2.95 0.003 10.96611 54.30791
    -------------+----------------------------------------------------------------
    pred2_lnvar |
    _cons | 6.580236 .1209233 54.42 0.000 6.343231 6.817242
    ------------------------------------------------------------------------------
    .49873011
    .39542437

    Simultaneous results for pred1, pred2

    Number of obs = 700

    ------------------------------------------------------------------------------
    | Robust
    | Coef. Std. Err. z P>|z| [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    pred1_mean |
    x | .4987301 .0520144 9.59 0.000 .3967838 .6006764
    _cons | 24.23444 4.390566 5.52 0.000 15.62909 32.83979
    -------------+----------------------------------------------------------------
    pred1_lnvar |
    _cons | 6.663758 .0529112 125.94 0.000 6.560054 6.767462
    -------------+----------------------------------------------------------------
    pred2_mean |
    x | .3954244 .1272472 3.11 0.002 .1460244 .6448244
    _cons | 32.63701 11.05678 2.95 0.003 10.96611 54.30791
    -------------+----------------------------------------------------------------
    pred2_lnvar |
    _cons | 6.580236 .1209233 54.42 0.000 6.343231 6.817242
    ------------------------------------------------------------------------------

    . bootstrap comb=r(comb), reps(1000) seed(123): bootstr
    (running bootstr on estimation sample)

    Bootstrap replications (1000)
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
    .................................................. 50
    .................................................. 100
    .................................................. 150
    .................................................. 200
    .................................................. 250
    .................................................. 300
    .................................................. 350
    .................................................. 400
    .................................................. 450
    .................................................. 500
    .................................................. 550
    .................................................. 600
    .................................................. 650
    .................................................. 700
    .................................................. 750
    .................................................. 800
    .................................................. 850
    .................................................. 900
    .................................................. 950
    .................................................. 1000

    Bootstrap results Number of obs = 700
    Replications = 1,000

    command: bootstr
    comb: r(comb)

    ------------------------------------------------------------------------------
    | Observed Bootstrap Normal-based
    | Coef. Std. Err. z P>|z| [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    comb | .0307587 42.81222 0.00 0.999 -83.87965 83.94117
    ------------------------------------------------------------------------------
    Last edited by Giorgia Estefani; 07 Jan 2023, 01:31.

  • #2
    Originally posted by Giorgia Estefani
    I am running the following program, that apparently works fine.
    Really?

    If I read the regression table for your first model correctly, you have 743 total degrees of freedom (742 in the denominator of the model F statistic) with only 700 observations. How does that happen?

    But the bootstrapped standard errors are implausibly (in my view) big. The problem gets worse if I increase the number of bootstrap replications and does not depend on the seed. It also gets worse if I increase the number at the exponent. Any insights?
    Ratios have a distribution that is long-tailed and you're taking that and exaggerating it by an exponent of 15. You can see this both in the percentile bootstrap results and in summary statistics of individual replicates that can be recovered with the saving() option.

    .
    . version 17.0

    .
    . clear *

    .
    . // seedem
    . set seed 159190306

    .
    . quietly set obs 700

    .
    . generate double x = runiform(0, 100)

    . generate double y1 = rnormal(24.23444 + 0.4987301 * x, sqrt(783.489711))

    . generate double y2 = rnormal(32.63701 + 0.3954244 * x, sqrt(720.709693))

    . generate byte sample1 = 3

    . generate byte sample2 = 3 * (_n <= 159)

    .
    . *
    . * Begin here
    . *
    . program define bootEm, rclass
      1.     version 17.0
      2.     syntax , [nl]
      3.
    .     quietly regress y1 c.x if sample1 == 3
      4.     estimates store pred1
      5.
    .     quietly regress y2 c.x if sample2 == 3
      6.     estimates store pred2
      7.
    .     suest pred1 pred2, vce(robust)
      8.
    .     if "`nl'" != "" nlcom (rat15: (([pred2_mean]x / [pred1_mean]x))^15 ) // , noheader
      9.     else {
     10.         return scalar pred1 = [pred1_mean]x
     11.         return scalar pred2 = [pred2_mean]x
     12.         return scalar rat15 = (([pred2_mean]x / [pred1_mean]x))^15
     13.     }
     14.     estimates drop _all
     15. end

    .
    . bootEm , nl

    Simultaneous results for pred1, pred2                      Number of obs = 700

    ------------------------------------------------------------------------------
                 |               Robust
                 | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
    pred1_mean   |
               x |   .4552695   .0358197    12.71   0.000     .3850642    .5254749
           _cons |   27.07837   1.953795    13.86   0.000       23.249    30.90774
    -------------+----------------------------------------------------------------
    pred1_lnvar  |
           _cons |   6.627308    .054963   120.58   0.000     6.519582    6.735033
    -------------+----------------------------------------------------------------
    pred2_mean   |
               x |    .351735   .0725013     4.85   0.000     .2096351    .4938348
           _cons |    38.3869   4.049579     9.48   0.000     30.44988    46.32393
    -------------+----------------------------------------------------------------
    pred2_lnvar  |
           _cons |   6.654728   .1004442    66.25   0.000     6.457861    6.851595
    ------------------------------------------------------------------------------

           rat15: (([pred2_mean]x / [pred1_mean]x))^15

    ------------------------------------------------------------------------------
                 | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
           rat15 |   .0208547   .0683028     0.31   0.760    -.1130164    .1547259
    ------------------------------------------------------------------------------

    .
    . tempfile bs

    . bootstrap ///
    >     rat15 = r(rat15), ///
    >         reps(400) nodots saving(`bs'): bootEm

    Bootstrap results                                          Number of obs = 700
                                                               Replications  = 400

          Command: bootEm
            rat15: r(rat15)

    ------------------------------------------------------------------------------
                 |   Observed   Bootstrap                         Normal-based
                 | coefficient  std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
           rat15 |   .0208547   2.966973     0.01   0.994    -5.794305    5.836014
    ------------------------------------------------------------------------------

    . estat bootstrap, bc percentile

    Bootstrap results                               Number of obs     =        700
                                                    Replications      =        400

          Command: bootEm
            rat15: r(rat15)

    ------------------------------------------------------------------------------
                 |    Observed               Bootstrap
                 | coefficient       Bias    std. err.  [95% conf. interval]
    -------------+----------------------------------------------------------------
           rat15 |   .02085475    .803001   2.9669725    5.22e-06   10.90865   (P)
                 |                                       4.66e-06   8.525005  (BC)
    ------------------------------------------------------------------------------
    Key:  P: Percentile
         BC: Bias-corrected

    .
    . quietly use `bs', clear

    . summarize rat15

        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
           rat15 |        400    .8238558    2.966973   1.15e-07   29.38264

    .
    . exit

    end of do-file
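
    To make the effect of the exponent concrete, one can back out the slope ratio implied by the most extreme bootstrap replicates. This is a minimal sketch that uses only numbers copied from the output above; the scalar names are illustrative and not part of the original do-file.

    ```stata
    * Values copied from the log above
    scalar rat15_min = 1.15e-07             // smallest bootstrap replicate of rat15
    scalar rat15_max = 29.38264             // largest bootstrap replicate of rat15
    scalar ratio_obs = .351735 / .4552695   // slope ratio in the original sample

    * Undo the fifteenth power to recover the slope ratio behind each extreme
    scalar ratio_min = rat15_min^(1/15)
    scalar ratio_max = rat15_max^(1/15)

    display "observed ratio:           " %6.4f ratio_obs
    display "smallest replicate ratio: " %6.4f ratio_min
    display "largest replicate ratio:  " %6.4f ratio_max
    ```

    The replicate ratios run from roughly 0.34 to roughly 1.25 around an observed value of about 0.77, which is not an outlandish spread for a ratio of two noisy slopes. The fifteenth power maps that range onto more than eight orders of magnitude, and the few replicates whose ratio exceeds 1 presumably account for most of the bootstrap standard deviation.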




    • #3
      Yes, sorry. I made a copy-and-paste error when copying the output from Stata. This is the correct output:

      . bootstr

      Source | SS df MS Number of obs = 744
      -------------+---------------------------------- F(1, 742) = 113.77
      Model | 89138.2261 1 89138.2261 Prob > F = 0.0000
      Residual | 581349.365 742 783.489711 R-squared = 0.1329
      -------------+---------------------------------- Adj R-squared = 0.1318
      Total | 670487.591 743 902.40591 Root MSE = 27.991

      ------------------------------------------------------------------------------
      y1 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
      -------------+----------------------------------------------------------------
      x | .4987301 .0467574 10.67 0.000 .4069376 .5905226
      _cons | 24.23444 3.847356 6.30 0.000 16.68144 31.78744
      ------------------------------------------------------------------------------

      Source | SS df MS Number of obs = 221
      -------------+---------------------------------- F(1, 219) = 18.20
      Model | 12100.8976 1 12100.8976 Prob > F = 0.0000
      Residual | 145581.03 219 664.753562 R-squared = 0.0767
      -------------+---------------------------------- Adj R-squared = 0.0725
      Total | 157681.928 220 716.736035 Root MSE = 25.783

      ------------------------------------------------------------------------------
      y2 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
      -------------+----------------------------------------------------------------
      x | .4164597 .09761 4.27 0.000 .2240844 .608835
      _cons | 30.31819 8.50991 3.56 0.000 13.54639 47.09
      ------------------------------------------------------------------------------

      Simultaneous results for pred1, pred2

      Number of obs = 806

      ------------------------------------------------------------------------------
      | Robust
      | Coef. Std. Err. z P>|z| [95% Conf. Interval]
      -------------+----------------------------------------------------------------
      pred1_mean |
      x | .4987301 .0520117 9.59 0.000 .3967891 .6006711
      _cons | 24.23444 4.390339 5.52 0.000 15.62954 32.83935
      -------------+----------------------------------------------------------------
      pred1_lnvar |
      _cons | 6.663758 .0529085 125.95 0.000 6.560059 6.767457
      -------------+----------------------------------------------------------------
      pred2_mean |
      x | .4164597 .1041227 4.00 0.000 .212383 .6205364
      _cons | 30.31819 9.163427 3.31 0.001 12.35821 48.27818
      -------------+----------------------------------------------------------------
      pred2_lnvar |
      _cons | 6.499416 .1043294 62.30 0.000 6.294935 6.703898
      ------------------------------------------------------------------------------
      .49873011
      .41645969

      Simultaneous results for pred1, pred2

      Number of obs = 806

      ------------------------------------------------------------------------------
      | Robust
      | Coef. Std. Err. z P>|z| [95% Conf. Interval]
      -------------+----------------------------------------------------------------
      pred1_mean |
      x | .4987301 .0520117 9.59 0.000 .3967891 .6006711
      _cons | 24.23444 4.390339 5.52 0.000 15.62954 32.83935
      -------------+----------------------------------------------------------------
      pred1_lnvar |
      _cons | 6.663758 .0529085 125.95 0.000 6.560059 6.767457
      -------------+----------------------------------------------------------------
      pred2_mean |
      x | .4164597 .1041227 4.00 0.000 .212383 .6205364
      _cons | 30.31819 9.163427 3.31 0.001 12.35821 48.27818
      -------------+----------------------------------------------------------------
      pred2_lnvar |
      _cons | 6.499416 .1043294 62.30 0.000 6.294935 6.703898
      ------------------------------------------------------------------------------

      . bootstrap comb=r(comb), reps(1000) seed(123): bootstr
      (running bootstr on estimation sample)

      Bootstrap replications (1000)
      ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
      .................................................. 50
      .................................................. 100
      .................................................. 150
      .................................................. 200
      .................................................. 250
      .................................................. 300
      .................................................. 350
      .................................................. 400
      .................................................. 450
      .................................................. 500
      .................................................. 550
      .................................................. 600
      .................................................. 650
      .................................................. 700
      .................................................. 750
      .................................................. 800
      .................................................. 850
      .................................................. 900
      .................................................. 950
      .................................................. 1000

      Bootstrap results Number of obs = 806
      Replications = 1,000

      command: bootstr
      comb: r(comb)

      ------------------------------------------------------------------------------
      | Observed Bootstrap Normal-based
      | Coef. Std. Err. z P>|z| [95% Conf. Interval]
      -------------+----------------------------------------------------------------
      comb | .0669284 317.6453 0.00 1.000 -622.5064 622.6403
      ------------------------------------------------------------------------------



      • #4
        Many thanks for the intuition! Do you have any suggestions about how I can solve the problem?



        • #5
          Originally posted by Giorgia Estefani
          Do you have any suggestions about how I can solve the problem?
          Did you notice the usage of nlcom after suest?

          Your code regresses each of two outcome variables on a single continuous predictor, in two partially overlapping sets of observations, and then takes the fifteenth power of the ratio of the two slope coefficients. (The use of suest here is unnecessary, because you don't use its adjusted standard errors as nlcom would; instead, you bootstrap the exponentiated ratio itself.) I don't know what your research objective is or what question comb and its standard error are supposed to answer; what you're doing baffles me, and I haven't seen anything like it before. But if you're unhappy with the bootstrap estimate of the sampling distribution of the fifteenth power of this ratio, then you might want to look into an alternative tack.
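
          For reference, a minimal sketch of the nlcom-after-suest route on its own, with no bootstrap. The data here are a simulated stand-in (an assumption modelled loosely on the coefficients reported above, not the original data.dta), and the seed, sample sizes and error standard deviations are arbitrary placeholders.

          ```stata
          version 17.0
          clear *
          set seed 12345                  // arbitrary seed, for reproducibility only

          * Hypothetical stand-in data; in practice replace with: use data.dta, clear
          quietly set obs 700
          generate double x  = runiform(0, 100)
          generate double y1 = rnormal(24 + 0.5 * x, 28)
          generate double y2 = rnormal(32 + 0.4 * x, 27)
          generate byte sample1 = 3
          generate byte sample2 = 3 * (_n <= 160)

          * Fit each model on its own subsample and store the results
          quietly regress y1 x if sample1 == 3
          estimates store pred1
          quietly regress y2 x if sample2 == 3
          estimates store pred2

          * Combine the fits so the covariance between the two slopes is available
          suest pred1 pred2, vce(robust)

          * Delta-method standard error for the fifteenth power of the slope ratio
          nlcom (comb: ([pred2_mean]x / [pred1_mean]x)^15)
          ```

          Here suest earns its keep, because nlcom draws on the joint covariance matrix that suest produces. The delta-method standard error rests on a linear approximation, so with a function this steep it need not agree with the bootstrap standard deviation.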



          • #6
            Thanks. Yes, I also tried it with nlcom, which gives the same results.



            • #7
              You're welcome. OK, so it appears that there's nothing about comb that makes it unsuitable for bootstrapping, and there's no problem to be solved as far as estimating its standard error goes. It looks as if your best bet is to look into that alternative tack: take a different approach to answering your research question.
