
    • #17
      Originally posted by Felix Bittmann View Post

      This source is rather dated (although a great read otherwise; I really recommend it). Newer references put 15,000 at the lower acceptable limit (see https://arxiv.org/abs/1411.5279). Especially when p-values are volatile (how big is the bias?), many more than 500 replications are probably necessary. If the p-values do not stabilize even then, there are probably bigger problems with the data, or very strange distributions are present.


      Originally posted by Carlo Lazzaro View Post
      Felix:
      yes, it's true that this pivotal reference is really dated (1993), and in those days computers were less powerful (and not as widely available as today; at least when I graduated, which was well back in the past millennium).
      The -bootstrap- entry in the Stata .pdf manual reports 100 replications for the SE estimate (Example 2), which may not be enough to give back stable results in most research projects.
      With an averagely powerful laptop today, 200 -bootstrap- replications should be considered the lower limit of the range, whereas the upper one depends on other considerations (bootstrap bias; p-value volatility; research-field traditions).

      Thanks Carlo and Felix. I had read the book by Tibshirani before and also noticed that they suggest at least 50 replications for S.E. purposes, which is the default of vce(boot). However, I found that even with 200 replications the significance level in fact varies quite a bit, jumping from .1 to .005 and back to .1 depending on the seed. Using more than 1,000 replications, as suggested by Fei, to obtain a "stable" p-value unfortunately slows the whole estimation down considerably, making it less practical. Is there any way to speed up the bootstrapping process in Stata with xtreg, fe?
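The seed-to-seed "jumpiness" described above is Monte Carlo error in the bootstrap itself, and it shrinks as the number of replications grows. A minimal, language-agnostic sketch (Python rather than Stata, with invented toy data standing in for an estimation sample):

```python
import random
import statistics

def bootstrap_se(data, reps, seed):
    """Standard error of the sample mean, estimated by resampling the
    data with replacement `reps` times and taking the standard
    deviation of the resampled means."""
    rng = random.Random(seed)
    n = len(data)
    means = [statistics.fmean(rng.choices(data, k=n)) for _ in range(reps)]
    return statistics.stdev(means)

# Invented toy data (any real analysis would use the actual sample).
data = [0.8, 1.2, -0.3, 2.1, 0.5, 1.7, -1.0, 0.9, 1.4, 0.2,
        1.1, -0.6, 2.4, 0.3, 1.8]

# Re-estimate the SE under ten different seeds, once with few
# replications and once with many; the spread across seeds is the
# seed-dependent volatility discussed in the thread.
spread = {}
for reps in (50, 2000):
    estimates = [bootstrap_se(data, reps, seed) for seed in range(10)]
    spread[reps] = max(estimates) - min(estimates)
    print(f"reps={reps}: SE estimates range over {spread[reps]:.4f} across seeds")
```

With 50 replications the SE estimate typically moves noticeably from seed to seed; with 2,000 it barely moves, which is why p-values computed from it stop jumping.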



      • #18
        https://journals.sagepub.com/doi/abs...36867X19874242

        https://github.com/gvegayon/parallel
        Best wishes

        Stata 18.0 MP | ORCID | Google Scholar



        • #19
          Bootstrapping isn't a panacea that cures all woes. Even Efron, its inventor, wrote that closed-form/analytic solutions for standard errors should be used whenever they are known. I would also echo the sound advice here and suggest that bootstrapping be used as a last resort, when conventional robust estimators don't exist. If you are in the unfortunate situation of having to use it, it is probably better either to run some simulations to see whether the standard estimate is problematic, or to rerun the analyses with an increasing number of resamples to convince yourself that the estimates have converged.
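The "increase resamples until the estimates converge" advice can be sketched as a simple doubling loop. This is an illustrative Python sketch only: the data, starting count, cap, and tolerance are all invented, and in practice the inner estimator would be the actual model rather than a sample mean.

```python
import random
import statistics

def bootstrap_se(data, reps, seed=1):
    """Bootstrap SE of the sample mean (resampling with replacement)."""
    rng = random.Random(seed)
    n = len(data)
    means = [statistics.fmean(rng.choices(data, k=n)) for _ in range(reps)]
    return statistics.stdev(means)

def converged_se(data, start=200, max_reps=25600, tol=0.02):
    """Double the replication count until the SE estimate changes by
    less than `tol` (relative) between successive runs, or until the
    cap is reached."""
    reps = start
    prev = bootstrap_se(data, reps)
    while reps < max_reps:
        reps *= 2
        cur = bootstrap_se(data, reps)
        if abs(cur - prev) <= tol * prev:
            return cur, reps
        prev = cur
    return prev, reps

# Invented toy data.
data = [0.8, 1.2, -0.3, 2.1, 0.5, 1.7, -1.0, 0.9, 1.4, 0.2,
        1.1, -0.6, 2.4, 0.3, 1.8]
se, reps_used = converged_se(data)
print(f"SE settled at {se:.4f} after {reps_used} replications")
```

If the loop hits the cap without settling, that is itself informative: per the post above, persistent instability suggests problems with the data or a very strange distribution rather than too few replications.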



          • #20
            In my understanding (which could be wrong; I'm still going through my coursework myself), bootstrapping SEs can also be useful when you have smaller sample sizes.

            Suppose you're using regression and synthetic controls together and limiting your analysis to your donor pool (typically under 100 units); regular standard errors may give inconsistent estimates because of the small sample. So we bootstrap 2,000 times, which, if I recall correctly, simply means repeatedly estimating the SEs from a normal distribution.



            • #21
              Jared:
              -bootstrap- is usually a non-parametric resampling (with replacement) procedure.
              Hence, the data speak for themselves.
              You can impose a parametric -bootstrap- procedure, though, by assuming that the data are drawn from a given theoretical probability distribution (e.g., a Normal one).
              Kind regards,
              Carlo
              (Stata 19.0)
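The distinction drawn in this post can be made concrete in a short, language-agnostic sketch (Python, with invented data): the non-parametric bootstrap recycles the observed values, while the parametric variant fits an assumed distribution and draws entirely fresh values from it.

```python
import random
import statistics

# Invented toy data.
data = [0.8, 1.2, -0.3, 2.1, 0.5, 1.7, -1.0, 0.9, 1.4, 0.2]
rng = random.Random(42)

def nonparametric_draw(data, rng):
    # Resample the observed data with replacement: every value in the
    # pseudo-sample already appears in the data, so the data speak for
    # themselves and no distributional assumption is imposed.
    return rng.choices(data, k=len(data))

def parametric_draw(data, rng):
    # Assume a theoretical distribution (here Normal), fit its
    # parameters to the data, then draw fresh values from the
    # fitted model.
    mu = statistics.fmean(data)
    sigma = statistics.stdev(data)
    return [rng.gauss(mu, sigma) for _ in range(len(data))]

np_sample = nonparametric_draw(data, rng)
p_sample = parametric_draw(data, rng)
print("non-parametric:", np_sample)
print("parametric:    ", [round(x, 2) for x in p_sample])
```

Every value in the non-parametric pseudo-sample is one of the original observations, whereas the parametric pseudo-sample contains values the data never produced; which is appropriate depends on how much faith one has in the assumed distribution.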

