  • Bootstrap SE for lowess

    I am looking to generate bootstrap SEs for lowess smoothed curves. I realize that not everyone agrees that SEs are appropriate for lowess curves, but I want to generate them anyway, so I can plot the lowess fit of Y and its 95% CI across my Xs. In past Stata forum threads, when users have asked about SEs for lowess (because the lowess command does not provide SEs directly), one answer has been to bootstrap lowess. I would like to do that, but I cannot figure out how. I am stymied by the fact that Stata's built-in lowess command returns no r() or e() results that could be plugged into bootstrap or manipulated. The only output one gets from lowess, aside from the graph, is the option to generate a new variable, the smoothed fit. I'm not facile enough with Stata programming to figure out how to run the 1000 or so bootstrap iterations and determine the CI at each point. I have looked at the UCLA page on bootstrapping here:
    http://www.ats.ucla.edu/stat/stata/faq/ownboot.htm
    for combining bsample in a program, and simulate to call the program, but I cannot figure out how to make the program work for lowess as the estimate.
    I would love some help. If it matters, I am using Stata 13.1 on Windows 7 and 8.
    Thanks in advance.

    -Michael
    Last edited by Michael Rosenfeld; 23 Oct 2015, 12:42.

  • #2
    The limitation of there not being saved results, as I understand it, is a limitation in principle. As lowess provides local fits, there isn't a summary of uncertainty in terms of global statistics. I join those who are queasy about confidence intervals for what I see as an exploratory or heuristic procedure, and one quite contingent (usually) on an arbitrary choice of bandwidth. You could, however, replicate the procedure on bootstrap samples, using the same grid at which to generate smoothed values, and then select quantiles across replications accordingly. It just would not, or need not, mean using the bootstrap command.
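A minimal sketch of that idea, under stated assumptions: it uses the auto dataset shipped with Stata (mpg on weight) as a stand-in for the real data, 200 replications, and an arbitrary grid of 20 equally spaced x values. Because lowess only returns fits at observed x values, each replication's fit is carried to the grid by linear interpolation with ipolate, which is a choice made here and not something lowess does itself. Grid points outside a given resample's x range come back missing for that replication, so the extreme grid points rest on fewer replications.

```stata
* Percentile bootstrap band for lowess on a fixed grid (sketch, auto data)
sysuse auto, clear
set seed 12345
local reps 200                      // assumed; raise to 1000 or so for real use

* Common evaluation grid: 20 equally spaced weights (arbitrary choice)
summarize weight, meanonly
local xmin = r(min)
local step = (r(max) - r(min)) / 19
tempfile grid boot
preserve
clear
set obs 20
generate int gridid = _n
generate weight = `xmin' + (_n - 1) * `step'
save `grid'
restore

quietly {
    forvalues b = 1/`reps' {
        preserve
        bsample                                   // resample rows with replacement
        lowess mpg weight, generate(fit) nograph  // smooth within the resample
        append using `grid'                       // add grid rows (fit is missing there)
        ipolate fit weight, generate(fitgrid)     // linear interpolation onto the grid
        keep if !missing(gridid)
        keep gridid weight fitgrid
        generate int rep = `b'
        if `b' > 1 {
            append using `boot'                   // stack earlier replications
        }
        save `boot', replace
        restore
    }
}

* Pointwise 2.5th, 50th and 97.5th percentiles across replications
use `boot', clear
bysort gridid: egen lo  = pctile(fitgrid), p(2.5)
bysort gridid: egen mid = pctile(fitgrid), p(50)
bysort gridid: egen hi  = pctile(fitgrid), p(97.5)
collapse (first) weight lo mid hi, by(gridid)
twoway (rarea lo hi weight, sort) (line mid weight, sort), ytitle("mpg")
```

The band is pointwise, not simultaneous, and it inherits whatever bandwidth lowess uses (the default here).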

    More positively, lpoly provides, in essence, a more flexible and more general alternative with similar flavour and scope for showing uncertainty too. It's hard to understand sustained enthusiasm for lowess in that light except in terms of some good public relations.
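For comparison, a short sketch of what lpoly offers directly, again assuming the auto dataset as a stand-in. The variable names xgrid, yhat, sehat, lo and hi are made up for illustration, and the manual interval at the end is a normal-based approximation computed here, not necessarily identical to the band that the ci option draws.

```stata
* Local linear smooth with a pointwise confidence band (sketch, auto data)
sysuse auto, clear

* ci draws the band; generate() and se() store the grid, smooth and SE
lpoly mpg weight, degree(1) ci n(50) generate(xgrid yhat) se(sehat)

* Approximate 95% pointwise interval from the stored SE
generate lo = yhat - invnormal(0.975) * sehat
generate hi = yhat + invnormal(0.975) * sehat
list xgrid yhat lo hi in 1/5
```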



    • #3
      Nick: I appreciate your reply and hope I can lean on your wisdom to gain a bit more Stata-specific knowledge.

      I understand (and don't entirely disagree with) doubts about the wisdom of Lowess SEs. Nonetheless many people use Lowess with SE and argue for its relevance, including, if I recall correctly, the classic book on the bootstrap, Efron, B., Tibshirani, R.J., 1993. An introduction to the bootstrap. Chapman and Hall, New York.

      I have not forsaken local polynomial smoothing or fractional polynomial smoothing; I rely on both. I would like to be able to show my students how each approach, including Lowess, compares in terms of confidence intervals and fit. Leaving the appropriateness of Lowess SEs aside, I would like to learn the practical matter of how to find and save the bootstrap replications of Lowess.

      Below is the code that the ats.ucla website showed for bootstrapping with regress, relying on a short user-written program that generates a bsample output, and is called by the command simulate. This procedure from the UCLA website has a different object (bootstrapping the SE of the VIF), which is related to statistics saved in e() after the regression, and then in r() after estat is run. What I would like to know is how to adjust this approach for Lowess (which has neither e() nor r() ), so that I can save the different N bsampled lowess fits, and then plot the confidence interval.

      I realize that this is a question whose object (bootstrapped SEs for Lowess) may seem to be not worth the trouble, but if you could guide me on how to do it, I would have learned something valuable about how Stata works, and I would be grateful.

      -Michael

      ats.ucla bootstrap example here:

      quietly regress read female math write ses
      estat vif

          Variable |       VIF       1/VIF
      -------------+----------------------
             write |      1.86    0.537690
              math |      1.76    0.568278
            female |      1.17    0.857692
               ses |      1.11    0.902671
      -------------+----------------------
          Mean VIF |      1.47

      return list

      scalars:
                     r(vif_4) =  1.107823014259338
                     r(vif_3) =  1.165920257568359
                     r(vif_2) =  1.759701371192932
                     r(vif_1) =  1.859809398651123

      macros:
                    r(name_4) : "ses"
                    r(name_3) : "female"
                    r(name_2) : "math"
                    r(name_1) : "write"

      matrix vif = (r(vif_4), r(vif_3), r(vif_2), r(vif_1))
      matrix list vif

      vif[1,4]
                 c1         c2         c3         c4
      r1   1.107823  1.1659203  1.7597014  1.8598094

      *Step 2
      capture program drop myboot2
      program define myboot2, rclass
      preserve
      bsample
      regress read female math write ses
      estat vif
      return scalar vif_4 = r(vif_4)
      return scalar vif_3 = r(vif_3)
      return scalar vif_2 = r(vif_2)
      return scalar vif_1 = r(vif_1)
      restore
      end

      *Step 3
      simulate vif_4=r(vif_4) vif_3=r(vif_3) vif_2=r(vif_2) vif_1=r(vif_1), ///
          reps(100) seed(12345): myboot2

      command: myboot2
      vif_4: r(vif_4)
      vif_3: r(vif_3)
      vif_2: r(vif_2)
      vif_1: r(vif_1)
      Simulations (100)
      ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
      ..................................................    50
      ..................................................   100

      bstat, stat(vif) n(200)
      Bootstrap results
      Number of obs = 200 Replications = 100
      ------------------------------------------------------------------------------
                   |   Observed   Bootstrap                         Normal-based
                   |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
             vif_4 |   1.107823   .0344814    32.13   0.000     1.040241    1.175405
             vif_3 |    1.16592   .0524449    22.23   0.000      1.06313     1.26871
             vif_2 |   1.759701   .1349314    13.04   0.000     1.495241    2.024162
             vif_1 |   1.859809   .1467453    12.67   0.000     1.572194    2.147425
      ------------------------------------------------------------------------------

      estat bootstrap, all
      Last edited by Michael Rosenfeld; 24 Oct 2015, 10:28.



      • #4
        One approach would be to use the generate() option of lowess to produce an estimate for each bootstrap replication. For each replication, save the estimate in a temporary file and merge it with the previous temporary file. In the end, you will have a file with all replications, and you can manually calculate bootstrap standard errors or confidence intervals. A similar internal bootstrap procedure for twoway mspline is used in gicurve:

        http://adeptanalytics.org/download/a...ve/gicurve.ado

        I hope that this helps,
        Michal
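The steps above could be sketched roughly as follows. This is an illustration under assumptions, not the gicurve code: it uses the auto dataset (mpg on weight) as a stand-in, 200 replications, and made-up names (id, lowess_fit, bs1 to bs200, se_fit, lo, hi). Each replication's fit is merged back by an observation id, so an observation contributes only in the replications where it was drawn, and the interval at the end is a normal-based one around the full-sample fit.

```stata
* Bootstrap SE for lowess via generate() and a growing tempfile (sketch)
sysuse auto, clear
set seed 12345
local reps 200                      // assumed; raise to 1000 or so for real use

generate long id = _n               // key for merging replications back
lowess mpg weight, generate(lowess_fit) nograph   // full-sample fit
keep id mpg weight lowess_fit
tempfile results
save `results'

quietly {
    forvalues b = 1/`reps' {
        preserve
        bsample                                     // resample with replacement
        lowess mpg weight, generate(bs`b') nograph  // fit for this replication
        * bsample repeats ids; keep one fitted value per original observation
        bysort id: keep if _n == 1
        keep id bs`b'
        merge 1:1 id using `results', nogenerate    // add column bs`b' to the file
        save `results', replace
        restore
    }
}

* SD across replications as the bootstrap SE, then a normal-based 95% CI
use `results', clear
egen se_fit = rowsd(bs*)
generate lo = lowess_fit - invnormal(0.975) * se_fit
generate hi = lowess_fit + invnormal(0.975) * se_fit
twoway (rarea lo hi weight, sort) (line lowess_fit weight, sort) (scatter mpg weight)
```

Percentile intervals could be taken from the same bs* columns instead of using the normal approximation.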


