Bootstrap SE for lowess

Michael Rosenfeld

Join Date: Oct 2015

Posts: 2
#1

23 Oct 2015, 12:33

I am looking to generate bootstrap SEs for lowess smoothed curves. I realize that not everyone agrees that SEs are appropriate for lowess curves, but I want to generate them anyway, so that I can plot the lowess fit of Y and its 95% CI across my Xs. In some past Stata forum threads, when users have asked about SEs for lowess (because the lowess command does not provide SEs directly), one answer has been to bootstrap lowess. I would like to do that, but I cannot figure out how. I am stymied by the fact that Stata's built-in lowess command returns no r() or e() results that could be plugged into bootstrap or otherwise manipulated. The only output one gets from lowess, aside from the graph, is the option to generate a new variable holding the smoothed fit. I'm not facile enough with Stata programming to figure out how to run the 1000 or so bootstrap iterations and determine the CI at each point. I have looked at the UCLA page on bootstrapping here:
http://www.ats.ucla.edu/stat/stata/faq/ownboot.htm
for combining bsample in a program, and simulate to call the program, but I cannot figure out how to make the program work with lowess as the estimator.
I would love some help. If it matters, I am using Stata 13.1 on Windows 7 and 8.
Thanks in advance.

-Michael

Last edited by Michael Rosenfeld; 23 Oct 2015, 12:42.
Tags: bootstrap, lowess
Nick Cox

Join Date: Mar 2014

Posts: 35664
#2

24 Oct 2015, 05:32

The limitation of there not being saved results, as I understand it, is a limitation in principle. As lowess provides local fits, there isn't a summary of uncertainty in terms of global statistics. I join those who are queasy about confidence intervals for what I see as an exploratory or heuristic procedure, and one quite contingent (usually) on an arbitrary choice of bandwidth. You could, however, replicate the procedure on bootstrap samples, using the same grid at which to generate smoothed values, and then select quantiles across replications accordingly. It just would not, or need not, mean using the bootstrap command.
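
The procedure described above (resample, smooth, evaluate on a fixed grid, take quantiles across replications) might be sketched as follows. This is only an illustration under assumptions that are not from the thread: it uses the auto data shipped with Stata with mpg as Y and weight as X, a hypothetical program name lowessgrid, an arbitrary grid of six weight values, and linear interpolation via ipolate to move the smoothed values onto that grid.

```stata
* Sketch only: data, variable names, program name, and grid are illustrative.
sysuse auto, clear

capture program drop lowessgrid
program define lowessgrid, rclass
    preserve
    bsample                                  // resample observations with replacement
    tempvar fit atgrid
    lowess mpg weight, generate(`fit') nograph
    local n = _N
    local grid 2000 2500 3000 3500 4000 4500
    local k : word count `grid'
    local total = `n' + `k'
    quietly set obs `total'                  // append one empty row per grid point
    forvalues j = 1/`k' {
        local g : word `j' of `grid'
        local row = `n' + `j'
        quietly replace weight = `g' in `row'
    }
    * linear interpolation of the smoothed values onto the grid rows
    quietly ipolate `fit' weight, generate(`atgrid')
    forvalues j = 1/`k' {
        local g : word `j' of `grid'
        local row = `n' + `j'
        return scalar f`g' = `atgrid'[`row']
    }
    restore
end

* one row per replication, one column per grid point
simulate f2000=r(f2000) f2500=r(f2500) f3000=r(f3000) ///
    f3500=r(f3500) f4000=r(f4000) f4500=r(f4500), ///
    reps(200) seed(12345): lowessgrid

* percentile limits across replications at each grid point
centile f*, centile(2.5 97.5)
```

A grid point that falls outside the range of X in a particular bootstrap sample is returned as missing for that replication, so it is worth keeping the grid inside the bulk of the data.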

More positively, lpoly provides, in essence, a more flexible and more general alternative with similar flavour and scope for showing uncertainty too. It's hard to understand sustained enthusiasm for lowess in that light except in terms of some good public relations.
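
For comparison, a minimal sketch of lpoly with its built-in uncertainty display, again assuming the auto data with mpg and weight as stand-ins for Y and X. The variable names xgrid, yhat, sehat, ci_lo, and ci_hi are made up for the example, and the manually built band is a normal-approximation pointwise interval.

```stata
* Sketch only: data and variable names are illustrative.
sysuse auto, clear

* local linear fit drawn with a pointwise confidence band
lpoly mpg weight, degree(1) ci

* or save the grid, the fit, and its standard errors for custom plots
lpoly mpg weight, degree(1) nograph generate(xgrid yhat) se(sehat)
generate ci_lo = yhat - invnormal(0.975)*sehat
generate ci_hi = yhat + invnormal(0.975)*sehat
twoway (rarea ci_lo ci_hi xgrid) (line yhat xgrid)
```

The saved variables are filled only for as many rows as there are grid points (50 by default, adjustable with n()), so the remaining rows are missing.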
Michael Rosenfeld

Join Date: Oct 2015

Posts: 2
#3

24 Oct 2015, 10:22

Nick: I appreciate your reply and hope I can lean on your wisdom to gain a bit more Stata-specific knowledge.

I understand (and don't entirely disagree with) doubts about the wisdom of Lowess SEs. Nonetheless many people use Lowess with SE and argue for its relevance, including, if I recall correctly, the classic book on the bootstrap, Efron, B., Tibshirani, R.J., 1993. An introduction to the bootstrap. Chapman and Hall, New York.

I have not forsaken local polynomial smoothing or fractional polynomial smoothing; I rely on both. I would like to be able to show my students how each approach, including Lowess, compares in terms of confidence intervals and fit. Leaving the appropriateness of Lowess SE aside, I would like to learn the practical matter of how to find and save the bootstrap replications of Lowess.

Below is the code that the ats.ucla website showed for bootstrapping with regress, relying on a short user-written program that generates a bsample output, and is called by the command simulate. This procedure from the UCLA website has a different object (bootstrapping the SE of the VIF), which is related to statistics saved in e() after the regression, and then in r() after estat is run. What I would like to know is how to adjust this approach for Lowess (which has neither e() nor r() ), so that I can save the different N bsampled lowess fits, and then plot the confidence interval.

I realize that this is a question whose object (bootstrapped SEs for Lowess) may seem to be not worth the trouble, but if you could guide me on how to do it, I would have learned something valuable about how Stata works, and I would be grateful.

-Michael

ats.ucla bootstrap example here:

quietly regress read female math write ses
estat vif

    Variable |       VIF       1/VIF
-------------+----------------------
       write |      1.86    0.537690
        math |      1.76    0.568278
      female |      1.17    0.857692
         ses |      1.11    0.902671
-------------+----------------------
    Mean VIF |      1.47

return list

scalars:
    r(vif_4) = 1.107823014259338
    r(vif_3) = 1.165920257568359
    r(vif_2) = 1.759701371192932
    r(vif_1) = 1.859809398651123

macros:
    r(name_4) : "ses"
    r(name_3) : "female"
    r(name_2) : "math"
    r(name_1) : "write"

matrix vif = ( r(vif_4), r(vif_3), r(vif_2), r(vif_1))
matrix list vif

vif[1,4]
           c1         c2         c3         c4
r1   1.107823  1.1659203  1.7597014  1.8598094

*Step 2
capture program drop myboot2
program define myboot2, rclass
preserve
bsample
regress read female math write ses
estat vif
return scalar vif_4 = r(vif_4)
return scalar vif_3 = r(vif_3)
return scalar vif_2 = r(vif_2)
return scalar vif_1 = r(vif_1)
restore
end

*Step 3
simulate vif_4=r(vif_4) vif_3=r(vif_3) vif_2=r(vif_2) vif_1=r(vif_1), ///
    reps(100) seed(12345): myboot2

      command:  myboot2
        vif_4:  r(vif_4)
        vif_3:  r(vif_3)
        vif_2:  r(vif_2)
        vif_1:  r(vif_1)

Simulations (100)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50
..................................................   100

bstat, stat(vif) n(200)
Bootstrap results
Number of obs = 200
Replications  = 100

------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       vif_4 |   1.107823   .0344814    32.13   0.000     1.040241    1.175405
       vif_3 |    1.16592   .0524449    22.23   0.000      1.06313     1.26871
       vif_2 |   1.759701   .1349314    13.04   0.000     1.495241    2.024162
       vif_1 |   1.859809   .1467453    12.67   0.000     1.572194    2.147425
------------------------------------------------------------------------------

estat bootstrap, all

Last edited by Michael Rosenfeld; 24 Oct 2015, 10:28.
Michal Brzezinski

Join Date: Jul 2014

Posts: 16
#4

24 Oct 2015, 11:13

One approach would be to use the generate() option of lowess to generate an estimate for each bootstrap replication. Then, for each replication, save the estimate in a temporary file and merge it with the previous temporary file. In the end, you will have a file with all replications, from which you can manually calculate bootstrap standard errors or confidence intervals. A similar internal bootstrap procedure for twoway mspline is used in gicurve:
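
A rough sketch of that generate-save-merge loop, run as a single do-file so that the temporary file names persist. Everything specific here is an assumption for illustration: the auto data with mpg as Y and weight as X, 200 replications, and the variable names fullfit, bfit1 to bfit200, ci_lo, ci_hi, and bse. Smoothed values are averaged within tied values of X before merging, since tied X values may receive slightly different lowess fits.

```stata
* Sketch only: data, names, and number of replications are illustrative.
sysuse auto, clear
set seed 12345
local reps 200
tempfile results rep

* results file: one row per distinct x, holding the full-sample fit
lowess mpg weight, generate(fullfit) nograph
preserve
collapse (mean) fullfit, by(weight)
quietly save `results'
restore

forvalues b = 1/`reps' {
    quietly {
        preserve
        bsample                               // resample with replacement
        lowess mpg weight, generate(bfit`b') nograph
        collapse (mean) bfit`b', by(weight)   // one smoothed value per distinct x
        save `rep', replace
        use `results', clear
        * keep every original x; x values absent from this sample stay missing
        merge 1:1 weight using `rep', keep(master match) nogenerate
        save `results', replace
        restore
    }
}

use `results', clear
* pointwise percentile limits and bootstrap SE across replications
egen ci_lo = rowpctile(bfit*), p(2.5)
egen ci_hi = rowpctile(bfit*), p(97.5)
egen bse   = rowsd(bfit*)
twoway (rarea ci_lo ci_hi weight, sort) (line fullfit weight, sort)
```

Each X value is missing from some bootstrap samples, so the quantiles at a given point are computed over fewer than the full number of replications; points near the extremes of X are the ones most affected.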

http://adeptanalytics.org/download/a...ve/gicurve.ado

I hope that this helps,
Michal