Hello,
I am using the lasso2 package for model building and variable selection and have some questions about the differences between the rlasso and lasso2 commands.
The lasso2 command takes a long time to converge when a large number of variables are included, while the rlasso command is nearly instantaneous. I am not sure why this would be.
Here's a reproducible illustration:
On a related note, I am trying to wrap my head around the different selected models. In the auto example, the output differs hugely depending on the selection criterion:
Can anyone point me toward guidance on how to determine which approach is best to use in my case?
Thanks!
Code:
// Requires lassopack (lasso2, rlasso) and estout (eststo), both available from SSC
sysuse auto, clear
// Create a bunch of new variables (squares and cubes) for demonstration
foreach var in mpg rep78 headroom trunk weight length turn displacement gear_ratio {
    gen `var'2 = `var'^2
    gen `var'3 = `var'^3
}
// takes a while to converge
eststo lasso1: lasso2 price mpg* rep78* headroom* trunk* weight* length* turn* displacement* gear_ratio*
// instant
eststo lasso2: rlasso price mpg* rep78* headroom* trunk* weight* length* turn* displacement* gear_ratio*, displayall
Code:
estimates restore lasso1 // make the lasso2 results active again, in case rlasso was run last
lasso2, lic(ebic) // model selected by EBIC: no variables?
lasso2, lic(aicc) // model selected by AICc

We discuss all of the above questions in much more detail. We present the theory of all three penalization approaches (cross-validation, information criteria, rigorous penalization) and also have some Monte Carlo results.
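As a rough sketch of how the three approaches map onto commands in the auto example from the question (this assumes lassopack is installed from SSC; the local macro name rhs, the seed, and the number of folds are arbitrary illustrative choices, not recommendations):

```stata
sysuse auto, clear

// Same demonstration variables as in the question: squares and cubes
foreach var of varlist mpg rep78 headroom trunk weight length turn displacement gear_ratio {
    gen `var'2 = `var'^2
    gen `var'3 = `var'^3
}

// Hypothetical helper macro holding the regressor list
local rhs mpg* rep78* headroom* trunk* weight* length* turn* displacement* gear_ratio*

// 1. Cross-validation: lambda chosen by K-fold CV, estimates shown at the optimum
cvlasso price `rhs', nfolds(5) seed(123) lopt

// 2. Information criteria: estimate over a grid of lambdas, then select by EBIC
lasso2 price `rhs', lic(ebic)

// 3. Rigorous penalization: a single theory-driven lambda
rlasso price `rhs'
```

Note that lasso2 estimates the model over a whole grid of lambda values while rlasso solves for one penalty level, which is likely one reason for the speed gap described in the question. The three approaches need not select the same variables, so it is worth comparing their output side by side.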