  • New Stata command for lasso, ridge regression and elastic net regression

    Hello everyone. I've written a Stata implementation of the Friedman, Hastie and Tibshirani (2010, JStatSoft) coordinate descent algorithm for elastic net regression and its famous special cases: lasso and ridge regression. The resulting command, elasticregress, is now available on SSC -- thanks to Kit Baum for the upload.
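For readers who want to see the idea behind the algorithm, here is a minimal Python sketch of coordinate descent for the elastic net. It is not the elasticregress source: the function names, the centred-data assumption (no intercept) and the stopping rule on the largest coefficient change are illustrative choices. The objective follows the Friedman, Hastie and Tibshirani parameterisation, in which alpha = 1 gives the lasso and alpha = 0 gives ridge.

```python
def soft_threshold(z, gamma):
    """Soft-thresholding operator: sign(z) * max(|z| - gamma, 0)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net_cd(X, y, lam, alpha, tol=1e-8, max_iter=10_000):
    """Minimise (1/2N)*RSS + lam*(alpha*|b|_1 + (1 - alpha)/2*|b|_2^2).

    X is a list of rows. Columns of X and y are assumed to be centred,
    so no intercept is fitted (an assumption of this sketch).
    """
    n, k = len(X), len(X[0])
    beta = [0.0] * k
    resid = list(y)  # residuals y - X*beta, with beta starting at zero
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(k)]
    for _ in range(max_iter):
        max_change = 0.0
        for j in range(k):
            if col_ss[j] == 0.0:
                continue  # invariant column: coefficient stays at zero
            # Covariance of column j with the partial residual
            # (the residual with column j's own contribution added back).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_ss[j] * beta[j]
            new_b = soft_threshold(rho, lam * alpha) / (col_ss[j] + lam * (1.0 - alpha))
            delta = new_b - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_b
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta


# Toy data: one centred covariate, y = 2 * x exactly.
X = [[-1.0], [1.0], [-1.0], [1.0]]
y = [-2.0, 2.0, -2.0, 2.0]
lasso = elastic_net_cd(X, y, lam=0.5, alpha=1.0)  # L1 penalty only
ridge = elastic_net_cd(X, y, lam=1.0, alpha=0.0)  # L2 penalty only
```

On this toy data the lasso shifts the unpenalised coefficient towards zero by a fixed amount, while ridge shrinks it proportionally.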

    The command extends existing Stata lasso implementations, such as lars, by allowing the regularisation parameter to be either supplied or chosen by K-fold cross-validation. As such it tends to have better out-of-sample fit. The plot below compares the performance of elasticregress, lars and OLS as the number of covariates increases. As is well known, OLS performs poorly on dense data. lars has roughly constant performance as the number of covariates increases, while elasticregress becomes more accurate.
    [Figure: varK-MSE-noelastic.png, comparing elasticregress, lars and OLS as the number of covariates increases]


    (The estimators are fitted on 1000 observations. The true relationships between the standard-normal covariates and the dependent variable are drawn from a spike and slab distribution with p=0.2 chance of being non-zero. Each dot is a mean over 30 replications. Both elasticregress and lars are calculated with their respective lasso options.)
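The simulation design described above could be sketched roughly as follows in Python. This is an illustration, not the code behind the plot: the function name `simulate`, the standard-normal slab and the additive standard-normal noise are assumptions, since the post does not state the slab or noise distributions.

```python
import random


def simulate(n_obs, n_covariates, p_nonzero=0.2, seed=0):
    """Draw one data set with spike-and-slab true coefficients."""
    rng = random.Random(seed)
    # Spike and slab: each coefficient is exactly zero with probability
    # 1 - p_nonzero, otherwise drawn from the slab.
    # Assumption: the slab is standard normal.
    beta = [
        rng.gauss(0.0, 1.0) if rng.random() < p_nonzero else 0.0
        for _ in range(n_covariates)
    ]
    # Standard-normal covariates, as in the post.
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_covariates)] for _ in range(n_obs)]
    # Assumption: additive standard-normal noise.
    y = [sum(b * x for b, x in zip(beta, row)) + rng.gauss(0.0, 1.0) for row in X]
    return X, y, beta


# 1000 observations, as in the post; 50 covariates is an arbitrary example value.
X, y, beta = simulate(n_obs=1000, n_covariates=50, seed=1)
```

Repeating this for 30 seeds at each number of covariates, and averaging the out-of-sample error of each estimator, would give one dot per estimator in a plot like the one above.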

    elasticregress tends to be a little faster than lars when estimating the lasso. elasticregress can also estimate the more general elastic-net regression, which regularises with both the L1 and L2 norms and is thus more robust to collinearity in the regressors -- when it does so, it can cross-validate both the regularisation parameter and the mixing parameter.

    Hopefully the help files are self-contained -- do let me know if they're not. If you find a bug, please post an issue on GitHub.




  • #2
    Hi Wilbur --

    Excellent work! Thanks for placing this on SSC.



    • #3
      Hi Michael -- thanks.



      • #4
        Wilbur - is there any way that I can use this to determine if my data is generated from a Poisson distribution?



        • #5
          Robert -- This currently only fits linear models. I might extend it to logistic models later, but probably not Poisson (is anything really generated by a Poisson distribution???)



          • #6
            Originally posted by Wilbur Townsend
            Robert -- This currently only fits linear models. I might extend it to logistic models later, but probably not Poisson (is anything really generated by a Poisson distribution???)
            If you model positive quantities (exports, sales, wages, etc.), Poisson might be a better fit; see: http://personal.lse.ac.uk/tenreyro/LGW.html

            I vaguely recall reading about Poisson models with elastic net in R, and it shouldn't be that tricky if you implement it through IRLS (iteratively reweighted least squares). That said, it might not be worth it for your use cases.



            • #7
              Does this command handle the covariate rescaling for you? How does it treat factor variables (as a group or individually) and handle their rescaling?



              • #8
                Dimitriy -- Yes, it standardises the covariates. It expands factor variables into dummies and standardises them (unless they're invariant).
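As an illustration of what standardising covariates while leaving invariant columns alone can look like, here is a small Python sketch. It is not taken from elasticregress: the function name is hypothetical, and dividing by N (rather than N - 1) in the standard deviation is an assumption.

```python
import math


def standardise_columns(X):
    """Centre and scale each column of X (a list of rows).

    Columns with zero variance are left untouched, since dividing by a
    zero standard deviation is undefined.
    """
    n, k = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(k)]
    # Assumption: population standard deviation (divide by N).
    sds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in X) / n) for j in range(k)]
    Z = [
        [(row[j] - means[j]) / sds[j] if sds[j] > 0 else row[j] for j in range(k)]
        for row in X
    ]
    return Z, means, sds


# Columns: a continuous variable, a 0/1 dummy from a factor variable,
# and an invariant column.
X = [[1.0, 0.0, 5.0], [3.0, 1.0, 5.0]]
Z, means, sds = standardise_columns(X)
```

Keeping `means` and `sds` makes it possible to convert coefficients estimated on the standardised scale back to the original units.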



                • #9
                  Dear Professor Townsend,

                  Many thanks for your great Stata programme. It's highly appreciated.

                  I am wondering why there is no t-statistic for each coefficient after estimating the Elastic Net regression?

                  Furthermore, is there any prerequisite for applying the LASSO or Elastic Net regression? In conventional time-series modelling, we normally require statistically stationary variables. Can we run Elastic Net regression on non-stationary variables?

                  Best wishes,
                  Catherine



                  • #10
                    Dear Wilbur,

                    My concern is similar to Catherine's. The program does not create an e(V) matrix, so outreg and outreg2 will not work. Is there any way to have this functionality added?

                    Best regards,
                    Travis



                    • #11
                      Dear Professor Townsend,

                      Thank you very much for the program.

                      I was trying to use the -lassoregress- command, and it seems to generate different results when I run it several times. Below I copy the code, which uses the auto dataset, and the output. I believe this should not happen, but apologies in advance if I am misunderstanding the estimation procedure.

                      Best,

                      Thiago

                      --

                      sysuse auto, clear
                      lassoregress mpg weight foreign
                      ereturn clear
                      return clear
                      lassoregress mpg weight foreign

                      LASSO regression                 Number of observations  =      74
                                                       R-squared               =  0.6075
                                                       alpha                   =  1.0000
                                                       lambda                  =  1.2064
                                                       Cross-validation MSE    = 11.2183
                                                       Number of folds         =      10
                                                       Number of lambda tested =     100
                      ------------------------------------------------------------------------------
                               mpg |      Coef.
                      -------------+----------------------------------------------------------------
                            weight |  -.0044458
                           foreign |          0
                             _cons |   34.72133
                      ------------------------------------------------------------------------------

                      LASSO regression                 Number of observations  =      74
                                                       R-squared               =  0.6371
                                                       alpha                   =  1.0000
                                                       lambda                  =  0.6903
                                                       Cross-validation MSE    = 14.1879
                                                       Number of folds         =      10
                                                       Number of lambda tested =     100
                      ------------------------------------------------------------------------------
                               mpg |      Coef.
                      -------------+----------------------------------------------------------------
                            weight |  -.0051144
                           foreign |          0
                             _cons |   36.73993
                      ------------------------------------------------------------------------------




                      • #12
                        Catherine -- for others' sake I'm copying the email in which I answered your questions below:

                        Hi there Catherine,

                        Re t-statistics -- developing inference for LASSO is an ongoing research program, and existing methods are generally quite restrictive about the nature of the data on which the model is estimated. As such I decided to avoid them when implementing LASSO for Stata at this stage.

                        I'm not familiar with the appropriateness of LASSO for non-stationary data. A brief Google returns this paper, which suggests that LASSO-type estimators might perform well but LASSO itself does not.

                        All the best,
                        Wilbur



                        • #13
                          Travis -- as above, I've quite intentionally avoided implementing inference, given that it is an ongoing research program. I agree it'd be nice if it worked with outreg2 -- I've added this as an issue to be resolved in the next version (which won't be soon). In the meantime it should work with estout.



                          • #14
                            Thiago -- you're getting different results because the random sample used by the K-fold cross-validation differs each time. Try setting the seed (with -set seed-) before each use.
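To see why the seed matters, here is a small Python sketch of random fold assignment for K-fold cross-validation. It only illustrates the mechanism and is not how elasticregress assigns folds internally; the function name and the seed values are arbitrary.

```python
import random


def assign_folds(n_obs, n_folds, rng):
    """Assign each observation to one of n_folds roughly equal-sized folds."""
    folds = [i % n_folds for i in range(n_obs)]
    rng.shuffle(folds)  # which observation lands in which fold is random
    return folds


# 74 observations and 10 folds, matching the auto example above.
a = assign_folds(74, 10, random.Random(12345))
b = assign_folds(74, 10, random.Random(12345))  # same seed: same folds
c = assign_folds(74, 10, random.Random(54321))  # different seed: different folds
```

With the same seed the folds (and hence the cross-validated lambda) are reproducible; with different random states the folds change, so the selected lambda and the coefficients can change too.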



                            • #15
                              Hi all: I've updated the version on SSC to fix two bugs. Thanks to Kit for the upload.

                              The first bug was an unambiguous mistake, relating to the formula for calculating the maximal lambda.
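For context, the maximal lambda is the smallest penalty at which every coefficient is zero. The post does not give the formula used in elasticregress, so the Python sketch below uses the expression from Friedman, Hastie and Tibshirani (2010) for centred data, max_j |<x_j, y>| / (N * alpha); treating this as what the command computes is an assumption.

```python
def lambda_max(X, y, alpha):
    """Smallest lambda at which all elastic net coefficients are zero.

    X is a list of rows; columns of X and y are assumed to be centred.
    Undefined for alpha = 0 (pure ridge never sets coefficients exactly to zero).
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    n, k = len(X), len(X[0])
    largest = max(abs(sum(X[i][j] * y[i] for i in range(n))) for j in range(k))
    return largest / (n * alpha)


X = [[-1.0], [1.0], [-1.0], [1.0]]
y = [-2.0, 2.0, -2.0, 2.0]
lasso_max = lambda_max(X, y, alpha=1.0)
mixed_max = lambda_max(X, y, alpha=0.5)
```

A path of candidate lambdas for cross-validation would typically start at this value and decrease from there.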

                              The second 'bug' was a bit more ambiguous -- a friend found that when the variance of the dependent variable was small (e.g. with a binary dependent variable), elasticregress was producing non-optimal results because the default tolerance on the Euclidean norm of beta was too large. I've thus changed the default tolerance to be the lesser of (a) the old tolerance (0.001) and (b) abs(0.0001*var(depvar)).
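The new default tolerance rule can be written out as a short Python sketch. The function name is hypothetical, and using the sample variance (dividing by N - 1) is an assumption; the constants 0.001 and 0.0001 come from the description above.

```python
def default_tolerance(depvar, old_tolerance=0.001, scale=0.0001):
    """Lesser of the old tolerance and abs(scale * var(depvar))."""
    n = len(depvar)
    mean = sum(depvar) / n
    # Assumption: sample variance with an N - 1 denominator.
    variance = sum((v - mean) ** 2 for v in depvar) / (n - 1)
    return min(old_tolerance, abs(scale * variance))


binary_tol = default_tolerance([0.0, 1.0, 0.0, 1.0])      # small variance
large_tol = default_tolerance([0.0, 100.0, 0.0, 100.0])   # large variance
```

For the binary variable the variance-based term is the smaller one, so the tolerance tightens; for the high-variance variable the old tolerance of 0.001 still applies.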

                              Neither fix will change the output of elasticregress if the old code was producing correct estimates.
