  • Lasso regress questions

    Hi, I am a college student from Barcelona. It is hard to learn Stata by myself because the teacher does not explain how to use the commands or what they do, and this is an introductory subject. The homework for this weekend uses a dataset with wage and some covariates, and we should use the lasso and ridge approaches. He encouraged us to create as many variables as we can (dummies, etc.; I do not know why). He also told us to run (net install elasticregress, replace) and (ssc install lassopack, replace). I suppose these install some new commands.

    In the second question he says that we should use the commands rlasso and lassoregress. I do not know what the difference between the two commands is; I could not find it on the Internet. I also saw another command called lasso2. What do they do? Thank you.
    2) Use the lasso methods (rlasso, lassoregress and ridgeregress) to select the most relevant covariates for the analysis.

  • #2
    Hi Javier,
    First of all, I would suggest reading through the help files. They often pack a lot of information about the intuition behind a command and its details.
    They often also provide references, or the command is described in a Stata Journal paper.
    I'm surprised you didn't get more information from your instructor. I consider lasso to be quite an advanced method. The intuition behind it is that it can estimate models when you have too many explanatory variables,
    sometimes more than observations.
    Lasso does so by penalizing the size of the coefficients, which shrinks those that are small and not relevant to the model toward zero, some of them exactly to zero.
    Hope this helps,
    Fernando
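The shrinkage intuition above can be sketched in a few lines. This is only an illustration in Python, not what rlasso or lassoregress run internally: it assumes the simplest textbook case (standardized, uncorrelated covariates), where the lasso estimate is the OLS coefficient pulled toward zero by the penalty level, and ridge just rescales it. The variable names and coefficient values are made up.

```python
def soft_threshold(b_ols, lam):
    """Lasso solution for one coefficient in the assumed orthonormal case:
    argmin over b of 0.5*(b - b_ols)**2 + lam*abs(b)."""
    if b_ols > lam:
        return b_ols - lam
    if b_ols < -lam:
        return b_ols + lam
    return 0.0  # coefficients no larger than lam in absolute value are dropped


def ridge_shrink(b_ols, lam):
    """Ridge solution in the same setting:
    argmin over b of 0.5*(b - b_ols)**2 + 0.5*lam*b**2."""
    return b_ols / (1.0 + lam)


# Hypothetical OLS coefficients for a wage regression (made-up numbers).
ols = {"educ": 2.0, "exper": 1.0, "noise1": 0.25, "noise2": -0.125}
lam = 0.5  # penalty level, chosen arbitrarily for the illustration

lasso = {name: soft_threshold(b, lam) for name, b in ols.items()}
ridge = {name: ridge_shrink(b, lam) for name, b in ols.items()}

# Covariates that survive the lasso penalty.
selected = [name for name, b in lasso.items() if b != 0.0]

for name in ols:
    print(name, ols[name], lasso[name], round(ridge[name], 4))
print("selected by lasso:", selected)
```

With this penalty the two small coefficients drop out of the lasso fit entirely, while ridge keeps all four at a reduced size. That is why lasso can be used to select covariates and ridge on its own cannot.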



    • #3
      Thank you, but I cannot find the difference between rlasso and lassoregress. I asked the teacher and he was not able to explain it to me. Can you help me?
      Last edited by Javier Moreno; 31 May 2019, 18:02.



      • #4
        Any answer?



        • #5
          I cannot help you directly because I have never used lasso (though it is on my agenda). But Googling would have helped you get started. Have you looked at:
          https://statalasso.github.io/ ?



          • #6
            First, note that lassoregress and rlasso are part of two separate packages. lassoregress is part of the elasticregress package which was written by Wilbur Townsend. lassoregress uses K-fold cross-validation as the default method, as clearly stated in the help file.

            rlasso is part of lassopack, which consists of three programs:
            • lasso2 is the base command. It obtains the lasso, ridge and elastic net coefficient path.
            • cvlasso uses K-fold cross-validation (but also supports rolling cross-validation for time-series data)
            • rlasso implements "rigorous" (theory-driven) penalization for the lasso and square-root lasso
            Thus, cvlasso and elasticregress/lassoregress serve similar purposes, although you won't get numerically identical results unless you make sure that both commands use the same folds. (As far as I can see, elasticregress/lassoregress doesn't support user-supplied folds.)
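Why the folds matter can be seen in a small sketch. The Python code below is an illustration only, not the code of cvlasso or lassoregress: it assumes a toy one-covariate lasso without a constant, a made-up dataset, and two hypothetical fold assignments (folds_a and folds_b). It computes the cross-validated mean squared prediction error over a grid of penalty levels.

```python
def fit_lasso_1d(x, y, lam):
    """One-covariate lasso without a constant:
    minimizes (1/(2n)) * sum((y - b*x)**2) + lam*abs(b)."""
    n = len(x)
    sxy = sum(xi * yi for xi, yi in zip(x, y)) / n
    sxx = sum(xi * xi for xi in x) / n
    shrunk = max(abs(sxy) - lam, 0.0)  # soft-thresholding step
    return (shrunk if sxy >= 0 else -shrunk) / sxx


def cv_mse(x, y, folds, lam):
    """K-fold cross-validated mean squared prediction error for one lam."""
    total = 0.0
    for k in sorted(set(folds)):
        train = [i for i, f in enumerate(folds) if f != k]
        held_out = [i for i, f in enumerate(folds) if f == k]
        b = fit_lasso_1d([x[i] for i in train], [y[i] for i in train], lam)
        total += sum((y[i] - b * x[i]) ** 2 for i in held_out)
    return total / len(x)


def cv_curve(x, y, folds, grid):
    """Cross-validated error for every penalty level in the grid."""
    return [cv_mse(x, y, folds, lam) for lam in grid]


# Made-up data and two hypothetical ways of splitting it into 3 folds.
x = [1, 2, 3, 4, 5, 6]
y = [1, 3, 2, 5, 4, 7]
folds_a = [0, 1, 2, 0, 1, 2]
folds_b = [0, 0, 1, 1, 2, 2]
grid = [0.0, 0.5, 1.0, 2.0]

curve_a = cv_curve(x, y, folds_a, grid)
curve_b = cv_curve(x, y, folds_b, grid)
best_a = grid[curve_a.index(min(curve_a))]
best_b = grid[curve_b.index(min(curve_b))]

print("folds_a:", [round(v, 4) for v in curve_a], "best lambda:", best_a)
print("folds_b:", [round(v, 4) for v in curve_b], "best lambda:", best_b)
```

The same data and the same grid give a different cross-validation curve as soon as the fold assignment changes, so two commands that draw their folds independently should not be expected to agree to the last digit.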

            To learn more about lassopack, please check the help files and the website, which include some examples of how to use the programs. In the working paper, we try to explain the theory & intuition behind the lasso, and we also compare the three different approaches for selecting the tuning parameters. I hope the paper answers some of your questions.
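To make the contrast with cross-validation concrete, here is a hedged sketch of what a "rigorous" (theory-driven) penalty level looks like. It is written from memory of the lassopack documentation for the homoskedastic case and should be read as an assumption, not as the exact code of rlasso: the formula lambda = 2*c*sigma*sqrt(N)*invnormal(1 - gamma/(2p)), the defaults c = 1.1 and gamma = 0.1/log(N), and the function name rigorous_lambda are all things to check against help rlasso before relying on them.

```python
from math import log, sqrt
from statistics import NormalDist


def rigorous_lambda(n, p, sigma, c=1.1, gamma=None):
    """Theory-driven penalty level for n observations and p covariates.

    Assumed form (homoskedastic case, recalled from the lassopack docs):
        2 * c * sigma * sqrt(n) * inverse_normal_cdf(1 - gamma / (2 * p))
    sigma stands for the standard deviation of the regression error.
    """
    if gamma is None:
        gamma = 0.1 / log(n)  # assumed default; requires n > 1
    quantile = NormalDist().inv_cdf(1.0 - gamma / (2.0 * p))
    return 2.0 * c * sigma * sqrt(n) * quantile


# Hypothetical sample sizes and numbers of covariates.
lam_few = rigorous_lambda(n=100, p=10, sigma=1.0)
lam_many = rigorous_lambda(n=100, p=50, sigma=1.0)
print(round(lam_few, 3), round(lam_many, 3))
```

The point of the sketch is that nothing here depends on how the sample is split: the penalty level follows from the sample size, the number of covariates and the error variance, and it grows as more candidate covariates are added. Cross-validation instead picks the penalty that predicts best on held-out data.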
            --
            Tag me or email me for ddml/pdslasso/lassopack/pystacked related questions. I don't check Statalist.



            • #7
              Thank you very much! So it seems that lassoregress from the other package is useless, since lassopack includes the same thing, no?



              • #8
                Neither cvlasso nor lassoregress is "useless"! These are two different implementations of cross-validation for the lasso. You probably want to familiarise yourself with both and see which one is better for your purposes.
                --
                Tag me or email me for ddml/pdslasso/lassopack/pystacked related questions. I don't check Statalist.

