  • General Lasso Confusion

    I ran a lasso regression on my data, and it does not reach the R^2 that a regression of my own design achieved. Does Stata's lasso tune based on R^2? Ideally I would like it to tune by optimizing the adjusted R^2, but I do not know which option to use.

  • #2
    For details on the lasso command and the methodology behind it, see The Stata Lasso Reference Manual PDF included in your Stata installation and accessible through Stata's Help menu.

    • #3
      Dear Ethan

      As William suggested, you should really learn more about lasso (or any method) before using it. The R2 is an in-sample measure of goodness of fit, which is generally not very interesting. For example, if you keep adding variables to your model, the R2 will never fall and will eventually reach 1, but the model will badly overfit the data and become useless. The goal of lasso is to select and estimate a model that predicts well out of sample, which is much more interesting (and difficult) than getting a high R2.
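The point about R2 rising mechanically can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the data are pure noise generated here (not from the thread), `in_sample_r2_path` is a hypothetical helper name, and the least-squares fit is done by Gram-Schmidt orthogonalisation in plain Python rather than by any Stata routine.

```python
import random

def in_sample_r2_path(y, columns):
    """In-sample R^2 after adding each column in turn (intercept always included).

    Each new column is orthogonalised against the columns already in the
    model (Gram-Schmidt), and the current residual of y is projected on
    whatever is left of it.
    """
    n = len(y)
    basis = [[1.0 / n ** 0.5] * n]           # normalised intercept column
    mean_y = sum(y) / n
    resid = [v - mean_y for v in y]          # residual after the intercept
    sst = sum(r * r for r in resid)          # total sum of squares
    path = []
    for col in columns:
        q = list(col)
        for b in basis:                      # remove what the model already spans
            dot = sum(qi * bi for qi, bi in zip(q, b))
            q = [qi - dot * bi for qi, bi in zip(q, b)]
        norm = sum(qi * qi for qi in q) ** 0.5
        if norm > 1e-12:                     # skip columns that add nothing new
            q = [qi / norm for qi in q]
            dot = sum(ri * qi for ri, qi in zip(resid, q))
            resid = [ri - dot * qi for ri, qi in zip(resid, q)]
            basis.append(q)
        path.append(1.0 - sum(r * r for r in resid) / sst)
    return path

# Assumption for illustration: y and every regressor are independent noise,
# so any fit found here is spurious by construction.
rng = random.Random(0)
n = 12
y = [rng.gauss(0, 1) for _ in range(n)]
noise = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(n - 1)]

r2 = in_sample_r2_path(y, noise)
for k, value in enumerate(r2, start=1):
    print(k, round(value, 3))
```

With 12 observations, an intercept, and 11 noise regressors, the model has as many parameters as data points, so the printed R2 climbs to essentially 1 even though nothing in the regressors is related to `y`.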

      Best wishes,

      Joao

      • #4
        Am I mistaken that lasso is fit via cross-validation using out-of-sample MSE, and that that is a 1:1 function of R^2? In other words, minimizing OOS MSE will maximize OOS R^2, right? The default is doing what Ethan wants.
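A toy version of the question in #4 can be run directly. This is a minimal sketch, not Stata's implementation: it assumes simulated data, a single regressor with no intercept, and one train/holdout split standing in for k-fold cross-validation. The function names (`soft_threshold`, `lasso_one_predictor`, `holdout_mse_and_r2`) and the lambda grid are made up for illustration.

```python
import random

def soft_threshold(z, lam):
    """Shrink z towards zero by lam; values inside [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_one_predictor(x, y, lam):
    """Minimise (1/2n) * sum((y - b*x)^2) + lam * |b| over the single slope b."""
    n = len(x)
    xy = sum(xi * yi for xi, yi in zip(x, y)) / n
    xx = sum(xi * xi for xi in x) / n
    return soft_threshold(xy, lam) / xx

def holdout_mse_and_r2(x, y, b):
    """MSE and R^2 of the predictions b*x on a held-out sample."""
    n = len(y)
    mse = sum((yi - b * xi) ** 2 for xi, yi in zip(x, y)) / n
    mean_y = sum(y) / n
    var_y = sum((yi - mean_y) ** 2 for yi in y) / n   # does not depend on lam
    return mse, 1.0 - mse / var_y

# Simulated data (an assumption for illustration): y = 0.5 * x + noise.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [0.5 * xi + rng.gauss(0, 1) for xi in x]
x_train, y_train, x_test, y_test = x[:100], y[:100], x[100:], y[100:]

lambdas = [0.0, 0.05, 0.1, 0.2, 0.4, 0.8]
results = []
for lam in lambdas:
    b = lasso_one_predictor(x_train, y_train, lam)
    results.append(holdout_mse_and_r2(x_test, y_test, b))

# Pick the penalty once by lowest holdout MSE and once by highest holdout R^2.
best_by_mse = min(range(len(lambdas)), key=lambda i: results[i][0])
best_by_r2 = max(range(len(lambdas)), key=lambda i: results[i][1])
print(lambdas[best_by_mse], lambdas[best_by_r2])
```

Both selection rules print the same lambda, because the holdout variance of `y` in the denominator of R^2 is the same for every candidate penalty. The same argument would not carry over to the in-sample R^2 of an unpenalized regression, which is what #1 appears to compare against.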

        • #5
          I may have missed something, but I do not think #1 referred to an out-of-sample R2, did it? Of course, OoS R2 and in-sample R2 are two very different things, and I guess that is the problem.

          • #6
            Based on my own research on out-of-sample predictability, cited below, I think that the in-sample and out-of-sample R-squared are entirely different statistics, and there is no known relationship between them. So I do not think that they are a one-to-one function of each other at all.
            Kolev, Gueorgui I., and Rasa Karapandza. "Out-of-sample equity premium predictability and sample split–invariant inference." Journal of Banking & Finance 84 (2017): 188-201.
            Originally posted by Jackson Monroe
            Am I mistaken that lasso is fit via cross-validation using out-of-sample MSE, and that that is a 1:1 function of R^2? In other words, minimizing OOS MSE will maximize OOS R^2, right? The default is doing what Ethan wants.

            • #7
              Originally posted by Joao Santos Silva
              I may have missed something, but I do not think #1 referred to an out-of-sample R2, did it? Of course, OoS R2 and in-sample R2 are two very different things, and I guess that is the problem.
              Ethan asked, "Does stata's lasso tune based on r^2?" Considering OOS R^2 is an R^2, if not the typical one, I would say the answer is yes because MSE is 1 to 1 with R^2. If CV methods do any tuning it will be on OOS data, so his reference to R^2 led me to think he was referring to OOS R^2. Perhaps I put too much gloss on an otherwise unclear question.

              To #6: Joro, I agree that in-sample and out-of-sample R^2 need not be related; I just assumed #1 was asking about OOS because of the CV nature of the lasso in Stata. Interesting paper, by the way. I did not follow the notation, but it seemed reasonable that OOS R^2 was negative on portfolio data: too much noise in the predictions. My general point was that MSE is one-to-one with R^2 (as it is traditionally defined), and lasso is indeed tuned on OOS MSE.

              • #8
                Indeed, I think you read a lot into a random shot by the OP :P.

                You are right: if the dependent variable is the same, R-squared = 1 - MSE/(mean squared deviation of the dependent variable from its mean), so as long as the dependent variable and the evaluation sample do not change, each is a one-to-one mapping of the other.
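Written out for a fixed evaluation sample, the relationship looks as follows. The notation is introduced here for illustration and is not from the thread: $\lambda$ indexes the candidate lasso penalties, $\hat{y}_i(\lambda)$ is the prediction for observation $i$, and $S_y^2$ is the mean squared deviation of the dependent variable from its mean.

```latex
\[
R^2(\lambda) = 1 - \frac{\mathrm{MSE}(\lambda)}{S_y^2},
\qquad
\mathrm{MSE}(\lambda) = \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - \hat{y}_i(\lambda)\bigr)^2,
\qquad
S_y^2 = \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - \bar{y}\bigr)^2 .
\]
Because $S_y^2 > 0$ does not depend on $\lambda$,
\[
\mathrm{MSE}(\lambda_1) < \mathrm{MSE}(\lambda_2)
\iff
R^2(\lambda_1) > R^2(\lambda_2),
\qquad\text{so}\qquad
\arg\min_{\lambda}\,\mathrm{MSE}(\lambda) = \arg\max_{\lambda}\,R^2(\lambda).
\]
```

The adjusted R^2 asked about in #1 does not share this property across candidate models: its degrees-of-freedom correction depends on the number of selected variables, which changes with the penalty.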

                Originally posted by Jackson Monroe

                Ethan asked, "Does stata's lasso tune based on r^2?" Considering OOS R^2 is an R^2, if not the typical one, I would say the answer is yes because MSE is 1 to 1 with R^2. If CV methods do any tuning it will be on OOS data, so his reference to R^2 led me to think he was referring to OOS R^2. Perhaps I put too much gloss on an otherwise unclear question.

                To #6: Joro, I agree that in-sample and out-of-sample R^2 need not be related; I just assumed #1 was asking about OOS because of the CV nature of the lasso in Stata. Interesting paper, by the way. I did not follow the notation, but it seemed reasonable that OOS R^2 was negative on portfolio data: too much noise in the predictions. My general point was that MSE is one-to-one with R^2 (as it is traditionally defined), and lasso is indeed tuned on OOS MSE.
