  • Lasso bug - always included variable disappears

    Hi all,

    I'm running into a very bizarre issue while running the -lasso linear- command. I've been using this command for a while, and it only started happening recently.

    I'm using -lasso linear- to select a subset of variables while always including a few variables of interest: let's call them A, B and C. The candidates for selection come from two blocks of variables listed in the global macros $D and $E. I run:

    Code:
     lasso linear (c.A i.B i.C) $D $E , nolog rseed(123) selection(plugin)
    The selection occurs without an error message, but A is never included in the selected variables. However, A does get selected when I remove $E. What may be going on? Some sort of collinearity, perhaps? If so, there would surely be an error message, and how do I explain the fact that this used to work well?

  • #2
    Can you include a data example?


    EDIT: And for what it's worth, I don't buy that it just stopped working. Barring ado updates or other things that change the software, syntax doesn't just stop working when I wake up while having coffee. Always, something you've done is the culprit. I know from experience. Last night I was trying to debug a subroutine I'd written for a command. I couldn't figure out why it wasn't looping over all the units correctly. And then I discovered that an artifact from my OLD code was causing problems with my newer updated syntax.

    The point I'm trying to make here is that Stata isn't a fickle mistress. She doesn't simply get up and leave because she's bored, throwing your do-files into anarchy. Typically, user error or changes in code explain this.
    Last edited by Jared Greathouse; 19 Sep 2022, 06:29.



    • #3
      Unfortunately, I can't share this data publicly.

      Agreed that an error or change on my end is the most likely cause, but that doesn't change the fact that there is an issue with the Stata output. I lack the detailed knowledge of Stata's lasso commands that might tell me why a variable isn't selected even though it is specified as always included. Any pointers would be greatly appreciated!



      • #4
        Statistically speaking, I suspect that flaky boyfriends greatly outnumber fickle mistresses, whatever they are. Despite @Jared Greathouse’s dubious analogies, I agree with his larger Stata-based points:

        1. Unless your code has been changed by an update, it shouldn’t suddenly behave differently.

        2. Without a reproducible example — fake data are welcome too — it is hard to discuss the report here.
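
        A fake-data example along these lines might look like the sketch below. Everything in it is made up for illustration: the outcome name y (the command in #1 does not show one), the candidate variables d1-d10 and e1-e10, and the data-generating process. Only the shape of the -lasso linear- call mirrors #1.

        ```stata
        * Hypothetical fake data; none of these names come from the real dataset
        clear
        set seed 123
        set obs 500
        generate A = rnormal()
        generate byte B = runiformint(1, 3)
        generate byte C = runiformint(0, 1)
        forvalues j = 1/10 {
            generate d`j' = rnormal()
        }
        forvalues j = 1/10 {
            generate e`j' = rnormal()
        }
        generate y = 1 + 0.5*A + d1 - e1 + rnormal()

        * Two blocks of candidate variables, as in #1
        global D d*
        global E e*

        * Same structure as the command in #1, with an explicit outcome
        lasso linear y (c.A i.B i.C) $D $E, nolog rseed(123) selection(plugin)

        * List the variables that ended up in the selected model
        lassocoef
        ```

        If A shows up in the -lassocoef- table here but not with the real data, the difference lies in the data or in the macro definitions rather than in the syntax.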



        • #5
          Do you have a saved do-file and dataset that "used to run well"? If so, have you re-run them with no changes, and did you find that they no longer run well?

          Have you tried running your current example repeatedly, with different subsets of variables included in $E, to try to narrow down which of the variables causes the problem?
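
          One way to do that narrowing down is sketched below: add the variables in $E one at a time and look at the selected set after each fit. This assumes that $E is a plain space-separated list of variable names (no wildcards, ranges, or factor-variable operators) and that y is the outcome; y is a hypothetical name, since the command in #1 does not show one.

          ```stata
          * Sketch only: y is a hypothetical outcome name
          local sofar
          foreach v of global E {
              * Grow the list of E candidates by one variable per pass
              local sofar `sofar' `v'
              display as text _newline "Candidates from E so far: `sofar'"
              lasso linear y (c.A i.B i.C) $D `sofar', nolog rseed(123) selection(plugin)
              * Check whether A is still listed among the selected variables
              lassocoef
          }
          ```

          The first pass in which A disappears from the -lassocoef- table points to the variable worth inspecting.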



          • #6
            Also, much of this will likely depend on what exactly the global macros consist of. If removing $E makes all right in the world again, this tells me that something about the definition of said macro is not right, to put it mildly.
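
            A rough sketch of how one might inspect the two macros follows. It assumes both globals hold plain variable names, and it uses A, B and C as stand-ins for the always-included variables, as in #1. The local names are made up for illustration.

            ```stata
            * Show exactly what the two globals contain when the command runs
            macro list D E

            * Confirm that every name resolves to an existing variable
            * (fails if the macros contain factor-variable operators)
            confirm variable $D $E

            * Copy the globals to locals so the macro list functions can be used
            local D $D
            local E $E
            local always A B C

            * Names repeated within E
            local dups_E : list dups E
            * Names that appear in both D and E
            local common : list D & E
            * Always-included names that are also listed in E
            local clash : list always & E

            display "repeated in E: `dups_E'"
            display "in both D and E: `common'"
            display "always-included names in E: `clash'"
            ```

            An empty result on all three lines would rule out the simplest kinds of overlap, though not collinearity among the variables themselves.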

