
  • Forcing lasso to select main effects if it selects the interaction term

    Is there a way in Stata to run a lasso such that I force it to keep the main effect if it selects the interaction term?

    I'm using cvlasso to identify the lambda with the lowest MSPE, and then I want to use the selected predictors to run an OLS. My list of predictors includes interaction terms along with the individual variables that are interacted. Currently, without forcing lasso to keep any predictors, it selects some interaction terms but does not necessarily select the individual variables. I need the individual variables because I need the main effects to make sense of the OLS.

    So, as an example, let's say my list of potential predictors is T, X1, X2, X3, T*X1, T*X2, and T*X3.

    Now if I run the lasso without forcing anything, it selects X1, X2, T*X2, and T*X3.

    But I cannot run an OLS with only these predictors because they don't include X3, and so interpreting the coefficient on T*X3 would not make sense without also having X3 in the OLS, in my case. So I need the lasso to find the combination of predictors that minimizes the MSPE and also keeps the main effects if it selects any of the interaction terms.
    Last edited by Pulkit Aggarwal; 10 Sep 2023, 00:55.

  • #2
    Why not use the official lasso and make use of factor variable notation to avoid this?

    Code:
    help lasso



    • #3
      Andrew, can you say more? I've wondered about this, too. I don't see how you force X3 to stay in conditional on T*X3 staying in. Using factor variable notation alone doesn't do it. Is there an option using lasso that I'm not finding?

      Pulkit: One thing I definitely would try is centering all variables before including the interactions. It seems like you'd want to keep T also, as I suspect that is a "treatment" variable. If you don't center variables then the coefficients on the main effects are often meaningless; they can be close to zero even though the average effect is not zero. So, without centering, T (and X3) can be dropped even though they would be kept if centering (or full standardization) were used.
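
      A minimal sketch of the centering step, assuming hypothetical variable names y, T, x1, x2, x3 (taken from the example in #1) and that cvlasso from the lassopack package is installed. It leans on cvlasso's notpen() option to leave T and the covariates unpenalized so they are always kept. That is blunter than the conditional rule asked for in #1, and the seed value is arbitrary.

      Code:
      * Center T and each covariate at its sample mean
      foreach v of varlist T x1 x2 x3 {
          summarize `v', meanonly
          generate double `v'_c = `v' - r(mean)
      }
      * Build the interactions from the centered variables
      foreach v in x1 x2 x3 {
          generate double T_`v' = T_c * `v'_c
      }
      * Cross-validate and refit at the MSPE-minimizing lambda;
      * main effects are unpenalized, only the interactions can be dropped
      cvlasso y T_c x1_c x2_c x3_c T_x1 T_x2 T_x3, lopt seed(12345) notpen(T_c x1_c x2_c x3_c)

      Centering before forming the products is the part that matters for interpretation; whether to leave the main effects unpenalized is a separate choice and can be dropped from the call.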



      • #4
        Jeff, I mean that when you run lasso indicating interactions using factor variables, Stata creates a macro of all variables needed post lasso. But you're correct, this is not forcing these variables to be selected conditional on interactions involving them being selected. Here is an example:

        Code:
        webuse cattaneo2, clear 
        lasso logit lbweight c.mage##i.msmoke c.fage##i.foreign c.mage#c.fage c.fedu##c.medu
        display "`e(allvars)'"
        di "`e(allvars_sel)'"
        display "`e(post_sel_vars)'"
        Results:

        Code:
        . di "`e(allvars)'"
        mage 0bn.msmoke 1bn.msmoke 2bn.msmoke 3bn.msmoke 0bn.msmoke#c.mage 1bn.msmoke#c.mage 2bn.msmoke#c.mage 3bn.msmoke#c.mage fage 0bn.fo
        > reign 1bn.foreign 0bn.foreign#c.fage 1bn.foreign#c.fage c.mage#c.fage fedu medu c.fedu#c.medu
        
        . di "`e(allvars_sel)'"
        1bn.msmoke 0bn.msmoke#c.mage 2bn.msmoke#c.mage 3bn.msmoke#c.mage 0bn.foreign#c.fage 1bn.foreign#c.fage medu
        
        . di "`e(post_sel_vars)'"
        lbweight msmoke mage foreign fage medu
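
        One way to guarantee that the main effects stay in with the official command is to list them in parentheses, which lasso treats as always-included variables. The following is a sketch on the same dataset, not a conditional rule: the main effects are kept whether or not any interaction is selected. The seed is arbitrary, and the /// continuation assumes the lines are run from a do-file.

        Code:
        webuse cattaneo2, clear
        * Terms in parentheses are always included; only the interactions are penalized
        lasso logit lbweight (c.mage i.msmoke c.fage i.foreign c.fedu c.medu) ///
            c.mage#i.msmoke c.fage#i.foreign c.mage#c.fage c.fedu#c.medu, rseed(12345)
        * List the terms in the fitted model
        lassocoef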



        • #5
          Thanks Jeff and Andrew!

          Jeff, did you find a way to do this in R or Python instead?

          And thanks for the point about centering, you're correct that T is treatment so I would like to keep that too, I'll keep that in mind!




          • #6
            Pulkit: It's kind of you to think I'd be capable of that ....

            I think it's more than a programming issue. I don't think the available algorithms allow this. It would be good for you to figure it out. :-)

            I do know that allowing heterogeneity in treatment effects when using ML is tricky and has spawned much research by Athey, Chernozhukov, and their coauthors. It's not as straightforward as using lasso to choose controls when you have a single treatment variable.



            • #7
              Thanks for the references! I'll look into work by Athey, Chernozhukov et al.

              For now, I'm using the lasso to identify the most important predictors of which individuals/households responded more strongly to the treatment, so I'm thinking more in terms of prediction than causal inference.

              It seems like there's a way to force the lasso to keep interactions and main effects in R (https://strakaps.github.io/post/glinternet/), but I'm not sure how sound it is from a mathematical point of view (I'm still trying to understand it).
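
              Short of a hierarchy-enforcing estimator like the one linked above, a pragmatic step within Stata is to add back any missing main effects after selection and before the OLS. A rough sketch, assuming hand-built interaction variables with the hypothetical names Tx1, Tx2, Tx3, an outcome called y, and the selection from the example in #1 typed in by hand. This is not the constrained MSPE-minimizing solution, only a way to get an interpretable regression.

              Code:
              * Selected terms from the example in #1 (entered manually here)
              local selected "x1 x2 Tx2 Tx3"
              local final "`selected'"
              foreach v of local selected {
                  * Treat any name that starts with T, other than T itself, as an interaction
                  if substr("`v'", 1, 1) == "T" & "`v'" != "T" {
                      local main = substr("`v'", 2, .)
                      local treat "T"
                      * Append the covariate and T only if they are not already listed
                      local final : list final | main
                      local final : list final | treat
                  }
              }
              display "`final'"
              regress y `final'

              The name-matching rule depends entirely on the naming convention assumed above; with factor-variable notation the terms would need to be parsed differently.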
