  • Do StataCorp's lasso/elasticnet commands automatically normalize independent variables?

    It is generally advisable to normalize (or standardize, or rescale) independent variables before running a LASSO/ridge regression, so that they are all on the same scale. This may not be advisable (or may be irrelevant) for categorical variables. Many machine learning libraries automatically normalize variables for LASSO/ridge/elastic net. Do the official StataCorp lasso and elasticnet commands automatically normalize predictors? If so, what normalization/standardization/rescaling procedure do they use? I could not find documentation on this.

  • #2
    In the output of help lasso, click the "View complete PDF manual entry" link at the top to open the full documentation for the lasso command in the Stata Lasso Reference Manual PDF. Scroll down to the section headed "Penalized and postselection coefficients", where you will find a discussion of standardization.


    • #3
      William Lisowski, thank you for that. The discussion could be in a more obvious place, but I don't know how I missed it.

      It seems Stata offers very little flexibility here: it simply standardizes all potential predictors by default, including categorical variables, with no option to turn this off or to implement your own (potentially by-hand) approach. I think this is an undesirable aspect of StataCorp's lasso. See this Cross Validated thread for a discussion of some problems with standardizing categorical variables and why one might want to try different approaches.
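To make the concern about categorical variables concrete, here is a minimal Python sketch (not Stata code, and not a reproduction of what lasso does internally). It assumes plain z-score standardization with the population standard deviation, and the two 0/1 dummies are made-up data. The point it illustrates is that a dummy's standard deviation depends on how common the category is, so standardizing rescales rare and common dummies by different amounts.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores: (x - mean) / population standard deviation.

    Assumption: population SD is used here for illustration only; the
    exact formula used by any particular software may differ.
    """
    m = mean(values)
    s = pstdev(values)
    return [(x - m) / s for x in values]


# Two hypothetical 0/1 dummies with different prevalence.
common = [1] * 50 + [0] * 50   # 50% ones, sd = sqrt(0.50 * 0.50) = 0.5
rare = [1] * 5 + [0] * 95      # 5% ones,  sd = sqrt(0.05 * 0.95), about 0.218

z_common = standardize(common)
z_rare = standardize(rare)

# After standardizing, the gap between "category = 1" and "category = 0"
# equals 1 / sd, so it is wider for the rare dummy.
gap_common = max(z_common) - min(z_common)
gap_rare = max(z_rare) - min(z_rare)

print(f"gap for common dummy: {gap_common:.3f}")
print(f"gap for rare dummy:   {gap_rare:.3f}")
```

Because the penalty is applied to coefficients on the standardized scale, the same underlying effect of "being in the category" maps to a standardized coefficient of a different size for each dummy, so the two are not penalized equally. Whether that is desirable is exactly what the linked discussion is about.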
