
Varying outcomes of observations with lasso and elasticnet

    Dear Stata community members,

    I have a question about the elasticnet and lasso commands in Stata 17. I ran the following commands for elastic net (1), ridge (2), and lasso (3):
    1
    Code:
    elasticnet linear x1 $vldummy $vlcontinuous, alpha(0(0.05)1)
    2
    Code:
    elasticnet linear x1 $vldummy $vlcontinuous, alpha(0)
    3
    Code:
    lasso linear x1 $vldummy $vlcontinuous
    where x1 is the dependent variable. I am trying to compare the estimates of the three methods for variable selection on my dataset.

    I ran each of these commands several times and saved the estimates after every run using:
    Code:
    estimates store elasticnet
    Code:
    estimates store elasticnet1
    etc.
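
    For reference, the run-and-store workflow above can be sketched with a fixed seed. This is only a minimal sketch: the seed value 12345 and the stored names enet_fixed, ridge_fixed and lasso_fixed are hypothetical, chosen for illustration. With rseed() set, the cross-validation folds should be the same on every run, so repeated runs should store identical results.
    Code:
    * hypothetical seed and estimate names, for illustration only
    elasticnet linear x1 $vldummy $vlcontinuous, alpha(0(0.05)1) rseed(12345)
    estimates store enet_fixed
    elasticnet linear x1 $vldummy $vlcontinuous, alpha(0) rseed(12345)
    estimates store ridge_fixed
    lasso linear x1 $vldummy $vlcontinuous, rseed(12345)
    estimates store lasso_fixed
    * compare fit across the three stored results
    lassogof enet_fixed ridge_fixed lasso_fixed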

    Then using
    Code:
    lassogof
    I got the following table for all estimates:

    Code:
    -------------------------------------------------
           Name |         MSE    R-squared        Obs
    ------------+------------------------------------
     elasticnet |    2.961392       0.2433      5,113
    elasticnet1 |     2.93792       0.2308      4,827
    elasticnetT |    2.955981       0.2527      3,608
    elasticnetT1|    2.950256       0.2541      3,605
    elasticnetT2|    2.962259       0.2430      5,113
    elasticnetT3|    2.961392       0.2433      5,113
          lasso |    2.961809       0.2432      5,113
         lasso1 |    2.994677       0.2343      5,120
         lassoT |    2.961809       0.2432      5,113
        lassoT1 |    2.961809       0.2432      5,113
        lassoT2 |    2.961809       0.2432      5,113
        lassoT3 |    2.961809       0.2432      5,113
          ridge |    2.874946       0.2577      3,416
         ridge1 |    2.938475       0.2413      3,416
         ridgeT |    2.874946       0.2577      3,416
        ridgeT1 |    2.881256       0.2561      3,416
        ridgeT2 |    2.872305       0.2584      3,416
        ridgeT3 |     2.87792       0.2570      3,416
    -------------------------------------------------
    My question is: why do the results sometimes vary and sometimes coincide exactly (e.g. elasticnet and elasticnetT3)? In a large dataset such as mine, it seems very unlikely that the same random sample would be drawn twice. More importantly, why does the number of Obs vary so drastically for elasticnet, and what does the Obs number tell me? The entire dataset has 10,721 observations.
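
    One way to put the Obs numbers in context, as a rough sketch: assuming Obs counts the observations with no missing values among the variables involved (I am not sure that it does), the number of complete cases across all candidate variables can be computed as below. The variable name nmiss is hypothetical, and the sketch assumes the two macros hold plain variable names without factor-variable notation.
    Code:
    * nmiss is a hypothetical helper: number of missing values per observation
    egen nmiss = rowmiss(x1 $vldummy $vlcontinuous)
    * observations that are complete on the outcome and all candidate predictors
    count if nmiss == 0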

    I could not find any information on this via
    Code:
    help elasticnet
    or
    Code:
    help lasso
    Any help or hints would be greatly appreciated!

    Best regards,

    Lukas