Interpreting lassologit results

Scott Rick

07 Jul 2023, 04:55

Hi. I am trying to use lassologit for model selection in a dataset where the number of predictors is larger than the number of observations. This is my first time using a lasso command, and while I have a basic understanding of the mechanics, I am struggling to interpret the Stata output and to work out which predictors are best suited to my outcome measure.

Below is the output from the lassologit and cvlassologit commands. Any guidance on how to interpret these results and identify which predictors work best would be much appreciated. Alternatively, if I've gotten this all wrong, I'd welcome suggestions on what I should do instead to identify the best predictors. For reference, I am using Stata 15.1 and don't have access to some of the newer commands that simplify this process.

Code:

. lassologit Stepone $demo $health $healthsys $econ
Obtaining solution for 50 lambdas
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
..................................................    50


  Knot|  ID     Lambda    s      L1-Norm     EBIC     Pseudo-R2 | Entered/removed
------+---------------------------------------------------------+----------------
     1|   1    5.61988     0     0.18232     30.31641   0.0000  | Added _cons.
     2|   2    5.11577     1     0.45126     39.14029   0.0329  | Added healthover_tot1516.
     3|   3    4.65687     3     0.88499     57.48661   0.0756  | Added povertyperct
      |                                                         | religion_others_2011.
     4|   7    3.19764     4     2.27917     62.72845   0.2267  | Added
      |                                                         | rate_acuteresp_mlcase2018.
     5|   9    2.64970     6   624.74804     80.40714   0.2915  | Added rate_hepdeaths_2018
      |                                                         | rate_ventilators_public.
     6|  11    2.19565     8  6106.07641     97.11783   0.3882  | Added
      |                                                         | rate_influenzacase_2018
      |                                                         | exphealth_percap1516.
     7|  12    1.99870     7  5594.80079     86.03996   0.4296  | Removed povertyperct.
     8|  15    1.50764     8  2635.32693     92.90529   0.5271  | Added jain_2011.
     9|  16    1.37240     8  1133.65945     91.88012   0.5609  | Added
      |                                                         | rate_typoidml_death2018.
      |                                                         | Removed
      |                                                         | rate_ventilators_public.
    10|  17    1.24930    12  1393.10142    130.18702   0.5932  | Added pop_perkm2_2019
      |                                                         | muslim_2011
      |                                                         | rate_pneumoniaml_case2018
      |                                                         | rate_hospital_beds_private.
    11|  18    1.13723    12  1704.08816    129.02455   0.6315  | Added projtotpop_2019.
    12|  20    0.94236    14  2157.12334    146.57911   0.7004  | Added
      |                                                         | rate_acuteresp_femdeath2018.
    13|  21    0.85783    13  2324.96845    135.87923   0.7294  | Removed pop_perkm2_2019.
    14|  24    0.64707    14  2799.60647    143.58208   0.7993  | Added sexratio_2016.
    15|  27    0.48809    15  3219.14459    151.85042   0.8505  | Added projpercturban_2019.
    16|  28    0.44431    14  3387.09811    141.60784   0.8644  | Removed
      |                                                         | exphealth_percap1516.
    17|  30    0.36817    15  3804.53081    150.70366   0.8883  | Added rate_tb2018.
    18|  48    0.06782    16  7637.99027    157.74653   0.9799  | Added pop_perkm2_2019.
    19|  49    0.06174    17  7860.38536    167.51301   0.9817  | Added per85_2011.
Use 'long' option for full output. 
Type e.g. 'lassologit, lic(ebic)' to run the model selected by EBIC.

. lassologit, lic(ebic)
Use lambda=5.619880267729252 (selected by EBIC).

---------------------------------------------------
         Selected |       Logistic       Post
                  |       Lasso          logit
------------------+--------------------------------
            _cons |      -0.1823216      8.8560519
---------------------------------------------------




. cvlassologit Stepone $demo $health $healthsys $econ, nfolds(10) seed(123) tabfold stratified

     (max) |                               Fold
   Stepone |         1          2          3          4          5          6 |     Total
-----------+------------------------------------------------------------------+----------
         0 |         1          1          1          1          2          1 |        12 
         1 |         1          1          1          1          1          1 |        10 
-----------+------------------------------------------------------------------+----------
     Total |         2          2          2          2          3          2 |        22 


     (max) |                    Fold
   Stepone |         7          8          9         10 |     Total
-----------+--------------------------------------------+----------
         0 |         1          1          1          2 |        12 
         1 |         1          1          1          1 |        10 
-----------+--------------------------------------------+----------
     Total |         2          2          2          3 |        22 
K-fold cross-validation with 10 folds.
Fold 1 2 3 4 5 6 7 8 9 10 
          |         Lambda       Deviance       St. err.
----------+---------------------------------------------
         1|      5.6198803      1.4244316      .01966248  ^
         4|       3.681744      1.4152504      .11431231  *
* lopt = the lambda that minimizes loss measure.
  Run model: cvlassologit, lopt
^ lse = largest lambda for which MSPE is within one standard error of the minimum loss.
  Run model: cvlassologit, lse
  Use 'long' option for long output.

. cvlassologit, lopt postresults
          |         Lambda       Deviance       St. err.
----------+---------------------------------------------
         1|      5.6198803      1.4244316      .01966248  ^
         4|       3.681744      1.4152504      .11431231  *
* lopt = the lambda that minimizes loss measure.
  Run model: cvlassologit, lopt
^ lse = largest lambda for which MSPE is within one standard error of the minimum loss.
  Run model: cvlassologit, lse
  Use 'long' option for long output.
Estimate  with lambda=3.682 (lopt).

---------------------------------------------------
         Selected |       Logistic       Post
                  |       Lasso          logit
------------------+--------------------------------
     povertyperct |       0.0071754      0.0289432
religion_oth~2011 |       0.0551081      0.6745619
healthover_t~1516 |      -0.3188299     -2.1661823
            _cons |       1.3770605     10.4341639
---------------------------------------------------


. predict double phat, pr
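
As a sanity check on the lse footnote in the cvlassologit table above, the one-standard-error rule can be reproduced by hand from the two rows that were printed. This is only a rough sketch: the scalar names (dev_min, se_min, dev_top, cutoff) are made up for illustration, the numbers are copied from the table, and it assumes the cutoff is simply the minimum deviance plus its standard error.

```stata
* Values copied from the cvlassologit table above (rows marked * and ^)
scalar dev_min = 1.4152504    // deviance at lopt, lambda = 3.681744
scalar se_min  = .11431231    // standard error at lopt
scalar dev_top = 1.4244316    // deviance at lambda = 5.6198803

* One-standard-error cutoff: minimum loss plus its standard error
scalar cutoff = dev_min + se_min
display "cutoff = " cutoff

* Displays 1 if the largest lambda on the grid falls within the cutoff
display "lambda = 5.6198803 within 1 SE: " (dev_top <= cutoff)
```

The cutoff works out to about 1.5296, which is above 1.4244, so the largest lambda on the grid already satisfies the rule. That matches the ^ marker sitting on lambda = 5.6198803, which is also the lambda at which the lassologit path contains only _cons.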
