  • Different Out-of-Sample R^2 in "lasso" vs. "lassogof" command

    Hello All,

    This is my first post on Statalist, so please let me know if I am not following the posting guidelines as closely as this forum encourages.

    I would like to predict the unemployment duration of U.S. citizens, using the U.S. Current Population Survey. To that end, I am running the following lasso using an adaptive selection method:

    lasso linear logdur1 logcpsannual1 age age2 hours1 i.multjob i.sex_new ///
    i.married i.educ_new i.hispanic i.race_new i.region i.occ_new i.ind_new ///
    age##i.educ_new age##hours1 age##i.sex_new multjob##i.sex_new ///
    hours1##i.sex_new i.region##i.occ_new i.region##i.ind_new i.occ##i.ind_new, ///
    selection(adaptive, steps(6)) rseed(1234)

    This yields the following output:

             |                                No. of      Out-of-      CV mean
             |                               nonzero       sample   prediction
          ID |     Description      lambda     coef.    R-squared        error
    ---------+----------------------------------------------------------------
         360 |    first lambda     .575668         0       0.0001     1.884722
         415 |   lambda before     .003451       134       0.1926     1.522009
       * 416 | selected lambda    .0031445       134       0.1926     1.521993
         417 |    lambda after    .0028651       134       0.1926     1.522009
         437 |     last lambda    .0004457       134       0.1918     1.523534

    Is this reported out-of-sample R-squared the relevant one for evaluating model performance? If so, the selected model does not seem to perform too badly. Or do I need to split the sample and use the "lassogof" command to assess the out-of-sample performance of my model (the R-squared values from the two approaches differ)?
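
    For reference, the split-sample alternative I have in mind would look roughly like the sketch below. The variable name "sample", the 75/25 split, and the splitting seed are illustrative assumptions rather than settings I have settled on, and the predictor list is shortened to the main effects for readability:

    ```stata
    * Sketch only: "sample", the 75/25 split, and rseed(12345) are illustrative choices
    splitsample, generate(sample) split(.75 .25) rseed(12345)

    * Fit the adaptive lasso on the training portion only (sample == 1)
    lasso linear logdur1 logcpsannual1 age age2 hours1 i.multjob i.sex_new ///
        i.married i.educ_new i.hispanic i.race_new i.region i.occ_new i.ind_new ///
        if sample == 1, selection(adaptive, steps(6)) rseed(1234)

    * Report MSE and R-squared separately for the training (1) and testing (2) portions
    lassogof, over(sample) postselection
    ```

    As I understand it, the R-squared that "lassogof" shows for sample 2 comes from observations that played no part in fitting or in choosing lambda, whereas the value in the table above comes from the cross-validation folds used during selection.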

    Thank you very much for your help and guidance!

    Best,
    Chantal

    ---

    Data example using "dataex"

    clear
    input float(logdur1 logcpsannual1) byte age float(age2 hours1) byte multjob float(sex_new married) long educ_new float hispanic long race_new byte region long(occ_new ind_new)
    0 11.23443 52 2704 40 1 1 1 4 0 3 42 8 6
    2.0794415 . 55 3025 40 1 0 0 8 0 3 42 23 18
    1.3862944 . 33 1089 40 1 0 0 8 1 3 42 17 16
    3.555348 10.58446 26 676 40 1 0 0 2 0 2 42 11 8
    1.7917595 11.838346 35 1225 40 1 0 1 3 0 2 42 6 5
    0 9.128696 36 1296 40 1 0 0 8 1 3 42 2 4
    1.3862944 . 58 3364 40 1 0 1 1 0 3 42 13 13
    1.0986123 8.014336 61 3721 . 1 1 1 7 0 3 41 18 10
    1.3862944 . 21 441 35 1 1 0 8 0 1 11 23 17
    1.3862944 . 31 961 35 1 1 0 8 0 3 31 19 13
    .6931472 8.866317 50 2500 20 1 1 1 1 0 2 42 19 10
    2.0794415 10.002427 37 1369 40 1 0 0 7 0 3 41 16 18
    2.484907 10.651123 58 3364 40 1 0 1 7 0 3 41 1 9
    1.3862944 . 25 625 . 1 0 0 2 0 1 22 6 7
    1.3862944 11.145195 62 3844 40 1 1 0 4 0 1 32 16 8
    end
