  • K-fold cross validation

    Hi all,

    I am trying to conduct k-fold cross validation for both logistic and OLS regressions. From what I have read online about this topic, the following appear to be my options.

    Code:
    cvauroc
    for logistic regressions

    Code:
    loocv
    or
    Code:
    crossfold
    for OLS regressions

    I cannot use lasso as I do not have Stata 16. I can carry out the
    Code:
    cvauroc
    command, but whenever I try to enter either of the OLS commands, Stata simply rejects it, stating "command x is unrecognised" in reference to the dependent variable. I was hoping someone may be able to assist me.

    Thanks in advance,
    Caleb


  • #2
    Each of those three is user-written (community-contributed). Have you downloaded and installed them? If not, use -search- to find and install them.
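
    A minimal sketch of that install step. The -search- call is as suggested above; the -ssc install- lines assume all three packages are hosted on SSC, which is worth confirming from the -search- output before relying on it.

    ```stata
    * Look up a community-contributed command by name
    search crossfold

    * Assumption: all three packages are available from SSC
    ssc install crossfold
    ssc install loocv
    ssc install cvauroc

    * Confirm that Stata can now find each command
    which crossfold
    which loocv
    which cvauroc
    ```

    If -which- reports that a command is not found, the package is not installed, which would explain an "unrecognised command" error.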



    • #3
      Hi Rich,

      I know that all are in fact user-written, and I have now proceeded to use
      Code:
      loocv
      for my out-of-sample performance estimates.

      The only question I have on the back of this is whether there is a disadvantage to using this method, given that there are no 'folds' in the out-of-sample estimates, i.e. it predicts one observation at a time rather than a larger held-out sample of, say, 100?
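
      To make the contrast concrete, here is a hand-rolled sketch of k-fold cross validation for OLS. It is not how -loocv- or -crossfold- work internally; the dataset (the bundled auto data), the model, the seed, and the variable names u, fold, yhat and sqerr are all illustrative assumptions. Conceptually, leave-one-out is the special case in which every fold holds a single observation.

      ```stata
      * Manual 5-fold cross validation for OLS on the bundled auto dataset
      sysuse auto, clear
      set seed 12345
      local k 5

      * Randomly assign each observation to one of k roughly equal folds
      generate double u = runiform()
      xtile fold = u, nquantiles(`k')

      * Holds the squared out-of-sample prediction error for each observation
      generate double sqerr = .

      forvalues j = 1/`k' {
          * Fit on every fold except j, then predict only for fold j
          quietly regress price mpg weight if fold != `j'
          capture drop yhat
          quietly predict double yhat if fold == `j', xb
          quietly replace sqerr = (price - yhat)^2 if fold == `j'
      }

      * Out-of-sample root mean squared error across all held-out predictions
      quietly summarize sqerr, meanonly
      display "k-fold RMSE: " sqrt(r(mean))
      ```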



      • #4
        Hi Caleb
        Let me suggest a command I wrote, -cv_regress-. It is faster than loocv for linear regressions. You can get it from SSC (ssc install cv_regress).
        I'm also working on another command for k-fold cross-validation for other estimation commands such as logit, probit, mprobit, etc.
        Best Regards
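
        A short usage sketch, assuming -cv_regress- is run as a postestimation command directly after -regress- with no options; the dataset and model are illustrative only, and -help cv_regress- is the place to confirm the exact syntax.

        ```stata
        * Install from SSC, as described above
        ssc install cv_regress

        * Illustrative model on the bundled auto dataset
        sysuse auto, clear
        regress price mpg weight

        * Assumption: called with no arguments right after -regress-
        cv_regress
        ```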



        • #5
          Hi Fernando,

          Thank you for the advice. I have indeed also used the -cv_regress- command for OLS.

          Any suggestions for logit? I have simply used loocv, and it seems to give robust results.

          Kind regards



          • #6
            Use this piece of code. It doesn't have all the safeguards, but it works well.
            Code:
            capture program drop cross_probit
            * Repeated k-fold cross validation for the most recent binary-outcome
            * model; reports the out-of-sample log likelihood averaged over reps
            program cross_probit, rclass
                syntax, k(int) reps(int) [seed(str)]
                ** apply the seed, if one was given, so the folds are reproducible
                if "`seed'" != "" set seed `seed'
                tempname eqreg
                ** save the current estimates so they can be restored at the end
                qui:est sto `eqreg'
                ** get what I need from the original estimation
                tempvar touse
                qui:gen byte `touse'=e(sample)
                local  cmdln=subinstr("`e(cmdline)'","`e(cmd)'","",1)
                ** reparser is an external helper that is not defined in this post
                qui:reparser `cmdln'
                local y_x `r(y_x)'
                local wgt `r(wgt)'
                local opts  `r(opts)'
                local cmd  `e(cmd)'
                ** 0/1 version of the dependent variable
                tempvar y
                qui:gen byte `y'=`e(depvar)'!=0 if `touse'
                tempname binit
                matrix `binit'=e(b)
                ** binary models use log-likelihood contributions, not residuals
                tempvar kfld resid tmpresid
                tempname msqr
                local mmsqr=0
                qui:gen double `resid'=.
                forvalues i=1/`reps' {
                    ** draw a fresh random split into k folds
                    capture drop `kfld'
                    qui:xtile `kfld'=runiform() if `touse', n(`k')
                    forvalues j=1/`k' {
                        ** fit on all folds except j, predict for fold j
                        qui:`cmd' `y_x' `wgt' if `touse' & `kfld'!=`j', `opts' from(`binit',skip)
                        qui:capture drop `tmpresid'
                        qui:predict double `tmpresid', pr
                        qui:replace `resid'=log(`tmpresid')*(`y'==1)+log(1-`tmpresid')*(`y'==0) if `touse' & `kfld'==`j'
                    }
                    ** total out-of-sample log likelihood for this repetition
                    qui:sum `resid' if `touse', meanonly
                    qui:matrix `msqr'=nullmat(`msqr')\ (r(mean)*r(N))
                    local mmsqr=`mmsqr'+(r(mean)*r(N))
                }
                ** average over repetitions
                local mmsqr = `mmsqr'/`reps'
                matrix colname `msqr'=msqr
                return local mmsqr = `mmsqr'
                return matrix msqr = `msqr'
                return local k = `k'
                return local reps = `reps'
                return local seed  `seed'
                qui:est restore `eqreg'
                display as result "k-fold Cross validation"
                display as text   "Number of Folds       : " %10.0f `k'
                display as text   "Number of Repetitions : " %10.0f `reps'
                display as text   "Avg LL                : " %10.3f `mmsqr'
            end
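
            For reference, this is the quantity the program appears intended to report as "Avg LL", written out as a formula. The notation is an assumption introduced here, not taken from the post: R is reps(), y_i is the 0/1 outcome, and \hat{p}_i^{(r)} is the predicted probability for observation i in repetition r, from the model fitted on all folds except the one containing i.

            ```latex
            \ell^{(r)} = \sum_{i} \Big[ y_i \log \hat{p}_i^{(r)} + (1 - y_i) \log\big(1 - \hat{p}_i^{(r)}\big) \Big],
            \qquad
            \overline{\ell} = \frac{1}{R} \sum_{r=1}^{R} \ell^{(r)}
            ```

            Values closer to zero indicate better out-of-sample fit.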



            • #7
              Hi Fernando,

              I really appreciate you sending the above code over. I ran your program and then the following command:

              Code:
              cross_logit NoRecoveryCL CreditSpread_n OriginalMaturity HardCallY FlexStatusU LoanTrancheSizeMM SponsorLedY IHealthY, k(10) reps(1) [seed(110)]
              But it keeps getting rejected. I am unsure as to where I have gone wrong?



              • #8
                You only need to set k() and reps().
                Everything else is taken from the original regression, so run the program right after estimating your model.
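
                A usage sketch along those lines. It assumes the cross_probit program from #6 has already been defined in the current session and that its helper -reparser-, which is not defined in this thread, is installed. The dataset, model, and option values are illustrative only.

                ```stata
                * Estimate the model first; the program reads everything from e()
                sysuse auto, clear
                probit foreign mpg weight

                * Only the number of folds and repetitions are supplied
                cross_probit, k(5) reps(2)

                * Inspect the stored results: r(mmsqr), r(msqr), r(k), r(reps)
                return list
                ```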



                • #9
                  Fernando, thank you for being so patient with me. I am still relatively new to the Forum.

                  I ran your code after my logistic regression, setting k and reps as you said, and had no difficulties. However, no results are displayed. Do I need a further command to display the results of the code you have provided?

                  Thanks again,
                  Caleb



                  • #10
                    Check lassologit, which is part of lassopack (ssc install lassopack).
                    --
                    Tag me or email me for ddml/pdslasso/lassopack/pystacked related questions. I don't check Statalist.



                    • #11
                      Thank you Achim. I am trying to find the RSE and RAE, though; is there any way to do this with your model?
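
                      For context, a sketch of how such measures could be computed by hand from predicted probabilities. It assumes RSE means relative squared error and RAE means relative absolute error, each measured against a mean-only benchmark; the post does not define them. The dataset, model, and variable names are illustrative, and the predictions here are in-sample, so held-out predictions would need to be substituted for an out-of-sample version.

                      ```stata
                      * Assumption: RSE = relative squared error, RAE = relative absolute error
                      sysuse auto, clear
                      quietly logit foreign mpg weight
                      predict double phat, pr

                      * Mean of the outcome serves as the benchmark prediction
                      quietly summarize foreign, meanonly
                      local ybar = r(mean)

                      * Per-observation errors of the model and of the benchmark
                      generate double e2_model = (foreign - phat)^2
                      generate double e2_mean  = (foreign - `ybar')^2
                      generate double e1_model = abs(foreign - phat)
                      generate double e1_mean  = abs(foreign - `ybar')

                      * Sum each error series
                      foreach v in e2_model e2_mean e1_model e1_mean {
                          quietly summarize `v', meanonly
                          local s_`v' = r(sum)
                      }

                      * Ratios below 1 mean the model beats the mean-only benchmark
                      display "RSE = " %6.4f `s_e2_model' / `s_e2_mean'
                      display "RAE = " %6.4f `s_e1_model' / `s_e1_mean'
                      ```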

