
  • Hold out sample or out of sample estimation of logit model

    Hi guys!
    I am writing my master's thesis on default prediction. I am having trouble estimating my logit model on my training sample and saving the estimates for out-of-sample testing. I basically want to test how well the model fitted on my training sample predicts default in my hold-out sample. Which approach and Stata commands do you recommend?

    Best regards
    Niels Martin

  • #2
    My model of the Stata community is that if I have a question about Stata, someone else must have asked a similar question in the past: this package is only slightly younger than I am. Like any model, it is wrong, but unlike some models, it is a useful one.

    For instance, googling "out of sample prediction Stata" returns the following thread on page 1:

    https://www.statalist.org/forums/for...-out-of-sample
    Last edited by Hong Il Yoo; 02 Apr 2020, 19:55.

    • #3
      Niels - welcome to Statalist. You will increase your chances of a useful answer by following the FAQ on asking questions: provide Stata code in code delimiters, readable Stata output, and sample data using dataex.

      As Hong pointed out, it is good practice to use the findit command from the Stata command line and to Google things, since many of these issues have been discussed extensively on Statalist and, in many cases, software already exists to handle special issues.

      As a beginner, you should read the User's Guide. (Stata also offers helpful webinars, some free and asynchronous.) Note also that if you look under the Help menu, the first item is PDF documentation. Open it and you will see all the documentation that comes with your version of Stata. At the very bottom, there is an index entry; if you expand it, you will reach the subject index. Once you open the subject index, you can search it easily as well, and it links you back to the appropriate documentation. Note also that the documentation for every estimation command is followed by a postestimation section. So, after logit, there is logit postestimation, which tells you what you can do after estimation and how. It is not comprehensive, since there are also some user-written tests and so forth, but, for example, it tells you about predict.
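
      The same documentation can also be reached from the command line. The lines below are only a sketch of where one might start; the search phrase is an arbitrary example, not something prescribed in this thread.

      ```stata
      * Search official help, FAQs, and community-contributed packages
      * (the keywords are just an example)
      findit out-of-sample prediction

      * Help file for what can be done after logit; it links to the PDF entry
      help logit postestimation

      * Help file for predict itself
      help predict
      ```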

      One of the nice things about Stata is that almost all of the procedures work exactly alike from a user standpoint. Thus, if you created a dummy variable, say training, that equals one for the training sample and zero for the rest of the data, you can simply do

      logit y x if training==1
      predict predprob if training!=1

      Note that if goes before the comma that introduces options, so there is no comma between predprob and if. != and ~= both mean "not equal to". Note also that the logical equals is == while the assignment equals is =. So one might say
      generate y=5 if training==1
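
      To make the idea above concrete, here is a minimal end-to-end sketch. Everything specific in it is an assumption made for illustration: the lbw example dataset shipped with Stata stands in for the default data, the 70/30 random split and the seed are arbitrary, and train_fit and phat are made-up names. Using roctab is just one common way to summarise out-of-sample discrimination, not the only one.

      ```stata
      * Illustrative sketch only: dataset, split, and names are assumptions
      webuse lbw, clear

      * Random 70/30 split: training = 1 for the training sample, 0 for the hold-out
      set seed 12345
      generate byte training = runiform() < 0.7

      * Fit the logit model on the training sample only
      logit low age lwt smoke if training == 1

      * Keep the fitted model in memory under a name (hypothetical name)
      estimates store train_fit

      * predict applies the stored coefficients to whichever observations you select,
      * here the hold-out sample; pr requests predicted probabilities
      predict phat if training == 0, pr

      * Area under the ROC curve, computed on the hold-out sample only
      roctab low phat if training == 0
      ```

      If the estimates need to survive across Stata sessions, estimates save writes them to a file on disk, and estimates use reads them back before calling predict.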
