Hold out sample or out of sample estimation of logit model

Niels Martin Lund

Join Date: Apr 2020

Posts: 1
#1


02 Apr 2020, 13:38

Hi guys!
I am writing my master's thesis on default prediction. I am having trouble estimating my logit model in my training sample and saving the estimates for out-of-sample testing. I basically want to test how well the model estimated on my training sample predicts default in my out-of-sample data. Which approach and Stata commands do you recommend?

Best regards
Niels Martin
Tags: logit
Hong Il Yoo

Join Date: Jan 2015

Posts: 292
#2

02 Apr 2020, 19:52

My model of the Stata community is that if I have a question about Stata, someone else must have asked a similar question in the past: this package is only slightly younger than I am. Like any model it is wrong, but unlike some models it is a useful one.

For instance, googling "out of sample prediction Stata" returns the following thread on page 1:

https://www.statalist.org/forums/for...-out-of-sample

Last edited by Hong Il Yoo; 02 Apr 2020, 19:55.
Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#3

03 Apr 2020, 12:06

Niels - welcome to Statalist. You will increase your chances of a useful answer by following the FAQ on asking questions: provide Stata code in code delimiters, readable Stata output, and sample data using dataex.

As Hong pointed out, it is good practice to use the findit command from the Stata command line and to Google things, since many of these issues have been discussed extensively on Statalist and in many cases software exists to handle special issues.

As a beginner, you should read the User's Guide. (Stata also offers helpful webinars, some free and asynchronous.) Note also that if you look under Help, the first item is PDF documentation. Open the PDF documentation and you will see all the documentation that comes with your version of Stata. At the very bottom, there is an index entry; if you expand it repeatedly you will get to the subject index. Once you open the subject index you can search it easily as well, and it links you back to the appropriate documentation. Note also that the documentation for every estimation command is followed by a postestimation section. So, after logit, there is logit postestimation. This tells you what you can do after estimation and how. It is not comprehensive, since there are also some user-written tests and so forth, but, for example, it tells you about predict.
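The same documentation can also be reached from the command line. A short sketch, where the search keywords are only an illustrative choice:

```stata
* Open the help file for what can be done after logit (predict, estat, and so on).
help logit postestimation

* Search official help, FAQs, and community-contributed packages by keyword.
* The keywords here are illustrative; substitute your own.
search logit prediction, all
```

findit does the same job as search with the all option, so either form can be used.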

One of the nice things about Stata is that almost all of the procedures work exactly alike from a user standpoint. Thus, if you created a dummy variable, say training, that equals one for the training sample and zero for the rest of the data, you can simply do

logit y x if training==1
predict predprob if training!=1

Note that the if qualifier comes before any comma; only options go after the comma. != and ~= both mean "not equal to". Note also that the logical equals is == while the assignment equals is =. So one might say
generate y=5 if training==1
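To make the fit-then-predict idea concrete, here is a minimal sketch of the whole workflow. It is not the original poster's data or model: it assumes Stata's example dataset lbw (loaded with webuse, which needs internet access) as a stand-in for default data, a hypothetical 0/1 indicator called training created by a random 70/30 split, and an arbitrary seed and 0.5 cutoff.

```stata
* Stand-in data: low is the 0/1 outcome, playing the role of "default".
webuse lbw, clear

* Hypothetical 70/30 random split; the seed value is arbitrary.
set seed 12345
generate byte training = runiform() < 0.7

* Estimate the logit on the training observations only.
logit low age lwt smoke if training == 1

* predict applies the stored coefficients to whichever observations are
* selected, so this gives predicted probabilities for the holdout only.
predict predprob if training == 0

* Classify holdout observations at a 0.5 cutoff and cross-tabulate
* actual against predicted outcomes.
generate byte predclass = predprob >= 0.5 if training == 0 & !missing(predprob)
tabulate low predclass if training == 0

* Area under the ROC curve in the holdout sample.
roctab low predprob if training == 0
```

With real data, the training indicator would more likely come from the study design (for example, an earlier time period) than from a random draw, and the cutoff would be chosen to suit the cost of misclassifying defaults.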