Tobit regression with unusual censoring

Ali Malik

Join Date: Jul 2018

Posts: 23
#1

20 Jun 2019, 08:33

Hi all,

I have a panel dataset on syndicated loans, that is, data on specific loans made to firms where each loan has multiple lenders/banks. The panel has a three-dimensional structure, with firm-bank-quarter observations. The total loan amount to a given firm is allocated to banks according to the variable "bankallocation". This variable is a percentage indicating what share of the total loan a specific bank has contributed, so it can only take values from 0 to 100. In other words, a bank's loan volume is its allocation times the total loan amount. The problem is that "bankallocation" is missing for the majority of observations in my dataset, so I cannot calculate the loan volume for each bank in a given loan syndicate. This problem has been addressed in previous research, namely by Schwert (2018) in his Journal of Finance paper. He estimates the missing values of this variable using a Tobit regression. The following quote from the paper explains his approach:

"When the bank’s allocation is missing in the data, it is estimated as the fitted value from a Tobit regression of bank allocation on log loan amount, the ratio of loan amount to lender assets, the ratio of loan amount to borrower assets, the number of lead arrangers, the number of participants, and quarter fixed effects."

I have been trying to replicate this approach using the same dataset, but I ran into some issues. Given that "bankallocation" can only take values from 0 to 100 (it is a percentage, and the percentages for a given loan should total 100), setting the lower and upper limits to 0 and 100 respectively makes sense. The added difficulty is that the "bankallocation" values should sum to 100 for each loan, or at least not exceed 100. How could I accomplish this in Stata?

In addition, how can I estimate a quarter fixed effects Tobit regression in Stata?

Any help and advice is much appreciated.

Regards, Ali
Tags: fixed effects, panel data, regression, tobit
Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#2

21 Jun 2019, 11:44

You didn't get a quick answer. You'll increase your chances of a useful answer by following the FAQ on asking questions - provide Stata code in code delimiters, readable Stata output, and sample data using dataex.

It doesn't sound like Schwert constrained the bankallocation values to total 100. However, I suppose you can create the predicted values and then rescale them to make the total 100. If you had followed the FAQ, you might have tempted one of the more adept Stata users to program this for you, since it is a nice problem. This might do what you want:

bysort loan: egen totvalues=total(value)
gen scaledvalue=value * 100/totvalues

As for a fixed effects tobit, Stata does not offer one (xttobit fits only random-effects models). If you look at the relevant texts, apparently there is no true fixed effects tobit. I've read that Greene's text suggests just using panel dummies, but I haven't checked this.
Ali Malik

Join Date: Jul 2018

Posts: 23
#3

24 Jun 2019, 06:44

Hi Phil,

Thank you for the helpful comments.

I was wondering if you could help me implement the ideas from your comment to solve my problem. In the following Google Drive link you can find my dataset:

https://drive.google.com/file/d/1tfM...ew?usp=sharing

For purposes of running a Tobit regression on my panel dataset, I tried to declare my dataset as panel data taking into account that I am dealing with a three dimensional dataset (as explained in my initial post). I used the following commands:

egen idvar = group(borrowercompanyid gv) // gv is the bank identifier //
xtset idvar

I didn't specify the time variable because Stata gives me an error message about repeated time values. An important aspect of my dataset is that the same bank can lend to the same firm more than once in the same YearQuarter. I therefore decided to specify only the panel ID, since I won't be using any lead or lag variables in my regressions. Could you clarify whether this approach is justified given the regression I want to run, as outlined in my previous post?
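
The repeated time values described above can be inspected directly. This is a minimal sketch for a do-file, assuming the quarter variable is named YearQuarter as in the text and that borrowercompanyid and gv identify the firm and the bank; it only counts duplicates and does not change the data.

```stata
* Count how often the same firm-bank-quarter combination occurs more than once
duplicates report borrowercompanyid gv YearQuarter

* Note: a pooled -tobit- ignores the xtset declaration;
* xtset matters only for xt commands and time-series operators
```

If the report shows surplus copies, the unit of observation is probably the loan (facility) rather than the firm-bank-quarter, which would be consistent with leaving the time variable out of xtset.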

Now the goal is to estimate the quarter fixed effects Tobit regression, calculate predicted values and rescale them to make sure bankallocation adds up to 100 for each loan (based on the suggestion in your comment). The dependent variable here is "bankallocation". The independent variables here are: "lnloan" (log of loan amount), "L_A" (ratio of loan amount to total assets of lender), "Nlead" (number of lead arrangers per loan), and "participants" (number of participants per loan).

You noted that it is not possible to run a fixed effects tobit regression in Stata and suggested using panel dummies. Could you maybe specify how I could code this in Stata based on my dataset? Next, I would need to estimate the predicted values and do the rescaling as suggested in your comment.
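
The steps described in this post can be sketched as follows. This is a sketch under stated assumptions, not Schwert's actual procedure: it treats the quarter fixed effects as quarter dummies, assumes YearQuarter identifies the quarter and facilityid identifies the loan, takes ystar(0,100) as the "fitted value", and names the new variables (qtr, alloc_hat, alloc_filled, alloc_total, alloc_scaled) purely for illustration.

```stata
* Sketch only: run from a do-file; new variable names are hypothetical
egen qtr = group(YearQuarter)          // numeric quarter index for i. notation

* Pooled tobit with quarter dummies and limits at 0 and 100
tobit bankallocation lnloan L_A Nlead participants i.qtr, ll(0) ul(100)

* Expected allocation, bounded to [0,100]
predict alloc_hat, ystar(0,100)

* Keep observed allocations; use the prediction only where missing
gen alloc_filled = cond(missing(bankallocation), alloc_hat, bankallocation)

* Rescale within each loan so that allocations total 100 (as suggested in #2)
bysort facilityid: egen alloc_total = total(alloc_filled)
gen alloc_scaled = alloc_filled * 100 / alloc_total
```

Note that this rescaling also changes the observed allocations whenever a loan's total differs from 100. An alternative would be to scale only the predicted values so that they fill whatever share the observed allocations leave over.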

I would greatly appreciate your help in solving my problem.

Regards, Ali
Ali Malik

Join Date: Jul 2018

Posts: 23
#4

25 Jun 2019, 03:47

ADDED: the variable "facilityid" in my dataset represents the unique ID for a given loan.

Phil Bromiley wrote: "you might have tempted one of the more adept Stata users to program this for you"

Maybe Nick Cox or Clyde Schechter could also shed some light on this matter?
Nick Cox

Join Date: Mar 2014

Posts: 35698
#5

25 Jun 2019, 03:54

I am not convinced from a quick read that this problem is even a problem where Tobit regression is appropriate. There is no censoring, just bounds that must be respected.

Either way, sorry, but I am not tempted to spend more time on this, which seemingly would mean finding and reading a paper in a field in which I don't work, and that's only the start of it.
Olive Bat

Join Date: May 2018

Posts: 76
#6

30 Jul 2019, 02:01

Hi all, I have a question about Tobit regression: is my command correct? Please check.
Olive Bat

Join Date: May 2018

Posts: 76
#7

30 Jul 2019, 02:03

tobit ln_gold_LY for_remitt_hh DRatio count_Age_0_14 count_Age_15_24 count_age_25_34 count_Age_35_44 count_Age_45_59 i.sex_hh i.married_hh i.edu_hh i.eco_activity_hh i.hhsize i.religion_hh i.land_acresnew i.ration_card_hh i.type_of_house_hh i.type_of_fuel i.own_house i.Locality i.District_Code , ll(0)
Should I use ln (the log of the dependent variable) or not? I will be running the marginal effects after this.
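
For the marginal effects mentioned above, a minimal sketch, assuming the effects of interest are those on the expected value of the outcome censored at 0 (by default, margins after tobit reports effects on the linear prediction instead). The two covariates listed are taken from the command above; which covariates to report is left open.

```stata
* Run immediately after the tobit command above
* Average marginal effects on E(y*), where y* is the outcome censored below at 0
margins, dydx(for_remitt_hh DRatio) predict(ystar(0,.))
```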
Olive Bat

Join Date: May 2018

Posts: 76
#8

30 Jul 2019, 02:04

https://www.statalist.org/forums/for...66#post1509966