Help with replicating one regression model

Francisco Brazao

Join Date: Jun 2019

Posts: 5
#1

04 Sep 2019, 17:36

Hello,
I am trying to replicate a regression from Chava and Purnanandam (2010), but the results so far are not as expected and I am not sure whether my approach is correct.

This is my code so far:

*1st: drop firm-years outside 2001-2005
drop if fyear < 2001 | fyear > 2005

*Changes in policies
gen difflev = booklev_w - booklev_w[_n-1]
gen diffcash = cashta_w - cashta_w[_n-1]
gen diffdebt = debtmat_w - debtmat_w[_n-1]
gen diffcfaccr = cfaccr_w - cfaccr_w[_n-1]

*Changes in incentives
gen diffceod = ceodelta_w - ceodelta_w[_n-1]
gen diffceov = ceovega_w - ceovega_w[_n-1]
gen diffcfod = cfodelta_w - cfodelta_w[_n-1]
gen diffcfov = cfovega_w - cfovega_w[_n-1]

xtile x_diffceod = diffceod, nquantiles(10)
xtile x_diffceov = diffceov, nquantiles(10)
xtile x_diffcfod = diffcfod, nquantiles(10)
xtile x_diffcfov = diffcfov, nquantiles(10)

*Control variables
gen difflogta = logta - logta[_n-1]
gen diffrndsales = rndsales_w - rndsales_w[_n-1]
gen diffroa = roa_w - roa_w[_n-1]

*Regression1

reg difflev x_diffcfod x_diffcfov difflogta diffrndsales diffroa , clu(gvkey)

*Regression2

reg diffcash x_diffceod x_diffceov difflogta diffrndsales diffroa , clu(gvkey)

*Regression3

reg diffdebt x_diffceod x_diffceov difflogta diffrndsales diffroa , clu(gvkey)

*Regression4

reg diffcfaccr x_diffcfod x_diffcfov difflogta diffrndsales diffroa , clu(gvkey)

I have attached the parts of the paper that explain how the regression was computed.
I did not compute the 2005 minus 2001 difference because I was not sure whether they used only the years 2001 and 2005 or all years from 2001 to 2005.

I would appreciate it if someone could take a look at my code and give me a little help.

Regards,
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#2

04 Sep 2019, 18:15

I think all of your code generating the diff* variables is wrong.

If there is more than one firm in your data, then the code you have written will cause the diffwhatever variable in the first observation for a given firm to be the difference between the first value of whatever in that firm and the last value of whatever in whatever firm precedes it in the data set. (And that assumes that the data are properly sorted in the first place--if not, you are just calculating differences between random observations!) So, to get these right, you have to first advise Stata of the proper panel structure in your data. And then, instead of referring to whatever[_n-1], which can leak into a preceding firm's data, use Stata's built-in first difference operator (see -help tsvarlist-).

Code:

xtset firm fyear   // replace firm with your panel identifier (gvkey in #1; it must be numeric)
gen difflev = D1.booklev_w
gen diffcash = D1.cashta_w
gen diffdebt = D1.debtmat_w
gen diffcfaccr = D1.cfaccr_w
// ETC.
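
To make the leak concrete, here is a minimal sketch on made-up data. The values and the names firm, x, naive, and panel are hypothetical, chosen only for illustration. The subscript version subtracts firm 1's last value from firm 2's first value, while the D1. operator leaves that observation missing.

```stata
* Toy panel: two firms, two years each (made-up numbers)
clear
input firm fyear x
1 2001 10
1 2002 12
2 2001 50
2 2002 53
end

* Subscript version: ignores firm boundaries
gen naive = x - x[_n-1]

* Panel-aware version: differences are taken within firm, by year
xtset firm fyear
gen panel = D1.x

* Observation 3 is firm 2's first year: naive is 50 - 12, panel is missing
list, sepby(firm)
```

D1. also returns missing when the preceding year is absent for a firm, whereas -bysort firm (fyear): gen d = x - x[_n-1]- stays within the firm but differences across the gap.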
Francisco Brazao

Join Date: Jun 2019

Posts: 5
#3

05 Sep 2019, 05:20

Thank you very much, it makes sense.
One thing that I am still not getting is whether their sample uses all observations from 2001 to 2005 or only 2001 and 2005. Do you have an idea? They do mention that for the dependent variables the change is calculated using the values from fiscal year 2005 minus 2001, but for the explanatory variables they mention "change in corresponding variable from fiscal year 2001 to 2005."

Besides that, is the code to rank the incentive variables from 1 to 10 based on their cross-sectional distribution correct?

Regards,
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#4

05 Sep 2019, 08:24

One thing that I am still not getting is whether their sample uses all observations from 2001 to 2005 or only 2001 and 2005. Do you have an idea? They do mention that for the dependent variables the change is calculated using the values from fiscal year 2005 minus 2001, but for the explanatory variables they mention "change in corresponding variable from fiscal year 2001 to 2005."

Had you not raised the question, I would have read the excerpt you posted in #1 as saying that they used all years from 2001 through 2005. But, rereading it in light of your question, I agree that the language is ambiguous and could also describe a study based only on data from 2001 and 2005. I don't know what they did, nor is this a field where I know enough to conclude that only one or the other approach makes sense. I think you will need to contact the authors of the study if the matter is not stated more clearly somewhere else in the article.
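
For what it is worth, both readings are easy to code once the data are -xtset-. The sketch below uses made-up numbers and the placeholder identifier firm; d_annual and d_long are hypothetical names. S4. is the lag-4 difference operator documented in -help tsvarlist-, so S4.booklev_w evaluated in 2005 is the 2005 value minus the 2001 value. Which of the two the authors used remains an assumption either way.

```stata
* Toy panel (made-up numbers); firm 2 has no 2002 or 2004 observation
clear
input firm fyear booklev_w
1 2001 0.20
1 2002 0.22
1 2003 0.25
1 2004 0.24
1 2005 0.30
2 2001 0.40
2 2003 0.42
2 2005 0.35
end
xtset firm fyear

* Reading 1: year-over-year changes, using every year from 2001 to 2005
gen d_annual = D1.booklev_w

* Reading 2: one long difference per firm, 2005 minus 2001
gen d_long = S4.booklev_w if fyear == 2005

list, sepby(firm)
```

Under reading 2 each firm contributes at most one observation, so the regressions would be run on the 2005 rows only.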

Besides that, is the code to rank the incentive variables from 1 to 10 based on their cross-sectional distribution correct?

Yes, it appears correct to me.
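
One caveat, offered as an assumption about what "cross-sectional" means rather than as a statement of what the paper did: -xtile- as written ranks over all observations in memory. If the sample ends up with one change per firm (2005 minus 2001), that is a single cross-section and nothing more is needed. If it contains yearly changes, deciles within each year need a loop, because official -xtile- does not accept the by: prefix. A sketch on made-up data (x, x_pooled, and x_byyear are hypothetical names):

```stata
* Toy data: 20 observations in each of two years; 2003 values are shifted up
clear
set obs 40
gen fyear = cond(_n <= 20, 2002, 2003)
gen x = cond(fyear == 2002, _n, 1000 + _n)

* Pooled deciles: every 2002 observation lands in the bottom half
xtile x_pooled = x, nquantiles(10)

* Within-year deciles: rank separately inside each fiscal year
gen byte x_byyear = .
levelsof fyear, local(years)
foreach y of local years {
    tempvar q
    xtile `q' = x if fyear == `y', nquantiles(10)
    replace x_byyear = `q' if fyear == `y'
    drop `q'
}

tabulate x_pooled fyear
tabulate x_byyear fyear
```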