Estimating discretionary accruals using the modified Jones (1991)

Clyde Schechter

Join Date: Apr 2014

Posts: 30119
#31

02 Sep 2018, 11:38

I think there is a typographical problem that has crept into your code somehow. When I create a suitable data set and directly type -gen x2= (d.revt - d.rect)/L.at- at the command window, it runs without error messages. When I then -drop x2-, and copy/paste your code (which looks the same to the eye) into the command window and run it, I get the same error message you do.

Dissecting your code a bit by creating a data set in which that one command is the actual data in an observation, the -charlist- command shows me that it contains 3 non-printing characters in addition to the ones we can see with our eyes. So I think all you need to do is delete that line of code from your do-file and then type it in directly from the keyboard and you'll be fine. If that's not the case, please post back and include an example of your data using the -dataex- command.

If the code was at any point passed through a word processing program or some other program (or maybe even the Statalist Forum editor), it can get "contaminated" with "control characters" used by that program. These characters are not recognized by Stata and can completely confuse the parser, leading to error messages, and sometimes error messages that are misleading.
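
A minimal sketch of one way to check a suspect line for such characters without -charlist-, assuming Stata 14 or later (where strings are stored as UTF-8). It assumes the stray character is an en dash (U+2013) standing in for the minus sign; the variable names cmd, has_nonascii, and fixed are made up for illustration.

```stata
* Sketch (Stata 14+): detect and repair a pasted non-ASCII "minus" sign
clear
set obs 1
* Build the suspect command; uchar(8211) is the en dash (U+2013), assumed here
gen str60 cmd = "gen x2= (d.revt " + uchar(8211) + " d.rect)/L.at"
* In UTF-8 every non-ASCII character takes more than one byte, so the
* byte length exceeds the character length when one is present
gen byte has_nonascii = strlen(cmd) != ustrlen(cmd)
* Replace the en dash with a plain ASCII hyphen-minus
gen str60 fixed = subinstr(cmd, uchar(8211), "-", .)
list cmd has_nonascii fixed
```

Retyping the line by hand, as suggested above, achieves the same thing; the check is only useful for finding which lines of a longer do-file are affected.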
Anas Farah

Join Date: Oct 2017

Posts: 1
#32

16 Oct 2018, 05:28

Originally posted by Mahmud Hossain

Hi all:
I'm new to Stata. I was trying to estimate discretionary accruals using a user-posted command and got an error message.
The full message is posted below.
Any help would be greatly appreciated!
Regards,

Mahmud

gen sic_2= substr(sic,1,2)

. destring sic_2, replace
sic_2: all characters numeric; replaced as byte

.
. egen combo= group(sic_2 fyear)
(575 missing values generated)

gen uhat=.
(254,697 missing values generated)

.
end of do-file

. do "C:\Users\mhossain\AppData\Local\Temp\STD784_00000 0.tmp"

. xtset gvkey fyear
panel variable: gvkey (unbalanced)
time variable: fyear, 1995 to 2018, but with gaps
delta: 1 unit

.
. gen obs= [_n]

. summ obs

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
         obs |    254,697      127349    73524.84          1     254697

. scalar e= r(min)

. scalar f= r(max)

.
. gen ta= (ib-oancf)/L.at
(81,677 missing values generated)

. gen x1= 1/L.at
(65,104 missing values generated)

. gen x2= (d.revt – d.rect)/L.at
d: operator invalid
r(198);

end of do-file

r(198);

I think Clyde's suspicion is correct. I assume you copied the code from the following page: https://robsonglasscock.wordpress.co...cruals-update/

The author mentioned right under -gen x2= (d.revt – d.rect)/L.at- in the original post above that there was some issue when the code is copied from the webpage into a do-file. The fix recommended by the author is below.

gen x2= (d.revt – d.rect)/L.at
/* The minus sign above pasted in an odd manner in the original blog post
triggering one error. This was simply a cut and paste issue. To fix, manually
change the cut-and-pasted minus sign to a new, manually entered minus sign. */
Celine Tran

Join Date: Oct 2018

Posts: 46
#33

02 Dec 2018, 20:02

Dear all,

First, thank you very much for your code to estimate discretionary accruals. However, my supervisor said that the regression model for each industry needs to have a significant F statistic, or else that industry needs to be discarded. Could you please help me write the code for this condition?

Thank you very much in advance.

Celine.

David Benson

Join Date: Oct 2018
Posts: 489

#34

03 Dec 2018, 00:15

Hi Celine,

I'm not sure, but it could mean that he/she wants you to include the sic2 industry indicators, and then remove those that aren't statistically significant.
I had a Stata dataset of Compustat firms in IT industries (2-digit SIC of 35, 36, 38, 48, 73) for 1996-2005 handy.

If you actually have to do an F-test (calc an F-statistic) testing whether all of the industry indicators are jointly==0, then see:

Code:

gen sic2 = int(dnum / 100)

. tabulate year sic2

    fiscal |
      year |             2-Digit SIC (int(dnum / 100))
  (=yeara) |        35         36         38         48         73 |     Total
-----------+-------------------------------------------------------+----------
      1996 |       219        378        177        203        640 |     1,617 
      1997 |       227        394        188        212        688 |     1,709 
      1998 |       211        366        172        208        705 |     1,662 
      1999 |       208        374        167        230        873 |     1,852 
      2000 |       199        419        168        241        930 |     1,957 
      2001 |       189        397        160        225        820 |     1,791 
      2002 |       184        378        154        193        753 |     1,662 
      2003 |       171        358        155        180        663 |     1,527 
      2004 |       154        364        148        165        590 |     1,421 
      2005 |       146        365        148        160        547 |     1,366 
-----------+-------------------------------------------------------+----------
     Total |     1,908      3,793      1,637      2,017      7,209 |    16,564

Code:

* Some bogus regressions using firm's end-of-year market capitalization (in $1996 constant dollars) on some variables I had handy

. reg marketcap_end96 data12 begin_asset96 acq_cash_assets i.yeara i.sic2
* omitted year==1996;  omitted sic2 = 35 (Computer Hardware)
* data12 was sales under old Compustat variable names


      Source |       SS           df       MS      Number of obs   =    13,610
-------------+----------------------------------   F(16, 13593)    =    678.18
       Model |  1.4590e+12        16  9.1188e+10   Prob > F        =    0.0000
    Residual |  1.8277e+12    13,593   134459720   R-squared       =    0.4439
-------------+----------------------------------   Adj R-squared   =    0.4433
       Total |  3.2867e+12    13,609   241510406   Root MSE        =     11596

---------------------------------------------------------------------------------
marketcap_end96 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
         data12 |    1.38661   .0315757    43.91   0.000     1.324717    1.448503
  begin_asset96 |   .1605242   .0193756     8.28   0.000     .1225454    .1985031
acq_cash_assets |   91.53494    49.6281     1.84   0.065    -5.743008    188.8129
                |
          yeara |
          1997  |   343.8581   447.2731     0.77   0.442    -532.8592    1220.575
          1998  |   1102.299   446.9197     2.47   0.014     226.2741    1978.323
          1999  |   3523.423     453.76     7.76   0.000     2633.991    4412.856
          2000  |   1890.619   443.6988     4.26   0.000     1020.908     2760.33
          2001  |   436.2058   437.8795     1.00   0.319    -422.0987     1294.51
          2002  |  -454.7354   446.9616    -1.02   0.309    -1330.842    421.3712
          2003  |   95.90589   456.6593     0.21   0.834    -799.2096    991.0213
          2004  |     3.8964   460.7752     0.01   0.993    -899.2869    907.0797
          2005  |   289.9874   483.7544     0.60   0.549    -658.2382    1238.213
                |
           sic2 |
            36  |   1188.402   350.7109     3.39   0.001     500.9604    1875.844
            38  |   37.11065   416.2306     0.09   0.929    -778.7589    852.9802
            48  |   1540.933   426.3282     3.61   0.000     705.2712    2376.596
            73  |   1036.231   326.0885     3.18   0.001     397.0522    1675.409
                |
          _cons |  -835.9087   417.3979    -2.00   0.045    -1654.066   -17.75097
---------------------------------------------------------------------------------

If the above regression were your result, presumably your advisor would want you to omit the control for sic2==38.
(NOTE: All of the industry indicator variables lose their significance when I cluster by firm.)

Code:

. reg marketcap_end96 data12 begin_asset96 acq_cash_assets i.yeara i.sic2, vce(cluster gvkey)

Linear regression                               Number of obs     =     13,610
                                                F(16, 2591)       =      29.29
                                                Prob > F          =     0.0000
                                                R-squared         =     0.4439
                                                Root MSE          =      11596

                                 (Std. Err. adjusted for 2,592 clusters in gvkey)
---------------------------------------------------------------------------------
                |               Robust
marketcap_end96 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
         data12 |    1.38661   .2775466     5.00   0.000     .8423744    1.930846
  begin_asset96 |   .1605242   .1334535     1.20   0.229    -.1011621    .4222106
acq_cash_assets |   91.53494    51.6728     1.77   0.077    -9.789227    192.8591
                |
          yeara |
          1997  |   343.8581   71.83448     4.79   0.000     202.9993    484.7169
          1998  |   1102.299   193.6093     5.69   0.000     722.6541    1481.943
          1999  |   3523.423   486.7877     7.24   0.000     2568.891    4477.955
          2000  |   1890.619   376.4225     5.02   0.000       1152.5    2628.739
          2001  |   436.2058   180.2204     2.42   0.016     82.81518    789.5965
          2002  |  -454.7354   133.4354    -3.41   0.001    -716.3862   -193.0847
          2003  |   95.90589   130.7137     0.73   0.463    -160.4079    352.2197
          2004  |     3.8964   165.8468     0.02   0.981    -321.3093     329.102
          2005  |   289.9874   251.0689     1.16   0.248    -202.3287    782.3034
                |
           sic2 |
            36  |   1188.402   944.6753     1.26   0.209    -663.9926    3040.797
            38  |   37.11065   793.7182     0.05   0.963    -1519.275    1593.497
            48  |   1540.933   1212.884     1.27   0.204    -837.3858    3919.253
            73  |   1036.231   952.5825     1.09   0.277    -831.6692    2904.131
                |
          _cons |  -835.9087   738.2989    -1.13   0.258    -2283.624    611.8069
---------------------------------------------------------------------------------
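
For the joint F-test of the industry indicators mentioned earlier in this post, a minimal sketch using -testparm- after the regression. It runs on Stata's bundled auto data, with rep78 standing in for sic2 (an assumption made only so the example is self-contained); the same pattern would apply after the regressions above with i.sic2.

```stata
* Sketch on Stata's bundled auto data; rep78 stands in for the sic2 indicators
sysuse auto, clear
regress price weight mpg i.rep78
* Joint F-test that all rep78 indicator coefficients are zero
testparm i.rep78
display "F = " %6.2f r(F) ",  p = " %6.4f r(p)
```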


Attaullah Shah

Join Date: Aug 2014

Posts: 1669
#35

03 Dec 2018, 03:46

David Benson, you are proposing dummy variables for each industry. In fact, the modified Jones model uses cross-sectional industry regressions, so the F-test here refers to the overall F-test of each industry-specific regression model, not to the industry-specific coefficients in a pooled regression.
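
A minimal sketch of that reading of the requirement: run one regression per group, compute the p-value of the overall model F-test from the stored results, and flag only the groups where it is significant. It uses Stata's bundled auto data with foreign as a stand-in for the industry(-year) identifier; the 5% cutoff and the variable name keep_group are assumptions for illustration, not part of the original code.

```stata
* Sketch: flag groups whose regression has a significant overall F-test
sysuse auto, clear
gen byte keep_group = 0
levelsof foreign, local(groups)
foreach g of local groups {
    quietly regress price weight mpg if foreign == `g'
    * p-value of the overall model F-test, from the stored results
    local p = Ftail(e(df_m), e(df_r), e(F))
    display "group `g': F = " %6.2f e(F) ", p = " %6.4f `p'
    * A missing p-value fails this comparison, so such groups stay unflagged
    if `p' < 0.05 {
        replace keep_group = 1 if foreign == `g'
    }
}
tabulate foreign keep_group
```

Observations with keep_group == 0 could then be dropped, or their residuals set to missing, depending on what the supervisor intends.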

Regards
--------------------------------------------------
Attaullah Shah, PhD.
Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
FinTechProfessor.com
https://asdocx.com
Check out my asdoc program, which sends outputs to MS Word.
For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.

Celine Tran

Join Date: Oct 2018
Posts: 46

#36

06 Dec 2018, 17:33

David Benson, Attaullah Shah, Robson Glasscock:

Thank you very much for your support. However, perhaps my question was not clear enough, so your answers do not quite address it.

I have already used Robson Glasscock's code to estimate discretionary accruals and obtained results. I also checked one regression by hand, and the result is as follows:

Code:

preserve

. reg ta x1-x4 if Indus=="Basic Materials" & year==2008 & obs!=627, nocons

      Source |       SS       df       MS              Number of obs =       7
-------------+------------------------------           F(  4,     3) =    4.58
       Model |  .068752185     4  .017188046           Prob > F      =  0.1208
    Residual |  .011253915     3  .003751305           R-squared     =  0.8593
-------------+------------------------------           Adj R-squared =  0.6718
       Total |  .080006101     7  .011429443           Root MSE      =  .06125

------------------------------------------------------------------------------
          ta |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   -1814563    1595033    -1.14   0.338     -6890670     3261543
          x2 |  -.2785775   .0786377    -3.54   0.038    -.5288377   -.0283172
          x3 |   .0966852   .0601268     1.61   0.206    -.0946651    .2880354
          x4 |   5.61e+07   1.93e+07     2.90   0.063     -5486172    1.18e+08
------------------------------------------------------------------------------

. predict u_hat,resid
(100 missing values generated)

. list u_hat if obs==627

     +-----------+
     |     u_hat |
     |-----------|
627. | -.2343639 |
     +-----------+

.
end of do-file

Here is my supervisor's comment: "The models for each industry need to have a significant F statistic or else you need to discard that industry. DA need to be worked out on all companies except those in your firm list".

Based on her comment, should I keep the result above, given that the F-test is not significant (Prob > F = 0.1208)?

Thank you very much in advance.

Regards,
Celine.

Last edited by Celine Tran; 06 Dec 2018, 17:40.


Emmanuel Sequeira

Join Date: Apr 2019
Posts: 4

#37

09 Apr 2019, 00:13

Hi everyone,

I followed all your recommendations and tested the code from Robson Glasscock's website, together with Clyde's and Nick's updates, against Table 4 of Roychowdhury (2006). My coefficient for the Suspect_NI variable (defined as 1 if net income/assets is greater than 0 but less than 0.005) seems to have the wrong sign. I downloaded all the data from Compustat as in the paper (annual data between 1987 and 2001), but since the code still took forever to run, I used only data from 1991 to 2001. Could anyone tell me what I might have done wrong in combining the different resources that would produce coefficients of the opposite sign? My code and the tables I obtained are below.

Code:

clear all
cls
set maxvar 32767

use "C:\Users\ejsequeira\Desktop\Test_Roychow\roych_data.dta"

* Two-digit SIC code
gen sic_2 = substr(sic, 1, 2)
destring sic_2, replace

*gen date = year(datadate)
*format date %ty

* Drop duplicate firm-years, then years before 1991
destring gvkey, replace
sort gvkey fyear
gen dup = 1 if gvkey == gvkey[_n-1] & fyear == fyear[_n-1]
sum dup
drop if dup == 1
drop dup
drop if fyear < 1991

egen combo = group(sic_2 fyear)
levelsof combo, local(a)

xtset gvkey fyear
gen u_hat_ram1 = .
gen obs = [_n]
summ obs

*gen runn=1

* Abnormal CFO: for each observation j, estimate on the other firms in the
* same industry-year and take the residual for j
gen oancf_ram = oancf/L1.at
gen l_at = L1.at
gen ram1 = 1/L1.at
gen ram2 = sale/L1.at
gen ram3 = S1.sale/L1.at

*compress
forvalues j = 1/`=_N' {
    capture noisily {
        reg oancf_ram ram1-ram3 if sic_2 == sic_2[`j'] & fyear == fyear[`j'] & _n != `j', nocons
        if e(N) >= 10 {
            predict uhat_2 in `j', resid
            replace u_hat_ram1 = uhat_2 in `j'
            drop uhat_2
        }
    }
}

summ u_hat_ram1
rename u_hat_ram1 ab_cfo
summ ab_cfo

* Abnormal production costs
gen prod = (cogs + S1.invt)/L1.at
*gen l_at=L1.at
gen ram4 = 1/L1.at
gen ram5 = sale/L1.at
gen ram6 = S1.sale/L1.at
gen ram7 = S1.L1.sale/L1.at
gen uhat_prod = .

forvalues j = 1/`=_N' {
    capture noisily {
        reg prod ram4-ram7 if sic_2 == sic_2[`j'] & fyear == fyear[`j'] & _n != `j', nocons
        if e(N) >= 10 {
            predict uhat_2 in `j', resid
            replace uhat_prod = uhat_2 in `j'
            drop uhat_2
        }
    }
}
rename uhat_prod ab_prod
summ ab_prod

drop ram4 ram5 ram6 ram7 l_at

* Abnormal discretionary expenses (xsga alone when xrd is missing)
gen disexp = (xrd + xsga)/L1.at
count if obs != . & disexp == .
replace disexp = xsga/L1.at if xrd == .

gen l_at = L1.at
gen ram8 = 1/l_at
gen ram9 = L1.sale/L1.at
gen uhat_disexp = .

forvalues j = 1/`=_N' {
    capture noisily {
        reg disexp ram8 ram9 if sic_2 == sic_2[`j'] & fyear == fyear[`j'] & _n != `j', nocons
        if e(N) >= 10 {
            predict uhat_2 in `j', resid
            replace uhat_disexp = uhat_2 in `j'
            drop uhat_2
        }
    }
}

rename uhat_disexp ab_dis_exp
summ ab_dis_exp

gen ram = ab_prod - ab_cfo - ab_dis_exp
summ ram

* Controls and the suspect firm-year indicator
gen size = log(csho*prcc_f)
gen m2b = (csho*prcc_f)/seq
*drop suspect_ni
gen suspect_ni = 0
replace suspect_ni = 1 if (ni/at > 0 & ni/at < 0.005)

gen ni_reg = ni/at

reg ab_cfo size m2b ni_reg suspect_ni

reg ab_dis_exp size m2b ni_reg suspect_ni

reg ab_prod size m2b ni_reg suspect_ni

The output I obtained is as follows:

[Attached images: AB_CFO.png, AB_PROD.png, DIS_EXP.png]

Thank you in advance for any help you may provide me with!


Robson Glasscock

Join Date: Apr 2014

Posts: 25
#38

09 Apr 2019, 13:58

Emmanuel,
It looks like you have many more observations (firm-years) than Roychowdhury. His Table 1 on page 347 shows 21,758 firm-years, compared to your minimum of 62,476 firm-years in the regressions above. You also say that you eliminated four years of the data (1987-1990), so I'm not sure why you have so many more observations than he does.

The second paragraph of page 344 talks about his process for eliminating firms from his sample. Did you follow these same procedures? He eliminates certain two-digit SIC codes, and it looks like he also requires 15 firm-years where your code looks like it requires 10.

Before I estimate models when replicating a paper, I try to see if the number of observations I have and the summary statistics for the variables line up with what the authors published. If I were you, I would see if I could get closer to Roychowdhury's number of observations in his sample and then compare the descriptive statistics you have to his Table 1.
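
A minimal sketch of that kind of check, assuming the goal is to require at least 15 observations per industry-year and then compare the sample size and descriptives with the published table. The toy data, the variable name n_iy, and the values are hypothetical stand-ins for a Compustat extract.

```stata
* Toy data standing in for a Compustat extract (hypothetical values)
clear
set seed 12345
set obs 200
gen sic_2 = cond(_n <= 190, 35, 36)    // 190 firms in SIC 35, 10 in SIC 36
gen fyear = 2000
gen ab_cfo = rnormal()

* Number of observations in each industry-year
bysort sic_2 fyear: gen n_iy = _N

* Keep only industry-years with at least 15 observations
drop if n_iy < 15

* Compare the sample size and descriptives with the published table
count
tabstat ab_cfo, statistics(n mean p50 sd) columns(statistics)
```

Any industry exclusions described in the paper would be applied with additional -drop if- conditions before counting.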
Emmanuel Sequeira

Join Date: Apr 2019

Posts: 4
#39

11 Apr 2019, 23:37

Robson,

Thank you for your reply! After eliminating as many observations as I could, I got as close to 21,758 as possible. The signs and the magnitudes of the three important variables did appear to be close to those in Roychowdhury (2006), so at least it appears I am on the right track. I really appreciate your help.

Thanks once again!

Last edited by Emmanuel Sequeira; 11 Apr 2019, 23:54.
Robson Glasscock

Join Date: Apr 2014

Posts: 25
#40

12 Apr 2019, 07:48

Happy to help, and glad it worked. Good luck with the rest of your project.
dyan sukartha

Join Date: Sep 2019

Posts: 1
#41

23 Sep 2019, 06:28

Originally posted by Ali Ahmed

Now I am able to solve it. Thanks a lot!

Hi all,

I am trying to calculate discretionary accruals just as Ali did, and I get an invalid syntax error, r(198). How can I solve this? The following is the syntax that I use.

egen CompanyName_n = group( CompanyName)
. egen GICS_n = group( GICS)
. vallist GICS_n
. local a =r(list)
. vallist Year
. local b =r(list)
. gen uhat=.
. xtset CompanyName_n Year
. gen obs= [_n]
. summ obs
. scalar e= r(min)
. scalar f= r(max)
. gen ta=(NIBE-CFO)/LTA
. gen x1=1/LTA
. gen x2=(Δrevenue-Δreceivable)/LTA
. gen x3=PPE/LTA
. foreach i in `a’ {
2. foreach x in `b’ {
3. forvalues j= `=scalar(e)’/`=scalar(f)’ {
4. capture noisily reg ta x1 x2 x3 if GICS_n ==`i’ & Year ==`x’ & obs != `j’,nocons
6. capture noisily predict GICS_n, resid
7. capture noisily replace GICS_n =. if e(N) < 8
8. capture noisily replace uhat= uhat_2 if GICS_n ==`i' & Year ==`x' & obs== `j'
9. capture noisily drop GICS_n
10. di `i'
11. di `x'
12. di `j'
13. }
14. }
15. }
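
Two things in the code as pasted may be worth checking, though neither is confirmed as the cause. Several macro references end in a typographic right quotation mark (’) where Stata requires a plain ASCII apostrophe ('), which by itself can produce r(198). Also, -predict- writes to GICS_n, which already exists, while the -replace- line reads from uhat_2. A minimal sketch of the leave-one-out pattern used elsewhere in this thread, on toy data with hypothetical values, and with a single loop over observations in place of the three nested loops:

```stata
* Toy data standing in for the poster's panel (all values hypothetical)
clear
set seed 2019
set obs 40
gen GICS_n = cond(_n <= 20, 1, 2)
gen Year = 2015
gen x1 = runiform()
gen x2 = rnormal()
gen x3 = runiform()
gen ta = 0.1*x1 + 0.5*x2 - 0.2*x3 + rnormal(0, 0.1)
gen obs = _n
gen uhat = .

* Macro references need a left quote (`) and a plain ASCII apostrophe (')
forvalues j = 1/`=_N' {
    * Estimate on the other firms in the same industry-year, leaving out obs j
    capture regress ta x1 x2 x3 if GICS_n == GICS_n[`j'] & Year == Year[`j'] & obs != `j', nocons
    if _rc == 0 & e(N) >= 8 {
        * The residual for obs j goes into a temporary variable, not GICS_n
        predict double uhat_2 in `j', resid
        replace uhat = uhat_2 in `j'
        drop uhat_2
    }
}
summarize uhat
```

The minimum of 8 observations per regression is taken from the posted code; the variables behind Δrevenue and Δreceivable would need to exist in the data under names Stata accepts.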
Sara Khaled

Join Date: Aug 2022

Posts: 1
#42

16 Aug 2022, 17:30

Originally posted by Emmanuel Sequeira

Robson,

Thank you for your reply! After eliminating as many observations as I could, I got as close to 21,758 as possible. The signs and the magnitudes of the three important variables did appear to be close to those in Roychowdhury (2006), so at least it appears I am on the right track. I really appreciate your help.

Thanks once again!

Which observations did you eliminate? I now have the same problem.