estimating discretionary accruals using modified jones model by company and year (syntax error)

Ashley Choi

Join Date: May 2018

Posts: 8
#1


25 May 2018, 01:01

Hello,
I have tried to estimate total accruals and discretionary accruals with the following Stata code,
but I only get the error "invalid syntax r(198);".
Could someone please help me?

-----using code as below---

sort symbol fyr
by symbol: gen total_acc=(ni-ocf)/ta[_n-1]
by symbol: gen inv_ta=1/ta[_n-1]
by symbol: gen cng_sales=sales-sales[_n-1]
by symbol: gen cng_ar=ar-ar[_n-1]
by symbol: gen rev_ar=(cng_sales-cng_ar)/ta[_n-1]
by symbol: gen s_ppe=ppe/ta[_n-1]

egen group=group(symbol)
destring group, replace
vallist group
local a =r(list)
vallist fyr
local b =r(list)
gen uhat=.

xtset group fyr
gen obs= [_n]
summ obs
scalar e= r(min)
scalar f= r(max)
foreach i in `a’ {
foreach x in `b’ {
forvalues j= `=scalar(e)’/`=scalar(f)’ {
capture noisily reg total_acc inv_ta rev_ar s_ppe if group==`i’ & fyr==`x’ & obs != `j’, nocons
capture noisily predict uhat_2, resid
capture noisily replace uhat_2=. if e(N) < 10
capture noisily replace uhat= uhat_2 if group==`i' & fyr==`x' & obs== `j'
capture noisily drop uhat_2
di `i'
di `x'
di `j'
}
}
}
Ashley Choi

Join Date: May 2018

Posts: 8
#2

25 May 2018, 01:05

result as attached picture
Amin Sofla

Join Date: May 2018

Posts: 67
#3

25 May 2018, 02:00

Duplicate post. Comments posted elsewhere.
Please also see here.

Last edited by Amin Sofla; 25 May 2018, 02:06.
Ashley Choi

Join Date: May 2018

Posts: 8
#4

25 May 2018, 02:25

Yes, that post is similar to my question.
I have checked it and read all the answers, but I cannot find the reason for my "invalid syntax" error.
Could you kindly give me your advice?
Clyde Schechter

Join Date: Apr 2014

Posts: 30116
#5

25 May 2018, 11:59

Well, there is one thing that jumps out at me. Several of your references to local macros appropriately open with a slanting quote (`) on the left, but incorrectly close with a slanting quote in the opposite direction on the right. (I don't know how to reproduce that character here, and I can't copy it from your post because you showed a screenshot. The FAQ specifically says not to do that for this reason, among others.) Anyway, the close quote of a local macro reference must be the vertical quote character (').

I suggest you change those to vertical quotes and try the code again. If you still get syntax errors, I suggest you post some example data along with the changed code. That way somebody can work with the code and try to figure out a solution.
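A minimal sketch of the quoting rule described above. The macro name `groups` and its contents are made up for illustration; only the quote characters matter.

```stata
* A local macro reference opens with the left quote (`) and closes with the
* plain vertical apostrophe ('). A curly closing quote copied from a word
* processor or web page is not recognized as the closing delimiter.
local groups 1 2 3

foreach i of local groups {
    display "group `i'"
}

* Count the words stored in the macro, again using `...' to expand it
local n : word count `groups'
display `n'
```

If the code was pasted from a document that "smart-quotes" apostrophes, retyping the closing quotes in the Do-file Editor is usually the quickest fix.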

If you do post back with example data, please use the -dataex- command to do so. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

When asking for help with code, always show example data. When showing example data, always use -dataex-.
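As a small illustration of the advice above, here is a sketch using the auto dataset that ships with Stata. The choice of variables and of the first five observations is arbitrary; with your own data you would list the variables your code actually uses.

```stata
* Load a dataset shipped with Stata (used here only as a stand-in)
sysuse auto, clear

* Print a copy-and-paste-ready example of three variables, first 5 observations
dataex make price mpg in 1/5
```

The output between the code delimiters that -dataex- prints can be pasted straight into a forum post.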
Ashley Choi

Join Date: May 2018

Posts: 8
#6

25 May 2018, 21:44

Dear Clyde,
Thank you for your help!
I finally solved the problem with your advice.
I also found that my data were wrong; the code works well with other data.
Again, thank you very much for your kind explanation!
Andre Sek Gallina

Join Date: May 2018

Posts: 6
#7

30 May 2018, 12:44

Hello guys
I'm having a problem running the Modified Jones model by year and industry. Is there any command in Stata that runs regressions by year and by industry automatically?

I have a panel with 6801 observations, from 1999 to 2016. Using 2-digit SIC codes, I have 52 industries.

I created a do-file as below:

Code:

gen Jones_Modified = .
forval y = 1999(1) 2016 {
    forval i = 1(1) 52 {
        display `i'
        display `y'
        reg TA B1 B2 B3 if `i' == industry & `y' == year, noconstant
        predict r if `i' == industry & `y' == year, resid
        replace Jones_Modified = r if `i' == industry & `y' == year
        drop r
    }
}

But when I try to run the regression, Stata returns the following:

Code:

1999
no observations
r(2000);

Can anybody help me? I do not know what to do anymore. Thank you! Best regards. André

Clyde Schechter

Join Date: Apr 2014
Posts: 30116

30 May 2018, 12:52

This comes up frequently in these batch regressions. And it is apparently common practice to simply skip over the groups that do not have enough observations to proceed.

The following code will check the regression results. If the regression was successful, it will calculate the residuals. If Stata found no observations or insufficient observations for this pair of industry and year it will print out a message to that effect but then proceed to the next group. If some other problem arose during the regression, then Stata will print an error message and will halt execution.

Code:

gen Jones_Modified = . 
forval y = 1999(1) 2016 { 
    forval i = 1(1) 52 { 
        display `i' 
        display `y' 
        capture reg TA B1 B2 B3 if `i' == industry & `y' == year, noconstant
        if c(rc) == 0 {    // SUCCESSFUL REGRESSION
            predict r if `i' == industry & `y' == year, resid 
            replace Jones_Modified = r if `i' == industry & `y' == year 
            drop r
        }
        else if !inlist(c(rc), 2000, 2001) { // UNANTICIPATED ERROR
            display as error "Unexpected error encountered"
            exit c(rc)
        }
        else {    // NO OR INSUFFICIENT OBSERVATIONS; NOTIFY & PROCEED
            display "No or insufficient observations"
        }
    } 
}


Andre Sek Gallina

Join Date: May 2018
Posts: 6

30 May 2018, 13:11

Hi Clyde

Thanks for your help.

I ran the code in Stata and it returned the following messages for all the years and industries:

Code:

   
  5.         capture reg TA B1 B2 B3 if `i' == industry & `y' == year, noconstant
  6.         if c(rc) == 0 {    // SUCCESSFUL REGRESSION
  7.             predict r if `i' == industry & `y' == year, resid
  8.             replace Jones_Modified56 = r if `i' == industry & `y' == year
  9.             drop r
 10.         }
 11.         else if !inlist(c(rc), 2000, 2001) { // UNANTICIPATED ERROR
 12.             display as error "Unexpected error encountered"
 13.             exit c(rc)
 14.         }
 15.         else {    // NO OR INSUFFICIENT OBSERVATIONS; NOTIFY & PROCEED
 16.             display "No or insufficient observations"
 17.         }
 18.     }
 19. }
1 1999 No or insufficient observations
2 1999 No or insufficient observations
3 1999 No or insufficient observations
4 1999 No or insufficient observations
5 1999 No or insufficient observations
6 1999 No or insufficient observations
7 1999 No or insufficient observations
...
15 2003 (6791 missing values generated) (10 real changes made)
16 2003 No or insufficient observations
17 2003 No or insufficient observations
18 2003 (6799 missing values generated) (2 real changes made)
...
30 2005 (6798 missing values generated) (3 real changes made)
31 2005 (6799 missing values generated) (2 real changes made)
32 2005 (6784 missing values generated) (17 real changes made)
33 2005 (6772 missing values generated) (29 real changes made)
34 2005 No or insufficient observations
...
42 2008 (6795 missing values generated) (6 real changes made)
43 2008 No or insufficient observations
44 2008 (6798 missing values generated) (3 real changes made)
45 2008 (6796 missing values generated) (5 real changes made)
...
9 2014 (6792 missing values generated) (9 real changes made)
10 2014 (6792 missing values generated) (9 real changes made)
11 2014 (6795 missing values generated) (6 real changes made)
...
33 2016 (6743 missing values generated) (58 real changes made)
34 2016 (6798 missing values generated) (3 real changes made)
35 2016 (6792 missing values generated) (9 real changes made)
36 2016 No or insufficient observations
37 2016 (6781 missing values generated) (20 real changes made)
38 2016 (6798 missing values generated) (3 real changes made)
39 2016 No or insufficient observations
40 2016 (6798 missing values generated) (3 real changes made)
41 2016 (6798 missing values generated) (3 real changes made)
42 2016 (6795 missing values generated) (6 real changes made)
43 2016 No or insufficient observations
44 2016 (6794 missing values generated) (7 real changes made)
45 2016 (6794 missing values generated) (7 real changes made)
46 2016 (6799 missing values generated) (2 real changes made)
47 2016 (6793 missing values generated) (8 real changes made)
48 2016 (6793 missing values generated) (8 real changes made)
49 2016 (6798 missing values generated) (3 real changes made)
50 2016 No or insufficient observations
51 2016 (6798 missing values generated) (3 real changes made)
52 2016 (6789 missing values generated) (12 real changes made)

.
end of do-file

.

I cut some lines to fit the page.
Is there anything I can do to correct this situation? Sorry for my ignorance, but I have never used Stata before and I need to present a paper on earnings management using the Modified Jones model, and I'm having a hard time. Thanks again. Best regards. André


Clyde Schechter

Join Date: Apr 2014

Posts: 30116
#10

30 May 2018, 13:40

The code is working exactly as it's supposed to. I can say that because clearly for some combinations of industry and year, e.g. 31 and 2005, you are getting results. But for many others you don't have enough data. So if this is a problem, the problem is with your data and you need to fix the data.

Bear in mind that if you have a missing value for any of the variables TA B1 B2 or B3 in an observation, that observation does not participate in the regression. So, for example, if you run:

Code:

count if !missing(TA, B1, B2, B3) & industry == 2 & year == 1999

you will find out just how many observations are available for industry 2 and year 1999, and I'm sure you will find that it is 3 or fewer, and that is not enough data to do a regression with 3 predictors.
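The same check can be done for every industry and year at once. The following is only a sketch on simulated data: the variable names TA, B1, B2, B3, industry and year follow the thread, while the dataset itself, `complete`, `n_complete` and `first` are made up for illustration.

```stata
* Simulated stand-in data: 3 industries x 2 years, 10 observations per cell
clear
set seed 12345
set obs 60
gen industry = ceil(_n/20)
gen year = 1999 + mod(_n, 2)
gen TA = rnormal()
gen B1 = rnormal()
gen B2 = rnormal()
gen B3 = rnormal()
replace B1 = . if industry == 1      // make industry 1 unusable on purpose

* Flag observations that have all four regression variables
gen byte complete = !missing(TA, B1, B2, B3)

* Number of usable observations in each industry-year cell
bysort industry year: egen n_complete = total(complete)

* Show one row per cell
egen byte first = tag(industry year)
list industry year n_complete if first, noobs
```

Cells where `n_complete` is small are the ones for which the regression loop reports "No or insufficient observations".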

The problem is in your data.
Andre Sek Gallina

Join Date: May 2018

Posts: 6
#11

30 May 2018, 14:54

Clyde, Tks one more time.

Does Stata have any way to regularize this problem of missing data quickly? I erased everything and started the process again and found that there are 915 missing values in the data.

Code:

gen TA = ( netincome - cashop) / l.assets
(915 missing values generated)
. gen B0 = 1 / l.assets
(915 missing values generated)
. gen B1 = d.saletotal / l.assets
(915 missing values generated)
. gen B2 = ppegtgross / l.assets
(915 missing values generated)

I would be grateful if you could help me again.

Thank you so much

Best regards.

André
Clyde Schechter

Join Date: Apr 2014

Posts: 30116
#12

30 May 2018, 16:23

No, you can't create information ex nihilo. If the data are missing, they are missing. The solutions are either to find values for the missing variables, or impute them according to some model of the data. But there is no quick fix to missing data.

One quick thought: what is the panel structure of your data and what was the -xtset- command you used? It's possible that those are not matched to each other properly, which might cause lags and differences that really exist to appear to be missing.
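To illustrate how lags can come out missing, here is a small sketch with made-up numbers. The variable names `id`, `year` and `assets` echo the thread; the values and `lag_assets` are purely illustrative. The first year of each panel has no lag, and neither does a year that follows a gap.

```stata
* Toy panel: firm 1 has no row for 2001, so 2002 has no lagged value
clear
input id year assets
1 1999 100
1 2000 110
1 2002 130
2 1999  50
2 2000  55
end

xtset id year
gen lag_assets = L.assets    // missing in 1999 (both firms) and in 2002 (firm 1)
list, sepby(id)
count if missing(lag_assets)
```

If the panel or time variable in -xtset- were wrong (for example, a time variable that does not advance by one unit per period), many more lags would appear missing than the data justify.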
Andre Sek Gallina

Join Date: May 2018

Posts: 6
#13

30 May 2018, 17:14

Hi Clyde

Sorry for my insistence.

My panel has the following structure: these are business data, from the year 1999 to 2016. Each company belongs to one of the 52 indicated industries. I need a regression per year and industry so that with betas I can calculate another equation and find the value that I need. I have some lagged variables (t - 1), which causes at least some missing values to be normal.

Do you mean the -xtset- command I used to declare the panel?

I do not know if that's what you meant, but I used this command:

Code:

xtset id year, yearly
       panel variable:  id (unbalanced)
        time variable:  year, 1999 to 2016, but with gaps
                delta:  1 year

I no longer know what to do to run this data. Thank you!
Clyde Schechter

Join Date: Apr 2014

Posts: 30116
#14

30 May 2018, 22:01

So the -xtset- looks appropriate and it would not be the source of the problem. It really means that the difficulty is with your data.

I don't do economics or finance, so my only exposure to this particular kind of problem is here on Statalist. But I've seen enough of them here to know that there are usually some groups with too few observations in this kind of analysis. I gather, though, that for your purposes, you have too many of those. I think you should speak with a colleague in your discipline about how best to handle the missing data. Somebody in your field may know where you can get more data to fill in some of the gaps, or may be able to suggest a reasonable way, that would be credible in your field, to impute values to the missing observations. Stata has lots of tools for imputing missing data, but you need to know what kind of imputation you want to do so you can pick the right one. Different approaches have different statistical properties, and when the amount of missing data is large (as, it appears, is the case here) you can end up in a situation where your results reflect your choice of imputation methodology more than they reflect anything going on in the real world!
Andre Sek Gallina

Join Date: May 2018

Posts: 6
#15

31 May 2018, 07:20

Hi Clyde. Several papers require at least 10 observations in each industry-year. Does Stata provide any command for running the regressions only when there are at least 10 observations? I think (and I hope) that would solve some of my problems. Regards. André
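One possible way to impose such a minimum, sketched on simulated data. This is not an answer given in the thread: the threshold of 10 comes from the question above, the variable names follow the earlier code, and the dataset and `resid_tmp` are made up. The idea is to count the complete observations in each cell first and skip the cell if there are fewer than 10.

```stata
* Simulated stand-in data: industry 1 has 15 rows per year, industry 2 fewer than 10
clear
set seed 2018
set obs 45
gen industry = cond(_n <= 30, 1, 2)
gen year = 1999 + mod(_n, 2)
gen B1 = rnormal()
gen B2 = rnormal()
gen B3 = rnormal()
gen TA = 0.5*B1 - 0.2*B2 + 0.1*B3 + rnormal()

gen Jones_Modified = .
levelsof industry, local(industries)
levelsof year, local(years)
foreach i of local industries {
    foreach y of local years {
        * Count observations with no missing regression variables in this cell
        quietly count if !missing(TA, B1, B2, B3) & industry == `i' & year == `y'
        if r(N) < 10 {
            display "industry `i', year `y': only " r(N) " complete observations, skipped"
            continue
        }
        quietly regress TA B1 B2 B3 if industry == `i' & year == `y', noconstant
        * Keep residuals only for the observations actually used in the regression
        quietly predict double resid_tmp if e(sample), residuals
        quietly replace Jones_Modified = resid_tmp if e(sample)
        drop resid_tmp
    }
}
```

Looping over `levelsof` results rather than fixed ranges also avoids visiting industry or year values that never occur in the data.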